How Controllers Maximize SSD Life – Over Provisioning

Over provisioning is one of the most common ways that SSD designers can help ensure that an SSD has a longer life than the flash’s endurance rating alone would support.  If an SSD contains more flash than is presented at its interface, the controller can manage wear across a larger number of blocks while at the same time accelerating disk performance by moving slow operations like block erases out of the way of the SSD’s key functions.
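
A minimal sketch of the arithmetic (my own illustration in Python, not any vendor’s formula): over provisioning is usually quoted as the extra physical flash relative to the capacity the host sees.

    def over_provisioning_pct(physical_gb, user_gb):
        # Extra flash, expressed as a percentage of the host-visible capacity.
        return (physical_gb - user_gb) / user_gb * 100

    # e.g. a drive built from 512GB of raw NAND but sold as a 400GB SSD:
    print(over_provisioning_pct(512, 400))  # -> 28.0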

Many people like to compare wear leveling to rotating a car’s tires.  In this vein, think of over provisioning as having a bunch of spare tires (rather than just one) in the trunk of your car.  The more tires you have, the farther you can drive before having to buy a new set of tires, although at any one time you are never using more than four tires.  On the other hand, this pushes up the cost of a tire change a good bit.

Over provisioning also helps reduce write amplification.  When an SSD gets full, the garbage collection algorithm works to consolidate partially-valid blocks to provide more free space.  An SSD with a lot of spare blocks doesn’t need to go through this process as frequently as an SSD with few spare blocks, so each host write triggers fewer behind-the-scenes copy operations.
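
To see why, here is a toy simulation in Python (a deliberate simplification of my own, not any real controller’s algorithm) of random single-page overwrites with greedy garbage collection; the reported write amplification falls as the number of spare blocks grows:

    import random

    PAGES_PER_BLOCK = 64

    def write_amplification(user_blocks, spare_blocks, host_writes=100_000, seed=1):
        # Toy FTL: random single-page overwrites with greedy garbage collection.
        # Returns total physical page writes divided by host page writes.
        total = user_blocks + spare_blocks
        logical_pages = user_blocks * PAGES_PER_BLOCK
        rng = random.Random(seed)
        where = {}                               # lba -> (block, slot) of the valid copy
        pages = [[] for _ in range(total)]       # lbas appended to each block so far
        valid = [0] * total                      # count of valid pages per block
        free = list(range(1, total))             # fully-erased blocks
        active, physical = 0, 0

        def append(lba):                         # write one page at the write point
            nonlocal active, physical
            if len(pages[active]) == PAGES_PER_BLOCK:
                active = free.pop()              # active block full: grab an erased one
            pages[active].append(lba)
            where[lba] = (active, len(pages[active]) - 1)
            valid[active] += 1
            physical += 1

        for _ in range(host_writes):
            lba = rng.randrange(logical_pages)
            if lba in where:                     # overwrite: the old copy goes stale
                valid[where[lba][0]] -= 1
            append(lba)
            while len(free) < 2:                 # garbage collect: fewest valid pages wins
                victim = min((b for b in range(total) if b != active and b not in free),
                             key=lambda b: valid[b])
                for moved in pages[victim]:
                    if where.get(moved, (-1,))[0] == victim:
                        append(moved)            # relocation = extra physical write
                pages[victim], valid[victim] = [], 0
                free.append(victim)              # victim can now be erased and reused

        return physical / host_writes

    for spare in (4, 10, 22, 48):                # roughly 8%, 21%, 46%, 100% over provisioning
        print(spare, "spare blocks ->", round(write_amplification(48, spare), 2))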

(A related, but somewhat different way that blocks are freed is through the Trim command, which will be described in a future post.  The Trim command allows the operating system to tell the SSD which sectors are no longer useful.  This allows the SSD to erase unused blocks well in advance of their use, reducing the likelihood that the SSD will fill up.)
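
In terms of the toy FTL sketched above (the where map and valid counters are that sketch’s hypothetical structures, not a real controller’s), handling a Trim could look something like this:

    def trim(where, valid, lba):
        # Host hint: the data at this LBA is no longer needed.  Dropping the
        # mapping makes the old copy stale, so garbage collection can reclaim
        # its block without copying the page first.
        if lba in where:
            valid[where[lba][0]] -= 1   # the page no longer counts as valid
            del where[lba]              # the controller forgets the stale copy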

How much does over provisioning help? Based on inputs from a few companies, The SSD Guy concludes that, for standard MLC NAND flash, 45% over provisioning will give you about twice the drive life of 20% over provisioning, and 75% over provisioning will extend the disk’s life to three times that baseline.  A full 100% over provisioning won’t get you quite as far as four times the drive life, but it comes close.  Increases in over provisioning show diminishing returns, with the added benefit approaching zero at around 250% over provisioning (and about six times the life of that 20% over-provisioned SSD).

Many contend that 7% over provisioning is a standard for client SSDs; this is roughly the spare area a drive gets for free when it is built from a binary amount of flash (say, 128GiB) but advertises the smaller decimal capacity (128GB).  This level would provide about one-third of the lifetime of a 20% over-provisioned SSD.
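
Collected in one place, the rough figures quoted above, normalized to the 20% baseline (the 3.8x entry is my reading of “not quite as far as four times”):

    life_vs_op = {
        0.07: 1 / 3,   # typical client drive
        0.20: 1.0,     # baseline
        0.45: 2.0,
        0.75: 3.0,
        1.00: 3.8,     # "close to" four times
        2.50: 6.0,     # where the returns have diminished to about zero
    }
    for op, life in life_vs_op.items():
        print(f"{op:>4.0%} over provisioning -> about {life:.1f}x the drive life")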

Something very weird happens with compression, which is used in SandForce controllers.  These controllers compress the data as it is being written into the NAND flash.  This means that the amount of over provisioning is a function of how compressible the data is.  Highly-compressible data will leave a lot of flash unused for data storage, so the SSD will have a very large amount of over provisioning, extending its life significantly.  Incompressible data (usually data that has already been compressed, like MP3 files, JPEG photos, or video files) will consume most of the flash in the SSD, leaving very little for over provisioning.  SSDs storing incompressible data will wear out after fewer disk writes.  Quite fortunately, applications with high write workloads do not usually manage these file types.
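
A back-of-the-envelope model of that effect (my own arithmetic; the 137GB of flash for a 128GB drive assumes roughly 7% stock over provisioning, and the drive is assumed filled to its stated capacity):

    def effective_op(physical_gb, user_gb, compression_ratio):
        # compression_ratio = stored size / original size, so 0.5 means the
        # data shrinks by half and 1.0 means it is incompressible (JPEG, MP3).
        stored = user_gb * compression_ratio   # flash actually consumed by data
        return (physical_gb - stored) / stored

    for ratio in (0.5, 0.8, 1.0):
        print(f"{ratio} -> {effective_op(137, 128, ratio):.0%} effective over provisioning")
    # -> about 114% at 2:1 compression, but only about 7% for incompressible data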

A note about the terminology: Some folks spell this as a single word – Overprovisioning – and even The SSD Guy is inconsistent about which way to write it.  Since it’s a term that began its days decades ago with RAID you would expect some standard to already be in place, but this seems not to be the case.  I suggest picking one that feels right to you and overlooking the inconsistency across the industry.

This post is part of a series published by The SSD Guy in September-November 2012 to describe the leading methods SSD architects use to get the longest life out of an SSD despite the limited number of erase/write cycles that NAND flash specifications guarantee.


Alternatively, you can visit the Storage Networking Industry Association (SNIA) website to download the entire series as a 20-page booklet in pdf format.

15 thoughts on “How Controllers Maximize SSD Life – Over Provisioning”

  1. I run Ubuntu Linux on my laptop. I just ordered an SSD. When it arrives and I install Ubuntu on it, can I configure the overprovisioning? I’m doubtful that it can be done from the OS once it has been installed on the drive… But what if I only use, say, 100GB of a 120GB drive for a file system, and leave the remainder as unused space? Will the drive then be “overprovisioned” by that amount, and thus exhibit a longer lifespan?

    1. Karl,

      Good question!

      Most people would assume that only using 100GB of the SSD would leave you with the other 20GB for overprovisioning, but I would monitor Ubuntu to see what it really does. I know that some application programs spawn an increasing number of temporary files the longer you keep a file open, and I wouldn’t doubt that an O/S might do the same thing.

      You may *think* that you’re only using 100GB, but find that the entire SSD gets filled anyway!

      The best way to gain control of this situation would be to set the SSD’s overprovisioning internally – many SSDs allow you to do this through a utility.
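
      For a rough back-of-the-envelope on the 100GB-of-120GB idea (a sketch in Python, assuming a stock ~7% of hidden spare flash, which varies by model, and that the unused 20GB is truly never written):

          user_visible = 120                  # GB the drive exposes
          stock_spare = 0.07 * user_visible   # hidden flash, assumed ~7% (varies by model)
          partitioned = 100                   # GB actually handed to the filesystem
          spare = (user_visible - partitioned) + stock_spare
          print(f"{spare / partitioned:.0%} effective over provisioning")  # about 28%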

      Sorry for the slow reply! Hope this helps.

      Jim

  2. Thanks Jim, this has proven really helpful in understanding the basic principle of overprovisioning. I have a practical problem (that actually prompted me to do the search which landed me on your site): assuming I have multiple PCI Flash cards in a rack, all set to 100% overprovisioning, is the fact that I’m seeing one card at 35% a sign of its EOL? From what I can tell, the drop in overprovisioning percentages isn’t transient. Cheers.

    1. Dan,

      Thanks for the comment. Sorry I took a while to reply.

      I think the EOL issue is better addressed by monitoring the SSD’s health. Since you’re using PCIe you probably don’t have SMART attributes to monitor, but I believe that most PCIe SSDs have other monitoring mechanisms.

      I’m not the best guy to ask about this, though. Check with the manufacturer.

      Good luck!

      Jim

  3. “Something very weird happens with compression, which is used in SandForce controllers.”

    Yes. If the manufacturer assumes it can use very little over-provisioning, based on an assumption that the client data will be compressible, then it may have unfortunate consequences for the consumer.

    For example: full-disk encryption schemes such as TCG Opal and LUKS-Crypt always write incompressible data to the entire drive. If the entire drive space is partitioned, the OS doesn’t issue DISCARD for freed blocks, and the OS writes to all blocks on the partition, then the drive is going to suffer a large amount of device-level write amplification.

    Three things you don’t mention when it comes to over-provisioning:

    1) If, when you partition the drive, you choose to set aside a portion of unpartitioned space, then you are effectively increasing the drive’s stock over-provisioning, which can perhaps increase a consumer drive’s endurance to something closer to enterprise level.

    2) If you partition the drive using something similar to Linux LVM, and only increase the storage volume sizes as required over the life of the drive, then, assuming you don’t hit drive capacity until late in its operational life, you will have virtually increased the over-provisioning to something approaching the ideal level for your use case (see the sketch after this list).

    3) Even with a typical partitioning layout such as those provided by Dell, HP, Lenovo, Acer etc. for Windows PCs from Windows 7 onward, where there are five partitions (boot, recovery, boot backup, OS, and data), you will never actually fill all of these partitions, so your effective over-provisioning will always be better than the factory-provided amount, so long as you do not use a product which encrypts your partitions.
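
    All three points boil down to the same arithmetic: blocks the host has never written behave like extra spare area, so effective over-provisioning tracks the high-water mark of writes. A minimal sketch in Python (the 137GB figure assumes a 128GB drive carrying ~7% stock spare flash, which varies by model):

        def effective_op(physical_gb, written_gb):
            # Never-written blocks act like spare area, so the effective
            # over-provisioning depends on how much data has ever been written.
            return (physical_gb - written_gb) / written_gb

        physical = 137   # assumption: a 128GB drive carrying ~137GB of flash
        for written in (40, 80, 120, 128):
            print(f"{written:>3}GB written -> {effective_op(physical, written):.0%} effective OP")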
