Write Balancing

Flash vs. DRAM in PCs – Flash Wins

Chart from Objective Analysis report: How PC NAND will Undermine DRAM

Some time ago Objective Analysis ran nearly 300 standard benchmarks on a PC with varying amounts of flash and DRAM. The finding: once the DRAM size grew above a certain minimum (1-2GB, depending on the benchmark), a dollar’s worth of flash provided a greater performance boost than a dollar’s worth of DRAM.

You might wonder how this could possibly be true.  Everyone knows that the best way to improve any computing system’s performance is to add DRAM main memory.  How could flash, which is orders of magnitude slower than DRAM, provide a bigger performance boost than DRAM?

It all makes sense if you think of the DRAM as something that is there only to make the HDD look faster.  More is better, but if you can use a little less DRAM and add a large flash memory layer then disk accesses appear to speed up even more.
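A toy latency model shows why this can work. The sketch below is only an illustration: the hit rates and device latencies are round numbers assumed for the example, not figures from the Objective Analysis benchmarks, and `average_access_time` is a hypothetical helper name.

```python
def average_access_time(dram_hit, flash_hit,
                        t_dram=100e-9, t_flash=100e-6, t_hdd=10e-3):
    """Average access latency in seconds for a DRAM/flash/HDD hierarchy.

    dram_hit  -- fraction of accesses served from DRAM
    flash_hit -- fraction of DRAM misses served from the flash layer
    The default latencies are assumed, order-of-magnitude values.
    """
    dram_miss = 1.0 - dram_hit
    miss_cost = flash_hit * t_flash + (1.0 - flash_hit) * t_hdd
    return dram_hit * t_dram + dram_miss * miss_cost


# Hypothetical system A: more DRAM, no flash layer.
more_dram = average_access_time(dram_hit=0.99, flash_hit=0.0)

# Hypothetical system B: a little less DRAM, plus a large flash layer
# that catches most of the accesses DRAM misses.
less_dram_plus_flash = average_access_time(dram_hit=0.98, flash_hit=0.90)

print(f"more DRAM:          {more_dram * 1e6:.1f} us")
print(f"less DRAM + flash:  {less_dram_plus_flash * 1e6:.1f} us")
```

Under these assumed numbers the HDD term dominates, so trimming HDD accesses with flash helps more than squeezing out a slightly higher DRAM hit rate.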

The benchmark data and the price/performance findings that are Continue reading

Baidu Goes Beyond SSDs

Baidu's SDF: Software-Defined Flash

I have to admit that it’s embarrassing when The SSD Guy misses something important in the world of flash storage, but I only recently learned of a paper that Baidu, China’s leading search engine, presented at the ASPLOS conference a year ago.  The paper details how Baidu changed the way they use flash to gain significant benefits over their original SSD-based systems.

After having deployed 300,000 standard SSDs over the preceding seven years, Baidu engineers looked for ways to achieve higher performance and more efficient use of the flash they were buying.  Their approach was to strip the SSD of all functions that could be better performed by the host server, and to reconfigure the application software and operating system to make the best of flash’s idiosyncrasies.

You can only do this if you have control of both the system hardware and software.

The result was SDF, or “Software-Defined Flash”, a card that Continue reading

How Controllers Maximize SSD Life – Internal NAND Management

Suppose you have used all the other methods of improving SSD wear that we have discussed so far, but you still find that this isn’t enough.  What do you do next?  Well, a few SSD controllers go one step further and manage some of the inner workings of the NAND flash chip itself.

If that sounds like a significant undertaking to you, then you clearly understand why so very few controllers take this approach.  The information used to perform this function is not generally available – it takes a special relationship with the NAND flash supplier – and you can’t develop this relationship unless the NAND supplier Continue reading

How Controllers Maximize SSD Life – Feedback on Block Wear

One way that SSD controllers maximize the life of an SSD is to use feedback on the life of flash blocks to determine how wear has impacted them.  Although this used to be very uncommon, it is now being incorporated into a number of controllers.

Here’s what this is all about: Everybody knows that endurance specifications tell how much life there is in a block, right?  For SLC it is typically 100,000 erase/write cycles, and for MLC it can be as high as 10,000 cycles (for older processes) but goes down to 5,000 or even 3,000 for newer processes.  TLC endurance can be in the hundreds of cycles.  Now the question is: “What happens after that?”

In most cases individual bits start to Continue reading

How Controllers Maximize SSD Life – Over Provisioning

Over provisioning is one of the most common ways that SSD designers can help assure that an SSD has a longer life than the flash’s endurance rating would support.  If an SSD contains more flash than is presented at its interface, the controller can manage wear across a larger number of blocks while at the same time accelerating disk performance by moving slow operations like block erases out of the way of the SSD’s key functions.
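For a sense of scale, over provisioning is commonly quoted as the extra flash expressed as a fraction of the capacity presented at the interface. Here is a minimal sketch of that arithmetic; the function name and the 128GB/100GB figures are made up for illustration and do not describe any particular drive.

```python
def over_provisioning_ratio(physical_gb, user_gb):
    """Spare flash as a fraction of the user-visible capacity."""
    if user_gb <= 0 or physical_gb < user_gb:
        raise ValueError("need 0 < user_gb <= physical_gb")
    return (physical_gb - user_gb) / user_gb


# Hypothetical SSD: 128GB of raw flash, 100GB presented to the host.
ratio = over_provisioning_ratio(physical_gb=128, user_gb=100)
print(f"over provisioning: {ratio:.0%}")
```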

Many people like to compare wear leveling to rotating a car’s tires.  In this vein, think of over provisioning as having a bunch of spare Continue reading

How Controllers Maximize SSD Life – Reduced Write Amplification

Write amplification plays a critical role in maximizing an SSD’s usable life.  The lower the write amplification, the longer the SSD will last.  SSD architects pay special attention to this aspect of controller design.

Unlike the other factors described in this series, this is not a technique that extends flash life beyond the 10,000 erase/write cycles that one would normally expect to result in a failure, but it is very important to SSD longevity.
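A back-of-the-envelope calculation makes the stakes concrete. The sketch below treats write amplification simply as the ratio of bytes written to flash to bytes written by the host, and assumes perfectly even wear across all blocks. Both are simplifying assumptions for this example, and the capacity and amplification figures are invented.

```python
def host_writes_before_wearout(capacity_gb, endurance_cycles, write_amplification):
    """Rough total host data (in GB) an SSD can accept before wear-out.

    Assumes ideal wear leveling, so every block reaches its rated
    endurance at the same time.
    """
    if write_amplification <= 0:
        raise ValueError("write amplification must be positive")
    return capacity_gb * endurance_cycles / write_amplification


# Hypothetical 100GB SSD built from 10,000-cycle flash.
low_wa = host_writes_before_wearout(100, 10_000, write_amplification=2.0)
high_wa = host_writes_before_wearout(100, 10_000, write_amplification=4.0)
print(f"WA 2.0: {low_wa / 1000:.0f} TB of host writes")
print(f"WA 4.0: {high_wa / 1000:.0f} TB of host writes")
```

Halving the write amplification doubles the amount of host data the same flash can absorb, without changing the endurance rating at all.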

Write Amplification is sufficiently complex that I won’t try to define it in this post, but Continue reading

How Controllers Maximize SSD Life – Other Error Management

There are more advanced means than simple error correction to help remove bit errors in NAND flash, and those will be the subject of this post.  The general term for this approach is “DSP” although it seems to have very little to do with the kind of DSP algorithm used to perform filtering or build modem chips.

While ECC corrects errors without knowing how they got there, DSP helps to correct any of the more predictable errors that are caused by internal error mechanisms that are inherent to the design of the chip.  A prime example of such an error would be adjacent cell disturb.

Here’s a brief explanation of Continue reading

How Controllers Maximize SSD Life – Improved ECC

Error correction (ECC) can have a very big impact on the longevity of an SSD, although few understand how such a standard item can make much difference to an SSD’s life.  The SSD Guy will try to explain it in relatively simple terms here.

All NAND flash requires ECC to correct random bit errors (“soft” errors.)  This is because the inside of a NAND chip is very noisy and the signal levels of bits passed through a NAND string are very weak.  One of the ways that NAND has been able to become the cheapest of all memories is by requiring error correction external to the chip.

This same error correction also helps to correct bit errors due to wear.  Wear can cause bits to become stuck in one state or the other (a “hard” error), and it can increase the frequency of soft errors.
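To see how stored check bits let a controller repair a flipped bit, here is a minimal sketch using a Hamming(7,4) code. This is a textbook code chosen for brevity; it is an assumption of this example, not the scheme any SSD controller uses (real NAND ECC relies on far stronger codes such as BCH or LDPC), and the function names are hypothetical.

```python
def hamming74_encode(data_bits):
    """Encode 4 data bits as the 7-bit codeword [p1, p2, d1, p3, d2, d3, d4]."""
    d1, d2, d3, d4 = data_bits
    p1 = d1 ^ d2 ^ d4   # parity over codeword positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4   # parity over codeword positions 2, 3, 6, 7
    p3 = d2 ^ d3 ^ d4   # parity over codeword positions 4, 5, 6, 7
    return [p1, p2, d1, p3, d2, d3, d4]


def hamming74_decode(codeword):
    """Return (data_bits, error_position); error_position is 0 if none found."""
    c = list(codeword)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
    error_position = s1 + 2 * s2 + 4 * s3   # 1-based index of the bad bit
    if error_position:
        c[error_position - 1] ^= 1          # flip it back
    return [c[2], c[4], c[5], c[6]], error_position


original = [1, 0, 1, 1]
stored = hamming74_encode(original)

# Simulate a single-bit error on read-back.
corrupted = list(stored)
corrupted[4] ^= 1

recovered, where = hamming74_decode(corrupted)
print(recovered == original, where)
```

A code this small corrects only one bad bit per seven stored, and the decoder neither knows nor cares whether the bit was flipped by noise or is stuck because of wear.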

Although it is not widely Continue reading

How Controllers Maximize SSD Life – External Data Buffering

Since NAND flash is weakened by erase/write cycles, it would make sense to try to reduce those cycles to prolong the life of an SSD, right?  That’s what external data buffers are designed to do.

There are many ways to use RAM (either a RAM internal to the SSD controller chip or a discrete DRAM chip on the SSD’s printed circuit card) to stage data in a way that will reduce erase/write cycles.

One is to perform a function called “Write Coalescing.”  This involves Continue reading

How Controllers Maximize SSD Life – Better Wear Leveling

In this post we will explore how the right wear leveling algorithm can help a controller maximize the life of an SSD.

Wear leveling is a fact of life with NAND flash – blocks start to suffer bit failures after a certain number of erase/write cycles (usually specified from the thousands to the hundreds of thousands) and it is only natural that software will attempt to over-write some blocks more than others.  In order to prevent this from causing failures, all of today’s SSD, USB flash drive, and flash card controllers incorporate some sort of wear leveling.

This is a simple re-mapping of the contents of the flash chips.  A more graphical explanation is Continue reading
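
The re-mapping idea can be sketched in a few lines of code. This is a deliberately simplified illustration, not how any real controller is built: the `WearLeveler` class and its policy (always write to the least-erased free block) are assumptions for the example, and it ignores pages, garbage collection, and the relocation of static data.

```python
class WearLeveler:
    """Toy logical-to-physical block re-mapper with per-block erase counts."""

    def __init__(self, num_physical_blocks):
        self.erase_counts = [0] * num_physical_blocks
        self.free_blocks = set(range(num_physical_blocks))
        self.mapping = {}   # logical block -> physical block
        self.contents = {}  # physical block -> stored payload

    def write(self, logical_block, payload):
        if not self.free_blocks:
            raise RuntimeError("no free physical blocks left")
        # Pick the least-worn free block (ties broken by block number).
        target = min(self.free_blocks, key=lambda b: (self.erase_counts[b], b))
        self.free_blocks.remove(target)
        self.contents[target] = payload

        old = self.mapping.get(logical_block)
        self.mapping[logical_block] = target
        if old is not None:
            # The previous copy is stale: erase it and return it to the pool.
            del self.contents[old]
            self.erase_counts[old] += 1
            self.free_blocks.add(old)

    def read(self, logical_block):
        return self.contents[self.mapping[logical_block]]


# The host hammers one logical block; the wear lands on all four physical blocks.
leveler = WearLeveler(num_physical_blocks=4)
for i in range(100):
    leveler.write(0, f"version {i}")

print(leveler.read(0))
print(leveler.erase_counts)
```

Without the re-mapping, all 99 erases in this run would have hit a single physical block.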