SSD Reliability

Kaminario’s Performance and 7-Year Flash Life Warranties

Today Kaminario added a performance guarantee and a 7-year warranty to its arsenal.  The company introduced its “Consistency Under Failure Guarantee,” which ensures customers will see no more than a 25% drop in performance during a system failure.  This means that critical operations and applications can continue to run at near-standard performance despite the failure of an SSD or even an entire node.  Kaminario president Dani Golan told The SSD Guy last week that this is a conservative guarantee, and that few customers see more than a 10% degradation during failure tests in their own production systems.

As for the 7-year flash endurance warranty, no matter which SSD …

Are HDDs Vibration Sensitive?

One reason to use SSDs is that, with no moving parts, these devices are insensitive to shock and vibration.  HDDs, on the other hand, are sensitive enough to vibration that it can cause access delays.

How sensitive are they?  Well, I have seen some overblown claims from SSD makers that shock will cause HDD head crashes.  I am not sure that I believe such claims, but I certainly do believe that an HDD’s actuator (the read/write head mechanism) can be shaken away from its track, causing a …

LSI’s Take on Data Center Flash

LSI Corp. has launched a new blog that covers (among other things) flash storage.  It’s only natural – the company’s SandForce subsidiary is riding high on the SSD wave and LSI’s HBAs are finding widespread use, both internally and externally, in the production of two-hop PCIe SSDs.

A recent post called “What are the Driving Forces Behind Going Diskless” by LSI Fellow Rob Ober outlines the leading …

SSDs and TCO

One of the best arguments to use an SSD is also one of the most difficult ways to sell anything.  This is the Total Cost of Ownership, commonly abbreviated to “TCO.”

TCO has been used as an argument for buying anything from compact fluorescent bulbs to Jaguar automobiles.

The argument usually revolves around an item whose initial price is higher, but which has lower ongoing (or operating) costs, and when these costs are combined, the higher-priced item proves to cost less to own over the long run.  In the case of a compact fluorescent (CF) bulb, the bulb may cost $7, versus $1 for an incandescent bulb, but it consumes 18 watts compared to the 75 watts consumed by the incandescent bulb it replaces.  In addition, the CF bulb lasts ten times as long (10,000 hours vs. 1,000 hours).  This works out to a savings of 570 kWh – or about $50 – plus $3 in bulb costs.
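The bulb comparison above can be worked through in a few lines of Python. This is only a sketch: the bulb prices, wattages, and lifetimes come from the paragraph above, but the electricity rate (`PRICE_PER_KWH`) is an assumed figure that the text does not state, and the function name is made up for illustration.

```python
import math

def lifetime_cost(bulb_price, watts, bulb_life_hours, hours, price_per_kwh):
    """Total cost of lighting for `hours`: bulbs purchased plus electricity."""
    bulbs_needed = math.ceil(hours / bulb_life_hours)
    energy_kwh = watts * hours / 1000
    return bulbs_needed * bulb_price + energy_kwh * price_per_kwh

HOURS = 10_000          # one CF bulb lifetime, from the text
PRICE_PER_KWH = 0.10    # ASSUMPTION: electricity rate is not given in the text

cf_cost = lifetime_cost(7.00, 18, 10_000, HOURS, PRICE_PER_KWH)
incandescent_cost = lifetime_cost(1.00, 75, 1_000, HOURS, PRICE_PER_KWH)

energy_saved_kwh = (75 - 18) * HOURS / 1000
bulb_savings = 10 * 1.00 - 1 * 7.00   # ten incandescents vs. one CF bulb

print(f"Energy saved: {energy_saved_kwh:.0f} kWh")
print(f"Bulb savings: ${bulb_savings:.2f}")
print(f"Total savings: ${incandescent_cost - cf_cost:.2f}")
```

The same shape of calculation applies to an SSD purchase: a higher up-front price on one side, lower operating costs on the other, and a time horizon over which the two are compared. The result is quite sensitive to the assumed rate and horizon, which is part of why TCO is a hard argument to sell.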

Extreme SSD Error Correction

At last week’s International Solid State Circuits Conference (ISSCC) Shuhei Tanakamaru, a researcher from Japan’s Chuo University, detailed a scheme to reduce MLC SSD bit error rates (BER) by 32 times over conventional techniques.  The approach used an impressive combination of mirroring, vertical and horizontal error correction, and a deep understanding of the most likely kinds of bit errors flash will experience.

This is a very novel and well-conceived technique that may find industry adoption in future SSDs.

The steps included in the paper are used in addition to the …

SSDs that Don’t Wear Out

This is a bad day for The SSD Guy.  I just finished publishing an eight-part series explaining How Controllers Maximize SSD Life, then my evil twin The Memory Guy today published a post telling of a new flash design from Macronix that might just eliminate the flash wear-out mechanism!

But my concerns are inconsequential compared to the feelings of all those folks who have devoted phenomenal time and energy to develop wear management algorithms.

This all stems from an article in the IEEE Spectrum that details a flash chip design that …

How Controllers Maximize SSD Life – Internal NAND Management

Suppose you have used all the other wear-management techniques that we have discussed so far, but you still find that this is not enough.  What do you do next?  Well, a few SSD controllers go one step further and manage some of the inner workings of the NAND flash chip itself.

If that sounds like a significant undertaking to you, then you clearly understand why so very few controllers take this approach.  The information used to perform this function is not generally available – it takes a special relationship with the NAND flash supplier – and you can’t develop this relationship unless the NAND supplier …

How Controllers Maximize SSD Life – Feedback on Block Wear

One way that SSD controllers maximize the life of an SSD is to use feedback on the life of flash blocks to determine how wear has impacted them.  Although this used to be very uncommon, it is now being incorporated into a number of controllers.

Here’s what this is all about: Everybody knows that endurance specifications tell how much life there is in a block, right?  For SLC it is typically 100,000 erase/write cycles, and for MLC it can be as high as 10,000 cycles (for older processes) but goes down to 5,000 or even 3,000 for newer processes.  TLC endurance can be in the hundreds of cycles.  Now the question is: “What happens after that?”

In most cases individual bits start to …

How Controllers Maximize SSD Life – Over Provisioning

Over provisioning is one of the most common ways that SSD designers can help assure that an SSD has a longer life than the flash’s endurance rating would support.  If an SSD contains more flash than is presented at its interface, the controller can manage wear across a larger number of blocks while at the same time accelerating disk performance by moving slow operations like block erases out of the way of the SSD’s key functions.
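To put rough numbers on the idea of spreading wear across more flash than the interface presents, here is a minimal Python sketch. It assumes an idealized model with perfectly even wear leveling and a fixed write amplification factor; the capacities, the write volume, and both function names are hypothetical and are not taken from any particular SSD.

```python
def over_provisioning_ratio(physical_gb, advertised_gb):
    """Spare flash as a fraction of the capacity presented at the interface."""
    return (physical_gb - advertised_gb) / advertised_gb

def average_erase_cycles(host_writes_gb, physical_gb, write_amplification=1.0):
    """Average erase/write cycles per block, ASSUMING perfectly even wear."""
    return host_writes_gb * write_amplification / physical_gb

# Hypothetical drive: 256 GB of raw flash presented as 240 GB.
ratio = over_provisioning_ratio(256, 240)

# The same host workload spread over 240 GB vs. 256 GB of flash.
HOST_WRITES_GB = 100_000
cycles_without_spare = average_erase_cycles(HOST_WRITES_GB, 240)
cycles_with_spare = average_erase_cycles(HOST_WRITES_GB, 256)

print(f"Over provisioning: {ratio:.1%}")
print(f"Cycles per block without spare flash: {cycles_without_spare:.1f}")
print(f"Cycles per block with spare flash:    {cycles_with_spare:.1f}")
```

In this simple model the benefit is only proportional to the extra capacity. Real controllers can gain more than that, since spare blocks also give the controller room to do block erases in the background, but that effect is not modeled here.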

Many people like to compare wear leveling to rotating a car’s tires.  In this vein, think of over provisioning as having a bunch of spare …

How Controllers Maximize SSD Life – Reduced Write Amplification

Write amplification plays a critical role in maximizing an SSD’s usable life.  The lower the write amplification, the longer the SSD will last.  SSD architects pay special attention to this aspect of controller design.

Unlike the other factors described in this series, this is not a technique that extends flash life beyond the 10,000 erase/write cycles that one would normally expect to result in a failure, but it is very important to SSD longevity.

Write amplification is sufficiently complex that I won’t try to define it in this post, but …