LSI SandForce SSD Controllers Move the Knee in the Curve

LSI’s SandForce has just rolled out its SF3700 family of four SSD controllers aimed at the Entry Client, Mainstream Client, Value Enterprise, and Enterprise Storage marketplaces. Performance is impressive, with worst-case random IOPS of 150K read/81K write over PCIe and 94K read/46K write over SATA.

The SF3700 family builds on the division’s first two product families by adding a choice of PCIe or SATA interfaces, LDPC error correction, and a boosted set of flash management features.  The SSD Guy will explore this last point after highlighting the other two.

By providing both PCIe and SATA interfaces, LSI is directly addressing the future: PCs are moving to the M.2 SSD specification rather than using an SSD to “fill the hole” left by a missing 2.5″ HDD.  This helps to shrink PC form factors while boosting the interface speed a bit.

In addition, the high-end version of the 3700 supports high-performance PCIe SSDs similar to those that LSI already ships.  Rather than build such cards using a PCIe HBA and four earlier-generation SandForce controllers, the SF3700 combines these into a single chip for better economy and performance.

LSI also benefits by using a single piece of silicon to support four different levels of product.  This helps the company to achieve better economies of scale than an approach using different chip designs.  I have been told that the product is supported by 300,000 lines of firmware, and that this firmware accounts for a large part of the differentiation between the high end and the low end.

So, what about LDPC?  This next-generation error correction method is poised to become a standard offering in SSD controllers as future generations of NAND continue to become more error prone.  LSI tells me that some controller designs get LDPC wrong, even to the point that it is worse at correction than its predecessor, BCH.  One benefit of LSI’s acquisition of SandForce is that LSI already had significant expertise in LDPC through its production of leading HDD controllers, so porting this understanding to an SSD controller was pretty straightforward.

When I think about SSD controller companies with HDD read channel experience in LDPC I only come up with a few names: LSI, Marvell, Anobit (acquired by Apple), and Link_A_Media (acquired by SK hynix).  All of these companies are in a good position to provide quality LDPC solutions without any false starts.

Now, finally, let’s discuss those flash management features and the headline of this post.  What has LSI done to its famed DuraWrite, SHIELD, and RAISE technologies to improve flash lifetimes?

First consider that SSD performance and lifetime are both improved through overprovisioning.  The more overprovisioning an SSD has, the lower the likelihood that a data access will step on the toes of a slower operation like a write or a block erase.  Also, those overprovisioned blocks will substitute for failed blocks as the SSD starts to wear out.

As the SSD wears and overprovisioning declines, performance also declines and the wear on the remaining blocks increases, causing a kind of snowball effect: a knee in the curve of flash wear.
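To put rough numbers on that snowball effect, here is a minimal Python sketch.  It is not LSI's model: it assumes a simplified approximation sometimes used for write amplification under uniform random writes, WA ≈ (1 + p) / (2p), where p is spare capacity divided by user capacity.  The function names, the 28% starting spare, and the 5-point loss per step are all made up for illustration.

```python
def write_amplification(spare_fraction):
    """Rough write amplification for a given spare (overprovisioned) fraction.

    Assumed toy model, not a vendor formula: WA ~= (1 + p) / (2 * p),
    where p is spare capacity divided by user capacity.
    """
    if spare_fraction <= 0:
        raise ValueError("spare_fraction must be positive")
    return (1 + spare_fraction) / (2 * spare_fraction)


def wear_curve(initial_spare, steps, loss_per_step):
    """Return (spare_fraction, write_amplification) pairs as spare blocks retire.

    Spare capacity shrinks by a fixed, hypothetical amount each step to
    mimic overprovisioned blocks being used up as replacements.
    """
    curve = []
    spare = initial_spare
    for _ in range(steps):
        curve.append((spare, write_amplification(spare)))
        spare -= loss_per_step
        if spare <= 0:
            break  # no spare blocks left to model
    return curve


if __name__ == "__main__":
    for spare, wa in wear_curve(initial_spare=0.28, steps=6, loss_per_step=0.05):
        print(f"spare {spare:5.0%}  write amplification {wa:5.2f}")
```

Under these assumptions the write amplification rises slowly at first and then sharply as the spare pool approaches zero, which is the knee.  A real drive would be worse than this sketch suggests, since higher write amplification also speeds up the loss of spare blocks.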

Remember that DuraWrite involves compression, and that means that compressible data will take up less space in flash than the data’s actual size.  This means that an LSI controller will have a higher level of overprovisioning for compressible data than was actually designed into the SSD.  DuraWrite pushes out the knee in the curve, and LSI tells me that the technology has been improved to push that knee out even further.
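The arithmetic behind that extra overprovisioning can be sketched in a few lines of Python.  This is only an illustration under stated assumptions: the function name, the 256GB/240GB capacities, and the definition of spare fraction (free flash divided by flash actually occupied) are mine, not LSI's, and the compression ratio here means stored size divided by logical size.

```python
def effective_overprovisioning(raw_gb, user_gb, compression_ratio):
    """Spare fraction once compression shrinks the data actually stored.

    compression_ratio is stored size / logical size, so 1.0 means
    incompressible data and 0.5 means the data shrinks by half.
    Assumes the drive is filled to its full user capacity.
    """
    if not 0 < compression_ratio <= 1:
        raise ValueError("compression_ratio must be in (0, 1]")
    stored_gb = user_gb * compression_ratio
    return (raw_gb - stored_gb) / stored_gb


if __name__ == "__main__":
    # Hypothetical drive: 256GB of raw flash sold as a 240GB SSD.
    for ratio in (1.0, 0.8, 0.5):
        op = effective_overprovisioning(raw_gb=256, user_gb=240, compression_ratio=ratio)
        print(f"compression ratio {ratio:.1f} -> effective spare {op:.0%}")
```

With incompressible data the hypothetical drive has only the spare area it was built with.  As the data becomes more compressible, less flash is occupied and the remainder behaves like additional overprovisioning.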

LSI’s SHIELD error correction technology has been converted to use an adaptive code rate.  Blocks that have low error rates don’t need as many syndrome bytes (the spare bytes used to check and correct errors) so the bytes that aren’t needed are temporarily added to the overprovisioning pool.  When the block becomes more error prone, the error correction scheme is changed for that block, and more bytes are added to error correction, reducing the overprovisioning.

You can see where I am headed with this: by putting those extra bytes into the overprovisioning pool, performance is further improved and wear is further reduced.
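Here is a minimal Python sketch of how an adaptive code rate could hand bytes back to the spare pool.  LSI has not shared the details of SHIELD with me, so everything specific in it is hypothetical: the tier thresholds, the ECC byte counts, the 256 pages per block, and the function names are invented purely to show the bookkeeping.

```python
# Hypothetical ECC tiers: (max bit errors observed per page, ECC bytes per page).
# Real controllers pick these values from the flash vendor's specifications.
ECC_TIERS = [(8, 32), (24, 64), (40, 96), (float("inf"), 128)]

# The space reserved per page if every block used the strongest code.
WORST_CASE_ECC_BYTES = ECC_TIERS[-1][1]


def ecc_bytes_needed(observed_bit_errors):
    """Pick the smallest ECC allocation whose tier covers the observed errors."""
    for max_errors, ecc_bytes in ECC_TIERS:
        if observed_bit_errors <= max_errors:
            return ecc_bytes


def bytes_freed_for_spare(block_error_counts, pages_per_block=256):
    """Total bytes that healthy blocks can lend to the overprovisioning pool.

    block_error_counts holds one observed per-page error figure per block.
    """
    return sum(
        (WORST_CASE_ECC_BYTES - ecc_bytes_needed(errors)) * pages_per_block
        for errors in block_error_counts
    )


if __name__ == "__main__":
    fresh = [2, 3, 1, 5]       # young blocks with few errors
    worn = [30, 45, 38, 60]    # the same blocks late in life
    print("freed when fresh:", bytes_freed_for_spare(fresh))
    print("freed when worn: ", bytes_freed_for_spare(worn))
```

In this sketch a fresh block lends most of its worst-case ECC reservation to the spare pool, and the loan shrinks toward zero as the block's error count climbs into the higher tiers.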

The net result is longer wear even with inferior flash.  It’s pretty interesting, and it looks like LSI has a product that should sell very well.