Jim Handy

Failure is Not an Option — It’s a Requirement!

I was recently reminded of a presentation that GoDaddy made way back at the 2013 Flash Memory Summit, in which I first heard the statement: “Failure is not an option — it is a requirement!”  That certainly got my attention!  It just sounded wrong.

In fact, this expression was used to describe a very pragmatic approach the company’s storage team had devised to determine the exact maximum load that could be supported by any piece of its storage system.

This is key, since, at the time, GoDaddy claimed to be the world’s largest web hosting service, with 11 million users, 54 million registered domains, and over 5 million hosting accounts, all backed by a 99.9% uptime guarantee (although the internal goal was 99.999% – five nines!)

The presenters outlined four stages of how validation processes had… Continue reading

Intel’s Optane: Two Confusing Modes. Part 3) App Direct Mode

This post is a continuation of a four-part series in The SSD Guy blog to help explain Intel’s two recently-announced modes of accessing its Optane DIMM, formally known as the “Intel Optane DC Persistent Memory.”

App Direct Mode

Intel’s App Direct Mode is the more interesting of the two Optane operating modes, since it supports in-memory persistence, which opens up a new and different approach to improving the performance of tomorrow’s standard software. While today’s software operates under the assumption that data can only be persistent once it has been written to slow storage (SSDs, HDDs, the cloud, etc.), Optane under App Direct Mode allows data to persist at memory speeds, as do other nonvolatile memories, like NVDIMMs, under the SNIA NVM Programming Model.

App Direct Mode implements the full SNIA NVM Programming Model, described in an earlier SSD Guy post, and allows software to… Continue reading
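The load/store style of persistence that App Direct Mode enables can be sketched with ordinary memory mapping. The sketch below is illustrative only: it uses a plain file (with a made-up name) in place of a DAX-mounted persistent-memory region, and the flush call stands in for the cache-line writeback that real persistent memory would use.

```python
import mmap
import os
import struct

# Hypothetical backing file; on a pmem-aware OS this would live on a
# DAX-mounted filesystem so that mmap maps persistent memory directly.
PATH = "counter.pmem"

# Create and size the backing region once.
if not os.path.exists(PATH):
    with open(PATH, "wb") as f:
        f.write(b"\x00" * 8)

with open(PATH, "r+b") as f:
    with mmap.mmap(f.fileno(), 8) as m:
        # "Load" the persisted value with an ordinary memory read...
        (count,) = struct.unpack_from("<Q", m, 0)
        # ...update it with an ordinary memory write (a store)...
        struct.pack_into("<Q", m, 0, count + 1)
        # ...and make it durable.  No read() or write() system calls,
        # no filesystem I/O on the data path: persistence at memory speed.
        m.flush()

print(count + 1)
```

The point of the model is exactly this shape: the application touches persistent data with loads and stores, then issues a cheap flush, rather than serializing everything through a storage stack.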

HDD & SSD Combined Into One

The SSD Guy has often explained to readers that the storage industry is caught between two alternatives:  fast and costly, or cheap and slow.  This is the key difference between SSDs and HDDs.  I have recently learned of a new secret government research effort, code named “SiliDisk,” that will provide the best of both worlds by marrying flash memory with the mechanics of an HDD.

The approach is incredibly ingenious, while remaining deceptively simple: All that is required is to replace the disks in an HDD with the wafers used to manufacture NAND flash.  Both are round, so there’s little engineering effort to switch from a magnetic disk to a flash wafer.

The NAND flash on the wafer is almost completely standard.  The only two changes are that the chips aren’t scribed or sawn apart, saving a small sum, and that a hole must be etched through the center (which can be seen in the photo below), offsetting this savings.  The HDD mechanisms are unchanged with one exception: While today’s HDDs are largely manufactured using 2.5″ and 3.5″ platters (65mm & 90mm), NAND flash is exclusively produced on 300mm wafers.  This means that… Continue reading

Intel’s Optane: Two Confusing Modes. Part 2) Memory Mode

This post is the second part of a four-part series in The SSD Guy blog to help explain Intel’s two recently-announced modes of accessing its Optane DIMM, formally known as the “Intel Optane DC Persistent Memory.”

Memory Mode

The most difficult thing to understand about the Intel Optane DC Persistent Memory when used in Memory Mode is that it is not persistent.  Go back and read that again, because it didn’t make any sense the first time you read it.  It didn’t make any sense the second time either, did it?

Don’t worry.  This is not really important.  The difficulty stems from Intel’s marketing decision to call Optane DIMMs by the name “Intel Optane DC Persistent Memory.”  Had they simply called them “Optane DIMMs,” as everyone expected, then there would have been… Continue reading

What is an SSD Trim Command?

Although the Trim command has been defined for nearly a decade, for some reason I have never written a post to explain it.  It’s time for that to change.

Trim is something that was never required for HDDs, so it was a new command that was defined once SSDs became prevalent.  The command is required because of one of those awkward encumbrances that NAND users must accommodate: Erase before write.

NAND flash bits cannot be altered the same way as an HDD’s.  In an HDD a bit that’s currently set to a 1 can be re-written to a 0 and vice versa, and writing a bit either way takes the same amount of time.  In NAND flash a 1 can be written to a 0, but the opposite is not possible.  Instead, an entire erase block (typically hundreds of kilobytes to a few megabytes, made up of pages of 4-16K bytes) must be erased at once, after which all of its bits are set to 1.  Once that has been done, 0s can be written into that block to store data.  An erase is an excruciatingly slow operation, taking orders of magnitude longer than a read.  Writes are faster than erases, but they’re still slow.
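The erase-before-write rule can be sketched with a toy model. The class below is purely illustrative (a real NAND erase block is far larger and holds many pages), but it shows why a 0 can never go back to a 1 without erasing the whole block:

```python
# Toy model of NAND's erase-before-write rule.  Sizes are illustrative,
# not those of any real device.
BLOCK_BITS = 16

class NandBlock:
    def __init__(self):
        # An erased block reads as all 1s.
        self.bits = [1] * BLOCK_BITS

    def program(self, new_bits):
        """Writes may only clear bits (1 -> 0), never set them."""
        for i, b in enumerate(new_bits):
            if b == 1 and self.bits[i] == 0:
                raise ValueError(
                    f"bit {i}: cannot turn 0 back into 1 without an erase")
        for i, b in enumerate(new_bits):
            self.bits[i] &= b

    def erase(self):
        """The slow operation: the whole block returns to all 1s at once."""
        self.bits = [1] * BLOCK_BITS

blk = NandBlock()
blk.program([0] * 8 + [1] * 8)   # fine: only 1 -> 0 transitions
try:
    blk.program([1] * 16)        # illegal: would need 0 -> 1
except ValueError as e:
    print("rejected:", e)
blk.erase()                      # slow, but all bits are 1 again
blk.program([1] * 16)            # now legal (no bit needs to rise)
```

This is the asymmetry that Trim exists to manage: the SSD wants to do those slow erases in the background, on blocks it already knows hold no live data.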

Let’s say that a program needs to… Continue reading

Intel’s Optane: Two Confusing Modes. Part 1) Overview

Intel recently announced two operating modes for the company’s new Optane DIMMs, formally known as “Intel Optane DC Persistent Memory.”  The company has been trying to help the world to understand these two new operating modes but they are still pretty baffling to most of the people The SSD Guy speaks to.  Some say that the concepts make their heads want to explode!

How does Optane’s “Memory Mode” work?  How does “App Direct Mode” work?  This four-part series will try to provide some answers.

Like all of my NVDIMM-related posts, this series challenges me with the question: “Should it be published in The SSD Guy, or in The Memory Guy?”  This is a point of endless confusion for me, since NVDIMMs and Intel’s Optane blur the lines between memory and storage.  I have elected to post this in The SSD Guy with the hope that it will be found by readers who want to understand Optane for its storage capabilities.

Memory Mode is the easy sell for the short term.  It works with all current application software without modification.  It just makes it look like you have a TON of DRAM.

App Direct Mode is really cool if… Continue reading

What is SNIA’s Persistent Memory Programming Model?

In this post The SSD Guy will discuss the SNIA Nonvolatile Memory (NVM) Programming Model, a framework that allows standard applications to take advantage of nonvolatile, or persistent, memory in any system that includes it.

This model is enormously important to the future of computing, yet few people even know that it exists.  It’s a fundamental change to the way that application programs access storage that will have significant ramifications to computer architecture and performance over the long term.

Here’s why: The industry is moving towards larger-scale systems that mix persistent memory with standard DRAM into a single memory address space.  Persistent memory has an advantage over volatile DRAM, since it maintains data after power is removed or lost.  Because of this, certain application programs will want to know which memory is volatile and which is persistent, and to take advantage of whatever persistent memory the system might provide.  I say “larger-scale” systems because small systems often combine… Continue reading

Are SSDs Approaching Price Parity with HDDs?

A recent Storage Newsletter article argues that SSD prices are approaching HDD prices, and that the gap has narrowed to only a 2.7-times difference.

Upon closer inspection, though, the reader will note that this is only true at lower capacities.  The narrowing price gap at lower capacities has always existed in this market.  The SSD Guy was making that argument back in 2007!

This post’s graphic shows a chart from the first report ever published by Objective Analysis over a decade ago: The Solid State Disk Market – A Rigorous Look.

The point of this chart was to illustrate that, at low capacities, SSDs are cheaper, while at higher capacities HDDs provide lower-priced storage.

The concept is simple: It’s uneconomical for an… Continue reading
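The crossover in that chart can be sketched as two price lines: a large fixed mechanical cost plus a tiny per-GB cost for HDDs, and a small fixed cost plus a larger per-GB cost for SSDs. All the dollar figures below are hypothetical, chosen only to show the shape of the argument:

```python
# Hypothetical price models (made-up dollars, for illustration only).
HDD_BASE, HDD_PER_GB = 40.0, 0.02   # big fixed cost, cheap capacity
SSD_BASE, SSD_PER_GB = 5.0, 0.10    # small fixed cost, costly capacity

def hdd_price(gb):
    return HDD_BASE + HDD_PER_GB * gb

def ssd_price(gb):
    return SSD_BASE + SSD_PER_GB * gb

# The crossover capacity is where the two lines meet:
#   HDD_BASE + HDD_PER_GB * c  =  SSD_BASE + SSD_PER_GB * c
crossover_gb = (HDD_BASE - SSD_BASE) / (SSD_PER_GB - HDD_PER_GB)
print(f"SSD is cheaper below ~{crossover_gb:.0f} GB")

for gb in (64, 1000):
    cheaper = "SSD" if ssd_price(gb) < hdd_price(gb) else "HDD"
    print(f"{gb} GB: {cheaper} wins")
```

With these assumed numbers the SSD wins below roughly 438 GB and the HDD above it; the real crossover point moves every year, but the two-line shape of the comparison does not.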

A New Spin on Memcache

Data centers that use centralized storage, SANs or NAS, sometimes use servers to cache stored data and thus accelerate the average speed of storage. These caching servers sit on the network between the compute servers and storage, using a program called memcached to replicate a portion of the data stored in the data center’s centralized storage. Under this form of management more-frequently-used data presents itself faster since it has been copied into a very large DRAM in the memcached server.

Such systems have been offset over the past five or more years thanks to the growing availability of high-speed enterprise SSDs at an affordable price. Often direct-attached storage (DAS) in the form of an SSD within each server can be used to accelerate throughput. This can provide a considerable cost/performance benefit over the memcached approach, since DRAM costs about 20 times as much as the flash in an SSD. Even though the DRAM chips within the memcached server run about three orders of magnitude faster than a flash SSD, most of that speed is lost because the DRAM communicates over a slow LAN, so the DAS SSD’s performance is comparable to that of the memcached appliance.

There’s a catch to this approach, since the DAS SSD must be… Continue reading

SSDs Need Controllers with More, NO! Less Power

The Storage Developer Conference in September gave a rare glimpse into two very different directions that SSD architectures are pursuing.  While some of the conference’s presentations touted SSDs with increasing processing power (Eideticom, NGD, Samsung, and ScaleFlux) other presentations advocated moving processing power out of the SSD and into the host server (Alibaba, CNEX, and Western Digital).

Why would either of these make sense?

A standard SSD has a very high internal bandwidth that encounters a bottleneck as data is forced through a narrower interface.  It’s easy to see that an SSD with 20+ NAND chips, each with an 8-bit interface, could access all 160 bits simultaneously.  Since there’s already a processor inside the SSD, why not open it to external programming so that it can perform certain tasks within the SSD itself and harness all of that bandwidth?
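The internal-versus-interface bandwidth gap can be sketched numerically. The chip count and bus width come from the paragraph above; the per-pin transfer rate and the host-link speed are assumptions chosen purely for illustration:

```python
# Rough sketch of the bandwidth bottleneck inside an SSD.
NAND_CHIPS    = 20     # "20+ NAND chips" from the post
BITS_PER_CHIP = 8      # each chip has an 8-bit interface
PIN_RATE_MTPS = 400    # assumed 400 MT/s per pin (illustrative)

# All channels read at once: the SSD's aggregate internal bandwidth.
internal_bits = NAND_CHIPS * BITS_PER_CHIP           # 160 bits at a time
internal_gbps = internal_bits * PIN_RATE_MTPS / 1e3  # gigabits per second

# The host interface is far narrower, e.g. a PCIe 3.0 x4 link is
# roughly 32 Gb/s raw, so much of the internal bandwidth is stranded.
HOST_LINK_GBPS = 32.0

print(f"internal:  {internal_gbps:.0f} Gb/s")
print(f"host link: {HOST_LINK_GBPS:.0f} Gb/s")
print(f"bottleneck ratio: {internal_gbps / HOST_LINK_GBPS:.1f}x")
```

Even with these modest assumptions the drive can move data internally about twice as fast as it can ship it to the host, which is the case for doing some of the computing in place.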

Example tasks would include… Continue reading