There have been numerous changes to SSDs since they moved into the mainstream 15 years ago: controllers have delivered rising, then falling, endurance levels, and have offered greater, then lesser, degrees of autonomy. What has been missing is any ability for the system to determine the level of performance that the SSD provides.
Recently Kioxia, the company formerly known as Toshiba Memory, announced a new initiative called “Software-Enabled Flash” that aims to provide a consistent interface between software and SSDs, one that allows the software to choose the level of involvement it wants to have in the SSD’s behavior.
First, let’s talk a little bit about the problem. NAND flash memory requires significant management. The whole concept of NAND flash is that it’s OK for it to be phenomenally difficult to work with as long as it’s the cheapest memory available. Here’s a list of a few of the reasons that NAND flash is hard to work with:
- It requires error correction
- Bits get stuck (either high or low) if you write to them too many times
- It must be erased every time new data is to be written to it
- Writes must be performed one full page at a time, and page sizes differ from chip to chip
- Erases are performed at the block level, with each block containing hundreds or even thousands of pages
- Erases and writes take milliseconds, while reads take only microseconds to set up and tens of nanoseconds to stream out
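The constraints above can be sketched in a few lines of Python. This is a toy model, not any real chip: the page and block sizes are illustrative round numbers, and error correction and timing are ignored.

```python
# Toy model of NAND flash constraints: page-granularity writes,
# block-granularity erases, and erase-before-rewrite.
# Sizes are illustrative, not taken from any real chip.

PAGES_PER_BLOCK = 4  # real 3D NAND blocks hold hundreds or thousands

class Block:
    def __init__(self):
        self.pages = [None] * PAGES_PER_BLOCK  # None = erased (writable)

    def write_page(self, index, data):
        # A programmed page cannot be rewritten until the whole
        # block is erased -- the core constraint of NAND flash.
        if self.pages[index] is not None:
            raise RuntimeError("page not erased; erase the block first")
        self.pages[index] = data

    def erase(self):
        # Erase works only on the entire block, never a single page.
        self.pages = [None] * PAGES_PER_BLOCK

blk = Block()
blk.write_page(0, b"data")
try:
    blk.write_page(0, b"new")   # in-place update is illegal
except RuntimeError as e:
    print("rejected:", e)
blk.erase()                      # must erase the whole block to reuse one page
blk.write_page(0, b"new")        # now the rewrite succeeds
```

Everything an SSD controller does is, one way or another, a workaround for rules like these.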
The list goes on, but the issues above are sufficient to give the reader a pounding headache!
There’s little wonder that SSDs are almost all managed by controllers, and that the programming of these controllers is an art that is most often left to a team of specialists at the SSD manufacturer. Few others would be interested in developing the high level of expertise required to master NAND flash management.
(If you want to go into more depth on how controllers work, there’s a series you can read on The SSD Guy blog called: How Controllers Maximize SSD Life)
But certain Internet data centers have determined that it pays to closely manage the flash within the SSD. This is because the SSD’s controller operates completely asynchronously to the server’s application programs, and this can randomly slow down data access. I like to compare it to a revolving door, which rotates at a set speed. People have to pace themselves to the revolving door to get into the building. If you have 100 people waiting to enter the building, then the revolving door will slow them down compared to a door that they can pull open at their convenience.
In the case of an SSD, though, it’s even worse, because the revolving door occasionally stops, and nobody ever knows when that might happen! This tends to happen when a garbage collection routine bumps into a host write command.
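A toy simulation makes the effect concrete. The latencies below are invented round numbers, not measurements from any device: writes normally complete quickly, but any write that arrives while garbage collection holds the die must wait out the pause.

```python
# Illustrative simulation of the "revolving door that stops":
# host writes normally take ~100 us, but any write arriving while
# garbage collection holds the die waits until the GC pause ends.
# All timings are made-up round numbers, not measurements.

WRITE_US = 100       # normal host write latency (illustrative)
GC_PAUSE_US = 3000   # one garbage-collection pause (illustrative)
GC_START_US = 450    # when GC happens to kick in

def write_latency(arrival_us):
    gc_end = GC_START_US + GC_PAUSE_US
    if GC_START_US <= arrival_us < gc_end:
        # Stall until GC releases the die, then perform the write.
        return (gc_end - arrival_us) + WRITE_US
    return WRITE_US

# Ten writes arriving every 200 us: most are fast, but the ones
# that collide with GC see a latency spike of more than an order
# of magnitude.
for i in range(10):
    t = i * 200
    print(f"write at t={t:4d} us -> latency {write_latency(t):5d} us")
```

The application has no way to predict which writes will hit the spike, which is exactly the unpredictability that makes hyperscalers want control over GC scheduling.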
A lot of this is the result of HDD legacy: since computers didn’t already have a natural place to put NAND flash, it had to be fitted into an interface that was designed for something else. Most SSDs have used the HDD interface. Some companies tried (with varying degrees of success) to add NAND to the memory channel in a DIMM slot, with NVDIMMs remaining as others faded away, and today’s SSDs are largely migrating to NVMe over a PCIe interface, an interface that was originally designed to connect GPUs and math co-processors to the processor. Since NAND flash is not a natural fit for any of these interfaces, each has required some degree of shimming to make things fit together. It is this shimming that can negatively impact the overall system’s performance.
Kioxia puts it boldly by stating that SEF: “Fundamentally redefines the relationship between host and solid-state storage.”
If the server takes more control of the SSD then it can schedule the SSD’s operations to synchronize them with the application program’s needs. A team at Baidu decided to try that back in 2015 and developed a stripped-down SSD that provided higher performance when operating with software that was tuned to that SSD. This was the beginning of an approach that became the Open-Channel SSD architecture (also known as LightNVM) that was adopted by a number of other hyperscale data centers. An Open-Channel SSD allows the server’s host processor to manage a number of housekeeping tasks like wear leveling, garbage collection, over-provisioning, and write coalescing (all defined in the series mentioned above.)
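As a sketch of what host-side management looks like, here is one of those housekeeping tasks, wear leveling, reduced to its core decision. The data structures are invented for illustration; this is not LightNVM’s actual interface.

```python
# Sketch of one task an Open-Channel SSD hands to the host:
# wear leveling. The host tracks erase counts per block and steers
# each new write to the least-worn free block, so wear spreads
# evenly instead of burning out hot blocks. Data structures and
# counts are hypothetical, not LightNVM's actual interface.

erase_counts = {0: 120, 1: 15, 2: 98, 3: 15, 4: 77}  # block -> erases
free_blocks = {1, 2, 4}

def pick_block_for_write():
    # Choose the free block with the fewest erases so far.
    return min(free_blocks, key=lambda b: erase_counts[b])

chosen = pick_block_for_write()
print("write steered to block", chosen)   # block 1: only 15 erases
free_blocks.discard(chosen)               # it's no longer free
```

In an Open-Channel design the host makes this kind of decision for every write, which is what lets it schedule around the application — at the cost of owning the bookkeeping.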
At this point we have two very different types of SSDs: Those that do everything internally, but that cannot be synchronized to the application, and those that can be synchronized to the application, but that require the host processor to manage most of the inner workings of the SSD. Wouldn’t it make sense to allow the programmer to choose which tasks to manage in the host, and which ones to allow the SSD to perform internally?
This is where Kioxia’s initiative comes in. Software-Enabled Flash (SEF) is an application programming interface (API) designed to allow application programs to take varying degrees of control over the performance of an SSD. This requires the support of a specialized SSD that Kioxia calls a “SEF unit” that communicates using the SEF API. If a piece of software wants to leave all of the management to the SSD, then it tells the SEF unit to perform all of the standard functions of an SSD. If another program wants to assume complete control (as it would with an Open-Channel SSD) then that program tells the SEF unit to do as little as possible. Other programs can pick & choose which SSD features to leave to the SEF unit, and which to manage by themselves.
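The pick-and-choose idea can be illustrated with a small configuration sketch. To be clear, all of the names below are invented for this example; they are not Kioxia’s actual SEF API.

```python
# Hypothetical illustration of SEF's pick-and-choose model (names
# invented for this sketch; NOT Kioxia's actual SEF API). Each
# management task is assigned either to the device or to the host.

TASKS = ("wear_leveling", "garbage_collection",
         "over_provisioning", "write_coalescing")

def configure(host_managed=()):
    """Return a task->owner map; any task not claimed by the host
    is left to the SEF unit, mimicking a conventional SSD."""
    for t in host_managed:
        if t not in TASKS:
            raise ValueError(f"unknown task: {t}")
    return {t: ("host" if t in host_managed else "device") for t in TASKS}

print(configure())                  # conventional SSD: device does all
print(configure(host_managed=TASKS))  # Open-Channel style: host does all
print(configure(host_managed=("garbage_collection",)))  # middle ground
```

The middle-ground case is the interesting one: the application schedules garbage collection around its own workload while leaving wear leveling and the rest to the drive.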
The figure below sketches out the inner workings of an SEF unit.
The DRAM is optional because it performs functions that may instead be performed in the server’s main memory. SSDs use DRAM for page tables (for wear leveling) and write buffers (for re-ordering and coalescing writes).
One big challenge caused by an Open-Channel SSD’s low level of management is that flash chips differ, not only from vendor to vendor, but from chip generation to chip generation. They are programmed differently from each other, have different architectural features, and so on. SEF includes a level of abstraction that hides these differences. They are managed within the SEF unit, with the flash and SSD vendors designing their SEF units to execute the API. The block above called “Generation Specific Program Logic” is involved with this task.
The “Micro-Controller Logic” block handles page programming, lifetime extension (wear), ECC, and defect management (bad block management), and supports multiple prioritized and weighted queues to each NAND flash die. SEF supports copy offload to help make garbage collection more efficient: an entire block of flash data can be copied into the host’s memory and modified before being written back into the flash. This minimizes write amplification, which stems from the fact that erase blocks are large and pages are small, so valid data must often be moved out of a block before that block can be erased.
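A back-of-the-envelope calculation shows how write amplification falls out of the block/page size mismatch. The block size below is illustrative, and the arithmetic is the standard ratio of total flash writes to host writes, not a figure from any particular drive.

```python
# Back-of-the-envelope write amplification during garbage collection.
# To reclaim a block, every still-valid page in it must be rewritten
# elsewhere before the erase, so the flash absorbs more writes than
# the host issued. PAGES_PER_BLOCK is an illustrative value.

PAGES_PER_BLOCK = 256

def gc_write_amplification(valid_pages_in_victim):
    # Reclaiming the block frees (PAGES_PER_BLOCK - valid) pages for
    # host data, but costs `valid` extra internal relocation writes.
    freed = PAGES_PER_BLOCK - valid_pages_in_victim
    total_writes = freed + valid_pages_in_victim  # host + relocation
    return total_writes / freed

# A victim block that is 50% valid doubles the write traffic:
print(gc_write_amplification(128))   # -> 2.0
# A 75%-valid victim quadruples it:
print(gc_write_amplification(192))   # -> 4.0
```

Copy offload attacks the relocation half of this traffic: the host can fold its own updates into the relocation pass instead of issuing separate read-modify-write cycles.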
Below we see a rough schematic of the software interface. As you can see, the User Applications can either be updated to communicate directly with the device driver through the SEF API or they can continue to use conventional File System calls to access the device. This is all open source, and Kioxia has committed to provide sample code and libraries.
Kioxia demonstrated a ZNS implementation of the technology at the Open Compute Summit in April. The company tells us that SEF today implements ZNS (Zoned Namespace), block, and flash-native drivers.
Today SEF exists only as an open-source API, with a release due this summer. There is no announced “SEF unit” hardware to support the initiative, but Kioxia indicates that something will be announced shortly.
If you want further details you can visit Kioxia’s SEF microsite: https://SoftwareEnabledFlash.com