This post is the second of a two-part SSD Guy series outlining the nonvolatile DIMM, or NVDIMM. The first part explained what NVDIMMs are and how they are named. This second part describes the software used to support NVDIMMs (BIOS, operating system, and processor instructions) and discusses security issues.
Software Changes
Today’s standard software boots a computer under the assumption that the memory at boot-up contains random bits; this assumption had to change to support NVDIMMs. The most fundamental of these changes was to the BIOS (Basic Input/Output System), the code that “wakes up” the computer.
The BIOS is responsible for detecting all of the computer’s hardware and installing the appropriate drivers, after which it loads the bootstrap program from the mass storage device into the DRAM main memory. When an NVDIMM is used, the BIOS must understand that the data within the NVDIMM may already be valid and should not be overwritten, and that the NVDIMM may need some time to move data from the on-DIMM flash to the DIMM’s DRAM. In the event of a power failure the BIOS also assumes the responsibility of storing the state of all of the processor’s registers, along with the “dirty” lines of the processor’s cache, into the NVDIMM. The register state must then be restored at power-on.
But this is all housekeeping, and it doesn’t take full advantage of the NVDIMM’s speed edge over an SSD or HDD. More software had to be developed to prevent NVDIMM accesses from being bogged down by slow I/O routines that were written for HDDs and SSDs. When those routines were written they were significantly faster than the drives themselves; with NVDIMMs the opposite is true: the storage hardware (the NVDIMM) is significantly faster than the I/O routines accessing it.
The Storage Networking Industry Association (SNIA) has taken a leadership role in defining a standard software structure for communicating with a broad set of storage class memory (SCM) types, which SNIA calls “Persistent Memory”. This allows operating system developers to create standard calls to access the persistent memory of an NVDIMM, and those standard calls in turn allow applications programmers to write portable software that can take advantage of persistent memory across a number of platforms without any custom redesign.
Last June SNIA released a revised version of its NVM Programming Model, which adds operating system calls that applications can use to access local or remote persistent memory either through the block I/O protocol or as memory. Most importantly, the application program knows which memory accesses are persistent and which ones are not.
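To make the “as memory” path concrete, here is a minimal sketch, assuming a Linux system with a DAX-capable filesystem mounted on an NVDIMM; the file path and region size are hypothetical, and the example uses plain POSIX mmap/msync rather than any particular persistent memory library:

```c
/* Minimal sketch: mapping persistent memory into an application's
 * address space, as the "as memory" path of the programming model allows.
 * The file path and size are hypothetical; a real system would use a
 * file on a DAX-aware filesystem backed by an NVDIMM. */
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

#define PM_SIZE (4 * 1024 * 1024)   /* 4 MiB region, for illustration */

int main(void)
{
    int fd = open("/mnt/pmem/example.dat", O_CREAT | O_RDWR, 0644);
    if (fd < 0) { perror("open"); return 1; }
    if (ftruncate(fd, PM_SIZE) != 0) { perror("ftruncate"); return 1; }

    /* Map the persistent region directly; loads and stores now reach
     * the NVDIMM without going through the block I/O stack. */
    char *pm = mmap(NULL, PM_SIZE, PROT_READ | PROT_WRITE,
                    MAP_SHARED, fd, 0);
    if (pm == MAP_FAILED) { perror("mmap"); return 1; }

    strcpy(pm, "hello, persistent memory");

    /* Ask that the stores reach persistence.  A persistent-memory-aware
     * library could flush the affected cache lines directly instead. */
    if (msync(pm, PM_SIZE, MS_SYNC) != 0) perror("msync");

    munmap(pm, PM_SIZE);
    close(fd);
    return 0;
}
```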
Processor Architecture Upgrades
The use of NVDIMMs not only impacts software, but it also has repercussions on processor architecture. In today’s systems a power failure is treated as a total loss. The contents of DRAM are lost along with any dirty cache lines and the contents of all of the processor’s registers and write buffers. With NVDIMMs this doesn’t have to be the case.
Intel added new instructions to its standard instruction set to help move the processor’s entire state into an NVDIMM in the event of such an emergency. This required the addition of only three new instruction classes which are documented in Intel’s August 2015 update of the Intel Architecture Instruction Set Extensions Programming Reference.
All of Chapter 10 of this reference is devoted to these “Memory Instructions” even though only three new instructions were added:
- CLFLUSHOPT: An “optimized” flush of a dirty cache line that invalidates that cache line
- CLWB: The same thing but without invalidation
- PCOMMIT: “Commits” an NVDIMM’s DRAM (or the integrated memory controller’s write buffers) to its back-up NAND flash
The first two new instructions address the concern that cached data may be more current than its main-memory equivalent, by ensuring that the data stored in the NVDIMM after a power failure is the most current copy.
The third instruction allows the processor to initiate the process of copying an NVDIMM-N’s DRAM contents to its NAND, rather than relying solely on hardware control. This instruction would not be used with those NVDIMMs that are built exclusively of nonvolatile memory and do not use a DRAM.
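As a rough illustration of how these instructions appear in code, here is a sketch using the compiler intrinsics from immintrin.h (built with CLWB support enabled, e.g. -mclwb on GCC or Clang); the function and variable names are hypothetical, and this shows the general pattern rather than any particular library’s persistence routine:

```c
#include <immintrin.h>
#include <stdint.h>

/* Store a value into NVDIMM-backed memory and push it toward persistence. */
void persist_counter(uint64_t *pm_counter, uint64_t value)
{
    *pm_counter = value;       /* store into the NVDIMM-backed location       */
    _mm_clwb(pm_counter);      /* CLWB: write back the dirty cache line
                                  without invalidating it                     */
    _mm_sfence();              /* order the write-back before anything that
                                  depends on the data being persistent        */
    /* On platforms that implemented PCOMMIT, that instruction would follow
     * here to commit memory-controller write buffers to the NVDIMM. */
}
```

CLFLUSHOPT could be substituted for CLWB when the line will not be read again soon, since, as noted above, it also invalidates the cache line.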
Security Issues
Since NVDIMMs are persistent they are subject to the same security risks as HDDs and SSDs: Someone might steal your HDD, SSD, or NVDIMM and all the data that’s on it.
Over the past few years Self-Encrypting Drives (SEDs) have become widely available, providing SSD and HDD users a way to ensure that their data will not fall into the wrong hands even if the drive is stolen. So far NVDIMMs don’t offer such security.
SNIA is, once again, deeply involved in this area, and is currently working on a white paper to discuss the issue and potential solutions. As this post was being written the white paper consisted of little more than an outline and a table laying out the “Threat Model”, as SNIA seeks input from its members as well as from the general public.
Unlike SSDs and HDDs, an NVDIMM has such low latency that standard encryption engines are difficult to use, since they would add significant latency to the device. Over time, alternate approaches to supporting NVDIMM encryption are certain to be developed.
Another potential problem arises from the fact that an NVDIMM resides within one fixed address range while the system’s standard DRAM resides in another. This works against hacking countermeasures like “address space layout randomization” (ASLR), which operating system kernels use to randomize where code is stored in memory; if the code moves around in memory it is very difficult for a hacker to insert malicious changes. Since the NVDIMM resides at a fixed address range, its addresses will be less random than they would be in a system without NVDIMMs.
For More Information
Objective Analysis is deeply involved in the NVDIMM market as well as with SSDs and memory chips. We offer our clients considerable insight into these products from the perspectives of both marketing and technology. We specialize in helping our clients plan for the future to achieve success.
If your company would like to take advantage of our insight and deep understanding of this market please visit the Objective Analysis website to learn the business models we use to support our clients.
This makes me smile. Ever since the introduction of the Motorola 68010, I have been waiting for a revolution in the way OS architects and application programmers look at storage.
I find the concept of “storage” to be dishonest. Filesystems and NAS and the expectations surrounding them are part of this lie.
There is no such thing as storage. Everything is in fact NUMA memory, which should be designed for, bearing in mind physical attributes (i.e. block sizes), latency (read, write, synchronous write, unison, eventual consistency…), and reliability statistics for its media. Every computer with ‘storage’ today is in fact a NUMA system in denial, and I really look forward to the direction (honesty) that this technology will take OS and application developers in.
Thanks for the insights William!
The industry has moved a very long way since the first persistent memories (core) were used. Too bad these moves were in the wrong direction for persistent memory support.
I hope you’re right, and programs move back to a more rational model. Time will tell.
Jim
Jim,
When do you expect the 3D XP DIMM to begin sampling? Also, what is the status of Diablo Technologies and their Memory1 DIMM?
Thanks, Geoff
Geoff,
Intel promises to deliver the Optane DIMM in 2018, but time will tell. New technologies are notoriously challenging to bring into production on schedule! We have heard that certain hyperscale companies have seen samples, but I am not certain that these are fully functional.
As for Diablo, I have been asked that question before, and have reached out to a few people without success. I’ll come back and update this reply if/when I hear back.
Thanks for the comment,
Jim
So far it’s been two months, and there has been no reply from Diablo’s website, their former PR firm, or their president on LinkedIn.
It’s looking pretty bleak.
On 15 December, 2017 the Ottawa Business Journal posted a story reporting that Diablo had filed for bankruptcy.
http://www.obj.ca/article/ottawa-based-diablo-technologies-files-bankruptcy
Thanks for the information, Jim.
It would be great if you could share the software testing tools available to test NVDIMM-N, such as:
SDRAM tests, flash tests, and NVDIMM controller tests.
Shashi,
I took your question to SNIA’s Persistent Memory and NVDIMM SIG (Special Interest Group) and received the following reply:
1) For performance testing and certain other tests the SIG uses EZFIO, which is a superset of FIO, for simplicity. This is an open-source tool. https://github.com/earlephilhower/ezfio
2) To look for CECC/UECC controller errors, there are DRAM ECC error registers on the NVDIMM that the host needs to interpret and decide how to handle. Other registers do the same for bad NVDIMM-N flash blocks.
3) If you want to determine whether any problems are occurring within the NVDIMM-N (for example: the controller is not responding while copying data from DRAM to the flash, the Automatic Data Recovery (ADR) is taking too much time after a system power loss, or the battery/supercaps have insufficient energy to complete a successful backup), either the BIOS or the system software will check for battery/supercap drain before “Arming” the NVDIMM. If there is not enough energy in the batteries to support a backup operation, the NVDIMM will not Arm, and if the NVDIMM is not Armed, the ADR sequence will not take place. The ADR sequence, including its timings, is initiated by the motherboard, so the NVDIMM has no control over the host but will report the status of a failed backup through its internal registers. The BIOS constantly checks these registers and will report warnings if any problem is detected.
4) To test the NVDIMM-N’s endurance, you can automate a “Backup and Restore” test by using the following procedure (a rough sketch of the software side appears after the list):
(1) Write a data pattern to the NVDIMM,
(2) Remove AC power using an external controllable power supply,
(3) Reapply AC power through the external controllable power supply, then
(4) Read and verify the data pattern
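As a rough sketch of the software half of that loop, steps (1) and (4) might look like the following; the device path, region size, and data pattern are all hypothetical, and the power-cycle steps (2) and (3) are assumed to be driven by the external controllable power supply:

```c
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

#define REGION_SIZE (1024 * 1024)            /* 1 MiB test region        */
#define PATTERN     0xA5A5A5A5A5A5A5A5ULL    /* illustrative data pattern */

static uint64_t *map_region(const char *path)
{
    int fd = open(path, O_RDWR);
    if (fd < 0) { perror("open"); return NULL; }
    void *p = mmap(NULL, REGION_SIZE, PROT_READ | PROT_WRITE,
                   MAP_SHARED, fd, 0);
    close(fd);                               /* mapping stays valid */
    return (p == MAP_FAILED) ? NULL : p;
}

int write_pattern(const char *path)          /* step (1) */
{
    uint64_t *pm = map_region(path);
    if (!pm) return -1;
    for (size_t i = 0; i < REGION_SIZE / sizeof(uint64_t); i++)
        pm[i] = PATTERN;
    msync(pm, REGION_SIZE, MS_SYNC);         /* push the pattern toward the NVDIMM */
    munmap(pm, REGION_SIZE);
    return 0;
}

int verify_pattern(const char *path)         /* step (4), after power returns */
{
    uint64_t *pm = map_region(path);
    if (!pm) return -1;
    int errors = 0;
    for (size_t i = 0; i < REGION_SIZE / sizeof(uint64_t); i++)
        if (pm[i] != PATTERN) errors++;
    munmap(pm, REGION_SIZE);
    printf("mismatched words: %d\n", errors);
    return errors;
}
```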
I hope that does the trick.
Jim