This post is the second part of a four part series in The SSD Guy blog to help explain Intel’s two recently-announced modes of accessing its Optane DIMM, formally known as the “Intel Optane DC Persistent Memory.”
The most difficult thing to understand about the Intel Optane DC Persistent Memory when used in Memory Mode is that it is not persistent. Go back and read that again, because it didn’t make any sense the first time you read it. It didn’t make any sense the second time either, did it?
Don’t worry. This is not really important. The difficulty stems from Intel’s marketing decision to call Optane DIMMs by the name “Intel Optane DC Persistent Memory.” Had they simply called them “Optane DIMMs” like everyone expected them to, there would have been far less confusion. That sentence above would have instead said that Optane DIMMs are not persistent when used in Memory Mode.
Readers would then have said: “Well, OK. But why use Optane, then, if it’s not persistent?” The answer is very simple: Optane DIMMs are enormous! Consider the fact that Samsung’s largest DRAM DIMM (a very costly one) is 128GB, and Intel’s smallest Optane DIMM is 128GB and should sell for a fraction of the price; this gives you very good reason to use Optane. Everybody wants more memory!
So why in the world is it not persistent? The answer is involved, but relatively simple to understand.
Optane cannot be used as the only memory in a system – it has to be accompanied by DRAM. This is because Optane doesn’t like to communicate with the processor the way that the processor likes to be communicated with. The module is pretty slow compared to DRAM for three reasons:
- The medium, 3D XPoint Memory, writes more slowly than it reads. Some say that a write takes three times as long as a read. If this chip were to communicate with the processor over a standard DDR4 interface then all reads would have to be slowed to the same speed as writes.
- Another difficulty is that 3D XPoint Memory wears out, so it has to use wear leveling. That means that address translation must be inserted into the critical timing path, slowing down every access, reads as well as writes.
- The third reason is one you probably never thought of: The data must be encrypted before it is stored and decrypted when it is read, further slowing the critical path. Many organizations worry that storage (HDDs, SSDs, and now NVDIMMs, including the Optane DIMM) will fall into the wrong hands making data available to evildoers. Those organizations would not use the Optane DIMM if it did not support data encryption. (Alert readers will object that this can only be an issue if the Optane DIMM is, in fact, persistent, and they’re right. I’ll explain that shortly.)
The solution is to do what all cell phones do, a technique very similar to the one that has been used for decades to manage data between DRAM and HDDs or SSDs. In a cell phone the processor can’t efficiently communicate with the NAND flash, so it moves lines of code and data from the flash into a DRAM and operates on them there. In Intel’s new Memory Mode the processor moves data back and forth between the Optane DIMM and the system’s DRAM, and only executes code or operates on data in the DRAM.
The Optane DIMM is paired with a DRAM that behaves as a cache, and, like a cache, it is invisible to the user. You heard that right – if you use the Optane DIMM in Memory Mode then your DRAM becomes inaccessible. A typical system might combine a 64GB DRAM DIMM with a 512GB Optane DIMM, but the total memory size will appear to the software as only 512GB. This is the same thing that cache memory does: The size of the cache is not added to the size of the DRAM, it’s simply invisible.

In either case the faster medium (the DRAM in this case) temporarily stores data that it copied from the slower medium (the Optane DIMM in this case), and the cache controller manages the data’s placement in a way that makes it appear that the Optane DIMM is as fast as the DRAM. At least, it appears that way most of the time. In those rare instances where the required data is not already in the DRAM, accesses slow down a lot, because the processor stops everything and moves data around: if necessary it copies modified DRAM data back into the Optane DIMM, and then it copies the missing data from the Optane DIMM into the DRAM. Such misses are rare (maybe 1-5% of accesses, depending on the software that’s being run), so the other 95-99% of the time the system runs at DRAM speeds. That’s close enough for most people.
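The arithmetic behind that “most of the time” claim is easy to sketch. The latencies below are illustrative assumptions, not Intel’s published figures; the point is just how strongly the hit rate dominates the average:

```python
# Illustrative sketch (assumed latencies, not Intel's numbers): the average
# access time of a DRAM cache sitting in front of an Optane DIMM.
DRAM_NS = 80                     # assumed DRAM access latency, in nanoseconds
OPTANE_NS = 350                  # assumed Optane access latency, in nanoseconds
MISS_PENALTY_NS = 2 * OPTANE_NS  # a miss may write a dirty line back, then fill

def effective_latency(hit_rate):
    """Average access time: hits run at DRAM speed, misses pay the penalty."""
    return hit_rate * DRAM_NS + (1 - hit_rate) * (DRAM_NS + MISS_PENALTY_NS)

for hit_rate in (0.95, 0.99):
    print(f"{hit_rate:.0%} hits -> {effective_latency(hit_rate):.0f} ns average")
```

With these assumed numbers, even a 95% hit rate keeps the average much closer to DRAM speed than to Optane speed, which is why the cache makes the slow medium look fast.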
If you want a really deep dive into this you can order The Cache Memory Book, which dissects all of the principles of caching. I happen to know, because I wrote it.
So let’s talk about persistence. Nothing has been written into persistent memory until it actually reaches the Optane DIMM. The software thinks that it’s writing to the Optane DIMM, but it’s actually writing into the DRAM cache. When there’s a surprise power outage the data in the DRAM cache vanishes. This is why Intel’s Memory Mode is not considered persistent. The data that was in DRAM waiting to be written into the Optane DIMM is lost. If some of the data has been persistently stored in the Optane DIMM but some hasn’t, and if nobody knows which data missed being written into the Optane DIMM, then all of the data is suspect. The easy answer is to say that none of the data persisted – just start over. That’s what existing DRAM-only systems assume, so it’s not an alien concept. If it’s in memory (DRAM or Optane) and the power is lost, then the data is lost as well.
So it’s not considered persistent.
I have been told that the processor memory controller could have been designed to flush all of the “Dirty” (new) data in the DRAM cache back into the Optane DIMM when power fails, thus making it fully persistent, but since the whole point of Memory Mode is to make existing software see a giant memory space without any modification, this was considered unnecessary.
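To make that write-back behavior concrete, here is a minimal Python sketch (all names and values hypothetical) of a DRAM cache holding “Dirty” lines in front of an Optane DIMM, including the flush-on-power-fail step that Memory Mode omits:

```python
# Minimal sketch of write-back caching as described above. This only
# models which data survives a power cut; sizes and names are hypothetical.
class DramCache:
    def __init__(self):
        self.lines = {}   # address -> (value, dirty_flag)

    def write(self, addr, value):
        # Software believes it is writing to the Optane DIMM, but the data
        # lands in the DRAM cache, merely marked dirty. Nothing is
        # persistent yet.
        self.lines[addr] = (value, True)

    def flush(self, optane):
        # The flush-on-power-fail the controller *could* have performed:
        # copy every dirty line back into the Optane DIMM.
        for addr, (value, dirty) in self.lines.items():
            if dirty:
                optane[addr] = value
                self.lines[addr] = (value, False)

optane = {0x10: "old"}       # contents of the persistent medium
cache = DramCache()
cache.write(0x10, "new")     # the update sits in volatile DRAM, dirty

# A surprise power outage *without* a flush: the dirty line vanishes,
# and the Optane DIMM still holds the stale value.
print(optane[0x10])
```

Without the flush the Optane DIMM keeps its stale contents, which is exactly why none of the data can be trusted after an outage.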
Anyone who wants to take advantage of the Optane Memory’s persistence will need to use it a different way, and that’s the subject of the next post. This mode is called App Direct Mode, and it not only supports persistence, but it also allows the user to access the DRAM as DRAM (without hiding it) and the Optane Memory as Persistent Memory, so a system with 64GB of DRAM and 512GB of Optane Memory will appear to have 64+512=576GB of some kind of memory.
Just to complicate things, the memories (DRAM and Optane) don’t both have to be entirely dedicated to either Memory Mode or App Direct Mode. The software can determine just how much of either memory type will operate in Memory Mode and how much will work in App Direct Mode.
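As a rough illustration of how that split is provisioned, Intel’s open-source ipmctl utility can assign a percentage of the Optane capacity to each mode (exact syntax may vary by tool version, and the commands require compatible hardware, so treat this as a config sketch rather than a recipe):

```shell
# Hypothetical provisioning sketch using Intel's ipmctl utility:
# put 60% of each Optane DIMM's capacity in Memory Mode and leave the
# remainder as App Direct. A reboot is needed before the goal takes effect.
ipmctl create -goal MemoryMode=60 PersistentMemoryType=AppDirect

# Inspect how capacity ended up divided between the two modes.
ipmctl show -memoryresources
```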
Is that confusing enough? I suspect that very few programs will manage the memory in both modes. At least, not for a very long time!
But at least you now understand that sentence at the top of this post: That Intel Optane DC Persistent Memory when used in Memory Mode is not persistent!
As I said in 2015 when I published the industry’s first 3D XPoint forecast, Memory Mode should account for the bulk of Optane’s early sales, because it can be used with existing software with no modification whatsoever. Later on, when software that uses App Direct Mode becomes generally available, that should change, but this will take a number of years.
This four-part series, published in early 2019, explores each of Intel’s two modes to explain what they do and how they work in the following sections: