Solving SSD Power Spike Issues
SSD spec sheets might lead you to believe that power is just not an issue. For example, Samsung lists the power use for a 512GB 830 SSD at 0.127W (typical) for “Active Power Use”. This implies very low demands on the system power supply.
If you do some more research, you find that the peak power usage is a lot higher. AnandTech, in a review article, reports sequential write power draw at 5.14W and random write power draw at 5.8W. Because 2.5” SSDs use the 5V power rail exclusively, this works out to more than 1 Amp for a single drive (5.8W / 5V = 1.16A).
The situation seems to be getting worse. Newer SSDs are drawing more power. I saw a note for a SanDisk SSD that quotes 1.6A peak on the 5V rail.
This issue is not restricted to SSDs in a conventional HDD form factor, but extends to all flash drives. I talked with a PCIe vendor whose product’s power consumption went well past the 25W power allotment for a PCIe slot. By well past, I mean over 100W peak. This was in a model that supported write speeds well above 2GB/sec. The write speed of PCIe SSDs is roughly proportional to their power envelope.
A Real-Life Example
My first test system was configured with 24 SSDs (16 Samsung 830 128GB and 8 Crucial M4 256GB) connected in four 6-bay passive backplanes. The power supply was a 350W Antec Earthwatts.
350W seemed like enough, and the system would come up with each drive functional. But when you tried to use all of the drives at once, random IO errors started showing up. These errors included both read and write data errors, plus the occasional SCSI hang.
It turns out the 350W Antec was only rated for 10 Amps on the 5V rail. So the next try was a 600W ThermalTake. The 5V rail was now rated for 18 Amps. Testing showed that things were better, but still flaky.
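To see why even 18 Amps fell short, a quick budget helps. The sketch below assumes every drive peaks at the 5.8W random-write figure AnandTech reported, and that all 24 drives peak at the same moment. Neither assumption is guaranteed for this mixed Samsung/Crucial array, and the function names are just illustrative.

```python
# Rough worst-case budget for the 5V rail.
# ASSUMPTION: every drive peaks at 5.8W (the AnandTech random-write figure)
# and all drives peak simultaneously.
RAIL_VOLTS = 5.0
PEAK_WATTS_PER_DRIVE = 5.8
DRIVE_COUNT = 24


def peak_rail_amps(drives, watts_per_drive=PEAK_WATTS_PER_DRIVE, volts=RAIL_VOLTS):
    """Current the rail must deliver if every drive peaks at once."""
    return drives * watts_per_drive / volts


def headroom_amps(rail_rating_amps, drives=DRIVE_COUNT):
    """Positive means spare capacity; negative means the rail is overcommitted."""
    return rail_rating_amps - peak_rail_amps(drives)


if __name__ == "__main__":
    for name, rating in [("350W Antec", 10), ("600W ThermalTake", 18)]:
        print(f"{name}: rated {rating}A, headroom {headroom_amps(rating):+.1f}A")
```

Under those assumptions the array can ask for about 27.8A, so both the 10A and the 18A rails come out well short on paper, which fits the symptoms.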
Next, I got out an old oscilloscope and hooked it up to the 5V line just behind a drive bay. When the drives were quiet, the 5V rail was right where it was supposed to be with no noise. Write to the drives and a mess of digital noise shows up with peak-to-peak values as high as 0.4V. This noise had both low and high frequency components. It was obvious that power was still the problem.
So the next step was to upgrade the power supply again, this time to a Corsair 700W with a 28 Amp 5V rail. In addition, 6,600 µF decoupling capacitors were added to the back of each SSD bay. The result was clean operation of all drives. The oscilloscope showed the line noise lessened, but it was still there. At least the capacitors were effective at killing off the high-frequency components. Given a choice, I think the capacitors should be bigger. 22,000 µF 6V capacitors are available, but my local store did not carry these.
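For a feel of what a bigger capacitor buys, here is a rough estimate using the ideal-capacitor relation dV = I * dt / C. It ignores ESR, wiring inductance and the supply's own response, so treat it as a sketch only. The 1.16A load step (one drive at 5.8W) and the 0.25V allowed droop (5% of 5V) are my assumptions, not measurements from the bench.

```python
# Ideal-capacitor estimate: how long can a capacitor alone carry a load
# step before the rail sags by a given amount?
# ASSUMPTIONS: ideal capacitor (no ESR or inductance), 1.16A load step,
# 0.25V allowed droop. None of these were measured.
LOAD_STEP_AMPS = 1.16
ALLOWED_DROOP_VOLTS = 0.25


def droop_volts(load_amps, duration_s, capacitance_f):
    """Voltage sag if the capacitor alone supplies load_amps for duration_s."""
    return load_amps * duration_s / capacitance_f


def holdup_seconds(load_amps, allowed_droop_v, capacitance_f):
    """Time the capacitor can carry load_amps before sagging by allowed_droop_v."""
    return capacitance_f * allowed_droop_v / load_amps


if __name__ == "__main__":
    for microfarads in (6_600, 22_000):
        t = holdup_seconds(LOAD_STEP_AMPS, ALLOWED_DROOP_VOLTS, microfarads * 1e-6)
        print(f"{microfarads} uF: about {t * 1e3:.2f} ms of hold-up")
```

Hold-up time scales linearly with capacitance, so the 22,000 µF part would ride through a load step about 3.3 times longer than the 6,600 µF one.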
If your 5V supply is maxed out, but you have plenty of spare current at 12V, a trick I have been able to use is to convert 12V down to 5V with a part such as Current Logic’s 30A/150W 12/24V to 5V step-down DC/DC converter.
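Before leaning on the 12V rail, it is worth checking how much current the converter will pull from it. The sketch below assumes 90% conversion efficiency, which is a placeholder and not a datasheet value for any particular part.

```python
# Estimate 12V input current for a step-down converter feeding a 5V load.
# ASSUMPTION: 90% efficiency is a placeholder, not a datasheet figure.
def converter_input_amps(output_amps, v_in=12.0, v_out=5.0, efficiency=0.90):
    """Current drawn from the input rail to deliver output_amps at v_out."""
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in (0, 1]")
    return output_amps * v_out / (v_in * efficiency)


if __name__ == "__main__":
    # Full 30A / 150W output on the 5V side.
    print(f"12V input current: {converter_input_amps(30):.1f}A")
```

At the full 30A output and the assumed efficiency, the converter needs a little under 14A from the 12V rail, so that rail needs real spare capacity too.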
Servers Have Similar Issues
The test above was performed on a storage array – what about converting servers from HDDs to SSDs?
My bench chassis uses desktop power supplies. In looking at server power supply models from SuperMicro and Zippy, the 5V rail seems to be much larger. For example, the 900W redundant supply that ships with the 24-drive 2U chassis from SuperMicro has a 75 Amp 5V rail. A 500W Zippy redundant model has a 50 Amp 5V rail. I guess server supplies are more targeted at large numbers of drives whereas desktop chassis are more interested in 12V power for CPUs and video cards.
Meanwhile, most redundant server power supplies have migrated to a single 12V output. The servers use power distribution units (PDUs) to reduce this to 3.3V and 5V for peripherals. Most of the PDUs are rated at 45A on the 5V rail, which is enough for 24 SSDs, but perhaps marginal for 32.
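The "enough for 24, marginal for 32" judgment can be checked with simple division. This sketch compares the per-drive share of a 45A rail against the 1.6A SanDisk peak quoted earlier. Treating 1.6A as the worst case for every drive is an assumption.

```python
import math

# Per-drive share of a 45A 5V rail versus an assumed worst-case drive peak.
# ASSUMPTION: 1.6A (the SanDisk figure) applies to every drive.
PDU_RAIL_AMPS = 45.0
WORST_CASE_DRIVE_AMPS = 1.6


def amps_per_drive(rail_amps, drives):
    """Current available to each drive if the rail is shared evenly."""
    return rail_amps / drives


def max_drives(rail_amps, peak_amps_per_drive):
    """Largest drive count the rail can carry with every drive at peak."""
    return math.floor(rail_amps / peak_amps_per_drive)


if __name__ == "__main__":
    for count in (24, 32):
        share = amps_per_drive(PDU_RAIL_AMPS, count)
        verdict = "ok" if share >= WORST_CASE_DRIVE_AMPS else "short"
        print(f"{count} drives: {share:.2f}A each ({verdict} at 1.6A peak)")
    print("max drives at 1.6A:", max_drives(PDU_RAIL_AMPS, WORST_CASE_DRIVE_AMPS))
```

With 24 drives each one gets about 1.88A, which clears the 1.6A peak. With 32 the share drops to about 1.41A, which covers the 1.16A figure but not the 1.6A one.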
Regardless, with servers, you are still dealing with high current devices. As such, trying to run 24 or 32 drives off of a single power lead is probably a very bad idea. The use of extra decoupling capacitors is probably overkill, but might still be good engineering, especially considering the random failure symptoms.