Dealing with a Data Shortage

For the past several years presenters at data-oriented conferences have quoted forecasts for data growth that taxed their vocabularies.  Exabytes gave way to zettabytes, and then yottabytes, and these were often plotted on standard linear charts like the one below, with the inevitable result that the curve always starts out slow and takes off like a rocket as it rises to the right.

Chart of global data creation over time, according to a popular forecast. The line starts off staying in the bottom of the chart, but it curves upwards as it moves right, and is at about 45 degrees at the right side, as data creation really takes off.

This approach to speaking got so overused that the SSD Guy blog stopped quoting these numbers because they became the preface of every keynote at every conference throughout the course of the year!

But what the keynoters and other presenters seemed to miss was the fact that data growth is actually slowing.  Here’s the same chart, but replotted on semilogarithmic axes:

The same data plotted in a semilogarithmic format. Now the line is pretty steady on the left side, but its angle decreases towards the right as growth slows.

I should explain that on a semilogarithmic chart constant growth becomes a straight line.  You’ll notice that the line in this chart isn’t straight – it tapers off.  Growth is expected to slow over time.
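For readers who want to check the straight-line claim, here is a minimal sketch in Python. Both series are hypothetical numbers invented for illustration, not the forecast in the chart. It takes the base-10 logarithm of each series and looks at the year-to-year steps: equal steps mean a straight line on a semilogarithmic chart, and shrinking steps mean the line tapers off.

```python
import math

def log10_steps(values):
    """Return the year-to-year differences of log10(values)."""
    logs = [math.log10(v) for v in values]
    return [b - a for a, b in zip(logs, logs[1:])]

# Hypothetical series: 25% growth every year (a constant growth rate).
constant = [100 * 1.25 ** year for year in range(6)]

# Hypothetical series: the growth rate starts at 25% and shrinks each year.
rates = [0.25, 0.20, 0.15, 0.10, 0.05]
slowing = [100.0]
for rate in rates:
    slowing.append(slowing[-1] * (1 + rate))

constant_steps = log10_steps(constant)
slowing_steps = log10_steps(slowing)

# Equal steps: a straight line on a semilog chart.
print([round(step, 4) for step in constant_steps])
# Shrinking steps: the line bends toward horizontal.
print([round(step, 4) for step in slowing_steps])
```

Each step in the first list equals log10(1.25), which is why constant growth plots as a straight line; the second list gets smaller every year, which is the tapering seen in the chart.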

While the industry could breathe a sigh of relief, knowing that data is not growing uncontrollably, there is another concern that should be very worrisome to industry leaders: While data may not keep growing at today’s heady rate, the devices used to store this data continue to increase their capacity at a constant growth rate that is slightly higher than the current pace of data growth.  Not only that, but HDD and SSD capacity growth is not expected to slow down over time.

This means that the number of devices that will be needed to store data should slowly decline over the years.  Given enough time, the entire amount of data created in the world should be able to fit within a single SSD or HDD.  This will present a serious issue to manufacturers of these devices.

The chart below extrapolates HDD capacity alongside data growth.  Where the paths intersect lies a Singularity: The point at which only a single HDD will be needed to store all of the world’s data.

Same semilogarithmic chart, but with another line, labeled HDD capacity, that begins orders of magnitude below the data line and grows steadily to intersect. The intersection is labeled Data Singularity.
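The crossover arithmetic can be sketched in a few lines of Python. Everything numeric here is an assumption made for illustration: the starting data volume and both growth rates are hypothetical, and only the 20-terabyte drive size comes from this post. The sketch also holds data growth constant, a simplification that pushes the crossover later than a slowing curve would. Setting data × (1 + g_data)^t equal to capacity × (1 + g_cap)^t and solving for t gives the formula in `years_to_singularity`.

```python
import math

def drives_needed(data_bytes, drive_bytes, data_growth, capacity_growth, years):
    """Drives required after `years`, with both quantities compounding annually."""
    data = data_bytes * (1 + data_growth) ** years
    capacity = drive_bytes * (1 + capacity_growth) ** years
    return data / capacity

def years_to_singularity(data_bytes, drive_bytes, data_growth, capacity_growth):
    """Years until one drive holds everything, or infinity if capacity never catches up."""
    if data_bytes <= drive_bytes:
        return 0.0
    if capacity_growth <= data_growth:
        return math.inf
    ratio = (1 + capacity_growth) / (1 + data_growth)
    return math.log(data_bytes / drive_bytes) / math.log(ratio)

# Hypothetical inputs, chosen only to exercise the formula.
DATA_BYTES = 1e23       # assumed: 100 zettabytes of data today
DRIVE_BYTES = 2e13      # 20 terabytes per drive
DATA_GROWTH = 0.20      # assumed: 20% per year
CAPACITY_GROWTH = 0.25  # assumed: 25% per year

t = years_to_singularity(DATA_BYTES, DRIVE_BYTES, DATA_GROWTH, CAPACITY_GROWTH)
print(f"Years to singularity: {t:.0f}")
print(f"Drives needed then: {drives_needed(DATA_BYTES, DRIVE_BYTES, DATA_GROWTH, CAPACITY_GROWTH, t):.3f}")
```

With capacity growing only slightly faster than data, the gap closes slowly, so the computed date is very sensitive to the two assumed rates.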

This is concerning enough to have even raised the attention of government agencies due to security concerns.  What if the drive fails?  What if it is stolen?

A good example of the results of this phenomenon already exists.  The chart below shows combined HDD quarterly unit shipments for Seagate and Western Digital for the past decade.

Stacked area chart of Seagate's and WDC's quarterly HDD unit shipments, 2013-2022. The total declines from 120 million units in 2013 to roughly 30 million at the end of 2022.

The downward trend does not stem from SSDs replacing HDDs, as many assume.  SSD unit shipments aren’t growing fast enough to compensate.  Instead, HDD capacities, now surpassing 20 terabytes, are becoming too large for the data that they store.  This is the leading cause of the great consolidation that has occurred in the HDD market, and it has taken its toll on DRAM, too.

What can be done about this?  The answer is that the storage industry must rally together to encourage the world’s population to increase its data production over the long term, doing whatever it takes to consume more storage.  There is no other way to prevent the industry from collapsing.

To this end a new organization has been formed called “The Committee to Foster Continuing Data Growth.”  It’s still too new to have a website, but this committee hopes to convene its first global summit in Timbuktu within a year.  Be on the watch for this meeting, which is tentatively scheduled for 1 April, 2024.

One thought on “Dealing with a Data Shortage”

  1. Wouldn’t new technologies like LLMs, ChatGPT, et al. spur the growth of data creation at a rapid pace? Every decade a new paradigm assumes this responsibility and fuels data growth. Please correct me if I am wrong.
