Measuring the Criticality of Data within the Global Datasphere

By David Reinsel

Data is at the heart of a digitally transformed business or life – it is the lifeblood of digital transformation. But that doesn’t mean that all data is critical, let alone life-critical.

The criticality of data within the Global Datasphere can be determined by a number of characteristics, the first of which is the impact on life itself. Also considered are the autonomous nature of the data’s use, the speed at which a transaction or response is required, and the frequency with which a given transaction is required. No single measure necessarily dictates the criticality of data; instead, it is the combination of these measures that helps to determine data criticality.

In DataAge 2025, we measured just how much data is critical in the Global Datasphere, and grouped it into three categories:

  • Potentially critical data is the set of data that has the potential to be critical or hyper-critical. It is 40% of the 2025 Global Datasphere. Examples of this data could come from autonomous vehicles, traffic/weather information, wearable data, intelligent assistants (always-listening devices) within the home, etc.
  • Critical data is data known to be necessary for the expected continuity of businesses and users’ daily lives. It is expected to grow less than 40% over the next decade, resulting in 17% of the 2025 Global Datasphere. Examples of this data include public services like transportation, utilities, and communication.
  • Hyper-critical data (or life-critical data) is data that has a direct and immediate impact on human health and well-being. It is expected to grow 54% per year over the next decade, resulting in 1% of the 2025 Global Datasphere, or 400GB per digitally connected person. Examples of this data would include biometric data that is monitored, analyzed, and used to operate life-critical devices like implanted insulin pumps or cardioverter defibrillators.
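To give a sense of what 54% annual growth means over a decade, the compounding can be worked out directly. This is only a sketch of the arithmetic: it assumes the rate holds constant for ten annual periods, and the function name `growth_multiple` is illustrative rather than anything taken from DataAge 2025.

```python
def growth_multiple(annual_rate: float, years: int) -> float:
    """Return the factor by which a quantity grows when it compounds
    at a constant annual_rate (0.54 means 54%) for the given years."""
    return (1.0 + annual_rate) ** years


# Assumption: a constant 54% per year, compounded over ten years.
multiple = growth_multiple(0.54, 10)
print(f"Growth over the decade: about {multiple:.0f}x")
```

Under those assumptions, hyper-critical data would end the decade at roughly 75 times its starting volume, which is how a category can grow very quickly and still account for only 1% of the total.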

Life-critical data is not new, especially in the medical field. Understanding compatible blood types, proper medication to treat various ailments, and all the data used to sustain life during a critical operating procedure are examples of life-critical data – without which lives are put at risk.

However, in a digitally transformed world, instead of the deliberate and subjective use of data in the previous examples, data will be used to ‘run’ certain aspects of our lives. In other words, critical data is not just a one-time calculation, but an integrated, continuous stream of calculations informing automated decisions or decisions that must be made, often in real time.

Businesses have long run on business-critical data. There is no clearer example than manufacturing (think Six Sigma), where continuous data feeds keep processes in control by tweaking parameters based upon data measurements. Today’s advanced datacenters have autonomous, self-learning applications that balance workloads across their infrastructure to maintain peak performance, and future smart cities will govern traffic, utilities, buildings, and many other things to drive efficiency and convenience throughout the infrastructure. These are clear examples of business-critical, data-driven applications where any disruption or use of bad data can bring a business or city to its knees. Life-critical data shifts the focus from putting businesses at risk to putting human lives at risk.

In DataAge 2025, IDC looked at just how integrated data would be in a connected person’s life by calculating the number of daily digital interactions. By 2025, an average connected person will have a digital interaction nearly 4,800 times per day, or one every 18 seconds. A majority of these interactions will be automated, and some will be life-critical – where a life will be put at risk if there is an error of some sort. Well-known examples include embedded medical devices that automatically dispense medication based on continuous sensing of body chemistry and other metrics. Self-driving cars are another obvious example, where an incorrect decision or the lack of one can put lives at risk. Tailored medicine based on one’s DNA portends a future where ailments can be alleviated more cost-effectively and rapidly, so long as the data is correct.
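The 18-second figure follows from dividing the seconds in a day by the number of daily interactions. A minimal sketch of that arithmetic, where the function name `seconds_between_interactions` is illustrative and the interactions are assumed to be spread evenly across all 24 hours:

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds


def seconds_between_interactions(interactions_per_day: int) -> float:
    """Average gap in seconds between interactions, assuming they are
    spread evenly over a full 24-hour day."""
    if interactions_per_day <= 0:
        raise ValueError("interactions_per_day must be positive")
    return SECONDS_PER_DAY / interactions_per_day


# 4,800 interactions per day is the figure quoted in the text.
interval = seconds_between_interactions(4800)
print(f"One interaction every {interval:.0f} seconds")
```

In practice interactions would cluster rather than arrive evenly, so the average is a way of conveying scale and not a prediction of timing.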

“Junk in, junk out,” as the saying goes. Herein lies one of the key roles for storage: ensuring data integrity from the point of storage through every use in any application, in any location, or on any device. In addition to data integrity, storage must do its part to meet latency requirements by delivering data to the necessary compute engine quickly enough to support a timely decision.

Finally, this type of data will provide monetization opportunities (data as a service) as companies that acquire such data may only be focused on providing one vector of analysis. There may be other companies that desire access to the same data to provide additional services or insights to businesses, cities, or our lives by aggregating it with other data that is purchased or owned.