Seagate’s Data Center Workloads Served Entirely by Hybrid and Hard Drive Storage Systems

Learn how Seagate’s updated data center strategy improved performance and efficiency, deploying hard drive and hybrid systems to meet all workload needs—from routine to demanding.

In an era of insatiable demand for data, Seagate is not only the leading manufacturer of storage solutions; it’s also a major global enterprise managing vast amounts of data across its own expansive data centers. Facing aging all-flash storage systems, Seagate saw an opportunity to modernize its strategy to meet current needs and future data growth. The company chose to deploy hybrid storage systems, which provide mass capacity via hard drives plus a thin layer of flash, for their ability to scale mass data capacity efficiently without sacrificing performance. Multiple Infinidat InfiniBox storage arrays were deployed across several Seagate data centers.

Much has been made recently of the suitability of hard drives and SSDs for data centers. There are good reasons hard drives continue to shoulder approximately 90% of hyperscale and cloud capacity requirements [1]. Seagate’s own experience has shown why: A well-rounded strategy, where flash and hard drives not only coexist but complement each other, ensures that the entirety of Seagate's storage needs, from the most routine to the exceptionally demanding, is met.

Seagate’s deployment of these hybrid storage systems shows how the two technologies work together to meet modern data center demands. Flash storage has its place in enhancing performance for specific tasks, while hard drives form the backbone, handling the mass data storage.

Hard drives and SSDs are both important but different storage technologies, with fundamentally different approaches to reading and writing data. And of course, hard drives have a significant cost-per-terabyte advantage over SSDs. Scale-out storage architecture requires a mix of media devices—optimized to meet the budget, capacity, and performance needs of workloads.

What Factors Drive Storage Architecture Decisions

For data center architects and operators, several key factors drive storage architecture decisions: high availability and resiliency, performance, capacity, supportability, and overall cost. These elements ensure that the storage infrastructure can handle diverse and demanding workloads efficiently and economically.

Seagate’s data center requirements were shaped by clear goals addressing the company’s complex operations, which span research, design, manufacturing, and a diverse market presence encompassing B2B and B2C channels. Recognizing the increasing volume of data generated by IoT devices, automation, and digitalization in manufacturing, Seagate set out to cost-effectively boost its storage capacity and performance. This enhancement would be vital for harnessing AI and data analytics, which drive business value by deriving insights from large volumes of unstructured data.

The company’s wide range of critical operations is anchored by:

  • Enterprise Resource Planning (ERP), which is central to Seagate's operations, enabling functions from accounting to supply chain management
  • Real-time databases, notably a 350-terabyte database crucial for tracking every manufacturing, testing, and technical detail of every individual unit Seagate has shipped
  • Analytics workloads that help Seagate extract insights from large datasets for strategic decisions
  • Virtual Machines (VMs) and file services essential for day-to-day IT operations and application hosting

Analysts have noted that the vast majority of the data associated with enterprise workloads requires mass capacity and nominal-time data transfer, which are well-suited to the scale and TCO advantages afforded by hard drives. Vinod Pasi, Seagate’s VP and Global Head of IT Infrastructure, confirms this paradigm reflects Seagate’s experience in crafting a data storage architecture that effectively serves all its data center workloads.

Serving All of Seagate’s Data Center Workloads

Seagate’s strategic deployment of hybrid storage systems has been instrumental in efficiently managing its diverse array of data center workloads. The company has identified specific workloads that demand varying levels of data transfer performance and mass storage capacity.

For instance, non-real-time reporting databases, such as BDW and Informatica, along with factory databases like ODS, TS, and PIC, represent a significant portion of Seagate's data storage needs. These workloads, which also include VMware VMs hosting Linux and Windows applications, file services (NFS, CIFS, SFTP, FTP), Hadoop HDFS for several sites, and MinIO storage clusters for backup and machine learning applications, collectively account for about 90% of Seagate's storage capacity. These workloads are predominantly served by hard drives due to their substantial capacity requirements and the cost-effectiveness of hard drive storage.

For workloads requiring real-time data transfer, such as factory line support databases and Citrix VDI, which make up 10% of Seagate's storage needs, the hybrid storage systems’ intelligent caching and data placement capabilities ensure that performance is not compromised. These applications benefit from the thin layer of SSDs integrated into the hybrid systems, providing the necessary speed and low latency for real-time operations while still leveraging the high-capacity hard drives for the bulk of the data storage.

Seagate's data center workloads illustrate a broader industry trend in which the majority of enterprise data is efficiently managed by hybrid storage solutions. By optimizing the balance between hard drives for mass storage requiring nominal- to real-time data transfer performance, and SSDs for highly performance-intensive tasks requiring real-time to ultra real-time data transfer, hybrid storage systems offer a versatile and cost-effective architecture. That architecture can handle diverse and demanding data center workloads, ensuring high performance and scalability without the prohibitive costs associated with an all-flash infrastructure.

Designing a Storage Architecture to Meet Workload Demands

Cloud, hyperscale, and large enterprise storage architects tend to select the most appropriate blend of storage types to optimize cost, capacity, and performance. Advanced hybrid storage arrays are a great fit for that goal. SSDs are ideal for high-performance, read-intensive workloads requiring ultra real-time data transfer—a very small proportion of workloads—while hard drives provide the necessary access to mass data and serve the overwhelming majority of workloads. Hard drives handle workloads that flash should not, and flash handles workloads that hard drives should not. Deploying advanced hybrid systems can simplify the architecture, ensuring each storage medium is utilized when it is most needed.

Storage Solutions Should Meet Specific Operational Needs

Seagate's data centers use a mix of storage solutions tailored to meet specific operational needs.

Previously, Seagate addressed some storage performance demands using all-flash systems, which provided high performance but at a significant cost—flash media costs more than six times that of hard drive media per terabyte (TB). Seeking a more cost-effective solution that could offer comparable performance and the scalability needed for future growth, Seagate deployed 17 new hybrid storage systems.
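
To make the cost argument concrete, here is a rough sketch of how a thin flash layer affects media cost per terabyte. It takes the "more than six times" figure above as a lower bound of 6x. The 5% flash share is purely an illustrative assumption, since the article does not state how thick the flash layer is, and the function name is hypothetical. The calculation covers media only, not DRAM, controllers, power, or support.

```python
def blended_cost_ratio(flash_fraction: float, flash_cost_multiple: float = 6.0) -> float:
    """Media cost per TB relative to an all-hard-drive system (which is 1.0).

    flash_fraction: share of capacity on flash (assumed, not from the article).
    flash_cost_multiple: flash cost per TB as a multiple of hard drive cost per TB
        (6.0 is the lower bound quoted in the article).
    """
    if not 0.0 <= flash_fraction <= 1.0:
        raise ValueError("flash_fraction must be between 0 and 1")
    hdd_part = (1.0 - flash_fraction) * 1.0
    flash_part = flash_fraction * flash_cost_multiple
    return hdd_part + flash_part


hybrid = blended_cost_ratio(0.05)     # assumed 5% flash layer
all_flash = blended_cost_ratio(1.0)   # everything on flash
print(f"hybrid: {hybrid:.2f}x, all-flash: {all_flash:.2f}x of all-HDD media cost per TB")
```

Under these assumptions the hybrid configuration stays close to hard drive media cost, while an all-flash configuration carries the full flash premium.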

In addition to the hybrid arrays, Seagate employs 26 Exos hard drive storage systems for specific functions such as security camera data storage, backup targets, and certification log retention.

The majority of Seagate's storage capacity, over 50 petabytes, is provided by Seagate Exos hard drives, which are integrated into both the Infinidat hybrid systems and the purely disk-based Exos storage systems.

Each hybrid array provides 4.6 petabytes of usable hard drive space plus a thin flash layer. Intelligent caching technology dynamically optimizes data management among its varied storage media, adjusting to changing workloads to automatically ensure high performance for Seagate's demanding applications. The architecture meets the demand for increased data storage, enabling the company to efficiently manage any workload by optimizing both capacity and access speed for data-intensive tasks, all with a lower TCO per TB.

How Hard Drives and Flash Work Together in Hybrid Systems

Hard drives and SSDs complement each other in storage solutions, with SSDs handling high-speed, low-latency requirements, and hard drives managing large-scale, high-capacity storage needs. Generally, SSDs are ideal for block and file types that require very low latency of less than 1 millisecond, making them suitable for very high-performance read-intensive workloads. On the other hand, hard drives are appropriate for a wider range of file types, including block, file, and object types, especially where high capacity is essential. Hard drives are best suited for applications that tolerate moderate to high latency, ranging from 1 to over 100 milliseconds.
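
The latency guideline above can be written as a simple rule of thumb. This is only a sketch: the function name and the example latency budgets are hypothetical, and a real placement decision would also weigh capacity, cost, and access pattern.

```python
def suggest_media(latency_budget_ms: float) -> str:
    """Suggest a medium from the latency a workload can tolerate.

    Follows the guideline in the text: budgets under 1 ms point to SSD,
    budgets of 1 ms and above (up to 100+ ms) point to hard drives.
    """
    if latency_budget_ms <= 0:
        raise ValueError("latency budget must be positive")
    return "ssd" if latency_budget_ms < 1.0 else "hdd"


# Hypothetical latency budgets, chosen only to illustrate the rule.
for name, budget_ms in [("ultra real-time lookup", 0.5),
                        ("reporting database", 20.0),
                        ("backup target", 150.0)]:
    print(f"{name}: {suggest_media(budget_ms)}")
```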

Like most hybrid storage systems, Infinidat's InfiniBox incorporates hard disk drives as its primary storage. It also includes a larger-than-usual DRAM cache and a solid-state tier that serves as a secondary cache. The larger DRAM cache enables more data to be stored close to the CPUs, which boosts performance and helps in the effectiveness of data placement strategies. Most of the system's data resides on hard drives, the foundation for mass storage capabilities. By intelligently coalescing data in the write cache and writing it out sequentially, these systems ensure higher write efficiencies and minimize the impact on flash media endurance. The system’s metadata is kept in DRAM using trie data structures for fast, efficient access, contributing to the system’s high performance and scalability.
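
The idea of coalescing writes in cache and then writing them out sequentially can be illustrated with a minimal sketch. This is a generic illustration of the technique, not InfiniBox's implementation: the class name, the integer block addresses, and the shape of the flush result are all assumptions made for the example.

```python
class CoalescingWriteCache:
    """Buffers block writes in memory and emits them as sequential runs."""

    def __init__(self):
        self._pending = {}  # block address -> latest data for that block

    def write(self, lba: int, data: bytes) -> None:
        # A newer write to the same block replaces the older one in cache,
        # so only the latest version ever reaches the media.
        self._pending[lba] = data

    def flush(self):
        """Return runs as (start_lba, [block, ...]) in address order and clear the cache."""
        runs = []
        for lba in sorted(self._pending):
            # Extend the current run if this block directly follows it.
            if runs and lba == runs[-1][0] + len(runs[-1][1]):
                runs[-1][1].append(self._pending[lba])
            else:
                runs.append((lba, [self._pending[lba]]))
        self._pending.clear()
        return runs


cache = CoalescingWriteCache()
for lba, data in [(7, b"g"), (5, b"old"), (6, b"f"), (5, b"new"), (10, b"j")]:
    cache.write(lba, data)
print(cache.flush())
```

Five random-order writes, one of them an overwrite, leave the cache as two sequential runs: blocks 5 to 7 and block 10.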

Algorithms manage data placement intelligently across a tiered storage hierarchy. The systems use metadata tagging to monitor metrics including access frequencies, block sizes, read/write frequencies, and associated application I/O profiles, using dynamic information on which data is most likely to be referenced and used together. The system then prefetches data efficiently, leading to high read cache hit rates.
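
As a toy illustration of prefetching from observed access patterns, the sketch below tracks which block most often follows another and stages that block ahead of the next read. It is deliberately far simpler than the approach described above, using only access order and none of the other metrics, and it is not the vendor's algorithm. The class name, the dictionary standing in for slow storage, and the one-block prefetch buffer are assumptions made for the example.

```python
from collections import Counter, defaultdict


class FollowerPrefetcher:
    """Learns which block tends to follow which, and prefetches the likely next one."""

    def __init__(self, backend):
        self._backend = backend                  # stand-in for slow storage: block -> data
        self._followers = defaultdict(Counter)   # block -> counts of the blocks read next
        self._last = None
        self._prefetched = {}                    # holds at most one staged block
        self.hits = 0
        self.misses = 0

    def read(self, block):
        if block in self._prefetched:
            self.hits += 1
            data = self._prefetched[block]
        else:
            self.misses += 1
            data = self._backend[block]
        # Record that `block` followed the previously read block.
        if self._last is not None:
            self._followers[self._last][block] += 1
        self._last = block
        # Stage the block that has most often followed this one so far.
        self._prefetched.clear()
        followers = self._followers[block]
        if followers:
            likely_next = followers.most_common(1)[0][0]
            self._prefetched[likely_next] = self._backend[likely_next]
        return data


store = {1: b"a", 2: b"b"}
reader = FollowerPrefetcher(store)
for block in [1, 2, 1, 2, 1, 2]:
    reader.read(block)
print(reader.hits, reader.misses)
```

In this alternating pattern the first three reads miss while the pattern is being learned, and the remaining three are served from the prefetch buffer.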

This integrated approach illustrates how advanced hybrid storage systems leverage both hard drive and SSD strengths, allocating workloads where they can be handled most efficiently—mass storage on hard drives and performance-boosting tasks on SSDs.

Cost Efficiency and Productivity Gains

Seagate’s deployment of a hybrid storage strategy has led to cost efficiency and productivity gains that provide annual financial benefits per petabyte of storage capacity, including reduced IT infrastructure costs, improved backup times, enhanced load times, and accelerated transaction rates.

Vinod Pasi says his IT team’s benchmarking shows the new hybrid systems surpassing the performance of previous all-flash arrays at a lower cost, while providing substantial capacity growth, accommodating various workloads with high efficiency—including everything from large databases and analytics to file services and VMware workloads.

The transition enabled Seagate to streamline its data storage operations, moving from multiple all-flash arrays to a single hybrid system for managing its crucial 350TB database. This shift simplified the architecture and reduced the complexity of support and maintenance, marking a strategic step towards more efficient data management.

 

Broadly across workloads, the IT team has seen significant improvements across several performance metrics. Backup times improved by 90%, dropping from hours to minutes. Load times improved 40%. Transaction rates increased 35%. Query speeds are more than 20% faster.


Reducing CapEx and OpEx

By consolidating storage arrays, Seagate significantly reduced both capital expenditures (CapEx) and operational expenditures (OpEx), nearly halving overall expenses. The adoption of these hybrid systems has enhanced the company’s IT operational capabilities, simplifying management and improving resilience.

The transition has enabled Seagate to easily achieve its requirements for both capacity and performance. A single hybrid system can manage up to 17.287PB of effective capacity. So in addition to facilitating Seagate’s IT workload consolidation today, the architecture lets Seagate scale its systems’ capacity several times over in the future, as the company deploys its latest Exos hard drives with Mozaic 3+ technology offering 30TB+ per drive and 3TB+ per platter.

The simplicity and reduced complexity of Seagate's data center infrastructure is another significant benefit. By consolidating a diverse range of workloads onto fewer hybrid systems, Seagate streamlined operations, reducing the overheads and logistical challenges associated with managing a heterogeneous storage array landscape. This simplification translated into not just cost savings but also enhanced operational agility, allowing Seagate's IT team to focus more on innovation and less on maintenance. The flexibility and scalability of the systems complemented Seagate's strategic direction, providing the capability to dynamically scale storage capacity in alignment with evolving business needs, without the financial and logistical burdens typically associated with scaling all-flash solutions.

Balancing Performance and Capacity

Vinod Pasi notes a fundamental truth at the heart of Seagate's decision-making process: the balance between performance and capacity is paramount. While its previously deployed all-flash arrays offered high performance, the holistic needs of enterprises like Seagate also require voluminous data capacities. The hybrid storage systems bridge this gap, delivering high performance without sacrificing the ability to store petabytes of data economically. This equilibrium not only supports immediate operational requirements but also positions Seagate to handle future data growth and technological shifts.

  1. IDC, Multi-Client Study, Cloud Infrastructure Index 2023: Compute and Storage Consumption by 100 Service Providers, November 2023