Data growth has never been faster than it is today. Cloud services, for example, represent a high-growth sector for storage requirements over the next five years, with Gartner forecasting Infrastructure as a Service (IaaS) revenue to reach US$24.4B in 2016, with storage accounting for almost 40% of that figure (Gartner, Public Cloud Services, Worldwide, 2011-2016, 2Q12 Update, Published: 20 August 2012). Beyond cloud services, a number of other trends are also driving data growth, including social media, BYOD and big data analytics.
The more important question is: what are we doing to prepare our server and storage infrastructure to not only meet the exploding demand for storage capacity, but also deliver unprecedented levels of performance, scalability and reliability?
Introduction to Interfaces
With the explosion of data over the past decade, storage technology was forced to evolve. In the early years of enterprise storage, parallel SCSI was the interface of choice for server-based storage. As networks evolved, the concept of accessing data from shared pools of storage resources, commonly referred to as storage area networks (SAN), gave rise to the fibre channel interface. At this point, parallel SCSI dominated server and direct attach storage (DAS) architectures, while fibre channel dominated the SAN.
It was at this time that both the ATA desktop interface and the SCSI enterprise DAS interface were becoming serialised. The introduction of the serial AT attachment (SATA) and serial attached SCSI (SAS) interfaces led to a boom in desktop and enterprise storage. SATA was quickly adopted as the standard interface for client compute, delivering faster throughput, thinner and more manageable cabling, and greater signal integrity. At the same time, SAS began competing with fibre channel in the enterprise SAN and DAS spaces, while yet another architecture was emerging – network attached storage (NAS) with the iSCSI protocol – offering lower cost and wider connectivity.
How Did We Get Here?
With the growth of NAS and iSCSI storage, the large installed base of Ethernet cabling within businesses of all sizes began to be leveraged for storage. Lower-cost NAS solutions enabled seamless connectivity with department-level and workgroup clients, encroaching on the backend data centre realm that had been exclusively populated by enterprise-level SAN. It was here that the desktop-class SATA drive was introduced into the enterprise.
Because desktop-class SATA hard drives and controllers were less expensive, and because SATA drives could function on a SAS backplane, both NAS and DAS solutions began using desktop SATA drives for low-cost, high-capacity storage. But this approach was not without issues. Desktop-class SATA drives were not designed for the 24×7 duty cycles demanded in an enterprise environment, and customers experienced degraded performance and poor reliability, as shown later. The culprit: rotational vibration. Too many hard drives running non-stop in a system caused the system to vibrate, leading to a significant degradation of input/output operations per second (IOPS) and reliability. Still, desktop-class hard drives continue to proliferate into what are marketed as enterprise-class storage solutions, especially with the advent of the cloud.
When data grows at the pace it is growing and forecast to grow, companies tend to look for any and all opportunities to drive cost out of their storage solutions. It has become habit for many to simply choose the highest capacity hard drive they can find at the lowest possible cost. In many cases, this ends up being a desktop–class product not designed for enterprise workloads and duty cycles. This may address the challenge of reducing initial storage capacity acquisition costs, but it introduces a whole new set of hidden costs to the data centre.
Enter an All–SAS Enterprise
The dynamic technology trends above spurred hard drive manufacturers to develop a new category of high-capacity, low-cost, enterprise-class storage: nearline. Nearline storage goes by several names – tier-2, business-critical, midline, even bulk storage – and is judged on the metric businesses care about most: cost per gigabyte or, in today's storage environments, cost per terabyte. Nearline was the answer to the data explosion.
Already well established in high-performance drive markets, SAS is now penetrating secondary storage tiers with capacity-optimised drives. When it comes to scalability – the ability of an enterprise or cloud data centre to scale rapidly as capacity and performance demands increase – no proven solution matches the performance, data integrity, reliability and long-term investment protection of SAS.
How All-SAS Enterprise Changes the Storage Mindset
Performance – Enterprise-class drives must maintain high performance levels in multi-drive configurations where physical vibrations transmitted through a cabinet occur. This phenomenon is known as Rotational Vibration (RV).
RV is a twisting, torquing motion, measured in radians per second squared as the rate of change of angular velocity. In other words, it's how much angular acceleration the drive can tolerate.
The main sources of RV energy are:
- Hard drive seek movements
- Additional drives inside the cabinet accessing data (e.g., multi-spindle environments)
- External forces acting on the cabinet
If RV is not taken into account in the design of the drive, the force of RV can push the head off track, causing missed revolutions and delays in data transfers. Tests on drives not capable of handling RV have shown significant reductions (over 50%) in performance.
Fortunately, enterprise drives have incorporated numerous technologies to negate the effects of RV for over a decade. To optimise these densely packed, multi-drive environments, Seagate uses RV sensors as well as linear vibration sensors on its enterprise drives. These sensors enable the drive to compensate for vibration originating from the drive itself or from outside it (e.g., cooling fans or a poorer-quality chassis) and continue to read and write data.
Good design also reduces the amount of vibration a drive generates on its own. Seagate's high-capacity enterprise drives are built with a top-cover-attach spindle motor that increases rigidity in any drive using a 4-disc configuration. The design is further enhanced by optimising the drives' seek profiles in firmware to minimise emitted torque.
Xyratex is industry-recognised for managing drive rotational vibration, power and cooling for 2.5- and 3.5-inch drives under its rigorous drive processing systems (DPS), with over 3 million test slots worldwide. Leveraging 25+ years of experience in disc drive and storage component testing, Xyratex creates ideal individual drive test environments through thermal modulation, isolating drives from vibration sources (e.g., fans and other moving components), and choosing components that minimise vibration, such as weighted fans. This allows Seagate and other drive vendors to realise increased drive performance, density and operational flexibility, and allows Xyratex to produce modular enclosures and high-performance application platform solutions with very dense storage capacities. Xyratex works closely with Seagate, qualifying its drives with performance testing within Xyratex processes and enclosures.
So what about using desktop drives in an enterprise environment? The diagram below shows the performance impact on desktop, nearline and enterprise-class drives at different vibration levels. The 6, 12 and 21.5 radians/sec^2 figures represent the specified tolerances within which desktop, enterprise capacity (nearline) and enterprise performance (enterprise) drives, respectively, maintain throughput performance at or above 80%. Notice that when RV increases beyond those specified tolerances, the performance of the desktop drives in particular drops dramatically, while the enterprise-class drives maintain nearly 100% of their performance under all but the most extreme conditions.
How does rotational vibration affect hard-drive operation?
The read/write heads follow concentric tracks on the disc. The head position must remain within an allowable tolerance window for read and write operations to occur. The allowable write window is smaller than the allowable read window, making write operations more sensitive to vibration than read operations.
If the writer position exceeds the allowable window, the write operation is temporarily stopped; a write-position-window violation is called a write fault. The write operation resumes once tracking accuracy returns to within the allowable window and the target write location (logical block address, LBA) passes under the writer. Since the target LBA passes under the writer once per disc revolution, a write fault usually delays the write operation by one disc revolution; under extreme vibration, the operation may be delayed for several revolutions. Delayed read/write operations are the root of all vibration-induced I/O degradation.
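The cost of a write fault can be put in concrete terms: the delay of one disc revolution follows directly from spindle speed. The sketch below uses common spindle speeds as illustrative figures (they are not taken from this paper):

```python
# Delay incurred when a write fault forces the drive to wait for the
# target LBA to come back under the head: one full disc revolution.
def revolution_delay_ms(rpm: float) -> float:
    revs_per_second = rpm / 60.0
    return 1000.0 / revs_per_second  # milliseconds per revolution

for rpm in (5400, 7200, 10000, 15000):
    print(f"{rpm:>5} RPM: {revolution_delay_ms(rpm):.2f} ms per missed revolution")
```

Even a single missed revolution on a 7200 RPM drive costs roughly 8 ms, which is why sustained vibration translates so directly into lost IOPS.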
The bottom line: customers with multi-drive applications will benefit greatly by integrating enterprise-class drives across multiple storage tiers, actually achieving the performance levels those drives specify.
In a tiered storage environment, data not only has to move quickly within tier 0 and tier 1 storage, but also to and from lower-cost tier 2 storage. The ability to move data as quickly as possible up and down the storage tiers is becoming more critical to meeting customer expectations. Failure to meet expectations leads to opportunity costs associated with customer abandonment, lost sales and fewer customers.
Scalability – To meet growing data demands, businesses look for solutions that provide the greatest level of scalability. Being able to expand network storage capacity quickly and cost effectively without disrupting service level agreements is critical in today’s world of terabytes, petabytes and even exabytes of data.
With the move to storage networks also comes the need for more advanced error checking. Getting data from a motherboard to a directly attached drive is one thing. Getting that same information through multiple switching points, whether within a server rack or across the country, is something else. Every point where there is an address change introduces an opportunity for error. Desktop drives use basic error checking, but there is nothing in the drive that says, “I need to make sure that the information I’m receiving is the same information that was originally sent to me.” If a bit gets flipped in transit, a desktop drive will store the corrupted data, none the wiser. A nearline SAS drive uses advanced methods, similar to those used with ECC server memory, along with metadata embedded in the information stream, to identify and remedy such errors.
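One widely deployed form of such embedded metadata in SAS storage is T10 Protection Information, where each data block travels with an integrity field that includes a CRC-16 "guard tag". The sketch below is a simplified illustration of the idea, not actual drive firmware; the polynomial is the one defined for T10 PI guard tags:

```python
# Simplified end-to-end integrity check in the spirit of T10 Protection
# Information: each data block carries a CRC guard tag, so the receiver
# can detect corruption introduced anywhere along the path.
T10_DIF_POLY = 0x8BB7  # CRC-16 polynomial used for T10 PI guard tags

def crc16_t10(data: bytes) -> int:
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            crc = ((crc << 1) ^ T10_DIF_POLY if crc & 0x8000 else crc << 1) & 0xFFFF
    return crc

block = b"\x00" * 512       # one 512-byte data block
guard = crc16_t10(block)    # guard tag computed at the source

# Simulate a single bit flipped in transit:
corrupted = bytes([block[0] ^ 0x01]) + block[1:]
assert crc16_t10(corrupted) != guard  # the receiver detects the mismatch
```

A desktop drive, with no such end-to-end tag, would simply store the corrupted block.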
Addressing and error correction start to assume more importance as we look at how SAS can be scaled beyond a typical two- to four-drive implementation. As you know, most motherboards only accommodate from four to eight SATA devices. But, unlike SATA, SAS devices operate within a domain, much like a conventional business network. As with networks, you can employ various types of SAS switches (generally called expanders) to aggregate devices within a domain. You can have 128 devices on an edge expander and 128 edge expanders per fan-out expander. All told, a SAS domain can contain 16,384 devices.
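The fan-out arithmetic above is straightforward to verify:

```python
# Maximum device count in a SAS domain, per the expander topology
# described above: 128 devices per edge expander, 128 edge expanders
# per fan-out expander.
DEVICES_PER_EDGE_EXPANDER = 128
EDGE_EXPANDERS_PER_FANOUT = 128

max_devices = DEVICES_PER_EDGE_EXPANDER * EDGE_EXPANDERS_PER_FANOUT
print(max_devices)  # 16384 addressable devices in one domain
```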
Of course, most users are unlikely to approach anything remotely close to this number. Still, at the rack-mount level, storage solutions containing dozens or hundreds of drives are not uncommon. This is where SAS's architecture shows its potential for rapid scaling.
Reliability – As mentioned before, poor drive reliability is in most cases the result of deploying the wrong type of storage device within an enterprise-class system, or for a specific enterprise-class workload. Hard disc drives, being mechanical devices, are designed with specific features and components for specific workloads. An enterprise-class drive is equipped with additional features and functionality that allow it to reliably read and write data in a more stressed, 24×7 data centre environment.
For example, SAS drives help decrease storage system failure rates by reducing the number of physical interconnects and adding dual-port capability. While IT professionals may be reluctant to deploy storage systems that may disrupt operations, enterprise SAS drive design allows seamless integration into the same SAS infrastructures currently supporting critical Tier 1 storage. The support for a large number of host connections helps SAS drives avoid the single-point-of-failure risk that characterises SATA drives. By eliminating the need for a SATA interposer card, SAS drives also reduce total system parts count, a key consideration when designing for higher reliability.
Rotational vibration (RV), as discussed in the Performance section, affects drive reliability as well as performance in multi-spindle environments. Enterprise drives designed with RV in mind include RV sensors that provide feedback to counteract mechanical vibration, both linear and rotational. Dual-stage actuators, which provide higher bandwidth for RV immunity, and tied-shaft motors, which provide increased mechanical stability, are other examples of technology designed into enterprise drives but not deemed necessary in desktop drives. Enterprise drives offer many tangible benefits over the life of the drive – a much higher mean time between failures (MTBF), lower integration effort, a better unrecoverable error rate, speed optimisation, more extensive manufacturing testing and, most importantly, fewer drive replacement purchases for better return on investment (ROI).
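The unrecoverable error rate difference alone is easy to quantify. The sketch below assumes typical datasheet values (1 error in 10^14 bits read for desktop-class drives, 1 in 10^15 for nearline enterprise drives); these figures are common industry specifications, not taken from this paper:

```python
# Expected unrecoverable read errors when reading a given volume of data,
# given a drive's unrecoverable error rate (UER) in errors per bit read.
def expected_errors(terabytes_read: float, uer_per_bit: float) -> float:
    bits_read = terabytes_read * 1e12 * 8  # decimal TB -> bits
    return bits_read * uer_per_bit

DESKTOP_UER = 1e-14   # typical desktop-class spec (assumed, not from this paper)
NEARLINE_UER = 1e-15  # typical nearline enterprise spec (assumed)

for label, uer in (("desktop", DESKTOP_UER), ("nearline", NEARLINE_UER)):
    print(f"{label}: {expected_errors(100, uer):.1f} expected errors per 100 TB read")
```

Under these assumptions, reading back 100 TB – a routine event in a large array rebuild – yields roughly ten times fewer expected unrecoverable errors on the nearline drive.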
Xyratex's extensive drive testing knowledge and experience over the past three decades improves drive quality, reliability and robustness through early detection of individual drive weaknesses or defects. One of Xyratex's proprietary integrated testing processes, the Combined Environmental Reliability Test (CERT), is a highly efficient and scalable storage test platform that exposes, identifies and eliminates devices with inherent defects, or defects resulting from manufacturing aberrations, that cause time- and stress-dependent failures. In addition, CERT performs accelerated stress screening using aggressive customer-simulated and bespoke testing on assembled storage systems. System-level performance and data integrity testing within this process covers the core areas of shock and vibration, rotational vibration (RV), power measurements, and fault replication and diagnosis.
Seagate and Xyratex hope you found this paper both interesting and informative, and sincerely appreciate you taking the time to read it. We hope every reader has learned something new, and that the key takeaways are clear. To summarise:
- Storage requirements are continuing to grow exponentially with no slowdown in sight.
- Desktop drives are not designed for data centre and multi-spindle environments where sustained performance, scalability and reliability are the forefront drivers.
- Not all data centres and infrastructure are equal, and the related performance ability of any environment must match the requirements necessary for critical applications.
- Organisations expecting to purchase or lease reliable, sustained storage performance at scale are advised to seek out enterprise-class drives (nearline SAS) as the better match for their needs.
While we have touched on some of the technology variances between desktop (SATA) and enterprise (SAS) drives throughout this paper, there are, of course, more details about the capabilities and differences between the drives. Seagate and Xyratex will release a follow-up paper on this subject in the very near future, so stay tuned.