Academia Sinica Case Study
Climate research academy uses Seagate to solve mass data issue.
Academia Sinica is a preeminent climate research institution in Taiwan. The institution’s researchers predict that over the next four years, they will see a 16-fold increase in data, amounting to more than 10PB. As the data used for simulations expands, the team needs reliable data storage that will keep pace. Seagate’s versatile, scalable solutions meet their growing demand.
Taiwan is proactively working to reach zero net carbon (ZNC) emissions by 2050, using the TaiESM climate prediction model in collaboration with the Institute of Oceanography at National Taiwan University.
Reaching net zero by 2050 is an urgent global endeavor. In response, Taiwan is proactively promoting climate change-related legislation and restructuring its environmental departments as a move toward the net zero transition. The climate team from the Environmental Change Research Center of Academia Sinica, Taiwan’s most prominent research academy, has participated by conducting relevant climate research. Last year, the climate team overcame the challenges of mass-data collection and management with the help of the TaiESM model they built exclusively for Taiwan. The model, which is powered by Seagate data-storage systems, allows the team to produce and predict climate data by running global climate simulations locally.
Not only was TaiESM recognized for meeting the standards of CMIP6, a project supported under WCRP, but the model was also cited as a reference in the Sixth Assessment Report (AR6) of the United Nations Intergovernmental Panel on Climate Change (IPCC) in 2021 by Working Group I (WGI). At present, the climate team collaborates with many academic teams, including the Institute of Oceanography, National Taiwan University (IONTU), to provide TaiESM with Academia Sinica modules.
The first phase of TaiESM allowed the climate team to successfully participate in international data exchanges, and even placed Taiwan among the top-ranking countries in each score. The current version is built on the United States’ Community Earth System Model (CESM). Because the model is not yet 100% self-developed, Academia Sinica cannot fully call TaiESM its own. Huang-Hsiung Hsu, CEO of the Anthropogenic Climate Change Center and Deputy Director of the Environmental Change Research Center at Academia Sinica, said, “Developing our own model encapsulates [our] own distinct features, specifications, technicalities, and success. Our next phase is to keep refining the first version of TaiESM with a goal to develop a fully independent climate prediction module by a Taiwanese team, from core programs to internal modules, that is close to local needs and truly exclusive to Taiwan.”
The improved TaiESM climate prediction model will offer data collection for a global bank that produces and disseminates long-term, accurate forecasts. It is slated for use when conducting more research and simulation experiments, collecting additional data, and building a data bank for global access. Just like the European Centre for Medium-Range Weather Forecasts (ECMWF), the research academy aspires to be a global presence through international collaborations that produce and disseminate long-term, credible weather predictions and data usability. Such a presence would increase their reputation and recognition in Taiwan and across the globe, enhancing opportunities for global partnerships and data exchanges.
The TaiESM climate team needs 10 petabytes (PB) of data storage within the next four years to support higher-resolution images. Existing storage availability rates and limited server space no longer meet their application and workload requirements. The increasing frequency of data exchanges and improved data collection mean the team needs always-on availability, speed, and improved data protection.
In pursuit of the second phase of TaiESM, the Environmental Change Research Center needs more data storage capacity to support stronger data analysis and higher visual resolutions while managing a structured data surge that is complicated by data growth and data sprawl.
The space, performance, and availability rate of the existing storage equipment no longer meet the team’s application and workload requirements. To future-proof and scale its storage so it can handle more research data and analysis reports, the team needs to expand its storage equipment immediately.
Academia Sinica’s data growth has far exceeded the capacity of a standard research institute. The current research data capacity at the climate change research center is approximately 3PB. The center predicts that over the next four years, the amount of climate data will grow by at least another 10PB. The climate team generates at least two to four terabytes (TB) of simulation data a day, and their requirements demand seamless climate data exchange, rather than siloed data.
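As a rough illustration of the daily figures above, the following sketch accumulates the stated 2TB to 4TB of simulation output per day over four years. It assumes decimal units (1PB = 1,000TB), 365 days per year, and a constant daily rate; the function name is hypothetical. It covers only the stated daily output and does not attempt to reproduce the center's 10PB projection, which presumably reflects additional factors such as the planned resolution increase.

```python
TB_PER_PB = 1000  # assumption: decimal units, 1 PB = 1,000 TB


def accumulated_pb(tb_per_day: float, years: float, days_per_year: int = 365) -> float:
    """Total data produced at a constant daily rate, expressed in petabytes."""
    return tb_per_day * days_per_year * years / TB_PER_PB


# Daily output range stated in the case study, over the four-year horizon
low = accumulated_pb(2, 4)
high = accumulated_pb(4, 4)
print(f"{low:.2f} PB to {high:.2f} PB")  # 2.92 PB to 5.84 PB
```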
At the same time, there is a dire need to improve image quality. The current version of TaiESM offers pixelated images that are not ideal for research. To accurately simulate landforms and weather conditions such as typhoons, the team is looking to enhance their visual resolution by at least four times. This four-fold resolution increase would mean at least a 16-fold increase in the amount of data.
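The jump from a four-fold resolution increase to a 16-fold data increase is consistent with refining the grid in two horizontal dimensions, since the number of grid cells grows with the square of the refinement factor. The sketch below shows that arithmetic. It assumes only the two horizontal dimensions are refined, with vertical levels and output frequency unchanged; the function and parameter names are hypothetical.

```python
def data_scale_factor(resolution_factor: float, scaled_dimensions: int = 2) -> float:
    """Multiplier on grid-cell count (and roughly on data volume) when the
    grid spacing is refined by `resolution_factor` in `scaled_dimensions` axes."""
    return resolution_factor ** scaled_dimensions


# Four times finer in both horizontal directions: 4 * 4 cells replace each old cell
print(data_scale_factor(4))  # 16
```

Under the same assumption, refining a third axis as well would raise the multiplier to 64, which is why the source's "at least" is a sensible qualifier.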
When it comes to replicating weather conditions realistically and practically in the climate research lab, the team’s needs only get more challenging. A significant roadblock is measuring the long-term average of weather conditions, as temperatures change from land to ocean. Advances in technology have increased storage drive capacity and computing accuracy, allowing the earth to be divided into more grid cells and thus increasing the visual resolution for accurate model predictions. While processing more raw data leads to greater research opportunities, processing also creates even more data.
To pursue a visual resolution of up to 50km and improve simulation accuracy, the team creates 2TB to 4TB of data a day, and records data up to eight times a day. As Academia Sinica continues contributing data modules to IPCC, they must meet IPCC data standards: 1,000 miles of data simulation using 500 years of climate data, coupled with 164 years of historical data simulation from 1850. The climate team uses that simulation to generate a fixed climate prediction model. Such simulations consist of a plethora of modules, each of which is responsible for different functions, such as carbon emissions, weather, and oceanography. The modules require massive amounts of historical data and they in turn generate even more predictive data.
The climate team at Academia Sinica must regularly adjust and calibrate module settings and compare data for each different module setting, which further generates data. Effective and reliable data storage is crucial for the team. Hard disk drive failures were a significant pain point, often requiring the costly replacement of four to ten faulty units a month.
In the field of climatology, there is almost no cold data. Academia Sinica requires a permanent storage solution for all climate data. Forecasts, analyses, reworked forecasts and analyses, and multi-model data are made available through dedicated data servers using a distributed file system.
As the sheer volume of data grows exponentially, the team must carefully consider requirements for data storage capacity, storage efficiency, storage performance, as well as less obvious factors, such as the hardware footprint and associated physical plant requirements.
Seagate’s high-density data storage system, the Exos X Series 5U84, achieved sequential read and write performance of 7GB/s and 5.5GB/s at the current stage of the climate team’s research. Seagate’s ultra-dense intelligent solution also exceeded the team’s expectations with a 75% reduction in data center rack space and an 80% decrease in the total cost of ownership. Seagate’s Advanced Distributed Autonomic Protection Technology (ADAPT) also helped the team cut storage rebuild time after a drive failure by 93%.
The climate research team looks to Seagate’s versatile architecture to deploy a high-capacity, high-performance platform that addresses extreme data growth and efficiently manages hot and cold data with real-time data-tiering options. Seagate’s solution allows Academia Sinica to scale their storage with data access freedom, while simplifying operations and optimizing costs.
Less downtime and lower maintenance and IT costs help the TaiESM team focus on climate prediction refinements and manage data without sacrificing performance.
The Exos X 5U84’s five nines availability (99.999%) has helped Academia Sinica deliver consistently high reliability. The maximum-density 5U chassis accommodates 84 drives and can expand to 336 drives for up to 8PB of storage. It is tuned to maximize drive performance by protecting against vibrational and acoustic interference, heat, and power irregularities. With ADAPT, it distributes climate-research data across every drive, offers advanced data protection, and provides fast rebuilds without sacrificing performance, reducing downtime. Less downtime, in turn, extends the product life cycle and reduces IT spend on repair or replacement.
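To put the chassis figures in context, the sketch below estimates how many 84-drive enclosures would be needed for a given capacity target, using only the numbers quoted above (84 drives per chassis, 336 drives for up to 8PB). It is a back-of-the-envelope calculation under stated assumptions: capacity is treated as raw and evenly divided across enclosures, decimal units are used, and overhead from ADAPT protection or spare capacity is ignored. The 13PB target is simply the roughly 3PB held today plus the 10PB of projected growth; the variable and function names are hypothetical.

```python
import math

DRIVES_PER_ENCLOSURE = 84   # from the case study
MAX_DRIVES = 336            # from the case study
MAX_CAPACITY_PB = 8         # from the case study, treated here as raw capacity

# Derived figures (assumes capacity is split evenly across enclosures)
enclosures_at_max = MAX_DRIVES // DRIVES_PER_ENCLOSURE   # 4 enclosures
pb_per_enclosure = MAX_CAPACITY_PB / enclosures_at_max   # 2.0 PB each
tb_per_drive = MAX_CAPACITY_PB * 1000 / MAX_DRIVES       # about 23.8 TB


def enclosures_needed(target_pb: float) -> int:
    """Smallest whole number of enclosures whose raw capacity covers target_pb."""
    return math.ceil(target_pb / pb_per_enclosure)


print(enclosures_needed(10))  # 5 enclosures for the 10PB growth alone
print(enclosures_needed(13))  # 7 enclosures for 3PB existing plus 10PB growth
```

Usable capacity after data protection would be lower than these raw figures, so a real sizing exercise would need the vendor's usable-capacity numbers.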
With less downtime and lower IT costs, the TaiESM team can focus on refining its climate prediction model. Overall, the Exos X Series 5U84 helps the climate team efficiently manage mass data and reduce hefty maintenance expenses for storage equipment, allowing the team to contribute mission-critical climatology models to a growing international community.