At the site of a former church in San Francisco’s Richmond neighborhood, Seagate is helping preserve the Internet’s past and ensuring that its vast resources remain easily accessible to future generations. Servers, hard drives, digital scanners and other high-tech tools blend seamlessly with the building’s still-intact stained-glass windows and pews.
During a recent tour of the Internet Archive, founder Brewster Kahle showed off a server room loaded with three petabytes’ worth of Seagate storage. That is “enough room to store all the books in the Library of Congress, many times over,” said Kahle. A petabyte is equal to 1,000 terabytes, or 1 million gigabytes.
Earlier this year, the Internet Archive purchased 1,000 Seagate 3TB Barracuda® hard drives and installed them in several rack-mounted servers. Each server is fitted with blue lights that blink non-stop, signaling each time someone uploads a file to the Internet Archive or downloads one from it. Those files could be digitized books, videos, TV news broadcasts, movies, web pages and more. The non-profit organization, which is funded by foundation grants and donations, has evolved considerably since it began operations in 1996, and its storage needs have grown exponentially.
Reliable, Power-Efficient Storage
“What was important to us when we purchased Seagate’s drives is that they were reliable, inexpensive and power-efficient,” Kahle explained. “If you’re going to entrust the heritage of your culture to a storage medium, you really need to have those characteristics. Seagate’s drives have performed very well for us.”
The Internet Archive’s storage workload is intense. Some 2 million people use the Archive’s online material each day. One of its most popular resources is the Wayback Machine, which lets users browse a database of more than 160 billion web pages going back to 1996. That tool receives about 1,000 queries per second from around the globe.
The Archive currently uses more than 12 petabytes of storage—a mixture of Seagate and other hard drive brands—and that number, like the Internet itself, keeps expanding.
“There’s an exponential increase in the World Wide Web, with video taking up more and more of our storage needs,” Kahle said. “As long as companies like Seagate continue to increase storage densities, cultural institutions like the Internet Archive should not only be able to keep that data safe but keep it accessible to everyone.”
Universal access to the Internet is “hugely important” to Kahle and his 150 employees, as is ensuring that the Web’s content is preserved for future generations to enjoy. He sees the Archive as a kind of “Library of Alexandria 2.0.”
“All libraries live in the shadow of the Library of Alexandria,” Kahle explained. “It was the center of learning of the ancient world. It wasn’t just a great collection; it was where new ideas came about. People came together and they learned from each other. We’re seeing some of the same things on the Internet, by giving people access to information.”
That information can be fleeting, however.
“The average life of a web page is about 100 days—before it’s either changed or disappears,” Kahle said. “It’s fantastic how much information is available at our fingertips now, but it’s also highly volatile. We support universal access to all knowledge. And we want to make sure that knowledge doesn’t disappear into the past.”
Preserving the massive amounts of digital information available on the Net—and making it more accessible—is a daunting goal. It’s one that Kahle and his team hope others will continue working toward well into the future.
“No one’s here for the money,” said David Rinehart, a digital artist. “We all love this place, and we have a passion for what we’re doing. This isn’t a 10-year project. It’s infinite.”
— By Steve Pipe