Challenge: Reducing Cost, Boosting Availability and Scalability of Unstructured Data Storage
Many of today’s companies face exponential growth in unstructured data; meeting this challenge is a question not of if but of when. Deploying sufficient storage capacity behind a conventional centralized server becomes increasingly problematic as capacity needs rise. Not only does each storage device’s effective throughput decrease as more devices are added to the server, but the availability of the data is also severely compromised by reliance on a single point of failure (the server).
Distributed database architectures offer an alternative that addresses these scalability challenges (especially when there’s a need to maintain SLAs): spreading unstructured data across multiple servers and storage nodes significantly improves performance and data availability. However, the cluster approach brings challenges of its own. Adding servers to a cluster can be costly and labor-intensive, both in the initial hardware deployment and in rebalancing data when servers are added or removed, and data must also be replicated throughout the cluster to ensure adequate fault tolerance.
Solution: Basho Riak and Riak CS—Open Source Distributed Databases
Basho is a distributed systems company that makes Riak, an open source distributed NoSQL database, and Riak CS, open source cloud storage software. Both Riak and Riak CS are extremely effective when used to store large amounts of critical unstructured data that applications and users can constantly access (for example, patient health records, archived weather data, video storage for major media networks, user information and settings for mobile apps, and much more).
Riak differentiates itself from other NoSQL database solutions with its particular emphasis on high availability, making it an excellent choice for environments where downtime is unacceptable. Riak’s built-in replication and masterless design allow it to survive network partitions and hardware failures that would significantly disrupt most other databases.
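To illustrate why replication combined with a masterless design tolerates node failures, here is a minimal Python sketch of quorum-style reads and writes. It is not Riak’s implementation: the Node class, the put and get helpers, and the w and r thresholds are hypothetical names chosen for illustration, and conflict resolution between replica versions is ignored entirely.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A hypothetical storage node; every node can accept any request."""
    name: str
    up: bool = True
    data: dict = field(default_factory=dict)


def put(replicas, key, value, w):
    """Write to every reachable replica; succeed if at least w acknowledge."""
    acks = 0
    for node in replicas:
        if node.up:
            node.data[key] = value
            acks += 1
    return acks >= w


def get(replicas, key, r):
    """Read from reachable replicas; succeed if at least r of them answer."""
    answers = [n.data[key] for n in replicas if n.up and key in n.data]
    if len(answers) < r:
        raise LookupError(f"only {len(answers)} replicas answered, need {r}")
    # A real system would reconcile differing versions; this sketch does not.
    return answers[0]


if __name__ == "__main__":
    cluster = [Node("a"), Node("b"), Node("c")]
    print(put(cluster, "patient:17", "record-v1", w=2))  # all three nodes up
    cluster[0].up = False                                # one node fails
    print(get(cluster, "patient:17", r=2))               # still readable
    print(put(cluster, "patient:17", "record-v2", w=2))  # still writable
```

With three replicas and thresholds of two, the loss of any single node leaves both reads and writes available, because no node plays a special master role.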
Another distinguishing characteristic of Riak is its operational simplicity, especially at scale. With Riak, users can easily add nodes to a cluster or remove them from it, and Riak automatically rebalances data across the cluster. It also achieves a near-linear performance increase as users add capacity. Additionally, since Riak is built to withstand node failure, operators don’t need to worry when nodes go down, as the system remains available for reads and writes.
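The automatic rebalancing described above can be pictured with a simplified partition ring, sketched below in Python. This is an illustration under stated assumptions, not Riak’s actual algorithm: the ring size of 64, the node names, and the functions partition_for, initial_assignment and add_node are all hypothetical. The point it shows is that when a node joins, only the partitions handed to the new node move; every other partition stays where it was.

```python
import hashlib
from collections import Counter

RING_SIZE = 64  # hypothetical number of fixed partitions in the ring


def partition_for(key: str) -> int:
    """Hash a key to one of the fixed partitions."""
    digest = hashlib.sha1(key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % RING_SIZE


def initial_assignment(nodes):
    """Deal partitions out to the nodes in round-robin order."""
    return {p: nodes[p % len(nodes)] for p in range(RING_SIZE)}


def add_node(assignment, new_node):
    """Give the new node an even share, taken from nodes above that share."""
    updated = dict(assignment)
    node_count = len(set(updated.values())) + 1
    target = RING_SIZE // node_count
    load = Counter(updated.values())
    taken = 0
    for p in sorted(updated):
        if taken >= target:
            break
        owner = updated[p]
        if load[owner] > target:
            updated[p] = new_node
            load[owner] -= 1
            taken += 1
    # Simplification: the new node ends up with neighbouring partitions,
    # which a production system would avoid for the sake of replica placement.
    return updated


if __name__ == "__main__":
    ring = initial_assignment(["node-a", "node-b", "node-c"])
    grown = add_node(ring, "node-d")
    moved = [p for p in ring if ring[p] != grown[p]]
    print(f"{len(moved)} of {RING_SIZE} partitions moved")
    print(Counter(grown.values()))
```

Because keys map to partitions rather than directly to nodes, the key-to-partition mapping never changes when the cluster grows; only partition ownership does.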
Riak CS is built on top of Riak, exposing higher-level storage functions, including large object support, an S3- and Swift-compatible API, multi-tenancy, and per-user storage and access statistics. Using Riak as the underlying storage ensures that Riak CS has the same level of high availability, fault-tolerance and operational simplicity as Riak. Riak CS is ideal for building public or private clouds, as reliable storage to power applications and services, or as a turnkey replacement for applications leveraging other cloud solutions.
Riak CS supports large objects and multi-part upload, allowing for the storage of objects in the terabyte range. When an object is uploaded to Riak CS, it is broken up into smaller blocks that are streamed, stored and replicated in the underlying Riak cluster. Each block is associated with metadata for retrieval. Since data is replicated, and other nodes automatically take over the responsibilities of nodes that go down, data remains available even in failure conditions.
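The idea of splitting a large object into blocks and tracking them with metadata can be sketched as follows. This is a toy Python illustration, not the Riak CS storage format: the Manifest class, the put_object and get_object helpers, the block key layout, and the 4-byte block size (kept tiny so the example is easy to follow) are all assumptions. A plain dict stands in for the underlying replicated key/value store.

```python
import hashlib
from dataclasses import dataclass, field

BLOCK_SIZE = 4  # hypothetical; deliberately tiny for demonstration


@dataclass
class Manifest:
    """Metadata needed to find and verify the blocks of one object."""
    object_key: str
    total_size: int
    checksum: str
    block_keys: list = field(default_factory=list)


def put_object(store, object_key, data, block_size=BLOCK_SIZE):
    """Split data into fixed-size blocks, store each one, return a manifest."""
    manifest = Manifest(
        object_key=object_key,
        total_size=len(data),
        checksum=hashlib.sha256(data).hexdigest(),
    )
    for index, offset in enumerate(range(0, len(data), block_size)):
        block_key = f"{object_key}/block/{index}"
        store[block_key] = data[offset:offset + block_size]
        manifest.block_keys.append(block_key)
    return manifest


def get_object(store, manifest):
    """Fetch the blocks in manifest order and verify the reassembled object."""
    data = b"".join(store[key] for key in manifest.block_keys)
    if hashlib.sha256(data).hexdigest() != manifest.checksum:
        raise ValueError(f"checksum mismatch for {manifest.object_key}")
    return data


if __name__ == "__main__":
    store = {}  # stand-in for the replicated key/value store
    manifest = put_object(store, "videos/clip.bin", b"hello riak cs")
    print(len(manifest.block_keys), "blocks stored")
    print(get_object(store, manifest))
```

Because each block is an ordinary key/value entry, it inherits whatever replication and failover the underlying store provides, which is the property the paragraph above relies on.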
High-Availability Object Storage With Lower Cloud TCO
Basho’s Riak and Riak CS solutions deliver exceptional data availability and fault tolerance (through features such as automatic, intelligent replication), as well as uncompromised performance in large-scale environments, without burdening users with complex installation and management tasks. As a result, organizations spend much less time administering their database infrastructure, which translates directly into reduced TCO.
What’s more, Basho has committed to significantly advancing the economics and performance potential of cloud architectures by collaborating with Seagate Technology on its development of the Seagate Kinetic Open Storage platform.
Basho Leverages Seagate Kinetic Open Storage Platform to Decrease Cloud Architecture TCO, Boost Performance
The Seagate Kinetic Open Storage platform eliminates the storage server tier of traditional data center architectures by enabling applications to speak directly to the storage system, making it possible to reduce the acquisition, deployment and support costs of hyperscale storage infrastructures. The platform is a fundamentally new architecture, integrating an open source API and Ethernet connectivity with Seagate hard drive technology.
Customers deploying Riak on the Seagate Kinetic Open Storage platform can realize the following benefits:
- An increase in I/O efficiency by removing bottlenecks and optimizing cluster management, data replication, migration and active multi-datacenter performance
- An improvement in customer total cost of ownership (TCO) of up to 50% through simplified operations
- Additional cost savings from maximized storage density and reduced power and cooling costs, plus potentially dramatic savings in cloud data center build-outs
Basho’s distributed database relies on key/value stores directly attached to servers. The Seagate Kinetic drive simplifies management of the layers beneath the database (the key/value store, file system, logical volume manager, RAID controllers and the devices themselves) by replacing them with a simple socket-based network interface. Freeing drives from their server chassis allows the capacity and throughput of a cloud architecture to be scaled independently, providing more options in the long run.