
Network Attached Storage & Object Oriented Devices

NAS Environment (continued)

NAS Goal: Enable Scalable Clustering

NAS offers the prospect of solving both open system shortcomings mentioned on the previous page.

Scalable Connectivity of Storage to Clients

The ability to scale processors and storage independently and linearly is a fundamental goal of NAS. The NAS concept physically disassociates storage from processors. Devices are no longer peripheral components of a processor but a separate and equal architectural entity. Storage can be managed, changed, and expanded with no impact on the processor configuration or operations – if NAS is done right. It should also liberate the processors in the same way. This is the easy part of scalability.

In order for NAS to deliver fully on its scaling promise, there must be no overheads that grow appreciably faster than the rate of scaling. For instance, in today's loosely coupled, shared processor systems, interprocessor communication activity goes up exponentially as the number of processors increases. This kind of overhead severely limits scalability, and NAS research aims to eliminate this barrier to scaling.

Making a cluster of servers process the same workload a mainframe accepts is easy, unless it requires that all the servers process transactions on the same data. Unless a cluster can run enterprise applications like a Fortune 500 manufacturing system, an airline's reservation system, or a financial institution's complete inventory of transactions, a cluster cannot replace a mainframe.

If, on the other hand, each processor in the cluster can accept and process any transaction that would have arrived at the mainframe, there is hope for the cluster model. The challenge is to give the servers the ability to share data, including concurrent update rights, in such a way that increasing the number of processors does not destroy performance. With this, any server in the cluster could process any transaction, and processors could be added as needed with no falloff in performance. This is what a cluster model must demonstrate if it is to replace mainframes.

Specifically, NAS aims to do the following:

  • Increase scalability of high volume servers by applying clusters of them to large, monolithic applications that traditionally have been serviced only by mainframes or tightly coupled processors. NAS will not only provide sufficient connectivity for a large group of servers to have access to all the storage they require – in terms of both distance and number – but also provide a mechanism so that many servers sharing update access to common storage can do so without the excessive overhead that accompanies clusters today.

  • Enable heterogeneous computation. That is, let systems running different operating systems access a common pool of storage and share data. Perhaps an end user with a cluster of servers in a data mining application finds that a supercomputer doing statistical analyses on the data would be of value. If the end user could just wheel up such a system and have it act upon the data already in place, it would be much more cost effective than if the end user had to purchase a completely separate system, including a duplicate set of storage devices. Given the size of many data warehouses, the latter would not be possible; the data movement time alone would be prohibitive.
     
    Because OODs free the operating system from managing how data is organized on the storage device, a server running any operating system should be able to access the data. (There is still the problem of the internal data structures of a file, which must be reconciled at the application level.)

  • Increase the I/O bandwidth to Requesters. Making storage available to Requesters at storage network speeds without "channeling" it through a server can mean more performance for data-intensive applications. (This is a principal objective of the work being done at CMU.)

  • Improve performance by eliminating the necessity for a server to translate a request into physical device accesses. When several servers are accessing the same OOD data, the metadata never travels over the Interconnect. Though many factors affect how caching is done and how the object abstraction influences it, there are some clear opportunities for benefit. A common cache of the metadata in the OODs serves all Requesters, with no coherency issues. The number of I/O operations is correspondingly reduced, as a single metadata retrieval can service requests from multiple Requesters. What is more, because the OODs understand quite precisely which objects are in use and which are not, the cache space can be used more effectively. Typically, even system caches are not tied closely to the file system and do not reflect what the OS knows about files being open or closed. (Of course, sometimes this knowledge can be misleading; a file can be closed and then be needed again quite soon. A cache that is not as closely tied to the file system may fortuitously retain data that otherwise would have been discarded. Even this condition can be accounted for in the OOD environment, with an indication that an object being closed will soon be accessed again.)
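
The shared metadata cache described in the last point can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the class name ObjectDevice, the layout dictionary standing in for on-media metadata, and the reopen_soon hint are hypothetical and do not come from any OOD definition. The sketch only shows the idea that one device-side cache serves every Requester, and that the device can use open/close knowledge to decide what to keep.

```python
class ObjectDevice:
    """Hypothetical OOD holding one device-side metadata cache for all Requesters."""

    def __init__(self, layout):
        # layout maps object id -> physical block numbers (stand-in for on-media metadata).
        self._layout = dict(layout)
        self._cache = {}
        self._open_counts = {}
        self.metadata_reads = 0  # how often metadata had to be fetched from media

    def open(self, object_id):
        self._open_counts[object_id] = self._open_counts.get(object_id, 0) + 1

    def read(self, object_id):
        # The device, not the server, translates the object request into blocks.
        if object_id not in self._cache:
            self.metadata_reads += 1
            self._cache[object_id] = self._layout[object_id]
        return list(self._cache[object_id])

    def close(self, object_id, reopen_soon=False):
        remaining = self._open_counts.get(object_id, 0) - 1
        if remaining > 0:
            # Another Requester still has the object open; keep the metadata.
            self._open_counts[object_id] = remaining
            return
        self._open_counts.pop(object_id, None)
        if not reopen_soon:
            # No one is using the object, so its cache space can be reclaimed.
            self._cache.pop(object_id, None)

    def is_cached(self, object_id):
        return object_id in self._cache


device = ObjectDevice({"orders": [12, 13, 40]})
for requester in ("server_a", "server_b"):
    device.open("orders")
    blocks = device.read("orders")
print(device.metadata_reads)  # prints 1: one retrieval served both Requesters
```

A real device would bound the cache and apply a replacement policy; the sketch leaves that out to keep the open/close bookkeeping visible.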

Storage Manageability

The second fundamental problem is that of storage management. The NAS view is to make all components of the storage architecture participate in the management. By breaking down management from a huge CPU task to many small activities assigned to the lowest possible level and directing those activities by means of simple attributes expressed by the user, storage takes on the responsibility for managing itself.

NAS should make it possible to:

  • Make storage as self-managed as possible. This will eliminate the associated drudgery now imposed on the operating system by such requirements as space management. It should also make scalability more linear by increasing storage management capability at the same rate as the number of storage devices increases. Today an operating system assumes the responsibility for allocating space on a disc drive, reclaiming space from deleted files, and – in some cases – deallocating bad sectors. Doing this for one drive is not difficult, but a server with dozens of drives could find it consuming a good portion of its processing cycles. OODs would take over space management, eliminating any increase in OS overhead associated with the number of drives on a system. A server with dozens of disc drives would get the benefit of all those devices contributing horsepower for managing the storage resource.

  • Support attribute based management by having the devices take action based on the properties assigned to a given object or set of objects. The many engines (each storage device having some usable processing power to apply to this task) available would be put to additional use by helping with more than just space management. They could contribute to breaking the task of data management into many simple, small functions performed concurrently. For instance, an attribute could be set for an object that called for that object to be versioned. Every time the object was closed after an update, the storage device could automatically keep the old version of the object while giving the new one a separate identifier.
     
    Similarly, an attribute might be set to indicate that an object should be exported after it was updated. This could cause the device to start a sequence of actions independent of the application processor that would result in a copy of that object being sent to another device. Extrapolating this very simple process, an entire storage complex could participate in a more timely and less intrusive backup function. If the devices knew enough about what work was going on, they could make sure that an export operation only took place when an object was in a consistent state. This is a little more complicated for some data; complex data structures like data bases may require that multiple OODs take coordinated action to set them up, but even this can be handled by a straightforward extension of the basic capability.

  • Support quality of service management by having the storage devices be as knowledgeable as possible about their own conditions and informing the appropriate service of those conditions or acting in response to those conditions as guided by policy assignments. Suppose a disc drive has several levels of transfer rate performance to offer a Requester. The device could allocate an object to whatever zone is most appropriate given the user's interest in cost versus performance. This could as easily apply to NAS that has combinations of mirrored, RAID 5, and unprotected storage, with the user selecting the residence for particular data depending on the user's requirements for resiliency. A quality of service management system could translate user choices into attributes associated with particular objects, leaving the device to act on those attributes and use its resources accordingly.

  • Facilitate the dynamic reconfiguring and expanding of storage. Whenever additional space is required, there would be a central authority to which any server in the network could turn to find additional space or to find all storage devices available to it. This could be the basis for operating systems being more dynamic and flexible as to what hardware they are operating with at any point. It need not be the peripheral set that was present at system generation time or even power up time.

  • Present as nearly as possible a single system view to users. If all Requesters accessing a NAS configuration – and all users contacting the Requesters – get a single view of the data regardless of which member of a cluster they are connected to, clusters of systems can look and feel much more like a single system than otherwise. If clusters of high volume servers are to replace large monolithic systems, this will be key.

  • Present as nearly as possible a single system view for management purposes. If clusters are to work as well as mainframes, the management tools must let the user control the configuration as if it were a single, coherent computing facility, even if there are multiple vendors and operating systems represented.
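
The attribute based management point above (versioning an object on close, and exporting it after an update) can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a description of any actual OOD interface: the names AttributeDevice, close_after_update, versioned, and export_on_update are hypothetical, and the "base#serial" identifier scheme is invented for the example.

```python
from dataclasses import dataclass, field


@dataclass
class StoredObject:
    data: bytes = b""
    attributes: dict = field(default_factory=dict)


class AttributeDevice:
    """Hypothetical storage device that acts on per-object attributes."""

    def __init__(self, export_target=None):
        self.objects = {}
        self.export_target = export_target  # another device that receives copies
        self._serial = 0

    def create(self, object_id, data=b"", **attributes):
        self.objects[object_id] = StoredObject(data, dict(attributes))

    def close_after_update(self, object_id, new_data):
        """Apply an update at close time; return the id that holds the new data."""
        obj = self.objects[object_id]
        target_id = object_id
        if obj.attributes.get("versioned"):
            # Keep the old version untouched and give the new one a separate id.
            self._serial += 1
            target_id = f"{object_id.split('#')[0]}#{self._serial}"
            self.objects[target_id] = StoredObject(new_data, dict(obj.attributes))
        else:
            obj.data = new_data
        if obj.attributes.get("export_on_update") and self.export_target is not None:
            # The copy is sent by the device itself, without the application processor.
            self.export_target.create(target_id, new_data)
        return target_id


backup = AttributeDevice()
primary = AttributeDevice(export_target=backup)
primary.create("report", b"draft", versioned=True, export_on_update=True)
new_id = primary.close_after_update("report", b"final")
print(new_id, primary.objects["report"].data, backup.objects[new_id].data)
```

The sketch exports immediately on close. Ensuring that an export happens only when an object is in a consistent state, or coordinating several devices for a database, would need additional protocol that is not modeled here.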

Inherent Security

Object oriented storage offers new opportunities for improving system security. At the very least, it could prevent random writing of sectors not associated with a file operation. The most common type of virus is one that changes the contents of a single sector on a PC's hard disc. Any software module has the ability to do this, as there is little or no security in most PC systems.
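
As a rough illustration of how an object interface could block stray sector writes, the following Python sketch accepts writes only by object identifier and block index within that object. The class GuardedDevice and its methods are hypothetical names chosen for the example; a Requester in this model has no way to name an arbitrary sector.

```python
class GuardedDevice:
    """Hypothetical device that accepts writes only in terms of objects."""

    def __init__(self):
        self._extents = {}  # object id -> sectors the device allocated to it
        self._media = {}    # sector number -> contents
        self._next_free = 0

    def create(self, object_id, block_count):
        # The device chooses the sectors; the Requester never sees them.
        sectors = list(range(self._next_free, self._next_free + block_count))
        self._next_free += block_count
        self._extents[object_id] = sectors

    def write(self, object_id, block_index, data):
        sectors = self._extents.get(object_id)
        if sectors is None or not 0 <= block_index < len(sectors):
            # The request does not correspond to any block of a known object.
            raise PermissionError(f"no block {block_index} in object {object_id!r}")
        self._media[sectors[block_index]] = data

    def read(self, object_id, block_index):
        return self._media.get(self._extents[object_id][block_index])


device = GuardedDevice()
device.create("boot", 1)
device.create("notes", 2)
device.write("notes", 1, b"hello")
try:
    device.write("notes", 2, b"overwrite")  # outside the object: rejected
except PermissionError as err:
    print("rejected:", err)
```

This covers only the addressing side; it says nothing about authenticating the Requester, which the object check would need to be combined with.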

More than this, new security provisions could be architected into the OOD definition so that any spying on disc or tape traffic could not make available any usable information, either for espionage or sabotage.
