Current Trends and Innovations in the Storage Industry
By Kirill Malkin, Chief Technology Officer

Over the past couple of years, the storage networking industry has been going through a quiet revolution. Traditional scale-up storage arrays built on hard disks now approach a petabyte and are being augmented with terabytes of ultra-high-performance solid-state tiers or caches. Proprietary single-processor hardware architectures are being replaced with x86-based multi-core number crunchers running at 3 GHz or more. Dynamic RAM cache capacities have grown by at least an order of magnitude, now reaching tens and even hundreds of gigabytes.
Storage and network connectivity speeds have kept pace: 6 Gb/s SAS and SATA disks are ubiquitous, 10GbE is the de-facto networking standard, as is 8 Gb/s Fibre Channel, with the next generation of each just around the corner.
With such massive improvements across the storage hardware technologies, the natural expectation is that the storage systems will follow the example of hypervisors and be able to provide shared storage services to a hefty mix of different applications.
And yet, by and large, storage network designs have stayed the same. Both array vendors and IT admins take the safe road of recommending and building isolated, siloed infrastructures, where each application is allotted a certain redundant set of hard disks, possibly fronted by a solid-state tier to achieve higher performance for business-critical applications. Even if the storage system offers multiple access methods (block and file), the protocols are usually stacked on top of each other and routed to independent RAID groups that cannot be easily reconfigured. While this approach delivers certain performance guarantees, much of the hardware capacity and resources remains underutilized, and the systems are generally complex to tune and scale.
Why is this? The key reason is that the firmware stacks have not evolved as rapidly as the hardware they now run on. Many products incorporate battle-hardened, tried-and-true SAN and NAS architectures developed a decade ago and mashed together, sometimes with the help of virtualization, into a unified solution with a limited ability to adapt to the heavy mix of workloads presented by modern applications. Unfortunately, these stacks were not designed to run concurrently and share the available storage resources based on application requirements, so vendors recommend physically separating the resources into isolated pools as the only way to escape the unpredictability of the "I/O blender". Another challenge is the sudden explosion of fundamentally new storage devices: SSDs, which come in very different shapes and forms in terms of both connectivity (SATA, SAS, PCIe) and performance characteristics. Traditional algorithms developed for HDDs simply don't work well with SSDs, and integrating them merely as a faster storage tier proves risky and ineffective. And of course, the tremendous growth of capacity under management, from a few terabytes to hundreds of terabytes, is quickly rendering obsolete the stacks built on legacy 32-bit architectures.
This is why the storage market is in need of a next-generation storage networking solution: one that can predictably handle the mixed workloads generated by a given set of applications, consistent with the requirements of each individual application. With state-of-the-art hardware resources readily available, the gap can be closed by deploying an innovative multi-protocol storage stack.
The key feature of this stack would be the ability to track application context, allocate storage tiers, prioritize requests and route I/O while striking a delicate balance between available resources and the just-in-time needs of each application. All storage and processing resources (HDDs, SSDs, NV/DRAM, CPUs, networking, etc.) should be pooled and made available for allocation on demand. That way, all applications will have access to all resources of the storage system, rather than being confined to rigid, isolated silos.
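To make the idea concrete, here is a minimal sketch of priority-based I/O routing over pooled resources. All names (AppContext, IOScheduler, the tier labels) are illustrative assumptions, not any vendor's actual API; a real stack would operate on device queues, not strings.

```python
import heapq
from dataclasses import dataclass
from itertools import count

@dataclass
class AppContext:
    """Hypothetical per-application context tracked by the stack."""
    name: str
    priority: int          # lower value = higher priority
    preferred_tier: str    # e.g. "ssd" or "hdd"

class IOScheduler:
    """Routes requests from one shared queue by application priority."""
    def __init__(self):
        self._queue = []
        self._seq = count()  # FIFO tie-breaker among equal priorities

    def submit(self, ctx: AppContext, request: str):
        heapq.heappush(self._queue, (ctx.priority, next(self._seq), ctx, request))

    def dispatch(self):
        """Pop the most urgent request and route it to that app's tier."""
        _, _, ctx, request = heapq.heappop(self._queue)
        return (ctx.preferred_tier, ctx.name, request)

oltp = AppContext("oltp-db", priority=0, preferred_tier="ssd")
backup = AppContext("nightly-backup", priority=9, preferred_tier="hdd")

sched = IOScheduler()
sched.submit(backup, "read 64MB")
sched.submit(oltp, "write 8KB")
print(sched.dispatch())  # the OLTP write wins despite arriving second
```

The point of the sketch is that both applications draw from one shared queue and one shared device pool; priority and tier preference live in the application context, not in a physical silo.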
The use of various efficiency and performance-enhancing techniques such as compression and deduplication can further extend the available resources, as long as it is possible to control their use per application context.
The configuration and administration of the storage system will shift from the storage domain to the application domain, enabling IT admins to configure their storage systems based on application (as opposed to storage) requirements. This calls for a more dynamic, flexible user interface with pluggable modules ("storage apps") designed to interview the admin about the application's features, translate the answers into requirements, and auto-tune for each resulting application context.
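A "storage app" of this kind is essentially a translation layer from application-domain answers to storage-domain requirements. The sketch below shows one hypothetical database plugin; the questions, mapping rules, and the 1.5x sizing headroom are invented for illustration, not drawn from any real product.

```python
def database_storage_app(answers: dict) -> dict:
    """Translate an admin's interview answers about a database workload
    into storage-domain requirements (illustrative mapping only)."""
    reqs = {
        "tier": "ssd" if answers.get("latency_sensitive") else "hdd",
        "protection": "mirror" if answers.get("business_critical") else "parity",
        # Skip compression if the application already compresses its data.
        "compression": not answers.get("data_precompressed", False),
    }
    # Assumed sizing rule: leave headroom above the stated working set.
    reqs["capacity_gb"] = int(answers.get("working_set_gb", 100) * 1.5)
    return reqs

print(database_storage_app({
    "latency_sensitive": True,
    "business_critical": True,
    "working_set_gb": 200,
}))
```

The admin answers questions about the application; the plugin, not the admin, decides about tiers, protection schemes and capacity, which is exactly the shift from the storage domain to the application domain described above.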
It is exciting to see that several startups have seen the writing on the wall and are hard at work on various aspects of this solution. Almost every new storage product is built on some type of dynamic pool that allows multiple redundancy schemes (virtual storage layouts) spanning all devices that belong to a certain storage class by type and performance. Many are moving from rigid, static separation of HDD and SSD tiers to seamless multi-level I/O acceleration, where all types of storage work together to deliver the required level of service to each application.
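The dynamic-pool idea can be sketched as a single pool that groups devices by storage class and carves per-volume redundancy layouts from any class on demand. The class names, layout names and widths below are assumptions made for illustration.

```python
from collections import defaultdict

class StoragePool:
    """Hypothetical dynamic pool: one pool, many classes, many layouts."""
    def __init__(self):
        self._by_class = defaultdict(list)   # storage class -> free devices

    def add_device(self, dev_id: str, storage_class: str):
        self._by_class[storage_class].append(dev_id)

    def create_volume(self, layout: str, storage_class: str) -> dict:
        """Allocate a virtual layout (e.g. a 2-way mirror) from one class."""
        width = {"2-way-mirror": 2, "raid5": 4}[layout]  # assumed widths
        free = self._by_class[storage_class]
        if len(free) < width:
            raise RuntimeError(f"not enough {storage_class} devices free")
        members, self._by_class[storage_class] = free[:width], free[width:]
        return {"layout": layout, "class": storage_class, "devices": members}

pool = StoragePool()
for i in range(4):
    pool.add_device(f"ssd{i}", "ssd")
for i in range(8):
    pool.add_device(f"hdd{i}", "hdd")

# A mirrored SSD volume and a parity HDD volume come from the same pool.
vol = pool.create_volume("2-way-mirror", "ssd")
print(vol["devices"])  # ['ssd0', 'ssd1']
```

The contrast with static RAID groups is that nothing here is pre-bound to an application: any class can back any new layout until its free devices run out.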
There is no doubt that in the near future we will see a variety of innovative storage products coming to market. They will deliver true virtualization of the storage hardware, much the same way hypervisors virtualized servers, and the typical data center will collapse to a hypervisor cluster connected to a storage system.