The Music Is Only as Good as the Orchestra That Plays It

When EMC first announced ViPR just a short year ago, it seemed poised to take the industry by storm: a control-plane management system that orchestrates a heterogeneous storage environment and provides a policy-based abstraction layer to serve workloads through an open API. How cool is that?! Conceptually, isn’t that what Software-Defined Storage (SDS) is all about, commoditizing storage? Even the infomercial-inspired teaser video featuring Mr. Amitabh Srivastava himself put a wide grin on the face of anyone who knows anything about enterprise storage. Seriously, there simply couldn’t be a better organization to take this initiative than EMC itself, the company that, for all practical purposes, owns the enterprise storage market.

Back then, most analysts cautiously bought the ViPR story but remained suspicious that the very concept wouldn’t sit too well inside an 800-pound gorilla whose bread and butter still came from its enterprise arrays. Some even called it outright vaporware. Not so. EMC quickly released ViPR 1.0 in October 2013. Although it only supported EMC, NetApp, and VMware hypervisor systems, Storage Swiss concurred that “its very inclusive roadmap demonstrates EMC’s intent focus on developing a true software defined storage solution which can be consumed by many different user communities, from cloud service providers and enterprise data centers to academia.” However, the lack of general support for third-party storage gave people enough reason to believe that ViPR was merely a band-aid that EMC had scrambled to put together to solve a problem of its own making, namely the lack of interoperability among EMC’s many product lines.

Well, that market sentiment didn’t last long, because EMC, again, swiftly announced ViPR 2.0 in May 2014. Version 2.0 addresses most concerns about third-party support (through OpenStack Cinder) and offers a wide range of advanced data services. It enables customers to automate the management of both traditional (file, block) and new (object, HDFS) storage infrastructure from a single pane of glass. The doubts are all but gone, and it is clear that EMC is fully committed to going down this route. That is excellent news for storage end users and the industry as a whole. Sure, individual vendors may have their own agendas and think otherwise. Nevertheless, when we saw the ViPR 2.0 demo video, we were so impressed that our only question was “how can any storage user not want to have this?” Kudos to Srivastava and his team at the Advanced Software Division (ASD) for a job wonderfully done!

Some might wonder why EMC would go out of its way to create something so powerful that it could cannibalize its other product lines. What they probably fail to see, at least in our opinion, is that EMC has a bigger plan, and a very well thought-out one at that. If you think storage orchestration is a facilitator, a productivity tool that tremendously simplifies storage management, you are only partially correct. Think carefully, and you will realize that storage orchestration is actually an enabler. It represents a paradigm shift that leads to a new level of data freedom previously unattainable with traditional enterprise arrays, no matter how powerful or versatile they have individually become. Today, end users may not fathom how much storage orchestration can be exploited to advance their business, but tomorrow, they will wonder how they ever survived without it.

Storage systems come in all shapes and forms, and they create steep, if not insurmountable, barriers that keep data from moving freely across their boundaries. Imagine a city of upscale high-rise buildings with no roads, cars, or any form of public transportation. That wouldn’t be a very lively city, would it? The lack of data agility has always plagued enterprises large and small, but they managed to get by with third-party software or manual operations. Two things that happened in the last few years changed the status quo: flash and big data.

All of a sudden there is a super-fast tier-zero storage option that allows a single CPU to handle much higher workloads. A welcome side effect of this is lower licensing fees for software that is priced on a per-core or per-socket basis. Of course, there is no free lunch. Flash-based storage is still very expensive, and only high-value assets (data) deserve such special treatment. The problem is, most high-value data have a short lifespan, and once they have outlived their usefulness, they need to move out of their six-star hotel and make room for the new data that have just gained VIP status. Without storage orchestration, the need to constantly move data in and out across multiple tiers will bring an enterprise’s storage infrastructure to its knees.
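
To make the six-star-hotel idea concrete, here is a minimal sketch in Python of the kind of age-based rule an orchestration layer might apply when deciding which data should check out of flash. Everything in it is an assumption made for illustration: the Dataset record, the "flash" and "disk" tier labels, the plan_demotions helper, and the 30-day threshold are all hypothetical, and none of it reflects ViPR’s actual API or policy model.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Dataset:
    name: str
    tier: str            # hypothetical tier labels: "flash" or "disk"
    last_hot: datetime   # last time the data was considered high-value


def plan_demotions(datasets, now, max_age=timedelta(days=30)):
    """Return the names of flash-resident datasets whose VIP status has expired."""
    return [
        d.name
        for d in datasets
        if d.tier == "flash" and now - d.last_hot > max_age
    ]


# Illustrative catalog; the names and dates are made up for this example.
now = datetime(2014, 6, 1)
catalog = [
    Dataset("q2-orders", "flash", now - timedelta(days=3)),
    Dataset("q1-orders", "flash", now - timedelta(days=75)),
    Dataset("archive-2012", "disk", now - timedelta(days=500)),
]

print(plan_demotions(catalog, now))
```

With the made-up catalog above, only the 75-day-old flash resident is flagged for demotion. The rule itself is trivial. The hard part is carrying out the resulting moves across arrays that do not talk to one another, which is where orchestration earns its keep.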

The arrival of big data and the Internet of Things does not make the situation any easier for enterprises, either. To process this much data, people resort to scale-out computing, using an army of servers based on commodity hardware to divide and conquer the bits and bytes. To speed things up, the entire scale-out community religiously depends on server-side direct-attached disks or even flash. They won’t touch a network array with a ten-foot pole. Since servers and disks are cheap enough, they happily depend on the three-copy strategy to take care of their redundancy nines. But as a certain major Internet company found out, its PCIe-powered NoSQL instances still need to be backed up to a traditional enterprise array to guard against logical errors that could wipe out all three good copies at once. In addition, more and more companies are using Hadoop or NoSQL as a staging platform to hold or preprocess the huge amount of unstructured data before they are distilled, condensed, organized, and offloaded to traditional relational databases. Without storage orchestration, how do you bridge the big data wonderland and your old-school transactional applications?

Not to beat the subject to death, but how do you effectively handle backups, disaster recovery, copy management, quality assurance, troubleshooting, and DevOps without storage orchestration? It does not matter whether you have one type of storage or ten. For tomorrow’s data freedom and business profitability, enterprises and datacenters must have their storage infrastructure federated, one way or another.

Remember, the notes dancing on Mozart’s ingenious symphony manuscripts are not called music until they are played through the fine instruments of a well-conducted orchestra. People, upgrade your storage orchestra today and play your data the way they should be played.