Cloud storage technology options, along with developing trends such as Big Data, are driving rapid industry growth.
Cloud storage developed around three model options: Infrastructure as a Service; Platform as a Service; and Software as a Service. Merging with these cloud service models are technologies providing cloud computing space and cloud backup services. Don’t forget to factor in options such as public cloud, private cloud and hybrid cloud.
One of the biggest tripping points in managing cloud computing is selecting a solution that avoids vendor lock-in. That is where open source technologies have pushed the envelope as an alternative to proprietary, or closed source, solutions.
“Most if not all of the really big public clouds, as well as some of the more successful private clouds, are based on Linux or some flavor of an open source operating system,” Suda Srinivasan, senior director of product marketing for Coraid, told LinuxInsider.
Having entered the Linux marketplace in 2004, Coraid focuses on technology to drive scale-out performance, Ethernet simplicity and an elastic storage architecture to handle massive data growth.
LinuxInsider recently spoke with Srinivasan about the trends driving changes in private and public cloud storage technology.
LinuxInsider: What are the dominant forces that you see shaping the cloud storage industry?
Suda Srinivasan:
A trend we are seeing is the emergence of platforms like OpenStack. It is a prominent platform in the cloud storage business. We believe that this type of an open cloud model is the way to go. This is something that is driven by customer need. Customers are asking their providers to make the OpenStack environment available so they can start playing with EMC, Coraid or other open source environments. We announced in April that we now support OpenStack as well. This is important. If you look at how storage is going to be consumed in the future, it is primarily going to be driven by the cloud platform — and the cloud platform really needs to access resources at every level.
LI: Is there an emerging technology that you see pushing the storage capabilities envelope better than other approaches?
Srinivasan:
At the data train or storage level, the secret sauce is scale-out Ethernet storage. One of the cool things about this technology is that it is not a point-to-point, connection-based protocol. It allows you to do massive IO from the server to the storage nodes. That means you are not pushing all of your data through predefined, port-to-port connections. Instead, what you are doing is using the broadcast capabilities of Ethernet to send your traffic through all available ports, and because of that you get things like port-like multitasking for free. This way you automatically use all available ports to send out your traffic.
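One rough way to picture spreading traffic over every available port, rather than pinning it to a single port-to-port connection, is a simple round-robin scheduler. The sketch below is only an illustration under that assumption: the port names, the request labels and the `distribute_requests` function are hypothetical, and it does not represent Coraid's actual protocol.

```python
from collections import Counter
from itertools import cycle


def distribute_requests(requests, ports):
    """Assign each I/O request to a port in round-robin order."""
    if not ports:
        raise ValueError("at least one port is required")
    port_cycle = cycle(ports)
    return [(request, next(port_cycle)) for request in requests]


# Hypothetical port names and block requests, for illustration only.
ports = ["eth0", "eth1", "eth2", "eth3"]
requests = [f"block-{i}" for i in range(8)]

assignments = distribute_requests(requests, ports)

# Count how many requests landed on each port.
load = Counter(port for _, port in assignments)
print(load)
```

With eight requests and four ports, each port carries the same share of the traffic, which is the effect the answer above describes.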
LI: Why is this a significant change in cloud technology?
Srinivasan:
On the storage side, there is a building block style approach that is very similar to what you see on a computer network side. There has been this pretty drastic shift away from monolithic, proprietary compute modes onto a homogeneous model where you have Intel-based commodity hardware and virtual machines sitting on top of that. Now, instead of having one application running on each physical server, you have the opposite. You have the ability to spin off virtual machines at the flick of a button. The process of dealing with compute modes has become so much simpler and much less expensive.
LI: What is the impact of this new approach?
Srinivasan:
The problem now is, how do you manage all of these systems? If you look at what has happened at VMware over the last few years, they used to sell hypervisors; now they are giving them away for free. The hypervisors have become commoditized. The real value has become, how do you coordinate across all of these fields? And how do you automate the process? And so on. That is a shift we have seen in compute. Now we are seeing it in networking.
LI: So this is now the current trend for cloud storage and Big Data even?
Srinivasan:
We are seeing new entrants who are coming up with commodity-based hardware and an all-software layer. The value will be in the control. How do you manage your entire deployment from a centralized control point and have visibility across the entire data system? The problem is not ‘can I scale effectively?’ It is ‘how do I manage that scale?’ That applies to both public and private clouds.
LI: How is this affecting the cloud storage concept?
Srinivasan:
The big concerns are now that you have terabytes of storage, how are you going to a) deploy all that storage, and b) … manage it over time? The shift is definitely towards management. And the cost is definitely going up, because you are talking about large deployment. People have to figure out how they are going to deploy and manage at scale. A lot of the manual operations that storage admins used to do are not going to be applicable any more.
LI: Does this also impact the status of how IT operates?
Srinivasan:
The biggest problem that IT is facing today is competition from public cloud providers like Amazon. Amazon is defining the standard for how infrastructure is consumed in the market today. A lot of enterprise customers are coming to their IT departments and saying that they are able to get volume from Amazon within a minute. It is extremely simple. If you use the Amazon platform, it is as simple as making a few clicks. So you go to your own IT department, and they tell you it will take them two weeks to get you more volume.
LI: Does that reality give public clouds a marketing advantage?
Srinivasan:
That is one reason why public clouds are becoming more and more popular. You can get ease of use; you can get agility and a very good price for it from vendors like Amazon. So the public cloud is very viable. But Amazon provides very few flavors of storage. So if a customer wants a particular performance set with unique parameters, most public cloud providers do not provide that kind of service. They only have one or two flavors. So the big question is, how do you take that ease of use and that simplicity from Amazon or another public cloud, and marry that with customizability and the ability to specify different parameters for your storage?
LI: What are the challenges cloud storage users face today?
Srinivasan:
One of the biggest challenges is striking a balance between your desired level of agility and the ease of Amazon. Coraid’s product is basically trying to tackle that by creating different storage profiles that customers can expose as Amazon-style cloud services. Coraid’s EtherCloud is our management and optimization framework. It has an interface that is very similar to Amazon’s and lets you ask for storage at a single click. The process of provisioning and managing storage is automated. It is a policy-based system that is going to bring down a user’s overhead.
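The idea of policy-based storage profiles can be sketched in a few lines. This is a hypothetical illustration, not EtherCloud's interface: the `StorageProfile` class, the profile names "fast" and "bulk", their parameters and the `provision` function are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StorageProfile:
    """A named policy describing how a volume should be built."""
    name: str
    media: str      # hypothetical media type, e.g. "ssd" or "hdd"
    replicas: int   # hypothetical number of copies to keep


# Hypothetical profiles; real deployments would define their own.
PROFILES = {
    "fast": StorageProfile(name="fast", media="ssd", replicas=2),
    "bulk": StorageProfile(name="bulk", media="hdd", replicas=3),
}


def provision(profile_name, size_gb):
    """Return a volume description derived from the named profile."""
    if size_gb <= 0:
        raise ValueError("size_gb must be positive")
    profile = PROFILES.get(profile_name)
    if profile is None:
        raise KeyError(f"unknown profile: {profile_name}")
    return {
        "profile": profile.name,
        "size_gb": size_gb,
        "media": profile.media,
        "replicas": profile.replicas,
        # Raw capacity consumed once every replica is counted.
        "raw_gb": size_gb * profile.replicas,
    }


volume = provision("fast", 100)
print(volume)
```

The user asks only for a profile and a size; everything else follows from the policy, which is where the reduction in manual overhead would come from.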
LI: How viable are private clouds against the reach that public clouds provide customers?
Srinivasan:
Private clouds as a stepping stone to public clouds is something you are going to see more often. Enterprises — private clouds — are trying to do what Amazon has done in the public cloud so they can be as nimble. Also, the move towards hybrid clouds becomes easier. If you have a private cloud infrastructure which looks the same to an application, then it is easy for you to do things like cloud bursting. If you have a temporary spike in storage, you can use the public cloud to move excess data over when you have a spillover.
Cloud bursting is something that companies are doing more often, but for that to work seamlessly, you need to have a way of organizing your internal structure that is also cloud-based. It is hard to go from your legacy model of operating and managing your infrastructure to the cloud. So what enterprises are doing is becoming a private cloud that has API compatibility with the public cloud.
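The spillover logic behind cloud bursting can be sketched as follows. This is a minimal illustration with made-up numbers: the `place_data` function and the capacities are hypothetical, and a real deployment would also have to handle data movement, cost and API compatibility.

```python
def place_data(demand_tb, private_capacity_tb):
    """Split a storage demand between a private and a public cloud.

    The private cloud is filled first; anything beyond its capacity
    spills over ("bursts") to the public cloud.
    """
    if demand_tb < 0 or private_capacity_tb < 0:
        raise ValueError("sizes must be non-negative")
    private_tb = min(demand_tb, private_capacity_tb)
    public_tb = demand_tb - private_tb
    return {"private_tb": private_tb, "public_tb": public_tb}


# Hypothetical figures: a 100 TB private cloud under normal and peak load.
normal = place_data(demand_tb=80, private_capacity_tb=100)
spike = place_data(demand_tb=130, private_capacity_tb=100)

print(normal)
print(spike)
```

Under normal load nothing leaves the private cloud; during the spike only the excess is placed in the public cloud.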
LI: Can private clouds survive as anything other than a stepping stone to public clouds?
Srinivasan:
The private cloud services need to think about how they can provide differentiated services at a price point that can actually make sense. A lot of early stage public cloud providers today are still buying traditional legacy infrastructure storage and then packaging it as a cloud service. That just does not work.
Look at how Amazon or Facebook or Google have built their infrastructure. They all have cloud-like architectures, but they don’t use legacy storage facilities. They do not use EMC. For one, it is too expensive. Second, all of those architectures are controller-based. All those architectures are scale-up in the sense that they use a pair of very powerful machines called “controllers” and dumb disks at the back. When you reach capacity, you have to buy another controller and start over again or rip out and replace it with a more powerful controller. That is an expensive, complex process.
LI: Are the two approaches radically different?
Srinivasan:
Providers like Facebook and Google and all those guys have internal infrastructure that looks very different from the model of enterprise storage. What they do is use a lot of small boxes. With the scale-out architecture, each box has its own processing capacity. As your needs grow, you just have to keep adding these boxes. You do not have to buy the big powerful controllers with the dumb storage disks in the back. That is more cost-efficient storage, and it lets the vendor provide telescoping storage when a customer needs it.
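The difference in growth granularity between the two models can be illustrated with some simple arithmetic. The unit sizes below are hypothetical, chosen only to show the effect, and the function names are assumptions made for the example.

```python
import math


def units_needed(demand_tb, unit_capacity_tb):
    """Number of whole units required to cover the demand."""
    if unit_capacity_tb <= 0:
        raise ValueError("unit_capacity_tb must be positive")
    return math.ceil(demand_tb / unit_capacity_tb)


def provisioned_capacity(demand_tb, unit_capacity_tb):
    """Capacity actually purchased when buying in whole units."""
    return units_needed(demand_tb, unit_capacity_tb) * unit_capacity_tb


# Hypothetical sizes: a controller-based system that grows in 500 TB
# steps versus scale-out boxes that grow in 50 TB steps.
demand_tb = 520
scale_up_tb = provisioned_capacity(demand_tb, unit_capacity_tb=500)
scale_out_tb = provisioned_capacity(demand_tb, unit_capacity_tb=50)

print("scale-up idle capacity:", scale_up_tb - demand_tb, "TB")
print("scale-out idle capacity:", scale_out_tb - demand_tb, "TB")
```

With these assumed figures, crossing the 500 TB mark forces the controller-based system to a second large purchase, while the scale-out system adds one small box.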
LI: Is there an effective way to pick the right approach rather than relying on luck or guesswork?
Srinivasan:
Everyone is driving towards functionality that looks the same, but there are some fundamental differences between them. For instance, the OpenStack model has no vendor lock-in and is a much more open environment.
There are proprietary options available where the platform is tied to a small subset of vendors. So if I am a cloud service provider, I have to figure out if I want vendor lock-in or not, and what kind of features do I want with what kind of capabilities.
From an end-user point of view, I need to look at how I can provide a private cloud on commodity-based infrastructure that lets me scale out, with an architecture that lets me do things like cloud bursting and easily migrate to a hybrid model.
The bottom line is this: Enterprise IT needs to look at two different things. They need to look at business agility. They need to look at what end users need in terms of speed of access and ease of use. And they also have to have an eye on the TCO. How much is it costing them to provide this service?