The Underlying Assumptions of Virtualization

by bob on June 15, 2007 · 2 comments

in Editorial

There are so many ways to break down virtualization that it’s nearly maddening, or it would be if it weren’t so interesting to think about.

One way to look at types of virtualization is by the “virtual entity” that you’re presenting—i.e., desktop, server, storage, application, and even service virtualization (I’m speaking on that particular topic at the SOAWorld 2007 conference in a week or so … more on that later). In this ontology what’s common is that by separating the logical (or virtual) thing that’s being presented from the underlying physical reality, a certain operational flexibility is introduced. In other words, those virtual entities can be freely moved to different physical infrastructure.

This has largely enabled the server consolidation trend, triggered by the stark reality that much of the costly IT infrastructure in the enterprise has capacity to spare. As a side benefit, putting the same number of virtual entities on a consolidated set of physical infrastructure may save on operational costs as well (we all know that this reality has much more “texture” to it than that rather naive statement, but that’s a post for another day!).

When you think about this a bit, you realize that there is a common assumption underlying all of these, namely:

Infrastructure is costly to acquire, costly to deploy, and costly to operate, so it must be run as highly loaded as reasonably possible.

There are at least three direct, and usually unforeseen, consequences of this. First, effectively “putting all of your eggs in one basket” makes it more important that individual physical entities not fail. That means better engineered hardware with more redundancy and so forth … pretty much the opposite of commodity. Second, it increases the pressure to ensure that the applications can deal with failure well, which in a similar manner tends to make the applications and their supporting infrastructure more complex, costly, and shall we say, perhaps “corpulent”? Third, this approach tends to reduce flexibility, since the physical infrastructure is being run with less excess capacity (it’s expensive, remember?).

Taken together this forms a sort of Gordian knot, in which the costly infrastructure leads to more expensive operations with great downside risk in the event of failure, which leads to great expenditures on infrastructure, and so on. Well, you get the game.

What’s interesting about all of these forms of virtualization is that, as a direct result of the underlying assumption, they more or less enforce a sort of lower bound on aggregate cost (TCO).

In direct contrast is a newer form of virtualization, in which the underlying assumption is diametrically opposed to that of the earlier forms. In particular, in this form of virtualization the key assumption is:

Infrastructure is a commodity, which may well break and may be added or removed freely, all without any disruption to the application.

Examples of this form include our application fabric, as well as the proprietary infrastructures of Google and a handful of others (though their proprietors would likely protest that label!). These are all types of virtualization that aggregate the infrastructure; that is, many individual entities work together to appear to be one (from the taxonomy of virtualization developed by Rachel Chalmers and The 451 Group).

In the case of our application fabric the abstraction is very simple—an application fabric is the single most scalable, reliable computer that you’ve ever seen. Always works, changes size as needed, operates itself. A very solid, albeit “virtual” reality.

With this type of virtualization the Gordian knot is cut in one fell swoop: infrastructure may be truly commoditized and grown or shrunk as needed, operations can be very, very simple and therefore low cost, and in the best implementations this type of virtualization will even deliver the very highest SLAs. All of this is delivered for even ordinary applications.
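To make the contrast a little more concrete, here is a toy sketch in Python of the aggregating idea: many unreliable nodes presented to the caller as one logical computer. To be clear, this is not how our fabric (or anyone else's infrastructure) is actually built. The names `Node`, `Fabric`, and `NodeFailure` are hypothetical and made up for this illustration, and failure is simulated with a simple flag.

```python
class NodeFailure(Exception):
    """Raised when a (simulated) commodity node is down."""


class Node:
    """One cheap, unreliable machine (hypothetical stand-in)."""

    def __init__(self, name, healthy=True):
        self.name = name
        self.healthy = healthy

    def run(self, task, *args):
        if not self.healthy:
            raise NodeFailure(self.name)
        return task(*args)


class Fabric:
    """Many nodes presented to the caller as a single computer."""

    def __init__(self, nodes):
        self.nodes = list(nodes)

    def add(self, node):
        # Grow the fabric; callers are unaffected.
        self.nodes.append(node)

    def remove(self, name):
        # Shrink the fabric; callers are unaffected.
        self.nodes = [n for n in self.nodes if n.name != name]

    def run(self, task, *args):
        # Try nodes in turn, discarding any that fail, so the caller
        # only sees an error when no nodes are left at all.
        while self.nodes:
            node = self.nodes[0]
            try:
                return node.run(task, *args)
            except NodeFailure:
                self.nodes.remove(node)
        raise RuntimeError("no healthy nodes left in the fabric")


# The caller talks to "one computer" and never sees node "a" fail.
fabric = Fabric([Node("a", healthy=False), Node("b")])
result = fabric.run(lambda x, y: x + y, 2, 3)
```

The point of the sketch is only the shape of the abstraction: the application calls `fabric.run(...)` and the handling of broken, added, or removed nodes stays behind that call. A real implementation would also have to deal with state, partial failures, and scheduling, none of which is shown here.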

And that’s not all! (Now I feel eerily like Ron Popeil.) Almost the best part is that the application abstraction is far simpler than the alternatives: when you can assume failure recovery and scale are intrinsically handled by the abstraction, the world becomes a lot more fun. This leads to simpler, less costly development that’s done in less time. Even better, in many cases (Spring, for example) existing applications can run on the fabric without new code.

All this from just thinking about the problem a little differently at the beginning.

{ 2 comments }

John Ehrig June 19, 2007 at 3:17 pm

Isn’t the ‘new’ form of virtualization you are describing basically grid computing?

Bob Lozano July 10, 2007 at 8:47 am

There is a sense in which this new form of virtualization resembles traditional grid computing, namely that they both involve lots of computers, but that’s about it. It is better thought of as a new form of grid computing.

In the original forms of grid computing (which are still prevalent in many academic circles, and among the traditional grid vendors) the developers still have to take much of the physical topology into account. There is also very little help with reliability (other than coarse-grained retries, generally only suitable for chunky batch jobs). In addition, these systems tend to be operationally complex.

What I’m trying to describe here is a significant evolution of those ideas. As you suggest, you can certainly think of it as a much newer form of grid computing that happens to have some very deep virtualization effects. On the other hand, in this post I took the other approach and talked about it as a new form of virtualization that looks like a much improved form of grid computing.

Either way you get to the same point – simplicity, reliability, ease of development, low deployment and operational costs. Hope that helps … thanks for your comments and the chance to clarify a bit.

- Bob
