Will 100% Utilization Kill Our Clouds?


Over at the cloud computing group on Google Groups there is an interesting discussion about optimal load utilization. Along the way Tim Freeman raised this question:

Are there hidden costs at running this high in the first place? We've heard the opinion from someone who is in charge of buying 100s-1000s of computers a year that commodity hardware isn't made to run at this capacity. That you're not getting as much value for your money over time because of far higher failure rates (i.e., that failures don't increase linearly with utilization and that there is usually a sweet spot)

So that got me to thinking ...

Heat Really Does Kill
Obviously many factors affect the failure rates of computing equipment (spinning disks, processors, etc.), but assuming your power is not horribly dirty, the #1 enemy will be heat.

Heat. Heat. Heat.

So, with that in mind, one important way gear becomes server-grade (i.e., expensive, non-commodity) is by being better at cooling than commodity gear. Interestingly, server-grade gear also tends to squeeze that last hard-to-obtain chunk of performance out of its components and to provide varying amounts of built-in redundancy. Both exacerbate the heat problem considerably, which demands even better heat dissipation, which requires more power, etc.

So in that sense Tim's contact has a point: when running at full utilization most processors throw off lots of extra heat, necessitating (at the very least) extra gear to handle it.

And there's always the chance that the heat will be poorly dissipated, thereby resulting in increased failures ... yet that does not mean that buying server-grade gear is the right way to go anymore. Far from it.

A Better Choice
A couple of choices come immediately to mind -

  • use lower-power components (as in laptop grade stuff). These will naturally generate less heat, and thereby tend to reduce their self-inflicted failure tendencies.
  • run much leaner power supplies than most folks want to supply off the shelf

There are other ideas - some interesting, some dumb - but those are a few for starters.

Is the Commodity Gear Today What We Need?
Interestingly enough, most of the stuff that folks have bought to build out grids has been server-grade in drag, more or less. Just look at the components and the power supplies - high energy consumption processors, big power supplies, beaucoup fans etc. Not always, of course, but that has generally been the norm.

In fact, it's this "server in disguise" gear that passes for commodity in most enterprise data centers today ... fine so far as it goes. As Cameron pointed out in the thread you can run the current commodity gear at 100% utilization with no particular increase in failure rates. True enough, but what if we think more aggressively?

In fact, let me go so far as to suggest that if we really are able to run at 100% for months without a failure, then we've massively overbuilt the "commodity" gear.
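To make the "overbuilt" argument a bit more concrete, here is a rough back-of-the-envelope sketch in Python. Everything in it is an assumption for illustration: the function name, the simple cost model (amortized purchase price, plus expected replacements, plus electricity), and every number in the usage lines. None of it comes from the thread or from measured data.

```python
HOURS_PER_YEAR = 24 * 365


def annual_cost_per_node(price, annual_failure_rate, power_watts,
                         price_per_kwh, service_years=3.0):
    """Rough yearly cost of keeping one node slot filled.

    Hypothetical model: a failed node is simply replaced at full price,
    and the node draws power_watts around the clock.
    """
    amortized = price / service_years            # purchase spread over its life
    replacements = annual_failure_rate * price   # expected replacement spend
    energy = power_watts / 1000.0 * HOURS_PER_YEAR * price_per_kwh
    return amortized + replacements + energy


# Made-up inputs, purely to show the shape of the comparison:
# an overbuilt "server in disguise" that almost never fails ...
overbuilt = annual_cost_per_node(price=3000, annual_failure_rate=0.02,
                                 power_watts=400, price_per_kwh=0.10)
# ... versus a cheap low-power box that fails far more often.
lean = annual_cost_per_node(price=350, annual_failure_rate=0.20,
                            power_watts=60, price_per_kwh=0.10)
print(f"overbuilt: {overbuilt:.0f} per year, lean: {lean:.0f} per year")
```

The model ignores how much work each node actually gets done, so a fair comparison would divide by throughput. The point is only that a higher failure rate is not automatically the more expensive choice once the purchase price and power draw drop far enough.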

Back to what will be possible in changing our infrastructure as we make our transition to clouds - public or private.

This is the Key - Absolutomente Crucial!
Underlying all of the power / failure related infrastructure choices is an unspoken reality - the real key to using commodity at scale is to ensure that the application will survive the failure of individual computers / drives / switches / whatever without losing a darn thing.

Once you do that, at the application level, then you are free to experiment with different infrastructure choices to your heart's content, different utilization rates, whatever comes to mind - provided that your apps don't care.
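What "the application survives individual failures" can look like is sketched below in Python. This is a toy, in-memory illustration under stated assumptions, not a real storage system: the `Node`, `NodeFailure`, and `ReplicatedStore` names and the `min_copies` parameter are all hypothetical. The idea is simply that every write goes to several nodes and a read falls through to the next node when one is down.

```python
class NodeFailure(Exception):
    """Raised when a (simulated) node is down."""


class Node:
    """A stand-in for one cheap machine; hypothetical, in-memory only."""

    def __init__(self, name):
        self.name = name
        self.healthy = True
        self.store = {}

    def put(self, key, value):
        if not self.healthy:
            raise NodeFailure(self.name)
        self.store[key] = value

    def get(self, key):
        if not self.healthy:
            raise NodeFailure(self.name)
        return self.store[key]


class ReplicatedStore:
    """Writes to every reachable node; reads from the first one that answers."""

    def __init__(self, nodes, min_copies=2):
        self.nodes = nodes
        self.min_copies = min_copies

    def put(self, key, value):
        written = 0
        for node in self.nodes:
            try:
                node.put(key, value)
                written += 1
            except NodeFailure:
                continue  # a dead node is expected, not an emergency
        if written < self.min_copies:
            raise RuntimeError("too few healthy nodes to store data safely")
        return written

    def get(self, key):
        for node in self.nodes:
            try:
                return node.get(key)
            except (NodeFailure, KeyError):
                continue  # try the next replica
        raise KeyError(key)


nodes = [Node("a"), Node("b"), Node("c")]
store = ReplicatedStore(nodes)
store.put("order-42", "paid")
nodes[0].healthy = False          # one box dies ...
print(store.get("order-42"))      # ... and the app still gets its data back
```

A real system would also need to re-replicate data onto fresh nodes and deal with stale copies, which this sketch leaves out. Once the application tolerates failures in this way, the failure rate of any single box becomes a cost question rather than an availability question.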

In other words, many of the benefits that may result from cloud computing - flexibility, scalability, lower costs, reliability, and so on - are actually enabled at the application layer.

One more thing - when failure of individual computers doesn't matter to the application then you can pick lower power stuff that is also very cheap - now you're starting to talk about a great cloud infrastructure.

The New Black Commodity
So as you carry this thinking further then you can start to imagine a much more aggressive type of commodity, one as yet unrealized.

Start thinking of bare-bones, fairly dense components that are uber-cheap ... sort of a lego-block approach. Cheap as in $300-$400 cheap all-up. Perfectly suited for enterprise-grade clouds - public or private - at least those that play by these new rules.

There was (and will continue to be) quite a bit more conversation on this point - it's one of the more interesting parts of commoditization. In any case, in a future post I'll outline some more thoughts on the "new commodity" that I believe is fundamentally possible.
