reliability

Self-Healing Robots & Cloud Apps

Self-healing is at the core of much of what makes application fabrics work - whether you deploy on commodity hardware on your own premises or on a cloud.

It just has to be a non-event when stuff breaks - both for the work in progress, and for the "structural integrity" of the app fabric itself. Ensuring both enables the app fabric to provide the simple abstraction of a reliable, uber-scalable computing surface.

At any rate, I was reminded of this a couple of weeks ago when this video first made the rounds... cool stuff.

I particularly like it because it's a great illustration of the value of simple goals & simple organizational rules, both in theory & in practice.

How to Make Twitter Scalable

In the past week+ the whole business about Twitter scalability & reliability came to a head.

Yet, despite infrastructure that is visibly "hitting the wall", it now appears that the company is attracting interest in a funding round at a decent valuation (it may even have signed one, but more on that later).

How is this possible?

I think the answer to this is that

Twitter Goes Splat ...

Yesterday we talked about whether Twitter really ever needs to be reliable or not ... some said yes, others contend that it's not necessary.

It's been bugging me for a while that something this popular ... and Twitter is so ... just keels over as often as it does.

Does Twitter Need to Become Reliable?

A few outages ago I wondered aloud whether Twitter was taking the whole business of failure somewhat casually (triggered by some comments Blaine Cook made at SXSW).

Blaine replied with some great points, including

For the record, saying that the press surrounding the downtimes was a plus was a joke. Downtime is never good, and you should do everything you can to avoid it. However, it's a misrepresentation to say that you can build something successful without any downtime.

...

How the “Mad Kitty” Came To Be …

Having a bit of "down time" with fam and friends has been great ... hope yours has been at least as good.

Seems like about a zillion years since I was buying student tickets for games at Mizzou (the University of Missouri), but like many folks I've continued to follow their teams over the years. The only real problem with that ... for most of the past 30 years they've been pretty bad, and actually that would be understating things ... by a lot!

But all that's changing now, and for the first time in my lifetime I got to see Mizzou play a football game in January. With polite apologies to any of you Arkansas fans out there, today's 38-7 win over Arkansas was awesome.

I mention all of this to give you a bit of background on the rest of this post.

Promises, Promises … (broken) Promises

Ok, well the broken stuff for today (so far) includes Digg and Yahoo Small Business.

Gmail Down (again) … *sigh*

As I have commented before, I am a big fan of Google's basic approach to scalable computing. There is much to like - Captain Enormous scale on commodity gear, rapid deployment of applications, and so on.

Yet it is by no means perfect.

In particular, there is a chronic level of failure in (at least) some of the flagship services that should not be acceptable in any modern-day offering, least of all something which is a standard part of many people's workflows.

Scalability Google-Style

Just ran across a very good post by Robin Harris from the misty dawn of time (last summer) stemming from the Google Scalability conference. Why should we care how Google scales? As Robin points out,

They roll out new applications for millions of users with surprising speed, especially compared to corporate IT. They build data centers with hundreds of thousands of servers - and millions of disk drives - and run it all on free software.

Costly corporate kit, like RAID arrays and 15k FC drives, aren’t used. Yet they do more work in an hour than most companies do in a year.

‘Git Me Some of That Simplicity!

One of the often-repeated baseball truisms is that "you can never have too much pitching". Even if you don't know anything about baseball, you can tell that this is true by just searching on that phrase and seeing what comes up. Go ahead: I've made it easy!

Google Outages Today?

I wonder if there's a "rolling brownout" in Google applications today?

Earlier in the morning Google Reader (generally a really decent app to have around) was hanging, going into eternal "loading" screens (see below).

[Screenshot: Google Reader failure, 10-16-07]

Since all of my blog / news feeds go through Google Reader (for now), I decided to switch gears and go research something. Except that Google search was down as well.