Self-healing is at the core of much of what makes application fabrics work - whether you deploy on commodity hardware in your own data center or on a cloud.
It just has to be a non-event when stuff breaks - both for the work in progress and for the "structural integrity" of the app fabric itself. Ensuring both is what lets the app fabric provide the simple abstraction of a reliable, uber-scalable computing surface.
At any rate, I was reminded of this a couple of weeks ago when this video first made the rounds... cool stuff.
I particularly like it because it's a great illustration of the value of simple goals & simple organizational rules, both in theory & in practice.

Ok, well the broken stuff for today (so far) includes Digg and 
