Is your code cloud-ready and multi-core friendly? (pt 3): Statelessness

by guerry on October 6, 2008

in Editorial

As part of an ongoing series, we are discussing design principles that influence how ready our code is for distributed computing in the cloud, as well as for multi-core utilization.

A friend of mine used to say, "All generalizations are false, including this one." Generally, the rule of thumb in distributed computing is that stateful objects are bad, and stateless is good. Today we’ll explore the advantages of being stateless on the cloud and in concurrent programming.

Last time, when we discussed Atomicity, I said this about distributed computing in the cloud:

"For example, a general rule of thumb is that your code may execute anywhere in a distributed computing environment at a given time. If method A is called on an instance of object Foo on one server, and then if method B may be called on a different instance of object Foo on another server, then obviously any interdependency cannot be easily satisfied without additional overhead and effort. An example of such overhead falls under the topic of statefulness versus statelessness, which we’ll discuss in part three of this series. Such overhead, if not handled correctly, can inhibit scalability and availability."

Today we’ll examine scalability and availability in the context of statelessness versus statefulness, and add load balancing, reliability, and concurrency safety to the discussion.

Statelessness

First, some definitions are in order. A stateless object does not hold information or context between calls on that object. In other words, each call on that object stands alone and does not rely on prior calls as part of an ongoing "conversation." A caller could call a method on one instance of a stateless object, then make a call on a different instance of the same object, and not be able to tell the difference. This does not mean that a stateless object never deals with data or context. It means the object does not hold that data across multiple method calls, and does not access it concurrently with other processes, be they distributed processes or local threads or processes.
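
To make the definition concrete, here is a minimal sketch in Python. The class name `PriceCalculator`, its `total` method, and the prices in cents are all made up for illustration. The point is only that the object keeps nothing between calls, so two different instances give the same answer for the same input.

```python
class PriceCalculator:
    """Hypothetical stateless object: no fields, nothing kept between calls."""

    def total(self, prices_cents, tax_percent):
        # Everything the call needs arrives as arguments.
        subtotal = sum(prices_cents)
        return subtotal + subtotal * tax_percent // 100


first = PriceCalculator()
second = PriceCalculator()

prices = [1000, 550]  # amounts in cents, to keep the arithmetic exact
result_a = first.total(prices, 8)
result_b = second.total(prices, 8)  # a different instance, same answer
```

Because neither instance remembers the earlier call, a caller has no way to tell which one did the work.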

Statefulness

Statefulness implies certain conditions. Take for example a stateful shopping cart object. Each instance of a stateful shopping cart object might represent an individual customer. If we had a thousand customers shopping on a site, there would be, in theory, one thousand instances of the stateful shopping cart object. To update the state of a given customer’s shopping cart, we find the correct object for that customer and make calls on it. This is the Neo the One version of statefulness. Only one object can do the job, and Morpheus must find it…somewhere in the Matrix.
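
Here is a rough sketch of that shopping cart in Python. The names `ShoppingCart`, `carts`, and `cart_for` are assumptions of mine, and the in-memory dictionary stands in for whatever lookup a real system would use.

```python
class ShoppingCart:
    """Hypothetical stateful object: one instance per customer."""

    def __init__(self, customer_id):
        self.customer_id = customer_id
        self.items = []  # state that must survive between calls

    def add(self, item):
        self.items.append(item)


carts = {}  # assumed in-memory registry: customer id -> that customer's cart


def cart_for(customer_id):
    # Morpheus's job: find the one instance that holds this conversation.
    if customer_id not in carts:
        carts[customer_id] = ShoppingCart(customer_id)
    return carts[customer_id]


cart_for("alice").add("book")
cart_for("alice").add("pen")
cart_for("bob").add("lamp")

# A fresh instance for the same customer knows nothing about earlier calls.
stray = ShoppingCart("alice")
```

Only the instance returned by `cart_for("alice")` holds her two items. The `stray` cart is empty, which is exactly why a caller cannot use just any instance.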

Statefulness may also imply shared state. Shared state may be as simple as an object that is generally stateless in terms of member data (per instance values), yet holds onto static data (per class values) that changes from method call to method call and affects the outcome of subsequent method calls. The static data injects its importance into the conversation between callers and the object. Shared state gives rise to increased complexity whether shared across a cloud of computing resources, or across local threads or processes.
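
A small Python sketch of that kind of shared state follows. The `DiscountService` class and its every-third-call rule are invented purely to show class-level data changing the outcome of later calls.

```python
class DiscountService:
    """Hypothetical class with no per-instance data, but shared class data."""

    calls_seen = 0  # class-level ("static") value shared by every instance

    def price(self, amount_cents):
        DiscountService.calls_seen += 1
        # Every third call, counted across ALL instances, gets 10 percent off.
        if DiscountService.calls_seen % 3 == 0:
            return amount_cents * 90 // 100
        return amount_cents


a = DiscountService()
b = DiscountService()

# Same input each time, yet the third answer differs because of shared state.
results = [a.price(100), b.price(100), a.price(100)]
```

Neither `a` nor `b` has any member data, yet the result of a call depends on how many calls came before it, on any instance. Note that this sketch is not safe to call from multiple threads as written.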

Therefore, a general definition of statefulness is in order: a stateful object holds onto data, information or context that is important across the conversation of multiple calls on a given instance of that object, and multiple callers may need concurrent access to that state.

Neo the One

Stateless objects mean that the cloud or the multi-core environment is free to use any instance of an object to get the work done. Therefore, any instance of the called object can be Neo and we’re good. From a design standpoint, this affects scalability, availability, load balancing, and reliability. Let’s examine each.

Statelessness and Scalability

In terms of scale in a distributed computing environment like the cloud, stateless objects mean that work can be load balanced onto the machine most suited to do it. With stateless objects, I do not have to worry about contention for a single instance of that object, a situation potentially leading to race conditions, deadlock, and starvation. I can just add more machines and let the work scale out horizontally onto them. If I am dependent on a single, golden instance of a stateful object, and it is getting all the attention from calling clients, then depending on the stateful model I’m using, I may or may not be creating a scalability bottleneck. But no matter what stateful model I’m using, I do have more overhead than I likely would if I could have accomplished the same work without state.

Statelessness and Availability

If my cloud environment is free to use any machine to get the work done, then I have widespread availability. In case of failure, the loss of any machine does not affect me, provided I have enough machines to continue. Again, if I am dependent on a single, golden instance of a stateful object, and the resource holding it goes down, I have lost availability for that instance of that object. I then have to reconstitute it elsewhere, if possible, and hopefully without loss of state. At a minimum, I have introduced complexity into my fabric application.

Statelessness and Load Balancing

In terms of load balancing, if the cloud environment is free to use any machine to get the work done, then whatever machine has the least overhead can be given the work. If there is a single, golden instance of an object on a single machine, and it is getting all the attention from calling clients, then I may have both a load balance problem and a resource contention problem. The model of statefulness I am using determines the severity and complexity of the problem, and whether I can spread the work to multiple machines sharing a cache of the same stateful instance of the object.
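
As a toy illustration, here is a least-loaded dispatcher in Python. The node names, the `loads` table, and the `handle` function are all hypothetical, and a real load balancer tracks far more than a single counter. The sketch only shows that when the work is stateless, the dispatcher is free to pick whichever machine is least busy.

```python
def handle(request):
    # Stateless work: the result depends only on the input.
    return request.upper()


def dispatch(loads, request):
    """Send the request to the least-loaded node (assumed bookkeeping)."""
    node = min(loads, key=loads.get)
    loads[node] += 1  # pretend the request stays in flight
    return node, handle(request)


# Made-up starting loads for three made-up machines.
loads = {"node-a": 3, "node-b": 0, "node-c": 1}

assigned = [dispatch(loads, r)[0] for r in ["get", "put", "list"]]
```

With these starting loads, the first two requests go to node-b and the third goes to node-c. With a single golden instance on node-a, every request would have to go to node-a no matter how busy it was.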

Statelessness and Reliability

The stateless model also bolsters a cousin of availability: reliability. Statelessness, coupled with a self-healing cloud environment, affords us a lot of flexibility. In such an environment, if a machine goes down due to hardware failure, then the work can continue with another instance of the object on another machine. I don’t have to worry about loss of consistency or other complexity as I would if I depended on a now lost, stateful object.
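
A minimal failover sketch in Python shows the idea. The `Node` class, the `NodeDown` exception, and `run_anywhere` are names I made up, and the "failure" is just a flag. Since the work is stateless, retrying it on the next machine loses nothing.

```python
class NodeDown(Exception):
    """Hypothetical error raised when a machine is unavailable."""


class Node:
    def __init__(self, name, healthy=True):
        self.name = name
        self.healthy = healthy

    def run(self, numbers):
        if not self.healthy:
            raise NodeDown(self.name)
        return sum(numbers)  # stateless work: nothing to recover on failure


def run_anywhere(nodes, numbers):
    # Try each node in turn; any healthy one can finish the job.
    for node in nodes:
        try:
            return node.name, node.run(numbers)
        except NodeDown:
            continue
    raise RuntimeError("no healthy nodes left")


nodes = [Node("node-a", healthy=False), Node("node-b")]
served_by, total = run_anywhere(nodes, [1, 2, 3])
```

Here node-a is down, so node-b quietly does the work. Had node-a been holding a stateful object, the retry would first have to rebuild that state somewhere else.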

Statelessness and Concurrency Safety

Stateful code in any form in a distributed or parallel environment requires being mindful of concurrency issues. Stateless code, on the other hand, is more likely (no guarantees of course) to be concurrency safe as you execute it across distributed nodes, multiple threads and cores. Stateless code is also likely to be easier to run across multiple processes instead of multiple threads, a model seen in Google’s Chrome and Microsoft’s IE 8 browsers, where each tab is a process and not a thread.
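
The contrast can be sketched in Python with the standard library thread pool. The `square` function and the `HitCounter` class are hypothetical examples of mine. The stateless function needs no coordination at all, while the stateful counter is only correct because every update goes through a lock.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def square(n):
    # Stateless: safe to call from any thread without coordination.
    return n * n


class HitCounter:
    """Hypothetical stateful object shared by many threads."""

    def __init__(self):
        self.count = 0
        self._lock = threading.Lock()

    def hit(self):
        # Without the lock, concurrent read-modify-write could lose updates.
        with self._lock:
            self.count += 1


counter = HitCounter()

with ThreadPoolExecutor(max_workers=4) as pool:
    squares = list(pool.map(square, range(5)))
    # Drain the iterator so every hit has finished before we look at count.
    list(pool.map(lambda _: counter.hit(), range(1000)))
```

The lock is the overhead that state brings with it. Forget it, and the counter may quietly come up short under load.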

Concurrency safety is a complex topic, and I’m already running long on this post. So, rather than rambling on myself, I’ll point you to a great article titled "Java theory and practice: Are all stateful Web applications broken? HttpSession and friends are trickier than they look," by Brian Goetz. In his thesis, Goetz states "many web applications that use HttpSession for mutable data (such as JavaBeans classes) do so with insufficient coordination, exposing themselves to a host of potential concurrency hazards."

There’s lots of that kind of code already out there, and it just gets more complex in distributed and multi-core parallel applications.

See you next time for part four.
