As part of an ongoing series, we are discussing design principles that influence how ready our code is for distributed computing in the cloud, as well as for multi-core utilization. Today, we talk about…
An atomic piece of code is code with a specific, clearly defined purpose. In object-oriented terminology, it has "cohesion." It’s the pepperoni pizza and not the garbage pizza with everything. Likewise, atomic code does not depend on the order in which other code executes. We’ll explore different aspects of atomic code below.
Don’t Lean on Me
First, though atomic code may require other libraries, its execution is self-contained. In other words, there are no call-level interdependencies. For example, if method B can’t be called unless method A has been called first, then A and B are not atomic separately. Where such interdependencies exist across individual method calls, the individual methods are not atomic, though together they may form an atomic operation.
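To make the A-before-B problem concrete, here is a minimal sketch in Python. The names ReportBuilder, load, render, and render_report are hypothetical and exist only for illustration. The class shows two methods that are not atomic separately, and the function shows one way to fold them into a single self-contained call.

```python
class ReportBuilder:
    """Call-order dependent: load() must run before render()."""

    def __init__(self):
        self._rows = None

    def load(self, rows):
        # Method "A": stores state that render() silently relies on.
        self._rows = list(rows)

    def render(self):
        # Method "B": fails unless load() was called first on this instance.
        if self._rows is None:
            raise RuntimeError("load() must be called before render()")
        return ", ".join(str(row) for row in self._rows)


def render_report(rows):
    """Atomic alternative: everything the call needs arrives as arguments."""
    return ", ".join(str(row) for row in rows)
```

Calling render_report([1, 2]) works on its own, while ReportBuilder().render() raises an error because nothing called load() first.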
It’s Simple, My Dear Watson
Second, atomic code is concise by its nature, and, as stated above, has a specific, clearly defined purpose. Fat, hairy, do-multiple-things methods are not atomic. If a method serves multiple purposes, then the general rule is to break those purposes up into separate atomic methods.
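As a rough illustration, the sketch below starts with one method that parses, totals, and formats in a single body, then breaks it into three single-purpose functions. All of the function names and the comma-separated input format are assumptions made up for this example.

```python
def parse_sum_and_format(line):
    """Does three things at once: parsing, arithmetic, and presentation."""
    amounts = [float(part) for part in line.split(",") if part.strip()]
    return f"Total: {sum(amounts):.2f}"


def parse_amounts(line):
    """Only parses: turns a comma-separated string into a list of floats."""
    return [float(part) for part in line.split(",") if part.strip()]


def total(amounts):
    """Only adds: returns the sum of the amounts."""
    return sum(amounts)


def format_total(value):
    """Only formats: renders a number as a display string."""
    return f"Total: {value:.2f}"
```

The three small functions compose back into the original behavior, as in format_total(total(parse_amounts("1.5, 2.5, 3"))), but each one can now be reused, tested, or scheduled on its own.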
The Long and Winding Code
The same holds true for long-running methods, which are not usually atomic. Perhaps the long-running code has a single purpose, but if it can be broken down into more atomic steps, then you gain flexibility when running that code in cloud environments or when utilizing multiple cores.
Flexibility and Control
The choice to break our code down into more atomic steps is all about flexibility and control. Whether our code is executing across multiple cores on multiple threads or processes, or across multiple servers in a distributed computing environment, smaller atomic steps give us more options for how that code can execute in these environments.
Advantages of Atomic Code in Distributed Computing
For example, a general rule of thumb is that your code may execute anywhere in a distributed computing environment at a given time. If method A is called on an instance of object Foo on one server, and method B is then called on a different instance of Foo on another server, then any interdependency between A and B cannot be easily satisfied without additional overhead and effort. An example of such overhead falls under the topic of statefulness versus statelessness, which we’ll discuss in part three of this series. Such overhead, if not handled correctly, can inhibit scalability and availability.
In addition, some distributed computing environments plan for failure; indeed, they expect that failure can occur at any time while code is executing. When a power supply failure takes down a server in the middle of a method that does multiple things, and your distributed environment allows for automatic retries, what are the implications? How far did the method get in its work? Which steps were satisfied, and what was left undone? Atomic steps allow more fine-grained levels of reliability and problem resolution.
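One way to picture this is a runner that retries only the step that failed, rather than the whole job. The sketch below is a simplification under stated assumptions: run_with_retries and make_flaky are hypothetical helpers, a RuntimeError stands in for a server failure, and each step is assumed to be safe to run again.

```python
def make_flaky(failures):
    """Build a step that raises the given number of times, then succeeds."""
    remaining = {"count": failures}

    def step():
        if remaining["count"] > 0:
            remaining["count"] -= 1
            raise RuntimeError("simulated server failure")

    return step


def run_with_retries(steps, max_attempts=3):
    """Run named steps in order, retrying only the step that failed.

    Returns a list of (name, attempts_used) pairs for the completed steps.
    """
    log = []
    for name, step in steps:
        for attempt in range(1, max_attempts + 1):
            try:
                step()
                log.append((name, attempt))
                break
            except RuntimeError:
                # Give up only after the last allowed attempt for this step.
                if attempt == max_attempts:
                    raise
    return log
```

With steps named "reserve", "charge", and "notify", where only "charge" fails once, the log shows "reserve" ran a single time and was not repeated when "charge" was retried.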
Long-running code only makes this situation worse. Perhaps the long-running code has a single purpose, but if it runs for a long time to achieve a single end, and the server running it dies ninety percent of the way through, do you retry it from the beginning? Starting over is fine for some purposes, while in time-critical work, it would be unacceptable. If the long-running method can be broken down into more atomic steps, then your distributed application can perhaps snapshot progress along the way, and retry from failure mid-process more efficiently.
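The snapshot idea might look something like the following sketch. It assumes a plain dictionary as a stand-in for durable checkpoint storage, and process_in_steps and its fail_at parameter are hypothetical names used to simulate a server dying partway through. A real system would write both checkpoint fields in one durable operation.

```python
def process_in_steps(items, checkpoint, fail_at=None):
    """Sum items one small step at a time, recording progress after each.

    checkpoint is a dict standing in for durable storage. It holds the
    running total and the index of the next item still to be processed.
    """
    start = checkpoint.get("next", 0)
    for index in range(start, len(items)):
        if index == fail_at:
            # Simulate the server dying before this step completes.
            raise RuntimeError("simulated server failure")
        checkpoint["total"] = checkpoint.get("total", 0) + items[index]
        checkpoint["next"] = index + 1
    return checkpoint.get("total", 0)
```

If the first run fails at index 3 of [1, 2, 3, 4, 5], the checkpoint holds a total of 6 and a next index of 3. A second call with the same checkpoint resumes from there and returns 15 without redoing the first three steps.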
Essential Value of Atomicity in Parallelism and Multi-core
In terms of parallelism and multi-core, Daniel Spiewak wrote an excellent entry on this very topic just this week on his Code Commit blog (our friend Alex Miller twittered about the article). Daniel makes the argument that
“there is actually a deeper question underlying concurrency: what operations do not depend upon each other in a sequential fashion? As soon as we identify these critical operations, we’re one step closer to being able to effectively optimize a particular algorithm with respect to asynchronous processing.”
Later in the article he states:
“This is truly the defining factor of atomic computations: it may be possible to reorder a series of atomic computations, but such a reordering cannot affect the internals of these computations. Within the “atom”, the order is fixed.
“So what does reordering have to do with concurrency? Everything, as it turns out. In order to implement an asynchronous algorithm, it is necessary to identify the parts of the algorithm which can be executed in parallel. In order for one computation to be executed concurrently with another, neither must rely upon the other being at any particular stage in its evaluation.”
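To illustrate the point about computations that do not rely on each other, here is a small Python sketch. It is not taken from Daniel’s article, which is about Scala; the functions word_count and count_all are hypothetical, and the standard library’s ThreadPoolExecutor is used only as a convenient way to run independent calls side by side.

```python
from concurrent.futures import ThreadPoolExecutor


def word_count(text):
    """An atomic computation: depends only on its own argument."""
    return len(text.split())


def count_all(documents):
    """Run the independent word counts concurrently.

    No call reads or writes anything another call uses, so the pool is
    free to execute them in any order. pool.map still returns the results
    in the order of the inputs.
    """
    with ThreadPoolExecutor() as pool:
        return list(pool.map(word_count, documents))
```

Because each call is independent, count_all(docs) returns the same list as the sequential [word_count(d) for d in docs], whatever order the work actually ran in.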
All of that should sound familiar at this point, and that serves as a good stopping point for now on the topic of atomicity. Here’s the link to Daniel’s whole article, which I highly recommend: http://www.codecommit.com/blog/scala/higher-order-fork-join-operators













