Data in the Cloud

by Ryan on March 24, 2010

in Appistry, Webinar

Welcome to the 4th installment of our webinar recap series. In the wake of our recent cloud storage announcement, today’s installment in which we recount the conversation around data, storage and the cloud seems particularly appropriate.

Read on to learn more about our thoughts on architecture, cloud-friendly storage and other points relating to data-specific challenges in the cloud.

Data in the Cloud

Sam: So, Bob, I know this is one of these areas that you care pretty deeply about, please feel free to jump in. Last year, we joked about a blog that was written, that referred back to a presentation you made, and the blogger talked about you calling, you kind of predicting, I guess, the death of Oracle and the traditional relational database. That wasn’t in our predictions webinar per se, so we are not going to hold you accountable for that particular prediction, but what are you thinking here, in terms of data, storage, cloud, how do all these things mesh together?

Bob Lozano: Well, I think Michael made a key point: for many of us in the business, whether app developer, management, executive, whatever, the assumption that you put things into the database, and by that I mean a relational database, has just been a given, and it really has not even been something you could discuss. The cost of that is that we have built in a priori limits to scale, and so on.

A great example, and I have seen this now in many industries, is a whole class of applications that process things that just happen: settling out customer accounts, processing insurance claims, settling out payment transactions, whatever. We give them zillions, and zillions, and zillions of things to do. Historically, we have always thrown it all into a database, and then we wonder why we have trouble scaling those applications. They are expensive, and they are crucial.

I think that we have already seen some examples, as Michael said, if we just take that assumption out, and say, is there a better way to do this, what we are finding is a whole host of better ways to do that, that at the same time are far more cloud friendly. So, I agree that this is the year they become mainstream.

I think we will start thinking about broad classes of applications, not as technologies, but as places where the relational database does not have to be, if there is a cloud-friendly storage layer underneath it. I think that is also inevitable for this year. It is one of the bigger payoffs we will see anywhere in the cloud world.

Michael: Yes, I think today when you hear people talk about cloud computing, people will talk very aggressively about commoditization, and multi-tenancy, and the high utilization rates that you can get with just virtualization, and pay-by-the-hour or usage-based accounting, and those tend to be the issues that get discussed.

I think one of the things that we have believed from the very beginning is that as people get comfortable with cloud computing approaches and cloud computing implementations, they are going to see that the greatest value is in the architectural shifts that are enabled by this approach. I think that this is a great example of where, in 2010, we are going to be talking a lot more about architectures and new approaches than just some of the underlying…

Bob: And with a particular emphasis on the data related issues, right? I mean that is…

Michael: Right.

Sam: So Michael, a technical question came in that I will address to you. One of our audience members is really looking to understand these NoSQL solutions, and is wondering if they are based on a new BASE concept of transactional consistency, as opposed to traditional ACID. You touched on this, but maybe you can elaborate on it.

Michael:  Yes, I can drill in just a little bit. Not to take too long and to get too “geeky” with everyone.

[laughter]

Michael: We could do another podcast just on this, if we wanted to. There is this interesting theory running around about computing in cloud architectures, referred to as the CAP theorem, which talks about consistency, availability, and partition tolerance (the ability to survive having multiple data centers running as a single application). Of those three properties, you can take two, but you are going to lose the third every time.

This relates directly to the ACID properties of databases, because what you usually see in cloud architectures is a relaxation of the consistency requirement, the C in ACID, obviously. We keep all the other ACID properties in tow, but we relax consistency, and you hear the term “eventual consistency” a lot.

Changes may happen on one side of a data center and we try to propagate them to the other side as fast as we can, but maybe not immediately, and that is really the mind‑shift that you have to make with cloud computing. When we are talking about databases in cloud computing, you also get a lot of relaxation of normalization.
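To make "propagate as fast as we can, but maybe not immediately" concrete, here is a minimal sketch of eventual consistency between two replicas. It was not part of the webinar. The `Replica` class, the key name, and the caller-supplied version numbers are all assumptions made for illustration; a real store would more likely use timestamps or vector clocks and would ship changes over a network.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    """One copy of the data, e.g. in one data center (hypothetical model)."""
    name: str
    data: dict = field(default_factory=dict)     # key -> (version, value)
    outbox: list = field(default_factory=list)   # changes not yet propagated

    def write(self, key, value, version):
        # Accept the write locally and queue it for later propagation.
        self.data[key] = (version, value)
        self.outbox.append((key, version, value))

    def read(self, key):
        entry = self.data.get(key)
        return entry[1] if entry else None

    def sync_to(self, other):
        # Propagate queued changes; the higher version wins on conflict.
        for key, version, value in self.outbox:
            current = other.data.get(key)
            if current is None or version > current[0]:
                other.data[key] = (version, value)
        self.outbox.clear()


east = Replica("east")
west = Replica("west")

east.write("stock:book-123", 1, version=1)
stale = west.read("stock:book-123")   # west has not seen the write yet
east.sync_to(west)
fresh = west.read("stock:book-123")   # the replicas have now converged

print(stale, fresh)
```

Between the write and the sync, a reader on the "west" side sees older data. That window is the consistency being traded away in exchange for each side staying available on its own.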

When we are talking about big, expensive databases, normalization is key, because we want to hold as little as possible to maximize the space on the machine. When we talk about cloud architectures, duplicating data in order to facilitate scale and speed is often the correct thing to do, especially when you have cheap resources available to you.
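As a rough illustration of that trade-off (not from the webinar), the sketch below stores the same order in a normalized and a denormalized shape. The record layouts and function names are hypothetical, and plain dictionaries stand in for whatever storage layer is actually in use.

```python
# Normalized: each fact is stored once; reading an order needs two lookups.
customers = {42: {"name": "Ada", "city": "St. Louis"}}
orders_normalized = {1001: {"customer_id": 42, "item": "book"}}


def order_summary_normalized(order_id):
    order = orders_normalized[order_id]
    customer = customers[order["customer_id"]]   # second lookup (the "join")
    return f'{customer["name"]} ordered a {order["item"]}'


# Denormalized: the customer name is copied into the order record, which
# costs extra space and means a rename must be applied in more than one place.
orders_denormalized = {
    1001: {"customer_id": 42, "customer_name": "Ada", "item": "book"},
}


def order_summary_denormalized(order_id):
    order = orders_denormalized[order_id]        # single lookup, no join
    return f'{order["customer_name"]} ordered a {order["item"]}'


print(order_summary_normalized(1001))
print(order_summary_denormalized(1001))
```

The denormalized read touches one record, so it can be served by whichever machine holds that record without coordinating with the machine that holds the customer table.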

So getting back to that principle, I heard a great talk, I believe it was one Pat Helland did at TechEd a couple of years ago, and he was talking about some of the thinking at Amazon when they were talking about relaxing the consistency of databases. He had this great example: imagine that there was a single book on the shelf for sale at Amazon, and that you had your database behind it, and that you wanted to have that perfect consistency, that the one book is really that one book, and we spent all this time and effort for that one book.

Suppose a customer came along and bought that book, and while the individual who actually picked the book off the shelf was dropping it into a package to send to you, that book fell on the floor and got run over by a forklift that was running through the warehouse. Then what good was all that perfection, and effort, and time that you put into consistency? They are still going to have to send out a letter to the customer saying, “Oops, something went wrong, we promise to fix it as fast as we can.”

Really, is there value in being that perfect for every application? Now of course, sometimes the answer is yes, but it is not always yes, and when we can relax it and take advantage of these cloud systems, things become simpler, more reliable, and more scalable. So I hope I touched on all the points you were looking for. I am happy to… Again, we could spend an hour on this subject alone.

Sam: It is an interesting possible segue to an interlude here on the topic of books. One of the folks out on Twitter, KillingComputer, asks whether there are any recommendations for a good soup-to-nuts book, presumably on cloud computing.

I am not sure how I did not hit this in the introduction, but one of the things that none of us in our wildest dreams could have predicted in 2009 was that our very own Bob Lozano would co‑author a book, The Executive’s Guide to Cloud Computing, which is going to be published by Wiley in April of next year. Bob, do you want to take a minute to kind of touch on your perspective on the book, and maybe pitch it to our friend, KillingComputer, here?

Bob: Sure, that would be great. As Sam mentioned, we did finish the manuscript for Executive’s Guide to Cloud Computing, and we really had a broad range of audiences in mind. We took kind of a book-within-a-book approach. The beginning and the end of the book really make the case for why cloud computing has to happen, and how fundamental it is. It is not just marketing hype and so on; there are underlying structural reasons, and we elaborate on those.

We spend a chunk of time talking about this data transition, and what it means. It comes at the high level, and then in the interior of the book is a section a little more focused at the IT practitioner, at the CIO, and that was primarily driven by my coauthor, Eric Marks, who has a long history in doing modeling, and thinking about adoption models, and so on. So you can kind of pick and choose, or take all of the above.

We are pretty satisfied that this will give a good, basic overview introduction, and a good set of answers to the questions, why should I do this now, should I care, is this just hype or is this real, and how fundamental is this? I think we make that case. It will be out, physical and electronic, I think, in the first or second week of April. So, there will also be a website that goes with it, Exec’s Guide to Cloud, we have a splash page up there now, but we will host a bunch of resources there as well.

So I am pretty excited about it. I have gotten a lot of real nice feedback. We have, I think, a conference that is going to hand it to every customer. We have another person taking a look at it for a course, and so on, and so forth. So, it has been a lot of fun. It has been a real great chance to sit back and think about how the industry is going, and why it is happening. I came out of that process far more convinced than I went into it that this transition is one of the two or three big ones that have occurred in the history of computing.

Sam: So, let’s keep going with the audience questions, here. We have got another one on the topic of data and some of these new approaches. It seems to be a very lively topic here, both in the Q&A Panel and Twitter. To Michael and Bob, are there any tools or processes that you can recommend to help application owners start to think about whether their apps are appropriate and amenable for these new types of approaches? What should they be looking for?

Michael: You know, it is funny, I almost look at that question from the opposite direction. In nearly every case I see the ability to move it to the cloud; only in those extreme consistency cases do I find a need to say that a NoSQL-style solution is wrong. So, I don’t look for problems that are good for the cloud, I look for specific features in applications where I know I have to be more careful, because of that consistency requirement.

Again, for any application that has reliability needs or strong storage needs, these NoSQL cloud architectures often make it easier to put the application together. I can’t tell you how many developers I have talked to who come eager to figure out what they need to do to be in the cloud.

They come to understand that it is really not something more complicated, and in a way it is actually simpler, but you need to have that awareness of, can my application run on multiple machines? Do I have an expectation of a single data element that I am running up as a counter, or as a central flag in the system? Is there a way to relax that flag so that every machine can have its own counter, its own flag, its own way of dealing with the data, without having to constantly interact with other machines on the network?
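The "every machine can have its own counter" idea can be sketched as a grow-only counter in which each node increments only its own slot and totals are reconciled by merging. This is an illustration added here, not something presented in the webinar; the `NodeCounter` class and the node names are hypothetical.

```python
class NodeCounter:
    """Grow-only counter where each node increments only its own slot."""

    def __init__(self, node_id):
        self.node_id = node_id
        self.counts = {node_id: 0}   # node_id -> increments seen from that node

    def increment(self, amount=1):
        # Purely local: no network round trip, no shared lock.
        self.counts[self.node_id] += amount

    def merge(self, other):
        # Take the per-node maximum, so merging is safe to repeat
        # and can happen in any order.
        for node_id, count in other.counts.items():
            self.counts[node_id] = max(self.counts.get(node_id, 0), count)

    def value(self):
        return sum(self.counts.values())


a, b = NodeCounter("node-a"), NodeCounter("node-b")
for _ in range(3):
    a.increment()
for _ in range(2):
    b.increment()

a.merge(b)
b.merge(a)
a.merge(b)   # a repeated merge does not double count
print(a.value(), b.value())
```

Each node can keep counting while it is cut off from the others, and the totals agree once the nodes have exchanged state. The price is that a node's view of the total may lag until the next merge.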

As I run through these, I hope it doesn’t sound like too much of a challenge. It actually can often be approached directly and simply, once you get familiar with that idea. Think about your application running on a hundred nodes. What is different, what no longer acts the same way, and how can you get those systems to work in isolation, and yet together, toward a common goal? Once you do that, and…

Bob:  Or a thousand, or ten thousand, or a hundred thousand.

Michael: Or a thousand or ten thousand, as Bob likes to kid me all the time. Once you relax that goal, you’re there, and strangely enough, it is often an easier solution.

Register for a free download of the full webinar here.

After registering you will be able to watch the full event online, view the slides, download the audio, or even grab an iPhone-compatible version for your next flight.

Stay tuned for Kevin’s 5th and final prediction, about the “put up or shut up” year for cloud and startups, where we will dive into our final predictions of the year and how Appistry sees the future of private vs. public clouds.

You won’t want to miss it, so check back with us later this week to learn more.
