JavaOne ‘07 - Day 3

Today started with a keynote talk by the Oracle guys. Cameron Purdy, previously the CEO of Tangosol, which is now owned by Oracle, stood up and showed a neat data grid demo. Interestingly, his main points about reliability and scalability all sounded exactly like what I have been telling people at our booth. The only difference is he’s talking about data reliability and I’m talking about processing reliability. I’m still unsure about the Tangosol/GigaSpaces/TerraCotta data grid concept. It sounds so much like EJB CMP to me. The problem with CMP wasn’t that it was inside J2EE or on a cluster. It was that the bigger you scaled, the more problems you had with either stale data or performance. If you try to fix the stale data problem by locking data instances on the cache, you now have scalability problems. I don’t see how the data grid solutions address this. I think what’s really happening in that space is that specific domains are able to use optimistic locking, and therefore the ability to virtualize commodity hardware as a large memory cache works for them. Maybe I’ll go lurk around their booths tomorrow and see if I can get one of them to explain to me how they address the stale data issue differently than CMP.
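
To make the optimistic locking idea concrete, here is a rough sketch of what I have in mind, assuming a single in-memory map standing in for the cache. The names (VersionedCache, Versioned, writeIfUnchanged) are made up for illustration and are not taken from Tangosol, GigaSpaces, TerraCotta, or any other product.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical sketch of optimistic locking: nobody holds a lock while
// working on a record; a write only succeeds if the record is still the
// exact one the writer originally read.
final class VersionedCache {

    /** An immutable value tagged with the version it was written at. */
    static final class Versioned {
        final long version;
        final String value;

        Versioned(long version, String value) {
            this.version = version;
            this.value = value;
        }
    }

    private final ConcurrentMap<String, Versioned> store = new ConcurrentHashMap<>();

    /** Returns the current entry, or null if the key has never been written. */
    Versioned read(String key) {
        return store.get(key);
    }

    /**
     * Writes newValue only if the entry is still the one the caller saw.
     * Pass null for 'seen' when the caller observed no entry at all.
     * Returns false when another writer got in first, in which case the
     * caller should re-read and retry (or give up).
     */
    boolean writeIfUnchanged(String key, Versioned seen, String newValue) {
        if (seen == null) {
            return store.putIfAbsent(key, new Versioned(1, newValue)) == null;
        }
        // Versioned does not override equals(), so this atomic replace
        // succeeds only if the stored entry is the same object 'seen'.
        return store.replace(key, seen, new Versioned(seen.version + 1, newValue));
    }
}
```

The catch is the retry: this only pays off in domains where two writers rarely collide on the same record, which is why I suspect it works for some applications and not others.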

I had several good conversations with show attendees today. We didn’t give out as many slinkys as the first day, but they were still popular.  I think we have about 1 or 2 boxes left, so they should be gone tomorrow.  I do end up describing our product to people a lot.  Our concept isn’t an easy one to label.  We call ourselves an Application Fabric, but sometimes we say "virtualized grid", "next generation application server", or "execution platform".  Our VP of sales and I were having lunch and talking about different names we could use that would resonate with developers and architects.  I proposed Application Grid, so maybe we’ll try that out.  I had a booth visitor come up and ask straight out if we were a compute grid or a data grid.  He obviously understood the grid space, but I think compute grid wouldn’t explain what we do to most people.

Tomorrow I’m going to try to find some leftover t-shirts.  I have been so busy at the booth that all I have so far is a couple of pens from the Interface 21 booth.  They hinted at some upcoming ultra cool schwag.  I think we may have to turn this into a competition.  I’ll keep the updates on the blog.  Maybe we can get a bet going and an online poll or something.  I noticed we got a mention on Java the Hutt’s blog today which was cool.

I’m going to go get some food.

later,

-j

2 Responses to “JavaOne ‘07 - Day 3”

  1. Cameron Purdy Says:

    > I’m still unsure about the Tangosol/GigaSpaces/TerraCotta data grid
    > concept.

    Regardless of marketing, GigaSpaces and TerraCotta are both client/server technologies, not data grids. (In TerraCotta’s case, you are limited to one server.)

    > The problem with CMP wasn’t that it was inside J2EE or on a cluster. It
    > was that the bigger you scaled, the more problems you had with either
    > stale data or performance. If you try to fix the stale data problem by
    > locking data instances on the cache, you now have scalability problems.
    > I don’t see how the data grid solutions address this.

    Tangosol Coherence provides in-memory consistency with configurable levels of isolation, even during and after server failure conditions. It’s technologically difficult, which is probably why it is still unique. (That also explains why it is selling so well!)

    > I think what’s really happening in that space is specific domains are able
    > to use optimistic locking, and therefore the ability to virtualize
    > commodity hardware as a large memory cache works for them.

    No, the data grid simply becomes the temporary (in-memory) system of record. If necessary, the results of the transactions are played back to a database (usually asynchronously). The data grid itself can pump hundreds of thousands of transactions per second.

    > Maybe I’ll go lurk around their booths tomorrow and see if I can get one
    > of them to explain to me how they address the stale data issue
    > differently than CMP.

    By managing the data instead of just caching it. :-)

    Peace.

  2. Jasen Says:

    Hey, Cameron, thanks for the comment! I loved your demo at JavaOne by the way.

    I’m still wondering about a scenario with 50 machines that have data spread out in the memory of each physical machine. If one person attempts to update a record and 1 second later someone attempts to read the same record, how does any “datagrid” prevent an incorrect value from being returned in the read?

    The response time is inflated by network latency because of the choice to use multiple machines, right? So either you lock the record before the write and delay the read, or you could potentially get the old copy before the update propagates to all the machines? My guess is that the kind of application that works best is one that can tolerate the added latency and basically isn’t accessing the same records repeatedly at a high enough frequency for it to matter. Then the stale data concern wouldn’t occur. Does this sound right?

    I’m also curious how you find the right instance of data across a grid of machines to make the reads fast enough. I’m guessing you have some type of index manager that must handle all the requests that enter the grid?
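
    One alternative to an index manager that I can imagine, purely as a guess on my part and not something confirmed anywhere in this thread, is to hash each key to an owning machine so that any caller can compute where a record lives without asking a central service. A rough sketch, with made-up names (PartitionLocator, partitionFor, ownerOf) and a fixed list of nodes assumed:

    ```java
    import java.util.List;

    // Hypothetical sketch: keys map to a fixed number of partitions, and
    // partitions map to nodes. Every caller can run the same arithmetic,
    // so no central lookup is needed to find a record's owner.
    final class PartitionLocator {
        private final List<String> nodes;
        private final int partitionCount;

        PartitionLocator(List<String> nodes, int partitionCount) {
            if (nodes.isEmpty() || partitionCount <= 0) {
                throw new IllegalArgumentException("need at least one node and one partition");
            }
            this.nodes = nodes;
            this.partitionCount = partitionCount;
        }

        /** Maps a key to a partition in the range [0, partitionCount). */
        int partitionFor(String key) {
            // floorMod keeps the result non-negative for negative hash codes.
            return Math.floorMod(key.hashCode(), partitionCount);
        }

        /** Maps a key to the node assumed to own its partition. */
        String ownerOf(String key) {
            return nodes.get(partitionFor(key) % nodes.size());
        }
    }
    ```

    If every read and write for a record goes to its one owner, the owner always has the latest value, but then I’d expect rebalancing when machines join or leave to be the hard part, which this sketch ignores.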

    Anyway I didn’t get a chance to come talk to anyone at the Oracle/Tangosol booth at JavaOne. I did see they were giving out cool t-shirts but I didn’t get one. Maybe I’ll see you at the next show and we can have a talk over some refreshing beverages about scaling data access.

    -jasen
