June 13, 2007
The World's Best Grid for POxOs
posted by ari
Terracotta just announced the world's best, most scalable, easiest to use grid on the market. Everything you need, trust me, it does it. We got:
1. Event grid
2. Data grid
3. Compute grid
4. Hyper grid
5. über grid
6. Even, grid grid
We even got grid virtualization with virtualized grid scaling and our stuff can be used with commerce grid, concurrent grid, J2EE grid w/ Spring support, messaging grid, and all the usual use case shtuff.
In our grid you can store and work with POJO, PONO, POCO, JOPO, NOPO, COPO, any plain old object. We call it POxO (pronounced: poh - shows)! That's right! We are so down with grid we can handle ANY object and handle it at hyper linear scale.
What's hyper linear scale, you may ask? Simple...every machine you add makes all the other machines in the grid faster. Yes, we outrun the speed of light. We just roll like that. How can that be? We move the compute to the data. The partition to the network. We move the bits to the bytes and the gnats to the knights.
If you think you got grid, you don't. You think your grid scales but it won't. The virtual grid's what's what, so give it a shot.
So much grid you gotta see it today. From Miami to the Bay, we got grid _your_ way!
--------------
When you think about it, vendors have to be joking with all this grid discussion they throw out there. Java in the enterprise is either about JSE / JEE (Spring, Hibernate or JPA, Rife and Wicket, Tomcat or Jetty or JBoss), or it is about workload distribution w/o a database (master / worker or divide & conquer or grid, like Google). I sometimes simplify things down to "scaled-out apps," meaning several application servers running copies of an application (EAR, WAR, classes, what have you) underneath a load balancer, all trying to appear as one large scaled-out server. The other option is "divide and conquer," where a business problem is spread across machines such that machines do not need to share data to complete a task or a portion of a task, nor do they need access to each other's data for availability. So we have 2 types of architecture:
1. Scaled-out apps
2. Divide and Conquer
Where can grids help? The answer is "Divide and Conquer." And there, the real challenge is not in moving bits and bytes on the network, but in defining the unit of work so that bits don't have to move across processing contexts. I have seen too many use cases on Wall Street and in the gaming & betting universes to try to sell you on the notion that there is a 1-size-fits-all grid API for dividing workload. Anyone who sells grid fundamentally gets the same questions from their customers:
1. How well does it scale?
2. Why can't I just use messaging for this?
The short answer is that these grid-things scale only if you can make sure no nodes block on data transformations / workload from other nodes, meaning the workload must be embarrassingly parallel (like searching / querying as opposed to order processing). And, yes, messaging will work. If you partition the problem into the grid, then you can pass workflow control around the grid using messaging...that is, as long as your messaging is orthogonal to your data (i.e., only control flow messages and not data flow). None of this is usually easy. Why? Because most business problems are not data analysis / read-only problems but are, in fact, serialized or workflow-like in nature, which means one thing...
Stateful work passes around the grid alongside control flow data when you are trying to process a serialized workflow in a divide & conquer model. Example: a business function needs to transform input from a DB of type "A" into output of type "C" by going from A -> B -> C. So you create workers in your grid. Some take db records and turn those into "A". Some convert "A" to "B" and put "B"s back in the grid. Others take "B" and convert to "C". Grid gives you a very clean API for working with the db, and for abstracting the message-passing between A->B transformers and B->C transformers. And by having designed your system such that each transformation is done autonomously and unaware of any other steps in the workflow, you can now scale different pieces of workflow separately. Say A->B transformations take twice as long as B->C. You can now deploy twice as many servers / processes doing A->B as doing B->C. But grid is not going to scale any better than a fast non-persistent message queue. It's a fact that the 2 solutions are being used in the same way and, thus, ignoring implementation details, will scale in the same way.
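To make the A -> B -> C example concrete, here is a minimal single-JVM sketch of the same shape, using plain java.util.concurrent queues as a stand-in for the grid or the message queue. Everything in it is an assumption made for illustration: the class name StagedPipeline, the use of Strings for "A", "B" and "C", and the STOP sentinel. It is not any vendor's grid API, just a picture of why each stage can be given its own worker count.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.function.Function;

// Hypothetical illustration: two independent stages joined only by queues.
final class StagedPipeline {
    // Sentinel telling a worker to exit; assumed never to appear as real data.
    private static final String STOP = "__STOP__";

    // Starts `count` threads that each take from `in`, apply `step`, and put the result on `out`.
    static List<Thread> startWorkers(int count, BlockingQueue<String> in,
                                     BlockingQueue<String> out, Function<String, String> step) {
        List<Thread> workers = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            Thread t = new Thread(() -> {
                try {
                    for (String item = in.take(); !STOP.equals(item); item = in.take()) {
                        out.put(step.apply(item));
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            t.start();
            workers.add(t);
        }
        return workers;
    }

    // Queues one STOP per worker behind the remaining items, then waits for the workers to drain them.
    static void stopAndJoin(List<Thread> workers, BlockingQueue<String> queue)
            throws InterruptedException {
        for (int i = 0; i < workers.size(); i++) {
            queue.put(STOP);
        }
        for (Thread t : workers) {
            t.join();
        }
    }

    // Runs every input through A->B and then B->C; worker counts (each >= 1) are chosen per stage.
    static List<String> run(List<String> inputs, int abWorkers, int bcWorkers)
            throws InterruptedException {
        BlockingQueue<String> aQueue = new LinkedBlockingQueue<>(inputs);
        BlockingQueue<String> bQueue = new LinkedBlockingQueue<>();
        BlockingQueue<String> cQueue = new LinkedBlockingQueue<>();

        List<Thread> ab = startWorkers(abWorkers, aQueue, bQueue, a -> "B(" + a + ")");
        List<Thread> bc = startWorkers(bcWorkers, bQueue, cQueue, b -> "C(" + b + ")");

        stopAndJoin(ab, aQueue); // all "B"s are queued once this returns
        stopAndJoin(bc, bQueue);

        List<String> results = new ArrayList<>(cQueue);
        Collections.sort(results); // completion order is nondeterministic, so sort for a stable result
        return results;
    }
}
```

Calling StagedPipeline.run(Arrays.asList("r1", "r2"), 4, 2) gives the slower A->B stage twice the workers of B->C and returns [C(B(r1)), C(B(r2))]. Note that neither stage knows the other exists; the queues carry both the data and the control flow, which is exactly why this scales like a message queue and no better.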
So, if grid is not as much for scaling as it is for ease of programming, is there an easier way to program? Yes there is. POJO / SEDA / CommonJ / Master-Worker. I and others at Terracotta keep talking about it, but the benefits are several:
1. performance: since objects are not serialized and since objects seamlessly fault in and out of a processing context, the app will benefit in ways a developer may not have contemplated. I saw a use case last week where the start-up time of a grid worker was greatly improved simply by the fact that it did not need to rehydrate its state on restart and could lazily fault in the objects it needed on demand. The notion of network attached memory that is implicit inside the Terracotta model proved powerful enough to speed up the app without a rewrite in this case.
2. simplicity: it is all POJO w/o serialization, w/o put-backs, and if you use Master / Worker, you can even abstract synchronization altogether.
3. flexibility: I have already seen users of our master / worker framework who require intelligent routing that (like a layer 7 http load balancer) needs to inspect the payload to decide where to route it. Since our master / worker engine is OSS, you can change it to suit your needs. Proprietary / one-off implementations can't do this. You find shortcomings in the interface, and you have to file change orders / feature requests and wait.
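As a rough picture of what payload-inspecting routing in a master might look like, here is a small POJO sketch. It is not the API of Terracotta's master / worker framework; the class RoutingMaster, its methods, and the routing rule in the usage note are all hypothetical names chosen for illustration. The only point is that when you own the master's source, the routing decision is an ordinary function you can swap.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.function.ToIntFunction;

// Hypothetical master that picks a worker queue by looking at the work item itself.
final class RoutingMaster<T> {
    private final List<BlockingQueue<T>> workerQueues = new ArrayList<>();
    private final ToIntFunction<T> router;

    // `router` inspects the payload and returns any int; it is mapped onto a valid worker index.
    RoutingMaster(int workers, ToIntFunction<T> router) {
        if (workers < 1) {
            throw new IllegalArgumentException("need at least one worker");
        }
        for (int i = 0; i < workers; i++) {
            workerQueues.add(new LinkedBlockingQueue<>());
        }
        this.router = router;
    }

    // Enqueues the work on the chosen worker's queue and returns that worker's index.
    int submit(T work) {
        int index = Math.floorMod(router.applyAsInt(work), workerQueues.size());
        workerQueues.get(index).add(work);
        return index;
    }

    // The queue a given worker would poll for its work.
    BlockingQueue<T> queueFor(int worker) {
        return workerQueues.get(worker);
    }
}
```

For example, new RoutingMaster<String>(2, s -> s.startsWith("bet:") ? 0 : 1) sends every payload that starts with "bet:" to worker 0 and everything else to worker 1. Changing the rule means changing one lambda rather than filing a feature request.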
Which do you prefer? Open or closed? Flexible or proprietary? Reality and source code or mystery and hype?