December 17, 2006

Tired of the Exaggerations? Let's Define a Standard

posted by ari
The benchmark results showed that XXX enables linear scaling of parallel processing across the grid as the grid (and the data set) increases in size and is limited only by aggregate CPU cycles across the grid. In the test, XXX was able to linearly scale from 2 million aggregations with two servers to more than 60 million aggregations across the 96 servers. This 30 times increase in processing throughput was achieved with only one tenth of a second increase in processing time, or 1.2 seconds compared to 1.1 seconds. Additionally, the tests demonstrated that the data grid storage capacity increases linearly as additional resources are added to the grid and is limited only by the amount of RAM available to the data grid.

Impressive numbers indeed. This vendor did a great job building scalable software. Lately, however, I have grown tired of claims of "infinite linear scale" and even plain "linear scale." Look at this paper. It takes a 30-fold increase in throughput (2 million vs. 60 million aggregations) from a 48-fold increase in servers (2 vs. 96) and calls it "linear scale." In fact, per-server throughput drops to 30/48, or about 62% of what it was, which is a degradation of roughly 38% as the application scales.
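To make that arithmetic easy to re-run against any vendor's numbers, here is a minimal sketch. The function names are mine, not from the paper, and I take "more than 60 million" as exactly 60 million. The second helper is my own extension: it folds in the quoted run times (1.1 s and 1.2 s), which the 38% figure above ignores.

```python
def scaling_efficiency(base_nodes, base_work, scaled_nodes, scaled_work):
    """Return the throughput ratio divided by the node ratio.

    1.0 means truly linear scale; anything lower is the fraction of
    per-node throughput that survives as the cluster grows.
    """
    node_ratio = scaled_nodes / base_nodes
    work_ratio = scaled_work / base_work
    return work_ratio / node_ratio


def scaling_efficiency_per_second(base_nodes, base_work, base_secs,
                                  scaled_nodes, scaled_work, scaled_secs):
    """Same idea, but compares work per second instead of work per run."""
    return scaling_efficiency(base_nodes, base_work / base_secs,
                              scaled_nodes, scaled_work / scaled_secs)


# Numbers from the quoted benchmark (60 million is an assumed round figure).
eff = scaling_efficiency(2, 2_000_000, 96, 60_000_000)
eff_timed = scaling_efficiency_per_second(2, 2_000_000, 1.1,
                                          96, 60_000_000, 1.2)

print(f"efficiency per run:    {eff:.3f} (degradation {1 - eff:.1%})")
print(f"efficiency per second: {eff_timed:.3f} (degradation {1 - eff_timed:.1%})")
```

If the timing figures are taken at face value, the per-second view looks slightly worse than the per-run view, since the larger run also took a tenth of a second longer.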

Over the next few months, Open Terracotta will gain significant performance and availability improvements. In every test our customers and prospects have run, we have been faster than they expected (usually 10X faster than a serialization-based clustering solution), but the result depends heavily on the use case.

How do we as a community define a performance benchmark for clustering? Read on for my thoughts...

OK, first a rant. Kirk Pepperdine once did a bunch of work on this and shared his conclusions with me: SPECjbb and SPECjAppServer test the database and JMS more heavily, so clustering the stateful part of the application server would yield no material performance improvement. Therefore, no clustering vendor can publish numbers like "we provide a 50% SPECjAppServer performance increase," because the harness tests J2EE (EJBs and all that). Vendors would have to edit the test to show the true power of clustering without the database.

Furthermore, neither SPECjAppServer nor any app server vendor allows performance numbers to be published without express permission, which of course no one in his or her right mind would ever give. "Sure, tell the world my app server can run a lot better than I have made it perform on my own."

So, what can we test? We can test many things, but at the core, we have to agree on what we are testing. I submit that we should test:

  1. OBJECT SIZE: impact of object graph size on clustering
  2. SCOPE OF CHANGE: impact of small-to-large graph changes on clustering
  3. CORRECTNESS: impact of ACID-compliance on clustering
  4. SCALABILITY: impact of number of clustering nodes
  5. USABLE AVAILABILITY: fail over objects to secondary servers regularly
  6. MANAGEABILITY: kill -9 the whole cluster and see if it can restart where it left off. Operators will need to kill random boxes throughout a normal work day.
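One way to picture those six dimensions is as a test matrix that a harness would walk through. The sketch below is purely illustrative: the `Scenario` type, the field names, and every value in the lists are hypothetical placeholders I made up to show the shape, not a proposed standard.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Scenario:
    graph_size: int         # 1. OBJECT SIZE: objects in the clustered graph
    change_fraction: float  # 2. SCOPE OF CHANGE: share of the graph mutated per update
    transactional: bool     # 3. CORRECTNESS: require ACID semantics on updates
    nodes: int              # 4. SCALABILITY: number of clustering nodes
    failover: bool          # 5. USABLE AVAILABILITY: fail over to secondaries mid-run
    full_restart: bool      # 6. MANAGEABILITY: kill -9 the cluster and restart it

# Hypothetical values; a real harness would let each team pick its own.
GRAPH_SIZES = [10, 1_000, 100_000]
CHANGE_FRACTIONS = [0.01, 0.5, 1.0]
TRANSACTIONAL = [False, True]
NODE_COUNTS = [2, 8, 32]
FAILOVER = [False, True]
FULL_RESTART = [False, True]


def build_matrix():
    """Return every combination of the six dimensions as a Scenario."""
    return [
        Scenario(*combo)
        for combo in product(GRAPH_SIZES, CHANGE_FRACTIONS, TRANSACTIONAL,
                             NODE_COUNTS, FAILOVER, FULL_RESTART)
    ]


matrix = build_matrix()
print(f"{len(matrix)} scenarios, first: {matrix[0]}")
```

A full cross product grows quickly, so a shared harness would probably publish a small agreed subset and leave the rest for teams to run against their own use case.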

How do we go about building a standard test harness for all this? I propose that we take the JPetStore application, augment it as open source, and host it somewhere like the Terracotta Forge for all to grab and use for testing ANY clustering technology. While session clustering is not the hardest test case for clustering technologies, it is something most development teams can set up on the cheap, and it is easy to adjust when constructing tests closer to one company's particular use case.

Let's do it, then. Someone propose it in the comments.
