September 25, 2007
FUD OF THE WEEK: Object Identity is akin to cheating
posted by ari
I heard a new one this week. A user built a performance harness for testing the throughput of various clustering approaches. The harness took a single String and wrote it to a Map millions of times under constantly incrementing keys. (This isn't exactly what they did, but I must protect the innocent.)
At the end of a test run of several clustering solutions (clustered MySQL with heap tables, memcached, and TC), the engineer concluded TC was significantly faster. Then, someone aware of the various technologies caught wind of the results and said "Terracotta cheated because they have object identity."
This is both true and false. Terracotta had an advantage in this test over other approaches you might think of. That advantage is that object identity is preserved, so the following code:
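String myVal = new String( "Hello There." );
// HashMap is a stand-in here; substitute the clustered map under test
Map<Integer, String> myCache = new HashMap<Integer, String>();
for( int i = 0; i < 1000000; i++ ) {
    myCache.put( i, myVal ); // the same String instance under every key
}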
...
ends up doing in a cluster what you would expect on a single JVM. Sure, there is work to be done because myVal is referenced under 1 million different keys. That work is in updating the Map's internal structures (buckets, etc.) as the map grows. But we wouldn't expect the 12-character String to be stored one million separate times, now would we? Yet when we use approaches that hide copy-on-read / copy-on-write semantics under an interface, that is the kind of unexpected behavior we get. Try it yourself. Take that code and wire up your favorite clustered cache, then wire up Terracotta and tell me what happens.
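For a rough feel of the difference without setting up a cluster, here is a minimal single-JVM sketch. It is not Terracotta, memcached, or MySQL code. The class and method names are made up for illustration, and plain Java serialization stands in for whatever a copy-on-write cache does on put(). One method counts how many distinct String instances an ordinary heap map ends up holding, and the other adds up the bytes a serialize-on-every-put store would keep.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.util.HashMap;
import java.util.IdentityHashMap;
import java.util.Map;

// Hypothetical illustration only; none of these names come from a real product.
class StorageCostSketch {

    // Assumption: a copy-on-write cache serializes the value on every put().
    static byte[] serialize(Object value) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            ObjectOutputStream out = new ObjectOutputStream(bytes);
            out.writeObject(value);
            out.close();
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    // Ordinary heap semantics: every key references the same String instance.
    static int distinctInstancesOnHeap(int puts) {
        String myVal = new String("Hello There.");
        Map<Integer, String> myCache = new HashMap<Integer, String>();
        for (int i = 0; i < puts; i++) {
            myCache.put(i, myVal);
        }
        // IdentityHashMap compares keys with ==, so it counts instances, not equal values.
        Map<String, Boolean> distinct = new IdentityHashMap<String, Boolean>();
        for (String v : myCache.values()) {
            distinct.put(v, Boolean.TRUE);
        }
        return distinct.size();
    }

    // Copy-on-write semantics: every put() stores its own serialized copy.
    static long bytesStoredWithCopyOnWrite(int puts) {
        String myVal = new String("Hello There.");
        Map<Integer, byte[]> store = new HashMap<Integer, byte[]>();
        for (int i = 0; i < puts; i++) {
            store.put(i, serialize(myVal));
        }
        long total = 0;
        for (byte[] copy : store.values()) {
            total += copy.length;
        }
        return total;
    }

    public static void main(String[] args) {
        int puts = 100000;
        System.out.println("distinct instances on heap: " + distinctInstancesOnHeap(puts));
        System.out.println("bytes stored with copy-on-write: " + bytesStoredWithCopyOnWrite(puts));
    }
}
```

The heap map holds one instance no matter how many keys point at it, while the copying store grows linearly with the number of puts. A real clustered cache adds network and protocol costs on top, which this sketch does not try to model.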
Terracotta behaves as we would hope in this case and preserves object identity. So the claim that we are cheating is, strictly speaking, false.
What I recognize, however, is that Terracotta is different and people have to wrap their heads around the concept. If we define "cheating" as "not testing the same thing," then surely we are not testing the same thing. But cheating requires a desire to deceive.
In this case, Terracotta's desire is not to deceive or cheat anyone.
Our desire is to not disrupt your coding model or your expectations of your heap's behavior, and thus to not disturb your application.
I find it quite funny that behaving just like the JVM can be construed as cheating. Think about all the places in your code where you would get bitten right now if you introduced collections or interfaces behind which identity broke and, instead, copy-on-read occurred when you invoked get() and copy-on-write occurred when you invoked put(). BTW, when I say copy-on-read I mean "deserialization," and copy-on-write refers to "serialization." It's not perfect, but it is descriptive of the problem domain nonetheless.
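To make the "get bitten" part concrete, here is a small sketch of what breaks when identity is lost. CopyingMap is a hypothetical stand-in that I wrote for this example, not the API of any real cache. It assumes plain Java serialization on put() and deserialization on get().

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Map;

class IdentitySketch {

    // Hypothetical cache: copy-on-write in put(), copy-on-read in get().
    static class CopyingMap<K, V extends Serializable> {
        private final Map<K, byte[]> store = new HashMap<K, byte[]>();

        void put(K key, V value) {
            try {
                ByteArrayOutputStream bytes = new ByteArrayOutputStream();
                ObjectOutputStream out = new ObjectOutputStream(bytes);
                out.writeObject(value);
                out.close();
                store.put(key, bytes.toByteArray());
            } catch (IOException e) {
                throw new IllegalStateException(e);
            }
        }

        @SuppressWarnings("unchecked")
        V get(K key) {
            byte[] raw = store.get(key);
            if (raw == null) {
                return null;
            }
            try {
                ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(raw));
                return (V) in.readObject();
            } catch (IOException | ClassNotFoundException e) {
                throw new IllegalStateException(e);
            }
        }
    }

    // On the heap, get() returns the stored list, so the add() is visible later.
    static boolean heapMapSeesMutation() {
        Map<String, ArrayList<String>> heap = new HashMap<String, ArrayList<String>>();
        heap.put("cart", new ArrayList<String>());
        heap.get("cart").add("book");
        return heap.get("cart").contains("book");
    }

    // With copy-on-read, add() mutates a throwaway copy and the change is lost.
    static boolean copyingMapSeesMutation() {
        CopyingMap<String, ArrayList<String>> copying = new CopyingMap<String, ArrayList<String>>();
        copying.put("cart", new ArrayList<String>());
        copying.get("cart").add("book");
        return copying.get("cart").contains("book");
    }

    // Two reads of the same key yield two different objects.
    static boolean copyingMapReturnsSameInstance() {
        CopyingMap<String, String> copying = new CopyingMap<String, String>();
        copying.put("greeting", new String("Hello There."));
        return copying.get("greeting") == copying.get("greeting");
    }
}
```

Code written against heap semantics (mutate what get() returns, compare with ==, synchronize on the returned object) keeps working with the plain map and quietly stops working with the copying one, unless every mutation is followed by another put().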