August 22, 2007
FUD of the Week: Terracotta has no API
posted by ari
While most Terracotta users get started in one of a handful of ways, I have heard a few people assert the following (all of which shocked me):
- Multi-threaded programming is hard
- I want my team to know how to properly use bean-style programming and java.io.Serializable in a cluster
- I don't know where to start with Terracotta because it has no API
I want to take a moment to explain the thinking behind each of these statements. I also want to explain how most people consume Terracotta and how it should be consumed, based on the successes I have seen.
Multi-threaded programming is hard
Yes. Multi-threaded programming is hard. But not everyone has to do it every time. It is possible when using Spring, EJB (or JPA), some uses of Grid, some uses of caching frameworks, as well as bean-style coding, to hide the synchronization or threading. It is taken care of inside the API. Developers rarely deal with Tomcat thread pools, for example.
Terracotta offers several abstractions to help write apps without seeing threading or synchronization. Those include our EHCache config module and our Master - Worker API.
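To make "hidden inside the API" concrete, here is a minimal sketch of the master/worker idea using only the JDK. It is not Terracotta's Master - Worker API; the class name MasterWorkerSketch and the method runAll are made up for illustration. The caller hands over a list of inputs and a function and gets results back, without ever touching a thread or a lock.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

// Hypothetical illustration only: a tiny "master" that farms work out to a
// thread pool. The names here are assumptions, not part of any Terracotta API.
public class MasterWorkerSketch {

    // Runs work over every input on a pool of worker threads and returns the
    // results in input order. All threading stays inside this method.
    static <I, O> List<O> runAll(List<I> inputs, Function<I, O> work, int workers) {
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            List<Callable<O>> tasks = new ArrayList<>();
            for (I input : inputs) {
                tasks.add(() -> work.apply(input));
            }
            List<O> results = new ArrayList<>();
            // invokeAll blocks until every task is done and keeps task order.
            for (Future<O> future : pool.invokeAll(tasks)) {
                results.add(future.get());
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("a work item failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        // The calling code sees a plain method call, no threads or locks.
        System.out.println(runAll(Arrays.asList(1, 2, 3), n -> n * n, 2));
    }
}
```

The design point is the same one the frameworks above make: the person writing the work function does not need to be the person who understands the thread pool.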
I want my team to know how to use bean-style programming
Fact is, many frameworks are expressed in a bean-style fashion. But bean-style get() and put() calls with a Manager / Home context can be too simplistic for integrating clustering into an application. Why is this? Clustering requires coordination and sometimes it requires object identity. The most recent example I saw of this is a use case sharing large strings. With object identity built into Terracotta, a string would be compressed once, when it is first added to an object graph. It would then be decompressed only once in each JVM that reads it. With a Manager or Home, an object exhibits copy-on-read semantics, which means the string must be decompressed on every checkout. In certain systems, just adding compression for strings on get() and put() could actually eat up the entire CPU!
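The difference is easy to model in a single JVM with nothing but the JDK. The sketch below is not Terracotta code, and the class names CopyOnReadStore and IdentityStore are hypothetical. Both stores hold values gzip-compressed; the first copies the value out on every get(), the way a Manager or Home would, while the second inflates it once and then hands back the same reference.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical single-JVM model of the two behaviors; not Terracotta's API.
public class CopyOnReadDemo {

    // Manager/Home style: every get() decompresses and returns a fresh copy.
    static class CopyOnReadStore {
        private final Map<String, byte[]> compressed = new HashMap<>();
        int decompressions = 0;

        void put(String key, String value) {
            compressed.put(key, gzip(value));
        }

        String get(String key) {
            decompressions++;
            return gunzip(compressed.get(key));
        }
    }

    // Identity style: decompress on first access, then reuse the same object.
    static class IdentityStore {
        private final Map<String, byte[]> compressed = new HashMap<>();
        private final Map<String, String> resident = new HashMap<>();
        int decompressions = 0;

        void put(String key, String value) {
            compressed.put(key, gzip(value));
        }

        String get(String key) {
            return resident.computeIfAbsent(key, k -> {
                decompressions++;
                return gunzip(compressed.get(k));
            });
        }
    }

    static byte[] gzip(String text) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (GZIPOutputStream out = new GZIPOutputStream(bytes)) {
            out.write(text.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    static String gunzip(byte[] data) {
        try (GZIPInputStream in = new GZIPInputStream(new ByteArrayInputStream(data))) {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            byte[] buffer = new byte[4096];
            int n;
            while ((n = in.read(buffer)) != -1) {
                bytes.write(buffer, 0, n);
            }
            return new String(bytes.toByteArray(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        CopyOnReadStore copy = new CopyOnReadStore();
        IdentityStore identity = new IdentityStore();
        copy.put("doc", "a large string");
        identity.put("doc", "a large string");
        for (int i = 0; i < 1000; i++) {
            copy.get("doc");
            identity.get("doc");
        }
        System.out.println("copy-on-read decompressions: " + copy.decompressions);
        System.out.println("identity decompressions: " + identity.decompressions);
    }
}
```

A thousand reads cost a thousand decompressions in the first store and one in the second. Scale the string up and the read rate up, and that is where the CPU goes.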
Should clustering be something you know is present and think about? Yes. Should clustering be something that imposes a particular programming model and, in some cases, introduces overhead that would not exist across threads on a single JVM? No.
Terracotta has no API
This is just not true. Terracotta can be used with java.util.concurrent, just to mention one framework. One user on our forums is currently working on clustering ActiveMQ's in-memory queuing option. Terracotta offers Config Modules for easy integration with other OSS. I call that stuff "#include"-style integration. You can #include LUCENE clustering or #include EHCACHE clustering or #include HIBERNATE clustering. I think our current release (2.4.3) now adds support for annotations via a #include.
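As a rough sketch of what "used with java.util.concurrent" looks like from the application's side, here is an ordinary producer/consumer built on a BlockingQueue. Nothing in it is Terracotta-specific, and the class name WorkQueueDemo, the QUEUE field, and the POISON marker are invented for this example. The assumption is that a field like QUEUE is what you would ask the clustering layer to share through configuration rather than through code changes; the code itself runs unchanged in one JVM.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical example: plain java.util.concurrent code with no clustering API.
public class WorkQueueDemo {

    // Assumption: in a clustered deployment this is the object that would be
    // declared as shared in configuration. Here it is just a local queue.
    static final BlockingQueue<String> QUEUE = new LinkedBlockingQueue<>();

    // Marker telling the consumer to stop (hypothetical convention).
    static final String POISON = "__done__";

    // Pushes jobs through the queue and returns what the consumer processed.
    static List<String> process(List<String> jobs) {
        List<String> processed = new ArrayList<>();
        Thread consumer = new Thread(() -> {
            try {
                String job = QUEUE.take();
                while (!job.equals(POISON)) {
                    processed.add(job.toUpperCase());
                    job = QUEUE.take();
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        consumer.start();
        try {
            for (String job : jobs) {
                QUEUE.put(job);
            }
            QUEUE.put(POISON);
            // join() makes the consumer's writes to the list visible here.
            consumer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while handing off work", e);
        }
        return processed;
    }

    public static void main(String[] args) {
        System.out.println(process(Arrays.asList("index", "cache", "session")));
    }
}
```

The producer and consumer only know about put() and take(). Whether the other end of the queue lives in the same JVM is not something this code has to say.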
So What's the Real Deal?
Terracotta users who are in production, and who got there using only our docs and forums, tend to exhibit the same behaviors:
- They used a Terracotta Config Module and their existing app
- They took an asynch app (message-based, SEDA, or other) and deleted the transport layers such as JGroups, MQ, or what have you
- They were using Terracotta Sessions or Terracotta for Spring
- They were used to POJO and Terracotta fit with their thinking and expectations
The most important thing to keep in mind is that you do not have to suffer and change your expectations when clustering. You don't need new APIs. You don't need new methodologies. You don't need to put synchronized{} all over your code.
The one thing you need is a JVM plug-in that integrates in a fine-grained manner exactly where you want it to and doesn't bleed "clustering juice" all over your source code. And, you need tools for the cluster that extend your natural ability to profile and performance tune apps out into the cluster.