<?xml version="1.0" encoding="utf-8"?>
<rss version="2.0">
   <channel>
      <title>POJO Mojo</title>
      <link>http://blog.terracottatech.com/</link>
      <description>Talk about Terracotta</description>
      <language>en</language>
      <copyright>Copyright 2010</copyright>
      <lastBuildDate>Tue, 09 Mar 2010 21:23:11 -0800</lastBuildDate>
      <generator>http://www.sixapart.com/movabletype/</generator>
      <docs>http://blogs.law.harvard.edu/tech/rss</docs> 

            <item>
         <title>Today we launched Terracotta-cloud</title>
         <description><![CDATA[<p>Over the past 3 weeks, I built a cluster running Eucalyptus + Terracotta.  Today, Rich Wolski--Eucalyptus's CTO--and I did a joint webinar where the 500 or so people in attendance used our clustered solution.  The system built itself in 4.5 minutes and then took 150 users generating about 2000 cache hits PER SECOND.</p>

<p>Want to see how to use Terracotta and Eucalyptus, and how to put the 2 together?  Watch our joint webinar over at http://www.terracotta.org/</p>

<p>If you were in attendance today, you saw us build a cluster of 4 Jetty servers and 4 Terracotta Servers plus a database node and a Terracotta load balancer.  All in, we had 10 nodes running and were handling the throughput no problem.  Makes sense...this is a rather large cluster for such a small number of concurrent users--150.</p>

<p>Anyways, I will blog more about the architecture of our new Terracotta-cloud product but for now I thought I would show everyone a picture of just how compact and powerful the cluster you were hitting is.  Enjoy...</p>

<p><a href="http://blog.terracottatech.com/IMG00410.jpg"><img alt="IMG00410.jpg" src="http://blog.terracottatech.com/IMG00410-thumb.jpg" width="160" height="120" /></a></p>

<p>--Ari</p>]]></description>
         <link>http://blog.terracottatech.com/2010/03/today_we_launched_terracottacl.html</link>
         <guid>http://blog.terracottatech.com/2010/03/today_we_launched_terracottacl.html</guid>
        
        
         <pubDate>Tue, 09 Mar 2010 21:23:11 -0800</pubDate>
      </item>
            <item>
         <title>Amazon EC2 + Terracotta.  A good match?</title>
         <description><![CDATA[<p>Looking for feedback here.  With all of Amazon Web Services' amazing progress around MySQL, Hadoop, and more, it occurred to me that there is room for a built-in Terracotta service for Amazon EC2 users as well.</p>

<p>Think about it.  My favorite demo of cloud-enablement right now is Chris Richardson's CloudFoundry tool kit.  A little known secret about Cloud Foundry is Chris can and has made it support Terracotta as part of app deployment in EC2 or vCloud Express.  My thoughts with Chris have always been:</p>

<p>1. Ok, so Cloud Foundry makes push-button app deployment at scale very easy.  If I want 2 HTTPDs, 4 Tomcats, 1 MySQL master node and a slave for backup, I can just click my way to such a deployment description, add my WAR file, stir and presto...instant production-scale app in the cloud.</p>

<p>2. So, Terracotta could be added to this story in that, once I add my WAR, can't Cloud Foundry detect the presence of Ehcache and Hibernate jars IN MY bundle, and ask me if I want my Hibernate cached, or my Ehcache clustered, or if I wanted clustered caches or ... well you get the idea.  A simple checkbox in my app saying "cluster and cache my DB please.  Thank you."</p>

<p>I mean, why wouldn't everyone want to free their app from the ball and chain that is the RDBMS?  Sure, MySQL in the cloud is kewl and all, but does it help as much as a distributed cache built in to the things I construct apps with every day?  Of course it is not as helpful.  So, with push-button distributed caching, I can take the app I already have, deploy it to the cloud, and go faster than I used to go in my own datacenter too.  Great!</p>

<p>So, my question to everyone out there seeing this is, if it can be push-button now, thanks to Terracotta Ehcache, why not build it into EC2 directly?  Basically like ordering off a McDonald's menu.  "Would you like a distributed cache with that?"  "Yes, please!"  "Ok, 2 nodes or 4?"  "4 please, thank you."</p>

<p>So, what do people think?</p>

<p>--Ari</p>]]></description>
         <link>http://blog.terracottatech.com/2009/11/amazon_ec2_terracotta_a_good_m.html</link>
         <guid>http://blog.terracottatech.com/2009/11/amazon_ec2_terracotta_a_good_m.html</guid>
        
        
         <pubDate>Wed, 18 Nov 2009 21:33:17 -0800</pubDate>
      </item>
            <item>
         <title>More big news coming this week I think</title>
         <description><![CDATA[<p>We got big news again.  This is becoming a monthly occurrence.  Stay tuned!</p>

<p>--Ari</p>]]></description>
         <link>http://blog.terracottatech.com/2009/11/more_big_news_coming_this_week.html</link>
         <guid>http://blog.terracottatech.com/2009/11/more_big_news_coming_this_week.html</guid>
        
        
         <pubDate>Wed, 18 Nov 2009 01:02:38 -0800</pubDate>
      </item>
            <item>
         <title>Congrats to Cloud Foundry</title>
         <description><![CDATA[<p>Quick congrats to SpringSource and <a href="http://cloudfoundry.com">Cloud Foundry</a>.</p>

<p>Chris Richardson has been a friend of Terracotta for years and we are glad <a href="http://chris-richardson.blog-city.com/cloud_foundry_is_now_part_of_springsource.htm">his latest ideas landed somewhere so great to be.</a></p>

<p>The great news for Terracotta's community is that Cloud Tools and CloudFoundry.com both support Terracotta, directly.</p>

<p>So with our own <a href="http://www.dzone.com/links/a_framework_for_running_anything_on_ec2_terracott.html">puppet master-based experiments in the Cloud</a> plus CloudFoundry being part of SpringSource, I guess Terracotta has just been thrust into cloud computing.</p>

<p>Gotta love open source :)</p>

<p>BTW, email me if you want access to or help with our puppet + Terracotta AMI.  I would love the feedback.</p>

<p>Thanks,</p>

<p>--Ari<br />
</p>]]></description>
         <link>http://blog.terracottatech.com/2009/08/congrats_to_cloud_foundry.html</link>
         <guid>http://blog.terracottatech.com/2009/08/congrats_to_cloud_foundry.html</guid>
        
        
         <pubDate>Wed, 19 Aug 2009 08:13:48 -0800</pubDate>
      </item>
            <item>
         <title>Terracotta and EHCache: A marriage made in Java</title>
         <description><![CDATA[<p>I am very pleased to be able to announce that Terracotta and EHCache are now one.  </p>

<p>If you were curious what this hash is: d7073c02eca990a65c2c4c911fe33b20</p>

<p>It is the MD5 hash of the contract between Terracotta and EHCache leadership that both cements and represents this new day in Enterprise Java scalability.</p>

<p><H1>The rationale</H1><br />
EHCache has massive adoption...<br />
EHCache provides the world-standard caching interface (both de facto and soon, JSR-107) to Java applications; and it is easy to use.  EHCache has hundreds of thousands if not millions of production deployments. And EHCache is embedded in many popular products from the Spring framework, to Liferay, to Alfresco, to Documentum, to Hibernate.  If you name it, it is likely using EHCache.</p>

<p>Terracotta has a proven open source scalability solution...<br />
Terracotta provides the world's best open source Java clustering and HA platform on which to run Enterprise-class applications.   Terracotta is used underneath hundreds of the world's most critical applications.  Terracotta's interface in more than 50% of use cases has been EHCache.  Basically, developers design applications to use EHCache and use Terracotta's EHCache clustering module to get massive scale and high availability at runtime.</p>

<p>The two together will provide the most seamless path from 1 node up to 100.  Instead of having to worry about which version of EHCache Terracotta supports, or if your EHCache integration will work well with Terracotta, EHCache's and Terracotta's users alike can rest assured the two will always work in perfect harmony from today forward.</p>

<p>This makes Terracotta + EHCache the largest vendor in the market focused only on Java scalability and reliability.</p>

<p><H1>What this means for EHCache Users</H1><br />
EHCache users will get a few things:<br />
1. The same Apache 2 license they currently rely on<br />
2. A new hosting environment operated by Terracotta with state-of-the-art forums, source control, Maven infrastructure, etc. all running alongside SourceForge infrastructure that will remain in place<br />
3. A dedicated team of engineers working full-time on EHCache performance and features<br />
4. Direct upgrade path to Terracotta that is seamless and nearly configurationless<br />
5. Enterprise support and training for existing EHCache installations</p>

<p><br />
Terracotta users will get a few things as well:<br />
1. EHCache interfaces will replace Terracotta distributed cache as a single caching interface / standard for Terracotta distributed caching<br />
2. A single-node version of Terracotta that can run on the desktop w/o our server array<br />
3. Full freedom to run on the latest version of EHCache at all times, knowing it works with Terracotta<br />
4. One vendor support structure for their caching interfaces / libraries as well as their scalability / reliability runtime.</p>

<p><br />
<H1>Now the fun begins</H1><br />
Next steps together<br />
1. Greg Luck's role will be as CTO of EHCache here at Terracotta, reporting to me<br />
2. We will merge our product roadmaps, including seamless upgrade from 1 EHCache node to 100s as well as adding some new interfaces / APIs around searching / indexing caches, etc.</p>

<p>We now have a very well rounded solution for Enterprise Java applications.  The decision about where to keep state has always gone in the database's favor save for the most highly trafficked sites and systems.  That's fine by us.  Keep the data in the database.  Just cache your catalogs, products, and users with EHCache, through Hibernate or directly from JDBC to EHCache by hand.  Write your sales orders, trades, matching operations to Terracotta and write-behind to the database or just write-through--both will be fast.  And build your conversational state in memory using HTTP Session and Terracotta container clustering or use EHCache directly.  With our EHCache distributed cache, our HTTP Session product, and our core DSO platform you can do all 3 in the same application without giving up your database and without sacrificing scale or performance.</p>

<p>The marriage of EHCache and Terracotta: a wedding where you can have your cake and eat it too - scalability and ease of use without having to worry about side effects or impacts.</p>

<p>Here's to Java and the Java Community</p>

<p>--Ari<br />
</p>]]></description>
         <link>http://blog.terracottatech.com/2009/08/terracotta_and_ehcache_a_marri.html</link>
         <guid>http://blog.terracottatech.com/2009/08/terracotta_and_ehcache_a_marri.html</guid>
        
        
         <pubDate>Tue, 18 Aug 2009 04:00:00 -0800</pubDate>
      </item>
            <item>
         <title>WHAT IS THIS NUMBER?</title>
         <description><![CDATA[<p>d7073c02eca990a65c2c4c911fe33b20</p>

<p>--Ari</p>]]></description>
         <link>http://blog.terracottatech.com/2009/08/what_is_this_number.html</link>
         <guid>http://blog.terracottatech.com/2009/08/what_is_this_number.html</guid>
        
        
         <pubDate>Sun, 16 Aug 2009 10:26:46 -0800</pubDate>
      </item>
            <item>
         <title>Great post on performance.</title>
         <description><![CDATA[<p>Fair enough: Steve is our head of engineering and I generally hate when people do this sort of cross-posting thing but his post is actually very educational:<br />
<a href="http://dsoguy.blogspot.com/2009/08/distributed-data-structures.html">http://dsoguy.blogspot.com/2009/08/distributed-data-structures.html</a></p>

<p>Makes me want to write a post on performance benchmarking with Terracotta.  Very few users seem to know what they want to test, what their tests are actually testing, and how to test what they need.</p>

<p>I think it's time to clear that up with a bit of framework magic and general documentation.  Definitely need to find a way to start immediately on this project, for the betterment of all distributed cache users, not just Terracotta users :)</p>

<p>--Ari</p>]]></description>
         <link>http://blog.terracottatech.com/2009/08/great_post_on_performance.html</link>
         <guid>http://blog.terracottatech.com/2009/08/great_post_on_performance.html</guid>
        
        
         <pubDate>Thu, 13 Aug 2009 00:30:16 -0800</pubDate>
      </item>
            <item>
         <title>Massive throughput for Hibernate apps</title>
         <description><![CDATA[<p>Ok.</p>

<p>So I saw a performance benchmarking report recently.  We tested the JPetClinic domain model being written to through Hibernate.  With Terracotta as second level cache, we could do upwards of 200,000 reads per second on an 8-JVM cluster.  And 150K reads / writes at 90/10 ratio (creating new pets and appointments 10% of the time) on that same 8-JVM cluster.  This is 30X faster than a few options we benchmarked against (after paying experts to tune the other options...not just benchmarking on our own).  The other options did more like 5K tps from the same cluster.  And MySQL by itself w/o a second level cache could support 1.1K tps (136X faster for Terracotta).</p>

<p>I can't wait to do the webinar for everyone on what we have done in this 3.1 release.  Basically, we are delivering the power and scalability of the best of distributed caching architectures--the latest and greatest in application design--but through the Hibernate / RDBMS model most apps already use.  Things are really getting fun for me and for us now.</p>

<p>--Ari<br />
</p>]]></description>
         <link>http://blog.terracottatech.com/2009/08/massive_throughput_for_hiberna.html</link>
         <guid>http://blog.terracottatech.com/2009/08/massive_throughput_for_hiberna.html</guid>
        
        
         <pubDate>Tue, 11 Aug 2009 22:21:50 -0800</pubDate>
      </item>
            <item>
         <title>The irony.</title>
         <description><![CDATA[<p>We were at lunch today, quoting the Austin Powers series of films.  We got to a team-favorite:</p>

<p>"I eat because I am unhappy.  I'm unhappy because I eat."</p>

<p>Then we morphed it:</p>

<p>"I cluster because I'm not scalable.  I'm not scalable because I cluster."</p>

<p>Some people are still afraid of the sort of thing products like ours do.  I do think more and more architects each day feel like clustering is now as highly available as the database (or more) and yet far better at delivering low latency, high throughput, and linear scalability.  Just a thought, but if you have been afraid to cluster, now is the time to take another look.</p>

<p>--Ari</p>]]></description>
         <link>http://blog.terracottatech.com/2009/07/the_irony.html</link>
         <guid>http://blog.terracottatech.com/2009/07/the_irony.html</guid>
        
        
         <pubDate>Thu, 16 Jul 2009 14:20:46 -0800</pubDate>
      </item>
            <item>
         <title>Sun&apos;s Wacky JavaOne idea...which I actually like</title>
         <description><![CDATA[<p>Sun is offering folks the opportunity to have a nice catered lunch with Alex Miller and me.  I was quite flattered when they approached with the concept.  And now I am just excited and eagerly anticipating the opportunity to meet 8 Java developers and discuss in detail what you guys are working on over a nice lunch, all on Sun.</p>

<p>To sign up, you have to pay for your JavaOne badge through an EBay auction.  (Don't ask me why.)  But here's the URL:<br />
<a href="http://shop.ebay.com/merchant/javaone2009_W0QQ_nkwZQQ_armrsZ1QQ_fromZQQ_mdoZ">http://shop.ebay.com/merchant/javaone2009_W0QQ_nkwZQQ_armrsZ1QQ_fromZQQ_mdoZ<br />
</a><br />
Expect other speakers to show up there soon as well.</p>

<p>Cheers!</p>

<p>--Ari<br />
</p>]]></description>
         <link>http://blog.terracottatech.com/2009/05/suns_wacky_javaone_ideawhich_i.html</link>
         <guid>http://blog.terracottatech.com/2009/05/suns_wacky_javaone_ideawhich_i.html</guid>
        
        
         <pubDate>Fri, 01 May 2009 11:17:41 -0800</pubDate>
      </item>
            <item>
         <title>Answering a reader&apos;s question: When to offload the DB with Terracotta</title>
         <description><![CDATA[<p>I have been remiss in not answering a reader's very intelligent question.  Here's the question.</p>

<blockquote>Write-behind would be wonderful for performance, but how do you handle db transaction errors?  What if someone places a bet (in your real-life example), and the bet fails to commit?  How do we recover from that?  Was the user originally told (when the bet was marked dirty) that the bet was successfully placed (and committed)?  Or is the user told that the bet is “posted” or something else that implies that the bet is not truly committed yet? 

<p>If the application considers the object committed once it is marked dirty (and/or placed on the write-behind queue), then I assume that there could be potential consistency issues with write-behind, where one box has the bet placed (on the queue, but not in the db), but the other boxes do not see the bet placed?  </p>

<p>And even on the box that placed the bet, if the db is queried with a projection that includes the bet (rather than the Bet domain object; eg. Some report of all bets, joined in with the user and their account status), then I assume the query will return inconsistent results (ie. The report of bets will not include the bet on the write-behind queue)?</p>

<p>Not that any of this is an argument against write-behind; I just want to understand the use cases where this can be applied safely.</p>

<p>Thank you.<br />
</blockquote></p>

<p>Now to attempt an answer.  </p>

<p>Write-behind and detaching are not a panacea.  I mean that it is true that this pattern cannot always be used.  But at the same time it does not suffer from the issues you raise, at least not in this use case.</p>

<p>Always start from the top down as we would in this online betting use case.  What is the user experience we want and why?  Make design decisions from there.  In the questions you ask, I sense a desire to keep the database as the master of all data which is a bottom-up approach.  Let's cover the pitfalls of a bottom-up or database-up approach first.</p>

<p>As an example, I was on a panel with Brian Goetz and several "grid" vendors in 2008 at QCon San Francisco.  A similar question to yours was asked--"Can I really always build asynchronous apps?"  The answer from all the grid vendors was a thundering, "Yes!"  They need you to believe you can.  My answer was, "No.  You can't always afford it.  Asynchronous nature might be hard to add given your business requirements."  Brian got much more concrete and asked the audience to think about EBay and Amazon.  Amazon can be simplified down to an e-catalog site.  This is not to take away from their monumental achievements in scale and reliability.  I only mean to say that they have a massive need for caching where people buy maybe 5 - 10% of the time so 90% of visits are readers.  EBay has a massive write rate and those writes are highly localized to auctions ending in the next hour or minute.</p>

<p>EBay and Amazon need to make almost entirely inverse architecture decisions yet both are called eCommerce sites.  Both have a catalog.  Both have a user account management function.  And, both have payment clearing capabilities.  Amazon also has fulfillment capabilities and warehousing concerns that EBay does not but that is beside the point I am making.</p>

<p>Amazon might tell you that Sleepycat or voldemort or memcache rule (they only use Sleepycat, BTW.  Voldemort is an OSS clone of their Dynamo approach).  EBay would tell you that partitioned Oracle works great.  The two are both correct.  EBay needed transactional ACID updates in many of their pageviews whereas Amazon needed a different optimization (read Amazon's Dynamo paper for their approach to eventually consistent data storage).</p>

<p>Back to our topic. In the case of betting, you have:<br />
<ol><br />
<li> a catalog of games / matches / things on which to place bets</li><br />
<li> User account info</li><br />
<li> payment clearing system</li><br />
<li> placed bets which are like items you have sold that you have yet to fulfill / deliver</li><br />
</ol><br />
and more.</p>

<p>We can surmise together that when I place a bet and I get an email or web-based confirmation saying my bet has been placed for, let's say, $100, at 2:1 odds in the affirmative (meaning I think the results will be in favor of the bet direction), I fully expect to get paid $200 after the game if I win and to have my $100 debited if I lose.  If I win and I do not get the $200, there will be no excuses the bookmaker can make that will have me come back.  A bookmaker who does not know the commitments he has made will soon be out of business.</p>

<p>So as an architect, I start to think about coherence and transaction isolation, and databases sound really good.  But are they necessary?  No.  What I really want is:<br />
<ol><br />
<li>all my commitments on disk so I never risk losing anything</li><br />
<li>make that on more than 1 disk in case I even lose my disks</li><br />
<li>coherent recording of the odds and the amounts of the bet</li><br />
<li>a transactional view of my users' account balances so I don't extend credit when I do not intend to.</li><br />
<li>a fast, efficient way to search for, display matches and also a coherent way to update odds as my staff learn new info (say the quarterback breaks his leg).</li><br />
</ol><br />
Mapping Terracotta to this business requirement I would suggest:<br />
<ol><br />
<li>well, Terracotta is disk-based just like the DB so I can in theory use it in more places than I would use a cache</li><br />
<li>Yes, Terracotta server array will have at least 2 copies or more if I configure properly, on 2 or more separate _machines_.  That's lots of redundancy, more than the DB in fact.</li><br />
<li>Terracotta operations have isolation levels just like a DB.  There are read locks, write locks, synch-write locks and concurrent locks.  I would likely use a combination of these to record a bet that is being placed.  Specifically, read lock the catalog to bet against the currently-available odds, and write-lock the user's account or my list of placed bets or both to debit the account and write the placed bet.</li><br />
<li>I can use ReadWriteLocks from java.util.concurrent to compose a transaction across a debit + credit operation against a user's account.  But, I want the accounts recorded in the DB and cannot use Terracotta in a 2PC manner, so I might choose to cache the account in Terracotta but keep it in the DB, using instead a 1.5PC (I covered 1.5PC in the original blog and will not cover it again here).</li><br />
<li>Perhaps put the catalog of games in the DB, and cache the catalog in Terracotta and when my back office content management tools change the DB, also send a JMS message to a node in my app cluster to update the in-memory cache as well, or I could just make my back-office updater a part of my production app cache in Terracotta.</li><br />
</ol></p>
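
<p>To make the locking in points 3 and 4 a bit more concrete, here is a minimal single-JVM sketch in plain Java.  The names (BetDesk, setOdds, deposit, placeBet) are made up for illustration and are not Terracotta API.  It assumes one lock for the whole catalog and one for all accounts, and it leaves out how Terracotta would cluster these objects, which would be a configuration matter.  It only shows the shape: read lock the catalog to bet against the current odds, write lock the account to debit and record the bet as one unit.</p>

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical sketch: BetDesk and its methods are illustrative names, not Terracotta API.
final class BetDesk {
    private final ReadWriteLock catalogLock = new ReentrantReadWriteLock();
    private final ReadWriteLock accountLock = new ReentrantReadWriteLock();
    private final Map<String, Double> oddsByMatch = new HashMap<>();
    private final Map<String, Long> balanceByUser = new HashMap<>();
    private final List<String> placedBets = new ArrayList<>();

    // Back office path: changing odds takes the catalog write lock.
    void setOdds(String match, double odds) {
        catalogLock.writeLock().lock();
        try {
            oddsByMatch.put(match, odds);
        } finally {
            catalogLock.writeLock().unlock();
        }
    }

    void deposit(String user, long amount) {
        accountLock.writeLock().lock();
        try {
            balanceByUser.merge(user, amount, Long::sum);
        } finally {
            accountLock.writeLock().unlock();
        }
    }

    // Locks are always taken in the same order (catalog, then account) to avoid deadlock.
    boolean placeBet(String user, String match, long stake) {
        catalogLock.readLock().lock();          // many bettors can read the odds at once
        try {
            Double odds = oddsByMatch.get(match);
            if (odds == null) {
                return false;                   // unknown match
            }
            accountLock.writeLock().lock();     // exclusive: debit + record together
            try {
                long balance = balanceByUser.getOrDefault(user, 0L);
                if (balance < stake) {
                    return false;               // never extend credit
                }
                balanceByUser.put(user, balance - stake);
                placedBets.add(user + ":" + match + ":" + stake + "@" + odds);
                return true;
            } finally {
                accountLock.writeLock().unlock();
            }
        } finally {
            catalogLock.readLock().unlock();
        }
    }

    long balance(String user) {
        accountLock.readLock().lock();
        try {
            return balanceByUser.getOrDefault(user, 0L);
        } finally {
            accountLock.readLock().unlock();
        }
    }

    int betCount() {
        accountLock.readLock().lock();
        try {
            return placedBets.size();
        } finally {
            accountLock.readLock().unlock();
        }
    }
}
```

<p>A real system would probably want a lock per account rather than one lock for all of them; the single lock just keeps the sketch short.</p>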

<p>In summary, cache all the read only data in Terracotta to offload the DB of its otherwise wasted usage under web-onlookers.  Put user financial info in the DB and manage account balances there, using DB transactions to update.  Then put the bets into Terracotta, lazily flushing to the DB so that a back office SQL script can run payouts and collections at the end of each match, against people's accounts. This will offload not just read-only onlookers but actual updates and business transactions as well, at least for a time.  At Terracotta we call this "shaving the peak load" where the DB does not have to be sized large enough to handle traffic spikes.</p>

<p>To make this all work, we need a few more details though.  We have to make sure that we write an "end of match" eventing system that makes sure to flush all placed bets to the DB so that we can clear payments accurately at the end of a match from entirely within the DB.  Could you put accounting in Terracotta?  Yes, but all your back office tools would have to be rewritten to work against Java web services of some sort instead of against the DB they most likely work against today.</p>

<p>As for a stable view of bets and odds that are constantly being changed by back office team members trying to optimize company revenue, there are a few challenges.  First, the odds.  Easiest way to update them in a transaction with the DB is to have the back office app that changes the odds work within a Terracotta transaction.</p>

<p><a href="http://blog.terracottatech.com/Transactional%20writer.png"><img alt="Transactional%20writer.png" src="http://blog.terracottatech.com/Transactional%20writer-thumb.png" width="497" height="444" /></a></p>

<p>Now, the odds will be updated or fail and the business will know what risk the failure poses.  Our employees will always get clear messages from the Office updater app that they successfully changed the odds for a match or that the change failed.  They can even choose to stop all betting if something goes horribly wrong.  But they will always know the status of the games and odds.</p>

<p>Now, let's secure our betting model.  Bets are being placed into Terracotta and onto 2 disks so they are safe.  And the odds at which they are placed were valid odds at the time given we made a good back office tool for updating odds.  We just need to make sure all bets are flushed to the DB before the DB executes its payout process.  Should be simple.  When a match begins, flip the DB state to "INCOMPLETE" for that match in some table.  The payout script will refuse to run payouts against incomplete matches.  Now, have the app cluster flip the match to complete using a timer task of some sort.  Put the COMPLETION marker in the asynch write-behind queue with bets.  When the app tier decides to COMPLETE the match, all subsequent bet attempts will be rejected.  And a COMPLETION marker will be placed in the queue that will flip the DB state.  But it is placed in sequence or in order with all the bets.  So all you need is an in-order asynch write behind and bets can be placed asynch with the DB commit.  Let's take a look at in-order queuing in a picture.  It is a kewl technique that can help offload the db with fewer worries.</p>

<p><a href="http://blog.terracottatech.com/inorder%20queuing.png"><img alt="inorder%20queuing.png" src="http://blog.terracottatech.com/inorder%20queuing-thumb.png" width="567" height="513" /></a></p>
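
<p>For those who prefer code to pictures, here is a rough sketch of the in-order queue with a COMPLETION marker.  Everything in it is an assumption made for illustration: the class name InOrderWriteBehind, the in-memory map and list standing in for the DB tables, and the drain() method standing in for a single write-behind thread.  It is not how Terracotta implements write-behind; it only shows why ordering is enough.</p>

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Queue;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentLinkedQueue;

// Hypothetical sketch of in-order write-behind; the "DB" is an in-memory stand-in.
final class InOrderWriteBehind {
    // A queue entry is either a bet or, when bet == null, the COMPLETION marker.
    private static final class Entry {
        final String matchId;
        final String bet;
        Entry(String matchId, String bet) {
            this.matchId = matchId;
            this.bet = bet;
        }
    }

    private final Queue<Entry> queue = new ConcurrentLinkedQueue<>();
    private final Set<String> closedMatches = new HashSet<>();              // app-tier view
    private final Map<String, String> dbMatchState = new ConcurrentHashMap<>();
    private final List<String> dbBets = Collections.synchronizedList(new ArrayList<>());

    // When a match begins, the DB row says INCOMPLETE, so payouts are refused.
    void startMatch(String matchId) {
        dbMatchState.put(matchId, "INCOMPLETE");
    }

    // synchronized with completeMatch so no bet can slip in behind the marker.
    synchronized boolean placeBet(String matchId, String bet) {
        if (closedMatches.contains(matchId)) {
            return false;                        // match already completed in the app tier
        }
        queue.add(new Entry(matchId, bet));
        return true;
    }

    // Enqueue the COMPLETION marker behind every bet accepted so far.
    synchronized void completeMatch(String matchId) {
        if (closedMatches.add(matchId)) {
            queue.add(new Entry(matchId, null));
        }
    }

    // Stand-in for the single write-behind thread: apply entries strictly in FIFO order.
    int drain() {
        int written = 0;
        Entry e;
        while ((e = queue.poll()) != null) {
            if (e.bet == null) {
                dbMatchState.put(e.matchId, "COMPLETE");
            } else {
                dbBets.add(e.matchId + ":" + e.bet);
            }
            written++;
        }
        return written;
    }

    // The payout script only runs against matches the DB sees as COMPLETE.
    boolean payoutAllowed(String matchId) {
        return "COMPLETE".equals(dbMatchState.get(matchId));
    }

    int dbBetCount() {
        return dbBets.size();
    }
}
```

<p>Because the marker travels in the same queue as the bets, the DB can only see COMPLETE after every earlier bet has been written.  The payout script needs no polling of the app tier, just the state check.</p>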

<p>We have 2 recipes.  First, adding Terracotta and 1.5PC updates to the DB-based application.  Second, using write behind queuing and sometimes guaranteeing the queue's processing order.  There are more but I just want to give you a taste of the sort of thinking you must embark upon when offloading the DB using Terracotta, in a by-hand manner.</p>

<p>As you can see, these recipes won't apply in all use cases.  And now I think we close in on an answer to your question.  Bets are not shared data amongst multiple users.  Bets are per-user, you see?  And thus bets can be asynchronously written back to the DB and we can even build db tasks that should only run when the asynch queue is drained and have no polling.  If bets were in fact as you suggest and predicated on what bets others have placed we couldn't do asynch write behind.  We would instead have to do write through.  If we were not a bookmaker, by the way, and instead a betting platform, then the pattern would shift from asynch write behind to matching engine.  Matching engines can be built in Terracotta that are 100X faster than Oracle RDBMS, but that's a different blog entry.  Nonetheless, I would still not need to write through to a DB nor do I have to give up on offloading the db in such use cases where I share data amongst user threads.  I would just need a different pattern.  In fact, I hear from many customers that they have moved highly contended multi-user write operations to Terracotta from the DB because the DB lock manager deadlocks regularly and Terracotta can handle much much higher write rates (check out the quote from Guy Moller, CTO of Brands4Friends on our terracottatech.com homepage--that is one such use case).</p>

<p><br />
If I were to give you 1 rule it is this:  use the freeze/thaw method.  If you are using an ORMapper and thawing data out of a DB record back into memory every time you use that data in the app, consider keeping that data thawed in Terracotta.  If the thaw occurs, but the data is only thawed once a day, or once a week or once in a month or once in a year, you could leave it frozen.  Thawed data format is good for data being accessed once a second or once a minute or once an hour.  Once you leave data in thawed form, you eventually have to freeze it back if your database is supporting reporting or other back office operations that your app cannot support on its own.  Freezing data that has been thawed for a long time is a write-behind pattern.</p>

<p>Hope this helps.  If not, consider just using Hibernate, Spring, and Terracotta together as in our Examinator reference app.  You can then plug Terracotta in as a 2nd level cache provider and offload the DB as much as possible w/o major app surgery.</p>

<p>Cheers,</p>

<p>--Ari<br />
</p>]]></description>
         <link>http://blog.terracottatech.com/2009/04/answering_a_readers_question_w.html</link>
         <guid>http://blog.terracottatech.com/2009/04/answering_a_readers_question_w.html</guid>
        
        
         <pubDate>Sat, 18 Apr 2009 12:39:33 -0800</pubDate>
      </item>
            <item>
         <title>Know your use case and optimize accordingly</title>
         <description><![CDATA[<p>Ok,  this one WAS going to be short.  But here it is anyways.</p>

<p>&lt;soap-box&gt;<br />
Let's not get caught up in technology for technology's sake</p>

<p>I was reading about Twitter-this and Twitter-that for the past few weeks.  People keep writing about its "architecture" which is really a way of writing about how they would solve the problem had they been a web giant with a killer idea, like Twitter.  Again, they are not, so I have to hearken back to Werner Vogels' tweet last week that said something like...<br />
<blockquote><br />
those who have built massively scaled architectures do not comment on the challenges of those in the middle of building such things</blockquote></p>

<p>So, lots of armchair quarterbacking going on here.  But it has reached the point of absurdity because now everyone seems to be pulling in their favorite technology du jour.  "I can do it this way with Scala."  "No no.  Hadoop, you idiot!"  "Erlang would save them!"  Don't get me wrong, I am not about to dig at these technologies nor am I about to claim Terracotta is better or competes or that Terracotta should be used for Twitter.</p>

<p>I read a post yesterday where the author asserts that on every visit to twitter.com in order to view the tweets awaiting you from friends and general twitter community, you should do a MapReduce operation over a grid of twitter users' tweet data, looking for the ones that match your personal subscription list.  Yes, folks.  That's 1 MapReduce of the entire tweetset per pageview.   And, by the way, when a tweet occurs it can be super fast because it just has to write my tweet to my personal tweet bucket.  Others will get my tweet when they come and check for it.</p>

<p>So, let's break this down.  O(1) for the write operation.  And that constant is very small, and efficient.  Good.  But wait.  For n twitter readers, I need to look at n-1 twitter users' accounts for tweets about which I care.  That's O(n(n-1)) which is O(n^2).  ICK!  I have millions of people viewing twitter an hour and that's an O(N^2) MapReduce (because MapReduce r3w1z!) yet I have hundreds of tweets per second which is O(1) because why?  I don't know.</p>

<p>I think you would want to optimize the read path, not the write path.  Thus, instead of a bulletin board pattern, you want to use a mailbox pattern (and something like Scala) to send tweets to each individual user's mailbox for viewing whenever he or she returns.</p>

<p>By using the mailbox pattern, we are trading off space for time.  Let's break it down again.  A tweet should go to all the interested parties who are following me.  That would be my 10 - 1000 friends plus the general twitter mailbox that all can see.  So the write would be O(n) for n friends (ignoring optimizations like the listener pattern which could arguably make this O(1) as well).  My message, in the worst case, will visit every inbox in the cluster.  But the read is now simple.  Just open my inbox and display all the tweets that have arrived since last check; O(1).</p>
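
<p>The mailbox pattern can be sketched the same way.  Again, this is only an illustration under my own assumptions: the names (Mailboxes, follow, tweet, read) are hypothetical, everything lives in one process, and the general twitter mailbox is left out to keep it short.  The point is where the work happens: the write loops over followers, and the read only touches the reader's own inbox.</p>

```python
class Mailboxes:
    """Push-on-write model: a tweet is copied into each follower's inbox."""

    def __init__(self, users):
        self.followers = {user: set() for user in users}  # author -> readers
        self.inbox = {user: [] for user in users}         # reader -> (author, text)
        self.first_unread = {user: 0 for user in users}   # reader -> inbox index

    def follow(self, reader, author):
        self.followers[author].add(reader)

    def tweet(self, author, text):
        # Write path: one append per follower, O(number of followers).
        for reader in self.followers[author]:
            self.inbox[reader].append((author, text))

    def read(self, reader):
        # Read path: no scan of other users, just the tail of my own inbox.
        start = self.first_unread[reader]
        unread = self.inbox[reader][start:]
        self.first_unread[reader] = len(self.inbox[reader])
        return unread


boxes = Mailboxes(["ari", "bob", "carol", "dave"])
boxes.follow("bob", "ari")
boxes.follow("carol", "ari")
boxes.tweet("ari", "know your use case")

first = boxes.read("bob")    # the new tweet
second = boxes.read("bob")   # nothing new since the last check
nobody = boxes.read("dave")  # dave does not follow ari
```

<p>The price is space: the tweet is stored once per follower instead of once per author.</p>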

<p>There.  Now the operation I do millions of times an hour, through the Twitter API and directly through the web interface, works, and the writes aren't that slow either.</p>

<p>By the way, you will notice that the read-optimized solution we just built is essentially pub/sub, so I can create a massive grid of point-to-point communication sockets to send messages between individuals, and I can push updates instead of having clients pull them.  By getting a push model going, I can easily send the SMS update, send the AIR clients their update, and do a Comet-style conversation with browser-based users to push the updates.  This push probably helps eliminate 50% plus of my traffic by keeping the manual polling to a minimum.  A MapReduce-based solution must be invoked by the user...it shouldn't run on a regular schedule in my opinion. </p>
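
<p>Here is a rough sketch of the push side, assuming each delivery channel (SMS, a desktop client, a Comet connection) can be modeled as a callback.  PushHub, subscribe, and publish are hypothetical names, and the two lists below stand in for real SMS gateways and open browser connections.</p>

```python
class PushHub:
    """Pub/sub: subscribers register a delivery callback per author they follow."""

    def __init__(self):
        self.subscribers = {}  # author -> list of delivery callbacks

    def subscribe(self, author, deliver):
        self.subscribers.setdefault(author, []).append(deliver)

    def publish(self, author, text):
        # Push the tweet to every registered channel; nobody has to poll.
        delivered = 0
        for deliver in self.subscribers.get(author, []):
            deliver(author, text)
            delivered += 1
        return delivered


sms_outbox = []      # stand-in for an SMS gateway
browser_stream = []  # stand-in for an open Comet-style connection

hub = PushHub()
hub.subscribe("ari", lambda author, text: sms_outbox.append(f"SMS from {author}: {text}"))
hub.subscribe("ari", lambda author, text: browser_stream.append((author, text)))

count = hub.publish("ari", "know your use case")
```

<p>One publish call reaches every channel that asked for it, and a user nobody follows costs nothing to publish.</p>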

<p>UPDATE:  If you run the MapReduce on a schedule on behalf of all twitter users, I just realized every single run is an O(n^2) algorithm, whether or not anyone is looking.  Worst I have EVER seen.  Think about it...n users each visiting n-1 other users' tweet buckets looking for tweets, which is n(n-1) bucket visits.  Yes.  On a regular schedule, you will attempt an n^2 data operation.</p>

<p>Thus, meta-refresh for browsers and other polling-based architectures would flood my system with more and more inefficient traffic.</p>

<p>In sum, know your use case.  Optimize for the read path versus the write path.  Optimize for push versus pull.  Don't just use MapReduce or Scala or Erlang or even Terracotta because you can.  Use any of us because we solve the business problem, top-down, in the most appropriate way.<br />
&lt;/soap-box&gt;</p>

<p>There is no silver bullet,</p>

<p>--Ari</p>]]></description>
         <link>http://blog.terracottatech.com/2009/04/know_your_use_case_and_optimiz.html</link>
         <guid>http://blog.terracottatech.com/2009/04/know_your_use_case_and_optimiz.html</guid>
        
        
         <pubDate>Thu, 16 Apr 2009 09:44:37 -0800</pubDate>
      </item>
            <item>
         <title>Terracotta working on top of Coherence</title>
         <description><![CDATA[<p>After quite a lot of work I got it to function.  The thing that might shock you is there is not a lot of value in this config.  It doesn't do anything special.  It doesn't do anything new.  It doesn't make anything better.</p>

<p>Why?</p>

<p>Because all I did is rename the host I was running the Terracotta Server on to "Coherence."  There.  Terracotta running on Coherence.</p>

<p>April Fool's :)</p>

<p>--Ari</p>]]></description>
         <link>http://blog.terracottatech.com/2009/04/terracotta_working_on_top_of_c.html</link>
         <guid>http://blog.terracottatech.com/2009/04/terracotta_working_on_top_of_c.html</guid>
        
        
         <pubDate>Wed, 01 Apr 2009 07:06:57 -0800</pubDate>
      </item>
            <item>
         <title>The fallacy of Peer to Peer</title>
         <description><![CDATA[<p>And one more image.  The fallacy, BTW, is that you don't care if you can push to all your peers in parallel.  The fact that you have to is crazy and unscalable.  Data mastering / partitioning is way more scalable, with no loss in reliability and way more availability than a mesh (peer-to-peer or multicast).</p>

<p><img alt="grids%20n%20fabrics.png" src="http://blog.terracottatech.com/grids%20n%20fabrics.png" width="626" height="570" /></p>

<p>--Ari</p>]]></description>
         <link>http://blog.terracottatech.com/2009/03/the_fallacy_of_peer_to_peer_1.html</link>
         <guid>http://blog.terracottatech.com/2009/03/the_fallacy_of_peer_to_peer_1.html</guid>
        
        
         <pubDate>Fri, 06 Mar 2009 16:17:45 -0800</pubDate>
      </item>
            <item>
         <title>2-tier coherent clustering</title>
         <description><![CDATA[<p>Contemplate this one for a while</p>

<center><H1>Terracotta</H1></center>
<img alt="2-tier%20coherence.png" src="http://blog.terracottatech.com/2-tier%20coherence.png" width="550" height="491" />

<p><br />
<center><H1>Relational DB</H1></center><br />
<img alt="1-tier%20coherence.png" src="http://blog.terracottatech.com/1-tier%20coherence.png" width="520" height="408" /></p>

<p>I will tell you more about it all soon.</p>

<p>--Ari</p>]]></description>
         <link>http://blog.terracottatech.com/2009/03/2tier_coherent_clustering_1.html</link>
         <guid>http://blog.terracottatech.com/2009/03/2tier_coherent_clustering_1.html</guid>
        
        
         <pubDate>Thu, 05 Mar 2009 22:21:46 -0800</pubDate>
      </item>
      
   </channel>
</rss>
