I have been remiss in not answering a reader's very intelligent question. Here's the question.
Write-behind would be wonderful for performance, but how do you handle db transaction errors? What if someone places a bet (in your real-life example), and the bet fails to commit? How do we recover from that? Was the user originally told (when the bet was marked dirty) that the bet was successfully placed (and committed)? Or is the user told that the bet is “posted” or something else that implies that the bet is not truly committed yet?
If the application considers the objected committed once it is marked dirty (and/or placed on the write-behind queue), then I assume that there could be potential consistency issues with write-behind, where one box has the bet placed (on the queue, but not in the db), but the other boxes do not see the bet placed?
And even on the box that placed the bet, if the db is queried with a projection that includes the bet (rather than the Bet domain object; eg. Some report of all bets, joined in with the user and their account status), then I assume the query will return inconsistent results (ie. The report of bets will not include the bet on the write-behind queue)?
Not that any of this is an argument against write-behind; I just want to understand the use cases where this can be applied safely.
Thank you.
Now to attempt an answer.
Write-behind and detaching are not a panacea. I mean that it is true that this pattern cannot always be used. But at the same time it does not suffer from the issues you raise, at least not in this use case.
Always start from the top down as we would in this online betting use case. What is the user experience we want and why? Make design decisions from there. In the questions you ask, I sense a desire to keep the database as the master of all data which is a bottom-up approach. Let's cover the pitfalls of a bottom-up or database-up approach first.
As an example, I was on a panel with Brian Goetz and several "grid" vendors in 2008 at QCon San Francisco. A similar question to yours was asked--"Can I really always build asynchronous apps?" The answer from all the grid vendors was a thundering, "Yes!" They need you to believe you can. My answer was, "No. You can't always afford it. Asynchronous nature might be hard to add given your business requirements." Brian got much more concrete and asked the audience to think about EBay and Amazon. Amazon can be simplified down to an e-catalog site. This is not to take away from their monumental achievements in scale and reliability. I only mean to say that they have a massive need for caching where people buy maybe 5 - 10% of the time so 90% of visits are readers. EBay has a massive write rate and those writes are highly localized to auctions ending in the next hour or minute.
EBay and Amazon need to make almost entirely inverse architecture decisions yet both are called eCommerce sites. Both have a catalog. Both have a user account management function. And, both have payment clearing capabilities. Amazon also has fulfillment capabilities and warehousing concerns that EBay does not but that is beside the point I am making.
Amazon might tell you that Sleepycat or voldemort or memcache rule (they only use Sleepycat, BTW. Voldemort is an OSS clone of their Dynamo approach). EBay would tell you that partitioned Oracle works great. The two are both correct. EBay needed transactional ACID updates in many of their pageviews whereas Amazon needed a different optimization (read Amazon's Dynamo paper for their approach to eventually consistent data storage).
Back to our topic. In the case of betting, you have:
- a catalog of games / matches / things on which to place bets
- User account info
- payment clearing system
- placed bets which are like items you have sold that you have yet to fulfill / deliver
and more.
We can surmise together that when I place a bet and I get an email or web-based confirmation saying my bet has been placed for, let's say, $100, at 2:1 odds in the affirmative (meaning I think the results will be in favor of the bet direction), I fully expect to get paid $200 after the game if I win and to have my $100 debited if I lose. If I win and I do not get the $200, there will be no excuses the bookmaker can make that will have me come back. A bookmaker who does not know the commitments he has made will soon be out of business.
So as an architect, I start to think about coherence and transaction isolation, and databases sound really good. But are they necessary? No. What I really want is:
- all my commitments on disk so I never risk losing anything
- make that on more than 1 disk in case I even lose my disks
- coherent recording of the odds and the amounts of the bet
- a transactional view of my users' account balances so I don't extend credit when I do not intend to.
- a fast, efficient way to search for, display matches and also a coherent way to update odds as my staff learn new info (say the quarterback breaks his leg).
Mapping Terracotta to this business requirement I would suggest:
- well, Terracotta is disk-based just like the DB so I can in theory use it in more places than I would use a cache
- Yes, Terracotta server array will have at least 2 copies or more if I configure properly, on 2 or more separate _machines_. That's lots of redundancy, more than the DB in fact.
- Terracotta operations have isolation levels just like a DB. There are read locks, write locks, synch-write locks and concurrent locks. I would likely use a combination of these to record a bet that is being placed. Specifically, read lock the catalog to bet against the currently-available odds, and write-lock the user's account or my list of placed bets or both to debit the account and write the placed bet.
- I can use readwritelocks in util.concurrent to compose a transaction across a debit + credit operation against a user's account. But, I want the accounts recorded in the DB and cannot use Terracotta in a 2PC manner, so I might choose to cache the account in Terracotta but keep it in the DB, using instead a 1.5PC (I covered 1.5PC in the original blog and will not cover it again here).
- Perhaps put the catalog of games in the DB, and cache the catalog in Terracotta and when my back office content management tools change the DB, also send a JMS message to a node in my app cluster to update the in-memory cache as well, or I could just make my back-office updater a part of my production app cache in Terracotta.
In summary, cache all the read only data in Terracotta to offload the DB of its otherwise wasted usage under web-onlookers. Put user financial info in the DB and manage account balances there, using DB transactions to update. Then put the bets into Terracotta, lazily flushing to the DB so that a back office SQL script can run payouts and collections at the end of each match, against people's accounts. This will offload not just read-only onlookers but actual updates and business transactions as well, at least for a time. At Terracotta we call this "shaving the peak load" where the DB does not have to be sized large enough to handle traffic spikes.
To make this all work, we need a few more details though. We have to make sure that we write an "end of match" eventing system that makes sure to flush all placed bets to the DB so that we can clear payments accurately at the end of a match from entirely within the DB. Could you put accounting in Terracotta? Yes, but all your back office tools would have to be rewritten to work against Java web services of some sort instead of against the DB they most likely work against today.
As for a stable view of bets and odds that are constantly being changed by the back office team members who are trying to optimize company revenue and there are a few challenges. First, the odds. Easiest way to update them in a transaction with the DB is to have the back office app that changes the odds, work within a Terracotta transaction.

Now, the odds will be updated or fail and the business will know what risk the failure poses. Our employees will always get clear messages from the Office updater app that they successfully changed the odds for a match or that the change failed. They can even choose to stop all betting if something goes horribly wrong. But they will always know the status of the games and odds.
Now, let's secure our betting model. Bets are being placed into Terracotta and onto 2 disks so they are safe. And the odds at which they are placed were valid odds at the time given we made a good back office tool for updating odds. We just need to make sure all bets are flushed to the DB before the DB executes its payout process. Should be simple. When a match begins, flip the DB state to "INCOMPLETE" for that match in some table. the payout script will refuse to run payouts against incomplete matches. Now, have the app cluster flip the match to complete using a timer task of some sort. Put the COMPLETION marker in the asynch write-behind queue with bets. When the app tier decides to COMPLETE the match, all subsequent bet attempts will be rejected. And a COMPLETION marker will be placed in the queue that will flip the DB state. But it is placed in sequence or in order with all the bets. So all you need is an in-order asynch write behind and bets can be placed asynch with the DB commit. Let's take a look at in-order queuing in a picture. It is a kewl technique that can help offload the db with fewer worries.

We have 2 recipes. First, adding Terracotta and 1.5PC updates to the DB-based application. Second, using write behind queuing and sometimes guaranteeing the queue's processing order. There are more but I just want to give you a taste of the sort of thinking you must embark upon when offloading the DB using Terracotta, in a by-hand manner.
As you can see, these recipes won't apply in all use cases. And now I think we close in on an answer to your question. Bets are not shared data amongst multiple users. Bets are per-user, you see? And thus bets can be asynchronously written back to the DB and we can even build db tasks that should only run when the asynch queue is drained and have no polling. If bets were in fact as you suggest and predicated on what bets others have placed we couldn't do asynch write behind. We would instead have to do write through. If we were not a bookmaker, by the way, and instead a betting platform, then the pattern would shift from asynch write behind to matching engine. Matching engines can be built in Terracotta that are 100X faster than Oracle RDBMS, but that's a different blog entry. Nonetheless, I would still not need to write through to a DB nor do I have to give up on offloading the db in such use cases where I share data amongst user threads. I would just need a different pattern. In fact, I hear from many customers that they have moved highly contended multi-user write operations to Terracotta from the DB because the DB lock manager deadlocks regularly and Terracotta can handle much much higher write rates (check out the quote from Guy Moller, CTO of Brands4Friends on our terracottatech.com homepage--that is one such use case).
If I were to give you 1 rule it is this: use the freeze/thaw method. If you are using an ORMapper and thawing data out of a DB record back into memory every time you use that data in the app, consider keeping that data thawed in Terracotta. If the thaw occurs, but the data is only thawed once a day, or once a week or once in a month or once in a year, you could leave it frozen. Thawed data format is good for data being accessed once a second or once a minute or once an hour. Once you leave data in thawed form, you eventually have to freeze it back if your database is supporting reporting or other back office operations that your app cannot support on its own. Freezing data that has been thawed for a long time is a write-behind pattern.
Hope this helps. If not, consider just using Hibernate, Spring, and Terracotta together as in our Examinator reference app. You can then plug Terracotta in as a 2nd level cache providor and offload the DB as much as possible w/o major app surgery.
Cheers,
--Ari