What's the Easiest, Most Scalable, Enterprise Architecture You Know?
I just finished an offsite where all our execs got together with our sales reps and walked through the accounts we won this quarter. What were the patterns? What resonated with the users about our product and our value proposition?
Here's the pattern. Everyone chose Terracotta because it is transparent. Several also chose it because it is open source. But the pattern was that people wanted simple scalability and _then_ tested and did proofs of concept around performance and availability. In deployment, they all had roughly the application they had started with, but it was running on Terracotta and could be spread across application nodes with no recurring changes. In short, the business's problem was scalability and availability, but the developer's challenge was simplicity and time to market.
The punchline is that scalability and availability mean everything to application teams. They are what keep the paychecks showing up in their bank accounts each month. But if teams can like the application they build, they will pick that path of least resistance to getting scalability and availability. Thus, I conclude that simplicity is key.
A few examples:
- One customer wanted to drop in a cache of their system of record. It had to do tens of thousands of queries per second, but it had to work within their existing architecture.
- One customer wanted to drop in clustering of reference data for what they called a map of maps. They started with clustered caching and quickly realized that the nested maps would lead to HUGE serialization payloads, which was a non-starter. Before rewriting and flattening their object graph, they tried Terracotta. And it worked.
- One customer wanted to distribute a query engine they had built using ExecutorService and multi-threaded code. They found that Terracotta could spread that query engine across 10 servers without changes because Terracotta supports many util.concurrent constructs out of the box.
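To make the last example concrete, here is a minimal sketch of the kind of single-JVM query engine that customer might have started with: plain `ExecutorService` code that fans a query out over shards of data and sums the partial results. The class name, the shard layout, and the "count matching rows" query are all hypothetical, and nothing here is Terracotta-specific. The point is only that this is the shape of code one would hope to keep unchanged.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class QueryEngineSketch {
    // Hypothetical query: count the rows in one shard that contain a term.
    static Callable<Integer> countMatches(List<String> shard, String term) {
        return () -> {
            int hits = 0;
            for (String row : shard) {
                if (row.contains(term)) hits++;
            }
            return hits;
        };
    }

    // Fan the query out over all shards and sum the partial results.
    static int query(ExecutorService pool, List<List<String>> shards, String term) {
        List<Callable<Integer>> tasks = new ArrayList<>();
        for (List<String> shard : shards) {
            tasks.add(countMatches(shard, term));
        }
        try {
            int total = 0;
            for (Future<Integer> partial : pool.invokeAll(tasks)) {
                total += partial.get();
            }
            return total;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("query interrupted", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("query task failed", e);
        }
    }

    public static void main(String[] args) {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            List<List<String>> shards = Arrays.asList(
                Arrays.asList("red apple", "green pear"),
                Arrays.asList("red plum", "red cherry"));
            System.out.println(query(pool, shards, "red")); // prints 3
        } finally {
            pool.shutdown();
        }
    }
}
```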
So the simplicity pattern stood out to me, especially when juxtaposed with my experience at QCon San Francisco, where every "real world" architecture is predicated on partitioning and "eventually correct" algorithms.
Partitioning asserts that scalability comes from "add a brick" scale-out approaches. Add a node and get a node's worth of capacity.
Partitioning asserts that availability comes from stateless database-backed storage approaches. This does not conflict with scalability because a database instance can be isolated down to a function (order management and customer management could be in 2 separate instances--vertical partitioning), or to key range (users starting in "A" can be stored in a separate instance from those starting in "B"--horizontal partitioning).
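As a rough illustration of the two styles, here is a minimal routing sketch. The instance names (`orders-db`, `users-db-A`, and so on) and the class itself are made up for the example. A real system would map these to connection pools or data sources.

```java
import java.util.Locale;

public class PartitionRouter {
    // Vertical partitioning: each business function gets its own instance.
    static String byFunction(String function) {
        switch (function) {
            case "orders":
                return "orders-db";
            case "customers":
                return "customers-db";
            default:
                throw new IllegalArgumentException("no instance for " + function);
        }
    }

    // Horizontal partitioning: one instance per leading letter of the user name.
    static String byKeyRange(String userName) {
        if (userName == null || userName.isEmpty()) {
            throw new IllegalArgumentException("user name must not be empty");
        }
        char first = userName.toUpperCase(Locale.ROOT).charAt(0);
        if (first < 'A' || first > 'Z') {
            return "users-db-other"; // catch-all for names outside A-Z
        }
        return "users-db-" + first;
    }

    public static void main(String[] args) {
        System.out.println(byFunction("orders")); // prints orders-db
        System.out.println(byKeyRange("alice"));  // prints users-db-A
        System.out.println(byKeyRange("Bob"));    // prints users-db-B
    }
}
```

Notice that the routing rule lives in application code. That is exactly the part that has to change when the boundary turns out to be wrong.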
And what of simplicity? Well, everyone who has built such an architecture has told me that it is no harder than building the simple form of the application. Why, then, do all the people I speak to at conferences, and all the customers we win, say that "the less I have to partition, the happier I am"?
Given that I helped build and run one of these for 4 years, I think I know the answer. Transparency. When the partition is on the wrong boundary, what do you do? Change the code. When the partition needs to be chopped to an even finer grain, what do you do? Hash to more buckets, drop the data, reload it under the new number of buckets, and you are up and running again. When the partition fails to partition further, what do you do? Start tuning and caching.
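To see why "hash to more buckets" usually means dropping and reloading the data, here is a small sketch that assumes the simplest scheme, bucket = hash(key) mod bucket count. The key format and the class name are invented for the example. With this scheme, changing the bucket count typically reassigns most keys, not just the share needed to fill the new bucket.

```java
public class RehashSketch {
    // Plain modulo hashing; floorMod keeps the result non-negative.
    static int bucket(String key, int buckets) {
        return Math.floorMod(key.hashCode(), buckets);
    }

    // Count how many of the synthetic keys land in a different bucket
    // once the bucket count changes.
    static int countMoved(int keys, int oldBuckets, int newBuckets) {
        int moved = 0;
        for (int i = 0; i < keys; i++) {
            String key = "user-" + i; // hypothetical key format
            if (bucket(key, oldBuckets) != bucket(key, newBuckets)) {
                moved++;
            }
        }
        return moved;
    }

    public static void main(String[] args) {
        int keys = 10_000;
        int moved = countMoved(keys, 4, 5);
        System.out.println(moved + " of " + keys + " keys change buckets going from 4 to 5");
    }
}
```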
In almost all cases, IT engineers are buying and deploying more hardware, architects are tuning infrastructure services like MQ, ESB servers, and database servers, and developers are changing code to compensate for the presence of more servers and services.
The most scalable architectures I know are partitioned. The most available architectures I know are stateless. And the simplest architectures I know are stateful. It all conflicts.
I think the more we can deliver stateful development models that partition for scale and persist for durability at runtime, the more we lessen the trade-offs.
I am going to think about this some more, but for now I will end with a use case where a user had a caching service partitioned (vertically) outside the application. That caching service could partition the data, and spread it out any way it saw fit because that cache was abstracted from the core application code onto both its own interfaces and its own servers!
Well, if Terracotta is already a separate piece of software from your application, why can't you just write an application, configure which objects need to be shared and cached, and then let us partition the cache transparently inside our infrastructure software? We can, in fact. I need to think more about this!
I want the simplest, most scalable enterprise architecture we all know to be stateful in development, stateless at runtime, and all transparent to the Java developer.