Scale Up and Speed Operations with New Application Architectures
“Batteries included” programming languages let coders do several important things (partial list):
- Prototype new, customer-facing applications quickly (at least in scaled-down forms sufficient to handle a few users, stimulate feedback and begin the virtuous cycle of agile sprints).
- Write terse but powerful, internal-facing, ad-hoc applications that automate workflows and speed operations.
- Iterate research applications, where an algorithmic payload and the data it works on are the stars of the show, and “housekeeping,” though always necessary, takes a conceptual back seat. Examples include machine learning (e.g., training a model), data science (e.g., performing map/reduce operations on large datasets), etc.
- Quickly create terse but extremely powerful stateless routines for deployment on frameworks that handle the details of dispatch and scale-out. Examples include ‘functions’ deployed on so-called ‘serverless compute’ platforms, used with growing frequency to perform batch processing in response to data inbound from mobile and IoT devices, to provide minimalistic web services on demand, or to serve other use cases (see the sketch after this list).
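As a minimal sketch of that last point, here is what such a stateless ‘function’ might look like, written in the shape of an AWS Lambda-style Python handler. The event fields and the temperature-conversion job are illustrative assumptions, not a prescription:

```python
import json

def handler(event, context):
    """Stateless entry point: everything it needs arrives in the event,
    and nothing is kept in memory between invocations."""
    # Hypothetical payload from a mobile or IoT device (field names are illustrative).
    reading = json.loads(event.get("body", "{}"))
    celsius = reading.get("temperature_c", 0.0)

    # Do the one small job this function exists for, then return.
    fahrenheit = celsius * 9 / 5 + 32
    return {
        "statusCode": 200,
        "body": json.dumps({"temperature_f": fahrenheit}),
    }
```

The platform, not your code, handles dispatch, concurrency, and scale-out; the function itself stays tiny and disposable.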
Of course, coders also use powerful, rich languages to create real applications for production. The simplest way of doing this is to build a so-called monolith: a single, do-everything program running on an application server and talking to clients (also called a “two-tier” architecture). In the expanded “three-tier” architecture still used by many enterprise applications, the application server fronts for a database.
The problem with monoliths (and groupings thereof) is that they’re not efficiently scalable. In recent decades, three-tier apps have been extended by webifying the front end and replacing bespoke clients with browsers. But in many cases, the main gain has been to prettify the user interface. A single box (containing the webserver, application, and the server side of any presentation framework logic) is still doing everything except storing the data, and the data is being stored on another single box.
The only dead-simple way to scale this is up: make the servers faster, give the database more storage space. If you need to scale further, you can cluster the database, which isn’t simple, but at least it’s the database provider’s problem and (bonus!) nets you greater resiliency (clustering a database lets the software write data across a number of physical volumes in a way that prevents any data loss if a single volume goes down).
If your application server is burdened, however, you need to think about how your application works. The best case is if it’s truly stateless: i.e., each time someone connects to your app, they make a request, the app does a thing (e.g., serves a web page), and that’s it, rinse, repeat. If so, you can duplicate your app server and stick a load balancer in front of the pool to make sure requests bounce to an instance of the app server that’s working and available. Many public websites scale this way.
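To make “truly stateless” concrete, here is a minimal sketch (using only Python’s standard library) of an app server that holds nothing between requests. You could run identical copies of it on several machines and let a load balancer route each request to whichever instance happens to be available:

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

class StatelessHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Everything needed to answer is derived from the request itself;
        # no per-user data survives once the response is sent.
        body = f"You asked for {self.path}\n".encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

if __name__ == "__main__":
    # Any number of identical instances can sit behind a load balancer.
    HTTPServer(("0.0.0.0", 8000), StatelessHandler).serve_forever()
```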
But if you need to maintain any kind of state – even just a session for a logged-in user while they make several requests in a row – you can start running into problems. Web languages/frameworks will maintain state for you between application invocations (which is part of their charm), but if an app server fails, users will be bounced to a new server that doesn’t automatically know they’re in a session, much less what step they’re at in your process.
So now you need to solve this problem: either by configuring your load balancer to help you maintain session continuity (and perhaps adjusting your app to favor the load balancer’s way of signaling sessions, rather than its own built-in method(s)) or by deploying yet another dependency on each of your app servers. Something like memcached would do the trick here, as it lets you share session state across the server pool, so any app server can pick up where another left off. (And bonus! It also unburdens your application stack by caching requests and serving redundant ones from memory.)
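A rough sketch of the second option, assuming a memcached daemon reachable at localhost:11211 and the pymemcache client library: session state gets parked in the shared cache instead of in any one app server’s memory, so any instance behind the load balancer can rehydrate a logged-in user’s session:

```python
import json
from pymemcache.client.base import Client

# One shared memcached instance (or pool) that every app server can reach.
cache = Client(("localhost", 11211))

def save_session(session_id: str, data: dict, ttl_seconds: int = 1800) -> None:
    # Store the session centrally rather than in this app server's memory.
    cache.set(f"session:{session_id}", json.dumps(data).encode("utf-8"),
              expire=ttl_seconds)

def load_session(session_id: str) -> dict:
    # Any app server in the pool can pick up where another left off.
    raw = cache.get(f"session:{session_id}")
    return json.loads(raw) if raw else {}
```

The key and helper names here are illustrative; the point is simply that session state lives outside the individual app server.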
That doesn’t, however, completely solve your potential scaling problems. Now, for example, you might notice that certain specific user behaviors do far more than others to bog down your app servers. “If only I could scale just that small part of my application,” you think, “instead of building out more and more copies of the whole thing.” A quick review of your Amazon AWS bill confirms your suspicions: “Wow. That could save me a ton of money.”
Ask around the office, and your DevOps folks concur. “That computation is the fastest-moving part of our app,” they say. “It’s our core intellectual property, critical to our business model, and gets lots of changes with each release. We’d love to break it out and version it separately. It would let us move faster on the dev side, and it would keep us from having to rebuild, retest, and redeploy the entire stack each time we make a change to this one thing.”
So they do, and all sorts of wins emerge: better scaling, better resource utilization, better user experience, better reliability, fewer failed deploys (because most deploys are of new revs of single components), easier rollbacks from problematic deploys (because ditto: just roll back to the old version of the component), and money saved.
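A rough sketch of what that breakout can look like from the calling side, assuming (purely for illustration) that the hot computation now lives in its own small “pricing” service at a hypothetical internal URL. What used to be a local function call inside the monolith becomes a network request to a separately deployed, separately versioned component:

```python
import json
import urllib.request

# Hypothetical endpoint for the extracted, independently scaled component.
PRICING_SERVICE_URL = "http://pricing-service.internal:8080/quote"

def get_quote(cart: dict) -> dict:
    """Formerly a local function call inside the monolith; now an HTTP
    request to a service that scales and versions on its own schedule."""
    payload = json.dumps(cart).encode("utf-8")
    request = urllib.request.Request(
        PRICING_SERVICE_URL,
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=2) as response:
        return json.loads(response.read())
```

Note the trade-off that comes with this: a nanosecond-scale function call has become a millisecond-scale network hop, which is part of why (as discussed at the end of this section) not every monolith should be broken up.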
In a nutshell, the picture we’ve drawn above explains most of what’s happening in software architecture today. Home-grown, naive software monoliths are being replaced: first by multi-tier solutions that heavily leverage a wide range of open source and commercial components (to achieve greater resilience and scalability, among many other reasons), and then, more gradually (and often by leveraging even more third-party components), refactored into collections of small, single-purpose, isolated, stateless parts that scale independently.
We’ll revisit this “more and more moving parts” theme again and again, from different angles, as we discuss dependencies, automation, containers and microservices, and the discipline of continuous integration.
Portability, Self-Healing, Elasticity
Stateless applications, microservice architectures, and containerization together deliver huge (potential) benefits mostly because of how they interoperate with container execution and orchestration environments like Docker Swarm, Kubernetes, and Mesos DC/OS.
They’re portable, which enables many ‘hybrid cloud’ efficiencies. Individual containers and multi-container deployments are instantly portable between (for example) version-identical on-premises and hosted container orchestration environments. This solves the basic problem of workload mobility that has long stood in the way of beneficial cloud use cases, such as quickly acquiring public cloud capacity, deploying (elastically scalable) versions of your key applications there, and vectoring traffic to them during periods of high traffic.
They’re self-healing, which changes everything about how you manage critical applications. Bugs always exist. Components sometimes crash. Nodes go down. But when you’re running containerized, stateless-microservice applications on modern orchestrators, this can be something other than an emergency. Orchestrators provide built-in high availability services that self-heal comprehensively: i.e., if an orchestrator node fails or an app component crashes, the orchestrator can be configured to restart affected components or deployments in remaining capacity. If these components are stateless, application availability is restored very quickly (carefully-designed and deployed apps may not even lose active connections if they’re able to recognize a redirected user and recover state from persistent storage).
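Orchestrators typically learn about component health by probing an endpoint the application exposes (Kubernetes, for example, calls these liveness and readiness probes). A minimal sketch, with the /healthz path and the dependency check as illustrative assumptions: a passing probe leaves the container alone, a failing one lets the orchestrator restart or reschedule it:

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

def dependencies_ok() -> bool:
    # Illustrative placeholder: check a database connection, a queue, etc.
    return True

class HealthHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path == "/healthz" and dependencies_ok():
            self.send_response(200)   # healthy: orchestrator takes no action
        else:
            self.send_response(503)   # failing probes trigger restart/reschedule
        self.send_header("Content-Length", "0")
        self.end_headers()

if __name__ == "__main__":
    HTTPServer(("0.0.0.0", 8080), HealthHandler).serve_forever()
```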
By that same token, the orchestrator can, under most circumstances, readily heal itself from worker and even from master node failures – even automatically, if the orchestrator gets some support from underlying IaaS and automation. Indeed, in an infrastructure-as-code environment where orchestrator deployment is fully automated, you can treat orchestrator capacity (save for persistent storage) as disposable, because it takes less than ten minutes to auto-deploy a typical orchestrator from first principles, on pre-prepared VMs or bare metal machines.
They’re elastic. Adding worker capacity to an orchestrator is easy, fast, and readily automated. At that point, workloads and deployments can be scaled manually, or simple triggers can be developed to automate elastic scale-out (and scale-back) into available capacity.
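A naive sketch of what such a trigger might look like. The metric source and scaling call below are simulated stand-ins for whatever your orchestrator or cloud provider actually exposes; the loop simply illustrates the shape of the logic:

```python
import random
import time

# Hypothetical stand-ins for real metrics and scaling APIs; here they
# just simulate values so the sketch runs on its own.
replica_count = 2

def average_cpu_percent(deployment: str) -> float:
    return random.uniform(10, 90)  # pretend utilization metric

def set_replica_count(deployment: str, replicas: int) -> None:
    global replica_count
    replica_count = replicas
    print(f"{deployment}: scaled to {replicas} replicas")

def autoscale(deployment: str, low: float = 30.0, high: float = 70.0) -> None:
    """Scale out when the deployment runs hot, scale back when it idles."""
    while True:
        cpu = average_cpu_percent(deployment)
        if cpu > high:
            set_replica_count(deployment, replica_count + 1)   # scale out
        elif cpu < low and replica_count > 1:
            set_replica_count(deployment, replica_count - 1)   # scale back
        time.sleep(60)   # re-evaluate once a minute
```

In practice you would lean on the orchestrator’s own autoscaling facilities rather than a hand-rolled loop, but the trigger logic they implement is much the same.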
Kids: Don’t (Necessarily) Try This at Home
Building (and then operating and maintaining) microservice-type distributed applications from scratch is hard. Breaking up working monoliths into microservices is also hard, and may, in some cases, be entirely impractical. Further, there are efficiency advantages to some monolithic designs (example: local function call overhead vs. network message-passing overhead) that shouldn’t be abandoned lightly. In this blog post, Martin Fowler, a famed software engineer, argues in favor of monoliths. In this one, Jake Lumetta, CEO of ButterCMS, pokes a few holes in the “always start with a monolith” conventional wisdom.