When you deploy a web application in the cloud, you typically select a region for your cloud resources first — a region being the physical datacenter location where your cloud-based application will live. It’s common to choose just one region, but you can also stitch multiple regions together.
Why would you choose multi-region architecture? Three words: resiliency, latency and localization.
Most cloud resources you will need for any application must be placed within a region (if they’re not, they’re called global resources). Examples of cloud resources include virtual machines, databases, and load balancers.
Regions are independent geographic areas that consist of zones. Zones and regions are logical abstractions of underlying physical resources: physical datacenters, filled with real hardware — servers, storage and networking infrastructure.
Cloud regions are available across multiple continents. Regions are divided into at least three availability zones (AZs), which are individual fault domains in a region. Google Cloud, for example, currently operates 97 AZs in 32 regions. If one zone in a given region becomes unavailable, it is very unlikely that other zones in the same region will have the same issue. (And if the whole region goes down, we’ve still got you covered – keep reading.)
As Cockroach Labs cofounder Spencer Kimball says, “We know that sometimes shit happens. And, at scale, shit is *always* happening.” Multi-region applications are resilient in ways that single region and single AZ applications cannot match. Single availability zone architecture can survive machine failures. Single region architecture, spread across multiple AZs, can survive an AZ failure. But only multi-region application architecture can survive when an entire region fails.
Not every application has users scattered around the entire planet, but most apps very much want to deliver a good user experience across a single country. Even a two or three region deployment has the same challenges as a full-on global deployment, so either way applications must be built with the same architectural primitives. Multi-region has become the new global!
Multi-region has built-in benefits for user and developer both: low latency and data residency. By serving your application and bringing your data close geographically to your user, you get the best possible performance and lowest possible latency — and they get the best possible experience. As a developer, localizing data by region lets you simply handle data residency requirements in the data layer rather than through heavy application architecture.
Distributed systems are inherently complex, and designing multi-region application architecture has traditionally been difficult (and often painful to deploy). Two relatively recent developments help take away the majority of that complexity and pain: Google Cloud Run and CockroachDB Dedicated.
On Cloud Run, your code can run either continuously as a service (responding to web requests or events) or as a job (performing work and quitting when the “job” is done). Both services and jobs run in the same environment and can use the same integrations with other services on Google Cloud.
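To make the "service" mode concrete, here is a minimal sketch of a process that could run as a Cloud Run service. It relies on the Cloud Run convention that the container listens on the port given in the `PORT` environment variable (8080 when unset). The handler name, the helper names, and the response text are illustrative assumptions, not part of any Cloud Run API.

```python
import os
from http.server import BaseHTTPRequestHandler, HTTPServer


def resolve_port(env):
    """Return the port to listen on; Cloud Run passes it in the PORT variable."""
    return int(env.get("PORT", "8080"))


class HelloHandler(BaseHTTPRequestHandler):
    """Hypothetical handler that answers every GET with a short text body."""

    def do_GET(self):
        body = b"Hello from a Cloud Run service\n"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def make_server(port):
    """Bind on all interfaces so requests forwarded to the container reach us."""
    return HTTPServer(("0.0.0.0", port), HelloHandler)


def serve():
    """Block and handle requests until the platform stops the instance."""
    make_server(resolve_port(os.environ)).serve_forever()
```

Calling `serve()` from the container's entry point would start the service. A job, by contrast, would simply do its work and exit instead of listening for requests.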
CockroachDB Dedicated gives you instant access to a horizontally scalable relational database in the cloud that you can spin up in seconds without having to worry about hardware details or capacity planning.
When developers build with Cloud Run and CockroachDB, they liberate themselves from configuring, operating, and scaling both their cloud service and database. Both provide built-in autoscaling and reliability, and using them together also takes away many of the difficulties associated with building a multi-region application. Cloud Run serves your application via CDN right to where your users are – wouldn’t it be great if your data was right there next to them, too? Previously, to make that happen, developers had to plan, build for, and manage redundant databases in each region. Now, with CockroachDB Dedicated as a single-instance database that spans multiple regions, you just `set region` and you’re done. It’s really that basic.
But what does the architecture itself look like?
Let’s see how to design a multi-regional web application with Cloud Run and a CockroachDB cluster on Google Cloud. As a bonus, we will also cover how to configure your application to remain available even if an entire region fails, plus how to improve response times of requests involving database reads in all regions.
The logical architecture of the system is refreshingly uncomplicated. From the app developer point of view, you’re just working with an application server connected to a single database.
The application server is Cloud Run and the database is CockroachDB. Let’s take a quick look at considerations for working with each, before combining them and diving into the multi-region architecture.
Cloud Run is a container platform on Google Cloud that runs and auto-scales your container directly on top of Google’s infrastructure. The defining characteristics of Cloud Run are:
You deploy applications to Cloud Run packaged in a container image. Expert practitioners usually like container images, because they provide a lot of freedom – any binary (more specifically, all Linux x86_64 ABI binaries) can run on Cloud Run.
However, some users (experts included) value convenience more, or they want Google to be responsible for building and packaging their application.
As this mini-tutorial shows, you can also use a source-based deploy on Cloud Run. Here’s how to scaffold and deploy a sample Node.js app on Cloud Run in three steps (try it out!):
With a source-based deployment, Cloud Run takes a directory with source code and builds and packages it into a container image using the open source Google Cloud Buildpacks. Once again, removing complexity: now you don’t have to worry about creating that perfect Dockerfile.
For hands-on information for building production-ready applications with Cloud Run, you can download the complete O’Reilly book Building Serverless Applications with Google Cloud Run for free.
As a distributed database, CockroachDB runs on a cluster of nodes. Here’s the great part about CockroachDB Dedicated: As an application developer, you don’t need to know about that, or even care. You’re using one database, which just happens to be spread across multiple nodes, which can easily be in different regions.
If disaster strikes and a node in a CockroachDB cluster fails, your database remains available for querying. You can even configure your database to be resilient against the failure of all the nodes in any given region.
(In contrast, traditional database systems such as MySQL and PostgreSQL are designed to run on a single node. If you want to run multiple consistent copies of the same database in multiple locations, you’ll need to set up replication, automate failover in case the primary node fails, and figure out how to maintain data consistency, which is especially hard if you allow writes to multiple nodes.)
CockroachDB provides consistent data replication across all nodes. Every node in a CockroachDB cluster can handle all queries. An individual node might need to reach out to other nodes before it can return a result. (This has important performance implications, so we will talk about it more later.)
To get started with CockroachDB Dedicated on Cockroach Cloud (three more words: permanent free tier!) you can both register and set up a new CockroachDB Dedicated database in the time it takes to make a bowl of ramen:
Now that we’ve covered getting started with Google Cloud Run and CockroachDB Dedicated (and possibly taken a noodle snack break), let’s put everything together in a multi-regional system design. The example architecture in this diagram spans three continents. The components are:
A key component in this design is the global External HTTPS Load Balancer, which provides you with a global Anycast IP address. Anycast is a network addressing and routing method that allows incoming requests to be routed to a variety of different nodes by creating an IP address that can be shared between multiple devices on the internet. Packets destined for an anycast address are routed to the device that is closest.
In a multi-region application, Anycast typically routes incoming traffic to the nearest data center with the capacity to process the request efficiently. Selective routing allows an Anycast network to be resilient in the face of high traffic volume, network congestion, and DDoS attacks.
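The routing decision described above can be pictured with a toy model. Real anycast routing happens in the network layer through BGP, not in application code, so this is only a sketch of the idea: pick the closest location that can still take the request. The `Backend` type, the region list, and the latency numbers are all made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Backend:
    region: str
    rtt_ms: float       # hypothetical round-trip time from the client
    has_capacity: bool  # whether the region can accept more traffic


def pick_backend(backends):
    """Return the lowest-latency backend that still has capacity."""
    candidates = [b for b in backends if b.has_capacity]
    if not candidates:
        raise RuntimeError("no backend has capacity")
    return min(candidates, key=lambda b: b.rtt_ms)


backends = [
    Backend("us-east4", 12.0, has_capacity=False),  # closest, but saturated
    Backend("europe-west1", 95.0, has_capacity=True),
    Backend("asia-southeast1", 230.0, has_capacity=True),
]
chosen = pick_backend(backends)
print(chosen.region)
```

In this made-up scenario the nearest region is full, so the request goes to the next-closest region rather than failing.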
Visit gcping.com for a great demonstration of the external HTTPS load balancer. GCPing.com is a small web app that makes requests from your browser directly to Cloud Run services deployed in every region, and to the same services through an External HTTPS Load Balancer.
GCPing.com determines the median response time for every region by sending multiple requests and shows the results in a table. There is also a row for the global External HTTPS Load Balancer – it should have the fastest median response time.
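As a rough sketch of the measurement idea (not GCPing's actual code), the median of several samples per region can be computed like this. The sample values are invented. The median is a sensible choice here because a single slow request barely moves it.

```python
from statistics import median


def median_by_region(samples_ms):
    """Map each region to the median of its response-time samples (in ms)."""
    return {region: median(values) for region, values in samples_ms.items()}


# Hypothetical samples; note the one outlier in each list.
samples_ms = {
    "us-east4": [21.0, 19.5, 48.0, 20.2, 22.1],
    "europe-west1": [101.3, 99.8, 97.4, 180.0, 100.1],
}
medians = median_by_region(samples_ms)
print(medians)
```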
If you’re curious to know exactly what region is closest to your location in terms of BGP network hops (Border Gateway Protocol, or BGP, is the routing protocol for the Internet), open https://global.gcping.com/api/ping directly to see your home region. I’m based out of Baltimore, Maryland in the US, so it shows me us-east4.
Let’s discuss latency, before you get too excited about the prospect of serving requests all over the globe in less than 30 milliseconds!
If your application data is distributed across multiple continents, you’ll always face limitations due to the fact that data takes some time to get from one location to another location. For example, if a packet travels from Iowa to Delhi and back, you can expect a round trip time of around 267 milliseconds.
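A back-of-the-envelope calculation shows why that round trip can never get close to zero. The sketch below estimates the great-circle distance between two points and the time light needs to cover it in optical fiber. The coordinates are approximate locations picked for illustration, and the fiber speed of about 200,000 km/s (roughly two thirds of the speed of light in a vacuum) is a common rule of thumb, not a measurement.

```python
import math

EARTH_RADIUS_KM = 6371.0
FIBER_KM_PER_S = 200_000.0  # rule of thumb: about 2/3 of c


def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine distance between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def min_round_trip_ms(distance_km):
    """Lower bound on round-trip time over a straight fiber of this length."""
    return 2 * distance_km / FIBER_KM_PER_S * 1000


iowa = (41.26, -95.86)   # approximate, assumed for illustration
delhi = (28.61, 77.21)   # approximate, assumed for illustration
distance_km = great_circle_km(*iowa, *delhi)
floor_ms = min_round_trip_ms(distance_km)
print(f"{distance_km:.0f} km, at least {floor_ms:.0f} ms round trip")
```

The physical floor comes out well below the observed figure, which is expected: real cables do not follow the great circle, and routers along the way add their own delay. No amount of tuning gets you under the floor, though.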
The latency between your user and Google Cloud is also an important factor in how fast your application feels. Users who are primarily on a cellular network, for example, are typically much more tolerant of latency and won’t notice a 100-millisecond improvement.
Most web applications benefit from using a global content delivery network (CDN). A CDN makes sure all cacheable content (including static assets, images, and your frontend code) is served from an edge location close to the user. By getting a head start loading pre-cached content, a CDN makes your application feel faster, even when calls for uncacheable dynamic content are still relatively slow.
Finally, to get the fastest possible response times in a multi-regional application, you also need to make sure that the data users need is managed close to them, and not in a different region – which is why the pairing between Cloud Run and CockroachDB is particularly powerful!
To create a multi-regional CockroachDB cluster, you’ll need to create a cluster in at least three regions, with three nodes in every region. (Pro tip: increasing your replication factor beyond three replicas empowers the cluster to be resilient even in the face of multiple, simultaneous node failures.)
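The arithmetic behind that tip is the majority rule. CockroachDB replicates data with the Raft consensus protocol, which needs a majority of replicas to be reachable, so the number of simultaneous failures a piece of data survives follows directly from the replication factor. A minimal sketch, with function names chosen for illustration:

```python
def quorum(replicas):
    """Smallest majority of a replica group."""
    return replicas // 2 + 1


def tolerated_failures(replicas):
    """How many replicas can be lost while a majority remains."""
    return replicas - quorum(replicas)


for rf in (3, 5, 7):
    print(f"replication factor {rf}: quorum {quorum(rf)}, "
          f"survives {tolerated_failures(rf)} failure(s)")
```

Going from three to five replicas raises the number of survivable failures from one to two. An even replication factor adds cost without adding tolerance, which is why odd values are the usual choice.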
The easiest way to do that is with CockroachDB Dedicated, which hosts your database on Cockroach Cloud. (If, however, you just want to experiment with the multi-region features, you can start nodes in containers on your local machine.)
You’ll need to pass the location (region, zone) to a node when it starts. CockroachDB provides you with several high level controls that influence where data is placed and how failures are handled.
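For a local experiment, the location is passed through the `--locality` flag of `cockroach start`. The helper below only assembles the argument list; it does not launch anything. The host names and region/zone values are hypothetical, `--insecure` is suitable for local experiments only, and a real node will likely need further flags (store path, listen address), so treat this as a sketch and check the CockroachDB documentation for your version.

```python
def locality_flag(region, zone):
    """Format the locality tiers as key=value pairs, broadest tier first."""
    return f"--locality=region={region},zone={zone}"


def start_command(region, zone, join_addrs):
    """Assemble an argument list for starting one node (illustrative only)."""
    return [
        "cockroach",
        "start",
        "--insecure",  # local experiments only
        locality_flag(region, zone),
        "--join=" + ",".join(join_addrs),
    ]


# Hypothetical node addresses for a three-node local cluster.
nodes = ["node1:26257", "node2:26257", "node3:26257"]
cmd = start_command("us-east1", "us-east1-b", nodes)
print(" ".join(cmd))
```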
If you create a database in a multi-regional CockroachDB cluster, you select one primary region (the other regions are follower regions). Nodes in the primary region will return results to SQL queries faster than the nodes in the follower regions.
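In SQL terms, choosing the primary region and adding the other regions comes down to a few `ALTER DATABASE` statements. The sketch below only builds the statement strings so you can see their shape; it does not connect to a cluster. The database name `movr` and the region names are assumptions for illustration, and the regions must match the localities your nodes were started with.

```python
def quote_ident(name):
    """Quote a SQL identifier, doubling any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def multi_region_statements(database, primary, others, survive_region_failure=False):
    """Build the statements that turn a database into a multi-region one."""
    db = quote_ident(database)
    stmts = [f"ALTER DATABASE {db} SET PRIMARY REGION {quote_ident(primary)};"]
    stmts += [f"ALTER DATABASE {db} ADD REGION {quote_ident(r)};" for r in others]
    if survive_region_failure:
        # Asks the cluster to keep the database available if a whole region is lost.
        stmts.append(f"ALTER DATABASE {db} SURVIVE REGION FAILURE;")
    return stmts


stmts = multi_region_statements(
    "movr", "us-east1", ["europe-west1", "asia-southeast1"],
    survive_region_failure=True,
)
print("\n".join(stmts))
```

The region-survival setting trades some write latency for resilience, because writes then have to be acknowledged outside the primary region before they commit.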
Every database in a multi-regional CockroachDB cluster has a primary region with the lowest query latencies, and follower regions that have slower query performance. Let’s dive into why that happens and what you can do to improve performance in follower regions.
CockroachDB manages consistency by making one node at a time responsible for reads and writes to a partition of data. A partition of data, called a range, is usually a table but it can be more granular. The “leader” node that is responsible for the reads and writes is called the leaseholder.
(In a multi-region cluster, the nodes that can become leader for a range are constrained to a region. The primary region can be changed for whole tables, partitions of a table, or even at the single row level.)
Every database in a CockroachDB cluster has a primary region and follower regions. You can change the primary region for every table or even for different partitions of a table. Queries that require a strongly consistent view are always handled in the primary region.
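Per-table placement is expressed with `ALTER TABLE ... SET LOCALITY`. As a sketch, the helper below builds the three common variants: a table homed in one region, a table whose rows are each homed in their own region, and a global table optimized for fast reads everywhere at the cost of slower writes. The table and region names are hypothetical, and nothing here talks to a database.

```python
def quote_ident(name):
    """Quote a SQL identifier, doubling any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def set_locality(table, locality, region=None):
    """Build an ALTER TABLE ... SET LOCALITY statement.

    locality is one of "REGIONAL BY TABLE", "REGIONAL BY ROW", or "GLOBAL".
    """
    tbl = quote_ident(table)
    if locality == "REGIONAL BY TABLE":
        if region is None:
            raise ValueError("REGIONAL BY TABLE needs a region")
        clause = f"REGIONAL BY TABLE IN {quote_ident(region)}"
    elif locality in ("REGIONAL BY ROW", "GLOBAL"):
        clause = locality
    else:
        raise ValueError(f"unknown locality: {locality}")
    return f"ALTER TABLE {tbl} SET LOCALITY {clause};"


statements = [
    set_locality("vehicles", "REGIONAL BY TABLE", region="europe-west1"),
    set_locality("rides", "REGIONAL BY ROW"),
    set_locality("promo_codes", "GLOBAL"),
]
print("\n".join(statements))
```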
Serving from followers is useful in multi-region clusters because multiple regions can have followers and geographically diverse clients can route their requests to a follower nearby, thus getting local latencies for their reads.
If a node in a follower region receives a query that requires an up-to-date and strongly consistent read, it checks with the leaseholder in the primary region to verify that it holds identical data. This can introduce slightly longer read times. In our 3-region diagram above, we can see that the difference is between 99 ms and 156 ms.
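When a query can tolerate slightly stale data, it can skip that cross-region check by asking for a follower read. In CockroachDB this is done with an `AS OF SYSTEM TIME follower_read_timestamp()` clause, which goes after the `FROM` clause and before any `WHERE`. The sketch below only builds the query text; the table, column, and filter are hypothetical examples.

```python
def follower_read_query(columns, table, where=None):
    """Build a SELECT that may be served by a nearby follower replica.

    The result can be a few seconds behind the latest committed data.
    """
    sql = (
        f"SELECT {', '.join(columns)} FROM {table} "
        "AS OF SYSTEM TIME follower_read_timestamp()"
    )
    if where:
        sql += f" WHERE {where}"
    return sql + ";"


query = follower_read_query(["code", "description"], "promo_codes", "code = 'WELCOME'")
print(query)
```

This fits data that changes rarely or where a short delay is acceptable, such as catalog or reference tables. Reads that must see the latest write should stay strongly consistent.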
Even if you do not care much about ~100 ms latency, there are three not-too-difficult things you can do to improve performance with no changes to your infrastructure:
There is a lot more to this topic of how CockroachDB is built to optimize the performance of your database across multiple geographic regions. You can read this deep dive on follower reads, or simply leave it to CockroachDB Dedicated to manage it for you.
Ready to apply this multi-region application architecture to an actual project? The hands-on tutorial Develop and Deploy a Global Application walks you through building a multi-regional web application for the fictional vehicle-sharing company, MovR, using Flask, SQLAlchemy, and CockroachDB Cloud. Next, deploy your sample app with Google Cloud Run and CockroachDB Cloud.
And, while learning all things serverless and distributed (including multi-region deployments), don’t forget to download your free copy of our O’Reilly book CockroachDB: The Definitive Guide.