What is multi-region architecture? The key to high availability & risk mitigation

Nearly every business has tier-0 applications that are critical for delivering its services to customers: these are the applications that make the money. Any disruption or downtime can result in significant financial losses and damage to a company’s reputation. Adopting multi-region application architecture mitigates these risks and ensures high availability. In this blog post, we will explore the concept of multi-region application architecture and discuss how it helps in risk mitigation while providing high availability.

Region and availability zone differences

First, let’s be clear about the difference between cloud platform regions and availability zones. The distinction is essential because people often confuse the two, assuming that deploying an app across multiple availability zones is the same thing as deploying it across multiple cloud provider regions. Yes, both are distributed architectures. No, they do not ensure the same level of high availability. Here’s why.

What are cloud provider regions?

Cloud provider regions are geographic locations where a cloud provider has physical data centers. For example, AWS has four regions in the US: us-east-1, us-east-2, us-west-1, and us-west-2. Each region is fully independent of the others, and data stored in one region is not automatically replicated to another. This means that if a region experiences an outage, only applications or data stored in that region will be affected. Regions are subdivided into availability zones.

What are availability zones?

Availability zones are separate and distinct areas within a region, and each availability zone (AZ) is typically backed by its own data center. AWS’s us-east-1 region, for example, has six availability zones. (This is actually AWS’s largest region; three AZs per region is the most common pattern.) AZs are designed to be isolated from one another so that if one availability zone experiences an outage, applications and data can fail over to another availability zone in the same region.

Deploying critical applications and data across multiple AZs usually provides more redundancy and availability than deploying to a single one. But some cloud providers don’t advertise this dirty secret: separate AZs can actually be located in the same data center. Each AZ will have fully independent power, cooling, and networking, but they can share the same physical location. This matters when a physical catastrophe strikes, such as the fire in a Paris data center that caused Google Cloud Platform’s entire europe-west9 region to fail. The whole region was out because all three of its availability zones (europe-west9-a, europe-west9-b, europe-west9-c) were in the same colo facility. Yes, each AZ had its own independent power and networking resources, but that doesn’t help when the building is on fire.
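To make the difference concrete, here is a minimal Python sketch that checks whether a replica placement keeps a majority of copies after losing any single zone, building, or region. The `Replica` class, the helper functions, and the region, zone, and building labels are all hypothetical names invented for illustration, and the majority rule is an assumption about how a replicated database might decide it is still available.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Replica:
    region: str
    zone: str
    facility: str  # physical building; hypothetical label for illustration


def survives(replicas, level, failed_value):
    """True if a strict majority of replicas remains after losing every
    replica whose `level` attribute equals `failed_value`."""
    remaining = [r for r in replicas if getattr(r, level) != failed_value]
    return len(remaining) > len(replicas) // 2


def survives_any_single_failure(replicas, level):
    """True only if the placement survives the loss of any one failure
    domain at the given level ("zone", "facility", or "region")."""
    domains = {getattr(r, level) for r in replicas}
    return all(survives(replicas, level, d) for d in domains)


# Three zones in one region that all share one building.
single_facility = [
    Replica("region-1", "region-1-a", "building-X"),
    Replica("region-1", "region-1-b", "building-X"),
    Replica("region-1", "region-1-c", "building-X"),
]

# One replica in each of three regions, each in its own building.
multi_region = [
    Replica("region-1", "region-1-a", "building-X"),
    Replica("region-2", "region-2-a", "building-Y"),
    Replica("region-3", "region-3-a", "building-Z"),
]

for name, placement in [("single facility", single_facility),
                        ("multi-region", multi_region)]:
    for level in ("zone", "facility", "region"):
        ok = survives_any_single_failure(placement, level)
        print(f"{name}: survives loss of any one {level}: {ok}")
```

Under these assumptions, the single-facility placement tolerates a zone failure but not the loss of the building or the region, while the multi-region placement tolerates all three.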

What is multi-region application architecture?

Multi-region architecture distributes multiple instances of an application and its database across both cloud provider availability zones and regions. This replication and redundancy ensures the highest availability possible while mitigating the risks of financial loss and reputation damage that arise for a business when critical tier-0 applications experience disruption or downtime.

Why multi-region matters

Multi-region matters because cloud providers can, and do, have entire regions fail. It’s not frequent, but it does happen. Mitigating the risks inherent in this inevitability requires multi-region architecture.

A whole region going dark can lead to data loss or corruption, particularly when appropriate backup and recovery processes are not in place. This happens because data is usually replicated only across the zones within a single region (remember that zones are logical representations of physical data centers within a region). So if all the data centers in a region fail simultaneously, there may be no surviving copy from which to restore the data.

In addition, if the outage results in data loss or corruption (which might not be immediately recognized), businesses can face the risk of legal liability, data breaches and compliance violations, to name but a few potential negative consequences. And of course, any of these could result in significant financial penalties or damages.

Mitigating risks with multi-region architecture

By deploying an application across multiple regions, we can minimize the impact of natural disasters, regional outages, or infrastructure failures. If one region becomes unavailable, traffic is automatically routed to the functioning regions, reducing or fully avoiding downtime while ensuring continuous service availability.

Multi-region architectures can also mitigate risks associated with network failures and connectivity issues. By having redundant connections and utilizing traffic management techniques like DNS-based load balancing, organizations can ensure seamless failover to alternative regions if connectivity problems arise.
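As a rough illustration of this kind of traffic management, the Python sketch below picks a region for each request using routing weights and skips any region that is failing health checks. The function name `pick_region`, the weights, and the idea of passing in a precomputed set of healthy regions are assumptions made for the example; a real DNS-based load balancer would run its own health checks and apply its own routing policy.

```python
import random


def pick_region(weights, healthy, rng=random):
    """Pick a region for the next request.

    weights: dict mapping region name -> non-negative routing weight
    healthy: set of region names currently passing health checks
    Unhealthy regions are dropped, so the remaining regions absorb
    their share of traffic in proportion to their own weights.
    """
    candidates = {r: w for r, w in weights.items() if r in healthy and w > 0}
    if not candidates:
        raise RuntimeError("no healthy region available")
    regions = list(candidates)
    return rng.choices(regions, weights=[candidates[r] for r in regions], k=1)[0]


# Hypothetical 80/20 traffic split between two regions.
weights = {"us-east-1": 80, "us-west-2": 20}

# Both regions healthy: either may be chosen, us-east-1 more often.
print(pick_region(weights, healthy={"us-east-1", "us-west-2"}))

# us-east-1 is failing health checks: all traffic goes to us-west-2.
print(pick_region(weights, healthy={"us-west-2"}))
```

The important design point is that failover needs no special case: removing a region from the healthy set is enough to shift its traffic to the survivors.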

Multi-region architectures also enhance data security and compliance by implementing data replication and backup mechanisms across regions. In the event of a security breach or data loss, having replicated data in multiple regions ensures business continuity and minimizes the impact on customers and stakeholders.

This is why the pinnacle of fault-tolerant, high-availability application architecture centers on deploying applications across multiple zones and multiple regions. This pattern is called multi-region architecture.

Serverless for cost-effective multi-region applications

Despite the high-availability and risk-abatement benefits of multi-region architecture, the number of enterprises adopting it has been low. Why? For many, the database is the main thing holding them back.

Most multi-region databases come with significantly increased operational complexity and expense. Even if you have only a few customers and light traffic in certain regions, you typically have to provision and pay for dedicated machines in each one. This makes it challenging for cost-conscious teams to expand into new markets.

The cost of operating a relational database across multiple regions is a huge barrier because of the fixed cost of maintaining hardware in each region, regardless of how much it’s being used. Even if 80% of your users are in one region, a database of equivalent size would have to be maintained in the other region to serve the other 20%. On top of that, cost scales linearly with every region added. In other words, you’d have to scale to the expected peak that you might see in your largest region, but do so in all regions.
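The arithmetic behind that 80/20 example can be sketched in a few lines of Python. The function name and the capacity units are hypothetical, and the sizing rule (every region provisioned to the largest region's peak) is simply the assumption described in the paragraph above.

```python
def provisioned_capacity(peak_demand_by_region):
    """Total capacity when every region is sized to the peak of the
    largest region (the provisioning rule assumed in this example)."""
    largest_peak = max(peak_demand_by_region.values())
    return largest_peak * len(peak_demand_by_region)


# Hypothetical capacity units with an 80/20 split across two regions.
peak_demand = {"region-a": 80, "region-b": 20}

capacity = provisioned_capacity(peak_demand)
needed = sum(peak_demand.values())

print(capacity)           # 160
print(needed)             # 100
print(needed / capacity)  # 0.625
```

In this sketch, 160 units are provisioned to serve a combined peak of 100, so more than a third of the paid-for capacity sits idle even at peak, and adding a third region would raise the provisioned total to 240.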

Thankfully, databases have evolved! The advent of Distributed SQL made it possible to operate a globally distributed database that functions as a single logical instance, no matter how many regions it is deployed in. Pairing Distributed SQL with serverless computing removes many, even most, of the complexity and cost barriers to multi-region architecture. You can build, iterate, and scale global applications with far less operational overhead, paying only for the resources consumed by your queries.

Serverless eliminates the need for sizing infrastructure, since the database is instantiated on demand and then automatically scaled to the size of the workload. The pay-as-you-go component of most serverless offerings also makes cost variable with usage. This results in significant savings because, region by region, you pay only for the resources consumed by your queries.
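A consumption-based bill can be sketched the same way. The Python snippet below is illustrative only: the usage figures, the `usage_bill` helper, and the price per unit are made-up numbers, not the pricing of any real serverless offering.

```python
def usage_bill(units_consumed_by_region, price_per_unit_cents):
    """Per-region bill, in cents, when cost is proportional to usage."""
    return {
        region: units * price_per_unit_cents
        for region, units in units_consumed_by_region.items()
    }


# Hypothetical monthly usage with an 80/20 split across two regions.
usage = {"region-a": 8000, "region-b": 2000}

bill = usage_bill(usage, price_per_unit_cents=2)
total = sum(bill.values())

print(bill)   # {'region-a': 16000, 'region-b': 4000}
print(total)  # 20000
```

Here the region carrying 20% of the traffic accounts for 20% of the bill, rather than requiring the same fixed footprint as the busiest region.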

As yet, however, there are few databases offering the ability to build, iterate, and scale global applications with high availability in a serverless, consumption-based model that also natively supports multi-region applications. (We can think of only one, actually). CockroachDB Serverless is the missing link for mission-critical, multi-region application stacks.

If you’re deploying across multiple AZs, your app likely already has the same architectural primitives needed for deploying across multiple regions. By leveraging both regions and availability zones, you can ensure that your applications are highly available and resilient in the face of infrastructure failures. Paired with the right distributed database, multi-region architectures enable businesses to provide uninterrupted services to their customers — and maintain a competitive edge over their rivals in the marketplace.