Building High Availability Services with CockroachDB

CockroachDB's multi-active availability model makes high availability easy. In this guide, we'll start with an overview of different availability models, and cover every consideration you need to make to ensure maximum availability when building on top of CockroachDB. 

Services are only useful when they're online. Unavailable services not only lose money but also damage your credibility in customers' eyes, which can lead to costs that are hard to measure in the future. For most of your application's components, ensuring availability is straightforward. For the databases that underpin most of your application's functionality, however, achieving high availability is much more complex. Through decades of research and practice, engineers have developed a few different availability models, each with its own distinctive behaviors. This guide compares Active-Passive, Active-Active, and Multi-Active availability.

Active-Passive: Choose Either Anomalies or Unavailability

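In Active-Passive models, one node receives all requests and replicates data to its follower, which takes over only if the active node fails. This simple approach was common for databases developed in the pre-cloud era, e.g. PostgreSQL and MySQL. The design forces a choice: if the active node acknowledges writes before the follower has them, a failover can lose recent writes, and if it waits for the follower, writes stall whenever the follower is unreachable. While the simplicity of Active-Passive models is a virtue in some cases, their static design is poorly suited to today's more fluid deployments.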
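The choice named in the heading can be illustrated with a toy model. This is a minimal sketch, not how any particular database implements replication: the `ActivePassivePair` class and its methods are hypothetical names invented for this example, and replication is reduced to copying a list.

```python
class ActivePassivePair:
    """Hypothetical primary/follower pair; not a real database API."""

    def __init__(self, synchronous: bool):
        self.synchronous = synchronous
        self.primary = []        # writes applied on the active node
        self.follower = []       # writes replicated to the passive node
        self.follower_up = True

    def write(self, value) -> bool:
        if self.synchronous:
            if not self.follower_up:
                return False     # unavailability: refuse rather than diverge
            self.primary.append(value)
            self.follower.append(value)
            return True
        # Asynchronous: acknowledge now, replicate later.
        self.primary.append(value)
        return True

    def replicate(self):
        # Background shipping of the primary's writes to the follower.
        if self.follower_up:
            self.follower = list(self.primary)

    def fail_over(self):
        # The primary is lost; the follower's copy becomes the database.
        return list(self.follower)


async_pair = ActivePassivePair(synchronous=False)
async_pair.write("order-1")
async_pair.replicate()
async_pair.write("order-2")          # acknowledged but not yet replicated
print(async_pair.fail_over())        # order-2 is missing after failover

sync_pair = ActivePassivePair(synchronous=True)
sync_pair.follower_up = False
print(sync_pair.write("order-3"))    # the write is rejected
```

In the asynchronous case the client was told `order-2` succeeded, yet it is gone after failover (an anomaly). In the synchronous case nothing is lost, but the service stops accepting writes while the follower is down (unavailability).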

Active-Active: Available, but with Anomalies

Active-Active availability represents an evolution from Active-Passive, enabling databases to scale beyond single machines by letting multiple nodes in a cluster serve reads and writes. With this kind of replication, you set up at least two active sites, each of which must contain all of the cluster's data. Clients read from and write to a node in one of the active sites, which propagates any modifications to the other active sites.

Thinking about this configuration for a moment reveals its shortcoming: what happens if two sites receive a write for the same key at the same time? Each site accepts its write locally, so the sites can disagree until the conflict is resolved. For example, Oracle GoldenGate is an Active-Active system. If you aren't familiar with it, it's an additional component for the Oracle database that is meant specifically to enable Active-Active replication between nodes. For such an expensive piece of additional software, you would hope it would avoid introducing inconsistencies in your data. Instead, its approach to availability is prone to introducing anomalies.
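To make the same-key problem concrete, here is a small sketch of two active sites that resolve conflicts with a last-write-wins rule. This is a generic illustration under stated assumptions, not the behavior of Oracle GoldenGate or any other specific product: the `Site` class, the timestamps, and the choice of last-write-wins as the resolution policy are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Site:
    """One active site holding a full copy of the data (hypothetical model)."""
    name: str
    data: dict = field(default_factory=dict)  # key -> (timestamp, value)

    def write(self, key, value, timestamp):
        # Accept the write locally without coordinating with other sites.
        self.data[key] = (timestamp, value)

    def merge_from(self, other):
        # Last-write-wins: keep whichever version has the larger timestamp.
        for key, (ts, value) in other.data.items():
            if key not in self.data or ts > self.data[key][0]:
                self.data[key] = (ts, value)


east = Site("east")
west = Site("west")

# Both sites start from a balance of 100 (not modeled here), and two clients
# update the same key at nearly the same time on different sites.
east.write("balance", 90, timestamp=1)   # client A withdraws 10
west.write("balance", 70, timestamp=2)   # client B withdraws 30

# Asynchronous replication in both directions.
east.merge_from(west)
west.merge_from(east)

print(east.data["balance"], west.data["balance"])
```

Both sites converge on a balance of 70, so the cluster looks consistent, but client A's withdrawal has silently disappeared: applying both withdrawals in sequence would have left 60. Both writes were acknowledged, and one of them was lost.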

Multi-Active Availability: CockroachDB's Consistent & Available Mode

Multi-Active Availability is a term coined by Cockroach Labs for CockroachDB's availability model. With it, you set up at least three nodes, each of which can perform reads and writes for any data in the cluster without generating conflicts. Through Multi-Active Availability, CockroachDB remains strongly consistent while continuing to serve requests through outages, as long as a majority of the replicas for the affected data stay available.
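The three-node minimum follows from simple majority arithmetic. The sketch below assumes majority-based (consensus) replication, where a write commits once more than half of the replicas accept it. The helper names `quorum` and `tolerated_failures` are hypothetical and are not part of any CockroachDB API.

```python
def quorum(replicas: int) -> int:
    """Smallest number of replicas that forms a majority."""
    return replicas // 2 + 1


def tolerated_failures(replicas: int) -> int:
    """How many replicas can fail while a majority remains reachable."""
    return replicas - quorum(replicas)


for n in (1, 2, 3, 5):
    print(n, quorum(n), tolerated_failures(n))
```

Under this assumption, one or two replicas tolerate no failures at all, three replicas tolerate one, and five tolerate two. Because any two majorities overlap in at least one replica, two conflicting writes cannot both be committed, which is what lets every node accept writes without producing the anomalies described for Active-Active systems.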

How to Build High Availability Services with CockroachDB

When it comes to actually building services on top of CockroachDB, there are many considerations to take into account to achieve maximum availability. This guide covers each of them, starting with the lowest level and progressing through the rest of your technology stack.