CockroachDB Node Locality: Why it's important and how to set it in Kubernetes

CockroachDB is a highly scalable, highly resilient Distributed SQL database. This means that nodes can be distributed across multiple locations across the globe to ensure that the data can meet scalability requirements and survive outages. Data can also be pinned to locations to help with performance and data residency regulations. In this blog we will talk about node locality, why this is important in CockroachDB and how to set this in Kubernetes environments.

What is Locality?

Locality is a set of key value pairs that are associated with each node to describe where it is geographically located. The --locality flag accepts arbitrary key-value pairs that describe the location of the node. The locality should include a region key-value if you are using CockroachDB’s multi-region SQL capabilities.

CockroachDB’s multi-region capabilities make it easier to run global applications. To be able to understand the concept you need to be familiar with the following terms. A Cluster region is specified at node start time and sets the geographic region. Database region is a geographic region in which a database operates. You must choose a database region from the list of available cluster regions. A Survival goal dictates how many simultaneous failures a database can survive. CockroachDB is able to optimise access to a table’s data and this is called Table Locality.

Depending on your deployment you can also specify country, availability zone, and other locality-related attributes. The key-value pairs should be ordered into locality tiers from most inclusive to least inclusive (e.g., region before availability zone as in region=eu-west-1,zone=eu-west-1a), and the keys and order of key-value pairs must be the same on all nodes. It’s typically better to include more pairs than fewer, as this provides more flexibility as to how data is distributed across the cluster.

Specifying a region with a region tier (eg. region=us-east-1) is required in order to enable CockroachDB’s multi-region capabilities. The regions added during node startup become the database regions. To add the first region to a database, use the ALTER DATABASE ... PRIMARY REGION statement. This will add the primary region to that database; additional regions can then be added. Now all data in that database will be stored within its assigned regions, and CockroachDB will optimise access to the database’s data from the primary region.

CockroachDB spreads the replicas of each piece of data across as diverse a set of localities as possible, with the order of the key value pairs in the localities argument determining the priority. Locality can also be used to influence the location of data replicas in various ways using high-level multi-region SQL capabilities or low-level replication zones.

When there is high latency between nodes (e.g., cross-availability zone and multi-region deployments), CockroachDB uses locality to move the data to range leases closer to where the database is most actively being used at the time, reducing network round trips and improving read performance — also known as “follow-the-workload”.

In a deployment across more than 3 availability zones, however, to ensure that all data benefits from “follow-the-workload”, you must increase your replication factor to match the total number of availability zones, to ensure data is available in any availability zone it might be requested from. Increasing the replica count will have a detrimental effect on write latency, though, so this should be considered when setting the replication factor.

Locality is also a prerequisite for using the Multi-region SQL abstractions, table partitioning, and Node Map enterprise features in CockroachDB.

How is locality set?

The following shell commands use the --locality flag to start 3 nodes to run across 3 regions: us-east-1, us-west-1, and europe-west-1. Each region’s nodes are further spread across different availability zones within that region.

cockroach start --locality=region=us-east-1,zone=us-east-1a ``# ... other required flags go here

cockroach start --locality=region=us-west-1,zone=us-west-1a ``# ... other required flags go here

cockroach start --locality=region=europe-west-1,zone=europe-west-1a ``# ... other required flags go here Note: These are not complete start commands. Other required flags go at the end of each command.

Setting locality nodes in Kubernetes

Setting locality on nodes that are created manually is straightforward as you know where you have placed them. You will know which cloud, region, and availability zone they have been deployed in. If you are using Kubernetes to deploy your workloads, though, you have less control over where pods are created.

There are methods to gain more control over node locality, including node selectors, Affinity and anti-affinity rules. These are added to your manifest files but add to the complexity — and can also cause workloads not to be scheduled if implemented incorrectly. You tell Kubernetes the desired state of your workload and it is the responsibility of Kubernetes to make this so.

This is ok for most workloads. With CockroachDB, however, locality is an important attribute that enables a unique set of capabilities (as we discussed earlier). CockroachDB has a small application, written on Go, called locality-checker to help manage locality. Locality checker is a container for detecting the region and availability zone of a CockroachDB pod running in a Kubernetes cloud offering.

It is meant to be run as an init container, which writes the region and zone values of the pod to a volume mounted at /etc/cockroach-locality. These values are then read by the CockroachDB pod at startup, to fill in its --locality flag value. This allows CockroachDB to be aware of its pods' region and zone placements within a Kubernetes deployment.

A complete locality flag which can be passed into cockroach start is written to /etc/cockroach-locality/locality. For users that want more control of the locality flag, the region a pod is running in is written to /etc/cockroach-locality/region, and the zone is written to /etc/cockroach-locality/zone.

Some changes need to be made to the CockroachDB example StatefulSet that is available in the GitHub repo. First you need to ensure that the ClusterRoleBinding is created in the same namespace that CockroachDB is deployed into. In this example it’s the default namespace but typically this may be named something else.

apiVersion: rbac.authorization.k8s.io/v1beta1 kind: ClusterRoleBinding metadata: name: cockroachdb labels: app: cockroachdb roleRef: apiGroup: rbac.authorization.k8s.io kind: ClusterRole name: cockroachdb subjects: - kind: ServiceAccount name: cockroachdb namespace: default

The locality-checker works for all three of the hosted Kubernetes services (AKS, EKS, and GKE), it runs as an init-container. Init Containers are specialised containers that run before app containers in a pod. Init containers can contain utilities or setup scripts not present in an app image. In our case an application to retrieve location information from the node labels of the Kubernetes worker nodes. Below is the section that needs to be added to the initContainers: section of the StatefulSet configuration to enable the init container.

- image: cockroachdb/locality-checker:0.1 imagePullPolicy: IfNotPresent name: locality-checker volumeMounts: - mountPath: /etc/cockroach-locality name: cockroach-locality env: - name: KUBERNETES_NODE valueFrom: fieldRef: fieldPath: spec.nodeName

The information collected from the labels is written to a location in the container, this is then mounted into the application container. The volume mount is below.

volumeMounts:`` - mountPath: /etc/cockroach-locality`` name: cockroach-locality`` readOnly: true And this is the volume: volumes:`` - name: cockroach-locality`` emptyDir: {}

The start command needs to have the following added to write out the contents of the file which contains the updated locality:

$(cat /etc/cockroach-locality/locality || true)

Below is an example of a complete cockroach start command for additional context.

command: - "/bin/bash" - "-ecx" # The use of qualified `hostname -f` is crucial: # Other nodes aren't able to look up the unqualified hostname. - exec /cockroach/cockroach start --logtostderr $(cat /etc/cockroach-locality/locality || true) --certs-dir /cockroach/cockroach-certs --advertise-host $(hostname -f) --http-addr 0.0.0.0 --join cockroachdb-0.cockroachdb.uk south,cockroachdb-1.cockroachdb.uk south,cockroachdb-2.cockroachdb.uk south --cache $(expr $MEMORY_LIMIT_MIB / 4)MiB --max-sql-memory $(expr $MEMORY_LIMIT_MIB / 4)MiB --store=cockroach-data --store=cockroach-unencrypted --enterprise-encryption=path=cockroach-data,key=/cockroach/cockroach-key/aes-128.key,old-key=plain

By adding these changes to your Kubernetes manifest, the locality information will be automatically added to the start command of CockroachDB. This will automate the deployment and management of CockroachDB running in a Kubernetes environment.

If you would like to try this out for yourself, then I have some demo code available in a git repository that has this configured.