Running CockroachDB on Kubernetes

Running CockroachDB on Kubernetes

Since this post was originally published in 2017, StatefulSets have become common and allow a wide array of stateful workloads to run on Kubernetes. In this post,  we’ll quickly walk through the history of StatefulSets, and how they fit with CockroachDB and Kubernetes, before jumping into a tutorial for running CockroachDB on Kubernetes. 

-– 

Managing resilience, scale, and ease of operations in a containerized world is largely what Kubernetes is all about—and one of the reasons platform adoption has doubled since 2017.  And as container orchestration continues to become a dominant DevOps paradigm, the ecosystem has continued to mature with better tools for replication, management, and monitoring of our workloads.

And as Kubernetes grows, so does CockroachDB as we’ve recently simplified some of the day 2 operations associated with our distributed database with our Kubernetes Operator.  Ultimately, however, our overall goal in the cloud-native community is singular: ease the deployment of stateful workloads on Kubernetes.

Bringing State to Kubernetes

CockroachDB helps solve for stateful, database-dependent applications through replication of data across independent database nodes in a way that will survive any failure (just in case our name didn’t make total sense).  CockroachDB, combined with Kubernetes’ built-in scale out, survivability and replication strategies, can give you the speed and simplicity of orchestration without sacrificing the high availability and correctness you expect from critical stateful databases.

How do CockroachDB + Kubernetes = Retained State?

While Kubernetes is fairly straightforward for use with stateless services, management and surviving state has been a challenge.   

Why?  You can’t simply swap out nodes as they depend on data in pod-mounted storage.  And rolling back doesn’t work for databases either.  

Some best practices have evolved to workaround the challenge of deploying data-driven apps on K8s: 

  • Run the database outside of Kubernetes: This creates lots of extra work, adds redundant tooling, and can actually invalidate the value you were looking to gain from K8s.
  • Use a DBaaS: This, too, limits the value of scale and resilience of Kubernetes and limits your choice to only those options provided by your Cloud Service Provider

To keep up with the demands of modern, data-driven apps, the Kubernetes community developed  a native way to manage state, via StatefulSets. 

  • StatefulSets assign a unique ID that keeps application and database containers connected through automation.  
  • Note: The use of “Unique ID” is a bit tricky here. Each resource in kubernetes gets UID to identify it but that UID will change whenever the resource is updated. StatefulSets assign pod identity which is persistent across pod generations but is separate from UIDs.

StatefulSets are ideal for CockroachDB because the UID means it doesn’t get treated as a new node in a Kubernetes cluster, cutting way back on the amount of data replication required to keep data available.  This is key to efficiently supporting fast distributed transactions and our consensus protocol. For a real life example of a CockroachDB running on Kubernetes to retain state check out this Pure Storage case study

Step By Step Kubernetes Tutorial

Step One: Building Your Kubernetes Cluster

The year is 2021.  There are lots of ways to get your Kubernetes cluster up and running.  For this walkthrough we’ll use GKE.  If you’re interested in other paths, we have resources for:

With the Google Cloud CLI installed, create the cluster by running

gcloud container clusters create cockroachdb-cluster

Step Two: Spinning up CockroachDB

Just like most Kubernetes deployment configuration work, CockroachDB config is managed by a YAML file like the one below.  We’ve added comments to help provide some context for what’s going on.

  • Start by copying the file from our GitHub repository into a file named ‘cockroachdb-statefulset.yaml’. This file defines the resources to be created, in this case including the StatefulSet object that will spin up the CockroachDB containers and then attach them to persistent volumes.
  • You’ll then need to create the resources shown below.  If you’re using Minikube, you may need to manually provision persistent volumes.

You should soon see 3 replicas running in your cluster along with a couple of services. At first, only some of the replicas may show because they haven’t all yet started. This is normal, as StatefulSets create the replicas one-by-one, starting with the first.

$ kubectl create -f cockroachdb-statefulset.yaml

service "cockroachdb-public" created

service "cockroachdb" created

poddisruptionbudget "cockroachdb-budget" created

statefulset "cockroachdb" created

$ kubectl get services

cockroachdb          None         <none>        26257/TCP,8080/TCP   4s

cockroachdb-public   10.0.0.85    <none>        26257/TCP,8080/TCP   4s

kubernetes           10.0.0.1     <none>        443/TCP              1h

$ kubectl get pods

NAME            READY     STATUS    RESTARTS   AGE

cockroachdb-0   1/1       Running   0          29s

cockroachdb-1   0/1       Running   0          9s

$ kubectl get pods

NAME            READY     STATUS    RESTARTS   AGE

cockroachdb-0   1/1       Running   0          1m

cockroachdb-1   1/1       Running   0          41s

cockroachdb-2   1/1       Running   0          21s

Lifting the Hood

If you’re curious to see what’s happening inside the cluster, check the logs for one of the pods by running kubectl logs cockroachdb-0.

Step Three: Using the CockroachDB cluster

If all has gone to plan, you now have a cluster up and running.  Congratulations!

To open a SQL shell within the Kubernetes cluster, you can run a one-off interactive pod like this, using the cockroachdb-public hostname to access the CockroachDB cluster. Kubernetes will then automatically load-balance connections to that hostname across the healthy CockroachDB instances.

$ kubectl run cockroachdb -it --image=cockroachdb/cockroach --rm --restart=Never -- sql --insecure --host=cockroachdb-public

Waiting for pod default/cockroachdb to be running, status is Pending, pod ready: false

 

Hit enter for command prompt

root@cockroachdb-public:26257> CREATE DATABASE bank;

CREATE DATABASE

root@cockroachdb-public:26257> CREATE TABLE bank.accounts (id INT PRIMARY KEY, balance DECIMAL);

CREATE TABLE

root@cockroachdb-public:26257> INSERT INTO bank.accounts VALUES (1234, 10000.50);

INSERT 1

root@cockroachdb-public:26257> SELECT * FROM bank.accounts;

+------+---------+
|  id  | balance |
+------+---------+
| 1234 | 10000.5 |
+------+---------+

(1 row)

Step Four: Accessing the CockroachDB Console

To get more information into cluster behavior and health, you can pull up the CockroachDB Console by port-forwarding from your local machine to one of the pods as shown below:

If you want to see information about how the cluster is doing, you can try pulling up the CockroachDB admin UI by port-forwarding from your local machine to one of the pods:

kubectl port-forward cockroachdb-0 8080

You should now be able to access the admin UI by visiting http://localhost:8080/ in your web browser:

CockroachDB on Kubernetes - DB Console Screen

Step Five: Simulating node failure

We talked about DB survivability earlier.  Now you can test it for yourself.  What happens when a pod goes bad or gets deleted?

  • To test the resiliency of the cluster, try killing some of the containers by running a command like kubectl delete pod cockroachdb-3.  This must be done from a different terminal while you’re still accessing the cluster from your SQL shell. 
  • If you get a “bad connection” error from deleting the same instance your shell was communicating with, simply retry the query.

The container will now be recreated for you by the StatefulSet controller, just as it would happen in the event of a real production failure. 

If you’re up for testing the durability of the cluster data, you can try deleting all the pods at once and ensuring they start up properly again from their persistent volumes. To do this, you can run kubectl delete pod –selector app=cockroachdb, which deletes all pods that have the label app=cockroachdb.  This includes the pods from our StatefulSet.

Just like during setup, it might take some time for them all to come back up again. But once they are up and running again, you’ll be able to get the same data back from the SQL queries you’re making in the shell.

Step Six: Scaling the CockroachDB cluster

Before removing nodes from your cluster, you must first tell CockroachDB to decommission them. (This lets nodes finish in-flight requests, rejects any new requests, and transfer all range replicas and range leases off the nodes.

Now that the nodes are decommissioned you can scale your Kubernetes cluster by simply adding or subtracting replicas by resizing the StatefulSet as shown below:

kubectl scale statefulset cockroachdb --replicas=4

Step Seven: Shutting the CockroachDB cluster down

Once you’re done, a single command will clean up all the resources we’ve created during our oh-so-brief Kubernetes tutorial. The labels we added to the resources do all the work.

kubectl delete statefulsets,pods,persistentvolumes,persistentvolumeclaims,services,poddisruptionbudget -l app=cockroachdb

You can also shut down your entire Kubernetes cluster by running:

gcloud container clusters delete cockroachdb-cluster

Where to go from here

 Now that you’ve mastered the basics, what next?  

References

More information and up-to-date configuration files for running CockroachDB on Kubernetes can be found in our documentation.