How to choose the right metadata store

Choosing the right metadata store depends on your system architecture. There is no definitive ‘right choice’. But if you are building your products or services on top of distributed systems, your choices are certainly narrowed. Recently, the data security company Rubrik wrote a three-part blog series about how they chose a metadata store for Rubrik CDM. Their journey began with Cassandra and ended with CockroachDB. This blog is a summary of the challenges they ran into with Cassandra, the reasons they chose CockroachDB as their metadata store, and what their CockroachDB use case looks like.

Rubrik’s requirements for a distributed metadata store

Originally, Rubrik used Cassandra as the metadata store to get their product to MVP. They liked several qualities of Cassandra, including higher-order column types like maps and lists, high-performance point queries, easy setup, and simple deployment and maintenance. But they quickly ran into several issues:

Uneven distribution of data across nodes: Depending on the random numbers generated for range allocation, some nodes would be allocated more rows (hashes) than others. For this, they used a workaround in which they considered one Cassandra node to be a virtual node.
Inefficiencies of secondary indexes: In Cassandra, each node indexes only its own local data, so a secondary index query has to collect local results from every node in the cluster. This is a big problem in large clusters and caused poor performance.
Out of memory (OOM) crashes: Cassandra is a Java process and all Java apps can run into OOM issues when not enough heap space is allocated. Rubrik gave sufficient heap space to Cassandra but a few clusters still ran into these errors.
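Resurrections of deleted rows: This was the main reason that Rubrik ultimately chose to move on from Cassandra. When they deleted a row from Cassandra, Cassandra marked the row with a “tombstone” marker, which has a configurable lifespan. The main problem with tombstones occurs when a node goes down for longer than the tombstone’s configured lifespan. The node misses the delete, and by the time it rejoins, the other nodes have already purged the tombstone. The returning node still holds the old row, nothing marks that row as deleted anymore, and it can be copied back to the rest of the cluster.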
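
To make the resurrection problem concrete, here is a minimal toy simulation. It is a sketch under simplifying assumptions, not Cassandra's actual implementation: the names Replica, repair, simulate, and GC_GRACE are hypothetical, time is an abstract counter, and repair is modeled as a plain last-write-wins merge.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

GC_GRACE = 10  # hypothetical tombstone lifespan, in abstract time units


@dataclass
class Cell:
    value: Optional[str]  # None marks a tombstone
    written_at: int


class Replica:
    def __init__(self, name: str) -> None:
        self.name = name
        self.rows: Dict[str, Cell] = {}
        self.up = True

    def write(self, key: str, value: str, now: int) -> None:
        if self.up:
            self.rows[key] = Cell(value, now)

    def delete(self, key: str, now: int) -> None:
        # A delete is recorded as a tombstone, not as an immediate removal.
        if self.up:
            self.rows[key] = Cell(None, now)

    def compact(self, now: int) -> None:
        # Purge tombstones that have outlived GC_GRACE.
        if self.up:
            self.rows = {
                k: c for k, c in self.rows.items()
                if not (c.value is None and now - c.written_at > GC_GRACE)
            }


def repair(replicas: List[Replica]) -> None:
    # Last-write-wins merge: every replica ends up with the newest cell per key.
    merged: Dict[str, Cell] = {}
    for r in replicas:
        for k, c in r.rows.items():
            if k not in merged or c.written_at > merged[k].written_at:
                merged[k] = c
    for r in replicas:
        r.rows = dict(merged)


def simulate(downtime: int) -> bool:
    """Return True if the deleted row comes back after the node rejoins."""
    a, b, c = Replica("a"), Replica("b"), Replica("c")
    replicas = [a, b, c]
    for r in replicas:
        r.write("snapshot-1", "metadata", now=0)
    c.up = False                      # node c goes down and misses the delete
    for r in replicas:
        r.delete("snapshot-1", now=1)
    now = 1 + downtime
    for r in replicas:
        r.compact(now)                # live nodes may purge the tombstone
    c.up = True                       # node c rejoins with its stale row
    repair(replicas)
    cell = a.rows.get("snapshot-1")
    return cell is not None and cell.value is not None


if __name__ == "__main__":
    print("short outage resurrects row:", simulate(downtime=5))
    print("long outage resurrects row:", simulate(downtime=20))
```

When the outage is shorter than GC_GRACE, the tombstone is still present during repair and wins the merge. When the outage is longer, the tombstone is gone and the stale row is the only version left, so it spreads back to every replica.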

Workarounds were found to address each of the above, but the operational overhead for maintaining the best practices was too burdensome. So Rubrik looked for an alternative metadata store.

Rubrik’s criteria for evaluating distributed databases

Rubrik had three clear evaluation criteria for their Cassandra replacement:

Deliver on high consistency guarantees even in the presence of node failures
Be capable of good performance when under stress / high load
Be easy to deploy and maintain

Check out the consistency and load testing framework that Rubrik built to stress test database options for their required criteria. Upon successfully passing these stress tests, Rubrik’s team selected CockroachDB for their metadata store, and moved on to evaluating the migration process.

Migrating off Cassandra with CQLProxy

The most compelling piece of Rubrik’s migration from Cassandra to CockroachDB is Rubrik’s implementation of a stateless translator called CQLProxy. This tool translates CQL (Cassandra Query Language) into the PostgreSQL dialect of SQL, which is the dialect CockroachDB speaks.

(Image credit: Vijay Karthik)

CQL schemas have features like static columns and higher-order column types (think map columns) that do not exist in standard SQL. Rubrik implemented these features in CQLProxy by using extra tables in CockroachDB, which made application development much easier.
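
As an illustration of the "extra tables" idea, the sketch below shows one way a CQL map column could be represented in SQL: a child table keyed by the parent row's primary key plus the map key. This is an assumption about the general technique, not CQLProxy's actual schema or code. The function names, the map_key and map_value column names, and the jobs table are all hypothetical.

```python
def map_column_ddl(table: str, pk: str, pk_type: str,
                   column: str, key_type: str, value_type: str) -> str:
    """Build DDL for a child table that stores one row per map entry."""
    child = f"{table}_{column}"
    return (
        f"CREATE TABLE {child} ("
        f"{pk} {pk_type} NOT NULL REFERENCES {table} ({pk}) ON DELETE CASCADE, "
        f"map_key {key_type} NOT NULL, "
        f"map_value {value_type}, "
        f"PRIMARY KEY ({pk}, map_key))"
    )


def map_put_sql(table: str, pk: str, column: str) -> str:
    """Build a parameterized statement for a CQL-style `SET col[key] = value`."""
    child = f"{table}_{column}"
    return (
        f"INSERT INTO {child} ({pk}, map_key, map_value) VALUES ($1, $2, $3) "
        f"ON CONFLICT ({pk}, map_key) DO UPDATE SET map_value = excluded.map_value"
    )


if __name__ == "__main__":
    # CQL: CREATE TABLE jobs (id uuid PRIMARY KEY, tags map<text, text>)
    print(map_column_ddl("jobs", "id", "UUID", "tags", "TEXT", "TEXT"))
    # CQL: UPDATE jobs SET tags['owner'] = 'alice' WHERE id = ?
    print(map_put_sql("jobs", "id", "tags"))
```

With this layout, setting a single map entry touches one row in the child table instead of rewriting the whole collection, and deleting the parent row removes its map entries through the cascade.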

Rubrik details more of the migration from Cassandra to CockroachDB in Part 2 of their blog series, but it essentially involved two steps:

One-time migration of existing data from Cassandra to CockroachDB
Switch applications to use CQLProxy instead of Cassandra

And with CQLProxy, Rubrik didn’t have to make any changes to their application layers. Otherwise, the complexity of changing application code while also swapping out distributed databases would have been painful.

What Rubrik learned developing on CockroachDB

Generally speaking, the migration to CockroachDB from Cassandra was a win for Rubrik. They had less operational overhead, and the new support for cross-table transactions (through CQLProxy) simplified their application logic. But they did need to find workarounds for a couple of challenges in CockroachDB: clock skew and backpressure.

For the issue with clock skew, Rubrik built a distributed time service that they plugged into CockroachDB. Some of Rubrik’s physical clusters were prone to clock skew because their NTP servers misbehaved. Ordinarily, NTP helps correct clocks (it’s one of CockroachDB’s recommendations for avoiding clock skew). But when the NTP server hiccups, the clocks get too far out of sync.

Kronos, the custom time service built by Rubrik, has the following properties:

It has the same rate of flow as real time.
It returns the same time on all nodes in a cluster.
It is immune to system time jumps.
It is always monotonic. Time can never go backward.
It is fault-tolerant; i.e., it works in the presence of node failures (as long as a quorum, meaning more than half of the nodes, is alive).

Kronos runs inside CockroachDB on each node. It elects an “Oracle” within the cluster to make the time selection. It’s really a fascinating tool that solves an important problem. Go here to read more about Kronos.
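
Two of the properties listed above, immunity to system time jumps and monotonicity, can be sketched in a few lines. The example below is not Kronos's code and leaves out the distributed parts entirely (Oracle election, quorum, cross-node agreement). SteadyClock, now, and sync are hypothetical names, and the idea of feeding in a time value from an Oracle via sync is an assumption made for illustration.

```python
import threading
import time
from typing import Callable


class SteadyClock:
    """Clock that advances with a monotonic tick source and never goes backward."""

    def __init__(self, start_time: float,
                 ticker: Callable[[], float] = time.monotonic) -> None:
        # time.monotonic() is unaffected by system clock updates, so wall-clock
        # jumps (for example from a misbehaving NTP server) do not move this clock.
        self._ticker = ticker
        self._base_tick = ticker()
        self._base_time = start_time
        self._last = start_time
        self._lock = threading.Lock()

    def _read(self) -> float:
        candidate = self._base_time + (self._ticker() - self._base_tick)
        self._last = max(self._last, candidate)  # enforce monotonicity
        return self._last

    def now(self) -> float:
        with self._lock:
            return self._read()

    def sync(self, oracle_time: float) -> None:
        """Adopt a time from the (hypothetical) Oracle, but never step backward."""
        with self._lock:
            current = self._read()
            self._base_tick = self._ticker()
            self._base_time = max(oracle_time, current)
            self._last = self._base_time


if __name__ == "__main__":
    clock = SteadyClock(start_time=time.time())
    first = clock.now()
    clock.sync(first - 3600)  # an Oracle value in the past is ignored
    print("still monotonic:", clock.now() >= first)
```

Taking the maximum of the Oracle value and the local reading is the simplest way to keep time from going backward in this sketch. A real service would also need to bound how far ahead a node can drift and to handle Oracle failover.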

Backpressure was the next challenge Rubrik faced in CockroachDB. This issue crept up when rows were being updated very frequently or when garbage collection was lagging behind the prescribed TTL. Rubrik implemented a few changes that helped reduce the number of backpressure errors they encountered:

Changed the application to decrease the number of updates required for the same row
Added internal retries in their object-relational mapper (ORM) for times when an update fails due to backpressure
Added patches to CockroachDB to more aggressively run garbage collection on ranges prone to backpressure
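
The retry approach from the list above can be sketched as a small wrapper with exponential backoff and jitter. This is a generic pattern, not Rubrik's ORM code. BackpressureError and with_retries are hypothetical names, and how a backpressure failure is actually detected depends on the database driver in use.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class BackpressureError(Exception):
    """Hypothetical error raised when the database rejects a write under backpressure."""


def with_retries(operation: Callable[[], T],
                 max_attempts: int = 5,
                 base_delay: float = 0.05,
                 sleep: Callable[[float], None] = time.sleep) -> T:
    """Run operation, retrying on BackpressureError with jittered exponential backoff."""
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except BackpressureError:
            if attempt == max_attempts:
                raise  # out of attempts: surface the error to the caller
            # The delay ceiling doubles each attempt; jitter spreads out retries
            # so many clients do not hit the same row at the same moment.
            ceiling = base_delay * 2 ** (attempt - 1)
            sleep(random.uniform(0, ceiling))
    raise ValueError("max_attempts must be at least 1")


if __name__ == "__main__":
    calls = {"n": 0}

    def flaky_update() -> str:
        calls["n"] += 1
        if calls["n"] < 3:
            raise BackpressureError("range is under backpressure")
        return "updated"

    print(with_retries(flaky_update), "after", calls["n"], "attempts")
```

Retries only hide transient backpressure. They work best alongside the first change in the list, reducing how often the same row is updated, since that addresses the cause rather than the symptom.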

Learn more about metadata management

At CockroachDB we’re grateful to Rubrik for the contributions they’ve made to our public repo. Collaborating with their team over the last few years to solve problems has been an important learning experience and has improved our product. To learn more about Rubrik’s CockroachDB use case you can watch this video:

If you’re interested in learning more about metadata management you can check out our reference architecture and watch this high level overview video: