Using “Follow-the-Workload” to Beat the Latency-Survivability Tradeoff in CockroachDB

Geographically distributed databases like CockroachDB offer a number of benefits, including reliability, security, and cost-effective deployments. Critics often counter that distributed databases increase latency. What if a database could offer all of the benefits of distribution while also providing low latency?

With this challenge in mind, we set out to minimize latency in CockroachDB, all the while providing exceptional reliability for mission-critical workloads. We built “follow-the-workload” to be a key feature to improve performance and provide additional control to database administrators (DBAs).

This blog post is the second of a two-part series. The first part defined the latency-survivability tradeoff and explained the use of zone configurations; this post builds on that and discusses “follow-the-workload” as a feature.

Optimizing Performance with “Follow-the-Workload”

“Follow-the-workload” is a new feature, released in 1.0, designed to improve performance. Provided localities are enabled, it allows CockroachDB to automatically change the location of the leaseholder (you can think of a “leaseholder” as a read-holder) for a given piece of data to reduce latency.

For each range of data in the cluster, CockroachDB maintains a stable leaseholder replica that serves all requests for the data contained within it. In Consensus, Made Thrive, we explained the reasons for electing a leaseholder for each range that serves all reads and writes on that range. “Follow-the-workload” optimizes the placement of these leaseholders to be closer to the applications that use them.

Every query, whether a read or a write, requires interaction with the leaseholder for the appropriate range. As the distance between the leaseholder and the request increases, so does latency. For example, if a leaseholder located in the U.S. receives a request from a node located in Australia, the request has to travel halfway around the world (and back) before processing, adding 150-200ms of network latency.

[Figure: follow-the-workload-1]

On the other hand, if the leaseholder were located in Australia, processing this same request would add only single-digit milliseconds, if not microseconds, of latency. Without a feature like “follow-the-workload,” users far from the leaseholder are more likely to experience the cross-continent latency described above.

[Figure: follow-the-workload-2]
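To see where numbers of this size come from, here is a back-of-the-envelope sketch. It assumes that light travels through optical fiber at roughly 200,000 km/s, and the two distances are illustrative placeholders rather than measurements from a real deployment. It computes only a lower bound on the network round trip; real routes add switching and queuing delay on top.

```python
# Rough lower bound on network round-trip time over optical fiber.
# Assumption (not from this post): signal speed in fiber is ~200,000 km/s.
FIBER_SPEED_KM_PER_S = 200_000


def round_trip_ms(distance_km):
    """Return the minimum round-trip time in milliseconds for a one-way distance."""
    return 2 * distance_km * 1000 / FIBER_SPEED_KM_PER_S


# Illustrative one-way distances, chosen for the example only.
far = round_trip_ms(15_000)  # leaseholder on another continent
near = round_trip_ms(100)    # leaseholder in a nearby datacenter
print(f"distant leaseholder: {far:.0f} ms, nearby leaseholder: {near:.0f} ms")
```

With these assumed distances, the distant case lands at the low end of the 150-200ms range mentioned above, while the nearby case costs about a millisecond.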

As we’ve illustrated, CockroachDB can reduce latency and drastically improve user experience simply by transferring a range’s lease to a replica closer to the origin of the majority of its requests, without moving any data.

We designed leaseholders to track the number of requests from each locality and automatically move the lease if a large number of requests originate from distant locations on the network (as measured in round-trip service time). The more imbalanced the request load and the further away the source of the requests, the faster CockroachDB will move the lease.

If all the load on the system comes from one locality, then all the ranges in the system will have their leases placed in that locality. This results in faster reads and writes than if the leases were spread around the world without regard to where requests originate.
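The idea can be sketched in a few lines. This is a simplified illustration and not CockroachDB's actual implementation: the class name `LeaseholderStats`, the scoring formula, and the `min_score` threshold are all assumptions made for the example. The score grows with both the request imbalance and the round-trip time to the requesting locality, so a more lopsided and more distant workload crosses the threshold sooner.

```python
from collections import Counter


class LeaseholderStats:
    """Hypothetical per-range request tracker (illustration only)."""

    def __init__(self, locality, rtt_ms):
        self.locality = locality   # locality of the current leaseholder
        self.rtt_ms = rtt_ms       # assumed round-trip times to other localities
        self.requests = Counter()  # request counts keyed by origin locality

    def record(self, origin):
        """Count one request that originated from the given locality."""
        self.requests[origin] += 1

    def transfer_target(self, min_score=20.0):
        """Return the locality that should receive the lease, or None."""
        total = sum(self.requests.values())
        if total == 0:
            return None
        local_share = self.requests[self.locality] / total
        best, best_score = None, min_score
        for origin, count in self.requests.items():
            if origin == self.locality:
                continue
            # Imbalance (difference in request share) weighted by distance.
            score = (count / total - local_share) * self.rtt_ms.get(origin, 0.0)
            if score >= best_score:
                best, best_score = origin, score
        return best


stats = LeaseholderStats("us-east", {"australia": 180.0, "europe": 80.0})
for _ in range(90):
    stats.record("australia")
for _ in range(10):
    stats.record("us-east")
print(stats.transfer_target())
```

In this example 90% of requests come from Australia, so the sketch proposes moving the lease there. A workload split evenly between the leaseholder's locality and another one would score zero and leave the lease where it is.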

This optimization has come to be known as “follow-the-workload” leaseholder rebalancing after the original “follow-the-sun” customer use case that inspired it.

Our customer, a globally operating business, deployed CockroachDB to handle worldwide transactions. After deployment, we noticed that, at any given time, the majority of requests originated from daytime locations. As the day proceeded, the workload would move around the world synchronously with daylight hours, essentially “following the sun”.

We recognized the value in reworking leaseholder placement so that leases track “follow-the-sun”-style workloads. Further, we understood that “follow-the-sun” represents only one special type of workload that would benefit from leaseholder rebalancing via “follow-the-workload”.

“Follow-the-Workload” Use Cases

We expect that “follow-the-workload” will be heavily used by customers who need widely distributed data (e.g. for fault tolerance), and also by those who experience traffic patterns that can be segmented by location.

With “follow-the-workload”, segmentation can be accomplished both by time, as described in “follow-the-sun”, and by primary key. Primary keys can be based on a user’s location (e.g. state or country) and then used to pre-split tables. If DBAs decide to do this, CockroachDB will employ “follow-the-workload” to ensure that the lease for a given range is located nearest to the state or country its users’ requests originate from.
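As a rough illustration of why location-prefixed keys help, the sketch below models a table keyed by `(country, user_id)` and pre-split at country boundaries. The split points, the key layout, and the helper names are hypothetical, and this is not how CockroachDB encodes keys internally. It only shows that each country's rows fall into their own range, so the requests for a given range come mostly from one place.

```python
import bisect

# Hypothetical split points: one per country prefix of a (country, user_id) key.
SPLIT_KEYS = [("AU",), ("DE",), ("US",)]


def range_index(primary_key):
    """Return the index of the range that holds a (country, user_id) key.

    Range 0 holds keys that sort before the first split point; range i holds
    keys from SPLIT_KEYS[i - 1] up to (but not including) SPLIT_KEYS[i].
    """
    return bisect.bisect_right(SPLIT_KEYS, primary_key)


def requests_per_range(keys):
    """Count how many of the requested keys land in each range."""
    counts = {}
    for key in keys:
        idx = range_index(key)
        counts[idx] = counts.get(idx, 0) + 1
    return counts


print(requests_per_range([("AU", 1), ("AU", 2), ("US", 9)]))
```

Because every Australian user's key sorts into the same range, that range's leaseholder sees almost exclusively Australian traffic, which is exactly the kind of imbalance “follow-the-workload” reacts to.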

“Follow-the-workload” will rarely take effect for workloads that do not vary based on user location.

We took “follow-the-workload” one step further by designing it to run automatically. In fact, it may end up finding and optimizing patterns in workloads that DBAs would never have noticed! CockroachDB executes these optimizations on behalf of DBAs, helping to fulfill our mission to “make data easy.”

Future Work & Final Thoughts

Replica “Follow-the-Workload”

We currently only employ “follow-the-workload” for lease behavior. A logical next step would be to expand “follow-the-workload” to data replicas. Moving data replicas costs more than moving leases but may offer additional future latency improvements.

For example, if a range with replicas in localities A, B, and C receives a majority of its requests from locality D (geographically distant from A, B, and C), CockroachDB could create a replica for that range in D (provided the relevant replication zone configurations allow it). We’d love to learn more about your interest in this feature as we continue to evolve our offerings.
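Since this feature does not exist yet, the following is purely a thought experiment. The function name `propose_new_replica` and its inputs are hypothetical, and the set of allowed localities stands in for whatever the replication zone configuration would permit. It proposes a new replica only when a strict majority of requests comes from a permitted locality that has no replica.

```python
def propose_new_replica(replica_localities, request_counts, allowed_localities):
    """Return a locality in which to add a replica, or None (illustration only)."""
    total = sum(request_counts.values())
    if total == 0:
        return None
    # Find the locality that sends the most requests.
    top, count = max(request_counts.items(), key=lambda item: item[1])
    if count * 2 <= total:
        return None  # no single locality sends a strict majority of requests
    if top in replica_localities:
        return None  # a replica is already there; a lease transfer is enough
    if top not in allowed_localities:
        return None  # the (assumed) zone configuration does not permit it
    return top


print(propose_new_replica({"A", "B", "C"}, {"D": 70, "A": 30}, {"A", "B", "C", "D"}))
```

When the majority locality already holds a replica, the sketch returns nothing, because the cheaper lease transfer already covers that case.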

CockroachDB is built to be a high-performance distributed database. We hope that by using features like “follow-the-workload” we will continue to “make data easy” for our users.

To learn more and try it out yourself, click here.
