Improving application performance with Duplicate Indexes

* As of CockroachDB 23.1 we no longer use the duplicate indexes topology pattern. Instead, we use global tables.

When you’re distributing a SQL database geographically, it can be tough to get fast application performance. Data often has to make long, round trip journeys across the world and is restricted by a speed limit (the speed of light).

So what can you do to ensure you’re getting the best performance possible for your use case? Utilizing a “duplicate indexes” topology pattern is a great place to start.

Topology patterns are our recommendations of how to distribute and configure your data in order to optimize performance. That includes things like: how you’re replicating data, to which geographic regions you’re replicating it, and where you’re pinning leaseholders (leaseholders are what coordinate reads and writes for a particular copy of data in CockroachDB). There are a handful of different topology patterns that we recommend in CockroachDB. Each pattern is intended to optimize a database for a particular kind of user experience.

In this blog we’ll discuss how to use the Duplicate Indexes topology when you want extremely quick read performance, and are willing to sacrifice a little bit of write performance to achieve that.

When Read Speed Trumps Write Speed

Say you have a reference table of zip codes that matches zip codes up to geographic regions in the United States. Zip codes don’t change very often, so you’re probably only going to be writing that data once. However, you might need to read that data frequently to look up, for example, the nearest locations of retail stores based on a zip code that a customer entered into your website.

This is an example of the type of use case where the write performance isn’t as much of a concern, because it happens infrequently and the read performance is super important because your users are reading data often and expecting fast performance.

The same read-over-write principle applies to usernames and hashed passwords. When a user comes into your website, and logs into your service, they’re only creating their account once – except for maybe an occasional password update. Since you’re only writing that data once per user, but your users will be logging into your website often, the bulk of your performance relies on read speed.

Distributed Data Improves Survivability, Not Read Speed

If you’re using a legacy SQL database, you might have all your data in a single data center. Let’s say that data center is on the east coast. If you have a customer on the west coast who’s looking up their email and hashed password, it’s going to require round trip communication with your data center on the east coast. That round trip could take in the 100s of milliseconds.

Since we know that experiences that take less than 100 milliseconds feel instantaneous, introducing this lag time means your users are going to be waiting on their login experience. That isn’t great when you have users who might be logging into your service every day.

What we want to do is reduce the length of time it takes to fetch data from a data center so that a user’s login experience feels instantaneous, regardless of where they are geographically. A simple approach to moving data closer to users might be to create a CockroachDB cluster with nodes that are spread out across the entire country. Although you would expect this to improve performance somewhat, it might not work in exactly the way you expect.

If you have your Cockroach cluster with nodes in the east coast, the west coast, and the central region of the United States, Cockroach replicates this data three times. It also tries to spread this data out across as wide a geographic range as possible in order to improve the survivability of your data. By default, you’ll end up having one copy of the data in each region.

However, the issue with having one copy of the data in each region is that only one of these copies is the leaseholder, which again, is what actually coordinates all the reads and writes.

So even though we have three copies of the data in each of these regions, the leaseholder is still only in one region. As a result, you might end up in a scenario where the user is talking to machines in your database that are local to them, but the user’s request is routed to the leaseholder which is in a different region. In that case, we’ve improved survivability, but we haven’t improved performance. That’s where duplicate indexes come in.

Duplicate Indexes Create Instantaneous Reads

Let’s say you have a reference table with emails and hashed passwords. What we’re going to do is create a secondary index on the email column in all three regions and store the hashed passwords along with that secondary index. So, the table itself will be constrained to one geographic region, and we’ll have a secondary index for each of the remaining regions.

The thing to keep in mind about these indexes is that they’re also replicated three times and constrained to a specific geographic region. What this means is you have three copies of the data in each of the three regions. You can see why this comes at the trade-off of write performance, since we’re now having to write this data nine times instead of three.

Because we have to perform those additional writes, and because we’re writing the data in multiple geographic locations, we’re expecting write latency to get a little bit worse. However, we’ve observed that write latency is still faster than 300 milliseconds, so although it doesn’t feel instantaneous, it’s generally acceptable.

What we’re striving to do is get our reads in a single-digit millisecond range. The duplicate indexes topology pattern helps achieve that, because when each of the indexes has three copies constrained to a specific geographic region, it means that each of the regions now has a leaseholder. That way, when a user makes a request for the data, it’ll always come from their local data center, as opposed to a data center in a different region.

The really nice property of the duplicate indexes topology pattern is that users will be able to look up data in data centers that are close to them even if they’re migrating from one geographic region to another. That means they’ll experience fast application performance, and thus have a satisfying user experience, no matter where they are physically.

If you’re ready to deliver the best user experience possible, download CockroachDB and get started today.