How to reduce costs, improve uptime, and increase developer speed with distributed SQL

Modern businesses run on data. Storing, organizing, managing, and accessing the increasingly complex and large volumes of data required to operate any business at scale accounts for a big chunk — 20% — of all IT infrastructure spending.

With the market now shifting from a focus on “growth at all costs” to profitability and more measured growth, companies are looking to make their IT operations more efficient.

At the same time, they can’t slow the pace of innovation. Capital may not be as free-flowing as it once was, but users haven’t gotten any less demanding. Competitors haven’t stopped improving. Leaders in IT infrastructure are faced with a difficult question: how do you accelerate the pace of innovation while reducing costs?

“Do more with less” is an exhaustive new report that answers that question in detail. The short version? Switching to data management tools that are fundamentally easier to live with can dramatically reduce costs.

Get your free copy of the report.

Based on interviews with seven CockroachDB customers conducted by Greenoaks Capital Partners, the report finds that switching to a distributed SQL database like CockroachDB can reduce overall costs for customers by nearly 70%. For customers operating at global scale — companies interviewed for the report include a massive logistics company and a Fortune 50 financial services company — the switch resulted in savings of millions of dollars.

How is that possible? Because while licensing CockroachDB can cost more than legacy SQL databases, those databases come with hidden operational costs that scale up as businesses scale — costs that CockroachDB can reduce and, in many cases, eliminate entirely.

We’d need another five to six engineers just to manage things that CockroachDB automates for us now. CockroachDB has a significantly lower administrator-per-QPS ratio than any other database we have used in production. –Senior Staff Engineer for a logistics platform

The hidden costs of a traditional relational database

Understanding the true cost of a traditional relational database means digging into the labor that’s required to deploy, maintain, and operate it day to day, as well as the labor that’s required to scale it as your company grows.

Ops and management costs

Architecting and deploying a traditional relational database that will function at scale is itself a complex effort that requires specialized personnel such as senior DevOps experts and SREs. You must invest significant time planning the cluster environment, provisioning and configuring machines in the right location, ensuring the right connectivity, and setting up and testing the production environment.

Once the database is in production, significant manual effort is also required to keep it online and performant, and much of that work requires planned downtime. Your team will need to:

Plan and execute database backups.
Implement software patches and version upgrades.
Test and implement schema changes.
Evaluate and fix performance issues.
Regularly audit backups, security, compliance, and more.

Many of these tasks, such as software updates and schema changes, can only be executed when the database is offline. For most businesses, this means scheduling downtime during low-traffic periods such as nights and weekends and, by extension, asking your team to work those hours. For most successful businesses, going offline even during off-peak hours also comes with a significant cost in terms of lost revenue. According to a 2022 study from analyst firm Enterprise Management Associates (EMA) and AIOps company BigPanda, the average cost of downtime is $12,900 per minute, or $774,000 per hour.

And of course, that’s just the average – for large enterprises, the costs are much higher.
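To make the downtime figure concrete, here is a minimal Python sketch of the arithmetic. The per-minute average comes from the study cited above; the function name and the four-hour maintenance window are illustrative assumptions, not figures from the report.

```python
# Average cost of downtime per minute, from the EMA/BigPanda study cited above.
COST_PER_MINUTE_USD = 12_900


def downtime_cost(minutes, cost_per_minute=COST_PER_MINUTE_USD):
    """Estimate the cost of an outage or maintenance window of the given length."""
    if minutes < 0:
        raise ValueError("minutes must be non-negative")
    return minutes * cost_per_minute


# One hour at the average rate matches the hourly figure quoted above.
print(f"1 hour:  ${downtime_cost(60):,}")

# Hypothetical example: a four-hour weekend maintenance window.
print(f"4 hours: ${downtime_cost(4 * 60):,}")
```

Swapping in your own per-minute figure gives a rough first estimate of what a scheduled maintenance window costs your business.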

The complexity of this work is magnified if you choose to implement high-availability configurations such as an active-passive setup. But without such a configuration, your database will be highly vulnerable to unplanned downtime, which can be significantly more costly in both the short term (due to lost revenue) and long term (due to customers who churn because of the poor experience). The labor costs of unplanned downtime are also significant, as engineers will need to scramble to fail over to your backup database, and then manually resolve the inconsistencies between two or more out-of-sync databases once the crisis is over.

Manual sharding and the costs of scaling to address demand

In addition to the significant labor costs associated with day-to-day operations, scaling a relational database horizontally requires partitioning (sharding) it into multiple instances so that the growing workload can be distributed across multiple nodes. And since there are finite limits to the performance gains that can be achieved through vertical scaling, scaling out horizontally is almost always required.

However, manually sharding a database is a time-consuming and complex process. Your team will need to:

Design a table sharding (partitioning) model.
Develop application logic to route data to the correct shard.
Develop application logic to perform joins of data across multiple table shards.
Develop application logic to perform atomic transactions across multiple table shards or design the application to avoid needing cross-shard atomic transactions.
Develop application logic to handle failure scenarios where some shards or partitions are not available.

Manual sharding, in other words, requires making significant changes to the code of your application itself — every part of your application that interacts with the database will need to be reworked. Other elements of your stack that touch the database, such as data pipelines, will also need to be reworked.
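As a rough illustration of the routing logic described above, here is a minimal Python sketch of hash-based shard routing. The shard names, the choice of customer ID as the sharding key, and the modulo scheme are all assumptions made for the example, not something the report prescribes; real deployments often use more elaborate schemes.

```python
import hashlib

# Hypothetical shard names, for illustration only.
SHARDS = ["orders_shard_0", "orders_shard_1", "orders_shard_2", "orders_shard_3"]


def shard_for(customer_id, shards=SHARDS):
    """Route a row to a shard by hashing its sharding key (simple modulo scheme)."""
    digest = hashlib.sha256(customer_id.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(shards)
    return shards[index]


def fraction_moved(keys, old_shards, new_shards):
    """Share of keys whose shard changes when the shard list changes."""
    moved = sum(1 for k in keys if shard_for(k, old_shards) != shard_for(k, new_shards))
    return moved / len(keys)


keys = [f"customer-{i}" for i in range(1000)]
print(shard_for("customer-42"))

# Adding a fifth shard under this naive scheme relocates most existing keys.
print(fraction_moved(keys, SHARDS, SHARDS + ["orders_shard_4"]))
```

Every query path in the application would need to call something like `shard_for` before touching the database. With this naive modulo scheme, adding a fifth shard relocates about four in five keys in expectation, which is one reason resharding tends to be expensive work.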

And the costs associated with manual sharding are not a one-time expense — all of this work will need to be repeated the next time you need to scale up. As more shards are added, the system also becomes increasingly complex and brittle, making sharding and managing the database more and more labor intensive as time goes on.

The costs of scaling to address new markets or regions

There are a wide variety of reasons why businesses choose to deploy their databases across multiple geographical regions. Multi-region deployments can decrease latency for users and increase the resilience of the overall systems. They are also sometimes required to ensure compliance with localized privacy and sovereignty regulations (such as GDPR).

Traditional relational databases, however, were designed for single-instance, single-region deployment. Deploying a traditional RDBMS across multiple regions means sharding it every time you move into a new region. Since sharding is such a labor-intensive process, this massively increases labor costs. Engineering and database administrator (DBA) resources must be allocated to execute the changes, and additional DBA resources will be required to manage each database partition.

Data management costs in numbers: Real-world examples

The research shows that the total cost of data management is almost an order of magnitude higher than the cost of paying a DBMS vendor.

For example, one customer that was using a traditional relational database — a Fortune 50 financial services company — reported that they spent about $800 million on data management in 2021. Just $200 million of that went to database and ETL vendors. The other $600 million was spent on labor.
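The split in that example can be worked through with a short Python sketch. The dollar figures are the ones quoted above; the function name is an illustrative assumption.

```python
def labor_share(total_spend, vendor_spend):
    """Fraction of data management spend that goes to labor rather than vendors."""
    if total_spend <= 0 or not 0 <= vendor_spend <= total_spend:
        raise ValueError("expected 0 <= vendor_spend <= total_spend and total_spend > 0")
    return (total_spend - vendor_spend) / total_spend


# Figures reported by the Fortune 50 financial services company for 2021.
share = labor_share(total_spend=800_000_000, vendor_spend=200_000_000)
print(f"Labor share of data management spend: {share:.0%}")
```

Running the same calculation on your own budget shows how much of your data management spend is labor rather than licensing.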

This kind of expense is not uncommon. As US states legalize sports betting, a major sports betting company that was also part of our research reported that expanding into each new state while keeping their traditional relational database would require an additional 5–10 technical hires. They estimated that each state would cost an additional $1,507,500 just due to the additional technical staff they needed to shard and maintain the database.

“We had been looking at hiring 5 to 10 new technical headcount for each state that we launch in, compared to doing what we are doing now — which is zero per new state.”

Both of these companies — as well as five others detailed in the full report — came to the same conclusion: upgrading to a modern distributed SQL database, despite coming with higher licensing costs, would actually save them millions of dollars.

Understanding the total cost of ownership for cloud databases

While every organization is different, all seven companies studied in the report were spending the bulk of their data management budget on labor. In all cases, significant savings and efficiency gains were available by switching to a modern, distributed SQL database.

Having a database that automates scaling and many of the other labor-intensive operational tasks enables companies to recapture engineering hours and redirect them to value-generating work for the business (such as feature development).

Overall, CockroachDB customers report their teams spent 22% less time operating, scaling, and managing their databases and data-related application development tasks after switching to CockroachDB. That translates to annual data cost savings ranging from $0.53 million (for a smaller startup) to $8 million (for an enterprise company that freed up 25 engineers to work on other projects after switching to CockroachDB).

What could your engineers do if they had 22% more time to work on your product? What could you do with an extra $0.5 million to $8 million freed up from your data management budget?

Learn more about the costs that might be hiding in your legacy RDBMS, and the savings and efficiencies you could unlock with distributed SQL: get your free copy of the report now.