Like all software, CockroachDB builds on the legacy of prior work. CockroachDB takes inspiration from more than four decades of DBMS research, from the early work of Michael Stonebraker to the more recent ideas presented in the Spanner paper. We recognize that we owe a lot to academics and researchers, and today, we’re thrilled to give something back to that community: a research paper of our own.
Over the past few months, a team of our engineers, technical writers, product managers, and sales engineers codified the research and learnings behind CockroachDB. We are now contributing this knowledge back to the very community from which we have benefited, in the hope of further advancing distributed systems research and design.
The research paper, "CockroachDB: The Resilient Geo-Distributed SQL Database", is a labor of love that we are honored to have published by SIGMOD, the Association for Computing Machinery's (ACM) Special Interest Group on Management of Data, which specializes in large-scale data management problems.
"We live in an increasingly interconnected world, with many organizations operating across countries or even continents. To serve their global user base, organizations are replacing their legacy DBMSs with cloud-based systems capable of scaling OLTP workloads to millions of users.
CockroachDB is a scalable SQL DBMS that was built from the ground up to support these global OLTP workloads while maintaining high availability and strong consistency. Just like its namesake, CockroachDB is resilient to disasters through replication and automatic recovery mechanisms.
This paper presents the design of CockroachDB and its novel transaction model that supports consistent geo-distributed transactions on commodity hardware. We describe how CockroachDB replicates and distributes data to achieve fault tolerance and high performance, as well as how its distributed SQL layer automatically scales with the size of the database cluster while providing the standard SQL interface that users expect. Finally, we present a comprehensive performance evaluation and share a couple of case studies of CockroachDB users. We conclude by describing lessons learned while building CockroachDB over the last five years."
We owe a tremendous thank you to the authors of the paper: Rebecca Taft, Irfan Sharif, Andrei Matei, Nathan VanBenschoten, Jordan Lewis, Tobias Grieger, Kai Niemi, Andy Woods, Anne Birzin, Raphael Poss, Paul Bardea, Amruta Ranade, Ben Darnell, Bram Gruneir, Justin Jaffray, Lucy Zhang, and Peter Mattis.
First author Rebecca Taft will present the paper at SIGMOD 2020, the first time Cockroach Labs has presented at the conference. Due to COVID-19, SIGMOD 2020 will be held as a virtual conference from June 14-19, 2020. Rebecca's session is scheduled for Wednesday, June 17, from 1:30pm-3:30pm PDT. Registration is still open, and we highly recommend the opportunity to hear directly from the researchers working on today's data management problems.
"CockroachDB: The Resilient Geo-Distributed SQL Database" is an in-depth resource for anyone looking to read up on recent advancements in modern database tech. It's an effective way to learn some of the core concepts of CockroachDB, and we hope it will inspire a software engineer or two or three to build the next generation beyond what it presents. The paper is available to download and read here.