Baidu Migrates Off Sharded MySQL to CockroachDB
Baidu migrated from sharded MySQL to CockroachDB to automate operations of two production applications that access 2TB of data with 50M inserts a day.
Baidu is a $90B+ internet company serving nearly 1 billion users with web products ranging from search to shopping to cloud storage. Their database engineering team needs to support huge volumes of data while meeting the needs of internal application developers. With CockroachDB, Baidu gets a distributed database that scales horizontally while providing the SQL interface application developers are familiar with. CockroachDB now stores information gathered from across the web to drive interactive customer experiences.
Baidu's adoption of CockroachDB began when several of its engineers started testing it and contributing to it on GitHub. They evaluated CockroachDB against real application workloads, and it was through this testing that Baidu’s engineers became convinced that CockroachDB had the appropriate architecture to support their needs.
Baidu’s Biggest Challenges
Baidu supports nearly a billion users accessing applications in production, and requires infrastructure that can support reliable performance at scale. They were relying on sharded MySQL to do the job with multiple replicated shards and middleware to support critical applications.
Baidu’s database engineering team, however, wanted to try a different approach for a new application that needed to store increasing amounts of data while supporting continuous inserts with highly concurrent and real-time access. This application also needed secondary indexes to speed up queries, as well as support for basic real-time analytics to extract insights from existing data.
Their existing MySQL deployment would have required application developers to transform and modify data at the application layer, while NoSQL databases that sacrifice secondary indexes, aggregations, and transactions would have similarly introduced complexity for application developers. For applications that needed distributed SQL, Baidu’s database engineering team had to stick with a relational database, but sharded MySQL was not a good fit. They needed to invest in a different database.
Baidu’s database engineering team carried out a deep investigation into CockroachDB, contributing to our open source project and providing valuable feature feedback. They found that CockroachDB could handle their new application use case more elegantly than MySQL could, with no middleware for operators to maintain and less complexity for application developers.
Development teams could keep using SQL, while database engineers could provide a faster Recovery Time Objective (target time in which to recover from a disaster) and keep up with the growth needs of their development teams without having to modify or transform data for application-level use. They could add capacity by simply spinning up another server, installing CockroachDB, re-configuring a load balancer, and pointing the new node at an existing cluster. The database would then automatically route query traffic, rebalance, and replicate transparently to developers and operators. Better yet, all of the hardware Baidu provisioned to run CockroachDB was utilized to serve live application traffic.
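As a rough illustration of the "point the new node at an existing cluster" step, the sketch below assembles a `cockroach start` command line in Python. It is a minimal sketch, not Baidu's actual procedure: the hostnames, directories, and the `build_join_command` helper are hypothetical, and the flags shown (`--join`, `--advertise-addr`, `--store`, `--certs-dir`) should be checked against the documentation for the CockroachDB version in use.

```python
from typing import List, Sequence


def build_join_command(
    advertise_addr: str,
    join_addrs: Sequence[str],
    store_dir: str,
    certs_dir: str,
) -> List[str]:
    """Build the argv for starting a node that joins an existing cluster.

    Hypothetical helper: it only assembles the command, it does not run it.
    """
    if not join_addrs:
        raise ValueError("at least one existing node address is required")
    return [
        "cockroach",
        "start",
        f"--certs-dir={certs_dir}",            # TLS certificates for the node
        f"--store={store_dir}",                # where this node keeps its data
        f"--advertise-addr={advertise_addr}",  # address other nodes use to reach it
        f"--join={','.join(join_addrs)}",      # existing nodes to contact on startup
    ]


# Hypothetical addresses and paths, for illustration only.
cmd = build_join_command(
    advertise_addr="node11.internal:26257",
    join_addrs=["node1.internal:26257", "node2.internal:26257"],
    store_dir="/mnt/cockroach-data",
    certs_dir="/etc/cockroach/certs",
)
print(" ".join(cmd))
```

Once the new node has joined, the rebalancing and replication described above happen inside the database; the remaining manual step is adding the node's address to the load balancer.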
Baidu’s database engineering team is now running CockroachDB in production to support two new applications that would have previously used MySQL. These applications access 2TB of data with 50M inserts a day, taking advantage of SQL features like secondary indexes and distributed SQL queries.
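To make the SQL features mentioned above concrete, here is a small sketch of a secondary index, a transactional batch insert, and an aggregation query. It uses Python's built-in `sqlite3` module as a local stand-in so that it runs anywhere; it is not CockroachDB, and the `page_events` table and its columns are invented for illustration rather than taken from Baidu's schema.

```python
import sqlite3

# In-memory SQLite stands in for a SQL database here; it is not CockroachDB.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE page_events ("
    "id INTEGER PRIMARY KEY, site TEXT NOT NULL, "
    "kind TEXT NOT NULL, bytes INTEGER NOT NULL)"
)
# Secondary index on a non-primary-key column to speed up lookups by site.
conn.execute("CREATE INDEX page_events_site_idx ON page_events (site)")

rows = [
    ("a.example", "crawl", 120),
    ("a.example", "crawl", 80),
    ("b.example", "crawl", 300),
]
# The batch runs in one transaction: committed on success, rolled back on error.
with conn:
    conn.executemany(
        "INSERT INTO page_events (site, kind, bytes) VALUES (?, ?, ?)", rows
    )

# A basic aggregation expressed in SQL instead of in application code.
summary = conn.execute(
    "SELECT site, COUNT(*), SUM(bytes) "
    "FROM page_events GROUP BY site ORDER BY site"
).fetchall()
print(summary)
```

The point of the sketch is that indexing, transactions, and aggregation stay in the database layer, which is the work application developers would otherwise have had to take on with a NoSQL store.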
Baidu’s deployment of CockroachDB is simple: ten nodes installed on bare-metal servers, with a load balancer in front of them to distribute traffic. With CockroachDB, Baidu’s DBA team can automate many of their manual processes, including setting up replication, managing rebalancing, and surviving failures.
Baidu’s database engineering team continues to contribute to CockroachDB, helping to build new features and improve the product’s usability. They have also partnered with Cockroach Labs to popularize CockroachDB globally with the first China-based CockroachDB Community Conference (videos from the conference are available here). We are excited to collaborate with Baidu to improve and spread the word about CockroachDB.