It’s been a little over four years since we started our mission to deliver an enterprise-ready distributed SQL database.
Today, we’re excited to release CockroachDB 19.1. With this release, we’ve enhanced our distributed SQL capabilities and expanded our enterprise-grade features for security and data integration. 19.1 continues to solve the challenge of complex, distributed SQL while meeting all the “enterprise” requirements expected of a database.
Here’s Nate Stewart, our VP of Product, with a quick intro on what you can expect in CockroachDB 19.1. And for a deeper tutorial with Nate, register for our CockroachDB 19.1 webinar.
For a full list of features, check out the feature appendix below, or our release notes.
19.1: SQL features, but distributed
A lot of work in our 19.1 release serves our mission of building a distributed SQL database that will allow you to scale more easily while also meeting the needs of even the most demanding applications.
This includes general performance improvements and a lot of small changes to our new cost-based optimizer, which we built from scratch and introduced this past fall.
- Join hints: The optimizer now supports hint syntax to control which join algorithm is used. In most cases, the optimizer is smart enough to pick the right join algorithm based on cost inputs such as cardinality. Occasionally, though, it picks a plan that is less efficient than other available options. Hints give users override control, letting them force a particular join algorithm when the optimizer misses an opportunity to operate at peak efficiency.
- Reverse index scans: Specifying scan direction in an index hint is that weird screwdriver sitting in the bottom of your toolbox. It’s not used that often, but it’s really handy when nothing else fits. Forcing a reverse scan can be really helpful, for example, during performance tuning.
- Optimizer index locality constraints: The cost-based optimizer (CBO) can now take index locality into account, which speeds up read-only queries of reference tables (e.g. postal codes). In a geo-distributed cluster, this can lead to large reductions in latency due to improved data locality and reduced network traffic.
- Automatic statistics: Table statistics have been available for a while now, but we wanted to ease the friction in collecting them. Automatic statistics does the work for you. With automatic statistics enabled, the optimizer has better information for choosing query plans.
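To make the two hint styles above concrete, here is a minimal Python sketch that only assembles SQL text. The helper names (`join_with_hint`, `reverse_index_scan`) and the sample tables and index are hypothetical. The hint syntax shown (`INNER HASH JOIN`, `table@{FORCE_INDEX=index,DESC}`) is an assumption based on the 19.1 documentation, so check it against the docs before relying on it.

```python
# Join algorithms that can be named in a join hint (assumed set for 19.1).
JOIN_ALGORITHMS = {"HASH", "MERGE", "LOOKUP"}


def join_with_hint(left, right, on, algorithm):
    """Build an inner join that asks the optimizer to use one join algorithm."""
    algorithm = algorithm.upper()
    if algorithm not in JOIN_ALGORITHMS:
        raise ValueError(f"unsupported join algorithm: {algorithm}")
    return f"SELECT * FROM {left} INNER {algorithm} JOIN {right} ON {on}"


def reverse_index_scan(table, index):
    """Build a query that forces a descending scan of one index."""
    # Doubled braces produce literal { and } inside an f-string.
    return f"SELECT * FROM {table}@{{FORCE_INDEX={index},DESC}}"


# Hypothetical tables and index, for illustration only.
hinted = join_with_hint(
    "orders", "customers", "orders.customer_id = customers.id", "hash"
)
reversed_scan = reverse_index_scan("orders", "orders_by_date")
print(hinted)
print(reversed_scan)
```

Hints like these are best treated as a tuning tool of last resort: the optimizer usually has the information it needs, and a forced algorithm stays forced even after the data changes.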
Two more SQL highlights in this release:
- Logical Query Plans in the Web UI: Users can now view the sampled execution plan for their queries in the UI, giving them greater visibility into how those queries will be executed and helping to identify bottlenecks.
- Follower reads: We’ve added functionality for running historical queries against nearby replicas rather than leaseholders to dramatically reduce latency in geo-distributed clusters. Queries using AS OF SYSTEM TIME with a sufficiently old timestamp will be automatically directed to the closest replicas rather than the leaseholders. For users who want the most recent timestamp that still allows a query to be served by a follower replica, we’ve added a new built-in function (experimental_follower_read_timestamp) that generates an acceptable timestamp automatically.
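As a rough illustration of the follower-read pattern described above, the Python sketch below assembles a historical query that uses the new built-in. The helper name `follower_read_query` and the `postal_codes` table are hypothetical, and placing the AS OF SYSTEM TIME clause between FROM and WHERE is an assumption about the SQL grammar to verify against the docs.

```python
def follower_read_query(columns, table, where=None):
    """Build a SELECT that reads at a timestamp old enough for follower reads."""
    sql = (
        f"SELECT {', '.join(columns)} FROM {table} "
        # The built-in picks the timestamp, so callers do not have to guess
        # how old is "sufficiently old".
        "AS OF SYSTEM TIME experimental_follower_read_timestamp()"
    )
    if where:
        sql += f" WHERE {where}"
    return sql


# Hypothetical reference table, for illustration only.
query = follower_read_query(["code", "city"], "postal_codes", "code = '10001'")
print(query)
```

The trade-off is staleness: the query sees data as of a slightly older timestamp, which suits reference data that rarely changes.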
19.1: Meeting the demands of the enterprise
Our enterprise customers have a discrete set of requirements around security and integration that we have pushed forward in 19.1. With this release, we extend and deliver on some of these core capabilities.
Change data capture for downstream processing
While CockroachDB is an excellent system of record, it also needs to coexist with other systems. For example, you might want to keep your data mirrored in full-text indexes, analytics engines, or big data pipelines. To that end, we’ve improved our change data capture (CDC) capabilities to allow data to flow more easily to backend warehouses, event-driven microservices, streaming analytics, search indexing, and “big data” applications.
With CDC, CockroachDB 19.1 ensures that writes to watched tables are efficiently and consistently emitted to configurable data sinks. CDC captures data that has changed in a source database as a result of create, update, or delete operations. Change data capture provides efficient, distributed, row-level change feeds into Apache Kafka (including compatibility with the Confluent stack) for downstream processing such as reporting, caching, or full-text indexing.
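To picture what a downstream consumer does with these row-level changes, here is a minimal Python sketch that keeps an in-memory mirror of a watched table. It assumes each message has a JSON array of primary-key values as its key and a JSON object with an `after` field as its value, with `after` set to null for deletes. That message shape, the function name `apply_change`, and the sample rows are assumptions for illustration, and reading from Kafka itself is left out.

```python
import json


def apply_change(mirror, key, value):
    """Apply one change message to a dict that mirrors the watched table."""
    primary_key = tuple(json.loads(key))
    after = json.loads(value).get("after")
    if after is None:
        # Assumed convention: a null "after" means the row was deleted.
        mirror.pop(primary_key, None)
    else:
        # Creates and updates both replace the mirrored row.
        mirror[primary_key] = after
    return mirror


# Hypothetical messages: one create, one update, one delete.
mirror = {}
apply_change(mirror, "[1]", '{"after": {"id": 1, "name": "Ada"}}')
apply_change(mirror, "[1]", '{"after": {"id": 1, "name": "Ada L."}}')
apply_change(mirror, "[2]", '{"after": {"id": 2, "name": "Bob"}}')
apply_change(mirror, "[2]", '{"after": null}')
print(mirror)
```

Because each message replaces or removes a whole row, applying the same message twice leaves the mirror unchanged, which keeps this kind of consumer simple.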
CockroachDB now integrates with existing directory services within an organization, simplifying the management of user accounts through single sign-on and bringing them in line with corporate standards. CockroachDB 19.1 also allows organizations to define policies to encrypt data at rest. Together with encryption in flight, this secures data end to end, protecting the sensitive information typically found in transactional workloads.
Learn more about 19.1 and use it today!
If you are interested in the complete list of features in this release, we’ve added a summary below, and you can also refer to our release notes. We’ve also got a 19.1 webinar coming up on May 9 to introduce some more features of the release. And if you have any questions, please join the conversation over at forum.cockroachlabs.com.
Finally, you can download and use 19.1 today!
Thanks! (from all of us here at Cockroach Labs)
Appendix: Complete list of new features in CockroachDB 19.1
CockroachDB Core Features
- Load-based splitting: CockroachDB will automatically split a range based on load, so users experience fewer hotspots and make better use of their resources.
- Read from Follower with Timestamp Bound: We’ve added functionality for running historical queries against nearby replicas rather than leaseholders to dramatically reduce latency in some geo-distributed clusters. Queries using AS OF SYSTEM TIME with a sufficiently old timestamp will be automatically directed to the closest replicas rather than the leaseholders. For users who want the most recent timestamp that still allows a query to be served by a follower replica, we’ve added a new built-in function (experimental_follower_read_timestamp) that generates an acceptable timestamp automatically.
- Achieve TPC-C 10k using partitioning with 15 nodes: We’ve improved our efficiency and successfully hit this benchmark!
CockroachDB SQL Features
- Schema Change Performance Improvements: In this release, we altered how schema changes are implemented so that we use a bulk ingest to speed up the process by 10x.
- Schema Change Jobs: CockroachDB now allows users to see many schema changes as a ‘job’ in the web UI, giving them greater visibility into when schema changes are complete.
- Schema Changes in Transactions: We have improved CockroachDB’s ability to modify schema elements within a transaction. It now covers newly created columns and indexes, as well as newly created tables.
- Inventory SQL Errors: Error messages are now more descriptive, and include links to our documentation or GitHub where appropriate.
- Parallel Range Scans: CockroachDB now conducts range scans in parallel, which improves performance.
- Improved Vectorized Execution (Prototype): We have demonstrated that a vectorized execution prototype in CockroachDB can dramatically improve performance (up to 3x) for read queries.
- Increased Migration and Integration Support: We fixed some compatibility issues to improve CockroachDB’s overall integrations (Hibernate, Spring ORM, etc.) and customer migrations from Postgres. Examples:
  - SERIAL and INTEGER 32-bit implementation: To improve the customer migration experience, CockroachDB now supports a flag to modify our SERIAL and INTEGER types to run as 32-bit integers, so Postgres customers can use these types without making changes to their data structure.
  - Customized Retry Savepoint Names: CockroachDB now offers a connection setting that lets users name their transaction retry savepoints. This enables us to better integrate with ORMs like Spring and will allow users to take advantage of client-side retries.
- Cost Based Optimizer
  - Correlated Subqueries (Part 2): Building on the last release, we now support nearly all correlated subqueries in the optimizer. Correlated subqueries are nested queries where the inner query relies on values from the outer query, a structure that often leads to inefficiencies. In 2.1, we optimized these inefficient subqueries by de-correlating them. In 19.1, we’ve added support for correlated subqueries that can’t be de-correlated: a new apply operator executes a sub-plan for every row in its input.
  - Query Plan Cache: The optimizer will now cache optimized query plans. This improves performance, because CockroachDB no longer recalculates the optimized plan each time the same query is run and can instead reuse an existing plan.
  - Join Reordering: This functionality allows the optimizer to reorder joins automatically, so it can investigate multiple functionally equivalent plans and pick the one with the best performance. It is controlled by a configurable setting (defaulting to 4); setting it to 0 keeps the joins in the order in which the query was written.
  - Locality Preference: The cost based optimizer can now take locality into account when planning queries, preferring indexes in the same locality to improve overall performance of geo-distributed clusters. This feature relies on per-index zones, which are an enterprise feature, although locality preference itself is not.
  - Query Optimizer Hints: In conjunction with improvements to our cost-based optimizer, CockroachDB now provides support for manually editing query plans generated by the optimizer where the generated plan does not maximize efficiency for a particular workload.
  - Automatic Statistics Collection: CockroachDB will automatically collect statistics for the cost based optimizer to improve the performance of queries. Previously, we assumed all tables had 1,000 rows and all columns had 10% distinct values, which served as a reasonable guide, but having real statistics will ensure the CBO actually optimizes for a specific scenario.
  - Deprecated Heuristics Based Optimizer: In this release, we’ve improved the functionality of features previously released to meet customer expectations and can now serve almost all queries through the cost based optimizer.
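The apply operator mentioned under Correlated Subqueries can be pictured with a small Python sketch. It illustrates the general idea of running a sub-plan once per input row, and is not CockroachDB’s implementation. The function names and sample data are hypothetical.

```python
def apply_join(outer_rows, subplan):
    """Run subplan once per outer row; yield the outer row merged with each result."""
    for outer in outer_rows:
        # The sub-plan sees the current outer row, which is what makes the
        # subquery "correlated".
        for inner in subplan(outer):
            yield {**outer, **inner}


# Hypothetical data: find each customer's largest order.
customers = [{"cust_id": 1}, {"cust_id": 2}, {"cust_id": 3}]
orders = [
    {"order_id": 10, "owner": 1, "total": 5},
    {"order_id": 11, "owner": 1, "total": 9},
    {"order_id": 12, "owner": 2, "total": 3},
]


def largest_order(outer):
    """Sub-plan: the single largest order belonging to the outer customer."""
    matches = [o for o in orders if o["owner"] == outer["cust_id"]]
    return [max(matches, key=lambda o: o["total"])] if matches else []


result = list(apply_join(customers, largest_order))
print(result)
```

The cost is visible in the structure: the sub-plan runs once for every outer row, which is why de-correlating a subquery is preferable whenever it is possible.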
CockroachDB Visibility & Troubleshooting
- Logical Query Plans in the Web UI: Users can now view the execution plan for their query in the UI, giving them greater visibility into how it will be executed and helping to identify bottlenecks.
CockroachDB Ops & Tools
- GSSAPI (Kerberos) Authentication for LDAP/Active Directory: CockroachDB now supports integration with Kerberos, a common enterprise-level authentication protocol. Users can use their corporate credentials to access CockroachDB according to their pre-configured access level.
- Encryption at Rest: For enterprise users, CockroachDB now fully supports encrypting data while it is stored on disk, without requiring changes to client applications.
- Change Data Capture (CDC)
  - Enterprise Implementation (Iteration 2): In this release, we’ve expanded CDC to be production-ready for enterprise customers. This included adding support for new output formats and improving push latencies.
  - Core Implementation: An experimental Postgres protocol-based change data capture implementation is now available for both core and enterprise users. This enables CockroachDB users to consume insert, update, and delete events for watched tables without needing to use our Kafka or Cloud Storage data sinks.
  - Cloud Storage Sink: CockroachDB now has built-in CDC functionality to deliver changefeed data to a cloud storage sink for ingest into OLAP or big data systems, without requiring transport via Kafka.