Modesty in simplicity: CockroachDB's JOIN

CockroachDB’s JOIN: An Early Implementation

When our VP of engineering, Peter Mattis, made the decision in 2015 to support SQL, little did he know that the team would get as far as shipping the first implementation of CockroachDB’s JOIN exactly one year after that. A celebration is in order!

cockroachDB announces JOIN on SQL birthday-300x214-300x214

The good news is that CockroachDB’s JOIN seems to work, as in, “it returns correct results.”

However, we’d like to underline that this is just our first, unoptimimized implementation.

Like, really not optimized. As in “this will cause your client to wait forever and/or your server to stop with an out-of-memory-error if you join large tables” not optimized. And we are sharing this blog post today to explain why this is so and what we are going to do about it.

Of course, a better JOIN is on the roadmap. There will be a version 2, then perhaps 3 and further, as we figure how our users would best like their JOINs to behave.

We are providing the SQL JOIN operation in CockroachDB today first and foremost to establish a baseline. A baseline for testing, for experimentation, for benchmarking. Even if this implementation probably isn’t ready for most production uses yet, by including it now, we’ll be able to ensure that the feature is tested and maintained as we work to improve and expand it.

Under the hood

The primary motivation for this first implementation is to serve as a baseline reference point during tests. While preparing this first shot, we aimed for the simplest, most natural implementation that provides correct result rows, deliberately ignoring the last 40 years of database research on how to perform fast JOINs.

One advantage of this approach is that it’s rather simple to analyze and validate manually. The other advantage is that it is also easy to show and explain! For example, there is really not much more in the code that the following flowchart does not show:

What this flowchart tells you: the algorithm reads all the rows from the JOIN’s right operand into memory and then, for each row read from the left operand, looks up the corresponding row in memory and produces the joined row if a match is found. Some extra conditions here and there deal with the NULL rows produced by OUTER JOIN when no match is found.

The only allowance for cleverness in this code was to make it even simpler: when a query specifies RIGHT OUTER JOIN, we swap the operands and run it with the logic of LEFT OUTER JOIN.

So this time, we optimized for simplicity and thus not for performance. Concretely, given N rows in the left operand and M in the right operand, the time complexity of a single JOIN is O(N×M) for now, and thus quadratic for self-joins, and the space overhead is O(M). This is far from the usual linear time complexity of a moderately optimized ordered join, or the constant space overhead when joining using a key or index. In short, unless your JOIN right operand has very few rows, you probably don’t want to use this for production workloads yet.

What this first JOIN brings to the table

By supporting SQL JOIN, we enable developers to follow many existing SQL tutorials and courses with CockroachDB. It enables more diverse evaluations of CockroachDB and creates a richer environment for learning and tinkering.

For us internally, it provides a basis for comparison during tests. We could not easily reuse existing JOIN tests from other databases because our typing system is slightly different and their test/reference queries do not behave exactly the same with CockroachDB. So we prepared this version to build our own tests.

There is another advantage to our users as well. Through the rest of 2016, we aim to increase compatibility between CockroachDB and popular ORMs. These ORMs make heavy use of database introspection, querying the information schema with JOINs to infer table structure. Our first implementation of JOINs will enable ORM compatibility. The information schema contains mostly small tables, for which performance is not a significant concern.

We like to illustrate our progress during Beta development like this:

Image Credit: Henrik Kniberg

Our JOIN may not be production-ready yet, but at least it’s functional and you can already start using it to discover CockroachDB.

Next steps on the horizon: performance optimizations

Our overall implementation strategy for CockroachDB is to start with those features of SQL that are most useful to our users. In other words, we’d want our JOIN in version 1.0 to be reasonably fast for the most common use cases.

In OLTP applications, we see that most requests join products with categories, prices with tax rates, customers with organisations, etc. The common characteristic of these use cases is that the join column is either a primary key or has an index. That’s where we plan to optimize first: use the ordering information from the JOIN operands and reduce the operation’s time complexity back to linear.

From that point forward, we will also start applying some of the state-of-the-art in database theory. For cases where the database doesn’t provide a native ordering for joined data, we’ll perform sorting and merging instead of nested loops, and prepare the sorted data on-disk instead of keeping it all in memory where it doesn’t fit. Likely, hash joins will also appear in CockroachDB before long.

Meanwhile, we have already started working on a new distributed query execution engine a few months ago. This feature is intended to parallelize the execution of complex operations over multiple nodes in the cluster. In particular, sorting and merging and hash joins have “natural” extensions in a distributed implementation. There are caveats of course, like the fact we can’t promise fast results if you simultaneously issue complex JOINs from clients connected to all nodes in the cluster! (Parallelization brings more speed only if there are more available nodes to execute the query than there are nodes demanding query results.) However, we expect a performance boost for common workloads using distributed SQL execution.

Expect more blog posts to explain our distributed query engine later in 2016. For now, we can already share that we used Google’s research on Sawzall for inspiration.

Beyond that, we’ll work on optimization as we learn how our users like to use CockroachDB. We do not believe it is wise to already start implementing classical optimizations before we know which are applicable to the sort of workloads our users want to throw at a distributed database. We suspect we may need to develop new types of optimizations for CockroachDB because the bottlenecks will not appear in the same places as in traditional RDBMs. Stay tuned!

A note about feature parity

If you’re following the development of CockroachDB, you may have noticed several choices that seem to follow a pattern. We support SQL. We support the same client/server SQL protocol as PostgreSQL. We support SQL functions with the same names as PostgreSQL, and we even often reuse the same error messages! And we now support JOINs, including FULL OUTER JOINs which are not common but are supported by PostgreSQL.

The temptation might be great to deduce that we want to match PostgreSQL feature-wise, and perhaps suggest we’ll soon claim to become a drop-in replacement! However, please resist this temptation.

First of all, we believe that attempting to reach feature parity this early would be rather unwise, not to say unrealistic. CockroachDB has less than ten man-years invested in its SQL features, where PostgreSQL already has accumulated hundreds. Also, we are currently spending quality time on CockroachDB’s internals to stabilize things and guarantee scalability; making our SQL front-end much larger now would create inertia against productive work on the core technology.

Furthermore, our goal for CockroachDB version 1.0 is that the product can support new businesses and projects built around OLTP workloads. A basic functional JOIN was a strong demand for the “minimum relational toolbox”; we now plan to make it fast for those queries where our users need that. This also means we’d love to hear from you and learn how you like to use JOIN.

That said, we’re not actively bumping compatibility, including performance compatibility, off the road map, either. We promise you will see compatibility increase over time.

Take correlated sub-queries, for example, which are not that common in application code (they can be replaced by JOINs in many workloads). Yet we acknowledge they make the life of developers easier. Supporting them with proper performance optimizations will enable more tutorial / course examples to work right away with CockroachDB. So that too will come to CockroachDB, in time.

Closing words

On this one year anniversary of the decision to support SQL, JOINs were added to CockroachDB. This first implementation is simple and straightforward; it may be slow and memory-hungry, but it is designed to serve a particular purpose. And we already started working on better algorithms for 1.0.

Of course, we could have kept this simple implementation to ourselves, for internal testing only. Or we could have released without any promotion or documentation. But that’s just not our way. We are committed to keeping our progress transparent to the community. This is where we are now, and from here on we can only do better.

If building out distributed SQL engines puts a spring in your step, then good news — we're hiring! Check out our open positions here.