CockroachDB + ActiveRecord (and Ruby on Rails!)

Update on June 17, 2020: since initially publishing this post in 2017, we’ve now completed full support for Active Record. Also, CockroachDB does now fully support Common Table Expressions.

In our first blog post on ORM support, Cuong Do detailed how we provided support for Hibernate, a full-featured Java ORM, in CockroachDB. He also discussed our general motivation for providing support for popular ORMs in CockroachDB: to make it as easy as possible for developers to build applications with CockroachDB using a variety of languages and frameworks.

Today, we’ll be discussing what went into providing support for another ORM: ActiveRecord! We have existing documentation on how to use ActiveRecord with CockroachDB, but we wanted to explain a little bit more about this support and go into some detail about how we built it.

Why ActiveRecord?

ActiveRecord is the ORM that powers Ruby on Rails, one of the most popular open-source frameworks out there for developing web applications, and one of the most popular open-source projects of all time. Rails has enjoyed wide popularity since its release in 2003, boasting status as the foundation for hundreds of thousands of web applications, including those of big-name companies like Twitch, Square and GitHub. Rails also has a great community that has inspired many programming newcomers to build their first apps, including global organizations like Rails Girls that are breaking down traditional barriers to learning to program in an era where anyone should be able to build an idea and put it on the web.

Besides making CockroachDB robust, scalable and survivable, one of our guiding principles here is to make it easy to use. Operating a modern SQL database at scale should be easy - and it’s one of our goals to make CockroachDB a joy to develop for. So it’s only natural that we’d want to provide support for Ruby on Rails, a framework with the guiding principle of optimizing for developer happiness! The results from our developer surveys, market research, and informal questionnaires on our developer forum only confirmed what we already suspected - that we definitely needed to support Rails users who wanted to use CockroachDB.

Background

We made the decision early on to support the PostgreSQL wire protocol, so that client drivers that work with PostgreSQL would work with CockroachDB without any extra fuss. This has enabled support for the most popular PostgreSQL drivers for at least 9 languages without requiring any language-specific work. When you have a team as small as ours, any kind of multiplicative effect like this really goes a long way.

However, there was one hazard of supporting PostgreSQL’s wire protocol that we didn’t foresee when we made the decision: PostgreSQL’s wide and full-featured SQL API. It turns out that when people implement language drivers for PostgreSQL, they tend to expose a lot of that API for use of higher-level components like ORMs. As we saw in the Hibernate blog post, ORMs tend to need to do a lot of really complicated introspection on the data and schemas stored in the database to get their job done, and when programmers are implementing ORM support for Postgres, it’s hard to resist using the fancier and more powerful bits of Postgres’s API surface. This creates a challenge for us at Cockroach Labs: we’ve already got our hands full trying to build a scalable, survivable and consistent database; we’d be foolish to presume that we could simultaneously rebuild 30 years of Postgres’s features into CockroachdB v1.x! On the other hand, we also want to avoid reinventing the wheel on the ORM side - implementing a CockroachDB adapter for every ORM out there is not something that we’re able to do with the resources that we have.

So, how do we walk that fine line? For Hibernate, we chose to go the route of teaching CockroachDB to properly understand all of the queries that Hibernate issues to Postgres, since we thought that the features that enable them were popular enough that they’d get a good deal of use across many ORMs and client drivers. What about ActiveRecord?

Common Table Expressions

credit: hyperbole and a half

We started off using the same approach for ActiveRecord that we did for Hibernate: implement all the (Postgres) things! We made a valiant attempt, adding support for at least 12 features that ActiveRecord uses in Postgres. In the end, however, our efforts were stymied by ActiveRecord’s use of a really cool PostgreSQL feature that we don’t support yet: Common Table Expressions.

What are Common Table Expressions (CTEs), you ask? They’re a common SQL extension that permits breaking up large queries into smaller chunks, which, surprise, makes the lives of developers easier! They allow a query to be built up as a series of temporary views in an efficient way, which can clarify what the query is doing if it’s very long or complicated. ActiveRecord uses CTEs in the following statement, which is used to retrieve the primary keys for a table:

See that WITH pk_constraint AS statement at the beginning of the query, and the , cons AS half way down? That’s a CTE! The query creates pk_constraint as a temporary view that contains the result of the SELECT query within parentheses, uses that view to create another temporary view, cons, that runs a window function over pk_constraint, and then uses cons to produce the final query result at the end.

Implementing CTEs correctly is tricky, and requires some backend infrastructure improvements that we aren’t quite ready to start working on. So, we decided to punt on implementing them until our SQL backend is good and ready. Stay tuned for more developments on that topic! In the meantime, though, we were stuck - we couldn’t proceed with our standard strategy of implementing all of the features that the Postgres adapter expects because we found one that was significantly more weighty in implementation cost than, say, adding a new table to our pg_catalog implementation.

A Hybrid Solution: a lightweight adapter

We decided to proceed by taking a hybrid approach between creating a complete ActiveRecord database adapter and implementing all of the features that the Postgres adapter uses in our backend. We created a lightweight ActiveRecord adapter gem that extends from the existing Postgres adapter, modifying just the bits that we couldn’t support, such as that query that uses a CTE. Unsurprisingly, it’s called activerecord-cockroachdb-adapter, and you can find it on GitHub.

This gem, which is at present available only for Rails 5.x, allows Rails users to use CockroachDB as their backend database. All that’s necessary is changing a few configuration options in config/database.yml to instruct Rails to use our adapter, like so:

For a complete example of how to use CockroachDB with Ruby on Rails, check out our example repo.

To all our friends using ActiveRecord and Ruby on Rails, we have some good news: we both delivered a bit more PostgreSQL compatibility and the bit of necessary glue in the shape of an ActiveRecord adapter gem! We consider this to be an important achievement in our ongoing quest to bring CockroachDB’s scalability to a wider audience. The story is not finished yet, however: we need you, friendly Ruby developer, to help us improve it further. Please, try it out in your apps and give us your feedback!

💖💖💖

CockroachDB + ActiveRecord (and Ruby on Rails!)

Why ActiveRecord?

Background

Common Table Expressions

A Hybrid Solution: a lightweight adapter

Summary: we did our homework, please share your love too!