Building Support for Java ORM Hibernate in CockroachDB

We’re working hard to make CockroachDB scalable and robust, yet simple to use. One way we’ve approached this is by adding support for existing object-relational mappers (ORMs), which make it fast and easy to develop applications that interact with CockroachDB. To determine which ORMs to support first, we asked users of our forum, conducted developer surveys, and performed other research. Hibernate came up most often, likely because of Java’s large developer community and Hibernate’s popularity within that. So, we focused on that first and are excited to announce that Hibernate now works with CockroachDB. The rest of this post describes the implementation.

Initial Goals

Originally, we had the ambitious goal to add support for five ORMs in three months. We also had a stretch goal of supporting two additional ORMs. As is often the case with software estimates, we were overly optimistic. After three months, even our most basic Hibernate app wasn’t working with CockroachDB.

Bootstrapping ORM Support

We had two options for quickly implementing ORM support:

  1. Implement the SQL features needed for ORMs to function. This was feasible, because the wire protocol and SQL dialect CockroachDB supports most closely resembles PostgreSQL’s. Because many ORMs support PostgreSQL, this option involved adding SQL features that ORMs need but CockroachDB was lacking.
  2. Create a CockroachDB adapter for each ORM. Though SQL has a set of standards, each commonly used database has significant deviations from standards and/or proprietary SQL features. This requires ORMs to have an adapter for each database they connect to. So, we’d create an adapter (also referred to as a “dialect”) that performed ORM-specific tasks using features CockroachDB already supports.

We wanted to release ORM support quickly, and after some investigation, we discovered significant overlap in the SQL features ORMs need. So, we went with option #1.

Option #2 would likely have resulted in more robust ORM support that would leverage CockroachDB’s unique strengths, but it would have taken more initial development time (and arguably more ongoing maintenance) than #1. Additionally, other tools, such as database administration and schema migration tools, use the same SQL features as ORMs. In short, we decided that leveraging prebuilt adapters by fitting the interface they expected would be less effort in the long run than creating a custom adapter for each individual ORM or tool.

So, what are those features? ORMs issue two basic types of queries. The first set consists of CRUD (CREATE, READ, UPDATE, DELETE) queries. This was relatively straightforward given our existing implementations of those SQL statements. Minor issues existed, such as incompatible date and time formats, but were simple to fix.

The vast majority of the work involved supporting schema synchronization.

Schema Synchronization

One of the important responsibilities of an ORM is making sure that the tables, columns, and indices in the database are consistent with the models as defined by the code. For example, consider this Hibernate ORM model:

private static final SessionFactory sessionFactory =
            new Configuration()
                    .configure("hibernate.cfg.xml")
                    .addAnnotatedClass(Account.class)
                    .buildSessionFactory();

@Entity
@Table(name="accounts")
public static class Account {
    @Id
    @Column(name="id")
    public long id;

    @Column(name="balance")
    public long balance;
}

During startup, Hibernate performs the following actions:

  1. Creates the accounts table if it doesn’t exist. Hibernate “knows” about the table because of the function call addAnnotatedClass(Account.class).
  2. Adds the id column to accounts if it doesn’t exist and makes it the primary key for accounts.
  3. Adds the balance column to accounts if it doesn’t exist.

Hibernate knows how it wants the accounts* table to be structured based on the @Table, @Column, and @Id annotations. Now, it needs to know if accounts already exists and, if it does, what its current schema is.

Schema Synchronization: pg_catalog vs. information_schema

Schema synchronization makes the database schema consistent with the models (e.g. Account in the above example). To do this, Hibernate must retrieve the database schema through the mechanisms provided by the database.

information_schema is a standard set of views that SQL defines to expose the structure of a database for introspection by clients. Unfortunately, the ORMs we’ve tested do not use this when connected to PostgreSQL. Instead, they use the relations in the PostgreSQL-specific pg_catalog schema. So, we’ve implemented the subset of pg_catalog views needed for Hibernate. This allows a Hibernate app to connect to CockroachDB as though it were PostgreSQL.

This process was relatively straightforward, but involved significant changes to SQL execution, the type system, and a variety of other code. It also exercised parts of our SQL engine that were previously underused. For example, consider this query that Hibernate issues while running a sample app:

SELECT NULL::text AS PKTABLE_CAT,
       pkn.nspname AS PKTABLE_SCHEM,
       [15 lines omitted]
       CASE con.confdeltype
           WHEN 'c' THEN 0
           WHEN 'n' THEN 2
           WHEN 'd' THEN 4
           WHEN 'r' THEN 1
           WHEN 'a' THEN 3
           ELSE NULL
       END AS DELETE_RULE,
       [8 lines omitted]
FROM pg_catalog.pg_namespace pkn,
     pg_catalog.pg_class pkc,
     pg_catalog.pg_attribute pka,
     pg_catalog.pg_namespace fkn,
     pg_catalog.pg_class fkc,
     pg_catalog.pg_attribute fka,
     pg_catalog.pg_constraint con,
     pg_catalog.generate_series(1, 32) pos(n),
     pg_catalog.pg_depend dep,
     pg_catalog.pg_class pkic
WHERE pkn.oid = pkc.relnamespace
  AND pkc.oid = pka.attrelid
  AND pka.attnum = con.confkey[pos.n]
  AND con.confrelid = pkc.oid
  AND fkn.oid = fkc.relnamespace
  AND fkc.oid = fka.attrelid
  AND fka.attnum = con.conkey[pos.n]
  AND con.conrelid = fkc.oid
  AND con.contype = 'f'
  AND con.oid = dep.objid
  AND pkic.oid = dep.refobjid
  AND pkic.relkind = 'i'
  AND dep.classid = 'pg_constraint'::regclass::oid
  AND dep.refclassid = 'pg_class'::regclass::oid
  AND fkn.nspname = 'public'
  AND fkc.relname = 'orders'
ORDER BY pkn.nspname,
         pkc.relname,
         con.conname,
         pos.n

It’s reasonable for you to have looked at this query for a few minutes and still not understand what it’s trying to do. This query retrieves the foreign keys for the orders table in the public schema, using the relations in pg_catalog.

Hibernate is not unique in using pg_catalog instead of information_schema. Several other ORMs, including the other four we prioritized - Sequelize, GORM, SQLAlchemy, and ActiveRecord - use pg_catalog as well. Fortunately, we also confirmed that these ORMs mostly use the same parts of pg_catalog that Hibernate does. Supporting Hibernate took us about three months of development time; each additional ORM took three to four weeks.

Lessons Learned

In addition to Hibernate, we’ve also implemented support in CockroachDB for ActiveRecord (Ruby), SQLAlchemy (Python), GORM (Go), and Sequelize (Node.js). More ORMs will come later as community demand arises. Also, we’ve realized that compatibility with Postgres’s SQL dialect requires at least limited compatibility with Postgres’s abstractions, such as generator functions, OID type support, and rudimentary ARRAY support. Schema synchronization also required more efficient joins. We are still considering future efforts to create custom client adapters that will use CockroachDB more natively and thus more efficiently.

Want to Use Another ORM?

We may already be working on it. You can first search for the ORM in our issue tracker. If you find a relevant issue, please voice your support with a GitHub reaction.

Pull requests to add ORM-related features are greatly appreciated. CockroachDB is open-source and has already benefitted from significant community pull requests. If that’s not feasible, please file an issue that includes a test script or unit test that uses the ORM. It’s easier to prioritize support if we don’t first have to invest in understanding it and its incompatibilities.

Build a Hibernate/CockroachDB App Today

After about 3 months of work by 4 engineers, our test app finally worked with Hibernate. The size of Hibernate ORM (over 700,000 lines of code) means that this is the starting point for CockroachDB’s support of Hibernate and, as always, we’ll need our community’s help to evolve from here. Please file an issue if things aren’t working as expected.

If you want to build an application with Hibernate ORM and CockroachDB, get started today by reading how to build a Java app with Hibernate. If you build an app, please let us know at [email protected].

Which language drivers and ORMs does CockroachDB support?

Read the Docs