Building an application with CockroachDB and SQLAlchemy

CockroachDB’s support for SQLAlchemy is currently in beta, but we’re actively developing new features to improve the integration. You can find the documentation here.

One of the great things about CockroachDB’s support for SQL is the wide variety of frameworks and tools for working with SQL data. Today, we’ll demonstrate this by building a simple application in Python, using SQLAlchemy and Flask.

Adapting SQLAlchemy to CockroachDB

Every SQL database is a little bit different, so a library like SQLAlchemy requires some code (called a dialect) to adapt its interface to the database in use. CockroachDB is similar enough to PostgreSQL that SQLAlchemy’s built-in PostgreSQL dialect gets us most of the way there, but we still need a few tweaks that can be found in our cockroachdb python package. As of this writing, those tweaks are:

Reflection in CockroachDB uses commands like SHOW TABLES instead of the pg_tables database.
The best type for an automatic ID column is INT DEFAULT unique_rowid()
instead of SERIAL.
We don’t yet supportforeign keys.
We require special SAVEPOINT statements for the most efficient transaction retries. This will be discussed in detail below.

To use this package, simply pip install cockroachdb and configure SQLAlchemy with a URL that begins with cockroachdb:// instead of postgresql://.

Getting Started

We’re going to start with Flask-SQLAlchemy’s example app. You can follow along as we make our changes, or get the finished product from our repo.

Install pip if you don’t already have it.
Install flask-sqlalchemy and the cockroachdb python package: $ pip install flask-sqlalchemy cockroachdb
If you haven’t already, install CockroachDB and start a server. The rest of this tutorial assumes that you’re running a local cluster in insecure mode.
Clone the cockroachdb/examples-python repository and move into the flask-sqlalchemy directory: $ git clone https://github.com/cockroachdb/examples-python $ cd examples-python/flask-sqlalchemy/
Run this shell script to create a database for the application, grant an example user access to the database, and create the tables:
#!/bin/sh set -ex cockroach sql --insecure -e 'DROP DATABASE IF EXISTS exampleflasksqlalchemy' cockroach sql --insecure -e 'CREATE DATABASE exampleflasksqlalchemy' cockroach sql --insecure -e 'GRANT ALL ON DATABASE exampleflasksqlalchemy TO example' python -c 'import hello; hello.db.create_all()'
Finally, run the application: $ python hello.py
Visit http://localhost:5000 in your browser to see the application running.

Handling restarted transactions

Unlike most relational databases, CockroachDB uses optimistic concurrency control instead of locking. This means that when there is a conflict between two transactions one of them is forced to restart, instead of waiting for the other to complete. Transactions are sometimes forced to restart due to deadlocks even in databases that don’t use optimistic concurrency control, but it’s much less common, so many applications just return an error and don’t even attempt to retry. In CockroachDB, restarted transactions are common enough that it’s important to handle them correctly.

We provide a function to help with this: cockroachdb.sqlalchemy.run_transaction. It’s a little more cumbersome to use, because you can no longer use the global db.session or Model.query objects, but this protects you from accidentally reusing objects via the Session from outside the transaction.

Here is one function modified to use run_transaction; you can see the rest of the changes in this diff:

from flask import Flask import sqlalchemy.orm from cockroachdb.sqlalchemy import run_transaction app = Flask(__name__) app.config.from_pyfile('hello.cfg') db = SQLAlchemy(app) sessionmaker = sqlalchemy.orm.sessionmaker(db.engine) @app.route('/new', methods=['GET', 'POST']) def new(): if request.method == 'POST': if not request.form['title']: flash('Title is required', 'error') elif not request.form['text']: flash('Text is required', 'error') else: def callback(session): todo = Todo(request.form['title'], request.form['text']) session.add(todo) run_transaction(sessionmaker, callback) flash(u'Todo item was successfully created') return redirect(url_for('show_all')) return render_template('new.html')

Under the hood, run_transaction() is using the SAVEPOINT statement. This is a standard SQL statement (normally used to support nested transactions) that CockroachDB uses in a special way. We don’t support nested transactions, but we do support the special case of a single SAVEPOINT that covers the entire transaction:

BEGIN; SAVEPOINT cockroach_restart; INSERT INTO todos VALUES (...); -- first attempt ROLLBACK TO SAVEPOINT cockroach_restart; -- failed; try again INSERT INTO todos VALUES (...); -- second attempt RELEASE SAVEPOINT cockroach_restart; -- success! COMMIT;

RELEASE SAVEPOINT is a kind of COMMIT. In fact, for the special case of SAVEPOINT cockroach_restart, RELEASE SAVEPOINT is the COMMIT. The transaction is fully committed at this point; the final COMMIT is just to match the first BEGIN. By structuring the transaction in this way, the server is able to preserve some information about the previous attempts to allow the retries to complete more easily, which it couldn’t do if the retries were independent top-level transactions.

Conclusion

There’s much more that we haven’t covered (for example, SQLAlchemy’s “core” API is supported as well if you don’t want to use the ORM/Session layer), but we hope this post serves as an introduction to using CockroachDB with an existing framework. Fans of other frameworks may also want to check out the implementation of the cockroachdb package to see what is involved in adapting a new framework to CockroachDB.