As part of our recent 1.1 release, we’re introducing an extension to our SQL dialect, a new syntax permitting the parallel execution of transactional SQL statements. The feature can be used to batch the execution of multiple independent statements, exploit intrinsic inter-statement parallelism, and with a little luck, speed up many common interactions with CockroachDB.
Statement parallelization is documented here.
The Cost of Communication
CockroachDB maintains strong consistency across a globally-distributed cluster of computers. In doing so, it can provide up-to-date information across the globe while providing high availability even in the face of faulty computers and networks. We believe strong consistency is a requirement for modern day persistence layers. After all, if you can’t trust your database, what can you trust?
But consistency at scale comes at a price: communication latency. When a client writes data to CockroachDB, the update is replicated through a consensus protocol. This protocol assures that the change is stored safely across multiple machines, so that even failures occurring at the worst of times won’t result in inconsistencies or data losses. Distributed systems connoisseurs will recall that standard consensus protocols achieve these properties by requiring a majority of replicas to agree on an update before committing it. In turn, this means that a write to CockroachDB requires coordination between multiple computers within your cluster.
Unfortunately, until our interns learn how to break the speed of light, long distance communication will continue to take time. For instance, sending a single message from California to Japan and back is usually estimated to take about 120 ms. So when CockroachDB nodes are spread across the world1, the communication latency required for each write can begin to add up!
To help limit the impact this communication latency has on SQL performance, we began exploring another idea: what if we performed multiple writes at the same time, effectively amortizing the communication latency of a write across multiple SQL statements. This led us to the idea of parallel statement execution, where multiple SQL statements within a transaction are all run together2.
Parallel Statements in a Serial World
The challenge with introducing any form of parallelism in SQL is that the language is traditionally exposed as a statement at-a-time conversational API. This means that a client sends a statement to a SQL database, the database server processes the statement, and finally the statement’s results are sent back to the client. Only at this time can the next statement be issued. In order to gain any parallelism between these statements, we first needed to break this mold.
Interactions with SQL databases are slightly different when newer statement-batching modes are employed. For instance, the JDBC batching API allows a client to send a batch of queries to the server all-at-once before waiting for their results. The importance of this functionality is that it avoids a client-server round trip for each statement in the batch. However, support for this form of batching is client specific. Furthermore, even when statements are sent as batches to a SQL database, the expectation is that they’ll be executed sequentially. This has implications on error handling which prevents the batch from being transparently run in parallel by default3.
Because we wanted all clients to be able to exploit parallelism between statements, even when sending single statements at a time to CockroachDB, we decided to introduce a new syntax called
RETURNING NOTHING. Usually, statements like
UPDATE return the number of rows affected. However, these statement types can be augmented with a
RETURNING clause to specify a set of columns from the affected rows to return. Much like other
RETURNING clauses, our new
RETURNING NOTHING clause is appended on to the end of statements like
UPDATE. Instead of telling the statement to return a set of columns though, this new syntax does exactly what it says; it tells the statement to return nothing. In fact, by the time a statement with
RETURNING NOTHING specified returns, it hasn’t even begun executing yet. Instead, the statement has simply been queued up to run asynchronously. In doing this, we create an opportunity for CockroachDB to return to the client with nothing but an acknowledgement and allow the client to issue more statements before the first one has finished executing. When a second statement is issued, it can then be run in parallel with the first statement. Furthermore, if a chain of statements within a transaction are run with the
RETURNING NOTHING syntax, they can all be run in parallel!
The only consideration we need to be careful with is error handling with parallel statements. For instance, imagine a parallel statement attempts to insert a duplicate primary key. As usual in SQL, we would expect this to raise a unique constraint violation. But wait, if we haven’t actually run the statement by the time we return to the client, then how do we communicate this error? Our approach is to report the error message for any parallel statements on the next statement that does not include the
RETURNING NOTHING. We call this statement a “barrier” statement because it synchronizes the group of parallel statements before itself executing. Because statements within a transaction will always either be followed by another statement or by a
COMMIT (which can act as the barrier statement), every parallel group will have a barrier statement. This means that clients will need to be prepared to accept errors for the parallel statements when issuing these barriers. It’s important to note, though, that this will never affect the outcome of a transaction because transactions already have all-or-nothing semantics and will be aborted if they ever see any error.
In the following diagram we can see a timeline of a transaction containing a series of
UPDATE statements, first run sequentially and then in parallel. It’s interesting to point out that the execution time of the three
UPDATE statements are overlapped, indicating that they are being run in parallel. Also notice that in this case, the
COMMIT statement acts as the barrier statement for the parallel group.
The parallel statement documentation page has more of these diagrams, which are useful for visualizing the execution timeline of parallel statements under different circumstances from a SQL client’s perspective.
An Analysis of Dependencies
When an application speaks SQL, it expects that the result of each statement will be fully reflected in the state of the system by the time it executes the next statement. For instance, when a certain row is inserted and then updated, we expect that the
INSERT statement will be executed first so that the
UPDATE statement has something to modify. However, when statements are run in parallel, it is possible that their execution will be reordered. We didn’t want this small detail preventing improved performance, so when designing parallel statements, we addressed the problem head on.
We built off an insight exploited across transactional models for decades, and more recently some consensus protocols: two non-interfering operations can reorder their execution without conflict. An implication of this principle is that the serial execution of non-interfering operations is equivalent to their parallel execution. Applying this idea, we decided to perform full dependency analysis on-the-fly between each parallel statement. This gives us the ability to allow independent statements to be reordered while enforcing a strict ordering between dependent statements.
Those who have followed CockroachDB over the years may be familiar with our approach to mapping SQL onto a KV keyspace. For those who aren’t, the details are described in this blog post. This mapping proved extremely useful when performing dependency analysis. Before executing a statement, we predict the maximal set of key spans that it may read from and the maximal set of key spans that it may write to. We then define statements independence as the condition where given two statements, neither of a statement’s read or write sets overlap the other’s write set.
Given this definition for statement independence and the ability to predict upper bounds on statement influence, we can then run independent statements at the same time, while respecting the execution order of dependent statements even when both are run with
RETURNING NOTHING. In effect, this means that it is always safe to run statements using
RETURNING NOTHING, and that the end result of running statements using the feature will always be the same as when running them without it.
Banking on Parallelism
The desire to parallelize statement execution within transactions was something we heard from a few early CockroachDB users, and it’s easy to see why. Transactions are the primary means of interacting with CockroachDB, as they provide full ACID guarantees. When an application requires that a group of statements be run atomically, there’s no way around them. But oftentimes the statements within a transaction themselves don’t depend on one another. Furthermore, the results of each statement within the transaction are very rarely used to influence other statements. These are the use cases where parallel statement execution shines.
One sample transaction presented to us by a user dealt with a monetary transfer within a bank. The transaction began by looking up information on two account holders, the participants of the transfer. If the sender was observed to have enough money to facilitate the transfer, it would be initiated. During a subsequent ledger transaction, five statements were executed. The transfer record was first added to a transfer log using an
INSERT statement. Next, transfer “legs” were
INSERTED for each account into the transfer_leg table. Finally, both participants’ accounts were
UPDATED to reflect the change in balance. Of course, this all needed to be performed in a transaction, unless the bank was ok with money occasionally going missing.
The complete transaction looked like:
BEGIN; SELECT id, balance FROM account WHERE id IN (account1, account2); -- result used by client INSERT INTO transfer (id) VALUES (transfer_id); INSERT INTO transfer_leg (account_id, amount, running_balance, txn_id) VALUES (account1,-amount,balance1-amount,transfer_id); INSERT INTO transfer_leg (account_id, amount, running_balance, txn_id) VALUES (account2,amount,balance2+amount,transfer_id); UPDATE account SET balance = balance1-amount WHERE id = account1; UPDATE account SET balance = balance2+amount WHERE id = account2; COMMIT;
Take a moment to think about why each statement is needed in the transaction. It’s interesting to note that other than the first SELECT, the results of the other five statements, the rows affected by each, are not used by the application. Regardless, because of how standard SQL works, each of these statements must executed in series.
BEGIN S1-----\S1 S2-----\S2 S3-----\S3 S4-----\S4 S5-----\S5 S6-----\S6 COMMIT
Parallel statement execution was built specifically for this kind of use case. By adding a
RETURNING NOTHING clause to each
UPDATE statement, we can speed the transaction up significantly. The new transaction looks like:
BEGIN; SELECT id, balance FROM account WHERE id IN (account1, account2); -- result used by client INSERT INTO transfer (id) VALUES (transfer_id) RETURNING NOTHING; INSERT INTO transfer_leg (account_id, amount, running_balance, txn_id) VALUES (account1,-amount,balance1-amount,transfer_id) RETURNING NOTHING; INSERT INTO transfer_leg (account_id, amount, running_balance, txn_id) VALUES (account2,amount,balance2+amount,transfer_id) RETURNING NOTHING; UPDATE account SET balance = balance1-amount WHERE id = account1 RETURNING NOTHING; UPDATE account SET balance = balance2+amount WHERE id = account2 RETURNING NOTHING; COMMIT;
When run, none of the parallel statements will finish executing by the time they return. Instead, they will all start executing in the background and will synchronize on the
COMMIT statement, which acts as the barrier statement in this situation.
BEGIN S1-----\S1 S2-----\S2 S3-----\S3 COMMIT S4-----\S4 S5-----\S5 S6-----\S6
The timeline of statement execution once parallel statement execution is employed looks something like this. First, we can see that S1 (statement 1, the
SELECT statement) needs to be run before all other statements, because it is not being parallelized. Next, we can see that S2 runs before S3 and S4. This is because the transfer_leg table has a foreign key on the transfer table which it must check before performing the
INSERT. This is a case where dependency analysis prevents us from doing anything incorrect. Finally, the two
UPDATES are run in parallel with the three
INSERTS. In theory these two updates could be run in parallel themselves because they touch different primary keys, but our dependency analysis is conservative and as of version 1.1 considers these two updates to be dependent.
These timelines are nice for visualizing how statements will be executed with and without parallel statement execution, but they are unable to demonstrate realized performance improvements. To explore this, we ran a load generator against a three node cluster that repeatedly executed the sample transaction both with and without the use of parallel statements. While doing so, we simulated wide-area network latency using the Linux
tc utility. The results from this experiment have been plotted below. The first thing to note is that even without any simulated network latency between nodes, parallel statement execution results in a 34% speedup. We can also see that as we increase simulated inter-node round-trip time, the relative speedup we observe by running statements in parallel grows. In fact, as this latency becomes the dominant factor in transaction runtime, this speedup approaches 43%, which is the idealized speedup because we’re reducing the number of serialized round-trip calls from 7 to 4 (
BEGIN does not require a network call).
As with all performance experimentation, the results must be taken with a grain of salt. The true test is whether these benefits translate over to improvements on real workloads.
This post has introduced the new parallel statement execution feature available in CockroachDB 1.1. By using the
RETURNING NOTHING syntax, statements are allowed to execute in parallel with each other. This permits statements to transparently exploit any intrinsic parallelism between themselves, which can lead to large performance gains. All the while, dependency analysis assures that statements are always safe to parallelize and that results produced with the optimization will always mirror those produced without it.
Parallel statement execution is still experimental, but we’re excited to see developers play with it. Feel free to try it out in your apps and give us feedback!
In some cases it may be desirable to spread data across the world so that data is always close to the users who need it. In others, operators may want to speed up commit latencies by collocating replication zones. Whatever the need, CockroachDB allows operators to fine-tune the geographical distribution of their data. For more, see the docs on configuring replication zones.
Note that this features aims to parallelize execution between multiple statements. We’ve also invested significant effort in exploiting parallelism within the execution of a single statement. For more on this, see our blog post on distributed query processing in CockroachDB.
There is an open proposal in CockroachDB to run these statements in parallel as well, whenever possible (see An Analysis of Dependencies), but this will most likely be put behind a session variable.
Illustration by Anna Hill