Relational databases are great for transactional workloads. But things can get messy when multiple transactions start trying to access the same data at the same time. Luckily, in many SQL databases there’s a solution for that:
SELECT FOR UPDATE.
SELECT FOR UPDATE is a SQL command that’s useful in the context of transactional workloads. It allows you to “lock” the rows returned by a
SELECT query until the entire transaction that query is part of has been committed. Other transactions attempting to access those rows are placed into a time-based queue to wait, and are executed chronologically after the first transaction is completed.
This is useful because it prevents the thrashing and unnecessary transaction retries that would otherwise occur when multiple transactions are attempting to read those same rows. Any time multiple transactions are likely to be working with the same rows at roughly the same time,
SELECT FOR UPDATE can be used to increase throughput and decrease tail latency (compared to what you would see without using it).
In other words:
SELECT FOR UPDATE makes contended transactions process more smoothly (which generally also means they process more quickly and efficiently).
SELECT FOR UPDATE is useful for any transactional workload where multiple transactions might attempt to access the same rows at the same time.
Often, the reason for this is related to the way different systems handle transactions more broadly. For example, SQLite does not need
SELECT FOR UPDATE because initiating a transaction locks the entire database. This allows SQLite databases to offer serializable transactional isolation, the highest level of isolation, for ironclad consistency. However, locking the entire database during a transaction that might only affect a few rows has obvious negative implications for performance, particularly at scale.
However, choosing between performance and consistency is not necessary! CockroachDB, for example, offers serializable isolation, but it does not have to lock the entire database at the beginning of a transaction to make that happen.
SELECT FOR UPDATE can be used to maximize database performance in the event of concurrent transactions working on the same rows, and the end result (in the case of CockroachDB) is still a database with serializable isolation.
Different SQL databases handle transaction isolation, and thus
SELECT FOR UPDATE, differently, so it’s important to be familiar with the options and defaults for the system you’re using.
Let’s take a look at how
SELECT FOR UPDATE works. We’ll use CockroachDB syntax and parameters here, but the syntax for other SQL databases that support
SELECT FOR UPDATE will be similar.
Imagine we’re working with a database that includes the following table
A complete transaction that uses
SELECT FOR UPDATE on that table could look like this:
BEGIN; SELECT * FROM kv WHERE k = 1 FOR UPDATE; UPDATE kv SET v = v + 5 WHERE k = 1; COMMIT;
Working line by line through the statement above:
BEGIN, initiates the transaction.
SELECTstatement, identifies the rows that will be impacted and locks them. In this case, it’s a single row: the first row of the table.
vcolumn of the row.
COMMIT;commits the transaction.
If we were to run this transaction on our example table, the result would be this:
What’s important here, though, is that the
SELECT … FOR UPDATE line (line 2) locked the row that the transaction updated. If another transaction (let’s call it Tx 2) hit the database attempting to operate on the same row while this transaction (Tx 1) was processing, Tx 2 would be added to the queue for processing after Tx 1 commits, rather than beginning to execute, failing, and having to retry because Tx 1 changed one of the values Tx 2 was accessing while it was processing.
Again, different database systems allow for different parameters and modifications of
A common one, for example, is
SELECT … FOR SHARE, which provides a weaker form of row-locking in some database systems. In PostgreSQL,
SELECT … FOR UPDATE completely locks the relevant rows, whereas
SELECT … FOR SHARE locks the relevant rows only for updates and deletes.
In contrast, because CockroachDB always guarantees serializable isolation and there are no “weaker” locking levels,
SELECT … FOR SHARE functions identically to
SELECT … FOR UPDATE. The
FOR SHARE syntax is supported only for Postgres compatibility.
Another common parameter is
NOWAIT, which returns an error immediately if a transaction is not able to immediately lock a row. In SQL syntax,
NOWAIT appears directly after
FOR UPDATE, like so:
SELECT * FROM kv WHERE k = 1 FOR UPDATE NOWAIT;
SKIP LOCKED is also a parameter supported by some databases that allows waiting transactions to skip locked rows temporarily so that the “hold” on those rows doesn’t slow the processing of elements of the transaction impacting non-locked rows. CockroachDB doesn’t currently support this, in part because it uses multiversion concurrency control. This forum thread has some excellent information for achieving the same goals without using
Let’s be real: while there’s nothing wrong with a nice GUI, doing things with the command line is just …Read More