A follower read is performed on the nearest replica relative to the SQL gateway that is executing the SQL statement regardless of the replica's leaseholder status. Using the nearest replica can reduce read latencies and increase throughput. Applications in multi-region deployments especially can use follower reads to get improved performance.
Follower read types
A strong follower read is a read taken from a Global table. Such tables are optimized for low-latency reads from every region in the database. The tradeoff is that writes will incur higher latencies from any given region, since writes have to be replicated across every region to make the global low-latency reads possible. For more information about global tables, including troubleshooting information, see Global Tables.
A stale follower read is a historical read taken from the nearest replica. You should use stale follower reads only when your application can tolerate reading stale data, since the results of stale follower reads may not reflect the latest writes against the tables you are querying.
The following table summarizes the read types and how to accomplish them.
|Strong Reads||Stale Reads|
|Only From Leaseholder||
|From Nearest Replica||
Stale follower reads
CockroachDB provides the following types of stale follower reads:
- Exact staleness read: A historical read as of a static, user-provided timestamp. See Exact staleness reads.
- Bounded staleness read: A historical read that uses a dynamic, system-determined timestamp to minimize staleness while being more tolerant to replication lag than an exact staleness read. See Bounded staleness reads.
Stale follower reads are always served from a consistent view; CockroachDB does not allow a historical read to view uncommitted data.
Exact staleness reads
An exact staleness read is a historical read as of a static, user-provided timestamp.
For requirements and limitations, see Exact staleness reads and long-running writes and Exact staleness read timestamps must be far enough in the past.
When to use exact staleness reads
Use exact staleness follower reads when you:
- Need multi-statement reads inside transactions.
- Can tolerate reading older data (at least 4.8 seconds in the past), to reduce the chance that the historical query timestamp is not quite old enough to prevent blocking on a conflicting write and thus being able to be served by a local replica.
- Do not need the increase in availability provided by bounded staleness reads in the face of network partitions or other failures.
- Need a read that is slightly cheaper to perform than a bounded staleness read, because exact staleness reads don't need to dynamically compute the query timestamp.
Run queries that use exact staleness follower reads
SELECT statement with an appropriate
AS OF SYSTEM TIME value is an exact staleness follower read. You can use the convenience function
follower_read_timestamp(), which returns a
TIMESTAMP that provides a high probability of being served locally while not blocking on conflicting writes.
Use this function in an
AS OF SYSTEM TIME statement as follows:
SELECT ... FROM ... AS OF SYSTEM TIME follower_read_timestamp();
Exact staleness follower reads demo
The following video describes and demonstrates exact staleness follower reads.
Exact staleness follower reads in read-only transactions
BEGIN; SET TRANSACTION AS OF SYSTEM TIME follower_read_timestamp(); SELECT ... SELECT ... COMMIT;
Follower reads are "read-only" operations; you cannot use them in read-write transactions.
AS OF SYSTEM TIME follower_read_timestamp() on all implicit and explicit read-only transactions by default, use one of the following options:
- Set the
default_transaction_use_follower_readssession variable to
default_transaction_use_follower_reads=on, all read-only transactions use exact staleness follower reads.
- Execute the
SET SESSION CHARACTERISTICS AS TRANSACTION AS OF SYSTEM TIME follower_read_timestamp()SQL statement. This has the same effect as setting the session variable as shown above.
You can set
default_transaction_use_follower_reads on a per-role basis; for instructions, see Set default session variable values for a role.
Bounded staleness reads
A bounded staleness read is a historical read that uses a dynamic, system-determined timestamp to minimize staleness while being more tolerant to replication lag than an exact staleness read. Bounded staleness reads also help increase system availability, since they provide the ability to serve reads from local replicas even in the presence of network partitions or other failures that prevent the SQL gateway from communicating with the leaseholder.
When to use bounded staleness reads
Use bounded staleness follower reads when you:
- Need minimally stale reads from the nearest replica without blocking on conflicting transactions. This is possible because the historical timestamp is chosen dynamically and the least stale timestamp that can be served locally without blocking is used.
- Can confine the read to a single statement that meets the bounded staleness limitations.
- Need higher availability than is provided by exact staleness reads. Specifically, what we mean by availability in this context is:
- The ability to serve a read with low latency from a local replica rather than a leaseholder.
- The ability to serve reads from local replicas even in the presence of a network partition or other failure event that prevents the SQL gateway from communicating with the leaseholder. Once a replica begins serving follower reads at a timestamp, it will always continue to serve follower reads at that timestamp. Even if the replica becomes completely partitioned away from the rest of its range, it will continue to stay available for (increasingly) stale reads.
Run queries that use bounded staleness follower reads
To get a bounded staleness read, use one of the following built-in functions:
||Defines a minimum timestamp at which to perform the bounded staleness read. The actual timestamp of the read may be equal to or later than the provided timestamp, but cannot be before the provided timestamp. This is useful to request a read from nearby followers, if possible, while enforcing causality between an operation at some point in time and any dependent reads. This function accepts an optional
||Defines a maximum staleness interval with which to perform the bounded staleness read. The timestamp of the read can be at most this stale with respect to the current time. This is useful to request a read from nearby followers, if possible, while placing some limit on how stale results can be. Note that
Start the demo cluster:
SELECT code FROM promo_codes AS OF SYSTEM TIME with_max_staleness('10s') where code = '0_explain_theory_something';
code ------------------------------ 0_explain_theory_something (1 row)
The query returns successfully.
If it had failed with the following error message, you would need to troubleshoot your query to ensure it meets the conditions required for bounded staleness reads.
ERROR: unimplemented: cannot use bounded staleness for queries that may touch more than one row or require an index join SQLSTATE: 0A000 HINT: You have attempted to use a feature that is not yet implemented. See: https://go.crdb.dev/issue-v/67562/v21.2
You can verify using
EXPLAINthat the reason this query was able to perform a bounded staleness read is that it performed a point lookup from a single row:
EXPLAIN SELECT code FROM promo_codes AS OF SYSTEM TIME with_max_staleness('10s') where code = '0_explain_theory_something';
info -------------------------------------------------------------------------------- distribution: local vectorized: true • scan estimated row count: 1 (0.10% of the table; stats collected 4 minutes ago) table: promo_codes@primary spans: [/'0_explain_theory_something' - /'0_explain_theory_something'] (7 rows)
Verify that a cluster is performing follower reads
To verify that a cluster is performing follower reads, go to the Custom Chart Debug Page in the DB Console and add the metric
follower_read.success_count to the time-series graph. The number of follower reads performed by your cluster will be shown.
How stale follower reads work
Each CockroachDB range tracks a property called its closed timestamp, which means that no new writes can ever be introduced at or below that timestamp. The closed timestamp is advanced continuously on the leaseholder, and lags the current time by some target interval. As the closed timestamp is advanced, notifications are sent to each follower. If a range receives a write at a timestamp less than or equal to its closed timestamp, the write is forced to change its timestamp, which might result in a transaction retry error.
With follower reads, any replica in a range can serve a read for a key as long as the time at which the operation is performed (i.e., the
AS OF SYSTEM TIME value) is less than or equal to the range's closed timestamp.
When a gateway node in a cluster receives a request to read a key with a sufficiently old
AS OF SYSTEM TIME value, it forwards the request to the closest node that contains a replica of the data—whether it be a follower or the leaseholder.
For further details, see An Epic Read on Follower Reads.
- Exact staleness reads and long-running writes
- Exact staleness read timestamps must be far enough in the past
- Bounded staleness read limitations
Exact staleness reads and long-running writes
Long-running write transactions will create write intents with a timestamp near when the transaction began. When an exact staleness follower read encounters a write intent, it will often end up in a "transaction wait queue", waiting for the operation to complete; however, this runs counter to the benefit exact staleness reads provide.
To counteract this, you can issue all follower reads in explicit transactions set with
BEGIN PRIORITY HIGH AS OF SYSTEM TIME follower_read_timestamp(); SELECT ... SELECT ... COMMIT;
Exact staleness read timestamps must be far enough in the past
If an exact staleness read is not using an
AS OF SYSTEM TIME value far enough in the past, CockroachDB cannot perform a follower read. Instead, the read must access the leaseholder replica. This adds network latency if the leaseholder is not the closest replica to the gateway node. Most users will use the
follower_read_timestamp() function to get a timestamp far enough in the past that there is a high probability of getting a follower read.
Bounded staleness read limitations
Bounded staleness reads have the following limitations:
- They must be used in a single-statement (aka implicit) transaction.
- They must read from a single row.
- They must not require an index join. In other words, the index used by the read query must be either a primary index, or some other index that covers the entire query by
SELECT code FROM promo_codes AS OF SYSTEM TIME with_max_staleness('10s') LIMIT 1; ERROR: unimplemented: cannot use bounded staleness for queries that may touch more than one row or require an index join SQLSTATE: 0A000 HINT: You have attempted to use a feature that is not yet implemented. See: https://go.crdb.dev/issue-v/67562/v21.2
As noted by the error message, this query cannot be served as a bounded staleness read because in this case it would touch more than one row. Even though we used a
LIMIT 1 clause, the query would still have to touch more than one row in order to filter out the additional results.
EXPLAIN SELECT code FROM promo_codes LIMIT 5;
info ------------------------------------------------------------------------------- distribution: full vectorized: true • scan estimated row count: 1 (0.10% of the table; stats collected 1 minute ago) table: promo_codes@primary spans: LIMITED SCAN limit: 1 (8 rows)
The output verifies that this query performs a scan of the primary index on the
promo_codes table, which is why it cannot be used for a bounded staleness read.
For an example showing how to successfully perform a bounded staleness read, see Run queries that use bounded staleness follower reads.
- Follower Reads Topology
- Cluster Settings Overview
- Load-Based Splitting
- Network Latency Page
- Enterprise Features