Publication date: August 20, 2021
Description
Cockroach Labs has discovered a bug in which setting the sql.trace.txn.enable_threshold
cluster setting to a non-0 value causes nodes to crash loop in CockroachDB v21.1.0 to v21.1.6. The cause of this bug was a misuse of an internal tracing API.
Statement
This is resolved in CockroachDB by PR #68002, which fixes the miuse of the tracing API.
The fix has been applied to maintenance releases of CockroachDB v21.1, and is available in v21.1.7+.
This bug is publicly tracked as #68005.
Mitigation
The mitigation is to upgrade to CockroachDB v21.1.7+ or to restore from a backup that did not have the cluster setting set to a non-0 value.
Impact
Setting the sql.trace.txn.enable_threshold
cluster setting to a non-0 value causes all nodes in the cluster to permanently crash loop. To mitigate, upgrade to CockroachDB v21.1.7+, restore a backup into a v20.2 cluster, or restore a backup without the cluster setting set into a v21.1 cluster.
All deployments running CockroachDB v21.1.0 to v21.1.6 are affected. Upgrading a v20.2 cluster to an affected version of v21.1 that has the cluster already set will also be affected by the bug. Such an upgrade can be mitigated by rolling back to v20.2 unless the version upgrade was finalized. Cockroach Labs acknowledges the issue and has issued a fix that is available in v21.1.7.