Upgrade to CockroachDB v21.2

Because of CockroachDB's multi-active availability design, you can perform a "rolling upgrade" of your CockroachDB cluster. This means that you can upgrade nodes one at a time without interrupting the cluster's overall health and operations.

This page describes how to upgrade to the latest v21.2 release, v21.2.16.

Terminology

Before upgrading, review the CockroachDB release terminology:

  • A new major release is published every 6 months. The major version number indicates the year of release followed by the release number, which will be either 1 or 2. For example, the latest major release is v21.2 (also written as v21.2.0).
  • Each supported major release is maintained across patch releases that fix crashes, security issues, and data correctness issues. Each patch release appends its patch number to the major version number. For example, patch releases of v21.2 use the format v21.2.x.
  • All major and patch releases are suitable for production usage, and are therefore considered "production releases". For example, the latest production release is v21.2.16.
  • Prior to an upcoming major release, alpha and beta releases and release candidates are made available. These "testing releases" are not suitable for production usage. They are intended for users who need early access to a feature before it is available in a production release. These releases append the terms alpha, beta, or rc to the version number.
Note:

There are no "minor releases" of CockroachDB.
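The naming rules above can be sketched as a small shell helper. This is an illustration only: `classify_release` is a hypothetical function, not part of the cockroach CLI, and it assumes the full `vYY.R.P` form (so `v21.2.0` rather than `v21.2`).

```shell
# classify_release VERSION
# Prints "production" for vYY.R.P, "testing" for vYY.R.P-alpha.N, -beta.N,
# or -rc.N, and "unknown" for anything else. Hypothetical helper.
classify_release() {
  case "$1" in
    v[0-9][0-9].[12].[0-9]*) ;;             # vYY.R.<patch>..., where R is 1 or 2
    *) echo unknown; return 0 ;;
  esac
  rest="${1#v[0-9][0-9].[12].}"             # e.g. "16" or "0-beta.1"
  patch="${rest%%-*}"                       # text before any "-suffix"
  suffix="${rest#"$patch"}"                 # "" or e.g. "-beta.1"
  case "$patch" in
    *[!0-9]*) echo unknown; return 0 ;;     # patch number must be all digits
  esac
  case "$suffix" in
    "")                                    echo production ;;
    -alpha.[0-9]*|-beta.[0-9]*|-rc.[0-9]*) echo testing ;;
    *)                                     echo unknown ;;
  esac
}
```

For example, `classify_release v21.2.16` reports a production release, while `classify_release v21.2.0-beta.1` reports a testing release.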

Step 1. Verify that you can upgrade

Run cockroach sql against any node in the cluster to open the SQL shell. Then check your current cluster version:

> SHOW CLUSTER SETTING version;
Note:

If you are upgrading from any cluster version prior to v21.1, then before upgrading from v20.2 to v21.1, you must ensure that any previously decommissioned nodes are fully decommissioned. Otherwise, they will block the upgrade. For instructions, see Check decommissioned nodes.

To upgrade to v21.2.16, you must be running either:

  • Any earlier v21.2 release: v21.2.0-beta.1 to v21.2.15.

  • A v21.1 production release: v21.1.0 to v21.1.21.

If you are running any other version, take the following steps before continuing on this page:

Version | Action(s) before upgrading to any v21.2 release
Pre-v21.1 testing release | Upgrade to a corresponding production release; then upgrade through each subsequent major release, ending with a v21.1 production release.
Pre-v21.1 production release | Upgrade through each subsequent major release, ending with a v21.1 production release.
v21.1 testing release | Upgrade to a v21.1 production release.
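
As a rough sketch, the eligibility rules above can be encoded in a shell function that takes a binary version tag. `can_upgrade_directly` is a hypothetical helper, and the version bounds are copied from this page (v21.2.0-beta.1 to v21.2.15, or any v21.1 production release), so they would need adjusting for a different target release.

```shell
# can_upgrade_directly VERSION
# Exit status 0 if VERSION can upgrade straight to v21.2.16 under the rules
# on this page, non-zero otherwise. Hypothetical helper.
can_upgrade_directly() {
  case "$1" in
    v21.2.0-beta.[0-9]*|v21.2.0-rc.[0-9]*) return 0 ;;  # v21.2 testing releases from beta.1 on
    v21.[12].*[!0-9]*)                     return 1 ;;  # other testing or malformed tags
    v21.1.[0-9]*)                          return 0 ;;  # v21.1 production release
    v21.2.[0-9]*)                          [ "${1#v21.2.}" -lt 16 ] ;;  # earlier v21.2 patch
    *)                                     return 1 ;;  # needs intermediate upgrades first
  esac
}
```

A script could then guard the rest of the procedure with `can_upgrade_directly "$current" || exit 1`, where `$current` holds the version tag of the running binary.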

When you are ready to upgrade to v21.2.16, continue to step 2.

Step 2. Prepare to upgrade

Before starting the upgrade, complete the following steps.

Check load balancing

Make sure your cluster is behind a load balancer, or your clients are configured to talk to multiple nodes. If your application communicates with a single node, stopping that node to upgrade its CockroachDB binary will cause your application to fail.

Check cluster health

Verify the overall health of your cluster using the DB Console:

  • Under Node Status, make sure all nodes that should be live are listed as such. If any nodes are unexpectedly listed as SUSPECT or DEAD, identify why the nodes are offline and either restart them or decommission them before beginning your upgrade. If there are DEAD and non-decommissioned nodes in your cluster, it will not be possible to finalize the upgrade (either automatically or manually).

  • Under Replication Status, make sure there are 0 under-replicated and unavailable ranges. Otherwise, performing a rolling upgrade increases the risk that ranges will lose a majority of their replicas and cause cluster unavailability. Therefore, it's important to identify and resolve the cause of range under-replication and/or unavailability before beginning your upgrade.

  • In the Node List:

    • Make sure all nodes are on the same version. If any nodes are behind, upgrade them to the cluster's current version first, and then start this process over.
  • In the Metrics dashboards:

    • Make sure CPU, memory, and storage capacity are within acceptable values for each node. Nodes must be able to tolerate some increase in case the new version uses more resources for your workload. If any of these metrics is above healthy limits, consider adding nodes to your cluster before beginning your upgrade.
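
The "same version" check can also be scripted from the command line. The sketch below assumes that `cockroach node status --format=tsv` prints a header row containing a `build` column; `distinct_builds` is a hypothetical helper that locates the column by name rather than by position.

```shell
# distinct_builds: read `cockroach node status --format=tsv` output on stdin
# and print each distinct value of the `build` column, one per line.
# Assumes the first line is a tab-separated header that includes "build".
distinct_builds() {
  awk -F '\t' '
    NR == 1 { for (i = 1; i <= NF; i++) if ($i == "build") col = i; next }
    col && $col != "" { print $col }
  ' | sort -u
}

# Example usage (not run here):
#   cockroach node status --certs-dir=certs --format=tsv | distinct_builds
```

If the function prints more than one line, the nodes are not all on the same version.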

Check decommissioned nodes

Check the membership field in the output of cockroach node status --decommission. Nodes with decommissioned membership are fully decommissioned, while nodes with decommissioning membership have not completed the process. If there are decommissioning nodes in your cluster, this will block the upgrade.

If you are upgrading from any cluster version prior to v21.1, then before upgrading from v20.2 to v21.1, you must manually change the status of any decommissioning nodes to decommissioned. To do this, run cockroach node decommission on these nodes and confirm that they update to decommissioned.

If a decommissioning process hangs, recommission and then decommission those nodes again, and confirm that they update to decommissioned.
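
To spot blocking nodes without reading the table by eye, the membership field can be filtered with a short script. This sketch assumes the tab-separated output of `cockroach node status --decommission --format=tsv` has a header row with `id` and `membership` columns; `decommissioning_nodes` is a hypothetical helper.

```shell
# decommissioning_nodes: read `cockroach node status --decommission --format=tsv`
# output on stdin and print the id of every node whose membership is
# "decommissioning" (that is, not yet fully decommissioned).
decommissioning_nodes() {
  awk -F '\t' '
    NR == 1 {
      for (i = 1; i <= NF; i++) {
        if ($i == "id") id = i
        if ($i == "membership") m = i
      }
      next
    }
    id && m && $m == "decommissioning" { print $id }
  '
}

# Example usage (not run here):
#   cockroach node status --decommission --certs-dir=certs --format=tsv | decommissioning_nodes
```

Empty output means no node is stuck in the decommissioning state.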

Review breaking changes

Review the changes in v21.2. If any affect your deployment, make the necessary changes before starting the rolling upgrade to v21.2.

  • Interleaved tables and interleaved indexes have been removed. Before upgrading to v21.2, convert interleaved tables and replace interleaved indexes. Clusters with interleaved tables and indexes cannot finalize the v21.2 upgrade.
  • Previously, CockroachDB only supported the YMD format for parsing timestamps from strings. It now also supports the MDY format to better align with PostgreSQL. A timestamp such as 1-1-18, which was previously interpreted as 2001-01-18, will now be interpreted as 2018-01-01. To continue interpreting the timestamp in the YMD format, the first number can be represented with 4 digits, 2001-1-18.
  • The deprecated cluster setting cloudstorage.gs.default.key has been removed, and the behavior of the AUTH parameter in Google Cloud Storage BACKUP and IMPORT URIs has been changed. The default behavior is now that of AUTH=specified, which uses the credentials passed in the CREDENTIALS parameter, and the previous default behavior of using the node's implicit access (via its machine account or role) now requires explicitly passing AUTH=implicit.
  • We have switched types from TEXT to "char" for compatibility with PostgreSQL in the following columns: pg_constraint (confdeltype, confmatchtype, confupdtype, contype), pg_operator (oprkind), pg_proc (proargmodes), pg_rewrite (ev_enabled, ev_type), and pg_trigger (tgenabled).
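
For the Google Cloud Storage change, scripts that build BACKUP or IMPORT URIs and relied on the old default may need to pass AUTH=implicit explicitly. The following is a minimal sketch; `gs_uri_with_implicit_auth` is a hypothetical helper and the bucket names in the usage comment are placeholders.

```shell
# gs_uri_with_implicit_auth URI
# Prints URI unchanged if it already carries an AUTH parameter; otherwise
# appends AUTH=implicit, which v21.2 requires for the node's implicit access.
gs_uri_with_implicit_auth() {
  case "$1" in
    *"?AUTH="*|*"&AUTH="*) printf '%s\n' "$1" ;;                # AUTH already set
    *"?"*)                 printf '%s&AUTH=implicit\n' "$1" ;;  # other parameters present
    *)                     printf '%s?AUTH=implicit\n' "$1" ;;  # no parameters yet
  esac
}

# Example usage (placeholder bucket):
#   gs_uri_with_implicit_auth 'gs://my-bucket/backups'
```

Only apply this to URIs that previously depended on implicit access; URIs that pass CREDENTIALS already match the new default of AUTH=specified.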

Step 3. Decide how the upgrade will be finalized

Note:

This step is relevant only when upgrading from v21.1.x to v21.2. For upgrades within the v21.2.x series, skip this step.

By default, after all nodes are running the new version, the upgrade process will be auto-finalized. This will enable certain features and performance improvements introduced in v21.2. However, it will no longer be possible to perform a downgrade to v21.1. In the event of a catastrophic failure or corruption, the only option will be to start a new cluster using the old binary and then restore from one of the backups created prior to performing the upgrade. For this reason, we recommend disabling auto-finalization so you can monitor the stability and performance of the upgraded cluster before finalizing the upgrade. If you do, you will need to follow all of the subsequent directions, including the manual finalization in step 5. To disable auto-finalization:

  1. Upgrade to v21.1, if you haven't already.

  2. Start the cockroach sql shell against any node in the cluster.

  3. Set the cluster.preserve_downgrade_option cluster setting:

    > SET CLUSTER SETTING cluster.preserve_downgrade_option = '21.1';
    

    It is only possible to set this setting to the current cluster version.
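
Because the setting only accepts the current cluster version, a script can read that version first and refuse to continue if it is not 21.1. This is a sketch under assumptions: `preserve_downgrade` is a hypothetical wrapper, and it assumes `cockroach sql --format=tsv -e ...` prints a header line followed by the value.

```shell
# preserve_downgrade [cockroach sql flags...]
# Reads the current cluster version and, only if it is 21.1, sets
# cluster.preserve_downgrade_option to that value. Extra arguments
# (for example --certs-dir=certs --host=<node address>) are passed through.
preserve_downgrade() {
  # Assumption: the last line of the tsv output is the version value.
  current="$(cockroach sql "$@" --format=tsv -e 'SHOW CLUSTER SETTING version;' | tail -n 1)"
  case "$current" in
    21.1) ;;
    *)
      echo "cluster version is '$current', expected 21.1; not changing settings" >&2
      return 1 ;;
  esac
  cockroach sql "$@" -e "SET CLUSTER SETTING cluster.preserve_downgrade_option = '$current';"
}
```

Running it before the rolling upgrade in step 4 keeps the cluster from auto-finalizing once every node is on v21.2.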

Features that require upgrade finalization

When upgrading from v21.1 to v21.2, certain features and performance improvements will be enabled only after finalizing the upgrade, including but not limited to:

  • Expression indexes: Indexes on expressions can now be created. These indexes speed up queries that filter on the result of that expression, and are especially useful for indexing only a specific field of a JSON object.
  • Privilege inheritance: CockroachDB's model for inheritance of privileges that cascade from schema objects now matches PostgreSQL. Added support for ALTER DEFAULT PRIVILEGES and SHOW DEFAULT PRIVILEGES.
  • Bounded staleness reads: Bounded staleness reads are now available in CockroachDB. These use a dynamic, system-determined timestamp to minimize staleness while being more tolerant to replication lag than exact staleness reads. This dynamic timestamp is returned by the with_min_timestamp() or with_max_staleness() functions. In addition, bounded staleness reads provide the ability to serve reads from local replicas even in the presence of network partitions or other failures.
  • Restricted and default placement: You can now use the ALTER DATABASE ... PLACEMENT RESTRICTED statement to constrain the replica placement for a multi-region database's regional tables to the home regions associated with those tables.
  • ON UPDATE expressions: An ON UPDATE expression can now be added to a column to update column values when an UPDATE or UPSERT statement modifies a different column value in the same row, or when an ON UPDATE CASCADE expression on a different column modifies an existing value in the same row.

For an expanded list of features included in the v21.2 release, see the v21.2 release notes.

Step 4. Perform the rolling upgrade

For each node in your cluster, complete the following steps. Be sure to upgrade only one node at a time, and wait at least one minute after a node rejoins the cluster to upgrade the next node. Simultaneously upgrading more than one node increases the risk that ranges will lose a majority of their replicas and cause cluster unavailability.

Tip:

We recommend creating scripts to perform these steps instead of performing them manually. Also, if you are running CockroachDB on Kubernetes, see our documentation on single-cluster and/or multi-cluster orchestrated deployments for upgrade guidance instead.

Note:

These steps perform an upgrade to the latest v21.2 release, v21.2.16.

  1. Drain and shut down the node.

  2. Download and install the CockroachDB binary you want to use:

    Mac:

    $ curl https://binaries.cockroachdb.com/cockroach-v21.2.16.darwin-10.9-amd64.tgz | tar -xzf -

    Linux:

    $ curl https://binaries.cockroachdb.com/cockroach-v21.2.16.linux-amd64.tgz | tar -xzf -

  3. If you use cockroach in your $PATH, rename the outdated cockroach binary, and then move the new one into its place:

    Mac:

    $ i="$(which cockroach)"; mv "$i" "$i"_old
    $ cp -i cockroach-v21.2.16.darwin-10.9-amd64/cockroach /usr/local/bin/cockroach

    Linux:

    $ i="$(which cockroach)"; mv "$i" "$i"_old
    $ cp -i cockroach-v21.2.16.linux-amd64/cockroach /usr/local/bin/cockroach

  4. Start the node to have it rejoin the cluster.

    Without a process manager like systemd, re-run the cockroach start command that you used to start the node initially, for example:

    $ cockroach start \
    --certs-dir=certs \
    --advertise-addr=<node address> \
    --join=<node1 address>,<node2 address>,<node3 address>
    

    If you are using systemd as the process manager, run this command to start the node:

    $ systemctl start <systemd config filename>
    
  5. Verify the node has rejoined the cluster through its output to stdout or through the DB Console.

  6. If you use cockroach in your $PATH, you can remove the old binary:

    $ rm /usr/local/bin/cockroach_old
    

    If you leave versioned binaries on your servers, you do not need to do anything.

  7. After the node has rejoined the cluster, ensure that the node is ready to accept a SQL connection.

    Unless there are tens of thousands of ranges on the node, it's usually sufficient to wait one minute. To be certain that the node is ready, run the following command:

    $ cockroach sql -e 'select 1'

    The command does not return until the node is ready to accept SQL connections.

  8. Repeat these steps for the next node.
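
The one-node-at-a-time loop above lends itself to scripting. The sketch below is one possible shape, not an official tool: `rolling_upgrade` and `wait_for_sql` are hypothetical helpers, the per-node upgrade command (drain, swap binary, restart) is supplied by the caller, and `--certs-dir=certs` is taken from the start example on this page. `PAUSE_SECONDS` and `RETRY_SECONDS` are assumed environment variables.

```shell
# wait_for_sql HOST [ATTEMPTS]
# Retries `cockroach sql -e 'select 1'` against HOST until it succeeds,
# covering the window where the node is not yet accepting connections.
wait_for_sql() {
  wfs_host="$1"; wfs_left="${2:-30}"
  while [ "$wfs_left" -gt 0 ]; do
    if cockroach sql --certs-dir=certs --host="$wfs_host" -e 'select 1' >/dev/null 2>&1; then
      return 0
    fi
    wfs_left=$((wfs_left - 1))
    sleep "${RETRY_SECONDS:-2}"
  done
  return 1
}

# rolling_upgrade UPGRADE_CMD HOST...
# Runs `UPGRADE_CMD HOST` for one node at a time, waits until the node accepts
# SQL connections, then pauses (default 60 seconds) before the next node.
# Stops at the first failure so a bad node does not cascade.
rolling_upgrade() {
  ru_cmd="$1"; shift
  for ru_host in "$@"; do
    echo "upgrading $ru_host"
    "$ru_cmd" "$ru_host" || { echo "upgrade failed on $ru_host" >&2; return 1; }
    wait_for_sql "$ru_host" || { echo "$ru_host is not accepting SQL" >&2; return 1; }
    sleep "${PAUSE_SECONDS:-60}"
  done
}
```

A caller would define its own function that performs steps 1 to 4 for a single host and pass its name as the first argument, followed by the node addresses.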

Step 5. Finish the upgrade

Note:

This step is relevant only when upgrading from v21.1.x to v21.2. For upgrades within the v21.2.x series, skip this step.

If you disabled auto-finalization in step 3, monitor the stability and performance of your cluster for as long as you require to feel comfortable with the upgrade (generally at least a day). If during this time you decide to roll back the upgrade, repeat the rolling restart procedure with the old binary.

Once you are satisfied with the new version:

  1. Run cockroach sql against any node in the cluster to open the SQL shell.

  2. Re-enable auto-finalization:

    > RESET CLUSTER SETTING cluster.preserve_downgrade_option;
    
    Note:

    This statement can take up to a minute to complete, depending on the amount of data in the cluster, as it kicks off various internal maintenance and migration tasks. During this time, the cluster will experience a small amount of additional load.

  3. Check the cluster version to confirm that the finalize step has completed:

    > SHOW CLUSTER SETTING version;
    

Troubleshooting

After the upgrade has finalized (whether manually or automatically), it is no longer possible to downgrade to the previous release. If you are experiencing problems, we therefore recommend that you:

  1. Run the cockroach debug zip command against any node in the cluster to capture your cluster's state.

  2. Reach out for support from Cockroach Labs, sharing your debug zip.

In the event of catastrophic failure or corruption, the only option will be to start a new cluster using the old binary and then restore from one of the backups created prior to performing the upgrade.
