Cut Over from a Primary Cluster to a Standby Cluster

On this page

Note:

This feature is in preview. This feature is subject to change. To share feedback and/or issues, contact Support.

Physical cluster replication allows you to cut over from the active primary cluster to the passive standby cluster that has ingested replicated data. When you complete the replication, it will stop the stream of new data, reset the standby virtual cluster to a point in time where all ingested data is consistent, and then mark the standby virtual cluster as ready to accept traffic.

The cutover is a two-step process on the standby cluster:

Initiating the cutover.
Completing the cutover.

Warning:

Initiating a cutover is a manual process that makes the standby cluster ready to accept SQL connections. However, the cutover process does not automatically redirect traffic to the standby cluster. Once the cutover is complete, you must redirect application traffic to the standby (new) cluster. If you do not manually redirect traffic, writes to the primary (original) cluster may be lost.

After a cutover, you may want to cut back to the original primary cluster. That is, set up the original primary cluster to once again accept application traffic. This requires you to configure another full replication stream in the opposite direction from the original standby (now primary) to the original primary. For more detail, refer to Cut back to the primary cluster.

Step 1. Initiate the cutover

To initiate a cutover to the standby cluster, there are different ways of specifying the point in time for the standby's promotion. That is, the standby cluster's live data at the point of cutover. Refer to the following sections for steps:

LATEST: The most recent replicated timestamp.
Point-in-time:
- Past: A timestamp in the past that is within the cutover window.
- Future: A timestamp in the future in order to plan a cutover.

Cut over to the most recent replicated time

To initiate a cutover to the most recent replicated timestamp, you can specify LATEST. It is important to note that the latest replicated time may be behind the actual time if there is replication lag in the stream. That is, the time between the most up-to-date replicated time and the actual time.

To view the current replication timestamp, use:

SHOW VIRTUAL CLUSTER main WITH REPLICATION STATUS;

  id | name | source_tenant_name |              source_cluster_uri                 |         retained_time           |    replicated_time     | replication_lag | cutover_time |   status
-----+------+--------------------+-------------------------------------------------+---------------------------------+------------------------+-----------------+--------------+--------------
   3 | main | main               | postgresql://user@hostname or IP:26257?redacted | 2024-04-18 10:07:45.000001+00   | 2024-04-18 14:07:45+00 | 00:00:19.602682 |         NULL | replicating
(1 row)

Run the following from the standby cluster's SQL shell to start the cutover:

ALTER VIRTUAL CLUSTER main COMPLETE REPLICATION TO LATEST;

The cutover_time is the timestamp at which the replicated data is consistent. The cluster will revert any data above this timestamp:

           cutover_time
----------------------------------
  1695922878030920020.0000000000
(1 row)

Cut over to a point in time

You can control the point in time that the replication stream will cut over to.

To select a specific time in the past, use:

SHOW VIRTUAL CLUSTER main WITH REPLICATION STATUS;

The retained_time response provides the earliest time to which you can cut over.

  id | name | source_tenant_name |              source_cluster_uri                 |         retained_time         |    replicated_time     | replication_lag | cutover_time |   status
-----+------+--------------------+-------------------------------------------------+-------------------------------+------------------------+-----------------+--------------+--------------
   3 | main | main               | postgresql://user@hostname or IP:26257?redacted | 2024-04-18 10:07:45.000001+00 | 2024-04-18 14:07:45+00 | 00:00:19.602682 |         NULL | replicating
(1 row)

Specify a timestamp:

ALTER VIRTUAL CLUSTER main COMPLETE REPLICATION TO SYSTEM TIME '-1h';

Refer to Using different timestamp formats for more information.

Similarly, to cut over to a specific time in the future:

ALTER VIRTUAL CLUSTER main COMPLETE REPLICATION TO SYSTEM TIME '+5h';

A future cutover will proceed once the replicated data has reached the specified time.

Note:

To monitor for when the replication stream completes, do the following:

Find the replication stream's job_id using SELECT * FROM [SHOW JOBS] WHERE job_type = 'REPLICATION STREAM INGESTION';
Run SHOW JOB WHEN COMPLETE job_id. Refer to the SHOW JOBS page for details and an example.

Step 2. Complete the cutover

The completion of the replication is asynchronous; to monitor its progress use:

SHOW VIRTUAL CLUSTER main WITH REPLICATION STATUS;

id | name | source_tenant_name |              source_cluster_uri                 |         retained_time         |    replicated_time           | replication_lag | cutover_time                   |   status
---+------+--------------------+-------------------------------------------------+-------------------------------+------------------------------+-----------------+--------------------------------+--------------
3  | main | main               | postgresql://user@hostname or IP:26257?redacted | 2023-09-28 16:09:04.327473+00 | 2023-09-28 17:41:18.03092+00 | 00:00:19.602682 | 1695922878030920020.0000000000 | replication pending cutover
(1 row)

Refer to Physical Cluster Replication Monitoring for the Responses and Data state of SHOW VIRTUAL CLUSTER ... WITH REPLICATION STATUS fields.

Once complete, bring the standby's virtual cluster online with:

ALTER VIRTUAL CLUSTER main START SERVICE SHARED;

  id |        name         |     data_state     | service_mode
-----+---------------------+--------------------+---------------
  1  | system              | ready              | shared
  3  | main                | ready              | shared
(3 rows)

To make the standby's virtual cluster the default for connection strings, set the following cluster setting:
```
SET CLUSTER SETTING server.controller.default_target_cluster='main';
```

At this point, the primary and standby clusters are entirely independent. You will need to use your own network load balancers, DNS servers, or other network configuration to direct application traffic to the standby (now primary). To enable physical cluster replication again, from the new primary to the original primary (or a completely different cluster), refer to Cut back to the primary cluster.

Cut back to the primary cluster

After cutting over to the standby cluster, you may need to move back to the original primary cluster, or a completely different cluster. This process is manual and requires starting a new replication stream.

For example, if you had set up physical cluster replication between a primary and standby cluster and then cut over to the standby, the workflow to cut back to the original primary cluster would be as follows:

Original primary cluster = Cluster A
Original standby cluster = Cluster B

Cluster B is now serving application traffic after the cutover.
Drop the application virtual cluster from the cluster A with DROP VIRTUAL CLUSTER.
Start a replication stream that sends updates from cluster B to cluster A. Refer to Start replication.

Cockroach
University

Docs Hub

Cut Over from a Primary Cluster to a Standby Cluster

Step 1. Initiate the cutover

Cut over to the most recent replicated time

Cut over to a point in time

Step 2. Complete the cutover

Cut back to the primary cluster

See also

Cockroach University

Docs Hub

Cockroach University

Docs Hub

Cut Over from a Primary Cluster to a Standby Cluster

Step 1. Initiate the cutover

Cut over to the most recent replicated time

Cut over to a point in time

Step 2. Complete the cutover

Cut back to the primary cluster

See also

Cockroach
University

Cockroach
University