Back up and Restore Data - Advanced Options

On this page Carat arrow pointing down
Warning:
CockroachDB v20.1 is no longer supported. For more details, see the Release Support Policy.

New in v20.1: The ability to backup a full cluster has been added and the syntax for incremental backups is simplified. Because of these two changes, basic backup usage is now sufficient for most CockroachDB clusters. However, you may want to control your backup and restore options more explicitly, which now falls under advanced usage.

This doc provides information about the advanced options you can use when you backup and restore data in a CockroachDB cluster:

Note:

The advanced options covered in this doc are included in BACKUP, which is an enterprise-only feature. For non-enterprise backups, see cockroach dump.

Incremental backups with explicitly specified destinations

To explicitly control where your incremental backups go, use the INCREMENTAL FROM syntax:

icon/buttons/copy
> BACKUP DATABASE bank \
TO 'gs://acme-co-backup/db/bank/2017-03-29-nightly' \
AS OF SYSTEM TIME '-10s' \
INCREMENTAL FROM 'gs://acme-co-backup/database-bank-2017-03-27-weekly', 'gs://acme-co-backup/database-bank-2017-03-28-nightly' WITH revision_history;

For an example of the simplified incremental backup syntax, see Create incremental backups.

Backup with revision history and point-in-time restore

You can create full or incremental backups with revision history:

  • Taking full backups with revision history allows you to back up every change made within the garbage collection period leading up to and including the given timestamp.
  • Taking incremental backups with revision history allows you to back up every change made since the last backup and within the garbage collection period leading up to and including the given timestamp. You can take incremental backups with revision history even when your previous full or incremental backups were taken without revision history.

You can configure garbage collection periods using the ttlseconds replication zone setting. Taking backups with revision history allows for point-in-time restores within the revision history.

Create a backup with revision history

icon/buttons/copy
> BACKUP TO \
'gs://acme-co-backup/test-cluster-2017-03-27-weekly' \
AS OF SYSTEM TIME '-10s' WITH revision_history;

Point-in-time restore

If the full or incremental backup was taken with revision history, you can restore the data as it existed at an arbitrary point-in-time within the revision history captured by that backup. Use the AS OF SYSTEM TIME clause to specify the point-in-time.

New in v20.1: Additionally, if you want to restore a specific incremental backup, you can do so by specifying the end_time of the backup by using the AS OF SYSTEM TIME clause. To find the incremental backup's end_time, use SHOW BACKUP.

If you do not specify a point-in-time, the data will be restored to the backup timestamp; that is, the restore will work as if the data was backed up without revision history.

icon/buttons/copy
> RESTORE FROM 'gs://acme-co-backup/database-bank-2017-03-27-weekly' \
AS OF SYSTEM TIME '2017-02-26 10:00:00';

Point-in-time restore from incremental backups

Restoring from incremental backups requires previous full and incremental backups. In this example, -weekly is the full backup and the two -nightly are incremental backups:

icon/buttons/copy
> RESTORE FROM \
'gs://acme-co-backup/database-bank-2017-03-27-weekly', 'gs://acme-co-backup/database-bank-2017-03-28-nightly', 'gs://acme-co-backup/database-bank-2017-03-29-nightly' \
AS OF SYSTEM TIME '2017-02-28 10:00:00';

Locality-aware backup and restore

You can create locality-aware backups such that each node writes files only to the backup destination that matches the node locality configured at node startup.

This is useful for:

  • Reducing cloud storage data transfer costs by keeping data within cloud regions.
  • Helping you comply with data domiciling requirements.

A locality-aware backup is specified by a list of URIs, each of which has a COCKROACH_LOCALITY URL parameter whose single value is either default or a single locality key-value pair such as region=us-east. At least one COCKROACH_LOCALITY must be the default. Given a list of URIs that together contain the locations of all of the files for a single locality-aware backup, RESTORE can read in that backup.

Note:

The locality query string parameters must be URL-encoded.

During locality-aware backups, backup file placement is determined by leaseholder placement, as each node is responsible for backing up the ranges for which it is the leaseholder. Nodes write files to the backup storage location whose locality matches their own node localities, with a preference for more specific values in the locality hierarchy. If there is no match, the default locality is used.

Create a locality-aware backup

For example, to create a locality-aware backup where nodes with the locality region=us-west write backup files to s3://us-west-bucket, and all other nodes write to s3://us-east-bucket by default, run:

icon/buttons/copy
> BACKUP TO
      ('s3://us-east-bucket?COCKROACH_LOCALITY=default', 's3://us-west-bucket?COCKROACH_LOCALITY=region%3Dus-west');

can be restored by running:

icon/buttons/copy
> RESTORE FROM ('s3://us-east-bucket', 's3://us-west-bucket');

Note that the first URI in the list has to be the URI specified as the default URI when the backup was created. If you have moved your backups to a different location since the backup was originally taken, the first URI must be the new location of the files originally written to the default location.

Restore from a locality-aware backup

You can create locality-aware backups such that each node writes files only to the backup destination that matches the node locality configured at node startup.

Given a list of URIs that together contain the locations of all of the files for a single locality-aware backup, RESTORE can read in that backup. Note that the list of URIs passed to RESTORE may be different from the URIs originally passed to BACKUP. This is because it's possible to move the contents of one of the parts of a locality-aware backup (i.e., the files written to that destination) to a different location, or even to consolidate all the files for a locality-aware backup into a single location.

When restoring a full backup, the cluster data is restored first, then the system table data "as is." This means that the restored zone configurations can point to regions that do not have active nodes in the new cluster. For example, if your full backup has the following zone configurations:

> ALTER PARTITION europe_west OF INDEX movr.public.rides@primary \
        CONFIGURE ZONE USING constraints = '[+region=europe-west1]';

> ALTER PARTITION us_east OF INDEX movr.public.rides@primary \
        CONFIGURE ZONE USING constraints = '[+region=us-east1]';

> ALTER PARTITION us_west OF INDEX movr.public.rides@primary \
        CONFIGURE ZONE USING constraints = '[+region=us-west1]';

And the restored cluster does not have nodes with the locality region=us-west1, the restored cluster will still have a zone configuration for us-west1. This means that the cluster's data will not be reshuffled to us-west1 because the region does not exist. The data will be distributed as if the zone configuration does not exist. For the data to be distributed correctly, you can add node(s) with the missing region or remove the zone configuration.

Note:

RESTORE is not truly locality-aware; while restoring from backups, a node may read from a store that does not match its locality. This can happen in the cases that either the BACKUP or RESTORE was not full cluster. Note that during a locality-aware restore, some data may be temporarily located on another node before it is eventually relocated to the appropriate node. To avoid this, you can manually restore zone configurations from a locality-aware backup.

Create an incremental locality-aware backup

To create an incremental locality-aware backup from a full locality-aware backup, the syntax the same as it is for regular incremental backups. If you backup to a destination already containing a full backup, an incremental backup will be appended to the full backup in a subdirector. For example:

icon/buttons/copy
> BACKUP TO
      ('s3://us-east-bucket?COCKROACH_LOCALITY=default', 's3://us-west-bucket?COCKROACH_LOCALITY=region%3Dus-west');
Note:

It is recommend that the same localities be included for every incremental backup in the series of backups; however, only the default locality is required. When restoring from an incremental locality-aware backup, you need to include every locality ever used, even if it was only used once.

And if you want to explicitly control where your incremental backups go, use the INCREMENTAL FROM syntax:

icon/buttons/copy
> BACKUP TO (${uri_1}, ${uri_2}, ...) INCREMENTAL FROM ${full_backup_uri} ...;

For example, to create an incremental locality-aware backup from a previous full locality-aware backup where nodes with the locality region=us-west write backup files to s3://us-west-bucket, and all other nodes write to s3://us-east-bucket by default, run:

icon/buttons/copy
> BACKUP TO \
('s3://us-east-bucket/test-cluster-2019-10-08-nightly?COCKROACH_LOCALITY=default', 's3://us-west-bucket/test-cluster-2019-10-08-nightly?COCKROACH_LOCALITY=region%3Dus-west')
INCREMENTAL FROM 's3://us-east-bucket/test-cluster-2019-10-07-weekly';
Note:

Note that only the backup URIs you set as the default when you created the previous backup(s) are needed in the INCREMENTAL FROM clause of your incremental BACKUP statement (as shown in the example). This is because the default destination for a locality-aware backup contains a manifest file that contains all the metadata required to create additional incremental backups based on it.

Restore from an incremental locality-aware backup

A locality-aware backup URI can also be used in place of any incremental backup URI in RESTORE.

For example, an incremental locality-aware backup created with

icon/buttons/copy
> BACKUP TO
      ('s3://us-east-bucket/database-bank-2019-10-08-nightly?COCKROACH_LOCALITY=default', 's3://us-west-bucket/database-bank-2019-10-08-nightly?COCKROACH_LOCALITY=region%3Dus-west')
  INCREMENTAL FROM
      's3://us-east-bucket/database-bank-2019-10-07-weekly';

can be restored by running:

icon/buttons/copy
> RESTORE FROM
    ('s3://us-east-bucket/database-bank-2019-10-07-weekly', 's3://us-west-bucket/database-bank-2019-10-07-weekly'),
      ('s3://us-east-bucket/database-bank-2019-10-08-nightly', 's3://us-west-bucket/database-bank-2019-10-08-nightly');
Note:

When restoring from an incremental locality-aware backup, you need to include every locality ever used, even if it was only used once.

Create an incremental locality-aware backup from a previous locality-aware backup

To make an incremental locality-aware backup from another locality-aware backup, the syntax is as follows:

icon/buttons/copy
> BACKUP TO ({uri_1}, {uri_2}, ...) INCREMENTAL FROM {full_backup}, {incr_backup_1}, {incr_backup_2}, ...;

For example, let's say you normally run a full backup every Monday, followed by incremental backups on the remaining days of the week.

By default, all nodes send their backups to your s3://us-east-bucket, except for nodes in region=us-west, which will send their backups to s3://us-west-bucket.

If today is Thursday, October 10th, 2019, your BACKUP statement will list the following backup URIs:

  • The full locality-aware backup URI from Monday, e.g.,
    • s3://us-east-bucket/test-cluster-2019-10-07-weekly
  • The incremental backup URIs from Tuesday and Wednesday, e.g.,
    • s3://us-east-bucket/test-cluster-2019-10-08-nightly
    • s3://us-east-bucket/test-cluster-2019-10-09-nightly

Given the above, to take the incremental locality-aware backup scheduled for today (Thursday), you will run:

icon/buttons/copy
> BACKUP TO
      ('s3://us-east-bucket/test-cluster-2019-10-10-nightly?COCKROACH_LOCALITY=default', 's3://us-west-bucket/test-cluster-2019-10-10-nightly?COCKROACH_LOCALITY=region%3Dus-west')
  INCREMENTAL FROM
      's3://us-east-bucket/test-cluster-2019-10-07-weekly',
      's3://us-east-bucket/test-cluster-2019-10-08-nightly',
    's3://us-east-bucket/test-cluster-2019-10-09-nightly';
Note:

Note that only the backup URIs you set as the default when you created the previous backup(s) are needed in the INCREMENTAL FROM clause of your incremental BACKUP statement (as shown in the example). This is because the default destination for a locality-aware backup contains a manifest file that contains all the metadata required to create additional incremental backups based on it.

Manually restore zone configurations from a locality-aware backup

During a locality-aware restore, some data may be temporarily located on another node before it is eventually relocated to the appropriate node. To avoid this, you need to manually restore zone configurations first:

Once the locality-aware restore has started, pause the restore:

icon/buttons/copy
> PAUSE JOB 27536791415282;

The system.zones table stores your cluster's zone configurations, which will prevent the data from rebalancing. To restore them, you must restore the system.zones table into a new database because you cannot drop the existing system.zones table:

icon/buttons/copy
> RESTORE system.zones \
FROM 'azure://acme-co-backup?AZURE_ACCOUNT_KEY=hash&AZURE_ACCOUNT_NAME=acme-co' \
WITH into_db = 'newdb';

After it's restored into a new database, you can write the restored zones table data to the cluster's existing system.zones table:

icon/buttons/copy
> INSERT INTO system.zones SELECT * FROM newdb.zones;

Then drop the temporary table you created:

icon/buttons/copy
> DROP TABLE newdb.zones;

Then, resume the restore:

icon/buttons/copy
> RESUME JOB 27536791415282;

Encrypted backup and restore

New in v20.1: You can encrypt full or incremental backups by using the encryption_passphrase option. Files written by the backup (including BACKUP manifests and data files) are encrypted using the specified passphrase to derive a key. To restore the encrypted backup, the same encryption_passphrase option (with the same passphrase) must be included in the RESTORE statement.

When used with incremental backups, the encryption_passphrase option is applied to all the backup file URLs, which means the same passphrase must be used when appending another incremental backup to an existing backup. Similarly, when used with locality-aware backups, the passphrase provided is applied to files in all localities.

Encryption is done using AES-256-GCM, and GCM is used to both encrypt and authenticate the files. A random salt is used to derive a once-per-backup AES key from the specified passphrase, and then a random initialization vector is used per-file. CockroachDB uses PBKDF2 with 64,000 iterations for the key derivation.

Note:

BACKUP and RESTORE will use more memory when using encryption, as both the plain-text and cipher-text of a given file are held in memory during encryption and decryption.

For an example of an encrypted backup, see Create an encrypted backup.

Create an encrypted backup

New in v20.1: To create an encrypted backup, use the encryption_passphrase option:

icon/buttons/copy
> BACKUP TO \
'gs://acme-co-backup/test-cluster' \
WITH encryption_passphrase = 'password123';
        job_id       |  status   | fraction_completed | rows | index_entries | bytes
---------------------+-----------+--------------------+------+---------------+---------
  543214409874014209 | succeeded |                  1 | 2597 |          1028 | 467701
(1 row)

To restore, use the same encryption_passphrase:

Restore from an encrypted backup

New in v20.1: To decrypt an encrypted backup, use the encryption_passphrase option and the same passphrase that was used to create the backup.

For example, the encrypted backup created in the previous example:

icon/buttons/copy
> BACKUP TO \
'gs://acme-co-backup/test-cluster' \
WITH encryption_passphrase = 'password123';

Can be restored with:

icon/buttons/copy
> RESTORE FROM \
'gs://acme-co-backup/test-cluster' \
WITH encryption_passphrase = 'password123';
        job_id       |  status   | fraction_completed | rows | index_entries | bytes
---------------------+-----------+--------------------+------+---------------+---------
  543217488273801217 | succeeded |                  1 | 2597 |          1028 | 467701
(1 row)

Other restore usages

Restore tables into a different database

By default, tables and views are restored to the database they originally belonged to. However, using the into_db option, you can control the target database.

icon/buttons/copy
> RESTORE bank.customers \
FROM 'gs://acme-co-backup/database-bank-2017-03-27-weekly' \
WITH into_db = 'newdb';

Remove the foreign key before restore

By default, tables with Foreign Key constraints must be restored at the same time as the tables they reference. However, using the skip_missing_foreign_keys option you can remove the Foreign Key constraint from the table and then restore it.

icon/buttons/copy
> RESTORE bank.accounts \
FROM 'gs://acme-co-backup/database-bank-2017-03-27-weekly' \
WITH skip_missing_foreign_keys;

Restoring users from system.users backup

The system.users table stores your cluster's usernames and their hashed passwords. To restore them, you must restore the system.users table into a new database because you cannot drop the existing system.users table.

After it's restored into a new database, you can write the restored users table data to the cluster's existing system.users table.

icon/buttons/copy
> RESTORE system.users \
FROM 'azure://acme-co-backup/table-users-2017-03-27-full?AZURE_ACCOUNT_KEY=hash&AZURE_ACCOUNT_NAME=acme-co' \
WITH into_db = 'newdb';
icon/buttons/copy
> INSERT INTO system.users SELECT * FROM newdb.users;
icon/buttons/copy
> DROP TABLE newdb.users;

See also


Yes No
On this page

Yes No