Take Full and Incremental Backups

Because CockroachDB is designed with high fault tolerance, backups are primarily needed for disaster recovery (i.e., if your cluster loses a majority of its nodes). Isolated issues (such as small-scale node outages) do not require any intervention. However, as an operational best practice, we recommend taking regular backups of your data.

There are two main types of backups:

You can use the BACKUP statement to efficiently back up your cluster's schemas and data to popular cloud services such as AWS S3, Google Cloud Storage, or NFS, and the RESTORE statement to efficiently restore schema and data as necessary. For more information, see Use Cloud Storage for Bulk Operations.

Tip:

You can create schedules for periodic backups in CockroachDB. We recommend using scheduled backups to automate daily backups of your cluster.

Full backups

Full backups are now available to both core and Enterprise users.

Full backups contain an un-replicated copy of your data and can always be used to restore your cluster. These files are roughly the size of your data and require greater resources to produce than incremental backups. You can take full backups as of a given timestamp. Optionally, you can include the available revision history in the backup.

In most cases, it's recommended to take nightly full backups of your cluster. A cluster backup allows you to do the following:

  • Restore table(s) from the cluster
  • Restore database(s) from the cluster
  • Restore a full cluster

Take a full backup

To do a cluster backup, use the BACKUP statement:

icon/buttons/copy
> BACKUP INTO '{destination}';

If it's ever necessary, you can use the RESTORE statement to restore a table:

icon/buttons/copy
> RESTORE TABLE bank.customers FROM '{destination}';

Or to restore a database:

icon/buttons/copy
> RESTORE DATABASE bank FROM '{destination}';

Or to restore your full cluster:

icon/buttons/copy
> RESTORE FROM '{destination}';
Note:

A full cluster restore can only be run on a target cluster that has never had user-created databases or tables.

Incremental backups

Note:

To take incremental backups, you need an Enterprise license.

If your cluster grows too large for nightly full backups, you can take less frequent full backups (e.g., weekly) with nightly incremental backups. Incremental backups are storage efficient and faster than full backups for larger clusters.

Incremental backups are smaller and faster to produce than full backups because they contain only the data that has changed since a base set of backups you specify (which must include one full backup, and can include many incremental backups). You can take incremental backups either as of a given timestamp or with full revision history.

Warning:

Incremental backups can only be created within the garbage collection period of the base backup's most recent timestamp. This is because incremental backups are created by finding which data has been created or modified since the most recent timestamp in the base backup––that timestamp data, though, is deleted by the garbage collection process.

You can configure garbage collection periods using the ttlseconds replication zone setting.

Take an incremental backup

Periodically run the BACKUP command to take a full backup of your cluster:

icon/buttons/copy
> BACKUP INTO '{destination}';

Then, create nightly incremental backups based off of the full backups you've already created. To append an incremental backup to the most recent full backup created in the given destination, use LATEST:

icon/buttons/copy
> BACKUP INTO LATEST IN '{destination}';
Note:

For an example on how to specify the destination of an incremental backup, see Incremental backups with explicitly specified destinations

If it's ever necessary, you can then use the RESTORE command to restore your cluster, database(s), and/or table(s). Restoring from incremental backups requires previous full and incremental backups. To restore from a destination containing the full backup, as well as the appended incremental backups:

icon/buttons/copy
> RESTORE FROM '{subdirectory}' IN '{destination}';
Note:

New in v21.1: RESTORE will re-validate indexes when incremental backups are created from an older version (v20.2.2 and earlier or v20.1.4 and earlier), but restored by a newer version (v21.1.0+). These earlier releases may have included incomplete data for indexes that were in the process of being created.

Incremental backups with explicitly specified destinations

To explicitly control where your incremental backups go, use the INCREMENTAL FROM syntax:

icon/buttons/copy
> BACKUP DATABASE bank INTO '{subdirectory}' IN '{destination}' \
    AS OF SYSTEM TIME '-10s' \
    WITH revision_history;
Note:

To take incremental backups, you need an Enterprise license.

Examples

The examples below provide connection strings to Amazon S3, Google Cloud Storage, and Azure Storage. For guidance on connecting to other storage options or using other authentication parameters, read Use Cloud Storage for Bulk Operations.

Automated full backups

Both core and Enterprise users can use backup scheduling for full backups of clusters, databases, or tables. To create schedules that only take full backups, include the FULL BACKUP ALWAYS clause. For example, to create a schedule for taking full cluster backups:

icon/buttons/copy
> CREATE SCHEDULE core_schedule_label
  FOR BACKUP INTO 's3://{BUCKET NAME}/{PATH}?AWS_ACCESS_KEY_ID={KEY ID}&AWS_SECRET_ACCESS_KEY={SECRET ACCESS KEY}'
    RECURRING '@daily'
    FULL BACKUP ALWAYS
    WITH SCHEDULE OPTIONS first_run = 'now';
     schedule_id     |        name         | status |         first_run         | schedule |                                                                                       backup_stmt
---------------------+---------------------+--------+---------------------------+----------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
  588799238330220545 | core_schedule_label | ACTIVE | 2020-09-11 00:00:00+00:00 | @daily   | BACKUP INTO 's3://{BUCKET NAME}/{PATH}?AWS_ACCESS_KEY_ID={KEY ID}&AWS_SECRET_ACCESS_KEY={SECRET ACCESS KEY}' WITH detached
(1 row)

Automated full backups

Both core and Enterprise users can use backup scheduling for full backups of clusters, databases, or tables. To create schedules that only take full backups, include the FULL BACKUP ALWAYS clause. For example, to create a schedule for taking full cluster backups:

icon/buttons/copy
> CREATE SCHEDULE core_schedule_label
  FOR BACKUP INTO 'azure://{CONTAINER NAME}/{PATH}?AZURE_ACCOUNT_NAME={ACCOUNT NAME}&AZURE_ACCOUNT_KEY={URL-ENCODED KEY}'
    RECURRING '@daily'
    FULL BACKUP ALWAYS
    WITH SCHEDULE OPTIONS first_run = 'now';
     schedule_id     |        name         | status |         first_run         | schedule |                                                                                       backup_stmt
---------------------+---------------------+--------+---------------------------+----------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
  588799238330220545 | core_schedule_label | ACTIVE | 2020-09-11 00:00:00+00:00 | @daily   | BACKUP INTO 'azure://{CONTAINER NAME}/{PATH}?AZURE_ACCOUNT_NAME={ACCOUNT NAME}&AZURE_ACCOUNT_KEY={URL-ENCODED KEY}' WITH detached
(1 row)

Automated full backups

Both core and Enterprise users can use backup scheduling for full backups of clusters, databases, or tables. To create schedules that only take full backups, include the FULL BACKUP ALWAYS clause. For example, to create a schedule for taking full cluster backups:

icon/buttons/copy
> CREATE SCHEDULE core_schedule_label
  FOR BACKUP INTO 'gs://{BUCKET NAME}/{PATH}?AUTH=specified&CREDENTIALS={ENCODED KEY}'
    RECURRING '@daily'
    FULL BACKUP ALWAYS
    WITH SCHEDULE OPTIONS first_run = 'now';
     schedule_id     |        name         | status |         first_run         | schedule |                                                                                       backup_stmt
---------------------+---------------------+--------+---------------------------+----------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
  588799238330220545 | core_schedule_label | ACTIVE | 2020-09-11 00:00:00+00:00 | @daily   | BACKUP INTO 'gs://{BUCKET NAME}/{PATH}?AUTH=specified&CREDENTIALS={ENCODED KEY}' WITH detached
(1 row)

For more examples on how to schedule backups that take full and incremental backups, see CREATE SCHEDULE FOR BACKUP.

Advanced examples

For examples of advanced BACKUP and RESTORE use cases, see:

Note:

To take incremental backups, backups with revision history, locality-aware backups, and encrypted backups, you need an Enterprise license.

See also

YesYes NoNo