BACKUP - CockroachDB

CockroachDB’s BACKUP statement allows you to create full or incremental backups of your cluster’s schema and data that are consistent as of a given timestamp. You can back up a full cluster, which includes:

Relevant system tables
All databases
All tables (which automatically includes their indexes)
All views
All scheduled jobs

You can also back up:

An individual database, which includes all of its tables and views.
An individual table, which includes its indexes and views. BACKUP only backs up entire tables; it does not support backing up subsets of a table.

Because CockroachDB is designed with high fault tolerance, these backups are designed primarily for disaster recovery (i.e., if your cluster loses a majority of its nodes) through RESTORE. Isolated issues (such as small-scale node outages) do not require any intervention. You can check that backups in external storage are valid by using a backup validation command. To view the contents of a backup created with the BACKUP statement, use SHOW BACKUP.

We recommend using scheduled backups to automate daily backups of your cluster.

The BACKUP ... TO and RESTORE ... FROM {storage_uri} syntax has been removed from CockroachDB v24.3 and later. For details on the syntax to run BACKUP and RESTORE, refer to the backup and restore examples.

Considerations

Full cluster backups include license keys. When you restore a full cluster backup that includes a license, the license is also restored.
You cannot restore a backup of a multi-region database into a single-region database.
Exclude a table’s row data from a backup using the exclude_data_from_backup parameter.
BACKUP is a blocking statement. To run a backup job asynchronously, use the DETACHED option. See the options below.
During a cluster restore, any zone configurations present on the destination cluster are overwritten with the zone configurations from the backed-up cluster. If no customized zone configurations were on the cluster when the backup was taken, then after the restore the destination cluster will use the zone configuration from the RANGE DEFAULT configuration.

Storage considerations

Cockroach Labs tests functionality with AWS S3, Google Cloud Storage (GCS), and Azure Blob Storage. Other S3-compatible storage solutions are untested, but common compatibility issues in v24.3 and later may be fixed by adding the AWS_SKIP_CHECKSUM option to the S3 URLs.
HTTP storage is not supported for BACKUP and RESTORE.
Modifying backup files in the storage location could invalidate a backup and therefore prevent a restore. In v22.1 and later, we recommend enabling object locking in your cloud storage bucket.

Cockroach Labs does not officially support untested storage systems. If you encounter issues when using unsupported S3-compatible storage, drivers, or frameworks, contact the maintainer.

You can test the connection from each node in the cluster to your external storage with the CHECK EXTERNAL CONNECTION statement.

Required privileges

Starting in v22.2, CockroachDB introduces a new system-level privilege model that provides finer control over a user’s privilege to work with the database, including taking backups. There is continued support for the legacy privilege model for backups in v22.2; however, it will be removed in a future release of CockroachDB. We recommend implementing the new privilege model that follows in this section for all new and existing backups.

You can grant the BACKUP privilege to a user or role depending on the type of backup:

Backup	Privilege
Cluster	Grant a user the `BACKUP` system-level privilege. For example, `GRANT SYSTEM BACKUP TO user;`.
Database	Grant a user the `BACKUP` privilege on the target database. For example, `GRANT BACKUP ON DATABASE test_db TO user;`.
Table	Grant a user the `BACKUP` privilege at the table level. This gives the user the privilege to back up the schema and all user-defined types that are associated with the table. For example, `GRANT BACKUP ON TABLE test_db.table TO user;`.

The listed privileges do not cascade to objects lower in the schema tree. For example, if you are granted database-level BACKUP privileges, this does not give you the privilege to back up a table. If you need the BACKUP privilege on a database to apply to all newly created tables in that database, use default privileges. You can add BACKUP to the user or role’s default privileges with ALTER DEFAULT PRIVILEGES.
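Because the privileges do not cascade, a database backup privilege and table backup privileges have to be granted separately. The following is a minimal sketch of one way to set this up; the role name `backup_operator` is hypothetical, and `test_db` is the example database from the table above.

```sql
-- Hypothetical role used only for illustration.
CREATE ROLE IF NOT EXISTS backup_operator;

-- Allows database-level backups of test_db.
GRANT BACKUP ON DATABASE test_db TO backup_operator;

-- Database-level BACKUP does not cascade, so grant it on the existing tables too.
GRANT BACKUP ON TABLE test_db.* TO backup_operator;

-- Cover tables created later in test_db through default privileges.
USE test_db;
ALTER DEFAULT PRIVILEGES FOR ALL ROLES GRANT BACKUP ON TABLES TO backup_operator;

-- Inspect the resulting grants.
SHOW GRANTS FOR backup_operator;
```

Granting to a role rather than to individual users keeps the set of backup operators easy to audit.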

You can grant the BACKUP privilege to a user or role without the SELECT privilege on a table. As a result, these users will be able to take backups, but they will not be able to run a SELECT query on that data directly. However, these users could still read this data indirectly, by restoring it from any backups they produce.

Members of the admin role can run all three types of backups (cluster, database, and table) without the need to grant a specific BACKUP privilege. However, we recommend using the BACKUP privilege model to create users or roles and grant them BACKUP privileges as necessary for stronger access control.

Privileges for managing a backup job

To manage a backup job with PAUSE JOB, RESUME JOB, or CANCEL JOB, users must have at least one of the following:

Be a member of the admin role.
The CONTROLJOB role option.

To view a backup job with SHOW JOB, users must have at least one of the following:

The VIEWJOB privilege, which allows you to view all jobs (including admin-owned jobs).
Be a member of the admin role.
The CONTROLJOB role option.

See GRANT for detail on granting privileges to a role or user.
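As a rough sketch of granting job-management access without admin membership, the statements below assume a hypothetical user named `backup_operator`.

```sql
-- Hypothetical user; allows pausing, resuming, and canceling jobs.
ALTER ROLE backup_operator WITH CONTROLJOB;

-- Allows viewing all jobs, including admin-owned jobs.
GRANT SYSTEM VIEWJOB TO backup_operator;
```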

Required privileges using the legacy privilege model

The following details the legacy privilege model that CockroachDB supports in v22.2 and earlier. Support for this privilege model will be removed in a future release of CockroachDB:

Full cluster backups can only be run by members of the admin role. By default, the root user belongs to the admin role.
For all other backups, the user must have read access on all objects being backed up. Database backups require CONNECT privileges, and table backups require SELECT privileges. Backups of user-defined schemas, or backups containing user-defined types, require USAGE privileges.

See the Required privileges section for the updated privilege model.

Destination privileges

You can grant a user the EXTERNALIOIMPLICITACCESS system-level privilege. Either the EXTERNALIOIMPLICITACCESS system-level privilege or the admin role is required for the following scenarios:

Interacting with a cloud storage resource using IMPLICIT authentication.
Using a custom endpoint on S3.
Using the cockroach nodelocal upload command.

No special privilege is required for:

Interacting with an Amazon S3 or Google Cloud Storage resource using SPECIFIED credentials. Azure Storage is always SPECIFIED by default.
Using userfile storage.

We recommend using cloud storage. You also need to ensure that the permissions at your storage destination are configured for the operation. See Storage Permissions for a list of the necessary permissions that each bulk operation requires.
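A short sketch of the implicit-authentication case follows. The user name `backup_operator` is hypothetical, and the bucket and path placeholders must be replaced with your own values.

```sql
-- Needed because the backup below uses implicit authentication.
GRANT SYSTEM EXTERNALIOIMPLICITACCESS TO backup_operator;

-- Run as backup_operator: credentials come from the node environment, not the URI.
BACKUP DATABASE test_db
  INTO 's3://{BUCKET NAME}/{PATH}?AUTH=implicit'
  AS OF SYSTEM TIME '-10s';
```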

While Cockroach Labs actively tests Amazon S3, Google Cloud Storage, and Azure Storage, we do not test S3-compatible services (e.g., MinIO, Red Hat Ceph).

Synopsis

Parameters

CockroachDB stores full backups in a backup collection. Each full backup in a collection may also have incremental backups. For more detail on this, see Backup collections.

Parameter	Description
`targets`	Back up the listed targets.
`subdirectory`	The name of the specific backup (e.g., `2021/03/23-213101.37`) in the collection to which you want to add an incremental backup. To view available backup subdirectories, use `SHOW BACKUPS IN {collectionURI}`. If the backup `subdirectory` is not provided, incremental backups will be stored in the default `/incrementals` directory at the root of the collection URI. See the Create incremental backups example. Warning: If you use an arbitrary `STRING` as the subdirectory, a new full backup will be created, but it will never be shown in `SHOW BACKUPS IN`. We do not recommend using arbitrary strings as subdirectory names.
`LATEST`	Append an incremental backup to the latest completed full backup’s subdirectory.
`collectionURI`	The URI where you want to store the backup (or the default locality for a locality-aware backup). The storage URI for each backup collection must be unique. You will encounter an error if you run multiple backup collections to the same storage URI. For information about this URL structure, see Backup File URLs.
`localityURI`	The URI containing the `COCKROACH_LOCALITY` parameter for a non-default locality that is part of a single locality-aware backup.
`timestamp`	Back up data as it existed as of the given timestamp, specified with `AS OF SYSTEM TIME`. The `timestamp` must be more recent than your data’s garbage collection TTL (which is controlled by the `gc.ttlseconds` replication zone variable).
`backup_options`	Control the backup behavior with a comma-separated list of these options.

Targets

Target	Description
N/A	Back up the cluster. For an example of a full cluster backup, refer to Back up a cluster.
`DATABASE {database_name} [, ...]`	The names of the databases to back up. A database backup includes all tables and views in the database. Refer to Back Up a Database.
`TABLE {table_name} [, ...]`	The names of the tables and views to back up. Refer to Back Up a Table or View.

Query parameters

Query parameter	Value	Description
`ASSUME_ROLE`		Pass the ARN of the role to assume. Use in combination with `AUTH=implicit` or `specified`. `external_id`: Use as a value to `ASSUME_ROLE` to specify the external ID for third-party access to your S3 bucket. Refer to Cloud Storage Authentication for setup details.
`AUTH`		The authentication parameter can define either `specified` (default) or `implicit` authentication. To use `specified` authentication, pass your Service Account credentials with the URI. To use `implicit` authentication, configure these credentials via an environment variable. Refer to the Cloud Storage Authentication page for examples of each of these.
`AWS_ENDPOINT`		Specify a custom endpoint for Amazon S3 or S3-compatible services. Use to define a particular region or a Virtual Private Cloud (VPC) endpoint.
`AWS_SESSION_TOKEN`		(Optional) Use as part of temporary security credentials when accessing AWS S3. For more information, refer to Amazon’s guide on temporary credentials.
`AWS_USE_PATH_STYLE`		Change the URL format to path style from the default AWS S3 virtual-hosted–style URLs when connecting to Amazon S3 or S3-compatible services.
`COCKROACH_LOCALITY`	Key-value pairs	Define a locality-aware backup with a list of URIs using `COCKROACH_LOCALITY`. The value is either `default` or a single locality key-value pair, such as `region=us-east`. At least one `COCKROACH_LOCALITY` must be the `default` per locality-aware backup. Refer to Take and Restore Locality-aware Backups for more detail and examples.
`S3_STORAGE_CLASS`		Specify the Amazon S3 storage class for files created by the backup job. Refer to Back up with an S3 storage class for the available classes and an example.

If you are creating an external connection with BACKUP query parameters or authentication parameters, you must pass them in uppercase; otherwise, you will receive an unknown query parameters error.
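To illustrate how the query parameters combine in a URI, here is a hedged sketch of a locality-aware backup. The bucket placeholders and the `region=us-east` locality are assumptions for illustration; the locality value is URL-encoded (`%3D` stands for `=`), and the parameter names are written in uppercase.

```sql
-- One URI must carry COCKROACH_LOCALITY=default; the others name a single locality.
BACKUP INTO
  ('s3://{BUCKET NAME}/{PATH}?AUTH=implicit&COCKROACH_LOCALITY=default',
   's3://{BUCKET NAME 2}/{PATH}?AUTH=implicit&COCKROACH_LOCALITY=region%3Dus-east')
  AS OF SYSTEM TIME '-10s';
```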

Options

Option	Value	Description
`revision_history`	BOOL / None	Create a backup with full revision history, which records every change made to the cluster within the garbage collection period leading up to and including the given timestamp. You can specify a backup with revision history without any value, e.g., `WITH revision_history`. Or, you can explicitly define `WITH revision_history = 'true' / 'false'`. `revision_history` defaults to `true` when used with `BACKUP` or `CREATE SCHEDULE FOR BACKUP`. A value is required when using `ALTER BACKUP SCHEDULE`.
`encryption_passphrase`	STRING	The passphrase used to encrypt the files (`BACKUP` manifest and data files) that the `BACKUP` statement generates. This same passphrase is needed to decrypt the files when they are used to `RESTORE` and to list the contents of the backup when using `SHOW BACKUP`. There is no practical limit on the length of the passphrase.
`detached`	BOOL / None	When a backup runs in `detached` mode, it will execute asynchronously. The job ID will be returned as soon as the backup job is created, without waiting for the backup to finish. Note that with `detached` specified, further job information and the job completion status will not be returned. For more on the differences between the returned job data, see the Run a backup asynchronously example. To check on the job status, use the `SHOW JOBS` statement. Backups running on a schedule have the `detached` option applied implicitly. To run a backup within a transaction, use the `detached` option.
`EXECUTION LOCALITY`	Key-value pairs	Restricts the execution of the backup to nodes that match the defined locality filter requirements. For example, `WITH EXECUTION LOCALITY = 'region=us-west-1a,cloud=aws'`. Refer to Take Locality-restricted Backups for usage and reference detail.
`kms`	STRING	The URI of the cryptographic key stored in a key management service (KMS), or a comma-separated list of key URIs, used to take and restore encrypted backups. The key or keys are used to encrypt the manifest and data files that the `BACKUP` statement generates, to decrypt them during a `RESTORE` operation, and to list the contents of the backup when using `SHOW BACKUP`. AWS KMS, Google Cloud KMS, and Azure Key Vault are supported.
`STRICT`	/ None	A locality-aware backup running in `STRICT` mode fails if it cannot back up data from a node with a locality tag to a bucket with a matching locality tag.
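Options are passed in a comma-separated list after WITH. The following sketch combines two of them; the `{PASSPHRASE}` placeholder and the `external://backup_s3` connection name are assumptions that you would replace with your own values.

```sql
-- Take an encrypted backup that also records revision history.
BACKUP DATABASE bank INTO 'external://backup_s3'
  AS OF SYSTEM TIME '-10s'
  WITH revision_history, encryption_passphrase = '{PASSPHRASE}';

-- The same passphrase is needed to list the contents of the backup.
SHOW BACKUP FROM LATEST IN 'external://backup_s3'
  WITH encryption_passphrase = '{PASSPHRASE}';
```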

Backup file URLs

CockroachDB uses the URL provided to construct a secure API call to the service you specify. The URL structure depends on the type of file storage you are using. For more information, see the following:

You can create an external connection to represent an external storage or sink URI. This allows you to specify the external connection’s name in statements rather than the provider-specific URI. For detail on using external connections, see the CREATE EXTERNAL CONNECTION page.
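As a minimal sketch, an external connection for an S3 bucket could be created and then referenced by name as follows. The connection name `backup_s3` matches the one used in the examples on this page; the bucket and key placeholders are yours to fill in.

```sql
-- Store the provider-specific URI once, under a name.
CREATE EXTERNAL CONNECTION backup_s3
  AS 's3://{BUCKET NAME}?AWS_ACCESS_KEY_ID={KEY ID}&AWS_SECRET_ACCESS_KEY={SECRET ACCESS KEY}';

-- Refer to the connection by name instead of repeating the URI and credentials.
BACKUP DATABASE bank INTO 'external://backup_s3' AS OF SYSTEM TIME '-10s';
```

Keeping credentials inside the connection means they do not have to appear in every BACKUP statement.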

Backups support cloud object locking and Amazon S3 storage classes. For more detail, see .

Functional details

Object dependencies

Dependent objects must be backed up at the same time as the objects they depend on. When you back up a table, it will not include any dependent tables, views, or sequences.

For example, if you back up view v that depends on table t, it will only back up v, not t. When you try to restore v, the restore will fail because the referenced table is not present in the backup.

Alternatively, you can pass a skip option with RESTORE to skip the dependency instead:

Dependent object	Depends on	Skip option
Table with foreign key constraints	The table it REFERENCES.	`skip_missing_foreign_keys`
Table with a sequence	The sequence.	`skip_missing_sequences`
Views	The tables used in the view’s SELECT statement.	`skip_missing_views`

We recommend treating tables with foreign keys, which contribute to views, or that use sequences or user-defined types as a single unit with their dependencies. While you can restore individual tables, you may find that backing up and restoring at the database level is more convenient.
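The two approaches can be sketched as follows. This assumes, purely for illustration, that `bank.accounts` has a foreign key that references `bank.customers`.

```sql
-- Preferred: back up the dependent table together with the table it references.
BACKUP bank.accounts, bank.customers
  INTO 'external://backup_s3' AS OF SYSTEM TIME '-10s';

-- If a backup contains bank.accounts without bank.customers, the table can still
-- be restored by dropping the foreign key constraint during the restore.
RESTORE TABLE bank.accounts FROM LATEST IN 'external://backup_s3'
  WITH skip_missing_foreign_keys;
```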

To exclude a table’s row data from a backup, use the exclude_data_from_backup parameter with CREATE TABLE or ALTER TABLE. For more detail, see the Exclude a table's data from backups example.
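A brief sketch of setting and reverting the parameter on an existing table; `bank.customers` is used here only as an example table name.

```sql
-- The table's schema is still backed up, but its row data is not.
ALTER TABLE bank.customers SET (exclude_data_from_backup = true);

-- Include the row data in backups again.
ALTER TABLE bank.customers SET (exclude_data_from_backup = false);
```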

Users and privileges

The system.users table stores your users and their passwords. To restore your users and privilege grants, do a cluster backup and restore the cluster to a fresh cluster with no user data. You can also back up the system.users table and then restore the users from that backup.

Performance

The backup job process minimizes its impact to the cluster’s performance with:

Even distribution of work to a node that has a replica of the range to back up. If a locality filter is specified, work is distributed to a node from those that match the locality filter and has the most locality tiers in common with a node that has a replica. Refer to the Backup Architecture page for a detailed explanation of how a backup job works.
Integration with elastic CPU limiter by default, which helps to minimize the impact backups have on foreground traffic. This integration will limit the amount of CPU time used by a backup thereby allowing foreground SQL traffic to continue largely unaffected.

A backup job, like any read, cannot export a range if the range contains an unresolved intent. While it is important to minimize the impact of bulk, background jobs like BACKUP on your foreground traffic, it is still crucial for backups to finish (in order to maintain your recovery point objective (RPO)).

Unlike a normal read transaction that will block until any uncommitted writes it encounters are resolved, a backup job’s read request will be allotted a fixed amount of CPU time to read the required keys and values. Once the backup’s read request has exhausted this time, the backup will resume once it has been allocated more CPU time. This process allows other requests, such as foreground SQL traffic, to continue almost unaffected, because there is a cap on how much CPU a backup job will take.

You can monitor your cluster’s admission control system on the Overload dashboard. To monitor your backup jobs, refer to the Backup and Restore Monitoring page. For a more technical explanation of elastic CPU, refer to the Rubbing control theory on the Go scheduler blog post.

We recommend always starting backups with a specific timestamp at least 10 seconds in the past. For example:

BACKUP...AS OF SYSTEM TIME '-10s';

This improves performance by decreasing the likelihood that the BACKUP will be retried because it contends with other statements or transactions. However, because AS OF SYSTEM TIME returns historical data, your reads might be stale. Taking backups with AS OF SYSTEM TIME '-10s' is a good best practice to reduce the number of still-running transactions you may encounter, because a backup will eventually push the contending transactions to a higher timestamp, which causes the transactions to retry.

A backup job will initially ask individual ranges to back up but to skip if they encounter an intent. Any range that is skipped is placed at the end of the queue. When a backup job has completed its initial pass and is revisiting ranges, it will ask any range that did not resolve within the given time limit (default 1 minute) to attempt to resolve any intents that it encounters and to not skip. Additionally, the backup’s read transaction priority is eventually set to high. This will result in contending transactions being pushed and retried at a higher timestamp.

If a backup job encounters too many retryable errors, it will enter a failed state with the most recent error, which allows subsequent backups the chance to succeed. Refer to the Backup and Restore Monitoring page for metrics to track backup failures.

Backup performance configuration

Cluster settings provide a means to tune a CockroachDB cluster. The following cluster settings are helpful for configuring backup files and performance:

`bulkio.backup.file_size`

Set a target for the amount of backup data written to each backup file. This is the maximum target size the backup will reach, but it is possible files of a smaller size are created during the backup job. Note that if you lower bulkio.backup.file_size below the default, it will cause the backup job to create many small SST files, which could impact a restore job’s performance because it will need to keep track of so many small files. Default: 128 MiB

`cloudstorage.azure.concurrent_upload_buffers`

Improve the speed of backups to Azure Storage by increasing cloudstorage.azure.concurrent_upload_buffers to 3. This setting configures the number of concurrent buffers that are used during file uploads to Azure Storage. Note that the higher this setting the more data that is held in memory, which can increase the risk of OOMs if there is not sufficient memory on each node. Default: 1
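As a sketch, the two settings in this section can be inspected and changed with the standard cluster setting statements.

```sql
-- Check the current target file size before changing anything.
SHOW CLUSTER SETTING bulkio.backup.file_size;

-- Use more concurrent upload buffers for backups to Azure Storage.
-- Higher values hold more data in memory on each node.
SET CLUSTER SETTING cloudstorage.azure.concurrent_upload_buffers = 3;

-- Return to the default if memory pressure becomes a concern.
RESET CLUSTER SETTING cloudstorage.azure.concurrent_upload_buffers;
```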

Cluster settings for cloud storage

The following cluster settings limit the read and write rates to cloud storage. A user may choose to use these settings if their backups overwhelm the network. These settings limit throughput, and as a result backups and changefeeds will take longer. The designated `<provider>` values include s3, gs, and azure.

`cloudstorage.<provider>.write.node_rate_limit`

Limit the number of bytes per second per node across operations writing to the designated cloud storage provider if non-zero. Default: unlimited, 0 B

`cloudstorage.<provider>.write.node_burst_limit`

Limit the number of bytes per second per node handled concurrently across operations writing to the designated cloud storage provider if non-zero. Default: unlimited, 0 B

`cloudstorage.<provider>.read.node_rate_limit`

Limit the number of bytes per second per node across operations reading from the designated cloud storage provider if non-zero. Default: unlimited, 0 B

`cloudstorage.<provider>.read.node_burst_limit`

Limit the number of bytes per second per node handled concurrently across operations reading from the designated cloud storage provider if non-zero. Default: unlimited, 0 B

For a complete list, including all cluster settings related to backups, see the Cluster Settings page.
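For instance, write throughput to S3 could be capped per node along these lines. The byte values are arbitrary illustrations, not recommendations.

```sql
-- Cap sustained writes to S3 at roughly 50 MiB per second per node.
SET CLUSTER SETTING cloudstorage.s3.write.node_rate_limit = '50MiB';

-- Allow short bursts above the sustained rate.
SET CLUSTER SETTING cloudstorage.s3.write.node_burst_limit = '100MiB';

-- Remove the limits again (the default of 0 B means unlimited).
RESET CLUSTER SETTING cloudstorage.s3.write.node_rate_limit;
RESET CLUSTER SETTING cloudstorage.s3.write.node_burst_limit;
```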

Viewing and controlling backup jobs

After CockroachDB successfully initiates a backup, it registers the backup as a job, and you can do the following:

Action	SQL Statement
View the backup status	`SHOW JOBS`
Pause the backup	`PAUSE JOB`
Resume the backup	`RESUME JOB`
Cancel the backup	`CANCEL JOB`

You can also visit the Jobs page of the DB Console to view job details. The BACKUP statement will return when the backup is finished or if it encounters an error.
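The job-control actions can be sketched as follows. The job ID is a placeholder taken from the sample output elsewhere on this page; use the ID returned for your own backup.

```sql
-- List backup jobs and their progress.
WITH j AS (SHOW JOBS)
SELECT job_id, status, fraction_completed
FROM j
WHERE job_type = 'BACKUP';

-- Pause, resume, or cancel a specific job by ID.
PAUSE JOB 592786066399264769;
RESUME JOB 592786066399264769;
CANCEL JOB 592786066399264769;
```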

The presence of the BACKUP MANIFEST file in the backup subdirectory is an indicator that the backup job completed successfully.

Backup compactions

New in v26.2: This capability is available in for self-hosted clusters in v26.2.

Backup compactions automatically merge incremental backups, allowing you to take up to 400 incrementals between full backups instead of the usual 48-backup limit. This maintains the same RPO while reducing storage costs. Enabling backup compactions also improves restore performance and is required for using WITH EXPERIMENTAL COPY.

How it works

When enabled for scheduled backups, compaction jobs automatically trigger when the number of backups in the backup chain—the full backup and its incrementals—reaches a configured quantity. These jobs merge multiple consecutive incremental backups into a single compacted backup, reducing chain length while preserving data. Compactions run through to minimize impact on foreground operations. With backup compaction enabled, each backup chain can grow up to a maximum recommended size of 400 incremental backups, far more than the maximum of 48 recommended when using incremental backups without compaction.

Backup compactions improve restore performance by creating consolidated SST files from incremental backups. The original backups remain available for point-in-time restores. For fine-grained point-in-time restore, use .

Enable backup compactions

Set the backup.compaction.threshold cluster setting to 4:

SET CLUSTER SETTING backup.compaction.threshold = 4;

Higher values are not recommended.

0: Disabled (default)
4: Recommended, if enabling backup compactions. Compaction occurs when chain reaches length 4 (1 full + 3 incrementals)

Compactions only apply to scheduled backups with full and incremental backups. Manual BACKUP statements do not trigger compactions. In addition to full and incremental backups, compactions also support locality-aware, revision history, and encrypted backups.

For example:

SET CLUSTER SETTING backup.compaction.threshold = 4;

CREATE SCHEDULE hourly_backups
FOR BACKUP DATABASE movr INTO 'external://backup_s3'
RECURRING '@hourly'
FULL BACKUP '@weekly';

This configuration runs hourly incrementals with weekly full backups. Each time the chain reaches 4, compaction automatically merges incrementals.

Examples

Per our guidance in the Performance section, we recommend starting backups from a time at least 10 seconds in the past using AS OF SYSTEM TIME. The examples in this section use one of the following storage URIs:

External connections, which allow you to represent an external storage or sink URI. You can then specify the external connection’s name in statements rather than the provider-specific URI. For detail on using external connections, see the CREATE EXTERNAL CONNECTION page.
Amazon S3 connection strings with the default AUTH=specified parameter. For guidance on using AUTH=implicit authentication with Amazon S3 buckets instead, read Cloud Storage Authentication.

For guidance on connecting to other storage options or using other authentication parameters instead, read Use Cloud Storage. If you need to limit the control specific users have over your storage buckets, see Assume role authentication for setup instructions.

The BACKUP ... TO and RESTORE ... FROM {storage_uri} syntax has been removed from CockroachDB v24.3 and later. For details on the syntax to run BACKUP and RESTORE, refer to the backup and restore examples.

Back up a cluster

To take a full backup of a cluster:

BACKUP INTO 'external://backup_s3' AS OF SYSTEM TIME '-10s';

You will encounter an error if you run multiple backup collections to the same storage URI. Backup collections can contain multiple full and incremental backups, but each collection’s URI must be unique. If you are using backup schedules, each schedule must have a unique URI.

Back up a database

To take a full backup of a single database:

BACKUP DATABASE bank INTO 'external://backup_s3' AS OF SYSTEM TIME '-10s';

To take a full backup of multiple databases:

BACKUP DATABASE bank, employees INTO 'external://backup_s3' AS OF SYSTEM TIME '-10s';

Back up a table or view

To take a full backup of a single table or view:

BACKUP bank.customers INTO 'external://backup_s3' AS OF SYSTEM TIME '-10s';

To take a full backup of multiple tables:

BACKUP bank.customers, bank.accounts INTO 'external://backup_s3' AS OF SYSTEM TIME '-10s';

Back up all tables in a schema

To back up all tables in a specified schema, use a wildcard (*) with the schema name:

BACKUP test_schema.* INTO 'external://backup_s3' AS OF SYSTEM TIME '-10s';

Alternatively, use a fully qualified name: database.schema.*. With this syntax, schemas will be resolved before databases. test_object.* will resolve to a schema of test_object within the current database before matching to a database of test_object. If a database and schema have the same name, such as bank.bank, running BACKUP bank.* will result in the schema resolving first. All the tables within that schema will be backed up. However, if this were to be run from a different database that does not have a bank schema, all tables in the bank database will be backed up. See Name Resolution for more details on how naming hierarchy and name resolution work in CockroachDB.

Create incremental backups

When a BACKUP statement specifies an existing subdirectory in the collection, explicitly or via the LATEST keyword, an incremental backup will be added to the default /incrementals directory at the root of the storage location. To take an incremental backup using the LATEST keyword:

BACKUP INTO LATEST IN 'external://backup_s3' AS OF SYSTEM TIME '-10s';

To store the backup in an existing subdirectory in the collection:

BACKUP INTO {'subdirectory'} IN 'external://backup_s3' AS OF SYSTEM TIME '-10s';

If you intend to take a full backup, we recommend running BACKUP INTO {collectionURI} without specifying a subdirectory.
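To find an existing subdirectory before adding an incremental backup to it, a workflow along these lines may help. The subdirectory name below is the illustrative value from the Parameters table and will differ in your collection.

```sql
-- List the full-backup subdirectories in the collection.
SHOW BACKUPS IN 'external://backup_s3';

-- Inspect the contents of the most recent backup.
SHOW BACKUP FROM LATEST IN 'external://backup_s3';

-- Append an incremental backup to a specific full backup.
BACKUP INTO '2021/03/23-213101.37' IN 'external://backup_s3'
  AS OF SYSTEM TIME '-10s';
```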

Run a backup asynchronously

Use the DETACHED option to execute the backup asynchronously:

BACKUP INTO 'external://backup_s3' AS OF SYSTEM TIME '-10s' WITH DETACHED;

The job ID is returned as soon as the backup job is created, without waiting for the backup to finish:

        job_id
----------------------
  592786066399264769
(1 row)

Without the DETACHED option, BACKUP will block the SQL connection until the job completes. Once finished, the job status and more detailed job data are returned:

job_id             |  status   | fraction_completed | rows | index_entries | bytes
-------------------+-----------+--------------------+------+---------------+--------
652471804772712449 | succeeded |                  1 |   50 |             0 |  4911
(1 row)

Back up with an S3 storage class

To associate your backup objects with a specific storage class in your Amazon S3 bucket, use the S3_STORAGE_CLASS parameter with the class. For example, the following S3 connection URI specifies the INTELLIGENT_TIERING storage class:

BACKUP DATABASE movr INTO 's3://{BUCKET NAME}?AWS_ACCESS_KEY_ID={KEY ID}&AWS_SECRET_ACCESS_KEY={SECRET ACCESS KEY}&S3_STORAGE_CLASS=INTELLIGENT_TIERING' AS OF SYSTEM TIME '-10s';

To use an external connection URI to back up to cloud storage with an associated S3 storage class, you need to include the S3_STORAGE_CLASS parameter when you create the external connection. Use the parameter to set one of the storage classes listed in Amazon’s documentation. For more general usage information, see Amazon’s Using Amazon S3 storage classes documentation.
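A sketch of the external connection variant follows. The connection name `backup_s3_tiered` is hypothetical, and the query parameters are written in uppercase as required for external connections.

```sql
-- The storage class is fixed when the connection is created.
CREATE EXTERNAL CONNECTION backup_s3_tiered
  AS 's3://{BUCKET NAME}?AWS_ACCESS_KEY_ID={KEY ID}&AWS_SECRET_ACCESS_KEY={SECRET ACCESS KEY}&S3_STORAGE_CLASS=INTELLIGENT_TIERING';

-- Backups written through this connection use that storage class.
BACKUP DATABASE movr INTO 'external://backup_s3_tiered' AS OF SYSTEM TIME '-10s';
```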

Incremental backups are not compatible with the S3 Glacier Flexible Retrieval or Glacier Deep Archive storage classes. Incremental backups require the reading of previous backups on an ad-hoc basis, which is not possible with backup files already in Glacier Flexible Retrieval or Glacier Deep Archive. This is because these storage classes do not allow immediate access to an S3 object without first restoring the archived object to its S3 bucket. Refer to the incremental backups documentation for more detail.

Advanced examples

For examples of advanced BACKUP and RESTORE use cases, see: