EXPORT statement exports tabular data or the results of arbitrary
SELECT statements to CSV files.
Using the CockroachDB distributed execution engine,
EXPORT parallelizes CSV creation across all nodes in the cluster, making it possible to quickly get large sets of data out of CockroachDB in a format that can be ingested by downstream systems. If you do not need distributed exports, you can use the non-enterprise feature to export tabular data in CSV format.
Export file location
You can use remote cloud storage (Amazon S3, Google Cloud Platform, etc.) to store the exported CSV data. Alternatively, you can use an HTTP server accessible from all nodes.
For simplicity's sake, it's strongly recommended to use cloud/remote storage for the data you want to export. Local files are supported; however, they must be accessible identically from all nodes in the cluster.
After the export has been initiated, you can cancel it with
EXPORTstatement cannot be used within a transaction.
Only members of the
admin role can run
EXPORT. By default, the
root user belongs to the
||Specify the URL of the file location where you want to store the exported CSV data.
Note: Exports do not generate unique names across exports, so each export should have a unique destination to avoid overwriting.
||Control your export's behavior with these options.|
||Specify the query whose result you want to export to CSV format.|
||Specify the name of the table you want to export to CSV format.|
Export file URL
You can specify the base directory where you want to store the exported .csv files. CockroachDB will create the export file(s) in the specified directory with programmatically generated names (e.g., n1.1.csv, n1.2.csv, n2.1.csv, ...). Each export should have a unique destination to avoid overwriting other exports.
URLs for the file directory location you want to export to must use the following format:
||N/A (see Example file URLs||
|Google Cloud 2||
|S3-compatible services 6||
If your environment requires an HTTP or HTTPS proxy server for outgoing connections, you can set the standard
HTTPS_PROXY environment variables when starting CockroachDB.
New in v20.1: If you cannot run a full proxy, you can disable external HTTP(S) access (as well as custom HTTP(S) endpoints) when performing bulk operations (e.g.,
RESTORE, etc.) by using the
--external-io-disable-http flag. You can also disable the use of implicit credentials when accessing external cloud storage services for various bulk operations by using the
1 If the
AUTHparameter is not provided, AWS connections default to
specifiedand the access keys must be provided in the URI parameters. If the
implicit, the access keys can be ommitted and the credentials will be loaded from the environment.
2 If the
AUTHparameter is not specified, the
cloudstorage.gs.default.keycluster setting will be used if it is non-empty, otherwise the
implicitbehavior is used. If the
implicit, all GCS connections use Google's default authentication strategy. If the
cloudstorage.gs.default.keycluster setting must be set to the contents of a service account file which will be used during authentication. If the
specified, GCS connections are authenticated on a per-statement basis, which allows the JSON key object to be sent in the
CREDENTIALSparameter. The JSON key object should be base64-encoded (using the standard encoding in RFC 4648).
3 You can create your own HTTP server with Caddy or nginx. A custom root CA can be appended to the system's default CAs by setting the
cloudstorage.http.custom_cacluster setting, which will be used when verifying certificates from HTTPS URLs.
4 The file system backup location on the NFS drive is relative to the path specified by the
--external-io-dirflag set while starting the node. If the flag is set to
disabled, then imports from local directories and NFS drives are disabled.
5 New in v20.1: Using a
nodeIDis required and the data files will be in the
externdirectory of the specified node. In most cases (including single-node clusters), using
nodelocal://1/<path>is sufficient. Use
selfif you do not want to specify a
nodeID, and the individual data files will be in the
externdirectories of arbitrary nodes; however, to work correctly, each node must have the
--external-io-dirflag point to the same NFS mount or other network-backed, shared storage.
6 A custom root CA can be appended to the system's default CAs by setting the
cloudstorage.http.custom_cacluster setting, which will be used when verifying certificates from an S3-compatible service.
AWS_REGIONparameter is optional since it is not a required parameter for most S3-compatible services. Specify the parameter only if your S3-compatible service requires it.
Example file URLs
You can control the
EXPORT process's behavior using any of the following key-value pairs as a
If not using comma as your column delimiter, you can specify another ASCII character as the delimiter.
|Value||The ASCII character that delimits columns in your rows|
|Example||To use tab-delimited values:
Convert SQL NULL values so they match the specified string.
|Value||The string that should be used to represent NULL values. To avoid collisions, it is important to pick
|Example||To use empty columns as NULL:
Export a table
> EXPORT INTO CSV 'azure://acme-co/customer-export-data?AZURE_ACCOUNT_KEY=hash&AZURE_ACCOUNT_NAME=acme-co' WITH delimiter = '|' FROM TABLE bank.customers;
Export using a
> EXPORT INTO CSV 'azure://acme-co/customer-export-data?AZURE_ACCOUNT_KEY=hash&AZURE_ACCOUNT_NAME=acme-co' FROM SELECT * FROM bank.customers WHERE id >= 100;
Non-distributed export using the SQL shell
$ cockroach sql -e "SELECT * from bank.customers WHERE id>=100;" --format=csv > my.csv
View a running export
View running exports by using
> SHOW QUERIES;
Cancel a running export
> CANCEL QUERY '14dacc1f9a781e3d0000000000000001';
EXPORT may fail with an error if the SQL statements are incompatible with DistSQL. In that case, use the non-enterprise feature to export tabular data in CSV format.