IMPORT

On this page

The IMPORT statement imports the following types of data into CockroachDB:

Tip:

IMPORT only works for creating new tables. For information on how to import into existing tables, see IMPORT INTO. Also, for instructions and working examples on how to migrate data from other databases, see the Migration Overview.

Warning:

IMPORT is a blocking statement and cannot be used within a transaction. Also, IMPORT cannot be used during a rolling upgrade.

Required privileges

Only members of the admin role can run IMPORT. By default, the root user belongs to the admin role.

Synopsis

Import a table from CSV or Avro

Import a database or table from dump file

Parameters

For import from CSV or Avro

Parameter	Description
`table_name`	The name of the table you want to import/create.
`table_elem_list`	The table schema you want to use.
`CREATE USING file_location`	If not specifying the table schema inline via `table_elem_list`, this is the URL of a SQL file containing the table schema.
`file_location`	The URL of a CSV file containing the table data. This can be a comma-separated list of URLs to CSV files or specified by a `*` wildcard character to include matching files under the specified path.
`WITH kv_option_list`	Control your import's behavior with these options.

For import from dump file

Parameter	Description
`table_name`	The name of the table you want to import/create. Use this when the dump file contains a specific table. Leave out `TABLE table_name FROM` when the dump file contains an entire database.
`import_format`	`PGDUMP`, `MYSQLDUMP`, or `DELIMITED DATA`
`file_location`	The URL of a dump file you want to import.
`WITH kv_option_list`	Control your import's behavior with these options.

Delimited data files

The DELIMITED DATA format can be used to import delimited data from any text file type, while ignoring characters that need to be escaped, like the following:

The file's delimiter (\t by default)
Double quotes (")
Newline (\n)
Carriage return (\r)

For examples showing how to use the DELIMITED DATA format, see the Examples section below.

Import file URLs

URLs for the files you want to import must use the format shown below. For examples, see Example file URLs.

[scheme]://[host]/[path]?[parameters]

Location	Scheme	Host	Parameters
Amazon	`s3`	Bucket name	`AUTH` ¹ (optional; can be `implicit` or `specified`), `AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY`, `AWS_SESSION_TOKEN`
Azure	`azure`	N/A (see Example file URLs	`AZURE_ACCOUNT_KEY`, `AZURE_ACCOUNT_NAME`
Google Cloud ²	`gs`	Bucket name	`AUTH` (optional; can be `default`, `implicit`, or `specified`), `CREDENTIALS`
HTTP ³	`http`	Remote host	N/A
NFS/Local ⁴	`nodelocal`	`nodeID` or `self` ⁵ (see Example file URLs)	N/A
S3-compatible services ⁶	`s3`	Bucket name	`AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY`, `AWS_SESSION_TOKEN`, `AWS_REGION` ⁷ (optional), `AWS_ENDPOINT`

Note:

The location parameters often contain special characters that need to be URI-encoded. Use Javascript's encodeURIComponent function or Go language's url.QueryEscape function to URI-encode the parameters. Other languages provide similar functions to URI-encode special characters.

Note:

If your environment requires an HTTP or HTTPS proxy server for outgoing connections, you can set the standard HTTP_PROXY and HTTPS_PROXY environment variables when starting CockroachDB.

New in v20.1: If you cannot run a full proxy, you can disable external HTTP(S) access (as well as custom HTTP(S) endpoints) when performing bulk operations (e.g., BACKUP, RESTORE, etc.) by using the --external-io-disable-http flag. You can also disable the use of implicit credentials when accessing external cloud storage services for various bulk operations by using the --external-io-disable-implicit-credentials flag.

¹ If the AUTH parameter is not provided, AWS connections default to specified and the access keys must be provided in the URI parameters. If the AUTH parameter is implicit, the access keys can be omitted and the credentials will be loaded from the environment.
² If the AUTH parameter is not specified, the cloudstorage.gs.default.key cluster setting will be used if it is non-empty, otherwise the implicit behavior is used. If the AUTH parameter is implicit, all GCS connections use Google's default authentication strategy. If the AUTH parameter is default, the cloudstorage.gs.default.key cluster setting must be set to the contents of a service account file which will be used during authentication. If the AUTH parameter is specified, GCS connections are authenticated on a per-statement basis, which allows the JSON key object to be sent in the CREDENTIALS parameter. The JSON key object should be base64-encoded (using the standard encoding in RFC 4648).
³ You can create your own HTTP server with Caddy or nginx. A custom root CA can be appended to the system's default CAs by setting the cloudstorage.http.custom_ca cluster setting, which will be used when verifying certificates from HTTPS URLs.
⁴ The file system backup location on the NFS drive is relative to the path specified by the --external-io-dir flag set while starting the node. If the flag is set to disabled, then imports from local directories and NFS drives are disabled.
⁵ New in v20.1: Using a nodeID is required and the data files will be in the extern directory of the specified node. In most cases (including single-node clusters), using nodelocal://1/<path> is sufficient. Use self if you do not want to specify a nodeID, and the individual data files will be in the extern directories of arbitrary nodes; however, to work correctly, each node must have the --external-io-dir flag point to the same NFS mount or other network-backed, shared storage.
⁶ A custom root CA can be appended to the system's default CAs by setting the cloudstorage.http.custom_ca cluster setting, which will be used when verifying certificates from an S3-compatible service.
⁷ The AWS_REGION parameter is optional since it is not a required parameter for most S3-compatible services. Specify the parameter only if your S3-compatible service requires it.

Example file URLs

Location	Example
Amazon S3	`s3://acme-co/employees?AWS_ACCESS_KEY_ID=123&AWS_SECRET_ACCESS_KEY=456`
Azure	`azure://employees?AZURE_ACCOUNT_KEY=123&AZURE_ACCOUNT_NAME=acme-co`
Google Cloud	`gs://acme-co`
HTTP	`http://localhost:8080/employees`
NFS/Local	`nodelocal://1/path/employees`, `nodelocal://self/nfsmount/backups/employees` ⁵

Import options

You can control the IMPORT process's behavior using any of the following optional key-value pairs as a kv_option. To set multiple import options, use a comma-separated list (see examples).

Key	Context	Value
`delimiter`	`CSV DATA`	The unicode character that delimits columns in your rows. Default: `,`.
`comment`	`CSV DATA`	The unicode character that identifies rows to skip.
`strict_quotes`	`CSV DATA`	Use if CSV import files have quotes (`""`) within rows to prevent multiple rows from being treated as single rows. Default: Off
`nullif`	`CSV DATA`, `DELIMITED DATA`	The string that should be converted to NULL.
`skip`	`CSV DATA`, `DELIMITED DATA`	The number of rows to be skipped while importing a file. Default: `'0'`.
`decompress`	General	The decompression codec to be used: `gzip`, `bzip`, `auto`, or `none`. Default: `'auto'`, which guesses based on file extension (`.gz`, `.bz`, `.bz2`). `none` disables decompression.
`skip_foreign_keys`	`PGDUMP`, `MYSQLDUMP`	Ignore foreign key constraints in the dump file's DDL. Default: Off. May be necessary to import a table with unsatisfied foreign key constraints from a full database dump.
`max_row_size`	`PGDUMP`	Override limit on line size. Default: 0.5MB. This setting may need to be tweaked if your Postgres dump file has extremely long lines, for example as part of a `COPY` statement.
`rows_terminated_by`	`DELIMITED DATA`	The unicode character to indicate new lines in the input file. Default: `\n`
`fields_terminated_by`	`DELIMITED DATA`	The unicode character used to separate fields in each input line. Default: `\t`
`fields_enclosed_by`	`DELIMITED DATA`	The unicode character that encloses fields. Default: `"`
`fields_escaped_by`	`DELIMITED DATA`	The unicode character, when preceding one of the above `DELIMITED DATA` options, to be interpreted literally.
`strict_validation`	`AVRO DATA`	Rejects Avro records that do not have a one-to-one mapping between Avro fields to the target CockroachDB schema. By default, CockroachDB ignores unknown Avro fields and sets missing SQL fields to `NULL`. CockroachDB will also attempt to convert the Avro field to the CockroachDB data type; otherwise, it will report an error.
`records_terminated_by`	`AVRO DATA`	The unicode character to indicate new lines in the input binary or JSON file. This is not needed for Avro OCF. Default: `\n`
`data_as_binary_records`	`AVRO DATA`	Use when importing a binary file containing Avro records. The schema is not included in the file, so you need to specify the schema with either the `schema` or `schema_uri` option.
`data_as_json_records`	`AVRO DATA`	Use when importing a JSON file containing Avro records. The schema is not included in the file, so you need to specify the schema with either the `schema` or `schema_uri` option.
`schema`	`AVRO DATA`	The schema of the Avro records included in the binary or JSON file. This is not needed for Avro OCF.
`schema_uri`	`AVRO DATA`	The URI of the file containing the schema of the Avro records include in the binary or JSON file. This is not needed for Avro OCF.

For examples showing how to use these options, see the Examples section below.

For instructions and working examples showing how to migrate data from other databases and formats, see the Migration Overview.

Requirements

Prerequisites

Before using IMPORT, you should have:

The schema of the table you want to import.
The data you want to import, preferably hosted on cloud storage. This location must be equally accessible to all nodes using the same import file location. This is necessary because the IMPORT statement is issued once by the client, but is executed concurrently across all nodes of the cluster. For more information, see the Import file location section below.

Import targets

Imported tables must not exist and must be created in the IMPORT statement. If the table you want to import already exists, you must drop it with DROP TABLE or use IMPORT INTO.

You can specify the target database in the table name in the IMPORT statement. If it's not specified there, the active database in the SQL session is used.

Create table

Your IMPORT statement must reference a CREATE TABLE statement representing the schema of the data you want to import. You have several options:

Specify the table's columns explicitly from the SQL client. For an example, see Import a table from a CSV file below.
Load a file that already contains a CREATE TABLE statement. For an example, see Import a Postgres database dump below.

We also recommend specifying all secondary indexes you want to use in the CREATE TABLE statement. It is possible to add secondary indexes later, but it is significantly faster to specify them during import.

Warning:

Column values cannot be generated by DEFAULT when importing; an import must include a value for every column specified in the IMPORT statement. To use DEFAULT values, your file must contain values for the column upon import, or you can add the column or alter the column after the table has been imported.

Note:

By default, the Postgres and MySQL import formats support foreign keys. However, the most common dependency issues during import are caused by unsatisfied foreign key relationships that cause errors like pq: there is no unique constraint matching given keys for referenced table tablename. You can avoid these issues by adding the skip_foreign_keys option to your IMPORT statement as needed. Ignoring foreign constraints will also speed up data import.

Available storage

Each node in the cluster is assigned an equal part of the imported data, and so must have enough temp space to store it. In addition, data is persisted as a normal table, and so there must also be enough space to hold the final, replicated data. The node's first-listed/default store directory must have enough available storage to hold its portion of the data.

On cockroach start, if you set --max-disk-temp-storage, it must also be greater than the portion of the data a node will store in temp space.

Import file location

We strongly recommend using cloud/remote storage (Amazon S3, Google Cloud Platform, etc.) for the data you want to import.

Local files are supported; however, they must be accessible to all nodes in the cluster using identical Import file URLs.

To import a local file, you have the following options:

Option 1. Run a local file server to make the file accessible from all nodes.
Option 2. Make the file accessible from each local node's store:
1. Create an extern directory on each node's store. The pathname will differ depending on the --store flag passed to cockroach start (if any), but will look something like /path/to/cockroach-data/extern/.
2. Copy the file to each node's extern directory.
3. Assuming the file is called data.sql, you can access it in your IMPORT statement using the following import file URL: 'nodelocal://1/data.sql'.

Table users and privileges

Imported tables are treated as new tables, so you must GRANT privileges to them.

Performance

All nodes are used during the import job, which means all nodes' CPU and RAM will be partially consumed by the IMPORT task in addition to serving normal traffic.
To improve performance, import at least as many files as you have nodes (i.e., there is at least one file for each node to import) to increase parallelism.
To further improve performance, order the data in the imported files by primary key and ensure the primary keys do not overlap between files.

Viewing and controlling import jobs

After CockroachDB initiates an import, you can view its progress with SHOW JOBS and on the Jobs page of the Admin UI, and you can control it with PAUSE JOB, RESUME JOB, and CANCEL JOB.

Note:

If initiated correctly, the statement returns when the import is finished or if it encounters an error. In some cases, the import can continue after an error has been returned (the error message will tell you that the import has resumed in the background).

Changed in v20.1: When resumed, paused imports now continue from their internally recorded progress instead of starting over.

Examples

Import a table from a CSV file

To specify the table schema in-line:

Amazon S3:

> IMPORT TABLE customers (
        id UUID PRIMARY KEY,
        name TEXT,
        INDEX name_idx (name)
)
CSV DATA ('s3://acme-co/customers.csv?AWS_ACCESS_KEY_ID=[placeholder]&AWS_SECRET_ACCESS_KEY=[placeholder]&AWS_SESSION_TOKEN=[placeholder]')
;

Azure:

> IMPORT TABLE customers (
        id UUID PRIMARY KEY,
        name TEXT,
        INDEX name_idx (name)
)
CSV DATA ('azure://acme-co/customer-import-data.csv?AZURE_ACCOUNT_KEY=hash&AZURE_ACCOUNT_NAME=acme-co')
;

Google Cloud:

> IMPORT TABLE customers (
        id UUID PRIMARY KEY,
        name TEXT,
        INDEX name_idx (name)
)
CSV DATA ('gs://acme-co/customers.csv')
;

To use a file to specify the table schema:

Amazon S3:

> IMPORT TABLE customers
CREATE USING 's3://acme-co/customers-create-table.sql?AWS_ACCESS_KEY_ID=[placeholder]&AWS_SECRET_ACCESS_KEY=[placeholder]'
CSV DATA ('s3://acme-co/customers.csv?AWS_ACCESS_KEY_ID=[placeholder]&AWS_SECRET_ACCESS_KEY=[placeholder]')
;

Azure:

> IMPORT TABLE customers
CREATE USING 'azure://acme-co/customer-create-table.sql?AZURE_ACCOUNT_KEY=hash&AZURE_ACCOUNT_NAME=acme-co'
CSV DATA ('azure://acme-co/customer-import-data.csv?AZURE_ACCOUNT_KEY=hash&AZURE_ACCOUNT_NAME=acme-co')
;

Google Cloud:

> IMPORT TABLE customers
CREATE USING 'gs://acme-co/customers-create-table.sql'
CSV DATA ('gs://acme-co/customers.csv')
;

Note:

The column order in your schema must match the column order in the file being imported.

Import a table from multiple CSV files

Using a comma-separated list

Amazon S3:

> IMPORT TABLE customers (
        id UUID PRIMARY KEY,
        name TEXT,
        INDEX name_idx (name)
)
CSV DATA (
    's3://acme-co/customers.csv?AWS_ACCESS_KEY_ID=[placeholder]&AWS_SECRET_ACCESS_KEY=[placeholder]',
    's3://acme-co/customers2.csv?AWS_ACCESS_KEY_ID=[placeholder]&AWS_SECRET_ACCESS_KEY=[placeholder',
    's3://acme-co/customers3.csv?AWS_ACCESS_KEY_ID=[placeholder]&AWS_SECRET_ACCESS_KEY=[placeholder]',
    's3://acme-co/customers4.csv?AWS_ACCESS_KEY_ID=[placeholder]&AWS_SECRET_ACCESS_KEY=[placeholder]',
);

Azure:

> IMPORT TABLE customers (
        id UUID PRIMARY KEY,
        name TEXT,
        INDEX name_idx (name)
)
CSV DATA (
    'azure://acme-co/customer-import-data1.1.csv?AZURE_ACCOUNT_KEY=hash&AZURE_ACCOUNT_NAME=acme-co',
    'azure://acme-co/customer-import-data1.2.csv?AZURE_ACCOUNT_KEY=hash&AZURE_ACCOUNT_NAME=acme-co',
    'azure://acme-co/customer-import-data1.3.csv?AZURE_ACCOUNT_KEY=hash&AZURE_ACCOUNT_NAME=acme-co',
    'azure://acme-co/customer-import-data1.4.csv?AZURE_ACCOUNT_KEY=hash&AZURE_ACCOUNT_NAME=acme-co',
    'azure://acme-co/customer-import-data1.5.csv?AZURE_ACCOUNT_KEY=hash&AZURE_ACCOUNT_NAME=acme-co',    
);

Google Cloud:

> IMPORT TABLE customers (
        id UUID PRIMARY KEY,
        name TEXT,
        INDEX name_idx (name)
)
CSV DATA (
    'gs://acme-co/customers.csv',
    'gs://acme-co/customers2.csv',
    'gs://acme-co/customers3.csv',
    'gs://acme-co/customers4.csv',
);

Using a wildcard

You can specify file patterns to match instead of explicitly listing every file. Paths are matched Use the * wildcard character to include matching files directly under the specified path. A wildcard can be used to include:

All files in a given directory (e.g.,s3://bucket-name/path/to/data/*)
All files in a given directory that end with a given string (e.g., s3://bucket-name/files/*.csv)
All files in a given directory that start with a given string (e.g., s3://bucket-name/files/data*)
All files in a given directory that start and end with a given string (e.g., s3://bucket-name/files/data*.csv)

These only match files directly under the specified path and do not descend into additional directories recursively.

Amazon S3:

> IMPORT TABLE customers (
        id UUID PRIMARY KEY,
        name TEXT,
        INDEX name_idx (name)
)
CSV DATA (
    's3://acme-co/*?AWS_ACCESS_KEY_ID=[placeholder]&AWS_SECRET_ACCESS_KEY=[placeholder]'
);

Azure:

> IMPORT TABLE customers (
        id UUID PRIMARY KEY,
        name TEXT,
        INDEX name_idx (name)
)
CSV DATA (
    'azure://acme-co/customer-import-data*?AZURE_ACCOUNT_KEY=hash&AZURE_ACCOUNT_NAME=acme-co'  
);

Google Cloud:

> IMPORT TABLE customers (
        id UUID PRIMARY KEY,
        name TEXT,
        INDEX name_idx (name)
)
CSV DATA (
    'gs://acme-co/*'
);

Import a table from a TSV file

Amazon S3:

> IMPORT TABLE customers (
        id UUID PRIMARY KEY,
        name TEXT,
        INDEX name_idx (name)
)
CSV DATA ('s3://acme-co/customers.tsv?AWS_ACCESS_KEY_ID=[placeholder]&AWS_SECRET_ACCESS_KEY=[placeholder]')
WITH
    delimiter = e'\t'
;

Azure:

> IMPORT TABLE customers (
        id UUID PRIMARY KEY,
        name TEXT,
        INDEX name_idx (name)
)
CSV DATA ('azure://acme-co/customer-import-data.tsv?AZURE_ACCOUNT_KEY=hash&AZURE_ACCOUNT_NAME=acme-co')
WITH
    delimiter = e'\t'
;

Google Cloud:

> IMPORT TABLE customers (
        id UUID PRIMARY KEY,
        name TEXT,
        INDEX name_idx (name)
)
CSV DATA ('gs://acme-co/customers.tsv')
WITH
    delimiter = e'\t'
;

Skip commented lines

The comment option determines which Unicode character marks the rows in the data to be skipped.

Amazon S3:

> IMPORT TABLE customers (
        id UUID PRIMARY KEY,
        name TEXT,
        INDEX name_idx (name)
)
CSV DATA ('s3://acme-co/customers.csv?AWS_ACCESS_KEY_ID=[placeholder]&AWS_SECRET_ACCESS_KEY=[placeholder]')
WITH
    comment = '#'
;

Azure:

> IMPORT TABLE customers (
        id UUID PRIMARY KEY,
        name TEXT,
        INDEX name_idx (name)
)
CSV DATA ('azure://acme-co/customer-import-data.csv?AZURE_ACCOUNT_KEY=hash&AZURE_ACCOUNT_NAME=acme-co')
WITH
    comment = '#'
;

Google Cloud:

> IMPORT TABLE customers (
        id UUID PRIMARY KEY,
        name TEXT,
        INDEX name_idx (name)
)
CSV DATA ('gs://acme-co/customers.csv')
WITH
    comment = '#'
;

Skip first n lines

The skip option determines the number of header rows to skip when importing a file.

Amazon S3:

> IMPORT TABLE customers (
        id UUID PRIMARY KEY,
        name TEXT,
        INDEX name_idx (name)
)
CSV DATA ('s3://acme-co/customers.csv?AWS_ACCESS_KEY_ID=[placeholder]&AWS_SECRET_ACCESS_KEY=[placeholder]')
WITH
    skip = '2'
;

Azure:

> IMPORT TABLE customers (
        id UUID PRIMARY KEY,
        name TEXT,
        INDEX name_idx (name)
)
CSV DATA ('azure://acme-co/customer-import-data.csv?AZURE_ACCOUNT_KEY=hash&AZURE_ACCOUNT_NAME=acme-co')
WITH
    skip = '2'
;

Google Cloud:

> IMPORT TABLE customers (
        id UUID PRIMARY KEY,
        name TEXT,
        INDEX name_idx (name)
)
CSV DATA ('gs://acme-co/customers.csv')
WITH
    skip = '2'
;

Use blank characters as `NULL`

The nullif option defines which string should be converted to NULL.

Amazon S3:

> IMPORT TABLE customers (
        id UUID PRIMARY KEY,
        name TEXT,
        INDEX name_idx (name)
)
CSV DATA ('s3://acme-co/employees.csv?AWS_ACCESS_KEY_ID=[placeholder]&AWS_SECRET_ACCESS_KEY=[placeholder]')
WITH
    nullif = ''
;

Azure:

> IMPORT TABLE customers (
        id UUID PRIMARY KEY,
        name TEXT,
        INDEX name_idx (name)
)
CSV DATA ('azure://acme-co/customer-import-data.csv?AZURE_ACCOUNT_KEY=hash&AZURE_ACCOUNT_NAME=acme-co')
WITH
    nullif = ''
;

Google Cloud:

> IMPORT TABLE customers (
        id UUID PRIMARY KEY,
        name TEXT,
        INDEX name_idx (name)
)
CSV DATA ('gs://acme-co/customers.csv')
WITH
    nullif = ''
;

Import a compressed CSV file

CockroachDB chooses the decompression codec based on the filename (the common extensions .gz or .bz2 and .bz) and uses the codec to decompress the file during import.

Amazon S3:

> IMPORT TABLE customers (
        id UUID PRIMARY KEY,
        name TEXT,
        INDEX name_idx (name)
)
CSV DATA ('s3://acme-co/employees.csv.gz?AWS_ACCESS_KEY_ID=[placeholder]&AWS_SECRET_ACCESS_KEY=[placeholder]')
;

Azure:

> IMPORT TABLE customers (
        id UUID PRIMARY KEY,
        name TEXT,
        INDEX name_idx (name)
)
CSV DATA ('azure://acme-co/customer-import-data.csv.gz?AZURE_ACCOUNT_KEY=hash&AZURE_ACCOUNT_NAME=acme-co')
;

Google Cloud:

> IMPORT TABLE customers (
        id UUID PRIMARY KEY,
        name TEXT,
        INDEX name_idx (name)
)
CSV DATA ('gs://acme-co/customers.csv.gz')
;

Optionally, you can use the decompress option to specify the codec to be used for decompressing the file during import:

Amazon S3:

> IMPORT TABLE customers (
        id UUID PRIMARY KEY,
        name TEXT,
        INDEX name_idx (name)
)
CSV DATA ('s3://acme-co/employees.csv.gz?AWS_ACCESS_KEY_ID=[placeholder]&AWS_SECRET_ACCESS_KEY=[placeholder]')
WITH
    decompress = 'gzip'
;

Azure:

> IMPORT TABLE customers (
        id UUID PRIMARY KEY,
        name TEXT,
        INDEX name_idx (name)
)
CSV DATA ('azure://acme-co/customer-import-data.csv.gz.latest?AZURE_ACCOUNT_KEY=hash&AZURE_ACCOUNT_NAME=acme-co')
WITH
    decompress = 'gzip'
;

Google Cloud:

> IMPORT TABLE customers (
        id UUID PRIMARY KEY,
        name TEXT,
        INDEX name_idx (name)
)
CSV DATA ('gs://acme-co/customers.csv.gz')
WITH
    decompress = 'gzip'
;

Import a Postgres database dump

Amazon S3:

> IMPORT PGDUMP 's3://your-external-storage/employees.sql?AWS_ACCESS_KEY_ID=[placeholder]&AWS_SECRET_ACCESS_KEY=[placeholder]';

Azure:

> IMPORT PGDUMP 'azure://acme-co/employees.sql?AZURE_ACCOUNT_KEY=hash&AZURE_ACCOUNT_NAME=acme-co';

Google Cloud:

> IMPORT PGDUMP 'gs://acme-co/employees.sql';

For the commands above to succeed, you need to have created the dump file with specific flags to pg_dump. For more information, see Migrate from Postgres.

Import a table from a Postgres database dump

Amazon S3:

> IMPORT TABLE employees FROM PGDUMP 's3://your-external-storage/employees-full.sql?AWS_ACCESS_KEY_ID=[placeholder]&AWS_SECRET_ACCESS_KEY=[placeholder]' WITH skip_foreign_keys;

Azure:

> IMPORT TABLE employees FROM PGDUMP 'azure://acme-co/employees.sql?AZURE_ACCOUNT_KEY=hash&AZURE_ACCOUNT_NAME=acme-co' WITH skip_foreign_keys;

Google Cloud:

> IMPORT TABLE employees FROM PGDUMP 'gs://acme-co/employees.sql' WITH skip_foreign_keys;

If the table schema specifies foreign keys into tables that do not exist yet, the WITH skip_foreign_keys shown may be needed. For more information, see the list of import options.

For the command above to succeed, you need to have created the dump file with specific flags to pg_dump. For more information, see Migrate from Postgres.

Import a CockroachDB dump file

Cockroach dump files can be imported using the IMPORT PGDUMP.

Amazon S3:

> IMPORT PGDUMP 's3://your-external-storage/employees-full.sql?AWS_ACCESS_KEY_ID=[placeholder]&AWS_SECRET_ACCESS_KEY=[placeholder]';

Azure:

> IMPORT PGDUMP 'azure://acme-co/employees.sql?AZURE_ACCOUNT_KEY=hash&AZURE_ACCOUNT_NAME=acme-co';

Google Cloud:

> IMPORT PGDUMP 'gs://acme-co/employees.sql';

For more information, see SQL Dump (Export).

Import a MySQL database dump

Amazon S3:

> IMPORT MYSQLDUMP 's3://your-external-storage/employees-full.sql?AWS_ACCESS_KEY_ID=[placeholder]&AWS_SECRET_ACCESS_KEY=[placeholder]';

Azure:

> IMPORT MYSQLDUMP 'azure://acme-co/employees.sql?AZURE_ACCOUNT_KEY=hash&AZURE_ACCOUNT_NAME=acme-co';

Google Cloud:

> IMPORT MYSQLDUMP 'gs://acme-co/employees.sql';

For more detailed information about importing data from MySQL, see Migrate from MySQL.

Import a table from a MySQL database dump

Amazon S3:

> IMPORT TABLE employees FROM MYSQLDUMP 's3://your-external-storage/employees-full.sql?AWS_ACCESS_KEY_ID=[placeholder]&AWS_SECRET_ACCESS_KEY=[placeholder]' WITH skip_foreign_keys;

Azure:

> IMPORT TABLE employees FROM MYSQLDUMP 'azure://acme-co/employees.sql?AZURE_ACCOUNT_KEY=hash&AZURE_ACCOUNT_NAME=acme-co' WITH skip_foreign_keys;

Google Cloud:

> IMPORT TABLE employees FROM MYSQLDUMP 'gs://acme-co/employees.sql' WITH skip_foreign_keys;

If the table schema specifies foreign keys into tables that do not exist yet, the WITH skip_foreign_keys shown may be needed. For more information, see the list of import options.

For more detailed information about importing data from MySQL, see Migrate from MySQL.

Import a delimited data file

Amazon S3:

> IMPORT DELIMITED DATA 's3://your-external-storage/employees-full.csv?AWS_ACCESS_KEY_ID=[placeholder]&AWS_SECRET_ACCESS_KEY=[placeholder]'
  WITH
    fields_terminated_by='|',
    fields_enclosed_by='"',
    fields_escaped_by='"';

Azure:

> IMPORT DELIMITED DATA 'azure://acme-co/employees.csv?AZURE_ACCOUNT_KEY=hash&AZURE_ACCOUNT_NAME=acme-co'
  WITH
    fields_terminated_by='|',
    fields_enclosed_by='"',
    fields_escaped_by='"';

Google Cloud:

> IMPORT DELIMITED DATA 'gs://acme-co/employees.csv'
  WITH
    fields_terminated_by='|',
    fields_enclosed_by='"',
    fields_escaped_by='"';

Note:

If you want to escape special symbols, use fields_escaped_by.

Import a table from a delimited data file

Amazon S3:

> IMPORT TABLE employees
    FROM DELIMITED DATA 's3://your-external-storage/employees.csv?AWS_ACCESS_KEY_ID=[placeholder]&AWS_SECRET_ACCESS_KEY=[placeholder]'
    WITH
      skip_foreign_keys;

Azure:

> IMPORT TABLE employees
    FROM DELIMITED DATA 'azure://acme-co/employees.csv?AZURE_ACCOUNT_KEY=hash&AZURE_ACCOUNT_NAME=acme-co'
    WITH
      skip_foreign_keys;

Google Cloud:

> IMPORT TABLE employees
    FROM DELIMITED DATA 'gs://acme-co/employees.csv'
    WITH
      skip_foreign_keys;

If the table schema specifies foreign keys into tables that do not exist yet, the WITH skip_foreign_keys shown may be needed. For more information, see the list of import options.

Import a table from a local file

If a nodeID is provided, the data files to import will be in the extern directory of the specified node:

$ cd node2
$ ls

000355.log              cockroach-temp700212211
000357.log              cockroach.advertise-addr
000359.sst              cockroach.advertise-sql-addr
COCKROACHDB_VERSION     cockroach.http-addr
CURRENT                     cockroach.listen-addr
IDENTITY                    cockroach.sql-addr
LOCK                        extern
MANIFEST-000010         logs
OPTIONS-000005          temp-dirs-record.txt
auxiliary

$ cd extern
$ ls

customers.csv

Then, specify which node to access by including the nodeID in the IMPORT statement:

> IMPORT TABLE customers (
        id UUID PRIMARY KEY,
        name STRING,
        INDEX name_idx (name)
)
CSV DATA ('nodelocal://2/customers.csv')
;

New in v20.1: You can also use the cockroach nodelocal upload command to upload a file to the external IO directory on a node's (the gateway node, by default) local file system.

Import a table from an Avro file

New in v20.1: Avro OCF data, JSON records, or binary records can be imported. The following are examples of importing Avro OCF data.

To specify the table schema in-line:

Amazon S3:

> IMPORT TABLE customers (
        id UUID PRIMARY KEY,
        name TEXT,
        INDEX name_idx (name)
)
AVRO DATA ('s3://acme-co/customers.avro?AWS_ACCESS_KEY_ID=[placeholder]&AWS_SECRET_ACCESS_KEY=[placeholder]&AWS_SESSION_TOKEN=[placeholder]')
;

Azure:

> IMPORT TABLE customers (
        id UUID PRIMARY KEY,
        name TEXT,
        INDEX name_idx (name)
)
AVRO DATA ('azure://acme-co/customer-import-data.avro?AZURE_ACCOUNT_KEY=hash&AZURE_ACCOUNT_NAME=acme-co')
;

Google Cloud:

> IMPORT TABLE customers (
        id UUID PRIMARY KEY,
        name TEXT,
        INDEX name_idx (name)
)
AVRO DATA ('gs://acme-co/customers.avro')
;

To use a file to specify the table schema:

Amazon S3:

> IMPORT TABLE customers
CREATE USING 's3://acme-co/customers-create-table.sql?AWS_ACCESS_KEY_ID=[placeholder]&AWS_SECRET_ACCESS_KEY=[placeholder]'
AVRO DATA ('s3://acme-co/customers.avro?AWS_ACCESS_KEY_ID=[placeholder]&AWS_SECRET_ACCESS_KEY=[placeholder]')
;

Azure:

> IMPORT TABLE customers
CREATE USING 'azure://acme-co/customer-create-table.sql?AZURE_ACCOUNT_KEY=hash&AZURE_ACCOUNT_NAME=acme-co'
AVRO DATA ('azure://acme-co/customer-import-data.avro?AZURE_ACCOUNT_KEY=hash&AZURE_ACCOUNT_NAME=acme-co')
;

Google Cloud:

> IMPORT TABLE customers
CREATE USING 'gs://acme-co/customers-create-table.sql'
AVRO DATA ('gs://acme-co/customers.avro')
;

For more detailed information about importing data from Avro and examples, see Migrate from Avro.

Known limitation

IMPORT can sometimes fail with a "context canceled" error, or can restart itself many times without ever finishing. If this is happening, it is likely due to a high amount of disk contention. This can be mitigated by setting the kv.bulk_io_write.max_rate cluster setting to a value below your max disk write speed. For example, to set it to 10MB/s, execute:

> SET CLUSTER SETTING kv.bulk_io_write.max_rate = '10MB';

IMPORT

Required privileges

Synopsis

Parameters

For import from CSV or Avro

For import from dump file

Delimited data files

Import file URLs

Example file URLs

Import options

Requirements

Prerequisites

Import targets

Create table

Available storage

Import file location

Table users and privileges

Performance

Viewing and controlling import jobs

Examples

Import a table from a CSV file

Import a table from multiple CSV files

Using a comma-separated list

Using a wildcard

Import a table from a TSV file

Skip commented lines

Skip first n lines

Use blank characters as NULL

Import a compressed CSV file

Import a Postgres database dump

Import a table from a Postgres database dump

Import a CockroachDB dump file

Import a MySQL database dump

Import a table from a MySQL database dump

Import a delimited data file

Import a table from a delimited data file

Import a table from a local file

Import a table from an Avro file

Known limitation

See also

Tell us about your experience

Select the problem area

Thank you for your feedback!

Explore More Documentation:

Use blank characters as `NULL`