This page describes basic considerations and provides a basic example of migrating data from MySQL to CockroachDB. The information on this page assumes that you have read Migration Overview, which describes the broad phases and considerations of migrating a database to CockroachDB.
The MySQL migration example on this page demonstrates how to use MOLT tooling to update the MySQL schema, perform an initial load of data, and validate the data. These steps are essential when preparing for a full migration.
If you need help migrating to CockroachDB, contact our sales team.
Syntax differences
You will likely need to make application changes due to differences in syntax between MySQL and CockroachDB. Along with the general considerations in the migration overview, also consider the following MySQL-specific information as you develop your migration plan.
When using the Schema Conversion Tool, MySQL syntax that cannot automatically be converted will be displayed in the Summary Report. These may include the following.
String case sensitivity
Strings are case-insensitive in MySQL and case-sensitive in CockroachDB. You may need to edit your MySQL data to get the results you expect from CockroachDB. For example, you may have been doing string comparisons in MySQL that will need to be changed to work with CockroachDB.
For more information about the case sensitivity of strings in MySQL, see Case Sensitivity in String Searches from the MySQL documentation. For more information about CockroachDB strings, see STRING
.
Identifier case sensitivity
Identifiers are case-sensitive in MySQL and case-insensitive in CockroachDB. When using the Schema Conversion Tool, you can either keep case sensitivity by enclosing identifiers in double quotes, or make identifiers case-insensitive by converting them to lowercase.
AUTO_INCREMENT
attribute
The MySQL AUTO_INCREMENT
attribute, which creates sequential column values, is not supported in CockroachDB. When using the Schema Conversion Tool, columns with AUTO_INCREMENT
can be converted to use sequences, UUID
values with gen_random_uuid()
, or unique INT8
values using unique_rowid()
. Cockroach Labs does not recommend using a sequence to define a primary key column. For more information, see Unique ID best practices.
Changing a column type during schema conversion will cause MOLT Verify to identify a type mismatch during data validation. This is expected behavior.
ENUM
type
MySQL ENUM
types are defined in table columns. On CockroachDB, ENUM
is a standalone type. When using the Schema Conversion Tool, you can either deduplicate the ENUM
definitions or create a separate type for each column.
TINYINT
type
TINYINT
data types are not supported in CockroachDB. The Schema Conversion Tool automatically converts TINYINT
columns to INT2
(SMALLINT
).
Geospatial types
MySQL geometry types are not converted to CockroachDB geospatial types by the Schema Conversion Tool. They should be manually converted to the corresponding types in CockroachDB.
FIELD
function
The MYSQL FIELD
function is not supported in CockroachDB. Instead, you can use the array_position
function, which returns the index of the first occurrence of element in the array.
Example usage:
SELECT array_position(ARRAY[4,1,3,2],1);
array_position
------------------
2
(1 row)
While MySQL returns 0 when the element is not found, CockroachDB returns NULL
. So if you are using the ORDER BY
clause in a statement with the array_position
function, the caveat is that sort is applied even when the element is not found. As a workaround, you can use the COALESCE
operator.
SELECT * FROM table_a ORDER BY COALESCE(array_position(ARRAY[4,1,3,2],5),999);
Load MySQL data
You can use one of the following methods to migrate MySQL data to CockroachDB:
Use
IMPORT INTO
to migrate CSV, TSV, or Avro data stored via userfile or cloud storage into pre-existing tables on CockroachDB. This option achieves the highest throughput, but requires taking the CockroachDB tables offline to achieve its import speed.Tip:For best practices for optimizing import performance in CockroachDB, see Import Performance Best Practices.Use a third-party data migration tool (e.g., AWS DMS, Qlik, Striim) to load the data.
The following example uses IMPORT INTO
to perform the initial data load.
Example: Migrate world
to CockroachDB
The following steps demonstrate converting a schema, performing an initial load of data, and validating data consistency during a migration.
In the context of a full migration, these steps ensure that MySQL data can be properly migrated to CockroachDB and your application queries tested against the cluster. For details, see the Migration Overview.
Before you begin
The example uses the MySQL world
data set and demonstrates how to migrate the schema and data to a Serverless cluster. To follow along with these steps:
Download the
world
data set.Create the
world
database on your MySQL instance, specifying the path of the downloaded file:mysqlsh -uroot --sql --file {path}/world-db/world.sql
Create a free Cloud account, which is used to access the Schema Conversion Tool and create the Serverless cluster.
If you are migrating to a CockroachDB Self-Hosted database, you can export the converted schema from the Schema Conversion Tool and execute the statements in cockroach sql
, or use a third-party schema migration tool such as Alembic, Flyway, or Liquibase.
Step 1. Convert the MySQL schema
Use the Schema Conversion Tool to convert the world
schema for compatibility with CockroachDB. The schema has three tables: city
, country
, and countrylanguage
.
Dump the MySQL
world
schema with the followingmysqldump
command:mysqldump -uroot --no-data world > world_schema.sql
Open the Schema Conversion Tool in the Cloud Console and add a new MySQL schema.
For AUTO_INCREMENT Conversion Option, select the
unique_rowid()
option. This will convert theID
column in thecity
table, which has MySQL typeint
andAUTO_INCREMENT
, to a CockroachDBINT8
type with default values generated byunique_rowid()
. For context on this option, seeAUTO_INCREMENT
attribute.The
UUID
andunique_rowid()
options are each preferred for different use cases. For this example, selecting theunique_rowid()
option makes loading the data more straightforward in a later step, since both the source and target columns will have integer types.Upload
world_schema.sql
to the Schema Conversion Tool.After conversion is complete, review the results. The Summary Report shows that there are no errors. This means that the schema is ready to migrate to CockroachDB.
Tip:You can also add your MySQL database credentials to have the Schema Conversion Tool obtain the schema directly from the MySQL database.This example migrates directly to a Serverless cluster. If you are migrating to a CockroachDB Self-Hosted database, you can export the converted schema from the Schema Conversion Tool and execute the statements in
cockroach sql
, or use a third-party schema migration tool such as Alembic, Flyway, or Liquibase.Before you migrate the converted schema, click the Statements tab to view the Statements list. Scroll down to the
CREATE TABLE countrylanguage
statement and edit the statement to add a collation (COLLATE en_US
) on thelanguage
column:CREATE TABLE countrylanguage ( countrycode VARCHAR(3) DEFAULT '' NOT NULL, language VARCHAR(30) COLLATE en_US DEFAULT '' NOT NULL, isofficial countrylanguage_isofficial_enum DEFAULT 'F' NOT NULL, percentage DECIMAL(4,1) DEFAULT '0.0' NOT NULL, PRIMARY KEY (countrycode, language), INDEX countrycode (countrycode), CONSTRAINT countrylanguage_ibfk_1 FOREIGN KEY (countrycode) REFERENCES country (code) )
Click Save.
This is a workaround to prevent data validation from failing due to collation mismatches. For more details, see the MOLT Verify documentation.
Click Migrate Schema to create a new Serverless cluster with the converted schema. Name the database
world
.You can view this database on the Databases page of the Cloud Console.
Open a SQL shell to the CockroachDB
world
cluster. To find the command, open the Connect dialog in the Cloud Console and select theworld
database and CockroachDB Client option. It will look like:cockroach sql --url "postgresql://{username}@{hostname}:{port}/world?sslmode=verify-full"
For large imports, Cockroach Labs recommends removing indexes prior to loading data and recreating them afterward. This provides increased visibility into the import progress and the ability to retry each step independently.
Show the indexes on the
world
database:SHOW INDEXES FROM DATABASE world;
The
countrycode
foreign key indexes on thecity
andcountrylanguage
tables can be removed for now:table_name | index_name | index_schema | non_unique | seq_in_index | column_name | definition | direction | storing | implicit | visible ---------------------------------+-------------------------------------------------+--------------+------------+--------------+-----------------+-----------------+-----------+---------+----------+---------- ... city | countrycode | public | t | 2 | id | id | ASC | f | t | t city | countrycode | public | t | 1 | countrycode | countrycode | ASC | f | f | t ... countrylanguage | countrycode | public | t | 1 | countrycode | countrycode | ASC | f | f | t countrylanguage | countrycode | public | t | 2 | language | language | ASC | f | t | t ...
Drop the
countrycode
indexes:DROP INDEX city@countrycode;
DROP INDEX countrylanguage@countrycode;
You will recreate the indexes after loading the data.
Step 2. Load the MySQL data
Load the world
data into CockroachDB using IMPORT INTO
with CSV-formatted data. IMPORT INTO
requires that you export one file per table with the following attributes:
- Files must be in valid CSV (comma-separated values) or TSV (tab-separated values) format.
- The delimiter must be a single character. Use the
delimiter
option to set a character other than a comma (such as a tab, for TSV format). - Files must be UTF-8 encoded.
- If one of the following characters appears in a field, the field must be enclosed by double quotes:
- Delimiter (
,
by default). - Double quote (
"
). Because the field will be enclosed by double quotes, escape a double quote inside a field by preceding it with another double quote. For example:"aaa","b""bb","ccc"
. - Newline (
\n
). - Carriage return (
\r
).
- Delimiter (
- If a column is of type
BYTES
, it can either be a valid UTF-8 string or a hex-encoded byte literal beginning with\x
. For example, a field whose value should be the bytes1
,2
would be written as\x0102
.
When MySQL dumps data, the tables are not ordered by foreign key constraints, and foreign keys are not placed in the correct dependency order. It is best to disable foreign key checks when loading data into CockroachDB, and revalidate foreign keys on each table after the data is loaded.
By default, IMPORT INTO
invalidates all foreign key constraints on the target table.
Dump the MySQL
world
data with the followingmysqldump
command:mysqldump -uroot -T /{path}/world-data --fields-terminated-by ',' --fields-enclosed-by '"' --fields-escaped-by '\' --no-create-info world
This dumps each table in your database to the path
/{path}/world-data
as a.txt
file in CSV format.--fields-terminated-by
specifies that values are separated by commas instead of tabs.--fields-enclosed-by
and--fields-escaped-by
specify the characters that enclose and escape column values, respectively.--no-create-info
dumps only the data manipulation language (DML).
Host the files where the CockroachDB cluster can access them.
Each node in the CockroachDB cluster needs to have access to the files being imported. There are several ways for the cluster to access the data; for more information on the types of storage
IMPORT INTO
can pull from, see the following:Cloud storage such as Amazon S3 or Google Cloud is highly recommended for hosting the data files you want to import.
The dump files generated in the preceding step are already hosted on a public S3 bucket created for this example.
Open a SQL shell to the CockroachDB
world
cluster, using the same command as before:cockroach sql --url "postgresql://{username}@{hostname}:{port}/world?sslmode=verify-full"
Use
IMPORT INTO
to import each MySQL dump file into the corresponding table in theworld
database.The following commands point to a public S3 bucket where the
world
data dump files are hosted for this example. Thenullif='\N'
clause specifies that\N
values, which are produced by themysqldump
command, should be read asNULL
.Tip:You can add therow_limit
option to specify the number of rows to import. For example,row_limit = '10'
will import the first 10 rows of the table. This option is useful for finding errors quickly before executing a more time- and resource-consuming import.IMPORT INTO countrylanguage CSV DATA ( 'https://cockroachdb-migration-examples.s3.us-east-1.amazonaws.com/mysql/world/countrylanguage.txt' ) WITH nullif='\N';
job_id | status | fraction_completed | rows | index_entries | bytes ---------------------+-----------+--------------------+------+---------------+--------- 887782070812344321 | succeeded | 1 | 984 | 984 | 171555
IMPORT INTO country CSV DATA ( 'https://cockroachdb-migration-examples.s3.us-east-1.amazonaws.com/mysql/world/country.txt' ) WITH nullif='\N';
job_id | status | fraction_completed | rows | index_entries | bytes ---------------------+-----------+--------------------+------+---------------+-------- 887782114360819713 | succeeded | 1 | 239 | 0 | 33173
IMPORT INTO city CSV DATA ( 'https://cockroachdb-migration-examples.s3.us-east-1.amazonaws.com/mysql/world/city.txt' ) WITH nullif='\N';
job_id | status | fraction_completed | rows | index_entries | bytes ---------------------+-----------+--------------------+------+---------------+--------- 887782154421567489 | succeeded | 1 | 4079 | 4079 | 288140
Note:After converting the schema to work with CockroachDB, theid
column incity
is anINT8
with default values generated byunique_rowid()
. However,unique_rowid()
values are only generated when new rows are inserted without anid
value. The MySQL data dump still includes the sequentialid
values generated by the MySQLAUTO_INCREMENT
attribute, and these are imported with theIMPORT INTO
command.In an actual migration, you can either update the primary key into a multi-column key or add a new primary key column that generates unique IDs.
Recreate the indexes that you deleted before importing the data:
CREATE INDEX countrycode ON city (countrycode, id);
CREATE INDEX countrycode ON countrylanguage (countrycode, language);
Recall that
IMPORT INTO
invalidates all foreign key constraints on the target table. View the constraints that are defined oncity
andcountrylanguage
:SHOW CONSTRAINTS FROM city;
table_name | constraint_name | constraint_type | details | validated -------------+-----------------+-----------------+--------------------------------------------------------------+------------ city | city_ibfk_1 | FOREIGN KEY | FOREIGN KEY (countrycode) REFERENCES country(code) NOT VALID | f city | city_pkey | PRIMARY KEY | PRIMARY KEY (id ASC) | t
SHOW CONSTRAINTS FROM countrylanguage;
table_name | constraint_name | constraint_type | details | validated ------------------+------------------------+-----------------+--------------------------------------------------------------+------------ countrylanguage | countrylanguage_ibfk_1 | FOREIGN KEY | FOREIGN KEY (countrycode) REFERENCES country(code) NOT VALID | f countrylanguage | countrylanguage_pkey | PRIMARY KEY | PRIMARY KEY (countrycode ASC, language ASC) | t
To validate the foreign keys, issue an
ALTER TABLE ... VALIDATE CONSTRAINT
statement for each table:ALTER TABLE city VALIDATE CONSTRAINT city_ibfk_1;
ALTER TABLE countrylanguage VALIDATE CONSTRAINT countrylanguage_ibfk_1;
Step 3. Validate the migrated data
Use MOLT Verify to check that the data on MySQL and CockroachDB are consistent.
In the directory where you installed MOLT Verify, use the following command to compare the two databases, specifying the JDBC connection string for MySQL with
--source
and the SQL connection string for CockroachDB with--target
:Tip:To find the CockroachDB connection string, open the Connect dialog in the Cloud Console and select theworld
database and the General connection string option../molt verify --source 'jdbc:mysql://{user}:{password}@tcp({host}:{port})/world' --target 'postgresql://{user}:{password}@{host}:{port}/world?sslmode=verify-full'
You will see the initial output:
<nil> INF verification in progress
The following warnings indicate that the MySQL and CockroachDB columns have different types. This is an expected result, since some columns were changed to
ENUM
types when you converted the schema:<nil> WRN mismatching table definition mismatch_info="column type mismatch on continent: text vs country_continent_enum" table_name=country table_schema=public <nil> WRN mismatching table definition mismatch_info="column type mismatch on isofficial: text vs countrylanguage_isofficial_enum" table_name=countrylanguage table_schema=public
The following output indicates that MOLT Verify has completed verification:
<nil> INF finished row verification on public.country (shard 1/1): truth rows seen: 239, success: 239, missing: 0, mismatch: 0, extraneous: 0, live_retry: 0 <nil> INF finished row verification on public.countrylanguage (shard 1/1): truth rows seen: 984, success: 984, missing: 0, mismatch: 0, extraneous: 0, live_retry: 0 <nil> INF finished row verification on public.city (shard 1/1): truth rows seen: 4079, success: 4079, missing: 0, mismatch: 0, extraneous: 0, live_retry: 0 <nil> INF verification complete
With the schema migrated and the initial data load verified, the next steps in a real-world migration are to ensure that you have made any necessary application changes, validate application queries, and perform a dry run before conducting the full migration.
To learn more, see the Migration Overview.