Migrate from MySQL

This page describes basic considerations and provides a basic example of migrating data from MySQL to CockroachDB. The information on this page assumes that you have read Migration Overview, which describes the broad phases and considerations of migrating a database to CockroachDB.

The MySQL migration example on this page demonstrates how to use MOLT tooling to update the MySQL schema, perform an initial load of data, and validate the data. These steps are essential when preparing for a full migration.

Tip:

If you need help migrating to CockroachDB, contact our sales team.

Syntax differences

You will likely need to make application changes due to differences in syntax between MySQL and CockroachDB. Along with the general considerations in the migration overview, also consider the following MySQL-specific information as you develop your migration plan.

When using the Schema Conversion Tool, MySQL syntax that cannot be automatically converted is displayed in the Summary Report. Such syntax may include the following:

String case sensitivity

Strings are case-insensitive in MySQL and case-sensitive in CockroachDB. You may need to edit your MySQL data to get the results you expect from CockroachDB. For example, you may have been doing string comparisons in MySQL that will need to be changed to work with CockroachDB.

For more information about the case sensitivity of strings in MySQL, see Case Sensitivity in String Searches from the MySQL documentation. For more information about CockroachDB strings, see STRING.
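To make the behavioral difference concrete outside of SQL, the following is a minimal Python sketch (illustrative only, not part of either database) contrasting the two comparison semantics; the case-insensitive helper only approximates MySQL's default collation, which also applies accent and locale rules:

```python
# Case-sensitive comparison: CockroachDB STRING semantics, where
# two strings match only if they are identical, including case.
def cs_equal(a: str, b: str) -> bool:
    return a == b

# Case-insensitive comparison: approximates MySQL's default
# collation behavior (real collations also handle accents and
# locale-specific rules, which this sketch ignores).
def ci_equal(a: str, b: str) -> bool:
    return a.casefold() == b.casefold()

print(cs_equal("Madrid", "MADRID"))  # False: no match in CockroachDB
print(ci_equal("Madrid", "MADRID"))  # True: match in MySQL
```

In CockroachDB, an equivalent case-insensitive comparison is typically written by normalizing both sides, for example `lower(name) = lower('MADRID')`.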

Identifier case sensitivity

Identifiers are case-sensitive in MySQL and case-insensitive in CockroachDB. When using the Schema Conversion Tool, you can either keep case sensitivity by enclosing identifiers in double quotes, or make identifiers case-insensitive by converting them to lowercase.
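The folding rule can be sketched in a few lines; this is a hypothetical Python helper (not a CockroachDB tool) approximating the PostgreSQL-style identifier handling that CockroachDB follows:

```python
# Hypothetical helper illustrating how CockroachDB (following
# PostgreSQL conventions) treats identifiers: unquoted identifiers
# are folded to lowercase; double-quoted identifiers keep their case.
def fold_identifier(ident: str) -> str:
    if len(ident) >= 2 and ident.startswith('"') and ident.endswith('"'):
        return ident[1:-1]   # quoted: case preserved
    return ident.lower()     # unquoted: folded to lowercase

print(fold_identifier('CountryCode'))    # countrycode
print(fold_identifier('"CountryCode"'))  # CountryCode
```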

AUTO_INCREMENT attribute

The MySQL AUTO_INCREMENT attribute, which creates sequential column values, is not supported in CockroachDB. When using the Schema Conversion Tool, columns with AUTO_INCREMENT can be converted to use sequences, UUID values with gen_random_uuid(), or unique INT8 values using unique_rowid(). Cockroach Labs does not recommend using a sequence to define a primary key column. For more information, see Unique ID best practices.
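To illustrate why the conversion options behave differently, the following Python sketch contrasts sequential IDs with random version-4 UUIDs; gen_random_uuid() produces a v4 UUID, for which Python's uuid.uuid4() is a close analog:

```python
import itertools
import uuid

# Sequential IDs (AUTO_INCREMENT or a sequence): ordered values that
# concentrate inserts on a single range, which is why sequences are
# discouraged for primary keys in a distributed store.
counter = itertools.count(start=1)
sequential_ids = [next(counter) for _ in range(3)]
print(sequential_ids)  # [1, 2, 3]

# gen_random_uuid() returns a random version-4 UUID; Python's
# uuid.uuid4() is an analog, spreading values across the keyspace.
random_id = uuid.uuid4()
print(random_id.version)  # 4
```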

Note:

Changing a column type during schema conversion will cause MOLT Verify to identify a type mismatch during data validation. This is expected behavior.

ENUM type

MySQL ENUM types are defined in table columns. In CockroachDB, ENUM is a standalone type. When using the Schema Conversion Tool, you can either deduplicate the ENUM definitions or create a separate type for each column.

TINYINT type

TINYINT data types are not supported in CockroachDB. The Schema Conversion Tool automatically converts TINYINT columns to INT2 (SMALLINT).
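This conversion is lossless, since every TINYINT value fits within the INT2 range; a quick Python check of the two ranges:

```python
# Signed ranges: MySQL TINYINT is 1 byte, INT2 (SMALLINT) is 2 bytes.
TINYINT_MIN, TINYINT_MAX = -2**7, 2**7 - 1   # -128 .. 127
INT2_MIN, INT2_MAX = -2**15, 2**15 - 1       # -32768 .. 32767

# Every TINYINT value (even unsigned TINYINT's maximum of 255) fits
# in INT2, so the conversion cannot truncate or overflow data.
assert INT2_MIN <= TINYINT_MIN and TINYINT_MAX <= INT2_MAX
assert 255 <= INT2_MAX
print(f"TINYINT {TINYINT_MIN}..{TINYINT_MAX} fits in INT2 {INT2_MIN}..{INT2_MAX}")
```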

Geospatial types

MySQL geometry types are not converted to CockroachDB geospatial types by the Schema Conversion Tool. They should be manually converted to the corresponding types in CockroachDB.

FIELD function

The MySQL FIELD function is not supported in CockroachDB. Instead, you can use the array_position function, which returns the index of the first occurrence of an element in the array.

Example usage:

SELECT array_position(ARRAY[4,1,3,2],1);
  array_position
------------------
               2
(1 row)

While MySQL's FIELD function returns 0 when the element is not found, CockroachDB's array_position returns NULL. If you use array_position in an ORDER BY clause, note that the sort is applied even when the element is not found, because the resulting NULL values still participate in the ordering. As a workaround, you can use the COALESCE operator to substitute a fallback sort position.

SELECT * FROM table_a ORDER BY COALESCE(array_position(ARRAY[4,1,3,2],5),999);
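The semantics of array_position and the COALESCE workaround can be modeled in a few lines of Python (an illustrative sketch, not CockroachDB's implementation):

```python
from typing import Optional, Sequence

# Illustrative model of array_position: the 1-based index of the
# first occurrence of elem, or None (SQL NULL) when it is absent.
def array_position(arr: Sequence[int], elem: int) -> Optional[int]:
    for i, value in enumerate(arr, start=1):
        if value == elem:
            return i
    return None

print(array_position([4, 1, 3, 2], 1))  # 2     (element found)
print(array_position([4, 1, 3, 2], 5))  # None  (FIELD would return 0)

# COALESCE-style fallback: substitute a large sort key for NULL so
# that rows whose element is missing sort last. (Positions are always
# >= 1, so `or` behaves like COALESCE here.)
sort_key = array_position([4, 1, 3, 2], 5) or 999
print(sort_key)  # 999
```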

Load MySQL data

You can use one of several methods to migrate MySQL data to CockroachDB.

The following example uses IMPORT INTO to perform the initial data load.

Example: Migrate world to CockroachDB

The following steps demonstrate converting a schema, performing an initial load of data, and validating data consistency during a migration.

In the context of a full migration, these steps ensure that MySQL data can be properly migrated to CockroachDB and your application queries tested against the cluster. For details, see the Migration Overview.

Before you begin

The example uses the MySQL world data set and demonstrates how to migrate the schema and data to a Serverless cluster. To follow along with these steps:

  1. Download the world data set.

  2. Create the world database on your MySQL instance, specifying the path of the downloaded file:


       mysqlsh -uroot --sql --file {path}/world-db/world.sql
    
  3. Create a free Cloud account, which is used to access the Schema Conversion Tool and create the Serverless cluster.

Tip:

If you are migrating to a CockroachDB Self-Hosted database, you can export the converted schema from the Schema Conversion Tool and execute the statements in cockroach sql, or use a third-party schema migration tool such as Alembic, Flyway, or Liquibase.

Step 1. Convert the MySQL schema

Use the Schema Conversion Tool to convert the world schema for compatibility with CockroachDB. The schema has three tables: city, country, and countrylanguage.

  1. Dump the MySQL world schema with the following mysqldump command:


       mysqldump -uroot --no-data world > world_schema.sql
    
  2. Open the Schema Conversion Tool in the Cloud Console and add a new MySQL schema.

    For AUTO_INCREMENT Conversion Option, select the unique_rowid() option. This will convert the ID column in the city table, which has MySQL type int and AUTO_INCREMENT, to a CockroachDB INT8 type with default values generated by unique_rowid(). For context on this option, see AUTO_INCREMENT attribute.

    The UUID and unique_rowid() options are each preferred for different use cases. For this example, selecting the unique_rowid() option makes loading the data more straightforward in a later step, since both the source and target columns will have integer types.

  3. Upload world_schema.sql to the Schema Conversion Tool.

    After conversion is complete, review the results. The Summary Report shows that there are no errors. This means that the schema is ready to migrate to CockroachDB.

    Tip:
    You can also add your MySQL database credentials to have the Schema Conversion Tool obtain the schema directly from the MySQL database.

    This example migrates directly to a Serverless cluster.

  4. Before you migrate the converted schema, click the Statements tab to view the Statements list. Scroll down to the CREATE TABLE countrylanguage statement and edit the statement to add a collation (COLLATE en_US) on the language column:


       CREATE TABLE countrylanguage (
              countrycode VARCHAR(3) DEFAULT '' NOT NULL,
              language VARCHAR(30) COLLATE en_US DEFAULT '' NOT NULL,
              isofficial countrylanguage_isofficial_enum
              DEFAULT 'F'
              NOT NULL,
              percentage DECIMAL(4,1) DEFAULT '0.0' NOT NULL,
              PRIMARY KEY (countrycode, language),
              INDEX countrycode (countrycode),
              CONSTRAINT countrylanguage_ibfk_1
                     FOREIGN KEY (countrycode) REFERENCES country (code)
       )
    

    Click Save.

    This is a workaround to prevent data validation from failing due to collation mismatches. For more details, see the MOLT Verify documentation.

  5. Click Migrate Schema to create a new Serverless cluster with the converted schema. Name the database world.

    You can view this database on the Databases page of the Cloud Console.

  6. Open a SQL shell to the CockroachDB world cluster. To find the command, open the Connect dialog in the Cloud Console and select the world database and CockroachDB Client option. It will look like:


       cockroach sql --url "postgresql://{username}@{hostname}:{port}/world?sslmode=verify-full" 
    
  7. For large imports, Cockroach Labs recommends removing indexes prior to loading data and recreating them afterward. This provides increased visibility into the import progress and the ability to retry each step independently.

    Show the indexes on the world database:


       SHOW INDEXES FROM DATABASE world;
    

    The countrycode foreign key indexes on the city and countrylanguage tables can be removed for now:

                   table_name           |                   index_name                    | index_schema | non_unique | seq_in_index |   column_name   |   definition    | direction | storing | implicit | visible
       ---------------------------------+-------------------------------------------------+--------------+------------+--------------+-----------------+-----------------+-----------+---------+----------+----------
       ...
         city                           | countrycode                                     | public       |     t      |            2 | id              | id              | ASC       |    f    |    t     |    t
         city                           | countrycode                                     | public       |     t      |            1 | countrycode     | countrycode     | ASC       |    f    |    f     |    t
       ...
         countrylanguage                | countrycode                                     | public       |     t      |            1 | countrycode     | countrycode     | ASC       |    f    |    f     |    t
         countrylanguage                | countrycode                                     | public       |     t      |            2 | language        | language        | ASC       |    f    |    t     |    t
       ...
    
  8. Drop the countrycode indexes:


       DROP INDEX city@countrycode;
    


       DROP INDEX countrylanguage@countrycode;
    

    You will recreate the indexes after loading the data.

Step 2. Load the MySQL data

Load the world data into CockroachDB using IMPORT INTO with CSV-formatted data. IMPORT INTO requires that you export one file per table with the following attributes:

  • Files must be in valid CSV (comma-separated values) or TSV (tab-separated values) format.
  • The delimiter must be a single character. Use the delimiter option to set a character other than a comma (such as a tab, for TSV format).
  • Files must be UTF-8 encoded.
  • If one of the following characters appears in a field, the field must be enclosed by double quotes:
    • Delimiter (, by default).
    • Double quote ("). Because the field will be enclosed by double quotes, escape a double quote inside a field by preceding it with another double quote. For example: "aaa","b""bb","ccc".
    • Newline (\n).
    • Carriage return (\r).
  • If a column is of type BYTES, it can either be a valid UTF-8 string or a hex-encoded byte literal beginning with \x. For example, a field whose value should be the bytes 1, 2 would be written as \x0102.
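If you generate or inspect such files with Python, the standard-library csv module can produce the quoting and escaping described above; the final line shows the hex-encoding rule for BYTES (this sketch assumes the default comma delimiter):

```python
import csv
import io

# Write one row whose fields exercise the rules above: an embedded
# double quote is escaped by doubling it, and every field is enclosed
# in double quotes (as mysqldump does with --fields-enclosed-by '"').
buf = io.StringIO()
writer = csv.writer(buf, delimiter=",", quotechar='"',
                    doublequote=True, quoting=csv.QUOTE_ALL)
writer.writerow(["aaa", 'b"bb', "c,cc"])
print(buf.getvalue().strip())  # "aaa","b""bb","c,cc"

# A BYTES value holding the bytes 1, 2 as a hex-encoded literal:
print("\\x" + bytes([1, 2]).hex())  # \x0102
```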
Note:

When MySQL dumps data, the tables are not ordered by foreign key constraints, and foreign keys are not placed in the correct dependency order. It is best to disable foreign key checks when loading data into CockroachDB, and revalidate foreign keys on each table after the data is loaded.

By default, IMPORT INTO invalidates all foreign key constraints on the target table.
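If you instead choose to load with foreign key checks enabled, one way to derive a safe load order is a topological sort of the tables' foreign-key dependencies; the following Python sketch uses the world schema's dependencies (graphlib requires Python 3.9+):

```python
from graphlib import TopologicalSorter

# Foreign-key dependencies in the world schema: city and
# countrylanguage both reference country(code), so country must be
# loaded first if foreign key checks are left enabled.
fk_deps = {
    "city": {"country"},
    "countrylanguage": {"country"},
    "country": set(),
}

load_order = list(TopologicalSorter(fk_deps).static_order())
print(load_order)  # country comes first
```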

  1. Dump the MySQL world data with the following mysqldump command:


       mysqldump -uroot -T /{path}/world-data --fields-terminated-by ',' --fields-enclosed-by '"' --fields-escaped-by '\' --no-create-info world
    

    This dumps each table in your database to the path /{path}/world-data as a .txt file in CSV format.

    • --fields-terminated-by specifies that values are separated by commas instead of tabs.
    • --fields-enclosed-by and --fields-escaped-by specify the characters that enclose and escape column values, respectively.
    • --no-create-info dumps only the data manipulation language (DML).
  2. Host the files where the CockroachDB cluster can access them.

    Each node in the CockroachDB cluster needs to have access to the files being imported. There are several ways for the cluster to access the data; cloud storage such as Amazon S3 or Google Cloud Storage is highly recommended for hosting the data files you want to import.

    The dump files generated in the preceding step are already hosted on a public S3 bucket created for this example.

  3. Open a SQL shell to the CockroachDB world cluster, using the same command as before:


       cockroach sql --url "postgresql://{username}@{hostname}:{port}/world?sslmode=verify-full" 
    
  4. Use IMPORT INTO to import each MySQL dump file into the corresponding table in the world database.

    The following commands point to a public S3 bucket where the world data dump files are hosted for this example. The nullif='\N' clause specifies that \N values, which are produced by the mysqldump command, should be read as NULL.

    Tip:
    You can add the row_limit option to specify the number of rows to import. For example, row_limit = '10' will import the first 10 rows of the table. This option is useful for finding errors quickly before executing a more time- and resource-consuming import.

       IMPORT INTO countrylanguage
         CSV DATA (
           'https://cockroachdb-migration-examples.s3.us-east-1.amazonaws.com/mysql/world/countrylanguage.txt'
         )
         WITH
           nullif='\N';
    
               job_id       |  status   | fraction_completed | rows | index_entries | bytes
       ---------------------+-----------+--------------------+------+---------------+---------
         887782070812344321 | succeeded |                  1 |  984 |           984 | 171555
    
       IMPORT INTO country
         CSV DATA (
           'https://cockroachdb-migration-examples.s3.us-east-1.amazonaws.com/mysql/world/country.txt'
         )
         WITH
           nullif='\N';
    
               job_id       |  status   | fraction_completed | rows | index_entries | bytes
       ---------------------+-----------+--------------------+------+---------------+--------
         887782114360819713 | succeeded |                  1 |  239 |             0 | 33173
    
       IMPORT INTO city
         CSV DATA (
           'https://cockroachdb-migration-examples.s3.us-east-1.amazonaws.com/mysql/world/city.txt'
         )
         WITH
           nullif='\N';
    
               job_id       |  status   | fraction_completed | rows | index_entries | bytes
       ---------------------+-----------+--------------------+------+---------------+---------
         887782154421567489 | succeeded |                  1 | 4079 |          4079 | 288140
    

    Note:
    After converting the schema to work with CockroachDB, the id column in city is an INT8 with default values generated by unique_rowid(). However, unique_rowid() values are only generated when new rows are inserted without an id value. The MySQL data dump still includes the sequential id values generated by the MySQL AUTO_INCREMENT attribute, and these are imported with the IMPORT INTO command.

    In an actual migration, you can either convert the primary key into a multi-column key or add a new primary key column that generates unique IDs.

  5. Recreate the indexes that you deleted before importing the data:


       CREATE INDEX countrycode ON city (countrycode, id);
    


       CREATE INDEX countrycode ON countrylanguage (countrycode, language);
    
  6. Recall that IMPORT INTO invalidates all foreign key constraints on the target table. View the constraints that are defined on city and countrylanguage:


       SHOW CONSTRAINTS FROM city;
    
         table_name | constraint_name | constraint_type |                           details                            | validated
       -------------+-----------------+-----------------+--------------------------------------------------------------+------------
         city       | city_ibfk_1     | FOREIGN KEY     | FOREIGN KEY (countrycode) REFERENCES country(code) NOT VALID |     f
         city       | city_pkey       | PRIMARY KEY     | PRIMARY KEY (id ASC)                                         |     t
    


       SHOW CONSTRAINTS FROM countrylanguage;
    
           table_name    |    constraint_name     | constraint_type |                           details                            | validated
       ------------------+------------------------+-----------------+--------------------------------------------------------------+------------
         countrylanguage | countrylanguage_ibfk_1 | FOREIGN KEY     | FOREIGN KEY (countrycode) REFERENCES country(code) NOT VALID |     f
         countrylanguage | countrylanguage_pkey   | PRIMARY KEY     | PRIMARY KEY (countrycode ASC, language ASC)                  |     t
    
  7. To validate the foreign keys, issue an ALTER TABLE ... VALIDATE CONSTRAINT statement for each table:


       ALTER TABLE city VALIDATE CONSTRAINT city_ibfk_1;
    


       ALTER TABLE countrylanguage VALIDATE CONSTRAINT countrylanguage_ibfk_1;
    

Step 3. Validate the migrated data

Use MOLT Verify to check that the data on MySQL and CockroachDB are consistent.

  1. Install MOLT Verify.

  2. In the directory where you installed MOLT Verify, use the following command to compare the two databases, specifying the JDBC connection string for MySQL with --source and the SQL connection string for CockroachDB with --target:

    Tip:
    To find the CockroachDB connection string, open the Connect dialog in the Cloud Console and select the world database and the General connection string option.


       ./molt verify --source 'jdbc:mysql://{user}:{password}@tcp({host}:{port})/world' --target 'postgresql://{user}:{password}@{host}:{port}/world?sslmode=verify-full'
    

    You will see the initial output:

       <nil> INF verification in progress
    

    The following warnings indicate that the MySQL and CockroachDB columns have different types. This is an expected result, since some columns were changed to ENUM types when you converted the schema:

       <nil> WRN mismatching table definition mismatch_info="column type mismatch on continent: text vs country_continent_enum" table_name=country table_schema=public
       <nil> WRN mismatching table definition mismatch_info="column type mismatch on isofficial: text vs countrylanguage_isofficial_enum" table_name=countrylanguage table_schema=public
    

    The following output indicates that MOLT Verify has completed verification:

       <nil> INF finished row verification on public.country (shard 1/1): truth rows seen: 239, success: 239, missing: 0, mismatch: 0, extraneous: 0, live_retry: 0
       <nil> INF finished row verification on public.countrylanguage (shard 1/1): truth rows seen: 984, success: 984, missing: 0, mismatch: 0, extraneous: 0, live_retry: 0
       <nil> INF finished row verification on public.city (shard 1/1): truth rows seen: 4079, success: 4079, missing: 0, mismatch: 0, extraneous: 0, live_retry: 0
       <nil> INF verification complete
    

With the schema migrated and the initial data load verified, the next steps in a real-world migration are to ensure that you have made any necessary application changes, validate application queries, and perform a dry run before conducting the full migration.

To learn more, see the Migration Overview.
