Migration Overview

On this page Carat arrow pointing down

This page provides an overview of how to migrate a database to CockroachDB.

Note:

If you need to migrate data from a CockroachDB Serverless cluster to a CockroachDB Dedicated cluster, see Migrate data from Serverless to Dedicated.

A database migration broadly consists of the following phases:

  1. Develop a migration plan: Evaluate your downtime requirements and cutover strategy, size the CockroachDB cluster that you will migrate to, and become familiar with the application changes that you need to make for CockroachDB.
  2. Prepare for migration: Run a pre-mortem (optional), set up metrics (optional), convert your schema, perform an initial load of test data, validate your application queries for correctness and performance, and perform a dry run of the migration.
  3. Conduct the migration: Use a lift-and-shift or "zero-downtime" method to migrate your data, application, and users to CockroachDB.
  4. Complete the migration: Notify the appropriate parties and summarize the details.
Tip:

If you need help migrating to CockroachDB, contact our sales team.

Develop a migration plan

Consider the following as you plan your migration:

  • Who will lead and perform the migration? Which teams are involved, and which aspects are they responsible for?
  • Which internal and external parties do you need to inform about the migration?
  • Which external or third-party tools (e.g., microservices, analytics, payment processors, aggregators, CRMs) must be tested and migrated along with your application?
  • What portion of the data can be inconsistent, and for how long? What is the tolerable percentage of latency and application errors? This comprises your "error budget".
  • What is the tolerable downtime, and what cutover strategy will you use to switch users to CockroachDB?
  • Will you set up a "dry-run" environment to test the migration? How many dry-run migrations will you perform?
  • When is the best time to perform this migration to be minimally disruptive to the database's users?
  • What is your target date for completing the migration?

Create a document that summarizes the intent of the migration, the technical details, and the team members involved.

Approach to downtime

A primary consideration is whether your application can tolerate downtime:

  • What types of operations can you suspend: reads, writes, or both?
  • How long can operations be suspended: seconds, minutes, or hours?
  • Should writes be queued while service is suspended?

Take the following two use cases:

  • An application that is primarily in use during daytime business hours may be able to be taken offline during a predetermined timeframe without disrupting the user experience and business continuity. In this case, your migration can occur in a downtime window.
  • An application that must serve writes continuously cannot tolerate a long downtime window. In this case, you will aim for zero or near-zero downtime.

Downtime window

If your application can tolerate downtime, then it will likely be easiest to take your application offline, load a snapshot of the data into CockroachDB, and perform a cutover to CockroachDB once the data is migrated. This is known as a lift-and-shift migration.

A lift-and-shift approach is the most straightforward. However, it's important to fully prepare the migration in order to be certain that it can be completed successfully during the downtime window.

  • Scheduled downtime is made known to your users in advance. Once you have prepared for the migration, you take the application offline, conduct the migration, and bring the application back online on CockroachDB. To succeed, you should estimate the amount of downtime required to migrate your data, and ideally schedule the downtime outside of peak hours. Scheduling downtime is easiest if your application traffic is "periodic", meaning that it varies by the time of day, day of week, or day of month.

  • Unscheduled downtime impacts as few customers as possible, ideally without impacting their regular usage. If your application is intentionally offline at certain times (e.g., outside business hours), you can migrate the data without users noticing. Alternatively, if your application's functionality is not time-sensitive (e.g., it sends batched messages or emails), then you can queue requests while your system is offline, and process those requests after completing the migration to CockroachDB.

  • Reduced functionality takes some, but not all, application functionality offline. For example, you can disable writes but not reads while you migrate the application data, and queue data to be written after completing the migration.

For an overview of lift-and-shift migrations to CockroachDB, see Lift and Shift. For considerations and details about the pros and cons of this approach, see Migration Strategy: Lift and Shift.

Minimal downtime

If your application cannot tolerate downtime, then you should aim for a "zero-downtime" approach. "Zero" means that downtime is reduced to either an absolute minimum or zero, such that users do not notice the migration.

The minimum possible downtime depends on whether you can tolerate inconsistency in the migrated data:

  • Consistent migrations reduce downtime to an absolute minimum (i.e., from 30 seconds to sub-seconds) while keeping data synchronized between the source database and CockroachDB. Consistency requires downtime. In this approach, downtime occurs right before cutover, as you drain the remaining transactions from the source database to CockroachDB.

  • Inconsistent migrations can reduce downtime to zero. These require the most preparation, and typically allow read/write traffic to both databases for at least a small amount of time, thereby sacrificing consistency for availability. Without stopping application traffic, you perform an immediate cutover, while assuming that some writes will not be replicated to CockroachDB. You may want to manually reconcile these data inconsistencies after switching over.

For an overview of zero-downtime migrations to CockroachDB, see Zero Downtime.

Cutover strategy

Cutover is the process of switching application traffic from the source database to CockroachDB. Consider the following:

  • Will you perform the cutover all at once, or incrementally (e.g., by a subset of users, workloads, or tables)?

    • Switching all at once generally follows a downtime window approach. Once the data is migrated to CockroachDB, you "flip the switch" to route application traffic to the new database, thus ending downtime.
    • Migrations with zero or near-zero downtime can switch either all at once or incrementally, since writes are being synchronously replicated and the system can be gradually migrated as you validate the queries.
  • Will you have a fallback plan that allows you to reverse ("roll back") the migration from CockroachDB to the source database? A fallback plan enables you to fix any issues or inconsistencies that you encounter during or after cutover, then retry the migration.

All at once (no rollback)

This is the simplest cutover method, since you won't need to develop and execute a fallback plan.

As part of migration preparations, you will have already tested your queries and performance to have confidence to migrate without a rollback option. After moving all of the data from the source database to CockroachDB, you switch application traffic to CockroachDB.

All at once (rollback)

This method adds a fallback plan to the simple all-at-once cutover.

In addition to moving data to CockroachDB, data is also replicated from CockroachDB back to the source database in case you need to roll back the migration. Continuous replication is already possible when performing a zero-downtime migration that dual writes to both databases. Otherwise, you will need to ensure that data is replicated in the reverse direction at cutover. The challenge is to find a point at which both the source database and CockroachDB are in sync, so that you can roll back to that point. You should also avoid falling into a circular state where updates continuously travel back and forth between the source database and CockroachDB.

Phased rollout

Also known as the "strangler fig" approach, a phased rollout migrates a portion of your users, workloads, or tables over time. Until all users, workloads, and/or tables are migrated, the application will continue to write to both databases.

This approach enables you to take your time with the migration, and to pause or roll back as you monitor the migration for issues and performance. Rolling back the migration involves the same caveats and considerations as for the all-at-once method. Because you can control the blast radius of your migration by routing traffic for a subset of users or services, a phased rollout has reduced business risk and user impact at the cost of increased implementation risk. You will need to figure out how to migrate in phases while ensuring that your application is unaffected.

Capacity planning

Tip:

If you need help migrating to CockroachDB, contact our sales team.

Determine the size of the target CockroachDB cluster. To do this, consider your data volume and workload characteristics:

  • What is the total size of the data you will migrate?
  • How many active application connections will be running in the CockroachDB environment?

Use this information to size the CockroachDB cluster you will create. If you are migrating to a CockroachDB Cloud cluster, see Plan Your Cluster for details:

  • For CockroachDB Dedicated, refer to the example that shows how your data volume, storage requirements, and replication factor affect the recommended node size (number of vCPUs per node) and total number of nodes on the cluster.
  • For CockroachDB Serverless, your cluster will scale automatically to meet your storage and usage requirements. Refer to Choosing resource limits to learn about how to limit your resource consumption.
  • For guidance on sizing for connection pools, see the CockroachDB Cloud Production Checklist.

If you are migrating to a CockroachDB Self-Hosted cluster:

  • Refer to our sizing methodology to determine the total number of vCPUs on the cluster and the number of vCPUs per node (which determines the number of nodes on the cluster).
  • Refer to our storage recommendations to determine the amount of storage to provision on each node.
  • For guidance on sizing for connection pools, see the CockroachDB Self-Hosted Production Checklist.

Application changes

As you develop your migration plan, consider the application changes that you will need to make. These may relate to the following:

Schema design best practices

Follow these recommendations when converting your schema for compatibility with CockroachDB.

Tip:

The Schema Conversion Tool automatically identifies potential improvements to your schema.

Index creation best practices

Review the best practices for creating secondary indexes on CockroachDB.

We discourage indexing on sequential keys. If a table must be indexed on sequential keys, use hash-sharded indexes. Hash-sharded indexes distribute sequential traffic uniformly across ranges, eliminating single-range hot spots and improving write performance on sequentially-keyed indexes at a small cost to read performance.

Handling transaction contention

Optimize your queries against transaction contention. You may encounter transaction retry errors when you test application queries, as well as transaction contention due to long-running transactions when you conduct the migration and bulk load data.

Transaction retry errors are more frequent under CockroachDB's default SERIALIZABLE isolation level. If you are migrating an application that was built at a READ COMMITTED isolation level, you should first enable READ COMMITTED isolation on the CockroachDB cluster for compatibility.

Unimplemented features and syntax incompatibilities

Update your queries to resolve differences in functionality and SQL syntax.

Tip:

The Schema Conversion Tool automatically flags syntax incompatibilities and unimplemented features in your schema.

CockroachDB supports the PostgreSQL wire protocol and is largely compatible with PostgreSQL syntax. However, the following PostgreSQL features do not yet exist in CockroachDB:

If your source database uses any of the preceding features, you may need to implement workarounds in your schema design, in your data manipulation language (DML), or in your application code.

For more details on the CockroachDB SQL implementation, see SQL Feature Support.

Prepare for migration

Once you have a migration plan, prepare the team, application, source database, and CockroachDB cluster for the migration.

Run a migration "pre-mortem"

Tip:

This step is optional.

To minimize issues after cutover, compose a migration "pre-mortem":

  • Clearly describe the roles and processes of each team member performing the migration.
  • List the likely failure points and issues that you may encounter as you conduct the migration.
  • Rank potential issues by severity, and identify ways to reduce risk.
  • Create a plan for implementing the actions that would most effectively reduce risk.

Set up monitoring and alerting

Tip:

This step is optional.

Based on the error budget you defined in your migration plan, identify the metrics that you can use to measure your success criteria and set up monitoring for the migration. These metrics may be identical to those you normally use in production, but can also be specific to your migration needs.

Update the schema and queries

In the following order:

  1. Convert your schema.
  2. Load test data.
  3. Validate your application queries.

You can use the following MOLT (Migrate Off Legacy Technology) tools to simplify these steps:

Convert the schema

First, convert your database schema to an equivalent CockroachDB schema:

Then import the converted schema to a CockroachDB cluster:

Load test data

Tip:

Before moving data, Cockroach Labs recommends dropping any indexes on the CockroachDB database. The indexes can be recreated after the data is loaded. Doing so will optimize performance.

After converting the schema, load your data into CockroachDB so that you can test your application queries. Then use one of the following methods to migrate the data (you may need to use additional tooling to extract and/or convert the data to an appropriate file format):

  • Use IMPORT INTO to migrate CSV, TSV, or Avro data stored via userfile or cloud storage into pre-existing tables on CockroachDB. This option achieves the highest throughput, but requires taking the tables offline to achieve its import speed. Typically, initial data loading during a database migration will not be running concurrently with application traffic, so the fact that IMPORT INTO takes the table offline may not have any observable availability impact.
  • Use a third-party data migration tool (e.g., AWS DMS, Qlik, Striim) to load the data. Within the tool, you can select the database tables to migrate to the test cluster.
  • When migrating from PostgreSQL, you can use COPY FROM to copy CSV or tab-delimited data to your CockroachDB tables. This option enables your tables to remain online and accessible. However, it is slower than using IMPORT INTO.

Validate queries

After you load the test data, validate your queries on CockroachDB. You can do this by shadowing or by manually testing the queries.

Note that CockroachDB defaults to the SERIALIZABLE transaction isolation level. If you are migrating an application that was built at a READ COMMITTED isolation level on the source database, you must enable READ COMMITTED isolation on the CockroachDB cluster for compatibility.

Shadowing

You can "shadow" your production workload by executing your source SQL statements on CockroachDB in parallel. The CockroachDB Live Migration Service (LMS) can perform shadowing. You can then test the queries on CockroachDB for consistency, performance, and potential issues with the migration.

Shadowing may not be necessary or practical for your workload. For example, because transactions are serialized on CockroachDB, this will limit your ability to validate the performance of high-throughput workloads.

Test query results and performance

You can manually validate your queries by testing a subset of "critical queries" on an otherwise idle CockroachDB cluster:

  • Check the application logs for error messages and the API response time. If application requests are slower than expected, use the SQL Activity page on the CockroachDB Cloud Console or DB Console to find the longest-running queries that are part of that application request. If necessary, tune the queries according to our best practices for SQL performance.

  • Compare the results of the queries and check that they are identical in both the source database and CockroachDB. To do this, you can use MOLT Verify.

Test performance on a CockroachDB cluster that is appropriately sized for your workload:

  1. Run the application with single- or very low-concurrency and verify the app's performance is acceptable. The cluster should be provisioned with more than enough resources to handle this workload, because you need to verify that the queries will be fast enough when there are zero resource bottlenecks.

  2. Run stress tests with at least the production concurrency and rate, but ideally higher in order to verify that the system can handle unexpected spikes in load. This can also uncover contention issues that will appear during spikes in app load, which may require application design changes to avoid.

Perform a dry run

To further minimize potential surprises when you conduct the migration, practice cutover using your application and similar volumes of data on a "dry-run" environment. Use a test or development environment that is as similar as possible to production.

Performing a dry run is highly recommended. In addition to demonstrating how long the migration may take, a dry run also helps to ensure that team members understand what they need to do during the migration, and that changes to the application are coordinated.

Conduct the migration

Before proceeding, double-check that you are prepared to migrate.

Once you are ready to migrate, optionally drop the database and delete the test cluster so that you can get a clean start:

icon/buttons/copy
DROP DATABASE {database-name} CASCADE;

Alternatively, truncate each table you used for testing to avoid having to recreate your schema:

icon/buttons/copy
TRUNCATE {table-name} CASCADE;

Migrate your data to CockroachDB using the method that is appropriate for your downtime requirements and cutover strategy.

Lift and Shift

Using this method, consistency is achieved by only performing the cutover once all writes have been replicated from the source database to CockroachDB. This requires downtime during which the application traffic is stopped.

The following is a high-level overview of the migration steps. For considerations and details about the pros and cons of this approach, see Migration Strategy: Lift and Shift.

  1. Stop application traffic to your source database. This begins downtime.
  2. Move data in one of the following ways:
  3. After the data is migrated, you can use MOLT Verify to validate the consistency of the data between the source database and CockroachDB.
  4. Perform a cutover by resuming application traffic, now to CockroachDB.

Zero Downtime

Using this method, downtime is minimized by performing the cutover while writes are still being replicated from the source database to CockroachDB. Inconsistencies are resolved through manual reconciliation.

The following is a high-level overview of the migration steps.

To prioritize consistency and minimize downtime:

  1. Use a third-party data migration tool (e.g., AWS DMS, Qlik, Striim) to load the data. Select the tool's option to replicate ongoing changes after performing the initial load of data into CockroachDB.
  2. As the data is migrating, you can use MOLT Verify to validate the consistency of the data between the source database and CockroachDB.
  3. Once nearly all data from your source database has been moved to CockroachDB (for example, with a <1 second delay or <1000 rows), stop application traffic to your source database. This begins downtime.
  4. Wait for replication to CockroachDB to complete.
  5. Perform a cutover by resuming application traffic, now to CockroachDB.

To achieve zero downtime with inconsistency:

  1. Use a third-party data migration tool (e.g., AWS DMS, Qlik, Striim) to load the data. Select the tool's option to replicate ongoing changes after performing the initial load of data into CockroachDB.
  2. As the data is migrating, you can use MOLT Verify to validate the consistency of the data between the source database and CockroachDB.
  3. Once nearly all data from your source database has been moved to CockroachDB (for example, with a <1 second delay or <1000 rows), perform a cutover by pointing application traffic to CockroachDB.
  4. Manually reconcile any inconsistencies caused by writes that were not replicated during the cutover.
  5. Close the connection to the source database when you are ready to finish the migration.

Complete the migration

After you have successfully conducted the migration:

  • Notify the teams and other stakeholders impacted by the migration.
  • Retire any test or development environments used to verify the migration.
  • Extend the document you created when developing your migration plan with any issues encountered and follow-up work that needs to be done.

See also


Yes No
On this page

Yes No