Optimize Statement Performance

On this page Carat arrow pointing down
Warning:
CockroachDB v21.1 is no longer supported. For more details, see the Release Support Policy.

This page provides an overview for optimizing statement performance in CockroachDB. To get good performance, you need to look at how you're accessing the database through several lenses:

  • SQL statement performance: This is the most common cause of performance problems, and where most developers should start.
  • Schema design: Depending on your SQL schema and the data access patterns of your workload, you may need to make changes to avoid creating "hotspots".
  • Cluster topology: As a distributed system, CockroachDB requires you to trade off latency vs. resiliency. This requires choosing the right cluster topology for your needs.
Note:

If you aren't sure whether SQL statement performance needs to be improved on your cluster, see Identify slow statements.

SQL statement performance

To get good SQL statement performance, follow the rules below (in approximate order of importance):

Note:

These rules apply to an environment where thousands of OLTP statements are being run per second, and each statement needs to run in milliseconds. These rules are not intended to apply to analytical, or OLAP, statements.

  • Rule 1. Scan as few rows as possible. If your application is scanning more rows than necessary for a given statement, it's going to be difficult to scale.
  • Rule 2. Use the right index: Your statement should use an index on the columns in the WHERE clause. You want to avoid the performance hit of a full table scan.
  • Rule 3. Use the right join type: Depending on the relative sizes of the tables you are querying, the type of join may be important. This should rarely be necessary because the cost-based optimizer should pick the best-performing join type if you add the right indexes as described in Rule 2.
Note:

To identify poorly performing statements, use the DB Console and slow query log.

To show each of these rules in action, we will optimize a statement against the MovR data set as follows:

Start the MovR database on a 3-node CockroachDB demo cluster with a larger data set.

icon/buttons/copy
cockroach demo movr --num-histories 250000 --num-promo-codes 250000 --num-rides 125000 --num-users 12500 --num-vehicles 3750 --nodes 3

It's common to offer users promo codes to increase usage and customer loyalty. In this scenario, we want to find the 10 users who have taken the highest number of rides on a given date, and offer them promo codes that provide a 10% discount.

To phrase it in the form of a question: "Who are the top 10 users by number of rides on a given date?"

Rule 1. Scan as few rows as possible

First, let's study the schema so we understand the relationships between the tables.

Start a SQL shell:

icon/buttons/copy
cockroach sql --insecure

Next, set movr as the current database and run SHOW TABLES:

icon/buttons/copy
USE movr;
SHOW TABLES;
  schema_name |         table_name         | type  | estimated_row_count
--------------+----------------------------+-------+----------------------
  public      | promo_codes                | table |              250000
  public      | rides                      | table |              125000
  public      | user_promo_codes           | table |                   0
  public      | users                      | table |               12500
  public      | vehicle_location_histories | table |              250000
  public      | vehicles                   | table |                3750
(6 rows)

Time: 17ms total (execution 17ms / network 0ms)

Let's look at the schema for the users table:

icon/buttons/copy
SHOW CREATE TABLE users;
  table_name |                      create_statement
-------------+--------------------------------------------------------------
  users      | CREATE TABLE public.users (
             |     id UUID NOT NULL,
             |     city VARCHAR NOT NULL,
             |     name VARCHAR NULL,
             |     address VARCHAR NULL,
             |     credit_card VARCHAR NULL,
             |     CONSTRAINT "primary" PRIMARY KEY (city ASC, id ASC),
             |     FAMILY "primary" (id, city, name, address, credit_card)
             | )
(1 row)

Time: 9ms total (execution 9ms / network 0ms)

There's no information about the number of rides taken here, nor anything about the days on which rides occurred. Luckily, there is also a rides table. Let's look at it:

icon/buttons/copy
SHOW CREATE TABLE rides;
  table_name |                                                        create_statement
-------------+----------------------------------------------------------------------------------------------------------------------------------
  rides      | CREATE TABLE public.rides (
             |     id UUID NOT NULL,
             |     city VARCHAR NOT NULL,
             |     vehicle_city VARCHAR NULL,
             |     rider_id UUID NULL,
             |     vehicle_id UUID NULL,
             |     start_address VARCHAR NULL,
             |     end_address VARCHAR NULL,
             |     start_time TIMESTAMP NULL,
             |     end_time TIMESTAMP NULL,
             |     revenue DECIMAL(10,2) NULL,
             |     CONSTRAINT "primary" PRIMARY KEY (city ASC, id ASC),
             |     CONSTRAINT fk_city_ref_users FOREIGN KEY (city, rider_id) REFERENCES public.users(city, id),
             |     CONSTRAINT fk_vehicle_city_ref_vehicles FOREIGN KEY (vehicle_city, vehicle_id) REFERENCES public.vehicles(city, id),
             |     INDEX rides_auto_index_fk_city_ref_users (city ASC, rider_id ASC),
             |     INDEX rides_auto_index_fk_vehicle_city_ref_vehicles (vehicle_city ASC, vehicle_id ASC),
             |     FAMILY "primary" (id, city, vehicle_city, rider_id, vehicle_id, start_address, end_address, start_time, end_time, revenue),
             |     CONSTRAINT check_vehicle_city_city CHECK (vehicle_city = city)
             | )
(1 row)

Time: 9ms total (execution 8ms / network 1ms)

There is a rider_id field that we can use to match each ride to a user. There is also a start_time field that we can use to filter the rides by date.

This means that to get the information we want, we'll need to do a join on the users and rides tables.

Next, let's get the row counts for the tables that we'll be using in this query. We need to understand which tables are large, and which are small by comparison. We will need this later if we need to verify we are using the right join type.

As specified above by our cockroach workload command, the users table has 12,500 records, and the rides table has 125,000 records. Because it's so large, we want to avoid scanning the entire rides table in our query. In this case, we can avoid scanning rides using an index, as shown in the next section.

Rule 2. Use the right index

Below is a query that fetches the right answer to our question: "Who are the top 10 users by number of rides on a given date?"

icon/buttons/copy
SELECT
    name, count(rides.id) AS sum
FROM
    users JOIN rides ON users.id = rides.rider_id
WHERE
    rides.start_time BETWEEN '2018-12-31 00:00:00' AND '2020-01-01 00:00:00'
GROUP BY
    name
ORDER BY
    sum DESC
LIMIT
    10;
        name       | sum
-------------------+------
  William Brown    |  14
  William Mitchell |  10
  Joseph Smith     |  10
  Paul Nelson      |   9
  Christina Smith  |   9
  Jennifer Johnson |   8
  Jeffrey Walker   |   8
  Joseph Jones     |   7
  James Williams   |   7
  Thomas Smith     |   7
(10 rows)

Time: 111ms total (execution 111ms / network 0ms)

Unfortunately, this query is a bit slow. 111 milliseconds puts us over the limit where a user feels the system is reacting instantaneously, and we're still down in the database layer. This data still needs to be shipped back out to your application and displayed to the user.

We can see why if we look at the output of EXPLAIN:

icon/buttons/copy
EXPLAIN SELECT
    name, count(rides.id) AS sum
FROM
    users JOIN rides ON users.id = rides.rider_id
WHERE
    rides.start_time BETWEEN '2018-12-31 00:00:00' AND '2019-01-01 00:00:00'
GROUP BY
    name
ORDER BY
    sum DESC
LIMIT
    10;
                                                    info
-------------------------------------------------------------------------------------------------------------
  distribution: full
  vectorized: true

  • limit
  │ estimated row count: 10
  │ count: 10
  │
  └── • sort
      │ estimated row count: 3,392
      │ order: -count_rows
      │
      └── • group
          │ estimated row count: 3,392
          │ group by: name
          │
          └── • hash join
              │ estimated row count: 4,013
              │ equality: (id) = (rider_id)
              │
              ├── • scan
              │     estimated row count: 12,500 (100% of the table; stats collected 4 minutes ago)
              │     table: users@primary
              │     spans: FULL SCAN
              │
              └── • filter
                  │ estimated row count: 4,013
                  │ filter: (start_time >= '2018-12-31 00:00:00') AND (start_time <= '2019-01-01 00:00:00')
                  │
                  └── • scan
                        estimated row count: 125,000 (100% of the table; stats collected 3 minutes ago)
                        table: rides@primary
                        spans: FULL SCAN
(32 rows)

Time: 2ms total (execution 2ms / network 0ms)

The main problem is that we are doing full table scans on both the users and rides tables (see spans: FULL SCAN). This tells us that we do not have indexes on the columns in our WHERE clause, which is an indexing best practice.

Therefore, we need to create an index on the column in our WHERE clause, in this case: rides.start_time.

It's also possible that there is not an index on the rider_id column that we are doing a join against, which will also hurt performance.

Before creating any more indexes, let's see what indexes already exist on the rides table by running SHOW INDEXES:

icon/buttons/copy
SHOW INDEXES FROM rides;
  table_name |                  index_name                   | non_unique | seq_in_index | column_name  | direction | storing | implicit
-------------+-----------------------------------------------+------------+--------------+--------------+-----------+---------+-----------
  rides      | primary                                       |   false    |            1 | city         | ASC       |  false  |  false
  rides      | primary                                       |   false    |            2 | id           | ASC       |  false  |  false
  rides      | rides_auto_index_fk_city_ref_users            |    true    |            1 | city         | ASC       |  false  |  false
  rides      | rides_auto_index_fk_city_ref_users            |    true    |            2 | rider_id     | ASC       |  false  |  false
  rides      | rides_auto_index_fk_city_ref_users            |    true    |            3 | id           | ASC       |  false  |   true
  rides      | rides_auto_index_fk_vehicle_city_ref_vehicles |    true    |            1 | vehicle_city | ASC       |  false  |  false
  rides      | rides_auto_index_fk_vehicle_city_ref_vehicles |    true    |            2 | vehicle_id   | ASC       |  false  |  false
  rides      | rides_auto_index_fk_vehicle_city_ref_vehicles |    true    |            3 | id           | ASC       |  false  |   true
  rides      | rides_auto_index_fk_vehicle_city_ref_vehicles |    true    |            4 | city         | ASC       |  false  |   true
(9 rows)

Time: 5ms total (execution 5ms / network 0ms)

As we suspected, there are no indexes on start_time or rider_id, so we'll need to create indexes on those columns.

Because another performance best practice is to create an index on the WHERE condition storing the join key, we will create an index on start_time that stores the join key rider_id:

icon/buttons/copy
CREATE INDEX ON rides (start_time) storing (rider_id);

Now that we have an index on the column in our WHERE clause that stores the join key, let's run the query again:

icon/buttons/copy
SELECT
    name, count(rides.id) AS sum
FROM
    users JOIN rides ON users.id = rides.rider_id
WHERE
    rides.start_time BETWEEN '2018-12-31 00:00:00' AND '2019-01-01 00:00:00'
GROUP BY
    name
ORDER BY
    sum DESC
LIMIT
    10;
        name       | sum
-------------------+------
  William Brown    |   6
  Laura Marsh      |   5
  Joseph Smith     |   5
  David Martinez   |   4
  Michael Garcia   |   4
  David Mitchell   |   4
  Arthur Nielsen   |   4
  Michael Bradford |   4
  William Mitchell |   4
  Jennifer Johnson |   4
(10 rows)

Time: 22ms total (execution 22ms / network 0ms)

This query is now running much faster than it was before we added the indexes (111ms vs. 22ms). This means we have an extra 89 milliseconds we can budget towards other areas of our application.

To see what changed, let's look at the EXPLAIN output:

icon/buttons/copy
EXPLAIN SELECT
    name, count(rides.id) AS sum
FROM
    users JOIN rides ON users.id = rides.rider_id
WHERE
    rides.start_time BETWEEN '2018-12-31 00:00:00' AND '2019-01-01 00:00:00'
GROUP BY
    name
ORDER BY
    sum DESC
LIMIT
    10;

As you can see, this query is no longer scanning the entire (larger) rides table. Instead, it is now doing a much smaller range scan against only the values in rides that match the index we just created on the start_time column (4,013 rows instead of 125,000).

                                                info
----------------------------------------------------------------------------------------------------
  distribution: full
  vectorized: true

  • limit
  │ estimated row count: 10
  │ count: 10
  │
  └── • sort
      │ estimated row count: 3,392
      │ order: -count_rows
      │
      └── • group
          │ estimated row count: 3,392
          │ group by: name
          │
          └── • hash join
              │ estimated row count: 4,013
              │ equality: (id) = (rider_id)
              │
              ├── • scan
              │     estimated row count: 12,500 (100% of the table; stats collected 8 minutes ago)
              │     table: users@primary
              │     spans: FULL SCAN
              │
              └── • scan
                    estimated row count: 4,013 (3.2% of the table; stats collected 7 minutes ago)
                    table: rides@rides_start_time_idx
                    spans: [/'2018-12-31 00:00:00' - /'2019-01-01 00:00:00']
(28 rows)

Time: 2ms total (execution 1ms / network 0ms)

Rule 3. Use the right join type

Out of the box, the cost-based optimizer will select the right join type for your statement in the majority of cases. Therefore, you should only provide join hints in your query if you can prove to yourself through experimentation that the optimizer should be using a different join type than it is selecting.

We can confirm that in this case the optimizer has already found the right join type for this statement by using a hint to force another join type.

For example, we might think that a lookup join could perform better in this instance, since one of the tables in the join is 10x smaller than the other.

In order to get CockroachDB to plan a lookup join in this case, we will need to add an explicit index on the join key for the right-hand-side table, in this case, rides.

icon/buttons/copy
CREATE INDEX ON rides (rider_id);

Next, we can specify the lookup join with a join hint:

icon/buttons/copy
SELECT
    name, count(rides.id) AS sum
FROM
    users INNER LOOKUP JOIN rides ON users.id = rides.rider_id
WHERE
    (rides.start_time BETWEEN '2018-12-31 00:00:00' AND '2019-01-01 00:00:00')
GROUP BY
    name
ORDER BY
    sum DESC
LIMIT
    10;
        name       | sum
-------------------+------
  William Brown    |   6
  Laura Marsh      |   5
  Joseph Smith     |   5
  Michael Garcia   |   4
  David Mitchell   |   4
  David Martinez   |   4
  Arthur Nielsen   |   4
  Jennifer Johnson |   4
  William Mitchell |   4
  Michael Bradford |   4
(10 rows)

Time: 1.548s total (execution 1.548s / network 0.000s)

The results, however, are not good. The query is much slower using a lookup join than what CockroachDB planned for us earlier.

The query is a little faster when we force CockroachDB to use a merge join:

icon/buttons/copy
SELECT
    name, count(rides.id) AS sum
FROM
    users INNER MERGE JOIN rides ON users.id = rides.rider_id
WHERE
    (rides.start_time BETWEEN '2018-12-31 00:00:00' AND '2019-01-01 00:00:00')
GROUP BY
    name
ORDER BY
    sum DESC
LIMIT
    10;
        name        | sum
+-------------------+-----+
  William Brown     |   6
  Laura Marsh       |   5
  Joseph Smith      |   5
  Jennifer Ford     |   4
  David Mitchell    |   4
  William Mitchell  |   4
  Christopher Allen |   4
  Michael Bradford  |   4
  Michael Garcia    |   4
  Jennifer Johnson  |   4
(10 rows)

Time: 22ms total (execution 22ms / network 0ms)

The results are consistently about 20-26ms with a merge join vs. 16-23ms when we let CockroachDB choose the join type as shown in the previous section. In other words, forcing the merge join is slightly slower than if we had done nothing.

Schema design

If you are following the instructions in the SQL performance section and still not getting the performance you want, you may need to look at your schema design and data access patterns to make sure you are not creating "hotspots" in your cluster that will lead to performance problems due to transaction contention.

You can avoid contention with the following strategies:

  • Use index keys with a more random distribution of values, as described in Unique ID best practices.
  • Make transactions smaller by operating on less data per transaction. This will offer fewer opportunities for transactions' data access to overlap.
  • Split the table across multiple ranges to distribute its data across multiple nodes for better load balancing of some write-heavy workloads.

For more information about how to avoid performance problems caused by contention, see Understanding and Avoiding Transaction Contention.

Cluster topology

It's very important to make sure that the cluster topology you are using is the right one for your use case. Because CockroachDB is a distributed system that involves nodes communicating over the network, you need to choose the cluster topology that results in the right latency vs. resiliency tradeoff.

For more information about how to choose the cluster topology that is right for your application, see this list of topology patterns.

See also

Reference information:

Specific tasks:


Yes No
On this page

Yes No