Indexes

Indexes improve your database’s performance by helping SQL locate data without having to look through every row of a table.

How do indexes work?

When you create an index, CockroachDB “indexes” the columns you specify: it creates a copy of those columns and sorts their values, without sorting the values in the table itself.

After a column is indexed, SQL can filter its values using the index instead of scanning each row one by one. On large tables, this greatly reduces the number of rows SQL has to read, which can make queries dramatically faster.

For example, if you index an INT column and then filter it WHERE <indexed column> = 10, SQL can use the index to find values starting at 10 but less than 11. In contrast, without an index, SQL would have to evaluate every row in the table for values equaling 10. This is also known as a “full table scan”, and it can be very bad for query performance.

You can also create an index on a subset of rows. This type of index is called a partial index. For more information, see Partial Indexes.

To index spatial data, CockroachDB uses spatial indexes. For more information, see Spatial Indexes.
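As a minimal sketch of creating an index and filtering on the indexed column, consider the statements below. The table name orders, its columns, and the index names are hypothetical and chosen only for illustration. The partial index at the end assumes a workload that mostly looks up open orders by customer.

```sql
-- Hypothetical table used only for illustration.
CREATE TABLE orders (
    id INT PRIMARY KEY,
    customer_id INT,
    status STRING(20)
);

-- Index customer_id so that filters on it can avoid a full table scan.
CREATE INDEX orders_customer_id_idx ON orders (customer_id);

-- With the index in place, this filter can be answered from the index.
SELECT id FROM orders WHERE customer_id = 10;

-- A partial index covers only the rows that match its WHERE clause.
CREATE INDEX orders_open_idx ON orders (customer_id) WHERE status = 'open';
```

Running EXPLAIN on the SELECT statement before and after creating the index is one way to check whether the plan changes from a full scan to an index scan.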

Creation

Each table automatically has a primary index called {tbl}_pkey, which indexes either its primary key or, if there is no primary key, a unique value for each row known as rowid. We recommend always defining a primary key because the index it creates provides much better performance than letting CockroachDB use rowid. To require an explicitly defined primary key for all tables created in your cluster, set the sql.defaults.require_explicit_primary_keys.enabled cluster setting to true.

Use ALTER ROLE ALL SET {sessionvar} = {val} instead of the sql.defaults.* cluster settings. This allows you to set a default value for all users for any session variable that applies during login, making the sql.defaults.* cluster settings redundant.
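The following sketch shows both ways of requiring explicit primary keys, followed by a table that defines one. It assumes the session variable that corresponds to the cluster setting is named require_explicit_primary_keys. The table users and its columns are hypothetical.

```sql
-- Require an explicit primary key for every new table in the cluster.
SET CLUSTER SETTING sql.defaults.require_explicit_primary_keys.enabled = true;

-- Alternative: set the default through a session variable for all roles.
-- Assumes the session variable is named require_explicit_primary_keys.
ALTER ROLE ALL SET require_explicit_primary_keys = true;

-- A table with an explicitly defined primary key (hypothetical names).
CREATE TABLE users (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    name STRING(100)
);
```

Defaults set with ALTER ROLE apply to sessions opened after the change, so an existing session may need to reconnect to pick them up.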

The primary index helps filter a table’s primary key but doesn’t help SQL find values in any other columns. However, you can use secondary indexes to improve the performance of queries using columns not in a table’s primary key. You can create them:

At the same time as the table with the INDEX clause of CREATE TABLE. In addition to explicitly defined indexes, CockroachDB automatically creates secondary indexes for columns with the UNIQUE constraint.
For existing tables with CREATE INDEX.
By applying the UNIQUE constraint to columns with ALTER TABLE, which automatically creates an index of the constrained columns.

To review guidelines for creating the most useful secondary indexes, see Secondary Index Best Practices.
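The three ways of creating a secondary index listed above can be sketched as follows. The table employees, its columns, and the index and constraint names are hypothetical.

```sql
-- 1. Define a secondary index while creating the table.
CREATE TABLE employees (
    id INT PRIMARY KEY,
    email STRING(200),
    last_name STRING(100),
    department STRING(100),
    INDEX employees_last_name_idx (last_name)
);

-- 2. Add a secondary index to an existing table.
CREATE INDEX employees_department_idx ON employees (department);

-- 3. Apply a UNIQUE constraint, which creates an index on the constrained column.
ALTER TABLE employees ADD CONSTRAINT employees_email_key UNIQUE (email);

-- List the indexes that now exist on the table.
SHOW INDEXES FROM employees;
```

The SHOW INDEXES output should list the primary index alongside the three secondary indexes created here.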

Selection

In most cases, CockroachDB selects the index it calculates will scan the fewest rows (i.e., the fastest). Cases where CockroachDB will use multiple indexes include certain queries that use disjunctions (i.e., predicates with OR), as well as zigzag joins for some other queries. To learn how to use the EXPLAIN statement for your query to see which index is being used, see Index Selection in CockroachDB.

To override CockroachDB index selection, you can also force queries to use a specific index (also known as “index hinting”). Index hinting is supported for SELECT, DELETE, and UPDATE statements.
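As a brief sketch of inspecting and overriding index selection, the statements below first ask for the query plan and then force a specific index. The table events and the index events_kind_idx are hypothetical.

```sql
-- Hypothetical table with one secondary index.
CREATE TABLE events (
    id INT PRIMARY KEY,
    kind STRING(50),
    INDEX events_kind_idx (kind)
);

-- Show the plan, including which index the optimizer chose.
EXPLAIN SELECT id FROM events WHERE kind = 'login';

-- Force the query to use a specific index (index hinting).
SELECT id FROM events@events_kind_idx WHERE kind = 'login';

-- Longer form of the same hint.
SELECT id FROM events@{FORCE_INDEX=events_kind_idx} WHERE kind = 'login';
```

A hint overrides the optimizer's own choice, so it is usually worth comparing the EXPLAIN output with and without the hint before relying on it.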

Storage

CockroachDB stores indexes directly in its key-value store. You can find more information in our blog post Mapping Table Data to Key-Value Storage.

Locking

Tables are not locked during index creation due to CockroachDB support for online schema changes.

Performance

Indexes create a trade-off: they greatly improve the speed of queries, but may slightly slow down writes to an affected column (because new values have to be written for both the table and the index).

To maximize your indexes’ performance, Cockroach Labs recommends following the secondary index best practices.

To observe the impact of an index without affecting a production workload, you can create an index using the NOT VISIBLE clause. If an index is NOT VISIBLE, queries will not read from the index unless it is specifically selected with an index hint or the property is overridden with the optimizer_use_not_visible_indexes session variable. For an example, refer to . For more index visibility considerations, refer to .

We strongly recommend adding size limits to all indexed columns, which includes columns in primary keys. Values exceeding 1 MiB can lead to storage layer write amplification and cause significant performance degradation or even crashes due to out-of-memory errors.

To add a size limit using CREATE TABLE:

CREATE TABLE name (first STRING(100), last STRING(100));

To add a size limit using ALTER TABLE:

ALTER TABLE name ALTER first TYPE STRING(99);

For more information about how to tune CockroachDB performance, see .
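The NOT VISIBLE workflow described in this section can be sketched as follows. The table rides and the index rides_city_idx are hypothetical. The sketch assumes the session variable that overrides index visibility is named optimizer_use_not_visible_indexes.

```sql
-- Hypothetical table.
CREATE TABLE rides (
    id INT PRIMARY KEY,
    city STRING(100),
    rider_id INT
);

-- Create the index as not visible so the optimizer ignores it by default.
CREATE INDEX rides_city_idx ON rides (city) NOT VISIBLE;

-- Opt in for the current session only, then inspect the plan.
SET optimizer_use_not_visible_indexes = true;
EXPLAIN SELECT id FROM rides WHERE city = 'seattle';

-- Once satisfied, make the index visible to all queries.
ALTER INDEX rides@rides_city_idx VISIBLE;
```

A not visible index is still maintained on every write, so this approach measures the read-side benefit while the write-side cost is already being paid.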

Storing columns

The STORING clause specifies columns that are not part of the index key but should be stored in the index. This optimizes queries that retrieve those columns without filtering on them, because it prevents the need to read the primary index. An index that stores all the columns needed by a query is also known as a covering index for that query. When a query has a covering index, CockroachDB can use that index directly instead of doing an “index join” with the primary index, which is likely to be slower. The synonym COVERING is also supported.

Example

Suppose you have a table with three columns, two of which are indexed:

> CREATE TABLE tbl (col1 INT, col2 INT, col3 INT, INDEX (col1, col2));

If you filter on the indexed columns but retrieve the unindexed column, this requires reading col3 from the primary index via an “index join.”

> EXPLAIN SELECT col3 FROM tbl WHERE col1 = 10 AND col2 > 1;

  distribution: local
  vectorized: true

  • index join
  │ table: tbl@tbl_pkey
  │
  └── • scan
        missing stats
        table: tbl@tbl_col1_col2_idx
        spans: [/10/2 - /10]

  index recommendations: 1
  1. type: index replacement
     SQL commands: CREATE INDEX ON tbl (col1, col2) STORING (col3); DROP INDEX tbl@tbl_col1_col2_idx;
(14 rows)

However, if you store col3 in the index as shown in the index recommendations, the index join is no longer necessary. This means your query only needs to read from the secondary index, so it will be more efficient.

> CREATE TABLE tbl (col1 INT, col2 INT, col3 INT, INDEX (col1, col2) STORING (col3));

> EXPLAIN SELECT col3 FROM tbl WHERE col1 = 10 AND col2 > 1;

               info
----------------------------------
  distribution: local
  vectorized: true

  • scan
    missing stats
    table: tbl@tbl_col1_col2_idx
    spans: [/10/2 - /10]
(7 rows)
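If tbl already exists with the original index, re-running CREATE TABLE is not an option. One alternative, sketched here, is to apply the index recommendation from the first EXPLAIN output directly. The index name tbl_col1_col2_storing_idx is hypothetical; the recommendation itself leaves the new index unnamed.

```sql
-- Add a covering index that stores col3, then drop the index it replaces.
CREATE INDEX tbl_col1_col2_storing_idx ON tbl (col1, col2) STORING (col3);
DROP INDEX tbl@tbl_col1_col2_idx;

-- Check the plan again to see whether the index join is gone.
EXPLAIN SELECT col3 FROM tbl WHERE col1 = 10 AND col2 > 1;
```

Creating the new index before dropping the old one keeps the query indexed throughout the change.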

Best practices

For best practices, see Secondary Index Best Practices.

Indexes on `REGIONAL BY ROW` tables in multi-region databases

In multi-region deployments, most users should use REGIONAL BY ROW tables instead of explicit index partitioning. When you add an index to a REGIONAL BY ROW table, it is automatically partitioned on the crdb_region column. Explicit index partitioning is not required.

While CockroachDB processes an ADD REGION or DROP REGION statement on a particular database, creating or modifying an index will throw an error. Similarly, all ADD REGION and DROP REGION statements will be blocked while an index is being modified on a REGIONAL BY ROW table within the same database.

This behavior also applies to GIN indexes. For an example that uses unique indexes but applies to all indexes on REGIONAL BY ROW tables, see .
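As a rough sketch of adding an index to a REGIONAL BY ROW table, the statements below assume that a multi-region database with at least one region is already configured and selected. The table accounts and the index accounts_email_idx are hypothetical. The sketch also assumes that the implicit partitioning column is named crdb_region.

```sql
-- Assumes the current database is already a multi-region database.
CREATE TABLE accounts (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    email STRING(200)
) LOCALITY REGIONAL BY ROW;

-- No PARTITION BY clause is needed; the index is partitioned automatically.
CREATE INDEX accounts_email_idx ON accounts (email);

-- Inspect the index columns, including any implicit region column
-- (assumed here to be crdb_region).
SHOW INDEXES FROM accounts;
```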
