Vectorized Query Execution

On this page

CockroachDB supports column-oriented ("vectorized") query execution on all CockroachDB data types.

Many SQL databases execute query plans one row of table data at a time. Row-oriented execution models can offer good performance for online transaction processing (OLTP) queries, but suboptimal performance for online analytical processing (OLAP) queries. The CockroachDB vectorized execution engine dramatically improves performance over row-oriented execution by processing each component of a query plan on type-specific batches of column data.

Configuring vectorized execution

By default, vectorized execution is enabled in CockroachDB.

You can configure vectorized execution with the vectorize session variable. The following options are supported:

Option	Description
`on`	Turns on vectorized execution for all queries on rows over the `vectorize_row_count_threshold` (0 rows, by default, meaning all queries will use the vectorized engine). Default: `vectorize=on`
`off`	Turns off vectorized execution for all queries.

For information about setting session variables, see SET <session variable>.

Tip:

To see if CockroachDB will use the vectorized execution engine for a query, run a simple EXPLAIN statement on the query. If vectorize is true, the query will be executed with the vectorized engine. If it is false, the row-oriented execution engine is used instead.

Setting the row threshold for vectorized execution

The efficiency of vectorized execution increases with the number of rows processed. If you are querying a table with a small number of rows, it is more efficient to use row-oriented execution.

By default, vectorized execution is enabled for all queries.

For performance tuning, you can change the minimum number of rows required to use the vectorized engine to execute a query plan in the current session with the vectorize_row_count_threshold session variable.

How vectorized execution works

When you issue a query, the gateway node (i.e., the node from which you issue the query) parses the query and creates a physical plan for execution on each node that receives the plan. If vectorized execution is enabled, the physical plan is sent to each node to be executed by the vectorized execution engine.

Tip:

To see a detailed view of the vectorized execution plan for a query, run the EXPLAIN(VEC) statement on the query.

For information about vectorized execution in the context of the CockroachDB architecture, see Query Execution.

For detailed examples of vectorized query execution for hash and merge joins, see the blog posts 40x faster hash joiner with vectorized execution and Vectorizing the merge joiner in CockroachDB.

Disk-spilling operations

The following operations require memory buffering during execution:

Global sorts
Unordered aggregations
Hash joins
Merge joins on non-unique columns. Merge joins on columns that are guaranteed to have one row per value, also known as "key columns", can execute entirely in-memory.
Window functions. Note that support for window functions is limited in the vectorized execution engine.

If there is not enough memory allocated for an operation, CockroachDB will spill the intermediate execution results to disk. By default, the memory limit allocated per operator is 64MiB. You can change this limit with the sql.distsql.temp_storage.workmem cluster setting.

You can also configure a node's total budget for in-memory query processing at node startup with the --max-sql-memory flag. If the queries running on the node exceed the memory budget, the node spills intermediate execution results to disk. The --max-disk-temp-storage flag sets the maximum on-disk storage capacity. If the maximum on-disk storage capacity is reached, the query will return an error during execution.

Known limitations

Unsupported queries

The vectorized engine does not support queries containing:

A join filtered with an ON expression. See tracking issue.

Window functions

Support for certain window functions is limited in the vectorized execution engine. If a query includes an unsupported window function, the window function will be handled by the row-oriented execution engine. If the same query includes other, supported operations, those operations will be handled by the vectorized execution engine. See tracking issue.

Spatial features

The vectorized engine does not support working with spatial data. Queries with geospatial functions or spatial data will revert to the row-oriented execution engine.

Unordered distinct operations

Disk spilling isn't supported when running UPSERT statements that have nulls are distinct and error on duplicate markers. You can check this by using EXPLAIN and looking at the statement plan.

        ├── distinct                     |                     |
        │    │                           | distinct on         | ...
        │    │                           | nulls are distinct  |
        │    │                           | error on duplicate  |

Pricing

Contact us

Sign In

Vectorized Query Execution

Configuring vectorized execution

Setting the row threshold for vectorized execution

How vectorized execution works

Disk-spilling operations

Known limitations

Unsupported queries

Window functions

Spatial features

Unordered distinct operations

See also

Tell us about your experience

Thank you for your feedback!

Explore More Documentation:

Vectorized Query Execution

Configuring vectorized execution

Setting the row threshold for vectorized execution

How vectorized execution works

Disk-spilling operations

Known limitations

Unsupported queries

Window functions

Spatial features

Unordered distinct operations

See also

Tell us about your experience

Select the problem area

Thank you for your feedback!

Explore More Documentation: