Vectorized Query Execution

On this page Carat arrow pointing down
Warning:
CockroachDB v19.2 is no longer supported. For more details, see the Release Support Policy.

New in v19.2: CockroachDB supports column-oriented ("vectorized") query execution.

Many SQL databases execute query plans one row of table data at a time. Row-oriented execution models can offer good performance for online transaction processing (OLTP) queries, but suboptimal performance for online analytical processing (OLAP) queries. The CockroachDB vectorized execution engine dramatically improves performance over row-oriented execution by processing each component of a query plan on type-specific batches of column data.

Note:

CockroachDB does not support vectorized execution for all data types. For details, see supported data types.

Configuring vectorized execution

By default, vectorized execution is enabled in CockroachDB for all queries that are guaranteed to execute in memory, on tables with supported data types.

You can turn vectorized execution on or off for all queries in the current session with the vectorize session variable. The following options are supported:

Option Description
auto Instructs CockroachDB to use the vectorized execution engine on most queries that execute in memory, without the need to spill intermediate results to disk. See Memory-intensive operations for more information.

Default: vectorize=auto
experimental_on Turns on vectorized execution for all supported queries. We do not recommend using this option in production environments, as it can lead to memory issues.
See Known limitations for more information.
off Turns off vectorized execution for all queries.

For information about setting session variables, see SET <session variable>.

Tip:

CockroachDB supports vectorized execution on columns with supported data types only. Setting the vectorize session variable to experimental_on does not turn vectorized execution on for queries on columns with unsupported data types.

Tip:

To see if CockroachDB will use the vectorized execution engine for a query, run a simple EXPLAIN statement on the query. If vectorize is true, the query will be executed with the vectorized engine. If it is false, the row-oriented execution engine is used instead.

Setting the row threshold for vectorized execution

The efficiency of vectorized execution increases with the number of rows processed. If you are querying a table with a small number of rows, it is more efficient to use row-oriented execution.

By default, vectorized execution is enabled for queries on tables of 1000 rows or more. If the number of rows in a table falls below 1000, CockroachDB uses the row-oriented execution engine instead.

For performance tuning, you can change the minimum number of rows required to use the vectorized engine to execute a query plan in the current session with the vectorize_row_count_threshold session variable. This variable is ignored if vectorize=experimental_on.

How vectorized execution works

When you issue a query, the gateway node (i.e., the node from which you issue the query) parses the query and creates a physical plan for execution on each node that receives the plan. If vectorized execution is enabled, the physical plan is sent to each node to be executed by the vectorized execution engine.

Tip:

To see a detailed view of the vectorized execution plan for a query, run the EXPLAIN(VEC) statement on the query.

For information about vectorized execution in the context of the CockroachDB architecture, see Query Execution.

For detailed examples of vectorized query execution for hash and merge joins, see the blog posts 40x faster hash joiner with vectorized execution and Vectorizing the merge joiner in CockroachDB.

Known limitations

Vectorized execution is not as extensively tested as CockroachDB's existing row-oriented execution engine. In addition, some data types are not supported, and support for some operations is experimental.

Supported data types

Vectorized execution is supported for the following data types and their aliases:

Note:

CockroachDB uses the vectorized engine to execute queries on columns with supported data types, even if a column's parent table includes unused columns with unsupported data types.

Queries with constant NULL arguments

The vectorized execution engine does not support queries that contain constant NULL arguments, with the exception of the IS projection operators IS NULL and IS NOT NULL.

For example, SELECT x IS NOT NULL FROM t is supported, but SELECT x + NULL FROM t returns an unable to vectorize execution plan error.

For more information, see the tracking issue.

Memory-intensive operations

Support for vectorized execution is experimental for the following memory-intensive operations:

During the execution of these operations, some intermediate results are written to an in-memory data buffer. In CockroachDB v19.2, the vectorized engine does not spill the contents of intermediate data buffers to disk. As a result, memory-intensive queries can buffer an unlimited quantity of data in memory and lead to memory issues.

We do not recommend using vectorized execution in production environments for memory-intensive queries that cannot spill to disk. Executing these queries with the vectorized execution engine (i.e., with the vectorize session variable set to experimental_on) can cause CockroachDB nodes to run out of memory and crash.

See also


Yes No
On this page

Yes No