SQL subqueries enable reuse of the results from a selection query within another query.
CockroachDB supports two kinds of subqueries:
- Relational subqueries which appear as operand in selection queries or table expressions.
- Scalar subqueries which appear as operand in a scalar expression.
Data writes in subqueries
When a subquery contains a data-modifying statement (
DELETE, etc.), the data modification is always executed to
completion even if the surrounding query only uses a subset of the
This is true both for subqueries defined using the
notations, and those defined using
> SELECT * FROM [INSERT INTO t(x) VALUES (1), (2), (3) RETURNING x] LIMIT 1;
This query always inserts 3 rows into
t, even though the surrounding
query only observes 1 row using
A subquery is said to be "correlated" when it uses table or column names defined in the surrounding query.
For example, to find every customer with at least one order, run:
> SELECT c.name FROM customers AS c WHERE EXISTS( SELECT * FROM orders AS o WHERE o.customer_id = c.id );
The subquery is correlated because it uses
c defined in the surrounding query.
LATERAL subqueries. A
LATERAL subquery is a correlated subquery that references another query or subquery in its
SELECT statement, usually in the context of a
LEFT join or an
INNER join. Unlike other correlated subqueries,
LATERAL subqueries iterate through each row in the referenced query for each row in the inner subquery, like a for loop.
To create a
LATERAL subquery, use the
LATERAL keyword directly before the inner subquery's
For example, the following statement performs an
INNER join of the
users table and a subquery of the
rides table that filters on values in the
> SELECT name, address FROM users, LATERAL (SELECT * FROM rides WHERE rides.start_address = users.address AND city = 'new york');
name | address +------------------+-----------------------------+ Robert Murphy | 99176 Anderson Mills James Hamilton | 73488 Sydney Ports Suite 57 Judy White | 18580 Rosario Ville Apt. 61 Devin Jordan | 81127 Angela Ferry Apt. 8 Catherine Nelson | 1149 Lee Alley Nicole Mcmahon | 11540 Patton Extensions (6 rows)
LATERAL subquery joins are especially useful when the join table includes a computed column.
For example, the following query joins a subquery of the
rides table with a computed column (
adjusted_revenue), and a subquery of the
users table that references columns in the
> SELECT ride_id, vehicle_id, type, adjusted_revenue FROM ( SELECT id AS ride_id, vehicle_id, revenue - 0.25*revenue AS adjusted_revenue FROM rides ) r JOIN LATERAL ( SELECT type FROM vehicles WHERE city = 'new york' AND vehicles.id = r.vehicle_id AND r.adjusted_revenue > 65 ) v ON true;
ride_id | vehicle_id | type | adjusted_revenue +--------------------------------------+--------------------------------------+------------+------------------+ 049ba5e3-53f7-4ec0-8000-000000000009 | 11111111-1111-4100-8000-000000000001 | scooter | 71.2500 0624dd2f-1a9f-4e80-8000-00000000000c | 00000000-0000-4000-8000-000000000000 | skateboard | 70.5000 08b43958-1062-4e00-8000-000000000011 | 11111111-1111-4100-8000-000000000001 | scooter | 70.5000 0bc6a7ef-9db2-4d00-8000-000000000017 | 00000000-0000-4000-8000-000000000000 | skateboard | 68.2500 0d4fdf3b-645a-4c80-8000-00000000001a | 00000000-0000-4000-8000-000000000000 | skateboard | 67.5000 1ba5e353-f7ce-4900-8000-000000000036 | 11111111-1111-4100-8000-000000000001 | scooter | 70.5000 (6 rows)
LATERAL subquery join, the rows returned by the inner subquery are added to the result of the join with the outer query. Without the
LATERAL keyword, each subquery is evaluated independently and cannot refer to objects defined in separate queries.
The cost-based optimizer supports most correlated subqueries, with the exception of correlated subqueries that generate side effects inside a
Performance best practices
CockroachDB is currently undergoing major changes to evolve and improve the performance of subqueries. The restrictions and workarounds listed in this section will be lifted or made unnecessary over time.
Scalar subqueries currently disable the distribution of the execution of a query. To ensure maximum performance on queries that process a large number of rows, make the client application compute the subquery results ahead of time and pass these results directly in the surrounding query.
The results of scalar subqueries are currently loaded entirely into memory when the execution of the surrounding query starts. To prevent execution errors due to memory exhaustion, ensure that subqueries return as few results as possible.