What's New in v21.2.0-beta.1

September 24, 2021

Warning:

This testing release includes a known bug. We do not recommend upgrading to this release. The v21.2.0-beta.2 release includes a fix for the bug.

Get future release notes emailed to you:

Downloads

Warning:

This release includes a known bug. We do not recommend upgrading to this release.

Warning:
The CockroachDB executable for Windows is experimental and not suitable for production deployments. Windows 8 or higher is required.

Docker image

icon/buttons/copy
$ docker pull cockroachdb/cockroach-unstable:v21.2.0-beta.1

Backward-incompatible changes

  • Previously, CockroachDB only supported the YMD format for parsing timestamps from strings. It now also supports the MDY format to better align with PostgreSQL. A timestamp such as 1-1-18, which was previously interpreted as 2001-01-18, will now be interpreted as 2018-01-01. To continue interpreting the timestamp in the YMD format, the first number can be represented with 4 digits, 2001-1-18. #64381
  • The deprecated setting cloudstorage.gs.default.key has been removed, and the behavior of the AUTH parameter in Google Cloud Storage BACKUP and IMPORT URIs has been changed. The default behavior is now that of AUTH=specified, which uses the credentials passed in the CREDENTIALS parameter, and the previous default behavior of using the node's implicit access (via its machine account or role) now requires explicitly passing AUTH=implicit. #64737
  • Switched types from TEXT to "char" for compatibility with postgres in the following columns: pg_constraint (confdeltype, confmatchtype, confudptype, contype) pg_operator (oprkind), pg_prog (proargmodes), pg_rewrite (ev_enabled, ev_type), pg_trigger (tgenabled) #65101

Security updates

  • Certain HTTP debug endpoints reserved to admin users now return more details about range start/end keys, such as the "hot ranges" report. #63748
  • The cluster setting server.remote_debugging.mode has been removed. The debug endpoints are now available to every client with access to the HTTP port. All the HTTP URLs previously affected by this setting already have user authentication and require a user to be logged in as a member of the admin role, so there was no need for an additional layer of security. #63748
  • There is now a cache for per-user authentication-related information. The data in the cache is always kept up-to-date because it checks if any change to the underlying authentication tables has been made since the last time the cache was updated. The cached data includes the user's hashed password, the NOLOGIN role option, and the VALID UNTIL role option. #66919
  • The --cert-principal-map flag now allows the certificate principal name to contain colons. #67703
  • Added the admin-only debugging builtin functions crdb_internal.read_file and crdb_internal.write_file to read/write bytes from/to external storage URIs used by BACKUP or IMPORT. #67427
  • The certificate loader now allows the loading of root-owned private keys, provided the owning group matches the group of the (non-root) user running CockroachDB. #68182
  • Old authentication web session rows in the system.web_sessions table no longer accumulate indefinitely in the long run. These rows are periodically deleted. Refer to the documentation for details about the new cluster settings for system.web_sessions. #67547
  • The error returned during a failed authentication attempt will now include the InvalidAuthorizationSpecification PostgreSQL error code (28000). #69106

General changes

  • CockroachDB now supports new debug endpoints to help users with troubleshooting. #69594
  • The kv.closed_timestamp.closed_fraction and kv.follower_read.target_multiple settings are now deprecated and turned into no-ops. They had already stopped controlling the closing of timestamps in v21.1, but were still influencing the follower_read_timestamp() computation for a timestamp that's likely to be closed on all followers. To replace them, a simpler kv.closed_timestamp.propagation_slack setting is introduced, modeling the delay between when a leaseholder closes a timestamp and when all the followers become aware of it (defaults conservatively to 1s). follower_read_timestamp() is now computed as kv.closed_timestamp.target_duration + kv.closed_timestamp.side_transport_interval + kv.closed_timestamp.propagation_slack, which defaults to 4.2s (instead of the previous default of 4.8s). #69775
  • Added documentation for Cluster API v2 endpoints. #62560
  • Added crdb_internal.create_join_token() SQL builtin function to create join tokens for use when joining new nodes to a secure cluster. This functionality is hidden behind a feature flag. #62053
  • Added Cluster API v2 endpoints for querying databases, tables, users, and events in a database. #63000
  • All SQL-level cluster settings have been made public. #66688
  • The setting kv.transaction.write_pipelining_max_outstanding_size is now a no-op. Its function is folded into the kv.transaction.max_intents_bytes setting. #66915
  • Introduced a /_status/regions endpoint which returns all regions along with their availability zones. #67098
  • crdb_internal.regions is now accessible from a tenant. #67098
  • The behavior for retrying jobs, which fail due to a retriable error or due to job coordinator failure, is now delayed using exponential backoff. Before this change, jobs that failed in a retriable manner would be resumed immediately on a different coordinator. This change reduces the impact of recurrently failing jobs on the cluster. This change adds two new cluster settings that control this behavior: jobs.registry.retry.initial_delay and jobs.registry.retry.max_delay, which respectively control initial delay and maximum delay between resumptions. #66889
  • Previously, non-cancelable jobs, such as schema-change jobs, could fail while reverting due to transient errors, leading to unexpected results. Now, non-cancelable reverting jobs are retried instead of failing when transient errors are encountered. This mitigates the impact of temporary failures on non-cancelable reverting jobs. #69087
  • Added new columns in the crdb_internal.jobs table that show the current backoff state of a job and its execution log. The execution log consists of a sequence of job start and end events and any associated errors that were encountered during each job's execution. Now users can query the internal crdb_internal.jobs table to get more insights about jobs through the following columns: last_run shows the last execution time of a job; next_run shows the next execution time of a job based on exponential-backoff delay; num_runs shows the number of times the job has been executed; and execution_log provides a set of events that are generated when a job starts and ends its execution. #68995
  • When jobs encounter retriable errors during execution, they will now record these errors into their state. The errors, as well as metadata about the execution, can be inspected via the newly added execution_errors field of crdb_internal.jobs, which is a STRING[] column. #69370

Enterprise edition changes

  • Added new DEBUG_PAUSE_ON option to RESTORE jobs to allow for self pause on errors. #69422
  • Changefeed option values are now case insensitive. #69217
  • Performance for changefeeds during some range-split operations is now improved. #66312
  • Cloud storage sinks for enterprise changefeeds are no longer experimental. #69787
  • Kafka sink URIs now accept the topic_name parameter to override per-table topic names. #62377
  • SHOW BACKUP now shows whether the backup is full or incremental under the backup_type column. #63832
  • Previously, if a restore cluster mismatched the regions in backup cluster, the data would be restored as if the zone configuration did not exist. CockroachDB now checks the regions before restore, making users aware of mismatched regions between backup and restore clusters. If there is a mismatched region, users can either update cluster localities or restore with the --skip-localities-check option to continue. #64758
  • Added ca_cert as a query parameter to the Confluent registry schema URL to trust custom certs on connection. #65431
  • Changefeeds will now report more schema registry connection problems immediately at job creation time. #65775
  • Changefeeds can now be started with the mvcc_timestamp option to emit the MVCC timestamp of each row being emitted. This option is similar to the updated option, but the mvcc_timestamp will always contain the row's MVCC timestamp, even during the changefeed's initial backfill. #65661
  • Introduced a new webhook sink (prefix webhook-https) to send individual changefeed messages as webhook events. #66497
  • RESTORE now supports restoring individual tables into a multi-region database. If the table being restored is also multi-region, REGIONAL BY ROW tables cannot be restored, and REGIONAL BY TABLE tables can only be restored if their localities match those of the database they're being restored into. #65015
  • Changefeeds don't attempt to use protected timestamps when running in a multi-tenant environment. #67285
  • New 'on_error' option to pause on non-retryable errors instead of failing. #68176
  • Changefeeds no longer fail when started on REGIONAL BY ROW tables. Note that in REGION BY ROW tables, the crdb_region column becomes part of the primary index. Thus, changing an existing table to REGIONAL BY ROW will trigger a changefeed backfill with new messages emitted using the new composite primary key. #68229
  • Descriptor IDs of every object are now visible in SHOW BACKUP, along with the descriptor IDs of the object's database and parent schema. SHOW BACKUP will display these IDs if the WITH debug_ids option is specified. #68540
  • The changefeed Avro format is no longer marked as experimental. #68818
  • Changefeed statements now error if the provided sink URL does not contain a scheme. Such URLs are typically a mistake and will result in non-functional changefeeds. #68978
  • Added WITH REASON = <reason> to PAUSE JOB to gain visibility into why a job was paused by allowing pauses to be attached to a reason string. This reason is then persisted in the payload of the job and can be queried later. #68909
  • Because the SELECT database privilege is being deprecated, CockroachDB now additionally checks for the CONNECT privilege on the database to allow for backing up the database. Existing users with SELECT on the database can still back up the database, but it is now recommended to GRANT CONNECT on the database. #68391
  • Added a webhook_sink_config JSON option to configure batching and flushing behavior, along with retry behavior for webhook sink changefeed messages. #68633
  • Changefeeds will now error if an option is used with an incompatible sink. #69173
  • Fixed a bug where changefeeds would fail to correctly handle a primary key change. #69234
  • Changefeeds now correctly account for memory during backfills and "pushback" under memory pressure—that is, slow down backfills. #69388
  • Changefeeds will now slow down correctly whenever there is a slow-down in the system (i.e. downstream sink is slow). #68288
  • Changefeeds will now flush the sink only when frontier advances. This eliminates unnecessary sink flushes. #67988
  • Improved changefeed scalability, particularly when running against large tables, by reducing the rate of job progress updates. #67815
  • Changefeeds can resume during backfill without losing too much progress. #66013

SQL language changes

  • To perform REASSIGN OWNED BY, the current user running the command must now be a member of both the old and new roles. Previously the new owner would need: to be a member of the CREATEDB role if the object being changed was a database, CREATE privileges for the database if the object being changed was a schema, or CREATE privileges for the schema if object being changed was a table or type. #69382
  • The SQL stats compaction job now only shows up in the output of SHOW AUTOMATIC JOBS. #69641
  • DROPs, RENAMEs, and other light schema changes are no longer user cancelable to avoid scenarios that do not properly rollback. #69328
  • Users can now opt to disable auto-rehoming for a session by setting on_update_rehome_row_enabled = false. This can be permanently unset by default using the cluster setting sql.defaults.on_update_rehome_row.enabled. #69626
  • It is now possible to alter the owner of the crdb_internal_region type, which is created by initiating a multi-region database. #69722
  • Added more detail to the error message users receive when they call SHOW BACKUP with a path pointing to the root of the collection, rather than a specific backup in the collection. #69638
  • Introduced a new cluster setting sql.stats.persisted_rows.max and increased its default value to 1000000 (1,000,000 rows). #69667
  • Previously, users had no way of determining which objects in their database utilized deprecated features like interleaved indexes/tables or cross-database references. Added crdb_internal tables cross_db_references, interleaved_indexes, and interleaved_tables for detecting these deprecated features within a given database. #61629
  • The sql.defaults.vectorize_row_count_threshold cluster setting, as well as the corresponding vectorize_row_count_threshold session variable, have been removed. From now on, CockroachDB will behave exactly as if these were set to 0 (last default value). #62164
  • Updated crdb_internal.interleaved to add the parent_table_name column replacing the parent_index_name column. #62076
  • CockroachDB now references sequences used in views by their IDs to allow these sequences to be renamed. #61439
  • Added SQLType to classify DDL, DML, DCL, or TCL statement types. #62989
  • Implemented the geometry-based builtin ST_IsValidTrajectory. #63072
  • CockroachDB now accepts UUID inputs that have a hyphen after any group of four digits. This aligns with the UUID format used by PostgreSQL. For example, a0ee-bc99-9c0b-4ef8-bb6d-6bb9-bd38-0a11 is now considered a valid UUID. #63137
  • The gen_ulid and uuid_ulid_to_string builtins are now available for use. #62440
  • Single-key spans in EXPLAIN and EXPLAIN (DISTSQL) are now shown without a misleading dash after them. #61583
  • Enabled locality-optimized search in the row execution engine. #63384
  • CockroachDB now should be more stable when executing queries with subqueries producing many rows (previously it could OOM crash; it wil now use the temporary disk storage). #63900
  • Correlated common table expressions (CTEs) can now be used. #63956
  • Introduced a new session variable distsql_workmem that determines how much RAM a single operation of a single query can take before the operation must spill to disk. This is identical to the sql.distsql.temp_storage.workmem cluster setting but has the session-level scope. #63959
  • crdb_internal.node_statement_statistics now stores statement_id. #64076
  • Added line to the EXPLAIN ANALYZE output to show disk spill usage to make it clear when disk spilling occurs executing a query. This output is only shown when the disk usage is greater than zero. The verbiage in the DistSQL EXPLAIN diagrams changed from max scratch disk allocated to max sql temp disk usage for consistency and to match the way we talk about SQL spill disk usage elsewhere. #64137
  • Previously, committing a transaction when a portal was suspended would cause a "multiple active portals not supported" error. Now, the portal is automatically destroyed. #63677
  • Bulk IO operations are no longer included in service latency metrics. #64442
  • Collated strings may now have a locale that is a language tag, followed by a -u- suffix, followed by anything else. For example, any locale with a prefix of en-US-u- is now considered valid. #64695
  • Added new tables to pg_catalog: pg_partitioned_table, pg_replication_origin_status, pg_init_privs, pg_replication_slots, pg_policy, pg_sequences, pg_subscription_rel, pg_largeobject_metadata. #64035
  • Added new columns to pg_catalog tables: pg_class (relminmxid), pg_constraint (conparentid). #64035
  • Using the table name as a projection now works, e.g., SELECT table_name FROM table_name or SELECT row_to_json(table_name) FROM table_name. #64748
  • Added generate_series for TIMESTAMPTZ values. #64887
  • Added the sql.defaults.require_explicit_primary_keys.enabled cluster setting for requiring explicit primary keys in CREATE TABLE statements. #64951
  • SQL service latency now only includes metrics from DML statements. #64893
  • Added support for using OPERATOR(operator) for binary expressions. This only works for in-built CockroachDB operators. #64701
  • Introduced the OPERATOR syntax for unary operators. This only works for unary operators usable in CockroachDB. #64701
  • Implemented the geometry builtin function ST_LineCrossingDirection. #64997
  • Added the pg_relation_is_updatable and pg_column_is_updatable builtin functions #64788
  • information_schema.columns.data_type now returns "USER-DEFINED" when the column is a user-defined type (e.g., ENUM) #65154
  • CockroachDB now supports the scalar functions get_byte() and set_byte() as in PostgreSQL. #65189
  • CockroachDB now supports converting strings of hexadecimal digits prefixed by x or X to a BIT value, in the same way as PostgreSQL. Note that only the conversion via casts is supported (e.g., 'XAB'::BIT(8)); PostgreSQL's literal constant syntax (e.g., X'AB'::BIT(8)) continues to have different meaning in CockroachDB (byte array) due to historical reasons. #65188
  • SHOW JOBS now shows the trace_id of the trace that is associated with the current execution of the job. This allows pulling inflight traces for a job for debugging purposes. #65322
  • The vectorized execution engine now supports the ntile window function. #64977
  • Pg_sequences table was implemented on pg_catalog #65420
  • A constant can now be cast to regclass without first converting the constant to an OID. E.g., 52::regclass can now be done instead of 52::oid::regclass. #65432
  • References to WITH expressions from correlated subqueries are now always supported. #65550
  • Added new SHOW CHANGEFEED JOBS command with additional information about changefeeds for improved user visibility. #64956
  • This is strictly a change for docgen and sql grammar. Now all sql.y statements (excluding those that are unimplemented or specified to be skipped) will have automatically have a stmtSpec defined for them and thus will have a bnf and svg file automatically generated in cockroachdb/generated-diagrams. #65278
  • The vectorized execution engine now supports the lag and lead window functions. #65634
  • The total number of statement/transaction fingerprints stored in-memory can now be limited using the cluster settings sql.metrics.max_mem_stmt_fingerprints and sql.metrics.max_mem_txn_fingerprints. #65902
  • Previously, SQL commands that were sent during the PostgreSQL extended protocol that were too big would error opaquely. This is now resolved by returning a friendlier error message. #57590
  • Namespace entries may no longer be queried via system.namespace2. #65340
  • EXPLAIN ANALYZE output now includes, for each plan step, the total time spent waiting for KV requests as well as the total time those KV requests spent contending with other transactions. #66157
  • Added the SHOW CREATE DATABASE command to get database metadata. #66033
  • Added sample_plan, database_name, and exec_node_ids columns to the crdb_internal.node_statement_statistics table. This allows for third-party and partner consumption of this data. #65782
  • Implemented pg_rewrite for table-view dependencies. Table-view dependencies are no longer stored directly in pg_depend. #65495
  • Added some virtual tables crdb_internal.(node|cluster)_distsql_flows that expose the information about the flows of the DistSQL execution scheduled on remote nodes. These tables do not include information about the non-distributed queries or about local flows (from the perspective of the gateway node of the query). #65727
  • The use order of columns in a foreign key no longer needs to match the order the columns were defined for the reference table's unique constraint. #65209
  • Previously, in some special cases (UPSERTs, as documented in this issue), the support of the distinct operations was missing in the vectorized execution engine. This has been added, and such operations will be able to spill to disk if necessary. However, in case the distinct operator does, in fact, spill to disk, there is a slight complication. The order in which rows are inserted can be non-deterministic: for example, for a query such as INSERT INTO t VALUES (1, 1), (1, 2), (1, 3) ON CONFLICT DO NOTHING, with t having the schema a INT PRIMARY KEY, b INT, it is possible that any of the three rows are actually inserted. PostgreSQL appears to have the same behavior. #61582
  • Added three new views to the crdb_internal schema to support developers investigating contention events: cluster_contended_{tables, indexes, keys}. #66370
  • Implemented similar_escape and made similar_to_escape compatible with PostgreSQL. #66578
  • The "char" column type will now truncate long values, in line with PostgreSQL. #66422
  • The contents of the statistics table in the information schema have changed; therefore, so have the results of SHOW INDEX and SHOW COLUMNS. A column that is not in the primary key will now be listed as belonging to the primary index as a stored column. Previously, it was simply not listed as belonging to the primary index. #66599
  • Adding empty missing tables on information_schema for compatibility: attributes, check_constraint_routine_usage, column_column_usage, column_domain_usage, column_options, constraint_table_usage, data_type_privileges, domain_constraints, domain_udt_usage, domains, element_types, foreign_data_wrapper_options, foreign_data_wrappers, foreign_server_options, foreign_servers, foreign_table_options, foreign_tables, information_schema_catalog_name, role_column_grants, role_routine_grants, role_udt_grants, role_usage_grants, routine_privileges, sql_features, sql_implementation_info, sql_parts, sql_sizing, transforms, triggered_update_columns, triggers, udt_privileges, usage_privileges, user_defined_types, user_mapping_options, user_mappings, view_column_usage, view_routine_usage, view_table_usage. #65854
  • The SHOW QUERIES command was extended for prepared statements to show the actual values in use at query time, rather than the previous $1, $2, etc. placeholders. We expect showing these values will greatly improve the experience of debugging slow queries. #66689
  • Populated the pg_type table with entries for each table. Also populated pg_class.reltypid with the corresponding oid in the pg_type table. #66815
  • Added a virtual table crdb_internal.cluster_inflight_traces which surfaces cluster-wide inflight traces for the trace_id specified via an index constraint. The output of this table is not appropriate to consume over a SQL connection; follow-up changes will add CLI wrappers to make the interaction more user-friendly. #66679
  • Added support for iso_8601 and sql_standard as usable session variables in IntervalStyle. Also added a sql.defaults.intervalstyle cluster setting to be used as the default interval style. #67000
  • Added a cluster setting sql.defaults.primary_region, which assigns a PRIMARY REGION to a database by default. #67168
  • Introduced a cluster setting sql.allow_drop_final_region.enabled which disallows dropping of a PRIMARY REGION (the final region of a database). #67168
  • IMPORT TABLE will be deprecated in v21.2 and removed in a future release. Users should create a table using CREATE TABLE and then IMPORT INTO the newly created table. #67275
  • CockroachDB now uses JSONB instead of BYTES to store statement plans in system.statement_statistics. #67331
  • Reduced instantaneous memory usage during scans by up to 2x. #66376
  • Added the session variable backslash_quote for PostgreSQL compatibility. Setting this does a no-op, and only safe_encoding is supported. #67343
  • Introduced a crdb_internal.regions table which contains data on all regions in the cluster. #67098
  • When parsing intervals, IntervalStyle is now taken into account. In particular, IntervalStyle = 'sql_standard' will make all interval fields negative if there is a negative symbol at the front, e.g., -3 years 1 day would be -(3 years 1 day) in sql_standard and -3 days, 1 day in PostgreSQL DateStyle. #67210
  • ROLLBACK TO SAVEPOINT can now be used to recover from LockNotAvailable errors (pgcode 55P03), which are returned when performing a FOR UPDATE SELECT with a NOWAIT wait policy. #67514
  • Added tables to information_schema that are present on MySQL. The tables are not populated and are entirely empty. column_statistics, columns_extensions, engines, events, files, keywords, optimizer_trace, partitions, plugins, processlist, profiling, resource_groups, schemata_extensions, st_geometry_columns, st_spatial_reference_systems, st_units_of_measure, table_constraints_extensions, tables_extensions, tablespaces, tablespaces_extensions, user_attributes. #66795
  • Added new builtins compress(data, codec) and decompress(data, codec) which can compress and decompress bytes with the specified codec. Gzip is the only currently supported codec. #67426
  • The column types of the results of SHOW LAST QUERY STATISTICS (an undocumented statement meant mostly for internal use by CockroachDB's SQL shell) has been changed from INTERVAL to STRING. They are populated by the durations of the various phases of executions as if the duration, as an INTERVAL, was converted to STRING using the 'postgres' IntervalStyle. This ensures that the server-side execution timings are always available regardless of the value of the IntervalStyle session variable. #67654
  • Changed information_schema.routines data types at columns interval_precision, result_cast_char_octet_length and result_cast_datetime_precision to INT. #67641
  • Added syntax for granting and revoking privileges for all the tables in the specified schemas. New supported syntax: GRANT {privileges...} ON ALL TABLES IN SCHEMA {schema_names...} TO {roles...}; REVOKE {privileges...} ON ALL TABLES IN SCHEMA {schema_names...} TO {roles...}. This command is added for PostgreSQL compatibility. #67509
  • Added pg_stat_database and pg_stat_database_conflicts to pg_catalog. #66687
  • A database that is restored with the cluster setting sql.defaults.primary_region will now have the PRIMARY REGION from the cluster setting assigned to the database. #67581
  • The vectorized execution engine now supports CASE expressions that output BYTES-like types. #66399
  • Introduced the with_min_timestamp and with_max_staleness builtin functions. In a SELECT clause, they return the same timestamp and (now() - interval), but are intended for use in AS OF SYSTEM TIME, which will appear in an upcoming update. #67697
  • first_value, last_value, and nth_value window functions can now be executed in the vectorized execution engine. This allows for faster execution time, and also removes the need for conversions to and from row format. #67764
  • Improved performance of lookup joins in some cases. If join inequality conditions can be matched to index columns, CockroachDB now includes the conditions in the index lookup spans and removes them from the runtime filters. #66002
  • Added support for ALTER DEFAULT PRIVILEGES and default privileges stored on databases. All objects created in a database will have the privilege set defined by the default privileges for that type of object on the database. The types of objects are TABLES, SEQUENCES, SCHEMAS, TYPES. Example: ALTER DEFAULT PRIVILEGES GRANT SELECT ON TABLES TO foo makes it such that all tables created by the user that executed the ALTER DEFAULT PRIVILEGES command will have SELECT privilege on the table for user foo. Additionally, one can specify a role. Example: ALTER DEFAULT PRIVILEGES FOR ROLE bar GRANT SELECT ON TABLES TO foo. All tables created by bar will have SELECT privilege for foo. If a role is not specified, it uses the current user. For further context, see the PostgreSQL documentation. Currently, default privileges are not supported on the schema. Specifying a schema like ALTER DEFAULT PRIVILEGES IN SCHEMA s will error. WITH GRANT OPTION is ignored. GRANT OPTION FOR is also ignored. #66785
  • Introduced a nearest_only argument for with_min_timestamp/with_max_staleness, which enforces that bounded staleness reads only talk to the nearest replica. #67837
  • CREATE TABLE LIKE now copies hidden columns over. #67799
  • Populated pg_catalog.pg_default_acl. This is important for tracking which default privileges are defined in the database. pg_catalog.pg_default_acl has 5 columns:
    • oid (oid): row identifier
    • defaclrole (oid): oid of the role the default privileges are defined for
    • defaclnamespace (oid): oid of the schema the default privileges are defined in
    • defaclobjtype (char): r = relation (table, view), S = sequence, f = function, T = type, n = schema
    • defaclacl (aclitem[]): string representation of default privileges, following the format "$1=$2/$3" where $1 is the grantee's username, $2 is a list of characters representing the privileges, and $3 is the grantor (which is currently always an empty string in CockroachDB).
    Privileges are represented by chars in the aclitem[] representation.
    • CREATE = 'C'
    • SELECT = 'r'
    • INSERT = 'a'
    • DELETE = 'd'
    • UPDATE = 'w'
    • USAGE = 'U'
    • CONNECT = 'c'
    See the PostgreSQL documentation for the table of PostgreSQL-supported privileges and their char representations and the PostgreSQL definition of pg_catalog.pg_default_acl. #67872
  • An earlier commit changed CockroachDB to use the value of the IntervalStyle session var when interpreting interval to string conversions. However, this made string::interval and interval::string casts have a volatility of "stable" instead of "immutable". This has ramifications for items such as computed columns and check clauses, which cannot use immutable expressions. This means that the particular results returned by these queries can become incoherent when IntervalStyle is customized to a different value from its default, postgres. In order to provide guardrails against this incoherence, CockroachDB now provides a new, separate configuration knob called intervalstyle_enabled that applications can use to "opt in" the ability to customize IntervalStyle:
    • By default, this knob is false and applications cannot customize IntervalStyle. Interval to string conversions that are already stored in the schema are unaffected and continue to produce results as per the default style postgres.
    • When the knob is true, these things happen:
      • Apps can start customizing IntervalStyle in SQL sessions.
      • A SQL session that has a custom IntervalStyle will start observing incoherent results when accessing tables that already contain a conversion from interval to string.
    The knob works as follows: The primary configuration mechanism is a new cluster setting called sql.defaults.intervalstyle.enabled. This is the knob that operators and DBAs should customize manually. Then, as a secondary configuration mechanism, the value of the cluster setting is also copied to each SQL session as a new session var intervalstyle_enabled. This is a performance optimization. SQL apps should not modify this session var directly, except for temporary testing purposes. In v22.1, upgrades will be disabled if these stable expressions are found in computed columns, check clauses, etc. #67792
  • Previously, OPERATOR(+)int would simplify to +int when parsed, which would lead to re-reproducibility issues when considering order of operators. This is now fixed by leaving OPERATOR(+) in the tree. #68041
  • Previously, pretty printing could fold some OPERATOR expressions based on the order of operations of the operator inside the OPERATOR. This can lead to a different order of operations when reparsing, so this is fixed by never folding OPERATOR expressions. #68041
  • Added the SHOW CREATE SCHEDULE command to view SQL statements used to create existing schedules. #66782
  • Created a builtin to reset the zone configurations of multi-region tables. This builtin can be helpful in cases where the user has overridden the zone configuration for a given table and wishes to revert back to the original system-specified state. #67985
  • Implemented the parse_interval and to_char_with_style builtins, which convert strings from/to intervals with immutable volatility. #67970
  • Casting from interval to string or vice-versa is now blocked for computed columns, partial indexes, and partitions when the intervalstyle_enabled session setting is enabled. Instead, using to_char_with_style(interval, style) or parse_interval(interval, intervalstyle) is available as a substitute. This is enforced in v22.1, but is opt-in for v21.2. It is recommended to set the cluster setting sql.defaults.intervalstyle_enabled to true to avoid surprises when upgrading to v22.1. #67970
  • Common aggregate functions can now be executed in the vectorized execution engine. This allows for better memory accounting and faster execution in some cases. #68081
  • Retry information has been added to the statement trace under the exec stmt operation. The trace message is in the format: "executing after retries, last retry reason: ". This message will appear in any operations that show the statement trace, which is included in operations such as SHOW TRACE FOR SESSION and is also exported in the statement diagnostics bundle. #67941
  • Added empty pg_stat* tables on pg_catalog: pg_stat_all_indexes, pg_stat_all_tables, pg_stat_archiver, pg_stat_bgwriter, pg_stat_gssapi, pg_stat_progress_analyze, pg_stat_progress_basebackup, pg_stat_progress_cluster, pg_stat_progress_create_index, pg_stat_progress_vacuum, pg_stat_replication, pg_stat_slru, pg_stat_ssl, pg_stat_subscription, pg_stat_sys_indexes, pg_stat_sys_tables, pg_stat_user_functions, pg_stat_user_indexes, pg_stat_user_tables, pg_stat_wal_receiver, pg_stat_xact_all_tables, pg_stat_xact_sys_tables, pg_stat_xact_user_functions, pg_stat_xact_user_tables, pg_statio_all_indexes, pg_statio_all_sequences, pg_statio_all_tables, pg_statio_sys_indexes, pg_statio_sys_sequences, pg_statio_sys_tables, pg_statio_user_indexes, pg_statio_user_sequences, pg_statio_user_tables. #67947
  • Added syntax for ALTER ROLE ... SET statements. The business logic for these statements is not yet implemented, but will be added in a later commit. The following forms are supported: ALTER ROLE { name | ALL } [ IN DATABASE database_name ] SET var { TO | = } { value | DEFAULT }, ALTER ROLE { name | ALL } [ IN DATABASE database_name ] RESET var, ALTER ROLE { name | ALL } [ IN DATABASE database_name ] RESET ALL. As with other statements, the keywords ROLE and USER are interchangeable. This matches the PostgreSQL syntax. #68001
  • BACKUP now supports backing up tables in a specified schema (e.g., BACKUP my_schema.*, or my_db.my_schema.*). Schemas will be resolved before databases, so my_object.* will resolve to a schema of that name in the current database before matching a database with that name. #67649
  • Added support for a new index hint, NO_ZIGZAG_JOIN, which will prevent the optimizer from planning a zigzag join for the specified table. The hint can be used in the same way as other existing index hints. For example, SELECT * FROM table_name@{NO_ZIGZAG_JOIN};. #68141
  • Added a cardinality builtin function that returns the total number of elements in a given array. #68263
  • Introduced a new cluster setting jobs.trace.force_dump_mode that allows users to configure Traceable jobs to dump their traces:
    • never: Job will never dump its traces.
    • onFail: Job will dump its trace after transitioning to the failed state.
    • onStatusChange: Job will dump its trace whenever it transitions from paused, canceled, succeeded or failed state. #67386
  • DMY and YMD DateStyles are now supported. #68093
  • When the date value is out of range, a hint now suggests that the user try a different DateStyle. #68093
  • When DateStyle and IntervalStyle are updated, this will now send a ParamStatusUpdate over the wire protocol. #68093
  • Added support to alter default privileges for all roles. The syntax supported is ALTER DEFAULT PRIVILEGES FOR ALL ROLES grant_default_privs_stmt/revoke_default_privs_stmt. Only admin users are able to execute this. This allows adding default privileges for objects that are created by ANY role, as opposed to having to specify a creator role to which the default privileges will apply when creating an object. Example: ALTER DEFAULT PRIVILEGES FOR ALL ROLES GRANT SELECT ON TABLES TO foo;. Regardless of whichever user now creates a table in the current database, foo will have SELECT. #68076
  • Added a crdb_internal.reset_multi_region_zone_configs_for_database builtin to reset the zone configuration of a multi-region database. This builtin can be helpful in cases where the user has overridden the zone configuration for a given database and wishes to revert back to the original system-specified state. #68280
  • The session setting optimizer_improve_disjunction_selectivity and its associated cluster setting sql.defaults.optimizer_improve_disjunction_selectivity.enabled are no longer supported. They were added in v21.1.7 to enable better optimizer selectivity calculations for disjunctions. This logic is now always enabled. #68349
  • Running ALTER ROLE on any role that is a member of admin now requires the admin role. Previously, any user with the CREATEROLE option could ALTER an admin. #68187
  • Introduced an hlc_to_timestamp builtin, which converts a CockroachDB HLC to a TIMESTAMPTZ. This is useful for pretty printing crdb_internal_mvcc_timestamp or cluster_logical_timestamp(), but is not useful for accuracy. #68360
  • Added a crdb_internal.default_privileges table that is useful for getting a human-readable way of examining default privileges. #67997
  • Added support for SHOW DEFAULT PRIVILEGES and SHOW DEFAULT PRIVILEGES FOR ROLE .... If a role(s) is not specified, default privileges are shown for the current role. SHOW DEFAULT PRIVILEGES returns the following columns: schema_name, role, object_type, grantee, privilege_type. #67997
  • The pg_db_role_setting table of the pg_catalog is now implemented. When ALTER ROLE ... SET var is used to configure per-role defaults, these default settings will be populated in pg_db_role_setting. This table contains the same data no matter which database the current session is using. For more context, see the PostgreSQL documentation. #68245
  • Removed the count column from the system.statement_statistics and system.transaction_statistics tables. #67866
  • Introduced the crdb_internal.index_usage_statistics virtual table to surface index usage statistics. The sql.metrics.index_usage_stats.enabled cluster setting can be used to turn on/off the subsystem. It defaults to true. #66640
  • The bulkio.backup.proxy_file_writes.enabled cluster setting is no longer needed to enable proxied writes, which are now the default. #68468
  • Default session variable settings configured by ALTER ROLE ... SET are now supported. The following order of precedence is used for variable settings:
    1. Settings specified in the connection URL as a query parameter
    2. Per-role and per-database settings configured by ALTER ROLE
    3. Per-role and all-database settings configured by ALTER ROLE
    4. All-role and per-database settings configured by ALTER ROLE
    5. All-role and all-database settings configured by ALTER ROLE
    RESET does not validate the setting name. SET validates both the name and the proposed default value. Note that the default settings for a role are not inherited if one role is a member of another role that has default settings. Also, the defaults only apply during session initialization. Using SET DATABASE to change databases does not apply default settings for that database. The public, admin, and root roles cannot have default session variables configured. The root role also will never use the "all-role" default settings. This is so that root has fewer dependencies during session initialization and to make it less likely for root authentication to become unavailable during the loss of a node. Changing the default settings for a role requires the role running the ALTER command to either be an ADMIN or to have the CREATEROLE role option. Only ADMINs can edit the default settings for another admin. Futhermore, changing the default settings for ALL roles is only allowed for ADMINs. Roles without ADMIN or CREATEROLE cannot change the default settings for themselves. #68128
  • An earlier commit changed CockroachDB to use the value of the DateStyle session var when interpreting date to string conversions (and vice-versa). However, this made string::{date,timestamp} and {date,timetz,time,timestamp}::string casts have a volatility of "stable" instead of "immutable". This has ramifications for items such as computed columns and check clauses, which cannot use immutable expressions. This means that the particular results returned by these queries can become incoherent when DateStyle is customized to a different value from its default, ISO,MDY. In order to provide guardrails against this incoherence, CockroachDB now provides a new, separate configuration knob called datestyle_enabled that applications can use to "opt in" the ability to customize DateStyle:
    • By default, this knob is false and applications cannot customize DateStyle. Invalid conversions that are already stored in the schema are unaffected and continue to produce results as per the default style postgres.
    • When the knob is true, these things happen:
      • Apps can start customizing DateStyle in SQL sessions.
      • A SQL session that has a custom DateStyle will start observing incoherent results when accessing tables that already contain the aforementioned casts.
      • New schemas cannot have the above casts.
    The knob works as follows: The primary configuration mechanism is a new cluster setting called sql.defaults.datestyle.enabled. This is the knob that operators and DBAs should customize manually. Then as a secondary configuration mechanism, the value of the cluster setting is also copied to each SQL session as a new session var datestyle_enabled. This is a performance optimization. SQL apps should not modify this session var directly, except for temporary testing purposes. In v22.1, upgrades will be disabled if these stable expressions are found in computed columns, check clauses, etc. #68352
  • Introduced parse_interval and to_char, which takes in 1 string or interval and assumes the PostgreSQL IntervalStyle to make its output. #68351
  • parse_timestamp now has a two-argument variant, which takes in a DateStyle and parses timestamps according to that DateStyle. The one argument version assumes MDY. These builtins have an immutable volatility. #68351
  • Introduced a timestamp,DateStyle variant to to_char_with_style, which converts timestamps to a string with an immutable volatility. There is also a 1 arg to_char for timestamp values which assumes the ISO,MDY output style. #68351
  • Introduced a parse_date builtin with two variants. The single-argument variant parses a date and assumes ISO,MDY datestyle, and the two-argument variant parses a date assuming the DateStyle variant on the second argument. This provides an immutable way of casting strings to dates. #68351
  • Introduced to_char(date), which assumes a DateStyle of ISO,MDY and outputs date in that format. There is also a to_char_with_style(date, DateStyle) variant which outputs the date in the chosen DateStyle. This provides an immutable way of casting dates to strings. #68351
  • Implemented the parse_time and parse_timetz builtins, which parses a TIME or TIMETZ with immutable volatility. #68351
  • Some queries with lookup joins and/or top K sorts are now more likely to be executed in "local" manner with the distsql=auto session variable. #68524
  • SQL stats now can be persisted into system.statement_statistics and system.transaction_statistics tables by enabling the sql.stats.flush.enable cluster setting. The interval of persistence is determined by the new sql.stats.flush.interval cluster setting, which defaults to 1 hour. #67090
  • The lock_timeout session variable is now supported. The configuration can be used to abort a query with an error if it waits longer than the configured duration while blocking on a single row-level lock acquisition. #68042
  • Added support for GENERATED {ALWAYS | BY DEFAULT} AS IDENTITY syntax in a column definition. This will automatically create a sequence for the given column. This matches PostgreSQL syntax and functionality. For more context, see the PostgreSQL documentation. #68711
  • The vectorized execution engine can now perform a scan over an index, and then join on the primary index to retrieve the required columns. #67450
  • SHOW DEFAULT PRIVILEGES FOR ALL ROLES is now supported as a syntax. Show default privileges returns a second column for_all_roles (bool) which indicates whether or not the default privileges shown are for all roles. #68607
  • SHOW DEFAULT PRIVILEGES now only shows default privileges for the current user. #68607
  • If a user has a default privilege defined for, they cannot be dropped until the default privilege is removed. Example: ALTER DEFAULT PRIVILEGES FOR ROLE test1 GRANT SELECT ON TABLES TO test2;. Neither test1 nor test2 can be dropped until performing ALTER DEFAULT PRIVILEGES FOR ROLE test1 REVOKE SELECT ON TABLES FROM test2;. #67950
  • Added new metrics to track schema job failure: sql.schema_changer.errors.all, sql.schema_changer.errors.constraint_violation, sql.schema_changer.errors.uncategorized; errors inside the crdb_internal.feature_usage table. #68252
  • Indexes on expressions can now be created. These indexes can be used to satisfy queries that contain filters with identical expressions. For example, SELECT * FROM t WHERE a + b = 10 can utilize an index like CREATE INDEX i ON t ((a + b)). #68807
  • Roles with the name none and starting with pg_ or crdb_internal can no longer be created. Any existing roles with these names may continue to work, but they may be broken when new features (e.g., SET ROLE) are introduced. #68972
  • Implemented the geometry-based builtin ST_Translate on arguments {geometry, float8,float8,float8}. #68959
  • Errors involving MinTimestampBoundUnsatisfiableError during a bounded staleness read now get a custom pgcode (54C01). #68967
  • Table statistics are no longer collected for views. #68997
  • Added support for SCHEMA comments using PostgreSQL's COMMENT ON SCHEMA syntax. #68606
  • SET ROLE user now parses without an equal between ROLE and =. This functionality is not yet implemented. #68750
  • Allowed RESET ROLE to be parsed. This is not yet implemented. #68750
  • Added a session_user() builtin function, which currently returns the same thing as current_user(), as we do not implement SET ROLE. #68749
  • Introduced new crdb_internal.statement_statistics virtual table that surfaces both cluster-wide in-memory statement statistics as well as persisted statement statistics. #68715
  • The regrole OID alias type is now supported, which is a PostgreSQL-compatible object identifier alias that references the pg_catalog.pg_authid table. #68877
  • CockroachDB now correctly sends the RESET tag instead of the SET tag when a RESET statement is run. #69053
  • Bounded staleness reads now retry transactions when nearest_only=True and a schema change is detected which may prevent a follower read from being served. #68969
  • Granting SELECT, UPDATE, INSERT, DELETE on databases is being deprecated. The syntax is still supported, but is automatically converted to the equivalent ALTER DEFAULT PRIVILEGES FOR ALL ROLES command. The user is given a notice that the privilege is incompatible and automatically being converted to an ALTER DEFAULT PRIVILEGE FOR ALL ROLES command. #68391
  • The syntax for setting database placement is now ALTER DATABASE db PLACEMENT .... (The SET keyword is no longer allowed before the PLACEMENT keyword.) #69067
  • The ALTER DATABASE db SET var ... syntax is now supported. It is a syntax alias for ALTER ROLE ALL IN DATABASE db SET var ..., since it is identical to that functionality: it configures the default value to use for a session variable when a user connects to the given database. #69067
  • Implemented the crdb_internal.serialize_session() and crdb_internal.deserialize_session(bytes) builtins. The former outputs the session settings in a string that can be deserialized into another session by the latter. #68792
  • Added a cluster setting schedules.backup.gc_protection_enabled that defaults to true and enables chaining of GC protection across backups run as part of a schedule. #68446
  • crdb_internal.pb_to_json now does not emit default values by default. #69185
  • Implemented pg_shdepend with shared dependencies with tables, databases and pinned user/roles. #68018
  • Added a builtin function crdb_internal.datums_to_bytes, which can encode any data type which can be used in an forward index key into bytes in an immutable way. This function is now used in the expression for hash-sharded indexes. #67865
  • Introduced a new crdb_internal.transaction_statistics virtual table that surfaces both cluster-wide in-memory transaction statistics as well as persisted transaction statistics. #69049
  • Introduced SET ROLE, which allows users with certain permissions to assume the identity of another user. It is worth noting that due to cross-version compatibility, session_user will always return the same as current_user until v22.1. Instead, use session_user() if you require this information. #68973
  • An ON UPDATE expression can now be added to a column. Whenever a row is updated without modifying the ON UPDATE column, the column's ON UPDATE expression is re-evaluated, and the column is updated to the result. #69091
  • Roles have a default set of default privileges. For example, a role has ALL privileges on all objects as its default privileges when it creates the object. Additionally, the public role has Usage is a default privilege. This matches PostgreSQL's behavior such that the creator role and public role have the same set of default privileges in the default state. Now, when a user creates a table, sequence, type, or schema, it will automatically have ALL privileges on it, and public will have USAGE on types. This can be altered: ALTER DEFAULT PRIVILEGE FOR ROLE rolea REVOKE ALL ON ... FROM rolea will remove the default set of default privileges on the specified object from the role. #68500
  • SHOW DEFAULT PRIVILEGES shows implicit privileges. Implicit privileges are "default" default privileges. For example, the creator should have all privileges on any object that it creates. This is now reflected in SHOW DEFAULT PRIVILEGES. #69377
  • Added a system.span_configurations table. This will later be used to store authoritative span configs. #69047
  • Added support for GENERATED {ALWAYS | BY DEFAULT} AS IDENTITY (seq_option) syntax under CREATE TABLE. An IDENTITY column is an auto-incremented column based on an automatically created sequence, with the same performance considerations as the CREATE SEQUENCE syntax. The seq_option is consistent with the sequence option syntax in CREATE SEQUENCE, and we also support CACHE in seq_option for better performance. Hence, such a column can only be of integer type, and is implicitly NOT NULL. It is essentially the same as SERIAL with serial_normalization=sql_sequence, except for user access to override it. A GENERATED ALWAYS AS IDENTITY column cannot be overridden without specifying OVERRIDING SYSTEM VALUE in an INSERT/UPSERT/UPDATE statement. This overriding issue cannot be resolved by the ON CONFLICT syntax. Such a column can only be updated to DEFAULT. A GENERATED BY DEFAULT AS IDENTITY column allows being overridden without specifying any extra syntax, and users are allowed to add repeated values to such a column. It can also be updated to customized expression, but only accepts integer type expression result. GENERATED {ALWAYS | BY DEFAULT} AS IDENTITY is also supported under ALTER TABLE ... ADD COLUMN ... syntax. This matches the PostgreSQL syntax. #69107
  • Added transaction_fingerprint_id to system.statement_statistics primary key. #69320
  • Introduced crdb_internal.schedule_sql_stats_compaction() to manually create SQL Stats compaction schedule. Extended the SHOW SCHEDULES command to support SHOW SCHEDULES FOR SQL STATISTICS. #68401
  • The cluster setting sql.defaults.experimental_auto_rehoming.enabled and session setting experimental_enable_auto_rehoming were added to enable auto-rehoming on UPDATE for REGIONAL BY ROW tables. #69381
  • SHOW is_superuser now works, and is set to true if the user has root privileges. #69224
  • Introduced SET LOCAL, which sets a session variable for the duration of the transaction. SET LOCAL is a no-op outside the transaction. #69224
  • SET LOCAL now works for SAVEPOINTs. ROLLBACK will rollback any variables set during SET LOCAL. RELEASE TO SAVEPOINT will continue to use the variables set by SET LOCAL in the transaction. #69224
  • Interleaved syntax for CREATE TABLE/INDEX is now a no-op, since support has been removed. #69304
  • crdb_internal.reset_sql_stats() now resets persisted SQL Stats. #69273
  • Changed the plan_hash column in both system.statement_statistics and crdb_internal.statement_statistics from Int to Bytes. #69502
  • Added the optional IF NOT EXISTS clause to the CREATE SCHEDULE statement, making the statement idempotent. #69152
  • SHOW is_superuser now works, and is set to true if the user has admin privileges. #69355
  • The set_config builtin function now allows the local parameter to be true. This is the same as using SET LOCAL. #69480
  • Added a new as_json option to SHOW BACKUP which renders the backup manifest as JSON value. #62628
  • Added a new EXPLAIN flag, MEMO, to be used with EXPLAIN (OPT). When the MEMO flag is passed, a representation of the optimizer memo will be printed along with the best plan. The MEMO flag can be used in combination with other flags such as CATALOG and VERBOSE. For example, EXPLAIN (OPT, MEMO, VERBOSE) will print the memo along with verbose output for the best plan.

Operational changes

  • New session variable large_full_scan_rows, as well as the corresponding cluster setting sql.defaults.large_full_scan_rows, are now available. This setting determines which tables are considered "large" for the purposes of enabling disallow_full_table_scans feature to reject full table/index scans only of "large" table. The default value for the new setting is 1000, and in order to reject all full table/index scans (the previous behavior) one can set the new setting to 0. Internally issued queries aren't affected, and the new setting has no impact when disallow_full_table_scans feature is not enabled. #69371
  • Introduced new metric called txn.restarts.commitdeadlineexceeded that tracks the number of transactions that were forced to restart because their commit deadline was exceeded (COMMIT_DEADLINE_EXCEEDED). #69671
  • The default value of the storage.transaction.separated_intents.enabled cluster setting was changed to true. #64831
  • Node-level admission control that considers the CPU resource was introduced for KV request processing, and response processing (in SQL) for KV responses. This admission control can be enabled using admission.kv.enabled and admission.sql_kv_response.enabled. #65614
  • The new cluster setting bulkio.backup.merge_file_size allows BACKUP to buffer and merge smaller files to reduce the number of small individual files created by BACKUP. #66856
  • Increased the timeout for range MVCC garbage collection from 1 minute to 10 minutes, to allow larger jobs to run to completion. #65001
  • MVCC and intent garbage collection now triggers when the average intent age is 8 hours, down from 10 days. #65001
  • Added a server.authentication_cache.enabled cluster setting that defaults to true. When enabled, this cache stores authentication-related data and will improve the latency of authentication attempts. Keeping the cache up to date adds additional overhead when using the CREATE, ALTER, and DROP ROLE commands. To minimize the overhead, any bulk ROLE operations should be run inside of a transaction. To make the cache more effective, any regularly-scheduled ROLE updates should be done all together, rather than occurring throughout the day at all times. #66919
  • Introduced a new metric exportrequest.delay.total to track how long ExportRequests (issued by BACKUP) are delayed by throttling mechansisms. #67310
  • Enabling admission.kv.enabled may provide better inter-tenant isolation for multi-tenant KV nodes. #67533
  • debug.zip files no longer contain the file threads.txt, which was previously used to list RocksDB background threads. #67389
  • DistSQL response admission control can now be enabled using the cluster setting admission.sql_sql_response.enabled. #67531
  • Added the kv.bulk_sst.target_size and kv.bulk_sst.max_allowed_overage cluster settings that control the batch size used by export requests during BACKUP. #67705
  • RESTORE no longer dynamically reads from the kv.bulk_ingest.batch_size cluster setting to determine its batch size. If the value is updated, RESTORE jobs need to be PAUSEd and RESUMEd to adopt the updated setting. #68105
  • The memory pool used for SQL is now also used to cover KV memory used for scans. #66362
  • CockroachDB now records a log event and counter increment when removing an expired session. #68476
  • Added an automatically created, on by default, emergency ballast file. This new ballast defaults to the minimum of 1% total disk capacity or 1GiB. The size of the ballast may be configured via the --store flag with a ballast-size field, accepting the same value formats as the size field. Also, added a new Disk Full (10) exit code that indicates that the node exited because disk space on at least one store is exhausted. On node start, if any store has less than half the ballast's size bytes available, the node immediately exits with the Disk Full (10) exit code. The operator may manually remove the configured ballast (assuming they haven't already) to allow the node to start, and they can take action to remedy the disk space exhaustion. The ballast will automatically be recreated when available disk space is 4x the ballast size, or at least 10 GiB is available after the ballast is created. #66893
  • Added a new cluster setting, sql.mutations.max_row_size.log, which controls large row logging. Whenever a row larger than this size is written (or a single column family if multiple column families are in use) a LargeRow event is logged to the SQL_PERF channel (or a LargeRowInternal event is logged to SQL_INTERNAL_PERF if the row was added by an internal query). This could occur for INSERT, UPSERT, UPDATE, CREATE TABLE AS, CREATE INDEX, ALTER TABLE, ALTER INDEX, IMPORT, or RESTORE statements. SELECT, DELETE, TRUNCATE, and DROP are not affected by this setting. #67953
  • Added a new cluster setting, sql.mutations.max_row_size.err, which limits the size of rows written to the database (or individual column families, if multiple column families are in use). Statements trying to write a row larger than this will fail with a code 54000 (program_limit_exceeded) error. (Internal queries writing a row larger than this will not fail, but will log a LargeRowInternal event to the SQL_INTERNAL_PERF channel.) This limit is enforced for INSERT, UPSERT, and UPDATE statements. CREATE TABLE AS, CREATE INDEX, ALTER TABLE, ALTER INDEX, IMPORT, and RESTORE will not fail with an error, but will log LargeRowInternal events to the SQL_INTERNAL_PERF channel. SELECT, DELETE, TRUNCATE, and DROP are not affected by this limit. Note that existing rows violating the limit cannot be updated, unless the update shrinks the size of the row below the limit, but can be selected, deleted, altered, backed-up, and restored. For this reason we recommend using the accompanying setting sql.mutations.max_row_size.log in conjunction with SELECT pg_column_size() queries to detect and fix any existing large rows before lowering sql.mutations.max_row_size.err. #67953
  • The new cluster settings admission.l0_sub_level_count_overload_threshold and admission.l0_file_count_overload_threshold can be used to tune admission control. #69311
  • The new cluster settings sql.defaults.transaction_rows_written_log, sql.defaults.transaction_rows_written_err, sql.defaults.transaction_rows_read_log, and sql.defaults.transaction_rows_read_err (as well as the corresponding session variables) have been introduced. These settings determine the "size" of the transactions in written and read rows upon reaching of which the transactions are logged or rejected. The logging will go into the SQL_PERF logging channel. Note that the internal queries (i.e., those issued by CockroachDB internally) cannot error out but can be logged instead into SQL_INTERNAL_PERF logging channel. The "written" limits apply to INSERT, INSERT INTO SELECT FROM, INSERT ON CONFLICT, UPSERT, UPDATE, and DELETE whereas the "read" limits apply to SELECT statement in addition to all of these. These limits will not apply to CREATE TABLE AS, SELECT, IMPORT, TRUNCATE, DROP, ALTER TABLE, BACKUP, RESTORE, or CREATE STATISTICS statements. Note that enabling transaction_rows_read_err guardrail comes at the cost of disabling the usage of the auto commit optimization for the mutation statements in implicit transactions. #69202
  • The cockroach debug tsdump command now downloads histogram timeseries it silently omitted previously. #69469
  • New variables sql.mutations.max_row_size.{log|err} were renamed to sql.guardrails.max_row_size_{log|err} for consistency with other variables and metrics. #69457
  • Improved range feed observability by adding a crdb_internal.active_range_feeds virtual table which lists all currently executing range feeds on the node. #69055
  • Upgrading to the next version will be blocked if interleaved tables/indexes exist. Users should convert existing interleaved tables/indexes to non-interleaved ones or drop any interleaved tables/indexes before upgrading to the next version. #68074
  • Added support for the DataDog tracer. Set the trace.datadog.agent cluster setting to enable it if you have got the DataDog collector service running. #61602
  • Upgraded the Lightstep tracer version, resulting in better observability for Lightstep users. #61593
  • Added four new metrics, sql.guardrails.max_row_size_{log|err}.count{.internal}, which are incremented whenever a large row violates the corresponding sql.guardrails.max_row_size_{log|err} limit. #69457

Command-line changes

  • Added load show args to display subset of backup metadata. Users can display subsets of metadata of a single manifest by using load show files/descriptors/metadata/spans <backup_url>. #61131
  • Updated load show with summary subcommand to display meta information of an individual backup. Users are now able to inspect backup metadata, files, spans, and descriptors with the CLI command cockroach load show summary <backup_url> without a running cluster. #61131
  • Updated load show with incremental subcommand to display incremental backups. Users can list incremental backup paths of a full backup by running cockroach load show incremental <backup_url>. #61862
  • Added load show backups to display backup collection. Previously, users could list backups created by BACKUP INTO via SHOW BACKUP IN in a SQL session. But this listing task can be also done offline without a running cluster. Now, users are able to list backups in a collection with cockroach load show backups <collection_url>. #61862
  • The command cockroach sysbench has been removed. Users who depend on this command can use a copy of a cockroachdb executable binary from a previous version. #62305
  • The number of connection retries and connection timeouts for configurations generated by cockroach gen haproxy have been tweaked. #62308
  • Extended load show with load show data subcommand to display backup table data. By running cockroach load show data <table> <backup_url>, users are able to inspect data of a table in backup. Also, added --as-of flag to load show data command. Users are able to show backup snapshot data at a specified timestamp by running cockroach load show data <table> <backup_url> --as-of='-1s' #62662
  • Updated load show data command to display backup data in CSV format. Users can either pipe the output to a file or specify the destination by a --destination flag. #62662
  • Previously, load show summary would output results in an unstructured way, which made it harder to filter information. load show summary now outputs the information in JSON format, which is easier to handle and can be filtered through another command-line JSON processor. #63100
  • Previously, --as-of of load show data had the restriction that users could only inspect data at an exact backup timestamp. The flag has improved to work with backups with revision history so that users can inspect data at an arbitrary timestamp. #63181
  • Certain errors caused by invalid command-line arguments are now printed on the process' standard error stream, instead of standard output. #63839
  • The cockroach gen autocomplete command has been updated and can now produce autocompletion definitions for the fish shell. #63839
  • Previously, backup inspection was done via cockroach load show .., which could confuse users with ambiguous verbs in the command chain. The syntax is now more clear and indicative for users debugging backups. The changes are: load show summary <backup_url> -> debug backup show <backup_url>; load show incremental <backup_url> -> debug backup list-incremental <backup_url>; load show backups <collection_url> -> debug backup list-backups <collection_url>; load show data <table_name> <backup_url> -> debug backup export <backup_url> --table=<table_name>. #63309
  • Previously, \demo shutdown <node_idx> would error if --global was set. This will now error gracefully as an unsupported behavior. #62435
  • The --global flag for cockroach demo is now advertised. This flag simulates latencies in multi-node demo clusters when the nodes are set in different regions to simulate real-life global latencies. #62435
  • There will now be a message upon start-up on cockroach demo --global indicating that latencies between nodes will simulate real-world latencies. #62435
  • Added --max-rows and --start-key of cockroach backup debug tool for users to specify on export row number and start key when inspecting data from backup. #64157
  • Added --with-revisions on debug export to allow users to export revisions of table data. If --with-revisions is specified, revisions of data are returned to users, with an extra column displaying the revision time of that record. This is an experimenal/beta feature of the cockroach backup debug tool to allow users to export revisions of data from backup. #64285
  • Renamed connect to connect init, and added connect join command to retrieve certificates from an existing secure cluster and setup a new node to connect with it. #63492
  • The cockroach debug keys command recognizes a new flag --type that constrains types of displayed entries. This enables more efficient introspection of storage in certain troubleshooting scenarios. #64879
  • Server health metrics are now a structured event sent to the HEALTH logging channel. For details about the event payload, refer to the reference documentation. #65024
  • Server health metrics are now optimized for machine readability by being sent as a structured event to the HEALTH logging channel. For details about the event payload, refer to the reference documentation. #65024
  • The cockroach debug pebble tool can now be used with encrypted stores. #64908
  • Added a cockroach debug job-trace command that takes 2 arguments: <jobID> and file destination, along with a --url pointing to the node on which to execute this command against. The command pulls information about inflight trace spans associated with the job and dumps it to the file destination. #65324
  • The new subcommand cockroach convert-url converts a connection URL, such as those printed out by cockroach start or included in the online documentation, to the syntax recognized by various client drivers. For example:

    $ ./cockroach convert-url --url "postgres://foo/bar"
    Connection URL for libpq (C/C++), psycopg (Python), lib/pq & pgx (Go),
    node-postgres (JS) and most pq-compatible drivers:
       postgresql://root@foo:26257/bar
    Connection DSN (Data Source Name) for Postgres drivers that accept
    DSNs - most drivers and also ODBC:
       database=bar user=root host=foo port=26257
    Connection URL for JDBC (Java and JVM-based languages):
       jdbc:postgresql://foo:26257/bar?user=root
    

    #65460

  • The URLs spelled out by cockroach start, cockroach start-single-node, and cockroach demo in various outputs now always contain a target database for the connection; for example, defaultdb for regular servers. Certain drivers previously automatically filled in the name "postgres" if the database name field was empty. #65460

  • The connection URLs spelled out by cockroach start, cockroach start-single-node, and cockroach demo in various outputs now include a variant suitable for use with JDBC client apps. #65460

  • cockroach sql and cockroach demo now support the client-side parameter border like psql. #66253

  • Added support for cockroach debug ballast on Windows. #66793

  • The cockroach import pgdump command now recognizes custom target database names inside the URL passed via --url. Additionally, the command now also accepts a --database parameter. Using this parameter is equivalent to customizing the database inside the --url flag. #66375

  • cockroach debug job-trace now creates a job-trace.zip which contains trace information for each node executing the job. #66914

  • Added the --recursive or -r flag to the cockroach userfile upload CLI command allowing users to upload the entire subtree rooted at a specified directory to user-scoped file storage: userfile upload -r path/to/source/dir destination. The destination can be expressed one of four ways:

    • Empty (not specified)
    • A relative path, such as path/to/dir
    • A well-formed URI with no host, such as userfile:///path/to/dir/
    • A full well-formed URI, such as userfile://db.schema.tablename_prefix/path/to/dir
    If a destination is not specified, the default URI scheme and host will be used, and the basename from the source will be used as the destination directory. For example: userfile://defaultdb.public.userfiles_root/yourdirectory. If the destination is a relative path such as path/to/dir, the default userfile URI schema and host will be used (userfile://defaultdb.public.userfiles_$user/), and the relative path will be appended to it. For example: userfile://defaultdb.public.userfiles_root/path/to/dir. If the destination is a well-formed URI with no host, such as userfile:///path/to/dir/, the default userfile URI schema and host will be used (userfile://defaultdb.public.userfiles_$user/). For example: userfile://defaultdb.public.userfiles_root/path/to/dir. If the destination is a full well-formed URI, such as userfile://db.schema.tablename_prefix/path/to/dir, then it will be used verbatim. For example: userfile://foo.bar.baz_root/path/to/dir. #65307

  • Previously, the crdb-v2 log file format lacked a parser. This has now changed. #65633

  • The cockroach debug merge-logs command now renders in color by default. #66629

  • CockroachDB now supports a new logging channel called TELEMETRY. This will be used in later versions to report diagnostic events useful to Cockroach Labs for product analytics. (At the time of this writing, no events are defined for the TELEMETRY channel yet.) When no logging configuration is specified, this channel is connected to file output, with a maximum retention of 1MiB. To also produce the diagnostic output elsewhere, one can define a new sink that captures this channel. For example, to see diagnostics reports on the standard error, one can use: --log='sinks: {stderr: {channels: TELEMETRY, filter: INFO}}' When configuring file output, the operator should be careful to apply a separate maximum retention for the TELEMETRY channel from other file outputs, as telemetry data can be verbose and outcrowd other logging messages. For example: --log='sinks: {file-groups: {telemetry: {channels: TELEMETRY, max-group-size: 1MB}, ...}}. #66427

  • Added the cockroach debug statement-bundle recreate <zipdir> command, which allows users to load a statement bundle into an in-memory database for inspection. #67979

  • CockroachDB server nodes now report more environment variables in logs upon startup. Only certain environment variables that may have an influence on the server's behavior are reported. #66842

  • cockroach sql and cockroach demo now support the \c / \connect client-side command, in a way similar to psql:

    • \c without arguments: display the current connection parameters.
    • \c [DB] [USER] [HOST] [PORT] connect using the specified parameters. Specify '-' to omit one parameter.
    • \c URL connect using the specified URL. For example: \c - myuser to reconnect to the same server/db as myuser.
    This feature is intended to ease switching across simulated nodes in cockroach demo. Note: \c <dbname> reuses the existing server connection to change the current database, using a SET statement. To force a network reconnect, use \c - then \c <dbname>, or use \c <dbname> -. Note: When using the syntax with discrete parameters, the generated URL reuses the same TLS parameters as the original connection, including the CA certificate used to validate the server. To use different TLS settings, use \c <URL> instead. #66258

  • Added a new HTTP sink to the logging system. This can be configured similarly to other log sinks with the new http-servers and http-defaults sections of the logging config passed via the --log or --log-config-file command-line flags. #66196

  • The \c client-side command in cockroach sql and cockroach demo now always reconnects to the server even when only changing the current database. (This negates a part of a previous release note.) #68326

  • cockroach demo now recognizes the command-line flag --listening-url-file like cockroach start and cockroach start-single-node. When specified, the demo utility will write a valid connection URL to that file after the test cluster has been initialized. This facility also makes it possible to automatically wait until the demo cluster has been initialized in automation; for example, by passing the name of a unix named FIFO via the new flag. #68706

  • cockroach mt start-sql now supports --advertise-addr in the same fashion as cockroach start. #69113

  • cockroach debug decode-proto now does not emit default values by default. #69185

  • The cockroach debug tsdump command now accepts --from and --to flags that limit for which dates timeseries are exported. #69491

  • Log file read and write permissions may now be set via the new file-permissions key in the --log flag or --log-config-file file. #69243

  • Updated the output of --locality and --locality-addr flags to use terms that match cloud provider names for things such as 'region' and 'zone'. #62381

API endpoint changes

  • A list of node IDs representing the nodes that store data for the database has been added to the stats field in the database details endpoint under nodeIds. Database details must be requested with include_stats set to true, e.g. /_admin/v1/databases/{database}?include_stats=true. Similarly, nodeIds has also been added to the table stats endpoint, which is an ordered list of node ids that stores the table data: /_admin/v1/databases/{database}/tables/{table}/stats #69788
  • The changefeed.poll_request_nanos metric is no longer reported by the node status API, the crdb_internal.metrics table, or the Prometheus endpoint. #63935
  • The transaction abort error reason ABORT_REASON_ALREADY_COMMITTED_OR_ROLLED_BACK_POSSIBLE_REPLAY has been renamed to ABORT_REASON_RECORD_ALREADY_WRITTEN_POSSIBLE_REPLAY. #67215
  • A Stats message was added to the admin DatabaseDetails response, providing RangeCount and ApproximateDiskBytes in support of upcoming UI changes to the DB console. #67986
  • Tenant pods now expose the Statements API at /_status/statements on their HTTP port. #66675
  • Tenant pods now expose the ListSessions API at /_status/sessions on their HTTP port. #69376
  • Added a new endpoint /_status/combinedstmts to retrieve persisted and in-memory statements from crdb_internal.statement_statistics and crdb_internal.transaction_statistics by aggregated_ts range. The request supports optional query string parameters start and end, which are the date range in unix time. The response returned is currently the response expected from /_status/statements. /_status/statements has also been updated to support the parameters combined, start, and end. If combined is true, then the statements endpoint will use /_status/combinedstmts with the optional parameters start and end. #69238

DB Console changes

  • A new column on the Database Page now shows the node and region information for each database. The Tables view now displays a summary section of the nodes and regions where the table data is stored. The new table columns and region/node sections are only displayed if there is more than one node. #69804
  • Fixed duplicates of statements on the Transaction Details Page for multi-node clusters. #61771
  • Changed copy that previously referred to the app as "Admin UI" to "DB Console" instead. #62452
  • Users can now reset SQL stats from the DB Console. #63342
  • Created new routes for statements. A database name can be passed on statements routes, so only statements executed on that particular database are displayed. Added a new function returning all databases that had at least one statement executed during the current statistics collection period. #64087
  • The lease history section on the range report debug page now shows the type of lease acquisition event that resulted in a given lease. #63822
  • Updated the DB Console to show information about the database on the Statements Page and ability to choose which columns to display. #64614
  • The DB Console now shows information about the region of a node on the Transactions Page. #64996
  • Added missing formatting for some event types displayed in the DB Console. #65717
  • Changed the default event formatting to appear less alarming to users. #65717
  • The Statement Details Page now displays information about nodes and regions a statement was executed on. #65126
  • The Statements and Transactions pages now display information about nodes and regions a statement was executed on. #65126
  • Changed time format on metrics events to 24-hour UTC time. #66277
  • Removed styling width calculation from multiple bars. #66734
  • Added a new chart showing the latency of establishing a new SQL connection, including the time spent on authentication. #66625
  • The KV transaction restarts chart was moved from the Distributed metrics to the SQL Dashboard to be close to the Open SQL Transactions chart for more prominent visibility. #66973
  • Fixed mislabelled tooltips on the Transactions and Transaction Details pages in DB console. #66605
  • The DB Console now uses dotted underline on text that contains a tooltip. The 'i' icon was removed. #67023
  • Added "" to whitespace application names on filter selection on the Statements and Transactions pages. #66967
  • Added a new Overload dashboard that groups metrics that are useful for admission control. #66595
  • The SQL Statement Contention Time chart is surfaced more prominently on the SQL Dashboard. #66969
  • Added a Full Table/Index Scans time series chart on the SQL Dashboard. #66972
  • The 'threads' debugging page, previously used to inspect RocksDB threads, has been removed. #67389
  • Fixed a color mismatch ont he node status badge on the Cluster Overview page. #68049
  • Added a CES survey link component to support being able to get client feedback. #68429
  • The Transaction Details Page's SQL box now shows all of a transaction's statements in the order that they were executed. The Transaction Details Page also no longer displays statement statistics. #68447
  • Updated the Databases Page in the DB Console to bring them into alignment with our modern UX. #68390
  • The "Logical Plan" tab in the DB Console has been renamed "Explain Plan", and the displayed plan format has been updated to match the output of the EXPLAIN command in the SQL shell. Global EXPLAIN properties have been added to the logical plan in the DB Console which were previously missing. The EXPLAIN format shown below should now be reflected in the DB Console:

      distribution: full
      vectorized: true
      • hash join
      │ estimated row count: 503
      │ equality: (rider_id) = (id)
      │
      ├── • scan
      │     estimated row count: 513 (100% of the table; stats collected 9 seconds ago)
      │     table: rides@primary
      │     spans: FULL SCAN
      │
      └── • scan
            estimated row count: 50 (100% of the table; stats collected 1 minute ago)
            table: users@primary
            spans: FULL SCAN
    

    #68566

  • Changed date times on the Jobs Page to use 24-hour UTC. #68916

  • Added admission control metrics to the Overload dashboard. #68595

  • Hid node and region information on the new tenant plan (serverless/free tier). #69444

  • Added a new date range selector component to the DB Console's Statements and Transactions pages with the ability to show historical data. The default date range is set to 1 hour ago, and is used as the value when users reset the date range. #68831

  • Fixed tooltip text on the Statements and Transactions pages to use the correct setting diagnostics.sql_stat_reset.interval instead of the previous value, diagnostics.reporting.interval. #69577

Bug fixes

  • Fixed a bug where cluster backups with revision history may have included dropped descriptors in the "current" snapshot of descriptors on the cluster. #68983
  • Users can now only scroll in the content section of the Transactions Page, Statements Page, and Sessions Page. #69620
  • Previously, when using ALTER PRIMARY KEY on a regional by row table, the copied unique index from the old primary key would not have the correct zone configurations applied. This is now resolved, but users who encountered this bug should re-create the index. #69681
  • Fixed a bug that caused incorrect evaluation of the IN operator when the tuple on the right-hand side of the operator included a subquery, like a IN ('foo', (SELECT s FROM t), 'bar'). #69651
  • Fixed a bug where previously an internal error or a crash could occur when some crdb_internal builtin functions took string-like type arguments (e.g. name). #69698
  • Previously, users would receive a panic message when the log parser failed to extract log file formats. This has been replaced with a helpful error message. #69018
  • Fixed a bug to ensure that auxiliary tables used during cluster restore are garbage collected quickly afterwards. #67936
  • RESTORE will now correctly ignore dropped databases that may have been included in cluster backups with revision history. #68551
  • Fixed a bug that can cause prolonged unavailability due to lease transfer to a replica that may be in need of a raft snapshot. #69696
  • Fixed a bug where resuming an active schedule would always reset its next run time. This was sometimes undesirable with schedules that had a first_run option specified. #69571
  • Fixed a regression in statistics estimation in the optimizer for very large tables. The bug, which has been present since v20.2.14 and v21.1.7, could cause the optimizer to severely underestimate the number of rows returned by an expression. #69711
  • The raft.commandsapplied metric is now populated again. #69857
  • Fixed a bug where previously the store rebalancer was unable to rebalance leases for hot ranges that received a disproportionate amount of traffic relative to the rest of the cluster. This often led to prolonged single node hotspots in certain workloads that led to hot ranges. #65379
  • Added protection to IMPORT INTO to guard against concurrent type changes on user-defined types referenced by the target table. #69674
  • DNS unavailability during range 1 leaseholder loss will no longer cause significant latency increases for queries and other operations. b6fb0e626
  • Previously, using SET in a transaction and having the transaction retry internally could result in the previous session variable being unused. For example:

    SET intervalstyle = 'postgres';
    BEGIN;
    do something with interval -- (1) SET intervalstyle = 'iso_8601';
    do something with interval -- (2) COMMIT;
    

    If the transaction retries at COMMIT, when attempting to re-run (1), we would have an interval style 'iso_8601' instead of the original 'postgres' value. This has now been resolved. #69554

  • Envelope schema in Avro registry now honors schema_prefix and full_table_name. #60946

  • Previously, a drop column would cause check constraints that are currently validating to become active on a table. This has been fixed. #62257

  • OpenTracing traces now work correctly across nodes. The client and server spans for RPCs are once again part of the same trace instead of the server erroneously being a root span. #62703

  • Fixed a bug which prevented cockroach debug doctor zipdir from validating foreign key information represented in the un-upgraded deprecated format. #62829

  • Schema changes that include both a column addition and primary key change in the same transaction no longer result in a failed changefeed. #63217

  • Fixed a bug whereby transient clock synchronization errors could result in permanent schema change failures. #63671

  • Fixed a bug of debug backup export caused by inspecting table with multiple ranges. #63678

  • Fixed a performance regression for very simple queries. #64225

  • Fixed a bug that prevented transactions which lasted longer than 5 minutes and then performed writes from committing. #63725

  • Added a fix to prevent a rare crash that could occur when reading data from interleaved tables. #64374

  • Fixed an "index out of range" internal error with certain simple queries. #65018

  • Hosts listed with the connect --join command-line flag now default to port 26257 (was 443). This matches the existing behavior of start --join. #65014

  • CockroachDB now shows a correct error message if it tries to parse an interval that is out of range. #65377

  • Fixed a bug where NaN coordinates could make ShortestLine/LongestLine panic. #65445

  • Fixed a bug whereby using an enum value as a placeholder in an AS OF SYSTEM TIME query preceding a recent change to that enum could result in a fatal error. #65620

  • Fixed a race condition where transaction cleanup would fail to take into account ongoing writes and clean up their intents. #65592

  • The Transactions Page now shows the correct value for implicit transactions. #65126

  • Fixed a bug where TIMETZ values would not display appopriately in the CLI. #65321

  • Fixed a bug which prevented adding self-referencing FOREIGN KEY constraints in the NOT VALID state. #65871

  • The cockroach mt start-sql command with a nonexistent tenant ID now returns an error. Previously, it would crash and poison the tenant ID for future usage. #65683

  • CockroachDB now correctly handles errors during the pgwire extended protocol. Specifically, when an error is detected while processing any extended protocol message, an ErrorResponse is returned; then the server reads and discards messages until a Sync command is received from the client. This matches the PostgreSQL behavior. #57590

  • Fixed a bug where owners of a table have privileges to SELECT from it, but would return false on has_*_privilege-related functions. #65766

  • Added a more accurate error message for restoring AOST before GC TTL. #66025

  • When a non-SQL CLI command (e.g., cockroach init) was invoked with the --url flag and the URL did not contain a sslmode parameter, the command was incorrecting defaulting to operate as if --insecure was specified. This has been corrected. #65460

  • The URLs printed out by the client-side command \demo ls in cockroach demo now properly include the workload database name, if any was created. #65460

  • Fixed a bug where a superfluous unique constraint would be added to a table during a primary key change when the primary key change specified a new primary key constraint that involved the same columns in the same directions. #66225

  • Properly populated crdb_internal.job's coordinator_id field, which had not been properly populated since v20.2. #61417

  • Fixed a bug in SHOW ZONE CONFIGURATIONS where long constraints fields may show \n characters. #69470

  • CockroachDB could previously error out when a query involving tuples with collated strings and NULLs was executed in a distributed manner. This is now fixed. #66337

  • Fixed a bug with PostgreSQL compatibility where dividing an interval by a number would round to the nearest Microsecond instead of always rounding down. #66345

  • Previously, rows treated as tuples in functions such as row_to_json may have had their keys normalized to lowercase, instead of being preserved in the original casing as per PostgreSQL. This is now fixed. #66535

  • CockroachDB could previously crash when executing EXPLAIN (VEC) on some mutations. This is now fixed. #66569

  • Fixed a bug that caused a panic for window functions operating in GROUPS mode with OFFSET PRECEDING start and end bounds. #66582

  • Fixed a bug on the chart catalog admin API. #66645

  • Fixed a deadlock during adminVerifyProtectedTimestamp. #66760

  • Fixed a bug where a substring of a linestring with the same points would return EMPTY instead of a POINT with the repeated point. #66738

  • Fixed a bug where it was possible for a linestring returned from ST_LineSubString to have repeated points. #66738

  • Fixed an error occurring when executing OPERATOR(pg_catalog.~). #66865

  • Fixed ST_LineSubstring for LINESTRING EMPTY panicking instead of returning null. #66936

  • A migration will be run when users upgrade to v21.2 (specifically v21.2.14). This migration fixes any privilege descriptors that were corrupted from the fallout of the ZONECONFIG/USAGE bug on tables and databases after upgrading from v20.1 to v20.2 (#65010) and those that were corrupted after converting a database to a schema (#65697). #66495

  • Fixed a bug where job leases might be revoked due to a transient network error. #67075

  • Fixed a case where IMPORT would panic when parsing geospacial schemas with spacial index tuning parameters. In particular, this bug could be triggered by specifying the fillfactor option, or setting the autovacuum_enabled option to false. #66899

  • Intent garbage collection no longer waits for or aborts running transactions. #65001

  • The Statements Page now properly displays the Statement Time label on the column selector. #67327

  • Avro feeds now support special decimals like Infinity #66870

  • Fixed a typo on the Network tooltip on the Statements Page. #65126

  • PostgreSQL-style intervals now print a + sign for day units if the year/month unit preceding was negative (e.g., -1 year -2 months 2 days will now print as -1 year -2 months +2 days). #67210

  • SQL Standard intervals will omit the day value if the day value is 0. #67210

  • Added partial redactability to log SQL statements. This change provides greater visibility to SQL usage in the logs, enabling greater ability to troubleshoot. #66359

  • Fixed a bug in jobs where failures to write to the jobs table could prevent subsequent adoption of a job until the previous node dies or the job is paused. #67671

  • Fixed a minor resource leak that occurs when a RESTORE is run. #67478

  • Previously, an unavailable node that started draining or decommissioning would be treated as live and thus could receive a lease transfer, leading to the range becoming unavailable. This has been fixed. #67319

  • INSERT and UPDATE statements which operate on larger rows are split into batches using the sql.mutations.mutation_batch_byte_size setting #67537

  • Fixed a bug that could cause the min window function to incorrectly return null when executed with a non-default EXCLUDE clause, because min and max did not ignore null input values. #68025

  • Fixed a bug that could cause a panic when a window function was executed in RANGE mode with OFFSET PRECEDING or OFFSET FOLLOWING on a datetime column. #68013

  • SHOW search_path will now properly quote $user. #68034

  • Fixed a bug where restores of data with multiple column families could be split illegally (within a single SQL row). This could result in temporary data unavailability until the ranges on either side of the invalid split were merged. #67497

  • Fixed a bug that was introduced in v21.1.5, which prevented nodes from decommissioning in a cluster if there were multiple nodes intermittently missing their liveness heartbeats. #67714

  • Fixed a bug where the schedules.backup.succeeded and schedules.backup.failed metrics would sometimes not be updated. #67855

  • Previously, a SHOW GRANTS ON TYPE db.public.typ command would not correctly show the grants if the current database was not db. This is now fixed. #68137

  • Fixed a bug which permitted the dropping of enum values which were in use in index predicates or partitioning values. #68257

  • Fixed a bug that could cause the min and max window functions to return incorrect results when the window frame for a row was smaller than the frame for the previous row. #68314

  • Previously, parsing a date from a string would incorrectly assuming YMD format instead of the MDY format if the date was formatted using the two digit format for year: "YY-MM-DD" instead of "MM-DD-YY". This has been resolved. However, if you relied on having two-digit years as YY-MM-DD, prepend 0s at the front until it is at least 3 digits; e.g., "15-10-15" for the year 15 should read "015-10-15". #68093

  • Fixed a bug where migration jobs might run and update a cluster version before the cluster was ready for the upgrade. The bug could result in many extra failed migration jobs. #67281

  • IMPORT PGDUMP with a UDT would result in a nil pointer exception. It now fails gracefully. #67994

  • Cascaded drop of views could run into 'table ...is already being dropped' errors incorrectly. This is now fixed. #68601

  • Fixed a bug in which an ENUM-type value could be dropped despite it being referenced in a table's CHECK expression. #68666

  • Fixed an oversight in the data generator for TPC-H which was causing a smaller number of distinct values to be generated for p_type and p_container in the part table than the spec called for. #68699

  • Fixed a bug that created non-partial unique constraints when a user attempted to create a partial unique constraint in ALTER TABLE statements. #68629

  • Fixed a bug where encryption-at-rest registry would accumulate nonexistent file entries forever, contributing to the filesystem operation's latency on the store. #68394

  • Fixed a bug where IMPORT would incorrectly reset its progress upon resumption. #68337

  • Previously, given a table with hash-sharded indexes, the output of SHOW CREATE TABLE was not round-trippable: executing the output would not create an identical table. This has been fixed by showing CHECK constraints that are automatically created for these indexes in the output of SHOW CREATE TABLE. The bug had existed since v21.1. #69001

  • Importing tables via IMPORT PGDUMP or IMPORT MYSQL should now honor cluster setting sql.defaults.default_int_size and session variable default_int_size. #68902

  • Fixed a bug with cardinality estimation in the optimizer that was introduced in v21.1.0. This bug could cause inaccurate row count estimates in queries involving tables with a large number of null values. As a result, it was possible that the optimizer could choose a suboptimal plan. This issue has now been fixed. #69070

  • Fixed internal or "invalid cast" errors in some cases involving cascading updates. #69126

  • Previously, CockroachDB could return an internal error when performing the streaming aggregation in some edge cases, and this is now fixed. The bug had been present since v21.1. #69122

  • Introduced checks on Statements and Transactions pages so that the managed repo can update the cluster-ui version. #69205

  • When using COPY FROM .. BINARY, the correct format code will now be returned. #69066

  • Previously, COPY FROM ... BINARY would return an error if the input data was split across different messages. This is now fixed. #69066

  • Previously, COPY FROM ... CSV would require each CopyData message to be split at the boundary of a record. This was a bug since the COPY protocol would allow messages to be split at arbitrary points. This is now fixed. #69066

  • Previously, COPY FROM ... CSV did not correctly handle octal byte escape sequences such as \011 when using a BYTEA column. This is now fixed. #69066

  • Fixed a bug that caused internal errors with set operations, like UNION, and columns with tuple types that contained constant NULL values. This bug was introduced in v20.2. #68627

  • Fixed a crash when using the cockroach backup debug tool. #69251

  • Previously, after a temporary node outage, other nodes in the cluster could fail to connect to the restarted node due to their circuit breakers not resetting. This would manifest in the logs via messages "unable to dial nXX: breaker open", where XX is the ID of the restarted node. (Note that such errors are expected for nodes that are truly unreachable, and may still occur around the time of the restart, but for no longer than a few seconds). This is now fixed. #69405

  • Previously, table stats collection issued via an ANALYZE or CREATE STATISTICS statement without specifying AS OF SYSTEM TIME option could run into flow: memory budget exceeded. This has been fixed. #69483

  • Fixed a bug where IMPORT internal retries (i.e., due to node failures) might not pick up the latest progress updates. #68218

  • Fixed a bug where the summary displayed after an IMPORT command would sometimes be inaccurate due to retries. #68218

  • Long-running ANALYZE statements will no longer result in GC TTL errors. #68929

Performance improvements

  • Intent resolution for transactions that write many intents such that we track intent ranges, for the purpose of intent resolution, is much faster (potentially 100x) when using the separated lock table. #66268
  • Updated the optimizer cost model so that all else being equal, the optimizer prefers plans in which LIMIToperators are pushed as far down the tree as possible. This can reduce the number of rows that need to be processed by higher operators in the plan tree, thus improving performance. #69688
  • QPS-based replica rebalancing is now aware of different constraints placed on different replication zones. This means that heterogeneously loaded replication zones (for instance, regions) will achieve a more even distribution of QPS within the stores inside each of these zone. #65379
  • Some additional expressions using the <@ (contained by) and @> (contains) operators now support index-acceleration with the indexed column on either side of the expression. #61219
  • Some additional expressions using the <@ (contained by) and @> (contains) operators now support index-acceleration with the indexed column on either side of the expression. #61817
  • Columns that are held constant in partial index predicates can now be produced when scanning the partial index. This eliminates unnecessary primary index joins to retrieve those constant columns in some queries, resulting in lower latency. #62406
  • Inverted joins using <@ (contained by) and @> (contains) operators are now supported with the indexed column on either side of the expression. #62626
  • The optimizer now folds functions to NULL when the function does not allow NULL arguments and one of the arguments is a NULL constant. As a result, more efficient query plans will be produced for queries with these types of function calls. #62924
  • Expressions with the -> (fetch val) operator on the left side of either <@ (contained by) or @> (contains) now support index-acceleration. #63048
  • SQL will now emit GetRequests when possible to KV instead of always emitting ScanRequests. This manifests as a modest performance improvement for some workloads. #61583
  • Set operations (UNION, UNION ALL, INTERSECT, INTERSECT ALL, EXCEPT, and EXCEPT ALL) can now maintain ordering if both inputs are ordered on the desired output ordering. This can eliminate unnecessary sort operations and improve performance. #63805
  • Reduced memory usage in some write-heavy workloads. #64222
  • Increased the intelligence of the optimizer around the ability of a scan to provide certain requested orderings when some of the columns are held constant. This can eliminate unneeded sort operations in some cases, resulting in improved performance. #64254
  • Increased the intelligence of the optimizer around orderings that can be provided by certain relational expressions when some columns are constant or there are equalities between columns. This can allow the optimizer to plan merge joins, streaming group bys, and streaming set operations in more cases, resulting in improved performance. #64501
  • Increased the intelligence of the optimizer around orderings that can be provided by certain relational expressions when there are equalities between columns. This can allow the optimizer to remove unnecessary sort operations in some cases, thus improving performance. #64593
  • Improved the performance for distributed queries that need to send a lot of data of certain datatypes across the network. #64169
  • Peak memory usage in the lock table is now significantly reduced. Runaway CPU usage due to wasted quadratic time complexity in clearing unclearable locks is addressed. #64102
  • The selectivity of query filters with OR expressions is now calculated more accurately during query optimization, improving query plans in some cases. #64886
  • Validation of a new UNIQUE index in a REGIONAL BY ROW table no longer requires an inefficient and memory-intensive hash aggregation query. The optimizer can now plans the validation query so that it uses all streaming operations, which are much more efficient. #65355
  • A limited scan now checks for conflicting locks in an optimistic manner, which means it will not conflict with locks (typically unreplicated locks) that were held in the scan's full spans, but were not in the spans that were scanned until the limit was reached. This behavior can be turned off by changing the value of the cluster setting kv.concurrency.optimistic_eval_limited_scans.enabled to false. #58670
  • Queries that produce a lot of rows in the result usually will now run faster when executed via the vectorized execution engine. #65289
  • Inner, Left, and Semi joins involving REGIONAL BY ROW tables can now take advantage of locality-optimized search. This optimization allows lookup joins to avoid communicating with remote nodes if a lookup is known to produce at most one match per input row, and all matches are found locally. This can reduce query latency. #65784
  • Improved the performance of has_table_privilege by using an internal cache for performing privilege lookups. #65766
  • Improved the performance of has_any_column_privilege by removing some internal queries. #65766
  • Improved the performance of has_column_privilege by removing excessive queries. #65766
  • When admission control is enabled, work sent to the KV layer is subject to admission control that takes into account write overload in the storage engines. #65850
  • Regexp expressions that restrict values to a prefix (e.g., x ^ '^foo') now result in better plans if there is a suitable index. #66441
  • The optimizer can now create query plans that use streaming set operations, even when no ordering is required by the query. Streaming set operations are more efficient than the alternative hash set operations, because they avoid the overhead of building a hash table. This can result in improved performance for queries containing the set operations UNION, INTERSECT, INTERSECT ALL, EXCEPT, and EXCEPT ALL. (UNION ALL does not benefit from this optimization.) #64953
  • A limited scan now checks for conflicting latches in an optimistic manner, which means it will not conflict with latches that were held in the scan's full spans, but were not in the spans that were scanned until the limit was reached. This behavior can be turned off (along with optimistic locking) by changing the value of the cluster setting kv.concurrency.optimistic_eval_limited_scans.enabled to false. #66059
  • The optimizer is now less likely to create query plans that require buffering a large number of rows in memory. This can improve performance by reducing memory pressure and reducing the likelihood that execution operators will need to spill to disk. #66559
  • Validation of a new partial UNIQUE index in a REGIONAL BY ROW table no longer requires an inefficient and memory-intensive hash aggregation query. The optimizer can now plan the validation query so that it uses all streaming operations, which are much more efficient. #66565
  • Fixed a performance regression that made the in-memory vectorized sorter slower than the row-engine sorter when the input had decimal columns. #66807
  • Increased the default value for the kv.transaction.max_intents_bytes cluster setting from 256KB to 4MB to improve transaction performance for large transactions at the expense of increased memory usage. Transactions above this size limit use slower cleanup mode for commits and aborts. #66859
  • Adjusted optimizer cost to include CPU cost of lookupExprs when used by lookup joins. When a lookup join is chosen we can use two strategies: a simple 1-column lookup or a more involved multi-column lookup that makes more efficient use of indexes but has higher CPU cost. This change makes the cost model reflect that extra cost. #66786
  • Improved the efficiency of validation for some partial unique indexes in REGIONAL BY ROW tables by improving the query plan to use all streaming operations. #67263
  • The latency of authenticating a user has been improved by adding a cache for lookups of authentication related information. #66919
  • Improved the optimizer's cardinality estimations for enum columns, including the crdb_region column in REGIONAL BY ROW tables, as well as all other columns with user-defined types. This may result in the optimizer choosing a better query plan in some cases. #67374
  • Eliminated a round-trip when running most jobs. #67671
  • The performance of queries returning many arrays has been improved. #66941
  • Improved concurrency control for heavily contended write queries outside of transactions that touch multiple ranges, reducing excessive aborts and retries. #67215
  • The optimizer can now decorrelate queries that have a limit on the right (uncorrelated) input of a lateral join when the limit is greater than one. #68299
  • Sort performance has been improved when sorting columns of type STRING, BYTES, or UUID. #67451
  • Lookup joins on partial indexes with virtual columns are no considered by the optimizer, resulting in more efficient query plans in some cases. #68568
  • The COCKROACHDB_REGISTRY file used for encryption-at-rest will be replaced with a COCKROACHDB_ENCRYPTION_REGISTRY, which can be written to in a more efficient manner. #67320
  • Reduce memory usage slightly during ANALYZE or CREATE STATISTICS statements. #69051
  • The optimizer more accurately costs streaming group-by operators. As a result, more efficient query plans should be chosen in some cases. #68922
  • If the query is executed locally and needs to perform reads from multiple remote leaseholders, those remote reads might now be done faster. This is especially likely for the case of locality optimized search when there is a local region miss. #68679
  • Improved the histogram construction logic so that histograms for columns with a large number of distinct values are more accurate. This can result in better cardinality estimates in the optimizer and enable the optimizer to choose better query plans. #68698
  • Locality-optimized search is now supported for scans that are guaranteed to return 100,000 keys or less. This optimization allows the execution engine to avoid visiting remote regions if all requested keys are found in the local region, thus reducing the latency of the query. #69395

Build changes

  • Added go-swagger dependency. Updated Makefile to call it to rebuild spec in docs/generated/swagger/, which will eventually be used for API docs. #62560

Contributors

This release includes 2750 merged PRs by 229 authors. We would like to thank the following contributors from the CockroachDB community:

  • AJ (first-time contributor)
  • Alan Acosta (first-time contributor)
  • Aleksandr Fedorov (first-time contributor)
  • Callum Neenan (first-time contributor)
  • Catherine J (first-time contributor)
  • David López
  • Eugene Kalinin
  • Ganeshprasad Biradar (first-time contributor)
  • Jane Xing (first-time contributor)
  • Janusz Marcinkiewicz (first-time contributor)
  • Jonathan Albrecht (first-time contributor)
  • Julien Levesy (first-time contributor)
  • Justin Lowery (first-time contributor)
  • K Rain Leander (first-time contributor)
  • Keith McClellan (first-time contributor)
  • Kumar Akshay
  • Lakshmi Kannan (first-time contributor)
  • Lauren Barker (first-time contributor)
  • Masahiro Ikeda (first-time contributor)
  • Max Neverov
  • Miguel Novelo (first-time contributor)
  • Mohammad Aziz (first-time contributor)
  • Mohit Agarwal (first-time contributor)
  • Nikola N (first-time contributor)
  • OrlovM (first-time contributor)
  • Rupesh Harode (first-time contributor)
  • Sam X Smith (first-time contributor)
  • Shivam Agrawal (first-time contributor)
  • Sumit Tembe (first-time contributor)
  • Tharun
  • Wilson Meng (first-time contributor)
  • Zhou Xing (first-time contributor)
  • Zijie Lu (first-time contributor)
  • aayush (first-time contributor)
  • ajstorm (first-time contributor)
  • auxten (first-time contributor)
  • e-mbrown (first-time contributor)
  • joesankey (first-time contributor)
  • kurokochin (first-time contributor)
  • linyimin (first-time contributor)
  • oeph (first-time contributor)
  • rharding6373 (first-time contributor)
  • seekingua (first-time contributor)
  • snyk-bot (first-time contributor)
  • yangxuan (first-time contributor)
  • zhangwei.95 (first-time contributor)
YesYes NoNo