cockroach start

This page explains the cockroach start command, which you use to start a new multi-node cluster or add nodes to an existing cluster.

If you need a simple single-node backend for app development, use cockroach start-single-node instead, and follow the best practices for local testing. For quick SQL testing, consider using cockroach demo to start a temporary, in-memory cluster with immediate access to an interactive SQL shell.

Node-level settings are defined by flags passed to the cockroach start command and cannot be changed without stopping and restarting the node. In contrast, some cluster-wide settings are defined via SQL statements and can be updated anytime after a cluster has been started.

Synopsis

Start a node to be part of a new multi-node cluster:

$ cockroach start <flags, including --join>

Initialize a new multi-node cluster:

$ cockroach init <flags>

Add a node to an existing cluster:

$ cockroach start <flags, including --join>

View help:

$ cockroach start --help
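As a rough illustration of how the synopsis commands fit together, the following sketch builds a shared --join list for three nodes and prints (rather than runs) the start command for each node, followed by the one-time init command. The hostnames and the certs directory name are hypothetical placeholders, and the flag set is a minimal assumption, not a complete production configuration.

```shell
# Hypothetical hostnames; replace with addresses that resolve in your environment.
NODES="node1.example.com node2.example.com node3.example.com"
PORT=26257

# Build the shared --join list (the same list is used for every node).
JOIN=""
for n in $NODES; do
  JOIN="${JOIN:+$JOIN,}$n:$PORT"
done

# Print (rather than run) the start command for each node.
for n in $NODES; do
  echo "cockroach start --certs-dir=certs --advertise-addr=$n:$PORT --join=$JOIN"
done

# Print the one-time initialization command, run against any one node.
echo "cockroach init --certs-dir=certs --host=node1.example.com:$PORT"
```

Printing the commands first makes it easy to confirm that every node receives an identical --join list before anything is started.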

Flags

The cockroach start command supports the following general-use, networking, security, and logging flags. Many flags have useful defaults that can be overridden by specifying the flags explicitly. If you specify flags explicitly, however, be sure to do so each time the node is restarted, as they will not be remembered. The one exception is the --join flag, which is stored in a node’s data directory. We still recommend specifying the --join flag every time, as this will allow nodes to rejoin the cluster even if their data directory was destroyed.

General

Flag	Description
`--attrs`	Arbitrary strings, separated by colons, specifying node capability, which might include specialized hardware or number of cores, for example: `--attrs=ram:64gb` These can be used to influence the location of data replicas. See for full details.
`--background`	Runs the node in the background. Control is returned to the shell only once the node is ready to accept requests, so this is recommended over appending `&` to the command. This flag is not available in Windows environments. Note: `--background` is suitable for writing automated test suites or maintenance procedures that need a temporary server process running in the background. It is not intended to be used to start a long-running server, because it does not fully detach from the controlling terminal. Consider using a service manager or a tool like daemon(8) instead. If you use `--background`, using `--pid-file` is also recommended. To gracefully stop the `cockroach` process, send the `SIGTERM` signal to the process ID in the PID file. To gracefully restart the process, send the `SIGHUP` signal.
`--cache`	The total size for the block cache, shared evenly if there are multiple storage devices. This can be a percentage (notated as a decimal or with `%`) or any bytes-based unit, for example: `--cache=.25` `--cache=25%` `--cache=1000000000 ----> 1000000000 bytes` `--cache=1GB ----> 1000000000 bytes` `--cache=1GiB ----> 1073741824 bytes` Note: If you use the `%` notation, you might need to escape the `%` sign when configuring CockroachDB through `systemd` service files. For this reason, it’s recommended to use the decimal notation instead. Note: The sum of `--cache`, `--max-sql-memory`, and `--max-tsdb-memory` should not exceed 75% of the memory available to the `cockroach` process. Default: `256 MiB` The default cache size is reasonable for local development clusters. For production deployments, set this to 25% or higher. Increasing the cache size generally improves the node’s read performance. The block cache holds uncompressed blocks of persisted in memory. If a read misses within the block cache, the storage engine reads the file via the operating system’s page cache, which may hold the relevant block in-memory in its compressed form. Otherwise, the read is served from the storage device. The block cache fills to the configured size and is then recycled using a least-recently-used (LRU) policy. Refer to for more details. Production systems should always configure this setting.
`--clock-device`	Enable CockroachDB to use a PTP hardware clock when querying the current time. The value is a string that specifies the clock device to use. For example: `--clock-device=/dev/ptp0` Note: This is supported on Linux only and may be needed in cases where the host clock is unreliable or prone to large jumps (e.g., when using vMotion).
`--cluster-name`	A string that specifies a cluster name. This is used together with `--join` to ensure that all newly created nodes join the intended cluster when you are running multiple clusters. Note: If this is set, , , , and the `cockroach debug` commands must specify either `--cluster-name` or `--disable-cluster-name-verification` in order to work.
`--disable-cluster-name-verification`	On clusters for which a cluster name has been set, this flag paired with `--cluster-name` disables the cluster name check for the command. This is necessary on existing clusters, when setting a cluster name or changing the cluster name: Perform a rolling restart of all nodes and include both the new `--cluster-name` value and `--disable-cluster-name-verification`, then a second rolling restart with `--cluster-name` and without `--disable-cluster-name-verification`.
`--external-io-dir`	The path of the external IO directory with which the local file access paths are prefixed while performing backup and restore operations using local node directories or NFS drives. If set to `disabled`, backups and restores using local node directories and NFS drives, as well as , are disabled. Default: `extern` subdirectory of the first configured `store`. To set the `--external-io-dir` flag to the locations you want to use without needing to restart nodes, create symlinks to the desired locations from within the `extern` directory.
`--listening-url-file`	The file to which the node’s SQL connection URL will be written as soon as the node is ready to accept connections, in addition to being printed to the standard output. When `--background` is used, this happens before the process detaches from the terminal. This is particularly helpful in identifying the node’s port when an unused port is assigned automatically (`--port=0`).
`--locality`	Arbitrary key-value pairs that describe the location of the node. Locality might include country, region, availability zone, etc. A `region` tier must be included in order to enable . To specify locality in a file instead, refer to `--locality-file`. For more details, see Locality below.
`--locality-file`	A file that contains arbitrary key-value pairs that describe the location of the node, as an alternative to the `--locality` flag.
`--max-disk-temp-storage`	The maximum on-disk storage capacity available to store temporary data for SQL queries that exceed the memory budget (see `--max-sql-memory`). This ensures that JOINs, sorts, and other memory-intensive SQL operations are able to spill intermediate results to disk. This can be a percentage (notated as a decimal or with `%`) or any bytes-based unit (e.g., `.25`, `25%`, `500GB`, `1TB`, `1TiB`). Note: If you use the `%` notation, you might need to escape the `%` sign, for instance, while configuring CockroachDB through `systemd` service files. For this reason, it’s recommended to use the decimal notation instead. Also, if expressed as a percentage, this value is interpreted relative to the size of the first store. However, the temporary space usage is never counted towards any store usage; therefore, when setting this value, it’s important to ensure that the size of this temporary storage plus the size of the first store doesn’t exceed the capacity of the storage device. The temporary files are located in the path specified by the `--temp-dir` flag, or in the subdirectory of the first store (see `--store`) by default. Default: `32GiB`
`--max-go-memory`	The maximum soft memory limit for the Go runtime, which influences the behavior of Go’s garbage collection. Defaults to `--max-sql-memory x 2.25`, but cannot exceed 90% of the node’s available RAM. To disable the soft memory limit, set `--max-go-memory` to `0` (not recommended).
`--max-offset`	The maximum allowed clock offset for the cluster. If observed clock offsets exceed this limit, servers will crash to minimize the likelihood of reading inconsistent data. Increasing this value will increase the time to recovery of failures as well as the frequency of uncertainty-based read restarts. Nodes can run with different values for `--max-offset`, but only for the purpose of updating the setting across the cluster using a rolling upgrade. Default: `500ms`
`--max-sql-memory`	The maximum in-memory storage capacity available to store temporary data for SQL queries, including prepared queries and intermediate data rows during query execution. This can be a percentage (notated as a decimal or with `%`) or any bytes-based unit; for example: `--max-sql-memory=.25` `--max-sql-memory=25%` `--max-sql-memory=1000000000 ----> 1000000000 bytes` `--max-sql-memory=1GB ----> 1000000000 bytes` `--max-sql-memory=1GiB ----> 1073741824 bytes` The temporary files are located in the path specified by the `--temp-dir` flag, or in the subdirectory of the first store (see `--store`) by default. Note: If you use the `%` notation, you might need to escape the `%` sign (for instance, while configuring CockroachDB through `systemd` service files). For this reason, it’s recommended to use the decimal notation instead. Note: The sum of `--cache`, `--max-sql-memory`, and `--max-tsdb-memory` should not exceed 75% of the memory available to the `cockroach` process. Default: `25%` The default SQL memory size is suitable for production deployments but can be raised to increase the number of simultaneous client connections the node allows as well as the node’s capacity for in-memory processing of rows when using `ORDER BY`, `GROUP BY`, `DISTINCT`, joins, and window functions. For local development clusters with memory-intensive workloads, reduce this value to, for example, `128MiB` to prevent out-of-memory errors.
`--max-tsdb-memory`	Maximum memory capacity available to store temporary data for use by the time-series database to display metrics in the DB Console. Consider raising this value if your cluster is comprised of a large number of nodes where individual nodes have very limited memory available (e.g., under `8 GiB`). Insufficient memory capacity for the time-series database can constrain the ability of the DB Console to process the time-series queries used to render metrics for the entire cluster. This capacity constraint does not affect SQL query execution. This flag accepts numbers interpreted as bytes, size suffixes (e.g., `1GB` and `1GiB`) or a percentage of physical memory (e.g., `0.01`). Note: The sum of `--cache`, `--max-sql-memory`, and `--max-tsdb-memory` should not exceed 75% of the memory available to the `cockroach` process. Default: `0.01` (i.e., 1%) of physical memory or `64 MiB`, whichever is greater.
`--pid-file`	The file to which the node’s process ID will be written as soon as the node is ready to accept connections. When `--background` is used, this happens before the process detaches from the terminal. When this flag is not set, the process ID is not written to file.
`--store` `-s`	The file path to a storage device and, optionally, store attributes and maximum size. When using multiple storage devices for a node, this flag must be specified separately for each device, for example: `--store=/mnt/ssd01 --store=/mnt/ssd02` For more details, see Store below.
`--wal-failover`	Used to configure WAL failover on nodes with multiple stores. To enable WAL failover, pass `--wal-failover=among-stores`. To disable, pass `--wal-failover=disabled`.
`--spatial-libs`	The location on disk where CockroachDB looks for spatial libraries. Defaults: `/usr/local/lib/cockroach`, or a `lib` subdirectory of the CockroachDB binary’s current directory.
`--temp-dir`	The path of the node’s temporary store directory. On node start up, the location for the temporary files is printed to the standard output. Default: Subdirectory of the first store
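The notes on `--cache`, `--max-sql-memory`, and `--max-tsdb-memory` say their sum should not exceed 75% of the memory available to the cockroach process. A minimal sketch of that check, assuming all three values are given in decimal notation (the specific fractions below are illustrative, not recommendations):

```shell
# Hypothetical fractions of the memory available to the cockroach process.
CACHE=0.25
MAX_SQL_MEMORY=0.25
MAX_TSDB_MEMORY=0.01

# Sum the three fractions with awk, since POSIX shell arithmetic is integer-only.
TOTAL=$(awk -v c="$CACHE" -v s="$MAX_SQL_MEMORY" -v t="$MAX_TSDB_MEMORY" \
  'BEGIN { printf "%.2f", c + s + t }')

# awk exits 0 (success) only when the total is within the 75% guideline.
if awk -v total="$TOTAL" 'BEGIN { exit !(total <= 0.75) }'; then
  BUDGET_OK=yes
else
  BUDGET_OK=no
fi

echo "total=$TOTAL within_budget=$BUDGET_OK"
echo "cockroach start --cache=$CACHE --max-sql-memory=$MAX_SQL_MEMORY --max-tsdb-memory=$MAX_TSDB_MEMORY # ... other required flags go here"
```

Using decimal notation throughout also sidesteps the `%` escaping issue mentioned for `systemd` service files.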

Networking

Flag	Description
`--experimental-dns-srv`	When this flag is included, the node will first attempt to fetch SRV records from DNS for every name specified with `--join`. If a valid SRV record is found, that information is used instead of regular DNS A/AAAA lookups. This feature is experimental and may be removed or modified in a later version.
`--listen-addr`	The IP address/hostname and port to listen on for connections from other nodes and clients. For IPv6, use the notation `[...]`, e.g., `[::1]` or `[fe80::f6f2:::]`. This flag’s effect depends on how it is used in combination with `--advertise-addr`. For example, the node will also advertise itself to other nodes using this value if `--advertise-addr` is not specified. For more details, see . Default: Listen on all IP addresses on port `26257`; if `--advertise-addr` is not specified, also advertise the node’s canonical hostname to other nodes
`--advertise-addr`	The IP address/hostname and port to tell other nodes to use. If using a hostname, it must be resolvable from all nodes. If using an IP address, it must be routable from all nodes; for IPv6, use the notation `[...]`, e.g., `[::1]` or `[fe80::f6f2:::]`. This flag’s effect depends on how it is used in combination with `--listen-addr`. For example, if the port number is different than the one used in `--listen-addr`, port forwarding is required. For more details, see . Default: The value of `--listen-addr`; if `--listen-addr` is not specified, advertises the node’s canonical hostname and port `26257`
`--advertise-http-addr`	The IP address/hostname and port to advertise for DB Console HTTP requests when . If omitted, the hostname is inherited from the operating system hostname or the hostname from `--advertise-addr`. If the port is omitted, it defaults to `8080` and is never inherited from `--advertise-addr`.
`--http-addr`	The IP address/hostname and port on which to listen for DB Console HTTP requests. For IPv6, use the notation `[...]`, e.g., `[::1]:8080` or `[fe80::f6f2:::]:8080`. Default: Listen on the address specified in `--listen-addr` and on port `8080`
`--locality-advertise-addr`	The IP address/hostname and port to tell other nodes in specific localities to use to connect to this node. This flag is useful when running a cluster across multiple networks, where nodes in a given network have access to a private or local interface while nodes outside the network do not. In this case, you can use `--locality-advertise-addr` to tell nodes within the same network to prefer the private or local address to improve performance and use `--advertise-addr` to tell nodes outside the network to use another address that is reachable from them. However, do not include addresses or hostnames that do not resolve to this node, because this will cause connection failures when other nodes attempt to connect to this node. This flag relies on nodes being started with the `--locality` flag and uses the `locality@address` notation, for example: `--locality-advertise-addr=region=us-west@10.0.0.0:26257` For more details, refer to the Start a multi-node cluster across private networks example.
`--sql-addr`	The IP address/hostname and port on which to listen for SQL connections from clients. For IPv6, use the notation `[...]`, e.g., `[::1]` or `[fe80::f6f2:::]`. This flag’s effect depends on how it is used in combination with `--advertise-sql-addr`. For example, the node will also advertise itself to clients using this value if `--advertise-sql-addr` is not specified. Default: The value of `--listen-addr`; if `--listen-addr` is not specified, advertises the node’s canonical hostname and port `26257` For an example, see Start a cluster with separate RPC and SQL networks
`--advertise-sql-addr`	The IP address/hostname and port to tell clients to use. If using a hostname, it must be resolvable from all nodes. If using an IP address, it must be routable from all nodes; for IPv6, use the notation `[...]`, e.g., `[::1]` or `[fe80::f6f2:::]`. This flag’s effect depends on how it is used in combination with `--sql-addr`. For example, if the port number is different than the one used in `--sql-addr`, port forwarding is required. Default: The value of `--sql-addr`; if `--sql-addr` is not specified, advertises the value of `--listen-addr`
`--join` `-j`	The host addresses that connect nodes to the cluster and distribute the rest of the node addresses. These can be IP addresses or DNS aliases of nodes. When starting a cluster in a single region, specify the addresses of 3-5 initial nodes. When starting a cluster in multiple regions, specify more than 1 address per region, and select nodes that are spread across failure domains. Then run the `cockroach init` command against any of these nodes to complete cluster startup. See the example below for more details. Use the same `--join` list for all nodes to ensure that the cluster can stabilize. Do not list every node in the cluster, because this increases the time for a new cluster to stabilize. Note that these are best practices; it is not required to restart an existing node to update its `--join` flag. `cockroach start` must be run with the `--join` flag. To start a single-node cluster, use `cockroach start-single-node` instead.
`--socket-dir`	The directory path on which to listen for Unix domain socket connections from clients installed on the same Unix-based machine. For an example, see .
`--advertise-host`	Deprecated. Use `--advertise-addr` instead.
`--host`	Deprecated. Use `--listen-addr` instead.
`--port` `-p`	Deprecated. Specify port in `--advertise-addr` and/or `--listen-addr` instead.
`--http-host`	Deprecated. Use `--http-addr` instead.
`--http-port`	Deprecated. Specify port in `--http-addr` instead.
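The address flags above require bracket notation for IPv6 literals. The following sketch shows one way to assemble address values consistently; `format_addr` is a hypothetical helper (not part of the cockroach CLI), and the IPv6 address and hostname are placeholders.

```shell
# format_addr HOST PORT: print HOST:PORT, wrapping IPv6 literals in brackets.
# This is a hypothetical helper for illustration only.
format_addr() {
  case "$1" in
    *:*) printf '[%s]:%s\n' "$1" "$2" ;;  # contains a colon, so treat as IPv6
    *)   printf '%s:%s\n' "$1" "$2" ;;    # IPv4 address or hostname
  esac
}

# Placeholder addresses; 2001:db8::/32 is reserved for documentation.
LISTEN_ADDR=$(format_addr "2001:db8::10" 26257)
ADVERTISE_ADDR=$(format_addr "node1.example.com" 26257)
HTTP_ADDR=$(format_addr "2001:db8::10" 8080)

# Print (rather than run) the resulting command.
echo "cockroach start --listen-addr=$LISTEN_ADDR --advertise-addr=$ADVERTISE_ADDR --http-addr=$HTTP_ADDR # ... other required flags go here"
```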

Security

Flag	Description
`--certs-dir`	The path to the certificate directory. The directory must contain valid certificates if running in secure mode. Default: `${HOME}/.cockroach-certs/`
`--insecure`	Note: The `--insecure` flag is intended for non-production testing only. Run in insecure mode, skipping all TLS encryption and authentication. If this flag is not set, the `--certs-dir` flag must point to valid certificates. Note the following risks: An insecure cluster is open to any client that can access any node’s IP addresses; client connections must also be made insecurely; any user, even `root`, can log in without providing a password; any user, connecting as `root`, can read or write any data in your cluster; there is no network encryption or authentication, and thus no confidentiality. Default: `false`
`--accept-sql-without-tls`	This flag (in ) allows you to connect to the cluster using a SQL user’s password without . When connecting using the built-in SQL client, .
`--cert-principal-map`	A comma-separated list of `cert-principal:db-principal` mappings used to map the certificate principals to IP addresses, DNS names, and SQL users. This allows the use of certificates generated by Certificate Authorities that place restrictions on the contents of the `commonName` field. For usage information, see .
`--enterprise-encryption`	This optional flag specifies the encryption options for one of the stores on the node. If multiple stores exist, the flag must be specified for each store. This flag takes a number of options. For a complete list of options, and usage instructions, see .
`--external-io-disable-http`	This optional flag disables external HTTP(S) access (as well as custom HTTP(S) endpoints) when performing bulk operations (e.g, ). This can be used in environments where you cannot run a full proxy server. If you want to run a proxy server, you can start CockroachDB while specifying the `HTTP(S)_PROXY` environment variable.
`--external-io-disable-implicit-credentials`	This optional flag disables the use of implicit credentials when accessing external cloud storage services for bulk operations (e.g, ).
`--node-cert-distinguished-name`	A string with a comma separated list of distinguished name (DN) mappings in `{attribute-type}={attribute-value}` format in accordance with RFC4514 for the . If this flag is set, this needs to be an exact match with the DN subject in the client certificate provided for the `node` user. By exact match, we mean that the order of attributes in the argument to this flag must match the order of attributes in the DN subject in the certificate. For more information, see .
`--root-cert-distinguished-name`	A string with a comma separated list of distinguished name (DN) mappings in `{attribute-type}={attribute-value}` format in accordance with RFC4514 for the . If this flag is set, this needs to be an exact match with the DN subject in the client certificate provided for the `root` user. By exact match, we mean that the order of attributes in the argument to this flag must match the order of attributes in the DN subject in the certificate. For more information, see .
`--tls-cipher-suites`	A comma-separated list of TLS cipher suites to allow for SQL, RPC, and HTTP connections, limited to those . Connections using disallowed cipher suites will be rejected during the TLS handshake and logged to `cockroach.log`. Look for log messages containing: `presented cipher ... not in allowed cipher suite list`. Example usage: `--tls-cipher-suites=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256,TLS_AES_128_GCM_SHA256`.

Locality

The --locality flag accepts arbitrary key-value pairs that describe the location of the node. Locality should include a region key-value if you are using CockroachDB’s multi-region capabilities. Depending on your deployment you can also specify country, availability zone, etc. The key-value pairs should be ordered into locality tiers from most inclusive to least inclusive (e.g., region before availability zone as in region=eu-west-1,zone=eu-west-1a), and the keys and order of key-value pairs must be the same on all nodes. It’s typically better to include more pairs than fewer.

Specifying a region with a region tier is required in order to enable CockroachDB’s multi-region capabilities.
CockroachDB spreads the replicas of each piece of data across as diverse a set of localities as possible, with the order determining the priority. Locality can also be used to influence the location of data replicas in various ways using high-level or low-level .
When there is high latency between nodes (e.g., cross-availability zone deployments), CockroachDB uses locality to move range leases closer to the current workload, reducing network round trips and improving read performance, also known as “follow-the-workload”. In a deployment across more than 3 availability zones, however, to ensure that all data benefits from “follow-the-workload”, you must increase your replication factor to match the total number of availability zones.
Locality is also a prerequisite for using the , , and Enterprise features.

Example

The following shell commands use the --locality flag to start 9 nodes to run across 3 regions: us-east-1, us-west-1, and europe-west-1. Each region’s nodes are further spread across different availability zones within that region.

This example follows the conventions required to use CockroachDB’s multi-region capabilities.

Nodes in us-east-1:

cockroach start --locality=region=us-east-1,zone=us-east-1a # ... other required flags go here

cockroach start --locality=region=us-east-1,zone=us-east-1b # ... other required flags go here

cockroach start --locality=region=us-east-1,zone=us-east-1c # ... other required flags go here

Nodes in us-west-1:

cockroach start --locality=region=us-west-1,zone=us-west-1a # ... other required flags go here

cockroach start --locality=region=us-west-1,zone=us-west-1b # ... other required flags go here

cockroach start --locality=region=us-west-1,zone=us-west-1c # ... other required flags go here

Nodes in europe-west-1:

cockroach start --locality=region=europe-west-1,zone=europe-west-1a # ... other required flags go here

cockroach start --locality=region=europe-west-1,zone=europe-west-1b # ... other required flags go here

cockroach start --locality=region=europe-west-1,zone=europe-west-1c # ... other required flags go here

For another multi-region example, see Start a multi-region cluster. For more information about how to use CockroachDB’s multi-region capabilities, see the .

Load-based lease rebalancing in uneven latency deployments

When nodes are started with the --locality flag, CockroachDB attempts to place the replica lease holder (the replica that client requests are forwarded to) on the node closest to the source of the request. This means as client requests move geographically, so too does the replica lease holder. However, you might see increased latency caused by a consistently high rate of lease transfers between datacenters in the following case:

Your cluster runs in datacenters which are very different distances away from each other.
Each node was started with a single tier of --locality, e.g., --locality=datacenter=a.
Most client requests get sent to a single datacenter because that’s where all your application traffic is.

To detect if this is happening, open the DB Console, select the Queues dashboard, hover over the Replication Queue graph, and check the Leases Transferred / second data point. If the value is consistently larger than 0, you should consider stopping and restarting each node with additional tiers of locality to improve request latency. For example, let’s say that latency is 10ms from nodes in datacenter A to nodes in datacenter B but is 100ms from nodes in datacenter A to nodes in datacenter C. To ensure A’s and B’s relative proximity is factored into lease holder rebalancing, you could restart the nodes in datacenter A and B with a common region, --locality=region=foo,datacenter=a and --locality=region=foo,datacenter=b, while restarting nodes in datacenter C with a different region, --locality=region=bar,datacenter=c.
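The datacenter scenario above can be sketched as a small mapping from datacenter to locality tiers. `locality_for` is a hypothetical helper, and the region names foo and bar are the placeholders used in the text; the loop prints the restart commands rather than running them.

```shell
# Map each datacenter to a two-tier locality. Datacenters a and b are close to
# each other (about 10ms apart), so they share region "foo"; c is far away
# (about 100ms), so it gets region "bar".
locality_for() {
  case "$1" in
    a|b) echo "region=foo,datacenter=$1" ;;
    c)   echo "region=bar,datacenter=$1" ;;
    *)   echo "unknown datacenter: $1" >&2; return 1 ;;
  esac
}

# Print the restart command for a node in each datacenter.
for dc in a b c; do
  echo "cockroach start --locality=$(locality_for "$dc") # ... other required flags go here"
done
```

Keeping the mapping in one place helps ensure that the keys and their order are the same on all nodes.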

Storage engine

The --storage-engine flag is used to choose the storage engine used by the node. Note that this setting applies to all stores on the node, including the temp store. As of v21.1, CockroachDB always uses the Pebble storage engine. As such, pebble is the default and only option for the --storage-engine flag.

Store

The --store flag allows you to specify details about a node’s storage. To start a node with multiple disks or SSDs, provide a separate --store flag for each disk when starting the cockroach process on the node. For more details about stores, see .

If you start a node with multiple --store flags, it is not possible to scale back down to only using a single store on the node. Instead, you must decommission the node and start a new node with the updated --store.

The --store flag supports the following fields. Note that commas are used to separate fields, and so are forbidden in all field values.

In-memory storage is not suitable for production deployments at this time.

Use dedicated volumes for the CockroachDB store. Do not share the store volume with any other I/O activity.

Field	Description
`type`	For in-memory storage, set this field to `mem`; otherwise, leave this field out. The `path` field must not be set when `type=mem`.
`path`	The file path to the storage device. When not setting `attrs`, `size`, or `ballast-size`, the `path` field label can be left out: `--store=/mnt/ssd01` When any of those fields is set, however, the `path` field label must be used: `--store=path=/mnt/ssd01,size=20GB` Default: `cockroach-data`
`attrs`	Arbitrary strings, separated by colons, specifying disk type or capability. These can be used to influence the location of data replicas. See for full details. In most cases, node-level `--locality` or `--attrs` are preferable to store-level attributes, but this field can be used to match capabilities for storage of individual databases or tables. For example, an OLTP database would probably want to allocate space for its tables only on solid state devices, whereas append-only time series might prefer cheaper spinning drives. Typical attributes include whether the store is flash (`ssd`) or spinny disk (`hdd`), as well as speeds and other specs, for example: `--store=path=/mnt/hda1,attrs=hdd:7200rpm`
`size`	The maximum size allocated to the node. When this size is reached, CockroachDB attempts to rebalance data to other nodes with available capacity. When no other nodes have available capacity, this limit will be exceeded. Data may also be written to the node faster than the cluster can rebalance it away; as long as capacity is available elsewhere, CockroachDB will gradually rebalance data down to the store limit. The `size` can be specified either in a bytes-based unit or as a percentage of hard drive space (notated as a decimal or with `%`), for example: `--store=path=/mnt/ssd01,size=10000000000 ----> 10000000000 bytes` `--store=path=/mnt/ssd01,size=20GB ----> 20000000000 bytes` `--store=path=/mnt/ssd01,size=20GiB ----> 21474836480 bytes` `--store=path=/mnt/ssd01,size=0.02TiB ----> approximately 21990232556 bytes` `--store=path=/mnt/ssd01,size=20% ----> 20% of available space` `--store=path=/mnt/ssd01,size=0.2 ----> 20% of available space` `--store=path=/mnt/ssd01,size=.2 ----> 20% of available space` Default: 100% For an in-memory store, the `size` field is required and must be set to the true maximum bytes or percentage of available memory, for example: `--store=type=mem,size=20GB` `--store=type=mem,size=90%` Note: If you use the `%` notation, you might need to escape the `%` sign, for instance, while configuring CockroachDB through `systemd` service files. For this reason, it’s recommended to use the decimal notation instead.
`ballast-size`	Configure the size of the automatically created emergency ballast file. Accepts the same value formats as the `size` field. For more details, see . To disable automatic ballast file creation, set the value to `0`: `--store=path=/mnt/ssd01,ballast-size=0`
`provisioned-rate`	A mapping of a store name to a bandwidth limit, expressed in bytes per second. This constrains the bandwidth used for operations on the store. The disk name is separated from the bandwidth value by a colon (`:`). A value of `0` (the default) represents unlimited bandwidth. For example: `--store=provisioned-rate=disk-name=/mnt/ssd01:200` Default: 0 If the bandwidth value is omitted, bandwidth is limited to the value of the . Modify this setting only in consultation with your support team.
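To make the difference between decimal (GB) and binary (GiB) units in the `size` field concrete, the sketch below computes both with shell arithmetic and then prints (rather than runs) a two-store start command. The mount points are the placeholder paths used in the table above; the combination of fields is an illustrative assumption.

```shell
# Decimal vs. binary units, computed with integer shell arithmetic.
GB=$((1000 * 1000 * 1000))     # bytes in 1GB
GIB=$((1024 * 1024 * 1024))    # bytes in 1GiB
SIZE_20GB=$((20 * GB))
SIZE_20GIB=$((20 * GIB))

echo "20GB  = $SIZE_20GB bytes"
echo "20GiB = $SIZE_20GIB bytes"

# One --store flag per device; fields within a store are comma-separated.
echo "cockroach start --store=path=/mnt/ssd01,size=20GiB,attrs=ssd --store=path=/mnt/hda1,attrs=hdd:7200rpm # ... other required flags go here"
```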

Write Ahead Log (WAL) failover

On a CockroachDB node with multiple stores, you can mitigate some effects of disk stalls by configuring the node to fail over each store’s write-ahead log (WAL) to another store’s data directory using the --wal-failover flag to cockroach start or the COCKROACH_WAL_FAILOVER environment variable. Failing over the WAL may allow some operations against a store to continue to complete despite temporary unavailability of the underlying storage. For example, if the node’s primary store is stalled, and the node can’t read from or write to it, the node can still write to the WAL on another store. This can allow the node to continue to service requests during momentary unavailability of the underlying storage device. When WAL failover is enabled, CockroachDB does the following:

Pairs each primary store with a secondary failover store at node startup.
Monitors latency of all write operations against the primary WAL. If any operation exceeds the duration of , the node redirects new WAL writes to the secondary store.
Checks the primary store while failed over by performing a set of filesystem operations against a small internal “probe file” on its volume. This file contains no user data and exists only when WAL failover is enabled.
Switches back to the primary store once the set of filesystem operations against the probe file on its volume completes within a latency threshold (on the order of tens of milliseconds). If a probe fsync blocks longer than the configured maximum sync duration, CockroachDB emits a log message like "disk stall detected: sync on file probe-file has been ongoing for 40.0s". If the stall persists, the node exits (fatals) to allow recovery elsewhere.
Exposes status so you can monitor each store’s health and failover state.

WAL failover only relocates the WAL. Data files remain on the primary volume. Reads that miss the Pebble block cache and the OS page cache can still stall if the primary disk is stalled. Caches typically limit blast radius, but some reads may see elevated latency.

This page has basic instructions on how to enable WAL failover, disable WAL failover, and monitor WAL failover. For more detailed instructions showing how to use, test, and monitor WAL failover, as well as descriptions of how WAL failover works in multi-store configurations, see .

Enable WAL failover

To enable WAL failover, you must take one of the following actions:

Pass --wal-failover=among-stores to cockroach start, or
Set the environment variable COCKROACH_WAL_FAILOVER=among-stores before starting the node.
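
As a minimal sketch of the first option, the following shell function prints a cockroach start invocation for a node with two stores and WAL failover enabled. The store paths, the certs directory, and the join addresses are placeholder assumptions, not values from this page, and the command is printed for review rather than executed.

```shell
# Print a `cockroach start` command that enables WAL failover across two stores.
# All paths and addresses passed in are hypothetical; adjust for your deployment.
wal_failover_start_cmd() {
  # $1, $2: store directories (ideally on separate disks)
  # $3: comma-separated list of join addresses
  printf 'cockroach start --certs-dir=certs --store=path=%s --store=path=%s --wal-failover=among-stores --join=%s\n' \
    "$1" "$2" "$3"
}

# Dry run: show the command instead of starting a node.
wal_failover_start_cmd /mnt/data1 /mnt/data2 node1:26257,node2:26257,node3:26257
```

Printing the command first makes it easy to confirm that every store appears in its own --store flag before the node is actually started.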

Logging to local disks using the default configuration can lead to cluster instability in the event of a disk stall. It’s not enough to fail over your WAL writes to another disk: you must also write your log files in such a way that the forward progress of your cluster is not stalled due to disk unavailability. Therefore, if you enable WAL failover and log to local disks, you must also update your logging configuration as follows:

Disable file-based audit logging. File-based audit logging cannot coexist with the WAL failover feature, because it guarantees that every log message makes it to disk; if a message cannot be written, CockroachDB must be shut down. For this reason, resuming operations in the face of disk unavailability is not compatible with audit logging.
Enable asynchronous buffering of file-groups log sinks using the buffering configuration option. The buffering configuration can be applied to file-defaults or individual file-groups as needed. Note that enabling asynchronous buffering of file-groups log sinks is in .
Set max-staleness: 1s and flush-trigger-size: 256KiB.
When buffering is enabled, buffered-writes must be explicitly disabled as shown in the following example. This is necessary because buffered-writes does not provide true asynchronous disk access, but rather a small buffer. If the small buffer fills up, it can cause internal routines performing logging operations to hang. This will in turn cause internal routines doing other important work to hang, potentially affecting cluster stability.

The recommended logging configuration for using file-based logging with WAL failover is as follows:

file-defaults:
 buffered-writes: false
 auditable: false
 buffering:
   max-staleness: 1s
   flush-trigger-size: 256KiB
   max-buffer-size: 50MiB
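
One way to apply this configuration is to save it to a file and pass the file with the --log-config-file flag described in the Logging section. The sketch below writes the recommended settings to a temporary file and prints the flag to add; the function name and the use of mktemp are illustrative assumptions, and the remaining cockroach start flags are left as a placeholder.

```shell
# Write the recommended WAL-failover logging configuration to the given path.
write_wal_failover_log_config() {
  # $1: destination path for the YAML file
  cat > "$1" <<'EOF'
file-defaults:
  buffered-writes: false
  auditable: false
  buffering:
    max-staleness: 1s
    flush-trigger-size: 256KiB
    max-buffer-size: 50MiB
EOF
}

# Hypothetical location; in practice, keep the file somewhere persistent.
config_path="$(mktemp)"
write_wal_failover_log_config "$config_path"

# Dry run: show how the file would be passed to the node.
echo "cockroach start --log-config-file=$config_path <other flags>"
```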

As an alternative to logging to local disks, you can configure remote log sinks that are not correlated with the availability of your cluster’s local disks. However, this will make troubleshooting using cockroach debug zip more difficult, since the output of that command will not include the (remotely stored) log files.

Disable WAL failover

To disable WAL failover, you must restart the node and either:

Pass the --wal-failover=disabled flag to cockroach start, or
Set the environment variable COCKROACH_WAL_FAILOVER=disabled before restarting the node.

Monitor WAL failover

You can monitor WAL failover occurrences using the following metrics:

storage.wal.failover.secondary.duration: Cumulative time spent (in nanoseconds) writing to the secondary WAL directory. Only populated when WAL failover is configured.
storage.wal.failover.primary.duration: Cumulative time spent (in nanoseconds) writing to the primary WAL directory. Only populated when WAL failover is configured.
storage.wal.failover.switch.count: Count of the number of times WAL writing has switched from primary to secondary store, and vice versa.
storage.wal.fsync.latency: Latency of fsync operations on WAL files. If WAL failover is enabled and the node is failing over, this metric includes the latency of the stalled primary.
storage.wal.failover.write_and_sync.latency: The effective latency observed by the higher layer writing to the WAL. When WAL failover is configured, operators should monitor this metric. It is expected to stay low in a healthy system, regardless of whether WAL files are being written to the primary or secondary.

The storage.wal.failover.secondary.duration is the primary metric to monitor. You should expect this metric to be 0 unless a WAL failover occurs. If a WAL failover occurs, the rate at which it increases provides an indication of the health of the primary store. You can access these metrics via the following methods:

The in .
By .
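
Because storage.wal.failover.secondary.duration is a cumulative counter in nanoseconds, its rate of increase can be estimated from two readings taken a known interval apart. The sketch below assumes you have already collected the two readings by one of the methods above; the function name and the sample values are hypothetical.

```shell
# Fraction of a sampling window during which WAL writes went to the secondary.
# $1: earlier counter reading (ns), $2: later counter reading (ns),
# $3: seconds between the two readings (must be greater than 0).
secondary_fraction() {
  awk -v a="$1" -v b="$2" -v s="$3" \
    'BEGIN { printf "%.3f\n", (b - a) / (s * 1e9) }'
}

# Example: the counter grew by 15 seconds' worth of nanoseconds in a 60-second window.
secondary_fraction 0 15000000000 60   # prints 0.250
```

A value that stays at 0 matches the expectation stated above for a healthy primary store; a value approaching 1 suggests the node spent most of the window failed over.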

For more information, refer to

Logging

By default, cockroach start writes all messages to log files and prints nothing to stderr. This includes events with INFO severity and higher. However, you can customize the logging behavior of this command by using the --log flag:

Flag	Description
`--log`	Configure logging parameters by specifying a YAML payload. For details, see . If a YAML configuration is not specified, the default configuration is used. `--log-config-file` can also be used. Note: The logging flags below cannot be combined with `--log`, but can be defined instead in the YAML payload.
`--log-config-file`	Configure logging parameters by specifying a path to a YAML file. For details, see . If a YAML configuration is not specified, the default configuration is used. `--log` can also be used. Note: The logging flags below cannot be combined with `--log-config-file`, but can be defined instead in the YAML file.
`--log-dir`	An alias for the `--log` flag, for configuring the directory where log files are stored and written to. Specifically, `--log-dir=XXX` is an alias for `--log='file-defaults: {dir: XXX}'`. Setting `--log-dir` to a blank directory (`--log-dir=`) disables logging to files. Do not use `--log-dir=""`; this creates a new directory named `""` and stores log files in that directory.
`--log-group-max-size`	An alias for the `--log` flag, for configuring the maximum size for a logging group (for example, `cockroach`, `cockroach-sql-audit`, `cockroach-auth`, `cockroach-sql-exec`, `cockroach-pebble`), after which the oldest log file is deleted. `--log-group-max-size=XXX` is an alias for `--log='file-defaults: {max-group-size: XXX}'`. Accepts a valid file size, such as `--log-group-max-size=1GiB`. Default: `100MiB`
`--log-file-max-size`	An alias for the `--log` flag, used to specify the maximum size that a log file can grow to before a new log file is created. `--log-file-max-size=XXX` is an alias for `--log='file-defaults: {max-file-size: XXX}'`. Accepts a valid file size, such as `--log-file-max-size=2MiB`. Requires logging to files. Default: `10MiB`
`--log-file-verbosity`	An alias for the `--log` flag, used to specify the minimum severity level of messages that are logged. `--log-file-verbosity=XXX` is an alias for `--log='file-defaults: {filter: XXX}'`. When a severity is specified, such as `--log-file-verbosity=WARNING`, log messages that are below the specified severity level are not written to the target log file. Requires logging to files. Default: `INFO`
`--logtostderr`	An alias for the `--log` flag, to optionally output log messages at or above the configured severity level to the `stderr` sink. `--logtostderr=XXX` is an alias for `--log='sinks: {stderr: {filter: XXX}}'`. Accepts a valid severity level. If no value is specified, by default messages related to server commands are logged to `stderr` at `INFO` severity and above, and messages related to client commands are logged to `stderr` at `WARNING` severity and above. Setting `--logtostderr=NONE` disables logging to `stderr`. Default: `UNKNOWN`
`--no-color`	An alias for the `--log` flag, used to control whether log output to the `stderr` sink is colorized. `--no-color=XXX` is an alias for `--log='sinks: {stderr: {no-color: XXX}}'`. Accepts either `true` or `false`. When set to `false`, messages logged to `stderr` are colorized based on severity level. Default: `false`
`--redactable-logs`	An alias for the `--log` flag, used to specify whether redaction markers are used in place of secret or sensitive information in log messages. `--redactable-logs=XXX` is an alias for `--log='file-defaults: {redactable: XXX}'`. Accepts `true` or `false`. Default: `false`
`--sql-audit-dir`	An alias for the `--log` flag, used to optionally confine log output of the `SENSITIVE_ACCESS` logging channel to a separate directory. `--sql-audit-dir=XXX` is an alias for `--log='sinks: {file-groups: {sql-audit: {channels: SENSITIVE_ACCESS, dir: ...}}}'`. Enabling `SENSITIVE_ACCESS` logs can negatively impact performance. As a result, we recommend using the `SENSITIVE_ACCESS` channel for security purposes only. For more information, refer to .
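
The alias relationships in the table can be made concrete with a small helper. The following sketch prints the --log form that the table gives as equivalent to --log-dir=XXX. The function name is hypothetical, and the helper does no escaping, so it assumes a directory path without quotes or braces.

```shell
# Print the --log flag that the table lists as equivalent to --log-dir=<dir>.
log_dir_as_yaml_flag() {
  # $1: log directory (assumed to contain no quotes or braces)
  printf '%s\n' "--log='file-defaults: {dir: $1}'"
}

log_dir_as_yaml_flag /var/log/cockroach
# prints: --log='file-defaults: {dir: /var/log/cockroach}'
```

Writing the setting in its YAML form is useful when several logging options need to be combined, since the alias flags cannot be mixed with --log.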

Defaults

See the .

Standard output

When you run cockroach start, some helpful details are printed to the standard output:

CockroachDB node starting at
build:               CCL  @  (go1.12.6)
webui:               http://localhost:8080
sql:                 postgresql://root@localhost:26257?sslmode=disable
sql (JDBC):          jdbc:postgresql://localhost:26257/defaultdb?sslmode=disable&user=root
RPC client flags:    cockroach <client cmd> --host=localhost:26257 --insecure
logs:                /Users/<username>/node1/logs
temp dir:            /Users/<username>/node1/cockroach-temp242232154
external I/O path:   /Users/<username>/node1/extern
store[0]:            path=/Users/<username>/node1
status:              initialized new cluster
clusterID:           8a681a16-9623-4fc1-a537-77e9255daafd
nodeID:              1

These details are also written to the INFO log in the /logs directory. You can retrieve them with a command like grep 'node starting' node1/logs/cockroach.log -A 11.

Field	Description
`build`	The version of CockroachDB you are running.
`webui`	The URL for accessing the DB Console.
`sql`	The connection URL for your client.
`RPC client flags`	The flags to use when connecting to the node via .
`logs`	The directory containing debug log data.
`temp dir`	The temporary store directory of the node.
`external I/O path`	The external IO directory with which the local file access paths are prefixed while performing and operations using local node directories or NFS drives.
`attrs`	If node-level attributes were specified in the `--attrs` flag, they are listed in this field. These details are potentially useful for .
`locality`	If values describing the locality of the node were specified in the `--locality` field, they are listed in this field. These details are potentially useful for .
`store[n]`	The directory containing store data, where `[n]` is the index of the store, e.g., `store[0]` for the first store, `store[1]` for the second store. If store-level attributes were specified in the `attrs` field of the `--store` flag, they are listed in this field as well. These details are potentially useful for .
`status`	Whether the node is the first in the cluster (`initialized new cluster`), joined an existing cluster for the first time (`initialized new node, joined pre-existing cluster`), or rejoined an existing cluster (`restarted pre-existing node`).
`clusterID`	The ID of the cluster. When trying to join a node to an existing cluster, if this ID is different than the ID of the existing cluster, the node has started a new cluster. This may be due to conflicting information in the node’s data directory. For additional guidance, see the docs.
`nodeID`	The ID of the node.
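
When scripting around node startup, it can be convenient to pull a single field out of these details. The sketch below reads text in the format shown above on standard input and prints the value of one field. The function name is hypothetical, and the field name is used directly as a sed pattern, so names containing regular-expression characters or slashes (such as store[0] or external I/O path) would need escaping.

```shell
# Print the value of one field from `cockroach start` startup details on stdin.
startup_field() {
  # $1: field name exactly as printed before the colon, e.g. "sql" or "nodeID"
  sed -n "s/^$1:[[:space:]]*//p"
}

# Example input copied from the sample output above (abbreviated).
printf '%s\n' \
  'sql:                 postgresql://root@localhost:26257?sslmode=disable' \
  'status:              initialized new cluster' \
  'nodeID:              1' | startup_field nodeID
# prints: 1
```

The same function can be fed from the grep command shown earlier to read the details back out of the log file.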

Examples

Start a multi-node cluster

To start a multi-node cluster, run the cockroach start command for each node, setting the --join flag to the addresses of the initial nodes. For a cluster in a single region, set 3-5 --join addresses. Each starting node will attempt to contact one of the join hosts. In case a join host cannot be reached, the node will try another address on the list until it can join the gossip network. When starting a multi-region cluster, set more than one --join address per region, and select nodes that are spread across failure domains. This ensures .

Secure:

Before starting the cluster, use cockroach cert to generate node and client certificates for a secure cluster connection.

$ cockroach start \
--certs-dir=certs \
--advertise-addr=<node1 address> \
--join=<node1 address>,<node2 address>,<node3 address> \
--cache=.25 \
--max-sql-memory=.25

$ cockroach start \
--certs-dir=certs \
--advertise-addr=<node2 address> \
--join=<node1 address>,<node2 address>,<node3 address> \
--cache=.25 \
--max-sql-memory=.25

$ cockroach start \
--certs-dir=certs \
--advertise-addr=<node3 address> \
--join=<node1 address>,<node2 address>,<node3 address> \
--cache=.25 \
--max-sql-memory=.25

Insecure:

$ cockroach start \
--insecure \
--advertise-addr=<node1 address> \
--join=<node1 address>,<node2 address>,<node3 address> \
--cache=.25 \
--max-sql-memory=.25

$ cockroach start \
--insecure \
--advertise-addr=<node2 address> \
--join=<node1 address>,<node2 address>,<node3 address> \
--cache=.25 \
--max-sql-memory=.25

$ cockroach start \
--insecure \
--advertise-addr=<node3 address> \
--join=<node1 address>,<node2 address>,<node3 address> \
--cache=.25 \
--max-sql-memory=.25

Then run the cockroach init command against any node to perform a one-time cluster initialization:

Secure:

$ cockroach init \
--certs-dir=certs \
--host=<address of any node>

Insecure:

$ cockroach init \
--insecure \
--host=<address of any node>

Start a multi-region cluster

In this example, we will start a multi-node cluster with a multi-region setup that uses the same regions (passed to the --locality flag) as the multi-region MovR demo application.

Start a node in the us-east1 region:

cockroach start --locality=region=us-east1,zone=us-east-1a \
                  --insecure \
                  --store=/tmp/node0 \
                  --listen-addr=localhost:26257 \
                  --http-port=8888 \
                  --join=localhost:26257,localhost:26258,localhost:26259

Start a node in the us-west1 region:

cockroach start --locality=region=us-west1,zone=us-west-1a \
                  --insecure \
                  --store=/tmp/node2 \
                  --listen-addr=localhost:26259 \
                  --http-port=8890 \
                  --join=localhost:26257,localhost:26258,localhost:26259

Start a node in the europe-west1 region:

cockroach start --locality=region=europe-west1,zone=europe-west-1a \
                  --insecure \
                  --store=/tmp/node1 \
                  --listen-addr=localhost:26258 \
                  --http-port=8889 \
                  --join=localhost:26257,localhost:26258,localhost:26259

Initialize the cluster:

cockroach init --insecure --host=localhost --port=26257

Connect to the cluster using cockroach sql:

cockroach sql --host=localhost --port=26257 --insecure

Issue the SHOW REGIONS statement to verify that the list of regions is as expected:

SHOW REGIONS;

    region    | zones | database_names | primary_region_of
---------------+-------+----------------+--------------------
  europe-west1 | {}    | {}             | {}
  us-east1     | {}    | {}             | {}
  us-west1     | {}    | {}             | {}
(3 rows)

For more information about running CockroachDB multi-region, see the .

For more information about the --locality flag, see Locality.

Start a multi-node cluster across private networks

Scenario:

You have a cluster that spans GCE and AWS.
The nodes on each cloud can reach each other on public addresses, but the private addresses aren’t reachable from the other cloud.

Approach:

Start each node on GCE with --locality set to describe its location, --locality-advertise-addr set to advertise its private address to other nodes on GCE, --advertise-addr set to advertise its public address to nodes on AWS, and --join set to the public addresses of 3-5 of the initial nodes:

$ cockroach start \
--certs-dir=certs \
--locality=cloud=gce \
--locality-advertise-addr=cloud=gce@<private address of node> \
--advertise-addr=<public address of node> \
--join=<public address of node1>,<public address of node2>,<public address of node3> \
--cache=.25 \
--max-sql-memory=.25

Start each node on AWS with --locality set to describe its location, --locality-advertise-addr set to advertise its private address to other nodes on AWS, --advertise-addr set to advertise its public address to nodes on GCE, and --join set to the public addresses of 3-5 of the initial nodes:

$ cockroach start \
--certs-dir=certs \
--locality=cloud=aws \
--locality-advertise-addr=cloud=aws@<private address of node> \
--advertise-addr=<public address of node> \
--join=<public address of node1>,<public address of node2>,<public address of node3> \
--cache=.25 \
--max-sql-memory=.25

Run the cockroach init command against any node to perform a one-time cluster initialization:

$ cockroach init \
--certs-dir=certs \
--host=<address of any node>

Add a node to a cluster

To add a node to an existing cluster, run the cockroach start command, setting the --join flag to the same addresses you used when starting the cluster:

Secure:

$ cockroach start \
--certs-dir=certs \
--advertise-addr=<node4 address> \
--join=<node1 address>,<node2 address>,<node3 address> \
--cache=.25 \
--max-sql-memory=.25

Insecure:

$ cockroach start \
--insecure \
--advertise-addr=<node4 address> \
--join=<node1 address>,<node2 address>,<node3 address> \
--cache=.25 \
--max-sql-memory=.25

Create a table with node locality information

Start a three-node cluster with locality information specified in the cockroach start commands:

$ cockroach start --insecure --port=26257 --http-port=26258 --store=cockroach-data/1 --cache=256MiB --locality=region=eu-west-1,cloud=aws,zone=eu-west-1a

$ cockroach start --insecure --port=26259 --http-port=26260 --store=cockroach-data/2 --cache=256MiB --join=localhost:26257 --locality=region=eu-west-1,cloud=aws,zone=eu-west-1b

$ cockroach start --insecure --port=26261 --http-port=26262 --store=cockroach-data/3 --cache=256MiB --join=localhost:26257 --locality=region=eu-west-1,cloud=aws,zone=eu-west-1c

You can use the crdb_internal.locality_value built-in function to return the current node’s locality information from inside a SQL shell. The example below uses the output of crdb_internal.locality_value('zone') as the DEFAULT value for the zone column of new rows. Other available locality keys for the running three-node cluster include region and cloud.

$ cockroach sql --insecure

> CREATE TABLE charges (
  zone STRING NOT NULL DEFAULT crdb_internal.locality_value('zone'),
  id INT PRIMARY KEY NOT NULL
);

> INSERT INTO charges (id) VALUES (1);

> SELECT * FROM charges WHERE id = 1;

     zone    | id
+------------+----+
  eu-west-1a |  1
(1 row)

The zone column has the zone of the node on which the row was created. In a separate terminal window, open a SQL shell to a different node on the cluster:

$ cockroach sql --insecure --port 26259

> INSERT INTO charges (id) VALUES (2);

> SELECT * FROM charges WHERE id = 2;

     zone    | id
+------------+----+
  eu-west-1b |  2
(1 row)

In a separate terminal window, open a SQL shell to the third node:

$ cockroach sql --insecure --port 26261

> INSERT INTO charges (id) VALUES (3);

> SELECT * FROM charges WHERE id = 3;

     zone    | id
+------------+----+
  eu-west-1c |  3
(1 row)

Start a cluster with separate RPC and SQL networks

Separating the network addresses used for intra-cluster RPC traffic and application SQL connections can provide an additional level of protection against security issues as a form of defense in depth. This separation is accomplished with a combination of the --sql-addr flag and firewall rules or other network-level access control (which must be maintained outside of CockroachDB). For example, suppose you want to use port 26257 for SQL connections and 26258 for intra-cluster traffic. Set up firewall rules so that the CockroachDB nodes can reach each other on port 26258, but other machines cannot. Start the CockroachDB processes as follows:

$ cockroach start --sql-addr=:26257 --listen-addr=:26258 --join=node1:26258,node2:26258,node3:26258 --certs-dir=~/cockroach-certs

Note the use of port 26258 (the value for listen-addr, not sql-addr) in the --join flag. Also, if your environment requires the use of the --advertise-addr flag, you should probably also use the --advertise-sql-addr flag when using a separate SQL address. Clusters using this configuration with client certificate authentication may also wish to use .
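
To avoid mixing up the two ports, the --join value can be generated from the host names and the intra-cluster (listen-addr) port. The helper below is a minimal sketch; the function name is hypothetical and the host names are the ones used in the example above.

```shell
# Build a comma-separated --join value from an RPC port and a list of hosts.
join_list() {
  # $1: intra-cluster (listen-addr) port; remaining arguments: host names
  port="$1"
  shift
  out=""
  for host in "$@"; do
    # Prefix a comma only when the list already has an entry.
    out="${out:+$out,}$host:$port"
  done
  printf '%s\n' "$out"
}

join_list 26258 node1 node2 node3
# prints: node1:26258,node2:26258,node3:26258
```

Passing the listen-addr port explicitly keeps the SQL port (26257 in this example) out of the join list.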
