The Runtime dashboard in the DB Console lets you monitor runtime metrics for your cluster, such as node count, memory usage, and CPU time.
To view this dashboard, access the DB Console, click Metrics on the left-hand navigation bar, and select Dashboard > Runtime.
Use the Graph menu to display metrics for your entire cluster or for a specific node.
To the right of the Graph and Dashboard menus, a range selector allows you to filter the view for a predefined timeframe or custom date/time range. Use the navigation buttons to move to the previous, next, or current timeframe. Note that the active timeframe is reflected in the URL and can be easily shared.
The Runtime dashboard displays the following time series graphs:
Live Node Count
In the node view as well as the cluster view, the graph shows the number of live nodes in the cluster.
A dip in the graph indicates decommissioned nodes, dead nodes, or nodes that are not responding. To troubleshoot the dip in the graph, refer to the Summary panel.
Memory Usage
In the node view, the graph shows the memory in use for the selected node.
In the cluster view, the graph shows the memory in use across all nodes in the cluster.
Hover over the graph to display the values for the following metrics:
| Metric | Description |
|---|---|
| RSS | Total memory in use by CockroachDB. |
| Go Allocated | Memory allocated by the Go layer. |
| Go Total | Total memory managed by the Go layer. |
| CGo Allocated | Memory allocated by the C layer. |
| CGo Total | Total memory managed by the C layer. |
Expected values for a healthy cluster: RSS minus Go Total and CGo Total should not exceed 100 MiB. Go Allocated should not exceed a few hundred MiB. CGo Allocated should not exceed the
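The first two expected values above can be checked with simple arithmetic. The following is a minimal sketch, not part of CockroachDB: the function names are hypothetical, the inputs are assumed to be byte values read off the graph, and the 300 MiB default for "a few hundred MiB" is an assumption chosen only for illustration.

```python
MIB = 1024 * 1024  # bytes in one mebibyte


def unaccounted_memory(rss, go_total, cgo_total):
    """Return RSS minus Go Total and CGo Total, in bytes."""
    return rss - go_total - cgo_total


def memory_warnings(rss, go_allocated, go_total, cgo_total,
                    go_allocated_limit=300 * MIB):
    """Return a list of warnings for values outside the expected ranges.

    go_allocated_limit is an assumed stand-in for "a few hundred MiB".
    """
    warnings = []
    if unaccounted_memory(rss, go_total, cgo_total) > 100 * MIB:
        warnings.append("RSS exceeds Go Total + CGo Total by more than 100 MiB")
    if go_allocated > go_allocated_limit:
        warnings.append("Go Allocated exceeds the assumed limit")
    return warnings
```

For example, with RSS at 1000 MiB, Go Total at 600 MiB, and CGo Total at 350 MiB, the unaccounted memory is 50 MiB, which is within the 100 MiB guideline.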
Runnable Goroutines per CPU
This graph shows the number of goroutines waiting to run per CPU. This graph should rise and fall based on CPU load. Values greater than 50 are considered high.
In the node view, the graph shows the number of goroutines waiting per CPU on the selected node.
In the cluster view, the graph shows the number of goroutines waiting per CPU across all nodes in the cluster.
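To make the per-CPU normalization concrete, here is a small sketch. The function names and sample numbers are hypothetical; only the threshold of 50 comes from the text above.

```python
def runnable_per_cpu(runnable_goroutines, cpu_count):
    """Return the number of goroutines waiting to run per CPU."""
    if cpu_count <= 0:
        raise ValueError("cpu_count must be positive")
    return runnable_goroutines / cpu_count


def is_high(value, threshold=50):
    """Return True when the per-CPU value is greater than the threshold."""
    return value > threshold
```

Under these assumptions, 480 runnable goroutines on an 8-CPU node give 60 per CPU, which would be considered high, while 400 on the same node give exactly 50, which would not.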
CPU Time
- In the node view, the graph shows the CPU time used by CockroachDB user and system-level operations for the selected node.
- In the cluster view, the graph shows the CPU time used by CockroachDB user and system-level operations across all nodes in the cluster.
Hover over the CPU Time graph to display the values for the following metrics:
| Metric | Description |
|---|---|
| User CPU Time | Total CPU seconds per second used by the CockroachDB process across all nodes. |
| Sys CPU Time | Total CPU seconds per second used for CockroachDB system-level operations across all nodes. |
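"CPU seconds per second" is a rate: the CPU time consumed during an interval divided by the length of that interval. The sketch below illustrates the unit only; the function name and the idea of sampling a cumulative CPU-time counter at two points in time are assumptions, not a description of how the DB Console computes the graph.

```python
def cpu_seconds_per_second(cpu_start, cpu_end, wall_start, wall_end):
    """Return the CPU seconds consumed per second of wall-clock time.

    cpu_start and cpu_end are cumulative CPU seconds at two sample points;
    wall_start and wall_end are the wall-clock times of those samples.
    """
    elapsed = wall_end - wall_start
    if elapsed <= 0:
        raise ValueError("wall_end must be later than wall_start")
    return (cpu_end - cpu_start) / elapsed
```

For instance, 15 CPU seconds consumed over a 10-second interval is 1.5 CPU seconds per second, roughly one and a half cores kept busy.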
Clock Offset
- In the node view, the graph shows the mean clock offset of the node against the rest of the cluster.
- In the cluster view, the graph shows the mean clock offset of each node against the rest of the cluster.
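As an illustration of what a mean clock offset represents, the sketch below averages one node's measured offsets against each of its peers. The function name, the millisecond unit, and the sample values are assumptions for illustration; this is not CockroachDB's actual measurement code.

```python
from statistics import mean


def mean_clock_offset(offsets_to_peers_ms):
    """Return the mean of a node's clock offsets against its peers.

    offsets_to_peers_ms holds one signed offset (in milliseconds) per peer.
    """
    if not offsets_to_peers_ms:
        raise ValueError("at least one peer offset is required")
    return mean(offsets_to_peers_ms)
```

With hypothetical offsets of 2.0 ms, -1.0 ms, and 5.0 ms against three peers, the mean offset is 2.0 ms.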
The Runtime dashboard shows other time series graphs that are important for CockroachDB developers:
- Goroutine Count
- GC Runs
- GC Pause Time
For monitoring CockroachDB, it is sufficient to use the Live Node Count, Memory Usage, CPU Time, and Clock Offset graphs.
Summary and events
A Summary panel of key metrics is displayed to the right of the time series graphs.
| Metric | Description |
|---|---|
| Total Nodes | The total number of nodes in the cluster. Decommissioned nodes are not included in this count. |
| Capacity Used | The storage capacity used as a percentage of usable capacity allocated across all nodes. |
| Unavailable Ranges | The number of unavailable ranges in the cluster. A non-zero number indicates an unstable cluster. |
| Queries per second | The total number of |
| P99 Latency | The 99th percentile of service latency. |
If you are testing your deployment locally with multiple CockroachDB nodes running on a single machine (this is not recommended in production), you must explicitly set the store size per node in order to display the correct capacity. Otherwise, the machine's actual disk capacity will be counted as a separate store for each node, thus inflating the computed capacity.
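The inflation described above is easy to see with numbers. The following sketch uses hypothetical values (a 500 GiB disk, three local nodes, and an explicit store size of 100 GiB per node) and hypothetical function names; it only mirrors the arithmetic, not the DB Console's implementation.

```python
def computed_capacity(store_sizes_gib):
    """Return the total capacity as the sum of every store's size."""
    return sum(store_sizes_gib)


def capacity_used_percent(used_gib, capacity_gib):
    """Return the used storage as a percentage of total capacity."""
    return 100 * used_gib / capacity_gib


disk_gib = 500   # assumed size of the single machine's disk
node_count = 3   # assumed number of local nodes

# Without an explicit store size, each node counts the whole disk as its store.
inflated = computed_capacity([disk_gib] * node_count)

# With an explicit store size of 100 GiB per node (an assumed value).
explicit = computed_capacity([100] * node_count)
```

Here the inflated capacity is 1500 GiB for a 500 GiB disk, so 150 GiB of data would show as 10% used instead of 50% of the 300 GiB that was explicitly allocated.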
Underneath the Summary panel, the Events panel lists the 5 most recent events logged for all nodes across the cluster. To list all events, click View all events.
The following types of events are listed:
- Database created
- Database dropped
- Table created
- Table dropped
- Table altered
- Index created
- Index dropped
- View created
- View dropped
- Schema change reversed
- Schema change finished
- Node joined
- Node decommissioned
- Node restarted
- Cluster setting changed