Publication date: February 12, 2020
When running SHOW JOBS or viewing the Jobs page in the Admin UI, memory usage on a node can grow high enough to crash the node.
SHOW JOBS and the Jobs page in the Admin UI internally load all job descriptions from the cluster into RAM before displaying them.
Under reasonable production settings, a single backup job payload may exceed 5MB in size. With an hourly backup and the default jobs.retention_time of 336h, the jobs table retains 336 such entries (336 × 5MB), so a single use of SHOW JOBS or a single user of the Jobs page in the Admin UI can incur ~1.7GB of memory utilization. This allocation is then multiplied by the number of concurrent accesses to the jobs table.
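The estimate above can be reproduced with a short back-of-the-envelope calculation. The sketch below is illustrative only: the function name and parameters are hypothetical, and it assumes every retained job entry carries a payload of the same size, which real clusters will not match exactly.

```python
def estimated_jobs_memory_mb(payload_mb: float,
                             jobs_per_hour: float,
                             retention_hours: float) -> float:
    """Rough memory cost (in MB) of one full read of the jobs table.

    Assumes every retained job has the same payload size (a simplification).
    """
    retained_jobs = jobs_per_hour * retention_hours
    return retained_jobs * payload_mb


# Figures from the advisory: 5MB payloads, hourly backups, 336h retention.
per_read_mb = estimated_jobs_memory_mb(payload_mb=5, jobs_per_hour=1,
                                       retention_hours=336)
print(f"{per_read_mb / 1000:.2f} GB per read")  # 1.68 GB, i.e. ~1.7GB

# Concurrent readers multiply the allocation (3 readers is an arbitrary example).
concurrent_readers = 3
print(f"{per_read_mb * concurrent_readers / 1000:.2f} GB for "
      f"{concurrent_readers} concurrent reads")
```

Substituting a cluster's own payload size, job frequency, and retention setting gives a rough idea of how much headroom a node needs before these pages are opened.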
Starting in CockroachDB v19.2.3, new job payloads are reduced in size. A later version will also avoid loading old job entries into memory when viewing recent jobs.
This public issue is tracked as #44166.
It is possible to reduce the number of job entries overall by setting the jobs.retention_time cluster setting to a value closer to 48h or 24h, for example:
SET CLUSTER SETTING jobs.retention_time = '48:00:00';
Additionally, if the nodes are observed to crash due to excessive memory usage, it may be necessary to truncate the job history. This can be achieved, for example, with:
DELETE FROM system.jobs WHERE status = 'succeeded' AND created < (now() - '2 days'::interval);
All deployments running CockroachDB v19.2.0 to v19.2.2 are affected.
Questions about any technical alert can be directed to our support team.