High CPU usage can bring your database – and with it, your application – grinding to a halt.
This is, unfortunately, a fairly common problem. But it also can be a relatively easy fix. Let’s take a look at how to check CPU usage in Postgres, and how to troubleshoot for some of the most common causes of high CPU usage.
There are a variety of ways to check CPU usage. For Linux and most other Unix-based servers, a simple
top command will give you the total CPU usage on the machine.
Generally speaking, any application that’s in production should also have some sort of monitoring configured to alert you when CPU usage is approaching dangerous territory, though. There are a wide variety of tools available for monitoring PostgreSQL databases.
If you have enabled the
pg_stat_statements module, you can also look at Postgres’s CPU usage on a more granular level by querying the statistical data it collects. It doesn’t directly track percent CPU usage for a given database or query, but it does track metrics like the number of times a query is called and the amount of time required to execute it, which can give you an approximate measure of how much compute various processes within your database are consuming.
In truth, you likely won’t have to proactively check CPU usage to know there’s a problem. Assuming you’ve got monitoring set up, you should be warned anytime you’re approaching a CPU usage spike. If you don’t have monitoring, you may only learn of the issue when Postgres becomes a bottleneck that starts impacting your application’s performance and/or availability.
For this reason, having proper monitoring is critical to preventing high CPU usage in Postgres from causing issues. You want to catch and address issues before they start impacting application performance, and with correctly-configured monitoring that should be possible. Let’s take a look at some of the things you’re most likely to find when you investigate higher-than-expected CPU usage.
A wide variety of problems with Postgres can cause CPU usage to spike. While we can’t cover all possibilities, in this article we’ll take a look at some ways you can try to troubleshoot some of the most common causes of high CPU usage in Postgres:
Let’s take a look at each of these, and how they might be addressed.
Note that for our purposes here, we will assume that you have at least some of Postgres’s statistics and monitoring modules enabled. There are a variety of third-party tools, including UI-based tools, that make these statistics available as well.
Inefficient queries are one of the most likely culprits if you’re getting CPU usage problems that don’t seem to be related to traffic.
The first step here is to figure out what queries are running that might be causing problems. You can investigate this in a variety of ways, but a simple query of the
pg_stat_statements table is a reasonable place to start. For example:
SELECT query, calls, total_exec_time, rows FROM pg_stat_statements ORDER BY total_exec_time DESC LIMIT 5;
The above query will return the top five queries by total execution time. If an inefficient query is what’s causing your high CPU usage, there’s a good chance that the problem will be in these results.
Once you’ve identified a potentially problematic query, you can use the
EXPLAIN ANALYZE command together with the query to take a look at how the query is being executed. For example:
EXPLAIN ANALYZE SELECT * FROM table;
There are additional parameters you can use with
EXPLAIN to dig deeper if needed, but if you’re lucky,
EXPLAIN ANALYZE will reveal something like a full table scan, and you can then take steps to fix the query. (For example, you might choose to add an index to prevent frequently-running queries from having to do full table scans).
Another way to investigate this would be to use Postgres’s built-in statistics tools to look at sequential scans and index scans, and the row count for these scans, to find the tables that problematic queries might be impacting. The
pg_stat_all_tables view contains a number of ways to look at this, for example.
If the above investigation doesn’t turn anything up, it’s also worth taking a look at the
calls column in
pg_stat_statements to see how frequently each query is being run within a given timeframe. It’s possible that an efficient query could still be taxing your compute resources if it’s being called more often than expected.
High CPU issues in PostgreSQL databases can also be caused by having too many active connections.
To check the number of current active connections, we can query the
pg_stat_activity table like so:
SELECT * FROM pg_stat_activity WHERE state = ‘active’;
It’s also a good idea to check what the
max_connections setting is in your database, if you’re not already sure:
The results of these two queries should guide your next steps, if they suggest that a high number of connections is the problem. Depending on the specific results you get, you should consider steps including:
It’s worth noting that beyond a certain point of scale, both connection pooling and scaling your database horizontally are likely to be a necessity. Postgres can be manually sharded to achieve horizontal scale, but this is time-and labor-intensive, so it’s worth considering Postgres-compatible distributed SQL solutions such as CockroachDB to automate that process.
If the above investigation doesn’t reveal an issue, it’s also possible that the total number of connections is appropriate, but that a single database or table within your system is getting too high a proportion of those connections. You can also investigate this via the
pg_stat_activity table, which contains a column
datname to identify the database a connection is connected to.
If one particular database does have a surprisingly high number of connections, you may need to adjust your application logic, but it’s worth first checking to see if the issue is actually an inefficient query causing connections to pile up as they wait for query execution to complete.
Database contention is another issue that can lead to an increase in active connections, as queries may have to retry repeatedly before they complete if they’re attempting to modify highly contended rows. The way to fix this will vary based on the specifics of your database implementation, but here’s a list of some best practices for preventing contention with CockroachDB.
Another common cause of high CPU usage in Postgres is growth. As traffic to and usage of your application grows, the load on your database increases. Eventually, you will start approaching the limits of what the underlying hardware is capable of.
The only true long-term solution to this problem is to scale out horizontally (scaling vertically can buy you time, but there are limits to how much you can upgrade and expand a single server, and it still leaves you with a single point of failurte.) This means embracing a distributed database architecture and, again, either dealing with the hassle and cost of manual sharding or migrating to a natively distributed SQL database – ideally a Postgres-compatible one such as CockroachDB to minimize the difficulty of the migration.
In the interim, though, it’s possible that you may be able to reduce the CPU load on your Postgres database by optimizing frequently-run queries, tweaking database parameters, adding or removing indexes, etc. If your application continues growing, these are likely to be “band-aid” fixes – no amount of query optimization is going to make a single Postgres instance performant under the weight of a massive user base. But optimization can still buy you time and save money as you move towards a distributed SQL solution in the medium term.
PostgreSQL plans and executes queries for efficiency based in part on the statistics that it collects about tables in your database. These statistics are updated automatically, but they can get out of date, particularly if a large number of entities are modified within a short timeframe. This, in turn, can cause the SQL engine to choose inefficient execution plans.
To determine when each of our tables was last analyzed for these statistics, we can query
pg_stat_all_tables and look at the
last_autoanalyze columns to see if anything looks outdated.
If it does, we can update the statistics for that table manually, like so:
Of course, what we’ve listed above are just a few of the most common causes of high CPU memory usage in Postgres. It is also possible that the cause is something a bit more esoteric, such as memory pagination.
In general, though, the most common causes of high CPU usage in Postgres are probably poorly-tuned queries and high traffic. Traffic volume, in particular, is going to become a factor for any application with a Postgres backend that reaches sufficient scale. It is worth considering what the options are for Postgres-compatible distributed SQL before you need them to prevent problems like “success disasters.”
And while Postgres is great, upgrading to a cloud-native distributed SQL database such as CockroachDB also comes with a variety of quality-of-life advantages that’ll help you avoid and fix problems with your database. For example:
Most SQL content on the web seems to be written with data analysts in mind. And that’s fine, but developers need …Read more