Something is wrong with your database. You just don't know it yet.
Maybe it's a query that takes 200ms on average but spikes to 8 seconds every few minutes. Maybe autovacuum fell behind three weeks ago and your tables are now bloated to twice their logical size. Maybe replication lag crept from 100ms to 4 seconds and nobody noticed because the app still works. Until it doesn't.
PostgreSQL gives you everything you need to catch these problems early. The issue is knowing which of the hundreds of available metrics actually matter, how to collect them without adding overhead, and what to alert on versus what to watch passively. This guide walks through all of it.
Why PostgreSQL Monitoring Is Different
PostgreSQL is not a black box. It exposes a rich set of internal statistics through system views that you can query with plain SQL. No special agent, no proprietary protocol. The challenge is not access but interpretation: there are dozens of views, each tracking a different subsystem, and the relationships between them are not obvious until you've debugged a few production incidents.
The other challenge is overhead. Collecting metrics too aggressively can itself impact performance, particularly on systems with high connection counts or heavy write workloads.
A good monitoring setup has three layers:
- Metrics collection — snapshot key numbers every 15 to 60 seconds
- Query analysis — track which queries are slow and how often they run
- Alerts — fire on conditions that require human action, not on every blip
The Metrics That Actually Matter
Connection Usage
PostgreSQL creates a new OS process for every client connection. This isn't free. Each connection consumes around 5 to 10MB of RAM, and the OS has to schedule and context-switch between all of them.
The most important connection metric is what fraction of your max_connections limit you're actually using:
SELECT
  count(*) AS total_connections,
  count(*) FILTER (WHERE state = 'active') AS active,
  count(*) FILTER (WHERE state = 'idle') AS idle,
  count(*) FILTER (WHERE state = 'idle in transaction') AS idle_in_transaction,
  (SELECT setting::int FROM pg_settings WHERE name = 'max_connections') AS max_connections
FROM pg_stat_activity
WHERE pid <> pg_backend_pid();
idle in transaction connections are particularly dangerous. A connection sitting in an open transaction holds locks and prevents autovacuum from cleaning up dead rows. If you see this number climbing, you likely have application code that opens transactions and then waits on external resources before committing.
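To see exactly which sessions are stuck this way, and for how long, query pg_stat_activity directly:

```sql
-- Sessions sitting idle inside an open transaction, oldest first
SELECT
  pid,
  usename,
  now() - state_change AS idle_for,
  query
FROM pg_stat_activity
WHERE state = 'idle in transaction'
ORDER BY state_change;
```

The query column shows the last statement the session ran before going idle, which is usually enough to trace the problem back to a specific code path.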
For most applications, if you're regularly exceeding 80% of max_connections, you need connection pooling. PgBouncer in transaction mode can multiplex hundreds of application connections through a small handful of actual Postgres connections.
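As a rough sketch of what that looks like, here is a minimal transaction-mode PgBouncer configuration. The hostnames, paths, and pool sizes are illustrative placeholders, not recommendations; tune them to your workload:

```ini
; pgbouncer.ini -- minimal transaction-pooling sketch (values illustrative)
[databases]
mydb = host=127.0.0.1 port=5432 dbname=mydb

[pgbouncer]
listen_addr = 0.0.0.0
listen_port = 6432
auth_type = md5
auth_file = /etc/pgbouncer/userlist.txt
pool_mode = transaction
max_client_conn = 500
default_pool_size = 20
```

With pool_mode = transaction, a server connection is returned to the pool at every COMMIT or ROLLBACK, which is what lets 500 client connections share 20 real ones. The tradeoff is that session-level state (prepared statements, advisory locks, SET commands) doesn't survive across transactions.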
Query Performance
pg_stat_statements is the single most useful view for understanding production performance. It tracks execution statistics for every distinct query seen since the extension was last reset, aggregated across all calls:
SELECT
  query,
  calls,
  round(total_exec_time::numeric, 2) AS total_ms,
  round(mean_exec_time::numeric, 2) AS mean_ms,
  round(stddev_exec_time::numeric, 2) AS stddev_ms,
  rows
FROM pg_stat_statements
ORDER BY total_exec_time DESC
LIMIT 20;
The total_exec_time column gives you the queries consuming the most cumulative time. This is usually the right place to start: a fast query that runs a million times a day costs more than a slow query that runs once.
The stddev_exec_time column is underused. A query with a mean of 10ms and a stddev of 200ms has pathological behavior: most calls are fast but occasional calls are catastrophically slow. This pattern often means missing indexes on filtered columns or a query plan that changes depending on parameter values.
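One way to surface these bimodal queries is to rank by stddev relative to mean (a rough coefficient of variation). The calls > 100 filter is an illustrative threshold to keep rarely-executed queries from dominating the list:

```sql
-- Rank queries by runtime variability: high stddev relative to the mean
-- usually means a plan that flips depending on parameter values
SELECT
  query,
  calls,
  round(mean_exec_time::numeric, 2) AS mean_ms,
  round(stddev_exec_time::numeric, 2) AS stddev_ms,
  round((stddev_exec_time / nullif(mean_exec_time, 0))::numeric, 1) AS variability
FROM pg_stat_statements
WHERE calls > 100
ORDER BY variability DESC NULLS LAST
LIMIT 20;
```

Queries near the top of this list are the ones to run through EXPLAIN (ANALYZE) with both typical and outlier parameter values.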
To enable pg_stat_statements:
-- Add to postgresql.conf:
-- shared_preload_libraries = 'pg_stat_statements'
-- After restart:
CREATE EXTENSION IF NOT EXISTS pg_stat_statements;
Cache Hit Ratio
PostgreSQL uses shared buffers as its in-memory cache. If your working set fits in memory, most reads are served from cache rather than disk. The buffer hit ratio tells you how well this is working:
SELECT
  sum(heap_blks_read) AS heap_read,
  sum(heap_blks_hit) AS heap_hit,
  round(
    sum(heap_blks_hit)::numeric /
    nullif(sum(heap_blks_hit) + sum(heap_blks_read), 0) * 100,
    2
  ) AS hit_ratio_pct
FROM pg_statio_user_tables;
A ratio below 90% on an OLTP database usually means your shared_buffers setting is too small, or your working set is genuinely larger than available RAM. The fix is either more RAM or a read replica to distribute load.
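Indexes have their own I/O counters, and it's worth computing the same ratio for index blocks, which are usually hotter than heap blocks and should score even higher:

```sql
-- Index block cache hit ratio, analogous to the heap calculation above
SELECT
  round(
    sum(idx_blks_hit)::numeric /
    nullif(sum(idx_blks_hit) + sum(idx_blks_read), 0) * 100,
    2
  ) AS index_hit_ratio_pct
FROM pg_statio_user_indexes;
```

A heap ratio that looks healthy can hide an index ratio that isn't, which points at indexes too large to stay cached.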
Table Bloat and Autovacuum
PostgreSQL uses MVCC (multi-version concurrency control). When you update or delete a row, the old version stays on disk until autovacuum cleans it up. If autovacuum falls behind, tables accumulate dead rows: "bloat." A highly bloated table is slower to scan and wastes disk space.
This query shows which tables have the most dead tuples relative to live ones:
SELECT
  schemaname,
  relname AS table_name,
  n_live_tup,
  n_dead_tup,
  round(n_dead_tup::numeric / nullif(n_live_tup + n_dead_tup, 0) * 100, 1) AS dead_pct,
  last_autovacuum,
  last_autoanalyze
FROM pg_stat_user_tables
WHERE n_live_tup > 1000
ORDER BY n_dead_tup DESC
LIMIT 20;
Tables with a dead_pct above 20% need attention. If autovacuum is running but not keeping up, you need to tune it: either lower autovacuum_vacuum_scale_factor for large tables, or increase autovacuum_max_workers if you have many tables that need vacuuming simultaneously.
For tables that see extremely high write throughput, you can override autovacuum settings at the table level:
ALTER TABLE orders
  SET (autovacuum_vacuum_scale_factor = 0.01,
       autovacuum_vacuum_threshold = 100);
This tells autovacuum to trigger a vacuum on the orders table after 1% of rows change, rather than the default 20%. For a 10 million row table, the default means waiting until 2 million rows are dead before cleaning up.
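To confirm which tables carry per-table overrides like this, check reloptions in pg_class:

```sql
-- Tables with storage or autovacuum parameters overridden at the table level
SELECT relname, reloptions
FROM pg_class
WHERE reloptions IS NOT NULL
  AND relkind = 'r';
```

This is worth auditing periodically: overrides set during an incident two years ago have a way of outliving the problem they solved.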
Replication Lag
If you're running streaming replication (either for HA or read replicas), monitoring lag is critical. On the primary:
SELECT
  client_addr,
  state,
  sent_lsn,
  write_lsn,
  flush_lsn,
  replay_lsn,
  pg_wal_lsn_diff(sent_lsn, replay_lsn) AS replication_lag_bytes
FROM pg_stat_replication;
replication_lag_bytes tells you how far behind each replica is in bytes of WAL. You can convert this to approximate time, but the byte measure is more reliable: time is dependent on how busy the primary is, whereas bytes directly measures how much data hasn't been applied yet.
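If you do want a time-based number, a common approximation is to run this on the replica itself:

```sql
-- On the replica: time since the last replayed transaction. Caveat: this
-- value grows during idle periods on the primary (there is nothing new to
-- replay), so interpret it alongside the byte-based measure.
SELECT now() - pg_last_xact_replay_timestamp() AS approximate_replay_lag;
```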
For a complete picture of how high availability and replication interact, see our PostgreSQL High Availability guide.
Lock Waits
Lock contention is one of the sneakiest performance problems. A query that normally takes 10ms will wait indefinitely if it needs a lock held by another transaction. This query shows which queries are currently blocked and which sessions hold the locks blocking them:
SELECT
  blocked.pid AS blocked_pid,
  blocked.query AS blocked_query,
  blocking.pid AS blocking_pid,
  blocking.query AS blocking_query,
  blocked.wait_event_type,
  blocked.wait_event
FROM pg_stat_activity AS blocked
JOIN pg_stat_activity AS blocking
  ON blocking.pid = ANY(pg_blocking_pids(blocked.pid))
WHERE cardinality(pg_blocking_pids(blocked.pid)) > 0;
Long-running idle in transaction connections are the most common culprit. A connection that opens a transaction, acquires a lock, and then waits on application code blocks every other transaction that needs the same lock. Setting idle_in_transaction_session_timeout = '30s' in PostgreSQL 9.6+ will automatically terminate these connections.
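The timeout can be set cluster-wide without editing postgresql.conf by hand:

```sql
-- Applies to new sessions after the configuration reload
ALTER SYSTEM SET idle_in_transaction_session_timeout = '30s';
SELECT pg_reload_conf();
```

Sessions that exceed the timeout are terminated and their open transaction rolled back, so make sure your application handles the dropped connection by reconnecting.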
Tools for Collecting PostgreSQL Metrics
Prometheus and postgres_exporter
The most common open source stack is postgres_exporter feeding into Prometheus, with Grafana for visualization. postgres_exporter runs as a sidecar, connects to your database with a read-only user, and scrapes all the stats views above on a configurable interval.
The advantage is flexibility: you control exactly which metrics you collect, how long you retain them, and how you visualize them. The disadvantage is that you own all of it. Setting up a reliable Prometheus stack with proper retention, alerting, and visualization takes real engineering time.
A minimal postgres_exporter setup:
# docker-compose.yml
services:
  postgres-exporter:
    image: prometheuscommunity/postgres-exporter
    environment:
      DATA_SOURCE_NAME: "postgresql://monitoring_user:password@postgres:5432/mydb?sslmode=require"
    ports:
      - "9187:9187"
The monitoring_user needs these grants:
CREATE USER monitoring_user WITH PASSWORD 'password';
GRANT pg_monitor TO monitoring_user;
pg_monitor is a predefined role in PostgreSQL 10+ that grants read access to all monitoring-related views without superuser privileges.
pganalyze
pganalyze is a hosted service purpose-built for PostgreSQL monitoring. It does query performance analysis, index recommendations, and automated bloat detection. The main differentiator is that it normalizes query statistics: it groups queries by their structure (ignoring parameter values) and tracks performance over time with a timeline view that shows when a query got slower and whether it correlates with a schema change or data growth.
The free tier covers one database and is enough to get started. The main tradeoff versus Prometheus is that you're sending query statistics to an external service, which requires a data processing agreement for regulated industries.
Datadog and CloudWatch
If you're already paying for Datadog or AWS CloudWatch, the PostgreSQL integrations are solid and require minimal setup. Datadog's postgres integration scrapes all the key stats views and provides out-of-the-box dashboards and alert policies.
The tradeoff is cost: Datadog pricing is per host, which gets expensive at scale. CloudWatch RDS Enhanced Monitoring is cheaper but only applies to RDS instances.
pg_activity
For interactive debugging, pg_activity is a top-like interface for PostgreSQL. It shows running queries, their duration, locks, I/O, and CPU usage in real time:
pip install pg_activity
pg_activity -U postgres --dbname=mydb
This isn't for automated monitoring; it's for when you're actively debugging an incident and need to see what's happening right now.
What to Alert On
Not every metric worth tracking needs an alert. Alert fatigue is real: if your alerts fire frequently and rarely require action, engineers start ignoring them.
These are the conditions worth waking someone up for:
High connection usage. Alert when total connections exceed 85% of max_connections. Once the limit is reached (minus the slots held back by superuser_reserved_connections), new connection attempts are rejected outright, which surfaces as application errors.
Replication lag above threshold. What the threshold is depends on your recovery time objective. For most applications, lag above 60 seconds on a replica that serves traffic is worth alerting on. For HA failover replicas, even 10 seconds of lag means potential data loss in a failover scenario.
Long-running transactions. Any transaction open for more than 5 minutes is almost certainly stuck or abandoned. This is a lock risk and a bloat risk.
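A query to find these directly, with a cutoff mirroring the 5-minute alert threshold above:

```sql
-- Transactions open longer than five minutes, oldest first
SELECT
  pid,
  usename,
  state,
  now() - xact_start AS xact_age,
  query
FROM pg_stat_activity
WHERE xact_start IS NOT NULL
  AND now() - xact_start > interval '5 minutes'
ORDER BY xact_start;
```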
Autovacuum not running. If last_autovacuum on a high-write table is more than a few hours old, autovacuum is either failing or misconfigured. The table will bloat.
Disk usage above 80%. PostgreSQL will crash if it runs out of disk space, and it will do so in a way that can require recovery. Alert early and plan for disk growth.
Checkpoint frequency too high. If checkpoints_req in pg_stat_bgwriter is growing faster than checkpoints_timed, your max_wal_size (and possibly checkpoint_timeout) settings need tuning. Frequent required checkpoints hurt write throughput.
SELECT
  checkpoints_timed,
  checkpoints_req,
  round(checkpoints_req::numeric / nullif(checkpoints_timed + checkpoints_req, 0) * 100, 1) AS pct_required
FROM pg_stat_bgwriter;
If pct_required is above 10%, increase max_wal_size. Note that on PostgreSQL 17 and later, these counters moved to the pg_stat_checkpointer view, as num_timed and num_requested.
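If you're on the Prometheus stack, the connection-usage alert above can be expressed as an alerting rule. The metric names below are the postgres_exporter defaults as I understand them, so verify them against what your exporter version actually emits:

```yaml
# Prometheus alerting rule sketch -- metric names assume postgres_exporter defaults
groups:
  - name: postgres
    rules:
      - alert: PostgresConnectionsHigh
        expr: sum(pg_stat_activity_count) / pg_settings_max_connections > 0.85
        for: 5m
        labels:
          severity: page
        annotations:
          summary: "PostgreSQL connection usage above 85% of max_connections"
```

The for: 5m clause keeps a momentary spike from paging anyone; only sustained pressure fires the alert.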
Index Health
Monitoring query performance tells you when queries are slow, but it doesn't always tell you why. Two index-specific queries help diagnose the most common cause: missing or unused indexes.
Unused indexes waste write performance and vacuum time without helping any query:
SELECT
  s.schemaname,
  s.relname AS table_name,
  s.indexrelname AS index_name,
  s.idx_scan,
  pg_size_pretty(pg_relation_size(s.indexrelid)) AS index_size
FROM pg_stat_user_indexes AS s
JOIN pg_index AS i ON i.indexrelid = s.indexrelid
WHERE s.idx_scan = 0
  AND NOT i.indisprimary
  AND NOT i.indisunique
ORDER BY pg_relation_size(s.indexrelid) DESC;
Indexes with idx_scan = 0 since the last stats reset (or since the database was restarted) are candidates for removal. Check the PostgreSQL Indexing Guide for the full picture on when to drop versus keep.
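The flip side, tables that may be missing an index, shows up as heavy sequential scanning. The row-count and scan-count thresholds below are illustrative starting points, not fixed rules:

```sql
-- Tables scanned sequentially more often than via index: candidates for a
-- closer look at their query patterns in pg_stat_statements
SELECT
  schemaname,
  relname,
  seq_scan,
  seq_tup_read,
  idx_scan,
  pg_size_pretty(pg_relation_size(relid)) AS table_size
FROM pg_stat_user_tables
WHERE seq_scan > coalesce(idx_scan, 0)
  AND n_live_tup > 10000
ORDER BY seq_tup_read DESC
LIMIT 20;
```

A high seq_scan count on a tiny table is harmless; sequential scans only hurt when seq_tup_read is large too.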
Monitoring in Production: Practical Setup
A minimal but effective monitoring setup for a production PostgreSQL database:
1. Enable pg_stat_statements as a shared preload library. This has almost no overhead and is the single highest-value observability addition you can make.
2. Create a dedicated monitoring user with the pg_monitor role. Never use superuser credentials for monitoring.
3. Collect metrics every 30 seconds to a time-series store (Prometheus, CloudWatch, or Datadog). This gives you enough resolution to catch spikes without overwhelming the database.
4. Set up a weekly query review. Look at the top 10 queries by total time in pg_stat_statements every week. This is where most optimization opportunities hide.
5. Alert on the conditions above, not on every metric. Keep the signal-to-noise ratio high so alerts remain meaningful.
6. Back up your monitoring data separately from your database. If your database is having a problem, you need to be able to look at historical metrics from before the incident started. See our guide on PostgreSQL backup and recovery for a full production backup strategy.
What Rivestack Handles for You
If you're running PostgreSQL on Rivestack, the monitoring fundamentals are already in place. Rivestack exposes connection counts, active query duration, replication lag, and disk usage through a built-in dashboard, and alerts fire automatically on the conditions above without requiring any setup on your part.
pg_stat_statements is enabled by default on every Rivestack database. You can query it directly through your preferred SQL client or connect to it via the Rivestack metrics API.
For teams that want deeper observability, Rivestack supports read-only monitoring credentials so you can connect postgres_exporter or pganalyze without any additional configuration.
Starting Points
The most important thing is to start simple. You don't need the full Prometheus stack before you've shipped. What you do need, immediately, is pg_stat_statements enabled and a dashboard that shows connection usage and disk growth. Those two things will catch 80% of production database problems before they become incidents.
Add alerts for connection spikes and replication lag. Set idle_in_transaction_session_timeout. Review slow queries weekly. Everything else you can add as you learn what actually causes problems in your specific workload.
When you're ready to take the operational burden off your team entirely, try Rivestack for managed PostgreSQL with monitoring built in from day one.
Originally published at rivestack.io