DEV Community

Cover image for Progress Reporting In PostgreSQL
Bolaji Wahab
Bolaji Wahab

Posted on

Progress Reporting In PostgreSQL

Can we monitor the progress of certain operations in PostgreSQL? Yes, we can monitor the progress of these operations in PostgreSQL:

  • create index
  • create index concurrently
  • reindex
  • reindex concurrently
  • vacuum
  • vacuum analyze
  • vacuum full
  • analyze
  • copy
  • cluster
  • basebackup

Let us jump right into it. The query snippets below can be used to monitor the progress of the listed-above operations.

Monitoring the progress of create index, create index concurrently, reindex, reindex concurrently

SELECT p.datname                                                 AS database_name,
       p.pid,
       clock_timestamp() - a.xact_start                          AS duration_so_far,
       a.application_name,
       a.client_addr,
       a.usename,
       coalesce(a.wait_event_type ||'.'|| a.wait_event, 'false') AS waiting,
       p.command,
       trim(trailing ';' from a.query)                           AS query,
       a.state,
       p.index_relid::regclass                                   AS index_name,
       p.relid::regclass                                         AS table_name,
       pg_size_pretty(pg_relation_size(p.relid))                 AS table_size,
       p.phase,
       CASE p.phase
           WHEN 'initializing' THEN '1 of 12'
           WHEN 'waiting for writers before build' THEN '2 of 12'
           WHEN 'building index: scanning table' THEN '3 of 12'
           WHEN 'building index: sorting live tuples' THEN '4 of 12'
           WHEN 'building index: loading tuples in tree' THEN '5 of 12'
           WHEN 'waiting for writers before validation' THEN '6 of 12'
           WHEN 'index validation: scanning index' THEN '7 of 12'
           WHEN 'index validation: sorting tuples' THEN '8 of 12'
           WHEN 'index validation: scanning table' THEN '9 of 12'
           WHEN 'waiting for old snapshots' THEN '10 of 12'
           WHEN 'waiting for readers before marking dead' THEN '11 of 12'
           WHEN 'waiting for readers before dropping' THEN '12 of 12'
       END AS phase_progress,
       format(
           '%s (%s of %s)',
           coalesce(round(100.0 * p.blocks_done / nullif(p.blocks_total, 0), 2)::text || '%', 'not applicable'),
           p.blocks_done::text,
           p.blocks_total::text
       ) AS scan_progress,
       format(
           '%s (%s of %s)',
           coalesce(round(100.0 * p.tuples_done / nullif(p.tuples_total, 0), 2)::text || '%', 'not applicable'),
           p.tuples_done::text,
           p.tuples_total::text
       ) AS tuples_loading_progress,
       format(
           '%s (%s of %s)',
           coalesce((100 * p.lockers_done / nullif(p.lockers_total, 0))::text || '%', 'not applicable'),
           p.lockers_done::text,
           p.lockers_total::text
       ) AS lockers_progress,
       format(
           '%s (%s of %s)',
           coalesce((100 * p.partitions_done / nullif(p.partitions_total, 0))::text || '%', 'not applicable'),
           p.partitions_done::text,
           p.partitions_total::text
       ) AS partitions_progress,
       p.current_locker_pid,
       trim(trailing ';' from l.query) AS current_locker_query
  FROM pg_stat_progress_create_index   AS p
  JOIN pg_stat_activity                AS a ON a.pid = p.pid
  LEFT JOIN pg_stat_activity           AS l ON l.pid = p.current_locker_pid
 ORDER BY clock_timestamp() - a.xact_start DESC;
Enter fullscreen mode Exit fullscreen mode

Monitoring the progress of vacuum, vacuum analyze

SELECT p.datname                                                 AS database_name,
       p.pid,
       clock_timestamp() - a.xact_start                          AS duration_so_far,
       a.application_name,
       a.client_addr,
       a.usename,
       coalesce(a.wait_event_type ||'.'|| a.wait_event, 'false') AS waiting,
       trim(trailing ';' from a.query)                           AS query,
       a.state,
       p.relid::regclass                                         AS table_name,
       pg_size_pretty(pg_relation_size(p.relid))                 AS table_size,
       pg_size_pretty(pg_total_relation_size(p.relid))           AS total_table_size,
       CASE
           WHEN ltrim(a.query) ~* '^autovacuum.*to prevent wraparound' THEN 'wraparound'
           WHEN ltrim(a.query) ~* '^vacuum' THEN 'user'
           ELSE 'regular'
       END AS mode,
       p.phase,
       CASE p.phase  
           WHEN 'initializing' THEN '1 of 7'
           WHEN 'scanning heap' THEN '2 of 7'
           WHEN 'vacuuming indexes' THEN '3 of 7'
           WHEN 'vacuuming heap' THEN '4 of 7'
           WHEN 'cleaning up indexes' THEN '5 of 7'
           WHEN 'truncating heap' THEN '6 of 7'
           WHEN 'performing final cleanup' THEN '7 of 7'
       END AS vacuum_phase_progress,
       format(
           '%s (%s of %s)',
           coalesce(round(100.0 * p.heap_blks_scanned / nullif(p.heap_blks_total, 0), 2)::text || '%', 'not applicable'),
           p.heap_blks_scanned::text,
           p.heap_blks_total::text
       ) AS vacuum_scan_progress,
       format(
           '%s (%s of %s)',
           coalesce(round(100.0 * p.heap_blks_vacuumed / nullif(p.heap_blks_total, 0), 2)::text || '%', 'not applicable'),
           p.heap_blks_vacuumed::text,
           p.heap_blks_total::text
       ) AS vacuum_progress,
       p.index_vacuum_count,
       p.max_dead_tuples,
       p.num_dead_tuples
  FROM pg_stat_progress_vacuum  AS p
  JOIN pg_stat_activity         AS a ON a.pid = p.pid
 ORDER BY clock_timestamp() - a.xact_start DESC;
Enter fullscreen mode Exit fullscreen mode

Monitoring the progress of vacuum full, cluster

SELECT p.datname                                                 AS database_name,
       p.pid,
       clock_timestamp() - a.xact_start                          AS duration_so_far,
       a.application_name,
       a.client_addr,
       a.usename,
       coalesce(a.wait_event_type ||'.'|| a.wait_event, 'false') AS waiting,
       p.command,
       trim(trailing ';' from a.query)                           AS query,
       a.state,
       p.relid::regclass                                         AS table_name,
       p.phase,
       CASE p.phase  
           WHEN 'initializing' THEN '1 of 8'
           WHEN 'seq scanning heap' THEN '2 of 8'
           WHEN 'index scanning heap' THEN '3 of 8'
           WHEN 'sorting tuples' THEN '4 of 8'
           WHEN 'writing new heap' THEN '5 of 8'
           WHEN 'swapping relation files' THEN '6 of 8'
           WHEN 'rebuilding index' THEN '7 of 8'
           WHEN 'performing final cleanup' THEN '7 of 8'
       END AS vacuum_phase_progress,
       cluster_index_relid::regclass AS cluster_index,
       format(
           '%s (%s of %s)',
           coalesce(round(100.0 * p.heap_blks_scanned / nullif(p.heap_blks_total, 0), 2)::text || '%', 'not applicable'),
           p.heap_blks_scanned::text,
           p.heap_blks_total::text
       ) AS heap_scan_progress,
       format(
           '%s (%s of %s)',
           coalesce(round(100.0 * p.heap_tuples_written / nullif(p.heap_tuples_scanned, 0), 2)::text || '%', 'not applicable'),
           p.heap_tuples_written::text,
           p.heap_tuples_scanned::text
       ) AS heap_tuples_written_progress,
       p.index_rebuild_count
  FROM pg_stat_progress_cluster AS p
  JOIN pg_stat_activity         AS a ON a.pid = p.pid
 ORDER BY clock_timestamp() - a.xact_start DESC;
Enter fullscreen mode Exit fullscreen mode

Monitoring the progress of analyze

SELECT p.datname                                                 AS database_name,
       p.pid,
       clock_timestamp() - a.xact_start                          AS duration_so_far,
       a.application_name,
       a.client_addr,
       a.usename,
       coalesce(a.wait_event_type ||'.'|| a.wait_event, 'false') AS waiting,
       trim(trailing ';' from a.query)                           AS query,
       a.state,
       p.relid::regclass                                         AS table_name,
       pg_size_pretty(pg_relation_size(p.relid))                 AS table_size,
       pg_size_pretty(pg_total_relation_size(p.relid))             AS total_table_size,
       CASE
           WHEN ltrim(a.query) ~* '^analyze' THEN 'user'
           ELSE 'regular'
       END AS mode,
       p.phase,
       CASE p.phase  
           WHEN 'initializing' THEN '1 of 6'
           WHEN 'acquiring sample rows' THEN '2 of 6'
           WHEN 'acquiring inherited sample rows' THEN '3 of 6'
           WHEN 'computing statistics' THEN '4 of 6'
           WHEN 'computing extended statistics' THEN '5 of 6'
           WHEN 'finalizing analyze' THEN '6 of 6'
       END AS phase_progress,
       format(
           '%s (%s of %s)',
           coalesce(round(100.0 * p.sample_blks_scanned / nullif(p.sample_blks_total, 0), 2)::text || '%', 'not applicable'),
           p.sample_blks_scanned::text,
           p.sample_blks_total::text
       ) AS scan_progress,
       format(
           '%s (%s of %s)',
           coalesce((100 * p.ext_stats_computed / nullif(p.ext_stats_total, 0))::text || '%', 'not applicable'),
           p.ext_stats_computed::text,
           p.ext_stats_total::text
       ) AS extended_statistics_progress,
       format(
           '%s (%s of %s)',
           coalesce((100 * p.ext_stats_computed / nullif(p.child_tables_total, 0))::text || '%', 'not applicable'),
           p.child_tables_done::text,
           p.child_tables_total::text
       ) AS child_tables_progress,
       current_child_table_relid::regclass AS current_child_table
  FROM pg_stat_progress_analyze AS p
  JOIN pg_stat_activity         AS a ON a.pid = p.pid
 ORDER BY clock_timestamp() - a.xact_start DESC;
Enter fullscreen mode Exit fullscreen mode

Monitoring the progress of copy

SELECT p.datname                                                      AS database_name,
       p.pid,
       clock_timestamp() - a.xact_start                               AS duration_so_far,
       a.application_name,
       a.client_addr,
       a.usename,
       coalesce(a.wait_event_type ||'.'|| a.wait_event, 'false')      AS waiting,
       p.command,
       p.type,
       trim(trailing ';' from a.query)                                AS query,
       a.state,
       p.relid::regclass                                              AS table_name,
       coalesce(pg_size_pretty(pg_relation_size(p.relid)), '-')       AS table_size,
       coalesce(pg_size_pretty(pg_total_relation_size(p.relid)), '-') AS total_table_size,
       format(
           '%s (%s of %s)',
           coalesce(round(100.0 * p.bytes_processed / nullif(p.bytes_total, 0), 2)::text || '%', 'not applicable'),
           p.bytes_processed::text,
           p.bytes_total::text
       ) AS bytes_progress,
       p.tuples_processed,
       p.tuples_excluded
  FROM pg_stat_progress_copy  AS p
  JOIN pg_stat_activity       AS a ON a.pid = p.pid
 ORDER BY clock_timestamp() - a.xact_start DESC;
Enter fullscreen mode Exit fullscreen mode

Monitoring the progress of basebackup

SELECT a.datname                                                      AS database_name,
       p.pid,
       clock_timestamp() - a.query_start                              AS duration_so_far,
       a.application_name,
       a.client_addr,
       a.usename,
       coalesce(a.wait_event_type ||'.'|| a.wait_event, 'false')      AS waiting,
       trim(trailing ';' from a.query)                                AS query,
       a.state,
       p.phase,
       CASE p.phase  
           WHEN 'initializing' THEN '1 of 6'
           WHEN 'waiting for checkpoint to finish' THEN '2 of 6'
           WHEN 'estimating backup size' THEN '3 of 6'
           WHEN 'streaming database files' THEN '4 of 6'
           WHEN 'waiting for wal archiving to finish' THEN '5 of 6'
           WHEN 'transferring wal files' THEN '6 of 6'
       END AS phase_progress,
       format(
           '%s (%s of %s)',
           coalesce(round(100.0 * p.backup_streamed / nullif(p.backup_total, 0), 2)::text || '%', 'not applicable'),
           p.backup_streamed::text,
           coalesce(p.backup_total::text, '0')
       ) AS backup_progress,
       format(
           '%s (%s of %s)',
           coalesce((100 * p.tablespaces_streamed / nullif(p.tablespaces_total, 0))::text || '%', 'not applicable'),
           p.tablespaces_streamed::text,
           p.tablespaces_total::text
       ) AS tablespace_progress
  FROM pg_stat_progress_basebackup  AS p
  JOIN pg_stat_activity             AS a ON a.pid = p.pid
 ORDER BY clock_timestamp() - a.query_start DESC;
Enter fullscreen mode Exit fullscreen mode

Little explanation:

The above queries have lot of things in common.

  • There is a phase column which tells what phase the process is currently on.
  • A phase_progress which gives the ratio of the current phase number and the total expected number of phases, which is gotten from the description of the respective views.
  • Then we have progress for the phases provided they are available otherwise not applicable.
  • For system/user commands such as analyze and vacuum, we have this expression trim(leading from a.query) which removes leading spaces before we compare to deduce whether it is autovacuum/autoanalyze or manual vacuum/manual analyze.
  • The combination of coalesce and nullif produces not applicable when one of the variables is NULL or 0.
  • Expression coalesce(a.wait_event_type ||'.'|| a.wait_event, 'false') gives details whether the process is waiting or not.
  • Finally, trim(trailing ';' from a.query) removes ; from the query text to give a nice format.

I hope you find these queries useful for your progress reporting.

P.S: All queries were tested on PostgreSQL 14.

Top comments (0)