DEV Community

Franck Pachot
Franck Pachot

Posted on • Edited on

Delete table and column statistics in PostgreSQL or YugabyteDB πŸ˜πŸš€

Anything I do, especially when it can change execution plans, should have its counterpart to revert back. In PostgreSQL we have an ANALYZE that gathers the statistics used by the query planner, but nothing to remove them. The idea is probably that, anyway, the automatic statistic gathering will put them back. But I still want to be able to control, at least for testing.

In YugabyteDB we use the PostgreSQL query planner but with additions to be cluster-aware, because it is a distributed SQL database. This is still in development. In the current version (2.11) ANALYZE is still a beta feature:

yugabyte=# select version();
                                                   version
-------------------------------------------------------------------------------------------------------------
 PostgreSQL 11.2-YB-2.11.2.0-b0 on x86_64-pc-linux-gnu, compiled by gcc (Homebrew gcc 5.5.0_4) 5.5.0, 64-bit
(1 row)

yugabyte=# analyze my_table;
WARNING:  'analyze' is a beta feature!
HINT:  Set 'ysql_beta_features' yb-tserver gflag to true to suppress the warning for all beta features.
ANALYZE
Enter fullscreen mode Exit fullscreen mode

column statistics

I can easily check the column statistics (histograms) from pg_stats. For example here is how I check the columns that have a lot of values in the histograms:

select cardinality(most_common_vals),* 
from pg_stats 
where tablename='my_table'
order by cardinality(most_common_vals) desc nulls last;
Enter fullscreen mode Exit fullscreen mode

The default of 100 max can be changed with set default_statistics_target=100; but that's not the point. pg_stats is a view. The table behind it is pg_statistic. Then to delete the columns statistics gathered, here is an example:

delete from pg_statistic  
where starelid in (
select c.oid from pg_class c 
join pg_namespace n on n.oid = c.relnamespace
and relname in ('my_table' )
);
Enter fullscreen mode Exit fullscreen mode

Note that on YugabyteDB, modifying the catalog is protected and you will get:

ERROR:  Illegal state: Transaction for catalog table write operation 'pg_statistic' not found
Enter fullscreen mode Exit fullscreen mode

You need to explicitely allow it with:

set yb_non_ddl_txn_for_sys_tables_allowed=on;
Enter fullscreen mode Exit fullscreen mode

And because stats are cached in the sessions (to avoid single contention on the catalog shared by the master server), you need to re-connect to verify the change.

Table statistics

Table statistics are in pg_class, easy to update here in the same way. For example, if you want to remove all effects of ANALYZE, you can run:

begin transaction;
set local yb_non_ddl_txn_for_sys_tables_allowed=on;
update pg_class set reltuples=0
from pg_namespace
where reltuples>0 
  and relnamespace=pg_namespace.oid
  and nspname <> 'pg_catalog'
  and nspname !~ '^pg_toast'
  and nspname <> 'information_schema'
returning format('reset stats for %I.%I to %s'
                 , nspname,relname,reltuples
          ) done
;
set local yb_non_ddl_txn_for_sys_tables_allowed=off;
commit;
Enter fullscreen mode Exit fullscreen mode

Note that what is run with yb_non_ddl_txn_for_sys_tables_allowed set is processed as DDL. As DDL is non transactional and run in an autonomous transaction, you may not see the effect within the transaction. I've run in begin transaction only to set the parameter locally.

Finally, here is how torevert the effect of ANALYZE in YugabyteDB in the database it is connected to:

begin;

-- Allow transaction on system catalog tables.
set local yb_non_ddl_txn_for_sys_tables_allowed=on;

-- Update reltuples to 0 in pg_class
UPDATE pg_class SET reltuples=0
FROM pg_namespace
WHERE reltuples > 0 AND
        relnamespace=pg_namespace.oid AND
        nspname <> 'pg_catalog' AND
        nspname !~ '^pg_toast' AND
        nspname <> 'information_schema'
RETURNING FORMAT('reset reltuples for %I.%I to %s',
nspname, relname, reltuples) done;

-- Delete column statistics from pg_statistic
delete from pg_statistic as s USING
(select c.oid, c.relname, n.nspname from pg_class c 
join pg_namespace n on 
n.oid = c.relnamespace AND
        n.nspname <> 'pg_catalog' AND
            n.nspname !~ '^pg_toast' AND
            n.nspname <> 'information_schema') AS n_c,
    (select attname, attnum, attrelid FROM pg_attribute) AS a
where s.starelid = n_c.oid and s.staattnum = a.attnum and a.attrelid = s.starelid
RETURNING FORMAT('deleted column statistics for %I.%I.%I',
n_c.nspname, n_c.relname, a.attname) done;


-- Disallow transaction on system catalog tables.
set local yb_non_ddl_txn_for_sys_tables_allowed=off;

end;

Enter fullscreen mode Exit fullscreen mode

Top comments (0)