DEV Community

Franck Pachot
Franck Pachot

Posted on

Delete column statistics in PostgreSQL or YugabyteDB πŸ˜πŸš€

Anything I do, especially when it can change execution plans, should have its counterpart to revert back. In PostgreSQL we have an ANALYZE that gathers the statistics used by the query planner, but nothing to remove them. The idea is probably that, anyway, the automatic statistic gathering will put them back. But I still want to be able to control, at least for testing.

In YugabyteDB we use the PostgreSQL query planner but with additions to be cluster-aware, because it is a distributed SQL database. This is still in development. In the current version (2.11) ANALYZE is still a beta feature:

yugabyte=# select version();
                                                   version
-------------------------------------------------------------------------------------------------------------
 PostgreSQL 11.2-YB-2.11.2.0-b0 on x86_64-pc-linux-gnu, compiled by gcc (Homebrew gcc 5.5.0_4) 5.5.0, 64-bit
(1 row)

yugabyte=# analyze my_table;
WARNING:  'analyze' is a beta feature!
HINT:  Set 'ysql_beta_features' yb-tserver gflag to true to suppress the warning for all beta features.
ANALYZE
Enter fullscreen mode Exit fullscreen mode

I can easily check the column statistics (histograms) from pg_stats. For example here is how I check the columns that have a lot of values in the histograms:

select cardinality(most_common_vals),* 
from pg_stats 
where tablename='my_table'
order by cardinality(most_common_vals) desc nulls last;
Enter fullscreen mode Exit fullscreen mode

The default of 100 max can be changed with set default_statistics_target=100; but that's not the point. pg_stats is a view. The table behind it is pg_statistic. Then to delete the columns statistics gathered, here is an example:

delete from pg_statistic  
where starelid in (
select c.oid from pg_class c 
join pg_namespace n on n.oid = c.relnamespace
and relname in ('my_table' )
);
Enter fullscreen mode Exit fullscreen mode

Note that on YugabyteDB, modifying the catalog is protected and you will get:

ERROR:  Illegal state: Transaction for catalog table write operation 'pg_statistic' not found
Enter fullscreen mode Exit fullscreen mode

You need to explicitely allow it with:

set yb_non_ddl_txn_for_sys_tables_allowed=on;
Enter fullscreen mode Exit fullscreen mode

And because stats are cached in the sessions (to avoid single contention on the catalog shared by the master server), you need to re-connect to verify the change.

This post is about the column statistics. Table statistics are in pg_class, easy to update here in the same way.

Discussion (0)