<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Saulo José Benvenutti</title>
    <description>The latest articles on DEV Community by Saulo José Benvenutti (@saulojb).</description>
    <link>https://dev.to/saulojb</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3882490%2F788286d3-4fe2-4043-a959-caaf0bb9e7bd.png</url>
      <title>DEV Community: Saulo José Benvenutti</title>
      <link>https://dev.to/saulojb</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/saulojb"/>
    <language>en</language>
    <item>
      <title>storage_engine: Two High-Performance Table Access Methods for PostgreSQL Analytics and HTAP Workloads</title>
      <dc:creator>Saulo José Benvenutti</dc:creator>
      <pubDate>Thu, 16 Apr 2026 13:13:44 +0000</pubDate>
      <link>https://dev.to/saulojb/storageengine-two-high-performance-table-access-methods-for-postgresql-analytics-and-htap-4hgp</link>
      <guid>https://dev.to/saulojb/storageengine-two-high-performance-table-access-methods-for-postgresql-analytics-and-htap-4hgp</guid>
      <description>&lt;p&gt;After working with PostgreSQL's Table Access Method (TAM) API introduced in version 12, I built &lt;code&gt;storage_engine&lt;/code&gt; — a PostgreSQL extension that ships two specialized storage engines as first-class AMs: &lt;code&gt;colcompress&lt;/code&gt; for column-oriented analytics and &lt;code&gt;rowcompress&lt;/code&gt; for append-heavy compressed workloads. Both coexist with standard heap tables in the same database, and both are available on &lt;a href="https://pgxn.org/" rel="noopener noreferrer"&gt;PGXN&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;This article is aimed at DBAs and database architects evaluating columnar or compressed storage for analytical and HTAP workloads without leaving the PostgreSQL ecosystem.&lt;/p&gt;




&lt;h2&gt;
  
  
  Lineage and Honest Attribution
&lt;/h2&gt;

&lt;p&gt;&lt;code&gt;storage_engine&lt;/code&gt; is a fork of &lt;a href="https://github.com/hydradatabase/hydra" rel="noopener noreferrer"&gt;Hydra Columnar&lt;/a&gt;, which is itself derived from &lt;a href="https://github.com/citusdata/citus" rel="noopener noreferrer"&gt;citus_columnar&lt;/a&gt; — originally built by Citus Data, now part of Microsoft. The original code is copyright Citus Data / Hydra, licensed under AGPL-3.0. I extend those foundations with &lt;code&gt;rowcompress&lt;/code&gt;, full DELETE/UPDATE support, MergeTree-style ordering, two-level zone-map pruning, and a redesigned parallel scan.&lt;/p&gt;

&lt;p&gt;Ideas borrowed from other systems are acknowledged explicitly:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;ClickHouse MergeTree&lt;/strong&gt;: per-table &lt;code&gt;orderby&lt;/code&gt; sort key and stripe/chunk-level zone-map pruning&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Apache Parquet&lt;/strong&gt;: row-group statistics, column projection, dictionary encoding&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;DuckDB&lt;/strong&gt;: vectorized expression evaluation for columnar batches&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;I mention this upfront because the PostgreSQL community values transparency, and because it helps you understand what is new versus what is inherited.&lt;/p&gt;




&lt;h2&gt;
  
  
  Why Two Access Methods in One Extension?
&lt;/h2&gt;

&lt;p&gt;Most workloads are not purely analytical or purely transactional. A common HTAP pattern involves:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Wide fact tables updated in bulk and queried with selective column projections → &lt;strong&gt;colcompress&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;High-volume append-only logs, audit trails, or time-series where all columns are read together → &lt;strong&gt;rowcompress&lt;/strong&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Having both AMs in a single extension means a single &lt;code&gt;CREATE EXTENSION storage_engine&lt;/code&gt; installs both, all catalog objects land in the &lt;code&gt;engine&lt;/code&gt; schema, and C symbols carry the &lt;code&gt;se_&lt;/code&gt; prefix to avoid conflicts with &lt;code&gt;citus_columnar&lt;/code&gt; or any other columnar extension you may already have loaded.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="n"&gt;EXTENSION&lt;/span&gt; &lt;span class="n"&gt;storage_engine&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="c1"&gt;-- Column-oriented analytics table&lt;/span&gt;
&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;events&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;ts&lt;/span&gt;         &lt;span class="n"&gt;timestamptz&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;user_id&lt;/span&gt;    &lt;span class="nb"&gt;bigint&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;event_type&lt;/span&gt; &lt;span class="nb"&gt;text&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;value&lt;/span&gt;      &lt;span class="n"&gt;float8&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;USING&lt;/span&gt; &lt;span class="n"&gt;colcompress&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="c1"&gt;-- Row-oriented compressed log table&lt;/span&gt;
&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;audit_log&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;id&lt;/span&gt;        &lt;span class="n"&gt;bigserial&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;logged_at&lt;/span&gt; &lt;span class="n"&gt;timestamptz&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;message&lt;/span&gt;   &lt;span class="nb"&gt;text&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;USING&lt;/span&gt; &lt;span class="n"&gt;rowcompress&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Both tables live alongside heap tables in the same database. Joins between them and heap tables work normally through the standard PostgreSQL executor.&lt;/p&gt;
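
&lt;p&gt;As an illustrative sketch (the &lt;code&gt;users&lt;/code&gt; heap table below is assumed for the example, not part of the extension), a cross-AM join requires no special syntax:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;-- Hypothetical heap dimension table for illustration
CREATE TABLE users (
    user_id bigint PRIMARY KEY,
    country text
);

-- colcompress fact table joined to a heap table via the standard executor
SELECT u.country, count(*)
FROM events e
JOIN users u USING (user_id)
GROUP BY u.country;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;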




&lt;h2&gt;
  
  
  colcompress: Column-Oriented Storage with Vectorized Execution
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Storage Layout
&lt;/h3&gt;

&lt;p&gt;Data is stored column by column. Each column is divided into &lt;em&gt;stripes&lt;/em&gt; (default 150,000 rows), and each stripe is subdivided into &lt;em&gt;chunk groups&lt;/em&gt; (default 10,000 rows). Every chunk records its column's minimum and maximum value:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Table file
├── Stripe 1  (rows 1 – 150,000)
│   ├── Chunk group 0  (rows 1 – 10,000)
│   │   ├── Column A  [min, max, compressed values…]
│   │   ├── Column B  [min, max, compressed values…]
│   │   └── …
│   └── Chunk group 1  (rows 10,001 – 20,000)  …
└── Stripe 2  (rows 150,001 – 300,000)  …
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;A scan reads only the columns referenced by the query. On a 30-column table where a query touches 4 columns, the I/O reduction is roughly 87%. This is the fundamental advantage of columnar storage for analytics.&lt;/p&gt;
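
&lt;p&gt;For instance, on the &lt;code&gt;events&lt;/code&gt; table defined earlier, an aggregate like the following reads and decompresses only the &lt;code&gt;event_type&lt;/code&gt; and &lt;code&gt;value&lt;/code&gt; columns; &lt;code&gt;ts&lt;/code&gt; and &lt;code&gt;user_id&lt;/code&gt; stay untouched on disk:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;-- Only the two referenced columns are read from storage
SELECT event_type, avg(value)
FROM events
GROUP BY event_type;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;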

&lt;h3&gt;
  
  
  Compression Algorithms
&lt;/h3&gt;

&lt;p&gt;Each chunk group is compressed independently. Available algorithms:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Name&lt;/th&gt;
&lt;th&gt;Description&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;none&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;No compression&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;lz4&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Fast compression/decompression, moderate ratio&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;zstd&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;High ratio, configurable level 1–19 (default: 3)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;pglz&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;PostgreSQL's built-in LZ variant&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;engine&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;alter_colcompress_table_set&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'events'&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="n"&gt;regclass&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;compression&lt;/span&gt;       &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="s1"&gt;'zstd'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;compression_level&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;9&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Two-Level Zone-Map Pruning
&lt;/h3&gt;

&lt;p&gt;This is one of the most impactful features for DBAs managing large time-series or event tables.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Stripe-level pruning (coarse)&lt;/strong&gt; — Before reading any data, the scan aggregates min/max across all chunks of each stripe and tests the resulting stripe-wide range against the query's WHERE predicates using PostgreSQL's &lt;code&gt;predicate_refuted_by&lt;/code&gt;. Stripes provably disjoint from the predicate are skipped entirely — no decompression, no I/O. EXPLAIN reports this directly:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Custom Scan (ColumnarScan) on events
  Engine Stripes Removed by Pruning: 41
  Engine Stripes Read: 12
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Chunk-level pruning (fine)&lt;/strong&gt; — Within each stripe that survives the coarse pass, individual chunk groups are tested against the same predicate. Chunk groups whose min/max range cannot satisfy the predicate are skipped.&lt;/p&gt;

&lt;p&gt;The two layers compose. On a large, well-sorted table, a date-range query eliminates entire stripes before touching them, then further prunes chunk groups within the survivors. The practical result is I/O comparable to an index scan, without the cost of maintaining a B-tree.&lt;/p&gt;
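
&lt;p&gt;To check pruning on your own data, wrap a range query on the sort key in &lt;code&gt;EXPLAIN (ANALYZE)&lt;/code&gt; and look for the stripe counters shown earlier (the date literals here are illustrative):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;-- Illustrative range predicate on the sort key
EXPLAIN (ANALYZE, BUFFERS)
SELECT count(*)
FROM events
WHERE ts &amp;gt;= '2026-01-01' AND ts &amp;lt; '2026-02-01';
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;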

&lt;p&gt;&lt;strong&gt;Pruning effectiveness scales with data sortedness.&lt;/strong&gt; This brings us to the MergeTree-inspired ordering mechanism.&lt;/p&gt;

&lt;h3&gt;
  
  
  MergeTree-Like Ordering
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- Define a global sort key&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;engine&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;alter_colcompress_table_set&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="s1"&gt;'events'&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="n"&gt;regclass&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;orderby&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="s1"&gt;'ts ASC, user_id ASC'&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="c1"&gt;-- After a bulk load, compact and globally sort&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;engine&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;colcompress_merge&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'events'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;colcompress_merge&lt;/code&gt; copies all live rows to a temporary heap table, truncates the target, and re-inserts rows in the defined order — writing fresh, globally ordered stripes. After this, &lt;code&gt;WHERE ts BETWEEN x AND y&lt;/code&gt; skips almost all chunks on typical time-series data.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;DBA note:&lt;/strong&gt; &lt;code&gt;colcompress_merge&lt;/code&gt; acquires &lt;code&gt;AccessExclusiveLock&lt;/code&gt; for the duration of the operation. Schedule it during a maintenance window for large tables. There is no online/concurrent mode.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  Vectorized Execution
&lt;/h3&gt;

&lt;p&gt;&lt;code&gt;colcompress&lt;/code&gt; ships a vectorized expression evaluation engine that processes WHERE clauses and aggregates in column-oriented batches of up to 10,000 values per call. This avoids most per-row interpreter overhead and maps naturally onto column chunks.&lt;/p&gt;

&lt;p&gt;Supported vectorized operations:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Category&lt;/th&gt;
&lt;th&gt;Types&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Comparison operators (&lt;code&gt;=&lt;/code&gt;, &lt;code&gt;&amp;lt;&amp;gt;&lt;/code&gt;, &lt;code&gt;&amp;lt;&lt;/code&gt;, &lt;code&gt;&amp;lt;=&lt;/code&gt;, &lt;code&gt;&amp;gt;&lt;/code&gt;, &lt;code&gt;&amp;gt;=&lt;/code&gt;)&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;int2&lt;/code&gt;, &lt;code&gt;int4&lt;/code&gt;, &lt;code&gt;int8&lt;/code&gt;, &lt;code&gt;float4&lt;/code&gt;, &lt;code&gt;float8&lt;/code&gt;, &lt;code&gt;date&lt;/code&gt;, &lt;code&gt;timestamp&lt;/code&gt;, &lt;code&gt;timestamptz&lt;/code&gt;, &lt;code&gt;char&lt;/code&gt;, &lt;code&gt;bpchar&lt;/code&gt;, &lt;code&gt;text&lt;/code&gt;, &lt;code&gt;varchar&lt;/code&gt;, &lt;code&gt;name&lt;/code&gt;, &lt;code&gt;bool&lt;/code&gt;, &lt;code&gt;oid&lt;/code&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Aggregates (&lt;code&gt;count&lt;/code&gt;, &lt;code&gt;sum&lt;/code&gt;, &lt;code&gt;avg&lt;/code&gt;, &lt;code&gt;max&lt;/code&gt;, &lt;code&gt;min&lt;/code&gt;)&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;int2&lt;/code&gt;, &lt;code&gt;int4&lt;/code&gt;, &lt;code&gt;int8&lt;/code&gt;, &lt;code&gt;float8&lt;/code&gt;, &lt;code&gt;date&lt;/code&gt;, &lt;code&gt;timestamptz&lt;/code&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Both features are on by default and can be toggled per session:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SET&lt;/span&gt; &lt;span class="n"&gt;storage_engine&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;enable_vectorization&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;on&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;SET&lt;/span&gt; &lt;span class="n"&gt;storage_engine&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;enable_column_cache&lt;/span&gt;   &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;on&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Parallel Scan
&lt;/h3&gt;

&lt;p&gt;The AM implements the full PostgreSQL parallel Table AM protocol via Dynamic Shared Memory (DSM). The coordinator divides the stripe range across workers; each worker independently reads and decompresses its assigned stripes. Parallel scan stacks on top of vectorized execution — each worker runs its own vectorized pipeline.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SET&lt;/span&gt; &lt;span class="n"&gt;storage_engine&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;enable_parallel_execution&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;on&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;SET&lt;/span&gt; &lt;span class="n"&gt;storage_engine&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;min_parallel_processes&lt;/span&gt;    &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;8&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="c1"&gt;-- Standard PostgreSQL parallel knobs also apply&lt;/span&gt;
&lt;span class="k"&gt;SET&lt;/span&gt; &lt;span class="n"&gt;max_parallel_workers_per_gather&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Parallel scan and stripe pruning (v1.0.6+):&lt;/strong&gt; Stripe pruning is active in both the sequential and parallel scan paths. The coordinator pre-filters stripe IDs against the query predicate before distributing work to workers, so each worker receives only stripes that survive the min/max test — no wasted I/O in parallel mode. This is confirmed by the benchmarks: Q5 (date-range, 1 month) achieves 22.4ms in serial and 28.2ms in parallel — both with 6 of 7 stripes eliminated by pruning.&lt;/p&gt;

&lt;h3&gt;
  
  
  DELETE, UPDATE, and Upserts
&lt;/h3&gt;

&lt;p&gt;&lt;code&gt;colcompress&lt;/code&gt; implements DELETE and UPDATE via a &lt;strong&gt;row mask&lt;/strong&gt; stored in &lt;code&gt;engine.row_mask&lt;/code&gt;. Each deleted row is recorded as a bit in a per-chunk-group bitmask; the scan engine skips masked rows without rewriting the stripe. UPDATE is delete-then-insert. Deleted rows are reclaimed during VACUUM, which rewrites affected stripes and clears the mask.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SET&lt;/span&gt; &lt;span class="n"&gt;storage_engine&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;enable_dml&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;on&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;  &lt;span class="c1"&gt;-- default: on&lt;/span&gt;

&lt;span class="k"&gt;DELETE&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;events&lt;/span&gt; &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;ts&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="n"&gt;now&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;interval&lt;/span&gt; &lt;span class="s1"&gt;'1 year'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;UPDATE&lt;/span&gt; &lt;span class="n"&gt;events&lt;/span&gt; &lt;span class="k"&gt;SET&lt;/span&gt; &lt;span class="n"&gt;value&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;value&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;event_type&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'purchase'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="c1"&gt;-- Standard upserts also work, with a unique index on the conflict target&lt;/span&gt;
&lt;span class="k"&gt;INSERT&lt;/span&gt; &lt;span class="k"&gt;INTO&lt;/span&gt; &lt;span class="n"&gt;events&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ts&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;user_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;event_type&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;value&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;VALUES&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;now&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="mi"&gt;42&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'click'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;CONFLICT&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;user_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;event_type&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;DO&lt;/span&gt; &lt;span class="k"&gt;UPDATE&lt;/span&gt; &lt;span class="k"&gt;SET&lt;/span&gt; &lt;span class="n"&gt;value&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;EXCLUDED&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;value&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Index-Backed Scan
&lt;/h3&gt;

&lt;p&gt;For document repositories — tables storing XML, PDF, or JSON blobs that need columnar compression but are fetched by primary key — an index scan path is available:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- Enable per table (persisted across reconnects)&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;engine&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;alter_colcompress_table_set&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'documents'&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="n"&gt;regclass&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;index_scan&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="k"&gt;true&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="c1"&gt;-- Or per session for all colcompress tables&lt;/span&gt;
&lt;span class="k"&gt;SET&lt;/span&gt; &lt;span class="n"&gt;storage_engine&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;enable_columnar_index_scan&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;on&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The combination of &lt;code&gt;zstd&lt;/code&gt; compression (3–10× ratio on large binary/text documents) and index-driven point lookups is compelling for document storage use cases.&lt;/p&gt;
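
&lt;p&gt;A minimal sketch of such a document table (this schema is assumed for illustration and not shipped by the extension):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;-- Hypothetical document-store table
CREATE TABLE documents (
    doc_id bigint PRIMARY KEY,
    body   bytea
) USING colcompress;

SELECT engine.alter_colcompress_table_set('documents'::regclass,
    compression =&amp;gt; 'zstd',
    index_scan  =&amp;gt; true);

-- Point lookup served by the index scan path
SELECT body FROM documents WHERE doc_id = 12345;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;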

&lt;p&gt;&lt;strong&gt;Do not create B-tree indexes on columns covered by the &lt;code&gt;orderby&lt;/code&gt; key on analytical tables.&lt;/strong&gt; The PostgreSQL planner may prefer &lt;code&gt;IndexScan&lt;/code&gt; for range queries, which bypasses stripe pruning entirely (&lt;code&gt;randomAccess=true&lt;/code&gt;). Use GIN indexes for JSONB and array columns, and rely on stripe pruning for range predicates.&lt;/p&gt;




&lt;h2&gt;
  
  
  rowcompress: Batch-Compressed Row Storage
&lt;/h2&gt;

&lt;p&gt;&lt;code&gt;rowcompress&lt;/code&gt; stores rows in fixed-size batches (default 10,000 rows per batch). Each batch is serialized using heap tuple format and compressed as a single unit. Batch metadata — file offset, byte size, first row number, row count — is stored in &lt;code&gt;engine.row_batch&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;This AM suits append-heavy workloads where compression matters but column projection is not needed: event logs, audit trails, time-series tables where many or all columns are queried together. Typical storage savings are 2–10× with zstd.&lt;/p&gt;

&lt;p&gt;Compared to &lt;code&gt;colcompress&lt;/code&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Reads full rows (no column projection)&lt;/li&gt;
&lt;li&gt;Lower write latency per row (no columnar transposition at write time)&lt;/li&gt;
&lt;li&gt;No vectorized execution or chunk-level pruning&lt;/li&gt;
&lt;li&gt;Parallel reads via atomic batch claiming (workers self-schedule, zero coordinator overhead)&lt;/li&gt;
&lt;li&gt;Full compression algorithm support
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;engine&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;alter_rowcompress_table_set&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="s1"&gt;'audit_log'&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="n"&gt;regclass&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;batch_size&lt;/span&gt;        &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;10000&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;compression&lt;/span&gt;       &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="s1"&gt;'zstd'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;compression_level&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="c1"&gt;-- Rewrite all batches after changing compression options&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;engine&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;rowcompress_repack&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'audit_log'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Benchmarks
&lt;/h2&gt;

&lt;p&gt;Benchmark suite: 1,000,000 rows, PostgreSQL 18.3, AMD Ryzen 7 5800H (8-core), 40 GB RAM, &lt;code&gt;shared_buffers=10GB&lt;/code&gt;. &lt;code&gt;colcompress&lt;/code&gt; configured with &lt;code&gt;lz4&lt;/code&gt; compression and &lt;code&gt;orderby = 'event_date ASC'&lt;/code&gt; (globally sorted via &lt;code&gt;colcompress_merge&lt;/code&gt;). Results are the median of 3 runs.&lt;/p&gt;

&lt;h3&gt;
  
  
  Serial (JIT=off, max_parallel_workers_per_gather=0)
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Flxhuk0hvm99n3k8a4u2a.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Flxhuk0hvm99n3k8a4u2a.png" alt="Serial benchmark chart" width="800" height="299"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Query&lt;/th&gt;
&lt;th&gt;heap&lt;/th&gt;
&lt;th&gt;colcompress&lt;/th&gt;
&lt;th&gt;rowcompress&lt;/th&gt;
&lt;th&gt;citus_columnar&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Q1 &lt;code&gt;count(*)&lt;/code&gt;
&lt;/td&gt;
&lt;td&gt;39.8ms&lt;/td&gt;
&lt;td&gt;43.0ms&lt;/td&gt;
&lt;td&gt;313ms&lt;/td&gt;
&lt;td&gt;36.6ms&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Q2 &lt;code&gt;SUM/AVG&lt;/code&gt; numeric + double&lt;/td&gt;
&lt;td&gt;188.6ms&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;117.4ms&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;358ms&lt;/td&gt;
&lt;td&gt;122.9ms&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Q3 &lt;code&gt;GROUP BY&lt;/code&gt; country (10 vals)&lt;/td&gt;
&lt;td&gt;219.0ms&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;162.0ms&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;395ms&lt;/td&gt;
&lt;td&gt;139.4ms&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Q4 &lt;code&gt;GROUP BY&lt;/code&gt; event_type + p95&lt;/td&gt;
&lt;td&gt;538.9ms&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;448.4ms&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;685ms&lt;/td&gt;
&lt;td&gt;469.7ms&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Q5 date range 1 month&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;20.8ms&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;22.4ms&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;59.1ms&lt;/td&gt;
&lt;td&gt;20.6ms&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Q6 JSONB &lt;code&gt;@&amp;gt;&lt;/code&gt; GIN&lt;/td&gt;
&lt;td&gt;123.1ms&lt;/td&gt;
&lt;td&gt;162.2ms&lt;/td&gt;
&lt;td&gt;326ms&lt;/td&gt;
&lt;td&gt;238.1ms&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Q7 JSONB key + GROUP BY&lt;/td&gt;
&lt;td&gt;388.5ms&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;310.3ms&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;550ms&lt;/td&gt;
&lt;td&gt;358.2ms&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Q8 array &lt;code&gt;@&amp;gt;&lt;/code&gt; GIN&lt;/td&gt;
&lt;td&gt;63.0ms&lt;/td&gt;
&lt;td&gt;122.7ms&lt;/td&gt;
&lt;td&gt;274ms&lt;/td&gt;
&lt;td&gt;140.9ms&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Q9 LIKE text scan&lt;/td&gt;
&lt;td&gt;150.9ms&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;90.9ms&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;338ms&lt;/td&gt;
&lt;td&gt;89.9ms&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Q10 heavy multi-agg&lt;/td&gt;
&lt;td&gt;1953ms&lt;/td&gt;
&lt;td&gt;1939ms&lt;/td&gt;
&lt;td&gt;2109ms&lt;/td&gt;
&lt;td&gt;1925ms&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Q5 on &lt;code&gt;colcompress&lt;/code&gt; achieves heap-equivalent performance (22.4ms vs. 20.8ms) because stripe pruning eliminates 6 of 7 stripes — data is physically sorted by &lt;code&gt;event_date&lt;/code&gt; via &lt;code&gt;orderby&lt;/code&gt;. lz4 decompression adds negligible overhead over the pruned data.&lt;/p&gt;

&lt;p&gt;Q6 and Q8 are served by the GIN index on JSONB/array columns rather than by stripe pruning; heap's cheaper random tuple access explains its edge on these queries, which is expected.&lt;/p&gt;

&lt;h3&gt;
  
  
  Parallel (JIT=on, max_parallel_workers_per_gather=16)
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0a5ogmuqtbvldu2rdmb7.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0a5ogmuqtbvldu2rdmb7.png" alt="Parallel benchmark chart" width="800" height="299"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Query&lt;/th&gt;
&lt;th&gt;heap&lt;/th&gt;
&lt;th&gt;colcompress&lt;/th&gt;
&lt;th&gt;rowcompress&lt;/th&gt;
&lt;th&gt;citus_columnar&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Q1 &lt;code&gt;count(*)&lt;/code&gt;
&lt;/td&gt;
&lt;td&gt;18.3ms&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;16.4ms&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;148ms&lt;/td&gt;
&lt;td&gt;37.9ms&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Q2 &lt;code&gt;SUM/AVG&lt;/code&gt; numeric + double&lt;/td&gt;
&lt;td&gt;53.5ms&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;29.7ms&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;166ms&lt;/td&gt;
&lt;td&gt;121.5ms&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Q3 &lt;code&gt;GROUP BY&lt;/code&gt; country&lt;/td&gt;
&lt;td&gt;61.6ms&lt;/td&gt;
&lt;td&gt;166ms&lt;/td&gt;
&lt;td&gt;161ms&lt;/td&gt;
&lt;td&gt;143ms&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Q4 &lt;code&gt;GROUP BY&lt;/code&gt; event_type + p95&lt;/td&gt;
&lt;td&gt;540ms&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;316ms&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;674ms&lt;/td&gt;
&lt;td&gt;470ms&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Q5 date range 1 month&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;21.4ms&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;28.2ms&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;73.3ms&lt;/td&gt;
&lt;td&gt;21.1ms&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Q6 JSONB &lt;code&gt;@&amp;gt;&lt;/code&gt; GIN&lt;/td&gt;
&lt;td&gt;84.3ms&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;40.4ms&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;490ms&lt;/td&gt;
&lt;td&gt;245ms&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Q7 JSONB key + GROUP BY&lt;/td&gt;
&lt;td&gt;392ms&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;65.7ms&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;687ms&lt;/td&gt;
&lt;td&gt;362ms&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Q8 array &lt;code&gt;@&amp;gt;&lt;/code&gt; GIN&lt;/td&gt;
&lt;td&gt;61.6ms&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;32.7ms&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;273ms&lt;/td&gt;
&lt;td&gt;146ms&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Q9 LIKE text scan&lt;/td&gt;
&lt;td&gt;48.7ms&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;25.4ms&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;157ms&lt;/td&gt;
&lt;td&gt;91.7ms&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Q10 heavy multi-agg&lt;/td&gt;
&lt;td&gt;1903ms&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;641ms&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;2085ms&lt;/td&gt;
&lt;td&gt;1920ms&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Q5 on &lt;code&gt;colcompress&lt;/code&gt; (28.2ms) is now on par with the serial result (22.4ms) and heap (21.4ms). Since v1.0.6, the planner correctly applies &lt;code&gt;disable_cost&lt;/code&gt; to parallel index paths, ensuring &lt;code&gt;Parallel Custom Scan (ColcompressScan)&lt;/code&gt; with stripe pruning is chosen over &lt;code&gt;Parallel Index Scan&lt;/code&gt; when &lt;code&gt;index_scan=false&lt;/code&gt;.&lt;/p&gt;
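&lt;p&gt;A quick way to confirm the pruning-aware plan on your own tables is a plain &lt;code&gt;EXPLAIN&lt;/code&gt; (the table and column names here are illustrative):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;-- Expect "Parallel Custom Scan (ColcompressScan)", not "Parallel Index Scan",
-- when index_scan=false is set on the table.
EXPLAIN (ANALYZE, BUFFERS)
SELECT count(*)
FROM events
WHERE created_at &amp;gt;= date '2026-01-01'
  AND created_at &amp;lt; date '2026-02-01';
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;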

&lt;p&gt;The full benchmark kit is in &lt;code&gt;tests/bench/&lt;/code&gt;. See &lt;a href="https://github.com/saulojb/storage_engine/blob/main/BENCHMARKS.md" rel="noopener noreferrer"&gt;BENCHMARKS.md&lt;/a&gt; for environment details and reproduction steps.&lt;/p&gt;




&lt;h2&gt;
  
  
  Known Limitations for DBAs
&lt;/h2&gt;

&lt;p&gt;These are the limitations most relevant to production deployment decisions:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;No AFTER ROW triggers or foreign keys.&lt;/strong&gt; This is an architectural constraint of columnar storage. It also means &lt;code&gt;pg_repack&lt;/code&gt; cannot be used — it relies on AFTER ROW triggers internally. Use &lt;code&gt;engine.colcompress_repack()&lt;/code&gt; as a drop-in replacement.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;No VACUUM FULL / table rewrite.&lt;/strong&gt; Use &lt;code&gt;engine.colcompress_repack()&lt;/code&gt; / &lt;code&gt;engine.rowcompress_repack()&lt;/code&gt; instead.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;No CLUSTER support.&lt;/strong&gt; Use &lt;code&gt;engine.colcompress_merge()&lt;/code&gt; with an &lt;code&gt;orderby&lt;/code&gt; option to achieve equivalent physical ordering.&lt;/p&gt;
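&lt;p&gt;A sketch of a &lt;code&gt;CLUSTER&lt;/code&gt;-equivalent rewrite (the table name and sort key are illustrative, and the named-argument form for setting the &lt;code&gt;orderby&lt;/code&gt; option is an assumption):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;-- Assumed flow: set orderby as a table option, then rewrite to apply it.
SELECT engine.alter_colcompress_table_set('events'::regclass, orderby =&amp;gt; 'created_at');
SELECT engine.colcompress_merge('events'::regclass);
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;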

&lt;p&gt;&lt;strong&gt;No unlogged tables.&lt;/strong&gt; &lt;code&gt;CREATE UNLOGGED TABLE ... USING colcompress&lt;/code&gt; is not supported.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Sort-on-write is disabled when B-tree indexes exist.&lt;/strong&gt; Run &lt;code&gt;engine.colcompress_merge()&lt;/code&gt; after bulk loads to re-establish global sort order.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;code&gt;colcompress_repack&lt;/code&gt; is not online.&lt;/strong&gt; It acquires &lt;code&gt;AccessExclusiveLock&lt;/code&gt; for the full duration. Schedule during maintenance windows.&lt;/p&gt;
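&lt;p&gt;One way to keep that rewrite inside a maintenance window is to schedule it; if &lt;code&gt;pg_cron&lt;/code&gt; is available, a sketch (job name, schedule, and table are illustrative):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;-- Run the repack Sundays at 03:00, when the AccessExclusiveLock hurts least.
SELECT cron.schedule(
  'repack-events-weekly',
  '0 3 * * 0',
  $$SELECT engine.colcompress_repack('events'::regclass)$$
);
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;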

&lt;p&gt;&lt;strong&gt;AFTER STATEMENT triggers are supported.&lt;/strong&gt; Only row-level (&lt;code&gt;FOR EACH ROW&lt;/code&gt;) AFTER triggers are blocked.&lt;/p&gt;
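&lt;p&gt;Statement-level auditing therefore still works on &lt;code&gt;colcompress&lt;/code&gt; tables; a minimal sketch (table and function names are illustrative):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;CREATE FUNCTION log_events_change() RETURNS trigger
LANGUAGE plpgsql AS $$
BEGIN
  RAISE NOTICE 'bulk change on events at %', now();
  RETURN NULL;  -- return value is ignored for statement-level AFTER triggers
END $$;

CREATE TRIGGER events_audit
AFTER INSERT OR UPDATE OR DELETE ON events
FOR EACH STATEMENT EXECUTE FUNCTION log_events_change();
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;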




&lt;h2&gt;
  
  
  Management Reference
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Function&lt;/th&gt;
&lt;th&gt;Description&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;engine.alter_colcompress_table_set(regclass, ...)&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Set options on a colcompress table&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;engine.alter_colcompress_table_reset(regclass, ...)&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Reset colcompress options to system defaults&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;engine.colcompress_merge(regclass)&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Rewrite and globally sort a colcompress table by its &lt;code&gt;orderby&lt;/code&gt; key&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;engine.colcompress_repack(regclass)&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Alias for &lt;code&gt;colcompress_merge&lt;/code&gt;; drop-in replacement for &lt;code&gt;pg_repack&lt;/code&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;engine.alter_rowcompress_table_set(regclass, ...)&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Set options on a rowcompress table&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;engine.alter_rowcompress_table_reset(regclass, ...)&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Reset rowcompress options to system defaults&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;engine.rowcompress_repack(regclass)&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Rewrite all batches with current options&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Catalog views for inspection:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;View&lt;/th&gt;
&lt;th&gt;Description&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;engine.colcompress_options&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Per-table options for all colcompress tables&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;engine.colcompress_stripes&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Stripe-level metadata (offset, size, row range) per table&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;engine.rowcompress_options&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Per-table options for all rowcompress tables&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;engine.rowcompress_batches&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Batch-level metadata for all rowcompress tables&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;All views grant &lt;code&gt;SELECT&lt;/code&gt; to &lt;code&gt;PUBLIC&lt;/code&gt;.&lt;/p&gt;
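&lt;p&gt;Querying them needs no special setup; &lt;code&gt;SELECT *&lt;/code&gt; avoids guessing at column names, which may vary by version:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;-- Current options for every colcompress table:
SELECT * FROM engine.colcompress_options;

-- Physical layout: stripe offsets, sizes, and row ranges:
SELECT * FROM engine.colcompress_stripes;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;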




&lt;h2&gt;
  
  
  Installation
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Build from source
&lt;/h3&gt;

&lt;p&gt;Requires PostgreSQL server headers and &lt;code&gt;pg_config&lt;/code&gt; in PATH. Supports PostgreSQL 13–18.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;cd &lt;/span&gt;dist/
&lt;span class="nb"&gt;sudo &lt;/span&gt;make &lt;span class="nt"&gt;-j&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;nproc&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt; &lt;span class="nb"&gt;install&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Add to &lt;code&gt;postgresql.conf&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ini"&gt;&lt;code&gt;&lt;span class="py"&gt;shared_preload_libraries&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;'storage_engine'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If &lt;code&gt;citus&lt;/code&gt; or &lt;code&gt;pg_cron&lt;/code&gt; are also in &lt;code&gt;shared_preload_libraries&lt;/code&gt;, load order matters — &lt;code&gt;citus&lt;/code&gt; must appear before &lt;code&gt;storage_engine&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ini"&gt;&lt;code&gt;&lt;span class="py"&gt;shared_preload_libraries&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;'pg_cron,citus,storage_engine'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="n"&gt;EXTENSION&lt;/span&gt; &lt;span class="n"&gt;storage_engine&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
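&lt;p&gt;With the extension in place, tables opt in per table via &lt;code&gt;USING&lt;/code&gt;; a minimal smoke test (the table definition is illustrative):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;CREATE TABLE events_col (
  id          bigint,
  created_at  timestamptz,
  payload     jsonb
) USING colcompress;

-- Confirm the table access method actually attached:
SELECT am.amname
FROM pg_class c
JOIN pg_am am ON am.oid = c.relam
WHERE c.relname = 'events_col';
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;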



&lt;h2&gt;
  
  
  License
&lt;/h2&gt;

&lt;p&gt;AGPL-3.0. This is relevant for organizations that distribute modified versions or embed the extension in SaaS products — review the license terms accordingly.&lt;/p&gt;




&lt;h2&gt;
  
  
  Links
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;GitHub:&lt;/strong&gt; &lt;a href="https://github.com/saulojb/storage_engine" rel="noopener noreferrer"&gt;https://github.com/saulojb/storage_engine&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;PGXN:&lt;/strong&gt; &lt;a href="https://pgxn.org/" rel="noopener noreferrer"&gt;https://pgxn.org/&lt;/a&gt; &lt;em&gt;(search: storage_engine)&lt;/em&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;BENCHMARKS.md:&lt;/strong&gt; &lt;a href="https://github.com/saulojb/storage_engine/blob/main/BENCHMARKS.md" rel="noopener noreferrer"&gt;https://github.com/saulojb/storage_engine/blob/main/BENCHMARKS.md&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Hydra Columnar (upstream):&lt;/strong&gt; &lt;a href="https://github.com/hydradatabase/hydra" rel="noopener noreferrer"&gt;https://github.com/hydradatabase/hydra&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Feedback, issues, and pull requests are welcome on GitHub.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Saulo José Benvenutti — Data Architect / PostgreSQL DBA&lt;/em&gt;&lt;br&gt;
&lt;em&gt;📧 &lt;a href="mailto:saulojb@gmail.com"&gt;saulojb@gmail.com&lt;/a&gt; · 🔗 &lt;a href="https://github.com/saulojb" rel="noopener noreferrer"&gt;github.com/saulojb&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

</description>
      <category>database</category>
      <category>performance</category>
      <category>postgres</category>
      <category>showdev</category>
    </item>
  </channel>
</rss>
