DEV Community

Franck Pachot for AWS Heroes

Posted on

PostgreSQL GIN + B-Tree

In the previous posts, I exposed how to build a table, in addition to the array datataype, to act as an index, but maintained by a trigger and queried with an explicit join. A GIN index may be preferable because it is maintained without the need to declare a trigger, and because the query planner can use it even when we query the table. However, querying the table was more efficient because my query, in addition to filtering on tag_ids or group_ids, filtered on the created_date timestamp. I was able to add it in my table custom table, but not in the GIN index:

postgres=> create index posts_by_user_group_ids_created on posts_by_user using gin (group_ids, created_date);
ERROR:  data type timestamp with time zone has no default operator class for access method "gin"
HINT:  You must specify an operator class for the index or define a default operator class for the data type.
Enter fullscreen mode Exit fullscreen mode

However, if we still want to use a GIN index, there's the possibility to combine GIN indexed column with BTREE indexed column thanks to the btree_gin extension:

postgres=> create extension btree_gin;
CREATE EXTENSION
Enter fullscreen mode Exit fullscreen mode

It is now possible to index on the group_ids[] and tag_ids[] with created_date timestamptz:

postgres=> create index posts_by_user_group_ids_created on posts_by_user using gin (group_ids, created_date);
CREATE INDEX
Time: 67771.641 ms (01:07.772)
postgres=> create index posts_by_user_tag_ids_created   on posts_by_user using gin (tag_ids, created_date);
CREATE INDEX
Time: 68495.108 ms (01:08.495)
Enter fullscreen mode Exit fullscreen mode

First, let's have a look at the executions plans before that.

Execution plans before

I did that on the table created in the previous post, filled with 5 million rows:

postgres=> \d posts_by_user
                                 Table "public.posts_by_user"
    Column    |           Type           | Collation | Nullable |           Default
--------------+--------------------------+-----------+----------+------------------------------
 user_id      | bigint                   |           | not null |
 post_id      | bigint                   |           | not null | generated always as identity
 group_ids    | bigint[]                 |           |          |
 tag_ids      | bigint[]                 |           |          |
 content      | text                     |           |          |
 created_date | timestamp with time zone |           | not null |
Indexes:
    "posts_by_user_pkey" PRIMARY KEY, btree (user_id, created_date, post_id)
    "posts_by_user_user_id_post_id_key" UNIQUE CONSTRAINT, btree (user_id, post_id)

postgres=> select count(*) from posts_by_user;
  count
---------
 5000000
(1 row)

Time: 417.648 ms
Enter fullscreen mode Exit fullscreen mode

Before creating the GIN index, here was my query:

postgres=> explain (analyze, buffers)
           select *
            from posts_by_user
            where created_date > now() - interval '1 month'
              and tag_ids @>'{1}'
order by created_date desc limit 100;
                                                                       QUERY PLAN

--------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=275177.48..275177.73 rows=100 width=405) (actual time=188.100..188.121 rows=100 loops=1)
   Buffers: shared hit=69605
   ->  Sort  (cost=275177.48..275178.02 rows=217 width=405) (actual time=188.098..188.111 rows=100 loops=1)
         Sort Key: created_date DESC
         Sort Method: quicksort  Memory: 124kB
         Buffers: shared hit=69605
         ->  Bitmap Heap Scan on posts_by_user  (cost=174740.49..275169.19 rows=217 width=405) (actual time=149.292..187.996 rows=193 loops=1)
               Recheck Cond: (created_date > (now() - '1 mon'::interval))
               Filter: (tag_ids @> '{1}'::bigint[])
               Rows Removed by Filter: 37305
               Heap Blocks: exact=35233
               Buffers: shared hit=69605
               ->  Bitmap Index Scan on posts_by_user_pkey  (cost=0.00..174740.44 rows=35513 width=0) (actual time=140.588..140.589 rows=39145 loops=1)
                     Index Cond: (created_date > (now() - '1 mon'::interval))
                     Buffers: shared hit=32824
 Planning Time: 0.118 ms
 Execution Time: 188.340 ms
(17 rows)

Time: 282.833 ms
Enter fullscreen mode Exit fullscreen mode

This is not efficient: reads 32824 pages from the primary key index to filter on the timestamp, but filtering on the tag later Filter: (tag_ids @> '{1}'::bigint[]). This is visible from Rows Removed by Filter: 38714 were read to be discarded later.

With the GIN index on (tag_ids), it was better, but still not optimal:

postgres=> explain (analyze, buffers)
           select *
            from posts_by_user
            where created_date > now() - interval '1 month'
              and tag_ids @>'{1}'
order by created_date desc limit 100;
                                                                     QUERY PLAN

----------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=89306.17..89306.42 rows=100 width=405) (actual time=9.542..9.562 rows=100 loops=1)
   Buffers: shared hit=1530
   ->  Sort  (cost=89306.17..89306.71 rows=217 width=405) (actual time=9.540..9.552 rows=100 loops=1)
         Sort Key: created_date DESC
         Sort Method: quicksort  Memory: 124kB
         Buffers: shared hit=1530
         ->  Bitmap Heap Scan on posts_by_user  (cost=288.80..89297.88 rows=217 width=405) (actual time=2.009..9.478 rows=193 loops=1)
               Recheck Cond: (tag_ids @> '{1}'::bigint[])
               Filter: (created_date > (now() - '1 mon'::interval))
               Rows Removed by Filter: 26807
               Heap Blocks: exact=1521
               Buffers: shared hit=1530
               ->  Bitmap Index Scan on posts_by_user_tag_ids  (cost=0.00..288.75 rows=30500 width=0) (actual time=1.682..1.682 rows=27000 loops=1)
                     Index Cond: (tag_ids @> '{1}'::bigint[])
                     Buffers: shared hit=9
 Planning Time: 0.129 ms
 Execution Time: 9.594 ms
(17 rows)

Time: 103.207 ms
Enter fullscreen mode Exit fullscreen mode

Now, the GIN index is used to access to the specific tags from tag_ids, but the later filtering on timestamp Filter: (created_date > (now() - '1 mon'::interval)) discards Rows Removed by Filter: 26807 rows.

Execution plan with btree_gin

Here is the plan with the btree_gin created using gin (tag_ids, created_date):

postgres=> explain (analyze, buffers)
           select *
            from posts_by_user
            where created_date > now() - interval '1 month'
              and tag_ids @>'{1}'
order by created_date desc limit 100;
                                                                       QUERY PLAN

---------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=917.04..917.29 rows=100 width=405) (actual time=56.695..56.710 rows=100 loops=1)
   Buffers: shared hit=454
   ->  Sort  (cost=917.04..917.58 rows=217 width=405) (actual time=56.694..56.700 rows=100 loops=1)
         Sort Key: created_date DESC
         Sort Method: quicksort  Memory: 124kB
         Buffers: shared hit=454
         ->  Bitmap Heap Scan on posts_by_user  (cost=54.23..908.74 rows=217 width=405) (actual time=56.456..56.624 rows=193 loops=1)
               Recheck Cond: ((tag_ids @> '{1}'::bigint[]) AND (created_date > (now() - '1 mon'::interval)))
               Heap Blocks: exact=182
               Buffers: shared hit=454
               ->  Bitmap Index Scan on posts_by_user_tag_ids_created  (cost=0.00..54.17 rows=217 width=0) (actual time=56.427..56.428 rows=193 loops=1)
                     Index Cond: ((tag_ids @> '{1}'::bigint[]) AND (created_date > (now() - '1 mon'::interval)))
                     Buffers: shared hit=272
 Planning Time: 0.155 ms
 Execution Time: 57.426 ms
(15 rows)

Time: 151.462 ms
Enter fullscreen mode Exit fullscreen mode

This time, all the predicates are index conditions: Index Cond: ((tag_ids @> '{1}'::bigint[]) AND (created_date > (now() - '1 mon'::interval))) and there is no Rows Removed by Filter to remove later.

This is the best usage of GIN index, reading around 454 buffers in 57 milliseconds (when in cache - can be much longer from disk).

However, the solution presented in the previous post, if you can build and query the additional table, is more efficient:

postgres=> explain (analyze,buffers)
           select posts_by_user.*
            from posts_by_user
            join posts_by_tag
            using(user_id, created_date, post_id)
            where posts_by_tag.created_date
                   > now() - interval '1 month'
                  and tag_id =1
            order by created_date desc limit 100
 ;
                                                                           QUERY PLAN

-----------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=1.00..741.19 rows=59 width=405) (actual time=0.039..0.696 rows=100 loops=1)
   Buffers: shared hit=419
   ->  Nested Loop  (cost=1.00..741.19 rows=59 width=405) (actual time=0.038..0.685 rows=100 loops=1)
         Buffers: shared hit=419
         ->  Index Only Scan Backward using posts_by_tag_pkey on posts_by_tag  (cost=0.57..241.75 rows=59 width=24) (actual time=0.021..0.143
rows=100 loops=1)
               Index Cond: ((tag_id = 1) AND (created_date > (now() - '1 mon'::interval)))
               Heap Fetches: 100
               Buffers: shared hit=19
         ->  Index Scan using posts_by_user_user_id_post_id_key on posts_by_user  (cost=0.43..8.45 rows=1 width=405) (actual time=0.005..0.005 rows=1 loops=100)
               Index Cond: ((user_id = posts_by_tag.user_id) AND (post_id = posts_by_tag.post_id))
               Filter: (posts_by_tag.created_date = created_date)
               Buffers: shared hit=400
 Planning Time: 0.261 ms
 Execution Time: 0.731 ms
(14 rows)

Time: 95.173 ms
Enter fullscreen mode Exit fullscreen mode

Even when reading the same number of blocks as the GIN index, because the structure is similar, the simple Index Scan and Index Only Scan are faster than the Bitmap Scans required by GIN on PostgreSQL.

I've written those posts to show that in PostgreSQL, you have many possibilities, not limited to the normalized heap tables and btree indexes. The right choice requires a good understanding of the access patterns, and verification with the execution plan. I'll talk about modern data modeling at next PGDay:


This is not limited to PostgreSQL and should be available in all PostgreSQL-compatible databases. I've run this in RDS Aurora with PostgreSQL 12.7 compatibility. The previous post was on YugabyteDB which has GIN index, without Bitmap Scan, very similar to the table I've build, but fully automated and transparent. The btree_gin extension is not yet there, but soon - GIN features are tracked in issue 7850

Discussion (0)