SQL is powerful, and indexes can provide fast access even when you only partially know the value you are looking for.
For this example, I'll use a list of popular baby names given in the US between 1880 and 2009, collected by Hadley Wickham:
drop table if exists baby_names;
create table baby_names (
primary key (name, year, sex)
,year int , name text, percent float, sex text
);
\! wget -qc "https://github.com/hadley/data-baby-names/blob/master/baby-names.csv?raw=true"
\copy baby_names from 'baby-names.csv?raw=true' with ( skip 1, format csv )
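To verify the load, a quick count helps (on this data set it should return the 258000 rows referenced later):
-- sanity check after the import
select count(*) from baby_names;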
The primary key starts with name, which is HASH-sharded (the default in YugabyteDB is HASH on the first column):
yugabyte=# \d baby_names
Table "public.baby_names"
Column | Type | Collation | Nullable | Default
---------+------------------+-----------+----------+---------
year | integer | | not null |
name | text | | not null |
percent | double precision | | |
sex | text | | not null |
Indexes:
"baby_names_pkey" PRIMARY KEY, lsm (name HASH, year ASC, sex ASC)
A query for one name is fast, using the index:
yugabyte=# explain (analyze, dist, costs off)
select name, year, sex from baby_names
where name = 'Frank' order by year;
QUERY PLAN
--------------------------------------------------------------------------------------------
Index Scan using baby_names_pkey on baby_names (actual time=1.743..1.800 rows=188 loops=1)
Index Cond: (name = 'Frank'::text)
Storage Index Read Requests: 1
Storage Index Execution Time: 2.000 ms
Planning Time: 0.059 ms
Execution Time: 1.834 ms
Storage Read Requests: 1
Storage Write Requests: 0
Storage Execution Time: 2.000 ms
Peak Memory Usage: 8 kB
(10 rows)
However, a common use case is entering only the beginning of the name, for example with name LIKE 'Fran%'.
No index: Seq Scan
Without an index, the whole table is scanned:
yugabyte=# explain (analyze, dist, costs off)
select name, year, sex from baby_names
where name like 'Fran%' order by year;
                                  QUERY PLAN
--------------------------------------------------------------------------------
 Sort (actual time=118.913..118.959 rows=1420 loops=1)
Sort Key: year
Sort Method: quicksort Memory: 146kB
-> Seq Scan on baby_names (actual time=118.334..118.620 rows=1420 loops=1)
Remote Filter: (name ~~ 'Fran%'::text)
Storage Table Read Requests: 1
Storage Table Execution Time: 117.999 ms
Planning Time: 1.266 ms
Execution Time: 119.047 ms
Storage Read Requests: 1
Storage Write Requests: 0
Storage Execution Time: 117.999 ms
Peak Memory Usage: 254 kB
(13 rows)
This is efficient (120 milliseconds), getting many rows (rows=1420) in one call to the storage (Read Requests: 1) thanks to the Remote Filter. Without this optimization (you can set yb_enable_expression_pushdown to false to disable it), it would take 254 read requests to fetch the 258000 rows in batches of 1024 rows (--ysql_prefetch_limit), apply the filter in YSQL (the PostgreSQL backend of YugabyteDB), and discard most of them (Rows Removed by Filter: 256580).
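To see the difference on your own cluster, here is a sketch that disables the pushdown for the session and runs the same query (the request counters will vary):
-- disable the pushdown of the filter to the storage layer
set yb_enable_expression_pushdown = off;
explain (analyze, dist, costs off)
select name, year, sex from baby_names
where name like 'Fran%' order by year;
-- restore the default
set yb_enable_expression_pushdown = on;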
Even though this is fast, the time complexity is O(N) on the number of rows in the table. To be scalable, we want an index access for which the time complexity depends only on the size of the result.
Range index: Index Scan
The like 'Fran%' is a range query: it starts at 'Fran' and stops at 'Frao' (o being the next character after n in en_US.UTF-8 with lc_collate=C). The query planner knows that and can add a ((name >= 'Fran'::text) AND (name < 'Frao'::text)) condition.
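To check the equivalence on this data set, a hedged sketch (both queries should return the same count under a C-like collation):
-- the prefixed LIKE and its derived range condition match the same rows
select count(*) from baby_names where name like 'Fran%';
select count(*) from baby_names where name >= 'Fran' and name < 'Frao';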
Because my primary key index is hashed on name, it cannot be used for a range scan over multiple names. I'm creating a range index for it:
create index baby_names_range_name on baby_names ( name ASC , year ASC );
yugabyte=# explain (analyze, dist, costs off)
select name, year,sex from baby_names
where name like 'Fran%' order by year;
QUERY PLAN
---------------------------------------------------------------------------------------------------------
Sort (actual time=13.473..13.522 rows=1420 loops=1)
Sort Key: year
Sort Method: quicksort Memory: 146kB
-> Index Scan using baby_names_range_name on baby_names (actual time=9.891..13.202 rows=1420 loops=1)
Index Cond: ((name >= 'Fran'::text) AND (name < 'Frao'::text))
Remote Index Filter: (name ~~ 'Fran%'::text)
Storage Index Read Requests: 2
Storage Index Execution Time: 5.000 ms
Storage Table Read Requests: 2
Storage Table Execution Time: 8.000 ms
Planning Time: 2.966 ms
Execution Time: 13.605 ms
Storage Read Requests: 4
Storage Write Requests: 0
Storage Execution Time: 13.000 ms
Peak Memory Usage: 873 kB
(16 rows)
The query is faster even though there are more read requests (for the table and the index) because each request reads only the necessary rows.
Range index: Index Only Scan
If I need only the name and year, which are both in the index, it is even faster, with no need to read the table:
yugabyte=# explain (analyze, dist, costs off)
select name, year from baby_names
where name like 'Fran%' order by year;
QUERY PLAN
--------------------------------------------------------------------------------------------------------------
Sort (actual time=8.182..8.237 rows=1420 loops=1)
Sort Key: year
Sort Method: quicksort Memory: 115kB
-> Index Only Scan using baby_names_range_name on baby_names (actual time=5.287..7.926 rows=1420 loops=1)
Index Cond: ((name >= 'Fran'::text) AND (name < 'Frao'::text))
Remote Filter: (name ~~ 'Fran%'::text)
Heap Fetches: 0
Storage Index Read Requests: 2
Storage Index Execution Time: 8.000 ms
Planning Time: 3.053 ms
Execution Time: 8.315 ms
Storage Read Requests: 2
Storage Write Requests: 0
Storage Execution Time: 8.000 ms
Peak Memory Usage: 222 kB
(15 rows)
Here I have Storage Index Read Requests but no Storage Table Read Requests, thanks to the Index Only Scan. I could get the same even when querying the sex column if it were added to the index as an included column, as sketched below.
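A minimal sketch of such a covering index (the index name is mine):
-- hypothetical covering index: sex is stored in the index entries,
-- so queries on (name, year, sex) can use an Index Only Scan
create index baby_names_range_name_sex on baby_names
 ( name ASC, year ASC ) include ( sex );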
GIN index with Trigrams
The above was a range scan because the prefix of the pattern was known ('Fran%'). If the first letters are unknown, like '%Fran%', we are back to a full table scan:
yugabyte=# explain (analyze, costs off)
select name, year from baby_names
where name like '%Fran%' order by year;
QUERY PLAN
-------------------------------------------------------------------------------
Sort (actual time=119.386..119.429 rows=1420 loops=1)
Sort Key: year
Sort Method: quicksort Memory: 115kB
-> Seq Scan on baby_names (actual time=118.954..119.137 rows=1420 loops=1)
Remote Filter: (name ~~ '%Fran%'::text)
Storage Table Read Requests: 1
Storage Table Execution Time: 117.998 ms
Planning Time: 2.233 ms
Execution Time: 119.996 ms
Storage Read Requests: 1
Storage Write Requests: 0
Storage Execution Time: 117.998 ms
Peak Memory Usage: 234 kB
(13 rows)
For this, the solution is indexing parts of the text with a GIN index. Those parts will be trigrams, and the pg_trgm extension provides the gin_trgm_ops operator class for the GIN index:
yugabyte=# create extension if not exists pg_trgm;
yugabyte=# create index baby_names_trg_name on baby_names
using gin ( name gin_trgm_ops);
yugabyte=# explain (analyze, costs off)
select name, year from baby_names
where name like '%Fran%' order by year;
QUERY PLAN
--------------------------------------------------------------------------------------------------------------
Sort (actual time=11.024..11.075 rows=1126 loops=1)
Sort Key: year
Sort Method: quicksort Memory: 101kB
-> HashAggregate (actual time=10.637..10.803 rows=1126 loops=1)
Group Key: year, name
-> Index Scan using baby_names_trg_name on baby_names (actual time=6.670..10.273 rows=1420 loops=1)
Index Cond: (name ~~ '%Fran%'::text)
Rows Removed by Index Recheck: 70
Planning Time: 0.890 ms
Execution Time: 11.178 ms
Storage Read Requests: 0
Storage Write Requests: 0
Here, 1490 rows (out of the 258000 rows in my table) have been accessed by index, with a few false positives that were removed after the index scan (Rows Removed by Index Recheck: 70) to get the final rows=1420.
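The recheck happens because the index matches trigrams rather than the whole pattern. The show_trgm() function of pg_trgm shows which trigrams are extracted from a value:
-- pg_trgm lowercases the text and pads it before extracting trigrams
select show_trgm('Fran');
-- returns something like {"  f"," fr","an ",fra,ran}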
More about Trigrams
In addition to indexing text by trigrams, pg_trgm comes with interesting functions, and YugabyteDB, being PostgreSQL compatible, supports all of them. Here is an example:
yugabyte=# select distinct name <-> '%Fran%' as distance, name
from baby_names where name like '%Fran%' order by 1;
distance | name
----------+------------
0 | Fran
0.428571 | Franc
0.428571 | Frank
0.428571 | Franz
0.5 | Franco
0.555556 | Frances
0.555556 | Francis
0.555556 | Frankie
0.6 | Francina
0.6 | Franklyn
0.6 | Franklin
0.6 | Francine
0.6 | Francies
0.636364 | Francisca
0.636364 | Francesca
0.636364 | Francisco
0.636364 | Francesco
0.666667 | Francisqui
(18 rows)
Here, not only can I list names that contain a pattern, but I can also order them so that the closest to the pattern are displayed first. For example, if the name you were looking for was 'Francesca' or 'Francisca', you would probably have searched with a closer pattern, like '%Fran%a':
yugabyte=> select distinct name <-> '%Fran%a' as distance, name
from baby_names where name like '%Fran%a' order by 1;
distance | name
----------+-----------
0.666667 | Francina
0.692308 | Francesca
0.692308 | Francisca
(3 rows)
Ordering on the similarity distance with a limit can suggest the most probable values. If the user doesn't find the name, she can add more to the pattern.
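A sketch of such a suggestion query (the pattern 'Francis' stands for whatever the user typed; the LIMIT caps the suggestion list):
-- suggest the 5 names closest to the user input
select distinct name, name <-> 'Francis' as distance
from baby_names
order by distance
limit 5;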
What about CockroachDB?
This post was inspired by a user moving from CockroachDB to YugabyteDB. He was creating the index without mentioning the range sharding (the ASC) because CockroachDB doesn't have the equivalent of YugabyteDB hash sharding and defaults to ASC. In YugabyteDB, where the default is HASH on the first column, the index scan for a prefixed LIKE pattern was not used.
With explicit range sharding (and it makes sense to explicitly define ASC or DESC, as the performance of ORDER BY may differ on a backward scan), the index scan is possible, as in the demo above. The same features as PostgreSQL are available in YugabyteDB, with horizontal scalability in addition.
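For instance, if most queries ordered by year descending, a hypothetical variant of the demo index could match that direction:
-- hypothetical variant: DESC on year to serve ORDER BY year DESC as a forward scan
create index baby_names_range_name_desc on baby_names ( name ASC, year DESC );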
Note that I tried to run the same example on CockroachDB but I was quickly limited by the lack of PostgreSQL compatibility:
root@localhost:26257/defaultdb> select version();
version
--------------------------------------------------------------------------------------
CockroachDB CCL v22.2.8 (x86_64-pc-linux-gnu, built 2023/04/17 13:22:08, go1.19.6)
(1 row)
# loading from a CSV file
root@localhost:26257/defaultdb> \copy baby_names from 'baby-names.csv?raw=true' with ( skip 1, format csv );
ERROR: at or near "baby-names.csv?raw=true": syntax error: unimplemented: this syntax
SQLSTATE: 0A000
DETAIL: source SQL:
copy baby_names from 'baby-names.csv?raw=true' with ( skip 1, format csv )
^
HINT: You have attempted to use a feature that is not yet implemented.
# creating a GIN index
root@localhost:26257/defaultdb> create index baby_names_trg_name on baby_names using gin ( name gin_trgm_ops);
ERROR: unimplemented: primary key column name cannot be present in an inverted index
SQLSTATE: 0A000
HINT: You have attempted to use a feature that is not yet implemented.
# using PG_TRGM functions
root@localhost:26257/defaultdb> select distinct name <-> '%Fran%' as distance, name from baby_names where name like '%Fran%' order by 1;
invalid syntax: statement ignored: at or near "-": syntax error
SQLSTATE: 42601
DETAIL: source SQL:
select distinct name <-> '%Fran%' as distance, name from baby_names where name like '%Fran%' order by 1
^
HINT: try \h SELECT
The only thing that works like PostgreSQL is the first example: a prefixed LIKE condition on a regular index. I didn't have to create the index, as the primary key is already range sharded (I could have done the same in YugabyteDB by creating the table with primary key ( name asc, year, sex ), as sketched below).
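A minimal sketch of that YugabyteDB variant:
-- same table with a range-sharded primary key instead of the HASH default
create table baby_names (
 year int, name text, percent float, sex text,
 primary key (name asc, year, sex)
);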
YugabyteDB automatic sharding
All those are PostgreSQL commands and work the same in YugabyteDB. The only thing to take care of is the sharding method, with the PostgreSQL-compatible syntax ASC or DESC to define range sharding when you know you will have range queries. Do not worry about which value to split on: auto-splitting will do its job when the table grows. In the absence of range query access patterns, it is still better to use HASH sharding, as it directly distributes the rows. The hash function (you can see it with yb_hash_code()) returns a value from 0x0000 to 0xFFFF, so that a table can start with multiple shards (ranges of hash values) and can be split further.
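A quick way to see it, using the yb_hash_code() function mentioned above:
-- hash value (0 to 65535) that determines the shard of each row
select name, yb_hash_code(name) from baby_names limit 5;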
If you don't know all your access patterns, looking at the datatype can help. A UUID or a number generated from a sequence is probably a good candidate for HASH. Timestamps, numbers with arithmetic semantics, and alphabetical text will probably be queried by ranges and ordered. In the example above, name and year are in this category.