UUIDs are very popular among developers as the datatype for table identifiers. Database administrators, on the other hand, despise UUIDs, as they are not very database friendly: most UUID types are random, meaning the values are scattered all over the key space. As a result, the DBMS has to put in extra effort to generate, process, store, and retrieve them.
In my opinion, the use of UUIDs should generally be confined to cases where "universally unique" identifiers are needed. For an identifier that is unique within a table or a certain domain, a database sequence number is usually fine.
There are certain cases where this general statement is debatable. See the Discussion section at the end of this article.
Also, there are versions of UUID that are time-sorted, and are preferred to random ones.
In this post, we will look at a few issues with indexing UUID columns. While PostgreSQL is used throughout, I imagine other DBMSs have similar issues with UUIDs. We also use some "sorted" versions of UUIDs and compare them with the random ones.
TL;DR (from the summary section):
- The index on a randomly sorted UUID column is more than 20% larger.
- The time required to insert randomly generated UUID values is more than twice the time required to insert ordered UUID values, when the column has an index. In particular, the index creation time is 3-4 times slower.
Setup
The database schema consists of two tables with identical schemas to store random and ordered UUIDs, respectively:
create table t_rnd(id uuid primary key);
create table t_ord(id uuid primary key);
We will also use temporary tables to store intermediate data quickly. Note that these tables have no primary keys:
create temp table tmp_rnd(id uuid);
create temp table tmp_ord(id uuid);
I'll use psql as my database client, and especially its meta-commands to check timing (\timing) and the size of tables and indexes (\dt+ and \di+).
The PostgreSQL server is version 16, which supports digit grouping in numeric literals. This makes the code more readable: 10 million is written as 10_000_000.
All timings are reported while running both psql and the PostgreSQL server on a MacBook M1 Pro.
Generating and storing random UUIDs
Since version 13, PostgreSQL has shipped with the function gen_random_uuid(), which returns a version 4 (random) UUID.
Next, we generate 10 million UUIDv4 values and store them in the t_rnd table. Timings are reported because I enabled \timing in psql:
insert into t_rnd
select gen_random_uuid()
from generate_series(1, 10_000_000);
This took around 56 seconds.
Look at the table and index information:
postgres=# \dt+ t_rnd
List of relations
┌─[ RECORD 1 ]──┬───────────┐
│ Schema │ public │
│ Name │ t_rnd │
│ Type │ table │
│ Owner │ postgres │
│ Persistence │ permanent │
│ Access method │ heap │
│ Size │ 422 MB │
│ Description │ │
└───────────────┴───────────┘
postgres=# \di+ t_rnd_pkey
List of relations
┌─[ RECORD 1 ]──┬────────────┐
│ Schema │ public │
│ Name │ t_rnd_pkey │
│ Type │ index │
│ Owner │ postgres │
│ Table │ t_rnd │
│ Persistence │ permanent │
│ Access method │ btree │
│ Size │ 383 MB │
│ Description │ │
└───────────────┴────────────┘
Especially note the index size: 383 MB.
Now let's turn our attention to an ordered UUID case.
Generating and storing ordered UUIDs
There are several proposals for ordered UUIDs, such as UUID version 7 (UUIDv7). But since PostgreSQL does not yet support them natively, let's just "cast" consecutive integers to UUIDs. We will later use an extension to generate UUIDv7s for us.
To cast an integer i to a UUID, we can use the following expression:
lpad(to_hex(i), 32, '0')::uuid
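As a cross-check, the same zero-padding trick can be sketched in Python with the standard uuid module (this mirrors the SQL expression above; it is an illustration, not part of the benchmark):

```python
import uuid

def int_to_uuid(i: int) -> uuid.UUID:
    # Mirror lpad(to_hex(i), 32, '0')::uuid: left-pad the hex digits to 32 characters.
    return uuid.UUID(hex=format(i, "x").zfill(32))

print(int_to_uuid(10))  # 00000000-0000-0000-0000-00000000000a
```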
Next, we generate 10 million consecutive UUIDs and store them in the t_ord table:
insert into t_ord
select lpad(to_hex(i), 32, '0')::uuid
from generate_series(1, 10_000_000) as i;
This took around 24 seconds (as opposed to 56 seconds with random UUIDs).
Look at the table and index information:
postgres=# \dt+ t_ord
List of relations
┌─[ RECORD 1 ]──┬───────────┐
│ Schema │ public │
│ Name │ t_ord │
│ Type │ table │
│ Owner │ postgres │
│ Persistence │ permanent │
│ Access method │ heap │
│ Size │ 422 MB │
│ Description │ │
└───────────────┴───────────┘
postgres=# \di+ t_ord_pkey
List of relations
┌─[ RECORD 1 ]──┬────────────┐
│ Schema │ public │
│ Name │ t_ord_pkey │
│ Type │ index │
│ Owner │ postgres │
│ Table │ t_ord │
│ Persistence │ permanent │
│ Access method │ btree │
│ Size │ 301 MB │
│ Description │ │
└───────────────┴────────────┘
The index size is 301 MB, which is more than a 20% improvement over the random UUID index.
Was the timing comparison fair?
The generation and storage of UUIDs took:
- 56 seconds for random UUIDs
- 24 seconds for ordered UUIDs
But in this post, we are mostly interested in the storage time. So let's generate the UUIDs into the temp tables first, and then measure the time it takes to store them in the main tables. To be completely fair, we recreate the tables:
drop table t_rnd, t_ord;
create table t_rnd(id uuid primary key);
create table t_ord(id uuid primary key);
Next, generate and insert random UUIDs into the temp table:
insert into tmp_rnd
select gen_random_uuid()
from generate_series(1, 10_000_000);
Guess what? It took less than 7 seconds.
insert into t_rnd
select *
from tmp_rnd;
This took around 51 seconds. Table and index sizes of t_rnd are the same as before.
Now let's do the same with ordered UUIDs:
insert into tmp_ord
select lpad(to_hex(i), 32, '0')::uuid
from generate_series(1, 10_000_000) as i;
This also took less than 7 seconds.
insert into t_ord
select *
from tmp_ord;
This took around 21 seconds. Table and index sizes of t_ord are the same as before.
Before we wrap up this section, let's check what happens if there is no index to begin with:
create table t_rnd_no_pk(id uuid);
create table t_ord_no_pk(id uuid);
insert into t_rnd_no_pk select * from tmp_rnd;
insert into t_ord_no_pk select * from tmp_ord;
Both took around 10 seconds! So most of the slowdown comes from building the index. The following table summarizes everything we discussed so far:
| | Random UUID | Ordered UUID |
|---|---|---|
| Time to generate & save to temp table (s) | 7 | 7 |
| Time to save to table without PK (s) | 10 | 10 |
| Time to save to table with PK (s) | 51 | 21 |
| Table size (MB) | 422 | 422 |
| Index size (MB) | 383 | 301 |
Side note: The 3-second difference (10 - 7) is spent writing the data to the WAL, which is not required for temp tables.
It follows that index creation takes 51 - 10 = 41 seconds with random UUIDs, and 21 - 10 = 11 seconds with ordered UUIDs. That is an almost 4x speedup! But why?
B-Tree indexes
The reason why random UUID indexes are bigger and slower to build is explained in Vlad Mihalcea's excellent blog, which I quote here verbatim:
Indexing random values using B+Tree causes a lot of problems:
- Index pages will have a very low fill factor because the values come randomly. So, a page of 8kB will end up storing just a few elements, therefore wasting a lot of space, both on the disk and in the database memory, as index pages could be cached in the Buffer Pool.
- Because the B+Tree index needs to rebalance itself in order to maintain its equidistant tree structure, the random key values will cause more index page splits and merges as there is no predetermined order of filling the tree structure.
The low "fill factor" is why the random UUID indexes are larger, and the extra index page splits and merges are why they are slower to build.
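To see why random keys hurt, here is a toy Python simulation of B-Tree leaf pages (a sketch, not PostgreSQL's actual algorithm): pages split 50/50 on overflow, except the rightmost page, which, like in PostgreSQL, is split at the fill factor when keys arrive in ascending order. The page capacity is an illustrative made-up value.

```python
import random
from bisect import bisect_right, insort

PAGE_CAP = 100    # entries per leaf page (illustrative; real 8 kB pages hold more)
FILLFACTOR = 0.9  # PostgreSQL's default B-Tree leaf fill factor

def leaf_page_count(keys):
    pages = [[]]  # leaf pages, each a sorted list of keys, ordered by key range
    for k in keys:
        firsts = [p[0] for p in pages if p]      # lowest key of each page
        i = max(bisect_right(firsts, k) - 1, 0)  # page whose range covers k
        insort(pages[i], k)
        if len(pages[i]) > PAGE_CAP:             # page overflow: split it
            if i == len(pages) - 1 and pages[i][-1] == k:
                split = int(PAGE_CAP * FILLFACTOR)  # rightmost split honors fillfactor
            else:
                split = len(pages[i]) // 2          # ordinary 50/50 split
            pages[i:i + 1] = [pages[i][:split], pages[i][split:]]
    return len(pages)

n = 20_000
seq = list(range(n))
rnd = seq[:]
random.shuffle(rnd)
seq_pages = leaf_page_count(seq)
rnd_pages = leaf_page_count(rnd)
print("ordered pages:", seq_pages, "random pages:", rnd_pages)
```

With ordered keys, only the rightmost page ever splits, so pages end up about 90% full; with random keys, the 50/50 splits leave pages roughly 70% full on average, so the same data needs noticeably more pages. This matches the leaf densities we measure below.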
Fill factor
To showcase the fill factor issue, simply install the pgstattuple extension, and then query both indexes:
create extension if not exists pgstattuple;
select r.avg_leaf_density as rnd_ff,
       o.avg_leaf_density as ord_ff,
r.leaf_fragmentation as rnd_frag,
o.leaf_fragmentation as ord_frag
from pgstatindex('t_rnd_pkey') as r,
pgstatindex('t_ord_pkey') as o;
Result:
┌────────┬────────┬──────────┬──────────┐
│ rnd_ff │ ord_ff │ rnd_frag │ ord_frag │
├────────┼────────┼──────────┼──────────┤
│ 71.19 │ 90.04 │ 49.77 │ 0 │
└────────┴────────┴──────────┴──────────┘
The random index has a 20% lower fill factor, and almost 50% fragmentation [1].
👉 The default PostgreSQL B-Tree fill factor is 90%, which is achieved by the ordered index.
Side note: If you reindex the t_rnd_pkey index, it will also achieve the 90% fill factor, as all the data required to build the index is readily available. You can also try creating a new index with create index t_rnd_idx on t_rnd(id); in my case, it took only 6 seconds and achieved the 90% fill factor without fragmentation. Also, try the following to achieve an even higher fill factor, though higher is not necessarily better: create index t_rnd_idx_2 on t_rnd(id) with (fillfactor = 100);
Index page splits and merges
It's rather hard to demonstrate B-Tree node splits and merges, as PostgreSQL does not readily make these stats available. However, we can track the number of index pages over time, after each insert into the table. The following code inserts 100K random and ordered UUIDs into separate tables and keeps track of the leaf_pages of each index in another table. It then exports the tracked data to /tmp/data.csv for further analysis:
create table demo_rnd(id uuid primary key);
create table demo_ord(id uuid primary key);
create table demo_idx as
select
0 as i,
r.leaf_pages as rnd,
o.leaf_pages as ord
from
pgstatindex ('demo_rnd_pkey') r,
pgstatindex ('demo_ord_pkey') o;
do $do$
begin
for i in 1..100_000 loop
insert into demo_rnd
select gen_random_uuid();
insert into demo_ord
select lpad(to_hex(i), 32, '0')::uuid;
insert into demo_idx
select
i,
r.leaf_pages as rnd,
o.leaf_pages as ord
from
pgstatindex ('demo_rnd_pkey') r,
pgstatindex ('demo_ord_pkey') o;
commit;
end loop;
end
$do$;
create index on demo_idx (i);
\copy (select * from demo_idx order by i) to /tmp/data.csv csv header;
The following Python 3 code plots the number of leaf_pages after each insert for both random and ordered UUIDs:
import pandas as pd
import matplotlib.pyplot as plt
# Read the CSV file
data = pd.read_csv('/tmp/data.csv')
# Plot the data
plt.figure(figsize=(10, 6))
plt.plot(data['i'], data['rnd'], label='rnd')
plt.plot(data['i'], data['ord'], label='ord')
plt.xlabel('i')
plt.ylabel('leaf_pages')
plt.title('Plot of rnd and ord leaf_pages against i')
plt.legend()
plt.grid(True)
plt.show()
Here's a zoomed-in version of the above plot:
Since we only inserted into the tables, no node merges happened in the B-Tree. However, we can easily see node splits by observing the increase in index page count: the random UUID index (blue graph) splits into many pages at an accelerating pace and then slowly fills them in, whereas the ordered UUID index (orange line) splits pages and fills them steadily. The slope of the orange line corresponds to the 90% fill factor.
UUIDv7
To generate UUIDv7 values in PostgreSQL, we can use the third-party extension pg_uuidv7 [2]. The x86_64 Linux module is pre-compiled and released, but for macOS I just ran make && make install, and then installed the extension by running:
create extension if not exists pg_uuidv7;
postgres=# select uuid_generate_v7();
┌──────────────────────────────────────┐
│ uuid_generate_v7 │
├──────────────────────────────────────┤
│ 01908e8d-a5f7-74cd-8da5-ca7a4b6580fc │
└──────────────────────────────────────┘
The timings are quite similar to those of the ordered UUIDs we concocted before:
create temp table tmp_7(id uuid);
create table t_7_no_pk(id uuid);
create table t_7(id uuid primary key);
insert into tmp_7
select uuid_generate_v7()
from generate_series(1, 10_000_000);
-- Time: 7132.475 ms (00:07.132)
insert into t_7_no_pk select * from tmp_7;
-- Time: 9651.792 ms (00:09.652)
insert into t_7 select * from tmp_7;
-- Time: 24832.326 ms (00:24.832)
The table size is identical to before (422 MB), but the index is very large (402 MB), even larger than the random UUID index! Looking at the stats, we see a 67% average leaf density:
postgres=# select * from pgstatindex('t_7_pkey');
┌─[ RECORD 1 ]───────┬───────────┐
│ version │ 4 │
│ tree_level │ 2 │
│ index_size │ 421830656 │
│ root_block_no │ 295 │
│ internal_pages │ 252 │
│ leaf_pages │ 51240 │
│ empty_pages │ 0 │
│ deleted_pages │ 0 │
│ avg_leaf_density │ 67.42 │
│ leaf_fragmentation │ 44.06 │
└────────────────────┴───────────┘
This came as a surprise to me: the index is built faster than the random UUID one, yet it is even larger! Looking deeper, I found out that the extension does NOT generate time-sorted UUIDs: values created within the same millisecond share a timestamp prefix but are otherwise in random order (see also [3]):
select i, uuid_generate_v7()
from generate_series(1, 10) as i
order by 2;
┌────┬──────────────────────────────────────┐
│ i │ uuid_generate_v7 │
├────┼──────────────────────────────────────┤
│ 4 │ 01908eee-c083-7029-9ad0-9d12ba763d90 │
│ 10 │ 01908eee-c083-715d-a88e-e36973ae964f │
│ 1 │ 01908eee-c083-7357-aaa8-ba06f7232f2d │
│ 6 │ 01908eee-c083-757f-b47d-40156da5f2f7 │
│ 9 │ 01908eee-c083-7595-8a37-9024f862217d │
│ 7 │ 01908eee-c083-75f1-b9f1-b031f29263c5 │
│ 8 │ 01908eee-c083-7607-9003-d001115fb89b │
│ 5 │ 01908eee-c083-7a43-b726-e1eca4f15698 │
│ 3 │ 01908eee-c083-7ea6-a572-1b6be63c8b7a │
│ 2 │ 01908eee-c083-7fa0-b565-6e91d905fb1d │
└────┴──────────────────────────────────────┘
Other UUIDv7 implementations that I've seen, such as the Java implementation uuid-creator [4], guarantee time-sorted output by using a (randomly initialized) counter.
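The counter approach can be sketched in Python. This is a simplified illustration of the "fixed-length dedicated counter" method described in RFC 9562, not the actual uuid-creator code, and counter overflow handling is omitted:

```python
import os
import threading
import time

_lock = threading.Lock()
_last_ms = -1
_counter = 0

def uuid7_monotonic() -> str:
    """UUIDv7 sketch: 48-bit millisecond timestamp, then a 12-bit counter in the
    rand_a field, so values generated within the same millisecond still sort in
    creation order. A real implementation must handle counter overflow."""
    global _last_ms, _counter
    with _lock:
        ms = time.time_ns() // 1_000_000
        if ms == _last_ms:
            _counter += 1                                    # same ms: increment
        else:
            _last_ms = ms
            _counter = int.from_bytes(os.urandom(1), "big")  # randomly initialized
        value = (ms & (2**48 - 1)) << 80                     # unix_ts_ms
        value |= 0x7 << 76                                   # version 7
        value |= (_counter & 0xFFF) << 64                    # counter in rand_a
        value |= 0b10 << 62                                  # RFC 4122 variant
        value |= int.from_bytes(os.urandom(8), "big") & (2**62 - 1)  # random tail
        h = f"{value:032x}"
        return f"{h[:8]}-{h[8:12]}-{h[12:16]}-{h[16:20]}-{h[20:]}"

ids = [uuid7_monotonic() for _ in range(5)]
print("\n".join(ids))  # strictly increasing, even within one millisecond
```

Because the counter strictly increases within a millisecond and the timestamp increases across milliseconds, the generated values sort in creation order, which is exactly the property pg_uuidv7 lacked above.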
To simulate a time-sorted sequence, let's cluster the tmp_7 table on id first. Here's the full SQL code:
drop table if exists tmp_7, t_7_no_pk, t_7;
create temp table tmp_7(id uuid);
create table t_7_no_pk(id uuid);
create table t_7(id uuid primary key);
insert into tmp_7
select uuid_generate_v7()
from generate_series(1, 10_000_000);
-- Time: 6945.381 ms (00:06.945)
--------------------------------------------
-- Clustering tmp_7 ensures proper ordering
--------------------------------------------
create index on tmp_7(id);
-- Time: 3633.888 ms (00:03.634)
cluster tmp_7 using tmp_7_id_idx;
-- Time: 4858.199 ms (00:04.858)
insert into t_7_no_pk select * from tmp_7;
-- Time: 9012.780 ms (00:09.013)
insert into t_7 select * from tmp_7;
-- Time: 27749.267 ms (00:27.749)
We can now see that the index size is 301 MB, as expected.
Summary
In this post, we established two important facts about indexing UUID columns in PostgreSQL:
- The index on a randomly sorted UUID column is more than 20% larger.
- The time required to insert randomly generated UUID values is more than twice the time required to insert ordered UUID values, when the column has an index. In particular, the index creation time is 3-4 times slower.
Discussion
In the beginning of this article, I stated a personal opinion:
In my opinion, use of UUIDs must generally be confined to the cases where "universally unique" identifiers are needed. For an identifier that is unique within a table or a certain domain, a database sequence number would usually be OK.
Some of my colleagues reached out and pointed out some counter-arguments:
- Sequential identifiers are not an option when they are exposed to the outside and the business rules require them to be "non-guessable".
- There is a certain value in the ability to assign an ID to an entity BEFORE going to the database. You can batch-generate them and use any algorithm you like (snowflake, ULID, non-standard UUIDs, etc.). This works great with non-relational databases (some of which don't support auto-increment IDs at all).
- IDs are easier to correlate in the context of a mid/large organization. Auto-increment IDs will always coincide between different entities (i.e., you can have an order with ID = 10, as well as a trade, a booking, and a reservation with the same ID). Having UUIDs protects you from accidentally looking at the wrong entity.
These are all valid arguments, but there are sometimes workarounds that you might be interested in:
- Regarding batch sequence generation: some DB sequences can be batch-generated too [5], but one DB query is still necessary to fetch all those values.
- Regarding the distinctness from all other entities: we can use the numbers internally but convert them to a unique string format when communicating with external systems. For instance, the trade ID is 10 internally, but whenever we share it in events, REST responses, or logs, we use Trade-0000000010 as the ID. Also, whenever we receive this string format in a request, we map it back to the numeric value.
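That mapping can be sketched in Python (the helper names and the fixed-width, zero-padded format are my own illustration, not from the article):

```python
def to_external(entity: str, internal_id: int, width: int = 10) -> str:
    # Internal numeric ID -> external string ID, e.g. 10 -> "Trade-0000000010"
    return f"{entity}-{internal_id:0{width}d}"

def to_internal(external_id: str, entity: str) -> int:
    # External string ID -> internal numeric ID; rejects IDs of other entity types
    prefix = entity + "-"
    if not external_id.startswith(prefix):
        raise ValueError(f"not a {entity} ID: {external_id!r}")
    return int(external_id[len(prefix):])

print(to_external("Trade", 10))                  # Trade-0000000010
print(to_internal("Trade-0000000010", "Trade"))  # 10
```

The entity prefix gives the same "wrong entity" protection as a UUID: passing an order ID where a trade ID is expected fails loudly instead of silently matching.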
Footnotes
[1] leaf_fragmentation is the percentage of leaf pages where the following leaf page has a lower block number, and it might not be a good indication of anything. See Laurenz Albe's answer.
[2] UUIDv7 extension for PostgreSQL
[3] Comment on UUIDv7 extension
[4] uuid-creator