DEV Community

UUID in PostgreSQL

Franck Pachot on January 27, 2024

This month's PGSQL Phriday #015 discusses using UUID in PostgreSQL. I have previously written on this topic (UUID or cached sequences?) , and I won...

Read full post

Rafael Ponte • Jan 29 '24

Thanks for sharing it, Franck. Excellent article 👏🏻

I'm not a database specialist like you, but as an application developer, I try to take the best out of the databases I work with as much as possible. I've been studying this theme since I wrote an article about it (in pt_BR 🇧🇷). So, I could sum up the "UUID as a PK" issue with RDBMS as follows:

The more random the key is, the worse it is for the database performance, especially with MySQL and MSSQL Server that use clustered indexes by default (that's why UUIDv4 might be terrible);
If your dataset fits into memory (buffer cache and OS cache), I would say it does not matter if you're using UUIDv4 or a sequential key;
I think this issue we'll be in the past as soon as UUIDv7 becomes a standard;

Well, although I still favor and recommend using sequential keys (int/bigint) over UUID as PKs, I also understand it will not be a big issue for the majority of microservices out there since their data and workload were partitioned already 😊

Mateusz Koteja • Jan 28 '24

Even more interesting is using uuidv7 (draft RFC) generated in the application. The best thing is that it is incremental unlike uuidv4 you use typically. Uuidv7 is compatible with v4, that is postgres can store it as uuid type, and you can sort by it easily.

Brian Lim • Jan 29 '24

One other consideration that goes all the way to the frontend is Relay and GraphQL. In particular, Relay expects a globally unique identifier graphql.org/learn/global-object-id... . Of course generally the object type (table) is automatically appended to the ID so there's no need to specifically choose UUID. But supposedly you can use both in the same table (have an autoincrementing sequence plus another column with UUID and a non-clustered index). That is probably the best of both worlds.

Rafael Ponte • Jan 29 '24

@franckpachot, could you talk a little bit more about those 2 statements? Are they related? I got a bit confused by them.

It's also important to keep in mind that certain database limitations, such as PostgreSQL declarative partitioning or Citus sharding, may limit the possibilities of enforcing uniqueness through global indexes.

And this one:

While a UUID provides a high probability of uniqueness, a sequence strictly guaranties it, so if you cannot create all the unique constraints, a sequence may be the safer choice.

KRowin • Jan 28 '24

As ignorant end user my take away is: that’s awesome, I can request from vendor to integrate it and make available into our postgresql based system as in our line of work we need UUID in virtually every business correspondence with government agencies.

Franck Pachot AWS Heroes • Jan 28 '24

Yes, it makes sense to have a UUID when it is visible to the user and I suppose it must be random (not disclosing the insert time). Just know that B-Tree on them will be larger and see random writes so it is important that they fit in memory cache