This month's PGSQL Phriday #015 discusses using UUID in PostgreSQL. I have previously written on this topic (UUID or cached sequences?) , and I won...
For further actions, you may consider blocking this person and/or reporting abuse
Thanks for sharing it, Franck. Excellent article 👏🏻
I'm not a database specialist like you, but as an application developer, I try to take the best out of the databases I work with as much as possible. I've been studying this theme since I wrote an article about it (in pt_BR 🇧🇷). So, I could sum up the "UUID as a PK" issue with RDBMS as follows:
The more random the key is, the worse it is for the database performance, especially with MySQL and MSSQL Server that use clustered indexes by default (that's why UUIDv4 might be terrible);
If your dataset fits into memory (buffer cache and OS cache), I would say it does not matter if you're using UUIDv4 or a sequential key;
I think this issue we'll be in the past as soon as UUIDv7 becomes a standard;
Well, although I still favor and recommend using sequential keys (int/bigint) over UUID as PKs, I also understand it will not be a big issue for the majority of microservices out there since their data and workload were partitioned already 😊
One other consideration that goes all the way to the frontend is Relay and GraphQL. In particular, Relay expects a globally unique identifier graphql.org/learn/global-object-id... . Of course generally the object type (table) is automatically appended to the ID so there's no need to specifically choose UUID. But supposedly you can use both in the same table (have an autoincrementing sequence plus another column with UUID and a non-clustered index). That is probably the best of both worlds.
Even more interesting is using uuidv7 (draft RFC) generated in the application. The best thing is that it is incremental unlike uuidv4 you use typically. Uuidv7 is compatible with v4, that is postgres can store it as uuid type, and you can sort by it easily.
@franckpachot, could you talk a little bit more about those 2 statements? Are they related? I got a bit confused by them.
And this one:
As ignorant end user my take away is: that’s awesome, I can request from vendor to integrate it and make available into our postgresql based system as in our line of work we need UUID in virtually every business correspondence with government agencies.
Yes, it makes sense to have a UUID when it is visible to the user and I suppose it must be random (not disclosing the insert time). Just know that B-Tree on them will be larger and see random writes so it is important that they fit in memory cache