DEV Community

Cover image for Shrink UUIDs with PostgreSQL or Ruby
Sven Schwyn
Sven Schwyn

Posted on • Updated on

Shrink UUIDs with PostgreSQL or Ruby

For a pet project of mine, I'm experimenting with alternatives to the classic SERIAL primary keys which are simple, local and performant. However, you don't want to expose them in URLs because this does away with one layer of protection against forced browsing.

Weak serial booking codes were also at the root of the white hat hack of airline passenger details back in 2016.

Let me share my findings with you, but note: This is not a one-size-fits-all thing, you have to pick your primary keys according to your specific needs.

UUIDv7

An interesting alternative to SERIAL are UUIDv7. They are less performant and more memory hungry given the length, but unless you're dealing with a really huge number of records, the advantages might outweigh these downsides: They are time ordered and therefore INDEX-friendly (unlike fully random UUIDv4), yet their entropy is way large enough for external use and to safely expose them in URLs. (Even more so if some rate limiting is in place to slow down brute force attacks.)

However, UUIDs are the IBAN of IDs, a whopping 36 characters in length and therefore not exactly a healthy diet for your URLs. Say, you have unsafe restful routes with URLs looking as follows:

# Unnested route
https://example.org/users/533

# Nested route
https://example.org/users/533/transactions/743
Enter fullscreen mode Exit fullscreen mode

Now make these routes safe by using UUIDv7 as primary key:

# Unnested route
https://example.org/users/26d45f0c-1fa4-4875-aac7-01a5ec424aac

# Nested route
https://example.org/users/26d45f0c-1fa4-4875-aac7-01a5ec424aac/transactions/b436b1f2-711c-4a50-bb44-4e41acb7e5d6
Enter fullscreen mode Exit fullscreen mode

Boom! Even unnested routes are no longer handy. Is it really necessary for those UUIDs to take up so much space? Not really.

Shrink them UUIDs

It's quite easy to reduce the length of UUIDs by almost 40% from 36 to 22 characters without loosing any data or entropy. Here's how to do it in Ruby:

require 'base64'

def uuid_encode(uuid)
  [uuid.gsub('-', '')].
    pack('H*').
    bytes.
    pack('C*').
    then { Base64.urlsafe_encode64(_1) }.
    slice(0, 22)
end

def uuid_decode(encoded_uuid)
  Base64.urlsafe_decode64(encoded_uuid).
    bytes.
    pack('C*').
    unpack1('H*').
    unpack('a8a4a4a4a12').join('-')
end
Enter fullscreen mode Exit fullscreen mode

(I'm looking forward to Ruby 3.4 which will allow to rewrite the above as then { Base64.urlsafe_encode64(it) }, neat!)

Let's quickly do some roundtrips to check whether this really works:

require 'securerandom'

1000000.times do
  uuid = SecureRandom.uuid
  fail 'mismatch' unless uuid == uuid_decode(uuid_encode(uuid))
end
Enter fullscreen mode Exit fullscreen mode

The resulting URLs are quite a bit shorter now:

# Unnested route
https://example.org/users/JtRfDB-kSHWqxwGl7EJKrA

# Nested route
https://example.org/users/JtRfDB-kSHWqxwGl7EJKrA/transactions/tDax8nEcSlC7RE5BrLfl1g
Enter fullscreen mode Exit fullscreen mode

They still don't win a beauty contest, but if that is what you're after, you'd better use slugs where it's safely possible.

PostgreSQL – take over

You might not want to encode and decode UUIDs on the application layer but do this on the database itself:

CREATE OR REPLACE FUNCTION uuid_encode(uuid uuid) RETURNS text AS $$
  SELECT translate(
    encode(
      decode(
        replace(
          uuid::text,
          '-', ''
        ),
        'hex'
      ),
      'base64'
    ),
    '+/=', '-_'
  );
$$ LANGUAGE sql;

CREATE OR REPLACE FUNCTION uuid_decode(encoded_uuid text) RETURNS uuid AS $$
  SELECT regexp_replace(
    encode(
      decode(
        translate(
          encoded_uuid,
          '-_', '+/'
        ) || '==',
        'base64'
      ),
      'hex'
    ),
    '(.{8})(.{4})(.{4})(.{4})(.{12})', '\1-\2-\3-\4-\5'
  )::uuid
$$ LANGUAGE sql;
Enter fullscreen mode Exit fullscreen mode

Again, let's check whether this works:

DO
$do$
DECLARE
  uuid uuid;
BEGIN
 FOR i IN 1..1000000 LOOP
   uuid := gen_random_uuid();
   IF uuid_decode(uuid_encode(uuid)) != uuid THEN
     RAISE EXCEPTION 'mismatch';
   END IF;
 END LOOP;
END
$do$;
Enter fullscreen mode Exit fullscreen mode

Unfortunately, as of PostgreSQL 16, UUIDv7 are not yet supported out of the box. For the time being, use an extension such as pg_uuidv7 or pg_idkit to generate UUIDv7 e.g. as default primary key when you CREATE new records.

More chit chat on Mastodon

(Photo by Magda Ehlers)

Top comments (3)

Collapse
 
svoop profile image
Sven Schwyn • Edited

There's a discussion going on to make UUIDv7 (or similar) part of the PostgreSQL core itself. Depending on when the formal standard is finalized, this feature might even make it to PostgreSQL 17 later this year. πŸŽ‰

Collapse
 
svoop profile image
Sven Schwyn

Here's a little followup: In case you're on Mac or Linux and would like to experiment with UUIDv7 locally, you can use my Homebrew formula to install the pg_uuidv7 extension it on your machine.

Collapse
 
mcadariu profile image
Mircea Cadariu • Edited

Nice post. Could consider adding that you can also generate UUIDv7 directly in SQL, see: gist.github.com/kjmph/5bd772b2c2df...