Siva Sankaran

Posted on May 3

UUID vs UUIDv7 vs Snowflake ID: Choosing the Right Identifier for Backend Systems

#database #systemdesign #distributedsystems #interview

ID generation looks like a small backend decision.

In many systems, we simply add an id column, make it the primary key, and move on. But once the table grows, this decision can affect database performance, indexing, pagination, debugging, and how easily the system scales across services.

The common choices are:

UUIDv4
UUIDv7
Snowflake ID

Each one solves the uniqueness problem, but they behave differently in real backend systems.

UUIDv4: Simple, Random, and Widely Used

UUIDv4 is one of the most commonly used ID formats in backend applications.

A UUIDv4 looks like this:

550e8400-e29b-41d4-a716-446655440000

It is mostly random: 122 of its 128 bits are random, and the remaining 6 bits encode the version and variant. It can be generated independently by different services.

That makes UUIDv4 useful when we want distributed ID generation without depending on a central database or ID generator.

For example:

user_id
document_id
payment_id
report_request_id
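As a small illustration, here is how a service could generate such an ID locally. This is a sketch using Python's standard uuid module; the variable name document_id is only for illustration.

```python
import uuid

# Each service can call this locally; no database or ID server is involved.
document_id = uuid.uuid4()

print(document_id)              # a different value on every run
print(document_id.version)      # 4
print(len(document_id.bytes))   # 16 bytes = 128 bits

# 4 version bits and 2 variant bits are fixed; the rest is random.
RANDOM_BITS = 128 - 4 - 2
print(RANDOM_BITS)              # 122
```

Because nothing in the value depends on shared state, two services can generate IDs at the same moment without talking to each other.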

But UUIDv4 has one important problem.

It is random.

This matters when UUIDv4 is used as the primary key in a relational database like PostgreSQL.

The PostgreSQL Problem with UUIDv4

Let’s take a simple invoice table.

CREATE TABLE invoice (
    id UUID PRIMARY KEY,
    customer_id UUID NOT NULL,
    amount NUMERIC(10, 2),
    status VARCHAR(30),
    created_at TIMESTAMP NOT NULL
);

Here, id is the primary key, so PostgreSQL automatically creates a unique B-tree index on it.

But in real APIs, we usually do not fetch invoices only by ID. Most APIs show the latest invoices first.

Example:

SELECT *
FROM invoice
WHERE customer_id = :customerId
ORDER BY created_at DESC
LIMIT 50;

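To make this query fast, we usually add another index. Because the query filters on customer_id and sorts by created_at, a composite index on both columns serves it better than an index on created_at alone.

CREATE INDEX idx_invoice_customer_created_at
ON invoice (customer_id, created_at DESC);

Now every insert has to update at least two indexes:

Primary key index on id
Secondary index on (customer_id, created_at)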

This is where UUIDv4 can hurt.

Because UUIDv4 is random, new records are almost never inserted near the end of the primary key index. They land at effectively random positions inside the B-tree index.

This can increase:

random index page writes
page splits
cache misses
index maintenance cost
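To get a feel for the difference, we can simulate where new keys land in a sorted structure. This is only a rough sketch: a sorted Python list stands in for the leaf level of a B-tree, and the function name rightmost_insert_ratio is made up for this example. It does not model PostgreSQL pages or caching.

```python
import bisect
import uuid


def rightmost_insert_ratio(keys):
    """Insert keys one by one into a sorted list and return the
    fraction of inserts that landed at the very end."""
    ordered = []
    at_end = 0
    for key in keys:
        pos = bisect.bisect_right(ordered, key)
        if pos == len(ordered):
            at_end += 1
        ordered.insert(pos, key)
    return at_end / len(keys)


n = 5_000
random_keys = [uuid.uuid4().int for _ in range(n)]   # UUIDv4-style keys
increasing_keys = list(range(n))                     # always-growing keys

print(rightmost_insert_ratio(random_keys))      # close to 0
print(rightmost_insert_ratio(increasing_keys))  # 1.0
```

With increasing keys, every insert touches the same "hot" end of the structure. With random keys, almost every insert lands somewhere in the middle, which is the behavior that spreads writes across many index pages.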

So with UUIDv4, we often pay twice:

One random primary key index for uniqueness, and one timestamp index for sorting by creation time.

For small tables, this may not matter.

But for write-heavy tables like invoice events, audit logs, report requests, payment transactions, or message events, this can become expensive.

UUIDv7: UUID, But Time-Ordered

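UUIDv7, standardized in RFC 9562, addresses this problem. It places a 48-bit Unix timestamp in milliseconds at the front of the ID and fills most of the remaining bits with random data.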

It is still a UUID. It is still globally unique. It can still be generated independently by different services.

But unlike UUIDv4, UUIDv7 is time-ordered.

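That means newer IDs are usually greater than older IDs. "Usually" matters here: IDs created within the same millisecond, or on machines whose clocks disagree, can still sort slightly out of order.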

So instead of this behavior:

UUIDv4: random insert location in the index

we get this:

UUIDv7: mostly increasing insert location in the index

This makes UUIDv7 much more database-friendly.

PostgreSQL still has to maintain indexes, but the primary key index becomes easier to handle because new rows are inserted close to the end of the index instead of at random positions.
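To make the time-ordered layout concrete, here is a minimal sketch of a UUIDv7-style generator. It follows the RFC 9562 field layout (48-bit millisecond timestamp, 4-bit version, 12 random bits, 2-bit variant, 62 random bits). The function name uuid7 and its optional unix_ms parameter are assumptions for this example, not a library API, and the sketch does not add a counter for IDs created in the same millisecond.

```python
import os
import time
import uuid


def uuid7(unix_ms=None):
    """Build a UUIDv7-style value: timestamp first, random bits after."""
    if unix_ms is None:
        unix_ms = time.time_ns() // 1_000_000
    rand = int.from_bytes(os.urandom(10), "big")   # 80 random bits
    rand_a = rand >> 68                            # top 12 bits
    rand_b = rand & ((1 << 62) - 1)                # low 62 bits

    value = (unix_ms & ((1 << 48) - 1)) << 80      # 48-bit timestamp
    value |= 0x7 << 76                             # version 7
    value |= rand_a << 64
    value |= 0b10 << 62                            # RFC variant
    value |= rand_b
    return uuid.UUID(int=value)


older = uuid7(1_700_000_000_000)
newer = uuid7(1_700_000_000_001)
print(older.version)   # 7
print(older < newer)   # True
```

Because the timestamp occupies the most significant bits, comparing two IDs compares their creation milliseconds first. That is why inserts tend to land near the end of the index.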

UUIDv7 helps with:

better index locality
smoother insert behavior
easier cursor pagination
easier debugging because IDs roughly follow creation time
distributed ID generation without a central coordinator
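The cursor pagination point can be sketched like this. The example uses an in-memory SQLite table as a stand-in for PostgreSQL, stores IDs as text, and builds UUIDv7-style values with a hypothetical helper named uuid7_like. The seq column exists only so the output is easy to read.

```python
import os
import sqlite3
import uuid


def uuid7_like(unix_ms):
    """Hypothetical helper: 48-bit ms timestamp, then version, variant, random bits."""
    rand = int.from_bytes(os.urandom(10), "big")
    value = (unix_ms << 80) | (0x7 << 76) | ((rand >> 68) << 64)
    value |= (0b10 << 62) | (rand & ((1 << 62) - 1))
    return uuid.UUID(int=value)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE invoice (id TEXT PRIMARY KEY, seq INTEGER)")
base_ms = 1_700_000_000_000
for i in range(10):
    # One row per millisecond, so ID order matches insert order here.
    conn.execute("INSERT INTO invoice VALUES (?, ?)", (str(uuid7_like(base_ms + i)), i))


def newest_first(conn, limit, before_id=None):
    """Return one page, newest rows first, using the ID itself as the cursor."""
    if before_id is None:
        sql, args = "SELECT id, seq FROM invoice ORDER BY id DESC LIMIT ?", (limit,)
    else:
        sql = "SELECT id, seq FROM invoice WHERE id < ? ORDER BY id DESC LIMIT ?"
        args = (before_id, limit)
    return conn.execute(sql, args).fetchall()


page1 = newest_first(conn, 4)
page2 = newest_first(conn, 4, before_id=page1[-1][0])
print([seq for _, seq in page1])  # [9, 8, 7, 6]
print([seq for _, seq in page2])  # [5, 4, 3, 2]
```

The client only has to send back the last ID it saw. No OFFSET is needed, and the primary key index already provides the ordering.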

One important point:

UUIDv7 does not always remove the need for created_at.

We should still keep created_at because it is clear, readable, and useful for business queries and reporting.
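For debugging, the creation time can still be read back from a UUIDv7. The sketch below assumes the RFC 9562 layout, where the top 48 bits hold the Unix time in milliseconds. The function name uuid7_created_at and the hand-built sample value are illustrative only.

```python
import uuid
from datetime import datetime, timezone


def uuid7_created_at(value):
    """Return the UTC time stored in the top 48 bits of a UUIDv7."""
    if value.version != 7:
        raise ValueError("not a UUIDv7")
    unix_ms = value.int >> 80
    return datetime.fromtimestamp(unix_ms / 1000, tz=timezone.utc)


# Hand-built UUIDv7 for Unix ms 1_700_000_000_000; the random bits are
# left as zero only to keep this example deterministic.
sample = uuid.UUID(int=(1_700_000_000_000 << 80) | (0x7 << 76) | (0b10 << 62))
print(uuid7_created_at(sample).isoformat())  # 2023-11-14T22:13:20+00:00
```

This value reflects the clock of whichever machine generated the ID and has only millisecond precision, which is one more reason to keep an explicit created_at column for business queries.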

But UUIDv7 reduces the performance penalty of using UUID as the primary key.

So for most new backend systems, UUIDv7 is a better default than UUIDv4.

Snowflake ID: Compact and High-Scale

Snowflake IDs are different.

They are usually 64-bit numeric IDs.

Example:

175928847299117063

A Snowflake-style ID usually contains:

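Timestamp + worker ID + sequence number

In Twitter's original layout, this is a 41-bit millisecond timestamp measured from a custom epoch, a 10-bit worker ID, and a 12-bit sequence number, with the top bit left unused. Other systems use the same idea with different bit widths and epochs.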
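As a sketch, we can split the example ID into these three parts. This assumes the common 41/10/12 bit split (timestamp, worker ID, sequence). The real layout and epoch depend on the system that produced the ID, so the field names here are assumptions.

```python
from typing import NamedTuple


class SnowflakeParts(NamedTuple):
    timestamp_offset_ms: int   # milliseconds since the system's custom epoch
    worker_id: int             # 10-bit generator identifier
    sequence: int              # 12-bit per-millisecond counter


def decode_snowflake(snowflake_id):
    """Split a 64-bit ID, assuming a 41/10/12 bit layout."""
    return SnowflakeParts(
        timestamp_offset_ms=snowflake_id >> 22,
        worker_id=(snowflake_id >> 12) & 0x3FF,
        sequence=snowflake_id & 0xFFF,
    )


parts = decode_snowflake(175928847299117063)
print(parts)
# SnowflakeParts(timestamp_offset_ms=41944705796, worker_id=32, sequence=7)
```

Turning timestamp_offset_ms into a wall-clock time requires knowing the generator's epoch, which is not part of the ID itself.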

This makes the ID compact, sortable, and efficient for large-scale systems.

Compared to UUID:

UUID: 128 bits
Snowflake ID: 64 bits

In PostgreSQL terms, that is an 8-byte bigint key instead of a 16-byte uuid key, so Snowflake IDs need less storage per row and produce smaller indexes.
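A quick back-of-the-envelope check of the size difference. The row count below is a hypothetical number, and the calculation covers raw key bytes only. Real index sizes also include per-entry and per-page overhead.

```python
import struct
import uuid

uuid_key = uuid.uuid4().bytes                           # raw 128-bit value
snowflake_key = struct.pack(">q", 175928847299117063)   # signed 64-bit integer

print(len(uuid_key), len(snowflake_key))  # 16 8

rows = 100_000_000  # hypothetical table size
extra_bytes = rows * (len(uuid_key) - len(snowflake_key))
print(extra_bytes / 1_000_000)  # 800.0 (MB of extra key data per index)
```

The difference repeats in every index and foreign key column that stores the ID, so it adds up on large tables.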

Snowflake IDs are useful when we need very high-throughput ID generation, especially in systems like:

social media posts
chat messages
notifications
event streams
orders
logs

But Snowflake IDs come with operational complexity.

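We need to give every generator its own worker ID. We need to handle clock rollback, for example after a time synchronization correction, because a clock that moves backwards can reissue timestamps that were already used. We need to make sure two nodes do not generate the same ID with the same worker ID and sequence.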

If worker ID coordination is wrong, Snowflake can generate duplicate IDs.

So Snowflake is powerful, but it requires more infrastructure discipline.
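Here is a minimal sketch of a Snowflake-style generator that shows where these concerns appear in code. It assumes a 41/10/12 bit layout, a made-up custom epoch (EPOCH_MS), and an injectable clock function. It is an illustration, not a production implementation.

```python
import threading
import time


class SnowflakeGenerator:
    EPOCH_MS = 1_700_000_000_000   # hypothetical custom epoch
    WORKER_BITS = 10
    SEQUENCE_BITS = 12

    def __init__(self, worker_id, clock=lambda: time.time_ns() // 1_000_000):
        if not 0 <= worker_id < (1 << self.WORKER_BITS):
            raise ValueError("worker_id out of range")
        self.worker_id = worker_id
        self.clock = clock
        self.last_ms = -1
        self.sequence = 0
        self.lock = threading.Lock()

    def next_id(self):
        with self.lock:
            now = self.clock()
            if now < self.last_ms:
                # Clock rollback: failing is safer than risking a duplicate.
                raise RuntimeError("clock moved backwards")
            if now == self.last_ms:
                self.sequence = (self.sequence + 1) & ((1 << self.SEQUENCE_BITS) - 1)
                if self.sequence == 0:
                    # Sequence used up for this millisecond: wait for the next one.
                    while now <= self.last_ms:
                        now = self.clock()
            else:
                self.sequence = 0
            self.last_ms = now
            shift = self.WORKER_BITS + self.SEQUENCE_BITS
            return ((now - self.EPOCH_MS) << shift) | (self.worker_id << self.SEQUENCE_BITS) | self.sequence


# Two nodes that were wrongly given the same worker ID, in the same millisecond.
fixed_clock = lambda: SnowflakeGenerator.EPOCH_MS + 1
node_a = SnowflakeGenerator(worker_id=7, clock=fixed_clock)
node_b = SnowflakeGenerator(worker_id=7, clock=fixed_clock)
print(node_a.next_id() == node_b.next_id())  # True: a duplicate ID
```

The generator itself is short. The hard part is outside the code: making sure worker IDs are never shared and that clocks behave.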

Simple Decision Rule

Use UUIDv4 when:

The ID is mostly used as an external or public reference.
Database write volume is not very high.
We do not care about sorting by ID.
We want simple random opaque identifiers.

Use UUIDv7 when:

We are building a new backend service.
We use PostgreSQL or another relational database.
The ID is the primary key.
The table is expected to grow large.
We want distributed ID generation without coordination.
We want IDs to roughly follow creation time.

Use Snowflake ID when:

We need compact 64-bit numeric IDs.
We generate IDs at very high scale.
We need sortable IDs.
We can safely manage worker IDs and clock behavior.
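The rule above can be written down as a small function. This is a deliberately simplified reading of the lists, with made-up parameter names. A real decision would weigh more factors than four booleans.

```python
def choose_id_format(*, needs_compact_numeric_id=False, can_manage_worker_ids=False,
                     wants_opaque_public_id=False, high_write_volume=False):
    """Return a suggested ID format based on a few yes/no questions."""
    # Snowflake only pays off if we can operate it safely.
    if needs_compact_numeric_id and can_manage_worker_ids:
        return "Snowflake ID"
    # Random opaque IDs are fine when insert volume is modest.
    if wants_opaque_public_id and not high_write_volume:
        return "UUIDv4"
    # Otherwise fall back to the time-ordered default.
    return "UUIDv7"


print(choose_id_format())                                   # UUIDv7
print(choose_id_format(wants_opaque_public_id=True))        # UUIDv4
print(choose_id_format(needs_compact_numeric_id=True,
                       can_manage_worker_ids=True))         # Snowflake ID
```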

Final Recommendation

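For most Spring Boot microservices using PostgreSQL, UUIDv7 is the best default today: it keeps coordination-free ID generation and gives the primary key index mostly sequential inserts.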

UUIDv4 is still useful for random external references, because it reveals nothing about when a record was created, but it is not always ideal as a primary key for large write-heavy tables.

Snowflake IDs are excellent for very high-scale systems, but they add operational complexity because we must manage worker IDs, clock behavior, and sequence generation.

A simple recommendation is:

Use UUIDv7 as the default primary key for new backend tables.
Use UUIDv4 for opaque public references.
Use Snowflake only when compact, high-throughput, time-ordered numeric IDs are worth the extra operational complexity.

The main lesson is this:

An ID is not just an ID. In backend systems, the ID becomes part of the database write path, indexing strategy, API design, and scaling model.

Choosing the right one early can save us from painful migrations later.

DEV Community