ID generation looks like a small backend decision.
In many systems, we simply add an id column, make it the primary key, and move on. But once the table grows, this decision can affect database performance, indexing, pagination, debugging, and how easily the system scales across services.
The common choices are:
- UUIDv4
- UUIDv7
- Snowflake ID
Each one solves the uniqueness problem, but they behave differently in real backend systems.
UUIDv4: Simple, Random, and Widely Used
UUIDv4 is one of the most commonly used ID formats in backend applications.
A UUIDv4 looks like this:
550e8400-e29b-41d4-a716-446655440000
Of its 128 bits, 122 are random (the other 6 encode the version and variant), so it can be generated independently by different services with a negligible chance of collision.
That makes UUIDv4 useful when we want distributed ID generation without depending on a central database or ID generator.
For example:
- user_id
- document_id
- payment_id
- report_request_id
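As a minimal sketch of what independent generation looks like in a Java service, the JDK's `java.util.UUID.randomUUID()` produces a version 4 UUID without talking to a database or any other service. The class and method names here (`UuidV4Example`, `newId`, `describe`) are made up for illustration.

```java
import java.util.UUID;

final class UuidV4Example {
    // Generates a version 4 UUID with the JDK's built-in random generator.
    static UUID newId() {
        return UUID.randomUUID();
    }

    // Summarizes the fixed (non-random) fields of a UUID for logging or debugging.
    static String describe(UUID id) {
        return id + " (version " + id.version() + ", variant " + id.variant() + ")";
    }
}
```

Two calls to `newId()` on different machines need no coordination, and nothing in the resulting values tells us which one was created first.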
But UUIDv4 has one important problem.
It is random.
This matters when UUIDv4 is used as the primary key in a relational database like PostgreSQL.
The PostgreSQL Problem with UUIDv4
Let's take a simple invoice table.
CREATE TABLE invoice (
    id UUID PRIMARY KEY,
    customer_id UUID NOT NULL,
    amount NUMERIC(10, 2),
    status VARCHAR(30),
    created_at TIMESTAMP NOT NULL
);
Here, id is the primary key, so PostgreSQL automatically creates an index on it.
But in real APIs, we usually do not fetch invoices only by ID. Most APIs show the latest invoices first.
Example:
SELECT *
FROM invoice
WHERE customer_id = :customerId
ORDER BY created_at DESC
LIMIT 50;
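To make this query fast, we usually add another index. Because the query filters by customer_id and sorts by created_at, a composite index serves it better than an index on created_at alone.
CREATE INDEX idx_invoice_customer_created_at
ON invoice (customer_id, created_at DESC);
Now every insert has to update at least two indexes:
- Primary key index on id
- Secondary index on (customer_id, created_at)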
This is where UUIDv4 can hurt.
Because UUIDv4 is random, new records do not always get inserted near the end of the primary key index. They can land anywhere inside the B-tree index.
This can increase:
- random index page writes
- page splits
- cache misses
- index maintenance cost
So with UUIDv4, we often pay twice:
One random primary key index for uniqueness,
and one timestamp index for sorting by creation time.
For small tables, this may not matter.
But for write-heavy tables like invoice events, audit logs, report requests, payment transactions, or message events, this can become expensive.
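To get a feel for the random insert positions described above, here is a toy simulation. It is only a sketch: a `TreeSet` stands in for an ordered index, and it knows nothing about real B-tree pages or PostgreSQL internals. It counts how many inserts land at the right-hand edge (the key is greater than everything already stored). The class and method names are hypothetical.

```java
import java.util.TreeSet;
import java.util.UUID;
import java.util.function.Supplier;

final class InsertLocalitySketch {
    // Returns the fraction of inserts whose key sorts after every key already stored,
    // i.e. inserts that would land at the right-hand edge of an ordered index.
    static double appendFraction(Supplier<String> keySource, int inserts) {
        TreeSet<String> index = new TreeSet<>(); // stand-in for an ordered index
        int appended = 0;
        for (int i = 0; i < inserts; i++) {
            String key = keySource.get();
            if (index.isEmpty() || key.compareTo(index.last()) > 0) {
                appended++;
            }
            index.add(key);
        }
        return (double) appended / inserts;
    }

    // Random keys: canonical UUIDv4 text, whose lexicographic order matches byte order.
    static Supplier<String> randomKeys() {
        return () -> UUID.randomUUID().toString();
    }

    // Increasing keys: a zero-padded counter, standing in for any monotonic ID.
    static Supplier<String> increasingKeys() {
        long[] counter = {0};
        return () -> String.format("%020d", counter[0]++);
    }
}
```

With increasing keys every insert is an append, so the fraction is 1.0. With UUIDv4 keys only a tiny share of inserts land at the end, and the rest fall somewhere in the middle of the key range, which is the pattern that causes scattered page writes and page splits.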
UUIDv7: UUID, But Time-Ordered
UUIDv7, standardized in RFC 9562, handles this problem much better.
It is still a UUID. It is still globally unique. It can still be generated independently by different services.
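But unlike UUIDv4, UUIDv7 is time-ordered: its first 48 bits are a Unix timestamp in milliseconds, and the remaining bits are mostly random.
That means newer IDs are usually greater than older IDs. IDs created within the same millisecond, or on machines with slightly different clocks, may still sort out of order.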
So instead of this behavior:
UUIDv4: random insert location in the index
we get this:
UUIDv7: mostly increasing insert location in the index
This makes UUIDv7 much more database-friendly.
PostgreSQL still has to maintain indexes, but the primary key index becomes easier to handle because new rows are inserted closer to the end of the index instead of random positions.
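Here is a minimal sketch of how a UUIDv7 value can be assembled in Java, assuming a JDK whose `java.util.UUID` has no built-in version 7 factory. It follows the RFC 9562 field layout (48-bit millisecond timestamp, version, 12 random bits, variant, 62 random bits). The class `UuidV7Sketch` is hypothetical, and it does not add the optional monotonic counter, so two IDs from the same millisecond are not guaranteed to be ordered. In production, a maintained library or a database-side generator is likely the safer choice.

```java
import java.security.SecureRandom;
import java.util.UUID;

final class UuidV7Sketch {
    private static final SecureRandom RANDOM = new SecureRandom();

    // Builds a version 7 UUID for the given Unix time in milliseconds.
    static UUID generate(long unixMillis) {
        long randA = RANDOM.nextInt(1 << 12);               // 12 random bits
        long msb = ((unixMillis & 0xFFFFFFFFFFFFL) << 16)   // 48-bit timestamp in the top bits
                | 0x7000L                                   // version field = 7
                | randA;
        long randB = RANDOM.nextLong() & 0x3FFFFFFFFFFFFFFFL; // 62 random bits
        long lsb = 0x8000000000000000L | randB;             // variant bits = 10
        return new UUID(msb, lsb);
    }

    // Convenience overload that uses the current wall-clock time.
    static UUID generate() {
        return generate(System.currentTimeMillis());
    }

    // Recovers the embedded millisecond timestamp from a version 7 UUID.
    static long timestampMillis(UUID id) {
        return id.getMostSignificantBits() >>> 16;
    }
}
```

Because the timestamp sits in the most significant bits, the canonical text form (and the 16-byte binary form) of a later ID sorts after an earlier one, which is what keeps new index entries near the right-hand edge. Note that `UUID.compareTo` in Java compares signed longs, so comparing the text or byte form is the safer way to reason about database ordering.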
UUIDv7 helps with:
- better index locality
- smoother insert behavior
- easier cursor pagination
- easier debugging because IDs roughly follow creation time
- distributed ID generation without a central coordinator
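The cursor pagination point can be sketched in plain Java. With time-ordered IDs, the last ID of one page can serve as the cursor for the next page, mirroring a query of the form WHERE id < :cursor ORDER BY id DESC LIMIT n. The example below is an in-memory illustration only: a `NavigableSet` of ID strings stands in for the primary key index, and `CursorPaginationSketch` and `pageBefore` are made-up names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableSet;

final class CursorPaginationSketch {
    // Returns up to pageSize IDs strictly below the cursor, newest first.
    // A null cursor means "start from the newest ID".
    static List<String> pageBefore(NavigableSet<String> ids, String cursor, int pageSize) {
        NavigableSet<String> older = (cursor == null) ? ids : ids.headSet(cursor, false);
        List<String> page = new ArrayList<>();
        for (String id : older.descendingSet()) {
            if (page.size() == pageSize) {
                break;
            }
            page.add(id);
        }
        return page;
    }
}
```

The caller passes the last ID of the previous page as the next cursor. This only approximates "newest first" as well as the IDs approximate creation time, so when exact ordering by timestamp matters, paginating on created_at with the ID as a tie-breaker is the more precise option.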
One important point:
UUIDv7 does not always remove the need for `created_at`.
We should still keep created_at because it is clear, readable, and useful for business queries and reporting.
But UUIDv7 reduces the performance penalty of using UUID as the primary key.
So for most new backend systems, UUIDv7 is a better default than UUIDv4.
Snowflake ID: Compact and High-Scale
Snowflake IDs are different.
They are usually 64-bit numeric IDs.
Example:
175928847299117063
A Snowflake-style ID usually contains:
Timestamp + worker ID + sequence number
This makes the ID compact, sortable, and efficient for large-scale systems.
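To make the structure concrete, here is a sketch of how the three fields can be packed into one 64-bit value. It assumes the commonly cited split of 41 timestamp bits, 10 worker bits, and 12 sequence bits, with the timestamp counted in milliseconds from a custom epoch. Real implementations vary in both widths and epoch, and `SnowflakeLayout` is a hypothetical name.

```java
final class SnowflakeLayout {
    // Assumed bit widths: 41 (timestamp) + 10 (worker) + 12 (sequence) = 63 bits,
    // leaving the sign bit unused so IDs stay positive.
    static final int TIMESTAMP_BITS = 41;
    static final int WORKER_BITS = 10;
    static final int SEQUENCE_BITS = 12;
    static final long MAX_TIMESTAMP = (1L << TIMESTAMP_BITS) - 1;
    static final long MAX_WORKER = (1L << WORKER_BITS) - 1;     // 1023
    static final long MAX_SEQUENCE = (1L << SEQUENCE_BITS) - 1; // 4095

    // Packs the three fields into one long, timestamp in the highest bits.
    static long compose(long millisSinceEpoch, long workerId, long sequence) {
        if (millisSinceEpoch < 0 || millisSinceEpoch > MAX_TIMESTAMP) {
            throw new IllegalArgumentException("timestamp out of range");
        }
        if (workerId < 0 || workerId > MAX_WORKER) {
            throw new IllegalArgumentException("workerId out of range");
        }
        if (sequence < 0 || sequence > MAX_SEQUENCE) {
            throw new IllegalArgumentException("sequence out of range");
        }
        return (millisSinceEpoch << (WORKER_BITS + SEQUENCE_BITS))
                | (workerId << SEQUENCE_BITS)
                | sequence;
    }

    static long timestampOf(long id) {
        return id >>> (WORKER_BITS + SEQUENCE_BITS);
    }

    static long workerOf(long id) {
        return (id >>> SEQUENCE_BITS) & MAX_WORKER;
    }

    static long sequenceOf(long id) {
        return id & MAX_SEQUENCE;
    }
}
```

Since the timestamp occupies the highest bits, comparing two IDs as plain numbers compares their timestamps first, which is why these IDs sort roughly by creation time.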
Compared to UUID:
UUID: 128 bits
Snowflake ID: 64 bits
That means a Snowflake ID fits in a BIGINT column (8 bytes) while a UUID takes 16 bytes, so Snowflake IDs usually need less storage and create smaller indexes.
Snowflake IDs are useful when we need very high-throughput ID generation, especially in systems like:
- social media posts
- chat messages
- notifications
- event streams
- orders
- logs
But Snowflake IDs come with operational complexity.
We need to manage worker IDs correctly. We need to handle clock rollback. We need to make sure two nodes do not generate the same ID with the same worker ID and sequence.
If worker ID coordination is wrong, Snowflake can generate duplicate IDs.
So Snowflake is powerful, but it requires more infrastructure discipline.
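The sketch below shows where that discipline appears in code. It is a simplified, hypothetical generator (`SnowflakeGeneratorSketch`), assuming a 10-bit worker ID, a 12-bit sequence, and an injectable millisecond clock so the behavior can be tested. It refuses to generate IDs when the clock moves backwards and waits for the next millisecond when the sequence is exhausted. Other strategies exist, such as waiting out small rollbacks.

```java
import java.util.function.LongSupplier;

final class SnowflakeGeneratorSketch {
    private static final int WORKER_BITS = 10;
    private static final int SEQUENCE_BITS = 12;
    private static final long MAX_WORKER = (1L << WORKER_BITS) - 1;
    private static final long MAX_SEQUENCE = (1L << SEQUENCE_BITS) - 1;

    private final long workerId;
    private final long customEpochMillis;
    private final LongSupplier clock; // returns the current time in milliseconds
    private long lastMillis = -1L;
    private long sequence = 0L;

    SnowflakeGeneratorSketch(long workerId, long customEpochMillis, LongSupplier clock) {
        if (workerId < 0 || workerId > MAX_WORKER) {
            throw new IllegalArgumentException("workerId out of range");
        }
        this.workerId = workerId;
        this.customEpochMillis = customEpochMillis;
        this.clock = clock;
    }

    // synchronized so that threads sharing one generator never reuse a sequence number.
    synchronized long nextId() {
        long now = clock.getAsLong();
        if (now < lastMillis) {
            // Clock rollback: continuing could repeat an ID that was already issued.
            throw new IllegalStateException(
                    "clock moved backwards by " + (lastMillis - now) + " ms");
        }
        if (now == lastMillis) {
            sequence = (sequence + 1) & MAX_SEQUENCE;
            if (sequence == 0) {
                // Sequence exhausted for this millisecond: spin until the clock advances.
                while ((now = clock.getAsLong()) <= lastMillis) {
                    Thread.onSpinWait();
                }
            }
        } else {
            sequence = 0L;
        }
        lastMillis = now;
        return ((now - customEpochMillis) << (WORKER_BITS + SEQUENCE_BITS))
                | (workerId << SEQUENCE_BITS)
                | sequence;
    }
}
```

A typical call would be `new SnowflakeGeneratorSketch(3, epoch, System::currentTimeMillis)`, where `epoch` is whatever custom epoch the system has chosen. Nothing in this class can detect a second process that was started with the same worker ID: two such generators reading the same millisecond produce identical IDs, which is why worker ID assignment has to be solved outside the generator.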
Simple Decision Rule
Use UUIDv4 when:
- The ID is mostly used as an external or public reference.
- Database write volume is not very high.
- We do not care about sorting by ID.
- We want simple random opaque identifiers.
Use UUIDv7 when:
- We are building a new backend service.
- We use PostgreSQL or another relational database.
- The ID is the primary key.
- The table is expected to grow large.
- We want distributed ID generation without coordination.
- We want IDs to roughly follow creation time.
Use Snowflake ID when:
- We need compact 64-bit numeric IDs.
- We generate IDs at very high scale.
- We need sortable IDs.
- We can safely manage worker IDs and clock behavior.
Final Recommendation
For most Spring Boot microservices using PostgreSQL, UUIDv7 is a strong default today.
UUIDv4 is still useful for random external references, but it is not always ideal as a primary key for large write-heavy tables.
Snowflake IDs are excellent for very high-scale systems, but they add operational complexity because we must manage worker IDs, clock behavior, and sequence generation.
A simple recommendation is:
Use UUIDv7 as the default primary key for new backend tables.
Use UUIDv4 for opaque public references.
Use Snowflake only when compact, high-throughput, time-ordered numeric IDs are worth the extra operational complexity.
The main lesson is this:
An ID is not just an ID. In backend systems, the ID becomes part of the database write path, indexing strategy, API design, and scaling model.
Choosing the right one early can save us from painful migrations later.