I was stress-testing a message pipeline. Thousands of messages flying through queues, each needing a unique ID. The code looked fine. The network looked fine. But throughput kept hitting a ceiling around 400,000 messages/second — and refused to go higher.
After some profiling, I found the culprit: uuid::Uuid::now_v7().
Not the queue. Not the serializer. The ID generator.
Why UUID v7?
UUID v7 is a relatively recent standard (RFC 9562, finalized in 2024). Unlike v4 (pure random), v7 embeds a millisecond-precision Unix timestamp in the top 48 bits. That makes them naturally sortable — great for database primary keys, message IDs, log correlation, anywhere you want "roughly time-ordered" uniqueness without a central counter.
The layout looks like this:
|← 48 bits (ms timestamp) →|← 4 bits ver →|← 12 bits rand →|← 2 bits var →|← 62 bits rand →|
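For concreteness, here's how those fields can be packed into a `u128`. This is my own sketch; the helper name `pack_v7` and its parameters are illustrative, not the crate's API:

```rust
// Sketch: pack a UUID v7 from a millisecond Unix timestamp and 74 random
// bits (12 in rand_a, 62 in rand_b), following the layout above.
fn pack_v7(unix_ms: u64, rand_a: u16, rand_b: u64) -> u128 {
    let ts = (unix_ms as u128 & 0xFFFF_FFFF_FFFF) << 80; // top 48 bits: timestamp
    let ver = 0x7u128 << 76;                             // 4 bits: version = 7
    let ra = (rand_a as u128 & 0x0FFF) << 64;            // 12 random bits
    let var = 0b10u128 << 62;                            // 2 bits: RFC variant
    let rb = rand_b as u128 & 0x3FFF_FFFF_FFFF_FFFF;     // 62 random bits
    ts | ver | ra | var | rb
}
```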
The problem: generating them correctly requires randomness. And the uuid crate, by default, uses a cryptographically secure RNG for those 74 random bits. That's OsRng — which means a syscall on every generation. Safe, correct, and very slow for high-throughput use.
The Bottleneck
// This was killing my throughput
let id = Uuid::now_v7(); // ~1400ns per call on macOS
~1.4 microseconds. Doesn't sound like much. But at scale:
- 1,400 ns × 1,000,000 = 1.4 seconds just generating IDs for 1M messages
- That's the 400k/s ceiling right there
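If you want to reproduce the measurement yourself, a rough harness like this is enough to see the ceiling. It times any generator closure with `std::time::Instant`; absolute numbers will vary by machine:

```rust
use std::time::Instant;

// Rough micro-benchmark: time `n` calls of an ID generator, report ns/call.
fn ns_per_call<F: FnMut() -> u128>(mut gen: F, n: u32) -> f64 {
    let start = Instant::now();
    let mut acc = 0u128;
    for _ in 0..n {
        acc ^= gen(); // fold results so the loop isn't optimized away
    }
    let elapsed = start.elapsed().as_nanos() as f64;
    std::hint::black_box(acc);
    elapsed / n as f64
}
```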
Note: This is primarily a macOS problem. On Linux and Windows, the `uuid` crate's default `OsRng` is significantly cheaper — there the default is roughly 10× slower than `fast-uuid-v7`, not 165×. If you're on Linux in production, this may be less critical. But if you develop on a Mac and deploy to Linux, it's still worth knowing about.
The uuid crate does have a fast-rng feature flag that switches to a faster RNG. I only found out about it after building my own solution. It does help — but I'd advise against using it casually: enabling fast-rng is a global flag that affects all UUID generation in your binary, including v4 UUIDs you might be generating elsewhere for security-sensitive purposes (session tokens, CSRF tokens, etc.). Swapping out cryptographic randomness application-wide is a non-obvious footgun. fast-uuid-v7 keeps the fast path opt-in and explicit.
What I Built
fast-uuid-v7 — a focused Rust library that generates UUID v7 compatible IDs as fast as possible, without breaking the spec.
The core ideas:
1. Thread-Local State (No Locks)
use std::cell::RefCell;
use rand::rngs::SmallRng;
use rand::SeedableRng;

thread_local! {
    // One RNG per thread, seeded once from OS entropy
    static RNG: RefCell<SmallRng> = RefCell::new(SmallRng::from_entropy());
}
Each thread gets its own RNG and counter. No mutexes, no atomics, no contention. This alone removes a significant source of overhead in multi-threaded workloads.
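To make the idea concrete without pulling in the `rand` crate, here is a std-only sketch of the per-thread state, with a toy xorshift PRNG standing in for `SmallRng` (illustrative only, not the crate's actual code):

```rust
use std::cell::Cell;

// Per-thread state: a non-crypto PRNG plus a counter, both thread-local,
// so no locks or atomics are needed.
thread_local! {
    static STATE: Cell<(u64, u64)> = Cell::new((0x9E37_79B9_7F4A_7C15, 0)); // (rng state, counter)
}

fn next_rand() -> u64 {
    STATE.with(|s| {
        let (mut x, c) = s.get();
        // xorshift64: fast, but NOT cryptographically secure
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        s.set((x, c + 1));
        x
    })
}
```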
2. Amortized Timestamps
SystemTime::now() costs ~20–40 ns. Calling it for every ID is wasteful. Instead, we use CPU cycle counters (rdtsc on x86_64, cntvct_el0 on ARM) to cheaply detect whether a millisecond has elapsed. We only call the real system clock when needed.
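A portable sketch of the amortization idea, using `Instant` as a stand-in for the raw cycle counter (the actual implementation reads `rdtsc`/`cntvct_el0` directly, which is cheaper than `Instant`):

```rust
use std::cell::Cell;
use std::time::{Instant, SystemTime, UNIX_EPOCH};

// Only re-read the expensive wall clock once the cheap monotonic clock
// says a millisecond has elapsed; otherwise return the cached value.
thread_local! {
    static CACHED: Cell<Option<(Instant, u64)>> = Cell::new(None);
}

fn unix_ms_amortized() -> u64 {
    CACHED.with(|c| {
        if let Some((anchor, ms)) = c.get() {
            if anchor.elapsed().as_millis() == 0 {
                return ms; // hot path: cached millisecond still valid
            }
        }
        // slow path: consult the real wall clock and re-anchor
        let now = SystemTime::now()
            .duration_since(UNIX_EPOCH)
            .expect("clock before Unix epoch")
            .as_millis() as u64;
        c.set(Some((Instant::now(), now)));
        now
    })
}
```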
3. SmallRng
We swap out OsRng for SmallRng from the rand crate — a fast, non-cryptographic PRNG. It's seeded once per thread from entropy. Not suitable for cryptography, but perfectly fine for database keys.
4. Stack-Allocated String Formatting
// Zero allocation — returns a stack-allocated FixedString
let id = gen_id_str(); // ~21–60 ns
The canonical UUID string representation (xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx) is 36 bytes. We write it directly into a stack buffer and return a type that implements Deref<Target=str>. No heap allocation at all.
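A simplified version of the same idea — my own reimplementation for illustration, not the crate's actual code:

```rust
use std::ops::Deref;

// Stack-allocated UUID string: 32 hex digits plus 4 dashes = 36 bytes,
// written directly into a fixed buffer. No heap allocation.
struct FixedString([u8; 36]);

impl Deref for FixedString {
    type Target = str;
    fn deref(&self) -> &str {
        // Safe in practice: the buffer only ever holds ASCII hex and dashes
        std::str::from_utf8(&self.0).unwrap()
    }
}

fn format_uuid(id: u128) -> FixedString {
    const HEX: &[u8; 16] = b"0123456789abcdef";
    let mut buf = [0u8; 36];
    let bytes = id.to_be_bytes();
    let mut pos = 0;
    for (i, b) in bytes.iter().enumerate() {
        // Dashes fall after bytes 4, 6, 8, and 10 (the 8-4-4-4-12 grouping)
        if matches!(i, 4 | 6 | 8 | 10) {
            buf[pos] = b'-';
            pos += 1;
        }
        buf[pos] = HEX[(b >> 4) as usize];
        buf[pos + 1] = HEX[(b & 0xF) as usize];
        pos += 2;
    }
    FixedString(buf)
}
```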
Usage
[dependencies]
fast-uuid-v7 = "0.1"
use fast_uuid_v7::{gen_id, gen_id_str, gen_id_string};
// u128 — fastest, 74 bits random (~8–50 ns)
let id: u128 = gen_id();
// Stack string — zero allocation (~21–60 ns)
let id = gen_id_str();
println!("{}", id); // e.g. "01942f3a-bc12-7d4e-8f01-2b3c4d5e6f70"
// Heap String — for when you need an owned String (~85–130 ns)
let id = gen_id_string();
// Format a u128 as &str on the stack
let id = format_uuid(gen_id());
Benchmarks
On Apple M1 / recent x86_64:
| Method | Time |
|---|---|
| `uuid::Uuid::now_v7()` (default) | ~1400 ns |
| `uuid::Uuid::now_v7()` (`fast-rng` feature) | ~90 ns (u128), ~170 ns (string) |
| `fast_uuid_v7::gen_id()` | ~8–50 ns |
| `fast_uuid_v7::gen_id_str()` | ~21–60 ns |
| `fast_uuid_v7::gen_id_string()` | ~85–130 ns |
That's up to 165× faster than the default uuid crate, and still 8–10× faster than uuid with fast-rng.
Generating 10 million IDs takes roughly 95 ms on a single core.
To benchmark the repository code on your own machine:
cargo bench
# or
cargo test --release -- test_next_id_performance --nocapture
Caveats
To be upfront about the limitations:
- Not cryptographically secure. Do not use this for session tokens, secrets, or anything security-sensitive. Use the `uuid` crate with `OsRng` for those.
- Clock drift edge cases. The batched timestamp check assumes a stable CPU counter frequency. VM migrations or unusual scheduling could in theory cause a ~1 ms timestamp lag. It shouldn't occur under sustained high throughput, and I couldn't reproduce it, but it remains a theoretical concern.
- `SystemTime::now()` is still called periodically. The ~8 ns figure is the amortized hot-path cost. When a millisecond boundary is detected, the cost rises to ~50 ns for that call — still much faster than the baseline.
When Should You Use This?
Use fast-uuid-v7 if:
- You're generating IDs at very high rates (hundreds of thousands per second)
- You need time-sortable IDs (database PKs, log IDs, message IDs)
- Security/unpredictability of the random component is not a requirement
Stick with uuid if:
- You need cryptographically secure randomness
- Generation rate is not a bottleneck
- You want RFC-strict compliance guarantees
- You need parsing: fast-uuid-v7 doesn't do any UUID parsing.
AI Disclaimer
This post was mostly written with AI assistance. I am not a native English speaker, and an LLM can phrase things far better than I could. Some of the ideas behind fast-uuid-v7 were also researched with AI assistance. The original idea, the actual optimizations, the benchmarks, and everything else were my own input.
Repository
The crate is MIT licensed and available on crates.io and GitHub.
Feedback, benchmarks on your hardware, and PRs are welcome.