DEV Community

Marco Mengelkoch


I Hit a 400k/s Wall — So I Built a Faster UUID v7 Generator in Rust

I was stress-testing a message pipeline. Thousands of messages flying through queues, each needing a unique ID. The code looked fine. The network looked fine. But throughput kept hitting a ceiling around 400,000 messages/second — and refused to go higher.

After some profiling, I found the culprit: uuid::Uuid::now_v7().

Not the queue. Not the serializer. The ID generator.


Why UUID v7?

UUID v7 is defined in RFC 9562, which was published in 2024. Unlike v4 (pure random), v7 embeds a millisecond-precision Unix timestamp in the top 48 bits. That makes the IDs naturally sortable, which is great for database primary keys, message IDs, log correlation, and anywhere else you want "roughly time-ordered" uniqueness without a central counter.

The layout looks like this:

|← 48 bits (ms timestamp) →|← 4 bits ver →|← 12 bits rand →|← 2 bits var →|← 62 bits rand →|

The problem: generating them correctly requires randomness. And the uuid crate, by default, uses a cryptographically secure RNG for those 74 random bits. That's OsRng — which means a syscall on every generation. Safe, correct, and very slow for high-throughput use.
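To make the layout concrete, here is a minimal sketch of packing those fields into a u128 and reading the timestamp back out. The function names pack_uuid_v7 and timestamp_ms are made up for this post; they are not the API of uuid or fast-uuid-v7.

```rust
/// Pack a UUID v7 value from a millisecond timestamp and 74 random bits.
/// Only the low 12 bits of `rand_a` and the low 62 bits of `rand_b` are used.
fn pack_uuid_v7(unix_ms: u64, rand_a: u16, rand_b: u64) -> u128 {
    let ts = (unix_ms as u128 & 0xFFFF_FFFF_FFFF) << 80; // bits 127..80: 48-bit timestamp
    let ver = 0x7u128 << 76; // bits 79..76: version 7
    let ra = (rand_a as u128 & 0x0FFF) << 64; // bits 75..64: 12 random bits
    let var = 0b10u128 << 62; // bits 63..62: variant
    let rb = rand_b as u128 & 0x3FFF_FFFF_FFFF_FFFF; // bits 61..0: 62 random bits
    ts | ver | ra | var | rb
}

/// Recover the embedded Unix timestamp in milliseconds.
fn timestamp_ms(id: u128) -> u64 {
    (id >> 80) as u64
}
```

Because the timestamp occupies the most significant bits, comparing two such values as plain integers orders them by creation millisecond first, which is where the sortability comes from.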


The Bottleneck

// This was killing my throughput
let id = Uuid::now_v7(); // ~1400ns per call on macOS

~1.4 microseconds. Doesn't sound like much. But at scale:

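  • 1,400 ns × 1,000,000 = 1.4 seconds just generating IDs for 1M messages
  • Put differently, ID generation alone caps a single thread at roughly 714,000 IDs/second (1 s ÷ 1.4 µs). Add the rest of the per-message work on top and you land near the 400k/s ceiling I was seeing.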

Note: This is primarily a macOS problem. On Linux and Windows, the uuid crate's default OsRng is significantly cheaper: there, uuid is roughly 10× slower than fast-uuid-v7, not 165×. If you run Linux in production, this may be less critical. But if you develop on a Mac and deploy to Linux, it's still worth knowing about.

The uuid crate does have a fast-rng feature flag that switches to a faster RNG. I only found out about it after building my own solution. It does help — but I'd advise against using it casually: enabling fast-rng is a global flag that affects all UUID generation in your binary, including v4 UUIDs you might be generating elsewhere for security-sensitive purposes (session tokens, CSRF tokens, etc.). Swapping out cryptographic randomness application-wide is a non-obvious footgun. fast-uuid-v7 keeps the fast path opt-in and explicit.


What I Built

fast-uuid-v7 — a focused Rust library that generates UUID v7 compatible IDs as fast as possible, without breaking the spec.

The core ideas:

1. Thread-Local State (No Locks)

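use rand::{rngs::SmallRng, SeedableRng}; // rand 0.8 with the `small_rng` feature
use std::cell::RefCell;

thread_local! {
    static RNG: RefCell<SmallRng> = RefCell::new(SmallRng::from_entropy());
}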

Each thread gets its own RNG and counter. No mutexes, no atomics, no contention. This alone removes a significant source of overhead in multi-threaded workloads.

2. Amortized Timestamps

SystemTime::now() costs ~20–40 ns. Calling it for every ID is wasteful. Instead, we use CPU cycle counters (rdtsc on x86_64, cntvct_el0 on ARM) to cheaply detect whether a millisecond has elapsed. We only call the real system clock when needed.
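The idea can be sketched roughly like this. It is a simplified illustration, not the crate's actual code: CachedClock and ticks_per_check are hypothetical names, and the tick value is passed in as an argument so the sketch stays portable, where a real implementation would read it from rdtsc or cntvct_el0.

```rust
use std::time::{SystemTime, UNIX_EPOCH};

/// Caches the wall-clock millisecond and re-reads the system clock only
/// after a cheap tick counter has advanced far enough.
struct CachedClock {
    cached_ms: u64,
    last_tick: u64,
    ticks_per_check: u64, // hypothetical tuning knob: ticks between clock reads
}

impl CachedClock {
    fn new(now_tick: u64, ticks_per_check: u64) -> Self {
        Self { cached_ms: wall_clock_ms(), last_tick: now_tick, ticks_per_check }
    }

    /// `now_tick` stands in for a CPU cycle counter reading.
    fn now_ms(&mut self, now_tick: u64) -> u64 {
        if now_tick.wrapping_sub(self.last_tick) >= self.ticks_per_check {
            // Slow path: consult the real clock, but never move backwards.
            self.cached_ms = self.cached_ms.max(wall_clock_ms());
            self.last_tick = now_tick;
        }
        // Fast path: a subtraction and a comparison, no system clock call.
        self.cached_ms
    }
}

fn wall_clock_ms() -> u64 {
    SystemTime::now()
        .duration_since(UNIX_EPOCH)
        .map(|d| d.as_millis() as u64)
        .unwrap_or(0)
}
```

The trade-off is visible in the fast path: as long as the counter has not advanced past the threshold, the cached millisecond is returned as-is, so the timestamp can lag slightly behind the real clock.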

3. SmallRng

We swap out OsRng for SmallRng from the rand crate — a fast, non-cryptographic PRNG. It's seeded once per thread from entropy. Not suitable for cryptography, but perfectly fine for database keys.
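If you want to see the "seed once per thread, then stay in user space" pattern without pulling in rand, here is a std-only sketch. It uses xorshift64* as a stand-in PRNG and the standard library's RandomState as a seed source; both are my assumptions for illustration and are not what SmallRng does internally.

```rust
use std::cell::Cell;
use std::collections::hash_map::RandomState;
use std::hash::{BuildHasher, Hasher};

thread_local! {
    // One PRNG state per thread: no locks, no atomics, seeded lazily on first use.
    static STATE: Cell<u64> = Cell::new(seed());
}

/// Std-only seed source (assumption: stands in for SmallRng::from_entropy()).
fn seed() -> u64 {
    let s = RandomState::new().build_hasher().finish();
    s | 1 // xorshift state must never be zero
}

/// xorshift64*: a tiny non-cryptographic PRNG, used here only for illustration.
fn next_u64() -> u64 {
    STATE.with(|state| {
        let mut x = state.get();
        x ^= x >> 12;
        x ^= x << 25;
        x ^= x >> 27;
        state.set(x);
        x.wrapping_mul(0x2545_F491_4F6C_DD1D)
    })
}
```

After the first call on a thread, every further call is a handful of shifts and one multiply. The output is predictable to anyone who learns the state, which is exactly why this kind of generator is fine for database keys and wrong for secrets.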

4. Stack-Allocated String Formatting

// Zero allocation — returns a stack-allocated FixedString
let id = gen_id_str(); // ~21–60 ns

The canonical UUID string representation (xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx) is 36 bytes. We write it directly into a stack buffer and return a type that implements Deref<Target=str>. No heap allocation at all.
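A minimal sketch of that approach, assuming nothing about the crate's internals: UuidStr and format_uuid_sketch are hypothetical names that stand in for the real FixedString type and formatting function.

```rust
use std::ops::Deref;

/// 36 bytes of canonical UUID text stored inline, so no heap allocation.
/// (Hypothetical stand-in for the crate's FixedString type.)
struct UuidStr([u8; 36]);

impl Deref for UuidStr {
    type Target = str;
    fn deref(&self) -> &str {
        // Only ASCII hex digits and '-' are ever written, so this is valid UTF-8.
        std::str::from_utf8(&self.0).expect("buffer contains ASCII only")
    }
}

fn format_uuid_sketch(id: u128) -> UuidStr {
    const HEX: &[u8; 16] = b"0123456789abcdef";
    let bytes = id.to_be_bytes();
    let mut out = [b'-'; 36]; // pre-filled with hyphens
    let mut pos = 0;
    for (i, &b) in bytes.iter().enumerate() {
        // Skip over a hyphen before bytes 4, 6, 8 and 10 (8-4-4-4-12 grouping).
        if matches!(i, 4 | 6 | 8 | 10) {
            pos += 1;
        }
        out[pos] = HEX[(b >> 4) as usize];
        out[pos + 1] = HEX[(b & 0x0F) as usize];
        pos += 2;
    }
    UuidStr(out)
}
```

Thanks to Deref, the returned value can be passed anywhere a &str is expected, and the buffer lives on the caller's stack until it goes out of scope.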


Usage

[dependencies]
fast-uuid-v7 = "0.1"

Then, in your code:

use fast_uuid_v7::{format_uuid, gen_id, gen_id_str, gen_id_string};

// u128 — fastest, 74 bits random (~8–50 ns)
let id: u128 = gen_id();

// Stack string — zero allocation (~21–60 ns)
let id = gen_id_str();
println!("{}", id); // e.g. "01942f3a-bc12-7d4e-8f01-2b3c4d5e6f70"

// Heap String — for when you need an owned String (~85–130 ns)
let id = gen_id_string();

// Format a u128 as a stack-allocated string
let id = format_uuid(gen_id());


Benchmarks

On Apple M1 / recent x86_64:

Method                                   | Time
uuid::Uuid::now_v7() (default)           | ~1400 ns
uuid::Uuid::now_v7() (fast-rng feature)  | ~90 ns (u128), ~170 ns (string)
fast_uuid_v7::gen_id()                   | ~8–50 ns
fast_uuid_v7::gen_id_str()               | ~21–60 ns
fast_uuid_v7::gen_id_string()            | ~85–130 ns

That's up to 165× faster than the default uuid crate, and still 8–10× faster than uuid with fast-rng.

Generating 10 million IDs takes roughly 95 ms on a single core, which works out to about 9.5 ns per ID.

To benchmark the repository code on your own machine:

cargo bench
# or
cargo test --release -- test_next_id_performance --nocapture

Caveats

To be upfront about the limitations:

  • Not cryptographically secure. Do not use this for session tokens, secrets, or anything security-sensitive. Use the uuid crate with OsRng for those.
  • Clock drift edge cases. The batched timestamp check assumes CPU counter frequency stability. VM migrations or unusual scheduling could cause a ~1 ms timestamp lag. It should not occur under sustained high throughput, and I could not reproduce it, but it remains possible in theory.
  • SystemTime::now() is still called periodically. The ~8 ns figure is the amortized hot-path cost. When a millisecond boundary is detected, that call costs ~50 ns instead, which is still much faster than the baseline.

When Should You Use This?

Use fast-uuid-v7 if:

  • You're generating IDs at very high rates (hundreds of thousands per second)
  • You need time-sortable IDs (database PKs, log IDs, message IDs)
  • Security/unpredictability of the random component is not a requirement

Stick with uuid if:

  • You need cryptographically secure randomness
  • Generation rate is not a bottleneck
  • You want RFC-strict compliance guarantees
  • You need parsing: fast-uuid-v7 does not parse UUIDs at all.

AI Disclaimer

This post was mostly written with AI assistance. I am not a native speaker, and an LLM finds better words than I could. Some of the ideas behind fast-uuid-v7 were also researched with AI assistance. The original idea, the actual optimization, the benchmarks, and everything else are my own work.


Repository

The crate is MIT licensed and available on crates.io and GitHub.

Feedback, benchmarks on your hardware, and PRs are welcome.
