No wall clock required: a 64-bit ID generator from a counter + reversible permutation
permid64 — Clock-free, persistent, reversible-permutation 64-bit ID generation.
GitHub: erickh826/permid64 · PyPI: `pip install permid64`
English version of the article; roadmap and limitation notes aligned with the current project README.
Part 1 — The classic trilemma
Almost every system that needs unique identifiers forces a choice among three familiar patterns: database auto-increment, random UUIDs, or timestamp-centric Snowflake-style IDs. Each looks workable at first; the painful trade-offs often show up only at scale. Before we see how permid64 sidesteps this triangle, it helps to be precise about what hurts in each option.
Auto-increment: your volume is public
An auto-increment primary key is the simplest uniqueness story, but it exposes the counter at the external boundary. An attacker who places two orders and subtracts the IDs learns exactly how many orders were issued in between.
Example: at 10:00 you receive order_id = 100420; at 14:00 another order returns 100780 — 360 orders in four hours, your growth curve laid bare. A competitor can automate small orders to sample that curve over time. For e-commerce or SaaS where such leakage matters, it is unacceptable. Sequential IDs also invite IDOR-style probing: increment or decrement the id and try other users’ resources.
UUID v4: the 128-bit random tax
UUID v4 uses 122 random bits so that, statistically, you get global uniqueness without coordination — which is why it is everywhere. The 128-bit width costs space in every index entry; worse, fully random keys interact badly with B+ tree indexes.
Page splits depend strongly on key distribution:
- Sequential keys: new rows land in the rightmost leaf; on average you trigger one split only every `k` inserts (`k` = keys per page). The working set stays small and buffer-pool friendly.
- Random keys: each insert may hit an arbitrary leaf. Once the index exceeds RAM, almost every insert becomes a cache miss and a disk read.
With buffer pool size `B` (in pages), insert rate `r` (pages/sec), and leaf fanout `f` (keys per page), random-key cache-miss pressure rises sharply once the row count `N` passes roughly `N ≈ B × f` — the point where the set of hot leaf pages no longer fits in the pool.
After that, random inserts tend to cost one random I/O each and throughput collapses. On PostgreSQL or MySQL/InnoDB with a B+ tree clustered primary key, UUID v4 keys can slow writes by orders of magnitude once data outgrows memory. That is the structural UUID-as-PK problem.
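As a back-of-envelope check, that threshold can be computed for a concrete machine. The pool size, page size, and fanout below are illustrative assumptions, not permid64 parameters:

```python
# Rough point where a random-key B+ tree index outgrows the buffer pool.
# Illustrative numbers: 16 GiB buffer pool, 16 KiB pages, ~100 keys per leaf.
buffer_pool_bytes = 16 * 2**30
page_bytes = 16 * 2**10
fanout = 100                                   # keys per leaf page (assumed)

pool_pages = buffer_pool_bytes // page_bytes   # B: pages that fit in RAM
threshold_rows = pool_pages * fanout           # N ≈ B × f

print(f"{threshold_rows:,} rows")              # 104,857,600 rows
```

Past roughly a hundred million rows on this hypothetical box, random-key inserts start paying one disk read each; sequential keys never hit this wall.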
Snowflake / timestamp IDs: you married the wall clock
Snowflake and relatives (TSID, ULID, …) embed wall-clock time in the high bits so ids are roughly monotonic — which fixes the B+ tree random-write pain. Correctness then depends on monotonic real time, which production clusters do not always provide.
Where clock issues bite:
- NTP step backward: correcting a fast clock can move system time backwards; if the step exceeds the sequence field’s capacity, duplicate ids become possible.
- VM pause / resume: after live migration or snapshot restore, guest time can jump backward by seconds or minutes.
- Leap seconds: OSes disagree on how to represent them; some paths can yield duplicate `gettimeofday()` results.
Once wall time goes backward, a timestamp scheme either emits duplicates or blocks until time catches up — neither is great. “Wait until the clock exceeds the last timestamp” moves the failure mode from duplicates to stalls; large NTP corrections can stall long enough to trip upstream timeouts.
Part 2 — Breaking the deadlock: counter in, permutation out
permid64 can be summarized in one line:
Uniqueness comes from the counter; the random-looking surface comes from a reversible permutation.
The counter supplies a strictly increasing sequence — under the documented persistence rules, that is the core uniqueness argument. A bijection over 64 bits maps consecutive inner values to values that look unstructured from the outside, so observers cannot read business volume or ordering off the wire. Because the map is invertible, every id can be decode()’d back to metadata with no information loss. You get “opaque externally” and “traceable internally” at the same time.
A clock-free observation
Claim: If an id generator `G` never reads wall-clock time, then `G` is immune to clock rollback (with respect to time as an input).

Argument: Suppose `G` depends only on persistent counter state `S` and fixed parameters `P`. Then `G(S, P)` does not mention a time variable `t`; whatever NTP or the hypervisor does to the clock, the issued sequence is driven by monotonic advances of `S`. So clock skew does not enter the uniqueness story for `G` as defined.
Pipeline shape
Encode (issue an id):

```python
seq = source.next()                      # monotonic counter (persistent)
raw = layout.compose(instance_id, seq)   # pack 16-bit shard + 48-bit seq
id64 = permutation.forward(raw)          # bijection: obfuscate
```

Decode (recover metadata):

```python
raw = permutation.inverse(id64)    # reverse bijection
meta = layout.decompose(raw)       # → DecodedId(instance_id, sequence)
```
End-to-end, the design is a three-stage pipeline: Source → Layout → Permutation. Each stage has a narrow job and can be swapped independently (single responsibility). For example, you could replace the permutation with an AES-based format-preserving encryption (FPE) construction without rewriting the counter or layout layers. permid64 expresses layer contracts with Python Protocol so implementations can be injected. The next parts unpack each layer.
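The layer contracts can be sketched with `typing.Protocol`. The names below are illustrative stand-ins, not necessarily permid64's exact interfaces:

```python
from typing import Protocol

MASK64 = (1 << 64) - 1

class Source(Protocol):
    def next(self) -> int: ...            # monotonic counter value

class Permutation(Protocol):
    def forward(self, x: int) -> int: ...
    def inverse(self, y: int) -> int: ...

# Minimal objects satisfying the contracts:
class InMemorySource:
    def __init__(self) -> None:
        self.n = 0
    def next(self) -> int:
        v = self.n
        self.n += 1
        return v

class IdentityPerm:
    def forward(self, x: int) -> int:
        return x & MASK64
    def inverse(self, y: int) -> int:
        return y & MASK64

def issue(source: Source, perm: Permutation, instance_id: int) -> int:
    raw = (instance_id << 48) | source.next()   # default 16/48 layout
    return perm.forward(raw)

src, perm = InMemorySource(), IdentityPerm()
first = issue(src, perm, 42)
print(hex(first))   # 0x2a000000000000
```

Because `issue` only sees the protocols, swapping `IdentityPerm` for a Feistel or FPE permutation requires no change to the counter or layout code.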
Part 3 — The three layers in depth
3.1 Source: block reservation
Problem: After restart, the counter must still advance; the naive approach is to fsync on every next() — too slow. On a fast NVMe, one fsync might be on the order of tens of microseconds, capping you around ~100k durable ops/sec, which is not enough for hot id paths.
Idea: Block reservation. You do not persist every value; you persist a high-water mark — the upper bound of the reserved range. Reserve block_size ids at once (default 4096), then serve the next 4095 ids from memory with no extra disk I/O.
Restart-safety (sketch)
Claim: With block reservation, a crash after restart cannot re-issue an id that was already issued before the crash.
Setup:
- Let `H'` be the last high-water mark successfully written to disk; a block's mark is always persisted before any sequence value from that block is handed out (`B` = block size).
- A crash loses all in-memory state.
Goal: Any id `v'` issued after restart is strictly greater than any id `v` issued before the crash.
Sketch:
- Before the crash, the last durable watermark is `H' = H + B`. In that block, issued values satisfy `H <= v < H'`.
- After the crash, the process reloads `H'` from disk and reserves a new block: it persists `H' + B`, then serves values from `[H', H' + B)`.
- The new block starts at `H'`, so post-restart values satisfy `v' >= H' > v` for all pre-crash `v`.
So the old and new ranges are disjoint — no duplicates.
Gaps are acceptable
A crash wastes the rest of the current block (up to `B − 1` unused values). That is intentional:
Accept gaps to guarantee uniqueness.
Gaps do not create collisions; they only skip numbers. For an id generator, uniqueness is the hard requirement; consecutive values are not. Because ids pass through a permutation, external observers cannot infer where gaps occurred anyway.
Pseudocode
```python
import os
from pathlib import Path

def read_highwater(path: Path) -> int:
    """Return the persisted high-water mark; 0 if the file is missing."""
    try:
        return int(path.read_text())
    except FileNotFoundError:
        return 0

def atomic_write(path: Path, value: int) -> None:
    """Write to a temp file, fsync, then atomically rename into place."""
    tmp = path.with_suffix(".tmp")
    with open(tmp, "w") as f:
        f.write(str(value))
        f.flush()
        os.fsync(f.fileno())
    os.replace(tmp, path)  # atomic rename on POSIX

class PersistentCounterSource:
    def __init__(self, state_file: str, block_size: int = 4096):
        self.path = Path(state_file)
        self.block_size = block_size
        self._next = 0
        self._limit = 0  # invariant: _next < _limit while a block is live

    def _reserve_block(self) -> None:
        """Read H from disk, persist H' = H + B, then serve [H, H') from memory."""
        H = read_highwater(self.path)
        H_new = H + self.block_size
        atomic_write(self.path, H_new)  # durable BEFORE any value is issued
        self._next = H
        self._limit = H_new

    def next(self) -> int:
        if self._next >= self._limit:
            self._reserve_block()  # one disk round-trip every block_size calls
        val = self._next
        self._next += 1
        return val
```
Amortized disk cost is `1/B` fsync operations per `next()` — one block reservation every `B` calls.
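A compact simulation makes the restart-safety argument concrete. Here a plain dict stands in for the durable state file; this is a model of the idea, not permid64's actual code:

```python
class BlockCounter:
    """Toy block-reserving counter; `store` models the durable state file."""
    def __init__(self, store: dict, block_size: int = 4096):
        self.store, self.block = store, block_size
        self._next = self._limit = 0

    def next(self) -> int:
        if self._next >= self._limit:            # block exhausted (or fresh start)
            high = self.store.get("hw", 0)       # read durable high-water mark
            self.store["hw"] = high + self.block # persist the new mark FIRST
            self._next, self._limit = high, high + self.block
        v = self._next
        self._next += 1
        return v

disk = {}
gen = BlockCounter(disk)
before = [gen.next() for _ in range(10)]   # issues 0..9; "disk" mark is now 4096

gen2 = BlockCounter(disk)                  # crash: all in-memory state lost
after = [gen2.next() for _ in range(10)]   # restart resumes from the durable mark

print(after[0])                            # 4096 — strictly above every pre-crash id
```

The 4086 values skipped by the simulated crash are the "wasted block" — gaps, never duplicates.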
3.2 Layout: bit packing
The layout packs instance_id (shard tag) and sequence into one 64-bit integer. The default split puts a 16-bit `instance_id` in the high bits and a 48-bit `sequence` in the low bits.

Capacity:
- Up to `2^16 = 65,536` distinct shards.
- Per shard, `2^48 ≈ 2.8 × 10^14` ids.

Even at one million ids per second, a single shard takes roughly nine years to exhaust 48 bits; at a more typical ten thousand ids per second, nearly nine centuries.
Compose: `raw = (instance_id << 48) | sequence`

Decompose: `instance_id = raw >> 48`, `sequence = raw & ((1 << 48) - 1)`
Layout64 is configurable — e.g. Layout64(instance_bits=10, sequence_bits=54) trades shard count for per-shard headroom. The only hard rule is instance_bits + sequence_bits == 64.
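The 16/48 split is two shifts and a mask. A minimal sketch (not the library's exact Layout64 code):

```python
INSTANCE_BITS = 16
SEQUENCE_BITS = 64 - INSTANCE_BITS            # hard rule: the two must sum to 64
SEQ_MASK = (1 << SEQUENCE_BITS) - 1

def compose(instance_id: int, sequence: int) -> int:
    """Pack shard tag into the high bits, sequence into the low bits."""
    assert 0 <= instance_id < (1 << INSTANCE_BITS)
    assert 0 <= sequence <= SEQ_MASK
    return (instance_id << SEQUENCE_BITS) | sequence

def decompose(raw: int) -> tuple[int, int]:
    """Inverse of compose: recover (instance_id, sequence)."""
    return raw >> SEQUENCE_BITS, raw & SEQ_MASK

raw = compose(42, 7)
print(hex(raw))          # 0x2a000000000007
print(decompose(raw))    # (42, 7)
```

Changing the two constants gives the `Layout64(instance_bits=10, sequence_bits=54)` trade-off described above.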
3.3 Permutation: mixing without losing invertibility
Why a bijection, not a hash?
If you “hid” the counter with SHA-256, you could not implement decode() to recover instance_id and sequence — hashes are one-way. permid64 needs a permutation (bijection) on 64-bit strings:
Definition: A map `P: {0,1}^64 → {0,1}^64` together with an inverse `P⁻¹` such that `P⁻¹(P(x)) = x` for all `x`.
Two built-ins ship today.
MultiplyOdd permutation
Map: `f(x) = (a·x + b) mod 2^64`, with `a` odd and `b` any 64-bit integer.
Invertibility
Claim: Odd `a` has a modular inverse `a⁻¹` modulo `2^64`, and `f⁻¹(y) = a⁻¹ · (y − b) mod 2^64`.
Sketch:
- Odd `a` implies `gcd(a, 2^64) = 1`; Bézout gives `a⁻¹` with `a · a⁻¹ ≡ 1 (mod 2^64)`.
- From `y = a·x + b`: `y − b = a·x`; multiply by `a⁻¹` to recover `x`.
- Compose the two maps to check `f⁻¹(f(x)) = x`.
So `f` is a bijection on 64-bit integers.
Default constants
- `a` — a 64-bit golden-ratio–related multiplier (odd), in the spirit of Knuth's multiplicative hashing for good bit diffusion.
- `b` — the first 64 bits of SHA-256's initial state, used as an independent additive offset.
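Python 3.8+ computes the modular inverse directly via `pow(a, -1, 2**64)`. A minimal MultiplyOdd sketch — the constants here are illustrative stand-ins, since the library's exact defaults live in its source:

```python
MASK64 = (1 << 64) - 1

A = 0x9E3779B97F4A7C15        # odd golden-ratio-style multiplier (illustrative)
B = 0x6A09E667F3BCC909        # arbitrary 64-bit additive offset (illustrative)
A_INV = pow(A, -1, 1 << 64)   # exists because A is odd: gcd(A, 2^64) = 1

def forward(x: int) -> int:
    """f(x) = (A·x + B) mod 2^64."""
    return (A * x + B) & MASK64

def inverse(y: int) -> int:
    """f⁻¹(y) = A⁻¹·(y − B) mod 2^64."""
    return (A_INV * (y - B)) & MASK64

for x in (0, 1, 42, MASK64):
    assert inverse(forward(x)) == x
print(hex(forward(1)))
```

The whole permutation is one multiply and one add per direction, which is why this variant tops the throughput table below.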
Feistel64 permutation
A Feistel network is a classic construction: the round function need not be invertible; the round structure still yields a permutation.
permid64 uses a balanced Feistel on 64 bits: left and right halves are 32 bits each.
One Feistel round
With halves `(L[i], R[i])` and round key `K[i]`:
`L[i+1] = R[i]`
`R[i+1] = L[i] XOR F(R[i], K[i])`
Inverting one round
Claim: Given `(L[i+1], R[i+1])` and round key `K[i]`, you can recover `(L[i], R[i])` uniquely.
Proof:
From `L[i+1] = R[i]`, we get `R[i] = L[i+1]`.
From `R[i+1] = L[i] XOR F(R[i], K[i])` and substituting `R[i] = L[i+1]`: `R[i+1] = L[i] XOR F(L[i+1], K[i])`.
XOR both sides with `F(L[i+1], K[i])`: `L[i] = R[i+1] XOR F(L[i+1], K[i])`.
So the inverse round is: `R[i] = L[i+1]`, `L[i] = R[i+1] XOR F(L[i+1], K[i])`.
Takeaway: `F` does not need an inverse; on the way back, `R[i]` is known as `L[i+1]`, so you only re-evaluate `F(L[i+1], K[i])`.
Six-round forward / inverse (symbolic)

Forward (keys `K[0] … K[5]` in order): for `i = 0 … 5`, apply `L[i+1] = R[i]`, `R[i+1] = L[i] XOR F(R[i], K[i])`. Output is `(L[6], R[6])`.

Inverse (keys `K[5] … K[0]` in reverse): for `i = 5 … 0`, apply `R[i] = L[i+1]`, `L[i] = R[i+1] XOR F(L[i+1], K[i])`. This recovers `(L[0], R[0])`.
Round function (as implemented)
```python
def _round_f(r: int, k: int) -> int:
    x = (r ^ k) & MASK32                  # XOR round key
    x = (x * 0x9E3779B1) & MASK32         # Knuth-style multiply (32-bit)
    x ^= (x >> 16)                        # xor-shift: high → low bits
    x = ((x << 5) | (x >> 27)) & MASK32   # rotate left by 5
    x = (x * 0x85EBCA6B) & MASK32         # MurmurHash3 finalizer constant
    x ^= (x >> 13)                        # xor-shift mix again
    return x & MASK32
```
Constants:
- `0x9E3779B1` — 32-bit golden-ratio–style multiplier for avalanche.
- `0x85EBCA6B` — MurmurHash3 finalizer constant for a second mixing stage.
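A quick way to see why these constants are chosen is an avalanche check: flip one input bit and count how many output bits change. A good 32-bit mixer flips about 16 of 32 on average. This standalone check re-states the round function shown above:

```python
MASK32 = (1 << 32) - 1

def round_f(r: int, k: int) -> int:
    # Same mixing steps as the round function listing above.
    x = (r ^ k) & MASK32
    x = (x * 0x9E3779B1) & MASK32
    x ^= x >> 16
    x = ((x << 5) | (x >> 27)) & MASK32
    x = (x * 0x85EBCA6B) & MASK32
    x ^= x >> 13
    return x & MASK32

key = 0x12345678                  # arbitrary round key for the experiment
flips = []
for r in range(1000):
    for bit in (0, 7, 31):        # sample low, middle, and high input bits
        diff = round_f(r, key) ^ round_f(r ^ (1 << bit), key)
        flips.append(bin(diff).count("1"))

avg = sum(flips) / len(flips)
print(f"average bits flipped: {avg:.1f} / 32")   # ideal: ~16
```

Without the multiplies, a single-bit input change would touch only a handful of output bits and consecutive counter values would produce visibly similar ids.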
Round-key derivation
From one 64-bit seed, xorshift64*-style steps yield `rounds` 32-bit subkeys:
```python
@staticmethod
def _derive_round_keys(key: int, rounds: int) -> list[int]:
    keys = []
    x = key & MASK64
    for _ in range(rounds):
        x ^= (x >> 12) & MASK64
        x ^= (x << 25) & MASK64
        x ^= (x >> 27) & MASK64
        x = (x * 0x2545F4914F6CDD1D) & MASK64
        keys.append(x & MASK32)
    return keys
```
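Putting the round function and the key schedule together gives a working six-round permutation. This sketch mirrors the listings above as a model of the construction — it is not the library's exact Feistel64 code:

```python
MASK32 = (1 << 32) - 1
MASK64 = (1 << 64) - 1

def round_f(r: int, k: int) -> int:
    x = (r ^ k) & MASK32
    x = (x * 0x9E3779B1) & MASK32
    x ^= x >> 16
    x = ((x << 5) | (x >> 27)) & MASK32   # rotl 5
    x = (x * 0x85EBCA6B) & MASK32
    x ^= x >> 13
    return x & MASK32

def derive_keys(key: int, rounds: int = 6) -> list[int]:
    keys, x = [], key & MASK64
    for _ in range(rounds):
        x ^= x >> 12
        x ^= (x << 25) & MASK64
        x ^= x >> 27
        x = (x * 0x2545F4914F6CDD1D) & MASK64
        keys.append(x & MASK32)
    return keys

def feistel_forward(v: int, keys: list[int]) -> int:
    left, right = v >> 32, v & MASK32
    for k in keys:                         # L' = R, R' = L ^ F(R, K)
        left, right = right, left ^ round_f(right, k)
    return (left << 32) | right

def feistel_inverse(v: int, keys: list[int]) -> int:
    left, right = v >> 32, v & MASK32
    for k in reversed(keys):               # undo rounds in reverse key order
        left, right = right ^ round_f(left, k), left
    return (left << 32) | right

ks = derive_keys(0xDEADBEEFCAFEBABE)
for x in (0, 1, 42, MASK64):
    assert feistel_inverse(feistel_forward(x, ks), ks) == x
```

Note that `feistel_inverse` never inverts `round_f` — it only re-evaluates it with the keys reversed, exactly as the round-inversion derivation predicts.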
Comparing the two permutations
| | MultiplyOdd | Feistel64 |
|---|---|---|
| Speed | ~500M ids/sec | ~150M ids/sec |
| Mixing | Affine (single round) | Nonlinear, 6 rounds |
| Invertibility | Odd multiply has mod inverse | Structural Feistel property |
| Math | Modular arithmetic (Bézout inverse) | Feistel network |
| When | Throughput-first, "good enough" mixing | Stronger statistical obscurity |
Part 4 — Engineering trade-offs and roadmap
Limitations (read before adopting)
permid64 is not trying to be the one id scheme for every system. A few non-negotiables:
- **Single-process state file today:** `PersistentCounterSource` is not safe for multiple processes sharing one state file. On POSIX, a best-effort `fcntl.flock` is taken during block reservation, but that is not a substitute for correct sharding. The supported pattern is a distinct `instance_id` plus a distinct state file per process until multi-process locking lands. See README Limitations for detail.
- **Not cryptography:** Feistel here is obfuscation, not AES. Anyone who knows the permutation parameters can `decode()` ids. Do not use these ids as secret tokens or auth credentials.
- **Manual `instance_id`:** You must assign unique ids per shard/process yourself. Collisions happen if two live generators reuse the same `instance_id`. Use env vars or config management and validate at startup. Central allocation is on the roadmap (`ReservedBlockSource`, v0.5 in the table below).
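One cheap way to enforce the manual-assignment rule is to fail fast at startup. The env var name and helper below are illustrative, not part of permid64's API:

```python
import os

def load_instance_id(env_var: str = "PERMID64_INSTANCE_ID", bits: int = 16) -> int:
    """Read the shard id from the environment; refuse to start without one."""
    raw = os.environ.get(env_var)
    if raw is None:
        raise RuntimeError(f"{env_var} must be set explicitly; refusing to guess")
    value = int(raw, 0)                      # accepts "42" or "0x2A"
    if not 0 <= value < (1 << bits):
        raise ValueError(f"instance_id {value} does not fit in {bits} bits")
    return value

os.environ["PERMID64_INSTANCE_ID"] = "42"    # normally set by deploy tooling
print(load_instance_id())                    # 42
```

Crashing at boot on a missing or out-of-range shard id is far cheaper than debugging duplicate ids from two generators that silently shared `instance_id=0`.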
Benchmarks (permutation only)
Apple M2 / Python 3.12–class numbers from the project’s benchmark script:
| Permutation | Throughput | Latency (order of) |
|---|---|---|
| MultiplyOdd | ~480M ids/sec | ~2 ns |
| Feistel (6 rounds) | ~140M ids/sec | ~7 ns |
| Identity (no mixing) | ~520M ids/sec | ~2 ns |
These figures are pure CPU permutation cost; block reservation amortizes fsync to about once per 4096 ids in steady state. For the authoritative roadmap (multi-process locking, encodings, allocator, reference implementations), see the README in the repo.
Roadmap (from README)
| Version | Theme | What's included | Status |
|---|---|---|---|
| v0.1 | Core primitives | `PersistentCounterSource`, Feistel / Multiplicative permutation, `decode()` | Released |
| v0.2 | Encoding & Config | `IdentityPermutation`, fixed-width Base62 + Crockford Base32, `Id64Config` + `build_id64` | Released |
| v0.3 | Multi-process safety | Process-safe file locking (fcntl / msvcrt), documented single-machine multi-process patterns, safer state file semantics | Next |
| v0.4 | Human-friendly output | Check digit, `PrefixedEncoder`, `FormatSpec` (segmented display, ambiguity-reduced alphabets) | Planned |
| v0.5 | Distributed sources | `ReservedBlockSource` (central allocator, Redis / PG-backed block rental), `instance_id` helpers | Planned |
| v0.6 | Solution presets | `OrderIdGenerator`, `TicketIdGenerator`, `CorrelationIdGenerator`, `IoTEventIdGenerator` | Planned |
| v0.7 | Formal spec | Cross-language bit layout, profile aliases, compatibility document | Planned |
| v1.0 | Reference implementations | Go + Rust reference implementations, three-language cross-decode tests | Planned |
Quick start
```python
from permid64 import Id64

gen = Id64.feistel(
    instance_id=42,
    state_file="my_service.state",
    key=0xDEADBEEFCAFEBABE,
)

id_val = gen.next_u64()       # int
token = gen.next_base62()     # str

meta = gen.decode(id_val)
print(meta.instance_id)       # 42
print(meta.sequence)          # 0, 1, 2, ...

meta2 = gen.decode_base62(token)
```
Summary
permid64 escapes parts of the auto-increment / UUID / timestamp trilemma by proving uniqueness from a monotonic counter and hiding structure with a bijection you can invert for operations.
| | Auto-increment | UUID v4 | Snowflake | permid64 |
|---|---|---|---|---|
| Width | 32/64 | 128 | 64 | 64 |
| Depends on | DB sequence | RNG | Wall clock (NTP) | Persistent counter (file) |
| Clock rollback | none | none | high risk | none (no clock in path) |
| Surface | strictly increasing | random | roughly increasing | random-looking (invertible) |
| B+ tree friendliness | excellent | poor | good | moderate |
| Decodable | trivially leaks count | no | yes (time + worker) | yes (instance_id + sequence) |
| Volume leakage | severe | none | some inference possible | none from the counter alone |
| Infra | database | none | coordination for worker ids | durable local storage |
| Cross-process | DB guarantees | probabilistic | worker-id scheme | today: manual shards; roadmap: locking + allocator |
There is no perfect id — only the right id for the constraints. If you want 64-bit, clock-free, decodable, opaque-looking identifiers in pure Python with zero runtime dependencies, permid64 is worth a serious look. The layered design also leaves room to swap in stronger permutations or add allocation without breaking the mental model.
- GitHub: https://github.com/erickh826/permid64
- PyPI: `pip install permid64`