There's a pattern that's been quietly shipping in production for over a decade: take a large read-only dataset, compile it into a flat binary, and mmap it instead of deserializing it into Java objects. Lucene does it. Chronicle does it. LMDB does it.
I built a reusable version of that pattern for rule/config/policy datasets and just open-sourced it.
Repo: github.com/AlphaSudo/rimg
## The Problem
You have a service that evaluates rules, policies, feature flags, or lookup tables on every request. The dataset is large (10K–1M+ entries), rarely changes, and lives on the heap as a big object graph. You're paying for it in:
- Heap pressure: Hundreds of MB of static data the GC has to scan every cycle.
- Startup time: Deserializing JSON or querying a DB to build the graph.
- Reload cost: Rebuilding the whole graph when the dataset updates.
## What rule-image does
The `.rimg` format is a custom binary layout featuring:
- A CHD-style Minimal Perfect Hash (MPHF) index.
- Optional Bloom filter for fast negative lookups.
- CRC32 corruption detection + SHA-256 integrity.
- Little-endian packed entries with natural alignment.
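To make the "packed entries + CRC32" idea concrete, here is a minimal sketch of writing one fixed-width entry into off-heap memory with explicit little-endian layouts and verifying a CRC32 footer on read-back. The field layout (id, priority, mask) and the class name are my own illustration, not the actual `.rimg` wire format:

```java
import java.lang.foreign.Arena;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.ValueLayout;
import java.nio.ByteOrder;
import java.util.zip.CRC32;

/** Hypothetical sketch of one packed entry plus a CRC32 footer; not the real .rimg layout. */
public class PackedEntrySketch {
    // Force little-endian regardless of platform, as the format does.
    static final ValueLayout.OfInt LE_INT =
        ValueLayout.JAVA_INT_UNALIGNED.withOrder(ByteOrder.LITTLE_ENDIAN);
    static final ValueLayout.OfLong LE_LONG =
        ValueLayout.JAVA_LONG_UNALIGNED.withOrder(ByteOrder.LITTLE_ENDIAN);

    /** Packs (id, priority, mask), appends a CRC32, and returns true if verification passes. */
    public static boolean writeAndVerify(int id, int priority, long mask) {
        try (Arena arena = Arena.ofConfined()) {
            MemorySegment seg = arena.allocate(24, 8); // 16-byte payload + 8-byte CRC slot
            seg.set(LE_INT, 0, id);
            seg.set(LE_INT, 4, priority);
            seg.set(LE_LONG, 8, mask);

            CRC32 crc = new CRC32();
            crc.update(seg.asSlice(0, 16).toArray(ValueLayout.JAVA_BYTE));
            seg.set(LE_LONG, 16, crc.getValue());

            // "Load" side: recompute over the payload and compare against the stored footer.
            CRC32 check = new CRC32();
            check.update(seg.asSlice(0, 16).toArray(ValueLayout.JAVA_BYTE));
            return check.getValue() == seg.get(LE_LONG, 16)
                && seg.get(LE_INT, 0) == id;
        }
    }
}
```

In the real format the CRC would be computed once at build time by the AOT compiler and checked once at map time, not per entry.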
## The Numbers

### GeoIP-style showcase (5 million synthetic entries)
| Metric | Heap (POJO) | rule-image mapped |
|---|---|---|
| Heap after load | 853 MB | 8 MB |
| Load time | 6,683 ms | 145 ms |
| Reload time | 6,595 ms | 278 ms |
### 100K entries with fat metadata payloads
| Metric | Heap | Mapped |
|---|---|---|
| Heap after load | 1.47 GB | 7.33 MB |
### The "Honest" Benchmark (Latency)
| Benchmark | Heap | Mapped |
|---|---|---|
| JMH single warm lookup | 21 ns/op | 79 ns/op |
| JMH composed (N=10) | 400 ns/op | 684 ns/op |
The Tradeoff: Warm lookup is slower. If your dataset is already loaded and "warm" in the CPU cache, plain heap objects win. rule-image wins on memory footprint, startup, reload, and cold/miss-path behavior.
## Hot-swap Chaos Test
I threw 10,000 concurrent virtual-thread readers at the service harness while forcing an image swap every 500ms for 5 continuous minutes.
Result: Zero segfaults, zero stale reads, zero lost evaluations.
The reclamation strategy is epoch-based—each reader increments/decrements an epoch counter, and the swap thread waits for the epoch to stabilize before closing the old Arena.
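The epoch-guarded swap can be sketched roughly like this. All names here are hypothetical (this is not the project's actual API), and the real implementation surely differs, but it shows the core protocol: readers pin an image with a counter and re-check that it is still current before touching memory, and the swap thread closes the old Arena only after the counter drains to zero:

```java
import java.lang.foreign.Arena;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.ValueLayout;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicReference;

/** Minimal sketch of epoch-guarded hot swap; class and method names are hypothetical. */
public class HotSwapSketch {
    static final class Image {
        final Arena arena = Arena.ofShared();              // shared: readable from any thread
        final MemorySegment seg = arena.allocate(64, 8);   // stands in for the mapped file
        final AtomicInteger readers = new AtomicInteger(); // per-image "epoch" counter
    }

    private final AtomicReference<Image> current = new AtomicReference<>(new Image());

    /** Enter the epoch: pin an image that cannot be closed underneath us. */
    private Image acquire() {
        while (true) {
            Image img = current.get();
            img.readers.incrementAndGet();
            if (img == current.get()) return img;  // still current: safe to dereference
            img.readers.decrementAndGet();         // lost a race with swap(); retry
        }
    }

    public long read(long offset) {
        Image img = acquire();
        try {
            return img.seg.get(ValueLayout.JAVA_LONG, offset);
        } finally {
            img.readers.decrementAndGet();         // leave the epoch
        }
    }

    /** Publish a new image, then close the old Arena once its epoch drains. */
    public void swap() {
        Image old = current.getAndSet(new Image());
        while (old.readers.get() != 0) Thread.onSpinWait();
        old.arena.close();                         // no pinned reader can still see old.seg
    }
}
```

The increment-then-re-check in `acquire()` is what closes the window where a reader grabs a reference just as the swap thread is about to close the Arena.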
## The Valhalla Angle

The part I'm most excited about is the forward path. When JEP 401 (Value Classes) ships as a final feature, you'll be able to write zero-allocation views like this:
```java
value class RuleHeader {
    private final MemorySegment seg;
    private final long base;

    public int id()       { return seg.get(JAVA_INT, base + 0); }
    public int priority() { return seg.get(JAVA_INT, base + 4); }
    public long mask()    { return seg.get(JAVA_LONG, base + 8); }
}
```
Scalar replacement means this lives in registers. The hot path allocates exactly zero bytes end-to-end while your code reads like normal Java. I've drafted a post for valhalla-dev (included in the repo under docs/) and would love feedback from anyone tracking JEP 401.
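Until value classes land, a plain `record` gives the same reading shape today (and C2's escape analysis often scalar-replaces it anyway, just without Valhalla's guarantees). A hedged stand-in using the same assumed 16-byte layout as the snippet above; the `demo()` builder is purely illustrative:

```java
import java.lang.foreign.Arena;
import java.lang.foreign.MemorySegment;
import static java.lang.foreign.ValueLayout.JAVA_INT;
import static java.lang.foreign.ValueLayout.JAVA_LONG;

/** Record stand-in for the value-class view; layout (id, priority, mask) is assumed. */
public record RuleHeaderView(MemorySegment seg, long base) {
    public int id()       { return seg.get(JAVA_INT,  base);     }
    public int priority() { return seg.get(JAVA_INT,  base + 4); }
    public long mask()    { return seg.get(JAVA_LONG, base + 8); }

    /** Illustrative builder: packs one header into global memory and wraps it. */
    public static RuleHeaderView demo() {
        MemorySegment seg = Arena.global().allocate(16, 8);
        seg.set(JAVA_INT, 0, 7);
        seg.set(JAVA_INT, 4, 100);
        seg.set(JAVA_LONG, 8, 0xFFL);
        return new RuleHeaderView(seg, 0);
    }
}
```

The difference once JEP 401 ships is that scalar replacement becomes a guarantee of the programming model rather than a JIT heuristic.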
## Quick Start
```shell
# Requires JDK 26 (Temurin)
git clone https://github.com/AlphaSudo/rimg.git
cd rimg
./gradlew test
./gradlew :geoip-showcase:run --args="--entries 100000 --lookups 10000 --warmup-lookups 2000"
```
## What this is NOT
- Not "faster than everything": Warm lookup loses to heap POJOs.
- Not production-ready for all workloads: This is a PoC with real evidence, but use with caution.
- Not novel at the JVM level: Lucene and Chronicle have used these techniques for 15+ years.
What IS arguably new: The specific packaging as a reusable AOT compiler + runtime, tuned for Virtual Threads (Loom), with a Valhalla-forward codegen path.
The data is in the repo. Judge for yourself.
Apache 2.0 · github.com/AlphaSudo/rimg