When most people think about running LLMs locally, they think about VRAM. But if you're running on a multi-socket server, there's a completely different bottleneck: NUMA memory topology. RAM Coffers is built to address it.
The NUMA Problem
In a dual-socket or multi-socket server, each CPU has its own local memory bank. Accessing local RAM is fast. Accessing memory attached to another socket means crossing the interconnect (Infinity Fabric on AMD, UPI on Intel), which can be as much as 2-3x slower.
When an LLM inference engine doesn't know about NUMA topology, it can end up:
- Allocating model weights on the wrong NUMA node
- Paying for cross-node memory access on every generated token
- Losing 40-60% of throughput without any obvious cause
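To get a rough sense of how far apart the nodes are on a given machine, Linux exposes a per-node distance row in `/sys/devices/system/node/node<N>/distance`, where 10 conventionally means local access and larger values mean relatively more expensive remote access. The sketch below parses such a row and reports the remote-to-local ratio. The sample row is illustrative, not a measurement from any particular server, and the firmware-reported distances are only a hint, not a benchmark.

```python
from pathlib import Path


def parse_distance_row(text: str) -> list[int]:
    """Parse a sysfs NUMA distance row such as '10 21' into integers."""
    return [int(field) for field in text.split()]


def remote_ratios(row: list[int], local_node: int) -> dict[int, float]:
    """Return each other node's distance relative to the local node's own distance."""
    local = row[local_node]
    return {node: dist / local for node, dist in enumerate(row) if node != local_node}


def read_distance_row(node: int) -> list[int]:
    """Read the distance row for one node from Linux sysfs (Linux only)."""
    path = Path(f"/sys/devices/system/node/node{node}/distance")
    return parse_distance_row(path.read_text())


# Illustrative row for node 0 on a hypothetical two-node machine.
sample_row = parse_distance_row("10 21\n")
sample_ratios = remote_ratios(sample_row, local_node=0)
print(sample_ratios)
```

On a real multi-socket Linux host, `remote_ratios(read_distance_row(0), local_node=0)` would report the same ratio for the actual hardware.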
Enter RAM Coffers
RAM Coffers is RustChain's NUMA-aware LLM inference infrastructure. It:
- Detects NUMA topology at startup
- Allocates model weights on the appropriate memory banks
- Pins inference threads to the correct CPU cores
- Reports hardware attestation through the Beacon Protocol
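The first and third steps can be sketched with nothing more than Linux sysfs and the Python standard library. This is a minimal illustration of the general technique, not RAM Coffers' actual implementation. The function names are hypothetical, and the code assumes a Linux host that exposes `/sys/devices/system/node/node<N>/cpulist`.

```python
import os


def parse_cpulist(text: str) -> list[int]:
    """Expand a Linux cpulist string such as '0-3,8-11' into CPU ids."""
    cpus: list[int] = []
    for part in text.strip().split(","):
        if not part:
            continue  # an empty cpulist (memory-only node) yields no CPUs
        if "-" in part:
            first, last = part.split("-")
            cpus.extend(range(int(first), int(last) + 1))
        else:
            cpus.append(int(part))
    return cpus


def cpus_of_node(node: int) -> list[int]:
    """Read the CPUs that belong to one NUMA node from sysfs (Linux only)."""
    with open(f"/sys/devices/system/node/node{node}/cpulist") as handle:
        return parse_cpulist(handle.read())


def pin_current_thread_to_node(node: int) -> None:
    """Restrict the calling thread to the CPUs of one NUMA node (Linux only)."""
    os.sched_setaffinity(0, cpus_of_node(node))


print(parse_cpulist("0-3,8-11"))
```

Pinning before loading weights matters because Linux's default policy places a page on the node of the thread that first touches it, so a thread pinned to node 0 tends to end up with its data in node 0's RAM.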
The result is predictable, optimized inference on server hardware, not just on consumer GPUs.
Why This Matters for Decentralized Inference
The decentralized AI narrative often assumes consumer hardware. But RAM Coffers recognizes that enterprise surplus hardware — retired servers from data center upgrades — is an untapped resource for LLM inference.
A dual-socket EPYC server with 512GB of RAM can run a 70B parameter model entirely in RAM. But only if the memory access is NUMA-aware.
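The arithmetic behind that claim is easy to check. The sketch below estimates the weight footprint of a 70B model at a few precisions and asks whether it fits within a single node's share of memory. It assumes the 512GB is split evenly across the two sockets and ignores the KV cache and runtime overhead. The precisions are illustrative choices, not settings taken from RAM Coffers.

```python
def weight_footprint_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate weight storage in GB (10^9 bytes), ignoring KV cache and overhead."""
    return params_billion * bits_per_weight / 8


def fits_on_one_node(footprint_gb: float, total_ram_gb: float, nodes: int) -> bool:
    """Check whether the weights fit in one node's share, assuming an even split."""
    return footprint_gb <= total_ram_gb / nodes


for bits in (16, 8, 4):
    size = weight_footprint_gb(70, bits)
    print(bits, size, fits_on_one_node(size, total_ram_gb=512, nodes=2))
```

At 16 bits per weight the model needs roughly 140GB, which fits inside one 256GB node under these assumptions. A model that fits on one node can be kept entirely local to the threads reading it, but only if it is placed there deliberately.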
The Proof-of-Antiquity Angle
RAM Coffers ties into RustChain's Proof-of-Antiquity protocol. When you run inference on older hardware:
- Your machine is attested via hardware fingerprint
- The inference work contributes to the network
- You earn RTC tokens for compute contributed
- Your agent identity is recorded on Beacon
This creates an economic incentive to keep surplus hardware in use instead of letting it become e-waste.
Getting Started
If you have access to multi-socket server hardware:
# Check NUMA topology
numactl --hardware
# Install RAM Coffers (from RustChain)
# Run inference with NUMA awareness
ram-coffers start --node-0 --node-1
# Check Beacon registration
curl -sk https://50.28.86.131/beacon/atlas
The setup process auto-detects your NUMA topology and configures the inference engine accordingly.
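For comparison, an inference engine with no NUMA awareness of its own can still be confined to a single node by launching it under `numactl` with its `--cpunodebind` and `--membind` options. The sketch below only builds the argument list. The `./inference-server` command and its `--model` flag are hypothetical placeholders, not a RAM Coffers binary.

```python
import shlex


def numactl_argv(node: int, command: list[str]) -> list[str]:
    """Prefix a command so its CPUs and memory are both bound to one NUMA node."""
    return ["numactl", f"--cpunodebind={node}", f"--membind={node}", *command]


# Hypothetical server command, used purely for illustration.
argv = numactl_argv(0, ["./inference-server", "--model", "model.gguf"])
print(shlex.join(argv))
```

The resulting list can be passed to `subprocess.run(argv)` on a host where `numactl` is installed.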
The Bigger Picture
RAM Coffers is part of a broader philosophy: every piece of silicon has value, and that value should be verifiable. Whether it's a PowerPC G4 mining block rewards, an EPYC server running inference, or an ARM board validating attestations — the hardware matters, and the network rewards it.
Published as part of the RustChain bounty program. Learn more at rustchain.org