I recently coded a mini caching server in Rust, implementing an LRU (Least Recently Used) eviction policy in under 1000 lines of code. By applying zero-copy techniques and careful memory management, I achieved high performance and production-ready stability.
Here is a breakdown of how I built it and why it's so fast.
The Core Design
The heart of the cache is the `Cache` struct. Instead of a classic pointer-based linked list (which in Rust tends to require either unsafe code or `Rc<RefCell<...>>` overhead), I used a vector-based arena-allocator approach.
Key Data Structures
- Zero-Copy with `bytes::Bytes`: I used `bytes::Bytes` for both keys and values. This is a crucial optimization. `Bytes` acts as a reference-counted slice of memory (similar to `Arc<[u8]>`). When we clone a `Key` or `Value`, we aren't copying the underlying data; we are just incrementing a reference count. This makes passing data around extremely cheap (see the small sketch after this list).
- Index-based Linked List: The LRU queue is implemented as a doubly linked list, but nodes are stored in a `Vec<Entry>`. The `prev` and `next` pointers are simply `u32` indices into this vector.
  - Memory Efficiency: No memory fragmentation from allocating individual nodes.
  - Cache Locality: The `Vec` keeps entries close in memory, improving CPU cache hits.
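To make the zero-copy claim concrete, here is a tiny standalone sketch (not part of the cache code itself) showing that cloning a `Bytes` handle does not duplicate the buffer:

```rust
use bytes::Bytes;

fn main() {
    // A 1 MiB buffer wrapped in a reference-counted Bytes handle.
    let original = Bytes::from(vec![0u8; 1024 * 1024]);

    // Cloning only bumps a reference count; no bytes are copied.
    let copy = original.clone();

    // Both handles point at the same underlying memory.
    assert_eq!(original.as_ptr(), copy.as_ptr());
    println!("shared allocation at {:p}", original.as_ptr());
}
```

With those two ideas in place, the core data structures look like this: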
```rust
use std::collections::HashMap;
use bytes::Bytes;

type Key = Bytes;
type Value = Bytes;
type Index = u32;

#[derive(Debug)]
struct Entry {
    key: Key,
    value: Value,
    prev: Option<Index>,
    next: Option<Index>,
}

#[derive(Debug)]
pub struct Cache {
    capacity: usize,
    map: HashMap<Key, Index>, // Maps Key -> Index in the entries vector
    entries: Vec<Entry>,      // The "Arena": all nodes live here
    head: Option<Index>,      // MRU (Most Recently Used)
    tail: Option<Index>,      // LRU (Least Recently Used)
    free: Vec<Index>,         // Reusable slots left behind by evictions
}
```
How It Works
The get Operation: O(1)
When retrieving an item, we look up the index in the HashMap. If found, we move the node to the head of the list (marking it as most recently used) and return a cheap clone of the value.
```rust
pub fn get(&mut self, key: &Key) -> Option<Value> {
    let idx = *self.map.get(key)?;
    self.move_to_head(idx);
    Some(self.entries[idx as usize].value.clone()) // Zero-copy clone!
}
```
The put Operation: O(1)
Inserting a value handles three cases:
- Update: If the key exists, update value and move to head.
- Insert (Full): If capacity is reached, evict the tail (LRU), reuse its slot, and insert the new item at the head.
- Insert (Space Available): Allocate a new slot (or reuse a freed one) and insert at the head.
The eviction logic is particularly clean because we can directly reuse the index from the evicted tail for the new entry, avoiding reallocation.
```rust
pub fn put(&mut self, key: Key, value: Value) {
    // Case 1: key already exists -> update in place and promote to MRU.
    if let Some(&idx) = self.map.get(&key) {
        self.entries[idx as usize].value = value;
        self.move_to_head(idx);
        return;
    }
    // Case 2: cache is full -> evict the LRU tail, freeing its slot.
    if self.map.len() == self.capacity {
        self.evict();
    }
    // Case 3: allocate a slot (reusing a freed one if available) and attach at the head.
    let idx = self.alloc(Entry {
        key: key.clone(),
        value,
        prev: None,
        next: None,
    });
    self.map.insert(key, idx);
    self.attach_head(idx);
}
```
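`evict` and `alloc` are similarly small. Here is a sketch of how they can look, again a reconstruction (the `detach` helper is the one outlined earlier):

```rust
// Inside impl Cache:

/// Remove the least recently used entry (the tail) and recycle its slot.
fn evict(&mut self) {
    if let Some(tail_idx) = self.tail {
        self.detach(tail_idx);
        let key = self.entries[tail_idx as usize].key.clone(); // Cheap Bytes clone
        self.map.remove(&key);
        self.free.push(tail_idx); // Slot is reused by the next insert
    }
}

/// Place an entry into a freed slot if one exists, otherwise grow the arena.
fn alloc(&mut self, entry: Entry) -> Index {
    match self.free.pop() {
        Some(idx) => {
            self.entries[idx as usize] = entry;
            idx
        }
        None => {
            let idx = self.entries.len() as Index;
            self.entries.push(entry);
            idx
        }
    }
}
```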
Benchmarks
I ran a benchmark on my local machine to test the raw throughput of the cache implementation (excluding network overhead). The results are impressive:
Environment: cargo run --release, MacBook Pro (Apple Silicon).
Parameters: 500k Capacity, 1M Operations.
| Operation | Throughput | Total Time |
|---|---|---|
| PUT (Insert) | ~5.38 Million ops/sec | 185ms (for 1M items) |
| GET (Hit) | ~12.09 Million ops/sec | 8.27ms (for 100k items) |
| GET (Miss) | ~6.98 Million ops/sec | 14.33ms (for 100k items) |
The use of Bytes and the index-based approach yields massive performance benefits, making this simple implementation competitive with production-grade systems.
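For context, the numbers above come from a tight loop around the cache itself. A harness along these lines reproduces the setup; the key format and the 100k read window are illustrative choices, not the exact benchmark code:

```rust
use std::time::Instant;
use bytes::Bytes;

fn main() {
    const CAPACITY: usize = 500_000;
    const OPS: usize = 1_000_000;

    let mut cache = Cache::new(CAPACITY);

    // PUT throughput: insert 1M distinct keys into a 500k-capacity cache.
    let start = Instant::now();
    for i in 0..OPS {
        cache.put(
            Bytes::from(format!("key-{}", i)),
            Bytes::from(format!("value-{}", i)),
        );
    }
    let elapsed = start.elapsed();
    println!(
        "PUT: {:.2} Mops/sec ({:?} total)",
        OPS as f64 / elapsed.as_secs_f64() / 1e6,
        elapsed
    );

    // GET (hit) throughput: read back 100k keys that are still resident.
    let start = Instant::now();
    let mut hits = 0;
    for i in (OPS - 100_000)..OPS {
        if cache.get(&Bytes::from(format!("key-{}", i))).is_some() {
            hits += 1;
        }
    }
    let elapsed = start.elapsed();
    println!(
        "GET (hit): {:.2} Mops/sec ({} hits, {:?} total)",
        100_000f64 / elapsed.as_secs_f64() / 1e6,
        hits,
        elapsed
    );
}
```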
The Server
Finally, I wrapped this Cache in an HTTP server using axum. To share the cache across threads, I used Arc<Mutex<Cache>>. While the Mutex introduces some contention, the raw speed of the underlying cache keeps the critical section extremely short.
```rust
use axum::{
    body::Bytes,
    extract::{Path, State},
    routing::get,
    Router,
};
use std::env;
use std::sync::{Arc, Mutex};

type SharedCache = Arc<Mutex<Cache>>;

#[tokio::main]
async fn main() {
    let capacity: usize = env::var("CAPACITY")
        .unwrap_or_else(|_| "100".to_string())
        .parse()
        .expect("CAPACITY must be a number");
    let port = env::var("PORT").unwrap_or_else(|_| "3000".to_string());
    let addr = format!("127.0.0.1:{}", port);

    let shared_cache = Arc::new(Mutex::new(Cache::new(capacity)));

    let app = Router::new()
        .route("/{key}", get(handle_get).put(handle_put))
        .with_state(shared_cache);

    let listener = tokio::net::TcpListener::bind(&addr).await.unwrap();
    println!("Simple Cache Server running on {}", addr);
    axum::serve(listener, app).await.unwrap();
}
```
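The `handle_get` and `handle_put` handlers referenced in the router aren't shown above. A minimal sketch of what they can look like with this state type (my own version; the actual handlers may differ):

```rust
use axum::http::StatusCode;

// GET /{key}: return the cached value, or 404 if it's missing or was evicted.
async fn handle_get(
    Path(key): Path<String>,
    State(cache): State<SharedCache>,
) -> Result<Bytes, StatusCode> {
    let mut cache = cache.lock().unwrap();
    cache.get(&Bytes::from(key)).ok_or(StatusCode::NOT_FOUND)
}

// PUT /{key}: store the raw request body under the key.
async fn handle_put(
    Path(key): Path<String>,
    State(cache): State<SharedCache>,
    body: Bytes,
) -> StatusCode {
    let mut cache = cache.lock().unwrap();
    cache.put(Bytes::from(key), body);
    StatusCode::NO_CONTENT
}
```

With these in place, `curl -X PUT localhost:3000/foo -d 'bar'` stores a value and `curl localhost:3000/foo` reads it back.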
The source code is available on GitHub.