Rust Performance Boost: Building Efficient Caching Systems From Scratch

As a best-selling author, I invite you to explore my books on Amazon. Don't forget to follow me on Medium and show your support. Thank you! Your support means the world!

Caching is one of the most powerful optimization techniques in software development. It's a strategy I've employed countless times to boost application performance by storing computed results for future use. When implementing caching systems, Rust stands out as an exceptional language due to its performance characteristics and safety guarantees.

I've found that Rust's ownership model provides unique advantages when building cache implementations. The strict memory management ensures we don't leak memory in long-lived caches while maintaining high performance. Let me share what I've learned about implementing efficient caching systems in Rust.

Understanding Caching Fundamentals

At its core, caching involves storing expensive computation results to avoid recalculating them. The performance gains come from the time difference between cache retrieval and the original computation. For a cache to be effective, retrieving cached data must be significantly faster than recomputing it.

In my experience, identifying what to cache requires understanding your application's hot paths. These are operations that happen frequently and are computationally expensive. Database queries, API calls, and complex calculations are prime candidates for caching.

Basic Cache Implementation in Rust

The simplest cache implementation uses a HashMap and a synchronization primitive:

use std::collections::HashMap;
use std::hash::Hash;
use std::sync::Mutex;

struct SimpleCache<K, V> {
    store: Mutex<HashMap<K, V>>,
}

impl<K: Eq + Hash + Clone, V: Clone> SimpleCache<K, V> {
    fn new() -> Self {
        SimpleCache {
            store: Mutex::new(HashMap::new()),
        }
    }

    fn get(&self, key: &K) -> Option<V> {
        let store = self.store.lock().unwrap();
        store.get(key).cloned()
    }

    fn insert(&self, key: K, value: V) {
        let mut store = self.store.lock().unwrap();
        store.insert(key, value);
    }
}

This implementation works but has two major limitations: it locks the entire cache for each operation, and it has no eviction policy.
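
In practice I pair a cache like this with a get-or-compute helper. Here's a small usage sketch, assuming the SimpleCache type above is in scope; the expensive_computation function is just a stand-in I'm using for illustration:

fn expensive_computation(n: u64) -> u64 {
    // Stand-in for a slow operation such as a database query or heavy calculation
    (0..n).sum()
}

fn get_or_compute(cache: &SimpleCache<u64, u64>, n: u64) -> u64 {
    // Cache-aside: return the cached value if present, otherwise compute and store it
    if let Some(value) = cache.get(&n) {
        return value;
    }
    let value = expensive_computation(n);
    cache.insert(n, value);
    value
}

fn main() {
    let cache = SimpleCache::new();
    let first = get_or_compute(&cache, 1_000_000);  // computed
    let second = get_or_compute(&cache, 1_000_000); // served from the cache
    assert_eq!(first, second);
}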

Thread-Safe Caching with RwLock

For read-heavy workloads, I recommend using RwLock instead of Mutex. This allows multiple threads to read from the cache simultaneously:

use std::collections::HashMap;
use std::hash::Hash;
use std::sync::RwLock;

struct ReadOptimizedCache<K, V> {
    store: RwLock<HashMap<K, V>>,
}

impl<K: Eq + Hash + Clone, V: Clone> ReadOptimizedCache<K, V> {
    fn new() -> Self {
        ReadOptimizedCache {
            store: RwLock::new(HashMap::new()),
        }
    }

    fn get(&self, key: &K) -> Option<V> {
        let store = self.store.read().unwrap();
        store.get(key).cloned()
    }

    fn insert(&self, key: K, value: V) {
        let mut store = self.store.write().unwrap();
        store.insert(key, value);
    }
}

I've seen significant performance improvements in production systems by using RwLock for caches with high read-to-write ratios.
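
To make the read-heavy case concrete, here's a toy sketch of my own (assuming the ReadOptimizedCache type above is in scope) where several threads hold the read lock at the same time:

use std::sync::Arc;
use std::thread;

fn main() {
    let cache = Arc::new(ReadOptimizedCache::new());
    cache.insert("config", "value");

    // Many readers can acquire the read lock concurrently
    let handles: Vec<_> = (0..8)
        .map(|_| {
            let cache = Arc::clone(&cache);
            thread::spawn(move || {
                assert_eq!(cache.get(&"config"), Some("value"));
            })
        })
        .collect();

    for handle in handles {
        handle.join().unwrap();
    }
}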

Implementing TTL-Based Cache Expiration

Time-to-live (TTL) expiration is crucial for keeping cached data fresh. Here's how I implement it:

use std::collections::HashMap;
use std::hash::Hash;
use std::sync::RwLock;
use std::time::{Duration, Instant};

struct TTLCache<K, V> {
    store: RwLock<HashMap<K, (V, Instant)>>,
    ttl: Duration,
}

impl<K: Eq + Hash + Clone, V: Clone> TTLCache<K, V> {
    fn new(ttl: Duration) -> Self {
        TTLCache {
            store: RwLock::new(HashMap::new()),
            ttl,
        }
    }

    fn get(&self, key: &K) -> Option<V> {
        let store = self.store.read().unwrap();

        if let Some((value, timestamp)) = store.get(key) {
            if timestamp.elapsed() < self.ttl {
                return Some(value.clone());
            }
        }

        None
    }

    fn insert(&self, key: K, value: V) {
        let mut store = self.store.write().unwrap();
        store.insert(key, (value, Instant::now()));
    }

    fn cleanup(&self) {
        let mut store = self.store.write().unwrap();
        store.retain(|_, (_, timestamp)| timestamp.elapsed() < self.ttl);
    }
}

The cleanup method should be called periodically, perhaps by a background thread, to remove expired entries.
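
A minimal sketch of that background thread, assuming the cache is shared behind an Arc (the key and value types here are arbitrary):

use std::sync::Arc;
use std::thread;
use std::time::Duration;

fn start_cleanup_task(cache: Arc<TTLCache<String, String>>, interval: Duration) {
    // Periodically sweep expired entries so the map doesn't grow without bound
    thread::spawn(move || loop {
        thread::sleep(interval);
        cache.cleanup();
    });
}

In a real service you would also want a shutdown signal; this sketch simply loops for the lifetime of the process.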

LRU Cache Implementation

For memory-constrained environments, I often use a Least Recently Used (LRU) cache. The linked_hash_map crate is perfect for this:

use linked_hash_map::LinkedHashMap;
use std::hash::Hash;
use std::sync::Mutex;

struct LRUCache<K, V> {
    store: Mutex<LinkedHashMap<K, V>>,
    capacity: usize,
}

impl<K: Eq + Hash + Clone, V: Clone> LRUCache<K, V> {
    fn new(capacity: usize) -> Self {
        LRUCache {
            store: Mutex::new(LinkedHashMap::with_capacity(capacity)),
            capacity,
        }
    }

    fn get(&self, key: &K) -> Option<V> {
        let mut store = self.store.lock().unwrap();

        if let Some(value) = store.get_refresh(key) {
            return Some(value.clone());
        }

        None
    }

    fn insert(&self, key: K, value: V) {
        let mut store = self.store.lock().unwrap();

        store.insert(key, value);

        if store.len() > self.capacity {
            // Evict the least recently used entry from the front of the list
            store.pop_front();
        }
    }
}

The LinkedHashMap's get_refresh method moves the accessed entry to the end of the internal linked list, making it the most recently used item.
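
A quick toy example of my own shows the eviction order in action:

fn main() {
    let cache = LRUCache::new(2);

    cache.insert("a", 1);
    cache.insert("b", 2);

    // Touching "a" makes it the most recently used entry
    assert_eq!(cache.get(&"a"), Some(1));

    // Inserting a third entry evicts the least recently used key, "b"
    cache.insert("c", 3);
    assert_eq!(cache.get(&"b"), None);
    assert_eq!(cache.get(&"a"), Some(1));
}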

Sharded Cache for High Concurrency

When dealing with high concurrency, lock contention becomes a bottleneck. I've found that sharding the cache across multiple independent segments significantly improves performance:

use std::collections::HashMap;
use std::hash::{Hash, Hasher};
use std::sync::RwLock;

struct ShardedCache<K, V> {
    shards: Vec<RwLock<HashMap<K, V>>>,
    shard_count: usize,
}

impl<K: Eq + Hash + Clone, V: Clone> ShardedCache<K, V> {
    fn new(shard_count: usize) -> Self {
        let mut shards = Vec::with_capacity(shard_count);
        for _ in 0..shard_count {
            shards.push(RwLock::new(HashMap::new()));
        }

        ShardedCache {
            shards,
            shard_count,
        }
    }

    fn get_shard_index(&self, key: &K) -> usize {
        let mut hasher = std::collections::hash_map::DefaultHasher::new();
        key.hash(&mut hasher);
        (hasher.finish() as usize) % self.shard_count
    }

    fn get(&self, key: &K) -> Option<V> {
        let shard_idx = self.get_shard_index(key);
        let shard = &self.shards[shard_idx];

        let store = shard.read().unwrap();
        store.get(key).cloned()
    }

    fn insert(&self, key: K, value: V) {
        let shard_idx = self.get_shard_index(&key);
        let shard = &self.shards[shard_idx];

        let mut store = shard.write().unwrap();
        store.insert(key, value);
    }
}

This approach divides the cache into multiple independent shards, each with its own lock. When a thread accesses the cache, it only needs to acquire a lock for a specific shard rather than the entire cache.
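
Here's a synthetic sketch of the write-heavy case (thread and shard counts picked arbitrarily); because keys hash to different shards, the writers rarely contend on the same lock:

use std::sync::Arc;
use std::thread;

fn main() {
    let cache = Arc::new(ShardedCache::new(16));

    let handles: Vec<_> = (0..8u64)
        .map(|t| {
            let cache = Arc::clone(&cache);
            thread::spawn(move || {
                for i in 0..1_000u64 {
                    // Writers lock only the shard their key hashes to
                    cache.insert(t * 1_000 + i, i);
                }
            })
        })
        .collect();

    for handle in handles {
        handle.join().unwrap();
    }

    assert_eq!(cache.get(&0), Some(0));
}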

Asynchronous Cache Operations

In modern Rust applications using async/await, we need cache implementations that work with asynchronous code. Here's a simple async-compatible cache:

use std::collections::HashMap;
use std::hash::Hash;
use tokio::sync::RwLock;

struct AsyncCache<K, V> {
    store: RwLock<HashMap<K, V>>,
}

impl<K: Eq + Hash + Clone, V: Clone> AsyncCache<K, V> {
    fn new() -> Self {
        AsyncCache {
            store: RwLock::new(HashMap::new()),
        }
    }

    async fn get(&self, key: &K) -> Option<V> {
        let store = self.store.read().await;
        store.get(key).cloned()
    }

    async fn insert(&self, key: K, value: V) {
        let mut store = self.store.write().await;
        store.insert(key, value);
    }
}

This implementation uses Tokio's RwLock, whose lock methods are async, so waiting for the lock yields to the runtime instead of blocking a worker thread.
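
A minimal usage sketch with the Tokio runtime (this assumes the tokio crate with its full feature set, or at least the rt, macros, and sync features):

#[tokio::main]
async fn main() {
    let cache = AsyncCache::new();

    cache.insert("user:42", "Alice".to_string()).await;

    // Awaiting the lock yields to the runtime rather than blocking a thread
    let name = cache.get(&"user:42").await;
    assert_eq!(name, Some("Alice".to_string()));
}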

Combining Caching Strategies

In practice, I've found that combining multiple caching strategies often yields the best results. Here's an example of a cache that uses both TTL and LRU eviction:

use linked_hash_map::LinkedHashMap;
use std::hash::Hash;
use std::sync::RwLock;
use std::time::{Duration, Instant};

struct HybridCache<K, V> {
    store: RwLock<LinkedHashMap<K, (V, Instant)>>,
    ttl: Duration,
    capacity: usize,
}

impl<K: Eq + Hash + Clone, V: Clone> HybridCache<K, V> {
    fn new(ttl: Duration, capacity: usize) -> Self {
        HybridCache {
            store: RwLock::new(LinkedHashMap::with_capacity(capacity)),
            ttl,
            capacity,
        }
    }

    fn get(&self, key: &K) -> Option<V> {
        let mut store = self.store.write().unwrap();

        if let Some((value, timestamp)) = store.get_refresh(key) {
            if timestamp.elapsed() < self.ttl {
                return Some(value.clone());
            } else {
                // Expired item, remove it
                store.remove(key);
                return None;
            }
        }

        None
    }

    fn insert(&self, key: K, value: V) {
        let mut store = self.store.write().unwrap();

        store.insert(key, (value, Instant::now()));

        // Enforce the capacity limit by evicting from the least recently used end
        while store.len() > self.capacity {
            store.pop_front();
        }
    }

    fn cleanup(&self) {
        let mut store = self.store.write().unwrap();

        // Collect expired keys first, then remove them
        let expired: Vec<K> = store
            .iter()
            .filter(|(_, (_, timestamp))| timestamp.elapsed() >= self.ttl)
            .map(|(key, _)| key.clone())
            .collect();

        for key in expired {
            store.remove(&key);
        }
    }
}

This cache limits the number of entries and ensures data freshness through TTL eviction.
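
A brief usage sketch showing both policies at work (the 50 ms TTL is chosen only to keep the example fast):

use std::thread;
use std::time::Duration;

fn main() {
    let cache = HybridCache::new(Duration::from_millis(50), 2);

    // Capacity eviction: inserting a third entry drops the least recently used one
    cache.insert("a", 1);
    cache.insert("b", 2);
    cache.insert("c", 3);
    assert_eq!(cache.get(&"a"), None);

    // TTL eviction: once the TTL passes, the remaining entries expire too
    thread::sleep(Duration::from_millis(60));
    assert_eq!(cache.get(&"b"), None);
    assert_eq!(cache.get(&"c"), None);
}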

Leveraging Existing Cache Libraries

While implementing a custom cache can be educational, I often use established libraries for production code. Moka is an excellent caching library for Rust:

use moka::sync::Cache;
use std::time::Duration;

fn main() {
    // Create a cache with a maximum of 10,000 entries
    // that expire after 30 minutes
    let cache = Cache::builder()
        .max_capacity(10_000)
        .time_to_live(Duration::from_secs(1800))
        .build();

    // Insert a value
    cache.insert("key", "value");

    // Get a value (returns None if not present)
    let value = cache.get(&"key");
    assert_eq!(value, Some("value"));

    // Cache handles eviction and cleanup automatically
}

Moka provides thread-safe caches with automatic eviction, statistics, and high performance through concurrent data structures.
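
Moka can also populate entries on a miss. If I recall the API correctly, the sync cache exposes get_with, which runs the supplied closure only when the key is absent and lets concurrent callers for the same key share a single computation; treat the exact method name as an assumption and check the moka docs for your version:

use moka::sync::Cache;
use std::time::Duration;

fn main() {
    let cache: Cache<String, String> = Cache::builder()
        .max_capacity(10_000)
        .time_to_live(Duration::from_secs(1800))
        .build();

    // The closure runs only when "user:42" is not already cached
    let value = cache.get_with("user:42".to_string(), || {
        // Stand-in for an expensive lookup
        "Alice".to_string()
    });
    assert_eq!(value, "Alice");
}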

Distributed Caching with Rust

For applications that span multiple servers, distributed caching becomes essential. Redis is a popular choice, and the redis crate provides Rust bindings:

use redis::{Client, Commands};

fn main() -> redis::RedisResult<()> {
    let client = Client::open("redis://127.0.0.1/")?;
    let mut con = client.get_connection()?;

    // Set a value with an expiration of 60 seconds
    let _: () = con.set_ex("my_key", "my_value", 60)?;

    // Get a value
    let value: Option<String> = con.get("my_key")?;
    assert_eq!(value, Some("my_value".to_string()));

    Ok(())
}

I've successfully used Redis with Rust in production for shared caching across multiple application instances.

Monitoring and Metrics

A well-designed cache should include instrumentation. I like to track hit rates, miss rates, and eviction counts:

use std::collections::HashMap;
use std::hash::Hash;
use std::sync::RwLock;

struct MonitoredCache<K, V> {
    store: RwLock<HashMap<K, V>>,
    hits: RwLock<u64>,
    misses: RwLock<u64>,
}

impl<K: Eq + Hash + Clone, V: Clone> MonitoredCache<K, V> {
    fn new() -> Self {
        MonitoredCache {
            store: RwLock::new(HashMap::new()),
            hits: RwLock::new(0),
            misses: RwLock::new(0),
        }
    }

    fn get(&self, key: &K) -> Option<V> {
        let store = self.store.read().unwrap();

        match store.get(key) {
            Some(value) => {
                let mut hits = self.hits.write().unwrap();
                *hits += 1;
                Some(value.clone())
            }
            None => {
                let mut misses = self.misses.write().unwrap();
                *misses += 1;
                None
            }
        }
    }

    fn insert(&self, key: K, value: V) {
        let mut store = self.store.write().unwrap();
        store.insert(key, value);
    }

    fn get_stats(&self) -> (u64, u64, f64) {
        let hits = *self.hits.read().unwrap();
        let misses = *self.misses.read().unwrap();
        let total = hits + misses;

        let hit_rate = if total > 0 {
            hits as f64 / total as f64
        } else {
            0.0
        };

        (hits, misses, hit_rate)
    }
}

Monitoring cache performance helps identify when adjustments are needed to cache parameters like TTL or capacity.
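
Reading those counters back is straightforward; a tiny usage sketch:

fn main() {
    let cache = MonitoredCache::new();
    cache.insert("a", 1);

    assert_eq!(cache.get(&"a"), Some(1)); // hit
    assert_eq!(cache.get(&"b"), None);    // miss

    let (hits, misses, hit_rate) = cache.get_stats();
    println!("hits: {}, misses: {}, hit rate: {:.2}", hits, misses, hit_rate);
}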

Performance Considerations

I've learned that several factors affect cache performance:

  1. Cache Size: Larger caches capture more data but consume more memory. I typically tune this based on available RAM and hit rate metrics.

  2. Eviction Policy: LRU works well for most applications, but FIFO or random eviction can be appropriate in specific scenarios.

  3. Serialization: For distributed caches, serialization overhead is significant. Using binary formats like bincode or msgpack typically outperforms JSON (see the sketch after this list).

  4. Lock Contention: As mentioned earlier, sharding the cache is effective for reducing contention.
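
For point 3, here is a rough sketch assuming serde plus the bincode 1.x and serde_json crates; the CachedUser struct is made up for illustration, so measure with your own payloads:

use serde::{Deserialize, Serialize};

#[derive(Serialize, Deserialize, Debug, PartialEq)]
struct CachedUser {
    id: u64,
    name: String,
    roles: Vec<String>,
}

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let user = CachedUser {
        id: 42,
        name: "Alice".to_string(),
        roles: vec!["admin".to_string()],
    };

    // Binary encoding for the cache value...
    let binary = bincode::serialize(&user)?;
    // ...compared with JSON, which is usually larger and slower to parse
    let json = serde_json::to_vec(&user)?;
    println!("bincode: {} bytes, json: {} bytes", binary.len(), json.len());

    let decoded: CachedUser = bincode::deserialize(&binary)?;
    assert_eq!(decoded, user);
    Ok(())
}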

Conclusion

Efficient caching in Rust leverages the language's performance characteristics and safety guarantees. From simple in-memory caches to distributed systems, the principles remain consistent: store data close to where it's used, evict intelligently, and monitor performance.

Rust's ownership model ensures that cached data doesn't lead to memory leaks, while its concurrency primitives enable thread-safe access patterns. Whether you're building a web application, a data processing pipeline, or a low-latency service, effective caching will significantly improve your application's performance.

By understanding the trade-offs between different caching strategies and implementing the appropriate solution for your specific needs, you can achieve substantial performance improvements while maintaining Rust's safety guarantees.


101 Books

101 Books is an AI-driven publishing company co-founded by author Aarav Joshi. By leveraging advanced AI technology, we keep our publishing costs incredibly low—some books are priced as low as $4—making quality knowledge accessible to everyone.

Check out our book Golang Clean Code available on Amazon.

Stay tuned for updates and exciting news. When shopping for books, search for Aarav Joshi to find more of our titles. Use the provided link to enjoy special discounts!

Our Creations

Be sure to check out our creations:

Investor Central | Investor Central Spanish | Investor Central German | Smart Living | Epochs & Echoes | Puzzling Mysteries | Hindutva | Elite Dev | JS Schools


We are on Medium

Tech Koala Insights | Epochs & Echoes World | Investor Central Medium | Puzzling Mysteries Medium | Science & Epochs Medium | Modern Hindutva
