LNATION
LRU::Cache - A Fast Least Recently Used Cache For Perl

Caching is one of those things every application needs eventually. Whether it's DNS lookups, database rows, or computed results, keeping frequently accessed data close avoids expensive recomputation. The classic data structure for this is the Least Recently Used (LRU) cache: a fixed-size store that automatically evicts the oldest unused entry when full.

LRU::Cache is a new CPAN module that implements an LRU cache entirely in C via XS. Every operation — get, set, delete, exists — is O(1). No Perl-level hash ties, no linked list objects, no method dispatch on the hot path if you don't want it.

Quick Start

use LRU::Cache;

my $cache = LRU::Cache::new(1000);

$cache->set('user:42', { name => 'Alice', role => 'admin' });
$cache->set('user:43', { name => 'Bob',   role => 'editor' });

my $user = $cache->get('user:42');   # promotes to front
print $user->{name};                 # "Alice"

The constructor takes a single argument: the maximum number of entries. Once the cache hits capacity, the next set evicts whatever was accessed longest ago.

The Eviction Model

Every get or set promotes an entry to the front of the list. The entry at the tail — the one untouched for the longest — is the first to go when the cache is full.

my $cache = LRU::Cache::new(3);

$cache->set('a', 1);
$cache->set('b', 2);
$cache->set('c', 3);

# Access 'a' — it moves to the front
$cache->get('a');

# Cache is full. This evicts 'b' (the least recently used).
$cache->set('d', 4);

# 'b' is gone
print $cache->get('b');   # undef

After the get('a'), the internal order is a → c → b. Adding d evicts b from the tail.
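To make that eviction discipline concrete, here's a minimal pure-Perl sketch of the same idea — a hash for storage plus a recency array, front = most recently used. This is illustrative only (the `TinyLRU` package is invented for this post); the real LRU::Cache does this in C with O(1) operations, whereas this sketch rescans the recency array on every promotion:

```perl
use strict;
use warnings;

package TinyLRU;

sub new {
    my ($class, $max) = @_;
    return bless { max => $max, data => {}, order => [] }, $class;
}

sub _promote {
    my ($self, $key) = @_;
    # Move the key to the front of the recency list (O(n) here, O(1) in XS)
    @{ $self->{order} } = grep { $_ ne $key } @{ $self->{order} };
    unshift @{ $self->{order} }, $key;
}

sub set {
    my ($self, $key, $value) = @_;
    if (!exists $self->{data}{$key} && @{ $self->{order} } >= $self->{max}) {
        my $evicted = pop @{ $self->{order} };   # tail = least recently used
        delete $self->{data}{$evicted};
    }
    $self->{data}{$key} = $value;
    $self->_promote($key);
}

sub get {
    my ($self, $key) = @_;
    return undef unless exists $self->{data}{$key};
    $self->_promote($key);
    return $self->{data}{$key};
}

package main;

my $cache = TinyLRU->new(3);
$cache->set($_->[0], $_->[1]) for ['a', 1], ['b', 2], ['c', 3];
$cache->get('a');       # order is now a -> c -> b
$cache->set('d', 4);    # full, so 'b' is evicted from the tail
print defined $cache->get('b') ? "hit\n" : "miss\n";   # miss
```

Walking through it reproduces exactly the trace above: after `get('a')` the order is a → c → b, so inserting d pushes b off the tail.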

Peeking Without Promoting

Sometimes you want to check a value without affecting the eviction order. peek does exactly that:

my $cache = LRU::Cache::new(3);
$cache->set('a', 1);
$cache->set('b', 2);
$cache->set('c', 3);

# peek returns the value but doesn't promote
my $val = $cache->peek('a');   # 1

# 'a' is still the oldest
my ($oldest_key) = $cache->oldest;   # 'a'

# Now get('a') promotes it
$cache->get('a');
($oldest_key) = $cache->oldest;      # 'b'

This is useful when you're iterating over cache contents for monitoring or debugging without disturbing the access pattern.

Inspecting the Cache

oldest and newest return both the key and value:

my $cache = LRU::Cache::new(5);
$cache->set('first',  'I was first');
$cache->set('second', 'I was second');
$cache->set('third',  'I was third');

my ($newest_key, $newest_val) = $cache->newest;
# $newest_key = 'third', $newest_val = 'I was third'

my ($oldest_key, $oldest_val) = $cache->oldest;
# $oldest_key = 'first', $oldest_val = 'I was first'

keys returns all keys in most-recently-used order:

$cache->get('first');   # promote 'first'
my @keys = $cache->keys;
# ('first', 'third', 'second')

And delete hands back whatever was removed:

my $deleted = $cache->delete('second');
# $deleted = 'I was second'

The Function-Style API

Method dispatch in Perl isn't free. For a cache that sits in a tight loop, the overhead of $cache->get(...) — which goes through entersub, method resolution, and argument shifting — adds up. LRU::Cache offers a function-style API, optimised with custom OPs, that eliminates this overhead entirely:

use LRU::Cache qw(import);

my $cache = LRU::Cache::new(10_000);

lru_set($cache, 'key', $value);
my $v   = lru_get($cache, 'key');
my $hit = lru_exists($cache, 'key');
my $p   = lru_peek($cache, 'key');
lru_delete($cache, 'key');

These aren't just wrappers. At compile time, Perl's call checker mechanism replaces the entersub op with a custom op that goes directly to the C implementation. No method lookup, no @_ construction, no argument unpacking. The cache pointer is read directly off the pad.

The result is roughly 2x faster than the method-style API, and 5–20x faster than a pure Perl implementation.
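You can get a feel for pure dispatch overhead — in isolation, without LRU::Cache installed — using the core Benchmark module. This micro-benchmark is hypothetical (the `Demo` package and `direct_get` sub are invented for illustration): both variants do the same trivial hash lookup, so any rate difference is the cost of method resolution and call setup:

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

package Demo;
sub new { bless { data => {} }, shift }
sub get { $_[0]{data}{ $_[1] } }

package main;

my $obj = Demo->new;
$obj->{data}{key} = 42;

# Plain sub call: no method resolution, same work otherwise
sub direct_get { $_[0]{data}{ $_[1] } }

# Run each for at least 0.2 CPU seconds and print comparative rates
cmpthese(-0.2, {
    method   => sub { my $v = $obj->get('key') },
    function => sub { my $v = direct_get($obj, 'key') },
});
```

Note this only measures ordinary sub calls; the custom-OP path described above removes even the entersub overhead that `direct_get` still pays.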

Benchmarks

Single-operation rates, no batching loops — each iteration is one cache call:

Operation        Pure Perl      XS Method       XS Function
set (existing)   1,715,893/s    21,445,932/s    45,124,820/s
get (hit)        5,271,976/s    18,302,304/s    33,017,565/s
get (miss)       8,133,594/s    22,125,453/s    46,998,887/s
exists (hit)     7,080,776/s    19,521,882/s    38,745,035/s
peek             6,410,022/s    16,842,291/s    32,437,007/s

A Practical Example: Caching Expensive Lookups

Here's a pattern that comes up constantly — wrapping an expensive operation with a cache:

use LRU::Cache qw(import);

my $dns_cache = LRU::Cache::new(1000);

sub resolve {
    my ($domain) = @_;
    my $ip = lru_get($dns_cache, $domain);
    if (!defined $ip) {
        $ip = expensive_dns_lookup($domain);
        lru_set($dns_cache, $domain, $ip);
    }
    return $ip;
}

The cache transparently sits in front of the expensive call. Hits are served from C-backed memory in constant time. Misses fall through to the real lookup and get cached for next time. When the cache fills up, stale domains are evicted automatically.
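One caveat with the defined-based pattern above: if the expensive lookup can legitimately return undef (say, a domain that doesn't resolve), that result is never treated as cached, and every call repeats the work. Testing existence instead of definedness fixes this. A sketch with a plain hash for portability — `lru_exists` would play the same role with LRU::Cache — where `expensive_dns_lookup` is a hypothetical stand-in:

```perl
use strict;
use warnings;

my %cache;
my $lookups = 0;

# Stand-in for the real expensive call; returns undef on failure
sub expensive_dns_lookup {
    my ($domain) = @_;
    $lookups++;
    return $domain eq 'example.com' ? '93.184.216.34' : undef;
}

sub resolve {
    my ($domain) = @_;
    # Test existence, not definedness, so failed lookups are cached too
    return $cache{$domain} if exists $cache{$domain};
    return $cache{$domain} = expensive_dns_lookup($domain);
}

resolve('nosuch.test') for 1 .. 3;
print "lookups: $lookups\n";   # 1 -- the undef result was cached
```

This "negative caching" matters for workloads like DNS, where failed lookups are often the slowest ones.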

Install

cpanm LRU::Cache
