DEV Community

LNATION

A "switch" for Perl that compiles away: introducing Switch::Declare

Perl has had a complicated relationship with switch.

given/when arrived leaning on smartmatch, and smartmatch turned out to be a pit of surprising behaviour, so the whole feature has been experimental, then discouraged, then warned about for years. Switch::Back is given/when for a post-given/when Perl. Then there is the old Switch.pm, which reached for a source filter, meaning it rewrites your program text before Perl ever sees it. Finally, there are a handful of other implementations on CPAN, including my own Switch::Again.

So most of us just write if/elsif chains and move on. They're honest and fast, but they're also noisy: the scrutinee gets repeated on every branch, and a six-way string dispatch turns into six eq comparisons stacked on top of each other.

Switch::Declare is an attempt to get the nice syntax without any of the historical baggage. The pitch in one line:

a real switch/case keyword that is parsed entirely at compile time,
scoped lexically like a proper pragma, and lowered to the same op tree
as a hand-written if/elsif chain: no source filter, no smartmatch, no runtime dependencies.

use Switch::Declare;

switch ($value) {
    case 200           { handle_ok()    }   # numeric   -> ==
    case "GET"         { handle_get()   }   # string    -> eq
    case /^\d+$/       { all_digits()   }   # regex     -> =~
    case [400 .. 499]  { client_error() }   # range     -> >= && <=
    case ["a","b","c"] { in_set()       }   # list      -> membership
    case \&is_weekend  { weekend()      }   # predicate -> $code->($topic)
    default            { fallback()     }
}

It's an expression, too

The construct yields the value of the arm that ran, so you can use it on the right hand side of an assignment instead of mutating a variable in each branch:

my $label = switch ($status) {
    case 200 { "ok" }
    case 404 { "missing" }
    default  { "other" }
};

The scrutinee ($status here) is evaluated exactly once. The first matching case wins, there is no implicit fall-through, and a trailing default catches everything else. When the switch is used as an expression and nothing matches and there is no default, it yields undef.
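
To make those rules concrete, here is a minimal hand-written sketch of the same logic in plain Perl. It illustrates the semantics described above rather than the module's generated code, and the helper name label_for is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: a hand-written stand-in for the switch expression above.
sub label_for {
    my ($status) = @_;                  # the scrutinee is read once
    return $status == 200 ? "ok"        # first matching arm wins
         : $status == 404 ? "missing"   # no fall-through into later arms
         :                  "other";    # plays the role of `default`
}

print label_for(200), "\n";   # ok
print label_for(404), "\n";   # missing
print label_for(500), "\n";   # other
```

Dropping the final arm and ending the chain with undef would model the no-match, no-default case.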

The patterns are a small, predictable grammar

Rather than try to be clever, case recognises a deliberately tiny set of literal pattern shapes, and each one lowers to the cheapest operator that does the job:

| Pattern          | Example          | Becomes           |
|------------------|------------------|-------------------|
| number literal   | case 200         | numeric ==        |
| string literal   | case "GET"       | string eq         |
| regex            | case /^\d+$/     | $topic =~ /.../   |
| range [LO .. HI] | case [400..499]  | inclusive bounds  |
| list [a, b, c]   | case [1, 2, 3]   | membership (OR)   |
| predicate        | case \&is_even   | $code->($topic)   |

The predicate form takes either a code reference (\&name, package-qualified names work too) or an inline sub { ... } that closes over the surrounding lexicals, which is the escape hatch for anything the literal grammar deliberately doesn't cover:

my $limit = 100;
switch ($n) {
    case sub { $_[0] > $limit } { "over" }
    default                     { "ok"   }
}

Because the grammar is literals rather than arbitrary expressions,
classification is never ambiguous, and the compiler always knows exactly which operator to emit.
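
As a rough illustration of the table, the sketch below writes one test per pattern shape by hand in plain Perl. It only approximates the operators listed there and is not the code the module emits. The names classify_number, classify_string and is_even are hypothetical, and numeric and string topics are kept in separate subs so the sketch does not mix == with non-numeric strings.

```perl
use strict;
use warnings;

sub is_even { $_[0] % 2 == 0 }   # hypothetical predicate

# Hypothetical: hand-written tests for a numeric topic.
sub classify_number {
    my ($n) = @_;
    return "ok"           if $n == 200;                      # number literal -> ==
    return "client error" if $n >= 400 && $n <= 499;         # range [400 .. 499]
    return "small"        if $n == 1 || $n == 2 || $n == 3;  # list [1, 2, 3]
    return "even"         if is_even($n);                    # predicate \&is_even
    return "other";                                          # default
}

# Hypothetical: hand-written tests for a string topic.
sub classify_string {
    my ($s) = @_;
    return "get"    if $s eq "GET";                          # string literal -> eq
    return "digits" if $s =~ /^\d+$/;                        # regex -> =~
    return "in set" if $s eq "a" || $s eq "b" || $s eq "c";  # list ["a","b","c"]
    return "other";                                          # default
}

print classify_number(404), "\n";    # client error
print classify_string("123"), "\n";  # digits
```

Each sub repeats its topic on every line, which is exactly the noise the switch syntax removes.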

Why "compile-time" matters for speed

This is the part I'm most pleased with. switch is installed through Perl's core keyword-plugin and lexer APIs. When the parser reaches the keyword, the module reads the whole construct, builds an op tree for it then and there, and hands that back in place of the keyword. After compilation, nothing of the parser remains — there is no dispatcher subroutine sitting between you and your code at runtime, no per-call wrapper, no closure to invoke.

For the common case — a plain variable or constant scrutinee with
single-expression arms — switch compiles to exactly a hand-written if/elsif (ternary) chain: no temporary, no extra scope, no extra ops. In the bundled benchmark (xt/bench.pl) the two run within measurement noise of each other (0–2%).

Dispatch mode: O(n) chain → O(1) lookup

There's a nice bonus. When a switch is effectively a lookup table (every case maps a string literal to a constant value, and there are at least a handful of arms), the module quietly lowers it to a single hash lookup against a table built once at compile time, instead of walking a chain of eq tests:

# compiles to one hash lookup, not six string comparisons
my $name = switch ($code) {
    case "GET"    { "read"   }
    case "PUT"    { "update" }
    case "POST"   { "create" }
    case "DELETE" { "remove" }
    case "PATCH"  { "modify" }
    case "HEAD"   { "peek"   }
    default       { "?"      }
};

In the benchmark, a 20-arm string switch in dispatch mode runs about 2.5× faster than the equivalent if/elsif chain. You never opt in or out: it's chosen automatically, and it never changes behaviour.
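
For intuition, the hand-written equivalent of that lowering might look like the sketch below: a hash filled once, then a single lookup per call with a fallback for the default arm. This assumes nothing about the module's internal table layout; %NAME_FOR and name_for are hypothetical names.

```perl
use strict;
use warnings;

# Built once, up front: a stand-in for the table the module builds at compile time.
my %NAME_FOR = (
    GET    => "read",
    PUT    => "update",
    POST   => "create",
    DELETE => "remove",
    PATCH  => "modify",
    HEAD   => "peek",
);

# One hash probe replaces six eq comparisons; "?" plays the role of `default`.
sub name_for {
    my ($code) = @_;
    return exists $NAME_FOR{$code} ? $NAME_FOR{$code} : "?";
}

print name_for("POST"), "\n";    # create
print name_for("TRACE"), "\n";   # ?
```

The cost of the lookup stays roughly flat as arms are added, whereas a chain of eq tests grows with the position of the matching arm.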

A real lexical pragma

The switch keyword only exists inside the lexical scope of a
use Switch::Declare, and you can turn it off again with no Switch::Declare:

{
    use Switch::Declare;
    switch ($x) { ... }   # 'switch' is the keyword here
}

switch();                 # ...and an ordinary sub call out here

Outside that scope, switch is just an identifier. So the keyword can't collide with a switch function in unrelated code, and importing the module has no spooky action at a distance.
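
Lexical scoping of this kind is generally built on Perl's hints hash, %^H, as described in perlpragma. The sketch below shows that general mechanism in pure Perl with a hypothetical Demo::Pragma package; it is an assumption about the style of mechanism involved, not a copy of how Switch::Declare registers its keyword, which happens through the C-level keyword-plugin API.

```perl
use strict;
use warnings;

package Demo::Pragma {
    # Hypothetical pragma: import/unimport flip a flag in the hints hash
    # of whatever scope is being compiled at that moment.
    sub import   { $^H{'Demo::Pragma/on'} = 1 }
    sub unimport { $^H{'Demo::Pragma/on'} = 0 }

    # Reports whether the flag was set where the caller's statement was compiled.
    sub in_effect {
        my $hints = (caller(0))[10];   # hints hash captured for the call site
        return $hints && $hints->{'Demo::Pragma/on'} ? 1 : 0;
    }
}

my @seen;
{
    BEGIN { Demo::Pragma->import }             # like `use Demo::Pragma`
    push @seen, Demo::Pragma::in_effect();     # on inside this block
    {
        BEGIN { Demo::Pragma->unimport }       # like `no Demo::Pragma`
        push @seen, Demo::Pragma::in_effect(); # off in the inner block
    }
    push @seen, Demo::Pragma::in_effect();     # on again after the inner block
}
push @seen, Demo::Pragma::in_effect();         # off outside the outer block

print "@seen\n";   # 1 0 1 0
```

The BEGIN blocks stand in for use and no only because the demo package lives in the same file; a real pragma would be loaded from its own .pm file.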

Benchmarking

Numbers help. As I mentioned, years ago I wrote Switch::Again, which solves the same problem from the opposite end: switch LIST builds a closure at run time that matches each call's argument against its keys (via Struct::Match).

Here is the same 20-way string dispatch written both ways:

# Switch::Declare — parsed once at compile time, lowered to a hash lookup
my $n = switch ($key) {
    case "k0" { 0 } case "k1" { 1 } ... case "k19" { 19 }
    default   { -1 }
};

# Switch::Again — dispatcher closure built once, then called
my $sw = switch
    k0  => sub { 0 },
    k1  => sub { 1 },
    ...
    k19 => sub { 19 },
    default => sub { -1 };
my $n = $sw->($key);

The benchmark builds the Switch::Again dispatcher once,
outside the timed loop, so we're comparing steady-state per-call cost — the fairest possible footing for a runtime matcher. Results on perl 5.42:

== 6 string arms (Switch::Declare chain mode) ==
                      Rate   Switch::Again        if/elsif Switch::Declare
Switch::Again     411555/s              --            -97%            -98%
if/elsif        13282260/s           3127%              --            -33%
Switch::Declare 19891605/s           4733%             50%              --

== 20 string arms (Switch::Declare dispatch mode -> hash lookup) ==
                      Rate   Switch::Again  hand %dispatch Switch::Declare
Switch::Again     183089/s              --            -98%            -99%
hand %dispatch  11888675/s           6393%              --            -40%
Switch::Declare 19837923/s          10735%             67%              --

== 3 regex arms ==
                     Rate   Switch::Again Switch::Declare
Switch::Again   1233188/s              --            -82%
Switch::Declare 6736915/s            446%              --

So at six string arms Switch::Declare is about 48× faster than Switch::Again; at twenty arms, where dispatch mode kicks in, it's about 108× faster, and still 67% faster than a hand-rolled %dispatch table, because it returns the matched constant straight out of the hash instead of calling a coderef per hit. Even on regexes, where there's no hash trick to play, it's about 5.5× ahead.
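
The constant-versus-coderef point can be explored with core Perl alone. The sketch below uses the core Benchmark module to compare a hash that stores constants with a hash that stores coderefs. It does not involve Switch::Declare or reproduce xt/bench.pl; the names %CONST, %CODE, via_const and via_code are hypothetical, and the numbers will vary by machine and Perl version.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Two hand-rolled 20-entry tables: key -> constant, and key -> coderef.
my %CONST = map { ("k$_" => $_) } 0 .. 19;
my %CODE  = map { my $n = $_; ("k$n" => sub { $n }) } 0 .. 19;

# Constant table: the value comes straight out of the hash.
sub via_const {
    my ($k) = @_;
    return exists $CONST{$k} ? $CONST{$k} : -1;
}

# Coderef table: every hit pays for an extra sub call.
sub via_code {
    my ($k) = @_;
    my $cb = $CODE{$k};
    return $cb ? $cb->() : -1;
}

my @probe = ("k0", "k7", "k19", "nope");   # three hits and one miss

# Run each variant for at least one CPU second and print a comparison table.
cmpthese(-1, {
    'constant table' => sub { via_const($_) for @probe },
    'coderef table'  => sub { via_code($_)  for @probe },
});
```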

Getting it

cpanm Switch::Declare

Give it a try, and let me know how it goes in the comment section below.
