DEV Community

Vsevolog Grigoriev
Vsevolog Grigoriev

Posted on

1

Elixir: The Alchemy of Code Generation

This article was originally written in Russian and has been translated to English with the help of DeepSeek. You can find the original version here: Habr.

Elixir – a language created to reintroduce Erlang to the modern world. A syntax free of beloved but archaic punctuation marks; a development culture that emphasizes tooling quality and developer comfort; a comprehensive suite for building web services; a standard library unburdened by decades of legacy; and real macros.

At first glance, Elixir doesn’t introduce radically new concepts at its core. Indeed, developers familiar with both Elixir and Erlang can often imagine how code in one language would translate to the other. But not always — Elixir includes constructs that have no direct Erlang equivalents. How do they work? Obviously, Elixir expands them into some additional Erlang code during the compilation phase. Some transformations are intuitive to predict; others (spoiler alert) might reveal surprising compiler tricks.

This article explores the transformations Elixir code undergoes before reaching the Erlang compiler. We’ll dissect conditional constructs like if and cond, unravel the mysteries of the dot operator, trace the adventures of with and for comprehensions, demystify protocols, and marvel at the optimizations Elixir achieves.

Since the final result of the Elixir compiler's work is Erlang Abstract Code — an Erlang syntax tree — it is easy to reconstruct Erlang code from it. The following function will assist us in this process:

@spec to_abstract(module()) :: String.t()
def to_abstract(module) do
  module
  |> :code.get_object_code() # Get the module BEAM code
  |> then(fn {_module, beam, _path} -> beam end)
  |> :beam_lib.chunks([:abstract_code]) # Extract Abstract Code from debug section
  |> then(fn result ->
    {:ok, {_, [abstract_code: {:raw_abstract_v1, abstract_code}]}} = result
    abstract_code
  end)
  |> :erl_syntax.form_list()
  |> :erl_prettypr.format() # Translate Abstract Code back to Erlang
  |> List.to_string()
end
Enter fullscreen mode Exit fullscreen mode

Feel free to use it yourself if the article doesn’t cover topics you’re curious about. The full code is available on GitHub.

A heads-up: There will be plenty of Erlang code ahead, so familiarity with the language will help. But don’t worry if you’ve never encountered Erlang before — no overly complex syntax will appear. A basic grasp of Elixir is all you’ll need to follow along.

Conditional expressions

Let’s start with the basics. Unlike Erlang, which strictly operates with boolean values, Elixir introduces the concept of truthiness, where nil and false are considered falsy, and all other values are truthy.

Accordingly, all expressions that rely on this logic must expand into BEAM-compatible code during compilation:

Kernel.if/2

def if_thing(thing) do
  if thing, do: :thing, else: :other_thing
end
Enter fullscreen mode Exit fullscreen mode
if_thing(_thing@1) ->
  case _thing@1 of
    _@1 when _@1 =:= false orelse _@1 =:= nil -> other_thing;
    _ -> thing
  end.
Enter fullscreen mode Exit fullscreen mode

Kernel.SpecialForms.cond/1

def cond_example do
  cond do
    :erlang.phash2(1) -> 1
    :erlang.phash2(2) -> 2
    :otherwise -> :ok
  end
end
Enter fullscreen mode Exit fullscreen mode
cond_example() ->
  case erlang:phash2(1) of
    _@3 when _@3 /= nil andalso _@3 /= false -> 1;
    _ ->
      case erlang:phash2(2) of
        _@2 when _@2 /= nil andalso _@2 /= false -> 2;
        _ ->
          case otherwise of
            _@1 when _@1 /= nil andalso _@1 /= false -> ok;
            _ -> erlang:error(cond_clause)
          end
        end
  end.
Enter fullscreen mode Exit fullscreen mode

Kernel.!/1

def negate(thing), do: !thing
Enter fullscreen mode Exit fullscreen mode
negate(_thing@1) ->
  case _thing@1 of
    _@1 when _@1 =:= false orelse _@1 =:= nil -> true;
    _ -> false
  end.
Enter fullscreen mode Exit fullscreen mode

Indeed, every conditional expression relying on truthy/falsy evaluation expands into a case statement where the negative clause checks for membership in the falsy category.

But this isn’t always the case. For example:

def if_bool(thing) do
  if is_nil(thing), do: :thing, else: :other_thing

  if thing != nil, do: :thing, else: :other_thing
end
Enter fullscreen mode Exit fullscreen mode
if_bool(_thing@1) ->
  case _thing@1 == nil of
    false -> other_thing;
    true -> thing
  end,
  case _thing@1 /= nil of
    false -> other_thing;
    true -> thing
  end.
Enter fullscreen mode Exit fullscreen mode

This reveals the first optimization implemented by the compiler: if Elixir is certain that the evaluation will only involve boolean values, it generates a case statement

case Value of
  true -> success;
  false -> failure
end
Enter fullscreen mode Exit fullscreen mode

without considering truthiness semantics as in the general case

case Value of
  Value when Value =:= false orelse Value =:= nil -> failure;
  _ -> success
end
Enter fullscreen mode Exit fullscreen mode

The optimization condition is defined here:

case lists:member({optimize_boolean, true}, Meta) andalso elixir_utils:returns_boolean(EExpr) of
  true -> rewrite_case_clauses(Opts);
  false -> Opts
end,
Enter fullscreen mode Exit fullscreen mode

For this optimization to apply, the following conditions must be met:

a) The expression must be marked with the optimize_boolean flag. This flag is automatically set by the compiler for if, !, !!, and, and or expressions. !! represents another optimization: collapsing double negation into a truthiness check. And while and/or operate strictly on booleans, the optimize_boolean flag avoids generating a third case clause that would raise a BadBooleanError.

b) Elixir must be able to determine that the operation will return a boolean. This primarily occurs when using operators or is_ guards from the :erlang module, but more complex cases are also analyzed. For example if the compiler observes that every clause in a case or cond returns boolean values, it infers that the entire expression must return booleans.

The full logic can be seen here. Below is an excerpt of it:

returns_boolean(Bool) when is_boolean(Bool) -> true;

returns_boolean({{'.', _, [erlang, Op]}, _, [_]}) when Op == 'not' -> true;

returns_boolean({{'.', _, [erlang, Op]}, _, [_, _]}) when
  Op == 'and'; Op == 'or'; Op == 'xor';
  Op == '==';  Op == '/='; Op == '=<';  Op == '>=';
  Op == '<';   Op == '>';  Op == '=:='; Op == '=/=' -> true;

returns_boolean({{'.', _, [erlang, Op]}, _, [_, Right]}) when
  Op == 'andalso'; Op == 'orelse' ->
  returns_boolean(Right);

returns_boolean({{'.', _, [erlang, Fun]}, _, [_]}) when
  Fun == is_atom;   Fun == is_binary;   Fun == is_bitstring; Fun == is_boolean;
  Fun == is_float;  Fun == is_function; Fun == is_integer;   Fun == is_list;
  Fun == is_number; Fun == is_pid;      Fun == is_port;      Fun == is_reference;
  Fun == is_tuple;  Fun == is_map;      Fun == is_process_alive -> true;
...
Enter fullscreen mode Exit fullscreen mode

Key-based access

A modern convenience absent in Erlang is dedicated syntax for key-based access in data structures.

Elixir provides square bracket access [], which simply compiles to a call to Access.get/3:

def brackets(data) do
  data[:field]
end
Enter fullscreen mode Exit fullscreen mode
brackets(_data@1) ->
  'Elixir.Access':get(_data@1, field).
Enter fullscreen mode Exit fullscreen mode

Then there’s also dot notation access, which works exclusively for maps with atom keys, and here’s where things get more interesting:

def dot(map) when is_map(map) do
  map.field
end
Enter fullscreen mode Exit fullscreen mode
dot(_map@1) when erlang:is_map(_map@1) ->
  case _map@1 of
    #{field := _@2} -> _@2;
    _@2 ->
      case elixir_erl_pass:no_parens_remote(_@2, field) of
        {ok, _@1} -> _@1;
        _ -> erlang:error({badkey, field, _@2})
      end
  end;
Enter fullscreen mode Exit fullscreen mode

Instead of compiling directly into, say, erlang:map_get/2, this notation expands into two nested case statements.
The first clause retrieves the value from the map, while the nested case is a consequence of earlier design decisions.

The issue stems from Elixir allowing parentheses to be omitted in function calls:

iex(1)> DateTime.utc_now
~U[2025-02-17 12:35:39.575764Z]
Enter fullscreen mode Exit fullscreen mode

This also applies when the module is determined at runtime:

iex(2)> mod = DateTime
DateTime
iex(3)> mod.utc_now
warning: using map.field notation (without parentheses) to invoke function DateTime.utc_now() is deprecated, you must add parentheses instead: remote.function()
  (elixir 1.18.2) src/elixir.erl:386: :elixir.eval_external_handler/3
  (stdlib 6.2) erl_eval.erl:919: :erl_eval.do_apply/7
  (stdlib 6.2) erl_eval.erl:479: :erl_eval.expr/6
  (elixir 1.18.2) src/elixir.erl:364: :elixir.eval_forms/4
  (elixir 1.18.2) lib/module/parallel_checker.ex:120: Module.ParallelChecker.verify/1
~U[2025-02-17 12:36:23.248233Z]
Enter fullscreen mode Exit fullscreen mode

Notice that the notation mod.utc_now is ambiguous — it could be a function call or a key access. As a result, Elixir must generate code that checks at runtime whether the value being accessed is a function or a map.

Starting with this commit, a warning is now emitted, but the code still works.

Interestingly, the inverse scenario also requires additional logic:

def dot(module) when is_atom(module) do
  module.function()
end
Enter fullscreen mode Exit fullscreen mode
dot(_module@1) when erlang:is_atom(_module@1) ->
  case _module@1 of
    #{function := _@2} ->
      elixir_erl_pass:parens_map_field(function, _@2);
    _@2 -> _@2:function()
  end.
Enter fullscreen mode Exit fullscreen mode

Because map key access can also end with parentheses:

iex(1)> map = %{field: :value}
%{field: :value}
iex(2)> map.field()
warning: using module.function() notation (with parentheses) to fetch map field :field is deprecated, you must remove the parentheses: map.field
  (elixir 1.18.2) src/elixir.erl:386: :elixir.eval_external_handler/3
  (stdlib 6.2) erl_eval.erl:919: :erl_eval.do_apply/7
  (elixir 1.18.2) src/elixir.erl:364: :elixir.eval_forms/4
  (elixir 1.18.2) lib/module/parallel_checker.ex:120: Module.ParallelChecker.verify/1
  (iex 1.18.2) lib/iex/evaluator.ex:336: IEx.Evaluator.eval_and_inspect/3

:value
Enter fullscreen mode Exit fullscreen mode

Erlang to the Rescue!

The Erlang compiler optimizes code more aggressively, leveraging type information known at compile time. It can eliminate both case clauses entirely if confident the value is a map.

The simplest way to provide this type information is via function annotations:

def function(data) when is_map(data)
Enter fullscreen mode Exit fullscreen mode

or

def function(%{} = data)
Enter fullscreen mode Exit fullscreen mode

As a bonus, we’ll explore how this redundant code generation impacts the final program’s performance at the end of the article. Stay tuned 👍

With

Kernel.SpecialForms.with/1 is, in my opinion, one of the most valuable additions Elixir has brought to the BEAM ecosystem. So much so that Erlang adopted it almost verbatim, renaming it to maybe. One might hope that with could eventually compile directly to maybe in future compiler versions, but there’s a critical difference: maybe does not support guards in its pattern matches. As a result, Elixir must continue to manually translate with into lower-level constructs, both now and for the foreseeable future:

The example below represents an anti-pattern, but we need this specific code to examine with clauses containing else

def with_else(map) do
  with {_, {:ok, data}} <- {:map, Map.fetch(map, "data")},
       {_, {int, ""}} <- {:int, Integer.parse(data)} do
    int
  else
    {:map, :error} -> {:error, :no_data}
    {:int, _} -> {:error, :not_an_int}
  end
end
Enter fullscreen mode Exit fullscreen mode
with_else(_map@1) ->
  _@2 = fun ({map, error}) -> {error, no_data};
            ({int, _}) -> {error, not_an_int};
            (_@1) -> erlang:error({else_clause, _@1})
        end,
  case {map, maps:find(<<"data">>, _map@1)} of
    {_, {ok, _data@1}} ->
      case {int, 'Elixir.Integer':parse(_data@1)} of
        {_, {_int@1, <<>>}} -> _int@1;
        _@3 -> _@2(_@3)
      end;
    _@3 -> _@2(_@3)
  end.
Enter fullscreen mode Exit fullscreen mode

Here, the compiler extracts the else clauses into a separate lambda, passing values from mismatched case patterns into it. Interestingly, as recently as Elixir v1.16.3, the generated code was far more verbose:

with_else(_map@1) ->
  case {map, maps:find(<<"data">>, _map@1)} of
    {_, {ok, _data@1}} ->
      case {int, 'Elixir.Integer':parse(_data@1)} of
        {_, {_int@1, <<>>}} -> _int@1;
        _@1 ->
          case _@1 of
            {map, error} -> {error, no_data};
            {int, _} -> {error, not_an_int};
            _@2 -> erlang:error({else_clause, _@2})
          end
        end;
    _@1 ->
      case _@1 of
        {map, error} -> {error, no_data};
        {int, _} -> {error, not_an_int};
        _@2 -> erlang:error({else_clause, _@2})
      end
  end.
Enter fullscreen mode Exit fullscreen mode

In earlier versions, the compiler duplicated all else clauses for each branching path. While the Erlang compiler (to which Elixir delegates code generation) could theoretically optimize this by eliminating redundant clauses, such analysis was neither consistent nor universally applied.

As an aside, a with expression without an else clause compiles into straightforward, unnuanced code:

def without_else(map) do
  with {:ok, data} <- fetch_data(map),
       {:ok, int} <- parse_int(data) do
    int
  end
end
Enter fullscreen mode Exit fullscreen mode
without_else(_map@1) ->
  case fetch_data(_map@1) of
    {ok, _data@1} ->
      case parse_int(_data@1) of
        {ok, _int@1} -> _int@1;
        _@1 -> _@1
      end;
    _@1 -> _@1
  end.
Enter fullscreen mode Exit fullscreen mode

For

Kernel.SpecialForms.for/1 is arguably the most intricate syntactic sugar in Elixir. Personally, I avoid using it when possible, though some may find it appealing.

A basic for comprehension compiles into an Enum.map:

def basic do
  for i <- 1..5 do
    i * 1
  end
end
Enter fullscreen mode Exit fullscreen mode
basic() ->
  'Elixir.Enum':map(
    #{'__struct__' => 'Elixir.Range', first => 1, last => 5, step => 1},
    fun (_i@1) -> _i@1 * 1 end).
Enter fullscreen mode Exit fullscreen mode

When a filter is added, it expands into a reduce call. Notably, the result is reversed via lists:reverse/1 instead of Enum.reverse/1, saving a function call.

def filter do
  for i <- 1..10, div(i, 2) == 0 do
    i * 1
  end
end
Enter fullscreen mode Exit fullscreen mode
filter() ->
  lists:reverse(
    'Elixir.Enum':reduce(
      #{'__struct__' => 'Elixir.Range', first => 1, last => 10, step => 1},
      [],
      fun (_i@1, _@1) -> 
        case _i@1 div 2 == 0 of 
          true -> [_i@1 * 1 | _@1];
          false -> _@1
        end
      end)).
Enter fullscreen mode Exit fullscreen mode

Pattern-matching clauses are lifted into the head of the generated function. However, guards are expanded into nested case statements — a curious choice, given that guards here are subject to the same limitations as anywhere else.

def match do
  users = [user: "john", admin: "meg", guest: "barbara"]

  for {type, name} when type != :guest <- users do
    String.upcase(name)
  end
end
Enter fullscreen mode Exit fullscreen mode
match() ->
  _users@1 = [{user, <<"john">>},
              {admin, <<"meg">>},
              {guest, <<"barbara">>}],
  lists:reverse(
    'Elixir.Enum':reduce(
      _users@1,
      [],
      fun
        ({_type@1, _name@1}, _@1) ->
          case _type@1 /= guest of
            true -> ['Elixir.String':upcase(_name@1) | _@1];
            false -> _@1 end;
        (_, _@1) -> _@1
      end)).
Enter fullscreen mode Exit fullscreen mode

An optimization worth knowing about: if the compiler detects that the result of for expression is not used, it generates code that skips collecting it. For instance, while a basic for compiles to map, the same for that is not assigned to anything compiles to reduce with a nil accumulator:

def no_collect do
  for i <- 1..5 do
    i
  end

  :ok
end
Enter fullscreen mode Exit fullscreen mode
no_collect() ->
    'Elixir.Enum':reduce(#{'__struct__' => 'Elixir.Range',
                           first => 1, last => 5, step => 1},
                         [],
                         fun (_i@1, _@1) -> begin _i@1, nil end end),
    ok.
Enter fullscreen mode Exit fullscreen mode

Finally, when uniq: true is used, the compiler additionally stores values in MapSet to check them for uniqueness:

def unique do
  for i <- 1..10, uniq: true do
    i
  end
end
Enter fullscreen mode Exit fullscreen mode

I split the code into intermediate variables for clarity; the original code is written in a single line.

unique() ->
  Range = #{'__struct__' => 'Elixir.Range', first => 1, last => 10, step => 1},

  Function = fun (Elem, Acc) ->
    {List, MapSet} = Acc,
    Key = Elem,
    case MapSet of 
      #{Key := true} -> {List, MapSet};
      #{} -> {[Key | List], MapSet#{Key => true}}
    end
  end,

  Result = 'Elixir.Enum':reduce(Range, {[], #{}}, Function),
  lists:reverse(erlang:element(1, Result)).
Enter fullscreen mode Exit fullscreen mode

Protocols

To explore protocols, let’s use an example from the documentation:

defmodule ElixirJourney.Protocols do
  defprotocol Size do
    def size(data)
  end

  defimpl Size, for: BitString do
    def size(binary), do: byte_size(binary)
  end

  defimpl Size, for: Map do
    def size(map), do: map_size(map)
  end

  defimpl Size, for: Tuple do
    def size(tuple), do: tuple_size(tuple)
  end

  def protocol(value) do
    Size.size(value)
  end
end
Enter fullscreen mode Exit fullscreen mode

Invoking a protocol implementation is straightforward — it’s simply a function call to the protocol’s module (defprotocol and each defimpl generate their own modules):

-module('Elixir.ElixirJourney.Protocols').

...

protocol(_value@1) ->
  'Elixir.ElixirJourney.Protocols.Size':size(_value@1).
Enter fullscreen mode Exit fullscreen mode

Implementation modules are also compiled into expected forms, with two additions:

  • A behaviour declaration referencing the protocol
  • A meta-function __impl__ that provides metadata about the implementation (e.g., target protocol and module)
-module('Elixir.ElixirJourney.Protocols.Size.Map').

...

-behaviour('Elixir.ElixirJourney.Protocols.Size').

...

'__impl__'(for) -> 'Elixir.Map';
'__impl__'(protocol) ->
    'Elixir.ElixirJourney.Protocols.Size'.

size(_map@1) -> erlang:map_size(_map@1).
Enter fullscreen mode Exit fullscreen mode

The core logic resides in the protocol module itself.

The implementation varies depending on whether protocols are consolidated or not (via the :consolidate_protocols Mix flag). Let’s examine the consolidated case first:

-module('Elixir.ElixirJourney.Protocols.Size').

-behaviour('Elixir.Protocol').

-export_type([t/0]).
-type t() :: term().

-callback size(t()) -> term().
Enter fullscreen mode Exit fullscreen mode

The protocol module declares itself as a behaviour, with protocol functions defined as callbacks. By default, callbacks use term() type specs, but these can be refined in defprotocol:

defprotocol Size do
  @type t :: bitstring() | map() | tuple()

  @spec size(t()) :: non_neg_integer()
  def size(data)
end
Enter fullscreen mode Exit fullscreen mode
-export_type([t/0]).
-type t() :: bitstring() | map() | tuple().

-callback size(t()) -> non_neg_integer().
Enter fullscreen mode Exit fullscreen mode

The meta-function __protocol__ is generated, enabling runtime inspection of protocol details:

'__protocol__'(module) -> 'Elixir.ElixirJourney.Protocols.Size';
'__protocol__'(functions) -> [{size, 1}];
'__protocol__'('consolidated?') -> true;
'__protocol__'(impls) -> {consolidated, ['Elixir.BitString', 'Elixir.Map', 'Elixir.Tuple']}.
Enter fullscreen mode Exit fullscreen mode

Protocol function calls look like this:

size(_@1) -> ('impl_for!'(_@1)):size(_@1).
Enter fullscreen mode Exit fullscreen mode

where impl_for! ensures an error is raised for unimplemented types:

'impl_for!'(_@1) ->
  case impl_for(_@1) of
    _@2 when _@2 =:= false orelse _@2 =:= nil ->
      erlang:error(
        'Elixir.Protocol.UndefinedError':exception(
          [
            {protocol, 'Elixir.ElixirJourney.Protocols.Size'},
            {value, _@1},
            {description, <<>>}
          ]));
    _@3 -> _@3
  end.
Enter fullscreen mode Exit fullscreen mode

The actual implementation selection happens in impl_for:

impl_for(#{'__struct__' := _@1})
    when erlang:is_atom(_@1) ->
    struct_impl_for(_@1);
impl_for(_@1) when erlang:is_tuple(_@1) ->
    'Elixir.ElixirJourney.Protocols.Size.Tuple';
impl_for(_@1) when erlang:is_map(_@1) ->
    'Elixir.ElixirJourney.Protocols.Size.Map';
impl_for(_@1) when erlang:is_bitstring(_@1) ->
    'Elixir.ElixirJourney.Protocols.Size.BitString';
impl_for(_) -> nil.

struct_impl_for(_) -> nil.
Enter fullscreen mode Exit fullscreen mode

if the protocol had struct implementations, they would be enumerated in struct_impl_for

And that's it. During consolidation, the compiler gathers all protocol implementations across the project and generates a consolidated impl_for dispatch table.

If consolidation is disabled, the compiler generates a runtime impl_for check for every possible type, including dynamic struct name resolution via struct_impl_for:

impl_for(#{'__struct__' := _@2 = _@1})
  when erlang:is_atom(_@2) ->
  struct_impl_for(_@1);
impl_for(_@1) when erlang:is_tuple(_@1) ->
  case 'Elixir.Code':ensure_compiled('Elixir.ElixirJourney.Protocols.Size.Tuple') of
    {module, _@2} -> _@2;
    {error, _} -> nil
  end;
impl_for(_@1) when erlang:is_atom(_@1) ->
  case 'Elixir.Code':ensure_compiled('Elixir.ElixirJourney.Protocols.Size.Atom') of
    {module, _@2} -> _@2;
    {error, _} -> nil
  end;
impl_for(_@1) when erlang:is_list(_@1) ->
  case 'Elixir.Code':ensure_compiled('Elixir.ElixirJourney.Protocols.Size.List') of
    {module, _@2} -> _@2;
    {error, _} -> nil
  end;

...

struct_impl_for(_@1) ->
  case 'Elixir.Code':ensure_compiled('Elixir.Module':concat('Elixir.ElixirJourney.Protocols.Size', _@1)) of
    {module, _@2} -> _@2;
    {error, _} -> nil
  end.
Enter fullscreen mode Exit fullscreen mode

That is why unconsolidated protocols allow dynamic implementation injection at runtime but incur performance overhead.

String interpolation

Let's briefly examine string interpolation:

def interpolation do
  "This #{:will} #{"be"} #{[97]} #{"str" <> "ing"}"
end
Enter fullscreen mode Exit fullscreen mode
interpolation() ->
  <<"This ",
    case will of
      _@1 when erlang:is_binary(_@1) -> _@1;
      _@1 -> 'Elixir.String.Chars':to_string(_@1)
    end/binary,
    " ", "be", " ",
    case [97] of
      _@2 when erlang:is_binary(_@2) -> _@2;
      _@2 -> 'Elixir.String.Chars':to_string(_@2)
    end/binary,
    " ",
    case <<"str", "ing">> of
      _@3 when erlang:is_binary(_@3) -> _@3;
      _@3 -> 'Elixir.String.Chars':to_string(_@3)
    end/binary>>.
Enter fullscreen mode Exit fullscreen mode

Each value within curly braces is converted into a call to the String.Chars protocol. An interesting detail here is that Elixir does not fully rely on the protocol. Instead, it generates a separate case statement with an early return for scenarios where the value is already a string.

It appears Elixir anticipates that the Erlang compiler might optimize away the entire case statement in certain situations, if the compiler can determine that the value will always be a string.

By the way Elixir itself is capable of such optimization, but only in the simplest case — when a string literal is passed directly, like the "be" string in our example.

Compile-time computations

Here we encounter a fresh set of optimizations (introduced in 2024) that I stumbled upon by accident. In Elixir, we are accustomed to compile-time computations being as accessible as runtime ones. If you want something to be calculated during compilation, you simply move it out of a function into the module body and attach the result to an @attribute.

Now, however, Elixir additionally maintains a list of pure functions, based on which it can automatically hoist computations of these functions to compile time, regardless of their location in the code.

For Elixir v1.18.2, this list looks like this:

inline_pure_function('Elixir.Duration', 'new!') -> true;
inline_pure_function('Elixir.MapSet', new) -> true;
inline_pure_function('Elixir.String', length) -> true;
inline_pure_function('Elixir.String', graphemes) -> true;
inline_pure_function('Elixir.String', codepoints) -> true;
inline_pure_function('Elixir.String', split) -> true;
inline_pure_function('Elixir.Kernel', to_timeout) -> true;
inline_pure_function('Elixir.URI', new) -> true;
inline_pure_function('Elixir.URI', 'new!') -> true;
inline_pure_function('Elixir.URI', parse) -> true;
inline_pure_function('Elixir.URI', encode_query) -> true;
inline_pure_function('Elixir.URI', encode_www_form) -> true;
inline_pure_function('Elixir.URI', decode) -> true;
inline_pure_function('Elixir.URI', decode_www_for) -> true;
inline_pure_function('Elixir.Version', parse) -> true;
inline_pure_function('Elixir.Version', 'parse!') -> true;
inline_pure_function('Elixir.Version', parse_requirement) -> true;
inline_pure_function('Elixir.Version', 'parse_requirement!') -> true;
inline_pure_function(_Left, _Right) -> false.
Enter fullscreen mode Exit fullscreen mode

With this mechanism, constructing a new MapSet can immediately resolve to a precomputed structure:

set = MapSet.new([1, 2, 3])
Enter fullscreen mode Exit fullscreen mode
_set@1 = #{'__struct__' => 'Elixir.MapSet', map => #{1 => [], 2 => [], 3 => []}}
Enter fullscreen mode Exit fullscreen mode

Similarly, calculating the length of a string can be "baked" directly into the module as a literal value:

length = String.length("static string")
Enter fullscreen mode Exit fullscreen mode
_length@1 = 13
Enter fullscreen mode Exit fullscreen mode

These optimizations are somewhat short-sighted, relying on functions being called with literals. For example, a scenario like this:

length = String.length("dynamic" <> " " <> "string")
Enter fullscreen mode Exit fullscreen mode

would not be optimized

_length@2 = 'Elixir.String':length(<<"dynamic", " ", "string">>),
Enter fullscreen mode Exit fullscreen mode

If a computation returns an error, that error is faithfully baked into the compiled code:

version = Version.parse("static invalid")
Enter fullscreen mode Exit fullscreen mode
_version@1 = error,
Enter fullscreen mode Exit fullscreen mode

However, if the computation raises an exception, the compiler leaves the code untouched, ensuring the exception is thrown at runtime as normal:

version = Version.parse!("static invalid")
Enter fullscreen mode Exit fullscreen mode
_version@2 = 'Elixir.Version':'parse!'(<<"static invalid">>)
Enter fullscreen mode Exit fullscreen mode

Beyond the generic list of pure functions, the compiler includes several special-case optimizations. For example, converting the offset argument in shift functions from the Date, DateTime, NaiveDateTime, and Time modules into a Duration structure:

shifted = Date.shift(~D[2025-01-01], day: 1)
Enter fullscreen mode Exit fullscreen mode
_shifted@1 = 'Elixir.Date':shift(
  #{'__struct__' => 'Elixir.Date', calendar => 'Elixir.Calendar.ISO', year => 2025, month => 1, day => 1},
  #{
    '__struct__' => 'Elixir.Duration', day => 1, hour => 0, microsecond => {0, 0},
    minute => 0, month => 0, second => 0, week => 0, year => 0
  }
)
Enter fullscreen mode Exit fullscreen mode

Since these optimizations rely solely on module and function names, we can fool around a bit by replacing a built-in module:

defmodule Version do
  def parse(version) do
    IO.puts(version)
    version
  end 
end

defmodule M do
  def run do
    Version.parse("This will be printed at compiletime")
  end
end
Enter fullscreen mode Exit fullscreen mode

Elixir will complain about the replacement, but will execute the code nonetheless:

❯ elixir script.exs
    warning: redefining module Version (current version loaded from /usr/lib/elixir/lib/elixir/ebin/Elixir.Version.beam)
    │
  1 │ defmodule Version do
    │ ~~~~~~~~~~~~~~~~~~~~
    │
    └─ script.exs:1: Version (module)

This will be printed at compiletime
Enter fullscreen mode Exit fullscreen mode

Bonus: "fixing" dot access

As promised, here's a small bonus. If you’ve already forgotten the context: the compiler is forced to generate additional code that dynamically interprets data.field access due to syntax ambiguity. This extra code selects how to handle the expression based on the value at runtime:

def dot(map) when is_map(map) do
  map.field
end

def dot(module) when is_atom(module) do
  module.function()
end
Enter fullscreen mode Exit fullscreen mode
dot(_map@1) when erlang:is_map(_map@1) ->
  case _map@1 of
    #{field := _@2} -> _@2;
    _@2 ->
      case elixir_erl_pass:no_parens_remote(_@2, field) of
        {ok, _@1} -> _@1;
        _ -> erlang:error({badkey, field, _@2})
      end
  end;
dot(_module@1) when erlang:is_atom(_module@1) ->
  case _module@1 of
    #{function := _@2} ->
      elixir_erl_pass:parens_map_field(function, _@2);
    _@2 -> _@2:function()
  end.
Enter fullscreen mode Exit fullscreen mode

First, let’s measure how this affects performance:

Mix.install([:benchee])

# Hide the literal behind runtime computation, otherwise BEAM will be able to "bake" the value when accessing with map_get
map = Enum.random([%{name: "John Doe", age: 30, email: "john.doe@example.com"}])

Benchee.run(
  %{
    dot: fn ->
      name = map.name
      age = map.age
      email = map.email

      {name, age, email}
    end,
    pattern_match: fn ->
      %{name: name, age: age, email: email} = map

      {name, age, email}
    end,
    fetch!: fn ->
      name = Map.fetch!(map, :name)
      age = Map.fetch!(map, :age)
      email = Map.fetch!(map, :email)

      {name, age, email}
    end
  },
  measure_function_call_overhead: true
)
Enter fullscreen mode Exit fullscreen mode

We’ll compare three approaches: dot access, pattern-matching all three fields at once, and Map.fetch! (which, through collaboration between the Elixir and Erlang compilers, ultimately compiles down to :erlang.map_get).

Intuitively, I’d expect pattern-matching to be fastest since in a dynamically typed environment it only needs to typecheck the value once.

But the results surprised me:

Operating System: Linux
CPU Information: Intel(R) Core(TM) i7-8565U CPU @ 1.80GHz
Number of Available Cores: 8
Available memory: 15.31 GB
Elixir 1.18.2
Erlang 27.2.1
JIT enabled: true

Benchmark suite executing with the following configuration:
warmup: 2 s
time: 5 s
memory time: 0 ns
reduction time: 0 ns
parallel: 1
inputs: none specified
Estimated total run time: 21 s

Measured function call overhead as: 19 ns
Benchmarking dot ...
Benchmarking fetch! ...
Benchmarking pattern_match ...
Calculating statistics...
Formatting results...

Name                    ips        average  deviation         median         99th %
fetch!              18.55 M       53.90 ns ±56533.65%          15 ns          22 ns
pattern_match       18.47 M       54.15 ns ±59289.07%          13 ns          20 ns
dot                 18.27 M       54.73 ns ±56682.74%          14 ns          18 ns

Comparison: 
fetch!              18.55 M
pattern_match       18.47 M - 1.00x slower +0.25 ns
dot                 18.27 M - 1.02x slower +0.82 ns
Enter fullscreen mode Exit fullscreen mode

Microbenchmarks like this should always be taken with a grain of salt. The values we are measuring here are minuscule, and any fluctuation: hardware interrupts, kernel scheduler activity and lots of other things — can introduce massive distortions.

However, across multiple runs, the results remained consistent. fetch! and pattern_match stayed neck-and-neck, while dot lagged slightly behind (by single-digit percentages). It seems the extra code barely impacts execution time.

Still, let’s imagine the compiler didn’t generate that extra code. Would it shrink the final binary?

To test this, we dive into the compiler internals. The dot operator, like all special forms, is expanded into Erlang Abstract Code in lib/elixir/src/elixir_erl_pass.erl.

Even without deep familiarity with the compiler’s internals or Erlang’s abstract code representation, it’s clear that this code handles the data.field case (without parentheses):

TError = {tuple, Ann, [{atom, Ann, badkey}, TRight, TVar]},
{{'case', Generated, TLeft, [
  {clause, Generated,
    [{map, Ann, [{map_field_exact, Ann, TRight, TVar}]}],
    [],
    [TVar]},
  {clause, Generated,
    [TVar],
    [],
    [{'case', Generated, ?remote(Generated, elixir_erl_pass, no_parens_remote, [TVar, TRight]), [
      {clause, Generated,
       [{tuple, Generated, [{atom, Generated, ok}, TInnerVar]}], [], [TInnerVar]},
      {clause, Generated,
       [{var, Generated, '_'}], [], [?remote(Ann, erlang, error, [TError])]}
    ]}]}
]}, SV};
Enter fullscreen mode Exit fullscreen mode

A separate branch exists to handle data.field(), but the logic is similar.

Our goal is to eliminate branching. For data.field(), we want a direct function call to field function of data module, and for data.field, a direct :erlang.map_get(field, data). map_get fits perfectly here as it also raises the same KeyError for missing keys.

The following patch achieves exactly this:

diff --git a/lib/elixir/src/elixir_erl_pass.erl b/lib/elixir/src/elixir_erl_pass.erl
index f1c13ca24..7b3358011 100644
--- a/lib/elixir/src/elixir_erl_pass.erl
+++ b/lib/elixir/src/elixir_erl_pass.erl
@@ -237,40 +237,12 @@ translate({{'.', _, [Left, Right]}, Meta, []}, _Ann, S)
   TRight = {atom, Ann, Right},

   Generated = erl_anno:set_generated(true, Ann),
-  {InnerVar, SI} = elixir_erl_var:build('_', SL),
-  TInnerVar = {var, Generated, InnerVar},
-  {Var, SV} = elixir_erl_var:build('_', SI),
-  TVar = {var, Generated, Var},

   case proplists:get_value(no_parens, Meta, false) of
     true ->
-      TError = {tuple, Ann, [{atom, Ann, badkey}, TRight, TVar]},
-      {{'case', Generated, TLeft, [
-        {clause, Generated,
-          [{map, Ann, [{map_field_exact, Ann, TRight, TVar}]}],
-          [],
-          [TVar]},
-        {clause, Generated,
-          [TVar],
-          [],
-          [{'case', Generated, ?remote(Generated, elixir_erl_pass, no_parens_remote, [TVar, TRight]), [
-            {clause, Generated,
-             [{tuple, Generated, [{atom, Generated, ok}, TInnerVar]}], [], [TInnerVar]},
-            {clause, Generated,
-             [{var, Generated, '_'}], [], [?remote(Ann, erlang, error, [TError])]}
-          ]}]}
-      ]}, SV};
+      {{call, Generated, {remote, Generated, {atom, Ann, erlang}, {atom, Ann, map_get}}, [TRight, TLeft]}, SL};
     false ->
-      {{'case', Generated, TLeft, [
-        {clause, Generated,
-          [{map, Ann, [{map_field_exact, Ann, TRight, TVar}]}],
-          [],
-          [?remote(Generated, elixir_erl_pass, parens_map_field, [TRight, TVar])]},
-        {clause, Generated,
-          [TVar],
-          [],
-          [{call, Generated, {remote, Generated, TVar, TRight}, []}]}
-      ]}, SV}
+      {{call, Generated, {remote, Generated, TLeft, TRight}, []}, SL}
     end;

 translate({{'.', _, [Left, Right]}, Meta, Args}, _Ann, S)
Enter fullscreen mode Exit fullscreen mode

Apply it and assemble the compiler:

❯ git apply dot-to-maps-get.patch  
❯ make compile  
Enter fullscreen mode Exit fullscreen mode

Run tests to verify nothing breaks:

❯ make test
Enter fullscreen mode Exit fullscreen mode

Only one test fails, complaining about an exception message mismatch:

  1) test blaming annotates undefined key error with nil hints (ExceptionTest)
     test/elixir/exception_test.exs:678
     Assertion with == failed
     code:  assert blame_message(nil, & &1.foo) ==
              "key :foo not found in: nil\n\nIf you are using the dot syntax, " <>
                "such as map.field, make sure the left-hand side of the dot is a map"
     left:  "expected a map, got: nil"
     right: "key :foo not found in: nil\n\nIf you are using the dot syntax, such as map.field, make sure the left-hand side of the dot is a map"
     stacktrace:
       test/elixir/exception_test.exs:679: (test)
Enter fullscreen mode Exit fullscreen mode

We can live with that. Everything else seems to be in order.

Add the patched compiler’s bin directory to the front of PATH:

PATH="path_to_elixir/elixir/bin:$PATH"
Enter fullscreen mode Exit fullscreen mode

and check that new compiler generates the expected code:

dot(_map@1) when erlang:is_map(_map@1) ->
  erlang:map_get(field, _map@1);
dot(_module@1) when erlang:is_atom(_module@1) ->
  _module@1:function().
Enter fullscreen mode Exit fullscreen mode

Yes, it does.

Run the benchmarks:

...
Name                    ips        average  deviation         median         99th %
pattern_match       19.09 M       52.37 ns ±60540.61%          12 ns          22 ns
dot                 18.66 M       53.60 ns ±57166.61%          14 ns          26 ns
fetch!              18.08 M       55.30 ns ±56749.55%          14 ns          28 ns

Comparison:
pattern_match       19.09 M
dot                 18.66 M - 1.02x slower +1.23 ns
fetch!              18.08 M - 1.06x slower +2.93 ns
Enter fullscreen mode Exit fullscreen mode

No dramatic changes. The access methods remain similar in performance.

Finally, let’s measure the impact on compiled code size. We'll take the Phoenix v1.7.20 source code, compile it with both compiler versions and compare the total size of BEAM files:

❯ git clone git@github.com:phoenixframework/phoenix.git && cd phoenix
...
❯ mix deps.get
...
❯ mix compile && du -s _build/dev
...
5400
❯ mix clean --depsPATH="path_to_elixir/elixir/bin:$PATH" mix compile && du -s _build/dev
...
5336
Enter fullscreen mode Exit fullscreen mode

The difference exists but is minimal — a ~1% size reduction.

Thus, we can confidently conclude that the extra code generated for dot access has negligible impact on both performance and code size 👌

Conclusion

This concludes our journey. As you can see, even when operating purely at the syntactic level (translating Elixir AST into Erlang AST), the compiler is capable of many fascinating optimizations and transformations.

There is sometimes a perception that using Elixir will result in performance loss compared to Erlang, since the latter is considered a "lower-level" language. While this is technically true, in practice these losses are either negligible or completely eliminated during compilation.

Another thought that has been on my mind for a long time and has only strengthened as I delved deeper into the compiler: the complexity of code in Erlang/Elixir remains roughly the same regardless of the complexity of the project it is part of. It’s always just functions working with data. The data could be plain structures, Ecto schemas in a web application, or even a syntax tree in a compiler — it is still just data being processed by ordinary functions. This is a rare quality. In my experience, the code of a language’s "internals" or its core frameworks is usually worlds away from what an average programmer writes in that language.

So when using Elixir, never be afraid to explore how things work under the hood. Don’t hesitate to dive into the code behind the documentation — the language makes it as easy as possible!

Top comments (0)

👋 Kindness is contagious

If you found this post helpful, please consider leaving a ❤️ or a kind comment!

Sounds good!