Nik Begley for Doctave

Posted on Aug 19, 2021 • Originally published at blog.doctave.com

Using Rust with Elixir for code reuse and performance

#programming #webdev #rust #elixir

Doctave.com is primarily built on the Elixir language. Specifically the
Phoenix Web framework. Some key parts of our stack however are built in Rust and in this post we'll examine the reasons why and how this works in practice.

Elixir & Phoenix

The Elixir/Phoenix stack has proved to be reliable and productive. The
Erlang/OTP backbone that the language is built upon is rock solid. It has a small memory footprint, and more than plenty fast for most web applications.

What's more, Phoenix Live View has made building interactive pages simpler than ever. It often removes the need for a complex front-end framework like React or Vue and an associated API.

This has been a common trend using Elixir. In other languages or frameworks you would reach for an external service or program to handle caching, background jobs, or pub-sub. Elixir has instead often proved sufficient and kept our codebase simpler. This table from Sasa Juric's book Elixir in Action sums up our experience:

Why Rust?

The reason we ended up adding Rust to our stack is two-fold:

Reuse code from our open source documentation generator CLI that is built in Rust
Rendering large Markdown takes a non-negligible amount of time in Elixir

Reason #1: Code Reuse

Doctave started its life as an open source CLI for generating documentation sites from Markdown.
It was built in Rust, partly for speed, but mainly for ease of deployment. The Doctave CLI can be compiled into static binaries that run on popular platforms. As anyone who has tried to distribute a multi-platform Ruby program can tell you, this makes life a lot simpler.

As we started building Doctave.com, we wanted to make sure we would render the same output from the CLI and on our hosted platform. This is when we ran into the first issue: Markdown is not a specification. There are in fact many flavors of Markdown - CommonMark and GitHub Flavored Markdown to name a few. While they are mostly compatible, they come with small differences. Your final HTML can be different depending on which standard your parser uses.

It turned out that making the most popular Elixir Markdown processor, Earmark (originally written by Dave Thomas) and pulldown-cmark, a Rust Markdown processor, produce the same output was going to be difficult. We also required some customization that was not available in both libraries.

In the end we extracted the Markdown processor code from our Rust CLI into its own crate: doctave-markdown. This crate wraps pulldown-cmark with some specific flags and customization, and sanitizes the final HTML.

Calling Rust from Elixir

Elixir (or rather Erlang) has a mechanism called Native Implemented Functions (NIFs for short) for calling native code. NIFs are an interesting topic that we can't go into deeply here. Just note that it's important to think about how your NIF interacts with the Erlang scheduler. Hanging in your native function can cause the Erlang process to stall.

Luckily, there is a great project called Rustler. It makes implementing NIFs in Rust a breeze. In our case, all we needed was ~100 lines of glue code that mapped Rust data structures to Elixir and did some argument parsing. Then we were ready to call our Rust code from Elixir.

Now our Elixir server (named Bolero) defines this function:

defmodule Doctave.Bolero.Markdown.Parser do
  use Rustler, otp_app: :bolero, crate: "doctave_markdown_bolero"

  # When your NIF is loaded, it will override this function.
  def parse(_markdown, _url_root, _link_rewrite_rules, _url_params),
    do: :erlang.nif_error(:nif_not_loaded)
end

and Rustler makes sure that the NIF will be available at runtime for us.

Reason #2: Performance

Talking about software performance and speed is always (or at least should be) a nuanced discussion.
The question "Is X fast" really only has one (non-)answer: "it depends".

So what does it depend on?

An HTTP request has a budget of about 100-200 milliseconds if you want your application to feel snappy. Doctave.com is a young application, and staying within this budget has been easy (especially given Phoenix can serve sub-millisecond responses). But over time your application will end up having to do more things. Inevitably your response times will grow without careful consideration. If you spend your budget too early, you may find yourself struggling to stay on budget in the future.

That's not to say you should not use up your budget - you just need to be mindful of it.

So the question then is:

Given this budget, how much did we save by parsing Markdown in Rust instead of Elixir?

Always measure

Let's answer this with an unscientific experiment comparing the Rust and Elixir Markdown implementations. We will use a Markdown file that shows all the features Doctave supports, and see how the two libraries perform.

To see how the performance scales with input size, we will also run the parser with the same input concatenated together 4, 16, 32, and 64 times. This gives us sizes ranging from ~2.5 kilobytes to ~160 kilobytes. This is a not unreasonable range of real-world input, even if larger inputs are certainly possible.

Here's the results running on a developer laptop with the following specs:

Operating System: Linux
CPU Information: Intel(R) Core(TM) i7-10510U CPU @ 1.80GHz
Number of Available Cores: 8
Available memory: 39.06 GB

First for Elixir. (Measured with Benchee)

Name	ips	average	deviation	median	99th %
2486_bytes	151.94	6.58 ms	±3.23%	6.55 ms	7.25 ms
9944_bytes	38.68	25.85 ms	±10.65%	25.18 ms	36.14 ms
39776_bytes	9.62	103.99 ms	±2.49%	103.89 ms	110.14 ms
79552_bytes	4.40	227.50 ms	±2.85%	226.19 ms	245.43 ms
159104_bytes	2.17	461.75 ms	±4.48%	465.18 ms	494.03 ms

And then for Rust. (Measured with Criterion)

Name	99th % min	average	99th % max
2486_bytes	17.618 us	17.667 μs	17.722 us
9944_bytes	68.656 us	68.829 μs	69.059 us
39776_bytes	259.68 us	260.12 μs	260.71 us
79552_bytes	536.68 us	537.64 μs	538.67 us
159104_bytes	1.0690 ms	1.0704 ms	1.0.720 ms

Unsurprisingly, Rust outperforms Elixir by 2 orders of magnitude. It usually parses the Markdown into HTML in a matter of microseconds.

But to go back to the original question, is it worth it to use Rust instead of Elixir in this case? Given how easy it was to integrate Rust into the Elixir project, I would argue it was absolutely worth it. The cost of integrating a different language ended up being delightfully small.

Using Elixir would not have been prohibitively slow, but parsing in Rust instead barely eats into our budget. Elixir's parsing performance is fast enough most of the time, and a simple ETS-backed caching layer could help response times in pathological cases (though caching always does add its own set of problems).

Had the integration required more custom engineering, the calculus would not have been so clear. In that case we would have had to compare the complexity of caching responses to the complexity of integrating Rust into our toolchain.

Conclusions

For our use case the combination of Elixir and Rust has worked wonderfully. It has allowed us to rely on both languages for their strengths: Elixir/Erlang's networking and clustering features and Rust's native performance and correctness guarantees.

A big thanks to the creators of Rustler for making a great project for integrating these two languages!

Top comments (3)

Źmićer Rubinštejn • Aug 25 '21

The only question here:

What do you measure in Rust benchmark?

It’s something really unnatural, because it’s absolutely not interesting what is the performance of a Rust code.
The only interesting thing is performance of a NIF, but you need benchee to make this benchmark.