Jamie Gaskins

Posted on Sep 25, 2019

Enabling Crystal’s New Multicore Support

#crystal #performance #concurrency #parallelism

Crystal is a statically typed, object-oriented programming language with syntax heavily inspired by Ruby and concepts inspired by quite a few languages, including Ruby, Go, Rust, Swift, and more.

Concurrency

One of the nice things about Crystal is its concurrency model. Rather than managing threads, you spin up fibers using the spawn method:

spawn do
  puts "This runs when the program’s main fiber is sleeping or waiting on I/O"
end

This is powerful. Threads require a lot of hands-on management — if you create them, you have to collect them yourself — but fibers are managed by the garbage collector, so you can fire and forget if you like.

Fibers are also significantly lighter weight. You can use them to spin off background work so you can do things like send off 50 web requests concurrently rather than one at a time. You could do this with threads, but creating and joining an OS thread (pthread_create and pthread_join are the libc functions that usually back threads) is comparatively expensive, so you really shouldn't.
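As a rough sketch of that "50 requests at once" idea, the snippet below spawns 50 fibers and collects their results through a Channel. The fake_request method is a hypothetical stand-in that just sleeps for a moment; it is not a real HTTP call, and a real app would swap in its own request code.

```crystal
# Hypothetical stand-in for a web request: sleeps briefly, then returns a string.
def fake_request(id : Int32) : String
  sleep 100.milliseconds
  "response #{id}"
end

channel = Channel(String).new

# Start all 50 "requests" without waiting for any of them to finish.
50.times do |i|
  spawn { channel.send fake_request(i) }
end

# Block until every fiber has reported back.
responses = Array.new(50) { channel.receive }
puts "Got #{responses.size} responses"
```

Because the fibers yield while they sleep, the waits overlap instead of adding up, which is the same effect you get when fibers are waiting on real network I/O.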

Concurrency != Parallelism

The downside of fibers has historically been that all Crystal fibers execute on the main thread. This lets you do a lot of I/O-bound work (HTTP requests, database queries, reading files from disk, etc.), but CPU-bound work (serializing/deserializing JSON, crunching numbers for reports, etc.) was limited to a single CPU core. Ruby, Python, and JavaScript all have this same limitation.

Yesterday, however, the Crystal team released version 0.31.0, which comes with multicore support! This allows us to do not only concurrent work like we had before, but also true parallel work — we can split our workload across as many CPU cores as our machine has. For example, here is a single Crystal process saturating a 32-core DigitalOcean droplet:

The code for this app was simply:

32.times { spawn { loop {} } }
sleep # Keep the main thread from exiting

Enabling Parallelism

Crystal's multicore support is still in preview, so it's off by default. You can enable it with the -D preview_mt compiler flag:

crystal build -Dpreview_mt my_app.cr

Or you can run it directly without creating build artifacts:

crystal run -Dpreview_mt my_app.cr

The best part, in my opinion, is that the level of parallelism (the number of fibers that can execute in parallel) is set through the CRYSTAL_WORKERS environment variable, which is read during app bootstrapping, before your own application code runs. This means you can tune the amount of CPU resources your app will try to use, and you can do it when the app starts rather than during the build process. So if you're running two different Crystal apps on the same 16-core server, they won't both be trying to use all 16 cores. You can give one of them 4 worker threads and the other 12:

CRYSTAL_WORKERS=4 first_app
CRYSTAL_WORKERS=12 second_app
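If you want an app to report how it was built and launched, a small check like the following can help. This is only a sketch: the mode and workers variable names and the messages are made up for illustration, and it only echoes the environment variable rather than asking the runtime how many threads it actually started.

```crystal
# flag?(:preview_mt) is evaluated at compile time, so only one branch is compiled in.
{% if flag?(:preview_mt) %}
  mode = "multi-threaded"
{% else %}
  mode = "single-threaded"
{% end %}

# CRYSTAL_WORKERS is read from the environment at startup; it may be unset.
workers = ENV["CRYSTAL_WORKERS"]? || "unset (runtime default applies)"

puts "Build mode: #{mode}"
puts "CRYSTAL_WORKERS: #{workers}"
```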

How Does It Work?

Parallelism is configured while your app is bootstrapping, that is, while it is wiring up all the parts it needs before it can begin executing your application code. That parallelism is achieved through a static thread pool. No threads are spun up or down while your app is running; all the threads are created by the time your first line of application code executes!

Each thread in this pool comes with its own fiber scheduler. It's basically doing what it does in single-thread mode, just across more threads. This means that a fiber currently runs only within the thread it's initially assigned to. This isn't necessarily the thread that created it, though. For example, if Thread 1 calls spawn, that new fiber may be assigned to Thread 4 and it will live on Thread 4 until it dies.

The Crystal team have discussed implementing "fiber stealing" (basically, if Thread 1 has nothing to do and Thread 2 has a lot of fibers, Thread 1 might take some of those fibers to spread around the work), but I have a feeling that's a ways off.

What Can I Do With This?

Anything that can benefit from concurrent work will automatically be parallelized. For example, web apps often use HTTP::Server from the Crystal standard library — either directly or through a framework such as Amber or Lucky. This class spins off every request handler in its own fiber. With the preview_mt flag enabled, this now spreads across CPU cores!

Background job processors like Sidekiq (yes, Sidekiq has been ported to Crystal by its author) perform each job in its own fiber. You can also use the RabbitMQ client and spawn a fiber for each incoming message.

Or you can even split up work that you might otherwise do serially. Let's say you have the following code that iterates over an array and processes each element one at a time:

results = array.map { |value| process(value) }

This code is very simple, but if it takes a long time to process, it might be worth splitting the individual parts across all of your CPU cores. To achieve this, we can convert it into a producer/consumer setup where the producer spins up a fiber for each element, in which the result is computed and passed through a Channel for the consumer to receive:

channel = Channel(MyValue).new(array.size)

array.each do |value|
  spawn { channel.send process(value) }
end

results = Array.new(array.size) { channel.receive }

Queues are often the solution to a producer/consumer problem and we're using a Channel as that queue. They're built into the standard library (and also used within Crystal itself) so we can count on them being there without installing additional dependencies.
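One thing to keep in mind with the channel version is that results arrive in the order the fibers finish, which is not necessarily the order of the input array. If order matters, one option is to send each result along with its index. The sketch below assumes a toy process method and an Int32 array purely for illustration; they stand in for whatever your real code does.

```crystal
# Hypothetical stand-in for the process method used above.
def process(value : Int32) : Int32
  value * value
end

array = [1, 2, 3, 4, 5]

# Each message carries {index, result} so the consumer knows where it belongs.
channel = Channel(Tuple(Int32, Int32)).new(array.size)

array.each_with_index do |value, index|
  spawn { channel.send({index, process(value)}) }
end

# Pre-fill the results array, then slot each result into its original position.
results = Array(Int32).new(array.size, 0)
array.size.times do
  index, result = channel.receive
  results[index] = result
end

p results # => [1, 4, 9, 16, 25]
```

The output matches what array.map would return, no matter which fiber finishes first.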

Caveats

Unfortunately, because multithreading is so new, there may be some libraries that aren't thread-safe yet. That's okay. Fixing these issues is frequently a matter of wrapping mutexes around state changes to make them atomic (like you might with a database transaction), so it's a fantastic opportunity to make a contribution to an open-source library.
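To give a feel for what "wrapping a mutex around a state change" looks like, here is a minimal sketch using Mutex from the standard library. The shared counter and the done channel are made up for this example; a real library would be protecting its own internal state.

```crystal
counter = 0
lock = Mutex.new
done = Channel(Nil).new

10.times do
  spawn do
    1000.times do
      # Only one fiber at a time may run this block, even across threads.
      lock.synchronize { counter += 1 }
    end
    done.send nil
  end
end

# Wait for all 10 fibers to finish before reading the total.
10.times { done.receive }
puts counter # => 10000
```

Without the lock, two threads could read the same old value of counter and both write back the same new one, silently losing an increment when the app is built with -Dpreview_mt.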

If you're unsure how to make this work for your application, you can always come by the Crystal Gitter channel or post on the Crystal forums or subreddit. The community is very helpful and we're all excited about multicore support coming to the Crystal ecosystem, so feel free to ask any questions you may have!
