## Introduction
The Crystal programming language is notorious for its slow compilation times.
But have you ever wondered where Crystal actually spends most of its compilation time?
Figure: Crystal uses LLVM as its backend
## The Crystal Compilation Pipeline
The Crystal compiler's compilation process consists of the following stages:
- `new_program`: Creating the program object
- `parse`: Lexical analysis and parsing
- `semantic`: Semantic analysis
- `codegen`: Generating object files
```crystal
module Crystal
  class Compiler
    def compile(source : Source | Array(Source), output_filename : String) : Result
      source = [source] unless source.is_a?(Array)
      # 1. new_program
      program = new_program(source)
      # 2. parse
      node = parse program, source
      # 3. semantic
      node = program.semantic node, cleanup: !no_cleanup?
      # 4. codegen
      units = codegen program, node, source, output_filename unless @no_codegen
      # 5. cleanup
      # ... omitted ...
      Result.new program, node
    end
  end
end
```
After this, linking is performed by the standard linker.
## Command-Line Options for Compilation Statistics
Crystal provides the `-s` (`--stats`) command-line option, which displays per-stage compilation time statistics:

```
crystal build -s hoge.cr
```

However, this option doesn't show the execution time of native LLVM functions, which wasn't enough for this article's investigation.
To get to the heart of the matter, I used print debugging to measure the compilation time.
## Native LLVM Functions Called During Codegen
During the codegen stage, the following native LLVM functions are called:
- `LibLLVM.run_passes`: Applies optimization passes to LLVM IR
- `LibLLVM.target_machine_emit_to_file`: Generates object files
I measured the execution time of these functions using print debugging as well.
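This kind of print debugging can be sketched with Crystal's standard `Time.measure`, which runs a block and returns the elapsed time as a `Time::Span`. The workload below is a stand-in, not the compiler's actual code:

```crystal
# Minimal sketch: time an arbitrary block of work and print the elapsed
# seconds, the same way each compiler stage was measured in this article.
elapsed = Time.measure do
  # stand-in for an expensive stage such as semantic analysis or codegen
  sum = 0_i64
  1_000_000.times { |i| sum += i }
end
STDERR.puts "stage took #{elapsed.total_seconds} seconds"
```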
## Results
Here are the results from compiling the Crystal compiler itself:
| Stage | Time (seconds) |
|---|---|
| new_program | 0.000388207 |
| parse | 0.000065000 |
| semantic | 12.552620028 |
| codegen | 355.245409133 |
| - LibLLVM.run_passes | 252.340241198 |
| - LibLLVM.target_machine_emit_to_file | 93.280652845 |
| cleanup | 0.000013180 |
| total | 367.798495548 |
Visualized as a bar chart:

*Figure: Bar chart of compilation time by stage. (This graph is from the original article and may differ slightly from the latest compiler.)*
Were the results what you expected?
- Lexical analysis and parsing take virtually no time!
- Semantic analysis (including type inference) also takes relatively little time!
In fact, the vast majority of the compilation time is spent in codegen, specifically in `LibLLVM.run_passes` and `LibLLVM.target_machine_emit_to_file`. These are external LLVM function calls that happen outside of Crystal's control!
In this case, where the Crystal compiler itself was built with `--release`, the majority of the compilation time was spent on LLVM optimization and code generation.
This might be a somewhat surprising result, don't you think?
## How to Speed Up the Crystal Compiler
The parts of the Crystal compiler implemented in Crystal—namely lexical analysis, parsing, and semantic analysis—are already sufficiently fast. This means that to achieve further speedups, we would need hardcore approaches such as:
- Introducing parallelization even in release builds
- Optimizing LLVM itself (specifically for Crystal)
- Improving Crystal to generate LLVM IR that's easier for LLVM to process
However, since these approaches aren't very practical for everyday use, let me introduce a more accessible method:
### Use `-O3` Instead of `--release`
In the Crystal compiler, specifying `--release` is equivalent to specifying both `-O3` and `--single-module`. If you're willing to sacrifice some optimization, you can specify only `-O3`, which keeps parallel code generation enabled and can speed up compilation in many cases.
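As a concrete comparison (with a hypothetical entry point `app.cr`):

```
# Full release build: equivalent to -O3 plus --single-module,
# producing one large LLVM module that is optimized serially
crystal build --release app.cr

# -O3 alone: same optimization level, but codegen stays split into
# multiple modules, so LLVM's work can proceed in parallel
crystal build -O3 app.cr
```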
From here on, the discussion becomes somewhat speculative.
## Why Crystal Doesn't Have Incremental Compilation or Shared Library Support
### Crystal's `--release` Mode Includes `--single-module`
Crystal struggles with splitting code into separate compilation units and reusing the results. In particular, `--release` builds enable `--single-module`, which compiles everything into one massive LLVM module for optimization.

For comparison, Rust performs separate compilation for each crate even with `--release`. In Rust, you need to explicitly use `-C lto=fat` to get behavior similar to Crystal's, where the entire LLVM IR is optimized together.
### Crystal's Weak Caching Mechanism
Crystal does have a mechanism that caches LLVM bitcode files (`.bc`) and object files on a per-type basis during normal builds, and can reuse object files only when the bitcode is completely unchanged.

This allows the compiler to skip the expensive object file generation step in some cases.

However, even then, lexical analysis, parsing, and semantic analysis cannot be skipped; the comparison only happens after the `.bc` files have been generated. And as we'll discuss later, cases where the bitcode is completely unchanged are actually quite rare.
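These cached artifacts can be inspected on disk. The cache location is reported by `crystal env` (typically under `~/.cache/crystal` on Linux, and overridable via the `CRYSTAL_CACHE_DIR` environment variable):

```
# Print Crystal's cache directory, then list the per-program caches in it
crystal env CRYSTAL_CACHE_DIR
ls "$(crystal env CRYSTAL_CACHE_DIR)"
```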
### Crystal Is a Statically-Typed Language Where the Caller Determines Types
Why can't Crystal split packages into multiple LLVM IR modules, precompile them, and reuse the results?
The main reason is that Crystal has strong type inference and union types, and the concrete types of methods change depending on the calling context.
Crystal is an unusual statically-typed language where the caller determines the types, enabling duck typing. However, the trade-off is that type signatures need to be inferred with every compilation.
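A tiny example of this caller-driven typing (the `double` method here is hypothetical):

```crystal
# No type restriction on `x`: the compiler infers a separate concrete
# signature for each call site, duck-typing on the `+` operator.
def double(x)
  x + x
end

a = double(21)    # instantiated for Int32
b = double("ab")  # instantiated for String
puts a
puts b
```

Because the concrete instantiations only exist once all call sites are known, a method like `double` cannot be compiled to machine code ahead of time in isolation.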
### Type IDs Change with Each Compilation
The Crystal compiler assigns a number to every class to resolve types. With each compilation, every type that appears gets assigned a "number." Let's say class A gets assigned the number "10" in one compilation. If you make a small change to the code and recompile, "10" might be assigned to a different class. Linking object files created this way causes type inconsistencies and fails, because conditional branches based on types won't work correctly.
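You can observe these internal numbers through `Object#crystal_type_id`, which exposes the ID the compiler assigned to a value's type. The concrete values depend on the program and compiler version, which is exactly why no expected output is shown here:

```crystal
# Prints the compiler-assigned numeric type IDs. The numbers can differ
# between compilations of different programs, which is why object files
# from separate compilations cannot be safely linked together.
puts 42.crystal_type_id
puts "hello".crystal_type_id
puts [1, 2, 3].crystal_type_id
```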
Additionally, when loading multiple Crystal shared libraries simultaneously, there's the problem of runtime functions being multiply defined.
This makes it difficult for Crystal to split code into parts, precompile them, and reuse them later.
But is this an inherent characteristic of the Crystal language? Let's consider this from a more social context.
## The Crystal Language Community and Resource Constraints
Crystal is known as a language with Ruby-like concise syntax that delivers excellent performance.
However, the Crystal development team has limited resources. While there is a dedicated team at Manas.Tech and community contributors worldwide, the resources are still limited compared to large corporations.
For instance, imagine if Apple were developing Crystal.
Apple engineers might make changes to clang/LLVM itself to significantly improve compilation speed.
Or, like Swift, they might define a proper ABI and create an intermediate language or binary format well-suited to Crystal. Similar to how Swift has SIL (Swift Intermediate Language) as an intermediate representation before converting to LLVM IR, Crystal could have its own optimized intermediate language. This would enable comparing modules at that stage, resolving types, and generating object files from there. (Though I'm not entirely sure if this is possible within the LLVM framework.)
However, the Crystal compiler we have isn't like that. It generates monolithic, massive LLVM IR and delegates all optimization to LLVM. For package management, downloading source code directly from GitHub is the mainstream approach.
There still seems to be room for improvement.
The characteristic of slow compilation but fast execution is not purely a linguistic characteristic of Crystal, but also stems from the resource constraints of the Crystal development team. In other words, if significant resources were invested in development in the future, these issues could potentially be improved.
## Conclusion
Designing an ABI specification or intermediate language for Crystal is extremely difficult. However, if someone achieves this, it could become Crystal 2.0 or Crystal 3.0.
Even without going that far, finding ways to split the generated LLVM IR into multiple modules, or mangling function names and global variables, would represent significant progress.
Crystal doesn't have as vibrant a library ecosystem as some other languages. While the reasons aren't entirely clear, as we improve the environment for code reuse, techniques for improving compilation speed may also develop.
That's all for this article. Thank you for reading to the end!
This article was originally written in 2024 and revised in December 2025. It was translated from Japanese to English using Claude Sonnet.