Let's talk about a problem that's been bothering ML teams for a while now. When we're working with batch processing, Python does the job just fine. But real-time serving? That's where things get interesting.
I ran into this myself when building a service that needed to stay under 50 ms of latency. Even without strict latency requirements, everything might seem fine at first, but as the service grows, Python starts showing its limitations. Variables would randomly turn into None, and without type checks, tracking down these issues becomes a real headache.
Right now, we don't have a go-to solution for real-time model serving. Teams often turn to tools like KServe or BentoML, but this means dealing with more moving parts: more pods to watch and extra network calls slowing things down.
What about other languages? C++ is fast and works with pretty much every ML library out there, but let's be real: building and maintaining a C++ backend service is not something most teams want to take on.
It would be great if we could build models in Python but serve them from a faster language. ONNX tries to solve this, and it works reasonably well for neural networks. But when I tried using it with CatBoost, handling categorical features turned into a challenge; the support just isn't there yet.
This brings us to Rust, which offers an interesting middle ground:
- It's about as fast as C++, but more pleasant to work with, at least to my liking
- The type system keeps your business logic clean and predictable
- The compiler actually helps you write better code instead of just pointing out errors
- The ML ecosystem is growing, with support from big names like Microsoft
Working with the Official Catboost Crate in Rust
Good news: there's actually an official Catboost crate for Rust! But before you get too excited, let me tell you about a few quirks I discovered along the way.
The tricky part isn't the Rust code itself; it's getting the underlying C++ libraries in place. You'll need to compile Catboost from source, and getting the environment right for that is the most difficult part.
The Catboost team provides its own Ubuntu-based image for building from source, which sounds great. But what if you're planning to run your service on Debian to keep things light? Then you'd better build Catboost on the same version of Debian you'll use for serving; otherwise you might run into compatibility issues.
Let's talk about why this matters in practice. The Ubuntu build image needs a hefty 4+ GB of memory to work with. But if you set up a custom Debian build correctly, you can bring that down to just 1 GB. And when you're running lots of services in the cloud, that extra memory usage starts adding up in your monthly bill.
Setting Up Your Rust + Catboost Build Environment
Let me walk you through setting up a Debian-based environment for Catboost. I'll explain not just what to do, but why each step matters.
Installing Catboost
On the Rust side, adding it is as simple as with any other crate:
[package]
name = "MLApp"
version = "0.1.0"
edition = "2021"
[dependencies]
catboost = { git = "https://github.com/catboost/catboost", rev = "0bfdc35"}
However, the Catboost crate does not ship precompiled C/C++ bindings, and during installation (cargo build) it will try to compile them from source specifically for your environment. So let's set up our environment.
Starting with the Right Base Image
First, we're going with debian:bookworm-slim as our base image. Why? It comes with CMake 3.24+, which we need for our build process, and the 'slim' variant keeps our image size down, which is always nice.
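For reference, here's how the build stage of the Dockerfile starts. This is just a sketch: the stage name builder and the /app working directory are my own choices, and the RUN steps below fill in the rest.
# Build stage: everything that follows runs on top of this image
FROM debian:bookworm-slim AS builder
WORKDIR /app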
Setting Up the C++ Build Environment
We need a bunch of C++ packages. I'm using version 16 in our setup, but you actually have some flexibility here: any version that supports -mno-outline-atomics will work fine.
Let's break down our package installation into logical groups.
Setting Up Package Sources
First, we need to get our package sources in order. This part is crucial for getting the right LLVM tools:
RUN apt-get update && DEBIAN_FRONTEND=noninteractive apt-get install -y --no-install-recommends \
# will use it to download packages
wget \
# cryptographic package to verify LLVM sources
gnupg \
# check Debian version to get correct LLVM package
lsb-release \
# package management helper
software-properties-common
Then we need to add LLVM's repository.
RUN wget -O - https://apt.llvm.org/llvm-snapshot.gpg.key | apt-key add - \
&& echo "deb http://apt.llvm.org/$(lsb_release -sc)/ llvm-toolchain-$(lsb_release -sc)-16 main" \
>> /etc/apt/sources.list.d/llvm.list
Installing the Main Packages
We need quite a few packages, and I recommend organizing them by purpose; it makes maintenance so much easier:
RUN apt-get update && DEBIAN_FRONTEND=noninteractive apt-get install -y --no-install-recommends \
# Basic build essentials
build-essential \
pkg-config \
# Core development packages
libssl-dev \
cmake \
ninja-build \
python3-pip \
# LLVM toolchain - version 16 works great, but any version with
# -mno-outline-atomics support will do
clang-16 \
libc++-16-dev \
libc++abi-16-dev \
lld-16 \
# Don't forget git!
git
The next step cost me some time to figure out. Catboost expects to find clang in /usr/bin/clang, but our installation puts it in /usr/bin/clang-16. That's why we have this bit:
RUN ln -sf /usr/bin/clang-16 /usr/bin/clang && \
ln -sf /usr/bin/clang++-16 /usr/bin/clang++ && \
ln -sf /usr/bin/lld-16 /usr/bin/lld
And don't forget to set up the environment variables:
ENV CC=/usr/bin/clang
ENV CXX=/usr/bin/clang++
ENV LIBCLANG_PATH=/usr/lib/llvm-16/lib
ENV LLVM_CONFIG_PATH=/usr/bin/llvm-config-16
Managing Dependencies
We need Conan (version 2.4.1+) for handling C++ dependencies. A word of caution about the installation:
RUN pip3 install --break-system-packages "conan==2.11.0"
That --break-system-packages flag might look scary, but it's actually the easiest way I found to install Python packages system-wide on newer Debian versions. Besides, we won't be using much Python in our build image anyway.
Smart Build Strategy
Here's a trick that'll save you tons of build time during active development. Split your build into two steps (see the sketches below):
- First, build just the dependencies:
COPY ./Cargo.* ./
RUN mkdir src && \
echo "fn main() {}" > src/main.rs && \
RUSTFLAGS="-C codegen-units=1" cargo build - release
An important note here: you need the RUSTFLAGS="-C codegen-units=1" flag because it ensures that C++ and Rust play along.
- Then build your actual application:
COPY ./src src
RUN cargo build --release
This way, Docker caches the dependency build, and you only rebuild your app code when it changes. Much faster!
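To actually get the lean serving image mentioned earlier, you can finish the Dockerfile with a separate runtime stage. Here's a rough sketch under a few assumptions of mine: the builder stage name and /app working directory come from the earlier snippets, the binary name MLApp comes from our Cargo.toml, and depending on what your service links you may need a couple of extra runtime packages (libssl3, for example).
# Runtime stage: ships only the compiled binary, not the build toolchain
FROM debian:bookworm-slim AS runtime
COPY --from=builder /app/target/release/MLApp /usr/local/bin/MLApp
CMD ["MLApp"]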
A Critical Warning About Memory
This is important: during the C++ build steps, you'll need a machine with 20+ GB of memory (I used 32 GB). And here's the part that cost me almost a day of debugging: if you don't have enough memory, you won't get a clear error message (or any message, to be honest). Instead, your build will mysteriously time out, leaving you wondering what went wrong. I learned this one the hard way!
Wrapping It Up
Now we have a working Rust environment with Catboost that can handle all the good stuff: categories, text data, embeddings. Getting here wasn't exactly easy.
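As a quick sanity check that the build really works, here's roughly what loading a trained model and scoring a single row looks like with the crate. It's a minimal sketch: the file name model.cbm and the feature values are made up, and the exact column layout depends on how your model was trained.
fn main() {
    // Load a model trained and exported from Python
    let model = catboost::Model::load("model.cbm").expect("failed to load model");

    // One row of numeric features and one row of categorical features
    let prediction = model
        .calc_model_prediction(
            vec![vec![1.0, 2.0, 3.0]],
            vec![vec![String::from("category_a")]],
        )
        .expect("prediction failed");

    println!("raw prediction: {:?}", prediction);
}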
Next time we'll cover:
- Building an Axum web service
- Smart model loading patterns
- Real-world performance tricks I learned along the way
So we'll turn this foundation into something that can actually serve models in production! I ran into some interesting problems while figuring this out, like accidentally loading the same model multiple times for each handler call.
And if you've tried this out and hit any weird issues, let me know. It's always interesting to hear what problems other people run into.