Let's talk about a problem that's been bothering ML teams for a while now. When we're working with batch processing, Python does the job just fine. But real-time serving? That's where things get interesting.
I ran into this myself when building a service that needed to stay under 50ms latency. Even without strict latency requirements, everything might seem fine at first, but as the service grows, Python starts showing its limitations. Variables would randomly turn out to be None, and without type checks, tracking down those issues became a real headache.
Right now, we don't have a go-to solution for real-time model serving. Teams often turn to tools like KServe or BentoML, but this means dealing with more moving parts: more pods to watch, extra network calls slowing things down.
What about other languages? C++ is fast and works with pretty much every ML library out there, but let's be real: building and maintaining a C++ backend service is not something most teams want to take on.
It would be great if we could build models in Python but serve them in a faster language. ONNX tries to solve this, and it works reasonably well for neural networks. But when I tried using it with CatBoost, handling categorical features turned into a challenge: the support just isn't there yet.
This brings us to Rust, which offers an interesting middle ground:
- It's about as fast as C++ but, to my liking, easier to work with
- The type system keeps your business logic clean and predictable (see the small sketch after this list)
- The compiler actually helps you write better code instead of just pointing out errors
- The ML ecosystem is growing, with support from big names like Microsoft
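Here's a toy illustration of that second point (my own example, not tied to Catboost): a possibly missing feature is an explicit Option, so the compiler forces you to decide what happens when the value is absent instead of letting a stray None sneak into production.

// A feature that may be missing is an Option, not a silent None.
fn normalize(raw: Option<f64>) -> f64 {
    match raw {
        Some(value) => value.clamp(0.0, 1.0),
        // Forgetting this arm is a compile error, not a 3 a.m. incident.
        None => 0.0,
    }
}

fn main() {
    println!("{}", normalize(Some(2.5))); // 1
    println!("{}", normalize(None)); // 0
}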
Working with Official Catboost in Rust
Good news: there's actually an official Catboost crate for Rust! But before you get too excited, let me tell you about a few quirks I discovered along the way.
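To set expectations, here's roughly what inference looks like once everything is built. This is a sketch modeled on the crate's README, with a placeholder model path and made-up feature values, so treat it as the shape of the API rather than copy-paste material:

use catboost::Model;

fn main() {
    // Load a model trained and exported from Python (placeholder path).
    let model = Model::load("model.cbm").unwrap();

    // One row of numeric features and one row of categorical features
    // (made-up values; match them to whatever your model was trained on).
    let prediction = model
        .calc_model_prediction(
            vec![vec![0.5, 1.5, 3.0]],
            vec![vec!["category_a".to_string(), "category_b".to_string()]],
        )
        .unwrap();

    println!("raw prediction: {:?}", prediction);
}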
The tricky part isn't the Rust code itself: it's getting the underlying C++ libraries in place. You'll need to compile Catboost from source, and getting the environment right for that is the most difficult part.
The Catboost team provides its own Ubuntu-based image for building from source, which sounds great. But what if you're planning to run your service on Debian to keep things light? Then you'd better build Catboost on the same Debian version you'll use for serving; otherwise you might run into compatibility issues.
Let's talk about why this matters in practice. The Ubuntu build image needs a hefty 4+ GB of memory to work with, but a correctly set up custom Debian build can bring that down to just 1 GB. And when you're running lots of services in the cloud, that extra memory usage starts adding up in your monthly bill.
Setting Up Your Rust + Catboost Build Environment
Let me walk you through setting up a Debian-based environment for Catboost. I'll explain not just what to do, but why each step matters.
Installing Catboost
On the Rust side, it's as simple as adding any other crate:
[package]
name = "MLApp"
version = "0.1.0"
edition = "2021"
[dependencies]
catboost = { git = "https://github.com/catboost/catboost", rev = "0bfdc35"}
However, the Catboost crate does not ship precompiled C/C++ bindings, so during installation (cargo build) it will compile them from source specifically for your environment. Let's set up that environment.
Starting with the Right Base Image
First, we're going with debian:bookworm-slim as our base image. Why? It comes with CMake 3.24+, which we need for our build process. The "slim" variant keeps our image size down, which is always nice.
Setting Up the C++ Build Environment
We need a bunch of C++ packages, and while I'm using LLVM version 16 in this setup, you actually have flexibility here: any version that supports -mno-outline-atomics will work fine.
Let's break down our package installation into logical groups.
Setting Up Package Sources
First, we need to get our package sources in order. This part is crucial for getting the right LLVM tools:
RUN apt-get update && DEBIAN_FRONTEND=noninteractive apt-get install -y --no-install-recommends \
# will use it to download packages
wget \
# cryptographic package to verify LLVM sources
gnupg \
# check Debian version to get correct LLVM package
lsb-release \
# package management helper
software-properties-common
Then we need to add LLVM's repository:
RUN wget -O - https://apt.llvm.org/llvm-snapshot.gpg.key | apt-key add - \
&& echo "deb http://apt.llvm.org/$(lsb_release -sc)/ llvm-toolchain-$(lsb_release -sc)-16 main" \
>> /etc/apt/sources.list.d/llvm.list
Installing the Main Packages
We need quite a few packages, and I recommend organizing them by purpose; it makes maintenance so much easier:
RUN apt-get update && DEBIAN_FRONTEND=noninteractive apt-get install -y --no-install-recommends \
# Basic build essentials
build-essential \
pkg-config \
# Core development packages
libssl-dev \
cmake \
ninja-build \
python3-pip \
# LLVM toolchain - version 16 works great, but any version with
# -mno-outline-atomics support will do
clang-16 \
libc++-16-dev \
libc++abi-16-dev \
lld-16 \
# Don't forget git!
git
The next step cost me some time to figure out. Catboost expects to find clang at /usr/bin/clang, but our installation puts it at /usr/bin/clang-16. That's why we have this bit:
RUN ln -sf /usr/bin/clang-16 /usr/bin/clang && \
ln -sf /usr/bin/clang++-16 /usr/bin/clang++ && \
ln -sf /usr/bin/lld-16 /usr/bin/lld
And don't forget to set up the environment variables:
ENV CC=/usr/bin/clang
ENV CXX=/usr/bin/clang++
ENV LIBCLANG_PATH=/usr/lib/llvm-16/lib
ENV LLVM_CONFIG_PATH=/usr/bin/llvm-config-16
Managing Dependencies
We need Conan (version 2.4.1+) for handling C++ dependencies. A word of caution about the installation:
RUN pip3 install --break-system-packages "conan==2.11.0"
That --break-system-packages flag might look scary, but it's actually the easiest way I found to install Python packages system-wide on newer Debian versions. Besides, we won't be using much Python in our build image anyway.
Smart Build Strategy
Here's a trick that will save you tons of build time during active development. Split your build into two steps:
- First, build just the dependencies:
COPY ./Cargo.* ./
RUN mkdir src && \
echo "fn main() {}" > src/main.rs && \
RUSTFLAGS="-C codegen-units=1" cargo build - release
An important note here: you need the RUSTFLAGS="-C codegen-units=1" flag because it ensures that the C++ and Rust parts play along.
- Then build your actual application:
COPY ./src src
RUN cargo build --release
This way, Docker caches the dependency build, and you only rebuild your app code when it changes. Much faster!
A Critical Warning About Memory
This is important: during the C++ build steps, you'll need a machine with 20+ GB of memory (I used 32 GB). And here's the part that cost me almost a day of debugging: if you don't have enough memory, you won't get a clear error message (or any, to be honest). Instead, your build will mysteriously time out, leaving you wondering what went wrong. I learned this one the hard way!
Wrapping It Up
Now we have a working Rust environment with Catboost that can handle all the good stuff: categories, text data, embeddings. Getting here wasn't exactly easy.
Next time we'll cover:
- Building an Axum web service
- Smart model loading patterns
- Real-world performance tricks I learned along the way
So we'll turn this foundation into something that can actually serve models in production! I ran into some interesting problems while figuring this out, like accidentally loading the same model multiple times for each handler call.
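The fix, in a nutshell, is to load the model once at startup and share it. Here's a minimal sketch of the idea (placeholder path, with the actual web-framework wiring left for the next post):

use std::sync::Arc;
use catboost::Model;

fn main() {
    // Load the model once at startup (placeholder path)...
    let model = Arc::new(Model::load("model.cbm").expect("failed to load model"));

    // ...and hand each request handler a cheap Arc clone instead of calling
    // Model::load inside the handler; cloning the Arc doesn't reload the model.
    let for_handler = Arc::clone(&model);
    println!("same model instance: {}", Arc::ptr_eq(&model, &for_handler));
}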
And if you've tried this out and hit any weird issues, let me know. It's always interesting to hear what problems other people run into.