When building ML systems, most people focus on the model.
But in production, the hard part is not training:
it's data, deployment, versioning, and serving.
Modern ML engineering solves this using the FTI pattern:
Feature → Training → Inference
This is like:
DB → Backend → UI
🔹 Why we need ML pipelines
A real ML system must handle:
data ingestion
feature computation
model training
model versioning
deployment
monitoring
rollback
scaling
Without structure → chaos.
🔹 1. Feature Pipeline
raw data → features → feature store
Responsibilities:
collect data
clean & validate
compute features
compute labels
version data
Features are saved in a feature store.
Why?
To avoid a training/inference mismatch.
This solves:
training-serving skew
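The steps above can be sketched as a toy feature pipeline. Everything here is a hypothetical illustration, not any specific tool's API: a plain dict stands in for the feature store, the feature itself is deliberately trivial, and versioning is done by content-hashing so identical features always get the same version id.

```python
import hashlib
import json

def compute_features(raw_rows):
    """Clean raw rows and compute features; incomplete records are dropped."""
    features = []
    for row in raw_rows:
        if row.get("amount") is None:
            continue  # validation step: skip rows with missing values
        features.append({
            "user_id": row["user_id"],
            # toy feature: number of digits in the integer part of the amount
            "amount_log_bucket": len(str(int(row["amount"]))),
        })
    return features

def version_of(features):
    """Content hash, so the same features always map to the same version id."""
    blob = json.dumps(features, sort_keys=True).encode()
    return hashlib.sha256(blob).hexdigest()[:8]

# A dict standing in for a real feature store: version id -> feature rows.
feature_store = {}

raw = [{"user_id": 1, "amount": 120.0}, {"user_id": 2, "amount": None}]
feats = compute_features(raw)
feature_store[version_of(feats)] = feats
```

Because both training and inference read from `feature_store` (never from `raw`), they are guaranteed to see the same computed values, which is exactly how the skew is avoided.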
🔹 2. Training Pipeline
features → training → model → model registry
Responsibilities:
load features
train model
evaluate
version model
store metadata
Models are saved in a model registry.
So we always know:
model v1 → features F1, F2, F3
model v2 → features F2, F3, F4
This makes rollback easy.
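A minimal sketch of that model-to-feature-version mapping, assuming an in-memory dict as the registry; real systems would use something like MLflow or a cloud model registry, and the model name, versions, and metrics below are made up for illustration:

```python
# model registry: (name, version) -> model plus its training metadata
model_registry = {}

def register_model(name, version, model, feature_versions, metrics):
    """Store the model together with the exact feature versions it was trained on."""
    model_registry[(name, version)] = {
        "model": model,
        "features": feature_versions,
        "metrics": metrics,
    }

def rollback(name, to_version):
    """Rollback is just a lookup: fetch an older, known-good entry."""
    return model_registry[(name, to_version)]

register_model("churn", "v1", object(), ["F1", "F2", "F3"], {"auc": 0.81})
register_model("churn", "v2", object(), ["F2", "F3", "F4"], {"auc": 0.84})

previous = rollback("churn", "v1")  # v2 misbehaves? serve v1 again
```

The key design choice is that the feature versions are stored *with* the model, so rolling back the model automatically tells you which features to roll back to.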
🔹 3. Inference Pipeline
features + model → prediction
Inputs:
feature store
model registry
Outputs:
predictions
text
scores
embeddings
Can be:
batch
real-time API
streaming
Everything is versioned → safe deployment.
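One way to see why the same pipeline can serve batch and real-time: the scoring function is identical, only the caller changes. The "model" here is a hand-written weighted sum with hypothetical weights, standing in for whatever you would actually load from the registry:

```python
def predict(model, feature_row):
    """Toy model: score = weighted sum of the feature values."""
    return sum(model[name] * value for name, value in feature_row.items())

# hypothetical weights, as if loaded from the model registry
model = {"f1": 0.5, "f2": 2.0}

# Batch mode: score many stored feature rows at once.
rows = [{"f1": 1.0, "f2": 0.0}, {"f1": 0.0, "f2": 1.0}]
batch_scores = [predict(model, r) for r in rows]

# Real-time mode: the exact same function serves a single request.
request_score = predict(model, {"f1": 2.0, "f2": 1.0})
```

A streaming consumer would call `predict` per event in the same way, which is why the three serving modes can share one inference pipeline.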
🔹 Why FTI is powerful
Instead of juggling 20 tangled components, you manage three pipelines:
Feature
Training
Inference
Each pipeline can:
run separately
scale separately
use different tech
be built by different teams
Perfect for production ML.
🔹 Works great for LLM / RAG / AI apps
Example for an LLM Twin:
Feature
→ collect posts
→ create embeddings
Training
→ fine-tune model
Inference
→ retrieve context
→ generate text
Same pattern.
Different data.
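The retrieve-context step can be sketched end to end with a deliberately toy "embedding" (character counts instead of a real embedding model) just to show the shape of the flow; every name below is hypothetical:

```python
def embed(text):
    """Stand-in embedding: a character-frequency vector.
    A real app would call an embedding model here."""
    vec = [0.0] * 26
    for ch in text.lower():
        if ch.isalpha():
            vec[ord(ch) - ord("a")] += 1.0
    return vec

def similarity(a, b):
    """Dot product as a crude similarity score."""
    return sum(x * y for x, y in zip(a, b))

# Feature pipeline output: posts indexed with their embeddings.
posts = ["ship features first", "train models later"]
index = [(post, embed(post)) for post in posts]

# Inference: retrieve the closest post, then build the generation prompt.
query = "features"
best_post, _ = max(index, key=lambda item: similarity(embed(query), item[1]))
prompt = f"Context: {best_post}\nQuestion: {query}"
```

The prompt would then go to the fine-tuned model from the training pipeline; the retrieval index is just another versioned feature-store artifact.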
✅ Rule to remember
Every real ML system = Feature + Training + Inference
Understand this → you can design almost any ML architecture.


