Six months ago, my team needed to deploy DeepSeek-R1 for internal use. We have a Kubernetes cluster — like everyone does in 2026 — so I started looking for tools.
The problem
There are basically three options for running LLMs on Kubernetes:
- KAITO (Microsoft) — CNCF Sandbox, 1600+ stars, but Azure-only. We are on AWS.
- KServe — CNCF Incubating, solid project, but requires Knative + ISTIO + 5+ other components.
- Raw vLLM — Great for serving, but you need to separately set up monitoring, tracing, auth, API keys, rate limiting, autoscaling.
So I did what any engineer would do: I built my own.
What I built
kube-llmops is a Kubernetes-native LLMOps platform that deploys everything you need with one Helm chart:
- Model serving — vLLM, llama.cpp, or TEI, auto-selected based on model format
- AI Gateway — LiteLLM for unified API, key management, rate limiting, budget control
- Observability — 11 pre-built Grafana dashboards + Langfuse v3 LLM tracing
- Autoscaling — KEDA with queue depth, TTFT P95, TPOT P95, scale-to-zero
- Security — Keycloak SSO, LLM-Guard prompt injection defense
- RAG — Dify + pgvector + TEI embedding + Ragas evaluation
- Fine-tuning — LLaMA-Factory + Argo Workflows + MLflow
Why KAITO doesn't work for everyone
KAITO is a great project — but it only runs on Azure. If you are on AWS, GCP, on-prem, or multi-cloud, you are out of luck.
kube-llmops is cloud-agnostic. It runs on AWS EKS, Google GKE, Azure AKS, on-prem, or local kind.
What makes it different
Feature comparison:
- Cloud-agnostic: kube-llmops Yes, KAITO Azure-only, KServe Yes
- One-command install: kube-llmops Yes, KAITO Yes, KServe No
- AI Gateway: kube-llmops Yes, KAITO No, KServe No
- LLM Tracing: kube-llmops Yes, KAITO No, KServe No
- Pre-built dashboards: kube-llmops 11, KAITO 0, KServe 0
- KEDA autoscaling: kube-llmops Yes, KAITO No, KServe Partial
- SSO: kube-llmops Yes, KAITO No, KServe No
- RAG infrastructure: kube-llmops Yes, KAITO No, KServe No
Try it
https://github.com/GaeaRuiW/kube-llmops
Give it a star if you want to support a truly open, cloud-agnostic LLMOps platform! ⭐
git clone https://github.com/GaeaRuiW/kube-llmops.git
cd kube-llmops
helm install kube-llmops charts/kube-llmops-stack -f charts/kube-llmops-stack/values-ci.yaml --namespace kube-llmops --create-namespace
Top comments (0)