I built an open-source alternative to Microsoft's KAITO that works on ANY Kubernetes cluster

GaeaRuiW — Tue, 09 Jun 2026 05:17:19 +0000

Six months ago, my team needed to deploy DeepSeek-R1 for internal use. We have a Kubernetes cluster — like everyone does in 2026 — so I started looking for tools.

The problem

There are basically three options for running LLMs on Kubernetes:

KAITO (Microsoft) — CNCF Sandbox, 1600+ stars, but Azure-only. We are on AWS.
KServe — CNCF Incubating, solid project, but requires Knative + ISTIO + 5+ other components.
Raw vLLM — Great for serving, but you need to separately set up monitoring, tracing, auth, API keys, rate limiting, autoscaling.

So I did what any engineer would do: I built my own.

What I built

kube-llmops is a Kubernetes-native LLMOps platform that deploys everything you need with one Helm chart:

Model serving — vLLM, llama.cpp, or TEI, auto-selected based on model format
AI Gateway — LiteLLM for unified API, key management, rate limiting, budget control
Observability — 11 pre-built Grafana dashboards + Langfuse v3 LLM tracing
Autoscaling — KEDA with queue depth, TTFT P95, TPOT P95, scale-to-zero
Security — Keycloak SSO, LLM-Guard prompt injection defense
RAG — Dify + pgvector + TEI embedding + Ragas evaluation
Fine-tuning — LLaMA-Factory + Argo Workflows + MLflow

Why KAITO doesn't work for everyone

KAITO is a great project — but it only runs on Azure. If you are on AWS, GCP, on-prem, or multi-cloud, you are out of luck.

kube-llmops is cloud-agnostic. It runs on AWS EKS, Google GKE, Azure AKS, on-prem, or local kind.

What makes it different

Feature comparison:

Cloud-agnostic: kube-llmops Yes, KAITO Azure-only, KServe Yes
One-command install: kube-llmops Yes, KAITO Yes, KServe No
AI Gateway: kube-llmops Yes, KAITO No, KServe No
LLM Tracing: kube-llmops Yes, KAITO No, KServe No
Pre-built dashboards: kube-llmops 11, KAITO 0, KServe 0
KEDA autoscaling: kube-llmops Yes, KAITO No, KServe Partial
SSO: kube-llmops Yes, KAITO No, KServe No
RAG infrastructure: kube-llmops Yes, KAITO No, KServe No

Try it

https://github.com/GaeaRuiW/kube-llmops

Give it a star if you want to support a truly open, cloud-agnostic LLMOps platform! ⭐

git clone https://github.com/GaeaRuiW/kube-llmops.git
cd kube-llmops
helm install kube-llmops charts/kube-llmops-stack -f charts/kube-llmops-stack/values-ci.yaml --namespace kube-llmops --create-namespace

KAITO vs KServe vs kube-llmops: Which Kubernetes LLM Platform Should You Choose in 2026?

GaeaRuiW — Tue, 02 Jun 2026 06:22:15 +0000

KAITO vs KServe vs kube-llmops

If you are running LLMs on Kubernetes in 2026, you have probably encountered three main options: KAITO (Microsoft/CNCF Sandbox), KServe (CNCF Incubating), and kube-llmops.

TL;DR Comparison

Feature	kube-llmops	KAITO	KServe
AI Gateway	Built-in (LiteLLM)	No	No
LLM Tracing	Langfuse v3	No	No
Grafana Dashboards	11 pre-built	No	No
KEDA Autoscaling	Yes	No	Partial
SSO	Keycloak OIDC	No	No
RAG	Dify + pgvector + TEI	No	No
Fine-tuning	LLaMA-Factory + Argo	Basic	No
Cloud-Agnostic	Yes	Azure-only	Yes

kube-llmops is the only platform that gives you a complete LLM operations stack in one Helm install: model serving (vLLM, llama.cpp), AI gateway, observability, RAG, fine-tuning, SSO, and autoscaling.