DEV Community

Cover image for Introducing KubeAI: Open AI Inference Operator
Sam Stoelinga
Sam Stoelinga

Posted on

3

Introducing KubeAI: Open AI Inference Operator

We recently launched KubeAI. The goal of KubeAI is to get LLMs, embedding models and Speech to text running on Kubernetes with ease.

KubeAI provides an OpenAI compatible API endpoint which makes it work out of the box with most software that works with the OpenAI APIs.

Repo on GitHub: substratusai/kubeai

Image description

When it comes to LLMs, KubeAI directly operates vLLM and Ollama servers in isolated Pods, configured and optimized on a model-by-model basis. You get metrics-based auto scaling out of the box (including scale-from-zero). When you hear scale-from-zero in Kubernetes-land you probably think Knative and Istio - but not in KubeAI! We made an early design decision to avoid any external dependencies (Kubernetes is complicated enough as-is).

We are hoping to release more functionality soon. Next up: model caching, metrics and dashboard.

If you need any help or have any feedback, reach out directly, here, or via the channels listed in the repo. We are currently making it our priority to assist the project’s early adopters. So far users have seen success in use cases ranging from processing large scale batches in the cloud to running lightweight inference at the edge.

API Trace View

Struggling with slow API calls?

Dan Mindru walks through how he used Sentry's new Trace View feature to shave off 22.3 seconds from an API call.

Get a practical walkthrough of how to identify bottlenecks, split tasks into multiple parallel tasks, identify slow AI model calls, and more.

Read more →

Top comments (1)

Collapse
 
nstogner profile image
Nick Stogner

Co-author here, happy to answer any questions!

A Workflow Copilot. Tailored to You.

Pieces.app image

Our desktop app, with its intelligent copilot, streamlines coding by generating snippets, extracting code from screenshots, and accelerating problem-solving.

Read the docs

👋 Kindness is contagious

Please leave a ❤️ or a friendly comment on this post if you found it helpful!

Okay