I Built an Open-Source Tool for Debugging Kubernetes Agentically

Adam Dickinson — Wed, 26 Nov 2025 00:34:56 +0000

TL;DR

Built an open-source tool called Kubently that lets you troubleshoot Kubernetes clusters through natural conversation with any major LLM. ~50ms command delivery, read-only by default, works on any K8s cluster (EKS, GKE, AKS, bare metal), multi-cluster from day one.

Docs: https://kubently.io
GitHub: https://github.com/kubently/kubently

The Problem

If you've spent any time debugging Kubernetes, you know the drill:

kubectl get pods -n production
kubectl describe pod some-pod-name-7f8b9c6d5-x2k4m -n production
kubectl logs some-pod-name-7f8b9c6d5-x2k4m -n production
kubectl get events -n production --sort-by='.lastTimestamp'
# repeat forever

The output is verbose. The debugging is manual. You're constantly context-switching between terminal, docs, and whatever monitoring tool you're using. Now multiply that by however many clusters you're managing across different providers.

I've been working with agentic systems through my involvement with CAIPE (Cloud Native AI Platform Engineering), and one thing became obvious: about half the time, agents debug faster than I can. They don't get tired, they don't forget to check events, and they're pretty good at pattern matching across large outputs.

So I built something around that.

What is Kubently?

Kubently is an open-source tool for troubleshooting Kubernetes agentically (hence the name - Kubernetes + agentically). It lets you debug clusters through natural conversation with any major LLM.

Instead of running kubectl commands manually, you describe the problem and let an agent do the investigation:

"The frontend pods in production keep restarting - can you figure out what's going on?"

The agent runs the commands, analyzes the output, and walks through the debugging process systematically.

Architecture

The system has three core components:

Kubently API - A FastAPI service that handles command distribution, session management, and A2A (Agent-to-Agent) communication. It scales horizontally using Redis pub/sub.

Kubently Executor - A lightweight agent deployed in each target cluster. It's the only component that needs cluster access, and it's read-only by default with configurable RBAC rules.

LLM Integration - Supports multiple providers through a factory pattern. Works with whatever LLM setup you're running.
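To make the factory pattern concrete, here is a minimal sketch of a provider registry. It is not Kubently's actual code: LLMClient, EchoClient, register_provider, and create_llm are hypothetical names invented for illustration, and the stand-in provider just echoes the prompt so the example runs offline.

```python
from typing import Callable, Dict, Protocol


class LLMClient(Protocol):
    """Hypothetical interface that every provider is assumed to satisfy."""

    def complete(self, prompt: str) -> str: ...


class EchoClient:
    """Stand-in provider (assumption) so the sketch needs no network or API key."""

    def __init__(self, name: str) -> None:
        self.name = name

    def complete(self, prompt: str) -> str:
        return f"[{self.name}] {prompt}"


# Registry mapping a lowercase provider name to a zero-argument builder.
_PROVIDERS: Dict[str, Callable[[], LLMClient]] = {}


def register_provider(name: str, builder: Callable[[], LLMClient]) -> None:
    """Make a provider available under the given name."""
    _PROVIDERS[name.lower()] = builder


def create_llm(name: str) -> LLMClient:
    """Factory entry point: look up the builder by name and construct a client."""
    try:
        return _PROVIDERS[name.lower()]()
    except KeyError:
        raise ValueError(f"unknown LLM provider: {name!r}") from None


# Placeholder provider names; real ones would wrap actual SDK clients.
register_provider("provider-a", lambda: EchoClient("provider-a"))
register_provider("provider-b", lambda: EchoClient("provider-b"))
```

With a registry like this, switching providers is a one-string change in configuration, and the rest of the code only ever sees the LLMClient interface.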

Command delivery happens via Server-Sent Events (SSE) with ~50ms latency, which is fast enough that conversations feel responsive.
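For readers who haven't worked with SSE, the sketch below shows what a command could look like on the wire in the standard text/event-stream format, plus a minimal parser for the receiving side. The event name "command" and the payload fields are assumptions made for illustration; they are not taken from Kubently's actual protocol.

```python
import json
from typing import Dict, Iterator, List, Optional


def format_sse(data: dict, event: Optional[str] = None) -> str:
    """Encode one event in the text/event-stream wire format."""
    lines: List[str] = []
    if event is not None:
        lines.append(f"event: {event}")
    # json.dumps produces a single line by default, so one data: field is enough.
    lines.append(f"data: {json.dumps(data)}")
    # A blank line terminates the event.
    return "\n".join(lines) + "\n\n"


def parse_sse(stream: str) -> Iterator[Dict[str, str]]:
    """Minimal parser: handles only the event: and data: fields."""
    event, data = "message", []
    for line in stream.split("\n"):
        if line == "":
            # Blank line: dispatch the buffered event, if any, and reset.
            if data:
                yield {"event": event, "data": "\n".join(data)}
            event, data = "message", []
            continue
        field, _, value = line.partition(":")
        if value.startswith(" "):
            value = value[1:]  # drop the single optional space after the colon
        if field == "event":
            event = value
        elif field == "data":
            data.append(value)


# Hypothetical command payload, for illustration only.
frame = format_sse({"id": "cmd-1", "args": ["get", "pods"]}, event="command")
received = list(parse_sse(frame))
```

Because SSE is a long-lived HTTP response that the receiver opens, the receiving side only needs an outbound connection and new commands arrive without polling.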

Key Design Decisions

Read-Only by Default

Security was a priority from day one. The executor only runs read operations unless explicitly configured otherwise. No accidental kubectl delete from an overeager agent.
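One way to picture the read-only guard is a verb allowlist checked before anything is executed. This is a sketch under stated assumptions, not the executor's real implementation: READ_ONLY_VERBS and is_read_only are hypothetical names, and the verb list is an illustrative subset. In practice a check like this would complement, not replace, Kubernetes RBAC on the executor's service account.

```python
import shlex

# Illustrative subset of kubectl verbs that do not modify cluster state (assumption).
READ_ONLY_VERBS = frozenset(
    {"get", "describe", "logs", "top", "explain", "api-resources"}
)

# Characters that could chain or redirect commands if a shell were involved.
_SHELL_METACHARS = set(";|&`$<>\n")


def is_read_only(command: str) -> bool:
    """Return True only if the command's verb is on the read-only allowlist.

    Deliberately conservative: the verb must come immediately after an
    optional leading "kubectl", so commands with leading global flags are
    rejected rather than guessed at.
    """
    if any(ch in _SHELL_METACHARS for ch in command):
        return False
    try:
        tokens = shlex.split(command)
    except ValueError:
        return False  # e.g. unbalanced quotes
    if tokens[:1] == ["kubectl"]:
        tokens = tokens[1:]
    return bool(tokens) and tokens[0] in READ_ONLY_VERBS
```

For example, is_read_only("kubectl get pods -n production") passes, while "kubectl delete pod web-1" and "kubectl get pods; kubectl delete pod web-1" are both rejected. Rejecting anything ambiguous keeps the failure mode safe: a refused command costs a retry, not a deleted workload.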

Cloud Agnostic

Runs on any Kubernetes cluster - EKS, GKE, AKS, bare metal, k3s, whatever. If kubectl works, Kubently works.

Multi-Cluster Native

This wasn't an afterthought. Deploy an executor to each cluster, manage them all from a single API. The architecture assumes you're running multiple clusters because most teams are.
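The single-API, many-executors idea can be sketched as a router that keeps one command queue per cluster. This is an in-memory stand-in for illustration only: CommandRouter and its methods are hypothetical names, the cluster IDs are made up, and a real deployment would put a broker between API instances rather than a Python dict.

```python
from queue import Empty, Queue
from typing import Dict, Optional


class CommandRouter:
    """Routes commands to per-cluster queues (in-memory stand-in, not Kubently's code)."""

    def __init__(self) -> None:
        self._queues: Dict[str, Queue] = {}

    def register(self, cluster_id: str) -> None:
        """Called when an executor for the given cluster connects."""
        self._queues.setdefault(cluster_id, Queue())

    def dispatch(self, cluster_id: str, command: str) -> None:
        """Queue a command for exactly one cluster; unknown clusters are an error."""
        if cluster_id not in self._queues:
            raise KeyError(f"no executor registered for cluster {cluster_id!r}")
        self._queues[cluster_id].put(command)

    def next_command(self, cluster_id: str) -> Optional[str]:
        """Return the next pending command for a cluster, or None if idle."""
        try:
            return self._queues[cluster_id].get_nowait()
        except Empty:
            return None


router = CommandRouter()
router.register("eks-prod")      # hypothetical cluster IDs
router.register("gke-staging")
router.dispatch("eks-prod", "get pods -n production")
```

Keying everything by cluster ID means a command meant for one cluster can never land on another, and adding a cluster is just one more registration.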

A2A Protocol Support

Native support for Agent-to-Agent communication means Kubently integrates with existing agentic systems. If you're already running something like CAIPE or using LangGraph/LangChain, Kubently slots in as a specialized Kubernetes debugging agent.

Black Box Architecture

Built with swappability in mind. Want to change the LLM provider? Swap it out. Want different agent logic? The interfaces are clean. I have a lot of ideas for future improvements and didn't want to paint myself into a corner.

What's Next

This is still early. There's plenty of room for improvement:

Better conversation memory and context handling
More sophisticated debugging strategies
Enhanced multi-cluster workflows
Improved observability integration

But it's functional and useful as-is. I've been using it for my own clusters and it's already saved me time.

Try It Out

Docs: https://kubently.io
GitHub: https://github.com/kubently/kubently

Feedback, bug reports, and feature requests are all welcome. If you find it useful, a star on GitHub helps with visibility.

Happy to answer questions in the comments.

DEV Community: Adam Dickinson