DEV Community

Alain Airom (Ayrom)


Securing the Agentic Mesh with Kagenti

Building a Local Kagenti Sandbox with Podman, Kind, Minikube, and Bob

Introduction

Until just a few days ago, Kagenti wasn’t even on my radar. That all changed when I stumbled upon an insightful IBM Technology video titled “Kagenti’s Approach to Multi-Agent Security for AI Agents”.

The video broke down a major operational hurdle in production AI: the “confused deputy” problem, where a delegated sub-agent can be tricked into abusing its authorization context to leak data or trigger malicious tool calls. Seeing how Kagenti addresses this at the infrastructure layer, using open standards like SPIFFE/SPIRE for cryptographic workload identities and AuthBridge delegation chains, completely hooked me. I knew right away that I needed to spin up an end-to-end sandbox environment to see it in action on my own hardware.

To bring this architecture to life locally, I mapped out a zero-configuration, lightweight container and Kubernetes stack. My building blocks consisted of:

  • Podman Desktop and the Podman Engine handling the container runtime,
  • Ollama running local LLMs to power the autonomous agentic logic,
  • and a combination of Kind and Minikube to orchestrate the local clusters.

Rather than wrestling with the deployment manifests manually, I defined my specific project rules and architectural guidelines and handed them over to my AI development partner, Bob. With the right constraints in place, Bob generated a complete, end-to-end implementation that successfully wired the entire platform together.

Before we dig into the technical side of the deployment manifests and local configurations, let’s take a step back and look at what Kagenti actually is, what it does, and why it is rapidly becoming the go-to middleware for enterprise AI engineering.


What is Kagenti?

Image from Kagenti GitHub

Kagenti (also evolving under the name Rosso) is an open-source, cloud-native middleware platform specifically engineered to deploy, secure, govern, and orchestrate AI agents inside Kubernetes environments. While developers have access to numerous frameworks for building AI applications (such as LangGraph, CrewAI, or AG2), there has historically been a significant operational gap in running that agentic code in production. Kagenti acts as the framework-neutral infrastructure layer that bridges this gap, turning AI agents into manageable, first-class Kubernetes workloads.

Excerpt from the official repository:
Why Kagenti?
Despite the extensive variety of frameworks available for developing agent-based applications (LangGraph, CrewAI, AG2, etc.), there is a distinct lack of standardized methods for deploying and operating agent code in production environments. Agents are adept at reasoning, planning, and interacting with tools, but their full potential is often limited by:

  • Deployment Complexity — Each framework requires custom deployment scripts and infrastructure
  • Security Gaps — No standardized approach to authentication, authorization, and workload identity
  • Protocol Fragmentation — Agents and tools use different communication patterns
  • Operational Overhead — Scaling, monitoring, and lifecycle management require custom solutions

Kagenti addresses these challenges by enhancing existing agent frameworks with production-ready, framework-neutral infrastructure.

Supported AI Use‑Case Types
Kagenti is designed to support a broad range of AI‑agent deployment patterns, including knowledge services, synchronous and asynchronous user‑authorized assistants, continuous monitoring agents, and event‑driven workflows.

The platform relies strictly on open, industry-converging standards. For communication, it leverages the Agent-to-Agent (A2A) protocol, allowing different AI agents to cleanly discover, collaborate, and delegate tasks to one another. For external tool integration, it relies on the Model Context Protocol (MCP), routing all tool calls (like interacting with databases, GitHub, or internal APIs) through an Envoy-backed MCP Gateway. To handle enterprise security, Kagenti incorporates a zero-trust model using SPIRE for workload identities and Keycloak for OAuth2 token validation, preventing common multi-agent security vulnerabilities like the “confused deputy” problem, where data permissions can accidentally leak along a chain of delegating agents.
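To make the “confused deputy” risk concrete, here is a minimal sketch of one common mitigation: a tool call at the end of a delegation chain is permitted only if every hop in the chain holds the required scope. This is an illustration under stated assumptions, not Kagenti's actual algorithm; the `Hop` type, the scope strings, and the actor names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hop:
    """One link in a delegation chain: who is acting and which scopes it holds."""
    actor: str
    scopes: frozenset


def effective_scopes(chain):
    """Scopes usable at the end of the chain: the intersection over every hop."""
    if not chain:
        return frozenset()
    allowed = chain[0].scopes
    for hop in chain[1:]:
        allowed = allowed & hop.scopes
    return allowed


def is_allowed(chain, required_scope):
    """A tool call is allowed only if every hop in the chain holds the scope."""
    return required_scope in effective_scopes(chain)


# Hypothetical chain: a read-only user delegates to a planner agent, which
# delegates to a worker agent that itself holds much broader permissions.
chain = [
    Hop("user:alice", frozenset({"tickets:read"})),
    Hop("agent:planner", frozenset({"tickets:read", "tickets:write"})),
    Hop("agent:db-worker", frozenset({"tickets:read", "tickets:write", "tickets:delete"})),
]

can_read = is_allowed(chain, "tickets:read")
can_delete = is_allowed(chain, "tickets:delete")
```

The worker agent holds `tickets:delete` on its own, but the chain started with a read-only user, so the delete request is refused. Without that intersection, the worker would be the confused deputy, using its own authority on behalf of a caller who never had it.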

How Is Kagenti Used?

Kagenti is used by treating AI agents and their tools exactly like traditional containerized microservices. Instead of learning an entirely new, proprietary AI dashboard, platform engineers manage the lifecycle of an agent declaratively using Kubernetes Custom Resource Definitions (CRDs). This means agents can be versioned in Git, rolled out via standard CI/CD and GitOps workflows, and managed from the command line with kubectl.

Operationally, the platform provides an Agent Development Kit (ADK) in both Python and Go to wrap existing agent logic without forcing developers to rewrite their code. When deployed, the Kagenti platform automatically injects critical runtime sidecars and shared services, including built-in LLM proxies (abstracting 15+ model providers), session memory backed by PostgreSQL, vector search capabilities via pgvector, and end-to-end tracing powered by OpenTelemetry and Phoenix. Administrators can monitor real-time execution steps, track token budgets, and view interaction histories directly from the Kagenti UI.


The Architecture of the Simple Implementation: Mapping Out the Sandbox


kagenti-hello-agent/
├── src/
│   └── hello_agent/
│       ├── __init__.py         # Package init
│       ├── agent.py            # A2A server entry point (AgentCard + routes + executor)
│       ├── llm.py              # Async OpenAI-compat LLM client + conversation memory
│       └── configuration.py   # Pydantic settings (env vars)
├── k8s/
│   └── deploy.yaml             # Namespace + Deployment + Service + AgentRuntime CR
├── scripts/
│   ├── setup-cluster.sh       # Resize Podman machine + start minikube + cert-manager + operator
│   ├── build-and-deploy.sh    # Build image → load into minikube → kubectl apply
│   ├── test-agent.sh          # Port-forward → health + AgentCard + JSON-RPC smoke tests
│   └── teardown.sh            # kubectl delete cleanup
├── Docs/
│   ├── Architecture.md        # Full Mermaid diagrams, platform stack
│   └── AccessGuide.md         # All port-forward commands, UIs, and observability options
├── Dockerfile                 # uv/Python 3.12 bookworm-slim, non-root UID 1001
├── pyproject.toml             # Python project + a2a-sdk>=1.1.0 + openai>=2.41 deps
└── .gitignore

The sample playground runs inside a dedicated local environment optimized for resources. The underlying infrastructure maps out a clean communication pathway from our local host directly into the cluster:

Browser
    │
    │  http://kagenti-ui.localtest.me:19080
    ▼
localhost:19080   ←── kubectl port-forward ──►  svc/http-istio-np :80 (kagenti-system)
                                                       │
                                          Istio Gateway (NodePort)
                                                       │
                              ┌─────────────────────────────────────────┐
                              │  HTTPRoute Hostname Matching            │
                              │  kagenti-ui.localtest.me → UI :8080     │
                              │  keycloak.localtest.me   → Keycloak:8080│
                              └─────────────────────────────────────────┘

The system is configured via localtest.me—a convenient domain that inherently resolves to 127.0.0.1 without forcing messy updates to your local /etc/hosts file. A single kubectl port-forward targeted at the Istio Ingress Gateway maps all of our services cleanly.
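Since both hostnames resolve to 127.0.0.1 and share the single forwarded port, the gateway has to pick a backend from the Host header alone. The sketch below simulates that hostname match in plain Python. It is not how Istio implements HTTPRoute matching, only an illustration; the routing table is copied from the diagram above and the `route` helper is hypothetical.

```python
from typing import Optional

# Routing table copied from the diagram; backends are plain labels here.
ROUTES = {
    "kagenti-ui.localtest.me": "UI:8080",
    "keycloak.localtest.me": "Keycloak:8080",
}


def route(host_header: str) -> Optional[str]:
    """Return the backend for a Host header, ignoring the port and letter case."""
    hostname = host_header.split(":", 1)[0].strip().lower()
    return ROUTES.get(hostname)


ui_backend = route("kagenti-ui.localtest.me:19080")
keycloak_backend = route("keycloak.localtest.me:19080")
unmatched = route("localhost:19080")
```

A request sent to plain `localhost:19080` matches no hostname rule, which is why the browser URL has to use the `localtest.me` names even though they point at the same loopback address.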

The Code: Building the kagenti-hello-agent

To keep things basic, the sample agent utilizes the official a2a-sdk to stand up a simple greeting assistant powered locally by Ollama (ibm/granite4:3b).

  • Configuration (configuration.py): We use Pydantic to cleanly bind environment strings to the running application. Notice how Bob points the llm_api_base to host.docker.internal so the containerized workload can speak directly back to Ollama running on our host machine.
# configuration.py
from pydantic_settings import BaseSettings


class Configuration(BaseSettings):
    """Agent configuration loaded from environment variables."""

    # LLM connection settings (compatible with Ollama / OpenAI-compatible endpoints)
    llm_model: str = "ibm/granite4:3b"
    llm_api_base: str = "http://host.docker.internal:11434/v1"
    llm_api_key: str = "dummy"

    # Server binding
    host: str = "0.0.0.0"
    port: int = 8000

    # Optional public endpoint override for the AgentCard URL
    agent_endpoint: str = ""
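To see how the environment overrides those defaults, here is a standard-library stand-in for the same precedence rule: a variable named after the field (upper-cased, since pydantic-settings matches names case-insensitively by default) wins over the default and is coerced to the field's type. `SketchConfig` and `load_config` are hypothetical names for illustration and cover only a subset of the real fields.

```python
import os
from dataclasses import dataclass, fields


@dataclass
class SketchConfig:
    """Stand-in for Configuration with a subset of its fields and defaults."""
    llm_model: str = "ibm/granite4:3b"
    llm_api_base: str = "http://host.docker.internal:11434/v1"
    port: int = 8000


def load_config(env=None) -> SketchConfig:
    """Build the config, letting FIELD_NAME in the environment override defaults."""
    env = os.environ if env is None else env
    overrides = {}
    for field in fields(SketchConfig):
        raw = env.get(field.name.upper())
        if raw is not None:
            # Coerce the string from the environment to the field's declared type.
            overrides[field.name] = field.type(raw)
    return SketchConfig(**overrides)


# Simulated environment, similar to the env entries a Deployment would set.
config = load_config({"PORT": "9000", "LLM_MODEL": "another-model"})
```

Here `config.port` becomes the integer 9000 and `config.llm_model` is overridden, while `llm_api_base` keeps its default because no matching variable was supplied.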
  • The LLM Client Wrapper (llm.py): Using the AsyncOpenAI client from the openai SDK, the core chat function keeps per-conversation history in memory, keyed by context_id, and logs each exchange so traces stay readable.
# llm.py
import logging
from collections import defaultdict

from openai import AsyncOpenAI

from hello_agent.configuration import Configuration

logger = logging.getLogger(__name__)

SYSTEM_PROMPT = (
    "You are a friendly and concise greeting assistant. "
    "Your purpose is to greet users warmly and provide short, helpful responses. "
    "Guidelines:\n"
    "- Always start with a personalised greeting.\n"
    "- Ask the user's name if you don't know it yet.\n"
    "- Be brief: keep every response under 3 sentences.\n"
    "- If asked about your capabilities, explain you are a greeting agent running on Kagenti.\n"
    "- Always end with an encouraging or friendly closing.\n"
)

# In-memory conversation history keyed by context_id.
# NOTE: In production, replace with a persistent store.
_conversations: dict[str, list[dict[str, str]]] = defaultdict(list)


async def chat(context_id: str, user_message: str) -> str:
    """Send a user message and get a response, preserving per-context history."""
    config = Configuration()

    client = AsyncOpenAI(
        base_url=config.llm_api_base,
        api_key=config.llm_api_key,
    )

    history = _conversations[context_id]
    history.append({"role": "user", "content": user_message})

    messages = [{"role": "system", "content": SYSTEM_PROMPT}] + history

    logger.info(
        "Sending %d messages to LLM (context=%s, model=%s)",
        len(messages),
        context_id,
        config.llm_model,
    )

    response = await client.chat.completions.create(
        model=config.llm_model,
        messages=messages,
    )

    # message.content can be None (for example on an empty completion), so fall back to "".
    reply = response.choices[0].message.content or ""
    history.append({"role": "assistant", "content": reply})

    logger.info("LLM reply (context=%s): %s", context_id, reply[:200])
    return reply
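One thing to note about `_conversations` is that the history for a long-lived context_id grows without bound, so every request sends a longer prompt. Short of the persistent store the code comment recommends, one possible stopgap is to cap each context at a fixed number of messages. The sketch below is an assumption about how that could look, not part of the generated project; `MAX_MESSAGES`, `_bounded`, and `remember` are hypothetical names.

```python
from collections import defaultdict, deque

MAX_MESSAGES = 4  # hypothetical cap; tune it to the model's context window

# Each context gets its own deque that silently drops the oldest message when full.
_bounded = defaultdict(lambda: deque(maxlen=MAX_MESSAGES))


def remember(context_id: str, role: str, content: str) -> list:
    """Append a message and return the retained history for this context."""
    history = _bounded[context_id]
    history.append({"role": role, "content": content})
    return list(history)


# Six messages in one context: only the most recent four are retained.
for i in range(6):
    retained = remember("ctx-a", "user", f"message {i}")

# A different context_id starts from an empty history.
other = remember("ctx-b", "user", "hello")
```

The system prompt is not stored in the history in llm.py, so a cap like this would never evict it.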
  • Serving via JSON-RPC over HTTP (agent.py): The A2A protocol runs over standard JSON-RPC. We spin up a Starlette web application and register the JSON-RPC routes with enable_v0_3_compat=True to meet Kagenti's platform communication needs.


# excerpt of agent.py
import os
import uvicorn
from starlette.applications import Starlette
from starlette.requests import Request
from starlette.responses import JSONResponse
from starlette.routing import Route
from a2a.server.routes import create_agent_card_routes, create_jsonrpc_routes

async def health(_: Request) -> JSONResponse:
    return JSONResponse({"status": "ok"})

def run() -> None:
    host = os.getenv("HOST", "0.0.0.0")
    port = int(os.getenv("PORT", "8000"))

    # ... Agent initialization omitted for brevity ...

    routes = [Route("/health", health, methods=["GET"])]
    routes.extend(create_jsonrpc_routes(handler, "/", enable_v0_3_compat=True))

    app = Starlette(routes=routes)
    uvicorn.run(app, host=host, port=port)
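To exercise the JSON-RPC route, a client needs to send a JSON-RPC 2.0 envelope. The sketch below only builds such a request body and sends nothing. The `message/send` method name and the `parts`, `messageId`, and `contextId` fields follow my reading of the A2A 0.3 specification, which the `enable_v0_3_compat=True` flag suggests the server accepts. Treat them as assumptions and check them against the SDK version you run; `build_send_request` is a hypothetical helper.

```python
import json
import uuid


def build_send_request(text: str, context_id=None) -> dict:
    """Build a JSON-RPC 2.0 envelope for an A2A 0.3-style message/send call."""
    message = {
        "role": "user",
        "parts": [{"kind": "text", "text": text}],
        "messageId": uuid.uuid4().hex,
    }
    if context_id is not None:
        # Reusing the same contextId lets the agent keep per-conversation history.
        message["contextId"] = context_id
    return {
        "jsonrpc": "2.0",
        "id": uuid.uuid4().hex,
        "method": "message/send",
        "params": {"message": message},
    }


payload = build_send_request("Hello, my name is Ada", context_id="demo-1")
body = json.dumps(payload).encode("utf-8")
```

With a port-forward to the agent's Service in place, `body` could be POSTed to the route mounted at "/" with a `Content-Type: application/json` header.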

Integrating with Kubernetes: The Manifests

How do we hand control of our microservice over to the Kagenti Operator? Through standard Custom Resource Definitions (CRDs). In our deploy.yaml, we attach critical platform labels and declare an AgentRuntime resource:

# ─────────────────────────────────────────────────────────────────────────────
# Deployment Manifests — Hello Kagenti Agent
#
# Deploys the Hello Kagenti Agent as a Kubernetes Deployment + Service,
# and registers it with the Kagenti Operator via an AgentRuntime CR.
#
# Prerequisites:
#   - Kagenti Operator installed on the cluster
#   - Namespace "team1" exists  (kubectl create namespace team1)
#   - Image built and loaded into minikube (see scripts/build-and-deploy.sh)
# ─────────────────────────────────────────────────────────────────────────────

# ── Namespace ────────────────────────────────────────────────────────────────
apiVersion: v1
kind: Namespace
metadata:
  name: team1

---
# ── Deployment ───────────────────────────────────────────────────────────────
apiVersion: apps/v1
kind: Deployment
metadata:
  name: hello-kagenti-agent
  namespace: team1
  labels:
    app.kubernetes.io/name: hello-kagenti-agent
    app.kubernetes.io/version: "1.0.0"
    app.kubernetes.io/component: agent
    # Required by Kagenti Operator for agent discovery
    protocol.kagenti.io/a2a: ""
    kagenti.io/framework: "a2a-sdk"
spec:
  replicas: 1
  selector:
    matchLabels:
      app.kubernetes.io/name: hello-kagenti-agent
  template:
    metadata:
      labels:
        app.kubernetes.io/name: hello-kagenti-agent
        app.kubernetes.io/component: agent
        # Kagenti injects its AuthBridge sidecar on pods with this label
        kagenti.io/type: agent
    spec:
      containers:
        - name: hello-agent
          image: localhost/hello-kagenti-agent:1.0.0
          imagePullPolicy: Never          # Image loaded directly into minikube
          ports:
            - name: http
              containerPort: 8000
              protocol: TCP
          env:
            - name: HOST
              value: "0.0.0.0"
            - name: PORT
              value: "8000"
            # ── LLM connection ──────────────────────────────────────────────
            # Points to Ollama running on the host machine.
            # host.docker.internal resolves to the host from inside a pod.
            - name: LLM_API_BASE
              value: "http://host.docker.internal:11434/v1"
            - name: LLM_API_KEY
              value: "dummy"
            - name: LLM_MODEL
              value: "ibm/granite4:3b"
            # ── Agent card public URL ───────────────────────────────────────
            # Leave empty to use the container's host:port.
            # Override with the externally reachable URL when using Ingress.
            - name: AGENT_ENDPOINT
              value: ""
          livenessProbe:
            httpGet:
              path: /health
              port: http
            initialDelaySeconds: 10
            periodSeconds: 30
            failureThreshold: 3
          readinessProbe:
            httpGet:
              path: /health
              port: http
            initialDelaySeconds: 5
            periodSeconds: 10
            failureThreshold: 5
          resources:
            requests:
              memory: "128Mi"
              cpu: "100m"
            limits:
              memory: "256Mi"
              cpu: "500m"

---
# ── Service ───────────────────────────────────────────────────────────────────
apiVersion: v1
kind: Service
metadata:
  name: hello-kagenti-agent
  namespace: team1
  labels:
    app.kubernetes.io/name: hello-kagenti-agent
    kagenti.io/type: agent
spec:
  type: ClusterIP
  selector:
    app.kubernetes.io/name: hello-kagenti-agent
  ports:
    - name: http
      port: 8000
      targetPort: 8000
      protocol: TCP

---
# ── AgentRuntime CR ───────────────────────────────────────────────────────────
# Registers the Deployment with the Kagenti Operator.
# The operator will:
#   1. Apply kagenti.io/type: agent labels
#   2. Compute a config hash and trigger a rolling update
#   3. Inject the AuthBridge sidecar (if SPIRE is enabled)
#   4. Create an AgentCard for automatic discovery
apiVersion: agent.kagenti.dev/v1alpha1
kind: AgentRuntime
metadata:
  name: hello-kagenti-agent-runtime
  namespace: team1
spec:
  type: agent
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: hello-kagenti-agent
When this AgentRuntime resource is applied, the Kagenti Operator triggers a rolling update, registers our target service inside Keycloak, calculates configuration hashes, and hooks our pod cleanly into the zero-trust system without changing a single line of our Python logic.
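The wiring between the Deployment and the AgentRuntime rests on a few labels and a name match, so a typo fails silently. As a rough pre-flight check, the sketch below validates only the conventions visible in this manifest, using plain dictionaries trimmed from it. It is not an official Kagenti validator, `preflight` is a hypothetical helper, and the operator may enforce more or fewer rules than these.

```python
def preflight(deployment: dict, runtime: dict) -> list:
    """Return human-readable problems; an empty list means the pair looks consistent."""
    problems = []
    meta = deployment.get("metadata", {})
    labels = meta.get("labels", {})
    template = deployment.get("spec", {}).get("template", {})
    pod_labels = template.get("metadata", {}).get("labels", {})

    if "protocol.kagenti.io/a2a" not in labels:
        problems.append("Deployment is missing the protocol.kagenti.io/a2a label")
    if pod_labels.get("kagenti.io/type") != "agent":
        problems.append("Pod template is missing kagenti.io/type: agent")

    ref = runtime.get("spec", {}).get("targetRef", {})
    if (ref.get("kind"), ref.get("name")) != ("Deployment", meta.get("name")):
        problems.append("AgentRuntime targetRef does not point at this Deployment")
    if runtime.get("metadata", {}).get("namespace") != meta.get("namespace"):
        problems.append("AgentRuntime and Deployment are in different namespaces")
    return problems


# Dictionaries trimmed from deploy.yaml above.
deployment = {
    "metadata": {
        "name": "hello-kagenti-agent",
        "namespace": "team1",
        "labels": {"protocol.kagenti.io/a2a": "", "kagenti.io/framework": "a2a-sdk"},
    },
    "spec": {"template": {"metadata": {"labels": {"kagenti.io/type": "agent"}}}},
}
runtime = {
    "metadata": {"name": "hello-kagenti-agent-runtime", "namespace": "team1"},
    "spec": {
        "type": "agent",
        "targetRef": {"apiVersion": "apps/v1", "kind": "Deployment", "name": "hello-kagenti-agent"},
    },
}

problems = preflight(deployment, runtime)
```

For the manifest as written, `problems` comes back empty. Renaming the Deployment without updating `targetRef.name` would surface as a single entry in the list.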

Bringing It All Together

Want to fire this up on your machine? The installation lifecycle maps to a quick series of automation steps:

# 1. Start your Podman machine (resize it first if it is short on RAM; scripts/setup-cluster.sh handles this)
podman machine start

# 2. Fire up minikube leveraging the podman driver
minikube start --profile minikube-1 --driver=podman --memory=7500 --cpus=4

# 3. Build the image, load it into minikube, and apply your manifests
cd kagenti-hello-agent
./scripts/build-and-deploy.sh

# 4. Open up your connection gateway
kubectl port-forward -n kagenti-system svc/http-istio-np 19080:80 &

Once executed, navigate your browser over to http://kagenti-ui.localtest.me:19080 to log into your brand new dashboard and see Bob's work running natively inside your infrastructure layer. Secure, observable, and fully scalable.


Conclusion: When to Deploy Kagenti

Implementing this local sandbox demonstrates that Kagenti is a crucial infrastructure and control plane layer designed to bring order, governance, and enterprise-grade security to multi-agent workloads.

When to Use Kagenti

Consider Kagenti for your architecture when an AI initiative needs to scale and meets one or more of these criteria:

  • Multi-Agent Collaboration: We run multiple interconnected agents that need to seamlessly discover one another, communicate via standard protocols (A2A), and delegate sub-tasks safely without leaking authorization states.
  • Strict Security & Compliance Requirements: We operate in regulated industries (like healthcare, finance, or public sector platform engineering) where vulnerabilities such as the confused deputy problem must be prevented.
  • Kubernetes/GitOps Alignment: The engineering organization is heavily anchored in cloud-native ecosystems.
  • Framework Agnosticism: We want to avoid vendor lock-in. Kagenti allows development teams to build agents using their favorite SDKs (LangGraph, CrewAI, AutoGen, or custom Python libraries) while guaranteeing uniform platform-level routing, rate limiting, and observability.

What Has Been Achieved by Bob? :)

By constructing this end-to-end sandbox environment, Bob successfully established a secure blueprint for local AI development:

  • Zero-Configuration Routing: Leveraged local ingress topologies (localtest.me) to cleanly expose Kagenti's full UI and backend stack out of an isolated Minikube cluster using standard port-forwards.
  • Local LLM Isolation: Configured a containerized Python agent to tap back into the host machine’s hardware layer via Ollama, ensuring heavy model inference remains fully local, private, and zero-cost.
  • Declarative Lifecycle Management: Wired the hello-kagenti-agent directly to the Kagenti Operator using an AgentRuntime CRD. The platform automatically takes care of proxy injection, API exposure, and Keycloak client registration without requiring manual plumbing or modification to the core agent logic.

>>> Thanks for reading <<<
