<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Michael Levan</title>
    <description>The latest articles on DEV Community by Michael Levan (@thenjdevopsguy).</description>
    <link>https://dev.to/thenjdevopsguy</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F332370%2F8c2bbfb4-514c-47e4-bee2-a2053d30ed5b.jpeg</url>
      <title>DEV Community: Michael Levan</title>
      <link>https://dev.to/thenjdevopsguy</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/thenjdevopsguy"/>
    <language>en</language>
    <item>
      <title>Managing MCP Servers and Tools With Agentregistry OSS</title>
      <dc:creator>Michael Levan</dc:creator>
      <pubDate>Sat, 04 Apr 2026 22:38:18 +0000</pubDate>
      <link>https://dev.to/thenjdevopsguy/managing-mcp-servers-and-tools-with-agentregistry-oss-4iga</link>
      <guid>https://dev.to/thenjdevopsguy/managing-mcp-servers-and-tools-with-agentregistry-oss-4iga</guid>
      <description>&lt;p&gt;Three big topics when it comes to MCP:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;How do you know the MCP Server is secure?&lt;/li&gt;
&lt;li&gt;Where is it stored?&lt;/li&gt;
&lt;li&gt;Is it version-controlled, or can anyone just change it at any time?&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;And that's where having an MCP registry comes into play.&lt;/p&gt;

&lt;p&gt;In this blog post, you'll learn how to securely store your MCP Server and its available tools so they can be used later within your Agents.&lt;/p&gt;

&lt;h2&gt;
  
  
  Quick Recap: What Is MCP?
&lt;/h2&gt;

&lt;p&gt;Model Context Protocol (MCP) is a spec/standard created and open-sourced by Anthropic. The goal of MCP is to have a server that hosts tools, and these tools implement certain functionality for what you're working on. For example, you can use a Kubernetes MCP Server that can do everything from listing, describing, and pulling logs for Pods to deploying objects to Kubernetes. MCP uses JSON-RPC 2.0 as its communication layer underneath the hood between an Agent (the client) and MCP tools (hosted on the server).&lt;/p&gt;
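
&lt;p&gt;To make the JSON-RPC 2.0 layer a bit more concrete, here is a minimal Python sketch of what a tool call from a client might look like on the wire. The &lt;code&gt;tools/call&lt;/code&gt; method name comes from the MCP spec; the tool name &lt;code&gt;pods_list&lt;/code&gt; and its arguments are hypothetical placeholders, not taken from any specific MCP Server.&lt;/p&gt;

```python
import json


def build_tool_call(request_id, tool_name, arguments):
    """Build a JSON-RPC 2.0 request that asks an MCP server to run a tool."""
    return {
        "jsonrpc": "2.0",        # protocol version required by JSON-RPC 2.0
        "id": request_id,        # lets the client match the response to this request
        "method": "tools/call",  # MCP method for invoking a tool
        "params": {"name": tool_name, "arguments": arguments},
    }


# Hypothetical tool name and arguments, for illustration only.
request = build_tool_call(1, "pods_list", {"namespace": "default"})
print(json.dumps(request, indent=2))
```

&lt;p&gt;The Agent only needs to know the tool name and its arguments; whatever APIs the tool calls stay hidden behind the server.&lt;/p&gt;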

&lt;h3&gt;
  
  
  The "Is MCP Dead" Debate
&lt;/h3&gt;

&lt;p&gt;I was at MCPDevSummit in NY this week, and I caught a keynote that explained the need for MCP Server tools pretty nicely from a theoretical perspective. Right now, it may be easier for Agents to talk to MCP Server tools than to have them talk to tens or hundreds of APIs directly. The reason is that it's simpler for an Agent to call a tool and have that tool (because a tool, underneath the hood, is simply a function/method) call the APIs instead. What this could come down to is fewer tokens used and less context bloat, along with, hopefully, better results.&lt;/p&gt;

&lt;h2&gt;
  
  
  Configuring Agentregistry Locally
&lt;/h2&gt;

&lt;p&gt;With an understanding of what MCP is at a high level, let's dive into the hands-on portion of this blog post. In this section, you'll get agentregistry deployed, which takes around 30 seconds.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Pull down the latest version of agentregistry.
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;curl -fsSL https://raw.githubusercontent.com/agentregistry-dev/agentregistry/main/scripts/get-arctl | bash
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ol&gt;
&lt;li&gt;Run the following command, which starts the agentregistry daemon.
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt; arctl daemon start
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You'll see an output similar to the following:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Starting agentregistry daemon...
✓ agentregistry daemon started successfully
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ol&gt;
&lt;li&gt;Open Docker and you'll see agentregistry running along with a link you can click to reach the UI.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2o2r4btqcxw1leuo18b4.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2o2r4btqcxw1leuo18b4.png" alt=" " width="800" height="145"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;You should now see the agentregistry UI.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9ovggf398k1qhfgzwqrt.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9ovggf398k1qhfgzwqrt.png" alt=" " width="800" height="502"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Sidenote: if you have a remote registry, you can connect to it with the following:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;arctl --registry-url http://YOUR-HOST:12121 version
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Adding An MCP Server To Agentregistry
&lt;/h2&gt;

&lt;p&gt;With agentregistry deployed, you can now add an MCP Server to the registry to ensure it's stored and secured. For testing purposes, let's use the filesystem MCP Server that's stored on GitHub.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Using &lt;code&gt;arctl mcp publish&lt;/code&gt;, you'll pass in the following values:
&lt;ol&gt;
&lt;li&gt;MCP Server: server-filesystem&lt;/li&gt;
&lt;li&gt;Type: NPM&lt;/li&gt;
&lt;li&gt;Version: 0.6.3&lt;/li&gt;
&lt;/ol&gt;
&lt;/li&gt;

&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;arctl mcp publish io.github.modelcontextprotocol/server-filesystem --type npm \
  --package-id @modelcontextprotocol/server-filesystem --version 0.6.3 \
  --description 'MCP server for filesystem access' \
  --git https://github.com/modelcontextprotocol/servers.git -v
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The MCP Server will now show in your registry.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0le5e7xcxkaqqwd8dw67.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0le5e7xcxkaqqwd8dw67.png" alt=" " width="800" height="452"&gt;&lt;/a&gt;&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxe8777eysnt9hbrekdq5.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxe8777eysnt9hbrekdq5.png" alt=" " width="800" height="501"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;You can also add your MCP Server via the UI.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Click the purple &lt;strong&gt;+ Add&lt;/strong&gt; button and choose &lt;strong&gt;Server&lt;/strong&gt;.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvrco0sa7biqz3v7cip0d.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvrco0sa7biqz3v7cip0d.png" alt=" " width="678" height="666"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Add in the details about your MCP Server.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fuiax24owam8a047oxluc.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fuiax24owam8a047oxluc.png" alt=" " width="800" height="536"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;Having a safe, secure, and reliable place to store something as prone to security incidents as MCP Servers is key to creating a proper posture for you and your organization when using AI. This is why agentregistry can also be used to store Agent Skills and prompts. Because the majority of what you're using is either a function/method (an MCP Server tool) or .MD files/text files, shadow AI can easily occur.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>programming</category>
      <category>kubernetes</category>
      <category>agents</category>
    </item>
    <item>
      <title>Running OpenClaw on Kubernetes</title>
      <dc:creator>Michael Levan</dc:creator>
      <pubDate>Sat, 14 Mar 2026 17:00:08 +0000</pubDate>
      <link>https://dev.to/thenjdevopsguy/running-openclaw-on-kubernetes-57ki</link>
      <guid>https://dev.to/thenjdevopsguy/running-openclaw-on-kubernetes-57ki</guid>
      <description>&lt;p&gt;The "new and exciting" way of interacting with Agents is the personal assistant method (from iMessage, WhatsApp, or whatever else) and this interest is taking the industry by storm. OpenAI "bought" OpenClaw, Nvidia is investing in its own version of personal assistants, and several other organizations are trying to figure out how to implement this in production.&lt;/p&gt;

&lt;p&gt;The question is - does it run on your infra?&lt;/p&gt;

&lt;p&gt;In this blog post, you'll learn how to implement OpenClaw in Kubernetes and observe/secure it with agentgateway.&lt;/p&gt;

&lt;h2&gt;
  
  
  Prerequisites
&lt;/h2&gt;

&lt;p&gt;To follow along with this blog post from a hands-on perspective, you will need the following:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;A Kubernetes cluster running with at least 2 vCPUs and 2–4 GB RAM, though 8 GB RAM and higher are recommended for smoother performance.&lt;/li&gt;
&lt;li&gt;agentgateway installed, which you can find &lt;a href="https://agentgateway.dev/docs/kubernetes/main/quickstart/install/" rel="noopener noreferrer"&gt;here&lt;/a&gt;.&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  Containerization Setup
&lt;/h2&gt;

&lt;p&gt;There are two different ways that you can use OpenClaw in a containerized fashion:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Build your own container image. There's a Dockerfile in the OpenClaw repo which you can find &lt;a href="https://github.com/openclaw/openclaw/blob/main/docker-setup.sh" rel="noopener noreferrer"&gt;here&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;Use a container image that was already built. There is an official Alpine image which you can find &lt;a href="https://hub.docker.com/r/alpine/openclaw" rel="noopener noreferrer"&gt;here&lt;/a&gt;.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The first option will, of course, be the most secure, as you can build the container image yourself and ensure you know what is within the Dockerfile. In air-gapped environments, this would be the ideal setup.&lt;/p&gt;

&lt;h2&gt;
  
  
  Agentgateway Config To Observe and Secure Agentic Traffic
&lt;/h2&gt;

&lt;p&gt;Once the containerization setup is complete, you can begin the agentgateway setup.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Create a Gateway using the Kubernetes Gateway API CRDs and the agentgateway Gateway class.
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="s"&gt;kubectl apply -f- &amp;lt;&amp;lt;EOF&lt;/span&gt;
&lt;span class="na"&gt;kind&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Gateway&lt;/span&gt;
&lt;span class="na"&gt;apiVersion&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;gateway.networking.k8s.io/v1&lt;/span&gt;
&lt;span class="na"&gt;metadata&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;agentgateway-oc&lt;/span&gt;
  &lt;span class="na"&gt;namespace&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;agentgateway-system&lt;/span&gt;
  &lt;span class="na"&gt;labels&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;app&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;agentgateway&lt;/span&gt;
&lt;span class="na"&gt;spec&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;gatewayClassName&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;agentgateway&lt;/span&gt;
  &lt;span class="na"&gt;listeners&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;protocol&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;HTTP&lt;/span&gt;
    &lt;span class="na"&gt;port&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;8081&lt;/span&gt;
    &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;http&lt;/span&gt;
    &lt;span class="na"&gt;allowedRoutes&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;namespaces&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;from&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;All&lt;/span&gt;
&lt;span class="s"&gt;EOF&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ol&gt;
&lt;li&gt;Set an env variable with an Anthropic API key
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;ANTHROPIC_API_KEY&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ol&gt;
&lt;li&gt;Create a Kubernetes Secret with the API key.
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="s"&gt;kubectl apply -f- &amp;lt;&amp;lt;EOF&lt;/span&gt;
&lt;span class="na"&gt;apiVersion&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;v1&lt;/span&gt;
&lt;span class="na"&gt;kind&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Secret&lt;/span&gt;
&lt;span class="na"&gt;metadata&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;anthropic-secret&lt;/span&gt;
  &lt;span class="na"&gt;namespace&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;agentgateway-system&lt;/span&gt;
  &lt;span class="na"&gt;labels&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;app&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;agentgateway-oc&lt;/span&gt;
&lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Opaque&lt;/span&gt;
&lt;span class="na"&gt;stringData&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;Authorization&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;$ANTHROPIC_API_KEY&lt;/span&gt;
&lt;span class="s"&gt;EOF&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ol&gt;
&lt;li&gt;Create an agentgatewaybackend, which tells the Gateway what to route to. In this case, it's using Anthropic as the LLM Provider.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;strong&gt;Please note:&lt;/strong&gt; When using an &lt;code&gt;ai&lt;/code&gt; agentgatewaybackend with the Anthropic provider, agentgateway attempts to parse and re-marshal the request body as a structured LLM message, which fails on OpenClaw's native Anthropic format due to a missing type field in complex message content. Switching to a static backend pointing directly at &lt;code&gt;api.anthropic.com:443&lt;/code&gt; tells agentgateway to forward the request as-is without any LLM-specific processing, while still providing routing, observability, and logging on all traffic. The &lt;code&gt;tls: {}&lt;/code&gt; policy is required because &lt;code&gt;api.anthropic.com&lt;/code&gt; listens on HTTPS (port 443), and without it, agentgateway sends plain HTTP, which Cloudflare rejects.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="s"&gt;kubectl apply -f- &amp;lt;&amp;lt;EOF&lt;/span&gt;
&lt;span class="na"&gt;apiVersion&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;agentgateway.dev/v1alpha1&lt;/span&gt;
&lt;span class="na"&gt;kind&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;AgentgatewayBackend&lt;/span&gt;
&lt;span class="na"&gt;metadata&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;labels&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;app&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;agentgateway-oc&lt;/span&gt;
  &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;anthropic&lt;/span&gt;
  &lt;span class="na"&gt;namespace&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;agentgateway-system&lt;/span&gt;
&lt;span class="na"&gt;spec&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;static&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;host&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;api.anthropic.com&lt;/span&gt;
    &lt;span class="na"&gt;port&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;443&lt;/span&gt;
  &lt;span class="na"&gt;policies&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;tls&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;{}&lt;/span&gt;
&lt;span class="s"&gt;EOF&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ol&gt;
&lt;li&gt;Create a route that points to the path &lt;code&gt;/v1/messages&lt;/code&gt;, which is the format that Anthropic expects.
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="s"&gt;kubectl apply -f- &amp;lt;&amp;lt;EOF&lt;/span&gt;
&lt;span class="na"&gt;apiVersion&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;gateway.networking.k8s.io/v1&lt;/span&gt;
&lt;span class="na"&gt;kind&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;HTTPRoute&lt;/span&gt;
&lt;span class="na"&gt;metadata&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;ocroute&lt;/span&gt;
  &lt;span class="na"&gt;namespace&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;agentgateway-system&lt;/span&gt;
&lt;span class="na"&gt;spec&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;parentRefs&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;agentgateway-oc&lt;/span&gt;
      &lt;span class="na"&gt;namespace&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;agentgateway-system&lt;/span&gt;
  &lt;span class="na"&gt;rules&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;matches&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;path&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;PathPrefix&lt;/span&gt;
        &lt;span class="na"&gt;value&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;/v1/messages&lt;/span&gt;
    &lt;span class="na"&gt;backendRefs&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;anthropic&lt;/span&gt;
      &lt;span class="na"&gt;namespace&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;agentgateway-system&lt;/span&gt;
      &lt;span class="na"&gt;group&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;agentgateway.dev&lt;/span&gt;
      &lt;span class="na"&gt;kind&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;AgentgatewayBackend&lt;/span&gt;
&lt;span class="s"&gt;EOF&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
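
&lt;p&gt;As a rough way to picture what travels through this route, the Python sketch below builds (but does not send) an Anthropic Messages API request aimed at the Gateway listener instead of &lt;code&gt;api.anthropic.com&lt;/code&gt;. Port 8081 and the &lt;code&gt;/v1/messages&lt;/code&gt; path come from the manifests above; the host &lt;code&gt;agentgateway.example.internal&lt;/code&gt;, the API key, and the model ID are placeholders you would need to replace.&lt;/p&gt;

```python
import json
import urllib.request

# Placeholder host; replace with the hostname or IP of your Gateway.
GATEWAY_URL = "http://agentgateway.example.internal:8081"


def build_messages_request(api_key, model, prompt):
    """Build a POST to /v1/messages that goes through agentgateway."""
    body = {
        "model": model,
        "max_tokens": 256,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        GATEWAY_URL + "/v1/messages",  # matches the HTTPRoute PathPrefix
        data=json.dumps(body).encode("utf-8"),
        headers={
            "content-type": "application/json",
            "x-api-key": api_key,               # Anthropic API key header
            "anthropic-version": "2023-06-01",  # Anthropic API version header
        },
        method="POST",
    )


# Placeholder key and model ID, for illustration only.
req = build_messages_request("YOUR_API_KEY", "YOUR_MODEL_ID", "Hello")
print(req.get_method(), req.full_url)
```

&lt;p&gt;Passing &lt;code&gt;req&lt;/code&gt; to &lt;code&gt;urllib.request.urlopen&lt;/code&gt; would send it, which only works once the Gateway is reachable from wherever the script runs.&lt;/p&gt;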



&lt;h2&gt;
  
  
  Deploy OpenClaw On Kubernetes
&lt;/h2&gt;

&lt;p&gt;With the Gateway deployed, let's set up OpenClaw.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Please note&lt;/strong&gt;: This deployment is just for testing and does not include anything for persistent volumes for data that is not ephemeral. If you want that configuration, you can create a PVC and mount it on &lt;code&gt;/home/node/.openclaw&lt;/code&gt; and &lt;code&gt;/home/node/workspace&lt;/code&gt;.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Create a ConfigMap to map the configuration that's needed for OpenClaw to route traffic through agentgateway. Please remember you will need to replace the &lt;code&gt;baseUrl&lt;/code&gt; with the hostname or IP of your Gateway.
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="s"&gt;kubectl apply -f - &amp;lt;&amp;lt;EOF&lt;/span&gt;
&lt;span class="na"&gt;apiVersion&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;v1&lt;/span&gt;
&lt;span class="na"&gt;kind&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;ConfigMap&lt;/span&gt;
&lt;span class="na"&gt;metadata&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;openclaw-agw-config&lt;/span&gt;
  &lt;span class="na"&gt;labels&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;app&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;openclaw&lt;/span&gt;
&lt;span class="na"&gt;data&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;agw-overlay.json&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;|&lt;/span&gt;
    &lt;span class="s"&gt;{&lt;/span&gt;
      &lt;span class="s"&gt;"gateway": {&lt;/span&gt;
        &lt;span class="s"&gt;"bind": "lan"&lt;/span&gt;
      &lt;span class="s"&gt;},&lt;/span&gt;
      &lt;span class="s"&gt;"models": {&lt;/span&gt;
        &lt;span class="s"&gt;"mode": "merge",&lt;/span&gt;
        &lt;span class="s"&gt;"providers": {&lt;/span&gt;
          &lt;span class="s"&gt;"anthropic": {&lt;/span&gt;
            &lt;span class="s"&gt;"baseUrl": "http://YOUR_AGENTGATEWAY_HOSTNAME_OR_IP:8081",&lt;/span&gt;
            &lt;span class="s"&gt;"models": []&lt;/span&gt;
          &lt;span class="s"&gt;}&lt;/span&gt;
        &lt;span class="s"&gt;}&lt;/span&gt;
      &lt;span class="s"&gt;}&lt;/span&gt;
    &lt;span class="s"&gt;}&lt;/span&gt;
&lt;span class="s"&gt;EOF&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
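
&lt;p&gt;If you would rather generate the overlay than hand-edit the placeholder, a small Python sketch like the one below can render the same JSON with your Gateway address filled in. The &lt;code&gt;render_overlay&lt;/code&gt; helper and the example IP are hypothetical; the keys mirror the ConfigMap above.&lt;/p&gt;

```python
import json


def render_overlay(gateway_host, port=8081):
    """Return the agw-overlay.json content with baseUrl pointing at the Gateway."""
    overlay = {
        "gateway": {"bind": "lan"},
        "models": {
            "mode": "merge",
            "providers": {
                "anthropic": {
                    "baseUrl": f"http://{gateway_host}:{port}",
                    "models": [],
                }
            },
        },
    }
    return json.dumps(overlay, indent=2)


# 10.0.0.15 is a made-up address; use your Gateway's hostname or IP.
rendered = render_overlay("10.0.0.15")
print(rendered)
```

&lt;p&gt;The printed JSON can be pasted under the &lt;code&gt;agw-overlay.json&lt;/code&gt; key of the ConfigMap.&lt;/p&gt;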



&lt;ol&gt;
&lt;li&gt;Create a Kubernetes Deployment that points to the OpenClaw Alpine container image.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;strong&gt;Please note&lt;/strong&gt;: When deploying OpenClaw in Kubernetes with agentgateway, the &lt;code&gt;openclaw.json&lt;/code&gt; config file needs to include the agentgateway &lt;code&gt;baseUrl&lt;/code&gt; to route LLM traffic through the gateway. However, OpenClaw auto-generates its base config (including auth tokens and default settings) at startup, and any config modification, whether from the initial overlay or from running &lt;code&gt;openclaw onboard&lt;/code&gt;, triggers OpenClaw's built-in hot-reload, which performs a full process restart that kills PID 1 and causes the container to crash. The solution uses a wrapper shell script that pre-creates &lt;code&gt;openclaw.json&lt;/code&gt; with the agentgateway overlay before OpenClaw starts (so initial startup merges cleanly) and runs OpenClaw inside a &lt;code&gt;while true&lt;/code&gt; loop, so the shell remains PID 1 and automatically restarts OpenClaw whenever a config change triggers its internal restart, preventing the container from exiting.&lt;/p&gt;
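
&lt;p&gt;The restart-loop idea is not specific to shell. Here is a minimal Python sketch of the same supervisor pattern, assuming a stand-in child process rather than OpenClaw itself. The &lt;code&gt;supervise&lt;/code&gt; function and its parameters are hypothetical names used only for illustration, and the restart cap exists only so the sketch terminates.&lt;/p&gt;

```python
import subprocess
import sys
import time


def supervise(command, max_restarts, delay_seconds=2.0):
    """Run command repeatedly so the supervisor, not the child, stays alive.

    Returns the exit code of every run. The shell wrapper loops forever;
    max_restarts exists only so this sketch terminates.
    """
    exit_codes = []
    for attempt in range(max_restarts + 1):
        result = subprocess.run(command)  # blocks until the child exits
        exit_codes.append(result.returncode)
        print(f"process exited with code {result.returncode}")
        if attempt != max_restarts:
            time.sleep(delay_seconds)  # short pause before the next start
    return exit_codes


# Stand-in child that exits right away, simulating a config-triggered restart.
child = [sys.executable, "-c", "import sys; sys.exit(3)"]
codes = supervise(child, max_restarts=2, delay_seconds=0.1)
```

&lt;p&gt;Because the parent process never exits when the child does, the container's PID 1 survives each internal restart, which is the property the wrapper script relies on.&lt;/p&gt;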

&lt;p&gt;&lt;strong&gt;Please note&lt;/strong&gt;: The &lt;code&gt;models: []&lt;/code&gt; parameter is required by the schema, but it also causes an &lt;code&gt;ANTHROPIC_MODEL_ALIASES&lt;/code&gt; error. This is a known bug in &lt;code&gt;v2026.3.12&lt;/code&gt;. The &lt;code&gt;ANTHROPIC_MODEL_ALIASES&lt;/code&gt; error is a temporal dead zone issue that affects any config using an Anthropic primary model. The workaround is to use &lt;code&gt;v2026.3.11&lt;/code&gt; instead, which is why that image tag is pinned in the Deployment below.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="s"&gt;kubectl apply -f - &amp;lt;&amp;lt;EOF&lt;/span&gt;
&lt;span class="na"&gt;apiVersion&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;apps/v1&lt;/span&gt;
&lt;span class="na"&gt;kind&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Deployment&lt;/span&gt;
&lt;span class="na"&gt;metadata&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;openclaw&lt;/span&gt;
  &lt;span class="na"&gt;labels&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;app&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;openclaw&lt;/span&gt;
&lt;span class="na"&gt;spec&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;replicas&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;1&lt;/span&gt;
  &lt;span class="na"&gt;selector&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;matchLabels&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;app&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;openclaw&lt;/span&gt;
  &lt;span class="na"&gt;template&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;metadata&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;labels&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;app&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;openclaw&lt;/span&gt;
    &lt;span class="na"&gt;spec&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;containers&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;openclaw&lt;/span&gt;
          &lt;span class="na"&gt;image&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;alpine/openclaw:2026.3.11&lt;/span&gt;
          &lt;span class="na"&gt;ports&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
            &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;gateway&lt;/span&gt;
              &lt;span class="na"&gt;containerPort&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;18789&lt;/span&gt;
              &lt;span class="na"&gt;protocol&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;TCP&lt;/span&gt;
            &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;bridge&lt;/span&gt;
              &lt;span class="na"&gt;containerPort&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;18790&lt;/span&gt;
              &lt;span class="na"&gt;protocol&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;TCP&lt;/span&gt;
          &lt;span class="na"&gt;command&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
            &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;sh&lt;/span&gt;
            &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;-c&lt;/span&gt;
            &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="pi"&gt;|&lt;/span&gt;
              &lt;span class="s"&gt;mkdir -p /home/node/.openclaw&lt;/span&gt;
              &lt;span class="s"&gt;cp /tmp/agw-overlay.json /home/node/.openclaw/openclaw.json&lt;/span&gt;
              &lt;span class="s"&gt;trap 'kill $(jobs -p) 2&amp;gt;/dev/null' EXIT&lt;/span&gt;
              &lt;span class="s"&gt;while true; do&lt;/span&gt;
                &lt;span class="s"&gt;docker-entrypoint.sh node openclaw.mjs gateway --allow-unconfigured&lt;/span&gt;
                &lt;span class="s"&gt;echo "OpenClaw process exited, restarting..."&lt;/span&gt;
                &lt;span class="s"&gt;sleep 2&lt;/span&gt;
              &lt;span class="s"&gt;done&lt;/span&gt;
          &lt;span class="na"&gt;volumeMounts&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
            &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;agw-config&lt;/span&gt;
              &lt;span class="na"&gt;mountPath&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;/tmp/agw-overlay.json&lt;/span&gt;
              &lt;span class="na"&gt;subPath&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;agw-overlay.json&lt;/span&gt;
          &lt;span class="na"&gt;resources&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
            &lt;span class="na"&gt;requests&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
              &lt;span class="na"&gt;cpu&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;4"&lt;/span&gt;
              &lt;span class="na"&gt;memory&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;8Gi"&lt;/span&gt;
            &lt;span class="na"&gt;limits&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
              &lt;span class="na"&gt;cpu&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;4"&lt;/span&gt;
              &lt;span class="na"&gt;memory&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;8Gi"&lt;/span&gt;
      &lt;span class="na"&gt;volumes&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;agw-config&lt;/span&gt;
          &lt;span class="na"&gt;configMap&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
            &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;openclaw-agw-config&lt;/span&gt;
&lt;span class="s"&gt;EOF&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ol&gt;
&lt;li&gt;Create a Service for OpenClaw. This Service will be used in the next section when implementing agentgateway for secure and observable OpenClaw traffic.
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="s"&gt;kubectl apply -f -&amp;lt;&amp;lt;EOF&lt;/span&gt;
&lt;span class="na"&gt;apiVersion&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;v1&lt;/span&gt;
&lt;span class="na"&gt;kind&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Service&lt;/span&gt;
&lt;span class="na"&gt;metadata&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;openclaw&lt;/span&gt;
  &lt;span class="na"&gt;labels&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;app&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;openclaw&lt;/span&gt;
&lt;span class="na"&gt;spec&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;ClusterIP&lt;/span&gt;
  &lt;span class="na"&gt;selector&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;app&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;openclaw&lt;/span&gt;
  &lt;span class="na"&gt;ports&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;gateway&lt;/span&gt;
      &lt;span class="na"&gt;port&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;18789&lt;/span&gt;
      &lt;span class="na"&gt;targetPort&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;gateway&lt;/span&gt;
      &lt;span class="na"&gt;protocol&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;TCP&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;bridge&lt;/span&gt;
      &lt;span class="na"&gt;port&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;18790&lt;/span&gt;
      &lt;span class="na"&gt;targetPort&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;bridge&lt;/span&gt;
      &lt;span class="na"&gt;protocol&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;TCP&lt;/span&gt;
&lt;span class="s"&gt;EOF&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You'll now see that OpenClaw is running.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;openclaw-bf55866b7-s7wn6   1/1     Running       0          4m36s
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Onboard OpenClaw
&lt;/h2&gt;

&lt;p&gt;For OpenClaw to work, you need to set configurations like how you want to interact with OpenClaw (iMessage, Telegram, etc.) and the LLM Provider you want to use. To do that, you need to run the &lt;code&gt;openclaw onboard&lt;/code&gt; command. Because this is running in Kubernetes, you can exec into the Pod.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;kubectl &lt;span class="nb"&gt;exec&lt;/span&gt; &lt;span class="nt"&gt;-it&lt;/span&gt; YOUR_OPENCLAW_POD_NAME &lt;span class="nt"&gt;-n&lt;/span&gt; default &lt;span class="nt"&gt;--&lt;/span&gt; openclaw onboard
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
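
&lt;p&gt;If you'd rather not copy the Pod name by hand, you can look it up with a label selector. The sketch below assumes the OpenClaw Pods carry the &lt;code&gt;app: openclaw&lt;/code&gt; label (the same label the Service selects on) and run in the &lt;code&gt;default&lt;/code&gt; namespace. &lt;code&gt;openclaw_pod&lt;/code&gt; is a hypothetical helper name, not part of OpenClaw.&lt;/p&gt;

```shell
# Hypothetical helper: print the name of the first Pod labeled app=openclaw.
# Assumes the label and the default namespace match your Deployment.
openclaw_pod() {
  kubectl get pods -n default -l app=openclaw -o jsonpath='{.items[0].metadata.name}'
}

# Usage (needs a running cluster):
# kubectl exec -it "$(openclaw_pod)" -n default -- openclaw onboard
```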



&lt;p&gt;You'll see an output similar to the below and you can get started with the onboarding process.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3m7zswjpeg8nw3o8i9o8.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3m7zswjpeg8nw3o8i9o8.png" alt=" "&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;After the onboarding, you can test and ensure that OpenClaw is passing traffic through agentgateway.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;kubectl &lt;span class="nb"&gt;exec &lt;/span&gt;YOUR_OPENCLAW_POD_NAME &lt;span class="nt"&gt;-n&lt;/span&gt; default &lt;span class="nt"&gt;--&lt;/span&gt; openclaw agent &lt;span class="nt"&gt;--message&lt;/span&gt; &lt;span class="s2"&gt;"Say hi"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;And you'll see traffic routing through agentgateway similar to the below:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;2026-03-14T15:41:26.634010Z     info    request gateway=agentgateway-system/agentgateway-oc listener=http route=agentgateway-system/ocroute endpoint=api.anthropic.com:443 src.addr=10.224.0.149:62282 http.method=POST http.host=52.241.254.163 http.path=/v1/messages http.version=HTTP/1.1 http.status=200 protocol=http duration=2936ms
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



</description>
      <category>ai</category>
      <category>programming</category>
      <category>kubernetes</category>
      <category>openclaw</category>
    </item>
    <item>
      <title>Route and Secure OpenAI Azure Foundry Traffic Through Your AI Gateway</title>
      <dc:creator>Michael Levan</dc:creator>
      <pubDate>Tue, 10 Mar 2026 15:21:50 +0000</pubDate>
      <link>https://dev.to/thenjdevopsguy/route-and-secure-openai-azure-foundry-traffic-through-your-ai-gateway-3i73</link>
      <guid>https://dev.to/thenjdevopsguy/route-and-secure-openai-azure-foundry-traffic-through-your-ai-gateway-3i73</guid>
      <description>&lt;p&gt;As you begin to expand into various Agentic frameworks, there's a good chance you will end up choosing the one that exists within the cloud provider you're already using. If you're in Azure, that's Azure Foundry.&lt;/p&gt;

&lt;p&gt;The question then becomes "How do I securely route and observe the traffic?".&lt;/p&gt;

&lt;p&gt;In this blog post, you'll learn how to route Foundry traffic through a secure, reliable, and performant AI Gateway with agentgateway.&lt;/p&gt;

&lt;h2&gt;
  
  
  Prerequisites
&lt;/h2&gt;

&lt;p&gt;To follow along with this blog post in a hands-on fashion, you'll need the following:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;An Azure account.&lt;/li&gt;
&lt;li&gt;Agentgateway installed (OSS), which you can find &lt;a href="https://agentgateway.dev/docs/kubernetes/latest/install/helm/" rel="noopener noreferrer"&gt;here&lt;/a&gt;.&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  What Is Microsoft Foundry?
&lt;/h2&gt;

&lt;p&gt;Foundry is the Agentic framework within Azure. If you've used Bedrock on AWS or Vertex AI on GCP, it's very similar. These platforms let you host Models from various providers (OpenAI, Anthropic, etc.) and connect to those Models from a centralized endpoint with the same API key/token, so you don't have to manage separate keys per provider. Some of the services, like Foundry, also allow you to connect to tools and fine-tune the Models you're working with.&lt;/p&gt;

&lt;p&gt;The "tldr" is that it's an Agentic hosting platform to connect to various LLMs.&lt;/p&gt;

&lt;h2&gt;
  
  
  Azure Foundry Setup
&lt;/h2&gt;

&lt;p&gt;With the knowledge around what Foundry is in place, let's dive into the setup. You'll start with setting up Foundry.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Within the Azure portal, search for &lt;strong&gt;foundry&lt;/strong&gt;.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7r93af9dn6rwznqn1u7v.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7r93af9dn6rwznqn1u7v.png" alt=" " width="800" height="387"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;In the Foundry portal, click the blue &lt;strong&gt;+ Create&lt;/strong&gt; button.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fb9ign1jwfwvvnjg7mwxs.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fb9ign1jwfwvvnjg7mwxs.png" alt=" " width="800" height="840"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Create the Foundry resource within your respective subscription and resource group.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fv2aw3k4daboobc5132jw.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fv2aw3k4daboobc5132jw.png" alt=" " width="800" height="423"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Once Foundry is created, you'll see a UI similar to the below. Save the project API key. You'll need it for the next section when you create the Gateway configuration.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffywspjvnp3vadtxghyxy.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffywspjvnp3vadtxghyxy.png" alt=" " width="800" height="403"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Within Foundry, search for &lt;strong&gt;gpt-5-mini&lt;/strong&gt;. Realistically, you can use any Model, but the mini Models will save you some money.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7111ow16u9m0riildjiu.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7111ow16u9m0riildjiu.png" alt=" " width="800" height="694"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Deploy the Model with the default settings.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcx9ldl7hqcxw1xtftaat.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcx9ldl7hqcxw1xtftaat.png" alt=" " width="800" height="694"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;With the Model deployed, you will now be able to reach it with agentgateway.&lt;/p&gt;

&lt;h2&gt;
  
  
  Gateway Configuration
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;Create an environment variable with the Foundry API key that you saved in step 4 of the previous section.
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;AZURE_FOUNDRY_API_KEY&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
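
&lt;p&gt;Because the Secret manifest later in this section is filled in from &lt;code&gt;$AZURE_FOUNDRY_API_KEY&lt;/code&gt; by your shell, an empty variable would leave the Secret without a usable key. Here is a minimal sketch of a guard you could run first. &lt;code&gt;check_key&lt;/code&gt; is a hypothetical helper name, not something provided by agentgateway or Azure.&lt;/p&gt;

```shell
# Hypothetical helper: report whether a key value is set.
# $1 is the key value; prints "ok", or prints "missing" and returns non-zero.
check_key() {
  if [ -z "$1" ]; then
    echo "missing"
    return 1
  fi
  echo "ok"
}

# Warn before creating the Secret if the key was never exported.
check_key "${AZURE_FOUNDRY_API_KEY:-}" || echo "Set AZURE_FOUNDRY_API_KEY first"
```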



&lt;ol&gt;
&lt;li&gt;Create a Gateway object listening on port &lt;code&gt;8081&lt;/code&gt;.
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="s"&gt;kubectl apply -f- &amp;lt;&amp;lt;EOF&lt;/span&gt;
&lt;span class="na"&gt;kind&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Gateway&lt;/span&gt;
&lt;span class="na"&gt;apiVersion&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;gateway.networking.k8s.io/v1&lt;/span&gt;
&lt;span class="na"&gt;metadata&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;agentgateway-azureopenai-route&lt;/span&gt;
  &lt;span class="na"&gt;namespace&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;agentgateway-system&lt;/span&gt;
  &lt;span class="na"&gt;labels&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;app&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;agentgateway-azureopenai-route&lt;/span&gt;
&lt;span class="na"&gt;spec&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;gatewayClassName&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;agentgateway&lt;/span&gt;
  &lt;span class="na"&gt;listeners&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;protocol&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;HTTP&lt;/span&gt;
    &lt;span class="na"&gt;port&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;8081&lt;/span&gt;
    &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;http&lt;/span&gt;
    &lt;span class="na"&gt;allowedRoutes&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;namespaces&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;from&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;All&lt;/span&gt;
&lt;span class="s"&gt;EOF&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ol&gt;
&lt;li&gt;Save the ALB IP of the Gateway in an environment variable. If you're not using a k8s cluster that can create a public ALB IP, you can use &lt;code&gt;localhost&lt;/code&gt; when connecting to the Gateway as long as you port-forward the k8s Gateway svc.
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;INGRESS_GW_ADDRESS&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;kubectl get svc &lt;span class="nt"&gt;-n&lt;/span&gt; agentgateway-system agentgateway-azureopenai-route &lt;span class="nt"&gt;-o&lt;/span&gt; &lt;span class="nv"&gt;jsonpath&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"{.status.loadBalancer.ingress[0]['hostname','ip']}"&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;
&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="nv"&gt;$INGRESS_GW_ADDRESS&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
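
&lt;p&gt;When no load balancer address is assigned, the variable above ends up empty. The following is a minimal sketch of the &lt;code&gt;localhost&lt;/code&gt; fallback described in this step. &lt;code&gt;gateway_url&lt;/code&gt; is a hypothetical helper, and the port-forward command in the comment assumes the Gateway Service is named &lt;code&gt;agentgateway-azureopenai-route&lt;/code&gt; and exposes port &lt;code&gt;8081&lt;/code&gt;, matching the Gateway created in the previous step.&lt;/p&gt;

```shell
# In a separate terminal, forward the Gateway Service to your machine:
# kubectl port-forward -n agentgateway-system svc/agentgateway-azureopenai-route 8081:8081

# Hypothetical helper: build the Gateway base URL, falling back to localhost
# when no address (first argument) is given. The port defaults to 8081.
gateway_url() {
  local host="${1:-localhost}"
  local port="${2:-8081}"
  printf 'http://%s:%s\n' "$host" "$port"
}

gateway_url "${INGRESS_GW_ADDRESS:-}"
```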



&lt;ol&gt;
&lt;li&gt;Create a k8s secret that stores the Foundry API key.
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="s"&gt;kubectl apply -f- &amp;lt;&amp;lt;EOF&lt;/span&gt;
&lt;span class="na"&gt;apiVersion&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;v1&lt;/span&gt;
&lt;span class="na"&gt;kind&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Secret&lt;/span&gt;
&lt;span class="na"&gt;metadata&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;azureopenai-secret&lt;/span&gt;
  &lt;span class="na"&gt;namespace&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;agentgateway-system&lt;/span&gt;
  &lt;span class="na"&gt;labels&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;app&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;agentgateway-azureopenai-route&lt;/span&gt;
&lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Opaque&lt;/span&gt;
&lt;span class="na"&gt;stringData&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;Authorization&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;$AZURE_FOUNDRY_API_KEY&lt;/span&gt;
&lt;span class="s"&gt;EOF&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ol&gt;
&lt;li&gt;The agentgateway backend will tell the Gateway what to route to. In this case, it's the &lt;strong&gt;gpt-5-mini&lt;/strong&gt; Model. You'll also point to the Foundry endpoint.
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="s"&gt;kubectl apply -f- &amp;lt;&amp;lt;EOF&lt;/span&gt;
&lt;span class="na"&gt;apiVersion&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;agentgateway.dev/v1alpha1&lt;/span&gt;
&lt;span class="na"&gt;kind&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;AgentgatewayBackend&lt;/span&gt;
&lt;span class="na"&gt;metadata&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;labels&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;app&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;agentgateway-azureopenai-route&lt;/span&gt;
  &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;azureopenai&lt;/span&gt;
  &lt;span class="na"&gt;namespace&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;agentgateway-system&lt;/span&gt;
&lt;span class="na"&gt;spec&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;ai&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;provider&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;azureopenai&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;endpoint&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;mlevantesting-resource.services.ai.azure.com&lt;/span&gt;
        &lt;span class="na"&gt;deploymentName&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;gpt-5-mini&lt;/span&gt;
        &lt;span class="na"&gt;apiVersion&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;2025-01-01-preview&lt;/span&gt;
  &lt;span class="na"&gt;policies&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;auth&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;secretRef&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;azureopenai-secret&lt;/span&gt;
&lt;span class="s"&gt;EOF&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ol&gt;
&lt;li&gt;The last step is to create a route. Because you're using a GPT Model, the path will be &lt;code&gt;/v1/chat/completions&lt;/code&gt;, but you can set a custom route to shorten the path.
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="s"&gt;kubectl apply -f- &amp;lt;&amp;lt;EOF&lt;/span&gt;
&lt;span class="na"&gt;apiVersion&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;gateway.networking.k8s.io/v1&lt;/span&gt;
&lt;span class="na"&gt;kind&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;HTTPRoute&lt;/span&gt;
&lt;span class="na"&gt;metadata&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;azureopenai&lt;/span&gt;
  &lt;span class="na"&gt;namespace&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;agentgateway-system&lt;/span&gt;
  &lt;span class="na"&gt;labels&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;app&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;agentgateway-azureopenai-route&lt;/span&gt;
&lt;span class="na"&gt;spec&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;parentRefs&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;agentgateway-azureopenai-route&lt;/span&gt;
      &lt;span class="na"&gt;namespace&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;agentgateway-system&lt;/span&gt;
  &lt;span class="na"&gt;rules&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;matches&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;path&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;PathPrefix&lt;/span&gt;
        &lt;span class="na"&gt;value&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;/azureopenai&lt;/span&gt;
    &lt;span class="na"&gt;filters&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;URLRewrite&lt;/span&gt;
      &lt;span class="na"&gt;urlRewrite&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;path&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;ReplaceFullPath&lt;/span&gt;
          &lt;span class="na"&gt;replaceFullPath&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;/v1/chat/completions&lt;/span&gt;
    &lt;span class="na"&gt;backendRefs&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;azureopenai&lt;/span&gt;
      &lt;span class="na"&gt;namespace&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;agentgateway-system&lt;/span&gt;
      &lt;span class="na"&gt;group&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;agentgateway.dev&lt;/span&gt;
      &lt;span class="na"&gt;kind&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;AgentgatewayBackend&lt;/span&gt;
&lt;span class="s"&gt;EOF&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ol&gt;
&lt;li&gt;Test the route to the OpenAI Model via agentgateway. Swap out &lt;code&gt;$INGRESS_GW_ADDRESS&lt;/code&gt; with &lt;code&gt;localhost&lt;/code&gt; if your Gateway doesn't have a public ALB IP.
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$INGRESS_GW_ADDRESS&lt;/span&gt;&lt;span class="s2"&gt;:8081/azureopenai"&lt;/span&gt; &lt;span class="nt"&gt;-v&lt;/span&gt; &lt;span class="nt"&gt;-H&lt;/span&gt; content-type:application/json &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="s1"&gt;'{
  "messages": [
    {
      "role": "system",
      "content": "You are a skilled cloud-native network engineer."
    },
    {
      "role": "user",
      "content": "Write me a paragraph containing the best way to think about Istio Ambient Mesh"
    }
  ]
}'&lt;/span&gt; | jq
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You should see an output similar to the below.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffqjbsr8dm65xv4h3egu3.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffqjbsr8dm65xv4h3egu3.png" alt=" " width="800" height="664"&gt;&lt;/a&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>programming</category>
      <category>agents</category>
    </item>
    <item>
      <title>Intercept, Inspect, Secure: Proxying Claude Code CLI Traffic</title>
      <dc:creator>Michael Levan</dc:creator>
      <pubDate>Fri, 20 Feb 2026 12:39:27 +0000</pubDate>
      <link>https://dev.to/thenjdevopsguy/intercept-inspect-secure-proxying-claude-code-cli-traffic-gen</link>
      <guid>https://dev.to/thenjdevopsguy/intercept-inspect-secure-proxying-claude-code-cli-traffic-gen</guid>
      <description>&lt;p&gt;Architecture diagrams always look something like this:&lt;/p&gt;

&lt;p&gt;&lt;code&gt;Agent -&amp;gt; Gateway -&amp;gt; LLM&lt;/code&gt; (or MCP Server).&lt;/p&gt;

&lt;p&gt;The Agents that organizations are typically referring to are Agents that perform an action via prompts by a user or autonomously, and those Agents are usually running in a system somewhere in production. That is, however, not where the majority of Agentic traffic originates. The larger chunk of traffic comes from Agentic clients (Claude Code CLI, Cursor, Copilot, etc.) and because of that, we now must think about Agents not only running in production systems, but on someone's laptop.&lt;/p&gt;

&lt;p&gt;In this blog post, you'll learn how to secure that traffic within an Agentic client with agentgateway.&lt;/p&gt;

&lt;h2&gt;
  
  
  Gateway Configuration
&lt;/h2&gt;

&lt;p&gt;The first thing to ensure is that you have a proper AI Gateway configured so traffic from the Agentic CLI client can get from point A to point B securely. In this case, you can use agentgateway, which is an AI Gateway built from the ground up specifically for AI traffic.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Generate an API key and put it into an environment variable so a k8s Secret can be created with it later.
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;export ANTHROPIC_API_KEY=
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ol&gt;
&lt;li&gt;Create the Gateway object.
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl apply -f- &amp;lt;&amp;lt;EOF
kind: Gateway
apiVersion: gateway.networking.k8s.io/v1
metadata:
  name: agentgateway-route
  namespace: agentgateway-system
  labels:
    app: agentgateway
spec:
  gatewayClassName: enterprise-agentgateway
  infrastructure:
    parametersRef:
      name: tracing
      group: enterpriseagentgateway.solo.io
      kind: EnterpriseAgentgatewayParameters
  listeners:
  - protocol: HTTP
    port: 8080
    name: http
    allowedRoutes:
      namespaces:
        from: All
EOF
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ol&gt;
&lt;li&gt;Create the secret for Anthropic. This way, you have proper access to Anthropic via your Gateway for LLM calls.
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl apply -f- &amp;lt;&amp;lt;EOF
apiVersion: v1
kind: Secret
metadata:
  name: anthropic-secret
  namespace: agentgateway-system
  labels:
    app: agentgateway-route
type: Opaque
stringData:
  Authorization: $ANTHROPIC_API_KEY
EOF
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ol&gt;
&lt;li&gt;Create an Agentgateway Backend.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Two things to keep in mind with the &lt;code&gt;AgentgatewayBackend&lt;/code&gt; config.&lt;/p&gt;

&lt;p&gt;First, notice that the routes go through &lt;code&gt;/v1/messages&lt;/code&gt; and not &lt;code&gt;/v1/chat/completions&lt;/code&gt;, which is what you'd normally see in a route that follows the OpenAI API format. Agentgateway can handle the translation (from Anthropic spec to OpenAI spec), but because you're routing traffic directly through Claude, no translation occurs, which is why the Anthropic spec is needed.&lt;/p&gt;

&lt;p&gt;Second, the two configurations below either specify a Model (Opus) or use empty braces so you can use any Model you want. If you specify a Model in your &lt;code&gt;AgentgatewayBackend&lt;/code&gt; and then use a different Model in Claude Code CLI, you will get a &lt;code&gt;400&lt;/code&gt; error that says something along the lines of "thinking mode isn't enabled". That isn't the error Claude Code should be showing you, but it's what you'll most likely see. If you specify Opus, you must use Opus in your Claude Code CLI configuration. If you specify no Model and just a Provider (&lt;code&gt;anthropic: {}&lt;/code&gt;), you can use any Model you'd like.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl apply -f- &amp;lt;&amp;lt;EOF
apiVersion: agentgateway.dev/v1alpha1
kind: AgentgatewayBackend
metadata:
  labels:
    app: agentgateway-route
  name: anthropic
  namespace: agentgateway-system
spec:
  ai:
    provider:
      anthropic:
        model: "claude-opus-4-6"
  policies:
    ai:
      routes:
        '/v1/messages': Messages
        '*': Passthrough
    auth:
      secretRef:
        name: anthropic-secret
EOF
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Or, without a specified Model:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl apply -f - &amp;lt;&amp;lt;EOF
apiVersion: agentgateway.dev/v1alpha1
kind: AgentgatewayBackend
metadata:
  labels:
    app: agentgateway-route
  name: anthropic
  namespace: agentgateway-system
spec:
  ai:
    provider:
      anthropic: {}
  policies:
    auth:
      secretRef:
        name: anthropic-secret
    ai:
      routes:
        '/v1/messages': Messages
        '*': Passthrough
EOF
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ol&gt;
&lt;li&gt;Create the routing configurations that point to your Gateway and use the Agentgateway Backend you created in the previous step as the reference.
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl apply -f- &amp;lt;&amp;lt;EOF
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: claude
  namespace: agentgateway-system
  labels:
    app: agentgateway-route
spec:
  parentRefs:
    - name: agentgateway-route
      namespace: agentgateway-system
  rules:
  - matches:
    - path:
        type: PathPrefix
        value: /
    backendRefs:
    - name: anthropic
      namespace: agentgateway-system
      group: agentgateway.dev
      kind: AgentgatewayBackend
EOF
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Test Connectivity
&lt;/h2&gt;

&lt;p&gt;With the Gateway, Backend, and Route configured, let's ensure that the Claude Code CLI traffic can successfully go through agentgateway.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Store the Gateway's ALB address (hostname or IP) in an environment variable. If you're running this locally and don't have access to an ALB address, you can skip this step and use &lt;code&gt;localhost&lt;/code&gt; after port-forwarding the Gateway service.
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;export INGRESS_GW_ADDRESS=$(kubectl get svc -n agentgateway-system agentgateway-route -o jsonpath="{.status.loadBalancer.ingress[0]['hostname','ip']}")
echo $INGRESS_GW_ADDRESS
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
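&lt;p&gt;The jsonpath above asks for both &lt;code&gt;hostname&lt;/code&gt; and &lt;code&gt;ip&lt;/code&gt; because some load balancers publish a DNS name while others publish an IP address. If you'd rather script that lookup, here's a minimal Python sketch of the same idea. The &lt;code&gt;gateway_address&lt;/code&gt; helper and the sample Service dictionaries are hypothetical; in practice the dictionary would come from something like &lt;code&gt;kubectl get svc -o json&lt;/code&gt;.&lt;/p&gt;

```python
def gateway_address(service):
    """Return the first load balancer hostname or IP, or None if unset."""
    ingress = service.get("status", {}).get("loadBalancer", {}).get("ingress") or []
    if not ingress:
        return None
    first = ingress[0]
    # Prefer the hostname when both fields are present (an assumption of this sketch).
    return first.get("hostname") or first.get("ip")


if __name__ == "__main__":
    # Hypothetical Service objects, trimmed down to the fields used here.
    with_ip = {"status": {"loadBalancer": {"ingress": [{"ip": "203.0.113.10"}]}}}
    pending = {"status": {"loadBalancer": {}}}
    print(gateway_address(with_ip))
    print(gateway_address(pending))
```

&lt;p&gt;A &lt;code&gt;None&lt;/code&gt; result means no external address has been assigned, which is the case where port-forwarding and &lt;code&gt;localhost&lt;/code&gt; come in.&lt;/p&gt;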



&lt;ol&gt;
&lt;li&gt;Test the LLM connectivity through your Gateway with a single prompt.
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;ANTHROPIC_BASE_URL="http://$INGRESS_GW_ADDRESS:8080" claude -p "What is a credit card"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Or with &lt;code&gt;localhost&lt;/code&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;ANTHROPIC_BASE_URL="http://127.0.0.1:8080" claude -p "What is a credit card"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
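&lt;p&gt;Claude Code CLI isn't the only client that can go through the Gateway. Since the Backend maps &lt;code&gt;/v1/messages&lt;/code&gt; to the Messages route, a plain HTTP client should be able to target the same path. Below is a minimal Python sketch that builds such a request without sending it. It assumes the Gateway is port-forwarded to &lt;code&gt;127.0.0.1:8080&lt;/code&gt; and that the Gateway attaches the API key from &lt;code&gt;anthropic-secret&lt;/code&gt;, so the client doesn't send one. The &lt;code&gt;build_messages_request&lt;/code&gt; helper is hypothetical.&lt;/p&gt;

```python
import json
import urllib.request

# Assumption: the Gateway is reachable on localhost:8080 via port-forward.
GATEWAY_BASE_URL = "http://127.0.0.1:8080"


def build_messages_request(base_url, prompt, model="claude-opus-4-6"):
    """Build (but do not send) a POST to the Messages route on the Gateway."""
    body = {
        "model": model,
        "max_tokens": 256,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v1/messages",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "content-type": "application/json",
            "anthropic-version": "2023-06-01",
        },
        method="POST",
    )


if __name__ == "__main__":
    request = build_messages_request(GATEWAY_BASE_URL, "What is a credit card")
    print(request.get_method(), request.full_url)
    # To actually send it (requires the Gateway to be running):
    # with urllib.request.urlopen(request) as response:
    #     print(json.loads(response.read()))
```

&lt;p&gt;If you pinned a Model in the Backend, keep the &lt;code&gt;model&lt;/code&gt; argument in sync with it for the same reason described earlier.&lt;/p&gt;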



&lt;p&gt;You can also start an interactive Claude Code CLI session by running &lt;code&gt;ANTHROPIC_BASE_URL="http://127.0.0.1:8080" claude&lt;/code&gt; or &lt;code&gt;ANTHROPIC_BASE_URL="http://$INGRESS_GW_ADDRESS:8080" claude&lt;/code&gt;, and you'll be able to prompt it with whatever you'd like.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fklxudnqnpuxanlo2jcf5.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fklxudnqnpuxanlo2jcf5.png" alt=" " width="800" height="650"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;With the traffic connectivity tested, let's implement Prompt Guards.&lt;/p&gt;

&lt;h2&gt;
  
  
  Prompt Guards
&lt;/h2&gt;

&lt;p&gt;Connectivity through agentgateway with Claude Code CLI has been tested and confirmed, so now, let's move into the security piece.&lt;/p&gt;

&lt;p&gt;The number one thing organizations want to control is what can actually be prompted via an Agent. For example, the last thing you want is to have someone prompt an Agent with &lt;code&gt;Delete all of the Kubernetes clusters in production&lt;/code&gt; and it actually does it. To avoid this, you need to ensure that users can only submit prompts that they should be able to submit.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Modify the &lt;code&gt;AgentgatewayBackend&lt;/code&gt; with a prompt guard. The guard uses a regex, and for this test it rejects any request that contains the words &lt;code&gt;credit card&lt;/code&gt;.
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl apply -f- &amp;lt;&amp;lt;EOF
apiVersion: agentgateway.dev/v1alpha1
kind: AgentgatewayBackend
metadata:
  labels:
    app: agentgateway-route
  name: anthropic
  namespace: agentgateway-system
spec:
  ai:
    provider:
      anthropic:
        model: "claude-opus-4-6"
  policies:
    ai:
      routes:
        '/v1/messages': Messages
        '*': Passthrough
      promptGuard:
        request:
        - response:
            message: "Rejected due to inappropriate content"
          regex:
            action: Reject
            matches:
            - "credit card"
    auth:
      secretRef:
        name: anthropic-secret
EOF
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You can also do the same thing without a Model specified:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl apply -f - &amp;lt;&amp;lt;EOF
apiVersion: agentgateway.dev/v1alpha1
kind: AgentgatewayBackend
metadata:
  labels:
    app: agentgateway-route
  name: anthropic
  namespace: agentgateway-system
spec:
  ai:
    provider:
      anthropic: {}
  policies:
    auth:
      secretRef:
        name: anthropic-secret
    ai:
      routes:
        '/v1/messages': Messages
        '*': Passthrough
      promptGuard:
        request:
        - response:
            message: "Rejected due to inappropriate content"
          regex:
            action: Reject
            matches:
            - "credit card"
EOF
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
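&lt;p&gt;To get a feel for what a regex-based guard will and won't catch before you apply it, you can try the pattern locally. The following is only a Python sketch of the idea, not how agentgateway implements it: the gateway's regex engine and its case-sensitivity rules may differ from Python's &lt;code&gt;re&lt;/code&gt; module, and &lt;code&gt;check_prompt&lt;/code&gt; is a hypothetical helper.&lt;/p&gt;

```python
import re

# Pattern and rejection message copied from the promptGuard block above.
BLOCKED_PATTERNS = ["credit card"]
REJECTION_MESSAGE = "Rejected due to inappropriate content"


def check_prompt(prompt, patterns=BLOCKED_PATTERNS):
    """Return (allowed, message). Rejects when any pattern matches anywhere."""
    for pattern in patterns:
        if re.search(pattern, prompt):
            return False, REJECTION_MESSAGE
    return True, ""


if __name__ == "__main__":
    for prompt in ["What is a credit card", "What is a Kubernetes Pod"]:
        print(prompt, check_prompt(prompt))
```

&lt;p&gt;In this sketch the match is case-sensitive, so &lt;code&gt;What is a Credit Card&lt;/code&gt; would slip through. It's worth testing variations like that against the real Gateway to see whether your patterns need to be broader.&lt;/p&gt;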



&lt;p&gt;Run the check again by running either of the following:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;code&gt;ANTHROPIC_BASE_URL="http://$INGRESS_GW_ADDRESS:8080" claude -p "What is a credit card"&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;ANTHROPIC_BASE_URL="http://$INGRESS_GW_ADDRESS:8080" claude&lt;/code&gt; and then prompting within Claude Code with &lt;code&gt;What is a credit card&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;You'll get an output similar to the one below.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxd9e7mq4k0e4j6ruggkz.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxd9e7mq4k0e4j6ruggkz.png" alt=" " width="800" height="221"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;With traffic routing through agentgateway from Claude Code CLI and the knowledge of how prompt guards can work in this scenario, you can now secure traffic from anyone's laptop/desktop when they're using an Agentic CLI client.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>programming</category>
      <category>kubernetes</category>
      <category>claude</category>
    </item>
    <item>
      <title>Build AI Agents on Kubernetes: Kagent + Amazon Bedrock Setup Guide</title>
      <dc:creator>Michael Levan</dc:creator>
      <pubDate>Sat, 24 Jan 2026 13:19:06 +0000</pubDate>
      <link>https://dev.to/thenjdevopsguy/build-ai-agents-on-kubernetes-kagent-amazon-bedrock-setup-guide-497n</link>
      <guid>https://dev.to/thenjdevopsguy/build-ai-agents-on-kubernetes-kagent-amazon-bedrock-setup-guide-497n</guid>
      <description>&lt;p&gt;Managing various LLM provider accounts, subscriptions, and cost can get cumbersome for many organizations in a world where multiple LLMs are used. To avoid this, you can use what can be called a "middle ground" between your Agent and the LLM provider.&lt;/p&gt;

&lt;p&gt;With AWS Bedrock, you can set up an API key and access various LLMs from Claude to GPT to Llama from one place. Instead of having multiple API keys and various accounts, you can route all of your Agentic traffic from your Agent to an LLM via Bedrock.&lt;/p&gt;

&lt;p&gt;In this blog post, you'll learn how to set up an Agent via kagent to access Bedrock Models and use them to perform any action you'd like.&lt;/p&gt;

&lt;h2&gt;
  
  
  Prerequisites
&lt;/h2&gt;

&lt;p&gt;To follow along with this blog post from a hands-on perspective, you should have the following:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;A Kubernetes cluster.&lt;/li&gt;
&lt;li&gt;Kagent installed, which you can find &lt;a href="https://github.com/AdminTurnedDevOps/agentic-demo-repo/blob/main/kagent-oss/install.md" rel="noopener noreferrer"&gt;here&lt;/a&gt;.&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  Configuring Access To AWS
&lt;/h2&gt;

&lt;p&gt;The first step is ensuring that you have proper access to AWS so you can use the Model that you'd like to implement within your Agent.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Create environment variables with your AWS access key, secret, and region. To retrieve an AWS access key and secret, you'll need to create them in AWS IAM.
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;export AWS_ACCESS_KEY_ID=&amp;lt;your-access-key-id&amp;gt;
export AWS_SECRET_ACCESS_KEY=&amp;lt;your-secret-access-key&amp;gt;
export AWS_REGION=us-east-1
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ol&gt;
&lt;li&gt;Once you have access, run the command below to list the Claude inference profiles available in your region of choice.
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;aws bedrock list-inference-profiles --region us-east-1 \
  --query "inferenceProfileSummaries[?contains(inferenceProfileId, 'claude')].{id:inferenceProfileId,name:inferenceProfileName}" \
  --output table
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Here's an example of the output you should see on your terminal.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;-------------------------------------------------------------------------------------------
|                                  ListInferenceProfiles                                  |
+---------------------------------------------------+-------------------------------------+
|                        id                         |                name                 |
+---------------------------------------------------+-------------------------------------+
|  us.anthropic.claude-sonnet-4-20250514-v1:0       |  US Claude Sonnet 4                 |
|  global.anthropic.claude-sonnet-4-5-20250929-v1:0 |  Global Claude Sonnet 4.5           |
|  us.anthropic.claude-haiku-4-5-20251001-v1:0      |  US Anthropic Claude Haiku 4.5      |
|  global.anthropic.claude-haiku-4-5-20251001-v1:0  |  Global Anthropic Claude Haiku 4.5  |
|  us.anthropic.claude-opus-4-5-20251101-v1:0       |  US Anthropic Claude Opus 4.5       |
|  global.anthropic.claude-opus-4-5-20251101-v1:0   |  GLOBAL Anthropic Claude Opus 4.5   |
|  us.anthropic.claude-sonnet-4-5-20250929-v1:0     |  US Anthropic Claude Sonnet 4.5     |
+---------------------------------------------------+-------------------------------------+
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
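&lt;p&gt;The &lt;code&gt;--query&lt;/code&gt; expression in the command above is a JMESPath filter: it keeps only the profiles whose ID contains &lt;code&gt;claude&lt;/code&gt; and renames the two fields for the table. If the syntax is unfamiliar, this Python sketch does the equivalent filtering on sample data. The first entry comes from the table above, the second entry is made up for illustration, and &lt;code&gt;claude_profiles&lt;/code&gt; is a hypothetical helper.&lt;/p&gt;

```python
# Sample data shaped like the inferenceProfileSummaries field of the response.
SAMPLE_RESPONSE = {
    "inferenceProfileSummaries": [
        {
            "inferenceProfileId": "us.anthropic.claude-sonnet-4-20250514-v1:0",
            "inferenceProfileName": "US Claude Sonnet 4",
        },
        {
            # Hypothetical entry, included only to show the filter dropping it.
            "inferenceProfileId": "us.example.other-model-v1:0",
            "inferenceProfileName": "Hypothetical Other Model",
        },
    ]
}


def claude_profiles(response):
    """Mirror the --query filter: keep profiles whose ID contains 'claude'."""
    return [
        {"id": p["inferenceProfileId"], "name": p["inferenceProfileName"]}
        for p in response["inferenceProfileSummaries"]
        if "claude" in p["inferenceProfileId"]
    ]


if __name__ == "__main__":
    for profile in claude_profiles(SAMPLE_RESPONSE):
        print(profile["id"], profile["name"])
```

&lt;p&gt;Swapping &lt;code&gt;claude&lt;/code&gt; for another substring in the original command is how you'd look up a different Model family.&lt;/p&gt;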



&lt;ol&gt;
&lt;li&gt;Next, go into AWS Bedrock and generate an API key. Although you have access to your AWS account, there's a separate API key needed to access LLMs via AWS Bedrock.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fregjy5u0q393dzhxlcru.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fregjy5u0q393dzhxlcru.png" width="800" height="397"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Create an environment variable with the API key.
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;export BEDROCK_API_KEY=&amp;lt;your-bedrock-api-key&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;With this configuration, you can now begin the Model and Agent setup so you can access LLMs via Bedrock through kagent.&lt;/p&gt;

&lt;h2&gt;
  
  
  Model And Agent Setup
&lt;/h2&gt;

&lt;p&gt;The next phase is to create a Model Config, which tells the Agent which Model to use. In this case, the Model Config points to an OpenAI GPT Model hosted on Bedrock.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Create a Kubernetes secret that contains your AWS access key, secret, and Bedrock API key.
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl create secret generic kagent-bedrock-aws -n kagent \
  --from-literal=AWS_ACCESS_KEY_ID=$AWS_ACCESS_KEY_ID \
  --from-literal=AWS_SECRET_ACCESS_KEY=$AWS_SECRET_ACCESS_KEY \
  --from-literal=BEDROCK_API_KEY=$BEDROCK_API_KEY \
  --from-literal=AWS_SESSION_TOKEN=""
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ol&gt;
&lt;li&gt;Implement a Model Config that calls the &lt;code&gt;openai.gpt-oss-20b-1:0&lt;/code&gt; Model using the Bedrock API key from your secret. The base URL is Bedrock's OpenAI-compatible endpoint, which is where the Model is served.
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl apply -f - &amp;lt;&amp;lt;EOF
apiVersion: kagent.dev/v1alpha2
kind: ModelConfig
metadata:
  name: bedrock-model-config
  namespace: kagent
spec:
  apiKeySecret: kagent-bedrock-aws
  apiKeySecretKey: BEDROCK_API_KEY
  model: openai.gpt-oss-20b-1:0
  provider: OpenAI
  openAI:
    baseUrl: "https://bedrock-runtime.us-east-1.amazonaws.com/openai/v1"
EOF
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ol&gt;
&lt;li&gt;Check that the Model was accepted.
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl get modelconfig bedrock-model-config -n kagent -o jsonpath='{.status.conditions}' | jq
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You'll see an output similar to the one below:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;[
  {
    "lastTransitionTime": "...",
    "message": "",
    "reason": "ModelConfigReconciled",
    "status": "True",
    "type": "Accepted"
  }
]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
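&lt;p&gt;If you want to check this in a script rather than by eye, the condition to look for is &lt;code&gt;type: Accepted&lt;/code&gt; with &lt;code&gt;status: "True"&lt;/code&gt;. Here's a small Python sketch that does that against the sample output above. The &lt;code&gt;is_accepted&lt;/code&gt; helper is hypothetical; you'd feed it the JSON printed by the &lt;code&gt;kubectl get modelconfig&lt;/code&gt; command.&lt;/p&gt;

```python
import json

# Sample status.conditions payload, matching the output shown above.
SAMPLE_CONDITIONS = """
[
  {
    "lastTransitionTime": "...",
    "message": "",
    "reason": "ModelConfigReconciled",
    "status": "True",
    "type": "Accepted"
  }
]
"""


def is_accepted(conditions_json):
    """Return True when an Accepted condition with status "True" is present."""
    conditions = json.loads(conditions_json)
    return any(
        c.get("type") == "Accepted" and c.get("status") == "True"
        for c in conditions
    )


if __name__ == "__main__":
    print(is_accepted(SAMPLE_CONDITIONS))
```

&lt;p&gt;Note that Kubernetes condition statuses are strings, so the comparison is against &lt;code&gt;"True"&lt;/code&gt; and not a boolean.&lt;/p&gt;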



&lt;p&gt;With the Model config set up, you can now create the Agent and test it.&lt;/p&gt;

&lt;h2&gt;
  
  
  Using Bedrock With Kagent
&lt;/h2&gt;

&lt;p&gt;With kagent installed, you have access to various CRDs like the &lt;code&gt;ModelConfig&lt;/code&gt; object you created in the previous section. Within the kagent CRDs, you also have access to the &lt;code&gt;Agent&lt;/code&gt; object, which allows you to define everything from what Model Config to use to the prompt to MCP Server tools and Agent Skills.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Create a new Agent with the YAML below. It includes all of the secrets needed, a prompt, and a few MCP Server tools.
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl apply -f - &amp;lt;&amp;lt;EOF
apiVersion: kagent.dev/v1alpha2
kind: Agent
metadata:
  name: bedrock-agent-test
  namespace: kagent
spec:
  description: Kubernetes troubleshooting agent powered by a Model served via Bedrock
  type: Declarative
  declarative:
    modelConfig: bedrock-model-config
    deployment:
      env:
        - name: AWS_ACCESS_KEY_ID
          valueFrom:
            secretKeyRef:
              name: kagent-bedrock-aws
              key: AWS_ACCESS_KEY_ID
        - name: AWS_SECRET_ACCESS_KEY
          valueFrom:
            secretKeyRef:
              name: kagent-bedrock-aws
              key: AWS_SECRET_ACCESS_KEY
        - name: AWS_SESSION_TOKEN
          valueFrom:
            secretKeyRef:
              name: kagent-bedrock-aws
              key: AWS_SESSION_TOKEN
    systemMessage: |
      You're a friendly and helpful agent that uses Kubernetes tools to help with troubleshooting and deployments.

      # Instructions
      - If user question is unclear, ask for clarification before running any tools
      - Always be helpful and friendly
      - If you don't know how to answer the question, respond with "Sorry, I don't know how to answer that"

      # Response format
      - ALWAYS format your response as Markdown
      - Include a summary of actions you took and an explanation of the result
    tools:
      - type: McpServer
        mcpServer:
          name: kagent-tool-server
          kind: RemoteMCPServer
          toolNames:
          - k8s_get_available_api_resources
          - k8s_get_cluster_configuration
          - k8s_get_events
          - k8s_get_pod_logs
          - k8s_get_resource_yaml
          - k8s_get_resources
          - k8s_check_service_connectivity
EOF
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ol&gt;
&lt;li&gt;Wait until the Agent is up and operational.
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl get pods -n kagent --watch
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ol&gt;
&lt;li&gt;Open the kagent dashboard.&lt;/li&gt;
&lt;li&gt;Go to your new Agent.&lt;/li&gt;
&lt;li&gt;Prompt it with something like &lt;code&gt;What can you do?&lt;/code&gt;. You'll see an output similar to the one below.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fq8ihiq17lgyuaza9en0x.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fq8ihiq17lgyuaza9en0x.png" alt=" " width="800" height="434"&gt;&lt;/a&gt;&lt;/p&gt;

</description>
      <category>programming</category>
      <category>ai</category>
      <category>agents</category>
      <category>kubernetes</category>
    </item>
    <item>
      <title>Routing Observable and Secure Traffic Through Claude</title>
      <dc:creator>Michael Levan</dc:creator>
      <pubDate>Sun, 18 Jan 2026 15:38:08 +0000</pubDate>
      <link>https://dev.to/thenjdevopsguy/routing-observable-and-secure-traffic-through-claude-2idk</link>
      <guid>https://dev.to/thenjdevopsguy/routing-observable-and-secure-traffic-through-claude-2idk</guid>
      <description>&lt;p&gt;Observability and security for AI traffic in the enterprise should cover everything from servers and cloud environments to laptops, desktops, and mobile devices. This level of observability and security isn't "new"; the industry has had it for years with Mobile Device Management (MDM) software. With AI workloads, however, the concepts of properly observing and securing local systems seem to have been forgotten.&lt;/p&gt;

&lt;p&gt;And we can't forget about AI traffic.&lt;/p&gt;

&lt;p&gt;In this blog post, you'll learn how to route local AI traffic through agentgateway when tools like Claude Desktop are interacting with MCP Servers.&lt;/p&gt;

&lt;h2&gt;
  
  
  Prerequisites
&lt;/h2&gt;

&lt;p&gt;To follow along from a hands-on perspective, you'll need the following:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;A Kubernetes cluster.&lt;/li&gt;
&lt;li&gt;Claude Desktop.&lt;/li&gt;
&lt;li&gt;Agentgateway installed.&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  The Low-Hanging Fruit
&lt;/h2&gt;

&lt;p&gt;Organizations, enterprises, teams, and engineers are working on consistent ways to implement Agentic infrastructure, whether that be on systems, domain-specific Agents, generic Agents, MCP, and everything in between. Today, most of this work happens at what we can call the "backend layer": the cloud environments, servers running AI workloads, and networks.&lt;/p&gt;

&lt;p&gt;However, there's one piece to the puzzle that seems to be overlooked - the "frontend layer". These are the user devices (laptops, desktops, mobile devices) within the organization that are being used at work.&lt;/p&gt;

&lt;p&gt;In the engineering space, that typically means the LLMs, Agents, or desktop software that engineers are using (Claude Code, Claude Desktop, Gemini CLI, etc.). These "frontend layer" tools are usually open to all with zero observability or security. The goal isn't to completely lock everything down to where no one can use AI, but there needs to be defense in depth, security practices, and perhaps most importantly, observability for all AI traffic, especially when it's coming from a local machine.&lt;/p&gt;

&lt;p&gt;Traffic from every system in the enterprise (laptops, desktops, mobile devices) already flows through internal networks, where it is routed, filtered by firewall rules, and observed at the packet level. AI traffic needs to be looked at the same way.&lt;/p&gt;

&lt;h2&gt;
  
  
  Deploying An MCP Server
&lt;/h2&gt;

&lt;p&gt;The first step in the journey is to give Claude Desktop "something" to route to. This could be another Agent, various Models, or an MCP Server for specific tool selection needs. This section will walk you through how to deploy an MCP Server on a Kubernetes cluster.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Deploy the following configuration, which contains a ConfigMap that holds the MCP Server code, a Kubernetes Deployment, and a Kubernetes Service.
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl apply -f - &amp;lt;&amp;lt;EOF
apiVersion: v1
kind: ConfigMap
metadata:
  name: mcp-math-script
  namespace: default
data:
  server.py: |
    import uvicorn
    from mcp.server.fastmcp import FastMCP
    from starlette.applications import Starlette
    from starlette.routing import Route
    from starlette.requests import Request
    from starlette.responses import JSONResponse, Response

    mcp = FastMCP("Math-Service")

    @mcp.tool()
    def add(a: int, b: int) -&amp;gt; int:
        return a + b

    @mcp.tool()
    def multiply(a: int, b: int) -&amp;gt; int:
        return a * b

    async def handle_mcp(request: Request):
        try:
            data = await request.json()
            method = data.get("method")
            msg_id = data.get("id")
            result = None

            if method == "initialize":
                result = {
                    "protocolVersion": "2024-11-05",
                    "capabilities": {"tools": {}},
                    "serverInfo": {"name": "Math-Service", "version": "1.0"}
                }

            elif method == "notifications/initialized":
                # Notifications are fire-and-forget, return empty 202 response
                return Response(status_code=202)

            elif method == "tools/list":
                tools_list = await mcp.list_tools()
                result = {
                    "tools": [
                        {
                            "name": t.name,
                            "description": t.description,
                            "inputSchema": t.inputSchema
                        } for t in tools_list
                    ]
                }

            elif method == "tools/call":
                params = data.get("params", {})
                name = params.get("name")
                args = params.get("arguments", {})

                # Call the tool
                tool_result = await mcp.call_tool(name, args)

                # --- FIX: Serialize the content objects manually ---
                serialized_content = []
                for content in tool_result:
                    if hasattr(content, "type") and content.type == "text":
                        serialized_content.append({"type": "text", "text": content.text})
                    elif hasattr(content, "type") and content.type == "image":
                         serialized_content.append({
                             "type": "image",
                             "data": content.data,
                             "mimeType": content.mimeType
                         })
                    else:
                        # Fallback for dictionaries or other types
                        serialized_content.append(content if isinstance(content, dict) else str(content))

                result = {
                    "content": serialized_content,
                    "isError": False
                }

            elif method == "ping":
                result = {}

            else:
                return JSONResponse(
                    {"jsonrpc": "2.0", "id": msg_id, "error": {"code": -32601, "message": "Method not found"}},
                    status_code=404
                )

            return JSONResponse({"jsonrpc": "2.0", "id": msg_id, "result": result})

        except Exception as e:
            # Print error to logs for debugging
            import traceback
            traceback.print_exc()
            return JSONResponse(
                {"jsonrpc": "2.0", "id": None, "error": {"code": -32603, "message": str(e)}},
                status_code=500
            )

    app = Starlette(routes=[
        Route("/mcp", handle_mcp, methods=["POST"]),
        Route("/", lambda r: JSONResponse({"status": "ok"}), methods=["GET"])
    ])

    if __name__ == "__main__":
        print("Starting Fixed Math Server on port 8000...")
        uvicorn.run(app, host="0.0.0.0", port=8000)
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: mcp-math-server
  namespace: default
spec:
  replicas: 1
  selector:
    matchLabels:
      app: mcp-math-server
  template:
    metadata:
      labels:
        app: mcp-math-server
    spec:
      containers:
      - name: math
        image: python:3.11-slim
        command: ["/bin/sh", "-c"]
        args:
        - |
          pip install "mcp[cli]" uvicorn starlette &amp;amp;&amp;amp;
          python /app/server.py
        ports:
        - containerPort: 8000
        volumeMounts:
        - name: script-volume
          mountPath: /app
        readinessProbe:
          httpGet:
            path: /
            port: 8000
          initialDelaySeconds: 5
          periodSeconds: 5
      volumes:
      - name: script-volume
        configMap:
          name: mcp-math-script
---
apiVersion: v1
kind: Service
metadata:
  name: mcp-math-server
  namespace: default
spec:
  selector:
    app: mcp-math-server
  ports:
  - port: 80
    targetPort: 8000
EOF
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The MCP Server should now be running in a Pod via the &lt;code&gt;default&lt;/code&gt; Namespace with the &lt;code&gt;mcp-math-server&lt;/code&gt; k8s Service sitting in front of the Pod.&lt;/p&gt;
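&lt;p&gt;The &lt;code&gt;handle_mcp&lt;/code&gt; function above reads three fields from each request: &lt;code&gt;method&lt;/code&gt;, &lt;code&gt;id&lt;/code&gt;, and &lt;code&gt;params&lt;/code&gt;. To make that concrete, here's a Python sketch that builds the JSON-RPC 2.0 messages a client would POST to &lt;code&gt;/mcp&lt;/code&gt;. The &lt;code&gt;rpc_request&lt;/code&gt; and &lt;code&gt;call_tool&lt;/code&gt; helpers are hypothetical, and the &lt;code&gt;initialize&lt;/code&gt; message is sent without the parameters a full MCP client would include, which works here only because this handler doesn't inspect them.&lt;/p&gt;

```python
import json
from itertools import count

_ids = count(1)  # JSON-RPC request IDs; any value unique per request works


def rpc_request(method, params=None):
    """Build a JSON-RPC 2.0 request dict for the /mcp endpoint."""
    message = {"jsonrpc": "2.0", "id": next(_ids), "method": method}
    if params is not None:
        message["params"] = params
    return message


def call_tool(name, arguments):
    """Build a tools/call request, e.g. for the add or multiply tool."""
    return rpc_request("tools/call", {"name": name, "arguments": arguments})


if __name__ == "__main__":
    for message in (
        rpc_request("initialize"),
        rpc_request("tools/list"),
        call_tool("add", {"a": 2, "b": 2}),
    ):
        print(json.dumps(message))
```

&lt;p&gt;Each printed line is a request body. MCP clients such as MCP Inspector and Claude Desktop generate messages like these for you, so this is mostly useful for debugging with a raw HTTP client.&lt;/p&gt;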

&lt;h2&gt;
  
  
  Configuring A Gateway
&lt;/h2&gt;

&lt;p&gt;With the MCP Server deployed, you need a way to pass traffic through to it. When Agents communicate with other Agents, MCP Servers, or LLMs, there's a "middle layer" that gets the Agent from point A (itself) to point B (the MCP Server in this case). That "middle layer" is where the packets flow, and it's the Gateway.&lt;/p&gt;

&lt;p&gt;If you aren't running on a Kubernetes cluster that has the ability to create a public ALB with an IP address that's accessible externally, you can use something like &lt;a href="https://github.com/metallb/metallb" rel="noopener noreferrer"&gt;MetalLB&lt;/a&gt; or &lt;code&gt;port-forward&lt;/code&gt; the Gateway in your terminal.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Create a new Gateway, which will use the agentgateway Gateway Class. It will listen on port 8080 and allow Routes from the same Namespace as the Gateway (&lt;code&gt;agentgateway-system&lt;/code&gt;).
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl apply -f - &amp;lt;&amp;lt;EOF
apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: agentgateway-mcp
  namespace: agentgateway-system
spec:
  gatewayClassName: enterprise-agentgateway
  listeners:
  - name: http
    port: 8080
    protocol: HTTP
    allowedRoutes:
      namespaces:
        from: Same
EOF
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ol&gt;
&lt;li&gt;Implement an agentgateway backend, which is what tells the Gateway what to route to. In this case, it's the MCP Server that you deployed in the previous section.
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl apply -f - &amp;lt;&amp;lt;EOF
apiVersion: agentgateway.dev/v1alpha1
kind: AgentgatewayBackend
metadata:
  name: demo-mcp-server
  namespace: agentgateway-system
spec:
  mcp:
    targets:
      - name: demo-mcp-server
        static:
          host: mcp-math-server.default.svc.cluster.local
          port: 80
          path: /mcp
          protocol: StreamableHTTP
EOF
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ol&gt;
&lt;li&gt;Create an HTTP route so there's a path for the Gateway to route to. In this case, the "path" is the MCP Server via the agentgateway backend.
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl apply -f - &amp;lt;&amp;lt;EOF
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: mcp-route
  namespace: agentgateway-system
spec:
  parentRefs:
  - name: agentgateway-mcp
  rules:
  - backendRefs:
    - name: demo-mcp-server
      namespace: agentgateway-system
      group: agentgateway.dev
      kind: AgentgatewayBackend
EOF
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ol&gt;
&lt;li&gt;Retrieve the IP address of the Gateway. If an external one doesn't exist, you can &lt;code&gt;port-forward&lt;/code&gt; the Gateway service.
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;export GATEWAY_IP=$(kubectl get svc agentgateway-mcp -n agentgateway-system -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
echo $GATEWAY_IP
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ol&gt;
&lt;li&gt;Open MCP Inspector to test the traffic to the MCP Server.
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;npx modelcontextprotocol/inspector#0.16.2
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ol&gt;
&lt;li&gt;Add the following URL into MCP Inspector. If you're port forwarding the Gateway service, use &lt;code&gt;localhost&lt;/code&gt; instead of an IP address.
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;http://YOUR_ALB_LB_IP:8080/mcp
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If you search for tools, you should see an &lt;code&gt;add&lt;/code&gt; and &lt;code&gt;multiply&lt;/code&gt; tool.&lt;/p&gt;

&lt;h2&gt;
  
  
  Configure Claude Desktop With An MCP Server
&lt;/h2&gt;

&lt;p&gt;The last step is to configure Claude Desktop to route through/use the AI gateway (agentgateway) that you deployed in the previous section. This will ensure that the traffic flowing from Claude Desktop to the MCP Server is observable, has the ability to be secured, and is going through a properly built Gateway designed specifically for AI workloads.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Create a new file called &lt;code&gt;claude_desktop_config.json&lt;/code&gt; in Claude Desktop's configuration directory. The following example uses the macOS path.
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;mkdir -p ~/Library/Application\ Support/Claude
cat &amp;gt; ~/Library/Application\ Support/Claude/claude_desktop_config.json &amp;lt;&amp;lt; 'EOF'
{
  "mcpServers": {
    "math-service": {
      "command": "npx",
      "args": ["-y", "supergateway", "--streamableHttp", "http://YOUR_ALB_LB_IP:8080/mcp"]
    }
  }
}
EOF
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ol&gt;
&lt;li&gt;After saving the config, restart Claude Desktop for changes to take effect. If you don't see any errors when opening Claude Desktop, that means the configuration that you added in step 1 worked as expected.&lt;/li&gt;
&lt;li&gt;With Claude Desktop open, ask it a simple question like &lt;code&gt;What is 2 + 2&lt;/code&gt;.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Traffic from Claude Desktop is now routing through agentgateway!&lt;/p&gt;

</description>
      <category>programming</category>
      <category>kubernetes</category>
      <category>ai</category>
      <category>mcp</category>
    </item>
    <item>
      <title>Running Any AI Agent on Kubernetes: Step-by-Step</title>
      <dc:creator>Michael Levan</dc:creator>
      <pubDate>Sat, 13 Dec 2025 23:35:58 +0000</pubDate>
      <link>https://dev.to/thenjdevopsguy/running-any-ai-agent-on-kubernetes-step-by-step-10n5</link>
      <guid>https://dev.to/thenjdevopsguy/running-any-ai-agent-on-kubernetes-step-by-step-10n5</guid>
      <description>&lt;p&gt;There are many Agent creation frameworks, ranging from CrewAI to kagent to LangChain and several others, which are typically written in Python or JS. If you're an engineer working on Kubernetes, you may be thinking "What about a declarative Agent deployment method?"&lt;/p&gt;

&lt;p&gt;In this blog post, you'll see how to create your own Agent in an Agent framework and then deploy it to kagent in a declarative fashion.&lt;/p&gt;

&lt;h2&gt;
  
  
  Prerequisites
&lt;/h2&gt;

&lt;p&gt;To follow along with this blog post, you should have the following:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;A Kubernetes cluster deployed with kagent installed. If you've never installed kagent, you can find the how-to &lt;a href="https://www.cloudnativedeepdive.com/kagent-claude-k8s-your-private-agentic-troubleshooter/" rel="noopener noreferrer"&gt;here&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;Python 3.10 or above installed.&lt;/li&gt;
&lt;li&gt;Docker Desktop (or just Docker Engine) installed to build the container image.&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  What Are BYO Agents
&lt;/h2&gt;

&lt;p&gt;BYO (Bring Your Own) means you can create an Agent in any of the frameworks that kagent supports. You can also build your Agent entirely in kagent and connect it to MCP servers there, but if you're already used to writing your Agents in Python using CrewAI, ADK, LangChain, or any other framework, kagent gives you the ability to import those Agents. The only thing you need to do is containerize the Agent, which is straightforward with a Dockerfile (you'll see an example in the section on creating Agents).&lt;/p&gt;

&lt;h2&gt;
  
  
  Building An Agent
&lt;/h2&gt;

&lt;p&gt;Now that you know what BYO Agents are, it's time to create an Agent and see it run within Kubernetes. The next two sections walk you through building a custom Agent with Agent Development Kit (ADK), an Agent creation framework, and then using an existing Agent that's already packaged for Kubernetes so you can see the deployment process end to end.&lt;/p&gt;

&lt;h3&gt;
  
  
  Creating An Agent
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;Install the Google ADK library. Depending on where you're running the below, you may need to use &lt;code&gt;pip3&lt;/code&gt; instead of &lt;code&gt;pip&lt;/code&gt;.
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;pip install google-adk
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ol&gt;
&lt;li&gt;With the &lt;code&gt;adk&lt;/code&gt; command, use the &lt;code&gt;create&lt;/code&gt; subcommand to create a scaffolding for an ADK Agent in Python.
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;adk create NAME_OF_YOUR_AGENT
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You should see an output similar to the one below (with the name of your Agent).&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fl43fyfdmuylpb1vnx0u2.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fl43fyfdmuylpb1vnx0u2.png" alt=" " width="478" height="302"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Because the scaffolding includes an Agent template, you can &lt;code&gt;cd&lt;/code&gt; into the directory and use the &lt;code&gt;run&lt;/code&gt; subcommand to see it in action.
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;cd adk/NAME_OF_YOUR_AGENT &amp;amp;&amp;amp; adk run NAME_OF_YOUR_AGENT
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Using An Existing Agent
&lt;/h3&gt;

&lt;p&gt;To make life a bit easier, instead of building out everything needed to containerize the Agent, you can use one that I've already built and tested. If you're wondering "Well, why did I build an Agent then?", it's because you can use the example in this section as a reference to containerize and run your own Agent afterward.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Clone the &lt;code&gt;agentic-demo-code&lt;/code&gt; repo and &lt;code&gt;cd&lt;/code&gt; into the &lt;code&gt;adk/troubleshoot-agent&lt;/code&gt; directory.&lt;/li&gt;
&lt;li&gt;Open the &lt;code&gt;Dockerfile&lt;/code&gt; and you should see the file contents below.
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;### STAGE 1: base image
ARG DOCKER_REGISTRY=ghcr.io
ARG VERSION=0.7.4
FROM $DOCKER_REGISTRY/kagent-dev/kagent/kagent-adk:$VERSION

WORKDIR /app

COPY troubleshootagent/ troubleshootagent/
COPY pyproject.toml pyproject.toml
COPY uv.lock uv.lock
COPY how-it-works.md how-it-works.md

RUN uv sync --locked --refresh

CMD ["troubleshootagent"]

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ol&gt;
&lt;li&gt;Run the following command to build the container image.
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;docker build . -t troubleshootagent:latest
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If you see an error about &lt;code&gt;uv sync&lt;/code&gt;, run the following command to create a lock file for library versions and dependencies, then run the build again.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;uv lock
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You should see that the image was fully built.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4j3f0ao8up8o5nfi84fl.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4j3f0ao8up8o5nfi84fl.png" alt=" " width="800" height="378"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;With the Agent container image built locally, you'll need to push it to a container registry of your choosing. Docker Hub has a free tier, so you can use that if you'd prefer. Below is an example with my Docker Hub org.
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;docker tag troubleshootagent:latest adminturneddevops/troubleshootagent:latest

docker push adminturneddevops/troubleshootagent:latest
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If you don't want to push the container image to your container registry, you can use &lt;code&gt;adminturneddevops/troubleshootagent:latest&lt;/code&gt; in the next section since the container image will be public.&lt;/p&gt;

&lt;h2&gt;
  
  
  Deploying An Agent On Kubernetes
&lt;/h2&gt;

&lt;p&gt;With the Agent fully built, it's time to deploy it on Kubernetes using the kagent framework. This will give you a declarative method of running Agents in a mature orchestration platform like Kubernetes.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;For the Agent to work, it'll connect to an LLM. You need authentication/API access to an LLM of your choosing. In this scenario, Google Gemini is used, but you can swap it for any AI Provider you'd like to use.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Use an environment variable to export the API key.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;export GOOGLE_API_KEY=
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ol&gt;
&lt;li&gt;Create a Kubernetes Secret with the API key.
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl apply -f- &amp;lt;&amp;lt;EOF
apiVersion: v1
kind: Secret
metadata:
  name: kagent-google
  namespace: kagent
type: Opaque
stringData:
  GOOGLE_API_KEY: $GOOGLE_API_KEY
EOF
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
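&lt;p&gt;The heredoc above expands &lt;code&gt;$GOOGLE_API_KEY&lt;/code&gt; at apply time, so an empty variable would produce a Secret with a blank value. As a minimal sketch, a guard like the one below can catch that before you apply the manifest. The &lt;code&gt;require_key&lt;/code&gt; function name is hypothetical and isn't part of kagent or kubectl.&lt;/p&gt;

```shell
# Hypothetical guard: fail early if the API key value is empty,
# so the Secret is never created with a blank key.
require_key() {
  if [ -z "$1" ]; then
    echo "GOOGLE_API_KEY is empty; export it before creating the Secret"
    return 1
  fi
  echo "GOOGLE_API_KEY is set"
}

# Check the variable exported in the previous step.
require_key "${GOOGLE_API_KEY:-}" || echo "Skipping Secret creation"
```

&lt;p&gt;If the guard reports an empty key, export the key again and re-run the &lt;code&gt;kubectl apply&lt;/code&gt; command for the Secret.&lt;/p&gt;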



&lt;ol&gt;
&lt;li&gt;Use the &lt;code&gt;Agent&lt;/code&gt; object via the kagent CRDs to add the Agent to kagent.
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl apply -f - &amp;lt;&amp;lt;EOF
apiVersion: kagent.dev/v1alpha2
kind: Agent
metadata:
  name: troubelshoot-agent
  namespace: kagent
spec:
  description: This agent is used to be a Platform Engineering troubleshoot expert.
  type: BYO
  byo:
    deployment:
      image: adminturneddevops/troubleshootagent:latest
      env:
        - name: GOOGLE_API_KEY
          valueFrom:
            secretKeyRef:
              name: kagent-google
              key: GOOGLE_API_KEY
EOF
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ol&gt;
&lt;li&gt;Confirm that the Agent is running by looking at the Pod in the &lt;code&gt;kagent&lt;/code&gt; Namespace.
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl get pods -n kagent
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fo5ey66i2y21dioxvayt3.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fo5ey66i2y21dioxvayt3.png" alt=" " width="800" height="375"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;You can now begin using the Agent in kagent.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmee3ilk04eeugnlfab48.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmee3ilk04eeugnlfab48.png" alt=" " width="800" height="370"&gt;&lt;/a&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>programming</category>
      <category>kubernetes</category>
      <category>cloud</category>
    </item>
    <item>
      <title>Context-Aware Networking &amp; Runtimes: Agentic End-To-End</title>
      <dc:creator>Michael Levan</dc:creator>
      <pubDate>Sat, 06 Dec 2025 14:33:18 +0000</pubDate>
      <link>https://dev.to/thenjdevopsguy/context-aware-networking-runtimes-agentic-end-to-end-1cen</link>
      <guid>https://dev.to/thenjdevopsguy/context-aware-networking-runtimes-agentic-end-to-end-1cen</guid>
      <description>&lt;p&gt;AI network traffic can very much feel like a black box. You open an AI provider console or an Agent, ask a question or perform a task, and then what happens? Where does that traffic go? Is the traffic secure? Is it going to the appropriate destination? Where’s the context?&lt;/p&gt;

&lt;p&gt;There are what feel like hundreds of questions that need to be answered between the moment an Agent makes a call to an LLM and the moment you get a response.&lt;/p&gt;

&lt;p&gt;For those questions to be answered, you need an end-to-end workflow for when traffic leaves the Agent to when you get a response or a task is completed.&lt;/p&gt;

&lt;p&gt;In this blog post, you’ll learn, hands-on, how to answer those questions.&lt;/p&gt;

&lt;h2&gt;
  
  
  Install Kagent
&lt;/h2&gt;

&lt;p&gt;The first step in the process is installing the two platforms you'll be using: kagent and agentgateway. In this section, you'll deploy kagent, which is an Agent framework that runs on Kubernetes. Agentgateway will be installed in the next section.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Install the kagent CRDs and create the &lt;code&gt;kagent&lt;/code&gt; Namespace.
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;helm install kagent-crds oci://ghcr.io/kagent-dev/kagent/helm/kagent-crds \
    --namespace kagent \
    --create-namespace
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ol&gt;
&lt;li&gt;Specify your AI provider API key. For the purposes of this blog post, Anthropic is used. However, you can use whichever provider you'd like that's &lt;a href="https://kagent.dev/docs/kagent/supported-providers" rel="noopener noreferrer"&gt;supported&lt;/a&gt;.
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;export ANTHROPIC_API_KEY=your_api_key
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ol&gt;
&lt;li&gt;Install kagent with your specified provider.
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;helm upgrade --install kagent oci://ghcr.io/kagent-dev/kagent/helm/kagent \
    --namespace kagent \
    --set providers.default=anthropic \
    --set providers.anthropic.apiKey=$ANTHROPIC_API_KEY \
    --set ui.service.type=LoadBalancer
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ol&gt;
&lt;li&gt;Ensure that kagent is installed successfully.
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl get svc -n kagent
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You should see an output similar to the below.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxaamgy8y39lpu8wa1693.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxaamgy8y39lpu8wa1693.png" alt=" " width="800" height="501"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Install Agentgateway + Kgateway
&lt;/h2&gt;

&lt;p&gt;The next step in the process is to install kgateway with agentgateway enabled. Kgateway is the control plane (think of it as the brains of the operation) and agentgateway is the AI-enabled data plane/proxy for all agentic traffic. You'll see how you can track, observe, and secure all traffic going through an AI Agent with agentgateway.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Install the Kubernetes Gateway API CRDs. The Gateway is built on these CRDs to ensure flexibility and vendor-agnostic compatibility.
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl apply -f https://github.com/kubernetes-sigs/gateway-api/releases/download/v1.4.0/standard-install.yaml

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ol&gt;
&lt;li&gt;Install the kgateway CRDs.
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;helm upgrade -i --create-namespace --namespace kgateway-system kgateway-crds oci://cr.kgateway.dev/kgateway-dev/charts/kgateway-crds  \
--version v2.1.1 \
--set controller.image.pullPolicy=Always
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ol&gt;
&lt;li&gt;Install kgateway as the control plane along with agentgateway enabled for AI-related data plane/proxy traffic.
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;helm upgrade -i -n kgateway-system kgateway oci://cr.kgateway.dev/kgateway-dev/charts/kgateway \
    --version v2.1.1 \
    --set agentgateway.enabled=true \
    --set controller.image.pullPolicy=Always
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ol&gt;
&lt;li&gt;Ensure that kgateway's control plane was installed successfully by checking that the kgateway Pod is running.
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl get pods -n kgateway-system
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ol&gt;
&lt;li&gt;The gateway classes for both kgateway (standard Envoy/stateless traffic) and agentgateway (AI-related/stateful traffic) should now be available.
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl get gatewayclass
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You should see an output similar to the below.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;NAME           CONTROLLER                  ACCEPTED   AGE
agentgateway   kgateway.dev/agentgateway   True       20h
kgateway       kgateway.dev/kgateway       True       20h
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  LLM Gateway Creation
&lt;/h2&gt;

&lt;p&gt;Now that both the AI Agent framework (kagent) and the AI gateway (agentgateway) are installed, it's time to start the configuration for LLM-related traffic. Any time that you use an AI Agent in kagent, you can observe and secure that traffic via agentgateway. Think of agentgateway as the path that gets your AI traffic from point A to point B.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Create the secret for your AI provider. In this case, Anthropic is used.
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl apply -f- &amp;lt;&amp;lt;EOF
apiVersion: v1
kind: Secret
metadata:
  name: anthropic-secret
  namespace: kagent
  labels:
    app: agentgateway
type: Opaque
stringData:
  Authorization: $ANTHROPIC_API_KEY
EOF
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ol&gt;
&lt;li&gt;Create a Gateway using agentgateway as the data plane/proxy. When the Gateway is fully up and running, it should have a public IP address if you're on a managed Kubernetes cluster. If you aren't, you'll want to &lt;code&gt;port-forward&lt;/code&gt; the Gateway service so it can be used in the next section. You can check whether an external IP was assigned by running &lt;code&gt;kubectl get svc -n kgateway-system&lt;/code&gt;.
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl apply -f- &amp;lt;&amp;lt;EOF
kind: Gateway
apiVersion: gateway.networking.k8s.io/v1
metadata:
  name: agentgateway
  namespace: kgateway-system
  labels:
    app: agentgateway
spec:
  gatewayClassName: agentgateway
  listeners:
  - protocol: HTTP
    port: 8080
    name: http
    allowedRoutes:
      namespaces:
        from: All
EOF
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ol&gt;
&lt;li&gt;The Backend is used to tell the gateway what/where to route to. In this case, you'll be routing to an LLM.
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl apply -f- &amp;lt;&amp;lt;EOF
apiVersion: gateway.kgateway.dev/v1alpha1
kind: Backend
metadata:
  labels:
    app: agentgateway
  name: anthropic
  namespace: kgateway-system
spec:
  type: AI
  ai:
    llm:
      anthropic:
        authToken:
          kind: SecretRef
          secretRef:
            name: anthropic-secret
        model: "claude-3-5-haiku-latest"
EOF
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ol&gt;
&lt;li&gt;Ensure that the backend was deployed successfully.
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl get backend -n kgateway-system
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ol&gt;
&lt;li&gt;The last step is to create a route. The gateway and the backend are both configured, but when you hit the Gateway, there's no configuration to tell the Gateway where to route to. This is where the &lt;code&gt;HTTPRoute&lt;/code&gt; object comes into play.
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl apply -f- &amp;lt;&amp;lt;EOF
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: claude
  namespace: kgateway-system
  labels:
    app: agentgateway
spec:
  parentRefs:
    - name: agentgateway
      namespace: kgateway-system
  rules:
  - matches:
    - path:
        type: PathPrefix
        value: /anthropic
    filters:
    - type: URLRewrite
      urlRewrite:
        path:
          type: ReplaceFullPath
          replaceFullPath: /v1/chat/completions
    backendRefs:
    - name: anthropic
      namespace: kgateway-system
      group: gateway.kgateway.dev
      kind: Backend
EOF
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In the next section, you'll see how to create and connect an Agent.&lt;/p&gt;

&lt;h2&gt;
  
  
  Creating and Connecting An Agent
&lt;/h2&gt;

&lt;p&gt;With the Gateway created and traffic able to flow through the agentgateway proxy, you can create an Agent that routes all of its LLM traffic through agentgateway.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Create the Model Config, which makes Claude (or any other LLM) available to an Agent in kagent. Notice how the &lt;code&gt;baseUrl&lt;/code&gt; points to the agentgateway IP. The agentgateway IP was created when you applied the &lt;code&gt;Gateway&lt;/code&gt; object in the previous section. As mentioned in the previous section, if you aren't using a managed k8s cluster, you'll want to &lt;code&gt;port-forward&lt;/code&gt; the Gateway service so it can be accessed as the proxy.
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl apply -f - &amp;lt;&amp;lt;EOF
apiVersion: kagent.dev/v1alpha2
kind: ModelConfig
metadata:
  name: anthropic-model-config
  namespace: kagent
spec:
  apiKeySecret: anthropic-secret
  apiKeySecretKey: Authorization
  model: claude-3-5-haiku-latest
  provider: OpenAI
  openAI:
    baseUrl: http://YOUR_AGENTGATEWAY_IP:8080/anthropic
EOF
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;💡&lt;/p&gt;

&lt;p&gt;You'll notice that although Anthropic is used, OpenAI is the provider spec. The reason is that the &lt;code&gt;/v1/chat/completions&lt;/code&gt; route uses the OpenAI schema. We're using that schema for the request format, but the backend is still connecting to a Claude model.&lt;/p&gt;
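&lt;p&gt;To make the "OpenAI schema" point concrete, here's a minimal sketch of the kind of request body an OpenAI-style client sends to a chat completions route. The &lt;code&gt;chat_payload&lt;/code&gt; function is a hypothetical helper written for illustration, and the body only includes the &lt;code&gt;model&lt;/code&gt; and &lt;code&gt;messages&lt;/code&gt; fields; the exact request kagent sends may carry more.&lt;/p&gt;

```shell
# Hypothetical helper: print an OpenAI-style chat completions request body.
# Assumes the prompt has no double quotes or backslashes (no JSON escaping is done).
chat_payload() {
  printf '{"model": "%s", "messages": [{"role": "user", "content": "%s"}]}\n' "$1" "$2"
}

# Print the body for the model configured in the Backend and ModelConfig.
chat_payload "claude-3-5-haiku-latest" "What can you do?"
```

&lt;p&gt;The body uses the OpenAI format even though the model name is a Claude model, which is why the &lt;code&gt;provider&lt;/code&gt; in the Model Config is set to &lt;code&gt;OpenAI&lt;/code&gt;.&lt;/p&gt;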

&lt;ol&gt;
&lt;li&gt;Create the Agent which connects to the Model Config.
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl apply -f - &amp;lt;&amp;lt;EOF
apiVersion: kagent.dev/v1alpha2
kind: Agent
metadata:
  name: testing-agentgateway
  namespace: kagent
spec:
  description: This agent can use a single tool to expand its Kubernetes knowledge for troubleshooting and deployment
  type: Declarative
  declarative:
    modelConfig: anthropic-model-config
    systemMessage: |-
      You're a friendly and helpful agent that uses the Kubernetes tool to help troubleshoot and deploy environments

      # Instructions

      - If user question is unclear, ask for clarification before running any tools
      - Always be helpful and friendly
      - If you don't know how to answer the question DO NOT make things up
        respond with "Sorry, I don't know how to answer that" and ask the user to further clarify the question

      # Response format
      - ALWAYS format your response as Markdown
      - Your response will include a summary of actions you took and an explanation of the result
EOF
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ol&gt;
&lt;li&gt;Open kagent and find the &lt;code&gt;testing-agentgateway&lt;/code&gt; Agent.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4idwaeqqppnyf3xgzzi8.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4idwaeqqppnyf3xgzzi8.png" alt=" " width="800" height="501"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Prompt the Agent with &lt;code&gt;What can you do?&lt;/code&gt;. You should see an output similar to the below.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6mbqfhd0oo7drxz6lh1y.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6mbqfhd0oo7drxz6lh1y.png" alt=" " width="800" height="506"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Open a terminal and run the following commands:
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl get pods -n kgateway-system

kubectl logs YOUR_AGENTGATEWAY_POD_NAME -n kgateway-system
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You'll now be able to see that the traffic was routed through the agentgateway proxy.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight jsx"&gt;&lt;code&gt;&lt;span class="nx"&gt;request&lt;/span&gt; &lt;span class="nx"&gt;gateway&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nx"&gt;kgateway&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="nx"&gt;system&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="nx"&gt;agentgateway&lt;/span&gt; &lt;span class="nx"&gt;listener&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nx"&gt;http&lt;/span&gt; &lt;span class="nx"&gt;route&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nx"&gt;kgateway&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="nx"&gt;system&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="nx"&gt;claude&lt;/span&gt; &lt;span class="nx"&gt;endpoint&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nx"&gt;api&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;anthropic&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;com&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;443&lt;/span&gt; &lt;span class="nx"&gt;src&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;addr&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;192.168&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mf"&gt;46.17&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;29426&lt;/span&gt; &lt;span class="nx"&gt;http&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;method&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nx"&gt;POST&lt;/span&gt; &lt;span class="nx"&gt;http&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;host&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nx"&gt;a1e5a3b9a8eba4aa09517966f1777763&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;34157947&lt;/span&gt;&lt;span 
class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;us&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="nx"&gt;east&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;elb&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;amazonaws&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;com&lt;/span&gt; &lt;span class="nx"&gt;http&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;path&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sr"&gt;/anthropic/&lt;/span&gt;&lt;span class="nx"&gt;chat&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="nx"&gt;completions&lt;/span&gt; &lt;span class="nx"&gt;http&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;version&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nx"&gt;HTTP&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="mf"&gt;1.1&lt;/span&gt; &lt;span class="nx"&gt;http&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;status&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;200&lt;/span&gt; &lt;span class="nx"&gt;protocol&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nx"&gt;llm&lt;/span&gt; &lt;span class="nx"&gt;gen_ai&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;operation&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nx"&gt;chat&lt;/span&gt; &lt;span class="nx"&gt;gen_ai&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;provider&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nx"&gt;anthropic&lt;/span&gt; &lt;span class="nx"&gt;gen_ai&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span 
class="nx"&gt;request&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nx"&gt;claude&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="nx"&gt;haiku&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="nx"&gt;latest&lt;/span&gt; &lt;span class="nx"&gt;gen_ai&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nx"&gt;claude&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="nx"&gt;haiku&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;20241022&lt;/span&gt; &lt;span class="nx"&gt;gen_ai&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;usage&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;input_tokens&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;183&lt;/span&gt; &lt;span class="nx"&gt;gen_ai&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;usage&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;output_tokens&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;240&lt;/span&gt; &lt;span class="nx"&gt;duration&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;4797&lt;/span&gt;&lt;span class="nx"&gt;ms&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



</description>
      <category>ai</category>
      <category>kubernetes</category>
      <category>programming</category>
      <category>cloud</category>
    </item>
    <item>
      <title>Security Holes in MCP Servers and How To Plug Them</title>
      <dc:creator>Michael Levan</dc:creator>
      <pubDate>Tue, 02 Dec 2025 12:35:43 +0000</pubDate>
      <link>https://dev.to/thenjdevopsguy/security-holes-in-mcp-servers-and-how-to-plug-them-d61</link>
      <guid>https://dev.to/thenjdevopsguy/security-holes-in-mcp-servers-and-how-to-plug-them-d61</guid>
      <description>&lt;p&gt;Model Context Protocol (MCP) officially turned one year old on November 25th, and although there have been some amazing innovations within MCP, one issue still persists: the gaping security hole. This is no secret, as just about every organization is talking about it. The long-running joke so far has been “The S in MCP stands for security”.&lt;/p&gt;

&lt;p&gt;Aside from prompt injections, MCP Server security is arguably the biggest issue in the AI security world right now.&lt;/p&gt;

&lt;p&gt;In this blog post, you’ll learn how to fix the gaps.&lt;/p&gt;

&lt;h2&gt;
  
  
  Prerequisites
&lt;/h2&gt;

&lt;p&gt;To follow along with this blog post, you should have:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;A Kubernetes cluster running (it can be local).&lt;/li&gt;
&lt;li&gt;Kubernetes Gateway API CRDs installed, which you can find &lt;a href="https://gateway-api.sigs.k8s.io/guides/" rel="noopener noreferrer"&gt;here&lt;/a&gt;.&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  Why Security Matters For MCP
&lt;/h2&gt;

&lt;p&gt;There are two forms of MCP servers:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;code&gt;stdio&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;StreamableHTTP&lt;/code&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;code&gt;stdio&lt;/code&gt; (the communication method) stands for standard input/output. With &lt;code&gt;stdio&lt;/code&gt;, you’re typically targeting a pre-built MCP Server, usually written in Python or JS (Go is up and coming in this space), that’s launched locally via a command like &lt;code&gt;uvx&lt;/code&gt; or &lt;code&gt;npx&lt;/code&gt; (depending on how the MCP Server is built). It’s not a standard library download (like a &lt;code&gt;pip install&lt;/code&gt;); instead, it’s stored in a cache (e.g., &lt;code&gt;~/.local/share/uv&lt;/code&gt;).&lt;/p&gt;

&lt;p&gt;&lt;code&gt;StreamableHTTP&lt;/code&gt; means you’re reaching out to an external (or even internal) server that’s not cached locally. A good example of this is the GitHub Copilot MCP Server: you reach it over HTTP instead of launching it from a local cache.&lt;/p&gt;
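
&lt;p&gt;To make the difference concrete, here is a minimal Python sketch (not taken from any MCP SDK) of how the same JSON-RPC 2.0 message could be framed for each transport. The helper names are hypothetical and the &lt;code&gt;localhost&lt;/code&gt; endpoint is only a placeholder. It assumes the framing described in the MCP specification: newline-delimited JSON for &lt;code&gt;stdio&lt;/code&gt;, and an HTTP POST that accepts either JSON or an event stream for &lt;code&gt;StreamableHTTP&lt;/code&gt;.&lt;/p&gt;

```python
import json


def jsonrpc_request(request_id, method, params=None):
    """Build a JSON-RPC 2.0 request object, the envelope MCP uses."""
    message = {"jsonrpc": "2.0", "id": request_id, "method": method}
    if params is not None:
        message["params"] = params
    return message


def frame_for_stdio(message):
    """stdio: one JSON message per line, written to the server's stdin."""
    return (json.dumps(message) + "\n").encode("utf-8")


def frame_for_http(message, endpoint):
    """Streamable HTTP: the same message travels as a POST body."""
    return {
        "method": "POST",
        "url": endpoint,  # placeholder endpoint, assumed for illustration
        "headers": {
            "Content-Type": "application/json",
            "Accept": "application/json, text/event-stream",
        },
        "body": json.dumps(message).encode("utf-8"),
    }


list_tools = jsonrpc_request(1, "tools/list")
stdio_bytes = frame_for_stdio(list_tools)
http_request = frame_for_http(list_tools, "http://localhost:8080/mcp")

print(stdio_bytes)
print(http_request["headers"]["Accept"])
```

&lt;p&gt;The payload is identical either way; only the wrapper changes. The HTTP form is ordinary web traffic, while the &lt;code&gt;stdio&lt;/code&gt; form never leaves the local process pipe.&lt;/p&gt;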

&lt;p&gt;💡 There was another option called SSE which you may see, but it is now deprecated as of June 2025.&lt;/p&gt;

&lt;p&gt;Regardless of which option you choose, there are many security holes.&lt;/p&gt;

&lt;p&gt;One (of the many) problems with &lt;code&gt;stdio&lt;/code&gt; MCP Servers is that you can't run them through a Gateway. That means no AuthN/Z, no rate limiting, and no tool control. The MCP Servers are effectively open and usable by anyone in an organization unless you’re manually locking each computer down with Claude Desktop configurations (which, spoiler alert: no one’s doing).&lt;/p&gt;

&lt;p&gt;With Streamable HTTP, you’re in the dark. You’re connecting Agents to some black box running in someone’s data center with who knows what (if any) security protocols, and even if there are security protocols, that doesn’t help from an overall AuthN/Z perspective for your organization. There’s also no way to even test the security without a proper pentest, which wouldn’t be legal without explicit permission from the organization hosting the MCP Server. The only way to do it would be to put an Agent in front of the MCP Server, but then you're not actually securing the MCP Server; you're securing the Agent.&lt;/p&gt;

&lt;p&gt;As Model Context Protocol stands right now, there’s only one true way to secure it - with a proper AI Gateway.&lt;/p&gt;

&lt;p&gt;Solo Enterprise For Agentgateway implements everything from locking down tools to proper user and system-level authentication to MCP Servers with or without Agents. In the following sections, you’ll see how to configure security for MCP.&lt;/p&gt;

&lt;h2&gt;
  
  
  Deploy An MCP Server and Agentgateway
&lt;/h2&gt;

&lt;p&gt;The first step is to deploy an MCP Server and a Gateway via agentgateway enterprise so we not only have an MCP Server to test with, but a proper AI gateway to secure our MCP connectivity.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Create a new Kubernetes Deployment pointing to the test MCP Server that is containerized. You’ll also see a Service that gets deployed so the Gateway can properly connect to it.
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight jsx"&gt;&lt;code&gt;&lt;span class="nx"&gt;kubectl&lt;/span&gt; &lt;span class="nx"&gt;apply&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="nx"&gt;f&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="err"&gt;&amp;lt;&lt;/span&gt;&lt;span class="na"&gt;EOF&lt;/span&gt;
&lt;span class="na"&gt;apiVersion&lt;/span&gt;&lt;span class="err"&gt;:&lt;/span&gt; &lt;span class="na"&gt;apps&lt;/span&gt;&lt;span class="err"&gt;/&lt;/span&gt;&lt;span class="na"&gt;v1&lt;/span&gt;
&lt;span class="na"&gt;kind&lt;/span&gt;&lt;span class="err"&gt;:&lt;/span&gt; &lt;span class="na"&gt;Deployment&lt;/span&gt;
&lt;span class="na"&gt;metadata&lt;/span&gt;&lt;span class="err"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="err"&gt;:&lt;/span&gt; &lt;span class="na"&gt;mcp-website-fetcher&lt;/span&gt;
  &lt;span class="na"&gt;namespace&lt;/span&gt;&lt;span class="err"&gt;:&lt;/span&gt; &lt;span class="na"&gt;default&lt;/span&gt;
&lt;span class="na"&gt;spec&lt;/span&gt;&lt;span class="err"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;selector&lt;/span&gt;&lt;span class="err"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;matchLabels&lt;/span&gt;&lt;span class="err"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;app&lt;/span&gt;&lt;span class="err"&gt;:&lt;/span&gt; &lt;span class="na"&gt;mcp-website-fetcher&lt;/span&gt;
  &lt;span class="na"&gt;template&lt;/span&gt;&lt;span class="err"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;metadata&lt;/span&gt;&lt;span class="err"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;labels&lt;/span&gt;&lt;span class="err"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;app&lt;/span&gt;&lt;span class="err"&gt;:&lt;/span&gt; &lt;span class="na"&gt;mcp-website-fetcher&lt;/span&gt;
    &lt;span class="na"&gt;spec&lt;/span&gt;&lt;span class="err"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;containers&lt;/span&gt;&lt;span class="err"&gt;:&lt;/span&gt;
      &lt;span class="err"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="err"&gt;:&lt;/span&gt; &lt;span class="na"&gt;mcp-website-fetcher&lt;/span&gt;
        &lt;span class="na"&gt;image&lt;/span&gt;&lt;span class="err"&gt;:&lt;/span&gt; &lt;span class="na"&gt;ghcr&lt;/span&gt;&lt;span class="err"&gt;.&lt;/span&gt;&lt;span class="na"&gt;io&lt;/span&gt;&lt;span class="err"&gt;/&lt;/span&gt;&lt;span class="na"&gt;peterj&lt;/span&gt;&lt;span class="err"&gt;/&lt;/span&gt;&lt;span class="na"&gt;mcp-website-fetcher&lt;/span&gt;&lt;span class="err"&gt;:&lt;/span&gt;&lt;span class="na"&gt;main&lt;/span&gt;
        &lt;span class="na"&gt;imagePullPolicy&lt;/span&gt;&lt;span class="err"&gt;:&lt;/span&gt; &lt;span class="na"&gt;Always&lt;/span&gt;
&lt;span class="err"&gt;---&lt;/span&gt;
&lt;span class="na"&gt;apiVersion&lt;/span&gt;&lt;span class="err"&gt;:&lt;/span&gt; &lt;span class="na"&gt;v1&lt;/span&gt;
&lt;span class="na"&gt;kind&lt;/span&gt;&lt;span class="err"&gt;:&lt;/span&gt; &lt;span class="na"&gt;Service&lt;/span&gt;
&lt;span class="na"&gt;metadata&lt;/span&gt;&lt;span class="err"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="err"&gt;:&lt;/span&gt; &lt;span class="na"&gt;mcp-website-fetcher&lt;/span&gt;
  &lt;span class="na"&gt;namespace&lt;/span&gt;&lt;span class="err"&gt;:&lt;/span&gt; &lt;span class="na"&gt;default&lt;/span&gt;
&lt;span class="na"&gt;spec&lt;/span&gt;&lt;span class="err"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;selector&lt;/span&gt;&lt;span class="err"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;app&lt;/span&gt;&lt;span class="err"&gt;:&lt;/span&gt; &lt;span class="na"&gt;mcp-website-fetcher&lt;/span&gt;
  &lt;span class="na"&gt;ports&lt;/span&gt;&lt;span class="err"&gt;:&lt;/span&gt;
  &lt;span class="err"&gt;-&lt;/span&gt; &lt;span class="na"&gt;port&lt;/span&gt;&lt;span class="err"&gt;:&lt;/span&gt; &lt;span class="na"&gt;80&lt;/span&gt;
    &lt;span class="na"&gt;targetPort&lt;/span&gt;&lt;span class="err"&gt;:&lt;/span&gt; &lt;span class="na"&gt;8000&lt;/span&gt;
    &lt;span class="na"&gt;appProtocol&lt;/span&gt;&lt;span class="err"&gt;:&lt;/span&gt; &lt;span class="na"&gt;kgateway&lt;/span&gt;&lt;span class="err"&gt;.&lt;/span&gt;&lt;span class="na"&gt;dev&lt;/span&gt;&lt;span class="err"&gt;/&lt;/span&gt;&lt;span class="na"&gt;mcp&lt;/span&gt;
&lt;span class="na"&gt;EOF&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ol start="2"&gt;
&lt;li&gt;Deploy the Backend so agentgateway knows what to route to. In this case, it’s routing to the MCP Server service that you deployed in step 1.
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight jsx"&gt;&lt;code&gt;&lt;span class="nx"&gt;kubectl&lt;/span&gt; &lt;span class="nx"&gt;apply&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="nx"&gt;f&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="err"&gt;&amp;lt;&lt;/span&gt;&lt;span class="na"&gt;EOF&lt;/span&gt;
&lt;span class="na"&gt;apiVersion&lt;/span&gt;&lt;span class="err"&gt;:&lt;/span&gt; &lt;span class="na"&gt;gateway&lt;/span&gt;&lt;span class="err"&gt;.&lt;/span&gt;&lt;span class="na"&gt;kgateway&lt;/span&gt;&lt;span class="err"&gt;.&lt;/span&gt;&lt;span class="na"&gt;dev&lt;/span&gt;&lt;span class="err"&gt;/&lt;/span&gt;&lt;span class="na"&gt;v1alpha1&lt;/span&gt;
&lt;span class="na"&gt;kind&lt;/span&gt;&lt;span class="err"&gt;:&lt;/span&gt; &lt;span class="na"&gt;Backend&lt;/span&gt;
&lt;span class="na"&gt;metadata&lt;/span&gt;&lt;span class="err"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="err"&gt;:&lt;/span&gt; &lt;span class="na"&gt;mcp-backend&lt;/span&gt;
  &lt;span class="na"&gt;namespace&lt;/span&gt;&lt;span class="err"&gt;:&lt;/span&gt; &lt;span class="na"&gt;gloo-system&lt;/span&gt;
&lt;span class="na"&gt;spec&lt;/span&gt;&lt;span class="err"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="err"&gt;:&lt;/span&gt; &lt;span class="na"&gt;MCP&lt;/span&gt;
  &lt;span class="na"&gt;mcp&lt;/span&gt;&lt;span class="err"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;targets&lt;/span&gt;&lt;span class="err"&gt;:&lt;/span&gt;
    &lt;span class="err"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="err"&gt;:&lt;/span&gt; &lt;span class="na"&gt;mcp-target&lt;/span&gt;
      &lt;span class="na"&gt;static&lt;/span&gt;&lt;span class="err"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;host&lt;/span&gt;&lt;span class="err"&gt;:&lt;/span&gt; &lt;span class="na"&gt;mcp-website-fetcher&lt;/span&gt;&lt;span class="err"&gt;.&lt;/span&gt;&lt;span class="na"&gt;default&lt;/span&gt;&lt;span class="err"&gt;.&lt;/span&gt;&lt;span class="na"&gt;svc&lt;/span&gt;&lt;span class="err"&gt;.&lt;/span&gt;&lt;span class="na"&gt;cluster&lt;/span&gt;&lt;span class="err"&gt;.&lt;/span&gt;&lt;span class="na"&gt;local&lt;/span&gt;
        &lt;span class="na"&gt;port&lt;/span&gt;&lt;span class="err"&gt;:&lt;/span&gt; &lt;span class="na"&gt;80&lt;/span&gt;
        &lt;span class="na"&gt;protocol&lt;/span&gt;&lt;span class="err"&gt;:&lt;/span&gt; &lt;span class="na"&gt;StreamableHTTP&lt;/span&gt;
&lt;span class="na"&gt;EOF&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ol start="3"&gt;
&lt;li&gt;Deploy the Gateway using the agentgateway enterprise class. This Gateway is what will be used for MCP Inspector to connect to (more on Inspector coming up).
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight jsx"&gt;&lt;code&gt;&lt;span class="nx"&gt;kubectl&lt;/span&gt; &lt;span class="nx"&gt;apply&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="nx"&gt;f&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="err"&gt;&amp;lt;&lt;/span&gt;&lt;span class="na"&gt;EOF&lt;/span&gt;
&lt;span class="na"&gt;apiVersion&lt;/span&gt;&lt;span class="err"&gt;:&lt;/span&gt; &lt;span class="na"&gt;gateway&lt;/span&gt;&lt;span class="err"&gt;.&lt;/span&gt;&lt;span class="na"&gt;networking&lt;/span&gt;&lt;span class="err"&gt;.&lt;/span&gt;&lt;span class="na"&gt;k8s&lt;/span&gt;&lt;span class="err"&gt;.&lt;/span&gt;&lt;span class="na"&gt;io&lt;/span&gt;&lt;span class="err"&gt;/&lt;/span&gt;&lt;span class="na"&gt;v1&lt;/span&gt;
&lt;span class="na"&gt;kind&lt;/span&gt;&lt;span class="err"&gt;:&lt;/span&gt; &lt;span class="na"&gt;Gateway&lt;/span&gt;
&lt;span class="na"&gt;metadata&lt;/span&gt;&lt;span class="err"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="err"&gt;:&lt;/span&gt; &lt;span class="na"&gt;agentgateway&lt;/span&gt;
  &lt;span class="na"&gt;namespace&lt;/span&gt;&lt;span class="err"&gt;:&lt;/span&gt; &lt;span class="na"&gt;gloo-system&lt;/span&gt;
&lt;span class="na"&gt;spec&lt;/span&gt;&lt;span class="err"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;gatewayClassName&lt;/span&gt;&lt;span class="err"&gt;:&lt;/span&gt; &lt;span class="na"&gt;agentgateway-enterprise&lt;/span&gt;
  &lt;span class="na"&gt;listeners&lt;/span&gt;&lt;span class="err"&gt;:&lt;/span&gt;
  &lt;span class="err"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="err"&gt;:&lt;/span&gt; &lt;span class="na"&gt;http&lt;/span&gt;
    &lt;span class="na"&gt;port&lt;/span&gt;&lt;span class="err"&gt;:&lt;/span&gt; &lt;span class="na"&gt;8080&lt;/span&gt;
    &lt;span class="na"&gt;protocol&lt;/span&gt;&lt;span class="err"&gt;:&lt;/span&gt; &lt;span class="na"&gt;HTTP&lt;/span&gt;
    &lt;span class="na"&gt;allowedRoutes&lt;/span&gt;&lt;span class="err"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;namespaces&lt;/span&gt;&lt;span class="err"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;from&lt;/span&gt;&lt;span class="err"&gt;:&lt;/span&gt; &lt;span class="na"&gt;Same&lt;/span&gt;
&lt;span class="na"&gt;EOF&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ol start="4"&gt;
&lt;li&gt;Create an HTTP route so you can route the traffic to the backend, which you created in step 2.
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight jsx"&gt;&lt;code&gt;&lt;span class="nx"&gt;kubectl&lt;/span&gt; &lt;span class="nx"&gt;apply&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="nx"&gt;f&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="err"&gt;&amp;lt;&lt;/span&gt;&lt;span class="na"&gt;EOF&lt;/span&gt;
&lt;span class="na"&gt;apiVersion&lt;/span&gt;&lt;span class="err"&gt;:&lt;/span&gt; &lt;span class="na"&gt;gateway&lt;/span&gt;&lt;span class="err"&gt;.&lt;/span&gt;&lt;span class="na"&gt;networking&lt;/span&gt;&lt;span class="err"&gt;.&lt;/span&gt;&lt;span class="na"&gt;k8s&lt;/span&gt;&lt;span class="err"&gt;.&lt;/span&gt;&lt;span class="na"&gt;io&lt;/span&gt;&lt;span class="err"&gt;/&lt;/span&gt;&lt;span class="na"&gt;v1&lt;/span&gt;
&lt;span class="na"&gt;kind&lt;/span&gt;&lt;span class="err"&gt;:&lt;/span&gt; &lt;span class="na"&gt;HTTPRoute&lt;/span&gt;
&lt;span class="na"&gt;metadata&lt;/span&gt;&lt;span class="err"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="err"&gt;:&lt;/span&gt; &lt;span class="na"&gt;mcp-route&lt;/span&gt;
  &lt;span class="na"&gt;namespace&lt;/span&gt;&lt;span class="err"&gt;:&lt;/span&gt; &lt;span class="na"&gt;gloo-system&lt;/span&gt;
&lt;span class="na"&gt;spec&lt;/span&gt;&lt;span class="err"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;parentRefs&lt;/span&gt;&lt;span class="err"&gt;:&lt;/span&gt;
  &lt;span class="err"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="err"&gt;:&lt;/span&gt; &lt;span class="na"&gt;agentgateway&lt;/span&gt;
  &lt;span class="na"&gt;rules&lt;/span&gt;&lt;span class="err"&gt;:&lt;/span&gt;
  &lt;span class="err"&gt;-&lt;/span&gt; &lt;span class="na"&gt;backendRefs&lt;/span&gt;&lt;span class="err"&gt;:&lt;/span&gt;
    &lt;span class="err"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="err"&gt;:&lt;/span&gt; &lt;span class="na"&gt;mcp-backend&lt;/span&gt;
      &lt;span class="na"&gt;group&lt;/span&gt;&lt;span class="err"&gt;:&lt;/span&gt; &lt;span class="na"&gt;gateway&lt;/span&gt;&lt;span class="err"&gt;.&lt;/span&gt;&lt;span class="na"&gt;kgateway&lt;/span&gt;&lt;span class="err"&gt;.&lt;/span&gt;&lt;span class="na"&gt;dev&lt;/span&gt;
      &lt;span class="na"&gt;kind&lt;/span&gt;&lt;span class="err"&gt;:&lt;/span&gt; &lt;span class="na"&gt;Backend&lt;/span&gt;
&lt;span class="na"&gt;EOF&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ol start="5"&gt;
&lt;li&gt;Capture the Gateway’s load balancer address in an environment variable to be used when connecting to the MCP Server. If your load balancer exposes a hostname instead of an IP (as AWS load balancers do), use &lt;code&gt;.hostname&lt;/code&gt; in place of &lt;code&gt;.ip&lt;/code&gt; in the &lt;code&gt;jsonpath&lt;/code&gt; expression. If you’re running this locally, you can do a &lt;code&gt;port-forward&lt;/code&gt; on the Gateway and use &lt;code&gt;localhost&lt;/code&gt; in place of the IP address when connecting to it.
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight jsx"&gt;&lt;code&gt;&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="nx"&gt;GATEWAY_IP&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nf"&gt;$&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;kubectl&lt;/span&gt; &lt;span class="kd"&gt;get&lt;/span&gt; &lt;span class="nx"&gt;svc&lt;/span&gt; &lt;span class="nx"&gt;agentgateway&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="nx"&gt;n&lt;/span&gt; &lt;span class="nx"&gt;gloo&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="nx"&gt;system&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="nx"&gt;o&lt;/span&gt; &lt;span class="nx"&gt;jsonpath&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;{.status.loadBalancer.ingress[0].ip}&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nx"&gt;echo&lt;/span&gt; &lt;span class="nx"&gt;$GATEWAY_IP&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ol start="6"&gt;
&lt;li&gt;Open MCP Inspector to connect to the MCP Server.
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight jsx"&gt;&lt;code&gt;&lt;span class="nx"&gt;npx&lt;/span&gt; &lt;span class="nx"&gt;modelcontextprotocol&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="nx"&gt;inspector&lt;/span&gt;&lt;span class="err"&gt;#&lt;/span&gt;&lt;span class="mf"&gt;0.16&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The URL to put into MCP Inspector is &lt;code&gt;http://YOUR_ALB_LB_IP:8080/mcp&lt;/code&gt; or, if you’re running locally, &lt;code&gt;http://localhost:8080/mcp&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvc0yf8ylrnhjeny31q8z.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvc0yf8ylrnhjeny31q8z.png" alt=" " width="800" height="351"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;You’re now connected to the MCP Server via Inspector (the MCP client), but as you can see, it’s fully open. There’s no security at all. In the next section, that’ll be fixed.&lt;/p&gt;

&lt;h2&gt;
  
  
  Secure MCP Server Auth
&lt;/h2&gt;

&lt;p&gt;With a properly deployed MCP Server and agentgateway in front of it, let’s begin the journey of securing MCP Server connectivity. The first step is to enable token-based authentication. In this case, you’ll use a JSON Web Token (JWT).&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Add in a traffic policy for auth based on a JWT token.
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight jsx"&gt;&lt;code&gt;&lt;span class="nx"&gt;kubectl&lt;/span&gt; &lt;span class="nx"&gt;apply&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="nx"&gt;f&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="err"&gt;&amp;lt;&lt;/span&gt;&lt;span class="na"&gt;EOF&lt;/span&gt;
&lt;span class="na"&gt;apiVersion&lt;/span&gt;&lt;span class="err"&gt;:&lt;/span&gt; &lt;span class="na"&gt;gloo&lt;/span&gt;&lt;span class="err"&gt;.&lt;/span&gt;&lt;span class="na"&gt;solo&lt;/span&gt;&lt;span class="err"&gt;.&lt;/span&gt;&lt;span class="na"&gt;io&lt;/span&gt;&lt;span class="err"&gt;/&lt;/span&gt;&lt;span class="na"&gt;v1alpha1&lt;/span&gt;
&lt;span class="na"&gt;kind&lt;/span&gt;&lt;span class="err"&gt;:&lt;/span&gt; &lt;span class="na"&gt;GlooTrafficPolicy&lt;/span&gt;
&lt;span class="na"&gt;metadata&lt;/span&gt;&lt;span class="err"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="err"&gt;:&lt;/span&gt; &lt;span class="na"&gt;jwt&lt;/span&gt;
  &lt;span class="na"&gt;namespace&lt;/span&gt;&lt;span class="err"&gt;:&lt;/span&gt; &lt;span class="na"&gt;gloo-system&lt;/span&gt;
&lt;span class="na"&gt;spec&lt;/span&gt;&lt;span class="err"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;targetRefs&lt;/span&gt;&lt;span class="err"&gt;:&lt;/span&gt;
    &lt;span class="err"&gt;-&lt;/span&gt; &lt;span class="na"&gt;group&lt;/span&gt;&lt;span class="err"&gt;:&lt;/span&gt; &lt;span class="na"&gt;gateway&lt;/span&gt;&lt;span class="err"&gt;.&lt;/span&gt;&lt;span class="na"&gt;networking&lt;/span&gt;&lt;span class="err"&gt;.&lt;/span&gt;&lt;span class="na"&gt;k8s&lt;/span&gt;&lt;span class="err"&gt;.&lt;/span&gt;&lt;span class="na"&gt;io&lt;/span&gt;
      &lt;span class="na"&gt;kind&lt;/span&gt;&lt;span class="err"&gt;:&lt;/span&gt; &lt;span class="na"&gt;Gateway&lt;/span&gt;
      &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="err"&gt;:&lt;/span&gt; &lt;span class="na"&gt;agentgateway&lt;/span&gt;
  &lt;span class="na"&gt;glooJWT&lt;/span&gt;&lt;span class="err"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;beforeExtAuth&lt;/span&gt;&lt;span class="err"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;providers&lt;/span&gt;&lt;span class="err"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;selfminted&lt;/span&gt;&lt;span class="err"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;issuer&lt;/span&gt;&lt;span class="err"&gt;:&lt;/span&gt; &lt;span class="na"&gt;solo&lt;/span&gt;&lt;span class="err"&gt;.&lt;/span&gt;&lt;span class="na"&gt;io&lt;/span&gt;
          &lt;span class="na"&gt;jwks&lt;/span&gt;&lt;span class="err"&gt;:&lt;/span&gt;
            &lt;span class="na"&gt;local&lt;/span&gt;&lt;span class="err"&gt;:&lt;/span&gt;
              &lt;span class="na"&gt;key&lt;/span&gt;&lt;span class="err"&gt;:&lt;/span&gt; &lt;span class="s"&gt;'{"keys":[{"kty":"RSA","kid":"solo-public-key-001","use":"sig","alg":"RS256","n":"AOfIaJMUm7564sWWNHaXt_hS8H0O1Ew59-nRqruMQosfQqa7tWne5lL3m9sMAkfa3Twx0LMN_7QqRDoztvV3Wa_JwbMzb9afWE-IfKIuDqkvog6s-xGIFNhtDGBTuL8YAQYtwCF7l49SMv-GqyLe-nO9yJW-6wIGoOqImZrCxjxXFzF6mTMOBpIODFj0LUZ54QQuDcD1Nue2LMLsUvGa7V1ZHsYuGvUqzvXFBXMmMS2OzGir9ckpUhrUeHDCGFpEM4IQnu-9U8TbAJxKE5Zp8Nikefr2ISIG2Hk1K2rBAc_HwoPeWAcAWUAR5tWHAxx-UXClSZQ9TMFK850gQGenUp8","e":"AQAB"}]}'&lt;/span&gt;
&lt;span class="na"&gt;EOF&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ol start="2"&gt;
&lt;li&gt;Save the following token, which you’ll use to authenticate via MCP Inspector.
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight jsx"&gt;&lt;code&gt;&lt;span class="nx"&gt;eyJhbGciOiJSUzI1NiIsInR5cCI6IkpXVCIsImtpZCI6InNvbG8tcHVibGljLWtleS0wMDEifQ&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;eyJpc3MiOiJzb2xvLmlvIiwib3JnIjoic29sby5pbyIsInN1YiI6ImJvYiIsInRlYW0iOiJvcHMiLCJleHAiOjIwNzQyNzQ5NTQsImxsbXMiOnsibWlzdHJhbGFpIjpbIm1pc3RyYWwtbGFyZ2UtbGF0ZXN0Il19fQ&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;GF_uyLpZSTT1DIvJeO_eish1WDjMaS4BQSifGQhqPRLjzu3nXtPkaBRjceAmJi9gKZYAzkT25MIrT42ZIe3bHilrd1yqittTPWrrM4sWDDeldnGsfU07DWJHyboNapYR&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="nx"&gt;KZGImSmOYshJlzm1tT_Bjt3&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="nx"&gt;RK3OBzYi90_wl0dyAl9D7wwDCzOD4MRGFpoMrws_OgVrcZQKcadvIsH8figPwN4mK1U_1mxuL08RWTu92xBcezEO4CdBaFTUbkYN66Y2vKSTyPCxg3fLtg1mvlzU1&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="nx"&gt;Wgm2xZIiPiarQHt6Uq7v9ftgzwdUBQM1AYLvUVhCN6XkkR9OU3p0OXiqEDjAxcg&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
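
&lt;p&gt;Before pasting the token anywhere, it can help to see what it claims. The sketch below uses only the Python standard library to decode the header and payload segments of the token above (the signature segment is left out because decoding does not need it). It only reads the claims and does not verify the signature; verification remains the Gateway’s job, using the JWKS from the policy. The function name is hypothetical.&lt;/p&gt;

```python
import base64
import json

# Header and payload segments copied from the token above (signature omitted).
HEADER = "eyJhbGciOiJSUzI1NiIsInR5cCI6IkpXVCIsImtpZCI6InNvbG8tcHVibGljLWtleS0wMDEifQ"
PAYLOAD = (
    "eyJpc3MiOiJzb2xvLmlvIiwib3JnIjoic29sby5pbyIsInN1YiI6ImJvYiIsInRlYW0iOiJvcHMi"
    "LCJleHAiOjIwNzQyNzQ5NTQsImxsbXMiOnsibWlzdHJhbGFpIjpbIm1pc3RyYWwtbGFyZ2UtbGF0"
    "ZXN0Il19fQ"
)


def decode_segment(segment):
    """Base64url-decode one JWT segment and parse it as JSON."""
    padded = segment + "=" * (-len(segment) % 4)  # restore stripped padding
    return json.loads(base64.urlsafe_b64decode(padded))


header = decode_segment(HEADER)
claims = decode_segment(PAYLOAD)

print(header["kid"])  # key ID, to compare with the kid in the policy's JWKS
print(claims["iss"])  # issuer, to compare with the policy's issuer field
print(claims["sub"])  # the user the token was minted for
```

&lt;p&gt;The &lt;code&gt;iss&lt;/code&gt; claim decodes to &lt;code&gt;solo.io&lt;/code&gt; and the &lt;code&gt;kid&lt;/code&gt; to &lt;code&gt;solo-public-key-001&lt;/code&gt;, matching the &lt;code&gt;issuer&lt;/code&gt; and key ID in the policy from step 1, and the &lt;code&gt;sub&lt;/code&gt; claim is &lt;code&gt;bob&lt;/code&gt;.&lt;/p&gt;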



&lt;ol start="3"&gt;
&lt;li&gt;Try to reconnect to the MCP Server and you’ll see an error similar to the one below:
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight jsx"&gt;&lt;code&gt;&lt;span class="nx"&gt;Connection&lt;/span&gt; &lt;span class="nb"&gt;Error&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="nx"&gt;Check&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nx"&gt;your&lt;/span&gt; &lt;span class="nx"&gt;MCP&lt;/span&gt; &lt;span class="nx"&gt;server&lt;/span&gt; &lt;span class="nx"&gt;is&lt;/span&gt; &lt;span class="nx"&gt;running&lt;/span&gt; &lt;span class="nx"&gt;and&lt;/span&gt; &lt;span class="nx"&gt;proxy&lt;/span&gt; &lt;span class="nx"&gt;token&lt;/span&gt; &lt;span class="nx"&gt;is&lt;/span&gt; &lt;span class="nx"&gt;correct&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ol start="4"&gt;
&lt;li&gt;Within MCP Inspector, click on &lt;strong&gt;Authentication&lt;/strong&gt; and add in the following:
&lt;ul&gt;
&lt;li&gt;Header Name: &lt;strong&gt;Authorization&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;Bearer Token: Bob’s token from step 2&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffus385h83g5xo9n07hfg.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffus385h83g5xo9n07hfg.png" alt=" " width="384" height="644"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;You should now be able to connect to the MCP Server successfully.&lt;/p&gt;

&lt;p&gt;With proper auth set up, you now know that not just anyone can use your agentgateway to connect to an MCP Server. This lets you trust that the traffic you’re observing is authenticated, in contrast to the “anyone can do whatever they want” nature of MCP Servers without agentgateway in place.&lt;/p&gt;
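
&lt;p&gt;For clients other than MCP Inspector, the same settings presumably translate into an &lt;code&gt;Authorization: Bearer&lt;/code&gt; header on each request. The following Python sketch builds (but does not send) an MCP &lt;code&gt;initialize&lt;/code&gt; call carrying that header. The URL, the token placeholder, the client name, and the function name are all assumptions for illustration, and the protocol version shown is one published revision of the MCP specification, not necessarily the one your server negotiates.&lt;/p&gt;

```python
import json
import urllib.request

GATEWAY_URL = "http://localhost:8080/mcp"  # placeholder; use your Gateway address
TOKEN = "PASTE_TOKEN_FROM_STEP_2"          # placeholder, not a real token


def build_initialize_request(url, token):
    """Build (but do not send) an MCP initialize call with a bearer token."""
    body = {
        "jsonrpc": "2.0",
        "id": 1,
        "method": "initialize",
        "params": {
            "protocolVersion": "2025-06-18",  # assumed spec revision
            "capabilities": {},
            "clientInfo": {"name": "sketch-client", "version": "0.0.1"},
        },
    }
    headers = {
        "Content-Type": "application/json",
        "Accept": "application/json, text/event-stream",
        # Same header name that was entered in MCP Inspector.
        "Authorization": "Bearer " + token,
    }
    return urllib.request.Request(
        url, data=json.dumps(body).encode("utf-8"), headers=headers, method="POST"
    )


request = build_initialize_request(GATEWAY_URL, TOKEN)
print(request.get_method(), request.full_url)
print(request.get_header("Authorization"))
# To actually send it against a running Gateway: urllib.request.urlopen(request)
```

&lt;p&gt;Without the header, the JWT policy should reject the request, which is what the connection error in step 3 reflects.&lt;/p&gt;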

&lt;h2&gt;
  
  
  Locking Down MCP Tool Lists
&lt;/h2&gt;

&lt;p&gt;The final step is to specify which MCP Tools are available. One of the main issues for organizations is that they want to use Tools within an MCP Server, but not all Tools. For example, maybe a person or an AI Agent connecting to an MCP Server only needs the ability to view/list/get resources (like a read-only Agent), but with what MCP provides out of the box, that’s not doable.&lt;/p&gt;

&lt;p&gt;However, with traffic policies via agentgateway, it is.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Create a policy that specifies no tools available for use. This will help in testing the ability to lock down tools via the Gateway.
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight jsx"&gt;&lt;code&gt;&lt;span class="nx"&gt;kubectl&lt;/span&gt; &lt;span class="nx"&gt;apply&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="nx"&gt;f&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="err"&gt;&amp;lt;&lt;/span&gt;&lt;span class="na"&gt;EOF&lt;/span&gt;
&lt;span class="na"&gt;apiVersion&lt;/span&gt;&lt;span class="err"&gt;:&lt;/span&gt; &lt;span class="na"&gt;gateway&lt;/span&gt;&lt;span class="err"&gt;.&lt;/span&gt;&lt;span class="na"&gt;kgateway&lt;/span&gt;&lt;span class="err"&gt;.&lt;/span&gt;&lt;span class="na"&gt;dev&lt;/span&gt;&lt;span class="err"&gt;/&lt;/span&gt;&lt;span class="na"&gt;v1alpha1&lt;/span&gt;
&lt;span class="na"&gt;kind&lt;/span&gt;&lt;span class="err"&gt;:&lt;/span&gt; &lt;span class="na"&gt;TrafficPolicy&lt;/span&gt;
&lt;span class="na"&gt;metadata&lt;/span&gt;&lt;span class="err"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="err"&gt;:&lt;/span&gt; &lt;span class="na"&gt;jwt-rbac&lt;/span&gt;
  &lt;span class="na"&gt;namespace&lt;/span&gt;&lt;span class="err"&gt;:&lt;/span&gt; &lt;span class="na"&gt;gloo-system&lt;/span&gt;
&lt;span class="na"&gt;spec&lt;/span&gt;&lt;span class="err"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;targetRefs&lt;/span&gt;&lt;span class="err"&gt;:&lt;/span&gt;
    &lt;span class="err"&gt;-&lt;/span&gt; &lt;span class="na"&gt;group&lt;/span&gt;&lt;span class="err"&gt;:&lt;/span&gt; &lt;span class="na"&gt;gateway&lt;/span&gt;&lt;span class="err"&gt;.&lt;/span&gt;&lt;span class="na"&gt;kgateway&lt;/span&gt;&lt;span class="err"&gt;.&lt;/span&gt;&lt;span class="na"&gt;dev&lt;/span&gt;
      &lt;span class="na"&gt;kind&lt;/span&gt;&lt;span class="err"&gt;:&lt;/span&gt; &lt;span class="na"&gt;Backend&lt;/span&gt;
      &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="err"&gt;:&lt;/span&gt; &lt;span class="na"&gt;mcp-backend&lt;/span&gt;
  &lt;span class="na"&gt;rbac&lt;/span&gt;&lt;span class="err"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;policy&lt;/span&gt;&lt;span class="err"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;matchExpressions&lt;/span&gt;&lt;span class="err"&gt;:&lt;/span&gt;
        &lt;span class="err"&gt;-&lt;/span&gt; &lt;span class="s"&gt;'mcp.tool.name == ""'&lt;/span&gt;
&lt;span class="na"&gt;EOF&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
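
&lt;p&gt;Conceptually, the policy above acts as an allowlist on tool names. The Python sketch below is a simplified model of that idea, not agentgateway’s actual implementation: it treats the match expression as a set of permitted names and assumes that hidden tools are also rejected at call time. The function names and the &lt;code&gt;delete_page&lt;/code&gt; tool are made up for illustration.&lt;/p&gt;

```python
def visible_tools(advertised, allowed_names):
    """Return only the advertised tools whose name is explicitly allowed."""
    return [name for name in advertised if name in allowed_names]


def call_tool(name, advertised, allowed_names):
    """Reject calls to tools hidden by the policy (assumed behavior)."""
    if name not in visible_tools(advertised, allowed_names):
        raise PermissionError("tool not permitted: " + name)
    return "called " + name


# "fetch" is the tool used in this post; "delete_page" is a made-up extra tool.
advertised = ["fetch", "delete_page"]

print(visible_tools(advertised, {""}))       # [] because no tool has an empty name
print(visible_tools(advertised, {"fetch"}))  # ['fetch']
```

&lt;p&gt;Matching on an empty name is a convenient way to prove the lock works, since no real tool can satisfy it.&lt;/p&gt;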



&lt;ol start="2"&gt;
&lt;li&gt;Disconnect and reconnect via the MCP Inspector and you should see no tools available.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxf7ay60kp3tgiijb26j1.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxf7ay60kp3tgiijb26j1.png" alt=" " width="800" height="457"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ol start="3"&gt;
&lt;li&gt;Update the policy to include the &lt;strong&gt;fetch&lt;/strong&gt; tool.
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight jsx"&gt;&lt;code&gt;&lt;span class="nx"&gt;kubectl&lt;/span&gt; &lt;span class="nx"&gt;apply&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="nx"&gt;f&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="err"&gt;&amp;lt;&lt;/span&gt;&lt;span class="na"&gt;EOF&lt;/span&gt;
&lt;span class="na"&gt;apiVersion&lt;/span&gt;&lt;span class="err"&gt;:&lt;/span&gt; &lt;span class="na"&gt;gateway&lt;/span&gt;&lt;span class="err"&gt;.&lt;/span&gt;&lt;span class="na"&gt;kgateway&lt;/span&gt;&lt;span class="err"&gt;.&lt;/span&gt;&lt;span class="na"&gt;dev&lt;/span&gt;&lt;span class="err"&gt;/&lt;/span&gt;&lt;span class="na"&gt;v1alpha1&lt;/span&gt;
&lt;span class="na"&gt;kind&lt;/span&gt;&lt;span class="err"&gt;:&lt;/span&gt; &lt;span class="na"&gt;TrafficPolicy&lt;/span&gt;
&lt;span class="na"&gt;metadata&lt;/span&gt;&lt;span class="err"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="err"&gt;:&lt;/span&gt; &lt;span class="na"&gt;jwt-rbac&lt;/span&gt;
  &lt;span class="na"&gt;namespace&lt;/span&gt;&lt;span class="err"&gt;:&lt;/span&gt; &lt;span class="na"&gt;gloo-system&lt;/span&gt;
&lt;span class="na"&gt;spec&lt;/span&gt;&lt;span class="err"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;targetRefs&lt;/span&gt;&lt;span class="err"&gt;:&lt;/span&gt;
    &lt;span class="err"&gt;-&lt;/span&gt; &lt;span class="na"&gt;group&lt;/span&gt;&lt;span class="err"&gt;:&lt;/span&gt; &lt;span class="na"&gt;gateway&lt;/span&gt;&lt;span class="err"&gt;.&lt;/span&gt;&lt;span class="na"&gt;kgateway&lt;/span&gt;&lt;span class="err"&gt;.&lt;/span&gt;&lt;span class="na"&gt;dev&lt;/span&gt;
      &lt;span class="na"&gt;kind&lt;/span&gt;&lt;span class="err"&gt;:&lt;/span&gt; &lt;span class="na"&gt;Backend&lt;/span&gt;
      &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="err"&gt;:&lt;/span&gt; &lt;span class="na"&gt;mcp-backend&lt;/span&gt;
  &lt;span class="na"&gt;rbac&lt;/span&gt;&lt;span class="err"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;policy&lt;/span&gt;&lt;span class="err"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;matchExpressions&lt;/span&gt;&lt;span class="err"&gt;:&lt;/span&gt;
        &lt;span class="err"&gt;-&lt;/span&gt; &lt;span class="s"&gt;'mcp.tool.name == "fetch"'&lt;/span&gt;
&lt;span class="na"&gt;EOF&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ol&gt;
&lt;li&gt;Reconnect to the MCP Server via Inspector and you’ll now see the tool available.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1bjmw9ea661dyxwo1hb8.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1bjmw9ea661dyxwo1hb8.png" alt=" " width="800" height="351"&gt;&lt;/a&gt;&lt;/p&gt;
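
&lt;p&gt;To make the effect of the two policies concrete, here is a small Python sketch that mimics the allow-list behavior you saw in the Inspector. It is only an illustration: the real policy is evaluated by the gateway, not by code like this. The &lt;code&gt;visible_tools&lt;/code&gt; helper and every tool name other than &lt;code&gt;fetch&lt;/code&gt; are hypothetical.&lt;/p&gt;

```python
# Illustration only: mimics an allow-list on tool names.
# The gateway evaluates the real policy; this helper is hypothetical.

def visible_tools(server_tools, allowed_name):
    """Return the tools whose name equals the allowed name."""
    return [tool for tool in server_tools if tool == allowed_name]


# Hypothetical tool list exposed by an MCP Server.
tools = ["fetch", "search", "summarize"]

# First policy: mcp.tool.name == "" matches no real tool name.
print(visible_tools(tools, ""))

# Second policy: mcp.tool.name == "fetch" exposes only that tool.
print(visible_tools(tools, "fetch"))
```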

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;The security concerns around MCP Servers are a reminder of a very important aspect of cybersecurity: it’s not about trying to block all bad actors, it’s about mitigating as much risk as possible. That should be the goal for every organization, and with these implementations in place, your MCP security posture should be in a much better place.&lt;/p&gt;

</description>
      <category>programming</category>
      <category>ai</category>
      <category>kubernetes</category>
      <category>docker</category>
    </item>
    <item>
      <title>FinOps For Agentic: How To Capture Token Usage Cost Across LLMs</title>
      <dc:creator>Michael Levan</dc:creator>
      <pubDate>Tue, 25 Nov 2025 15:25:48 +0000</pubDate>
      <link>https://dev.to/thenjdevopsguy/finops-for-agentic-how-to-capture-token-usage-cost-across-llms-5fco</link>
      <guid>https://dev.to/thenjdevopsguy/finops-for-agentic-how-to-capture-token-usage-cost-across-llms-5fco</guid>
      <description>&lt;p&gt;There's one major topic that every organization is talking about right now when it comes to Agentic workloads:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;How am I going to track cost?&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Tracking cost comes down to Agentic traffic, LLM traffic, and overall Token usage. The problem is that right now, it's scattered. Everyone is using an Agent and that Agent is tied to an API key, not a Gateway or a user. There's no way to track the cost.&lt;/p&gt;

&lt;p&gt;Until now.&lt;/p&gt;

&lt;p&gt;In this blog post, you'll learn how to track all AI/LLM/Token usage/cost across all Gateways at once with agentgateway.&lt;/p&gt;

&lt;h2&gt;
  
  
  Prerequisites
&lt;/h2&gt;

&lt;p&gt;To follow along with this blog post from a hands-on perspective, you'll want to have the following:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;A Kubernetes cluster&lt;/li&gt;
&lt;li&gt;An Anthropic API key&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;If you don't have these readily available, you can follow along from a theoretical perspective. If you have another AI provider like OpenAI and not Anthropic, you can swap out the Model/LLM calls in the &lt;code&gt;Backend&lt;/code&gt; object for whichever Model you want to use.&lt;/p&gt;

&lt;h2&gt;
  
  
  Gateway Installation
&lt;/h2&gt;

&lt;p&gt;In this section, you will use two open-source tools - kgateway as the control plane and agentgateway as the data plane/proxy.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Install the Kubernetes Gateway API CRDs.
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl apply -f https://github.com/kubernetes-sigs/gateway-api/releases/download/v1.4.0/standard-install.yaml
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ol&gt;
&lt;li&gt;Install the CRDs for kgateway.
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;helm upgrade -i --create-namespace --namespace kgateway-system --version v2.2.0-main \
kgateway-crds oci://cr.kgateway.dev/kgateway-dev/charts/kgateway-crds \
--set controller.image.pullPolicy=Always
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ol&gt;
&lt;li&gt;Install kgateway with agentgateway enabled.
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;helm upgrade -i --namespace kgateway-system --version v2.2.0-main kgateway oci://cr.kgateway.dev/kgateway-dev/charts/kgateway \
  --set gateway.aiExtension.enabled=true \
  --set agentgateway.enabled=true  \
  --set controller.image.pullPolicy=Always
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ol&gt;
&lt;li&gt;Confirm that the kgateway control plane is operational.
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl get pods -n kgateway-system
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Grafana Configuration
&lt;/h2&gt;

&lt;p&gt;For the purposes of this blog post, you'll use the kube-prometheus-stack Helm chart to set up metric collection and a Grafana dashboard that shows AI/LLM/Token usage and cost. To accomplish that, you'll install kube-prometheus-stack and configure ServiceMonitors/PrometheusRules.&lt;/p&gt;

&lt;h3&gt;
  
  
  Install Kube-Prometheus
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;Install kube-prometheus.
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo update

helm upgrade --install kube-prometheus-stack prometheus-community/kube-prometheus-stack \
  --namespace monitoring \
  --create-namespace \
  --set prometheus.prometheusSpec.serviceMonitorSelectorNilUsesHelmValues=false \
  --set prometheus.prometheusSpec.podMonitorSelectorNilUsesHelmValues=false \
  --set prometheus.prometheusSpec.ruleSelectorNilUsesHelmValues=false \
  --set prometheus.prometheusSpec.retention=30d \
  --set prometheus.prometheusSpec.storageSpec.volumeClaimTemplate.spec.resources.requests.storage=50Gi
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ol&gt;
&lt;li&gt;Access the Grafana dashboard.
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl port-forward -n monitoring svc/kube-prometheus-stack-grafana 3000:80
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Username: admin&lt;/p&gt;

&lt;p&gt;Password retrieval: &lt;code&gt;kubectl get secret -n monitoring kube-prometheus-stack-grafana -o jsonpath='{.data.admin-password}' | base64 -d &amp;amp;&amp;amp; echo&lt;/code&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Configure Monitors
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;Go to this &lt;a href="https://github.com/AdminTurnedDevOps/agentic-demo-repo/blob/main/agentgateway-oss-k8s/cost/cost-across-dataplanes/monitoring.yaml" rel="noopener noreferrer"&gt;repo&lt;/a&gt;, copy the &lt;code&gt;monitoring.yaml&lt;/code&gt; config, save it, and apply it.
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl apply -f monitoring.yaml
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Configure The Dashboard
&lt;/h3&gt;

&lt;p&gt;The last step is to configure a proper dashboard so you can see all of the Token/LLM/AI costs via the Gateways.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Open Grafana&lt;/li&gt;
&lt;li&gt;Go to &lt;strong&gt;Dashboards &amp;gt; Import&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Upload &lt;code&gt;grafana-dashboard.json&lt;/code&gt; or paste its contents. You can find the dashboard &lt;a href="https://github.com/AdminTurnedDevOps/agentic-demo-repo/blob/main/agentgateway-oss-k8s/cost/cost-across-dataplanes/grafana-dashboard.json" rel="noopener noreferrer"&gt;here&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;Select "Prometheus" as the data source and click &lt;strong&gt;Import&lt;/strong&gt;
&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  Create Gateways
&lt;/h2&gt;

&lt;p&gt;Now that kgateway is installed, you can create Gateways with backends to LLMs. In this section, you'll create 3 very similar Gateways. The reason you'll create 3 is to showcase how you can collect LLM/AI/Token costs across multiple Gateways at once instead of having to collect the cost for each Gateway one at a time, which saves engineering/FinOps cycles.&lt;/p&gt;

&lt;h3&gt;
  
  
  Configure Gateways
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;Set your Anthropic API key as an environment variable.
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;export ANTHROPIC_API_KEY=
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ol&gt;
&lt;li&gt;Create a Secret for your API key.
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl apply -f- &amp;lt;&amp;lt;EOF
apiVersion: v1
kind: Secret
metadata:
  name: anthropic-secret
  namespace: kgateway-system
  labels:
    app: agentgateway
type: Opaque
stringData:
  Authorization: $ANTHROPIC_API_KEY
EOF
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ol&gt;
&lt;li&gt;Create Gateway/Route/Backend number 1.
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl apply -f- &amp;lt;&amp;lt;EOF
kind: Gateway
apiVersion: gateway.networking.k8s.io/v1
metadata:
  name: agentgateway1
  namespace: kgateway-system
  labels:
    app: agentgateway1
spec:
  gatewayClassName: agentgateway
  listeners:
  - protocol: HTTP
    port: 8080
    name: http
    allowedRoutes:
      namespaces:
        from: All
---
apiVersion: gateway.kgateway.dev/v1alpha1
kind: Backend
metadata:
  labels:
    app: agentgateway1
  name: anthropic1
  namespace: kgateway-system
spec:
  type: AI
  ai:
    llm:
        anthropic:
          authToken:
            kind: SecretRef
            secretRef:
              name: anthropic-secret
          model: "claude-3-5-haiku-latest"
---
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: claude1
  namespace: kgateway-system
  labels:
    app: agentgateway1
spec:
  parentRefs:
    - name: agentgateway1
      namespace: kgateway-system
  rules:
  - matches:
    - path:
        type: PathPrefix
        value: /anthropic
    filters:
    - type: URLRewrite
      urlRewrite:
        path:
          type: ReplaceFullPath
          replaceFullPath: /v1/chat/completions
    backendRefs:
    - name: anthropic1
      namespace: kgateway-system
      group: gateway.kgateway.dev
      kind: Backend
EOF
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ol&gt;
&lt;li&gt;Create Gateway/Route/Backend number 2.
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl apply -f- &amp;lt;&amp;lt;EOF
kind: Gateway
apiVersion: gateway.networking.k8s.io/v1
metadata:
  name: agentgateway2
  namespace: kgateway-system
  labels:
    app: agentgateway2
spec:
  gatewayClassName: agentgateway
  listeners:
  - protocol: HTTP
    port: 8080
    name: http
    allowedRoutes:
      namespaces:
        from: All
---
apiVersion: gateway.kgateway.dev/v1alpha1
kind: Backend
metadata:
  labels:
    app: agentgateway2
  name: anthropic2
  namespace: kgateway-system
spec:
  type: AI
  ai:
    llm:
        anthropic:
          authToken:
            kind: SecretRef
            secretRef:
              name: anthropic-secret
          model: "claude-3-5-haiku-latest"
---
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: claude2
  namespace: kgateway-system
  labels:
    app: agentgateway2
spec:
  parentRefs:
    - name: agentgateway2
      namespace: kgateway-system
  rules:
  - matches:
    - path:
        type: PathPrefix
        value: /anthropic
    filters:
    - type: URLRewrite
      urlRewrite:
        path:
          type: ReplaceFullPath
          replaceFullPath: /v1/chat/completions
    backendRefs:
    - name: anthropic2
      namespace: kgateway-system
      group: gateway.kgateway.dev
      kind: Backend
EOF
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ol&gt;
&lt;li&gt;Create Gateway/Route/Backend number 3.
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl apply -f- &amp;lt;&amp;lt;EOF
kind: Gateway
apiVersion: gateway.networking.k8s.io/v1
metadata:
  name: agentgateway3
  namespace: kgateway-system
  labels:
    app: agentgateway3
spec:
  gatewayClassName: agentgateway
  listeners:
  - protocol: HTTP
    port: 8080
    name: http
    allowedRoutes:
      namespaces:
        from: All
---
apiVersion: gateway.kgateway.dev/v1alpha1
kind: Backend
metadata:
  labels:
    app: agentgateway3
  name: anthropic3
  namespace: kgateway-system
spec:
  type: AI
  ai:
    llm:
        anthropic:
          authToken:
            kind: SecretRef
            secretRef:
              name: anthropic-secret
          model: "claude-3-5-haiku-latest"
---
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: claude3
  namespace: kgateway-system
  labels:
    app: agentgateway3
spec:
  parentRefs:
    - name: agentgateway3
      namespace: kgateway-system
  rules:
  - matches:
    - path:
        type: PathPrefix
        value: /anthropic
    filters:
    - type: URLRewrite
      urlRewrite:
        path:
          type: ReplaceFullPath
          replaceFullPath: /v1/chat/completions
    backendRefs:
    - name: anthropic3
      namespace: kgateway-system
      group: gateway.kgateway.dev
      kind: Backend
EOF
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
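
&lt;p&gt;Since the three manifests differ only by the number in their names, you could also generate them instead of copying them by hand. The sketch below renders a trimmed-down version of the Gateway object only; the &lt;code&gt;render_gateways&lt;/code&gt; helper and the template variable are assumptions made for illustration, and the Backend and HTTPRoute objects would follow the same pattern.&lt;/p&gt;

```python
from string import Template

# Trimmed-down Gateway manifest; only the per-Gateway number changes.
GATEWAY_TEMPLATE = Template("""\
kind: Gateway
apiVersion: gateway.networking.k8s.io/v1
metadata:
  name: agentgateway$index
  namespace: kgateway-system
  labels:
    app: agentgateway$index
spec:
  gatewayClassName: agentgateway
  listeners:
  - protocol: HTTP
    port: 8080
    name: http
    allowedRoutes:
      namespaces:
        from: All
""")


def render_gateways(count):
    """Render one Gateway manifest per index as a multi-document YAML string."""
    manifests = [GATEWAY_TEMPLATE.substitute(index=i) for i in range(1, count + 1)]
    return "---\n".join(manifests)


if __name__ == "__main__":
    # The output can be piped to: kubectl apply -f -
    print(render_gateways(3))
```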



&lt;h3&gt;
  
  
  Test Gateways
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;Add the load balancer IP into an environment variable for each Gateway.
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;export INGRESS_GW_ADDRESSONE=$(kubectl get svc -n kgateway-system agentgateway1 -o jsonpath="{.status.loadBalancer.ingress[0]['hostname','ip']}")
echo $INGRESS_GW_ADDRESSONE

export INGRESS_GW_ADDRESSTWO=$(kubectl get svc -n kgateway-system agentgateway2 -o jsonpath="{.status.loadBalancer.ingress[0]['hostname','ip']}")
echo $INGRESS_GW_ADDRESSTWO

export INGRESS_GW_ADDRESSTHREE=$(kubectl get svc -n kgateway-system agentgateway3 -o jsonpath="{.status.loadBalancer.ingress[0]['hostname','ip']}")
echo $INGRESS_GW_ADDRESSTHREE
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ol&gt;
&lt;li&gt;Test each Gateway.
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;curl "$INGRESS_GW_ADDRESSONE:8080/anthropic" -v \ -H content-type:application/json -H x-api-key:$ANTHROPIC_API_KEY -H "anthropic-version: 2023-06-01" -d '{
  "model": "claude-sonnet-4-5",
  "messages": [
    {
      "role": "system",
      "content": "You are a skilled cloud-native network engineer."
    },
    {
      "role": "user",
      "content": "Write me a paragraph containing the best way to think about Istio Ambient Mesh"
    }
  ]
}' | jq
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;curl "$INGRESS_GW_ADDRESSTWO:8080/anthropic" -v \ -H content-type:application/json -H x-api-key:$ANTHROPIC_API_KEY -H "anthropic-version: 2023-06-01" -d '{
  "model": "claude-sonnet-4-5",
  "messages": [
    {
      "role": "system",
      "content": "You are a skilled cloud-native network engineer."
    },
    {
      "role": "user",
      "content": "Write me a paragraph containing the best way to think about Istio Ambient Mesh"
    }
  ]
}' | jq
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;curl "$INGRESS_GW_ADDRESSTHREE:8080/anthropic" -v \ -H content-type:application/json -H x-api-key:$ANTHROPIC_API_KEY -H "anthropic-version: 2023-06-01" -d '{
  "model": "claude-sonnet-4-5",
  "messages": [
    {
      "role": "system",
      "content": "You are a skilled cloud-native network engineer."
    },
    {
      "role": "user",
      "content": "Write me a paragraph containing the best way to think about Istio Ambient Mesh"
    }
  ]
}' | jq
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
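
&lt;p&gt;If you'd rather script the three test calls, the sketch below builds the same request with Python's standard library. The &lt;code&gt;build_request&lt;/code&gt; and &lt;code&gt;send&lt;/code&gt; helpers are hypothetical names, and the sketch assumes the Gateways are reachable over plain HTTP on port 8080 as in the curl commands.&lt;/p&gt;

```python
import json
import urllib.request


def build_request(address, api_key):
    """Build the same POST request the curl commands send to one Gateway."""
    payload = {
        "model": "claude-sonnet-4-5",
        "messages": [
            {"role": "system",
             "content": "You are a skilled cloud-native network engineer."},
            {"role": "user",
             "content": "Write me a paragraph containing the best way "
                        "to think about Istio Ambient Mesh"},
        ],
    }
    return urllib.request.Request(
        url=f"http://{address}:8080/anthropic",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "content-type": "application/json",
            "x-api-key": api_key,
            "anthropic-version": "2023-06-01",
        },
        method="POST",
    )


def send(request):
    """Send the request and return the decoded JSON body (needs a reachable Gateway)."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

&lt;p&gt;With the environment variables from the previous step exported, a call such as &lt;code&gt;send(build_request(os.environ["INGRESS_GW_ADDRESSONE"], os.environ["ANTHROPIC_API_KEY"]))&lt;/code&gt; (after &lt;code&gt;import os&lt;/code&gt;) would hit the first Gateway; repeat it for the other two addresses.&lt;/p&gt;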



&lt;h2&gt;
  
  
  Metrics Testing
&lt;/h2&gt;

&lt;p&gt;Now that everything is deployed, you can check the Token usage and LLM cost metrics and see them for every Gateway within the same dashboard.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Port-forward Prometheus.
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl port-forward -n monitoring svc/kube-prometheus-stack-prometheus 9090:9090
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ol&gt;
&lt;li&gt;Check Token usage.
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;curl -s 'http://localhost:9090/api/v1/query?query=agentgateway:input_tokens:total' | jq '.data.result[0].value'
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ol&gt;
&lt;li&gt;Check the overall cost.
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;curl -s 'http://localhost:9090/api/v1/query?query=agentgateway:cost_usd:total_daily' | jq '.data.result[0].value'
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You should see outputs similar to the below:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;[
  1763214193.134,
  "41.31458333333333"
]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;[
  1763214214.598,
  "0.0008821471428571428"
]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
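
&lt;p&gt;The &lt;code&gt;jq&lt;/code&gt; filter above prints only the first series in the result. If a query returns one series per Gateway, you may want to read all of them. The sketch below parses a response in the standard Prometheus instant-query shape; the &lt;code&gt;gateway&lt;/code&gt; label and the second series are made up for illustration, so check which labels your rules actually emit.&lt;/p&gt;

```python
import json

# Sample instant-query response in the Prometheus HTTP API shape.
# The metric labels and the second series are made up for illustration.
SAMPLE_RESPONSE = """
{
  "status": "success",
  "data": {
    "resultType": "vector",
    "result": [
      {"metric": {"gateway": "agentgateway1"}, "value": [1763214214.598, "0.0008821471428571428"]},
      {"metric": {"gateway": "agentgateway2"}, "value": [1763214214.598, "0.0005"]}
    ]
  }
}
"""


def series_values(response_text):
    """Return (labels, float value) for every series in an instant-query response."""
    body = json.loads(response_text)
    if body["status"] != "success":
        raise ValueError("query failed")
    return [(item["metric"], float(item["value"][1])) for item in body["data"]["result"]]


values = series_values(SAMPLE_RESPONSE)
for labels, value in values:
    print(labels, value)
print("total:", sum(value for _, value in values))
```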



&lt;ol&gt;
&lt;li&gt;Go to the Grafana Dashboard and you'll now see all of the Gateway costs per Gateway.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frqhuz263joog0xqa9syc.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frqhuz263joog0xqa9syc.png" alt=" " width="800" height="408"&gt;&lt;/a&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>mcp</category>
      <category>kubernetes</category>
      <category>programming</category>
    </item>
    <item>
      <title>Deploying Local AI Agents In Kubernetes</title>
      <dc:creator>Michael Levan</dc:creator>
      <pubDate>Sat, 08 Nov 2025 16:09:44 +0000</pubDate>
      <link>https://dev.to/thenjdevopsguy/deploying-local-ai-agents-in-kubernetes-3087</link>
      <guid>https://dev.to/thenjdevopsguy/deploying-local-ai-agents-in-kubernetes-3087</guid>
      <description>&lt;p&gt;There are two types of Models/LLMs you see in today's Agentic world:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;"SaaS-based Models", which are Models that are managed for you (Claude, Gemini, GPT, etc.)&lt;/li&gt;
&lt;li&gt;Local Models, which you manage yourself.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;From a security, governance, and overall data control perspective, some organizations want to go with local Models.&lt;/p&gt;

&lt;p&gt;In this blog post, you'll learn how to manage and deploy a local Model using Kubernetes primitives and kagent.&lt;/p&gt;

&lt;h2&gt;
  
  
  Prerequisites
&lt;/h2&gt;

&lt;p&gt;To follow along with this blog post in a hands-on fashion, you should have the following:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;A Kubernetes cluster. If you're using a local cluster, ensure that your local machine has enough CPU/memory for a more resource-intensive environment.&lt;/li&gt;
&lt;li&gt;An Anthropic API key. If you don't have one and/or prefer to use another AI provider, &lt;a href="https://kagent.dev/docs/kagent/supported-providers" rel="noopener noreferrer"&gt;there are several providers supported by kagent&lt;/a&gt;.&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  Deploying Ollama
&lt;/h2&gt;

&lt;p&gt;The first step is to deploy your local Model. In this case, you'll use Ollama, a popular tool for running Models locally, to serve the Llama 3 Model.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Create a Kubernetes Namespace for your Llama Model.
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl create ns ollama
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ol&gt;
&lt;li&gt;Deploy Ollama as a Kubernetes Deployment and attach a Service to it. Notice how a fair amount of memory is given to the Deployment. Local Models are typically slower, especially when no GPU is available, so giving the Pod more resources helps the Model respond faster.
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl apply -f - &amp;lt;&amp;lt;EOF
apiVersion: apps/v1
kind: Deployment
metadata:
  name: ollama
  namespace: ollama
spec:
  selector:
    matchLabels:
      name: ollama
  template:
    metadata:
      labels:
        name: ollama
    spec:
      initContainers:
      - name: model-puller
        image: ollama/ollama:latest
        command: ["/bin/sh", "-c"]
        args:
          - |
            ollama serve &amp;amp;
            sleep 10
            ollama pull llama3
            pkill ollama
        volumeMounts:
        - name: ollama-data
          mountPath: /root/.ollama
        resources:
          requests:
            memory: "8Gi"
          limits:
            memory: "12Gi"
      containers:
      - name: ollama
        image: ollama/ollama:latest
        ports:
        - name: http
          containerPort: 11434
          protocol: TCP
        volumeMounts:
        - name: ollama-data
          mountPath: /root/.ollama
        resources:
          requests:
            memory: "8Gi"
          limits:
            memory: "12Gi"
      volumes:
      - name: ollama-data
        emptyDir: {}
---
apiVersion: v1
kind: Service
metadata:
  name: ollama
  namespace: ollama
spec:
  type: ClusterIP
  selector:
    name: ollama
  ports:
  - port: 80
    name: http
    targetPort: http
    protocol: TCP
EOF
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Give the Pod a few minutes to get up and running, as the image is fairly large and the init container has to download the Llama Model.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Confirm that the Model was downloaded.
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl exec -n ollama deployment/ollama -- ollama list
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You should see an output similar to the one below, indicating that the Model has been downloaded successfully.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Defaulted container "ollama" out of: ollama, model-puller (init)
NAME             ID              SIZE      MODIFIED
llama3:latest    365c0bd3c000    4.7 GB    About a minute ago
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
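
&lt;p&gt;You can run the same check over HTTP, since Ollama lists its local Models on the &lt;code&gt;/api/tags&lt;/code&gt; endpoint. The sketch below is a minimal example; the helper names are hypothetical, and the sample body is trimmed to the one field used here.&lt;/p&gt;

```python
import json
import urllib.request


def model_names(tags_response):
    """Extract model names from the JSON body returned by Ollama's /api/tags endpoint."""
    return [model["name"] for model in tags_response.get("models", [])]


def fetch_tags(base_url):
    """Call /api/tags on a reachable Ollama server and return the parsed JSON."""
    with urllib.request.urlopen(f"{base_url}/api/tags") as response:
        return json.load(response)


# Sample body, trimmed to the field used here.
sample = {"models": [{"name": "llama3:latest"}]}
print(model_names(sample))
```

&lt;p&gt;To try it against the cluster, port-forward the Service with &lt;code&gt;kubectl port-forward -n ollama svc/ollama 11434:80&lt;/code&gt; and call &lt;code&gt;model_names(fetch_tags("http://localhost:11434"))&lt;/code&gt;.&lt;/p&gt;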



&lt;h2&gt;
  
  
  Deploying kagent
&lt;/h2&gt;

&lt;p&gt;Now that the Llama Model is on your Kubernetes cluster, you can deploy kagent to manage that Model and attach an Agent to it.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Install the kagent CRDs.
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;helm install kagent-crds oci://ghcr.io/kagent-dev/kagent/helm/kagent-crds \
    --namespace kagent \
    --create-namespace
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ol&gt;
&lt;li&gt;Set an environment variable for your Anthropic API key.
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;export ANTHROPIC_API_KEY=your_api_key
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;




&lt;ol&gt;
&lt;li&gt;Install kagent.
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;helm upgrade --install kagent oci://ghcr.io/kagent-dev/kagent/helm/kagent \
    --namespace kagent \
    --set providers.default=anthropic \
    --set providers.anthropic.apiKey=$ANTHROPIC_API_KEY \
    --set ui.service.type=LoadBalancer
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;




&lt;ol&gt;
&lt;li&gt;Retrieve the IP address of the kagent UI Service.
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl get svc -n kagent
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If you're running locally and don't have a way to retrieve a public IP address, you can port-forward the kagent UI service.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl port-forward svc/kagent-ui -n kagent 8080:8080
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ol&gt;
&lt;li&gt;Open the kagent UI and either go through the wizard or click the &lt;strong&gt;skip&lt;/strong&gt; button on the bottom left (going through the wizard isn't needed for the purposes of this blog post).&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;You should see a UI similar to the screenshot below.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbnnn1d4zk2om7yx1mf2q.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbnnn1d4zk2om7yx1mf2q.png" alt=" " width="800" height="517"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Create A Model Config
&lt;/h2&gt;

&lt;p&gt;With kagent installed on your Kubernetes cluster, you can manage Agents, Models, and MCP Servers in a declarative fashion. One object you can use is the &lt;code&gt;ModelConfig&lt;/code&gt;, which allows you to import a Model into kagent. In this case, you'll import the Llama Model that you deployed.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Run the following configuration (notice how it's pointing to the Ollama Kubernetes Service).
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl apply -f - &amp;lt;&amp;lt;EOF
apiVersion: kagent.dev/v1alpha2
kind: ModelConfig
metadata:
  name: llama3-model-config
  namespace: kagent
spec:
  model: llama3
  provider: Ollama
  ollama:
    host: http://ollama.ollama.svc.cluster.local:80
EOF
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
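
&lt;p&gt;The &lt;code&gt;host&lt;/code&gt; value follows the standard Kubernetes Service DNS pattern of Service name, Namespace, then &lt;code&gt;svc.cluster.local&lt;/code&gt;. As a quick sketch of how that address is put together, assuming the default &lt;code&gt;cluster.local&lt;/code&gt; cluster domain (the &lt;code&gt;service_url&lt;/code&gt; helper is a hypothetical name):&lt;/p&gt;

```python
def service_url(service, namespace, port, scheme="http"):
    """Build the cluster-internal URL for a Kubernetes Service.

    Assumes the default cluster domain, cluster.local.
    """
    return f"{scheme}://{service}.{namespace}.svc.cluster.local:{port}"


# Service "ollama" in Namespace "ollama", exposed on port 80.
print(service_url("ollama", "ollama", 80))
```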



&lt;ol&gt;
&lt;li&gt;Get the Model config to ensure that it deployed successfully.
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl get modelconfig -n kagent
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You'll see both the AI provider that you installed kagent with and Ollama.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;NAME                   PROVIDER    MODEL
default-model-config   Anthropic   claude-3-5-haiku-20241022
llama3-model-config    Ollama      llama3
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ol&gt;
&lt;li&gt;Go to the UI and click on &lt;strong&gt;View &amp;gt; Models&lt;/strong&gt;
&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F74kvf8t81ql8yf4dzn80.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F74kvf8t81ql8yf4dzn80.png" alt=" " width="330" height="185"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;You should now see the Model within kagent.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ft51weyw1usgdxdsytpgk.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ft51weyw1usgdxdsytpgk.png" alt=" " width="800" height="603"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;You'll also now see Llama as an option within kagent when you create an Agent.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fssi3uo9zdmscyzoj1adm.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fssi3uo9zdmscyzoj1adm.png" alt=" " width="695" height="713"&gt;&lt;/a&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>mcp</category>
      <category>programming</category>
      <category>kubernetes</category>
    </item>
    <item>
      <title>AWS EKS Model Context Protocol (MCP): How It Improves Kubernetes Reliability</title>
      <dc:creator>Michael Levan</dc:creator>
      <pubDate>Tue, 08 Jul 2025 12:51:32 +0000</pubDate>
      <link>https://dev.to/thenjdevopsguy/aws-eks-model-context-protocol-mcp-how-it-improves-kubernetes-reliability-5h0e</link>
      <guid>https://dev.to/thenjdevopsguy/aws-eks-model-context-protocol-mcp-how-it-improves-kubernetes-reliability-5h0e</guid>
      <description>&lt;p&gt;The closer we get to using AI in our day-to-day, the more we'll need to ensure that the data we get back from LLMs is accurate. This way, engineers and technical leadership teams can ensure the usage is worth the time and expense (the expense of using LLMs and the time for engineers to get trained up on them).&lt;/p&gt;

&lt;p&gt;The majority of organizations cannot spend millions of dollars to train a Model to help with AIOps and Kubernetes workloads, but MCP Servers can be used to ensure accurate and reliable information.&lt;/p&gt;

&lt;p&gt;In this blog post, you'll learn the key aspects of the EKS MCP Server, how to use the EKS MCP Server for your Elastic Kubernetes Service cluster, and how to interact with an MCP Server using Python.&lt;/p&gt;

&lt;h2&gt;
  
  
  Prerequisites
&lt;/h2&gt;

&lt;p&gt;If you want to follow along from a hands-on perspective, you'll need:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;An AWS account and an EKS cluster deployed. If you don't have an EKS cluster deployed, this blog post will still be very helpful as you'll have a code example for when you do have an EKS cluster deployed.&lt;/li&gt;
&lt;li&gt;Python 3 installed.&lt;/li&gt;
&lt;/ol&gt;
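
&lt;p&gt;The client later in this post launches the server with &lt;code&gt;uvx&lt;/code&gt;, so it's worth confirming that the commands you'll need are on your &lt;code&gt;PATH&lt;/code&gt; before starting. Below is a minimal sketch of such a check. The helper name &lt;code&gt;missing_executables&lt;/code&gt; is my own and not part of any library.&lt;/p&gt;

```python
import shutil


def missing_executables(names):
    """Return the names from `names` that cannot be found on PATH."""
    return [name for name in names if shutil.which(name) is None]


if __name__ == "__main__":
    # python3 and uvx are the two commands this post relies on.
    missing = missing_executables(["python3", "uvx"])
    if missing:
        print("Missing from PATH: " + ", ".join(missing))
    else:
        print("All required commands were found.")
```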

&lt;h2&gt;
  
  
  Model Context Protocol (MCP) Recap
&lt;/h2&gt;

&lt;p&gt;When thinking about what MCP is, think "An API call to a collection of needed data". As the name suggests, it could be an actual server/VM that's hosting the data. Programmatically, it's a client/server architecture where the "server" in the case of MCP can be something like a PyPI package that you can "download" and call upon. You'd then use a client (e.g., a script) to access the tools within the MCP Server.&lt;/p&gt;
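
&lt;p&gt;To make the client/server idea concrete: MCP messages are JSON-RPC 2.0, and a tool invocation is a request with the method &lt;code&gt;tools/call&lt;/code&gt;. The sketch below builds such a message by hand using only the standard library. The helper name &lt;code&gt;build_tool_call&lt;/code&gt; and the cluster name &lt;code&gt;my-cluster&lt;/code&gt; are placeholders of mine, and in practice a client library builds and sends these messages for you.&lt;/p&gt;

```python
import json


def build_tool_call(request_id, tool_name, arguments):
    """Build a JSON-RPC 2.0 request that asks an MCP server to run a tool."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


if __name__ == "__main__":
    # list_k8s_resources is one of the EKS MCP Server tools covered later;
    # "my-cluster" is a placeholder cluster name.
    message = build_tool_call(
        1,
        "list_k8s_resources",
        {"cluster_name": "my-cluster", "kind": "Pod", "api_version": "v1"},
    )
    print(json.dumps(message, indent=2))
```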

&lt;p&gt;The overall goal of MCP is simple: accurate information. Sometimes, LLMs don't give you accurate information, and because everyone wants output that's as close to "accurate" as an LLM can get, an MCP Server can be used to supply accurate information for the job you're trying to do. That's why MCP Servers are popping up for everything from Kubernetes to documentation to CI/CD.&lt;/p&gt;

&lt;p&gt;💡I see MCP Servers as an almost "universal RAG". A RAG built into an application calls upon accurate information (blog links, articles, etc.), and an MCP Server does the same thing. The only difference is that it isn't tied to a particular application like a RAG is.&lt;/p&gt;

&lt;h2&gt;
  
  
  The AWS EKS MCP Server
&lt;/h2&gt;

&lt;p&gt;The EKS MCP Server, as the name suggests, is an MCP Server that's designed for EKS.&lt;/p&gt;

&lt;p&gt;It has the following tools/functionality:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;list_k8s_resources&lt;/code&gt;: List k8s resources.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;list_api_versions&lt;/code&gt;: List k8s API versions.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;manage_k8s_resource&lt;/code&gt;: CRUD operations for k8s resources.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;apply_yaml&lt;/code&gt;: Like &lt;code&gt;kubectl apply -f&lt;/code&gt;.
&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;get_k8s_events&lt;/code&gt;: Like &lt;code&gt;kubectl get events&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;get_pod_logs&lt;/code&gt;: Like &lt;code&gt;kubectl logs&lt;/code&gt;.
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;💡Here's the part that still has me scratching my head. Things like listing Kubernetes resources and applying Kubernetes Manifests, at least in my opinion, don't really seem like something an MCP Server needs to do. My only assumption (and this is an early assumption because MCP is only about 6 months old at the time of writing this) is that maybe it'll be used for some autonomous actions (e.g., "AI, go deploy this thing for me").&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;manage_eks_stacks&lt;/code&gt;: CloudFormation templates for EKS clusters.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;search_eks_troubleshoot_guide&lt;/code&gt;: Search EKS docs for info.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;get_cloudwatch_logs&lt;/code&gt;: Logs from CloudWatch for Pods or the Control Plane.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;get_cloudwatch_metrics&lt;/code&gt;: Metrics from CloudWatch for Pods and Clusters.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Combining all of this functionality in an MCP Server, you have a way to troubleshoot, deploy, and receive information about workloads running in EKS.&lt;/p&gt;

&lt;h2&gt;
  
  
  Using The AWS EKS MCP Server
&lt;/h2&gt;

&lt;p&gt;There are a few different ways to interact with an MCP Server. In the case of the EKS MCP Server, you'll need a client. That client could be anything from Cline for VS Code to a Python script that interacts with the &lt;code&gt;mcp&lt;/code&gt; library.&lt;/p&gt;

&lt;p&gt;In this case, we'll see how it's done with Python.&lt;/p&gt;

&lt;p&gt;First, below is what an MCP Server configuration looks like. It uses &lt;code&gt;uvx&lt;/code&gt;, which relies on &lt;code&gt;uv&lt;/code&gt; to run Python tools. Notice how it also takes some environment variables and arguments.&lt;/p&gt;

&lt;p&gt;💡uv is a Python package manager that's written in Rust. It's gaining a ton of popularity as a replacement for pip.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;{
    "mcpServers": {
        "awslabs.eks-mcp-server": {
            "command": "uvx",
            "args": [
                "awslabs.eks-mcp-server",
                "--allow-write"
            ],
            "env": {
                "AWS_PROFILE": "default",
                "AWS_REGION": "us-west-2",
                "FASTMCP_LOG_LEVEL": "INFO"
            }
        }
    }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
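
&lt;p&gt;The &lt;code&gt;--allow-write&lt;/code&gt; argument is worth a second look. The example in this post only lists Pods, so a more cautious setup could leave the flag off. Below is a minimal sketch that builds the &lt;code&gt;args&lt;/code&gt; list either way. The helper &lt;code&gt;build_server_args&lt;/code&gt; is hypothetical, and it assumes the server stays read-only when &lt;code&gt;--allow-write&lt;/code&gt; is omitted, which is what the flag's name implies. Check the server's documentation before relying on that.&lt;/p&gt;

```python
def build_server_args(allow_write=False):
    """Return the uvx arguments for the EKS MCP Server.

    Assumption: leaving out --allow-write keeps the server read-only.
    """
    args = ["awslabs.eks-mcp-server"]
    if allow_write:
        args.append("--allow-write")
    return args


if __name__ == "__main__":
    print(build_server_args())                  # without the write flag
    print(build_server_args(allow_write=True))  # matches the configuration above
```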



&lt;p&gt;Next, in the Python client, import the libraries that are needed. In this case, that's the MCP libraries, the &lt;code&gt;asyncio&lt;/code&gt; library for concurrent tasks, and the &lt;code&gt;json&lt;/code&gt; library for clean output to the terminal.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import asyncio
from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client
import json
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Next, define a function. In this case, it'll be an &lt;code&gt;async&lt;/code&gt; function. MCP Servers are I/O-centric, so being able to handle I/O-bound tasks concurrently is important.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;async def main()
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
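
&lt;p&gt;As a standalone aside on why &lt;code&gt;async&lt;/code&gt; matters for I/O-bound work, here is a minimal sketch that uses only &lt;code&gt;asyncio&lt;/code&gt;. The &lt;code&gt;fake_tool_call&lt;/code&gt; coroutine is a stand-in of mine that sleeps instead of talking to a real MCP Server, and the tool names are used purely as labels. &lt;code&gt;asyncio.gather&lt;/code&gt; runs the three waits concurrently rather than back to back.&lt;/p&gt;

```python
import asyncio


async def fake_tool_call(name, delay):
    """Stand-in for an I/O-bound call; sleeps instead of contacting a server."""
    await asyncio.sleep(delay)
    return f"{name} finished"


async def run_all():
    # gather() schedules all three coroutines at once and returns
    # their results in the order they were passed in.
    return await asyncio.gather(
        fake_tool_call("list_k8s_resources", 0.2),
        fake_tool_call("get_k8s_events", 0.1),
        fake_tool_call("get_pod_logs", 0.3),
    )


if __name__ == "__main__":
    for line in asyncio.run(run_all()):
        print(line)
```

&lt;p&gt;Because the waits overlap, the total runtime is roughly the longest single delay instead of the sum of all three.&lt;/p&gt;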



&lt;p&gt;The first piece of logic in the function defines how to launch the AWS EKS MCP Server. In this case, the configuration is hard-coded into the Python script. However, you could save it as a &lt;code&gt;.json&lt;/code&gt; file and load it instead of hard-coding it. Whichever approach you use, set &lt;code&gt;AWS_REGION&lt;/code&gt; to the region your cluster runs in.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;    server_params = StdioServerParameters(
        command="uvx",
        args=["awslabs.eks-mcp-server", "--allow-write"],
        env={
            "AWS_PROFILE": "default",
            "AWS_REGION": "us-east-1",
            "FASTMCP_LOG_LEVEL": "INFO"
        }
    )
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
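
&lt;p&gt;If you'd rather not hard-code the settings, the sketch below reads them from a JSON file shaped like the &lt;code&gt;mcpServers&lt;/code&gt; configuration shown earlier. The helper name &lt;code&gt;load_server_config&lt;/code&gt; and the file name &lt;code&gt;mcp.json&lt;/code&gt; are assumptions of mine, not something the EKS MCP Server requires.&lt;/p&gt;

```python
import json
import tempfile
from pathlib import Path


def load_server_config(path, server_name):
    """Read an mcpServers-style JSON file and return one server's settings."""
    config = json.loads(Path(path).read_text(encoding="utf-8"))
    server = config["mcpServers"][server_name]
    return {
        "command": server["command"],
        "args": server.get("args", []),
        "env": server.get("env", {}),
    }


if __name__ == "__main__":
    # Write a sample file so the sketch runs on its own; normally the
    # file would already exist next to your script.
    sample = {
        "mcpServers": {
            "awslabs.eks-mcp-server": {
                "command": "uvx",
                "args": ["awslabs.eks-mcp-server", "--allow-write"],
                "env": {"AWS_PROFILE": "default", "AWS_REGION": "us-east-1"},
            }
        }
    }
    with tempfile.TemporaryDirectory() as tmp:
        config_path = Path(tmp) / "mcp.json"
        config_path.write_text(json.dumps(sample), encoding="utf-8")
        settings = load_server_config(config_path, "awslabs.eks-mcp-server")
    print(settings["command"], settings["args"])
```

&lt;p&gt;The returned dictionary uses the same keys as the keyword arguments above, so it can be passed along as &lt;code&gt;StdioServerParameters(**settings)&lt;/code&gt;.&lt;/p&gt;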



&lt;p&gt;Once the EKS MCP Server is defined, you can connect to it via the MCP library and call a tool. The example below uses the &lt;code&gt;list_k8s_resources&lt;/code&gt; tool and passes the name of the cluster along with the kind and API version of the object you want to list.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;    async with stdio_client(server_params) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()

            result = await session.call_tool("list_k8s_resources", arguments={
                'cluster_name': 'k8squickstart-cluster',
                'kind': 'Pod',
                'api_version': 'v1',
            })
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
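
&lt;p&gt;Before calling a tool, it can help to ask the server what it exposes. The sketch below uses the &lt;code&gt;list_tools&lt;/code&gt; method on the session for that. The helper names &lt;code&gt;summarize_tools&lt;/code&gt; and &lt;code&gt;print_server_tools&lt;/code&gt; are my own, and the server settings are copied from the example above. Treat it as an illustration, not the only way to discover tools.&lt;/p&gt;

```python
import asyncio


def summarize_tools(tools):
    """Return sorted 'name: first line of description' strings."""
    lines = []
    for tool in tools:
        description = (tool.description or "").strip()
        first_line = description.splitlines()[0] if description else "(no description)"
        lines.append(f"{tool.name}: {first_line}")
    return sorted(lines)


async def print_server_tools():
    # Imported here so summarize_tools can be tried without the SDK installed.
    from mcp import ClientSession, StdioServerParameters
    from mcp.client.stdio import stdio_client

    server_params = StdioServerParameters(
        command="uvx",
        args=["awslabs.eks-mcp-server"],
        env={
            "AWS_PROFILE": "default",
            "AWS_REGION": "us-east-1",
            "FASTMCP_LOG_LEVEL": "INFO",
        },
    )
    async with stdio_client(server_params) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            listing = await session.list_tools()
            for line in summarize_tools(listing.tools):
                print(line)
```

&lt;p&gt;To try it against a real server, call &lt;code&gt;asyncio.run(print_server_tools())&lt;/code&gt; on a machine with &lt;code&gt;uvx&lt;/code&gt; and AWS credentials available.&lt;/p&gt;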



&lt;p&gt;The last step is to output the results to the terminal.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;            for c in result.content:
                data = json.loads(c.text)
                print(json.dumps(data, indent=2))
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
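
&lt;p&gt;The loop above assumes every content item carries JSON text. If a tool returns plain text instead, &lt;code&gt;json.loads&lt;/code&gt; raises &lt;code&gt;json.JSONDecodeError&lt;/code&gt;. Below is a slightly more defensive sketch. The helper &lt;code&gt;render_content&lt;/code&gt; is hypothetical, and the objects in the demo are stand-ins that only mimic the &lt;code&gt;result.content&lt;/code&gt; shape used in this post.&lt;/p&gt;

```python
import json
from types import SimpleNamespace


def render_content(result):
    """Return printable strings for each text item in a tool result."""
    rendered = []
    for item in result.content:
        text = getattr(item, "text", None)
        if text is None:
            # Non-text content (for example an image) has no .text to print.
            continue
        try:
            rendered.append(json.dumps(json.loads(text), indent=2))
        except json.JSONDecodeError:
            # Plain-text payloads are passed through unchanged.
            rendered.append(text)
    return rendered


if __name__ == "__main__":
    # Stand-in objects with the same attribute names as the real result.
    fake_result = SimpleNamespace(content=[
        SimpleNamespace(text='{"kind": "Pod", "count": 8}'),
        SimpleNamespace(text="Listed 8 Pod resources"),
    ])
    for block in render_content(fake_result):
        print(block)
```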



&lt;p&gt;Putting the code together, it looks like this.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import asyncio
from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client
import json

async def main():
    server_params = StdioServerParameters(
        command="uvx",
        args=["awslabs.eks-mcp-server", "--allow-write"],
        env={
            "AWS_PROFILE": "default",
            "AWS_REGION": "us-east-1",
            "FASTMCP_LOG_LEVEL": "INFO"
        }
    )

    # Connect to EKS MCP server
    async with stdio_client(server_params) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()

            result = await session.call_tool("list_k8s_resources", arguments={
                'cluster_name': 'k8squickstart-cluster',
                'kind': 'Pod',
                'api_version': 'v1',
            })

            for c in result.content:
                data = json.loads(c.text)
                print(json.dumps(data, indent=2))

if __name__ == "__main__":
    asyncio.run(main())
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Save the code above in a file called &lt;code&gt;eksmcp.py&lt;/code&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Running The Client
&lt;/h2&gt;

&lt;p&gt;Now that the client is built, the only thing left is to run it and confirm that it works as expected.&lt;/p&gt;

&lt;p&gt;From the directory where you saved the script, run the following command in a terminal:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;python3 eksmcppy
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;💡Depending on your OS, you may have to use &lt;code&gt;python&lt;/code&gt; instead of &lt;code&gt;python3&lt;/code&gt;. It all depends on how you configured your alias.&lt;/p&gt;

&lt;p&gt;The first output you'll see is the server starting up, followed by the authentication/authorization step, which checks your local AWS configuration for credentials.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;2025-07-08 08:45:33.682 | INFO     | awslabs.eks_mcp_server.server:main:140 - Starting EKS MCP Server in restricted sensitive data access mode
[07/08/25 08:45:33] INFO     Processing request of type CallToolRequest                                                   server.py:619
                    INFO     Found credentials in shared credentials file: ~/.aws/credentials                       credentials.py:1352
                    INFO     Found credentials in shared credentials file: ~/.aws/credentials                       credentials.py:1352
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The second output confirms which resources (in this case, Pods) were listed and how many.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;2025-07-08 08:45:35.392 | INFO     | awslabs.eks_mcp_server.logging_helper:log_with_request_id:49 - [request_id=1] Cleaned up resource responses for Pod resources
2025-07-08 08:45:35.392 | INFO     | awslabs.eks_mcp_server.logging_helper:log_with_request_id:49 - [request_id=1] Listed 8 Pod resources in all namespaces
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You'll then see a JSON output of the Pods running within your environment.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgvht65c2aj2n4cj2qeqt.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgvht65c2aj2n4cj2qeqt.png" alt=" " width="746" height="828"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Congrats! You've successfully implemented a solid use case of the AWS EKS MCP Server with Python.&lt;/p&gt;

</description>
      <category>python</category>
      <category>programming</category>
      <category>mcp</category>
      <category>ai</category>
    </item>
  </channel>
</rss>
