<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Alister Baroi</title>
    <description>The latest articles on DEV Community by Alister Baroi (@alisterbaroi).</description>
    <link>https://dev.to/alisterbaroi</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3793080%2Faa9f5766-bbc8-4978-b7ae-3a081475d824.jpg</url>
      <title>DEV Community: Alister Baroi</title>
      <link>https://dev.to/alisterbaroi</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/alisterbaroi"/>
    <language>en</language>
    <item>
      <title>Anthropic Mythos Broke Firefox: 271 zero-day vulnerabilities</title>
      <dc:creator>Alister Baroi</dc:creator>
      <pubDate>Thu, 23 Apr 2026 23:12:05 +0000</pubDate>
      <link>https://dev.to/alisterbaroi/anthropic-mythos-broke-firefox-271-zero-day-vulnerabilities-3p0</link>
      <guid>https://dev.to/alisterbaroi/anthropic-mythos-broke-firefox-271-zero-day-vulnerabilities-3p0</guid>
      <description>&lt;p&gt;&lt;strong&gt;271 zero-day vulnerabilities. One AI model. One Firefox release.&lt;/strong&gt; And that's just one of four stories worth your attention this fortnight.&lt;/p&gt;

&lt;p&gt;If you run engineering, security, or AI at your company, this article will give you a clear message: AI is no longer something your team &lt;em&gt;uses&lt;/em&gt;. It's something your team (and your attackers) &lt;em&gt;deploys&lt;/em&gt;. Here are the four moves that matter, and the numbers behind each.&lt;/p&gt;

&lt;h2&gt;
  
  
  1. Mythos found 271 zero-day vulnerabilities in Firefox 150
&lt;/h2&gt;

&lt;p&gt;On April 22, Mozilla shipped &lt;a href="https://www.firefox.com/en-US/firefox/150.0/releasenotes/" rel="noopener noreferrer"&gt;Firefox 150&lt;/a&gt; with patches for &lt;strong&gt;271 security vulnerabilities&lt;/strong&gt;, all identified by Anthropic's unreleased Mythos model during what Mozilla calls &lt;em&gt;its initial evaluation&lt;/em&gt;. For context: across all of 2025, Mozilla patched roughly &lt;strong&gt;73 high-severity Firefox bugs&lt;/strong&gt;. Mythos delivered almost &lt;strong&gt;4× that count&lt;/strong&gt; in one evaluation window.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Mythos&lt;/strong&gt; is distributed under Anthropic's restricted &lt;a href="https://www.anthropic.com/glasswing" rel="noopener noreferrer"&gt;Project Glasswing&lt;/a&gt; programme: not a public model, and not available via API.&lt;/li&gt;
&lt;li&gt;Firefox 150's security advisory lists &lt;strong&gt;41&lt;/strong&gt; &lt;a href="https://en.wikipedia.org/wiki/Common_Vulnerabilities_and_Exposures" rel="noopener noreferrer"&gt;CVE&lt;/a&gt;s; three of those CVEs are memory-safety roll-ups that bundle many of the 271 individual defects.&lt;/li&gt;
&lt;li&gt;The most serious finds were &lt;strong&gt;use-after-free bugs in the DOM and WebRTC&lt;/strong&gt;, the same bug class that has driven browser exploitation for two decades.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Mozilla's caveat&lt;/strong&gt; (worth repeating): Mythos did not find any category of bug that an elite human researcher could not have found. The gain is &lt;strong&gt;scale and speed&lt;/strong&gt;, not new capability.&lt;/li&gt;
&lt;/ul&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;"A gap between machine-discoverable and human-discoverable bugs favors the attacker, who can concentrate many months of costly human effort to find a single bug. Closing this gap erodes the attacker's long-term advantage by making all discoveries cheap."&lt;/em&gt; — &lt;a href="https://www.mozilla.org/" rel="noopener noreferrer"&gt;Mozilla&lt;/a&gt;, on the shift in attacker/defender economics.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;If Anthropic can hand Mozilla 271 real bugs in a single evaluation, assume your own vendors (and your adversaries) are running similar passes on your stack. The question to ask this quarter is no longer &lt;em&gt;"do we use AI in our security review?"&lt;/em&gt; — it is &lt;em&gt;"which of our vendors do, and what does our threat model look like if attackers scale this before we do?"&lt;/em&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  2. Anthropic launched Claude Design
&lt;/h2&gt;

&lt;p&gt;On April 17, Anthropic released &lt;a href="https://www.anthropic.com/news/claude-design-anthropic-labs" rel="noopener noreferrer"&gt;Claude Design&lt;/a&gt;, a new Anthropic Labs product built on &lt;a href="https://www.anthropic.com/news/claude-opus-4-7" rel="noopener noreferrer"&gt;Claude Opus 4.7&lt;/a&gt;. It turns Claude into a design tool that produces real deliverables: prototypes, slide decks, one-pagers, marketing collateral.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Reads your &lt;strong&gt;codebase and existing design files&lt;/strong&gt; to apply brand rules automatically.&lt;/li&gt;
&lt;li&gt;Accepts &lt;strong&gt;5+ input formats:&lt;/strong&gt; text prompts, images, DOCX, PPTX, XLSX.&lt;/li&gt;
&lt;li&gt;Exports to &lt;strong&gt;Canva&lt;/strong&gt;, &lt;strong&gt;PDF&lt;/strong&gt;, &lt;strong&gt;PPTX&lt;/strong&gt;, &lt;strong&gt;HTML&lt;/strong&gt;, or a shareable internal URL.&lt;/li&gt;
&lt;li&gt;Hands off to &lt;strong&gt;Claude Code&lt;/strong&gt; when a prototype needs real implementation.&lt;/li&gt;
&lt;li&gt;Available in research preview across &lt;strong&gt;4 subscription tiers&lt;/strong&gt;: Pro, Max, Team, Enterprise.&lt;/li&gt;
&lt;li&gt;Datadog's quantified claim: prototyping that took &lt;strong&gt;one week of back-and-forth&lt;/strong&gt; now happens in &lt;strong&gt;one conversation&lt;/strong&gt;.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This is Anthropic stepping out of "model behind an API" and into "end-user product", competing directly with &lt;strong&gt;Figma&lt;/strong&gt;, &lt;strong&gt;Canva&lt;/strong&gt;, and the slide-building half of &lt;strong&gt;Microsoft 365&lt;/strong&gt;. If your product organisation still treats model vendors as neutral infrastructure, that assumption has a shorter shelf life than your next budget cycle. The vendor now competes with some of your tooling.&lt;/p&gt;

&lt;h2&gt;
  
  
  3. Google open-sourced DESIGN.md
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://labs.google/" rel="noopener noreferrer"&gt;Google Labs&lt;/a&gt; released a draft open-source specification called &lt;a href="https://blog.google/innovation-and-ai/models-and-research/google-labs/stitch-design-md/" rel="noopener noreferrer"&gt;DESIGN.md&lt;/a&gt;, a format that describes design systems in a way AI agents can read, reason about, and validate against. It shipped alongside &lt;a href="https://stitch.withgoogle.com/" rel="noopener noreferrer"&gt;Stitch&lt;/a&gt; (Google's AI UI tool), but the format itself is &lt;strong&gt;platform-agnostic&lt;/strong&gt; and hosted on &lt;a href="https://github.com/google-labs-code/design.md" rel="noopener noreferrer"&gt;GitHub&lt;/a&gt;.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Encodes design intent so AI agents stop guessing: &lt;em&gt;"agents can know exactly what a color is for"&lt;/em&gt;.&lt;/li&gt;
&lt;li&gt;Includes built-in &lt;a href="https://www.w3.org/TR/WCAG21/" rel="noopener noreferrer"&gt;WCAG&lt;/a&gt; &lt;strong&gt;accessibility validation&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;Portable across any tool or platform, not locked to Stitch.&lt;/li&gt;
&lt;li&gt;Released as a draft spec, open to contribution.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Watch the &lt;strong&gt;format&lt;/strong&gt;, not the tool. Markdown files that AI agents read for persistent context — &lt;em&gt;CLAUDE.md, AGENTS.md, README.md&lt;/em&gt;, and now &lt;em&gt;DESIGN.md&lt;/em&gt; — are becoming the lingua franca of AI-native workflows. The standard here is being set in public, right now. Whichever spec wins becomes the default your engineering teams (and their AI copilots) work against for the next decade. API.md, SECURITY.md, and ONBOARDING.md are the obvious next chapters. If you have a design system or a platform team, this is a draft you want an opinion on.&lt;/p&gt;

&lt;h2&gt;
  
  
  4. OpenAI is quietly building "Hermes" — always-on agents inside ChatGPT
&lt;/h2&gt;

&lt;p&gt;Leaked internal screenshots, surfaced by &lt;a href="https://www.testingcatalog.com/openai-develops-platform-for-always-on-agents-on-chatgpt/" rel="noopener noreferrer"&gt;TestingCatalog&lt;/a&gt; between &lt;strong&gt;April 6–21&lt;/strong&gt;, show OpenAI actively developing a platform codenamed &lt;strong&gt;Hermes&lt;/strong&gt;. It adds persistent, &lt;strong&gt;24/7&lt;/strong&gt; agents to ChatGPT — agents that run even when you are not at the keyboard.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Custom workflows&lt;/strong&gt; and skill assembly.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Task scheduling&lt;/strong&gt; and event-triggered actions.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;External messaging connectors:&lt;/strong&gt; agents can reach users outside ChatGPT.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Role-based templates:&lt;/strong&gt; leaked screenshots show CTO and CPO archetypes.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Multi-agent orchestration&lt;/strong&gt;, integrated with OpenAI's existing Workflows builder.&lt;/li&gt;
&lt;li&gt;Status: internal beta. No release date confirmed. &lt;strong&gt;Unofficial:&lt;/strong&gt; treat as leak, not announcement.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Signal for engineering leaders:&lt;/strong&gt; If Hermes ships in the form shown, ChatGPT stops being a chat interface and becomes a &lt;strong&gt;runtime for autonomous systems&lt;/strong&gt;, a direct competitor to &lt;strong&gt;Salesforce Agentforce&lt;/strong&gt;, &lt;strong&gt;Microsoft Copilot Studio&lt;/strong&gt;, and every agent startup built &lt;em&gt;on top of&lt;/em&gt; the OpenAI API. Those startups are then competing with their own platform provider, using agent patterns their provider can see in aggregate across hundreds of millions of users. If your 2026 roadmap includes an AI agent strategy built on vendor APIs, this is the risk line item you want in your Q3 review.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Thread
&lt;/h2&gt;

&lt;p&gt;Four announcements, two weeks, one pattern. AI this fortnight was not about bigger models or cleaner benchmarks. It was about AI &lt;strong&gt;doing the work&lt;/strong&gt; — finding real zero-days in shipped software, producing design artifacts that replace a week of iteration, standardizing how agents read intent, and (in OpenAI's case) running as always-on infrastructure your teams have not yet budgeted for.&lt;/p&gt;

&lt;p&gt;The message for leaders is simple: &lt;em&gt;the operational reality of AI is moving faster than most roadmaps were written to handle&lt;/em&gt;. &lt;/p&gt;

</description>
      <category>anthropic</category>
      <category>hermes</category>
      <category>mozilla</category>
      <category>firefox</category>
    </item>
    <item>
      <title>KubeVirt Networking: How to Preserve VM IP Addresses During Migration</title>
      <dc:creator>Alister Baroi</dc:creator>
      <pubDate>Tue, 21 Apr 2026 20:55:58 +0000</pubDate>
      <link>https://dev.to/tigeraio/kubevirt-networking-how-to-preserve-vm-ip-addresses-during-migration-1fe9</link>
      <guid>https://dev.to/tigeraio/kubevirt-networking-how-to-preserve-vm-ip-addresses-during-migration-1fe9</guid>
      <description>&lt;p&gt;Organisations are re-evaluating their VM infrastructure. The economics have shifted, the tooling has matured, and the case for running two separate platforms, one for containers, one for VMs, is getting harder to justify. Platform teams that spent years managing hypervisor infrastructure are being asked to consolidate, and most are landing on the same answer: Kubernetes.&lt;/p&gt;

&lt;p&gt;KubeVirt makes running VMs on Kubernetes possible. But &lt;a href="https://www.tigera.io/blog/deep-dive/the-power-of-kubevirt-and-calico/" rel="noopener noreferrer"&gt;KubeVirt networking&lt;/a&gt; – what happens to a VM’s IP address, VLAN, and security posture when it lands in a cluster – is where most migration plans hit a wall. The reasons go beyond cost:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Most enterprises already run Kubernetes.&lt;/strong&gt; Containers are already there. Adding VMs to the same platform consolidates tooling, lifecycle management, networking models, and security policy into a single operational model.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Two platforms means double the overhead.&lt;/strong&gt; Separate infrastructure means separate upgrade cycles, separate monitoring, separate network configuration, and separate on-call runbooks. Platform consolidation has direct operational value.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Kubernetes is mature enough.&lt;/strong&gt; KubeVirt has reached the point where it’s a viable production choice for enterprise VM workloads.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The decision to migrate is being made. The question is &lt;strong&gt;how to do it without causing chaos.&lt;/strong&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Introducing KubeVirt
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://kubevirt.io/" rel="noopener noreferrer"&gt;KubeVirt&lt;/a&gt; extends the Kubernetes API with new custom resource types: &lt;code&gt;VirtualMachine&lt;/code&gt; and &lt;code&gt;VirtualMachineInstance&lt;/code&gt;. These make VMs first-class Kubernetes objects — scheduled, managed, and observable through the same tools and APIs as containers.&lt;/p&gt;

&lt;p&gt;A VM running in KubeVirt runs inside a &lt;code&gt;virt-launcher&lt;/code&gt; pod. Kubernetes schedules that pod to a node with available resources, the same way it schedules any other workload. The VM gets CPU and memory from the node. It doesn’t know it moved.&lt;/p&gt;

&lt;p&gt;That’s the point: from the VM’s perspective, KubeVirt is invisible. The operating system keeps running. The application keeps running.&lt;/p&gt;
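
&lt;p&gt;For readers newer to KubeVirt, a minimal &lt;code&gt;VirtualMachine&lt;/code&gt; manifest looks roughly like the sketch below. The name, container disk image, and sizing are illustrative placeholders, not taken from any migration described here:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight console"&gt;&lt;code&gt;kubectl apply -f - &lt;&lt;EOF
# Illustrative only: a minimal KubeVirt VirtualMachine definition.
apiVersion: kubevirt.io/v1
kind: VirtualMachine
metadata:
  name: demo-vm                  # hypothetical name
spec:
  running: true
  template:
    spec:
      domain:
        devices:
          disks:
            - name: rootdisk
              disk:
                bus: virtio
        resources:
          requests:
            memory: 1Gi
      volumes:
        - name: rootdisk
          containerDisk:
            image: quay.io/containerdisks/fedora:latest   # example image
EOF
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;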

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwrh4hrko84vggixyj8dz.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwrh4hrko84vggixyj8dz.png" width="676" height="254"&gt;&lt;/a&gt;&lt;br&gt;
&lt;em&gt;KubeVirt virt-launcher pods in Kubernetes&lt;/em&gt;&lt;/p&gt;
&lt;h2&gt;
  
  
  The network is a different story
&lt;/h2&gt;

&lt;p&gt;When you migrate a VM, three things have to follow: compute, storage, and network. Compute and storage are properties of the VM itself — self-contained. KubeVirt handles them by giving the VM a new host and a new storage backend. The VM doesn’t notice.&lt;/p&gt;

&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Dependency&lt;/th&gt;
&lt;th&gt;What KubeVirt Does&lt;/th&gt;
&lt;th&gt;Status&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Compute&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;VM runs in a virt-launcher pod. Kubernetes schedules it.&lt;/td&gt;
&lt;td&gt;Solved&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Storage&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Disk images mapped to Persistent Volumes via migration tools.&lt;/td&gt;
&lt;td&gt;Solved&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Network&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;VM gets a new IP from the Kubernetes pod CIDR.&lt;/td&gt;
&lt;td&gt;Not Solved&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;

&lt;p&gt;&lt;em&gt;Dependencies of VM migrations&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;The network is different. The network isn’t a property of the VM. &lt;strong&gt;It’s a property of the VM’s relationship to everything else in the infrastructure.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;A VM’s compute dependency is between the VM and its host. A VM’s storage dependency is between the VM and a storage backend. But a VM’s network dependency is between the VM and every other system that knows how to reach it.&lt;/p&gt;

&lt;p&gt;That distinction is why networking is where VM migrations stall. This isn’t theoretical. KubeVirt’s own issue tracker documents the problem directly: a user &lt;a href="https://github.com/kubevirt/kubevirt/issues/14320" rel="noopener noreferrer"&gt;reported their VM’s IP changing after live migration&lt;/a&gt;, and a project maintainer confirmed: “Sticky IPs is not implemented.” The network identity doesn’t follow the VM by default.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwdy44194kw5fkekzay7s.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwdy44194kw5fkekzay7s.png" width="800" height="584"&gt;&lt;/a&gt;&lt;br&gt;
&lt;em&gt;Lift-and-Shift VMs to Kubernetes with Calico L2 Bridge Networks&lt;/em&gt;&lt;/p&gt;
&lt;h2&gt;
  
  
  Why default KubeVirt networking breaks VM migrations
&lt;/h2&gt;

&lt;p&gt;When a VM lands in Kubernetes using default pod networking:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;It receives a &lt;strong&gt;new IP address&lt;/strong&gt; from the cluster’s pod CIDR, a range that exists only inside the cluster.&lt;/li&gt;
&lt;li&gt;The original &lt;strong&gt;VLAN doesn’t exist&lt;/strong&gt; inside the cluster. Kubernetes has no native VLAN concept in default networking.&lt;/li&gt;
&lt;li&gt;Pod IPs are &lt;strong&gt;only meaningful inside the cluster&lt;/strong&gt;. The upstream network has no direct visibility into them.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;From the perspective of every system that previously knew the VM by its address, the VM has disappeared. Something with an unfamiliar IP has appeared inside a cluster that the upstream infrastructure can’t see into.&lt;/p&gt;

&lt;p&gt;A VM’s IP address accumulates dependencies over time. By the time you’re migrating it, that IP is embedded in:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Firewall rules — &lt;/strong&gt; security teams wrote rules allowing or denying traffic to that specific address.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;DNS records — &lt;/strong&gt; the hostname resolves to that IP.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;DHCP configuration — &lt;/strong&gt; the IP is reserved for that VM’s MAC address.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Monitoring and alerting —&lt;/strong&gt;  observability tools are configured to watch that address.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Load balancer backends — &lt;/strong&gt; upstream load balancers route traffic to that IP.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Application configuration files — &lt;/strong&gt; other services have that IP hardcoded.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Compliance and audit documentation —&lt;/strong&gt;  security posture records reference that IP in that VLAN.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;VLANs add another dimension. In enterprise environments, VLANs aren’t just a way to segment traffic, they’re security boundaries, designed and owned by the security team. Firewall rules are built around VLAN membership. Compliance frameworks reference VLAN placement. When the VM moves to Kubernetes with default networking, that VLAN disappears. The security boundary is gone.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;None of this travels with the VM automatically&lt;/strong&gt;. And every broken dependency requires a different team to fix it.&lt;/p&gt;

&lt;p&gt;You can see this directly. Running &lt;code&gt;kubectl exec&lt;/code&gt; into the virt-launcher pod of a migrated VM shows the interfaces KubeVirt creates with default pod networking:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight console"&gt;&lt;code&gt;&lt;span class="gp"&gt;2: eth0@if9: &amp;lt;BROADCAST,MULTICAST,UP,LOWER_UP&amp;gt;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;mtu 1450
&lt;span class="go"&gt;inet 10.60.141.196/32 scope global eth0
&lt;/span&gt;&lt;span class="gp"&gt;3: k6t-eth0: &amp;lt;BROADCAST,MULTICAST,UP,LOWER_UP&amp;gt;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;mtu 1450
&lt;span class="go"&gt;inet 10.0.2.1/24 scope global k6t-eth0
&lt;/span&gt;&lt;span class="gp"&gt;4: tap0: &amp;lt;BROADCAST,MULTICAST,UP,LOWER_UP&amp;gt;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;mtu 1450 master k6t-eth0
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;em&gt;eth0 is a Calico-assigned pod CIDR address — meaningful only inside the cluster. k6t-eth0 is KubeVirt’s internal masquerade bridge. tap0 connects to the VM’s virtual NIC. The VM’s original IP is gone. The upstream network sees 10.60.141.196, not the address any firewall rule, DNS record, or application config was written for.&lt;/em&gt;&lt;/p&gt;
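
&lt;p&gt;To reproduce a view like this yourself, a command along the following lines works against the migrated VM’s launcher pod. The namespace and pod name below are hypothetical; KubeVirt generates the &lt;code&gt;virt-launcher&lt;/code&gt; pod name per VM:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight console"&gt;&lt;code&gt;# Find the launcher pod for the VM, then list its network interfaces.
kubectl get pods -n vm-workloads | grep virt-launcher
kubectl exec -n vm-workloads virt-launcher-demo-vm-abcde -- ip addr
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;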

&lt;h2&gt;
  
  
  A lift-and-shift becomes a multi-team project
&lt;/h2&gt;

&lt;p&gt;Here’s what was planned: the platform team moves the VM. One team. The migration is invisible to the rest of the business.&lt;/p&gt;

&lt;p&gt;Here’s what actually happens with default pod networking:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;The IP changes&lt;/strong&gt;. The &lt;strong&gt;network team&lt;/strong&gt; needs to rewrite firewall rules and update DNS&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;The VLAN disappears&lt;/strong&gt;. The &lt;strong&gt;security team&lt;/strong&gt; needs to review the new network placement and approve it&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Application config breaks&lt;/strong&gt;. The &lt;strong&gt;application team&lt;/strong&gt; needs to update config files and hardcoded references&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Every one of these requires sign-offs, tickets, and coordination.&lt;/p&gt;

&lt;p&gt;A migration budgeted as a lift-and-shift gets delivered as a network redesign. &lt;strong&gt;Per VM&lt;/strong&gt;. At scale, the coordination cost makes migration impractical.&lt;/p&gt;

&lt;p&gt;This is where VM migration to Kubernetes stalls, not because the technology doesn’t work, but because the organisational cost exceeds what anyone planned for or funded.&lt;/p&gt;

&lt;h2&gt;
  
  
  How to preserve VM IP addresses and VLANs in Kubernetes
&lt;/h2&gt;

&lt;p&gt;Think about what the problem really is. The VM had a home on the network. A specific IP, a specific VLAN, a specific place in the security model. When it moved to Kubernetes, that home disappeared. Default pod networking gave it a new address in a new network that nothing outside the cluster knows about.&lt;/p&gt;

&lt;p&gt;Calico L2 Bridge Networks solve this by doing the opposite. Instead of putting the VM on the Kubernetes pod network, they bring the VM’s original network into Kubernetes: the physical Layer 2 segment the VM lived on is extended directly into the cluster via a bridge on the node. The VM connects to that bridge through a secondary interface, and the &lt;a href="http://tigera.io/blog/lift-and-shift-vms-to-kubernetes-with-calico-l2-bridge-networks/" rel="noopener noreferrer"&gt;VM preserves its original IP address&lt;/a&gt;, the same VLAN, and the same MAC address it had before the migration. Its network identity survives the move unchanged.&lt;/p&gt;

&lt;p&gt;Nothing on the outside knows anything has changed. The firewall still talks to the same IP. DNS still resolves to the right place. The monitoring dashboard still shows the right host. The application that had the IP hardcoded still connects. The security team’s VLAN boundary still exists — it just now exists inside Kubernetes too.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fpagqr6cfw811q39ztv4g.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fpagqr6cfw811q39ztv4g.png" width="800" height="574"&gt;&lt;/a&gt;&lt;br&gt;
&lt;em&gt;L2 Bridge Mode with Calico by Tigera&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;You can see the difference at the interface level. With Calico L2 Bridge, that same &lt;code&gt;virt-launcher&lt;/code&gt; pod now looks like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight console"&gt;&lt;code&gt;&lt;span class="gp"&gt;2: eth0@if9: &amp;lt;BROADCAST,MULTICAST,UP,LOWER_UP&amp;gt;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;mtu 1450
&lt;span class="go"&gt;   inet 10.60.141.196/32 scope global eth0
&lt;/span&gt;&lt;span class="gp"&gt;3: k6t-eth0: &amp;lt;BROADCAST,MULTICAST,UP,LOWER_UP&amp;gt;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;mtu 1450
&lt;span class="go"&gt;   inet 10.0.2.1/24 scope global k6t-eth0
&lt;/span&gt;&lt;span class="gp"&gt;4: tap0: &amp;lt;BROADCAST,MULTICAST,UP,LOWER_UP&amp;gt;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;mtu 1450 master k6t-eth0
&lt;span class="gp"&gt;5: net1: &amp;lt;BROADCAST,MULTICAST,UP,LOWER_UP&amp;gt;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;mtu 1500
&lt;span class="go"&gt;   link/ether 52:54:00:3a:7f:21 brd ff:ff:ff:ff:ff:ff
   inet 10.10.5.42/24 brd 10.10.5.255 scope global net1
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;net1&lt;/code&gt; is the secondary interface connected to the L2 bridge that Calico manages on the node. That’s the VM’s original IP, &lt;code&gt;10.10.5.42&lt;/code&gt;, on its original subnet, with its original MAC address. The pod-side interfaces are still there (KubeVirt still needs them), but the VM’s actual network identity is preserved on &lt;code&gt;net1&lt;/code&gt;. That’s the interface the rest of your infrastructure talks to.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why a secondary interface and not the primary?
&lt;/h2&gt;

&lt;p&gt;KubeVirt manages the VM’s primary network interface through the &lt;code&gt;virt-launcher&lt;/code&gt; pod. That primary interface has two modes: &lt;strong&gt;masquerade&lt;/strong&gt; and &lt;strong&gt;bridge&lt;/strong&gt;. &lt;strong&gt;Masquerade&lt;/strong&gt; NATs all VM traffic through the pod’s IP. The VM is hidden behind the pod address. &lt;strong&gt;Bridge&lt;/strong&gt; mode connects the VM to the pod network bridge. Closer, but still the pod network, not your VLAN.&lt;/p&gt;

&lt;p&gt;Neither mode has a way to extend an external VLAN directly to the VM. They’re designed for pod networking, not for preserving legacy network identity.&lt;/p&gt;

&lt;p&gt;The secondary interface is what makes this work. Calico attaches an additional interface to the VM and that interface connects to the bridge Calico created on the node, which connects to the trunk carrying your VLAN from the physical switch. The VM’s traffic on that interface goes directly to the right network segment without any translation or tunnelling.&lt;/p&gt;

&lt;h2&gt;
  
  
  How Calico sets it up
&lt;/h2&gt;

&lt;p&gt;The setup is declarative. You define what you want, Calico handles the plumbing.&lt;/p&gt;

&lt;p&gt;You create a &lt;code&gt;network&lt;/code&gt; resource in Kubernetes that tells Calico which VLAN to bridge and how to map it. Calico reads that and creates the bridge on the node automatically, attaches the trunk interface, and starts tracking the VM’s IP. A &lt;code&gt;NetworkAttachmentDefinition&lt;/code&gt; tells KubeVirt to attach the secondary interface at boot. The &lt;code&gt;VirtualMachine&lt;/code&gt; spec references the secondary network, and when the VM starts, &lt;code&gt;net1&lt;/code&gt; appears with the right IP.&lt;/p&gt;
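
&lt;p&gt;The Calico-specific &lt;code&gt;network&lt;/code&gt; resource and the &lt;code&gt;NetworkAttachmentDefinition&lt;/code&gt; are created per Tigera’s documentation for your Calico version, so they are omitted here. As a rough sketch of the KubeVirt side of the wiring, trimmed to the networking-relevant fields, a VM that keeps its primary pod interface and adds a bridged secondary interface looks something like this (the VM name and the network name &lt;code&gt;vlan105-bridge&lt;/code&gt; are placeholders):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight console"&gt;&lt;code&gt;kubectl apply -f - &lt;&lt;EOF
# Illustrative snippet only: disks, volumes, and sizing omitted.
apiVersion: kubevirt.io/v1
kind: VirtualMachine
metadata:
  name: migrated-vm
spec:
  running: true
  template:
    spec:
      domain:
        devices:
          interfaces:
            - name: default
              masquerade: {}        # primary interface on the pod network
            - name: legacy-vlan
              bridge: {}            # secondary interface preserved on the VLAN
      networks:
        - name: default
          pod: {}
        - name: legacy-vlan
          multus:
            networkName: vlan105-bridge   # NetworkAttachmentDefinition name
EOF
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;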

&lt;p&gt;Migration tools like Forklift (for OpenShift Virtualisation) handle the mapping of existing VM interfaces to the cluster definitions and register the VM’s IP with Calico before migration. From that point, Calico owns the IP, tracking it, keeping routing state correct, and following the VM if it moves between nodes.&lt;/p&gt;

&lt;p&gt;Multiple VLANs can run through the same trunk-backed bridge. You don’t need separate infrastructure per VLAN, the same bridge handles them all.&lt;/p&gt;

&lt;h2&gt;
  
  
  What you gain after the migration
&lt;/h2&gt;

&lt;p&gt;Getting the VM into Kubernetes without breaking anything is the primary goal. But once it’s there, a few things become available that weren’t possible in the hypervisor environment.&lt;/p&gt;

&lt;h3&gt;
  
  
  Network visibility
&lt;/h3&gt;

&lt;p&gt;In a traditional hypervisor setup, getting visibility into what a VM is actually doing on the network usually means deploying a separate agent, a network tap, or a dedicated monitoring tool per host. With Calico, that visibility comes built into the platform: traffic flow data, communication patterns, and network behaviour for VM interfaces, without anything extra to install or manage.&lt;/p&gt;

&lt;h3&gt;
  
  
  Security policy you can actually version control
&lt;/h3&gt;

&lt;p&gt;The firewall rules that protected this VM before migration were probably sitting in a security team’s ticketing system, applied manually to a physical or virtual firewall. They worked, but they weren’t portable, they weren’t reviewable in a pull request, and they weren’t easy to audit.&lt;/p&gt;

&lt;p&gt;With Calico, you can express the same security posture as &lt;a href="https://www.tigera.io/learn/guides/kubernetes-security/kubernetes-network-policy/" rel="noopener noreferrer"&gt;Kubernetes-native network policy&lt;/a&gt;. Labels, selectors, declarative YAML. You don’t have to do this immediately as part of the migration. The VLAN boundary still exists, the existing firewall rules still apply. But when the security team is ready to modernise the policy model, the tooling is already there.&lt;/p&gt;

&lt;h3&gt;
  
  
  Live migration that doesn’t touch the network
&lt;/h3&gt;

&lt;p&gt;Once a VM is running in Kubernetes, it can move between nodes for patching, rebalancing, hardware failures, and the network configuration moves with it. Calico tracks the IP and updates routing state automatically. From the outside, nothing changes. The VM is just on a different node now.&lt;/p&gt;

&lt;h2&gt;
  
  
  Making VM migration to Kubernetes practical
&lt;/h2&gt;

&lt;p&gt;Migration projects fail when the platform team scopes a job as “move the VM” and it turns into “rebuild the network.” That scope creep isn’t a technical failure, it’s what happens when you use a networking model designed for stateless containers to move workloads that were designed around stable, long-lived network identities.&lt;/p&gt;

&lt;p&gt;Calico L2 Bridge Networks solve the right problem: keep the network identity intact during the move, let the migration stay within the platform team’s remit, and leave modernisation for when it’s actually planned and funded.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Move now. Modernise later. On your own timeline.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Watch our walkthrough to learn more: &lt;a href="http://youtube.com/watch?v=gxpm47mGKPc" rel="noopener noreferrer"&gt;Calico L2 Bridge Networking for Virtual Machines&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The post &lt;a href="https://www.tigera.io/blog/kubevirt-networking-how-to-preserve-vm-ip-addresses-during-migration/" rel="noopener noreferrer"&gt;KubeVirt Networking: How to Preserve VM IP Addresses During Migration&lt;/a&gt; appeared first on &lt;a href="https://www.tigera.io" rel="noopener noreferrer"&gt;Tigera – Creator of Calico&lt;/a&gt;.&lt;/p&gt;

</description>
      <category>technicalblog</category>
      <category>vmmigration</category>
      <category>howto</category>
    </item>
    <item>
      <title>Your AI Agents Are Autonomous. But Are They Accountable?</title>
      <dc:creator>Alister Baroi</dc:creator>
      <pubDate>Fri, 17 Apr 2026 10:39:24 +0000</pubDate>
      <link>https://dev.to/tigeraio/your-ai-agents-are-autonomous-but-are-they-accountable-4pja</link>
      <guid>https://dev.to/tigeraio/your-ai-agents-are-autonomous-but-are-they-accountable-4pja</guid>
      <description>&lt;p&gt;&lt;em&gt;Why accountability, not capability, is the real bottleneck for enterprise agentic AI, and what security leaders need to do about it before regulators force the issue.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Every enterprise is building AI agents. Marketing has one summarizing campaign performance. Engineering has one triaging incidents. Customer support has one resolving tickets. Finance has one processing invoices. And increasingly, those agents are talking to each other: calling tools, accessing databases, delegating tasks across complex multi-hop chains.&lt;/p&gt;

&lt;p&gt;But here’s the question nobody wants to hear at 3 a.m. when something goes wrong: &lt;em&gt;who authorized that action, what policy permitted it, and what’s the full chain of events?&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;For most enterprises, the honest answer is: nobody knows. That’s not a governance problem — it’s an &lt;a href="https://www.tigera.io/blog/beyond-the-prompt-ai-agent-design-patterns-and-the-new-governance-gap/" rel="noopener noreferrer"&gt;AI agent accountability&lt;/a&gt; crisis.&lt;/p&gt;

&lt;h2&gt;
  
  
  Agents Are Scaling Faster Than Governance
&lt;/h2&gt;

&lt;p&gt;The data paints a stark picture. McKinsey research found that &lt;a href="https://www.mckinsey.com/capabilities/tech-and-ai/our-insights/tech-forward/state-of-ai-trust-in-2026-shifting-to-the-agentic-era" rel="noopener noreferrer"&gt;80% of organizations have already encountered risky behavior from AI agents&lt;/a&gt;. These actions were unintended, unauthorized, or outside acceptable guardrails. Yet only about one-third of organizations report meaningful governance maturity. Gartner predicts that over &lt;a href="https://www.gartner.com/en/newsroom/press-releases/2025-06-25-gartner-predicts-over-40-percent-of-agentic-ai-projects-will-be-canceled-by-end-of-2027" rel="noopener noreferrer"&gt;40% of agentic AI projects will be canceled by the end of 2027&lt;/a&gt; due to escalating costs, unclear business value, or inadequate risk controls.&lt;/p&gt;

&lt;p&gt;This isn’t a future problem. This is the mainstream enterprise experience with agentic AI right now. And the pattern should feel familiar. A decade ago, enterprises faced “shadow IT,” where employees adopting cloud services without IT approval created ungoverned sprawl that took years to bring under control. Today, agentic architectures risk creating a new back door for “shadow AI,” and the stakes are higher. Unlike cloud services, agents don’t just store data; they make decisions, call APIs, access databases, and propagate those actions across other agents in a chain that nobody can trace.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Regulatory Clock
&lt;/h2&gt;

&lt;p&gt;Compliance deadlines on both sides of the Atlantic are months away. The EU AI Act’s main provisions take effect in August 2026, requiring action logging, transparency, and human oversight for high-risk AI systems. In the US, the Colorado AI Act, currently the leading regulation, takes effect in June 2026, mandating risk management programs and impact assessments for high-risk AI. And Colorado isn’t the only state: California, New York, Utah, and Texas have already enacted AI governance laws, and there are 80+ AI governance bills under consideration in the current US Congress.&lt;/p&gt;

&lt;p&gt;Two-thirds of industry leaders believe &lt;a href="https://www.isaca.org/resources/news-and-trends/industry-news/2025/the-looming-authorization-crisis-why-traditional-iam-fails-agentic-ai" rel="noopener noreferrer"&gt;formal agent accountability frameworks will become mandatory within the next two years&lt;/a&gt;. The question isn’t whether these requirements are coming. It’s whether your organization will be ready.&lt;/p&gt;

&lt;h2&gt;
  
  
  Key Pillars for Agent Accountability
&lt;/h2&gt;

&lt;p&gt;Not all “governance” is created equal. Many enterprises believe they have agent governance because they have network policies or an API gateway. But governance without accountability is security theater; it might prevent some bad outcomes, but it can’t prove why good outcomes were permitted, trace what happened when something goes wrong, or satisfy an auditor asking for evidence.&lt;/p&gt;

&lt;p&gt;True agent accountability requires five distinct capabilities working together:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Traceability —&lt;/strong&gt; Can you trace what happened, end to end? When Agent A calls Agent B, which calls Tool C, which accesses Database D, can you reconstruct the entire chain with timestamps and outcomes at every hop? Without traceability, incident response is guesswork.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Authorization provenance —&lt;/strong&gt; Can you prove why it was permitted? Not just “Agent A was allowed to call Agent B,” but “Agent A was allowed to call Agent B because Policy X grants agents with capability Y access to agents with risk-level Z.” This is the difference between a lock on the door and a sign-in sheet.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Identity and ownership —&lt;/strong&gt; Who owns this agent, and who is responsible when it acts? Every agent needs a verified identity and a clear human owner. Without it, accountability diffuses across components, and diffused accountability is no accountability at all.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Policy-based governance at scale —&lt;/strong&gt; Does your security model survive agent #101? With 10 agents, you can manage permissions by hand. With 100, you can’t. Scalable governance requires declarative, attribute-based policies that grow with the network, not against it.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Human oversight and intervention —&lt;/strong&gt; Can a human review, approve, or override? Effective oversight means visibility into what agents are doing, the ability to review interactions after the fact, and the power to modify policies or revoke access in real time.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;If you’re missing any one of these pillars, you have a gap that will surface during your next incident, audit, or regulatory review.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why Existing Approaches Can’t Deliver AI Agent Accountability
&lt;/h2&gt;

&lt;p&gt;Enterprises aren’t starting from zero; most have invested in network policies, API gateways, RBAC, and protocols like MCP and A2A. The problem isn’t a lack of tools. It’s that these tools were designed for model outputs (a world where services are deterministic, communication patterns are predictable, and humans make all the decisions), not autonomous actions.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://www.tigera.io/learn/guides/kubernetes-security/kubernetes-network-policy/" rel="noopener noreferrer"&gt;Network policies&lt;/a&gt; operate at the wrong abstraction level for agent accountability. They can say “pods in namespace A can reach pods in namespace B,” but they can’t say “Agent A with risk-level=low can only call agents with risk-level=low.” They have no concept of agent identity, capabilities, or policy attributes, and they produce no audit trail.&lt;/p&gt;
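
&lt;p&gt;To make the abstraction gap concrete, here is a sketch of roughly the most a standard Kubernetes NetworkPolicy can say about two communicating agents (namespace and label names are hypothetical). Nothing in it knows what an “agent”, a “capability”, or a “risk level” is:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight console"&gt;&lt;code&gt;kubectl apply -f - &lt;&lt;EOF
# Illustrative only: allows pods in namespace "agents-a" to reach pods
# labelled app=agent-b in namespace "agents-b". The policy reasons about
# namespaces, pods, and ports; it carries no agent identity, capability,
# or policy attributes, and it produces no audit trail.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-from-agents-a
  namespace: agents-b
spec:
  podSelector:
    matchLabels:
      app: agent-b
  policyTypes:
    - Ingress
  ingress:
    - from:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: agents-a
      ports:
        - protocol: TCP
          port: 8080
EOF
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;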

&lt;p&gt;API gateways handle north-south traffic but don’t understand the east-west, multi-hop nature of agent-to-agent communication. MCP and A2A solve the &lt;em&gt;how&lt;/em&gt; of agent communication, but explicitly assume someone else handles the &lt;em&gt;who&lt;/em&gt; and the &lt;em&gt;why&lt;/em&gt;. RBAC works at small scale but can’t express the nuanced, attribute-based policies that agent governance requires.&lt;/p&gt;

&lt;p&gt;The industry has solved agent communication and agent infrastructure. What’s missing is the accountability layer — the control plane that answers three questions for every agent interaction: Who authorized this? What policy permitted it? What’s the full record?&lt;/p&gt;

&lt;h2&gt;
  
  
  The AI Governance Gap Is Growing
&lt;/h2&gt;

&lt;p&gt;The enterprises that thrive in the agentic era won’t be the ones that deploy the most agents. They’ll be the ones that can prove their agents are operating within policy, trace every interaction end to end, and answer the question: &lt;em&gt;who’s accountable when the agent acts?&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;We wrote a strategic guide to help you get there.&lt;/strong&gt; Our whitepaper, &lt;em&gt;Accountable AI Agents: A Strategic Guide for AI &amp;amp; Security Leaders Governing Autonomous AI at Scale&lt;/em&gt;, breaks down the full framework — the five pillars of agent accountability, why existing approaches leave gaps, and the architectural principles your governance platform needs to deliver. It also provides the solution, the accountability maturity model, which guides how to fix these security and accountability gaps. No product demos, no fluff. Just the blueprint your leadership team needs before the next incident or regulation forces your hand.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://info.tigera.io/rs/805-GFH-732/images/Whitepaper_Accountability_for_AI_Agents.pdf?version=0" rel="noopener noreferrer"&gt;Get the strategic guide for accountable AI agents →&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The post &lt;a href="https://www.tigera.io/blog/your-ai-agents-are-autonomous-but-are-they-accountable/" rel="noopener noreferrer"&gt;Your AI Agents Are Autonomous. But Are They Accountable?&lt;/a&gt; appeared first on &lt;a href="https://www.tigera.io" rel="noopener noreferrer"&gt;Tigera – Creator of Calico&lt;/a&gt;.&lt;/p&gt;

</description>
      <category>featuredblog</category>
      <category>technicalblog</category>
      <category>aiagentsecurity</category>
      <category>bestpractices</category>
    </item>
    <item>
      <title>Deployed Is Not the Same as Ready: How Mature Is Your Kubernetes Environment?</title>
      <dc:creator>Alister Baroi</dc:creator>
      <pubDate>Thu, 16 Apr 2026 22:00:25 +0000</pubDate>
      <link>https://dev.to/tigeraio/deployed-is-not-the-same-as-ready-how-mature-is-your-kubernetes-environment-317h</link>
      <guid>https://dev.to/tigeraio/deployed-is-not-the-same-as-ready-how-mature-is-your-kubernetes-environment-317h</guid>
      <description>&lt;p&gt;Kubernetes adoption is no longer the challenge it once was. More than 82% of enterprises run containers in production, most of them on multiple Kubernetes clusters. Adoption, however, does not mean operational maturity. These are two very different things. It is one thing to deploy workloads to a cluster or two and quite another to do it securely, efficiently and at scale.&lt;/p&gt;

&lt;p&gt;This distinction matters because the gap between adoption and &lt;a href="https://www.tigera.io/lp/ebook-building-resilient-multi-cluster-kubernetes/" rel="noopener noreferrer"&gt;Kubernetes operational maturity&lt;/a&gt; is where risk accumulates. Operationally mature organizations ship faster, recover from incidents in minutes instead of hours and consistently pass compliance audits. They spend less time dealing with outages and more time delivering new services to their customers.&lt;/p&gt;

&lt;p&gt;So what separates maturity from adoption? It comes down to a handful of foundational capabilities that, when done well, result in measurable business impact. Operational maturity — the ability to run Kubernetes workloads securely, efficiently, and at scale, with consistent policy enforcement, cross-cluster observability, and automated incident recovery — is not a destination; it is a continuous process of strengthening the architectural pillars that keep your Kubernetes environment production-ready.&lt;/p&gt;

&lt;h2&gt;
  
  
  What does operational maturity look like?
&lt;/h2&gt;

&lt;p&gt;Operational maturity spans several interconnected areas from &lt;a href="https://www.tigera.io/learn/guides/kubernetes-security-best-practices/" rel="noopener noreferrer"&gt;Kubernetes security best practices&lt;/a&gt; to observability and multi-cluster connectivity that, taken together, determine how resilient, secure, and observable your Kubernetes environment truly is. One practical way to measure this is to walk through the capabilities your environment either has or does not have yet.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhhk9f58pofqvo44c5txa.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhhk9f58pofqvo44c5txa.png" width="800" height="515"&gt;&lt;/a&gt;&lt;br&gt;
&lt;em&gt;A running vs an operationally mature Kubernetes environment&lt;/em&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Can you effectively isolate workloads from each other?
&lt;/h3&gt;

&lt;p&gt;The flat network default, which allows pods to be created, destroyed, and moved on the fly (a core Kubernetes capability), also creates a wide-open door for lateral movement if a workload is compromised.&lt;/p&gt;

&lt;p&gt;A tiered policy model addresses this by organizing &lt;a href="http://tigera.io/learn/guides/kubernetes-security/kubernetes-network-policy/" rel="noopener noreferrer"&gt;network policies&lt;/a&gt; into layers of precedence, each owned by a different team. Security teams define high-priority guardrails—for example, blocking traffic to malicious destinations, enforcing tenant isolation—while platform teams secure infrastructure components and developers write fine-grained rules for their own applications. This separation of duties eliminates policy sprawl and ensures that a developer-created rule can never accidentally override a critical security baseline.&lt;/p&gt;
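
&lt;p&gt;As a rough sketch, the layering might look like this in Calico’s tiered policy model. The tier name, order values, and CIDR below are illustrative placeholders, and the example assumes tiered policy is available in your Calico edition:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight console"&gt;&lt;code&gt;calicoctl apply -f - &lt;&lt;EOF
# Illustrative only: a high-precedence "security" tier owned by the
# security team, with a guardrail that denies egress to a known-bad
# range before any lower-tier (platform or developer) policy runs.
apiVersion: projectcalico.org/v3
kind: Tier
metadata:
  name: security
spec:
  order: 100                       # lower order = evaluated earlier
---
apiVersion: projectcalico.org/v3
kind: GlobalNetworkPolicy
metadata:
  name: security.block-known-bad   # policies in a tier carry its name as a prefix
spec:
  tier: security
  order: 10
  selector: all()
  types:
    - Egress
  egress:
    - action: Deny
      destination:
        nets:
          - 203.0.113.0/24         # example "malicious destination" range
    - action: Pass                 # hand everything else to the next tier
EOF
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;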

&lt;h3&gt;
  
  
  Do you have a zero-trust security policy with pod to pod encryption and workload identity?
&lt;/h3&gt;

&lt;p&gt;In addition to isolation, security means a &lt;a href="https://www.tigera.io/learn/guides/zero-trust/" rel="noopener noreferrer"&gt;zero trust&lt;/a&gt; posture, and that in turn means mTLS for internal cluster traffic. mTLS has become a hard requirement, both for regulators and for security teams that have learned the hard way what unencrypted east-west traffic costs when something goes wrong.&lt;/p&gt;

&lt;p&gt;For organizations that have given up on service mesh, Istio ambient mode is worth a look. It delivers automatic mTLS and SPIFFE-based workload identity across all traffic without the resource cost of sidecars. L7 capabilities such as traffic shaping and advanced observability can be layered in selectively only for the services that need them.&lt;/p&gt;
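
&lt;p&gt;If you want to try it, the ambient data plane is installed once and then enabled per namespace. A minimal sketch, assuming &lt;code&gt;istioctl&lt;/code&gt; is installed and using a placeholder namespace name:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight console"&gt;&lt;code&gt;# Install Istio with the sidecar-less ambient profile, then opt a
# namespace into the ambient data plane to get automatic mTLS between
# its workloads. "payments" is a hypothetical namespace.
istioctl install --set profile=ambient -y
kubectl label namespace payments istio.io/dataplane-mode=ambient
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;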

&lt;p&gt;Security is the foundation and non-negotiable starting point on the journey towards a mature Kubernetes posture.&lt;/p&gt;

&lt;h3&gt;
  
  
  Does your ingress solution have all the capabilities you need without relying on vendor-specific annotations?
&lt;/h3&gt;

&lt;p&gt;The retirement of Ingress NGINX Controller was a wake-up call for many organizations, making them realize that ‘good enough’ is, in fact, not good enough. Migrating to a robust and future-proof implementation of &lt;a href="https://www.tigera.io/learn/guides/kubernetes-security/kubernetes-gateway-api/" rel="noopener noreferrer"&gt;Gateway API&lt;/a&gt; is one more step along the road to operational maturity.&lt;/p&gt;

&lt;p&gt;Ingress and traffic management are evolving rapidly. The Kubernetes Ingress API served its purpose for years, but reliance on annotations, limited protocol support, and a single-controller model have become constraints at scale. The Gateway API replaces it with a role-oriented model. This is more than a technical upgrade. It is a shift not only towards more granular and comprehensive traffic control but towards decentralized management where cluster administrators control the infrastructure and development teams define their application-specific routing rules.&lt;/p&gt;
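
&lt;p&gt;A minimal sketch of that role split in Gateway API terms: the platform team owns the &lt;code&gt;Gateway&lt;/code&gt;, and an application team owns the &lt;code&gt;HTTPRoute&lt;/code&gt; that attaches to it. Class, namespace, hostname, and service names below are placeholders:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight console"&gt;&lt;code&gt;kubectl apply -f - &lt;&lt;EOF
# Illustrative only. The platform team defines the shared entry point...
apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: shared-gateway
  namespace: infra
spec:
  gatewayClassName: example-gateway-class   # depends on your implementation
  listeners:
    - name: http
      protocol: HTTP
      port: 80
      allowedRoutes:
        namespaces:
          from: All
---
# ...while an application team routes its own service through it.
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: orders-route
  namespace: orders
spec:
  parentRefs:
    - name: shared-gateway
      namespace: infra
  hostnames:
    - orders.example.com
  rules:
    - matches:
        - path:
            type: PathPrefix
            value: /
      backendRefs:
        - name: orders
          port: 8080
EOF
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;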

&lt;h3&gt;
  
  
  Is egress getting the attention it needs?
&lt;/h3&gt;

&lt;p&gt;Egress traffic management is the often overlooked sibling of ingress control. Without dedicated egress controls, outbound traffic from your cluster uses the node’s IP address, which means different tenants and workloads become indistinguishable to the outside world. This makes audit trails unreliable, complicates compliance, and creates real security exposure.&lt;/p&gt;

&lt;p&gt;An &lt;a href="https://www.tigera.io/learn/guides/kubernetes-networking/egress-gateway/" rel="noopener noreferrer"&gt;egress gateway architecture&lt;/a&gt; assigns each tenant or namespace a dedicated, static IP address for outbound traffic. External services can then allowlist those specific addresses, firewall rules become deterministic, and your security team can trace any outbound connection back to the workload that initiated it.&lt;/p&gt;

&lt;p&gt;If your pods need to access external endpoints egress control deserves a place on your maturity roadmap, not on the back burner.&lt;/p&gt;

&lt;h3&gt;
  
  
  How do you connect your clusters?
&lt;/h3&gt;

&lt;p&gt;It is rare to find organizations with just one Kubernetes cluster in production. Spectro Cloud reported that &lt;a href="https://www.spectrocloud.com/state-of-kubernetes-2025#overview-report" rel="noopener noreferrer"&gt;large enterprises operate more than 20 clusters across five or more cloud environments&lt;/a&gt;. If you are running AI workloads that are more than a simple API for the company chatbot, deploying a multi-cluster architecture that isolates GPU-heavy training jobs from inference endpoints is a baseline expectation.&lt;/p&gt;

&lt;p&gt;Unfortunately, the traditional &lt;a href="https://www.tigera.io/learn/guides/kubernetes-networking/kubernetes-multi-cluster/" rel="noopener noreferrer"&gt;multi-cluster architecture&lt;/a&gt;, which relies on external DNS and load balancers, exposes your internal services and presents a real risk. Beyond the security exposure, it introduces operational drag that compounds with every cluster you add. We are talking about frustrating DNS propagation delays, security policies that have to be manually synchronized across environments and, of course, the inevitable configuration drift.&lt;/p&gt;

&lt;p&gt;Cluster mesh architecture, with its unified observability, Kubernetes-native service discovery that does not rely on external DNS and consistent inter-cluster security policies, is what can keep a complex multi-cluster environment from becoming a liability. Multi-cluster done well is a reliable measure of operational maturity.&lt;/p&gt;

&lt;h3&gt;
  
  
  Are you relying solely on hardware load balancers?
&lt;/h3&gt;

&lt;p&gt;Hardware load balancers were built for a pre-Kubernetes world. They have no native concept of pods, services, or namespaces, and every configuration change typically requires a ticket, a separate team, and a procurement cycle. As Kubernetes becomes the default platform for production workloads, that operational friction compounds. The more clusters you run and the more latency-sensitive your workloads become, the more the limitations of hardware-centric load balancing show up in your incident logs and your budget.&lt;/p&gt;

&lt;p&gt;A Kubernetes-native load balancer replaces the appliance with software that runs inside the cluster and understands its abstractions. Capacity scales horizontally by adding nodes, not by upgrading hardware. Configuration uses standard Kubernetes resources, which means no separate management console and no version drift between your cluster and your load balancer. For teams managing payment processing, trading systems, or real-time data pipelines, the combination of eBPF-based forwarding, consistent hashing, and graceful node draining delivers the reliability of enterprise appliances without the operational overhead.&lt;/p&gt;

&lt;h3&gt;
  
  
  Is your team still stitching together clues from kubectl and scattered logs, or do you have a single, unified view across your entire environment?
&lt;/h3&gt;

&lt;p&gt;Kubernetes environments can fail quietly. Services degrade, traffic patterns shift, and workloads compete for resources in ways that are invisible without the right instrumentation in place. In a single cluster, experienced engineers can often piece together what is happening from logs and metrics. Across multiple clusters, namespaces, and workload types that approach becomes highly inefficient and costly. Managing cost and efficiently tracking down problems is even harder, and more imperative, now that AI workloads, with their training jobs, inference endpoints and non-deterministic agents, often share infrastructure and resources with business-critical services.&lt;/p&gt;

&lt;p&gt;Unified observability is essential to keeping all the moving parts manageable. Without Kubernetes-aware telemetry that is enriched with metadata about namespaces, services, and workload identity, teams are operating blind. Mature observability means you can detect anomalous traffic patterns in real time, trace requests across cluster boundaries, and generate the audit evidence that compliance frameworks demand. It turns reactive firefighting into proactive operations. Organizations that strive for operational maturity cannot do without it.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F683zan9brf76kap19tlu.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F683zan9brf76kap19tlu.png" width="800" height="474"&gt;&lt;/a&gt;&lt;br&gt;
&lt;em&gt;Where are you on the journey to operational maturity?&lt;/em&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Where do you stand?
&lt;/h2&gt;

&lt;p&gt;No organization achieves Kubernetes operational maturity overnight, and not everything needs to be optimized immediately. What matters is knowing where you stand today so you can prioritize items that will have the greatest impact on your security posture, operational efficiency, and ability to support your current and future workloads. Whether you are still relying on default-allow networking, beginning to explore egress controls, or already running a multi-cluster mesh, there is always a next step on the maturity curve.&lt;/p&gt;

&lt;p&gt;Read our ebook, &lt;a href="https://www.tigera.io/lp/ebook-building-resilient-multi-cluster-kubernetes/" rel="noopener noreferrer"&gt;Building Resilient Multi-Cluster Kubernetes&lt;/a&gt; to get a practical framework for closing the gap between Kubernetes adoption and operational readiness.&lt;/p&gt;

&lt;p&gt;The post &lt;a href="https://www.tigera.io/blog/deployed-is-not-the-same-as-ready-how-mature-is-your-kubernetes-environment/" rel="noopener noreferrer"&gt;Deployed Is Not the Same as Ready: How Mature Is Your Kubernetes Environment?&lt;/a&gt; appeared first on &lt;a href="https://www.tigera.io" rel="noopener noreferrer"&gt;Tigera – Creator of Calico&lt;/a&gt;.&lt;/p&gt;

</description>
      <category>featuredblog</category>
      <category>technicalblog</category>
      <category>bestpractices</category>
      <category>products</category>
    </item>
    <item>
      <title>Beyond the Prompt: AI Agent Design Patterns and the New Governance Gap</title>
      <dc:creator>Alister Baroi</dc:creator>
      <pubDate>Wed, 15 Apr 2026 19:25:41 +0000</pubDate>
      <link>https://dev.to/tigeraio/beyond-the-prompt-ai-agent-design-patterns-and-the-new-governance-gap-4eki</link>
      <guid>https://dev.to/tigeraio/beyond-the-prompt-ai-agent-design-patterns-and-the-new-governance-gap-4eki</guid>
      <description>&lt;p&gt;If you are treating Large Language Models (LLMs) like simple question-and-answer machines, you are leaving their most transformative potential on the table. The industry has officially shifted from zero-shot prompting to structured &lt;a href="https://youtu.be/GDm_uH6VxPY?si=xsD64NCIrkhEU71d" rel="noopener noreferrer"&gt;AI agent design patterns&lt;/a&gt; and agentic workflows where AI iteratively reasons, uses external tools, and collaborates to solve complex engineering problems. These design patterns are the architectural blueprints that determine how autonomous Agentic AI systems work and interact with your infrastructure.&lt;/p&gt;

&lt;p&gt;But as these systems proliferate faster than organizations can govern them, they introduce a critical &lt;a href="https://www.tigera.io/blog/securing-ai-workloads-in-kubernetes-why-traditional-network-security-isnt-enough/" rel="noopener noreferrer"&gt;AI agent security&lt;/a&gt; risk: By the end of 2026, &lt;a href="https://www.gartner.com/en/newsroom/press-releases/2025-08-26-gartner-predicts-40-percent-of-enterprise-apps-will-feature-task-specific-ai-agents-by-2026-up-from-less-than-5-percent-in-2025" rel="noopener noreferrer"&gt;40% of enterprise applications will feature embedded AI agents&lt;/a&gt;, and those teams will urgently need purpose-built strategies to govern this new autonomous workforce before it becomes the next major shadow IT crisis.&lt;/p&gt;

&lt;p&gt;Before you can secure these autonomous systems, you have to understand how they are built. Here is a technical breakdown of the current AI Agent design patterns you need to know, and the specific security blind spots each design pattern creates.&lt;/p&gt;

&lt;h2&gt;
  
  
  1. The Foundational Execution Patterns
&lt;/h2&gt;

&lt;p&gt;Building reliable AI systems comes down to how you route the cognitive load. Here are the three baseline structural patterns:&lt;/p&gt;

&lt;h3&gt;
  
  
  A. The Single Agent (Tool Use)
&lt;/h3&gt;

&lt;p&gt;In this pattern, a single LLM is equipped with access to external, deterministic tools (APIs, databases, bash environments, or the Model Context Protocol).&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;How it works:&lt;/strong&gt; The agent receives a prompt, realizes it lacks the necessary context, calls a tool, ingests the output, and formulates a final response.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;The Governance Challenge:&lt;/strong&gt; When an agent is granted API keys to query your cluster, it operates with implicit trust to access that data. If compromised via prompt injection, that single agent becomes an unmonitored vector for data exfiltration.&lt;/li&gt;
&lt;/ul&gt;
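
&lt;p&gt;To make the loop concrete, here is a minimal Python sketch of a single-agent tool-use harness. It is illustrative only: &lt;code&gt;call_llm&lt;/code&gt;, the &lt;code&gt;TOOLS&lt;/code&gt; registry, and the response shape are assumptions rather than any particular framework’s API. The key point is that the harness, not the model, decides whether a requested tool may run.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;# Illustrative single-agent tool-use loop. call_llm and TOOLS are
# placeholder assumptions, not a specific framework's API.

def get_weather(location):
    # A deterministic external tool the agent is allowed to call.
    return {"location": location, "forecast": "cloudy"}

TOOLS = {"get_weather": get_weather}

def run_single_agent(prompt, call_llm):
    # 1. Ask the model; it may answer directly or request a tool call.
    reply = call_llm(prompt)
    if reply.get("tool_call") is None:
        return reply["content"]

    # 2. The harness, not the model, decides whether the tool is allowed.
    name = reply["tool_call"]["name"]
    args = reply["tool_call"]["arguments"]
    if name not in TOOLS:
        raise PermissionError(f"tool {name!r} is not authorized for this agent")

    # 3. Execute the tool and feed its output back for the final answer.
    result = TOOLS[name](**args)
    return call_llm(f"{prompt}\nTool result: {result}")["content"]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;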

&lt;h3&gt;
  
  
  B. The Sequential Agent (The Assembly Line)
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fneqjmgr81wiah50e528u.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fneqjmgr81wiah50e528u.png" width="800" height="151"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;When a single agent fails at a complex task, we break the task down into a pipeline. Sequential agents operate in a linear hand-off, where the output of &lt;em&gt;Agent A&lt;/em&gt; becomes the input of &lt;em&gt;Agent B&lt;/em&gt;.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;How it works:&lt;/strong&gt; You deploy specialized micro-agents. Agent 1 extracts data, Agent 2 analyzes it, and Agent 3 formats the final report.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;The Governance Challenge:&lt;/strong&gt; As data flows between agents, maintaining an audit lineage becomes incredibly complex. You cannot easily trace which tools Agent 2 called based on Agent 1’s corrupted input.&lt;/li&gt;
&lt;/ul&gt;
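
&lt;p&gt;A minimal sketch of the hand-off is below, with each stage stubbed as a plain function so the structure stays visible; the &lt;code&gt;lineage&lt;/code&gt; list hints at the audit-trail problem described above. Stage names and data shapes are illustrative assumptions, not a prescribed design.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;# Illustrative sequential (assembly-line) pipeline. In practice each stage
# would wrap its own LLM call; plain functions keep the hand-off visible.

def extract(raw_text):
    return {"records": raw_text.splitlines(), "source": "upload"}

def analyze(extracted):
    return {"count": len(extracted["records"]), "source": extracted["source"]}

def report(analysis):
    return f"Analyzed {analysis['count']} records from {analysis['source']}"

def run_pipeline(raw_text):
    output = raw_text
    lineage = []  # audit trail of every hand-off between agents
    for stage in (extract, analyze, report):
        output = stage(output)
        lineage.append({"stage": stage.__name__, "output": output})
    return output, lineage
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;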

&lt;h3&gt;
  
  
  C. The Parallel Agent (Concurrency &amp;amp; Voting)
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fm6ocfnmpe6u3faqpmx3k.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fm6ocfnmpe6u3faqpmx3k.png" width="800" height="310"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;To combat the latency of sequential pipelines, the Parallel pattern fans out tasks to multiple specialized agents simultaneously.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;How it works:&lt;/strong&gt; A router agent delegates sub-tasks to multiple worker agents concurrently. Once they finish, a “Judge” or “Synthesizer” agent aggregates the parallel outputs into a cohesive result.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;The Governance Challenge:&lt;/strong&gt; You now have multiple autonomous agents acting concurrently. Traditional security tools built for deterministic services cannot provide the visibility or control required for these non-deterministic autonomous actions.&lt;/li&gt;
&lt;/ul&gt;
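
&lt;p&gt;The fan-out and judge steps can be sketched with a thread pool, as below. The worker and judge functions are placeholders standing in for real LLM-backed agents; only the concurrency structure is the point here.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;# Illustrative fan-out / judge pattern using a thread pool. The workers and
# the judge are placeholders for real LLM-backed agents.
from concurrent.futures import ThreadPoolExecutor

def research_worker(task):
    return f"research notes for {task}"

def code_worker(task):
    return f"draft implementation for {task}"

def judge(outputs):
    # A real judge agent would score or merge candidates with another LLM call.
    return max(outputs, key=len)

def run_parallel(task):
    workers = [research_worker, code_worker]
    with ThreadPoolExecutor(max_workers=len(workers)) as pool:
        futures = [pool.submit(worker, task) for worker in workers]
        outputs = [future.result() for future in futures]
    return judge(outputs)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;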

&lt;h2&gt;
  
  
  2. The Advanced Cognitive Patterns That Complicate AI Agent Security
&lt;/h2&gt;

&lt;p&gt;To make agents truly autonomous, developers are giving them the ability to “think” about their own work. These cognitive patterns drastically improve output quality, but introduce severe behavioral unpredictability.&lt;/p&gt;

&lt;h3&gt;
  
  
  A. The Reflection Pattern (Critic &amp;amp; Refiner)
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F27phjiq6j9pfqszg08tl.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F27phjiq6j9pfqszg08tl.png" width="800" height="163"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The Reflection pattern pairs a Generator agent with a Critic agent.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;How it works:&lt;/strong&gt; The Generator outputs a first draft. The Critic evaluates it against guardrails, and the Generator iteratively refines the output until it passes the Critic’s checks.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Why it matters:&lt;/strong&gt; Wrapping an older model (like GPT-3.5) in a Reflection loop often produces higher-quality, more reliable code than a zero-shot prompt to a cutting-edge model (like GPT-5.4 Pro).&lt;/li&gt;
&lt;/ul&gt;
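
&lt;p&gt;A hedged sketch of the loop follows: &lt;code&gt;generate()&lt;/code&gt; and &lt;code&gt;critique()&lt;/code&gt; stand in for real LLM calls and guardrail checks, and the stopping rule (accept when the critic returns no feedback, give up after a few rounds) is one common choice rather than a prescribed one.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;# Illustrative generator/critic loop. generate() and critique() stand in
# for real LLM calls and guardrail checks.

def generate(task, feedback=None):
    draft = f"def solve():  # solution for {task}"
    if feedback:
        draft = f"{draft}  # revised: {feedback}"
    return draft

def critique(draft):
    # Return None when the draft passes the guardrails, else actionable feedback.
    if "revised" not in draft:
        return "add input validation"
    return None

def reflect(task, max_rounds=3):
    feedback = None
    for _ in range(max_rounds):
        draft = generate(task, feedback)
        feedback = critique(draft)
        if feedback is None:
            return draft
    return draft  # give up after max_rounds and return the last attempt
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;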

&lt;h3&gt;
  
  
  B. The Planning Pattern
&lt;/h3&gt;

&lt;p&gt;For highly ambiguous goals, agents need the autonomy to devise their own roadmaps.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;How it works:&lt;/strong&gt; Given a high-level goal, the Planning agent decomposes it into a Directed Acyclic Graph (DAG) of sub-tasks. It executes the plan step-by-step, adapting dynamically if a step fails (e.g., “Dependency missing, re-routing to fetch from alternate repo”).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;The Governance Challenge:&lt;/strong&gt; AI agents don’t follow scripts. They autonomously choose which tools to call, which data to access, and which agents to collaborate with, making static security models completely obsolete.&lt;/li&gt;
&lt;/ul&gt;
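
&lt;p&gt;Here is a simplified plan-then-execute sketch over a task DAG. In a real system the plan itself would come from the LLM; the fallback branch mirrors the dependency re-routing behavior described above. Task names and the failure mode are illustrative assumptions.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;# Illustrative plan-then-execute sketch over a DAG of sub-tasks. A real
# planner would derive the DAG from the goal with an LLM call.

def plan(goal):
    # task name: list of tasks it depends on
    return {
        "fetch_dependency": [],
        "build": ["fetch_dependency"],
        "test": ["build"],
    }

def execute(task):
    if task == "fetch_dependency":
        raise RuntimeError("primary repo unreachable")
    return f"{task} ok"

def execute_fallback(task):
    # Adapt dynamically instead of aborting the whole plan.
    return f"{task} ok (alternate repo)"

def run_plan(goal):
    dag = plan(goal)
    done = {}
    while len(done) != len(dag):
        for task, deps in dag.items():
            if task in done or any(dep not in done for dep in deps):
                continue
            try:
                done[task] = execute(task)
            except RuntimeError:
                done[task] = execute_fallback(task)
    return done
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;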

&lt;h2&gt;
  
  
  3. The Cold Start Problem: Why AI Agent Governance Can’t Wait
&lt;/h2&gt;

&lt;p&gt;The ultimate evolution of these patterns is &lt;strong&gt;Multi-Agent Collaboration&lt;/strong&gt;, a “society of minds” system where diverse agents with distinct personas (The Architect, The Security Engineer, The QA Tester) debate, share data, and execute code collaboratively across boundaries. &lt;strong&gt;AI agent security&lt;/strong&gt; — &lt;em&gt;the discipline of discovering, controlling, and auditing what autonomous agents can access and do&lt;/em&gt; — requires a fundamentally different approach than traditional application security. Each pattern described above introduces distinct risks, and in combination, they create attack surfaces that traditional security models were never designed to handle.&lt;/p&gt;

&lt;p&gt;But as AI/ML engineering teams race to deploy and scale these &lt;a href="https://www.tigera.io/blog/how-ai-agents-communicate-understanding-the-a2a-protocol-for-kubernetes/" rel="noopener noreferrer"&gt;Agent-to-Agent (A2A) architectures&lt;/a&gt;, most enterprises realize they don’t have any inventory of the AI agents running in their environment, including shadow agents deployed by teams outside official channels. A massive infrastructure challenge arises: &lt;strong&gt;How do these agents communicate securely?&lt;/strong&gt; You cannot govern what you cannot see.&lt;/p&gt;

&lt;p&gt;Whether your AI agents run in Kubernetes, cloud environments, on-premises, at the edge, or on developer laptops, governance that only covers one environment is governance with holes.&lt;/p&gt;

&lt;h3&gt;
  
  
  Enter Tigera Agent Governance (TAG)
&lt;/h3&gt;

&lt;p&gt;We are moving past the era of human-in-the-loop chat interfaces into human-on-the-loop autonomous systems. To bridge this gap, Tigera is introducing &lt;a href="https://www.tigera.io/tigera-products/tigera-agent-governance/" rel="noopener noreferrer"&gt;TAG&lt;/a&gt;: the platform with the discipline to discover, authenticate, authorize, enforce, and audit every agent action, wherever agents run.&lt;/p&gt;

&lt;p&gt;TAG is the first platform to own the full five-pillar framework required for modern AI workloads:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Discovery:&lt;/strong&gt; Central registry and auto-discovery of shadow agents across your infrastructure.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Authentication:&lt;/strong&gt; Cryptographic trust giving every agent a verified identity.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Authorization:&lt;/strong&gt; Default-deny, fine-grained access control with tool-level binding.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Enforcement:&lt;/strong&gt; Real-time enforcement that enables development velocity without bureaucratic blockers.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Governance:&lt;/strong&gt; Full audit lineage, service graph visualization, and board-ready compliance reporting.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Your AI agents are making decisions. Do you know what they’re authorized to do?&lt;/strong&gt; Do not wait for an autonomous agent to go rogue. Secure your next-generation architecture with universal governance built for the Agentic AI era.&lt;br&gt;&lt;br&gt;
→ &lt;a href="https://www.tigera.io/contact-tigera-agent-governance/" rel="noopener noreferrer"&gt;Request Early Access to TAG&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The post &lt;a href="https://www.tigera.io/blog/beyond-the-prompt-ai-agent-design-patterns-and-the-new-governance-gap/" rel="noopener noreferrer"&gt;Beyond the Prompt: AI Agent Design Patterns and the New Governance Gap&lt;/a&gt; appeared first on &lt;a href="https://www.tigera.io" rel="noopener noreferrer"&gt;Tigera – Creator of Calico&lt;/a&gt;.&lt;/p&gt;

</description>
      <category>technicalblog</category>
      <category>aiagentsecurity</category>
      <category>products</category>
    </item>
    <item>
      <title>How to Stub LLMs for AI Agent Security Testing and Governance</title>
      <dc:creator>Alister Baroi</dc:creator>
      <pubDate>Thu, 02 Apr 2026 14:15:28 +0000</pubDate>
      <link>https://dev.to/tigeraio/how-to-stub-llms-for-ai-agent-security-testing-and-governance-34n2</link>
      <guid>https://dev.to/tigeraio/how-to-stub-llms-for-ai-agent-security-testing-and-governance-34n2</guid>
      <description>&lt;p&gt;_ &lt;strong&gt;Note:&lt;/strong&gt; The core architecture for this pattern was introduced by &lt;a href="https://www.linkedin.com/in/isaac-hawley-9481743/" rel="noopener noreferrer"&gt;Isaac Hawley&lt;/a&gt; from Tigera._&lt;/p&gt;

&lt;p&gt;If you are building an AI agent that relies on tool calling, complex routing, or the &lt;a href="https://modelcontextprotocol.io/" rel="noopener noreferrer"&gt;Model Context Protocol (MCP)&lt;/a&gt;, you’re not just building a chatbot anymore. You are building an autonomous system with access to your internal APIs.&lt;/p&gt;

&lt;p&gt;With that power comes a massive security and governance headache, and AI agent security testing is where most teams hit a wall. &lt;strong&gt;How do you definitively prove that your agent’s identity and access management (IAM) actually works?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The scale of the problem is hard to overstate. Microsoft’s telemetry shows that &lt;a href="https://www.microsoft.com/en-us/security/blog/2026/02/10/80-of-fortune-500-use-active-ai-agents-observability-governance-and-security-shape-the-new-frontier/" rel="noopener noreferrer"&gt;80% of Fortune 500 companies now run active AI agents&lt;/a&gt;, yet only 47% have implemented specific AI security controls. Most teams are deploying agents faster than they can test them.&lt;/p&gt;

&lt;p&gt;If an agent is hijacked via prompt injection, or simply hallucinates a destructive action, does your governance layer stop it? Testing this usually forces engineers into a frustrating trade-off:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Use the real API (Gemini, OpenAI):&lt;/strong&gt; Real models are heavily RLHF’d to be safe and polite. It is incredibly difficult (and non-deterministic) to intentionally force a real model to “go rogue” and consistently output malicious tool calls so you can test your security boundaries.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Mock the internal tools only:&lt;/strong&gt; You test your Python or Go functions in isolation, but you never actually test the “Agent Loop”—meaning you aren’t testing if the harness correctly applies the user’s OAuth tokens or &lt;a href="https://docs.tigera.io/calico/latest/network-policy/get-started/kubernetes-policy/kubernetes-network-policy" rel="noopener noreferrer"&gt;Role-Based Access Control (RBAC)&lt;/a&gt; to the LLM’s requested tool call.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Recently, Isaac Hawley introduced a much better pattern: &lt;strong&gt;the Stub Model&lt;/strong&gt;, a way to stub your LLM for testing that makes your security assertions completely deterministic.&lt;/p&gt;

&lt;p&gt;A Stub Model (or mock LLM) is a deterministic, non-AI replacement for a real language model that you inject into your agent harness during testing. It returns hardcoded tool-call requests — including deliberately malicious ones — so you can prove that your security layer correctly intercepts and blocks unauthorized actions without relying on a live model API.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Core Concept: A “Malicious” Router for AI Agent Security Testing
&lt;/h2&gt;

&lt;p&gt;Instead of hitting a real model API during tests, we inject a &lt;code&gt;StubLLM&lt;/code&gt; that implements our system’s core LLM interface.&lt;/p&gt;

&lt;p&gt;The stub doesn’t use any AI. Instead, it parses incoming prompts for specific testing triggers and returns hardcoded, completely predictable tool calls. Crucially, this forces your agent harness to &lt;strong&gt;actually execute the real underlying tool pipeline&lt;/strong&gt;. You aren’t just faking a final text response; you are making the LLM trigger your application’s real execution loop.&lt;/p&gt;

&lt;p&gt;From a governance perspective, this is a superpower. You can program the stub to request highly privileged actions (like &lt;code&gt;drop_database&lt;/code&gt; or &lt;code&gt;read_all_users&lt;/code&gt;), and then write strict, lightning-fast assertions to prove that your Agent Harness intercepts the call, checks the executing user’s identity, and blocks the action.&lt;/p&gt;

&lt;p&gt;Here is how you can implement and test this security pattern in both Python and Go.&lt;/p&gt;

&lt;h3&gt;
  
  
  Python: Proving RBAC &amp;amp; Tool Governance
&lt;/h3&gt;

&lt;p&gt;In Python, we use a &lt;code&gt;Protocol&lt;/code&gt; to define our LLM dependency, and then build a Stub that intentionally requests unauthorized actions.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;typing&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;List&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Optional&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Protocol&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;pydantic&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;BaseModel&lt;/span&gt;
&lt;span class="c1"&gt;# Define standard tool call response formats
&lt;/span&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;ToolCall&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;BaseModel&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
   &lt;span class="nb"&gt;id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;
   &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;
   &lt;span class="n"&gt;arguments&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;dict&lt;/span&gt;
&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;Response&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;BaseModel&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
   &lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Optional&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;
   &lt;span class="n"&gt;tool_calls&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Optional&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;List&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;ToolCall&lt;/span&gt;&lt;span class="p"&gt;]]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;
&lt;span class="c1"&gt;# Define the LLM Interface
&lt;/span&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;LLMClient&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Protocol&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
   &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;generate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;Response&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
       &lt;span class="bp"&gt;...&lt;/span&gt;
&lt;span class="c1"&gt;# Implement the Stub Model for Security Testing
&lt;/span&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;StubLLM&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
   &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;generate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;Response&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
       &lt;span class="c1"&gt;# 1. Standard authorized action
&lt;/span&gt;       &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;MOCK_WEATHER_TOOL&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
           &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nc"&gt;Response&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
               &lt;span class="n"&gt;tool_calls&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nc"&gt;ToolCall&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;id&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;call_1&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;get_weather&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;arguments&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;location&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;London&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;})]&lt;/span&gt;
           &lt;span class="p"&gt;)&lt;/span&gt;

       &lt;span class="c1"&gt;# 2. Malicious / Unauthorized action for Governance testing
&lt;/span&gt;       &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;MOCK_UNAUTHORIZED_DELETE&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nc"&gt;Response&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
               &lt;span class="n"&gt;tool_calls&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;
                   &lt;span class="nc"&gt;ToolCall&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
                       &lt;span class="nb"&gt;id&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;call_malicious_999&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                       &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;delete_user_account&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                       &lt;span class="n"&gt;arguments&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;user_id&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;admin_01&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="c1"&gt;# The LLM is trying something dangerous!
&lt;/span&gt;                   &lt;span class="p"&gt;)&lt;/span&gt;
               &lt;span class="p"&gt;]&lt;/span&gt;
           &lt;span class="p"&gt;)&lt;/span&gt;
       &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nc"&gt;Response&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;This is a stubbed standard response.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;The Security Unit Test (&lt;code&gt;pytest&lt;/code&gt;):&lt;/strong&gt; With the stub in place, we can test that our Agent correctly parses the dangerous tool call, evaluates the user’s identity, and &lt;strong&gt;blocks&lt;/strong&gt; the execution of the real local Python function.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;pytest&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;test_agent_rbac_blocks_unauthorized_tool_execution&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
&lt;span class="c1"&gt;# Arrange: Inject our deterministic stub into the Agent
&lt;/span&gt;&lt;span class="n"&gt;stubbed_llm&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;StubLLM&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="c1"&gt;# Initialize our agent harness with a heavily restricted "guest" identity
&lt;/span&gt;&lt;span class="n"&gt;agent&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Agent&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;llm_client&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;stubbed_llm&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;user_role&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;guest_user&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="c1"&gt;# Act: Send the trigger that forces our stub to attempt a destructive tool call
&lt;/span&gt;&lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;agent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;run&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Please MOCK_UNAUTHORIZED_DELETE&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="c1"&gt;# Assert: Verify the Agent's governance harness intercepted the call,
# checked the "guest_user" identity, and blocked the REAL local tool.
&lt;/span&gt;&lt;span class="k"&gt;assert&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;status&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;blocked_by_policy&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="k"&gt;assert&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;tool_executed&lt;/span&gt; &lt;span class="ow"&gt;is&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;
&lt;span class="k"&gt;assert&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Insufficient permissions to execute delete_user_account&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;error_message&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Go: Validating OAuth &amp;amp; Identity Boundaries
&lt;/h3&gt;

&lt;p&gt;In Go, this pattern shines for validating complex OAuth scopes or identity propagation in multi-agent networks.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="k"&gt;package&lt;/span&gt; &lt;span class="n"&gt;llm&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
   &lt;span class="s"&gt;"encoding/json"&lt;/span&gt;
   &lt;span class="s"&gt;"strings"&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;type&lt;/span&gt; &lt;span class="n"&gt;ToolCall&lt;/span&gt; &lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
   &lt;span class="n"&gt;ID&lt;/span&gt; &lt;span class="kt"&gt;string&lt;/span&gt; &lt;span class="s"&gt;`json:"id"`&lt;/span&gt;
   &lt;span class="n"&gt;Name&lt;/span&gt; &lt;span class="kt"&gt;string&lt;/span&gt; &lt;span class="s"&gt;`json:"name"`&lt;/span&gt;
   &lt;span class="n"&gt;Arguments&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;&lt;span class="kt"&gt;byte&lt;/span&gt; &lt;span class="s"&gt;`json:"arguments"`&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="k"&gt;type&lt;/span&gt; &lt;span class="n"&gt;Response&lt;/span&gt; &lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
   &lt;span class="n"&gt;Content&lt;/span&gt; &lt;span class="kt"&gt;string&lt;/span&gt; &lt;span class="s"&gt;`json:"content,omitempty"`&lt;/span&gt;
   &lt;span class="n"&gt;ToolCalls&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;&lt;span class="n"&gt;ToolCall&lt;/span&gt; &lt;span class="s"&gt;`json:"tool_calls,omitempty"`&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="k"&gt;type&lt;/span&gt; &lt;span class="n"&gt;Client&lt;/span&gt; &lt;span class="k"&gt;interface&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
   &lt;span class="n"&gt;Generate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;prompt&lt;/span&gt; &lt;span class="kt"&gt;string&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;Response&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;error&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="k"&gt;type&lt;/span&gt; &lt;span class="n"&gt;StubLLM&lt;/span&gt; &lt;span class="k"&gt;struct&lt;/span&gt;&lt;span class="p"&gt;{}&lt;/span&gt;
&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="n"&gt;NewStubLLM&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;StubLLM&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
   &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;StubLLM&lt;/span&gt;&lt;span class="p"&gt;{}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;s&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;StubLLM&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;Generate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;prompt&lt;/span&gt; &lt;span class="kt"&gt;string&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;Response&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;error&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
   &lt;span class="c"&gt;// Simulate an Agent trying to access a secure internal system via MCP&lt;/span&gt;
   &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;strings&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Contains&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"MOCK_ACCESS_SECURE_VAULT"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
       &lt;span class="n"&gt;args&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Marshal&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;map&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="kt"&gt;string&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="kt"&gt;string&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="s"&gt;"secret_id"&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="s"&gt;"prod_db_password"&lt;/span&gt;&lt;span class="p"&gt;})&lt;/span&gt;

       &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;Response&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
           &lt;span class="n"&gt;ToolCalls&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;&lt;span class="n"&gt;ToolCall&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
               &lt;span class="p"&gt;{&lt;/span&gt;
                   &lt;span class="n"&gt;ID&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="s"&gt;"call_vault_123"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                   &lt;span class="n"&gt;Name&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="s"&gt;"read_secure_vault"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                   &lt;span class="n"&gt;Arguments&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="n"&gt;args&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
               &lt;span class="p"&gt;},&lt;/span&gt;
           &lt;span class="p"&gt;},&lt;/span&gt;
       &lt;span class="p"&gt;},&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt;
   &lt;span class="p"&gt;}&lt;/span&gt;
   &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;Response&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="n"&gt;Content&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="s"&gt;"Standard response"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;The Security Unit Test (&lt;code&gt;testing&lt;/code&gt;):&lt;/strong&gt; We write a test to guarantee that if the LLM decides to hit the vault, the Agent harness forces the underlying tool to respect the provided OAuth context.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="k"&gt;package&lt;/span&gt; &lt;span class="n"&gt;agent_test&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
&lt;span class="s"&gt;"testing"&lt;/span&gt;
&lt;span class="s"&gt;"errors"&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="n"&gt;TestAgentEnforcesOAuthScopes&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;t&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;testing&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;T&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
&lt;span class="c"&gt;// Arrange: Initialize the agent with the Stub model&lt;/span&gt;
&lt;span class="n"&gt;stub&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;llm&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;NewStubLLM&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="c"&gt;// Create an agent context with a standard user OAuth token (No Vault Access)&lt;/span&gt;
&lt;span class="n"&gt;mockOAuthContext&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;identity&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;NewContext&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;identity&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;WithScope&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"read:public"&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="n"&gt;myAgent&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;agent&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;New&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;stub&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;mockOAuthContext&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="c"&gt;// Act: Trigger the LLM to request a highly privileged tool call&lt;/span&gt;
&lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;myAgent&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Run&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"I need you to MOCK_ACCESS_SECURE_VAULT"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="c"&gt;// Assert: Verify the harness evaluated the tool against the OAuth scope and blocked it&lt;/span&gt;
&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
&lt;span class="n"&gt;t&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Fatalf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"CRITICAL SECURITY FAILURE: Agent executed secure vault tool without proper OAuth scope"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="n"&gt;errors&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Is&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;err&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ErrUnauthorizedToolExecution&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
&lt;span class="n"&gt;t&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Errorf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Expected authorization error, got: %v"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ExecutedTool&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="s"&gt;"read_secure_vault"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
&lt;span class="n"&gt;t&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Errorf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"The real tool was executed despite lack of permissions!"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Why Security &amp;amp; Governance Teams Love This Architecture
&lt;/h2&gt;

&lt;p&gt;By treating the LLM like any other untrusted external dependency, we achieve total control over our agent’s testing environment.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Auditable Proof of Governance:&lt;/strong&gt; You now have concrete CI/CD tests proving that your agent respects OAuth scopes, RBAC, and identity guardrails. You aren’t just hoping the model behaves; you are proving the harness defends against it when it doesn’t.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Tests the Real Agent Harness:&lt;/strong&gt; Because the LLM returns a perfectly formatted tool call request, your application code actually executes its real security middleware. You validate the entire execution loop, not just a mocked final answer.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Lightning Fast &amp;amp; Free:&lt;/strong&gt; You can run thousands of these security edge-case tests in milliseconds without spending a dime on API tokens or exposing secrets in your CI pipeline.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Force Prompt Injection Scenarios:&lt;/strong&gt; You can easily stub the LLM to return tool arguments containing SQL injection or XSS payloads to ensure your local tools sanitize inputs provided by the AI (see the sketch after this list).&lt;/li&gt;
&lt;/ul&gt;
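
&lt;p&gt;As a sketch of that last point, the stub below returns a tool call whose argument carries a classic SQL injection payload. It reuses the &lt;code&gt;ToolCall&lt;/code&gt;, &lt;code&gt;Response&lt;/code&gt;, and &lt;code&gt;Agent&lt;/code&gt; names from the Python example above, and the &lt;code&gt;blocked_by_validation&lt;/code&gt; status it asserts is an assumption about your harness rather than a prescribed API.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;# Illustrative extension of the StubLLM idea: deliver a SQL injection payload
# through a tool argument and assert it never reaches the real tool.
# The response fields asserted here mirror the assumptions of the earlier test.

class InjectionStubLLM:
    def generate(self, prompt):
        if "MOCK_SQL_INJECTION" in prompt:
            return Response(tool_calls=[ToolCall(
                id="call_inject_1",
                name="lookup_order",
                arguments={"order_id": "1; DROP TABLE orders;--"},
            )])
        return Response(content="This is a stubbed standard response.")

def test_tool_layer_rejects_injected_arguments():
    agent = Agent(llm_client=InjectionStubLLM(), user_role="guest_user")
    response = agent.run("Please MOCK_SQL_INJECTION")
    # The real lookup_order tool should validate its input, never execute it.
    assert response.status == "blocked_by_validation"
    assert response.tool_executed is None
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;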

&lt;h2&gt;
  
  
  The Trade-Offs: What the Stub Model DOESN’T Test
&lt;/h2&gt;

&lt;p&gt;As powerful as this architecture is for testing your infrastructure, it’s important to acknowledge that it is not a silver bullet. There are two major things the Stub Model cannot test:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;It tests the pipes, not the brain:&lt;/strong&gt; The stub proves your system can correctly block a malicious tool call, but it does &lt;em&gt;not&lt;/em&gt; test whether your system prompt is resilient to &lt;a href="https://www.tigera.io/learn/guides/llm-security/prompt-injection/" rel="noopener noreferrer"&gt;prompt injection&lt;/a&gt; in the first place. You still need LLM-as-a-judge pipelines and continuous evaluation frameworks to test your model’s actual reasoning capabilities.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Vendor Schema Drift:&lt;/strong&gt; If OpenAI, Anthropic, or Google update the shape of their underlying JSON tool-call schema, your hardcoded stub tests will still pass with flying colors while your production environment crashes. You still need a handful of real, end-to-end (E2E) smoke tests running against the live API on a nightly basis to catch vendor drift (a minimal example follows this list).&lt;/li&gt;
&lt;/ol&gt;
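
&lt;p&gt;For that second trade-off, a minimal nightly smoke test might look like the sketch below. &lt;code&gt;RealLLMClient&lt;/code&gt; is an assumption: whatever production adapter implements your &lt;code&gt;LLMClient&lt;/code&gt; interface. The goal is simply to assert the shape of real tool calls so schema drift fails loudly before your stub-backed suite lulls you into false confidence.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;# Illustrative nightly smoke test against the live provider API.
# RealLLMClient is assumed to be your production adapter for the LLMClient
# interface; only the shape of the returned tool calls is being checked.
import os

import pytest

@pytest.mark.skipif(
    os.environ.get("RUN_E2E_SMOKE") != "1",
    reason="live-API smoke tests run only in the nightly job",
)
def test_live_provider_still_emits_expected_tool_call_shape():
    client = RealLLMClient()  # production adapter (assumed)
    response = client.generate("What is the weather in London? Use a tool.")
    assert response.tool_calls, "provider no longer returns tool calls"
    call = response.tool_calls[0]
    # These are the fields the deterministic stub and the harness rely on.
    assert call.id and call.name
    assert isinstance(call.arguments, dict)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;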

&lt;h2&gt;
  
  
  Beyond the Chatbot: Engineering for Agency
&lt;/h2&gt;

&lt;p&gt;If you are building complex systems, delegating between autonomous agents, or integrating internal APIs via MCP, you cannot afford to have untested authorization loops.&lt;/p&gt;

&lt;p&gt;Treating the LLM as just another untrusted external dependency gives you &lt;strong&gt;auditable proof of governance&lt;/strong&gt;: thousands of CI/CD security tests that run in milliseconds, without exposing secrets or spending a dime on API tokens.&lt;/p&gt;

&lt;p&gt;Do yourself a favor: &lt;strong&gt;Stub your LLMs&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Stubbing your LLM proves the guardrails work in test. &lt;strong&gt;TAG&lt;/strong&gt; enforces them in production, giving you continuous visibility into every agent action, authorization decision, and policy enforcement event across your entire organization. &lt;a href="https://www.tigera.io/contact-tigera-agent-governance/" rel="noopener noreferrer"&gt;Talk to us about TAG&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;The post &lt;a href="https://www.tigera.io/blog/how-to-stub-llms-for-ai-agent-security-testing-and-governance/" rel="noopener noreferrer"&gt;How to Stub LLMs for AI Agent Security Testing and Governance&lt;/a&gt; appeared first on &lt;a href="https://www.tigera.io" rel="noopener noreferrer"&gt;Tigera - Creator of Calico&lt;/a&gt;.&lt;/p&gt;

</description>
      <category>technicalblog</category>
      <category>aiagentsecuritygover</category>
      <category>bestpractices</category>
      <category>howto</category>
    </item>
    <item>
      <title>Introducing AI Assistant for Calico, Calico Load Balancer, and Seamless VM-to-Kubernetes Migration</title>
      <dc:creator>Alister Baroi</dc:creator>
      <pubDate>Mon, 23 Mar 2026 07:01:36 +0000</pubDate>
      <link>https://dev.to/tigeraio/introducing-ai-assistant-for-calico-calico-load-balancer-and-seamless-vm-to-kubernetes-migration-4h80</link>
      <guid>https://dev.to/tigeraio/introducing-ai-assistant-for-calico-calico-load-balancer-and-seamless-vm-to-kubernetes-migration-4h80</guid>
      <description>&lt;p&gt;&lt;strong&gt;SAN JOSE, Calif., March 23, 2026&lt;/strong&gt; — &lt;a href="https://www.tigera.io/?utm_source=syndicate&amp;amp;utm_medium=press_release&amp;amp;utm_campaign=KubeCon2026" rel="noopener noreferrer"&gt;Tigera&lt;/a&gt;, the creator and maintainer of Project Calico, today announced a major expansion of its Unified Network Security Platform for Kubernetes, aimed at helping enterprises consolidate infrastructure and accelerate the migration of legacy workloads to cloud-native platforms.&lt;/p&gt;

&lt;p&gt;The new capabilities include:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;AI Assistant for Calico:&lt;/strong&gt; A conversational intelligence layer that replaces complex manual log analysis with natural-language troubleshooting and proactive security audits.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Calico Load Balancer:&lt;/strong&gt; A high-performance, eBPF-based, software-defined load balancer that replaces expensive, rigid hardware appliances with a Kubernetes-native solution.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Seamless VM-to-Kubernetes Migration:&lt;/strong&gt; Advanced Layer 2 (L2) networking support eliminates migration friction by allowing virtual machines to move into Kubernetes clusters without changing their original IP addresses or existing VLAN dependencies.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;These innovations help organizations tackle the rising “complexity tax” in managing high-scale Kubernetes clusters and provide a high-velocity path to consolidate virtual machines and containers into a single, standardized platform.&lt;/p&gt;

&lt;p&gt;“The industry is at a breaking point where the operational overhead of managing legacy hardware and fragmented VM silos is no longer sustainable. By building a distributed load balancer into the fabric of Calico, launching an AI assistant that ‘troubleshoots at the speed of thought,’ and introducing live migration support to move VMs to Kubernetes, we are giving platform teams the power to innovate rather than spend hours managing and troubleshooting.”&lt;/p&gt;

&lt;p&gt;— Ratan Tipirneni, president and CEO, Tigera&lt;/p&gt;

&lt;h2&gt;
  
  
  Troubleshooting at the Speed of Thought: Introducing an AI Assistant for Calico
&lt;/h2&gt;

&lt;p&gt;Despite the wealth of telemetry available in modern clusters, SREs often struggle to find the “connecting thread” across isolated events. Calico’s AI Assistant provides a context-aware intelligence layer to extract actionable insights from raw telemetry.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Ask, Don’t Query:&lt;/strong&gt; Engineers can move away from rigid query languages and toward articulating intent in plain English. For example: “What are the unrestricted egress destinations currently receiving traffic from my pods?”&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Context-Aware Explanations:&lt;/strong&gt; The assistant provides summaries and recommendations generated from real telemetry and policy context, explaining exactly why traffic is being denied and offering remediation advice.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Proactive Security:&lt;/strong&gt; Beyond troubleshooting, the AI assistant maintains cluster stability by detecting unused network policies, identifying misconfigurations, and surfacing exposure risks before they cause an outage.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;&lt;em&gt;Explore the full capabilities: &lt;a href="https://www.tigera.io/blog/ai-assistant-for-calico-troubleshooting-at-the-speed-of-thought/" rel="noopener noreferrer"&gt;How the AI Assistant for Calico simplifies troubleshooting at the speed of thought.&lt;/a&gt;&lt;/em&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Eliminating Hardware Bottlenecks: The Calico Load Balancer
&lt;/h2&gt;

&lt;p&gt;On-premises Kubernetes teams have traditionally relied on legacy hardware appliances to expose services, creating significant operational overhead and rigid dependencies between networking and platform teams. These external solutions often lack visibility into Kubernetes service context, do not scale horizontally, and require manual coordination for even basic software upgrades.&lt;/p&gt;

&lt;p&gt;Tigera is disrupting this model with the Calico Load Balancer, a modern, software-defined solution built natively into the Calico platform. By transforming existing cluster nodes into a distributed, session-stable load-balancing tier, platform teams gain full control over service advertisement and configuration using the same Kubernetes workflows they already use.&lt;/p&gt;

&lt;p&gt;This Kubernetes-native innovation delivers several critical advantages:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Session Persistence for Stateful Apps:&lt;/strong&gt; A high-performance, eBPF-based data plane ensures that latency-sensitive, stateful applications like Kafka or RabbitMQ maintain active connections even during node failures or changes in network paths.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Graceful Node Restarts:&lt;/strong&gt; Platform teams can mark nodes for maintenance and take them offline without impacting user sessions, preventing lost transactions for critical business services.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Reduced Latency:&lt;/strong&gt; By enabling return traffic to take a shorter path back to the client, the solution reduces latency compared to traditional appliances where traffic must pass through the same central hardware twice.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Simplified Scaling:&lt;/strong&gt; The load balancer scales horizontally with the cluster; adding more nodes automatically adds more load-balancing capacity without vertical scaling limits or vendor upgrade cycles.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Self-Service and Declarative Control:&lt;/strong&gt; Configuration is handled through standard Kubernetes resources and GitOps workflows, removing cross-team bottlenecks and eliminating the need for tickets or separate management consoles.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;&lt;em&gt;Technical Deep Dive: &lt;a href="https://www.tigera.io/blog/calico-load-balancer-simplifying-network-traffic-management-with-ebpf/" rel="noopener noreferrer"&gt;Simplifying network traffic management with eBPF and the Calico Load Balancer.&lt;/a&gt;&lt;/em&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  The Great Migration: Seamlessly Moving VMs to Kubernetes
&lt;/h2&gt;

&lt;p&gt;Historically, migrating virtual machines to Kubernetes meant a forced network redesign because VMs rely on static IP addresses and legacy Layer 2 VLAN configurations. Tigera’s new L2 networking support removes this friction.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Zero-Change Migration:&lt;/strong&gt; VMs can be migrated from VMware to Kubernetes (KubeVirt) while keeping their original IP addresses, ensuring business continuity for applications with hardcoded dependencies.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Instant Security Upgrade:&lt;/strong&gt; Once migrated, VMs are automatically protected by Calico’s microsegmentation, allowing organizations to retire costly third-party security tools.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Once migrated, the VMs in Kubernetes benefit from Calico’s advanced network security and observability capabilities. For users familiar with technologies like VMware NSX, Calico provides NSX-like functionality, including software-defined networking, microsegmentation, a workload-based firewall, and egress gateways for VMs running in Kubernetes.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;em&gt;Step-by-Step Guide: &lt;a href="https://www.tigera.io/blog/lift-and-shift-vms-to-kubernetes-with-calico-l2-bridge-networks/" rel="noopener noreferrer"&gt;Lift and shift VMs to Kubernetes with Calico L2 bridge networks.&lt;/a&gt;&lt;/em&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  One Platform for Networking, Security, and Observability
&lt;/h2&gt;

&lt;p&gt;The new Calico Unified Network Security Platform provides platform teams with a single, operator-managed solution. This allows teams to gain consistent network policy enforcement across L3-L7 layers with unified visibility, eliminating the overhead of managing multiple tools. Calico works consistently across any Kubernetes distribution, virtual machines, and bare-metal servers, ensuring enterprises can avoid vendor lock-in.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;About Tigera&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://www.tigera.io/?utm_source=syndicate&amp;amp;utm_medium=press_release&amp;amp;utm_campaign=KubeCon2026" rel="noopener noreferrer"&gt;Tigera&lt;/a&gt; provides Calico, a unified network security and observability platform to prevent, detect, and mitigate security breaches in Kubernetes clusters. Tigera’s open-source offering, &lt;a href="https://www.tigera.io/tigera-products/calico?utm_source=syndicate&amp;amp;utm_medium=press_release&amp;amp;utm_campaign=KubeCon2026" rel="noopener noreferrer"&gt;Calico Open Source&lt;/a&gt;, is the most widely adopted container networking and security solution. Powering more than 100M containers across 8M+ nodes, Calico is supported across all major cloud providers and Kubernetes distributions.&lt;/p&gt;

&lt;p&gt;Media Contact&lt;br&gt;&lt;br&gt;
Media relations, Tigera&lt;br&gt;&lt;br&gt;
&lt;a href="mailto:contact@tigera.io"&gt;contact@tigera.io&lt;/a&gt;&lt;/p&gt;




&lt;h3&gt;
  
  
  Next Steps: Get Hands-on with These Innovations
&lt;/h3&gt;

&lt;p&gt;Learn more about AI Assistant, Calico Load Balancer, and L2 networking support within the Calico ecosystem. Whether you are looking to optimize troubleshooting, reduce hardware dependency, or accelerate your VM migration, we provide the tools to get started today.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1dp36elgeuxvuiact13r.png" alt="🚀" width="72" height="72"&gt; &lt;strong&gt;Experience the Platform:&lt;/strong&gt; &lt;a href="https://www.calicocloud.io/" rel="noopener noreferrer"&gt;Start a free trial of Calico Cloud&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2m902pgqgnzrjghahs3o.png" alt="📅" width="72" height="72"&gt; &lt;strong&gt;Personalized Deep Dive:&lt;/strong&gt; &lt;a href="https://www.tigera.io/demo/" rel="noopener noreferrer"&gt;Request a technical demo with our engineering team&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;em&gt;Attending KubeCon Amsterdam? Stop by the Tigera booth #400 to learn more about these features.&lt;/em&gt;&lt;/p&gt;




&lt;p&gt;The post &lt;a href="https://www.tigera.io/blog/introducing-ai-assistant-for-calico-calico-load-balancer-and-seamless-vm-to-kubernetes-migration/" rel="noopener noreferrer"&gt;Introducing AI Assistant for Calico, Calico Load Balancer, and Seamless VM-to-Kubernetes Migration&lt;/a&gt; appeared first on &lt;a href="https://www.tigera.io" rel="noopener noreferrer"&gt;Tigera - Creator of Calico&lt;/a&gt;.&lt;/p&gt;

</description>
      <category>companyblog</category>
    </item>
    <item>
      <title>Secure and Scale VMware VKS with Calico Kubernetes Networking</title>
      <dc:creator>Alister Baroi</dc:creator>
      <pubDate>Sun, 22 Mar 2026 18:50:39 +0000</pubDate>
      <link>https://dev.to/tigeraio/secure-and-scale-vmware-vks-with-calico-kubernetes-networking-4pl2</link>
      <guid>https://dev.to/tigeraio/secure-and-scale-vmware-vks-with-calico-kubernetes-networking-4pl2</guid>
      <description>&lt;p&gt;Co-authors&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Abhishek Rao&lt;/strong&gt; | Tigera&lt;br&gt;&lt;br&gt;
&lt;strong&gt;Ka Kit Wong, Charles Lee, &amp;amp; Christian Rauber&lt;/strong&gt; | Broadcom&lt;/p&gt;

&lt;p&gt;VMware vSphere Kubernetes Service (VKS) is the CNCF-certified Kubernetes runtime built directly into VMware Cloud Foundation (VCF), which delivers a single platform for both virtual machines and containers. VKS enables platform engineers to deploy, manage, and scale Kubernetes clusters while leveraging a comprehensive set of cloud services. And with VKS v3.6, that foundation just got significantly more powerful: VKS now natively supports Calico Enterprise — part of the &lt;a href="https://www.tigera.io/tigera-products/calico-commercial-editions/" rel="noopener noreferrer"&gt;Calico Unified Platform&lt;/a&gt; — as a validated, lifecycle-managed networking add-on through the new VKS Addon Framework.&lt;/p&gt;

&lt;p&gt;Even better, VKS natively integrates &lt;a href="https://www.tigera.io/tigera-products/calico/" rel="noopener noreferrer"&gt;Calico Open Source&lt;/a&gt; by Tigera as a supported, out-of-the-box Container Network Interface (CNI). This gives organizations a powerful open source baseline right from day one:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Pluggable Data Planes:&lt;/strong&gt; The flexibility to run high-performance eBPF, standard Linux iptables, modern nftables, or Windows data planes based on specific workload needs.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Wire-Speed Routing:&lt;/strong&gt; Direct BGP peering with the underlying VMware NSX infrastructure, eliminating the performance overhead of traditional overlay networks.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Foundational Zero-Trust:&lt;/strong&gt; Global default-deny policies to instantly secure pod-to-pod traffic.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Observability:&lt;/strong&gt; Includes Whisker, a visual UI tool that simplifies access to flow logs, making it easier to analyze network communication and debug policies.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;VKS and Calico Open Source build the perfect house for your applications. However, as Kubernetes adoption explodes across the enterprise, platform engineering and security teams inevitably hit a new wall.&lt;/p&gt;

&lt;p&gt;What happens when your security team mandates strict compliance audits across 50 different clusters? What happens when you need to route ephemeral Kubernetes traffic through your legacy physical firewalls? Or when a critical microservice drops traffic at 2 AM and you need to know exactly why?&lt;/p&gt;

&lt;p&gt;To conquer the complex realities of production scale, organizations running VKS are supercharging their environments with the &lt;a href="https://www.tigera.io/tigera-products/calico-commercial-editions/" rel="noopener noreferrer"&gt;Calico Unified Platform&lt;/a&gt; (available via Calico Enterprise and Calico Cloud). Here is how Calico transforms your baseline VKS clusters into a fully observable, enterprise-grade networking and security platform.&lt;/p&gt;




&lt;h3&gt;
  
  
  The Calico Unified Platform Reference Architecture
&lt;/h3&gt;

&lt;p&gt;As you scale your VKS environment, your architecture must evolve from providing basic pod connectivity to delivering a comprehensive security, routing, and observability mesh.&lt;/p&gt;

&lt;p&gt;The reference architecture below illustrates how Calico Unified Platform wraps your VKS worker nodes in advanced Layer 7 protections, granular egress controls, and deep forensic logging capabilities—all while maintaining the high-performance eBPF and BGP foundation of your clusters.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Calico Unified Platform Architecture&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://wordpress-1075849-4005834.cloudwaysapps.com/app/uploads/2026/03/image1.png" rel="noopener noreferrer"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fs1sqhhrz47wiuw3iovoo.png" width="800" height="501"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Figure 1: Calico Unified Platform reference architecture for VKS – showing how Calico Enterprise wraps VKS worker nodes with Layer 7 security, egress controls, and deep observability while preserving the eBPF and BGP performance foundation.&lt;/em&gt;&lt;/p&gt;




&lt;h3&gt;
  
  
  1. Secure the Perimeter: Bridging Kubernetes with Legacy Firewalls
&lt;/h3&gt;

&lt;p&gt;Traditional network security teams often struggle with Kubernetes because Pod IP addresses are ephemeral—they spin up and die in seconds. This makes it virtually impossible to write static firewall rules on your external Palo Alto or Fortinet appliances.&lt;/p&gt;

&lt;p&gt;The Calico Unified Platform bridges this gap seamlessly for VKS environments:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Egress Gateway &amp;amp; Source NAT:&lt;/strong&gt; Calico allows you to map dynamic Kubernetes namespaces to highly available, static IP Egress Gateways. When a pod talks to the outside world, your external firewall only sees the static IP. No more fighting with the NetSec team over IP tracking!&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Native WAF and IDS/IPS:&lt;/strong&gt; Secure your inbound traffic right at the Calico Ingress Gateway. Calico integrates a powerful Web Application Firewall (WAF) using the ModSecurity Core Rule Set. Coupled with native Intrusion Detection/Prevention (IDS/IPS) and DDoS protection, Calico detects and blocks malicious payloads before they impact performance.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;DNS Policies &amp;amp; Threat Feeds:&lt;/strong&gt; Do not just block IPs; block malicious domains. Calico dynamically ingests global threat intelligence feeds to automatically halt traffic to known bad actors (a domain-based egress sketch follows this list).&lt;/li&gt;
&lt;/ul&gt;
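
&lt;p&gt;To ground the DNS policy idea, here is a rough sketch of an egress rule keyed on domain names rather than IP addresses. It assumes the DNS policy capability in the commercial Calico editions; the labels and domains are illustrative, and the exact schema should be checked against the documentation for your version.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;# Illustrative domain-based egress policy (commercial Calico DNS policy).
# Pods labelled app == "payments" may only reach the listed domains;
# any other egress from those pods hits the final Deny rule.
apiVersion: projectcalico.org/v3
kind: GlobalNetworkPolicy
metadata:
  name: payments-allowed-domains
spec:
  selector: app == "payments"
  types:
  - Egress
  egress:
  - action: Allow
    destination:
      domains:
      - "api.stripe.com"
      - "*.internal.example.com"
  - action: Deny
&lt;/code&gt;&lt;/pre&gt;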

&lt;h3&gt;
  
  
  2. Enforce Zero-Trust at Scale: Unified Policy Across Kubernetes, VMs, and Bare Metal
&lt;/h3&gt;

&lt;p&gt;Open-source network policies are fantastic, but managing them across dozens of teams and clusters can quickly turn into the “Wild West” of YAML files. Calico brings true enterprise governance to your VKS environment—and extends it well beyond Kubernetes:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Network Policy Tiers &amp;amp; Staged Policies:&lt;/strong&gt; A hierarchical, RBAC-driven approach to security. The Security team can create non-overrideable “Tier 1” guardrails, while Developers get full freedom to write microsegmentation rules for their specific namespaces. Even better, with Staged Policies, you can preview and test the impact of any rule on live traffic before fully enforcing it, ensuring zero downtime (a short tier and staged-policy sketch follows this list).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Unified Protection for Legacy VMs &amp;amp; Bare Metal:&lt;/strong&gt; Your VKS clusters do not exist in a vacuum. Calico extends its policy engine beyond Kubernetes, allowing you to secure traditional VMware VMs and bare-metal servers using the exact same single-pane-of-glass dashboard—a headline differentiator of the Calico Unified Platform.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Sidecar-Less Service Mesh (Istio Ambient Mode):&lt;/strong&gt; Get the deep L7 visibility and mTLS encryption of a service mesh without the crippling performance overhead. Calico seamlessly integrates with Istio Ambient Mesh, managed through a single Calico operator—no standalone Istio expertise required.&lt;/li&gt;
&lt;/ul&gt;
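
&lt;p&gt;As a rough illustration of tiers and staged policies, the sketch below defines a high-priority tier owned by the security team and stages (rather than enforces) a guardrail policy inside it. The tier name, order value, and label selectors are illustrative, and staged policy resources are a commercial Calico capability, so verify the exact resource kinds against your release before relying on this.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;# Illustrative tier; lower order values are evaluated earlier.
apiVersion: projectcalico.org/v3
kind: Tier
metadata:
  name: security
spec:
  order: 100
---
# Staged guardrail policy: reports what it would deny in flow logs
# without enforcing it, so the impact can be reviewed before promotion.
apiVersion: projectcalico.org/v3
kind: StagedGlobalNetworkPolicy
metadata:
  name: security.deny-non-prod-to-prod   # tiered policies are named &lt;tier&gt;.&lt;name&gt;
spec:
  tier: security
  selector: env == "prod"
  types:
  - Ingress
  ingress:
  - action: Deny
    source:
      selector: env == "non-prod"
&lt;/code&gt;&lt;/pre&gt;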

&lt;h3&gt;
  
  
  3. Total Visibility: One Management Plane for Every Traffic Flow
&lt;/h3&gt;

&lt;p&gt;When a connection fails in a standard K8s cluster, troubleshooting usually involves blindly digging through kubectl logs. It is slow, frustrating, and drastically inflates your Mean Time to Resolution (MTTR).&lt;/p&gt;

&lt;p&gt;Calico acts as the ultimate CCTV system for your VKS clusters—with a single console covering every traffic type, from ingress to egress to pod-to-pod:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Dynamic Service Graph &amp;amp; Alerts:&lt;/strong&gt; Get a real-time visual map of all microservice traffic across your clusters. Instantly see performance metrics, blocked traffic, and active connections. You can even configure automated alerts and incident response to deploy mitigating policies the second an anomaly is detected.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Deep Forensic Logging:&lt;/strong&gt; Calico goes far beyond basic flow logs. It provides granular DNS Logs, L7 Logs, and Ingress Logs, allowing you to pinpoint exactly which layer of the stack is failing.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;On-Demand Packet Capture:&lt;/strong&gt; Did a specific pod trigger an anomaly? Trigger a targeted packet capture (pcap) directly from the Calico UI for deep forensic analysis, without ever having to SSH into the vSphere worker nodes.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  4. Scale Without Limits: Multi-Cluster Management and AI-Powered Operations
&lt;/h3&gt;

&lt;p&gt;As your VMware footprint grows, managing clusters individually becomes impossible. Calico’s Multi-Cluster Management provides a single pane of glass to view, secure, and troubleshoot all your VKS clusters—and even your public cloud EKS/AKS clusters. You can seamlessly federate identities and extend resilient multi-cluster networking with Cluster Mesh.&lt;/p&gt;

&lt;p&gt;And when things get truly complex? AI Assistant for Calico serves as your platform co-pilot. You can use natural language prompts to generate declarative Policy as Code, query flow logs, and diagnose active threats, drastically reducing the learning curve for new team members.&lt;/p&gt;

&lt;h3&gt;
  
  
  The Ultimate VKS Experience
&lt;/h3&gt;

&lt;p&gt;VMware VKS gives you a world-class, CNCF-certified Kubernetes platform built directly into VCF. Calico Enterprise — part of the &lt;a href="https://www.tigera.io/tigera-products/calico-commercial-editions/" rel="noopener noreferrer"&gt;Calico Unified Platform&lt;/a&gt; — takes that foundation further, delivering a single management plane for networking, network security, and observability across every cluster, every workload type, and every environment. No stitching tools together. No integration tax. Just the enterprise-grade performance and security your most critical workloads demand.&lt;/p&gt;

&lt;h4&gt;
  
  
  Ready to see it in action?
&lt;/h4&gt;

&lt;p&gt;&lt;a href="https://www.tigera.io/demo/" rel="noopener noreferrer"&gt;Request a Demo of Calico Enterprise →&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://www.calicocloud.io/home" rel="noopener noreferrer"&gt;Start your free trial of Calico Cloud today →&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The post &lt;a href="https://www.tigera.io/blog/vmware-vks-calico-secure-networking/" rel="noopener noreferrer"&gt;Secure and Scale VMware VKS with Calico Kubernetes Networking&lt;/a&gt; appeared first on &lt;a href="https://www.tigera.io" rel="noopener noreferrer"&gt;Tigera - Creator of Calico&lt;/a&gt;.&lt;/p&gt;

</description>
      <category>companyblog</category>
      <category>technicalblog</category>
      <category>partnerintegration</category>
      <category>announcements</category>
    </item>
    <item>
      <title>Calico Load Balancer: Simplifying Network Traffic Management with eBPF</title>
      <dc:creator>Alister Baroi</dc:creator>
      <pubDate>Sat, 21 Mar 2026 20:00:55 +0000</pubDate>
      <link>https://dev.to/tigeraio/calico-load-balancer-simplifying-network-traffic-management-with-ebpf-3l21</link>
      <guid>https://dev.to/tigeraio/calico-load-balancer-simplifying-network-traffic-management-with-ebpf-3l21</guid>
      <description>&lt;p&gt;&lt;strong&gt;Authors:&lt;/strong&gt; Alex O’Regan, Aadhil Abdul Majeed&lt;/p&gt;

&lt;p&gt;Ever had a load balancer become the bottleneck in an on-prem Kubernetes cluster? You are not alone. Traditional hardware load balancers add cost, create coordination overhead, and can make scaling painful. A Kubernetes-native approach can overcome many of those challenges by pushing load balancing into the cluster data plane. Calico Load Balancer is an &lt;a href="https://www.tigera.io/learn/guides/ebpf/" rel="noopener noreferrer"&gt;&lt;strong&gt;eBPF&lt;/strong&gt;&lt;/a&gt;-powered, Kubernetes-native load balancer that uses consistent hashing (Maglev) and Direct Server Return (DSR) to keep sessions stable while allowing you to scale on demand.&lt;/p&gt;

&lt;p&gt;Below is a developer-focused walkthrough: what problem Calico Load Balancer solves, how Maglev consistent hashing works, the life of a packet with DSR, and a clear configuration workflow you can follow to roll it out.&lt;/p&gt;




&lt;h2&gt;
  
  
  Why a Kubernetes-native load balancer matters
&lt;/h2&gt;

&lt;p&gt;On-prem clusters often rely on dedicated hardware or proprietary appliances to expose services. That comes with a few persistent problems:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Cost and scaling friction&lt;/strong&gt; – You have to scale the network load balancer vertically as the size and throughput requirements of your Kubernetes clusters grow.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Operational overhead&lt;/strong&gt; – Virtual IPs (VIPs) are often owned by another team, so simple service changes require coordination.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Stateful failure modes&lt;/strong&gt; – Kube-proxy load balancing is stateful per node, so losing an ingress node can break active sessions.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Configuration drift&lt;/strong&gt; – Kubernetes is declarative, but the upstream load balancer is not, which causes divergence over time.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Calico Load Balancer flips that model. Instead of dedicated hardware, it uses the &lt;strong&gt;Calico eBPF&lt;/strong&gt; data plane on ordinary Linux nodes in the cluster, advertises service IPs via &lt;a href="https://www.tigera.io/blog/when-to-use-bgp-vxlan-or-ip-in-ip-a-practical-guide-for-kubernetes-networking/" rel="noopener noreferrer"&gt;BGP&lt;/a&gt;, and makes the load balancing decision consistent across nodes. The result is a system that is cheaper to scale, easier to operate, and more resilient to node or path changes.&lt;/p&gt;

&lt;h2&gt;
  
  
  How Calico Load Balancer works (and why Maglev matters)
&lt;/h2&gt;

&lt;p&gt;The core idea is consistent hashing. Instead of each node picking a backend at random and storing that decision in per-node state, Calico Load Balancer computes the same backend choice on any node for the same flow. This is implemented with Maglev, a consistent hashing algorithm that:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Evenly distributes connections across backends.&lt;/li&gt;
&lt;li&gt;Minimizes disruption when load balancer nodes come and go.&lt;/li&gt;
&lt;li&gt;Allows any load balancer node to make the same backend selection, even mid-connection.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Kube-proxy uses random selection plus per-node state, which is fine for many cases but can fail under node churn or route changes. Maglev avoids that by making the decision deterministic. Nodes may still cache the mapping for performance, but the flow-to-backend decision can be reproduced anywhere, which is what keeps sessions stable when traffic lands on a different node.&lt;/p&gt;

&lt;h3&gt;
  
  
  Strategic Assessment: Is This Right for Your Deployment?
&lt;/h3&gt;

&lt;p&gt;Questions you can ask your team to identify if Calico Load Balancer can help your environment:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Which services are most impacted by node churn today?&lt;/li&gt;
&lt;li&gt;Where do we see the most operational overhead in Virtual IP (VIP) provisioning?&lt;/li&gt;
&lt;li&gt;How do we secure access to service VIPs?&lt;/li&gt;
&lt;li&gt;Does the network have Equal Cost Multi-Path (ECMP) access to service VIPs?&lt;/li&gt;
&lt;li&gt;How do we handle VIP failover?&lt;/li&gt;
&lt;li&gt;Are there services with high-throughput requirements?&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  The Life of a Packet
&lt;/h2&gt;

&lt;p&gt;A key design goal is to keep client sessions stable while enabling horizontal scale. Here is a simplified flow for a typical ECMP + BGP setup:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://www.tigera.io/app/uploads/2026/03/image2-1.png" rel="noopener noreferrer"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwafci6ul41u8twkj1qzm.png" alt="This diagram shows how Direct Server Return (DSR) allows the return path to bypass the load balancer node, reducing latency and hop count." width="800" height="580"&gt;&lt;/a&gt;&lt;br&gt;
&lt;em&gt;This diagram shows how Direct Server Return (DSR) allows the return path to bypass the load balancer node, reducing latency and hop count.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;A few important details:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The top-of-rack router uses ECMP to pick a load balancer node to receive the packet.&lt;/li&gt;
&lt;li&gt;That node runs the Maglev algorithm to choose the backend pod. It DNATs the packet and tunnels it to the node that hosts the pod.&lt;/li&gt;
&lt;li&gt;The pod replies, and the node SNATs the packet back to the service VIP before it leaves.&lt;/li&gt;
&lt;li&gt;With &lt;strong&gt;DSR (Direct Server Return)&lt;/strong&gt;, the return path bypasses the load balancer node and goes straight back to the client. The client always sees responses from the advertised service VIP.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That &lt;strong&gt;DSR&lt;/strong&gt; path is important. It keeps the data path efficient and reduces load balancer hop count on the return path. It also prevents the client from seeing internal pod IPs.&lt;/p&gt;

&lt;h3&gt;
  
  
  DSR compared to a traditional return path
&lt;/h3&gt;

&lt;p&gt;If you have only worked with classic NAT-based load balancers, DSR can feel unusual. The key difference is that the response does not have to traverse the same load balancer node that handled the inbound packet. That has two practical benefits: less work for the load balancer nodes and lower return-path latency.&lt;/p&gt;

&lt;h3&gt;
  
  
  Maglev and caching: deterministic and fast
&lt;/h3&gt;

&lt;p&gt;There are two pieces working together in Calico Load Balancer:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;The Maglev lookup table:&lt;/strong&gt; Provides the deterministic backend choice. Any node can compute the same result for the same flow, which is why mid-connection packets can land on a different node without breaking the session.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;A per-flow cache:&lt;/strong&gt; A cache (for example, conntrack) can retain that decision for efficiency and preserve existing connections when the backend lookup table changes, but it is not the source of truth for correctness.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This is a subtle but important difference from kube-proxy. In kube-proxy, the per-node conntrack decision is the only thing tying a flow to a backend. In Calico Load Balancer, which uses &lt;a href="https://www.tigera.io/learn/guides/ebpf/" rel="noopener noreferrer"&gt;&lt;strong&gt;Calico’s eBPF dataplane&lt;/strong&gt;&lt;/a&gt;, the decision can be reproduced on any node, which is what makes failover or ECMP rehash events non-disruptive.&lt;/p&gt;

&lt;h3&gt;
  
  
  What happens during failures or path changes
&lt;/h3&gt;

&lt;p&gt;Consistent hashing is not just about distribution. It is about resilience. In practice, you can test this by intentionally re-routing traffic for an existing TCP connection to a different node. Even if the new node has no prior per-flow state, it can recompute the same backend decision using Maglev, so the connection can continue without disruption.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://www.tigera.io/app/uploads/2026/03/image1-1.png" rel="noopener noreferrer"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2z49ruqn8qpbk9kils6w.png" width="800" height="546"&gt;&lt;/a&gt;&lt;br&gt;
&lt;em&gt;Calico uses Maglev consistent hashing to ensure TCP sessions remain stable even if a load balancer node fails or is drained&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;This matters when:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;A load balancer node fails or is drained.&lt;/li&gt;
&lt;li&gt;ECMP next hops reshuffle due to network outages.&lt;/li&gt;
&lt;li&gt;You scale the load balancer pool up or down.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Because the decision is deterministic, the packet can land on any node and still find the correct backend. The whole cluster then effectively acts as a single, distributed load balancer, with per-node caches for additional performance and resilience.&lt;/p&gt;

&lt;h2&gt;
  
  
  Configuration workflow (high level)
&lt;/h2&gt;

&lt;p&gt;Calico Load Balancer is configured and managed declaratively just like any other Kubernetes resource. A typical configuration flow looks like this:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Create a dedicated IP pool for Calico LB IPAM, marked for LoadBalancer use.&lt;/li&gt;
&lt;li&gt;Create a Service of type LoadBalancer. Calico IPAM allocates a VIP from that pool.&lt;/li&gt;
&lt;li&gt;Advertise the VIP to the upstream network using Calico BGP (optional BFD for faster detection of outages).&lt;/li&gt;
&lt;li&gt;Ensure your upstream router uses ECMP to send traffic for the VIP to the Calico load balancer nodes.
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Calico IP pool for load balancer VIPs&lt;/span&gt;
&lt;span class="na"&gt;apiVersion&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;projectcalico.org/v3&lt;/span&gt;
&lt;span class="na"&gt;kind&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;IPPool&lt;/span&gt;
&lt;span class="na"&gt;metadata&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;loadbalancer-ip-pool&lt;/span&gt;
&lt;span class="na"&gt;spec&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;cidr&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;192.210.0.0/20&lt;/span&gt;
  &lt;span class="na"&gt;blockSize&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;24&lt;/span&gt;
  &lt;span class="na"&gt;assignmentMode&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Automatic&lt;/span&gt;
  &lt;span class="na"&gt;allowedUses&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;LoadBalancer&lt;/span&gt;


&lt;span class="c1"&gt;# Kubernetes Service using Calico LB&lt;/span&gt;
&lt;span class="na"&gt;apiVersion&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;v1&lt;/span&gt;
&lt;span class="na"&gt;kind&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Service&lt;/span&gt;
&lt;span class="na"&gt;metadata&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;my-app&lt;/span&gt;
  &lt;span class="na"&gt;annotations&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;lb.projectcalico.org/external-traffic-strategy&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;maglev&lt;/span&gt;
&lt;span class="na"&gt;spec&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;LoadBalancer&lt;/span&gt;
  &lt;span class="na"&gt;selector&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;app&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;my-app&lt;/span&gt;
  &lt;span class="na"&gt;ports&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;port&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;443&lt;/span&gt;
      &lt;span class="na"&gt;targetPort&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;8443&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;From there, the VIP is advertised and traffic can arrive through the ECMP paths to any load balancer node. Calico handles the rest.&lt;/p&gt;
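
&lt;p&gt;For the advertisement step specifically, one way to tell Calico which LoadBalancer CIDRs to announce over BGP is the BGPConfiguration resource. The sketch below reuses the CIDR from the IP pool above; the BGP peers themselves (BGPPeer resources) and any BFD tuning are environment-specific and left out here.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;# Advertise the LoadBalancer VIP range to upstream BGP peers.
# Peers are configured separately via BGPPeer resources.
apiVersion: projectcalico.org/v3
kind: BGPConfiguration
metadata:
  name: default
spec:
  serviceLoadBalancerIPs:
  - cidr: 192.210.0.0/20   # matches the loadbalancer-ip-pool above
&lt;/code&gt;&lt;/pre&gt;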

&lt;h2&gt;
  
  
  Platform Benefits
&lt;/h2&gt;

&lt;p&gt;These design choices translate into real operational advantages for platform teams:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Remove Hardware Dependency:&lt;/strong&gt; Scale load balancing capacity by adding standard Kubernetes nodes rather than purchasing expensive appliances or coordinating with vendors and avoid vendor lock-in.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Kubernetes-native approach:&lt;/strong&gt; Reduces complexity by keeping all service configuration within your existing GitOps workflows – no separate load balancer management interfaces or external ticketing systems.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Session persistence:&lt;/strong&gt; Addresses one of the most common causes of user-facing outages in traditional setups, where losing an ingress node would drop all active connections.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Self-service capability:&lt;/strong&gt; Empowers development teams to provision and modify load balancer configurations without waiting for network team approvals, significantly reducing time-to-market for new services.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Predictable traffic distribution:&lt;/strong&gt; Maglev’s consistent hashing ensures that traffic distribution remains predictable and fair even as backend pods scale up and down, preventing the “hot spot” issues that can occur with simpler load balancing algorithms.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;Calico Load Balancer gives you a Kubernetes-native way to scale your load balancer and protect critical services without the operational drag of traditional appliances.&lt;/p&gt;




&lt;h3&gt;
  
  
  Ready to scale your on-prem networking?
&lt;/h3&gt;

&lt;p&gt;If you want to try this in your environment, here is a safe, incremental path:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Identify&lt;/strong&gt; a non-critical service that is a good LoadBalancer candidate.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Create&lt;/strong&gt; a Calico IP pool for LoadBalancer VIPs and advertise it via BGP to your upstream network.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Enable&lt;/strong&gt; a LoadBalancer Service with Maglev for that service and confirm the VIP is reachable.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Validate&lt;/strong&gt; failover: remove a load balancer node or change ECMP next hops and verify sessions continue.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Document&lt;/strong&gt; the workflow and replicate to other services.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;a href="https://www.tigera.io/learn/guides/ebpf/" rel="noopener noreferrer"&gt;Learn more about Calico eBPF&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The post &lt;a href="https://www.tigera.io/blog/calico-load-balancer-simplifying-network-traffic-management-with-ebpf/" rel="noopener noreferrer"&gt;Calico Load Balancer: Simplifying Network Traffic Management with eBPF&lt;/a&gt; appeared first on &lt;a href="https://www.tigera.io" rel="noopener noreferrer"&gt;Tigera - Creator of Calico&lt;/a&gt;.&lt;/p&gt;

</description>
      <category>technicalblog</category>
      <category>bestpractices</category>
    </item>
    <item>
      <title>Lift-and-Shift VMs to Kubernetes with Calico L2 Bridge Networks</title>
      <dc:creator>Alister Baroi</dc:creator>
      <pubDate>Sat, 21 Mar 2026 06:12:02 +0000</pubDate>
      <link>https://dev.to/tigeraio/lift-and-shift-vms-to-kubernetes-with-calico-l2-bridge-networks-2d15</link>
      <guid>https://dev.to/tigeraio/lift-and-shift-vms-to-kubernetes-with-calico-l2-bridge-networks-2d15</guid>
      <description>&lt;p&gt;On paper, lift-and-shift VM migration to Kubernetes sounds simple. Compute can be moved. Storage can be remapped. But many migration projects stall at the network boundary. VM workloads are often tied to IP addresses, network segments, firewall rules, and routing models that already exist in the wider environment. That is where lift-and-shift becomes much harder than it first appears.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why lift-and-shift migration is challenging
&lt;/h2&gt;

&lt;p&gt;In a traditional hypervisor environment:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;A VM connects to a network the rest of the data center already understands.&lt;/li&gt;
&lt;li&gt;Its IP address is a first-class citizen of the network.&lt;/li&gt;
&lt;li&gt;Firewalls, routers, &lt;a href="https://www.tigera.io/learn/guides/kubernetes-monitoring/kubernetes-monitoring-tools/" rel="noopener noreferrer"&gt;monitoring tools&lt;/a&gt;, and peer applications know how to reach it.&lt;/li&gt;
&lt;li&gt;Existing application dependencies are often built around that network identity.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Default &lt;a href="https://www.tigera.io/learn/guides/kubernetes-networking/" rel="noopener noreferrer"&gt;Kubernetes pod networking&lt;/a&gt; works very differently:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Pod IPs usually come from a cluster-managed pod CIDR.&lt;/li&gt;
&lt;li&gt;Those IPs are mainly meaningful inside the Kubernetes cluster.&lt;/li&gt;
&lt;li&gt;The upstream network usually does not have direct visibility into pod networks.&lt;/li&gt;
&lt;li&gt;The original network segments from the VM world are not preserved by default.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This creates a major problem for VM migration:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The workload can no longer keep the same network presence it had before.&lt;/li&gt;
&lt;li&gt;Teams often need to introduce VIPs or reconfigure the networking settings of the VM.&lt;/li&gt;
&lt;li&gt;That adds more complexity since changing the IP of the VM also requires changes to network firewall and load balancer configuration.&lt;/li&gt;
&lt;li&gt;At scale, it can make migration slower, more expensive, and harder to justify.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;So while Kubernetes can be a strong platform for running VM workloads, default pod networking is often not a natural fit for lift-and-shift migration. The networking gap is one of the biggest reasons these projects become more complex than expected.&lt;/p&gt;

&lt;p&gt;The lack of network continuity is shown in the image below.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://www.tigera.io/app/uploads/2026/03/image1-2.png" rel="noopener noreferrer"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmam2sb943kmnvry1wwri.png" alt="A diagram showing a VM moving from an existing hypervisor to a Kubernetes Pod Network, resulting in " width="800" height="591"&gt;&lt;/a&gt;&lt;br&gt;
&lt;em&gt;Default pod networking often creates a gap in network continuity, forcing complex reconfigurations and breaking existing dependencies like firewalls and load balancers.&lt;/em&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Introducing Calico L2 Bridge Networks
&lt;/h2&gt;

&lt;p&gt;Calico L2 Bridge Networks are designed to close that gap. Instead of forcing the VM to adapt to the Kubernetes pod network, Calico allows administrators to extend the existing layer 2 network all the way to the virtual machine running in Kubernetes.&lt;/p&gt;

&lt;p&gt;Administrators can define a &lt;strong&gt;network&lt;/strong&gt; resource in Kubernetes, and Calico creates a bridge on the cluster nodes to extend external networks. A trunk interface can be attached to the bridge, allowing VLANs to be carried all the way to the virtual machine. During migration, the migration tool can map the VM’s existing interface to interface definitions in the cluster and also inform Calico of the VM’s IP address, so Calico can keep track of that address throughout the VM’s lifecycle. Calico does all the underlying plumbing to ensure that the VM retains its network connectivity after migration.&lt;/p&gt;
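
&lt;p&gt;The exact resource schema depends on your Calico version, but conceptually the definition ties a node-level bridge to a trunk interface and the VLAN it should carry. The sketch below is purely hypothetical: the kind and field names are illustrative placeholders rather than the documented API, and it is only meant to show the shape of the intent (bind a VLAN on an existing trunk to a named network that a migrated VM can attach to).&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;# HYPOTHETICAL sketch: kind and field names are illustrative, not the
# actual Calico API. Consult the Calico docs for the real L2 bridge
# network resource schema.
apiVersion: projectcalico.org/v3
kind: Network                # hypothetical kind
metadata:
  name: legacy-app-vlan-120
spec:
  type: L2Bridge             # hypothetical fields below
  bridge:
    trunkInterface: ens224   # physical trunk NIC on each node (example)
    vlan: 120                # VLAN carried through to the VM
&lt;/code&gt;&lt;/pre&gt;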

&lt;p&gt;The key point is that the VM does not need a brand new networking model just because it moved to Kubernetes. The same layer 2 network structure can be preserved, which makes lift-and-shift migration much more practical.&lt;/p&gt;

&lt;h3&gt;
  
  
  Why this matters
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Existing VLAN-based connectivity can be extended directly to the VM.&lt;/li&gt;
&lt;li&gt;Administrators do not need to re-address the VM or place it behind VIPs just to make migration work.&lt;/li&gt;
&lt;li&gt;Multiple VLANs can be supported through the same trunk-backed bridge.&lt;/li&gt;
&lt;li&gt;The network can move with the VM, instead of becoming a separate redesign project.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The network continuity offered by Calico L2 Bridge Networks is shown in the image below.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://www.tigera.io/app/uploads/2026/03/image3.png" rel="noopener noreferrer"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fkdc1bak72t2fi379y72a.png" alt="A diagram showing a VM migrating to Kubernetes via a Calico L2 Bridge, which extends existing VLANs and maintains connection to original network firewalls and load balancers." width="800" height="592"&gt;&lt;/a&gt;&lt;br&gt;
&lt;em&gt;Calico L2 Bridge Networks allow you to extend existing Layer 2 infrastructure directly into Kubernetes, enabling “lift-and-shift” migrations that preserve original IP addresses and VLANs.&lt;/em&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  Readiness Assessment: Is L2 Bridge Networking Right for Your Migration?
&lt;/h4&gt;

&lt;p&gt;Ask your infrastructure and networking teams:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Do our existing VMs rely on specific VLAN tags for firewall policy enforcement?&lt;/li&gt;
&lt;li&gt;Will re-addressing our workloads require updating multiple external load balancers or hardcoded application dependencies?&lt;/li&gt;
&lt;li&gt;Do we need to maintain L2 adjacency between our legacy VM clusters and new Kubernetes nodes during a phased migration?&lt;/li&gt;
&lt;li&gt;Is network observability (via eBPF) a requirement for our compliance or troubleshooting workflows post-migration?&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Benefits After Migration
&lt;/h2&gt;

&lt;p&gt;Calico L2 Bridge Networks do more than simplify the move into Kubernetes. Once the VM is running in Kubernetes, Calico can also bring the same operational advantages that teams already expect for cloud-native workloads.&lt;/p&gt;

&lt;h3&gt;
  
  
  Network Observability
&lt;/h3&gt;

&lt;p&gt;One major benefit is &lt;a href="https://www.tigera.io/learn/guides/observability/" rel="noopener noreferrer"&gt;observability&lt;/a&gt;. Calico provides visibility into network traffic for these VM interfaces, giving administrators a much clearer view of how workloads are communicating after migration. Because Calico uses eBPF, it can capture deep insights into network behavior without relying on external tooling or guesswork. That makes it easier to understand traffic patterns, troubleshoot issues, and operate migrated VMs with more confidence.&lt;/p&gt;

&lt;h3&gt;
  
  
  Calico Policy Enforcement
&lt;/h3&gt;

&lt;p&gt;Another major benefit is policy enforcement. Administrators can apply declarative &lt;a href="https://www.tigera.io/learn/guides/kubernetes-security/kubernetes-network-policy/" rel="noopener noreferrer"&gt;network policy&lt;/a&gt; directly to these VM interfaces using Kubernetes-native constructs. Policies can be based on labels, which fits naturally into Kubernetes operations, and selectors can be used to target specific VLANs or external networks when defining policy. Teams can also migrate networking policy from their previous hypervisor environment into Calico network policy, helping them maintain the same security posture as workloads move into Kubernetes. In practice, that means teams can preserve the connectivity model they need while still applying consistent, modern security controls to VM workloads inside Kubernetes.&lt;/p&gt;
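
&lt;p&gt;Because the migrated VM is just another labelled workload endpoint from Calico’s point of view, ordinary Calico policy syntax applies to it. The snippet below is a minimal sketch under that assumption; the namespace, labels, and port are illustrative placeholders that would come from your own migration tooling and application.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;# Illustrative policy for a migrated database VM: only the app tier may
# reach it on its listening port. Namespace, labels, and port are examples.
apiVersion: projectcalico.org/v3
kind: NetworkPolicy
metadata:
  name: allow-app-to-db-vm
  namespace: legacy-workloads
spec:
  selector: workload == "db-vm"   # label applied to the migrated VM
  types:
  - Ingress
  ingress:
  - action: Allow
    protocol: TCP
    source:
      selector: tier == "app"
    destination:
      ports:
      - 5432
&lt;/code&gt;&lt;/pre&gt;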

&lt;h3&gt;
  
  
  Live Migration
&lt;/h3&gt;

&lt;p&gt;Live migration is another important benefit. Once the VM is running in Kubernetes, it can be moved from one node to another while retaining the same network configuration. That is critical for day-2 operations, because it means teams can take advantage of Kubernetes-based VM mobility without having to rework network settings each time a workload moves. The network identity stays consistent even as the VM is migrated across the cluster.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://www.tigera.io/app/uploads/2026/03/image2-2.png" rel="noopener noreferrer"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmftw2561i69ipnocecq5.png" alt="A diagram illustrating a VM live migrating from Node 1 to Node 2 within a Kubernetes cluster while maintaining consistent compute and networking via KubeVirt and Calico." width="800" height="598"&gt;&lt;/a&gt;&lt;br&gt;
&lt;em&gt;By decoupling compute and networking, Calico ensures that migrated VMs can move between cluster nodes while retaining their original network configuration and identity.&lt;/em&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;Lift-and-shift VM migration to Kubernetes often breaks down because the network model does not move with the workload. That forces teams to introduce workarounds such as VIPs, re-addressing, and additional operational complexity, which can quickly turn a simple migration plan into a much larger project.&lt;/p&gt;

&lt;p&gt;Calico L2 Bridge Networks help remove that barrier by extending existing layer 2 networks all the way to the VM inside Kubernetes. That means teams can preserve familiar network configurations during migration while also gaining the advantages of running VMs on Kubernetes, including observability, declarative policy, and live migration. Instead of treating networking as a migration blocker, organizations can use Calico to make it part of a cleaner and more practical path forward.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Webinar recording (available on demand)&lt;/strong&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Calico L2 bridge networking for virtual machines
&lt;/h2&gt;

&lt;p&gt;Migrating VMs to Kubernetes? Learn how to preserve your existing IPs, VLANs, and security policies — no network rebuild required.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;“Lift and shift” VM migrations with zero IP changes&lt;/li&gt;
&lt;li&gt;Maintain existing VLANs and security dependencies&lt;/li&gt;
&lt;li&gt;Expert guidance from Tigera’s networking team&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://www.youtube.com/watch?v=gxpm47mGKPc" rel="noopener noreferrer"&gt;Watch the recording&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The post &lt;a href="https://www.tigera.io/blog/lift-and-shift-vms-to-kubernetes-with-calico-l2-bridge-networks/" rel="noopener noreferrer"&gt;Lift-and-Shift VMs to Kubernetes with Calico L2 Bridge Networks&lt;/a&gt; appeared first on &lt;a href="https://www.tigera.io" rel="noopener noreferrer"&gt;Tigera - Creator of Calico&lt;/a&gt;.&lt;/p&gt;

</description>
      <category>technicalblog</category>
      <category>bestpractices</category>
    </item>
    <item>
      <title>AI Assistant for Calico: Troubleshooting at the Speed of Thought</title>
      <dc:creator>Alister Baroi</dc:creator>
      <pubDate>Thu, 19 Mar 2026 20:36:45 +0000</pubDate>
      <link>https://dev.to/tigeraio/ai-assistant-for-calico-troubleshooting-at-the-speed-of-thought-38jo</link>
      <guid>https://dev.to/tigeraio/ai-assistant-for-calico-troubleshooting-at-the-speed-of-thought-38jo</guid>
      <description>&lt;p&gt;Despite the wealth of data available, distilling a coherent narrative from a Kubernetes cluster remains a challenge for modern infrastructure teams. Even with powerful visualization tools like the Policy Board, Service Graph, and specialized dashboards, &lt;a href="https://www.splunk.com/en_us/blog/learn/kubernetes-troubleshoot-observability.html" rel="noopener noreferrer"&gt;users often find themselves spending significant time piecing together context across different screens&lt;/a&gt;. Making good use of this data to secure a cluster or troubleshoot an issue becomes nearly impossible when it requires manually searching across multiple sources to find a single “connecting thread.”&lt;/p&gt;

&lt;p&gt;Inevitably, security holes happen, configurations conflict causing outages, and teams scramble to find that needle-in-the-haystack cause of cluster instability. A new approach is needed to understand the complex layers of security and the interconnected relationships among numerous microservices. Observability tools need to not only organize and present data in a coherent manner but proactively help to filter and interpret it, cutting through the noise to get to the heart of an issue. As we discussed in our &lt;a href="https://www.tigera.io/blog/2026-the-rise-of-ai-agents/" rel="noopener noreferrer"&gt;2026 outlook on the rise of AI agents&lt;/a&gt;, this represents a fundamental shift in Kubernetes management.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Key Insight:&lt;/strong&gt; With AI Assistant for Calico, observability takes a leap forward, providing a proactive, conversational, and context-aware intelligence layer to extract actionable insights from a sea of raw telemetry. SREs can interrogate their data through a natural language interface instead of having to painstakingly construct complex queries, removing knowledge barriers and reducing MTTR (Mean Time to Repair).&lt;/p&gt;

&lt;h2&gt;
  
  
  Beyond Manual Log Analysis
&lt;/h2&gt;

&lt;p&gt;To understand the impact of the AI Assistant for Calico, it is helpful to look at the traditional workflow through the lens of the challenges platform teams face daily. Troubleshooting connectivity issues, for example, typically starts with a look at traffic flows, identifying ones that may be problematic, then drilling down into the details while looking up possibly relevant policies, network configuration, ingress rules, and hostname resolution in different dashboards and sets of logs. Often one or more multi-step queries have to be run and then the results have to be filtered to start getting an idea of what may be going wrong. This is particularly difficult when &lt;a href="https://www.tigera.io/blog/why-kubernetes-flat-networks-fail-at-scale/" rel="noopener noreferrer"&gt;Kubernetes flat networks fail at scale&lt;/a&gt;, increasing the complexity of every query.&lt;/p&gt;

&lt;p&gt;This sort of manual navigation slows down problem resolution and imposes a high cognitive cost on SREs. Even for seasoned engineers, debugging can take hours or even days when the answer must be excavated from multiple sources of information.&lt;/p&gt;

&lt;h2&gt;
  
  
  Natural Language Insights
&lt;/h2&gt;

&lt;p&gt;The AI Assistant for Calico resolves these bottlenecks by replacing cumbersome queries with a seamless, natural-language interface that interprets telemetry instead of just displaying it and synthesizes data from multiple sources so you don’t have to. By moving away from rigid query languages, the assistant changes how engineers interact with their cluster data in three primary ways:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Ask, Don’t Query:&lt;/strong&gt; Troubleshooting now starts with an articulation of intent instead of a lengthy session wrestling with search fields and operators. Being able to simply ask “What are the unrestricted egress destinations currently receiving traffic from my pods?” without painstakingly cobbling together and testing a multi-layered query is a paradigm shift. It moves the engineer’s focus from the mechanics of the search to the logic of the solution.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Context-Aware Explanations:&lt;/strong&gt; The assistant doesn’t just return raw data; it provides summaries and recommendations generated from real telemetry and policy context. It can explain, for instance, that “Traffic is denied because policy X in namespace Y blocks TCP 443.” It also suggests further troubleshooting steps and offers remediation advice.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Unified Visibility Across the Cluster:&lt;/strong&gt; The assistant provides insights across clusters, namespaces, and workloads, extracting details that would previously require drilling down into, for example, a specific flow or policy configuration. All of a sudden, that “connecting thread” between seemingly isolated events becomes a lot clearer.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;AI Assistant for Calico allows engineers to quickly zero in on relevant information using a conversational form of root-cause analysis that even junior members of the team can have success with.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://www.tigera.io/app/uploads/2026/03/AI-Assisstant-for-Calico-.png" rel="noopener noreferrer"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2glgplvwnukldlj56qh2.png" width="800" height="477"&gt;&lt;/a&gt;&lt;br&gt;
&lt;em&gt;AI Assistant for Calico can quickly get you the information you need&lt;/em&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Proactive Security and Policy Optimization
&lt;/h2&gt;

&lt;p&gt;While reactive troubleshooting is critical, the AI Assistant for Calico also enables a proactive security posture by identifying misconfigurations and security gaps that might otherwise go unnoticed:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Surfacing Exposure Risks:&lt;/strong&gt; The AI Assistant can identify workloads exposed to the internet or detect egress exposure risks, such as pods communicating with unrestricted external destinations.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Policy Recommendations and Generation:&lt;/strong&gt; Instead of starting from scratch, users can ask the AI to recommend a base policy or generate a specific snippet, such as a policy to block all egress traffic from a specific training pod (a sketch of such a snippet follows this list).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Cleaning up the Mesh:&lt;/strong&gt; The assistant helps maintain cluster stability and security hygiene by detecting unused or missing network policies.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Identifying Gaps:&lt;/strong&gt; It proactively surfaces network flows that have no policies applied to them, ensuring that the principle of least privilege is maintained across the cluster—a key requirement highlighted in the &lt;a href="https://www.tigera.io/blog/key-insights-from-the-2025-gigaom-radar-for-container-networking/" rel="noopener noreferrer"&gt;2025 GigaOm Radar for Container Networking&lt;/a&gt;.&lt;/li&gt;
&lt;/ul&gt;
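
&lt;p&gt;To ground the policy-generation example, the kind of snippet the assistant might produce for “block all egress traffic from a specific training pod” would look roughly like the sketch below. The namespace and label are illustrative, and any generated policy should still be reviewed (or staged) before it is enforced.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;# Illustrative egress lockdown for a single training workload.
apiVersion: projectcalico.org/v3
kind: NetworkPolicy
metadata:
  name: deny-egress-training-pod
  namespace: ml-training        # illustrative namespace
spec:
  selector: job == "llm-train"  # illustrative pod label
  types:
  - Egress
  egress:
  - action: Deny
&lt;/code&gt;&lt;/pre&gt;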

&lt;p&gt;These capabilities streamline the time-consuming and error-prone process of manually managing intricate policy syntax, making for more stable, performant, and secure clusters.&lt;/p&gt;

&lt;h2&gt;
  
  
  Real-World Scenario: Rapidly Resolving a Blocked Service Connection
&lt;/h2&gt;

&lt;p&gt;To see the impact of these capabilities, consider a common high-pressure situation for a platform engineer. An engineer receives an urgent alert that a critical production service is unable to communicate with its database.&lt;/p&gt;

&lt;p&gt;In a traditional environment, the engineer would spend 30 to 60 minutes manually checking network policies, inspecting flow logs, and verifying namespace labels across multiple clusters to find the culprit. Every minute of manual investigation increases the risk of service downtime and customer frustration.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The AI Solution:&lt;/strong&gt; Instead of manual log diving, the engineer asks the AI Assistant for Calico a direct question: “Why is the frontend-service in the production namespace unable to reach the db-service?” The AI instantly analyzes the environment and identifies that a recent policy update is missing a necessary egress rule for the specific database port. Total resolution time is reduced from over an hour to just a few minutes.&lt;/p&gt;

&lt;p&gt;Thinking ahead, the engineer asks for an audit of all staged policies. AI Assistant for Calico finds another incorrect policy—this one with a misspelled label selector—averting a future outage.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://app.arcade.software/share/vnmgt3EfCjxX76D26z48" rel="noopener noreferrer"&gt;View Interactive Demo: Exploring Assistant for Calico →&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  A New Standard for Platform Operations
&lt;/h2&gt;

&lt;p&gt;The introduction of the AI Assistant for Calico in the &lt;a href="https://www.tigera.io/blog/whats-new-in-calico-winter-2026-release/" rel="noopener noreferrer"&gt;Winter 2026 release&lt;/a&gt; is the next step in observability and Kubernetes management. By adding the ability to interrogate a cluster in plain English, Calico’s unified platform bridges the gap between high-fidelity telemetry data and practical solutions.&lt;/p&gt;

&lt;p&gt;Beyond the immediate operational gains, this AI-powered approach fits into a broader strategy of defense in depth and operational simplicity, specifically regarding &lt;a href="https://www.tigera.io/blog/ingress-security-for-ai-workloads/" rel="noopener noreferrer"&gt;ingress security for AI workloads&lt;/a&gt;. It removes the friction of complex debugging, accelerates onboarding for new team members, and ensures that your security posture remains consistent even as your architecture scales.&lt;/p&gt;




&lt;h3&gt;
  
  
  Experience the Power of AI Assistant for Calico
&lt;/h3&gt;

&lt;p&gt;Ready to see how AI can accelerate your Kubernetes troubleshooting and network policy management?&lt;/p&gt;

&lt;p&gt;&lt;a href="https://www.tigera.io/event/calico-ai-accelerating-kubernetes-troubleshooting-and-network-policy-management/" rel="noopener noreferrer"&gt;Watch the On-Demand Demo&lt;/a&gt;&lt;br&gt;&lt;br&gt;
&lt;a href="https://www.calicocloud.io/home" rel="noopener noreferrer"&gt;Sign Up for Calico Cloud (Free Trial)&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The post &lt;a href="https://www.tigera.io/blog/ai-assistant-for-calico-troubleshooting-at-the-speed-of-thought/" rel="noopener noreferrer"&gt;AI Assistant for Calico: Troubleshooting at the Speed of Thought&lt;/a&gt; appeared first on &lt;a href="https://www.tigera.io" rel="noopener noreferrer"&gt;Tigera - Creator of Calico&lt;/a&gt;.&lt;/p&gt;

</description>
      <category>technicalblog</category>
      <category>bestpractices</category>
      <category>howto</category>
    </item>
    <item>
      <title>What Your EKS Flow Logs Aren’t Telling You</title>
      <dc:creator>Alister Baroi</dc:creator>
      <pubDate>Wed, 18 Mar 2026 21:06:48 +0000</pubDate>
      <link>https://dev.to/tigeraio/what-your-eks-flow-logs-arent-telling-you-50ca</link>
      <guid>https://dev.to/tigeraio/what-your-eks-flow-logs-arent-telling-you-50ca</guid>
      <description>&lt;p&gt;If you’re running workloads on Amazon EKS, there’s a good chance you already have some form of network observability in place. VPC Flow Logs have been a staple of AWS networking for years, and AWS has since introduced Container Network Observability, a newer set of capabilities built on Amazon CloudWatch Network Flow Monitor, which adds pod-level visibility and a service map directly in the EKS console.&lt;/p&gt;

&lt;p&gt;It’s a reasonable assumption that between these tools, you have solid visibility into what’s happening on your cluster’s network. But for teams focused on &lt;a href="https://www.tigera.io/learn/guides/kubernetes-security/" rel="noopener noreferrer"&gt;Kubernetes security&lt;/a&gt; and &lt;a href="https://www.tigera.io/blog/calico-whisker-staged-network-policies-secure-kubernetes-workloads-without-downtime/" rel="noopener noreferrer"&gt;policy enforcement&lt;/a&gt;, there’s a significant gap — and it’s not the one you might expect.&lt;/p&gt;

&lt;p&gt;In this post, we’ll break down exactly what EKS native observability gives you, where it falls short for security-focused use cases, and what Calico’s observability tools, Goldmane and Whisker, provide that you simply cannot get from AWS alone.&lt;/p&gt;

&lt;h2&gt;
  
  
  What EKS Gives You Out of the Box
&lt;/h2&gt;

&lt;p&gt;AWS offers two main sources of network observability for EKS clusters:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;VPC Flow Logs&lt;/strong&gt; capture IP traffic at the network interface level across your VPC. For each flow, you get source and destination IP addresses, ports, protocol, and whether traffic was accepted or rejected at the VPC level, by security groups and network ACLs. Useful for infrastructure-level visibility, but with no awareness of the Kubernetes layer.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Container Network Observability,&lt;/strong&gt; introduced more recently and powered by Amazon CloudWatch Network Flow Monitor, goes meaningfully further. Once you’ve installed the NFM agent as a DaemonSet and configured the required IAM permissions, Scope, and Monitor resources, you get access to:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Performance metrics&lt;/strong&gt; — pod and node-level metrics including ingress/egress flow counts, packet counts, bytes transferred, and bandwidth limit events, exposed in OpenMetrics format and scrapable by Prometheus&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;A service map&lt;/strong&gt; — a visualization of traffic between pods and deployments in the EKS console, showing retransmissions, retransmission timeouts, and data transferred between communicating workloads&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;A flow table&lt;/strong&gt; — a breakdown of top-talking workloads across three views: within the cluster (east-west), to AWS services (S3, DynamoDB), and to external destinations&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This is a genuinely capable performance observability tool. If your primary concern is understanding network throughput, identifying bandwidth hotspots, tracking cross-AZ traffic costs, or detecting retransmission anomalies, Container Network Observability gives you a solid foundation.&lt;/p&gt;

&lt;p&gt;But if your primary concern is &lt;a href="https://www.tigera.io/learn/guides/kubernetes-security/kubernetes-network-security/" rel="noopener noreferrer"&gt;Kubernetes network security&lt;/a&gt;, specifically understanding policy behavior, debugging denied connections, and moving toward a least-privilege posture, it leaves critical gaps.&lt;/p&gt;

&lt;h2&gt;
  
  
  What EKS Native Observability Doesn’t Tell You
&lt;/h2&gt;

&lt;p&gt;Understanding what EKS observability doesn’t show you is just as important as knowing what it does. Several gaps become significant once you’re actively managing network policies or investigating a security incident.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;No policy verdict context.&lt;/strong&gt; This is the most important gap. Neither VPC Flow Logs nor Container Network Observability have any awareness of Kubernetes network policies. If a Calico policy is denying traffic between two pods, you will not see that denial in AWS observability tooling. You’ll see a connection failing with no indication of which policy rule fired, which tier it belonged to, or whether the traffic was intentionally blocked or the result of a misconfiguration. For teams actively managing network policies, this makes AWS observability tools nearly useless for the most common debugging scenario you’ll face.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Performance metrics, not security metrics.&lt;/strong&gt; The flow-level metrics in Container Network Observability (retransmissions, retransmission timeouts, and bytes transferred) are designed to answer performance questions. They are not designed to answer security questions like: which namespaces are communicating that shouldn’t be, which egress destinations are being reached, or which policy rules are being evaluated for a given flow. These are fundamentally different observability needs, and AWS’s tooling is built for the former.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Top 500 flows only, over a 1-hour window.&lt;/strong&gt; The NFM agent collects the top 500 network flows by volume every 30 seconds, and the console visualizations are scoped to a 1-hour time range. For security investigations, this matters: less frequent or lower-volume connections — exactly the kind that might indicate lateral movement or exfiltration — may not appear in the top 500 and will be invisible to the service map and flow table.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;No namespace-level policy context.&lt;/strong&gt; While the service map does show pod and deployment-level topology, it shows you traffic volume and performance — not whether that traffic is authorized by your network policies, which policies evaluated it, or whether any of it should be blocked. Understanding the security posture of your namespace boundaries requires a different layer of data entirely.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Setup complexity.&lt;/strong&gt; Enabling Container Network Observability requires installing the NFM agent add-on, configuring IAM permissions with Pod Identity or IRSA, and creating NFM Scope and Monitor resources either through the console, AWS CLI, or Terraform. For teams managing this with IaC, that means defining additional resource dependencies and managing the Terraform AWS Provider version requirements. It’s not prohibitively complex, but it’s meaningful infrastructure to own and maintain.&lt;/p&gt;

&lt;h2&gt;
  
  
  What Calico Adds: Goldmane and Whisker
&lt;/h2&gt;

&lt;p&gt;Calico’s observability capabilities are built on two components introduced in Calico 3.30: &lt;a href="https://www.tigera.io/blog/calico-open-source-3-30-exploring-the-goldmane-api-for-custom-kubernetes-network-observability/" rel="noopener noreferrer"&gt;Goldmane&lt;/a&gt;, a flow log API that generates enriched, Kubernetes-native flow data, and &lt;a href="https://www.tigera.io/blog/calico-whisker-your-new-ally-in-network-observability/" rel="noopener noreferrer"&gt;Whisker&lt;/a&gt;, a web-based UI for visualizing and filtering that data in real time. Together they give you a fundamentally different class of observability — one built specifically for the Kubernetes security layer.&lt;/p&gt;

&lt;h3&gt;
  
  
  Goldmane: Flow Logs That Speak Kubernetes Security
&lt;/h3&gt;

&lt;p&gt;Where AWS Container Network Observability speaks in performance metrics, Goldmane speaks in Kubernetes policy context. Every flow log entry generated by Goldmane includes:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Source and destination namespace, pod name, and deployment — Kubernetes identity is always present, regardless of IP churn&lt;/li&gt;
&lt;li&gt;Service names — traffic is attributed to the service it passed through, not just the backend pod IP&lt;/li&gt;
&lt;li&gt;Policy verdicts — each flow includes which Calico policy rule evaluated it, whether the action was Allow or Deny, and which tier the policy belonged to&lt;/li&gt;
&lt;li&gt;Port, protocol, and domain information — including DNS-based destinations for egress traffic&lt;/li&gt;
&lt;/ul&gt;
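
&lt;p&gt;As a rough illustration of how those fields come together, a single denied flow might be represented along the lines of the sketch below. The field names are simplified for readability and are not the literal Goldmane API schema; treat it as a mental model rather than a contract.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;# Simplified, illustrative flow record; not the exact Goldmane schema.
source:
  namespace: frontend
  name: web-6f7c9d-*
dest:
  namespace: backend
  name: db-0
  service: db-service
protocol: TCP
dest_port: 5432
action: Deny                    # policy verdict for this flow
policies:
- tier: security
  name: security.default-deny   # the rule that fired
  action: Deny
&lt;/code&gt;&lt;/pre&gt;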

&lt;p&gt;The policy verdict data is what changes the debugging experience most fundamentally. When a network policy misconfiguration breaks Prometheus scraping, blocks a health check probe, or silently drops traffic between namespaces — scenarios that are routine for any team actively managing network policies — Goldmane tells you exactly which rule fired and why. You’re not correlating IP addresses and timestamps across multiple tools; the answer is in the flow log.&lt;/p&gt;
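
&lt;p&gt;To make that concrete, here is roughly what a single denied flow looks like once Goldmane’s context is attached. The record below is illustrative only; the field names are paraphrased from the list above rather than copied from Goldmane’s actual schema. The point is that workload identity and the policy verdict arrive in the same entry, so there is nothing to correlate.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;# Illustrative flow record (field names paraphrased, not Goldmane's exact schema)
source:
  namespace: monitoring
  pod: prometheus-0
destination:
  namespace: payments
  service: payments-api
  pod: payments-api-6d9f7c
port: 9090
protocol: tcp
policy:
  action: Deny                   # the verdict that broke the scrape
  tier: platform                 # hypothetical tier name
  rule: platform.default-deny    # hypothetical rule that fired
reporter: destination
&lt;/code&gt;&lt;/pre&gt;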

&lt;p&gt;Goldmane exposes its data via a gRPC API, making it straightforward to consume from your existing observability stack, whether that’s Elasticsearch, Grafana, or a custom pipeline. It covers all flows in your cluster, not just the top 500 by volume.&lt;/p&gt;

&lt;h3&gt;
  
  
  Whisker: Real-Time Policy Visibility Without Additional Infrastructure
&lt;/h3&gt;



&lt;p&gt;Whisker is a lightweight web console that surfaces Goldmane’s flow data without requiring any additional tooling. You can filter flows by namespace, pod, policy verdict, or direction, and see in real time which traffic is being allowed and denied across your cluster.&lt;/p&gt;

&lt;p&gt;For teams moving from a default-allow posture toward namespace isolation or zero trust, Whisker is particularly valuable during the transition: you can watch policy verdicts update live as you apply and adjust rules, rather than inferring policy behavior from downstream signals like application errors and health check failures.&lt;/p&gt;
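
&lt;p&gt;As a concrete example of the kind of change you would watch land in Whisker, the manifest below is a standard Kubernetes default-deny ingress policy for a single namespace (the namespace name is hypothetical). Apply it, filter Whisker to that namespace, and you can see immediately which flows flip from Allow to Deny.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;# Standard Kubernetes default-deny ingress policy for one namespace.
# The namespace name is hypothetical; point it at the namespace you are isolating.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-ingress
  namespace: payments
spec:
  podSelector: {}        # selects every pod in the namespace
  policyTypes:
    - Ingress            # no ingress rules are defined, so all ingress is denied
&lt;/code&gt;&lt;/pre&gt;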

&lt;p&gt;Whisker is included in &lt;a href="https://www.tigera.io/blog/introducing-calico-3-30-a-new-era-of-open-source-network-security-and-observability-for-kubernetes/" rel="noopener noreferrer"&gt;Calico Open Source as of 3.30.&lt;/a&gt; Access it via a local port-forward — no agent &lt;code&gt;DaemonSet&lt;/code&gt; configuration, no IAM policies, no cloud service dependencies required.&lt;/p&gt;

&lt;h2&gt;
  
  
  Going Further: Calico Cloud Free Tier
&lt;/h2&gt;

&lt;p&gt;Goldmane and Whisker give you a significantly richer observability foundation for security and troubleshooting than AWS native tooling. If you want to go further, &lt;a href="https://www.tigera.io/blog/a-detailed-look-at-calico-cloud-free-tier/" rel="noopener noreferrer"&gt;Calico Cloud’s free tier&lt;/a&gt; adds a hosted experience that requires no additional infrastructure to operate.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://www.tigera.io/app/uploads/2026/03/image1.png" rel="noopener noreferrer"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fon1oeyyk072drgezq4wb.png" alt="Visualizing Security Posture with Calico Cloud Service Graph" width="800" height="461"&gt;&lt;/a&gt;&lt;br&gt;
&lt;em&gt;The Calico Cloud Service Graph provides a live, visual map of communication between namespaces, services, and pods.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Connecting your EKS cluster to Calico Cloud gives you access to the Service Graph, which provides a live visual map of how your namespaces, services, and pods are communicating, overlaid with Calico policy evaluation data. Unlike the AWS console service map, which surfaces performance metrics for your top flows, the Calico Cloud Service Graph shows you the security posture of your traffic: which connections are authorized, which are being denied, and where your policy coverage has gaps. Teams that see it for the first time consistently describe it as the moment their cluster’s network finally became legible from a security perspective.&lt;/p&gt;

&lt;p&gt;The free tier also includes the policy recommendation engine, which analyzes your cluster’s actual traffic patterns and automatically generates staged network policies to implement namespace isolation. Staged policies let you audit the recommended rules and see exactly which traffic they would allow and deny before you enforce them. It’s the fastest path from a default-allow EKS cluster to one where every namespace is isolated and secured.&lt;/p&gt;
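
&lt;p&gt;Staged policies use Calico’s own policy kinds, so a recommended rule can live in the cluster and report verdicts without enforcing them. The manifest below is a minimal hand-written sketch of that idea, assuming the projectcalico.org/v3 &lt;code&gt;StagedNetworkPolicy&lt;/code&gt; kind; the namespace, selectors, and rule are invented for illustration, and the recommendation engine’s generated policies will be richer than this.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;# Minimal sketch of a staged (audit-only) Calico policy; namespace, selectors,
# and the single rule are illustrative, not output from the recommendation engine.
apiVersion: projectcalico.org/v3
kind: StagedNetworkPolicy
metadata:
  name: isolate-payments
  namespace: payments
spec:
  selector: all()                # every workload in the namespace
  types:
    - Ingress
  ingress:
    - action: Allow
      protocol: TCP
      source:
        namespaceSelector: projectcalico.org/name == "frontend"
      destination:
        ports:
          - 8443
# Review the reported verdicts, then convert to an enforced policy when satisfied.
&lt;/code&gt;&lt;/pre&gt;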

&lt;p&gt;Calico Cloud’s free tier is genuinely free, with no sales engagement required. It supports a single cluster with 24-hour data retention — enough to experience the Service Graph and understand what your cluster’s traffic actually looks like from a security perspective.&lt;/p&gt;




&lt;h2&gt;
  
  
  A Quick Comparison
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Feature&lt;/th&gt;
&lt;th&gt;VPC Flow Logs&lt;/th&gt;
&lt;th&gt;EKS Container Network Observability&lt;/th&gt;
&lt;th&gt;Calico Open Source (Goldmane + Whisker)&lt;/th&gt;
&lt;th&gt;Calico Cloud Free Tier&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Pod / namespace identity&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;✗&lt;/td&gt;
&lt;td&gt;✓&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;small&gt;(deployment/pod view)&lt;/small&gt;&lt;/td&gt;
&lt;td&gt;✓&lt;/td&gt;
&lt;td&gt;✓&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Service-level visibility&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;✗&lt;/td&gt;
&lt;td&gt;✓&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;small&gt;(service map)&lt;/small&gt;&lt;/td&gt;
&lt;td&gt;✓&lt;/td&gt;
&lt;td&gt;✓&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Network performance metrics&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Partial&lt;/td&gt;
&lt;td&gt;✓&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;small&gt;(RT, RTO, bytes)&lt;/small&gt;&lt;/td&gt;
&lt;td&gt;✗&lt;/td&gt;
&lt;td&gt;✗&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Calico policy verdict (allow/deny + which rule)&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;✗&lt;/td&gt;
&lt;td&gt;✗&lt;/td&gt;
&lt;td&gt;✓&lt;/td&gt;
&lt;td&gt;✓&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;All flows (not just top N by volume)&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;✓&lt;/td&gt;
&lt;td&gt;✗&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;small&gt;(top 500)&lt;/small&gt;&lt;/td&gt;
&lt;td&gt;✓&lt;/td&gt;
&lt;td&gt;✓&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Security posture / policy gap visibility&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;✗&lt;/td&gt;
&lt;td&gt;✗&lt;/td&gt;
&lt;td&gt;✓&lt;/td&gt;
&lt;td&gt;✓&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Real-time policy visualization&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;✗&lt;/td&gt;
&lt;td&gt;✗&lt;/td&gt;
&lt;td&gt;✓&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;small&gt;(Whisker)&lt;/small&gt;&lt;/td&gt;
&lt;td&gt;✓&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;small&gt;(Service Graph)&lt;/small&gt;&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Policy recommendations&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;✗&lt;/td&gt;
&lt;td&gt;✗&lt;/td&gt;
&lt;td&gt;✗&lt;/td&gt;
&lt;td&gt;✓&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Setup complexity&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Low&lt;/td&gt;
&lt;td&gt;Medium&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;small&gt;(NFM agent, IAM)&lt;/small&gt;&lt;/td&gt;
&lt;td&gt;Low&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;small&gt;(port-forward)&lt;/small&gt;&lt;/td&gt;
&lt;td&gt;Low&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;small&gt;(single manifest)&lt;/small&gt;&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h3&gt;
  
  
  Sign up for the free tier
&lt;/h3&gt;

&lt;p&gt;Everything in the two right-hand columns of the comparison above is available today: Goldmane and Whisker ship with Calico Open Source 3.30+, and the Calico Cloud free tier adds the hosted Service Graph and policy recommendations on top, with nothing extra to run.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://calicocloud.io" rel="noopener noreferrer"&gt;Sign up at Calico Cloud and connect your EKS cluster in under 20 minutes&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;AWS Container Network Observability is a meaningful improvement over VPC Flow Logs and a genuinely useful tool for understanding network performance in your EKS environment. If you’re tracking retransmissions, monitoring cross-AZ traffic, or trying to identify bandwidth hotspots, it’s worth enabling.&lt;/p&gt;

&lt;p&gt;But it was designed for performance observability, not security observability. It has no awareness of &lt;a href="https://www.tigera.io/learn/guides/kubernetes-security/kubernetes-network-policy/" rel="noopener noreferrer"&gt;Kubernetes network policy&lt;/a&gt; behavior, no policy verdict data, and no visibility into whether your namespace boundaries are being respected. For teams actively managing network policies or trying to move toward a least-privilege security posture, these are not minor gaps.&lt;/p&gt;

&lt;p&gt;Goldmane and Whisker, available today in Calico 3.30+, fill exactly those gaps. They’re purpose-built for the Kubernetes security layer and give every EKS operator richer policy-level observability at no cost. If you want to go further and have a live service graph that surfaces policy context, hosted dashboards, and automated policy recommendations, Calico Cloud’s free tier is the next step.&lt;/p&gt;

&lt;p&gt;The post &lt;a href="https://www.tigera.io/blog/what-your-eks-flow-logs-arent-telling-you/" rel="noopener noreferrer"&gt;What Your EKS Flow Logs Aren’t Telling You&lt;/a&gt; appeared first on &lt;a href="https://www.tigera.io" rel="noopener noreferrer"&gt;Tigera - Creator of Calico&lt;/a&gt;.&lt;/p&gt;

</description>
      <category>technicalblog</category>
      <category>howto</category>
    </item>
  </channel>
</rss>
