<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Tigera Inc</title>
    <description>The latest articles on DEV Community by Tigera Inc (@tigeraio).</description>
    <link>https://dev.to/tigeraio</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Forganization%2Fprofile_image%2F12572%2Fe692e88e-7a1e-49d5-870b-930d459570c0.png</url>
      <title>DEV Community: Tigera Inc</title>
      <link>https://dev.to/tigeraio</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/tigeraio"/>
    <language>en</language>
    <item>
      <title>Five Principles of an Accountable AI Agent Network: How to Evaluate Any Governance Platform</title>
      <dc:creator>Alister Baroi</dc:creator>
      <pubDate>Wed, 10 Jun 2026 20:19:08 +0000</pubDate>
      <link>https://dev.to/tigeraio/five-principles-of-an-accountable-ai-agent-network-how-to-evaluate-any-governance-platform-2jcm</link>
      <guid>https://dev.to/tigeraio/five-principles-of-an-accountable-ai-agent-network-how-to-evaluate-any-governance-platform-2jcm</guid>
      <description>&lt;p&gt;The &lt;a href="https://www.tigera.io/blog/the-ai-agent-accountability-crisis-why-governance-isnt-keeping-up-with-deployment/" rel="noopener noreferrer"&gt;first post&lt;/a&gt; in this series argued that AI agent governance hasn’t kept pace with deployment. The &lt;a href="https://www.tigera.io/blog/the-five-pillars-of-ai-agent-accountability-a-diagnostic-framework-for-engineering-leaders/" rel="noopener noreferrer"&gt;second&lt;/a&gt; laid out the five pillars of accountability, and what is required. The &lt;a href="https://www.tigera.io/blog/the-ai-agent-accountability-gap-why-network-policies-api-gateways-and-rbac-are-not-enough/" rel="noopener noreferrer"&gt;third&lt;/a&gt; walked through why network policies, API gateways, MCP/A2A protocols, DIY security patterns, and Role-based Access Control (RBAC) each leave critical accountability gaps.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;So what does good look like?&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;The five pillars define &lt;strong&gt;what&lt;/strong&gt; &lt;a href="https://www.tigera.io/blog/your-ai-agents-are-autonomous-but-are-they-accountable/" rel="noopener noreferrer"&gt;AI agent accountability&lt;/a&gt; requires. The principles below define &lt;strong&gt;how&lt;/strong&gt; a governance platform should deliver it. These are the architectural principles your team should evaluate any AI agent governance solution against, whether you build it, buy it, or assemble it from open-source components.&lt;/p&gt;

&lt;p&gt;If a vendor pitches you a governance platform that fails any of these five, walk away.&lt;/p&gt;

&lt;h2&gt;
  
  
  What are the five principles of an accountable AI agent network?
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://www.tigera.io/learn/guides/kubernetes-security/kubernetes-network-policy/" rel="noopener noreferrer"&gt;Kubernetes Network Policies&lt;/a&gt; are essential for securing any cluster. They restrict which pods can communicate with which other pods at the network level, and they should absolutely be part of your security posture. On their own, though, they leave the accountability gaps described in the third post of this series, which is where the five principles come in:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Default-deny:&lt;/strong&gt; No agent communicates unless a policy explicitly permits it.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Attribute-based policy:&lt;/strong&gt; Policies reference agent attributes, not agent names.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Zero-trust identity:&lt;/strong&gt; Every request authenticated, every identity verified.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Audit by design:&lt;/strong&gt; Every interaction produces a structured, correlated trace automatically.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Kubernetes-native:&lt;/strong&gt; The platform extends your existing infrastructure rather than replacing it.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Each principle below explains why it matters and what a passing solution looks like.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fyjx576wo13n6xvjgaq2b.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fyjx576wo13n6xvjgaq2b.png" width="800" height="130"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Use the five principles as a checklist when evaluating any governance platform. Fail any one, and the platform is one missing principle away from security theater.&lt;/p&gt;

&lt;h3&gt;
  
  
  Principle 1: Default-deny
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;No agent communicates with any other agent unless explicitly permitted by policy.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;This is the only safe starting posture for accountability. If your governance layer defaults to &lt;em&gt;allowing&lt;/em&gt; communication and only blocks what’s explicitly forbidden, every interaction you didn’t anticipate is ungoverned, and you can’t be accountable for what you didn’t authorize.&lt;/p&gt;

&lt;p&gt;Default-deny flips the model: nothing is allowed until a policy explicitly permits it. Every allowed interaction is intentional, traceable, and auditable. New agents are isolated by default until policies are written to grant them access, which is exactly the behavior you want in a governed network.&lt;/p&gt;

&lt;p&gt;Default-deny seems restrictive, but in practice it’s liberating. Your security team doesn’t have to anticipate every possible &lt;strong&gt;&lt;em&gt;bad&lt;/em&gt;&lt;/strong&gt; interaction. They only have to define the &lt;strong&gt;&lt;em&gt;good&lt;/em&gt;&lt;/strong&gt; ones.&lt;/p&gt;
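&lt;p&gt;To make the contrast concrete, here is a minimal Python sketch of the two postures. It is an illustration only: the capability labels and the two helper functions are hypothetical and are not taken from any particular platform.&lt;/p&gt;

```python
def default_allow(pair, blocked):
    """Default-allow: anything not explicitly blocked goes through."""
    return pair not in blocked


def default_deny(pair, permitted):
    """Default-deny: only explicitly permitted pairs go through."""
    return pair in permitted


# Each pair is (caller capability, callee capability); labels are made up.
blocked = {("intern-agent", "payments")}
permitted = {("financial-analysis", "compliance-query")}

# An interaction nobody anticipated when either list was written.
unanticipated = ("new-agent", "payments")

print(default_allow(unanticipated, blocked))   # allowed by omission
print(default_deny(unanticipated, permitted))  # isolated until a policy exists
```

&lt;p&gt;The same unanticipated call passes under the blocklist and is refused under the allowlist, which is the whole argument for starting from deny.&lt;/p&gt;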

&lt;h3&gt;
  
  
  Principle 2: Attribute-based policy
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Policies should reference agent attributes, not agent names.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Hardcoding agent names in policies creates a governance system that breaks every time you add or rename an agent. It’s the equivalent of maintaining a firewall with hundreds of IP-based rules instead of using network segments.&lt;/p&gt;

&lt;p&gt;Attribute-based policies reference properties like capabilities, risk levels, team ownership, and environment tags. Instead of &lt;em&gt;“Agent-Finance-v2 can call Agent-Compliance-v3,”&lt;/em&gt; the policy says &lt;em&gt;“Agents with capability=financial-analysis can call agents with capability=compliance-query.”&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;This approach has a powerful scaling property: when a new agent registers with matching attributes, existing policies apply automatically. The governance grows with the agent network, not against it. A team deploying a new agent doesn’t need to file a ticket to update allow-lists; they describe the agent’s attributes at registration time, and the policy engine handles the rest.&lt;/p&gt;

&lt;p&gt;This is the principle that separates a security model that survives at 10 agents from one that survives at 1,000.&lt;/p&gt;
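&lt;p&gt;A rough sketch of that scaling property in Python follows. The &lt;code&gt;Agent&lt;/code&gt; and &lt;code&gt;Policy&lt;/code&gt; classes and the agent name &lt;code&gt;Agent-Finance-v9&lt;/code&gt; are assumptions made for illustration; only the capability values come from the example above.&lt;/p&gt;

```python
from dataclasses import dataclass, field


@dataclass
class Agent:
    name: str
    attributes: dict[str, str] = field(default_factory=dict)


@dataclass(frozen=True)
class Policy:
    """Permit callers with one capability to call agents with another."""
    caller_capability: str
    callee_capability: str

    def permits(self, caller: Agent, callee: Agent) -> bool:
        return (
            caller.attributes.get("capability") == self.caller_capability
            and callee.attributes.get("capability") == self.callee_capability
        )


def is_allowed(caller: Agent, callee: Agent, policies: list[Policy]) -> bool:
    # Default-deny: with no matching policy the answer is False.
    return any(p.permits(caller, callee) for p in policies)


policies = [Policy("financial-analysis", "compliance-query")]
compliance = Agent("Agent-Compliance-v3", {"capability": "compliance-query"})

# A newly registered agent is covered without any policy change,
# because the policy never mentions agent names.
newcomer = Agent("Agent-Finance-v9", {"capability": "financial-analysis"})
print(is_allowed(newcomer, compliance, policies))
```

&lt;p&gt;Renaming or replacing an agent changes nothing in the policy set; only a change in attributes changes the decision.&lt;/p&gt;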

&lt;h3&gt;
  
  
  Principle 3: Zero-trust identity
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Every request authenticated. Every identity verified. Trust nothing by default.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Agent networks are susceptible to the same identity threats as any distributed system: spoofing, replay attacks, credential theft. But agents add a unique challenge: they operate on behalf of users. This means both the &lt;strong&gt;workload identity&lt;/strong&gt; (is this actually the agent it claims to be?) and the &lt;strong&gt;user identity&lt;/strong&gt; (on whose behalf is this agent acting?) must be verified.&lt;/p&gt;

&lt;p&gt;A governance platform should support &lt;strong&gt;dual authentication&lt;/strong&gt;: cryptographic workload identity (proving the agent is genuine) and token-based user identity (establishing who triggered the action). Both identities should be available for policy evaluation and audit logging.&lt;/p&gt;

&lt;p&gt;Short-lived credentials, automatic rotation, and cryptographic verification should be standard, not optional add-ons. Static API keys and long-lived tokens are liabilities in an agent network; compromised credentials can enable automated lateral movement at machine speed.&lt;/p&gt;
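&lt;p&gt;The sketch below shows the shape of a dual check with short-lived credentials. It is deliberately simplified and makes loud assumptions: HMAC with demo keys stands in for real cryptographic workload identity (such as X.509 or SPIFFE) and for tokens from an OIDC provider, and every function name here is hypothetical.&lt;/p&gt;

```python
import hashlib
import hmac
import time

# Hypothetical demo keys; a real deployment would not use shared secrets
# like these for workload or user identity.
WORKLOAD_KEY = b"workload-demo-key"
USER_KEY = b"user-demo-key"


def issue(subject: str, key: bytes, ttl_seconds: int, now: float) -> dict:
    """Issue a short-lived credential: subject, expiry, and a signature."""
    expires_at = now + ttl_seconds
    message = f"{subject}|{expires_at}".encode()
    signature = hmac.new(key, message, hashlib.sha256).hexdigest()
    return {"subject": subject, "expires_at": expires_at, "signature": signature}


def verify(credential: dict, key: bytes, now: float) -> bool:
    """Accept only an unexpired credential whose signature checks out."""
    message = f"{credential['subject']}|{credential['expires_at']}".encode()
    expected = hmac.new(key, message, hashlib.sha256).hexdigest()
    fresh = credential["expires_at"] > now
    return fresh and hmac.compare_digest(expected, credential["signature"])


def authenticate(workload: dict, user: dict, now: float):
    """Return (agent, user) only when BOTH identities verify, else None."""
    if verify(workload, WORKLOAD_KEY, now) and verify(user, USER_KEY, now):
        return workload["subject"], user["subject"]
    return None


now = time.time()
workload = issue("agent/finance-analyst", WORKLOAD_KEY, ttl_seconds=300, now=now)
user = issue("user/alice", USER_KEY, ttl_seconds=300, now=now)
print(authenticate(workload, user, now))        # both identities verified
print(authenticate(workload, user, now + 600))  # credentials expired
```

&lt;p&gt;Returning both subjects together is the point: policy evaluation and audit logging then see the agent and the user it acts for, never just one of them.&lt;/p&gt;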

&lt;h3&gt;
  
  
  Principle 4: Audit by design, not by afterthought
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Every interaction produces a structured, correlated trace automatically.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;If your team has to &lt;em&gt;add&lt;/em&gt; logging after the fact, you’ve already lost accountability. Audit records should be a &lt;strong&gt;byproduct of the governance layer’s enforcement&lt;/strong&gt;, not a separate system bolted on later.&lt;/p&gt;

&lt;p&gt;When the governance layer evaluates a policy and permits (or denies) an agent interaction, that evaluation &lt;em&gt;is&lt;/em&gt; the audit record. It captures: who called whom, what policy was evaluated, what the decision was, what attributes matched, and when it happened. These records should be:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Structured&lt;/strong&gt; (not free-text logs),&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Correlated&lt;/strong&gt; across multi-hop chains (using distributed trace IDs),&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Queryable&lt;/strong&gt; by agent, by policy, by time range, by outcome.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The practical implication: the audit trail should be a &lt;strong&gt;first-class product&lt;/strong&gt; of the governance platform, not a configuration option. If you have to enable it, someone will forget. If it’s built in, it’s always there.&lt;/p&gt;
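&lt;p&gt;One way to picture “the evaluation is the audit record” is a decision function that cannot return without writing a structured entry. This is a sketch under assumptions: the in-memory &lt;code&gt;AUDIT_LOG&lt;/code&gt;, the record fields, and the policy ID are all hypothetical.&lt;/p&gt;

```python
import json
import uuid
from datetime import datetime, timezone

AUDIT_LOG: list[dict] = []  # stand-in for a real audit store


def evaluate(caller: dict, callee: dict, policies: dict, trace_id: str) -> bool:
    """Decide and record in one step; there is no un-audited code path."""
    matched = [
        policy_id
        for policy_id, (src, dst) in policies.items()
        if caller["capability"] == src and callee["capability"] == dst
    ]
    allowed = bool(matched)
    AUDIT_LOG.append({
        "trace_id": trace_id,
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "caller": caller["name"],
        "callee": callee["name"],
        "matched_policies": matched,
        "decision": "permit" if allowed else "deny",
    })
    return allowed


policies = {"fin-to-compliance": ("financial-analysis", "compliance-query")}
finance = {"name": "finance-agent", "capability": "financial-analysis"}
compliance = {"name": "compliance-agent", "capability": "compliance-query"}

trace_id = str(uuid.uuid4())  # shared by every hop in one request chain
evaluate(finance, compliance, policies, trace_id)
evaluate(compliance, finance, policies, trace_id)  # no policy, so denied

# Query by outcome: every denial recorded for this trace.
denials = [r for r in AUDIT_LOG if r["trace_id"] == trace_id and r["decision"] == "deny"]
print(json.dumps(denials, indent=2))
```

&lt;p&gt;Because the record is written inside the decision path, there is nothing to enable and nothing to forget; the shared trace ID is what lets a multi-hop chain be reassembled later.&lt;/p&gt;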

&lt;h3&gt;
  
  
  Principle 5: Kubernetes-native
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;The governance layer should work with your existing infrastructure, not replace it.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Enterprises have invested heavily in Kubernetes, Helm charts, GitOps pipelines, RBAC, namespaces, and observability stacks. An AI agent governance platform that requires a separate control plane, its own deployment model, or a proprietary orchestration layer will face adoption resistance and operational overhead.&lt;/p&gt;

&lt;p&gt;The governance platform should be deployable via Helm, manageable via CRDs, observable (e.g. via Prometheus or OpenTelemetry), and compatible with existing identity infrastructure (OIDC providers, SPIFFE/SPIRE). It should feel like a natural extension of the Kubernetes platform, not a foreign system that happens to run on it.&lt;/p&gt;

&lt;p&gt;This isn’t just about developer experience. It’s about &lt;strong&gt;operational sustainability&lt;/strong&gt;. If the governance platform requires specialized skills your platform team doesn’t have, it will become a bottleneck instead of an enabler.&lt;/p&gt;

&lt;h2&gt;
  
  
  How the principles reinforce each other
&lt;/h2&gt;

&lt;p&gt;These five principles aren’t independent. They reinforce each other:&lt;/p&gt;

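&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Principle&lt;/th&gt;
&lt;th&gt;What it enables&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Default-deny&lt;/td&gt;
&lt;td&gt;Provenance; every allowed interaction was explicitly authorized&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Attribute-based policy&lt;/td&gt;
&lt;td&gt;Governance at scale; authorization grows with the network&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Zero-trust identity&lt;/td&gt;
&lt;td&gt;Trust in audit records; every participant is verified&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Audit by design&lt;/td&gt;
&lt;td&gt;Traceability and compliance; every decision is recorded&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Kubernetes-native&lt;/td&gt;
&lt;td&gt;Adoption; the platform integrates with existing infrastructure&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;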

&lt;p&gt;When evaluating governance solutions, test each principle:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;If a solution requires you to default to allowing communication and only block specific interactions, &lt;strong&gt;it fails Principle 1.&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;If it requires naming agents in policies, &lt;strong&gt;it fails Principle 2.&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;If it relies on static API keys or long-lived tokens, &lt;strong&gt;it fails Principle 3.&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;If it doesn’t produce correlated audit trails automatically, &lt;strong&gt;it fails Principle 4.&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;If it needs its own control plane outside Kubernetes, &lt;strong&gt;it fails Principle 5.&lt;/strong&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The right solution delivers all five. Because accountability requires nothing less.&lt;/p&gt;

&lt;h2&gt;
  
  
  Frequently asked questions
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;What’s the difference between default-deny and zero-trust?&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
Default-deny is a policy posture — no communication unless explicitly permitted. Zero-trust is an identity posture — every identity must be verified, every time. They reinforce each other but aren’t interchangeable. A platform with zero-trust identity but default-allow policy is still ungoverned.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Why does Kubernetes-native matter for AI agent accountability?&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
Because adoption is the difference between a governance platform that works and one that gets shelved. If your platform team has to learn a new control plane, run a parallel deployment pipeline, or operate a proprietary policy engine, the governance layer becomes a bottleneck — and ungoverned agents start showing up because the official path is too slow.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Can I build this myself with SPIFFE, OPA, and OpenTelemetry?&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
Technically yes. Practically, you’ll spend 6–12 months on the &lt;a href="https://www.tigera.io/blog/calculating-the-kubernetes-integration-tax-what-your-diy-networking-stack-actually-costs/" rel="noopener noreferrer"&gt;integration glue&lt;/a&gt;, audit correlation across multi-hop chains, dual identity verification, attribute-based policy modeling, and the human oversight surface. We covered the build-vs-buy tradeoff in &lt;a href="https://www.tigera.io/blog/the-ai-agent-accountability-gap-why-network-policies-api-gateways-and-rbac-are-not-enough/#diy-security-patterns-four-tools-no-unified-policy-layer" rel="noopener noreferrer"&gt;post 3 of this series&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Are these principles specific to Tigera Lynx?&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
No. These are architectural principles for any accountable agent governance platform — whether commercial, open source, or homegrown. We use them ourselves to evaluate Lynx, and we’d encourage you to use them to evaluate every option you consider.&lt;/p&gt;

&lt;h2&gt;
  
  
  Key takeaways
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Default-deny&lt;/strong&gt; is the only safe starting posture. Anything else leaves ungoverned interactions.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Attribute-based policy&lt;/strong&gt; is the principle that lets governance scale past 100 agents.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Zero-trust identity&lt;/strong&gt; must verify both the workload (is this the right agent?) and the user (on whose behalf is it acting?).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Audit by design&lt;/strong&gt; means audit records are a byproduct of enforcement, not a separate system.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Kubernetes-native&lt;/strong&gt; ensures the platform actually gets adopted instead of bypassed.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Get the strategic guide for accountable AI agents
&lt;/h2&gt;

&lt;p&gt;We wrote a strategic guide, &lt;a href="https://info.tigera.io/rs/805-GFH-732/images/Whitepaper_Accountability_for_AI_Agents.pdf" rel="noopener noreferrer"&gt;Accountable AI Agents: A Strategic Guide for AI &amp;amp; Security Leaders Governing Autonomous AI at Scale&lt;/a&gt;, that walks through these principles in depth, including a side-by-side comparison of common governance approaches and how they score against each principle.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://info.tigera.io/rs/805-GFH-732/images/Whitepaper_Accountability_for_AI_Agents.pdf" rel="noopener noreferrer"&gt;Get the strategic guide for accountable AI agents →&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The post &lt;a href="https://www.tigera.io/blog/five-principles-of-an-accountable-ai-agent-network-how-to-evaluate-any-governance-platform/" rel="noopener noreferrer"&gt;Five Principles of an Accountable AI Agent Network: How to Evaluate Any Governance Platform&lt;/a&gt; appeared first on &lt;a href="https://www.tigera.io" rel="noopener noreferrer"&gt;Tigera – Creator of Calico&lt;/a&gt;.&lt;/p&gt;

</description>
      <category>technicalblog</category>
      <category>aiagentsecurity</category>
      <category>products</category>
    </item>
    <item>
      <title>Multi-Layer Policy for Securing AI Agents</title>
      <dc:creator>Alister Baroi</dc:creator>
      <pubDate>Wed, 03 Jun 2026 18:58:09 +0000</pubDate>
      <link>https://dev.to/tigeraio/multi-layer-policy-for-securing-ai-agents-4h95</link>
      <guid>https://dev.to/tigeraio/multi-layer-policy-for-securing-ai-agents-4h95</guid>
      <description>&lt;p&gt;As part of our work at Tigera building products that create secure runtime environments for enterprise agents at scale in the real world, one small part of this puzzle I think about a lot is policy, and runtime enforcement of policy, and how to create a comprehensive secure runtime, configured from one place. The more companies we talk to trying to lock down and secure these platforms at runtime, the more I believe &lt;a href="https://www.tigera.io/learn/guides/ai-agent-security/" rel="noopener noreferrer"&gt;AI Agent security&lt;/a&gt; needs policy in multiple places, not just one (e.g., not just at the gateway layer), and ideally expressed in the same policy language.&lt;/p&gt;

&lt;p&gt;At the L7 gateway layer, every agent call is observable: who is calling, what they are calling, what attributes both sides carry, what the requested action is. This is where you decide whether an agent should be permitted to talk to a particular MCP server, invoke a particular tool, delegate to another agent, or call a particular LLM. The atoms of policy here are identity, action, resource, and context.&lt;/p&gt;

&lt;p&gt;At the agent runtime layer, or kernel layer in a container, what the agent does inside its own runtime is observable: syscalls, file access, library loads, network connections that bypass the brokered channel. This is where you decide whether the agent can read a file, open a socket, spawn a subprocess, or load a library. The atoms of policy here are processes, paths, file descriptors, and system calls.&lt;/p&gt;

&lt;p&gt;Both layers are necessary. The gateway alone cannot constrain what an agent does inside its runtime once it holds a token. The kernel alone cannot reason about identity, delegation, or multi-hop intent. Building policy at one and not the other leaves a category gap.&lt;/p&gt;

&lt;p&gt;The architectural choice that makes this work in practice is using one policy language for both. We use Cedar at the gateway and interpret and translate Cedar to &lt;a href="https://docs.tigera.io/calico/latest/about/kubernetes-training/about-ebpf" rel="noopener noreferrer"&gt;eBPF&lt;/a&gt; policy for the kernel: same policies, two enforcement points, one place to author and review.&lt;/p&gt;

&lt;h2&gt;
  
  
  Policy at the gateway: enforcing agent intent
&lt;/h2&gt;

&lt;p&gt;The gateway sees intent. It is the right place to enforce &lt;em&gt;who can talk to whom, under what conditions, with what level of human oversight.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;A Cedar policy that constrains which agents can invoke which tools:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight rego"&gt;&lt;code&gt;&lt;span class="n"&gt;permit&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="n"&gt;principal&lt;/span&gt; &lt;span class="n"&gt;in&lt;/span&gt; &lt;span class="n"&gt;Group&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="s2"&gt;"finance-agents"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;action&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;Action&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="s2"&gt;"invokeTool"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;resource&lt;/span&gt; &lt;span class="n"&gt;in&lt;/span&gt; &lt;span class="n"&gt;ToolSet&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="s2"&gt;"finance-readonly"&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;when&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="n"&gt;principal&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;risk_level&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="s2"&gt;"low"&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt;
  &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;delegation_depth&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;=&lt;/span&gt; &lt;span class="m"&gt;3&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This policy expresses several things that are hard to model in RBAC or in a network policy. The principal is identified by group membership but constrained by attribute (&lt;code&gt;risk_level&lt;/code&gt;). The resource is a typed set of tools. The condition includes a check on delegation depth; agents more than three hops deep in a delegation chain are refused even if they pass every other check.&lt;/p&gt;
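&lt;p&gt;For readers less familiar with Cedar, the same decision can be walked through in plain Python. This is only an illustrative mirror of the policy above, not how a Cedar engine evaluates it; the dictionary shapes and the function name are assumptions.&lt;/p&gt;

```python
MAX_DELEGATION_DEPTH = 3  # from the policy's delegation_depth condition


def gateway_permits(principal: dict, action: str, resource: dict, context: dict) -> bool:
    """Return True only if every clause of the permit rule holds."""
    if "finance-agents" not in principal["groups"]:
        return False
    if action != "invokeTool":
        return False
    if "finance-readonly" not in resource["tool_sets"]:
        return False
    if principal["risk_level"] != "low":
        return False
    # Deep delegation chains are refused even when everything else matches.
    if context["delegation_depth"] > MAX_DELEGATION_DEPTH:
        return False
    return True


agent = {"groups": ["finance-agents"], "risk_level": "low"}
tool = {"tool_sets": ["finance-readonly"]}

print(gateway_permits(agent, "invokeTool", tool, {"delegation_depth": 2}))
print(gateway_permits(agent, "invokeTool", tool, {"delegation_depth": 5}))
```

&lt;p&gt;The two calls differ only in context, which is the part RBAC and network policy have no place to express.&lt;/p&gt;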

&lt;p&gt;The gateway layer naturally enforces delegation rules, per-hop token issuance with scope reduction, agent-to-MCP tool authorization, agent-to-LLM constraints, human-in-the-loop hooks for high-stakes actions, and attribute-based decisions across all of these.&lt;/p&gt;

&lt;p&gt;What the gateway cannot do is constrain what happens after it issues a token. Once the agent has the credential, the kernel is the only layer that sees what the process actually does with it.&lt;/p&gt;

&lt;h2&gt;
  
  
  Policy at the kernel: constraining agent behaviour
&lt;/h2&gt;

&lt;p&gt;The kernel sees behaviour. It is the right place to enforce &lt;em&gt;what an agent process is allowed to do at the operating system level&lt;/em&gt;, regardless of what tokens it holds.&lt;/p&gt;

&lt;p&gt;A baseline sandbox for an agent workload, expressed conceptually in the same Cedar policy model and compiled to BPF programs at runtime:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight rego"&gt;&lt;code&gt;&lt;span class="n"&gt;permit&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="n"&gt;principal&lt;/span&gt; &lt;span class="n"&gt;in&lt;/span&gt; &lt;span class="n"&gt;AgentClass&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="s2"&gt;"data-analyst"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;action&lt;/span&gt; &lt;span class="n"&gt;in&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;Action&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="s2"&gt;"readFile"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Action&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="s2"&gt;"writeFile"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
  &lt;span class="n"&gt;resource&lt;/span&gt; &lt;span class="n"&gt;is&lt;/span&gt; &lt;span class="n"&gt;FilePath&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;when&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="n"&gt;resource&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;path&lt;/span&gt; &lt;span class="n"&gt;like&lt;/span&gt; &lt;span class="s2"&gt;"/workspace/analyst-*"&lt;/span&gt; &lt;span class="p"&gt;||&lt;/span&gt;
  &lt;span class="n"&gt;resource&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;path&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="s2"&gt;"/var/run/secrets/analyst-key"&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;
&lt;span class="n"&gt;forbid&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="n"&gt;principal&lt;/span&gt; &lt;span class="n"&gt;in&lt;/span&gt; &lt;span class="n"&gt;AgentClass&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="s2"&gt;"data-analyst"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;action&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;Action&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="s2"&gt;"connectNetwork"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;resource&lt;/span&gt; &lt;span class="n"&gt;is&lt;/span&gt; &lt;span class="n"&gt;NetworkDestination&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;unless&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="n"&gt;resource&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;host&lt;/span&gt; &lt;span class="n"&gt;in&lt;/span&gt; &lt;span class="n"&gt;DestinationSet&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="s2"&gt;"approved-llm-endpoints"&lt;/span&gt; &lt;span class="p"&gt;||&lt;/span&gt;
  &lt;span class="n"&gt;resource&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;host&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="s2"&gt;"lynx-gateway.internal"&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The compilation target is BPF LSM hooks, cgroup network hooks, and file access enforcement at the inode boundary. When the agent process steps outside what the policy permits, the kernel refuses the operation – &lt;code&gt;EPERM&lt;/code&gt; for the syscall, &lt;code&gt;ECONNREFUSED&lt;/code&gt; for the network connection, &lt;code&gt;ENOENT&lt;/code&gt; for the file access. The agent gets the same error it would get for any prohibited operation, regardless of what credentials it holds.&lt;/p&gt;

&lt;p&gt;The kernel layer naturally enforces file access boundaries, network egress restrictions, syscall whitelisting, library load constraints, and process-spawn limits. The same observation pipeline that feeds enforcement also feeds threat detection.&lt;/p&gt;

&lt;p&gt;What the kernel cannot do is reason about why an action is being attempted. It sees a &lt;code&gt;connect()&lt;/code&gt; system call. It does not know whether the call is part of a legitimate multi-hop delegation that the gateway already authorized. That context only exists at the L7 layer.&lt;/p&gt;

&lt;h2&gt;
  
  
  The dual-layer architecture
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fs685pbznwc9qm61rs2vm.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fs685pbznwc9qm61rs2vm.png" width="800" height="492"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The architectural integration matters as much as either layer in isolation. Cedar policies are authored once, evaluated at the gateway, and compiled to BPF for kernel enforcement. The compilation is not magical: only the substrate-relevant subset of Cedar policies compiles, and the rest stay at the gateway. Either way, security teams write Cedar; the runtime decides which layer is the right one to enforce at.&lt;/p&gt;
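&lt;p&gt;As a rough mental model of that split, the routing step can be sketched as a partition over policies by the actions they mention. Everything here is an assumption for illustration: the rule “a policy compiles only if all its actions are kernel-visible”, the set of kernel-visible actions, and the policy IDs are hypothetical, with action names borrowed from the examples in this post.&lt;/p&gt;

```python
# Hypothetical: actions the kernel layer can observe directly.
KERNEL_ACTIONS = {"readFile", "writeFile", "connectNetwork"}


def split_by_layer(policies: dict[str, set[str]]) -> tuple[list[str], list[str]]:
    """Partition policy IDs into (kernel-compilable, gateway-only)."""
    kernel, gateway = [], []
    for policy_id, actions in policies.items():
        # Every action must be kernel-visible for the policy to compile.
        target = kernel if actions.issubset(KERNEL_ACTIONS) else gateway
        target.append(policy_id)
    return kernel, gateway


policies = {
    "analyst-files": {"readFile", "writeFile"},
    "analyst-egress": {"connectNetwork"},
    "finance-tools": {"invokeTool"},
}
kernel, gateway = split_by_layer(policies)
print("compile to BPF:", kernel)
print("stay at gateway:", gateway)
```

&lt;p&gt;The author of a policy never picks the layer; the partition is derived from what the policy talks about.&lt;/p&gt;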

&lt;p&gt;This integration is what makes the dual-layer approach operationally sustainable. Without one policy language, you end up with two policy stores, two review processes, two engineering teams, and inevitable divergence between what the gateway permits and what the kernel allows. With Cedar at both layers, the policy you wrote is the policy that gets enforced everywhere.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why single-layer policy isn’t enough for AI agent security
&lt;/h2&gt;

&lt;p&gt;Policy at the gateway alone defends against unauthorized callers and out-of-scope actions. It does not defend against a compromised agent that has a legitimate token and uses it to do things outside its intended behaviour, such as reading credential files, exfiltrating data through side channels, or escalating privilege inside its runtime.&lt;/p&gt;

&lt;p&gt;Policy at the kernel alone defends against process-level misbehaviour. It does not understand identity or delegation, cannot reason about whether a network connection is part of a legitimate multi-hop chain, and has no way to enforce human-in-the-loop approval flows.&lt;/p&gt;

&lt;p&gt;Combined, the two layers cover the threat model that either layer alone misses. A compromised agent with a legitimate token can still call out through the gateway, but its local actions are constrained by the kernel sandbox. A misconfigured Cedar policy at the gateway is mitigated by the substrate baseline. A shadow agent that never registered is observed and contained at the kernel.&lt;/p&gt;

&lt;p&gt;For Kubernetes-native enterprises building agent infrastructure into regulated workloads, this is the architecture worth building toward. Gateway policy for what agents are allowed to ask for. Kernel policy for what they are allowed to do. Same language for both.&lt;/p&gt;

&lt;h2&gt;
  
  
  Going deeper
&lt;/h2&gt;

&lt;p&gt;Multi-layer policy is one piece of a larger problem: making AI agent infrastructure accountable end-to-end. Traceability, authorization provenance, identity and ownership, policy-based governance at scale, and human oversight and intervention—they all have to work together.&lt;/p&gt;

&lt;p&gt;Read: &lt;a href="https://www.tigera.io/blog/the-five-pillars-of-ai-agent-accountability-a-diagnostic-framework-for-engineering-leaders/" rel="noopener noreferrer"&gt;The Five Pillars of AI Agent Accountability →&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The post &lt;a href="https://www.tigera.io/blog/multi-layer-policy-for-securing-ai-agents/" rel="noopener noreferrer"&gt;Multi-Layer Policy for Securing AI Agents&lt;/a&gt; appeared first on &lt;a href="https://www.tigera.io" rel="noopener noreferrer"&gt;Tigera – Creator of Calico&lt;/a&gt;.&lt;/p&gt;

</description>
      <category>technicalblog</category>
      <category>aiagentsecurity</category>
      <category>bestpractices</category>
    </item>
    <item>
      <title>What’s new in Calico: Spring 2026 Release</title>
      <dc:creator>Alister Baroi</dc:creator>
      <pubDate>Tue, 02 Jun 2026 16:10:41 +0000</pubDate>
      <link>https://dev.to/tigeraio/whats-new-in-calico-spring-2026-release-1lgg</link>
      <guid>https://dev.to/tigeraio/whats-new-in-calico-spring-2026-release-1lgg</guid>
      <description>&lt;p&gt;Kubernetes has come a long way since its debut in 2014. It’s gone from running a couple of containerized microservices to orchestrating fleets of production workloads spanning everything from AI agents to full-scale VMs running in pods. As Kubernetes adoption grows, and its use cases stretch to cover more ground, managing its increasingly complex networking and security landscape demands operational maturity and a platform that supports it.&lt;/p&gt;

&lt;p&gt;The Spring 2026 release of Calico provides that support in two key areas:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Unified operations across Kubernetes pods and VMs&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;KubeVirt Live Migration in Bridge Mode&lt;/strong&gt; allows you to migrate VM workloads with IPs preserved, minimal packet loss, and fast route convergence. VMs can move between nodes for planned maintenance, load balancing, and high availability without interrupting network connectivity.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Egress Gateway Layer 2 Advertisements&lt;/strong&gt; (Enterprise exclusive) lets pod traffic egress with IPs from the host’s own subnet, so workloads get a stable identity the rest of your network already recognizes, eliminating the need for BGP peering to advertise Egress Gateway IPs.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Policy recommendations for VMs and hosts&lt;/strong&gt; (Enterprise exclusive) automates and scales policy authoring for Calico-managed workloads running outside of your Kubernetes clusters.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;OpenStack Live Migration Improvements&lt;/strong&gt; lets you migrate VM workloads running in high availability OpenStack environments with minimal risk of service disruption during maintenance. Preloading policies on the target node keeps downtime inside the single-digit-second SLOs regulated workloads require.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Production-grade operations at scale&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Whisker Policy Verdict and UI Improvements&lt;/strong&gt; reveal connectivity blockers in minutes by letting you see the actual tier, policy, and rule that denied a flow.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Calico Load Balancer – Maintenance Mode&lt;/strong&gt; (Enterprise exclusive) supports graceful node maintenance by excluding backends on nodes marked for maintenance from new Maglev assignments, allowing existing connections to drain naturally. Operators can monitor active connections via Prometheus metrics to determine when it is safe to proceed with node maintenance.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  What’s new in Calico Open Source v3.32
&lt;/h2&gt;

&lt;p&gt;Two noteworthy features headline this release: KubeVirt Live Migration and Whisker UI improvements.&lt;/p&gt;

&lt;h3&gt;
  
  
  KubeVirt Live Migration in Bridge Mode
&lt;/h3&gt;

&lt;p&gt;Running VMs in Kubernetes comes with many challenges, among them the need to preserve a VM’s IP during live migration so that network traffic can continue uninterrupted. One way to handle this is with Multus and a bridge CNI, statically configuring the VM’s IP and plumbing it directly into the underlay. That preserves the IP, but the VMs sit outside of Calico, which means no microsegmentation, no observability, and no shared tooling with the pods running alongside these VMs. With Calico v3.32, Calico IPAM assigns persistent IPs to KubeVirt VMs. The IP survives live migration and pod restarts and can be advertised upstream over BGP. VMs share the same Kubernetes-native pod network as containers, with the same CNI, policies, observability, load balancing, QoS, and Layer 7 traffic management.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F78gpfj5q82jo7im75jj1.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F78gpfj5q82jo7im75jj1.jpg" alt="Live migration in bridge mode ships as a tech preview in OSS v3.32 and moves to production GA in the June release." width="800" height="438"&gt;&lt;/a&gt;&lt;br&gt;
&lt;em&gt;Live migration in bridge mode ships as a tech preview in Calico Open Source v3.32 and moves to production GA in the June release.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Key Benefits of KubeVirt Live Migration in Bridge Mode:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Migrate VMs With Live Connections:&lt;/strong&gt; Ensure long-lived TCP sessions such as database queries stay connected across the migration so applications don’t have to reconnect.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Keep VM Workloads Reachable During Maintenance:&lt;/strong&gt; Live migrate VMs to new nodes without blocking user access to applications.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Monitor VM Migrations in a Shared Dashboard:&lt;/strong&gt; Track live-migration success rates, duration, and post-move network metrics in the same place you track pod activity.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Run One Network, Not Two:&lt;/strong&gt; Stop maintaining parallel networking layers with VMs sharing the CNI, policy framework, and observability stack with your pod workloads.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Scenario: Live Migration That Keeps VMs on the Pod Network
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;The Situation:&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
A financial services enterprise is consolidating its virtualization estate onto KubeVirt on Kubernetes. The VM count sits in the six figures across dozens of clusters. Live migration is part of routine operations: VMs move between nodes during patching, capacity rebalancing, and host failures. The current workaround is Multus and a bridge CNI plumbed into the underlay, which keeps the IP through the move but leaves the VMs outside Calico’s pod network. The platform team would like to implement microsegmentation and observability for VMs as they do for containerized applications.&lt;br&gt;&lt;br&gt;
&lt;strong&gt;The Calico Solution:&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
Calico IPAM assigns each KubeVirt VM a persistent IP that survives live migration and pod restarts, advertised to the upstream network over BGP. Every VM runs on the same Kubernetes-native pod network as the containers next to it, with the same network policies, observability, load balancing, QoS, and Layer 7 traffic management. When nodes go down for maintenance, VMs move and connections survive. The microsegmentation and observability story stays intact.&lt;/p&gt;

&lt;h3&gt;
  
  
  Whisker Policy Verdict and UI Improvements
&lt;/h3&gt;

&lt;p&gt;Knowing a flow was blocked by policy is a good start to troubleshooting a connection problem. It does not, however, answer the more important question: what policy is responsible and why? Without knowing the reason a flow is denied, the problem cannot be fixed and tracing a flow’s journey across multiple policy tiers and rules can be unreliable and time consuming, potentially prolonging an outage.&lt;/p&gt;

&lt;p&gt;The Whisker updates in v3.32 put the verdict, the matching policy, and the full tier chain right in the flow log view. Drill down into a flow to see every policy that was invoked. Filter by policy kind, tier, namespace, and policy name to find out which flows the selected policies act on.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Key Benefits of Whisker Verdict Improvements:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;See the Policy Kind, Tier, and Rule Behind Every Verdict:&lt;/strong&gt; Surface the full evaluation chain, not just the allow/deny decision.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Filter by Verdict or Policy:&lt;/strong&gt; Narrow the flow log view to just denies or filter by kind, tier, namespace and name, or any combination, to see which flows a set of policies affects.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Close Policy-Denial Tickets in Minutes:&lt;/strong&gt; Reduce the troubleshooting path from a lengthy and painstaking analysis of policy layers to a thirty-second click into the matching rule.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Let Application Teams Self-Serve:&lt;/strong&gt; Trace your team’s own policy denies without waiting on the platform team.&lt;/li&gt;
&lt;/ul&gt;
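
&lt;p&gt;To make the filtering idea concrete, here is a minimal Python sketch of combining a verdict filter with policy fields. The flow records and their field names are assumptions made up for illustration; they are not Whisker’s actual schema or API.&lt;/p&gt;

```python
# Hypothetical flow records. The field names are illustrative assumptions,
# not the schema Whisker actually uses.
flows = [
    {"source": "web-app", "dest": "payment", "action": "Deny",
     "policy": {"kind": "NetworkPolicy", "tier": "payment",
                "namespace": "payments", "name": "default-deny"}},
    {"source": "web-app", "dest": "catalog", "action": "Allow",
     "policy": {"kind": "NetworkPolicy", "tier": "default",
                "namespace": "shop", "name": "allow-web"}},
    {"source": "batch", "dest": "ledger", "action": "Deny",
     "policy": {"kind": "GlobalNetworkPolicy", "tier": "security",
                "namespace": "", "name": "org-baseline"}},
]


def filter_flows(flows, action=None, **policy_fields):
    """Return flows that match the verdict and every given policy field."""
    matched = []
    for flow in flows:
        if action is not None and flow["action"] != action:
            continue
        policy = flow["policy"]
        if all(policy.get(key) == value for key, value in policy_fields.items()):
            matched.append(flow)
    return matched


# Any combination of filters works: verdict only, policy fields only, or both.
denied_in_payment_tier = filter_flows(flows, action="Deny", tier="payment")
all_denies = filter_flows(flows, action="Deny")

for flow in denied_in_payment_tier:
    policy = flow["policy"]
    print(f'{flow["source"]} to {flow["dest"]}: '
          f'{policy["kind"]} {policy["tier"]}/{policy["name"]}')
```

&lt;p&gt;Passing the policy fields as keyword arguments keeps each filter optional, which mirrors the “any combination” behavior described above.&lt;/p&gt;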

&lt;h3&gt;
  
  
  Scenario: The Five-Minute Incident That Used to Take an Hour
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;The Situation:&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
A developer on the web-app team opens a ticket: their new service can’t reach the payment service. An on-call platform engineer pulls up Whisker, sees the flow was denied, and starts the usual investigation, checking tiers, scanning policies and cross-referencing rules, while walking the developer through each step. Forty minutes later, they find the issue: the payment tier has a default-deny policy that doesn’t include web-app in its allowed-set.&lt;br&gt;&lt;br&gt;
&lt;strong&gt;The Calico Solution:&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
With the Whisker verdict view, the platform engineer opens the flow log, filters by denied flows for the web-app service, and clicks the first matching row. The verdict panel immediately shows the tier, policy, and rule that produced the deny with enough context to describe the fix. The incident is resolved in five minutes, and the ticket closes with a clear remediation path. The platform engineer then stages the fixed policy and filters in Whisker by kind, tier, and policy name to check whether any other flows will be affected, averting potential problems.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fwww.tigera.io%2Fapp%2Fuploads%2F2026%2F06%2FWhats-new-in-Calico-Spring-2026-Release-2.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fwww.tigera.io%2Fapp%2Fuploads%2F2026%2F06%2FWhats-new-in-Calico-Spring-2026-Release-2.gif" alt="Click a denied flow to see the tier, the policy, and the rule that produced the verdict." width="640" height="362"&gt;&lt;/a&gt;&lt;br&gt;
&lt;em&gt;Click a denied flow to see the tier, the policy, and the rule that produced the verdict.&lt;/em&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  ClusterNetworkPolicy: Cluster-Wide Policy Goes Standard
&lt;/h3&gt;

&lt;p&gt;Calico has had GlobalNetworkPolicy for years: a cluster-scoped policy that sits above namespace boundaries and gives platform teams a place to define org-wide guardrails, default-deny baselines, and cross-namespace controls. The Kubernetes SIG-Network ClusterNetworkPolicy spec is the upstream community’s version of the same idea, and Calico Open Source v3.32 implements it.&lt;/p&gt;

&lt;p&gt;While this is more housekeeping than a headline feature, it has two important implications. First, for the Kubernetes community, Calico’s conformant implementation keeps the spec moving and helps cement cluster-wide policy as a first-class part of the standard. Second, for platform teams already running Calico, ClusterNetworkPolicy provides the same cluster-level control surface as GlobalNetworkPolicy, but utilizes the standard upstream API. This means that tooling built around the spec remains reusable and consistent, regardless of the underlying network implementation.&lt;/p&gt;

&lt;p&gt;If you’ve been using GlobalNetworkPolicy in your policy pipelines, you don’t need to do anything; everything keeps working. If you’re starting fresh or building tooling that needs to work across multiple CNIs, ClusterNetworkPolicy is now an option to consider.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Key Benefits of ClusterNetworkPolicy:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Define Policy Cluster-Wide With the Standard API:&lt;/strong&gt; Use the upstream SIG-Network ClusterNetworkPolicy spec at the cluster level, no vendor-specific CRD required.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Adopt the Standard Without Re-Learning:&lt;/strong&gt; ClusterNetworkPolicy mirrors GlobalNetworkPolicy in shape and behavior, so platform teams already running Calico’s cluster-scoped policy keep the same mental model and tooling.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Stay Aligned With Where Kubernetes Is Heading:&lt;/strong&gt; Calico’s early implementation moves the SIG-Network ClusterNetworkPolicy spec toward general adoption, cementing cluster-wide policy as a first-class Kubernetes concept.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fthjmtlub8g9r5gru59ml.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fthjmtlub8g9r5gru59ml.jpg" alt="Cluster-wide network policy scope, now in the standard upstream API" width="800" height="395"&gt;&lt;/a&gt;&lt;br&gt;
&lt;em&gt;Cluster-wide network policy scope, now in the standard upstream API&lt;/em&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  OpenStack Live Migration Improvements
&lt;/h3&gt;

&lt;p&gt;Calico’s route management work in v3.32 closes the gap that’s kept regulated workloads out of OpenStack live migration. By preloading network policies on the target node ahead of a migration, traffic resumes the moment the VM lands instead of waiting for the network to catch up. This solution, which leverages the same route management code that powers KubeVirt Bridge-Mode live migration, addresses the pain of migration for specific industries that measure downtime in single-digit seconds.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Key Benefits of OpenStack Live Migration Improvements:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Migrate Within Your Downtime SLO:&lt;/strong&gt; Complete OpenStack live migrations within the single-digit-second SLOs that regulated workloads require.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Live Migration During Active Hours:&lt;/strong&gt; Run live migration without having to wait for off-hours maintenance windows.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Scenario: Migrating a Trading Workload During Market Hours
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;The Situation:&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
A regulated financial-data provider runs a trading workload on OpenStack with a single-digit-second downtime SLO for live migrations. Their current KVM live migration routinely stalls long enough to violate it. The platform team has been limited to performing host maintenance during narrow after-hours windows, and some migrations have simply been deferred indefinitely.&lt;br&gt;&lt;br&gt;
&lt;strong&gt;The Calico Solution:&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
After upgrading to Calico v3.32, the team measures live-migration downtime against their reference workload and finds it consistently within SLO. Host maintenance is now possible during trading hours. Deferred migrations can be scheduled and completed without requiring an after-hours rotation.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fb0hdljemxwn0ij97gbcr.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fb0hdljemxwn0ij97gbcr.jpg" alt="The node is ready when the VM arrives reducing downtime" width="800" height="426"&gt;&lt;/a&gt;&lt;br&gt;
&lt;em&gt;The node is ready when the VM arrives, reducing downtime&lt;/em&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Also in this release: Istio Ambient Mode comes to Calico Open Source
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Not new, but new here&lt;/strong&gt;. Calico Enterprise v3.22.1 bundled Istio Ambient Mesh in the Tigera Operator, bringing production-hardened, one hundred percent upstream Istio images with sidecarless mTLS to the Calico stack.&lt;/p&gt;

&lt;p&gt;As of Calico Open Source v3.32, the same capability is available in the open-source edition. If your platform team is running Istio in sidecar mode, or has given up on service mesh because of its complexity and resource usage, Istio’s ambient mode is worth a second look. In ambient mode there are no sidecars to wrangle on every upgrade, no per-pod CPU and memory overhead, and a much smaller surface to patch when the next CVE lands.&lt;/p&gt;

&lt;p&gt;For the full story including architecture, migration path, and a sidecar-tax deep dive, read the &lt;a href="https://www.tigera.io/blog/whats-new-in-calico-winter-2026-release/" rel="noopener noreferrer"&gt;Winter 2026 launch blog post&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  What’s new in Calico Enterprise and Calico Cloud
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;KubeVirt Live Migration in Bridge Mode&lt;/strong&gt;, which is part of Calico Open Source v3.32, is also available in Calico Enterprise, where it arrives as a tech preview in v3.23 EP2. For organizations evaluating KubeVirt as their landing spot for VMs, this is the release that makes Calico a supported production target.&lt;/p&gt;

&lt;p&gt;Beyond KubeVirt, three Platform-exclusive capabilities help you achieve operational maturity at scale, keeping your policy estate clean, unifying management across cluster and non-cluster workloads, and running load-balancer maintenance without customer impact.&lt;/p&gt;

&lt;h3&gt;
  
  
  Last Evaluated Metrics, Now via API (Cloud and Enterprise)
&lt;/h3&gt;

&lt;p&gt;As customers extend microsegmentation across Kubernetes, the policy set grows, sometimes into the thousands for large enterprises. Workloads change, applications change, and the policies that were essential six months ago may not match traffic anymore. Unused policies don’t announce themselves. They lurk, no longer evaluating traffic but still on the books: a security and compliance risk that violates the least-privilege posture you’ve spent years building towards.&lt;/p&gt;

&lt;p&gt;The Winter 2026 release introduced the “Last Evaluated” metric to surface policies and rules that haven’t matched traffic within a configurable window. Spring 2026 adds API access. Platform teams can now query the metric programmatically and feed it into automated cleanup workflows, compliance reports, scheduled alerts, or command line utilities. The same data that supports a PCI DSS v4.1 audit conversation can now flow into a Prometheus alerting rule or a nightly cleanup-candidate report.&lt;/p&gt;

&lt;p&gt;One thing worth being explicit about: the metric tells you whether a policy is evaluating traffic, not whether it should still exist. Customers still make the call about what’s genuinely unused, based on knowledge of the workloads. The API uncovers the candidates. The platform team makes the decision.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Key Benefits of Last Evaluated Metrics:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Automate Policy Hygiene:&lt;/strong&gt; Pipe Last Evaluated data into Prometheus alerts, scheduled reports, or any other workflow you already run.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Generate Compliance Evidence on Demand:&lt;/strong&gt; Show auditors that every active rule is in use, the proof PCI DSS v4.1 and similar standards require.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Troubleshoot From the CLI:&lt;/strong&gt; Query last-evaluated state directly via terminal during an incident, no browser required.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Decommission Unused Policies Without Guesswork:&lt;/strong&gt; Confidently clean up unused policies, not only to maintain that least-privileged posture but to reduce etcd memory pressure and shorten policy-engine evaluation time.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Scenario: Pruning a Microsegmentation Estate at Scale
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;The Situation:&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
A large financial-services platform team has been running Calico for several years. Their policy set has grown to several thousand policies accumulated from successive microsegmentation projects, decommissioned services, and one-off tickets. A PCI DSS v4.1 audit is approaching, and the auditor wants evidence that every active rule is actually serving a purpose. Manually reviewing several thousand rules isn’t feasible, and the team can’t safely delete what they don’t understand.&lt;br&gt;&lt;br&gt;
&lt;strong&gt;The Calico Solution:&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
The platform team uses the Last Evaluated Metrics API to pull a list of policies and rules that haven’t matched traffic in the last 90 days. They route the output to a CSV, distribute it to the owning teams, and ask each team to confirm or contest each candidate. Within two weeks the policy set is several hundred rules smaller and the auditor gets the evidence trail directly from the metric output, not from a manual investigation.&lt;/p&gt;
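
&lt;p&gt;The report step in this scenario could be sketched roughly as follows. The record shape (&lt;code&gt;tier&lt;/code&gt;, &lt;code&gt;policy&lt;/code&gt;, &lt;code&gt;rule&lt;/code&gt;, &lt;code&gt;last_evaluated&lt;/code&gt;) is a hypothetical stand-in for whatever the Last Evaluated API returns, and the sample data is invented. Only the 90-day window and the CSV output come from the scenario.&lt;/p&gt;

```python
import csv
import io
from datetime import datetime, timedelta, timezone

# Hypothetical records standing in for Last Evaluated API output.
# The field names and values are assumptions for illustration only.
records = [
    {"tier": "security", "policy": "legacy-batch-allow", "rule": "ingress-0",
     "last_evaluated": "2025-11-02T08:15:00+00:00"},
    {"tier": "platform", "policy": "dns-allow", "rule": "egress-0",
     "last_evaluated": "2026-05-30T12:00:00+00:00"},
    {"tier": "app", "policy": "old-partner-feed", "rule": "egress-1",
     "last_evaluated": None},  # never matched any traffic
]


def cleanup_candidates(records, now, window_days=90):
    """Return records that never matched traffic or last matched before the window."""
    cutoff = now - timedelta(days=window_days)
    candidates = []
    for record in records:
        stamp = record["last_evaluated"]
        if stamp is None or cutoff > datetime.fromisoformat(stamp):
            candidates.append(record)
    return candidates


def to_csv(rows):
    """Render candidate rows as CSV text for distribution to owning teams."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=["tier", "policy", "rule", "last_evaluated"])
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# A fixed "now" keeps the example reproducible.
now = datetime(2026, 6, 2, tzinfo=timezone.utc)
candidates = cleanup_candidates(records, now)
print(to_csv(candidates))
```

&lt;p&gt;The output is a list of candidates only. As noted above, the owning teams still decide which of them are genuinely unused.&lt;/p&gt;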

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvsez97k4u4zgyanu8w8n.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvsez97k4u4zgyanu8w8n.jpg" alt="Automate your least-privileged posture" width="800" height="395"&gt;&lt;/a&gt;&lt;br&gt;
&lt;em&gt;Automate your least-privileged posture&lt;/em&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Egress Gateway Layer 2 Advertisements
&lt;/h3&gt;

&lt;p&gt;With Egress Gateway Layer 2 Advertisements, Calico 3.23 EP2 eliminates the need for cluster-specific egress IP pools and for BGP peering with ToR switches. You can now assign addresses from the host’s subnet to egress gateways, SNAT egress traffic to the gateway’s host node IP, and forward packets using ARP. This means less reliance on coordinating with the network team, more efficient use of routable IP addresses, and simplified firewall rules for reduced operational overhead.&lt;/p&gt;
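
&lt;p&gt;Because ARP only resolves addresses on the local Layer 2 segment, the addresses given to egress gateways need to sit inside the host’s subnet. The Python sketch below shows one way to sanity-check a candidate list before using it. The subnet and IPs are made-up example values, and the check is a planning aid under that assumption, not part of Calico.&lt;/p&gt;

```python
import ipaddress

# Hypothetical values for illustration: the subnet the nodes live on and the
# addresses someone proposes to hand to egress gateways.
host_subnet = ipaddress.ip_network("10.20.0.0/24")
candidate_ips = ["10.20.0.200", "10.20.0.201", "192.168.5.10"]


def split_by_subnet(candidates, subnet):
    """Separate addresses that are on the host subnet from those that are not."""
    inside, outside = [], []
    for raw in candidates:
        address = ipaddress.ip_address(raw)
        if address in subnet:
            inside.append(raw)
        else:
            outside.append(raw)
    return inside, outside


usable, rejected = split_by_subnet(candidate_ips, host_subnet)
print("on the host subnet:", usable)
print("not on the host subnet:", rejected)
```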

&lt;p&gt;&lt;strong&gt;Key Benefits of Egress Gateway Layer 2 Advertisements:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Reduce the Need for Coordination with the Network Team:&lt;/strong&gt; Allocate IPs to new egress gateways without extensive intervention by the networking team, significantly increasing deployment velocity.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Forward Packets Using ARP:&lt;/strong&gt; Decrease operational overhead by doing away with BGP sessions on top-of-rack switches.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Avoid Depleting Routable IPs in Large Environments:&lt;/strong&gt; Configure a shared set of allow-listed IPs rather than a per-tenant pool, preserving scarce routable IPs.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Maintain One Firewall Ruleset:&lt;/strong&gt; Pod egress IPs come from the host’s own subnet, so the firewall team works with the same address space it already maintains for hosts and VMs, making firewall configuration and ongoing maintenance much simpler.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbxdm19ic1mocboi7x7eg.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbxdm19ic1mocboi7x7eg.jpg" alt="Pod egress lives in the same address space your network team already maintains for hosts and VMs" width="799" height="455"&gt;&lt;/a&gt;&lt;br&gt;
&lt;em&gt;Pod egress lives in the same address space your network team already maintains for hosts and VMs&lt;/em&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Scenario: Cluster Scale-Up Without a Firewall Ticket
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;The Situation:&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
A financial services platform team exposes a set of cluster services to external partner systems through a corporate firewall. Pod egress traffic uses IPs from a cluster-managed pool that the network team registers in the firewall ruleset. Every time the platform team scales the cluster, the pool changes, the firewall ruleset needs updating, and a change-control ticket flows between the two teams. They meet monthly to reconcile drift.&lt;br&gt;&lt;br&gt;
&lt;strong&gt;The Calico Solution:&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
Egress Gateway Layer 2 Advertisements moves pod egress identity into the host’s own subnet. Pod traffic now exits the cluster using a uniquely identifiable IP address from the host’s routable subnet, which can be allowed by the network firewall. Cluster scale-ups stop triggering firewall changes. The reconciliation meeting comes off the calendar.&lt;/p&gt;

&lt;h3&gt;
  
  
  Policy Recommendations for VMs and Hosts
&lt;/h3&gt;

&lt;p&gt;Calico’s policy recommendations engine has been a valuable tool in a platform engineer’s arsenal, giving teams a head start on authoring policies for Kubernetes pods. Until now, however, teams could not take advantage of this productivity boost for hosts running outside a cluster. A new VM or bare-metal workload meant manually combing through flow logs and hand-authoring policies, which, at scale, often became a significant microsegmentation bottleneck. Policy Recommendations for VMs and Hosts extends the policy recommendation engine to non-cluster workloads. As of v3.23 EP2, Calico observes traffic to and from VMs and bare-metal hosts and generates recommended starting policies, just as it does for the workloads running in your cluster. The same review-and-apply process platform engineers use for pods now applies to every workload Calico manages.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Key Benefits of Policy Recommendations for VMs and Hosts:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Dispense with Hand-Rolling Policies for VMs and Hosts:&lt;/strong&gt; Calico generates starting points for non-cluster workloads from observed traffic, the same way it does for pods.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Scale Microsegmentation Across the Whole Estate:&lt;/strong&gt; Bring least-privilege policies to hundreds or thousands of non-cluster workloads without writing each one by hand.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Use One Authoring Workflow for Every Workload:&lt;/strong&gt; Work with the same tooling and the same review pattern across pods, VMs, and bare-metal hosts.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Scenario: Microsegmenting a Thousand VMs Without a Thousand Authoring Tasks
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;The Situation:&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
A telco runs Kubernetes workloads for 5G edge services alongside thousands of VMs for legacy signaling systems. The platform team has automated policy recommendations for pods, but every new VM workload comes with a manual policy-authoring task. The team cannot keep pace with the VM side, so default policies on VMs trend toward permissive over time.&lt;br&gt;&lt;br&gt;
&lt;strong&gt;The Calico Solution:&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
With Policy Recommendations for VMs and Hosts, the team’s existing recommendation workflow now covers VMs and bare-metal workloads. Recommendations come in based on observed traffic. The team reviews and applies them at the same rate they already review and apply pod recommendations. Microsegmentation extends across the entire estate without doubling the authoring workload.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5esj7ph1sz4gwxjxz9oc.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5esj7ph1sz4gwxjxz9oc.jpg" alt="One review-and-apply workflow across pods, VMs and bare-metal hosts" width="799" height="381"&gt;&lt;/a&gt;&lt;br&gt;
&lt;em&gt;One review-and-apply workflow across pods, VMs and bare-metal hosts&lt;/em&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Calico Load Balancer – Maintenance Mode (Enterprise Exclusive)
&lt;/h3&gt;

&lt;p&gt;Choosing a software load balancer was already the right call for platform teams who wanted declarative service exposure and consistent-hash session affinity, capabilities Calico Load Balancer has delivered since v3.23 EP1.&lt;/p&gt;

&lt;p&gt;With v3.23 EP2, the call gets easier. The fast, predictable failover that a pair of hardware load balancers in HA handles cleanly is now native to Calico’s software LB and ready to take over from that expensive 2018 LB you thought you had to replace. Calico Load Balancer now supports label-based node exclusion. Setting &lt;code&gt;maglev.tigera.io/exclude=true&lt;/code&gt; on a node tells Calico Load Balancer to stop forwarding new connections to the backends the node hosts while keeping existing sessions flowing until they complete naturally. Prometheus metrics expose per-node active session counts so operators can watch them decline to zero before proceeding with the drain.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Key Benefits of Graceful Maglev Session Handling:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Patch Nodes During Business Hours:&lt;/strong&gt; Take nodes out of load-balancer rotation for kernel patches, kubelet upgrades, or hardware work without scheduling around customer traffic.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Drain a Node With a Single Label:&lt;/strong&gt; Set &lt;code&gt;maglev.tigera.io/exclude=true&lt;/code&gt; on a node and Calico Load Balancer stops forwarding new connections to its backends, with no custom scripts or out-of-band coordination.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Drain Without Forcing Disconnects:&lt;/strong&gt; Active sessions on the excluded node keep flowing until they complete naturally so maintenance doesn’t cut off in-flight work.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Know When It’s Safe to Drain:&lt;/strong&gt; Prometheus metrics expose per-node session counts so operators can watch them decline to zero before proceeding with maintenance.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Scenario: Maintenance That Customers Never Notice
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;The Situation:&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
Scheduled maintenance on a node serving live customer traffic has always been a balancing act. Take the node out of rotation too early and customers with in-flight transactions get cut off mid-session. Wait too long and the maintenance window slips. Most teams have either accepted some level of session disruption or built bespoke tooling to coordinate their load balancer’s health checks with the drain workflow.&lt;br&gt;&lt;br&gt;
&lt;strong&gt;The Calico Solution:&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
The platform engineer labels the node with &lt;code&gt;maglev.tigera.io/exclude=true&lt;/code&gt;. From that moment, Calico routes new connections to backends elsewhere in the cluster. Existing sessions on the excluded node keep flowing until they complete, so customers with in-flight transactions finish them naturally. The engineer watches per-node session counts in Prometheus, and when the count reaches zero, drains the node. The maintenance happens. The customers don’t notice.&lt;/p&gt;
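
&lt;p&gt;The label, watch, drain sequence above can be sketched in Python as shown below. The label comes from this post, and &lt;code&gt;kubectl label&lt;/code&gt; and &lt;code&gt;kubectl drain&lt;/code&gt; are standard commands. The session reader is a hypothetical callable, because the name of the Prometheus metric is not given here, and the sketch only builds the commands rather than running them.&lt;/p&gt;

```python
import time

# Label taken from the post; everything else here is an illustrative sketch.
EXCLUDE_LABEL = "maglev.tigera.io/exclude=true"


def exclude_command(node):
    """kubectl command that marks a node as excluded from new connections."""
    return ["kubectl", "label", "node", node, EXCLUDE_LABEL, "--overwrite"]


def drain_command(node):
    """kubectl command that drains the node once its sessions are gone."""
    return ["kubectl", "drain", node,
            "--ignore-daemonsets", "--delete-emptydir-data"]


def wait_for_zero_sessions(read_sessions, poll_seconds=15, max_polls=240,
                           sleep=time.sleep):
    """Poll read_sessions() until it returns 0.

    read_sessions is a hypothetical callable that returns the per-node active
    session count, for example by querying Prometheus. Returns True once the
    count reaches zero and False if max_polls is exhausted first.
    """
    for _ in range(max_polls):
        if read_sessions() == 0:
            return True
        sleep(poll_seconds)
    return False


# Demo with fake readings in place of a real Prometheus query.
readings = iter([12, 5, 1, 0])
drained = wait_for_zero_sessions(lambda: next(readings), sleep=lambda _: None)

print(" ".join(exclude_command("worker-3")))
if drained:
    print(" ".join(drain_command("worker-3")))
```

&lt;p&gt;Bounding the wait with &lt;code&gt;max_polls&lt;/code&gt; is a design choice in this sketch: a long-lived session that never closes ends in a timeout the operator can act on instead of an indefinite wait.&lt;/p&gt;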

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fr1xw2mzxvjrvotal4k79.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fr1xw2mzxvjrvotal4k79.jpg" alt="Same fast, predictable failover as hardware load balancers but Kubernetes native" width="799" height="509"&gt;&lt;/a&gt;&lt;br&gt;
&lt;em&gt;Same fast, predictable failover as hardware load balancers but Kubernetes native&lt;/em&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Get Started with Calico Spring 2026
&lt;/h2&gt;

&lt;p&gt;The Spring 2026 release closes critical Day 2 operations gaps by unifying operations across Kubernetes pods and VMs, collapsing two operational worlds into one network, one policy model, and one observability stack. It removes long-standing operational friction and clears the way for scaling infrastructure securely and efficiently, helping teams take the next step towards Kubernetes operational maturity.&lt;/p&gt;

&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Environment&lt;/th&gt;
&lt;th&gt;Action Required&lt;/th&gt;
&lt;th&gt;Documentation Link&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Calico Open Source&lt;/td&gt;
&lt;td&gt;Upgrade to Calico v3.32&lt;/td&gt;
&lt;td&gt;&lt;a href="https://docs.tigera.io/calico/latest/release-notes/" rel="noopener noreferrer"&gt;Calico Open Source release notes&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Calico Enterprise&lt;/td&gt;
&lt;td&gt;Upgrade to Enterprise v3.23 EP2&lt;/td&gt;
&lt;td&gt;&lt;a href="https://docs.tigera.io/calico-enterprise/3.23/getting-started/upgrading/" rel="noopener noreferrer"&gt;Upgrade Calico Enterprise documentation&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Calico Cloud&lt;/td&gt;
&lt;td&gt;Follow instructions to update connected clusters&lt;/td&gt;
&lt;td&gt;&lt;a href="https://docs.tigera.io/calico-cloud/get-started/upgrade-cluster" rel="noopener noreferrer"&gt;Upgrade Calico Cloud instructions&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;

&lt;p&gt;To learn more about these new product capabilities and see them in action, &lt;a href="https://www.tigera.io/demo/" rel="noopener noreferrer"&gt;schedule a demo&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;The post &lt;a href="https://www.tigera.io/blog/whats-new-in-calico-spring-2026-release/" rel="noopener noreferrer"&gt;What’s new in Calico: Spring 2026 Release&lt;/a&gt; appeared first on &lt;a href="https://www.tigera.io" rel="noopener noreferrer"&gt;Tigera – Creator of Calico&lt;/a&gt;.&lt;/p&gt;

</description>
      <category>companyblog</category>
      <category>technicalblog</category>
      <category>opensource</category>
      <category>products</category>
    </item>
    <item>
      <title>The AI Agent Accountability Gap: Why Network Policies, API Gateways, And RBAC Are Not Enough</title>
      <dc:creator>Alister Baroi</dc:creator>
      <pubDate>Wed, 27 May 2026 18:45:39 +0000</pubDate>
      <link>https://dev.to/tigeraio/the-ai-agent-accountability-gap-why-network-policies-api-gateways-and-rbac-are-not-enough-49b8</link>
      <guid>https://dev.to/tigeraio/the-ai-agent-accountability-gap-why-network-policies-api-gateways-and-rbac-are-not-enough-49b8</guid>
      <description>&lt;p&gt;In &lt;a href="https://www.tigera.io/blog/the-five-pillars-of-ai-agent-accountability-a-diagnostic-framework-for-engineering-leaders/" rel="noopener noreferrer"&gt;The Five Pillars of AI Agent Accountability: A Diagnostic Framework for Engineering Leaders&lt;/a&gt;, we walked through each pillar of AI agent accountability (traceability, authorization provenance, identity and ownership, policy at scale, and human oversight) and argued that most enterprises today sit at Level 0 or Level 1 of the Accountability Maturity Model.&lt;/p&gt;

&lt;p&gt;The most common reaction we get when we share that framework is some version of: &lt;strong&gt;“We’re already covered. We have network policies. We have an API gateway. We have RBAC.”&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;This article is for that reaction.&lt;/p&gt;

&lt;p&gt;Enterprises aren’t starting from zero. Most have invested in security, networking, and identity infrastructure that works well for traditional workloads. The problem isn’t a lack of tools. It’s that existing tools were &lt;a href="https://www.paloaltonetworks.com/cyberpedia/what-is-agentic-ai-governance" rel="noopener noreferrer"&gt;designed for model outputs, not autonomous actions&lt;/a&gt;: they assume a world where services are deterministic, communication patterns are predictable, and humans make all the decisions.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://www.tigera.io/learn/guides/ai-agent-security/agentic-ai-security/" rel="noopener noreferrer"&gt;Agentic AI&lt;/a&gt; breaks every one of those assumptions. Here’s where the most common approaches each leave a critical accountability gap.&lt;/p&gt;

&lt;h2&gt;
  
  
  Network policies: the wrong abstraction level
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://www.tigera.io/learn/guides/kubernetes-security/kubernetes-network-policy/" rel="noopener noreferrer"&gt;Kubernetes Network Policies&lt;/a&gt; are essential for securing any cluster. They restrict which pods can communicate with which other pods at the network level, and they should absolutely be part of your security posture.&lt;/p&gt;

&lt;p&gt;But network policies operate at the wrong abstraction level for agent accountability. They can say &lt;em&gt;“pods in namespace A can reach pods in namespace B.”&lt;/em&gt; They cannot say &lt;em&gt;“Agent A with risk-level=low can only call agents with risk-level=low.”&lt;/em&gt; They have no concept of agent identity, capabilities, or policy attributes.&lt;/p&gt;

&lt;p&gt;More critically, network policies produce &lt;strong&gt;no audit trail&lt;/strong&gt;. When a connection is allowed, there’s no record of &lt;em&gt;why&lt;/em&gt; it was allowed: no policy name, no attribute match, no traceable decision. When your compliance team asks &lt;em&gt;“Was this interaction authorized by policy?”&lt;/em&gt;, a network policy gives you nothing to show them.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Accountability gap:&lt;/strong&gt; No agent-level authorization, no audit trail, no provenance.&lt;/p&gt;

&lt;h2&gt;
  
  
  API gateways: built for north-south, not agent-to-agent
&lt;/h2&gt;

&lt;p&gt;API gateways (e.g. NGINX, Kong, Envoy, cloud-native gateways) are designed for request routing, rate limiting, and basic authentication. They work well for north-south traffic: external clients accessing internal services.&lt;/p&gt;

&lt;p&gt;But agent-to-agent communication is east-west traffic between internal services, often with complex multi-hop chains. API gateways don’t understand agent identities, don’t evaluate agent-specific policies, and don’t produce agent-aware audit trails that correlate across multiple hops.&lt;/p&gt;

&lt;p&gt;An API gateway can tell you &lt;em&gt;“a request came from IP 10.0.3.47 and was routed to service X.”&lt;/em&gt; It can’t tell you &lt;em&gt;“Agent A (owned by the finance team, risk-level=medium) called Agent B (owned by the compliance team, capability=audit-query) and this was permitted by policy P-2847.”&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;That’s the level of detail your compliance team needs. An API gateway will never give it to them.&lt;/p&gt;
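
&lt;p&gt;To make that concrete, here is a minimal Python sketch of the kind of record an agent-aware audit trail would need to carry for the call described above. The field names, the &lt;code&gt;trace_id&lt;/code&gt; value, and the JSON layout are illustrative assumptions, not the schema of any particular product.&lt;/p&gt;

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class AgentAuditRecord:
    """One authorization decision for one agent-to-agent hop (hypothetical schema)."""
    caller: str
    caller_owner: str
    caller_risk_level: str
    callee: str
    callee_owner: str
    callee_capability: str
    decision: str   # "allow" or "deny"
    policy_id: str  # the policy that produced the decision
    trace_id: str   # shared by every hop in the same interaction chain


def to_log_line(record: AgentAuditRecord) -> str:
    """Serialize the record as one JSON line with stable key order."""
    return json.dumps(asdict(record), sort_keys=True)


# The example from the text: Agent A calls Agent B, permitted by policy P-2847.
record = AgentAuditRecord(
    caller="agent-a", caller_owner="finance", caller_risk_level="medium",
    callee="agent-b", callee_owner="compliance", callee_capability="audit-query",
    decision="allow", policy_id="P-2847", trace_id="trace-0001",
)
print(to_log_line(record))
```

&lt;p&gt;An IP-and-route log line has none of these fields, which is why it cannot answer the question of which policy permitted the call.&lt;/p&gt;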

&lt;p&gt;&lt;strong&gt;Accountability gap:&lt;/strong&gt; No agent identity awareness, no policy evaluation, no multi-hop trace correlation.&lt;/p&gt;

&lt;h2&gt;
  
  
  MCP and A2A protocols: communication without governance
&lt;/h2&gt;

&lt;p&gt;The Model Context Protocol (MCP) and &lt;a href="https://www.tigera.io/blog/how-ai-agents-communicate-understanding-the-a2a-protocol-for-kubernetes/" rel="noopener noreferrer"&gt;Agent-to-Agent (A2A) protocol&lt;/a&gt; represent major progress in standardizing agent communication. MCP standardizes how agents connect to tools. A2A standardizes how agents coordinate with each other.&lt;/p&gt;

&lt;p&gt;Both are important infrastructure. And both explicitly assume that someone else handles governance.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;MCP&lt;/strong&gt; solves the &lt;em&gt;how&lt;/em&gt; of tool access: A consistent protocol for discovering and calling tools. It does not solve the &lt;em&gt;who&lt;/em&gt;: which agents are allowed to access which tools, under what conditions, and with what audit trail.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;A2A&lt;/strong&gt; solves the &lt;em&gt;how&lt;/em&gt; of agent coordination: Capability discovery, task delegation, lifecycle tracking. It does not solve the &lt;em&gt;who&lt;/em&gt;: which agents are allowed to delegate to which other agents, or who is accountable when a delegated task goes wrong.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;These protocols are necessary but not sufficient. They are the plumbing, not the governance. Using MCP without agent governance is like having HTTP without authentication; the communication works, but anyone can call anything.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Accountability gap:&lt;/strong&gt; Protocols handle communication mechanics, not authorization, policy enforcement, or audit.&lt;/p&gt;

&lt;h2&gt;
  
  
  DIY security patterns: four tools, no unified policy layer
&lt;/h2&gt;

&lt;p&gt;The O’Reilly book &lt;a href="https://www.oreilly.com/library/view/generative-ai-on/9781098171919/" rel="noopener noreferrer"&gt;Generative AI on Kubernetes (2026)&lt;/a&gt; documents four security patterns for securing MCP communication: token passthrough, service account delegation, OAuth2 token exchange, and mTLS with SPIFFE/SPIRE. Each pattern is sound on its own.&lt;/p&gt;

&lt;p&gt;The problem is that implementing all four creates four disconnected systems with no unified policy layer:&lt;/p&gt;

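&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Pattern&lt;/th&gt;
&lt;th&gt;What it does&lt;/th&gt;
&lt;th&gt;What it misses&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Token passthrough&lt;/td&gt;
&lt;td&gt;Propagates user identity through hops&lt;/td&gt;
&lt;td&gt;No agent-level policy evaluation&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Service accounts&lt;/td&gt;
&lt;td&gt;Authenticates workloads&lt;/td&gt;
&lt;td&gt;Loses user attribution&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;OAuth2 token exchange&lt;/td&gt;
&lt;td&gt;Preserves both identities&lt;/td&gt;
&lt;td&gt;Requires a separate token-exchange service to operate&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;SPIFFE/SPIRE mTLS&lt;/td&gt;
&lt;td&gt;Cryptographic workload identity&lt;/td&gt;
&lt;td&gt;No knowledge of agent capabilities or team ownership&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;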

&lt;p&gt;None of these patterns produce a correlated audit trail that spans the full agent interaction chain. None evaluate declarative policies based on agent attributes. None provide a dashboard for human oversight of agent communication patterns.&lt;/p&gt;

&lt;p&gt;Building accountability from these primitives is like building a car from raw steel: technically possible, but nobody should have to do it from scratch. We’ve seen platform teams sink six to twelve months of engineering into &lt;a href="https://www.tigera.io/blog/calculating-the-kubernetes-integration-tax-what-your-diy-networking-stack-actually-costs/" rel="noopener noreferrer"&gt;stitching this together&lt;/a&gt;, only to discover they still can’t answer the auditor’s question.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Accountability gap:&lt;/strong&gt; Fragmented security, no unified policy layer, no correlated audit, significant engineering investment required.&lt;/p&gt;

&lt;h2&gt;
  
  
  RBAC alone: doesn’t survive agent #101
&lt;/h2&gt;

&lt;p&gt;Role-Based Access Control is the default model for most authorization systems. Assign agents to roles, grant roles permissions, done.&lt;/p&gt;

&lt;p&gt;RBAC works at a small scale. With 10 agents and 3 roles, the matrix is manageable. But RBAC requires explicit enumeration, where every agent must be assigned to a role, and every permission must be granted to a role. When you add agent #101, someone must decide which role it belongs to and update the bindings.&lt;/p&gt;

&lt;p&gt;More fundamentally, RBAC cannot express the nuanced policies that agent governance requires:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;em&gt;“Agents with overlapping capabilities can communicate with each other.”&lt;/em&gt;&lt;/li&gt;
&lt;li&gt;&lt;em&gt;“Low-risk agents can call low-risk agents, but medium-risk agents can call both low and medium.”&lt;/em&gt;&lt;/li&gt;
&lt;li&gt;&lt;em&gt;“Agents on the same team can access that team’s MCP servers.”&lt;/em&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;These attribute-based policies are natural to express in English but impossible to model cleanly in RBAC without an explosion of roles. By agent #200, the role matrix is unmaintainable and new agents start getting deployed without governance, exactly the shadow agent problem we covered in our previous blog post, &lt;a href="https://www.tigera.io/blog/the-ai-agent-accountability-crisis-why-governance-isnt-keeping-up-with-deployment/#the-shadow-agent-problem" rel="noopener noreferrer"&gt;The AI Agent Accountability Crisis: Why Governance Isn’t Keeping Up With Deployment&lt;/a&gt;.&lt;/p&gt;
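
&lt;p&gt;For contrast, the risk-level policy quoted above fits in a few lines when it is written against attributes. The Python sketch below is a simplified illustration, not any product’s policy engine: the &lt;code&gt;Agent&lt;/code&gt; type, the risk ordering, and the rule name are all hypothetical.&lt;/p&gt;

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical ordering: a caller may call agents at or below its own risk level.
RISK_ORDER = {"low": 0, "medium": 1, "high": 2}
RULE_NAME = "callee-risk-at-or-below-caller"  # illustrative policy name


@dataclass(frozen=True)
class Agent:
    name: str
    team: str
    risk_level: str


def permitting_policy(caller: Agent, callee: Agent) -> Optional[str]:
    """Return the name of the rule that permits the call, or None if denied.

    The decision depends only on attributes, so no per-agent binding exists
    that would need updating when a new agent is deployed.
    """
    if RISK_ORDER[caller.risk_level] >= RISK_ORDER[callee.risk_level]:
        return RULE_NAME
    return None


# Agent #101 arrives with its attributes already set; no role matrix is edited.
agent_101 = Agent(name="agent-101", team="finance", risk_level="medium")
ledger = Agent(name="ledger-reader", team="finance", risk_level="low")

print(permitting_policy(agent_101, ledger))  # medium calling low
print(permitting_policy(ledger, agent_101))  # low calling medium
```

&lt;p&gt;Because the function returns the rule name rather than a bare yes or no, the same evaluation can feed the audit trail with the policy that permitted each call.&lt;/p&gt;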

&lt;p&gt;&lt;strong&gt;Accountability gap:&lt;/strong&gt; Doesn’t scale, can’t express attribute-based policies, requires manual updates for every new agent.&lt;/p&gt;

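&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Approach&lt;/th&gt;
&lt;th&gt;What it does well&lt;/th&gt;
&lt;th&gt;Accountability gap&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Kubernetes Network Policies&lt;/td&gt;
&lt;td&gt;Pod-to-pod isolation&lt;/td&gt;
&lt;td&gt;No agent identity, no audit trail&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;API gateways&lt;/td&gt;
&lt;td&gt;North-south request routing&lt;/td&gt;
&lt;td&gt;No east-west, no policy correlation&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;MCP / A2A protocols&lt;/td&gt;
&lt;td&gt;Standardize agent communication&lt;/td&gt;
&lt;td&gt;Communication, not governance&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;DIY security patterns&lt;/td&gt;
&lt;td&gt;Per-pattern soundness&lt;/td&gt;
&lt;td&gt;Four disconnected systems, no unified policy&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;RBAC&lt;/td&gt;
&lt;td&gt;Simple at small scale&lt;/td&gt;
&lt;td&gt;Doesn’t scale to large numbers of agents, no attribute policies&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;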

&lt;h2&gt;
  
  
  The AI agent accountability layer is the missing piece
&lt;/h2&gt;

&lt;p&gt;&lt;em&gt;Every existing approach covers part of the problem. None of them, alone or stacked together, deliver AI agent accountability. The missing piece is the unified layer above them.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;The industry has solved agent &lt;strong&gt;communication&lt;/strong&gt; (MCP, A2A) and agent &lt;strong&gt;infrastructure&lt;/strong&gt; (Kubernetes, GPUs, model serving). What’s missing is the &lt;a href="https://www.tigera.io/blog/your-ai-agents-are-autonomous-but-are-they-accountable/" rel="noopener noreferrer"&gt;accountability layer&lt;/a&gt;, the control plane that answers three questions for every agent interaction:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Who authorized this?&lt;/strong&gt; Traceable to a specific, auditable policy.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;What policy permitted it?&lt;/strong&gt; With attribute-based evaluation, not hardcoded names.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;What’s the full record?&lt;/strong&gt; End-to-end distributed trace with every hop, decision, and outcome.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The immaturity of the space is striking. A recent review of 43 AI risk frameworks found that &lt;a href="https://www.ibm.com/think/insights/ethics-governance-agentic-ai" rel="noopener noreferrer"&gt;only two even addressed agent-specific risks&lt;/a&gt;. This is the gap that will determine which enterprises can scale agentic AI responsibly, and which will be forced to cancel projects, face compliance failures, or deal with incidents they can’t investigate.&lt;/p&gt;

&lt;h2&gt;
  
  
  Common questions about AI agent accountability
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Aren’t network policies enough if I’m using a service mesh?&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
No. A service mesh adds mTLS and routing, but its policy layer still operates on workload identities and namespaces, not agent capabilities, owners, or risk levels. You still can’t produce an audit trail that names which policy permitted a specific agent-to-agent call, or scale that policy without manual updates.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Can I add an authorization layer on top of MCP myself?&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
You can, and many teams are trying. The hard part isn’t the policy engine; it’s the audit correlation across multi-hop chains, the dual identity verification (workload + user), the visual oversight surface, and the attribute-based policy model that scales. Stitching those together is a 6–12 month engineering investment that delivers a worse outcome than purpose-built tooling.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What about ABAC instead of RBAC?&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
Attribute-Based Access Control (ABAC) is on the right track; it’s exactly the model AI agent governance needs. But ABAC by itself is a policy language, not a complete platform. You still need agent identity, agent registration, attribute population, audit correlation, and a human oversight surface around it. ABAC is a piece of the answer, not the whole answer.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Does Tigera’s solution replace these tools?&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
Tigera’s solution complements them. Network policies still secure your cluster. Service meshes still handle mTLS. MCP and A2A still standardize agent communication. Our platform adds the accountability layer above them, the layer that answers who, what, and why for every agent interaction.&lt;/p&gt;

&lt;h2&gt;
  
  
  Key takeaways
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Network policies, API gateways, MCP/A2A protocols, DIY security patterns, and RBAC each solve a different problem;&lt;/strong&gt; none of them solves AI agent accountability.&lt;/li&gt;
&lt;li&gt;The missing layer is the accountability layer: the one that ties identity, policy, and audit together across every agent interaction.&lt;/li&gt;
&lt;li&gt;Without that layer, your compliance team has no answer to &lt;em&gt;“which policy permitted this?”&lt;/em&gt;
&lt;/li&gt;
&lt;li&gt;Building it from primitives is technically possible, but it’s a 6–12 month investment that still leaves gaps.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Get the strategic guide for accountable AI agents
&lt;/h2&gt;

&lt;p&gt;If your team is currently trying to assemble accountability from network policies, OAuth2 exchange, SPIFFE, and a homegrown policy engine, then read our guide first.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://info.tigera.io/rs/805-GFH-732/images/Whitepaper_Accountability_for_AI_Agents.pdf" rel="noopener noreferrer"&gt;Accountable AI Agents: A Strategic Guide for AI &amp;amp; Security Leaders Governing Autonomous AI at Scale&lt;/a&gt; covers the full framework: the five pillars, the maturity model, the principles, and the three-step roadmap. No code, no product demos. Just what your leadership team needs to make the build-vs-buy call.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://info.tigera.io/rs/805-GFH-732/images/Whitepaper_Accountability_for_AI_Agents.pdf" rel="noopener noreferrer"&gt;Get the strategic guide for accountable AI agents →&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The post &lt;a href="https://www.tigera.io/blog/the-ai-agent-accountability-gap-why-network-policies-api-gateways-and-rbac-are-not-enough/" rel="noopener noreferrer"&gt;The AI Agent Accountability Gap: Why Network Policies, API Gateways, And RBAC Are Not Enough&lt;/a&gt; appeared first on &lt;a href="https://www.tigera.io" rel="noopener noreferrer"&gt;Tigera – Creator of Calico&lt;/a&gt;.&lt;/p&gt;

</description>
      <category>featuredblog</category>
      <category>technicalblog</category>
      <category>aiagentsecurity</category>
      <category>bestpractices</category>
    </item>
    <item>
      <title>The Case for VM and Container Consolidation in 2026</title>
      <dc:creator>Alister Baroi</dc:creator>
      <pubDate>Tue, 26 May 2026 18:50:17 +0000</pubDate>
      <link>https://dev.to/tigeraio/the-case-for-vm-and-container-consolidation-in-2026-1fo4</link>
      <guid>https://dev.to/tigeraio/the-case-for-vm-and-container-consolidation-in-2026-1fo4</guid>
      <description>&lt;p&gt;&lt;em&gt;Two platforms, two teams, two procurement relationships, all doing one job. There’s a reason it ended up this way. There isn’t a reason it has to stay this way.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Ask anyone at a typical enterprise why the VM platform and the container platform are separate, and they’ll give you a sensible answer. The VM estate has been there for fifteen years. It runs the workloads the business depends on. &lt;a href="https://www.tigera.io/learn/guides/kubernetes-101/" rel="noopener noreferrer"&gt;Kubernetes&lt;/a&gt; got stood up later, when application teams started building microservices, and giving them their own environment made more sense than retrofitting one onto VMware. Two platforms, two teams, two roadmaps.&lt;/p&gt;

&lt;p&gt;That’s how most enterprises got here.&lt;/p&gt;

&lt;p&gt;The reasoning was sound at the time. The question is whether it still is.&lt;/p&gt;

&lt;p&gt;This is the consolidation question most enterprises haven’t actually revisited, and it’s the one quietly absorbing more of your budget each year.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdfbkmr6lkbkfcrn0a4q3.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdfbkmr6lkbkfcrn0a4q3.png" alt="Figure 1. The current state most enterprises operate today." width="800" height="460"&gt;&lt;/a&gt;&lt;br&gt;
&lt;em&gt;Figure 1. The current state most enterprises operate today.&lt;/em&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Why VM and container platforms ended up separate
&lt;/h2&gt;

&lt;p&gt;If you operate both platforms, you know the shape of this already. There’s a VMware team: vSphere admins, network engineers who know NSX, storage specialists, plus a separate procurement relationship for the underlying virtualisation stack. Then there’s a Kubernetes team: platform engineers, CNI specialists, GitOps people, a different set of vendor relationships. Each team runs its own upgrade calendar, its own monitoring stack, its own security posture, its own incident process. They share office space at offsites. They don’t share much else.&lt;/p&gt;

&lt;p&gt;Both teams are doing the same job. They keep infrastructure available for the workloads above it. One set of those workloads happens to be virtual machines and the other happens to be containers, which is a real technical distinction, but it isn’t the distinction your operational model was built around. Your operational model was built around the platforms themselves, and the platforms are separate because of when they were stood up.&lt;/p&gt;

&lt;p&gt;Most enterprises don’t re-examine this. The platforms are separate because they always have been. The teams are separate because the platforms are. The procurement is separate because the teams are. Every layer of duplication has a reasonable justification, but the foundational decision underneath all of them, that VMs and containers belong on different infrastructure, is one nobody actually revisits.&lt;/p&gt;

&lt;h2&gt;
  
  
  What KubeVirt changed about running VMs and containers together
&lt;/h2&gt;

&lt;p&gt;The technical answer to this stopped being theoretical a few years ago. &lt;a href="https://www.tigera.io/learn/guides/kubevirt/" rel="noopener noreferrer"&gt;KubeVirt&lt;/a&gt; is a &lt;a href="https://www.cncf.io/projects/kubevirt/" rel="noopener noreferrer"&gt;CNCF project&lt;/a&gt; that lets virtual machines run as native objects on a Kubernetes cluster. It’s in production at NVIDIA, Cloudflare, and ByteDance. This is no longer “an interesting research direction.” It’s the platform pattern that some of the largest, least forgiving infrastructure operators in the world use to run their VMs.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fkscu192vapdrefxs00un.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fkscu192vapdrefxs00un.png" alt="Figure 2. The unified state — same workloads, one operational model." width="800" height="460"&gt;&lt;/a&gt;&lt;br&gt;
&lt;em&gt;Figure 2. The unified state — same workloads, one operational model.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Which means the original reason your platforms are separate doesn’t really hold anymore. You don’t need a VMware-specific stack to host VMs and a Kubernetes-specific stack to host containers. Both can run on Kubernetes. The platform team you already have, the one that operates your container infrastructure, can take on the VMs too, with the same tooling, the same security model, the same upgrade pattern. The networking layer is the part most teams underestimate. VMs have to keep their existing IPs, VLANs, and firewall references so the rest of your infrastructure doesn’t break. This is the part Calico was built for. Whether your platform team wants to &lt;a href="https://www.tigera.io/blog/lift-and-shift-vms-to-kubernetes-with-calico-l2-bridge-networks/" rel="noopener noreferrer"&gt;lift and shift VMs onto Kubernetes&lt;/a&gt; with the network they already have, or modernise them onto a more dynamic networking model over time, Calico supports both paths on the same platform. Teams don’t have to commit to one approach up front, and they don’t have to migrate the network and the workload in the same step.&lt;/p&gt;

&lt;p&gt;This isn’t a pitch about throwing out what you have. The migration is real work, and the order in which you do it matters. But consolidating onto one platform is no longer experimental, and that changes the math on staying where you are.&lt;/p&gt;

&lt;h2&gt;
  
  
  What VM and container consolidation means for your roadmap
&lt;/h2&gt;

&lt;p&gt;If you’re a CTO or VP Engineering, the question to ask your platform leads isn’t “should we adopt KubeVirt?” That’s an implementation question. The strategic question is whether running two platforms is still the right operational model, or whether it’s something worth a look now that the alternative is real.&lt;/p&gt;

&lt;p&gt;Running two platforms compounds slowly. Two teams, two upgrade cadences, two vendor relationships, two of everything. Until the next renewal cycle, the next hiring round, or the next hardware refresh forces a decision you’d rather have made on your own terms.&lt;/p&gt;

&lt;p&gt;The first step isn’t the whole programme. Before you can consolidate at scale, you have to migrate one real VM end-to-end without breaking the network it lives on. That’s what your platform team will need to evaluate first, and it’s what our migration guide walks through in detail. It’s written for the engineers who’ll do the work, not for the executive sponsoring it. But if you’re at the point of asking whether the two-platform arrangement is still serving you, it’s the right thing to send their way.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://www.tigera.io/lp/ebook-the-complete-guide-to-vm-networking-for-kubernetes/" rel="noopener noreferrer"&gt;Read the VM migration guide&lt;/a&gt; →&lt;/p&gt;

&lt;p&gt;The post &lt;a href="https://www.tigera.io/blog/the-case-for-vm-and-container-consolidation-in-2026/" rel="noopener noreferrer"&gt;The Case for VM and Container Consolidation in 2026&lt;/a&gt; appeared first on &lt;a href="https://www.tigera.io" rel="noopener noreferrer"&gt;Tigera – Creator of Calico&lt;/a&gt;.&lt;/p&gt;

</description>
      <category>featuredblog</category>
      <category>technicalblog</category>
      <category>bestpractices</category>
      <category>vmmigration</category>
    </item>
    <item>
      <title>Kubernetes Operational Maturity: Secure and Resilient Cluster Federation with Cluster Mesh</title>
      <dc:creator>Alister Baroi</dc:creator>
      <pubDate>Mon, 25 May 2026 19:26:14 +0000</pubDate>
      <link>https://dev.to/tigeraio/kubernetes-operational-maturity-secure-and-resilient-cluster-federation-with-cluster-mesh-3apk</link>
      <guid>https://dev.to/tigeraio/kubernetes-operational-maturity-secure-and-resilient-cluster-federation-with-cluster-mesh-3apk</guid>
      <description>&lt;p&gt;Practically no one runs a single Kubernetes cluster in production these days. Maybe that’s how it started but data sovereignty requirements, acquisitions, AI initiatives and the need for edge servers, among other considerations, have pulled most enterprises into multi-cluster territory whether they planned for it or not. Reaching Kubernetes operational maturity—the point at which a fleet of clusters operates as one secure, observable, policy-consistent system—depends entirely on how those clusters are connected. Operating in a &lt;a href="https://www.tigera.io/learn/guides/kubernetes-networking/kubernetes-multi-cluster/" rel="noopener noreferrer"&gt;multi-cluster environment&lt;/a&gt; has evolved into the unspoken standard, one requiring a careful re-evaluation of the network architectures used to link clusters together.&lt;/p&gt;

&lt;p&gt;That re-evaluation rarely happens. Most enterprises connect their clusters with the same networking patterns they were using before Kubernetes existed: load balancers fronting internal services, DNS records published to external zones, and IP-based firewall rules. Those patterns were built for north-south traffic moving in and out of a traditional data center perimeter, not for east-west traffic moving between internal workloads.&lt;/p&gt;

&lt;h2&gt;
  
  
  Running east-west traffic on north-south plumbing
&lt;/h2&gt;

&lt;p&gt;The conventional way to make services in one cluster reachable from another is to expose them externally: a load balancer in front, a DNS name registered in a public zone, a firewall rule allowing traffic in. This works, but it is not ideal, because clusters are not separate entities making the odd API call to each other. They are part of a web of interconnected services that should be able to communicate securely and with a minimum of friction.&lt;/p&gt;

&lt;p&gt;Having to expose these services through external DNS providers, adding additional hops to send traffic through load balancers and creating firewall rules to allow that traffic between internal workloads increases the potential attack surface, introduces latency and piles more responsibilities onto the network team. Securing traffic between workloads gets harder at every layer. Egress rules end up broad and permissive because there is no per-pod identity to write a tighter rule against. Source IPs are erased by SNAT before they reach the destination, so the audit trail compliance teams depend on is non-deterministic. Each cluster also runs its own set of &lt;a href="https://kubernetes.io/docs/concepts/services-networking/network-policies/" rel="noopener noreferrer"&gt;network policies&lt;/a&gt; with no awareness of the others, leaving gaps wherever those policy sets disagree.&lt;/p&gt;

&lt;p&gt;Visibility suffers in the same way. Each cluster’s observability stack only sees traffic that lives inside it, so the moment a flow crosses a cluster boundary it becomes someone else’s problem. The destination workload sees a connection arriving from a load balancer or a NAT gateway rather than the workload that actually made the call, which means the receiving team can’t tell who is calling their endpoints or whether those endpoints should answer. Tracing a request from a service in one cluster to an endpoint in another means correlating timestamps and partial signals across two or three tools that were never designed to talk to each other. During an incident that gap is the difference between a five-minute fix and a three-hour bridge call. Mean Time To Resolution (MTTR) stretches accordingly.&lt;/p&gt;

&lt;p&gt;It is common to see enterprises with eight to twelve clusters where most internal-trust traffic now traverses external load balancers, public DNS, and inspection points designed for traffic from the open internet. This was probably the only option when that first cluster with its half dozen trailblazing workloads was first spun up. Now there’s a better way to connect clusters at scale, and it was built for Kubernetes from the start.&lt;/p&gt;

&lt;h2&gt;
  
  
  How cluster mesh rewires multi-cluster networking
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6tlooymzxv7q11x433xh.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6tlooymzxv7q11x433xh.jpg" alt="Cluster Mesh changes the way workloads connect" width="800" height="374"&gt;&lt;/a&gt;&lt;br&gt;
&lt;em&gt;Cluster Mesh changes the way workloads connect&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;A cluster mesh &lt;a href="https://www.tigera.io/learn/guides/kubernetes-security/kubernetes-federation/" rel="noopener noreferrer"&gt;federates Kubernetes clusters&lt;/a&gt; into a single flat overlay network. Pods talk to pods directly across cluster boundaries, services resolve through native Kubernetes DNS rather than an external provider, and traffic is encrypted end-to-end, typically with WireGuard. &lt;a href="https://www.tigera.io/learn/guides/kubernetes-security/kubernetes-network-policy/" rel="noopener noreferrer"&gt;Network policy&lt;/a&gt; is expressed against workload identity, such as namespace, label, or service account, instead of IP addresses that change every time a pod is rescheduled.&lt;/p&gt;

&lt;p&gt;Four important things change at the architecture level. East-west traffic stops leaving the trust boundary, because the overlay terminates inside the cluster nodes. DNS resolution moves back inside Kubernetes, removing the external dependency. Identity replaces IP as the unit of policy enforcement, which means a policy written today is still valid after the workload has moved across nodes, regions, or clusters. And telemetry flows through one fabric across every cluster instead of being assembled after the fact from per-cluster silos.&lt;/p&gt;
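
&lt;p&gt;To make the identity-versus-IP point concrete, here is a minimal Python sketch, not the API of any particular cluster mesh. The &lt;code&gt;Workload&lt;/code&gt; class, the label names, and the IP addresses are all hypothetical; the sketch only shows why a rule bound to labels keeps matching after a pod is rescheduled and re-addressed, while a rule bound to the old IP does not.&lt;/p&gt;

```python
from dataclasses import dataclass, field


@dataclass
class Workload:
    """A pod as a policy engine might see it: labels are stable, the IP is not."""
    name: str
    ip: str
    labels: dict = field(default_factory=dict)


def selector_matches(selector, workload):
    """True when every key/value pair in the selector is present on the workload."""
    return all(workload.labels.get(k) == v for k, v in selector.items())


def is_allowed(policy, source, destination):
    """Allow traffic only when both ends match the policy's label selectors."""
    return (selector_matches(policy["from"], source)
            and selector_matches(policy["to"], destination))


# Hypothetical policy: checkout pods may call payments pods.
policy = {"from": {"app": "checkout"}, "to": {"app": "payments"}}

checkout = Workload("checkout-7d9f", "10.1.4.12", {"app": "checkout"})
payments = Workload("payments-5c2a", "10.2.8.40", {"app": "payments"})
allowed_before = is_allowed(policy, checkout, payments)

# The payments pod is rescheduled (possibly into another cluster) and gets a new IP.
payments.ip = "10.9.0.7"
allowed_after = is_allowed(policy, checkout, payments)

# A rule written against the old address would no longer match.
ip_rule_still_matches = payments.ip == "10.2.8.40"
```

&lt;p&gt;In this toy model the label-based decision is the same before and after the move, which is the property the paragraph above describes.&lt;/p&gt;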

&lt;p&gt;A cluster mesh stops treating each cluster as a sovereign country with its own borders, customs, and identity papers, and starts treating the fleet as a federation where workloads move freely under shared rules.&lt;/p&gt;

&lt;h2&gt;
  
  
  Cluster mesh means a more secure and resilient architecture
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7co4twpd0gd5z1pu3ut2.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7co4twpd0gd5z1pu3ut2.jpg" alt="Workloads connect across clusters in a Kubernetes native way" width="800" height="387"&gt;&lt;/a&gt;&lt;br&gt;
&lt;em&gt;Workloads connect across clusters in a Kubernetes native way&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;By treating a group of connected clusters as members of one network, cluster mesh shrinks the attack surface by keeping internal services off public DNS where they did not belong in the first place. Policies stay valid as workloads move across nodes, regions, and clusters, because identity rather than IP is what they bind to. Inter-cluster traffic can be encrypted and policies applied uniformly across the entire fleet.&lt;/p&gt;

&lt;p&gt;Pods connect to each other directly and observability stops being a per-cluster silo. Flow logs can now follow a request from the client all the way to the service handling it, even when those two live in different clusters.&lt;/p&gt;

&lt;p&gt;Day-to-day operations become smoother too, since the platform team stops having to file tickets with the networking team every time a new service ships and connecting that service no longer requires a new VIP or a new DNS record.&lt;/p&gt;

&lt;p&gt;In other words, calls between clusters are treated like the east-west traffic they are.&lt;/p&gt;

&lt;p&gt;Even compliance work gets noticeably lighter because the default state of the network already satisfies most of what auditors ask about: encryption in transit, identity attribution, and workload-level audit trails.&lt;/p&gt;

&lt;h2&gt;
  
  
  How mature is your inter-cluster networking?
&lt;/h2&gt;

&lt;p&gt;Here is what each of the four stages looks like in practice, and what each one says about the work that still lies ahead.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Beginner.&lt;/strong&gt; A single cluster, or multiple clusters with no inter-cluster connectivity. Services exposed via external load balancers and manual DNS records.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Intermediate.&lt;/strong&gt; VPC peering or transit gateways connect the clusters. External DNS handles service discovery. Some traffic is encrypted, much of it isn’t.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Advanced.&lt;/strong&gt; A cluster mesh with overlay networking, native Kubernetes service discovery, WireGuard encryption, and identity-based policies enforced consistently across clusters.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Optimized.&lt;/strong&gt; The cluster mesh is fully GitOps-managed, with unified observability and real-time anomaly detection across the fleet.&lt;/li&gt;
&lt;/ul&gt;

&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;&lt;th&gt;Stage&lt;/th&gt;&lt;th&gt;Connectivity&lt;/th&gt;&lt;th&gt;Service discovery&lt;/th&gt;&lt;th&gt;Encryption&lt;/th&gt;&lt;th&gt;Policy &amp;amp; observability&lt;/th&gt;&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;&lt;td&gt;&lt;strong&gt;Beginner&lt;/strong&gt;&lt;/td&gt;&lt;td&gt;Single cluster, or multi-cluster with no inter-cluster connectivity&lt;/td&gt;&lt;td&gt;Manual DNS records, external load balancers&lt;/td&gt;&lt;td&gt;None&lt;/td&gt;&lt;td&gt;Per-cluster, no fleet view&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;&lt;strong&gt;Intermediate&lt;/strong&gt;&lt;/td&gt;&lt;td&gt;VPC peering or transit gateways&lt;/td&gt;&lt;td&gt;External DNS&lt;/td&gt;&lt;td&gt;Partial&lt;/td&gt;&lt;td&gt;Per-cluster, inconsistent&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;&lt;strong&gt;Advanced&lt;/strong&gt;&lt;/td&gt;&lt;td&gt;Cluster mesh with overlay networking&lt;/td&gt;&lt;td&gt;Native Kubernetes service discovery&lt;/td&gt;&lt;td&gt;WireGuard, end-to-end&lt;/td&gt;&lt;td&gt;Identity-based, consistent across clusters&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;&lt;strong&gt;Optimized&lt;/strong&gt;&lt;/td&gt;&lt;td&gt;GitOps-managed cluster mesh&lt;/td&gt;&lt;td&gt;Native, fully automated&lt;/td&gt;&lt;td&gt;End-to-end&lt;/td&gt;&lt;td&gt;Unified observability, real-time anomaly detection&lt;/td&gt;&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;

&lt;p&gt;In our experience, most enterprises are at the Intermediate stage for connectivity and at Beginner for the surrounding pillars (egress, &lt;a href="https://www.tigera.io/learn/guides/microsegmentation/microsegmentation-security/" rel="noopener noreferrer"&gt;microsegmentation&lt;/a&gt;, and observability) that compound on top of it. This will likely change as organizations grow into their Kubernetes adoption, progressing step by step towards operational excellence.&lt;/p&gt;

&lt;h2&gt;
  
  
  AI raises the stakes
&lt;/h2&gt;

&lt;p&gt;AI has made proper multi-cluster architecture even more urgent. GPU scarcity by region, data residency requirements for training data, blast-radius isolation between training and inference, and the operational pattern of separating data preparation, training, and inference into purpose-built clusters are pushing teams into multi-cluster topologies whether they planned for it or not. The architecture you bring to that moment determines whether multi-cluster becomes a strength or a liability.&lt;/p&gt;

&lt;p&gt;The full nine-pillar reference architecture, including the egress, microsegmentation, observability, and service mesh pillars that build directly on cluster mesh, is in our ebook, &lt;a href="https://www.tigera.io/lp/ebook-building-resilient-multi-cluster-kubernetes/" rel="noopener noreferrer"&gt;&lt;em&gt;Building Resilient Multi-Cluster Kubernetes&lt;/em&gt;&lt;/a&gt;. If you would rather work through it hands-on, our &lt;a href="https://www.tigera.io/event/from-reference-architecture-to-production-a-hands-on-kubernetes-workshop/" rel="noopener noreferrer"&gt;reference architecture workshop&lt;/a&gt; walks through the first five pillars, the next steps on your operational maturity journey, in a working environment.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;&lt;strong&gt;Read the ebook: &lt;a href="https://www.tigera.io/lp/ebook-building-resilient-multi-cluster-kubernetes/" rel="noopener noreferrer"&gt;Building Resilient Multi-Cluster Kubernetes →&lt;/a&gt;&lt;/strong&gt;&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;The post &lt;a href="https://www.tigera.io/blog/kubernetes-operational-maturity-secure-and-resilient-cluster-federation-with-cluster-mesh/" rel="noopener noreferrer"&gt;Kubernetes Operational Maturity: Secure and Resilient Cluster Federation with Cluster Mesh&lt;/a&gt; appeared first on &lt;a href="https://www.tigera.io" rel="noopener noreferrer"&gt;Tigera – Creator of Calico&lt;/a&gt;.&lt;/p&gt;

</description>
      <category>technicalblog</category>
      <category>bestpractices</category>
      <category>unifiedplatform</category>
    </item>
    <item>
      <title>The Five Pillars of AI Agent Accountability: A Diagnostic Framework for Engineering Leaders</title>
      <dc:creator>Alister Baroi</dc:creator>
      <pubDate>Fri, 22 May 2026 17:51:17 +0000</pubDate>
      <link>https://dev.to/tigeraio/the-five-pillars-of-ai-agent-accountability-a-diagnostic-framework-for-engineering-leaders-34ip</link>
      <guid>https://dev.to/tigeraio/the-five-pillars-of-ai-agent-accountability-a-diagnostic-framework-for-engineering-leaders-34ip</guid>
      <description>&lt;p&gt;You’re in a board meeting. The CISO is presenting on AI risk. The CFO asks a simple question:&lt;/p&gt;

&lt;p&gt;&lt;em&gt;&lt;strong&gt;“When that finance agent we deployed last quarter accessed a customer payment record, can we tell who authorized it, what policy permitted it, and produce the full audit trail?”&lt;/strong&gt;&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;The CISO looks at the head of the platform. The head of the platform looks at security. Nobody answers.&lt;/p&gt;

&lt;p&gt;If you can picture that meeting happening at your company, you’re not alone. &lt;a href="https://www.mckinsey.com/capabilities/risk-and-resilience/our-insights/trust-in-the-age-of-agents" rel="noopener noreferrer"&gt;McKinsey&lt;/a&gt; found that &lt;strong&gt;only one-third of organizations have AI agent governance maturity at level 3 or higher&lt;/strong&gt;. The other two-thirds are exactly the silence in that boardroom.&lt;/p&gt;

&lt;p&gt;This post is the diagnostic framework that closes that gap. It’s part 2 of a five-part series on AI agent accountability, and if you only have time to read one post in the series, read this one. By the end you’ll have a five-question assessment to run with your team this week, and a maturity model to score where you stand today.&lt;/p&gt;

&lt;p&gt;Not all governance equals &lt;a href="https://www.tigera.io/blog/your-ai-agents-are-autonomous-but-are-they-accountable/" rel="noopener noreferrer"&gt;AI agent accountability&lt;/a&gt;. Many enterprises believe they’re covered because they have network policies or an API gateway, but governance without accountability is &lt;strong&gt;security theater&lt;/strong&gt;: it might prevent some bad outcomes, but it can’t prove why good outcomes were permitted, trace what happened when something goes wrong, or satisfy an auditor asking for evidence.&lt;/p&gt;

&lt;p&gt;True AI agent accountability requires five distinct capabilities working together. Miss any one and you have a gap that will surface during your next incident, audit, or regulatory review.&lt;/p&gt;

&lt;h2&gt;
  
  
  What are the five pillars of AI agent accountability?
&lt;/h2&gt;

&lt;p&gt;The five pillars are:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Traceability:&lt;/strong&gt; Every agent interaction produces an end-to-end record automatically.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Authorization provenance:&lt;/strong&gt; Every permitted action is traceable to a specific, auditable policy.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Identity and ownership:&lt;/strong&gt; Every agent has a verified identity and a clear human owner.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Policy-based governance at scale:&lt;/strong&gt; Declarative, attribute-based policies that don’t break at 100 agents.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Human oversight and intervention:&lt;/strong&gt; Humans can see, review, and override agent behavior in real time.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffia6tmjhemuryp5syqo5.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffia6tmjhemuryp5syqo5.png" width="799" height="173"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Each pillar comes with a question you can ask your team. Below, we’ll work through each one; at the end you’ll find a five-level maturity model and a five-question assessment to score where you stand today.&lt;/p&gt;

&lt;h3&gt;
  
  
  Pillar 1: Traceability
&lt;/h3&gt;

&lt;p&gt;&lt;em&gt;&lt;strong&gt;“Can you trace what happened, end to end?”&lt;/strong&gt;&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;When Agent A calls Agent B, which calls Tool C, which accesses Database D, can you reconstruct the entire chain? Not just that it happened, but when, how long each step took, and what the outcome was at each hop?&lt;/p&gt;

&lt;p&gt;Traceability means every agent interaction produces a structured, correlated record automatically. This is distributed tracing applied to agent communication. Each hop in the chain is a span; the full trace tells the complete story of an interaction from trigger to outcome.&lt;/p&gt;
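
&lt;p&gt;As a rough illustration of “each hop is a span,” the Python sketch below stores one record per hop and rebuilds the chain for a single interaction. The &lt;code&gt;Span&lt;/code&gt; fields, the agent names, and the timings are hypothetical, and the sketch assumes a simple linear chain in which each span has at most one child; real tracing systems also handle fan-out.&lt;/p&gt;

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Span:
    """One hop in an interaction (hypothetical field names)."""
    trace_id: str
    span_id: str
    parent_id: Optional[str]
    caller: str
    callee: str
    start_ms: int
    duration_ms: int
    outcome: str


def reconstruct_chain(spans, trace_id):
    """Return the spans of one trace ordered from trigger to final hop.

    Assumes a linear chain: each span has at most one child.
    """
    in_trace = [s for s in spans if s.trace_id == trace_id]
    by_parent = {s.parent_id: s for s in in_trace}
    chain = []
    current = by_parent.get(None)  # the root span has no parent
    while current is not None:
        chain.append(current)
        current = by_parent.get(current.span_id)
    return chain


# Spans arrive out of order and mixed with other traces.
spans = [
    Span("t1", "s3", "s2", "tool-c", "database-d", 130, 40, "ok"),
    Span("t1", "s1", None, "agent-a", "agent-b", 100, 90, "ok"),
    Span("t2", "x1", None, "agent-e", "agent-f", 200, 5, "denied"),
    Span("t1", "s2", "s1", "agent-b", "tool-c", 110, 70, "ok"),
]

chain = reconstruct_chain(spans, "t1")
path = [chain[0].caller] + [s.callee for s in chain]
```

&lt;p&gt;Here &lt;code&gt;path&lt;/code&gt; walks from Agent A through Agent B and Tool C to Database D, and each element of &lt;code&gt;chain&lt;/code&gt; carries the timestamp, duration, and outcome for its hop.&lt;/p&gt;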

&lt;p&gt;Without traceability, incident response is guesswork. You know something went wrong, but you can’t determine the chain of events that led there.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The test:&lt;/strong&gt; Can your team pull up a single interaction and see the full path it took across every agent and tool in your network, with timestamps and outcomes at every hop?&lt;/p&gt;

&lt;h3&gt;
  
  
  Pillar 2: Authorization provenance
&lt;/h3&gt;

&lt;p&gt;&lt;em&gt;&lt;strong&gt;“Can you prove why it was permitted?”&lt;/strong&gt;&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Blocking unauthorized actions is table stakes. The harder (and more important) question is, can you prove why authorized actions were permitted?&lt;/p&gt;

&lt;p&gt;Authorization provenance means every allowed interaction is traceable to a specific, auditable policy. Not just “Agent A was allowed to call Agent B,” but “Agent A was allowed to call Agent B because Policy X grants agents with capability Y access to agents with risk-level Z.”&lt;/p&gt;
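
&lt;p&gt;One way to picture this is a decision record that carries its own justification. The Python sketch below is illustrative only: the &lt;code&gt;Policy&lt;/code&gt; shape, the attribute names, and the identifiers such as &lt;code&gt;policy-x&lt;/code&gt; are assumptions made for the example, not a product schema.&lt;/p&gt;

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Policy:
    """Hypothetical policy: callers with a capability may reach callees at a risk level."""
    policy_id: str
    caller_capability: str
    callee_risk_level: str


def authorize(policies, caller, callee):
    """Return a decision record that names the policy and attributes behind it."""
    for policy in policies:
        if (policy.caller_capability in caller["capabilities"]
                and callee["risk_level"] == policy.callee_risk_level):
            return {
                "allowed": True,
                "caller": caller["name"],
                "callee": callee["name"],
                "policy_id": policy.policy_id,
                "matched_attributes": {
                    "caller_capability": policy.caller_capability,
                    "callee_risk_level": policy.callee_risk_level,
                },
            }
    # Default deny: no policy matched, and the record says so explicitly.
    return {
        "allowed": False,
        "caller": caller["name"],
        "callee": callee["name"],
        "policy_id": None,
        "matched_attributes": {},
    }


policies = [Policy("policy-x", "payments-read", "medium")]
agent_a = {"name": "agent-a", "capabilities": ["payments-read"]}
agent_b = {"name": "agent-b", "risk_level": "medium"}
agent_c = {"name": "agent-c", "capabilities": ["search"]}

decision = authorize(policies, agent_a, agent_b)
denied = authorize(policies, agent_c, agent_b)
```

&lt;p&gt;The point of the sketch is the shape of &lt;code&gt;decision&lt;/code&gt;: an allow that records which policy fired and which attributes triggered it, rather than a bare yes or no.&lt;/p&gt;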

&lt;p&gt;This is the difference between a lock on the door and a sign-in sheet. The lock prevents unauthorized entry. The sign-in sheet proves who was authorized, when, and by what authority.&lt;/p&gt;

&lt;p&gt;Without authorization provenance, your compliance team cannot demonstrate that access was intentional and governed, only that it wasn’t blocked. That distinction is the difference between passing an audit and failing one.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The test:&lt;/strong&gt; For any &lt;a href="https://www.tigera.io/blog/how-ai-agents-communicate-understanding-the-a2a-protocol-for-kubernetes/" rel="noopener noreferrer"&gt;agent-to-agent interaction&lt;/a&gt; in your network, can you identify the specific policy that permitted it and the attributes that triggered that policy?&lt;/p&gt;

&lt;h3&gt;
  
  
  Pillar 3: Identity and ownership
&lt;/h3&gt;

&lt;p&gt;&lt;em&gt;&lt;strong&gt;“Who owns this agent, and who is responsible when it acts?”&lt;/strong&gt;&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Every agent must have two things: a verified identity (it is who it claims to be) and a &lt;a href="https://thehackernews.com/2026/01/who-approved-this-agent-rethinking.html" rel="noopener noreferrer"&gt;clear owner&lt;/a&gt; (a person accountable for its behavior).&lt;/p&gt;

&lt;p&gt;Identity means the governance layer can verify that an agent is genuinely the agent it claims to be, and not a compromised workload masquerading as a legitimate one. This requires cryptographic identity verification, not just a name in a configuration file.&lt;/p&gt;

&lt;p&gt;Ownership means that when an incident occurs, there is a specific person (not a team alias, not a Slack channel, not “the AI team”) who is accountable. Without clear ownership definitions, &lt;a href="https://www.paloaltonetworks.com/cyberpedia/what-is-agentic-ai-governance" rel="noopener noreferrer"&gt;accountability diffuses across components&lt;/a&gt;, and diffused accountability is no accountability at all.&lt;/p&gt;

&lt;p&gt;Agent registration should capture: who registered it, what team owns it, what it’s designed to do, and what permissions it’s been granted.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The test:&lt;/strong&gt; Pick any agent in your network. Can you immediately identify its verified identity, who registered it, which team owns it, and what permissions it has, all without asking around?&lt;/p&gt;

&lt;h3&gt;
  
  
  Pillar 4: Policy-based governance at scale
&lt;/h3&gt;

&lt;p&gt;&lt;em&gt;&lt;strong&gt;“Does your security model survive agent #101?”&lt;/strong&gt;&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;With 10 agents, you can manage permissions by hand. You write explicit rules: “Agent A can call Agent B. Agent C can call Agent D.” You maintain a spreadsheet. It works.&lt;/p&gt;

&lt;p&gt;With 100 agents, it doesn’t. With 1,000, it’s impossible. Every new agent requires updating every relevant policy. Permissions become a tangled web that nobody fully understands. New agents get deployed ungoverned because updating the allow-lists is too slow.&lt;/p&gt;

&lt;p&gt;Scalable governance requires &lt;strong&gt;declarative, attribute-based policies&lt;/strong&gt;. Instead of naming specific agents, policies reference agent attributes: capabilities, risk levels, teams, environments.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;em&gt;“Low-risk agents can communicate with low-risk agents.”&lt;/em&gt;&lt;/li&gt;
&lt;li&gt;&lt;em&gt;“Agents on the finance team can access finance MCP servers.”&lt;/em&gt;&lt;/li&gt;
&lt;li&gt;&lt;em&gt;“Agents in production can only call production-grade tools.”&lt;/em&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;When a new agent registers with matching attributes, it’s governed from day one — automatically. No policy updates required. No spreadsheet to maintain. The governance scales with the agent network, not against it.&lt;/p&gt;
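
&lt;p&gt;The scaling argument can be sketched in a few lines of Python. The example below models only the first two example policies from the list above as attribute selectors; the registry contents, attribute keys, and agent names are hypothetical. What it tries to show is that registering a new agent changes the registry and nothing else.&lt;/p&gt;

```python
# Each policy is a pair of attribute selectors, never a list of agent names.
POLICIES = [
    {"name": "low-risk-peers",
     "source": {"risk": "low"},
     "target": {"risk": "low"}},
    {"name": "finance-mcp",
     "source": {"team": "finance"},
     "target": {"kind": "mcp-server", "team": "finance"}},
]


def matches(selector, attributes):
    """True when every selector key is present with the same value."""
    return all(attributes.get(k) == v for k, v in selector.items())


def governing_policies(source, target):
    """Names of the policies that permit source to call target (empty means deny)."""
    return [p["name"] for p in POLICIES
            if matches(p["source"], source) and matches(p["target"], target)]


# Hypothetical registry of already-deployed agents and servers.
registry = {
    "agent-a": {"risk": "low", "team": "support"},
    "finance-mcp-1": {"kind": "mcp-server", "team": "finance", "risk": "high"},
}
policy_count_before = len(POLICIES)

# A new agent registers with its attributes; no policy is edited.
registry["agent-101"] = {"risk": "low", "team": "finance"}

to_peer = governing_policies(registry["agent-101"], registry["agent-a"])
to_mcp = governing_policies(registry["agent-101"], registry["finance-mcp-1"])
support_to_mcp = governing_policies(registry["agent-a"], registry["finance-mcp-1"])
policy_count_after = len(POLICIES)
```

&lt;p&gt;In this sketch the new agent is covered by &lt;code&gt;low-risk-peers&lt;/code&gt; and &lt;code&gt;finance-mcp&lt;/code&gt; from the moment it registers, the support agent is still denied access to the finance server, and the policy list is the same length before and after.&lt;/p&gt;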

&lt;p&gt;&lt;strong&gt;The test:&lt;/strong&gt; When your team deploys a new agent next week, will it be governed by existing policies automatically, or will someone need to manually update an allow-list?&lt;/p&gt;

&lt;h3&gt;
  
  
  Pillar 5: Human oversight and intervention
&lt;/h3&gt;

&lt;p&gt;&lt;em&gt;&lt;strong&gt;“Can a human review, approve, or override?”&lt;/strong&gt;&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;The &lt;a href="https://thefuturesociety.org/aiagentsintheeu/" rel="noopener noreferrer"&gt;EU AI Act&lt;/a&gt; (Article 14) requires effective human oversight of high-risk AI systems. But human oversight doesn’t mean a human approves every agent action; that would eliminate the value of agents entirely.&lt;/p&gt;

&lt;p&gt;Effective human oversight means:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Visibility:&lt;/strong&gt; Humans can see what agents are doing, which agents are communicating, and what policies govern them.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Review:&lt;/strong&gt; Humans can examine agent interactions after the fact, with enough context to understand what happened and why.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Intervention:&lt;/strong&gt; Humans can modify policies, revoke agent access, or halt agent communication in real time when necessary.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Dashboard, not log file:&lt;/strong&gt; The oversight interface should be a visual dashboard with communication graphs and policy visualization, not a grep command on a log file.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;The test:&lt;/strong&gt; Right now, can someone on your team open a dashboard, see which agents are communicating with which, and modify the policies governing that communication — all without touching a terminal?&lt;/p&gt;

&lt;h2&gt;
  
  
  How to assess your AI agent accountability maturity
&lt;/h2&gt;

&lt;p&gt;Run this five-question assessment with your platform lead, security lead, and one compliance representative in a 30-minute meeting. For each question, you have three possible answers: &lt;em&gt;&lt;strong&gt;Yes&lt;/strong&gt;&lt;/em&gt; (you’ve got it), &lt;em&gt;&lt;strong&gt;Partial&lt;/strong&gt;&lt;/em&gt; (you can answer for some agents but not all), or &lt;em&gt;&lt;strong&gt;No&lt;/strong&gt;&lt;/em&gt; (gap).&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Pick the most recent agent-to-agent interaction in your environment.&lt;/strong&gt; Can someone on the call pull up the full trace (every hop, timestamp, and outcome) in under five minutes? (Pillar 1)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;For that same interaction, can you name the specific policy that permitted it&lt;/strong&gt; and the agent attributes that triggered the match? (Pillar 2)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Pick one production agent at random.&lt;/strong&gt; Can you produce (from a system, not a wiki) its verified identity, registered owner, team, and granted permissions? (Pillar 3)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Imagine your team deploys a brand-new agent tomorrow.&lt;/strong&gt; Will your existing policies govern it automatically, or will someone need to update an allow-list? (Pillar 4)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Open whatever dashboard your team uses to view agent activity.&lt;/strong&gt; Does it show communication graphs and policy state visually, or are you grep-ing a log file? (Pillar 5)&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Count your answers. &lt;strong&gt;Five Yes&lt;/strong&gt; = Level 4. &lt;strong&gt;Mostly Yes, occasional Partial&lt;/strong&gt; = Level 3. &lt;strong&gt;Yes on identity but No on policy enforcement&lt;/strong&gt; = Level 2. &lt;strong&gt;Inventory only, no identity verification&lt;/strong&gt; = Level 1. &lt;strong&gt;Couldn’t run the assessment because you don’t know what agents exist&lt;/strong&gt; = Level 0.&lt;/p&gt;

&lt;p&gt;If you scored below Level 3, you’re in the McKinsey two-thirds. The good news: you now know exactly which pillar to fix first.&lt;/p&gt;
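
&lt;p&gt;If you want to record results consistently across teams, the scoring rubric can be written down as a small function. The Python sketch below is one possible strict reading of that rubric, not an official scoring tool. It assumes that “mostly Yes, occasional Partial” means no No answers and at most two Partials, that “policy enforcement” corresponds to the Pillar 4 question, and that any combination the rubric does not name falls back to Level 1.&lt;/p&gt;

```python
def maturity_level(answers):
    """Map assessment answers to a maturity level (0 to 4).

    answers: dict keyed by pillar number 1..5 with values "yes", "partial",
    or "no"; pass None when the assessment could not be run at all.
    The thresholds below are assumptions, not part of the published rubric.
    """
    if answers is None:
        return 0  # you don't know what agents exist
    values = [answers[pillar] for pillar in (1, 2, 3, 4, 5)]
    if all(v == "yes" for v in values):
        return 4
    if "no" not in values and values.count("partial") in (1, 2):
        return 3  # mostly Yes, occasional Partial
    if answers[3] == "yes" and answers[4] == "no":
        return 2  # identity verified, policy not yet enforced
    return 1  # inventory only (also the fallback for unnamed combinations)


all_yes = {1: "yes", 2: "yes", 3: "yes", 4: "yes", 5: "yes"}
one_partial = {1: "yes", 2: "partial", 3: "yes", 4: "yes", 5: "yes"}
identity_only = {1: "no", 2: "no", 3: "yes", 4: "no", 5: "no"}
inventory_only = {1: "no", 2: "no", 3: "no", 4: "no", 5: "no"}

levels = [
    maturity_level(all_yes),
    maturity_level(one_partial),
    maturity_level(identity_only),
    maturity_level(inventory_only),
    maturity_level(None),
]
```

&lt;p&gt;Adjust the thresholds to match how your own team interprets “mostly” and “occasional”; the value of writing it down is that every team is then scored the same way.&lt;/p&gt;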

&lt;h2&gt;
  
  
  The Accountability Maturity Model
&lt;/h2&gt;

&lt;p&gt;The five pillars map to a five-level progression. Use it to track where you are today and where you’re heading.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fi4cm6osgqs2frjf646kq.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fi4cm6osgqs2frjf646kq.png" width="800" height="193"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;&lt;th&gt;Level&lt;/th&gt;&lt;th&gt;State&lt;/th&gt;&lt;th&gt;What you can do&lt;/th&gt;&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;&lt;td&gt;Level 0: Blind&lt;/td&gt;&lt;td&gt;No visibility&lt;/td&gt;&lt;td&gt;You don’t know what agents exist in your network, let alone what they’re doing&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;Level 1: Inventory&lt;/td&gt;&lt;td&gt;Awareness&lt;/td&gt;&lt;td&gt;You know what agents exist, but not what they do, who they talk to, or what policies govern them&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;Level 2: Authenticated&lt;/td&gt;&lt;td&gt;Identity verification&lt;/td&gt;&lt;td&gt;Your agents have cryptographic identities, but communication is not yet governed by policy&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;Level 3: Controlled&lt;/td&gt;&lt;td&gt;Policy enforcement&lt;/td&gt;&lt;td&gt;You have policies governing agent communication, and unauthorized interactions are blocked&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;Level 4: Accountable&lt;/td&gt;&lt;td&gt;Full accountability&lt;/td&gt;&lt;td&gt;You can trace, prove, and audit every agent action, with authorization provenance, identity verification, and human oversight&lt;/td&gt;&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;

&lt;p&gt;Most enterprises today are at &lt;strong&gt;Level 0&lt;/strong&gt; or &lt;strong&gt;Level 1&lt;/strong&gt;. They lack verified identities, policy enforcement, and end-to-end auditability. The goal is Level 4, and the gap between where most organizations are and where they need to be is the &lt;a href="https://www.tigera.io/blog/the-ai-agent-accountability-crisis-why-governance-isnt-keeping-up-with-deployment/" rel="noopener noreferrer"&gt;AI agent accountability crisis&lt;/a&gt; this framework addresses.&lt;/p&gt;

&lt;h2&gt;
  
  
  Common questions
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;What is the most important pillar of AI agent accountability?&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
All five are required, but authorization provenance is the one most enterprises miss. Plenty of teams can block unauthorized actions; very few can show why an authorized action was permitted, traceable to a specific policy. Without provenance, you have security but not accountability.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;How is AI agent accountability different from observability?&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
Observability tells you what happened. Accountability tells you what was permitted, by which policy, and on whose authority. Observability is a prerequisite, but it’s not enough on its own; your trace data needs to be tied to policy decisions and identity claims to count as accountability.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;How does AI agent accountability relate to AI agent security?&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
They’re complementary, not interchangeable. &lt;a href="https://www.tigera.io/learn/guides/ai-agent-security/" rel="noopener noreferrer"&gt;AI agent security&lt;/a&gt; focuses on preventing compromise—stopping prompt injection, blocking unauthorized API access, eliminating shadow agents. Accountability focuses on proving what authorized agents did and why. You need both: security keeps the bad agents out, accountability keeps the good agents honest. The five pillars in this framework assume strong AI agent security is already in place.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Can I assess my AI agent governance maturity using these pillars?&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
Yes — that’s exactly what the assessment and maturity model above are for. Walk through each pillar’s “test” with your team. If you can’t answer cleanly on all five, you’re at Level 3 or below, regardless of what tooling you’ve deployed.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Do I need all five pillars on day one?&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
No, but you need a path to all five. A platform that delivers two pillars natively and forces you to bolt on the other three is an accountability gap waiting to surface. We cover what to look for in future articles of this series.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What is the difference between Level 3 and Level 4 in the maturity model?&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
Level 3 means unauthorized interactions are blocked: you have policy enforcement. Level 4 means you can also prove why every authorized interaction was permitted, with audit evidence tied to a specific policy and identity. Level 3 is security; Level 4 is accountability.&lt;/p&gt;

&lt;h2&gt;
  
  
  Key takeaways
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;AI agent accountability rests on five pillars: traceability, authorization provenance, identity and ownership, policy at scale, and human oversight.&lt;/li&gt;
&lt;li&gt;Each pillar has a clear test you can run against your environment today.&lt;/li&gt;
&lt;li&gt;The five pillars map to a five-level Accountability Maturity Model — most enterprises are at Level 0 or 1.&lt;/li&gt;
&lt;li&gt;Run the 5-question assessment with your platform, security, and compliance leads to score where you stand.&lt;/li&gt;
&lt;li&gt;Missing any single pillar creates a gap that will surface during your next incident, audit, or regulatory review.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Get the strategic guide for accountable AI agents
&lt;/h2&gt;

&lt;p&gt;We wrote a strategic guide for engineering and security leaders that goes deeper into each pillar, including detailed assessment questions, the full maturity model, and a practical roadmap to Level 4.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Accountable AI Agents: A Strategic Guide for AI &amp;amp; Security Leaders Governing Autonomous AI at Scale&lt;/strong&gt; — no code, no product demos. Just the framework your leadership team needs.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://info.tigera.io/rs/805-GFH-732/images/Whitepaper_Accountability_for_AI_Agents.pdf" rel="noopener noreferrer"&gt;&lt;em&gt;&lt;strong&gt;Get the strategic guide for accountable AI agents →&lt;/strong&gt;&lt;/em&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The post &lt;a href="https://www.tigera.io/blog/the-five-pillars-of-ai-agent-accountability-a-diagnostic-framework-for-engineering-leaders/" rel="noopener noreferrer"&gt;The Five Pillars of AI Agent Accountability: A Diagnostic Framework for Engineering Leaders&lt;/a&gt; appeared first on &lt;a href="https://www.tigera.io" rel="noopener noreferrer"&gt;Tigera – Creator of Calico&lt;/a&gt;.&lt;/p&gt;

</description>
      <category>featuredblog</category>
      <category>technicalblog</category>
      <category>aiagentsecurity</category>
      <category>bestpractices</category>
    </item>
    <item>
      <title>KubeVirt Live Migration Done Right: What it Takes to Run VMs on Kubernetes</title>
      <dc:creator>Alister Baroi</dc:creator>
      <pubDate>Thu, 14 May 2026 20:53:44 +0000</pubDate>
      <link>https://dev.to/tigeraio/kubevirt-live-migration-done-right-what-it-takes-to-run-vms-on-kubernetes-369i</link>
      <guid>https://dev.to/tigeraio/kubevirt-live-migration-done-right-what-it-takes-to-run-vms-on-kubernetes-369i</guid>
      <description>&lt;p&gt;Running VMs in Kubernetes sounds like a crazy workaround for avoiding vendor lock-in, and standardizing legacy applications and newer containerized workloads on one control plane with one set of security policies to govern them all. It is, however, a rapidly growing pattern, and &lt;a href="https://www.tigera.io/learn/guides/kubevirt/kubevirt-live-migration/" rel="noopener noreferrer"&gt;KubeVirt live migration&lt;/a&gt; — moving running VMs between nodes without downtime — is increasingly central to platform engineering use cases that require full VMs, like on-demand CI/CD pipelines.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://www.tigera.io/learn/guides/kubevirt/" rel="noopener noreferrer"&gt;KubeVirt&lt;/a&gt; is gaining traction as a way to bring VMs into Kubernetes as first-class workloads, managed with the same tools and primitives that platform teams already use for containers. It has, however, introduced some unique challenges.&lt;/p&gt;

&lt;p&gt;Here’s the uncomfortable truth about that migration: compute and storage are the easy parts. Networking is where migrations stall, roadblocks multiply, and platform teams start questioning whether KubeVirt was the right call in the first place.&lt;/p&gt;

&lt;p&gt;If your VMs have no fixed IP dependencies, no VLAN memberships, and no upstream firewall rules scoped to specific subnets, you can migrate them into Kubernetes without losing sleep over the networking layer. If you’re running hundreds or thousands of VMs with IP addresses hardcoded into application configs, DNS entries, and firewall ACLs — and you need to move those VMs to Kubernetes without rewriting any of it — then your networking layer is about to become the most important decision in your migration.&lt;/p&gt;

&lt;p&gt;What follows is a technical walk-through of the L2 plumbing that keeps KubeVirt VMs connected when they move between nodes in a production cluster and how it eliminates the need to update your complicated network infrastructure.&lt;/p&gt;

&lt;h2&gt;
  
  
  Kubernetes Networking Wasn’t Built for VMs
&lt;/h2&gt;

&lt;p&gt;In a traditional hypervisor environment — vSphere, Hyper-V, Nutanix — VMs sit on VLANs and have fixed IPs. Upstream firewalls, load balancers, and DNS records all reference those IPs. A security team owns the VLAN segmentation while the network team owns the routing. This network infrastructure is the accumulated work of many years and forms a static, and somewhat brittle, system of securing hosts and getting traffic to its destination. The &lt;a href="https://www.tigera.io/learn/guides/kubernetes-networking/" rel="noopener noreferrer"&gt;Kubernetes networking&lt;/a&gt; model, with its dynamic allocation of IPs that are meaningful only inside a cluster, is at odds with this traditional approach. Therein lies the problem.&lt;/p&gt;

&lt;p&gt;The upstream network has no direct visibility into the pod network. When a VM is migrated from your existing hypervisor into Kubernetes, its original network segment is not preserved. The VM gets a new IP from the pod CIDR, and every firewall rule, DNS entry, and load balancer config that referenced the old IP is now broken. For a handful of VMs, you can reconfigure your firewall rules and routing manually. For hundreds or thousands, reconfiguration is not only costly in engineering effort but also risks breaking critical functionality and introducing security blind spots.&lt;/p&gt;

&lt;h2&gt;
  
  
  Two Networking Modes, Two Different Problems
&lt;/h2&gt;

&lt;p&gt;Before diving into solutions, it helps to understand how KubeVirt presents networking to VMs. There are two modes for the primary pod interface, and they solve different problems.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Masquerade mode&lt;/strong&gt; decouples the pod IP from the VM IP. KubeVirt assigns a static IP to the VM internally and uses NAT rules to translate between the two. Live migration works out of the box because the pod IP can change without affecting the VM. The trade-off is that you need a service-level abstraction to reach the VM from outside the pod, which makes this mode impractical for production workloads that need stable, directly-addressable IPs.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Bridge mode&lt;/strong&gt; is the production-grade option. The pod IP and the VM IP are identical. The VM is directly reachable on the network. No NAT, no service abstraction. But bridge mode introduces a hard problem: when a VM live-migrates to a new node, KubeVirt creates a new pod on the destination. That new pod gets a fresh IP from the CNI. The VM still thinks it has its original IP. The result is a routing mismatch — the network doesn’t know where to send traffic, and the VM’s connections break.&lt;/p&gt;
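
&lt;p&gt;As a rough illustration of where this choice lives, the binding is selected per interface in the VM spec. The manifest below is a minimal sketch rather than a production template: the VM name, label, memory size, and container disk image are placeholders chosen for this example. Swapping &lt;code&gt;bridge: {}&lt;/code&gt; for &lt;code&gt;masquerade: {}&lt;/code&gt; would put the primary interface in masquerade mode instead.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;apiVersion: kubevirt.io/v1
kind: VirtualMachine
metadata:
  name: legacy-app-vm            # placeholder name
spec:
  runStrategy: Always
  template:
    metadata:
      labels:
        app: legacy-app          # placeholder label
    spec:
      domain:
        devices:
          disks:
            - name: rootdisk
              disk:
                bus: virtio
          interfaces:
            - name: default
              bridge: {}         # bridge mode: the VM uses the pod IP directly
        resources:
          requests:
            memory: 2Gi          # placeholder size
      networks:
        - name: default
          pod: {}                # attach the interface to the primary pod network
      volumes:
        - name: rootdisk
          containerDisk:
            image: quay.io/containerdisks/fedora:latest   # placeholder image
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;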

&lt;p&gt;KubeVirt only handles memory and disk migration. This does not matter much in masquerade mode since the VM’s IP is decoupled from the pod’s IP via NAT but becomes a critical consideration in bridge mode. So the &lt;a href="https://www.tigera.io/learn/guides/kubernetes-networking/kubernetes-cni/" rel="noopener noreferrer"&gt;CNI&lt;/a&gt; has to do three things to ensure nothing breaks: preserve the IP across the pod transition, converge routes so the rest of the network knows the VM has moved, and ensure network policy is in place on the destination before the VM goes live.&lt;/p&gt;

&lt;h2&gt;
  
  
  Live Migration in Bridge Mode: What Happens Under the Hood
&lt;/h2&gt;

&lt;p&gt;VMs need to move between nodes for a variety of reasons, such as maintenance, load balancing, or high availability. What actually happens during a live migration in bridge mode, and why is it so hard to get right?&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fpn602akqrbmext24o6av.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fpn602akqrbmext24o6av.png" alt="The 5-step network handover during live migration in bridge mode" width="799" height="462"&gt;&lt;/a&gt;&lt;br&gt;
&lt;em&gt;The 5-step network handover during live migration in bridge mode&lt;/em&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  The Core Challenge
&lt;/h3&gt;

&lt;p&gt;When a migration is triggered using the KubeVirt command line utility, &lt;a href="https://kubevirt.io/user-guide/user_workloads/virtctl_client_tool/" rel="noopener noreferrer"&gt;virtctl&lt;/a&gt;, KubeVirt creates a new pod on a destination node chosen by the Kubernetes scheduler in the usual way based on available resources, affinity rules, shared storage, etc. Next, KubeVirt copies the VM’s memory state using libvirt’s pre-copy and post-copy mechanisms.&lt;/p&gt;
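
&lt;p&gt;For reference, a migration request is represented in the Kubernetes API as a &lt;code&gt;VirtualMachineInstanceMigration&lt;/code&gt; object, which is the kind of event a CNI can watch for. The sketch below assumes a running VM instance named &lt;code&gt;legacy-app-vm&lt;/code&gt; in the &lt;code&gt;default&lt;/code&gt; namespace; both names are placeholders for this example.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;apiVersion: kubevirt.io/v1
kind: VirtualMachineInstanceMigration
metadata:
  name: legacy-app-vm-migration   # placeholder name for the migration object
  namespace: default
spec:
  vmiName: legacy-app-vm          # the running VM instance to migrate
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;Applying an object like this with kubectl should have the same effect as requesting the migration through virtctl: the scheduler picks the destination node and KubeVirt starts the target pod.&lt;/p&gt;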

&lt;p&gt;Then things get a bit interesting.&lt;/p&gt;

&lt;p&gt;The source pod continues running during the whole process. From a networking perspective, the same IP now needs to exist in two places temporarily — on the source node (where the VM is still running) and on the destination (where it’s about to go live).&lt;/p&gt;

&lt;p&gt;The CNI has to solve three problems simultaneously: IP persistence across pod lifecycles, route convergence during the handover window, and policy continuity so the VM isn’t exposed during migration.&lt;/p&gt;

&lt;p&gt;Let’s look at how Calico makes this happen.&lt;/p&gt;

&lt;h3&gt;
  
  
  IP Persistence: IPAM That Understands VMs
&lt;/h3&gt;

&lt;p&gt;Traditionally, Calico IPAM allocates IPs to pods. The IPAM handle (the ownership ticket for an IP reservation) is derived from the pod’s identity. This works for containers because pods are ephemeral. But a KubeVirt VM is more like a Kubernetes Deployment: you define a VirtualMachine resource, and KubeVirt creates a randomly-named pod to run it. Every time you restart or migrate the VM, the pod changes, but the VM stays the same with the same identity, memory state and the same IP.&lt;/p&gt;

&lt;p&gt;Since IPAM assigns the IP to the pod, every migration means a new IP, which defeats the purpose of preserving the VM’s IP and breaks any firewall rules, load balancer configurations or DNS records pointed at this IP.&lt;/p&gt;

&lt;p&gt;To fix this, Calico constructs the IPAM handle from the VM’s name instead of the pod’s name, ensuring that the reservation persists across pod lifecycles. When a VM migrates and its old pod is destroyed, the IPAM handle survives because it’s tied to the VM identity. When the new pod starts, IPAM finds the existing handle and reuses the same IP. During migration, IPAM transiently tracks dual ownership (an active owner on the source node and an alternate owner on the destination), then converges to a single owner once the source pod is cleaned up.&lt;/p&gt;

&lt;h3&gt;
  
  
  Route Convergence: The GARP Handover
&lt;/h3&gt;

&lt;p&gt;IP persistence ensures the VM keeps its address. Route convergence ensures the rest of the network knows where to find it. Here’s the sequence:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Migration initiated.&lt;/strong&gt; The CNI watches for migration events in the Kubernetes API. As soon as one is created, it starts preparing the destination node’s networking — policies, routes, interface configuration — so that everything is in place before the VM actually moves.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Memory pre-copy.&lt;/strong&gt; KubeVirt and libvirt handle the iterative memory copy. The VM continues running on the source node. Traffic continues routing to the source at standard priority.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;VM goes live on destination.&lt;/strong&gt; The VM broadcasts a &lt;a href="https://www.practicalnetworking.net/series/arp/gratuitous-arp/" rel="noopener noreferrer"&gt;Gratuitous ARP (GARP)&lt;/a&gt; packet announcing “I own this IP now, and I’m on this node.” Felix picks up this GARP and immediately advertises a high-priority route for the VM’s IP via the destination node. The networking layer picks this up and immediately starts steering traffic for the VM’s IP toward the new node, overriding the old route.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Route priority override.&lt;/strong&gt; This is a critical engineering detail. Normal routing uses a standard metric (1024). During migration, the destination node advertises the VM’s route at a higher priority, which means a lower metric (512). Because the source pod still exists briefly in a post-life state, both nodes momentarily have routes for the same IP. The higher-priority route ensures all traffic is forwarded to the destination, even before the source pod is fully cleaned up.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Cleanup and steady state.&lt;/strong&gt; Once the source pod terminates, the high-priority route is replaced with a standard-priority route. The source node’s route is removed. The network converges to its normal state with the VM on its new node at the same IP.&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  Policy Continuity
&lt;/h3&gt;

&lt;p&gt;The CNI watches for migration events and uses the lead time to pre-program network policies on the destination node while the memory copy is still in progress. By the time the VM cuts over, its security posture is already in place, leaving no gap for unsanctioned traffic to slip through.&lt;/p&gt;

&lt;p&gt;This works because &lt;a href="https://www.tigera.io/learn/guides/kubernetes-security/kubernetes-network-policy/" rel="noopener noreferrer"&gt;Kubernetes network policies&lt;/a&gt; use label selectors, not IP addresses. The policies follow the VM’s identity (its labels, namespace, and network membership), not its physical location. When the VM appears on the destination node with the same labels, the same policies apply automatically. One nuance worth noting: while the policy rules carry over, stateful connection tracking (conntrack) does not currently replicate between nodes. Established connections survive because the routes converge, but the destination node evaluates them as new flows. Full conntrack replication is a planned future enhancement.&lt;/p&gt;
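
&lt;p&gt;To make the label-based behavior concrete, here is a minimal sketch of a standard Kubernetes NetworkPolicy that selects a VM by label. It assumes the VM’s launcher pod carries an &lt;code&gt;app: legacy-app&lt;/code&gt; label and that clients are labeled &lt;code&gt;role: frontend&lt;/code&gt;; the labels, policy name, and port are placeholders for this example.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-frontend-to-legacy-app   # placeholder name
  namespace: default
spec:
  podSelector:
    matchLabels:
      app: legacy-app                  # matches the VM pod on whichever node it runs
  policyTypes:
    - Ingress
  ingress:
    - from:
        - podSelector:
            matchLabels:
              role: frontend           # placeholder client label
      ports:
        - protocol: TCP
          port: 8080                   # placeholder port
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;Nothing in this policy names a node or an IP address, which is why the same rules can be programmed on the destination node ahead of the cutover.&lt;/p&gt;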

&lt;h2&gt;
  
  
  Portability and Standardization for VMs
&lt;/h2&gt;

&lt;p&gt;If you’re familiar with vSphere, you know that vMotion, paired with the vSphere Distributed Switch, handles live migration networking seamlessly. However, this transparency relies on a vertically integrated stack that is not portable to other cloud environments.&lt;/p&gt;

&lt;p&gt;In Kubernetes, the stack is disaggregated. Components like KubeVirt (VM lifecycle), CNI (networking), policy engines (security), and storage operators (disks) each manage their own part. For live migration, the CNI must coordinate with KubeVirt’s migration state machine to manage the VM’s temporary dual-existence across two nodes and converge routing without a centralized controller.&lt;/p&gt;

&lt;p&gt;The Kubernetes approach is fundamentally different. It uses open standards: CRI, CNI, CSI, and NetworkPolicy. KubeVirt extends this: VMs are custom resources, managed with kubectl and scheduled by the same control plane. This approach demands a CNI that understands the unique lifecycle, identity, and networking requirements of a pod running a VM, but it also makes VMs portable.&lt;/p&gt;

&lt;p&gt;It also means that your containers and VMs can be managed and monitored with the same policies and tools, which brings not only operational efficiency but also better security and more reliable auditing.&lt;/p&gt;

&lt;p&gt;Live migration is one piece of a larger networking story. If your KubeVirt rollout involves bridge mode at scale, multi-cluster topologies, BGP peering, or policy parity across VMs and containers, those decisions compound quickly. We pulled the full picture into &lt;a href="https://www.tigera.io/lp/ebook-the-complete-guide-to-vm-networking-for-kubernetes/" rel="noopener noreferrer"&gt;The Complete Guide to VM Networking for Kubernetes&lt;/a&gt;, a practitioner’s reference covering the architectural choices, networking modes, and operational patterns that determine whether a migration ships or stalls.&lt;/p&gt;

&lt;p&gt;Get &lt;a href="https://www.tigera.io/lp/ebook-the-complete-guide-to-vm-networking-for-kubernetes/" rel="noopener noreferrer"&gt;The Complete Guide to VM Networking for Kubernetes&lt;/a&gt; →&lt;/p&gt;

&lt;p&gt;The post &lt;a href="https://www.tigera.io/blog/kubevirt-live-migration-done-right-what-it-takes-to-run-vms-on-kubernetes/" rel="noopener noreferrer"&gt;KubeVirt Live Migration Done Right: What it Takes to Run VMs on Kubernetes&lt;/a&gt; appeared first on &lt;a href="https://www.tigera.io" rel="noopener noreferrer"&gt;Tigera – Creator of Calico&lt;/a&gt;.&lt;/p&gt;

</description>
      <category>technicalblog</category>
      <category>bestpractices</category>
      <category>vmmigration</category>
    </item>
    <item>
      <title>The AI Agent Accountability Crisis: Why Governance Isn’t Keeping Up With Deployment</title>
      <dc:creator>Alister Baroi</dc:creator>
      <pubDate>Thu, 14 May 2026 18:08:21 +0000</pubDate>
      <link>https://dev.to/tigeraio/the-ai-agent-accountability-crisis-why-governance-isnt-keeping-up-with-deployment-5cl0</link>
      <guid>https://dev.to/tigeraio/the-ai-agent-accountability-crisis-why-governance-isnt-keeping-up-with-deployment-5cl0</guid>
      <description>&lt;p&gt;Every enterprise is building AI agents. Marketing has one summarizing campaign performance. Engineering has one triaging incidents. Customer support has one resolving tickets. Finance has one processing invoices. Each was built by a different team, using a different framework, with different assumptions about security.&lt;/p&gt;

&lt;p&gt;Now those agents are talking to each other &lt;a href="https://www.tigera.io/blog/how-ai-agents-communicate-understanding-the-a2a-protocol-for-kubernetes/" rel="noopener noreferrer"&gt;through agent-to-agent (A2A) communication&lt;/a&gt;. The incident-triage agent calls the customer-support agent to check affected accounts. The invoice agent calls an external payment API. The marketing agent queries a data warehouse with customer records.&lt;/p&gt;

&lt;p&gt;When something goes wrong (and at this scale of deployment, it will), can you answer:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Who authorized the action?&lt;/li&gt;
&lt;li&gt;What policy permitted it?&lt;/li&gt;
&lt;li&gt;What was the full chain of events?&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If you can’t, you have an accountability gap.&lt;/p&gt;

&lt;p&gt;This is part one of a five-part series on AI agent accountability for engineering and security leaders. We’ll work through the gap between agent deployment and governance, the diagnostic framework that exposes it, why your existing tools won’t close it, and the principles you’ll need to evaluate any solution that claims it can.&lt;/p&gt;

&lt;h2&gt;
  
  
  What is AI agent accountability?
&lt;/h2&gt;

&lt;p&gt;AI agent accountability is the ability to trace, prove, and audit every action an AI agent takes. This includes which policy permitted the agent, which identity initiated it, and what the downstream effects were. It’s the layer above agent communication (MCP, A2A) and agent infrastructure (Kubernetes, GPUs, model serving) that answers the question: &lt;strong&gt;&lt;em&gt;who’s responsible when the agent acts?&lt;/em&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ff56qeqh952pqywk8hden.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ff56qeqh952pqywk8hden.png" width="800" height="398"&gt;&lt;/a&gt;&lt;br&gt;&lt;br&gt;
A landmark &lt;a href="https://fortune.com/2026/03/26/ai-agents-accountability-accenture-wharton-report/" rel="noopener noreferrer"&gt;2026 report from Accenture and the Wharton School of Business&lt;/a&gt; put the gap bluntly: “&lt;strong&gt;Intelligence may be scalable, but accountability is not.&lt;/strong&gt;” As enterprises race to deploy agents across every function, the governance architecture has not kept pace.&lt;/p&gt;

&lt;h2&gt;
  
  
  Agents are scaling faster than governance
&lt;/h2&gt;

&lt;p&gt;The scale of the problem is not theoretical anymore. Major analyst firms have quantified it:&lt;/p&gt;

&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Source&lt;/th&gt;
&lt;th&gt;Finding&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;McKinsey, 2026&lt;/td&gt;
&lt;td&gt;80% of organizations have encountered risky behavior from AI agents, actions that were unintended, unauthorized, or outside acceptable guardrails.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;McKinsey, 2026&lt;/td&gt;
&lt;td&gt;Only one-third (~33%) of organizations report governance maturity.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Gartner, 2025&lt;/td&gt;
&lt;td&gt;Over 40% of agentic AI projects will be canceled by the end of 2027 due to escalating costs, unclear value, or inadequate risk controls.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;ISACA, 2025&lt;/td&gt;
&lt;td&gt;66% of industry leaders believe formal agent accountability frameworks will become mandatory within the next two years.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Dataiku, 2026&lt;/td&gt;
&lt;td&gt;87% of CIOs report AI agents are already embedded in their enterprises, yet 75% lack real-time visibility into agent operations in production.&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;

&lt;p&gt;These are not edge cases. This is the mainstream enterprise experience with agentic AI in 2026.&lt;/p&gt;

&lt;h2&gt;
  
  
  Shadow agents: the new AI agent security gap
&lt;/h2&gt;

&lt;p&gt;A decade ago, enterprises faced “&lt;strong&gt;Shadow IT&lt;/strong&gt;”: employees adopting cloud services without IT approval, creating ungoverned sprawl that took years to bring under control. The same pattern is repeating with AI agents, but faster and with higher stakes.&lt;/p&gt;

&lt;p&gt;Low-code platforms have made it easy for almost anyone to create an AI agent. Building agents is now table stakes. Scaling them with governance is the real differentiator.&lt;/p&gt;

&lt;p&gt;Unlike cloud services, agents don’t just store data. They act. They make decisions, call APIs or MCP servers, access databases, and communicate with other agents. An ungoverned cloud service might leak data. &lt;strong&gt;But an ungoverned agent will leak data, take actions on that data, and propagate those actions across other agents in a chain that nobody can trace&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;When an AI agent operates without clear ownership or accountability, productivity gains become systemic &lt;a href="https://www.tigera.io/learn/guides/ai-agent-security/" rel="noopener noreferrer"&gt;AI agent security&lt;/a&gt; risk. When something goes wrong, there is no clear owner to take responsibility, remediate, or even understand the full blast radius.&lt;/p&gt;

&lt;h2&gt;
  
  
  The regulatory deadlines
&lt;/h2&gt;

&lt;p&gt;The &lt;a href="https://thefuturesociety.org/how-ai-agents-are-governed-under-the-eu-ai-act/" rel="noopener noreferrer"&gt;EU AI Act&lt;/a&gt;’s main body takes effect in August 2026. For enterprises deploying agentic AI, three articles are particularly relevant:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Article 12&lt;/strong&gt; requires high-risk AI systems to log their actions to ensure accountability and traceability.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Article 13&lt;/strong&gt; requires clear and comprehensible information about how AI systems function and make decisions.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Article 14&lt;/strong&gt; requires that high-risk systems are subject to effective human oversight, which is especially important for agentic AI, given the challenges of supervising autonomous agents.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The European Commission may also assess degree of autonomy as a relevant factor when determining whether a system poses unacceptable risks. The more independent your agents are, the higher the regulatory bar.&lt;/p&gt;

&lt;p&gt;The US is not far behind. The &lt;a href="https://leg.colorado.gov/bills/sb24-205" rel="noopener noreferrer"&gt;Colorado AI Act (SB 24-205)&lt;/a&gt;, delayed to &lt;a href="https://www.clarkhill.com/news-events/news/colorados-ai-law-delayed-until-june-2026-what-the-latest-setback-means-for-businesses/" rel="noopener noreferrer"&gt;June 30, 2026&lt;/a&gt;, requires deployers of high-risk AI systems to implement risk management programs, complete impact assessments, disclose to consumers when AI makes consequential decisions, and report algorithmic discrimination to the state attorney general. It applies to any company doing business in Colorado.&lt;br&gt;&lt;br&gt;
And Colorado is not a unique outlier; it’s just the leading edge. &lt;a href="https://iapp.org/resources/article/us-state-ai-governance-legislation-tracker" rel="noopener noreferrer"&gt;California, New York, Utah, and Texas&lt;/a&gt; have also enacted AI governance laws. At the federal level, &lt;a href="https://www.americanactionforum.org/list-of-proposed-ai-bills-table/" rel="noopener noreferrer"&gt;80+ AI governance bills&lt;/a&gt; are under consideration in the current Congress. The &lt;a href="https://www.nist.gov/itl/ai-risk-management-framework" rel="noopener noreferrer"&gt;NIST AI Risk Management Framework&lt;/a&gt; is already the de facto US enterprise standard, even where it isn’t legally required.&lt;/p&gt;

&lt;p&gt;Compliance deadlines on both sides of the Atlantic are weeks away, not months or years.&lt;/p&gt;

&lt;h2&gt;
  
  
  The core tension, and why it’s solvable
&lt;/h2&gt;

&lt;p&gt;Enterprises want agent autonomy. That’s the entire point: agents acting independently to drive efficiency and scale. But they also need accountability: knowing what happened, why it was permitted, and who is responsible.&lt;/p&gt;

&lt;p&gt;These seem to conflict. More autonomy means less control. More control means less autonomy.&lt;/p&gt;

&lt;p&gt;But this is a false dichotomy. As &lt;a href="https://www.paloaltonetworks.com/cyberpedia/what-is-agentic-ai-governance" rel="noopener noreferrer"&gt;Palo Alto Networks&lt;/a&gt; puts it: &lt;strong&gt;&lt;em&gt;autonomy changes how systems operate, it doesn’t change who’s responsible&lt;/em&gt;&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;The same tension existed in microservices a decade ago. Teams wanted independent deployments (autonomy) with reliable service communication (control). The answer wasn’t to choose one over the other. It was to build a governance layer (service meshes, mTLS, observability) that delivered both.&lt;/p&gt;

&lt;p&gt;AI agents need the same evolution. The question isn’t whether to give agents autonomy or accountability. It’s whether you have the governance infrastructure to deliver both.&lt;/p&gt;

&lt;h2&gt;
  
  
  Frequently asked questions
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;What is the difference between AI agent accountability and AI agent security?&lt;/strong&gt; Security is about preventing unauthorized actions (blocking the bad). Accountability is about proving why authorized actions were permitted (auditing the good). You need both. A locked door (security) without a sign-in sheet (accountability) leaves your compliance team with nothing to show an auditor.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Why is AI agent accountability a 2026 priority?&lt;/strong&gt;  Three forces are converging this year: rapid agent deployment (87% of CIOs report agents already in production), maturing regulatory regimes (EU AI Act in August, Colorado AI Act in June), and the first wave of public agent-related incidents driving boardroom attention.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Do the EU and US AI Acts apply to my AI agents?&lt;/strong&gt; If your agent is classified as a high-risk AI system under the Acts, then yes. Articles 12 (logging), 13 (transparency), and 14 (human oversight) of the EU AI Act all apply directly. Degree of autonomy is one of the factors regulators consider when assessing risk classification.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Are network policies and RBAC enough for AI agent governance?&lt;/strong&gt;  No. &lt;a href="https://www.tigera.io/learn/guides/kubernetes-security/kubernetes-network-policy/" rel="noopener noreferrer"&gt;Network policies&lt;/a&gt; operate at the wrong abstraction level (pod-to-pod, not agent-to-agent) and produce no audit trail. RBAC requires explicit enumeration that breaks down past about 100 agents, and can’t express attribute-based policies. We’ll cover this in detail in a later post of the series.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Key takeaways
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;80% of organizations have already encountered risky AI agent behavior, but only one-third have governance maturity to match.&lt;/li&gt;
&lt;li&gt;The EU AI Act and Colorado AI Act both take effect in 2026, so accountability requirements are no longer optional; they are mandatory.&lt;/li&gt;
&lt;li&gt;AI agent accountability is the missing layer above agent communication (MCP, A2A) and agent infrastructure (Kubernetes).&lt;/li&gt;
&lt;li&gt;Autonomy and accountability are not in conflict, but you need a governance layer to deliver both.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Get the strategic guide for accountable AI agents&lt;/p&gt;

&lt;p&gt;We wrote our guide, &lt;em&gt;Accountable AI Agents: A Strategic Guide for AI &amp;amp; Security Leaders Governing Autonomous AI at Scale&lt;/em&gt;, to help engineering and security leaders close this gap. No code, no product demos, no fluff. Just the framework your leadership team needs to govern AI agents before the next incident (or the next regulation) forces your hand.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://info.tigera.io/rs/805-GFH-732/images/Whitepaper_Accountability_for_AI_Agents.pdf" rel="noopener noreferrer"&gt;Get the strategic guide for accountable AI agents&lt;/a&gt; →&lt;/p&gt;

&lt;p&gt;The post &lt;a href="https://www.tigera.io/blog/the-ai-agent-accountability-crisis-why-governance-isnt-keeping-up-with-deployment/" rel="noopener noreferrer"&gt;The AI Agent Accountability Crisis: Why Governance Isn’t Keeping Up With Deployment&lt;/a&gt; appeared first on &lt;a href="https://www.tigera.io" rel="noopener noreferrer"&gt;Tigera – Creator of Calico&lt;/a&gt;.&lt;/p&gt;

</description>
      <category>featuredblog</category>
      <category>technicalblog</category>
      <category>aiagentsecurity</category>
      <category>bestpractices</category>
    </item>
    <item>
      <title>What’s New in Calico v3.32</title>
      <dc:creator>Alister Baroi</dc:creator>
      <pubDate>Wed, 13 May 2026 22:23:05 +0000</pubDate>
      <link>https://dev.to/tigeraio/whats-new-in-calico-v332-3n6o</link>
      <guid>https://dev.to/tigeraio/whats-new-in-calico-v332-3n6o</guid>
      <description>&lt;p&gt;We’re excited to announce the release of Calico Open Source v3.32! &lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ff0aoxrsdsuj2942fdc40.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ff0aoxrsdsuj2942fdc40.png" alt="🎉" width="72" height="72"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This release corresponds with Kubernetes v1.36 (codename Haru), and it goes beyond sharing a cat as the release mascot: it extends the capabilities and features of Kubernetes to keep you up to date with the latest innovations in the cloud.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5rjk5f14gxokvaidcoey.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5rjk5f14gxokvaidcoey.png" width="800" height="800"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This release brings some of the most significant architectural changes in Calico, from live-migrating KubeVirt VMs to an eBPF-based Maglev load balancer.&lt;br&gt;&lt;br&gt;
Here’s a quick look at everything that’s new:&lt;/p&gt;
&lt;h2&gt;
  
  
  &lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F08ewajgsxcq652w4iq8l.png" alt="🚨" width="72" height="72"&gt; Breaking Changes &amp;amp; Deprecations
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;ClusterNetworkPolicy (Alpha2) replaces Admin and Baseline Admin Network Policies:&lt;/strong&gt; &lt;code&gt;AdminNetworkPolicy&lt;/code&gt; and &lt;code&gt;BaselineAdminNetworkPolicy&lt;/code&gt; have been &lt;strong&gt;removed&lt;/strong&gt;. You must migrate to &lt;code&gt;ClusterNetworkPolicy&lt;/code&gt; before upgrading to v3.32, as Calico will no longer enforce the old resources.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;calico-apiserver&lt;/code&gt; Deprecated:&lt;/strong&gt; The aggregated API server is deprecated and will be removed in a future release. It is being replaced by &lt;strong&gt;Native v3 CRDs&lt;/strong&gt;. &lt;em&gt;(Requires MutatingAdmissionPolicy feature gate, Kubernetes 1.30+).&lt;/em&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;
  
  
  &lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1dp36elgeuxvuiact13r.png" alt="🚀" width="72" height="72"&gt; Key Feature Updates
&lt;/h2&gt;
&lt;h3&gt;
  
  
  1. KubeVirt VM Live Migration Support
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;What it does:&lt;/strong&gt; Allows live-migrating KubeVirt VMs between nodes without dropping TCP connections.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;How it works:&lt;/strong&gt; Achieves IP persistence by binding the IP to the VM name rather than the ephemeral pod.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Activation:&lt;/strong&gt; Set &lt;code&gt;kubeVirtVMAddressPersistence: Enabled&lt;/code&gt; in the &lt;code&gt;IPAMConfiguration&lt;/code&gt; resource.&lt;/li&gt;
&lt;/ul&gt;
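
&lt;p&gt;As a rough sketch of what that activation might look like, the manifest below sets the field on the cluster’s &lt;code&gt;IPAMConfiguration&lt;/code&gt;. The field name comes from the note above; its placement directly under &lt;code&gt;spec&lt;/code&gt; and the use of the &lt;code&gt;default&lt;/code&gt; resource are assumptions for this example, so check the v3.32 reference documentation before applying it.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;apiVersion: projectcalico.org/v3
kind: IPAMConfiguration
metadata:
  name: default                            # assumed: the cluster-wide IPAM configuration
spec:
  kubeVirtVMAddressPersistence: Enabled    # assumed placement under spec
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;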
&lt;h3&gt;
  
  
  2. Sidecarless mTLS (Istio Ambient Mode)
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;What it does:&lt;/strong&gt; High-performance, sidecarless mTLS using Istio ambient mode and Ztunnel. Removes the need to restart workloads or manage third-party components.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Activation:&lt;/strong&gt; Create a new Tigera Operator resource with its kind set to Istio. The Tigera Operator will pick it up automatically and automate the Istio integration.&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;
  
  
  3. Maglev Consistent-Hash Load Balancing
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;What it does:&lt;/strong&gt; Minimizes flow remapping during backend changes, ensuring long-lived connections survive backend churn and allowing you to bypass external load balancers.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Requirements:&lt;/strong&gt; Must use the Calico eBPF data plane in Direct Server Return (DSR) mode.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Activation:&lt;/strong&gt; Add the annotation &lt;code&gt;lb.projectcalico.org/external-traffic-strategy: "maglev"&lt;/code&gt; to your Service.&lt;/li&gt;
&lt;/ul&gt;
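
&lt;p&gt;For illustration, here is a minimal sketch of a Service carrying that annotation. The annotation key and value are the ones listed above; the Service name, type, selector, and ports are placeholders for this example, and the cluster is assumed to already run the eBPF data plane in DSR mode.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;apiVersion: v1
kind: Service
metadata:
  name: legacy-app                 # placeholder name
  annotations:
    lb.projectcalico.org/external-traffic-strategy: "maglev"
spec:
  type: LoadBalancer               # assumed: an externally reachable Service
  selector:
    app: legacy-app                # placeholder selector
  ports:
    - protocol: TCP
      port: 80
      targetPort: 8080             # placeholder backend port
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;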
&lt;h3&gt;
  
  
  4. Whisker Policy Filtering &lt;em&gt;(Tech Preview)&lt;/em&gt;
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;What it does:&lt;/strong&gt; The Whisker web console flow-log stream now allows advanced UI filtering by Policy, Namespace/Pod, Verdict (allow/deny), Reporter, and Pending/Staged actions.&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;
  
  
  Kubernetes ClusterNetworkPolicy (Alpha2)
&lt;/h2&gt;
&lt;h3&gt;
  
  
  &lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F08ewajgsxcq652w4iq8l.png" alt="🚨" width="72" height="72"&gt;Breaking change &lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F08ewajgsxcq652w4iq8l.png" alt="🚨" width="72" height="72"&gt;
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;AdminNetworkPolicy and BaselineAdminNetworkPolicy resources were removed in v3.32 and must be replaced with ClusterNetworkPolicy before upgrading. Calico v3.32 and newer will not enforce them.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The Kubernetes NetworkPolicy resource has long been limited by its namespace-scoped perspective. This forces practitioners into a flat design that requires individual policies for every namespace and leaves the security team with the heavy lift of governing every aspect of the environment’s security. Calico users, however, have avoided these pitfalls through GlobalNetworkPolicy, policy tiers, and ordering. These features enable a “shift-left” approach: application teams manage their own security while administrators and security teams maintain the cluster’s overarching security posture by adjusting policy evaluation precedence.&lt;/p&gt;

&lt;p&gt;We are glad to announce that upstream Kubernetes SIG-Network has now introduced a cluster-scoped security model of its own, called ClusterNetworkPolicy. It lets cluster admins enforce cluster-scoped Accept, Deny, and Pass rules that namespace owners cannot override, filling the gap that namespace-scoped NetworkPolicy has never been able to address.&lt;/p&gt;

&lt;p&gt;Two tiers are auto-created at startup:&lt;/p&gt;

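&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;&lt;th&gt;ClusterNetworkPolicy Tier&lt;/th&gt;&lt;th&gt;Calico Tier&lt;/th&gt;&lt;th&gt;Order&lt;/th&gt;&lt;th&gt;Purpose&lt;/th&gt;&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;&lt;td&gt;Admin&lt;/td&gt;&lt;td&gt;kube-admin&lt;/td&gt;&lt;td&gt;1,000&lt;/td&gt;&lt;td&gt;Hard guardrails set by the security/platform team&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;Baseline&lt;/td&gt;&lt;td&gt;kube-baseline&lt;/td&gt;&lt;td&gt;10,000,000&lt;/td&gt;&lt;td&gt;Safety net defaults below namespace NetworkPolicy&lt;/td&gt;&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;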

&lt;p&gt;Each rule has an action field (Accept, Deny, or Pass), and each policy has a priority field (the lower value wins within a tier):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;apiVersion&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;policy.networking.k8s.io/v1alpha2&lt;/span&gt;
&lt;span class="na"&gt;kind&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;ClusterNetworkPolicy&lt;/span&gt;
&lt;span class="na"&gt;metadata&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;admin-isolate-prod&lt;/span&gt;
&lt;span class="na"&gt;spec&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;tier&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Admin&lt;/span&gt;
  &lt;span class="na"&gt;priority&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;200&lt;/span&gt;
  &lt;span class="na"&gt;subject&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;namespaces&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;matchLabels&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;environment&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;prod&lt;/span&gt;
  &lt;span class="na"&gt;ingress&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;action&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Deny&lt;/span&gt;
      &lt;span class="na"&gt;from&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;namespaces&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
            &lt;span class="na"&gt;matchLabels&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
              &lt;span class="na"&gt;environment&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;dev&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
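
&lt;p&gt;As a complementary sketch (not taken from the release notes), an Admin-tier policy can also use Pass to hand a decision down to namespace-level NetworkPolicy instead of deciding it centrally. The policy name and the &lt;code&gt;team: monitoring&lt;/code&gt; label below are hypothetical, and the field layout simply mirrors the example above, so check the ClusterNetworkPolicy reference before relying on it:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;apiVersion: policy.networking.k8s.io/v1alpha2
kind: ClusterNetworkPolicy
metadata:
  name: admin-pass-monitoring   # hypothetical name
spec:
  tier: Admin
  priority: 100                 # lower than 200, so it is evaluated before admin-isolate-prod
  subject:
    namespaces:
      matchLabels:
        environment: prod
  ingress:
    - action: Pass              # stop evaluating the Admin tier; namespace NetworkPolicy decides
      from:
        - namespaces:
            matchLabels:
              team: monitoring  # hypothetical label
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;With both policies in place, traffic from monitoring namespaces would be delegated to namespace owners before any later Admin-tier Deny rule is considered, since the lower priority value is evaluated first.&lt;/p&gt;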



&lt;h2&gt;
  
  
  Native v3 CRDs (Tech Preview)
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftco2892mv8tzss0j95hn.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftco2892mv8tzss0j95hn.png" alt="⚠" width="72" height="72"&gt;&lt;/a&gt; &lt;strong&gt;Deprecation notice&lt;/strong&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftco2892mv8tzss0j95hn.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftco2892mv8tzss0j95hn.png" alt="⚠" width="72" height="72"&gt;&lt;/a&gt;The aggregated API server (calico-apiserver) is deprecated in 3.32 and will be removed in a future release. &lt;strong&gt;Since this feature is currently in tech preview, migrating is optional.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;One of the longest-standing sources of installation friction in Calico has been the aggregated API server (calico-apiserver), a pod deployment that proxied requests to Calico’s v3 resources and generated its own OpenAPI schema independent of Kubernetes. This created ordering dependencies at install time, let validation fail silently when users applied older APIs, caused GitOps tools to fail schema validation after Kubernetes upgrades, and complicated the overall install experience. In Calico 3.32, we’re changing this permanently. Native projectcalico.org/v3 CRDs register Calico’s v3 resources directly as standard Kubernetes CRDs, the same mechanism as any other custom resource.&lt;/p&gt;

&lt;p&gt;What changes:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fm5bgncwl6uzvaze8ysjr.png" alt="🚫" width="72" height="72"&gt; No calico-apiserver host-network pod&lt;/li&gt;
&lt;li&gt;
&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7m28f8nilvowmbafix4b.png" alt="⚡" width="72" height="72"&gt; No ordering race between CRDs and the API server at startup&lt;/li&gt;
&lt;li&gt;
&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fctjcqwvdvg2jrobnoq13.png" alt="🖥" width="72" height="72"&gt; kubectl get globalnetworkpolicies works natively, no calicoctl required&lt;/li&gt;
&lt;li&gt;
&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8dwxu3ucolcmolsqelqb.png" alt="✅" width="72" height="72"&gt; OpenAPI schema generated by Kubernetes, so ArgoCD and Flux validate correctly&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; The &lt;a href="https://kubernetes.io/docs/reference/command-line-tools-reference/feature-gates/" rel="noopener noreferrer"&gt;MutatingAdmissionPolicy feature gate&lt;/a&gt; (beta or GA) must be enabled in your cluster (Kubernetes 1.32+; it is not enabled by default in all distributions).&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;To help you prepare for upcoming Calico changes, we have provided a &lt;a href="https://docs.tigera.io/calico/latest/operations/crd-migration?utm_source=blog&amp;amp;utm_medium=whats_new&amp;amp;utm_id=advocacy" rel="noopener noreferrer"&gt;step-by-step migration guide&lt;/a&gt; you can use now.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fic7hk10k9dk8fpvzxh5k.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fic7hk10k9dk8fpvzxh5k.png" alt="📖" width="72" height="72"&gt;&lt;/a&gt; &lt;a href="https://docs.tigera.io/calico/latest/operations/native-v3-crds?utm_source=blog&amp;amp;utm_medium=whats_new&amp;amp;utm_id=advocacy" rel="noopener noreferrer"&gt;Enable native v3 CRDs&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  KubeVirt Virtual Machine (VM) Live Migration
&lt;/h2&gt;

&lt;p&gt;Flexible, ephemeral IP allocation is one of Kubernetes’ strengths, but it becomes a pain point for VMs hosted on Kubernetes. Most VMs are moved over from a legacy network, and the applications running on them often require a static IP or a specific MAC address, which Kubernetes does not offer. Calico v3.32 brings first-class support for live-migrating KubeVirt VMs between nodes in a cluster without dropping a single TCP connection. You can move a VM from any node to any other node in the cluster without impacting network operations.&lt;/p&gt;

&lt;p&gt;VM-based IP persistence is controlled by a single field in the IPAM configuration resource:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;apiVersion&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;projectcalico.org/v3&lt;/span&gt;
&lt;span class="na"&gt;kind&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;IPAMConfiguration&lt;/span&gt;
&lt;span class="na"&gt;metadata&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;default&lt;/span&gt;
&lt;span class="na"&gt;spec&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;kubeVirtVMAddressPersistence&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Enabled&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;When enabled, Calico ties the IP handle to the VM name (k8s-pod-network.vmi..) rather than to the pod that represents the VM. The same IP is reallocated to the destination pod during migration, and it persists across reboots and pod evictions.&lt;/p&gt;
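
&lt;p&gt;To see the behavior in practice, you can trigger a live migration through KubeVirt itself. The following is a minimal sketch using KubeVirt’s VirtualMachineInstanceMigration resource; the resource name, namespace, and VM name are hypothetical placeholders, and it assumes the VM is already running and eligible for live migration:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;apiVersion: kubevirt.io/v1
kind: VirtualMachineInstanceMigration
metadata:
  name: migrate-my-vm   # hypothetical name
  namespace: default    # assumed namespace of the VM
spec:
  vmiName: my-vm        # hypothetical VirtualMachineInstance to migrate
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;With address persistence enabled, the VM should report the same IP before and after the migration completes.&lt;/p&gt;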

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fic7hk10k9dk8fpvzxh5k.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fic7hk10k9dk8fpvzxh5k.png" alt="📖" width="72" height="72"&gt;&lt;/a&gt; &lt;a href="https://docs.tigera.io/calico/latest/networking/kubevirt/kubevirt-networking?utm_source=blog&amp;amp;utm_medium=whats_new&amp;amp;utm_id=advocacy" rel="noopener noreferrer"&gt;KubeVirt networking&lt;/a&gt; | &lt;a href="https://docs.tigera.io/calico/latest/networking/kubevirt/live-migration-bgp?utm_source=blog&amp;amp;utm_medium=whats_new&amp;amp;utm_id=advocacy" rel="noopener noreferrer"&gt;BGP routing for KubeVirt live migration&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  mTLS encryption without compromise – Istio Ambient Mode (Tech Preview)
&lt;/h2&gt;

&lt;p&gt;Calico sidecarless mTLS is based on Istio ambient mode and Ztunnel, which establish a secure, high-performance link between your pods without any sidecars. Compared with the previous Calico integration with Istio, this removes both the need to restart workloads to join the mesh and the sidecar resource overhead that stacked up as your workloads grew. You also don’t need to install or manage any third-party components for the new mTLS features: the Tigera Operator takes care of all the necessary parts of the integration and provides a smooth transition from an unencrypted environment to a high-performance secure mesh.&lt;/p&gt;

&lt;p&gt;To enable ambient mode with Calico, create the following Tigera Operator resource:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;apiVersion&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;operator.tigera.io/v1&lt;/span&gt;
&lt;span class="na"&gt;kind&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Istio&lt;/span&gt;
&lt;span class="na"&gt;metadata&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;default&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; Calico publishes its own customized Istio and Ztunnel images, with Calico-specific patches and CVE-fix dependency bumps applied. These images were previously available only in Calico Enterprise and are now part of Open Source.&lt;/p&gt;
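
&lt;p&gt;Creating the Istio resource installs the mesh components; workloads still have to be enrolled. In upstream Istio ambient mode this is done with a namespace label, and the sketch below assumes the Calico integration follows the same convention. The namespace name is hypothetical; confirm the exact label in the Istio Ambient Mode documentation linked below:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;apiVersion: v1
kind: Namespace
metadata:
  name: my-app                        # hypothetical namespace
  labels:
    istio.io/dataplane-mode: ambient  # upstream Istio label for ambient enrollment (assumed to apply here)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;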

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fic7hk10k9dk8fpvzxh5k.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fic7hk10k9dk8fpvzxh5k.png" alt="📖" width="72" height="72"&gt;&lt;/a&gt; &lt;a href="https://docs.tigera.io/calico/latest/operations/istio/about-istio-ambient?utm_source=blog&amp;amp;utm_medium=whats_new&amp;amp;utm_id=advocacy" rel="noopener noreferrer"&gt;Istio Ambient Mode&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Maglev Consistent-Hash Load Balancing
&lt;/h2&gt;

&lt;p&gt;Calico v3.32 provides support for Maglev consistent-hash load balancing for external traffic to a Service. This means if enabled, Calico nodes act as stable Equal-Cost Multi-Path (ECMP) nexthops for advertised service IPs, serving as a distributed load balancer. When a Calico node is churned or loses connectivity, service connections will stay healthy. It also means if a backend is added or removed, Maglev remaps only a small fraction of flows, so that more long-lived connections survive churn.&lt;/p&gt;

&lt;p&gt;Why would you need this? Maglev lets you retire external legacy load balancers and move load balancing inside your cluster.&lt;/p&gt;

&lt;p&gt;Opt in per-Service with a single annotation:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;apiVersion&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;v1&lt;/span&gt;
&lt;span class="na"&gt;kind&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Service&lt;/span&gt;
&lt;span class="na"&gt;metadata&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;my-service&lt;/span&gt;
  &lt;span class="na"&gt;annotations&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;lb.projectcalico.org/external-traffic-strategy&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;maglev"&lt;/span&gt;
&lt;span class="na"&gt;spec&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;LoadBalancer&lt;/span&gt;
  &lt;span class="s"&gt;...&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Maglev also builds on Calico’s Prometheus integration, so you can monitor Maglev connection counts via a new Prometheus metric:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;felix_bpf_conntrack_maglev_entries_total
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Requirements:&lt;/strong&gt; Calico eBPF data plane in direct server return (DSR) mode.&lt;/p&gt;
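
&lt;p&gt;If your cluster already runs the eBPF data plane but not yet in DSR mode, the switch is a single FelixConfiguration field. This is a minimal sketch that assumes the eBPF data plane is already enabled and that your underlying network permits DSR return traffic; treat it as a starting point rather than a complete procedure:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;apiVersion: projectcalico.org/v3
kind: FelixConfiguration
metadata:
  name: default
spec:
  bpfExternalServiceMode: DSR   # return traffic goes directly from the backend node to the client
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;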

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fic7hk10k9dk8fpvzxh5k.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fic7hk10k9dk8fpvzxh5k.png" alt="📖" width="72" height="72"&gt;&lt;/a&gt; &lt;a href="https://docs.tigera.io/calico/latest/networking/configuring/add-maglev-load-balancing?utm_source=blog&amp;amp;utm_medium=whats_new&amp;amp;utm_id=advocacy" rel="noopener noreferrer"&gt;Add Maglev load balancing to a service&lt;/a&gt;&lt;br&gt;&lt;br&gt;
 &lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fic7hk10k9dk8fpvzxh5k.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fic7hk10k9dk8fpvzxh5k.png" alt="📖" width="72" height="72"&gt;&lt;/a&gt; &lt;a href="https://docs.tigera.io/calico/latest/operations/monitor/monitor-component-metrics?utm_source=blog&amp;amp;utm_medium=whats_new&amp;amp;utm_id=advocacy" rel="noopener noreferrer"&gt;Learn how to enable Calico Prometheus integrations&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Whisker Policy Filtering (Tech Preview)
&lt;/h2&gt;

&lt;p&gt;The Whisker web console gains expanded filtering on the live flow-log stream in Calico 3.32.&lt;/p&gt;

&lt;p&gt;You can now filter by:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdht6gu9a9lhi9eeh1ia2.png" alt="🔍" width="72" height="72"&gt; Policy, show only flows that hit a specific policy&lt;/li&gt;
&lt;li&gt;
&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcpc3lwr2dgyj25mysreu.png" alt="📁" width="72" height="72"&gt; Namespace / Pod, narrow to a specific workload&lt;/li&gt;
&lt;li&gt;
&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8dwxu3ucolcmolsqelqb.png" alt="✅" width="72" height="72"&gt; Verdict, filter to allowed or denied flows only&lt;/li&gt;
&lt;li&gt;
&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvb45azgwbxqle55xzdia.png" alt="👁" width="72" height="72"&gt; Reporter, filter by source or destination reporter&lt;/li&gt;
&lt;li&gt;
&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F52h6v88ws6ehdyuuyqmz.png" alt="🕐" width="72" height="72"&gt; Pending/Staged actions, see what staged policies would do before enforcing them&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This builds on the flow-logs API (Goldmane) and Whisker components shipped in earlier 3.x releases.&lt;/p&gt;


&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fic7hk10k9dk8fpvzxh5k.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fic7hk10k9dk8fpvzxh5k.png" alt="📖" width="72" height="72"&gt;&lt;/a&gt; &lt;a href="https://docs.tigera.io/calico/latest/observability/view-flow-logs?utm_source=blog&amp;amp;utm_medium=whats_new&amp;amp;utm_id=advocacy" rel="noopener noreferrer"&gt;View flow logs in the Calico Whisker web console&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;As always, a full list of all the changes can be found in the &lt;a href="https://docs.tigera.io/calico/latest/release-notes/" rel="noopener noreferrer"&gt;release notes&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;The post &lt;a href="https://www.tigera.io/blog/whats-new-in-calico-v3-32/" rel="noopener noreferrer"&gt;What’s New in Calico v3.32&lt;/a&gt; appeared first on &lt;a href="https://www.tigera.io" rel="noopener noreferrer"&gt;Tigera – Creator of Calico&lt;/a&gt;.&lt;/p&gt;

</description>
      <category>companyblog</category>
      <category>technicalblog</category>
      <category>opensource</category>
      <category>release</category>
    </item>
    <item>
      <title>Calculating The Kubernetes Integration Tax: What Your DIY Networking Stack Actually Costs</title>
      <dc:creator>Alister Baroi</dc:creator>
      <pubDate>Tue, 05 May 2026 20:07:46 +0000</pubDate>
      <link>https://dev.to/tigeraio/calculating-the-kubernetes-integration-tax-what-your-diy-networking-stack-actually-costs-2p1l</link>
      <guid>https://dev.to/tigeraio/calculating-the-kubernetes-integration-tax-what-your-diy-networking-stack-actually-costs-2p1l</guid>
      <description>&lt;p&gt;It was 11:47pm on a Thursday night, and a senior platform engineer at a large North American bank was rolling back a ‘simple’ configuration change. The change itself was small, a routine update approved through the usual review process, but when it was applied, pods began cycling and connections started dropping. For the next three seconds, mobile banking sessions already mid-transaction dropped. Customer support lit up. The incident review the next morning spent most of its time arguing about how the change had been approved. Almost no one asked the harder question: why a configuration change in one place broke something seemingly unrelated.&lt;/p&gt;

&lt;p&gt;That question rarely gets a clean answer. What looks like a single layer is usually one knot in a stack of five to seven products including a CNI, &lt;a href="https://www.tigera.io/learn/guides/kubernetes-security/kubernetes-network-policy/" rel="noopener noreferrer"&gt;network policy&lt;/a&gt;, service mesh, observability, threat detection and compliance tooling that come from different vendors and were never designed to operate as one system. Each one works. The gaps between them are where the risk, and the cost, lives.&lt;/p&gt;

&lt;p&gt;This is just one example of the Kubernetes integration tax.&lt;/p&gt;

&lt;h2&gt;
  
  
  What is the Kubernetes Integration Tax?
&lt;/h2&gt;

&lt;p&gt;The Kubernetes integration tax is the cumulative cost in engineer time, security exposure, compliance overhead, and redundant licensing, of running a multi-vendor &lt;a href="https://www.tigera.io/learn/guides/kubernetes-networking/" rel="noopener noreferrer"&gt;Kubernetes networking&lt;/a&gt; stack that was never designed to operate as one system. It’s a tax in the most literal sense: a recurring charge most enterprises pay every year without ever budgeting for it. It doesn’t appear on a single invoice. There’s no row in your procurement system, no line on your cloud bill, no entry in your SOC 2 evidence package that says “integration tax.” Instead, it accumulates quietly across budgets that report to different leaders, making it nearly impossible to see in any one quarterly review.&lt;/p&gt;

&lt;p&gt;What makes the integration tax different from ordinary tooling cost is that it scales with the gaps between products, not the products themselves. The more vendors in your stack, the more surface area between them — and the surface area is where the cost lives. Every new tool you add doesn’t just add a license; it adds a new set of integrations, a new compatibility matrix, a new dashboard for on-call to learn, and a new policy model someone has to reconcile against the four already in place.&lt;/p&gt;

&lt;h2&gt;
  
  
  Where the Integration Tax Hides
&lt;/h2&gt;

&lt;p&gt;The Kubernetes integration tax lurks between the rows of a typical budget spreadsheet; it may not be immediately obvious, but once you see it, the accumulating costs become hard to ignore.&lt;/p&gt;

&lt;h3&gt;
  
  
  Glue Work
&lt;/h3&gt;

&lt;p&gt;An organization running five networking tools will typically have two to three engineers dedicated to keeping the integrations intact. Custom webhooks, YAML adapters between tools whose CRDs nearly overlap, Terraform modules that paper over inconsistent authentication models, and dashboards that pull from four sources are the work that never appears on a roadmap. Count the hours your platform engineers spend on this work each year and multiply by their fully loaded hourly rate to see how much these disjointed toolsets are costing your organization.&lt;/p&gt;

&lt;h3&gt;
  
  
  Extended Mean Time to Repair (MTTR)
&lt;/h3&gt;

&lt;p&gt;When L3 policy lives in one product, L7 policy lives in another, flow logs live in a third, and the service mesh enforces its own identity layer, the question “why did this request fail?” becomes a research project. &lt;a href="https://www.ibm.com/reports/data-breach" rel="noopener noreferrer"&gt;IBM’s Cost of a Data Breach research puts the industry average Mean Time to Identify (MTTI) a breach at 194 days, with another 64 days to contain&lt;/a&gt;. Like glue work, outages cost your organization in expensive engineering hours. They can also mean lost revenue if your applications provide services to paying customers.&lt;/p&gt;

&lt;h3&gt;
  
  
  Licensing Overlap
&lt;/h3&gt;

&lt;p&gt;A platform team carrying a CNI enterprise license, a &lt;a href="https://www.tigera.io/learn/guides/service-mesh/" rel="noopener noreferrer"&gt;service mesh&lt;/a&gt;, a network observability product, a threat detection product, and a policy management tool will routinely find that two or three of them have overlapping capabilities. Multiply the cost of each redundant license by the number of clusters you run and the overlap can add up to as much as $50K to $75K per year for a typical organization.&lt;/p&gt;

&lt;h3&gt;
  
  
  Onboarding Drag
&lt;/h3&gt;

&lt;p&gt;A new platform engineer at an organization running a five-tool stack is typically six to nine months from being able to handle on-call rotations independently and resolve complex incidents. They might have to learn four dashboards, three query languages, two policy models, and the undocumented wiring between them. While they ramp up, the on-call load stays on the senior engineers who would otherwise be designing the platform’s next year of work. The cost of the engineering team goes up but capacity stays the same.&lt;/p&gt;

&lt;h3&gt;
  
  
  Upgrade Cost
&lt;/h3&gt;

&lt;p&gt;Each tool has its own release cadence, its own CVE posture, and its own compatibility matrix with the Kubernetes version underneath. In a five-tool stack, the compatibility math is combinatorial: every upgrade in one product has to be validated against the other four before it ships. There is rarely a clean moment when all five are simultaneously stable, supported, and upgradeable. Factor in the re-integration work needed when compatibility inevitably breaks and the costs can grow exponentially.&lt;/p&gt;

&lt;h2&gt;
  
  
  What the Math Looks Like
&lt;/h2&gt;

&lt;p&gt;Take a concrete enterprise profile: 150 clusters, 5,250 nodes, 25 platform engineers at a $220K loaded cost each, and five networking products totaling roughly $400K a year in licensing. Run the integration tax numbers against it and the picture becomes clear.&lt;/p&gt;

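&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;&lt;th&gt;Component&lt;/th&gt;&lt;th&gt;Annual cost&lt;/th&gt;&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;&lt;td&gt;Engineer time on tooling such as glue work, MTTR, onboarding and upgrades (25% × 40% of team)&lt;/td&gt;&lt;td&gt;$550,000&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;Security risk exposure from gaps between tools&lt;/td&gt;&lt;td&gt;$640,000&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;Compliance evidence collection overhead&lt;/td&gt;&lt;td&gt;$72,000&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;Licensing overlap&lt;/td&gt;&lt;td&gt;$50,000+&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;&lt;strong&gt;Total Kubernetes integration tax&lt;/strong&gt;&lt;/td&gt;&lt;td&gt;&lt;strong&gt;~$1.3M/year&lt;/strong&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;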
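
&lt;p&gt;The engineer-time and total figures can be reproduced from the profile above. The breakdown below is a sketch that assumes “25% × 40% of team” means a combined 10% share of the team’s total loaded cost; the security and compliance figures are taken as given:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Team loaded cost:    25 engineers × $220,000          = $5,500,000
Tooling share:       25% × 40%                        = 10%
Engineer time:       10% × $5,500,000                 = $550,000
Total:               $550,000 + $640,000 + $72,000 + $50,000 = $1,312,000 (~$1.3M)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;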

&lt;p&gt;The largest lines on the table aren’t licensing; they are engineering time and security risk. At $550K, platform-engineer time on tooling already exceeds the entire $400K license bill across all five products. Going fully open source and zeroing out that $400K wouldn’t close the gap. $150K of integration overhead would still be on the books, generated by the same five products with the same surface area between them. Add the $640K of security risk that lives in those gaps and you’re carrying nearly $1.2M of cost that licensing has nothing to do with.&lt;/p&gt;

&lt;p&gt;The integration tax doesn’t shrink when you stop paying vendors. It shrinks when the surface area between products does.&lt;/p&gt;

&lt;h2&gt;
  
  
  How Did We Get Here?
&lt;/h2&gt;

&lt;p&gt;No platform team sets out to run seven different networking products. They start with a &lt;a href="https://www.tigera.io/learn/guides/kubernetes-networking/kubernetes-cni/" rel="noopener noreferrer"&gt;CNI&lt;/a&gt;, because every cluster needs one, and something to log errors. Then they realize they need to ship those error logs to some common storage location so a dashboard can easily access them. They add collectors for metrics. They add tracing tools. The security team starts talking about mTLS so now a service mesh needs to be bolted on. And, by the way, a WAF is a requirement as well according to the compliance auditor.&lt;/p&gt;

&lt;p&gt;Every procurement decision makes sense in isolation but the result is not a cohesive and harmonious tooling stack. It’s a sedimentary layer of one-at-a-time decisions, each one rational, and none of them integrated.&lt;/p&gt;

&lt;h2&gt;
  
  
  How to Calculate Your Kubernetes Networking Cost
&lt;/h2&gt;

&lt;p&gt;If your team hasn’t calculated the Kubernetes integration tax for its own stack, asking these three questions is a good first step:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;How many engineer-hours per week does your platform team spend on work that only exists because of gaps between networking tools? Multiply by loaded cost to get the glue-work line.&lt;/li&gt;
&lt;li&gt;How long does an average network-related incident take to diagnose, and how many products does an on-call engineer typically touch during that investigation? That’s the MTTR line.&lt;/li&gt;
&lt;li&gt;How many months does it take for a new platform engineer to be comfortable alone on call, and how much of that ramp is tool-specific rather than Kubernetes-specific? That’s the onboarding line.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;For an enterprise managing five separate networking products, the annual cost typically ranges from $800,000 to $1.5 million.&lt;/p&gt;

&lt;p&gt;While the exact figure may vary, it highlights a critical reality: the Kubernetes integration tax is a substantial, tangible expense for your organization, and it needs to be quantified. It has been on every platform team’s books for years; the first time it gets a number is the first time it can be planned against.&lt;/p&gt;

&lt;p&gt;Interested in calculating the Kubernetes integration tax for your own environment? &lt;a href="http://tigera.io/contact/" rel="noopener noreferrer"&gt;Contact us&lt;/a&gt; to learn more.&lt;/p&gt;

&lt;p&gt;The post &lt;a href="https://www.tigera.io/blog/calculating-the-kubernetes-integration-tax-what-your-diy-networking-stack-actually-costs/" rel="noopener noreferrer"&gt;Calculating The Kubernetes Integration Tax: What Your DIY Networking Stack Actually Costs&lt;/a&gt; appeared first on &lt;a href="https://www.tigera.io" rel="noopener noreferrer"&gt;Tigera – Creator of Calico&lt;/a&gt;.&lt;/p&gt;

</description>
      <category>technicalblog</category>
      <category>bestpractices</category>
      <category>unifiedplatform</category>
    </item>
    <item>
      <title>VM Migration to Kubernetes: What Breaks and How to Prevent It</title>
      <dc:creator>Alister Baroi</dc:creator>
      <pubDate>Wed, 29 Apr 2026 14:20:49 +0000</pubDate>
      <link>https://dev.to/tigeraio/vm-migration-to-kubernetes-what-breaks-and-how-to-prevent-it-2k4e</link>
      <guid>https://dev.to/tigeraio/vm-migration-to-kubernetes-what-breaks-and-how-to-prevent-it-2k4e</guid>
      <description>&lt;p&gt;Here is what nobody putting together the business case for a VM migration to Kubernetes will tell you upfront: the compute is the easy part.&lt;/p&gt;

&lt;p&gt;Moving workloads off vSphere and onto Kubernetes is conceptually straightforward. The tooling has matured. The architecture is proven. Compute moves, storage remaps, and the platform team has a plan.&lt;/p&gt;

&lt;p&gt;The network is where projects quietly stall.&lt;/p&gt;

&lt;p&gt;Not because the technology does not work. Because nobody scoped the network properly before the project started. A platform migration turns into a multi-team coordination exercise. The firewall team needs a change window. The security team needs to review a network placement that should never have had to change. The application team discovers hardcoded IPs that nobody documented.&lt;/p&gt;

&lt;p&gt;Six months later, half the VMs are still on vSphere and the project is technically “in progress.”&lt;/p&gt;

&lt;p&gt;This is not a skills gap. It happens at the most mature organisations with capable teams. It is a scoping problem, and it has a specific cause: the gap between how VM networking works and how Kubernetes networking works is wider than it looks on a migration plan.&lt;/p&gt;

&lt;p&gt;This post is for the people who approve these projects. Here is what actually breaks, and what to decide before it does.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why VM Migrations to Kubernetes Stall on the Network
&lt;/h2&gt;

&lt;p&gt;In a traditional hypervisor environment, a VM’s IP address is its identity. Not just technically, as a routing destination, but operationally. It is registered in DNS. Referenced in firewall rules. Watched by monitoring agents. Connected to by peer applications. In regulated environments, it may be in compliance documentation.&lt;/p&gt;

&lt;p&gt;Kubernetes was built on different assumptions. Workloads are ephemeral. Addresses come from a range managed by the cluster and mean nothing outside it. Identity is based on labels, not addresses.&lt;/p&gt;

&lt;p&gt;When a VM moves into Kubernetes using the default networking model, it gets a new IP. That new IP ripples through everything that referenced the old one. Firewall rules, DNS, security reviews, monitoring, peer applications. None of it is technically hard. The problem is that the platform team owns none of those systems and controls none of those timelines. A migration scoped for one team becomes a coordination exercise across four of them.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;This is the networking tax: the hidden cost of a networking model that does not account for what your VMs are already attached to. Your platform team pays it. Your project timeline absorbs it.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;This is not an edge case. According to &lt;a href="https://portworx.com/resources/voice-of-kubernetes-expert-report/" rel="noopener noreferrer"&gt;Portworx’s 2024 Voice of Kubernetes Experts report&lt;/a&gt;, 58% of organisations plan to migrate some of their VMs to Kubernetes using technologies like &lt;a href="https://kubevirt.io/" rel="noopener noreferrer"&gt;KubeVirt&lt;/a&gt;. Of those organisations, 65% plan to do so within the next two years.&lt;/p&gt;

&lt;p&gt;The migrations are already happening. The scoping decisions are being made now.&lt;/p&gt;

&lt;p&gt;Picture a single VM that has been running for several years. Its IP address is in two firewall rules, a monitoring dashboard, a load balancer backend, and a compliance document that was last audited in 2021. The application team has it hardcoded in a config file nobody has opened in three years. That VM is not unusual. It is representative. Now multiply it by two hundred.&lt;/p&gt;

&lt;h2&gt;
  
  
  Two Ways VM Migration to Kubernetes Breaks in Practice
&lt;/h2&gt;

&lt;p&gt;Two failure modes appear repeatedly in VM-to-Kubernetes migrations. Neither is a surprise once you know to look for them.&lt;/p&gt;

&lt;h3&gt;
  
  
  The security bottleneck
&lt;/h3&gt;

&lt;p&gt;VLANs in enterprise environments are not just a routing tool. They are a compliance construct. Organisations have spent years segmenting networks to meet PCI-DSS, SOC 2, or internal security policies. Those segments are owned and documented by the security team. Changes to them require sign-off.&lt;/p&gt;

&lt;p&gt;When a VM’s network placement changes during migration, even if the VM itself is unchanged, the security team has a legitimate reason to review it. That review takes as long as it takes. The platform team cannot accelerate it.&lt;/p&gt;

&lt;h3&gt;
  
  
  Platform fragmentation
&lt;/h3&gt;

&lt;p&gt;The compound result of scope expansion and security bottlenecks is partial migration. The VMs with few dependencies move. The ones with static IPs embedded in firewall rules, or in security-reviewed VLAN segments, stay on vSphere.&lt;/p&gt;

&lt;p&gt;The organisation ends up running two platforms in parallel with no agreed path to consolidation. The cost reduction and operational simplification that justified the migration are deferred. The project is technically not cancelled, just permanently not finished.&lt;/p&gt;

&lt;p&gt;For many organisations, this is not a planning exercise. Active licence renegotiations and uncertainty about long-term hypervisor roadmaps have moved these conversations from the backlog to the boardroom. The migrations are happening now, and the scoping decisions made in the next quarter will shape whether they succeed.&lt;/p&gt;

&lt;h2&gt;
  
  
  The One Question Every VM Migration to Kubernetes Needs Answered
&lt;/h2&gt;

&lt;p&gt;Before any VM migration project is scoped or budgeted, one question is worth an explicit answer: &lt;strong&gt;Is this migration also a modernisation?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;If yes, the network redesign is expected work. The additional team coordination is part of the scope. Budget and timeline accordingly, and plan for the security and network teams to be involved from the start.&lt;/p&gt;

&lt;p&gt;If no, the networking model chosen for the migration determines whether lift-and-shift is actually achievable or just aspirational.&lt;/p&gt;

&lt;p&gt;This sounds like a technology question. It is not. It is a project scoping question that happens to have a technology answer.&lt;/p&gt;

&lt;p&gt;The default Kubernetes networking model was designed for cloud-native workloads: containers with ephemeral addresses and no upstream dependencies. It was not designed for VMs that have ten years of firewall rules, DNS entries, and compliance documentation attached to a fixed IP address.&lt;/p&gt;

&lt;p&gt;Using a model designed for the former to move the latter is where projects run into trouble. &lt;em&gt;You are not choosing a networking model for your Kubernetes cluster. You are choosing whether your migration is also a modernisation, and whether your budget accounts for that.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;In some organisations, the decision about which networking model to use for VM migration does not belong to the platform team at all. Where VLANs are compliance constructs rather than just routing tools, it is the security team that owns the answer. That is not unusual. It is a reason to get them into the scoping conversation before the architecture is chosen, not after.&lt;/p&gt;

&lt;h2&gt;
  
  
  Two Networking Models, Two Different Projects
&lt;/h2&gt;

&lt;p&gt;There are two networking models for running VMs in Kubernetes, and the right one depends on what the migration is actually for.&lt;/p&gt;

&lt;h3&gt;
  
  
  The modernisation model
&lt;/h3&gt;

&lt;p&gt;In a Layer 3 (L3) model, the VM gets a new IP address from the cluster’s address range. Traffic is routed between the cluster and the rest of the network. Once the VM is on the cluster network, it operates the same way containers do. Kubernetes-native tooling applies without modification. The long-term operational model is clean.&lt;/p&gt;

&lt;p&gt;The trade-off is explicit: everything that referenced the old address needs to be updated. Firewall, DNS, monitoring, peer applications. This is the work. It is expected and budgeted when modernisation is the goal, and for organisations running a small number of VMs, or VMs with few upstream dependencies, it is often the right choice. The issue is not L3 routing. It is using L3 routing on a VM estate that was never scoped for modernisation.&lt;/p&gt;

&lt;h3&gt;
  
  
  The lift-and-shift model
&lt;/h3&gt;

&lt;p&gt;In a Layer 2 (L2) model, the existing network segment is extended directly into the Kubernetes cluster. With KubeVirt running the VM as a native Kubernetes workload alongside containers, &lt;a href="https://www.tigera.io/blog/kubevirt-networking-how-to-preserve-vm-ip-addresses-during-migration/" rel="noopener noreferrer"&gt;the VM keeps its original IP address&lt;/a&gt;. The VLAN it lived in is preserved inside the cluster. From the upstream network’s perspective, the workload did not move. The firewall rule still applies. DNS still resolves. The security team does not need to be pulled into a review they did not schedule.&lt;/p&gt;

&lt;p&gt;Calico L2 Bridge Networks provide this capability. The upstream network continues to see the same workload it always did. No change requests. No reconfiguration. No other teams in the room.&lt;/p&gt;

&lt;p&gt;The practical consequence: the platform team owns the migration end to end. No firewall change requests sitting in a queue. No security review on a workload that did not change. No application team dependency. The project delivers on its original scope and its original timeline. That is what lift-and-shift is supposed to mean.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;You can migrate a VM from VMware to Kubernetes and it keeps its original IP and stays on its original VLAN. Nothing needs to change. And now it can be protected by Calico network policy and observed through Calico flow logs.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;For teams migrating VMs with years of accumulated network dependencies, that continuity is the difference between a migration that completes and one that stalls indefinitely.&lt;/p&gt;

&lt;p&gt;For a technical breakdown of the L2 Bridge mode, see our blog post, &lt;a href="https://www.tigera.io/blog/lift-and-shift-vms-to-kubernetes-with-calico-l2-bridge-networks/" rel="noopener noreferrer"&gt;Lift-and-Shift VMs to Kubernetes with Calico L2 Bridge Networks&lt;/a&gt;, which walks through how the network continuity actually works and includes a recorded webinar.&lt;/p&gt;

&lt;h2&gt;
  
  
  What Your VM Estate Gains After Migrating to Kubernetes
&lt;/h2&gt;

&lt;p&gt;Choosing the L2 model for migration does not mean your VM estate stays in legacy mode permanently. The migration is the beginning. Day 2 networking is what comes next.&lt;/p&gt;

&lt;p&gt;Once the VM is running in Kubernetes, the platform team gains operational capabilities that were not available on the old hypervisor. Traffic visibility, including east-west flows between the migrated VM and other workloads, is available without additional tooling. Security policy can be applied directly to the VM interface using the same constructs the team uses for containers, replacing legacy firewall rules incrementally on whatever timeline the security team sets. This is what Security in Depth looks like in practice. Layered controls applied workload by workload, not a single perimeter replaced in one event.&lt;/p&gt;

&lt;p&gt;The VM can also be moved between cluster nodes without network reconfiguration. Same IP, same VLAN, no change to the upstream network. KubeVirt live migration between nodes works without a separate network coordination step.&lt;/p&gt;

&lt;p&gt;This is what a policy-first migration enables: the networking and security layer is unified before the workloads move, so day 2 does not require a second migration to get there. Migration and modernisation stay on separate timelines, with separate budgets, managed by separate teams. Neither blocks the other.&lt;/p&gt;

&lt;p&gt;A VM that migrated to Kubernetes last quarter can have a new security policy applied today, written by the same security team using the same review process they already have. The migration did not force their hand on timing or tooling. The security team gains a policy model that is version-controlled and auditable. The platform team gains a migration that delivered on its original timeline. Those two outcomes are not in conflict.&lt;/p&gt;

&lt;h2&gt;
  
  
  Four Questions to Ask Before Approving a VM Migration to Kubernetes
&lt;/h2&gt;

&lt;p&gt;Four questions are worth an explicit answer before any VM migration project is approved:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Is this migration also a modernisation? If yes, budget for multi-team coordination. If no, confirm the networking model supports genuine lift-and-shift.&lt;/li&gt;
&lt;li&gt;Which VMs have static IPs embedded in firewall rules or compliance documentation? These are the workloads most likely to stall. Identify them before work begins, not during it.&lt;/li&gt;
&lt;li&gt;Who owns the VLAN segments the migrated VMs currently live in? If it is the security team, they belong in the scoping conversation, not the execution phase.&lt;/li&gt;
&lt;li&gt;What is the plan for workloads that cannot be modernised? If there is no answer, plan for two platforms running in parallel indefinitely.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Before the project kicks off, run a blast radius analysis on a single VM. Pick one. Map every service connecting to its IP, every firewall rule referencing it, every DNS entry, and who owns each one. That single exercise will tell you more about your true migration scope than any architecture diagram. If the answer fills a spreadsheet, your migration is not a weekend project. If the answer is three lines, start there.&lt;/p&gt;
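
&lt;p&gt;The blast radius exercise can start as a small script. The sketch below assumes you have already exported firewall rules, DNS records, and application configs into a flat list of records. The record fields (&lt;code&gt;system&lt;/code&gt;, &lt;code&gt;name&lt;/code&gt;, &lt;code&gt;owner&lt;/code&gt;, &lt;code&gt;text&lt;/code&gt;), the sample entries, and the IP address are all hypothetical, and a real inventory would also need live connection data that a text search cannot see.&lt;/p&gt;

```python
import re
from collections import defaultdict

# Matches dotted-quad strings; it does not validate that each octet is 0-255.
IPV4_PATTERN = re.compile(r"\d{1,3}(?:\.\d{1,3}){3}")


def blast_radius(vm_ip, records):
    """Group every record that mentions vm_ip by the team that owns it."""
    by_owner = defaultdict(list)
    for record in records:
        # Compare whole addresses so 10.20.30.4 does not match 10.20.30.40.
        if vm_ip in IPV4_PATTERN.findall(record["text"]):
            by_owner[record["owner"]].append((record["system"], record["name"]))
    return dict(by_owner)


# Hypothetical inventory export; replace with your own data sources.
inventory = [
    {"system": "firewall", "name": "rule-114", "owner": "network",
     "text": "permit tcp any host 10.20.30.4 eq 443"},
    {"system": "firewall", "name": "rule-207", "owner": "network",
     "text": "permit tcp host 10.20.30.40 any"},
    {"system": "dns", "name": "billing.corp.example", "owner": "infrastructure",
     "text": "billing A 10.20.30.4"},
    {"system": "app-config", "name": "ledger.ini", "owner": "payments",
     "text": "db_host=10.20.30.4"},
]

for owner, references in blast_radius("10.20.30.4", inventory).items():
    print(f"{owner}: {len(references)} reference(s)")
    for system, name in references:
        print(f"  {system}: {name}")
```

&lt;p&gt;Grouping by owner rather than by system is deliberate: the number of distinct owners is a quick proxy for how many teams the migration will need to coordinate with.&lt;/p&gt;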

&lt;p&gt;These are not technical questions. They are scope questions. The answers determine whether the migration delivers on its business case or quietly becomes a programme nobody approved.&lt;/p&gt;

&lt;p&gt;Interested in learning more about VM migrations? &lt;a href="https://www.tigera.io/contact/" rel="noopener noreferrer"&gt;Talk to an expert about how you can migrate your VM estate&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;The post &lt;a href="https://www.tigera.io/blog/vm-migration-to-kubernetes-what-breaks-and-how-to-prevent-it/" rel="noopener noreferrer"&gt;VM Migration to Kubernetes: What Breaks and How to Prevent It&lt;/a&gt; appeared first on &lt;a href="https://www.tigera.io" rel="noopener noreferrer"&gt;Tigera – Creator of Calico&lt;/a&gt;.&lt;/p&gt;

</description>
      <category>technicalblog</category>
      <category>bestpractices</category>
      <category>vmmigration</category>
      <category>products</category>
    </item>
  </channel>
</rss>
