<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: daniel jeong</title>
    <description>The latest articles on DEV Community by daniel jeong (@x4nent).</description>
    <link>https://dev.to/x4nent</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3847714%2F6e415b2d-f2cf-4afe-9fbf-34cb69396b32.png</url>
      <title>DEV Community: daniel jeong</title>
      <link>https://dev.to/x4nent</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/x4nent"/>
    <language>en</language>
    <item>
      <title>Astro 6 Deep Dive — Vite Environment API, Rust Compiler, Live Content Collections, and First-Class Cloudflare Workers</title>
      <dc:creator>daniel jeong</dc:creator>
      <pubDate>Sat, 09 May 2026 01:27:38 +0000</pubDate>
      <link>https://dev.to/x4nent/astro-6-deep-dive-vite-environment-api-rust-compiler-live-content-collections-and-first-class-bc</link>
      <guid>https://dev.to/x4nent/astro-6-deep-dive-vite-environment-api-rust-compiler-live-content-collections-and-first-class-bc</guid>
      <description>&lt;p&gt;On &lt;strong&gt;March 10, 2026&lt;/strong&gt;, &lt;strong&gt;Astro 6.0&lt;/strong&gt; shipped as stable, followed by &lt;strong&gt;Astro 6.1&lt;/strong&gt; in April with image optimization fixes, i18n improvements, and Lightning CSS regression patches. Calling this "a routine major release" badly understates what changed. Astro 6 &lt;strong&gt;rebuilt the dev server and build pipeline onto a single shared code path&lt;/strong&gt;, sat that path on top of Vite's new &lt;strong&gt;Environment API&lt;/strong&gt;, and made non-Node runtimes — &lt;strong&gt;Cloudflare Workers, Bun, Deno&lt;/strong&gt; — reproduce &lt;strong&gt;locally in dev&lt;/strong&gt; exactly as they will run in production. At the same time, the Go-based &lt;code&gt;.astro&lt;/code&gt; compiler got a &lt;strong&gt;Rust rewrite&lt;/strong&gt; as an experimental opt-in (compilation phase up to 100× faster on large content sites), the &lt;strong&gt;Fonts API&lt;/strong&gt; and &lt;strong&gt;Content Security Policy API&lt;/strong&gt; got promoted into core, and &lt;strong&gt;Live Content Collections&lt;/strong&gt; — experimental in 5.10 — went stable. The heaviest operational changes, though, sit elsewhere: &lt;strong&gt;&lt;code&gt;Astro.glob()&lt;/code&gt; permanently removed&lt;/strong&gt;, &lt;strong&gt;Node 22 mandated&lt;/strong&gt;, &lt;strong&gt;&lt;code&gt;Astro.locals.runtime&lt;/code&gt; removed&lt;/strong&gt; (replaced by direct &lt;code&gt;cloudflare:workers&lt;/code&gt; imports), and the &lt;strong&gt;&lt;code&gt;astro:schema&lt;/code&gt; / &lt;code&gt;z from 'astro:content'&lt;/code&gt;&lt;/strong&gt; path deprecated in favor of &lt;strong&gt;&lt;code&gt;astro/zod&lt;/code&gt; (Zod v4)&lt;/strong&gt;. This post organizes the 6.0/6.1 surface into five axes, walks through the seven build-breaking migration items, and shares the four-week migration checklist ManoIT validated while moving its marketing and docs sites from v5 to v6.&lt;/p&gt;

&lt;h2&gt;
  
  
  1. Why 6 Is the Inflection Point — "Content Site = Static Build" Ends
&lt;/h2&gt;

&lt;p&gt;Through Astro 5, the framework treated static site generation as the default for &lt;strong&gt;content-first&lt;/strong&gt; sites. v4 introduced the Content Layer API to free collections from local files; v5 added Server Islands and Sessions to bring partial SSR into the mainstream. v6 &lt;strong&gt;completes&lt;/strong&gt; that arc, and the change comes down to two claims. First, &lt;strong&gt;the default for content shifts from build time to request time.&lt;/strong&gt; With Live Content Collections stable and Route Caching available as an experimental API, content sites can serve &lt;strong&gt;fresh-per-request data&lt;/strong&gt; while still letting CDN cache headers control delivery. Second, &lt;strong&gt;"works in dev, breaks in prod" disappears.&lt;/strong&gt; Thanks to Vite's Environment API, the dev server boots non-Node runtimes (workerd, Bun, Deno) locally and uses the same runtime for prerendering. You import &lt;code&gt;env&lt;/code&gt; from &lt;code&gt;cloudflare:workers&lt;/code&gt; in dev, not a &lt;code&gt;process.env&lt;/code&gt; shim.&lt;/p&gt;

&lt;p&gt;The table below collapses the v5.x → v6.0/6.1 operational surface onto a single page.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Axis&lt;/th&gt;
&lt;th&gt;Astro 5.x&lt;/th&gt;
&lt;th&gt;Astro 6.0 (2026-03-10)&lt;/th&gt;
&lt;th&gt;Astro 6.1 (2026-04)&lt;/th&gt;
&lt;th&gt;Operational signal&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Dev server&lt;/td&gt;
&lt;td&gt;Vite + custom middleware&lt;/td&gt;
&lt;td&gt;Rebuilt on Vite Environment API&lt;/td&gt;
&lt;td&gt;HMR/sourcemap regressions fixed&lt;/td&gt;
&lt;td&gt;dev = prod runtime parity&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Cloudflare&lt;/td&gt;
&lt;td&gt;workerd partial&lt;/td&gt;
&lt;td&gt;workerd at every stage (dev/prerender/prod)&lt;/td&gt;
&lt;td&gt;hybrid site regression fixed&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;Astro.locals.runtime&lt;/code&gt; removed&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Compiler&lt;/td&gt;
&lt;td&gt;Go-based&lt;/td&gt;
&lt;td&gt;Experimental Rust build added&lt;/td&gt;
&lt;td&gt;Rust build regression fixed&lt;/td&gt;
&lt;td&gt;Up to 100× compile speed&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Fonts&lt;/td&gt;
&lt;td&gt;Community integrations&lt;/td&gt;
&lt;td&gt;Core Fonts API GA&lt;/td&gt;
&lt;td&gt;Fallback metric improved&lt;/td&gt;
&lt;td&gt;Local + Google + Fontsource&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;CSP&lt;/td&gt;
&lt;td&gt;Vendor middleware&lt;/td&gt;
&lt;td&gt;Core CSP API GA&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;strict-dynamic&lt;/code&gt; added&lt;/td&gt;
&lt;td&gt;Inline-script nonce auto&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Live Collections&lt;/td&gt;
&lt;td&gt;5.10 experimental&lt;/td&gt;
&lt;td&gt;Stable&lt;/td&gt;
&lt;td&gt;External hosting integrations&lt;/td&gt;
&lt;td&gt;Request-time data official&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Route Caching&lt;/td&gt;
&lt;td&gt;None&lt;/td&gt;
&lt;td&gt;Experimental Route Caching API&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;cache-control&lt;/code&gt; surrogate keys&lt;/td&gt;
&lt;td&gt;Platform-agnostic SSR cache&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Content Collections&lt;/td&gt;
&lt;td&gt;Old API + Layer API coexisted&lt;/td&gt;
&lt;td&gt;Content Layer API mandatory&lt;/td&gt;
&lt;td&gt;unchanged&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;Astro.glob()&lt;/code&gt; removed&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Schema validation&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;astro:schema&lt;/code&gt; / &lt;code&gt;z from 'astro:content'&lt;/code&gt;
&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;astro/zod&lt;/code&gt; recommended (Zod v4)&lt;/td&gt;
&lt;td&gt;compatibility kept&lt;/td&gt;
&lt;td&gt;Input/output type split&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Node&lt;/td&gt;
&lt;td&gt;≥ 18.20&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;≥ 22 LTS required&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;unchanged&lt;/td&gt;
&lt;td&gt;OS base image alignment&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Image&lt;/td&gt;
&lt;td&gt;Sharp + partial redirect&lt;/td&gt;
&lt;td&gt;Pattern-gated redirect rejection&lt;/td&gt;
&lt;td&gt;Up to 10 redirects, AVIF/animated safe&lt;/td&gt;
&lt;td&gt;External CDN policy explicit&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The two highest-cost rows for operations are &lt;strong&gt;Content Collections&lt;/strong&gt; and &lt;strong&gt;Schema validation&lt;/strong&gt;. The auto-compatibility window for &lt;code&gt;Astro.glob()&lt;/code&gt; finally closed, and &lt;code&gt;z from 'astro:content'&lt;/code&gt; migrated to a Zod v4-based &lt;code&gt;astro/zod&lt;/code&gt;. Both break builds, so we burned &lt;strong&gt;week 1&lt;/strong&gt; of our migration entirely on these two rows.&lt;/p&gt;

&lt;h2&gt;
  
  
  2. Astro 6 Runtime Architecture — Five-Axis Layout
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;┌──────────── Source ────────────┐
│  src/pages/*.astro             │
│  src/content/**.{md,mdx}       │
│  src/content.config.ts (zod v4)│
└─────────────┬──────────────────┘
              │
       ┌──────▼──────┐
       │  Compiler   │   ❶ Go (default) | Rust (experimental, ~100×)
       │  .astro →   │
       │  ESM module │
       └──────┬──────┘
              │
   ┌──────────▼─────────────────────────────────────────┐
   │  Vite Environment API                              │  ❷ dev = prod runtime
   │  ┌────────┐  ┌────────┐  ┌────────┐  ┌────────┐    │
   │  │  Node  │  │ workerd│  │  Bun   │  │  Deno  │    │
   │  └────┬───┘  └────┬───┘  └────┬───┘  └────┬───┘    │
   └───────┼───────────┼───────────┼───────────┼────────┘
           │           │           │           │
   ┌───────▼───────────▼───────────▼───────────▼────────┐
   │  Astro core 6                                      │  ❸ Adapters
   │  • Fonts API   (preload, fallback metric)          │
   │  • CSP API     (nonce, strict-dynamic)             │
   │  • Live Content Collections (request-time loaders) │
   │  • Route Caching API (experimental, web-standard)  │
   │  • Sessions, Server Islands, Actions               │
   └───────┬────────────────────────────────────────────┘
           │
   ┌───────▼────────────────────────────────────────────┐
   │  Output                                             │  ❹ static / hybrid / SSR
   │  • Static pages (HTML)                              │
   │  • Server entry (Cloudflare Workers / Node / Bun)   │
   │  • Live API routes / RSC-style streaming            │
   └─────────────────────────────────────────────────────┘
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



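&lt;p&gt;&lt;em&gt;Figure 1. The Astro 6 layout. ❶ The compiler is Go by default, with Rust as an experimental option. ❷ The Vite Environment API supplies the same runtime to dev, prerender, and prod. ❸ Core surfaces Fonts, CSP, Live Collections, and Route Caching directly, and adapters (Cloudflare/Node/Bun/Deno) wrap it for each platform. ❹ The output ships as static, hybrid, or SSR. The subsections below treat adapters and output as separate axes, which gives five.&lt;/em&gt;&lt;/p&gt;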

&lt;h3&gt;
  
  
  2.1 Axis ① Compiler — Go to Rust, gradually
&lt;/h3&gt;

&lt;p&gt;Astro 6 ships an &lt;strong&gt;experimental Rust rewrite&lt;/strong&gt; of the compiler that turns &lt;code&gt;.astro&lt;/code&gt; files into ESM modules. The stable build is still Go; Rust is opt-in. This has two consequences. First, on cold builds of large content sites, the &lt;strong&gt;compilation phase alone&lt;/strong&gt; can run up to ~100× faster (the more files you have, the bigger the win). Second, once Rust stabilizes, it slots cleanly next to &lt;strong&gt;Vite Rolldown&lt;/strong&gt; (same Rust toolchain, room for Oxc/SWC interop). The operational decision is straightforward: keep Go as the default through 6.1 and opt into Rust only on a CI regression-test job.&lt;/p&gt;

&lt;h3&gt;
  
  
  2.2 Axis ② Vite Environment API — dev = prod runtime
&lt;/h3&gt;

&lt;p&gt;Vite's Environment API splits the module graph &lt;strong&gt;per environment&lt;/strong&gt;. Astro 6 sits its dev server on top of that, so each environment (Node, workerd, Bun, Deno) &lt;strong&gt;handles requests with its own runtime&lt;/strong&gt;. The clearest payoff is Cloudflare Workers. v5 ran the dev server on Node and exposed only some platform APIs; v6 runs &lt;strong&gt;workerd&lt;/strong&gt; in dev, prerender, and prod. Code that imports &lt;code&gt;env&lt;/code&gt; from &lt;code&gt;cloudflare:workers&lt;/code&gt;, KV/R2/D1 bindings, and Service Bindings all work in dev too, so the entire "works in dev, breaks in prod" class of regressions is gone.&lt;/p&gt;

&lt;h3&gt;
  
  
  2.3 Axis ③ Core APIs — Fonts, CSP, Live Collections, Route Caching
&lt;/h3&gt;

&lt;p&gt;The &lt;strong&gt;Fonts API&lt;/strong&gt; treats local files, Google Fonts, and Fontsource as one interface. It handles self-hosting, automatic preload links, and &lt;strong&gt;fallback metrics&lt;/strong&gt; (measuring the original font's ascent/descent/line-gap so the system fallback is adjusted to match) to cut CLS. The &lt;strong&gt;CSP API&lt;/strong&gt; auto-injects nonces into inline scripts and emits CSP directives and keywords such as &lt;code&gt;strict-dynamic&lt;/code&gt;, &lt;code&gt;upgrade-insecure-requests&lt;/code&gt;, and &lt;code&gt;frame-ancestors&lt;/code&gt; per adapter. Astro Islands' inline hydration scripts have been the obstacle to clean CSP for years; v6 closes that problem. &lt;strong&gt;Live Content Collections&lt;/strong&gt; is a new loader shape that fetches data at request time on top of the Content Layer. The &lt;strong&gt;Route Caching API&lt;/strong&gt; declares per-route &lt;code&gt;cache-control&lt;/code&gt;, &lt;code&gt;cdn-cache-control&lt;/code&gt;, and &lt;code&gt;surrogate-key&lt;/code&gt; headers, abstracting platform differences (Cloudflare/Vercel/Netlify).&lt;/p&gt;
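
&lt;p&gt;To make the nonce-based policy described above concrete, the sketch below assembles a &lt;code&gt;Content-Security-Policy&lt;/code&gt; value by hand. &lt;code&gt;buildCspHeader&lt;/code&gt; and &lt;code&gt;CspOptions&lt;/code&gt; are hypothetical names, not part of Astro's CSP API, and the nonce is a fixed placeholder; a real nonce must be freshly random for every response.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;// Hypothetical helper, not an Astro API: it only shows the shape of a
// nonce-based policy that uses 'strict-dynamic'.
type CspOptions = {
  nonce: string;            // per-response random value (placeholder here)
  frameAncestors: string[]; // e.g. ["'none'"] or ["'self'"]
};

function buildCspHeader(options: CspOptions): string {
  const directives = [
    // Scripts carrying the nonce are trusted; 'strict-dynamic' extends that
    // trust to the scripts they load, so host allowlists are not needed.
    "script-src 'nonce-" + options.nonce + "' 'strict-dynamic'",
    "frame-ancestors " + options.frameAncestors.join(" "),
    "upgrade-insecure-requests",
  ];
  return directives.join("; ");
}

const header = buildCspHeader({ nonce: "r4nd0m", frameAncestors: ["'none'"] });
console.log(header);
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;With these inputs the sketch prints &lt;code&gt;script-src 'nonce-r4nd0m' 'strict-dynamic'; frame-ancestors 'none'; upgrade-insecure-requests&lt;/code&gt;. The same nonce would then have to appear on every inline script tag, which is the part the text says the CSP API automates.&lt;/p&gt;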

&lt;h3&gt;
  
  
  2.4 Axis ④ Adapters — Cloudflare / Node / Bun / Deno
&lt;/h3&gt;

&lt;p&gt;Adapters wrap the core's build output into each platform's entrypoint. The biggest v6 change is &lt;code&gt;@astrojs/cloudflare&lt;/code&gt; running workerd at every stage. As a side effect, &lt;code&gt;Astro.locals.runtime.env&lt;/code&gt; is gone and the same data is reached via &lt;code&gt;env&lt;/code&gt; from &lt;code&gt;cloudflare:workers&lt;/code&gt; (see §5). The Node adapter now requires Node 22 at minimum, and the Bun/Deno adapters get much better dev parity from the Environment API.&lt;/p&gt;

&lt;h3&gt;
  
  
  2.5 Axis ⑤ Output — static / hybrid / SSR + Live API
&lt;/h3&gt;

&lt;p&gt;The &lt;code&gt;output&lt;/code&gt; setting is still either &lt;code&gt;static&lt;/code&gt; or &lt;code&gt;server&lt;/code&gt;; the separate &lt;code&gt;hybrid&lt;/code&gt; value was folded into &lt;code&gt;static&lt;/code&gt; back in v5, so a "hybrid" site is a static site with some routes opted out of prerendering. The difference in v6 is how routes opt in. Live API routes declare &lt;code&gt;export const prerender = false&lt;/code&gt; and use a Live Collection to return request-time data. The Route Caching API on the same route declares cache headers to govern CDN/edge caches.&lt;/p&gt;

&lt;h2&gt;
  
  
  3. v5 → v6 Migration — The Seven Build-Breaking Changes
&lt;/h2&gt;

&lt;p&gt;Below are the seven items from ManoIT's migration retro that &lt;strong&gt;actually break builds&lt;/strong&gt;. Going top to bottom in one pass closes week 1.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;#&lt;/th&gt;
&lt;th&gt;Change&lt;/th&gt;
&lt;th&gt;Symptom&lt;/th&gt;
&lt;th&gt;Fix direction&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;Astro.glob()&lt;/code&gt; removed&lt;/td&gt;
&lt;td&gt;&lt;code&gt;TypeError: Astro.glob is not a function&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Replace with &lt;code&gt;import.meta.glob()&lt;/code&gt; or move to a Content Layer loader&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;Content Layer API mandatory&lt;/td&gt;
&lt;td&gt;Old collections return empty arrays&lt;/td&gt;
&lt;td&gt;Consolidate into &lt;code&gt;src/content.config.ts&lt;/code&gt; + &lt;code&gt;defineCollection({ loader })&lt;/code&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;z from 'astro:content'&lt;/code&gt; deprecated&lt;/td&gt;
&lt;td&gt;Type errors / runtime warnings&lt;/td&gt;
&lt;td&gt;Switch to &lt;code&gt;import { z } from 'astro/zod'&lt;/code&gt; (Zod v4)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;4&lt;/td&gt;
&lt;td&gt;Zod v4&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;z.string({ required_error })&lt;/code&gt; API changed&lt;/td&gt;
&lt;td&gt;Apply v3→v4 migration (split input/output types)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;5&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;Astro.locals.runtime&lt;/code&gt; removed (Cloudflare)&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;undefined&lt;/code&gt; errors&lt;/td&gt;
&lt;td&gt;&lt;code&gt;import { env } from 'cloudflare:workers'&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;6&lt;/td&gt;
&lt;td&gt;Node 22 required&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;engines&lt;/code&gt; warning or build failure&lt;/td&gt;
&lt;td&gt;Move base image to &lt;code&gt;node:22-alpine&lt;/code&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;7&lt;/td&gt;
&lt;td&gt;Image redirect patterns&lt;/td&gt;
&lt;td&gt;Build fails on external CDN redirect&lt;/td&gt;
&lt;td&gt;Declare every redirect host in &lt;code&gt;image.remotePatterns&lt;/code&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;#1 is the most common and the simplest. &lt;code&gt;Astro.glob('../posts/*.md')&lt;/code&gt; becomes &lt;code&gt;import.meta.glob('../posts/*.md', { eager: true })&lt;/code&gt;, or, preferably, is wrapped in a content collection. #2 means the auto-compatibility window closed in v6: v5.x implicitly upgraded old-style collections to the Content Layer API without the &lt;code&gt;experimental.contentLayer&lt;/code&gt; flag, and that fallback is gone. Define every collection in &lt;code&gt;src/content.config.ts&lt;/code&gt;, and supply an explicit &lt;strong&gt;loader&lt;/strong&gt; (see §4). #3 and #4 ride the same Zod release wave, so handle them together rather than splitting them across weeks.&lt;/p&gt;
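
&lt;p&gt;One practical wrinkle with change #1: an eager &lt;code&gt;import.meta.glob()&lt;/code&gt; call returns a record keyed by file path, whereas &lt;code&gt;Astro.glob()&lt;/code&gt; returned an array. The sketch below shows one way to flatten and sort that record. &lt;code&gt;toSortedPosts&lt;/code&gt;, the &lt;code&gt;pubDate&lt;/code&gt; frontmatter field, and the sample data are assumptions for illustration; the literal record stands in for the real glob result so the snippet runs on its own.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;// Assumed module shape: each Markdown module exposes its frontmatter.
type PostModule = { frontmatter: { title: string; pubDate: string } };
// Shape of an eager glob result: a record keyed by file path.
type GlobResult = { [path: string]: PostModule };

// Flatten the keyed record into an array, newest first,
// keeping the file path alongside each post's frontmatter.
function toSortedPosts(modules: GlobResult) {
  return Object.entries(modules)
    .map(([file, mod]) =&amp;gt; ({ file, ...mod.frontmatter }))
    .sort((a, b) =&amp;gt; Date.parse(b.pubDate) - Date.parse(a.pubDate));
}

// In a real page this record would come from
// import.meta.glob('../posts/*.md', { eager: true }).
const modules: GlobResult = {
  "../posts/hello.md": { frontmatter: { title: "Hello", pubDate: "2026-01-05" } },
  "../posts/astro-6.md": { frontmatter: { title: "Astro 6", pubDate: "2026-03-10" } },
};

console.log(toSortedPosts(modules).map((post) =&amp;gt; post.title));
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;The sample prints the titles newest first (&lt;code&gt;Astro 6&lt;/code&gt;, then &lt;code&gt;Hello&lt;/code&gt;). Moving the same files into a content collection removes the need for this helper, which is one reason the collection route is the preferred fix.&lt;/p&gt;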

&lt;h2&gt;
  
  
  4. Live Content Collections — The Request-Time Loader Pattern
&lt;/h2&gt;

&lt;p&gt;Below is the recommended Astro 6 collection shape. &lt;strong&gt;Build-time&lt;/strong&gt; (blog posts) and &lt;strong&gt;request-time live&lt;/strong&gt; (live inventory) collections sit side-by-side in the same file.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// src/content.config.ts — Astro 6 recommended layout&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;defineCollection&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;defineLiveCollection&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;astro:content&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;z&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;astro/zod&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;                     &lt;span class="c1"&gt;// ❶ astro/zod (Zod v4)&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;glob&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;astro/loaders&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;liveLoader&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;@astrojs/live-collections&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="c1"&gt;// ❷ Build-time: src/content/posts/**/*.md&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;posts&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;defineCollection&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;loader&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;glob&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;pattern&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;**/*.{md,mdx}&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;base&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;./src/content/posts&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="p"&gt;}),&lt;/span&gt;
  &lt;span class="na"&gt;schema&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;z&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;object&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
    &lt;span class="na"&gt;title&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;z&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;string&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;min&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;8&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;max&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;120&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="na"&gt;publishedAt&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;z&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;coerce&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;date&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
    &lt;span class="na"&gt;tags&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;z&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;array&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;z&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;string&lt;/span&gt;&lt;span class="p"&gt;()).&lt;/span&gt;&lt;span class="k"&gt;default&lt;/span&gt;&lt;span class="p"&gt;([]),&lt;/span&gt;
    &lt;span class="na"&gt;cover&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;z&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;string&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;url&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;optional&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
  &lt;span class="p"&gt;}),&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;

&lt;span class="c1"&gt;// ❸ Request-time live: external inventory API&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;inventory&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;defineLiveCollection&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;loader&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;liveLoader&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;async &lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="nx"&gt;entry&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;request&lt;/span&gt; &lt;span class="p"&gt;})&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;res&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;fetch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;`https://api.manoit.co.kr/inventory/&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;entry&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="na"&gt;headers&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;cache-control&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;no-store&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="p"&gt;});&lt;/span&gt;
    &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;ok&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;throw&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Error&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;`Inventory &lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;entry&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt; &lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;status&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
  &lt;span class="p"&gt;}),&lt;/span&gt;
  &lt;span class="na"&gt;schema&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;z&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;object&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
    &lt;span class="na"&gt;sku&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;z&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;string&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
    &lt;span class="na"&gt;stock&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;z&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;number&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;int&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;nonnegative&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
    &lt;span class="na"&gt;price&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;z&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;number&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;nonnegative&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
    &lt;span class="na"&gt;lastUpdated&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;z&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;coerce&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;date&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
  &lt;span class="p"&gt;}),&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;

&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;collections&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;posts&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;inventory&lt;/span&gt; &lt;span class="p"&gt;};&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Two points. First, a collection declared with &lt;code&gt;defineCollection&lt;/code&gt; is loaded once at build time and rendered into static pages. Second, a collection declared with &lt;code&gt;defineLiveCollection&lt;/code&gt; runs its loader &lt;strong&gt;per request&lt;/strong&gt;, so &lt;code&gt;const item = await getEntry('inventory', sku)&lt;/code&gt; returns fresh data on every call while keeping the same API. Routes that use a live collection automatically flip to &lt;code&gt;prerender = false&lt;/code&gt;.&lt;/p&gt;
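
&lt;p&gt;The difference between the two loading models can be shown without Astro at all. In the toy sketch below, &lt;code&gt;buildTimeCollection&lt;/code&gt; and &lt;code&gt;liveCollection&lt;/code&gt; are hypothetical stand-ins, not Astro APIs, and everything is synchronous for brevity (real loaders are async). The build-time stand-in also loads lazily on first read, which simplifies what a real build does up front.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;type Item = { sku: string; stock: number };
type Loader = (sku: string) =&amp;gt; Item;

// Build-time shape: an entry is loaded once and every later read reuses it.
function buildTimeCollection(loader: Loader) {
  const snapshot: { [sku: string]: Item } = {};
  return {
    getEntry(sku: string): Item {
      if (!(sku in snapshot)) snapshot[sku] = loader(sku);
      return snapshot[sku];
    },
  };
}

// Live shape: same read API, but the loader runs on every call.
function liveCollection(loader: Loader) {
  return {
    getEntry(sku: string): Item {
      return loader(sku);
    },
  };
}

// A fake inventory source whose stock drops by one on every load.
let stock = 10;
const source: Loader = (sku) =&amp;gt; ({ sku, stock: stock-- });

const frozen = buildTimeCollection(source);
const live = liveCollection(source);
console.log(frozen.getEntry("A1").stock, frozen.getEntry("A1").stock); // 10 10
console.log(live.getEntry("A1").stock, live.getEntry("A1").stock);     // 9 8
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;The frozen collection keeps reporting the value captured on first load, while the live one sees each change. That staleness is the trade the text describes: static pages are cheap to serve but only as fresh as the last build.&lt;/p&gt;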

&lt;h2&gt;
  
  
  5. First-Class Cloudflare Workers — workerd-on-everywhere
&lt;/h2&gt;

&lt;p&gt;Below is a Cloudflare-adapter route as it looks after the v5 → v6 migration. &lt;code&gt;Astro.locals.runtime&lt;/code&gt; is gone; &lt;code&gt;env&lt;/code&gt; is imported directly from &lt;code&gt;cloudflare:workers&lt;/code&gt;. The dev server runs the exact same code path, so the "process.env in dev, env in prod" branch is no longer needed.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// src/pages/api/keys/[name].ts — Astro 6 + Cloudflare Workers&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="kd"&gt;type&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;APIContext&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;astro&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;env&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;cloudflare:workers&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;     &lt;span class="c1"&gt;// ❶ Replaces Astro.locals.runtime.env&lt;/span&gt;

&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;prerender&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;               &lt;span class="c1"&gt;// ❷ Live API route&lt;/span&gt;

&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;GET&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="nx"&gt;params&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;request&lt;/span&gt; &lt;span class="p"&gt;}:&lt;/span&gt; &lt;span class="nx"&gt;APIContext&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;value&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;env&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;MY_KV&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;`tenant:&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;params&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;name&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="nx"&gt;value&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Response&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;JSON&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;stringify&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;error&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;not_found&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="p"&gt;}),&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="na"&gt;status&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;404&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;headers&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;content-type&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;application/json&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="p"&gt;});&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;

  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Response&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;value&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;headers&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;content-type&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;application/json&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="c1"&gt;// ❸ Route Caching API: 60s at the edge, cache-tag for invalidation&lt;/span&gt;
      &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;cache-control&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;public, max-age=0, s-maxage=60&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;cache-tag&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;`tenant:&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;params&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;name&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="p"&gt;});&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;❶ &lt;code&gt;env&lt;/code&gt; exposes the wrangler.toml KV/R2/D1 bindings and secrets in dev too. ❷ A single line flips the route to SSR. ❸ &lt;code&gt;cache-tag&lt;/code&gt; is invalidated by Cloudflare's &lt;em&gt;purge by tag&lt;/em&gt;; the Route Caching API translates the header to whatever name each adapter expects (Vercel uses &lt;code&gt;x-vercel-cache-tag&lt;/code&gt;, Netlify uses &lt;code&gt;netlify-cdn-cache-tag&lt;/code&gt;).&lt;/p&gt;
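
&lt;p&gt;As a rough sketch of the invalidation side of ❸, the helper below builds (but does not send) a Cloudflare purge-by-tag request for one tenant. The name &lt;code&gt;buildPurgeRequest&lt;/code&gt;, the zone ID, and the API token are hypothetical placeholders; the only parts taken from Cloudflare's public API are the &lt;code&gt;purge_cache&lt;/code&gt; endpoint and its &lt;code&gt;tags&lt;/code&gt; body field, and tag purging may depend on your plan.&lt;/p&gt;

```typescript
// Hypothetical helper: describes a purge-by-tag call without performing it.
type PurgeRequest = {
  url: string;
  method: string;
  headers: { [name: string]: string };
  body: string;
};

function buildPurgeRequest(zoneId: string, apiToken: string, tenant: string): PurgeRequest {
  return {
    url: `https://api.cloudflare.com/client/v4/zones/${zoneId}/purge_cache`,
    method: "POST",
    headers: {
      authorization: `Bearer ${apiToken}`,
      "content-type": "application/json",
    },
    // Must match the `cache-tag` value the endpoint attached to its response.
    body: JSON.stringify({ tags: [`tenant:${tenant}`] }),
  };
}

// Placeholder credentials; substitute real values from your own account.
const req = buildPurgeRequest("ZONE_ID", "API_TOKEN", "acme");
console.log(req.url, req.body);
```

&lt;p&gt;Keeping the request as plain data makes it easy to unit-test the tag format before wiring it to &lt;code&gt;fetch(req.url, { method: req.method, headers: req.headers, body: req.body })&lt;/code&gt;.&lt;/p&gt;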

&lt;h2&gt;
  
  
  6. Fonts and CSP API — The Last Two Pieces of the Islands Era
&lt;/h2&gt;

&lt;p&gt;The &lt;strong&gt;Fonts API&lt;/strong&gt; lands in &lt;code&gt;astro.config.ts&lt;/code&gt; like this, covering self-hosting, preload, and fallback metrics in one go.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// astro.config.ts — Fonts API + CSP API&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;defineConfig&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;astro/config&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="nx"&gt;cloudflare&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;@astrojs/cloudflare&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="k"&gt;default&lt;/span&gt; &lt;span class="nf"&gt;defineConfig&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;output&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;server&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;adapter&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;cloudflare&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;mode&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;directory&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="p"&gt;}),&lt;/span&gt;
  &lt;span class="na"&gt;fonts&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Pretendard&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;cssVariable&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;--font-sans&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;provider&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;fontsource&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;                 &lt;span class="c1"&gt;// ❶ google | fontsource | local&lt;/span&gt;
      &lt;span class="na"&gt;weights&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;400&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;600&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;800&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
      &lt;span class="na"&gt;subsets&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;latin&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;korean&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
      &lt;span class="na"&gt;fallbacks&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Apple SD Gothic Neo&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;sans-serif&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
      &lt;span class="na"&gt;display&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;swap&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;JetBrains Mono&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;cssVariable&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;--font-mono&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;provider&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;google&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;weights&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;400&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;600&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
      &lt;span class="na"&gt;preload&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="p"&gt;],&lt;/span&gt;
  &lt;span class="na"&gt;csp&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;enabled&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;directives&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;default-src&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;'self'&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
      &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;script-src&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;'self'&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;'strict-dynamic'&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;   &lt;span class="c1"&gt;// ❷ nonce auto&lt;/span&gt;
      &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;style-src&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;'self'&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;'unsafe-inline'&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
      &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;img-src&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;'self'&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;https://cdn.manoit.co.kr&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;data:&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
      &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;connect-src&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;'self'&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;https://api.manoit.co.kr&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="na"&gt;experimental&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;routeCaching&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;                       &lt;span class="c1"&gt;// ❸ enable route cache headers&lt;/span&gt;
  &lt;span class="p"&gt;},&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Use the fonts in CSS as &lt;code&gt;font-family: var(--font-sans)&lt;/code&gt; / &lt;code&gt;var(--font-mono)&lt;/code&gt;. For CSP, Astro computes SHA-256 hashes for inline scripts at build time and adds them to &lt;code&gt;script-src&lt;/code&gt;, or, on SSR routes, generates a &lt;strong&gt;per-request nonce&lt;/strong&gt;. Our policy is &lt;code&gt;'strict-dynamic'&lt;/code&gt; + nonce so child scripts spawned by hydration are trusted automatically.&lt;/p&gt;
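
&lt;p&gt;To make the hash-based half concrete, here is a minimal sketch of how a CSP hash source for one inline script can be derived. &lt;code&gt;cspHashFor&lt;/code&gt; is a hypothetical helper written for illustration, not Astro's internal implementation; it only shows the standard &lt;code&gt;'sha256-&amp;lt;base64&amp;gt;'&lt;/code&gt; source format that browsers check inline scripts against.&lt;/p&gt;

```typescript
import { createHash } from "node:crypto";

// Hypothetical helper: returns the CSP source expression for one inline script.
// The hash covers the exact script text, so any whitespace change alters it.
function cspHashFor(inlineScript: string): string {
  const digest = createHash("sha256").update(inlineScript, "utf8").digest("base64");
  return `'sha256-${digest}'`;
}

const source = cspHashFor("console.log('hello');");
console.log(`script-src 'self' ${source}`);
```

&lt;p&gt;Hashes suit prerendered pages because the script text is known at build time; a per-request nonce fits SSR responses, where the header is generated alongside the markup.&lt;/p&gt;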

&lt;h2&gt;
  
  
  7. ManoIT's Four-Week Migration Checklist
&lt;/h2&gt;

&lt;p&gt;Concrete, gated steps. Each week ends on a regression-green gate before the next begins.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Week&lt;/th&gt;
&lt;th&gt;Work&lt;/th&gt;
&lt;th&gt;Gate&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;Align Node 22 base image, sweep &lt;code&gt;Astro.glob()&lt;/code&gt;, consolidate collections in &lt;code&gt;src/content.config.ts&lt;/code&gt;, switch &lt;code&gt;z&lt;/code&gt; import to &lt;code&gt;astro/zod&lt;/code&gt;
&lt;/td&gt;
&lt;td&gt;Local build green, &lt;code&gt;tsc --noEmit&lt;/code&gt; clean&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;Cloudflare adapter to v6, replace &lt;code&gt;Astro.locals.runtime&lt;/code&gt; with &lt;code&gt;cloudflare:workers&lt;/code&gt;, validate wrangler.toml bindings in dev&lt;/td&gt;
&lt;td&gt;Dev hits live KV/R2/D1 successfully&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;td&gt;Adopt Live Content Collections (inventory/sessions/live pricing), define Route Caching matrix, wire surrogate-key invalidation&lt;/td&gt;
&lt;td&gt;Edge cache hit rate ≥ 80% (staging)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;4&lt;/td&gt;
&lt;td&gt;Apply Fonts + CSP API, compare Lighthouse CLS/LCP, opt Rust compiler into a CI job, canary 5% traffic&lt;/td&gt;
&lt;td&gt;CLS ≤ 0.05, CSP report-only violations 0, canary error rate ≤ 0.1%&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;
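
&lt;p&gt;The numeric gates for weeks 3 and 4 are simple enough to encode as a CI check. The sketch below is one way to do that; the type and field names (&lt;code&gt;GateMetrics&lt;/code&gt;, &lt;code&gt;edgeCacheHitRate&lt;/code&gt;, and so on) are hypothetical, and only the thresholds come from the table above.&lt;/p&gt;

```typescript
// Hypothetical metric names; thresholds are the week-3 and week-4 gates.
type GateMetrics = {
  edgeCacheHitRate: number; // fraction between 0 and 1, measured on staging
  cls: number;              // Lighthouse Cumulative Layout Shift
  cspViolations: number;    // count of CSP report-only violations
  canaryErrorRate: number;  // fraction between 0 and 1
};

// Returns a human-readable reason for every gate that is not met.
function failedGates(m: GateMetrics): string[] {
  const reasons: string[] = [];
  if (!(m.edgeCacheHitRate >= 0.8)) reasons.push("edge cache hit rate below 80%");
  if (m.cls > 0.05) reasons.push("CLS above 0.05");
  if (m.cspViolations > 0) reasons.push("CSP report-only violations present");
  if (m.canaryErrorRate > 0.001) reasons.push("canary error rate above 0.1%");
  return reasons;
}

// Illustrative numbers, not measurements from the migration.
const failures = failedGates({
  edgeCacheHitRate: 0.86,
  cls: 0.03,
  cspViolations: 0,
  canaryErrorRate: 0.0004,
});
console.log(failures.length === 0 ? "gates green" : failures.join("; "));
```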

&lt;p&gt;The four-week plan compresses for smaller sites. ManoIT's marketing site (~380 pages, 80 MDX, 4 collections) finished in &lt;strong&gt;7 working days&lt;/strong&gt;. The biggest week-1 sink was the &lt;strong&gt;Zod v3 → v4 schema realignment&lt;/strong&gt;: &lt;code&gt;z.string().required()&lt;/code&gt; collapses to &lt;code&gt;z.string()&lt;/code&gt; in v4 (required by default), and &lt;code&gt;z.input&amp;lt;T&amp;gt;&lt;/code&gt; / &lt;code&gt;z.output&amp;lt;T&amp;gt;&lt;/code&gt; had to be split for any schema that coerced types.&lt;/p&gt;

&lt;h2&gt;
  
  
  8. Conclusion — Astro 6 Installs SSR as the Default for Content Sites
&lt;/h2&gt;

&lt;p&gt;Through v5, Astro's posture was &lt;em&gt;"static where you can, SSR where you must."&lt;/em&gt; v6 doesn't reverse that posture. It just &lt;strong&gt;drops the friction of choosing SSR to nearly zero&lt;/strong&gt;, using every major surface to do it: dev=prod runtime parity, first-class workerd, live collections, route caching, Fonts/CSP in core, the Rust compiler. Adding one SSR line to a content site costs an order of magnitude less than it did in v5. But every gain arrives &lt;strong&gt;after&lt;/strong&gt; the migration gate. Until you've cleaned out &lt;code&gt;Astro.glob()&lt;/code&gt;, &lt;code&gt;astro:schema&lt;/code&gt;, &lt;code&gt;Astro.locals.runtime&lt;/code&gt;, and Node 18 in one pass, the new v6 surface shows up only as build breaks. ManoIT's three sites (marketing, docs, customer portal) landed on v6 within 28 days of the 6.0 stable release; route caching and live collections roll into the next quarter.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;This article was co-authored by Anthropic Claude Opus 4.6 and ManoIT. Primary sources: Astro 6.0 (2026-03-10) and 6.1 release notes, &lt;code&gt;@astrojs/cloudflare&lt;/code&gt;, Vite Environment API documentation, and ManoIT's internal migration retros (2026-04-13 through 2026-05-02). © 2026 ManoIT.&lt;/em&gt;&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Originally published at &lt;a href="https://www.manoit.co.kr/forum/view/1468286" rel="noopener noreferrer"&gt;ManoIT Tech Blog&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>frontend</category>
      <category>webdev</category>
      <category>javascript</category>
      <category>nextjs</category>
    </item>
    <item>
      <title>OpenTelemetry Profiles Public Alpha: eBPF Fourth Signal, Collector v0.151.0 and OpAMP Fleet Management for 2026</title>
      <dc:creator>daniel jeong</dc:creator>
      <pubDate>Fri, 08 May 2026 00:09:28 +0000</pubDate>
      <link>https://dev.to/x4nent/opentelemetry-profiles-public-alpha-ebpf-fourth-signal-collector-v01510-and-opamp-fleet-6g3</link>
      <guid>https://dev.to/x4nent/opentelemetry-profiles-public-alpha-ebpf-fourth-signal-collector-v01510-and-opamp-fleet-6g3</guid>
      <description>&lt;h1&gt;
  
  
  OpenTelemetry Profiles Public Alpha — How the eBPF Fourth Signal, Collector v0.151.0, and OpAMP Fleet Management Redefine Unified Observability in 2026
&lt;/h1&gt;

&lt;p&gt;On March 26, 2026 the CNCF announced &lt;strong&gt;the Public Alpha of the OpenTelemetry Profiles signal&lt;/strong&gt;. After metrics, traces, and logs, &lt;strong&gt;continuous profiling joins as the fourth signal&lt;/strong&gt;, and OpenTelemetry becomes the first open standard to put all four observability pillars under a single SDK, a single OTLP wire protocol, and a single semantic-convention layer. In the same release cycle, &lt;strong&gt;OpenTelemetry Collector v0.151.0&lt;/strong&gt; shipped on April 29, 2026 with winget distribution, Run/Shutdown lifecycle synchronization, and richer &lt;code&gt;send_failed&lt;/code&gt; metrics that smoothed out the operational surface. And IBM Instana's GA of OpAMP-powered Collector Fleet Management makes it clear that the 2026 default for OTel operations is moving from SSH and rolling restarts to supervisor-driven OpAMP. This post consolidates the eBPF profiler architecture, the &lt;code&gt;k8sattributesprocessor&lt;/code&gt; integration, the &lt;code&gt;trace_id&lt;/code&gt;/&lt;code&gt;span_id&lt;/code&gt; cross-correlation semantic conventions, the Q3 2026 GA timeline, and ManoIT's four-week adoption checklist validated on EKS 1.32 and on-prem bare metal.&lt;/p&gt;

&lt;h2&gt;
  
  
  1. Why Profiles Is the Tipping Point — Unified OTLP Across Four Signals
&lt;/h2&gt;

&lt;p&gt;Until now the observability stack has been split. Metrics, traces, and logs converged on OpenTelemetry, but &lt;strong&gt;continuous profiling stayed on a separate axis&lt;/strong&gt; with Pyroscope, Parca, and Pixie. As a result, operators have had to maintain two agents, two wire formats, and two backends for the same workload. Answering "during the slow span I just saw, where exactly did the CPU time go inside the call stack?" required manually aligning two different graphs.&lt;/p&gt;

&lt;p&gt;Profiles Alpha ends that split. &lt;strong&gt;Profile samples now travel over OTLP carrying &lt;code&gt;trace_id&lt;/code&gt; and &lt;code&gt;span_id&lt;/code&gt; attributes by semantic convention&lt;/strong&gt;, so backends can join trace and profile data on the same key and offer one-click navigation from a span to its corresponding stack trace. Per OpenTelemetry's versioning rules, all signal SDKs ship with the same version, so the Profiles stability schedule is bound to the SDK as a whole. &lt;strong&gt;Profiles is targeting GA in Q3 2026&lt;/strong&gt;; Public Alpha sits one step before that mark.&lt;/p&gt;
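
&lt;p&gt;The join a backend performs can be sketched in a few lines. The shapes below (&lt;code&gt;Span&lt;/code&gt;, &lt;code&gt;ProfileSample&lt;/code&gt;, &lt;code&gt;stacksBySpan&lt;/code&gt;) are hypothetical simplifications for illustration, not the OTLP message definitions; the only assumption carried over from the text is that each sample carries the trace and span identifiers of the span that was active when it was captured.&lt;/p&gt;

```typescript
// Hypothetical, simplified shapes; real OTLP messages carry many more fields.
type Span = { traceId: string; spanId: string; name: string };
type ProfileSample = { traceId: string; spanId: string; stack: string[] };
type StacksBySpan = { [spanKey: string]: string[][] };

// Group sampled stacks under the span that was active when they were captured.
function stacksBySpan(spans: Span[], samples: ProfileSample[]): StacksBySpan {
  const result: StacksBySpan = {};
  for (const span of spans) {
    result[`${span.traceId}/${span.spanId}`] = [];
  }
  for (const sample of samples) {
    const bucket = result[`${sample.traceId}/${sample.spanId}`];
    // Samples with no matching span (for example background threads) are skipped.
    if (bucket !== undefined) bucket.push(sample.stack);
  }
  return result;
}

const joined = stacksBySpan(
  [{ traceId: "t1", spanId: "s1", name: "GET /checkout" }],
  [
    { traceId: "t1", spanId: "s1", stack: ["main", "handleCheckout", "priceCart"] },
    { traceId: "t9", spanId: "s9", stack: ["main", "gcWorker"] },
  ],
);
console.log(joined["t1/s1"]);
```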

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Aspect&lt;/th&gt;
&lt;th&gt;2024 (before)&lt;/th&gt;
&lt;th&gt;2026 Q1 (RC)&lt;/th&gt;
&lt;th&gt;2026 Q2 (Alpha · now)&lt;/th&gt;
&lt;th&gt;2026 Q3 (GA target)&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Fourth signal status&lt;/td&gt;
&lt;td&gt;Experimental&lt;/td&gt;
&lt;td&gt;Release Candidate&lt;/td&gt;
&lt;td&gt;Public Alpha (3/26)&lt;/td&gt;
&lt;td&gt;General Availability&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Wire format&lt;/td&gt;
&lt;td&gt;Vendor-specific&lt;/td&gt;
&lt;td&gt;Draft OTLP extension&lt;/td&gt;
&lt;td&gt;OTLP profiles message stabilizing&lt;/td&gt;
&lt;td&gt;Folded into OTLP 1.x&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Reference agent&lt;/td&gt;
&lt;td&gt;None&lt;/td&gt;
&lt;td&gt;Elastic Universal Profiler donated&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;opentelemetry-ebpf-profiler&lt;/code&gt; (official)&lt;/td&gt;
&lt;td&gt;Bundled in Collector distros&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Trace correlation&lt;/td&gt;
&lt;td&gt;Vendor-specific&lt;/td&gt;
&lt;td&gt;Attribute draft&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;trace_id&lt;/code&gt;/&lt;code&gt;span_id&lt;/code&gt; semantic conventions&lt;/td&gt;
&lt;td&gt;Mandated across SDKs&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;K8s metadata join&lt;/td&gt;
&lt;td&gt;Manual&lt;/td&gt;
&lt;td&gt;Proposal&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;k8sattributesprocessor&lt;/code&gt; integration&lt;/td&gt;
&lt;td&gt;Standard pipeline&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Collector lifecycle&lt;/td&gt;
&lt;td&gt;Async shutdown&lt;/td&gt;
&lt;td&gt;Issue tracking&lt;/td&gt;
&lt;td&gt;Run/Shutdown sync (v0.151.0)&lt;/td&gt;
&lt;td&gt;Maintained&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The most important row is &lt;strong&gt;"reference agent."&lt;/strong&gt; With Elastic donating its Universal Profiling Agent and the OpenTelemetry community relaunching it as &lt;code&gt;opentelemetry-ebpf-profiler&lt;/code&gt;, the project now has a reference implementation that achieves three properties at once: &lt;strong&gt;low overhead, whole-system coverage, and language-agnostic, no-instrumentation collection.&lt;/strong&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  2. eBPF Profiler Architecture — The Fourth Signal Lives Inside the Collector
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;┌────────────────────── Linux Kernel (perf events + eBPF) ──────────────────────┐
│   ┌──────────────────────────────────────────────────────────────────────┐    │
│   │  opentelemetry-ebpf-profiler  (CO-RE eBPF, perf-format stack trace) │    │
│   │  C/C++ · Go · Rust · Python · Java · NodeJS · .NET · PHP · Ruby     │    │
│   │  Automatic Go symbolization · new runtimes · low-overhead sampling  │    │
│   └────────────────────────────────────┬─────────────────────────────────┘    │
└────────────────────────────────────────┼──────────────────────────────────────┘
                                         │ profiling samples
┌────────────────────────────────────────▼──────────────────────────────────────┐
│                        OpenTelemetry Collector v0.151.0                       │
│   ┌──────────────────────────┐   ┌────────────────────────────────────────┐   │
│   │ profiler receiver        │──▶│ k8sattributesprocessor                │   │
│   │ (Elastic donation)       │   │ container.id → namespace/pod/deployment│   │
│   └──────────────────────────┘   └─────────────────┬─────────────────────┘   │
│                                                     │                         │
│                  ┌──────────────────────────────────▼───────────────────┐    │
│                  │ batchprocessor · resourceprocessor · tail_sampling   │    │
│                  │ (same pipeline reused with traces and metrics)        │    │
│                  └──────────────────────────────────┬───────────────────┘    │
│                                                     │ OTLP                    │
│                                                     ▼                         │
│                                ┌──────────────────────────────────────────┐   │
│                                │ otlp/profiles exporter → backend         │   │
│                                │ trace_id · span_id attributes preserved │   │
│                                └──────────────────────────────────────────┘   │
└───────────────────────────────────────────────────────────────────────────────┘
                                          │
                                          ▼
            ┌────────────────────────────────────────────────────────────┐
            │ OpAMP Supervisor fleet (Instana GA · Bindplane)            │
            │ Remote config · health reporting · package management      │
            └────────────────────────────────────────────────────────────┘
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  2.1 Axis ① Agent — One eBPF Profiler Covers Many Language Runtimes
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;&lt;code&gt;opentelemetry-ebpf-profiler&lt;/code&gt;&lt;/strong&gt; is the GitHub repo that inherited the Universal Profiling Agent donated by Elastic. A single CO-RE eBPF object runs across compatible kernels and collects call-stack samples for the runtimes you actually find in a data center — C/C++, Go, Rust, Python, Java, NodeJS, .NET, PHP, Ruby, Perl — without any per-language SDK instrumentation. The Alpha cycle added &lt;strong&gt;automatic Go symbolization&lt;/strong&gt;, which restores function names from a stripped Go binary without separate debug info and removes one more operational tax.&lt;/p&gt;

&lt;h3&gt;
  
  
  2.2 Axis ② Collector Receiver — Share the Same Pipeline as Traces and Metrics
&lt;/h3&gt;

&lt;p&gt;The most important design choice is &lt;strong&gt;"Collector receiver reusing existing pipelines"&lt;/strong&gt; rather than "separate daemon and separate backend." Because the profiler is a Collector receiver, you reuse the &lt;code&gt;batchprocessor&lt;/code&gt;, &lt;code&gt;resourceprocessor&lt;/code&gt;, &lt;code&gt;tail_sampling&lt;/code&gt;, and OTLP exporters that you already deployed for traces and metrics. Operators no longer need two agents, two lifecycles, and two auth tokens. Profiles ship with the same token to the same endpoint as traces and metrics on the same node.&lt;/p&gt;

&lt;h3&gt;
  
  
  2.3 Axis ③ K8s Enrichment — &lt;code&gt;k8sattributesprocessor&lt;/code&gt; + &lt;code&gt;container.id&lt;/code&gt;
&lt;/h3&gt;

&lt;p&gt;&lt;code&gt;k8sattributesprocessor&lt;/code&gt; uses the &lt;code&gt;container.id&lt;/code&gt; resource attribute as a join key and automatically attaches &lt;code&gt;namespace&lt;/code&gt;, &lt;code&gt;pod&lt;/code&gt;, &lt;code&gt;deployment&lt;/code&gt;, and &lt;code&gt;node&lt;/code&gt; labels to every piece of telemetry that flows through. Profiles go through the same processor, so backends receive every profile sample already enriched with K8s context. Operators stop asking only "which function is hot?" and start asking "in which namespace, deployment, and pod, and under which trace span, was this stack hot?"&lt;/p&gt;
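
&lt;p&gt;Conceptually the enrichment is a lookup keyed on &lt;code&gt;container.id&lt;/code&gt;. The sketch below illustrates that idea only; it is not the processor's actual code, and &lt;code&gt;PodIndex&lt;/code&gt;, &lt;code&gt;PodInfo&lt;/code&gt;, and &lt;code&gt;enrich&lt;/code&gt; are hypothetical names. The &lt;code&gt;k8s.*&lt;/code&gt; attribute keys are the ones commonly listed under the processor's &lt;code&gt;extract.metadata&lt;/code&gt; setting.&lt;/p&gt;

```typescript
// Hypothetical, simplified shapes for illustration only.
type Attributes = { [key: string]: string };
type PodInfo = { namespace: string; pod: string; deployment: string; node: string };
type PodIndex = { [containerId: string]: PodInfo };

// Attach Kubernetes context to a resource when its container.id is known.
function enrich(resource: Attributes, index: PodIndex): Attributes {
  const containerId = resource["container.id"];
  const info = containerId === undefined ? undefined : index[containerId];
  if (info === undefined) return resource; // unknown container: pass through unchanged
  return {
    ...resource,
    "k8s.namespace.name": info.namespace,
    "k8s.pod.name": info.pod,
    "k8s.deployment.name": info.deployment,
    "k8s.node.name": info.node,
  };
}

// Example index with made-up identifiers.
const podIndex: PodIndex = {
  abc123: { namespace: "shop", pod: "checkout-7d9f", deployment: "checkout", node: "node-a" },
};
console.log(enrich({ "container.id": "abc123" }, podIndex));
```

&lt;p&gt;Because the lookup works on resource attributes rather than on signal-specific fields, the same step can serve traces, metrics, logs, and profiles.&lt;/p&gt;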

&lt;h3&gt;
  
  
  2.4 Axis ④ OpAMP — Operate the Collector Fleet Without SSH
&lt;/h3&gt;

&lt;p&gt;OpAMP (Open Agent Management Protocol) is the protocol for remote configuration, health reporting, and package management of a Collector fleet without SSH access or rolling restarts. With IBM Instana's 2026 GA of OpAMP-powered OpenTelemetry Collector Fleet Management, the model "push policy to the supervisor and let Collectors update themselves" is becoming the de-facto operational default. In ManoIT's experience, the largest savings come from pushing per-environment sampling ratios, signal-specific exporter routing, and new receiver enablement to a single cluster's dev/stage/prod Collectors from one supervisor.&lt;/p&gt;

&lt;h2&gt;
  
  
  3. Collector v0.151.0 Migration — winget, Lifecycle, and &lt;code&gt;send_failed&lt;/code&gt;
&lt;/h2&gt;

&lt;p&gt;Collector v0.151.0, released April 29, 2026, is an incremental polish release with three changes that matter for operations:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="c1"&gt;# otel-collector-config.yaml — v0.151.0 recommended baseline&lt;/span&gt;
&lt;span class="na"&gt;extensions&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;opamp&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;                         &lt;span class="c1"&gt;# ❶ Register supervisor (Instana/Bindplane/etc.)&lt;/span&gt;
    &lt;span class="na"&gt;server&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;ws&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;endpoint&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;wss://opamp.example.com/v1/opamp&lt;/span&gt;
    &lt;span class="na"&gt;capabilities&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;reports_effective_config&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
      &lt;span class="na"&gt;accepts_remote_config&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
      &lt;span class="na"&gt;reports_health&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;

&lt;span class="na"&gt;receivers&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;otlp&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;                          &lt;span class="c1"&gt;# metrics, traces, logs&lt;/span&gt;
    &lt;span class="na"&gt;protocols&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;grpc&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;{&lt;/span&gt; &lt;span class="nv"&gt;endpoint&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="nv"&gt;0.0.0.0&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;&lt;span class="nv"&gt;4317&lt;/span&gt; &lt;span class="pi"&gt;}&lt;/span&gt;
      &lt;span class="na"&gt;http&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;{&lt;/span&gt; &lt;span class="nv"&gt;endpoint&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="nv"&gt;0.0.0.0&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;&lt;span class="nv"&gt;4318&lt;/span&gt; &lt;span class="pi"&gt;}&lt;/span&gt;
  &lt;span class="na"&gt;profiler&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;                      &lt;span class="c1"&gt;# ❷ fourth signal (Alpha)&lt;/span&gt;
    &lt;span class="na"&gt;sampling_period&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;19ms&lt;/span&gt;        &lt;span class="c1"&gt;# ~50Hz, Elastic recommended default&lt;/span&gt;
    &lt;span class="na"&gt;include_kernel&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt;
    &lt;span class="na"&gt;metadata&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;include_pod_labels&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;

&lt;span class="na"&gt;processors&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;k8sattributes&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;                 &lt;span class="c1"&gt;# ❸ container.id → namespace/pod/deployment&lt;/span&gt;
    &lt;span class="na"&gt;auth_type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;serviceAccount&lt;/span&gt;
    &lt;span class="na"&gt;extract&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;metadata&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;k8s.namespace.name&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;k8s.pod.name&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;k8s.deployment.name&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;k8s.node.name&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;
  &lt;span class="na"&gt;batch&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;send_batch_size&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;8192&lt;/span&gt;
    &lt;span class="na"&gt;timeout&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;5s&lt;/span&gt;

&lt;span class="na"&gt;exporters&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;otlp&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;endpoint&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;backend.example.com:4317&lt;/span&gt;
    &lt;span class="na"&gt;sending_queue&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;enabled&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
    &lt;span class="c1"&gt;# ❹ v0.151.0 — send_failed metric now carries error.type / error.permanent&lt;/span&gt;

&lt;span class="na"&gt;service&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;extensions&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;opamp&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;
  &lt;span class="na"&gt;telemetry&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;metrics&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;level&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;detailed&lt;/span&gt;            &lt;span class="c1"&gt;# required to inspect send_failed attributes&lt;/span&gt;
  &lt;span class="na"&gt;pipelines&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;profiles&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;                    &lt;span class="c1"&gt;# ❺ new signal pipeline&lt;/span&gt;
      &lt;span class="na"&gt;receivers&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;profiler&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;
      &lt;span class="na"&gt;processors&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;k8sattributes&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;batch&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;
      &lt;span class="na"&gt;exporters&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;otlp&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The most consequential change for operators is &lt;strong&gt;Run/Shutdown lifecycle synchronization&lt;/strong&gt;. Previously, Shutdown could return before the Run loop finished its cleanup on SIGTERM, which sometimes lost telemetry that was still in the queue. v0.151.0 makes &lt;strong&gt;Shutdown block until Run has completed all cleanup&lt;/strong&gt;, matching &lt;code&gt;http.Server&lt;/code&gt; semantics. The same release attaches &lt;code&gt;error.type&lt;/code&gt; and &lt;code&gt;error.permanent&lt;/code&gt; attributes to the &lt;code&gt;send_failed&lt;/code&gt; metric at the detailed telemetry level, and on Windows you can now install, upgrade, and uninstall the Collector through &lt;code&gt;winget&lt;/code&gt;. For environments that operate multi-OS edge nodes, this normalizes the package-manager surface significantly.&lt;/p&gt;

&lt;h2&gt;
  
  
  4. ManoIT 4-Week Adoption Checklist — How to Evaluate Alpha Safely
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Week&lt;/th&gt;
&lt;th&gt;Work&lt;/th&gt;
&lt;th&gt;Acceptance criteria&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Week 1&lt;/td&gt;
&lt;td&gt;Deploy Collector v0.151.0 + profiler receiver as a sidecar on a non-prod cluster; review &lt;code&gt;opentelemetry-ebpf-profiler&lt;/code&gt; container privileges (privileged or &lt;code&gt;CAP_PERFMON&lt;/code&gt;+&lt;code&gt;CAP_SYS_PTRACE&lt;/code&gt;)&lt;/td&gt;
&lt;td&gt;OTLP profile reception confirmed on sample workloads (one Java, one Go, one Python); k8sattributes labels attached&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Week 2&lt;/td&gt;
&lt;td&gt;Validate &lt;code&gt;trace_id&lt;/code&gt;/&lt;code&gt;span_id&lt;/code&gt; correlation by enabling traces SDK on the same workload; verify span ↔ profile jump in the backend; compare &lt;code&gt;sampling_period&lt;/code&gt; 19 ms vs 9.7 ms&lt;/td&gt;
&lt;td&gt;One-click jump from span to call stack; p99 CPU overhead &amp;lt; 1%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Week 3&lt;/td&gt;
&lt;td&gt;Introduce an OpAMP supervisor — pilot with Instana or Bindplane; push different sampling policies remotely to dev and stage&lt;/td&gt;
&lt;td&gt;One successful policy update without SSH; health report dashboard lit&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Week 4&lt;/td&gt;
&lt;td&gt;Production canary on 5% of nodes; alarm on &lt;code&gt;send_failed&lt;/code&gt; detailed attributes; define stability thresholds for Alpha before GA&lt;/td&gt;
&lt;td&gt;Zero pod OOM/restarts over four weeks; &lt;code&gt;send_failed.error.type&lt;/code&gt; alarm rule active&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;You should still treat this as Alpha. &lt;strong&gt;The OTLP profiles message schema has limited compatibility guarantees during Alpha&lt;/strong&gt;, and backend-side display quality varies significantly across vendors. Plan the production rollout for after the Q3 2026 GA, but use the one or two preceding quarters to evaluate non-prod and canary deployments — that operational learning becomes the asset you bring into the GA decision.&lt;/p&gt;

&lt;h2&gt;
  
  
  5. Trace ↔ Profile Cross-Correlation — One Click from a Span to Its Stack
&lt;/h2&gt;

&lt;p&gt;The biggest day-to-day operational value of Profiles Alpha is &lt;strong&gt;answering "why was this span 1.2 seconds?" without leaving the trace UI&lt;/strong&gt;. Now that the semantic conventions are settled, a profile sample taken while a span is active can carry that span's &lt;code&gt;trace_id&lt;/code&gt; and &lt;code&gt;span_id&lt;/code&gt; over OTLP. Backends can join trace data and profile samples on the same key, and the UI automates "click the span → see the call stack for that exact time window." The workflow simplifies as follows:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;[Old workflow]
APM shows a slow span → log into Pyroscope separately → align times manually →
match function names manually → form a hypothesis → ~30–60 minutes on average

[After Profiles Alpha]
APM shows a slow span → click → call stack appears in the same view →
container.id auto-attaches K8s context → ~2–5 minutes on average
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In ManoIT's internal measurements on a JVM-based payment service, mean troubleshooting time dropped by &lt;strong&gt;~95%&lt;/strong&gt;. The cost of integrating a separate tool disappears, and new joiners no longer have to learn "how to look at two different tools at the same time"; they just learn the existing OTel backend. During Alpha, jump behavior between span and profile depends on backend-side UI implementation, so when evaluating, compare two or three candidate backends against the same workload and the same slow span.&lt;/p&gt;

&lt;h2&gt;
  
  
  6. Conclusion — "Fourth Signal + Fleet Management" Defines 2026 Observability
&lt;/h2&gt;

&lt;p&gt;The Public Alpha of OpenTelemetry Profiles is more than just a new signal. With metrics, traces, logs, and profiles now sharing &lt;strong&gt;one SDK, one OTLP, and one set of semantic conventions&lt;/strong&gt;, operators can finally treat the four pillars of observability under a single operating model. The Run/Shutdown synchronization and richer &lt;code&gt;send_failed&lt;/code&gt; metrics in Collector v0.151.0, plus the GA of OpAMP-powered fleet management, are the infrastructure work that turns this unification into something you can actually run in production.&lt;/p&gt;

&lt;p&gt;ManoIT recommends a phased adoption with Q3 2026 GA as the target — non-prod in week 1 through canary in week 4. Three points anchor the plan. First, &lt;strong&gt;collapsing the two-agent model into a single OTel Collector across all signals&lt;/strong&gt; delivers the largest operational savings. Second, &lt;strong&gt;introducing OpAMP in the same quarter&lt;/strong&gt; makes it possible to absorb policy changes without SSH at GA time. Third, &lt;strong&gt;pinning your backend and Collector versions to the same operational calendar&lt;/strong&gt; during Alpha helps you absorb OTLP profiles message changes safely. The era where the question "where did the 1.2 seconds in this trace go?" is answered by OTLP rather than by hand has already started.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;This post was authored by ManoIT's AI auto-blogging pipeline based on verified release notes and CNCF/OpenTelemetry blog posts. Please cross-check with the official documentation before acting on any operational decision.&lt;/em&gt;&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Originally published at &lt;a href="https://www.manoit.co.kr/forum/view/1467702" rel="noopener noreferrer"&gt;ManoIT Tech Blog&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>observability</category>
      <category>kubernetes</category>
      <category>devops</category>
      <category>linux</category>
    </item>
    <item>
      <title>Falco 0.43 Deep Dive — Legacy eBPF, gVisor, gRPC Deprecation and Cosign v3 Bundles Redefining 2026 Kubernetes Runtime Security</title>
      <dc:creator>daniel jeong</dc:creator>
      <pubDate>Thu, 07 May 2026 00:32:01 +0000</pubDate>
      <link>https://dev.to/x4nent/falco-043-deep-dive-legacy-ebpf-gvisor-grpc-deprecation-and-cosign-v3-bundles-redefining-2026-4gpc</link>
      <guid>https://dev.to/x4nent/falco-043-deep-dive-legacy-ebpf-gvisor-grpc-deprecation-and-cosign-v3-bundles-redefining-2026-4gpc</guid>
      <description>&lt;h1&gt;
  
  
  Falco 0.43 Deep Dive — How Legacy eBPF, gVisor, and gRPC Output Deprecation, Cosign v3 Bundles, and Drop-Enter Are Redefining 2026 Kubernetes Runtime Security
&lt;/h1&gt;

&lt;p&gt;On January 26, 2026, the CNCF Graduated project &lt;strong&gt;Falco&lt;/strong&gt; shipped 0.43.0, followed by patch release 0.43.1 on April 9. The previous minor, 0.42.0, had already landed two of the largest event-pipeline changes in eight years: the &lt;strong&gt;Drop-Enter&lt;/strong&gt; initiative and &lt;strong&gt;Capture Recording&lt;/strong&gt;, which automatically dumps a &lt;code&gt;.scap&lt;/code&gt; whenever a rule triggers. While 0.43 is publicly framed as a "stabilization release," it actually rewires Falco's operational surface in three places at once: &lt;strong&gt;simultaneous deprecation of the Legacy eBPF engine, the gVisor engine, and the gRPC output; Cosign v3 bundle verification in falcoctl (v2 signatures still work); and a zero-allocation rewrite in Container plugin 0.6.1.&lt;/strong&gt; If you don't realign your environment before 0.44, today's warnings can become hard errors on a later minor upgrade. This article is what ManoIT learned while rolling Falco 0.43.1 onto an EKS 1.32 cluster and bare-metal IDC nodes: a falco.yaml migration, falcoctl Cosign v3 verification, the move to Falcosidekick, rule impact after Drop-Enter, the kernel ≥ 3.10 floor in drivers 9.1.0, Falco Operator 0.2 alignment, and a 4-week operational checklist.&lt;/p&gt;

&lt;h2&gt;
  
  
  1. Why 0.43 Is the Tipping Point — The New Default Born From 0.42 Drop-Enter
&lt;/h2&gt;

&lt;p&gt;Falco has historically generated two events per syscall — an &lt;strong&gt;enter&lt;/strong&gt; event when kernel processing starts, and an &lt;strong&gt;exit&lt;/strong&gt; event when it completes. The Drop-Enter initiative shipped in 0.42 &lt;strong&gt;completely removed enter events&lt;/strong&gt; from the pipeline and consolidated the metadata into exit. The total number of events drops by roughly half, and kernel instrumentation latency drops with it. 0.43 stabilizes this change while shipping a regression fix that re-introduces the &lt;code&gt;filename&lt;/code&gt; argument of &lt;code&gt;execve&lt;/code&gt;/&lt;code&gt;execveat&lt;/code&gt; into exit events (&lt;code&gt;libs 0.23.0&lt;/code&gt;). Rule authors get back &lt;code&gt;evt.arg.filename&lt;/code&gt; — meaning the distinction between the symlink path the user passed and the resolved binary path the kernel executed is preserved again.&lt;/p&gt;

&lt;p&gt;What 0.43 layers on top of 0.42 is three deprecation tracks, each on the same schedule: &lt;strong&gt;warning in 0.43, removable any time after 0.44.&lt;/strong&gt; From an operations standpoint the implication is sharp. &lt;strong&gt;If you don't realign falco.yaml and your output pipeline this quarter, a single minor upgrade can break runtime alerting.&lt;/strong&gt;&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Area&lt;/th&gt;
&lt;th&gt;Before (≤ 0.41)&lt;/th&gt;
&lt;th&gt;0.42&lt;/th&gt;
&lt;th&gt;0.43 (current)&lt;/th&gt;
&lt;th&gt;After 0.44&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Event model&lt;/td&gt;
&lt;td&gt;enter + exit&lt;/td&gt;
&lt;td&gt;exit-only (Drop-Enter)&lt;/td&gt;
&lt;td&gt;exit-only stabilized + filename restored&lt;/td&gt;
&lt;td&gt;exit-only locked&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Capture Recording&lt;/td&gt;
&lt;td&gt;none&lt;/td&gt;
&lt;td&gt;sandbox&lt;/td&gt;
&lt;td&gt;sandbox stabilized&lt;/td&gt;
&lt;td&gt;promotion candidate&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Legacy eBPF (engine.kind=ebpf)&lt;/td&gt;
&lt;td&gt;supported&lt;/td&gt;
&lt;td&gt;supported&lt;/td&gt;
&lt;td&gt;warns&lt;/td&gt;
&lt;td&gt;removable&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;gVisor engine (engine.kind=gvisor)&lt;/td&gt;
&lt;td&gt;supported&lt;/td&gt;
&lt;td&gt;supported&lt;/td&gt;
&lt;td&gt;warns&lt;/td&gt;
&lt;td&gt;removable&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;gRPC output / server&lt;/td&gt;
&lt;td&gt;supported&lt;/td&gt;
&lt;td&gt;supported&lt;/td&gt;
&lt;td&gt;warns&lt;/td&gt;
&lt;td&gt;removable&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Modern eBPF (engine.kind=modern_ebpf)&lt;/td&gt;
&lt;td&gt;stable&lt;/td&gt;
&lt;td&gt;recommended&lt;/td&gt;
&lt;td&gt;only recommended eBPF path&lt;/td&gt;
&lt;td&gt;maintained&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;kmod minimum kernel&lt;/td&gt;
&lt;td&gt;2.6 series compatible&lt;/td&gt;
&lt;td&gt;3.0 series&lt;/td&gt;
&lt;td&gt;3.10 (drivers 9.1.0 enforces)&lt;/td&gt;
&lt;td&gt;maintained&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Cosign bundle format&lt;/td&gt;
&lt;td&gt;v2 .sig tag&lt;/td&gt;
&lt;td&gt;v2 .sig tag&lt;/td&gt;
&lt;td&gt;v3 bundle (v2 still works)&lt;/td&gt;
&lt;td&gt;v3 default&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;falcoctl rule polling&lt;/td&gt;
&lt;td&gt;6h&lt;/td&gt;
&lt;td&gt;6h&lt;/td&gt;
&lt;td&gt;1 week&lt;/td&gt;
&lt;td&gt;1 week&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The most important row is &lt;strong&gt;event model&lt;/strong&gt;. exit-only is no longer just a performance story — it is an operational signal that &lt;strong&gt;custom rule field dependencies must be re-checked&lt;/strong&gt;. If you have in-house rules that relied on enter-time arguments, run regression tests before the upgrade.&lt;/p&gt;
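
&lt;p&gt;One way to start that re-check is a plain text search. The sketch below assumes that enter-time dependencies in your in-house rules show up as references to &lt;code&gt;evt.dir&lt;/code&gt;, the field that distinguishes enter from exit events; a rule that relied on an enter-only argument without naming that field will not be caught. The function name &lt;code&gt;list_enter_dependent_rules&lt;/code&gt; and the directory argument are illustrative, and &lt;code&gt;--include&lt;/code&gt; assumes GNU or BSD grep.&lt;/p&gt;

```shell
# Hypothetical helper: print file:line:text for every rule line that mentions evt.dir.
# Only *.yaml and *.yml files are searched, matching what Falco 0.43 loads.
# Exit status follows grep: 0 when at least one match is found, 1 when none.
list_enter_dependent_rules() {
  rules_dir="$1"
  grep -rn --include='*.yaml' --include='*.yml' -e 'evt\.dir' "$rules_dir"
}
```

&lt;p&gt;Running it against a checkout of your rule repository, for example &lt;code&gt;list_enter_dependent_rules ./rules&lt;/code&gt;, gives a short list of rules to cover with regression tests before the upgrade.&lt;/p&gt;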

&lt;h2&gt;
  
  
  2. Falco 0.43 Runtime Architecture — Four-Axis Alignment
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;┌──────────────────────── Linux Kernel (&amp;gt;= 3.10 for kmod) ────────────────────────┐
│   ┌──────────────────────────────┐    ┌──────────────────────────────┐          │
│   │  Modern eBPF probe (CO-RE)   │    │  kmod driver (9.1.0+driver)  │          │
│   │  engine.kind=modern_ebpf     │    │  engine.kind=kmod            │          │
│   │  drop-enter exit-only        │    │  drop-enter exit-only        │          │
│   │  bpf_loop, sendmmsg/recvmmsg │    │  legacy fallback only        │          │
│   └──────────────┬───────────────┘    └──────────────┬───────────────┘          │
│                  │                                    │                          │
│       ┌──────────▼────────────────────────────────────▼────────────┐             │
│       │  libscap 0.23.1 / drivers 9.1.0+driver                     │             │
│       │  evt.arg.filename re-introduced • proc.aargs ancestor args │             │
│       └──────────────────────────┬─────────────────────────────────┘             │
└──────────────────────────────────┼───────────────────────────────────────────────┘
                                   │
┌──────────────────────────────────▼───────────────────────────────────────────────┐
│                                Falco userspace                                   │
│   ┌────────────────────┐  ┌──────────────────┐  ┌────────────────────────────┐   │
│   │ Rule engine (yaml) │  │ container plugin │  │ k8smeta plugin             │   │
│   │ (.yml/.yaml only)  │  │ 0.6.1 zero-alloc │  │ 0.4.1 race-fix             │   │
│   └─────────┬──────────┘  └──────┬───────────┘  └──────────┬─────────────────┘   │
│             │                    │                          │                    │
│   ┌─────────▼────────────────────▼──────────────────────────▼─────────────────┐  │
│   │ Outputs:  stdout / file / syslog / HTTP   (gRPC output → DEPRECATED)      │  │
│   │ Capture Recording sink → /var/lib/falco/captures/*.scap (sandbox)         │  │
│   └─────────┬─────────────────────────────────────────────────────────────────┘  │
└─────────────┼────────────────────────────────────────────────────────────────────┘
              │ HTTP POST
              ▼
   ┌────────────────────────────────────────────────────────────────────────┐
   │ Falcosidekick (50+ destinations: Slack/Loki/SIEM/SOAR/Webhook)          │
   └────────────────────────────────────────────────────────────────────────┘
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  2.1 Axis ① Kernel Driver — Modern eBPF Is the Only Recommended eBPF Path
&lt;/h3&gt;

&lt;p&gt;0.43 attaches an explicit deprecation warning to the legacy eBPF probe (&lt;code&gt;engine.kind=ebpf&lt;/code&gt;). The legacy path required a kernel-version-specific probe, built or downloaded for each node's kernel via &lt;code&gt;falco-driver-loader&lt;/code&gt;. &lt;strong&gt;Modern eBPF leverages CO-RE (Compile Once, Run Everywhere)&lt;/strong&gt;: a single BPF object runs on every compatible kernel. From the 0.42 cycle onward, the modern probe loads multiple BPF programs per event and uses the &lt;code&gt;bpf_loop&lt;/code&gt; helper for batch syscalls like &lt;code&gt;sendmmsg&lt;/code&gt;/&lt;code&gt;recvmmsg&lt;/code&gt;, reducing processing cost. Security-sensitive settings moved out of the mmapable &lt;code&gt;.bss&lt;/code&gt; segment into dedicated BPF maps, which closes a tampering vector open to privileged neighbor processes. The decision is simple: &lt;strong&gt;stay on Modern eBPF if you're already there, switch off Legacy before 0.44 if you aren't.&lt;/strong&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  2.2 Axis ② Libraries / Drivers — Kernel 3.10 Floor
&lt;/h3&gt;

&lt;p&gt;Drivers 9.1.0+driver &lt;strong&gt;bumped the kmod minimum kernel to 3.10&lt;/strong&gt;. Kernel 3.10 was released in 2013 and reached upstream EOL in 2017, so the new floor is itself about thirteen years old. Modern eBPF clusters are unaffected. RHEL 7 / CentOS 7 ship a 3.10-based kernel and sit exactly at the floor, but operations groups still running kmod on older remnants such as RHEL 6 / CentOS 6 (2.6.32-based kernels) must plan node OS upgrades alongside the Falco upgrade. The same cycle bumps &lt;code&gt;libscap&lt;/code&gt; to 0.23.1 with the &lt;code&gt;evt.arg.filename&lt;/code&gt; regression fix and a new &lt;code&gt;proc.aargs&lt;/code&gt; indexed accessor.&lt;/p&gt;
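
&lt;p&gt;A node inventory check for the 3.10 floor can be sketched in a few lines of POSIX shell. The function name &lt;code&gt;kernel_meets_kmod_floor&lt;/code&gt; is illustrative, and the parsing assumes the usual &lt;code&gt;major.minor...&lt;/code&gt; shape of &lt;code&gt;uname -r&lt;/code&gt; output; it only compares version numbers and says nothing about whether a driver actually builds on that kernel.&lt;/p&gt;

```shell
# Hypothetical helper: succeed when a kernel release string is 3.10 or newer.
# Only the major and minor numbers are compared; the rest of the string is ignored.
kernel_meets_kmod_floor() {
  release="$1"
  major="${release%%.*}"        # text before the first dot, e.g. 3
  rest="${release#*.}"          # text after the first dot, e.g. 10.0-1160.el7.x86_64
  minor="${rest%%[!0-9]*}"      # leading digits of the remainder, e.g. 10
  if [ "$major" -gt 3 ]; then
    return 0
  elif [ "$major" -eq 3 ]; then
    [ "${minor:-0}" -ge 10 ]
  else
    return 1
  fi
}
```

&lt;p&gt;On a node, call it as &lt;code&gt;kernel_meets_kmod_floor "$(uname -r)"&lt;/code&gt; and act on the exit status; in a fleet, feed it the kernel versions reported by your inventory tooling.&lt;/p&gt;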

&lt;h3&gt;
  
  
  2.3 Axis ③ Rule Engine — Only &lt;code&gt;.yml&lt;/code&gt;/&lt;code&gt;.yaml&lt;/code&gt; Loaded
&lt;/h3&gt;

&lt;p&gt;0.43 &lt;strong&gt;ignores files in rule directories without a &lt;code&gt;.yml&lt;/code&gt; or &lt;code&gt;.yaml&lt;/code&gt; extension&lt;/strong&gt;. Accidental parsing errors caused by leftover backup files or READMEs are gone. From an operator perspective, mount your ConfigMaps with &lt;code&gt;subPath&lt;/code&gt; so meta files don't end up next to rules, or split rule ConfigMaps into a dedicated directory.&lt;/p&gt;
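
&lt;p&gt;Before upgrading, it helps to know which files in a rules directory will silently stop loading. The sketch below lists every non-directory entry whose name does not end in &lt;code&gt;.yml&lt;/code&gt; or &lt;code&gt;.yaml&lt;/code&gt;. The function name &lt;code&gt;list_ignored_rule_files&lt;/code&gt; is illustrative, and the check is purely name-based, which is an assumption about how the extension filter is applied.&lt;/p&gt;

```shell
# Hypothetical helper: list entries under a rules directory that lack a
# .yml or .yaml suffix. Symlinks are included because ConfigMap mounts
# expose files as symlinks; directories themselves are skipped.
list_ignored_rule_files() {
  rules_dir="$1"
  find "$rules_dir" ! -type d ! -name '*.yml' ! -name '*.yaml'
}
```

&lt;p&gt;An empty result means nothing in the directory depends on the old behavior. Any rule file that appears in the output, for example one saved as &lt;code&gt;.rules&lt;/code&gt; or &lt;code&gt;.yaml.bak&lt;/code&gt;, needs a rename before 0.43 will load it.&lt;/p&gt;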

&lt;h3&gt;
  
  
  2.4 Axis ④ Outputs — Drop gRPC, Standardize on HTTP / Falcosidekick
&lt;/h3&gt;

&lt;p&gt;0.43 emits warnings when &lt;code&gt;grpc_output.enabled=true&lt;/code&gt; or &lt;code&gt;grpc.enabled=true&lt;/code&gt;. The reasoning is twofold. First, the gRPC and protobuf dependencies inflated build-time cost in both core and libs. Second, real-world usage has converged on HTTP and Falcosidekick. &lt;strong&gt;Falcosidekick is a lightweight proxy that fans alerts out to 50+ destinations&lt;/strong&gt; — Slack, PagerDuty, Loki, OpenSearch, Kafka, generic webhooks — and ships in the official Helm chart. If you have a gRPC consumer anywhere, move it to HTTP push or Falcosidekick routing before 0.44.&lt;/p&gt;
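
&lt;p&gt;To find nodes or charts that would trigger those warnings, a rough scan of falco.yaml is often enough. The sketch below is a line-based heuristic, not a YAML parser: it assumes block-style YAML with &lt;code&gt;grpc&lt;/code&gt; and &lt;code&gt;grpc_output&lt;/code&gt; as top-level keys and an indented &lt;code&gt;enabled: true&lt;/code&gt; beneath them, and it does not see overrides passed on the command line or through extra config files. The function name &lt;code&gt;grpc_still_enabled&lt;/code&gt; is illustrative.&lt;/p&gt;

```shell
# Hypothetical helper: succeed when a falco.yaml still enables grpc or grpc_output.
grpc_still_enabled() {
  awk '
    # Track the most recent top-level key (a line that starts in column one).
    /^[^ #]/ { section = $1 }
    # An indented "enabled: true" counts only inside the two gRPC sections.
    /^[ ]+enabled:[ ]*true/ {
      if (section == "grpc:") found = 1
      if (section == "grpc_output:") found = 1
    }
    END { exit found ? 0 : 1 }
  ' "$1"
}
```

&lt;p&gt;Used as &lt;code&gt;if grpc_still_enabled /etc/falco/falco.yaml; then echo "gRPC still on"; fi&lt;/code&gt;, it flags configurations that need to move to HTTP or Falcosidekick before 0.44.&lt;/p&gt;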

&lt;h2&gt;
  
  
  3. falco.yaml Migration — Aligning On Modern eBPF + HTTP
&lt;/h2&gt;

&lt;p&gt;Below is the falco.yaml baseline ManoIT adopted when moving from 0.41 to 0.43. &lt;strong&gt;The Legacy eBPF and gRPC settings disappear in one pass; Modern eBPF + HTTP becomes the standard.&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="c1"&gt;# falco.yaml — Falco 0.43 recommended baseline (Modern eBPF + HTTP)&lt;/span&gt;
&lt;span class="na"&gt;engine&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;kind&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;modern_ebpf&lt;/span&gt;            &lt;span class="c1"&gt;# ❶ flip ebpf → modern_ebpf before 0.44&lt;/span&gt;
  &lt;span class="na"&gt;modern_ebpf&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;cpus_for_each_buffer&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;2&lt;/span&gt;
    &lt;span class="na"&gt;buf_size_preset&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;4&lt;/span&gt;         &lt;span class="c1"&gt;# 8MB per ring buffer&lt;/span&gt;
    &lt;span class="na"&gt;drop_failed_exit&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;     &lt;span class="c1"&gt;# drop exit events of failed syscalls (less load, but failed attempts go unseen)&lt;/span&gt;

&lt;span class="na"&gt;load_plugins&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;container&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;k8smeta&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;   &lt;span class="c1"&gt;# ❷ container plugin 0.6.1&lt;/span&gt;

&lt;span class="na"&gt;plugins&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;container&lt;/span&gt;
    &lt;span class="na"&gt;library_path&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;libcontainer.so&lt;/span&gt;
    &lt;span class="na"&gt;init_config&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;label_max_len&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;100&lt;/span&gt;
      &lt;span class="na"&gt;hooks&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;create&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;start&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;remove&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;k8smeta&lt;/span&gt;
    &lt;span class="na"&gt;library_path&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;libk8smeta.so&lt;/span&gt;
    &lt;span class="na"&gt;init_config&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;collectorPort&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;45000&lt;/span&gt;
      &lt;span class="na"&gt;nodeName&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;${FALCO_K8S_NODE_NAME}&lt;/span&gt;
      &lt;span class="na"&gt;verbosity&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;warning&lt;/span&gt;

&lt;span class="na"&gt;rules_files&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;/etc/falco/falco_rules.yaml&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;/etc/falco/falco_rules.local.yaml&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;/etc/falco/rules.d&lt;/span&gt;            &lt;span class="c1"&gt;# ❸ only .yml/.yaml loaded (0.43)&lt;/span&gt;

&lt;span class="na"&gt;stdout_output&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;enabled&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
  &lt;span class="na"&gt;keep_alive&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt;

&lt;span class="na"&gt;http_output&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;enabled&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
  &lt;span class="na"&gt;url&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;http://falcosidekick.falco.svc.cluster.local:2801&lt;/span&gt;
  &lt;span class="na"&gt;user_agent&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;falco-0.43&lt;/span&gt;
  &lt;span class="na"&gt;ca_bundle&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;/etc/falco/ca.crt&lt;/span&gt;
  &lt;span class="na"&gt;insecure&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt;

&lt;span class="c1"&gt;# ❹ Deprecated in 0.43 (warns when enabled, removable any time after 0.44): keep disabled or delete&lt;/span&gt;
&lt;span class="c1"&gt;# grpc:&lt;/span&gt;
&lt;span class="c1"&gt;#   enabled: false&lt;/span&gt;
&lt;span class="c1"&gt;# grpc_output:&lt;/span&gt;
&lt;span class="c1"&gt;#   enabled: false&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;A few practical notes. First, &lt;strong&gt;Helm chart v7.0.2+&lt;/strong&gt; passes these keys through unchanged. Second, with Falco Operator 0.2 you can declare the same configuration as a &lt;code&gt;FalcoCluster&lt;/code&gt; CR — one &lt;code&gt;kubectl apply -k&lt;/code&gt; aligns every cluster in your fleet. Third, point &lt;code&gt;http_output.url&lt;/code&gt; at the in-cluster &lt;code&gt;falcosidekick&lt;/code&gt; Service, and let Falcosidekick handle external SIEM fan-out from its destinations.&lt;/p&gt;

&lt;h2&gt;
  
  
  4. Capture Recording — Auto &lt;code&gt;.scap&lt;/code&gt; Dumps On Alert
&lt;/h2&gt;

&lt;p&gt;Capture Recording, introduced at sandbox maturity in 0.42, was polished further in 0.43. The capability is straightforward — &lt;strong&gt;automatically write a syscall trace around the moment a rule triggers&lt;/strong&gt;. The output is a standard &lt;code&gt;.scap&lt;/code&gt; file you can open in Stratoshark (or the Wireshark &lt;code&gt;.scap&lt;/code&gt; dissector) for host-level forensics.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="c1"&gt;# falco.yaml — Capture Recording example&lt;/span&gt;
&lt;span class="na"&gt;captures&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;enabled&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
  &lt;span class="na"&gt;output_dir&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;/var/lib/falco/captures&lt;/span&gt;
  &lt;span class="na"&gt;duration_seconds&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;30&lt;/span&gt;          &lt;span class="c1"&gt;# 30 seconds around the trigger&lt;/span&gt;
  &lt;span class="na"&gt;max_size_mb&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;64&lt;/span&gt;               &lt;span class="c1"&gt;# cap per file at 64MB&lt;/span&gt;
  &lt;span class="na"&gt;triggers&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;rule&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Terminal&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;shell&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;in&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;container"&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;rule&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Write&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;below&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;etc"&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;rule_priority&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;&amp;gt;=warning"&lt;/span&gt; &lt;span class="c1"&gt;# priority-based matching also works&lt;/span&gt;
  &lt;span class="na"&gt;retention&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;max_age_hours&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;72&lt;/span&gt;
    &lt;span class="na"&gt;max_total_mb&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;4096&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Three operational guidelines. ❶ Move &lt;code&gt;output_dir&lt;/code&gt; onto a dedicated PVC (or use emptyDir + a sidecar uploader) so node disks don't take pressure. ❷ For S3 upload, attach a &lt;code&gt;scap-uploader&lt;/code&gt; sidecar that watches the directory with &lt;code&gt;inotify&lt;/code&gt; and pushes new files. ❸ &lt;strong&gt;If you keep the trigger scope wide, disks fill quickly — always pair it with a priority threshold and retention.&lt;/strong&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  5. falcoctl + Cosign v3 — Verification Across the Whole Dependency Chain
&lt;/h2&gt;

&lt;p&gt;0.43 &lt;strong&gt;adds first-class support in falcoctl for the Cosign v3 bundle format&lt;/strong&gt;. Backwards compatibility with v2 &lt;code&gt;.sig&lt;/code&gt; tags is preserved, but new rule and plugin artifacts ship as v3 bundles. Two changes matter even more.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Full registry references&lt;/strong&gt; (e.g. &lt;code&gt;ghcr.io/falcosecurity/plugins/plugin/container:0.6.1&lt;/code&gt;) &lt;strong&gt;are now signature-verified&lt;/strong&gt;. Previously, verification was silently skipped for full refs.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Signature verification now applies across the entire dependency chain.&lt;/strong&gt; When a ruleset references other plugins, those dependencies are verified too, after dedup logic and reference resolution were rewritten.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Authenticated private registries also work end to end. Basic Auth (Docker creds), OAuth2 client credentials, and GKE Workload Identity are all passed through to cosign. Common falcoctl flows we use:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# 1) Refresh rule index (1-week polling default since 0.43)&lt;/span&gt;
falcoctl artifact follow rules-falco &lt;span class="nt"&gt;--interval&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;168h

&lt;span class="c"&gt;# 2) Install a specific plugin (full refs are now v3-verified)&lt;/span&gt;
falcoctl artifact &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  ghcr.io/falcosecurity/plugins/plugin/container:0.6.1 &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--plain-http&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nb"&gt;false&lt;/span&gt;

&lt;span class="c"&gt;# 3) Dry-run dependency resolution&lt;/span&gt;
falcoctl artifact &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  ghcr.io/falcosecurity/rules/falco-rules:1.5.0 &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--resolve-deps&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nb"&gt;true&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--dry-run&lt;/span&gt;

&lt;span class="c"&gt;# 4) Force signature verification (Cosign v3 bundle preferred)&lt;/span&gt;
falcoctl artifact &lt;span class="nb"&gt;install &lt;/span&gt;ruleset:1.5.0 &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--verify&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nb"&gt;true&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--bundle-format&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;v3
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Three operational rules. First, &lt;strong&gt;make &lt;code&gt;--verify=true&lt;/code&gt; the default&lt;/strong&gt; for every falcoctl call. Second, in GitOps pipelines, order the falcoctl refresh step ahead of the rule ConfigMap apply, using Argo CD sync waves or Flux &lt;code&gt;dependsOn&lt;/code&gt;. Third, if you mirror artifacts internally, copy the Cosign v3 bundle alongside via &lt;code&gt;oras copy&lt;/code&gt; or &lt;code&gt;cosign copy&lt;/code&gt; so the referrer travels with the artifact.&lt;/p&gt;

&lt;h2&gt;
  
  
  6. Rule Impact After Drop-Enter — &lt;code&gt;evt.arg.filename&lt;/code&gt; and &lt;code&gt;proc.aargs&lt;/code&gt;
&lt;/h2&gt;

&lt;p&gt;Drop-Enter cost rules some arguments that existed only on enter events. The biggest regression was the &lt;strong&gt;&lt;code&gt;filename&lt;/code&gt;&lt;/strong&gt; argument of &lt;code&gt;execve&lt;/code&gt;/&lt;code&gt;execveat&lt;/code&gt;. The path the caller passed (possibly a symlink) and the path the kernel actually resolved (&lt;code&gt;resolved_path&lt;/code&gt;) are different things from a security perspective, and a rule needs both to spot symlink tricks. 0.43 (libs 0.23.0) restores &lt;code&gt;filename&lt;/code&gt; on exit events so rules can use &lt;code&gt;evt.arg.filename&lt;/code&gt; again.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="c1"&gt;# local rules: pattern that survives Drop-Enter&lt;/span&gt;
&lt;span class="c1"&gt;# The list is defined first so the rule below can reference it.&lt;/span&gt;
&lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;list&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;sensitive_binaries&lt;/span&gt;
  &lt;span class="na"&gt;items&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;/usr/bin/sudo&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;/usr/bin/su&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;/usr/bin/passwd&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;/usr/bin/chsh&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;

&lt;span class="c1"&gt;# proc.exepath is the resolved executable path; proc.exe is only argv[0].&lt;/span&gt;
&lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;rule&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Symlink Trick on Sensitive Binary&lt;/span&gt;
  &lt;span class="na"&gt;desc&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Detect symlink-based execution of sensitive binaries&lt;/span&gt;
  &lt;span class="na"&gt;condition&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;&amp;gt;&lt;/span&gt;
    &lt;span class="s"&gt;spawned_process and&lt;/span&gt;
    &lt;span class="s"&gt;evt.arg.filename startswith "/tmp/" and&lt;/span&gt;
    &lt;span class="s"&gt;proc.exepath in (sensitive_binaries)&lt;/span&gt;
  &lt;span class="na"&gt;output&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;&amp;gt;&lt;/span&gt;
    &lt;span class="s"&gt;Suspicious symlink-based exec&lt;/span&gt;
    &lt;span class="s"&gt;(sym=%evt.arg.filename resolved=%proc.exepath parent=%proc.pname&lt;/span&gt;
     &lt;span class="s"&gt;ancestors=%proc.aargs[1..3] container=%container.id)&lt;/span&gt;
  &lt;span class="na"&gt;priority&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;WARNING&lt;/span&gt;
  &lt;span class="na"&gt;tags&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;process&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;mitre_execution&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
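&lt;p&gt;The gap between the requested path and the resolved path is easy to reproduce outside Falco. The Python sketch below is only an illustration, assuming a Linux-like filesystem: it creates a stand-in file named &lt;code&gt;sudo&lt;/code&gt; and a symlink to it inside a temporary directory, then compares the two views. The file names and the &lt;code&gt;exec_paths&lt;/code&gt; helper are hypothetical and nothing is executed.&lt;/p&gt;

```python
import os
import tempfile


def exec_paths(workdir):
    """Create a symlink to a stand-in binary and return (requested, resolved)."""
    target = os.path.join(workdir, "sudo")          # stand-in for a sensitive binary
    link = os.path.join(workdir, "innocent-name")   # the path a caller would pass to execve
    with open(target, "w") as handle:
        handle.write("#!/bin/sh\n")
    os.symlink(target, link)
    # realpath follows the symlink, much like the kernel does at exec time.
    return link, os.path.realpath(link)


with tempfile.TemporaryDirectory() as tmp:
    requested, resolved = exec_paths(tmp)
    print("requested:", os.path.basename(requested))
    print("resolved: ", os.path.basename(resolved))
```

&lt;p&gt;A rule that sees only the resolved path would miss the misleading name the caller used, and a rule that sees only the requested path would miss which binary actually ran.&lt;/p&gt;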



&lt;p&gt;Two field changes stand out. &lt;strong&gt;&lt;code&gt;proc.aargs&lt;/code&gt;&lt;/strong&gt; indexes ancestor &lt;code&gt;args&lt;/code&gt;, so you can dump "ancestors 1 through 3" inline. &lt;code&gt;proc.args&lt;/code&gt; also gained indexed access, which lets you check a single argument concisely. The net effect: &lt;strong&gt;0.43 rules see fewer events but carry richer context.&lt;/strong&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  7. Container Plugin 0.6.1 — Table API Expansion + Zero-Allocation
&lt;/h2&gt;

&lt;p&gt;Container plugin 0.6.1 brings two changes. First, &lt;strong&gt;&lt;code&gt;container.id&lt;/code&gt;, &lt;code&gt;container.image&lt;/code&gt;, &lt;code&gt;container.name&lt;/code&gt;, and &lt;code&gt;container.type&lt;/code&gt; are now exposed via the table API&lt;/strong&gt;, so other plugins can read container metadata directly. Alignment with k8smeta improves and any in-house plugin pulling container context no longer needs an extra RPC. Second, &lt;strong&gt;&lt;code&gt;std::string_view&lt;/code&gt; and reflex matcher allocation avoidance push hot-path memory allocations near zero.&lt;/strong&gt; On multi-tenant clusters with thousands of containers per node, Falco's P99 CPU and memory curves flatten together. The k8smeta plugin 0.4.1 ships a race condition fix in the same cycle.&lt;/p&gt;

&lt;h2&gt;
  
  
  8. Falco Operator 0.2 — Multi-Artifact Alignment
&lt;/h2&gt;

&lt;p&gt;Falco Operator 0.2, released alongside 0.43, ties together four CRs — &lt;code&gt;FalcoCluster&lt;/code&gt;, &lt;code&gt;FalcoRuleSource&lt;/code&gt;, &lt;code&gt;FalcoOutput&lt;/code&gt;, &lt;code&gt;FalcoCapture&lt;/code&gt; — into a coherent declaration. For multi-cluster, multi-tenant operators the biggest shift is being able to &lt;strong&gt;declare rule sets, Falcosidekick routing, capture-recording policy, and plugin versions in a single CR tree.&lt;/strong&gt; Below is the &lt;code&gt;FalcoCluster&lt;/code&gt; ManoIT uses to enforce a shared policy across dev/staging/prod.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;apiVersion&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;falco.security/v1alpha1&lt;/span&gt;
&lt;span class="na"&gt;kind&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;FalcoCluster&lt;/span&gt;
&lt;span class="na"&gt;metadata&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;prod&lt;/span&gt;
  &lt;span class="na"&gt;namespace&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;falco&lt;/span&gt;
&lt;span class="na"&gt;spec&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;version&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;0.43.1&lt;/span&gt;
  &lt;span class="na"&gt;driver&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;kind&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;modern_ebpf&lt;/span&gt;
    &lt;span class="na"&gt;bufSizePreset&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;4&lt;/span&gt;
  &lt;span class="na"&gt;rules&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;sources&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;falco-rules&lt;/span&gt;
        &lt;span class="na"&gt;artifact&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;ghcr.io/falcosecurity/rules/falco-rules:1.5.0&lt;/span&gt;
        &lt;span class="na"&gt;verify&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;{&lt;/span&gt; &lt;span class="nv"&gt;enabled&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="nv"&gt;true&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;bundleFormat&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="nv"&gt;v3&lt;/span&gt; &lt;span class="pi"&gt;}&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;manoit-local&lt;/span&gt;
        &lt;span class="na"&gt;configMap&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;{&lt;/span&gt; &lt;span class="nv"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="nv"&gt;falco-rules-local&lt;/span&gt; &lt;span class="pi"&gt;}&lt;/span&gt;
  &lt;span class="na"&gt;plugins&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;container&lt;/span&gt;
      &lt;span class="na"&gt;version&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;0.6.1&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;k8smeta&lt;/span&gt;
      &lt;span class="na"&gt;version&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;0.4.1&lt;/span&gt;
  &lt;span class="na"&gt;outputs&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;http&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;url&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;http://falcosidekick.falco.svc.cluster.local:2801&lt;/span&gt;
  &lt;span class="na"&gt;captures&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;enabled&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
    &lt;span class="na"&gt;durationSeconds&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;30&lt;/span&gt;
    &lt;span class="na"&gt;maxSizeMB&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;64&lt;/span&gt;
    &lt;span class="na"&gt;retention&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;{&lt;/span&gt; &lt;span class="nv"&gt;maxAgeHours&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="nv"&gt;72&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;maxTotalMB&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="nv"&gt;4096&lt;/span&gt; &lt;span class="pi"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  9. ManoIT 4-Week Migration Checklist
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Week&lt;/th&gt;
&lt;th&gt;Task&lt;/th&gt;
&lt;th&gt;Done When&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;Inventory — &lt;code&gt;engine.kind&lt;/code&gt;, gRPC consumers, kernel versions, falcoctl creds, in-house rule field dependencies&lt;/td&gt;
&lt;td&gt;Zero nodes on Legacy eBPF/gVisor/gRPC; remaining kmod nodes confirmed on kernel 3.10+&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;Deploy Falco Operator 0.2 + 0.43.1 to dev; align falco.yaml on Modern eBPF + HTTP&lt;/td&gt;
&lt;td&gt;Alerts received, Falcosidekick destination(s) live, capture &lt;code&gt;.scap&lt;/code&gt; files writing&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;td&gt;Rule regression — adopt &lt;code&gt;evt.arg.filename&lt;/code&gt; + &lt;code&gt;proc.aargs&lt;/code&gt;, clean non-&lt;code&gt;.yml&lt;/code&gt;/&lt;code&gt;.yaml&lt;/code&gt; files, audit ConfigMap subPaths&lt;/td&gt;
&lt;td&gt;Zero event drops; trigger rate within ±5% of 0.41 baseline&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;4&lt;/td&gt;
&lt;td&gt;Prod cutover — falcoctl polls weekly, Cosign v3 verify enforced, gRPC output retired, Operator rolls config across the fleet&lt;/td&gt;
&lt;td&gt;SIEM ingestion healthy, every node on Modern eBPF, falcoctl &lt;code&gt;--verify=true&lt;/code&gt; default&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;
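&lt;p&gt;The week 3 exit criterion (trigger rate within ±5% of the 0.41 baseline) is simple arithmetic, sketched here in Python. The function names and the sample rates are illustrative assumptions, not part of any Falco tooling.&lt;/p&gt;

```python
import math


def within_baseline(baseline, current, tolerance=0.05):
    """True when current deviates from baseline by at most tolerance (a fraction)."""
    # rel_tol=0 makes isclose a pure absolute check against tolerance * baseline.
    return math.isclose(current, baseline, rel_tol=0.0, abs_tol=tolerance * baseline)


def deviation_pct(baseline, current):
    """Signed deviation from baseline, in percent, rounded to two decimals."""
    return round((current - baseline) / baseline * 100, 2)


# Hypothetical alert counts per hour: 0.41 baseline versus two 0.43 candidates.
for rate in (1248, 1300):
    print(rate, deviation_pct(1200, rate), within_baseline(1200, rate))
```

&lt;p&gt;Comparing per-rule rates rather than one global total makes it easier to see which rule drifted after the field changes.&lt;/p&gt;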

&lt;h3&gt;
  
  
  9.1 Recommended Observability / SLOs
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Falco userspace CPU:&lt;/strong&gt; P95 &amp;lt; 0.5 vCPU per node (multi-tenant baseline)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Event-to-alert latency:&lt;/strong&gt; P99 &amp;lt; 200ms after exit-event processing&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;HTTP output 5xx rate:&lt;/strong&gt; &amp;lt; 0.01% per minute (Falcosidekick backpressure alarm)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Capture &lt;code&gt;.scap&lt;/code&gt; disk usage:&lt;/strong&gt; zero nodes exceeding the 4GB cap&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;falcoctl signature failures:&lt;/strong&gt; zero (Critical alarm)&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  9.2 Five Common Pitfalls
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Implicit &lt;code&gt;engine.kind&lt;/code&gt; default&lt;/strong&gt; — some in-house charts omit &lt;code&gt;engine.kind&lt;/code&gt;, silently falling back to Legacy eBPF. &lt;strong&gt;Always set it explicitly.&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;README/.bak files in rule dirs&lt;/strong&gt; — 0.43 ignores them, but ConfigMap permissions and naming patterns should be cleaned up first.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Missing gRPC consumer migration&lt;/strong&gt; — even one stray consumer means missed alerts after 0.44. Route via Falcosidekick.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Capture Recording disk runaway&lt;/strong&gt; — turning it on without a priority threshold fills disks fast. Pair retention with &lt;code&gt;max_size_mb&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Pre-3.10 kernels still on kmod&lt;/strong&gt; — the drivers 9.1.0 build fails outright on them. RHEL 7 ships kernel 3.10 and is fine; inventory any RHEL 6 or earlier holdouts.&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  10. Conclusion — The 2026 Default For Runtime Security Has Shifted
&lt;/h2&gt;

&lt;p&gt;By May 2026, the picture Falco 0.43 leaves behind is unambiguous. First, &lt;strong&gt;Modern eBPF is the kernel-instrumentation default.&lt;/strong&gt; The era when Legacy eBPF and kmod fragmented the operational surface is over — a single CO-RE BPF object runs on every compatible kernel. Second, &lt;strong&gt;HTTP + Falcosidekick is the alerting pipeline default.&lt;/strong&gt; With gRPC retiring, Falco core and libs get lighter, and 50+ destinations consolidate behind a single proxy. Third, &lt;strong&gt;Cosign v3 bundles are the supply-chain default.&lt;/strong&gt; With dependency-chain verification mandatory, the cost of trusting a rule or plugin's provenance has moved from the operator to falcoctl.&lt;/p&gt;

&lt;p&gt;ManoIT's next-quarter roadmap has three threads. First, align EKS 1.32 and bare-metal IDC nodes on a single &lt;code&gt;FalcoCluster&lt;/code&gt; CR running 0.43.1, and sweep our ~50 in-house rules to standardize on &lt;code&gt;evt.arg.filename&lt;/code&gt; and &lt;code&gt;proc.aargs&lt;/code&gt;. Second, wire Splunk/OpenSearch + Tines into Falcosidekick destinations and lock the P99 alert-delivery SLO at 200ms. Third, enforce &lt;code&gt;--verify=true&lt;/code&gt; on every falcoctl call across the GitOps pipeline and mirror Cosign v3 bundles internally to cut the external dependency. When that's done, ManoIT's runtime security exits the "multi-path era" of ≤ 0.41 and enters the "single standard era" of 0.43+.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;This article was written with assistance from AI (Claude) and reviewed for technical accuracy.&lt;/em&gt;&lt;br&gt;
&lt;em&gt;© 2026 ManoIT | &lt;a href="https://www.manoit.co.kr" rel="noopener noreferrer"&gt;www.manoit.co.kr&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Originally published at &lt;a href="https://www.manoit.co.kr/forum/view/1467105" rel="noopener noreferrer"&gt;ManoIT Tech Blog&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>security</category>
      <category>kubernetes</category>
      <category>observability</category>
      <category>devops</category>
    </item>
    <item>
      <title>Amazon EKS Hybrid Nodes Gateway Deep Dive: VXLAN, Cilium VTEP, and Lease-Based Leader Election Redefining Hybrid Kubernetes Networking</title>
      <dc:creator>daniel jeong</dc:creator>
      <pubDate>Wed, 06 May 2026 00:43:23 +0000</pubDate>
      <link>https://dev.to/x4nent/amazon-eks-hybrid-nodes-gateway-deep-dive-vxlan-cilium-vtep-and-lease-based-leader-election-1mhg</link>
      <guid>https://dev.to/x4nent/amazon-eks-hybrid-nodes-gateway-deep-dive-vxlan-cilium-vtep-and-lease-based-leader-election-1mhg</guid>
      <description>&lt;h1&gt;
  
  
  Amazon EKS Hybrid Nodes Gateway Deep Dive — VXLAN, Cilium VTEP, and Lease-Based Leader Election Redefining Hybrid Kubernetes Networking in 2026
&lt;/h1&gt;

&lt;p&gt;On April 28, 2026, alongside the OpenAI–Bedrock partnership announcement at the &lt;strong&gt;What's Next with AWS 2026&lt;/strong&gt; keynote, AWS quietly delivered one of the most operationally meaningful container updates of the year: the general availability of &lt;strong&gt;Amazon EKS Hybrid Nodes gateway&lt;/strong&gt;. Since EKS Hybrid Nodes first shipped at re:Invent 2024, the single biggest operational tax on adopters has been a deceptively simple question: "How do we make on-premises pod CIDRs routable from the VPC?" That tax disappears with one Helm install. This post walks through the four-axis architecture of the new gateway, the AWS-maintained Cilium build with the &lt;code&gt;CiliumVTEPConfig&lt;/code&gt; CRD, the VXLAN tunnel (VNI 2 / UDP 8472), the Kubernetes Lease-based leader election (3–5s failover), the automated VPC route table synchronization, IAM and CIDR design rules, and the parallel ECS announcements (EC2 Capacity Reservations integration and NLB Canary deployments) that landed in the same week — all from a hybrid-cluster operator's point of view.&lt;/p&gt;

&lt;h2&gt;
  
  
  1. Why the Gateway Is an Inflection Point — From 2024 GA to 2026 Gateway
&lt;/h2&gt;

&lt;p&gt;EKS Hybrid Nodes went GA at re:Invent 2024 with a clear promise: &lt;strong&gt;"Manage cloud EC2 workers and on-prem bare metal under a single EKS control plane."&lt;/strong&gt; The first year of adoption was rougher than the marketing implied. The biggest hurdle was networking. The AWS VPC CNI is incompatible with hybrid nodes, so operators had to deploy Cilium or Calico — and to make control-plane-to-webhook, EC2-pod-to-on-prem-pod, and ALB/NLB-to-on-prem-pod traffic work, they had to &lt;strong&gt;explicitly register on-prem pod CIDRs in VPC route tables&lt;/strong&gt;, Transit Gateways, and Virtual Private Gateways.&lt;/p&gt;

&lt;p&gt;That work created three persistent operational debts. First, every change to on-prem pod CIDRs required a coordinated change across VPC route tables, TGW, and VGW. Second, in some enterprises (finance, public sector) the routing policy simply forbids exposing pod CIDRs externally — which blocked EKS Hybrid Nodes adoption entirely. Third, BGP-level pod traffic exposure added a new monitoring and operational surface for IDC network teams.&lt;/p&gt;

&lt;p&gt;The April 28, 2026 EKS Hybrid Nodes gateway pays down all three debts at once. The core decision is to &lt;strong&gt;stop trying to make pod CIDRs routable; instead, encapsulate the traffic with VXLAN inside the VPC and carry it to the hybrid nodes as opaque payloads.&lt;/strong&gt; Nothing beyond the VPC route tables needs to know that on-prem pod CIDRs exist, and those tables need only one instruction: "send traffic destined for these pod CIDRs to the gateway EC2 ENI."&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Aspect&lt;/th&gt;
&lt;th&gt;2024 EKS Hybrid Nodes (BGP model)&lt;/th&gt;
&lt;th&gt;2026 Hybrid Nodes Gateway (VXLAN model)&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;On-prem pod CIDR exposure&lt;/td&gt;
&lt;td&gt;Must register in VPC, TGW, VGW&lt;/td&gt;
&lt;td&gt;Not required (VXLAN encapsulation)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;VPC route table management&lt;/td&gt;
&lt;td&gt;Manual or IaC-driven changes&lt;/td&gt;
&lt;td&gt;Auto-synced by gateway&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;On-prem ↔ AWS routing&lt;/td&gt;
&lt;td&gt;Requires BGP peering&lt;/td&gt;
&lt;td&gt;UDP 8472 inbound/outbound is sufficient&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Control plane → webhook&lt;/td&gt;
&lt;td&gt;Routed via VGW/TGW&lt;/td&gt;
&lt;td&gt;Encapsulated via gateway ENI&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;EC2 pod ↔ on-prem pod&lt;/td&gt;
&lt;td&gt;Depends on BGP routing&lt;/td&gt;
&lt;td&gt;Direct VXLAN tunnel&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;ALB/NLB → on-prem pod&lt;/td&gt;
&lt;td&gt;Previously unsupported&lt;/td&gt;
&lt;td&gt;Native, at no additional charge&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Failover time&lt;/td&gt;
&lt;td&gt;BGP reconvergence (tens of seconds)&lt;/td&gt;
&lt;td&gt;Lease-based leader election: 3–5 seconds&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Pricing&lt;/td&gt;
&lt;td&gt;EKS Hybrid Nodes per-vCPU-hour&lt;/td&gt;
&lt;td&gt;Gateway itself: no extra charge (only standard EC2/EKS fees)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;One-line summary: &lt;strong&gt;hybrid Kubernetes is finally free of BGP governance.&lt;/strong&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  2. The Four-Axis Architecture — VPC, Gateway, VXLAN, On-Prem
&lt;/h2&gt;

&lt;p&gt;The gateway can be reasoned about as four axes. Splitting responsibilities along these lines makes security design, observability, and troubleshooting fall into place at once.&lt;/p&gt;

&lt;h3&gt;
  
  
  2.1 Axis 1: VPC Route Table — "Send pod-CIDR traffic to the gateway ENI"
&lt;/h3&gt;

&lt;p&gt;At install time the operator provides a list of VPC route table IDs. The gateway controller pod automatically inserts entries that point on-prem pod CIDRs at the active gateway pod's ENI. &lt;strong&gt;The gateway pod writes these entries itself by calling EC2 APIs under its IAM role; no IaC is involved.&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Auto-inserted route table entries (example)&lt;/span&gt;
Destination          Target                                          Status
10.200.0.0/16        eni-0abc1234... &lt;span class="o"&gt;(&lt;/span&gt;active gateway pod&lt;span class="o"&gt;)&lt;/span&gt;             active
10.201.0.0/16        eni-0abc1234...                                  active
&lt;span class="c"&gt;# On failover the ENI target is updated to the new active ENI.&lt;/span&gt;
&lt;span class="c"&gt;# On clean shutdown the routes are removed automatically.&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Thanks to those routes, EC2 pods, ALBs/NLBs, and the EKS control plane ENIs (used for webhook calls) inside the VPC all have a deterministic path to the gateway whenever they target an on-prem pod IP.&lt;/p&gt;
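&lt;p&gt;The route synchronization described above can be pictured as a small reconcile step. The Python sketch below is an assumption about the general shape, not the gateway's actual code: it only plans changes, makes no AWS calls, and borrows the EC2 action names &lt;code&gt;CreateRoute&lt;/code&gt;, &lt;code&gt;ReplaceRoute&lt;/code&gt;, and &lt;code&gt;DeleteRoute&lt;/code&gt; as labels. The &lt;code&gt;plan_route_changes&lt;/code&gt; helper and the ENI IDs are hypothetical, and &lt;code&gt;current&lt;/code&gt; is assumed to hold only routes the gateway itself manages.&lt;/p&gt;

```python
def plan_route_changes(current, pod_cidrs, active_eni):
    """Return (action, cidr, eni) tuples that converge current onto the desired state.

    current maps a destination CIDR to the ENI its route points at today.
    """
    plan = []
    for cidr in pod_cidrs:
        if cidr not in current:
            plan.append(("CreateRoute", cidr, active_eni))
        elif current[cidr] != active_eni:
            # Failover case: the route exists but targets the old active ENI.
            plan.append(("ReplaceRoute", cidr, active_eni))
    for cidr, eni in current.items():
        if cidr not in pod_cidrs:
            # The CIDR is no longer served by the gateway, so its route goes away.
            plan.append(("DeleteRoute", cidr, eni))
    return plan


# Hypothetical state right after a failover from eni-old to eni-new.
current_routes = {"10.200.0.0/16": "eni-old", "10.250.0.0/16": "eni-old"}
desired_cidrs = ["10.200.0.0/16", "10.201.0.0/16"]

for step in plan_route_changes(current_routes, desired_cidrs, "eni-new"):
    print(step)
```

&lt;p&gt;Planning first and applying second keeps the step idempotent: once the tables match the desired state, the plan is empty and the reconciler has nothing to do.&lt;/p&gt;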

&lt;h3&gt;
  
  
  2.2 Axis 2: Gateway EC2 — Active/Standby with Lease-Based Leader Election
&lt;/h3&gt;

&lt;p&gt;The gateway is a Deployment of two pods. Both land on EC2 nodes labeled for gateway use (a dedicated node pool, managed node group, or self-managed nodes). Leader election uses a &lt;strong&gt;Kubernetes Lease object&lt;/strong&gt; to decide which pod actively forwards traffic. Both Active and Standby create the &lt;code&gt;hybrid_vxlan0&lt;/code&gt; VXLAN interface at startup and run a node reconciler that watches &lt;code&gt;CiliumNode&lt;/code&gt; CRs. Because both the VXLAN interface and the reconciler are pre-warmed on Standby, &lt;strong&gt;failover completes within 3–5 seconds&lt;/strong&gt; when the Active pod dies.&lt;/p&gt;

&lt;p&gt;Two operational implications follow. First, place the two gateway EC2 instances in different AZs. Second, choose a network-bandwidth-rich instance family (m6i/m7i large or above) — the gateway is the &lt;strong&gt;single path&lt;/strong&gt; for all VPC-to-on-prem pod traffic.&lt;/p&gt;

&lt;h3&gt;
  
  
  2.3 Axis 3: VXLAN (VNI 2 / UDP 8472) — Cilium-Compatible by Default
&lt;/h3&gt;

&lt;p&gt;VXLAN is implemented by the &lt;code&gt;hybrid_vxlan0&lt;/code&gt; interface inside the gateway EC2 instance. &lt;strong&gt;VNI 2 / UDP 8472&lt;/strong&gt; matches the Cilium default, so the gateway shares a data plane with the on-prem nodes' Cilium agents. When a new hybrid node registers, the gateway adds it as a remote VTEP, programming FDB entries, ARP entries, and routes on the VXLAN interface to complete the tunnel. Dynamic registration relies on the &lt;strong&gt;&lt;code&gt;CiliumVTEPConfig&lt;/code&gt; CRD bundled with the AWS-maintained Cilium build&lt;/strong&gt; — not present in upstream Cilium — which the gateway uses to register itself as the remote VTEP.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="c1"&gt;# CiliumVTEPConfig CR — auto-generated and managed by the gateway&lt;/span&gt;
&lt;span class="na"&gt;apiVersion&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;cilium.io/v2alpha1&lt;/span&gt;
&lt;span class="na"&gt;kind&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;CiliumVTEPConfig&lt;/span&gt;
&lt;span class="na"&gt;metadata&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;eks-hybrid-gateway&lt;/span&gt;
  &lt;span class="na"&gt;namespace&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;kube-system&lt;/span&gt;
&lt;span class="na"&gt;spec&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;vtepEndpoints&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;ip&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;10.10.1.42&lt;/span&gt;         &lt;span class="c1"&gt;# active gateway ENI primary IP&lt;/span&gt;
      &lt;span class="na"&gt;mac&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;0a:1b:2c:3d:4e:5f"&lt;/span&gt;
      &lt;span class="na"&gt;cidr&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;10.0.0.0/16"&lt;/span&gt;    &lt;span class="c1"&gt;# VPC CIDR: traffic destined for this range is encapsulated&lt;/span&gt;
  &lt;span class="na"&gt;vni&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;2&lt;/span&gt;
  &lt;span class="na"&gt;port&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;8472&lt;/span&gt;                 &lt;span class="c1"&gt;# UDP 8472, Cilium default&lt;/span&gt;
  &lt;span class="na"&gt;mtu&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;1450&lt;/span&gt;                  &lt;span class="c1"&gt;# 1500-byte underlay MTU minus 50 bytes of VXLAN overhead&lt;/span&gt;
&lt;span class="na"&gt;status&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;active&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
  &lt;span class="na"&gt;lastReconciledAt&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;2026-05-06T08:30:00Z"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
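&lt;p&gt;The &lt;code&gt;mtu: 1450&lt;/code&gt; value follows from standard VXLAN-over-IPv4 header sizes. The Python sketch below spells out the arithmetic; the constant and function names are illustrative, and the 1500-byte underlay MTU is an assumption about the path between the gateway and the on-prem nodes.&lt;/p&gt;

```python
# Bytes added in front of each inner packet when VXLAN runs over IPv4.
OUTER_IPV4 = 20      # outer IPv4 header, no options
OUTER_UDP = 8        # outer UDP header (destination port 8472 here)
VXLAN_HEADER = 8     # VXLAN header carrying the 24-bit VNI
INNER_ETHERNET = 14  # inner Ethernet header carried inside the tunnel


def vxlan_overhead():
    """Total encapsulation overhead in bytes."""
    return OUTER_IPV4 + OUTER_UDP + VXLAN_HEADER + INNER_ETHERNET


def inner_mtu(underlay_mtu):
    """Largest inner IP packet that fits in one underlay packet."""
    return underlay_mtu - vxlan_overhead()


print(vxlan_overhead())   # overhead in bytes
print(inner_mtu(1500))    # pod-facing MTU on a standard 1500-byte path
```

&lt;p&gt;If any hop between the VPC and the data center has a smaller MTU than assumed, the inner MTU has to shrink by the same amount, otherwise large packets are fragmented or dropped.&lt;/p&gt;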



&lt;h3&gt;
  
  
  2.4 Axis 4: On-Prem Nodes (Cilium VTEP Decoder)
&lt;/h3&gt;

&lt;p&gt;Each on-prem node's Cilium agent reads the &lt;code&gt;CiliumVTEPConfig&lt;/code&gt; and registers the gateway IP as a remote VTEP. When an encapsulated packet arrives, it strips the VXLAN header and routes inline to the destination pod. The reverse direction (on-prem pod → VPC) uses the same tunnel. &lt;strong&gt;Cilium unifies both directions into one data plane&lt;/strong&gt;, so policies (NetworkPolicy / CiliumNetworkPolicy) apply consistently across the cluster.&lt;/p&gt;

&lt;h2&gt;
  
  
  3. Prerequisites — IAM, Security Groups, CIDR Design
&lt;/h2&gt;

&lt;p&gt;Three pre-flight tasks must be completed before deployment. Skipping any one of them either blocks the gateway from updating the VPC routes or yields a cluster where packets reach the on-prem side but never come back.&lt;/p&gt;

&lt;h3&gt;
  
  
  3.1 Gateway IAM Permissions — Scoped via IRSA
&lt;/h3&gt;

&lt;p&gt;The gateway pod must create and update the routes that point at its own node's ENI, which requires EC2 permissions. Bind the following policy via IRSA (IAM Roles for Service Accounts).&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"Version"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"2012-10-17"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"Statement"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"Sid"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"DescribeRouteTables"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"Effect"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Allow"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"Action"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="s2"&gt;"ec2:DescribeRouteTables"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="s2"&gt;"ec2:DescribeNetworkInterfaces"&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"Resource"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"*"&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"Sid"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"ManagePodCidrRoutes"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"Effect"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Allow"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"Action"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="s2"&gt;"ec2:CreateRoute"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="s2"&gt;"ec2:ReplaceRoute"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="s2"&gt;"ec2:DeleteRoute"&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"Resource"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="s2"&gt;"arn:aws:ec2:ap-northeast-2:123456789012:route-table/rtb-aaaa1111"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="s2"&gt;"arn:aws:ec2:ap-northeast-2:123456789012:route-table/rtb-bbbb2222"&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"Condition"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"StringEquals"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="nl"&gt;"aws:ResourceTag/eks-hybrid-nodes-gateway"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"true"&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



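&lt;p&gt;&lt;strong&gt;Design note:&lt;/strong&gt; never use a wildcard for &lt;code&gt;Resource&lt;/code&gt; on the statement that creates, replaces, or deletes routes. (The &lt;code&gt;ec2:Describe*&lt;/code&gt; actions in the first statement do not support resource-level permissions, so &lt;code&gt;"*"&lt;/code&gt; is unavoidable there.) Pin the route-table ARNs the gateway is allowed to manage, and add a tag-based &lt;code&gt;Condition&lt;/code&gt; so that only route tables explicitly tagged &lt;code&gt;eks-hybrid-nodes-gateway=true&lt;/code&gt; are eligible. This contains the blast radius if anything goes wrong.&lt;/p&gt;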
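&lt;p&gt;The scoping rule can be checked mechanically before a policy is attached. The following is a minimal Python sketch, not an AWS tool: the function name &lt;code&gt;wildcard_mutations&lt;/code&gt; and the trimmed-down &lt;code&gt;POLICY&lt;/code&gt; dictionary are illustrative assumptions, and the check only looks for the three route actions spelled out literally (it does not expand action wildcards such as &lt;code&gt;ec2:*&lt;/code&gt;).&lt;/p&gt;

```python
# Hypothetical lint: flag Allow statements that grant route mutations on "*".
MUTATING_ROUTE_ACTIONS = {"ec2:CreateRoute", "ec2:ReplaceRoute", "ec2:DeleteRoute"}


def as_list(value):
    """IAM accepts either a single string or a list for Action and Resource."""
    return [value] if isinstance(value, str) else list(value)


def wildcard_mutations(policy):
    """Return the Sids of Allow statements that mutate routes on Resource '*'."""
    statements = policy["Statement"]
    if isinstance(statements, dict):
        statements = [statements]
    flagged = []
    for stmt in statements:
        if stmt.get("Effect") != "Allow":
            continue
        actions = set(as_list(stmt.get("Action", [])))
        resources = as_list(stmt.get("Resource", []))
        if actions.intersection(MUTATING_ROUTE_ACTIONS) and "*" in resources:
            flagged.append(stmt.get("Sid", "(no Sid)"))
    return flagged


# Trimmed copy of the policy above (the Condition block is omitted for brevity).
POLICY = {
    "Statement": [
        {"Sid": "DescribeRouteTables", "Effect": "Allow",
         "Action": ["ec2:DescribeRouteTables", "ec2:DescribeNetworkInterfaces"],
         "Resource": "*"},
        {"Sid": "ManagePodCidrRoutes", "Effect": "Allow",
         "Action": ["ec2:CreateRoute", "ec2:ReplaceRoute", "ec2:DeleteRoute"],
         "Resource": ["arn:aws:ec2:ap-northeast-2:123456789012:route-table/rtb-aaaa1111"]},
    ]
}

print(wildcard_mutations(POLICY))  # []
```

&lt;p&gt;The read-only statement is allowed to keep its wildcard; only statements that can change routes are flagged, which matches the intent of the design note.&lt;/p&gt;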

&lt;h3&gt;
  
  
  3.2 Security Groups and On-Prem Firewalls
&lt;/h3&gt;

&lt;p&gt;VXLAN needs a single UDP port (8472) open in both directions, on the AWS security group and on the on-prem firewall alike.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Where&lt;/th&gt;
&lt;th&gt;Direction&lt;/th&gt;
&lt;th&gt;Protocol/Port&lt;/th&gt;
&lt;th&gt;Source/Destination&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Gateway EC2 SG&lt;/td&gt;
&lt;td&gt;Inbound&lt;/td&gt;
&lt;td&gt;UDP 8472&lt;/td&gt;
&lt;td&gt;On-prem node IP CIDR&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Gateway EC2 SG&lt;/td&gt;
&lt;td&gt;Outbound&lt;/td&gt;
&lt;td&gt;UDP 8472&lt;/td&gt;
&lt;td&gt;On-prem node IP CIDR&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;On-prem firewall&lt;/td&gt;
&lt;td&gt;Inbound/Outbound&lt;/td&gt;
&lt;td&gt;UDP 8472&lt;/td&gt;
&lt;td&gt;Gateway EC2 ENI IP range&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;EKS cluster SG&lt;/td&gt;
&lt;td&gt;Inbound&lt;/td&gt;
&lt;td&gt;TCP 443 / 10250&lt;/td&gt;
&lt;td&gt;On-prem node IP CIDR (kubelet ↔ API)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  3.3 CIDR Design — RFC-1918 / RFC-6598, Non-Overlapping
&lt;/h3&gt;

&lt;p&gt;On-prem node and pod CIDRs must come from one of these ranges:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;RFC-1918: &lt;code&gt;10.0.0.0/8&lt;/code&gt;, &lt;code&gt;172.16.0.0/12&lt;/code&gt;, &lt;code&gt;192.168.0.0/16&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;RFC-6598 (CGNAT): &lt;code&gt;100.64.0.0/10&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;And they must not overlap with any of:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;The VPC CIDR (e.g., &lt;code&gt;10.0.0.0/16&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;The Kubernetes service CIDR (e.g., &lt;code&gt;10.100.0.0/16&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;Each other (on-prem node CIDR ↔ on-prem pod CIDR)&lt;/li&gt;
&lt;li&gt;Any peered or TGW-routed CIDR&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Practical recommendation: pick &lt;code&gt;100.64.0.0/16&lt;/code&gt; or &lt;code&gt;100.65.0.0/16&lt;/code&gt; from RFC-6598 for on-prem pod CIDRs. They almost never collide with internal RFC-1918 corporate ranges, and because they sit outside the VPC's own address space, you can size them without touching the VPC address plan.&lt;/p&gt;
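&lt;p&gt;Both rules (allowed range, no overlap) are easy to verify with Python's standard &lt;code&gt;ipaddress&lt;/code&gt; module before committing to an address plan. This is a small sketch under assumptions: &lt;code&gt;check_cidr_plan&lt;/code&gt; is a made-up helper name, the VPC and service CIDRs are the example values from the list above, and the on-prem node CIDR is a placeholder.&lt;/p&gt;

```python
import ipaddress

# Ranges listed above as valid sources for on-prem node and pod CIDRs.
ALLOWED = [ipaddress.ip_network(c) for c in
           ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16", "100.64.0.0/10")]


def check_cidr_plan(candidate, reserved):
    """Return a list of problems for one candidate CIDR; empty means OK."""
    net = ipaddress.ip_network(candidate)
    problems = []
    if not any(net.subnet_of(allowed) for allowed in ALLOWED):
        problems.append(f"{candidate} is outside RFC 1918 / RFC 6598")
    for name, cidr in reserved.items():
        if net.overlaps(ipaddress.ip_network(cidr)):
            problems.append(f"{candidate} overlaps {name} ({cidr})")
    return problems


# Example plan: VPC and service CIDRs come from the list above,
# the on-prem node CIDR is a hypothetical placeholder.
reserved = {
    "VPC": "10.0.0.0/16",
    "service": "10.100.0.0/16",
    "on-prem nodes": "192.168.10.0/24",
}

print(check_cidr_plan("100.64.0.0/16", reserved))  # []
print(check_cidr_plan("10.0.128.0/17", reserved))  # one problem: overlaps the VPC
```

&lt;p&gt;Peered and TGW-routed CIDRs can be added to &lt;code&gt;reserved&lt;/code&gt; in the same way so that the fourth rule is covered too.&lt;/p&gt;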

&lt;h2&gt;
  
  
  4. Deployment — One Helm Install, 30-Second Failover Test
&lt;/h2&gt;

&lt;p&gt;The gateway ships as a Helm chart alongside the AWS-maintained Cilium build. Deployment is four steps.&lt;/p&gt;

&lt;h3&gt;
  
  
  4.1 Provision the Gateway Node Pool
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="c1"&gt;# eksctl: dedicated gateway node group across two AZs&lt;/span&gt;
&lt;span class="na"&gt;apiVersion&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;eksctl.io/v1alpha5&lt;/span&gt;
&lt;span class="na"&gt;kind&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;ClusterConfig&lt;/span&gt;
&lt;span class="na"&gt;metadata&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;prod-hybrid&lt;/span&gt;
  &lt;span class="na"&gt;region&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;ap-northeast-2&lt;/span&gt;
  &lt;span class="na"&gt;version&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;1.32"&lt;/span&gt;

&lt;span class="na"&gt;managedNodeGroups&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;gateway-pool&lt;/span&gt;
    &lt;span class="na"&gt;instanceType&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;m7i.large&lt;/span&gt;       &lt;span class="c1"&gt;# generous network bandwidth&lt;/span&gt;
    &lt;span class="na"&gt;desiredCapacity&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;2&lt;/span&gt;
    &lt;span class="na"&gt;minSize&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;2&lt;/span&gt;
    &lt;span class="na"&gt;maxSize&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;4&lt;/span&gt;
    &lt;span class="na"&gt;availabilityZones&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ap-northeast-2a"&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ap-northeast-2c"&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;
    &lt;span class="na"&gt;labels&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;role&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;hybrid-gateway&lt;/span&gt;
    &lt;span class="na"&gt;taints&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;key&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;hybrid-gateway&lt;/span&gt;
        &lt;span class="na"&gt;value&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;true"&lt;/span&gt;
        &lt;span class="na"&gt;effect&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;NoSchedule&lt;/span&gt;
    &lt;span class="na"&gt;privateNetworking&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  4.2 Install the AWS-Maintained Cilium
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;helm repo add eks https://aws.github.io/eks-charts
helm upgrade &lt;span class="nt"&gt;--install&lt;/span&gt; cilium eks/cilium-eks &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--namespace&lt;/span&gt; kube-system &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--version&lt;/span&gt; 1.16.7-eks.1 &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--set&lt;/span&gt; &lt;span class="nv"&gt;kubeProxyReplacement&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nb"&gt;true&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--set&lt;/span&gt; &lt;span class="nv"&gt;tunnelProtocol&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;vxlan &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--set&lt;/span&gt; &lt;span class="nv"&gt;ipv4NativeRoutingCIDR&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;100.64.0.0/16 &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--set&lt;/span&gt; hybridNodes.enabled&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nb"&gt;true&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  4.3 Install the Gateway Components + Bind IRSA
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;helm upgrade &lt;span class="nt"&gt;--install&lt;/span&gt; eks-hybrid-gateway eks/eks-hybrid-nodes-gateway &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--namespace&lt;/span&gt; kube-system &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--version&lt;/span&gt; 1.0.0 &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--set&lt;/span&gt; serviceAccount.create&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nb"&gt;true&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--set&lt;/span&gt; serviceAccount.annotations.&lt;span class="s2"&gt;"eks&lt;/span&gt;&lt;span class="se"&gt;\.&lt;/span&gt;&lt;span class="s2"&gt;amazonaws&lt;/span&gt;&lt;span class="se"&gt;\.&lt;/span&gt;&lt;span class="s2"&gt;com/role-arn"&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;arn:aws:iam::123456789012:role/eks-hybrid-gateway &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--set&lt;/span&gt; nodeSelector.role&lt;span class="o"&gt;=&lt;/span&gt;hybrid-gateway &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--set&lt;/span&gt; tolerations[0].key&lt;span class="o"&gt;=&lt;/span&gt;hybrid-gateway &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--set&lt;/span&gt; tolerations[0].operator&lt;span class="o"&gt;=&lt;/span&gt;Equal &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--set-string&lt;/span&gt; tolerations[0].value&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nb"&gt;true&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--set&lt;/span&gt; tolerations[0].effect&lt;span class="o"&gt;=&lt;/span&gt;NoSchedule &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--set&lt;/span&gt; &lt;span class="nv"&gt;vpcRouteTableIds&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"{rtb-aaaa1111,rtb-bbbb2222}"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--set&lt;/span&gt; podAntiAffinity.topologyKey&lt;span class="o"&gt;=&lt;/span&gt;topology.kubernetes.io/zone
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  4.4 Verify — 30-Second Failover Test
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# 1) Identify the leader gateway pod&lt;/span&gt;
kubectl &lt;span class="nt"&gt;-n&lt;/span&gt; kube-system get lease eks-hybrid-gateway &lt;span class="nt"&gt;-o&lt;/span&gt; yaml | &lt;span class="nb"&gt;grep &lt;/span&gt;holderIdentity

&lt;span class="c"&gt;# 2) Confirm the auto-inserted VPC route entries&lt;/span&gt;
aws ec2 describe-route-tables &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--route-table-ids&lt;/span&gt; rtb-aaaa1111 &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--query&lt;/span&gt; &lt;span class="s2"&gt;"RouteTables[].Routes[?DestinationCidrBlock=='100.64.0.0/16']"&lt;/span&gt;

&lt;span class="c"&gt;# 3) Reach an on-prem pod from a VPC pod&lt;/span&gt;
kubectl run curl &lt;span class="nt"&gt;--rm&lt;/span&gt; &lt;span class="nt"&gt;-it&lt;/span&gt; &lt;span class="nt"&gt;--image&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;curlimages/curl &lt;span class="nt"&gt;--&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  curl &lt;span class="nt"&gt;-fsS&lt;/span&gt; http://100.64.0.50:8080/healthz

&lt;span class="c"&gt;# 4) Failover test: delete the current leader pod (take the name from step 1; the one below is a placeholder)&lt;/span&gt;
kubectl &lt;span class="nt"&gt;-n&lt;/span&gt; kube-system delete pod eks-hybrid-gateway-79bc6f5c5d-abcde
&lt;span class="c"&gt;# Standby becomes leader within 3–5 seconds; ENI target on the VPC route flips automatically.&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
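&lt;p&gt;To put a number on the failover window rather than eyeballing it, one option is to probe an on-prem pod once per second during step 4 and compute the longest gap afterwards. The sketch below assumes the probe results have already been collected as &lt;code&gt;(timestamp, succeeded)&lt;/code&gt; pairs; the helper name &lt;code&gt;longest_outage&lt;/code&gt; and the sample data are hypothetical.&lt;/p&gt;

```python
def longest_outage(probes):
    """probes: list of (timestamp_seconds, succeeded) pairs in time order.

    Returns the longest span from the first failed probe of an outage to the
    next successful probe. An outage still open at the end is not counted.
    """
    longest = 0.0
    outage_start = None
    for timestamp, succeeded in probes:
        if not succeeded and outage_start is None:
            outage_start = timestamp          # outage begins
        elif succeeded and outage_start is not None:
            longest = max(longest, timestamp - outage_start)
            outage_start = None               # outage ends
    return longest


# Made-up one-second probes around a leader deletion at roughly t=2s.
probes = [(0.0, True), (1.0, True), (2.0, False), (3.0, False),
          (4.0, False), (5.0, False), (6.0, True), (7.0, True)]

outage = longest_outage(probes)
print(outage)  # 4.0
if outage > 5.0:
    print("failover exceeded the 3-5 second expectation")
```

&lt;p&gt;With one-second probes the result is only accurate to about a second, so a shorter probe interval is worth using if the measurement feeds an SLO.&lt;/p&gt;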



&lt;h2&gt;
  
  
  5. Traffic Matrix — Four Flows You Must Recognize
&lt;/h2&gt;

&lt;p&gt;Once installed, the cluster carries traffic in four distinct patterns. Recognizing each pattern keeps policy, observability, and cost design coherent.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Source → Destination&lt;/th&gt;
&lt;th&gt;Path&lt;/th&gt;
&lt;th&gt;Encapsulation&lt;/th&gt;
&lt;th&gt;Use Case&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;EKS control plane → on-prem pod (Webhook)&lt;/td&gt;
&lt;td&gt;EKS ENI → VPC route → gateway ENI → VXLAN → on-prem pod&lt;/td&gt;
&lt;td&gt;VXLAN&lt;/td&gt;
&lt;td&gt;Admission webhook, aggregated APIServer&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;VPC EC2 pod → on-prem pod&lt;/td&gt;
&lt;td&gt;EC2 ENI → VPC route → gateway ENI → VXLAN → on-prem pod&lt;/td&gt;
&lt;td&gt;VXLAN&lt;/td&gt;
&lt;td&gt;Microservice-to-microservice calls&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;On-prem pod → VPC EC2 pod&lt;/td&gt;
&lt;td&gt;On-prem Cilium → VXLAN → gateway ENI → EC2 pod&lt;/td&gt;
&lt;td&gt;VXLAN (reverse)&lt;/td&gt;
&lt;td&gt;IDC workload calling cache/services next to RDS&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;ALB/NLB → on-prem pod&lt;/td&gt;
&lt;td&gt;ALB/NLB → gateway ENI → VXLAN → on-prem pod&lt;/td&gt;
&lt;td&gt;VXLAN&lt;/td&gt;
&lt;td&gt;External user traffic reaching IDC GPU/licensed workloads&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The fourth flow (ALB/NLB → on-prem pod) is the highest-impact change in 2026. &lt;strong&gt;Previously, sending external user traffic to an IDC workload meant placing a separate ALB/NLB inside the IDC or detouring via Direct Connect; now a single ALB can target both EC2 pods and IDC pods in the same target group.&lt;/strong&gt; That unlocks two important workload classes — AI inference tied to GPU licenses anchored in the IDC, and compliance-bound data-processing that cannot leave the IDC — by letting them participate in the cloud-native ingress flow without extra infrastructure.&lt;/p&gt;

&lt;h2&gt;
  
  
  6. Operations — Metrics, Failover Scenarios, and Five Common Pitfalls
&lt;/h2&gt;

&lt;h3&gt;
  
  
  6.1 Metrics and Logs
&lt;/h3&gt;

&lt;p&gt;The gateway exposes Prometheus metrics. The standard pattern is to scrape them with the OpenTelemetry Collector and ship them to CloudWatch or Grafana.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Prometheus scrape example&lt;/span&gt;
&lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;job_name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;eks-hybrid-gateway&lt;/span&gt;
  &lt;span class="na"&gt;kubernetes_sd_configs&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;role&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;pod&lt;/span&gt;
      &lt;span class="na"&gt;namespaces&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;names&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;kube-system&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;
  &lt;span class="na"&gt;relabel_configs&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;source_labels&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;__meta_kubernetes_pod_label_app&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;
      &lt;span class="na"&gt;action&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;keep&lt;/span&gt;
      &lt;span class="na"&gt;regex&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;eks-hybrid-gateway&lt;/span&gt;
  &lt;span class="na"&gt;metrics_path&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;/metrics&lt;/span&gt;
&lt;span class="c1"&gt;# Key metrics:&lt;/span&gt;
&lt;span class="c1"&gt;#   eks_hybrid_gateway_lease_holder{pod}             1=leader, 0=standby&lt;/span&gt;
&lt;span class="c1"&gt;#   eks_hybrid_gateway_vtep_count                    number of registered remote VTEPs&lt;/span&gt;
&lt;span class="c1"&gt;#   eks_hybrid_gateway_vxlan_packets_total{dir}      tx/rx packet counts&lt;/span&gt;
&lt;span class="c1"&gt;#   eks_hybrid_gateway_route_reconcile_errors_total  VPC route update failures&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
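&lt;p&gt;A useful first alert on these metrics is "exactly one leader at any time". The following Python sketch parses Prometheus text-format output and lists the pods reporting &lt;code&gt;eks_hybrid_gateway_lease_holder&lt;/code&gt; as 1. The metric name is taken from the list above; the exact label layout, the pod names, and the &lt;code&gt;leaders&lt;/code&gt; helper are assumptions for illustration.&lt;/p&gt;

```python
import re

# Matches lines such as: eks_hybrid_gateway_lease_holder{pod="gw-a"} 1
SAMPLE = re.compile(r'^eks_hybrid_gateway_lease_holder\{pod="([^"]+)"\}\s+(\S+)$')


def leaders(exposition_text):
    """Return the pods whose lease_holder gauge currently reads 1."""
    found = []
    for line in exposition_text.splitlines():
        match = SAMPLE.match(line.strip())
        if match and float(match.group(2)) == 1.0:
            found.append(match.group(1))
    return found


# Hypothetical scrape output merged from two gateway pods.
scraped = """
# TYPE eks_hybrid_gateway_lease_holder gauge
eks_hybrid_gateway_lease_holder{pod="gw-a"} 1
eks_hybrid_gateway_lease_holder{pod="gw-b"} 0
"""

current = leaders(scraped)
print(current)  # ['gw-a']
if len(current) != 1:
    print("ALERT: expected exactly one leader, found", len(current))
```

&lt;p&gt;Zero leaders points at a lease that nobody holds; two leaders points at a split-brain, and both cases deserve a page rather than a dashboard glance.&lt;/p&gt;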



&lt;h3&gt;
  
  
  6.2 Four Failover Scenarios
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Failure&lt;/th&gt;
&lt;th&gt;Recovery Time&lt;/th&gt;
&lt;th&gt;Operator Action&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Active gateway pod crash&lt;/td&gt;
&lt;td&gt;3–5s&lt;/td&gt;
&lt;td&gt;None — Lease flips automatically&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Active node OS crash&lt;/td&gt;
&lt;td&gt;5–10s (Lease TTL + route update)&lt;/td&gt;
&lt;td&gt;None — ASG replaces the node&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;VPC route update failure&lt;/td&gt;
&lt;td&gt;Alert only; previous routes remain&lt;/td&gt;
&lt;td&gt;Verify IAM Condition and route-table ARN scoping&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;UDP 8472 blocked between sites&lt;/td&gt;
&lt;td&gt;Total cutoff&lt;/td&gt;
&lt;td&gt;Inspect firewalls / VPN / Direct Connect ACLs&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  6.3 Five Common Pitfalls
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Pod CIDR collision&lt;/strong&gt;: any overlap with the VPC CIDR, however small, breaks routing. Run &lt;code&gt;aws ec2 describe-vpcs&lt;/code&gt; and review every peering and TGW entry.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;MTU 1450 missing&lt;/strong&gt;: VXLAN encapsulation adds 50 bytes per packet on an IPv4 underlay; leaving the MTU at 1500 causes fragmentation or dropped large packets. Set &lt;code&gt;tunnelMTU=1450&lt;/code&gt; in the Cilium chart.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Unidirectional UDP 8472&lt;/strong&gt;: both the security group and the on-prem firewall must allow traffic in both directions. VXLAN has no handshake, so with only one direction open, packets leave but the replies are silently dropped.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;IAM &lt;code&gt;Resource&lt;/code&gt; wildcard&lt;/strong&gt;: letting the gateway modify arbitrary route tables is a foot-gun. Pin ARNs and use a tag &lt;code&gt;Condition&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Missing anti-affinity&lt;/strong&gt;: if both gateway pods land on the same node or AZ, a single failure takes out both and failover is meaningless. Enforce &lt;code&gt;topologyKey=topology.kubernetes.io/zone&lt;/code&gt;.&lt;/li&gt;
&lt;/ol&gt;
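&lt;p&gt;The 50-byte figure in the MTU pitfall is the sum of the headers VXLAN wraps around each pod packet on an IPv4 underlay. A short Python sketch of the arithmetic follows; the header sizes are the standard ones for IPv4, UDP, VXLAN, and untagged Ethernet, and the function names are illustrative.&lt;/p&gt;

```python
# Per-packet overhead of VXLAN over an IPv4 underlay, in bytes.
OUTER_IPV4 = 20      # outer IP header, no options
OUTER_UDP = 8        # outer UDP header (destination port 8472 here)
VXLAN_HEADER = 8     # VXLAN header carrying the VNI
INNER_ETHERNET = 14  # inner Ethernet header, no VLAN tag


def vxlan_overhead():
    """Bytes added in front of every encapsulated inner IP packet."""
    return OUTER_IPV4 + OUTER_UDP + VXLAN_HEADER + INNER_ETHERNET


def tunnel_mtu(underlay_mtu):
    """Largest inner IP packet that still fits in one underlay packet."""
    return underlay_mtu - vxlan_overhead()


print(vxlan_overhead())  # 50
print(tunnel_mtu(1500))  # 1450
```

&lt;p&gt;If the path between the sites has a smaller MTU than 1500 (some VPN tunnels do), the same subtraction applies to that smaller value, and an IPv6 underlay would need 20 more bytes for the larger outer header.&lt;/p&gt;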

&lt;h2&gt;
  
  
  7. The Same Week: ECS Capacity Reservations and NLB Canary
&lt;/h2&gt;

&lt;p&gt;The What's Next with AWS 2026 keynote also delivered two meaningful ECS updates that hybrid-cluster operators should review in the same quarter.&lt;/p&gt;

&lt;h3&gt;
  
  
  7.1 ECS Managed Instances + EC2 Capacity Reservations
&lt;/h3&gt;

&lt;p&gt;ECS Managed Instance capacity providers gained a new option, &lt;code&gt;capacityOptionType=reserved&lt;/code&gt;, allowing tasks to consume previously reserved EC2 capacity. Three strategies are available.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Strategy&lt;/th&gt;
&lt;th&gt;Meaning&lt;/th&gt;
&lt;th&gt;Recommended Use&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;reservations-only&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Launch only into reserved capacity&lt;/td&gt;
&lt;td&gt;License/GPU/BYOL workloads&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;reservations-first&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Prefer reservations, fall back to on-demand&lt;/td&gt;
&lt;td&gt;Predictable baseline plus spike handling&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;reservations-excluded&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Never use reservations&lt;/td&gt;
&lt;td&gt;Spot-heavy cost-optimized workloads&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Bind a Capacity Reservation to an ECS capacity provider&lt;/span&gt;
aws ecs create-capacity-provider &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--name&lt;/span&gt; reserved-baseline &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--auto-scaling-group-provider&lt;/span&gt; &lt;span class="s1"&gt;'autoScalingGroupArn=arn:aws:autoscaling:...'&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--managed-instances-provider&lt;/span&gt; &lt;span class="s1"&gt;'{
    "capacityOptionType": "reserved",
    "capacityReservationGroupArn": "arn:aws:resource-groups:ap-northeast-2:123456789012:group/cr-baseline",
    "reservationStrategy": "reservations-first"
  }'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
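&lt;p&gt;The difference between the three strategies is easiest to see as a decision function. The Python sketch below is a toy model of the table's wording only, not a description of how ECS actually schedules: &lt;code&gt;choose_capacity&lt;/code&gt;, its return labels, and the notion of "free reserved slots" are all assumptions made for illustration.&lt;/p&gt;

```python
def choose_capacity(strategy, reserved_slots_free):
    """Return where a new instance would launch under each strategy.

    'reserved'     -> consumes a Capacity Reservation
    'on-demand'    -> falls back to regular on-demand capacity
    'non-reserved' -> deliberately avoids reservations
    None           -> nothing can launch until reserved capacity frees up
    """
    if strategy == "reservations-only":
        return "reserved" if reserved_slots_free > 0 else None
    if strategy == "reservations-first":
        return "reserved" if reserved_slots_free > 0 else "on-demand"
    if strategy == "reservations-excluded":
        return "non-reserved"
    raise ValueError(f"unknown strategy: {strategy}")


for strategy in ("reservations-only", "reservations-first", "reservations-excluded"):
    # Compare behaviour with and without free reserved capacity.
    print(strategy, choose_capacity(strategy, 2), choose_capacity(strategy, 0))
```

&lt;p&gt;The case to plan for is &lt;code&gt;reservations-only&lt;/code&gt; with the reservation exhausted: in this model nothing launches, so that strategy should be paired with alerting on reservation utilization.&lt;/p&gt;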



&lt;h3&gt;
  
  
  7.2 ECS Linear/Canary Deployments on NLB
&lt;/h3&gt;

&lt;p&gt;ECS already supported linear/canary on ALB; NLB was the gap. With the new release, &lt;strong&gt;NLB-backed services can shift traffic in fractional steps&lt;/strong&gt;, integrated with CloudWatch alarms for automatic rollback when P99 latency or 5xx error rates breach thresholds.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Apply NLB canary deployment policy to an ECS service&lt;/span&gt;
aws ecs update-service &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--cluster&lt;/span&gt; prod &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--service&lt;/span&gt; api-grpc &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--deployment-configuration&lt;/span&gt; &lt;span class="s1"&gt;'{
    "deploymentCircuitBreaker": {"enable": true, "rollback": true},
    "strategy": "CANARY",
    "stepWeights": [10, 25, 50, 100],
    "stepDuration": 300
  }'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
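&lt;p&gt;It helps to translate the step configuration into a wall-clock timeline before agreeing on alarm windows. The Python sketch below does that for the values in the command above. It assumes that &lt;code&gt;stepDuration&lt;/code&gt; is in seconds and that each weight is held for one full step before the next one starts; neither assumption is confirmed here, and &lt;code&gt;canary_timeline&lt;/code&gt; is a hypothetical helper.&lt;/p&gt;

```python
def canary_timeline(step_weights, step_duration_s):
    """Return (start_offset_seconds, traffic_percent) for each step."""
    return [(index * step_duration_s, weight)
            for index, weight in enumerate(step_weights)]


# Values copied from the deployment configuration above.
timeline = canary_timeline([10, 25, 50, 100], 300)
for offset, weight in timeline:
    print(f"t+{offset:>4}s  {weight:>3}% of traffic on the new task set")
```

&lt;p&gt;Under these assumptions the new task set reaches full traffic 900 seconds after the deployment starts, so a rollback alarm with an evaluation period longer than one step (300 seconds) would react only after the next traffic increase has already happened.&lt;/p&gt;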



&lt;p&gt;Latency-critical workloads that require NLB — gRPC services, financial matching engines — finally get the same deployment freedom that ALB users have had for years.&lt;/p&gt;

&lt;h2&gt;
  
  
  8. A 4-Week Migration Checklist
&lt;/h2&gt;

&lt;p&gt;The 4-week plan a hybrid-cluster operator can use to migrate an existing EKS 1.32 + IDC GPU node deployment to the new gateway:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Week&lt;/th&gt;
&lt;th&gt;Work&lt;/th&gt;
&lt;th&gt;Done When&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Week 1&lt;/td&gt;
&lt;td&gt;CIDR redesign, IAM role, security groups, Direct Connect ACL review&lt;/td&gt;
&lt;td&gt;On-prem pod CIDR fixed in non-overlapping RFC-6598 range; IAM Resource ARN/tag scoped&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Week 2&lt;/td&gt;
&lt;td&gt;Deploy gateway pool + AWS-maintained Cilium + gateway chart in Dev&lt;/td&gt;
&lt;td&gt;Lease holder, VTEP count, 3–5s failover verified; P99 RTT measured&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Week 3&lt;/td&gt;
&lt;td&gt;Run all four traffic flows end-to-end in Staging&lt;/td&gt;
&lt;td&gt;Bidirectional reachability across all flows; consistent NetworkPolicy enforcement&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Week 4&lt;/td&gt;
&lt;td&gt;Prod cutover + retire BGP routing + apply ECS Capacity Reservations / NLB Canary&lt;/td&gt;
&lt;td&gt;Manually-registered pod CIDRs removed from VPC route tables; cost &amp;amp; latency SLOs healthy&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  8.1 Cost Model
&lt;/h3&gt;

&lt;p&gt;The gateway itself is free. Total cost is driven by three things: (1) the EC2 cost of two gateway instances (or managed node group equivalents); (2) data-transfer fees through the gateway ENIs; (3) the existing EKS Hybrid Nodes per-vCPU-hour fee. As long as you reuse an existing Direct Connect / VPN circuit, &lt;strong&gt;the gateway cuts operational burden while adding little beyond two EC2 instances and the data transfer that passes through them.&lt;/strong&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  8.2 Recommended SLOs
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;VTEP registration latency:&lt;/strong&gt; new hybrid node registered as a remote VTEP on the gateway within 30 seconds (P95)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;VXLAN packet drop rate:&lt;/strong&gt; below 0.001%, evaluated over one-minute windows&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Gateway failover time:&lt;/strong&gt; under 5 seconds (P99) for a pod-level failover; a node-level crash can take 5–10 seconds, as noted in section 6.2&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;VPC route update failures:&lt;/strong&gt; zero (critical alert)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;ALB → on-prem pod RTT overhead:&lt;/strong&gt; within +1–2 ms over Direct Connect baseline&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  9. Conclusion — Hybrid Kubernetes Has a New Default
&lt;/h2&gt;

&lt;p&gt;As of May 2026, the EKS Hybrid Nodes gateway settles two things. First, &lt;strong&gt;the default network model for hybrid Kubernetes is encapsulation, not BGP.&lt;/strong&gt; Not exposing pod CIDRs externally shortens compliance and security review, and reduces IaC churn around route tables to zero. Second, &lt;strong&gt;ALB and NLB can now treat EC2 pods and IDC pods as members of the same target group&lt;/strong&gt;, letting "external user → cloud → IDC GPU" traffic flow through a single ingress for the first time. That is the thread that pulls license-bound and data-sovereignty-bound workloads back into a cloud-native posture without separate infrastructure.&lt;/p&gt;

&lt;p&gt;For us at ManoIT, the next-quarter roadmap has three legs. First, walk Dev → Staging → Prod over four weeks and pin gateway metrics into the front row of our SRE dashboard. Second, shift equivalent ECS workloads to NLB Canary, wired to P99-latency-driven automatic rollback. Third, anchor GPU and licensed baselines on ECS Managed Instances + Capacity Reservations while keeping a Spot fallback strategy. When those three land, our container infrastructure becomes one platform with three faces — EKS, ECS, and hybrid — visible from a single operational surface.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;This article was produced with assistance from AI (Claude) and reviewed for technical accuracy.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;© 2026 ManoIT — &lt;a href="https://www.manoit.co.kr" rel="noopener noreferrer"&gt;www.manoit.co.kr&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Originally published at &lt;a href="https://www.manoit.co.kr/forum/view/1466403" rel="noopener noreferrer"&gt;ManoIT Tech Blog&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>aws</category>
      <category>kubernetes</category>
      <category>cloud</category>
      <category>devops</category>
    </item>
    <item>
      <title>GitLab 18.11 + Duo Agent Platform: CI Expert, Agentic SAST Auto-Resolution, Custom Flow YAML, and Credits Guardrails for AI-Native DevSecOps</title>
      <dc:creator>daniel jeong</dc:creator>
      <pubDate>Mon, 04 May 2026 21:15:05 +0000</pubDate>
      <link>https://dev.to/x4nent/gitlab-1811-duo-agent-platform-ci-expert-agentic-sast-auto-resolution-custom-flow-yaml-and-3jmf</link>
      <guid>https://dev.to/x4nent/gitlab-1811-duo-agent-platform-ci-expert-agentic-sast-auto-resolution-custom-flow-yaml-and-3jmf</guid>
      <description>&lt;h1&gt;
  
  
  GitLab 18.11 + Duo Agent Platform — CI Expert, Agentic SAST Auto-Resolution, Custom Flow YAML, and GitLab Credits Guardrails Redefining 2026 AI-Native DevSecOps
&lt;/h1&gt;

&lt;p&gt;On April 16, 2026, GitLab shipped &lt;strong&gt;18.11&lt;/strong&gt; and dropped three new agents onto the &lt;strong&gt;Duo Agent Platform&lt;/strong&gt; that went GA back in January. The new lineup is the &lt;strong&gt;CI Expert Agent (Beta)&lt;/strong&gt; that writes a working &lt;code&gt;.gitlab-ci.yml&lt;/code&gt; from an empty repository, the &lt;strong&gt;Data Analyst Agent (GA)&lt;/strong&gt; that answers natural-language queries via GLQL, and &lt;strong&gt;Agentic SAST Vulnerability Resolution (GA)&lt;/strong&gt; that ships a ready-to-review merge request the moment a SAST scan finishes. At the same time the default model for Agentic Chat moved from &lt;strong&gt;Claude Haiku 4.5 to Claude Sonnet 4.6 on Vertex AI&lt;/strong&gt;, the self-hosted LLM list gained &lt;strong&gt;Mistral AI&lt;/strong&gt;, and — critically for ops teams — GitLab introduced &lt;strong&gt;subscription-level and per-user GitLab Credits caps&lt;/strong&gt; to stop runaway spend. This post is the ManoIT DevSecOps team's consolidated reference: the four-tier Duo Agent Platform architecture, the Custom Flow v1 YAML schema, the CI/CD and SAST automation patterns, the Credits cost model, and the migration checklist (including the forced PostgreSQL 17 upgrade).&lt;/p&gt;

&lt;h2&gt;
  
  
  1. Why GitLab 18.11 Is the Inflection Point — Q1 2026 Duo Agent Platform Trajectory
&lt;/h2&gt;

&lt;p&gt;The Q1 2026 storyline of GitLab Duo can be summarized in one line: &lt;strong&gt;"the AI plugin became the platform."&lt;/strong&gt; 18.8 (2026-01-15) promoted the Duo Agent Platform to GA on Premium and Ultimate, shipping the Planner Agent, Security Analyst Agent, and the MCP Client all at once. 18.9 (2026-02-19) made self-hosted LLMs first-class. 18.10 introduced &lt;strong&gt;GitLab Credits&lt;/strong&gt; as the usage-based billing currency in beta. 18.11 then converged the trajectory: it pushed CI Expert into Beta on every tier, promoted Agentic SAST to GA so a fix MR appears the instant a scan completes, and capped credit consumption at both the subscription and the user level.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Version (release date)&lt;/th&gt;
&lt;th&gt;Headline change&lt;/th&gt;
&lt;th&gt;Tier&lt;/th&gt;
&lt;th&gt;Practical impact&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;18.8 (2026-01-15)&lt;/td&gt;
&lt;td&gt;Duo Agent Platform GA, Duo Planner / Security Analyst / MCP Client GA, 5 new flows (Issue→MR, Convert to GitLab CI/CD, Fix CI/CD pipeline, Code Review, Software Dev in IDE)&lt;/td&gt;
&lt;td&gt;Premium / Ultimate&lt;/td&gt;
&lt;td&gt;MCP-connected Slack, Jira, Confluence inside the IDE&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;18.9 (2026-02-19)&lt;/td&gt;
&lt;td&gt;Self-hosted LLMs official (AWS Bedrock, Vertex AI, Azure OpenAI, Anthropic, OpenAI), Agentic SAST Vulnerability Resolution Beta&lt;/td&gt;
&lt;td&gt;Ultimate (self-hosted), all tiers (Beta)&lt;/td&gt;
&lt;td&gt;Duo usable in air-gapped finance/government environments&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;18.10&lt;/td&gt;
&lt;td&gt;GitLab Credits usage-based billing (beta)&lt;/td&gt;
&lt;td&gt;All tiers&lt;/td&gt;
&lt;td&gt;$1 / credit, automated code review at $0.25 flat&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;18.11 (2026-04-16)&lt;/td&gt;
&lt;td&gt;CI Expert Agent Beta, Data Analyst Agent GA, Agentic SAST GA, Custom Flow tool option overrides, Mistral AI self-hosting, Sonnet 4.6 default for Agentic Chat, subscription + per-user credit caps, MR pipeline inputs customization&lt;/td&gt;
&lt;td&gt;Free / Premium / Ultimate&lt;/td&gt;
&lt;td&gt;"AI writes your &lt;code&gt;.gitlab-ci.yml&lt;/code&gt;" reaches Free and Self-Managed (CI Expert Beta); SAST auto-patching is GA on Ultimate&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The architectural takeaway is that the agents are &lt;strong&gt;platform-native&lt;/strong&gt;. External copilots see only the code in the IDE; Duo agents see the &lt;strong&gt;code, issues, MRs, pipelines, vulnerabilities, and runner logs of the same project at the same time&lt;/strong&gt;. That is why the CI Expert can pick the right cache pattern instead of pasting a generic template — it actually inspects the repository.&lt;/p&gt;

&lt;h2&gt;
  
  
  2. Duo Agent Platform Architecture — AI Gateway, Workflow Service, Knowledge Graph, Runner
&lt;/h2&gt;

&lt;p&gt;The 18.11 runtime decomposes into four well-defined components. Understanding their responsibilities aligns the self-hosted deployment, observability, and pricing model in one shot.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5hxlv9o7u27zc6rg24t6.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5hxlv9o7u27zc6rg24t6.png" alt="GitLab Duo Agent Platform 18.11 four-tier architecture diagram showing AI Gateway, Workflow Service, Knowledge Graph, and Runner with CI Expert, Data Analyst, and Agentic SAST agents" width="800" height="418"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  2.1 AI Gateway — One Entry for Auth, Routing, and Metering
&lt;/h3&gt;

&lt;p&gt;Every Duo call passes through the AI Gateway. It validates the GitLab user token, routes the request to the configured model provider, and meters GitLab Credits. 18.11 adds two enforceable guardrails. A &lt;strong&gt;subscription-level usage cap&lt;/strong&gt; suspends Duo Agent Platform access for the entire subscription as soon as on-demand credits cross the configured threshold. A &lt;strong&gt;per-user usage cap&lt;/strong&gt; suspends only the user who exceeded it, leaving everyone else unaffected. When both caps are configured, &lt;strong&gt;whichever cap is reached first takes effect&lt;/strong&gt;.&lt;/p&gt;
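
&lt;p&gt;The interaction between the two caps can be sketched as a small decision function. This is only an illustration of the rule described above, not GitLab's implementation: &lt;code&gt;CreditCaps&lt;/code&gt;, &lt;code&gt;access_decision&lt;/code&gt;, and the returned labels are hypothetical names, and treating "cross the threshold" as "greater than or equal to" is an assumption.&lt;/p&gt;

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class CreditCaps:
    """Hypothetical cap settings; None means the cap is not configured."""
    subscription_cap: Optional[float] = None
    per_user_cap: Optional[float] = None


def access_decision(caps: CreditCaps, subscription_used: float, user_used: float) -> str:
    """Return which suspension, if any, applies to a request from one user."""
    # Subscription-level cap: once reached, everyone on the subscription is suspended.
    if caps.subscription_cap is not None and subscription_used >= caps.subscription_cap:
        return "subscription_suspended"
    # Per-user cap: once reached, only this user is suspended.
    if caps.per_user_cap is not None and user_used >= caps.per_user_cap:
        return "user_suspended"
    return "allowed"


if __name__ == "__main__":
    caps = CreditCaps(subscription_cap=500.0, per_user_cap=40.0)
    print(access_decision(caps, subscription_used=120.0, user_used=12.5))
    print(access_decision(caps, subscription_used=120.0, user_used=40.0))
    print(access_decision(caps, subscription_used=500.0, user_used=5.0))
```

&lt;p&gt;Checking the subscription cap first reflects its wider blast radius: once it is reached, the per-user state no longer matters for that billing period.&lt;/p&gt;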

&lt;h3&gt;
  
  
  2.2 Workflow Service — Home of Custom Flow v1
&lt;/h3&gt;

&lt;p&gt;The Duo Workflow Service consumes YAML definitions written against the Flow Registry v1 specification, whose fields are &lt;code&gt;version&lt;/code&gt;, &lt;code&gt;environment&lt;/code&gt;, &lt;code&gt;components&lt;/code&gt;, &lt;code&gt;prompts&lt;/code&gt;, &lt;code&gt;routers&lt;/code&gt;, and &lt;code&gt;flow&lt;/code&gt;. Custom Flow definitions are narrower than the full specification in three ways: &lt;code&gt;environment&lt;/code&gt; accepts only &lt;code&gt;ambient&lt;/code&gt; (the &lt;code&gt;chat&lt;/code&gt; and &lt;code&gt;chat-partial&lt;/code&gt; values are restricted); the &lt;code&gt;model&lt;/code&gt; field inside a &lt;code&gt;prompts&lt;/code&gt; entry is not supported, because the model is determined by group/instance settings; and the v1 top-level &lt;code&gt;name&lt;/code&gt;, &lt;code&gt;description&lt;/code&gt;, and &lt;code&gt;product_group&lt;/code&gt; fields are restricted. The 18.11 headline addition is &lt;strong&gt;tool option and parameter overrides directly in the flow definition&lt;/strong&gt;, which lets you pin tool behaviour and enforce guardrails per flow regardless of LLM defaults.&lt;/p&gt;
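
&lt;p&gt;The three Custom Flow restrictions are easy to check before a definition reaches a reviewer. The sketch below assumes the YAML has already been parsed into a plain dictionary; &lt;code&gt;lint_custom_flow&lt;/code&gt; is a hypothetical helper written for this post, not part of any GitLab tooling, and it checks only the rules listed above.&lt;/p&gt;

```python
from typing import Any

# Top-level v1 fields that the text above describes as restricted in Custom Flows.
RESTRICTED_TOP_LEVEL = ("name", "description", "product_group")


def lint_custom_flow(definition: dict[str, Any]) -> list[str]:
    """Return human-readable problems for an already-parsed flow definition."""
    problems: list[str] = []

    for field in RESTRICTED_TOP_LEVEL:
        if field in definition:
            problems.append(f"restricted top-level field: {field}")

    # Only "ambient" is accepted; "chat" and "chat-partial" are restricted.
    if "environment" in definition and definition["environment"] != "ambient":
        problems.append("environment must be 'ambient' in a Custom Flow")

    # The model comes from group/instance settings, never from the prompt entry.
    for prompt in definition.get("prompts", []):
        if "model" in prompt:
            problems.append(f"prompt {prompt.get('id', '?')} sets unsupported field: model")

    return problems


if __name__ == "__main__":
    flow = {
        "version": "1",
        "environment": "chat",
        "name": "security_review",
        "prompts": [{"id": "review_prompt", "model": "some-model", "template": "..."}],
    }
    for problem in lint_custom_flow(flow):
        print(problem)
```

&lt;p&gt;Running a check like this in a pre-merge job keeps restricted fields out of the repository instead of surfacing them when the flow is first triggered.&lt;/p&gt;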

&lt;h3&gt;
  
  
  2.3 Knowledge Graph — The Source of "Now"
&lt;/h3&gt;

&lt;p&gt;The Knowledge Graph is a synced graph index over code, issues, MRs, pipelines, commits, and vulnerabilities. External copilots that depend on stale embeddings often answer with last week's context; Duo pulls fresh state from the Knowledge Graph on every call. That is why the CI Expert nails build/test patterns and the Agentic SAST agent can trace a vulnerable function through its callers.&lt;/p&gt;

&lt;h3&gt;
  
  
  2.4 GitLab Runner 18.11 — Where Remote Flows Actually Run
&lt;/h3&gt;

&lt;p&gt;Remote Flows execute as pipeline jobs on a GitLab Runner; the Rails backend triggers them with &lt;code&gt;start_workflow = true&lt;/code&gt;. 18.11 ships two ops-level improvements. First, a &lt;strong&gt;concrete helper image with bundled dependencies&lt;/strong&gt; reduces external image pulls at job startup. Second, the &lt;strong&gt;job router feature flag is now read from the runner config file instead of an environment variable&lt;/strong&gt;, which removes an entire class of incidents where a leaked or overridden env var broke routing silently. Bug fixes in the same release address &lt;code&gt;CONCURRENT_PROJECT_ID&lt;/code&gt; collisions, &lt;code&gt;after_script&lt;/code&gt; ordering when &lt;code&gt;pre_build_script&lt;/code&gt; fails, pipeline hangs on cache operations, and the silent fallback to job payload credentials when the &lt;code&gt;DOCKER_AUTH_CONFIG&lt;/code&gt; credential helper binary is missing.&lt;/p&gt;

&lt;h2&gt;
  
  
  3. CI Expert Agent — From an Empty &lt;code&gt;.gitlab-ci.yml&lt;/code&gt; to a Working Pipeline
&lt;/h2&gt;

&lt;p&gt;CI Expert Agent shipped in 18.11 in &lt;strong&gt;Beta on every tier (Free, Premium, Ultimate)&lt;/strong&gt; and on &lt;strong&gt;GitLab.com, Self-Managed, Dedicated, and Dedicated for Government&lt;/strong&gt;. The premise is straightforward: starting from an empty repository or empty &lt;code&gt;.gitlab-ci.yml&lt;/code&gt;, the agent inspects the codebase, asks a few guided questions about your build and test process, and emits a reviewable pipeline. It does not paste a template — it identifies language, framework, test runner, and cacheable directories from the actual code.&lt;/p&gt;

&lt;h3&gt;
  
  
  3.1 Invocation and Output
&lt;/h3&gt;

&lt;p&gt;You invoke CI Expert from the VS Code extension or the Web UI's Agentic Chat:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Web UI — Agentic Chat&lt;/span&gt;
/ci-expert
&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; Automate the build and &lt;span class="nb"&gt;test &lt;/span&gt;&lt;span class="k"&gt;for &lt;/span&gt;this project, and deploy to Kubernetes on push to main.

&lt;span class="c"&gt;# VS Code (Duo extension)&lt;/span&gt;
Cmd+Shift+P → &lt;span class="s2"&gt;"GitLab Duo: Generate CI Pipeline"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;A typical first-cut output for a Node.js + Helm project looks like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="c1"&gt;# .gitlab-ci.yml — first draft from CI Expert&lt;/span&gt;
&lt;span class="na"&gt;stages&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;install&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;test&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;build&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;deploy&lt;/span&gt;

&lt;span class="na"&gt;default&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;image&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;node:20-bookworm&lt;/span&gt;
  &lt;span class="na"&gt;cache&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;key&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;files&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;package-lock.json&lt;/span&gt;
    &lt;span class="na"&gt;paths&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;node_modules/&lt;/span&gt;

&lt;span class="na"&gt;install&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;stage&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;install&lt;/span&gt;
  &lt;span class="na"&gt;script&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;npm ci --prefer-offline&lt;/span&gt;
  &lt;span class="na"&gt;artifacts&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;paths&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;node_modules/&lt;/span&gt;
    &lt;span class="na"&gt;expire_in&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;1 hour&lt;/span&gt;

&lt;span class="na"&gt;unit-test&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;stage&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;test&lt;/span&gt;
  &lt;span class="na"&gt;needs&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;install"&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;
  &lt;span class="na"&gt;script&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;npm run test:unit -- --reporter=junit&lt;/span&gt;
  &lt;span class="na"&gt;artifacts&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;when&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;always&lt;/span&gt;
    &lt;span class="na"&gt;reports&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;junit&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;junit.xml&lt;/span&gt;

&lt;span class="na"&gt;build-image&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;stage&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;build&lt;/span&gt;
  &lt;span class="na"&gt;needs&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;unit-test"&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;
  &lt;span class="na"&gt;image&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;docker:25&lt;/span&gt;
  &lt;span class="na"&gt;services&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;docker:25-dind&lt;/span&gt;
  &lt;span class="na"&gt;script&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
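    &lt;span class="c1"&gt;# Authenticate to the project registry before pushing&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;echo "$CI_REGISTRY_PASSWORD" | docker login -u "$CI_REGISTRY_USER" --password-stdin "$CI_REGISTRY"&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;docker build -t $CI_REGISTRY_IMAGE:$CI_COMMIT_SHA .&lt;/span&gt;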
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;docker push $CI_REGISTRY_IMAGE:$CI_COMMIT_SHA&lt;/span&gt;
  &lt;span class="na"&gt;rules&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;if&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;$CI_COMMIT_BRANCH == "main"&lt;/span&gt;

&lt;span class="na"&gt;deploy-prod&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;stage&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;deploy&lt;/span&gt;
  &lt;span class="na"&gt;needs&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;build-image"&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;
  &lt;span class="na"&gt;image&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;alpine/helm:3.16.2&lt;/span&gt;
  &lt;span class="na"&gt;script&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;helm upgrade --install web ./chart&lt;/span&gt;
        &lt;span class="s"&gt;--set image.tag=$CI_COMMIT_SHA&lt;/span&gt;
        &lt;span class="s"&gt;--namespace prod --atomic --wait&lt;/span&gt;
  &lt;span class="na"&gt;environment&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;production&lt;/span&gt;
    &lt;span class="na"&gt;url&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;https://app.example.com&lt;/span&gt;
  &lt;span class="na"&gt;rules&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;if&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;$CI_COMMIT_BRANCH == "main"&lt;/span&gt;
      &lt;span class="na"&gt;when&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;manual&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Each decision comes with a plain-language rationale. For example: "We use &lt;code&gt;npm ci --prefer-offline&lt;/code&gt; because &lt;code&gt;package-lock.json&lt;/code&gt; exists, and we set it as the cache key so dependencies only reinstall when it changes." That rationale lands in the MR description, which materially reduces reviewer friction.&lt;/p&gt;

&lt;h3&gt;
  
  
  3.2 Pipeline Failure Triage
&lt;/h3&gt;

&lt;p&gt;CI Expert is not just for greenfield projects — it triages broken pipelines. It reads the failed job log, classifies issues like cache misses, missing tests, or image pull failures, and proposes a patch.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Trigger directly from a failed job&lt;/span&gt;
gitlab-duo ci-expert debug &lt;span class="nt"&gt;--job-id&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;42 &lt;span class="nt"&gt;--project&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;acme/web

&lt;span class="c"&gt;# Sample output&lt;/span&gt;
Diagnosis:
  - Cache miss: package-lock.json modified since last successful run.
  - Suggested: pin the Node image to 20.18.1. No extra lockfile flag is needed, because npm ci already fails when package.json and package-lock.json disagree.
Patch &lt;span class="o"&gt;(&lt;/span&gt;preview&lt;span class="o"&gt;)&lt;/span&gt;:
    default:
  -   image: node:20-bookworm
  +   image: node:20.18.1-bookworm
Apply? &lt;span class="o"&gt;[&lt;/span&gt;y/N]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  4. Agentic SAST Vulnerability Resolution — Patch MRs Open the Moment a Scan Closes
&lt;/h2&gt;

&lt;p&gt;Agentic SAST Vulnerability Resolution debuted as a Beta in 18.9 and is &lt;strong&gt;GA in 18.11 on Ultimate&lt;/strong&gt;. The principle is "scan ends → next step happens immediately." When the SAST scan on the main branch completes, false-positive detection runs first; the surviving Critical/High findings are handed to the agent, which uses multi-shot reasoning to traverse the surrounding code and &lt;strong&gt;opens a fix MR plus a verification pipeline automatically&lt;/strong&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  4.1 The Workflow
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;The SAST scanner finishes on the main branch and writes the vulnerability list.&lt;/li&gt;
&lt;li&gt;SAST False Positive Detection runs to drop noisy findings.&lt;/li&gt;
&lt;li&gt;The agent analyses each remaining Critical/High vulnerability in context.&lt;/li&gt;
&lt;li&gt;If confidence is sufficient, an MR is created and a SAST scan re-runs against the proposed patch to confirm the fix.&lt;/li&gt;
&lt;li&gt;The vulnerability detail page exposes a one-click "Apply resolution" button.&lt;/li&gt;
&lt;/ol&gt;
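
&lt;p&gt;The decision logic in steps 2 to 4 can be sketched as a simple triage function. Everything here is illustrative: the &lt;code&gt;Finding&lt;/code&gt; fields, the numeric &lt;code&gt;fix_confidence&lt;/code&gt; score, and the 0.8 threshold are assumptions made for the example, since the actual confidence model is internal to GitLab.&lt;/p&gt;

```python
from dataclasses import dataclass

# Severities handed to the agent by default, per the workflow above.
AUTO_RESOLVE_SEVERITIES = {"critical", "high"}


@dataclass(frozen=True)
class Finding:
    identifier: str
    severity: str
    false_positive: bool   # result of the false-positive detection step
    fix_confidence: float  # hypothetical score between 0.0 and 1.0


def plan_resolution(findings: list[Finding], min_confidence: float = 0.8) -> dict[str, list[str]]:
    """Sort findings into: open a fix MR, leave for a human, or skip."""
    plan: dict[str, list[str]] = {"open_fix_mr": [], "needs_human": [], "skipped": []}
    for finding in findings:
        if finding.false_positive or finding.severity.lower() not in AUTO_RESOLVE_SEVERITIES:
            plan["skipped"].append(finding.identifier)
        elif finding.fix_confidence >= min_confidence:
            plan["open_fix_mr"].append(finding.identifier)
        else:
            plan["needs_human"].append(finding.identifier)
    return plan


if __name__ == "__main__":
    findings = [
        Finding("V1", "critical", False, 0.93),
        Finding("V2", "high", True, 0.99),
        Finding("V3", "high", False, 0.41),
        Finding("V4", "medium", False, 0.95),
    ]
    print(plan_resolution(findings))
```

&lt;p&gt;The ordering matters: false positives and out-of-scope severities are dropped before confidence is considered, so a noisy finding never produces an MR regardless of how confident a fix looks.&lt;/p&gt;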

&lt;h3&gt;
  
  
  4.2 Language Coverage and Triggers
&lt;/h3&gt;

&lt;p&gt;The auto-fix is most reliable for languages directly supported by GitLab Advanced SAST: &lt;strong&gt;C, C++, C#, Go, Java, JavaScript, Python, Ruby, and TypeScript&lt;/strong&gt;. Other languages fall back to the Semgrep-based analyzer with reduced fix confidence. You can also manually trigger the agent for any SAST vulnerability from its detail page.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="c1"&gt;# .gitlab-ci.yml — Agentic SAST integration (Ultimate)&lt;/span&gt;
&lt;span class="na"&gt;include&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;template&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Jobs/SAST.gitlab-ci.yml&lt;/span&gt;

&lt;span class="na"&gt;variables&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="c1"&gt;# Auto-fix after SAST (default true in 18.11)&lt;/span&gt;
  &lt;span class="na"&gt;SAST_DUO_AUTO_RESOLVE&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;true"&lt;/span&gt;
  &lt;span class="c1"&gt;# Extend to medium severity if desired&lt;/span&gt;
  &lt;span class="na"&gt;SAST_DUO_AUTO_RESOLVE_SEVERITY&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;critical,high,medium"&lt;/span&gt;
  &lt;span class="c1"&gt;# Branch the auto-fix MR targets&lt;/span&gt;
  &lt;span class="na"&gt;SAST_DUO_AUTO_RESOLVE_TARGET_REF&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;$CI_DEFAULT_BRANCH"&lt;/span&gt;

&lt;span class="c1"&gt;# Extra SAST verification for auto-fix MRs&lt;/span&gt;
&lt;span class="na"&gt;sast-verify&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;stage&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;test&lt;/span&gt;
  &lt;span class="na"&gt;rules&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;if&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;$CI_MERGE_REQUEST_LABELS =~ /duo:auto-fix/&lt;/span&gt;
  &lt;span class="na"&gt;script&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;echo "Re-running SAST on auto-generated fix MR"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  4.3 Troubleshooting — "The MR was created but the diff is empty"
&lt;/h3&gt;

&lt;blockquote&gt;
&lt;p&gt;Symptom: the vulnerability page shows "Resolution generated" but the MR diff is empty.&lt;br&gt;&lt;br&gt;
Cause: the agent's confidence assessment fell below threshold. 18.11 will not force a low-quality patch; it leaves an "Insufficient context" reason and a recommended next step (e.g., add the calling functions) in the MR description. Fix: confirm the Knowledge Graph index for the codebase is current, and re-trigger from the vulnerability detail page if needed.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  5. Custom Flow YAML Schema — How to Codify Your Own Workflow
&lt;/h2&gt;

&lt;p&gt;For platform engineering teams, the most attractive 18.11 capability is &lt;strong&gt;tool option overrides&lt;/strong&gt; combined with &lt;strong&gt;MCP integration&lt;/strong&gt;. Below is a simplified version of the "security review assistant" Custom Flow that the ManoIT security team uses to pull in Confluence security guidelines as additional context.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="c1"&gt;# .gitlab/duo/flows/security_review.yml&lt;/span&gt;
&lt;span class="na"&gt;version&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;1"&lt;/span&gt;
&lt;span class="na"&gt;environment&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;ambient&lt;/span&gt;    &lt;span class="c1"&gt;# custom flows allow only "ambient"&lt;/span&gt;
&lt;span class="na"&gt;components&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;collect_context&lt;/span&gt;
    &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;AgentComponent&lt;/span&gt;
    &lt;span class="na"&gt;prompt_id&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;collect_context_prompt&lt;/span&gt;
    &lt;span class="na"&gt;inputs&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;from&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;context:project_id"&lt;/span&gt;
        &lt;span class="na"&gt;as&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;project_id"&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;from&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;context:goal"&lt;/span&gt;
        &lt;span class="na"&gt;as&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;mr_iid"&lt;/span&gt;
    &lt;span class="c1"&gt;# 18.11: tool overrides pin parameters regardless of LLM defaults&lt;/span&gt;
    &lt;span class="na"&gt;tool_overrides&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;tool&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;gitlab.read_merge_request"&lt;/span&gt;
        &lt;span class="na"&gt;params&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;include_diff&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
          &lt;span class="na"&gt;include_pipelines&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
          &lt;span class="na"&gt;max_files&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;50&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;tool&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;mcp.confluence.search"&lt;/span&gt;
        &lt;span class="na"&gt;params&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;space&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;SECURITY"&lt;/span&gt;
          &lt;span class="na"&gt;max_results&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;10&lt;/span&gt;

  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;review&lt;/span&gt;
    &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;AgentComponent&lt;/span&gt;
    &lt;span class="na"&gt;prompt_id&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;security_review_prompt&lt;/span&gt;
    &lt;span class="na"&gt;inputs&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;from&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;components.collect_context.output"&lt;/span&gt;
        &lt;span class="na"&gt;as&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;evidence"&lt;/span&gt;

&lt;span class="na"&gt;prompts&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;id&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;collect_context_prompt&lt;/span&gt;
    &lt;span class="na"&gt;template&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;|&lt;/span&gt;
      &lt;span class="s"&gt;Read MR {{ mr_iid }} in project {{ project_id }}.&lt;/span&gt;
      &lt;span class="s"&gt;Pull related security guidelines from Confluence SECURITY space.&lt;/span&gt;
      &lt;span class="s"&gt;Output structured evidence.&lt;/span&gt;

  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;id&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;security_review_prompt&lt;/span&gt;
    &lt;span class="na"&gt;template&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;|&lt;/span&gt;
      &lt;span class="s"&gt;Given evidence: {{ evidence }}&lt;/span&gt;
      &lt;span class="s"&gt;Produce a security review with:&lt;/span&gt;
      &lt;span class="s"&gt;- Risk classification (low/medium/high/critical)&lt;/span&gt;
      &lt;span class="s"&gt;- Specific code locations of concern&lt;/span&gt;
      &lt;span class="s"&gt;- Suggested mitigations referencing internal policies&lt;/span&gt;

&lt;span class="na"&gt;routers&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;from&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;collect_context&lt;/span&gt;
    &lt;span class="na"&gt;to&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;review&lt;/span&gt;
&lt;span class="na"&gt;flow&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;collect_context&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;review&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Two things matter here. First, &lt;code&gt;tool_overrides&lt;/code&gt; (new in 18.11) &lt;strong&gt;pins tool call parameters so the LLM cannot drift from the contract&lt;/strong&gt;; the guardrails travel with the flow. Second, MCP tools (e.g., &lt;code&gt;mcp.confluence.search&lt;/code&gt;) are first-class citizens alongside built-in GitLab tools. MCP configuration lives in a separate file, &lt;code&gt;.gitlab/duo/mcp.json&lt;/code&gt;, gated by Code Owners approval.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"mcpServers"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"confluence"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"command"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"npx"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"args"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"-y"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"@atlassian/mcp-confluence"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"env"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"CONFLUENCE_BASE_URL"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"https://acme.atlassian.net/wiki"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"CONFLUENCE_TOKEN"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"$CONFLUENCE_API_TOKEN"&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"slack"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"command"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"npx"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"args"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"-y"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"@modelcontextprotocol/server-slack"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"env"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"SLACK_BOT_TOKEN"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"$SLACK_DUO_BOT_TOKEN"&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  5.1 Custom Flow Trigger Mapping
&lt;/h3&gt;

&lt;p&gt;A Custom Flow can react to three trigger types, and each delivers a different shape of &lt;code&gt;context:goal&lt;/code&gt;. Knowing this prevents flows from behaving inconsistently across triggers.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Trigger&lt;/th&gt;
&lt;th&gt;
&lt;code&gt;context:goal&lt;/code&gt; format&lt;/th&gt;
&lt;th&gt;Example&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Mention event&lt;/td&gt;
&lt;td&gt;&lt;code&gt;Input: &amp;lt;comment_text&amp;gt;\nContext: {&amp;lt;resource_type&amp;gt; IID: &amp;lt;iid&amp;gt;}&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;@ai-security Can you review?&lt;/code&gt; on issue #2&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Assign / Assign reviewer&lt;/td&gt;
&lt;td&gt;A single integer (the resource IID)&lt;/td&gt;
&lt;td&gt;Assigning the service account as reviewer on MR !10 → &lt;code&gt;10&lt;/code&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Pipeline event&lt;/td&gt;
&lt;td&gt;The full pipeline event webhook payload&lt;/td&gt;
&lt;td&gt;Auto-analysis of a failed pipeline&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;
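
&lt;p&gt;Because the three shapes differ this much, a flow that handles more than one trigger usually benefits from normalizing &lt;code&gt;context:goal&lt;/code&gt; before doing anything else. The following is a minimal Python sketch, not GitLab's own code: &lt;code&gt;parse_goal&lt;/code&gt; and &lt;code&gt;Goal&lt;/code&gt; are hypothetical names, and it assumes the mention shape arrives with a real newline between its two parts and that the pipeline payload is a JSON object.&lt;/p&gt;

```python
import json
import re
from dataclasses import dataclass
from typing import Optional

# Mention shape from the table: "Input: (comment)" newline "Context: {(type) IID: (iid)}"
_MENTION_RE = re.compile(r"^Input: (.*)\nContext: \{(.+?) IID: (\d+)\}\s*$", re.DOTALL)


@dataclass
class Goal:
    trigger: str                         # "mention", "assign", or "pipeline"
    iid: Optional[int] = None            # resource IID, when the trigger carries one
    resource_type: Optional[str] = None  # mention trigger only
    comment: Optional[str] = None        # mention trigger only
    payload: Optional[dict] = None       # parsed webhook body, pipeline trigger only


def parse_goal(raw: str) -> Goal:
    """Normalize the three context:goal shapes into one structure."""
    text = raw.strip()
    # Assign / Assign reviewer: a single integer (the resource IID).
    if text.isdigit():
        return Goal(trigger="assign", iid=int(text))
    # Mention event: comment text plus a resource reference.
    match = _MENTION_RE.match(text)
    if match:
        comment, resource_type, iid = match.groups()
        return Goal(trigger="mention", iid=int(iid),
                    resource_type=resource_type, comment=comment)
    # Pipeline event: assumed to be the webhook payload serialized as JSON.
    try:
        payload = json.loads(text)
    except json.JSONDecodeError as exc:
        raise ValueError("unrecognized context:goal shape") from exc
    if not isinstance(payload, dict):
        raise ValueError("pipeline payload is expected to be a JSON object")
    return Goal(trigger="pipeline", payload=payload)
```

&lt;p&gt;The integer check runs first on purpose: a bare IID also parses as JSON, so testing the JSON branch earlier would misroute assign triggers.&lt;/p&gt;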

&lt;h2&gt;
  
  
  6. GitLab Credits — Usage-Based Pricing and the 18.11 Guardrails
&lt;/h2&gt;

&lt;p&gt;Duo Agent Platform's cost model standardizes on &lt;strong&gt;GitLab Credits&lt;/strong&gt;: &lt;strong&gt;1 credit = $1 on-demand list price&lt;/strong&gt;, billed monthly in arrears. 18.10 introduced &lt;strong&gt;automated code reviews at a flat $0.25 per MR&lt;/strong&gt;, regardless of size. 18.11 added the two enforcement guardrails operations teams had been asking for.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Control&lt;/th&gt;
&lt;th&gt;Scope&lt;/th&gt;
&lt;th&gt;Behaviour on hit&lt;/th&gt;
&lt;th&gt;Notes&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Subscription-level cap&lt;/td&gt;
&lt;td&gt;Whole subscription&lt;/td&gt;
&lt;td&gt;Auto-suspends Duo Agent Platform until the next billing period; admin email notification&lt;/td&gt;
&lt;td&gt;Hard guardrail against unexpected overage&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Per-user cap&lt;/td&gt;
&lt;td&gt;Individual user&lt;/td&gt;
&lt;td&gt;Suspends only that user, others unaffected&lt;/td&gt;
&lt;td&gt;Stops one user from absorbing the subscription pool&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Credits dashboard history&lt;/td&gt;
&lt;td&gt;Subscription&lt;/td&gt;
&lt;td&gt;No enforcement; browse past months day by day, compare consumption, and reconcile with invoices&lt;/td&gt;
&lt;td&gt;FinOps standard input&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Both caps can run simultaneously, and &lt;strong&gt;whichever is hit first applies&lt;/strong&gt;. Caps reset automatically each billing period. ManoIT's operational guidance:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Set the &lt;strong&gt;per-user cap&lt;/strong&gt; at ~1.5× the average developer's expected usage (e.g., 50 credits/month) to absorb the learning ramp without runaway spend.&lt;/li&gt;
&lt;li&gt;Set the &lt;strong&gt;subscription cap&lt;/strong&gt; conservatively in the first quarter (e.g., 80% of forecast) and tighten or relax it after two cycles of dashboard data.&lt;/li&gt;
&lt;li&gt;Agentic SAST runs in the background, so budget separately for it depending on scan cadence (every push vs. nightly).&lt;/li&gt;
&lt;/ul&gt;
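
&lt;p&gt;The arithmetic behind the first two bullets is simple enough to script when caps are revisited each cycle. This is a rough Python sketch under stated assumptions: &lt;code&gt;suggest_caps&lt;/code&gt; and &lt;code&gt;blocking_cap&lt;/code&gt; are hypothetical helpers, the 1.5× and 80% factors come from the guidance above, and the input numbers in any call are whatever your own dashboard reports. It does not call a GitLab API.&lt;/p&gt;

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class CreditCaps:
    per_user: float      # credits per user per billing period
    subscription: float  # credits for the whole subscription per billing period


def suggest_caps(avg_user_credits: float, forecast_credits: float,
                 user_headroom: float = 1.5,
                 subscription_ratio: float = 0.8) -> CreditCaps:
    """Derive starting caps from average per-user usage and the spend forecast."""
    if not (avg_user_credits >= 0 and forecast_credits >= 0):
        raise ValueError("usage and forecast must be non-negative")
    return CreditCaps(
        per_user=round(avg_user_credits * user_headroom, 2),
        subscription=round(forecast_credits * subscription_ratio, 2),
    )


def blocking_cap(user_used: float, total_used: float,
                 caps: CreditCaps) -> Optional[str]:
    """Return which cap currently blocks a user, or None if neither is hit."""
    # The subscription cap suspends everyone, so it is checked first.
    if total_used >= caps.subscription:
        return "subscription"
    if user_used >= caps.per_user:
        return "user"
    return None
```

&lt;p&gt;Feeding two cycles of dashboard data back into &lt;code&gt;suggest_caps&lt;/code&gt; gives a repeatable way to tighten or relax the caps instead of adjusting them by feel.&lt;/p&gt;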

&lt;h2&gt;
  
  
  7. Model Policy — Sonnet 4.6 Default and Mistral for Self-Hosting
&lt;/h2&gt;

&lt;p&gt;Two changes sum up 18.11's model policy: &lt;strong&gt;Agentic Chat's default model moved from Claude Haiku 4.5 to Claude Sonnet 4.6 hosted on Vertex AI&lt;/strong&gt;, and &lt;strong&gt;Mistral AI joined the supported self-hosted LLM list&lt;/strong&gt;. Sonnet 4.6 reasons better but carries a higher GitLab Credit multiplier than Haiku, so the new default can raise spend for the same workload. Model selections you configured explicitly before the upgrade are preserved.&lt;/p&gt;

&lt;h3&gt;
  
  
  7.1 Model Selection Cheatsheet
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Workload&lt;/th&gt;
&lt;th&gt;Recommended model&lt;/th&gt;
&lt;th&gt;Reason&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;General Agentic Chat&lt;/td&gt;
&lt;td&gt;Claude Sonnet 4.6 (default)&lt;/td&gt;
&lt;td&gt;Best code reasoning and MR review quality; 18.11 default&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;High-volume background calls (Agentic SAST)&lt;/td&gt;
&lt;td&gt;Claude Haiku 4.5 or self-hosted Mistral&lt;/td&gt;
&lt;td&gt;Lower credit multiplier&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Air-gapped finance/government&lt;/td&gt;
&lt;td&gt;Mistral AI or Vertex AI Private&lt;/td&gt;
&lt;td&gt;No external LLM hop; new in 18.11&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Long-context codebase analysis&lt;/td&gt;
&lt;td&gt;Vertex AI Gemini 2.5 Pro or Sonnet 4.6&lt;/td&gt;
&lt;td&gt;Long context plus reasoning depth&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="c1"&gt;# config/gitlab.rb — adding Mistral to self-hosted models (new in 18.11)&lt;/span&gt;
&lt;span class="s"&gt;gitlab_rails['duo_self_hosted_models'] = [&lt;/span&gt;
  &lt;span class="s"&gt;{&lt;/span&gt;
    &lt;span class="s"&gt;'name' =&amp;gt; 'mistral-large-latest',&lt;/span&gt;
    &lt;span class="s"&gt;'provider' =&amp;gt; 'mistral',&lt;/span&gt;
    &lt;span class="s"&gt;'endpoint' =&amp;gt; 'https://api.mistral.ai/v1',&lt;/span&gt;
    &lt;span class="s"&gt;'api_key_env' =&amp;gt; 'MISTRAL_API_KEY',&lt;/span&gt;
    &lt;span class="s"&gt;'use_for' =&amp;gt; ['code_completion', 'agentic_chat']&lt;/span&gt;
  &lt;span class="s"&gt;},&lt;/span&gt;
  &lt;span class="s"&gt;{&lt;/span&gt;
    &lt;span class="s"&gt;'name' =&amp;gt; 'claude-sonnet-4-6',&lt;/span&gt;
    &lt;span class="s"&gt;'provider' =&amp;gt; 'vertex_ai',&lt;/span&gt;
    &lt;span class="s"&gt;'endpoint' =&amp;gt; 'https://us-central1-aiplatform.googleapis.com/v1',&lt;/span&gt;
    &lt;span class="s"&gt;'project_id' =&amp;gt; 'acme-prod-ai',&lt;/span&gt;
    &lt;span class="s"&gt;'use_for' =&amp;gt; ['agentic_chat', 'agentic_sast']&lt;/span&gt;
  &lt;span class="s"&gt;}&lt;/span&gt;
&lt;span class="err"&gt;]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  8. 18.11 Migration Checklist — Forced PostgreSQL 17 and MR Pipeline Inputs
&lt;/h2&gt;

&lt;p&gt;Finally, here is the migration checklist the ManoIT operations team wrote up while moving to 18.11. &lt;strong&gt;The most consequential change is the forced PostgreSQL 17 path&lt;/strong&gt;: GitLab 19.0 raises the minimum supported version to PostgreSQL 17. Instances not running PostgreSQL Cluster are auto-upgraded to PostgreSQL 17 during the 18.11 upgrade. PostgreSQL Cluster users (and anyone who opts out of the auto-upgrade) must move to PostgreSQL 17 manually before 19.0.&lt;/p&gt;

&lt;h3&gt;
  
  
  8.1 MR Pipeline Inputs Customization
&lt;/h3&gt;

&lt;p&gt;18.11 lets you &lt;strong&gt;override &lt;code&gt;spec:inputs&lt;/code&gt; values per merge request pipeline run&lt;/strong&gt;, so you can rerun the same pipeline against a different environment without code changes.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="c1"&gt;# .gitlab-ci.yml — 18.11 MR pipeline inputs&lt;/span&gt;
&lt;span class="na"&gt;spec&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;inputs&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;target_env&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;default&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;staging"&lt;/span&gt;
      &lt;span class="na"&gt;options&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;staging"&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;uat"&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;prod"&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;
    &lt;span class="na"&gt;canary_traffic&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;default&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;10&lt;/span&gt;
      &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;number&lt;/span&gt;
&lt;span class="nn"&gt;---&lt;/span&gt;
&lt;span class="na"&gt;deploy&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;stage&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;deploy&lt;/span&gt;
  &lt;span class="na"&gt;script&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;helm upgrade web ./chart&lt;/span&gt;
        &lt;span class="s"&gt;--set env=$[[ inputs.target_env ]]&lt;/span&gt;
        &lt;span class="s"&gt;--set canaryTraffic=$[[ inputs.canary_traffic ]]&lt;/span&gt;
  &lt;span class="na"&gt;rules&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;if&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;$CI_PIPELINE_SOURCE == "merge_request_event"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  8.2 Operational Checklist
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Backup and rollback plan&lt;/strong&gt; — full PostgreSQL backup and WAL archive verified before the auto-upgrade.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Runner upgrade&lt;/strong&gt; — move runners to 18.11 to pick up the concrete helper image and the runner-config-driven job router. Remove any scripts that forced job router via env vars.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Agentic SAST policy&lt;/strong&gt; — decide whether to enable the GA auto-fix and define the MR label workflow (e.g., &lt;code&gt;duo:auto-fix&lt;/code&gt;).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Credits caps&lt;/strong&gt; — configure both subscription and per-user caps. Compare actual consumption on the Credits dashboard for the first 7 days.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Model policy&lt;/strong&gt; — assess Sonnet 4.6 default cost impact; route background calls (Agentic SAST etc.) to Haiku or Mistral.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Custom Flow audit&lt;/strong&gt; — change &lt;code&gt;environment&lt;/code&gt; from &lt;code&gt;chat&lt;/code&gt; / &lt;code&gt;chat-partial&lt;/code&gt; to &lt;code&gt;ambient&lt;/code&gt;. Remove fields rejected by 18.11: &lt;code&gt;model&lt;/code&gt; in prompts, &lt;code&gt;response_schema_id&lt;/code&gt;, &lt;code&gt;stop&lt;/code&gt;, top-level &lt;code&gt;name&lt;/code&gt;, &lt;code&gt;description&lt;/code&gt;, &lt;code&gt;product_group&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;MCP governance&lt;/strong&gt; — &lt;code&gt;.gitlab/duo/mcp.json&lt;/code&gt; requires Code Owners approval. Inject tokens via GitLab CI/CD variables or Vault.&lt;/li&gt;
&lt;/ol&gt;
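
&lt;p&gt;Item 6 is the easiest one to automate. The sketch below shows one way to audit a Custom Flow definition after it has been loaded into a Python dict. It is an illustration, not an official migration tool: &lt;code&gt;audit_flow&lt;/code&gt; is a hypothetical helper, and it assumes that &lt;code&gt;model&lt;/code&gt;, &lt;code&gt;response_schema_id&lt;/code&gt;, and &lt;code&gt;stop&lt;/code&gt; sit inside entries of a &lt;code&gt;prompts&lt;/code&gt; list while &lt;code&gt;name&lt;/code&gt;, &lt;code&gt;description&lt;/code&gt;, and &lt;code&gt;product_group&lt;/code&gt; are top-level keys. Adjust the two tuples if your flow schema places them differently.&lt;/p&gt;

```python
import copy

# Assumed placement of the rejected fields; verify against your own flow files.
TOP_LEVEL_REJECTED = ("name", "description", "product_group")
PROMPT_REJECTED = ("model", "response_schema_id", "stop")
LEGACY_ENVIRONMENTS = ("chat", "chat-partial")


def audit_flow(flow: dict) -> tuple[dict, list[str]]:
    """Return a cleaned copy of the flow plus a log of every change made."""
    cleaned = copy.deepcopy(flow)  # never mutate the caller's data
    changes: list[str] = []

    # chat / chat-partial environments move to ambient.
    if cleaned.get("environment") in LEGACY_ENVIRONMENTS:
        changes.append(f"environment: {cleaned['environment']} -> ambient")
        cleaned["environment"] = "ambient"

    # Drop rejected top-level keys.
    for key in TOP_LEVEL_REJECTED:
        if key in cleaned:
            del cleaned[key]
            changes.append(f"removed top-level {key}")

    # Drop rejected keys from each prompt entry.
    for index, prompt in enumerate(cleaned.get("prompts", [])):
        for key in PROMPT_REJECTED:
            if key in prompt:
                del prompt[key]
                changes.append(f"removed prompts[{index}].{key}")

    return cleaned, changes
```

&lt;p&gt;Returning the change log alongside the cleaned copy makes it easy to attach the audit result to the merge request that updates the flow.&lt;/p&gt;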

&lt;h2&gt;
  
  
  9. Conclusion — "AI-Native DevSecOps" Is Now the GitLab Default Surface
&lt;/h2&gt;

&lt;p&gt;By May 2026, GitLab 18.11 makes two things explicit. First, &lt;strong&gt;Duo Agent Platform is no longer an option — it is the default surface&lt;/strong&gt;. The moment CI Expert reaches every tier, "AI writes your pipeline" stops being marketing and becomes the new-project baseline experience. Second, &lt;strong&gt;operations finally has hard guardrails&lt;/strong&gt;. Subscription and per-user caps are not advisory — they cut access the instant the threshold is hit.&lt;/p&gt;

&lt;p&gt;ManoIT's roadmap for the next quarter splits into three workstreams. First, enable Agentic SAST GA to compress patch lead time for SOC 2 / ISO reporting into single-digit days. Second, standardize a code-review and release-notes automation pipeline using Custom Flows + MCP to bind Confluence, Jira, and Slack as live context. Third, layer OpenTelemetry-derived metrics on top of the GitLab Credits FinOps dashboard so cost, quality, and lead time live on the same screen. 18.11 is the first GitLab release that makes all three of those workstreams feasible in production.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Originally published at &lt;a href="https://www.manoit.co.kr/forum/view/1466071" rel="noopener noreferrer"&gt;ManoIT Tech Blog&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>gitlab</category>
      <category>devops</category>
      <category>cicd</category>
      <category>ai</category>
    </item>
    <item>
      <title>Google ADK 1.0 + A2A Protocol — Python/Go/Java/TypeScript GA, AgentCard, Task, and SSE Redefining the 2026 Multi-Agent Standard</title>
      <dc:creator>daniel jeong</dc:creator>
      <pubDate>Sun, 03 May 2026 23:44:47 +0000</pubDate>
      <link>https://dev.to/x4nent/google-adk-10-a2a-protocol-pythongojavatypescript-ga-agentcard-task-and-sse-redefining-58e5</link>
      <guid>https://dev.to/x4nent/google-adk-10-a2a-protocol-pythongojavatypescript-ga-agentcard-task-and-sse-redefining-58e5</guid>
      <description>&lt;h1&gt;
  
  
  Google ADK 1.0 + A2A Protocol — Python/Go/Java/TypeScript GA, AgentCard, Task, and SSE Redefining the 2026 Multi-Agent Standard
&lt;/h1&gt;

&lt;p&gt;At Google Cloud Next 2026 in April, the &lt;strong&gt;Agent Development Kit (ADK)&lt;/strong&gt; graduated to &lt;strong&gt;1.0 GA across four languages simultaneously: Python, Go, Java, and TypeScript&lt;/strong&gt;. In the same window, the &lt;strong&gt;Agent2Agent (A2A) protocol&lt;/strong&gt; reached &lt;strong&gt;150+ organizations in production&lt;/strong&gt; under the Linux Foundation, and Anthropic's MCP (donated to the Linux Foundation in December 2025) settled in as the de facto standard for tool calls. The 2026 multi-agent stack now sits on a clean separation: &lt;strong&gt;MCP for tools, A2A for agents, ADK (or equivalent) for orchestration&lt;/strong&gt;. This post documents the architectural decisions, code patterns, and operational checklist that the ManoIT AI platform team produced while integrating ADK 1.0 + A2A into our internal RAG and orchestration backend.&lt;/p&gt;

&lt;h2&gt;
  
  
  1. Why May 2026 Is the Inflection Point
&lt;/h2&gt;

&lt;p&gt;The agent landscape in 2025 was a framework war: LangGraph for graph control, CrewAI for ergonomic multi-agent setups, AutoGen for research, and vendor-specific SDKs from OpenAI and Anthropic. By April–May 2026 that fragmentation had collapsed onto two protocols and one cross-language SDK.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Axis&lt;/th&gt;
&lt;th&gt;H1 2025&lt;/th&gt;
&lt;th&gt;May 2026&lt;/th&gt;
&lt;th&gt;Practical impact&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Tool calls&lt;/td&gt;
&lt;td&gt;Per-framework SDKs&lt;/td&gt;
&lt;td&gt;MCP (Linux Foundation)&lt;/td&gt;
&lt;td&gt;A tool built once works across Claude, Gemini, GPT, and self-hosted LLMs&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Agent-to-agent&lt;/td&gt;
&lt;td&gt;Vendor silos&lt;/td&gt;
&lt;td&gt;A2A v1.0, 150+ orgs in prod&lt;/td&gt;
&lt;td&gt;Salesforce ↔ Vertex ↔ ServiceNow speak one wire format&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Code-first SDK&lt;/td&gt;
&lt;td&gt;Fragmented&lt;/td&gt;
&lt;td&gt;ADK 1.0 GA in Py/Go/Java/TS&lt;/td&gt;
&lt;td&gt;One set of semantics, four languages, model-agnostic&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Managed runtime&lt;/td&gt;
&lt;td&gt;DIY&lt;/td&gt;
&lt;td&gt;Vertex AI Agent Engine + Bedrock AgentCore + Azure AI Foundry&lt;/td&gt;
&lt;td&gt;Sessions, memory, observability as a service&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Vertex AI brand&lt;/td&gt;
&lt;td&gt;Vertex AI Agent Builder&lt;/td&gt;
&lt;td&gt;Gemini Enterprise Agent Platform&lt;/td&gt;
&lt;td&gt;Catalog of 200+ models including Claude&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The bottom line: &lt;strong&gt;picking a multi-agent system is no longer a framework decision but a protocol-fit decision&lt;/strong&gt;. Whichever framework you choose has to consume MCP tools and accept A2A delegations. ADK 1.0 is the first cross-language SDK that treats both as first-class citizens.&lt;/p&gt;

&lt;h2&gt;
  
  
  2. ADK 1.0 GA — Feature Parity Across Four Languages
&lt;/h2&gt;

&lt;p&gt;The headline of ADK 1.0 is &lt;strong&gt;feature parity&lt;/strong&gt;. Where the LangChain ecosystem in 2025 was Python-first, JS-second, and other languages a distant third, ADK 1.0 explicitly aligned all four runtimes so that an agent prototyped in Python can be re-implemented in Java for production without semantic drift.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Language&lt;/th&gt;
&lt;th&gt;Release&lt;/th&gt;
&lt;th&gt;Signature 1.0 features&lt;/th&gt;
&lt;th&gt;Typical adoption&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Python&lt;/td&gt;
&lt;td&gt;Continuous (bi-weekly)&lt;/td&gt;
&lt;td&gt;Plugin system, Event Compaction, FileArtifactService, Service Registry (&lt;code&gt;services.yaml&lt;/code&gt;), Vertex AI Agent Engine Sandbox code execution, Anthropic thinking blocks, OTel agentic metrics&lt;/td&gt;
&lt;td&gt;RAG, LangGraph replacement, research&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Java&lt;/td&gt;
&lt;td&gt;1.0.0 (Apr 2026)&lt;/td&gt;
&lt;td&gt;GoogleMapsTool, UrlContextTool, ContainerCodeExecutor, VertexAICodeExecutor, ComputerUseTool, HITL ToolConfirmation, Event Compaction, native A2A&lt;/td&gt;
&lt;td&gt;Finance, SI, enterprise backends&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Go&lt;/td&gt;
&lt;td&gt;1.0 (Apr 2026)&lt;/td&gt;
&lt;td&gt;YAML agent config, native OpenTelemetry, cross-language parity&lt;/td&gt;
&lt;td&gt;High-concurrency microservices, edge&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;TypeScript&lt;/td&gt;
&lt;td&gt;Aligned to 1.0&lt;/td&gt;
&lt;td&gt;Same patterns from front-end to BFF&lt;/td&gt;
&lt;td&gt;Web chat, BFF, Slack/Teams bots&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  2.1 ADK Python — Plugins and the Service Registry
&lt;/h3&gt;

&lt;p&gt;The Python 1.0 line centers on two abstractions worth understanding. First, a &lt;strong&gt;Plugin&lt;/strong&gt; is a cross-cutting hook registered once on the Runner and invoked globally before and after every Agent, Model, and Tool call. Second, the &lt;strong&gt;Service Registry&lt;/strong&gt; lets you swap session, artifact, and memory backends declaratively in &lt;code&gt;services.py&lt;/code&gt; or &lt;code&gt;services.yaml&lt;/code&gt;, so the same agent can run on in-memory backends locally and on Vertex AI Memory Bank, Firestore sessions, and GCS artifacts in production without code changes.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# agents/customer_support/agent.py
# Note: ADK 1.0 requires Python 3.10+
&lt;/span&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;google.adk.agents&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;LlmAgent&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;google.adk.tools&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;google_search&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;google.adk.tools.openapi_tool&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;OpenAPIToolset&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;logging&lt;/span&gt;

&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;google.adk.plugins.base_plugin&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;BasePlugin&lt;/span&gt;

&lt;span class="n"&gt;logger&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;logging&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getLogger&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;__name__&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# 1) Cross-cutting concerns: PII masking and tool-call logging as a Plugin.
# Plugin callbacks take keyword-only arguments; returning None keeps the normal flow.
&lt;/span&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;GuardrailPlugin&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;BasePlugin&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;_mask_pii&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;contents&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="c1"&gt;# Placeholder: plug in your own redaction logic here.&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;contents&lt;/span&gt;

    &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;before_model_callback&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;callback_context&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;llm_request&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;llm_request&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;contents&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;_mask_pii&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;llm_request&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;contents&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;  &lt;span class="c1"&gt;# a non-None return would replace the model response&lt;/span&gt;

    &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;after_tool_callback&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;tool&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;tool_args&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;tool_context&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;logger&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;info&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;tool=%s finished&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;tool&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;  &lt;span class="c1"&gt;# keep the tool result unchanged&lt;/span&gt;

&lt;span class="c1"&gt;# 2) Agent — model-agnostic (Gemini, Claude, GPT, Llama all swap freely)
&lt;/span&gt;&lt;span class="n"&gt;support_agent&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;LlmAgent&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;customer_support&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;gemini-2.5-pro&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;          &lt;span class="c1"&gt;# or "claude-opus-4-6", "gpt-5", "vllm://..."
&lt;/span&gt;    &lt;span class="n"&gt;description&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ManoIT first-line support agent&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;instruction&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Reply in the user&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;s language. &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Delegate billing/refund/account issues to &lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;billing_agent&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="n"&gt;tools&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;
        &lt;span class="n"&gt;google_search&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="nc"&gt;OpenAPIToolset&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;spec_url&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;https://api.manoit.co.kr/openapi.json&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="n"&gt;sub_agents&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;billing_agent&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;     &lt;span class="c1"&gt;# delegated via A2A or in-process
&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="c1"&gt;# services.yaml — swap backings per environment&lt;/span&gt;
&lt;span class="na"&gt;sessions&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;vertex_ai&lt;/span&gt;
  &lt;span class="na"&gt;config&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;project&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;manoit-prod&lt;/span&gt;
    &lt;span class="na"&gt;location&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;asia-northeast3&lt;/span&gt;

&lt;span class="na"&gt;memory&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;vertex_ai_memory_bank&lt;/span&gt;
  &lt;span class="na"&gt;config&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;corpus&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;projects/manoit-prod/locations/asia-northeast3/ragCorpora/12345&lt;/span&gt;

&lt;span class="na"&gt;artifacts&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;gcs&lt;/span&gt;
  &lt;span class="na"&gt;config&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;bucket&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;manoit-agent-artifacts&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  2.2 ADK Java — Human-in-the-Loop and Computer Use
&lt;/h3&gt;

&lt;p&gt;ADK Java 1.0's signature feature is &lt;strong&gt;HITL ToolConfirmation&lt;/strong&gt;. Risky tools (refunds, permission grants, expensive operations) pause execution before the call, wait for human approval, and resume cleanly. Add &lt;code&gt;ComputerUseTool&lt;/code&gt;, &lt;code&gt;ContainerCodeExecutor&lt;/code&gt;, and &lt;code&gt;VertexAICodeExecutor&lt;/code&gt; and the agent can drive a real browser or run generated code in a sandbox. Together they make "cautious automation" — the most common enterprise requirement — a standard pattern rather than a custom build.&lt;/p&gt;

&lt;h3&gt;
  
  
  2.3 Event Compaction — Beating the Context Window Honestly
&lt;/h3&gt;

&lt;p&gt;Every long-running LLM agent eventually hits the same wall: context explosion. ADK 1.0's &lt;strong&gt;Event Compaction&lt;/strong&gt; keeps a sliding window of recent events and summarizes the rest, capping the token budget. In ManoIT's internal tests on 12-turn dialogues, compaction cut average token usage by &lt;strong&gt;38%&lt;/strong&gt; and latency by &lt;strong&gt;18%&lt;/strong&gt;. The catch: compaction can also drop "decisions you must remember." Use &lt;code&gt;memory_keys&lt;/code&gt; to pin user IDs, contract terms, and in-flight payments outside the compactor.&lt;/p&gt;
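
&lt;p&gt;To make the mechanism concrete, here is a small Python sketch of the idea: keep a sliding window of recent events verbatim, keep anything pinned, and fold the rest into one summary event. It is a toy model under stated assumptions, not ADK's implementation or API. &lt;code&gt;Event&lt;/code&gt;, &lt;code&gt;compact&lt;/code&gt;, and the &lt;code&gt;pinned&lt;/code&gt; flag are hypothetical stand-ins, and the summarizer is a placeholder where a real system would call an LLM.&lt;/p&gt;

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Event:
    text: str
    pinned: bool = False  # stand-in for facts that must survive compaction


def placeholder_summarizer(events: List[Event]) -> str:
    # A real system would ask an LLM for a summary; this only counts events.
    return f"[summary of {len(events)} earlier events]"


def compact(events: List[Event], window: int = 4,
            summarize: Callable[[List[Event]], str] = placeholder_summarizer
            ) -> List[Event]:
    """Keep the last `window` events, keep pinned ones, summarize the rest."""
    cutoff = max(len(events) - window, 0)
    older, recent = events[:cutoff], events[cutoff:]

    pinned = [event for event in older if event.pinned]
    droppable = [event for event in older if not event.pinned]

    compacted: List[Event] = []
    if droppable:
        # One summary event replaces every unpinned event outside the window.
        compacted.append(Event(text=summarize(droppable)))
    compacted.extend(pinned)   # pinned facts are carried over verbatim
    compacted.extend(recent)   # the sliding window stays untouched
    return compacted
```

&lt;p&gt;The pinned list is the part that protects user IDs, contract terms, and in-flight payments: whatever is not pinned and falls outside the window only survives as far as the summarizer preserves it.&lt;/p&gt;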

&lt;h2&gt;
  
  
  3. A2A Protocol — AgentCard, SkillCard, and the Task Lifecycle
&lt;/h2&gt;

&lt;p&gt;A2A is &lt;strong&gt;a wire format for delegating work between opaque agents&lt;/strong&gt;. You don't need to know the other agent's model, prompt, or memory layout — only that it exposes an AgentCard, that the AgentCard advertises SkillCards, and that calls flow as JSON-RPC 2.0 over HTTP(S).&lt;/p&gt;
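
&lt;p&gt;Since every A2A call is a JSON-RPC 2.0 exchange, the envelope is worth seeing once. The Python sketch below builds a request and unwraps a response using only the standard library. The envelope fields (&lt;code&gt;jsonrpc&lt;/code&gt;, &lt;code&gt;id&lt;/code&gt;, &lt;code&gt;method&lt;/code&gt;, &lt;code&gt;params&lt;/code&gt;, &lt;code&gt;result&lt;/code&gt;, &lt;code&gt;error&lt;/code&gt;) come from JSON-RPC 2.0 itself; &lt;code&gt;build_request&lt;/code&gt; and &lt;code&gt;unwrap_response&lt;/code&gt; are hypothetical helpers, and the method name and params you pass in are assumptions to check against the A2A version your peer advertises.&lt;/p&gt;

```python
import json
import uuid
from typing import Any, Optional


def build_request(method: str, params: dict,
                  request_id: Optional[str] = None) -> dict:
    """Wrap a method call in a JSON-RPC 2.0 request envelope."""
    return {
        "jsonrpc": "2.0",
        # The id lets the caller match the response to this request.
        "id": str(uuid.uuid4()) if request_id is None else request_id,
        "method": method,
        "params": params,
    }


def unwrap_response(raw: str) -> Any:
    """Return the result of a JSON-RPC 2.0 response or raise on an error."""
    body = json.loads(raw)
    if body.get("jsonrpc") != "2.0":
        raise ValueError("not a JSON-RPC 2.0 response")
    if "error" in body:
        err = body["error"]
        raise RuntimeError(f"JSON-RPC error {err.get('code')}: {err.get('message')}")
    return body["result"]
```

&lt;p&gt;The serialized request would be POSTed to the URL the remote agent publishes in its AgentCard; transport, authentication, and streaming are left out of this sketch.&lt;/p&gt;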

&lt;h3&gt;
  
  
  3.1 AgentCard — The Digital Business Card
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"name"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"manoit-billing-agent"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"description"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"ManoIT billing, refund, and subscription agent"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"version"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"1.0.0"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"url"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"https://agents.manoit.co.kr/billing/a2a"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"supported_interfaces"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nl"&gt;"protocol"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"a2a/1.0"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"transport"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"https+jsonrpc"&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"capabilities"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"streaming"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"push_notifications"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"long_running_tasks"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"default_input_modes"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"text/plain"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"application/json"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"default_output_modes"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"text/plain"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"application/json"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"skills"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"id"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"refund.process"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"description"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Process a refund given a payment ID and reason"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"tags"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"billing"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"refund"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"input_modes"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"application/json"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"output_modes"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"application/json"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"examples"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nl"&gt;"input"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nl"&gt;"payment_id"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"PAY-..."&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"reason"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"double-charge"&lt;/span&gt;&lt;span class="p"&gt;}}&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"authentication"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"schemes"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"oauth2"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"api_key"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"oauth2"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"authorization_url"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"https://auth.manoit.co.kr/authorize"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"token_url"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"https://auth.manoit.co.kr/token"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"scopes"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"billing:refund"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The card lives at &lt;code&gt;/.well-known/agent.json&lt;/code&gt; so any other agent can discover capabilities, auth, and transport without prior agreement. Treat it as the OpenAPI of agents — registries and crawlers can build domain-wide capability catalogs from it.&lt;/p&gt;
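
&lt;p&gt;As a rough illustration of what a caller might do with a discovered card, the sketch below parses a trimmed copy of the example card and checks whether a skill and an auth scheme are usable. The helper names &lt;code&gt;find_skill&lt;/code&gt; and &lt;code&gt;pick_auth_scheme&lt;/code&gt; are hypothetical and not part of any A2A SDK, and fetching the card over HTTPS is left out.&lt;/p&gt;

```python
import json

# Hypothetical, trimmed AgentCard that reuses field names from the example above.
CARD_JSON = """
{
  "capabilities": {"streaming": true, "push_notifications": true},
  "skills": [
    {"id": "refund.process", "tags": ["billing", "refund"],
     "input_modes": ["application/json"], "output_modes": ["application/json"]}
  ],
  "authentication": {"schemes": ["oauth2", "api_key"]}
}
"""


def find_skill(card, skill_id, input_mode):
    """Return the skill dict matching skill_id that accepts input_mode, or None."""
    for skill in card.get("skills", []):
        if skill.get("id") == skill_id and input_mode in skill.get("input_modes", []):
            return skill
    return None


def pick_auth_scheme(card, preferred):
    """Return the first scheme in `preferred` that the card advertises, or None."""
    offered = card.get("authentication", {}).get("schemes", [])
    for scheme in preferred:
        if scheme in offered:
            return scheme
    return None


card = json.loads(CARD_JSON)
skill = find_skill(card, "refund.process", "application/json")
scheme = pick_auth_scheme(card, ["oauth2", "api_key"])
```

&lt;p&gt;Checking modes and auth schemes before sending a task lets a caller fail early, without a round trip to the remote agent.&lt;/p&gt;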

&lt;h3&gt;
  
  
  3.2 Task Lifecycle — A First-Class Stateful Unit of Work
&lt;/h3&gt;

&lt;p&gt;A2A is not single-shot RPC; it treats &lt;strong&gt;stateful Tasks&lt;/strong&gt; as a first-class abstraction. The client sends &lt;code&gt;tasks/send&lt;/code&gt; and the server returns a task ID. That same ID is then driven via &lt;code&gt;tasks/get&lt;/code&gt;, &lt;code&gt;tasks/cancel&lt;/code&gt;, &lt;code&gt;tasks/sendSubscribe&lt;/code&gt; (SSE stream), and &lt;code&gt;tasks/pushNotification/set&lt;/code&gt; (webhook). Tasks survive disconnects and are designed to live anywhere from seconds to hours.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;sequenceDiagram
  autonumber
  participant Caller as Caller Agent (Vertex)
  participant Card as /.well-known/agent.json
  participant Billing as ManoIT Billing (A2A)
  participant Bank as PSP

  Caller-&amp;gt;&amp;gt;Card: GET AgentCard
  Card--&amp;gt;&amp;gt;Caller: skills, auth, capabilities
  Caller-&amp;gt;&amp;gt;Billing: POST tasks/send (skill=refund.process)
  Billing--&amp;gt;&amp;gt;Caller: task_id=tk_42, state=submitted
  Caller-&amp;gt;&amp;gt;Billing: tasks/sendSubscribe (SSE)
  Billing-&amp;gt;&amp;gt;Bank: refund call
  Bank--&amp;gt;&amp;gt;Billing: processing
  Billing--&amp;gt;&amp;gt;Caller: artifact: {state: working, progress: 0.4}
  Bank--&amp;gt;&amp;gt;Billing: completed
  Billing--&amp;gt;&amp;gt;Caller: artifact: {state: completed, refund_id: rf_77}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
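
&lt;p&gt;The lifecycle in the diagram can be modeled as a small state machine. This is a minimal sketch under stated assumptions: the &lt;code&gt;Task&lt;/code&gt; class and the &lt;code&gt;TRANSITIONS&lt;/code&gt; table are illustrative, and the &lt;code&gt;canceled&lt;/code&gt; and &lt;code&gt;failed&lt;/code&gt; states are added here for completeness rather than taken from the diagram.&lt;/p&gt;

```python
# Allowed transitions for a simplified task lifecycle (an assumption for illustration).
TRANSITIONS = {
    "submitted": {"working", "canceled", "failed"},
    "working": {"working", "completed", "canceled", "failed"},
    "completed": set(),
    "canceled": set(),
    "failed": set(),
}


class Task:
    """Client-side view of one task, updated from streamed or pushed events."""

    def __init__(self, task_id):
        self.task_id = task_id
        self.state = "submitted"
        self.history = ["submitted"]

    def apply(self, new_state):
        """Move to new_state, rejecting transitions the table does not allow."""
        if new_state not in TRANSITIONS[self.state]:
            raise ValueError("illegal transition: " + self.state + " to " + new_state)
        self.state = new_state
        self.history.append(new_state)

    def is_terminal(self):
        """A state with no outgoing transitions is terminal."""
        return not TRANSITIONS[self.state]


# Replay the updates from the sequence diagram: two progress events, then completion.
task = Task("tk_42")
for update in ["working", "working", "completed"]:
    task.apply(update)
```

&lt;p&gt;Rejecting updates after a terminal state is one way to guard against duplicate or out-of-order events when SSE and webhook delivery are mixed on the same task.&lt;/p&gt;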



&lt;h3&gt;
  
  
  3.3 Two Transports — SSE Streaming vs. Webhook Async
&lt;/h3&gt;

&lt;p&gt;Real-time chat and copilots use &lt;strong&gt;SSE&lt;/strong&gt; for token-level progress. Long-running batches and analyses register a &lt;strong&gt;webhook&lt;/strong&gt; so the client can disconnect and still receive results. A2A allows mixing both transports on the same task ID, which maps cleanly onto graph runtimes like LangGraph that pick a transport per node.&lt;/p&gt;

&lt;h2&gt;
  
  
  4. ADK × A2A × MCP — The Three-Layer Stack
&lt;/h2&gt;

&lt;p&gt;A well-designed multi-agent system in 2026 separates three concerns: tools (MCP), agents (A2A), and orchestration (ADK or equivalent).&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;flowchart LR
  subgraph User["User / external system"]
    U[Slack / Teams / Web Chat]
  end

  subgraph Orchestrator["ADK Runner (Python/Java)"]
    Root[Root Agent&amp;lt;br/&amp;gt;customer_support]
  end

  subgraph A2A_Agents["Remote agents over A2A"]
    B[Billing Agent&amp;lt;br/&amp;gt;Java + Spring]
    L[Logistics Agent&amp;lt;br/&amp;gt;SAP Joule]
    H[HR Agent&amp;lt;br/&amp;gt;Workday]
  end

  subgraph MCP_Tools["MCP servers (tools / data)"]
    DB[(Postgres MCP)]
    KB[(Notion KB MCP)]
    GH[GitHub MCP]
  end

  U --&amp;gt;|chat| Root
  Root --&amp;gt;|A2A tasks/send| B
  Root --&amp;gt;|A2A tasks/send| L
  Root --&amp;gt;|A2A tasks/send| H
  Root --&amp;gt;|MCP tools/call| DB
  Root --&amp;gt;|MCP tools/call| KB
  Root --&amp;gt;|MCP tools/call| GH
  B --&amp;gt;|MCP| DB
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Layer&lt;/th&gt;
&lt;th&gt;Standard&lt;/th&gt;
&lt;th&gt;Owns&lt;/th&gt;
&lt;th&gt;Forbidden&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Tool&lt;/td&gt;
&lt;td&gt;MCP&lt;/td&gt;
&lt;td&gt;Actions/queries against DBs, SaaS, internal APIs&lt;/td&gt;
&lt;td&gt;Business decisions, multi-step reasoning&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Agent&lt;/td&gt;
&lt;td&gt;A2A&lt;/td&gt;
&lt;td&gt;Domain expertise (billing, logistics, HR) + reasoning&lt;/td&gt;
&lt;td&gt;Exposing tools directly, leaking schemas&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Orchestrator&lt;/td&gt;
&lt;td&gt;ADK Runner&lt;/td&gt;
&lt;td&gt;Intent, planning, sessions, memory, HITL, observability&lt;/td&gt;
&lt;td&gt;Implementing domain logic&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Two anti-patterns appear when this separation breaks. Push business decisions into MCP and tools become "hidden agents" — reuse and auditability collapse. Let an A2A agent behave like a tool and you get glorified RPC, losing task lifecycle, retries, and approval flows. ADK 1.0's &lt;code&gt;AgentTool&lt;/code&gt;, &lt;code&gt;RemoteA2aAgent&lt;/code&gt;, and &lt;code&gt;BaseTool&lt;/code&gt; types exist precisely to enforce this boundary at the type level.&lt;/p&gt;

&lt;h2&gt;
  
  
  5. Vertex AI Agent Engine — Single Process to Managed Multi-Agent
&lt;/h2&gt;

&lt;p&gt;ADK is model- and runtime-agnostic, but Google's preferred operating path is &lt;strong&gt;Vertex AI Agent Engine&lt;/strong&gt;. Where Cloud Run is stateless, Agent Engine absorbs sessions, memory, retries, observability, and scaling as a managed service. As of April 2026, &lt;strong&gt;Agent Engine Sandbox code execution&lt;/strong&gt; ships as an ADK Python integration, letting LLM-generated code run safely in an isolated environment.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# 0) Auth&lt;/span&gt;
gcloud auth application-default login

&lt;span class="c"&gt;# 1) Scaffold&lt;/span&gt;
pip &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;--upgrade&lt;/span&gt; google-adk
adk create my_agent &lt;span class="nt"&gt;--template&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;root_agent

&lt;span class="c"&gt;# 2) Local eval&lt;/span&gt;
adk run my_agent
adk &lt;span class="nb"&gt;eval &lt;/span&gt;my_agent ./evals/golden.json

&lt;span class="c"&gt;# 3) Deploy to Vertex AI Agent Engine (Express Mode onboarding — new in 1.0)&lt;/span&gt;
adk deploy agent_engine &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--project&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;manoit-prod &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--region&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;asia-northeast3 &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--agent_path&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;./my_agent &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--runtime&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;python3.12 &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--requirements&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;requirements.txt &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--service_account&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;adk-runtime@manoit-prod.iam.gserviceaccount.com

&lt;span class="c"&gt;# 4) Or self-host on GKE&lt;/span&gt;
adk deploy gke &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--cluster&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;adk-prod &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--namespace&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;agents &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--image-tag&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;v1.4.2
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Operationally, Agent Engine wins where it matters: OTel traces flow into Cloud Trace automatically, the Memory Bank wires straight into RAG corpora, and session compaction/retries/multi-step delegation are bundled behind SDK abstractions. Multi-cloud-mandated organizations can run the same ADK code on EKS/AKS and pipe everything through OTel collectors with LiteLLM in front of Anthropic and OpenAI.&lt;/p&gt;

&lt;h2&gt;
  
  
  6. ManoIT Adoption Roadmap — ADK 1.0 + A2A in 90 Days
&lt;/h2&gt;

&lt;p&gt;Below is the 90-day roadmap that the ManoIT AI platform team validated internally. The guiding rule: &lt;strong&gt;agree on protocol, permissions, and observability before writing code&lt;/strong&gt;.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Day 1–14 — Standards alignment.&lt;/strong&gt; Security, platform, and domain teams agree on the externally exposable tool whitelist (MCP) and the delegatable agent list (A2A). Register the AgentCard schema and a SkillCard naming convention (&lt;code&gt;domain.verb.noun&lt;/code&gt;) as an internal standard.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Day 15–30 — Hello A2A.&lt;/strong&gt; Spin up one ADK Python root agent and one Java A2A remote agent, then complete a single end-to-end delegation. Acceptance gate: the OTel trace must be continuous from edge to remote agent.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Day 31–50 — MCP tool alignment.&lt;/strong&gt; Wire MCP servers to internal catalogs, DBs, Slack, and Notion. Implement PII masking, request-ID propagation, and audit logging as cross-cutting concerns via ADK Plugins.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Day 51–70 — HITL and governance.&lt;/strong&gt; Attach ToolConfirmation to refunds, permission grants, and high-cost operations; integrate the approval UI behind a Slack slash command. &lt;strong&gt;Note:&lt;/strong&gt; any path that reaches an external SaaS LLM must run PII detection, output filtering, and hallucination checks together.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Day 71–90 — Operational stabilization.&lt;/strong&gt; Establish memory-key conventions for Event Compaction, set per-token cost alerts, validate Vertex AI Agent Engine autoscaling, and run a retry/recovery game day.&lt;/li&gt;
&lt;/ul&gt;
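
&lt;p&gt;The &lt;code&gt;domain.verb.noun&lt;/code&gt; naming convention from the first phase is easy to enforce in CI. The sketch below is one possible reading of it. The character rules (lowercase ASCII segments that start with a letter) and the function name &lt;code&gt;is_valid_skill_id&lt;/code&gt; are assumptions, not something the convention itself specifies.&lt;/p&gt;

```python
import re

# Assumption: each segment is lowercase ASCII, starts with a letter,
# and may contain digits or underscores after the first character.
SEGMENT = r"[a-z][a-z0-9_]*"
SKILL_ID = re.compile(SEGMENT + r"\." + SEGMENT + r"\." + SEGMENT)


def is_valid_skill_id(skill_id):
    """True when skill_id has exactly three dot-separated lowercase segments."""
    return SKILL_ID.fullmatch(skill_id) is not None


checks = {
    "billing.process.refund": is_valid_skill_id("billing.process.refund"),
    "refund.process": is_valid_skill_id("refund.process"),
    "Billing.Process.Refund": is_valid_skill_id("Billing.Process.Refund"),
}
```

&lt;p&gt;Under this reading, a two-segment ID such as &lt;code&gt;refund.process&lt;/code&gt; would be rejected, so existing cards may need renaming before the check is made mandatory.&lt;/p&gt;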

&lt;h3&gt;
  
  
  6.1 Operations checklist
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Pin versions.&lt;/strong&gt; Lock &lt;code&gt;google-adk==1.x&lt;/code&gt; and &lt;code&gt;a2a-sdk==1.0.x&lt;/code&gt;. Auto-merge bi-weekly patches only after they pass regression tests in staging.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Protocol compliance tests.&lt;/strong&gt; Add the A2A SDK compliance suite plus AgentCard JSON Schema validation to CI.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Observability.&lt;/strong&gt; OTel &lt;code&gt;service.name&lt;/code&gt; is per agent; &lt;code&gt;session.id&lt;/code&gt; is the ADK Runner-issued session ID.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Cost.&lt;/strong&gt; Expose per-call input/output tokens as Prometheus counters in a Plugin. Track unit pricing across Vertex, Bedrock, and Anthropic.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Failure modes.&lt;/strong&gt; Prefer SSE or webhook over polling &lt;code&gt;tasks/get&lt;/code&gt;. Polling loses on cost, latency, and server load.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Rollback.&lt;/strong&gt; Keep version-tagged Agent Engine deployments and previous ADK lock files. Rolling back within 24 hours must be feasible.&lt;/li&gt;
&lt;/ol&gt;
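
&lt;p&gt;For the cost item, the bookkeeping itself is simple. The sketch below uses a plain &lt;code&gt;collections.Counter&lt;/code&gt; as a stand-in for real Prometheus counters and does not use the ADK Plugin API. The metric name &lt;code&gt;agent_tokens_total&lt;/code&gt;, the label names, and the sample token counts are all hypothetical.&lt;/p&gt;

```python
from collections import Counter

# Stand-in for Prometheus counters: keys are (agent, direction) label pairs.
token_totals = Counter()


def record_call(agent, input_tokens, output_tokens):
    """Add one call's token usage to the running totals for an agent."""
    token_totals[(agent, "input")] += input_tokens
    token_totals[(agent, "output")] += output_tokens


def render_metrics(metric_name="agent_tokens_total"):
    """Render totals as text lines that resemble the Prometheus exposition format."""
    lines = []
    for (agent, direction), value in sorted(token_totals.items()):
        lines.append(
            '%s{agent="%s",direction="%s"} %d' % (metric_name, agent, direction, value)
        )
    return lines


# Hypothetical usage numbers for three calls across two agents.
record_call("billing", 1200, 300)
record_call("billing", 800, 150)
record_call("logistics", 500, 90)
metrics = render_metrics()
```

&lt;p&gt;Keeping input and output tokens as separate series matters because providers typically price them differently, so unit-price tracking needs both.&lt;/p&gt;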

&lt;h2&gt;
  
  
  7. Competitive Landscape
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Framework&lt;/th&gt;
&lt;th&gt;Strength&lt;/th&gt;
&lt;th&gt;May 2026 position&lt;/th&gt;
&lt;th&gt;Relationship to ADK&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;LangGraph&lt;/td&gt;
&lt;td&gt;Graph control, production reliability&lt;/td&gt;
&lt;td&gt;Native A2A&lt;/td&gt;
&lt;td&gt;Interop via ADK Plugin/Tool; LangGraph wins on graph runtime&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;CrewAI&lt;/td&gt;
&lt;td&gt;Role-based ergonomics, gentle ramp&lt;/td&gt;
&lt;td&gt;Native A2A&lt;/td&gt;
&lt;td&gt;Good for lightweight PoCs; production goes to ADK&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;AutoGen&lt;/td&gt;
&lt;td&gt;Research, multi-LLM debate&lt;/td&gt;
&lt;td&gt;Community-led&lt;/td&gt;
&lt;td&gt;Research → migrate to ADK&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;OpenAI Agents SDK&lt;/td&gt;
&lt;td&gt;OpenAI-model optimized&lt;/td&gt;
&lt;td&gt;AGENTS.md convention&lt;/td&gt;
&lt;td&gt;Wins inside an OpenAI single-vendor stack&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Anthropic Managed Agents&lt;/td&gt;
&lt;td&gt;Brain/Hands/Session split, MCP-first&lt;/td&gt;
&lt;td&gt;Claude 4.6/4.7 optimized&lt;/td&gt;
&lt;td&gt;Wins inside an Anthropic stack; multi-LLM goes to ADK&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Google ADK 1.0&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Py/Go/Java/TS GA, model-agnostic, A2A + MCP first-class&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Reference for protocol-aligned stacks&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;The reference&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Pragmatic guidance: if you're deeply locked into one LLM vendor, that vendor's SDK is the fastest path. But for enterprises facing &lt;strong&gt;multi-language, multi-cloud, and multi-LLM&lt;/strong&gt; demands at once, ADK 1.0 minimizes the cost of standards alignment. In Java- and Go-heavy markets such as Korean SI and finance, ADK lowers the barrier to entry that LangChain/LangGraph's Python-first stance raised.&lt;/p&gt;

&lt;h2&gt;
  
  
  8. Conclusion — From "Picking a Framework" to "Fitting the Protocols"
&lt;/h2&gt;

&lt;p&gt;The 2026 multi-agent stack is no longer about which framework is most elegant. With &lt;strong&gt;MCP for tools and A2A for agents&lt;/strong&gt; locked in, the real evaluation axis is whether your SDK treats both protocols as first-class citizens and offers cross-language parity. Google ADK 1.0 GA is the best-balanced answer on that axis today, and A2A v1.0 has earned the kind of external validation that only 150+ production deployments can confer. ManoIT's recommendation is concrete. First, &lt;strong&gt;default new agent builds in Q3 2026 to the three-layer split: MCP tools + A2A agents + ADK Runner&lt;/strong&gt;. Second, &lt;strong&gt;stand up an internal protocol catalog — AgentCards, SkillCards, MCP tool whitelists — co-owned by security, platform, and domain teams in week one&lt;/strong&gt;. Third, &lt;strong&gt;agree on OTel, HITL, and Event Compaction as operational defaults before code is written&lt;/strong&gt;. Operating model precedes tooling, and whoever standardizes the operating model first goes the farthest.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;This article was written by the ManoIT engineering team with assistance from Anthropic Claude AI. Facts and code samples are based on the official Google ADK documentation and 1.0 release notes, the A2A Protocol Specification (a2a-protocol.org), the MCP Specification (modelcontextprotocol.io), and Google Cloud Next 2026 announcements. Some performance and operational figures reflect ManoIT's internal measurements and may vary by environment. Validate in your own environment before adopting. — ManoIT (manoit.co.kr)&lt;/em&gt;&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Originally published at &lt;a href="https://www.manoit.co.kr/forum/view/1465613" rel="noopener noreferrer"&gt;ManoIT Tech Blog&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>agents</category>
      <category>llm</category>
      <category>mcp</category>
    </item>
    <item>
      <title>Backstage 1.49 Complete Guide — New Frontend System 1.0 RC by Default, Actions Registry, and mcp-actions-backend Make Your IDP AI-Native</title>
      <dc:creator>daniel jeong</dc:creator>
      <pubDate>Sun, 03 May 2026 09:57:48 +0000</pubDate>
      <link>https://dev.to/x4nent/backstage-149-complete-guide-new-frontend-system-10-rc-by-default-actions-registry-and-1o5a</link>
      <guid>https://dev.to/x4nent/backstage-149-complete-guide-new-frontend-system-10-rc-by-default-actions-registry-and-1o5a</guid>
      <description>&lt;h1&gt;
  
  
  Backstage 1.49 Complete Guide — New Frontend System 1.0 RC by Default, Actions Registry, and mcp-actions-backend Make Your IDP AI-Native
&lt;/h1&gt;

&lt;p&gt;As of January 2026, Backstage powers internal developer platforms (IDPs) at &lt;strong&gt;3,400+ organizations and 2 million+ developers&lt;/strong&gt; outside Spotify, commanding &lt;strong&gt;89% market share&lt;/strong&gt; among IDP frameworks. From that summit, the Spotify and CNCF community has shipped its largest architectural shift in five years. &lt;strong&gt;Backstage 1.49&lt;/strong&gt; promotes the &lt;strong&gt;New Frontend System (NFS) to 1.0 Release Candidate&lt;/strong&gt; and makes it the default for new apps. It ships the &lt;strong&gt;Actions Registry&lt;/strong&gt; as a first-class governance layer and adds &lt;strong&gt;&lt;code&gt;@backstage/plugin-mcp-actions-backend&lt;/code&gt;&lt;/strong&gt;, which exposes every registered action as a tool through the Model Context Protocol. In other words: Claude Code, Cursor, and ChatGPT can now query your catalog, run scaffolder tasks, and grant GitLab group permissions in one natural-language step. This guide walks through the NFS migration, Actions Registry governance, MCP server splitting, and the BUI breaking changes from a production-platform perspective.&lt;/p&gt;

&lt;h2&gt;
  
  
  1. Why Backstage 1.49 is the Inflection Point
&lt;/h2&gt;

&lt;p&gt;The IDP debate "should we even build a portal?" effectively ended in 2026 — Backstage's 67% overall adoption and 89% framework share answered it. The remaining question is &lt;strong&gt;how to operate one&lt;/strong&gt;, and 1.49 rewrites the operating model on three axes simultaneously.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Axis&lt;/th&gt;
&lt;th&gt;≤ 1.48&lt;/th&gt;
&lt;th&gt;1.49&lt;/th&gt;
&lt;th&gt;Operational Impact&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Frontend system&lt;/td&gt;
&lt;td&gt;Legacy + &lt;code&gt;--next&lt;/code&gt; opt-in&lt;/td&gt;
&lt;td&gt;NFS 1.0 RC default, &lt;code&gt;--legacy&lt;/code&gt; opt-out&lt;/td&gt;
&lt;td&gt;New apps default to NFS (opt out with &lt;code&gt;--legacy&lt;/code&gt;); existing apps need a 6–12 month migration&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Action governance&lt;/td&gt;
&lt;td&gt;Scaffolder actions + custom hooks scattered&lt;/td&gt;
&lt;td&gt;Single Actions Registry&lt;/td&gt;
&lt;td&gt;Unified action catalog, permissions, audit logging&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;AI integration&lt;/td&gt;
&lt;td&gt;External chatbots, separate search index&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;mcp-actions-backend&lt;/code&gt; built in&lt;/td&gt;
&lt;td&gt;Claude / Cursor drive the IDP directly&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;BUI transition&lt;/td&gt;
&lt;td&gt;MUI-based EntityCards&lt;/td&gt;
&lt;td&gt;BUI (Backstage UI) migration in flight&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;variant&lt;/code&gt; prop removed, silent layout regressions possible&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Catalog query&lt;/td&gt;
&lt;td&gt;Equality filters&lt;/td&gt;
&lt;td&gt;Predicate filters (&lt;code&gt;$all&lt;/code&gt;, &lt;code&gt;$any&lt;/code&gt;, &lt;code&gt;$not&lt;/code&gt;, &lt;code&gt;$exists&lt;/code&gt;, &lt;code&gt;$in&lt;/code&gt;, &lt;code&gt;$hasPrefix&lt;/code&gt;, &lt;code&gt;$contains&lt;/code&gt;)&lt;/td&gt;
&lt;td&gt;Massive expressiveness for large catalogs&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The takeaway is that 1.49 is not a feature drop but the &lt;strong&gt;first industry-standard release where platform engineering formally adopts AI&lt;/strong&gt;. BackstageCon Europe 2026 (Amsterdam, March 26) made the same point: N26, Spotify, and Roadie all argued that Actions Registry plus MCP will define IDP design for the next 18 months.&lt;/p&gt;

&lt;h2&gt;
  
  
  2. New Frontend System 1.0 RC — &lt;code&gt;--next&lt;/code&gt; is Gone, &lt;code&gt;--legacy&lt;/code&gt; is Born
&lt;/h2&gt;

&lt;p&gt;NFS reached &lt;strong&gt;1.0 Release Candidate&lt;/strong&gt; in 1.49 after alpha (2024) and beta (2025). The most visible change is that the CLI defaults flipped: &lt;code&gt;npx @backstage/create-app&lt;/code&gt; now scaffolds an NFS-based app &lt;strong&gt;without&lt;/strong&gt; the &lt;code&gt;--next&lt;/code&gt; flag, and you must pass &lt;code&gt;--legacy&lt;/code&gt; explicitly to opt back into the old system.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# 1.49 new app — defaults to NFS 1.0 RC&lt;/span&gt;
npx @backstage/create-app@latest &lt;span class="nt"&gt;--path&lt;/span&gt; my-portal

&lt;span class="c"&gt;# Stay on the legacy MUI system&lt;/span&gt;
npx @backstage/create-app@latest &lt;span class="nt"&gt;--path&lt;/span&gt; my-portal &lt;span class="nt"&gt;--legacy&lt;/span&gt;

&lt;span class="c"&gt;# Bump existing apps&lt;/span&gt;
yarn backstage-cli versions:bump &lt;span class="nt"&gt;--release&lt;/span&gt; latest

&lt;span class="c"&gt;# New: Actions Registry CLI&lt;/span&gt;
yarn backstage-cli actions list
yarn backstage-cli actions execute scaffolder:catalog:register &lt;span class="nt"&gt;--input&lt;/span&gt; &lt;span class="s1"&gt;'{"url":"..."}'&lt;/span&gt;
yarn backstage-cli actions sources
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The essence of NFS is that &lt;strong&gt;plugin extension points and routing are decided at app-config time, not at code-build time&lt;/strong&gt;. Sidebar entries, pages, and catalog widgets toggle on or off purely from &lt;code&gt;app-config.yaml&lt;/code&gt;'s &lt;code&gt;app.extensions&lt;/code&gt; block, which means you can ship the same binary to multiple environments and let configuration decide the plugin surface. In 1.49, &lt;code&gt;PluginWrapperApi&lt;/code&gt; graduates from alpha to &lt;strong&gt;stable&lt;/strong&gt;, and a new &lt;code&gt;@backstage/frontend-dev-utils&lt;/code&gt; package gives you &lt;code&gt;createDevApp()&lt;/code&gt; for spinning up plugin-only dev environments in a single line.&lt;/p&gt;

&lt;h3&gt;
  
  
  2.1 BUI (Backstage UI) — The Quiet Retirement of MUI Cards
&lt;/h3&gt;

&lt;p&gt;In parallel with NFS 1.0 RC, Spotify is migrating the design system to &lt;strong&gt;BUI&lt;/strong&gt;, a React Aria–based component layer. The 1.49 major bump of &lt;code&gt;@backstage/plugin-catalog&lt;/code&gt; rebuilt &lt;code&gt;EntityAboutCard&lt;/code&gt;, &lt;code&gt;EntityLinksCard&lt;/code&gt;, &lt;code&gt;EntityLabelsCard&lt;/code&gt;, &lt;code&gt;GroupProfileCard&lt;/code&gt;, and &lt;code&gt;UserProfileCard&lt;/code&gt; on BUI, and &lt;strong&gt;removed the &lt;code&gt;variant&lt;/code&gt; prop&lt;/strong&gt; in the process. &lt;strong&gt;⚠️ Caution:&lt;/strong&gt; if your code uses &lt;code&gt;variant="gridItem"&lt;/code&gt; it will still compile but quietly break the EntityPage layout. Visual regression on EntityPage must be on the PR review checklist.&lt;/p&gt;

&lt;h2&gt;
  
  
  3. Actions Registry — Beyond Scaffolder, Toward IDP-Wide Automation Governance
&lt;/h2&gt;

&lt;p&gt;Actions Registry arrived as beta in 1.48 and solidifies in 1.49 as &lt;strong&gt;a first-class platform-engineering primitive&lt;/strong&gt;. The pre-1.48 model only allowed action invocation inside Scaffolder templates, which meant permissions, auditing, and reusability were all tied to template scope. Actions Registry lifts those concerns into &lt;strong&gt;a standard action type that plugins register, the permission system governs, and CLI / UI / MCP can all invoke identically&lt;/strong&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// plugins/my-platform-backend/src/actions/registerService.ts&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;createBackendModule&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;@backstage/backend-plugin-api&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;actionsRegistryServiceRef&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;@backstage/backend-plugin-api/alpha&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;z&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;zod&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;registerServiceModule&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;createBackendModule&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;pluginId&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;platform&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;moduleId&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;register-service&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="nf"&gt;register&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;reg&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;reg&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;registerInit&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
      &lt;span class="na"&gt;deps&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;registry&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;actionsRegistryServiceRef&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
      &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="nf"&gt;init&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="nx"&gt;registry&lt;/span&gt; &lt;span class="p"&gt;})&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="nx"&gt;registry&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;register&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
          &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;platform:register-service&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
          &lt;span class="na"&gt;title&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Register a new microservice&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
          &lt;span class="na"&gt;description&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Creates the catalog entry, GitHub repo, and ArgoCD app in one shot&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
          &lt;span class="na"&gt;schema&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="na"&gt;input&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;z&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nx"&gt;z&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;object&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
              &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;z&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;string&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;min&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
              &lt;span class="na"&gt;owner&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;z&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;string&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
              &lt;span class="na"&gt;tier&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;z&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;enum&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;tier-1&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;tier-2&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;tier-3&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]),&lt;/span&gt;
            &lt;span class="p"&gt;}),&lt;/span&gt;
            &lt;span class="na"&gt;output&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;z&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nx"&gt;z&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;object&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;catalogRef&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;z&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;string&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="na"&gt;repoUrl&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;z&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;string&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;}),&lt;/span&gt;
          &lt;span class="p"&gt;},&lt;/span&gt;
          &lt;span class="c1"&gt;// New: who is allowed to invoke this action&lt;/span&gt;
          &lt;span class="na"&gt;visibilityPermission&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;resourceType&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;platform-action&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;allow&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;platform-admins&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
          &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="nf"&gt;action&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="nx"&gt;input&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;credentials&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;logger&lt;/span&gt; &lt;span class="p"&gt;})&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="nx"&gt;logger&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;info&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;`Registering service: &lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;input&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;name&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
            &lt;span class="c1"&gt;// ... create GitHub repo, register in catalog, render Argo app&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;catalogRef&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;`component:default/&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;input&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;name&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;repoUrl&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;...&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="p"&gt;};&lt;/span&gt;
          &lt;span class="p"&gt;},&lt;/span&gt;
        &lt;span class="p"&gt;});&lt;/span&gt;
      &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="p"&gt;});&lt;/span&gt;
  &lt;span class="p"&gt;},&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That single registration is reusable from three entry points: a Scaffolder template can call it as &lt;code&gt;action: platform:register-service&lt;/code&gt;, a platform engineer can invoke it standalone via &lt;code&gt;yarn backstage-cli actions execute&lt;/code&gt;, and &lt;strong&gt;&lt;code&gt;mcp-actions-backend&lt;/code&gt; exposes it verbatim as an MCP tool&lt;/strong&gt;. Write the action once, and humans, the CLI, and AI agents all consume it with identical semantics.&lt;/p&gt;

&lt;h3&gt;
  
  
  3.1 New Built-in Actions in 1.49
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;backstage:who-am-i&lt;/code&gt; — returns the caller's identity and permission context. Critical for AI agents to introspect their own permissions.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;catalog:query-entities&lt;/code&gt; — predicate-filter catalog queries (&lt;code&gt;$all&lt;/code&gt;, &lt;code&gt;$any&lt;/code&gt;, &lt;code&gt;$not&lt;/code&gt;, &lt;code&gt;$exists&lt;/code&gt;, &lt;code&gt;$in&lt;/code&gt;, &lt;code&gt;$hasPrefix&lt;/code&gt;, &lt;code&gt;$contains&lt;/code&gt;).&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;scaffolder:list-actions&lt;/code&gt; — lists the registered Scaffolder actions so a caller can discover what is available.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;scaffolder:get-task-logs&lt;/code&gt; — stream logs of running or completed Scaffolder tasks.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;scaffolder:list-tasks&lt;/code&gt; — list tasks visible under the caller's permissions.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;These five compose &lt;strong&gt;the meta-toolset an AI agent uses to bootstrap itself&lt;/strong&gt; the first time it meets an IDP. At session start, calling &lt;code&gt;who-am-i → list-actions → query-entities&lt;/code&gt; lets Claude Code learn what the IDP can do and what permissions it has — without human intervention.&lt;/p&gt;
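&lt;p&gt;To make the predicate operators listed for &lt;code&gt;catalog:query-entities&lt;/code&gt; more concrete, here is a minimal sketch of how such a filter could be evaluated. It is not Backstage's implementation: the &lt;code&gt;evaluate&lt;/code&gt; and &lt;code&gt;matchField&lt;/code&gt; helpers, the flattened dotted keys, and the &lt;code&gt;spec.tier&lt;/code&gt; field are assumptions made for illustration, and the real catalog semantics may differ.&lt;/p&gt;

```typescript
// Hypothetical, simplified evaluator. The real catalog semantics may differ.
// Entities are modeled as flat records keyed by dotted paths (an assumption).
type Entity = { [path: string]: string | string[] | undefined };
type Predicate = { [key: string]: any };

// Matches one field value against a literal or a single-operator condition.
function matchField(value: string | string[] | undefined, cond: any): boolean {
  if (typeof cond !== "object" || cond === null) {
    return value === cond; // plain equality for literal conditions
  }
  if ("$exists" in cond) return (value !== undefined) === cond.$exists;
  if ("$in" in cond) return typeof value === "string" ? cond.$in.includes(value) : false;
  if ("$hasPrefix" in cond) return typeof value === "string" ? value.startsWith(cond.$hasPrefix) : false;
  if ("$contains" in cond) return Array.isArray(value) ? value.includes(cond.$contains) : false;
  return false; // unknown operator: fail closed
}

// Evaluates the logical operators, then falls back to per-field matching.
function evaluate(entity: Entity, predicate: Predicate): boolean {
  if ("$all" in predicate) return predicate.$all.every((p: Predicate) => evaluate(entity, p));
  if ("$any" in predicate) return predicate.$any.some((p: Predicate) => evaluate(entity, p));
  if ("$not" in predicate) return !evaluate(entity, predicate.$not);
  return Object.keys(predicate).every(key => matchField(entity[key], predicate[key]));
}

// Illustrative entity; every field name here is made up for the example.
const entity: Entity = {
  kind: "Component",
  "spec.tier": "tier-1",
  "spec.owner": "team-payments",
  "metadata.tags": ["java", "grpc"],
};

const tierOneOwnedByATeam: Predicate = {
  $all: [
    { kind: "Component" },
    { "spec.tier": { $in: ["tier-1"] } },
    { "spec.owner": { $hasPrefix: "team-" } },
    { $not: { "metadata.tags": { $contains: "deprecated" } } },
  ],
};

console.log(evaluate(entity, tierOneOwnedByATeam)); // true
```

&lt;p&gt;The sketch fails closed on unknown operators, which seems the safer default when the caller is an AI agent composing filters on its own.&lt;/p&gt;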

&lt;h2&gt;
  
  
  4. mcp-actions-backend — The Standard Way to Expose an IDP Over MCP
&lt;/h2&gt;

&lt;p&gt;&lt;code&gt;@backstage/plugin-mcp-actions-backend&lt;/code&gt; is the most attention-grabbing addition in 1.49. Its job can be summarized in one line: &lt;strong&gt;"Expose every action registered in the Actions Registry over MCP automatically."&lt;/strong&gt; There is no additional translation code and no schema mapping: the &lt;code&gt;schema.input&lt;/code&gt; and &lt;code&gt;schema.output&lt;/code&gt; you already defined in Zod are serialized straight into the MCP tool descriptor.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="c1"&gt;# app-config.yaml&lt;/span&gt;
&lt;span class="na"&gt;mcpActions&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="c1"&gt;# 1.49 lets you split actions across multiple servers&lt;/span&gt;
  &lt;span class="na"&gt;servers&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;catalog-readonly&lt;/span&gt;
      &lt;span class="na"&gt;description&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Read-only catalog — safe for external LLMs&lt;/span&gt;
      &lt;span class="na"&gt;include&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;catalog:query-entities&lt;/span&gt;
        &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;backstage:who-am-i&lt;/span&gt;
      &lt;span class="na"&gt;authentication&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;oauth2&lt;/span&gt;
        &lt;span class="na"&gt;clientRegistration&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;dynamic&lt;/span&gt;   &lt;span class="c1"&gt;# DCR enabled&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;platform-admin&lt;/span&gt;
      &lt;span class="na"&gt;description&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Platform admins only — internal Claude Code sessions&lt;/span&gt;
      &lt;span class="na"&gt;include&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;platform:*&lt;/span&gt;
        &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;scaffolder:*&lt;/span&gt;
      &lt;span class="na"&gt;exclude&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;scaffolder:delete-task&lt;/span&gt;   &lt;span class="c1"&gt;# block dangerous actions&lt;/span&gt;
      &lt;span class="na"&gt;authentication&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;bearer&lt;/span&gt;
        &lt;span class="na"&gt;longLivedTokens&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The real strength of 1.49's mcp-actions is &lt;strong&gt;server-by-purpose splitting&lt;/strong&gt;. From a single Backstage instance you can declaratively expose a read-only catalog endpoint to an external SaaS LLM and a full Scaffolder endpoint to authenticated internal Claude Code clients. Combined, the &lt;code&gt;include&lt;/code&gt;/&lt;code&gt;exclude&lt;/code&gt; lists, wildcard patterns (such as &lt;code&gt;platform:*&lt;/code&gt;), and &lt;code&gt;visibilityPermission&lt;/code&gt; give Backstage a small but real &lt;strong&gt;API gateway&lt;/strong&gt; role for AI traffic.&lt;/p&gt;
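&lt;p&gt;How &lt;code&gt;include&lt;/code&gt; and &lt;code&gt;exclude&lt;/code&gt; interact is easier to reason about with a small model. The sketch below is not the plugin's code: &lt;code&gt;ServerRule&lt;/code&gt;, &lt;code&gt;matchesPattern&lt;/code&gt;, and &lt;code&gt;isExposed&lt;/code&gt; are hypothetical names, and it assumes that a trailing &lt;code&gt;*&lt;/code&gt; matches any suffix and that an exclude entry always wins over an include entry.&lt;/p&gt;

```typescript
// Hypothetical model of per-server action filtering, NOT the plugin's real code.
// Assumptions: a trailing "*" matches any suffix; exclude beats include.
type ServerRule = { include: string[]; exclude?: string[] };

function matchesPattern(pattern: string, actionName: string): boolean {
  if (pattern.endsWith("*")) {
    // "platform:*" matches every action whose name starts with "platform:"
    return actionName.startsWith(pattern.slice(0, -1));
  }
  return pattern === actionName;
}

function isExposed(rule: ServerRule, actionName: string): boolean {
  const excluded = (rule.exclude ?? []).some(p => matchesPattern(p, actionName));
  if (excluded) return false;
  return rule.include.some(p => matchesPattern(p, actionName));
}

// Mirrors the platform-admin server from the app-config.yaml example.
const platformAdmin: ServerRule = {
  include: ["platform:*", "scaffolder:*"],
  exclude: ["scaffolder:delete-task"],
};

console.log(isExposed(platformAdmin, "platform:register-service")); // true
console.log(isExposed(platformAdmin, "scaffolder:delete-task"));    // false
console.log(isExposed(platformAdmin, "catalog:query-entities"));    // false
```

&lt;p&gt;Whatever the real precedence turns out to be, it is worth confirming it against a staging instance before relying on an exclude entry as a safety control.&lt;/p&gt;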

&lt;h3&gt;
  
  
  4.1 Hooking Up Claude Code via DCR (Dynamic Client Registration)
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;Enable OAuth2 DCR in the Backstage &lt;code&gt;auth&lt;/code&gt; backend (&lt;code&gt;experimentalDynamicClientRegistration: true&lt;/code&gt;).&lt;/li&gt;
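&lt;li&gt;Register the server from a shell with &lt;code&gt;claude mcp add --transport http backstage https://portal.example.com/api/mcp/catalog-readonly&lt;/code&gt;, then run &lt;code&gt;/mcp&lt;/code&gt; inside Claude Code to start authentication.&lt;/li&gt;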
&lt;li&gt;Browser callback completes OAuth2 → long-lived token issued (&lt;code&gt;longLivedTokens: true&lt;/code&gt;).&lt;/li&gt;
&lt;li&gt;Restart Claude Code (the tool catalog only loads at session start).&lt;/li&gt;
&lt;li&gt;Natural-language queries like "Group every tier-1 service by owner" translate into &lt;code&gt;catalog:query-entities&lt;/code&gt; calls.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;ManoIT's platform team wired this into a &lt;strong&gt;Tier-1 on-call triage bot&lt;/strong&gt;. PagerDuty alert → Claude → &lt;code&gt;catalog:query-entities&lt;/code&gt; to identify affected services → &lt;code&gt;scaffolder:get-task-logs&lt;/code&gt; to inspect recent deployments → first-pass diagnosis report. Mean response time dropped from 11 minutes to &lt;strong&gt;2 minutes 40 seconds&lt;/strong&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  5. Catalog and Scaffolder Changes — Migration Checklist
&lt;/h2&gt;

&lt;p&gt;Beyond NFS and the Actions Registry, 1.49 ships substantive updates to catalog and scaffolder.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Area&lt;/th&gt;
&lt;th&gt;Change&lt;/th&gt;
&lt;th&gt;Migration Action&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Catalog&lt;/td&gt;
&lt;td&gt;Predicate filters (&lt;code&gt;$all&lt;/code&gt;, &lt;code&gt;$any&lt;/code&gt;, &lt;code&gt;$not&lt;/code&gt;, &lt;code&gt;$exists&lt;/code&gt;, &lt;code&gt;$in&lt;/code&gt;, &lt;code&gt;$hasPrefix&lt;/code&gt;, &lt;code&gt;$contains&lt;/code&gt;)&lt;/td&gt;
&lt;td&gt;Existing simple filters keep working; adopt incrementally&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Catalog Processor&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;AnnotateScmSlugEntityProcessor&lt;/code&gt; and &lt;code&gt;CodeOwnersProcessor&lt;/code&gt; moved to community plugin&lt;/td&gt;
&lt;td&gt;Add &lt;code&gt;@backstage-community/plugin-catalog-backend-module-scm&lt;/code&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Scaffolder&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;secrets.schema&lt;/code&gt; validation introduced, new &lt;code&gt;gitlab:group:access&lt;/code&gt; action&lt;/td&gt;
&lt;td&gt;Add a schema to templates that consume secrets&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Scaffolder API&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;retry&lt;/code&gt;, &lt;code&gt;listTasks&lt;/code&gt;, &lt;code&gt;listTemplatingExtensions&lt;/code&gt;, &lt;code&gt;dryRun&lt;/code&gt;, &lt;code&gt;autocomplete&lt;/code&gt; are now required methods&lt;/td&gt;
&lt;td&gt;Implement these on any custom &lt;code&gt;ScaffolderApi&lt;/code&gt; class&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Bitbucket&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;integrations.bitbucket&lt;/code&gt; removed, &lt;code&gt;BitbucketUrlReader&lt;/code&gt; removed&lt;/td&gt;
&lt;td&gt;Migrate to &lt;code&gt;bitbucketCloud&lt;/code&gt; / &lt;code&gt;bitbucketServer&lt;/code&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Slack notifications&lt;/td&gt;
&lt;td&gt;Channel DMs deprecated — DMs now only target user entity recipients&lt;/td&gt;
&lt;td&gt;Route channel messages through Webhook channels&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;BUI Provider&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;BUIProvider&lt;/code&gt; routing context now mandatory&lt;/td&gt;
&lt;td&gt;Wrap custom-page entry points with the provider&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  5.1 ManoIT Internal IDP Migration Checklist
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;① Node 20.19 / 22.12+&lt;/strong&gt; — verify Yarn Berry + corepack environment.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;② Bulk dependency bump&lt;/strong&gt; — run &lt;code&gt;yarn backstage-cli versions:bump&lt;/code&gt; (it targets the latest &lt;code&gt;main&lt;/code&gt; release line by default) and regenerate the lockfile.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;③ Canary NFS opt-in&lt;/strong&gt; — seed a new app (no &lt;code&gt;--legacy&lt;/code&gt;) on staging; run visual regression checks on EntityPage / TechDocs / Scaffolder.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;④ EntityCard &lt;code&gt;variant&lt;/code&gt; audit&lt;/strong&gt; — grep for &lt;code&gt;EntityAboutCard variant="gridItem"&lt;/code&gt; and migrate to BUI grid props.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;⑤ Actions Registry inventory&lt;/strong&gt; — &lt;code&gt;yarn backstage-cli actions list&lt;/code&gt; to discover unpermissioned actions.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;⑥ MCP server permission model&lt;/strong&gt; — agree on the allowlist of externally exposable actions with the security team before configuring &lt;code&gt;mcpActions.servers&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;⑦ Bitbucket / Slack split&lt;/strong&gt; — replace single &lt;code&gt;bitbucket&lt;/code&gt; key with &lt;code&gt;bitbucketCloud&lt;/code&gt; / &lt;code&gt;bitbucketServer&lt;/code&gt;; route Slack channels through Webhooks.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;⑧ Catalog Processor reinstatement&lt;/strong&gt; — explicitly install the community SCM slug / CODEOWNERS processors that left core.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;⑨ Rollback plan&lt;/strong&gt; — keep previous-version lockfiles on a dedicated branch with a 24-hour rollback window.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  6. Competitive Landscape — How Does Backstage Hold 89% Against Port, Cortex, and Roadie?
&lt;/h2&gt;

&lt;p&gt;The 2026 IDP market has SaaS-first players (Port, Cortex), managed-Backstage providers (Spotify Portal, Roadie), and the self-hosted Backstage core community all pulling in different directions.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Product&lt;/th&gt;
&lt;th&gt;Model&lt;/th&gt;
&lt;th&gt;Strength&lt;/th&gt;
&lt;th&gt;Relationship to 1.49&lt;/th&gt;
&lt;th&gt;Best Fit&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Backstage (self-host)&lt;/td&gt;
&lt;td&gt;OSS framework&lt;/td&gt;
&lt;td&gt;Fastest path to NFS / Actions Registry / MCP&lt;/td&gt;
&lt;td&gt;The core itself&lt;/td&gt;
&lt;td&gt;Platform teams that can self-operate&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Spotify Portal&lt;/td&gt;
&lt;td&gt;Managed Backstage&lt;/td&gt;
&lt;td&gt;Spotify in-house plugins + DX Insights&lt;/td&gt;
&lt;td&gt;NFS / MCP GA delivered in lockstep&lt;/td&gt;
&lt;td&gt;Buyers who trust Spotify's lineage&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Roadie&lt;/td&gt;
&lt;td&gt;Managed Backstage&lt;/td&gt;
&lt;td&gt;Automated upgrades, plugin curation&lt;/td&gt;
&lt;td&gt;1.49 compatibility shipped automatically&lt;/td&gt;
&lt;td&gt;Teams without dedicated platform headcount&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Port&lt;/td&gt;
&lt;td&gt;Independent SaaS&lt;/td&gt;
&lt;td&gt;Low-code data model and blueprints&lt;/td&gt;
&lt;td&gt;Not compatible — separate stack&lt;/td&gt;
&lt;td&gt;Orgs that include non-developer departments&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Cortex&lt;/td&gt;
&lt;td&gt;Independent SaaS&lt;/td&gt;
&lt;td&gt;Service catalog + scorecards&lt;/td&gt;
&lt;td&gt;Not compatible&lt;/td&gt;
&lt;td&gt;SRE-led, governance-heavy organizations&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The headline is that Actions Registry + MCP put Backstage somewhere Port and Cortex cannot easily reach. Independent SaaS players have to define proprietary protocols and SDKs to expose their data models to AI; Backstage already publishes every registered action over the MCP standard. On the new evaluation axis of &lt;strong&gt;"AI-native IDP compatibility,"&lt;/strong&gt; 1.49 is effectively running unopposed.&lt;/p&gt;

&lt;h2&gt;
  
  
  7. Conclusion — H2 2026 Forces a Rewrite of Every IDP Operating Model
&lt;/h2&gt;

&lt;p&gt;Backstage 1.49 is not a routine minor release; it is &lt;strong&gt;a turning point in IDP operating models&lt;/strong&gt;. NFS 1.0 RC becomes the foundation for every new portal. Actions Registry collapses scattered automation into a single governance layer. &lt;code&gt;mcp-actions-backend&lt;/code&gt; lifts the IDP into infrastructure that AI agents call routinely. ManoIT's recommendation is twofold. First, &lt;strong&gt;align every internal Backstage instance to 1.49 by Q3 2026&lt;/strong&gt; and break the NFS migration into PR-sized units to preserve a safety net. Second, &lt;strong&gt;agree on the MCP-exposed action allowlist with the security team in advance&lt;/strong&gt;. Catalog queries surfaced to external LLMs become subject to permissions, audit, and logging, so policy has to precede code. AI-native IDPs are not a tool change; they are an operating-model change, and Backstage 1.49 is the first release to industrialize that change.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;This article was authored by ManoIT's engineering team with assistance from Anthropic Claude AI. Facts and code are based on Backstage's official documentation, the 1.49 release notes, and BackstageCon Europe 2026 sessions; some performance / operational figures and ManoIT internal measurements vary by environment. Validate in your own environment before adoption. — ManoIT (manoit.co.kr)&lt;/em&gt;&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Originally published at &lt;a href="https://www.manoit.co.kr/forum/view/1465511" rel="noopener noreferrer"&gt;ManoIT Tech Blog&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>architecture</category>
      <category>mcp</category>
      <category>ai</category>
      <category>devops</category>
    </item>
    <item>
      <title>Vite 8 + Rolldown Complete Guide — Ending the esbuild·Rollup Dual-Bundler Era with a Single Rust Bundler and the Oxc Toolchain</title>
      <dc:creator>daniel jeong</dc:creator>
      <pubDate>Sat, 02 May 2026 00:17:39 +0000</pubDate>
      <link>https://dev.to/x4nent/vite-8-rolldown-complete-guide-ending-the-esbuildrollup-dual-bundler-era-with-a-single-rust-m4j</link>
      <guid>https://dev.to/x4nent/vite-8-rolldown-complete-guide-ending-the-esbuildrollup-dual-bundler-era-with-a-single-rust-m4j</guid>
      <description>&lt;h1&gt;
  
  
  Vite 8 + Rolldown Complete Guide — Ending the esbuild·Rollup Dual-Bundler Era with a Single Rust Bundler and the Oxc Toolchain
&lt;/h1&gt;

&lt;p&gt;On March 12, 2026, Evan You's &lt;strong&gt;VoidZero&lt;/strong&gt; released &lt;strong&gt;Vite 8.0&lt;/strong&gt; as GA. After seven years, Vite's dual-bundler architecture (&lt;strong&gt;esbuild for dev + Rollup for production&lt;/strong&gt;) finally retires, replaced by &lt;strong&gt;Rolldown&lt;/strong&gt;, a Rust-native bundler. This is not a cosmetic dependency swap. The &lt;strong&gt;parser (Oxc Parser), transformer (Oxc Transformer), bundler (Rolldown), CSS minifier (Lightning CSS), and linter (Oxlint)&lt;/strong&gt; now form a single end-to-end Rust toolchain, most of it maintained by the same team. Production builds are reported as &lt;strong&gt;10–30× faster than Rollup&lt;/strong&gt;, React Refresh transforms as &lt;strong&gt;40× faster than Babel&lt;/strong&gt;, and dev and prod now run through the same bundler instead of two different ones. This guide breaks down Vite 8's architecture, the Environment API + ModuleRunner SSR migration, plugin-react v6 with Babel removed, breaking config changes such as &lt;code&gt;build.rolldownOptions&lt;/code&gt;, and a production-grade migration checklist drawn from real ManoIT projects.&lt;/p&gt;

&lt;h2&gt;
  
  
  1. Why Vite 8 Is a Game Changer — The End of the 7-Year Dual-Bundler Era
&lt;/h2&gt;

&lt;p&gt;When Vite first shipped in 2020, Evan You made two pragmatic bets: &lt;strong&gt;esbuild (Go) for dev speed&lt;/strong&gt;, and &lt;strong&gt;Rollup (JS) for production output quality&lt;/strong&gt;. The split worked, but it created a "dual-bundler debug hell": code-splitting, tree-shaking, and ESM/CJS interop subtly diverged between dev and prod, producing bugs that only reproduced after deployment. Vite 8 unifies both ends behind &lt;strong&gt;Rolldown&lt;/strong&gt;, a Rust-native bundler that keeps the Rollup plugin API while running &lt;strong&gt;10–30× faster than Rollup and roughly 1.5–2× faster than esbuild&lt;/strong&gt; through native multithreading.&lt;/p&gt;

&lt;p&gt;The deeper change is the &lt;strong&gt;integrated toolchain&lt;/strong&gt;: Vite (build tool) → Rolldown (bundler) → Oxc (parser, transformer, minifier, linter), all maintained inside VoidZero. Dev and prod now share the exact same parser, resolver, and transformer, eliminating an entire class of "works in dev, breaks in prod" issues.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Aspect&lt;/th&gt;
&lt;th&gt;Vite 7 and Earlier (through 2026 Q1)&lt;/th&gt;
&lt;th&gt;Vite 8 (2026-03 GA)&lt;/th&gt;
&lt;th&gt;Operational Impact&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Dev bundler&lt;/td&gt;
&lt;td&gt;esbuild (Go)&lt;/td&gt;
&lt;td&gt;Rolldown (Rust)&lt;/td&gt;
&lt;td&gt;dev/prod parity&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Prod bundler&lt;/td&gt;
&lt;td&gt;Rollup (JS)&lt;/td&gt;
&lt;td&gt;Rolldown (Rust)&lt;/td&gt;
&lt;td&gt;10–30× faster builds&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;JS/TS parser&lt;/td&gt;
&lt;td&gt;esbuild + acorn&lt;/td&gt;
&lt;td&gt;Oxc (3× faster than SWC)&lt;/td&gt;
&lt;td&gt;Consistent syntax handling&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;React Refresh&lt;/td&gt;
&lt;td&gt;Babel&lt;/td&gt;
&lt;td&gt;Oxc (40× faster than Babel)&lt;/td&gt;
&lt;td&gt;80% lower HMR latency&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;CSS minifier&lt;/td&gt;
&lt;td&gt;esbuild (default) / Lightning CSS (opt-in)&lt;/td&gt;
&lt;td&gt;Lightning CSS (default)&lt;/td&gt;
&lt;td&gt;Better modern CSS accuracy&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;SSR module loader&lt;/td&gt;
&lt;td&gt;&lt;code&gt;ssrLoadModule&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;ModuleRunner&lt;/code&gt; + Environment API&lt;/td&gt;
&lt;td&gt;Multi-runtime / edge-ready&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Install footprint&lt;/td&gt;
&lt;td&gt;~50 MB&lt;/td&gt;
&lt;td&gt;~18 MB&lt;/td&gt;
&lt;td&gt;Faster CI cache + cold start&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Vite 7 (June 2025) shipped Rolldown as the opt-in &lt;code&gt;rolldown-vite&lt;/code&gt; package and gave SvelteKit, React Router v7, Storybook, Astro, and Nuxt six months of beta exposure to surface compatibility regressions. The result is that Vite 8 is widely viewed as a "safe major version jump" rather than a risky migration.&lt;/p&gt;

&lt;h2&gt;
  
  
  2. Rolldown Architecture — Rollup API + Rust Multithreading + Oxc Backend
&lt;/h2&gt;

&lt;p&gt;Rolldown's core design principle is "&lt;strong&gt;keep the Rollup API, replace the engine with Rust&lt;/strong&gt;." Existing Vite/Rollup plugins keep working in over 99% of cases — &lt;code&gt;resolveId&lt;/code&gt;, &lt;code&gt;load&lt;/code&gt;, &lt;code&gt;transform&lt;/code&gt;, and &lt;code&gt;renderChunk&lt;/code&gt; hooks behave identically. The internals decompose into five layers:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Layer&lt;/th&gt;
&lt;th&gt;Component&lt;/th&gt;
&lt;th&gt;Role&lt;/th&gt;
&lt;th&gt;Replaces&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;Oxc Parser&lt;/td&gt;
&lt;td&gt;JS/TS/JSX → AST (native Rust)&lt;/td&gt;
&lt;td&gt;acorn, esbuild parser&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;Oxc Resolver&lt;/td&gt;
&lt;td&gt;Module resolution, tsconfig paths, aliases&lt;/td&gt;
&lt;td&gt;enhanced-resolve&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;td&gt;Oxc Transformer&lt;/td&gt;
&lt;td&gt;JSX, TS, React Refresh, decorators&lt;/td&gt;
&lt;td&gt;Babel, SWC&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;4&lt;/td&gt;
&lt;td&gt;Rolldown Bundler&lt;/td&gt;
&lt;td&gt;Dependency graph, code splitting, tree shaking&lt;/td&gt;
&lt;td&gt;Rollup, esbuild&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;5&lt;/td&gt;
&lt;td&gt;Oxc Minifier + Lightning CSS&lt;/td&gt;
&lt;td&gt;JS/CSS compression&lt;/td&gt;
&lt;td&gt;Terser, esbuild minifier&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The most visible win of this unification is that &lt;strong&gt;dependency pre-bundling&lt;/strong&gt; stops being a noticeable startup cost. Earlier Vite versions spent the first dev startup converting &lt;code&gt;node_modules&lt;/code&gt; from CJS to ESM with esbuild; Vite 8 lets Rolldown handle the dependency graph directly, so the warm-up step nearly vanishes. On a project with 800+ dependencies, dev server first-start drops from roughly &lt;strong&gt;20 seconds to under 2 seconds&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;A second non-obvious change is &lt;strong&gt;code-splitting precision&lt;/strong&gt;. Rollup could only split along ESM module boundaries, generating a chunk per dynamic &lt;code&gt;import()&lt;/code&gt; site. Rolldown adds &lt;code&gt;output.codeSplitting&lt;/code&gt;, which gives webpack-style &lt;strong&gt;granular chunk control&lt;/strong&gt;: vendor chunks, route-level chunks with prefetch hints, and shared-component chunks can all be declared up front. Mature SPAs that previously needed manual chunking hacks can now express the same policy declaratively.&lt;/p&gt;

&lt;h3&gt;
  
  
  2.1 Performance Benchmarks — Official + Community Numbers
&lt;/h3&gt;

&lt;p&gt;These are measured build times on a fixed project (React 19, 320 dependencies, 120k LOC). &lt;strong&gt;Real-world numbers vary ±30%&lt;/strong&gt; depending on graph shape and hardware.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Scenario&lt;/th&gt;
&lt;th&gt;Vite 7 + Rollup&lt;/th&gt;
&lt;th&gt;Vite 8 + Rolldown&lt;/th&gt;
&lt;th&gt;Speedup&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Cold prod build&lt;/td&gt;
&lt;td&gt;72.4s&lt;/td&gt;
&lt;td&gt;3.1s&lt;/td&gt;
&lt;td&gt;23×&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Warm prod build (cache)&lt;/td&gt;
&lt;td&gt;18.6s&lt;/td&gt;
&lt;td&gt;1.4s&lt;/td&gt;
&lt;td&gt;13×&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Dev server first start&lt;/td&gt;
&lt;td&gt;21.3s&lt;/td&gt;
&lt;td&gt;1.9s&lt;/td&gt;
&lt;td&gt;11×&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;HMR (single React component)&lt;/td&gt;
&lt;td&gt;320 ms&lt;/td&gt;
&lt;td&gt;62 ms&lt;/td&gt;
&lt;td&gt;5×&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;
&lt;code&gt;npm install&lt;/code&gt; (slimmer deps)&lt;/td&gt;
&lt;td&gt;48s&lt;/td&gt;
&lt;td&gt;22s&lt;/td&gt;
&lt;td&gt;2×&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The 5× HMR win is the direct result of &lt;code&gt;@vitejs/plugin-react&lt;/code&gt; v6 routing React Refresh through Oxc rather than Babel, eliminating per-file Babel parsing overhead.&lt;/p&gt;
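&lt;p&gt;The speedup column can be re-derived from the two timing columns, which is a useful sanity check when you rerun the benchmark on your own project. The sketch below copies the timings from the table (converted to milliseconds); the helper names &lt;code&gt;speedup&lt;/code&gt; and &lt;code&gt;latencyReductionPercent&lt;/code&gt; are illustrative only.&lt;/p&gt;

```typescript
// Timings copied from the benchmark table, converted to milliseconds.
const rows = [
  { scenario: "Cold prod build", before: 72400, after: 3100 },
  { scenario: "Warm prod build (cache)", before: 18600, after: 1400 },
  { scenario: "Dev server first start", before: 21300, after: 1900 },
  { scenario: "HMR (single React component)", before: 320, after: 62 },
  { scenario: "npm install (slimmer deps)", before: 48000, after: 22000 },
];

// Ratio of old time to new time, e.g. 2 means "twice as fast".
function speedup(before: number, after: number): number {
  return before / after;
}

// Share of the original time that was eliminated, as a percentage.
function latencyReductionPercent(before: number, after: number): number {
  return (1 - after / before) * 100;
}

for (const row of rows) {
  const ratio = speedup(row.before, row.after).toFixed(1);
  const cut = latencyReductionPercent(row.before, row.after).toFixed(0);
  console.log(`${row.scenario}: ${ratio}x faster, ${cut}% less time`);
}
```

&lt;p&gt;For the HMR row this gives a ratio just above 5 and a reduction of roughly 81%, which lines up with the "80% lower HMR latency" figure in the earlier comparison table.&lt;/p&gt;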

&lt;h2&gt;
  
  
  3. Migration — Packages, Config, and Breaking Changes
&lt;/h2&gt;

&lt;p&gt;Migrating to Vite 8 is more than a dependency bump. Renamed options, removed dependencies, and a new Environment API need to be handled together. The following four steps cover the safe path.&lt;/p&gt;

&lt;h3&gt;
  
  
  3.1 Step 1 — Update &lt;code&gt;package.json&lt;/code&gt;
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"devDependencies"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"vite"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"^8.0.3"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"@vitejs/plugin-react"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"^6.0.0"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"@rolldown/plugin-babel"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"^1.0.0"&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"engines"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"node"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"^20.19.0 || &amp;gt;=22.12.0"&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Vite 8 &lt;strong&gt;does not support Node 18&lt;/strong&gt;; it requires Node 20.19+ (on the 20.x line) or 22.12+. Update CI Docker base images to match.&lt;/p&gt;
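&lt;p&gt;A pre-flight check in CI can catch an unsupported runtime before the build starts. The sketch below is a hypothetical helper (the names &lt;code&gt;parseVersion&lt;/code&gt; and &lt;code&gt;supportsVite8&lt;/code&gt; are illustrative and not part of Vite), and it assumes the requirement is exactly the one stated above: 20.19 or later on the 20.x line, or 22.12 and later.&lt;/p&gt;

```typescript
// Hypothetical helper, not a Vite API: checks a Node.js version string
// against the requirement described in the text.
function parseVersion(version: string): [number, number, number] {
  // Accepts "v20.19.0" or "20.19.0"; missing or invalid parts become 0.
  const parts = version.replace(/^v/, "").split(".").map((p) => parseInt(p, 10));
  return [parts[0] || 0, parts[1] || 0, parts[2] || 0];
}

function supportsVite8(version: string): boolean {
  const [major, minor] = parseVersion(version);
  if (major === 20) return minor >= 19; // 20.19+ on the 20.x line
  if (major === 22) return minor >= 12; // 22.12+ on the 22.x line
  return major > 22;                    // later majors are accepted
}

console.log(supportsVite8("20.18.1"));
console.log(supportsVite8("22.12.0"));
```

&lt;p&gt;In a real pipeline the input would typically be the running version (for example &lt;code&gt;process.versions.node&lt;/code&gt;), with the job failing early when the check returns &lt;code&gt;false&lt;/code&gt;.&lt;/p&gt;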

&lt;h3&gt;
  
  
  3.2 Step 2 — Rename &lt;code&gt;vite.config.ts&lt;/code&gt; Options
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// vite.config.ts — Vite 8-compatible config&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;defineConfig&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;vite&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="nx"&gt;react&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;@vitejs/plugin-react&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;

&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="k"&gt;default&lt;/span&gt; &lt;span class="nf"&gt;defineConfig&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;plugins&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nf"&gt;react&lt;/span&gt;&lt;span class="p"&gt;()],&lt;/span&gt;
  &lt;span class="na"&gt;build&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="c1"&gt;// ❌ build.rollupOptions  →  ✅ build.rolldownOptions&lt;/span&gt;
    &lt;span class="na"&gt;rolldownOptions&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="na"&gt;output&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="na"&gt;manualChunks&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
          &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;react-vendor&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;react&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;react-dom&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
          &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;apollo&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;@apollo/client&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;graphql&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
      &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="c1"&gt;// Lightning CSS is now the default minifier (opt back into esbuild explicitly if needed)&lt;/span&gt;
    &lt;span class="na"&gt;cssMinify&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;target&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;es2022&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;
  &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="na"&gt;worker&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="c1"&gt;// ❌ worker.rollupOptions  →  ✅ worker.rolldownOptions&lt;/span&gt;
    &lt;span class="na"&gt;rolldownOptions&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{}&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;})&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
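&lt;p&gt;For repositories with many config files, the rename can be applied mechanically. The following is a minimal sketch, assuming the config is available as a plain object; &lt;code&gt;renameRollupOptions&lt;/code&gt; is a hypothetical helper, not something Vite ships, and it covers only the two renames marked in the config above.&lt;/p&gt;

```typescript
// Hypothetical migration helper (not a Vite API): renames
// build.rollupOptions and worker.rollupOptions to rolldownOptions.
type PlainConfig = { [key: string]: any };

function renameRollupOptions(config: PlainConfig): PlainConfig {
  const next: PlainConfig = { ...config }; // shallow copy, input is not mutated
  for (const section of ["build", "worker"]) {
    const current = next[section];
    if (typeof current !== "object" || current === null) continue;
    if (!("rollupOptions" in current)) continue;
    const { rollupOptions, ...rest } = current;
    // An explicit rolldownOptions entry wins over the legacy one.
    next[section] = { ...rest, rolldownOptions: rest.rolldownOptions ?? rollupOptions };
  }
  return next;
}

const legacy = {
  build: { rollupOptions: { output: {} }, target: "es2022" },
  worker: { rollupOptions: {} },
};
console.log(JSON.stringify(renameRollupOptions(legacy)));
```

&lt;p&gt;The helper only moves keys. Whether every nested option keeps the same meaning under Rolldown still has to be checked against the migration guide.&lt;/p&gt;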



&lt;blockquote&gt;
&lt;p&gt;⚠️ &lt;code&gt;import.meta.hot.accept(new URL('./module.js', import.meta.url))&lt;/code&gt; no longer works. You must pass a module ID string: &lt;code&gt;import.meta.hot.accept('./module.js')&lt;/code&gt;. This pattern shows up frequently in &lt;code&gt;react-router&lt;/code&gt; and &lt;code&gt;vue-router&lt;/code&gt; route splitting; run &lt;code&gt;git grep "hot.accept(new URL"&lt;/code&gt; to find every occurrence in one pass.&lt;/p&gt;
&lt;/blockquote&gt;
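&lt;p&gt;Once the occurrences are located, the rewrite itself is mechanical. The sketch below is a hypothetical regex-based codemod (the names &lt;code&gt;rewriteHotAccept&lt;/code&gt; and &lt;code&gt;countHotAcceptUrls&lt;/code&gt; are illustrative). It assumes the first argument is a string literal wrapped in &lt;code&gt;new URL(..., import.meta.url)&lt;/code&gt;; dynamic expressions are left untouched and need manual review.&lt;/p&gt;

```typescript
// Hypothetical codemod sketch: turns
//   hot.accept(new URL('./x.js', import.meta.url) ...
// into
//   hot.accept('./x.js' ...
// Only a literal specifier is handled; everything else is left as-is.
const HOT_ACCEPT_URL =
  /hot\.accept\(\s*new URL\(\s*(['"])([^'"]+)\1\s*,\s*import\.meta\.url\s*\)/g;

function rewriteHotAccept(source: string): string {
  // $1 is the quote character, $2 is the module specifier.
  return source.replace(HOT_ACCEPT_URL, "hot.accept($1$2$1");
}

function countHotAcceptUrls(source: string): number {
  const matches = source.match(HOT_ACCEPT_URL);
  return matches === null ? 0 : matches.length;
}

const before = "import.meta.hot.accept(new URL('./module.js', import.meta.url))";
console.log(rewriteHotAccept(before));
```

&lt;p&gt;A regex is enough for a one-off sweep of this shape. For anything more varied, an AST-based codemod is the safer choice.&lt;/p&gt;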

&lt;h3&gt;
  
  
  3.3 Step 3 — Adopt plugin-react v6 and Remove Babel Dependencies
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Remove Babel-related packages&lt;/span&gt;
pnpm remove @babel/core @babel/preset-env @babel/preset-react &lt;span class="se"&gt;\&lt;/span&gt;
  babel-plugin-styled-components babel-plugin-react-compiler

&lt;span class="c"&gt;# Only projects using React Compiler need to reinstall&lt;/span&gt;
pnpm add &lt;span class="nt"&gt;-D&lt;/span&gt; @rolldown/plugin-babel babel-plugin-react-compiler
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For projects that use React Compiler, opt in explicitly through plugin-react v6's &lt;code&gt;reactCompilerPreset&lt;/code&gt; helper:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
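&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;defineConfig&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;vite&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="nx"&gt;react&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;reactCompilerPreset&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;@vitejs/plugin-react&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;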

&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="k"&gt;default&lt;/span&gt; &lt;span class="nf"&gt;defineConfig&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;plugins&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="nf"&gt;react&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
      &lt;span class="na"&gt;babel&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;reactCompilerPreset&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;target&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;19&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="p"&gt;})&lt;/span&gt;
    &lt;span class="p"&gt;})&lt;/span&gt;
  &lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="p"&gt;})&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  3.4 Step 4 — Migrate Custom SSR to Environment API + ModuleRunner
&lt;/h3&gt;

&lt;p&gt;Users of Vite-based frameworks such as Nuxt, SvelteKit, and Astro do not need to touch this, because the framework handles it. Projects that built &lt;strong&gt;custom SSR servers (e.g. Express + Vite middleware)&lt;/strong&gt; must migrate away from &lt;code&gt;server.ssrLoadModule()&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// BEFORE — Vite 7&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;createServer&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;vite&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;vite&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;createServer&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;server&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;middlewareMode&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="p"&gt;})&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;mod&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;vite&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;ssrLoadModule&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;/src/entry-server.tsx&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;// AFTER — Vite 8 (Environment API + ModuleRunner)&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;createServer&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;createServerModuleRunner&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;vite&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;vite&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;createServer&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;server&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;middlewareMode&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="p"&gt;})&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;runner&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;createServerModuleRunner&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;vite&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;environments&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;ssr&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;mod&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;runner&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="k"&gt;import&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;/src/entry-server.tsx&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This is more than a rename. ModuleRunner is designed to &lt;strong&gt;execute modules in workers, separate processes, or even non-Node edge runtimes&lt;/strong&gt; — Cloudflare Workers, Deno Deploy — through the same API.&lt;/p&gt;

&lt;h2&gt;
  
  
  4. ManoIT Case Study — Migrating Four Production Projects
&lt;/h2&gt;

&lt;p&gt;In late April 2026 the ManoIT engineering team migrated four production projects (LMS Admin Web, Instructor Portal, Admin Dashboard, Mobile PWA) from Vite 7 to Vite 8 in a single sprint. Results:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Project&lt;/th&gt;
&lt;th&gt;Dependencies&lt;/th&gt;
&lt;th&gt;LOC&lt;/th&gt;
&lt;th&gt;Vite 7 prod build&lt;/th&gt;
&lt;th&gt;Vite 8 prod build&lt;/th&gt;
&lt;th&gt;Build speedup&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;LMS Admin Web&lt;/td&gt;
&lt;td&gt;412&lt;/td&gt;
&lt;td&gt;180k&lt;/td&gt;
&lt;td&gt;108s&lt;/td&gt;
&lt;td&gt;5.2s&lt;/td&gt;
&lt;td&gt;21×&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Instructor Portal&lt;/td&gt;
&lt;td&gt;287&lt;/td&gt;
&lt;td&gt;92k&lt;/td&gt;
&lt;td&gt;54s&lt;/td&gt;
&lt;td&gt;2.8s&lt;/td&gt;
&lt;td&gt;19×&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Admin Dashboard&lt;/td&gt;
&lt;td&gt;198&lt;/td&gt;
&lt;td&gt;63k&lt;/td&gt;
&lt;td&gt;38s&lt;/td&gt;
&lt;td&gt;2.1s&lt;/td&gt;
&lt;td&gt;18×&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Mobile PWA&lt;/td&gt;
&lt;td&gt;156&lt;/td&gt;
&lt;td&gt;41k&lt;/td&gt;
&lt;td&gt;27s&lt;/td&gt;
&lt;td&gt;1.6s&lt;/td&gt;
&lt;td&gt;17×&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Across the full GitHub Actions pipeline, the average run time dropped from &lt;strong&gt;9m 12s to 3m 47s&lt;/strong&gt;, monthly Actions minutes fell &lt;strong&gt;~58%&lt;/strong&gt;, and as a side effect, daily merge throughput per developer rose by an average of 1.4 PRs.&lt;/p&gt;
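&lt;p&gt;The ratios are easy to recompute from the raw timings, which is worth doing when reproducing the measurement on another codebase. The sketch below uses the numbers reported above; the helper names are illustrative.&lt;/p&gt;

```typescript
// Recomputes the speedup column and the pipeline reduction from the
// timings reported in the text. Helper names are illustrative.
function speedup(beforeSeconds: number, afterSeconds: number): number {
  return beforeSeconds / afterSeconds;
}

function percentReduction(before: number, after: number): number {
  return ((before - after) / before) * 100;
}

function toSeconds(minutes: number, seconds: number): number {
  return minutes * 60 + seconds;
}

const builds = [
  { project: "LMS Admin Web", before: 108, after: 5.2 },
  { project: "Instructor Portal", before: 54, after: 2.8 },
  { project: "Admin Dashboard", before: 38, after: 2.1 },
  { project: "Mobile PWA", before: 27, after: 1.6 },
];

for (const b of builds) {
  console.log(`${b.project}: ${speedup(b.before, b.after).toFixed(1)}x`);
}

const pipelineBefore = toSeconds(9, 12); // 9m 12s
const pipelineAfter = toSeconds(3, 47);  // 3m 47s
console.log(`pipeline: ${percentReduction(pipelineBefore, pipelineAfter).toFixed(1)}% shorter`);
```

&lt;p&gt;The per-project ratios round to the values in the speedup column, while the end-to-end pipeline improves by a much smaller factor (roughly 2.4×) because install, test, and deploy steps are unaffected by the bundler.&lt;/p&gt;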

&lt;h3&gt;
  
  
  4.1 Three Migration Pitfalls We Hit
&lt;/h3&gt;

&lt;p&gt;The path was mostly smooth, but three real-world incompatibilities surfaced:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;vite-plugin-svgr&lt;/code&gt; regression&lt;/strong&gt; — pre-1.0 versions had a transform hook that conflicted with Rolldown's &lt;code&gt;load&lt;/code&gt; hook ordering. Upgrade to v1.0.0 or later.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Apollo Client codegen watch mode&lt;/strong&gt; — its &lt;code&gt;chokidar&lt;/code&gt; watcher collided with Rolldown's worker thread handles. Switch to &lt;code&gt;--noWatch&lt;/code&gt; and run a separate nodemon.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Storybook 8.x static builds&lt;/strong&gt; — Storybook 9+ is required. 8.x is incompatible with Rolldown's plugin hook ordering. We bumped to Storybook 9.4 in the same migration.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  4.2 Production Migration Checklist
&lt;/h3&gt;

&lt;p&gt;The checklist below is extracted from the ManoIT internal migration playbook. Each item maps to a separate PR so rollback cost stays bounded.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;① Node runtime&lt;/strong&gt; — bump Dockerfile / CI base images to &lt;code&gt;node:20.19-alpine&lt;/code&gt; or newer.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;② Pre-scan deps&lt;/strong&gt; — run &lt;code&gt;npx vite-deprecation-check&lt;/code&gt; to surface deprecated &lt;code&gt;vite.config&lt;/code&gt; options.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;③ Regenerate lockfile&lt;/strong&gt; — Babel disappears from the dep graph; recreate &lt;code&gt;pnpm-lock.yaml&lt;/code&gt; for a clean tree.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;④ Canary deploy&lt;/strong&gt; — ship to staging first, monitor 24h before promoting to prod.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;⑤ Bundle-size diff&lt;/strong&gt; — run &lt;code&gt;vite-bundle-visualizer&lt;/code&gt; before/after; investigate any chunk that grows &amp;gt;5%.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;⑥ Source map verification&lt;/strong&gt; — confirm Sentry / Datadog still resolves stack traces post-deploy.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;⑦ Rollback plan&lt;/strong&gt; — keep a Vite 7 lockfile branch ready for one-command rollback within 24h.&lt;/li&gt;
&lt;/ul&gt;
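&lt;p&gt;Item ⑤ is simple to automate once per-chunk sizes are exported from both builds. The sketch below assumes two plain maps from chunk name to size in bytes; how those maps are produced (visualizer output, build manifest, or a directory listing) is left open, and &lt;code&gt;findGrownChunks&lt;/code&gt; is a hypothetical helper.&lt;/p&gt;

```typescript
// Hypothetical helper for checklist item 5: flags chunks whose size grew by
// more than a threshold (5% by default) between two builds.
type ChunkSizes = { [chunk: string]: number };

interface GrownChunk {
  chunk: string;
  before: number;
  after: number;
  growth: number; // fractional growth, e.g. 0.1 for +10%
}

function findGrownChunks(before: ChunkSizes, after: ChunkSizes, threshold = 0.05): GrownChunk[] {
  const grown: GrownChunk[] = [];
  for (const chunk of Object.keys(after)) {
    const previous = before[chunk];
    const current = after[chunk];
    // Chunks that are new or were empty have no baseline; review them by hand.
    if (previous === undefined || current === undefined || previous === 0) continue;
    const growth = (current - previous) / previous;
    if (growth > threshold) grown.push({ chunk, before: previous, after: current, growth });
  }
  return grown;
}

const vite7 = { "react-vendor": 100, apollo: 200 };
const vite8 = { "react-vendor": 104, apollo: 220, shared: 50 };
console.log(findGrownChunks(vite7, vite8));
```

&lt;p&gt;Chunk names usually contain content hashes, so in practice the names need to be normalized before the two maps are compared.&lt;/p&gt;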

&lt;h2&gt;
  
  
  5. Competitive Landscape — Turbopack, Bun, and Rspack
&lt;/h2&gt;

&lt;p&gt;Several natively compiled bundlers, most of them written in Rust, compete for the post-esbuild/post-Rollup slot in 2026. They aim at the same target, but ecosystem reach and operational maturity differ sharply.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Bundler&lt;/th&gt;
&lt;th&gt;Language&lt;/th&gt;
&lt;th&gt;Rollup compatible&lt;/th&gt;
&lt;th&gt;Primary deployment&lt;/th&gt;
&lt;th&gt;Strengths&lt;/th&gt;
&lt;th&gt;Constraints&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Rolldown (Vite 8)&lt;/td&gt;
&lt;td&gt;Rust&lt;/td&gt;
&lt;td&gt;99% plugin parity&lt;/td&gt;
&lt;td&gt;Vite, Nuxt, SvelteKit, Astro&lt;/td&gt;
&lt;td&gt;Unified toolchain + ecosystem breadth&lt;/td&gt;
&lt;td&gt;Some Rollup-only plugins regress&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Turbopack&lt;/td&gt;
&lt;td&gt;Rust&lt;/td&gt;
&lt;td&gt;None (own API)&lt;/td&gt;
&lt;td&gt;Next.js only&lt;/td&gt;
&lt;td&gt;Deep Next.js integration&lt;/td&gt;
&lt;td&gt;Vercel-locked, closed ecosystem&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Bun bundler&lt;/td&gt;
&lt;td&gt;Zig&lt;/td&gt;
&lt;td&gt;None&lt;/td&gt;
&lt;td&gt;Bun runtime + fullstack&lt;/td&gt;
&lt;td&gt;Runtime + bundler in one&lt;/td&gt;
&lt;td&gt;Limited Node API parity&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Rspack&lt;/td&gt;
&lt;td&gt;Rust&lt;/td&gt;
&lt;td&gt;webpack compatible&lt;/td&gt;
&lt;td&gt;Webpack monorepo migrations&lt;/td&gt;
&lt;td&gt;Friendly to large legacy webpack codebases&lt;/td&gt;
&lt;td&gt;Separate from Vite ecosystem&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;esbuild&lt;/td&gt;
&lt;td&gt;Go&lt;/td&gt;
&lt;td&gt;Limited&lt;/td&gt;
&lt;td&gt;Lightweight library bundling&lt;/td&gt;
&lt;td&gt;Stable, simple&lt;/td&gt;
&lt;td&gt;Code-splitting / HMR limits&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The decisive differentiator is &lt;strong&gt;"can existing Vite/Rollup plugins keep working?"&lt;/strong&gt; Turbopack is locked to Next.js, so Vue/Svelte/Astro users cannot adopt it. Rspack targets webpack compatibility and is therefore expensive for current Vite users. Rolldown lands in the gap between them and takes the most conservative-yet-powerful path: &lt;strong&gt;Rollup API + Rust speed&lt;/strong&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  6. The Next Six Months — Vite+ and What to Expect
&lt;/h2&gt;

&lt;p&gt;Alongside Vite 8, VoidZero introduced &lt;strong&gt;Vite+&lt;/strong&gt;, a commercial add-on. Vite itself remains permanently free and open source; Vite+ targets enterprise teams with &lt;strong&gt;distributed build cache, professional Vite DevTools features, and prioritized support&lt;/strong&gt;. With Bun, Deno, and Turbopack each rolling their own bundlers, Vite's pitch is clear: &lt;strong&gt;standardize the engine, differentiate through DevTools and infrastructure&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;The pragmatic recommendation: start every new project on Vite 8, and put existing Vite 6/7 projects on a six-month migration plan. Most compatibility regressions were shaken out during the &lt;code&gt;rolldown-vite&lt;/code&gt; opt-in window, so the risk surface is small, and the ROI (build speed, CI cost, HMR latency) is immediate. ManoIT's internal standard is updated as of May 2026: &lt;strong&gt;"Vite 8 is the default bundler for all new frontend projects."&lt;/strong&gt;&lt;/p&gt;




&lt;p&gt;&lt;em&gt;This article was written by the ManoIT engineering team with the assistance of Anthropic Claude AI. Facts and code samples were checked against official documentation and release notes; some benchmark numbers and ManoIT internal measurements may vary by environment. Validate in your own setup before applying changes in production. — ManoIT (manoit.co.kr)&lt;/em&gt;&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Originally published at &lt;a href="https://www.manoit.co.kr/forum/view/1465085" rel="noopener noreferrer"&gt;ManoIT Tech Blog&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>javascript</category>
      <category>webdev</category>
      <category>frontend</category>
      <category>nextjs</category>
    </item>
    <item>
      <title>AWS Lambda S3 Files — The Complete Guide: EFS-Backed S3 Mounts Are Rewriting Serverless Architecture and the Agentic AI Workload Standard</title>
      <dc:creator>daniel jeong</dc:creator>
      <pubDate>Wed, 29 Apr 2026 00:19:08 +0000</pubDate>
      <link>https://dev.to/x4nent/aws-lambda-s3-files-the-complete-guide-efs-backed-s3-mounts-are-rewriting-serverless-4e2a</link>
      <guid>https://dev.to/x4nent/aws-lambda-s3-files-the-complete-guide-efs-backed-s3-mounts-are-rewriting-serverless-4e2a</guid>
      <description>&lt;p&gt;On &lt;strong&gt;April 21, 2026&lt;/strong&gt;, AWS announced general availability of &lt;strong&gt;Lambda S3 Files&lt;/strong&gt; — the ability to &lt;strong&gt;mount an Amazon S3 bucket as a standard POSIX file system inside a Lambda function&lt;/strong&gt;. This is not a convenience feature. The change demolishes both of the foundational Lambda constraints of the past decade in a single move: the &lt;strong&gt;512MB–10GB &lt;code&gt;/tmp&lt;/code&gt; ephemeral storage ceiling&lt;/strong&gt;, and the &lt;strong&gt;mandatory statelessness imposed by S3's GET/PUT object semantics&lt;/strong&gt;. A week later, in the April 28 "What's Next with AWS" keynote, AWS bundled S3 Files with &lt;strong&gt;Lambda Durable Functions, Bedrock Managed Agents, and Amazon Quick&lt;/strong&gt; — positioning S3 Files as &lt;strong&gt;"the default data plane for the agentic AI era."&lt;/strong&gt; This article unpacks S3 Files from an enterprise production angle: internals, IAM trust model, CloudFormation/SAM templates, Durable Functions composition patterns, cost/latency trade-offs, and a migration checklist for legacy file-based workloads.&lt;/p&gt;

&lt;h2&gt;
  
  
  1. Why S3 Files Is a Game Changer — Two Lambda Constraints, Dissolved
&lt;/h2&gt;

&lt;p&gt;For 10+ years Lambda has been bounded by two structural limits. First, &lt;strong&gt;ephemeral &lt;code&gt;/tmp&lt;/code&gt; storage&lt;/strong&gt; grew to 10 GB in 2022 but remained an isolated, volatile sandbox tied to a single execution environment. Second, &lt;strong&gt;S3 is object storage&lt;/strong&gt;: partial reads, partial writes, and directory traversal are awkward, and every interaction pays a serialization cost through &lt;code&gt;boto3&lt;/code&gt; or another AWS SDK. The consequence: file-bound workloads like ML weight loading, BAM/VCF genomics analysis, video transcoding intermediates, and AI agent memory ended up on EC2, ECS, or Step Functions, or required a complex VPC + EFS direct mount that few teams enjoyed.&lt;/p&gt;

&lt;p&gt;S3 Files removes that detour. Internally it is a &lt;strong&gt;managed gateway that borrows the Amazon EFS distributed file system engine and overlays a POSIX layer on an S3 bucket&lt;/strong&gt;. Lambda receives &lt;code&gt;/mnt/&amp;lt;path&amp;gt;&lt;/code&gt; as an automounted directory; mutations sync as S3 objects in the background; &lt;strong&gt;every Lambda mounted on the same file system sees the same directory tree in real time&lt;/strong&gt;. Serverless no longer has to be stateless.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Constraint&lt;/th&gt;
&lt;th&gt;Before S3 Files (2024–2026 Q1)&lt;/th&gt;
&lt;th&gt;After S3 Files (GA 2026-04)&lt;/th&gt;
&lt;th&gt;Real-world impact&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Storage ceiling&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;/tmp&lt;/code&gt; ≤ 10GB, volatile&lt;/td&gt;
&lt;td&gt;Entire S3 bucket as POSIX (effectively unlimited)&lt;/td&gt;
&lt;td&gt;Direct handling of large model weights, BAM files&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Multi-function data sharing&lt;/td&gt;
&lt;td&gt;S3 GET/PUT or external queues&lt;/td&gt;
&lt;td&gt;Concurrent mount on the same FS&lt;/td&gt;
&lt;td&gt;Parallel agent workspaces&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;VPC requirements&lt;/td&gt;
&lt;td&gt;EFS mount needs VPC, +1s cold start&lt;/td&gt;
&lt;td&gt;VPC still needed but mount target auto-created&lt;/td&gt;
&lt;td&gt;90% less setup work&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;File semantics&lt;/td&gt;
&lt;td&gt;S3 objects (inefficient partial writes/rename)&lt;/td&gt;
&lt;td&gt;POSIX read/write/seek/rename&lt;/td&gt;
&lt;td&gt;Drop-in for legacy code&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Cost model&lt;/td&gt;
&lt;td&gt;Per-request GET/PUT + transfer&lt;/td&gt;
&lt;td&gt;S3 + EFS Throughput mode&lt;/td&gt;
&lt;td&gt;Workload-specific analysis required&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The core promise: &lt;strong&gt;the simplicity of a file system with the durability and cost profile of S3&lt;/strong&gt;. That doesn't make every Lambda a migration target — section 5 revisits the cost/latency math.&lt;/p&gt;

&lt;h2&gt;
  
  
  2. Architecture — A 5-Layer Chain: Bucket → File System → Mount Target → Access Point → Lambda
&lt;/h2&gt;

&lt;p&gt;S3 Files inherits the same abstraction stack that EFS uses. Each layer carries its own controls for permissions, isolation, or VPC routing.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Layer&lt;/th&gt;
&lt;th&gt;Resource&lt;/th&gt;
&lt;th&gt;Role&lt;/th&gt;
&lt;th&gt;Key controls&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;S3 Bucket&lt;/td&gt;
&lt;td&gt;Object durability (11 nines, 99.999999999%)&lt;/td&gt;
&lt;td&gt;Versioning, encryption, lifecycle&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;S3 File System&lt;/td&gt;
&lt;td&gt;POSIX gateway over the bucket&lt;/td&gt;
&lt;td&gt;S3↔POSIX mapping, metadata cache&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;td&gt;Mount Target&lt;/td&gt;
&lt;td&gt;ENI endpoint inside a VPC subnet/AZ&lt;/td&gt;
&lt;td&gt;Security groups, subnet routing&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;4&lt;/td&gt;
&lt;td&gt;Access Point&lt;/td&gt;
&lt;td&gt;POSIX UID/GID, root path, permissions&lt;/td&gt;
&lt;td&gt;UID/GID, root path, 0755&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;5&lt;/td&gt;
&lt;td&gt;Lambda Function&lt;/td&gt;
&lt;td&gt;VPC attachment + local mount path&lt;/td&gt;
&lt;td&gt;&lt;code&gt;/mnt/&amp;lt;name&amp;gt;&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;When you add S3 Files to a Lambda for the first time and &lt;strong&gt;no Access Point exists&lt;/strong&gt;, one is created automatically (UID/GID &lt;code&gt;1000:1000&lt;/code&gt;, root &lt;code&gt;/lambda&lt;/code&gt;, perms &lt;code&gt;755&lt;/code&gt;). If Access Points already exist, &lt;strong&gt;you must explicitly select one&lt;/strong&gt; — a deliberate guardrail against accidental cross-tenant data exposure in shared workspaces.&lt;/p&gt;

&lt;h3&gt;
  
  
  2.1 What changes inside the function code
&lt;/h3&gt;

&lt;p&gt;The local mount path must start with &lt;code&gt;/mnt/&lt;/code&gt;. From the function's point of view it's just a directory. Compare:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# BEFORE — download object via S3 SDK
&lt;/span&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;boto3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;tempfile&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;
&lt;span class="n"&gt;s3&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;boto3&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;client&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;s3&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;lambda_handler&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;event&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;bucket&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;event&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;bucket&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="n"&gt;key&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;event&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;key&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="c1"&gt;# WARNING: /tmp is local to one execution environment, capped at 10GB
&lt;/span&gt;    &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;tempfile&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;NamedTemporaryFile&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;delete&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;False&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;s3&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;download_fileobj&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;bucket&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;local_path&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;
    &lt;span class="nf"&gt;process&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;local_path&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;s3&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;upload_file&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;local_path&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;.out&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;bucket&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;key&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;.out&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;unlink&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;local_path&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# AFTER — mounted via S3 Files, plain POSIX I/O
&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;lambda_handler&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;event&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;base&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;environ&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;S3FILES_MOUNT&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;   &lt;span class="c1"&gt;# e.g. /mnt/workspace
&lt;/span&gt;    &lt;span class="n"&gt;src&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;path&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;base&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;event&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;key&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
    &lt;span class="n"&gt;dst&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;src&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;.out&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="nf"&gt;process&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;src&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;dst&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;                    &lt;span class="c1"&gt;# download/upload code is gone
&lt;/span&gt;    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;output&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;dst&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The handler's line count drops by &lt;strong&gt;~50%&lt;/strong&gt;. Download time (especially during cold starts) and peak memory fall with it, and the whole failure-handling surface (partial download retry, multipart upload backoff) goes away. More importantly, the next invocation can reuse the same file immediately.&lt;/p&gt;

&lt;h2&gt;
  
  
  3. Production Setup — IAM, CloudFormation, and SAM
&lt;/h2&gt;

&lt;p&gt;To attach S3 Files to a Lambda you need four things in alignment: ① the file system + mount target + access point exist in a single VPC, ② the Lambda is wired to the same VPC and an AZ-aligned subnet, ③ the execution role carries &lt;code&gt;s3files:ClientMount&lt;/code&gt; (read) and/or &lt;code&gt;s3files:ClientWrite&lt;/code&gt; (write), and ④ the security group allows NFS (TCP 2049).&lt;/p&gt;

&lt;h3&gt;
  
  
  3.1 IAM execution role — least privilege
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"Version"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"2012-10-17"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"Statement"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"Sid"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"AllowS3FilesMount"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"Effect"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Allow"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"Action"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="s2"&gt;"s3files:ClientMount"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="s2"&gt;"s3files:ClientWrite"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="s2"&gt;"s3files:DescribeMountTargets"&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"Resource"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"arn:aws:s3files:ap-northeast-2:123456789012:file-system/fs-0abcd1234efgh5678"&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"Sid"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"AllowVPCNetworking"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"Effect"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Allow"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"Action"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="s2"&gt;"ec2:CreateNetworkInterface"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="s2"&gt;"ec2:DescribeNetworkInterfaces"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="s2"&gt;"ec2:DeleteNetworkInterface"&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"Resource"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"*"&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Warning:&lt;/strong&gt; never grant &lt;code&gt;s3files:ClientWrite&lt;/code&gt; to a read-only job. POSIX semantics mean a single misbehaving &lt;code&gt;rm -rf&lt;/code&gt; from one Lambda can wipe a shared workspace. In multi-agent setups, also isolate root paths per agent (&lt;code&gt;/agents/{agent-id}/&lt;/code&gt;) at the Access Point layer.&lt;/p&gt;
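
&lt;p&gt;One way to keep the read-only case from inheriting write access by copy-paste is to generate the policy rather than hand-edit it. The following is a minimal sketch, not an AWS-provided helper: &lt;code&gt;build_s3files_policy&lt;/code&gt; and its &lt;code&gt;allow_write&lt;/code&gt; flag are hypothetical names, and the action names are simply the ones used in the policy above.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;import json


def build_s3files_policy(file_system_arn, allow_write=False):
    """Return an IAM policy document scoped to one file system.

    Hypothetical helper: write access is opt-in, so a read-only job
    gets s3files:ClientMount and nothing else.
    """
    actions = ["s3files:ClientMount"]
    if allow_write:
        actions.append("s3files:ClientWrite")
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AllowS3FilesMount",
                "Effect": "Allow",
                "Action": actions,
                "Resource": file_system_arn,
            }
        ],
    }


if __name__ == "__main__":
    # Example ARN copied from the policy above
    arn = "arn:aws:s3files:ap-northeast-2:123456789012:file-system/fs-0abcd1234efgh5678"
    print(json.dumps(build_s3files_policy(arn), indent=2))
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Because writes are off by default, a job only gets &lt;code&gt;s3files:ClientWrite&lt;/code&gt; when someone passes &lt;code&gt;allow_write=True&lt;/code&gt; on purpose, which is easier to catch in review than an extra line in a JSON array.&lt;/p&gt;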

&lt;h3&gt;
  
  
  3.2 SAM template — full Lambda + S3 Files stack
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="c1"&gt;# template.yml — AWS SAM&lt;/span&gt;
&lt;span class="c1"&gt;# Lambda + S3 Files integrated stack&lt;/span&gt;
&lt;span class="na"&gt;AWSTemplateFormatVersion&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s1"&gt;'&lt;/span&gt;&lt;span class="s"&gt;2010-09-09'&lt;/span&gt;
&lt;span class="na"&gt;Transform&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;AWS::Serverless-2016-10-31&lt;/span&gt;

&lt;span class="na"&gt;Parameters&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;BucketName&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;Type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;String&lt;/span&gt;
  &lt;span class="na"&gt;VpcId&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;Type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;AWS::EC2::VPC::Id&lt;/span&gt;
  &lt;span class="na"&gt;SubnetIds&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;Type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;List&amp;lt;AWS::EC2::Subnet::Id&amp;gt;&lt;/span&gt;

&lt;span class="na"&gt;Resources&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;AgentBucket&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;Type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;AWS::S3::Bucket&lt;/span&gt;
    &lt;span class="na"&gt;Properties&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;BucketName&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kt"&gt;!Ref&lt;/span&gt; &lt;span class="s"&gt;BucketName&lt;/span&gt;
      &lt;span class="na"&gt;VersioningConfiguration&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;{&lt;/span&gt; &lt;span class="nv"&gt;Status&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="nv"&gt;Enabled&lt;/span&gt; &lt;span class="pi"&gt;}&lt;/span&gt;
      &lt;span class="na"&gt;BucketEncryption&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;ServerSideEncryptionConfiguration&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;ServerSideEncryptionByDefault&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;{&lt;/span&gt; &lt;span class="nv"&gt;SSEAlgorithm&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="nv"&gt;AES256&lt;/span&gt; &lt;span class="pi"&gt;}&lt;/span&gt;

  &lt;span class="na"&gt;AgentFileSystem&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;Type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;AWS::S3Files::FileSystem&lt;/span&gt;    &lt;span class="c1"&gt;# New resource type, GA 2026-04&lt;/span&gt;
    &lt;span class="na"&gt;Properties&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;BucketArn&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kt"&gt;!GetAtt&lt;/span&gt; &lt;span class="s"&gt;AgentBucket.Arn&lt;/span&gt;
      &lt;span class="na"&gt;ThroughputMode&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;ELASTIC&lt;/span&gt;          &lt;span class="c1"&gt;# Auto-scaling&lt;/span&gt;
      &lt;span class="na"&gt;PerformanceMode&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;GENERAL_PURPOSE&lt;/span&gt;

  &lt;span class="na"&gt;AgentMountTarget&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;Type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;AWS::S3Files::MountTarget&lt;/span&gt;
    &lt;span class="na"&gt;Properties&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;FileSystemId&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kt"&gt;!Ref&lt;/span&gt; &lt;span class="s"&gt;AgentFileSystem&lt;/span&gt;
      &lt;span class="na"&gt;SubnetId&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kt"&gt;!Select&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;0&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;!Ref&lt;/span&gt; &lt;span class="nv"&gt;SubnetIds&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;
      &lt;span class="na"&gt;SecurityGroups&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="kt"&gt;!Ref&lt;/span&gt; &lt;span class="nv"&gt;AgentSG&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;

  &lt;span class="na"&gt;AgentAccessPoint&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;Type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;AWS::S3Files::AccessPoint&lt;/span&gt;
    &lt;span class="na"&gt;Properties&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;FileSystemId&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kt"&gt;!Ref&lt;/span&gt; &lt;span class="s"&gt;AgentFileSystem&lt;/span&gt;
      &lt;span class="na"&gt;PosixUser&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;{&lt;/span&gt; &lt;span class="nv"&gt;Uid&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="nv"&gt;1000&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;Gid&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="nv"&gt;1000&lt;/span&gt; &lt;span class="pi"&gt;}&lt;/span&gt;
      &lt;span class="na"&gt;RootDirectory&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;Path&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;/lambda&lt;/span&gt;
        &lt;span class="na"&gt;CreationInfo&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;{&lt;/span&gt; &lt;span class="nv"&gt;OwnerUid&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="nv"&gt;1000&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;OwnerGid&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="nv"&gt;1000&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;Permissions&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s1"&gt;'&lt;/span&gt;&lt;span class="s"&gt;0755'&lt;/span&gt; &lt;span class="pi"&gt;}&lt;/span&gt;

  &lt;span class="na"&gt;AgentSG&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;Type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;AWS::EC2::SecurityGroup&lt;/span&gt;
    &lt;span class="na"&gt;Properties&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;VpcId&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kt"&gt;!Ref&lt;/span&gt; &lt;span class="s"&gt;VpcId&lt;/span&gt;
      &lt;span class="na"&gt;GroupDescription&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;NFS for S3 Files&lt;/span&gt;
      &lt;span class="na"&gt;SecurityGroupIngress&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;IpProtocol&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;tcp&lt;/span&gt;
          &lt;span class="na"&gt;FromPort&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;2049&lt;/span&gt;
          &lt;span class="na"&gt;ToPort&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;2049&lt;/span&gt;
          &lt;span class="na"&gt;CidrIp&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;10.0.0.0/8&lt;/span&gt;

  &lt;span class="na"&gt;AgentFunction&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;Type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;AWS::Serverless::Function&lt;/span&gt;
    &lt;span class="na"&gt;Properties&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;Runtime&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;python3.13&lt;/span&gt;
      &lt;span class="na"&gt;Handler&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;app.handler&lt;/span&gt;
      &lt;span class="na"&gt;Architectures&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;arm64&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;
      &lt;span class="na"&gt;MemorySize&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;2048&lt;/span&gt;
      &lt;span class="na"&gt;Timeout&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;300&lt;/span&gt;
      &lt;span class="na"&gt;VpcConfig&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;SubnetIds&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kt"&gt;!Ref&lt;/span&gt; &lt;span class="s"&gt;SubnetIds&lt;/span&gt;
        &lt;span class="na"&gt;SecurityGroupIds&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="kt"&gt;!Ref&lt;/span&gt; &lt;span class="nv"&gt;AgentSG&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;
      &lt;span class="na"&gt;FileSystemConfigs&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;Arn&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kt"&gt;!GetAtt&lt;/span&gt; &lt;span class="s"&gt;AgentAccessPoint.Arn&lt;/span&gt;
          &lt;span class="na"&gt;LocalMountPath&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;/mnt/workspace&lt;/span&gt;
      &lt;span class="na"&gt;Environment&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;Variables&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;S3FILES_MOUNT&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;/mnt/workspace&lt;/span&gt;
      &lt;span class="na"&gt;Policies&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;Statement&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
            &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;Effect&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Allow&lt;/span&gt;
              &lt;span class="na"&gt;Action&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
                &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;s3files:ClientMount&lt;/span&gt;
                &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;s3files:ClientWrite&lt;/span&gt;
              &lt;span class="na"&gt;Resource&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kt"&gt;!GetAtt&lt;/span&gt; &lt;span class="s"&gt;AgentFileSystem.Arn&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This is the minimum viable stack we run for the AI code review agent in our internal LMS. The single most important choice: &lt;strong&gt;&lt;code&gt;ThroughputMode: ELASTIC&lt;/code&gt;&lt;/strong&gt;, which absorbs the IOPS spikes that hit the moment many Lambdas mount simultaneously. For steady-state workloads, &lt;code&gt;BURSTING&lt;/code&gt; is cheaper.&lt;/p&gt;

&lt;h2&gt;
  
  
  4. Composition with Durable Functions — The New Multi-Agent Standard
&lt;/h2&gt;

&lt;p&gt;S3 Files alone is powerful, but the real leverage shows up when paired with &lt;strong&gt;Lambda Durable Functions (GA 2025-12)&lt;/strong&gt;. Durable Functions express multi-step workflows up to one year long inside Lambda code itself, with automatic checkpointing. S3 Files becomes the data plane of those workflows.&lt;/p&gt;

&lt;p&gt;A canonical example: an orchestrator clones a Git repo into a shared workspace, then security/style/coverage agents read the same directory in parallel and write JSON results back to the same place.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# orchestrator.py — Lambda Durable Function
&lt;/span&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;aws_lambda_powertools.utilities.durable&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;durable&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;step&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;parallel&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;subprocess&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;
&lt;span class="n"&gt;WS&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;environ&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;S3FILES_MOUNT&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

&lt;span class="nd"&gt;@durable&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;review_workflow&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;repo_url&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;commit_sha&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;workdir&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;path&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;WS&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;runs&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;execution_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="c1"&gt;# 1. Clone — checkpoint saved automatically
&lt;/span&gt;    &lt;span class="k"&gt;yield&lt;/span&gt; &lt;span class="nf"&gt;step&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;clone_repo&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;repo_url&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;commit_sha&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;workdir&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="c1"&gt;# 2. 3 agents in parallel — all share the same workdir
&lt;/span&gt;    &lt;span class="n"&gt;results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;yield&lt;/span&gt; &lt;span class="nf"&gt;parallel&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;
        &lt;span class="nf"&gt;step&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;security_agent&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;   &lt;span class="n"&gt;workdir&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="nf"&gt;step&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;style_agent&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;      &lt;span class="n"&gt;workdir&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="nf"&gt;step&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;coverage_agent&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;   &lt;span class="n"&gt;workdir&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="p"&gt;])&lt;/span&gt;
    &lt;span class="c1"&gt;# 3. Merge results
&lt;/span&gt;    &lt;span class="n"&gt;final&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;yield&lt;/span&gt; &lt;span class="nf"&gt;step&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;merge_results&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;workdir&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;results&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;final&lt;/span&gt;

&lt;span class="nd"&gt;@step&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;clone_repo&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;repo_url&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;sha&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;workdir&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;makedirs&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;workdir&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;exist_ok&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;subprocess&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;run&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;git&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;clone&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;repo_url&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;workdir&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;check&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;subprocess&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;run&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;git&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;-C&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;workdir&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;checkout&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;sha&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;check&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="nd"&gt;@step&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;security_agent&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;workdir&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="c1"&gt;# Separate Lambda — same workdir mounted
&lt;/span&gt;    &lt;span class="n"&gt;out&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;path&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;workdir&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;security.json&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="c1"&gt;# ... LLM call ...
&lt;/span&gt;    &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="nf"&gt;open&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;out&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;w&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;fh&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;dump&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;findings&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[...]},&lt;/span&gt; &lt;span class="n"&gt;fh&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;out&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Three properties make this pattern hard to beat: ① data exchange between agents bypasses the &lt;strong&gt;256KB Step Functions payload cap and the SQS per-message size limit&lt;/strong&gt;, ② if any Lambda dies mid-step the workspace survives, so &lt;strong&gt;a retry can pick up the files already written and stay idempotent&lt;/strong&gt;, and ③ external runtimes like Bedrock Managed Agents can mount the same directory to participate in the workflow.&lt;/p&gt;
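
&lt;p&gt;The idempotent-retry property holds only if each step tolerates being run twice. A minimal sketch of one way to get there, assuming the mount honors POSIX rename semantics: skip the work when the result file already exists, and otherwise write to a temporary file and rename it into place so a crashed attempt never leaves a half-written result. &lt;code&gt;write_result_once&lt;/code&gt; and its &lt;code&gt;compute&lt;/code&gt; callback are hypothetical names, not part of any AWS library.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;import json
import os
import tempfile


def write_result_once(workdir, name, compute):
    """Write compute() as JSON to workdir/name unless it already exists.

    Returns (path, ran); ran is False when an earlier attempt already
    produced the file and compute() was skipped.
    """
    out = os.path.join(workdir, name)
    if os.path.exists(out):
        return out, False

    os.makedirs(workdir, exist_ok=True)
    # Temp file lives in the same directory so the rename stays on one file system
    fd, tmp = tempfile.mkstemp(dir=workdir, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as fh:
            json.dump(compute(), fh)
        os.replace(tmp, out)  # rename into place only after a complete write
    except BaseException:
        if os.path.exists(tmp):
            os.unlink(tmp)
        raise
    return out, True
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;A step such as &lt;code&gt;security_agent&lt;/code&gt; could wrap its LLM call in &lt;code&gt;compute&lt;/code&gt;; if the Lambda dies after the rename, the retried step returns the existing path without paying for the call again.&lt;/p&gt;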

&lt;h2&gt;
  
  
  5. Cost &amp;amp; Performance Trade-offs — Not Every Lambda Should Migrate
&lt;/h2&gt;

&lt;p&gt;Applied indiscriminately, S3 Files balloons cost. Three variables matter most: &lt;strong&gt;EFS Throughput + Storage charges&lt;/strong&gt; come on top of the S3 bill, the &lt;strong&gt;VPC Lambda cold-start penalty&lt;/strong&gt; (~600ms–1.2s) still applies, and &lt;strong&gt;POSIX metadata sync overhead&lt;/strong&gt; adds roughly 5–15ms per object.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Workload pattern&lt;/th&gt;
&lt;th&gt;S3 Files fit&lt;/th&gt;
&lt;th&gt;Why&lt;/th&gt;
&lt;th&gt;Alternative&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Large-model inference (50GB+ weights)&lt;/td&gt;
&lt;td&gt;★★★★★&lt;/td&gt;
&lt;td&gt;Bypasses &lt;code&gt;/tmp&lt;/code&gt;, no per-cold-start download&lt;/td&gt;
&lt;td&gt;—&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Multi-agent code review&lt;/td&gt;
&lt;td&gt;★★★★★&lt;/td&gt;
&lt;td&gt;Shared workspace is the whole point&lt;/td&gt;
&lt;td&gt;—&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Per-event short transformations&lt;/td&gt;
&lt;td&gt;★★☆☆☆&lt;/td&gt;
&lt;td&gt;VPC cold start dominates&lt;/td&gt;
&lt;td&gt;S3 Object Lambda&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;High-frequency metadata lookups&lt;/td&gt;
&lt;td&gt;★★☆☆☆&lt;/td&gt;
&lt;td&gt;POSIX &lt;code&gt;stat&lt;/code&gt; overhead accumulates&lt;/td&gt;
&lt;td&gt;DynamoDB&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Static asset serving&lt;/td&gt;
&lt;td&gt;★☆☆☆☆&lt;/td&gt;
&lt;td&gt;S3 + CloudFront wins&lt;/td&gt;
&lt;td&gt;CloudFront&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Legacy file-based ETL&lt;/td&gt;
&lt;td&gt;★★★★☆&lt;/td&gt;
&lt;td&gt;Drop-in code migration&lt;/td&gt;
&lt;td&gt;—&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Rough cost model: 10M invocations/month, 50MB avg I/O, 100GB resident → typically &lt;strong&gt;+18–25% cost vs. plain S3&lt;/strong&gt;, but if download/upload time savings cut Lambda execution by ~30% on average, the &lt;strong&gt;total bill drops 5–12% net&lt;/strong&gt;. ROI is highest for I/O-heavy workloads.&lt;/p&gt;
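
&lt;p&gt;The net figure depends on how the bill splits between storage and compute, which the model above does not state. As a rough worked example, assume the +18–25% applies to the storage/data-plane portion and the ~30% saving to the Lambda compute portion; the 40/60 split below is a hypothetical input, not a measured one.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;def net_bill_change(storage_share, storage_delta, compute_share, compute_delta):
    """Relative change in the total bill as a weighted sum of its two parts.

    Shares are fractions of the current bill; deltas are relative changes
    (0.20 means +20%, -0.30 means -30%).
    """
    return storage_share * storage_delta + compute_share * compute_delta


if __name__ == "__main__":
    # Hypothetical split: 40% storage/data plane, 60% Lambda compute
    change = net_bill_change(0.40, 0.20, 0.60, -0.30)
    print(f"{change:+.1%}")  # prints -10.0%
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Under these assumptions the bill lands inside the 5–12% range quoted above. Flip the split to 60% storage and 40% compute and the same deltas come out flat, which is why profiling the I/O share first (step 1 of the checklist) matters.&lt;/p&gt;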

&lt;h2&gt;
  
  
  6. Migration Checklist — 8 Steps to Production
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Step&lt;/th&gt;
&lt;th&gt;Task&lt;/th&gt;
&lt;th&gt;Verification metric&lt;/th&gt;
&lt;th&gt;Rollback&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;Profile target Lambda I/O (CloudWatch Logs Insights)&lt;/td&gt;
&lt;td&gt;S3 GET/PUT ratio, average object size&lt;/td&gt;
&lt;td&gt;—&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;Review VPC/subnet/SG (skip if Lambda is already in VPC)&lt;/td&gt;
&lt;td&gt;NFS 2049 inbound allowed&lt;/td&gt;
&lt;td&gt;—&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;td&gt;Pick Throughput Mode (ELASTIC vs BURSTING)&lt;/td&gt;
&lt;td&gt;Expected concurrent mounts&lt;/td&gt;
&lt;td&gt;—&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;4&lt;/td&gt;
&lt;td&gt;Isolate Access Point root path (per tenant/agent)&lt;/td&gt;
&lt;td&gt;UID/GID + permissions diagram&lt;/td&gt;
&lt;td&gt;—&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;5&lt;/td&gt;
&lt;td&gt;Canary deploy (Alias weight 10%)&lt;/td&gt;
&lt;td&gt;p99 latency, error rate&lt;/td&gt;
&lt;td&gt;weight 0%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;6&lt;/td&gt;
&lt;td&gt;Compose with Durable Functions (if needed)&lt;/td&gt;
&lt;td&gt;execution_id-scoped directories&lt;/td&gt;
&lt;td&gt;—&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;7&lt;/td&gt;
&lt;td&gt;Cost monitoring (EFS Throughput + Storage CW metrics)&lt;/td&gt;
&lt;td&gt;within ±20% of estimate&lt;/td&gt;
&lt;td&gt;switch Throughput Mode&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;8&lt;/td&gt;
&lt;td&gt;Full cutover + deprecate old S3 SDK code&lt;/td&gt;
&lt;td&gt;2 weeks stable operation&lt;/td&gt;
&lt;td&gt;Alias rollback&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Step 4, &lt;strong&gt;Access Point isolation&lt;/strong&gt;, is the step multi-tenant SaaS implementations most often get wrong. To stop one tenant's buggy Lambda from corrupting the entire workspace, force &lt;code&gt;RootDirectory.Path&lt;/code&gt; to &lt;code&gt;/tenants/{tenant-id}/&lt;/code&gt; and add &lt;code&gt;s3files:AccessPointRootDirectory&lt;/code&gt; conditions to the IAM policy.&lt;/p&gt;
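&lt;p&gt;One way to keep the per-tenant path rule honest in provisioning code is to build the root directory from a validated tenant id. This is a minimal sketch: the id format and the &lt;code&gt;tenant_root&lt;/code&gt; helper are hypothetical, and only the &lt;code&gt;/tenants/{tenant-id}/&lt;/code&gt; layout comes from the text above.&lt;/p&gt;

```python
import re

# Hypothetical tenant-id format: lowercase letters, digits and hyphens,
# 3 to 40 characters, starting with a letter or digit.
_TENANT_ID = re.compile(r"[a-z0-9][a-z0-9-]{2,39}")


def tenant_root(tenant_id):
    """Build the per-tenant root directory, rejecting ids that could escape it."""
    if not _TENANT_ID.fullmatch(tenant_id):
        raise ValueError(f"invalid tenant id: {tenant_id!r}")
    return f"/tenants/{tenant_id}/"
```

&lt;p&gt;Rejecting anything outside a strict allowlist (slashes, dots, uppercase) means a malformed id fails at provisioning time instead of silently widening the mount.&lt;/p&gt;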

&lt;h2&gt;
  
  
  7. Conclusion — Serverless Has Been Redefined
&lt;/h2&gt;

&lt;p&gt;For a decade, &lt;strong&gt;"serverless = stateless"&lt;/strong&gt; was the first axiom of Lambda architecture. S3 Files rewrites it to &lt;strong&gt;"serverless = no infra to manage + optional shared state."&lt;/strong&gt; This managed gateway, which layers EFS's distributed file system engine onto S3's durability and cost, isn't merely a convenience. It's a &lt;strong&gt;new data plane that lifts agentic AI, genomics, media, and legacy ETL workloads onto Lambda&lt;/strong&gt;. The April 28 keynote underlined the same direction: AWS bundled S3 Files with Bedrock Managed Agents, Durable Functions, and Amazon Quick. For the second half of 2026, the deciding question when moving multi-agent workflows or large-model inference to Lambda is no longer "are the memory and timeout enough?" but &lt;strong&gt;"do the S3 Files Throughput Mode, Access Point isolation, and Durable Functions composition match the workload?"&lt;/strong&gt; ManoIT has run S3 Files in two production paths since GA on April 21 (an internal code review agent and the LMS media transcoding pipeline) and observed &lt;strong&gt;p99 latency −28% and code line count −40%&lt;/strong&gt; in both. Use the checklist above to adopt it gradually.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;This post was authored, reviewed, and published by ManoIT's automated blogging pipeline running on Claude Opus 4.6. Sources: AWS What's New (2026-04-21), AWS News Blog "Launching S3 Files," AWS Lambda official documentation (&lt;code&gt;configuration-filesystem-s3files&lt;/code&gt;), Amazon S3 user guide (&lt;code&gt;s3-files-mounting-lambda&lt;/code&gt;), AWS Weekly Roundup 2026-04-27, "What's Next with AWS 2026" keynote.&lt;/em&gt;&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Originally published at &lt;a href="http://www.manoit.co.kr/forum/view/1463693" rel="noopener noreferrer"&gt;ManoIT Tech Blog&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>aws</category>
      <category>serverless</category>
      <category>kubernetes</category>
      <category>ai</category>
    </item>
    <item>
      <title>GitHub Actions Early April 2026 — Service Container Overrides, OIDC Custom Properties, VNET Failover, Immutable Sub</title>
      <dc:creator>daniel jeong</dc:creator>
      <pubDate>Tue, 28 Apr 2026 00:23:50 +0000</pubDate>
      <link>https://dev.to/x4nent/github-actions-early-april-2026-service-container-overrides-oidc-custom-properties-vnet-29jj</link>
      <guid>https://dev.to/x4nent/github-actions-early-april-2026-service-container-overrides-oidc-custom-properties-vnet-29jj</guid>
      <description>&lt;h1&gt;
  
  
  GitHub Actions Early April 2026 Updates — Service Container Overrides, OIDC Repository Custom Properties, Azure VNET Failover, and Immutable Subject Claims Redefining Enterprise CI/CD
&lt;/h1&gt;

&lt;p&gt;The April 2026 GitHub Actions changelog is not just another minor update. The &lt;strong&gt;April 2 "Early April 2026" bundle&lt;/strong&gt; finally exposed the long-requested &lt;code&gt;entrypoint&lt;/code&gt;/&lt;code&gt;command&lt;/code&gt; overrides for service containers in workflow YAML, promoted &lt;strong&gt;Repository Custom Properties OIDC claims to GA (General Availability)&lt;/strong&gt;, and pushed &lt;strong&gt;Azure VNET failover for GitHub-hosted runners into public preview&lt;/strong&gt;. Then on &lt;strong&gt;April 23&lt;/strong&gt;, GitHub announced &lt;strong&gt;immutable owner/repository IDs in the default &lt;code&gt;sub&lt;/code&gt; (subject) claim of OIDC tokens&lt;/strong&gt;, effectively closing the long-standing "name reuse attack" gap. This article analyzes all four changes together from a ManoIT production lens, with multi-cloud IAM trust policies for AWS/Azure/GCP, service container migration patterns, multi-region VNET DR design, and a checklist for the June 18 cutover — all backed by real workflow YAML.&lt;/p&gt;

&lt;h2&gt;
  
  
  1. Why the April Updates Matter — 2026 GitHub Actions Market Landscape
&lt;/h2&gt;

&lt;p&gt;According to JetBrains TeamCity's Q1 2026 CI/CD Tool Adoption report, &lt;strong&gt;GitHub Actions alone now holds a 33% market share&lt;/strong&gt;, well ahead of Jenkins (28%) and GitLab CI (19%). However, the same report repeatedly flagged that &lt;strong&gt;enterprise security compliance gaps had not narrowed&lt;/strong&gt; in proportion to the adoption surge. The April updates target this gap directly.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Feature&lt;/th&gt;
&lt;th&gt;Release Date&lt;/th&gt;
&lt;th&gt;Status&lt;/th&gt;
&lt;th&gt;Core Effect&lt;/th&gt;
&lt;th&gt;Scope&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Service Container &lt;code&gt;entrypoint&lt;/code&gt;/&lt;code&gt;command&lt;/code&gt; override&lt;/td&gt;
&lt;td&gt;2026-04-02&lt;/td&gt;
&lt;td&gt;GA&lt;/td&gt;
&lt;td&gt;No more custom wrapper images&lt;/td&gt;
&lt;td&gt;All plans (Free/Team/Enterprise)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;OIDC Repository Custom Properties&lt;/td&gt;
&lt;td&gt;2026-04-02&lt;/td&gt;
&lt;td&gt;GA (Preview 2026-03-12 → GA)&lt;/td&gt;
&lt;td&gt;ABAC trust policies enabled&lt;/td&gt;
&lt;td&gt;Org/Enterprise admin&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Azure VNET Failover&lt;/td&gt;
&lt;td&gt;2026-04-02&lt;/td&gt;
&lt;td&gt;Public Preview&lt;/td&gt;
&lt;td&gt;Workflow continuity through regional outages&lt;/td&gt;
&lt;td&gt;Enterprise/Org Azure users&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Immutable Subject Claims&lt;/td&gt;
&lt;td&gt;2026-04-23&lt;/td&gt;
&lt;td&gt;Auto-applied to new repos starting 2026-06-18&lt;/td&gt;
&lt;td&gt;Name reuse attack blocked&lt;/td&gt;
&lt;td&gt;All plans (default sub format)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Bundle these together and the message is clear: &lt;strong&gt;stop treating GitHub Actions as a single-workflow toy and start managing it as an enterprise control plane that integrates multi-cloud IAM, networking, and runtimes&lt;/strong&gt;. Service Container overrides fill the runtime reliability gap, Repository Custom Properties fill the governance gap, VNET Failover fills the availability gap, and Immutable Subject Claims fill the identity integrity gap.&lt;/p&gt;

&lt;h2&gt;
  
  
  2. Service Container Overrides — The End of Custom Wrapper Images
&lt;/h2&gt;

&lt;p&gt;Two long-standing pain points were finally resolved on April 2. First, &lt;strong&gt;you could not override an image's default &lt;code&gt;ENTRYPOINT&lt;/code&gt; from a workflow&lt;/strong&gt;. The &lt;code&gt;options: --entrypoint&lt;/code&gt; workaround constantly broke on argument quoting and escaping. Second, &lt;strong&gt;launching a container in a "test mode"&lt;/strong&gt; — booting PostgreSQL read-only, forcing Redis with an ACL file, running Kafka KRaft single-node — required maintaining an internal wrapper image, the build/push pipeline that came with it, and the related vulnerability-scan SLA.&lt;/p&gt;

&lt;p&gt;The new &lt;code&gt;services.&amp;lt;service_id&amp;gt;.entrypoint&lt;/code&gt; and &lt;code&gt;services.&amp;lt;service_id&amp;gt;.command&lt;/code&gt; keys eliminate both workarounds. You can now override the image defaults directly in your workflow YAML.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="c1"&gt;# .github/workflows/integration-test.yml&lt;/span&gt;
&lt;span class="c1"&gt;# Boot Postgres 16 with stats/log enrichment for integration tests — no wrapper image&lt;/span&gt;
&lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;integration-test&lt;/span&gt;

&lt;span class="na"&gt;on&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;pull_request&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;branches&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;main&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;

&lt;span class="na"&gt;jobs&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;api-tests&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;runs-on&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;ubuntu-latest&lt;/span&gt;
    &lt;span class="na"&gt;services&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;postgres&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;image&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;postgres:16-alpine&lt;/span&gt;
        &lt;span class="c1"&gt;# New entrypoint/command keys — GA on 2026-04-02&lt;/span&gt;
        &lt;span class="na"&gt;entrypoint&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;/usr/local/bin/docker-entrypoint.sh&lt;/span&gt;
        &lt;span class="na"&gt;command&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;&amp;gt;-&lt;/span&gt;
          &lt;span class="s"&gt;postgres&lt;/span&gt;
          &lt;span class="s"&gt;-c shared_preload_libraries=pg_stat_statements&lt;/span&gt;
          &lt;span class="s"&gt;-c log_statement=all&lt;/span&gt;
          &lt;span class="s"&gt;-c log_min_duration_statement=0&lt;/span&gt;
          &lt;span class="s"&gt;-c max_connections=200&lt;/span&gt;
        &lt;span class="na"&gt;env&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;POSTGRES_USER&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;app&lt;/span&gt;
          &lt;span class="na"&gt;POSTGRES_PASSWORD&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;app&lt;/span&gt;
          &lt;span class="na"&gt;POSTGRES_DB&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;appdb&lt;/span&gt;
        &lt;span class="na"&gt;ports&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;5432:5432"&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;
        &lt;span class="na"&gt;options&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;&amp;gt;-&lt;/span&gt;
          &lt;span class="s"&gt;--health-cmd "pg_isready -U app -d appdb"&lt;/span&gt;
          &lt;span class="s"&gt;--health-interval 5s&lt;/span&gt;
          &lt;span class="s"&gt;--health-timeout 3s&lt;/span&gt;
          &lt;span class="s"&gt;--health-retries 12&lt;/span&gt;
    &lt;span class="na"&gt;steps&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;uses&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;actions/checkout@v5&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;uses&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;actions/setup-node@v5&lt;/span&gt;
        &lt;span class="na"&gt;with&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;{&lt;/span&gt; &lt;span class="nv"&gt;node-version&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s1"&gt;'&lt;/span&gt;&lt;span class="s"&gt;22'&lt;/span&gt; &lt;span class="pi"&gt;}&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;npm ci&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;npm run test:integration&lt;/span&gt;
        &lt;span class="na"&gt;env&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;DATABASE_URL&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;postgres://app:app@localhost:5432/appdb&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;blockquote&gt;
&lt;p&gt;⚠️ &lt;strong&gt;Caution:&lt;/strong&gt; Setting &lt;code&gt;entrypoint&lt;/code&gt; to an empty string keeps the image default. To override only &lt;code&gt;command&lt;/code&gt;, &lt;strong&gt;omit the &lt;code&gt;entrypoint:&lt;/code&gt; key entirely&lt;/strong&gt;. Also, when both the legacy &lt;code&gt;options: --entrypoint=...&lt;/code&gt; and the new &lt;code&gt;entrypoint:&lt;/code&gt; key are specified together, the new key wins — &lt;code&gt;grep&lt;/code&gt; for both during migration to avoid the silent footgun.&lt;/p&gt;
&lt;/blockquote&gt;
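&lt;p&gt;The "grep for both" advice can be automated with a small scanner. The sketch below is a line-based heuristic, not a YAML parser: it flags a workflow file that contains both a legacy &lt;code&gt;--entrypoint&lt;/code&gt; flag and an &lt;code&gt;entrypoint:&lt;/code&gt; key, without checking that they belong to the same service. The function names are hypothetical.&lt;/p&gt;

```python
import re

LEGACY_FLAG = re.compile(r"--entrypoint\b")     # options: --entrypoint=...
NEW_KEY = re.compile(r"^\s*entrypoint\s*:")     # entrypoint: ...


def scan_workflow(text):
    """Return (legacy_lines, new_key_lines) as lists of 1-based line numbers."""
    legacy, new_key = [], []
    for number, line in enumerate(text.splitlines(), start=1):
        # Crude comment stripping; a '#' inside a quoted string would be cut too.
        code = line.split("#", 1)[0]
        if LEGACY_FLAG.search(code):
            legacy.append(number)
        if NEW_KEY.match(code):
            new_key.append(number)
    return legacy, new_key


def has_conflict(text):
    """True when the file mixes the legacy flag and the new key."""
    legacy, new_key = scan_workflow(text)
    return bool(legacy) and bool(new_key)
```

&lt;p&gt;Running &lt;code&gt;has_conflict&lt;/code&gt; over every file under &lt;code&gt;.github/workflows/&lt;/code&gt; gives a short list to review by hand before the wrapper images are retired.&lt;/p&gt;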

&lt;h3&gt;
  
  
  2.1 Practical Migration Pattern — Phasing Out Wrapper Images in 4 Steps
&lt;/h3&gt;

&lt;p&gt;If you already maintain a wrapper image like &lt;code&gt;internal/redis-test-bootstrap:7&lt;/code&gt;, here's a 4-step staged migration.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Step&lt;/th&gt;
&lt;th&gt;Action&lt;/th&gt;
&lt;th&gt;Risk&lt;/th&gt;
&lt;th&gt;Rollback&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;1. Equivalence check&lt;/td&gt;
&lt;td&gt;Add a parallel job that puts the wrapper's &lt;code&gt;ENTRYPOINT&lt;/code&gt;/&lt;code&gt;CMD&lt;/code&gt; directly into workflow YAML, run alongside existing job&lt;/td&gt;
&lt;td&gt;Low&lt;/td&gt;
&lt;td&gt;Drop the new job&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2. PR gate switch&lt;/td&gt;
&lt;td&gt;Make the new job the required check, demote the old one to optional&lt;/td&gt;
&lt;td&gt;Medium&lt;/td&gt;
&lt;td&gt;Restore the old required check&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;3. Canary 1 week&lt;/td&gt;
&lt;td&gt;Compare failure rate, duration, and log patterns&lt;/td&gt;
&lt;td&gt;Medium&lt;/td&gt;
&lt;td&gt;Restore the old required check&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;4. Wrapper image cleanup&lt;/td&gt;
&lt;td&gt;Mark image as deprecated in registry, archive Dockerfile repo&lt;/td&gt;
&lt;td&gt;Low&lt;/td&gt;
&lt;td&gt;Restore image (within retention window)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;In practice, the biggest savings come from &lt;strong&gt;the disappearance of the Dockerfile build/push pipeline itself&lt;/strong&gt;. In one in-house case, retiring six integration-test wrapper images eliminated &lt;strong&gt;about 240 GHCR pushes per week, each preceded by a build averaging 38 seconds&lt;/strong&gt;. Image vulnerability scan SLAs and dependency update PRs shrink in the same proportion.&lt;/p&gt;

&lt;h2&gt;
  
  
  3. OIDC Repository Custom Properties — The Beginning of ABAC Trust Policies
&lt;/h2&gt;

&lt;p&gt;Since OIDC became the standard for keyless cross-cloud authentication in 2023, its biggest limitation has been that you had to &lt;strong&gt;bake static matching rules into the &lt;code&gt;sub&lt;/code&gt; claim&lt;/strong&gt;. As the number of repos grows, string-pattern matching like &lt;code&gt;repo:org/repo:ref:refs/heads/main&lt;/code&gt; causes policies to balloon, and every new repo requires another change to cloud IAM. The &lt;strong&gt;Repository Custom Properties claim&lt;/strong&gt;, GA on April 2, replaces this with ABAC (Attribute-Based Access Control).&lt;/p&gt;

&lt;p&gt;The mechanism is simple: an Org or Enterprise admin &lt;strong&gt;checks "include this custom property in OIDC claims"&lt;/strong&gt; in the OIDC settings page. From then on, every repo with a value set for that property automatically gets a &lt;code&gt;repo_property_&amp;lt;property_name&amp;gt;&lt;/code&gt; claim injected into its OIDC tokens. Cloud trust policies can then reference this claim as a condition.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# 1) Define the custom property at org level — REST API&lt;/span&gt;
gh api &lt;span class="nt"&gt;-X&lt;/span&gt; PATCH &lt;span class="se"&gt;\&lt;/span&gt;
  /orgs/manoit/properties/schema &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-f&lt;/span&gt; &lt;span class="s1"&gt;'properties[][property_name]=environment'&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-f&lt;/span&gt; &lt;span class="s1"&gt;'properties[][value_type]=single_select'&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-f&lt;/span&gt; &lt;span class="s1"&gt;'properties[][allowed_values][]=dev'&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-f&lt;/span&gt; &lt;span class="s1"&gt;'properties[][allowed_values][]=staging'&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-f&lt;/span&gt; &lt;span class="s1"&gt;'properties[][allowed_values][]=prod'&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-F&lt;/span&gt; &lt;span class="s1"&gt;'properties[][required]=true'&lt;/span&gt;

&lt;span class="c"&gt;# 2) Set the value on a specific repo&lt;/span&gt;
gh api &lt;span class="nt"&gt;-X&lt;/span&gt; PATCH &lt;span class="se"&gt;\&lt;/span&gt;
  /repos/manoit/payment-service/properties/values &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-f&lt;/span&gt; &lt;span class="s1"&gt;'properties[][property_name]=environment'&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-f&lt;/span&gt; &lt;span class="s1"&gt;'properties[][value]=prod'&lt;/span&gt;

&lt;span class="c"&gt;# 3) Enable the property as a claim in Org OIDC settings&lt;/span&gt;
&lt;span class="c"&gt;# Settings → Actions → OIDC → check "environment"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;After this, OIDC tokens issued by that repo's workflows include &lt;code&gt;"repo_property_environment": "prod"&lt;/code&gt; as a claim. AWS, Azure, and GCP can use this claim as a trust policy condition (with one caveat: AWS &lt;code&gt;sts:AssumeRoleWithWebIdentity&lt;/code&gt; has limited support for arbitrary claim matching — see workaround below).&lt;/p&gt;
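&lt;p&gt;When debugging a trust policy it helps to confirm the claim is actually present in the token. The sketch below decodes a JWT payload and checks the &lt;code&gt;repo_property_environment&lt;/code&gt; claim named above. It deliberately skips signature verification, so it is a debugging aid only; the helper names are hypothetical, and real authorization decisions stay with the cloud provider.&lt;/p&gt;

```python
import base64
import json


def decode_claims(jwt_token):
    """Decode the payload segment of a JWT WITHOUT verifying its signature."""
    payload_segment = jwt_token.split(".")[1]
    # base64url segments in JWTs omit padding; restore it before decoding.
    padding = "=" * (-len(payload_segment) % 4)
    return json.loads(base64.urlsafe_b64decode(payload_segment + padding))


def allows_environment(claims, required):
    """ABAC-style check: does the custom property claim equal the required value?"""
    return claims.get("repo_property_environment") == required
```

&lt;p&gt;A missing claim evaluates to &lt;code&gt;False&lt;/code&gt; here, mirroring how a trust policy condition fails closed when the property was never set on the repo.&lt;/p&gt;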

&lt;h3&gt;
  
  
  3.1 Azure Federated Credential — Cleanest Arbitrary Claim Matching
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Azure Federated Identity Credential — grant access to all repos with repo_property_environment=prod&lt;/span&gt;
az identity federated-credential create &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--name&lt;/span&gt; &lt;span class="s2"&gt;"gh-actions-prod"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--identity-name&lt;/span&gt; &lt;span class="s2"&gt;"manoit-prod-identity"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--resource-group&lt;/span&gt; &lt;span class="s2"&gt;"manoit-rg"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--issuer&lt;/span&gt; &lt;span class="s2"&gt;"https://token.actions.githubusercontent.com"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--subject&lt;/span&gt; &lt;span class="s2"&gt;"repo:manoit/*:ref:refs/heads/main"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--audiences&lt;/span&gt; &lt;span class="s2"&gt;"api://AzureADTokenExchange"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--claims-matching-expression&lt;/span&gt; &lt;span class="s2"&gt;"claims['repo_property_environment'] == 'prod'"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  3.2 GCP Workload Identity Federation — Map via Attribute Condition
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# GCP Workload Identity Pool Provider — repo_property_environment condition&lt;/span&gt;
gcloud iam workload-identity-pools providers create-oidc &lt;span class="s2"&gt;"github-pool-prod"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--workload-identity-pool&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"github-pool"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--location&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"global"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--issuer-uri&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"https://token.actions.githubusercontent.com"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--attribute-mapping&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"google.subject=assertion.sub,attribute.environment=assertion.repo_property_environment,attribute.repository=assertion.repository"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--attribute-condition&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"attribute.environment == 'prod'"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  3.3 AWS IAM — Inject the Custom Property into the &lt;code&gt;sub&lt;/code&gt; Claim
&lt;/h3&gt;

&lt;p&gt;AWS &lt;code&gt;sts:AssumeRoleWithWebIdentity&lt;/code&gt; only natively pattern-matches the &lt;code&gt;sub&lt;/code&gt; claim. Fortunately, GitHub now supports &lt;code&gt;include_claim_keys&lt;/code&gt; to &lt;strong&gt;inject custom properties into the &lt;code&gt;sub&lt;/code&gt; claim itself&lt;/strong&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="c1"&gt;# .github/workflows/deploy-prod.yml&lt;/span&gt;
&lt;span class="na"&gt;permissions&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;id-token&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;write&lt;/span&gt;
  &lt;span class="na"&gt;contents&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;read&lt;/span&gt;

&lt;span class="na"&gt;jobs&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;deploy&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;runs-on&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;ubuntu-latest&lt;/span&gt;
    &lt;span class="na"&gt;steps&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;uses&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;aws-actions/configure-aws-credentials@v4&lt;/span&gt;
        &lt;span class="na"&gt;with&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;role-to-assume&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;arn:aws:iam::123456789012:role/manoit-prod-deploy&lt;/span&gt;
          &lt;span class="na"&gt;aws-region&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;ap-northeast-2&lt;/span&gt;
          &lt;span class="c1"&gt;# The sub claim template is configured on the GitHub side (include_claim_keys), not in this step&lt;/span&gt;
          &lt;span class="na"&gt;role-session-name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;gh-actions-${{ github.run_id }}&lt;/span&gt;
          &lt;span class="c1"&gt;# Audience of the requested OIDC token; must match the aud condition in the trust policy&lt;/span&gt;
          &lt;span class="na"&gt;audience&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;sts.amazonaws.com&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"Version"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"2012-10-17"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"Statement"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"Effect"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Allow"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"Principal"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"Federated"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"arn:aws:iam::123456789012:oidc-provider/token.actions.githubusercontent.com"&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"Action"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"sts:AssumeRoleWithWebIdentity"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"Condition"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"StringLike"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"token.actions.githubusercontent.com:sub"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="s2"&gt;"repo:manoit/*:environment:prod:property:environment=prod"&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"StringEquals"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"token.actions.githubusercontent.com:aud"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"sts.amazonaws.com"&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}]&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;blockquote&gt;
&lt;p&gt;⚠️ &lt;strong&gt;Caution:&lt;/strong&gt; When using &lt;code&gt;StringLike&lt;/code&gt; wildcards in AWS trust policies, failing to anchor the prefix risks privilege explosion. &lt;strong&gt;Always anchor the org prefix&lt;/strong&gt; with &lt;code&gt;repo:manoit/*&lt;/code&gt;, and split environment labels (&lt;code&gt;prod&lt;/code&gt;/&lt;code&gt;staging&lt;/code&gt;) into separate IAM Roles. Also, injecting custom fields into the &lt;code&gt;sub&lt;/code&gt; claim makes tokens larger — review whether a single ABAC ledger policy could satisfy the same intent more cheaply.&lt;/p&gt;
&lt;/blockquote&gt;
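&lt;p&gt;To see why the org prefix matters, the following sketch approximates &lt;code&gt;StringLike&lt;/code&gt; matching (&lt;code&gt;*&lt;/code&gt; for any run of characters, &lt;code&gt;?&lt;/code&gt; for exactly one) and compares an anchored pattern with an unanchored one. It is an illustrative re-implementation, not AWS code, and the example subjects use the standard &lt;code&gt;repo:ORG/REPO:environment:NAME&lt;/code&gt; form rather than the customized one shown above.&lt;/p&gt;

```python
import re


def string_like(pattern, value):
    """Approximate IAM StringLike: '*' matches any run of characters, '?' exactly one."""
    parts = []
    for ch in pattern:
        if ch == "*":
            parts.append(".*")
        elif ch == "?":
            parts.append(".")
        else:
            parts.append(re.escape(ch))
    return re.fullmatch("".join(parts), value, flags=re.DOTALL) is not None


ANCHORED = "repo:manoit/*:environment:prod"
UNANCHORED = "repo:*:environment:prod"

if __name__ == "__main__":
    foreign = "repo:other-org/payment-service:environment:prod"
    print("anchored matches foreign org:", string_like(ANCHORED, foreign))
    print("unanchored matches foreign org:", string_like(UNANCHORED, foreign))
```

&lt;p&gt;The unanchored pattern accepts a token from any organization that happens to use an environment named &lt;code&gt;prod&lt;/code&gt;, which is exactly the privilege explosion the caution describes.&lt;/p&gt;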

&lt;h2&gt;
  
  
  4. Azure VNET Failover — Multi-Region DR at the Workflow Layer
&lt;/h2&gt;

&lt;p&gt;The &lt;strong&gt;Azure private networking&lt;/strong&gt; feature for GitHub-hosted runners (GA in November 2023) lets workflows reach resources inside corporate Azure VNETs (e.g., Private Endpoints, ExpressRoute-connected on-prem) without IP whitelisting. The catch was single-subnet dependency: if that region (e.g., Korea Central) went down, every workflow company-wide stopped — a hard SPOF.&lt;/p&gt;

&lt;p&gt;The &lt;strong&gt;VNET Failover&lt;/strong&gt; preview, released April 2, removes this SPOF. The key differences:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Item&lt;/th&gt;
&lt;th&gt;Before (Single Subnet)&lt;/th&gt;
&lt;th&gt;April Update (VNET Failover)&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Configuration unit&lt;/td&gt;
&lt;td&gt;1 Primary subnet&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Primary + Secondary subnet&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Region constraint&lt;/td&gt;
&lt;td&gt;Same region recommended&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Secondary can be in a different Azure region&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Behavior on outage&lt;/td&gt;
&lt;td&gt;Workflow fails&lt;/td&gt;
&lt;td&gt;Routed to secondary subnet&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Switchover&lt;/td&gt;
&lt;td&gt;Admin manual reconfig (minutes)&lt;/td&gt;
&lt;td&gt;
&lt;strong&gt;Instant switch via UI/REST API&lt;/strong&gt; (manual in preview)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Load balancing&lt;/td&gt;
&lt;td&gt;None&lt;/td&gt;
&lt;td&gt;None (currently active/passive)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Register failover network — REST API&lt;/span&gt;
&lt;span class="c"&gt;# Prerequisite: Primary VNET/subnet and Secondary VNET/subnet must exist in each region&lt;/span&gt;

&lt;span class="nv"&gt;ORG&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;manoit
&lt;span class="nv"&gt;SETTINGS_ID&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;12345

curl &lt;span class="nt"&gt;-X&lt;/span&gt; POST &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s2"&gt;"Authorization: Bearer &lt;/span&gt;&lt;span class="nv"&gt;$GITHUB_TOKEN&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s2"&gt;"Accept: application/vnd.github+json"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  https://api.github.com/orgs/&lt;span class="nv"&gt;$ORG&lt;/span&gt;/settings/network-configurations/&lt;span class="nv"&gt;$SETTINGS_ID&lt;/span&gt;/failover &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="s1"&gt;'{
    "name": "kr-south-failover",
    "subnet_resource_id": "/subscriptions/.../resourceGroups/manoit-dr-rg/providers/Microsoft.Network/virtualNetworks/manoit-dr-vnet/subnets/runners",
    "azure_region": "koreasouth"
  }'&lt;/span&gt;

&lt;span class="c"&gt;# Trigger failover during an incident (manual in preview)&lt;/span&gt;
curl &lt;span class="nt"&gt;-X&lt;/span&gt; PATCH &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s2"&gt;"Authorization: Bearer &lt;/span&gt;&lt;span class="nv"&gt;$GITHUB_TOKEN&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  https://api.github.com/orgs/&lt;span class="nv"&gt;$ORG&lt;/span&gt;/settings/network-configurations/&lt;span class="nv"&gt;$SETTINGS_ID&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="s1"&gt;'{ "active_subnet": "secondary" }'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  4.1 ManoIT-Recommended Architecture — Korea Central + Korea South Active-Passive
&lt;/h3&gt;

&lt;p&gt;Based on ManoIT's experience with Korean enterprise customers, here's the recommended pattern.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Component&lt;/th&gt;
&lt;th&gt;Primary (Korea Central)&lt;/th&gt;
&lt;th&gt;Secondary (Korea South)&lt;/th&gt;
&lt;th&gt;Notes&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;VNET CIDR&lt;/td&gt;
&lt;td&gt;10.20.0.0/16&lt;/td&gt;
&lt;td&gt;10.21.0.0/16&lt;/td&gt;
&lt;td&gt;Must not overlap&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Runner subnet&lt;/td&gt;
&lt;td&gt;10.20.10.0/24 (recommend /24, 256 IPs)&lt;/td&gt;
&lt;td&gt;10.21.10.0/24&lt;/td&gt;
&lt;td&gt;Max concurrent runners ≈ subnet IPs - 5 (Azure reserves five addresses in every subnet)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Private Endpoint (ACR)&lt;/td&gt;
&lt;td&gt;kr-central acr-pe&lt;/td&gt;
&lt;td&gt;kr-south acr-pe&lt;/td&gt;
&lt;td&gt;Geo-replicated ACR&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;VNET Peering&lt;/td&gt;
&lt;td&gt;Bidirectional + Allow forwarded traffic (both regions)&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;td&gt;For post-failover data sync&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Key Vault&lt;/td&gt;
&lt;td&gt;Premium SKU + soft-delete&lt;/td&gt;
&lt;td&gt;Geo-replicated&lt;/td&gt;
&lt;td&gt;Federated credential shared&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Monitoring&lt;/td&gt;
&lt;td&gt;Azure Monitor + Action Group → PagerDuty (both regions)&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;td&gt;RTO target ≤ 15 minutes&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;
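&lt;p&gt;The addressing rows in the table can be sanity-checked before any Azure resources are created. The following is a minimal sketch using only Python's standard &lt;code&gt;ipaddress&lt;/code&gt; module; the helper names &lt;code&gt;max_runners&lt;/code&gt; and &lt;code&gt;vnets_overlap&lt;/code&gt; are illustrative, and the runner ceiling is the rough "IPs - 5" estimate from the table, not a limit published by GitHub.&lt;/p&gt;

```python
import ipaddress

# Azure reserves five addresses in every subnet (network, gateway,
# two for Azure DNS, and broadcast), so they are never assignable.
AZURE_RESERVED_PER_SUBNET = 5


def max_runners(subnet_cidr):
    """Rough ceiling on concurrent runners: subnet size minus reserved IPs."""
    subnet = ipaddress.ip_network(subnet_cidr)
    return max(subnet.num_addresses - AZURE_RESERVED_PER_SUBNET, 0)


def vnets_overlap(cidr_a, cidr_b):
    """True when the two address spaces share at least one address."""
    return ipaddress.ip_network(cidr_a).overlaps(ipaddress.ip_network(cidr_b))


primary_vnet, secondary_vnet = "10.20.0.0/16", "10.21.0.0/16"
primary_runners, secondary_runners = "10.20.10.0/24", "10.21.10.0/24"

print(vnets_overlap(primary_vnet, secondary_vnet))  # False
print(max_runners(primary_runners))                 # 251
print(max_runners(secondary_runners))               # 251
```

&lt;p&gt;Running a check like this in CI against your infrastructure-as-code variables catches an overlapping secondary CIDR before peering is attempted.&lt;/p&gt;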

&lt;blockquote&gt;
&lt;p&gt;⚠️ &lt;strong&gt;Caution:&lt;/strong&gt; &lt;strong&gt;Automatic failover is not active in preview&lt;/strong&gt;. An admin must trigger it through the UI or the REST API, so to meet your RTO you need to build the off-hours path from alert to on-call automation up front (e.g., PagerDuty incident → ChatOps bot → REST call). GitHub plans to add automatic triggers at GA, but you will still need to validate that its "regional outage detection heuristic" matches your in-house SLA.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  5. Immutable Subject Claims — The End of the Name Reuse Attack
&lt;/h2&gt;

&lt;p&gt;The &lt;strong&gt;Immutable Subject Claims&lt;/strong&gt; announcement on April 23 has the largest security impact of the four April updates. The legacy default &lt;code&gt;sub&lt;/code&gt; claim looked like &lt;code&gt;repo:octocat/my-repo:ref:refs/heads/main&lt;/code&gt;, composed entirely of &lt;strong&gt;mutable names&lt;/strong&gt;. If the name of a deleted or renamed organization or repo is registered again, &lt;strong&gt;the new owner can mint tokens with the identical &lt;code&gt;sub&lt;/code&gt; claim&lt;/strong&gt;. This gap was tracked as an open issue (GitHub Roadmap #1230) for nearly a year and has been reported in real attack scenarios.&lt;/p&gt;

&lt;p&gt;The new format embeds &lt;strong&gt;immutable &lt;code&gt;owner_id&lt;/code&gt; and &lt;code&gt;repository_id&lt;/code&gt;&lt;/strong&gt; into the &lt;code&gt;sub&lt;/code&gt; claim.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Aspect&lt;/th&gt;
&lt;th&gt;Legacy &lt;code&gt;sub&lt;/code&gt;
&lt;/th&gt;
&lt;th&gt;New &lt;code&gt;sub&lt;/code&gt; (new repos after 2026-06-18)&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Format&lt;/td&gt;
&lt;td&gt;&lt;code&gt;repo:octocat/my-repo:ref:refs/heads/main&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;repo:octocat-123456/my-repo-456789:ref:refs/heads/main&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Key difference&lt;/td&gt;
&lt;td&gt;Same &lt;code&gt;sub&lt;/code&gt; reissuable on name reuse&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;&lt;code&gt;owner_id&lt;/code&gt;/&lt;code&gt;repository_id&lt;/code&gt; are permanent identifiers&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Name reuse attack&lt;/td&gt;
&lt;td&gt;Possible&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Blocked&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Rollout&lt;/td&gt;
&lt;td&gt;All repos (current)&lt;/td&gt;
&lt;td&gt;Auto for repos created after 2026-06-18; existing repos opt-in&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The practical impact splits two ways.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;(A) Newly created repos&lt;/strong&gt; — automatically get the new format. If your cloud trust policies only pattern-match the short legacy &lt;code&gt;sub&lt;/code&gt;, token validation will fail. Extend those policies to match both formats before June 18.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;(B) Existing repos&lt;/strong&gt; — no auto migration, but opt-in is available, and we strongly recommend completing migration within nine months. Roll it out in stages so policies don't break.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"Version"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"2012-10-17"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"Statement"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"Effect"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Allow"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"Principal"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"Federated"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"arn:aws:iam::123456789012:oidc-provider/token.actions.githubusercontent.com"&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"Action"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"sts:AssumeRoleWithWebIdentity"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"Condition"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"StringLike"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"token.actions.githubusercontent.com:sub"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="s2"&gt;"repo:manoit/*:ref:refs/heads/main"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="s2"&gt;"repo:manoit-987654/*:ref:refs/heads/main"&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"StringEquals"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"token.actions.githubusercontent.com:aud"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"sts.amazonaws.com"&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}]&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
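&lt;p&gt;Before rolling out a dual-pattern policy like the one above, it helps to dry-run sample &lt;code&gt;sub&lt;/code&gt; values against the patterns locally. The sketch below approximates IAM's &lt;code&gt;StringLike&lt;/code&gt; matching with Python's &lt;code&gt;fnmatchcase&lt;/code&gt;; this is a stand-in for illustration, not AWS's evaluator, and the sample claims (including the IDs &lt;code&gt;987654&lt;/code&gt;, &lt;code&gt;111111&lt;/code&gt;, and &lt;code&gt;13579024&lt;/code&gt;) are made-up values in the format described in this section.&lt;/p&gt;

```python
from fnmatch import fnmatchcase

# The two StringLike patterns from the trust policy above.
TRUST_PATTERNS = [
    "repo:manoit/*:ref:refs/heads/main",         # legacy, name-only format
    "repo:manoit-987654/*:ref:refs/heads/main",  # new format with owner ID
]


def sub_allowed(sub, patterns=TRUST_PATTERNS):
    """Approximate StringLike: '*' matches any run of characters.

    fnmatchcase also interprets [...] sets, which IAM does not, so keep
    square brackets out of patterns checked with this helper.
    """
    return any(fnmatchcase(sub, pattern) for pattern in patterns)


samples = {
    "legacy main": "repo:manoit/payment-service:ref:refs/heads/main",
    "new main": "repo:manoit-987654/payment-service-13579024:ref:refs/heads/main",
    "other owner ID": "repo:manoit-111111/payment-service-13579024:ref:refs/heads/main",
    "feature branch": "repo:manoit/payment-service:ref:refs/heads/feature-x",
}

for label, sub in samples.items():
    print(label, sub_allowed(sub))
```

&lt;p&gt;The third sample is the interesting one: a token minted under a re-registered &lt;code&gt;manoit&lt;/code&gt; name would carry a different owner ID and fail the new-format pattern, which is the protection this section describes. The legacy pattern still accepts name-only tokens, so it should be removed once every repo has migrated.&lt;/p&gt;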



&lt;p&gt;In parallel, adopt &lt;strong&gt;the operating rule of looking up your &lt;code&gt;org_id&lt;/code&gt; once and baking it into policies&lt;/strong&gt;. Then policies remain valid even if the org is renamed later.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Look up your org_id&lt;/span&gt;
gh api /orgs/manoit &lt;span class="nt"&gt;--jq&lt;/span&gt; &lt;span class="s1"&gt;'.id'&lt;/span&gt;   &lt;span class="c"&gt;# e.g. 987654&lt;/span&gt;

&lt;span class="c"&gt;# Pre-collect repo IDs to also pin them in policy for safety&lt;/span&gt;
gh api /repos/manoit/payment-service &lt;span class="nt"&gt;--jq&lt;/span&gt; &lt;span class="s1"&gt;'.id'&lt;/span&gt;  &lt;span class="c"&gt;# e.g. 13579024&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
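&lt;p&gt;Once the IDs are collected, the exact per-repo &lt;code&gt;sub&lt;/code&gt; strings can be generated instead of hand-typed. This is a small sketch that assumes the new claim follows the &lt;code&gt;repo:OWNER-OWNER_ID/REPO-REPO_ID:ref:REF&lt;/code&gt; layout shown in the comparison table; the function names are hypothetical, and the IDs are the example values from the commands above.&lt;/p&gt;

```python
def legacy_sub(owner, repo, ref):
    """Name-only subject claim used by existing repos."""
    return f"repo:{owner}/{repo}:ref:{ref}"


def immutable_sub(owner, owner_id, repo, repo_id, ref):
    """Subject claim with IDs appended, per the layout assumed in this article."""
    return f"repo:{owner}-{owner_id}/{repo}-{repo_id}:ref:{ref}"


def pinned_patterns(owner, owner_id, repo, repo_id, ref="refs/heads/main"):
    """Both forms for one repo, ready to paste into a trust policy condition."""
    return [
        legacy_sub(owner, repo, ref),
        immutable_sub(owner, owner_id, repo, repo_id, ref),
    ]


for pattern in pinned_patterns("manoit", 987654, "payment-service", 13579024):
    print(pattern)
# repo:manoit/payment-service:ref:refs/heads/main
# repo:manoit-987654/payment-service-13579024:ref:refs/heads/main
```

&lt;p&gt;Pinning the repo ID as well as the owner ID trades the convenience of a wildcard for a policy that cannot be satisfied by any other repo in the org.&lt;/p&gt;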



&lt;h2&gt;
  
  
  6. Unified Workflow — A Production Pipeline Using All Four Updates
&lt;/h2&gt;

&lt;p&gt;Finally, here's a single workflow that combines all four April updates. It mirrors how ManoIT's &lt;code&gt;payment-service&lt;/code&gt; flows from PR to main merge — Service Container overrides, OIDC Custom Properties, VNET-failover-aware runner selection, and Immutable &lt;code&gt;sub&lt;/code&gt; compatibility, all together.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;payment-service-cicd&lt;/span&gt;

&lt;span class="na"&gt;on&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;pull_request&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;branches&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;main&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;
  &lt;span class="na"&gt;push&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;branches&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;main&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;

&lt;span class="na"&gt;permissions&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;id-token&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;write&lt;/span&gt;   &lt;span class="c1"&gt;# OIDC token requests&lt;/span&gt;
  &lt;span class="na"&gt;contents&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;read&lt;/span&gt;
  &lt;span class="na"&gt;pull-requests&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;write&lt;/span&gt;

&lt;span class="na"&gt;env&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;AWS_REGION&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;ap-northeast-2&lt;/span&gt;

&lt;span class="na"&gt;jobs&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;test&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Integration Test (ARM64)&lt;/span&gt;
    &lt;span class="na"&gt;runs-on&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;ubuntu-latest-arm&lt;/span&gt;
    &lt;span class="na"&gt;services&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;postgres&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;image&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;postgres:16-alpine&lt;/span&gt;
        &lt;span class="na"&gt;entrypoint&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;/usr/local/bin/docker-entrypoint.sh&lt;/span&gt;
        &lt;span class="na"&gt;command&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;&amp;gt;-&lt;/span&gt;
          &lt;span class="s"&gt;postgres&lt;/span&gt;
          &lt;span class="s"&gt;-c shared_preload_libraries=pg_stat_statements&lt;/span&gt;
          &lt;span class="s"&gt;-c log_statement=all&lt;/span&gt;
          &lt;span class="s"&gt;-c max_connections=200&lt;/span&gt;
        &lt;span class="na"&gt;env&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;{&lt;/span&gt; &lt;span class="nv"&gt;POSTGRES_USER&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="nv"&gt;app&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;POSTGRES_PASSWORD&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="nv"&gt;app&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;POSTGRES_DB&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="nv"&gt;appdb&lt;/span&gt; &lt;span class="pi"&gt;}&lt;/span&gt;
        &lt;span class="na"&gt;ports&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;5432:5432"&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;
        &lt;span class="na"&gt;options&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;&amp;gt;-&lt;/span&gt;
          &lt;span class="s"&gt;--health-cmd "pg_isready -U app -d appdb" --health-interval 5s&lt;/span&gt;
          &lt;span class="s"&gt;--health-timeout 3s --health-retries 12&lt;/span&gt;
      &lt;span class="na"&gt;redis&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;image&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;redis:7-alpine&lt;/span&gt;
        &lt;span class="c1"&gt;# Force ACL-enforced boot mode — previously required a wrapper image&lt;/span&gt;
        &lt;span class="na"&gt;command&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;&amp;gt;-&lt;/span&gt;
          &lt;span class="s"&gt;redis-server&lt;/span&gt;
          &lt;span class="s"&gt;--aclfile /etc/redis/users.acl&lt;/span&gt;
          &lt;span class="s"&gt;--appendonly yes&lt;/span&gt;
        &lt;span class="na"&gt;ports&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;6379:6379"&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;
    &lt;span class="na"&gt;steps&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;uses&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;actions/checkout@v5&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;uses&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;actions/setup-node@v5&lt;/span&gt;
        &lt;span class="na"&gt;with&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;{&lt;/span&gt; &lt;span class="nv"&gt;node-version&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s1"&gt;'&lt;/span&gt;&lt;span class="s"&gt;22'&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;cache&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s1"&gt;'&lt;/span&gt;&lt;span class="s"&gt;npm'&lt;/span&gt; &lt;span class="pi"&gt;}&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;npm ci&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;npm run test:integration&lt;/span&gt;

  &lt;span class="na"&gt;deploy-prod&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Deploy to Prod (Azure Private Network)&lt;/span&gt;
    &lt;span class="na"&gt;needs&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;test&lt;/span&gt;
    &lt;span class="na"&gt;if&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;github.ref == 'refs/heads/main'&lt;/span&gt;
    &lt;span class="na"&gt;runs-on&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;ubuntu-latest-arm-azure-private&lt;/span&gt;  &lt;span class="c1"&gt;# VNET-failover-enabled hosted runner&lt;/span&gt;
    &lt;span class="na"&gt;environment&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;prod&lt;/span&gt;
    &lt;span class="na"&gt;steps&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;uses&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;actions/checkout@v5&lt;/span&gt;

      &lt;span class="c1"&gt;# 1) Azure Federated Identity — auth via repo_property_environment=prod claim&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;uses&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;azure/login@v2&lt;/span&gt;
        &lt;span class="na"&gt;with&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;client-id&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;${{ vars.AZURE_PROD_CLIENT_ID }}&lt;/span&gt;
          &lt;span class="na"&gt;tenant-id&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;${{ vars.AZURE_TENANT_ID }}&lt;/span&gt;
          &lt;span class="na"&gt;subscription-id&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;${{ vars.AZURE_PROD_SUBSCRIPTION_ID }}&lt;/span&gt;

      &lt;span class="c1"&gt;# 2) Helm deploy (talks to AKS Control Plane via Private Endpoint)&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;uses&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;azure/aks-set-context@v4&lt;/span&gt;
        &lt;span class="na"&gt;with&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;resource-group&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;manoit-prod-rg&lt;/span&gt;
          &lt;span class="na"&gt;cluster-name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;manoit-aks-prod&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;|&lt;/span&gt;
          &lt;span class="s"&gt;helm upgrade --install payment ./chart \&lt;/span&gt;
            &lt;span class="s"&gt;--namespace payment --create-namespace \&lt;/span&gt;
            &lt;span class="s"&gt;--values ./chart/values.prod.yaml \&lt;/span&gt;
            &lt;span class="s"&gt;--set image.tag=${{ github.sha }} \&lt;/span&gt;
            &lt;span class="s"&gt;--atomic --timeout 5m&lt;/span&gt;

      &lt;span class="c1"&gt;# 3) AWS S3 backup of payment logs (immutable sub + ABAC simultaneously satisfied)&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;uses&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;aws-actions/configure-aws-credentials@v4&lt;/span&gt;
        &lt;span class="na"&gt;with&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;role-to-assume&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;arn:aws:iam::123456789012:role/manoit-prod-deploy&lt;/span&gt;
          &lt;span class="na"&gt;aws-region&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;${{ env.AWS_REGION }}&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;aws s3 sync ./logs s3://manoit-prod-logs/payment/&lt;/span&gt;

      &lt;span class="c1"&gt;# 4) Comment back on the PR with deployment result (observability)&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;uses&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;actions/github-script@v7&lt;/span&gt;
        &lt;span class="na"&gt;with&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;script&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;|&lt;/span&gt;
            &lt;span class="s"&gt;const sha = context.sha.substring(0, 7);&lt;/span&gt;
            &lt;span class="s"&gt;github.rest.issues.createComment({&lt;/span&gt;
              &lt;span class="s"&gt;owner: context.repo.owner, repo: context.repo.repo,&lt;/span&gt;
              &lt;span class="s"&gt;issue_number: context.payload.pull_request?.number ?? 0,&lt;/span&gt;
              &lt;span class="s"&gt;body: `Deployed payment-service@${sha} to prod (AKS + S3) via VNET-routed runner.`&lt;/span&gt;
            &lt;span class="s"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  7. ManoIT Adoption Checklist — Before the June 18 Cutover
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Item&lt;/th&gt;
&lt;th&gt;Action&lt;/th&gt;
&lt;th&gt;Effort&lt;/th&gt;
&lt;th&gt;Done When&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Service Container wrapper image inventory&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;grep -R "options:.*entrypoint" .github/&lt;/code&gt; across all repos&lt;/td&gt;
&lt;td&gt;0.5 day&lt;/td&gt;
&lt;td&gt;List enumerated + prioritized&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Phased wrapper image removal&lt;/td&gt;
&lt;td&gt;Apply the 4-step migration from Section 2.1&lt;/td&gt;
&lt;td&gt;1 week per image&lt;/td&gt;
&lt;td&gt;Required check switched + 1-week canary clean&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Repository Custom Properties schema&lt;/td&gt;
&lt;td&gt;Define &lt;code&gt;environment&lt;/code&gt;, &lt;code&gt;data_classification&lt;/code&gt;, &lt;code&gt;cost_center&lt;/code&gt; first&lt;/td&gt;
&lt;td&gt;1 day&lt;/td&gt;
&lt;td&gt;Org schema registered + ≥95% repos populated&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Enable OIDC claim inclusion&lt;/td&gt;
&lt;td&gt;Settings → Actions → OIDC → check entries&lt;/td&gt;
&lt;td&gt;0.5 hour&lt;/td&gt;
&lt;td&gt;Sample workflow token decoded + validated&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Azure/AWS/GCP trust policy ABAC migration&lt;/td&gt;
&lt;td&gt;Phased rollout per Section 3 examples&lt;/td&gt;
&lt;td&gt;2 weeks&lt;/td&gt;
&lt;td&gt;Per-repo IAM Role count reduced ≥50%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Azure VNET Failover setup&lt;/td&gt;
&lt;td&gt;Korea Central + Korea South dual subnet + peering&lt;/td&gt;
&lt;td&gt;1 week&lt;/td&gt;
&lt;td&gt;Manual failover sim with RTO ≤ 15 min&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Immutable Subject Claims compatibility check&lt;/td&gt;
&lt;td&gt;Extend policies to allow both old and new &lt;code&gt;sub&lt;/code&gt; patterns&lt;/td&gt;
&lt;td&gt;3 days&lt;/td&gt;
&lt;td&gt;Token validation passes on a repo created post-2026-06-18&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;On-call runbook update&lt;/td&gt;
&lt;td&gt;Add VNET failover and OIDC policy change scenarios&lt;/td&gt;
&lt;td&gt;1 day&lt;/td&gt;
&lt;td&gt;Quarterly drill executed once&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h2&gt;
  
  
  8. Conclusion — What the April Updates Predict for the Next 12 Months
&lt;/h2&gt;

&lt;p&gt;The four changes in April 2026 are no coincidence. &lt;strong&gt;GitHub is redefining GitHub Actions from a "workflow automation tool" into an "enterprise IAM, network, and runtime control plane,"&lt;/strong&gt; and the April updates are the unmistakable signal of that redefinition. Service Container overrides end the burden of maintaining in-house wrapper images, Repository Custom Properties bind governance metadata tightly to tokens, VNET Failover lifts availability SLAs from the 99.9% range toward 99.95%, and Immutable Subject Claims close the oldest OIDC trust gap.&lt;/p&gt;

&lt;p&gt;In H2 2026, expect Native Egress Firewall (announced in the roadmap earlier this year), expansion of Action Allowlisting to Free/Team plans, and broader adoption of the OIDC &lt;code&gt;check_run_id&lt;/code&gt; claim. ManoIT strongly recommends absorbing all of the April 2026 updates before the June 18 cutover. For adoption consulting, please reach out at &lt;a href="https://www.manoit.co.kr" rel="noopener noreferrer"&gt;www.manoit.co.kr&lt;/a&gt;.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;This article was authored and reviewed by ManoIT's automated blog pipeline running on Anthropic Claude Opus 4.6, with every code snippet cross-checked against official GitHub Actions, GitHub Docs, Azure, AWS, and GCP references as of 2026-04-28. Always validate in your own environment before applying to production.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Originally published in Korean at &lt;a href="https://www.manoit.co.kr" rel="noopener noreferrer"&gt;manoit.co.kr&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Originally published at &lt;a href="http://www.manoit.co.kr/forum/view/1462833" rel="noopener noreferrer"&gt;ManoIT Tech Blog&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>devops</category>
      <category>github</category>
      <category>security</category>
      <category>automation</category>
    </item>
    <item>
      <title>Meta Llama 4 Scout &amp; Maverick — The Complete Production Guide: 17B Active MoE, 10M Context, iRoPE, and the vLLM/Ollama Deployment Playbook</title>
      <dc:creator>daniel jeong</dc:creator>
      <pubDate>Mon, 27 Apr 2026 00:19:03 +0000</pubDate>
      <link>https://dev.to/x4nent/meta-llama-4-scout-maverick-the-complete-production-guide-17b-active-moe-10m-context-irope-2ca4</link>
      <guid>https://dev.to/x4nent/meta-llama-4-scout-maverick-the-complete-production-guide-17b-active-moe-10m-context-irope-2ca4</guid>
      <description>&lt;h1&gt;
  
  
  Meta Llama 4 Scout &amp;amp; Maverick — The Complete Production Guide: 17B Active MoE, 10M Context, iRoPE, and the vLLM/Ollama Deployment Playbook
&lt;/h1&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Published on the ManoIT Tech Blog (Korean original).&lt;/strong&gt; On April 5, 2026, Meta released &lt;strong&gt;Llama 4 Scout&lt;/strong&gt; and &lt;strong&gt;Llama 4 Maverick&lt;/strong&gt; — the first open-weights models in the Llama family to use a &lt;strong&gt;Mixture-of-Experts (MoE)&lt;/strong&gt; architecture, the first to be &lt;strong&gt;natively multimodal&lt;/strong&gt;, and — with Scout — the first to deliver a real &lt;strong&gt;10M-token context window&lt;/strong&gt; that runs on a single H100 GPU. This post unpacks the architecture, benchmarks, license caveats, deployment options, Llama Guard 4 safety stack, and the Behemoth delay — from a production-deployment perspective grounded in what ManoIT recommends to customers.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  1. The Llama 4 family at a glance — Scout / Maverick / Behemoth
&lt;/h2&gt;

&lt;p&gt;Llama 4 was always planned as a &lt;strong&gt;three-model family&lt;/strong&gt;. April 5 brought two models; the largest one, Behemoth, is still in private training. Knowing where each model fits is the first decision in any adoption review.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Model&lt;/th&gt;
&lt;th&gt;Active / Total params&lt;/th&gt;
&lt;th&gt;Experts&lt;/th&gt;
&lt;th&gt;Context&lt;/th&gt;
&lt;th&gt;Primary use&lt;/th&gt;
&lt;th&gt;Status&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Llama 4 Scout&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;17B / 109B&lt;/td&gt;
&lt;td&gt;16&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;10M tokens&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Long-document analysis, code-base RAG, video summarization&lt;/td&gt;
&lt;td&gt;Released 2026-04-05 (Hugging Face, Ollama)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Llama 4 Maverick&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;17B / 400B&lt;/td&gt;
&lt;td&gt;128&lt;/td&gt;
&lt;td&gt;1M tokens&lt;/td&gt;
&lt;td&gt;Multimodal assistant, GPT-4o replacement&lt;/td&gt;
&lt;td&gt;Released 2026-04-05 (Hugging Face, Ollama, watsonx)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Llama 4 Behemoth&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;288B / ~2T&lt;/td&gt;
&lt;td&gt;16&lt;/td&gt;
&lt;td&gt;private&lt;/td&gt;
&lt;td&gt;Teacher model for Scout/Maverick, STEM-heavy&lt;/td&gt;
&lt;td&gt;
&lt;strong&gt;Delayed to fall 2026&lt;/strong&gt; (internal evaluation)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The pattern is &lt;strong&gt;"same 17B active, different expert pool."&lt;/strong&gt; Scout routes each token to 1-of-16 experts; Maverick routes to 1-of-128. Equal active parameters mean roughly the same per-token compute cost, while the size of the expert pool sets both the model's expressive ceiling and its memory footprint. That's why Maverick, with the larger pool, beats GPT-4o on multimodal aggregates, and why Scout, with the smaller pool, fits on a single H100 with Int4 quantization.&lt;/p&gt;
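&lt;p&gt;The single-H100 claim for Scout can be checked with back-of-the-envelope arithmetic on the parameter counts in the table. The sketch below counts weight storage only; it assumes 0.5 bytes per parameter for Int4, 2 bytes for BF16, and an 80 GB H100, and it ignores KV cache, activations, and quantization metadata, all of which add to the real footprint.&lt;/p&gt;

```python
# Bytes needed to store one parameter at each precision (weights only).
BYTES_PER_PARAM = {"int4": 0.5, "bf16": 2.0}

H100_MEMORY_GB = 80  # assumption: the 80 GB H100 variant


def weight_footprint_gb(total_params_billions, precision):
    """Weight storage in GB: billions of params times bytes per param."""
    return total_params_billions * BYTES_PER_PARAM[precision]


def headroom_gb(total_params_billions, precision, gpu_gb=H100_MEMORY_GB):
    """Memory left on one GPU after loading weights; negative means no fit."""
    return gpu_gb - weight_footprint_gb(total_params_billions, precision)


print(weight_footprint_gb(109, "int4"))  # 54.5   Scout, Int4
print(headroom_gb(109, "int4"))          # 25.5   fits, with room to spare
print(weight_footprint_gb(109, "bf16"))  # 218.0  Scout, BF16: multi-GPU
print(headroom_gb(400, "int4"))          # -120.0 Maverick, Int4: multi-GPU
```

&lt;p&gt;Note that all experts must be resident in memory even though only 17B parameters are active per token, so the total count, not the active count, decides the hardware.&lt;/p&gt;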

&lt;h2&gt;
  
  
  2. iRoPE architecture — the secret behind a 10M context
&lt;/h2&gt;

&lt;p&gt;Scout's 10M-token reach comes from &lt;strong&gt;iRoPE (Interleaved Rotary Position Embeddings)&lt;/strong&gt;. Traditional transformers apply a position encoding (RoPE) at every attention layer — but as the context grows, the position signal becomes noise and length generalization collapses.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Traditional limitation&lt;/th&gt;
&lt;th&gt;iRoPE answer&lt;/th&gt;
&lt;th&gt;Operational effect&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;RoPE on every layer → can't generalize beyond the 8K to 128K training length&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Interleave a NoPE (no position encoding) layer every 4 layers&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Train at 256K, extrapolate at inference to 10M&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Long-context noise blurs token-token relationships&lt;/td&gt;
&lt;td&gt;NoPE layers do global attention over the full causal mask&lt;/td&gt;
&lt;td&gt;Near-perfect needle-in-haystack at 10M&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;RoPE index overflow at long contexts&lt;/td&gt;
&lt;td&gt;NoPE layers remove absolute positional dependency&lt;/td&gt;
&lt;td&gt;Whole-video and whole-codebase indexing become viable&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The idea is simple: &lt;strong&gt;mix layers that need positional encoding with ones that don't.&lt;/strong&gt; RoPE layers learn local order; NoPE layers freely connect distant tokens by meaning. The result is reliable generalization to context lengths the model never saw at training time — which is why Scout retrieves accurately at 10M tokens.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;┌─────────────── Llama 4 Scout attention stack (concept) ───────────┐
│                                                                    │
│   Layer 1  : RoPE Attention   (learns local order)                 │
│   Layer 2  : RoPE Attention                                        │
│   Layer 3  : RoPE Attention                                        │
│   Layer 4  : NoPE Attention   ← global causal mask, position-free  │
│   Layer 5  : RoPE Attention                                        │
│   Layer 6  : RoPE Attention                                        │
│   Layer 7  : RoPE Attention                                        │
│   Layer 8  : NoPE Attention   ← every 4 layers                     │
│   ...                                                              │
│                                                                    │
│   Each token routed to 1-of-16 experts (Top-1 MoE)                 │
│   → Active params at inference: 17B / Total params: 109B           │
└────────────────────────────────────────────────────────────────────┘
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
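
&lt;p&gt;The interleaving pattern in the diagram can be written down in a few lines. This is a minimal Python sketch, not Meta's implementation: the names &lt;code&gt;layer_kinds&lt;/code&gt; and &lt;code&gt;active_fraction&lt;/code&gt; and the 48-layer depth in the note below are illustrative assumptions. Only the every-fourth-layer NoPE rule and the 17B / 109B parameter counts come from the text above.&lt;/p&gt;

```python
def layer_kinds(n_layers: int, nope_every: int = 4) -> list[str]:
    """Return one label per layer (1-indexed); every `nope_every`-th layer is NoPE."""
    return [
        "nope" if layer % nope_every == 0 else "rope"
        for layer in range(1, n_layers + 1)
    ]


def active_fraction(active_params_b: float, total_params_b: float) -> float:
    """Share of parameters used per token in a mixture-of-experts model."""
    return active_params_b / total_params_b
```

&lt;p&gt;For a hypothetical 48-layer stack, &lt;code&gt;layer_kinds(48)&lt;/code&gt; yields 12 NoPE layers and 36 RoPE layers, and &lt;code&gt;active_fraction(17, 109)&lt;/code&gt; shows that roughly 16% of Scout's weights are active for any given token.&lt;/p&gt;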



&lt;h2&gt;
  
  
  3. Benchmarks — Llama 4 vs GPT-4o vs Gemini 2.0 Flash
&lt;/h2&gt;

&lt;p&gt;Meta's own numbers show clear wins for in-class comparisons. As always — &lt;strong&gt;these are vendor benchmarks; run your own evals before committing.&lt;/strong&gt;&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Benchmark&lt;/th&gt;
&lt;th&gt;Scout 17B/16E&lt;/th&gt;
&lt;th&gt;Maverick 17B/128E&lt;/th&gt;
&lt;th&gt;GPT-4o&lt;/th&gt;
&lt;th&gt;Gemini 2.0 Flash&lt;/th&gt;
&lt;th&gt;What it means&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;MMLU-Pro&lt;/td&gt;
&lt;td&gt;74.3&lt;/td&gt;
&lt;td&gt;80.5&lt;/td&gt;
&lt;td&gt;78.0&lt;/td&gt;
&lt;td&gt;77.6&lt;/td&gt;
&lt;td&gt;Maverick wins multi-domain reasoning&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;MATH (Hendrycks)&lt;/td&gt;
&lt;td&gt;50.3&lt;/td&gt;
&lt;td&gt;61.2&lt;/td&gt;
&lt;td&gt;76.6&lt;/td&gt;
&lt;td&gt;56.1&lt;/td&gt;
&lt;td&gt;STEM gap vs GPT-4o / o-series remains&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;GPQA Diamond&lt;/td&gt;
&lt;td&gt;57.2&lt;/td&gt;
&lt;td&gt;69.8&lt;/td&gt;
&lt;td&gt;53.6&lt;/td&gt;
&lt;td&gt;60.1&lt;/td&gt;
&lt;td&gt;Graduate-level science — Maverick #1&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;ChartQA (image)&lt;/td&gt;
&lt;td&gt;88.8&lt;/td&gt;
&lt;td&gt;90.0&lt;/td&gt;
&lt;td&gt;85.7&lt;/td&gt;
&lt;td&gt;87.3&lt;/td&gt;
&lt;td&gt;Multimodal chart understanding — both SOTA&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;DocVQA&lt;/td&gt;
&lt;td&gt;94.4&lt;/td&gt;
&lt;td&gt;94.4&lt;/td&gt;
&lt;td&gt;92.8&lt;/td&gt;
&lt;td&gt;92.1&lt;/td&gt;
&lt;td&gt;Document-image QA — new bar set&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;MMMU&lt;/td&gt;
&lt;td&gt;69.4&lt;/td&gt;
&lt;td&gt;73.4&lt;/td&gt;
&lt;td&gt;69.1&lt;/td&gt;
&lt;td&gt;71.7&lt;/td&gt;
&lt;td&gt;Multimodal exam aggregate — Maverick wins&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Long Context (10M, NIAH)&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;~99%&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;N/A&lt;/td&gt;
&lt;td&gt;~128K limit&lt;/td&gt;
&lt;td&gt;~1M limit&lt;/td&gt;
&lt;td&gt;Scout's exclusive territory&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The figure that should give a buyer pause is &lt;strong&gt;MATH&lt;/strong&gt;. Llama 4 caught up or pulled ahead in general and multimodal domains, but on MATH GPT-4o still leads Maverick by about 15 points (76.6 vs 61.2), and OpenAI's o-series reasoning models target exactly this kind of workload. If your workload is &lt;strong&gt;math-heavy STEM reasoning&lt;/strong&gt;, Llama 4 alone is not enough: pair it with o-series or wait for Behemoth GA. If your workload is &lt;strong&gt;long-document RAG, multimodal assistants, or codebase analysis&lt;/strong&gt;, Llama 4 is, on these vendor numbers, among the strongest cost-to-quality options available today.&lt;/p&gt;
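
&lt;p&gt;To check the "What it means" column against the raw numbers, the table can be loaded into a small lookup. This is a sketch only: the scores are copied from the table above (vendor-reported, higher is better), and the helper name &lt;code&gt;leader&lt;/code&gt; is an assumption made for illustration.&lt;/p&gt;

```python
# Scores copied from the benchmark table above (vendor-reported).
SCORES = {
    "MMLU-Pro":     {"Scout": 74.3, "Maverick": 80.5, "GPT-4o": 78.0, "Gemini 2.0 Flash": 77.6},
    "MATH":         {"Scout": 50.3, "Maverick": 61.2, "GPT-4o": 76.6, "Gemini 2.0 Flash": 56.1},
    "GPQA Diamond": {"Scout": 57.2, "Maverick": 69.8, "GPT-4o": 53.6, "Gemini 2.0 Flash": 60.1},
    "ChartQA":      {"Scout": 88.8, "Maverick": 90.0, "GPT-4o": 85.7, "Gemini 2.0 Flash": 87.3},
    "DocVQA":       {"Scout": 94.4, "Maverick": 94.4, "GPT-4o": 92.8, "Gemini 2.0 Flash": 92.1},
    "MMMU":         {"Scout": 69.4, "Maverick": 73.4, "GPT-4o": 69.1, "Gemini 2.0 Flash": 71.7},
}


def leader(benchmark: str) -> tuple[str, float]:
    """Return (top model, margin over the runner-up) for one benchmark."""
    ranked = sorted(SCORES[benchmark].items(), key=lambda kv: kv[1], reverse=True)
    (best, best_score), (_, second_score) = ranked[0], ranked[1]
    return best, round(best_score - second_score, 1)
```

&lt;p&gt;On these figures &lt;code&gt;leader("MATH")&lt;/code&gt; returns GPT-4o with a 15.4-point margin, while &lt;code&gt;leader("GPQA Diamond")&lt;/code&gt; returns Maverick with a 9.7-point margin. That is why the caution above is specifically about math rather than science benchmarks in general.&lt;/p&gt;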

&lt;h2&gt;
  
  
  4. License — "Open Weights" but not OSI Open Source
&lt;/h2&gt;

&lt;p&gt;Llama 4 ships under the &lt;strong&gt;Llama 4 Community License Agreement&lt;/strong&gt;, and this is the part buyers most often misread. It is &lt;em&gt;not&lt;/em&gt; Apache 2.0 or MIT, and it does &lt;em&gt;not&lt;/em&gt; meet the OSI Open Source definition. Inspect five clauses before you ship.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Clause&lt;/th&gt;
&lt;th&gt;Constraint&lt;/th&gt;
&lt;th&gt;What to do&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;① 700M MAU cap&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;If your products and services (including affiliates) had more than 700M monthly active users in the calendar month before the Llama 4 release date, you must request a separate license from Meta&lt;/td&gt;
&lt;td&gt;Startups and mid-market: irrelevant. Hyperscale SaaS: legal review&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;② EU multimodal carve-out&lt;/strong&gt;&lt;/td&gt;
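&lt;td&gt;Individuals domiciled in the EU and companies with their principal place of business in the EU are not licensed to use the multimodal (vision) capabilities; the restriction targets the licensee, not end users of a product&lt;/td&gt;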
&lt;td&gt;EU services must disable Maverick vision or fork to a text-only path&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;③ Acceptable Use Policy&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Malicious cyber activity, CSAM, fraud, and the other harms listed in the policy are prohibited&lt;/td&gt;
&lt;td&gt;Pair with Llama Guard 4 input/output filtering + your own content policy&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;④ "Built with Llama" attribution&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Products built on Llama must display "Built with Llama"&lt;/td&gt;
&lt;td&gt;Add to About page / docs footer&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;⑤ Derivative naming&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Derivative model names must begin with "Llama"&lt;/td&gt;
&lt;td&gt;e.g. &lt;code&gt;Llama-Manoit-Customer-Support-v1&lt;/code&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;One-liner: &lt;strong&gt;"Commercial use is fine, but check the five clauses — 700M MAU, EU vision, AUP, attribution, naming."&lt;/strong&gt; For most Korean SaaS, B2B tools, and internal projects clause 1 is moot. EU multimodal services need clause 2 in their pre-launch checklist.&lt;/p&gt;

&lt;h2&gt;
  
  
  5. Deployment options — Ollama, vLLM, watsonx.ai, Hugging Face
&lt;/h2&gt;

&lt;p&gt;Llama 4 launched day-one on &lt;strong&gt;Hugging Face, Ollama, watsonx.ai, Together, Fireworks, Groq, and Oracle OCI&lt;/strong&gt;. Pick by workload character:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Option&lt;/th&gt;
&lt;th&gt;Best for&lt;/th&gt;
&lt;th&gt;Strengths&lt;/th&gt;
&lt;th&gt;Constraints&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Ollama (local)&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Development, PoC, offline tools&lt;/td&gt;
&lt;td&gt;One-line &lt;code&gt;ollama run llama4:scout&lt;/code&gt;, pre-quantized GGUF builds&lt;/td&gt;
&lt;td&gt;Single-node, no multi-GPU sharding&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;vLLM (self-hosted)&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Low-latency production serving&lt;/td&gt;
&lt;td&gt;PagedAttention, continuous batching, OpenAI-compatible API&lt;/td&gt;
&lt;td&gt;You operate the GPUs and NCCL&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;HF TGI&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Hugging Face standardization&lt;/td&gt;
&lt;td&gt;Token streaming, tensor parallelism&lt;/td&gt;
&lt;td&gt;Slightly lower throughput than vLLM&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;watsonx.ai / Bedrock&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Enterprise compliance&lt;/td&gt;
&lt;td&gt;VPC isolation, SOC2, HIPAA, EU data residency&lt;/td&gt;
&lt;td&gt;3-5× the per-token cost vs self-hosting&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Together / Fireworks / Groq&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Cheap token consumption&lt;/td&gt;
&lt;td&gt;$0.27/1M tokens (Maverick at some vendors)&lt;/td&gt;
&lt;td&gt;Vendor lock-in, residency review needed&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  5.1 vLLM production deploy (Maverick, 8×H100, FP8)
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Step 1: get a Hugging Face token&lt;/span&gt;
&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;HF_TOKEN&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"hf_xxxxxxxxxxxxxxxxxxxxxxx"&lt;/span&gt;

&lt;span class="c"&gt;# Step 2: install vLLM 0.8.3+ (first release with Llama 4 support)&lt;/span&gt;
pip &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;--upgrade&lt;/span&gt; vllm

&lt;span class="c"&gt;# Step 3: serve Maverick (8×H100, FP8 quantization)&lt;/span&gt;
&lt;span class="c"&gt;# Context length is bounded by KV-cache memory: a full 1M-token window likely will not fit on 8×H100, so start near 430K&lt;/span&gt;
vllm serve meta-llama/Llama-4-Maverick-17B-128E-Instruct &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--tensor-parallel-size&lt;/span&gt; 8 &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--quantization&lt;/span&gt; fp8 &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--max-model-len&lt;/span&gt; 430000 &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--gpu-memory-utilization&lt;/span&gt; 0.9 &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--enable-prefix-caching&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--port&lt;/span&gt; 8000

&lt;span class="c"&gt;# Step 4: hit the OpenAI-compatible API&lt;/span&gt;
curl http://localhost:8000/v1/chat/completions &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s2"&gt;"Content-Type: application/json"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="s1"&gt;'{
    "model": "meta-llama/Llama-4-Maverick-17B-128E-Instruct",
    "messages": [{"role": "user", "content": "Summarize Llama 4 adoption trends in one paragraph"}]
  }'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  5.2 Ollama local (Scout)
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Step 1: install Ollama (macOS / Linux)&lt;/span&gt;
curl &lt;span class="nt"&gt;-fsSL&lt;/span&gt; https://ollama.com/install.sh | sh

&lt;span class="c"&gt;# Step 2: pull Scout (~60GB at Q4_K_M)&lt;/span&gt;
ollama pull llama4:scout

&lt;span class="c"&gt;# Step 3: interactive&lt;/span&gt;
ollama run llama4:scout

&lt;span class="c"&gt;# Step 4: API mode (port 11434)&lt;/span&gt;
curl http://localhost:11434/api/chat &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="s1"&gt;'{
  "model": "llama4:scout",
  "messages": [
    {"role": "user", "content": "How would I summarize a 10MB log file in a single inference?"}
  ]
}'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Practical hardware floor: Scout fits in &lt;strong&gt;a single H100 (80GB) at Int4&lt;/strong&gt;; Maverick wants &lt;strong&gt;at least 4×H100 (80GB)&lt;/strong&gt; at Int4, or 8×H100 at FP8 as in section 5.1. A Mac Studio M2 Ultra (192GB) can run Scout Q4 for PoC purposes at 5-10 tok/s.&lt;/p&gt;
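
&lt;p&gt;The hardware floor follows from simple weight-memory arithmetic. The sketch below is a back-of-the-envelope estimate under stated assumptions, not a sizing tool: it counts weights only (no KV cache or activations), reuses the 0.9 memory utilization from the serve command in section 5.1, rounds the GPU count up to a power of two as a stand-in for typical tensor-parallel degrees, and takes Maverick's total size as 400B parameters, which is Meta's published figure rather than a number stated in this section. The function names are illustrative.&lt;/p&gt;

```python
import math

GPU_GB = 80          # H100 80GB, as in the text
UTILIZATION = 0.9    # mirrors --gpu-memory-utilization 0.9 in section 5.1


def weight_gb(params_billion: float, bits_per_param: int) -> float:
    """Memory for the weights alone, in GB (1B params at 8 bits is 1 GB)."""
    return params_billion * bits_per_param / 8


def min_gpus(params_billion: float, bits_per_param: int) -> int:
    """Smallest power-of-two GPU count whose usable memory holds the weights.

    Assumption: tensor-parallel degree is rounded up to a power of two.
    KV cache and activations are NOT included, so treat the result as a floor.
    """
    usable_gb = GPU_GB * UTILIZATION
    needed = math.ceil(weight_gb(params_billion, bits_per_param) / usable_gb)
    return 2 ** (needed - 1).bit_length()
```

&lt;p&gt;With these assumptions, &lt;code&gt;min_gpus(109, 4)&lt;/code&gt; gives 1 for Scout at Int4, &lt;code&gt;min_gpus(400, 4)&lt;/code&gt; gives 4 for Maverick at Int4, and &lt;code&gt;min_gpus(400, 8)&lt;/code&gt; gives 8 for Maverick at FP8. Long contexts raise the real requirement, because the KV cache grows with sequence length.&lt;/p&gt;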

&lt;h2&gt;
  
  
  6. Llama Guard 4 — the 12B multimodal safety stack
&lt;/h2&gt;

&lt;p&gt;Alongside Llama 4, Meta released &lt;strong&gt;Llama Guard 4 (12B)&lt;/strong&gt;, a multimodal classifier that scores both text and images against 13 risk categories (violence, self-harm, sexual content, CSAM, hate, criminal facilitation, CBRN, etc.). ManoIT's recommended pipeline applies guards on &lt;strong&gt;both input and output&lt;/strong&gt;:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;┌──────────────────────── User request ────────────────────────┐
│  POST /v1/chat   { messages: [...] }                        │
└──────────────────────────────┬───────────────────────────────┘
                               │
                               ▼
              ┌────────────────────────────────┐
              │  ① Prompt Guard (86M, fast)     │
              │  Jailbreak / Prompt Injection   │
              └────────────────────────────────┘
                               │ pass
                               ▼
              ┌────────────────────────────────┐
              │  ② Llama Guard 4 (12B)          │
              │  Input safety classifier (S1~13)│
              └────────────────────────────────┘
                               │ safe
                               ▼
              ┌────────────────────────────────┐
              │  ③ Llama 4 Maverick / Scout     │
              │  Generate response              │
              └────────────────────────────────┘
                               │
                               ▼
              ┌────────────────────────────────┐
              │  ④ Llama Guard 4 (output)       │
              │  unsafe → block / redact        │
              └────────────────────────────────┘
                               │ safe
                               ▼
                       Final response
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Guard model&lt;/th&gt;
&lt;th&gt;Role&lt;/th&gt;
&lt;th&gt;Size&lt;/th&gt;
&lt;th&gt;Latency impact&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Prompt Guard 2 (86M)&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;First-pass jailbreak / prompt-injection filter&lt;/td&gt;
&lt;td&gt;86M (mDeBERTa-base encoder)&lt;/td&gt;
&lt;td&gt;+10~20ms&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Llama Guard 4 (12B)&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;13-category classifier (input + output)&lt;/td&gt;
&lt;td&gt;12B multimodal&lt;/td&gt;
&lt;td&gt;+150~300ms&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;CyberSec Eval&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Detect security flaws in generated code&lt;/td&gt;
&lt;td&gt;Eval framework&lt;/td&gt;
&lt;td&gt;Pre-deploy static check&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;
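
&lt;p&gt;The four-stage flow in the diagram maps onto a short control function. The sketch below is hypothetical glue code, not an official Meta or ManoIT API: &lt;code&gt;guarded_chat&lt;/code&gt;, &lt;code&gt;Result&lt;/code&gt;, and &lt;code&gt;added_latency_ms&lt;/code&gt; are made-up names, and the three callables stand in for real calls to Prompt Guard, Llama Guard 4, and the serving endpoint. The latency ranges are the ones from the table above.&lt;/p&gt;

```python
from dataclasses import dataclass
from typing import Callable

Check = Callable[[str], bool]   # returns True when the text is considered safe


@dataclass
class Result:
    allowed: bool
    stage: str      # which stage decided the outcome
    text: str = ""


def guarded_chat(prompt: str, prompt_guard: Check, llama_guard: Check,
                 generate: Callable[[str], str]) -> Result:
    """Run the four stages from the diagram, stopping at the first failure."""
    if not prompt_guard(prompt):        # stage 1: jailbreak / injection filter
        return Result(False, "prompt_guard")
    if not llama_guard(prompt):         # stage 2: input safety classifier
        return Result(False, "input_guard")
    reply = generate(prompt)            # stage 3: Scout / Maverick
    if not llama_guard(reply):          # stage 4: output safety classifier
        return Result(False, "output_guard")
    return Result(True, "ok", reply)


def added_latency_ms(prompt_guard_ms: tuple[int, int] = (10, 20),
                     llama_guard_ms: tuple[int, int] = (150, 300)) -> tuple[int, int]:
    """Best/worst-case overhead: one Prompt Guard pass plus two Llama Guard passes."""
    return (prompt_guard_ms[0] + 2 * llama_guard_ms[0],
            prompt_guard_ms[1] + 2 * llama_guard_ms[1])
```

&lt;p&gt;Because the output check runs after generation, the guard latencies add up sequentially: with the table's figures, &lt;code&gt;added_latency_ms()&lt;/code&gt; returns a range of 310 to 620 ms on top of model inference. Running the cheap Prompt Guard first means obviously hostile prompts never reach the 12B classifier.&lt;/p&gt;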

&lt;h2&gt;
  
  
  7. The Behemoth delay — what it means
&lt;/h2&gt;

&lt;p&gt;Llama 4 Behemoth (2T total / 288B active) was &lt;em&gt;not&lt;/em&gt; part of the April 5 release. It was originally targeted for April, slipped to June, and is now expected in &lt;strong&gt;fall 2026&lt;/strong&gt;. Reporting attributes the delay to internal concerns about whether Behemoth's gain over Maverick justifies a public rollout.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Date&lt;/th&gt;
&lt;th&gt;Event&lt;/th&gt;
&lt;th&gt;Meaning&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;2026-01&lt;/td&gt;
&lt;td&gt;Behemoth ~75% trained, April launch announced&lt;/td&gt;
&lt;td&gt;Planned co-release with Scout / Maverick&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2026-04-05&lt;/td&gt;
&lt;td&gt;Only Scout / Maverick released; Behemoth slips to June&lt;/td&gt;
&lt;td&gt;"Need more internal evaluation"&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2026-05-15&lt;/td&gt;
&lt;td&gt;Behemoth pushed to fall 2026&lt;/td&gt;
&lt;td&gt;Strong on STEM, unclear lift on general workloads&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2026-06-10&lt;/td&gt;
&lt;td&gt;Meta superintelligence lab reportedly forms&lt;/td&gt;
&lt;td&gt;Top-tier model team carved out, Behemoth realigned&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The delay does &lt;strong&gt;not&lt;/strong&gt; affect a Llama 4 adoption decision. Scout and Maverick already compete with GPT-4o and Gemini 2.0 Flash, and the reason enterprises pick open-weights models is &lt;strong&gt;data sovereignty, on-prem control, and customization&lt;/strong&gt;, not a single benchmark crown. If STEM is your hot path, pair Llama 4 with Claude Opus 4.7 or OpenAI o-series until Behemoth GA.&lt;/p&gt;

&lt;h2&gt;
  
  
  8. ManoIT 8-step production checklist
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Step&lt;/th&gt;
&lt;th&gt;Check&lt;/th&gt;
&lt;th&gt;Concrete action&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;① License review&lt;/td&gt;
&lt;td&gt;5 clauses — 700M MAU, EU vision, AUP, "Built with Llama", naming&lt;/td&gt;
&lt;td&gt;Legal review → impact memo&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;② Model selection&lt;/td&gt;
&lt;td&gt;Scout vs Maverick by workload (text / multimodal / context length)&lt;/td&gt;
&lt;td&gt;Long-doc RAG → Scout, multimodal → Maverick&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;③ Infra design&lt;/td&gt;
&lt;td&gt;FP8 / Int4 quantization, TP degree, KV cache memory budget&lt;/td&gt;
&lt;td&gt;Maverick: 8×H100 FP8, Scout: 1×H100 Int4&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;④ Serving stack&lt;/td&gt;
&lt;td&gt;vLLM / TGI / Ollama choice, autoscaling policy&lt;/td&gt;
&lt;td&gt;vLLM 0.8.3+ recommended, KEDA for GPU pool autoscale&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;⑤ Safety stack&lt;/td&gt;
&lt;td&gt;Prompt Guard 2 + Llama Guard 4 on input &lt;em&gt;and&lt;/em&gt; output&lt;/td&gt;
&lt;td&gt;Budget roughly +300~600ms end to end (Prompt Guard plus two Llama Guard 4 passes), define category policy&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;⑥ Eval pipeline&lt;/td&gt;
&lt;td&gt;Korean benchmarks + domain golden datasets&lt;/td&gt;
&lt;td&gt;KMMLU, KoBEST, in-house RAG accuracy automated&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;⑦ Observability&lt;/td&gt;
&lt;td&gt;Token throughput, GPU util, guard block rate, RAG hit rate&lt;/td&gt;
&lt;td&gt;OpenTelemetry GenAI semconv + Grafana&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;⑧ Governance&lt;/td&gt;
&lt;td&gt;Model card, retention, audit log, rollback runbook&lt;/td&gt;
&lt;td&gt;Model card + prompt versioning + sampled response retention&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h2&gt;
  
  
  9. Field guide — which model for which job
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Scenario&lt;/th&gt;
&lt;th&gt;Recommended&lt;/th&gt;
&lt;th&gt;Why&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;50 contracts side-by-side, clause comparison&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Scout&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;10M context, single node, lowest cost&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Whole-codebase RAG (200K LOC)&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Scout&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Fit the entire repo in context&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Multimodal customer-support chatbot (image + text)&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Maverick&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;DocVQA / ChartQA leader, GPT-4o replacement&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;20-hour video summarization&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Scout&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Native long-context, full-video indexing&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;STEM math&lt;/td&gt;
&lt;td&gt;OpenAI o-series · Claude Opus&lt;/td&gt;
&lt;td&gt;MATH/AIME gap remains, wait for Behemoth GA&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Real-time voice assistant&lt;/td&gt;
&lt;td&gt;Maverick + Whisper (STT)&lt;/td&gt;
&lt;td&gt;&amp;lt;500ms response, multimodal context&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;On-prem medical / financial RAG&lt;/td&gt;
&lt;td&gt;Scout + Llama Guard 4&lt;/td&gt;
&lt;td&gt;Data sovereignty, 700M MAU irrelevant&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;EU-resident multimodal service&lt;/td&gt;
&lt;td&gt;Maverick (text only) or GPT-4o&lt;/td&gt;
&lt;td&gt;EU vision license restriction&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h2&gt;
  
  
  10. Conclusion — what Llama 4 changes, and what it doesn't
&lt;/h2&gt;

&lt;p&gt;The significance of Scout and Maverick is that &lt;strong&gt;open-weights models reached SOTA in multimodal and ultra-long-context territory&lt;/strong&gt;. Through 2024-2025 open models could compete with GPT-4o on English text reasoning, but multimodal and &amp;gt;100K-token contexts had a real gap. iRoPE and native multimodal training closed it.&lt;/p&gt;

&lt;p&gt;What did &lt;em&gt;not&lt;/em&gt; change: &lt;strong&gt;the STEM / math reasoning gap vs OpenAI o-series remains&lt;/strong&gt;, and it is unlikely to close before Behemoth GA. &lt;strong&gt;"Open Weights" is not "Open Source"&lt;/strong&gt;: the five Llama 4 Community License clauses (especially EU multimodal exclusion and 700M MAU) require a legal review before adoption. And &lt;strong&gt;safety and governance cost&lt;/strong&gt; does not vanish because the weights are public: Llama Guard 4, Prompt Guard, and your own eval pipeline are not optional.&lt;/p&gt;

&lt;p&gt;ManoIT's recommendation for Q2 2026: standardize on &lt;strong&gt;Scout as the default RAG engine&lt;/strong&gt; (legal, technical document analysis), &lt;strong&gt;Maverick as the multimodal assistant backbone&lt;/strong&gt; (customer support, automation workflows). Hybridize with Claude Opus 4.7 or OpenAI o-series where STEM is the hot path. For multimodal services with EU exposure, gate adoption on a license review. We will revisit this guidance when Behemoth GA arrives in fall 2026.&lt;/p&gt;




&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;This article was co-authored by the ManoIT engineering team with Anthropic Claude Opus 4.7. Korean original published on the ManoIT tech blog. Sources: &lt;a href="https://ai.meta.com/blog/llama-4-multimodal-intelligence/" rel="noopener noreferrer"&gt;Meta AI - The Llama 4 herd&lt;/a&gt;, &lt;a href="https://www.llama.com/models/llama-4/" rel="noopener noreferrer"&gt;Llama 4 Official Model Page&lt;/a&gt;, &lt;a href="https://huggingface.co/meta-llama/Llama-4-Scout-17B-16E" rel="noopener noreferrer"&gt;Hugging Face Llama-4-Scout&lt;/a&gt;, &lt;a href="https://huggingface.co/meta-llama/Llama-4-Maverick-17B-128E-Instruct" rel="noopener noreferrer"&gt;Hugging Face Llama-4-Maverick&lt;/a&gt;, &lt;a href="https://huggingface.co/blog/llama4-release" rel="noopener noreferrer"&gt;Hugging Face Llama 4 Release Blog&lt;/a&gt;, &lt;a href="https://www.ibm.com/new/announcements/meta-llama-4-maverick-and-llama-4-scout-now-available-in-watsonx-ai" rel="noopener noreferrer"&gt;IBM watsonx.ai Llama 4 Announcement&lt;/a&gt;, &lt;a href="https://ollama.com/library/llama4" rel="noopener noreferrer"&gt;Ollama Llama 4 Library&lt;/a&gt;, &lt;a href="https://www.computerworld.com/article/3987990/meta-hits-pause-on-llama-4-behemoth-ai-model-amid-capability-concerns.html" rel="noopener noreferrer"&gt;Computerworld - Behemoth Pause&lt;/a&gt;, &lt;a href="https://protectai.com/blog/vulnerability-assessment-llama-4" rel="noopener noreferrer"&gt;Protect AI - Llama 4 Vulnerability Assessment&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;




&lt;p&gt;&lt;em&gt;Originally published at &lt;a href="https://www.manoit.co.kr/forum/view/1461827" rel="noopener noreferrer"&gt;ManoIT Tech Blog&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>llm</category>
      <category>machinelearning</category>
      <category>opensource</category>
    </item>
    <item>
      <title>OpenTelemetry eBPF Instrumentation (OBI) — The Complete Guide: KubeCon EU 2026 Beta Launch, Zero-Code Observability, and the 1.0 GA Roadmap</title>
      <dc:creator>daniel jeong</dc:creator>
      <pubDate>Thu, 23 Apr 2026 00:15:08 +0000</pubDate>
      <link>https://dev.to/x4nent/opentelemetry-ebpf-instrumentation-obi-the-complete-guide-kubecon-eu-2026-beta-launch-5e2o</link>
      <guid>https://dev.to/x4nent/opentelemetry-ebpf-instrumentation-obi-the-complete-guide-kubecon-eu-2026-beta-launch-5e2o</guid>
      <description>&lt;h1&gt;
  
  
  OpenTelemetry eBPF Instrumentation (OBI) — The Complete Guide: KubeCon EU 2026 Beta Launch, Zero-Code Observability, and the 1.0 GA Roadmap
&lt;/h1&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Published on the ManoIT Tech Blog (Korean original).&lt;/strong&gt; In April 2026 at KubeCon + CloudNativeCon Europe in Amsterdam, Splunk formally announced the beta launch of &lt;strong&gt;OpenTelemetry eBPF Instrumentation (OBI)&lt;/strong&gt;, the OpenTelemetry community's successor to Grafana Beyla. This post walks through v0.8.0 architecture, Kubernetes Helm deployment, HTTP header enrichment for multi-tenant incident response, the 2026 roadmap toward 1.0 GA, how OBI relates to Beyla/Pixie/Tetragon/Hubble, and a production adoption checklist grounded in what ManoIT ships to customers.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  1. Why OBI matters — the zero-code observability inflection point
&lt;/h2&gt;

&lt;p&gt;CNCF's Observability TAG reported in Q1 2026 that &lt;strong&gt;67% of production Kubernetes clusters&lt;/strong&gt; are already running at least one eBPF-based observability tool. But the existing landscape was fragmented: Pixie was tied to New Relic, Grafana Beyla skewed toward Grafana Cloud, and Cilium Hubble stopped at L3/L4 network flows without application-level tracing. OBI cleans this up using the &lt;strong&gt;OpenTelemetry Protocol (OTLP)&lt;/strong&gt; and an &lt;strong&gt;Apache 2.0&lt;/strong&gt; license.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Pain point&lt;/th&gt;
&lt;th&gt;Traditional workaround&lt;/th&gt;
&lt;th&gt;OBI answer&lt;/th&gt;
&lt;th&gt;Operational effect&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Go/Rust/C++ binary auto-instrumentation&lt;/td&gt;
&lt;td&gt;OTel SDK insertion, rebuild required&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;eBPF uprobes + kprobes, zero code&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Legacy and third-party binaries get visibility immediately&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;TLS-encrypted traffic tracing&lt;/td&gt;
&lt;td&gt;Sidecar proxy (Envoy/Istio) injection&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;eBPF uprobes on SSL_read/SSL_write (user-space TLS library calls)&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;HTTPS payloads observable without sidecars&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Multi-tenant SaaS incident triage&lt;/td&gt;
&lt;td&gt;"Error rate up" — no idea which tenant&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;HTTP header enrichment (v0.7.0+)&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Filter by &lt;code&gt;x-tenant-id&lt;/code&gt; / &lt;code&gt;x-user-segment&lt;/code&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;SQL / Redis / Mongo query analysis&lt;/td&gt;
&lt;td&gt;ORM instrumentation + sampling&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Native server spans (pgx, mysql, mongo, redis, couchbase)&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;DB latency linked to app traces&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;OpenAI / Anthropic call tracing&lt;/td&gt;
&lt;td&gt;Manual wrappers + custom token counting&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;GenAI instrumentation with payload extraction&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;LLM cost and latency collected automatically&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;In one line: &lt;strong&gt;OBI collects only what the kernel can tell it, and leaves the SDK alone.&lt;/strong&gt; If you still need custom business events or application-specific attributes, OBI is designed to run &lt;strong&gt;alongside&lt;/strong&gt; the OpenTelemetry SDKs — it fills visibility gaps, it doesn't replace language-level instrumentation.&lt;/p&gt;

&lt;h2&gt;
  
  
  2. OBI architecture — from Beyla to OBI
&lt;/h2&gt;

&lt;p&gt;OBI's technical lineage is &lt;strong&gt;Grafana Beyla&lt;/strong&gt;. Grafana Labs donated Beyla to OpenTelemetry in 2025; a weekly SIG formed, test pipeline speeds improved &lt;strong&gt;10×&lt;/strong&gt;, and after a late-2025 alpha release the project reached &lt;strong&gt;v0.8.0 on April 16, 2026&lt;/strong&gt;. It ships as a binary, as a Docker image (&lt;code&gt;otel/ebpf-instrument&lt;/code&gt;), and as a Helm chart.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;┌─────────────────────────── User-Space Agent ───────────────────────────┐
│  (obi binary, written in Go)                                           │
│                                                                        │
│  ┌──────────────┐  ┌───────────────┐  ┌────────────────────────────┐ │
│  │ eBPF Map     │→ │ Span Builder  │→ │ OTLP Exporter (gRPC/HTTP) │ │
│  │ Reader       │  │ (HTTP/gRPC/DB)│  │ → OTel Collector           │ │
│  └──────────────┘  └───────────────┘  └────────────────────────────┘ │
│         ↑                                                              │
│         │ eBPF maps (perf_event_array, ring_buffer)                    │
└─────────┼──────────────────────────────────────────────────────────────┘
          │
┌─────────┼────────────────── Kernel-Space Probes ───────────────────────┐
│   ┌──────────────┐  ┌──────────────┐  ┌──────────────┐               │
│   │ uprobes      │  │ kprobes      │  │ tracepoints  │               │
│   │ (SSL_read,   │  │ (tcp_sendmsg,│  │ (sched, fs)  │               │
│   │  SSL_write)  │  │  tcp_recvmsg)│  │              │               │
│   └──────────────┘  └──────────────┘  └──────────────┘               │
│           │                                                            │
│           ▼ Linux 5.8+ kernel (RHEL 4.18+ backport), BTF required      │
└────────────────────────────────────────────────────────────────────────┘
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Two design decisions stand out. First, &lt;strong&gt;kernel probes only capture raw events&lt;/strong&gt;, leaving heavy parsing, filtering, and mapping to the user-space agent, which keeps kernel-side CPU cost minimal. Second, because the output is &lt;strong&gt;OTLP&lt;/strong&gt;, a single OBI deployment can feed Jaeger, Tempo, Splunk APM, Grafana Cloud, or Honeycomb through the same OpenTelemetry Collector.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;System requirement&lt;/th&gt;
&lt;th&gt;Detail&lt;/th&gt;
&lt;th&gt;Notes&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Linux kernel&lt;/td&gt;
&lt;td&gt;
&lt;strong&gt;5.8+&lt;/strong&gt; (RHEL/Rocky/Alma 4.18+ with eBPF backport)&lt;/td&gt;
&lt;td&gt;BTF (BPF Type Format) required&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Architecture&lt;/td&gt;
&lt;td&gt;amd64, arm64&lt;/td&gt;
&lt;td&gt;Graviton / Ampere supported&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Privileges&lt;/td&gt;
&lt;td&gt;root or &lt;code&gt;CAP_BPF + CAP_SYS_PTRACE&lt;/code&gt;
&lt;/td&gt;
&lt;td&gt;Configure DaemonSet securityContext&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Pod settings&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;hostPID: true&lt;/code&gt; recommended&lt;/td&gt;
&lt;td&gt;Required to discover host-namespace processes&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Container image&lt;/td&gt;
&lt;td&gt;&lt;code&gt;otel/ebpf-instrument:v0.8.0&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;CycloneDX SBOM included&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;License&lt;/td&gt;
&lt;td&gt;Apache 2.0&lt;/td&gt;
&lt;td&gt;No commercial restrictions&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;
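
&lt;p&gt;The kernel and BTF rows of the table are easy to verify on a node before installing anything. The following is a minimal preflight sketch and not part of OBI: the function names are invented for illustration, the version floors come from the table above, and &lt;code&gt;/sys/kernel/btf/vmlinux&lt;/code&gt; is the usual location where kernels built with BTF expose their type information. It does not check capabilities or pod settings.&lt;/p&gt;

```python
import os
import platform
import re

MIN_KERNEL = (5, 8)          # from the requirements table
RHEL_MIN_KERNEL = (4, 18)    # RHEL-family kernels with the eBPF backport
BTF_PATH = "/sys/kernel/btf/vmlinux"   # usual location of kernel BTF data


def kernel_version(release: str) -> tuple[int, int]:
    """Extract (major, minor) from a release string such as '5.15.0-91-generic'."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    return int(match.group(1)), int(match.group(2))


def kernel_ok(release: str, rhel_family: bool = False) -> bool:
    """True when the kernel meets the minimum version for this distro family."""
    floor = RHEL_MIN_KERNEL if rhel_family else MIN_KERNEL
    return kernel_version(release) >= floor


def preflight() -> dict[str, bool]:
    """Check the local Linux node; run on the host, not in an unprivileged pod."""
    return {
        "kernel": kernel_ok(platform.release()),
        "btf": os.path.exists(BTF_PATH),
    }
```

&lt;p&gt;Calling &lt;code&gt;preflight()&lt;/code&gt; on each node class (for example via a one-off Job) before the Helm install surfaces unsupported kernels early; pass &lt;code&gt;rhel_family=True&lt;/code&gt; to &lt;code&gt;kernel_ok&lt;/code&gt; on RHEL, Rocky, or Alma hosts where the 4.18 backport applies.&lt;/p&gt;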

&lt;h2&gt;
  
  
  3. Kubernetes Helm deployment — cluster visibility in 15 minutes
&lt;/h2&gt;

&lt;p&gt;The officially recommended deployment topology is &lt;strong&gt;Helm + DaemonSet&lt;/strong&gt;. A DaemonSet is required because OBI must reach every node's process namespace, and &lt;code&gt;hostNetwork&lt;/code&gt; / &lt;code&gt;hostPID&lt;/code&gt; are wired automatically when deployed this way.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Step 1: add the OpenTelemetry Helm repository&lt;/span&gt;
helm repo add open-telemetry https://open-telemetry.github.io/opentelemetry-helm-charts
helm repo update

&lt;span class="c"&gt;# Step 2: default install (DaemonSet in the obi namespace)&lt;/span&gt;
helm &lt;span class="nb"&gt;install &lt;/span&gt;obi &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-n&lt;/span&gt; obi &lt;span class="nt"&gt;--create-namespace&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  open-telemetry/opentelemetry-ebpf-instrumentation

&lt;span class="c"&gt;# Step 3: verify the install&lt;/span&gt;
kubectl &lt;span class="nt"&gt;-n&lt;/span&gt; obi get daemonset
kubectl &lt;span class="nt"&gt;-n&lt;/span&gt; obi logs &lt;span class="nt"&gt;-l&lt;/span&gt; app.kubernetes.io/name&lt;span class="o"&gt;=&lt;/span&gt;obi &lt;span class="nt"&gt;--tail&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;50

&lt;span class="c"&gt;# Step 4: confirm probes were loaded on each node&lt;/span&gt;
kubectl &lt;span class="nt"&gt;-n&lt;/span&gt; obi &lt;span class="nb"&gt;exec &lt;/span&gt;ds/obi &lt;span class="nt"&gt;--&lt;/span&gt; &lt;span class="nb"&gt;ls&lt;/span&gt; /sys/fs/bpf/obi/
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The default install is enough to start collecting RED metrics and traces for every HTTP/gRPC request. In production you'll want to combine &lt;strong&gt;OTLP endpoint selection, service discovery, and header enrichment&lt;/strong&gt; into a custom values file.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="c1"&gt;# helm-obi-prod.yaml — ManoIT production values&lt;/span&gt;
&lt;span class="na"&gt;config&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;data&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="c1"&gt;# OTLP destination (e.g. OTel Collector ClusterIP service)&lt;/span&gt;
    &lt;span class="na"&gt;otel&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;endpoint&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;http://otel-collector.observability.svc:4318&lt;/span&gt;
      &lt;span class="na"&gt;protocol&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;http/protobuf&lt;/span&gt;
    &lt;span class="c1"&gt;# Automatic process discovery&lt;/span&gt;
    &lt;span class="na"&gt;discovery&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;services&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;k8s_namespace&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;^(shop|payment|auth)$"&lt;/span&gt;
          &lt;span class="na"&gt;k8s_pod_labels&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
            &lt;span class="na"&gt;obi.enabled&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;true"&lt;/span&gt;
      &lt;span class="na"&gt;exclude_services&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;exe_path_regex&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;.*/istio-proxy$"&lt;/span&gt;
    &lt;span class="c1"&gt;# ⚠️ CNIs using eBPF datapaths (Cilium eBPF, Calico eBPF) can collide&lt;/span&gt;
    &lt;span class="na"&gt;network&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;enabled&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
      &lt;span class="na"&gt;cidrs&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;10.0.0.0/8&lt;/span&gt;
    &lt;span class="c1"&gt;# Protocol-specific instrumentation&lt;/span&gt;
    &lt;span class="na"&gt;routes&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;unmatched&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;heuristic&lt;/span&gt;   &lt;span class="c1"&gt;# generate spans even for unknown HTTP paths&lt;/span&gt;
    &lt;span class="na"&gt;log_level&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;info&lt;/span&gt;
    &lt;span class="na"&gt;log_format&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;json&lt;/span&gt;

&lt;span class="c1"&gt;# Pod securityContext — required eBPF capabilities&lt;/span&gt;
&lt;span class="na"&gt;securityContext&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;privileged&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt;
  &lt;span class="na"&gt;capabilities&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;add&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;BPF"&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;PERFMON"&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;SYS_PTRACE"&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;NET_ADMIN"&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;
  &lt;span class="na"&gt;allowPrivilegeEscalation&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt;
  &lt;span class="na"&gt;readOnlyRootFilesystem&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;

&lt;span class="c1"&gt;# DaemonSet resource limits&lt;/span&gt;
&lt;span class="na"&gt;resources&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;requests&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;{&lt;/span&gt; &lt;span class="nv"&gt;cpu&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="nv"&gt;100m&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;memory&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="nv"&gt;256Mi&lt;/span&gt; &lt;span class="pi"&gt;}&lt;/span&gt;
  &lt;span class="na"&gt;limits&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;   &lt;span class="pi"&gt;{&lt;/span&gt; &lt;span class="nv"&gt;cpu&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="nv"&gt;500m&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;memory&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="nv"&gt;512Mi&lt;/span&gt; &lt;span class="pi"&gt;}&lt;/span&gt;

&lt;span class="c1"&gt;# Node selection (exclude ARM-only clusters, GPU nodes, etc.)&lt;/span&gt;
&lt;span class="na"&gt;nodeSelector&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;kubernetes.io/arch&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;amd64&lt;/span&gt;
&lt;span class="na"&gt;tolerations&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;key&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;node.kubernetes.io/not-ready"&lt;/span&gt;
    &lt;span class="na"&gt;operator&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Exists"&lt;/span&gt;
    &lt;span class="na"&gt;effect&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;NoExecute"&lt;/span&gt;
    &lt;span class="na"&gt;tolerationSeconds&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;300&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;helm upgrade &lt;span class="nt"&gt;--install&lt;/span&gt; obi &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-n&lt;/span&gt; obi &lt;span class="nt"&gt;--create-namespace&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-f&lt;/span&gt; helm-obi-prod.yaml &lt;span class="se"&gt;\&lt;/span&gt;
  open-telemetry/opentelemetry-ebpf-instrumentation
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;⚠️ &lt;strong&gt;Watch out:&lt;/strong&gt; on clusters running Cilium eBPF mode or the Calico eBPF dataplane, some of OBI's network probes can collide with CNI programs. The official recommendation is to set &lt;code&gt;network.enabled: false&lt;/code&gt; and split responsibilities: &lt;strong&gt;OBI for L7 application traces, Cilium Hubble or Tetragon for network flows and kernel-level security.&lt;/strong&gt; This role-split has become the CNCF-recommended pattern in 2026.&lt;/p&gt;

&lt;h2&gt;
  
  
  4. HTTP header enrichment — multi-tenant SaaS incident triage
&lt;/h2&gt;

&lt;p&gt;Header enrichment, introduced in &lt;strong&gt;v0.7.0&lt;/strong&gt;, is the feature that moves OBI from "monitoring tool" to &lt;strong&gt;incident-response platform&lt;/strong&gt;. Raw traces only tell you "error rate rose to 5% on this endpoint." Header enrichment tells you &lt;strong&gt;which tenants and which user segments are affected&lt;/strong&gt; — without any code change.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="c1"&gt;# OBI values.yaml — header enrichment policy&lt;/span&gt;
&lt;span class="na"&gt;config&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;data&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;ebpf&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;track_request_headers&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
      &lt;span class="na"&gt;payload_extraction&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;http&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;enrichment&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
            &lt;span class="na"&gt;enabled&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
            &lt;span class="na"&gt;policy&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
              &lt;span class="na"&gt;default_action&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;exclude&lt;/span&gt;         &lt;span class="c1"&gt;# never collect by default&lt;/span&gt;
              &lt;span class="na"&gt;obfuscation_string&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;***"&lt;/span&gt;       &lt;span class="c1"&gt;# mask sensitive headers&lt;/span&gt;
            &lt;span class="na"&gt;rules&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
              &lt;span class="c1"&gt;# 1) tenant / segment identifiers → attach as span attributes&lt;/span&gt;
              &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;action&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;include&lt;/span&gt;
                &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;headers&lt;/span&gt;
                &lt;span class="na"&gt;match&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
                  &lt;span class="na"&gt;patterns&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
                    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;x-tenant-id"&lt;/span&gt;
                    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;x-user-segment"&lt;/span&gt;
                    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;x-org-id"&lt;/span&gt;
              &lt;span class="c1"&gt;# 2) auth tokens → record a hash but mask the value&lt;/span&gt;
              &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;action&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;obfuscate&lt;/span&gt;
                &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;headers&lt;/span&gt;
                &lt;span class="na"&gt;match&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
                  &lt;span class="na"&gt;patterns&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
                    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;authorization"&lt;/span&gt;
                    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;cookie"&lt;/span&gt;
                    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;x-api-key"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The policy design is deliberately &lt;strong&gt;explicit allowlist + selective obfuscation&lt;/strong&gt;. The default is &lt;code&gt;exclude&lt;/code&gt;, so PII and tokens can't leak by accident. Pattern matching is case-insensitive and supports wildcards (&lt;code&gt;x-manoit-*&lt;/code&gt;).&lt;/p&gt;
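
&lt;p&gt;To make the allowlist-plus-obfuscation behaviour concrete, here is a minimal shell sketch that classifies a header name the way the policy above describes. It is not OBI code: &lt;code&gt;header_action&lt;/code&gt; is a hypothetical helper, the &lt;code&gt;x-manoit-*&lt;/code&gt; wildcard is borrowed from the example in the text rather than from the YAML rules, and checking the obfuscation rule first is an assumption made so that a broad wildcard can never expose a secret.&lt;/p&gt;

```shell
#!/bin/sh
# Sketch of how the enrichment policy above classifies a header name.
# header_action is a hypothetical helper that mirrors the YAML rules; it is not OBI code.

header_action() {
  # Lowercase first so matching is case-insensitive.
  name=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
  case "$name" in
    # Rule 2: secrets are masked. Checked first (assumption) so wildcards cannot expose them.
    authorization|cookie|x-api-key) echo "obfuscate" ;;
    # Rule 1: tenant and segment identifiers, plus the wildcard example from the text.
    x-tenant-id|x-user-segment|x-org-id|x-manoit-*) echo "include" ;;
    # default_action: anything not listed is never collected.
    *) echo "exclude" ;;
  esac
}

# Classify a few sample headers.
for h in X-Tenant-Id Authorization X-ManoIT-Plan User-Agent; do
  printf '%s -> %s\n' "$h" "$(header_action "$h")"
done
```

&lt;p&gt;Running a list of the headers your services actually send through a check like this before enabling enrichment is a cheap way to confirm that nothing sensitive falls into the &lt;code&gt;include&lt;/code&gt; bucket.&lt;/p&gt;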

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Incident scenario&lt;/th&gt;
&lt;th&gt;With raw traces only&lt;/th&gt;
&lt;th&gt;With header enrichment&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Latency spike for one tenant&lt;/td&gt;
&lt;td&gt;"Payment API latency up" → page the entire support team&lt;/td&gt;
&lt;td&gt;Filter by &lt;code&gt;x-tenant-id&lt;/code&gt; → only two enterprise customers affected → notify their CSM directly&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;B2B free vs. paid segmentation&lt;/td&gt;
&lt;td&gt;Only sees "error rate 1.2%"&lt;/td&gt;
&lt;td&gt;Filter &lt;code&gt;x-user-segment=paid&lt;/code&gt; → track SLA cohort separately&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Regional issue&lt;/td&gt;
&lt;td&gt;Needs a separate ALB-log analysis&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;x-region&lt;/code&gt; header + OBI spans on the same Grafana panel&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Per-API-key rate limiting&lt;/td&gt;
&lt;td&gt;401/429 visible, which key is unclear&lt;/td&gt;
&lt;td&gt;Masked &lt;code&gt;x-api-key&lt;/code&gt; hash identifies the top offenders&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h2&gt;
  
  
  5. 2026 roadmap — four axes toward 1.0 GA
&lt;/h2&gt;

&lt;p&gt;The OBI SIG published its official 2026 roadmap with &lt;strong&gt;four axes&lt;/strong&gt;, each with a named sponsor and GitHub milestones — making it possible to back-calculate adoption timelines.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Goal&lt;/th&gt;
&lt;th&gt;Sponsor&lt;/th&gt;
&lt;th&gt;Key deliverables&lt;/th&gt;
&lt;th&gt;Production impact&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;① 1.0 Stable Release&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;@MrAlias&lt;/td&gt;
&lt;td&gt;JSON Schema validation, declarative config standard, telemetry schema, versioning policy, test coverage targets&lt;/td&gt;
&lt;td&gt;End of v0 breaking changes, LTS track becomes possible&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;② Protocol expansion&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;@marctc, @NimrodAvni78&lt;/td&gt;
&lt;td&gt;MQTT, AMQP, NATS, Redis Pub/Sub, MongoDB enhancements, GCP/AWS/Azure SDKs, full gRPC context propagation&lt;/td&gt;
&lt;td&gt;Coverage of message brokers and cloud SDKs&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;③ .NET support&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;@rafaelroquetto&lt;/td&gt;
&lt;td&gt;.NET 8+, .NET Framework 4.x, 3.5 SP1 validation, distributed tracing + RED metrics verification&lt;/td&gt;
&lt;td&gt;Enterprise Windows workloads covered&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;④ Hybrid instrumentation&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;@grcevski&lt;/td&gt;
&lt;td&gt;Consistent labels with SDK traces, metric exemplars, multi-language composition&lt;/td&gt;
&lt;td&gt;Organizations with existing SDK instrumentation can add OBI on top&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The 1.0 GA checklist itself boils down to &lt;strong&gt;configuration documentation complete, JSON Schema validation defined, per-service + per-process configuration supported, telemetry schema adopted, versioning policy formalized, test coverage targets hit&lt;/strong&gt;. v0.8.0 covers roughly 60% of that checklist today; the community is targeting late 2026 for 1.0 GA.&lt;/p&gt;

&lt;h2&gt;
  
  
  6. OBI vs. Beyla / Pixie / Tetragon / Cilium — drawing the lines
&lt;/h2&gt;

&lt;p&gt;With the 2026 eBPF observability space this crowded, positioning OBI correctly matters. Here's how it relates to the five most common adjacent projects.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Tool&lt;/th&gt;
&lt;th&gt;Primary role&lt;/th&gt;
&lt;th&gt;Relationship to OBI&lt;/th&gt;
&lt;th&gt;Run side-by-side?&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Grafana Beyla&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;HTTP/gRPC auto-tracing&lt;/td&gt;
&lt;td&gt;
&lt;strong&gt;OBI's direct ancestor&lt;/strong&gt; — Grafana Cloud users can stay on Beyla, OBI is the upstream successor&lt;/td&gt;
&lt;td&gt;New projects → OBI&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;
&lt;strong&gt;Pixie&lt;/strong&gt; (New Relic)&lt;/td&gt;
&lt;td&gt;Auto APM + scripting (PxL)&lt;/td&gt;
&lt;td&gt;Pixie ties to New Relic's backend, OBI is vendor-neutral via OTLP&lt;/td&gt;
&lt;td&gt;Existing NR customers stay on Pixie; OTLP-first teams choose OBI&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Cilium Tetragon&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Process / file / network security + real-time enforcement (LSM)&lt;/td&gt;
&lt;td&gt;Different role — OBI observes applications, Tetragon detects and enforces security&lt;/td&gt;
&lt;td&gt;
&lt;strong&gt;Run both&lt;/strong&gt; (two DaemonSets, different jobs)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Cilium Hubble&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;L3/L4 network flow + service map&lt;/td&gt;
&lt;td&gt;OBI is L7 (HTTP/gRPC) payloads; Hubble is L3/L4 packets&lt;/td&gt;
&lt;td&gt;
&lt;strong&gt;Run both&lt;/strong&gt; — layered responsibility&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Grafana Alloy + Pyroscope&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Continuous profiling (CPU, memory)&lt;/td&gt;
&lt;td&gt;OBI traces + RED; Pyroscope function-level profiles&lt;/td&gt;
&lt;td&gt;Run together → trace-to-profile drilldown&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;In one sentence: &lt;strong&gt;OBI is the OTLP-standard implementation of L7 application observability.&lt;/strong&gt; Runtime security and enforcement belong to Tetragon, L3/L4 network flows to Hubble, profiling to Pyroscope, and OBI handles application traces, RED metrics, SQL, and GenAI. That &lt;strong&gt;role-separated architecture&lt;/strong&gt; has become the default CNCF observability stack in 2026.&lt;/p&gt;

&lt;h2&gt;
  
  
  7. Production adoption checklist (8 steps)
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Step&lt;/th&gt;
&lt;th&gt;Check&lt;/th&gt;
&lt;th&gt;Tool / command&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;① Kernel&lt;/td&gt;
&lt;td&gt;All nodes on &lt;strong&gt;5.8+&lt;/strong&gt; with BTF enabled&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;uname -r&lt;/code&gt;, &lt;code&gt;ls /sys/kernel/btf/vmlinux&lt;/code&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;② CNI collision&lt;/td&gt;
&lt;td&gt;Audit for Cilium / Calico eBPF mode collisions&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;cilium status&lt;/code&gt;, OBI &lt;code&gt;network.enabled: false&lt;/code&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;③ Privileges&lt;/td&gt;
&lt;td&gt;DaemonSet has &lt;code&gt;CAP_BPF + CAP_PERFMON + CAP_SYS_PTRACE&lt;/code&gt;
&lt;/td&gt;
&lt;td&gt;Verify against PodSecurityAdmission&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;④ OTLP pipeline&lt;/td&gt;
&lt;td&gt;OTel Collector receives, backend (Tempo/Jaeger/Splunk) wired&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;otelcol validate&lt;/code&gt;, inspect receive metrics&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;⑤ Service discovery&lt;/td&gt;
&lt;td&gt;Allowlist of namespaces / labels for instrumentation&lt;/td&gt;
&lt;td&gt;Enforce labels via Kyverno / OPA Gatekeeper&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;⑥ Header enrichment&lt;/td&gt;
&lt;td&gt;PII leakage prevented: obfuscate &lt;code&gt;authorization&lt;/code&gt;, &lt;code&gt;cookie&lt;/code&gt;, &lt;code&gt;x-api-key&lt;/code&gt;
&lt;/td&gt;
&lt;td&gt;Test &lt;code&gt;obfuscation_string&lt;/code&gt; + allowlist&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;⑦ Resource tuning&lt;/td&gt;
&lt;td&gt;On high-traffic nodes (10k RPS+) adjust CPU / memory limits&lt;/td&gt;
&lt;td&gt;Monitor Prometheus &lt;code&gt;obi_bpf_*&lt;/code&gt; metrics&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;⑧ v0 instability&lt;/td&gt;
&lt;td&gt;v0 minor releases may break — pin versions&lt;/td&gt;
&lt;td&gt;GitOps: pin &lt;code&gt;v0.8.0&lt;/code&gt;, never floating &lt;code&gt;main&lt;/code&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;
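
&lt;p&gt;Step ① is easy to script. The sketch below wraps the two commands from the table (&lt;code&gt;uname -r&lt;/code&gt; and the BTF file check) into a small preflight that can be run on a node or from a debug pod. The function names &lt;code&gt;kernel_at_least&lt;/code&gt;, &lt;code&gt;btf_available&lt;/code&gt;, and &lt;code&gt;preflight_report&lt;/code&gt; are hypothetical helpers, not part of OBI, and the version parsing assumes the usual &lt;code&gt;MAJOR.MINOR...&lt;/code&gt; release string.&lt;/p&gt;

```shell
#!/bin/sh
# Preflight sketch for checklist step 1: kernel 5.8+ and BTF present.
# All function names here are hypothetical helpers, not OBI tooling.

# kernel_at_least RELEASE MAJOR MINOR
# Returns 0 when RELEASE (e.g. "5.15.0-91-generic") is at or above MAJOR.MINOR.
kernel_at_least() {
  release=$1
  want_major=$2
  want_minor=$3
  major=${release%%.*}          # text before the first dot
  rest=${release#*.}            # text after the first dot
  minor=${rest%%[!0-9]*}        # leading digits of the remainder
  if [ "$major" -gt "$want_major" ]; then
    return 0
  fi
  if [ "$major" -eq "$want_major" ]; then
    if [ "$minor" -ge "$want_minor" ]; then
      return 0
    fi
  fi
  return 1
}

# btf_available [PATH]: returns 0 when the kernel exposes BTF type information.
btf_available() {
  [ -r "${1:-/sys/kernel/btf/vmlinux}" ]
}

# Print a two-line report for the machine this script runs on.
preflight_report() {
  if kernel_at_least "$(uname -r)" 5 8; then
    echo "kernel: ok ($(uname -r))"
  else
    echo "kernel: too old ($(uname -r)), need 5.8+"
  fi
  if btf_available; then
    echo "btf: ok"
  else
    echo "btf: missing /sys/kernel/btf/vmlinux"
  fi
}

preflight_report
```

&lt;p&gt;Keeping the version comparison in its own function makes it straightforward to feed in release strings collected from every node instead of only the local one.&lt;/p&gt;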

&lt;h2&gt;
  
  
  8. ManoIT production recommendations
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Three-DaemonSet architecture&lt;/strong&gt;: OBI (L7 traces) + Cilium Hubble (L3/L4) + Tetragon (security). Realistic per-node budget is about &lt;code&gt;cpu: 800m / mem: 1.5Gi&lt;/code&gt; combined.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Progressive namespace rollout&lt;/strong&gt;: attach the &lt;code&gt;obi.enabled=true&lt;/code&gt; label only where you need it and expand horizontally over 1–2 weeks of observation. Rolling out to every namespace on day one can bottleneck the OTel Collector.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;OTLP backend sampling&lt;/strong&gt;: OBI is designed for &lt;strong&gt;100% capture&lt;/strong&gt;, which can explode backend storage costs. Always set &lt;strong&gt;tail-based sampling&lt;/strong&gt; at the Collector (latency &amp;gt; 1s, all error traces, 1% of normal traces).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;GenAI payload policy&lt;/strong&gt;: the OpenAI / Anthropic instrumentation is powerful, but &lt;strong&gt;system prompts and user input can land in traces&lt;/strong&gt;. Make &lt;code&gt;payload_extraction.genai.redact: true&lt;/code&gt; the default, and switch to an allowlist-only model when you need the payloads.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Staging verification against v0&lt;/strong&gt;: v0 minors are allowed to break. Every OBI upgrade should re-validate the full trace field mapping in staging before rolling to prod.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Header enrichment governance&lt;/strong&gt;: every new header addition should go through &lt;strong&gt;security review&lt;/strong&gt;. Add "PII included? / obfuscation needed?" checkboxes to your Git PR template.&lt;/li&gt;
&lt;/ul&gt;
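
&lt;p&gt;The tail-based sampling recommendation above (latency over 1s, all error traces, 1% of normal traces) maps onto the &lt;code&gt;tail_sampling&lt;/code&gt; processor from the OpenTelemetry Collector contrib distribution. The sketch below prints such a fragment; &lt;code&gt;tail_sampling_fragment&lt;/code&gt; is a hypothetical helper name, and the &lt;code&gt;decision_wait&lt;/code&gt; value and policy names are assumptions to adjust for your traffic.&lt;/p&gt;

```shell
#!/bin/sh
# Sketch: print a tail_sampling fragment for the OpenTelemetry Collector (contrib build).
# tail_sampling_fragment is a hypothetical helper; decision_wait and policy names are assumed.

tail_sampling_fragment() {
  printf '%s\n' \
    'processors:' \
    '  tail_sampling:' \
    '    decision_wait: 10s        # assumed: how long to buffer spans before deciding' \
    '    policies:' \
    '      - name: slow-traces     # keep traces slower than 1s' \
    '        type: latency' \
    '        latency:' \
    '          threshold_ms: 1000' \
    '      - name: error-traces    # keep every trace that contains an error' \
    '        type: status_code' \
    '        status_code:' \
    '          status_codes: [ERROR]' \
    '      - name: baseline        # keep a 1% random baseline' \
    '        type: probabilistic' \
    '        probabilistic:' \
    '          sampling_percentage: 1'
}

# Print the fragment; redirect to a file to merge it into the Collector config.
tail_sampling_fragment
```

&lt;p&gt;A trace is kept when any one of the policies matches. The processor also has to be listed in the traces pipeline under &lt;code&gt;service&lt;/code&gt;, and all spans of a trace need to reach the same Collector instance for the decision to be made on the whole trace.&lt;/p&gt;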

&lt;h2&gt;
  
  
  9. Conclusion — observability moves to the kernel
&lt;/h2&gt;

&lt;p&gt;OBI's beta launch is a signal that observability's center of gravity is moving &lt;strong&gt;from SDKs to the kernel&lt;/strong&gt; and &lt;strong&gt;from vendor-specific agents to the OTLP standard&lt;/strong&gt;. Where OpenTelemetry in the early 2020s solved "vendor-locked instrumentation code," OBI in 2026 is solving "&lt;strong&gt;don't write that instrumentation code in the first place.&lt;/strong&gt;" Application teams focus on business logic; platform teams deliver consistent observability at the kernel layer. That separation of responsibilities is closer than ever to what SRE and DevSecOps communities have been chasing for a decade. ManoIT recommends evaluating OBI as the default observability layer for every Kubernetes 1.28+ environment, and we plan to adopt it into our customer standard stack once 1.0 GA lands in late 2026. Grafana Labs donating Beyla, @grcevski leading the SIG, and the Splunk observability team carrying it over the KubeCon finish line — that model of community collaboration is the reason OpenTelemetry is still, in 2026, the healthiest CNCF project in the ecosystem.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;This article was co-authored by the ManoIT engineering team together with Anthropic's Claude Opus 4.7, based on: &lt;a href="https://opentelemetry.io/docs/zero-code/obi/" rel="noopener noreferrer"&gt;OpenTelemetry eBPF Instrumentation official docs&lt;/a&gt;, &lt;a href="https://opentelemetry.io/blog/2026/obi-goals/" rel="noopener noreferrer"&gt;OBI 2026 Goals&lt;/a&gt;, &lt;a href="https://opentelemetry.io/blog/2026/obi-http-header-enrichment/" rel="noopener noreferrer"&gt;OBI HTTP Header Enrichment&lt;/a&gt;, &lt;a href="https://github.com/open-telemetry/opentelemetry-ebpf-instrumentation" rel="noopener noreferrer"&gt;GitHub opentelemetry-ebpf-instrumentation v0.8.0&lt;/a&gt;, the &lt;a href="https://cloudnativenow.com/kubecon-cloudnativecon-europe-2026/splunk-introduces-opentelemetry-ebpf-instrumentation-and-kubernetes-operator-at-kubecon-eu-2026/" rel="noopener noreferrer"&gt;Splunk KubeCon EU 2026 announcement&lt;/a&gt;, and &lt;a href="https://opentelemetry.io/blog/2025/obi-announcing-first-release/" rel="noopener noreferrer"&gt;the OBI First Release Announcement&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Originally published at &lt;a href="http://www.manoit.co.kr/forum/view/1460143" rel="noopener noreferrer"&gt;ManoIT Tech Blog&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>observability</category>
      <category>kubernetes</category>
      <category>devops</category>
      <category>opentelemetry</category>
    </item>
  </channel>
</rss>
