<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Matheus</title>
    <description>The latest articles on DEV Community by Matheus (@matheus_releaserun).</description>
    <link>https://dev.to/matheus_releaserun</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3758534%2Ffda69e43-38b0-48a9-8f55-71a14b1c7f3b.png</url>
      <title>DEV Community: Matheus</title>
      <link>https://dev.to/matheus_releaserun</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/matheus_releaserun"/>
    <language>en</language>
    <item>
      <title>Rust 1.94.0: array_windows, Cargo Config Includes, and 10 Breaking Changes You Should Know About</title>
      <dc:creator>Matheus</dc:creator>
      <pubDate>Fri, 06 Mar 2026 19:05:53 +0000</pubDate>
      <link>https://dev.to/matheus_releaserun/rust-1940-arraywindows-cargo-config-includes-and-10-breaking-changes-you-should-know-about-5gc7</link>
      <guid>https://dev.to/matheus_releaserun/rust-1940-arraywindows-cargo-config-includes-and-10-breaking-changes-you-should-know-about-5gc7</guid>
      <description>&lt;p&gt;Rust 1.94.0 landed on March 5, 2026. Three headline features and a surprisingly long compatibility notes section.&lt;/p&gt;

&lt;p&gt;Here's what actually matters if you're shipping Rust in production.&lt;/p&gt;

&lt;h2&gt;The Headlines&lt;/h2&gt;

&lt;h3&gt;array_windows Finally Stabilized&lt;/h3&gt;

&lt;p&gt;This one's been cooking since 2020. &lt;code&gt;array_windows&lt;/code&gt; gives you sliding-window iteration over slices with a compile-time-known window size: each window is a fixed-size array reference instead of a runtime-sized slice.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight rust"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Old way: runtime-sized windows, manual indexing&lt;/span&gt;
&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="nf"&gt;.windows&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="nf"&gt;.any&lt;/span&gt;&lt;span class="p"&gt;(|&lt;/span&gt;&lt;span class="n"&gt;w&lt;/span&gt;&lt;span class="p"&gt;|&lt;/span&gt; &lt;span class="n"&gt;w&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="n"&gt;w&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="n"&gt;w&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;w&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="n"&gt;w&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;w&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;

&lt;span class="c1"&gt;// New way: destructure directly, compiler knows the size&lt;/span&gt;
&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="nf"&gt;.as_bytes&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="nf"&gt;.array_windows&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="nf"&gt;.any&lt;/span&gt;&lt;span class="p"&gt;(|[&lt;/span&gt;&lt;span class="n"&gt;a1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;b1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;b2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;a2&lt;/span&gt;&lt;span class="p"&gt;]|&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;a1&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="n"&gt;b1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;a1&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;a2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;b1&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;b2&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The real win here isn't just ergonomics. The compiler can eliminate bounds checks entirely because it knows the window size at compile time. If you're doing any kind of signal processing, pattern matching, or rolling calculations over slices, this is a free performance upgrade.&lt;/p&gt;

&lt;p&gt;The window size is inferred from usage too. That destructuring pattern &lt;code&gt;|[a1, b1, b2, a2]|&lt;/code&gt; tells the compiler you want windows of 4. No need to specify it explicitly.&lt;/p&gt;

&lt;h3&gt;Cargo Config Includes&lt;/h3&gt;

&lt;p&gt;You can now split your &lt;code&gt;.cargo/config.toml&lt;/code&gt; across multiple files:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight toml"&gt;&lt;code&gt;&lt;span class="py"&gt;include&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="err"&gt;{&lt;/span&gt; &lt;span class="py"&gt;path&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"ci.toml"&lt;/span&gt; &lt;span class="err"&gt;}&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="err"&gt;{&lt;/span&gt; &lt;span class="py"&gt;path&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"local-overrides.toml"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="py"&gt;optional&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt; &lt;span class="err"&gt;}&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This is genuinely useful for teams. You can keep CI-specific settings, local developer overrides, and shared config separate without fighting merge conflicts in one massive config file. The &lt;code&gt;optional = true&lt;/code&gt; flag means you can have developer-specific files that don't need to exist for everyone.&lt;/p&gt;

&lt;p&gt;Monorepo teams will probably get the most out of this. Think shared build profiles, registry mirrors, or target-specific settings that only some developers need.&lt;/p&gt;
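&lt;p&gt;As a sketch of that monorepo setup (all file names below are hypothetical):&lt;/p&gt;

```toml
# .cargo/config.toml at the workspace root
include = [
    { path = "profiles.toml" },             # shared build profiles
    { path = "mirror.toml" },               # registry mirror used in CI
    { path = "dev.toml", optional = true }, # per-developer overrides, may be absent
]
```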

&lt;p&gt;Also worth noting: Cargo now records a &lt;code&gt;pubtime&lt;/code&gt; field in the registry index, tracking when each crate version was published. This lays groundwork for time-based dependency resolution in the future. crates.io is gradually backfilling existing packages.&lt;/p&gt;

&lt;h3&gt;TOML 1.1 in Cargo&lt;/h3&gt;

&lt;p&gt;Cargo now parses TOML v1.1, which means you can finally write multi-line inline tables:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight toml"&gt;&lt;code&gt;&lt;span class="c"&gt;# Before: everything crammed on one line&lt;/span&gt;
&lt;span class="py"&gt;serde&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="py"&gt;version&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"1.0"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="py"&gt;features&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s"&gt;"derive"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"rc"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"alloc"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="c"&gt;# After: readable and trailing commas allowed&lt;/span&gt;
&lt;span class="py"&gt;serde&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="py"&gt;version&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"1.0"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="py"&gt;features&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
        &lt;span class="s"&gt;"derive"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="s"&gt;"rc"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="s"&gt;"alloc"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;],&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;One catch: if you use TOML 1.1 syntax in your &lt;code&gt;Cargo.toml&lt;/code&gt;, your development MSRV effectively becomes Rust 1.94. Cargo rewrites the manifest on &lt;code&gt;publish&lt;/code&gt; to stay compatible with older parsers, so your users won't be affected. But anyone building your crate from source with an older toolchain will hit parse errors.&lt;/p&gt;

&lt;p&gt;If you're maintaining a library with a strict MSRV policy, hold off on the new syntax for now.&lt;/p&gt;

&lt;h2&gt;Breaking Changes: The Real Release Notes&lt;/h2&gt;

&lt;p&gt;This is where 1.94 gets interesting. Ten compatibility notes, and some of them will bite you.&lt;/p&gt;

&lt;h3&gt;Closure Capturing Behavior Changed&lt;/h3&gt;

&lt;p&gt;The biggest one. The rules for how closures capture variables in pattern-matching contexts have been tightened. Previously, a non-&lt;code&gt;move&lt;/code&gt; closure could capture an entire variable by move in some of those contexts. Now it captures only the parts it actually uses.&lt;/p&gt;

&lt;p&gt;Sounds good in theory, but it can cause new borrow checker errors where code previously compiled fine. It can also change when &lt;code&gt;Drop&lt;/code&gt; runs for partially captured values.&lt;/p&gt;

&lt;p&gt;If you have closures near &lt;code&gt;match&lt;/code&gt; or &lt;code&gt;if let&lt;/code&gt; expressions that suddenly stop compiling after upgrading, this is likely why.&lt;/p&gt;

&lt;h3&gt;Standard Library Macros Import Change&lt;/h3&gt;

&lt;p&gt;Standard library macros (&lt;code&gt;println!&lt;/code&gt;, &lt;code&gt;vec!&lt;/code&gt;, &lt;code&gt;matches!&lt;/code&gt;, etc.) are now imported via the prelude instead of &lt;code&gt;#[macro_use]&lt;/code&gt;. This sounds like an internal change, but it has a visible effect: if you have a custom macro with the same name as a standard library macro and you glob-import it, you'll get an ambiguity error.&lt;/p&gt;

&lt;p&gt;The most common case: if you defined your own &lt;code&gt;matches!&lt;/code&gt; macro and glob-imported it. You'll need an explicit import to resolve which one you mean.&lt;/p&gt;

&lt;p&gt;For &lt;code&gt;#![no_std]&lt;/code&gt; code that glob-imports from &lt;code&gt;std&lt;/code&gt;, you might see a new &lt;code&gt;ambiguous_panic_imports&lt;/code&gt; warning because both &lt;code&gt;core::panic!&lt;/code&gt; and &lt;code&gt;std::panic!&lt;/code&gt; are now in scope.&lt;/p&gt;

&lt;h3&gt;dyn Trait Lifetime Casting Restricted&lt;/h3&gt;

&lt;p&gt;&lt;code&gt;dyn&lt;/code&gt; trait objects can no longer freely cast between different lifetime bounds. Shrinking a lifetime is still fine; a cast that would stretch one, like turning &lt;code&gt;dyn Foo + 'a&lt;/code&gt; into a longer-lived &lt;code&gt;dyn Foo + 'b&lt;/code&gt;, is now correctly rejected.&lt;/p&gt;

&lt;h3&gt;Shebang Lines in include!()&lt;/h3&gt;

&lt;p&gt;&lt;code&gt;include!()&lt;/code&gt; in expression context no longer strips shebang lines (&lt;code&gt;#!/...&lt;/code&gt;). If you were including files that start with a shebang, they'll now fail to compile. The fix is to remove the shebang from included files.&lt;/p&gt;

&lt;h3&gt;Other Compat Notes&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Ambiguous glob reexports&lt;/strong&gt; are now visible cross-crate (may introduce new ambiguity errors in downstream crates)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Where-clause normalization&lt;/strong&gt; changed in well-formedness checks&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Codegen attributes on body-free trait methods&lt;/strong&gt; now produce a future compatibility warning (they had no effect anyway)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Windows SystemTime&lt;/strong&gt; changes: &lt;code&gt;checked_sub_duration&lt;/code&gt; returns &lt;code&gt;None&lt;/code&gt; for times before the Windows epoch (Jan 1, 1601)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Lifetime identifiers are now NFC normalized&lt;/strong&gt; (e.g. &lt;code&gt;'á&lt;/code&gt; written with combining characters vs precomposed). Edge case, but if you're generating Rust code programmatically, double check.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Compiler filename handling overhauled&lt;/strong&gt; for cross-compiler consistency. Paths in diagnostics for local crates in Cargo workspaces are now relative instead of absolute. This can break CI scripts that grep compiler output for absolute paths.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;Stabilized APIs Worth Knowing&lt;/h2&gt;

&lt;p&gt;Beyond &lt;code&gt;array_windows&lt;/code&gt;, a few other stabilizations stand out:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;code&gt;LazyCell::get&lt;/code&gt; and &lt;code&gt;LazyLock::get&lt;/code&gt;&lt;/strong&gt;: Check whether a lazy value has been initialized without forcing it. Useful for conditional logic around cached values.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;code&gt;Peekable::next_if_map&lt;/code&gt;&lt;/strong&gt;: Conditionally advance a peekable iterator and transform the value in one step. Cleaner than peek + next + map separately.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;code&gt;element_offset&lt;/code&gt;&lt;/strong&gt;: Get the index of an element in a slice from a reference to it. Handy when you have a reference into a slice and need to know where it is.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;code&gt;f32/f64::consts::EULER_GAMMA&lt;/code&gt; and &lt;code&gt;GOLDEN_RATIO&lt;/code&gt;&lt;/strong&gt;: Mathematical constants added to the standard library. Minor, but saves you from defining them yourself.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;code&gt;f32/f64::mul_add&lt;/code&gt; now const&lt;/strong&gt;: Fused multiply-add in const contexts. Useful for compile-time math.&lt;/p&gt;

&lt;h2&gt;Platform and Compiler Notes&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;New tier 3 target: &lt;code&gt;riscv64im-unknown-none-elf&lt;/code&gt; (RISC-V without atomics)&lt;/li&gt;
&lt;li&gt;29 additional RISC-V target features stabilized, covering large parts of RVA22U64 and RVA23U64 profiles&lt;/li&gt;
&lt;li&gt;Unicode 17 support&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;BinaryHeap&lt;/code&gt; methods relaxed: some no longer require &lt;code&gt;T: Ord&lt;/code&gt; (for methods that don't need ordering)&lt;/li&gt;
&lt;li&gt;Error messages now use &lt;code&gt;annotate-snippets&lt;/code&gt; internally, so diagnostic output may look slightly different&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;Upgrade Recommendation&lt;/h2&gt;

&lt;p&gt;Rust 1.94 is a solid release. &lt;code&gt;array_windows&lt;/code&gt; alone makes it worth upgrading if you do any slice processing. The Cargo improvements are pure quality of life.&lt;/p&gt;

&lt;p&gt;The main risk is the closure capturing change. If you have a large codebase, run &lt;code&gt;cargo check&lt;/code&gt; before deploying and watch for new borrow checker errors around closures. The macro import change is lower risk, but check whether any of your custom macros shadow stdlib names.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;rustup update stable
cargo check  &lt;span class="c"&gt;# Run this before committing to the upgrade&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For teams on MSRV policies: 1.94 is safe to adopt as your new MSRV if you want the Cargo improvements. If you're maintaining a library, consider waiting one release cycle (until 1.95) to let the closure capturing changes settle and for downstream users to upgrade.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;ReleaseRun Health Grade: A&lt;/strong&gt; (actively maintained, 6-week release cadence, no EOL concerns)&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Track Rust and 300+ other technologies at &lt;a href="https://releaserun.com" rel="noopener noreferrer"&gt;releaserun.com&lt;/a&gt;. Get version health grades, EOL alerts, and upgrade recommendations for your entire stack.&lt;/em&gt;&lt;/p&gt;




&lt;h2&gt;Keep Reading&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://releaserun.com/rust-releases/" rel="noopener noreferrer"&gt;Rust Release History&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://releaserun.com/how-to-add-version-health-badges-to-your-project/" rel="noopener noreferrer"&gt;How to Add Version Health Badges&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;




&lt;p&gt;&lt;strong&gt;🔍 Related tool:&lt;/strong&gt; &lt;a href="https://releaserun.com/tools/cargo-health/" rel="noopener noreferrer"&gt;Cargo Dependency Health Checker&lt;/a&gt; — paste your &lt;code&gt;Cargo.toml&lt;/code&gt; and check every crate for deprecation and latest versions. Free.&lt;/p&gt;

</description>
      <category>rust</category>
      <category>programming</category>
      <category>devops</category>
      <category>webdev</category>
    </item>
    <item>
      <title>Rust 1.93.0 release notes: SIMD, varargs, and the stuff that breaks builds</title>
      <dc:creator>Matheus</dc:creator>
      <pubDate>Sat, 21 Feb 2026 17:25:19 +0000</pubDate>
      <link>https://dev.to/matheus_releaserun/rust-1930-release-notes-simd-varargs-and-the-stuff-that-breaks-builds-d0j</link>
      <guid>https://dev.to/matheus_releaserun/rust-1930-release-notes-simd-varargs-and-the-stuff-that-breaks-builds-d0j</guid>
      <description>&lt;p&gt;I’ve watched “minor” Rust upgrades stall a release train for one dumb reason. Emscripten flags.&lt;/p&gt;

&lt;p&gt;Rust 1.93.0 lands with real wins for low-level work (SIMD on s390x, C-style variadic functions), plus a few changes that can trip CI in under 60 seconds if you ship WebAssembly or rely on sloppy tests.&lt;/p&gt;

&lt;h2&gt;The 30-second upgrade call&lt;/h2&gt;

&lt;p&gt;Upgrade if you hit FFI edges, ship on IBM Z, or you want stricter diagnostics before prod. Wait a week if your WebAssembly pipeline depends on Emscripten and you cannot spare an afternoon to chase linker flags.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;High risk:&lt;/strong&gt; Emscripten unwinding ABI change for panic=unwind. Your build can fail at link time.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Medium risk:&lt;/strong&gt; Stricter #[test] validation. Rust stops ignoring invalid placements and starts erroring.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Low risk:&lt;/strong&gt; New lints and Cargo quality-of-life changes. You will mostly see warnings.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;What actually changed (the parts you will notice)&lt;/h2&gt;

&lt;p&gt;This bit me when a “harmless” std behavior change hid in a patch note. &lt;strong&gt;BTreeMap::append&lt;/strong&gt; no longer overwrites existing keys: when the incoming map contains a key you already have, your original value now wins.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;New lints:&lt;/strong&gt; Rust now warns by default on &lt;strong&gt;const_item_interior_mutations&lt;/strong&gt; and &lt;strong&gt;function_casts_as_integer&lt;/strong&gt;. Expect fresh warnings in older codebases with clever const tricks or pointer-ish casts.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Cargo clean:&lt;/strong&gt; &lt;strong&gt;cargo clean --workspace&lt;/strong&gt; now cleans every package in a workspace, not just the current one.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Built-in attributes:&lt;/strong&gt; Rust adds &lt;strong&gt;pin_v2&lt;/strong&gt; to the built-in attribute namespace.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Future incompat warnings:&lt;/strong&gt; Rust now warns about &lt;strong&gt;...&lt;/strong&gt; parameters without a pattern (outside extern blocks), repr(C) enums with discriminants outside c_int/c_uint, and repr(transparent) that “forgets” an inner repr(C) type.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;musl bump:&lt;/strong&gt; The bundled musl version moves to &lt;strong&gt;1.2.5&lt;/strong&gt;. This usually feels boring until your static builds stop matching yesterday’s container image.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;SIMD on s390x: useful, but only if you live there&lt;/h2&gt;

&lt;p&gt;Most teams will not rewrite hot loops this quarter. Good.&lt;/p&gt;

&lt;p&gt;If you run on IBM Z, stabilized s390x vector target features matter because they let you ship one binary that checks CPU features at runtime, then takes the fast path. The macro to look for is &lt;strong&gt;is_s390x_feature_detected!&lt;/strong&gt;. The thing nobody mentions is the boring part: you still need a scalar fallback unless you control every machine you deploy to.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Where it pays off:&lt;/strong&gt; tight numeric code, compression, crypto-ish primitives, and batch processing where you touch big buffers.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Where it wastes time:&lt;/strong&gt; request routing, JSON glue, anything dominated by syscalls or allocations.&lt;/li&gt;
&lt;/ul&gt;

&lt;blockquote&gt;
&lt;p&gt;If you cannot test your CNI in staging, you should not be running Kubernetes. Same energy here. If you cannot test on the actual CPU, do not pretend your SIMD change helped.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;C-style variadic functions: great for FFI, still a foot-gun&lt;/h2&gt;

&lt;p&gt;Variadics make FFI wrappers less awkward. They do not make them safe.&lt;/p&gt;

&lt;p&gt;Rust 1.93.0 stabilizes declaring C-style variadic functions for the &lt;strong&gt;system&lt;/strong&gt; ABI, which helps when you need to bind to APIs like printf-style functions. In most cases, you should wrap the variadic call in a tiny unsafe boundary and expose a non-variadic Rust API to the rest of your crate. Some folks skip that and export varargs directly. I do not.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Good pattern:&lt;/strong&gt; keep the extern varargs signature private, then build typed wrappers around the handful of formats you actually need.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Bad pattern:&lt;/strong&gt; re-export varargs in your public Rust API and hope callers pass the right types on every platform.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;Breaking changes that deserve a real test run&lt;/h2&gt;

&lt;p&gt;Here’s the one that will ruin your afternoon. Emscripten.&lt;/p&gt;

&lt;p&gt;Rust changed the Emscripten unwinding ABI from JS exception handling to wasm exception handling when you compile with &lt;strong&gt;panic=unwind&lt;/strong&gt;. If you link C or C++ objects, you now need to pass &lt;strong&gt;-fwasm-exceptions&lt;/strong&gt; to the linker. If you build wasm once a month, you will forget this and rediscover it the hard way.&lt;/p&gt;
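&lt;p&gt;One way to make the flag hard to forget is to pin it in config. A sketch (adjust the target section to the Emscripten target you actually build):&lt;/p&gt;

```toml
# .cargo/config.toml
[target.wasm32-unknown-emscripten]
# Needed with panic=unwind on Rust 1.93+ when linking C or C++ objects.
rustflags = ["-C", "link-arg=-fwasm-exceptions"]
```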

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;#[test] validation:&lt;/strong&gt; Rust now errors when you slap &lt;strong&gt;#[test]&lt;/strong&gt; on structs, trait methods, or other invalid spots. Older code that “worked” only worked because Rust ignored it.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;deref_nullptr lint:&lt;/strong&gt; Rust upgrades &lt;strong&gt;deref_nullptr&lt;/strong&gt; to deny-by-default. Builds can fail where they used to warn.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;offset_of! macro:&lt;/strong&gt; offset_of! now validates user-written types for well-formedness. Code that relied on sketchy layouts might stop compiling.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;cargo publish output:&lt;/strong&gt; cargo publish no longer leaves .crate files as a final artifact when build.build-dir is unset.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;Known issues and the “annoying but real” corner cases&lt;/h2&gt;

&lt;p&gt;I’ve seen this show up as a clean compile on one machine and a dead build in CI. Environment variables.&lt;/p&gt;

&lt;p&gt;A Cargo environment variable change around &lt;strong&gt;CARGO_CFG_DEBUG_ASSERTIONS&lt;/strong&gt; can break projects that depend on &lt;strong&gt;static-init&lt;/strong&gt; versions 1.0.1 to 1.0.3, typically with an unresolved module-style error. If your dependency tree includes static-init, test this upgrade before you merge a toolchain bump across all repos.&lt;/p&gt;

&lt;p&gt;Other stuff in this release: dependency bumps, some image updates, the usual.&lt;/p&gt;

&lt;h2&gt;How I’d roll this out (without drama)&lt;/h2&gt;

&lt;p&gt;Pin the toolchain first. Seriously.&lt;/p&gt;

&lt;p&gt;Update locally, then run the exact commands your CI runs, not the “happy path” build you do on your laptop. For prod systems, test this twice. For a dev sandbox, sure, yolo it on Friday.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Update:&lt;/strong&gt; rustup update stable&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Confirm:&lt;/strong&gt; rustc --version and check for 1.93.0&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Build:&lt;/strong&gt; cargo build --release&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Test:&lt;/strong&gt; cargo test and watch for new #[test] errors&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Scan warnings:&lt;/strong&gt; pay attention to const_item_interior_mutations and function_casts_as_integer&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;CI:&lt;/strong&gt; align every runner image and container to the same toolchain&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;Official notes&lt;/h2&gt;

&lt;p&gt;Read the upstream release notes on GitHub if you ship to weird targets or you maintain a library. That page holds the exact wording and linked PRs.&lt;/p&gt;

&lt;p&gt;Anyway.&lt;/p&gt;

&lt;h2&gt;Keep Reading&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://dev.to/rust-releases/"&gt;Rust Release History&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://dev.to/rust-1-93-function-casts-as-integer-lint/"&gt;function_casts_as_integer lint in Rust 1.93.0: How to Use&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;Frequently Asked Questions&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;What are the biggest changes in Rust 1.93.0?&lt;/strong&gt; Three headline features: (1) SIMD intrinsics for s390x architecture (niche but important for IBM mainframe teams), (2) C-style variadic functions using extern "C" fn with ... syntax for better FFI interop, and (3) several lint changes that may cause existing code to fail compilation. The lint changes are what most teams will actually notice during upgrades.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Will Rust 1.93.0 break my existing code?&lt;/strong&gt; Possibly. The new function_casts_as_integer lint warns by default on code that casts function pointers to integers; check for patterns like fn_ptr as usize in your codebase. The changes more likely to fail a build outright: deref_nullptr is now deny-by-default, and #[test] on invalid items (structs, trait methods) is now a hard error. Run cargo build with the new version in CI before committing to the upgrade.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;How do I safely upgrade Rust in CI?&lt;/strong&gt; Pin your Rust version in rust-toolchain.toml (e.g., channel = "1.93.0"). Before upgrading: (1) Run cargo clippy with the new version to catch new warnings, (2) Run your full test suite, (3) Check for deprecation warnings that became errors. If anything breaks, you can temporarily allow specific lints with #[allow(lint_name)] while you fix the underlying code. Never upgrade Rust and merge in the same PR.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;What are C-style variadic functions in Rust 1.93.0 used for?&lt;/strong&gt; They let you write Rust functions that accept a variable number of arguments, matching C's printf(const char*, ...) pattern. This is primarily useful for FFI (Foreign Function Interface) - if you're writing a Rust library that needs to expose a C-compatible API, you can now define variadic functions directly instead of using workarounds. For pure Rust code, macros remain the idiomatic way to handle variable arguments.&lt;/li&gt;
&lt;/ul&gt;
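&lt;p&gt;The pinning mentioned above is one small file at the repo root (the components list is optional and just an example):&lt;/p&gt;

```toml
# rust-toolchain.toml — rustup picks this up for every build in the repo
[toolchain]
channel = "1.93.0"
components = ["clippy", "rustfmt"]
```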

&lt;h2&gt;Related Reading&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://releaserun.com/rust-1-93-function-casts-as-integer-lint/" rel="noopener noreferrer"&gt;function_casts_as_integer Lint in Rust 1.93&lt;/a&gt; - How to fix the new lint&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://releaserun.com/upgrade-rust-safely-rustup-toolchains-ci-pinning/" rel="noopener noreferrer"&gt;How to Upgrade Rust Safely&lt;/a&gt; - Without breaking CI&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://releaserun.com/how-to-add-version-health-badges-to-your-project/" rel="noopener noreferrer"&gt;How to Add Version Health Badges&lt;/a&gt; - Track release health in your README&lt;/li&gt;
&lt;/ul&gt;




&lt;p&gt;&lt;strong&gt;🔍 Related tool:&lt;/strong&gt; &lt;a href="https://releaserun.com/tools/cargo-health/" rel="noopener noreferrer"&gt;Cargo Dependency Health Checker&lt;/a&gt; — paste your &lt;code&gt;Cargo.toml&lt;/code&gt; and check every crate for deprecation and latest versions. Free.&lt;/p&gt;

</description>
      <category>rust</category>
      <category>programming</category>
      <category>devops</category>
      <category>security</category>
    </item>
    <item>
      <title>Node.js 25.6.0 Release Notes: What Breaks, What Changed, What I’d Test</title>
      <dc:creator>Matheus</dc:creator>
      <pubDate>Sat, 21 Feb 2026 17:24:42 +0000</pubDate>
      <link>https://dev.to/matheus_releaserun/nodejs-2560-release-notes-what-breaks-what-changed-what-id-test-3bg4</link>
      <guid>https://dev.to/matheus_releaserun/nodejs-2560-release-notes-what-breaks-what-changed-what-id-test-3bg4</guid>
      <description>&lt;p&gt;Another “maintenance” release. What broke this time, and why does it touch async tracking, networking headers, URL parsing, and OpenSSL?&lt;/p&gt;

&lt;p&gt;I’ve watched teams ship patch and minor updates on Friday, then spend Saturday bisecting TLS handshakes and weird latency spikes. Node.js 25.6.0 looks useful. It also pokes several sharp edges at once, so I would not treat it as a free win.&lt;/p&gt;

&lt;h2&gt;Concerns first: the stuff the changelog won’t warn you about&lt;/h2&gt;

&lt;p&gt;This bit me before.&lt;/p&gt;

&lt;p&gt;The release notes say a lot about new knobs, and almost nothing about the boring failure modes. If you run Node in production, those failure modes matter more than the feature bullets.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;“No known issues” does not mean “safe”:&lt;/strong&gt; The official notes list no known issues, but that just means nobody wrote them down there. I do not trust “known issues: none” from any project, especially right after a release drops.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Promise tracking can turn into a tax:&lt;/strong&gt; &lt;code&gt;async_hooks&lt;/code&gt; instrumentation often costs CPU and memory in promise-heavy code. Node adds a &lt;code&gt;trackPromises&lt;/code&gt; option, which is great, but the notes do not give overhead numbers. You need to measure your own workload.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;TOS socket controls vary by OS:&lt;/strong&gt; Setting Type of Service sounds simple until you hit platform differences, privilege constraints, and “best effort” behavior. The release notes do not give you a support matrix in one place. Assume surprises unless you test on your exact fleet.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;OpenSSL bumps change real behavior:&lt;/strong&gt; Even when nobody calls it a breaking change, TLS stacks change. Cipher support, defaults, and edge-case handshakes can shift. If you talk to legacy endpoints, run handshake tests before you celebrate.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;URL parser updates can change parsing outcomes:&lt;/strong&gt; Updating the Ada URL parser to a new version can change how weird inputs normalize. If you sign URLs, compare canonical forms, or parse user-supplied URLs, you should run a corpus test.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  So what actually changed in Node.js 25.6.0?
&lt;/h2&gt;

&lt;p&gt;Here’s the clean list, with the parts I’d pay attention to.&lt;/p&gt;

&lt;p&gt;Node.js shipped v25.6.0 on Feb 3, 2026. The headline items include promise lifecycle tracking in &lt;code&gt;async_hooks&lt;/code&gt;, a new Type of Service API on sockets, initial ESM support for embedders, and a handful of runtime and dependency updates.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;async_hooks: Promise lifecycle tracking:&lt;/strong&gt; Node adds a &lt;code&gt;trackPromises&lt;/code&gt; option to &lt;code&gt;async_hooks.createHook()&lt;/code&gt; so you can observe promise creation and settlement. They claim it helps reduce overhead when you do not need promise execution tracking, but you still need to validate overhead when you enable it.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;net: Type of Service on sockets:&lt;/strong&gt; Node adds socket TOS controls via &lt;code&gt;socket.setTypeOfService(tos)&lt;/code&gt; and &lt;code&gt;socket.getTypeOfService()&lt;/code&gt;. The part to remember: this depends on the OS and network stack. Test it. Do not assume it changes packet handling in your environment.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Embedder API: initial ESM entry points:&lt;/strong&gt; Embedders get initial support for loading ESM. “Initial” usually means “expect sharp corners,” so if you ship a custom embedder, plan time to read the PR and try it against your module loader setup.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;stream consumers: bytes():&lt;/strong&gt; &lt;code&gt;node:stream/consumers&lt;/code&gt; gains &lt;code&gt;bytes()&lt;/code&gt; to collect stream data into a &lt;code&gt;Uint8Array&lt;/code&gt;. If your code expects a &lt;code&gt;Buffer&lt;/code&gt;, check your call sites.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;test_runner: env option:&lt;/strong&gt; &lt;code&gt;test_runner.run()&lt;/code&gt; gets an &lt;code&gt;env&lt;/code&gt; option for isolated test environment variables. This can fix “why did CI pass but local failed” problems, unless your tests secretly depend on inherited env state.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Performance: TextEncoder:&lt;/strong&gt; Node improves &lt;code&gt;TextEncoder.encode&lt;/code&gt; using simdutf. Great, but measure on your CPU and container base image. SIMD paths can behave differently across architectures.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;url: Ada parser update:&lt;/strong&gt; Node updates Ada to 3.4.2 with Unicode 17 support. If you do URL-heavy work, run regression tests against real inputs, not just “happy path” URLs.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Dependencies:&lt;/strong&gt; undici 7.19.2, corepack 0.34.6, nghttp3 1.15.0, ngtcp2 1.20.0, and OpenSSL 3.5.5. Dependency bumps cause most of the “nothing changed” outages I’ve seen, because they change behavior outside your diff.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  What I’d test in staging before I let this near production
&lt;/h2&gt;

&lt;p&gt;Test this twice.&lt;/p&gt;

&lt;p&gt;I’d wait a week for ecosystem noise, then I’d do a canary. Some folks skip canaries for minor releases. I do not, because I like sleeping.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;TLS sanity:&lt;/strong&gt; Run a handshake suite against every external dependency that still scares you. Old proxies, legacy APIs, that one vendor endpoint that only fails in one region.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;HTTP client behavior (undici):&lt;/strong&gt; Hit your highest-QPS routes and watch connection reuse, timeouts, and error rates. If you pin undici behavior indirectly through frameworks, this matters.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Async hot paths:&lt;/strong&gt; If you plan to use &lt;code&gt;trackPromises&lt;/code&gt;, load test with it on and off. Watch heap growth and p95 latency. If it adds 5 ms to a hot endpoint, you will feel it.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;URL parsing regression corpus:&lt;/strong&gt; Feed the new URL parser the ugliest URLs you see in logs. Compare normalized outputs if you do redirects, signing, allowlists, or cache keys.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Network TOS verification:&lt;/strong&gt; If you actually need TOS, capture packets and confirm the DSCP/TOS bits show up. “API exists” does not equal “network respects it.”&lt;/li&gt;
&lt;/ul&gt;
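&lt;p&gt;The URL corpus harness can stay tiny. Capture canonical forms on your current version, commit the output, and diff it after the upgrade. The inputs below are stand-ins; pull real ones from your logs:&lt;/p&gt;

```javascript
// Tiny URL normalization corpus. Save this output on your current Node
// version, then diff against 25.6.0's output after the Ada update.
const corpus = [
  'https://example.com/a/../b',        // dot-segment resolution
  'https://EXAMPLE.com:443/%7Euser',   // host case, default port, percent-encoding
  'https://example.com:8080/Path#Frag' // non-default port and case preserved
];

const canonical = corpus.map((raw) => new URL(raw).href);
console.log(JSON.stringify(canonical, null, 2));
```

&lt;p&gt;Any diff in that output is not automatically wrong, but if you sign URLs or use them as cache keys, it is exactly the kind of change that turns into a 2 a.m. page.&lt;/p&gt;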

&lt;blockquote&gt;
&lt;p&gt;Do not claim “no known issues.” Say “none listed in the official notes as of today,” then keep a rollback plan.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  Recommendation (grudgingly)
&lt;/h2&gt;

&lt;p&gt;I’d wait.&lt;/p&gt;

&lt;p&gt;If you need promise lifecycle visibility right now, or you have a clear use case for Type of Service tagging, try 25.6.0 in staging and canary it into one slice of traffic. If your app runs fine on 25.5.x and you are not chasing one of these features, give it 7 days, watch for issue reports, then roll it out with a quick rollback path. Other stuff in this release: dependency bumps, some parser changes, the usual.&lt;/p&gt;

&lt;h2&gt;
  
  
  Related Reading
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://releaserun.com/node-20-vs-22-vs-24-which-node-js-lts-should-you-run-in-production/" rel="noopener noreferrer"&gt;Node 20 vs 22 vs 24: Which LTS to Run&lt;/a&gt; - The version decision for production Node.js&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://releaserun.com/nodejs-20-end-of-life-migration-playbook/" rel="noopener noreferrer"&gt;Node.js 20 End of Life: Migration Playbook&lt;/a&gt; - EOL April 30, 2026&lt;/li&gt;
&lt;/ul&gt;




&lt;p&gt;&lt;strong&gt;🔍 Related tool:&lt;/strong&gt; &lt;a href="https://releaserun.com/tools/package-json-health/" rel="noopener noreferrer"&gt;npm Package Health Checker&lt;/a&gt; — paste your &lt;code&gt;package.json&lt;/code&gt; and check every dependency for deprecation and staleness. Free.&lt;/p&gt;

</description>
      <category>node</category>
      <category>javascript</category>
      <category>webdev</category>
      <category>programming</category>
    </item>
    <item>
      <title>VS Code 1.109.0 Release Notes: Claude Agents, Integrated Browser, and the Stuff People Actually Mention</title>
      <dc:creator>Matheus</dc:creator>
      <pubDate>Sat, 21 Feb 2026 17:24:07 +0000</pubDate>
      <link>https://dev.to/matheus_releaserun/vs-code-11090-release-notes-claude-agents-integrated-browser-and-the-stuff-people-actually-jgb</link>
      <guid>https://dev.to/matheus_releaserun/vs-code-11090-release-notes-claude-agents-integrated-browser-and-the-stuff-people-actually-jgb</guid>
      <description>&lt;p&gt;Reddit's already arguing about this one.&lt;/p&gt;

&lt;p&gt;The consensus seems to be "yeah, upgrade," mostly because 1.109 tightens Copilot Chat and adds an integrated browser preview, but Linux folks keep side-eyeing the Snap packaging situation.&lt;/p&gt;

&lt;h2&gt;
  
  
  Community take: what people are saying this week
&lt;/h2&gt;

&lt;p&gt;I've watched teams treat VS Code updates like Chrome updates. They just happen, until they don't.&lt;/p&gt;

&lt;p&gt;On the k8s and devtools Slacks, the vibe around 1.109 feels practical: frontend folks like the in-editor browser, AI-heavy teams like having Claude in the mix, and ops-minded people immediately ask "where do the keys go?"&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Most teams are upgrading for Copilot Chat and agents:&lt;/strong&gt; The chatter I see centers on "agent sessions feel less sticky now" and "streaming feels snappier," not on big editor changes.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Linux users keep bringing up Snap disk usage:&lt;/strong&gt; Some teams report deleted files piling up in a snap-local Trash folder and eating disk. Others say "just use .deb and move on." Either way, do not write "no known issues."&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Frontend devs like the integrated browser preview:&lt;/strong&gt; As one SRE put it, "anything that kills the alt-tab loop is worth trying," but they still keep Chrome open for real debugging.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;So. If you run VS Code via Snap on Linux, read the community reports before you hit update.&lt;/p&gt;

&lt;p&gt;If you install via .deb/.rpm or you sit on macOS/Windows, you probably won't notice drama. You'll just notice new toys.&lt;/p&gt;

&lt;h2&gt;
  
  
  Official changelog recap (what 1.109.0 actually ships)
&lt;/h2&gt;

&lt;p&gt;The official notes call out three big buckets: chat improvements, multi-agent workflows, and the new integrated browser preview.&lt;/p&gt;

&lt;p&gt;They also sneak in a couple of operational changes people miss until their terminal stops working on an old Windows VM.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Claude Agent support (Preview):&lt;/strong&gt; VS Code adds Claude agent support through Anthropic integration in Copilot Chat. Expect "preview" sharp edges, and expect your security team to ask where the Anthropic API key lives.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Integrated browser (Preview):&lt;/strong&gt; VS Code can open a browser inside the workbench with DevTools. You can test localhost flows and keep it in a tab next to your code.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Chat UX changes:&lt;/strong&gt; The notes mention faster streaming and better reasoning display. You feel this as "less waiting for the full blob," not as magic correctness.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Two more official bits matter for upgrades, even if they don't sound exciting.&lt;/p&gt;

&lt;p&gt;VS Code deprecates the old Copilot extension in favor of Copilot Chat, and VS Code also removes winpty support, which can hit older Windows installs.&lt;/p&gt;

&lt;h2&gt;
  
  
  Claude Agents in VS Code: what actually changes in your workflow
&lt;/h2&gt;

&lt;p&gt;This is the headline feature and it deserves more than a bullet point. Here's what you're actually getting.&lt;/p&gt;

&lt;p&gt;Claude integration in VS Code 1.109 means you can select Claude models (Sonnet, Opus) as your chat provider inside Copilot Chat. Previously, Copilot was locked to OpenAI models. Now you pick your model in the chat panel dropdown - no extension swapping, no separate window.&lt;/p&gt;

&lt;p&gt;The practical impact depends on what you do with chat:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Code review prompts:&lt;/strong&gt; Claude tends to catch more architectural issues and explain trade-offs in more depth than GPT-4 in my experience. If you use chat for "review this PR diff," Claude is worth trying here.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Multi-file refactoring:&lt;/strong&gt; The agent mode lets Claude make edits across multiple files in one session. This is where "agent" actually means something - it proposes changes, you approve them, and it moves to the next file without losing context.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Test generation:&lt;/strong&gt; Claude's test output tends to be less boilerplate-heavy. If your previous experience with Copilot test generation was "it writes tests that test the mock, not the behavior," Claude does better here in most cases.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The setup is straightforward but the security implications aren't trivial:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;// settings.json - add your Anthropic API key
{
  "github.copilot.chat.models": ["claude-sonnet-4-20250514"],
  "anthropic.apiKey": "sk-ant-..."  // or use env var
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Security note that matters:&lt;/strong&gt; Your API key lives in settings.json by default. If you sync settings across machines (which most people do), that key syncs too. Use an environment variable instead (&lt;code&gt;ANTHROPIC_API_KEY&lt;/code&gt;) or configure it per-workspace in &lt;code&gt;.vscode/settings.json&lt;/code&gt; and add that file to &lt;code&gt;.gitignore&lt;/code&gt;. Do not commit API keys. This sounds obvious until you realize VS Code Settings Sync makes it non-obvious.&lt;/p&gt;

&lt;p&gt;Also worth knowing: Claude in VS Code sends your code context to Anthropic's API. If you work on proprietary code, check your org's data handling policy before enabling this. The context window includes the active file, selected text, and referenced files - it's not sending your entire workspace, but it's sending more than people usually realize.&lt;/p&gt;

&lt;h2&gt;
  
  
  Integrated browser: killing the alt-tab loop (mostly)
&lt;/h2&gt;

&lt;p&gt;The integrated browser preview lets you open a Chromium-based browser tab inside VS Code, complete with DevTools. You get this via the Command Palette (&lt;code&gt;Ctrl+Shift+P&lt;/code&gt; → "Simple Browser: Show") or by clicking a URL in the terminal.&lt;/p&gt;

&lt;p&gt;What it actually does well:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Localhost testing without leaving the editor:&lt;/strong&gt; If you're running a dev server on &lt;code&gt;localhost:3000&lt;/code&gt;, you can preview it in a VS Code tab. CSS changes, component renders, API responses - all visible without switching windows.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;DevTools in the same pane:&lt;/strong&gt; The embedded browser includes a basic DevTools panel. Network tab, console, elements inspector. Good enough for "why is this API call failing?" checks.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Side-by-side layout:&lt;/strong&gt; Split your editor left, browser right. Edit a React component, see it render immediately. This is genuinely useful for UI work.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;What it doesn't replace:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Chrome DevTools for serious debugging:&lt;/strong&gt; The embedded DevTools are a subset. No Performance tab, no Lighthouse, no Application panel for service workers. For real performance work, you still need a full browser.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Cross-browser testing:&lt;/strong&gt; It's Chromium only. If you need to test Firefox or Safari rendering, this doesn't help.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Auth flows involving redirects:&lt;/strong&gt; OAuth redirects, SSO flows, anything that bounces you through multiple domains - the embedded browser handles these unpredictably. Test auth flows in a real browser.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The practical verdict: great for fast feedback loops on UI changes, not a replacement for your actual browser. Think of it as a better Live Server, not a better Chrome.&lt;/p&gt;

&lt;h2&gt;
  
  
  What's changing under the hood (the stuff that breaks things)
&lt;/h2&gt;

&lt;p&gt;Two changes in 1.109, one deprecation and one removal, create real tickets if you miss them:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Old Copilot extension deprecated:&lt;/strong&gt; If your org installed the standalone "GitHub Copilot" extension separately from "GitHub Copilot Chat," the standalone one is now deprecated. It still works in 1.109 but expect it to stop working in a future release. The migration path is to use Copilot Chat for everything. If you manage a fleet, check which extension ID your deployment scripts install - &lt;code&gt;github.copilot&lt;/code&gt; (old) vs &lt;code&gt;github.copilot-chat&lt;/code&gt; (current).&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;winpty removed:&lt;/strong&gt; VS Code drops winpty terminal support, which was the fallback for terminals on older Windows systems. If you run VS Code on Windows Server 2016 or earlier, or on any machine where ConPTY isn't available, your integrated terminal breaks silently. The fix is to use a Windows version that supports ConPTY (Windows 10 1809+), but "upgrade Windows" isn't always an option in enterprise environments.&lt;/p&gt;

&lt;h2&gt;
  
  
  1.108 vs 1.109: what actually changed between versions
&lt;/h2&gt;

&lt;p&gt;If you're on 1.108 and wondering whether to jump, here's what the delta looks like:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;1.108 → 1.109 gains:&lt;/strong&gt; Claude agent support, integrated browser preview, faster chat streaming, multi-agent session management, Copilot extension consolidation.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;1.108 → 1.109 losses:&lt;/strong&gt; winpty terminal support (Windows), standalone Copilot extension (deprecated, still functional).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Stability:&lt;/strong&gt; 1.108.2 was a recovery build (see our &lt;a href="https://releaserun.com/vscode-1-108-2-release-notes/" rel="noopener noreferrer"&gt;1.108.2 analysis&lt;/a&gt;), which means 1.108 had bumps. 1.109.0 ships without a recovery build so far - that's a good sign but not a guarantee.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If you skipped 1.108 entirely (waited on 1.107), you're getting two releases of accumulated changes. Read our &lt;a href="https://releaserun.com/vscode-1-108-release-notes/" rel="noopener noreferrer"&gt;1.108 analysis&lt;/a&gt; too.&lt;/p&gt;

&lt;h2&gt;
  
  
  My synthesis: who should upgrade now, who should test first
&lt;/h2&gt;

&lt;p&gt;Upgrade if you use chat daily.&lt;/p&gt;

&lt;p&gt;I do not think "feature release, upgrade immediately" is always smart, but 1.109 lands in the category where most teams won't regret it, unless you sit on a brittle packaging path or an old Windows baseline.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Upgrade now if you live in Copilot Chat:&lt;/strong&gt; If your day includes "ask for a refactor," "write tests," and "explain this stacktrace," you'll notice the streaming and session workflow tweaks.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Test first if you manage a fleet image:&lt;/strong&gt; If you bake VS Code into a golden image, test the Copilot extension deprecation behavior. Extensions disappearing during an update creates fun tickets.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Be paranoid on Linux Snap:&lt;/strong&gt; If your devs install from Snap, you should probably prefer .deb/.rpm for now, or at least warn people to check disk usage and Trash behavior after updating.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Hold on Windows Server 2016 or earlier:&lt;/strong&gt; The winpty removal means your terminal may break. Test before rolling out.&lt;/li&gt;
&lt;/ul&gt;

&lt;blockquote&gt;
&lt;p&gt;Ignore the GitHub commit count. It's a vanity metric. I care about "does my terminal still work," and "did my extensions behave."&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  How to upgrade (and what I check right after)
&lt;/h2&gt;

&lt;p&gt;Keep it boring.&lt;/p&gt;

&lt;p&gt;Restart VS Code when it prompts you, then do a quick smoke test before you trust it with an incident.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Confirm the version:&lt;/strong&gt; Open Help, then About, and verify you see 1.109.0.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Smoke test the terminal:&lt;/strong&gt; Open a terminal, run &lt;code&gt;node -v&lt;/code&gt; or &lt;code&gt;python --version&lt;/code&gt;, and make sure the shell actually starts. On Windows, check that ConPTY is working.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Smoke test chat:&lt;/strong&gt; Ask Copilot Chat to explain a small function. Watch streaming. If it lags or stalls, you'll notice immediately.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Check extensions:&lt;/strong&gt; Open the Extensions panel and verify no extensions are disabled or showing errors. Pay special attention to the Copilot extension status.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Try Claude if enabled:&lt;/strong&gt; Switch the model picker to Claude, send a prompt, confirm it responds. Check that your API key isn't visible in Settings Sync.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Quick usage examples (the stuff people will try first)
&lt;/h2&gt;

&lt;p&gt;People won't read a long guide before clicking buttons.&lt;/p&gt;

&lt;p&gt;They'll try the browser tab, then they'll try Claude, then they'll ask why auth or keys feel weird.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Try the integrated browser preview:&lt;/strong&gt; Open a local web app, then open it in the integrated browser. Use DevTools to check network calls, especially auth redirects, because embedded browsers love to surprise you.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Try Claude in chat (if your org allows it):&lt;/strong&gt; Add your Anthropic key per your org policy, then select a Claude model for review-style prompts. Keep an eye on what context you share. "Paste the whole repo" turns into a compliance conversation fast.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Run two agent sessions on purpose:&lt;/strong&gt; Put one agent on "write tests," another on "document behavior," then see if session switching feels sane. This is where the update pays off if you actually work that way.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Known issues (official vs community)
&lt;/h2&gt;

&lt;p&gt;The official notes do not list a known-issues section.&lt;/p&gt;

&lt;p&gt;The community still reports issues, especially around VS Code on Linux via Snap and disk usage related to Trash behavior. If you hit that, switch install methods and move on with your life.&lt;/p&gt;

&lt;p&gt;Additional issues worth watching:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Claude rate limiting:&lt;/strong&gt; If you're on a free Anthropic tier, you'll hit rate limits fast when using Claude as your primary chat model. The error messages in VS Code aren't always clear about this - it looks like a timeout, not a rate limit.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Settings Sync + API keys:&lt;/strong&gt; As mentioned above, API keys in settings.json sync across machines. If you're on a shared machine or syncing to a personal device, audit what's in your synced settings.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Extension conflicts:&lt;/strong&gt; If you have both the old Copilot extension and Copilot Chat installed, some users report duplicate suggestions or chat panel confusion. Uninstall the old one explicitly.&lt;/li&gt;
&lt;/ul&gt;





&lt;h2&gt;
  
  
  Related Reading
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://releaserun.com/vscode-1-108-2-release-notes/" rel="noopener noreferrer"&gt;VS Code 1.108.2 recovery build&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://releaserun.com/vscode-1-108-release-notes/" rel="noopener noreferrer"&gt;VS Code 1.108.0 upgrade plan&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://releaserun.com/badges/" rel="noopener noreferrer"&gt;Version health badges for all your tools&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>vscode</category>
      <category>programming</category>
      <category>ai</category>
      <category>webdev</category>
    </item>
    <item>
      <title>Kubernetes 1.36 apiserver /readyz now waits for watch cache</title>
      <dc:creator>Matheus</dc:creator>
      <pubDate>Sat, 21 Feb 2026 17:23:31 +0000</pubDate>
      <link>https://dev.to/matheus_releaserun/kubernetes-136-apiserver-readyz-now-waits-for-watch-cache-34j0</link>
      <guid>https://dev.to/matheus_releaserun/kubernetes-136-apiserver-readyz-now-waits-for-watch-cache-34j0</guid>
      <description>&lt;p&gt;Test first. If you run production traffic, treat this as a control-plane behavior change, not a feature.&lt;/p&gt;

&lt;h2&gt;
  
  
  should you care? verdict
&lt;/h2&gt;

&lt;p&gt;Yes, you should care. This changes when kube-apiserver admits it is ready, and your automation will notice.&lt;/p&gt;

&lt;p&gt;In my experience, the worst control-plane outages start with a “green” health check and a pile of controllers doing list+watch at the same time. This release nudges the apiserver toward honest readiness. That is good. It can still bite you if your probes or load balancer health checks assume startup always stays under 10 seconds.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Upgrade stance:&lt;/strong&gt; test in a disposable cluster first. Watch your apiserver readiness time, restart count, and error rates after deploying.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;What breaks first:&lt;/strong&gt; aggressive liveness or external health checks that kill the apiserver before it finishes warming watch cache.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;What gets better:&lt;/strong&gt; fewer “Ready but actually not ready” windows that trigger thundering-herd list+watch traffic.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  should you care? apiserver readiness waits for watch cache init (PR #135777)
&lt;/h2&gt;

&lt;p&gt;You will see this. Your /readyz can stay red longer.&lt;/p&gt;

&lt;p&gt;PR #135777 enables WatchCacheInitializationPostStartHook by default. kube-apiserver will not report ready until it initializes the watch cache, instead of letting it settle later. Read the PR if you want the gory details. It is a small default change with big operational side effects. Your mileage may vary depending on how large your cluster is and how noisy your controllers are.&lt;/p&gt;

&lt;p&gt;Here’s the thing nobody mentions in release notes. A lot of “control-plane automation” treats slow readiness as a failure and responds by killing the pod. That works great until Kubernetes starts doing more work before readyz flips green. Then you get a boot loop you created yourself. I cannot point to an upstream “this will restart-loop you” bug report yet, so treat it as an operator risk. Still. I have seen enough probe configs to know it happens.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;What to monitor:&lt;/strong&gt; apiserver restart count during rollout, and how long /readyz stays non-200 after a process start.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;What to check right after deploy:&lt;/strong&gt; controller error rates and request latency. A thundering herd shows up as a fat tail on apiserver request duration and a spike in inflight.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;What to fix before blaming Kubernetes:&lt;/strong&gt; any external load balancer health check that marks the apiserver dead faster than your slowest control-plane node can warm its caches.&lt;/li&gt;
&lt;/ul&gt;
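&lt;p&gt;To put a number on "how long /readyz stays non-200," a small poller in whatever language you have handy does the job. A sketch in JavaScript, assuming you exposed the endpoint locally (the &lt;code&gt;kubectl proxy&lt;/code&gt; address below is a placeholder) and run it on Node 18+ for built-in &lt;code&gt;fetch&lt;/code&gt;:&lt;/p&gt;

```javascript
// Measure how long /readyz stays non-200 after an apiserver restart.
// Assumes the endpoint is reachable locally, e.g. via `kubectl proxy`;
// the default URL below is a placeholder for your setup.
const READYZ_URL = process.env.READYZ_URL || 'http://127.0.0.1:8001/readyz';

async function timeUntilReady(url, { intervalMs = 500, timeoutMs = 120_000 } = {}) {
  const start = Date.now();
  for (;;) {
    const elapsed = Date.now() - start;
    if (elapsed >= timeoutMs) {
      throw new Error(`readyz never returned 200 within ${timeoutMs} ms`);
    }
    try {
      const res = await fetch(url);
      if (res.status === 200) return elapsed; // ms until first 200
    } catch {
      // connection refused while the process is still coming up; keep polling
    }
    await new Promise((r) => setTimeout(r, intervalMs));
  }
}

// Example usage:
// timeUntilReady(READYZ_URL).then((ms) => console.log(`ready after ${ms} ms`));
```

&lt;p&gt;Restart an apiserver in your disposable cluster, run the poller, and write the number down. That becomes the floor for any liveness probe or load balancer health check timeout you configure.&lt;/p&gt;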

&lt;h2&gt;
  
  
  should you care? watch_list_duration_seconds goes Beta (PR #136086)
&lt;/h2&gt;

&lt;p&gt;This matters. You can alert on it without feeling silly.&lt;/p&gt;

&lt;p&gt;PR #136086 graduates watch_list_duration_seconds to Beta. That gives you a more stable target for SLOs around watch-list behavior. If you run large informer fleets, or you have operators that “helpfully” relist the world every few minutes, this metric helps you stop guessing.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Do this before testing 1.36:&lt;/strong&gt; baseline watch_list_duration_seconds in your current version. Capture p50 and p95 for a normal day.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Alerting posture:&lt;/strong&gt; start with a paging threshold only if you already page on apiserver latency. Otherwise, route it to a ticket first and tighten later.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;After deploying:&lt;/strong&gt; check your error rates, then check watch_list_duration_seconds. If both jump together, you found a real control-plane problem.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  should you care? declarative validation can fail closed (PR #136117, KEP-5073)
&lt;/h2&gt;

&lt;p&gt;Yes. This can turn a panic into user-visible 500s.&lt;/p&gt;

&lt;p&gt;PR #136117 adds WithDeclarativeNative so strategy.go code can opt into DV-native validations. The sharp edge is intentional. When DV-native rules exist, generated validation code can run even if the DeclarativeValidation feature gate is disabled. If the authoritative declarative validator panics, Kubernetes fails closed and returns InternalError. That trades availability for correctness. In an API server, I agree with that trade most days.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;What you will see:&lt;/strong&gt; InternalError on create or update, correlated with apiserver stack traces.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;What to monitor:&lt;/strong&gt; apiserver 5xx rate by resource and verb. If your 5xx jumps right after upgrade, do not shrug and blame clients.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Operational caveat:&lt;/strong&gt; a fail-closed validator can block writes. Plan a rollback path for the control plane. Do not discover that during an incident.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  should you care? list/watch memory spikes and the 10x swing
&lt;/h2&gt;

&lt;p&gt;I have watched apiserver RSS climb for seven minutes straight. It looks like a leak. It usually is not.&lt;/p&gt;

&lt;p&gt;The Kubernetes API streaming work shows why list behavior hurts under load. In a synthetic test from the upstream blog, kube-apiserver memory stabilized around ~2 GB with watch-list patterns enabled versus ~20 GB without. That is a test, not a promise for your cluster. Still, the direction matches what I see in real incidents. List-heavy clients punish apiservers with big transient allocations, then you pray the OOM killer picks the right process.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;What to monitor:&lt;/strong&gt; apiserver RSS and allocation pressure during informer resyncs. Pair it with watch_list_duration_seconds so you can tell “slow watch-list” from “just memory churn.”&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;What to do if it climbs:&lt;/strong&gt; slow the rollout, reduce controller concurrency if you can, and check for misbehaving operators spamming list calls.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Opinion:&lt;/strong&gt; ignore GitHub commit counts. Watch your apiserver graphs.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  should you care? how to test 1.36 alpha without ruining your week
&lt;/h2&gt;

&lt;p&gt;Do it in kind first. Then do it again with your real monitoring.&lt;/p&gt;

&lt;p&gt;The release schedule lists 2026-02-18 for 1.36.0-alpha.2. Schedules slip. Tags show up late. Verify the image exists before you assume it does. If you run kubeadm or a managed provider, you will wait for your distro anyway.&lt;/p&gt;

&lt;p&gt;Use a disposable cluster. Keep it boring.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Create a kind cluster:&lt;/strong&gt; &lt;code&gt;kind create cluster --name k136 --image kindest/node:v1.36.0-alpha.2&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Basic smoke check:&lt;/strong&gt; &lt;code&gt;kubectl cluster-info&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;What to watch live:&lt;/strong&gt; apiserver readiness behavior, restart count, and 5xx error rate during startup and during controller churn tests.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  should you care? red flags and what I grep first
&lt;/h2&gt;

&lt;p&gt;Page on symptoms. Not vibes.&lt;/p&gt;

&lt;p&gt;If /readyz never goes green, look at watch cache init messages and then look at who kills the apiserver. If you see InternalError on writes, correlate timestamps with apiserver stack traces. If watch/list stalls show up, baseline watch_list_duration_seconds now that it is Beta-grade and compare during your canary. Your monitoring should tell you the story in five minutes, not after the postmortem.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Check your error rates after deploying. If you do not have an apiserver 5xx panel, you are flying blind.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Other stuff in this release: dependency bumps, some image updates, the usual. Anyway.&lt;/p&gt;

&lt;h2&gt;
  
  
  Related Reading
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://releaserun.com/kubernetes-upgrade-checklist/" rel="noopener noreferrer"&gt;Kubernetes Upgrade Checklist&lt;/a&gt; - The runbook for safe minor version upgrades&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://releaserun.com/kubernetes-support-and-eol-policy/" rel="noopener noreferrer"&gt;Kubernetes EOL Policy Explained&lt;/a&gt; - Know when your version loses support&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://releaserun.com/how-to-add-version-health-badges-to-your-project/" rel="noopener noreferrer"&gt;How to Add Version Health Badges&lt;/a&gt; - Track release health in your README&lt;/li&gt;
&lt;/ul&gt;




&lt;p&gt;&lt;strong&gt;🔍 Related tool:&lt;/strong&gt; &lt;a href="https://releaserun.com/tools/kubernetes-security-linter/" rel="noopener noreferrer"&gt;Kubernetes YAML Security Linter&lt;/a&gt; — paste any K8s manifest and scan for 12 security issues with an A–F grade. Free, browser-based.&lt;/p&gt;

</description>
      <category>kubernetes</category>
      <category>devops</category>
      <category>cloud</category>
      <category>sre</category>
    </item>
    <item>
      <title>Kubernetes 1.32 End of Life: Migration Playbook for February 28, 2026</title>
      <dc:creator>Matheus</dc:creator>
      <pubDate>Sat, 21 Feb 2026 17:22:55 +0000</pubDate>
      <link>https://dev.to/matheus_releaserun/kubernetes-132-end-of-life-migration-playbook-for-february-28-2026-3lnp</link>
      <guid>https://dev.to/matheus_releaserun/kubernetes-132-end-of-life-migration-playbook-for-february-28-2026-3lnp</guid>
      <description>&lt;p&gt;&lt;strong&gt;12 days.&lt;/strong&gt; That's how long Kubernetes 1.32 has left before the upstream project stops issuing patches. After February 28, 2026, there are no more security fixes, no more bug patches, no more backports. Version 1.32.12 - released on February 10 - is the last update you will ever get.&lt;/p&gt;

&lt;p&gt;If you're still running 1.32 in production, this is your migration playbook. Not a gentle nudge. A concrete, step-by-step plan to get off a version that's about to become a liability.&lt;/p&gt;




&lt;h2&gt;
  
  
  What "End of Life" Actually Means (It's Worse Than You Think)
&lt;/h2&gt;

&lt;p&gt;Let's be precise about what happens on March 1st if you're still on 1.32.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;No more CVE patches.&lt;/strong&gt; When the next Kubernetes vulnerability drops - and it will - the fix will ship for 1.33, 1.34, and 1.35. Not 1.32. You'll read the advisory, understand exactly how your clusters are exposed, and have no upstream fix to apply.&lt;/p&gt;

&lt;p&gt;This isn't theoretical. Look at what's already been patched in 1.33.x that 1.32 users are exposed to &lt;em&gt;right now&lt;/em&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;CVE-2025-5187&lt;/strong&gt; (fixed in 1.33.4): Nodes can delete themselves by adding an OwnerReference to their own Node object. An attacker with node-level access can cause cascading disruption by self-destructing nodes in your cluster. This is the kind of bug that makes incident response teams lose sleep.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;CVE-2025-4563&lt;/strong&gt; (fixed in 1.33.2): DRA (Dynamic Resource Allocation) authorization bypass. If you're using DRA - and more teams are as GPU workloads grow - this one matters.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;No more bug fixes.&lt;/strong&gt; Several nasty bugs were fixed in 1.33.x patches that will never be backported to 1.32:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;A &lt;a href="https://releaserun.com/kubernetes-1-35-1-kubelet-restarts-pod-stability/" rel="noopener noreferrer"&gt;kubelet watchdog that kills the kubelet&lt;/a&gt; during slow container runtime initialization (1.33.7). If you've ever seen mysterious kubelet restarts after a node reboot, this might be why.&lt;/li&gt;
&lt;li&gt;A DRA double-allocation race condition during rapid pod scheduling (1.33.8). You won't hit this until you do - and when you do, two pods will think they own the same resource.&lt;/li&gt;
&lt;li&gt;A DaemonSet orphaned pod regression (1.33.6) that can leave ghost pods consuming resources with no controller managing them.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;No more compatibility guarantees.&lt;/strong&gt; Ecosystem tools - Helm, Istio, cert-manager, ArgoCD - will drop 1.32 from their test matrices. You'll start seeing "unsupported version" warnings, then errors, then silent incompatibilities that only surface at 3 AM.&lt;/p&gt;

&lt;p&gt;Kubernetes 1.32 had a solid run. Originally released December 11, 2024, it received 12 patch releases over ~14 months. That's the standard lifecycle. But its time is up.&lt;/p&gt;




&lt;h2&gt;
  
  
  Where Should You Land?
&lt;/h2&gt;

&lt;p&gt;You have three supported targets. Here's the honest comparison:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;/th&gt;
&lt;th&gt;&lt;strong&gt;1.33&lt;/strong&gt;&lt;/th&gt;
&lt;th&gt;&lt;strong&gt;1.34&lt;/strong&gt;&lt;/th&gt;
&lt;th&gt;&lt;strong&gt;1.35&lt;/strong&gt;&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;EOL&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;June 28, 2026&lt;/td&gt;
&lt;td&gt;October 27, 2026&lt;/td&gt;
&lt;td&gt;February 28, 2027&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Support remaining&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;~4 months&lt;/td&gt;
&lt;td&gt;~8 months&lt;/td&gt;
&lt;td&gt;~12 months&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Hops from 1.32&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Maturity&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Fully battle-tested&lt;/td&gt;
&lt;td&gt;Stable, well-patched&lt;/td&gt;
&lt;td&gt;Current release, still early patches&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Risk profile&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Low risk, low runway&lt;/td&gt;
&lt;td&gt;Low risk, good runway&lt;/td&gt;
&lt;td&gt;Low risk on paper, less field time&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Recommended for&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;"Just get off 1.32 NOW"&lt;/td&gt;
&lt;td&gt;Most production teams&lt;/td&gt;
&lt;td&gt;Teams who just upgraded recently&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  Our recommendation: Target 1.34 for most teams.
&lt;/h3&gt;

&lt;p&gt;Here's the reasoning:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Why not 1.33?&lt;/strong&gt; It works, it's stable, and it's the fewest changes from where you are. But with EOL on June 28, you'd be doing this exact same fire drill in four months. That's not a migration strategy - that's procrastination with extra steps.&lt;br&gt;
&lt;strong&gt;Why not 1.35?&lt;/strong&gt; It's the &lt;a href="https://releaserun.com/kubernetes-1-35-release-preview/" rel="noopener noreferrer"&gt;current release&lt;/a&gt; with the longest support runway. But getting there requires three sequential minor version upgrades (1.32→1.33→1.34→1.35), and the newest release has had less time in the field. Unless you upgraded to 1.34 recently and are just continuing the chain, the extra hop adds risk and downtime for marginal benefit.&lt;br&gt;
&lt;strong&gt;Why 1.34?&lt;/strong&gt; Two hops (1.32→1.33→1.34), eight months of support, and a version that's had enough patch releases to shake out the rough edges. You get the major 1.33 features (sidecar containers GA, nftables GA) plus whatever 1.34 brought to the table, and you won't need to think about upgrading again until late summer.&lt;/p&gt;

&lt;p&gt;The one exception: if you're in a change-freeze or have a release cycle that makes two hops impossible before February 28, go to 1.33 &lt;em&gt;now&lt;/em&gt; and plan the 1.33→1.34 hop for March. Getting off 1.32 is the priority.&lt;/p&gt;


&lt;h2&gt;
  
  
  The Upgrade Path: You Cannot Skip Minor Versions
&lt;/h2&gt;

&lt;p&gt;This is the part where people get burned. &lt;a href="https://releaserun.com/kubernetes-support-and-eol-policy/" rel="noopener noreferrer"&gt;Kubernetes version skew policy&lt;/a&gt; is strict: &lt;strong&gt;you must upgrade one minor version at a time.&lt;/strong&gt; There is no shortcut from 1.32 to 1.34. You go through 1.33, you validate, and then you continue.&lt;/p&gt;

&lt;p&gt;Here's the sequence for a kubeadm-managed cluster:&lt;/p&gt;
&lt;h3&gt;
  
  
  Pre-Upgrade Checklist
&lt;/h3&gt;

&lt;p&gt;Before you touch anything:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# 1. Confirm your current version&lt;/span&gt;

kubectl version &lt;span class="nt"&gt;--short&lt;/span&gt;


kubectl get nodes &lt;span class="nt"&gt;-o&lt;/span&gt; wide

&lt;span class="c"&gt;# 2. Check for deprecated API usage that will break on upgrade&lt;/span&gt;
&lt;span class="c"&gt;# Install kubectl-deprecations or use kubent&lt;/span&gt;

kubectl get &lt;span class="nt"&gt;--raw&lt;/span&gt; /metrics | &lt;span class="nb"&gt;grep &lt;/span&gt;apiserver_requested_deprecated_apis

&lt;span class="c"&gt;# 3. Verify etcd health&lt;/span&gt;

&lt;span class="nv"&gt;ETCDCTL_API&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;3 etcdctl endpoint health &lt;span class="se"&gt;\&lt;/span&gt;


&lt;span class="nt"&gt;--endpoints&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;https://127.0.0.1:2379 &lt;span class="se"&gt;\&lt;/span&gt;


&lt;span class="nt"&gt;--cacert&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;/etc/kubernetes/pki/etcd/ca.crt &lt;span class="se"&gt;\&lt;/span&gt;


&lt;span class="nt"&gt;--cert&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;/etc/kubernetes/pki/etcd/server.crt &lt;span class="se"&gt;\&lt;/span&gt;


&lt;span class="nt"&gt;--key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;/etc/kubernetes/pki/etcd/server.key

&lt;span class="c"&gt;# 4. Back up etcd (non-negotiable)&lt;/span&gt;

&lt;span class="nv"&gt;ETCDCTL_API&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;3 etcdctl snapshot save /backup/etcd-pre-upgrade-&lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;date&lt;/span&gt; +%Y%m%d&lt;span class="si"&gt;)&lt;/span&gt;.db &lt;span class="se"&gt;\&lt;/span&gt;


&lt;span class="nt"&gt;--endpoints&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;https://127.0.0.1:2379 &lt;span class="se"&gt;\&lt;/span&gt;


&lt;span class="nt"&gt;--cacert&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;/etc/kubernetes/pki/etcd/ca.crt &lt;span class="se"&gt;\&lt;/span&gt;


&lt;span class="nt"&gt;--cert&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;/etc/kubernetes/pki/etcd/server.crt &lt;span class="se"&gt;\&lt;/span&gt;


&lt;span class="nt"&gt;--key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;/etc/kubernetes/pki/etcd/server.key

&lt;span class="c"&gt;# 5. Check component version skew&lt;/span&gt;
&lt;span class="c"&gt;# kubelet must be within one minor version of the API server&lt;/span&gt;
&lt;span class="c"&gt;# kube-proxy must match the API server minor version&lt;/span&gt;

kubectl get nodes &lt;span class="nt"&gt;-o&lt;/span&gt; &lt;span class="nv"&gt;jsonpath&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;'{range .items[*]}{.metadata.name}{"\t"}{.status.nodeInfo.kubeletVersion}{"\n"}{end}'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Hop 1: 1.32 → 1.33
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# On the first control plane node:&lt;/span&gt;
&lt;span class="c"&gt;# Update kubeadm&lt;/span&gt;

apt-get update &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; apt-get &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;-y&lt;/span&gt; &lt;span class="nv"&gt;kubeadm&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;'1.33.*-*'&lt;/span&gt;

&lt;span class="c"&gt;# or on RHEL/CentOS:&lt;/span&gt;
&lt;span class="c"&gt;# yum install -y kubeadm-1.33.*&lt;/span&gt;
&lt;span class="c"&gt;# Verify the upgrade plan&lt;/span&gt;

kubeadm upgrade plan

&lt;span class="c"&gt;# Apply the upgrade (first control plane only)&lt;/span&gt;

kubeadm upgrade apply v1.33.8

&lt;span class="c"&gt;# Upgrade kubelet and kubectl&lt;/span&gt;

apt-get &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;-y&lt;/span&gt; &lt;span class="nv"&gt;kubelet&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;1.33.&amp;lt;em&amp;gt;-&amp;lt;/em&amp;gt; &lt;span class="nv"&gt;kubectl&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;1.33.&amp;lt;em&amp;gt;-&amp;lt;/em&amp;gt;


systemctl daemon-reload


systemctl restart kubelet
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For additional control plane nodes:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;kubeadm upgrade node

apt-get &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;-y&lt;/span&gt; &lt;span class="nv"&gt;kubelet&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;1.33.&amp;lt;em&amp;gt;-&amp;lt;/em&amp;gt; &lt;span class="nv"&gt;kubectl&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;1.33.&amp;lt;em&amp;gt;-&amp;lt;/em&amp;gt;


systemctl daemon-reload


systemctl restart kubelet
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For each worker node:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# From a machine with kubectl access:&lt;/span&gt;

kubectl drain &amp;lt;node-name&amp;gt; &lt;span class="nt"&gt;--ignore-daemonsets&lt;/span&gt; &lt;span class="nt"&gt;--delete-emptydir-data&lt;/span&gt;

&lt;span class="c"&gt;# On the worker node:&lt;/span&gt;

apt-get update &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; apt-get &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;-y&lt;/span&gt; &lt;span class="nv"&gt;kubeadm&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;'1.33.*-*'&lt;/span&gt;


kubeadm upgrade node


apt-get &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;-y&lt;/span&gt; &lt;span class="nv"&gt;kubelet&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;1.33.&amp;lt;em&amp;gt;-&amp;lt;/em&amp;gt;


systemctl daemon-reload


systemctl restart kubelet

&lt;span class="c"&gt;# From kubectl:&lt;/span&gt;

kubectl uncordon &amp;lt;node-name&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Stop here. Validate.&lt;/strong&gt; Don't chain upgrades without confirming the cluster is healthy:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;kubectl get nodes          &lt;span class="c"&gt;# All nodes Ready?&lt;/span&gt;

kubectl get pods &lt;span class="nt"&gt;-A&lt;/span&gt;        &lt;span class="c"&gt;# Any CrashLoopBackOff?&lt;/span&gt;


kubectl get cs             &lt;span class="c"&gt;# Component statuses healthy? (deprecated API, still responds)&lt;/span&gt;

&lt;span class="c"&gt;# Run your smoke tests. You have smoke tests, right?&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Hop 2: 1.33 → 1.34
&lt;/h3&gt;

&lt;p&gt;Repeat the exact same process, substituting &lt;code&gt;1.34&lt;/code&gt; for &lt;code&gt;1.33&lt;/code&gt;. Same drain-upgrade-uncordon dance. Same validation.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Version skew during upgrade:&lt;/strong&gt; The Kubernetes version skew policy allows kubelet to be one minor version behind the API server. This means during the 1.33→1.34 upgrade, your 1.33 kubelets will work with the 1.34 API server while you roll nodes. But 1.32 kubelets will &lt;em&gt;not&lt;/em&gt; work with a 1.34 API server. This is why you can't skip versions.&lt;/p&gt;




&lt;h2&gt;
  
  
  Cloud Provider Timelines: You Might Have More Time (For a Price)
&lt;/h2&gt;

&lt;p&gt;If you're running managed Kubernetes, your deadlines are slightly different - but don't get complacent.&lt;/p&gt;

&lt;h3&gt;
  
  
  Amazon EKS
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Standard support EOL for 1.32:&lt;/strong&gt; March 23, 2026 (three weeks after upstream)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Extended support EOL:&lt;/strong&gt; March 23, 2027&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;EKS extended support buys you a full extra year, but at a premium: &lt;strong&gt;$0.60 per cluster per hour&lt;/strong&gt;, up from the standard $0.10/hour. That extra $0.50/hour works out to roughly $4,400/year per cluster just for the privilege of staying on 1.32. For a single cluster, maybe. For a fleet, you're burning budget to avoid an upgrade you'll have to do anyway.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Check your EKS cluster version&lt;/span&gt;

aws eks describe-cluster &lt;span class="nt"&gt;--name&lt;/span&gt; &amp;lt;cluster-name&amp;gt; &lt;span class="se"&gt;\&lt;/span&gt;


&lt;span class="nt"&gt;--query&lt;/span&gt; &lt;span class="s1"&gt;'cluster.version'&lt;/span&gt; &lt;span class="nt"&gt;--output&lt;/span&gt; text

&lt;span class="c"&gt;# Start an EKS upgrade to 1.33&lt;/span&gt;

aws eks update-cluster-version &lt;span class="se"&gt;\&lt;/span&gt;


&lt;span class="nt"&gt;--name&lt;/span&gt; &amp;lt;cluster-name&amp;gt; &lt;span class="se"&gt;\&lt;/span&gt;


&lt;span class="nt"&gt;--kubernetes-version&lt;/span&gt; 1.33

&lt;span class="c"&gt;# Watch the update status&lt;/span&gt;

aws eks describe-update &lt;span class="nt"&gt;--name&lt;/span&gt; &amp;lt;cluster-name&amp;gt; &lt;span class="se"&gt;\&lt;/span&gt;


&lt;span class="nt"&gt;--update-id&lt;/span&gt; &amp;lt;update-id-from-previous-command&amp;gt;

&lt;span class="c"&gt;# Don't forget to update your node groups after!&lt;/span&gt;

aws eks update-nodegroup-version &lt;span class="se"&gt;\&lt;/span&gt;


&lt;span class="nt"&gt;--cluster-name&lt;/span&gt; &amp;lt;cluster-name&amp;gt; &lt;span class="se"&gt;\&lt;/span&gt;


&lt;span class="nt"&gt;--nodegroup-name&lt;/span&gt; &amp;lt;nodegroup-name&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Google GKE
&lt;/h3&gt;

&lt;p&gt;GKE typically provides 2-4 weeks of grace after upstream EOL before auto-upgrading clusters. If you haven't set a maintenance window and an upgrade strategy, GKE &lt;em&gt;will&lt;/em&gt; upgrade your clusters for you. That sounds convenient until it happens during your traffic peak.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Check GKE cluster version&lt;/span&gt;

gcloud container clusters describe &amp;lt;cluster-name&amp;gt; &lt;span class="se"&gt;\&lt;/span&gt;


&lt;span class="nt"&gt;--zone&lt;/span&gt; &amp;lt;zone&amp;gt; &lt;span class="nt"&gt;--format&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"value(currentMasterVersion)"&lt;/span&gt;

&lt;span class="c"&gt;# Initiate upgrade&lt;/span&gt;

gcloud container clusters upgrade &amp;lt;cluster-name&amp;gt; &lt;span class="se"&gt;\&lt;/span&gt;


&lt;span class="nt"&gt;--zone&lt;/span&gt; &amp;lt;zone&amp;gt; &lt;span class="nt"&gt;--master&lt;/span&gt; &lt;span class="nt"&gt;--cluster-version&lt;/span&gt; 1.33
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Azure AKS
&lt;/h3&gt;

&lt;p&gt;AKS follows a similar pattern: roughly 2-4 weeks past upstream EOL, with platform-managed upgrades kicking in after that. AKS's "long-term support" (LTS) versions are a separate track - 1.32 is not an LTS release, so no special treatment here.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Check AKS version&lt;/span&gt;

az aks show &lt;span class="nt"&gt;--resource-group&lt;/span&gt; &amp;lt;rg&amp;gt; &lt;span class="nt"&gt;--name&lt;/span&gt; &amp;lt;cluster-name&amp;gt; &lt;span class="se"&gt;\&lt;/span&gt;


&lt;span class="nt"&gt;--query&lt;/span&gt; kubernetesVersion &lt;span class="nt"&gt;-o&lt;/span&gt; tsv

&lt;span class="c"&gt;# Upgrade AKS&lt;/span&gt;

az aks upgrade &lt;span class="nt"&gt;--resource-group&lt;/span&gt; &amp;lt;rg&amp;gt; &lt;span class="nt"&gt;--name&lt;/span&gt; &amp;lt;cluster-name&amp;gt; &lt;span class="se"&gt;\&lt;/span&gt;


&lt;span class="nt"&gt;--kubernetes-version&lt;/span&gt; 1.33
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;The bottom line for cloud users:&lt;/strong&gt; You have a few weeks of buffer. Use that buffer for testing, not for procrastination. Start the upgrade now and use the extra weeks as a safety net, not a crutch.&lt;/p&gt;




&lt;h2&gt;
  
  
  What You Gain: 5 Features Worth the Upgrade
&lt;/h2&gt;

&lt;p&gt;Upgrading isn't just about escaping EOL. The jump from 1.32 to 1.33 is one of the most feature-rich minor releases in recent Kubernetes history. Here's what actually matters in production:&lt;/p&gt;

&lt;h3&gt;
  
  
  1. Sidecar Containers - GA (KEP-753)
&lt;/h3&gt;

&lt;p&gt;This is the big one. After years of KEPs, alpha gates, and community debate, native sidecar containers are generally available. Init containers with &lt;code&gt;restartPolicy: Always&lt;/code&gt; now have proper lifecycle management: they start before your main containers, stay running alongside them, and shut down after them.&lt;/p&gt;

&lt;p&gt;If you're running service meshes (Istio, Linkerd), log shippers, or any sidecar-dependent architecture, this eliminates a whole class of race conditions. No more hacks with &lt;code&gt;postStart&lt;/code&gt; hooks and sleep loops to ensure your Envoy proxy is ready before your app starts.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Watch out:&lt;/strong&gt; A sidecar startup probe race condition was fixed in 1.33.6. Make sure you're on 1.33.8 (latest) to avoid it.&lt;/p&gt;
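&lt;p&gt;For reference, the GA shape is just an init container with a restart policy - a minimal sketch (image names and tags are illustrative):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;apiVersion: v1
kind: Pod
metadata:
  name: app-with-sidecar
spec:
  initContainers:
  - name: log-shipper
    image: fluent/fluent-bit:3.0   # illustrative sidecar image
    restartPolicy: Always          # this one field makes it a native sidecar
  containers:
  - name: app
    image: nginx:1.27              # starts after the sidecar is up
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;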

&lt;h3&gt;
  
  
  2. nftables Kube-Proxy Backend - GA (KEP-3866)
&lt;/h3&gt;

&lt;p&gt;The iptables-based kube-proxy is showing its age. nftables is faster, handles large rule sets better, and is the future of Linux packet filtering. With GA in 1.33, it's production-ready.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The caveat:&lt;/strong&gt; This doesn't mean nftables is the &lt;em&gt;default&lt;/em&gt; yet. You still need to opt in. But if you're running clusters with thousands of Services, the performance difference is measurable - especially rule reload times during Service churn. An &lt;code&gt;iif&lt;/code&gt; vs &lt;code&gt;iifname&lt;/code&gt; bug in local traffic detection was fixed in 1.33.6, so again: run the latest patch.&lt;/p&gt;

&lt;h3&gt;
  
  
  3. In-Place Pod Resource Resize - Beta (KEP-1287)
&lt;/h3&gt;

&lt;p&gt;Change a pod's CPU and memory requests/limits without restarting it. Still beta, so it's behind a feature gate, but this is the kind of capability that changes how you think about vertical scaling. No more killing a pod just because it needs 200Mi more memory during a traffic spike.&lt;/p&gt;
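&lt;p&gt;As a sketch of what the beta feature enables (assuming the InPlacePodVerticalScaling feature gate is on), &lt;code&gt;resizePolicy&lt;/code&gt; declares how each resource tolerates a resize:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;apiVersion: v1
kind: Pod
metadata:
  name: resizable
spec:
  containers:
  - name: app
    image: nginx:1.27   # illustrative
    resources:
      requests:
        cpu: 500m
        memory: 256Mi
    resizePolicy:
    - resourceName: cpu
      restartPolicy: NotRequired      # resize CPU without restarting the container
    - resourceName: memory
      restartPolicy: RestartContainer # restart the container on memory resize
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;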

&lt;h3&gt;
  
  
  4. Topology-Aware Routing - GA (KEP-4444)
&lt;/h3&gt;

&lt;p&gt;&lt;code&gt;trafficDistribution: PreferClose&lt;/code&gt; is now GA. Traffic prefers endpoints in the same zone before crossing zone boundaries. This is pure money in multi-AZ deployments: less cross-zone data transfer, lower latency, better tail percentiles. If you're on AWS or GCP and not using this, you're paying an invisible cloud networking tax.&lt;/p&gt;
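&lt;p&gt;Opting in is a single field on the Service - a minimal sketch:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;apiVersion: v1
kind: Service
metadata:
  name: api
spec:
  selector:
    app: api
  ports:
  - port: 80
  trafficDistribution: PreferClose   # prefer same-zone endpoints before crossing zones
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;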

&lt;h3&gt;
  
  
  5. Multiple Service CIDRs - GA (KEP-1880)
&lt;/h3&gt;

&lt;p&gt;You can now dynamically expand your ClusterIP range without cluster recreation. If you've ever hit the ceiling on your Service CIDR and had to do gymnastics to work around it, this fixes that permanently. Especially relevant for large multi-tenant clusters.&lt;/p&gt;
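&lt;p&gt;With the GA networking.k8s.io/v1 API, adding a range is one object - the CIDR below is illustrative and must not overlap your existing service ranges:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;apiVersion: networking.k8s.io/v1
kind: ServiceCIDR
metadata:
  name: extra-service-range
spec:
  cidrs:
  - 10.100.0.0/16   # illustrative; pick a range free in your network
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;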




&lt;h2&gt;
  
  
  Breaking Changes and Gotchas: What to Watch For
&lt;/h2&gt;

&lt;p&gt;Every upgrade has landmines. Here are the ones that bite in the 1.32→1.33 transition:&lt;/p&gt;

&lt;h3&gt;
  
  
  nftables Consideration
&lt;/h3&gt;

&lt;p&gt;While nftables kube-proxy went GA, the default backend is still iptables in 1.33. However, start planning your migration now. Test nftables in staging. Future versions may change the default, and you don't want to be scrambling when that happens. The migration guide is essential reading - nftables rule semantics differ from iptables in subtle ways that will break custom &lt;code&gt;NetworkPolicy&lt;/code&gt; implementations relying on iptables-specific behavior.&lt;/p&gt;
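&lt;p&gt;For a staging test, the opt-in is one field in the kube-proxy configuration (on kubeadm clusters this lives in the kube-proxy ConfigMap in kube-system):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;apiVersion: kubeproxy.config.k8s.io/v1alpha1
kind: KubeProxyConfiguration
mode: nftables   # the default remains "iptables" in 1.33
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;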

&lt;h3&gt;
  
  
  Deprecated API Removals
&lt;/h3&gt;

&lt;p&gt;Check for any APIs that were deprecated in 1.31 or earlier and removed in 1.33. The &lt;code&gt;flowcontrol.apiserver.k8s.io/v1beta3&lt;/code&gt; API group is one to watch. Run &lt;code&gt;kubectl-deprecations&lt;/code&gt; or &lt;code&gt;kubent&lt;/code&gt; before upgrading:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Using kubent (kube-no-trouble)&lt;/span&gt;

kubent

&lt;span class="c"&gt;# Or check directly&lt;/span&gt;

kubectl get &lt;span class="nt"&gt;--raw&lt;/span&gt; /metrics | &lt;span class="nb"&gt;grep &lt;/span&gt;apiserver_requested_deprecated_apis
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Feature Gate Changes
&lt;/h3&gt;

&lt;p&gt;Some feature gates that were beta (and on by default) in 1.32 graduated to GA in 1.33, which means the gates are locked and removed. If you were explicitly setting these gates in your kubelet or API server configs, the flags will cause startup errors. Audit your &lt;code&gt;--feature-gates&lt;/code&gt; flags before upgrading.&lt;/p&gt;
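&lt;p&gt;For example, SidecarContainers graduated to GA in 1.33, so a kubelet config that still pins it should have the entry dropped - a sketch (your gate list will differ):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
featureGates:
  SidecarContainers: true   # GA in 1.33: the gate is locked, remove this entry
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;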

&lt;h3&gt;
  
  
  DRA (Dynamic Resource Allocation) Changes
&lt;/h3&gt;

&lt;p&gt;If you're using DRA for GPU or custom resource scheduling, be aware of the authorization bypass fix (CVE-2025-4563) and the double-allocation race fix. The fixes are in 1.33.2 and 1.33.8 respectively, so target 1.33.8 as your landing version.&lt;/p&gt;




&lt;h2&gt;
  
  
  Your 5-Step Action Plan
&lt;/h2&gt;

&lt;p&gt;Here's what to do this week. Not next month. This week.&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 1: Audit (Today)
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Find every cluster still on 1.32&lt;/span&gt;
&lt;span class="c"&gt;# For kubeadm clusters:&lt;/span&gt;

kubectl version &lt;span class="nt"&gt;-o&lt;/span&gt; json | jq &lt;span class="s1"&gt;'.serverVersion.minor'&lt;/span&gt;

&lt;span class="c"&gt;# For EKS:&lt;/span&gt;

aws eks list-clusters &lt;span class="nt"&gt;--query&lt;/span&gt; &lt;span class="s1"&gt;'clusters[]'&lt;/span&gt; &lt;span class="nt"&gt;--output&lt;/span&gt; text | &lt;span class="nb"&gt;tr&lt;/span&gt; &lt;span class="s1"&gt;'\t'&lt;/span&gt; &lt;span class="s1"&gt;'\n'&lt;/span&gt; | xargs &lt;span class="nt"&gt;-I&lt;/span&gt;&lt;span class="o"&gt;{}&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;


aws eks describe-cluster &lt;span class="nt"&gt;--name&lt;/span&gt; &lt;span class="o"&gt;{}&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;


&lt;span class="nt"&gt;--query&lt;/span&gt; &lt;span class="s1"&gt;'[cluster.name, cluster.version]'&lt;/span&gt; &lt;span class="nt"&gt;--output&lt;/span&gt; text

&lt;span class="c"&gt;# For GKE:&lt;/span&gt;

gcloud container clusters list &lt;span class="se"&gt;\&lt;/span&gt;


&lt;span class="nt"&gt;--format&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"table(name, currentMasterVersion)"&lt;/span&gt;

&lt;span class="c"&gt;# For AKS:&lt;/span&gt;

az aks list &lt;span class="nt"&gt;--query&lt;/span&gt; &lt;span class="s1"&gt;'[].{name:name, version:kubernetesVersion}'&lt;/span&gt; &lt;span class="nt"&gt;-o&lt;/span&gt; table
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Step 2: Test in Staging (This Week)
&lt;/h3&gt;

&lt;p&gt;Upgrade a non-production cluster to 1.33. Run your full test suite (see our &lt;a href="https://releaserun.com/kubernetes-upgrade-checklist/" rel="noopener noreferrer"&gt;Kubernetes upgrade checklist&lt;/a&gt;). Pay special attention to:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Service mesh behavior (sidecar lifecycle changes)&lt;/li&gt;
&lt;li&gt;Network policies (if you plan to test nftables)&lt;/li&gt;
&lt;li&gt;Any workloads using DRA&lt;/li&gt;
&lt;li&gt;Custom admission webhooks (API changes can break them silently)&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Step 3: Upgrade Production to 1.33 (Week of Feb 23)
&lt;/h3&gt;

&lt;p&gt;Follow the kubeadm or cloud provider upgrade steps above. Target &lt;strong&gt;1.33.8&lt;/strong&gt; - it has the latest security and bug fixes.&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 4: Validate and Soak (1 Week)
&lt;/h3&gt;

&lt;p&gt;Run 1.33 in production for at least a few days. Monitor:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Watch for elevated error rates&lt;/span&gt;

kubectl get events &lt;span class="nt"&gt;--sort-by&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;'.lastTimestamp'&lt;/span&gt; &lt;span class="nt"&gt;-A&lt;/span&gt; | &lt;span class="nb"&gt;head&lt;/span&gt; &lt;span class="nt"&gt;-50&lt;/span&gt;

&lt;span class="c"&gt;# Check component health&lt;/span&gt;

kubectl get componentstatuses

&lt;span class="c"&gt;# Monitor pod restarts (a spike means something broke)&lt;/span&gt;

kubectl get pods &lt;span class="nt"&gt;-A&lt;/span&gt; &lt;span class="nt"&gt;--sort-by&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;'.status.containerStatuses[0].restartCount'&lt;/span&gt; | &lt;span class="nb"&gt;tail&lt;/span&gt; &lt;span class="nt"&gt;-20&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Step 5: Continue to 1.34 (Early March)
&lt;/h3&gt;

&lt;p&gt;Once 1.33 is stable, repeat the process for 1.34. This is your final destination - 8 months of support runway, the features you need, and a stable foundation.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Clock Is Ticking
&lt;/h2&gt;

&lt;p&gt;February 28 is not a soft deadline. It's the day your clusters become unpatched infrastructure. Every day after that, your attack surface grows and your ecosystem compatibility shrinks.&lt;/p&gt;

&lt;p&gt;The upgrade from 1.32 to 1.33 (and then 1.34) is well-trodden ground. Thousands of clusters have made this jump. The tooling works. The docs are solid. The features are worth it.&lt;/p&gt;

&lt;p&gt;What's not worth it is explaining to your security team in April why you're running a Kubernetes version with known, unpatched CVEs because the upgrade "wasn't prioritized."&lt;/p&gt;

&lt;p&gt;Start today. Your future on-call self will thank you.&lt;/p&gt;




&lt;h2&gt;
  
  
  Related Reading
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://releaserun.com/kubernetes-support-and-eol-policy/" rel="noopener noreferrer"&gt;Kubernetes EOL Policy Explained&lt;/a&gt; - how the support lifecycle works and what each phase means for your clusters&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://releaserun.com/kubernetes-upgrade-checklist/" rel="noopener noreferrer"&gt;Kubernetes Upgrade Checklist (Minor Version)&lt;/a&gt; - the step-by-step runbook for any minor version upgrade&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://releaserun.com/debugging-kubernetes-nodes-notready/" rel="noopener noreferrer"&gt;Debugging Kubernetes Nodes in NotReady State&lt;/a&gt; - essential troubleshooting for when nodes go dark during or after upgrades&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://releaserun.com/kubernetes-1-35-release-preview/" rel="noopener noreferrer"&gt;Kubernetes 1.35 Release: What Can Break Your Cluster&lt;/a&gt; - if you're considering jumping all the way to 1.35&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://releaserun.com/kubernetes-distributions-compared/" rel="noopener noreferrer"&gt;Popular Kubernetes Distributions Compared (2026)&lt;/a&gt; - EKS, GKE, AKS, and self-managed options compared&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://releaserun.com/kubernetes-statistics-adoption-2026/" rel="noopener noreferrer"&gt;Kubernetes Statistics and Adoption Trends in 2026&lt;/a&gt; - the data behind K8s adoption and version usage&lt;/li&gt;
&lt;/ul&gt;




&lt;p&gt;&lt;em&gt;Track Kubernetes version health, EOL dates, and upgrade paths in real-time at &lt;a href="https://releaserun.com" rel="noopener noreferrer"&gt;ReleaseRun&lt;/a&gt;. We monitor the releases so you don't miss a deadline.&lt;/em&gt;&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;🔍 Related tool:&lt;/strong&gt; &lt;a href="https://releaserun.com/tools/kubernetes-security-linter/" rel="noopener noreferrer"&gt;Kubernetes YAML Security Linter&lt;/a&gt; — paste any K8s manifest and scan for 12 security issues with an A–F grade. Free, browser-based.&lt;/p&gt;

</description>
      <category>kubernetes</category>
      <category>devops</category>
      <category>security</category>
      <category>cloud</category>
    </item>
    <item>
      <title>Popular Kubernetes Distributions Compared (2026)</title>
      <dc:creator>Matheus</dc:creator>
      <pubDate>Sat, 21 Feb 2026 17:17:35 +0000</pubDate>
      <link>https://dev.to/matheus_releaserun/popular-kubernetes-distributions-compared-2026-186o</link>
      <guid>https://dev.to/matheus_releaserun/popular-kubernetes-distributions-compared-2026-186o</guid>
      <description>&lt;p&gt;Choosing a Kubernetes distribution is one of the first decisions platform teams face. The ecosystem now includes over 200 certified options -- from lightweight single-node setups to enterprise platforms managing thousands of clusters.&lt;/p&gt;

&lt;p&gt;Here's a practical comparison of the most popular distributions, what each is best suited for, and how to decide.&lt;/p&gt;

&lt;h2&gt;
  
  
  What Is a Kubernetes Distribution?
&lt;/h2&gt;

&lt;p&gt;A Kubernetes distribution is a packaged version of upstream Kubernetes that adds installation tooling, default configurations, and often additional features like networking, storage, and security integrations.&lt;/p&gt;

&lt;p&gt;Think of it like Linux distributions: Ubuntu, Red Hat, and Alpine all run the Linux kernel, but each packages it differently for different use cases. Kubernetes distributions work the same way.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Major Distributions
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Managed Cloud Services (Hosted)
&lt;/h3&gt;

&lt;p&gt;These are fully managed -- the cloud provider handles the control plane, upgrades, and infrastructure.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Amazon EKS&lt;/strong&gt; -- The market leader (~42% share). Tight integration with AWS services (IAM, VPC, ALB). Supports both cloud and on-premises deployment (EKS Anywhere). Best for teams already invested in AWS.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Google GKE&lt;/strong&gt; -- Built by the team that created Kubernetes. Fastest to adopt new K8s versions (often same-day support for new releases). Autopilot mode eliminates node management entirely. Best for teams wanting the most "pure" Kubernetes experience.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Azure AKS&lt;/strong&gt; -- Deep integration with Azure Active Directory and Azure DevOps. Strong Windows container support. Free control plane (you only pay for worker nodes). Best for Microsoft-centric enterprises.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;DigitalOcean Kubernetes (DOKS)&lt;/strong&gt; -- Simplest managed option. Free control plane, straightforward pricing. Limited to smaller scale. Best for startups and small teams.&lt;/p&gt;

&lt;h3&gt;
  
  
  Self-Managed (On-Premises / Hybrid)
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;kubeadm&lt;/strong&gt; -- The official Kubernetes bootstrapping tool. Minimal opinions -- gives you vanilla upstream Kubernetes. Requires you to handle networking, storage, and upgrades yourself. Best for teams that want full control and understand the internals.&lt;/p&gt;
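&lt;p&gt;A minimal bootstrap sketch of what "full control" means in practice (assumes a host already prepared with a container runtime and kubelet; the CIDR is just an example value):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;# Initialise the control plane; kubeadm prints the join command for workers
sudo kubeadm init --pod-network-cidr=10.244.0.0/16

# kubeadm deliberately leaves networking to you -- install a CNI plugin
# (Flannel, Calico, Cilium, ...) yourself
kubectl get nodes   # nodes stay NotReady until a CNI is installed
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;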

&lt;p&gt;&lt;strong&gt;Red Hat OpenShift&lt;/strong&gt; -- Enterprise Kubernetes platform with built-in CI/CD (Tekton), developer portal, and strict security defaults (SELinux, SCCs). Runs on any infrastructure. Opinionated but comprehensive. Best for regulated enterprises that need a complete platform, not just an orchestrator.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Rancher (by SUSE)&lt;/strong&gt; -- Multi-cluster management platform. Can manage EKS, GKE, AKS, and on-prem clusters from a single dashboard. Includes its own lightweight distribution (RKE2). Best for teams managing Kubernetes across multiple environments.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;VMware Tanzu&lt;/strong&gt; -- Integrates Kubernetes into existing VMware infrastructure. Lets teams run containers alongside traditional VMs. Best for organizations transitioning from VMware to containers gradually.&lt;/p&gt;

&lt;h3&gt;
  
  
  Lightweight / Edge Distributions
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;k3s&lt;/strong&gt; -- Rancher's lightweight Kubernetes distribution. Single binary under 100MB. Ideal for edge computing, IoT, CI/CD pipelines, and development environments. Strips out cloud-provider-specific code and uses SQLite instead of etcd by default.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;MicroK8s (Canonical)&lt;/strong&gt; -- Snap-packaged Kubernetes from the Ubuntu team. Zero-ops single-node to multi-node clusters. Strong add-on ecosystem (Istio, Knative, GPU support). Best for developer workstations and Ubuntu-based infrastructure.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;minikube&lt;/strong&gt; -- Local Kubernetes for development and testing. Runs inside a VM or container on your laptop. Not intended for production. Best for learning Kubernetes and local development.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;kind (Kubernetes in Docker)&lt;/strong&gt; -- Runs Kubernetes clusters using Docker containers as nodes. Designed for testing Kubernetes itself. Extremely fast to spin up and tear down. Best for CI/CD pipelines and integration testing.&lt;/p&gt;
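&lt;p&gt;To make that speed concrete, here is roughly what a throwaway kind cluster looks like in a CI job (assumes Docker and the &lt;code&gt;kind&lt;/code&gt; binary are installed):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;# Create, use, and destroy a cluster in minutes
kind create cluster --name ci-test
kubectl cluster-info --context kind-ci-test

# ... run integration tests against the cluster ...

kind delete cluster --name ci-test
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;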

&lt;h2&gt;
  
  
  Comparison Table
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Distribution&lt;/th&gt;
&lt;th&gt;Type&lt;/th&gt;
&lt;th&gt;Best For&lt;/th&gt;
&lt;th&gt;K8s Version Lag&lt;/th&gt;
&lt;th&gt;Cost&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Amazon EKS&lt;/td&gt;
&lt;td&gt;Managed&lt;/td&gt;
&lt;td&gt;AWS-native teams&lt;/td&gt;
&lt;td&gt;1-2 weeks&lt;/td&gt;
&lt;td&gt;Pay per cluster + nodes&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Google GKE&lt;/td&gt;
&lt;td&gt;Managed&lt;/td&gt;
&lt;td&gt;K8s-first teams&lt;/td&gt;
&lt;td&gt;Same day&lt;/td&gt;
&lt;td&gt;Pay per cluster + nodes&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Azure AKS&lt;/td&gt;
&lt;td&gt;Managed&lt;/td&gt;
&lt;td&gt;Microsoft shops&lt;/td&gt;
&lt;td&gt;1-2 weeks&lt;/td&gt;
&lt;td&gt;Free control plane&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;DigitalOcean&lt;/td&gt;
&lt;td&gt;Managed&lt;/td&gt;
&lt;td&gt;Startups&lt;/td&gt;
&lt;td&gt;2-4 weeks&lt;/td&gt;
&lt;td&gt;Free control plane&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;kubeadm&lt;/td&gt;
&lt;td&gt;Self-managed&lt;/td&gt;
&lt;td&gt;Full control&lt;/td&gt;
&lt;td&gt;Same day&lt;/td&gt;
&lt;td&gt;Free (your infra)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;OpenShift&lt;/td&gt;
&lt;td&gt;Platform&lt;/td&gt;
&lt;td&gt;Enterprises&lt;/td&gt;
&lt;td&gt;2-4 weeks&lt;/td&gt;
&lt;td&gt;Subscription&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Rancher/RKE2&lt;/td&gt;
&lt;td&gt;Multi-cluster&lt;/td&gt;
&lt;td&gt;Hybrid/multi-cloud&lt;/td&gt;
&lt;td&gt;1-2 weeks&lt;/td&gt;
&lt;td&gt;Free + Enterprise tier&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;k3s&lt;/td&gt;
&lt;td&gt;Lightweight&lt;/td&gt;
&lt;td&gt;Edge/IoT&lt;/td&gt;
&lt;td&gt;1-2 weeks&lt;/td&gt;
&lt;td&gt;Free&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;MicroK8s&lt;/td&gt;
&lt;td&gt;Lightweight&lt;/td&gt;
&lt;td&gt;Dev/Ubuntu&lt;/td&gt;
&lt;td&gt;1-3 weeks&lt;/td&gt;
&lt;td&gt;Free&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h2&gt;
  
  
  How to Choose
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Start with this question: Who manages the infrastructure?&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;"Not us"&lt;/strong&gt; → Managed service (EKS, GKE, AKS). Pick based on your cloud provider.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;"Us, on our hardware"&lt;/strong&gt; → kubeadm (DIY), OpenShift (enterprise), or Rancher (multi-cluster).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;"It's for development/testing"&lt;/strong&gt; → k3s, minikube, or kind.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;"It's for edge/IoT"&lt;/strong&gt; → k3s or MicroK8s.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Then consider:&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Team size&lt;/strong&gt;: Small teams benefit from managed services or opinionated platforms. Large platform teams can handle kubeadm.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Compliance requirements&lt;/strong&gt;: Regulated industries often need OpenShift or Tanzu for their built-in security controls.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Multi-cloud needs&lt;/strong&gt;: Rancher or Anthos (Google's hybrid offering) if you're running across providers.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Version freshness&lt;/strong&gt;: If running the latest Kubernetes version matters, GKE and kubeadm track upstream fastest.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  Version Support Across Distributions
&lt;/h2&gt;

&lt;p&gt;Not all distributions support the same Kubernetes versions at the same time. When upstream Kubernetes releases version 1.35, managed providers typically need 1-4 weeks to certify and offer it.&lt;/p&gt;
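&lt;p&gt;Before comparing lag tables, check what you are actually running -- the control plane and kubelets can drift apart. A quick sketch (assumes &lt;code&gt;jq&lt;/code&gt; is installed for the JSON case):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;# Control plane version
kubectl version --output=json | jq -r '.serverVersion.gitVersion'

# Kubelet version per node -- these can lag the control plane
kubectl get nodes -o custom-columns=NAME:.metadata.name,VERSION:.status.nodeInfo.kubeletVersion
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;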

&lt;p&gt;&lt;strong&gt;Check current version support for any distribution&lt;/strong&gt; on our Kubernetes Releases hub, which tracks every supported version with live health grades and EOL dates.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Key dates to know:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Kubernetes 1.32 reaches end of life &lt;strong&gt;February 28, 2026&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Kubernetes 1.36 expected &lt;strong&gt;April 2026&lt;/strong&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Track all Kubernetes versions, EOL dates, and security status in real time at ReleaseRun.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Maintained by ReleaseRun -- tracking release health for 300+ software products. Last updated: February 2026.&lt;/em&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  Related Reading
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://releaserun.com/kubernetes-1-32-end-of-life-migration-playbook/" rel="noopener noreferrer"&gt;Kubernetes 1.32 End of Life: Migration Playbook&lt;/a&gt; -- EOL Feb 28 -- upgrade path for every distribution&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://releaserun.com/kubernetes-statistics-adoption-2026/" rel="noopener noreferrer"&gt;Kubernetes Statistics and Adoption Trends in 2026&lt;/a&gt; -- which distributions are gaining market share&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://releaserun.com/docker-vs-kubernetes-production-2026-decision-rubric/" rel="noopener noreferrer"&gt;Docker vs Kubernetes in Production (2026)&lt;/a&gt; -- when you don't need a full K8s distribution&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://releaserun.com/kubernetes-support-and-eol-policy/" rel="noopener noreferrer"&gt;Kubernetes EOL Policy Explained&lt;/a&gt; -- how the support lifecycle affects your distribution choice&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://releaserun.com/kubernetes-gateway-api-vs-ingress-vs-service-loadbalancer-what-to-use-in-2026-migration-paths/" rel="noopener noreferrer"&gt;Gateway API vs Ingress vs LoadBalancer&lt;/a&gt; -- networking options across distributions&lt;/li&gt;
&lt;/ul&gt;




&lt;p&gt;&lt;strong&gt;🔍 Related tool:&lt;/strong&gt; &lt;a href="https://releaserun.com/tools/kubernetes-security-linter/" rel="noopener noreferrer"&gt;Kubernetes YAML Security Linter&lt;/a&gt; — paste any K8s manifest and scan for 12 security issues with an A–F grade. Free, browser-based.&lt;/p&gt;

</description>
      <category>kubernetes</category>
      <category>devops</category>
      <category>cloud</category>
      <category>infrastructure</category>
    </item>
    <item>
      <title>Node.js 20 End of Life: Migration Playbook for April 30, 2026</title>
      <dc:creator>Matheus</dc:creator>
      <pubDate>Sat, 21 Feb 2026 17:16:55 +0000</pubDate>
      <link>https://dev.to/matheus_releaserun/nodejs-20-end-of-life-migration-playbook-for-april-30-2026-2onh</link>
      <guid>https://dev.to/matheus_releaserun/nodejs-20-end-of-life-migration-playbook-for-april-30-2026-2onh</guid>
      <description>&lt;p&gt;Node.js 20 reaches end of life on April 30, 2026.&lt;/p&gt;

&lt;p&gt;If you are reading this in March or April, you are already behind. Node.js EOL dates do not come with a grace period. On May 1st, no more security patches. No more CVE fixes. The npm ecosystem moves on, and packages start dropping support in their CI matrices before the EOL date even arrives.&lt;/p&gt;

&lt;p&gt;I have seen teams discover they are running an EOL runtime at the worst possible moment -- during a security incident, when the fix only ships for supported versions. This playbook covers what Node 20 EOL means, which version to move to, what breaks along the way, and the exact steps to migrate without a production outage.&lt;/p&gt;

&lt;h2&gt;
  
  
  TL;DR -- What to do and when
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Right now (February):&lt;/strong&gt; Audit every service, container, Lambda function, and CI pipeline running Node 20. Run your test suite on Node 22.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;March:&lt;/strong&gt; Migrate production workloads to Node 22 LTS (recommended) or Node 24 if you need the latest features. Deploy behind canary or feature flags.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Early April:&lt;/strong&gt; Clean up stragglers -- serverless functions, internal tools, batch jobs, developer machines.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;April 30:&lt;/strong&gt; Deadline. You want to be done a week before, not on the day.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If you do one thing today, run &lt;code&gt;node --version&lt;/code&gt; across every production host, container, and CI runner. The number of places pinning Node 20 will surprise you.&lt;/p&gt;
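&lt;p&gt;One way to fan that check out (a hypothetical sketch -- &lt;code&gt;hosts.txt&lt;/code&gt; is an inventory file you would maintain yourself):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;# Report the Node version on every host listed in hosts.txt
while read -r host; do
  printf '%s: ' "$host"
  ssh "$host" 'node --version' 2&gt;/dev/null || echo 'no node or unreachable'
done &lt; hosts.txt
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;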

&lt;h2&gt;
  
  
  What "end of life" actually means for Node 20
&lt;/h2&gt;

&lt;p&gt;Node.js 20 entered Maintenance LTS on October 22, 2024. Since then, it only receives &lt;a href="https://releaserun.com/nodejs-20-20-0-release-notes-security-patches/" rel="noopener noreferrer"&gt;critical bug fixes and security patches&lt;/a&gt;. On April 30, 2026, even that stops.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;No more security patches:&lt;/strong&gt; If a vulnerability is found in Node 20 after April 30, the Node.js team will only patch supported versions (22, 24+). You get nothing.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;npm ecosystem moves on:&lt;/strong&gt; Package authors drop Node 20 from their &lt;code&gt;engines&lt;/code&gt; field and CI matrices. Some already have. When a package you depend on releases a version that requires Node 22+, your lockfile becomes a ticking time bomb.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Cloud runtimes deprecate:&lt;/strong&gt; AWS Lambda, Google Cloud Functions, and Azure Functions will deprecate their Node 20 runtimes on their own timelines. AWS gives at least 180 days' notice and phases out in stages: first new function creation is blocked, then configuration updates, though existing invocations can continue indefinitely on deprecated runtimes. Other providers have similar but not identical policies.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Compliance gaps:&lt;/strong&gt; SOC 2, PCI DSS, and ISO 27001 all require running supported software. An EOL runtime is a finding waiting to happen.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Your code still runs. Node.js does not brick itself. But security coverage evaporates and the maintenance burden increases every week as the ecosystem leaves you behind.&lt;/p&gt;

&lt;h2&gt;
  
  
  Where Node 20 is hiding
&lt;/h2&gt;

&lt;p&gt;The obvious places are easy. The ones that bite you are the ones nobody remembers deploying.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Docker base images:&lt;/strong&gt; &lt;code&gt;node:20&lt;/code&gt;, &lt;code&gt;node:20-slim&lt;/code&gt;, &lt;code&gt;node:20-alpine&lt;/code&gt;. Search your Dockerfiles: &lt;code&gt;grep -rn "FROM node:20" . --include="Dockerfile*"&lt;/code&gt;. Check multi-stage builds too.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;.nvmrc and .node-version files:&lt;/strong&gt; These pin the Node version for local development and often get copied into CI. Search: &lt;code&gt;find . -name ".nvmrc" -o -name ".node-version" | xargs grep "20"&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;package.json engines field:&lt;/strong&gt; &lt;code&gt;grep -rn '"engines"' . --include="package.json" -A 3 | grep "20"&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;CI/CD pipelines:&lt;/strong&gt; GitHub Actions (&lt;code&gt;setup-node&lt;/code&gt;), GitLab CI, CircleCI, and Jenkins configs. Search for &lt;code&gt;node-version: '20'&lt;/code&gt; or &lt;code&gt;NODE_VERSION: 20&lt;/code&gt; across all YAML files.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;AWS Lambda:&lt;/strong&gt; Check runtime settings: &lt;code&gt;aws lambda list-functions --query 'Functions[?Runtime==`nodejs20.x`].[FunctionName,Runtime]' --output table&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Vercel / Netlify / Cloudflare Workers:&lt;/strong&gt; Check project settings for Node version overrides. Vercel uses &lt;code&gt;engines.node&lt;/code&gt; in package.json. Netlify uses environment variables. Cloudflare Workers has its own compatibility dates.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Tooling:&lt;/strong&gt; Husky, lint-staged, Prettier, ESLint config runners -- these run on your dev machine's Node version, which developers may not have updated.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Which version to upgrade to
&lt;/h2&gt;

&lt;p&gt;Short answer: &lt;strong&gt;&lt;a href="https://releaserun.com/node-20-vs-22-vs-24-which-node-js-lts-should-you-run-in-production/" rel="noopener noreferrer"&gt;Node 22 LTS&lt;/a&gt;&lt;/strong&gt; for production. &lt;strong&gt;Node 24&lt;/strong&gt; if you are already running it in development and can tolerate a shorter track record.&lt;/p&gt;

&lt;h3&gt;
  
  
  Node 22 LTS (recommended for most teams)
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Entered Active LTS in October 2024, Maintenance LTS started October 21, 2025&lt;/li&gt;
&lt;li&gt;EOL April 30, 2027 -- gives you a full year of support after Node 20 dies&lt;/li&gt;
&lt;li&gt;V8 engine 12.4 -- significant performance improvements over Node 20's V8 11.3&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Key additions over Node 20:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;require()&lt;/code&gt; now works with ES modules (release candidate status as of 22.x -- usable in production, but check your specific minor version) -- the biggest quality-of-life improvement in years&lt;/li&gt;
&lt;li&gt;Built-in &lt;code&gt;node --watch&lt;/code&gt; is stable (no more nodemon for simple use cases)&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;fetch()&lt;/code&gt; and &lt;code&gt;WebStreams&lt;/code&gt; are stable (no longer experimental)&lt;/li&gt;
&lt;li&gt;Built-in WebSocket client (&lt;code&gt;WebSocket&lt;/code&gt; global) -- stable (was experimental behind a flag in Node 20.10+)&lt;/li&gt;
&lt;li&gt;Improved test runner (&lt;code&gt;node:test&lt;/code&gt;) with snapshot testing and coverage -- the test runner is stable in Node 20, but Node 22 adds significant features&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;glob&lt;/code&gt; and &lt;code&gt;matchesGlob&lt;/code&gt; in &lt;code&gt;node:fs&lt;/code&gt; and &lt;code&gt;node:path&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Task runner: &lt;code&gt;node --run&lt;/code&gt; as a faster alternative to &lt;code&gt;npm run&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;&lt;p&gt;&lt;strong&gt;Breaking changes from Node 20:&lt;/strong&gt; V8 upgrade may affect native addons compiled against Node 20's ABI. Rebuild native modules with &lt;code&gt;npm rebuild&lt;/code&gt; or &lt;code&gt;node-gyp rebuild&lt;/code&gt;.&lt;/p&gt;&lt;/li&gt;

&lt;/ul&gt;

&lt;h3&gt;
  
  
  Node 24 (for early adopters)
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Released May 6, 2025, entered Active LTS October 2025&lt;/li&gt;
&lt;li&gt;EOL April 2028 -- nearly two years of runway&lt;/li&gt;
&lt;li&gt;V8 13.x with further performance improvements&lt;/li&gt;
&lt;li&gt;Permissions model stable, TypeScript type stripping is now stable (the old &lt;code&gt;--experimental-strip-types&lt;/code&gt; flag was removed in 24.12+)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Risk:&lt;/strong&gt; Some packages may not have been fully tested against Node 24's V8 engine. Native addons are the usual pain point.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;My recommendation for February 2026: jump to Node 22 LTS. It is battle-tested, has the widest ecosystem compatibility, and gives you a year before you need to think about versions again. If you are starting a new project, consider Node 24 from day one.&lt;/p&gt;

&lt;h2&gt;
  
  
  What breaks when you upgrade from Node 20 to 22
&lt;/h2&gt;

&lt;p&gt;The gap is one LTS version (20 → 22), which is the smallest possible LTS jump. Good news: this is usually straightforward. Bad news: "usually" is not "always."&lt;/p&gt;

&lt;h3&gt;
  
  
  V8 engine changes
&lt;/h3&gt;

&lt;p&gt;Node 22 ships V8 12.4 (Node 20 had V8 11.3). This matters if you:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Use native addons compiled against Node 20. &lt;strong&gt;Fix:&lt;/strong&gt; &lt;code&gt;npm rebuild&lt;/code&gt; after upgrading. Most addons recompile automatically, but some with pinned prebuilt binaries (sharp, bcrypt, better-sqlite3) may need an explicit version bump.&lt;/li&gt;
&lt;li&gt;Rely on specific V8 flags for performance tuning. Some flags change between V8 versions. Check your &lt;code&gt;--v8-*&lt;/code&gt; flags still exist.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Deprecated APIs removed or changed
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;punycode&lt;/code&gt; module: Runtime deprecation warning in Node 20, still importable. In Node 22, the warning is louder and the module is scheduled for removal. Use the &lt;code&gt;punycode/&lt;/code&gt; npm package instead.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;SlowBuffer&lt;/code&gt;: If you somehow still use this, switch to &lt;code&gt;Buffer.allocUnsafe()&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;url.parse()&lt;/code&gt;: Still works but URL constructor is preferred. Some edge cases around auth parsing were tightened in Node 22.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;OpenSSL 3.x changes:&lt;/strong&gt; Node 22 may use a newer OpenSSL patch that affects TLS behavior. If you connect to systems with legacy TLS configurations, test your HTTPS connections thoroughly.&lt;/li&gt;
&lt;/ul&gt;
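&lt;p&gt;For the &lt;code&gt;url.parse()&lt;/code&gt; case, the replacement is mechanical -- the WHATWG &lt;code&gt;URL&lt;/code&gt; constructor exposes the same pieces with stricter parsing:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;# Legacy parser -- still works, but deprecated
node -e 'console.log(require("node:url").parse("https://example.com/a?b=1").hostname)'
# prints: example.com

# WHATWG URL constructor -- preferred on Node 20, 22, and 24
node -e 'console.log(new URL("https://example.com/a?b=1").hostname)'
# prints: example.com
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;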

&lt;h3&gt;
  
  
  ESM/CJS interop changes
&lt;/h3&gt;

&lt;p&gt;This is the area most likely to cause confusion, not breakage:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Node 22 supports &lt;code&gt;require()&lt;/code&gt; for ES modules. This is unflagged but at "release candidate" stability (1.2) -- usable and increasingly relied on, but not yet fully stable. It does not break existing CJS code.&lt;/li&gt;
&lt;li&gt;If you have a mixed ESM/CJS codebase, test both &lt;code&gt;import&lt;/code&gt; and &lt;code&gt;require&lt;/code&gt; paths after upgrading.&lt;/li&gt;
&lt;li&gt;The &lt;code&gt;"type": "module"&lt;/code&gt; field in package.json behaves the same way. No changes there.&lt;/li&gt;
&lt;/ul&gt;
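&lt;p&gt;A quick way to see the new interop in action (a sketch that writes a throwaway module to &lt;code&gt;/tmp&lt;/code&gt; -- on Node 20 the &lt;code&gt;require&lt;/code&gt; line throws &lt;code&gt;ERR_REQUIRE_ESM&lt;/code&gt;, on Node 22 it just works):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;# A tiny ES module
mkdir -p /tmp/esm-demo
echo 'export const greet = (n) =&gt; `hello, ${n}`;' &gt; /tmp/esm-demo/greet.mjs

# CJS requiring ESM -- unflagged in Node 22
node -e 'const { greet } = require("/tmp/esm-demo/greet.mjs"); console.log(greet("node 22"));'
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;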

&lt;h2&gt;
  
  
  Step-by-step migration
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1. Audit your Node.js footprint
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Check all hosts and containers&lt;/span&gt;
node &lt;span class="nt"&gt;--version&lt;/span&gt;

&lt;span class="c"&gt;# Find pinned versions in your codebase&lt;/span&gt;
&lt;span class="nb"&gt;grep&lt;/span&gt; &lt;span class="nt"&gt;-rn&lt;/span&gt; &lt;span class="s2"&gt;"FROM node:20"&lt;/span&gt; &lt;span class="nb"&gt;.&lt;/span&gt; &lt;span class="nt"&gt;--include&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"Dockerfile*"&lt;/span&gt;
find &lt;span class="nb"&gt;.&lt;/span&gt; &lt;span class="nt"&gt;-name&lt;/span&gt; &lt;span class="s2"&gt;".nvmrc"&lt;/span&gt; &lt;span class="nt"&gt;-exec&lt;/span&gt; &lt;span class="nb"&gt;cat&lt;/span&gt; &lt;span class="o"&gt;{}&lt;/span&gt; &lt;span class="se"&gt;\;&lt;/span&gt; &lt;span class="nt"&gt;-print&lt;/span&gt;
find &lt;span class="nb"&gt;.&lt;/span&gt; &lt;span class="nt"&gt;-name&lt;/span&gt; &lt;span class="s2"&gt;".node-version"&lt;/span&gt; &lt;span class="nt"&gt;-exec&lt;/span&gt; &lt;span class="nb"&gt;cat&lt;/span&gt; &lt;span class="o"&gt;{}&lt;/span&gt; &lt;span class="se"&gt;\;&lt;/span&gt; &lt;span class="nt"&gt;-print&lt;/span&gt;
&lt;span class="nb"&gt;grep&lt;/span&gt; &lt;span class="nt"&gt;-rn&lt;/span&gt; &lt;span class="s1"&gt;'"node"'&lt;/span&gt; &lt;span class="nb"&gt;.&lt;/span&gt; &lt;span class="nt"&gt;--include&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"package.json"&lt;/span&gt; | &lt;span class="nb"&gt;grep&lt;/span&gt; &lt;span class="s2"&gt;"20"&lt;/span&gt;

&lt;span class="c"&gt;# Check AWS Lambda functions&lt;/span&gt;
aws lambda list-functions &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--query&lt;/span&gt; &lt;span class="s1"&gt;'Functions[?Runtime==`nodejs20.x`].[FunctionName]'&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--output&lt;/span&gt; table
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  2. Test on Node 22 locally
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight docker"&gt;&lt;code&gt;&lt;span class="c"&gt;# Using nvm&lt;/span&gt;
nvm install 22
nvm use 22
npm ci
npm test

&lt;span class="c"&gt;# Or with Docker&lt;/span&gt;
&lt;span class="c"&gt;# Before&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt;&lt;span class="s"&gt; node:20-slim&lt;/span&gt;
&lt;span class="c"&gt;# After&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt;&lt;span class="s"&gt; node:22-slim&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Watch for:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Native addon compilation failures (most common: sharp, bcrypt, better-sqlite3, canvas)&lt;/li&gt;
&lt;li&gt;Test failures from tightened URL parsing or crypto behavior&lt;/li&gt;
&lt;li&gt;Deprecation warnings that became errors&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  3. Fix dependency issues
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Rebuild all native addons&lt;/span&gt;
npm rebuild

&lt;span class="c"&gt;# Check for packages that declare Node engine requirements&lt;/span&gt;
npx check-engines

&lt;span class="c"&gt;# Update packages that need newer versions for Node 22&lt;/span&gt;
npm outdated
npm update
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Common packages that needed updates for Node 22:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;sharp:&lt;/strong&gt; Needs 0.33+ for Node 22 prebuilt binaries&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;bcrypt:&lt;/strong&gt; Needs 5.1+ for Node 22 ABI compatibility&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;node-sass:&lt;/strong&gt; Dead project. Switch to &lt;code&gt;sass&lt;/code&gt; (Dart Sass) immediately -- this will not get Node 22 support.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;better-sqlite3:&lt;/strong&gt; Needs 11+ for Node 22&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Prisma:&lt;/strong&gt; 5.x supports Node 22. If you are on Prisma 4.x, this is a good time to upgrade.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  4. Update CI pipelines
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="c1"&gt;# GitHub Actions -- run both during migration&lt;/span&gt;
&lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;uses&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;actions/setup-node@v6&lt;/span&gt;
  &lt;span class="na"&gt;with&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;node-version&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s1"&gt;'&lt;/span&gt;&lt;span class="s"&gt;22'&lt;/span&gt;  &lt;span class="c1"&gt;# was '20'&lt;/span&gt;

&lt;span class="c1"&gt;# To test both versions in a matrix:&lt;/span&gt;
&lt;span class="na"&gt;strategy&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;matrix&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;node-version&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;'&lt;/span&gt;&lt;span class="s"&gt;20'&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'&lt;/span&gt;&lt;span class="s"&gt;22'&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  5. Update Docker base images
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight docker"&gt;&lt;code&gt;&lt;span class="c"&gt;# Pin the specific LTS version&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt;&lt;span class="s"&gt; node:22-slim&lt;/span&gt;

&lt;span class="c"&gt;# If you were on Alpine&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt;&lt;span class="s"&gt; node:22-alpine&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Important:&lt;/strong&gt; If you use multi-stage builds, update ALL stages -- not just the final one. A common mistake is updating the runtime stage but leaving the builder stage on Node 20.&lt;/p&gt;
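&lt;p&gt;For example (stage names and paths here are placeholders -- the point is that every &lt;code&gt;FROM&lt;/code&gt; line changes):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight docker"&gt;&lt;code&gt;# Builder stage -- the one people forget
FROM node:22-slim AS build
WORKDIR /app
COPY package*.json ./
RUN npm ci
COPY . .
RUN npm run build

# Runtime stage
FROM node:22-slim
WORKDIR /app
COPY --from=build /app/dist ./dist
CMD ["node", "dist/server.js"]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;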

&lt;h3&gt;
  
  
  6. Update serverless runtimes
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="c1"&gt;# AWS Lambda -- update runtime in SAM/CloudFormation&lt;/span&gt;
&lt;span class="na"&gt;Runtime&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;nodejs22.x&lt;/span&gt;  &lt;span class="c1"&gt;# was nodejs20.x&lt;/span&gt;

&lt;span class="c1"&gt;# Or via AWS CLI&lt;/span&gt;
&lt;span class="s"&gt;aws lambda update-function-configuration \&lt;/span&gt;
  &lt;span class="s"&gt;--function-name my-function \&lt;/span&gt;
  &lt;span class="s"&gt;--runtime nodejs22.x&lt;/span&gt;

&lt;span class="c1"&gt;# Vercel -- update package.json&lt;/span&gt;
&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;engines"&lt;/span&gt;&lt;span class="err"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;{&lt;/span&gt;
  &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;node"&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;22.x"&lt;/span&gt;
&lt;span class="pi"&gt;}&lt;/span&gt;

&lt;span class="c1"&gt;# Netlify -- update environment variable&lt;/span&gt;
&lt;span class="s"&gt;NODE_VERSION=22&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  7. Deploy with a canary
&lt;/h3&gt;

&lt;p&gt;Do not upgrade every service simultaneously. Pick your least-critical production service. Deploy with Node 22. Watch error rates, latency, and memory usage for 48 hours. Then roll forward to the next service.&lt;/p&gt;

&lt;p&gt;Pay particular attention to:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Memory usage (V8 12.4 may have different heap behavior)&lt;/li&gt;
&lt;li&gt;Cold start times in serverless (first request after deploy)&lt;/li&gt;
&lt;li&gt;TLS handshake failures if connecting to legacy systems&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Node 20 vs 22 vs 24: Quick comparison
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;/th&gt;
&lt;th&gt;Node 20 LTS&lt;/th&gt;
&lt;th&gt;Node 22 LTS&lt;/th&gt;
&lt;th&gt;Node 24&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Release date&lt;/td&gt;
&lt;td&gt;April 2023&lt;/td&gt;
&lt;td&gt;April 2024&lt;/td&gt;
&lt;td&gt;May 2025&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Active LTS start&lt;/td&gt;
&lt;td&gt;October 2023&lt;/td&gt;
&lt;td&gt;October 2024&lt;/td&gt;
&lt;td&gt;October 2025&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;EOL&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;April 30, 2026&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;April 30, 2027&lt;/td&gt;
&lt;td&gt;April 30, 2028&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;V8 engine&lt;/td&gt;
&lt;td&gt;11.3&lt;/td&gt;
&lt;td&gt;12.4&lt;/td&gt;
&lt;td&gt;13.x&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;fetch()&lt;/td&gt;
&lt;td&gt;Stable (since 21.x backport)&lt;/td&gt;
&lt;td&gt;Stable&lt;/td&gt;
&lt;td&gt;Stable&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;WebSocket&lt;/td&gt;
&lt;td&gt;Experimental (20.10+, flag)&lt;/td&gt;
&lt;td&gt;Stable&lt;/td&gt;
&lt;td&gt;Stable&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;require(esm)&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;Release candidate&lt;/td&gt;
&lt;td&gt;Stable&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Test runner&lt;/td&gt;
&lt;td&gt;Stable&lt;/td&gt;
&lt;td&gt;Stable (enhanced)&lt;/td&gt;
&lt;td&gt;Stable (enhanced)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Watch mode&lt;/td&gt;
&lt;td&gt;Stable (since 20.13)&lt;/td&gt;
&lt;td&gt;Stable&lt;/td&gt;
&lt;td&gt;Stable&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Ecosystem support&lt;/td&gt;
&lt;td&gt;Universal&lt;/td&gt;
&lt;td&gt;Universal&lt;/td&gt;
&lt;td&gt;Most packages&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h2&gt;
  
  
  The one thing nobody tells you about Node version migrations
&lt;/h2&gt;

&lt;p&gt;The breakage is almost never in your application code. It is in native addons. Specifically, it is the one C++ addon that was compiled against Node 20's ABI and ships a prebuilt binary that does not exist for Node 22 yet.&lt;/p&gt;

&lt;p&gt;When this happens, &lt;code&gt;npm ci&lt;/code&gt; either fails with a compilation error (if you do not have build tools installed) or silently downloads a binary for the wrong ABI (which then crashes at runtime with &lt;code&gt;NODE_MODULE_VERSION mismatch&lt;/code&gt;).&lt;/p&gt;
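
&lt;p&gt;You can check the ABI version a given Node binary expects with a quick one-liner -- this is the number a prebuilt addon must match (the version values in the comments are examples):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;// process.versions.modules is the NODE_MODULE_VERSION that native
// addons are compiled against; a prebuilt binary built for a
// different value crashes at require() time.
console.log(process.version);          // e.g. v22.14.0
console.log(process.versions.modules); // "115" on Node 20, "127" on Node 22
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
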

&lt;p&gt;The fix: always run &lt;code&gt;npm rebuild&lt;/code&gt; after switching Node versions. Add it to your Dockerfile. Add it to your CI setup step. Make it automatic so you never think about it again.&lt;/p&gt;
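
&lt;p&gt;A minimal Dockerfile sketch (the base image and entrypoint are illustrative -- adapt to your build):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight dockerfile"&gt;&lt;code&gt;FROM node:22-slim

WORKDIR /app
COPY package*.json ./

# npm rebuild forces native addons to recompile against this Node ABI
RUN npm ci &amp;&amp; npm rebuild

COPY . .
CMD ["node", "server.js"]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
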




&lt;h2&gt;
  
  
  Related Reading
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://releaserun.com/node-20-vs-22-vs-24-which-node-js-lts-should-you-run-in-production/" rel="noopener noreferrer"&gt;Node 20 vs 22 vs 24: Which LTS Should You Run in Production?&lt;/a&gt; -- detailed comparison of all three versions&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://releaserun.com/nodejs-20-20-0-release-notes-security-patches/" rel="noopener noreferrer"&gt;Node.js 20.20.0 Security Patches&lt;/a&gt; -- the final security releases you should be running&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://releaserun.com/nodejs-22-22-0-release-notes-lts-security/" rel="noopener noreferrer"&gt;Node.js 22.22.0 LTS Security Release&lt;/a&gt; -- what to expect after migrating&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://releaserun.com/nodejs-24-13-1-release-notes/" rel="noopener noreferrer"&gt;Node.js 24.13.1 Release Notes&lt;/a&gt; -- if you're considering the current release line&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://releaserun.com/nodejs-undici-7-18-2-critical-security-patch/" rel="noopener noreferrer"&gt;undici v7.18.2 Critical Security Patch&lt;/a&gt; -- the HTTP client vulnerability that affects all Node versions&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://releaserun.com/tlssocket-default-error-handler-nodejs-22-22-0/" rel="noopener noreferrer"&gt;Node.js 22.22.0 TLSSocket Changes&lt;/a&gt; -- a subtle behaviour change to watch for during migration&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Frequently Asked Questions
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;When exactly does Node 20 reach end of life?&lt;/strong&gt; April 30, 2026. After this date, the Node.js project will not release any further patches for the 20.x line. If a CVE is found in Node 20 after this date, the fix will only ship for Node 22+ and you will need to upgrade to receive it.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Can I skip Node 22 and go straight to Node 24?&lt;/strong&gt; Yes -- Node 24 entered Active LTS in October 2025, so it is a supported migration target. The jump from Node 20 to 24 is larger -- two V8 major versions -- so expect more native addon rebuilds and test more thoroughly. For most teams, Node 22 is the safer choice because it has been in production for over a year.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Does Node 20 EOL affect my operating system?&lt;/strong&gt; Linux distributions that ship Node 20 as a system package (some Debian/Ubuntu versions) may continue to backport security patches on their own timeline. However, this only covers the Node binary itself -- not npm packages, not your application dependencies. For application security, you need an upstream-supported Node version.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Will AWS Lambda stop running Node 20 functions on April 30?&lt;/strong&gt; No. AWS provides at least 180 days' notice before deprecating a runtime, then phases it out in stages: first blocking new function creation, then blocking updates. Existing invocations can continue indefinitely on deprecated runtimes -- AWS does not forcibly stop running functions. But running an EOL runtime in Lambda means neither upstream Node.js nor AWS is patching it, so your functions are exposed to any vulnerabilities found after the EOL date. Do not confuse "still runs" with "still safe."&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;How do I check which Node version my production containers are running?&lt;/strong&gt; If you use Docker, check your base image tags. For running containers: &lt;code&gt;docker exec CONTAINER node --version&lt;/code&gt;. For Kubernetes: &lt;code&gt;kubectl exec POD -- node --version&lt;/code&gt;. For a fleet, consider adding the Node version to your health check endpoint so monitoring can track it.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What about TypeScript?&lt;/strong&gt; TypeScript itself runs on whatever Node version you have -- the compiler is pure JavaScript. The concern is with &lt;code&gt;@types/node&lt;/code&gt;: make sure you update to &lt;code&gt;@types/node@22&lt;/code&gt; to get accurate type definitions for Node 22's APIs. Also check that your &lt;code&gt;tsconfig.json&lt;/code&gt; target and lib settings are appropriate for Node 22's V8 version.&lt;/p&gt;
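
&lt;p&gt;The relevant bits, sketched (exact version numbers and targets depend on your project):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;// package.json -- pin the type definitions to the runtime major
"devDependencies": {
  "@types/node": "^22.0.0"
}

// tsconfig.json -- Node 22's V8 12.4 handles ES2023 output natively
{
  "compilerOptions": {
    "target": "ES2023",
    "module": "NodeNext",
    "moduleResolution": "NodeNext"
  }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
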




&lt;p&gt;&lt;strong&gt;🔍 Related tool:&lt;/strong&gt; &lt;a href="https://releaserun.com/tools/package-json-health/" rel="noopener noreferrer"&gt;npm Package Health Checker&lt;/a&gt; — paste your &lt;code&gt;package.json&lt;/code&gt; and check every dependency for deprecation and staleness. Free.&lt;/p&gt;

</description>
      <category>node</category>
      <category>javascript</category>
      <category>webdev</category>
      <category>devops</category>
    </item>
    <item>
      <title>Kubernetes Events Explained: Types, kubectl Commands, and Observability Patterns</title>
      <dc:creator>Matheus</dc:creator>
      <pubDate>Sat, 21 Feb 2026 17:16:15 +0000</pubDate>
      <link>https://dev.to/matheus_releaserun/kubernetes-events-explained-types-kubectl-commands-and-observability-patterns-4e1m</link>
      <guid>https://dev.to/matheus_releaserun/kubernetes-events-explained-types-kubectl-commands-and-observability-patterns-4e1m</guid>
      <description>&lt;h2&gt;
  
  
  What Are Kubernetes Events?
&lt;/h2&gt;

&lt;p&gt;Every time something happens inside a Kubernetes cluster -- a pod gets scheduled, a container image is pulled, a volume fails to mount -- the control plane records it as an &lt;strong&gt;Event&lt;/strong&gt;. Events are first-class API objects (kind: &lt;code&gt;Event&lt;/code&gt;) that provide a running log of what is happening across your nodes, pods, deployments, and other resources.&lt;/p&gt;

&lt;p&gt;Unlike application logs, which capture output from your code, Kubernetes events describe the &lt;em&gt;lifecycle of cluster objects themselves&lt;/em&gt;. They answer questions like: Why is this pod stuck in Pending? Why did that node go NotReady? Why was my container OOM-killed?&lt;/p&gt;

&lt;p&gt;Events are stored in etcd alongside other API objects and are accessible through the Kubernetes API. They are namespaced resources, meaning each event belongs to a specific namespace (or to the cluster scope for node-level events). Understanding how to read, filter, and export events is one of the most practical debugging skills a Kubernetes operator can develop.&lt;/p&gt;
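
&lt;p&gt;Because events are ordinary API objects, you can also fetch them straight from the API server, which is handy when building tooling:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight console"&gt;&lt;code&gt;&lt;span class="go"&gt;kubectl get --raw /api/v1/namespaces/default/events
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
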

&lt;h2&gt;
  
  
  Event Types: Normal and Warning
&lt;/h2&gt;

&lt;p&gt;Kubernetes classifies every event into one of two types:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Normal&lt;/strong&gt; -- Indicates that something expected happened. A pod was scheduled, a container started, a volume was successfully attached. These events confirm that the system is working as intended.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Warning&lt;/strong&gt; -- Indicates that something unexpected or potentially problematic occurred. A container crashed, an image pull failed, a node ran out of resources. Warning events are the ones you typically want to monitor and alert on.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Here is an example of what a Normal event looks like when a pod starts successfully:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;LAST SEEN   TYPE     REASON      OBJECT          MESSAGE
2m          Normal   Scheduled   pod/web-abc12   Successfully assigned default/web-abc12 to node-3
2m          Normal   Pulling     pod/web-abc12   Pulling image "nginx:1.27"
2m          Normal   Pulled      pod/web-abc12   Successfully pulled image "nginx:1.27" in 1.2s
2m          Normal   Created     pod/web-abc12   Created container nginx
2m          Normal   Started     pod/web-abc12   Started container nginx
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;And here is a Warning event when something goes wrong:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;LAST SEEN   TYPE      REASON             OBJECT          MESSAGE
30s         Warning   FailedScheduling   pod/web-xyz99   0/5 nodes are available: 5 Insufficient memory.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Anatomy of a Kubernetes Event
&lt;/h2&gt;

&lt;p&gt;Each event object contains several fields that together tell you exactly what happened, to which object, and when. Understanding these fields is essential for effective debugging.&lt;/p&gt;

&lt;h3&gt;
  
  
  Key Event Fields
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;type&lt;/strong&gt; -- Either &lt;code&gt;Normal&lt;/code&gt; or &lt;code&gt;Warning&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;reason&lt;/strong&gt; -- A short, CamelCase string that categorizes the event. Examples: &lt;code&gt;Scheduled&lt;/code&gt;, &lt;code&gt;Pulling&lt;/code&gt;, &lt;code&gt;FailedMount&lt;/code&gt;, &lt;code&gt;BackOff&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;message&lt;/strong&gt; -- A human-readable description of what happened.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;involvedObject&lt;/strong&gt; -- The API object the event is about, including its &lt;code&gt;kind&lt;/code&gt;, &lt;code&gt;name&lt;/code&gt;, &lt;code&gt;namespace&lt;/code&gt;, and &lt;code&gt;uid&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;source&lt;/strong&gt; -- The component that generated the event (e.g., &lt;code&gt;kubelet&lt;/code&gt;, &lt;code&gt;default-scheduler&lt;/code&gt;, &lt;code&gt;kube-controller-manager&lt;/code&gt;).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;count&lt;/strong&gt; -- How many times this event has occurred. Kubernetes deduplicates repeated events and increments this counter instead of creating new objects.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;firstTimestamp&lt;/strong&gt; -- When the event was first recorded.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;lastTimestamp&lt;/strong&gt; -- When the event was most recently recorded.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Here is a full event object in YAML format:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;apiVersion&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;v1&lt;/span&gt;
&lt;span class="na"&gt;kind&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Event&lt;/span&gt;
&lt;span class="na"&gt;metadata&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;web-abc12.17f3a2b8c9d1e4f6&lt;/span&gt;
  &lt;span class="na"&gt;namespace&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;default&lt;/span&gt;
  &lt;span class="na"&gt;creationTimestamp&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;2026-02-16T10:30:00Z"&lt;/span&gt;
&lt;span class="na"&gt;involvedObject&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;apiVersion&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;v1&lt;/span&gt;
  &lt;span class="na"&gt;kind&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Pod&lt;/span&gt;
  &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;web-abc12&lt;/span&gt;
  &lt;span class="na"&gt;namespace&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;default&lt;/span&gt;
  &lt;span class="na"&gt;uid&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;a1b2c3d4-e5f6-7890-abcd-ef1234567890&lt;/span&gt;
&lt;span class="na"&gt;reason&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;BackOff&lt;/span&gt;
&lt;span class="na"&gt;message&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Back-off&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;restarting&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;failed&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;container&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;nginx&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;in&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;pod&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;web-abc12_default"&lt;/span&gt;
&lt;span class="na"&gt;source&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;component&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;kubelet&lt;/span&gt;
  &lt;span class="na"&gt;host&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;node-3&lt;/span&gt;
&lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Warning&lt;/span&gt;
&lt;span class="na"&gt;count&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;5&lt;/span&gt;
&lt;span class="na"&gt;firstTimestamp&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;2026-02-16T10:25:00Z"&lt;/span&gt;
&lt;span class="na"&gt;lastTimestamp&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;2026-02-16T10:30:00Z"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Viewing Events with kubectl
&lt;/h2&gt;

&lt;p&gt;The most common way to inspect events is through &lt;code&gt;kubectl&lt;/code&gt;. Here are the commands you will use most often.&lt;/p&gt;

&lt;h3&gt;
  
  
  List All Events in the Current Namespace
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight console"&gt;&lt;code&gt;&lt;span class="go"&gt;kubectl get events
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This returns events in your current namespace (as set by your kubeconfig context). To see events in a different namespace, add the &lt;code&gt;-n&lt;/code&gt; flag. To see events across all namespaces:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight console"&gt;&lt;code&gt;&lt;span class="go"&gt;kubectl get events --all-namespaces
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Sort Events by Time
&lt;/h3&gt;

&lt;p&gt;By default, events are not guaranteed to be in chronological order. Sort them by creation timestamp to see the most recent activity:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight console"&gt;&lt;code&gt;&lt;span class="go"&gt;kubectl get events --sort-by=.metadata.creationTimestamp
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This is one of the most useful flags when triaging an incident. It lets you reconstruct a timeline of what happened in the cluster.&lt;/p&gt;

&lt;h3&gt;
  
  
  Filter Events by Type
&lt;/h3&gt;

&lt;p&gt;To see only Warning events, which are typically the ones that matter during debugging:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight console"&gt;&lt;code&gt;&lt;span class="go"&gt;kubectl get events --field-selector type=Warning
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You can also filter by the involved object. For example, to see events for a specific pod:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight console"&gt;&lt;code&gt;&lt;span class="go"&gt;kubectl get events --field-selector involvedObject.name=web-abc12
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Or combine multiple field selectors:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight console"&gt;&lt;code&gt;&lt;span class="go"&gt;kubectl get events --field-selector type=Warning,involvedObject.kind=Pod
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
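
&lt;p&gt;The filters compose with sorting, so a common incident-triage one-liner is to pull every Warning in the cluster in time order:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight console"&gt;&lt;code&gt;&lt;span class="go"&gt;kubectl get events --all-namespaces --field-selector type=Warning --sort-by=.metadata.creationTimestamp
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
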



&lt;h3&gt;
  
  
  View Events via kubectl describe
&lt;/h3&gt;

&lt;p&gt;The &lt;code&gt;kubectl describe&lt;/code&gt; command shows events at the bottom of its output for any resource. This is often the fastest way to check events for a specific pod:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight console"&gt;&lt;code&gt;&lt;span class="go"&gt;kubectl describe pod web-abc12
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The Events section at the bottom will show recent events related to that pod, sorted chronologically. This is usually the first command you run when a pod is misbehaving.&lt;/p&gt;

&lt;h3&gt;
  
  
  Wide Output and Custom Columns
&lt;/h3&gt;

&lt;p&gt;For more detail, use wide output or custom columns:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight console"&gt;&lt;code&gt;&lt;span class="go"&gt;kubectl get events -o wide
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Or extract specific fields with JSONPath:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight console"&gt;&lt;code&gt;&lt;span class="go"&gt;kubectl get events -o jsonpath='{range .items[*]}{.lastTimestamp}{"\t"}{.type}{"\t"}{.reason}{"\t"}{.message}{"\n"}{end}'
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
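
&lt;p&gt;Recent kubectl versions (1.26+) also ship a dedicated &lt;code&gt;kubectl events&lt;/code&gt; subcommand that sorts chronologically by default and supports similar filtering:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight console"&gt;&lt;code&gt;&lt;span class="go"&gt;kubectl events --types=Warning
kubectl events --for pod/web-abc12 --watch
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
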



&lt;h2&gt;
  
  
  Common Warning Events and What They Mean
&lt;/h2&gt;

&lt;p&gt;Certain warning events appear frequently in production clusters. Knowing what they mean and how to respond to them will save you significant debugging time.&lt;/p&gt;

&lt;h3&gt;
  
  
  FailedScheduling
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Warning   FailedScheduling   pod/app-xyz   0/5 nodes are available: 2 Insufficient cpu, 3 Insufficient memory.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The scheduler cannot find a node with enough resources to place the pod. This usually means you need to scale up your node pool, reduce resource requests, or free up capacity by evicting lower-priority workloads. Check your resource requests and limits against actual node capacity.&lt;/p&gt;

&lt;h3&gt;
  
  
  ImagePullBackOff
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Warning   Failed    pod/app-xyz   Failed to pull image "myregistry.io/app:v2.1": rpc error: unauthorized
Warning   BackOff   pod/app-xyz   Back-off pulling image "myregistry.io/app:v2.1"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The kubelet cannot pull the container image. Common causes include incorrect image tags, missing or expired registry credentials (imagePullSecrets), or network connectivity issues to the registry. To debug, verify the image name and tag are correct, confirm that the imagePullSecret exists in the pod's namespace and contains valid credentials, and test registry connectivity from the node with &lt;code&gt;curl&lt;/code&gt; or &lt;code&gt;crictl pull&lt;/code&gt;.&lt;/p&gt;
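
&lt;p&gt;Two commands cover most of that checklist (the secret name &lt;code&gt;regcred&lt;/code&gt; is illustrative):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight console"&gt;&lt;code&gt;&lt;span class="go"&gt;# Which pull secrets does the pod reference?
kubectl get pod app-xyz -o jsonpath='{.spec.imagePullSecrets[*].name}'

# Decode the registry credentials the secret actually contains
kubectl get secret regcred -o jsonpath='{.data.\.dockerconfigjson}' | base64 -d
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
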

&lt;h3&gt;
  
  
  BackOff (CrashLoopBackOff)
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Warning   BackOff   pod/app-xyz   Back-off restarting failed container app in pod app-xyz_default
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The container keeps crashing and Kubernetes is applying an exponential back-off delay before restarting it. Check the container logs with &lt;code&gt;kubectl logs app-xyz --previous&lt;/code&gt; to see why the application is crashing.&lt;/p&gt;
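
&lt;p&gt;The exit code of the last failed run is also worth pulling, since it distinguishes application errors from OOM kills (137) and other signal terminations:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight console"&gt;&lt;code&gt;&lt;span class="go"&gt;kubectl get pod app-xyz -o jsonpath='{.status.containerStatuses[0].lastState.terminated.exitCode}'
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
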

&lt;h3&gt;
  
  
  Unhealthy (Liveness/Readiness Probe Failures)
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Warning   Unhealthy   pod/app-xyz   Readiness probe failed: HTTP probe failed with statuscode: 503
Warning   Unhealthy   pod/app-xyz   Liveness probe failed: connection refused
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The kubelet's health check probes are failing. If the liveness probe fails, Kubernetes will restart the container. If the readiness probe fails, the pod is removed from service endpoints. Review your probe configuration -- the path, port, and timeout values -- and verify that your application is actually healthy on those endpoints.&lt;/p&gt;
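
&lt;p&gt;A typical probe block looks like this (the paths, port, and timings are illustrative -- tune them to your application's startup profile):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;livenessProbe:
  httpGet:
    path: /healthz
    port: 8080
  initialDelaySeconds: 15  # give the app time to boot before the first check
  periodSeconds: 10
  timeoutSeconds: 3
  failureThreshold: 3      # restart only after 3 consecutive failures
readinessProbe:
  httpGet:
    path: /ready
    port: 8080
  periodSeconds: 5
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
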

&lt;h3&gt;
  
  
  FailedMount and FailedAttachVolume
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Warning   FailedMount         pod/db-abc   Unable to attach or mount volumes: timed out waiting for the condition
Warning   FailedAttachVolume  pod/db-abc   Multi-Attach error for volume "pvc-123": Volume is already attached to node-1
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The pod's volume cannot be attached or mounted. This is common with cloud block storage (EBS, Persistent Disk) when a volume is still attached to a previous node after a failover. Some storage backends do not support ReadWriteMany access mode. When you see this event, check the PersistentVolumeClaim status with &lt;code&gt;kubectl get pvc&lt;/code&gt; and verify the volume's availability in your cloud provider's console. In many cases, force-detaching the volume from the old node resolves the issue.&lt;/p&gt;

&lt;h3&gt;
  
  
  OOMKilling
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Warning   OOMKilling   pod/app-xyz   Memory cgroup out of memory: Killed process 12345 (java)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The container exceeded its memory limit and was killed by the kernel's OOM killer. Either the memory limit is too low for the workload, or the application has a memory leak. Increase the memory limit or investigate the application's memory usage patterns. For more on diagnosing node-level issues, see our &lt;a href="https://releaserun.com/debugging-kubernetes-nodes-notready/" rel="noopener noreferrer"&gt;guide to debugging Kubernetes nodes in NotReady state&lt;/a&gt;.&lt;/p&gt;
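
&lt;p&gt;When adjusting the limit, set requests and limits explicitly so the scheduler and the OOM killer work from the same numbers (the values here are illustrative):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;resources:
  requests:
    memory: "512Mi"
  limits:
    memory: "1Gi"  # the kernel OOM-kills the container above this
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
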

&lt;h3&gt;
  
  
  NodeNotReady
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Warning   NodeNotReady   node/node-3   Node node-3 status is now: NodeNotReady
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;A node has stopped reporting its status to the control plane. This can be caused by kubelet crashes, network partitions, or the node running out of resources (disk pressure, memory pressure, PID pressure). All pods on the affected node will eventually be rescheduled to other nodes after the &lt;code&gt;pod-eviction-timeout&lt;/code&gt; expires (default: 5 minutes). Monitor for this event closely in production -- it often indicates a node that needs investigation or replacement. For a detailed troubleshooting guide, see our article on &lt;a href="https://releaserun.com/debugging-kubernetes-nodes-notready/" rel="noopener noreferrer"&gt;debugging Kubernetes nodes in NotReady state&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Event Retention and the Default TTL
&lt;/h2&gt;

&lt;p&gt;One of the most important things to understand about Kubernetes events is that &lt;strong&gt;they are ephemeral by default&lt;/strong&gt;. The kube-apiserver has a default event time-to-live (TTL) of &lt;strong&gt;1 hour&lt;/strong&gt;. After that, events are garbage-collected from etcd.&lt;/p&gt;

&lt;p&gt;This means that if you look at events after an incident that happened two hours ago, they will already be gone. This is one of the main reasons teams set up event exporters (covered in the next section). The short default TTL is intentional -- events can be high-volume in large clusters, and storing them indefinitely in etcd would increase storage and memory pressure on the control plane.&lt;/p&gt;

&lt;h3&gt;
  
  
  Configuring the Event TTL
&lt;/h3&gt;

&lt;p&gt;You can change the default TTL by passing the &lt;code&gt;--event-ttl&lt;/code&gt; flag to the kube-apiserver:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="c1"&gt;# In the kube-apiserver manifest (e.g., /etc/kubernetes/manifests/kube-apiserver.yaml)&lt;/span&gt;
&lt;span class="na"&gt;spec&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;containers&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;command&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;kube-apiserver&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;--event-ttl=6h&lt;/span&gt;
    &lt;span class="c1"&gt;# ... other flags&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Increasing the TTL gives you a longer window to inspect events, but it also increases the load on etcd since more objects are stored. For most production clusters, 2-6 hours is a reasonable range. Beyond that, you should be exporting events to an external system.&lt;/p&gt;

&lt;p&gt;If you are planning a cluster upgrade, be aware that changes to apiserver flags may need to be reapplied. Our &lt;a href="https://releaserun.com/kubernetes-upgrade-checklist/" rel="noopener noreferrer"&gt;Kubernetes upgrade checklist&lt;/a&gt; covers these considerations.&lt;/p&gt;

&lt;h2&gt;
  
  
  Exporting Events for Long-Term Observability
&lt;/h2&gt;

&lt;p&gt;Since events are garbage-collected after the TTL expires, exporting them to an external logging or observability platform is essential for production clusters. Several tools are available for this purpose.&lt;/p&gt;

&lt;h3&gt;
  
  
  Kubernetes Event Exporter
&lt;/h3&gt;

&lt;p&gt;The &lt;a href="https://github.com/resmoio/kubernetes-event-exporter" rel="noopener noreferrer"&gt;Kubernetes Event Exporter&lt;/a&gt; (originally by OpenPolicyAgent, now maintained by Resmo) watches the event stream and forwards events to sinks like Elasticsearch, OpenSearch, Slack, webhooks, or files.&lt;/p&gt;

&lt;p&gt;Here is a minimal configuration that forwards Warning events to Elasticsearch:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;apiVersion&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;v1&lt;/span&gt;
&lt;span class="na"&gt;kind&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;ConfigMap&lt;/span&gt;
&lt;span class="na"&gt;metadata&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;event-exporter-cfg&lt;/span&gt;
  &lt;span class="na"&gt;namespace&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;monitoring&lt;/span&gt;
&lt;span class="na"&gt;data&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;config.yaml&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;|&lt;/span&gt;
    &lt;span class="s"&gt;logLevel: error&lt;/span&gt;
    &lt;span class="s"&gt;logFormat: json&lt;/span&gt;
    &lt;span class="s"&gt;route:&lt;/span&gt;
      &lt;span class="s"&gt;routes:&lt;/span&gt;
        &lt;span class="s"&gt;- match:&lt;/span&gt;
            &lt;span class="s"&gt;- receiver: "elasticsearch"&lt;/span&gt;
              &lt;span class="s"&gt;type: Warning&lt;/span&gt;
    &lt;span class="s"&gt;receivers:&lt;/span&gt;
      &lt;span class="s"&gt;- name: "elasticsearch"&lt;/span&gt;
        &lt;span class="s"&gt;elasticsearch:&lt;/span&gt;
          &lt;span class="s"&gt;hosts:&lt;/span&gt;
            &lt;span class="s"&gt;- "http://elasticsearch.monitoring.svc:9200"&lt;/span&gt;
          &lt;span class="s"&gt;index: kube-events&lt;/span&gt;
          &lt;span class="s"&gt;indexFormat: "kube-events-{2006-01-02}"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Fluentd and Fluent Bit
&lt;/h3&gt;

&lt;p&gt;If you already run Fluentd or Fluent Bit for log collection, you can configure them to collect Kubernetes events as well. Fluent Bit has a built-in &lt;code&gt;kubernetes_events&lt;/code&gt; input plugin:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ini"&gt;&lt;code&gt;&lt;span class="nn"&gt;[INPUT]&lt;/span&gt;
    &lt;span class="err"&gt;Name&lt;/span&gt;              &lt;span class="err"&gt;kubernetes_events&lt;/span&gt;
    &lt;span class="err"&gt;Tag&lt;/span&gt;               &lt;span class="err"&gt;kube_events.*&lt;/span&gt;
    &lt;span class="err"&gt;Kube_URL&lt;/span&gt;          &lt;span class="err"&gt;https://kubernetes.default.svc:443&lt;/span&gt;
    &lt;span class="err"&gt;Kube_CA_File&lt;/span&gt;      &lt;span class="err"&gt;/var/run/secrets/kubernetes.io/serviceaccount/ca.crt&lt;/span&gt;
    &lt;span class="err"&gt;Kube_Token_File&lt;/span&gt;   &lt;span class="err"&gt;/var/run/secrets/kubernetes.io/serviceaccount/token&lt;/span&gt;

&lt;span class="nn"&gt;[OUTPUT]&lt;/span&gt;
    &lt;span class="err"&gt;Name&lt;/span&gt;              &lt;span class="err"&gt;es&lt;/span&gt;
    &lt;span class="err"&gt;Match&lt;/span&gt;             &lt;span class="err"&gt;kube_events.*&lt;/span&gt;
    &lt;span class="err"&gt;Host&lt;/span&gt;              &lt;span class="err"&gt;elasticsearch.monitoring.svc&lt;/span&gt;
    &lt;span class="err"&gt;Port&lt;/span&gt;              &lt;span class="err"&gt;9200&lt;/span&gt;
    &lt;span class="err"&gt;Index&lt;/span&gt;             &lt;span class="err"&gt;kube-events&lt;/span&gt;
    &lt;span class="err"&gt;Type&lt;/span&gt;              &lt;span class="err"&gt;_doc&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Kubernetes Event Router (Heptio/VMware)
&lt;/h3&gt;

&lt;p&gt;The Event Router is a simpler alternative that captures events and writes them to stdout in a structured format. Any log aggregation system (Fluentd, Promtail, Vector, etc.) can then pick up that stdout. Note that the original Heptio repository has been archived, though the published images and community forks remain in use:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;apiVersion&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;apps/v1&lt;/span&gt;
&lt;span class="na"&gt;kind&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Deployment&lt;/span&gt;
&lt;span class="na"&gt;metadata&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;eventrouter&lt;/span&gt;
  &lt;span class="na"&gt;namespace&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;kube-system&lt;/span&gt;
&lt;span class="na"&gt;spec&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;replicas&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;1&lt;/span&gt;
  &lt;span class="na"&gt;selector&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;matchLabels&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;app&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;eventrouter&lt;/span&gt;
  &lt;span class="na"&gt;template&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;metadata&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;labels&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;app&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;eventrouter&lt;/span&gt;
    &lt;span class="na"&gt;spec&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;serviceAccountName&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;eventrouter&lt;/span&gt;
      &lt;span class="na"&gt;containers&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;kube-eventrouter&lt;/span&gt;
        &lt;span class="na"&gt;image&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;gcr.io/heptio-images/eventrouter:v0.4&lt;/span&gt;
        &lt;span class="na"&gt;volumeMounts&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;config-volume&lt;/span&gt;
          &lt;span class="na"&gt;mountPath&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;/etc/eventrouter&lt;/span&gt;
      &lt;span class="na"&gt;volumes&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;config-volume&lt;/span&gt;
        &lt;span class="na"&gt;configMap&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;eventrouter-cm&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Prometheus and Alerting
&lt;/h3&gt;

&lt;p&gt;While events themselves are not natively exposed as Prometheus metrics, you can use &lt;code&gt;kube-state-metrics&lt;/code&gt; to generate metrics from events. The &lt;code&gt;kube_pod_status_reason&lt;/code&gt; and similar metrics can trigger alerts for patterns like repeated OOMKills or CrashLoopBackOffs. You can also build custom Prometheus alerts that fire when specific event patterns appear in your exported event data, creating a bridge between Kubernetes events and your alerting infrastructure.&lt;/p&gt;
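&lt;p&gt;As a concrete sketch of that bridge: assuming you run the Prometheus Operator (its &lt;code&gt;PrometheusRule&lt;/code&gt; CRD) and standard &lt;code&gt;kube-state-metrics&lt;/code&gt; metric names, an alert on repeated container restarts might look like the following. The threshold, names, and labels are illustrative, not prescriptive.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: event-pattern-alerts
  namespace: monitoring
spec:
  groups:
  - name: kube-events
    rules:
    - alert: PodRestartingFrequently
      # Fires when a container restarts more than 3 times in 15 minutes --
      # the metric-level signature of a CrashLoopBackOff event stream
      expr: increase(kube_pod_container_status_restarts_total[15m]) &gt; 3
      for: 5m
      labels:
        severity: warning
      annotations:
        summary: "Pod {{ $labels.namespace }}/{{ $labels.pod }} is restarting frequently"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;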

&lt;h2&gt;
  
  
  Events in Modern Kubernetes: events.k8s.io/v1
&lt;/h2&gt;

&lt;p&gt;Historically, Kubernetes events used the core &lt;code&gt;v1&lt;/code&gt; API (&lt;code&gt;apiVersion: v1, kind: Event&lt;/code&gt;). In Kubernetes 1.19, the &lt;code&gt;events.k8s.io&lt;/code&gt; API group graduated to a stable &lt;code&gt;v1&lt;/code&gt;, bringing several improvements. As of Kubernetes 1.35, this is the recommended API for working with events. For a full overview of what changed in this release, see our &lt;a href="https://releaserun.com/kubernetes-1-35-release-preview/" rel="noopener noreferrer"&gt;Kubernetes 1.35 release preview&lt;/a&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  Key Changes in events.k8s.io/v1
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;regarding&lt;/strong&gt; -- Replaces &lt;code&gt;involvedObject&lt;/code&gt;. Contains a reference to the primary object the event is about.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;related&lt;/strong&gt; -- A new field that provides a reference to a secondary object. For example, if a pod event is related to a specific node, the node reference goes here.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;reportingController&lt;/strong&gt; -- Replaces &lt;code&gt;source.component&lt;/code&gt;. A string identifying the controller that reported the event (e.g., &lt;code&gt;k8s.io/kubelet&lt;/code&gt;).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;reportingInstance&lt;/strong&gt; -- Replaces &lt;code&gt;source.host&lt;/code&gt;. Identifies the specific instance of the controller.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;note&lt;/strong&gt; -- Replaces &lt;code&gt;message&lt;/code&gt;. A human-readable description of the event.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;series&lt;/strong&gt; -- Replaces &lt;code&gt;count&lt;/code&gt;, &lt;code&gt;firstTimestamp&lt;/code&gt;, and &lt;code&gt;lastTimestamp&lt;/code&gt; with a structured &lt;code&gt;EventSeries&lt;/code&gt; object that tracks recurring events more efficiently.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Here is what a modern event looks like in the new API:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;apiVersion&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;events.k8s.io/v1&lt;/span&gt;
&lt;span class="na"&gt;kind&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Event&lt;/span&gt;
&lt;span class="na"&gt;metadata&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;web-abc12.a1b2c3d4e5f6&lt;/span&gt;
  &lt;span class="na"&gt;namespace&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;default&lt;/span&gt;
&lt;span class="na"&gt;regarding&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;apiVersion&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;v1&lt;/span&gt;
  &lt;span class="na"&gt;kind&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Pod&lt;/span&gt;
  &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;web-abc12&lt;/span&gt;
  &lt;span class="na"&gt;namespace&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;default&lt;/span&gt;
&lt;span class="na"&gt;related&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;apiVersion&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;v1&lt;/span&gt;
  &lt;span class="na"&gt;kind&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Node&lt;/span&gt;
  &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;node-3&lt;/span&gt;
&lt;span class="na"&gt;reason&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;BackOff&lt;/span&gt;
&lt;span class="na"&gt;note&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Back-off&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;restarting&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;failed&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;container&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;nginx&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;in&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;pod&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;web-abc12_default"&lt;/span&gt;
&lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Warning&lt;/span&gt;
&lt;span class="na"&gt;reportingController&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;kubelet&lt;/span&gt;
&lt;span class="na"&gt;reportingInstance&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;node-3&lt;/span&gt;
&lt;span class="na"&gt;eventTime&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;2026-02-16T10:30:00.000000Z"&lt;/span&gt;
&lt;span class="na"&gt;action&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Restarting&lt;/span&gt;
&lt;span class="na"&gt;series&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;count&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;5&lt;/span&gt;
  &lt;span class="na"&gt;lastObservedTime&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;2026-02-16T10:30:00.000000Z"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Practical Recipes for Event-Driven Debugging
&lt;/h2&gt;

&lt;p&gt;Here are some workflows that combine event inspection with other kubectl commands to quickly diagnose common issues.&lt;/p&gt;

&lt;h3&gt;
  
  
  Recipe 1: Why Is My Pod Pending?
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight console"&gt;&lt;code&gt;&lt;span class="gp"&gt;#&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;Check events &lt;span class="k"&gt;for &lt;/span&gt;the pending pod
&lt;span class="go"&gt;kubectl get events --field-selector involvedObject.name=my-pod --sort-by=.metadata.creationTimestamp

&lt;/span&gt;&lt;span class="gp"&gt;#&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;Look &lt;span class="k"&gt;for &lt;/span&gt;FailedScheduling reason and &lt;span class="nb"&gt;read &lt;/span&gt;the message
&lt;span class="gp"&gt;#&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;Common causes: insufficient CPU/memory, node affinity/anti-affinity rules,
&lt;span class="gp"&gt;#&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;taints without matching tolerations, PVC not bound
&lt;span class="go"&gt;
&lt;/span&gt;&lt;span class="gp"&gt;#&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;Check node resource availability
&lt;span class="go"&gt;kubectl describe nodes | grep -A 5 "Allocated resources"
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Recipe 2: Find All Failing Pods in a Namespace
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight console"&gt;&lt;code&gt;&lt;span class="gp"&gt;#&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;Get all Warning events &lt;span class="k"&gt;in &lt;/span&gt;the production namespace, sorted by &lt;span class="nb"&gt;time&lt;/span&gt;
&lt;span class="go"&gt;kubectl get events -n production \
  --field-selector type=Warning \
  --sort-by=.metadata.creationTimestamp \
  -o custom-columns=TIME:.lastTimestamp,REASON:.reason,OBJECT:.involvedObject.name,MESSAGE:.message
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Recipe 3: Monitor Events in Real Time
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight console"&gt;&lt;code&gt;&lt;span class="gp"&gt;#&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;Watch events as they happen &lt;span class="o"&gt;(&lt;/span&gt;like &lt;span class="nb"&gt;tail&lt;/span&gt; &lt;span class="nt"&gt;-f&lt;/span&gt; &lt;span class="k"&gt;for &lt;/span&gt;events&lt;span class="o"&gt;)&lt;/span&gt;
&lt;span class="go"&gt;kubectl get events --watch

&lt;/span&gt;&lt;span class="gp"&gt;#&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;Watch only warnings across all namespaces
&lt;span class="go"&gt;kubectl get events --all-namespaces --field-selector type=Warning --watch
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Recipe 4: Audit Node Stability
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight console"&gt;&lt;code&gt;&lt;span class="gp"&gt;#&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;Check events &lt;span class="k"&gt;for &lt;/span&gt;a specific node
&lt;span class="go"&gt;kubectl get events --field-selector involvedObject.kind=Node,involvedObject.name=node-3

&lt;/span&gt;&lt;span class="gp"&gt;#&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;Look &lt;span class="k"&gt;for &lt;/span&gt;patterns: NodeNotReady, NodeHasDiskPressure, NodeHasMemoryPressure,
&lt;span class="gp"&gt;#&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;NodeHasInsufficientPID, NodeRebooted
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Best Practices for Working with Kubernetes Events
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Export events to a durable store.&lt;/strong&gt; The 1-hour default TTL means events vanish quickly. Use an event exporter, Fluent Bit, or another tool to ship events to Elasticsearch, Loki, or your SIEM.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Alert on Warning events.&lt;/strong&gt; Set up alerts for high-frequency warnings like &lt;code&gt;OOMKilling&lt;/code&gt;, &lt;code&gt;FailedScheduling&lt;/code&gt;, and &lt;code&gt;CrashLoopBackOff&lt;/code&gt;. Track event counts over time to catch trends.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Use field selectors in scripts.&lt;/strong&gt; When building automation, use &lt;code&gt;--field-selector&lt;/code&gt; to filter events server-side rather than piping through grep. Only matching events cross the wire, and your scripts stop depending on fragile output formatting.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Correlate events with logs and metrics.&lt;/strong&gt; Events tell you &lt;em&gt;what&lt;/em&gt; happened at the orchestration layer. Combine them with container logs (the &lt;em&gt;why&lt;/em&gt;) and metrics (the &lt;em&gt;how much&lt;/em&gt;) for a complete picture.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Increase the TTL for staging and CI clusters.&lt;/strong&gt; In environments where you debug after the fact, set &lt;code&gt;--event-ttl=12h&lt;/code&gt; or higher to keep events around longer.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Treat events as a first-class observability signal.&lt;/strong&gt; Events are often overlooked in favor of logs and metrics, but they provide the clearest view into Kubernetes control-plane decisions like scheduling, scaling, and health checking.&lt;/li&gt;
&lt;/ul&gt;
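&lt;p&gt;To make the "filter server-side, then aggregate" advice concrete, here is a minimal shell sketch. Since no cluster is assumed here, the &lt;code&gt;printf&lt;/code&gt; line stands in for real output; against a live cluster you would feed the same pipeline from &lt;code&gt;kubectl get events --field-selector type=Warning -o custom-columns=:.reason --no-headers&lt;/code&gt; instead.&lt;br&gt;
&lt;/p&gt;

```shell
# Stand-in for cluster output: one Warning reason per line, as produced by
#   kubectl get events --field-selector type=Warning \
#     -o custom-columns=:.reason --no-headers
printf 'BackOff\nBackOff\nFailedScheduling\nBackOff\n' > /tmp/warning-reasons.txt

# Count each Warning reason, most frequent first -- a quick triage view
sort /tmp/warning-reasons.txt | uniq -c | sort -rn
```

&lt;p&gt;On the sample data this lists &lt;code&gt;BackOff&lt;/code&gt; (seen three times) above &lt;code&gt;FailedScheduling&lt;/code&gt; (seen once), surfacing the dominant failure mode immediately.&lt;/p&gt;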

&lt;h2&gt;
  
  
  Summary
&lt;/h2&gt;

&lt;p&gt;Kubernetes events are the cluster's built-in audit trail. They record every significant lifecycle change -- from pod scheduling to volume attachment to node health transitions. By mastering &lt;code&gt;kubectl get events&lt;/code&gt; with field selectors and time sorting, setting up event exporters for long-term retention, and alerting on Warning-type events, you gain deep visibility into what your cluster is doing and why.&lt;/p&gt;

&lt;p&gt;The shift to the &lt;code&gt;events.k8s.io/v1&lt;/code&gt; API brings cleaner semantics with &lt;code&gt;regarding&lt;/code&gt;/&lt;code&gt;related&lt;/code&gt; fields and better deduplication through the &lt;code&gt;series&lt;/code&gt; structure. Whether you are debugging a single failing pod or building a comprehensive observability stack, events should be one of the first signals you reach for.&lt;/p&gt;




&lt;h2&gt;
  
  
  Related Reading
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://releaserun.com/kubernetes-1-32-end-of-life-migration-playbook/" rel="noopener noreferrer"&gt;Kubernetes 1.32 End of Life: Migration Playbook&lt;/a&gt; -- version 1.32 reaches end of life February 28, 2026&lt;/li&gt;
&lt;/ul&gt;




&lt;p&gt;&lt;strong&gt;🔍 Related tool:&lt;/strong&gt; &lt;a href="https://releaserun.com/tools/kubernetes-security-linter/" rel="noopener noreferrer"&gt;Kubernetes YAML Security Linter&lt;/a&gt; — paste any K8s manifest and scan for 12 security issues with an A–F grade. Free, browser-based.&lt;/p&gt;

</description>
      <category>kubernetes</category>
      <category>devops</category>
      <category>observability</category>
      <category>tutorial</category>
    </item>
    <item>
      <title>Companies Using Kubernetes in 2026: Who Runs K8s and How They Scale</title>
      <dc:creator>Matheus</dc:creator>
      <pubDate>Sat, 21 Feb 2026 17:15:36 +0000</pubDate>
      <link>https://dev.to/matheus_releaserun/companies-using-kubernetes-in-2026-who-runs-k8s-and-how-they-scale-2log</link>
      <guid>https://dev.to/matheus_releaserun/companies-using-kubernetes-in-2026-who-runs-k8s-and-how-they-scale-2log</guid>
      <description>&lt;h2&gt;
  
  
  Kubernetes Adoption in 2026: The Numbers
&lt;/h2&gt;

&lt;p&gt;Kubernetes has moved well past the early-adopter phase. According to the &lt;strong&gt;CNCF Annual Survey 2024&lt;/strong&gt;, 84% of organizations are either using or evaluating containers in production, with Kubernetes as the dominant orchestrator. The &lt;strong&gt;Datadog 2024 Container Report&lt;/strong&gt; found that over 65% of organizations running containers have adopted Kubernetes, up from roughly 50% just two years prior.&lt;/p&gt;

&lt;p&gt;What was once a technology associated primarily with Silicon Valley hyperscalers is now standard infrastructure across industries -- from banking and healthcare to government agencies and particle physics labs. For a broader look at adoption trends and data, see our detailed &lt;a href="https://releaserun.com/kubernetes-statistics-adoption-2026/" rel="noopener noreferrer"&gt;Kubernetes statistics and adoption report for 2026&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;This article profiles nine organizations that run Kubernetes at significant scale, covering what they run, how big their deployments are, and what lessons other teams can draw from their experience.&lt;/p&gt;

&lt;h2&gt;
  
  
  Tech and Media Companies
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Spotify: 4,000+ Microservices Across 200 Clusters
&lt;/h3&gt;

&lt;p&gt;Spotify is one of the most frequently cited large-scale Kubernetes adopters, and for good reason. The music streaming platform serves over 600 million monthly active users and runs more than &lt;strong&gt;4,000 microservices&lt;/strong&gt; across approximately &lt;strong&gt;200 Kubernetes clusters&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Spotify migrated from a Helios-based container orchestration system (built in-house) to Kubernetes beginning around 2019. The migration was driven by the desire to reduce the operational burden of maintaining a custom orchestrator and to benefit from the Kubernetes ecosystem's tooling and community.&lt;/p&gt;

&lt;p&gt;Key details of Spotify's Kubernetes setup:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Runs on &lt;strong&gt;Google Kubernetes Engine (GKE)&lt;/strong&gt; as the primary platform.&lt;/li&gt;
&lt;li&gt;Uses &lt;strong&gt;Backstage&lt;/strong&gt; -- Spotify's open-source developer portal, now a CNCF incubating project -- as the interface for developers to deploy and manage services on Kubernetes without needing deep K8s knowledge.&lt;/li&gt;
&lt;li&gt;Operates a multi-cluster architecture with separate clusters for different teams and environments.&lt;/li&gt;
&lt;li&gt;Handles over 10 million requests per second across its microservices mesh.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Lesson:&lt;/strong&gt; Spotify's experience shows that a strong developer platform layer on top of Kubernetes (like Backstage) is critical for adoption at scale. Most developers at Spotify do not write Kubernetes YAML directly -- the platform abstracts it away.&lt;/p&gt;

&lt;h3&gt;
  
  
  Reddit: From Bare Metal to Kubernetes
&lt;/h3&gt;

&lt;p&gt;Reddit's migration story is notable because the company moved from a traditional bare-metal infrastructure to Kubernetes. For years, Reddit ran its services on physical servers managed with configuration management tools. The limitations of this approach -- slow deployments, manual scaling, and hardware procurement lead times -- drove the shift to Kubernetes on AWS.&lt;/p&gt;

&lt;p&gt;Reddit now runs its core platform on &lt;strong&gt;Amazon EKS&lt;/strong&gt;, including the services that power the front page, comment threads, voting, and real-time features. At peak traffic, Reddit serves hundreds of millions of page views per day, with traffic spikes that can be sudden and massive (viral posts, breaking news events, AMA sessions). The migration was gradual, taking several years to move all production workloads.&lt;/p&gt;

&lt;p&gt;The engineering team invested heavily in building a custom Kubernetes deployment platform that integrated with their existing tooling. They adopted a "paved road" approach, providing standardized Helm charts and CI/CD pipelines that made it easy for service teams to migrate without becoming Kubernetes experts.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Lesson:&lt;/strong&gt; Large-scale bare-metal-to-Kubernetes migrations are possible but require patience. Reddit's team emphasized the importance of running old and new infrastructure in parallel during the transition, and investing heavily in CI/CD pipelines to support the new deployment model. They also found that the cost savings from moving away from owned hardware to cloud-based Kubernetes were significant, even accounting for the cloud provider costs.&lt;/p&gt;

&lt;h3&gt;
  
  
  The New York Times: News on GKE
&lt;/h3&gt;

&lt;p&gt;The New York Times moved its digital infrastructure to &lt;strong&gt;Google Kubernetes Engine (GKE)&lt;/strong&gt; to support the rapid iteration required by a modern digital newsroom. The migration consolidated a patchwork of deployment systems into a unified Kubernetes-based platform.&lt;/p&gt;

&lt;p&gt;The NYT runs content delivery, search, personalization, and subscription services on GKE. Their engineering team built an internal delivery platform that lets developers deploy services through a simplified interface, abstracting away Kubernetes complexity for reporters and editors who work on interactive projects.&lt;/p&gt;

&lt;p&gt;The NYT engineering team has spoken publicly about the benefits of Kubernetes for their newsroom's technical projects. During major news events -- elections, breaking stories, live events -- traffic can spike by 5-10x within minutes. Kubernetes' horizontal pod autoscaling lets them handle these spikes automatically, which was difficult to achieve on their previous infrastructure.&lt;/p&gt;
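&lt;p&gt;The NYT has not published its manifests, but the mechanism is standard Kubernetes. A minimal &lt;code&gt;autoscaling/v2&lt;/code&gt; HorizontalPodAutoscaler of the kind that absorbs such spikes looks like this (all names and numbers are hypothetical):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: article-frontend
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: article-frontend
  minReplicas: 10
  maxReplicas: 100
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        # Scale out once average CPU across pods passes 70%
        averageUtilization: 70
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;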

&lt;p&gt;&lt;strong&gt;Lesson:&lt;/strong&gt; Kubernetes adoption is not just for tech companies. Media organizations with demanding content delivery requirements benefit from the scalability and rapid deployment cycles that Kubernetes provides. The NYT also demonstrates the value of having a platform engineering team that shields content-focused developers from infrastructure complexity.&lt;/p&gt;

&lt;h3&gt;
  
  
  Pinterest: 30,000+ Pods at Scale
&lt;/h3&gt;

&lt;p&gt;Pinterest runs one of the larger Kubernetes deployments in the consumer technology space. The visual discovery platform operates &lt;strong&gt;over 30,000 pods&lt;/strong&gt; across multiple clusters, supporting a user base of more than 450 million monthly active users.&lt;/p&gt;

&lt;p&gt;Pinterest's infrastructure handles computationally intensive workloads including image processing, recommendation algorithms, and search indexing. The company has been public about the challenges of running machine learning training and inference workloads on Kubernetes, contributing to upstream projects around GPU scheduling and resource management.&lt;/p&gt;

&lt;p&gt;Key aspects of Pinterest's setup:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Multi-cluster architecture running on &lt;strong&gt;AWS EKS&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;Custom autoscaling policies tuned for workloads with bursty traffic patterns (e.g., holiday shopping seasons).&lt;/li&gt;
&lt;li&gt;Heavy use of Kubernetes for batch processing and ML training alongside serving workloads.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Lesson:&lt;/strong&gt; Running both serving and batch/ML workloads on Kubernetes is feasible but requires careful attention to scheduling, resource isolation, and autoscaling. Pinterest's multi-cluster strategy helps isolate failures and manage upgrades safely.&lt;/p&gt;

&lt;h2&gt;
  
  
  E-Commerce and Consumer Brands
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Airbnb: EKS After the Monolith
&lt;/h3&gt;

&lt;p&gt;Airbnb's Kubernetes journey began as part of a broader effort to decompose its Ruby on Rails monolith into microservices. The company migrated to &lt;strong&gt;Amazon EKS&lt;/strong&gt; and built a service-oriented architecture where hundreds of services run independently on Kubernetes.&lt;/p&gt;

&lt;p&gt;Airbnb's engineering team developed a significant amount of internal tooling around Kubernetes, including:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;A service configuration system that generates Kubernetes manifests from a higher-level service definition.&lt;/li&gt;
&lt;li&gt;Custom admission controllers for enforcing organizational policies (resource limits, security contexts, labeling requirements).&lt;/li&gt;
&lt;li&gt;Integration with their experimentation platform, allowing A/B tests to be deployed as separate Kubernetes rollouts.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Airbnb processes millions of searches and bookings daily, with each request touching dozens of downstream services. Their Kubernetes deployment handles significant computational workloads including search ranking, pricing algorithms, and real-time availability checks. The company has shared that their migration to Kubernetes reduced deployment times from hours to minutes and significantly improved developer velocity.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Lesson:&lt;/strong&gt; Breaking up a monolith and moving to Kubernetes are often done together, but they are separate concerns. Airbnb found that the microservices decomposition was the harder problem -- Kubernetes provided the runtime, but the architectural decisions around service boundaries were what determined success. Their custom admission controllers are worth noting as well -- enforcing organizational standards at the cluster level prevents configuration drift and security gaps as the number of services grows.&lt;/p&gt;

&lt;h3&gt;
  
  
  Adidas: On-Prem to Cloud-Native
&lt;/h3&gt;

&lt;p&gt;Adidas migrated its e-commerce platform from traditional on-premises infrastructure to Kubernetes on AWS. The sports brand was one of the earlier enterprise adopters in the retail space, driven by the need to handle massive traffic spikes during product launches (particularly limited-edition sneaker drops, which generate extreme burst traffic).&lt;/p&gt;

&lt;p&gt;After the migration, Adidas reported a significant reduction in deployment lead time -- from weeks to minutes -- and improved ability to scale for peak traffic events. The platform team standardized on Kubernetes across development, staging, and production environments, creating consistency across the software delivery lifecycle.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Lesson:&lt;/strong&gt; Retail companies with extreme traffic variability benefit enormously from Kubernetes' horizontal pod autoscaling and cluster autoscaling. The ability to scale up for a product launch and scale down afterward translates directly into cost savings compared to provisioning for peak capacity.&lt;/p&gt;

&lt;h2&gt;
  
  
  Financial Services
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Capital One: Kubernetes in Banking
&lt;/h3&gt;

&lt;p&gt;Capital One has been one of the most visible proponents of Kubernetes adoption in the financial services industry. The bank runs a large-scale Kubernetes platform on AWS and has contributed to several open-source projects in the Kubernetes ecosystem, including &lt;strong&gt;Critical Stack&lt;/strong&gt; (a Kubernetes management platform they later open-sourced).&lt;/p&gt;

&lt;p&gt;Running Kubernetes in financial services comes with additional constraints that do not apply to most technology companies:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Regulatory compliance:&lt;/strong&gt; Financial regulators require strict controls around data access, encryption, and audit logging. Capital One's Kubernetes platform integrates with their compliance and governance systems.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Security requirements:&lt;/strong&gt; Multi-tenancy is enforced through namespace isolation, network policies, and OPA/Gatekeeper admission policies.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Change management:&lt;/strong&gt; Deployments follow formal change management processes, with Kubernetes rollouts integrated into the bank's change advisory board workflows.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Lesson:&lt;/strong&gt; Kubernetes adoption in regulated industries is absolutely possible but requires upfront investment in policy enforcement, audit logging, and integration with existing compliance frameworks. Tools like OPA Gatekeeper and Kubernetes RBAC are essential building blocks.&lt;/p&gt;
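&lt;p&gt;For a flavor of what cluster-level policy enforcement looks like in practice: assuming the &lt;code&gt;K8sRequiredLabels&lt;/code&gt; ConstraintTemplate from the standard Gatekeeper policy library is installed, a constraint that rejects unlabeled deployments might read as follows. This is a generic sketch, not one of Capital One's actual policies.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;apiVersion: constraints.gatekeeper.sh/v1beta1
kind: K8sRequiredLabels
metadata:
  name: require-owner-label
spec:
  match:
    kinds:
    - apiGroups: ["apps"]
      kinds: ["Deployment"]
  parameters:
    # Admission is denied for any Deployment missing this label
    labels: ["owner"]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;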

&lt;h2&gt;
  
  
  Government and Research
&lt;/h2&gt;

&lt;h3&gt;
  
  
  US Department of Defense: Platform One
&lt;/h3&gt;

&lt;p&gt;The US Department of Defense (DoD) operates &lt;strong&gt;Platform One&lt;/strong&gt;, a Kubernetes-based DevSecOps platform that provides a standardized, security-hardened software delivery environment for defense applications. Platform One is built on top of a DoD-hardened Kubernetes distribution and includes a curated set of tools for CI/CD, monitoring, logging, and security scanning.&lt;/p&gt;

&lt;p&gt;Platform One serves as the foundation for &lt;strong&gt;Big Bang&lt;/strong&gt;, a Helm-based deployment package that installs a complete DevSecOps stack on any Kubernetes cluster. Components include Istio for service mesh, Prometheus and Grafana for monitoring, Elasticsearch and Kibana for logging, and various security scanning tools that meet DoD security requirements (STIG compliance).&lt;/p&gt;

&lt;p&gt;Key aspects of Platform One:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Designed to run on &lt;strong&gt;any infrastructure&lt;/strong&gt;: cloud, on-premises, or air-gapped environments.&lt;/li&gt;
&lt;li&gt;All container images are scanned and signed through the DoD's &lt;strong&gt;Iron Bank&lt;/strong&gt; registry.&lt;/li&gt;
&lt;li&gt;Supports multiple classification levels with appropriate network isolation.&lt;/li&gt;
&lt;li&gt;Used by multiple branches of the military and defense agencies.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Lesson:&lt;/strong&gt; If the US Department of Defense can run Kubernetes with its extreme security requirements, most organizations can too. The key is a standardized platform approach (Platform One/Big Bang) rather than letting every team build their own Kubernetes setup. For context on how Kubernetes compares to simpler container runtimes in different scenarios, see our &lt;a href="https://releaserun.com/docker-vs-kubernetes-production-2026-decision-rubric/" rel="noopener noreferrer"&gt;Docker vs Kubernetes production decision rubric&lt;/a&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  CERN: Kubernetes for Particle Physics
&lt;/h3&gt;

&lt;p&gt;CERN, the European Organization for Nuclear Research, uses Kubernetes to manage the massive data processing pipelines required to analyze data from the Large Hadron Collider (LHC). CERN's computing infrastructure processes petabytes of physics data, and Kubernetes helps orchestrate the batch processing jobs and analysis workflows.&lt;/p&gt;

&lt;p&gt;CERN's Kubernetes deployment is notable for several reasons:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Runs on &lt;strong&gt;on-premises infrastructure&lt;/strong&gt; in CERN's data centers, not on public cloud.&lt;/li&gt;
&lt;li&gt;Manages workloads that are heavily batch-oriented, using Kubernetes alongside HTCondor and other HPC schedulers.&lt;/li&gt;
&lt;li&gt;Uses &lt;strong&gt;OpenStack Magnum&lt;/strong&gt; for provisioning Kubernetes clusters on their private cloud infrastructure.&lt;/li&gt;
&lt;li&gt;Contributes to upstream Kubernetes development, particularly around batch scheduling and multi-cluster federation.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Lesson:&lt;/strong&gt; Kubernetes is not just for web services. Batch processing, scientific computing, and data pipelines are legitimate Kubernetes workloads, especially when combined with tools like Kubernetes Jobs, CronJobs, and the emerging Kubernetes Batch/HPC features. For more on how different Kubernetes distributions serve these varied use cases, see our &lt;a href="https://releaserun.com/kubernetes-distributions-compared/" rel="noopener noreferrer"&gt;Kubernetes distributions comparison&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Industry Breakdown: Where Kubernetes Runs
&lt;/h2&gt;

&lt;p&gt;Looking across the companies profiled above and the broader ecosystem, Kubernetes adoption follows clear patterns by industry.&lt;/p&gt;

&lt;h3&gt;
  
  
  Technology and Media
&lt;/h3&gt;

&lt;p&gt;Technology companies were the earliest adopters and run the largest deployments. Adoption is near-universal among companies with more than 500 engineers. Kubernetes is typically managed by a dedicated platform engineering team that provides an internal developer platform. The tech sector also leads in multi-cluster adoption, with companies routinely running dozens or hundreds of clusters segmented by team, region, or workload type.&lt;/p&gt;

&lt;h3&gt;
  
  
  Financial Services
&lt;/h3&gt;

&lt;p&gt;Banks, insurance companies, and fintech firms have adopted Kubernetes aggressively over the past five years. The main drivers are faster time-to-market for financial products and the ability to scale trading and payment processing systems dynamically. Compliance and security overhead is significant but manageable with the right tooling.&lt;/p&gt;

&lt;h3&gt;
  
  
  E-Commerce and Retail
&lt;/h3&gt;

&lt;p&gt;Retail companies with seasonal traffic patterns (Black Friday, product launches, holiday shopping) benefit from Kubernetes' autoscaling capabilities. Companies like Adidas, Target, and Zalando have all migrated to Kubernetes-based platforms.&lt;/p&gt;

&lt;h3&gt;
  
  
  Healthcare and Life Sciences
&lt;/h3&gt;

&lt;p&gt;Healthcare organizations are increasingly adopting Kubernetes for electronic health record (EHR) systems, genomics processing, and medical imaging workloads. HIPAA compliance requirements add complexity, similar to financial services, but Kubernetes' namespace isolation and network policies provide the necessary building blocks. Companies like Philips and Kaiser Permanente have invested significantly in Kubernetes platforms for both clinical and research workloads.&lt;/p&gt;

&lt;h3&gt;
  
  
  Government and Defense
&lt;/h3&gt;

&lt;p&gt;Government adoption has accelerated significantly, led by the US DoD's Platform One initiative. Other agencies, including the IRS and VA, have Kubernetes initiatives. Government adoption emphasizes security hardening, air-gapped deployment capabilities, and FedRAMP compliance. The US Census Bureau and NHS Digital (UK) have also adopted Kubernetes for citizen-facing services, showing that government use extends beyond defense into civilian applications.&lt;/p&gt;

&lt;h2&gt;
  
  
  Adoption by Company Size
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Startups (1-50 Engineers)
&lt;/h3&gt;

&lt;p&gt;For early-stage startups, managed Kubernetes services (EKS, GKE, AKS) have reduced the barrier to entry significantly. However, the operational overhead of Kubernetes can be substantial for small teams. Many startups start with simpler alternatives (AWS ECS, Google Cloud Run, Railway) and migrate to Kubernetes as they grow. The decision depends on team expertise and workload complexity. That said, startups building infrastructure-heavy products (developer tools, data platforms, security tools) often adopt Kubernetes early because their customers expect Kubernetes-native deployment options.&lt;/p&gt;

&lt;h3&gt;
  
  
  Mid-Market (50-500 Engineers)
&lt;/h3&gt;

&lt;p&gt;This is the fastest-growing adoption segment. Companies in this range typically have enough engineering capacity to justify a small platform team (2-5 engineers) dedicated to running Kubernetes. Managed services and platform-as-a-service layers like Humanitec, Upbound, or internal Backstage portals help make Kubernetes accessible to the broader engineering organization.&lt;/p&gt;

&lt;h3&gt;
  
  
  Enterprise (500+ Engineers)
&lt;/h3&gt;

&lt;p&gt;Large enterprises overwhelmingly run Kubernetes, often across multiple cloud providers and on-premises data centers. Multi-cluster management, federation, and governance at scale are the primary challenges. These organizations typically run dedicated platform engineering organizations (not just teams) with 10-50+ engineers focused on Kubernetes infrastructure. At this scale, the focus shifts from "how do we run Kubernetes" to "how do we govern, secure, and provide self-service access to Kubernetes across hundreds of teams." Tools like Rancher, Tanzu, and OpenShift are common in this segment because they provide the multi-cluster management and enterprise governance features that large organizations require.&lt;/p&gt;

&lt;h2&gt;
  
  
  Common Patterns and Lessons Learned
&lt;/h2&gt;

&lt;p&gt;Across all the companies profiled here, several patterns emerge consistently:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Platform engineering is non-negotiable.&lt;/strong&gt; Every successful large-scale Kubernetes deployment has a dedicated platform team that abstracts Kubernetes complexity from application developers. Without this, adoption stalls because developers spend too much time fighting with YAML and cluster configuration.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Managed Kubernetes is the default.&lt;/strong&gt; Even companies with deep infrastructure expertise (Spotify, Reddit, Airbnb) run on managed services like GKE, EKS, or AKS. The operational overhead of running your own control plane is rarely justified.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Multi-cluster is the norm at scale.&lt;/strong&gt; No company running thousands of services uses a single Kubernetes cluster. Multi-cluster strategies provide blast radius isolation, allow independent upgrade schedules, and enable different security boundaries for different workloads.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Migrations are gradual.&lt;/strong&gt; Every company that moved to Kubernetes did so incrementally, running old and new infrastructure in parallel for months or years. Big-bang migrations are rarely successful.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Developer experience determines adoption speed.&lt;/strong&gt; Companies that invested in internal developer platforms, service templates, and self-service tooling saw faster adoption. Companies that asked developers to learn raw Kubernetes saw resistance and slow rollouts.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Security and compliance are solvable.&lt;/strong&gt; Financial services, healthcare, and defense organizations have all proven that Kubernetes can meet strict regulatory requirements. The tools (OPA, network policies, RBAC, image signing) exist -- the work is in integrating them into your specific compliance framework.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Summary
&lt;/h2&gt;

&lt;p&gt;Kubernetes adoption in 2026 spans virtually every industry and company size. From Spotify's 200 clusters powering music streaming to CERN's on-premises deployment analyzing particle physics data to the US DoD's security-hardened Platform One, Kubernetes has proven adaptable to radically different requirements.&lt;/p&gt;

&lt;p&gt;The common thread across all successful adopters is not the technology itself but the organizational investment around it: platform engineering teams, developer experience tooling, and gradual migration strategies. Kubernetes provides the runtime foundation, but it is the platform built on top of it -- and the team operating it -- that determines success.&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;🔍 Related tool:&lt;/strong&gt; &lt;a href="https://releaserun.com/tools/kubernetes-security-linter/" rel="noopener noreferrer"&gt;Kubernetes YAML Security Linter&lt;/a&gt; — paste any K8s manifest and scan for 12 security issues with an A–F grade. Free, browser-based.&lt;/p&gt;

</description>
      <category>kubernetes</category>
      <category>devops</category>
      <category>cloud</category>
      <category>infrastructure</category>
    </item>
    <item>
      <title>Pets vs Cattle DevOps: The Security Risk You Inherit</title>
      <dc:creator>Matheus</dc:creator>
      <pubDate>Sat, 21 Feb 2026 17:14:54 +0000</pubDate>
      <link>https://dev.to/matheus_releaserun/pets-vs-cattle-devops-the-security-risk-you-inherit-25pn</link>
      <guid>https://dev.to/matheus_releaserun/pets-vs-cattle-devops-the-security-risk-you-inherit-25pn</guid>
      <description>&lt;h1&gt;
  
  
  Pets vs Cattle DevOps: The Security Risk You Inherit
&lt;/h1&gt;

&lt;p&gt;No CVEs patched. Your attack surface still changes.&lt;/p&gt;

&lt;p&gt;I have watched teams “modernize” from pet VMs to cattle and accidentally make audits harder and breaches faster. If you do not treat pets vs cattle as a security classification, you will ship unauditable infrastructure and you will not notice until an incident, or a regulator, forces you to.&lt;/p&gt;

&lt;h2&gt;
  
  
  Security impact first: what changes when you move to “cattle”
&lt;/h2&gt;

&lt;p&gt;Patch this before your next standup. Not with a hotfix, with controls.&lt;/p&gt;

&lt;p&gt;Pets fail in slow motion. Cattle fail at scale. If you run cattle without guardrails, a single bad image, a poisoned Terraform module, or a compromised GitOps repo can roll out to 400 nodes before you finish your coffee. In practice, though, the everyday risk is misconfiguration and supply-chain drift, not a Hollywood zero-day.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;If you keep pets:&lt;/strong&gt; Long-lived SSH keys and config drift hang around for years. An attacker who lands once can come back later and still find the same foothold.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;If you move to cattle:&lt;/strong&gt; You reduce drift, but you increase blast radius. One promoted image becomes tomorrow’s fleet baseline.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;If you do nothing:&lt;/strong&gt; You keep “snowflake” servers that miss patches, and you also inherit new cloud-native failure modes like leaked service account tokens and over-permissive IAM.&lt;/li&gt;
&lt;/ul&gt;

&lt;blockquote&gt;
&lt;p&gt;If you cannot prove what ran, who changed it, and when it changed, you do not have “cattle.” You have pets with better marketing.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  Breaking operational changes (these cause outages and audit findings)
&lt;/h2&gt;

&lt;p&gt;Some folks skip canaries for “just infrastructure” changes. I do not.&lt;/p&gt;

&lt;p&gt;The thing nobody mentions is that pets vs cattle breaks your incident response muscle memory. Your old runbook said “SSH to db-primary.” Your new world says “the pod died, and the controller replaced it.” If you do not build an evidence trail and a break-glass path, you will lose time during containment and you will lose artifacts during forensics.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Logging changes:&lt;/strong&gt; SSH session logs disappear when you stop SSH-ing. You must replace them with Kubernetes audit logs, Git provider audit logs, CI logs, and centralized application logs with retention.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Access changes:&lt;/strong&gt; “No one SSHs into anything” sounds clean. In practice you still need privileged access for nodes, storage, and rare outages. Define who can do it, how you record it, and how you revoke it.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Stateful workloads:&lt;/strong&gt; Treating a database like cattle can delete data. No migration playbook specifies how your org should do backups. Your SRE team still owns that risk.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  What “pets” look like in a security review
&lt;/h2&gt;

&lt;p&gt;Pets keep secrets warm.&lt;/p&gt;

&lt;p&gt;A pet server usually carries a private key in /home, a forgotten debug binary, and a firewall rule nobody can explain. I have seen a “temporary” SSH exception live for 14 months because “nobody wants to touch prod.” If you do not upgrade your operating model, you will keep paying for that fear in outages and incident dwell time.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Typical findings:&lt;/strong&gt; Untracked local changes, inconsistent patch levels, shared admin accounts, and backups that exist but never restore cleanly.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Threat scenario if you do not change:&lt;/strong&gt; An attacker pivots through one unpatched pet, drops a persistent user, and waits for a quiet weekend. You only notice after data exfil shows up in DNS logs.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  What “cattle” look like when you do it safely
&lt;/h2&gt;

&lt;p&gt;Cattle need fences.&lt;/p&gt;

&lt;p&gt;Teams love to say “immutable infrastructure” and then run unsigned images from random registries. That bit me once in a staging cluster. A developer “temporarily” used :latest, the build pulled a new dependency, and we spent half a day chasing behavior that never reproduced locally. In production, that same pattern becomes a supply-chain incident.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Minimum bar for cattle:&lt;/strong&gt; Rebuild images on a schedule, scan them in CI, generate an SBOM, and sign the artifact before promotion.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;GitOps control:&lt;/strong&gt; Treat Git as production. Lock branches, require reviews, and alert on changes to cluster-admin RBAC and network policy.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Runtime control:&lt;/strong&gt; Enforce non-root containers, drop capabilities, and block privileged pods unless you can defend the exception in writing.&lt;/li&gt;
&lt;/ul&gt;
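
&lt;p&gt;The "build once, promote the same artifact" rule is checkable in CI. Here is a minimal Python sketch of one piece of it: refusing mutable image references at review time. The function and its rules are illustrative, not a standard tool:&lt;/p&gt;

```python
import re

# A digest-pinned reference ends in "@sha256:" plus 64 hex characters.
DIGEST_RE = re.compile(r"@sha256:[0-9a-f]{64}$")

def check_image_ref(ref):
    """Classify a container image reference for a CI policy gate."""
    if DIGEST_RE.search(ref):
        return "pinned"          # immutable: the same bytes on every pull
    last = ref.rsplit("/", 1)[-1]
    if ref.endswith(":latest") or ":" not in last:
        return "mutable-latest"  # floats silently with every rebuild
    return "mutable-tag"         # a tag someone can repoint tomorrow

refs = [
    "registry.example.com/api@sha256:" + "ab" * 32,
    "registry.example.com/api:latest",
    "registry.example.com/api",       # no tag at all also means :latest
    "registry.example.com/api:v1.4.2",
]
print([check_image_ref(r) for r in refs])
# ['pinned', 'mutable-latest', 'mutable-latest', 'mutable-tag']
```

&lt;p&gt;In a real pipeline you would enforce this with an admission controller or policy engine rather than a bespoke script; the point is that "cattle" discipline is a checkable property, not a vibe.&lt;/p&gt;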

&lt;h2&gt;
  
  
  Stateful workloads: keep the “pet” behavior, automate the handling
&lt;/h2&gt;

&lt;p&gt;Databases do not forgive you.&lt;/p&gt;

&lt;p&gt;A PostgreSQL primary still needs a stable identity, durable storage, and careful failover. Kubernetes StatefulSets help, but they do not remove your need for tested restores and clear RTO/RPO targets. If you pretend state is disposable, you will eventually test your backups during an outage. That is the worst time.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Use StatefulSets for stateful systems:&lt;/strong&gt; Stable names, stable volumes, ordered rollout. This reduces chaos, it does not eliminate risk.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Threat scenario if you misclassify state:&lt;/strong&gt; A “self-healing” controller recreates a pod, attaches the wrong volume, and you corrupt data during recovery.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Migration checklist (security gates, not just steps)
&lt;/h2&gt;

&lt;p&gt;Move in slices.&lt;/p&gt;

&lt;p&gt;Start with workloads that can tolerate replacement, like stateless APIs and CI runners. Then work toward the ugly stuff. For each step, set a gate you can measure and audit, otherwise the project becomes vibes-based engineering.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Inventory and classify:&lt;/strong&gt; Record what runs where, what data it touches, and what compliance regime applies. If you cannot classify it, you cannot secure it.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Externalize state and secrets:&lt;/strong&gt; Move data off hosts. Move secrets into a managed system. Rotate anything that used to live on a pet box.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Codify and review:&lt;/strong&gt; Put Terraform, Helm, and policies under pull request review. Capture approvals as evidence.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Build immutable artifacts:&lt;/strong&gt; Build once. Promote the same artifact. Do not patch live nodes by hand unless you execute a documented break-glass procedure.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Practice destruction:&lt;/strong&gt; Kill instances in staging on purpose. If the system cannot recover without a human, you still run pets.&lt;/li&gt;
&lt;/ul&gt;
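
&lt;p&gt;The inventory step deserves a concrete gate. A toy sketch (field names are illustrative): any workload without a data classification blocks its own migration, which is exactly the forcing function you want:&lt;/p&gt;

```python
from dataclasses import dataclass

@dataclass
class Workload:
    name: str
    data_class: str   # e.g. "public", "internal", "pii"; empty means unknown
    stateful: bool

def migration_blockers(inventory):
    """If you cannot classify it, you cannot secure it -- so flag it."""
    return [w.name for w in inventory if not w.data_class]

inventory = [
    Workload("checkout-api", "pii", stateful=False),
    Workload("legacy-cron-box", "", stateful=True),  # nobody knows what it touches
    Workload("image-resizer", "public", stateful=False),
]
print(migration_blockers(inventory))  # ['legacy-cron-box']
```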

&lt;h2&gt;
  
  
  Everything else you should know (quick and slightly unfinished)
&lt;/h2&gt;

&lt;p&gt;History matters less than evidence.&lt;/p&gt;

&lt;p&gt;Yes, the metaphor goes back to early 2010s talks and blog posts, and people still argue who said it first. I care more about whether you can produce a change log, an artifact signature, and an audit trail on demand. Other stuff you will run into: autoscaler cooldowns, weird storage edge cases, dependency pinning, the usual.&lt;/p&gt;

&lt;p&gt;If you do not upgrade your operating model, you will keep shipping servers you cannot recreate, cannot attest, and cannot explain under pressure. Attackers love that kind of environment.&lt;/p&gt;

&lt;h2&gt;
  
  
  Related Reading
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://releaserun.com/iac-security-in-2026-terraform-checkov-and-cloud-drift-detection/" rel="noopener noreferrer"&gt;IaC Security in 2026: Terraform, Checkov, and Cloud Drift Detection&lt;/a&gt; -- The tooling that makes cattle-style infrastructure actually secure&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://releaserun.com/container-image-scanning-in-2026-clair-trivy-and-grype-compared/" rel="noopener noreferrer"&gt;Container Image Scanning in 2026&lt;/a&gt; -- Verify your cattle images before they stampede into production&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://releaserun.com/kubernetes-operators-explained-what-they-are-how-they-work-and-how-to-build-one/" rel="noopener noreferrer"&gt;Kubernetes Operators Explained&lt;/a&gt; -- Automating the operational logic that cattle-mode demands&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://releaserun.com/container-escape-vulnerabilities-the-cves-that-shaped-docker-and-kubernetes-security/" rel="noopener noreferrer"&gt;Container Escape Vulnerabilities&lt;/a&gt; -- The security risks that cattle infrastructure must mitigate&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://releaserun.com/kubernetes-upgrade-checklist/" rel="noopener noreferrer"&gt;Kubernetes Upgrade Checklist&lt;/a&gt; -- The structured process for updating your herd safely&lt;/li&gt;
&lt;/ul&gt;




&lt;p&gt;&lt;strong&gt;🔍 Related tool:&lt;/strong&gt; &lt;a href="https://releaserun.com/tools/header-analyzer/" rel="noopener noreferrer"&gt;HTTP Security Headers Analyzer&lt;/a&gt; — scan any URL for missing or misconfigured security headers and get an A–F grade. Free.&lt;/p&gt;

</description>
      <category>devops</category>
      <category>security</category>
      <category>kubernetes</category>
      <category>infrastructure</category>
    </item>
    <item>
      <title>Container Escape Vulnerabilities: The CVEs That Shaped Docker and Kubernetes Security</title>
      <dc:creator>Matheus</dc:creator>
      <pubDate>Sat, 21 Feb 2026 17:00:11 +0000</pubDate>
      <link>https://dev.to/matheus_releaserun/container-escape-vulnerabilities-the-cves-that-shaped-docker-and-kubernetes-security-41hk</link>
      <guid>https://dev.to/matheus_releaserun/container-escape-vulnerabilities-the-cves-that-shaped-docker-and-kubernetes-security-41hk</guid>
      <description>&lt;h2&gt;
  
  
  Why Container Escapes Matter
&lt;/h2&gt;

&lt;p&gt;Containers are not virtual machines. A virtual machine runs its own kernel on emulated hardware, creating a strong isolation boundary. A container shares the host kernel with every other container on the system -- isolation comes from Linux kernel features (namespaces, cgroups, capabilities, seccomp filters), not from a hardware-enforced boundary.&lt;/p&gt;

&lt;p&gt;When an attacker escapes a container, they break through those kernel-level abstractions and gain access to the host. From there, they can reach every other container on that node, access mounted secrets and credentials, and pivot deeper into the cluster. In a Kubernetes or Docker production environment, a single container escape can compromise an entire node and, in the worst case, the entire cluster.&lt;/p&gt;

&lt;p&gt;This article covers the most significant container escape CVEs from 2017 through 2024: how each exploit worked, what made it possible, and how the ecosystem responded. The same classes of bugs keep resurfacing, and the defensive patterns developed in response form the foundation of modern container security.&lt;/p&gt;

&lt;h2&gt;
  
  
  CVE-2017-5123: The waitid Kernel Exploit
&lt;/h2&gt;

&lt;h3&gt;
  
  
  What Happened
&lt;/h3&gt;

&lt;p&gt;In October 2017, a vulnerability was discovered in the Linux kernel's &lt;code&gt;waitid()&lt;/code&gt; system call. During a refactor of the waitid code in kernel version 4.13, a critical check was accidentally removed: the &lt;code&gt;access_ok()&lt;/code&gt; call that validates whether a user-supplied pointer actually points to user-space memory. Without this check, an unprivileged process could pass a pointer to kernel memory, and the kernel would happily write data to that location.&lt;/p&gt;

&lt;h3&gt;
  
  
  How the Exploit Worked
&lt;/h3&gt;

&lt;p&gt;The bug allowed an attacker to write partially controlled data to an arbitrary kernel memory address. While the attacker could not fully control the content being written -- the kernel wrote a &lt;code&gt;siginfo_t&lt;/code&gt; structure with fields determined by process state -- careful manipulation of which process was being waited on gave enough control to be dangerous.&lt;/p&gt;

&lt;p&gt;The container escape leveraged this kernel write primitive to modify the calling process's capability structure in kernel memory. Docker containers run with a restricted set of Linux capabilities, which is one of the primary mechanisms preventing containerized processes from performing privileged operations on the host. By overwriting the capability bitmask, the attacker could grant themselves &lt;code&gt;CAP_SYS_ADMIN&lt;/code&gt; and &lt;code&gt;CAP_NET_ADMIN&lt;/code&gt; -- effectively breaking out of the container's capability restrictions and gaining host-level privileges.&lt;/p&gt;
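
&lt;p&gt;Those capabilities are literally bits in a mask inside the kernel's credential structure -- the same value you can read back as &lt;code&gt;CapEff&lt;/code&gt; in &lt;code&gt;/proc/self/status&lt;/code&gt;. A short Python sketch of the arithmetic, using the constants from &lt;code&gt;linux/capability.h&lt;/code&gt;:&lt;/p&gt;

```python
# Capability numbers from linux/capability.h.
CAP_NET_ADMIN = 12
CAP_SYS_ADMIN = 21

def cap_mask(*caps):
    """Build an effective-capability bitmask like the kernel's CapEff field."""
    mask = 0
    for cap in caps:
        mask |= 2 ** cap   # bit N set means capability N is held
    return mask

# A restricted container process holds neither bit; the waitid() write
# primitive flipped them on directly in kernel memory.
escalated = cap_mask(CAP_NET_ADMIN, CAP_SYS_ADMIN)
print(hex(escalated))  # 0x201000
```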

&lt;h3&gt;
  
  
  Impact and Fix
&lt;/h3&gt;

&lt;p&gt;This vulnerability affected Linux kernel 4.13 through 4.14.0-rc4. The fix was straightforward: re-adding the &lt;code&gt;access_ok()&lt;/code&gt; check to validate that the user-provided pointer targets user-space memory. The bug was introduced on May 21, 2017 and patched on October 9, 2017.&lt;/p&gt;

&lt;p&gt;CVE-2017-5123 demonstrated something fundamental: containers share the host kernel, and a kernel vulnerability is a container escape vulnerability. No amount of namespace isolation matters if the kernel itself can be tricked into overwriting its own security data structures.&lt;/p&gt;

&lt;h2&gt;
  
  
  CVE-2019-5736: The runc Overwrite
&lt;/h2&gt;

&lt;h3&gt;
  
  
  What Happened
&lt;/h3&gt;

&lt;p&gt;Disclosed on February 11, 2019, CVE-2019-5736 was arguably the most impactful container escape vulnerability ever published. It affected &lt;strong&gt;runc&lt;/strong&gt;, the low-level container runtime used by Docker, containerd, CRI-O, and essentially every OCI-compliant container platform. The vulnerability allowed a malicious process inside a container to overwrite the host's runc binary, gaining root-level code execution on the host.&lt;/p&gt;

&lt;h3&gt;
  
  
  How the Exploit Worked
&lt;/h3&gt;

&lt;p&gt;The exploit took advantage of how Linux handles &lt;code&gt;/proc/self/exe&lt;/code&gt;. This special file is a symbolic link that points to the binary of the currently running process. When runc executes a command inside a container (via &lt;code&gt;docker exec&lt;/code&gt; or similar), there is a brief window where the container's process can access the runc binary through &lt;code&gt;/proc/self/exe&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;The attack worked in two stages:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Set the trap.&lt;/strong&gt; The attacker replaces the container's &lt;code&gt;/bin/sh&lt;/code&gt; (or another entrypoint binary) with a script containing &lt;code&gt;#!/proc/self/exe&lt;/code&gt;. This tells the kernel to execute the binary that &lt;code&gt;/proc/self/exe&lt;/code&gt; points to -- which, during a &lt;code&gt;docker exec&lt;/code&gt; call, is the host's runc binary.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Overwrite runc.&lt;/strong&gt; When runc enters the container and the tampered entrypoint executes, the process gets a file handle to the host's runc binary via &lt;code&gt;/proc/self/exe&lt;/code&gt;. The attacker then writes a malicious payload to this file handle, overwriting the host's runc binary with attacker-controlled code.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The next time any container operation invokes runc on that host -- starting a container, running exec, or even performing a health check -- the attacker's payload executes with root privileges on the host.&lt;/p&gt;

&lt;h3&gt;
  
  
  Impact and Fix
&lt;/h3&gt;

&lt;p&gt;The severity was enormous. The exploit required only UID 0 inside the container (which is the default for most container images) and worked with default Docker configurations. No special privileges, no host mounts, no unusual capabilities. It affected Docker, Kubernetes, and any platform using runc versions prior to 1.0-rc6.&lt;/p&gt;

&lt;p&gt;The fix changed runc's behavior so that it creates a copy of itself as a sealed, read-only file descriptor (using &lt;code&gt;memfd_create&lt;/code&gt; with &lt;code&gt;F_SEAL&lt;/code&gt; flags) before entering the container. When the malicious process attempts to write to &lt;code&gt;/proc/self/exe&lt;/code&gt;, the kernel blocks the write because the file descriptor is sealed.&lt;/p&gt;

&lt;h2&gt;
  
  
  CVE-2019-1002101: kubectl cp Directory Traversal
&lt;/h2&gt;

&lt;h3&gt;
  
  
  What Happened
&lt;/h3&gt;

&lt;p&gt;While most container escape CVEs involve breaking out of a running container, CVE-2019-1002101 took a different approach: it targeted the operator's workstation. This vulnerability allowed a malicious container to write arbitrary files to the machine of any Kubernetes user who ran &lt;code&gt;kubectl cp&lt;/code&gt; to copy files from that container.&lt;/p&gt;

&lt;h3&gt;
  
  
  How the Exploit Worked
&lt;/h3&gt;

&lt;p&gt;The &lt;code&gt;kubectl cp&lt;/code&gt; command works by creating a tar archive inside the target container, streaming it over the network to the user's machine, and extracting it locally. The vulnerability was a classic directory traversal: the tar archive created inside the container could include file paths containing &lt;code&gt;../&lt;/code&gt; sequences, and kubectl did not sanitize these paths before extraction.&lt;/p&gt;

&lt;p&gt;If an attacker controlled the tar binary inside a container, they could craft filenames like &lt;code&gt;../../../etc/cron.d/backdoor&lt;/code&gt;. When the unsuspecting operator ran &lt;code&gt;kubectl cp mypod:/data ./local-dir&lt;/code&gt;, the malicious tar entries would be extracted outside the intended destination directory, writing files anywhere the user had permissions.&lt;/p&gt;
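
&lt;p&gt;You can reproduce the unsafe half of this with the standard library: a tar member whose name climbs out of the destination directory. The containment check below is the kind of validation the fix added (the helper is a sketch, not kubectl's actual code):&lt;/p&gt;

```python
import io
import os
import tarfile

def stays_within(dest, member_name):
    """Reject tar member names that resolve outside the extraction dir."""
    dest = os.path.abspath(dest)
    target = os.path.abspath(os.path.join(dest, member_name))
    return os.path.commonpath([dest, target]) == dest

# Build the archive a malicious in-container tar binary could emit.
buf = io.BytesIO()
with tarfile.open(fileobj=buf, mode="w") as tar:
    payload = b"* * * * * root /tmp/implant\n"
    info = tarfile.TarInfo(name="../../../etc/cron.d/backdoor")
    info.size = len(payload)
    tar.addfile(info, io.BytesIO(payload))

buf.seek(0)
with tarfile.open(fileobj=buf) as tar:
    for member in tar.getmembers():
        verdict = "extract" if stays_within("local-dir", member.name) else "REJECT"
        print(member.name, verdict)  # ../../../etc/cron.d/backdoor REJECT
```

&lt;p&gt;Since Python 3.12, &lt;code&gt;tarfile&lt;/code&gt;'s extraction filters (&lt;code&gt;filter="data"&lt;/code&gt;) perform this class of check for you -- the same lesson kubectl learned in 2019.&lt;/p&gt;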

&lt;h3&gt;
  
  
  Impact and Fix
&lt;/h3&gt;

&lt;p&gt;The fix added path validation to reject directory traversal sequences during tar extraction. The initial fix was incomplete -- follow-up CVEs (CVE-2019-11246 and CVE-2019-11249) addressed bypass techniques, highlighting how tricky path sanitization can be.&lt;/p&gt;

&lt;p&gt;This vulnerability is a reminder that the attack surface of a Kubernetes environment extends beyond the cluster. Operator tools, CI/CD pipelines, and client-side utilities are all part of the security perimeter.&lt;/p&gt;

&lt;h2&gt;
  
  
  CVE-2020-15257: containerd Host Network Escape
&lt;/h2&gt;

&lt;h3&gt;
  
  
  What Happened
&lt;/h3&gt;

&lt;p&gt;In November 2020, NCC Group disclosed CVE-2020-15257, a vulnerability in containerd that allowed containers running with host network access to escape to the host.&lt;/p&gt;

&lt;h3&gt;
  
  
  How the Exploit Worked
&lt;/h3&gt;

&lt;p&gt;containerd uses a component called &lt;strong&gt;containerd-shim&lt;/strong&gt;, which runs as a parent process for each container. The shim exposes an API over an abstract namespace Unix domain socket. The critical flaw was that this socket was accessible from the host's network namespace.&lt;/p&gt;

&lt;p&gt;When a container was configured with &lt;code&gt;--net=host&lt;/code&gt; (sharing the host's network namespace), a root process inside that container could connect to the containerd-shim's abstract Unix socket. From there, the attacker could use the shim API to:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Read and write files on the host filesystem.&lt;/li&gt;
&lt;li&gt;Execute commands on the host as root.&lt;/li&gt;
&lt;li&gt;Spin up new, fully privileged containers.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The attack required two conditions: the container had to be running with host networking, and the process inside had to be running as UID 0.&lt;/p&gt;
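
&lt;p&gt;Abstract-namespace sockets are easy to poke at from Python on Linux. Note what is missing compared with a file-based socket: there is no node on disk, so there are no file permissions to enforce (the socket names below are illustrative):&lt;/p&gt;

```python
import os
import socket
import tempfile

# Abstract namespace: the leading NUL byte means "no filesystem node".
srv = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
srv.bind("\0containerd-shim-demo")
print(srv.getsockname())  # b'\x00containerd-shim-demo'
# Nothing to chmod: any root process sharing the network namespace
# (e.g. a --net=host container) can simply connect by name.

# A file-based socket, by contrast, lives on disk and honors mode bits.
path = os.path.join(tempfile.mkdtemp(), "shim.sock")
fs_sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
fs_sock.bind(path)
os.chmod(path, 0o600)        # now only the owner may connect
print(os.path.exists(path))  # True

fs_sock.close()
os.unlink(path)
srv.close()
```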

&lt;h3&gt;
  
  
  Impact and Fix
&lt;/h3&gt;

&lt;p&gt;The fix switched the shim API from abstract Unix sockets to file-based Unix sockets under &lt;code&gt;/run/containerd&lt;/code&gt;, which respect filesystem permissions and namespace boundaries. One important caveat: containers started before the upgrade kept their old abstract-socket shims, so they had to be restarted for the fix to take effect.&lt;/p&gt;

&lt;p&gt;CVE-2020-15257 reinforced a well-known principle: &lt;strong&gt;do not use host networking unless absolutely necessary.&lt;/strong&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  CVE-2024-21626: Leaky Vessels
&lt;/h2&gt;

&lt;h3&gt;
  
  
  What Happened
&lt;/h3&gt;

&lt;p&gt;In January 2024, Snyk researchers disclosed a set of vulnerabilities collectively named "Leaky Vessels," with CVE-2024-21626 being the most severe. This was another runc vulnerability -- five years after CVE-2019-5736. It carried a CVSS score of 8.6.&lt;/p&gt;

&lt;h3&gt;
  
  
  How the Exploit Worked
&lt;/h3&gt;

&lt;p&gt;The vulnerability stemmed from an internal file descriptor leak in runc. When runc set up a new container, it inadvertently leaked file descriptors that pointed to the host filesystem.&lt;/p&gt;

&lt;p&gt;Two primary attack vectors:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Malicious container image.&lt;/strong&gt; A Dockerfile with a &lt;code&gt;WORKDIR&lt;/code&gt; directive set to a path like &lt;code&gt;/proc/self/fd/[leaked_fd]&lt;/code&gt; could cause the container process to start with its working directory pointing to a host filesystem location.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Crafted exec command.&lt;/strong&gt; An attacker with the ability to run &lt;code&gt;runc exec&lt;/code&gt; could specify a working directory that referenced the leaked file descriptor.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;What made this especially concerning was the image-based attack vector. Unlike CVE-2019-5736, which required an attacker to already have code execution inside a container, CVE-2024-21626 could be triggered simply by building or running a malicious image pulled from a registry.&lt;/p&gt;

&lt;h3&gt;
  
  
  Impact and Fix
&lt;/h3&gt;

&lt;p&gt;The fix in runc 1.1.12 ensured that all internal file descriptors are properly closed before the container process starts. The disclosure also included three other CVEs affecting Docker's BuildKit component, demonstrating that the container build pipeline -- not just runtime -- is a significant attack surface.&lt;/p&gt;

&lt;h2&gt;
  
  
  Other Notable Container Escape Vulnerabilities
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Dirty COW (CVE-2016-5195)
&lt;/h3&gt;

&lt;p&gt;A race condition in the Linux kernel's memory subsystem, present for nine years before discovery in October 2016. The vulnerability allowed an unprivileged process to write to read-only memory mappings. Researchers demonstrated container escape techniques using the &lt;strong&gt;vDSO&lt;/strong&gt; (virtual Dynamic Shared Object) to inject shellcode that would execute in the context of any process on the host.&lt;/p&gt;

&lt;h3&gt;
  
  
  systemd-journald Exploits (CVE-2018-16865 and CVE-2018-16866)
&lt;/h3&gt;

&lt;p&gt;Vulnerabilities in systemd-journald that, chained together, allowed a local attacker to obtain a root shell. Since journald runs as root and accepts log messages from containers, this created a path from containerized process to host root access through the logging infrastructure.&lt;/p&gt;

&lt;p&gt;These bugs highlighted the risk of host services that accept input from containers. Any host daemon that processes container-generated data is a potential escape vector.&lt;/p&gt;

&lt;h2&gt;
  
  
  Patterns Across Container Escape CVEs
&lt;/h2&gt;

&lt;p&gt;Several recurring patterns emerge:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Shared kernel, shared fate.&lt;/strong&gt; CVE-2017-5123 and Dirty COW exploited kernel bugs that no amount of namespace isolation can defend against. This is the fundamental architectural limitation of containers versus virtual machines.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;File descriptor and /proc leaks.&lt;/strong&gt; CVE-2019-5736 and CVE-2024-21626 both exploited how runc handles file descriptors and &lt;code&gt;/proc&lt;/code&gt; entries during container setup.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Host services extend the attack surface.&lt;/strong&gt; CVE-2020-15257 and the systemd-journald exploits show that any host service that accepts container input is a potential escape path.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Client tools matter too.&lt;/strong&gt; CVE-2019-1002101 weaponized kubectl to compromise operator workstations.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Modern Defenses Against Container Escapes
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Seccomp Profiles
&lt;/h3&gt;

&lt;p&gt;Seccomp restricts which system calls a containerized process can make. Docker's default profile blocks approximately 44 of the 300+ available system calls. Custom profiles tailored to your application's actual system call usage offer stronger protection.&lt;/p&gt;
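
&lt;p&gt;In Kubernetes, a profile is applied through the pod's &lt;code&gt;securityContext&lt;/code&gt;. A sketch (the pod name, image, and &lt;code&gt;localhostProfile&lt;/code&gt; path are placeholders to adapt):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;apiVersion: v1
kind: Pod
metadata:
  name: seccomp-demo          # hypothetical pod name
spec:
  securityContext:
    seccompProfile:
      type: RuntimeDefault    # the runtime's default profile
      # For a custom profile tailored to your app, use instead:
      # type: Localhost
      # localhostProfile: profiles/myapp.json  # relative to the kubelet seccomp dir
  containers:
  - name: app
    image: nginx:1.27         # example image
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;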

&lt;h3&gt;
  
  
  AppArmor and SELinux
&lt;/h3&gt;

&lt;p&gt;Mandatory Access Control (MAC) systems add restrictions beyond standard Linux permissions. &lt;strong&gt;SELinux&lt;/strong&gt; in enforcing mode mitigated CVE-2019-5736 by blocking writes to the host's runc binary. &lt;strong&gt;AppArmor&lt;/strong&gt; provides path-based controls.&lt;/p&gt;

&lt;h3&gt;
  
  
  Rootless Containers and User Namespaces
&lt;/h3&gt;

&lt;p&gt;Many container escape exploits require UID 0 inside the container. Rootless containers address this by running the entire container runtime as an unprivileged user, using &lt;strong&gt;user namespaces&lt;/strong&gt; to remap UID 0 inside the container to an unprivileged UID on the host.&lt;/p&gt;

&lt;p&gt;With rootless mode, even a successful escape lands the attacker on the host as an unprivileged user. Docker supports rootless mode natively (since 20.10), Podman runs rootless by default, and Kubernetes user namespaces for pods reached beta in version 1.30.&lt;/p&gt;
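
&lt;p&gt;On the Kubernetes side, the beta pod-level switch is &lt;code&gt;hostUsers: false&lt;/code&gt;, which asks the kubelet to run the pod in a fresh user namespace (pod name and image below are placeholders):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;apiVersion: v1
kind: Pod
metadata:
  name: userns-demo        # hypothetical pod name
spec:
  hostUsers: false         # UID 0 in the pod maps to an unprivileged host UID
  containers:
  - name: app
    image: nginx:1.27      # example image
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;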

&lt;h3&gt;
  
  
  Read-Only Root Filesystems
&lt;/h3&gt;

&lt;p&gt;Running containers with read-only root filesystems (&lt;code&gt;readOnlyRootFilesystem: true&lt;/code&gt;) prevents a compromised container from modifying its own filesystem, directly mitigating exploits like CVE-2019-5736.&lt;/p&gt;

&lt;h3&gt;
  
  
  Runtime Security: Falco and Tetragon
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Falco&lt;/strong&gt;, a CNCF graduated project, monitors system calls and container events against a rule engine. &lt;strong&gt;Tetragon&lt;/strong&gt;, from the Cilium project, uses eBPF to enforce security policies directly in the kernel, with performance overhead the project reports at under 1%.&lt;/p&gt;

&lt;h3&gt;
  
  
  Pod Security Standards
&lt;/h3&gt;

&lt;p&gt;Kubernetes Pod Security Standards define three profiles -- &lt;strong&gt;Privileged&lt;/strong&gt;, &lt;strong&gt;Baseline&lt;/strong&gt;, and &lt;strong&gt;Restricted&lt;/strong&gt;. The Restricted profile enforces non-root execution, drops all capabilities, disables privilege escalation, and requires a RuntimeDefault or Localhost seccomp profile.&lt;/p&gt;
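
&lt;p&gt;Profiles are enforced per namespace through labels read by the built-in Pod Security admission controller. For example, to reject any pod that fails the Restricted profile (namespace name is a placeholder):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;apiVersion: v1
kind: Namespace
metadata:
  name: prod-apps            # hypothetical namespace
  labels:
    pod-security.kubernetes.io/enforce: restricted
    pod-security.kubernetes.io/enforce-version: latest
    # Optionally surface violations without blocking:
    pod-security.kubernetes.io/warn: restricted
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;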

&lt;h3&gt;
  
  
  Image Scanning and Supply Chain Security
&lt;/h3&gt;

&lt;p&gt;Image scanning tools (Trivy, Grype, Snyk Container) detect known vulnerable packages, image signing with Sigstore/cosign provides provenance verification, and admission controllers can enforce that only signed, scanned images are deployed.&lt;/p&gt;

&lt;h2&gt;
  
  
  Container Escape Prevention Checklist
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Runtime Configuration
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Run containers as non-root.&lt;/strong&gt; Set &lt;code&gt;runAsNonRoot: true&lt;/code&gt; and specify a &lt;code&gt;runAsUser&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Drop all capabilities, add only what is needed.&lt;/strong&gt; Use &lt;code&gt;drop: ["ALL"]&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Disable privilege escalation.&lt;/strong&gt; Set &lt;code&gt;allowPrivilegeEscalation: false&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Use read-only root filesystems.&lt;/strong&gt; Set &lt;code&gt;readOnlyRootFilesystem: true&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Avoid host namespaces.&lt;/strong&gt; Do not use &lt;code&gt;hostNetwork&lt;/code&gt;, &lt;code&gt;hostPID&lt;/code&gt;, or &lt;code&gt;hostIPC&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Never run privileged containers in production.&lt;/strong&gt;&lt;/li&gt;
&lt;/ul&gt;
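
&lt;p&gt;The items above map directly onto a container's &lt;code&gt;securityContext&lt;/code&gt;; a sketch with placeholder names and an arbitrary unprivileged UID:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;apiVersion: v1
kind: Pod
metadata:
  name: hardened-app                   # hypothetical pod name
spec:
  containers:
  - name: app
    image: nginx:1.27                  # example image
    securityContext:
      runAsNonRoot: true               # refuse to start as UID 0
      runAsUser: 10001                 # any unprivileged UID
      allowPrivilegeEscalation: false  # no setuid/file-capability gains
      readOnlyRootFilesystem: true     # container cannot modify its own image
      capabilities:
        drop: ["ALL"]                  # add back only what the app needs
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;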

&lt;h3&gt;
  
  
  Infrastructure and Patching
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Keep the host kernel updated.&lt;/strong&gt; Kernel vulnerabilities bypass all container isolation.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Patch container runtimes promptly.&lt;/strong&gt; runc, containerd, and CRI-O vulnerabilities are direct escape vectors.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Update client tools.&lt;/strong&gt; kubectl and other client-side tools are part of the attack surface.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Enable user namespaces.&lt;/strong&gt; Ensure UID 0 inside containers maps to an unprivileged host UID.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Detection and Monitoring
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Deploy runtime security tooling.&lt;/strong&gt; Use Falco, Tetragon, or similar tools.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Apply seccomp profiles.&lt;/strong&gt; Start with defaults and customize based on your application.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Enable audit logging.&lt;/strong&gt; Kubernetes audit logs, container runtime logs, and host-level audit provide forensic trails.&lt;/li&gt;
&lt;/ul&gt;
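
&lt;p&gt;To give a flavor of runtime rules, here is a simplified version of Falco's stock "shell in container" detection -- treat it as a sketch, not a drop-in policy:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;- rule: Terminal shell in container
  desc: A shell was spawned inside a container (simplified sketch)
  condition: &gt;
    spawned_process and container and proc.name in (bash, sh, zsh)
  output: &gt;
    Shell spawned in container (user=%user.name
    container=%container.name command=%proc.cmdline)
  priority: WARNING
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;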

&lt;h3&gt;
  
  
  Supply Chain
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Scan images for known CVEs.&lt;/strong&gt; Run vulnerability scanners in your CI/CD pipeline.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Use minimal base images.&lt;/strong&gt; Smaller images have fewer potential vulnerabilities.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Sign and verify images.&lt;/strong&gt; Use cosign/Sigstore for image signing.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Pin image digests.&lt;/strong&gt; Reference images by digest rather than mutable tags.&lt;/li&gt;
&lt;/ul&gt;
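
&lt;p&gt;In CI, the scanning and signing steps can be chained so unscanned images never ship. A hedged GitHub Actions sketch -- the image name and action versions are assumptions to adapt:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;jobs:
  scan-and-sign:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Scan image for known CVEs
        uses: aquasecurity/trivy-action@0.28.0
        with:
          image-ref: registry.example.com/myapp:latest  # placeholder image
          exit-code: '1'                # fail the build on findings
          severity: CRITICAL,HIGH
      - name: Install cosign
        uses: sigstore/cosign-installer@v3
      - name: Sign the image
        run: cosign sign --yes registry.example.com/myapp:latest  # keyless signing
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;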

&lt;h2&gt;
  
  
  The Future of Container Isolation
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Sandbox runtimes&lt;/strong&gt; like gVisor and Kata Containers add stronger isolation boundaries. &lt;strong&gt;eBPF-based security enforcement&lt;/strong&gt; is maturing rapidly. &lt;strong&gt;Confidential computing&lt;/strong&gt; (AMD SEV, Intel TDX) is bringing hardware-level isolation to container workloads by running them inside memory-encrypted virtual machines.&lt;/p&gt;

&lt;p&gt;For most teams today, defense in depth -- rootless containers, seccomp profiles, MAC policies, runtime security tools, and diligent patching -- provides strong protection. No single mechanism is a silver bullet, but the combination makes exploitation significantly harder and detection significantly faster.&lt;/p&gt;

&lt;p&gt;Container escapes are not theoretical. They have been discovered repeatedly in the most critical infrastructure components, from the Linux kernel to runc to containerd to kubectl. The organizations that avoid becoming case studies are the ones that treat these vulnerabilities as inevitable, and build their defenses accordingly.&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;🔍 Related tool:&lt;/strong&gt; &lt;a href="https://releaserun.com/tools/kubernetes-security-linter/" rel="noopener noreferrer"&gt;Kubernetes YAML Security Linter&lt;/a&gt; — paste any K8s manifest and scan for 12 security issues with an A–F grade. Free, browser-based.&lt;/p&gt;

</description>
      <category>security</category>
      <category>kubernetes</category>
      <category>docker</category>
      <category>containers</category>
    </item>
  </channel>
</rss>
