<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Alexandre Vazquez</title>
    <description>The latest articles on DEV Community by Alexandre Vazquez (@alexandrev).</description>
    <link>https://dev.to/alexandrev</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F167984%2F9c789f5c-7dab-4a86-aece-8bf66ea955bd.jpeg</url>
      <title>DEV Community: Alexandre Vazquez</title>
      <link>https://dev.to/alexandrev</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/alexandrev"/>
    <language>en</language>
    <item>
      <title>Skopeo, Crane, and regctl: Container Image Management Without the Docker Daemon (2026)</title>
      <dc:creator>Alexandre Vazquez</dc:creator>
      <pubDate>Thu, 28 May 2026 13:39:03 +0000</pubDate>
      <link>https://dev.to/alexandrev/skopeo-crane-and-regctl-container-image-management-without-the-docker-daemon-2026-36jn</link>
      <guid>https://dev.to/alexandrev/skopeo-crane-and-regctl-container-image-management-without-the-docker-daemon-2026-36jn</guid>
      <description>&lt;h2&gt;
  
  
  The Problem: Docker Is Overkill for Image Operations
&lt;/h2&gt;

&lt;p&gt;You need to copy an image from Docker Hub to your private registry. Or inspect a manifest before pulling. Or delete old tags programmatically. Or sync an entire repository during a migration.&lt;/p&gt;

&lt;p&gt;The instinct is to reach for Docker. But Docker requires a running daemon, root access (or group membership that amounts to the same thing), and pulls the entire image to disk just to read its metadata. For CI pipelines, GitOps workflows, and platform tooling, that’s a significant overhead for what should be lightweight registry operations.&lt;/p&gt;

&lt;p&gt;This is the problem that &lt;strong&gt;daemonless container image tools&lt;/strong&gt; solve. Skopeo pioneered the category; today it has real competition from crane, regctl, and ORAS — each with different strengths and ideal use cases.&lt;/p&gt;

&lt;p&gt;This article gives you the practical comparison to pick the right tool for your workflow.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Contenders
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Tool&lt;/th&gt;
&lt;th&gt;Maintainer&lt;/th&gt;
&lt;th&gt;Language&lt;/th&gt;
&lt;th&gt;Daemon required&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Skopeo&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Red Hat / containers&lt;/td&gt;
&lt;td&gt;Go&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;crane&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Google / ko-build&lt;/td&gt;
&lt;td&gt;Go&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;regctl&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;regclient&lt;/td&gt;
&lt;td&gt;Go&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;ORAS&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;CNCF&lt;/td&gt;
&lt;td&gt;Go&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;cosign&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Sigstore / OpenSSF&lt;/td&gt;
&lt;td&gt;Go&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;All five are written in Go, ship as standalone CLI binaries, and work directly against the OCI Distribution Spec. None of them require Docker or any container runtime.&lt;/p&gt;




&lt;h2&gt;
  
  
  Skopeo
&lt;/h2&gt;

&lt;p&gt;Skopeo was the first major tool to address daemonless image operations, released by Red Hat in 2016 as part of the &lt;code&gt;containers/image&lt;/code&gt; ecosystem (alongside Podman and Buildah).&lt;/p&gt;

&lt;h3&gt;
  
  
  What Skopeo does well
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Image inspection without pulling:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;skopeo inspect docker://registry.k8s.io/pause:3.9

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Returns full image metadata — digest, layers, labels, architecture, OS — without downloading a single layer. Useful in admission controllers, policy checks, and pre-deployment validation.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Cross-registry copying:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;skopeo copy &lt;span class="se"&gt;\&lt;/span&gt;
  docker://docker.io/library/nginx:1.27 &lt;span class="se"&gt;\&lt;/span&gt;
  docker://harbor.internal/library/nginx:1.27

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Copies image manifests and layers from one registry to the other without a daemon or a local image store. The data streams through the machine running Skopeo, but nothing is written to its disk.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Multi-arch handling:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;skopeo copy &lt;span class="nt"&gt;--all&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  docker://docker.io/library/nginx:1.27 &lt;span class="se"&gt;\&lt;/span&gt;
  docker://harbor.internal/library/nginx:1.27

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;--all&lt;/code&gt; flag copies the full manifest list, preserving all architectures (linux/amd64, linux/arm64, etc.). This is critical when mirroring images for multi-arch clusters.&lt;/p&gt;
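&lt;p&gt;Before mirroring, it can help to confirm that a reference really is multi-arch. The sketch below is a hypothetical helper (&lt;code&gt;is_multiarch&lt;/code&gt; is not part of Skopeo) that reads a raw manifest on stdin and looks for the two index media types. It assumes the registry sets &lt;code&gt;mediaType&lt;/code&gt; on the manifest and uses a plain substring match, so treat it as a quick check rather than a strict parser.&lt;/p&gt;

```shell
#!/usr/bin/env bash
# is_multiarch: succeed when the manifest JSON on stdin is a manifest list
# (Docker) or an image index (OCI). Hypothetical helper for illustration.
is_multiarch() {
  local manifest
  manifest=$(cat)
  case "$manifest" in
    *application/vnd.oci.image.index.v1+json*) return 0 ;;
    *application/vnd.docker.distribution.manifest.list.v2+json*) return 0 ;;
    *) return 1 ;;
  esac
}

# Example (needs skopeo and network access, so it is left commented out):
# skopeo inspect --raw docker://docker.io/library/nginx:1.27 | is_multiarch
```

&lt;p&gt;A non-zero exit status means the reference is a single-platform image, in which case &lt;code&gt;--all&lt;/code&gt; makes no difference.&lt;/p&gt;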

&lt;p&gt;&lt;strong&gt;Registry synchronization:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;skopeo &lt;span class="nb"&gt;sync&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--src&lt;/span&gt; docker &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--dest&lt;/span&gt; docker &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--all&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  docker.io/library/nginx &lt;span class="se"&gt;\&lt;/span&gt;
  harbor.internal/mirrors/

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;skopeo sync&lt;/code&gt; mirrors an entire repository, including all tags. You can also use a YAML file to define which images and tags to sync — useful for air-gapped environment bootstrapping.&lt;/p&gt;
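&lt;p&gt;If the list of images comes from another system, the YAML source file can be generated rather than written by hand. This is a minimal sketch, assuming one registry and one repository per call; &lt;code&gt;sync_yaml&lt;/code&gt; is a hypothetical helper name, and the layout follows the registry, &lt;code&gt;images&lt;/code&gt;, repository, tag-list structure that &lt;code&gt;skopeo sync --src yaml&lt;/code&gt; reads.&lt;/p&gt;

```shell
#!/usr/bin/env bash
# sync_yaml: print a skopeo sync source file for one registry and one
# repository. Usage: sync_yaml REGISTRY REPOSITORY TAG [TAG...]
# Hypothetical helper for illustration.
sync_yaml() {
  local registry=$1 repository=$2
  shift 2
  printf '%s:\n' "$registry"
  printf '  images:\n'
  printf '    %s:\n' "$repository"
  local tag
  for tag in "$@"; do
    # Quote each tag so values such as 1.10 are not read as numbers.
    printf '      - "%s"\n' "$tag"
  done
}

sync_yaml docker.io library/nginx 1.26 1.27
```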

&lt;p&gt;&lt;strong&gt;Tag deletion:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;skopeo delete docker://harbor.internal/myapp:old-tag

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Useful in CI for cleanup pipelines. Note that registry-side deletion requires the registry to have the DELETE method enabled.&lt;/p&gt;

&lt;h3&gt;
  
  
  Skopeo’s weaknesses
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;No image modification&lt;/strong&gt; : Skopeo copies and inspects, but doesn’t build or modify images&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Tag listing is verbose&lt;/strong&gt; : &lt;code&gt;skopeo list-tags&lt;/code&gt; returns JSON you need to parse&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;No retry logic by default&lt;/strong&gt; : transient network errors in long sync operations require wrapping with retry scripts&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Auth configuration&lt;/strong&gt; : relies on &lt;code&gt;containers/auth.json&lt;/code&gt; format, which differs from Docker’s &lt;code&gt;~/.docker/config.json&lt;/code&gt; (though it supports both)&lt;/li&gt;
&lt;/ul&gt;
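&lt;p&gt;The retry wrapper mentioned in the list above could look roughly like this. It is a sketch with exponential backoff; &lt;code&gt;retry&lt;/code&gt; and the &lt;code&gt;RETRY_BASE_DELAY&lt;/code&gt; variable are hypothetical names chosen for this example, not Skopeo features.&lt;/p&gt;

```shell
#!/usr/bin/env bash
# retry: run a command until it succeeds or the attempt budget is spent.
# Usage: retry MAX_ATTEMPTS COMMAND [ARGS...]
# RETRY_BASE_DELAY (seconds, default 2) is a hypothetical knob; the delay
# doubles after every failed attempt.
retry() {
  local max_attempts=$1
  shift
  local delay=${RETRY_BASE_DELAY:-2}
  local attempt=1
  until "$@"; do
    if [ "$attempt" -ge "$max_attempts" ]; then
      return 1
    fi
    sleep "$delay"
    delay=$((delay * 2))
    attempt=$((attempt + 1))
  done
  return 0
}

# Example (needs skopeo and registry access, so it is left commented out):
# retry 5 skopeo sync --src docker --dest docker --all \
#   docker.io/library/nginx harbor.internal/mirrors/
```

&lt;p&gt;Because &lt;code&gt;skopeo sync&lt;/code&gt; skips nothing on its own when restarted, the wrapper simply reruns the whole command; for very large repositories a per-image loop may be the better unit to retry.&lt;/p&gt;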

&lt;h3&gt;
  
  
  When to use Skopeo
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Air-gapped environment image mirroring&lt;/li&gt;
&lt;li&gt;CI pipelines that need to copy or inspect images without Docker&lt;/li&gt;
&lt;li&gt;Platform teams on Red Hat / OpenShift stacks&lt;/li&gt;
&lt;li&gt;Any workflow already using Podman or Buildah&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Crane
&lt;/h2&gt;

&lt;p&gt;Crane is Google’s answer to Skopeo. It is the CLI that ships with the &lt;code&gt;go-containerregistry&lt;/code&gt; library, the same library that underpins &lt;a href="https://ko.build/" rel="noopener noreferrer"&gt;ko&lt;/a&gt;. It’s simpler, more scriptable, and has a cleaner CLI design.&lt;/p&gt;

&lt;h3&gt;
  
  
  What Crane does well
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Tag listing:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;crane &lt;span class="nb"&gt;ls &lt;/span&gt;registry.k8s.io/pause

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;No JSON parsing needed. One tag per line. Pipe directly into &lt;code&gt;grep&lt;/code&gt;, &lt;code&gt;sort&lt;/code&gt;, &lt;code&gt;head&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Digest resolution:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;crane digest docker.io/library/nginx:1.27

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Returns the image digest. Combine with &lt;code&gt;yq&lt;/code&gt; or &lt;code&gt;sed&lt;/code&gt; to pin image references in Helm values or Kubernetes manifests.&lt;/p&gt;
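&lt;p&gt;To turn a tag reference plus the digest from &lt;code&gt;crane digest&lt;/code&gt; into a pinned reference, something like the following sketch would work. &lt;code&gt;pin_ref&lt;/code&gt; is a hypothetical helper; it drops the tag and keeps only &lt;code&gt;name@digest&lt;/code&gt;, which is one of several valid pinning styles.&lt;/p&gt;

```shell
#!/usr/bin/env bash
# pin_ref: turn REF (name[:tag]) plus DIGEST into name@digest.
# Only the last path component is searched for a tag, so a registry port
# such as localhost:5000 is left alone. Hypothetical helper.
pin_ref() {
  local ref=$1 digest=$2
  local name=${ref##*/}          # last path component, e.g. nginx:1.27
  local prefix=${ref%"$name"}    # everything before it, e.g. docker.io/library/
  name=${name%%@*}               # drop an existing digest, if any
  name=${name%%:*}               # drop the tag, if any
  printf '%s%s@%s\n' "$prefix" "$name" "$digest"
}

# Example (needs crane and network access, so it is left commented out):
# pin_ref docker.io/library/nginx:1.27 "$(crane digest docker.io/library/nginx:1.27)"
```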

&lt;p&gt;&lt;strong&gt;Image copying:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;crane &lt;span class="nb"&gt;cp &lt;/span&gt;docker.io/library/nginx:1.27 harbor.internal/library/nginx:1.27

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Same capability as Skopeo’s copy, arguably with a cleaner syntax.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Manifest inspection:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;crane manifest docker.io/library/nginx:1.27 | jq &lt;span class="nb"&gt;.&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Returns raw manifest JSON. Useful when you need the exact manifest for digest verification or policy enforcement.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Tagging and retagging:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;crane tag harbor.internal/myapp:abc123 harbor.internal/myapp:stable

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Adds a new tag to an existing image without re-uploading layers. The tag operation is purely a manifest pointer update.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Flattening images:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;crane flatten docker.io/library/ubuntu:24.04 &lt;span class="nt"&gt;-t&lt;/span&gt; harbor.internal/ubuntu:flat

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Squashes all layers into one. Reduces layer count for images where layer history doesn’t matter.&lt;/p&gt;

&lt;h3&gt;
  
  
  Crane’s weaknesses
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;No sync command&lt;/strong&gt; : unlike Skopeo, crane has no built-in repository sync. You script it yourself with &lt;code&gt;crane ls&lt;/code&gt; + &lt;code&gt;crane cp&lt;/code&gt; in a loop&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Less mature multi-arch support&lt;/strong&gt; : &lt;code&gt;crane cp&lt;/code&gt; supports multi-arch but the UX is less explicit than Skopeo’s &lt;code&gt;--all&lt;/code&gt;
&lt;/li&gt;
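&lt;li&gt;
&lt;strong&gt;Basic deletion only&lt;/strong&gt; : &lt;code&gt;crane delete&lt;/code&gt; removes a single reference, but there is no built-in retention or bulk cleanup, so tag pruning has to be scripted&lt;/li&gt;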
&lt;/ul&gt;
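&lt;p&gt;The &lt;code&gt;crane ls&lt;/code&gt; plus &lt;code&gt;crane cp&lt;/code&gt; loop mentioned above can be sketched as follows. &lt;code&gt;sync_repo&lt;/code&gt; and the &lt;code&gt;CRANE&lt;/code&gt; override variable are hypothetical names for this example; the loop copies tags one at a time and stops at the first failure.&lt;/p&gt;

```shell
#!/usr/bin/env bash
# CRANE is a hypothetical override so the binary can be swapped or stubbed.
CRANE=${CRANE:-crane}

# sync_repo: copy every tag of SRC_REPO to DST_REPO, stopping at the first
# failure. Tags never contain whitespace, so word splitting is safe here.
sync_repo() {
  local src=$1 dst=$2 tags tag
  tags=$("$CRANE" ls "$src") || return 1
  for tag in $tags; do
    "$CRANE" cp "$src:$tag" "$dst:$tag" || return 1
  done
}

# Example (needs crane and registry access, so it is left commented out):
# sync_repo docker.io/library/nginx harbor.internal/mirrors/nginx
```

&lt;p&gt;Piping &lt;code&gt;crane ls&lt;/code&gt; through &lt;code&gt;grep&lt;/code&gt; before the loop would be the natural place to restrict the sync to a subset of tags.&lt;/p&gt;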

&lt;h3&gt;
  
  
  When to use Crane
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;CI/CD scripting where you want clean, pipeable output&lt;/li&gt;
&lt;li&gt;Digest pinning workflows&lt;/li&gt;
&lt;li&gt;Lightweight image tagging operations&lt;/li&gt;
&lt;li&gt;When you’re already in the ko / Google Cloud ecosystem&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  regctl
&lt;/h2&gt;

&lt;p&gt;Regctl is the least known of the three but arguably the most feature-complete. It’s the CLI for the &lt;code&gt;regclient&lt;/code&gt; Go library and covers use cases that Skopeo and crane leave out.&lt;/p&gt;

&lt;h3&gt;
  
  
  What regctl does uniquely well
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Image modification without rebuild:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;regctl image mod myimage:tag &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--label&lt;/span&gt; &lt;span class="s2"&gt;"org.opencontainers.image.version=1.2.3"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--replace&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



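&lt;p&gt;You can add or change labels, annotations, and config fields on an image that already lives in the registry, without pulling its layers or rebuilding. regctl writes a new config and manifest and, with &lt;code&gt;--replace&lt;/code&gt;, points the existing tag at the result. Skopeo has no equivalent; crane’s &lt;code&gt;mutate&lt;/code&gt; command covers labels and annotations but offers nothing comparable for removing layers.&lt;/p&gt;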

&lt;p&gt;&lt;strong&gt;Layer operations:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Remove a specific layer from an image&lt;/span&gt;
regctl image mod myimage:tag &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--layer-rm&lt;/span&gt; sha256:abc123... &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--replace&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Useful for removing accidentally included secrets or large unnecessary layers from published images.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;OCI artifact support:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;regctl artifact put &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--media-type&lt;/span&gt; application/vnd.example.config.v1+json &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--config&lt;/span&gt; config.json &lt;span class="se"&gt;\&lt;/span&gt;
  file.tar.gz &lt;span class="se"&gt;\&lt;/span&gt;
  harbor.internal/myartifacts:v1

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Regctl has solid OCI artifact support alongside standard image operations.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Formatting and output:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;regctl tag list harbor.internal/myapp &lt;span class="nt"&gt;--format&lt;/span&gt; &lt;span class="s1"&gt;'{{range .}}{{println .}}{{end}}'&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Go template formatting throughout. Useful for integrating into shell scripts without jq.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Referrers (OCI 1.1):&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;regctl manifest get-list harbor.internal/myapp:v1 &lt;span class="nt"&gt;--referrers&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Lists referrers (signatures, SBOMs, attestations) attached to an image via the OCI 1.1 referrers API.&lt;/p&gt;

&lt;h3&gt;
  
  
  Regctl’s weaknesses
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Smaller community&lt;/strong&gt; : fewer examples, less StackOverflow coverage&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Steeper learning curve&lt;/strong&gt; : more commands, more flags&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Less packaging&lt;/strong&gt; : not in most distro repos by default&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  When to use regctl
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Image post-processing (labels, annotations, layer removal) without rebuild&lt;/li&gt;
&lt;li&gt;Advanced manifest and referrer workflows&lt;/li&gt;
&lt;li&gt;When you need OCI artifact operations alongside image operations&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  ORAS
&lt;/h2&gt;

&lt;p&gt;ORAS (OCI Registry As Storage) is a CNCF project focused specifically on OCI artifact management — pushing and pulling arbitrary files to container registries, not necessarily container images.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Push a Helm chart as an OCI artifact&lt;/span&gt;
oras push harbor.internal/charts/myapp:1.0.0 &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--artifact-type&lt;/span&gt; application/vnd.helm.chart.v1+tar &lt;span class="se"&gt;\&lt;/span&gt;
  mychart.tgz

&lt;span class="c"&gt;# Push SBOM&lt;/span&gt;
oras push harbor.internal/myapp:v1 &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--artifact-type&lt;/span&gt; application/spdx+json &lt;span class="se"&gt;\&lt;/span&gt;
  sbom.spdx.json

&lt;span class="c"&gt;# Pull&lt;/span&gt;
oras pull harbor.internal/charts/myapp:1.0.0

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;ORAS is not a direct Skopeo replacement — it’s for when your registry is a general-purpose artifact store, not just a container registry. Helm OCI, SBOMs, attestations, and policy bundles all benefit from ORAS.&lt;/p&gt;




&lt;h2&gt;
  
  
  cosign
&lt;/h2&gt;

&lt;p&gt;Cosign from Sigstore is not a general-purpose image tool — it’s specifically for supply chain security. But it’s increasingly part of any container image workflow.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Sign an image&lt;/span&gt;
cosign sign &lt;span class="nt"&gt;--key&lt;/span&gt; cosign.key harbor.internal/myapp:v1@sha256:abc123...

&lt;span class="c"&gt;# Verify&lt;/span&gt;
cosign verify &lt;span class="nt"&gt;--key&lt;/span&gt; cosign.pub harbor.internal/myapp:v1

&lt;span class="c"&gt;# Attach SBOM&lt;/span&gt;
cosign attach sbom &lt;span class="nt"&gt;--sbom&lt;/span&gt; sbom.spdx harbor.internal/myapp:v1

&lt;span class="c"&gt;# Keyless signing (Sigstore)&lt;/span&gt;
cosign sign harbor.internal/myapp:v1

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Cosign integrates with OIDC providers for keyless signing (no key management required), which is the direction the ecosystem is moving. If you’re building a supply chain security practice, cosign is mandatory, not optional.&lt;/p&gt;




&lt;h2&gt;
  
  
  Side-by-side comparison
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Operation&lt;/th&gt;
&lt;th&gt;Skopeo&lt;/th&gt;
&lt;th&gt;Crane&lt;/th&gt;
&lt;th&gt;regctl&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Inspect image&lt;/td&gt;
&lt;td&gt;&lt;code&gt;inspect&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;manifest&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;manifest get&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Copy image&lt;/td&gt;
&lt;td&gt;&lt;code&gt;copy&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;cp&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;image copy&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Copy all arches&lt;/td&gt;
&lt;td&gt;&lt;code&gt;copy --all&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;cp&lt;/code&gt; (auto)&lt;/td&gt;
&lt;td&gt;&lt;code&gt;image copy&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Sync repository&lt;/td&gt;
&lt;td&gt;&lt;code&gt;sync&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;script it&lt;/td&gt;
&lt;td&gt;script it&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;List tags&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;list-tags&lt;/code&gt; (JSON)&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;ls&lt;/code&gt; (plain)&lt;/td&gt;
&lt;td&gt;&lt;code&gt;tag list&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Delete tag/image&lt;/td&gt;
&lt;td&gt;&lt;code&gt;delete&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;delete&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;tag delete&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Modify labels&lt;/td&gt;
&lt;td&gt;—&lt;/td&gt;
&lt;td&gt;&lt;code&gt;mutate&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;image mod&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Remove layer&lt;/td&gt;
&lt;td&gt;—&lt;/td&gt;
&lt;td&gt;—&lt;/td&gt;
&lt;td&gt;&lt;code&gt;image mod&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;OCI artifacts&lt;/td&gt;
&lt;td&gt;limited&lt;/td&gt;
&lt;td&gt;limited&lt;/td&gt;
&lt;td&gt;&lt;code&gt;artifact&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Referrers (1.1)&lt;/td&gt;
&lt;td&gt;—&lt;/td&gt;
&lt;td&gt;—&lt;/td&gt;
&lt;td&gt;&lt;code&gt;artifact list&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h2&gt;
  
  
  Practical workflows
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Mirror images for air-gapped clusters (Skopeo)
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="c1"&gt;# sync-list.yaml&lt;/span&gt;
&lt;span class="na"&gt;docker.io&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;images&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;library/nginx&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;1.25"&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;1.26"&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;1.27"&lt;/span&gt;
    &lt;span class="na"&gt;library/redis&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;7.2"&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;7.4"&lt;/span&gt;


&lt;span class="s"&gt;skopeo sync \&lt;/span&gt;
  &lt;span class="s"&gt;--src yaml \&lt;/span&gt;
  &lt;span class="s"&gt;--dest docker \&lt;/span&gt;
  &lt;span class="s"&gt;--all \&lt;/span&gt;
  &lt;span class="s"&gt;sync-list.yaml \&lt;/span&gt;
  &lt;span class="s"&gt;harbor.internal/mirrors/&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Pin image digests in CI (Crane)
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;#!/bin/bash&lt;/span&gt;
&lt;span class="c"&gt;# Update image digests in values.yaml&lt;/span&gt;
&lt;span class="k"&gt;for &lt;/span&gt;image &lt;span class="k"&gt;in &lt;/span&gt;nginx:1.27 redis:7.4&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;do
  &lt;/span&gt;&lt;span class="nv"&gt;digest&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;crane digest docker.io/library/&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;image&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;
  &lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"docker.io/library/&lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;image&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;@&lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;digest&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
&lt;span class="k"&gt;done&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Combine with &lt;code&gt;yq&lt;/code&gt; to update Helm values files automatically, ensuring reproducible deployments.&lt;/p&gt;

&lt;h3&gt;
  
  
  Retag without re-pushing (Crane or regctl)
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# After a successful deploy to staging, promote to production&lt;/span&gt;
crane tag harbor.internal/myapp:&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;GIT_SHA&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt; harbor.internal/myapp:production

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;No layer transfer. The operation is a metadata update in the registry.&lt;/p&gt;

&lt;h3&gt;
  
  
  Add OCI annotations post-build (regctl)
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;regctl image mod harbor.internal/myapp:v1.2.3 &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--annotation&lt;/span&gt; &lt;span class="s2"&gt;"org.opencontainers.image.source=https://github.com/org/repo"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--annotation&lt;/span&gt; &lt;span class="s2"&gt;"org.opencontainers.image.revision=&lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;GIT_SHA&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--replace&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Attaches build metadata to an image already in the registry, without a rebuild.&lt;/p&gt;

&lt;h3&gt;
  
  
  Supply chain security pipeline
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# 1. Build and push&lt;/span&gt;
docker buildx build &lt;span class="nt"&gt;--push&lt;/span&gt; &lt;span class="nt"&gt;-t&lt;/span&gt; harbor.internal/myapp:&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;GIT_SHA&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt; &lt;span class="nb"&gt;.&lt;/span&gt;

&lt;span class="c"&gt;# 2. Generate SBOM&lt;/span&gt;
syft harbor.internal/myapp:&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;GIT_SHA&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt; &lt;span class="nt"&gt;-o&lt;/span&gt; spdx-json &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; sbom.spdx.json

&lt;span class="c"&gt;# 3. Attach SBOM&lt;/span&gt;
cosign attach sbom &lt;span class="nt"&gt;--sbom&lt;/span&gt; sbom.spdx.json harbor.internal/myapp:&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;GIT_SHA&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;

&lt;span class="c"&gt;# 4. Sign (keyless with OIDC in CI)&lt;/span&gt;
cosign sign harbor.internal/myapp:&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;GIT_SHA&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;

&lt;span class="c"&gt;# 5. Verify in admission controller or deployment pipeline&lt;/span&gt;
cosign verify &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--certificate-identity-regexp&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"https://github.com/org/repo"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--certificate-oidc-issuer&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"https://token.actions.githubusercontent.com"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  harbor.internal/myapp:&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;GIT_SHA&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Installation
&lt;/h2&gt;

&lt;p&gt;All five are available as prebuilt binaries or through common package managers:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Skopeo (via package manager)&lt;/span&gt;
brew &lt;span class="nb"&gt;install &lt;/span&gt;skopeo &lt;span class="c"&gt;# macOS&lt;/span&gt;
dnf &lt;span class="nb"&gt;install &lt;/span&gt;skopeo &lt;span class="c"&gt;# RHEL/Fedora&lt;/span&gt;
apt &lt;span class="nb"&gt;install &lt;/span&gt;skopeo &lt;span class="c"&gt;# Debian/Ubuntu&lt;/span&gt;

&lt;span class="c"&gt;# Crane&lt;/span&gt;
brew &lt;span class="nb"&gt;install &lt;/span&gt;crane
&lt;span class="c"&gt;# or binary release&lt;/span&gt;
curl &lt;span class="nt"&gt;-sL&lt;/span&gt; https://github.com/google/go-containerregistry/releases/latest/download/go-containerregistry_Linux_x86_64.tar.gz | &lt;span class="nb"&gt;tar &lt;/span&gt;xz crane

&lt;span class="c"&gt;# regctl&lt;/span&gt;
curl &lt;span class="nt"&gt;-sL&lt;/span&gt; https://github.com/regclient/regclient/releases/latest/download/regctl.linux.amd64 &lt;span class="nt"&gt;-o&lt;/span&gt; /usr/local/bin/regctl
&lt;span class="nb"&gt;chmod&lt;/span&gt; +x /usr/local/bin/regctl

&lt;span class="c"&gt;# ORAS&lt;/span&gt;
brew &lt;span class="nb"&gt;install &lt;/span&gt;oras

&lt;span class="c"&gt;# cosign&lt;/span&gt;
brew &lt;span class="nb"&gt;install &lt;/span&gt;cosign

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Which tool should you use?
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Use Skopeo if:&lt;/strong&gt; you’re on a Red Hat / OpenShift stack, you need repository sync, or you’re building air-gapped environment pipelines. It’s the most battle-tested and widely packaged.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Use Crane if:&lt;/strong&gt; you’re scripting image operations in CI and want clean, composable CLI output. &lt;code&gt;crane ls&lt;/code&gt; + &lt;code&gt;crane cp&lt;/code&gt; + &lt;code&gt;crane digest&lt;/code&gt; cover 80% of automation use cases with minimal friction.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Use regctl if:&lt;/strong&gt; you need to modify images post-build, work with OCI referrers, or want the most complete feature set for a registry client. It has a steeper learning curve but can replace both Skopeo and crane for advanced workflows.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Use ORAS if:&lt;/strong&gt; you’re using a registry to store non-image artifacts — Helm charts, SBOMs, policy bundles, ML models.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Use cosign&lt;/strong&gt; regardless of which of the above you pick, as soon as supply chain security matters to your organization. It’s not a replacement for the others — it’s a complement.&lt;/p&gt;

&lt;p&gt;In practice, most platform teams end up using two or three of these tools together: Crane for day-to-day scripting, Skopeo for sync jobs, cosign for signing, and ORAS for artifact storage.&lt;/p&gt;




&lt;h2&gt;
  
  
  FAQ
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Can I use these tools with private registries?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Yes. All support standard registry authentication. Crane and Skopeo both read from &lt;code&gt;~/.docker/config.json&lt;/code&gt;. Regctl has its own config file (&lt;code&gt;~/.regctl/config.json&lt;/code&gt;) but can import Docker credentials. Set &lt;code&gt;DOCKER_CONFIG&lt;/code&gt; to point to your credentials file in CI environments.&lt;/p&gt;
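&lt;p&gt;For CI jobs that receive a username and token as secrets, the credentials file can be generated on the fly. The sketch below prints a minimal Docker-style &lt;code&gt;config.json&lt;/code&gt; with a basic-auth entry; &lt;code&gt;docker_auth_json&lt;/code&gt; is a hypothetical helper, and it assumes the values contain no double quotes or backslashes.&lt;/p&gt;

```shell
#!/usr/bin/env bash
# docker_auth_json: print a minimal Docker-style config.json that holds
# basic-auth credentials for one registry. Hypothetical helper; the
# arguments are assumed not to contain double quotes or backslashes.
docker_auth_json() {
  local registry=$1 user=$2 password=$3 auth
  # base64 may wrap long output, so strip newlines before embedding it.
  auth=$(printf '%s:%s' "$user" "$password" | base64 | tr -d '\n')
  printf '{"auths":{"%s":{"auth":"%s"}}}\n' "$registry" "$auth"
}

docker_auth_json harbor.internal ci s3cret
```

&lt;p&gt;In a real pipeline the output would be written to &lt;code&gt;config.json&lt;/code&gt; inside the directory that &lt;code&gt;DOCKER_CONFIG&lt;/code&gt; points to, with the password read from the CI secret store rather than passed as a literal. A credential helper is usually the better choice where one exists.&lt;/p&gt;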

&lt;p&gt;&lt;strong&gt;Do these work with Docker Hub rate limits?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Yes, and they’re often more efficient than the Docker CLI because they only fetch manifest metadata for inspect operations, not full layers. For heavy pull workloads, authenticate with your Docker Hub credentials to get higher rate limits.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What about ECR, GCR, and Azure Container Registry?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;All tools support these with the appropriate credential helpers. For ECR, use &lt;code&gt;docker-credential-ecr-login&lt;/code&gt;. Crane handles registry logins through its &lt;code&gt;crane auth&lt;/code&gt; commands. Skopeo supports ECR via &lt;code&gt;--creds "AWS:$(aws ecr get-login-password)"&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Are these tools safe to run in Kubernetes pods?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Yes. Since they require no daemon and no elevated privileges for read operations, they’re well-suited to run as init containers or sidecar containers in Kubernetes. Skopeo is commonly used in image pre-pulling init containers. Use a dedicated service account with least-privilege registry credentials.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Can I copy a multi-arch image and keep all platforms?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Skopeo: &lt;code&gt;skopeo copy --all&lt;/code&gt;. Crane: &lt;code&gt;crane cp&lt;/code&gt; copies the index automatically when the source is a manifest list. Regctl: &lt;code&gt;regctl image copy&lt;/code&gt; preserves manifest lists by default.&lt;/p&gt;


&lt;h2&gt;
  
  
  Related articles:
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;a href="https://alexandre-vazquez.com/debugging-distroless-containers/" rel="noopener noreferrer"&gt;Debugging Distroless Containers: kubectl debug, Ephemeral Containers, and When to Use Each&lt;/a&gt; &lt;small&gt;The container works fine in CI. It deploys successfully to...&lt;/small&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://alexandre-vazquez.com/argocd-guide/" rel="noopener noreferrer"&gt;ArgoCD Guide: GitOps Continuous Delivery for Kubernetes&lt;/a&gt; &lt;small&gt;ArgoCD is the leading GitOps operator for Kubernetes. This guide...&lt;/small&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://alexandre-vazquez.com/harbor-registry-security/" rel="noopener noreferrer"&gt;Harbor Registry Explained: Securing Container Images in Kubernetes and DevSecOps&lt;/a&gt; &lt;small&gt;Learn how Harbor Registry improves container security by enabling vulnerability...&lt;/small&gt;
&lt;/li&gt;
&lt;/ol&gt;

</description>
      <category>kubernetes</category>
    </item>
    <item>
      <title>Transforming XML to JSON and CSV with XSLT</title>
      <dc:creator>Alexandre Vazquez</dc:creator>
      <pubDate>Thu, 28 May 2026 09:00:02 +0000</pubDate>
      <link>https://dev.to/alexandrev/transforming-xml-to-json-and-csv-with-xslt-4f4f</link>
      <guid>https://dev.to/alexandrev/transforming-xml-to-json-and-csv-with-xslt-4f4f</guid>
      <description>&lt;p&gt;XSLT is usually associated with XML-to-XML transformations, but in integration work you often need JSON or CSV. The good news is that XSLT is perfectly capable of producing non-XML outputs when you design the stylesheet for it. The key is to choose the right output method, control whitespace carefully, and build an intermediate structure if it helps clarify the mapping. This post covers practical patterns for generating JSON and CSV from XML while keeping the stylesheet maintainable.&lt;/p&gt;

&lt;p&gt;For JSON, the simplest method is to output text and build the JSON structure manually. This gives you precise control, but it also requires careful escaping and formatting. If you are on XSLT 3.0, use maps and arrays and let the processor serialize to JSON. This reduces string manipulation and makes your transform more robust. If you are on XSLT 1.0 or 2.0, you can still build JSON text safely by using templates that escape quotes, backslashes, and control characters.&lt;/p&gt;

&lt;p&gt;A clear pattern is to create a template that takes a string and outputs an escaped JSON string. Then, for each object, output the property names and values with explicit commas. Keep a template to handle comma placement so you do not end up with trailing commas in arrays. This is a good place to use position checks like &lt;code&gt;position() != last()&lt;/code&gt; to decide when to emit a comma. While it can look verbose, the logic is deterministic and easy to debug.&lt;/p&gt;
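
&lt;p&gt;The comma rule can be sketched as follows. This is a minimal XSLT 1.0 example, not a complete JSON serializer: it assumes a hypothetical &lt;code&gt;items&lt;/code&gt; root with &lt;code&gt;item&lt;/code&gt; children that carry a numeric &lt;code&gt;id&lt;/code&gt; attribute, so no string escaping is needed.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight xml"&gt;&lt;code&gt;&amp;lt;xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"&amp;gt;
  &amp;lt;xsl:output method="text"/&amp;gt;

  &amp;lt;!-- Hypothetical input: an items root with item children that have a numeric id attribute --&amp;gt;
  &amp;lt;xsl:template match="/items"&amp;gt;
    &amp;lt;xsl:text&amp;gt;[&amp;lt;/xsl:text&amp;gt;
    &amp;lt;xsl:for-each select="item"&amp;gt;
      &amp;lt;xsl:text&amp;gt;{"id":&amp;lt;/xsl:text&amp;gt;
      &amp;lt;xsl:value-of select="@id"/&amp;gt;
      &amp;lt;xsl:text&amp;gt;}&amp;lt;/xsl:text&amp;gt;
      &amp;lt;!-- Emit a separator after every object except the last one --&amp;gt;
      &amp;lt;xsl:if test="position() != last()"&amp;gt;
        &amp;lt;xsl:text&amp;gt;,&amp;lt;/xsl:text&amp;gt;
      &amp;lt;/xsl:if&amp;gt;
    &amp;lt;/xsl:for-each&amp;gt;
    &amp;lt;xsl:text&amp;gt;]&amp;lt;/xsl:text&amp;gt;
  &amp;lt;/xsl:template&amp;gt;
&amp;lt;/xsl:stylesheet&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;For two items with ids 1 and 2, this writes &lt;code&gt;[{"id":1},{"id":2}]&lt;/code&gt;. String values would additionally need to go through the escaping template described above.&lt;/p&gt;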

&lt;p&gt;CSV output is simpler but comes with its own hazards. You need to wrap fields that contain commas, quotes, or line breaks. The common rule is to wrap the field in quotes and double any interior quotes. Again, a dedicated template to escape fields pays off. Define the column order explicitly and avoid depending on the source document order. This keeps the CSV consistent even if the XML input changes slightly. If you need multiple CSV sections, consider running two passes: one to compute the rows and one to serialize them.&lt;/p&gt;

&lt;p&gt;An example CSV field template can look like this in XSLT 1.0:&lt;/p&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;
xml



&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
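
&lt;p&gt;A minimal sketch of such a field template in XSLT 1.0 is shown here. The names &lt;code&gt;csv-field&lt;/code&gt;, &lt;code&gt;double-quotes&lt;/code&gt;, and &lt;code&gt;quote&lt;/code&gt; are illustrative assumptions, and the declarations are meant to sit at the top level of a stylesheet that uses &lt;code&gt;method="text"&lt;/code&gt;.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight xml"&gt;&lt;code&gt;&amp;lt;!-- Holds a single double-quote character so the XPath below needs no entities --&amp;gt;
&amp;lt;xsl:variable name="quote"&amp;gt;"&amp;lt;/xsl:variable&amp;gt;

&amp;lt;!-- Wraps the value in quotes only when it contains a comma, a quote, or a line break --&amp;gt;
&amp;lt;xsl:template name="csv-field"&amp;gt;
  &amp;lt;xsl:param name="value"/&amp;gt;
  &amp;lt;xsl:choose&amp;gt;
    &amp;lt;xsl:when test="contains($value, ',') or contains($value, $quote) or contains($value, '&amp;amp;#10;')"&amp;gt;
      &amp;lt;xsl:value-of select="$quote"/&amp;gt;
      &amp;lt;xsl:call-template name="double-quotes"&amp;gt;
        &amp;lt;xsl:with-param name="text" select="$value"/&amp;gt;
      &amp;lt;/xsl:call-template&amp;gt;
      &amp;lt;xsl:value-of select="$quote"/&amp;gt;
    &amp;lt;/xsl:when&amp;gt;
    &amp;lt;xsl:otherwise&amp;gt;
      &amp;lt;xsl:value-of select="$value"/&amp;gt;
    &amp;lt;/xsl:otherwise&amp;gt;
  &amp;lt;/xsl:choose&amp;gt;
&amp;lt;/xsl:template&amp;gt;

&amp;lt;!-- Recursively doubles every interior quote, since XSLT 1.0 has no replace() --&amp;gt;
&amp;lt;xsl:template name="double-quotes"&amp;gt;
  &amp;lt;xsl:param name="text"/&amp;gt;
  &amp;lt;xsl:choose&amp;gt;
    &amp;lt;xsl:when test="contains($text, $quote)"&amp;gt;
      &amp;lt;xsl:value-of select="substring-before($text, $quote)"/&amp;gt;
      &amp;lt;xsl:value-of select="concat($quote, $quote)"/&amp;gt;
      &amp;lt;xsl:call-template name="double-quotes"&amp;gt;
        &amp;lt;xsl:with-param name="text" select="substring-after($text, $quote)"/&amp;gt;
      &amp;lt;/xsl:call-template&amp;gt;
    &amp;lt;/xsl:when&amp;gt;
    &amp;lt;xsl:otherwise&amp;gt;
      &amp;lt;xsl:value-of select="$text"/&amp;gt;
    &amp;lt;/xsl:otherwise&amp;gt;
  &amp;lt;/xsl:choose&amp;gt;
&amp;lt;/xsl:template&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Each column is then written by calling &lt;code&gt;csv-field&lt;/code&gt; with the source value as the &lt;code&gt;value&lt;/code&gt; parameter, followed by an explicit comma or line break.&lt;/p&gt;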

</description>
      <category>backend</category>
      <category>data</category>
      <category>programming</category>
      <category>tutorial</category>
    </item>
    <item>
      <title>Designing XSLT transforms with parameters and multiple inputs</title>
      <dc:creator>Alexandre Vazquez</dc:creator>
      <pubDate>Mon, 25 May 2026 09:00:00 +0000</pubDate>
      <link>https://dev.to/alexandrev/designing-xslt-transforms-with-parameters-and-multiple-inputs-4cfg</link>
      <guid>https://dev.to/alexandrev/designing-xslt-transforms-with-parameters-and-multiple-inputs-4cfg</guid>
      <description>&lt;p&gt;Many real-world transformations do not run on a single XML document. You often merge a primary payload with reference data, catalog lookups, or environment configuration. Done well, this results in a clean, predictable transform. Done poorly, it becomes a maze of &lt;code&gt;document()&lt;/code&gt; calls and hidden dependencies. The difference is in how you model inputs and parameters from the start. As an integration engineer, I treat input selection and parameter design as first-class API design for the stylesheet.&lt;/p&gt;

&lt;p&gt;Start by naming every input. Instead of embedding &lt;code&gt;document('config.xml')&lt;/code&gt; in multiple templates, load each external document once near the top of the stylesheet and bind it to a global variable. This makes dependencies explicit and keeps the rest of the code focused on mapping. It also helps with testing, because you can override the URI with a parameter. A clean pattern is to define &lt;code&gt;xsl:param&lt;/code&gt; values for input URIs and then bind them to &lt;code&gt;xsl:variable&lt;/code&gt; values that hold the parsed documents.&lt;/p&gt;

&lt;p&gt;The same clarity applies to parameters. Keep parameters primitive and predictable, and avoid passing in node sets unless you truly need them. A parameter should be an external knob: region, language, a feature flag, or an output format. If you have a complex decision tree, consider using a lookup XML or JSON input and then query it inside the stylesheet. This approach keeps the invocation interface stable while still letting you evolve business rules.&lt;/p&gt;

&lt;p&gt;A simple skeleton might look like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight xml"&gt;&lt;code&gt;



&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
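
&lt;p&gt;One possible shape for such a skeleton, as a sketch only: the parameter names &lt;code&gt;catalog-uri&lt;/code&gt; and &lt;code&gt;region&lt;/code&gt;, the file name &lt;code&gt;catalog.xml&lt;/code&gt;, and the &lt;code&gt;order&lt;/code&gt;, &lt;code&gt;item&lt;/code&gt;, and &lt;code&gt;product&lt;/code&gt; elements are all hypothetical.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight xml"&gt;&lt;code&gt;&amp;lt;xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"&amp;gt;
  &amp;lt;!-- External knobs: the caller (or a test) can override both values --&amp;gt;
  &amp;lt;xsl:param name="catalog-uri" select="'catalog.xml'"/&amp;gt;
  &amp;lt;xsl:param name="region" select="'EU'"/&amp;gt;

  &amp;lt;!-- Each external document is loaded once and bound to a global variable --&amp;gt;
  &amp;lt;xsl:variable name="catalog" select="document($catalog-uri)"/&amp;gt;

  &amp;lt;xsl:template match="/order"&amp;gt;
    &amp;lt;order region="{$region}"&amp;gt;
      &amp;lt;xsl:apply-templates select="item"/&amp;gt;
    &amp;lt;/order&amp;gt;
  &amp;lt;/xsl:template&amp;gt;

  &amp;lt;xsl:template match="item"&amp;gt;
    &amp;lt;xsl:variable name="sku" select="@sku"/&amp;gt;
    &amp;lt;item sku="{$sku}"&amp;gt;
      &amp;lt;!-- Lookup against the reference document, not the primary input --&amp;gt;
      &amp;lt;xsl:value-of select="$catalog/catalog/product[@sku = $sku]/name"/&amp;gt;
    &amp;lt;/item&amp;gt;
  &amp;lt;/xsl:template&amp;gt;
&amp;lt;/xsl:stylesheet&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;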



&lt;p&gt;From there, templates can reference &lt;code&gt;$catalog&lt;/code&gt; without worrying about IO or base URIs. You can also define a named template that accepts a parameter for reuse across multiple modes. This is useful when the same output block is needed for several sections of the document but the selection context differs.&lt;/p&gt;

&lt;p&gt;When combining multiple inputs, always anchor your lookups to a clear key. If you can, define &lt;code&gt;xsl:key&lt;/code&gt; on the external document so lookups are efficient and readable. In XSLT 2.0 or 3.0, &lt;code&gt;xsl:for-each-group&lt;/code&gt; and the &lt;code&gt;map&lt;/code&gt; types can reduce boilerplate, but the core idea remains: make your joins explicit and deterministic. If you rely on default order or on undocumented assumptions about uniqueness, you will eventually get a hard-to-reproduce bug.&lt;/p&gt;

&lt;p&gt;Another important integration pattern is separating parsing from formatting. For example, you might normalize all values from the various inputs into a canonical intermediate structure and then render that structure into the final output. This makes testing easier and supports future outputs such as CSV, JSON, or a secondary XML format. Even in XSLT 1.0, you can emulate this by creating result tree fragments and processing them in a second pass, although in 1.0 that second pass requires a processor extension such as &lt;code&gt;exsl:node-set()&lt;/code&gt; to turn the fragment back into a node set.&lt;/p&gt;

&lt;p&gt;Multiple inputs also raise questions about fallbacks. Decide how you want to behave when optional data is missing. I prefer to centralize defaults in a few named templates or functions and avoid sprinkling &lt;code&gt;xsl:choose&lt;/code&gt; blocks everywhere. This keeps the stylesheet readable and makes it obvious how to override the defaults later. Document your fallbacks in the code with short, clear names so a future maintainer does not have to rediscover the rules by reading the entire stylesheet.&lt;/p&gt;

&lt;p&gt;Finally, create a small set of inputs that represent common scenarios and run them regularly. For example, have a baseline case, a case with missing reference data, and a case with unexpected elements. These are the cases that reveal poor assumptions about inputs. A fast way to iterate on these scenarios is to run the transform with a tool that lets you swap inputs and parameters quickly.&lt;/p&gt;

&lt;p&gt;If you want to try these patterns with real inputs and multiple documents, the online editor at &lt;a href="https://xsltplayground.com" rel="noopener noreferrer"&gt;https://xsltplayground.com&lt;/a&gt; is built for that workflow. It lets you load multiple XML documents and parameters, see how they interact, and keep your integration logic transparent as it grows.&lt;/p&gt;

</description>
      <category>architecture</category>
      <category>design</category>
      <category>programming</category>
      <category>softwareengineering</category>
    </item>
    <item>
      <title>XSLT performance tuning without losing readability</title>
      <dc:creator>Alexandre Vazquez</dc:creator>
      <pubDate>Thu, 21 May 2026 09:00:01 +0000</pubDate>
      <link>https://dev.to/alexandrev/xslt-performance-tuning-without-losing-readability-3acd</link>
      <guid>https://dev.to/alexandrev/xslt-performance-tuning-without-losing-readability-3acd</guid>
      <description>&lt;p&gt;Performance problems in XSLT are sneaky. The stylesheet looks clean, the output is correct, but the transform slows down as the input grows. Most of the time this is caused by expensive selections that are repeated in loops, or by deep &lt;code&gt;//&lt;/code&gt; searches that scan the entire tree more often than you expect. The good news is that you can usually fix these issues without turning the stylesheet into unreadable micro-optimizations.&lt;/p&gt;

&lt;p&gt;The first step is to examine where you are traversing the document. XSLT processors are optimized for template matching, so prefer &lt;code&gt;xsl:apply-templates&lt;/code&gt; and specific match patterns over &lt;code&gt;xsl:for-each&lt;/code&gt; with &lt;code&gt;//&lt;/code&gt; in the select. When you do need a search, limit it to the smallest possible subtree. A single &lt;code&gt;//&lt;/code&gt; at the top-level becomes a full-tree scan each time it runs. If it runs inside another loop, the cost can explode.&lt;/p&gt;

&lt;p&gt;Keys are the most important performance feature, and they also improve readability. When you define an &lt;code&gt;xsl:key&lt;/code&gt;, you turn a repeated search into a fast lookup. This is especially critical for join-like operations where you match a reference value to another document or a secondary section of the same document. Build the key once, and then use &lt;code&gt;key('id', $value)&lt;/code&gt; everywhere. The intent becomes clear: you are doing a lookup, not a scan. If you only use keys occasionally, it can feel like overkill, but it is often the biggest win.&lt;/p&gt;
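
&lt;p&gt;As a small illustration, assuming a hypothetical document in which &lt;code&gt;line&lt;/code&gt; elements refer to &lt;code&gt;product&lt;/code&gt; elements through a shared &lt;code&gt;sku&lt;/code&gt; attribute, the key turns a repeated scan into a lookup:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight xml"&gt;&lt;code&gt;&amp;lt;!-- Top-level declaration: index every product by its sku attribute --&amp;gt;
&amp;lt;xsl:key name="product-by-sku" match="product" use="@sku"/&amp;gt;

&amp;lt;xsl:template match="line"&amp;gt;
  &amp;lt;!-- Scan equivalent, for comparison: //product[@sku = current()/@sku]/name --&amp;gt;
  &amp;lt;xsl:value-of select="key('product-by-sku', @sku)/name"/&amp;gt;
&amp;lt;/xsl:template&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;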

&lt;p&gt;Modes are another useful tool. If you use the same templates in multiple contexts, you may end up doing extra work or firing templates that you do not need. A dedicated mode lets you create a focused processing pipeline that touches only the nodes relevant to that output section. This can reduce both runtime and mental overhead. It also makes it easier to reason about precedence: within a mode, you can define more specific templates without worrying about side effects on unrelated parts of the transform.&lt;/p&gt;

&lt;p&gt;Consider caching computed values in variables. XSLT variables are immutable, so they are safe to reuse without unintended side effects. If you are computing a complex string or a filtered node set repeatedly, store it once per relevant scope. Just be careful not to define a variable at the top of the stylesheet if it depends on the context; keep it as close as possible to where it is used to avoid confusion.&lt;/p&gt;

&lt;p&gt;If you are working in XSLT 2.0 or 3.0, you gain access to &lt;code&gt;xsl:for-each-group&lt;/code&gt;, and 3.0 adds higher-order functions. These can be faster and clearer than manual grouping with keys. For XSLT 1.0, the Muenchian grouping pattern is still effective, and when combined with keys it remains a strong choice. Either way, focus on minimizing passes over large node sets.&lt;/p&gt;

&lt;p&gt;Also consider the output method. Serializing large outputs can be a significant part of the runtime. If you do not need pretty-printed XML, avoid indentation to reduce the amount of whitespace and processing time. Similarly, if you are generating text or JSON, use &lt;code&gt;method="text"&lt;/code&gt; or structured XSLT 3.0 serialization options rather than building a text string node by node.&lt;/p&gt;

&lt;p&gt;I recommend using realistic test data when tuning performance. A transform that runs in 100 milliseconds on a tiny input may take seconds on real data. Use a handful of real documents and measure changes as you apply each improvement. This keeps the optimization process grounded and prevents you from making the code worse without a measurable gain.&lt;/p&gt;

&lt;p&gt;Finally, keep a balance between speed and clarity. The fastest stylesheet is useless if it is too hard to maintain. Use a few consistent patterns: keys for lookups, modes for pipelines, variables for repeated values, and limited selection scope. With those in place, the performance usually becomes acceptable without heroics.&lt;/p&gt;

&lt;p&gt;If you want a quick way to benchmark different approaches with the same input set, try the online editor at &lt;a href="https://xsltplayground.com" rel="noopener noreferrer"&gt;https://xsltplayground.com&lt;/a&gt;. It is a convenient place to experiment with keys, modes, and alternative match patterns while keeping your transform readable.&lt;/p&gt;

</description>
      <category>coding</category>
      <category>performance</category>
      <category>programming</category>
      <category>tutorial</category>
    </item>
    <item>
      <title>XSLT debugging patterns that save hours</title>
      <dc:creator>Alexandre Vazquez</dc:creator>
      <pubDate>Mon, 18 May 2026 09:00:00 +0000</pubDate>
      <link>https://dev.to/alexandrev/xslt-debugging-patterns-that-save-hours-204j</link>
      <guid>https://dev.to/alexandrev/xslt-debugging-patterns-that-save-hours-204j</guid>
      <description>&lt;p&gt;XSLT bugs are rarely loud. More often, a template silently matches the wrong node, a predicate filters out a value you needed, or a namespace mismatch turns an element into a ghost. The fastest fix comes from a repeatable debugging workflow that keeps your assumptions visible. Over time you learn the same patterns appear in almost every real project, whether you are cleansing XML feeds, integrating partner payloads, or generating documents. This post walks through the techniques I use as an integration engineer to debug transforms quickly without losing context.&lt;/p&gt;

&lt;p&gt;Start by making the matching rules obvious. The majority of issues are caused by using &lt;code&gt;//&lt;/code&gt; too freely or relying on default namespaces. Replace broad paths with anchored ones, and when in doubt, print out what the processor thinks the current node is. A simple &lt;code&gt;xsl:message&lt;/code&gt; combined with &lt;code&gt;name()&lt;/code&gt; and &lt;code&gt;namespace-uri()&lt;/code&gt; can reveal a namespace mismatch in seconds. I also add short, temporary templates that match the suspected nodes and output minimal text, which is a fast way to confirm whether the selection is correct.&lt;/p&gt;

&lt;p&gt;Next, isolate the failing region by reducing input size. You rarely need the entire input document to debug a single mapping. Extract the smallest fragment that reproduces the issue and run the transform against that. This lets you simplify predicates and remove unrelated templates. When a transform uses &lt;code&gt;xsl:key&lt;/code&gt;, add a temporary output that lists the key index for a given value so you can see if the key is being built correctly. The same idea works for variables: output them just once in a deterministic area of the result so you can verify their shape.&lt;/p&gt;

&lt;p&gt;A stable debug transform also benefits from deterministic ordering. When you iterate through nodes, add an explicit &lt;code&gt;xsl:sort&lt;/code&gt; so the output is predictable. That makes diffs meaningful when you tweak a predicate or update a template priority. If you are mixing modes, ensure the call chain is explicit; a missing &lt;code&gt;mode&lt;/code&gt; is a classic way to call a generic template by accident. A related trap is having a high-priority identity template that overrides a specialized one, so watch for priority values and make sure the most specific template wins.&lt;/p&gt;

&lt;p&gt;When handling multiple inputs, be clear about document boundaries. Use &lt;code&gt;document()&lt;/code&gt; or &lt;code&gt;collection()&lt;/code&gt; with explicit base URIs and add messages that show which document node you are iterating. If you are using XSLT 2.0 or 3.0, a quick &lt;code&gt;serialize()&lt;/code&gt; to a short string can show you whether the tree is what you expect. If you stick to XSLT 1.0, the same idea works by writing &lt;code&gt;xsl:copy-of&lt;/code&gt; into a separate debug result tree and inspecting it.&lt;/p&gt;

&lt;p&gt;Here is a short pattern I often add while troubleshooting:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight xml"&gt;&lt;code&gt;

    node=
    ns=



&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
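
&lt;p&gt;A minimal sketch of a tracing template of this kind, assuming a hypothetical &lt;code&gt;trace&lt;/code&gt; mode so that it does not interfere with the real templates:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight xml"&gt;&lt;code&gt;&amp;lt;!-- Reports the name and namespace of every element it visits --&amp;gt;
&amp;lt;xsl:template match="*" mode="trace"&amp;gt;
  &amp;lt;xsl:message&amp;gt;
    &amp;lt;xsl:text&amp;gt;node=&amp;lt;/xsl:text&amp;gt;&amp;lt;xsl:value-of select="name()"/&amp;gt;
    &amp;lt;xsl:text&amp;gt; ns=&amp;lt;/xsl:text&amp;gt;&amp;lt;xsl:value-of select="namespace-uri()"/&amp;gt;
  &amp;lt;/xsl:message&amp;gt;
  &amp;lt;!-- Recurse into child elements, staying in the trace mode --&amp;gt;
  &amp;lt;xsl:apply-templates select="*" mode="trace"/&amp;gt;
&amp;lt;/xsl:template&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;It can be triggered from a suspect template with &lt;code&gt;&amp;lt;xsl:apply-templates select="." mode="trace"/&amp;gt;&lt;/code&gt;. Where the messages end up depends on the processor; many write them to standard error.&lt;/p&gt;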



&lt;p&gt;You can drop this at the top of the stylesheet, run a quick transform, and then remove it once the root cause is found. The idea is not to keep noise in production, but to have a fast way to make the invisible visible. For more focused tracing, add the template only for the nodes you suspect are wrong. Debugging gets faster the more you scope down the noise.&lt;/p&gt;

&lt;p&gt;Finally, keep a checklist of the classic XSLT footguns: missing namespaces, the wrong context node, a missing or misplaced &lt;code&gt;@&lt;/code&gt; in attribute selection, and positional predicates, which are 1-based even when you expect them to be 0-based. I also look for template import precedence issues and for unexpected whitespace handling when the output is textual. These are easy to miss because the transform still runs; it just runs incorrectly.&lt;/p&gt;

&lt;p&gt;If you want a fast place to test these patterns with real inputs, use the online editor at &lt;a href="https://xsltplayground.com" rel="noopener noreferrer"&gt;https://xsltplayground.com&lt;/a&gt;. It is built for rapid iteration with multiple inputs and parameters, which makes debugging much less painful and keeps your feedback loop tight.&lt;/p&gt;

</description>
      <category>backend</category>
      <category>productivity</category>
      <category>programming</category>
      <category>tutorial</category>
    </item>
    <item>
      <title>XSLT string functions: complete reference with examples</title>
      <dc:creator>Alexandre Vazquez</dc:creator>
      <pubDate>Thu, 14 May 2026 09:00:00 +0000</pubDate>
      <link>https://dev.to/alexandrev/xslt-string-functions-complete-reference-with-examples-lj8</link>
      <guid>https://dev.to/alexandrev/xslt-string-functions-complete-reference-with-examples-lj8</guid>
      <description>&lt;p&gt;String manipulation is one of the most common tasks in XSLT. Whether you are formatting output, parsing codes, or normalising values from external systems, XPath provides a rich set of string functions. This reference covers the most useful ones with examples you can run in &lt;a href="https://xsltplayground.com" rel="noopener noreferrer"&gt;XSLT Playground&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Basic string functions (XSLT 1.0+)
&lt;/h2&gt;

&lt;h3&gt;
  
  
  string-length
&lt;/h3&gt;

&lt;p&gt;Returns the number of characters in a string.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight xml"&gt;&lt;code&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  substring
&lt;/h3&gt;

&lt;p&gt;Extracts a portion of a string. Arguments: string, start position (1-based), optional length.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight xml"&gt;&lt;code&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  substring-before and substring-after
&lt;/h3&gt;

&lt;p&gt;Split a string on a delimiter.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight xml"&gt;&lt;code&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  contains, starts-with, ends-with
&lt;/h3&gt;

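&lt;p&gt;Test membership without extracting. Note that &lt;code&gt;ends-with&lt;/code&gt; requires XPath 2.0 or later; only &lt;code&gt;contains&lt;/code&gt; and &lt;code&gt;starts-with&lt;/code&gt; are available in XSLT 1.0.&lt;br&gt;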
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight xml"&gt;&lt;code&gt;...
...

...
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  concat
&lt;/h3&gt;

&lt;p&gt;Joins strings together. Takes two or more arguments.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight xml"&gt;&lt;code&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  normalize-space
&lt;/h3&gt;

&lt;p&gt;Strips leading and trailing whitespace, and collapses internal whitespace to single spaces. Essential for cleaning values from XML sources.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight xml"&gt;&lt;code&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  translate
&lt;/h3&gt;

&lt;p&gt;Replaces characters one-for-one. Useful for simple case conversion or character removal.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight xml"&gt;&lt;code&gt;




&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
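
&lt;p&gt;The XSLT 1.0 functions above can be summarized in one short sketch. The literal strings are illustrative values chosen here, and each comment shows the result of the expression next to it.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight xml"&gt;&lt;code&gt;&amp;lt;xsl:value-of select="string-length('Hello')"/&amp;gt;              &amp;lt;!-- 5 --&amp;gt;
&amp;lt;xsl:value-of select="substring('Hello World', 7, 5)"/&amp;gt;      &amp;lt;!-- World --&amp;gt;
&amp;lt;xsl:value-of select="substring-before('key=value', '=')"/&amp;gt;  &amp;lt;!-- key --&amp;gt;
&amp;lt;xsl:value-of select="substring-after('key=value', '=')"/&amp;gt;   &amp;lt;!-- value --&amp;gt;
&amp;lt;xsl:value-of select="concat('2026', '-', '05')"/&amp;gt;           &amp;lt;!-- 2026-05 --&amp;gt;
&amp;lt;xsl:value-of select="normalize-space('  a   b  ')"/&amp;gt;        &amp;lt;!-- a b --&amp;gt;
&amp;lt;xsl:value-of select="translate('abc', 'abc', 'ABC')"/&amp;gt;      &amp;lt;!-- ABC --&amp;gt;
&amp;lt;!-- Characters with no counterpart in the third argument are removed --&amp;gt;
&amp;lt;xsl:value-of select="translate('1-800-555', '-', '')"/&amp;gt;     &amp;lt;!-- 1800555 --&amp;gt;
&amp;lt;!-- Boolean tests are typically used in xsl:if or xsl:when --&amp;gt;
&amp;lt;xsl:if test="starts-with('INV-001', 'INV-')"&amp;gt;invoice&amp;lt;/xsl:if&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;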



&lt;h2&gt;
  
  
  Advanced string functions (XSLT 2.0+)
&lt;/h2&gt;

&lt;h3&gt;
  
  
  upper-case and lower-case
&lt;/h3&gt;

&lt;p&gt;With these functions you no longer need &lt;code&gt;translate&lt;/code&gt; for case conversion.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight xml"&gt;&lt;code&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  replace
&lt;/h3&gt;

&lt;p&gt;Regex-based substitution. Much more powerful than &lt;code&gt;translate&lt;/code&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight xml"&gt;&lt;code&gt;







&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  matches
&lt;/h3&gt;

&lt;p&gt;Tests a string against a regex.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight xml"&gt;&lt;code&gt;


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  tokenize
&lt;/h3&gt;

&lt;p&gt;Splits a string into a sequence using a regex delimiter. Returns a sequence of strings.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight xml"&gt;&lt;code&gt;







&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  string-join
&lt;/h3&gt;

&lt;p&gt;The inverse of &lt;code&gt;tokenize&lt;/code&gt;. Joins a sequence with a separator.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight xml"&gt;&lt;code&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  format-number
&lt;/h3&gt;

&lt;p&gt;Formats a number as a string with a picture pattern. This function is also available in XSLT 1.0.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight xml"&gt;&lt;code&gt;



&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  format-date and format-dateTime
&lt;/h3&gt;

&lt;p&gt;Format xs:date and xs:dateTime values using picture strings.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight xml"&gt;&lt;code&gt;




&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Practical patterns
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Extract domain from URL:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight xml"&gt;&lt;code&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Pad a number with leading zeros:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight xml"&gt;&lt;code&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Check if a node text is numeric:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight xml"&gt;&lt;code&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
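
&lt;p&gt;One way to write the three patterns so that they also run under XSLT 1.0 is sketched here. The variable &lt;code&gt;$url&lt;/code&gt; and its value are assumptions made for the example.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight xml"&gt;&lt;code&gt;&amp;lt;!-- Assumes $url holds 'https://example.com/path' --&amp;gt;
&amp;lt;xsl:value-of select="substring-before(substring-after($url, '://'), '/')"/&amp;gt;  &amp;lt;!-- example.com --&amp;gt;

&amp;lt;!-- The picture '000' pads to at least three digits --&amp;gt;
&amp;lt;xsl:value-of select="format-number(7, '000')"/&amp;gt;  &amp;lt;!-- 007 --&amp;gt;

&amp;lt;!-- number() yields NaN for non-numeric text, and NaN is never equal to itself --&amp;gt;
&amp;lt;xsl:if test="number(.) = number(.)"&amp;gt;numeric&amp;lt;/xsl:if&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;The domain expression returns an empty string when the URL has no slash after the host, so a URL such as &lt;code&gt;https://example.com&lt;/code&gt; needs a separate branch.&lt;/p&gt;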



&lt;p&gt;All of these work in &lt;a href="https://xsltplayground.com" rel="noopener noreferrer"&gt;XSLT Playground&lt;/a&gt;. Set the version to 2.0 or 3.0 for the functions that require it.&lt;/p&gt;

</description>
      <category>coding</category>
      <category>programming</category>
      <category>tutorial</category>
      <category>webdev</category>
    </item>
    <item>
      <title>XSLT grouping with xsl:for-each-group: complete guide</title>
      <dc:creator>Alexandre Vazquez</dc:creator>
      <pubDate>Mon, 11 May 2026 09:00:00 +0000</pubDate>
      <link>https://dev.to/alexandrev/xslt-grouping-with-xslfor-each-group-complete-guide-7lb</link>
      <guid>https://dev.to/alexandrev/xslt-grouping-with-xslfor-each-group-complete-guide-7lb</guid>
      <description>&lt;p&gt;Grouping is one of the most powerful features introduced in XSLT 2.0. Before it, grouping in XSLT 1.0 required the Muenchian method — a clever but verbose technique involving keys and node-set comparisons. In 2.0, &lt;code&gt;xsl:for-each-group&lt;/code&gt; makes grouping straightforward.&lt;/p&gt;

&lt;h2&gt;
  
  
  Basic grouping with group-by
&lt;/h2&gt;

&lt;p&gt;&lt;code&gt;group-by&lt;/code&gt; groups nodes that share the same value for a given expression. The result is one iteration per distinct group value.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Input:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight xml"&gt;&lt;code&gt;
  DE120
  US85
  DE200
  FR60
  US140

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Stylesheet:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight xml"&gt;&lt;code&gt;










&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Output:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight xml"&gt;&lt;code&gt;
  2320
  160
  2225

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
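
&lt;p&gt;A sketch of a stylesheet that produces a per-country count and total of this kind follows. It assumes the input wraps each order in a hypothetical &lt;code&gt;order&lt;/code&gt; element with &lt;code&gt;country&lt;/code&gt; and &lt;code&gt;amount&lt;/code&gt; children under an &lt;code&gt;orders&lt;/code&gt; root; the output element names are likewise illustrative.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight xml"&gt;&lt;code&gt;&amp;lt;xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"&amp;gt;
  &amp;lt;xsl:output method="xml" indent="yes"/&amp;gt;

  &amp;lt;xsl:template match="/orders"&amp;gt;
    &amp;lt;summary&amp;gt;
      &amp;lt;!-- One iteration per distinct country value --&amp;gt;
      &amp;lt;xsl:for-each-group select="order" group-by="country"&amp;gt;
        &amp;lt;!-- Sort the groups by key instead of first appearance --&amp;gt;
        &amp;lt;xsl:sort select="current-grouping-key()"/&amp;gt;
        &amp;lt;country code="{current-grouping-key()}"&amp;gt;
          &amp;lt;count&amp;gt;&amp;lt;xsl:value-of select="count(current-group())"/&amp;gt;&amp;lt;/count&amp;gt;
          &amp;lt;total&amp;gt;&amp;lt;xsl:value-of select="sum(current-group()/amount)"/&amp;gt;&amp;lt;/total&amp;gt;
        &amp;lt;/country&amp;gt;
      &amp;lt;/xsl:for-each-group&amp;gt;
    &amp;lt;/summary&amp;gt;
  &amp;lt;/xsl:template&amp;gt;
&amp;lt;/xsl:stylesheet&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;For the five orders listed in the input (DE 120, US 85, DE 200, FR 60, US 140), this gives DE a count of 2 and a total of 320, FR a count of 1 and a total of 60, and US a count of 2 and a total of 225.&lt;/p&gt;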



&lt;p&gt;Key functions inside &lt;code&gt;for-each-group&lt;/code&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;current-grouping-key()&lt;/code&gt; — returns the value that defines the current group&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;current-group()&lt;/code&gt; — returns the sequence of all nodes in the current group&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Nested grouping
&lt;/h2&gt;

&lt;p&gt;Groups can be nested. Group orders by country, then within each country by status:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight xml"&gt;&lt;code&gt;








&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  group-adjacent
&lt;/h2&gt;

&lt;p&gt;Groups consecutive nodes that share the same key value. Unlike &lt;code&gt;group-by&lt;/code&gt;, it starts a new group when the key changes, even if the same key appeared earlier. This is useful for processing structured text or segmented data.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight xml"&gt;&lt;code&gt;
  Starting
  Processing
  Failed
  Retrying
  Done

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight xml"&gt;&lt;code&gt;




&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This produces three blocks: an INFO block (positions 1-2), an ERROR block (positions 3-4), and a second INFO block (position 5). With &lt;code&gt;group-by&lt;/code&gt;, the two INFO blocks would be merged into one.&lt;/p&gt;
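
&lt;p&gt;A minimal sketch of that behavior, assuming a hypothetical &lt;code&gt;log&lt;/code&gt; root whose &lt;code&gt;entry&lt;/code&gt; children each carry a &lt;code&gt;level&lt;/code&gt; attribute:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight xml"&gt;&lt;code&gt;&amp;lt;xsl:template match="/log"&amp;gt;
  &amp;lt;blocks&amp;gt;
    &amp;lt;!-- A new group starts whenever the level differs from the previous entry --&amp;gt;
    &amp;lt;xsl:for-each-group select="entry" group-adjacent="@level"&amp;gt;
      &amp;lt;block level="{current-grouping-key()}" size="{count(current-group())}"&amp;gt;
        &amp;lt;xsl:copy-of select="current-group()"/&amp;gt;
      &amp;lt;/block&amp;gt;
    &amp;lt;/xsl:for-each-group&amp;gt;
  &amp;lt;/blocks&amp;gt;
&amp;lt;/xsl:template&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Every entry needs a &lt;code&gt;level&lt;/code&gt; attribute here, because &lt;code&gt;group-adjacent&lt;/code&gt; expects exactly one key value per item.&lt;/p&gt;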

&lt;h2&gt;
  
  
  group-starting-with and group-ending-with
&lt;/h2&gt;

&lt;p&gt;These group nodes based on a pattern match rather than a key value. Every time a node matches the pattern, a new group starts (or ends).&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;group-starting-with example&lt;/strong&gt; — treat every &lt;code&gt;h2&lt;/code&gt; element as the start of a section:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight xml"&gt;&lt;code&gt;

    &lt;span class="nt"&gt;&amp;lt;xsl:value-of&lt;/span&gt; &lt;span class="na"&gt;select=&lt;/span&gt;&lt;span class="s"&gt;"self::h2"&lt;/span&gt;&lt;span class="nt"&gt;/&amp;gt;&lt;/span&gt;



&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;group-ending-with example&lt;/strong&gt; — group lines until a blank line:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight xml"&gt;&lt;code&gt;




&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
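&lt;p&gt;Both pattern-based modes can be modelled with two small Python functions. This is a sketch of the semantics only, not XSLT: the function names, the string-encoded nodes, and the predicates are hypothetical. Note that the first item always opens a group and the last item always closes one, whether or not it matches.&lt;/p&gt;

```python
# Python analogue of group-starting-with / group-ending-with (not XSLT).
# The sample items and predicates are hypothetical.

def group_starting_with(items, matches):
    """Start a new group at every item for which matches(item) is true."""
    groups = []
    for item in items:
        # The very first item opens a group even when it does not match.
        if matches(item) or not groups:
            groups.append([])
        groups[-1].append(item)
    return groups

def group_ending_with(items, matches):
    """Close the current group after every item for which matches(item) is true."""
    groups, current = [], []
    for item in items:
        current.append(item)
        if matches(item):
            groups.append(current)
            current = []
    # Trailing items with no closing match still form a final group.
    if current:
        groups.append(current)
    return groups

nodes = ["h2:Intro", "p:a", "p:b", "h2:Usage", "p:c"]
sections = group_starting_with(nodes, lambda n: n.startswith("h2:"))

lines = ["a", "b", "", "c", ""]
paragraphs = group_ending_with(lines, lambda s: s == "")

print(sections)
print(paragraphs)
```

&lt;p&gt;In the second case the blank line stays inside the group it terminates, which matches how &lt;code&gt;group-ending-with&lt;/code&gt; includes the matching node as the last member.&lt;/p&gt;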



&lt;h2&gt;
  
  
  Computing aggregates
&lt;/h2&gt;

&lt;p&gt;&lt;code&gt;current-group()&lt;/code&gt; returns a sequence, so you can apply any XPath aggregate function directly:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight xml"&gt;&lt;code&gt;







&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
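&lt;p&gt;To make the aggregate idea concrete, here is a small Python analogue. The order data is invented for the sketch, and the XPath expressions in the comments assume a hypothetical &lt;code&gt;amount&lt;/code&gt; child element; the XPath 2.0 functions named there (&lt;code&gt;count&lt;/code&gt;, &lt;code&gt;sum&lt;/code&gt;, &lt;code&gt;avg&lt;/code&gt;, &lt;code&gt;min&lt;/code&gt;, &lt;code&gt;max&lt;/code&gt;) are the ones you would apply to &lt;code&gt;current-group()&lt;/code&gt;.&lt;/p&gt;

```python
# Python analogue of aggregating over current-group() (hypothetical data).
orders = [("ES", 120), ("FR", 80), ("ES", 200), ("FR", 40)]

# Build one list of amounts per country, in first-seen order.
groups = {}
for country, amount in orders:
    groups.setdefault(country, []).append(amount)

stats = {
    country: {
        "count": len(amounts),               # count(current-group())
        "sum": sum(amounts),                 # sum(current-group()/amount)
        "avg": sum(amounts) / len(amounts),  # avg(current-group()/amount)
        "min": min(amounts),                 # min(current-group()/amount)
        "max": max(amounts),                 # max(current-group()/amount)
    }
    for country, amounts in groups.items()
}

print(stats)
```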



&lt;h2&gt;
  
  
  Try it in XSLT Playground
&lt;/h2&gt;

&lt;p&gt;Paste any of the examples above into &lt;a href="https://xsltplayground.com" rel="noopener noreferrer"&gt;XSLT Playground&lt;/a&gt; with version set to 2.0 or 3.0. Grouping is one of the features that benefits most from live testing — you can immediately see how changing the grouping key or switching between &lt;code&gt;group-by&lt;/code&gt; and &lt;code&gt;group-adjacent&lt;/code&gt; affects the output structure.&lt;/p&gt;

</description>
      <category>coding</category>
      <category>programming</category>
      <category>tutorial</category>
      <category>webdev</category>
    </item>
    <item>
      <title>Radar: A New Kubernetes IDE Worth Knowing About (vs OpenLens, FreeLens)</title>
      <dc:creator>Alexandre Vazquez</dc:creator>
      <pubDate>Sat, 09 May 2026 22:10:36 +0000</pubDate>
      <link>https://dev.to/alexandrev/radar-a-new-kubernetes-ide-worth-knowing-about-vs-openlens-freelens-4bma</link>
      <guid>https://dev.to/alexandrev/radar-a-new-kubernetes-ide-worth-knowing-about-vs-openlens-freelens-4bma</guid>
      <description>&lt;p&gt;If you’ve been following Kubernetes tooling, you’ve probably already been through the Lens saga: Lens went commercial, &lt;a href="https://alexandre-vazquez.com/openlens-vs-lens/" rel="noopener noreferrer"&gt;OpenLens&lt;/a&gt; emerged as the community fork, then &lt;a href="https://alexandre-vazquez.com/freelens-vs-openlens-vs-lens/" rel="noopener noreferrer"&gt;FreeLens&lt;/a&gt; appeared when OpenLens maintenance slowed. The pattern is familiar — a useful desktop tool, a licensing decision, a fork, another fork.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://radarhq.io/" rel="noopener noreferrer"&gt;Radar&lt;/a&gt; is not a fork. It’s a different approach to the same problem: giving engineers a useful interface for Kubernetes clusters without the friction of kubectl for every task. Built by &lt;a href="https://skyhook.io/" rel="noopener noreferrer"&gt;Skyhook&lt;/a&gt; (YC-backed, Google Cloud Partner), it’s been live since 2025, has 1.7k+ GitHub stars, releases weekly, and the founder reaches out to the community directly. That’s usually a good signal that someone is genuinely building in public.&lt;/p&gt;

&lt;p&gt;This article covers what Radar actually does, where it pulls ahead of OpenLens and FreeLens, and when those tools are still the right choice.&lt;/p&gt;




&lt;h2&gt;
  
  
  The State of Kubernetes Desktop Tooling in 2026
&lt;/h2&gt;

&lt;p&gt;Before getting into Radar specifically, it’s worth naming the landscape clearly:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Lens&lt;/strong&gt; — the original. Electron-based, polished, now commercial (Mirantis). The free Personal tier is non-commercial only. Pro is ~$22-35/user/month.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;OpenLens&lt;/strong&gt; — the community fork of Lens before Mirantis closed exec/logs/shell in v6.3 (January 2023). Maintenance has slowed significantly. No active release cadence.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;FreeLens&lt;/strong&gt; — a more active community fork, filling the gap left by OpenLens’ decline. Restores the missing features. No commercial backing.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;k9s&lt;/strong&gt; — terminal TUI, fast, keyboard-driven, single-cluster. Different audience.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Headlamp&lt;/strong&gt; — CNCF Sandbox project, plugin-extensible, web-based.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Radar&lt;/strong&gt; — Go binary, Apache 2.0, team-oriented, topology and event timeline focused.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The problem with OpenLens and FreeLens is not that they’re bad tools — they’re genuinely useful for the solo developer with one or two clusters. The problem is that they’re single-cluster-at-a-time desktop apps with no concept of team, no persistent state, and no awareness of the modern Kubernetes ecosystem (ArgoCD, Flux, Karpenter, KEDA). As your infrastructure grows, you outgrow them.&lt;/p&gt;




&lt;h2&gt;
  
  
  What Radar Actually Is
&lt;/h2&gt;

&lt;p&gt;Radar is available in two forms:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Radar OSS&lt;/strong&gt; — a single ~30MB Go binary, Apache 2.0, free forever. Can run locally (desktop app) or deployed in-cluster via Helm. No sidecars, no feature gates.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Radar Cloud&lt;/strong&gt; — same binary, adds a hosted control plane with fleet aggregation, 30-day event retention, SSO/SCIM, scoped RBAC, and shared URLs for team incident response. Priced per cluster ($99/cluster/month for Team), not per user.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The per-cluster pricing is a deliberate design decision: teams don’t pay more as they add engineers, only as they add clusters. For a 20-person platform engineering team managing 5 clusters, Radar Cloud runs $495/month (5 × $99). The equivalent Lens Pro seats, at the ~$22-35/user/month quoted above, would cost roughly $440-700/month (20 × $22-35), and that figure grows with every engineer added.&lt;/p&gt;

&lt;p&gt;For most self-hosted environments, the OSS version is sufficient and costs nothing.&lt;/p&gt;




&lt;h2&gt;
  
  
  Key Features
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Topology View
&lt;/h3&gt;

&lt;p&gt;This is the most visually distinctive feature. Radar renders a live service graph for your cluster: deployments, services, ingresses, cross-namespace dependencies, and east-west traffic flows — all in a single view without running &lt;code&gt;kubectl get all -A&lt;/code&gt; and stitching the output together mentally.&lt;/p&gt;

&lt;p&gt;OpenLens and FreeLens have resource list views. They show you what exists. Radar shows you how things connect — which is what you actually need when debugging why Service A can’t reach Service B.&lt;/p&gt;

&lt;h3&gt;
  
  
  Persistent Event Timeline
&lt;/h3&gt;

&lt;p&gt;Kubernetes events are ephemeral by default — they expire after approximately one hour. When something breaks at 2am and you’re looking at it at 9am, the events that explain what happened are gone. Logs may still be there if you’re running a log aggregator, but the Kubernetes-level events (pod restarts, scheduling failures, node pressure events, probe failures) are gone.&lt;/p&gt;

&lt;p&gt;Radar retains events. The OSS version extends this beyond the default 1-hour cluster retention. The Cloud version retains 30 days. You can rewind the timeline to any point and reconstruct what the cluster looked like at that moment.&lt;/p&gt;

&lt;p&gt;Neither OpenLens nor FreeLens has any event retention beyond what the cluster itself provides.&lt;/p&gt;

&lt;h3&gt;
  
  
  GitOps Integration (ArgoCD + Flux)
&lt;/h3&gt;

&lt;p&gt;Radar auto-detects ArgoCD and Flux and surfaces sync state, drift, and health directly in the UI. You can see whether a deployment is in sync, when it last synced, and whether it drifted from the desired state in Git.&lt;/p&gt;

&lt;p&gt;In OpenLens and FreeLens, ArgoCD resources appear as generic Kubernetes custom resources. You can see the CRDs, but there’s no purpose-built understanding of what they mean — no sync status visualization, no diff view, no rollback trigger.&lt;/p&gt;

&lt;h3&gt;
  
  
  Helm Management
&lt;/h3&gt;

&lt;p&gt;Radar tracks Helm releases with full revision history and supports one-click rollbacks from the UI. This is similar to what OpenLens/FreeLens offer via the Helm releases view, but Radar adds revision diffing — you can see what changed between release 5 and release 6 before deciding to roll back.&lt;/p&gt;

&lt;h3&gt;
  
  
  Image Filesystem
&lt;/h3&gt;

&lt;p&gt;You can browse container image filesystems through Radar without needing &lt;code&gt;kubectl exec&lt;/code&gt; into a running pod or access to the container registry. Useful for security audits and debugging — you can verify what’s actually in an image at rest.&lt;/p&gt;

&lt;h3&gt;
  
  
  MCP Server (AI Integration)
&lt;/h3&gt;

&lt;p&gt;Radar ships with an &lt;a href="https://alexandre-vazquez.com/mcp-kubernetes/" rel="noopener noreferrer"&gt;MCP (Model Context Protocol)&lt;/a&gt; server, which means you can connect Claude, Cursor, or GitHub Copilot directly to your cluster context and ask questions about it in natural language. The MCP server is token-optimized — it doesn’t dump raw YAML at the model, it structures cluster state into meaningful context.&lt;/p&gt;

&lt;p&gt;This is something neither OpenLens nor FreeLens has. It’s also genuinely useful if you’re already using AI assistants for development work.&lt;/p&gt;

&lt;h3&gt;
  
  
  Cluster Audit
&lt;/h3&gt;

&lt;p&gt;30 built-in best-practice checks — resource requests/limits, RBAC permissions, image pinning, network policies, security contexts. The checks are labeled by compliance framework. This is not a replacement for dedicated security tooling (Trivy, Falco, Polaris), but it’s a useful first-pass audit without leaving the tool you’re already using.&lt;/p&gt;

&lt;h3&gt;
  
  
  Multi-Cluster Support (Cloud)
&lt;/h3&gt;

&lt;p&gt;The Cloud tier adds fleet-level visibility: a single view across all clusters, cross-cluster search, and drift detection between environments (e.g., staging vs. production). This is the feature that changes the calculus for platform engineering teams managing 5+ clusters.&lt;/p&gt;

&lt;p&gt;OpenLens and FreeLens require you to switch cluster context manually. There is no fleet view.&lt;/p&gt;




&lt;h2&gt;
  
  
  Architecture: Why a Go Binary Matters
&lt;/h2&gt;

&lt;p&gt;OpenLens and FreeLens are Electron apps — Chromium + Node.js wrapped in a desktop shell. This means:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;200-500MB install size&lt;/li&gt;
&lt;li&gt;1-2 second startup time on a fast machine, more on slower ones&lt;/li&gt;
&lt;li&gt;Memory footprint in the hundreds of megabytes&lt;/li&gt;
&lt;li&gt;Local kubeconfig required on each engineer’s machine&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Radar’s in-cluster deployment is a single Go binary (~30MB) that runs as a Pod with a ServiceAccount. It connects to the hosted control plane over outbound WebSocket + TLS. No inbound firewall rules, no kubeconfig distribution, no per-engineer setup.&lt;/p&gt;

&lt;p&gt;The local desktop app is also a lightweight Go binary — 65-second startup was demonstrated on a 322-node cluster. That’s not a typo.&lt;/p&gt;

&lt;p&gt;For in-cluster deployment, the architecture means security is handled at the ServiceAccount level, not by distributing kubeconfigs to engineer laptops. That matters for teams with security requirements around credential management.&lt;/p&gt;




&lt;h2&gt;
  
  
  Feature Comparison
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Feature&lt;/th&gt;
&lt;th&gt;Radar OSS&lt;/th&gt;
&lt;th&gt;Radar Cloud&lt;/th&gt;
&lt;th&gt;OpenLens&lt;/th&gt;
&lt;th&gt;FreeLens&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;License&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Apache 2.0&lt;/td&gt;
&lt;td&gt;Proprietary (hosted)&lt;/td&gt;
&lt;td&gt;MIT/GPL&lt;/td&gt;
&lt;td&gt;MIT&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Maintenance&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Active (weekly releases)&lt;/td&gt;
&lt;td&gt;Active&lt;/td&gt;
&lt;td&gt;Stalled&lt;/td&gt;
&lt;td&gt;Active (community)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Architecture&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Go binary / in-cluster&lt;/td&gt;
&lt;td&gt;In-cluster + hosted&lt;/td&gt;
&lt;td&gt;Electron&lt;/td&gt;
&lt;td&gt;Electron&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Multi-cluster&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Basic&lt;/td&gt;
&lt;td&gt;Fleet view&lt;/td&gt;
&lt;td&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fygqve7u6bdpolgetc8vx.png" alt="❌" width="72" height="72"&gt;&lt;/td&gt;
&lt;td&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fygqve7u6bdpolgetc8vx.png" alt="❌" width="72" height="72"&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Event retention&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Extended&lt;/td&gt;
&lt;td&gt;30 days&lt;/td&gt;
&lt;td&gt;Cluster default (~1h)&lt;/td&gt;
&lt;td&gt;Cluster default (~1h)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Topology view&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4b2y0m4hrgrahy569su0.png" alt="✅" width="72" height="72"&gt;&lt;/td&gt;
&lt;td&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4b2y0m4hrgrahy569su0.png" alt="✅" width="72" height="72"&gt;&lt;/td&gt;
&lt;td&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fygqve7u6bdpolgetc8vx.png" alt="❌" width="72" height="72"&gt;&lt;/td&gt;
&lt;td&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fygqve7u6bdpolgetc8vx.png" alt="❌" width="72" height="72"&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;GitOps (ArgoCD/Flux)&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4b2y0m4hrgrahy569su0.png" alt="✅" width="72" height="72"&gt;&lt;/td&gt;
&lt;td&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4b2y0m4hrgrahy569su0.png" alt="✅" width="72" height="72"&gt;&lt;/td&gt;
&lt;td&gt;CRDs only&lt;/td&gt;
&lt;td&gt;CRDs only&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Helm management&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4b2y0m4hrgrahy569su0.png" alt="✅" width="72" height="72"&gt;&lt;/td&gt;
&lt;td&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4b2y0m4hrgrahy569su0.png" alt="✅" width="72" height="72"&gt;&lt;/td&gt;
&lt;td&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4b2y0m4hrgrahy569su0.png" alt="✅" width="72" height="72"&gt;&lt;/td&gt;
&lt;td&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4b2y0m4hrgrahy569su0.png" alt="✅" width="72" height="72"&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;kubectl exec / logs / shell&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4b2y0m4hrgrahy569su0.png" alt="✅" width="72" height="72"&gt;&lt;/td&gt;
&lt;td&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4b2y0m4hrgrahy569su0.png" alt="✅" width="72" height="72"&gt;&lt;/td&gt;
&lt;td&gt;
&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4b2y0m4hrgrahy569su0.png" alt="✅" width="72" height="72"&gt; (restored)&lt;/td&gt;
&lt;td&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4b2y0m4hrgrahy569su0.png" alt="✅" width="72" height="72"&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;MCP / AI integration&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4b2y0m4hrgrahy569su0.png" alt="✅" width="72" height="72"&gt;&lt;/td&gt;
&lt;td&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4b2y0m4hrgrahy569su0.png" alt="✅" width="72" height="72"&gt;&lt;/td&gt;
&lt;td&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fygqve7u6bdpolgetc8vx.png" alt="❌" width="72" height="72"&gt;&lt;/td&gt;
&lt;td&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fygqve7u6bdpolgetc8vx.png" alt="❌" width="72" height="72"&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Cluster audit&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4b2y0m4hrgrahy569su0.png" alt="✅" width="72" height="72"&gt;&lt;/td&gt;
&lt;td&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4b2y0m4hrgrahy569su0.png" alt="✅" width="72" height="72"&gt;&lt;/td&gt;
&lt;td&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fygqve7u6bdpolgetc8vx.png" alt="❌" width="72" height="72"&gt;&lt;/td&gt;
&lt;td&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fygqve7u6bdpolgetc8vx.png" alt="❌" width="72" height="72"&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;SSO / SCIM&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fygqve7u6bdpolgetc8vx.png" alt="❌" width="72" height="72"&gt;&lt;/td&gt;
&lt;td&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4b2y0m4hrgrahy569su0.png" alt="✅" width="72" height="72"&gt;&lt;/td&gt;
&lt;td&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fygqve7u6bdpolgetc8vx.png" alt="❌" width="72" height="72"&gt;&lt;/td&gt;
&lt;td&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fygqve7u6bdpolgetc8vx.png" alt="❌" width="72" height="72"&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Shared incident URLs&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fygqve7u6bdpolgetc8vx.png" alt="❌" width="72" height="72"&gt;&lt;/td&gt;
&lt;td&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4b2y0m4hrgrahy569su0.png" alt="✅" width="72" height="72"&gt;&lt;/td&gt;
&lt;td&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fygqve7u6bdpolgetc8vx.png" alt="❌" width="72" height="72"&gt;&lt;/td&gt;
&lt;td&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fygqve7u6bdpolgetc8vx.png" alt="❌" width="72" height="72"&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Image filesystem browser&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4b2y0m4hrgrahy569su0.png" alt="✅" width="72" height="72"&gt;&lt;/td&gt;
&lt;td&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4b2y0m4hrgrahy569su0.png" alt="✅" width="72" height="72"&gt;&lt;/td&gt;
&lt;td&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fygqve7u6bdpolgetc8vx.png" alt="❌" width="72" height="72"&gt;&lt;/td&gt;
&lt;td&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fygqve7u6bdpolgetc8vx.png" alt="❌" width="72" height="72"&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Cost tracking&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;
&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4b2y0m4hrgrahy569su0.png" alt="✅" width="72" height="72"&gt; (OpenCost)&lt;/td&gt;
&lt;td&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4b2y0m4hrgrahy569su0.png" alt="✅" width="72" height="72"&gt;&lt;/td&gt;
&lt;td&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fygqve7u6bdpolgetc8vx.png" alt="❌" width="72" height="72"&gt;&lt;/td&gt;
&lt;td&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fygqve7u6bdpolgetc8vx.png" alt="❌" width="72" height="72"&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Price&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Free&lt;/td&gt;
&lt;td&gt;$99/cluster/month&lt;/td&gt;
&lt;td&gt;Free&lt;/td&gt;
&lt;td&gt;Free&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h2&gt;
  
  
  When Radar Makes Sense
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;You’re managing multiple clusters.&lt;/strong&gt; Even with the OSS version, the topology view and event timeline make Radar more useful than OpenLens/FreeLens at 3+ clusters. The Cloud fleet view is the compelling option at 5+.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Your team uses GitOps.&lt;/strong&gt; If ArgoCD or Flux is part of your workflow, Radar’s native understanding of sync state and drift is meaningfully better than seeing CRDs in a generic list view.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;You need post-mortem capability.&lt;/strong&gt; If your incident review process involves looking at what the cluster was doing when the alert fired, you need event retention. Radar has it; OpenLens and FreeLens don’t.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;You’re adopting AI tooling.&lt;/strong&gt; The MCP server is the most forward-looking feature here. If you use Claude Code, Cursor, or Copilot for your infrastructure work, having cluster context available to those tools without copy-pasting YAML is a genuine productivity improvement.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;You have a platform engineering team.&lt;/strong&gt; Per-cluster pricing, SSO, SCIM, and shared incident URLs are features that only matter if you have more than one person managing infrastructure.&lt;/p&gt;

&lt;h2&gt;
  
  
  When OpenLens or FreeLens Still Makes Sense
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;You’re a solo developer with one or two clusters.&lt;/strong&gt; OpenLens and FreeLens are familiar, local, and have zero setup overhead. If you don’t need team features, event retention, or topology views, they remain perfectly functional tools.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;You’re deeply invested in the Lens UX.&lt;/strong&gt; The resource tree, the terminal integration, the way Lens presents namespace-scoped resources — if your muscle memory is built around that interface, switching has a real cost. Radar is different, not just better.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;You need maximum customization.&lt;/strong&gt; OpenLens and FreeLens support plugins. Radar does not currently have a plugin system.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Your environment is air-gapped or has strict egress restrictions.&lt;/strong&gt; Radar OSS can run fully in-cluster, but Radar Cloud requires outbound connectivity to the hosted control plane. OpenLens and FreeLens are fully local.&lt;/p&gt;




&lt;h2&gt;
  
  
  Getting Started
&lt;/h2&gt;

&lt;p&gt;OSS installation takes about two minutes:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Homebrew (macOS/Linux)&lt;/span&gt;
brew &lt;span class="nb"&gt;install &lt;/span&gt;skyhook-io/tap/radar

&lt;span class="c"&gt;# Helm (in-cluster)&lt;/span&gt;
helm repo add skyhook https://charts.skyhook.io
helm &lt;span class="nb"&gt;install &lt;/span&gt;radar skyhook/radar &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--namespace&lt;/span&gt; radar &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--create-namespace&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--set&lt;/span&gt; service.type&lt;span class="o"&gt;=&lt;/span&gt;ClusterIP

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Or download the binary directly from &lt;a href="https://radarhq.io/" rel="noopener noreferrer"&gt;radarhq.io&lt;/a&gt;.&lt;/p&gt;




&lt;h2&gt;
  
  
  Verdict
&lt;/h2&gt;

&lt;p&gt;Radar is the most interesting new entrant in the Kubernetes tooling space in a while — not because it replaces everything else, but because it addresses the specific gap that OpenLens and FreeLens never covered: teams, multiple clusters, and persistent state.&lt;/p&gt;

&lt;p&gt;For a solo developer, OpenLens or FreeLens are still completely reasonable choices. For a platform engineering team managing more than two clusters with ArgoCD or Flux, Radar’s feature set is materially better and the OSS version costs nothing.&lt;/p&gt;

&lt;p&gt;The active release cadence and the YC backing suggest this isn’t a one-person side project — there’s a team actively working on it. Whether the Cloud pricing sticks long-term is a question only usage will answer, but the Apache 2.0 core with an explicit “always open source” commitment is the right foundation.&lt;/p&gt;

&lt;p&gt;Worth evaluating if you haven’t already.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Tested with Radar OSS v0.x on Kubernetes 1.29–1.32. Pricing and feature availability as of May 2026.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>kubernetes</category>
    </item>
    <item>
      <title>XSLT template matching explained with examples</title>
      <dc:creator>Alexandre Vazquez</dc:creator>
      <pubDate>Thu, 07 May 2026 09:00:00 +0000</pubDate>
      <link>https://dev.to/alexandrev/xslt-template-matching-explained-with-examples-1f2n</link>
      <guid>https://dev.to/alexandrev/xslt-template-matching-explained-with-examples-1f2n</guid>
      <description>&lt;p&gt;Template matching is the mechanism that drives every XSLT transformation. Understanding how the processor selects templates — and what happens when multiple templates could match — is the difference between a stylesheet that works reliably and one that produces surprising output. This post covers everything you need to know.&lt;/p&gt;

&lt;h2&gt;
  
  
  How match patterns work
&lt;/h2&gt;

&lt;p&gt;When the processor visits a node, it evaluates every template's &lt;code&gt;match&lt;/code&gt; attribute as an XPath pattern. A pattern is a restricted form of XPath that tests properties of a node rather than selecting nodes from a starting point. If the pattern is satisfied, that template is a candidate.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight xml"&gt;&lt;code&gt;





&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Priority and conflict resolution
&lt;/h2&gt;

&lt;p&gt;More than one template can match the same node. The processor resolves the conflict using priority. Each pattern has a default priority calculated by the spec:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Pattern type&lt;/th&gt;
&lt;th&gt;Default priority&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;
&lt;code&gt;node()&lt;/code&gt; or &lt;code&gt;*&lt;/code&gt;
&lt;/td&gt;
&lt;td&gt;-0.5&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;element-name&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;prefix:element-name&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;
&lt;code&gt;a/b&lt;/code&gt; (path)&lt;/td&gt;
&lt;td&gt;0.5&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;a[predicate]&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;0.5&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;@attr&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;More specific patterns automatically get higher priority. You can override this with the &lt;code&gt;priority&lt;/code&gt; attribute:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight xml"&gt;&lt;code&gt;


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
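
&lt;p&gt;A hedged illustration, using hypothetical &lt;code&gt;book&lt;/code&gt; elements with made-up &lt;code&gt;featured&lt;/code&gt; and &lt;code&gt;price&lt;/code&gt; fields: the first two patterns below each carry a predicate, so both default to priority 0.5, and a featured book costing more than 50 would match either one. The explicit &lt;code&gt;priority&lt;/code&gt; values decide the winner:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight xml"&gt;&lt;code&gt;&amp;lt;xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"&amp;gt;

  &amp;lt;!-- Wins for featured books, even expensive ones: 2 beats 1 --&amp;gt;
  &amp;lt;xsl:template match="book[@featured='true']" priority="2"&amp;gt;
    &amp;lt;strong&amp;gt;&amp;lt;xsl:value-of select="title"/&amp;gt;&amp;lt;/strong&amp;gt;
  &amp;lt;/xsl:template&amp;gt;

  &amp;lt;!-- Used for expensive books that are not featured --&amp;gt;
  &amp;lt;xsl:template match="book[price &amp;gt; 50]" priority="1"&amp;gt;
    &amp;lt;em&amp;gt;&amp;lt;xsl:value-of select="title"/&amp;gt;&amp;lt;/em&amp;gt;
  &amp;lt;/xsl:template&amp;gt;

  &amp;lt;!-- Fallback for every other book (default priority 0) --&amp;gt;
  &amp;lt;xsl:template match="book"&amp;gt;
    &amp;lt;span&amp;gt;&amp;lt;xsl:value-of select="title"/&amp;gt;&amp;lt;/span&amp;gt;
  &amp;lt;/xsl:template&amp;gt;

&amp;lt;/xsl:stylesheet&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;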



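&lt;p&gt;If two matching templates have the same import precedence and the same priority, the outcome depends on the XSLT version and the processor. XSLT 1.0 and 2.0 treat this as a recoverable error: the processor may signal the error, or recover by using the template that comes last in the stylesheet. Saxon, for example, reports a warning by default and uses the last template. Always assign explicit priorities when you have competing templates.&lt;/p&gt;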

&lt;h2&gt;
  
  
  The built-in templates
&lt;/h2&gt;

&lt;p&gt;XSLT has default templates for every node type. If no explicit template matches a node, the built-in fires. For elements and the document root, the built-in calls &lt;code&gt;apply-templates&lt;/code&gt; on all children. For text and attribute nodes, it outputs the string value.&lt;/p&gt;

&lt;p&gt;This means that without any templates at all, the processor will walk the entire tree and output all text content. Understanding this explains why simple stylesheets can produce unexpected extra text — a text node matched nothing explicit, and the built-in output it.&lt;/p&gt;
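
&lt;p&gt;For reference, the XSLT 1.0 specification describes the built-in rules for the default mode as behaving like the templates below. In a stylesheet with no other templates, spelling them out changes nothing, which makes this a convenient starting point when deciding which rule to override:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight xml"&gt;&lt;code&gt;&amp;lt;xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"&amp;gt;

  &amp;lt;!-- Elements and the root: keep walking down into the children --&amp;gt;
  &amp;lt;xsl:template match="*|/"&amp;gt;
    &amp;lt;xsl:apply-templates/&amp;gt;
  &amp;lt;/xsl:template&amp;gt;

  &amp;lt;!-- Text and attribute nodes: copy the string value to the output --&amp;gt;
  &amp;lt;xsl:template match="text()|@*"&amp;gt;
    &amp;lt;xsl:value-of select="."/&amp;gt;
  &amp;lt;/xsl:template&amp;gt;

  &amp;lt;!-- Comments and processing instructions: produce nothing --&amp;gt;
  &amp;lt;xsl:template match="processing-instruction()|comment()"/&amp;gt;

&amp;lt;/xsl:stylesheet&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Attributes are only reached when a stylesheet selects them explicitly, because &lt;code&gt;apply-templates&lt;/code&gt; without a &lt;code&gt;select&lt;/code&gt; visits child nodes and attributes are not children.&lt;/p&gt;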

&lt;p&gt;To suppress text output globally, add:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight xml"&gt;&lt;code&gt;&amp;lt;xsl:template match="text()"/&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This overrides the built-in with an empty template, producing no output for text nodes that are not handled elsewhere.&lt;/p&gt;

&lt;h2&gt;
  
  
  Modes
&lt;/h2&gt;

&lt;p&gt;Modes let you have multiple templates for the same node that serve different purposes. A mode is a named context for a set of templates.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight xml"&gt;&lt;code&gt;







    ## 





&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Call with mode:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight xml"&gt;&lt;code&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Modes are especially useful when you need to process the same nodes in multiple places in the output with different logic each time.&lt;/p&gt;
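
&lt;p&gt;A minimal sketch of that situation, assuming a hypothetical &lt;code&gt;report&lt;/code&gt; document made of &lt;code&gt;section&lt;/code&gt; elements with &lt;code&gt;title&lt;/code&gt; and &lt;code&gt;para&lt;/code&gt; children, and an invented mode name &lt;code&gt;toc&lt;/code&gt;. The same sections are processed twice, once for a table of contents and once for the body:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight xml"&gt;&lt;code&gt;&amp;lt;xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"&amp;gt;

  &amp;lt;xsl:template match="/report"&amp;gt;
    &amp;lt;html&amp;gt;
      &amp;lt;body&amp;gt;
        &amp;lt;!-- First pass: only templates in mode "toc" are candidates --&amp;gt;
        &amp;lt;ul&amp;gt;
          &amp;lt;xsl:apply-templates select="section" mode="toc"/&amp;gt;
        &amp;lt;/ul&amp;gt;
        &amp;lt;!-- Second pass: the default (unnamed) mode --&amp;gt;
        &amp;lt;xsl:apply-templates select="section"/&amp;gt;
      &amp;lt;/body&amp;gt;
    &amp;lt;/html&amp;gt;
  &amp;lt;/xsl:template&amp;gt;

  &amp;lt;!-- Table-of-contents entry --&amp;gt;
  &amp;lt;xsl:template match="section" mode="toc"&amp;gt;
    &amp;lt;li&amp;gt;&amp;lt;xsl:value-of select="title"/&amp;gt;&amp;lt;/li&amp;gt;
  &amp;lt;/xsl:template&amp;gt;

  &amp;lt;!-- Full rendering of the same element --&amp;gt;
  &amp;lt;xsl:template match="section"&amp;gt;
    &amp;lt;h2&amp;gt;&amp;lt;xsl:value-of select="title"/&amp;gt;&amp;lt;/h2&amp;gt;
    &amp;lt;xsl:apply-templates select="para"/&amp;gt;
  &amp;lt;/xsl:template&amp;gt;

  &amp;lt;xsl:template match="para"&amp;gt;
    &amp;lt;p&amp;gt;&amp;lt;xsl:value-of select="."/&amp;gt;&amp;lt;/p&amp;gt;
  &amp;lt;/xsl:template&amp;gt;

&amp;lt;/xsl:stylesheet&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;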

&lt;h2&gt;
  
  
  apply-templates vs for-each
&lt;/h2&gt;

&lt;p&gt;Both process a set of nodes, one node at a time. The difference is that &lt;code&gt;apply-templates&lt;/code&gt; dispatches each node to its best matching template, while &lt;code&gt;for-each&lt;/code&gt; runs its own body inline, inside the current template, and does no template lookup.&lt;/p&gt;

&lt;p&gt;Use &lt;code&gt;apply-templates&lt;/code&gt; when you want polymorphism — different node types handled differently. Use &lt;code&gt;for-each&lt;/code&gt; when you are doing a simple iteration over a homogeneous set and do not need dispatch.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight xml"&gt;&lt;code&gt;




  - 

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
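
&lt;p&gt;The contrast can be sketched in one stylesheet. The &lt;code&gt;doc&lt;/code&gt;, &lt;code&gt;para&lt;/code&gt;, &lt;code&gt;note&lt;/code&gt;, and &lt;code&gt;keyword&lt;/code&gt; element names are hypothetical and chosen only for illustration:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight xml"&gt;&lt;code&gt;&amp;lt;xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"&amp;gt;

  &amp;lt;xsl:template match="/doc"&amp;gt;
    &amp;lt;body&amp;gt;
      &amp;lt;!-- Dispatch: para and note each go to their own template --&amp;gt;
      &amp;lt;xsl:apply-templates select="para | note"/&amp;gt;

      &amp;lt;!-- Inline iteration: one body for a homogeneous set --&amp;gt;
      &amp;lt;ul&amp;gt;
        &amp;lt;xsl:for-each select="keyword"&amp;gt;
          &amp;lt;li&amp;gt;&amp;lt;xsl:value-of select="."/&amp;gt;&amp;lt;/li&amp;gt;
        &amp;lt;/xsl:for-each&amp;gt;
      &amp;lt;/ul&amp;gt;
    &amp;lt;/body&amp;gt;
  &amp;lt;/xsl:template&amp;gt;

  &amp;lt;xsl:template match="para"&amp;gt;
    &amp;lt;p&amp;gt;&amp;lt;xsl:value-of select="."/&amp;gt;&amp;lt;/p&amp;gt;
  &amp;lt;/xsl:template&amp;gt;

  &amp;lt;xsl:template match="note"&amp;gt;
    &amp;lt;blockquote&amp;gt;&amp;lt;xsl:value-of select="."/&amp;gt;&amp;lt;/blockquote&amp;gt;
  &amp;lt;/xsl:template&amp;gt;

&amp;lt;/xsl:stylesheet&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Adding a new child element type later means adding one template for the dispatched part, while the &lt;code&gt;for-each&lt;/code&gt; body would need a conditional.&lt;/p&gt;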



&lt;h2&gt;
  
  
  Testing patterns in XSLT Playground
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://xsltplayground.com" rel="noopener noreferrer"&gt;XSLT Playground&lt;/a&gt; is a fast way to experiment with matching rules. Paste a stylesheet with multiple competing templates and check which one fires. Add &lt;code&gt;&amp;lt;xsl:message&amp;gt;&lt;/code&gt; in each template to trace which one the processor picks. The trace panel shows all messages in order so you can follow the dispatch chain.&lt;/p&gt;
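
&lt;p&gt;A small sketch of this kind of tracing with the standard &lt;code&gt;&amp;lt;xsl:message&amp;gt;&lt;/code&gt; instruction, using a hypothetical &lt;code&gt;item&lt;/code&gt; element with a made-up &lt;code&gt;urgent&lt;/code&gt; attribute. How and where the messages are displayed depends on the processor and the tool:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight xml"&gt;&lt;code&gt;&amp;lt;xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"&amp;gt;

  &amp;lt;!-- Default priority 0 --&amp;gt;
  &amp;lt;xsl:template match="item"&amp;gt;
    &amp;lt;xsl:message&amp;gt;item: generic template&amp;lt;/xsl:message&amp;gt;
    &amp;lt;li&amp;gt;&amp;lt;xsl:value-of select="."/&amp;gt;&amp;lt;/li&amp;gt;
  &amp;lt;/xsl:template&amp;gt;

  &amp;lt;!-- Default priority 0.5, so this one fires for urgent items --&amp;gt;
  &amp;lt;xsl:template match="item[@urgent='true']"&amp;gt;
    &amp;lt;xsl:message&amp;gt;item: urgent template&amp;lt;/xsl:message&amp;gt;
    &amp;lt;li class="urgent"&amp;gt;&amp;lt;xsl:value-of select="."/&amp;gt;&amp;lt;/li&amp;gt;
  &amp;lt;/xsl:template&amp;gt;

&amp;lt;/xsl:stylesheet&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;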

&lt;p&gt;A solid understanding of template matching pays off every time you work on a complex stylesheet. Once you know how priorities and built-ins interact, most "unexpected output" bugs become obvious.&lt;/p&gt;

</description>
      <category>coding</category>
      <category>programming</category>
      <category>tutorial</category>
      <category>webdev</category>
    </item>
    <item>
      <title>Prometheus Alertmanager vs Grafana Alerting (2026): Architecture, Features, and When to Use Each</title>
      <dc:creator>Alexandre Vazquez</dc:creator>
      <pubDate>Tue, 05 May 2026 11:00:00 +0000</pubDate>
      <link>https://dev.to/alexandrev/prometheus-alertmanager-vs-grafana-alerting-2026-architecture-features-and-when-to-use-each-48d7</link>
      <guid>https://dev.to/alexandrev/prometheus-alertmanager-vs-grafana-alerting-2026-architecture-features-and-when-to-use-each-48d7</guid>
      <description>&lt;h1&gt;
  
  
  Prometheus Alertmanager vs Grafana Alerting (2026): Architecture, Features, and When to Use Each
&lt;/h1&gt;

&lt;h2&gt;
  
  
  Introduction
&lt;/h2&gt;

&lt;p&gt;Most observability stacks running in production for over a year end up with alerting spread across two systems: Prometheus Alertmanager handling metric-based alerts and Grafana Alerting managing everything else. This creates the "alerting consolidation problem" where on-call teams receive duplicated pages, silencing rules live in two places, and nobody is certain which system is authoritative.&lt;/p&gt;

&lt;p&gt;The question is straightforward: should you standardize on Prometheus Alertmanager, move everything into Grafana Alerting, or deliberately run both? The answer depends on your datasource mix, your GitOps maturity, and how your organization manages on-call routing.&lt;/p&gt;

&lt;h2&gt;
  
  
  Architecture Overview
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Prometheus Alertmanager: The Standalone Receiver
&lt;/h3&gt;

&lt;p&gt;Alertmanager is a dedicated, standalone component in the Prometheus ecosystem. It does not evaluate alert rules itself. Instead, Prometheus (or compatible senders like Thanos Ruler, Cortex, or Mimir Ruler) evaluates PromQL expressions and pushes firing alerts to the Alertmanager API. Alertmanager then handles deduplication, grouping, inhibition, silencing, and notification delivery.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# Simplified Prometheus → Alertmanager flow
#
# [Prometheus] --evaluates rules--&amp;gt; [firing alerts]
#        |
#        +--POST /api/v2/alerts--&amp;gt; [Alertmanager]
#                                      |
#                          +-----------+-----------+
#                          |           |           |
#                       [Slack]    [PagerDuty]  [Email]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
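
&lt;p&gt;On the Prometheus side, this flow comes down to two pieces of configuration: a rule file that Prometheus evaluates, and an &lt;code&gt;alerting&lt;/code&gt; section that tells it where to push firing alerts. The sketch below assumes a hypothetical &lt;code&gt;alertmanager:9093&lt;/code&gt; address, an assumed &lt;code&gt;http_requests_total&lt;/code&gt; metric with a &lt;code&gt;status&lt;/code&gt; label, and illustrative file names. The two YAML documents stand for two separate files:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;# prometheus.yml (fragment)
rule_files:
  - "rules/*.yml"          # hypothetical location of the rule files
alerting:
  alertmanagers:
    - static_configs:
        - targets: ["alertmanager:9093"]   # assumed host and port
---
# rules/api.yml (hypothetical rule file)
groups:
  - name: api
    rules:
      - alert: HighErrorRate
        # Share of 5xx responses over the last 5 minutes (assumed metric)
        expr: |
          sum(rate(http_requests_total{status=~"5.."}[5m]))
            / sum(rate(http_requests_total[5m])) &amp;gt; 0.05
        for: 10m
        labels:
          severity: critical
        annotations:
          summary: "More than 5% of requests are failing"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;The labels attached here (&lt;code&gt;severity&lt;/code&gt; in this sketch) are what the Alertmanager routing tree later matches on.&lt;/p&gt;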



&lt;p&gt;The entire configuration lives in a single YAML file (&lt;code&gt;alertmanager.yml&lt;/code&gt;). This includes the routing tree, receiver definitions, inhibition rules, and the paths to notification template files. There is no database and no UI-driven configuration state: just a config file and a local storage directory where Alertmanager persists silences and its notification log. This makes the configuration trivially reproducible and ideal for GitOps workflows.&lt;/p&gt;

&lt;p&gt;For high availability, you run multiple Alertmanager instances in a gossip-based cluster. They use a mesh protocol to share silence and notification state, ensuring that failover does not result in duplicate or lost notifications.&lt;/p&gt;

&lt;h3&gt;
  
  
  Grafana Alerting: The Integrated Platform
&lt;/h3&gt;

&lt;p&gt;Grafana Alerting (sometimes called "Grafana Unified Alerting," introduced in Grafana 8) takes a different architectural approach. It embeds the entire alerting lifecycle — rule evaluation, state management, routing, and notification — inside the Grafana server process. Under the hood, it uses a fork of Alertmanager for the routing and notification layer.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# Simplified Grafana Alerting flow
#
# [Grafana Server]
#   ├── Rule Evaluation Engine
#   │     ├── queries Prometheus
#   │     ├── queries Loki
#   │     ├── queries CloudWatch
#   │     └── queries any supported datasource
#   │
#   ├── Alert State Manager (internal)
#   │
#   └── Embedded Alertmanager (routing + notifications)
#           |
#           +-----------+-----------+
#           |           |           |
#        [Slack]    [PagerDuty]  [Email]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The critical distinction is that Grafana Alerting evaluates alert rules itself, querying any configured datasource — not just Prometheus. It can fire alerts based on Loki log queries, Elasticsearch searches, CloudWatch metrics, PostgreSQL queries, or any of the 100+ datasource plugins available in Grafana. Rule definitions, contact points, notification policies, and mute timings are stored in the Grafana database (or provisioned via YAML files and the Grafana API).&lt;/p&gt;
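
&lt;p&gt;As a rough sketch of the file-provisioning route, the fragment below declares two contact points and a notification policy tree. The file name, contact point names, email address, and webhook URL are placeholders, and exact field names can differ between Grafana versions, so treat this as something to check against the provisioning documentation for your release:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;# provisioning/alerting/routing.yml (hypothetical file name)
apiVersion: 1

contactPoints:
  - orgId: 1
    name: platform-email
    receivers:
      - uid: platform-email-1          # any unique identifier
        type: email
        settings:
          addresses: platform@example.com   # placeholder address
  - orgId: 1
    name: payments-slack
    receivers:
      - uid: payments-slack-1
        type: slack
        settings:
          url: https://hooks.slack.com/services/T00/B00/XXX   # placeholder webhook

policies:
  - orgId: 1
    receiver: platform-email           # default contact point
    group_by: [alertname, cluster]
    routes:
      # Alerts labelled team=payments go to the payments channel
      - receiver: payments-slack
        object_matchers:
          - [team, "=", payments]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;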

&lt;h2&gt;
  
  
  Feature Comparison
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Feature&lt;/th&gt;
&lt;th&gt;Prometheus Alertmanager&lt;/th&gt;
&lt;th&gt;Grafana Alerting&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Datasources&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Prometheus-compatible only (Prometheus, Thanos, Mimir, VictoriaMetrics)&lt;/td&gt;
&lt;td&gt;Any Grafana datasource (Prometheus, Loki, Elasticsearch, CloudWatch, SQL databases, etc.)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Rule evaluation&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;External (Prometheus/Ruler evaluates rules and pushes alerts)&lt;/td&gt;
&lt;td&gt;Built-in (Grafana evaluates rules directly)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Routing tree&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Hierarchical YAML-based routing with match/match_re, continue, group_by&lt;/td&gt;
&lt;td&gt;Notification policies with label matchers, nested policies, mute timings&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Grouping&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Full support via group_by, group_wait, group_interval&lt;/td&gt;
&lt;td&gt;Full support via notification policies with equivalent controls&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Inhibition&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Native inhibition rules (suppress alerts when a related alert is firing)&lt;/td&gt;
&lt;td&gt;Supported since Grafana 10.3 but less flexible than Alertmanager&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Silencing&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Label-based, time-limited silences via API or UI; recurring mute windows via time_intervals in the config file&lt;/td&gt;
&lt;td&gt;Mute timings (recurring schedules) and silences (ad-hoc, label-based)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Notification channels&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Email, Slack, PagerDuty, OpsGenie, VictorOps, webhook, WeChat, Telegram, SNS, Webex&lt;/td&gt;
&lt;td&gt;All of the above plus Teams, Discord, Google Chat, LINE, Threema, Grafana OnCall, and more via contact points&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Templating&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Go templates in notification config&lt;/td&gt;
&lt;td&gt;Go templates with access to Grafana template variables and functions&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Multi-tenancy&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Not built-in; achieved via separate instances or Mimir Alertmanager&lt;/td&gt;
&lt;td&gt;Native multi-tenancy via Grafana organizations and RBAC&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;High availability&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Gossip-based cluster (peer mesh, well-proven)&lt;/td&gt;
&lt;td&gt;Database-backed HA with peer discovery between Grafana instances&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Configuration model&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Single YAML file, fully declarative&lt;/td&gt;
&lt;td&gt;UI + API + provisioning YAML files, stored in database&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;GitOps compatibility&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Excellent — config file lives in version control natively&lt;/td&gt;
&lt;td&gt;Possible via provisioning files or Terraform provider, but requires extra tooling&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;External alert sources&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Any system that can POST to the Alertmanager API&lt;/td&gt;
&lt;td&gt;Supported via the Grafana Alerting API (external alerts can be pushed)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Managed service&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Available via Grafana Cloud (as Mimir Alertmanager), Amazon Managed Prometheus&lt;/td&gt;
&lt;td&gt;Available via Grafana Cloud&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h2&gt;
  
  
  Alertmanager Strengths
&lt;/h2&gt;

&lt;p&gt;Alertmanager has been a production staple since 2015. Over a decade of use across thousands of organizations has made it one of the most battle-tested components in the CNCF ecosystem.&lt;/p&gt;

&lt;h3&gt;
  
  
  Declarative, GitOps-Native Configuration
&lt;/h3&gt;

&lt;p&gt;The entire Alertmanager configuration is a single YAML file. There is no hidden state in a database, no click-driven configuration that someone forgets to document. You check it into Git, review it in a pull request, and deploy it through your CI/CD pipeline like any other infrastructure code.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="c1"&gt;# alertmanager.yml — everything in one file&lt;/span&gt;
&lt;span class="na"&gt;global&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;resolve_timeout&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;5m&lt;/span&gt;
  &lt;span class="na"&gt;slack_api_url&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;https://hooks.slack.com/services/T00/B00/XXX"&lt;/span&gt;

&lt;span class="na"&gt;route&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;receiver&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;platform-team&lt;/span&gt;
  &lt;span class="na"&gt;group_by&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;alertname&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;cluster&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;namespace&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;
  &lt;span class="na"&gt;group_wait&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;30s&lt;/span&gt;
  &lt;span class="na"&gt;group_interval&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;5m&lt;/span&gt;
  &lt;span class="na"&gt;repeat_interval&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;4h&lt;/span&gt;
  &lt;span class="na"&gt;routes&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;match&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;severity&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;critical&lt;/span&gt;
      &lt;span class="na"&gt;receiver&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;pagerduty-oncall&lt;/span&gt;
      &lt;span class="na"&gt;group_wait&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;10s&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;match_re&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;team&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;^(payments|checkout)$"&lt;/span&gt;
      &lt;span class="na"&gt;receiver&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;payments-slack&lt;/span&gt;
      &lt;span class="na"&gt;continue&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;

&lt;span class="na"&gt;receivers&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;platform-team&lt;/span&gt;
    &lt;span class="na"&gt;slack_configs&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;channel&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;#platform-alerts"&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;pagerduty-oncall&lt;/span&gt;
    &lt;span class="na"&gt;pagerduty_configs&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;service_key&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;payments-slack&lt;/span&gt;
    &lt;span class="na"&gt;slack_configs&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;channel&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;#payments-oncall"&lt;/span&gt;

&lt;span class="na"&gt;inhibit_rules&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;source_match&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;severity&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;critical&lt;/span&gt;
    &lt;span class="na"&gt;target_match&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;severity&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;warning&lt;/span&gt;
    &lt;span class="na"&gt;equal&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;alertname&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;cluster&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
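
&lt;p&gt;Alertmanager 0.22 and later also accept a unified &lt;code&gt;matchers&lt;/code&gt; syntax, and the project documentation marks &lt;code&gt;match&lt;/code&gt;, &lt;code&gt;match_re&lt;/code&gt;, &lt;code&gt;source_match&lt;/code&gt;, and &lt;code&gt;target_match&lt;/code&gt; as deprecated. As a sketch, the routes and the inhibition rule from the example above could be written as follows. Receivers are reduced to bare names here to keep the fragment short:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;route:
  receiver: platform-team
  group_by: [alertname, cluster, namespace]
  routes:
    - matchers:
        - severity="critical"
      receiver: pagerduty-oncall
      group_wait: 10s
    - matchers:
        # Regex matchers are fully anchored, so ^ and $ are not needed
        - team=~"payments|checkout"
      receiver: payments-slack
      continue: true

receivers:
  # A receiver with only a name is valid and sends nothing
  - name: platform-team
  - name: pagerduty-oncall
  - name: payments-slack

inhibit_rules:
  - source_matchers:
      - severity="critical"
    target_matchers:
      - severity="warning"
    equal: [alertname, cluster]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;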



&lt;p&gt;Every change is auditable. Rollbacks are a &lt;code&gt;git revert&lt;/code&gt; away. This matters enormously when you are debugging why an alert did not fire at 3 AM.&lt;/p&gt;

&lt;h3&gt;
  
  
  Lightweight and Single-Purpose
&lt;/h3&gt;

&lt;p&gt;Alertmanager does one thing: route and deliver notifications. It has no dashboard, no query engine, no datasource plugins. This single-purpose design makes it operationally simple. Resource consumption is minimal — a small Alertmanager instance handles thousands of active alerts on a few hundred megabytes of memory. It starts in milliseconds and requires almost no maintenance.&lt;/p&gt;

&lt;h3&gt;
  
  
  Mature Inhibition and Routing
&lt;/h3&gt;

&lt;p&gt;Alertmanager's inhibition rules are first-class citizens. You can suppress downstream warnings when a critical alert is already firing, preventing alert storms from overwhelming your on-call team. The hierarchical routing tree with &lt;code&gt;continue&lt;/code&gt; flags allows for nuanced delivery: send to the team channel AND escalate to PagerDuty simultaneously, with different grouping strategies at each level.&lt;/p&gt;
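
&lt;p&gt;Ordering matters here. Alertmanager walks sibling routes from top to bottom and stops at the first match unless that route sets &lt;code&gt;continue: true&lt;/code&gt;. A sketch of the "team channel and PagerDuty" case, with illustrative receiver names and labels:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;route:
  receiver: platform-team
  routes:
    # Matched first; continue lets evaluation move on to the next sibling
    - matchers:
        - team=~"payments|checkout"
      receiver: payments-slack
      group_by: [alertname]
      continue: true
    # Also matches critical payments alerts, so they page as well
    - matchers:
        - severity="critical"
      receiver: pagerduty-oncall
      group_by: [alertname, cluster]
      group_wait: 10s

receivers:
  # Bare names only; real receivers would carry Slack or PagerDuty settings
  - name: platform-team
  - name: payments-slack
  - name: pagerduty-oncall
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;You can check the result with &lt;code&gt;amtool config routes test --config.file=alertmanager.yml team=payments severity=critical&lt;/code&gt;, which prints the receivers an alert with those labels would reach.&lt;/p&gt;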

&lt;h3&gt;
  
  
  Proven High Availability
&lt;/h3&gt;

&lt;p&gt;The gossip-based HA cluster has been stable for years. Running three Alertmanager replicas, with Prometheus configured to send every alert to all of them (through a static target list or Kubernetes service discovery, not a load balancer in front of the replicas), gives you reliable notification delivery without shared storage. The protocol handles deduplication across instances automatically, which is the hardest part of distributed alerting.&lt;/p&gt;
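
&lt;p&gt;The Alertmanager documentation recommends listing every replica in the Prometheus configuration instead of load balancing across them, so that each instance receives every alert and the cluster can deduplicate the notifications. A sketch with hypothetical host names:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;# prometheus.yml (fragment), host names are placeholders
alerting:
  alertmanagers:
    - static_configs:
        - targets:
            - alertmanager-0:9093
            - alertmanager-1:9093
            - alertmanager-2:9093

# Each replica is started with cluster flags along these lines
# so that the peers can find each other and gossip:
#   --cluster.listen-address=0.0.0.0:9094
#   --cluster.peer=alertmanager-0:9094
#   --cluster.peer=alertmanager-1:9094
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;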

&lt;h2&gt;
  
  
  Grafana Alerting Strengths
&lt;/h2&gt;

&lt;p&gt;Grafana Alerting has matured considerably since its rocky introduction in Grafana 8. As of Grafana 11 and 12, it is a legitimate production alerting platform with capabilities that Alertmanager cannot match on its own.&lt;/p&gt;

&lt;h3&gt;
  
  
  Multi-Datasource Alert Rules
&lt;/h3&gt;

&lt;p&gt;This is Grafana Alerting's strongest differentiator. You can write alert rules that query Loki for error log spikes, CloudWatch for AWS resource utilization, Elasticsearch for application errors, or a PostgreSQL database for business metrics — all from the same alerting system. If your observability stack includes more than just Prometheus, this eliminates the need for separate alerting tools per datasource.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Grafana alert rule provisioning example — alerting on Loki log errors&lt;/span&gt;
&lt;span class="na"&gt;apiVersion&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;1&lt;/span&gt;
&lt;span class="na"&gt;groups&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;orgId&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;1&lt;/span&gt;
    &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;application-errors&lt;/span&gt;
    &lt;span class="na"&gt;folder&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Production&lt;/span&gt;
    &lt;span class="na"&gt;interval&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;1m&lt;/span&gt;
    &lt;span class="na"&gt;rules&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;uid&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;loki-error-spike&lt;/span&gt;
        &lt;span class="na"&gt;title&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;High&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;error&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;rate&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;in&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;payment&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;service"&lt;/span&gt;
        &lt;span class="na"&gt;condition&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;C&lt;/span&gt;
        &lt;span class="na"&gt;data&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
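          &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;refId&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;A&lt;/span&gt;
            &lt;span class="na"&gt;datasourceUid&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;loki-prod&lt;/span&gt;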
            &lt;span class="na"&gt;model&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
              &lt;span class="na"&gt;expr&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s1"&gt;'&lt;/span&gt;&lt;span class="s"&gt;sum(rate({app="payment-service"}&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;|=&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;"ERROR"&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;[5m]))'&lt;/span&gt;
          &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;refId&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;B&lt;/span&gt;
            &lt;span class="na"&gt;datasourceUid&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;__expr__"&lt;/span&gt;
            &lt;span class="na"&gt;model&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
              &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;reduce&lt;/span&gt;
              &lt;span class="na"&gt;expression&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;A&lt;/span&gt;
              &lt;span class="na"&gt;reducer&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;last&lt;/span&gt;
          &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;refId&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;C&lt;/span&gt;
            &lt;span class="na"&gt;datasourceUid&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;__expr__"&lt;/span&gt;
            &lt;span class="na"&gt;model&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
              &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;threshold&lt;/span&gt;
              &lt;span class="na"&gt;expression&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;B&lt;/span&gt;
              &lt;span class="na"&gt;conditions&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
                &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;evaluator&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
                    &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;gt&lt;/span&gt;
                    &lt;span class="na"&gt;params&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;10&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;
        &lt;span class="na"&gt;for&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;5m&lt;/span&gt;
        &lt;span class="na"&gt;labels&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;severity&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;warning&lt;/span&gt;
          &lt;span class="na"&gt;team&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;payments&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This is something Alertmanager simply cannot do. Alertmanager only receives pre-evaluated alerts — it has no concept of datasources or query execution.&lt;/p&gt;

&lt;h3&gt;
  
  
  Unified UI for Alert Management
&lt;/h3&gt;

&lt;p&gt;Grafana provides a single pane of glass for alert rule creation, visualization, notification policy management, contact point configuration, and silence management. For teams where not every engineer is comfortable editing YAML routing trees, the visual notification policy editor significantly reduces the barrier to entry. You can see the state of every alert rule, its evaluation history, and the exact notification path it will take — all without leaving the browser.&lt;/p&gt;

&lt;h3&gt;
  
  
  Native Multi-Tenancy and RBAC
&lt;/h3&gt;

&lt;p&gt;Grafana's organization model and role-based access control extend naturally to alerting. Different teams can manage their own alert rules, contact points, and notification policies within their organization or folder scope, without seeing or interfering with other teams. Achieving this with standalone Alertmanager requires either running separate instances per tenant or using Mimir's multi-tenant Alertmanager.&lt;/p&gt;

&lt;h3&gt;
  
  
  Mute Timings and Richer Scheduling
&lt;/h3&gt;

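&lt;p&gt;Alongside silences (ad-hoc, time-limited suppressions), Grafana Alerting offers mute timings: recurring time-based windows in which notifications are suppressed. This is useful for scheduled maintenance windows, business-hours-only alerting, or suppressing non-critical alerts on weekends. Alertmanager has an equivalent mechanism (named &lt;code&gt;time_intervals&lt;/code&gt;, referenced from routes through &lt;code&gt;mute_time_intervals&lt;/code&gt; or &lt;code&gt;active_time_intervals&lt;/code&gt;), but it is defined in the config file, whereas Grafana lets you create and attach mute timings from the UI.&lt;/p&gt;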
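
&lt;p&gt;Mute timings can also be provisioned from files. The sketch below assumes a contact point named &lt;code&gt;platform-team&lt;/code&gt; already exists. The file name and the &lt;code&gt;weekends&lt;/code&gt; name are invented, and field names should be checked against the provisioning documentation for your Grafana version. Note that a provisioned &lt;code&gt;policies&lt;/code&gt; entry defines the whole notification policy tree for that organization:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;# provisioning/alerting/mute-timings.yml (hypothetical file name)
apiVersion: 1

muteTimes:
  - orgId: 1
    name: weekends
    time_intervals:
      - weekdays: ["saturday", "sunday"]
        location: UTC

policies:
  - orgId: 1
    receiver: platform-team            # assumed existing contact point
    routes:
      # Warnings are not delivered during the weekends window
      - receiver: platform-team
        object_matchers:
          - [severity, "=", warning]
        mute_time_intervals: [weekends]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;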

&lt;h3&gt;
  
  
  Grafana Cloud as a Managed Option
&lt;/h3&gt;

&lt;p&gt;For teams that want to avoid managing alerting infrastructure entirely, Grafana Cloud provides a fully managed Grafana Alerting stack. This includes HA, state persistence, and notification delivery without any self-hosted components. The Grafana Cloud alerting stack also includes a managed Mimir Alertmanager, which means you can use Prometheus-native alerting rules if you prefer that model while still benefiting from the managed infrastructure.&lt;/p&gt;

&lt;h2&gt;
  
  
  When to Use Prometheus Alertmanager
&lt;/h2&gt;

&lt;p&gt;Alertmanager is the right choice when the following conditions describe your environment:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Your metrics stack is Prometheus-native.&lt;/strong&gt; If all your alert rules are PromQL expressions evaluated by Prometheus, Thanos Ruler, or Mimir Ruler, Alertmanager is the natural fit. There is no added value in routing those alerts through Grafana.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;GitOps is non-negotiable.&lt;/strong&gt; If every infrastructure change must go through a pull request and be fully declarative, Alertmanager's single-file configuration model is significantly easier to manage than Grafana's database-backed state. Tools like &lt;code&gt;amtool&lt;/code&gt; provide config validation in CI pipelines.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;You need fine-grained routing with inhibition.&lt;/strong&gt; Complex routing trees with multiple levels of grouping, inhibition rules, and &lt;code&gt;continue&lt;/code&gt; flags are more naturally expressed in Alertmanager's YAML format. The routing logic has been stable and well-documented for years.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;You run microservices with per-team routing.&lt;/strong&gt; If each team owns its routing subtree and the routing logic is complex, Alertmanager's hierarchical model scales better than UI-driven configuration. Teams can own their section of the config file via CODEOWNERS in Git.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;You want minimal operational overhead.&lt;/strong&gt; Alertmanager is a single binary with minimal resource requirements. There is no database to back up, no migrations to run, and no UI framework to keep updated.&lt;/li&gt;
&lt;/ul&gt;
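
&lt;p&gt;The CI validation mentioned above can be sketched as a GitHub Actions workflow. The workflow name, file path, and image tag are assumptions. The only Alertmanager-specific part is &lt;code&gt;amtool check-config&lt;/code&gt;, run here from the &lt;code&gt;prom/alertmanager&lt;/code&gt; image, which ships the &lt;code&gt;amtool&lt;/code&gt; binary:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;# .github/workflows/alertmanager.yml (hypothetical workflow)
name: validate-alertmanager-config

on:
  pull_request:
    paths:
      - alertmanager.yml

jobs:
  check-config:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Validate with amtool
        # Fails the job if the file does not parse or references
        # a receiver that is not defined
        run: |
          docker run --rm \
            -v "$PWD:/config:ro" \
            --entrypoint /bin/amtool \
            prom/alertmanager:latest \
            check-config /config/alertmanager.yml
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;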

&lt;h2&gt;
  
  
  When to Use Grafana Alerting
&lt;/h2&gt;

&lt;p&gt;Grafana Alerting is the right choice when these conditions apply:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;You alert on more than just Prometheus metrics.&lt;/strong&gt; If you need alert rules based on Loki logs, Elasticsearch queries, CloudWatch metrics, or database queries, Grafana Alerting is the only option that handles all of these natively. The alternative is running separate alerting tools per datasource, which is worse.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Your team prefers UI-driven configuration.&lt;/strong&gt; Not every engineer wants to edit YAML routing trees. If your organization values a visual interface for managing alerts, contact points, and notification policies, Grafana's UI is a major productivity advantage.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;You are using Grafana Cloud.&lt;/strong&gt; If you are already on Grafana Cloud, using its built-in alerting is the path of least resistance. You get HA, managed notification delivery, and a unified experience without running any additional infrastructure.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Multi-tenancy is a requirement.&lt;/strong&gt; If multiple teams need isolated alerting configurations with RBAC, Grafana's native organization and folder-based access model is significantly easier to set up than running per-tenant Alertmanager instances.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;You want mute timings for recurring maintenance windows.&lt;/strong&gt; If your team regularly needs to suppress alerts during scheduled windows (deploy windows, batch processing hours, weekend non-critical suppression), Grafana's mute timings can be created and attached to notification policies from the UI, which many teams find more ergonomic than editing &lt;code&gt;time_intervals&lt;/code&gt; in the Alertmanager config file.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Running Both Together: The Hybrid Pattern
&lt;/h2&gt;

&lt;p&gt;In practice, many production environments run both Alertmanager and Grafana Alerting. This is not necessarily a mistake — it can be a deliberate architectural choice when done with clear boundaries.&lt;/p&gt;

&lt;h3&gt;
  
  
  Common Hybrid Architecture
&lt;/h3&gt;

&lt;p&gt;The most common pattern looks like this:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Prometheus Alertmanager&lt;/strong&gt; handles all metric-based alerts. PromQL rules are evaluated by Prometheus or a long-term storage ruler (Thanos, Mimir). Alertmanager owns routing, grouping, and notification for these alerts.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Grafana Alerting&lt;/strong&gt; handles non-Prometheus alerts: log-based alerts from Loki, business metrics from SQL datasources, and cross-datasource correlation rules.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The key to making this work without chaos is establishing clear ownership rules:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# Ownership boundaries for hybrid alerting
#
# Prometheus Alertmanager owns:
#   - All PromQL-based alert rules
#   - Infrastructure alerts (node, kubelet, etcd, CoreDNS)
#   - Application SLO/SLI alerts based on metrics
#
# Grafana Alerting owns:
#   - Log-based alert rules (Loki, Elasticsearch)
#   - Business metric alerts (SQL datasources)
#   - Cross-datasource correlation rules
#   - Alerts for teams that prefer UI-driven management
#
# Shared:
#   - Contact points / receivers use the same Slack channels and PagerDuty services
#   - On-call rotations are managed externally (PagerDuty, Grafana OnCall)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
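
&lt;p&gt;The ownership boundaries above can also be expressed as a small check, for example as a lint step over rule definitions. This is a minimal Python sketch and not part of either tool: the function name and the datasource type strings are illustrative assumptions, and it only encodes the datasource-based rows of the list.&lt;/p&gt;

```python
# Hypothetical datasource type names; adjust them to the types used in your setup.
PROMETHEUS_COMPATIBLE = {"prometheus", "thanos", "mimir"}


def alert_owner(datasource_types: set) -> str:
    """Return which system should own a rule, given the datasource types it queries."""
    if not datasource_types:
        raise ValueError("a rule must query at least one datasource")
    if datasource_types.issubset(PROMETHEUS_COMPATIBLE):
        # Pure PromQL rule: Prometheus (or a ruler) evaluates, Alertmanager routes.
        return "alertmanager"
    # Logs, SQL, or any rule that mixes datasources.
    return "grafana-alerting"


if __name__ == "__main__":
    print(alert_owner({"prometheus"}))
    print(alert_owner({"prometheus", "loki"}))
```

&lt;p&gt;A rule that mixes Prometheus with any other datasource lands on the Grafana side, which matches the "cross-datasource correlation" row above.&lt;/p&gt;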



&lt;p&gt;Both systems can deliver to the same notification channels. The critical discipline is ensuring that silencing and maintenance windows are applied in both systems when needed. This is the primary operational cost of the hybrid approach.&lt;/p&gt;

&lt;h3&gt;
  
  
  Grafana as a Viewer for Alertmanager
&lt;/h3&gt;

&lt;p&gt;Even if you use Alertmanager exclusively for routing and notification, Grafana can serve as a read-only viewer. Grafana natively supports connecting to an external Alertmanager datasource, allowing you to see firing alerts, active silences, and alert groups in the Grafana UI. This gives you the operational visibility of Grafana without moving your alerting logic into it.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Grafana datasource provisioning for external Alertmanager&lt;/span&gt;
&lt;span class="na"&gt;apiVersion&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;1&lt;/span&gt;
&lt;span class="na"&gt;datasources&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Alertmanager&lt;/span&gt;
    &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;alertmanager&lt;/span&gt;
    &lt;span class="na"&gt;url&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;http://alertmanager.monitoring.svc:9093&lt;/span&gt;
    &lt;span class="na"&gt;access&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;proxy&lt;/span&gt;
    &lt;span class="na"&gt;jsonData&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;implementation&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;prometheus&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Migration Considerations
&lt;/h2&gt;

&lt;p&gt;If you are moving from one system to the other, here are the practical considerations to plan for.&lt;/p&gt;

&lt;h3&gt;
  
  
  Migrating from Alertmanager to Grafana Alerting
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Rule conversion.&lt;/strong&gt; Your PromQL-based recording and alerting rules defined in Prometheus rule files need to be recreated as Grafana alert rules. Grafana provides a migration tool that can import Prometheus-format rules, but complex expressions may need manual adjustment.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Routing tree translation.&lt;/strong&gt; Alertmanager's hierarchical routing tree maps to Grafana's notification policies, but the semantics are not identical. Test the notification routing thoroughly — the &lt;code&gt;continue&lt;/code&gt; flag behavior and default routes may differ.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Silence and inhibition migration.&lt;/strong&gt; Active silences are ephemeral and do not need migration. Inhibition rules need to be recreated in Grafana's format. Recurring maintenance windows should be converted to mute timings.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Run in parallel first.&lt;/strong&gt; The safest migration strategy is to run both systems in parallel for two to four weeks, sending notifications from both, then cutting over when you have confidence in the Grafana setup. Accept the temporary noise of duplicate alerts — it is far cheaper than missing a critical page during migration.&lt;/li&gt;
&lt;/ul&gt;
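
&lt;p&gt;When testing the translated routing, it helps to know which receivers the existing Alertmanager tree would pick for a given label set. The sketch below mirrors Alertmanager's depth-first matching and the &lt;code&gt;continue&lt;/code&gt; flag in plain Python. It is a simplification under stated assumptions: only equality matchers are handled (no regex, no grouping or timing), and the dictionary layout and receiver names are made up for illustration.&lt;/p&gt;

```python
def matching_receivers(route: dict, labels: dict, inherited: str = "") -> list:
    """Return the receivers a label set reaches, for equality matchers only."""
    matchers = route.get("match", {})
    if any(labels.get(key) != value for key, value in matchers.items()):
        return []
    # A child without its own receiver inherits the parent's.
    receiver = route.get("receiver", inherited)
    found = []
    for child in route.get("routes", []):
        hits = matching_receivers(child, labels, receiver)
        found.extend(hits)
        if hits and not child.get("continue", False):
            break  # first matching child wins unless it sets continue: true
    # If no child matched, the alert is handled by this node itself.
    return found or [receiver]


# Hypothetical routing tree used only for illustration.
TREE = {
    "receiver": "default",
    "routes": [
        {"match": {"severity": "critical"}, "receiver": "pagerduty", "continue": True},
        {"match": {"team": "payments"}, "receiver": "payments-slack"},
    ],
}

if __name__ == "__main__":
    print(matching_receivers(TREE, {"severity": "critical", "team": "payments"}))
```

&lt;p&gt;Recording the expected receivers for a handful of representative label sets gives you a checklist to compare against what Grafana's notification policies actually deliver during the parallel run.&lt;/p&gt;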

&lt;h3&gt;
  
  
  Migrating from Grafana Alerting to Alertmanager
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Datasource limitation.&lt;/strong&gt; You can only migrate alerts that are based on Prometheus-compatible datasources. Alerts querying Loki, Elasticsearch, or SQL datasources have no equivalent in Alertmanager — you will need an alternative solution for those.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Rule export.&lt;/strong&gt; Export Grafana alert rules and convert them to Prometheus-format rule files. The Grafana API (&lt;code&gt;GET /api/v1/provisioning/alert-rules&lt;/code&gt;) provides structured output that can be transformed with a script.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Contact point mapping.&lt;/strong&gt; Map Grafana contact points to Alertmanager receivers. The configuration format is different, but the concepts are equivalent.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;State loss.&lt;/strong&gt; Alertmanager does not carry over Grafana's alert evaluation history. You start fresh. Plan for a brief period where alerts may re-fire as Prometheus evaluates rules that were previously managed by Grafana.&lt;/li&gt;
&lt;/ul&gt;
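
&lt;p&gt;The rule export step could be scripted roughly as follows. This is a hedged sketch, not a complete converter: it assumes each exported rule is a JSON object with &lt;code&gt;title&lt;/code&gt;, &lt;code&gt;for&lt;/code&gt;, &lt;code&gt;labels&lt;/code&gt;, &lt;code&gt;annotations&lt;/code&gt;, and a &lt;code&gt;data&lt;/code&gt; list whose Prometheus query carries &lt;code&gt;model.expr&lt;/code&gt;. The function names are hypothetical, and thresholds that live in Grafana expression nodes are not translated. The output is JSON, which YAML parsers generally accept.&lt;/p&gt;

```python
import json


def to_prometheus_rule(grafana_rule: dict) -> dict:
    """Convert one exported Grafana rule into a Prometheus alerting-rule mapping."""
    # Assumption: the first query that carries model.expr holds the PromQL expression.
    exprs = [
        query["model"]["expr"]
        for query in grafana_rule.get("data", [])
        if "expr" in query.get("model", {})
    ]
    if not exprs:
        raise ValueError("no PromQL query found in rule " + repr(grafana_rule.get("title")))
    return {
        "alert": grafana_rule["title"],
        "expr": exprs[0],  # thresholds from Grafana expression nodes must be merged by hand
        "for": grafana_rule.get("for", "0s"),
        "labels": grafana_rule.get("labels", {}),
        "annotations": grafana_rule.get("annotations", {}),
    }


def to_rule_file(group_name: str, grafana_rules: list) -> str:
    """Render a Prometheus-style rule group as JSON text."""
    rules = [to_prometheus_rule(rule) for rule in grafana_rules]
    return json.dumps({"groups": [{"name": group_name, "rules": rules}]}, indent=2)


if __name__ == "__main__":
    sample = {
        "title": "HighErrorRate",
        "for": "5m",
        "labels": {"severity": "critical"},
        "data": [{"refId": "A", "model": {"expr": "rate(http_errors_total[5m])"}}],
    }
    print(to_rule_file("migrated", [sample]))
```

&lt;p&gt;Rules with no PromQL query raise an error instead of being silently dropped, which makes the datasource limitation above visible during the migration.&lt;/p&gt;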

&lt;h2&gt;
  
  
  Decision Framework
&lt;/h2&gt;

&lt;p&gt;If you want a quick decision path, use this framework:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Start here:
│
├── Do you alert on non-Prometheus datasources (Loki, ES, SQL, CloudWatch)?
│   ├── YES → Grafana Alerting (at least for those datasources)
│   └── NO ↓
│
├── Is GitOps/declarative config a hard requirement?
│   ├── YES → Alertmanager
│   └── NO ↓
│
├── Do you need multi-tenancy with RBAC?
│   ├── YES → Grafana Alerting (or Mimir Alertmanager)
│   └── NO ↓
│
├── Are you on Grafana Cloud?
│   ├── YES → Grafana Alerting (path of least resistance)
│   └── NO ↓
│
└── Default → Alertmanager (simpler, lighter, well-proven)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
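
&lt;p&gt;The same decision path can be written as a short function, which is handy if you want to record the inputs behind a decision in an architecture document. This is only a sketch of the tree above; the function and parameter names are illustrative.&lt;/p&gt;

```python
def choose_alerting(non_prometheus_sources: bool, gitops_required: bool,
                    needs_multi_tenancy: bool, on_grafana_cloud: bool) -> str:
    """Walk the decision tree top to bottom and return the first matching answer."""
    if non_prometheus_sources:
        return "Grafana Alerting (at least for those datasources)"
    if gitops_required:
        return "Alertmanager"
    if needs_multi_tenancy:
        return "Grafana Alerting (or Mimir Alertmanager)"
    if on_grafana_cloud:
        return "Grafana Alerting"
    return "Alertmanager"


if __name__ == "__main__":
    print(choose_alerting(False, True, True, True))
```

&lt;p&gt;Because the questions are evaluated in order, a hard GitOps requirement wins over multi-tenancy and Grafana Cloud, exactly as in the diagram.&lt;/p&gt;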



&lt;p&gt;For many teams, the honest answer is "both" — Alertmanager for the Prometheus-native metric pipeline, Grafana Alerting for everything else. That is a valid architecture as long as the ownership boundaries are documented and the on-call team knows where to look.&lt;/p&gt;

&lt;h2&gt;
  
  
  Frequently Asked Questions
&lt;/h2&gt;

&lt;h3&gt;
  
  
  What is the difference between Alertmanager and Grafana Alerting?
&lt;/h3&gt;

&lt;p&gt;Prometheus Alertmanager is a standalone notification routing engine that receives pre-evaluated alerts from Prometheus and delivers them to receivers like Slack, PagerDuty, or email. Grafana Alerting is an integrated alerting platform embedded in Grafana that both evaluates alert rules and handles notification routing. Alertmanager is configured entirely via YAML, while Grafana Alerting offers a UI, API, and file-based provisioning. The fundamental difference is scope: Alertmanager handles only the routing and notification phase, while Grafana Alerting handles the full lifecycle from query evaluation to notification.&lt;/p&gt;

&lt;h3&gt;
  
  
  Can Grafana Alerting replace Prometheus Alertmanager?
&lt;/h3&gt;

&lt;p&gt;Yes, for many use cases. Grafana Alerting can evaluate PromQL rules directly against your Prometheus datasource, so you do not strictly need a separate Alertmanager instance. However, there are scenarios where Alertmanager remains the better choice: heavily GitOps-driven environments, teams that need Alertmanager's mature inhibition rules, or architectures where Prometheus rule evaluation happens externally (Thanos Ruler, Mimir Ruler) and a dedicated Alertmanager is already in the pipeline. If your only datasource is Prometheus and you value declarative configuration, Alertmanager is still simpler and lighter.&lt;/p&gt;

&lt;h3&gt;
  
  
  Is Grafana Alertmanager the same as Prometheus Alertmanager?
&lt;/h3&gt;

&lt;p&gt;Not exactly. Grafana Alerting uses a fork of the Prometheus Alertmanager code internally for its notification routing engine, but it is not the same product. The Grafana "Alertmanager" visible in the UI is a managed, embedded component with a different configuration interface (notification policies, contact points, mute timings) compared to the standalone Prometheus Alertmanager (routing tree, receivers, inhibition rules in YAML). Grafana can also connect to an external Prometheus Alertmanager as a datasource, which adds to the confusion.&lt;/p&gt;

&lt;h3&gt;
  
  
  What are the best alternatives to Prometheus Alertmanager?
&lt;/h3&gt;

&lt;p&gt;The most direct alternative is Grafana Alerting, which can receive and route Prometheus alerts while also supporting other datasources. Beyond that: &lt;strong&gt;Grafana OnCall&lt;/strong&gt; for on-call management and escalation, &lt;strong&gt;PagerDuty&lt;/strong&gt; or &lt;strong&gt;Opsgenie&lt;/strong&gt; as managed incident response platforms, &lt;strong&gt;Keep&lt;/strong&gt; as an open-source AIOps alert management platform, and &lt;strong&gt;Mimir Alertmanager&lt;/strong&gt; for multi-tenant environments running Grafana Mimir.&lt;/p&gt;

&lt;h3&gt;
  
  
  Should I use Prometheus alerts or Grafana alerts for Kubernetes monitoring?
&lt;/h3&gt;

&lt;p&gt;For Kubernetes monitoring specifically, the kube-prometheus-stack (which includes Prometheus, Alertmanager, and a comprehensive set of pre-built alerting rules) remains the industry standard. These rules are PromQL-based and are designed to work with Alertmanager. If you are deploying kube-prometheus-stack, using Alertmanager for metric-based alerts is the straightforward choice. Add Grafana Alerting on top if you also need to alert on logs (via Loki) or non-metric datasources.&lt;/p&gt;

&lt;h2&gt;
  
  
  Final Thoughts
&lt;/h2&gt;

&lt;p&gt;The Alertmanager vs Grafana Alerting debate is not really about which tool is better — it is about which tool fits your operational context. Alertmanager is simpler, lighter, and more GitOps-friendly. Grafana Alerting is more versatile, more accessible to UI-oriented teams, and the only option if you need multi-datasource alerting. Running both is perfectly valid when the boundaries are clear.&lt;/p&gt;

&lt;p&gt;The worst outcome is not picking the "wrong" tool. The worst outcome is running both accidentally, with overlapping coverage, duplicated notifications, and no clear ownership. Whatever you choose, document the decision, define the ownership boundaries, and make sure your on-call team knows exactly where to go when they need to silence an alert at 3 AM.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Originally published at &lt;a href="https://alexandre-vazquez.com/alertmanager-vs-grafana-alerting/" rel="noopener noreferrer"&gt;alexandre-vazquez.com/alertmanager-vs-grafana-alerting&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

</description>
      <category>prometheus</category>
      <category>grafana</category>
      <category>devops</category>
      <category>monitoring</category>
    </item>
    <item>
      <title>Prometheus Alertmanager Vs Grafana Alerting (2026): Architecture, Features, And When To Use Each</title>
      <dc:creator>Alexandre Vazquez</dc:creator>
      <pubDate>Tue, 05 May 2026 10:00:00 +0000</pubDate>
      <link>https://dev.to/alexandrev/prometheus-alertmanager-vs-grafana-alerting-2026-architecture-features-and-when-to-use-each-18pi</link>
      <guid>https://dev.to/alexandrev/prometheus-alertmanager-vs-grafana-alerting-2026-architecture-features-and-when-to-use-each-18pi</guid>
      <description>&lt;h1&gt;
  
  
  Prometheus Alertmanager Vs Grafana Alerting (2026): Architecture, Features, And When To Use Each
&lt;/h1&gt;

&lt;p&gt;&lt;em&gt;Originally published at &lt;a href="https://alexandre-vazquez.com/alertmanager-vs-grafana-alerting/" rel="noopener noreferrer"&gt;alexandre-vazquez.com&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Read the full article on my blog: &lt;a href="https://alexandre-vazquez.com/alertmanager-vs-grafana-alerting/" rel="noopener noreferrer"&gt;https://alexandre-vazquez.com/alertmanager-vs-grafana-alerting/&lt;/a&gt;&lt;/p&gt;

</description>
      <category>kubernetes</category>
      <category>devops</category>
      <category>cloudnative</category>
    </item>
    <item>
      <title>Debugging Distroless Containers: kubectl debug, Ephemeral Containers, and When to Use Each</title>
      <dc:creator>Alexandre Vazquez</dc:creator>
      <pubDate>Tue, 05 May 2026 08:00:01 +0000</pubDate>
      <link>https://dev.to/alexandrev/debugging-distroless-containers-kubectl-debug-ephemeral-containers-and-when-to-use-each-5enb</link>
      <guid>https://dev.to/alexandrev/debugging-distroless-containers-kubectl-debug-ephemeral-containers-and-when-to-use-each-5enb</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;Originally published at &lt;a href="https://alexandre-vazquez.com/debugging-distroless-containers/" rel="noopener noreferrer"&gt;alexandre-vazquez.com/debugging-distroless-containers/&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;The container works fine in CI. It deploys successfully to staging. Then something goes wrong in production and you type the command you always type: &lt;code&gt;kubectl exec -it my-pod -- /bin/bash&lt;/code&gt;. The response is immediate: &lt;code&gt;OCI runtime exec failed: exec failed: unable to start container process: exec: "/bin/bash": stat /bin/bash: no such file or directory&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;You try &lt;code&gt;/bin/sh&lt;/code&gt;. Same error. You try &lt;code&gt;ls&lt;/code&gt;. Same error. The container image is distroless — it ships only your application binary and its runtime dependencies, with no shell, no package manager, no debugging tools of any kind. This is intentional and correct from a security standpoint. It is also a significant operational challenge the first time you face it in production.&lt;/p&gt;

&lt;p&gt;This article covers every practical technique for debugging distroless containers in Kubernetes: &lt;strong&gt;kubectl debug with ephemeral containers&lt;/strong&gt; (the standard approach), &lt;strong&gt;pod copy strategy&lt;/strong&gt; (for Kubernetes versions without ephemeral container support, or when you need a modified pod spec), &lt;strong&gt;debug image variants&lt;/strong&gt; (the pragmatic developer shortcut), &lt;strong&gt;cdebug&lt;/strong&gt; (a purpose-built tool that simplifies the process), and &lt;strong&gt;node-level debugging&lt;/strong&gt; (the last resort with the most power). For each technique I will explain what it can and cannot do, which Kubernetes version or RBAC permissions it requires, and which scenario it suits best: a developer working locally, a platform engineer in staging, or an operator in production.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why Distroless Breaks the Normal Debugging Workflow
&lt;/h2&gt;

&lt;p&gt;Traditional container debugging assumes you can exec into the container and use shell tools: &lt;code&gt;ps&lt;/code&gt;, &lt;code&gt;netstat&lt;/code&gt;, &lt;code&gt;strace&lt;/code&gt;, &lt;code&gt;curl&lt;/code&gt;, a text editor. Distroless images remove all of this by design. The Google distroless project, Chainguard's Wolfi-based images, and the broader minimal image ecosystem deliberately exclude everything that is not required to run the application. The result is a dramatically smaller attack surface: no shell means no RCE via shell injection, no package manager means no easy escalation path, fewer binaries means fewer CVEs in the image scan.&lt;/p&gt;

&lt;p&gt;The tradeoff is operational: when something goes wrong, you cannot use the tools that the process itself is not allowed to run. A Java application in &lt;code&gt;gcr.io/distroless/java17-debian12&lt;/code&gt; has the JRE and nothing else. A Go binary compiled with CGO disabled and shipped in &lt;code&gt;gcr.io/distroless/static-debian12&lt;/code&gt; has literally only the binary and the necessary CA certificates and timezone data. There is no &lt;code&gt;wget&lt;/code&gt; to download a debug binary, no &lt;code&gt;apt&lt;/code&gt; to install one, no &lt;code&gt;bash&lt;/code&gt; to run a script.&lt;/p&gt;

&lt;p&gt;Kubernetes solves this at the platform level with &lt;strong&gt;ephemeral containers&lt;/strong&gt;, which became stable in Kubernetes 1.25. The principle is that a debug container, which can carry a full shell and any tools you want, is injected into a running pod and shares its network namespace (and, with &lt;code&gt;--target&lt;/code&gt;, the process namespace of a chosen container) without modifying the original container or restarting the pod.&lt;/p&gt;

&lt;h2&gt;
  
  
  Option 1: kubectl debug with Ephemeral Containers
&lt;/h2&gt;

&lt;p&gt;Ephemeral containers are the canonical solution. Since Kubernetes 1.25 (stable), &lt;code&gt;kubectl debug&lt;/code&gt; can inject a temporary container into a running pod. The container shares the target pod's network namespace by default, and with &lt;code&gt;--target&lt;/code&gt; it can also share the process namespace of a specific container, allowing you to inspect its running processes and open file descriptors.&lt;/p&gt;

&lt;p&gt;The basic invocation is:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl debug -it my-pod \
  --image=busybox:latest \
  --target=my-container
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;The &lt;code&gt;--target&lt;/code&gt; flag is the critical piece. Without it, the ephemeral container gets its own process namespace. With it, the ephemeral container shares the process namespace of the specified container, so you can run &lt;code&gt;ps aux&lt;/code&gt; and see the application's processes, use &lt;code&gt;ls -la /proc/$PID/fd&lt;/code&gt; to inspect open file descriptors, and read the application's environment via &lt;code&gt;cat /proc/$PID/environ&lt;/code&gt;, where &lt;code&gt;$PID&lt;/code&gt; is the application's process ID.&lt;/p&gt;

&lt;p&gt;For a more capable debug environment, replace &lt;code&gt;busybox&lt;/code&gt; with a richer image:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl debug -it my-pod \
  --image=nicolaka/netshoot \
  --target=my-container
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;&lt;code&gt;nicolaka/netshoot&lt;/code&gt; includes &lt;code&gt;tcpdump&lt;/code&gt;, &lt;code&gt;curl&lt;/code&gt;, &lt;code&gt;dig&lt;/code&gt;, &lt;code&gt;nmap&lt;/code&gt;, &lt;code&gt;ss&lt;/code&gt;, &lt;code&gt;iperf3&lt;/code&gt;, and dozens of other network diagnostic tools, making it the standard choice for network debugging scenarios.&lt;/p&gt;

&lt;h3&gt;
  
  
  What You Can and Cannot Do
&lt;/h3&gt;

&lt;p&gt;Ephemeral containers share the pod's network namespace and, when &lt;code&gt;--target&lt;/code&gt; is used, the process namespace. This gives you:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Full visibility into the application's network traffic from inside the pod (tcpdump, ss, netstat)&lt;/li&gt;
&lt;li&gt;Process inspection via &lt;code&gt;/proc/$PID/&lt;/code&gt;: open files, memory maps, environment variables, CPU/memory usage&lt;/li&gt;
&lt;li&gt;Access to the pod's DNS resolution context — exactly the same &lt;code&gt;/etc/resolv.conf&lt;/code&gt; the application sees&lt;/li&gt;
&lt;li&gt;Ability to make outbound network calls from the same network namespace (testing service endpoints, DNS resolution)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;What you do &lt;em&gt;not&lt;/em&gt; get with ephemeral containers:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Direct access to the application container's filesystem.&lt;/strong&gt; The ephemeral container has its own root filesystem. You cannot &lt;code&gt;cat /app/config.yaml&lt;/code&gt; from the application container's filesystem unless you go through &lt;code&gt;/proc/$PID/root/&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Ability to remove the container once added.&lt;/strong&gt; Ephemeral containers are permanent until the pod is deleted. This is by design — the Kubernetes API does not allow removing them after creation.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Volume mount modifications via CLI.&lt;/strong&gt; You cannot add volume mounts to an ephemeral container via &lt;code&gt;kubectl debug&lt;/code&gt; (though the API spec supports it, the CLI does not expose this).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Resource limits.&lt;/strong&gt; Ephemeral containers do not support resource requests and limits in the &lt;code&gt;kubectl debug&lt;/code&gt; CLI, though this is evolving.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Accessing the Application Filesystem
&lt;/h3&gt;

&lt;p&gt;The most common surprise for developers new to ephemeral containers is that they cannot directly browse the application container's filesystem. The workaround is the &lt;code&gt;/proc&lt;/code&gt; filesystem:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# Find the application's PID
ps aux

# Browse its filesystem via /proc
ls /proc/1/root/app/
cat /proc/1/root/etc/config.yaml

# Or set the root to the application's root
chroot /proc/1/root /bin/sh  # only if /bin/sh exists in the app image
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;The &lt;code&gt;/proc/$PID/root&lt;/code&gt; path is a symlink to the root filesystem of the process with that PID. Because the ephemeral container shares the process namespace with &lt;code&gt;--target&lt;/code&gt;, the application's PID is typically 1, and &lt;code&gt;/proc/1/root&lt;/code&gt; gives you access to its filesystem, provided the debug container's user is allowed to read it.&lt;/p&gt;
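
&lt;p&gt;If your debug image ships Python, the &lt;code&gt;/proc&lt;/code&gt; lookups in this section can be wrapped in two small helpers. This is a minimal sketch with hypothetical function names: one maps a path inside the application container to its &lt;code&gt;/proc&lt;/code&gt; view, the other splits the NUL-separated content of an &lt;code&gt;environ&lt;/code&gt; file into a dictionary. It assumes the application runs as PID 1 in the shared process namespace.&lt;/p&gt;

```python
from pathlib import Path, PurePosixPath


def app_path(pid: int, path: str) -> PurePosixPath:
    """Map an absolute path inside the application container to its /proc view."""
    relative = PurePosixPath(path).relative_to("/")  # raises ValueError if not absolute
    return PurePosixPath("/proc") / str(pid) / "root" / relative


def parse_environ(raw: bytes) -> dict:
    """Split the NUL-separated KEY=VALUE entries of an environ dump."""
    env = {}
    for entry in raw.split(b"\0"):
        if b"=" in entry:
            key, _, value = entry.partition(b"=")
            env[key.decode(errors="replace")] = value.decode(errors="replace")
    return env


def read_app_environ(pid: int = 1) -> dict:
    """Read and parse the environment of a process visible in this namespace."""
    return parse_environ(Path(f"/proc/{pid}/environ").read_bytes())


if __name__ == "__main__":
    print(app_path(1, "/etc/config.yaml"))
    print(parse_environ(b"APP_ENV=prod\0DB_URL=postgres://db?sslmode=require\0"))
```

&lt;p&gt;Reading another process's &lt;code&gt;environ&lt;/code&gt; or &lt;code&gt;root&lt;/code&gt; can still fail with a permission error depending on the user and capabilities of the debug container.&lt;/p&gt;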

&lt;h3&gt;
  
  
  RBAC Requirements
&lt;/h3&gt;

&lt;p&gt;Ephemeral containers require the &lt;code&gt;pods/ephemeralcontainers&lt;/code&gt; subresource permission. This is separate from &lt;code&gt;pods/exec&lt;/code&gt;, which controls &lt;code&gt;kubectl exec&lt;/code&gt;. A common mistake is to grant &lt;code&gt;pods/exec&lt;/code&gt; for debugging purposes without realizing that ephemeral containers require an additional grant:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: ephemeral-debugger
rules:
- apiGroups: [""]
  resources: ["pods/ephemeralcontainers"]
  verbs: ["update", "patch"]
- apiGroups: [""]
  resources: ["pods/attach"]
  verbs: ["create", "get"]
- apiGroups: [""]
  resources: ["pods"]
  verbs: ["get", "list"]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;In production environments, this permission should be tightly scoped: granted through a namespaced &lt;code&gt;Role&lt;/code&gt; and &lt;code&gt;RoleBinding&lt;/code&gt; that is removed after the session rather than a permanent &lt;code&gt;ClusterRoleBinding&lt;/code&gt;, and ideally gated behind an approval workflow. RBAC has no built-in expiry, so the time limit has to come from your own process or tooling. The debug container runs as root by default, which can create privilege escalation paths if the application container runs as a non-root user and shares its process namespace: the debug container can attach to the application's processes with higher privileges.&lt;/p&gt;
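
&lt;p&gt;A namespace-scoped variant of the grant above could be generated per debugging session and deleted afterwards. The sketch below builds the two objects as plain dictionaries and prints them as a JSON &lt;code&gt;List&lt;/code&gt;, which &lt;code&gt;kubectl apply -f&lt;/code&gt; accepts. The function name, object names, and the example user are assumptions for illustration; the rules are copied from the ClusterRole shown earlier.&lt;/p&gt;

```python
import json

RBAC_GROUP = "rbac.authorization.k8s.io"


def debugger_grant(namespace: str, user: str, binding_name: str) -> dict:
    """Build a namespaced Role and RoleBinding for ephemeral-container debugging."""
    role = {
        "apiVersion": RBAC_GROUP + "/v1",
        "kind": "Role",
        "metadata": {"name": "ephemeral-debugger", "namespace": namespace},
        "rules": [
            {"apiGroups": [""], "resources": ["pods/ephemeralcontainers"],
             "verbs": ["update", "patch"]},
            {"apiGroups": [""], "resources": ["pods/attach"], "verbs": ["create", "get"]},
            {"apiGroups": [""], "resources": ["pods"], "verbs": ["get", "list"]},
        ],
    }
    binding = {
        "apiVersion": RBAC_GROUP + "/v1",
        "kind": "RoleBinding",
        "metadata": {"name": binding_name, "namespace": namespace},
        "subjects": [{"kind": "User", "name": user, "apiGroup": RBAC_GROUP}],
        "roleRef": {"kind": "Role", "name": "ephemeral-debugger", "apiGroup": RBAC_GROUP},
    }
    return {"apiVersion": "v1", "kind": "List", "items": [role, binding]}


if __name__ == "__main__":
    # Hypothetical namespace, user, and binding name.
    print(json.dumps(debugger_grant("staging", "jane@example.com", "debug-session-jane"), indent=2))
```

&lt;p&gt;Deleting the &lt;code&gt;RoleBinding&lt;/code&gt; when the session ends is what makes the grant temporary; the &lt;code&gt;Role&lt;/code&gt; itself can stay in place.&lt;/p&gt;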

&lt;h2&gt;
  
  
  Option 2: kubectl debug --copy-to (Pod Copy Strategy)
&lt;/h2&gt;

&lt;p&gt;When you need to modify the pod's container spec — replace the image, change environment variables, add a sidecar with a shared filesystem — the &lt;code&gt;--copy-to&lt;/code&gt; flag creates a full copy of the pod with your modifications applied:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
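&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# --set-image swaps the image of an existing container in the copy;
# --image on its own would add a new container instead
kubectl debug my-pod \
  --copy-to=my-pod-debug \
  --set-image=my-container=my-app:debug \
  --share-processes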
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;This creates a new pod named &lt;code&gt;my-pod-debug&lt;/code&gt; that is a copy of &lt;code&gt;my-pod&lt;/code&gt; but with the container image replaced by &lt;code&gt;my-app:debug&lt;/code&gt;. If &lt;code&gt;my-app:debug&lt;/code&gt; is your application image built with debug tooling included (or a debug variant from your registry), this lets you interact with the exact same binary in the exact same configuration as the original pod.&lt;/p&gt;

&lt;p&gt;A more common use of &lt;code&gt;--copy-to&lt;/code&gt; is to attach a debug container alongside the existing application container while keeping the original image unchanged:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl debug my-pod \
  -it \
  --copy-to=my-pod-debug \
  --image=busybox \
  --share-processes \
  --container=debugger
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;This creates the copy-pod with both the original containers and a new &lt;code&gt;debugger&lt;/code&gt; container sharing the process namespace. Unlike ephemeral containers, this approach supports volume mounts and resource limits, and the debug pod can be deleted cleanly when you are done.&lt;/p&gt;

&lt;h3&gt;
  
  
  Limitations of the Copy Strategy
&lt;/h3&gt;

&lt;p&gt;The pod copy approach has a critical limitation: &lt;strong&gt;it is not debugging the original pod&lt;/strong&gt;. It creates a new pod that may behave differently because:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;It does not share the original pod's &lt;strong&gt;in-memory state&lt;/strong&gt; — if the issue is a goroutine leak or heap corruption that has been accumulating for hours, the fresh copy will not exhibit it immediately&lt;/li&gt;
&lt;li&gt;It creates a new Pod UID, which means any admission webhooks, network policies, or pod-level security contexts that depend on pod identity may apply differently&lt;/li&gt;
&lt;li&gt;If the original pod is crashing (&lt;code&gt;CrashLoopBackOff&lt;/code&gt;), the copy will also crash — this technique does not help for crash debugging unless you also change the entrypoint&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;For crash debugging specifically, combine &lt;code&gt;--copy-to&lt;/code&gt; with a modified entrypoint to keep the container alive:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl debug my-crashing-pod \
  -it \
  --copy-to=my-pod-debug \
  --image=busybox \
  --share-processes \
  -- sleep 3600
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
&lt;h2&gt;
  
  
  Option 3: Debug Image Variants
&lt;/h2&gt;

&lt;p&gt;The most pragmatic approach — and the one most appropriate for developer workflows — is to maintain a debug variant of your application image that includes shell tooling. Both the Google distroless project and Chainguard provide this pattern officially.&lt;/p&gt;

&lt;p&gt;Google distroless images have a &lt;code&gt;:debug&lt;/code&gt; tag that adds BusyBox to the image:&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# Production image
FROM gcr.io/distroless/java17-debian12

# Debug variant — identical but with BusyBox shell
FROM gcr.io/distroless/java17-debian12:debug
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Chainguard images follow a similar convention with &lt;code&gt;:latest-dev&lt;/code&gt; variants that include apk, a shell, and common utilities:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# Production (zero shell, minimal footprint)
FROM cgr.dev/chainguard/go:latest

# Development/debug variant
FROM cgr.dev/chainguard/go:latest-dev
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;If you build your own base images, the recommended approach is to use multi-stage builds and maintain separate build targets:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;FROM golang:1.22 AS builder
WORKDIR /app
COPY . .
# CGO disabled so the binary is statically linked and runs on distroless/static
RUN CGO_ENABLED=0 go build -o myapp .

# Production: static distroless image
FROM gcr.io/distroless/static-debian12 AS production
COPY --from=builder /app/myapp /myapp
ENTRYPOINT ["/myapp"]

# Debug variant: same binary, with shell tools
FROM gcr.io/distroless/static-debian12:debug AS debug
COPY --from=builder /app/myapp /myapp
ENTRYPOINT ["/myapp"]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;In your CI/CD pipeline, build both targets and push &lt;code&gt;my-app:${VERSION}&lt;/code&gt; (production) and &lt;code&gt;my-app:${VERSION}-debug&lt;/code&gt; (debug variant) to your registry. The debug image is never deployed to production by default, but it exists and is ready to be used with &lt;code&gt;kubectl debug --copy-to&lt;/code&gt; when needed.&lt;/p&gt;
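
&lt;p&gt;The build-both-targets step might look like this in a pipeline script. It is a sketch under assumptions: the helper name is hypothetical, the &lt;code&gt;production&lt;/code&gt; and &lt;code&gt;debug&lt;/code&gt; target names come from the multi-stage Dockerfile above, and the function only assembles the &lt;code&gt;docker build&lt;/code&gt; and &lt;code&gt;docker push&lt;/code&gt; argument lists so that the caller decides how to run them.&lt;/p&gt;

```python
def build_commands(image: str, version: str) -> list:
    """Return docker build and push argument lists for the production and debug targets."""
    tags = {
        "production": f"{image}:{version}",
        "debug": f"{image}:{version}-debug",
    }
    commands = []
    for target, tag in tags.items():
        # --target selects the named stage of the multi-stage Dockerfile.
        commands.append(["docker", "build", "--target", target, "-t", tag, "."])
        commands.append(["docker", "push", tag])
    return commands


if __name__ == "__main__":
    for command in build_commands("my-app", "1.4.2"):
        print(" ".join(command))
```

&lt;p&gt;Passing each list to &lt;code&gt;subprocess.run(command, check=True)&lt;/code&gt; would execute the steps and stop the pipeline on the first failure.&lt;/p&gt;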

&lt;h3&gt;
  
  
  Security Considerations for Debug Variants
&lt;/h3&gt;

&lt;p&gt;Debug image variants defeat much of the security benefit of distroless if they are used in production, even temporarily. Track usage carefully: log when debug images are deployed, require explicit approval, and ensure they are removed after the debugging session. In regulated environments, consider whether deploying a debug variant to production namespaces is permitted by your security policy — in many cases it is not, and you must use ephemeral containers (which add a debug process to the pod without modifying the application image) instead.&lt;/p&gt;

&lt;h2&gt;
  
  
  Option 4: cdebug
&lt;/h2&gt;

&lt;p&gt;&lt;code&gt;cdebug&lt;/code&gt; is an open-source CLI tool that simplifies distroless debugging by wrapping &lt;code&gt;kubectl debug&lt;/code&gt; with more ergonomic defaults and additional capabilities. Its primary value is in making ephemeral container debugging feel like a native shell experience:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# Install
brew install cdebug
# or: go install github.com/iximiuz/cdebug@latest

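# Debug a running pod (Kubernetes targets use a pod/ prefix; a bare name
# is treated as a Docker container. Check cdebug exec --help for your version)
cdebug exec -it pod/my-pod

# Specify a namespace and container
cdebug exec -it --namespace=production pod/my-pod/my-container

# Use a specific debug image
cdebug exec -it --image=nicolaka/netshoot pod/my-pod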
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;What &lt;code&gt;cdebug&lt;/code&gt; adds over raw &lt;code&gt;kubectl debug&lt;/code&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Automatic filesystem chroot.&lt;/strong&gt; &lt;code&gt;cdebug exec&lt;/code&gt; automatically sets the filesystem root of the debug container to the target container's filesystem, so you browse &lt;code&gt;/&lt;/code&gt; and see the application's files — not the debug image's files. This addresses the most common friction point with &lt;code&gt;kubectl debug&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Docker integration.&lt;/strong&gt; &lt;code&gt;cdebug exec&lt;/code&gt; works the same way for Docker containers (&lt;code&gt;cdebug exec -it $CONTAINER&lt;/code&gt;), so local and cluster debugging share the same muscle memory.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;No RBAC complications&lt;/strong&gt; for Docker-based local development — useful for developer workflows before the code reaches Kubernetes.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The tradeoff: &lt;code&gt;cdebug&lt;/code&gt; is a third-party dependency and requires installation. In environments with strict tooling policies (regulated industries, air-gapped clusters), it may not be an option. In those cases, the raw &lt;code&gt;kubectl debug&lt;/code&gt; workflow with &lt;code&gt;/proc/1/root&lt;/code&gt; filesystem navigation is the baseline.&lt;/p&gt;

&lt;h2&gt;
  
  
  Option 5: Node-Level Debugging
&lt;/h2&gt;

&lt;p&gt;When everything else fails — the pod is in &lt;code&gt;CrashLoopBackOff&lt;/code&gt; too fast to attach to, the issue is a kernel-level problem, or you need tools like &lt;code&gt;strace&lt;/code&gt; that require elevated privileges — node-level debugging gives you direct access to the container's processes from the host node.&lt;/p&gt;

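&lt;p&gt;&lt;code&gt;kubectl debug node/my-node-name&lt;/code&gt; creates a debug pod on the target node that runs in the host's PID, network, and IPC namespaces and mounts the node's root filesystem under &lt;code&gt;/host&lt;/code&gt;. Depending on your kubectl version, the container may not be fully privileged by default; recent versions accept &lt;code&gt;--profile=sysadmin&lt;/code&gt; to request that. The basic invocation:&lt;/p&gt;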

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl debug node/my-node-name \
  -it \
  --image=nicolaka/netshoot
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;From this privileged pod, you can use &lt;code&gt;nsenter&lt;/code&gt; to enter the namespaces of any container running on the node:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
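&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# Find the container's ID and PID on the node
# (from within the node debug pod; chroot /host to use the node's crictl)
chroot /host crictl ps | grep my-container
chroot /host crictl inspect "$CONTAINER_ID" | grep pid

# Enter the container's namespaces
# (-m is omitted: a distroless mount namespace has no /bin/sh to execute)
nsenter -t "$PID" -u -i -n -p -- /bin/sh

# Or just the network namespace (for network debugging)
nsenter -t "$PID" -n -- ip a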
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
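
&lt;p&gt;Grepping the JSON output for the PID works, but it is awkward to script. A minimal sketch of a helper that prints only the PID, assuming a runtime that reports it under &lt;code&gt;info.pid&lt;/code&gt; (as containerd does) and a node that has &lt;code&gt;crictl&lt;/code&gt; installed. The function name &lt;code&gt;container_pid&lt;/code&gt; is hypothetical:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# Hypothetical helper: prints the host PID of a container, given its ID.
# Assumes it runs inside the node debug pod, with the node's crictl under /host.
container_pid() {
  chroot /host crictl inspect --output go-template --template '{{.info.pid}}' "$1"
}

# Usage (CONTAINER_ID taken from the crictl ps output):
#   PID="$(container_pid "$CONTAINER_ID")"
#   nsenter -t "$PID" -n -- ip a
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;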

&lt;p&gt;The &lt;code&gt;nsenter&lt;/code&gt; approach lets you run tools from the node's or debug container's toolset while operating in the namespaces of the target container. This is how you run &lt;code&gt;strace&lt;/code&gt; against a distroless process: &lt;code&gt;strace&lt;/code&gt; is not in the application container, but you can run it from the node level while targeting the application's PID.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# Trace network-related syscalls from the application process and its children
nsenter -t "$PID" -- strace -p "$PID" -f -e trace=network
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
&lt;h3&gt;
  
  
  RBAC and Security for Node Debugging
&lt;/h3&gt;

&lt;p&gt;Node-level debugging requires &lt;code&gt;nodes/proxy&lt;/code&gt; and the ability to create privileged pods, which in most production clusters is restricted to cluster administrators. The debug pod runs with &lt;code&gt;hostPID: true&lt;/code&gt; and &lt;code&gt;hostNetwork: true&lt;/code&gt;, giving it visibility into all processes and network traffic on the node — not just the target container. This is significant: every process running on the node, including those in other tenants' namespaces, is visible.&lt;/p&gt;

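&lt;p&gt;This technique should be treated as a break-glass procedure: log the access, require dual approval in production environments, and clean up immediately after the debugging session. The debug pod is not removed automatically when you exit; find it with &lt;code&gt;kubectl get pods&lt;/code&gt; (its name starts with &lt;code&gt;node-debugger-&lt;/code&gt;) and remove it with &lt;code&gt;kubectl delete pod&lt;/code&gt;.&lt;/p&gt;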
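
&lt;p&gt;A sketch of that cleanup step, assuming the leftover pods carry the &lt;code&gt;node-debugger-&lt;/code&gt; name prefix that &lt;code&gt;kubectl debug&lt;/code&gt; assigns to node debug pods and live in the namespace the command was run from. The function name is hypothetical:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# Hypothetical helper: deletes leftover node debug pods in one namespace.
# Argument: namespace (defaults to "default").
cleanup_node_debuggers() {
  ns="${1:-default}"
  kubectl get pods -n "$ns" -o name |
    grep '^pod/node-debugger-' |
    while read -r pod; do
      kubectl delete -n "$ns" "$pod"
    done
}

# Usage:
#   cleanup_node_debuggers default
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;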
&lt;h2&gt;
  
  
  Choosing the Right Approach: Access Profile and Environment Matrix
&lt;/h2&gt;

&lt;p&gt;The technique you should use depends on two axes: &lt;strong&gt;who you are&lt;/strong&gt; (developer, platform engineer, ops/SRE) and &lt;strong&gt;where the issue is&lt;/strong&gt; (local development, staging, production). The requirements and constraints differ significantly across these combinations.&lt;/p&gt;
&lt;h3&gt;
  
  
  Developer — Local or Development Cluster
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Goal:&lt;/strong&gt; Reproduce and understand a bug, inspect configuration, verify network connectivity to services.&lt;br&gt;&lt;br&gt;
&lt;strong&gt;Constraints:&lt;/strong&gt; None material — full cluster admin on local or personal dev namespace.&lt;br&gt;&lt;br&gt;
&lt;strong&gt;Recommended approach:&lt;/strong&gt; Debug image variants or cdebug.&lt;/p&gt;

&lt;p&gt;In local development (Minikube, Kind, Docker Desktop), the fastest path is to build the debug variant of your image and deploy it directly. If you are working with another team's service, &lt;code&gt;cdebug exec&lt;/code&gt; gives you a shell in the container with automatic filesystem root without any special RBAC. The goal is speed and iteration — reserve the more structured approaches for higher environments.&lt;/p&gt;
&lt;h3&gt;
  
  
  Developer — Staging Cluster
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Goal:&lt;/strong&gt; Debug integration issues, inspect live configuration, verify environment-specific behavior.&lt;br&gt;&lt;br&gt;
&lt;strong&gt;Constraints:&lt;/strong&gt; Shared cluster — cannot deploy arbitrary workloads to other teams' namespaces, but has &lt;code&gt;pods/ephemeralcontainers&lt;/code&gt; in own namespace.&lt;br&gt;&lt;br&gt;
&lt;strong&gt;Recommended approach:&lt;/strong&gt; kubectl debug with ephemeral containers (&lt;code&gt;--target&lt;/code&gt;), scoped to own namespace.&lt;/p&gt;

&lt;p&gt;Staging is where ephemeral containers earn their keep. You can attach to a running pod without restarting it, without modifying the deployment spec, and without affecting other users of the same cluster. Grant developers &lt;code&gt;pods/ephemeralcontainers&lt;/code&gt; in their team's namespaces and they can self-service debug without needing ops involvement.&lt;/p&gt;
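
&lt;p&gt;Before relying on self-service access, a developer can check whether the grant is in place. A minimal sketch using &lt;code&gt;kubectl auth can-i&lt;/code&gt; with its &lt;code&gt;--subresource&lt;/code&gt; flag; the helper name, namespace, and debug image are placeholders:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# Hypothetical helper: succeeds if the current user may add ephemeral
# containers to pods in the given namespace.
can_debug_in() {
  kubectl auth can-i patch pods --subresource=ephemeralcontainers -n "$1"
}

# Usage:
#   if can_debug_in team-namespace; then
#     kubectl debug -it my-pod -n team-namespace --image=busybox:1.36 --target=my-container
#   fi
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;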
&lt;h3&gt;
  
  
  Platform Engineer / SRE — Production
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Goal:&lt;/strong&gt; Diagnose a live production incident. The pod is behaving unexpectedly — high latency, memory growth, unexpected connections, incorrect responses.&lt;br&gt;&lt;br&gt;
&lt;strong&gt;Constraints:&lt;/strong&gt; Changes to running pods are high-risk. Any debug image deployment must be gated. The issue is live and affecting users.&lt;br&gt;&lt;br&gt;
&lt;strong&gt;Recommended approach:&lt;/strong&gt; kubectl debug with ephemeral containers (ephemeral containers do not restart the pod, do not modify the deployment, and are auditable via API audit logs).&lt;/p&gt;

&lt;p&gt;The key production requirements are auditability and minimal blast radius. Ephemeral containers satisfy both: they are recorded in the Kubernetes API audit log (who attached, when, to which pod), they do not modify the running application container, and they are limited to the pod's own network and process namespaces. Document the debug session in your incident ticket: pod name, time, what was observed, who ran the debug container.&lt;/p&gt;

&lt;p&gt;The &lt;code&gt;--copy-to&lt;/code&gt; strategy is generally inappropriate for production incident response: it creates a new pod that may or may not exhibit the issue, it adds load to the cluster during an incident, and if it is attached to the same services (databases, downstream APIs), it produces additional traffic that complicates forensics.&lt;/p&gt;
&lt;h3&gt;
  
  
  Platform Engineer — Production, Node-Level Issue
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Goal:&lt;/strong&gt; Diagnose a kernel-level issue, a container runtime problem, a networking issue that spans multiple pods, or a situation where the pod is crashing too fast to attach to.&lt;br&gt;&lt;br&gt;
&lt;strong&gt;Constraints:&lt;/strong&gt; Maximum privilege required. High operational risk.&lt;br&gt;&lt;br&gt;
&lt;strong&gt;Recommended approach:&lt;/strong&gt; Node-level debug pod with &lt;code&gt;nsenter&lt;/code&gt;. Treat as break-glass.&lt;/p&gt;

&lt;p&gt;For this scenario, create a dedicated RBAC role that grants &lt;code&gt;nodes/proxy&lt;/code&gt; access and the ability to create pods with &lt;code&gt;hostPID: true&lt;/code&gt; in a dedicated debug namespace. Bind it only to specific users through a time-limited binding, confirm the grant is active before the session (e.g., with &lt;code&gt;kubectl auth can-i&lt;/code&gt;), and log all access. This level of access should generate a PagerDuty-style alert so that the security team knows a privileged debug session is active in production.&lt;/p&gt;
&lt;h2&gt;
  
  
  Common Errors and Solutions
&lt;/h2&gt;
&lt;h3&gt;
  
  
  Error: "ephemeral containers are disabled for this cluster"
&lt;/h3&gt;

&lt;p&gt;Ephemeral containers require Kubernetes 1.16+ (alpha, behind feature gate) and are stable from 1.25. If you are on 1.16–1.22, you need to enable the &lt;code&gt;EphemeralContainers&lt;/code&gt; feature gate on the API server and kubelet. From 1.23 it was beta and enabled by default. From 1.25 it is stable and always on. On managed Kubernetes services (EKS, GKE, AKS), check the cluster version — versions older than 1.25 may still have it disabled depending on your configuration.&lt;/p&gt;
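
&lt;p&gt;The version ranges above can be encoded in a small check. This is a sketch only: the function name is hypothetical, and it classifies a version string without querying the cluster.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# Hypothetical helper: maps a Kubernetes version string (e.g. v1.27.3)
# to the ephemeral-container status for that release.
ephemeral_container_support() {
  version="${1#v}"            # drop a leading "v"
  major="${version%%.*}"
  rest="${version#*.}"
  minor="${rest%%.*}"
  minor="${minor%%[!0-9]*}"   # drop suffixes such as "+"
  if [ "$major" -ne 1 ]; then
    echo "unknown"
  elif [ "$minor" -ge 25 ]; then
    echo "stable"
  elif [ "$minor" -ge 23 ]; then
    echo "beta, enabled by default"
  elif [ "$minor" -ge 16 ]; then
    echo "alpha, feature gate required"
  else
    echo "unsupported"
  fi
}

# Usage (assumes jq is installed to read the server version):
#   ephemeral_container_support "$(kubectl version -o json | jq -r .serverVersion.gitVersion)"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;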
&lt;h3&gt;
  
  
  Error: "cannot update ephemeralcontainers" (RBAC)
&lt;/h3&gt;

&lt;p&gt;You have &lt;code&gt;pods/exec&lt;/code&gt; but not &lt;code&gt;pods/ephemeralcontainers&lt;/code&gt;. Add the grant shown in the RBAC section above. Note that &lt;code&gt;pods/exec&lt;/code&gt; and &lt;code&gt;pods/ephemeralcontainers&lt;/code&gt; are separate subresources — having one does not imply the other.&lt;/p&gt;
&lt;h3&gt;
  
  
  Error: "container not found" with --target
&lt;/h3&gt;

&lt;p&gt;The container name in &lt;code&gt;--target&lt;/code&gt; must match exactly the container name as defined in the Pod spec — not the image name. Check with &lt;code&gt;kubectl get pod my-pod -o jsonpath='{.spec.containers[*].name}'&lt;/code&gt; to get the exact container names.&lt;/p&gt;
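
&lt;p&gt;The same lookup can guard a script against a mistyped name. A minimal sketch, with a hypothetical helper name, that succeeds only on an exact match:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# Hypothetical helper: succeeds if the pod spec defines a container
# with exactly the given name. Arguments: pod name, container name.
has_container() {
  kubectl get pod "$1" -o jsonpath='{.spec.containers[*].name}' |
    tr ' ' '\n' |
    grep -qx "$2"
}

# Usage:
#   if has_container my-pod my-container; then
#     kubectl debug -it my-pod --image=busybox:1.36 --target=my-container
#   fi
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;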
&lt;h3&gt;
  
  
  Error: Can see processes but cannot read /proc/1/root
&lt;/h3&gt;

&lt;p&gt;The application container runs as a non-root user (e.g., UID 1000) and the ephemeral container runs as root. Files owned by UID 1000 may not be readable by other UIDs, depending on their permissions. Traversing &lt;code&gt;/proc/$PID/root&lt;/code&gt; for another container's process also requires the &lt;code&gt;CAP_SYS_PTRACE&lt;/code&gt; capability. If the namespace enforces the &lt;code&gt;restricted&lt;/code&gt; Pod Security Standard (PSS), the debug container cannot have this capability. &lt;code&gt;SYS_PTRACE&lt;/code&gt; is also outside the set of capabilities the &lt;code&gt;baseline&lt;/code&gt; level allows, so use a debug namespace whose policy permits it and add &lt;code&gt;SYS_PTRACE&lt;/code&gt; to the ephemeral container's &lt;code&gt;securityContext&lt;/code&gt; (for example, with &lt;code&gt;kubectl debug --profile=general&lt;/code&gt;).&lt;/p&gt;
&lt;h3&gt;
  
  
  Error: tcpdump shows no traffic
&lt;/h3&gt;

&lt;p&gt;When using &lt;code&gt;nicolaka/netshoot&lt;/code&gt; for network debugging, keep in mind that &lt;code&gt;--target&lt;/code&gt; only controls which container's process namespace you join. The network namespace is shared at the pod level regardless, so an ephemeral container sees the same interfaces with or without &lt;code&gt;--target&lt;/code&gt;, and you can omit it when all you need is a packet capture. Run &lt;code&gt;tcpdump -i any&lt;/code&gt; to capture on all interfaces including loopback, which is where inter-container traffic within a pod travels.&lt;/p&gt;
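
&lt;p&gt;A sketch of a capture wrapper along those lines. The function name is hypothetical, and it assumes the namespace's security policy lets the debug container keep the capabilities &lt;code&gt;tcpdump&lt;/code&gt; needs:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# Hypothetical helper: captures traffic on all interfaces of a pod.
# Arguments: pod name, then an optional tcpdump filter expression.
capture_pod_traffic() {
  pod="$1"
  shift
  kubectl debug -it "$pod" --image=nicolaka/netshoot -- tcpdump -i any -nn "$@"
}

# Usage:
#   capture_pod_traffic my-pod port 8080
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;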
&lt;h2&gt;
  
  
  Decision Framework
&lt;/h2&gt;

&lt;p&gt;Use this as a starting point to select the right technique for your situation:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Scenario&lt;/th&gt;
&lt;th&gt;Technique&lt;/th&gt;
&lt;th&gt;Requirement&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Active production incident, pod running&lt;/td&gt;
&lt;td&gt;kubectl debug + ephemeral container&lt;/td&gt;
&lt;td&gt;pods/ephemeralcontainers RBAC, k8s 1.25+&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Pod crashing too fast to attach&lt;/td&gt;
&lt;td&gt;kubectl debug --copy-to + modified entrypoint&lt;/td&gt;
&lt;td&gt;Ability to create pods in namespace&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Developer debugging in dev/staging&lt;/td&gt;
&lt;td&gt;cdebug exec or kubectl debug&lt;/td&gt;
&lt;td&gt;pods/ephemeralcontainers or pod create&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Need full filesystem access&lt;/td&gt;
&lt;td&gt;kubectl debug --copy-to + debug image variant&lt;/td&gt;
&lt;td&gt;Debug image in registry, pod create&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Need strace or kernel tracing&lt;/td&gt;
&lt;td&gt;Node-level debug with nsenter&lt;/td&gt;
&lt;td&gt;nodes/proxy, cluster admin equivalent&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Network packet capture&lt;/td&gt;
&lt;td&gt;kubectl debug + nicolaka/netshoot&lt;/td&gt;
&lt;td&gt;pods/ephemeralcontainers&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Local Docker debugging&lt;/td&gt;
&lt;td&gt;cdebug exec&lt;/td&gt;
&lt;td&gt;Docker socket access&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;CI-reproducible debug environment&lt;/td&gt;
&lt;td&gt;Debug image variant in separate build target&lt;/td&gt;
&lt;td&gt;Separate image tag in registry&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;
&lt;h2&gt;
  
  
  Production RBAC Design
&lt;/h2&gt;

&lt;p&gt;A clean RBAC design for production distroless debugging separates three roles with different privilege levels:&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# Tier 1: Developer self-service in team namespaces
# Allows attaching ephemeral containers, no node access
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: distroless-debugger
  namespace: team-namespace
rules:
- apiGroups: [""]
  resources: ["pods"]
  verbs: ["get", "list"]
- apiGroups: [""]
  resources: ["pods/ephemeralcontainers"]
  verbs: ["update", "patch"]
- apiGroups: [""]
  resources: ["pods/attach"]
  verbs: ["create", "get"]
---
# Tier 2: SRE production incident access
# Ephemeral containers across all namespaces
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: sre-distroless-debugger
rules:
- apiGroups: [""]
  resources: ["pods"]
  verbs: ["get", "list"]
- apiGroups: [""]
  resources: ["pods/ephemeralcontainers"]
  verbs: ["update", "patch"]
- apiGroups: [""]
  resources: ["pods/attach"]
  verbs: ["create", "get"]
---
# Tier 3: Break-glass node access
# Only for platform team, time-limited binding recommended
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: node-debugger
rules:
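- apiGroups: [""]
  resources: ["nodes"]
  verbs: ["get"]   # kubectl debug node/... looks up the node object first
- apiGroups: [""]
  resources: ["nodes/proxy"]
  verbs: ["get"]
- apiGroups: [""]
  resources: ["pods"]
  verbs: ["create", "get", "list", "delete"]
- apiGroups: [""]
  resources: ["pods/attach"]
  verbs: ["create", "get"]
  # Pod rules can be scoped to a debug namespace via RoleBinding.
  # Node rules are cluster-scoped and need a (time-limited) ClusterRoleBinding.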
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Bind Tier 1 permanently to your developers. Bind Tier 2 to SREs permanently but with audit alerts on use. Bind Tier 3 only on-demand (via a Kubernetes operator that creates time-limited RoleBindings) and never as a permanent ClusterRoleBinding.&lt;/p&gt;

&lt;h2&gt;
  
  
  Summary
&lt;/h2&gt;

&lt;p&gt;Distroless containers are the correct choice for production workloads. They reduce attack surface, eliminate unnecessary CVEs, and force a cleaner separation between application and tooling. The operational cost is that your traditional debugging workflow — exec into the container, run some commands — no longer works by default.&lt;/p&gt;

&lt;p&gt;Kubernetes provides a clean answer with ephemeral containers and &lt;code&gt;kubectl debug&lt;/code&gt;: inject a debug container with whatever tools you need into the running pod, sharing its network and process namespaces, without restarting or modifying the application. For scenarios where ephemeral containers are insufficient — filesystem access, crash debugging, kernel-level investigation — the copy strategy and node-level debug fill the remaining gaps.&lt;/p&gt;

&lt;p&gt;The key to making this work at scale is not the technique itself but the &lt;strong&gt;access model&lt;/strong&gt;: developers get self-service ephemeral container access in their own namespaces, SREs get cluster-wide ephemeral container access for production incidents, and node-level access is a break-glass procedure with an audit trail and time limits. With that model in place, distroless becomes an operational non-issue rather than an obstacle.&lt;/p&gt;

</description>
      <category>containers</category>
      <category>docker</category>
      <category>kubernetes</category>
    </item>
  </channel>
</rss>
