<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Ahmed Zidan</title>
    <description>The latest articles on DEV Community by Ahmed Zidan (@ahmedzidan).</description>
    <link>https://dev.to/ahmedzidan</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F190722%2Fa4d75a6f-03cf-4af9-9eaa-98b54ad6f9cd.jpg</url>
      <title>DEV Community: Ahmed Zidan</title>
      <link>https://dev.to/ahmedzidan</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/ahmedzidan"/>
    <language>en</language>
    <item>
      <title>OpenTelemetry Collector Contrib V0.145.0: 10 Features That Will Transform Your Observability Pipeline</title>
      <dc:creator>Ahmed Zidan</dc:creator>
      <pubDate>Sun, 08 Feb 2026 12:52:12 +0000</pubDate>
      <link>https://dev.to/aws-builders/opentelemetry-collector-contrib-v01450-10-features-that-will-transform-your-observability-15b8</link>
      <guid>https://dev.to/aws-builders/opentelemetry-collector-contrib-v01450-10-features-that-will-transform-your-observability-15b8</guid>
      <description>&lt;h1&gt;
  
  
  OpenTelemetry Collector Contrib: 10 Features That Will Transform Your Observability Pipeline
&lt;/h1&gt;

&lt;p&gt;The OpenTelemetry Collector Contrib project continues to evolve at a rapid pace, and the latest release is packed with features that address real-world observability challenges. Whether you're running workloads on GCP, managing Kubernetes clusters, or trying to tame your log volumes, this release has something for you.&lt;/p&gt;

&lt;p&gt;Let's dive into the 10 most impactful features and see how they can improve your observability stack.&lt;/p&gt;




&lt;h2&gt;
  
  
  1. Export Traces to Google Cloud Storage
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;What's New:&lt;/strong&gt; You can now export traces directly to Google Cloud Storage (GCS).&lt;/p&gt;

&lt;p&gt;This is huge for teams that need long-term trace retention without the cost of keeping everything in a real-time trace backend. Think of it as a "cold storage" tier for your traces.&lt;/p&gt;

&lt;h3&gt;
  
  
  Why This Matters
&lt;/h3&gt;

&lt;p&gt;Traditional trace backends like Jaeger or Tempo are optimized for real-time querying, but storing months of trace data gets expensive fast. With GCS export, you can:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Archive traces for compliance and auditing&lt;/li&gt;
&lt;li&gt;Reduce costs by moving older traces to cheaper storage&lt;/li&gt;
&lt;li&gt;Build custom analytics pipelines on historical trace data&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Example Configuration
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;exporters&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;googlecloudstorage&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;bucket&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;my-traces-bucket"&lt;/span&gt;
    &lt;span class="na"&gt;prefix&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;traces/"&lt;/span&gt;
    &lt;span class="na"&gt;compression&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;gzip&lt;/span&gt;

&lt;span class="na"&gt;service&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;pipelines&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;traces&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;exporters&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;googlecloudstorage&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Best Practice
&lt;/h3&gt;

&lt;p&gt;Use the GCS exporter alongside your primary trace backend. Send real-time traces to Jaeger/Tempo for immediate debugging, and batch export to GCS for long-term retention.&lt;/p&gt;
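&lt;p&gt;A sketch of that dual-export setup (the OTLP endpoint and bucket names below are illustrative, not from the release notes):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;exporters:
  otlp:
    endpoint: tempo.internal:4317    # real-time backend for debugging
  googlecloudstorage:
    bucket: "my-traces-bucket"       # cold storage for long-term retention
    prefix: "traces/"
    compression: gzip

service:
  pipelines:
    traces:
      exporters: [otlp, googlecloudstorage]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;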




&lt;h2&gt;
  
  
  2. Limit Maximum Trace Size in Tail Sampling
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;What's New:&lt;/strong&gt; The tail sampling processor now supports &lt;code&gt;maximum_trace_size_bytes&lt;/code&gt; to limit the memory footprint of individual traces. Traces exceeding this byte limit are immediately dropped—no sampling decision is made for them.&lt;/p&gt;

&lt;h3&gt;
  
  
  The Problem This Solves
&lt;/h3&gt;

&lt;p&gt;Tail sampling holds traces in memory while waiting for all spans to arrive before making a sampling decision. This is powerful, but it creates a vulnerability: occasionally, a single trace can grow to an enormous size (think: a batch job creating thousands of spans), causing spiky memory consumption that can crash your collector.&lt;/p&gt;

&lt;p&gt;The memory limiter processor doesn't fully solve this because it applies backpressure while traces are waiting for decisions, which can degrade sampling accuracy and overall throughput.&lt;/p&gt;

&lt;h3&gt;
  
  
  How It Works
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;processors&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;tail_sampling&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;decision_wait&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;10s&lt;/span&gt;
    &lt;span class="na"&gt;maximum_trace_size_bytes&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;5242880&lt;/span&gt;  &lt;span class="c1"&gt;# 5 MB per trace&lt;/span&gt;
    &lt;span class="na"&gt;policies&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;error-policy&lt;/span&gt;
        &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;status_code&lt;/span&gt;
        &lt;span class="na"&gt;status_code&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;{&lt;/span&gt;&lt;span class="nv"&gt;status_codes&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;ERROR&lt;/span&gt;&lt;span class="pi"&gt;]}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;When a trace's in-memory size exceeds &lt;code&gt;maximum_trace_size_bytes&lt;/code&gt;, it's immediately dropped without waiting for the &lt;code&gt;decision_wait&lt;/code&gt; period. No sampling decision is made—the trace is simply discarded to protect collector stability.&lt;/p&gt;

&lt;h3&gt;
  
  
  Real-World Scenario
&lt;/h3&gt;

&lt;p&gt;Consider a data processing pipeline that creates a span for each record processed. A batch of 100,000 records generates 100,000 spans in a single trace. Each span might be 500 bytes, resulting in a 50MB trace sitting in memory. Without limits, a few concurrent batches could exhaust your collector's memory.&lt;/p&gt;

&lt;p&gt;With &lt;code&gt;maximum_trace_size_bytes: 5242880&lt;/code&gt; (5MB), oversized traces are dropped early, protecting your collector while still sampling normal-sized traces correctly.&lt;/p&gt;

&lt;h3&gt;
  
  
  Best Practice
&lt;/h3&gt;

&lt;p&gt;Use this alongside the memory limiter processor for defense in depth:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;maximum_trace_size_bytes&lt;/code&gt; protects against individual large traces&lt;/li&gt;
&lt;li&gt;Memory limiter protects against overall memory pressure from many traces&lt;/li&gt;
&lt;/ul&gt;
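&lt;p&gt;A combined configuration might look like this sketch (the memory limiter thresholds are illustrative; tune them to your deployment). Note that &lt;code&gt;memory_limiter&lt;/code&gt; should come first in the processor chain:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;processors:
  memory_limiter:
    check_interval: 1s
    limit_mib: 1500          # hard ceiling for overall collector memory
    spike_limit_mib: 300
  tail_sampling:
    decision_wait: 10s
    maximum_trace_size_bytes: 5242880   # drop any single trace over 5 MB
    policies:
      - name: error-policy
        type: status_code
        status_code: {status_codes: [ERROR]}

service:
  pipelines:
    traces:
      processors: [memory_limiter, tail_sampling]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;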




&lt;h2&gt;
  
  
  3. Linux Hugepages Memory Monitoring
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;What's New:&lt;/strong&gt; Monitor hugepages usage on Linux hosts via the new &lt;code&gt;system.memory.linux.hugepages&lt;/code&gt; metrics.&lt;/p&gt;

&lt;h3&gt;
  
  
  What Are Hugepages?
&lt;/h3&gt;

&lt;p&gt;Standard memory pages on Linux are typically 4KB. Hugepages are much larger (commonly 2MB or 1GB), and they're critical for high-performance applications like databases, in-memory caches, and VMs.&lt;/p&gt;

&lt;h3&gt;
  
  
  Why Monitor Them?
&lt;/h3&gt;

&lt;p&gt;If your application expects hugepages but they're exhausted, performance tanks. Now you can track:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;system.memory.linux.hugepages.usage&lt;/code&gt; - Currently used hugepages&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;system.memory.linux.hugepages.free&lt;/code&gt; - Available hugepages&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;system.memory.linux.hugepages.total&lt;/code&gt; - Total configured hugepages&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Example Alert
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Prometheus alert for low hugepages&lt;/span&gt;
&lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;alert&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;HugepagesExhausted&lt;/span&gt;
  &lt;span class="na"&gt;expr&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;system_memory_linux_hugepages_free &amp;lt; &lt;/span&gt;&lt;span class="m"&gt;10&lt;/span&gt;
  &lt;span class="na"&gt;for&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;5m&lt;/span&gt;
  &lt;span class="na"&gt;annotations&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;summary&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Hugepages&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;nearly&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;exhausted&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;on&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;{{&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;$labels.host&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;}}"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Who Needs This?
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Teams running Redis, PostgreSQL, or MongoDB in production&lt;/li&gt;
&lt;li&gt;Anyone using DPDK for high-performance networking&lt;/li&gt;
&lt;li&gt;VM hosts using KVM with hugepages backing&lt;/li&gt;
&lt;/ul&gt;
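&lt;p&gt;If these metrics follow the hostmetrics receiver's usual optional-metrics pattern, enabling them would look something like this sketch (the exact scraper keys may differ; check the receiver's documentation for your version):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;receivers:
  hostmetrics:
    collection_interval: 60s
    scrapers:
      memory:
        metrics:
          # hugepages metrics are optional and must be enabled explicitly
          system.memory.linux.hugepages.usage:
            enabled: true
          system.memory.linux.hugepages.free:
            enabled: true
          system.memory.linux.hugepages.total:
            enabled: true
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;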




&lt;h2&gt;
  
  
  4. Exclude Namespaces from Kubernetes Watching
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;What's New:&lt;/strong&gt; The k8sobjects receiver now supports excluding specific Kubernetes namespaces from being watched.&lt;/p&gt;

&lt;h3&gt;
  
  
  The Problem
&lt;/h3&gt;

&lt;p&gt;The k8sobjects receiver watches Kubernetes objects (events, pods, deployments, etc.) and converts them to logs. In large clusters, watching all namespaces generates massive amounts of data. You often want to exclude system namespaces or namespaces managed by other tools.&lt;/p&gt;

&lt;h3&gt;
  
  
  Configuration
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;receivers&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;k8sobjects&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;objects&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;events&lt;/span&gt;
        &lt;span class="na"&gt;mode&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;watch&lt;/span&gt;
        &lt;span class="na"&gt;namespaces&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[]&lt;/span&gt;  &lt;span class="c1"&gt;# Empty means all namespaces&lt;/span&gt;
        &lt;span class="na"&gt;exclude_namespaces&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;kube-system&lt;/span&gt;
          &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;kube-public&lt;/span&gt;
          &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;kube-node-lease&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;pods&lt;/span&gt;
        &lt;span class="na"&gt;mode&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;pull&lt;/span&gt;
        &lt;span class="na"&gt;interval&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;30s&lt;/span&gt;
        &lt;span class="na"&gt;exclude_namespaces&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;kube-system&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Use Cases
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Reduce noise&lt;/strong&gt;: Exclude &lt;code&gt;kube-system&lt;/code&gt; events that flood your logs&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Compliance&lt;/strong&gt;: Only watch specific namespaces for audit purposes&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Multi-tenancy&lt;/strong&gt;: Different collectors for different namespace groups&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Cost control&lt;/strong&gt;: Reduce log volume by excluding high-churn namespaces&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Best Practice
&lt;/h3&gt;

&lt;p&gt;Exclude namespaces that:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Generate high volumes of Kubernetes events you don't need&lt;/li&gt;
&lt;li&gt;Are managed by separate observability pipelines&lt;/li&gt;
&lt;li&gt;Contain system components (kube-system, monitoring infrastructure)&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  5. Suppress Repeated Permission Denied Errors
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;What's New:&lt;/strong&gt; The filelog receiver now logs only one permission denied error per file per process run, with an informational message when the file becomes readable again.&lt;/p&gt;

&lt;h3&gt;
  
  
  Before This Change
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;ERROR file /var/log/secure: permission denied
ERROR file /var/log/secure: permission denied
ERROR file /var/log/secure: permission denied
# Repeated every second, forever
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  After This Change
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;ERROR file /var/log/secure: permission denied
# ... silence ...
INFO file /var/log/secure: now readable, resuming collection
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Why This Matters
&lt;/h3&gt;

&lt;p&gt;Log spam from permission errors:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Fills up your log storage&lt;/li&gt;
&lt;li&gt;Makes it harder to find real issues&lt;/li&gt;
&lt;li&gt;Can trigger false alerts on error counts&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This small change significantly improves operational hygiene.&lt;/p&gt;




&lt;h2&gt;
  
  
  6. Trace Flags Policy for Sampling
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;What's New:&lt;/strong&gt; A new &lt;code&gt;trace_flags&lt;/code&gt; policy for the tail sampling processor lets you make sampling decisions based on trace flags.&lt;/p&gt;

&lt;h3&gt;
  
  
  What Are Trace Flags?
&lt;/h3&gt;

&lt;p&gt;Trace flags are an 8-bit field defined by the W3C Trace Context standard. The key flag is the "sampled" bit, which indicates whether an upstream component marked the trace for sampling.&lt;/p&gt;

&lt;h3&gt;
  
  
  Use Case: Honor Upstream Sampling Decisions
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;processors&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;tail_sampling&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;policies&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;honor-upstream-sampling&lt;/span&gt;
        &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;trace_flags&lt;/span&gt;
        &lt;span class="na"&gt;trace_flags&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;sampled&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;  &lt;span class="c1"&gt;# Keep traces marked as sampled&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Use Case: Force Sample Unsampled Traces with Errors
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;processors&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;tail_sampling&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;policies&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;sample-errors-even-if-unsampled&lt;/span&gt;
        &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;and&lt;/span&gt;
        &lt;span class="na"&gt;and&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;not-sampled&lt;/span&gt;
            &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;trace_flags&lt;/span&gt;
            &lt;span class="na"&gt;trace_flags&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
              &lt;span class="na"&gt;sampled&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt;
          &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;has-error&lt;/span&gt;
            &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;status_code&lt;/span&gt;
            &lt;span class="na"&gt;status_code&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;{&lt;/span&gt;&lt;span class="nv"&gt;status_codes&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;ERROR&lt;/span&gt;&lt;span class="pi"&gt;]}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  7. GCP FaaS Attribute Migration (faas.id → faas.instance)
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;What's New:&lt;/strong&gt; The &lt;code&gt;processor.resourcedetection.removeGCPFaaSID&lt;/code&gt; feature gate is now stable and always enabled. The &lt;code&gt;faas.id&lt;/code&gt; attribute is replaced by &lt;code&gt;faas.instance&lt;/code&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  What Changed?
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Before&lt;/th&gt;
&lt;th&gt;After&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;faas.id: "abc123"&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;faas.instance: "abc123"&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  Why This Matters
&lt;/h3&gt;

&lt;p&gt;This aligns with the OpenTelemetry semantic conventions. The &lt;code&gt;faas.instance&lt;/code&gt; attribute better represents "the execution environment instance" rather than just an ID.&lt;/p&gt;

&lt;h3&gt;
  
  
  Migration Steps
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;Update any dashboards or alerts that filter on &lt;code&gt;faas.id&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Search your codebase for references to &lt;code&gt;faas.id&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Update to use &lt;code&gt;faas.instance&lt;/code&gt; instead&lt;/li&gt;
&lt;/ol&gt;
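&lt;p&gt;If you still receive telemetry from older agents that emit &lt;code&gt;faas.id&lt;/code&gt;, a transform processor can normalize it during the transition. A sketch using OTTL (verify the statements against your transform processor version):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;processors:
  transform:
    trace_statements:
      - context: resource
        statements:
          # copy the legacy attribute to the new name, then drop it
          - set(attributes["faas.instance"], attributes["faas.id"]) where attributes["faas.id"] != nil
          - delete_key(attributes, "faas.id")
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;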




&lt;h2&gt;
  
  
  8. Improved Workflow Job Trace Structure
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;What's New:&lt;/strong&gt; Step spans are now siblings of the queue/job span, with both attached directly to the parent job span, instead of being nested as children of the queue/job span.&lt;/p&gt;

&lt;h3&gt;
  
  
  Before (Nested Structure)
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Job Span
└── Queue/Job Span
    ├── Step 1 Span
    ├── Step 2 Span
    └── Step 3 Span
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  After (Sibling Structure)
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Job Span
├── Queue/Job Span
├── Step 1 Span
├── Step 2 Span
└── Step 3 Span
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Why This Is Better
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Clearer visualization in trace UIs&lt;/li&gt;
&lt;li&gt;Steps are directly associated with the job, not buried under queue processing&lt;/li&gt;
&lt;li&gt;Easier to calculate total step duration vs. queue wait time&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  9. Prometheus Receiver: Extra Scrape Metrics Ignored by Default
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;What's New:&lt;/strong&gt; The &lt;code&gt;report_extra_scrape_metrics&lt;/code&gt; configuration option is now ignored by default (feature gate promoted to beta).&lt;/p&gt;

&lt;h3&gt;
  
  
  What This Means
&lt;/h3&gt;

&lt;p&gt;Previously, the Prometheus receiver could report additional metrics about the scrape process itself. These extra metrics are now disabled by default to reduce metric cardinality.&lt;/p&gt;

&lt;h3&gt;
  
  
  If You Need Them
&lt;/h3&gt;

&lt;p&gt;You can restore them by disabling the feature gate (the leading &lt;code&gt;-&lt;/code&gt; turns it off), which makes the &lt;code&gt;report_extra_scrape_metrics&lt;/code&gt; option honored again:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;otelcol &lt;span class="nt"&gt;--feature-gates&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nt"&gt;-receiver&lt;/span&gt;.prometheusreceiver.RemoveReportExtraScrapeMetricsConfig
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Best Practice
&lt;/h3&gt;

&lt;p&gt;Only enable extra scrape metrics if you're actively debugging Prometheus scrape issues. For most production deployments, the default (disabled) is correct.&lt;/p&gt;




&lt;h2&gt;
  
  
  10. Removable Prometheus Service Discoveries via Build Tags
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;What's New:&lt;/strong&gt; Prometheus service discoveries can now be excluded at build time using Go build tags.&lt;/p&gt;

&lt;h3&gt;
  
  
  Why This Matters
&lt;/h3&gt;

&lt;p&gt;The OpenTelemetry Collector binary can get large when it includes all Prometheus service discovery mechanisms (Kubernetes, Consul, EC2, Azure, etc.). If you only use Kubernetes SD, you're shipping unnecessary code.&lt;/p&gt;

&lt;h3&gt;
  
  
  Building a Lighter Collector
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Include only Kubernetes service discovery&lt;/span&gt;
go build &lt;span class="nt"&gt;-tags&lt;/span&gt; &lt;span class="s2"&gt;"promsd_kubernetes"&lt;/span&gt; ./cmd/otelcol-contrib

&lt;span class="c"&gt;# Exclude all service discoveries except static&lt;/span&gt;
go build &lt;span class="nt"&gt;-tags&lt;/span&gt; &lt;span class="s2"&gt;"promsd_none"&lt;/span&gt; ./cmd/otelcol-contrib
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Benefits
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Smaller binary size&lt;/li&gt;
&lt;li&gt;Reduced attack surface&lt;/li&gt;
&lt;li&gt;Faster startup times&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Wrapping Up
&lt;/h2&gt;

&lt;p&gt;This release of OpenTelemetry Collector Contrib demonstrates the project's commitment to solving real-world observability challenges. From cost-effective trace archival with GCS export to protecting your collectors with trace size limits, these features address pain points that teams face daily.&lt;/p&gt;

&lt;h3&gt;
  
  
  Key Takeaways
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Use GCS export&lt;/strong&gt; for cost-effective long-term trace retention&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Set trace size limits&lt;/strong&gt; to protect against memory exhaustion&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Monitor hugepages&lt;/strong&gt; if you run high-performance workloads&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Exclude namespaces&lt;/strong&gt; to reduce k8sobjects receiver load&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Migrate from faas.id to faas.instance&lt;/strong&gt; for GCP workloads&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  Next Steps
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Review the &lt;a href="https://github.com/open-telemetry/opentelemetry-collector-contrib/releases" rel="noopener noreferrer"&gt;full changelog&lt;/a&gt; for additional changes&lt;/li&gt;
&lt;li&gt;Test these features in your staging environment&lt;/li&gt;
&lt;li&gt;Join the &lt;a href="https://cloud-native.slack.com" rel="noopener noreferrer"&gt;CNCF Slack #otel-collector&lt;/a&gt; channel for community support&lt;/li&gt;
&lt;/ul&gt;




&lt;p&gt;&lt;em&gt;Stay updated on the latest cloud-native releases by following &lt;a href="https://relnx.io" rel="noopener noreferrer"&gt;Relnx&lt;/a&gt;. Never miss a feature release again.&lt;/em&gt;&lt;/p&gt;

</description>
    </item>
    <item>
      <title>The Ultimate Guide to Writing Effective Runbooks: Your Secret Weapon for Incident Response</title>
      <dc:creator>Ahmed Zidan</dc:creator>
      <pubDate>Sun, 11 Jan 2026 13:59:49 +0000</pubDate>
      <link>https://dev.to/aws-builders/the-ultimate-guide-to-writing-effective-runbooks-your-secret-weapon-for-incident-response-e5l</link>
      <guid>https://dev.to/aws-builders/the-ultimate-guide-to-writing-effective-runbooks-your-secret-weapon-for-incident-response-e5l</guid>
      <description>&lt;p&gt;When your monitoring system screams at 3 AM and you're jolted awake by that dreaded notification sound, what's your first instinct? Panic? Confusion? Frantically searching through old Slack messages hoping someone else dealt with this before?&lt;/p&gt;

&lt;p&gt;There's a better way. Enter the &lt;strong&gt;runbook&lt;/strong&gt;—your team's collective wisdom distilled into a single, accessible document that transforms any engineer into an expert on any system, even at 3 AM.&lt;/p&gt;

&lt;h2&gt;
  
  
  What Exactly is a Runbook?
&lt;/h2&gt;

&lt;p&gt;A runbook is a documented procedure that guides an engineer through understanding and responding to a specific service or alert. Think of it as a field manual—comprehensive enough to inform, concise enough to act on quickly.&lt;/p&gt;

&lt;p&gt;In complex environments with dozens of microservices, databases, and integrations, no single person can hold complete knowledge of every system in their head. Runbooks democratize that knowledge, ensuring that the new engineer who just joined last week can respond to an incident as effectively as the veteran who built the system.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why Runbooks Matter More Than You Think
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Speed matters during incidents.&lt;/strong&gt; Every minute of downtime costs money, trust, and sanity. A well-crafted runbook eliminates the costly "investigation phase" where engineers stumble around trying to understand what they're looking at.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Knowledge shouldn't walk out the door.&lt;/strong&gt; When team members leave or switch projects, their expertise often leaves with them. Runbooks capture that institutional knowledge permanently.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Consistency saves lives (and systems).&lt;/strong&gt; Ad-hoc troubleshooting leads to inconsistent outcomes. A runbook ensures everyone follows the same proven path to resolution.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Anatomy of a Great Runbook
&lt;/h2&gt;

&lt;p&gt;Every effective runbook answers six critical questions about its service:&lt;/p&gt;

&lt;h3&gt;
  
  
  1. What Is This Service and What Does It Do?
&lt;/h3&gt;

&lt;p&gt;Start with context. An engineer responding to an alert needs to quickly understand the service's purpose before they can reason about what might be wrong.&lt;/p&gt;

&lt;p&gt;Include the service's core functionality, business importance, and user impact. A payment processing service demands different urgency than a batch reporting job. Make this clear upfront so responders can prioritize appropriately.&lt;/p&gt;

&lt;h3&gt;
  
  
  2. Who Is Responsible for It?
&lt;/h3&gt;

&lt;p&gt;List the owning team, key contacts, and escalation paths. Include on-call schedules and alternative contacts. Nothing wastes time like an engineer hunting through directories at 2 AM trying to figure out who to page when things get serious.&lt;/p&gt;

&lt;h3&gt;
  
  
  3. What Dependencies Does It Have?
&lt;/h3&gt;

&lt;p&gt;Modern services rarely exist in isolation. Document:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Upstream services&lt;/strong&gt; — What does this service call?&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Downstream consumers&lt;/strong&gt; — What calls this service?&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;External dependencies&lt;/strong&gt; — Third-party APIs, cloud services&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Data stores&lt;/strong&gt; — Databases, caches, queues&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;When the service misbehaves, dependencies are prime suspects.&lt;/p&gt;

&lt;h3&gt;
  
  
  4. What Does the Infrastructure Look Like?
&lt;/h3&gt;

&lt;p&gt;Include architecture diagrams, deployment topology, and resource specifications. Document where the service runs, how it scales, and what its typical resource utilization looks like. Engineers need this mental model to diagnose issues effectively.&lt;/p&gt;

&lt;h3&gt;
  
  
  5. What Metrics and Logs Does It Emit?
&lt;/h3&gt;

&lt;p&gt;Describe the key metrics to watch:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Latency&lt;/li&gt;
&lt;li&gt;Error rates&lt;/li&gt;
&lt;li&gt;Throughput&lt;/li&gt;
&lt;li&gt;Resource utilization&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;More importantly, explain what these metrics &lt;strong&gt;mean&lt;/strong&gt;. A spike in queue depth means nothing without context—is that normal during peak hours, or a sign of trouble?&lt;/p&gt;

&lt;p&gt;Include direct links to dashboards and log queries. Reduce friction to zero.&lt;/p&gt;

&lt;h3&gt;
  
  
  6. What Alerts Are Set Up and Why?
&lt;/h3&gt;

&lt;p&gt;For each alert, document:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Trigger condition&lt;/strong&gt; — What threshold fires it?&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Why it matters&lt;/strong&gt; — What does this indicate?&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;False positive scenarios&lt;/strong&gt; — When might this fire incorrectly?&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Remediation steps&lt;/strong&gt; — Specific actions to take&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This is the heart of operational excellence. An alert without documented remediation is just noise.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Golden Rule: Link Every Alert to Its Runbook
&lt;/h2&gt;

&lt;p&gt;This single practice transforms your incident response. When an alert fires, the engineer receives a link to the relevant runbook alongside the notification. They click through, immediately understand the context, and have clear remediation steps at their fingertips.&lt;/p&gt;

&lt;p&gt;No searching. No guessing. No waking up the person who happened to build this thing three years ago.&lt;/p&gt;
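&lt;p&gt;In a Prometheus-style setup, this can be as simple as attaching a &lt;code&gt;runbook_url&lt;/code&gt; annotation to every alerting rule. A minimal sketch (the service name, threshold, and URL are illustrative):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;groups:
  - name: payments-service
    rules:
      - alert: PaymentsHighErrorRate          # hypothetical alert name
        expr: |
          sum(rate(http_requests_total{job="payments", code=~"5.."}[5m]))
            / sum(rate(http_requests_total{job="payments"}[5m])) &gt; 0.05
        for: 10m
        labels:
          severity: page
        annotations:
          summary: "Payments error rate above 5% for 10 minutes"
          runbook_url: "https://wiki.example.com/runbooks/payments"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Paging tools such as Alertmanager, PagerDuty, and Opsgenie can surface annotations like this directly in the notification, so the responder is one click away from the runbook.&lt;/p&gt;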

&lt;h2&gt;
  
  
  Best Practices for Runbook Success
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Keep Runbooks Alive
&lt;/h3&gt;

&lt;p&gt;A runbook is not a one-time document. Review and update it after every incident. If an engineer discovered something missing during their response, add it immediately.&lt;/p&gt;

&lt;h3&gt;
  
  
  Make Them Discoverable
&lt;/h3&gt;

&lt;p&gt;The best runbook is useless if no one can find it. Standardize your naming conventions and storage location. Integrate links directly into your alerting system.&lt;/p&gt;

&lt;h3&gt;
  
  
  Test Your Runbooks
&lt;/h3&gt;

&lt;p&gt;Periodically walk through runbook procedures during game days or chaos engineering exercises. Does the documentation actually work? Are the links still valid?&lt;/p&gt;

&lt;h3&gt;
  
  
  Write for the Tired Engineer
&lt;/h3&gt;

&lt;p&gt;Remember: runbooks get read at 3 AM by someone who was asleep ten minutes ago. Use clear headings, bullet points, and direct language. Avoid jargon where possible.&lt;/p&gt;

&lt;h3&gt;
  
  
  Include the "Why," Not Just the "What"
&lt;/h3&gt;

&lt;p&gt;Engineers troubleshoot better when they understand the reasoning behind procedures. Don't just say "restart the service"—explain why restarting helps and what symptoms suggest this is the right action.&lt;/p&gt;

&lt;h2&gt;
  
  
  A Simple Template to Get Started
&lt;/h2&gt;

&lt;p&gt;Use this structure for every service:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;## Service Name
[Name]

## Overview
Two to three sentences describing what this service does and why it matters.

## Ownership
- Team: [Team name]
- Slack Channel: [#channel]
- On-Call Rotation: [Link]
- Escalation Contacts: [Names/handles]

## Dependencies
- Upstream: [Services this calls]
- Downstream: [Services that call this]
- External: [Third-party APIs]
- Data Stores: [Databases, caches]

## Infrastructure
- Deployment: [Location/platform]
- Scaling: [Configuration]
- Architecture: [Diagram link]

## Key Metrics
| Metric | Normal Range | Dashboard |
|--------|--------------|-----------|
| [Name] | [Range]      | [Link]    |

## Alerts
### [Alert Name]
- **Trigger:** [Condition]
- **Meaning:** [What this indicates]
- **Remediation:** [Step-by-step actions]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  The Payoff
&lt;/h2&gt;

&lt;p&gt;Teams with well-maintained runbooks consistently demonstrate:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;⚡ &lt;strong&gt;Faster mean time to resolution&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;📉 &lt;strong&gt;Reduced escalations&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;😌 &lt;strong&gt;Lower stress levels during incidents&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;🚀 &lt;strong&gt;Better onboarding for new team members&lt;/strong&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Runbooks aren't just documentation—they're operational excellence encoded into your organization's DNA.&lt;/p&gt;

&lt;p&gt;Start with your most critical services. One runbook at a time, you'll build a culture where incidents are handled with confidence, not chaos.&lt;/p&gt;

</description>
      <category>devops</category>
      <category>runbook</category>
      <category>incident</category>
      <category>sitereliabilityengineering</category>
    </item>
    <item>
      <title>This Week’s Cloud Native Pulse: Dec 13-19 – OTel Memory Leak Fix, K8s 1.35 GA Blitz, ArgoCD Shields Up</title>
      <dc:creator>Ahmed Zidan</dc:creator>
      <pubDate>Sat, 20 Dec 2025 04:59:55 +0000</pubDate>
      <link>https://dev.to/aws-builders/this-weeks-cloud-native-pulse-dec-13-19-otel-memory-leak-fix-k8s-135-ga-blitz-argocd-shields-3dfj</link>
      <guid>https://dev.to/aws-builders/this-weeks-cloud-native-pulse-dec-13-19-otel-memory-leak-fix-k8s-135-ga-blitz-argocd-shields-3dfj</guid>
      <description>&lt;p&gt;Last week was packed with important releases across the tools many of us rely on daily: OpenTelemetry, Kubernetes, ArgoCD, ArgoCD Image Updater, Prometheus, and Grafana. This post highlights the changes that are most likely to impact your clusters, dashboards, and pipelines, with direct links to deeper release notes on &lt;a href="https://www.relnx.io/" rel="noopener noreferrer"&gt;https://www.relnx.io/&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  OpenTelemetry Collector Contrib v0.142.0
&lt;/h2&gt;

&lt;p&gt;OpenTelemetry Collector Contrib v0.142.0 was released on December 17, 2025, and it comes with a mix of critical fixes and useful quality‑of‑life improvements for production pipelines. This is a release worth prioritizing if you use tail sampling, Prometheus Remote Write, GCP networking, or Datadog integrations.&lt;/p&gt;

&lt;h3&gt;
  
  
  Key highlights:
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;Tail sampling memory leak fix&lt;br&gt;
A critical memory leak introduced in 0.141.0 for the tail sampling processor (when not blocking on overflow) has been fixed, which is essential if you rely on tail sampling for high-volume traces.&lt;br&gt;
Details: &lt;a href="https://www.relnx.io/features/fix-a-memory-leak-introduced-in-01410-of-the-tail-sampling-processor-when-not-blocking-on-overflow-1450" rel="noopener noreferrer"&gt;https://www.relnx.io/features/fix-a-memory-leak-introduced-in-01410-of-the-tail-sampling-processor-when-not-blocking-on-overflow-1450&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Remote Write 2.0 rc.4 (breaking change)&lt;br&gt;
The collector now targets Remote Write 2.0 spec rc.4, which requires Prometheus 3.8.0 or later, so environments using Prometheus Remote Write must ensure compatibility before upgrading.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Details: Remote Write 2.0 spec rc.4 change &lt;a href="https://www.relnx.io/features/updated-to-remote-write-20-spec-rc4-requiring-prometheus-380-or-later-the-upstream-prometheus-library-updated-the-remote-write-20-protocol-from-rc3-to-rc4-in-prometheusprometheus17411-1475" rel="noopener noreferrer"&gt;https://www.relnx.io/features/updated-to-remote-write-20-spec-rc4-requiring-prometheus-380-or-later-the-upstream-prometheus-library-updated-the-remote-write-20-protocol-from-rc3-to-rc4-in-prometheusprometheus17411-1475&lt;/a&gt;&lt;/p&gt;

&lt;ol start="3"&gt;
&lt;li&gt;filelog.decompressFingerprint is now stable&lt;br&gt;
The filelog.decompressFingerprint feature for identifying and decompressing log files has graduated to stable, improving confidence in processing compressed logs at scale and enabling better storage and transfer efficiency.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Details: &lt;a href="https://www.relnx.io/features/move-filelogdecompressfingerprint-to-stable-stage-1472" rel="noopener noreferrer"&gt;https://www.relnx.io/features/move-filelogdecompressfingerprint-to-stable-stage-1472&lt;/a&gt;&lt;/p&gt;

&lt;ol start="4"&gt;
&lt;li&gt;&lt;p&gt;Better GCP External HTTP(S) LB logs&lt;br&gt;
External Application Load Balancer logs can now be parsed into log record attributes instead of being left as raw body payloads, increasing readability and query power for GCP users.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Simplified cache lifecycle management&lt;br&gt;
Cache lifecycle handling has been simplified by removing unnecessary WaitGroup complexity, which reduces internal complexity and the chances of subtle lifecycle bugs.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Details: &lt;a href="https://www.relnx.io/features/simplified-cache-lifecycle-management-by-removing-unnecessary-waitgroup-complexity-1457" rel="noopener noreferrer"&gt;https://www.relnx.io/features/simplified-cache-lifecycle-management-by-removing-unnecessary-waitgroup-complexity-1457&lt;/a&gt;&lt;/p&gt;

&lt;ol start="6"&gt;
&lt;li&gt;Datadog receiver: multi-tag parsing flag&lt;br&gt;
A new receiver.datadogreceiver.EnableMultiTagParsing feature gate controls how Datadog tags are converted into OpenTelemetry attributes, giving more precise control over tag-to-attribute mapping.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Details: &lt;a href="https://www.relnx.io/features/add-receiverdatadogreceiverenablemultitagparsing-feature-gate-the-feature-flag-changes-the-logic-that-converts-datadog-tags-to-opentelemetry-attributes-1438" rel="noopener noreferrer"&gt;https://www.relnx.io/features/add-receiverdatadogreceiverenablemultitagparsing-feature-gate-the-feature-flag-changes-the-logic-that-converts-datadog-tags-to-opentelemetry-attributes-1438&lt;/a&gt;&lt;/p&gt;

&lt;ol start="7"&gt;
&lt;li&gt;Datadog receiver: AWS SDK semantic conventions&lt;br&gt;
The Datadog receiver improves compliance with OpenTelemetry Semantic Conventions for AWS SDK spans, bringing more consistent, interoperable tracing data across services using the AWS SDK.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Details: &lt;a href="https://www.relnx.io/features/improve-the-compliance-with-otel-semantic-conventions-for-aws-sdk-spans-in-the-datadog-receiver-compliance-improvements-on-spans-received-via-the-datadog-receiver-when-applicable-1436" rel="noopener noreferrer"&gt;https://www.relnx.io/features/improve-the-compliance-with-otel-semantic-conventions-for-aws-sdk-spans-in-the-datadog-receiver-compliance-improvements-on-spans-received-via-the-datadog-receiver-when-applicable-1436&lt;/a&gt;&lt;/p&gt;

&lt;ol start="8"&gt;
&lt;li&gt;Datadog tag runtime remapped&lt;br&gt;
The Datadog runtime tag now maps to container.runtime.name instead of container.runtime, aligning better with OpenTelemetry attribute naming and improving trace and metric consistency.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Details: &lt;a href="https://www.relnx.io/features/the-datadog-tag-runtime-is-now-mapped-to-the-otel-attribute-containerruntimename-instead-of-containerruntime-1435" rel="noopener noreferrer"&gt;https://www.relnx.io/features/the-datadog-tag-runtime-is-now-mapped-to-the-otel-attribute-containerruntimename-instead-of-containerruntime-1435&lt;/a&gt;&lt;/p&gt;

&lt;ol start="9"&gt;
&lt;li&gt;&lt;p&gt;New transform: set_semconv_span_name()&lt;br&gt;
A new transform processor function, set_semconv_span_name(), can rewrite span names according to semantic conventions for HTTP, RPC, messaging, and database spans, helping tackle high-cardinality span names.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;GCP VPC Flow Logs: MIG &amp;amp; Google Service fields&lt;br&gt;
Support was added for GCP VPC Flow Log fields for Managed Instance Groups and Google Service logs, enabling more granular visibility and troubleshooting for GCP network traffic.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;
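&lt;p&gt;For the new transform function, a hedged configuration sketch (the exact invocation context may differ; check the transform processor README):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;processors:
  transform:
    trace_statements:
      - context: span
        statements:
          # Rewrite span names to follow semantic conventions
          # for HTTP, RPC, messaging, and database spans.
          - set_semconv_span_name()
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;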

&lt;p&gt;Everything else in this release: &lt;a href="https://www.relnx.io/releases/opentelemetry-collector-contrib-v0-142-0" rel="noopener noreferrer"&gt;https://www.relnx.io/releases/opentelemetry-collector-contrib-v0-142-0&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Kubernetes v1.35.0
&lt;/h2&gt;

&lt;p&gt;Kubernetes v1.35.0 contains several observability, metrics, and UX changes, along with some deprecations and GA features that may affect day‑to‑day operations. This is a good release to review from both SRE and platform governance perspectives.&lt;/p&gt;

&lt;h3&gt;
  
  
  Highlights:
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;Improved kube-proxy /statusz&lt;br&gt;
The /statusz page for kube-proxy now includes a list of exposed endpoints, making debugging and introspection of network behavior easier.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Details: &lt;a href="https://www.relnx.io/features/updated-the-statusz-page-for-kube-proxy-to-include-a-list-of-exposed-endpoints-making-debugging-and-introspection-easier-1699" rel="noopener noreferrer"&gt;https://www.relnx.io/features/updated-the-statusz-page-for-kube-proxy-to-include-a-list-of-exposed-endpoints-making-debugging-and-introspection-easier-1699&lt;/a&gt;&lt;/p&gt;

&lt;ol start="2"&gt;
&lt;li&gt;Deprecated metrics hidden by policy&lt;br&gt;
Deprecated metrics are now hidden according to the metrics deprecation policy, helping teams avoid relying on outdated signals while keeping their metric surface area clean.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Details: &lt;a href="https://www.relnx.io/features/deprecated-metrics-will-be-hidden-as-per-the-metrics-deprecation-policy-httpskubernetesiodocsreferenceusing-apideprecation-policydeprecating-a-metric-1597" rel="noopener noreferrer"&gt;https://www.relnx.io/features/deprecated-metrics-will-be-hidden-as-per-the-metrics-deprecation-policy-httpskubernetesiodocsreferenceusing-apideprecation-policydeprecating-a-metric-1597&lt;/a&gt;&lt;/p&gt;

&lt;ol start="3"&gt;
&lt;li&gt;Dry-run requests excluded from apiserver_request_sli_duration_seconds&lt;br&gt;
Dry‑run requests are excluded from this SLI metric, ensuring latency measurements better reflect real user-impacting operations.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Details: &lt;a href="https://www.relnx.io/features/metrics-excluded-dryrun-requests-from-apiserver-request-sli-duration-seconds-1570" rel="noopener noreferrer"&gt;https://www.relnx.io/features/metrics-excluded-dryrun-requests-from-apiserver-request-sli-duration-seconds-1570&lt;/a&gt;&lt;/p&gt;

&lt;ol start="4"&gt;
&lt;li&gt;New kubelet metrics for secret-pulled images&lt;br&gt;
New kubelet metrics for the “Ensure Secret Pulled Images” KEP provide visibility into pulling images from private registries with secrets, improving troubleshooting of image pull performance.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Details: &lt;a href="https://www.relnx.io/features/introduced-new-kubelet-metrics-for-the-ensure-secret-pulled-images-kep-including-1557" rel="noopener noreferrer"&gt;https://www.relnx.io/features/introduced-new-kubelet-metrics-for-the-ensure-secret-pulled-images-kep-including-1557&lt;/a&gt;&lt;/p&gt;

&lt;ol start="5"&gt;
&lt;li&gt;Metrics for StatefulSet MaxUnavailable&lt;br&gt;
New metrics expose how many pods can be unavailable during a StatefulSet update, which helps control and reason about downtime during rolling updates.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Details: &lt;a href="https://www.relnx.io/features/added-metrics-for-the-maxunavailable-feature-in-statefulset-1535" rel="noopener noreferrer"&gt;https://www.relnx.io/features/added-metrics-for-the-maxunavailable-feature-in-statefulset-1535&lt;/a&gt;&lt;/p&gt;

&lt;ol start="6"&gt;
&lt;li&gt;More events during Pod resizing&lt;br&gt;
Additional events are emitted during pod resizing, providing clearer visibility into resize status changes and helping debug vertical scaling operations.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Details: &lt;a href="https://www.relnx.io/features/added-additional-event-emissions-during-pod-resizing-to-provide-clearer-visibility-when-a-pods-resize-status-changes-1533" rel="noopener noreferrer"&gt;https://www.relnx.io/features/added-additional-event-emissions-during-pod-resizing-to-provide-clearer-visibility-when-a-pods-resize-status-changes-1533&lt;/a&gt;&lt;/p&gt;

&lt;ol start="7"&gt;
&lt;li&gt;&lt;p&gt;New kubelet image manager metric&lt;br&gt;
The kubelet_image_manager_ensure_image_requests_total{present_locally, pull_policy, pull_required} counter exposes detailed information on how often kubelet must ensure images are present, which can inform image placement strategies.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;In‑place Pod resource updates are GA&lt;br&gt;
In‑place updates of Pod CPU and memory resources have graduated to GA, enabling nondisruptive vertical scaling for many workloads that previously required recreating pods.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;HPA performance improvement for container metrics&lt;br&gt;
Container-specific HPA metrics now use an optimized lookup that exits early when the target container is found, reducing overhead in pods with multiple containers.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Dropped certificates/v1beta1 CSR support in kubectl&lt;br&gt;
kubectl no longer supports certificates/v1beta1 CertificateSigningRequest, nudging users to use stable API versions.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Stricter kubectl exec syntax&lt;br&gt;
kubectl exec [POD] [COMMAND] is no longer supported; kubectl exec [POD] -- [COMMAND] is now required, which aligns with long‑established best practices and avoids parsing ambiguities.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Details: &lt;a href="https://www.relnx.io/features/changed-kubectl-exec-syntax-to-require-before-the-command-the-form-kubectl-exec-pod-command-is-no-longer-supported-use-kubectl-exec-pod-command-instead-1594" rel="noopener noreferrer"&gt;https://www.relnx.io/features/changed-kubectl-exec-syntax-to-require-before-the-command-the-form-kubectl-exec-pod-command-is-no-longer-supported-use-kubectl-exec-pod-command-instead-1594&lt;/a&gt;&lt;/p&gt;

&lt;ol start="12"&gt;
&lt;li&gt;UserNamespacesPodSecurityStandards gate removed&lt;br&gt;
The UserNamespacesPodSecurityStandards feature gate has been removed now that the minimum supported kubelet version is v1.31, making the enhanced pod security behavior the default and reducing configuration complexity.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Details: &lt;a href="https://www.relnx.io/features/removed-the-usernamespacespodsecuritystandards-feature-gate-the-minimum-supported-kubernetes-version-for-kubelet-is-now-v131-so-the-gate-is-no-longer-needed-1687" rel="noopener noreferrer"&gt;https://www.relnx.io/features/removed-the-usernamespacespodsecuritystandards-feature-gate-the-minimum-supported-kubernetes-version-for-kubelet-is-now-v131-so-the-gate-is-no-longer-needed-1687&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Full Kubernetes v1.35.0 release highlights are available on: &lt;a href="https://www.relnx.io/releases/kubernetes-v1-35-0" rel="noopener noreferrer"&gt;https://www.relnx.io/releases/kubernetes-v1-35-0&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  ArgoCD v3.2.2
&lt;/h2&gt;

&lt;p&gt;ArgoCD v3.2.2, released on December 18, 2025, is a smaller but meaningful bug‑fix release targeting authentication, secret management, and ApplicationSet behavior.&lt;/p&gt;

&lt;h3&gt;
  
  
  Key fixes:
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;AuthMiddleware: check userinfo endpoint&lt;br&gt;
The AuthMiddleware now checks the userinfo endpoint, improving validation of authenticated users and strengthening the security model around who can access ArgoCD.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Read and write secrets for the same URL&lt;br&gt;
Support for separate read and write secrets on the same URL provides more granular access control, which is useful for tightening permissions around sensitive resources.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;AppSet preserves annotations during hydration&lt;br&gt;
ApplicationSet now preserves annotations when hydration is requested, ensuring that attached metadata remains intact and usable by downstream tools and automation.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Read the full ArgoCD 3.2.2 breakdown on: &lt;a href="https://www.relnx.io/releases/argocd-v3-2-2" rel="noopener noreferrer"&gt;https://www.relnx.io/releases/argocd-v3-2-2&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  ArgoCD Image Updater v1.0.2
&lt;/h2&gt;

&lt;p&gt;ArgoCD Image Updater v1.0.2, released on December 16, 2025, focuses on making deployments more predictable and reducing surprise behavior around tags and annotations.&lt;/p&gt;

&lt;h3&gt;
  
  
  Highlights:
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;Installed into argocd namespace by default&lt;br&gt;
Installing the Image Updater into the argocd namespace by default simplifies setup and improves integration between the controller and ArgoCD itself.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Preserve existing Helm tag parameter when image has no tag&lt;br&gt;
When an image has no explicit tag, the existing Helm tag parameter is preserved, reducing the risk of unintentionally changing image versions during updates.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Fix infinite commit loop with digest strategy&lt;br&gt;
A bug where the digest strategy inconsistently wrote tag names and caused infinite commit loops has been fixed, eliminating noisy commits and wasted CI/CD cycles.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Default argocd-image-updater-controller annotation&lt;br&gt;
Using argocd-image-updater-controller as a default container annotation makes automatic image management simpler and helps keep workloads on up‑to‑date images with less manual effort.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;More details are available in the full ArgoCD Image Updater v1.0.2 notes on &lt;a href="https://www.relnx.io/releases/argocd-image-updater-v1-0-2" rel="noopener noreferrer"&gt;https://www.relnx.io/releases/argocd-image-updater-v1-0-2&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Prometheus v3.8.1
&lt;/h2&gt;

&lt;p&gt;Prometheus v3.8.1, released on December 16, 2025, is a focused bug‑fix release that is especially relevant if you rely on Remote Write.&lt;/p&gt;

&lt;h3&gt;
  
  
  Highlights:
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;Remote Write receiver bug fix&lt;br&gt;
The Remote Write receiver now avoids sending incorrect response headers for the v1 flow, which previously caused senders to emit false partial-error logs and metrics, improving the accuracy and trustworthiness of your monitoring data.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Full Prometheus 3.8.1 release summary is available on &lt;a href="https://www.relnx.io/releases/prometheus-v3-8-1" rel="noopener noreferrer"&gt;https://www.relnx.io/releases/prometheus-v3-8-1&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  Grafana v12.3.1
&lt;/h2&gt;

&lt;p&gt;Grafana v12.3.1, released on December 17, 2025, is a UI and UX‑focused update that cleans up dashboard behavior and improves Azure log exploration.&lt;/p&gt;

&lt;h3&gt;
  
  
  Highlights:
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;Fixed empty space under time controls&lt;br&gt;
Dashboards with many variables no longer show a large empty space under the time controls, giving back valuable screen real estate for panels and visualizations.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;ol start="2"&gt;
&lt;li&gt;&lt;p&gt;Clearing hideSeriesFrom on query edit&lt;br&gt;
The QueryEditorRows behavior now clears hideSeriesFrom overrides when a query is edited, helping prevent accidental hiding of relevant series after query changes.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Azure logs: aggregate columns in logs builder&lt;br&gt;
Azure users can now include aggregate columns directly in the logs builder, making it easier to derive and visualize higher-level metrics from log data.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;More Grafana 12.3.1 details can be found on &lt;a href="https://www.relnx.io/releases/grafana-v12-3-1" rel="noopener noreferrer"&gt;https://www.relnx.io/releases/grafana-v12-3-1&lt;/a&gt;&lt;/p&gt;




&lt;p&gt;That wraps up a busy week across OpenTelemetry, Kubernetes, ArgoCD, Prometheus, and Grafana. If you want to keep up with these changes and benefit from automated upgrade guidance, join the community at relnx.io, where you can track releases for your favorite tools and explore auto‑upgrade workflows tailored to your stack.&lt;/p&gt;

</description>
      <category>kubernetes</category>
      <category>argocd</category>
      <category>opentelemetry</category>
    </item>
    <item>
      <title>Behind the War Room Doors: How Great Incident Management Drives Fast Resolution</title>
      <dc:creator>Ahmed Zidan</dc:creator>
      <pubDate>Mon, 17 Nov 2025 10:27:10 +0000</pubDate>
      <link>https://dev.to/aws-builders/behind-the-war-room-doors-how-great-incident-management-drives-fast-resolution-3908</link>
      <guid>https://dev.to/aws-builders/behind-the-war-room-doors-how-great-incident-management-drives-fast-resolution-3908</guid>
      <description>&lt;p&gt;Incident management is a critical part of any observability stack. When things break, stress levels rise, time feels compressed, and communication can easily spiral out of control. Without proper coordination and clearly assigned roles, even small incidents can snowball.&lt;/p&gt;

&lt;p&gt;To make this process smoother, efficient, and blameless, every engineering organization should implement a structured approach. Over time, this will reduce your Mean Time to Resolution (MTTR) and build a culture where everyone focuses on resolution—not blame.&lt;/p&gt;

&lt;p&gt;This framework breaks incident management into four key stages.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5zivv95of3nhely5rs4u.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5zivv95of3nhely5rs4u.png" alt="Screenshot 2025-11-17 at 6.20.31 PM.png" width="800" height="325"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  1. Notifications
&lt;/h2&gt;

&lt;p&gt;When an incident is triggered, communication speed and accuracy determine how fast you can respond. The goal is to alert the right people, in the right channels, at the right time.&lt;/p&gt;

&lt;p&gt;Here’s how to set it up strategically:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;General Incident Channel: A shared space where everyone across the company can stay informed. Transparency builds trust and awareness.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Dedicated Incident Channel: A focused chat for real-time communication, troubleshooting, and decision-making between responders.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Stakeholder Alerts (Optional): For high-severity incidents, specific leaders or stakeholders should be notified directly to ensure alignment on business impact and response strategy.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;This tiered notification setup ensures that communication stays clear and organized throughout the incident lifecycle.&lt;/p&gt;

&lt;h2&gt;
  
  
  2. During the Incident
&lt;/h2&gt;

&lt;p&gt;Once the response begins, chaos can sneak in unless clear roles and responsibilities are defined upfront. Each person should know their mission to maintain focus and avoid duplication of effort.&lt;/p&gt;

&lt;p&gt;Key roles include:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;Incident Commander (IC): The decision-maker. The IC oversees the entire operation, makes judgment calls, and ensures progress continues—without diving into technical work.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Scribe: The recorder. This person logs events, decisions, timelines, and next steps. Accurate documentation is essential for the postmortem.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Communication Liaison: The bridge between responders and others. They send concise updates to stakeholders and prevent unnecessary distractions for the technical team.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Responders / Subject Matter Experts (SMEs): The technical experts investigating and resolving the incident. They work closely together to identify root causes and execute remediation steps.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Well-defined roles lead to calm, coordinated action rather than reactive chaos.&lt;/p&gt;

&lt;h2&gt;
  
  
  3. Follow-Up (Stabilization Phase)
&lt;/h2&gt;

&lt;p&gt;Once production is stable again, the work isn’t over. The stabilization phase focuses on ensuring the underlying problem is fully understood and properly fixed.&lt;/p&gt;

&lt;p&gt;This includes:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Creating follow-up tickets for permanent fixes.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Validating the production environment after recovery.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Running a quick internal review to confirm that monitoring, alerts, and runbooks worked as expected.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This phase transitions the team from firefighting to prevention.&lt;/p&gt;

&lt;h2&gt;
  
  
  4. Resolution &amp;amp; Learning
&lt;/h2&gt;

&lt;p&gt;After the system is stable and follow-up actions are completed, take time to learn. Every incident is an opportunity to strengthen the system and team.&lt;/p&gt;

&lt;p&gt;Two critical outputs:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;Postmortem: A timeline-based narrative of the incident. What happened, why it happened, what went well, and what didn’t. Keep it factual and blameless.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Documentation &amp;amp; Knowledge Sharing: Store all findings in an accessible place so others can learn from the experience and avoid repeating mistakes.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;With consistent practice, teams become more confident, incidents resolve faster, and the overall reliability culture improves.&lt;/p&gt;

&lt;h2&gt;
  
  
  Final Thoughts
&lt;/h2&gt;

&lt;p&gt;Incident management is not just about technical recovery—it’s about coordination, communication, and continuous learning. By mastering these four parts—Notifications, During the Incident, Follow-Up, and Resolution &amp;amp; Learning—you will transform stressful incidents into structured, teachable moments that strengthen your engineering culture and reduce MTTR over time.&lt;/p&gt;

</description>
      <category>sitereliabilityengineering</category>
      <category>devops</category>
      <category>observability</category>
    </item>
    <item>
      <title>This Week’s Cloud Native Pulse: Top Releases &amp; Urgent Ingress NGINX News (Nov 16, 2025)</title>
      <dc:creator>Ahmed Zidan</dc:creator>
      <pubDate>Sun, 16 Nov 2025 14:56:20 +0000</pubDate>
      <link>https://dev.to/aws-builders/this-weeks-cloud-native-pulse-top-releases-urgent-ingress-nginx-news-nov-16-2025-1f8b</link>
      <guid>https://dev.to/aws-builders/this-weeks-cloud-native-pulse-top-releases-urgent-ingress-nginx-news-nov-16-2025-1f8b</guid>
      <description>&lt;h2&gt;
  
  
  TL;DR
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Eight major releases: &lt;code&gt;Skaffold&lt;/code&gt;, &lt;code&gt;Traefik&lt;/code&gt;, &lt;code&gt;Operator Framework&lt;/code&gt;, &lt;code&gt;Argo Workflows&lt;/code&gt;, &lt;code&gt;Cilium&lt;/code&gt;, &lt;code&gt;Helm&lt;/code&gt;, &lt;code&gt;Kubernetes&lt;/code&gt;, &lt;code&gt;Kustomize&lt;/code&gt;.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;NGINX Ingress officially retiring—organizations must migrate within 6 months.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Gateway API recommended as the new standard.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Full release details at &lt;a href="https://www.relnx.io/releases" rel="noopener noreferrer"&gt;https://www.relnx.io/releases&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Featured Releases
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;code&gt;Skaffold v2.17.0&lt;/code&gt;: Configuration improvements, and bug fixes [&lt;a href="https://www.relnx.io/releases/skaffold-v2-17-0" rel="noopener noreferrer"&gt;https://www.relnx.io/releases/skaffold-v2-17-0&lt;/a&gt;].&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;code&gt;Traefik v3.6.1&lt;/code&gt;: Docker API negotiation, multi-layer routing, Gateway API support, OpenTelemetry enhancements [&lt;a href="https://www.relnx.io/releases/traefik-v3-6-1" rel="noopener noreferrer"&gt;https://www.relnx.io/releases/traefik-v3-6-1&lt;/a&gt;].&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;code&gt;Operator Framework v1.42.0&lt;/code&gt;: Upgraded Kubernetes support, enhanced testing, network policy protection [&lt;a href="https://www.relnx.io/releases/operator%20framework-v1-42-0" rel="noopener noreferrer"&gt;https://www.relnx.io/releases/operator%20framework-v1-42-0&lt;/a&gt;].&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;code&gt;Argo Workflows v3.7.4&lt;/code&gt;: Smarter caching, controller improvements, exclusive image publishing [&lt;a href="https://www.relnx.io/releases/argo-workflows-v3-7-4" rel="noopener noreferrer"&gt;https://www.relnx.io/releases/argo-workflows-v3-7-4&lt;/a&gt;].&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;code&gt;Cilium v1.16.17&lt;/code&gt;: Security fixes, eBPF networking improvements, Envoy proxy update [&lt;a href="https://www.relnx.io/releases/cilium-v1-16-17" rel="noopener noreferrer"&gt;https://www.relnx.io/releases/cilium-v1-16-17&lt;/a&gt;].&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;code&gt;Helm v4.0.0&lt;/code&gt;: Major milestone release with backend refactor, enhanced security defaults, improved templating capabilities, and seamless Kubernetes integration. This release sets a new standard for package management in Kubernetes.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Full release: &lt;a href="https://www.relnx.io/releases/helm-v4-0-0" rel="noopener noreferrer"&gt;https://www.relnx.io/releases/helm-v4-0-0&lt;/a&gt;.&lt;/p&gt;

&lt;ol start="7"&gt;
&lt;li&gt;
&lt;code&gt;Kubernetes v1.34.2&lt;/code&gt;: Critical security patches, bug fixes, performance enhancements, and improved scheduler and API stability; recommended for all production clusters.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Full release: &lt;a href="https://www.relnx.io/releases/kubernetes-v1-34-2" rel="noopener noreferrer"&gt;https://www.relnx.io/releases/kubernetes-v1-34-2&lt;/a&gt;&lt;/p&gt;

&lt;ol start="8"&gt;
&lt;li&gt;
&lt;code&gt;Kustomize v5.8.0&lt;/code&gt;: Enhanced patch strategies, support for new resource types, streamlined YAML customization, and better CLI UX.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Full release: &lt;a href="https://www.relnx.io/releases/kustomize-vkustomize-v5-8-0" rel="noopener noreferrer"&gt;https://www.relnx.io/releases/kustomize-vkustomize-v5-8-0&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Urgent: Ingress NGINX Retirement
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;Kubernetes is retiring the Ingress NGINX controller. Users must migrate within 6 months to avoid security risks and lack of maintenance.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;The Gateway API is the recommended replacement as the new Kubernetes ingress standard. Migration guides available at &lt;a href="https://gateway-api.sigs.k8s.io/guides/" rel="noopener noreferrer"&gt;https://gateway-api.sigs.k8s.io/guides/&lt;/a&gt;.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Details: Kubernetes Ingress NGINX Retirement [&lt;a href="https://www.kubernetes.dev/blog/2025/11/12/ingress-nginx-retirement/" rel="noopener noreferrer"&gt;https://www.kubernetes.dev/blog/2025/11/12/ingress-nginx-retirement/&lt;/a&gt;]&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;
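
&lt;p&gt;As a rough sketch of what the migration looks like, a simple Ingress rule translates to a Gateway API HTTPRoute along these lines (the gateway and service names here are illustrative, not from any specific cluster):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: myapp-route
spec:
  parentRefs:
    - name: my-gateway        # the Gateway replacing the Ingress controller
  hostnames:
    - "myapp.example.com"
  rules:
    - matches:
        - path:
            type: PathPrefix
            value: /
      backendRefs:
        - name: myapp-service
          port: 80
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;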

&lt;h2&gt;
  
  
  Full Release List
&lt;/h2&gt;

&lt;p&gt;See full changelogs and updated projects at &lt;a href="https://www.relnx.io/releases" rel="noopener noreferrer"&gt;https://www.relnx.io/releases&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Community Call-to-Action
&lt;/h2&gt;

&lt;p&gt;Share your thoughts on the NGINX migration, discuss favorite new release features, and follow for next week’s updates.&lt;/p&gt;

</description>
      <category>kubernetes</category>
      <category>nginx</category>
      <category>devops</category>
    </item>
    <item>
      <title>Understanding the Operator Capability Model: Defining Operator Functions</title>
      <dc:creator>Ahmed Zidan</dc:creator>
      <pubDate>Thu, 30 Jan 2025 10:35:25 +0000</pubDate>
      <link>https://dev.to/aws-builders/understanding-the-operator-capability-model-defining-operator-functions-55id</link>
      <guid>https://dev.to/aws-builders/understanding-the-operator-capability-model-defining-operator-functions-55id</guid>
      <description>&lt;p&gt;The &lt;strong&gt;Operator Capability Model&lt;/strong&gt;, established by the &lt;strong&gt;&lt;a href="https://operatorframework.io/operator-capabilities/" rel="noopener noreferrer"&gt;Operator Framework&lt;/a&gt;&lt;/strong&gt;, categorizes Kubernetes Operators based on their functionality and maturity. This model serves as a guideline for developers to enhance their Operators while providing users with a clear understanding of what to expect from different Operators.&lt;/p&gt;

&lt;p&gt;This blog will break down the &lt;strong&gt;five capability levels&lt;/strong&gt;, provide &lt;strong&gt;real-world examples from OperatorHub.io&lt;/strong&gt;, and outline the necessary steps to achieve each level.&lt;/p&gt;




&lt;h2&gt;
  
  
  &lt;strong&gt;Level I—Basic Install&lt;/strong&gt;
&lt;/h2&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;Definition&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;Operators at this level handle only the most fundamental tasks—installing the application (Operand) and ensuring it is running. The Operator deploys workloads and conveys their status to administrators but does not handle failures or provide advanced automation.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;Example Operator&lt;/strong&gt;
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://operatorhub.io/operator/ack-prometheusservice-controller" rel="noopener noreferrer"&gt;AWS Controllers for Kubernetes - Amazon Prometheus&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;Steps to Reach Level I&lt;/strong&gt;
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;Package the application using &lt;strong&gt;Deployment, StatefulSet, or DaemonSet&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;Create a &lt;strong&gt;Custom Resource Definition (CRD)&lt;/strong&gt; to represent the application.&lt;/li&gt;
&lt;li&gt;Develop an Operator that reconciles the CRD and ensures the application is deployed.&lt;/li&gt;
&lt;li&gt;Publish the Operator on &lt;strong&gt;OperatorHub.io&lt;/strong&gt;.&lt;/li&gt;
&lt;/ol&gt;
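
&lt;p&gt;As an illustrative sketch of step 2, a minimal CRD might look like the following (the group, kind, and schema are hypothetical placeholders):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: myapps.example.com
spec:
  group: example.com
  names:
    kind: MyApp
    plural: myapps
    singular: myapp
  scope: Namespaced
  versions:
    - name: v1alpha1
      served: true
      storage: true
      schema:
        openAPIV3Schema:
          type: object
          properties:
            spec:
              type: object
              properties:
                replicas:
                  type: integer
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;The Operator then watches &lt;code&gt;MyApp&lt;/code&gt; resources and reconciles them into the corresponding Deployment, StatefulSet, or DaemonSet.&lt;/p&gt;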




&lt;h2&gt;
  
  
  &lt;strong&gt;Level II—Seamless Upgrades&lt;/strong&gt;
&lt;/h2&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;Definition&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;Level II Operators build upon Level I by adding upgrade mechanisms. This means the Operator can update both itself and its Operand smoothly while maintaining backward compatibility and rollback options.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;Example Operator&lt;/strong&gt;
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://operatorhub.io/operator/mongodb-atlas-kubernetes" rel="noopener noreferrer"&gt;MongoDB Atlas Operator&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;Steps to Reach Level II&lt;/strong&gt;
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;Implement &lt;strong&gt;rolling updates&lt;/strong&gt; and &lt;strong&gt;version management&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;Enable &lt;strong&gt;automatic updates&lt;/strong&gt; for both the Operator and its Operand.&lt;/li&gt;
&lt;li&gt;Ensure compatibility with older Operand versions.&lt;/li&gt;
&lt;li&gt;Provide rollback functionality in case of failures.&lt;/li&gt;
&lt;/ol&gt;
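
&lt;p&gt;Step 1 usually builds on Kubernetes' native rolling-update support. A sketch of an Operand Deployment with an explicit update strategy (names and image are illustrative):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-operand
spec:
  replicas: 3
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 1   # keep most replicas serving during an upgrade
      maxSurge: 1
  selector:
    matchLabels:
      app: my-operand
  template:
    metadata:
      labels:
        app: my-operand
    spec:
      containers:
        - name: operand
          image: example/operand:1.2.0
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;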




&lt;h2&gt;
  
  
  &lt;strong&gt;Level III—Full Lifecycle Management&lt;/strong&gt;
&lt;/h2&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;Definition&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;Operators at this level actively manage the Operand's lifecycle, providing advanced features such as:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Backup and restore&lt;/li&gt;
&lt;li&gt;Complex configuration workflows&lt;/li&gt;
&lt;li&gt;Failover and failback mechanisms&lt;/li&gt;
&lt;li&gt;Scaling capabilities (e.g., adding or removing instances)&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;Example Operator&lt;/strong&gt;
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://operatorhub.io/operator/postgresql-operator-dev4devs-com" rel="noopener noreferrer"&gt;PostgreSQL Operator by Dev4Ddevs.com&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;Steps to Reach Level III&lt;/strong&gt;
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;Implement &lt;strong&gt;automatic backup and restore&lt;/strong&gt; capabilities.&lt;/li&gt;
&lt;li&gt;Provide support for &lt;strong&gt;scaling&lt;/strong&gt;, both manual and automatic.&lt;/li&gt;
&lt;li&gt;Include &lt;strong&gt;failover and failback mechanisms&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;Support complex configuration management and dynamic changes.&lt;/li&gt;
&lt;/ol&gt;
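
&lt;p&gt;A common pattern is to surface these lifecycle features directly in the custom resource. A hypothetical Level III CR, with fields invented purely for illustration, might look like:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;apiVersion: example.com/v1alpha1
kind: MyApp
metadata:
  name: myapp-sample
spec:
  replicas: 3
  backup:
    schedule: "0 2 * * *"   # nightly backup at 02:00
    retention: 7            # keep the last 7 backups
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;The Operator's reconcile loop reads these fields and drives the backup, scaling, and failover behavior accordingly.&lt;/p&gt;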




&lt;h2&gt;
  
  
  &lt;strong&gt;Level IV—Deep Insights&lt;/strong&gt;
&lt;/h2&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;Definition&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;At this level, Operators provide detailed insights into both their own performance and that of their Operand. This includes metrics, alerts, and logging.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;Example Operator&lt;/strong&gt;
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://operatorhub.io/operator/prometheus" rel="noopener noreferrer"&gt;Prometheus Operator&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;Steps to Reach Level IV&lt;/strong&gt;
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;Integrate &lt;strong&gt;Prometheus metrics&lt;/strong&gt; and expose them via a &lt;strong&gt;ServiceMonitor&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;Provide &lt;strong&gt;Grafana dashboards&lt;/strong&gt; for real-time monitoring.&lt;/li&gt;
&lt;li&gt;Implement &lt;strong&gt;logging integrations&lt;/strong&gt; (e.g., Fluentd, Loki).&lt;/li&gt;
&lt;li&gt;Define &lt;strong&gt;alerts and Kubernetes Events&lt;/strong&gt; to notify administrators of issues.&lt;/li&gt;
&lt;/ol&gt;
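
&lt;p&gt;For step 1, a ServiceMonitor sketch for a hypothetical operator that exposes a &lt;code&gt;metrics&lt;/code&gt; port (labels and names are illustrative):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: my-operator-metrics
  namespace: monitoring
spec:
  selector:
    matchLabels:
      app: my-operator      # must match the operator's metrics Service labels
  endpoints:
    - port: metrics
      interval: 30s
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;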




&lt;h2&gt;
  
  
  &lt;strong&gt;Level V—Auto Pilot (Self-Healing and Scaling)&lt;/strong&gt;
&lt;/h2&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;Definition&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;Level V Operators achieve full automation, handling day-2 operations autonomously. These include:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Auto-scaling&lt;/strong&gt; based on demand&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Auto-healing&lt;/strong&gt; to recover from failures&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Auto-tuning&lt;/strong&gt; for peak performance&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Abnormality detection&lt;/strong&gt; to identify unexpected behaviors&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;Example Operator&lt;/strong&gt;
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://operatorhub.io/operator/lbconfig-operator" rel="noopener noreferrer"&gt;External Load-Balancer Configuration Operator&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;Steps to Reach Level V&lt;/strong&gt;
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;Implement &lt;strong&gt;predictive auto-scaling&lt;/strong&gt; based on load and historical data.&lt;/li&gt;
&lt;li&gt;Develop &lt;strong&gt;auto-healing mechanisms&lt;/strong&gt; to detect and correct failures.&lt;/li&gt;
&lt;li&gt;Enable &lt;strong&gt;dynamic tuning&lt;/strong&gt; to optimize performance in real time.&lt;/li&gt;
&lt;li&gt;Integrate &lt;strong&gt;machine learning-driven anomaly detection&lt;/strong&gt; for proactive issue mitigation.&lt;/li&gt;
&lt;/ol&gt;
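
&lt;p&gt;Built-in primitives can cover part of step 1. For example, a standard HorizontalPodAutoscaler scales an Operand on CPU demand (names are illustrative), though a true Level V Operator typically embeds this logic, plus predictive signals, in its own controller:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: my-operand-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-operand
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70   # scale out when average CPU exceeds 70%
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;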




&lt;h2&gt;
  
  
  &lt;strong&gt;How to Level Up Your Operator&lt;/strong&gt;
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Start with the Basics:&lt;/strong&gt; Ensure your Operator can deploy and manage a Kubernetes application.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Enable Upgrades:&lt;/strong&gt; Implement rolling updates, backward compatibility, and rollback mechanisms.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Automate Lifecycle Management:&lt;/strong&gt; Provide backup, scaling, and failover support.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Improve Observability:&lt;/strong&gt; Expose metrics, logs, and alerts to enhance monitoring.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Enable Full Automation:&lt;/strong&gt; Implement self-healing, auto-scaling, and auto-tuning mechanisms.&lt;/li&gt;
&lt;/ol&gt;




&lt;h2&gt;
  
  
  &lt;strong&gt;Conclusion&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;The &lt;strong&gt;Operator Capability Model&lt;/strong&gt; serves as a roadmap for improving an Operator’s maturity. Whether you are just starting or aiming for full automation, following this structured approach ensures a more resilient and feature-rich Operator.&lt;/p&gt;

&lt;p&gt;Start by evaluating your current &lt;strong&gt;capability level&lt;/strong&gt;, and follow these steps to level up! 🚀&lt;/p&gt;




&lt;p&gt;For further insights or any questions, connect with me on:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://www.linkedin.com/in/a7medzidan/" rel="noopener noreferrer"&gt;LinkedIn&lt;/a&gt;&lt;/li&gt;
&lt;li&gt; &lt;a href="https://twitter.com/27medzidann" rel="noopener noreferrer"&gt;Twitter&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>k8s</category>
      <category>dailytask</category>
      <category>operator</category>
    </item>
    <item>
      <title>Integrating Kube-Prometheus with Your Operator Using Jsonnet Bundler (jb)</title>
      <dc:creator>Ahmed Zidan</dc:creator>
      <pubDate>Thu, 30 Jan 2025 08:49:17 +0000</pubDate>
      <link>https://dev.to/aws-builders/integrating-kube-prometheus-with-your-operator-using-jsonnet-bundler-jb-5dop</link>
      <guid>https://dev.to/aws-builders/integrating-kube-prometheus-with-your-operator-using-jsonnet-bundler-jb-5dop</guid>
      <description>&lt;p&gt;Observability is a crucial aspect of managing Kubernetes operators effectively. By integrating &lt;strong&gt;Kube-Prometheus&lt;/strong&gt;, you can gain valuable insights into your operator’s health, monitor resource usage, and set up alerting rules to improve reliability. In this guide, we’ll explore how to use &lt;strong&gt;Jsonnet Bundler (jb)&lt;/strong&gt; to integrate &lt;strong&gt;Kube-Prometheus&lt;/strong&gt; into your Kubernetes operator in an efficient and scalable manner.&lt;/p&gt;




&lt;h2&gt;
  
  
  &lt;strong&gt;What is jb (Jsonnet Bundler)?&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Jsonnet Bundler (jb)&lt;/strong&gt; is a package manager for Jsonnet, a powerful templating language used to manage Kubernetes configurations. With jb, you can easily install and manage &lt;strong&gt;Kube-Prometheus&lt;/strong&gt;, a comprehensive monitoring stack that includes &lt;strong&gt;Prometheus Operator, Alertmanager, Grafana, and ServiceMonitors&lt;/strong&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;Why Use jb for Kube-Prometheus?&lt;/strong&gt;
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Simplifies &lt;strong&gt;Kube-Prometheus&lt;/strong&gt; installation and management.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Automates Kubernetes manifest generation from Jsonnet.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Allows easy customization of monitoring configurations.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  &lt;strong&gt;Prerequisites&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;Before proceeding, ensure you have the following installed:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;A &lt;strong&gt;Kubernetes cluster&lt;/strong&gt; (Minikube, Kind, or a cloud-based cluster)&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;kubectl (CLI tool for interacting with Kubernetes)&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;go (Required for operator development)&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;jsonnet and jsonnet-bundler (jb)&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  &lt;strong&gt;Installing jb and jsonnet&lt;/strong&gt;
&lt;/h4&gt;

&lt;p&gt;If you haven’t installed Jsonnet Bundler, install it with:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;
go &lt;span class="nb"&gt;install &lt;/span&gt;github.com/jsonnet-bundler/jsonnet-bundler/cmd/jb@latest

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Verify installation:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;
jb &lt;span class="nt"&gt;--version&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If jsonnet is not installed, install it using:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;
brew &lt;span class="nb"&gt;install &lt;/span&gt;jsonnet  &lt;span class="c"&gt;# MacOS&lt;/span&gt;

&lt;span class="nb"&gt;sudo &lt;/span&gt;apt &lt;span class="nb"&gt;install &lt;/span&gt;jsonnet  &lt;span class="c"&gt;# Ubuntu/Debian&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  &lt;strong&gt;Step 1: Initialize jb in Your Operator Project&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;Navigate to your operator project directory and initialize jb:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;
&lt;span class="nb"&gt;cd &lt;/span&gt;my-operator  &lt;span class="c"&gt;# Navigate to your operator project root&lt;/span&gt;

jb init  &lt;span class="c"&gt;# Initialize Jsonnet Bundler&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This creates a jsonnetfile.json file, which tracks dependencies.&lt;/p&gt;
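&lt;p&gt;The freshly initialized file is essentially empty and looks similar to this:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;{
  "version": 1,
  "dependencies": [],
  "legacyImports": true
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Dependencies added with &lt;code&gt;jb install&lt;/code&gt; will be recorded here.&lt;/p&gt;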




&lt;h2&gt;
  
  
  &lt;strong&gt;Step 2: Add Kube-Prometheus as a Dependency&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;Install &lt;strong&gt;Kube-Prometheus&lt;/strong&gt; as a Jsonnet dependency using jb:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;
jb &lt;span class="nb"&gt;install &lt;/span&gt;github.com/prometheus-operator/kube-prometheus/jsonnet/kube-prometheus@main

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;This command will:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Fetch the &lt;strong&gt;Kube-Prometheus&lt;/strong&gt; package from GitHub.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Store it in the vendor/ directory.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Update jsonnetfile.lock.json with the package version.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Verify the dependency installation:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;
&lt;span class="nb"&gt;ls &lt;/span&gt;vendor/kube-prometheus

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You should see Jsonnet files for dashboards, alerting rules, and ServiceMonitors.&lt;/p&gt;




&lt;h2&gt;
  
  
  &lt;strong&gt;Step 3: Download and Update Example Jsonnet File&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;Instead of manually generating manifests, we can download an example Jsonnet configuration file and a build script for easier customization.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;Download the Example Jsonnet File&lt;/strong&gt;
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;
curl &lt;span class="nt"&gt;-o&lt;/span&gt; example.jsonnet https://raw.githubusercontent.com/prometheus-operator/kube-prometheus/main/example.jsonnet

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  &lt;strong&gt;Download the Build Script&lt;/strong&gt;
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;
curl &lt;span class="nt"&gt;-o&lt;/span&gt; build.sh https://raw.githubusercontent.com/prometheus-operator/kube-prometheus/main/build.sh

&lt;span class="nb"&gt;chmod&lt;/span&gt; +x build.sh

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  &lt;strong&gt;Customize example.jsonnet&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;Open example.jsonnet and update the namespace where Prometheus and Alertmanager will be deployed, and define which namespaces Prometheus should watch:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight jsonnet"&gt;&lt;code&gt;
&lt;span class="k"&gt;local&lt;/span&gt; &lt;span class="nx"&gt;kp&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt;

  &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="s"&gt;'kube-prometheus/main.libsonnet'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;+&lt;/span&gt;

  &lt;span class="p"&gt;{&lt;/span&gt;

    &lt;span class="nx"&gt;values&lt;/span&gt;&lt;span class="p"&gt;+::&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;

      &lt;span class="nx"&gt;common&lt;/span&gt;&lt;span class="p"&gt;+:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;

        &lt;span class="nx"&gt;namespace&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s"&gt;'monitoring'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;

      &lt;span class="p"&gt;},&lt;/span&gt;

      &lt;span class="nx"&gt;prometheus&lt;/span&gt;&lt;span class="p"&gt;+:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;

        &lt;span class="nx"&gt;namespaces&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s"&gt;'monitoring'&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;

      &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="p"&gt;},&lt;/span&gt;

  &lt;span class="p"&gt;};&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  &lt;strong&gt;Step 4: Build and Apply the Manifests&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;Generate Kubernetes YAML manifests using the build.sh script:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;
./build.sh example.jsonnet

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; If you see the error &lt;code&gt;gojsontoyaml: command not found&lt;/code&gt;, install the required tool:&lt;br&gt;
&lt;/p&gt;
&lt;/blockquote&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;
go &lt;span class="nb"&gt;install &lt;/span&gt;github.com/brancz/gojsontoyaml@latest

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Apply the Kube-Prometheus Stack to Kubernetes:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;
kubectl apply &lt;span class="nt"&gt;--server-side&lt;/span&gt; &lt;span class="nt"&gt;-f&lt;/span&gt; manifests/setup/

kubectl apply &lt;span class="nt"&gt;-f&lt;/span&gt; manifests/

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  &lt;strong&gt;This will deploy:&lt;/strong&gt;
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Prometheus Operator&lt;/strong&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Prometheus instance&lt;/strong&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Alertmanager&lt;/strong&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Grafana dashboards&lt;/strong&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;ServiceMonitors&lt;/strong&gt; for monitoring Kubernetes components&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  &lt;strong&gt;Step 5: Verify Monitoring Setup&lt;/strong&gt;
&lt;/h2&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;1. Port-forward Prometheus to Access the UI&lt;/strong&gt;
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;
kubectl port-forward svc/prometheus-k8s 9090:9090 &lt;span class="nt"&gt;-n&lt;/span&gt; monitoring

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  &lt;strong&gt;2. Query Metrics in Prometheus UI&lt;/strong&gt;
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Open &lt;strong&gt;&lt;a href="http://localhost:9090" rel="noopener noreferrer"&gt;http://localhost:9090&lt;/a&gt;&lt;/strong&gt; in a browser.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Search for your custom metrics (e.g., my_operator_reconcile_count).&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;3. View Logs in Prometheus Pod&lt;/strong&gt;
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;
kubectl logs &lt;span class="nt"&gt;-l&lt;/span&gt; &lt;span class="nv"&gt;app&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;prometheus &lt;span class="nt"&gt;-n&lt;/span&gt; monitoring

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  &lt;strong&gt;Conclusion&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;By following this guide, we have successfully:&lt;/p&gt;

&lt;p&gt;✅ Integrated &lt;strong&gt;Kube-Prometheus&lt;/strong&gt; into our Kubernetes operator project.&lt;/p&gt;

&lt;p&gt;✅ Downloaded and customized an &lt;strong&gt;example Jsonnet configuration&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;✅ Used the build.sh script to generate and apply Kubernetes manifests.&lt;/p&gt;

&lt;p&gt;✅ Configured &lt;strong&gt;ServiceMonitor&lt;/strong&gt; to track our operator’s metrics.&lt;/p&gt;

&lt;p&gt;With this setup, you now have a fully functioning &lt;strong&gt;Prometheus monitoring stack&lt;/strong&gt; that provides deep insights into your operator’s performance and health. 🚀&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Have questions or need help? Drop a comment below!&lt;/strong&gt; 👇&lt;/p&gt;

</description>
      <category>k8s</category>
      <category>operator</category>
      <category>dailytask</category>
    </item>
    <item>
      <title>My Certified Kubernetes Administrator (CKA) Exam Experience</title>
      <dc:creator>Ahmed Zidan</dc:creator>
      <pubDate>Thu, 19 Sep 2024 12:05:52 +0000</pubDate>
      <link>https://dev.to/aws-builders/my-certified-kubernetes-administrator-cka-exam-experience-p45</link>
      <guid>https://dev.to/aws-builders/my-certified-kubernetes-administrator-cka-exam-experience-p45</guid>
      <description>&lt;p&gt;Recently, I passed the Certified Kubernetes Administrator (CKA) exam, and I’m excited to share my experience to help others prepare. The exam is practical and task-oriented, and you'll have access to official Kubernetes documentation in case you need to quickly verify anything. &lt;/p&gt;

&lt;p&gt;In this blog, I’ll break down what you need to know and share some useful tips that will make passing the CKA exam feel more approachable.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Exam: What to Expect
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ff2wjded9n2y7h60s7qu1.webp" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ff2wjded9n2y7h60s7qu1.webp" alt="CKA-Exam" width="800" height="800"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The CKA exam covers 10 core domains of Kubernetes knowledge. You'll be asked to perform real-world administrative tasks in a Kubernetes environment. &lt;/p&gt;

&lt;p&gt;Here's a quick breakdown of the key domains you'll encounter and some example questions to help you prepare.&lt;/p&gt;

&lt;h2&gt;
  
  
  1- Application Lifecycle Management
&lt;/h2&gt;

&lt;p&gt;This domain focuses on your ability to manage applications deployed in Kubernetes. You need to understand how to scale, update, and troubleshoot applications.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Example Question:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Create a deployment named myapp with 3 replicas using the nginx image. Scale the deployment to 5 replicas.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Solution:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl create deployment myapp --image=nginx --replicas=3 kubectl scale deployment myapp --replicas=5
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;You should also be familiar with rolling updates and rollbacks:
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl rollout status deployment myapp 
kubectl rollout undo deployment myapp
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  2- Storage
&lt;/h2&gt;

&lt;p&gt;This domain tests your knowledge of Kubernetes storage, such as Persistent Volumes (PV) and Persistent Volume Claims (PVC), storage classes, access modes, and reclaim policies.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Example Question:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Create a PersistentVolumeClaim named xyz, with a storage class X, 20Gi capacity, and a host path &lt;code&gt;/data&lt;/code&gt; with ReadWriteOnce access mode. &lt;/p&gt;

&lt;p&gt;Then, create a pod named mypod using the nginx image, which mounts the PVC at &lt;code&gt;/data&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Solution:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;PersistentVolumeClaim YAML:
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: xyz
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 20Gi
  storageClassName: x
  # note: the host path is configured on the backing PersistentVolume, not the PVC
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;Pod YAML:
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;apiVersiong: v1
kind: Pod
metadata:
  name: mypod
spec:
  Volumes:
    - name: myvol
      persistentVolumeClaim:
        claimName: xyz
  containers:
    - name: mypod-container
      image: nginx
      VolumeMounts:
        - mountPath: /data
          name: myvol  
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  3- Cluster Maintenance
&lt;/h2&gt;

&lt;p&gt;You'll be asked to upgrade nodes or manage cluster versions. This domain tests your knowledge of Kubernetes node maintenance and version management.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Example question:&lt;/strong&gt; &lt;/p&gt;

&lt;p&gt;Upgrade a node to the latest version, matching the control-plane node.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Solution:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;First, compare the versions of the nodes:
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl get nodes
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;Drain the node to be upgraded:
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl drain node1 --diable-evication --ignore-daemonsets --delete-emptydir-data=false
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;Upgrade the Kubernetes components:
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;sudo apt upgrade -y kubelet=1.30.1-1.1 kubectl=1.30.1-1.1 kubeadm=1.30.1-1 --allow-change-held-packages
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  4- Installation Configuration
&lt;/h2&gt;

&lt;p&gt;This domain includes tasks like setting up a Kubernetes cluster or adding new nodes to the existing cluster.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Example Question:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Add a new node (new-node) to the cluster.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Solution:&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;On the control-plane node, generate the join command:
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubeadm token create --print-join-command
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ol&gt;
&lt;li&gt;SSH into the new node and run the join command:
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubeadm join &amp;lt;control-plane-ip&amp;gt;:6443 --token &amp;lt;token&amp;gt; --discovery-token-ca-cert-hash sha256:&amp;lt;hash&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  5- Logging and Monitoring
&lt;/h2&gt;

&lt;p&gt;Understanding how to retrieve and analyze logs and monitor pod performance is essential. You should know how to use &lt;code&gt;kubectl logs&lt;/code&gt; and &lt;code&gt;kubectl top&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Questions&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;Get the logs for a pod and save them to &lt;code&gt;/tmp/pod.log&lt;/code&gt;.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Find the pod with the highest CPU utilization:&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;strong&gt;Solution:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;1.
kubectl logs pod-name &amp;gt; /tmp/pod.log
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;2.
kubectl top pods -A --sort-by=cpu --no-headers | head -n 1
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
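&lt;p&gt;A few related &lt;code&gt;kubectl logs&lt;/code&gt; flags worth practicing (the pod and container names here are just placeholders):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl logs pod-name -c container-name   # logs from a specific container in a multi-container pod
kubectl logs pod-name --previous          # logs from the previous (crashed) instance of the container
kubectl logs -f pod-name                  # stream logs in real time
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;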



&lt;h2&gt;
  
  
  6- Networking
&lt;/h2&gt;

&lt;p&gt;Networking is one of the crucial areas in Kubernetes. You need to understand how Kubernetes services (ClusterIP, NodePort, LoadBalancer) work, as well as how to configure and use Ingress controllers.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Example Question:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Configure an Ingress resource that directs traffic to the nginx-service on path /nginx.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Solution:&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Ingress YAML:
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: nginx-ingress
spec:
  rules:
    - host: myapp.example.com
      http:
        paths:
          - path: /nginx
            pathType: Prefix
            backend:
              service:
                name: nginx-service
                port:
                  number: 80
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  7- Scheduling
&lt;/h2&gt;

&lt;p&gt;You need to demonstrate an understanding of how to schedule pods on specific nodes, use node affinity, taints, and tolerations.&lt;/p&gt;

&lt;p&gt;You also need to understand static Pods and how to create one.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Example Question:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Schedule a pod on a node labeled with env=prod.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Solution:&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Pod YAML with nodeSelector:
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;apiVersion: v1
kind: Pod
metadata:
  name: prod-pod
spec:
  nodeSelector:
    env: prod
  containers:
    - name: nginx-container
      image: nginx
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
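&lt;p&gt;Since this domain also covers taints and tolerations, here is a hypothetical example: taint a node, then allow a pod onto it with a matching toleration (the node name and key/value are placeholders).&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl taint nodes node-1 env=prod:NoSchedule
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;apiVersion: v1
kind: Pod
metadata:
  name: tolerant-pod
spec:
  tolerations:
    - key: "env"
      operator: "Equal"
      value: "prod"
      effect: "NoSchedule"
  containers:
    - name: nginx-container
      image: nginx
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;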



&lt;h2&gt;
  
  
  8- Security
&lt;/h2&gt;

&lt;p&gt;Security covers RBAC, Network Policies, Secrets, and ServiceAccounts.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Example Question:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Create a Network Policy that allows incoming traffic only from pods in the frontend namespace to a pod labeled app=backend in the default namespace on port 80.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Solution:&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Network Policy YAML:
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-frontend
  namespace: default
spec:
  podSelector:
    matchLabels:
      app: backend
  policyTypes:
  - Ingress
  ingress:
  - from:
    - namespaceSelector:
        matchLabels:
          name: frontend
    ports:
    - protocol: TCP
      port: 80
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
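&lt;p&gt;Note that &lt;code&gt;namespaceSelector&lt;/code&gt; matches namespace labels, not namespace names, so this policy only works if the frontend namespace actually carries a &lt;code&gt;name=frontend&lt;/code&gt; label. If it doesn't, you can add one:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl label namespace frontend name=frontend
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;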



&lt;h2&gt;
  
  
  9- Troubleshooting
&lt;/h2&gt;

&lt;p&gt;You’ll need to troubleshoot various issues such as application failure, cluster component failure, and networking issues.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Example:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;One of the nodes in the cluster isn’t in the Ready status. Investigate and resolve the issue.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Answer:&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Check which node isn’t ready:
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl get nodes
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ol&gt;
&lt;li&gt;SSH into the node and check the kubelet status and logs:
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;systemctl status kubelet # to see the status of the kubelet
journalctl -u kubelet # to see the logs from kubelet and undertand how to fis the problem
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ol&gt;
&lt;li&gt;Fix the issue and start kubelet again:
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;systemctl start kubelet
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  10- Validation
&lt;/h2&gt;

&lt;p&gt;Validation involves checking the health and status of your Kubernetes resources to ensure they are running as expected.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Example Question:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Ensure that the pod mypod is in a Running state. If not, investigate and resolve the issue.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Solution:&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Check the pod’s status:
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl get pod mypod
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ol&gt;
&lt;li&gt;If the pod is not in the Running state, describe the pod to investigate further:
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl describe pod mypod
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ol&gt;
&lt;li&gt;Investigate logs or resource configurations to resolve the issue.&lt;/li&gt;
&lt;/ol&gt;
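&lt;p&gt;For that last step, two commands that usually reveal the cause:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl logs mypod
kubectl get events --field-selector involvedObject.name=mypod
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;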

&lt;h2&gt;
  
  
  Essential Commands
&lt;/h2&gt;

&lt;p&gt;Here are some important commands that you'll frequently use during the CKA exam:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Create a deployment:
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl create deployment myapp --image=nginx
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ol&gt;
&lt;li&gt;Expose a deployment using a service:
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl expose deployment myapp --port=80 --target-port=8080 --type=ClusterIP
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ol&gt;
&lt;li&gt;Create a service account:
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl create serviceaccount my-sa
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ol&gt;
&lt;li&gt;Create a Role or ClusterRole:
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl create role|clusterrole myrole --verb=get,list,watch --resource=pods
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ol&gt;
&lt;li&gt;Create a RoleBinding or ClusterRoleBinding:
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl create rolebinding|clusterrolebinding mybinding --role=myrole --serviceaccount=default:my-sa --namespace=default
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ol&gt;
&lt;li&gt;Create an Ingress resource:
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl create ingress mying --rule="myapp.example.com/nginx*=nginx-service:80"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ol&gt;
&lt;li&gt;Remember to memorize the Pod YAML configuration — this will save you a lot of time when dealing with Pod-related tasks.&lt;/li&gt;
&lt;/ol&gt;
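&lt;p&gt;If the Pod YAML slips your mind, you can also generate a skeleton with a client-side dry run and edit it from there:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl run mypod --image=nginx --dry-run=client -o yaml &amp;gt; pod.yaml
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;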

&lt;h2&gt;
  
  
  Final Exam Tips
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;Copy and Paste: You can copy and paste text from the exam environment to save time. Use the following shortcuts:&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;- Copy: Ctrl+Shift+C
- Paste: Ctrl+Shift+V
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;ol&gt;
&lt;li&gt;You will be able to use the Kubernetes documentation during the exam, but you won't have time to dig through it, so make sure you practice navigating it beforehand.&lt;/li&gt;
&lt;/ol&gt;
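&lt;p&gt;A small shell setup many candidates use to save keystrokes (optional, purely a convenience):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;alias k=kubectl
export do="--dry-run=client -o yaml"   # e.g. k run mypod --image=nginx $do &amp;gt; pod.yaml
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;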

&lt;p&gt;You can also keep the kubectl cheat sheet open during the exam in case you want to confirm something.&lt;/p&gt;

&lt;p&gt;For further insights or any questions, connect with me on:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://www.linkedin.com/in/a7medzidan/" rel="noopener noreferrer"&gt;LinkedIn&lt;/a&gt;&lt;/li&gt;
&lt;li&gt; &lt;a href="https://twitter.com/27medzidann" rel="noopener noreferrer"&gt;Twitter&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

</description>
    </item>
    <item>
      <title>Elastic Goes Open Source Again: A Cautionary Tale for Terraform and Others</title>
      <dc:creator>Ahmed Zidan</dc:creator>
      <pubDate>Sat, 31 Aug 2024 14:59:00 +0000</pubDate>
      <link>https://dev.to/aws-builders/elastic-goes-open-source-again-a-cautionary-tale-for-terraform-and-others-33g5</link>
      <guid>https://dev.to/aws-builders/elastic-goes-open-source-again-a-cautionary-tale-for-terraform-and-others-33g5</guid>
      <description>&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0kuo40n6bhrdxlo2j091.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0kuo40n6bhrdxlo2j091.png" alt="Elatic-opensource-again" width="800" height="336"&gt;&lt;/a&gt;&lt;br&gt;
In a surprising turn of events, &lt;a href="https://www.elastic.co/blog/elasticsearch-is-open-source-again" rel="noopener noreferrer"&gt;Elastic has decided to embrace open source once more&lt;/a&gt;. If you recall, about three years ago, Elasticsearch made headlines by changing its licensing model. The reason? A conflict with AWS that led to a significant shift in the open-source community. Frustrated by Elastic’s decision, the community forked the last truly open-source version of Elasticsearch and birthed a new project called OpenSearch.&lt;/p&gt;

&lt;p&gt;Fast forward to today, OpenSearch has not only survived but thrived. With a roadmap that's increasingly divergent from Elasticsearch and a community that's fiercely supportive, OpenSearch has carved out its own identity, adding numerous features and innovations that distinguish it from its predecessor. This success story is a testament to the power of community-driven development.&lt;/p&gt;

&lt;p&gt;Now, Elastic has announced a return to its open-source roots, introducing two new licenses in a bid to regain the trust of the community. But the question remains: Is it too late? For over three years, developers, enterprises, and enthusiasts have been investing their time and resources into OpenSearch. Elastic's pivot back to open source may be seen as an attempt to reclaim lost ground, but whether they can successfully bring the community back remains to be seen.&lt;/p&gt;

&lt;p&gt;This scenario should serve as a cautionary tale for others in the tech world—particularly for HashiCorp, the creators of Terraform. Recently, HashiCorp’s decision to shift its licensing has sparked controversy, leading to the emergence of OpenTofu, a community-driven fork of Terraform. Just as OpenSearch grew and thrived after the Elasticsearch fork, OpenTofu has the potential to do the same.&lt;/p&gt;

&lt;p&gt;The lesson here is clear: When a project decides to move away from its open-source foundations, it risks alienating its most dedicated users and contributors. The community doesn’t wait around; it adapts, forks, and moves forward. If Terraform maintains its current course, the future may hold a similar story to that of Elasticsearch—where the fork, OpenTofu, evolves with its own unique features and gains the trust of the open-source community.&lt;/p&gt;

&lt;p&gt;In the ever-evolving landscape of software development, the true strength lies in the hands of the community. Companies that underestimate this might find themselves playing catch-up, trying to win back the very people they once took for granted.&lt;/p&gt;

&lt;p&gt;So, what does this mean for developers and companies today? It’s a reminder that open-source software is more than just code—it’s about trust, collaboration, and shared goals. And when that trust is broken, the community will find a way to keep moving forward, with or without the original creators.&lt;/p&gt;

&lt;p&gt;For further insights or any questions, connect with me on:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://www.linkedin.com/in/a7medzidan/" rel="noopener noreferrer"&gt;LinkedIn&lt;/a&gt;&lt;/li&gt;
&lt;li&gt; &lt;a href="https://twitter.com/27medzidann" rel="noopener noreferrer"&gt;Twitter&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>elastic</category>
      <category>aws</category>
      <category>community</category>
    </item>
    <item>
      <title>Optimizing EKS Fargate: Exposing K8s Service as a LoadBalancer</title>
      <dc:creator>Ahmed Zidan</dc:creator>
      <pubDate>Wed, 06 Mar 2024 09:24:21 +0000</pubDate>
      <link>https://dev.to/aws-builders/optimizing-eks-fargate-exposing-k8s-service-as-a-loadbalancer-3pce</link>
      <guid>https://dev.to/aws-builders/optimizing-eks-fargate-exposing-k8s-service-as-a-loadbalancer-3pce</guid>
      <description>&lt;p&gt;Fargate, a groundbreaking technology, streamlines container orchestration by providing on-demand, perfectly-sized compute capacity. You escape the complexities of manual provisioning, configuring, and scaling of virtual machines. It's the go-to choice when the nature of workloads is uncertain, and rapid deployment is paramount, saving valuable time in capacity planning.&lt;/p&gt;

&lt;p&gt;However, leveraging EKS Fargate poses challenges, especially when exposing K8s services as LoadBalancers. Is it worth it? Today, we unravel this mystery and provide a smooth path for its implementation.&lt;/p&gt;

&lt;h2&gt;
  
  
  Challenges with LoadBalancer Type in EKS Fargate:
&lt;/h2&gt;

&lt;p&gt;In standard EKS, exposing services as LoadBalancers is straightforward. You define your service manifest with type: LoadBalancer, and the magic happens. But in EKS Fargate, you might notice your LoadBalancer stuck in a "pending" status.&lt;/p&gt;

&lt;p&gt;Example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;apiVersion&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;v1&lt;/span&gt;
&lt;span class="na"&gt;kind&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Service&lt;/span&gt;
&lt;span class="na"&gt;metadata&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;nlb-sample-service&lt;/span&gt;
  &lt;span class="na"&gt;namespace&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;test-1&lt;/span&gt;
  &lt;span class="na"&gt;annotations&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;service.beta.kubernetes.io/aws-load-balancer-type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;external&lt;/span&gt;
    &lt;span class="na"&gt;service.beta.kubernetes.io/aws-load-balancer-nlb-target-type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;ip&lt;/span&gt;
    &lt;span class="na"&gt;service.beta.kubernetes.io/aws-load-balancer-scheme&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;internet-facing&lt;/span&gt;
&lt;span class="na"&gt;spec&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;ports&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;port&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;80&lt;/span&gt;
      &lt;span class="na"&gt;targetPort&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;80&lt;/span&gt;
      &lt;span class="na"&gt;protocol&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;TCP&lt;/span&gt;
  &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;LoadBalancer&lt;/span&gt;
  &lt;span class="na"&gt;selector&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;app&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;nginx&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Upon applying this in Fargate, you'll witness a perpetually pending external LoadBalancer.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;~ kubectl get svc

NAME                 TYPE           CLUSTER-IP      EXTERNAL-IP   PORT&lt;span class="o"&gt;(&lt;/span&gt;S&lt;span class="o"&gt;)&lt;/span&gt;        AGE
nlb-sample-service   LoadBalancer   172.20.39.142   &amp;lt;pending&amp;gt;     80:30843/TCP   3m14s
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Checking the service description reveals an "Ensuring LoadBalancer" event.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;Events:
  Type    Reason                Age    From                Message
  &lt;span class="nt"&gt;----&lt;/span&gt;    &lt;span class="nt"&gt;------&lt;/span&gt;                &lt;span class="nt"&gt;----&lt;/span&gt;   &lt;span class="nt"&gt;----&lt;/span&gt;                &lt;span class="nt"&gt;-------&lt;/span&gt;
  Normal  EnsuringLoadBalancer  3m55s  service-controller  Ensuring load balancer
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;How to Make It Run Smoothly?&lt;/p&gt;

&lt;h2&gt;
  
  
  Steps to Deploy K8s Service as LoadBalancer Type in EKS Fargate:
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;Deploy the AWS Load Balancer Controller to your Amazon EKS cluster&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Before you start using any service of type &lt;code&gt;LoadBalancer&lt;/code&gt;, you need to deploy the AWS Load Balancer Controller to your Fargate cluster.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Download an IAM policy that allows the AWS Load Balancer Controller to make calls to AWS APIs on your behalf, using the following command.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;A. For AWS GovCloud (US-East) or AWS GovCloud (US-West) AWS Regions&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;  curl &lt;span class="nt"&gt;-O&lt;/span&gt; https://raw.githubusercontent.com/kubernetes-sigs/aws-load-balancer-controller/v2.5.4/docs/install/iam_policy_us-gov.json
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;B. All other AWS Regions&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;  curl &lt;span class="nt"&gt;-O&lt;/span&gt; https://raw.githubusercontent.com/kubernetes-sigs/aws-load-balancer-controller/v2.5.4/docs/install/iam_policy.json
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;Create an IAM policy using the policy downloaded in the previous step. If you downloaded iam_policy_us-gov.json, change iam_policy.json to iam_policy_us-gov.json before running the command.
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;aws iam create-policy &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--policy-name&lt;/span&gt; AWSLoadBalancerControllerIAMPolicy &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--policy-document&lt;/span&gt; file://iam_policy.json
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;Create a service account named aws-load-balancer-controller in the kube-system namespace for the AWS Load Balancer Controller. Use the following command:
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;eksctl create iamserviceaccount &lt;span class="se"&gt;\ &lt;/span&gt;   
&lt;span class="nt"&gt;--cluster&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;YOUR_CLUSTER_NAME &lt;span class="se"&gt;\ &lt;/span&gt; 
&lt;span class="nt"&gt;--namespace&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;kube-system &lt;span class="se"&gt;\ &lt;/span&gt; 
&lt;span class="nt"&gt;--name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;aws-load-balancer-controller &lt;span class="se"&gt;\ &lt;/span&gt; 
&lt;span class="nt"&gt;--attach-policy-arn&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;arn:aws:iam::&amp;lt;AWS_ACCOUNT_ID&amp;gt;:policy/AWSLoadBalancerControllerIAMPolicy &lt;span class="se"&gt;\ &lt;/span&gt; 
&lt;span class="nt"&gt;--override-existing-serviceaccounts&lt;/span&gt; &lt;span class="se"&gt;\ &lt;/span&gt; 
&lt;span class="nt"&gt;--approve&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The output should be something like the following.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;2024-03-06 16:27:17 &lt;span class="o"&gt;[&lt;/span&gt;ℹ]  1 iamserviceaccount &lt;span class="o"&gt;(&lt;/span&gt;kube-system/aws-load-balancer-controller&lt;span class="o"&gt;)&lt;/span&gt; was included &lt;span class="o"&gt;(&lt;/span&gt;based on the include/exclude rules&lt;span class="o"&gt;)&lt;/span&gt;
2024-03-06 16:27:17 &lt;span class="o"&gt;[!]&lt;/span&gt;  metadata of serviceaccounts that exist &lt;span class="k"&gt;in &lt;/span&gt;Kubernetes will be updated, as &lt;span class="nt"&gt;--override-existing-serviceaccounts&lt;/span&gt; was &lt;span class="nb"&gt;set
&lt;/span&gt;2024-03-06 16:27:17 &lt;span class="o"&gt;[&lt;/span&gt;ℹ]  1 task: &lt;span class="o"&gt;{&lt;/span&gt; 
    2 sequential sub-tasks: &lt;span class="o"&gt;{&lt;/span&gt; 
        .......
2024-03-06 16:27:50 &lt;span class="o"&gt;[&lt;/span&gt;ℹ]  created serviceaccount &lt;span class="s2"&gt;"kube-system/aws-load-balancer-controller"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;Install the AWS Load Balancer Controller with Helm using the following command.
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;helm repo add eks https://aws.github.io/eks-charts

helm upgrade &lt;span class="nt"&gt;--install&lt;/span&gt; aws-load-balancer-controller eks/aws-load-balancer-controller &lt;span class="nt"&gt;--set&lt;/span&gt; &lt;span class="nv"&gt;clusterName&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;Your-Cluster-Name &lt;span class="nt"&gt;-n&lt;/span&gt; kube-system &lt;span class="nt"&gt;--set&lt;/span&gt; serviceAccount.create&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nb"&gt;false&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
&lt;span class="nt"&gt;--set&lt;/span&gt; serviceAccount.name&lt;span class="o"&gt;=&lt;/span&gt;aws-load-balancer-controller &lt;span class="se"&gt;\&lt;/span&gt;
&lt;span class="nt"&gt;--set&lt;/span&gt; &lt;span class="nv"&gt;region&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;Your-region &lt;span class="se"&gt;\&lt;/span&gt;
&lt;span class="nt"&gt;--set&lt;/span&gt; &lt;span class="nv"&gt;vpcId&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;Your-VPC
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Note that here we have to set &lt;code&gt;region&lt;/code&gt; and &lt;code&gt;vpcId&lt;/code&gt;. Why?&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;The Amazon EC2 instance metadata service (IMDS) isn't available to Pods that are deployed to Fargate nodes. &lt;/p&gt;

&lt;p&gt;So if you don't specify the region and your VPC ID, the pod will not be able to get them from the metadata service.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;ul&gt;
&lt;li&gt;Verify that the controller is installed.
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;kubectl get deployment &lt;span class="nt"&gt;-n&lt;/span&gt; kube-system aws-load-balancer-controller

NAME                           READY   UP-TO-DATE   AVAILABLE   AGE
aws-load-balancer-controller   2/2     2            2           84s
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  2. Ready to Deploy Your Service:
&lt;/h4&gt;

&lt;p&gt;Now that the controller is in place, reapply your LoadBalancer service.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;~ kubectl get svc  

NAME                 TYPE           CLUSTER-IP      EXTERNAL-IP                                                                  PORT&lt;span class="o"&gt;(&lt;/span&gt;S&lt;span class="o"&gt;)&lt;/span&gt;        AGE
nlb-sample-service   LoadBalancer   172.20.176.78   k8s-test1-nlbsampl-xxxx.onaws.com   80:31406/TCP   95m
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now you can seamlessly use the service of type LoadBalancer in EKS Fargate.&lt;/p&gt;

&lt;h2&gt;
  
  
  Things to Consider When Exposing Your Service as a LoadBalancer:
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;Network Load Balancers and Application Load Balancers (ALBs) can be used with Fargate with IP targets only.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Once you deploy the AWS Load Balancer controller in your cluster, it becomes the default class for all your services with type LoadBalancer.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;
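&lt;p&gt;If you want to be explicit about which controller reconciles a given service (rather than relying on the default takeover), recent Kubernetes versions let you set &lt;code&gt;loadBalancerClass&lt;/code&gt; in the service spec; for the AWS Load Balancer Controller the NLB class looks like the following, though you should verify the value against the controller version you run:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;spec:
  type: LoadBalancer
  loadBalancerClass: service.k8s.aws/nlb
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;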

&lt;h2&gt;
  
  
  Conclusion:
&lt;/h2&gt;

&lt;p&gt;EKS Fargate offers incredible simplicity and flexibility. With the AWS LoadBalancer Controller, hurdles in exposing K8s services as LoadBalancers are conquered. Seamless integration of this essential feature enriches your container orchestration experience.&lt;/p&gt;

&lt;p&gt;For further insights or any questions, connect with me on:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt; &lt;a href="https://www.linkedin.com/in/a7medzidan/" rel="noopener noreferrer"&gt;Linkedin&lt;/a&gt; &lt;/li&gt;
&lt;li&gt; &lt;a href="https://twitter.com/27medzidann" rel="noopener noreferrer"&gt;Twitter&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>aws</category>
      <category>k8s</category>
      <category>eks</category>
    </item>
    <item>
      <title>Efficiently Scaling Disk Size in StatefulSet on EKS</title>
      <dc:creator>Ahmed Zidan</dc:creator>
      <pubDate>Thu, 25 Jan 2024 09:17:40 +0000</pubDate>
      <link>https://dev.to/aws-builders/efficiently-scaling-disk-size-in-statefulset-on-eks-3l27</link>
      <guid>https://dev.to/aws-builders/efficiently-scaling-disk-size-in-statefulset-on-eks-3l27</guid>
      <description>&lt;p&gt;Dealing with &lt;code&gt;stateful set&lt;/code&gt; in k8s is one of the most challenges specially Dealing with stateful sets in Kubernetes, particularly when scaling persistence volume, presents several challenges. It's crucial to consider factors such as data maintenance, cost management, minimizing downtime, and establishing effective monitoring for the future.&lt;/p&gt;

&lt;h2&gt;
  
  
  Scaling Disk Size in a StatefulSet
&lt;/h2&gt;

&lt;p&gt;Let's streamline the process of increasing the size of a Neo4j StatefulSet in an EKS cluster:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Note: The following steps will result in downtime, so ensure your business can accommodate this.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;ol&gt;
&lt;li&gt;Set/ensure &lt;code&gt;allowVolumeExpansion: true&lt;/code&gt;:&lt;/li&gt;
&lt;/ol&gt;

&lt;ul&gt;
&lt;li&gt;Edit your Storage class using the command:
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl edit storageClass gp2
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;Add or confirm the presence of allowVolumeExpansion: true:
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;provisioner: kubernetes.io/aws-ebs
......
volumeBindingMode: WaitForFirstConsumer
allowVolumeExpansion: true
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ol&gt;
&lt;li&gt;Delete your StatefulSet (STS):
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl delete sts --cascade=orphan neo4j-cluster
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ol&gt;
&lt;li&gt;Edit your Persistent Volume Claim (PVC):
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt; kubectl edit pvc data-neo4j-cluster-0 
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;Modify &lt;code&gt;spec.resources.requests.storage&lt;/code&gt; to the desired size.&lt;/li&gt;
&lt;/ul&gt;
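&lt;p&gt;After editing, the relevant part of the PVC should look like this (50Gi is just an example target size):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;spec:
  resources:
    requests:
      storage: 50Gi
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;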

&lt;ol&gt;
&lt;li&gt;Update your Helm Chart:&lt;/li&gt;
&lt;/ol&gt;

&lt;ul&gt;
&lt;li&gt;Adjust your Helm chart values.yaml to reflect the changes:
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;volumes:
  data:
    mode: defaultStorageClass
    defaultStorageClass:
      accessModes:
        - ReadWriteOnce
      requests:
        storage: 50Gi
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ol&gt;
&lt;li&gt;Upgrade your Helm Chart:
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;helm upgrade --install neo4j-cluster neo4j/neo4j --namespace neo4j --values values.yaml --version v5.15.0
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Automation
&lt;/h2&gt;

&lt;p&gt;I always look for opportunities to automate, so you can combine all these steps into a simple bash script that does the job for you.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;#!/usr/bin/env bash

kubectl delete sts STATEFULSET_NAME
kubectl patch pvc PVC_NAME -p '{"spec": {"resources": {"requests": {"storage": "50Gi"}}}}'
helm upgrade --install neo4j-cluster neo4j/neo4j --namespace neo4j --values values.yaml --version v5.15.0
kubectl get pvc
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For further insights or any questions, connect with me on:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt; &lt;a href="https://www.linkedin.com/in/a7medzidan/" rel="noopener noreferrer"&gt;Linkedin&lt;/a&gt; &lt;/li&gt;
&lt;li&gt; &lt;a href="https://twitter.com/27medzidann" rel="noopener noreferrer"&gt;Twitter&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>eks</category>
      <category>k8s</category>
      <category>aws</category>
      <category>dailytask</category>
    </item>
    <item>
      <title>Mastering Google Cloud Developer Exam</title>
      <dc:creator>Ahmed Zidan</dc:creator>
      <pubDate>Tue, 23 Jan 2024 02:24:25 +0000</pubDate>
      <link>https://dev.to/ahmedzidan/mastering-google-cloud-developer-exam-19bl</link>
      <guid>https://dev.to/ahmedzidan/mastering-google-cloud-developer-exam-19bl</guid>
      <description>&lt;p&gt;Embarking on the journey to become a Professional Cloud Developer? Here's a comprehensive guide to help you ace the exam by mastering key tools, practices, and services.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fyp18o97ea7s33k1tzuyg.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fyp18o97ea7s33k1tzuyg.png" alt="Professional Cloud Developer" width="800" height="320"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Building &amp;amp; Testing Applications 🛠️
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Develop Locally:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="https://minikube.sigs.k8s.io/docs/start/" rel="noopener noreferrer"&gt;minikube&lt;/a&gt;: Local Kubernetes for easy learning and development.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="https://skaffold.dev" rel="noopener noreferrer"&gt;Skaffold&lt;/a&gt;: Fast, repeatable, simple container and Kubernetes development.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="https://www.packer.io/" rel="noopener noreferrer"&gt;Packer&lt;/a&gt;: Automate image builds for efficiency.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="https://buildpacks.io/" rel="noopener noreferrer"&gt;buildpacks&lt;/a&gt;: Transform your application source code into images that can run on any cloud.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="https://code.google.com/" rel="noopener noreferrer"&gt;Google code&lt;/a&gt;: Use Google code directly, or integrate it with &lt;code&gt;VS Code&lt;/code&gt; as an extension.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="https://cloud.google.com/pubsub/docs/emulator" rel="noopener noreferrer"&gt;Google Emulator&lt;/a&gt;: Emulate GCP services for local application development; currently available for &lt;code&gt;bigtable&lt;/code&gt;, &lt;code&gt;datastore&lt;/code&gt;, &lt;code&gt;firestore&lt;/code&gt;, and &lt;code&gt;pub/sub&lt;/code&gt;.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
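&lt;p&gt;As a sketch of how these tools fit together, the following inner dev loop starts a local cluster with minikube and hands the build/deploy cycle to Skaffold (it assumes a &lt;code&gt;skaffold.yaml&lt;/code&gt; exists in the current directory; nothing here is specific to a real project):&lt;/p&gt;

```shell
#!/usr/bin/env bash
# Hypothetical local dev loop: minikube for the cluster, Skaffold for the build/deploy cycle.
set -euo pipefail

# Start a local single-node Kubernetes cluster
minikube start

# Build images directly inside minikube's Docker daemon, skipping a registry push
eval "$(minikube docker-env)"

# Watch the source tree; rebuild and redeploy on every change
skaffold dev
```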

&lt;p&gt;&lt;strong&gt;Build Your App:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;code&gt;Artifact&lt;/code&gt;: a collection of source code, dependencies, configuration files, binaries, etc., which can be built using different processes.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;code&gt;Google Container Registry&lt;/code&gt;: Store your container images.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;code&gt;Artifact Registry&lt;/code&gt;: The next generation of Container Registry. Store, manage, and secure your build artifacts.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;code&gt;Cloud Source Repositories&lt;/code&gt;: Offers private Git repositories.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;code&gt;Cloud Build&lt;/code&gt;: Build your CI/CD pipelines; nicely integrated with other Google services.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
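&lt;p&gt;A minimal sketch of that build path, wiring Artifact Registry and Cloud Build together (the project ID, region, repository, and image names are all placeholders):&lt;/p&gt;

```shell
#!/usr/bin/env bash
# Hypothetical build flow: create an Artifact Registry repo, then build and push with Cloud Build.
set -euo pipefail

PROJECT="my-project"        # placeholder project ID
REGION="us-central1"        # placeholder region
REPO="my-images"            # placeholder repository name

# Create a Docker-format repository in Artifact Registry
gcloud artifacts repositories create "$REPO" \
  --repository-format=docker --location="$REGION" --project="$PROJECT"

# Build the image with Cloud Build and push it to the new repository
gcloud builds submit \
  --tag "$REGION-docker.pkg.dev/$PROJECT/$REPO/my-app:v1" --project="$PROJECT"
```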

&lt;p&gt;&lt;strong&gt;Testing Tools:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;&lt;code&gt;Load tests&lt;/code&gt;: where you stress your application with a heavy load.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Run before putting the application into production&lt;/li&gt;
&lt;li&gt;Design to simulate the real-world traffic as closely as possible &lt;/li&gt;
&lt;li&gt;Test the maximum load you expect to encounter &lt;/li&gt;
&lt;li&gt;Test how your Google Cloud costs increase as the number of users increases&lt;/li&gt;
&lt;li&gt;Test the application when traffic suddenly increases&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;

&lt;p&gt;&lt;code&gt;Resilience tests&lt;/code&gt;: where you see what happens when various infrastructure components fail.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Test infrastructure failures &lt;/li&gt;
&lt;li&gt;App should keep running&lt;/li&gt;
&lt;li&gt;Example: terminate a random instance within an autoscaling group&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;

&lt;p&gt;&lt;code&gt;Vulnerability tests&lt;/code&gt;: where you check whether your application can withstand attacks.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Peer review: developers check each other's code&lt;/li&gt;
&lt;li&gt;Integrate static code analysis tools, such as &lt;code&gt;the vulnerability scanning feature of Google's Container Analysis service&lt;/code&gt;, into your CI/CD pipeline&lt;/li&gt;
&lt;li&gt;Run &lt;code&gt;penetration tests&lt;/code&gt; at least once a year&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;
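&lt;p&gt;The resilience idea above can be sketched in a few lines of bash. Everything here is a placeholder (group, zone, and instance names are hypothetical); the commented-out &lt;code&gt;gcloud&lt;/code&gt; calls show where the real list and terminate steps would go:&lt;/p&gt;

```shell
#!/usr/bin/env bash
# Hypothetical resilience drill: terminate one random VM from a managed instance group.

GROUP="my-mig"          # placeholder group name
ZONE="us-central1-a"    # placeholder zone

# In a real drill the list would come from:
#   gcloud compute instance-groups managed list-instances "$GROUP" --zone "$ZONE" --format='value(name)'
instances=("web-abc1" "web-abc2" "web-abc3")

# Pick a random victim; the app should keep serving traffic without it
victim=${instances[RANDOM % ${#instances[@]}]}
echo "Terminating $victim"

# gcloud compute instance-groups managed delete-instances "$GROUP" --zone "$ZONE" --instances "$victim"
```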

&lt;h2&gt;
  
  
  Deploying Applications 🚀
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Explore:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;code&gt;GKE (Standard &amp;amp; Autopilot):&lt;/code&gt; Google Kubernetes Engine for microservices.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;code&gt;Cloud Run:&lt;/code&gt; Fully managed service for deploying containers.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;code&gt;Pub/Sub:&lt;/code&gt; Scalable messaging service for decoupled services.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;code&gt;App Engine:&lt;/code&gt; Build server-side rendered websites.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;code&gt;A/B Testing:&lt;/code&gt; Test a feature on a small set of users.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;code&gt;Feature Flags:&lt;/code&gt; Toggle features on and off without redeploying.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;code&gt;Backward Compatibility:&lt;/code&gt; Ensure app works with older versions.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
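&lt;p&gt;A feature flag can be as simple as an environment variable read at startup. A minimal sketch (the flag name and code paths are hypothetical), defaulting to the old behaviour when the flag is unset:&lt;/p&gt;

```shell
#!/usr/bin/env bash
# Hypothetical feature flag: toggle a code path via an environment variable, off by default.

FEATURE_NEW_CHECKOUT="${FEATURE_NEW_CHECKOUT:-false}"

if [ "$FEATURE_NEW_CHECKOUT" = "true" ]; then
  checkout="new checkout flow"
else
  checkout="old checkout flow"
fi
echo "$checkout"
```

Rolling the flag out to a subset of users (A/B testing) is then a matter of setting the variable for a fraction of deployments or request buckets.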

&lt;h2&gt;
  
  
  Managing Deployed Applications 📊
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Leverage Google Services:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;code&gt;Cloud Logging:&lt;/code&gt; Store logs securely at an exabyte scale.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;code&gt;Cloud Monitoring:&lt;/code&gt; Store application metrics.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;code&gt;SLO/SLI-based Alerting:&lt;/code&gt; Create SLI-based alerts.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;code&gt;Cloud Profiler:&lt;/code&gt; Continuous CPU and memory profiling.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;code&gt;Cloud Trace:&lt;/code&gt; Distributed tracing system for latency data.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;code&gt;Open Telemetry:&lt;/code&gt; Portable telemetry for effective observability.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;code&gt;Error Reporting:&lt;/code&gt; Real-time exception monitoring and alerting.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
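&lt;p&gt;For SLO-based alerting it helps to know your error budget. A quick back-of-the-envelope calculation for a 99.9% availability SLO over a 30-day month:&lt;/p&gt;

```shell
#!/usr/bin/env bash
# Error-budget arithmetic: allowed downtime for a 99.9% availability SLO in a 30-day month.

SLO=99.9
MINUTES=$((30 * 24 * 60))   # 43200 minutes in a 30-day month

# awk handles the floating-point math: (100 - SLO)% of the month
budget=$(awk -v slo="$SLO" -v m="$MINUTES" 'BEGIN { printf "%.1f", (100 - slo) / 100 * m }')
echo "Allowed downtime: $budget minutes"   # prints: Allowed downtime: 43.2 minutes
```

Burning through that budget faster than expected is exactly what an SLI-based alert should catch.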

&lt;h2&gt;
  
  
  Designing Cloud-Native Applications 🌐
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Skills to Showcase:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;code&gt;Deploying Code:&lt;/code&gt; GKE, Cloud Run, App Engine, Cloud Function, VM.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;code&gt;Caching Solutions:&lt;/code&gt; Memcache, Redis.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;code&gt;Asynchronous Apps:&lt;/code&gt; Apache Kafka, Pub/Sub, Eventarc.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;code&gt;Databases:&lt;/code&gt; Cloud SQL, Cloud Spanner, BigQuery, Firestore, Datastore.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
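&lt;p&gt;For the asynchronous pattern, a Pub/Sub round trip from the CLI looks like this (topic and subscription names are placeholders): the publisher and subscriber never talk to each other directly, so either side can scale or fail independently.&lt;/p&gt;

```shell
#!/usr/bin/env bash
# Hypothetical Pub/Sub round trip: create a topic and subscription, publish, then pull.
set -euo pipefail

TOPIC="orders"              # placeholder topic name
SUB="orders-worker"         # placeholder subscription name

gcloud pubsub topics create "$TOPIC"
gcloud pubsub subscriptions create "$SUB" --topic="$TOPIC"

# The message waits in the subscription until a consumer pulls and acks it
gcloud pubsub topics publish "$TOPIC" --message='{"order_id": 42}'
gcloud pubsub subscriptions pull "$SUB" --auto-ack --limit=1
```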

&lt;p&gt;Ready to ace the exam? Dive deeper into each category and own your Google Cloud Developer journey! 💡🚀&lt;/p&gt;

&lt;p&gt;Connect with me on:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt; &lt;a href="https://www.linkedin.com/in/a7medzidan/" rel="noopener noreferrer"&gt;LinkedIn&lt;/a&gt; &lt;/li&gt;
&lt;li&gt; &lt;a href="https://twitter.com/27medzidann" rel="noopener noreferrer"&gt;Twitter&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>gcp</category>
      <category>dailytask</category>
    </item>
  </channel>
</rss>
