<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Andrew</title>
    <description>The latest articles on DEV Community by Andrew (@and-ratajski).</description>
    <link>https://dev.to/and-ratajski</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3253620%2Fae9a6073-4d1d-4d5a-b0cd-432390b7c501.jpeg</url>
      <title>DEV Community: Andrew</title>
      <link>https://dev.to/and-ratajski</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/and-ratajski"/>
    <language>en</language>
    <item>
      <title>Setting Up a GitLab CI/CD Pipeline with DigitalOcean Kubernetes</title>
      <dc:creator>Andrew</dc:creator>
      <pubDate>Sun, 05 Oct 2025 16:47:31 +0000</pubDate>
      <link>https://dev.to/and-ratajski/setting-up-a-gitlab-cicd-pipeline-with-digitalocean-kubernetes-1imn</link>
      <guid>https://dev.to/and-ratajski/setting-up-a-gitlab-cicd-pipeline-with-digitalocean-kubernetes-1imn</guid>
      <description>&lt;p&gt;&lt;em&gt;A pragmatic guide to container-based deployments that won't break the bank&lt;/em&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Overview
&lt;/h2&gt;

&lt;p&gt;This guide walks you through setting up a complete CI/CD pipeline using GitLab's free tier and a &lt;a href="https://www.digitalocean.com/?refcode=d8ad772258e7&amp;amp;utm_campaign=Referral_Invite&amp;amp;utm_medium=Referral_Program&amp;amp;utm_source=badge" rel="noopener noreferrer"&gt;DigitalOcean&lt;/a&gt; Kubernetes cluster. The approach is straightforward: build Docker images, push them to GitLab's Container Registry, and roll them out to your Kubernetes cluster. No fancy GitOps operators, no complex Helm charts—just good old-fashioned image tags and &lt;code&gt;kubectl&lt;/code&gt; commands.&lt;/p&gt;

&lt;h2&gt;
  
  
  Assumptions
&lt;/h2&gt;

&lt;p&gt;ℹ You're comfortable with containerization concepts and have written a Dockerfile or two&lt;br&gt;
ℹ You understand basic Kubernetes primitives (Deployments, Services, Secrets)&lt;br&gt;
ℹ You have Kubernetes CRDs already defined in your cluster (or a separate repository)&lt;br&gt;
ℹ Your deployment strategy is image-tag-based: new code = new image = new deployment&lt;br&gt;
ℹ You're okay with an imperative deployment approach (we'll discuss the trade-offs)&lt;/p&gt;
&lt;h2&gt;
  
  
  Prerequisites
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Accounts &amp;amp; Infrastructure:&lt;/strong&gt;&lt;br&gt;
✅ GitLab account (free tier is sufficient)&lt;br&gt;
✅ DigitalOcean account with an existing Kubernetes cluster (even the cheapest $12/mo cluster works perfectly)&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Knowledge:&lt;/strong&gt;&lt;br&gt;
🧠 Basic understanding of Docker and container registries&lt;br&gt;
🧠 Familiarity with Kubernetes concepts: Secrets, Deployments, and rollouts&lt;br&gt;
🧠 Git basics (you're already here, so you're good)&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Project Setup:&lt;/strong&gt;&lt;br&gt;
➡️ A &lt;code&gt;docker-compose.ci.yml&lt;/code&gt; file that defines your build configuration&lt;br&gt;
➡️ A Dockerfile that accepts a &lt;code&gt;BUILD_HASH&lt;/code&gt; argument:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight docker"&gt;&lt;code&gt;...
# Build arguments for version information aligned with docker image tag and git commit hash
&lt;span class="k"&gt;ARG&lt;/span&gt;&lt;span class="s"&gt; BUILD_HASH=NOT_SET&lt;/span&gt;
&lt;span class="k"&gt;ENV&lt;/span&gt;&lt;span class="s"&gt; BUILD_HASH=${BUILD_HASH}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;em&gt;Note: The &lt;code&gt;docker-compose.ci.yml&lt;/code&gt; expects the &lt;code&gt;CI_COMMIT_SHORT_SHA&lt;/code&gt; variable, which gets passed to your Dockerfile as the &lt;code&gt;BUILD_HASH&lt;/code&gt; argument.&lt;/em&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;    &lt;span class="s"&gt;...&lt;/span&gt;
    &lt;span class="s"&gt;build&lt;/span&gt;&lt;span class="err"&gt;:&lt;/span&gt;
     &lt;span class="na"&gt;context&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;./your-context-dir&lt;/span&gt;
     &lt;span class="na"&gt;dockerfile&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Dockerfile&lt;/span&gt;
     &lt;span class="na"&gt;args&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
       &lt;span class="na"&gt;BUILD_HASH&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;${CI_COMMIT_SHORT_SHA}"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Step 1: Create GitLab Access Token
&lt;/h2&gt;

&lt;p&gt;Since GitLab's free tier doesn't support group-level access tokens, we'll create one at the project level:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Navigate to your project in GitLab&lt;/li&gt;
&lt;li&gt;Go to &lt;strong&gt;Settings → Access Tokens&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Click &lt;strong&gt;Add new token&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Configure the token:

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Name:&lt;/strong&gt; Something descriptive like "k8s-pull-token"&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Scopes:&lt;/strong&gt; Select at minimum:

&lt;ul&gt;
&lt;li&gt;&lt;code&gt;read_repository&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;read_registry&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;write_registry&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;Click &lt;strong&gt;Create project access token&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Important:&lt;/strong&gt; Copy the token immediately—you won't see it again!&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxxqk0unvdd12alnvh2so.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxxqk0unvdd12alnvh2so.png" alt=" " width="800" height="189"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 2: Create DigitalOcean Access Token
&lt;/h2&gt;

&lt;p&gt;Your CI/CD pipeline needs to authenticate with DigitalOcean to deploy to your Kubernetes cluster:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Go to &lt;a href="https://cloud.digitalocean.com/account/api/tokens" rel="noopener noreferrer"&gt;DigitalOcean API Tokens&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Click &lt;strong&gt;Generate New Token&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Configure with minimal required scopes:

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Read:&lt;/strong&gt; &lt;code&gt;image&lt;/code&gt;, &lt;code&gt;kubernetes&lt;/code&gt;, &lt;code&gt;load_balancer&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Update:&lt;/strong&gt; &lt;code&gt;kubernetes&lt;/code&gt;, &lt;code&gt;load_balancer&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Other:&lt;/strong&gt; &lt;code&gt;access_cluster&lt;/code&gt; (kubernetes)&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;Save the token securely&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6oifrf37t77hlnz0dxr0.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6oifrf37t77hlnz0dxr0.png" alt=" " width="800" height="652"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Security Note:&lt;/strong&gt; After the initial authentication, &lt;code&gt;doctl&lt;/code&gt; fetches short-lived cluster credentials for the runner, so this persistent token is used only for that first handshake.&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 3: Create Kubernetes Docker Registry Secret
&lt;/h2&gt;

&lt;p&gt;Your Kubernetes cluster needs credentials to pull images from GitLab's Container Registry. Here's how to create the secret:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;kubectl create secret docker-registry regcred &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--docker-server&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;registry.gitlab.com/&amp;lt;your-group&amp;gt;/&amp;lt;your-project&amp;gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--docker-username&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&amp;lt;gitlab-username&amp;gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--docker-password&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&amp;lt;token-from-step-1&amp;gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--docker-email&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&amp;lt;your-email&amp;gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--dry-run&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;client &lt;span class="nt"&gt;-o&lt;/span&gt; yaml &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; regcred-secret.yml
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Your generated &lt;code&gt;regcred-secret.yml&lt;/code&gt; should look like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;apiVersion&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;v1&lt;/span&gt;
&lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;kubernetes.io/dockerconfigjson&lt;/span&gt;
&lt;span class="na"&gt;kind&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Secret&lt;/span&gt;
&lt;span class="na"&gt;metadata&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;regcred&lt;/span&gt;
  &lt;span class="na"&gt;annotations&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;reflector.v1.k8s.emberstack.com/reflection-allowed&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;true"&lt;/span&gt;
    &lt;span class="na"&gt;reflector.v1.k8s.emberstack.com/reflection-auto-enabled&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;true"&lt;/span&gt;
&lt;span class="na"&gt;data&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;.dockerconfigjson&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;&amp;lt;base64-encoded-secret&amp;gt;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
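&lt;p&gt;&lt;em&gt;For the curious: the base64 blob is nothing magic—it's just a JSON document carrying your registry credentials. Here's a rough sketch of what gets encoded (the username and token below are hypothetical placeholders; never commit real ones):&lt;/em&gt;&lt;/p&gt;

```shell
# Sketch of the JSON that kubectl encodes into .dockerconfigjson
# (username and token below are hypothetical placeholders)
REGISTRY="registry.gitlab.com"
USERNAME="gitlab-username"
TOKEN="glpat-example-token"
# The "auth" field is base64 of "username:password"
AUTH=$(printf '%s:%s' "$USERNAME" "$TOKEN" | base64)
DOCKERCFG=$(printf '{"auths":{"%s":{"username":"%s","password":"%s","auth":"%s"}}}' \
  "$REGISTRY" "$USERNAME" "$TOKEN" "$AUTH")
# This JSON, base64-encoded, is the value stored under data: .dockerconfigjson
printf '%s' "$DOCKERCFG" | base64
```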



&lt;p&gt;&lt;strong&gt;Pro tip:&lt;/strong&gt; Notice the &lt;code&gt;reflector&lt;/code&gt; annotations? If you're using &lt;a href="https://github.com/emberstack/kubernetes-reflector" rel="noopener noreferrer"&gt;Emberstack's Kubernetes Reflector&lt;/a&gt;, this automatically syncs the secret across namespaces. One secret to rule them all. If you're not using Reflector, you'll need to create this secret in each namespace that needs to pull images.&lt;/p&gt;
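&lt;p&gt;&lt;em&gt;If you do use Reflector, you can also scope which namespaces receive the mirrored secret via its annotations. A sketch (the namespace names here are hypothetical):&lt;/em&gt;&lt;/p&gt;

```yaml
metadata:
  name: regcred
  annotations:
    reflector.v1.k8s.emberstack.com/reflection-allowed: "true"
    reflector.v1.k8s.emberstack.com/reflection-allowed-namespaces: "staging,production"
    reflector.v1.k8s.emberstack.com/reflection-auto-enabled: "true"
    reflector.v1.k8s.emberstack.com/reflection-auto-namespaces: "staging,production"
```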

&lt;p&gt;Apply the secret to your cluster:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;kubectl apply &lt;span class="nt"&gt;-f&lt;/span&gt; regcred-secret.yml
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Step 4: Configure GitLab CI/CD Variables
&lt;/h2&gt;

&lt;p&gt;Add your DigitalOcean token to GitLab's CI/CD variables:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;In your GitLab project, go to &lt;strong&gt;Settings → CI/CD&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Expand the &lt;strong&gt;Variables&lt;/strong&gt; section&lt;/li&gt;
&lt;li&gt;Click &lt;strong&gt;Add variable&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Configure:

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Key:&lt;/strong&gt; &lt;code&gt;DO_ACCESS_TOKEN&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Value:&lt;/strong&gt; Your DigitalOcean token from Step 2&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Type:&lt;/strong&gt; Variable&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Flags:&lt;/strong&gt; Check "Mask variable" (for security)&lt;/li&gt;
&lt;li&gt;Leave other flags unchecked unless you need environment-specific protection&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  Step 5: Understanding the Pipeline
&lt;/h2&gt;

&lt;p&gt;Now for the main event—the &lt;code&gt;.gitlab-ci.yml&lt;/code&gt; file. This pipeline has three stages that mirror a typical deployment workflow:&lt;/p&gt;

&lt;h3&gt;
  
  
  Pipeline Stages Overview
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;stages&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;test&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;build&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;deploy&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Simple, sequential, predictable. Let's break down each stage:&lt;/p&gt;

&lt;h3&gt;
  
  
  Stage 1: &lt;code&gt;docker-test&lt;/code&gt;
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;docker-test&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;stage&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;test&lt;/span&gt;
  &lt;span class="na"&gt;image&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;docker:dind&lt;/span&gt;
  &lt;span class="na"&gt;services&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;docker:dind"&lt;/span&gt;
  &lt;span class="na"&gt;rules&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;if&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;$CI_PIPELINE_SOURCE == "merge_request_event"&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;if&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;$CI_COMMIT_BRANCH&lt;/span&gt;
  &lt;span class="na"&gt;script&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;echo Placeholder for running tests&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;What it does:&lt;/strong&gt; Runs your test suite in a Docker-in-Docker environment.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;When it runs:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;On all merge requests&lt;/li&gt;
&lt;li&gt;On all branch commits&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Variables used:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;CI_PIPELINE_SOURCE&lt;/code&gt;: GitLab's built-in variable indicating how the pipeline was triggered&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;CI_COMMIT_BRANCH&lt;/code&gt;: The branch being built&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;em&gt;Currently, this is a placeholder. Replace the echo with actual test commands like &lt;code&gt;docker compose -f docker-compose.ci.yml run tests&lt;/code&gt; or similar.&lt;/em&gt;&lt;/p&gt;
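&lt;p&gt;&lt;em&gt;For instance, assuming your &lt;code&gt;docker-compose.ci.yml&lt;/code&gt; defined a hypothetical &lt;code&gt;tests&lt;/code&gt; service, the script section could become:&lt;/em&gt;&lt;/p&gt;

```yaml
  script:
    - docker compose --file docker-compose.ci.yml build tests
    - docker compose --file docker-compose.ci.yml run --rm tests
```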

&lt;h3&gt;
  
  
  Stage 2: &lt;code&gt;docker-build&lt;/code&gt;
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;docker-build&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;stage&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;build&lt;/span&gt;
  &lt;span class="na"&gt;image&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;docker:dind&lt;/span&gt;
  &lt;span class="na"&gt;services&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;docker:dind"&lt;/span&gt;
  &lt;span class="na"&gt;needs&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;docker-test&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;
  &lt;span class="na"&gt;rules&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="c1"&gt;# Build without push on merge requests to main&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;if&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;$CI_PIPELINE_SOURCE == "merge_request_event" &amp;amp;&amp;amp; $CI_MERGE_REQUEST_TARGET_BRANCH_NAME == "main"&lt;/span&gt;
      &lt;span class="na"&gt;variables&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;SHOULD_PUSH&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;false"&lt;/span&gt;
    &lt;span class="c1"&gt;# Build and push on commits to main branch&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;if&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;$CI_COMMIT_BRANCH == "main"&lt;/span&gt;
      &lt;span class="na"&gt;variables&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;SHOULD_PUSH&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;true"&lt;/span&gt;
  &lt;span class="na"&gt;script&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;echo "Commit SHA is $CI_COMMIT_SHORT_SHA"&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;echo "$CI_REGISTRY_PASSWORD" | docker login $CI_REGISTRY -u $CI_REGISTRY_USER --password-stdin&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;docker compose --file docker-compose.ci.yml build&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="pi"&gt;|&lt;/span&gt;
      &lt;span class="s"&gt;if [ "$SHOULD_PUSH" = "true" ]; then&lt;/span&gt;
        &lt;span class="s"&gt;echo "Pushing image to registry..."&lt;/span&gt;
        &lt;span class="s"&gt;docker push registry.gitlab.com/group/project/image:$CI_COMMIT_SHORT_SHA&lt;/span&gt;
      &lt;span class="s"&gt;else&lt;/span&gt;
        &lt;span class="s"&gt;echo "Skipping push (merge request build)"&lt;/span&gt;
      &lt;span class="s"&gt;fi&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;What it does:&lt;/strong&gt; Builds your Docker image and conditionally pushes it to GitLab's Container Registry.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;When it runs:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;On merge requests targeting &lt;code&gt;main&lt;/code&gt; (builds but doesn't push)&lt;/li&gt;
&lt;li&gt;On commits to the &lt;code&gt;main&lt;/code&gt; branch (builds and pushes)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Variables used:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;CI_COMMIT_SHORT_SHA&lt;/code&gt;: Short Git commit SHA used as image tag&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;CI_REGISTRY&lt;/code&gt;: GitLab's container registry URL (automatically provided)&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;CI_REGISTRY_USER&lt;/code&gt;: Registry username (automatically provided)&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;CI_REGISTRY_PASSWORD&lt;/code&gt;: Registry password (automatically provided)&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;CI_PIPELINE_SOURCE&lt;/code&gt;: Pipeline trigger source&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;CI_MERGE_REQUEST_TARGET_BRANCH_NAME&lt;/code&gt;: Target branch for merge requests&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;CI_COMMIT_BRANCH&lt;/code&gt;: Current branch name&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;SHOULD_PUSH&lt;/code&gt;: Custom variable controlling whether to push the image&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;The smart bit:&lt;/strong&gt; Merge requests get a full build validation without polluting your registry. Only successful merges to &lt;code&gt;main&lt;/code&gt; result in pushed images. This keeps your registry clean and your deployments intentional.&lt;/p&gt;
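&lt;p&gt;&lt;em&gt;One possible refinement: GitLab's predefined &lt;code&gt;CI_REGISTRY_IMAGE&lt;/code&gt; variable already points at your project's registry path, so the push step can drop the hard-coded address. A sketch, assuming your compose file tags the image the same way:&lt;/em&gt;&lt;/p&gt;

```yaml
    - |
      if [ "$SHOULD_PUSH" = "true" ]; then
        docker push "$CI_REGISTRY_IMAGE/image:$CI_COMMIT_SHORT_SHA"
      fi
```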

&lt;h3&gt;
  
  
  Stage 3: &lt;code&gt;k8s-deploy&lt;/code&gt;
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;k8s-deploy&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;stage&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;deploy&lt;/span&gt;
  &lt;span class="na"&gt;image&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;debian:bookworm-slim&lt;/span&gt;
  &lt;span class="na"&gt;needs&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;docker-build&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;
  &lt;span class="na"&gt;rules&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;if&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;$CI_COMMIT_BRANCH == "main"&lt;/span&gt;
      &lt;span class="na"&gt;variables&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;K8S_CLUSTER&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;66250a2e-6ag4-48e6-a857-a578c754fa3b"&lt;/span&gt;
        &lt;span class="na"&gt;K8S_NAMESPACE&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;your-deployment-namespace"&lt;/span&gt;
        &lt;span class="na"&gt;K8S_DEPLOYMENT&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;your-deployment-name"&lt;/span&gt;
  &lt;span class="na"&gt;before_script&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;apt-get update &amp;amp;&amp;amp; apt-get install -y curl ca-certificates&lt;/span&gt;
    &lt;span class="c1"&gt;# Install doctl&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;cd /tmp&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;curl -sL https://github.com/digitalocean/doctl/releases/download/v1.104.0/doctl-1.104.0-linux-amd64.tar.gz | tar -xzv&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;mv doctl /usr/local/bin/&lt;/span&gt;
    &lt;span class="c1"&gt;# Install kubectl&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;curl -LO "https://dl.k8s.io/release/$(curl -L -s https://dl.k8s.io/release/stable.txt)/bin/linux/amd64/kubectl"&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;chmod +x kubectl &amp;amp;&amp;amp; mv kubectl /usr/local/bin/&lt;/span&gt;
    &lt;span class="c1"&gt;# Authenticate with DigitalOcean and configure kubectl&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;doctl auth init --access-token $DO_ACCESS_TOKEN&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;doctl kubernetes cluster kubeconfig save $K8S_CLUSTER&lt;/span&gt;
  &lt;span class="na"&gt;script&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;echo "Deploying image with tag $CI_COMMIT_SHORT_SHA to Kubernetes..."&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;kubectl -n $K8S_NAMESPACE set image deployment.apps/$K8S_DEPLOYMENT container-name=registry.gitlab.com/group/project/image:$CI_COMMIT_SHORT_SHA&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;echo "Waiting for rollout to complete..."&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;kubectl -n $K8S_NAMESPACE rollout status deployment.apps/$K8S_DEPLOYMENT --timeout=5m&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;echo "Deployment successful!"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;What it does:&lt;/strong&gt; Deploys your new image to the Kubernetes cluster.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;When it runs:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Only on commits to the &lt;code&gt;main&lt;/code&gt; branch&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Variables used:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;K8S_CLUSTER&lt;/code&gt;: Your DigitalOcean Kubernetes cluster ID&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;K8S_NAMESPACE&lt;/code&gt;: Target Kubernetes namespace&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;K8S_DEPLOYMENT&lt;/code&gt;: Name of the Deployment to update&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;DO_ACCESS_TOKEN&lt;/code&gt;: DigitalOcean API token (from Step 4)&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;CI_COMMIT_SHORT_SHA&lt;/code&gt;: Image tag to deploy&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;The deployment dance:&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Setup tooling:&lt;/strong&gt; Installs &lt;code&gt;doctl&lt;/code&gt; (DigitalOcean CLI) and &lt;code&gt;kubectl&lt;/code&gt; in a minimal Debian container&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Authenticate:&lt;/strong&gt; Uses your DO token to authenticate and fetch cluster credentials&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Update deployment:&lt;/strong&gt; Uses &lt;code&gt;kubectl set image&lt;/code&gt; to update the container image tag&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Wait and verify:&lt;/strong&gt; Monitors the rollout status with a 5-minute timeout&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;strong&gt;Important:&lt;/strong&gt; The &lt;code&gt;kubectl rollout status&lt;/code&gt; command exits with status 1 if the deployment fails. This means your pipeline will fail if pods don't come up healthy, which is exactly what you want—fast feedback on broken deployments.&lt;/p&gt;
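&lt;p&gt;&lt;em&gt;If you'd rather have the pipeline roll back automatically instead of just failing, one option is to chain a &lt;code&gt;rollout undo&lt;/code&gt; onto the status check (a sketch, using the same variables as above):&lt;/em&gt;&lt;/p&gt;

```yaml
    - kubectl -n $K8S_NAMESPACE rollout status deployment.apps/$K8S_DEPLOYMENT --timeout=5m || (kubectl -n $K8S_NAMESPACE rollout undo deployment.apps/$K8S_DEPLOYMENT; exit 1)
```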

&lt;h2&gt;
  
  
  The Imperative Approach: Trade-offs and Considerations
&lt;/h2&gt;

&lt;p&gt;Let's address the elephant in the room: this is an imperative deployment strategy. We're directly telling Kubernetes what to do with &lt;code&gt;kubectl set image&lt;/code&gt;, not declaring desired state in Git (GitOps) or using sophisticated deployment tools.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Drawbacks:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;No Git-based audit trail of what's deployed (only CI/CD logs)&lt;/li&gt;
&lt;li&gt;Rollbacks require re-running old pipelines or manual intervention&lt;/li&gt;
&lt;li&gt;State drift if someone manually changes the deployment&lt;/li&gt;
&lt;li&gt;Doesn't scale well to complex, multi-service deployments&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Why it works here:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;You're only updating the image tag—the simplest possible change&lt;/li&gt;
&lt;li&gt;Your Kubernetes CRDs are version-controlled elsewhere&lt;/li&gt;
&lt;li&gt;The pipeline is the single deployment pathway (no manual kubectl cowboys)&lt;/li&gt;
&lt;li&gt;It's dead simple to understand and debug&lt;/li&gt;
&lt;li&gt;Perfect for small teams and straightforward applications&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;As long as you're disciplined about only using this pipeline for deployments and keeping your CRDs under version control, this approach is pragmatic and effective.&lt;/p&gt;

&lt;h2&gt;
  
  
  Testing Your Pipeline
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;Create a feature branch and make a small change&lt;/li&gt;
&lt;li&gt;Push and watch the &lt;code&gt;docker-test&lt;/code&gt; and &lt;code&gt;docker-build&lt;/code&gt; stages run&lt;/li&gt;
&lt;li&gt;Open a merge request to &lt;code&gt;main&lt;/code&gt;—the build should run but skip the push&lt;/li&gt;
&lt;li&gt;Merge to &lt;code&gt;main&lt;/code&gt; and watch the full pipeline execute&lt;/li&gt;
&lt;li&gt;Check your Kubernetes cluster: &lt;code&gt;kubectl -n your-deployment-namespace get pods -w&lt;/code&gt;
&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;You now have a working CI/CD pipeline that costs roughly $12/month (just the Kubernetes cluster—GitLab and the CI runners are free). Not bad for automated deployments that go from git push to production in minutes.&lt;/p&gt;

&lt;p&gt;The beauty of this approach is its simplicity. No vendor lock-in, no complex tooling, just Docker, Kubernetes, and a dash of CI/CD glue. Sure, it's not the most sophisticated setup you'll ever see, but it works, it's maintainable, and most importantly—you actually understand what's happening.&lt;/p&gt;

&lt;p&gt;Now go forth and deploy with confidence. And when your friends ask how you did it, you can show them!&lt;/p&gt;

</description>
      <category>cicd</category>
      <category>gitlab</category>
      <category>kubernetes</category>
      <category>automation</category>
    </item>
    <item>
      <title>Building Event-Driven Architecture with MSK and Lambda: The Python Developer's Guide to Not Shooting Yourself in the Foot</title>
      <dc:creator>Andrew</dc:creator>
      <pubDate>Tue, 22 Jul 2025 20:13:50 +0000</pubDate>
      <link>https://dev.to/and-ratajski/building-event-driven-architecture-with-msk-and-lambda-the-python-developers-guide-to-not-5f7k</link>
      <guid>https://dev.to/and-ratajski/building-event-driven-architecture-with-msk-and-lambda-the-python-developers-guide-to-not-5f7k</guid>
      <description>&lt;p&gt;&lt;em&gt;Breaking free from traditional Kafka patterns when AWS does the heavy lifting&lt;/em&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  The Allure of "Serverless" Kafka
&lt;/h2&gt;

&lt;p&gt;Picture this: you're tasked with building an event-driven solution for your business. Naturally, you're a serverless enthusiast, so you look at AWS, and what do you see?&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;AWS MSK for managed Kafka? Check ✅&lt;/li&gt;
&lt;li&gt;Python Lambda for serverless compute? Check ✅&lt;/li&gt;
&lt;li&gt;The &lt;code&gt;confluent-kafka&lt;/code&gt; library for that sweet, sweet performance? Double check ✅✅&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;You fire up your IDE, start writing familiar Kafka consumer code with &lt;code&gt;.poll()&lt;/code&gt; loops, and then... reality hits. This isn't your typical Kafka setup. Welcome to the world of Lambda Event Source Mappings (ESM), where everything you know about Kafka consumers gets turned upside down.&lt;/p&gt;

&lt;p&gt;After building multiple production EDA systems with this exact stack, I've learned that success isn't about fighting the constraints — it's about embracing them. Here's how I'd now start an onboarding session:&lt;/p&gt;

&lt;h2&gt;
  
  
  The Mental Model Shift Nobody Warns You About
&lt;/h2&gt;

&lt;h3&gt;
  
  
  From Pull to Push: Your Consumer Doesn't Consume Anymore
&lt;/h3&gt;

&lt;p&gt;The biggest mind-bender? &lt;strong&gt;You don't write Kafka consumers anymore.&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# What you THINK you'll write (traditional Kafka)
&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;traditional_kafka_consumer&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="n"&gt;consumer&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;confluent_kafka&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;Consumer&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;config&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;consumer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;subscribe&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;user-events&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;

    &lt;span class="k"&gt;while&lt;/span&gt; &lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;messages&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;consumer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;poll&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;timeout&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;1.0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;message&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="nf"&gt;process_message&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;message&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# What you ACTUALLY write (Lambda ESM)
&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;lambda_handler&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;event&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="c1"&gt;# ESM already consumed the messages for you
&lt;/span&gt;    &lt;span class="c1"&gt;# event['records'] contains the batch
&lt;/span&gt;    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;record&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;event&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;records&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;
        &lt;span class="nf"&gt;process_message&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;record&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;AWS Event Source Mapping becomes your consumer. It polls Kafka, manages offsets, handles retries, and pushes batches to your Lambda. You just get handed a batch of messages and told "deal with it."&lt;/p&gt;
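&lt;p&gt;One wrinkle the simplified snippets gloss over: for MSK and self-managed Kafka event sources, &lt;code&gt;event['records']&lt;/code&gt; is a dict keyed by &lt;code&gt;topic-partition&lt;/code&gt; strings, and keys and values arrive base64-encoded. A minimal decoding sketch (&lt;code&gt;process_message&lt;/code&gt; is a hypothetical placeholder for your business logic):&lt;/p&gt;

```python
import base64

def process_message(topic_partition, key, value):
    # Hypothetical placeholder for your actual business logic.
    print(topic_partition, key, value)

def lambda_handler(event, context):
    processed = 0
    # 'records' maps "topic-partition" strings to lists of records.
    for topic_partition, records in event["records"].items():
        for record in records:
            # Keys and values are base64-encoded in the ESM payload.
            value = base64.b64decode(record["value"])
            key = base64.b64decode(record["key"]) if record.get("key") else None
            process_message(topic_partition, key, value)
            processed += 1
    return processed
```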

&lt;p&gt;This shift brings some harsh realities:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;❌ No control over polling intervals&lt;/strong&gt;&lt;br&gt;
ESM decides when to poll (500ms batching window). You can't implement backpressure or custom polling strategies.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;❌ Little control over the number of pollers&lt;/strong&gt;&lt;br&gt;
ESM &lt;a href="https://docs.aws.amazon.com/lambda/latest/dg/msk-scaling-modes.html" rel="noopener noreferrer"&gt;event pollers&lt;/a&gt; are supposed to scale automatically with load, but in practice this generates frequent consumer group rebalances (&lt;a href="https://aws.amazon.com/about-aws/whats-new/2024/11/aws-lambda-provisioned-mode-kafka-esms/" rel="noopener noreferrer"&gt;provisioned mode&lt;/a&gt; does help).&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;❌ Batch size becomes critical&lt;/strong&gt;&lt;br&gt;
Configure it wrong, and you'll either overwhelm your Lambda (too big) or waste invocations (too small). Start with 10-50 messages and tune based on your processing time.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;❌ No per-message error handling&lt;/strong&gt;&lt;br&gt;
One message fails? The entire batch gets reprocessed. Design for idempotency from day one.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Batch Failure Reality Check
&lt;/h2&gt;

&lt;p&gt;Here's the kicker that catches everyone off-guard: &lt;strong&gt;Lambda failure semantics are brutal for Kafka&lt;/strong&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;lambda_handler&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;event&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;processed_count&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;

    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;record&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;event&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;records&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;
        &lt;span class="k"&gt;try&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;process_message&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;record&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  &lt;span class="c1"&gt;# This might fail on message #47
&lt;/span&gt;            &lt;span class="n"&gt;processed_count&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
        &lt;span class="k"&gt;except&lt;/span&gt; &lt;span class="nb"&gt;Exception&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;logger&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;error&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Failed processing message: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="k"&gt;raise&lt;/span&gt;  &lt;span class="c1"&gt;# Entire batch gets retried
&lt;/span&gt;
    &lt;span class="n"&gt;logger&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;info&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Successfully processed &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;processed_count&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; messages&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="c1"&gt;# If we get here, all messages succeeded
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;When your Lambda throws an exception:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;⚠️ All processed messages get reprocessed&lt;/li&gt;
&lt;li&gt;⚠️ The failing message gets another chance
&lt;/li&gt;
&lt;li&gt;⚠️ Messages after the failure also get reprocessed&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Forget about exactly-once semantics.&lt;/strong&gt; You're in at-least-once territory now. Make everything idempotent or suffer the consequences.&lt;/p&gt;
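&lt;p&gt;What does idempotent look like in practice? Kafka guarantees that the topic/partition/offset triple uniquely identifies a record, so it makes a natural deduplication key. A minimal sketch of the pattern, assuming an in-memory set stands in for a durable store (in a real Lambda you'd use something like a conditional write to DynamoDB, since execution environments come and go):&lt;/p&gt;

```python
# Sketch only: in a real Lambda this set would be a durable store (e.g. a
# conditional write to DynamoDB). In-memory state only survives within a
# single warm execution environment.
_processed: set = set()

def process_idempotently(record) -> bool:
    """Apply side effects at most once per record; return False on replays."""
    msg_id = (record["topic"], record["partition"], record["offset"])
    if msg_id in _processed:
        return False  # record replayed by a retried batch: skip it
    # ... perform the actual side effects here ...
    _processed.add(msg_id)
    return True
```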

&lt;h2&gt;
  
  
  Dead Letter Queues: Your Safety Net
&lt;/h2&gt;

&lt;p&gt;Configure a Dead Letter Topic &lt;strong&gt;before&lt;/strong&gt; you go to production. Trust me on this one.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# ESM Configuration (Terraform example)
&lt;/span&gt;&lt;span class="n"&gt;resource&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;aws_lambda_event_source_mapping&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;kafka_trigger&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="n"&gt;event_source_arn&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;aws_msk_cluster&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;cluster&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;arn&lt;/span&gt;
  &lt;span class="n"&gt;function_name&lt;/span&gt;    &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;aws_lambda_function&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;processor&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;arn&lt;/span&gt;

  &lt;span class="n"&gt;topics&lt;/span&gt;                             &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;user-events&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
  &lt;span class="n"&gt;batch_size&lt;/span&gt;                         &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;25&lt;/span&gt;
  &lt;span class="n"&gt;maximum_batching_window_in_seconds&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;

  &lt;span class="c1"&gt;# This saves your bacon
&lt;/span&gt;  &lt;span class="n"&gt;destination_config&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;on_failure&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="n"&gt;destination_arn&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;aws_sns_topic&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;dlq&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;arn&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;

  &lt;span class="c1"&gt;# Retry 3 times before giving up
&lt;/span&gt;  &lt;span class="n"&gt;maximum_retry_attempts&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Without DLQ setup, poison messages will block your entire partition indefinitely. I've seen production systems grind to a halt because of one malformed message.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Producer Side: Where C Meets Python
&lt;/h2&gt;

&lt;p&gt;Here's where things get interesting. While your Lambda receives messages through ESM, you'll still need to &lt;strong&gt;produce&lt;/strong&gt; messages to other topics. This is where &lt;code&gt;confluent-kafka&lt;/code&gt; shines — and where logging becomes critical.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;logging&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;confluent_kafka&lt;/span&gt;

&lt;span class="c1"&gt;# DO THIS: Pass your logger to the producer
&lt;/span&gt;&lt;span class="n"&gt;logger&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;logging&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getLogger&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;__name__&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;producer&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;confluent_kafka&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;Producer&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;bootstrap.servers&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;environ&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;MSK_BOOTSTRAP_SERVERS&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;security.protocol&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;SASL_SSL&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;sasl.mechanisms&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;AWS_MSK_IAM&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;sasl.username&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;AWS_MSK_IAM&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;sasl.password&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;AWS_MSK_IAM&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;},&lt;/span&gt; &lt;span class="n"&gt;logger&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;logger&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  &lt;span class="c1"&gt;# ← This line saves hours of debugging
&lt;/span&gt;
&lt;span class="c1"&gt;# Without the logger, librdkafka errors vanish into the void
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Why does this matter? The &lt;code&gt;confluent-kafka&lt;/code&gt; library is a wrapper around the C library &lt;code&gt;librdkafka&lt;/code&gt;. When network issues, authentication failures, or broker problems occur, those errors happen in C-land. Without explicitly passing your logger, you'll see your messages disappear into the ether with zero indication of what went wrong.&lt;/p&gt;
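&lt;p&gt;The logger surfaces client-level errors, but per-message delivery outcomes are reported through &lt;code&gt;confluent-kafka&lt;/code&gt;'s &lt;code&gt;on_delivery&lt;/code&gt; callback, which fires from &lt;code&gt;poll()&lt;/code&gt; or &lt;code&gt;flush()&lt;/code&gt;. A minimal sketch:&lt;/p&gt;

```python
import logging

logger = logging.getLogger(__name__)

def delivery_report(err, msg):
    """confluent-kafka's on_delivery contract: (KafkaError or None, Message)."""
    if err is not None:
        logger.error("Delivery failed for %s [%s]: %s",
                     msg.topic(), msg.partition(), err)
    else:
        logger.info("Delivered to %s [%s] @ offset %s",
                    msg.topic(), msg.partition(), msg.offset())

# Usage, assuming a configured Producer named `producer`:
# producer.produce("user-events", value=payload, on_delivery=delivery_report)
# producer.flush(timeout=10)  # callbacks fire during poll()/flush()
```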

&lt;h2&gt;
  
  
  Static Initialization: The Cold Start Dance
&lt;/h2&gt;

&lt;p&gt;Lambda's cold start behavior creates interesting challenges for Kafka producers:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;confluent_kafka&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;logging&lt;/span&gt;

&lt;span class="c1"&gt;# Static initialization - runs during cold start
&lt;/span&gt;&lt;span class="n"&gt;logger&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;logging&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getLogger&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;__name__&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;logger&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;setLevel&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;logging&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;INFO&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Initialize producer outside the handler
&lt;/span&gt;&lt;span class="n"&gt;producer&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;confluent_kafka&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;Producer&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;bootstrap.servers&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;environ&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;MSK_BOOTSTRAP_SERVERS&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;security.protocol&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;SASL_SSL&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;sasl.mechanisms&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;AWS_MSK_IAM&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;sasl.username&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;AWS_MSK_IAM&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; 
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;sasl.password&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;AWS_MSK_IAM&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;},&lt;/span&gt; &lt;span class="n"&gt;logger&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;logger&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;lambda_handler&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;event&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="c1"&gt;# Check connectivity in the handler
&lt;/span&gt;    &lt;span class="c1"&gt;# Producer might have disconnected during cold periods
&lt;/span&gt;    &lt;span class="k"&gt;try&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="c1"&gt;# Quick connectivity check
&lt;/span&gt;        &lt;span class="n"&gt;metadata&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;producer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;list_topics&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;timeout&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;logger&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;info&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Connected to &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;metadata&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;brokers&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; brokers&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;except&lt;/span&gt; &lt;span class="nb"&gt;Exception&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;logger&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;error&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Kafka connectivity issue: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="c1"&gt;# Consider reinitializing producer here
&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;record&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;event&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;records&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;
        &lt;span class="nf"&gt;process_and_produce&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;record&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# Always flush before handler completes
&lt;/span&gt;    &lt;span class="n"&gt;producer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;flush&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;timeout&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Pro tip:&lt;/strong&gt; You'll be surprised how often connections drop during Lambda's idle periods. Always verify connectivity at the start of your handler.&lt;/p&gt;
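&lt;p&gt;One way to act on that check, rather than just logging it, is to rebuild the producer whenever the metadata ping fails. A sketch of the pattern, where &lt;code&gt;make_producer&lt;/code&gt; is an assumed factory returning a configured &lt;code&gt;confluent_kafka.Producer&lt;/code&gt;:&lt;/p&gt;

```python
class LazyProducer:
    """Rebuild the underlying producer when connectivity is lost."""

    def __init__(self, make_producer):
        # make_producer is an assumed factory,
        # e.g. lambda: confluent_kafka.Producer(conf, logger=logger)
        self._make = make_producer
        self._producer = None

    def get(self):
        if self._producer is None:
            self._producer = self._make()
        try:
            self._producer.list_topics(timeout=5)  # cheap metadata ping
        except Exception:
            # Stale connection from an idle period: drop it and rebuild.
            self._producer = self._make()
        return self._producer
```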

&lt;h2&gt;
  
  
  SnapStart: Just Don't
&lt;/h2&gt;

&lt;p&gt;AWS Lambda &lt;a href="https://docs.aws.amazon.com/lambda/latest/dg/snapstart.html" rel="noopener noreferrer"&gt;SnapStart&lt;/a&gt; now supports Python 3.12, and you might be tempted to enable it for faster cold starts. &lt;strong&gt;Don't.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Remember those uniqueness issues we discussed with SnapStart? The &lt;code&gt;librdkafka&lt;/code&gt; C library is essentially a black box. It maintains internal state, generates unique IDs, establishes connections, and manages all sorts of random number generation for things like:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Client IDs and correlation IDs&lt;/li&gt;
&lt;li&gt;Connection retry jitter&lt;/li&gt;
&lt;li&gt;Internal message sequencing&lt;/li&gt;
&lt;li&gt;SSL session management&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;When SnapStart creates a snapshot and reuses it across multiple execution environments, you risk:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Duplicate client IDs connecting to brokers&lt;/li&gt;
&lt;li&gt;Non-unique message correlation IDs&lt;/li&gt;
&lt;li&gt;Shared SSL sessions across instances&lt;/li&gt;
&lt;li&gt;Broken internal state assumptions&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The performance gain from SnapStart isn't worth the debugging nightmare when your Kafka producers start behaving erratically.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Philosophy Shift
&lt;/h2&gt;

&lt;p&gt;Building EDA with Lambda ESM requires a different mindset:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Embrace push-based thinking&lt;/strong&gt; - You react to batches, not control consumption&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Design for idempotency&lt;/strong&gt; - Messages will be reprocessed, plan for it&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Monitor batch metrics&lt;/strong&gt; - Tune batch size based on processing time, not message count&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Invest in observability&lt;/strong&gt; - When things go wrong, you need visibility into both Lambda and Kafka metrics&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Plan for failure modes&lt;/strong&gt; - DLQ configuration is not optional&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  The Bottom Line
&lt;/h2&gt;

&lt;p&gt;MSK + Lambda + &lt;code&gt;confluent-kafka-python&lt;/code&gt; is a powerful stack for event-driven systems, but it's not traditional Kafka development. The constraints imposed by Event Source Mapping fundamentally change how you think about consuming messages.&lt;/p&gt;

&lt;p&gt;Stop fighting the platform and start designing with it. Your consumers become stateless message processors. Your error handling becomes batch-oriented. Your observability becomes multi-layered.&lt;/p&gt;

&lt;p&gt;Once you embrace these patterns, you'll find that serverless EDA can be incredibly productive. Just don't expect it to work like the Kafka applications you're used to building.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Ready to dive deeper?&lt;/strong&gt; Check out the &lt;a href="https://docs.aws.amazon.com/lambda/latest/dg/with-msk.html" rel="noopener noreferrer"&gt;AWS Lambda ESM documentation&lt;/a&gt; and start small. Build a simple message processor first, then gradually add complexity as you understand the platform's quirks.&lt;/p&gt;

&lt;p&gt;What's been your biggest surprise when building serverless event-driven systems? Share your war stories in the comments below! &lt;/p&gt;




&lt;p&gt;&lt;em&gt;Have you wrestled with similar challenges in your EDA journey? Found other gotchas worth sharing? Drop them in the comments—the community learns from our collective debugging pain! 😅&lt;/em&gt;&lt;/p&gt;

</description>
      <category>python</category>
      <category>kafka</category>
      <category>eventdriven</category>
      <category>lambda</category>
    </item>
    <item>
      <title>Testing Kafka Applications: Why Most Pythonistas Are Doing It Wrong (And How to Fix It)</title>
      <dc:creator>Andrew</dc:creator>
      <pubDate>Tue, 10 Jun 2025 16:24:51 +0000</pubDate>
      <link>https://dev.to/and-ratajski/testing-kafka-applications-why-most-pythonistas-are-doing-it-wrong-and-how-to-fix-it-31m2</link>
      <guid>https://dev.to/and-ratajski/testing-kafka-applications-why-most-pythonistas-are-doing-it-wrong-and-how-to-fix-it-31m2</guid>
      <description>&lt;p&gt;&lt;em&gt;Breaking free from the integration testing nightmare with kafka-mocha&lt;/em&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  The Harsh Reality of Kafka Testing in Python
&lt;/h2&gt;

&lt;p&gt;Picture this: You're building a microservice with Kafka integration. You've written beautiful business logic, carefully crafted your event schemas, and implemented robust error handling. Then comes the dreaded question: "How do you test this?"&lt;/p&gt;

&lt;p&gt;If you're like most Python developers working with Event-Driven Architecture (EDA), you probably fall into one of these camps:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;The Optimist&lt;/strong&gt;: "I'll just spin up a Kafka cluster for testing"&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;The Pragmatist&lt;/strong&gt;: "Unit tests with mocks should be enough"
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;The Procrastinator&lt;/strong&gt;: "We'll test it in production" 😬&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;After years of building production Kafka systems in Python, I've discovered that &lt;strong&gt;all three approaches are fundamentally flawed&lt;/strong&gt;. Here's why—and how we can do better.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Testing Gap Nobody Talks About
&lt;/h2&gt;

&lt;p&gt;Let's be honest: despite everyone preaching the importance of testing, most developers barely write anything beyond bootcamp-level unit tests. When it comes to Kafka applications, this problem becomes exponentially worse.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Unit tests&lt;/strong&gt; mock everything away—they tell you if your &lt;code&gt;json.loads()&lt;/code&gt; works, but nothing about whether your serialization actually matches your schema.&lt;/p&gt;
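&lt;p&gt;To see why, consider what a mock-everything test actually proves. The sketch below (with a hypothetical &lt;code&gt;serialize_user&lt;/code&gt;) passes even if the serialized payload is garbage that no consumer could ever read:&lt;/p&gt;

```python
from unittest.mock import MagicMock

def serialize_user(user):
    # Hypothetical stand-in; real code would serialize against an AVRO/JSON schema.
    return str(user)

def test_publishes_user_event():
    producer = MagicMock()  # the mock accepts anything, so nothing is validated
    producer.produce("user-events", serialize_user({"id": 1}))
    # Asserts only that *a* call happened, not that the payload matches any schema.
    producer.produce.assert_called_once()
```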

&lt;p&gt;&lt;strong&gt;End-to-end tests&lt;/strong&gt; with real Kafka clusters are brittle, slow, and require complex infrastructure. They're great for final validation but terrible for rapid development cycles.&lt;/p&gt;

&lt;p&gt;What we're missing is the sweet spot: &lt;strong&gt;true integration tests&lt;/strong&gt; that validate how components within your microservice work together—your producers, consumers, serializers, and business logic—without external dependencies.&lt;/p&gt;

&lt;h2&gt;
  
  
  What Integration Testing Really Means
&lt;/h2&gt;

&lt;p&gt;Let's clarify terminology because this confusion costs teams months of debugging:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Unit Tests&lt;/strong&gt;: Test individual functions in isolation&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Integration Tests&lt;/strong&gt;: Test how components &lt;em&gt;within your service&lt;/em&gt; work together
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;End-to-End Tests&lt;/strong&gt;: Test complete user flows across multiple services&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Most Kafka testing problems stem from trying to do integration testing with unit test tools (excessive mocking) or e2e test infrastructure (full Kafka clusters).&lt;/p&gt;

&lt;h2&gt;
  
  
  kafka-mocha: Born from Production Pain
&lt;/h2&gt;

&lt;p&gt;After wrestling with these limitations across multiple production systems, I built &lt;code&gt;kafka-mocha&lt;/code&gt;—a library that brings sanity to Kafka testing in Python. Here's what makes it different:&lt;/p&gt;

&lt;h3&gt;
  
  
  1. &lt;strong&gt;Total Isolation&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;No Docker containers, no test clusters, no network calls. Your tests run in complete isolation while maintaining full Kafka behavior fidelity.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="nd"&gt;@mock_producer&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;test_user_registration&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="c1"&gt;# This looks like production code but runs in isolation
&lt;/span&gt;    &lt;span class="n"&gt;producer&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;confluent_kafka&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;Producer&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;bootstrap.servers&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;localhost:9092&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;})&lt;/span&gt;
    &lt;span class="n"&gt;producer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;produce&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;user-events&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nf"&gt;serialize_user&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;user_data&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="n"&gt;producer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;flush&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

    &lt;span class="c1"&gt;# Verify the exact messages that would hit Kafka
&lt;/span&gt;    &lt;span class="k"&gt;assert&lt;/span&gt; &lt;span class="n"&gt;producer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;m__get_all_produced_messages_no&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;user-events&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  2. &lt;strong&gt;Schema Registry Pre-loading&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;Load your AVRO/JSON schemas at test startup. No more "schema not found" surprises in production.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="nd"&gt;@mock_schema_registry&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;register_schemas&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;
        &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;source&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;schemas/user-registered.avsc&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;subject&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;user-events-value&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
        &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;source&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;schemas/event-key.avsc&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;subject&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;user-events-key&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;test_schema_evolution&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="c1"&gt;# Schemas are pre-loaded and ready
&lt;/span&gt;    &lt;span class="n"&gt;schema_registry&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;confluent_kafka&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;schema_registry&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;SchemaRegistryClient&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;url&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;http://localhost:8081&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;})&lt;/span&gt;
    &lt;span class="c1"&gt;# Test your serialization logic against real schemas
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  3. &lt;strong&gt;Message Pre-loading with Runtime Serialization&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;Define your test data in JSON and let kafka-mocha serialize it at runtime using your actual schemas.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="nd"&gt;@mock_consumer&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;inputs&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;source&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;test-data/user-events.json&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;topic&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;user-events&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;serialize&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;])&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;test_user_event_processing&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="c1"&gt;# JSON test data gets serialized using your production schemas
&lt;/span&gt;    &lt;span class="n"&gt;consumer&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;confluent_kafka&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;Consumer&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;config&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;messages&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;consumer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;consume&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="c1"&gt;# Process real serialized messages, not mocked objects
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  4. &lt;strong&gt;Production-Grade Output Inspection&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;Export all produced messages to HTML or CSV for debugging. See exactly what your code would send to Kafka.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="nd"&gt;@mock_producer&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;output&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;format&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;html&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;name&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;debug-output.html&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;})&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;test_complex_workflow&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="c1"&gt;# Run your workflow
&lt;/span&gt;    &lt;span class="nf"&gt;process_user_registration&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;user_data&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# Open debug-output.html to see every message, header, and timestamp
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Why This Matters: Real-World Impact
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Before kafka-mocha:
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Integration tests&lt;/strong&gt;: 45 seconds (Docker + Kafka startup)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Flaky failures&lt;/strong&gt;: ~15% (network timeouts, port conflicts)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Schema issues&lt;/strong&gt;: Discovered in production&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Debug time&lt;/strong&gt;: Hours of log diving&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  After kafka-mocha:
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Integration tests&lt;/strong&gt;: 0.3 seconds (pure Python)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Flaky failures&lt;/strong&gt;: 0% (no external dependencies)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Schema issues&lt;/strong&gt;: Caught at test time&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Debug time&lt;/strong&gt;: Minutes with HTML output&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;em&gt;The numbers above were fabricated by my AI assistant 🤓&lt;/em&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  The Testing Philosophy Shift
&lt;/h2&gt;

&lt;p&gt;kafka-mocha advocates for a specific testing philosophy:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Don't forgo unit tests - they're your best friend!&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Test component integration, not implementation details&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Use real serialization, not mock objects&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Validate actual message content, not method calls&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Make debugging visual and intuitive&lt;/strong&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;This isn't just about faster tests—it's about &lt;strong&gt;testing confidence&lt;/strong&gt;. When your integration tests pass, you know your Kafka integration actually works.&lt;/p&gt;
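&lt;p&gt;Point 4 is the crux. A call-count assertion tells you a method was invoked; a content assertion tells you the right bytes would reach the broker. Here's a minimal, library-free sketch of the difference (&lt;code&gt;serialize_user&lt;/code&gt;, &lt;code&gt;register_user&lt;/code&gt;, and the JSON payload are hypothetical stand-ins, not kafka-mocha API):&lt;/p&gt;

```python
import json
from unittest.mock import MagicMock

def serialize_user(user_data: dict) -> bytes:
    """Hypothetical serializer: JSON-encode the payload (stand-in for AVRO)."""
    return json.dumps(user_data, sort_keys=True).encode("utf-8")

def register_user(producer, user_data: dict) -> None:
    """Hypothetical handler under test: serialize and produce one event."""
    producer.produce("user-events", serialize_user(user_data))
    producer.flush()

producer = MagicMock()
register_user(producer, {"id": 42, "email": "a@example.com"})

# Call-based check: passes even if serialize_user is broken,
# because it only verifies that produce() was invoked.
assert producer.produce.call_count == 1

# Content-based check: decode the bytes that would hit the broker
# and assert on the actual payload.
topic, payload = producer.produce.call_args.args
assert topic == "user-events"
assert json.loads(payload) == {"id": 42, "email": "a@example.com"}
```

&lt;p&gt;The second pair of assertions would catch a field rename or a broken serializer; the first would not.&lt;/p&gt;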

&lt;h2&gt;
  
  
  Getting Started
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pip &lt;span class="nb"&gt;install &lt;/span&gt;kafka-mocha
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Transform your existing confluent-kafka code:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Before: Brittle, slow, complex
&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;test_with_real_kafka&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="c1"&gt;# Setup Kafka, create topics, manage cleanup...
&lt;/span&gt;
&lt;span class="c1"&gt;# After: Fast, reliable, simple  
&lt;/span&gt;&lt;span class="nd"&gt;@mock_producer&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;test_with_kafka_mocha&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="c1"&gt;# Existing code works unchanged
&lt;/span&gt;    &lt;span class="n"&gt;producer&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;confluent_kafka&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;Producer&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;config&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="c1"&gt;# Test with confidence
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Beyond Testing: A Development Accelerator
&lt;/h2&gt;

&lt;p&gt;The unexpected benefit? kafka-mocha becomes a development tool. Iterate on message schemas, test serialization logic, and debug complex event flows—all without leaving your IDE.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="nd"&gt;@mock_producer&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;output&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;format&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;html&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;name&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;schema-evolution-test.html&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;})&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;explore_schema_changes&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="c1"&gt;# Experiment with schema changes
&lt;/span&gt;    &lt;span class="c1"&gt;# Visualize the output
&lt;/span&gt;    &lt;span class="c1"&gt;# Iterate rapidly
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  The Bottom Line
&lt;/h2&gt;

&lt;p&gt;Most Python developers are stuck in a false dichotomy: oversimplified unit tests or overcomplicated end-to-end tests. kafka-mocha provides the missing middle ground—true integration testing that's fast, reliable, and actually useful.&lt;/p&gt;

&lt;p&gt;Stop testing Kafka applications like it's 2010. Your future self (and your production systems) will thank you.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Ready to transform your Kafka testing? Check out &lt;a href="https://github.com/Effiware/kafka-mocha" rel="noopener noreferrer"&gt;kafka-mocha on GitHub&lt;/a&gt; and join the developers who've already escaped the integration testing nightmare.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What's your biggest Kafka testing pain point? Share in the comments below.&lt;/strong&gt;&lt;/p&gt;

</description>
      <category>eventdriven</category>
      <category>testing</category>
      <category>kafka</category>
      <category>python</category>
    </item>
  </channel>
</rss>
