<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Andrew</title>
    <description>The latest articles on DEV Community by Andrew (@and-ratajski).</description>
    <link>https://dev.to/and-ratajski</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3253620%2Fae9a6073-4d1d-4d5a-b0cd-432390b7c501.jpeg</url>
      <title>DEV Community: Andrew</title>
      <link>https://dev.to/and-ratajski</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/and-ratajski"/>
    <language>en</language>
    <item>
      <title>Setting Up a GitLab CI/CD Pipeline with DigitalOcean Kubernetes</title>
      <dc:creator>Andrew</dc:creator>
      <pubDate>Sun, 05 Oct 2025 16:47:31 +0000</pubDate>
      <link>https://dev.to/and-ratajski/setting-up-a-gitlab-cicd-pipeline-with-digitalocean-kubernetes-1imn</link>
      <guid>https://dev.to/and-ratajski/setting-up-a-gitlab-cicd-pipeline-with-digitalocean-kubernetes-1imn</guid>
      <description>&lt;p&gt;&lt;em&gt;A pragmatic guide to container-based deployments that won't break the bank&lt;/em&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Overview
&lt;/h2&gt;

&lt;p&gt;This guide walks you through setting up a complete CI/CD pipeline using GitLab's free tier and a &lt;a href="https://www.digitalocean.com/?refcode=d8ad772258e7&amp;amp;utm_campaign=Referral_Invite&amp;amp;utm_medium=Referral_Program&amp;amp;utm_source=badge" rel="noopener noreferrer"&gt;DigitalOcean&lt;/a&gt; Kubernetes cluster. The approach is straightforward: build Docker images, push them to GitLab's Container Registry, and roll them out to your Kubernetes cluster. No fancy GitOps operators, no complex Helm charts—just good old-fashioned image tags and &lt;code&gt;kubectl&lt;/code&gt; commands.&lt;/p&gt;

&lt;h2&gt;
  
  
  Assumptions
&lt;/h2&gt;

&lt;p&gt;ℹ You're comfortable with containerization concepts and have written a Dockerfile or two&lt;br&gt;
ℹ You understand basic Kubernetes primitives (Deployments, Services, Secrets)&lt;br&gt;
ℹ You have Kubernetes CRDs already defined in your cluster (or a separate repository)&lt;br&gt;
ℹ Your deployment strategy is image-tag-based: new code = new image = new deployment&lt;br&gt;
ℹ You're okay with an imperative deployment approach (we'll discuss the trade-offs)&lt;/p&gt;
&lt;h2&gt;
  
  
  Prerequisites
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Accounts &amp;amp; Infrastructure:&lt;/strong&gt;&lt;br&gt;
✅ GitLab account (free tier is sufficient)&lt;br&gt;
✅ DigitalOcean account with an existing Kubernetes cluster (even the cheapest $12/mo cluster works perfectly)&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Knowledge:&lt;/strong&gt;&lt;br&gt;
🧠 Basic understanding of Docker and container registries&lt;br&gt;
🧠 Familiarity with Kubernetes concepts: Secrets, Deployments, and rollouts&lt;br&gt;
🧠 Git basics (you're already here, so you're good)&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Project Setup:&lt;/strong&gt;&lt;br&gt;
➡️ A &lt;code&gt;docker-compose.ci.yml&lt;/code&gt; file that defines your build configuration&lt;br&gt;
➡️ A Dockerfile that accepts a &lt;code&gt;BUILD_HASH&lt;/code&gt; argument:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight docker"&gt;&lt;code&gt;...
# Build arguments for version information aligned with docker image tag and git commit hash
&lt;span class="k"&gt;ARG&lt;/span&gt;&lt;span class="s"&gt; BUILD_HASH=NOT_SET&lt;/span&gt;
&lt;span class="k"&gt;ENV&lt;/span&gt;&lt;span class="s"&gt; BUILD_HASH=${BUILD_HASH}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;em&gt;Note: The &lt;code&gt;docker-compose.ci.yml&lt;/code&gt; expects the &lt;code&gt;CI_COMMIT_SHORT_SHA&lt;/code&gt; variable, which gets passed to your Dockerfile as the &lt;code&gt;BUILD_HASH&lt;/code&gt; argument.&lt;/em&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;    &lt;span class="s"&gt;...&lt;/span&gt;
    &lt;span class="s"&gt;build&lt;/span&gt;&lt;span class="err"&gt;:&lt;/span&gt;
     &lt;span class="na"&gt;context&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;./your-context-dir&lt;/span&gt;
     &lt;span class="na"&gt;dockerfile&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Dockerfile&lt;/span&gt;
     &lt;span class="na"&gt;args&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
       &lt;span class="na"&gt;BUILD_HASH&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;${CI_COMMIT_SHORT_SHA}"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Step 1: Create GitLab Access Token
&lt;/h2&gt;

&lt;p&gt;Since GitLab's free tier doesn't support group-level access tokens, we'll create one at the project level:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Navigate to your project in GitLab&lt;/li&gt;
&lt;li&gt;Go to &lt;strong&gt;Settings → Access Tokens&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Click &lt;strong&gt;Add new token&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Configure the token:

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Name:&lt;/strong&gt; Something descriptive like "k8s-pull-token"&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Scopes:&lt;/strong&gt; Select at minimum:

&lt;ul&gt;
&lt;li&gt;&lt;code&gt;read_repository&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;read_registry&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;write_registry&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;Click &lt;strong&gt;Create project access token&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Important:&lt;/strong&gt; Copy the token immediately—you won't see it again!&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxxqk0unvdd12alnvh2so.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxxqk0unvdd12alnvh2so.png" alt=" " width="800" height="189"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 2: Create DigitalOcean Access Token
&lt;/h2&gt;

&lt;p&gt;Your CI/CD pipeline needs to authenticate with DigitalOcean to deploy to your Kubernetes cluster:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Go to &lt;a href="https://cloud.digitalocean.com/account/api/tokens" rel="noopener noreferrer"&gt;DigitalOcean API Tokens&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Click &lt;strong&gt;Generate New Token&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Configure with minimal required scopes:

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Read:&lt;/strong&gt; &lt;code&gt;image&lt;/code&gt;, &lt;code&gt;kubernetes&lt;/code&gt;, &lt;code&gt;load_balancer&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Update:&lt;/strong&gt; &lt;code&gt;kubernetes&lt;/code&gt;, &lt;code&gt;load_balancer&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Other:&lt;/strong&gt; &lt;code&gt;access_cluster&lt;/code&gt; (kubernetes)&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;Save the token securely&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6oifrf37t77hlnz0dxr0.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6oifrf37t77hlnz0dxr0.png" alt=" " width="800" height="652"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Security Note:&lt;/strong&gt; After the initial authentication, &lt;code&gt;doctl&lt;/code&gt; fetches short-lived cluster credentials for the runner, so this persistent token is used only for that first handshake.&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 3: Create Kubernetes Docker Registry Secret
&lt;/h2&gt;

&lt;p&gt;Your Kubernetes cluster needs credentials to pull images from GitLab's Container Registry. Here's how to create the secret:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;kubectl create secret docker-registry regcred &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--docker-server&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;registry.gitlab.com/&amp;lt;your-group&amp;gt;/&amp;lt;your-project&amp;gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--docker-username&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&amp;lt;gitlab-username&amp;gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--docker-password&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&amp;lt;token-from-step-1&amp;gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--docker-email&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&amp;lt;your-email&amp;gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--dry-run&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;client &lt;span class="nt"&gt;-o&lt;/span&gt; yaml &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; regcred-secret.yml
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Your generated &lt;code&gt;regcred-secret.yml&lt;/code&gt; should look like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;apiVersion&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;v1&lt;/span&gt;
&lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;kubernetes.io/dockerconfigjson&lt;/span&gt;
&lt;span class="na"&gt;kind&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Secret&lt;/span&gt;
&lt;span class="na"&gt;metadata&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;regcred&lt;/span&gt;
  &lt;span class="na"&gt;annotations&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;reflector.v1.k8s.emberstack.com/reflection-allowed&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;true"&lt;/span&gt;
    &lt;span class="na"&gt;reflector.v1.k8s.emberstack.com/reflection-auto-enabled&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;true"&lt;/span&gt;
&lt;span class="na"&gt;data&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;.dockerconfigjson&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;&amp;lt;base64-encoded-secret&amp;gt;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
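&lt;p&gt;&lt;em&gt;For the curious: the base64 blob is nothing magic—it's just a JSON document carrying your registry credentials. Here's a rough sketch of what gets encoded (the username and token below are hypothetical placeholders; never commit real ones):&lt;/em&gt;&lt;/p&gt;

```shell
# Sketch of the JSON that kubectl encodes into .dockerconfigjson
# (username and token below are hypothetical placeholders)
REGISTRY="registry.gitlab.com"
USERNAME="gitlab-username"
TOKEN="glpat-example-token"
# The "auth" field is base64 of "username:password"
AUTH=$(printf '%s:%s' "$USERNAME" "$TOKEN" | base64)
DOCKERCFG=$(printf '{"auths":{"%s":{"username":"%s","password":"%s","auth":"%s"}}}' \
  "$REGISTRY" "$USERNAME" "$TOKEN" "$AUTH")
# This JSON, base64-encoded, is the value stored under data: .dockerconfigjson
printf '%s' "$DOCKERCFG" | base64
```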



&lt;p&gt;&lt;strong&gt;Pro tip:&lt;/strong&gt; Notice the &lt;code&gt;reflector&lt;/code&gt; annotations? If you're using &lt;a href="https://github.com/emberstack/kubernetes-reflector" rel="noopener noreferrer"&gt;Emberstack's Kubernetes Reflector&lt;/a&gt;, this automatically syncs the secret across namespaces. One secret to rule them all. If you're not using Reflector, you'll need to create this secret in each namespace that needs to pull images.&lt;/p&gt;
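&lt;p&gt;&lt;em&gt;If you do use Reflector, you can also scope which namespaces receive the mirrored secret via its annotations. A sketch (the namespace names here are hypothetical):&lt;/em&gt;&lt;/p&gt;

```yaml
metadata:
  name: regcred
  annotations:
    reflector.v1.k8s.emberstack.com/reflection-allowed: "true"
    reflector.v1.k8s.emberstack.com/reflection-allowed-namespaces: "staging,production"
    reflector.v1.k8s.emberstack.com/reflection-auto-enabled: "true"
    reflector.v1.k8s.emberstack.com/reflection-auto-namespaces: "staging,production"
```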

&lt;p&gt;Apply the secret to your cluster:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;kubectl apply &lt;span class="nt"&gt;-f&lt;/span&gt; regcred-secret.yml
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Step 4: Configure GitLab CI/CD Variables
&lt;/h2&gt;

&lt;p&gt;Add your DigitalOcean token to GitLab's CI/CD variables:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;In your GitLab project, go to &lt;strong&gt;Settings → CI/CD&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Expand the &lt;strong&gt;Variables&lt;/strong&gt; section&lt;/li&gt;
&lt;li&gt;Click &lt;strong&gt;Add variable&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Configure:

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Key:&lt;/strong&gt; &lt;code&gt;DO_ACCESS_TOKEN&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Value:&lt;/strong&gt; Your DigitalOcean token from Step 2&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Type:&lt;/strong&gt; Variable&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Flags:&lt;/strong&gt; Check "Mask variable" (for security)&lt;/li&gt;
&lt;li&gt;Leave other flags unchecked unless you need environment-specific protection&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  Step 5: Understanding the Pipeline
&lt;/h2&gt;

&lt;p&gt;Now for the main event—the &lt;code&gt;.gitlab-ci.yml&lt;/code&gt; file. This pipeline has three stages that mirror a typical deployment workflow:&lt;/p&gt;

&lt;h3&gt;
  
  
  Pipeline Stages Overview
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;stages&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;test&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;build&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;deploy&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Simple, sequential, predictable. Let's break down each stage:&lt;/p&gt;

&lt;h3&gt;
  
  
  Stage 1: &lt;code&gt;docker-test&lt;/code&gt;
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;docker-test&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;stage&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;test&lt;/span&gt;
  &lt;span class="na"&gt;image&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;docker:dind&lt;/span&gt;
  &lt;span class="na"&gt;services&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;docker:dind"&lt;/span&gt;
  &lt;span class="na"&gt;rules&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;if&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;$CI_PIPELINE_SOURCE == "merge_request_event"&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;if&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;$CI_COMMIT_BRANCH&lt;/span&gt;
  &lt;span class="na"&gt;script&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;echo Placeholder for running tests&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;What it does:&lt;/strong&gt; Runs your test suite in a Docker-in-Docker environment.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;When it runs:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;On all merge requests&lt;/li&gt;
&lt;li&gt;On all branch commits&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Variables used:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;CI_PIPELINE_SOURCE&lt;/code&gt;: GitLab's built-in variable indicating how the pipeline was triggered&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;CI_COMMIT_BRANCH&lt;/code&gt;: The branch being built&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;em&gt;Currently, this is a placeholder. Replace the echo with actual test commands like &lt;code&gt;docker compose -f docker-compose.ci.yml run tests&lt;/code&gt; or similar.&lt;/em&gt;&lt;/p&gt;
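&lt;p&gt;&lt;em&gt;For instance, assuming your &lt;code&gt;docker-compose.ci.yml&lt;/code&gt; defined a hypothetical &lt;code&gt;tests&lt;/code&gt; service, the script section could become:&lt;/em&gt;&lt;/p&gt;

```yaml
  script:
    - docker compose --file docker-compose.ci.yml build tests
    - docker compose --file docker-compose.ci.yml run --rm tests
```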

&lt;h3&gt;
  
  
  Stage 2: &lt;code&gt;docker-build&lt;/code&gt;
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;docker-build&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;stage&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;build&lt;/span&gt;
  &lt;span class="na"&gt;image&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;docker:dind&lt;/span&gt;
  &lt;span class="na"&gt;services&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;docker:dind"&lt;/span&gt;
  &lt;span class="na"&gt;needs&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;docker-test&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;
  &lt;span class="na"&gt;rules&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="c1"&gt;# Build without push on merge requests to main&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;if&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;$CI_PIPELINE_SOURCE == "merge_request_event" &amp;amp;&amp;amp; $CI_MERGE_REQUEST_TARGET_BRANCH_NAME == "main"&lt;/span&gt;
      &lt;span class="na"&gt;variables&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;SHOULD_PUSH&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;false"&lt;/span&gt;
    &lt;span class="c1"&gt;# Build and push on commits to main branch&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;if&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;$CI_COMMIT_BRANCH == "main"&lt;/span&gt;
      &lt;span class="na"&gt;variables&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;SHOULD_PUSH&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;true"&lt;/span&gt;
  &lt;span class="na"&gt;script&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;echo "Commit SHA is $CI_COMMIT_SHORT_SHA"&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;echo "$CI_REGISTRY_PASSWORD" | docker login $CI_REGISTRY -u $CI_REGISTRY_USER --password-stdin&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;docker compose --file docker-compose.ci.yml build&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="pi"&gt;|&lt;/span&gt;
      &lt;span class="s"&gt;if [ "$SHOULD_PUSH" = "true" ]; then&lt;/span&gt;
        &lt;span class="s"&gt;echo "Pushing image to registry..."&lt;/span&gt;
        &lt;span class="s"&gt;docker push registry.gitlab.com/group/project/image:$CI_COMMIT_SHORT_SHA&lt;/span&gt;
      &lt;span class="s"&gt;else&lt;/span&gt;
        &lt;span class="s"&gt;echo "Skipping push (merge request build)"&lt;/span&gt;
      &lt;span class="s"&gt;fi&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;What it does:&lt;/strong&gt; Builds your Docker image and conditionally pushes it to GitLab's Container Registry.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;When it runs:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;On merge requests targeting &lt;code&gt;main&lt;/code&gt; (builds but doesn't push)&lt;/li&gt;
&lt;li&gt;On commits to the &lt;code&gt;main&lt;/code&gt; branch (builds and pushes)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Variables used:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;CI_COMMIT_SHORT_SHA&lt;/code&gt;: Short Git commit SHA used as image tag&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;CI_REGISTRY&lt;/code&gt;: GitLab's container registry URL (automatically provided)&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;CI_REGISTRY_USER&lt;/code&gt;: Registry username (automatically provided)&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;CI_REGISTRY_PASSWORD&lt;/code&gt;: Registry password (automatically provided)&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;CI_PIPELINE_SOURCE&lt;/code&gt;: Pipeline trigger source&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;CI_MERGE_REQUEST_TARGET_BRANCH_NAME&lt;/code&gt;: Target branch for merge requests&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;CI_COMMIT_BRANCH&lt;/code&gt;: Current branch name&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;SHOULD_PUSH&lt;/code&gt;: Custom variable controlling whether to push the image&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;The smart bit:&lt;/strong&gt; Merge requests get a full build validation without polluting your registry. Only successful merges to &lt;code&gt;main&lt;/code&gt; result in pushed images. This keeps your registry clean and your deployments intentional.&lt;/p&gt;
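&lt;p&gt;&lt;em&gt;One possible refinement: GitLab's predefined &lt;code&gt;CI_REGISTRY_IMAGE&lt;/code&gt; variable already points at your project's registry path, so the push step can drop the hard-coded address. A sketch, assuming your compose file tags the image the same way:&lt;/em&gt;&lt;/p&gt;

```yaml
    - |
      if [ "$SHOULD_PUSH" = "true" ]; then
        docker push "$CI_REGISTRY_IMAGE/image:$CI_COMMIT_SHORT_SHA"
      fi
```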

&lt;h3&gt;
  
  
  Stage 3: &lt;code&gt;k8s-deploy&lt;/code&gt;
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;k8s-deploy&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;stage&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;deploy&lt;/span&gt;
  &lt;span class="na"&gt;image&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;debian:bookworm-slim&lt;/span&gt;
  &lt;span class="na"&gt;needs&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;docker-build&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;
  &lt;span class="na"&gt;rules&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;if&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;$CI_COMMIT_BRANCH == "main"&lt;/span&gt;
      &lt;span class="na"&gt;variables&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;K8S_CLUSTER&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;66250a2e-6ag4-48e6-a857-a578c754fa3b"&lt;/span&gt;
        &lt;span class="na"&gt;K8S_NAMESPACE&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;your-deployment-namespace"&lt;/span&gt;
        &lt;span class="na"&gt;K8S_DEPLOYMENT&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;your-deployment-name"&lt;/span&gt;
  &lt;span class="na"&gt;before_script&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;apt-get update &amp;amp;&amp;amp; apt-get install -y curl ca-certificates&lt;/span&gt;
    &lt;span class="c1"&gt;# Install doctl&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;cd /tmp&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;curl -sL https://github.com/digitalocean/doctl/releases/download/v1.104.0/doctl-1.104.0-linux-amd64.tar.gz | tar -xzv&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;mv doctl /usr/local/bin/&lt;/span&gt;
    &lt;span class="c1"&gt;# Install kubectl&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;curl -LO "https://dl.k8s.io/release/$(curl -L -s https://dl.k8s.io/release/stable.txt)/bin/linux/amd64/kubectl"&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;chmod +x kubectl &amp;amp;&amp;amp; mv kubectl /usr/local/bin/&lt;/span&gt;
    &lt;span class="c1"&gt;# Authenticate with DigitalOcean and configure kubectl&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;doctl auth init --access-token $DO_ACCESS_TOKEN&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;doctl kubernetes cluster kubeconfig save $K8S_CLUSTER&lt;/span&gt;
  &lt;span class="na"&gt;script&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;echo "Deploying image with tag $CI_COMMIT_SHORT_SHA to Kubernetes..."&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;kubectl -n $K8S_NAMESPACE set image deployment.apps/$K8S_DEPLOYMENT container-name=registry.gitlab.com/group/project/image:$CI_COMMIT_SHORT_SHA&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;echo "Waiting for rollout to complete..."&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;kubectl -n $K8S_NAMESPACE rollout status deployment.apps/$K8S_DEPLOYMENT --timeout=5m&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;echo "Deployment successful!"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;What it does:&lt;/strong&gt; Deploys your new image to the Kubernetes cluster.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;When it runs:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Only on commits to the &lt;code&gt;main&lt;/code&gt; branch&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Variables used:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;K8S_CLUSTER&lt;/code&gt;: Your DigitalOcean Kubernetes cluster ID&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;K8S_NAMESPACE&lt;/code&gt;: Target Kubernetes namespace&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;K8S_DEPLOYMENT&lt;/code&gt;: Name of the Deployment to update&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;DO_ACCESS_TOKEN&lt;/code&gt;: DigitalOcean API token (from Step 4)&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;CI_COMMIT_SHORT_SHA&lt;/code&gt;: Image tag to deploy&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;The deployment dance:&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Setup tooling:&lt;/strong&gt; Installs &lt;code&gt;doctl&lt;/code&gt; (DigitalOcean CLI) and &lt;code&gt;kubectl&lt;/code&gt; in a minimal Debian container&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Authenticate:&lt;/strong&gt; Uses your DO token to authenticate and fetch cluster credentials&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Update deployment:&lt;/strong&gt; Uses &lt;code&gt;kubectl set image&lt;/code&gt; to update the container image tag&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Wait and verify:&lt;/strong&gt; Monitors the rollout status with a 5-minute timeout&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;strong&gt;Important:&lt;/strong&gt; The &lt;code&gt;kubectl rollout status&lt;/code&gt; command exits with status 1 if the deployment fails. This means your pipeline will fail if pods don't come up healthy, which is exactly what you want—fast feedback on broken deployments.&lt;/p&gt;
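&lt;p&gt;&lt;em&gt;If you'd rather have the pipeline roll back automatically instead of just failing, one option is to chain a &lt;code&gt;rollout undo&lt;/code&gt; onto the status check (a sketch, using the same variables as above):&lt;/em&gt;&lt;/p&gt;

```yaml
    - kubectl -n $K8S_NAMESPACE rollout status deployment.apps/$K8S_DEPLOYMENT --timeout=5m || (kubectl -n $K8S_NAMESPACE rollout undo deployment.apps/$K8S_DEPLOYMENT; exit 1)
```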

&lt;h2&gt;
  
  
  The Imperative Approach: Trade-offs and Considerations
&lt;/h2&gt;

&lt;p&gt;Let's address the elephant in the room: this is an imperative deployment strategy. We're directly telling Kubernetes what to do with &lt;code&gt;kubectl set image&lt;/code&gt;, not declaring desired state in Git (GitOps) or using sophisticated deployment tools.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Drawbacks:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;No Git-based audit trail of what's deployed (only CI/CD logs)&lt;/li&gt;
&lt;li&gt;Rollbacks require re-running old pipelines or manual intervention&lt;/li&gt;
&lt;li&gt;State drift if someone manually changes the deployment&lt;/li&gt;
&lt;li&gt;Doesn't scale well to complex, multi-service deployments&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Why it works here:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;You're only updating the image tag—the simplest possible change&lt;/li&gt;
&lt;li&gt;Your Kubernetes CRDs are version-controlled elsewhere&lt;/li&gt;
&lt;li&gt;The pipeline is the single deployment pathway (no manual kubectl cowboys)&lt;/li&gt;
&lt;li&gt;It's dead simple to understand and debug&lt;/li&gt;
&lt;li&gt;Perfect for small teams and straightforward applications&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;As long as you're disciplined about only using this pipeline for deployments and keeping your CRDs under version control, this approach is pragmatic and effective.&lt;/p&gt;

&lt;h2&gt;
  
  
  Testing Your Pipeline
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;Create a feature branch and make a small change&lt;/li&gt;
&lt;li&gt;Push and watch the &lt;code&gt;docker-test&lt;/code&gt; and &lt;code&gt;docker-build&lt;/code&gt; stages run&lt;/li&gt;
&lt;li&gt;Open a merge request to &lt;code&gt;main&lt;/code&gt;—the build should run but skip the push&lt;/li&gt;
&lt;li&gt;Merge to &lt;code&gt;main&lt;/code&gt; and watch the full pipeline execute&lt;/li&gt;
&lt;li&gt;Check your Kubernetes cluster: &lt;code&gt;kubectl -n your-deployment-namespace get pods -w&lt;/code&gt;
&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;You now have a working CI/CD pipeline that costs roughly $12/month (just the Kubernetes cluster—GitLab and the CI runners are free). Not bad for automated deployments that go from git push to production in minutes.&lt;/p&gt;

&lt;p&gt;The beauty of this approach is its simplicity. No vendor lock-in, no complex tooling, just Docker, Kubernetes, and a dash of CI/CD glue. Sure, it's not the most sophisticated setup you'll ever see, but it works, it's maintainable, and most importantly—you actually understand what's happening.&lt;/p&gt;

&lt;p&gt;Now go forth and deploy with confidence. And when your friends ask how you did it, you can show them!&lt;/p&gt;

</description>
      <category>cicd</category>
      <category>gitlab</category>
      <category>kubernetes</category>
      <category>automation</category>
    </item>
    <item>
      <title>Building Event-Driven Architecture with MSK and Lambda: The Python Developer's Guide to Not Shooting Yourself in the Foot</title>
      <dc:creator>Andrew</dc:creator>
      <pubDate>Tue, 22 Jul 2025 20:13:50 +0000</pubDate>
      <link>https://dev.to/and-ratajski/building-event-driven-architecture-with-msk-and-lambda-the-python-developers-guide-to-not-5f7k</link>
      <guid>https://dev.to/and-ratajski/building-event-driven-architecture-with-msk-and-lambda-the-python-developers-guide-to-not-5f7k</guid>
      <description>&lt;p&gt;&lt;em&gt;Breaking free from traditional Kafka patterns when AWS does the heavy lifting&lt;/em&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  The Allure of "Serverless" Kafka
&lt;/h2&gt;

&lt;p&gt;Picture this: you're tasked with building an event-driven solution for your business. Naturally, you're a serverless enthusiast, so you look at AWS, and what do you see?&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;AWS MSK for managed Kafka? Check ✅&lt;/li&gt;
&lt;li&gt;Python Lambda for serverless compute? Check ✅&lt;/li&gt;
&lt;li&gt;The &lt;code&gt;confluent-kafka&lt;/code&gt; library for that sweet, sweet performance? Double check ✅✅&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;You fire up your IDE, start writing familiar Kafka consumer code with &lt;code&gt;.poll()&lt;/code&gt; loops, and then... reality hits. This isn't your typical Kafka setup. Welcome to the world of Lambda Event Source Mappings (ESM), where everything you know about Kafka consumers gets turned upside down.&lt;/p&gt;

&lt;p&gt;After building multiple production EDA systems with this exact stack, I've learned that success isn't about fighting the constraints — it's about embracing them. Here's how I'd now start an onboarding session:&lt;/p&gt;

&lt;h2&gt;
  
  
  The Mental Model Shift Nobody Warns You About
&lt;/h2&gt;

&lt;h3&gt;
  
  
  From Pull to Push: Your Consumer Doesn't Consume Anymore
&lt;/h3&gt;

&lt;p&gt;The biggest mind-bender? &lt;strong&gt;You don't write Kafka consumers anymore.&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# What you THINK you'll write (traditional Kafka)
&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;traditional_kafka_consumer&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="n"&gt;consumer&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;confluent_kafka&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;Consumer&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;config&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;consumer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;subscribe&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;user-events&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;

    &lt;span class="k"&gt;while&lt;/span&gt; &lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;messages&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;consumer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;poll&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;timeout&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;1.0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;message&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="nf"&gt;process_message&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;message&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# What you ACTUALLY write (Lambda ESM)
&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;lambda_handler&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;event&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="c1"&gt;# ESM already consumed the messages for you
&lt;/span&gt;    &lt;span class="c1"&gt;# event['records'] contains the batch
&lt;/span&gt;    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;record&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;event&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;records&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;
        &lt;span class="nf"&gt;process_message&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;record&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;AWS Event Source Mapping becomes your consumer. It polls Kafka, manages offsets, handles retries, and pushes batches to your Lambda. You just get handed a batch of messages and told "deal with it."&lt;/p&gt;
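&lt;p&gt;One wrinkle the simplified snippets gloss over: for MSK and self-managed Kafka event sources, &lt;code&gt;event['records']&lt;/code&gt; is a dict keyed by &lt;code&gt;topic-partition&lt;/code&gt; strings, and keys and values arrive base64-encoded. A minimal decoding sketch (&lt;code&gt;process_message&lt;/code&gt; is a hypothetical placeholder for your business logic):&lt;/p&gt;

```python
import base64

def process_message(topic_partition, key, value):
    # Hypothetical placeholder for your actual business logic.
    print(topic_partition, key, value)

def lambda_handler(event, context):
    processed = 0
    # 'records' maps "topic-partition" strings to lists of records.
    for topic_partition, records in event["records"].items():
        for record in records:
            # Keys and values are base64-encoded in the ESM payload.
            value = base64.b64decode(record["value"])
            key = base64.b64decode(record["key"]) if record.get("key") else None
            process_message(topic_partition, key, value)
            processed += 1
    return processed
```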

&lt;p&gt;This shift brings some harsh realities:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;❌ No control over polling intervals&lt;/strong&gt;&lt;br&gt;
ESM decides when to poll (500ms batching window). You can't implement backpressure or custom polling strategies.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;❌ Little control over the number of pollers&lt;/strong&gt;&lt;br&gt;
ESM &lt;a href="https://docs.aws.amazon.com/lambda/latest/dg/msk-scaling-modes.html" rel="noopener noreferrer"&gt;event pollers&lt;/a&gt; are supposed to scale automatically with load, but in practice this generates frequent consumer group rebalances (&lt;a href="https://aws.amazon.com/about-aws/whats-new/2024/11/aws-lambda-provisioned-mode-kafka-esms/" rel="noopener noreferrer"&gt;provisioned mode&lt;/a&gt; does help).&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;❌ Batch size becomes critical&lt;/strong&gt;&lt;br&gt;
Configure it wrong, and you'll either overwhelm your Lambda (too big) or waste invocations (too small). Start with 10-50 messages and tune based on your processing time.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;❌ No per-message error handling&lt;/strong&gt;&lt;br&gt;
One message fails? The entire batch gets reprocessed. Design for idempotency from day one.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Batch Failure Reality Check
&lt;/h2&gt;

&lt;p&gt;Here's the kicker that catches everyone off-guard: &lt;strong&gt;Lambda failure semantics are brutal for Kafka&lt;/strong&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;lambda_handler&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;event&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;processed_count&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;

    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;record&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;event&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;records&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;
        &lt;span class="k"&gt;try&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;process_message&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;record&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  &lt;span class="c1"&gt;# This might fail on message #47
&lt;/span&gt;            &lt;span class="n"&gt;processed_count&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
        &lt;span class="k"&gt;except&lt;/span&gt; &lt;span class="nb"&gt;Exception&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;logger&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;error&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Failed processing message: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="k"&gt;raise&lt;/span&gt;  &lt;span class="c1"&gt;# Entire batch gets retried
&lt;/span&gt;
    &lt;span class="n"&gt;logger&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;info&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Successfully processed &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;processed_count&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; messages&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="c1"&gt;# If we get here, all messages succeeded
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;When your Lambda throws an exception:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;⚠️ All processed messages get reprocessed&lt;/li&gt;
&lt;li&gt;⚠️ The failing message gets another chance
&lt;/li&gt;
&lt;li&gt;⚠️ Messages after the failure also get reprocessed&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Forget about exactly-once semantics.&lt;/strong&gt; You're in at-least-once territory now. Make everything idempotent or suffer the consequences.&lt;/p&gt;
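&lt;p&gt;What does idempotent look like in practice? Kafka guarantees that the topic/partition/offset triple uniquely identifies a record, so it makes a natural deduplication key. A minimal sketch of the pattern, assuming an in-memory set stands in for a durable store (in a real Lambda you'd use something like a conditional write to DynamoDB, since execution environments come and go):&lt;/p&gt;

```python
# Sketch only: in a real Lambda this set would be a durable store (e.g. a
# conditional write to DynamoDB). In-memory state only survives within a
# single warm execution environment.
_processed: set = set()

def process_idempotently(record) -> bool:
    """Apply side effects at most once per record; return False on replays."""
    msg_id = (record["topic"], record["partition"], record["offset"])
    if msg_id in _processed:
        return False  # record replayed by a retried batch: skip it
    # ... perform the actual side effects here ...
    _processed.add(msg_id)
    return True
```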

&lt;h2&gt;
  
  
  Dead Letter Queues: Your Safety Net
&lt;/h2&gt;

&lt;p&gt;Configure a Dead Letter Topic &lt;strong&gt;before&lt;/strong&gt; you go to production. Trust me on this one.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# ESM Configuration (Terraform example)
&lt;/span&gt;&lt;span class="n"&gt;resource&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;aws_lambda_event_source_mapping&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;kafka_trigger&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="n"&gt;event_source_arn&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;aws_msk_cluster&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;cluster&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;arn&lt;/span&gt;
  &lt;span class="n"&gt;function_name&lt;/span&gt;    &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;aws_lambda_function&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;processor&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;arn&lt;/span&gt;

  &lt;span class="n"&gt;topics&lt;/span&gt;                             &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;user-events&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
  &lt;span class="n"&gt;batch_size&lt;/span&gt;                         &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;25&lt;/span&gt;
  &lt;span class="n"&gt;maximum_batching_window_in_seconds&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;

  &lt;span class="c1"&gt;# This saves your bacon
&lt;/span&gt;  &lt;span class="n"&gt;destination_config&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;on_failure&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="n"&gt;destination_arn&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;aws_sns_topic&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;dlq&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;arn&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;

  &lt;span class="c1"&gt;# Retry 3 times before giving up
&lt;/span&gt;  &lt;span class="n"&gt;maximum_retry_attempts&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Without DLQ setup, poison messages will block your entire partition indefinitely. I've seen production systems grind to a halt because of one malformed message.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Producer Side: Where C Meets Python
&lt;/h2&gt;

&lt;p&gt;Here's where things get interesting. While your Lambda receives messages through ESM, you'll still need to &lt;strong&gt;produce&lt;/strong&gt; messages to other topics. This is where &lt;code&gt;confluent-kafka&lt;/code&gt; shines — and where logging becomes critical.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;logging&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;confluent_kafka&lt;/span&gt;

&lt;span class="c1"&gt;# DO THIS: Pass your logger to the producer
&lt;/span&gt;&lt;span class="n"&gt;logger&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;logging&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getLogger&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;__name__&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;producer&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;confluent_kafka&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;Producer&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;bootstrap.servers&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;environ&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;MSK_BOOTSTRAP_SERVERS&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;security.protocol&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;SASL_SSL&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;sasl.mechanisms&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;AWS_MSK_IAM&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;sasl.username&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;AWS_MSK_IAM&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;sasl.password&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;AWS_MSK_IAM&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;},&lt;/span&gt; &lt;span class="n"&gt;logger&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;logger&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  &lt;span class="c1"&gt;# ← This line saves hours of debugging
&lt;/span&gt;
&lt;span class="c1"&gt;# Without the logger, librdkafka errors vanish into the void
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Why does this matter? The &lt;code&gt;confluent-kafka&lt;/code&gt; library is a wrapper around the C library &lt;code&gt;librdkafka&lt;/code&gt;. When network issues, authentication failures, or broker problems occur, those errors happen in C-land. Without explicitly passing your logger, you'll see your messages disappear into the ether with zero indication of what went wrong.&lt;/p&gt;
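&lt;p&gt;The logger surfaces client-level errors, but per-message delivery outcomes are reported through &lt;code&gt;confluent-kafka&lt;/code&gt;'s &lt;code&gt;on_delivery&lt;/code&gt; callback, which fires from &lt;code&gt;poll()&lt;/code&gt; or &lt;code&gt;flush()&lt;/code&gt;. A minimal sketch:&lt;/p&gt;

```python
import logging

logger = logging.getLogger(__name__)

def delivery_report(err, msg):
    """confluent-kafka's on_delivery contract: (KafkaError or None, Message)."""
    if err is not None:
        logger.error("Delivery failed for %s [%s]: %s",
                     msg.topic(), msg.partition(), err)
    else:
        logger.info("Delivered to %s [%s] @ offset %s",
                    msg.topic(), msg.partition(), msg.offset())

# Usage, assuming a configured Producer named `producer`:
# producer.produce("user-events", value=payload, on_delivery=delivery_report)
# producer.flush(timeout=10)  # callbacks fire during poll()/flush()
```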

&lt;h2&gt;
  
  
  Static Initialization: The Cold Start Dance
&lt;/h2&gt;

&lt;p&gt;Lambda's cold start behavior creates interesting challenges for Kafka producers:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;confluent_kafka&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;logging&lt;/span&gt;

&lt;span class="c1"&gt;# Static initialization - runs during cold start
&lt;/span&gt;&lt;span class="n"&gt;logger&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;logging&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getLogger&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;__name__&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;logger&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;setLevel&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;logging&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;INFO&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Initialize producer outside the handler
&lt;/span&gt;&lt;span class="n"&gt;producer&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;confluent_kafka&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;Producer&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;bootstrap.servers&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;environ&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;MSK_BOOTSTRAP_SERVERS&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;security.protocol&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;SASL_SSL&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;sasl.mechanisms&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;AWS_MSK_IAM&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;sasl.username&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;AWS_MSK_IAM&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; 
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;sasl.password&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;AWS_MSK_IAM&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;},&lt;/span&gt; &lt;span class="n"&gt;logger&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;logger&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;lambda_handler&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;event&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="c1"&gt;# Check connectivity in the handler
&lt;/span&gt;    &lt;span class="c1"&gt;# Producer might have disconnected during cold periods
&lt;/span&gt;    &lt;span class="k"&gt;try&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="c1"&gt;# Quick connectivity check
&lt;/span&gt;        &lt;span class="n"&gt;metadata&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;producer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;list_topics&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;timeout&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;logger&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;info&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Connected to &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;metadata&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;brokers&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; brokers&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;except&lt;/span&gt; &lt;span class="nb"&gt;Exception&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;logger&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;error&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Kafka connectivity issue: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="c1"&gt;# Consider reinitializing producer here
&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;record&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;event&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;records&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;
        &lt;span class="nf"&gt;process_and_produce&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;record&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# Always flush before handler completes
&lt;/span&gt;    &lt;span class="n"&gt;producer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;flush&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;timeout&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Pro tip:&lt;/strong&gt; You'll be surprised how often connections drop during Lambda's idle periods. Always verify connectivity at the start of your handler.&lt;/p&gt;
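&lt;p&gt;One way to act on that check, rather than just logging it, is to rebuild the producer whenever the metadata ping fails. A sketch of the pattern, where &lt;code&gt;make_producer&lt;/code&gt; is an assumed factory returning a configured &lt;code&gt;confluent_kafka.Producer&lt;/code&gt;:&lt;/p&gt;

```python
class LazyProducer:
    """Rebuild the underlying producer when connectivity is lost."""

    def __init__(self, make_producer):
        # make_producer is an assumed factory,
        # e.g. lambda: confluent_kafka.Producer(conf, logger=logger)
        self._make = make_producer
        self._producer = None

    def get(self):
        if self._producer is None:
            self._producer = self._make()
        try:
            self._producer.list_topics(timeout=5)  # cheap metadata ping
        except Exception:
            # Stale connection from an idle period: drop it and rebuild.
            self._producer = self._make()
        return self._producer
```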

&lt;h2&gt;
  
  
  SnapStart: Just Don't
&lt;/h2&gt;

&lt;p&gt;AWS Lambda &lt;a href="https://docs.aws.amazon.com/lambda/latest/dg/snapstart.html" rel="noopener noreferrer"&gt;SnapStart&lt;/a&gt; now supports Python 3.12, and you might be tempted to enable it for faster cold starts. &lt;strong&gt;Don't.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Remember those uniqueness issues we discussed with SnapStart? The &lt;code&gt;librdkafka&lt;/code&gt; C library is essentially a black box. It maintains internal state, generates unique IDs, establishes connections, and manages all sorts of random number generation for things like:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Client IDs and correlation IDs&lt;/li&gt;
&lt;li&gt;Connection retry jitter&lt;/li&gt;
&lt;li&gt;Internal message sequencing&lt;/li&gt;
&lt;li&gt;SSL session management&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;When SnapStart creates a snapshot and reuses it across multiple execution environments, you risk:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Duplicate client IDs connecting to brokers&lt;/li&gt;
&lt;li&gt;Non-unique message correlation IDs&lt;/li&gt;
&lt;li&gt;Shared SSL sessions across instances&lt;/li&gt;
&lt;li&gt;Broken internal state assumptions&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The performance gain from SnapStart isn't worth the debugging nightmare when your Kafka producers start behaving erratically.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Philosophy Shift
&lt;/h2&gt;

&lt;p&gt;Building EDA with Lambda ESM requires a different mindset:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Embrace push-based thinking&lt;/strong&gt; - You react to batches, not control consumption&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Design for idempotency&lt;/strong&gt; - Messages will be reprocessed, plan for it&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Monitor batch metrics&lt;/strong&gt; - Tune batch size based on processing time, not message count&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Invest in observability&lt;/strong&gt; - When things go wrong, you need visibility into both Lambda and Kafka metrics&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Plan for failure modes&lt;/strong&gt; - DLQ configuration is not optional&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  The Bottom Line
&lt;/h2&gt;

&lt;p&gt;MSK + Lambda + &lt;code&gt;confluent-kafka-python&lt;/code&gt; is a powerful stack for event-driven systems, but it's not traditional Kafka development. The constraints imposed by Event Source Mapping fundamentally change how you think about consuming messages.&lt;/p&gt;

&lt;p&gt;Stop fighting the platform and start designing with it. Your consumers become stateless message processors. Your error handling becomes batch-oriented. Your observability becomes multi-layered.&lt;/p&gt;

&lt;p&gt;Once you embrace these patterns, you'll find that serverless EDA can be incredibly productive. Just don't expect it to work like the Kafka applications you're used to building.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Ready to dive deeper?&lt;/strong&gt; Check out the &lt;a href="https://docs.aws.amazon.com/lambda/latest/dg/with-msk.html" rel="noopener noreferrer"&gt;AWS Lambda ESM documentation&lt;/a&gt; and start small. Build a simple message processor first, then gradually add complexity as you understand the platform's quirks.&lt;/p&gt;

&lt;p&gt;What's been your biggest surprise when building serverless event-driven systems? Share your war stories in the comments below! &lt;/p&gt;




&lt;p&gt;&lt;em&gt;Have you wrestled with similar challenges in your EDA journey? Found other gotchas worth sharing? Drop them in the comments—the community learns from our collective debugging pain! 😅&lt;/em&gt;&lt;/p&gt;

</description>
      <category>python</category>
      <category>kafka</category>
      <category>eventdriven</category>
      <category>lambda</category>
    </item>
    <item>
      <title>Testing Kafka Applications: Why Most Pythonistas Are Doing It Wrong (And How to Fix It)</title>
      <dc:creator>Andrew</dc:creator>
      <pubDate>Tue, 10 Jun 2025 16:24:51 +0000</pubDate>
      <link>https://dev.to/and-ratajski/testing-kafka-applications-why-most-pythonistas-are-doing-it-wrong-and-how-to-fix-it-31m2</link>
      <guid>https://dev.to/and-ratajski/testing-kafka-applications-why-most-pythonistas-are-doing-it-wrong-and-how-to-fix-it-31m2</guid>
      <description>&lt;p&gt;&lt;em&gt;Breaking free from the integration testing nightmare with kafka-mocha&lt;/em&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  The Harsh Reality of Kafka Testing in Python
&lt;/h2&gt;

&lt;p&gt;Picture this: You're building a microservice with Kafka integration. You've written beautiful business logic, carefully crafted your event schemas, and implemented robust error handling. Then comes the dreaded question: "How do you test this?"&lt;/p&gt;

&lt;p&gt;If you're like most Python developers working with Event-Driven Architecture (EDA), you probably fall into one of these camps:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;The Optimist&lt;/strong&gt;: "I'll just spin up a Kafka cluster for testing"&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;The Pragmatist&lt;/strong&gt;: "Unit tests with mocks should be enough"
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;The Procrastinator&lt;/strong&gt;: "We'll test it in production" 😬&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;After years of building production Kafka systems in Python, I've discovered that &lt;strong&gt;all three approaches are fundamentally flawed&lt;/strong&gt;. Here's why—and how we can do better.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Testing Gap Nobody Talks About
&lt;/h2&gt;

&lt;p&gt;Let's be honest: despite everyone preaching the importance of testing, most developers barely write anything beyond bootcamp-level unit tests. When it comes to Kafka applications, this problem becomes exponentially worse.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Unit tests&lt;/strong&gt; mock everything away—they tell you if your &lt;code&gt;json.loads()&lt;/code&gt; works, but nothing about whether your serialization actually matches your schema.&lt;/p&gt;
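&lt;p&gt;To see why, consider what a mock-everything test actually proves. The sketch below (with a hypothetical &lt;code&gt;serialize_user&lt;/code&gt;) passes even if the serialized payload is garbage that no consumer could ever read:&lt;/p&gt;

```python
from unittest.mock import MagicMock

def serialize_user(user):
    # Hypothetical stand-in; real code would serialize against an AVRO/JSON schema.
    return str(user)

def test_publishes_user_event():
    producer = MagicMock()  # the mock accepts anything, so nothing is validated
    producer.produce("user-events", serialize_user({"id": 1}))
    # Asserts only that *a* call happened, not that the payload matches any schema.
    producer.produce.assert_called_once()
```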

&lt;p&gt;&lt;strong&gt;End-to-end tests&lt;/strong&gt; with real Kafka clusters are brittle, slow, and require complex infrastructure. They're great for final validation but terrible for rapid development cycles.&lt;/p&gt;

&lt;p&gt;What we're missing is the sweet spot: &lt;strong&gt;true integration tests&lt;/strong&gt; that validate how components within your microservice work together—your producers, consumers, serializers, and business logic—without external dependencies.&lt;/p&gt;

&lt;h2&gt;
  
  
  What Integration Testing Really Means
&lt;/h2&gt;

&lt;p&gt;Let's clarify terminology because this confusion costs teams months of debugging:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Unit Tests&lt;/strong&gt;: Test individual functions in isolation&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Integration Tests&lt;/strong&gt;: Test how components &lt;em&gt;within your service&lt;/em&gt; work together
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;End-to-End Tests&lt;/strong&gt;: Test complete user flows across multiple services&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Most Kafka testing problems stem from trying to do integration testing with unit test tools (excessive mocking) or e2e test infrastructure (full Kafka clusters).&lt;/p&gt;

&lt;h2&gt;
  
  
  kafka-mocha: Born from Production Pain
&lt;/h2&gt;

&lt;p&gt;After wrestling with these limitations across multiple production systems, I built &lt;code&gt;kafka-mocha&lt;/code&gt;—a library that brings sanity to Kafka testing in Python. Here's what makes it different:&lt;/p&gt;

&lt;h3&gt;
  
  
  1. &lt;strong&gt;Total Isolation&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;No Docker containers, no test clusters, no network calls. Your tests run in complete isolation while maintaining full Kafka behavior fidelity.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="nd"&gt;@mock_producer&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;test_user_registration&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="c1"&gt;# This looks like production code but runs in isolation
&lt;/span&gt;    &lt;span class="n"&gt;producer&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;confluent_kafka&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;Producer&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;bootstrap.servers&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;localhost:9092&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;})&lt;/span&gt;
    &lt;span class="n"&gt;producer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;produce&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;user-events&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nf"&gt;serialize_user&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;user_data&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="n"&gt;producer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;flush&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

    &lt;span class="c1"&gt;# Verify the exact messages that would hit Kafka
&lt;/span&gt;    &lt;span class="k"&gt;assert&lt;/span&gt; &lt;span class="n"&gt;producer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;m__get_all_produced_messages_no&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;user-events&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  2. &lt;strong&gt;Schema Registry Pre-loading&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;Load your AVRO/JSON schemas at test startup. No more "schema not found" surprises in production.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="nd"&gt;@mock_schema_registry&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;register_schemas&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;
        &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;source&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;schemas/user-registered.avsc&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;subject&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;user-events-value&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
        &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;source&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;schemas/event-key.avsc&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;subject&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;user-events-key&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;test_schema_evolution&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="c1"&gt;# Schemas are pre-loaded and ready
&lt;/span&gt;    &lt;span class="n"&gt;schema_registry&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;confluent_kafka&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;schema_registry&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;SchemaRegistryClient&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;url&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;http://localhost:8081&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;})&lt;/span&gt;
    &lt;span class="c1"&gt;# Test your serialization logic against real schemas
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  3. &lt;strong&gt;Message Pre-loading with Runtime Serialization&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;Define your test data in JSON and let kafka-mocha serialize it at runtime using your actual schemas.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="nd"&gt;@mock_consumer&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;inputs&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;source&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;test-data/user-events.json&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;topic&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;user-events&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;serialize&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;])&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;test_user_event_processing&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="c1"&gt;# JSON test data gets serialized using your production schemas
&lt;/span&gt;    &lt;span class="n"&gt;consumer&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;confluent_kafka&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;Consumer&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;config&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;messages&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;consumer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;consume&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="c1"&gt;# Process real serialized messages, not mocked objects
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  4. &lt;strong&gt;Production-Grade Output Inspection&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;Export all produced messages to HTML or CSV for debugging. See exactly what your code would send to Kafka.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="nd"&gt;@mock_producer&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;output&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;format&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;html&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;name&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;debug-output.html&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;})&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;test_complex_workflow&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="c1"&gt;# Run your workflow
&lt;/span&gt;    &lt;span class="nf"&gt;process_user_registration&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;user_data&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# Open debug-output.html to see every message, header, and timestamp
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Why This Matters: Real-World Impact
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Before kafka-mocha:
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Integration tests&lt;/strong&gt;: 45 seconds (Docker + Kafka startup)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Flaky failures&lt;/strong&gt;: ~15% (network timeouts, port conflicts)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Schema issues&lt;/strong&gt;: Discovered in production&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Debug time&lt;/strong&gt;: Hours of log diving&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  After kafka-mocha:
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Integration tests&lt;/strong&gt;: 0.3 seconds (pure Python)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Flaky failures&lt;/strong&gt;: 0% (no external dependencies)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Schema issues&lt;/strong&gt;: Caught at test time&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Debug time&lt;/strong&gt;: Minutes with HTML output&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;em&gt;The numbers above were fabricated by my AI assistant 🤓&lt;/em&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  The Testing Philosophy Shift
&lt;/h2&gt;

&lt;p&gt;kafka-mocha advocates for a specific testing philosophy:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Don't forgo unit tests - they're your best friend!&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Test component integration, not implementation details&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Use real serialization, not mock objects&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Validate actual message content, not method calls&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Make debugging visual and intuitive&lt;/strong&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;This isn't just about faster tests—it's about &lt;strong&gt;testing confidence&lt;/strong&gt;. When your integration tests pass, you know your Kafka integration actually works.&lt;/p&gt;
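&lt;p&gt;Point 4 is the crux. A call-count assertion tells you a method was invoked; a content assertion tells you the right bytes would reach the broker. Here's a minimal, library-free sketch of the difference (&lt;code&gt;serialize_user&lt;/code&gt;, &lt;code&gt;register_user&lt;/code&gt;, and the JSON payload are hypothetical stand-ins, not kafka-mocha API):&lt;/p&gt;

```python
import json
from unittest.mock import MagicMock

def serialize_user(user_data: dict) -> bytes:
    """Hypothetical serializer: JSON-encode the payload (stand-in for AVRO)."""
    return json.dumps(user_data, sort_keys=True).encode("utf-8")

def register_user(producer, user_data: dict) -> None:
    """Hypothetical handler under test: serialize and produce one event."""
    producer.produce("user-events", serialize_user(user_data))
    producer.flush()

producer = MagicMock()
register_user(producer, {"id": 42, "email": "a@example.com"})

# Call-based check: passes even if serialize_user is broken,
# because it only verifies that produce() was invoked.
assert producer.produce.call_count == 1

# Content-based check: decode the bytes that would hit the broker
# and assert on the actual payload.
topic, payload = producer.produce.call_args.args
assert topic == "user-events"
assert json.loads(payload) == {"id": 42, "email": "a@example.com"}
```

&lt;p&gt;The second pair of assertions would catch a field rename or a broken serializer; the first would not.&lt;/p&gt;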

&lt;h2&gt;
  
  
  Getting Started
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pip &lt;span class="nb"&gt;install &lt;/span&gt;kafka-mocha
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Transform your existing confluent-kafka code:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Before: Brittle, slow, complex
&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;test_with_real_kafka&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="c1"&gt;# Setup Kafka, create topics, manage cleanup...
&lt;/span&gt;
&lt;span class="c1"&gt;# After: Fast, reliable, simple  
&lt;/span&gt;&lt;span class="nd"&gt;@mock_producer&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;test_with_kafka_mocha&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="c1"&gt;# Existing code works unchanged
&lt;/span&gt;    &lt;span class="n"&gt;producer&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;confluent_kafka&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;Producer&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;config&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="c1"&gt;# Test with confidence
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Beyond Testing: A Development Accelerator
&lt;/h2&gt;

&lt;p&gt;The unexpected benefit? kafka-mocha becomes a development tool. Iterate on message schemas, test serialization logic, and debug complex event flows—all without leaving your IDE.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="nd"&gt;@mock_producer&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;output&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;format&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;html&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;name&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;schema-evolution-test.html&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;})&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;explore_schema_changes&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="c1"&gt;# Experiment with schema changes
&lt;/span&gt;    &lt;span class="c1"&gt;# Visualize the output
&lt;/span&gt;    &lt;span class="c1"&gt;# Iterate rapidly
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  The Bottom Line
&lt;/h2&gt;

&lt;p&gt;Most Python developers are stuck in a false dichotomy: oversimplified unit tests or overcomplicated end-to-end tests. kafka-mocha provides the missing middle ground—true integration testing that's fast, reliable, and actually useful.&lt;/p&gt;

&lt;p&gt;Stop testing Kafka applications like it's 2010. Your future self (and your production systems) will thank you.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Ready to transform your Kafka testing? Check out &lt;a href="https://github.com/Effiware/kafka-mocha" rel="noopener noreferrer"&gt;kafka-mocha on GitHub&lt;/a&gt; and join the developers who've already escaped the integration testing nightmare.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What's your biggest Kafka testing pain point? Share in the comments below.&lt;/strong&gt;&lt;/p&gt;

</description>
      <category>eventdriven</category>
      <category>testing</category>
      <category>kafka</category>
      <category>python</category>
    </item>
  </channel>
</rss>
