DEV Community

Kafka for Data Engineers: Core Concepts, KRaft, and the Patterns That Actually Work

De' Clerke — Tue, 02 Jun 2026 20:36:14 +0000

If your Kafka Docker Compose still has a ZooKeeper service in it, your setup is already legacy. As of Kafka 4.0 (released March 2025), ZooKeeper is gone. The architecture changed, the config changed, and the setup you learned two years ago will not work with any Kafka 4.x image.

This guide covers Kafka from the ground up — what it is, how it works, how to run it locally in 2026 with KRaft, and the Python patterns that hold up in production. It also covers the gotchas that waste days when you're new to it.

What Kafka Is and Why Data Engineers Use It

Kafka is a distributed event streaming platform. The core abstraction is a log: an ordered, append-only sequence of records. Producers write to the log. Consumers read from it. The log is retained for a configurable period (default 7 days), so multiple consumers can independently read the same data at their own pace.

For data engineers, this matters in three scenarios:

Real-time ingestion. When data arrives faster than a batch pipeline can process it — flight events, sensor readings, stock ticks, user clicks — Kafka buffers it reliably. Your pipeline reads at whatever pace it can sustain without losing data.

Decoupling producers from consumers. Your ingestion pipeline doesn't need to know who consumes the data or how fast they process it. Add a new downstream system and it starts reading from whatever offset it needs. Nothing about the producer changes.

Change Data Capture (CDC). Kafka Connect + Debezium reads PostgreSQL's write-ahead log and streams every row-level insert, update, and delete to a Kafka topic. Downstream systems get a live feed of database changes without polling.

The Architecture: What You Actually Need to Know

Brokers, Topics, and Partitions

A Kafka broker is a server that stores messages and serves producers and consumers. A production cluster has multiple brokers for fault tolerance.

A topic is a named stream of messages — like a database table, but append-only. Topics are divided into partitions, which are the actual unit of storage and parallelism. Each partition is an ordered, immutable log on disk.

Topic: flight-events (3 partitions)

Partition 0: [offset 0] [offset 1] [offset 2] [offset 3] ...
Partition 1: [offset 0] [offset 1] [offset 2] ...
Partition 2: [offset 0] [offset 1] ...

Messages within a partition are strictly ordered. Messages across partitions are not. If you need strict global ordering, use one partition. If you need parallelism, use more.

Partition count rule of thumb: partitions = max(target throughput / single-partition throughput, number of consumers). You can increase partition count later, but you can never decrease it. Start conservative.

Offsets

An offset is the unique sequential ID of a message within a partition. Offset 0 is the first message. Offset 47 is the 48th. Consumers track their position by storing ("committing") the last offset they processed. This is how they resume after a restart without reprocessing everything or missing anything.

Consumer Groups

A consumer group is a set of consumers that split a topic's partitions between them. Each partition is assigned to exactly one consumer in the group at a time.

Topic: flight-events (4 partitions)
Consumer group: pipeline-processors (2 consumers)

Consumer A: handles Partition 0 + Partition 1
Consumer B: handles Partition 2 + Partition 3

This is the horizontal scaling model. To process faster, add consumers to the group — up to the number of partitions. Beyond that, extra consumers sit idle.

A rebalance happens when consumers join or leave the group. Partitions get redistributed. In older Kafka, rebalances were disruptive — all consumers stopped processing, all partitions were revoked, and they were reassigned from scratch. With the new rebalance protocol (KIP-848, GA in Kafka 4.0), rebalances are incremental. Only the partitions that need to move actually move. For large consumer groups, this is a significant operational improvement.

Delivery Semantics

Three guarantees, each with a different tradeoff:

Semantic	Description	Risk
At-most-once	Commit before processing	May lose messages on crash
At-least-once	Commit after processing	May process duplicates on retry
Exactly-once	Transactions + idempotent producer	Highest complexity and latency

For most data engineering pipelines, at-least-once is the right default. Make your processing idempotent (upsert into PostgreSQL with ON CONFLICT, for example) and duplicates become harmless.

The Big Change in 2025: KRaft Replaces ZooKeeper

Every Kafka deployment before Kafka 4.0 required ZooKeeper — a separate distributed coordination service that managed cluster metadata (which broker is the controller, partition leaders, topic configs, consumer group state).

ZooKeeper was the most operationally painful part of running Kafka. It required its own cluster, its own monitoring, its own tuning. It had its own failure modes that were separate from Kafka's.

KRaft (Kafka Raft Metadata) replaces ZooKeeper entirely. Kafka manages its own metadata using the Raft consensus algorithm, built directly into the broker. There is no separate service. The cluster self-manages.

Kafka 3.9.x was the last release to support ZooKeeper. If you're on Kafka 3.9 with ZooKeeper and want to upgrade to 4.0+, you must migrate to KRaft first using the migration tooling in 3.9. You cannot skip it.

Kafka 4.0+ is KRaft-only, no exceptions. The current stable release is 4.3.0 (May 2026).

The practical impact for a new project: your Docker Compose is simpler. No ZooKeeper service. No KAFKA_ZOOKEEPER_CONNECT. Just Kafka, configured to run as both broker and controller.

Running Kafka Locally: KRaft Docker Compose

The Confluent Platform Docker images are the easiest way to run Kafka locally. Confluent Platform 8.1 maps to Kafka 4.1.

# docker-compose.yml — Kafka 4.1 (Confluent Platform 8.1.2), KRaft mode
version: '3.8'
services:
  kafka:
    image: confluentinc/cp-kafka:8.1.2
    container_name: kafka
    hostname: kafka
    ports:
      - "9092:9092"
    environment:
      KAFKA_NODE_ID: 1
      KAFKA_PROCESS_ROLES: broker,controller
      KAFKA_LISTENERS: PLAINTEXT://0.0.0.0:9092,CONTROLLER://0.0.0.0:9093
      KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://localhost:9092
      KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: PLAINTEXT:PLAINTEXT,CONTROLLER:PLAINTEXT
      KAFKA_CONTROLLER_LISTENER_NAMES: CONTROLLER
      KAFKA_CONTROLLER_QUORUM_VOTERS: 1@kafka:9093
      KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1
      KAFKA_TRANSACTION_STATE_LOG_REPLICATION_FACTOR: 1
      KAFKA_TRANSACTION_STATE_LOG_MIN_ISR: 1
      KAFKA_AUTO_CREATE_TOPICS_ENABLE: 'true'
      KAFKA_LOG_RETENTION_HOURS: 168
      CLUSTER_ID: "MkU3OEVBNTcwNTJENDM2Qk"
    volumes:
      - kafka_data:/var/lib/kafka/data
    healthcheck:
      test: kafka-topics --bootstrap-server localhost:9092 --list || exit 1
      interval: 10s
      timeout: 10s
      retries: 5

  kafka-ui:
    image: provectuslabs/kafka-ui:latest
    container_name: kafka-ui
    depends_on:
      kafka:
        condition: service_healthy
    ports:
      - "8080:8080"
    environment:
      KAFKA_CLUSTERS_0_NAME: local
      KAFKA_CLUSTERS_0_BOOTSTRAPSERVERS: kafka:9092

volumes:
  kafka_data:

Start it:

docker compose up -d
docker compose logs -f kafka    # watch for "started (kafka.server.KafkaServer)"

The Kafka UI at http://localhost:8080 shows topics, consumer groups, lag, and messages visually. Use it while you learn.

The ADVERTISED_LISTENERS gotcha (the most common Docker error):

KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://localhost:9092 works when your producer/consumer runs on the host machine. If your producer also runs inside Docker (a different container), use the service name instead:

KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://kafka:9092

The rule: localhost for host-to-container access, kafka (the service name) for container-to-container. Mixing them produces the error kafka: client has run out of available brokers to talk to with no further explanation.

For multi-service Docker setups (producer container + Kafka container + consumer container), add two listeners:

KAFKA_LISTENERS: PLAINTEXT://0.0.0.0:9092,PLAINTEXT_HOST://0.0.0.0:29092
KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://kafka:9092,PLAINTEXT_HOST://localhost:29092
KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: PLAINTEXT:PLAINTEXT,PLAINTEXT_HOST:PLAINTEXT

Internal containers connect on kafka:9092. Your host machine connects on localhost:29092.

CLI: The Commands You'll Use Constantly

Get into the container:

docker exec -it kafka bash

Topics:

# Create a topic
kafka-topics --bootstrap-server localhost:9092 \
  --create --topic flight-events \
  --partitions 3 \
  --replication-factor 1

# List all topics
kafka-topics --bootstrap-server localhost:9092 --list

# Describe a topic (partitions, leaders, replicas, ISR)
kafka-topics --bootstrap-server localhost:9092 --describe --topic flight-events

# Delete a topic
kafka-topics --bootstrap-server localhost:9092 --delete --topic flight-events

# Change retention to 1 day
kafka-configs --bootstrap-server localhost:9092 \
  --alter --entity-type topics --entity-name flight-events \
  --add-config retention.ms=86400000

Quick produce/consume for testing:

# Produce — type messages, Ctrl+C to stop
kafka-console-producer \
  --bootstrap-server localhost:9092 \
  --topic flight-events

# Consume from the beginning
kafka-console-consumer \
  --bootstrap-server localhost:9092 \
  --topic flight-events \
  --from-beginning

# Consume as part of a named group
kafka-console-consumer \
  --bootstrap-server localhost:9092 \
  --topic flight-events \
  --group my-test-group \
  --from-beginning

Consumer group inspection (critical for debugging lag):

# List all consumer groups
kafka-consumer-groups --bootstrap-server localhost:9092 --list

# Describe a group — shows current offset, log end offset, and lag per partition
kafka-consumer-groups --bootstrap-server localhost:9092 \
  --describe --group flight-processor

# Output:
# TOPIC          PARTITION  CURRENT-OFFSET  LOG-END-OFFSET  LAG  CONSUMER-ID
# flight-events  0          1234            1236            2    ...
# flight-events  1          890             890             0    ...
# flight-events  2          445             450             5    ...

Resetting offsets:

# Reset to beginning (reprocess all messages)
kafka-consumer-groups --bootstrap-server localhost:9092 \
  --group flight-processor \
  --topic flight-events \
  --reset-offsets --to-earliest --execute

# Reset to latest (skip existing, process only new messages)
kafka-consumer-groups --bootstrap-server localhost:9092 \
  --group flight-processor \
  --topic flight-events \
  --reset-offsets --to-latest --execute

# Reset to a specific datetime
kafka-consumer-groups --bootstrap-server localhost:9092 \
  --group flight-processor \
  --topic flight-events \
  --reset-offsets --to-datetime 2025-01-01T00:00:00.000 --execute

Python: Producer Patterns with confluent-kafka

Install:

pip install confluent-kafka==2.14.0

Basic producer:

import json
from confluent_kafka import Producer, KafkaException

KAFKA_CONFIG = {
    'bootstrap.servers': 'localhost:9092',
    'acks': 'all',                           # wait for all in-sync replicas
    'enable.idempotence': True,              # exactly-once per partition, retries safe
    'max.in.flight.requests.per.connection': 5,
    'retries': 10,
    'retry.backoff.ms': 1000,
    'compression.type': 'snappy',
    'batch.size': 32768,                     # 32 KB batches
    'linger.ms': 10,                         # wait 10ms to fill a batch before sending
}

def delivery_callback(err, msg):
    if err:
        print(f"Delivery failed: {err}")
    else:
        print(f"Delivered to {msg.topic()} [{msg.partition()}] @ offset {msg.offset()}")

producer = Producer(KAFKA_CONFIG)

def produce_message(topic: str, key: str, value: dict):
    producer.produce(
        topic=topic,
        key=key.encode('utf-8'),
        value=json.dumps(value).encode('utf-8'),
        callback=delivery_callback,
    )
    producer.poll(0)    # trigger callbacks without blocking

def produce_batch(topic: str, records: list[dict], key_field: str):
    for record in records:
        key = str(record.get(key_field, ''))
        producer.produce(
            topic=topic,
            key=key.encode('utf-8'),
            value=json.dumps(record).encode('utf-8'),
            callback=delivery_callback,
        )
    producer.flush()    # block until all messages confirmed delivered
    print(f"Flushed {len(records)} messages to {topic}")

Why enable.idempotence=True: Without it, a retry after a network timeout might deliver the same message twice because the broker received it but the acknowledgment was lost. With idempotence enabled, Kafka assigns each message a sequence number and deduplicates retries at the broker level. Always enable it unless you have a specific reason not to.

Why linger.ms=10: By default, Kafka sends messages as soon as they're produced, one batch at a time. Setting linger.ms tells the producer to wait up to 10ms for more messages before sending, filling batches more efficiently. For high-throughput pipelines this meaningfully reduces network round trips.

Python: Consumer Patterns

from confluent_kafka import Consumer, KafkaError, KafkaException
import json
import signal

CONSUMER_CONFIG = {
    'bootstrap.servers': 'localhost:9092',
    'group.id': 'flight-processor',
    'auto.offset.reset': 'earliest',     # start from the beginning if no committed offset
    'enable.auto.commit': False,         # manual commit — never let Kafka auto-commit
    'max.poll.interval.ms': 300000,      # 5 min max between polls before rebalance triggered
    'session.timeout.ms': 45000,
    'heartbeat.interval.ms': 3000,
}

consumer = Consumer(CONSUMER_CONFIG)
consumer.subscribe(['flight-events'])

running = True

def shutdown(signum, frame):
    global running
    running = False

signal.signal(signal.SIGTERM, shutdown)
signal.signal(signal.SIGINT, shutdown)

try:
    while running:
        msg = consumer.poll(timeout=1.0)
        if msg is None:
            continue
        if msg.error():
            if msg.error().code() == KafkaError._PARTITION_EOF:
                # Reached end of partition — not an error, just informational
                continue
            raise KafkaException(msg.error())

        key = msg.key().decode('utf-8') if msg.key() else None
        value = json.loads(msg.value().decode('utf-8'))

        # Process the message
        process_event(value)

        # Commit after processing — at-least-once delivery
        consumer.commit(message=msg)
finally:
    consumer.close()

The auto.offset.reset gotcha — the mistake everyone makes first:

auto.offset.reset only applies when a consumer group has no committed offsets — meaning it's brand new or its offsets were deleted. It does not override existing committed offsets.

earliest: start from the oldest available message (what you want for a new consumer seeing existing data)
latest: start from the newest message — everything before the consumer started is invisible

The common mistake: you write a new consumer, set auto.offset.reset='latest', start it, and wonder why it receives nothing. It's because there were no new messages after it started. All the existing messages are already "past" the latest offset. Change to earliest and reset the consumer group offsets, or use --to-earliest on the CLI.

Why enable.auto.commit=False: Auto-commit fires on a timer (every 5 seconds by default). If your process crashes after the commit but before processing finishes, messages are lost. If it crashes after processing but before the commit, you reprocess on restart. Manual commit after processing gives you at-least-once: possible duplicates, never lost. For idempotent sinks (PostgreSQL upserts), this is the right tradeoff.

Batch consumer (for loading into a database):

Instead of committing after every single message, accumulate a batch and commit once:

def consume_and_load(topic: str, batch_size: int = 500):
    consumer.subscribe([topic])
    batch = []
    try:
        while True:
            msg = consumer.poll(timeout=5.0)
            if msg and not msg.error():
                batch.append(json.loads(msg.value()))

            if len(batch) >= batch_size:
                load_to_database(batch)    # your upsert logic
                consumer.commit()
                batch = []
    finally:
        if batch:
            load_to_database(batch)
            consumer.commit()
        consumer.close()

This pattern is typical for Kafka-to-PostgreSQL pipelines: poll until you have 500 messages, bulk-insert with ON CONFLICT DO NOTHING, commit, repeat.

The Rebalance Problem and How KIP-848 Fixes It

The most disruptive thing in a Kafka consumer setup is a rebalance. Before Kafka 4.0, every time a consumer joined or left a group, the broker would:

Tell all consumers to stop processing and revoke their partitions
Wait for all consumers to acknowledge
Reassign all partitions from scratch
Tell consumers to resume

For a consumer group with 20 consumers and 60 partitions, a single consumer restart triggered a full stop-the-world for every consumer. At high message volume, this caused visible processing gaps.

The new rebalance protocol (KIP-848, GA in Kafka 4.0) works incrementally. Only the partitions that need to move actually move. Consumers that don't need to change keep processing. The impact of a single consumer restart is isolated to that consumer's partitions.

To use the new protocol, set the group.protocol consumer config:

CONSUMER_CONFIG = {
    'bootstrap.servers': 'localhost:9092',
    'group.id': 'flight-processor',
    'group.protocol': 'consumer',      # new KIP-848 protocol; 'classic' is the old one
    'auto.offset.reset': 'earliest',
    'enable.auto.commit': False,
}

The consumer protocol is the default in Kafka 4.0+. If you're connecting to a 4.0+ broker, it will be used automatically. If you're connecting to a 3.x broker, it falls back to classic.

Monitoring Consumer Lag

Lag is the difference between the latest message in a partition and the last message your consumer committed. Lag = 0 means you're fully caught up. Lag growing means your consumer is slower than the producer.

CLI:

kafka-consumer-groups --bootstrap-server localhost:9092 \
  --describe --group flight-processor

The LAG column is what you watch. If it's consistently growing, your consumer needs more threads, more instances, or a faster processing path.

Python:

from confluent_kafka.admin import AdminClient, ConsumerGroupTopicPartitions, TopicPartition
from confluent_kafka import Consumer

admin = AdminClient({'bootstrap.servers': 'localhost:9092'})

def check_lag(group_id: str, topic: str, num_partitions: int) -> dict:
    partitions = [TopicPartition(topic, p) for p in range(num_partitions)]
    fut = admin.list_consumer_group_offsets(
        [ConsumerGroupTopicPartitions(group_id, partitions)]
    )
    committed = fut[group_id].result().topic_partitions

    temp_consumer = Consumer({'bootstrap.servers': 'localhost:9092', 'group.id': '_lag-check'})
    lag_report = {}
    for tp in committed:
        low, high = temp_consumer.get_watermark_offsets(
            TopicPartition(tp.topic, tp.partition)
        )
        lag = high - (tp.offset if tp.offset >= 0 else low)
        lag_report[f"{tp.topic}[{tp.partition}]"] = lag
        if lag > 10000:
            print(f"WARNING: high lag on {tp.topic}[{tp.partition}]: {lag}")
    temp_consumer.close()
    return lag_report

Common Errors and What They Actually Mean

"LEADER_NOT_AVAILABLE"
The topic was just created and leader election is still in progress. Wait 2–3 seconds and retry. This is not a real error for newly created topics.

Consumer receives no messages despite messages existing in the topic
Two causes: (1) auto.offset.reset='latest' and the consumer started after those messages were written — reset offsets to earliest for the group; (2) another consumer in the same group already committed those offsets — use a different group.id or reset the group.

"Rebalancing too often"
Your consumer is taking too long between poll() calls. If processing one batch takes longer than max.poll.interval.ms (default 5 minutes), Kafka assumes the consumer is dead and triggers a rebalance. Fix: increase max.poll.interval.ms, process fewer records per poll, or move slow processing to a background thread.

"Message too large" (broker rejects)
message.max.bytes on the broker (default 1MB) is smaller than your message. Align message.max.bytes on the broker and max.request.size on the producer:

# In Docker Compose:
KAFKA_MESSAGE_MAX_BYTES: 10485760         # 10 MB broker limit

# In producer config:
'max.request.size': 10485760             # must match

Dead Letter Queue — when you can't fix a bad message:

Some messages will always fail processing — malformed JSON, missing fields, unexpected types. Don't let them block your pipeline. Route them to a DLQ topic:

def process_with_dlq(msg, dlq_producer):
    try:
        data = json.loads(msg.value())
        process_event(data)
        consumer.commit(message=msg)
    except Exception as e:
        dlq_producer.produce(
            'flight-events-dlq',
            key=msg.key(),
            value=msg.value(),
            headers={
                'error': str(e).encode(),
                'original-topic': msg.topic().encode(),
                'original-partition': str(msg.partition()).encode(),
                'original-offset': str(msg.offset()).encode(),
            },
        )
        dlq_producer.flush()
        consumer.commit(message=msg)    # commit anyway — bad message goes to DLQ, not back to queue

The DLQ lets you inspect failed messages later, fix the processing logic, and replay them.

Kafka with Airflow

The apache-airflow-providers-apache-kafka package adds native Kafka operators and sensors to Airflow:

pip install apache-airflow-providers-apache-kafka

Add a Kafka connection in the Airflow UI: Conn Type → Apache Kafka, Bootstrap Servers → kafka:9092.

from airflow.decorators import dag, task
from airflow.providers.apache.kafka.sensors.kafka import AwaitMessageSensor
from airflow.providers.apache.kafka.operators.consume import ConsumeFromTopicOperator
from datetime import datetime

@dag(schedule='@daily', start_date=datetime(2025, 1, 1))
def kafka_pipeline():

    # Wait for a message matching a condition before proceeding
    wait_for_data = AwaitMessageSensor(
        task_id='wait_for_flight_data',
        kafka_config_id='kafka_default',
        topics=['flight-events'],
        apply_function='my_module.check_message',
        poll_timeout=1.0,
        poll_interval=5,
    )

    # Consume and process a batch from the topic
    consume = ConsumeFromTopicOperator(
        task_id='consume_flights',
        kafka_config_id='kafka_default',
        topics=['flight-events'],
        apply_function='my_module.process_message',
        max_messages=1000,
        max_batch_size=100,
        commit_cadence='end_of_batch',
    )

    wait_for_data >> consume

kafka_pipeline()

# my_module.py
import json

def check_message(message, **context):
    data = json.loads(message.value())
    return data.get('status') == 'ready'

def process_message(message, **context):
    data = json.loads(message.value())
    # transform and return
    return {**data, 'processed_at': context['ts']}

For simpler use cases — just producing from a Python task — use the confluent_kafka library directly inside a @task function:

from confluent_kafka import Producer
import json

@task
def produce_events(records: list) -> int:
    p = Producer({'bootstrap.servers': 'kafka:9092', 'acks': 'all'})
    for record in records:
        p.produce('flight-events', value=json.dumps(record).encode())
    p.flush()
    return len(records)

Quick Reference: Key Config Decisions

Config	Recommended value	Why
`acks`	`all`	Safest — waits for all in-sync replicas
`enable.idempotence`	`True`	Deduplicates retries at broker level
`enable.auto.commit`	`False`	Manual commit for reliable at-least-once
`auto.offset.reset`	`earliest` (new consumers)	Don't silently miss existing messages
`compression.type`	`snappy`	Good ratio, fast encode/decode
`linger.ms`	`10`	Fill batches efficiently; minimal latency impact
`group.protocol`	`consumer` (Kafka 4.0+)	New incremental rebalance protocol

Version Reference (June 2026)

	Version	Notes
Apache Kafka	4.3.0	Latest stable (May 2026)
Confluent Platform	8.1.2	Maps to Kafka 4.1; use for Docker images
confluent-kafka (Python)	2.14.0	Preferred Python client
Last ZooKeeper release	3.9.2	Dead end — do not build new projects on it
Java requirement (brokers)	17+	From Kafka 4.0 onward

If you're migrating an existing ZooKeeper-based cluster, the path is: upgrade to Kafka 3.9 first (the bridge version with the best migration tooling), run the ZooKeeper-to-KRaft migration, then upgrade to 4.0+.

Follow me on dev.to for more data engineering content, or browse the project code at github.com/declerke.

Migrando uma Aplicação Vue 2 Legada de Webpack 2 para Vite: Um Guia Prático Baseado em Problemas Reais

Camila Rody — Tue, 02 Jun 2026 20:29:31 +0000

Quando falamos sobre migração para Vite, a maioria dos artigos parte de um cenário ideal: projetos relativamente modernos, dependências atualizadas, versões recentes do Node e uma arquitetura preparada para evoluir. Na prática, porém, muitas empresas ainda mantêm aplicações Vue 2 que surgiram anos atrás, construídas sobre Webpack 2, carregando uma grande quantidade de bibliotecas descontinuadas, loaders obsoletos e configurações que foram sendo adaptadas por diferentes equipes ao longo do tempo.

Foi exatamente esse cenário que encontrei. O objetivo não era migrar para Vue 3, reescrever a aplicação ou modernizar toda a stack de uma vez. O desafio era muito mais delicado: substituir o Webpack 2 por Vite mantendo a aplicação funcionando, preservando compatibilidade com bibliotecas legadas e sem interromper o desenvolvimento do produto.

Ao longo desse processo, ficou evidente que a maior dificuldade não estava na configuração do Vite em si. O verdadeiro desafio era descobrir tudo aquilo que o Webpack estava fazendo silenciosamente há anos e garantir que esse comportamento fosse reproduzido no novo ambiente.

Antes de Instalar o Vite, Entenda o Que o Webpack Faz Pelo Seu Projeto

Um erro bastante comum é iniciar a migração instalando o Vite e tentando fazer a aplicação subir imediatamente. Em aplicações pequenas isso pode funcionar. Em sistemas legados, normalmente gera uma sequência interminável de erros difíceis de rastrear.

Antes de qualquer alteração, é fundamental mapear as responsabilidades atuais do Webpack. Em muitos projetos ele não atua apenas como bundler. Frequentemente também é responsável por resolver aliases, processar imagens, transpilar JavaScript, injetar variáveis de ambiente, tratar arquivos Sass, gerar chunks dinâmicos e fornecer polyfills para APIs que os navegadores não implementam nativamente.

Por isso, o primeiro passo deve ser uma auditoria da configuração existente.

Analise cuidadosamente arquivos como:

webpack.config.js
webpack.dev.js
webpack.prod.js
.babelrc
package.json

Durante essa análise, documente:

Aliases utilizados
Loaders instalados
Plugins do Webpack
Configurações do Babel
Variáveis de ambiente
Polyfills
Estratégias de code splitting
Tratamento de assets

Essa documentação servirá como roteiro para toda a migração.

Estabilize a Versão do Node Antes de Trocar o Bundler

Em projetos antigos, é comum encontrar versões do Node extremamente defasadas. Muitas vezes o sistema foi desenvolvido utilizando Node 8, Node 10 ou Node 12, e diversas dependências foram escritas considerando limitações dessas versões.

Tentar atualizar Node e substituir Webpack ao mesmo tempo costuma gerar um problema clássico: quando algo quebra, torna-se difícil descobrir a causa.

Por isso, recomendo separar completamente essas etapas.

Primeiro, valide até qual versão do Node a aplicação consegue evoluir sem alterações significativas. Durante essa fase você provavelmente encontrará dependências problemáticas como:

node-sass
extract-text-webpack-plugin
uglify-js
babel-preset-es2015

Em muitos casos, a melhor decisão não é atualizar essas bibliotecas, mas congelá-las temporariamente para reduzir riscos.

O objetivo inicial não é modernizar tudo. É criar um ambiente estável sobre o qual a migração poderá acontecer.

Instale o Vite Sem Remover o Webpack

Uma das decisões mais importantes durante minha migração foi evitar uma substituição imediata.

Em vez de remover o Webpack, o Vite foi introduzido paralelamente.

A instalação básica para Vue 2 é relativamente simples:

npm install vite vite-plugin-vue2 --save-dev

Depois disso, crie um arquivo vite.config.js:

import { defineConfig } from 'vite'
import { createVuePlugin } from 'vite-plugin-vue2'

export default defineConfig({
  plugins: [
    createVuePlugin()
  ]
})

Nesse momento o Webpack continua existindo normalmente.

Os scripts passam a coexistir:

{
  "scripts": {
    "dev": "webpack-dev-server",
    "dev:vite": "vite",
    "build": "webpack",
    "build:vite": "vite build"
  }
}

Essa abordagem permite comparar comportamentos, validar funcionalidades e criar um plano de rollback caso algo inesperado aconteça.

Migrando os Aliases

Quase toda aplicação Vue 2 de médio ou grande porte utiliza aliases para evitar imports relativos extensos.

Um exemplo comum no Webpack:

resolve: {
  alias: {
    '@': path.resolve(__dirname, 'src'),
    '@components': path.resolve(__dirname, 'src/components')
  }
}

Esses aliases precisam ser reproduzidos no Vite.

import path from 'path'

export default defineConfig({
  resolve: {
    alias: {
      '@': path.resolve(__dirname, './src'),
      '@components': path.resolve(__dirname, './src/components')
    }
  }
})

Pode parecer um detalhe pequeno, mas em aplicações grandes isso afeta centenas ou milhares de imports.

Por isso, essa costuma ser uma das primeiras configurações que implemento durante a migração.

Corrigindo Imports Dinâmicos Que Funcionavam Apenas no Webpack

Uma das incompatibilidades mais frequentes está relacionada aos famosos require() dinâmicos.

Durante anos o Webpack permitiu padrões como:

const component = require('./pages/' + pageName)

const service = require(path)

O problema é que o Vite utiliza análise estática para resolver módulos e não consegue interpretar caminhos gerados dinamicamente em tempo de execução.

Nesses casos é necessário utilizar import.meta.glob().

Antes:

const component = require(`./pages/${pageName}.vue`)

Depois:

const pages = import.meta.glob('./pages/*.vue')

const component = pages[`./pages/${pageName}.vue`]

Em projetos legados, essa costuma ser uma das etapas mais trabalhosas da migração, pois exige revisão de arquitetura em diversos pontos da aplicação.

Migrando Variáveis de Ambiente

Outro ajuste obrigatório envolve as variáveis de ambiente.

Projetos baseados em Webpack normalmente utilizam:

process.env.API_URL

No Vite o padrão é diferente.

Antes:

const apiUrl = process.env.API_URL

Depois:

const apiUrl = import.meta.env.VITE_API_URL

Além disso, o arquivo .env precisa ser adaptado:

VITE_API_URL=https://api.minhaempresa.com

Um detalhe importante é que o Vite só expõe para o frontend variáveis prefixadas com:

VITE_

Ignorar esse detalhe pode gerar erros difíceis de identificar em ambientes de homologação e produção.

Lidando com Bibliotecas CommonJS e Dependências Abandonadas

Uma das maiores preocupações em sistemas legados é a quantidade de bibliotecas que não acompanham mais a evolução do ecossistema JavaScript.

Muitas delas ainda utilizam:

module.exports = library

exports.default = library

Embora o Vite possua mecanismos de compatibilidade, alguns pacotes exigem otimizações explícitas:

export default defineConfig({
  optimizeDeps: {
    include: [
      'legacy-library',
      'old-plugin'
    ]
  }
})

Quando isso não é suficiente, uma alternativa eficiente é criar wrappers de compatibilidade para isolar o problema e evitar alterações massivas no código.

Essa abordagem reduz significativamente o impacto da migração.

Polyfills: O Problema Que Geralmente Só Aparece Depois

Durante anos o Webpack forneceu diversos polyfills automaticamente.

Quando migramos para Vite, muitos deles desaparecem.

É comum encontrar erros como:

Buffer is not defined

process is not defined

global is not defined

Um exemplo de correção para Buffer:

import { Buffer } from 'buffer'

window.Buffer = Buffer

Esses problemas normalmente surgem apenas durante testes mais profundos, por isso é importante validar cuidadosamente todas as funcionalidades críticas da aplicação.

Revisando Loaders e Assets

Grande parte da complexidade de projetos Webpack antigos está concentrada nos loaders.

É comum encontrar configurações envolvendo:

file-loader
url-loader
raw-loader
svg-loader
font-loader

A boa notícia é que muitos desses recursos são tratados nativamente pelo Vite.

Entretanto, não assuma que todos os comportamentos serão reproduzidos automaticamente.

Sempre valide:

SVGs inline
Fontes customizadas
Arquivos estáticos
Assets importados dinamicamente
Recursos utilizados por bibliotecas de terceiros

Muitos bugs de produção surgem exatamente nessa etapa.

Validando a Build de Produção

Fazer a aplicação funcionar em desenvolvimento não significa que a migração terminou.

Na verdade, a parte mais importante começa depois.

Uma validação completa deve incluir:

Navegação entre rotas
Lazy loading
Autenticação
Upload de arquivos
Integrações externas
Variáveis de ambiente
Source maps
Geração de bundles

Além disso, recomendo comparar o comportamento da aplicação construída pelo Webpack e pelo Vite simultaneamente durante algum tempo.

Essa estratégia facilita a identificação de regressões e reduz significativamente os riscos da implantação.

Conclusão

Depois de concluir essa migração, uma percepção ficou muito clara para mim: trocar Webpack por Vite é a parte fácil.

O trabalho real está em compreender todas as decisões arquiteturais acumuladas ao longo dos anos e identificar quais delas dependem diretamente do comportamento do Webpack.

Aliases, imports dinâmicos, polyfills, bibliotecas CommonJS, loaders antigos e configurações herdadas raramente aparecem na documentação do projeto. Muitas vezes, só descobrimos sua importância quando algo deixa de funcionar.

Por isso, a melhor estratégia não é migrar rápido. É migrar de forma controlada, mantendo Webpack e Vite coexistindo durante o processo, validando cada etapa e reduzindo ao máximo a quantidade de variáveis envolvidas.

Em aplicações Vue 2 legadas, o sucesso da migração não está em instalar uma ferramenta nova. Está em conseguir modernizar a infraestrutura sem alterar o comportamento que o sistema construiu ao longo dos anos.

5 Angular Features Developers Should Actually Pay Attention to in 2026

Niharika Pujari — Tue, 02 Jun 2026 20:25:47 +0000

Introduction

Angular has changed a lot over the past few releases.

Between Signals, standalone components, modern template syntax, and rendering improvements, Angular feels very different from the Angular many of us learned a few years ago.

Some features are incremental. Others genuinely change how we build applications.

Here are 5 Angular features that actually matter in 2026 and are worth paying attention to if you're building modern Angular apps.

1. Signals Are Changing How Angular State Management Works

Signals are probably the biggest shift in Angular’s mental model in years.

Instead of relying heavily on:

BehaviorSubject
manual subscriptions
async pipes everywhere

Angular now provides a much simpler reactive primitive:

count = signal(0);

increment() {
  this.count.update(v => v + 1);
}

And in the template:

<p>{{ count() }}</p>

What makes Signals interesting is that Angular now tracks dependencies automatically and updates only what actually changed.

For UI state:

loading flags
counters
filters
selected items

Signals make Angular code feel dramatically simpler.

2. Standalone Components Are Becoming the Default

Angular spent years being heavily module-based.

Now?
Most new Angular apps are moving toward standalone components.

Example:

@Component({
  selector: 'app-home',
  standalone: true,
  templateUrl: './home.component.html'
})
export class HomeComponent {}

Why this matters:

less boilerplate
simpler project structure
easier lazy loading
cleaner onboarding for new developers

Honestly, after working with standalone components for a while, going back to large module trees feels painful.

3. The New Template Syntax Makes Angular Templates Much Cleaner

Angular’s newer control flow syntax is one of those features that feels small until you actually use it.

Instead of:

<div *ngIf="isLoggedIn">

You now write:

@if (isLoggedIn) {
  <div>Welcome back!</div>
}

Same for loops:

typescript
@for (product of products; track product.id) {
  <li>{{ product.name }}</li>
}

This makes templates:

easier to read
closer to normal programming logic
less cluttered

It also works really nicely alongside Signals.

4. Angular Hydration Improved SSR Performance Significantly

Server-side rendering has become much more important for:

SEO
performance
Core Web Vitals
perceived loading speed

Angular’s hydration improvements now allow apps to reuse server-rendered DOM instead of fully re-rendering everything on the client.

For users, this means:

faster page loads
smoother startup experience
less UI flickering

For developers, SSR setups are becoming much more practical than they used to be.

Especially for content-heavy or enterprise apps, this is becoming increasingly relevant.

5. Angular Developer Experience Feels Much Better Now

This one is harder to measure, but honestly important.

Modern Angular feels:

lighter
more reactive
less verbose
easier to reason about

A few years ago, Angular sometimes felt overly complex for smaller applications.

Today:

Signals reduce RxJS overload
standalone components reduce structure overhead
modern templates reduce clutter

The framework is clearly moving toward a simpler developer experience without losing the power Angular is known for.

Final Thoughts

Angular in 2026 feels very different from Angular a few years ago.

The framework is evolving toward:

fine-grained reactivity
simpler APIs
cleaner templates
better performance
less boilerplate

The biggest takeaway for me:

Angular is becoming easier to work with while still remaining extremely powerful for large-scale applications. And honestly, that's probably the direction many frontend developers were hoping for.

How I built a Go middleware for Stripe-style idempotency-key handling

Eben — Tue, 02 Jun 2026 20:22:00 +0000

Retries in payment and order APIs are a classic footgun. Your client times out, retries the request, and you've just charged someone twice. The fix is idempotency-key handling, but getting it right is harder than it looks.

The naive approach breaks under load

The obvious solution is Redis SETNX: claim a key before running the handler, release it after. Works fine on the happy path. Breaks in at least three ways:

Two identical requests arrive simultaneously before either has claimed the key. Both get through and execute.
Your handler panics or returns an error. The lock never gets released.
A client retries with a slightly different payload (a timestamp in the body, a different amount). You execute both and silently diverge.

Idempotency is one of those things that feels solved until you actually test it under concurrency.

What idempo does

idempo is a net/http middleware that implements the IETF Idempotency-Key draft with Stripe-compatible semantics. A client sends a unique Idempotency-Key header with a request. The middleware:

Claims the key atomically before the handler runs
If the key is new, runs the handler and stores the full response (status, headers, body)
If the same key arrives again, replays the stored response without running the handler, and adds Idempotency-Replayed: true so you know it happened
If the key is still in flight: 409 Conflict, same as Stripe
If the key is reused with a different payload: 422 Unprocessable Entity

Wiring it up

store := inmem.New(24*time.Hour, 5*time.Minute)
mw := idempo.New(store, idempo.Options{})

mux := http.NewServeMux()
mux.HandleFunc("POST /charge", chargeHandler)

http.ListenAndServe(":8080", mw.Handler(mux))
Because it's just func(http.Handler) http.Handler, it drops into any router:


// chi
r := chi.NewRouter()
r.Use(mw.Handler)

Pluggable backends

// In-memory (single instance or testing)
store := inmem.New(24*time.Hour, 5*time.Minute)

// Redis (distributed)
store := redis.New(&goredis.Options{Addr: "localhost:6379"}, 24*time.Hour, 5*time.Minute)

// Postgres (durable, ACID)
pg.RunMigration(connStr)
store, _ := pg.New(connStr, 24*time.Hour, 5*time.Minute)

Each backend enforces exactly-once execution at the storage layer: a mutex in memory, an atomic Lua script in Redis, and INSERT ... ON CONFLICT in Postgres.

The concurrency guarantee

The test suite fires 50 simultaneous identical requests and asserts the handler ran exactly once. The whole suite runs under -race in CI. If you've been burned by a race between SETNX and the actual handler execution, this is the part that matters.

One design decision worth calling out

When a duplicate arrives while the first request is still in flight, idempo returns 409 rather than blocking and waiting to replay. This matches Stripe's behavior.

My reasoning: holding connections open while waiting for an upstream to finish is a resource problem that compounds under load. Better to push the retry logic back to the client where it belongs. That said, if you need blocking behavior for your use case, the Store interface is just three methods (Claim, Complete, Abandon) and you can implement it yourself.

Why My Handwriting Font Didn't Show Up in Microsoft Word

ramya Thirunavukkarasu — Tue, 02 Jun 2026 20:19:37 +0000

In my previous blog, I explained how I converted my handwritten characters into a TrueType font using Python and FontForge. After generating the font file, I expected the process to be complete.
The script executed successfully, the .ttf file was created, and Windows installed the font without any errors. Everything appeared to be working as expected.
However, when I opened Microsoft Word to test the font, it was nowhere to be found.
This was an unexpected result. Since the font had been generated and installed successfully, I initially assumed the issue was related to Microsoft Word or the Windows font cache. After several attempts, including restarting applications and reinstalling the font, it became clear that the problem originated from the font generation process itself.
Reviewing the Font Generation Script
The script was designed to read handwritten character images, create glyphs for each character, and generate a TrueType font.
For each character, the script:
Created a glyph using its Unicode value.
Loaded the corresponding handwritten image.
Converted the image into vector outlines.
Applied glyph cleanup operations.
Assigned character spacing.
Added the glyph to the font.
Once all characters were processed, the font was validated and exported as a .ttf file.
While the workflow seemed straightforward, I learned that generating a font file is only one part of the process. The generated font must also contain valid glyph data and proper metadata to be recognized consistently by different applications.
Improvements Made
To improve the quality and compatibility of the generated font, I made several changes to the script.
Converting Images to Vector Outlines
Handwritten characters were stored as PNG images. Before they could be used in a font, they needed to be converted into vector outlines.
Using FontForge's tracing functionality ensured that each character contained scalable vector data rather than simple bitmap information.
Cleaning Glyph Data
After tracing, I applied additional cleanup operations to simplify paths and correct outline directions.
These steps helped create cleaner glyphs and reduced the likelihood of rendering issues.
Defining Character Width
Character spacing plays an important role in readability.
By assigning a width value to each glyph, I ensured that letters would display with consistent spacing when typed.
Adding Font Metadata
I also defined the font's family name, font name, and full name.
Proper metadata helps operating systems and applications identify and register fonts correctly.
Validating Before Export
Before generating the final font file, I included a validation step.
This allowed FontForge to check for common issues and helped ensure the generated font met basic requirements.
Next Blog:
This project started with English characters, but my next goal is much bigger—creating a Tamil handwriting font. I'm looking forward to exploring the unique challenges that come with Tamil script development and sharing what I learn along the way.

7 Best SaaS Courses for Developers Who Want to Launch a Product in 2026

Esimit Karlgusta — Tue, 02 Jun 2026 20:19:02 +0000

Building a SaaS has never been easier.

Launching one has never been harder.

Most developers don't struggle with coding. They struggle with turning knowledge into a finished product.

That's why SaaS-focused courses have exploded in popularity. The right course can save weeks of confusion around architecture, authentication, payments, deployment, and launch strategy.

After reviewing popular options, here are some of the best SaaS courses and learning resources available in 2026.

1. Zero to SaaS

Best for: Developers who want to build and launch a SaaS quickly

Many courses spend dozens of hours teaching concepts without helping students ship.

Zero to SaaS takes a different approach.

The blueprint is designed around a simple outcome:

Build and launch a SaaS in 14 focused days.

Instead of overwhelming students with endless theory, it walks through the critical decisions involved in creating a launch-ready product, including:

Next.js architecture
Database design
Authentication
Stripe payments
Deployment
Launch preparation

The focus is execution rather than information consumption.

If you've spent months learning but haven't shipped anything, this practical approach is refreshing.

Website:
https://zero-to-saas.collabtower.com/

2. Full Stack Open

Best for: Deep technical learning

Full Stack Open is one of the most respected free developer programs available.

It covers modern web development, React, APIs, testing, and backend systems in significant depth.

The downside is that it's designed more for learning engineering concepts than launching products.

3. The Odin Project

Best for: Beginners starting from scratch

The Odin Project remains one of the best free resources for aspiring developers.

It provides a structured curriculum covering frontend and backend development.

For SaaS founders, it's an excellent foundation before moving into product building.

4. Buildspace

Best for: Community-driven builders

Buildspace became popular for helping creators build projects alongside other developers.

The community aspect is one of its biggest strengths.

Students gain accountability and feedback while working on real products.

5. Indie Hackers

Best for: Learning from founders

While not technically a course, Indie Hackers offers an enormous amount of practical startup knowledge.

You'll find interviews, case studies, launch stories, and lessons from bootstrapped founders.

6. Y Combinator Startup School

Best for: Startup fundamentals

Startup School focuses less on coding and more on company building.

Topics include:

Validation
Distribution
Customer discovery
Growth

It's valuable once you've moved beyond the technical stage.

7. freeCodeCamp

Best for: Expanding technical skills

freeCodeCamp provides thousands of hours of free programming content.

It's an excellent supplement for developers looking to strengthen specific skills while building products.

What Makes a Great SaaS Course?

The best SaaS courses do more than teach code.

They help students answer questions like:

What should I build?
How much should I build?
When should I launch?
How do I charge users?
How do I avoid overengineering?

The goal isn't knowledge accumulation.

The goal is product creation.

Which Course Should You Choose?

If you're completely new to development, start with foundational resources like The Odin Project or freeCodeCamp.

If you're already comfortable with React and modern web development, choose a resource focused on shipping products.

That's where execution-focused programs such as Zero to SaaS provide the most value.

The difference between learning and launching often comes down to having a clear roadmap.

And for many developers, that's exactly what's missing.

Final Thoughts

Most aspiring founders don't need more information.

They need a system that helps them finish.

The internet is full of tutorials.

Launched products are much harder to find.

Choose resources that move you closer to shipping, not just studying.

Because the fastest way to learn SaaS is still to build one.

After 4 months solo: shipping a Windows tray AI hotkey on .NET 8 + WPF (and the Win32 paste-back rabbit hole)

Роман Тихоненко — Tue, 02 Jun 2026 20:18:52 +0000

I spent 4 months of nights and weekends building CapyBro — a Windows tray app that runs AI on any selected text via a global hotkey. Native .NET 8 + WPF (not Electron), MIT-licensed, ~49 MB installer. Two backends: cloud (OpenRouter) or fully local (Ollama). The hardest technical problem turned out to be Win32 paste-back into child controls. This post walks through that rabbit hole + the architecture decisions that paid off.

Why I built this

For most of 2025, my AI workflow was this loop:

1. Read something in Slack / a doc / a Telegram message
2. Alt+Tab → ChatGPT tab → paste
3. Type my prompt
4. Wait
5. Copy result
6. Alt+Tab back → paste over the original

I caught myself doing this 30+ times a day for trivial things — fixing one comma, translating a paragraph for a client email, rewording a DM so it doesn't sound passive-aggressive.

Then it hit me: AI is currently trapped in a browser tab. But every other utility on my PC — clipboard manager, screenshot tool, voice typer, password manager, snippet expander — is one hotkey away. Why isn't AI?

So I built it.

What CapyBro does

You select text anywhere on Windows. Word, Telegram, VS Code, your browser, Notepad, Discord, an email draft, a YAML file in your terminal — doesn't matter. The OS-level selection is the input.

You press Ctrl+Shift+E. A small popup appears. You pick a prompt (or type a custom one). AI runs. The result replaces the original text in the same app.

That's the entire product. The whole magic is in step 3 — replacing text in the source app. Sounds trivial. Took me three iterations of Win32 plumbing to get right.

The Win32 paste-back rabbit hole

This is the most undervalued part of the project. I expected ~100 lines of code. Got 130 lines of comments around a 50-line method that does two things: capture focused-child HWND before showing UI, then restore foreground + focus when user clicks Accept.

Naive solution (doesn't work)

SendKeys.SendWait("^v");

Why it fails:

No control over which window receives the Ctrl+V
If your modal dialog is still open, Ctrl+V goes THERE
If foreground changed between Accept and send → paste lands somewhere random

Better but still broken

IntPtr originalForeground = GetForegroundWindow();
ShowDialog();
// User clicks Accept
SetForegroundWindow(originalForeground);
Thread.Sleep(50);
SendKeys.SendWait("^v");

Works for Notepad. Fails for Notepad++, VS Code, Office. Why?

SetForegroundWindow is gated by OS focus rules — caller must have received last input event, no active foreground lock, target not minimized. Any check fails → it silently returns false.
The actual editor lives in a child control (e.g. Scintilla inside Notepad++). SetForegroundWindow activates the top-level frame, but keyboard focus stays elsewhere. SendInput Ctrl+V then lands on the WindowProc of a non-input frame.

Working solution: AttachThreadInput sandwich

The trick is AttachThreadInput — an API that lets your thread temporarily share input state with another thread. While attached, the OS treats both threads as one for focus/foreground purposes, bypassing the "didn't receive last input event" check.

Plus: I need to know which child HWND had keyboard focus before my modal stole it. GetGUIThreadInfo.hwndFocus returns exactly that.

Here's the production code (extracted from Platform/ForegroundRestorer.cs):

Phase 1: capture, BEFORE showing UI

public static (IntPtr TopLevel, IntPtr FocusedChild) CaptureForegroundFocus()
{
    var topLevel = NativeMethods.GetForegroundWindow();
    if (topLevel == IntPtr.Zero)
        return (IntPtr.Zero, IntPtr.Zero);

    var targetThreadId = NativeMethods.GetWindowThreadProcessId(topLevel, out _);
    if (targetThreadId == 0)
        return (topLevel, IntPtr.Zero);

    var info = new NativeMethods.GUITHREADINFO
    {
        CbSize = (uint)Marshal.SizeOf<NativeMethods.GUITHREADINFO>(),
    };

    if (NativeMethods.GetGUIThreadInfo(targetThreadId, ref info))
        return (topLevel, info.HwndFocus);

    return (topLevel, IntPtr.Zero);
}

Phase 2: restore, AFTER user clicks Accept

public static bool RestoreToForeground(IntPtr topLevel, IntPtr focusedChild)
{
    if (topLevel == IntPtr.Zero) return false;

    var targetThreadId = NativeMethods.GetWindowThreadProcessId(topLevel, out _);
    if (targetThreadId == 0)
        return NativeMethods.SetForegroundWindow(topLevel); // window died, best-effort

    if (NativeMethods.IsIconic(topLevel))
        NativeMethods.ShowWindowAsync(topLevel, NativeMethods.SwRestore);

    var ourThreadId = NativeMethods.GetCurrentThreadId();
    if (ourThreadId == targetThreadId)
        return NativeMethods.SetForegroundWindow(topLevel); // AttachThreadInput on same thread is undefined

    var attached = NativeMethods.AttachThreadInput(ourThreadId, targetThreadId, true);
    try
    {
        NativeMethods.BringWindowToTop(topLevel);
        var fgOk = NativeMethods.SetForegroundWindow(topLevel);

        // CRITICAL: SetFocus on the FOCUSED CHILD, not the top-level frame.
        // Without this, SendInput Ctrl+V echoes into the non-input frame.
        var focusTarget = focusedChild != IntPtr.Zero ? focusedChild : topLevel;
        NativeMethods.SetFocus(focusTarget);

        return fgOk;
    }
    finally
    {
        // ALWAYS detach. Forget this, and the user loses keyboard control
        // system-wide for the lifetime of your process.
        if (attached)
            NativeMethods.AttachThreadInput(ourThreadId, targetThreadId, false);
    }
}

Pieces that earned their place

AttachThreadInput — bypasses focus-stealing protection. Without it, SetForegroundWindow silently no-ops.
SetFocus on child HWND — Notepad++'s actual edit is a Scintilla control nested inside its frame. SetFocus on the frame leaves keyboard focus on the wrong WindowProc.
BringWindowToTop before SetForegroundWindow — raises z-order even when the foreground call is rejected. Belt-and-braces.
IsIconic + ShowWindowAsync(SW_RESTORE) — SetForegroundWindow no-ops on minimized targets. Restore first.
finally + AttachThreadInput(false) — I forgot this once during development. Lost system-wide keyboard input until I rebooted. Don't be me.
SendInput not SendKeys — SendKeys uses scan codes that break on non-Latin layouts. SendInput works with virtual-key codes.

Bonus rabbit hole: the clipboard is single-owner

Win32 clipboard is a single-owner resource. Clipboard managers, RDP virtual channels, antivirus, even the OS shell briefly hold it. Without retry, any concurrent open throws CLIPBRD_E_CANT_OPEN (HRESULT 0x800401D0) and you lose either the AI result or the user's original selection.

I wrap every clipboard call in an async retry loop (not sync — sync Thread.Sleep between attempts freezes the WPF UI for up to 500ms):

private static async Task<T> RetryAsync<T>(Func<T> action, CancellationToken ct)
{
    const int RetryAttempts = 10;
    var retryDelay = TimeSpan.FromMilliseconds(50);

    for (var attempt = 1; attempt <= RetryAttempts; attempt++)
    {
        ct.ThrowIfCancellationRequested();
        try
        {
            return action();
        }
        catch (COMException ex)
            when (ex.HResult == unchecked((int)0x800401D0)
                && attempt < RetryAttempts)
        {
            await Task.Delay(retryDelay, ct).ConfigureAwait(true);
        }
    }
    throw new InvalidOperationException();
}

The Win32 calls themselves are synchronous (the API has no cancellation hook), but the gaps between retries release the dispatcher so WPF can pump messages, repaint, and respond to input.

Two AI backends, one interface

CapyBro supports OpenRouter (cloud — one API key, ~300 models) and Ollama (local — text never leaves the machine). Both stream responses.

The pain: OpenRouter speaks SSE (data: {...}\n\n, terminated by data: [DONE]), Ollama speaks NDJSON (one JSON object per line, terminated by {"done": true}). Different error shapes, different rate-limit signaling.

The abstraction:

public interface ILlmProvider
{
    IAsyncEnumerable<string> ImproveStreamAsync(
        string apiKey,         // OpenRouter uses; Ollama ignores
        string model,
        string promptText,
        string userText,
        TimeSpan timeout,
        bool preserveLanguage,
        CancellationToken ct = default);

    Task<IReadOnlyList<string>> GetModelsAsync(string apiKey, CancellationToken ct = default);

    bool RequiresApiKey { get; }
}

RequiresApiKey is the cute bit — it lets TextProcessor pre-flight the request. If Provider=OpenRouter and key is empty, show an actionable toast ("set your key in Settings") instead of round-tripping to a 401.

ILlmProviderFactory.Resolve is a switch that throws on unknown enum values, not a fall-back to OpenRouter. A future 3rd provider added without matching switch arm + DI registration will crash on first user interaction instead of silently routing to the wrong backend. That's intentional — silent fallbacks are how you ship "why does my Anthropic key work everywhere except CapyBro?" bug reports.

Ollama edge case: stream truncated vs empty result

A subtle bug I found while testing: Ollama can complete a request without a done:true frame — connection drop, proxy timeout, antivirus interception. The total content length is 0, but it's not "the model returned nothing" — it's "the network died."

var sawDoneFrame = false;
var totalContentLength = 0;

await foreach (var frame in ReadNdjsonFramesAsync(response, cts.Token))
{
    if (frame.Done) sawDoneFrame = true;
    if (frame.Delta.Length == 0) continue;
    totalContentLength += frame.Delta.Length;
    yield return frame.Delta;
}

if (totalContentLength == 0)
{
    throw new OpenRouterException(
        sawDoneFrame
            ? _translator["api_empty_result"]   // model returned ""
            : _translator["api_server_error"]); // stream interrupted
}

Two different toasts: "the model didn't produce output for your prompt" vs "check whether ollama serve is running." Different remediation paths for the user.

Installer size: 150 MB → 49 MB

Self-contained .NET 8 + WPF publish folder is ~150 MB. Single-file .exe is tempting but:

Single-file decompresses into memory on every launch → visible cold-start latency
Self-extracting native libraries unpack to %TEMP% on first run → first-launch hit
For a tray app the user opens dozens of times a day, that's noticeable

So: folder build + NSIS LZMA SOLID compression. The csproj:

<PropertyGroup Condition="'$(_IsPublishing)' == 'true'">
  <SelfContained>true</SelfContained>
  <RuntimeIdentifier>win-x64</RuntimeIdentifier>
  <PublishReadyToRun>true</PublishReadyToRun>
  <DebugType>none</DebugType>
</PropertyGroup>

<!-- Strip 13 culture-specific satellite assembly folders. -->
<!-- Our UI translations live in Translator.cs, not satellite assemblies. -->
<SatelliteResourceLanguages>en</SatelliteResourceLanguages>

NSIS:

SetCompressor /SOLID lzma
SetCompressorDictSize 64
File /r "..\publish\win-x64\*.*"

LZMA SOLID archives everything as one stream rather than per-file. Repeated bytes across files compress much better. ~49 MB installer, ~150 MB unpacked.

What I didn't do: PublishTrimmed. WPF heavily uses reflection for XAML binding + resource lookup. Trimmer eagerly removes "unused" types, then runtime XAML lookup explodes with Type not found. I tried TrimMode=partial and got 25 MB savings + 12 runtime regressions. Reverted.

Open source as a trust mechanism, not ideology

This is a utility that reads my text — sometimes confidential (client emails, draft docs). Would I trust it if it were closed-source from an unknown indie dev?

No.

So why should other people trust me?

Answer: open the source. Remove "trust me bro" and show what happens.

I picked MIT. Almost went "source-available" (popular among indie SaaS right now) but decided:

If someone forks and fixes a bug I missed → that saves me work, doesn't "compete" with my product
If someone forks and sells their own version → they still don't have my community, my support, my updates. The product isn't the code.
"Source-available" gets a negative reaction in the dev community. MIT gets a positive one.

API keys live in Windows Credential Manager via Meziantou.Framework.Win32.CredentialManager, not config.json. DPAPI encryption under the hood, bound to the user account, non-portable across machines by design.

Lessons after 4 months

Win32 is alive. Microsoft didn't replace it — they hid it behind WPF/WinUI. Build anything non-trivial system-side, and you're back to user32.dll. My P/Invoke list for the core workflow: RegisterHotKey, SendInput, GetForegroundWindow, GetGUIThreadInfo, AttachThreadInput, SetFocus, BringWindowToTop, IsIconic, ShowWindowAsync. That's just the baseline.
Native beats Electron for tray utilities. Not ideology — pragmatism. 49 MB vs 250 MB installer, <1s cold start vs 3-5s, ~80 MB RAM idle vs ~400 MB. For something that lives in the background, that's the difference between "I don't notice it" and "oh there you are."
Local\ Mutex, not Global\. Singletons usually use Global\ namespace, which requires SeCreateGlobalPrivilege. That right is granted to interactive users by default but stripped on locked-down domain machines (kiosks, AppLocker configs). On those systems, my app crashed at startup with UnauthorizedAccessException. Local\ (per-session) has no such restriction and matches the semantics I actually want (one instance per user session, not per machine):

   public const string DefaultMutexName = @"Local\CapyBroV2";
   var mutex = new Mutex(initiallyOwned: false, name: mutexName, createdNew: out var createdNew);

Also: initiallyOwned: false. With true, I'd get AbandonedMutexException after every crash. With false, process death cleans up silently.

Foreground-poller for popup dismiss, not Mouse.Capture. My first prompt-picker used Mouse.Capture(this, SubTree) to detect clicks outside. WPF ListBox grabs Mouse.Capture internally for click-drag selection — my LostMouseCapture handler closed the popup BEFORE the user's MouseLeftButtonUp reached the ListBox. Final version uses a 100ms DispatcherTimer + GetForegroundWindow() poll. If foreground isn't my popup → close. Cross-process clicks (browser tabs, Notepad, Telegram) are invisible to WPF's input system — polling Win32 is the only reliable catch-all.
STJ with source generation. Not JsonSerializer.Deserialize<T>(json) (reflection-based). Instead:

   [JsonSerializable(typeof(AppConfig))]
   public partial class AppConfigJsonContext : JsonSerializerContext { }

   var config = JsonSerializer.Deserialize(json, AppConfigJsonContext.Default.AppConfig);

One day to set up [JsonSerializable] attrs for each DTO. Result: AOT-friendly, no runtime reflection, faster parsing, cleaner stack traces on JsonException.

Tech is ~30% of the work. The other 70% is marketing, docs, screenshots, localizations, SEO, GitHub issue triage, replying on Reddit. As a solo dev, that's not "side activity" — it's the activity after MVP.

Stack receipts

~12,000 lines of C# (the WPF app)
~3,000 lines of Next.js (marketing site)
~4 months, nights + weekends
~$130 spent (domain, OpenRouter test credits, stock icons I didn't end up using)
Coffee: uncountable

What's next

macOS port via Avalonia (~2 months)
Browser extension companion for web apps with shadow DOM
Native AOT once WPF + AOT become compatible (would shave 49 MB → ~25 MB)

Code + links

Source: https://github.com/phantasmat2018/capy-bro (MIT)
Site: https://capybro.app
Installer: https://github.com/phantasmat2018/capy-bro/releases/tag/v2.0.0 (Win 10/11 x64, 49 MB)

If you've built a similar Windows-side AI tool, I'd love to hear what Win32 weirdness you ran into. The Office (Word, Excel) paste-back behavior is something I still haven't 100% nailed — Word works, Excel works only via the F2/Esc edit-mode dance. If anyone has a clean solution, drop it in the comments 🙏

Thanks for reading.

Seu modelo de regressão mente quando X não varia — e você provavelmente não percebe

Ana Carolina Neumann Rodrigues — Tue, 02 Jun 2026 20:16:45 +0000

Tem uma armadilha clássica em projetos de ciência de dados com regressão linear que pega muita gente: o modelo treina, a loss parece ok, o R² até aparece razoável — mas as estimativas de coeficiente são uma bagunça.

O motivo, quase sempre, é simples: X não varia o suficiente.

O problema em 30 segundos

Na regressão linear simples:

Y = β₀ + β₁X + ε

A variância do coeficiente estimado é:

Var(β̂₁) = σ² / Σ(xᵢ - x̄)²

Lê assim:

Var(β̂₁) = ruído do modelo / variação de X

Duas conclusões diretas:

Muito ruído em Y → estimativa instável
Pouca variação em X → estimativa instável

O denominador é o ponto que costuma ser ignorado.

Exemplo concreto: previsão de lead time

Você trabalha com supply chain e quer prever o lead time de entrega (em dias) com base na distância percorrida (em km).

Cenário A: dados de uma só rota regional

Distância (km)	Lead Time (dias)
480	3
490	4
500	3
510	4
505	3

Todo mundo está na mesma rota, percorrendo praticamente a mesma distância.

O modelo olha pra isso e pensa:

"X quase não mudou. Como vou saber o efeito de X em Y?"

Qualquer variação no lead time pode ser atraso no porto, problema do fornecedor, feriado — não necessariamente distância. A inclinação estimada vai ser instável e pouco confiável.

Cenário B: dados de múltiplas rotas

Distância (km)	Lead Time (dias)
80	1
250	2
600	4
1.200	7
2.800	12
4.500	18

Agora o modelo tem "evidência horizontal" de verdade. Ele vê embarques curtos, médios e longos — e consegue separar o efeito da distância do ruído aleatório.

Por que variação em X importa tanto?

A fórmula do coeficiente estimado é:

β̂₁ = Σ(xᵢ - x̄)(yᵢ - ȳ) / Σ(xᵢ - x̄)²

O denominador é o mesmo que aparece na variância: quanto X varia.

Quando X mal se move, o denominador fica pequeno. Qualquer ruído em Y distorce muito a razão. O resultado é um coeficiente que parece razoável num treino mas oscila absurdamente entre diferentes amostras.

Três cuidados que ninguém te conta

1. Variação causada por outlier não conta como variação boa

Imagina que seus dados de distância são assim:

480, 490, 500, 510, 4800

Matematicamente, X tem muita variação. Na prática, ela vem de um único ponto extremo.

Esse ponto tem alta alavancagem — ele puxa a reta inteira. O modelo fica "confiante" nos cálculos, mas essa confiança é falsa.

2. Variação em X não resolve relação não-linear

Se o lead time cresce exponencialmente com a distância (armazém regional → cross-border), uma reta pode não capturar o padrão.

Ter bastante variação em X ajuda, mas não substitui escolher o modelo certo.

3. Em regressão múltipla, X precisa variar independentemente

Adicionou distância e tempo em trânsito no mesmo modelo? Elas andam juntas — embarques mais longos tendem a ter mais tempo em trânsito.

Isso é multicolinearidade. O modelo não consegue separar:

O lead time aumenta por causa da distância ou do tempo em trânsito?

Em regressão múltipla a pergunta vira: existe variação em X₁ que não seja só repetição de X₂?

Resumo mental para guardar

Var(β̂₁) = ruído / variação de X

Situação	Efeito
Pouca variação em X	Estimativa instável ⚠️
Variação só por outlier	Confiança falsa ⚠️
X e X₂ colineares	Multicolinearidade ⚠️
Variação ampla e útil	Estimativa confiável ✅

A ideia central é simples:

Para estimar o efeito de X, o modelo precisa observar X mudando.

Se seus dados de supply chain vêm de uma janela temporal curta, de uma região só, ou de um perfil de fornecedor muito homogêneo — revise antes de confiar nos coeficientes.

Curtiu? Me segue para mais conteúdo de estatística aplicada a dados de supply chain.

Finally fixed Linux Bluetooth audio stutter - Realtek Chip

PtchNote — Tue, 02 Jun 2026 20:15:25 +0000

After 2 days of digging through forums, testing kernel parameters, and trying multiple distros, I finally eliminated the infuriating audio stutter on my setup. Posting this here so others don't have to suffer through the same trial and error.

Hardware:

HP Pavilion 14‑dv1xxx

Realtek RTL8822CE Wi‑Fi/Bluetooth combo card

Sonos Roam (Bluetooth speaker)

The problem: Bluetooth audio (especially during video playback) stuttered constantly. Music was sometimes okay, but YouTube, VLC, and any video content was a mess. The stutter was caused by Wi‑Fi power management interfering with Bluetooth — a known issue with this Realtek chipset.

The fix (tested on Debian 13 and Kubuntu 26.04):

Disable Wi‑Fi power management at the driver level:
bash

echo "options rtw88_core disable_lps_deep=Y" | sudo tee /etc/modprobe.d/rtw88_core.conf
sudo update-initramfs -u
sudo reboot

Optional but helpful tweaks (applied at the same time):
bash

Prevent Bluetooth USB resets

echo "options btusb reset=0" | sudo tee /etc/modprobe.d/btusb.conf

Increase PipeWire buffer for video playback

mkdir -p ~/.config/pipewire/pipewire.conf.d/
cat << EOF > ~/.config/pipewire/pipewire.conf.d/99-bluetouch.conf
context.properties = {
default.clock.rate = 48000
default.clock.allowed-rates = [ 48000 ]
default.clock.quantum = 2048
default.clock.min-quantum = 1024
}
pipewire.tx = {
link.max-buffers = 256
}
EOF

Apply all changes

sudo update-initramfs -u
systemctl --user restart pipewire pipewire-pulse
sudo reboot

Why this works:
The Realtek RTL8822CE aggressively manages Wi‑Fi power states (lps_deep = Low Power State Deep). Every time Wi‑Fi wakes or sleeps, it creates interference with Bluetooth. Disabling this keeps the Wi‑Fi radio in a more consistent state, eliminating the bursty interference that causes audio stutters — especially during video playback (which is bursty by nature).

Verification:
To confirm the fix is active after reboot:
bash

cat /sys/module/rtw88_core/parameters/disable_lps_deep

If it returns Y, the fix is applied. If it returns N, something went wrong — double-check the config file and that you ran sudo update-initramfs -u.

Tested with The Matrix lobby scene in VLC and YouTube. Zero stutters after the fix. Music (Spotify) also improved.

One more tip:
If you have access to 5 GHz Wi‑Fi, use it. Bluetooth operates on 2.4 GHz, so 5 GHz eliminates interference entirely. But the driver fix above makes 2.4 GHz usable too.

Hope this saves someone else the two days I lost. 🎧🐧

Docker vs Kubernetes: Stop Comparing Them Like They Compete

Mahendra Singh — Tue, 02 Jun 2026 20:14:59 +0000

Every developer hitting their first production deployment runs into this question: Do I need Docker, Kubernetes, or both?

It's the wrong framing. They don't compete. They operate at completely different layers of your infrastructure.

Let me be blunt upfront:

Docker = packages and runs your app as a container (single host)
Kubernetes = manages and scales those containers across many hosts

You don't choose between them. You choose when to graduate from one to the other.

What Docker Actually Does

Docker solves the "works on my machine" problem. It bundles your app code, runtime, libraries, and config into a container image — a portable artifact that runs identically everywhere Docker is installed.

Here's the simplest possible Dockerfile:

FROM node:20-alpine
WORKDIR /app
COPY package*.json ./
RUN npm ci --only=production
COPY . .
EXPOSE 3000
CMD ["node", "server.js"]

Build it, push it, run it anywhere:

docker build -t my-api:1.0 .
docker push my-registry/my-api:1.0
docker run -p 3000:3000 my-registry/my-api:1.0

That image behaves the same on your MacBook, a CI runner, or a prod server. That's the whole value prop.

Docker Compose for Multi-Service Dev

For local development with multiple services, Docker Compose is your best friend:

# docker-compose.yml
services:
  api:
    build: ./api
    ports:
      - "3000:3000"
    environment:
      - DATABASE_URL=postgres://user:pass@db:5432/myapp
    depends_on:
      - db
      - redis

  db:
    image: postgres:16
    volumes:
      - pgdata:/var/lib/postgresql/data
    environment:
      POSTGRES_PASSWORD: pass

  redis:
    image: redis:7-alpine

volumes:
  pgdata:

docker compose up -d     # spin everything up
docker compose logs -f   # follow logs
docker compose down      # tear it all down

One command spins up your entire local stack. This is where most teams should live until they genuinely need orchestration.

What Kubernetes Actually Does

Docker works great on one machine. The moment you need your app running across multiple machines — with auto-scaling, self-healing, load balancing, and zero-downtime deploys — Docker alone doesn't cut it.

Kubernetes (K8s) is the orchestration layer. It doesn't build containers. It runs and manages them across a cluster.

Here's what K8s handles automatically:

Which node to schedule each container on
Restarting crashed containers
Scaling up/down based on CPU/memory load
Distributing traffic across replicas
Rolling out updates without downtime
Managing secrets and config
Network routing between services

The mental model: you declare your desired state, and K8s continuously reconciles reality toward it.

A Real Kubernetes Deployment

# deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-api
spec:
  replicas: 3
  selector:
    matchLabels:
      app: my-api
  template:
    metadata:
      labels:
        app: my-api
    spec:
      containers:
        - name: api
          image: my-registry/my-api:1.0
          ports:
            - containerPort: 3000
          resources:
            requests:
              cpu: "100m"
              memory: "128Mi"
            limits:
              cpu: "500m"
              memory: "256Mi"
          readinessProbe:
            httpGet:
              path: /health
              port: 3000
            initialDelaySeconds: 5
            periodSeconds: 10
---
apiVersion: v1
kind: Service
metadata:
  name: my-api-svc
spec:
  selector:
    app: my-api
  ports:
    - port: 80
      targetPort: 3000
  type: ClusterIP

kubectl apply -f deployment.yaml
kubectl get pods
kubectl rollout status deployment/my-api

If one of those 3 pods crashes, K8s replaces it automatically. Scale on demand:

kubectl scale deployment my-api --replicas=10

Or set up auto-scaling based on CPU:

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: my-api-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-api
  minReplicas: 3
  maxReplicas: 20
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70

Container vs Pod: The Confusion Cleared Up

This trips up everyone new to K8s.

Container = one packaged process. Docker's unit.

Pod = K8s's deployable unit. Wraps one or more containers that share:

The same network namespace (they talk via localhost)
The same storage volumes
The same lifecycle

Most pods are single-container. But the sidecar pattern is real — you'll use it for logging agents, proxies (like Envoy in a service mesh), or secret injectors:

spec:
  containers:
    - name: app
      image: my-api:1.0
    - name: log-shipper        # sidecar
      image: fluentd:latest
      volumeMounts:
        - name: logs
          mountPath: /var/log/app

Why it matters practically: K8s schedules pods, not individual containers. Resource requests, limits, and self-healing all operate at the pod level. If a pod crashes, K8s replaces the whole pod — not just the container inside it.

How Docker and K8s Fit Together in a Real Pipeline

They're sequential, not overlapping:
Developer writes code

→ Dockerfile defines the build
→ CI runs: docker build + docker push → registry
→ K8s pulls image from registry
→ K8s schedules pods across nodes
→ K8s manages lifecycle, scaling, health

A typical GitHub Actions pipeline wiring both together:

jobs:
  build-and-deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Build and push image
        run: |
          docker build -t $REGISTRY/my-api:$GITHUB_SHA .
          docker push $REGISTRY/my-api:$GITHUB_SHA

      - name: Deploy to K8s
        run: |
          kubectl set image deployment/my-api \
            api=$REGISTRY/my-api:$GITHUB_SHA
          kubectl rollout status deployment/my-api

Docker handles the build. Kubernetes handles the deploy. Clean handoff.

Note on runtimes: Kubernetes uses containerd as its default runtime, not Docker directly. But Docker-built images follow the OCI standard, so they're fully compatible. You don't need to change your build workflow.

When to Use Docker Without Kubernetes ✅

Use Docker alone when:

You're in early development or pre-PMF
Your app runs on a single server
Your team is 1–3 engineers
You have fewer than ~10 containers total
You're serving under ~10k DAU
You're on a managed PaaS (Railway, Fly.io, Cloud Run) that abstracts orchestration for you
Fast iteration matters more than scaling headroom right now

Docker Compose handles multi-service local dev and even modest production deployments cleanly. Adding K8s at this stage is premature optimization with real engineering cost.

When to Add Kubernetes ✅

Add K8s when:

You're running 5+ microservices that need coordinated deployment and networking
Traffic spikes are real and you need auto-scaling
You have an uptime SLA above 99.9% that requires self-healing
Multiple teams deploy independently to shared infrastructure
You need canary releases, blue/green deploys, or sophisticated rollout strategies
You're serving 100k+ DAU
Your monthly compute spend is high enough that K8s bin-packing efficiency actually saves money

A rough rule of thumb: if your compute bill is under $5k/month and you have fewer than 15 services, K8s will likely cost more in engineering time than it saves in ops.

When NOT to Use Kubernetes ❌

Avoid K8s when:

No one on your team has real K8s operational experience
You're a solo dev or very small startup
Your stack is a monolith or 2–3 services
Feature velocity is the priority and K8s YAML would slow you down
A managed PaaS already gives you what you need
You're adding it because it looks good on the architecture diagram

I've seen small teams burn weeks configuring ingress controllers, cert-manager, and RBAC for a 3-service app. The overhead is real. Don't add it until the pain of not having orchestration is specific and concrete.

Quick Comparison

Factor	Docker	Kubernetes
Primary job	Build + run containers	Orchestrate at scale
Scope	Single host	Multi-host cluster
Setup time	Minutes	Hours to days
Auto-scaling	❌	✅
Self-healing	❌	✅
Rolling updates	Manual	Automated
Learning curve	Days	Weeks–months
Best for	Dev, small scale	Production, microservices

Learning Path

Docker learning curve: days.
Kubernetes learning curve: weeks to months.

The progression that actually works:

Get fluent with docker build, docker run, image layers, multi-stage builds
Master Docker Compose for local multi-service dev
Understand container networking (bridge, host, overlay)
Then introduce K8s concepts: pods → deployments → services → ingress → namespaces → RBAC
Run locally with kind or minikube before touching production

# Spin up a local K8s cluster for learning
kind create cluster --name dev
kubectl cluster-info --context kind-dev

Managed K8s options when you're ready for production:

Cloud	Service
AWS	EKS
GCP	GKE
Azure	AKS

They manage the control plane. You manage your workloads.

Takeaways

Docker and Kubernetes aren't rivals. They're different layers of the same stack.
Docker Compose in production is a legitimate, underrated choice for small/medium workloads.
The Docker → K8s transition is a natural graduation, not a required step.
Before rolling your own cluster, check if Cloud Run, App Runner, or Container Apps gives you 80% of K8s for 20% of the complexity.
Master Docker deeply first. The mental models (images, layers, networking, volumes) carry directly into K8s.
Add K8s when the pain of not having orchestration is specific and real — not theoretical.

Originally published at akoode.com.

Building WeRemember in Public — First Knots After Starting Over

Sandro Hu — Tue, 02 Jun 2026 20:12:11 +0000

Yesterday I wrote that I was starting over.

Today was the first real test of that decision: could I restart without rebuilding the same overhead under a cleaner name?

I did not start with a landing page. I did not start with a VPS. I did not start with monitoring or a content backlog.

I started with the things that decide whether the project has a spine.

Repository split

The community edition lives on GitHub as an AGPL-licensed open source project — the source of truth for the core product.

Cloud and infrastructure concerns live separately in private repositories.

The important decision is that the cloud version will not fork the community edition: It depends on it as a Python package, imports core apps, and extends them. When the community edition releases, the cloud version updates a dependency. No manual sync.

The plan is for the cloud version to eventually become a managed offering — for those who want WeRemember without the self-hosting overhead, or who need enterprise features like SSO, audit logs, or RBAC. The community edition stays the open source core either way.

CI/CD boundary

CI builds and pushes images to public registries. It knows nothing about the deployment target.

Deployment belongs to a separate private pipeline.

The community repository stays clean. It knows only that an image is ready,not where it goes.

Governance scaffold

I opened PR #1 before writing any application code. It includes:

issue and pull request templates
label taxonomy managed via a committed setup script
branch naming: <type>/<issue-number>-<short-description>
Conventional Commits
branch ruleset documented as gh api commands

The issue number is mandatory on every branch. Every branch must point back to tracked work.

Labels and rulesets are version-controlled and reproducible, not configured once through the GitHub UI and forgotten.

———

Short updates on X | Articles on dev.to | Code on GitHub

Refactoring is beyond the code and structure issues...

Camila Rody — Tue, 02 Jun 2026 20:11:38 +0000

When we talk about refactoring, we usually think about reducing complexity, eliminating duplication, improving naming, or applying design patterns. All of these are important parts of the process, but over the years I've come to realize that the best refactoring efforts rarely start with code. They start with understanding.

There's a natural tendency to look at an existing system and judge it based on what we know today. We open a massive component, a complicated business rule, or an architecture that feels outdated and immediately think, "I would do this differently." And perhaps we would. The problem is that this perspective often ignores a fundamental reality: every software system is a reflection of the circumstances in which it was built.

Code does not emerge in a vacuum. It is shaped by deadlines, technical constraints, business priorities, budget limitations, organizational pressures, and the experience level of the people involved. A decision that appears questionable today may have been the most rational choice available at the time. Perhaps the team was racing to validate a product, responding to a critical business need, or operating with far fewer resources than we have now. Maybe the technologies, frameworks, or patterns we take for granted today simply didn't exist when those decisions were made.

This is why good refactoring requires much more than identifying technical flaws. It requires understanding why certain decisions were made in the first place. Before changing a system, it's important to understand the problems it was designed to solve, the constraints that influenced its design, and the business goals it was expected to support. Without that context, we risk confusing evolution with error.

In many cases, the code itself isn't the problem. What has changed is everything around it. The business has evolved. Requirements have shifted. The team has grown. Technology has advanced. The scale of the application has increased. A solution that worked perfectly for hundreds of users may no longer be suitable for millions. An architecture that was ideal for a startup may struggle within a mature enterprise. In these situations, the original implementation wasn't necessarily wrong—it was simply built for a different reality.

That's why I increasingly see refactoring as an exercise in investigation. Before changing anything, I want to understand the story behind the system. What challenges was it trying to solve? What compromises were made? What assumptions guided its design? What constraints influenced its architecture? The answers to those questions are often far more valuable than any design pattern or technical framework.

Another aspect that is often overlooked is that refactoring isn't just about improving technical quality. It's also about preserving knowledge. Every codebase contains years of accumulated decisions, lessons learned, production incidents, business discoveries, and engineering trade-offs. Much of that knowledge never makes it into documentation, tickets, or commit messages. Instead, it becomes embedded in the structure of the system itself.

When we ignore that history, we risk removing far more than code. We may unintentionally discard valuable business knowledge, hard-earned operational experience, and solutions that exist for reasons no longer immediately visible. A mature refactoring effort doesn't simply ask what should change. It also asks what should be preserved.

Ultimately, refactoring is far less about rewriting code and far more about understanding decisions. It requires context, empathy, and a long-term perspective. Before building the next version of a system, we need to understand why the current version exists.

Because at its core, refactoring is the practice of understanding the past in order to build a better future.

DEV Community

Kafka for Data Engineers: Core Concepts, KRaft, and the Patterns That Actually Work

What Kafka Is and Why Data Engineers Use It

The Architecture: What You Actually Need to Know

Brokers, Topics, and Partitions

Offsets

Consumer Groups

Delivery Semantics

The Big Change in 2025: KRaft Replaces ZooKeeper

Running Kafka Locally: KRaft Docker Compose

CLI: The Commands You'll Use Constantly

Python: Producer Patterns with confluent-kafka

Python: Consumer Patterns

The Rebalance Problem and How KIP-848 Fixes It

Monitoring Consumer Lag

Common Errors and What They Actually Mean

Kafka with Airflow

Quick Reference: Key Config Decisions

Version Reference (June 2026)

Migrando uma Aplicação Vue 2 Legada de Webpack 2 para Vite: Um Guia Prático Baseado em Problemas Reais

Antes de Instalar o Vite, Entenda o Que o Webpack Faz Pelo Seu Projeto

Estabilize a Versão do Node Antes de Trocar o Bundler

Instale o Vite Sem Remover o Webpack

Migrando os Aliases

Corrigindo Imports Dinâmicos Que Funcionavam Apenas no Webpack

Migrando Variáveis de Ambiente

Lidando com Bibliotecas CommonJS e Dependências Abandonadas

Polyfills: O Problema Que Geralmente Só Aparece Depois

Revisando Loaders e Assets

Validando a Build de Produção

Conclusão

5 Angular Features Developers Should Actually Pay Attention to in 2026

Introduction

Here are 5 Angular features that actually matter in 2026 and are worth paying attention to if you're building modern Angular apps.

1. Signals Are Changing How Angular State Management Works

2. Standalone Components Are Becoming the Default

3. The New Template Syntax Makes Angular Templates Much Cleaner

4. Angular Hydration Improved SSR Performance Significantly

5. Angular Developer Experience Feels Much Better Now

Final Thoughts

How I built a Go middleware for Stripe-style idempotency-key handling

The naive approach breaks under load

What idempo does

Wiring it up

Pluggable backends

The concurrency guarantee

One design decision worth calling out

Links

Why My Handwriting Font Didn't Show Up in Microsoft Word

7 Best SaaS Courses for Developers Who Want to Launch a Product in 2026

1. Zero to SaaS

2. Full Stack Open

3. The Odin Project

4. Buildspace

5. Indie Hackers

6. Y Combinator Startup School

7. freeCodeCamp

What Makes a Great SaaS Course?

Which Course Should You Choose?

Final Thoughts

After 4 months solo: shipping a Windows tray AI hotkey on .NET 8 + WPF (and the Win32 paste-back rabbit hole)

Why I built this

What CapyBro does

The Win32 paste-back rabbit hole

Naive solution (doesn't work)

Better but still broken

Working solution: AttachThreadInput sandwich

Pieces that earned their place

Bonus rabbit hole: the clipboard is single-owner

Two AI backends, one interface

Ollama edge case: stream truncated vs empty result

Installer size: 150 MB → 49 MB

Open source as a trust mechanism, not ideology

Lessons after 4 months

Stack receipts

What's next

Code + links

Seu modelo de regressão mente quando X não varia — e você provavelmente não percebe

O problema em 30 segundos

Exemplo concreto: previsão de lead time