Density Tech

Do You Really Need Kafka? A Practical Alternative with Postgres

Kafka: The Right Tool, Used Too Often

Apache Kafka has become the default answer to almost any asynchronous or event-driven problem. It is powerful, proven at scale, and excellent at handling large volumes of data with strong guarantees. If you are building a real-time data platform, a streaming system, or anything that needs to fan out events to many consumers, Kafka is often the right tool.

But Kafka also comes with real cost. Running it in production means dealing with brokers, partitions, consumer groups, rebalancing, retention policies, and monitoring. Even with managed services, you are still paying in infrastructure and in engineering time.

In practice, many teams end up using Kafka not because they need a streaming platform, but because they just need a reliable queue.

And in those cases, Kafka is usually correct — but often overkill.

The Hidden Cost of “Just Use Kafka”

For early-stage systems, internal tools, or moderate workloads, Kafka tends to introduce more complexity than actual value.

You run a distributed system even when your problem is not distributed.
You operate a streaming platform even when your use case is just background jobs.
You manage offsets and consumer groups when a simple retry would be enough.
You debug through dashboards and logs instead of just looking at the data.

The result is a system that works well, but feels heavy. Heavy to run, heavy to reason about, and heavy to change.

This is exactly where Postgres-based queues start to look very attractive.

Why Postgres Can Win in Economics and Debuggability

Postgres is already there in almost every backend. It is monitored, backed up, and familiar. Using it as a lightweight message queue adds no new service, no new cluster, and no new operational surface.

From a cost perspective, it is hard to beat:
there are no brokers to run, no separate infrastructure, and no extra managed services to pay for.

From a debugging perspective, it is even better:
every message is just a row,
every failure is visible,
retries are tracked,
and stuck messages can be inspected or fixed with plain SQL.

Instead of debugging a distributed system, you debug data.
And for most engineers, that is a much simpler and more productive mental model.
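
As a rough sketch, assuming a pgmq queue named events (its table layout is covered later in this post), inspecting and unsticking a message is plain SQL:

-- Find messages that have been delivered several times but never completed
SELECT msg_id, read_ct, enqueued_at, vt, message
FROM pgmq.q_events
WHERE read_ct > 3
ORDER BY enqueued_at;

-- Make one stuck message (hypothetical id 42) immediately visible again
UPDATE pgmq.q_events SET vt = now() WHERE msg_id = 42;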

When Postgres Is Actually the Right Choice

Postgres works well as a message queue when async processing is just a part of your system, not the main thing your system exists to do. In these cases, you usually care more about simplicity and reliability than extreme scale or global distribution.

pgmq fits nicely for things like background jobs, webhook handling, retry systems, internal workflows, and small ETL pipelines. These setups usually have a few producers, a few consumers, and traffic that is steady but not massive. What they really need is visibility and control, not a full-blown streaming platform.

This is where Postgres shines. You can wrap business logic and queue operations in the same transaction. You don’t need to run any extra infrastructure. And you can see exactly what’s happening just by querying tables. If something breaks, you can inspect the message, fix it, and retry it directly.
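
As a minimal sketch of that transactional pattern, assuming a hypothetical orders table and a queue named events:

BEGIN;
-- The business write and the enqueue succeed or fail together
INSERT INTO orders (id, status) VALUES (123, 'created');
SELECT pgmq.send('events', '{"order_id": 123, "type": "order_created"}');
COMMIT;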

pgmq is not meant for high-throughput streaming, analytics pipelines, or cross-region event systems. Once the queue becomes the core of your architecture, and not just a helper, you are in Kafka territory.

The simple rule is: use Postgres when the queue supports your system. Use Kafka when the queue is your system.

pgmq: How It Works

At a high level, pgmq is not doing anything magical. It is just using Postgres tables, locks, and timestamps to behave like a message queue. There is no separate broker, no background service, and no hidden state. Everything lives inside the database.

When you create a queue in pgmq (say, one named events), it creates two main tables for you:

pgmq.q_events – the live queue

pgmq.a_events – the history of processed messages

The live table is where all active messages sit. Each row is one message. The important columns are:

msg_id – unique ID for the message

enqueued_at – when the message was produced

vt – when the message can be read again

read_ct – how many times it has been delivered

message – your actual JSON payload

headers – optional message headers, also stored as JSON

This single table gives you most queue features in one place:

Durability → rows stored in Postgres

Visibility timeout → vt

Retry count → read_ct

Ordering → ORDER BY msg_id

Backlog → SELECT count(*)
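
For instance, checking the backlog or peeking at the oldest pending messages is just a query (a sketch, assuming the events queue used later in this post):

-- How many messages are waiting right now?
SELECT count(*) FROM pgmq.q_events;

-- Peek at the oldest pending messages without consuming them
SELECT msg_id, enqueued_at, read_ct, message
FROM pgmq.q_events
ORDER BY msg_id
LIMIT 10;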

When a consumer reads messages, pgmq simply locks rows using FOR UPDATE SKIP LOCKED, bumps read_ct, and moves vt into the future. If the consumer crashes before completing a message, the visibility timeout eventually expires and the message becomes readable again. That is your retry mechanism.
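
Conceptually, a read boils down to SQL along these lines. This is a simplified sketch of the idea, not the actual pgmq implementation:

WITH picked AS (
    SELECT msg_id
    FROM pgmq.q_events
    WHERE vt <= now()            -- only messages that are currently visible
    ORDER BY msg_id
    LIMIT 5
    FOR UPDATE SKIP LOCKED       -- skip rows another consumer has locked right now
)
UPDATE pgmq.q_events q
SET vt = now() + interval '30 seconds',   -- hide the message while it is being processed
    read_ct = read_ct + 1
FROM picked
WHERE q.msg_id = picked.msg_id
RETURNING q.msg_id, q.read_ct, q.enqueued_at, q.vt, q.message;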

When the consumer finishes, calling pgmq.archive() moves the row from q_events into a_events, while pgmq.delete() simply removes it for good. That archive table is extremely useful in practice: it gives you a full audit trail of what was processed, when, and how many times.
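
As a small sketch, again assuming the events queue (and a hypothetical message id 42), archiving and then querying the audit trail looks like this:

-- Move message 42 from the live queue into the archive
SELECT pgmq.archive('events', 42);

-- Inspect what was processed and when
SELECT msg_id, read_ct, enqueued_at, archived_at, message
FROM pgmq.a_events
ORDER BY archived_at DESC
LIMIT 10;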

There is also a small pgmq.meta table which stores queue-level metadata: the queue name, whether it is partitioned or unlogged, and when it was created. Think of it as the control plane.
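
To see which queues exist and how they were created, you can query it directly (column names here reflect recent pgmq versions and may differ slightly between releases):

SELECT queue_name, is_partitioned, is_unlogged, created_at
FROM pgmq.meta;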

The key thing to understand is this: pgmq is just SQL implementing queue semantics. If you can read the tables, you can understand the system. There is no black box. What you see in the database is exactly what the queue is doing.

And that is precisely why pgmq feels so easy to debug compared to traditional brokers.

In the next section, we’ll set up a minimal pgmq environment using Docker and Kubernetes, and walk through a working producer–consumer example.

Setting Up pgmq Locally (A Minimal Working Example)

We’ll start by running Postgres with pgmq using a single Kubernetes deployment.

Prerequisites: Docker, Minikube

# postgres.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: postgres
spec:
  replicas: 1
  selector:
    matchLabels:
      app: postgres
  template:
    metadata:
      labels:
        app: postgres
    spec:
      containers:
        - name: postgres
          image: ghcr.io/pgmq/pg18-pgmq:v1.7.0
          imagePullPolicy: IfNotPresent
          env:
            - name: POSTGRES_USER
              value: xxx
            - name: POSTGRES_PASSWORD
              value: xxx
            - name: POSTGRES_DB
              value: queue_db
          ports:
            - containerPort: 5432
---
apiVersion: v1
kind: Service
metadata:
  name: postgres
spec:
  type: NodePort
  selector:
    app: postgres
  ports:
    - port: 5432
      nodePort: 30007


kubectl apply -f postgres.yaml
kubectl get pods
kubectl port-forward svc/postgres 5432:5432

In a new terminal:

psql -h localhost -U xxx -d queue_db


To create a queue:

CREATE EXTENSION pgmq;
SELECT pgmq.create('events');
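
Optionally, as a quick sanity check, you can send one message and read it back straight from psql:

SELECT pgmq.send('events', '{"hello": "world"}');
SELECT * FROM pgmq.read('events', 30, 1);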

Producer Setup (Python)

# producer.py

import psycopg2
import json
import time

conn = psycopg2.connect(
    host="postgres",
    port=5432,
    user="xxx",
    password="xxx",
    dbname="queue_db"
)

cur = conn.cursor()
i = 0

while True:
    payload = {"id": i, "type": "order_created"}
    cur.execute("SELECT pgmq.send('events', %s)", [json.dumps(payload)])
    conn.commit()
    print("Produced:", payload)
    i += 1
    time.sleep(1)
# Dockerfile

FROM python:3.11-slim
WORKDIR /app
RUN pip install psycopg2-binary
COPY producer.py .
CMD ["python", "producer.py"]
# producer.yaml

apiVersion: apps/v1
kind: Deployment
metadata:
  name: pg-producer
spec:
  replicas: 1
  selector:
    matchLabels:
      app: pg-producer
  template:
    metadata:
      labels:
        app: pg-producer
    spec:
      containers:
        - name: producer
          image: pg-producer
          imagePullPolicy: IfNotPresent

eval $(minikube docker-env)   # build inside Minikube's Docker daemon so the cluster can find the local image
docker build -t pg-producer .
kubectl apply -f producer.yaml


Consumer Setup (Python)

# consumer.py

import psycopg2
import time

conn = psycopg2.connect(
    host="postgres",
    port=5432,
    user="user",
    password="pass",
    dbname="queue_db"
)

cur = conn.cursor()

while True:
    # pgmq.read(queue, vt, qty): read up to 5 messages and hide them
    # from other consumers for 30 seconds while they are processed
    cur.execute("SELECT * FROM pgmq.read('events', 30, 5)")
    rows = cur.fetchall()

    if not rows:
        conn.commit()  # end the read transaction before sleeping
        time.sleep(1)
        continue

    for row in rows:
        # row layout: msg_id, read_ct, enqueued_at, vt, message, headers
        msg_id = row[0]
        body = row[4]
        print("Consumed:", body)

        # remove the message once it has been processed successfully
        cur.execute("SELECT pgmq.delete('events', %s)", [msg_id])
        conn.commit()

# Dockerfile

FROM python:3.11-slim
WORKDIR /app
RUN pip install psycopg2-binary
COPY consumer.py .
CMD ["python", "consumer.py"]
# consumer.yaml

apiVersion: apps/v1
kind: Deployment
metadata:
  name: pg-consumer
spec:
  replicas: 1
  selector:
    matchLabels:
      app: pg-consumer
  template:
    metadata:
      labels:
        app: pg-consumer
    spec:
      containers:
        - name: consumer
          image: pg-consumer
          imagePullPolicy: IfNotPresent

eval $(minikube docker-env)   # if your shell is not already pointed at Minikube's Docker daemon
docker build -t pg-consumer .
kubectl apply -f consumer.yaml


Ensure all pods are running

kubectl get pods

NAME                          READY   STATUS    RESTARTS   AGE
pg-consumer-c976b84f8-t84mt   1/1     Running   0          27s
pg-producer-85cd846b4-llj72   1/1     Running   0          92s
postgres-5f9b95c698-pjjnp     1/1     Running   0          20m


Verify Logs

kubectl logs deployment/pg-producer

kubectl logs deployment/pg-consumer


Optional: pgweb UI

kubectl run pgweb --image=sosedoff/pgweb -- \
  --host=postgres \
  --port=5432 \
  --user=xxx \
  --pass=xxx \
  --db=queue_db \
  --ssl=disable
kubectl port-forward pod/pgweb 8081:8081

Visit http://localhost:8081 in your browser.

Run a sample query to see the events:

SELECT * FROM pgmq.q_events LIMIT 20;
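
You can also get a quick health summary with pgmq's built-in metrics function (available in recent pgmq releases):

SELECT * FROM pgmq.metrics('events');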


Not Everything Needs to Be Kafka

pgmq is not trying to replace Kafka, and it shouldn’t. It solves a different problem. If you need high-throughput streaming, multiple independent consumers, or large-scale event processing, Kafka is still the right tool.

But if all you need is a reliable, observable queue for background work, retries, or internal workflows, Postgres is often more than enough. You already run it, you already trust it, and you can see exactly what is happening inside it.

In many systems, the queue is not the product — it is just plumbing. And for plumbing, simple and boring is usually better than powerful and complex.
