Polliog

Kubernetes Is Overkill for 99% of Apps (We Run 500k Logs/Day on Docker Compose)

Everyone says you need Kubernetes for production. They're wrong.

We run Logtide, a full observability platform with log management, SIEM, real-time streaming, and analytics, entirely on Docker Compose. We handle 500,000 logs per day. We serve thousands of users. We maintain 99.8% uptime.

Our infrastructure? One server. One docker-compose.yml file. Zero Kubernetes complexity.

Here's why most apps don't need Kubernetes, and how we actually run production workloads at scale with boring tech.

The Kubernetes Tax

Every technology has a complexity tax. You pay it every time you deploy. Every time you debug. Every time you onboard someone new.

For Kubernetes, that tax is massive.

What Kubernetes Actually Costs

Learning curve:

  • 2-3 months for experienced engineers to become proficient
  • Understanding pods, services, deployments, ingresses, config maps, secrets, persistent volumes
  • Networking models, service meshes, CNI plugins
  • RBAC, security contexts, pod security policies

Operational overhead:

  • etcd cluster management
  • Control plane updates
  • Certificate rotation
  • Network policy debugging
  • Storage class configuration
  • Monitoring stack (Prometheus, Grafana, Alertmanager)

Infrastructure costs:

  • Minimum 3 control plane nodes for HA
  • Worker nodes (minimum 2-3)
  • Load balancers
  • Managed Kubernetes service fees ($70-150/month on top of compute)

A recent case study showed a team spending 60 hours per month on Kubernetes operations for a 5-service application. That's 1.5 engineer-weeks every month just keeping the lights on.

The Industry Reality

2025 CNCF data shows:

  • 80% of Kubernetes users are at companies with 1000+ employees
  • 79% of production issues come from configuration changes
  • Most K8s clusters run <20 services

Translation: Kubernetes is enterprise tooling being cargo-culted by startups.

What We Actually Run On

Our entire stack:

version: '3.8'

services:
  postgres:
    image: timescale/timescaledb:latest-pg16
    volumes:
      - postgres_data:/var/lib/postgresql/data
    environment:
      POSTGRES_DB: logtide
      POSTGRES_USER: logtide
      POSTGRES_PASSWORD: ${DB_PASSWORD}
    deploy:
      resources:
        limits:
          memory: 4G
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U logtide"]
      interval: 10s
      timeout: 5s
      retries: 5

  backend:
    image: logtide/backend:latest
    depends_on:
      postgres:
        condition: service_healthy
    environment:
      DATABASE_URL: postgresql://logtide:${DB_PASSWORD}@postgres:5432/logtide
      NODE_ENV: production
    deploy:
      replicas: 2
      resources:
        limits:
          memory: 1G
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:3000/health"]
      interval: 30s
      timeout: 10s
      retries: 3

  frontend:
    image: logtide/frontend:latest
    depends_on:
      - backend
    deploy:
      resources:
        limits:
          memory: 512M

  nginx:
    image: nginx:alpine
    ports:
      - "80:80"
      - "443:443"
    volumes:
      - ./nginx.conf:/etc/nginx/nginx.conf:ro
      - ./certs:/etc/nginx/certs:ro
    depends_on:
      - frontend
      - backend

volumes:
  postgres_data:

That's it. 60 lines. Everyone on the team understands it.

Our Production Numbers

Server specs:

  • Hetzner CX33: 4 vCPU, 8GB RAM
  • 160GB NVMe storage
  • Cost: €12.50/month (~$14)

Traffic:

  • 500,000 logs/day ingested
  • ~6 logs/second average, 50/sec peak
  • 1000+ concurrent SSE connections for live tailing
  • 15GB compressed storage (TimescaleDB compression)

Performance:

  • P50 log ingest: 45ms
  • P95 log ingest: 120ms
  • Real-time streaming latency: <50ms
  • Search queries: 50-200ms

Reliability:

  • 99.8% uptime over 3 months
  • Zero-downtime deploys
  • Deploy time: 30 seconds
  • Recovery time: 2 minutes (restart all services)

Cost comparison:

  • Our setup: $14/month
  • Equivalent managed K8s: $150-300/month (EKS/GKE/AKS base + nodes)
  • Savings: $1,632-3,432/year

Production Features We Actually Use

Zero-Downtime Deploys

#!/bin/bash
# deploy.sh

echo "Pulling latest images..."
docker compose pull

echo "Updating backend (rolling)..."
docker compose up -d --no-deps --scale backend=3 backend
sleep 10
docker compose up -d --no-deps --scale backend=2 backend

echo "Updating frontend..."
docker compose up -d --no-deps frontend

echo "Reloading nginx..."
docker compose exec nginx nginx -s reload

echo "Deploy complete!"

No Kubernetes rolling updates. No replica sets. Just Docker Compose and a 10-line bash script.
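
A cheap way to confirm a deploy really was zero-downtime is to hammer the health endpoint while the script runs. A minimal sketch, assuming nginx proxies a /health route through to the backend (the URL is our assumption, adjust it to your routing):

#!/bin/bash
# Hypothetical smoke test: run in a second terminal while deploy.sh executes.
# Prints the HTTP status once per second; anything other than 200 means a dropped window.
for i in $(seq 1 30); do
  code=$(curl -s -o /dev/null -w '%{http_code}' http://localhost/health)
  echo "$(date +%T) -> $code"
  sleep 1
done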

Health Checks and Auto-Restart

Docker Compose handles it:

healthcheck:
  test: ["CMD-SHELL", "pg_isready"]
  interval: 10s
  retries: 5

restart: unless-stopped

Container dies? The restart policy brings it back. Health check fails? Docker marks the container unhealthy, and anything gated behind depends_on: condition: service_healthy waits until it recovers.
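
Seeing the health state Docker is tracking takes nothing beyond the standard CLI; for example (the container name below follows Compose's default project-service-index naming and is illustrative):

# Per-service state, including (healthy)/(unhealthy) markers
docker compose ps

# Raw health status of a single container
docker inspect --format '{{.State.Health.Status}}' logtide-postgres-1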

Resource Limits

deploy:
  resources:
    limits:
      memory: 4G
      cpus: '2'
    reservations:
      memory: 2G

No need for K8s resource quotas or limit ranges.

Secrets Management

# .env file (git-ignored)
DB_PASSWORD=xxx
JWT_SECRET=xxx
SMTP_PASSWORD=xxx

Combined with proper file permissions:

chmod 600 .env

Is it HashiCorp Vault? No. Does it work? Yes.

For production, we use environment variables injected by the CI/CD system. For truly sensitive data (like TLS certs), we use encrypted volumes with LUKS.

Monitoring

We self-host Prometheus + Grafana in the same Docker Compose stack:

prometheus:
  image: prom/prometheus
  volumes:
    - ./prometheus.yml:/etc/prometheus/prometheus.yml
    - prometheus_data:/prometheus

grafana:
  image: grafana/grafana
  volumes:
    - grafana_data:/var/lib/grafana
  depends_on:
    - prometheus

Total addition: 15 lines. No Kubernetes operator. No Helm charts.
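
The ./prometheus.yml mounted above stays equally small. A minimal sketch, assuming the backend exposes a Prometheus /metrics endpoint on port 3000 (that path and port are our guess, not something the compose file guarantees):

# Write a minimal scrape config next to docker-compose.yml
cat > prometheus.yml <<'YAML'
global:
  scrape_interval: 15s

scrape_configs:
  - job_name: 'backend'
    metrics_path: /metrics
    static_configs:
      - targets: ['backend:3000']   # resolved over the compose network
YAML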

Backups

#!/bin/bash
# backup.sh

docker compose exec -T postgres pg_dump -U logtide logtide | \
  gzip > /backups/logtide-$(date +%Y%m%d).sql.gz

# Upload to S3
aws s3 cp /backups/logtide-$(date +%Y%m%d).sql.gz \
  s3://logtide-backups/

# Keep last 30 days locally
find /backups -name "logtide-*.sql.gz" -mtime +30 -delete

Run via cron. Works perfectly. No Velero. No K8s backup operators.
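
The cron side is a single line; a sketch assuming the script lives at /opt/logtide/backup.sh (the paths are illustrative):

# Install with `crontab -e`: nightly backup at 03:00, output appended to a log file
0 3 * * * /opt/logtide/backup.sh >> /var/log/logtide-backup.log 2>&1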

When We'll Actually Need Kubernetes

We're honest about this. We'll move to Kubernetes when:

1. Multi-region becomes critical

Docker Compose is single-host. If we need true multi-region failover with automatic traffic routing, we'll need K8s.

Current workaround: Run separate Docker Compose stacks in each region with DNS failover. Works for now.

2. We need >10 servers

Managing Docker Compose across 10+ hosts gets painful. At that scale, K8s orchestration wins.

Current reality: We're on 1 server. We could probably scale to 3-5 before Docker Swarm or K8s becomes necessary.

3. Complex microservices networking

If we split into 50+ services with complex service mesh requirements, K8s wins.

Current architecture: 4 services. Simple nginx routing. No need for Istio or Linkerd.

4. We hire a dedicated platform team

Kubernetes requires dedicated people. If we grow to the point where we can afford a 2-3 person platform engineering team, K8s makes sense.

Current team: 2 engineers. We can't afford Kubernetes babysitting.

The Real Question

Before adopting Kubernetes, ask:

Do we have the problems Kubernetes solves?

  • Running 100+ services? (No)
  • Need cross-datacenter orchestration? (No)
  • Have 5+ platform engineers? (No)
  • Need complex traffic splitting? (No)
  • Spending >$50k/month on infrastructure? (No)

If you answered "no" to most of these, you don't need Kubernetes.

What Actually Matters

Good infrastructure has three properties:

1. It disappears

You deploy code. It works. You don't think about infrastructure daily.

With Kubernetes, infrastructure becomes a full-time job. With Docker Compose, we git push and move on.

2. Everyone understands it

When something breaks at 2 AM, can anyone on the team fix it?

With Docker Compose: Yes. It's just containers and bash scripts.

With Kubernetes: No. You need the "Kubernetes person."

3. It's proportional to your scale

Your infrastructure complexity should match your actual scale, not your imagined future scale.

We serve thousands of users on one server. Why would we architect for millions?

Our Actual Scaling Path

Today (500k logs/day):

  • 1 server
  • Docker Compose
  • $14/month

At 5M logs/day:

  • 3 servers (1 primary, 2 read replicas)
  • Still Docker Compose
  • Manual failover
  • $50/month

At 50M logs/day:

  • 10+ servers
  • Consider Kubernetes or Docker Swarm
  • Need proper orchestration
  • $500+/month

Key insight: We're two orders of magnitude away from needing Kubernetes.

The Stack We Actually Built

Backend: Fastify (Node.js) - 2 replicas

Frontend: SvelteKit 5 - static files served by nginx

Database: PostgreSQL with TimescaleDB extension

Real-time: PostgreSQL LISTEN/NOTIFY + Server-Sent Events

Search: PostgreSQL full-text search + GIN indexes

Caching: PostgreSQL UNLOGGED tables (Redis optional)

Queue: PostgreSQL with SKIP LOCKED (sketched below)

Pub/Sub: PostgreSQL LISTEN/NOTIFY

Analytics: TimescaleDB continuous aggregates

Notice a pattern? PostgreSQL does everything.

One database. One connection string. Zero microservice complexity.
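
The queue item above is less exotic than it sounds. A sketch of the SKIP LOCKED pattern against a hypothetical jobs table (not Logtide's actual schema): each worker claims one pending row, and concurrent workers simply skip rows that are already locked.

# Claim the next pending job; safe to run from many workers at once.
# Table and column names are illustrative, not Logtide's real schema.
docker compose exec -T postgres psql -U logtide -d logtide <<'SQL'
WITH next_job AS (
  SELECT id
  FROM jobs
  WHERE status = 'pending'
  ORDER BY created_at
  FOR UPDATE SKIP LOCKED
  LIMIT 1
)
UPDATE jobs
SET status = 'running', started_at = now()
FROM next_job
WHERE jobs.id = next_job.id
RETURNING jobs.id, jobs.payload;
SQL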

Why This Works

1. Postgres is incredibly powerful

With extensions (TimescaleDB, pg_cron, pgvector), Postgres handles:

  • Time-series data (TimescaleDB)
  • Full-text search (see the sketch below)
  • JSON documents
  • Job queues
  • Pub/sub
  • Geospatial data (PostGIS)
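
The full-text search point is also plain SQL: a GIN index over a tsvector expression plus a tsquery match. A sketch assuming a hypothetical logs table with a message column (names are illustrative):

# Full-text search sketch; table and column names are illustrative.
docker compose exec -T postgres psql -U logtide -d logtide <<'SQL'
CREATE INDEX IF NOT EXISTS logs_message_fts
  ON logs USING GIN (to_tsvector('english', message));

SELECT time, level, message
FROM logs
WHERE to_tsvector('english', message) @@ plainto_tsquery('english', 'connection timeout')
ORDER BY time DESC
LIMIT 50;
SQL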

2. Boring tech is reliable

Docker Compose has existed since 2014. It's mature. It's stable. The docs are excellent.

Kubernetes ships a new release every few months. Your deployment manifests break. Your Helm charts need updates.

3. Simple systems are debuggable

When logs aren't appearing:

  • Check Docker Compose logs: docker compose logs -f backend
  • Check database: psql and run queries
  • Check nginx: docker compose exec nginx cat /var/log/nginx/error.log

No kubectl context switching. No pod network debugging. No CNI plugin issues.

4. Constraints breed creativity

Having to fit everything on one server forced us to:

  • Use TimescaleDB compression (90% space savings, sketched below)
  • Optimize queries properly
  • Build efficient real-time features with Postgres
  • Actually understand our performance characteristics

With unlimited Kubernetes resources, we'd have thrown hardware at problems instead of solving them.
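
For the compression point above, the TimescaleDB setup is a handful of statements. A sketch assuming a hypothetical logs hypertable partitioned on a time column; the names and policy interval are ours, not Logtide's:

# TimescaleDB compression sketch; table/column names and intervals are illustrative.
docker compose exec -T postgres psql -U logtide -d logtide <<'SQL'
-- Turn the table into a hypertable (no-op if it already is one)
SELECT create_hypertable('logs', 'time', if_not_exists => TRUE);

-- Enable compression, segmenting by the column most queries filter on
ALTER TABLE logs SET (
  timescaledb.compress,
  timescaledb.compress_segmentby = 'project_id'
);

-- Automatically compress chunks older than 7 days
SELECT add_compression_policy('logs', INTERVAL '7 days');
SQL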

What We Learned

Enterprise best practices are often wrong for you

Google runs Kubernetes because they have 50,000 engineers and millions of containers.

You are not Google.

Simplicity is a feature

The best infrastructure is the one you don't have to think about.

Premature optimization extends to infrastructure

Don't build for 10M users when you have 100.

The boring choice is usually right

Docker Compose is boring. PostgreSQL is boring. nginx is boring.

Boring means: mature, stable, well-documented, widely understood.

Technical debt is real

Every piece of infrastructure you add is debt. You'll pay interest on it forever through:

  • Maintenance
  • Upgrades
  • Monitoring
  • Documentation
  • Training new team members

Choose wisely.

Start Simple

If you're building a new application:

Day 1: Docker Compose on one server
100 users: Still Docker Compose
1,000 users: Still Docker Compose, maybe add monitoring
10,000 users: Add read replicas, still Docker Compose
100,000 users: Now consider proper orchestration

Most companies never reach 100,000 users. Don't prematurely optimize for scale you'll never see.

Our Recommendation

Use Docker Compose if:

  • You have <20 services
  • You can fit on <5 servers
  • Team size <10 engineers
  • You value simplicity over features
  • You're not managing multi-region deployments

Consider Kubernetes if:

  • You have >50 services
  • You need >10 servers
  • You have dedicated platform engineers
  • Complex networking requirements
  • Multi-region is critical
  • Enterprise compliance requires it

For everyone else: Docker Compose is enough.

Try It Yourself

Logtide runs entirely on Docker Compose. We're open source (AGPLv3).

Get started in 2 minutes:

git clone https://github.com/logtide-dev/logtide
cd logtide
cp .env.example .env
docker compose up -d

No Kubernetes. No complexity. Just working software.

Production-ready log management, SIEM, and observability on boring tech.


500,000 logs/day. One server. Zero Kubernetes.

Sometimes the best solution is the simplest one.


GitHub: https://github.com/logtide-dev/logtide
Website: https://logtide.dev

Top comments (2)

Art light

This is a great write-up—clear, honest, and refreshingly grounded in real production numbers instead of hype. I really like how you frame Kubernetes as a tool with a cost, not a default badge of maturity, and your transparency around trade-offs makes the argument much stronger. Personally, I agree that matching infrastructure complexity to actual scale is underrated, and your “boring tech” approach shows how far solid fundamentals can go. It also sets a realistic expectation for small teams who want to ship and operate reliably without burning energy on platform overhead. Curious to see how your setup evolves over time, especially if you push into multi-region or higher ingest volumes—this feels like a very sane baseline to learn from.

david duymelinck

On your setup: while a single data store like Postgres can do a lot of things, at some point you are going to lose money by not using specialized storage systems. For example, a search-specific database will handle the same amount of data better with fewer resources.

The 5 million logs/day target might be better served by a database built to handle that volume. The reason all the specialized solutions exist is that at scale they make sense.

And it might not even be the data store that becomes the problem, but the programming language. Scaling has no one-size-fits-all solution.

While looking for the simplest solution is something we all should do, the complexity of scaling can make the right solution more complicated than the simplest one.

The main idea of the post is solid; it is the single-solution thinking that makes it miss the mark for me.