So you spun up a VPS, deployed your app, told everyone it was live — and then woke up to angry Slack messages because the whole thing went down at 3 AM. Welcome to the club.
Self-hosting production applications is one of those things that sounds straightforward until you actually do it. I've been running self-hosted services for about six years now, and the gap between "it works on my server" and "it works reliably in production" is where most of the pain lives. There's actually a massive free guide floating around (750+ pages) covering this exact territory, which reminded me that a lot of developers keep hitting the same walls.
Let me walk through the most common reasons self-hosted apps fail in production and how to actually fix them.
The Root Cause: You Deployed an App, Not a System
Here's the core issue. When you docker compose up -d and walk away, you've deployed an application. But production needs a system — monitoring, automatic restarts, log rotation, backups, resource limits, and reverse proxy configuration that doesn't fall over.
Most 3 AM crashes come down to one of three things:
- Memory exhaustion — your app (or its database) slowly ate all available RAM
- Disk full — logs or temp files filled the drive
- No automatic recovery — the process crashed and nothing restarted it
Let's fix all three.
Step 1: Set Resource Limits (Stop the OOM Killer)
If you're using Docker Compose, you need memory limits. Without them, a single misbehaving container can take down everything on the host.
# docker-compose.yml
services:
  app:
    image: your-app:latest
    deploy:
      resources:
        limits:
          memory: 512M   # hard ceiling — container gets killed past this
          cpus: '1.0'
        reservations:
          memory: 256M   # guaranteed minimum
    restart: unless-stopped  # this alone prevents most 3 AM incidents
  postgres:
    image: postgres:16
    deploy:
      resources:
        limits:
          memory: 1G
    # the official postgres image has no env var for shared_buffers;
    # pass it as a server flag instead, tuned to ~25% of the memory limit
    command: postgres -c shared_buffers=256MB
    volumes:
      - pgdata:/var/lib/postgresql/data
    restart: unless-stopped
That restart: unless-stopped line is doing heavy lifting. It means Docker will automatically restart crashed containers unless you explicitly stopped them. I'm genuinely surprised how many production setups I've seen without it.
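One way to audit an existing host for this is to ask Docker directly. A rough sketch (container names and counts will obviously vary per host):

```shell
# List each running container's restart policy and how often Docker has restarted it
docker ps -q | xargs docker inspect --format \
  '{{.Name}} policy={{.HostConfig.RestartPolicy.Name}} restarts={{.RestartCount}}'
```

Any line showing `policy=no` is a container that will stay down after its next crash.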
Step 2: Fix the Silent Disk Killer
Docker logs will eat your disk alive if you don't configure rotation. By default, Docker just appends JSON logs forever. I learned this the hard way when a 40GB log file took down a production Postgres instance.
Add this to your Docker daemon config:
// /etc/docker/daemon.json
{
  "log-driver": "json-file",
  "log-opts": {
    "max-size": "10m",
    "max-file": "3"
  }
}
Restart Docker after changing this. Existing containers need to be recreated (not just restarted) to pick up the new logging config.
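In practice that looks something like this (the container name `app` is an example; substitute your own):

```shell
# Apply the new daemon config
sudo systemctl restart docker

# Recreate containers so they pick up the new log options (a plain restart won't)
docker compose up -d --force-recreate

# Verify the rotation settings actually landed
docker inspect --format '{{json .HostConfig.LogConfig}}' app
```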
While you're at it, set up a basic disk monitoring cron job:
#!/bin/bash
# /usr/local/bin/disk-check.sh
# Alert when disk usage crosses 85%

THRESHOLD=85
USAGE=$(df / | tail -1 | awk '{print $5}' | sed 's/%//')

if [ "$USAGE" -gt "$THRESHOLD" ]; then
  # swap this for your preferred notification method
  curl -X POST "https://your-webhook-url" \
    -H "Content-Type: application/json" \
    -d "{\"text\": \"Disk usage at ${USAGE}% on $(hostname)\"}"
fi
Schedule it every 15 minutes with cron and you'll never be surprised by a full disk again.
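The crontab entry for that is a single line (edit with `crontab -e`; the path assumes the script location above):

```shell
*/15 * * * * /usr/local/bin/disk-check.sh
```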
Step 3: Add Health Checks That Actually Work
Docker health checks let you detect when your app is technically running but not actually working — like when your Node.js server is up but its event loop is blocked.
# docker-compose.yml
services:
  app:
    image: your-app:latest
    healthcheck:
      # note: curl must exist inside the image for this check to work
      test: ["CMD", "curl", "-f", "http://localhost:3000/health"]
      interval: 30s
      timeout: 5s
      retries: 3
      start_period: 10s  # grace period for startup
    restart: unless-stopped
But here's the thing most guides skip: your /health endpoint needs to actually check dependencies. Don't just return 200.
// Express health check that actually means something
app.get('/health', async (req, res) => {
  try {
    // check database connection
    await db.query('SELECT 1');
    // check redis if you use it
    await redis.ping();
    res.status(200).json({ status: 'ok' });
  } catch (err) {
    // a 503 makes the healthcheck's `curl -f` fail, so Docker
    // marks the container unhealthy after the configured retries
    res.status(503).json({ status: 'degraded', error: err.message });
  }
});
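Once the endpoint is wired up, it's worth checking both sides by hand. A quick sketch, assuming the container name `app` and port 3000 from the examples above:

```shell
# Hit the endpoint the same way the healthcheck does; -f makes curl exit non-zero on 503
curl -fsS http://localhost:3000/health

# Ask Docker what it currently thinks (healthy / unhealthy / starting)
docker inspect --format '{{.State.Health.Status}}' app
```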
Step 4: Reverse Proxy Configuration That Doesn't Suck
If you're exposing services to the internet, you need a reverse proxy. Caddy has become my go-to because it handles TLS certificates automatically and the config is minimal.
# Caddyfile
yourapp.example.com {
    reverse_proxy app:3000 {
        # active health checks — stop sending traffic to dead upstreams
        health_uri /health
        health_interval 30s
    }

    # basic rate limiting to prevent abuse
    # (requires the caddy-ratelimit plugin; it's not in stock Caddy builds)
    rate_limit {
        zone dynamic {
            key {remote_host}
            events 100
            window 1m
        }
    }

    encode gzip

    log {
        output file /var/log/caddy/access.log {
            roll_size 50mb
            roll_keep 5
        }
    }
}
Caddy handles HTTPS automatically through Let's Encrypt. No certbot cron jobs, no renewal scripts. It just works.
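Before touching a live config, validate it first. Note the `rate_limit` directive above needs a Caddy build that includes the rate-limit plugin (e.g. built with xcaddy); a stock binary will fail validation on that directive:

```shell
# Syntax-check the Caddyfile, then apply it without downtime
caddy validate --config /etc/caddy/Caddyfile
caddy reload --config /etc/caddy/Caddyfile
```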
Step 5: Backups (The Thing You'll Wish You Had)
I know. Backups are boring. But future-you will be incredibly grateful. Here's a minimal but functional approach for Postgres:
#!/bin/bash
# /usr/local/bin/backup-db.sh
set -euo pipefail

BACKUP_DIR="/backups/postgres"
TIMESTAMP=$(date +%Y%m%d_%H%M%S)
RETENTION_DAYS=7

mkdir -p "$BACKUP_DIR"

# dump the database from the running container (custom format, restorable with pg_restore)
docker exec postgres pg_dump -U appuser -Fc appdb > "${BACKUP_DIR}/app_${TIMESTAMP}.dump"

# clean up old backups
find "$BACKUP_DIR" -name "*.dump" -mtime +"$RETENTION_DAYS" -delete

# optional: sync to remote storage
# rclone copy "$BACKUP_DIR" remote:backups/postgres --max-age 24h
Run this daily via cron. Uncomment the rclone line when you've set up remote storage — local-only backups on the same server are better than nothing, but not by much.
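As a sketch, the cron entry plus an occasional restore check (the paths follow the script above; the dump filename is a hypothetical example). A backup you've never test-restored is a hope, not a backup:

```shell
# Daily backup at 03:30
30 3 * * * /usr/local/bin/backup-db.sh

# Periodically verify a dump is actually restorable (lists its contents without restoring)
pg_restore --list /backups/postgres/app_20240101_033000.dump
```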
Prevention: The Checklist
Before you consider any self-hosted deployment "production ready," run through this:
- Resource limits set for every container
- Restart policies configured (unless-stopped at minimum)
- Log rotation enabled at the Docker daemon level
- Health checks that verify actual functionality, not just process liveness
- TLS termination via a reverse proxy with automatic cert renewal
- Automated backups with at least one off-server copy
- Disk and memory monitoring with alerts
- Firewall rules — only expose ports 80, 443, and your SSH port
- Unattended security updates enabled on the host OS
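The firewall item can be sketched with ufw, assuming SSH on the default port 22 (adjust before enabling if you've moved it, or you'll lock yourself out):

```shell
sudo ufw default deny incoming
sudo ufw default allow outgoing
sudo ufw allow 22/tcp    # SSH; change this if you run SSH on another port
sudo ufw allow 80/tcp    # HTTP (needed for Let's Encrypt HTTP-01 and redirects)
sudo ufw allow 443/tcp   # HTTPS
sudo ufw enable
```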
You don't need Kubernetes for this. You don't need a managed platform. A single well-configured VPS with Docker Compose can reliably host a surprising amount of production traffic. The key word is well-configured.
The Bigger Picture
Self-hosting is making a comeback for good reasons — cost control, data sovereignty, and honestly just the satisfaction of running your own infrastructure. But the gap between tutorials and production-grade setups is real, and it's where most people get burned.
The pattern is almost always the same: the app itself is fine, but the operational wrapper around it is missing. Add restart policies, resource limits, health checks, log management, and backups, and you've eliminated probably 90% of the 3 AM pages.
Now go set up those log rotation limits before your disk fills up. Ask me how I know this is urgent.