DEV Community

Polliog
Docker Compose for Production: Lessons from Deploying a Log Management Platform

When I started building LogWard - an open-source alternative to Datadog - I made a controversial decision: no custom installation scripts. Just plain Docker Compose files that users can read, understand, and modify.

Most self-hosted platforms give you a curl | bash script that does "magic" behind the scenes. That approach might be convenient, but it breaks trust in a privacy-first platform. If you can't see what's being deployed, how can you trust it with your logs?

Here's what I learned deploying a production-grade log management platform with transparent Docker Compose configurations.

The Philosophy: Transparency Over Convenience

# This is what users see - no hidden steps
services:
  postgres:
    image: timescale/timescaledb:latest-pg16
    environment:
      POSTGRES_DB: logward
      POSTGRES_USER: logward
      POSTGRES_PASSWORD: ${DB_PASSWORD}
    volumes:
      - postgres_data:/var/lib/postgresql/data

Why this matters:

  • Security teams can audit every line before deployment
  • Users understand what resources they're committing
  • No surprises about ports, volumes, or network configurations
  • Easy to customize for specific infrastructure needs

The trade-off? Users need to understand basic Docker concepts. But my target audience (European SMBs and developers) already runs Docker in production.

Lesson 1: Health Checks Are Non-Negotiable

Early versions of LogWard had a race condition: the backend would start before PostgreSQL was ready, leading to connection errors. The solution? Proper health checks and service dependencies.

postgres:
  image: timescale/timescaledb:latest-pg16
  healthcheck:
    test: ["CMD-SHELL", "pg_isready -U logward"]
    interval: 10s
    timeout: 5s
    retries: 5

backend:
  image: logward/backend:latest
  depends_on:
    postgres:
      condition: service_healthy  # 👈 Wait for actual health, not just "started"

The Problem with depends_on Alone

Most tutorials show depends_on without condition: service_healthy. This only ensures services start in order - not that they're ready to accept connections.

Real-world impact: Before health checks, ~30% of first-time deployments failed because the backend tried to connect to PostgreSQL before it finished initializing. After implementing health checks: 0% startup failures.

# ❌ BAD: Service might not be ready
depends_on:
  - postgres

# ✅ GOOD: Wait for actual readiness
depends_on:
  postgres:
    condition: service_healthy
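The same pattern extends to Redis, which LogWard also depends on. A sketch (the image tag and an unauthenticated `redis-cli ping` are assumptions — if `requirepass` is enabled, the test command needs the password as well):

```yaml
services:
  redis:
    image: redis:7-alpine
    healthcheck:
      test: ["CMD", "redis-cli", "ping"]  # succeeds once Redis answers PONG
      interval: 10s
      timeout: 5s
      retries: 5

  backend:
    depends_on:
      redis:
        condition: service_healthy
```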

Lesson 2: Pre-Built Images vs. Building Locally

Initially, I assumed users would clone the repo and run docker-compose up --build. This caused three problems:

  1. Slow deployments - building from source takes 5-10 minutes
  2. Build failures - different Node/pnpm versions caused inconsistencies
  3. Trust issues - "What if the build process does something malicious?"

The solution: Push pre-built images to Docker Hub and GitHub Container Registry.

backend:
  image: logward/backend:latest  # 👈 Pre-built, reproducible
  # Build from source is still possible for advanced users
  # build:
  #   context: .
  #   dockerfile: packages/backend/Dockerfile

Results:

  • Deployment time: 10 minutes → 2 minutes
  • Zero build-related support tickets
  • Users can pull and inspect images before running them: docker inspect logward/backend:latest

Multi-Platform Builds

LogWard supports both AMD64 and ARM64 (for M1 Macs and ARM servers). Using GitHub Actions:

# .github/workflows/publish-images.yml
- name: Build and push
  uses: docker/build-push-action@v5
  with:
    platforms: linux/amd64,linux/arm64
    push: true
    tags: logward/backend:${{ steps.version.outputs.version }}
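For context, the steps around that snippet in a publish workflow might look like this (a sketch — the job name, checkout step, and Docker Hub secret names are assumptions; QEMU is what lets the amd64 runner build the arm64 image):

```yaml
# .github/workflows/publish-images.yml (abridged sketch)
jobs:
  publish:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      # Emulation + buildx make linux/arm64 buildable on an amd64 runner
      - uses: docker/setup-qemu-action@v3
      - uses: docker/setup-buildx-action@v3

      - uses: docker/login-action@v3
        with:
          username: ${{ secrets.DOCKERHUB_USERNAME }}
          password: ${{ secrets.DOCKERHUB_TOKEN }}

      - name: Build and push
        uses: docker/build-push-action@v5
        with:
          platforms: linux/amd64,linux/arm64
          push: true
          tags: logward/backend:latest
```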

Now users on any architecture just run docker compose up -d and it works.

Lesson 3: Environment Variables - The Right Way

Early versions had secrets in the compose file. Bad idea.

# ❌ NEVER do this
environment:
  DATABASE_URL: postgresql://logward:supersecret123@postgres:5432/logward

Instead, use .env files with generated secrets:

# docker-compose.yml
environment:
  DATABASE_URL: postgresql://logward:${DB_PASSWORD}@postgres:5432/logward
# .env (not committed to git)
DB_PASSWORD=
REDIS_PASSWORD=
API_KEY_SECRET=

Pro tip: I provide an install.sh script that generates secure random passwords automatically:

generate_password() {
    openssl rand -base64 32 | tr -d "=+/" | cut -c1-32
}

DB_PASSWORD=$(generate_password)
REDIS_PASSWORD=$(generate_password)

But the script is optional - users can still manually create .env if they prefer.
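A minimal version of that generator might look like this (a sketch — it mirrors the snippet above but draws 48 random bytes so the trimmed password can never fall short of 32 characters; the `write_env` helper name is mine, not from install.sh):

```shell
#!/bin/sh
# Generate a 32-character alphanumeric secret from OpenSSL's CSPRNG.
# (48 random bytes -> 64 base64 chars, so trimming to 32 never runs short.)
generate_password() {
    openssl rand -base64 48 | tr -d "=+/" | cut -c1-32
}

# Write a fresh .env, refusing to clobber an existing one so the
# installer is safe to re-run.
write_env() {
    target="$1"
    if [ -f "$target" ]; then
        echo "$target already exists, leaving it untouched" >&2
        return 0
    fi
    cat > "$target" <<EOF
DB_PASSWORD=$(generate_password)
REDIS_PASSWORD=$(generate_password)
API_KEY_SECRET=$(generate_password)
EOF
    chmod 600 "$target"  # secrets should not be world-readable
}

write_env .env
```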

Lesson 4: Volume Management for Data Persistence

Lost data is not an option for a log management platform. Here's how volumes work in LogWard:

services:
  postgres:
    volumes:
      - postgres_data:/var/lib/postgresql/data  # 👈 Named volume

volumes:
  postgres_data:
    driver: local  # 👈 Explicit driver (important for clustering later)

Why Named Volumes vs. Bind Mounts?

Bind mounts (./data:/var/lib/postgresql/data):

  • ❌ Permission issues on different systems
  • ❌ Harder to back up and restore
  • ❌ Not portable across Docker hosts

Named volumes:

  • ✅ Docker manages permissions
  • ✅ Easy to back up: docker run --rm -v postgres_data:/data -v $(pwd):/backup ubuntu tar czf /backup/postgres_backup.tar.gz /data
  • ✅ Can be migrated to network storage later

Backup Strategy

I document this explicitly in the deployment guide:

# Create backup
docker compose exec postgres pg_dump -U logward logward > backup_$(date +%Y%m%d).sql

# Restore from backup
docker compose exec -T postgres psql -U logward logward < backup_20250115.sql

Users need to know how to back up their data before disaster strikes.
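To keep those dumps from piling up, a small rotation helper can run from the same cron job as the `pg_dump` command (a sketch — the `rotate_backups` name, the `./backups` directory, and the 7-day retention are assumptions):

```shell
#!/bin/sh
# Delete database dumps older than a retention window (in days).
rotate_backups() {
    dir="$1"
    days="$2"
    find "$dir" -name 'backup_*.sql' -mtime +"$days" -delete
}

# Typical cron usage, after the pg_dump command above has written
# today's dump into ./backups:
#   rotate_backups ./backups 7
```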

Lesson 5: The Worker Pattern

LogWard has a worker service that shares the same image as the backend but runs background jobs (sending email alerts, processing Sigma rules, aggregating stats).

backend:
  image: logward/backend:latest
  command: ["server"]  # Default: runs Fastify API

worker:
  image: logward/backend:latest
  command: ["worker"]  # Same image, different entrypoint
  depends_on:
    postgres:
      condition: service_healthy

Why this architecture?

  • Separation of concerns - API stays responsive even during heavy background processing
  • Independent scaling - can run multiple workers without scaling the API
  • Single image - reduces complexity and storage

The command override is handled in the Dockerfile:

# packages/backend/Dockerfile
ENTRYPOINT ["./docker-entrypoint.sh"]
CMD ["server"]  # Default

# The entrypoint script maps "server" -> node dist/server.js
# and "worker" -> node dist/worker.js, so the compose `command:`
# values above pick the right process.
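One way to implement that mapping is a tiny dispatch script baked into the image (the source doesn't show the actual script, so this is a sketch — the `docker-entrypoint.sh` name and the `main` wrapper are mine):

```shell
#!/bin/sh
# docker-entrypoint.sh - dispatch a role name to the right Node entry file.
set -e

main() {
    case "$1" in
        server) exec node dist/server.js ;;
        worker) exec node dist/worker.js ;;
        *)      exec "$@" ;;  # fall through: run arbitrary commands for debugging
    esac
}

main "$@"
```

The fall-through arm means `docker run logward/backend sh` still drops you into a shell when you need to poke around the container.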

Lesson 6: Restart Policies Matter

Production services crash. It's inevitable. The question is: do they recover?

backend:
  restart: unless-stopped  # 👈 Survives reboots and crashes

Restart policy options:

  • no - Never restart (bad for production)
  • always - Always restart; a manually stopped container comes back when the Docker daemon restarts
  • on-failure - Restart only on non-zero exit codes
  • unless-stopped - Best for production: restarts after crashes and reboots, but stays down once explicitly stopped

Real-world scenario: A user reported their server rebooted for kernel updates. With restart: unless-stopped, LogWard came back online automatically. Without it, they would have had downtime until they manually restarted containers.
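With several services, a YAML anchor keeps the policy in one place instead of repeating it (a sketch — Compose ignores top-level `x-` extension fields, and `<<:` merges the shared keys into each service):

```yaml
x-restart: &restart-policy
  restart: unless-stopped

services:
  backend:
    <<: *restart-policy
    image: logward/backend:latest

  worker:
    <<: *restart-policy
    image: logward/backend:latest
```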

Lesson 7: Network Configuration

LogWard uses a custom bridge network instead of the default:

services:
  backend:
    networks:
      - logward-network

networks:
  logward-network:
    driver: bridge

Benefits:

  • Services can reference each other by name (postgres:5432, redis:6379)
  • Isolated from other Docker projects
  • Better security - services aren't accessible from other networks

Pitfall I discovered: The frontend needs to connect to the backend from outside the Docker network (browser β†’ backend). So I expose ports:

backend:
  ports:
    - "8080:8080"  # Host:Container
  networks:
    - logward-network

But internal services (like the worker) don't need exposed ports - they communicate via the internal network.
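Putting both halves of the lesson together: only the browser-facing service publishes a port, and everything else talks over the internal network (a sketch consistent with the services described above):

```yaml
services:
  backend:
    ports:
      - "8080:8080"        # the only service the browser reaches
    networks:
      - logward-network

  worker:
    networks:
      - logward-network    # no ports: reachable only inside the network

  postgres:
    networks:
      - logward-network    # backend and worker connect to postgres:5432

networks:
  logward-network:
    driver: bridge
```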

Lesson 8: Development vs. Production Compose Files

I maintain two compose files:

docker-compose.dev.yml - For contributors:

services:
  postgres:
    ports:
      - "5432:5432"  # 👈 Exposed for local development tools

  redis:
    ports:
      - "6379:6379"

docker-compose.yml - For production:

services:
  postgres:
    # No ports exposed - only accessible via internal network

Run with: docker compose -f docker-compose.dev.yml up

Why separate files?

  • Development needs database access from host (for migrations, debugging)
  • Production should never expose databases to the host network
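An alternative to two self-contained files is Compose's file merging: keep docker-compose.yml production-safe and put only the dev deltas in the second file (a sketch of that approach):

```yaml
# docker-compose.dev.yml - only the development-specific additions
services:
  postgres:
    ports:
      - "5432:5432"

  redis:
    ports:
      - "6379:6379"
```

Running `docker compose -f docker-compose.yml -f docker-compose.dev.yml up` merges the two files, layering the exposed ports on top of the production definitions.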

Lesson 9: Monitoring and Logs

Every service has logging configured:

# View logs for a specific service
docker compose logs -f backend

# View last 100 lines
docker compose logs --tail=100 backend

# View logs for all services
docker compose logs -f

I also added a health endpoint to the backend:

// GET /health
fastify.get('/health', async () => {
  const dbHealthy = await checkDatabase();
  const redisHealthy = await checkRedis();

  return {
    status: dbHealthy && redisHealthy ? 'healthy' : 'unhealthy',
    services: { postgres: dbHealthy, redis: redisHealthy }
  };
});

Users can monitor this with a simple cron job:

# Add to crontab
*/5 * * * * curl -f http://localhost:8080/health || systemctl restart docker-compose-logward
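The /health endpoint can also drive a container-level healthcheck, so `docker compose ps` reflects real application health rather than just "the process is running" (a sketch — it assumes `wget` exists in the backend image, as it does in most Alpine-based images):

```yaml
backend:
  image: logward/backend:latest
  healthcheck:
    test: ["CMD-SHELL", "wget -qO- http://localhost:8080/health | grep -q '\"status\":\"healthy\"'"]
    interval: 30s
    timeout: 5s
    retries: 3
```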

Lesson 10: What About Kubernetes?

People ask: "Why not Kubernetes? Isn't Docker Compose just for development?"

My take: For 90% of self-hosted use cases (SMBs, startups, personal projects), Docker Compose is perfect:

  • ✅ Runs on a single $5/month VPS
  • ✅ No learning curve beyond basic Docker
  • ✅ Easy to back up and restore
  • ✅ Low operational complexity

Kubernetes makes sense when you need:

  • Multi-node clustering
  • Auto-scaling across dozens of instances
  • Complex orchestration with service meshes

But LogWard's target users don't need that complexity. They need simple, transparent, reliable deployments.

The Results

After 3 months of production deployments:

  • 0 deployment failures due to Docker issues
  • Average deployment time: 2 minutes (from git clone to running services)
  • Support tickets related to deployment: ~5% (vs. 40% in early versions)
  • Self-hosting adoption rate: 35% of users prefer self-hosting over our cloud offering

Key Takeaways

  1. Transparency builds trust - visible configuration files are better than magic scripts
  2. Health checks prevent race conditions - use condition: service_healthy
  3. Pre-built images save time - build once, deploy everywhere
  4. Named volumes for persistence - easier backups and migrations
  5. Restart policies for resilience - unless-stopped is production-ready
  6. Separate dev and production configs - security and convenience aren't the same
  7. Document backup procedures - before users need them
  8. Docker Compose scales - you don't need Kubernetes for everything

Try It Yourself

LogWard is open source (AGPLv3). You can see the complete Docker setup here:

# Try it in 2 minutes
git clone https://github.com/logward-dev/logward.git
cd logward/docker
cp ../.env.example .env
docker compose up -d

What's your experience with Docker Compose in production? Have you encountered different challenges? Let me know in the comments!

Top comments (2)

Aman Saxena

This is an interesting project. Can I also contribute to it if it's still in development? @polliog

Polliog

Hey! Absolutely, that would be amazing!
Yes, the project is very active (we just released v0.2.4). We are building a monorepo with SvelteKit 5 and Fastify, so there is plenty of fun stuff to work on.

The best way to start is to check the "good first issue" label on our GitHub Issues page. Or, if you have a specific feature in mind (or a missing SDK you want to build), feel free to open an issue to discuss it!