What you'll learn:
- The complete traffic flow from user browser to container
- How Traefik handles TLS termination, routing, and zero-downtime updates
- Why services opt-in to exposure via Docker labels
- How CrowdSec adds a WAF and IP reputation layer to every request
- How Tailscale VPN secures admin access without opening SSH to the internet
The Threat Model
A public internet server is subject to continuous automated scanning and attack attempts. Within minutes of a new IP address becoming reachable:
- Automated scanners probe every common port
- Bots attempt SSH brute-force (hundreds of attempts per hour)
- Crawlers look for exposed admin interfaces (wp-admin, /actuator, .env, etc.)
- Malicious requests probe for SQL injection, XSS, and other common web vulnerabilities from the OWASP Top 10
The security architecture in this stack addresses each of these without requiring a separate security engineer. Here's the full picture:
Internet
│
▼
Cloudflare (DNS proxy)
│ Hides server IP, DDoS mitigation, HTTPS at edge
│
▼
Hetzner Firewall
│ Allows: 80, 443 only. Blocks everything else at network level.
│
▼
fail2ban (host)
│ Bans IPs after 3 failed SSH attempts (1h ban)
│
▼
Traefik (port 80 / 443)
│ TLS termination, HTTP→HTTPS redirect, real IP forwarding
│
▼
CrowdSec bouncer (Traefik plugin)
│ IP reputation check + AppSec WAF rules (SQLi, XSS, etc.)
│ Block decision: 60s default ban
│
▼
Application (bento, etc.)
Admin traffic takes a different path entirely — through Tailscale VPN, bypassing the public internet stack.
Traefik: The Entry Point for All Traffic
Traefik is the reverse proxy that sits in front of all applications. It handles:
- TLS certificate acquisition and renewal (Let's Encrypt, automated)
- HTTP to HTTPS redirection
- Routing requests to the correct backend service
- Running the CrowdSec bouncer plugin
Static Configuration
apps/traefik/traefik_static_conf.yaml defines the entrypoints, providers, and plugins that are loaded once at startup.
Entrypoints:
entryPoints:
  web:
    address: :80
    http:
      redirections:
        entryPoint:
          to: websecure
          scheme: https
    forwardedHeaders:
      trustedIPs: &trustedIps
        - 103.21.244.0/22
        - 104.16.0.0/13
        # ... all Cloudflare IP ranges
  websecure:
    address: :443
    forwardedHeaders:
      trustedIPs: *trustedIps
    transport:
      respondingTimeouts:
        readTimeout: 600s
        writeTimeout: 600s
  metrics:
    address: :8899
Port 80 (web) redirects all traffic to 443 and trusts Cloudflare's IP ranges for the X-Forwarded-For header. Without this trustedIPs configuration, Traefik would see Cloudflare's IP as the client IP — meaning CrowdSec would evaluate Cloudflare's infrastructure, not the actual user. By trusting Cloudflare's ranges, Traefik unwraps the X-Forwarded-For header to get the real client IP.
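The trusted-proxy idea can be sketched in a few lines of Python. This is not Traefik's code, only an illustration of the general rule: the X-Forwarded-For header is believed only when the direct peer is a trusted proxy. The function names and the two-entry range list are assumptions made for the example.

```python
from ipaddress import ip_address, ip_network
from typing import Optional

# Illustrative subset only; a real list holds every published Cloudflare range.
TRUSTED_PROXIES = [ip_network("103.21.244.0/22"), ip_network("104.16.0.0/13")]


def is_trusted(addr: str) -> bool:
    """True if addr falls inside one of the trusted proxy networks."""
    ip = ip_address(addr)
    return any(ip in net for net in TRUSTED_PROXIES)


def resolve_client_ip(peer_addr: str, x_forwarded_for: Optional[str]) -> str:
    """Pick the address to treat as the client (hypothetical helper).

    The header is honored only when the direct peer is a trusted proxy;
    otherwise any client could spoof it.
    """
    if not x_forwarded_for or not is_trusted(peer_addr):
        return peer_addr
    hops = [h.strip() for h in x_forwarded_for.split(",") if h.strip()]
    # Walk from the nearest hop outward and stop at the first untrusted one.
    for hop in reversed(hops):
        if not is_trusted(hop):
            return hop
    return peer_addr
```

A request arriving from 104.16.1.1 with `X-Forwarded-For: 203.0.113.7` resolves to 203.0.113.7, while the same header sent directly from an untrusted peer is ignored and the peer address is used instead.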
Port 443 (websecure) has 600-second timeouts to support long-running operations like PDF generation in the Bento app.
Port 8899 (metrics) exposes Prometheus metrics for Grafana Alloy to scrape. This port is not in the Hetzner firewall allow-list and is not accessible from the public internet — Alloy scrapes it from inside the overlay network.
Certificate resolvers:
certificatesResolvers:
  staging:
    acme:
      email: <YOUR_EMAIL>
      caServer: "https://acme-staging-v02.api.letsencrypt.org/directory"
      httpChallenge:
        entryPoint: web
  production:
    acme:
      email: <YOUR_EMAIL>
      caServer: "https://acme-v02.api.letsencrypt.org/directory"
      httpChallenge:
        entryPoint: web
Two resolvers exist: staging, which targets Let's Encrypt's staging environment (much higher rate limits, but certificates that browsers do not trust), and production (real certificates). Services specify which resolver to use in their labels. During initial setup, use staging to validate the configuration, then switch to production.
Providers:
providers:
  swarm:
    exposedByDefault: false
  docker:
    exposedByDefault: false
  file:
    directory: /etc/traefik
    watch: true
exposedByDefault: false means Traefik ignores all containers unless they have traefik.enable=true in their labels. A service added to Swarm without this label will not be exposed publicly. Every exposure is explicit and intentional.
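The opt-in rule amounts to a small decision function. The Python sketch below is only a model of that rule, not Traefik's implementation; `is_exposed` is a hypothetical name, and labels are assumed to arrive as a plain dictionary.

```python
from typing import Dict


def is_exposed(labels: Dict[str, str], exposed_by_default: bool = False) -> bool:
    """Model of the opt-in rule: an explicit traefik.enable label wins,
    otherwise the provider-wide default applies."""
    value = labels.get("traefik.enable")
    if value is None:
        return exposed_by_default
    return value.strip().lower() == "true"
```

With the default set to false, a service without the label is never exposed, which matches the behavior described above.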
CrowdSec plugin:
experimental:
  plugins:
    bouncer:
      moduleName: "github.com/maxlerebourg/crowdsec-bouncer-traefik-plugin"
      version: "v1.5.0"
The plugin is declared here in static config. Its configuration (which requests it applies to, which CrowdSec instance it talks to) is in the dynamic config.
Dynamic Configuration
apps/traefik/traefik_dynamic_conf.yaml defines middlewares and routes that Traefik watches for changes without restarting:
http:
  middlewares:
    auth:
      basicAuth:
        users:
          - <USERNAME>:<BCRYPT_HASH>
    crowdsec:
      plugin:
        bouncer:
          enabled: true
          crowdsecMode: live
          crowdsecAppsecEnabled: true
          crowdsecAppsecHost: crowdsec_crowdsec:7422
          crowdsecAppsecFailureBlock: true
          crowdsecLapiKeyFile: "/run/secrets/crowdsec_api_key"
          crowdsecLapiHost: crowdsec_crowdsec:8080
          forwardedHeadersTrustedIPs:
            - 10.0.0.0/8
            - 172.16.0.0/12
            - 192.168.0.0/16
          clientTrustedIPs:
            - 10.0.0.0/8
            - 172.16.0.0/12
            - 192.168.0.0/16
The crowdsec middleware is defined once here and referenced by any service that wants WAF protection. The auth middleware is used for any internal service (like the Traefik dashboard) that should be behind basic auth.
Zero-Downtime Updates
In the Traefik compose file, the update strategy is:
deploy:
  update_config:
    order: start-first
start-first means Docker Swarm starts the new Traefik container before stopping the old one. During the overlap window, the new container is running and healthy before the old one receives the stop signal. This means Traefik updates happen with no dropped requests.
Combined with SwarmCD's immutable config versioning (Part 3), every configuration change to Traefik is zero-downtime.
Service Exposure via Docker Labels
Here's how a service opts into public access. From apps/bento/bento.yaml:
services:
  bento:
    image: ghcr.io/alam00000/bentopdf-simple:v2.7.0
    networks:
      - swarm_network
    deploy:
      labels:
        - "traefik.enable=true"
        - "traefik.http.routers.bento-http.rule=Host(`pdf.yourdomain.com`)"
        - "traefik.http.routers.bento-http.entrypoints=web"
        - "traefik.http.routers.bento-http.middlewares=redirect-to-https@file"
        - "traefik.http.routers.bento.rule=Host(`pdf.yourdomain.com`)"
        - "traefik.http.routers.bento.entrypoints=websecure"
        - "traefik.http.routers.bento.tls.certresolver=production"
        - "traefik.http.routers.bento.middlewares=crowdsec@file"
        - "traefik.http.services.bento.loadbalancer.server.port=8080"
Breaking this down:
- traefik.enable=true: opts in to Traefik management
- Two routers: one for HTTP (redirect to HTTPS), one for HTTPS
- tls.certresolver=production: requests a production Let's Encrypt certificate for this hostname
- middlewares=crowdsec@file: all requests to this service pass through the CrowdSec bouncer
- server.port=8080: Traefik forwards to this container port
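Notice that in Swarm mode the labels go under deploy.labels, not under a labels key directly on the service. This is a Docker Swarm requirement: labels under deploy are attached to the Swarm service, which is what Traefik watches, whereas a top-level labels key only labels the individual containers.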
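To see how the flat label list maps onto two routers, here is a small Python sketch that groups the router labels by name. It only illustrates the naming scheme and is not how Traefik parses labels internally; `routers_from_labels` is a hypothetical helper.

```python
from collections import defaultdict
from typing import Dict, List

# The labels from bento.yaml, copied as plain strings.
LABELS = [
    "traefik.enable=true",
    "traefik.http.routers.bento-http.rule=Host(`pdf.yourdomain.com`)",
    "traefik.http.routers.bento-http.entrypoints=web",
    "traefik.http.routers.bento-http.middlewares=redirect-to-https@file",
    "traefik.http.routers.bento.rule=Host(`pdf.yourdomain.com`)",
    "traefik.http.routers.bento.entrypoints=websecure",
    "traefik.http.routers.bento.tls.certresolver=production",
    "traefik.http.routers.bento.middlewares=crowdsec@file",
    "traefik.http.services.bento.loadbalancer.server.port=8080",
]

ROUTER_PREFIX = "traefik.http.routers."


def routers_from_labels(labels: List[str]) -> Dict[str, Dict[str, str]]:
    """Group router labels as {router_name: {option: value}}."""
    routers: Dict[str, Dict[str, str]] = defaultdict(dict)
    for label in labels:
        key, _, value = label.partition("=")
        if not key.startswith(ROUTER_PREFIX):
            continue  # skip traefik.enable and the service definition
        name, _, option = key[len(ROUTER_PREFIX):].partition(".")
        routers[name][option] = value
    return dict(routers)
```

Calling `routers_from_labels(LABELS)` yields two entries, `bento-http` and `bento`, each with its own rule, entrypoint, and middleware.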
CrowdSec: WAF and IP Reputation
CrowdSec adds two protection layers to every request passing through Traefik:
LAPI (Local API) — IP Reputation:
CrowdSec maintains a local database of banned IP addresses. This database is populated from:
- The CrowdSec community threat intelligence feed (millions of crowdsourced malicious IPs)
- Local detections (if you run CrowdSec agents on the host)
When a request arrives, the bouncer plugin checks the source IP against the LAPI. If it's in the ban list, the request is blocked immediately with a 403.
AppSec — WAF Rules:
CrowdSec's AppSec component applies request inspection rules that block common attack patterns:
- SQL injection (e.g., ' OR 1=1 -- in query parameters)
- XSS (e.g., <script>alert(1)</script> in form fields)
- Path traversal (e.g., ../../../etc/passwd)
- Known CVE exploit patterns for common web frameworks
crowdsec:
  plugin:
    bouncer:
      crowdsecAppsecEnabled: true
      crowdsecAppsecHost: crowdsec_crowdsec:7422
      crowdsecAppsecFailureBlock: true  # Block if AppSec is unreachable
crowdsecAppsecFailureBlock: true means that if the AppSec engine is unavailable (container restart, etc.), requests are blocked rather than allowed through. This is a fail-closed posture — prefer availability loss over security bypass.
Internal traffic bypass:
clientTrustedIPs:
  - 10.0.0.0/8
  - 172.16.0.0/12
  - 192.168.0.0/16
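Requests from the RFC 1918 private address ranges, such as Docker's overlay network, bypass CrowdSec checks. Inter-service communication inside the cluster doesn't need to be WAF-inspected because it never crosses the public internet boundary. Note that Tailscale assigns node addresses from 100.64.0.0/10 (the carrier-grade NAT range), which is not RFC 1918, so traffic arriving from a Tailscale address is not exempted by this list.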
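Taken together, the trusted-range bypass, the LAPI lookup, and the fail-closed AppSec setting form a short decision chain. The Python sketch below models that chain under stated assumptions: it is not the plugin's code, the order of checks is assumed, and `decide` and `appsec_check` are hypothetical names standing in for the real LAPI and AppSec calls.

```python
from ipaddress import ip_address, ip_network
from typing import Callable, Set

CLIENT_TRUSTED = [
    ip_network(n) for n in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")
]


def decide(
    client_ip: str,
    banned_ips: Set[str],
    appsec_check: Callable[[], bool],
    failure_block: bool = True,
) -> int:
    """Return an HTTP status: 200 to forward the request, 403 to block it."""
    ip = ip_address(client_ip)
    if any(ip in net for net in CLIENT_TRUSTED):
        return 200  # trusted internal source: skip all checks
    if client_ip in banned_ips:
        return 403  # IP reputation hit
    try:
        allowed = appsec_check()  # hypothetical call to the WAF engine
    except ConnectionError:
        # Engine unreachable: fail closed or fail open depending on the flag.
        return 403 if failure_block else 200
    return 200 if allowed else 403
```

With `failure_block=True`, an unreachable engine produces a 403 for every external request, which is the availability cost of the fail-closed posture.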
Tailscale: Secure Admin Access
SSH is not exposed in the Hetzner firewall. All administrative access is routed through Tailscale VPN.
During cloud-init (Part 2), the server joins your Tailscale network:
tailscale up \
--ssh \
--accept-routes \
--advertise-exit-node \
--advertise-tags=tag:server \
--client-id=<TAILSCALE_CLIENT_ID> \
--client-secret=<TAILSCALE_CLIENT_SECRET>
--ssh enables Tailscale SSH, allowing SSH access to the server using Tailscale credentials. The Tailscale hostname (my-server.your-tailnet.ts.net) is stable even if the server IP changes.
From any device enrolled in the Tailscale network:
ssh admin@my-server.your-tailnet.ts.net
This eliminates the need for public SSH key management, firewall IP exceptions, or a self-managed VPN gateway. Tailscale handles NAT traversal automatically, establishing a peer-to-peer encrypted connection regardless of network topology.
SSH Hardening Recap
Even though Tailscale VPN is the primary admin path, SSH is still hardened as a defense-in-depth measure:
From server/hetzner.tfpl:
PasswordAuthentication no → SSH keys only, passwords rejected
MaxAuthTries 6 → Disconnect after 6 failed attempts
MaxSessions 3 → Limit concurrent sessions
X11Forwarding no → Disable graphical forwarding
ClientAliveInterval 300 → Probe idle clients every 5 min; unresponsive ones are dropped
LoginGraceTime 30 → Disconnect if auth not completed in 30s
And fail2ban:
bantime = 3600 → 1-hour bans
findtime = 600 → 10-minute window
maxretry = 3 → 3 failures triggers ban
mode = aggressive → Also catches scan patterns
Source IPs that fail authentication 3 times within a 10-minute window are banned for 1 hour. Combined with key-only authentication and SSH not being exposed to the public internet, the SSH attack surface is substantially reduced.
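The interaction between findtime, maxretry, and bantime is a sliding-window counter. Here is a minimal Python sketch of that logic using the values above. It is a simplified model rather than fail2ban's implementation, and the `BanTracker` class is a hypothetical name.

```python
from collections import defaultdict, deque
from typing import Deque, Dict

BANTIME = 3600   # seconds an IP stays banned
FINDTIME = 600   # sliding window for counting failures
MAXRETRY = 3     # failures inside the window that trigger a ban


class BanTracker:
    def __init__(self) -> None:
        self.failures: Dict[str, Deque[float]] = defaultdict(deque)
        self.banned_until: Dict[str, float] = {}

    def record_failure(self, ip: str, now: float) -> bool:
        """Register a failed login at time `now`; return True if it triggers a ban."""
        window = self.failures[ip]
        window.append(now)
        # Drop failures that have aged out of the window.
        while window and now - window[0] > FINDTIME:
            window.popleft()
        if len(window) >= MAXRETRY:
            self.banned_until[ip] = now + BANTIME
            window.clear()
            return True
        return False

    def is_banned(self, ip: str, now: float) -> bool:
        """True while the ban for `ip` has not yet expired."""
        return self.banned_until.get(ip, 0.0) > now
```

Three failures at 0 s, 100 s, and 200 s trigger a ban, whereas failures at 0 s, 500 s, and 700 s do not, because the first one has left the 10-minute window by the time the third arrives.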
Summary: Security in Layers
| Layer | What it protects against |
|---|---|
| Cloudflare DNS proxy | Hides server IP; DDoS mitigation at edge |
| Hetzner firewall | Blocks all non-HTTP/HTTPS traffic at network level |
| fail2ban | SSH brute-force banning |
| SSH key-only auth | Password-based SSH attacks |
| Tailscale VPN | Admin access without exposing SSH to internet |
| Traefik exposedByDefault: false | Accidental service exposure |
| CrowdSec LAPI | Known malicious IP blocking |
| CrowdSec AppSec | Application-layer attack filtering (SQLi, XSS, CVEs) |
| Docker secrets | Credentials as files, not environment variables |
| SOPS encryption | No plaintext secrets in Git |
Each layer is independent — a failure or bypass of any one layer still leaves others intact. This is defense in depth.
Repository: gitlab.com/sakonn/docker-swarm-gitops