The Numbers Don't Lie: FastCGI Still Outperforms at Scale
After pushing 10,000 requests per second through three different PHP server configurations for 30 minutes straight, the results are clear: nginx with FastCGI remains the lowest-latency option for high-throughput reverse proxy workloads in 2026. PHP-FPM behind nginx comes within a few percent, while Caddy sacrifices roughly 30% throughput for operational simplicity—a tradeoff that makes sense below 5K req/sec but becomes costly beyond that.
The Verdict: Performance vs. Simplicity
If your workload is latency-sensitive and steady-state, FastCGI via nginx still delivers the best performance. The protocol’s minimal overhead and per-request efficiency matter more than most developers realize at scale. PHP-FPM behind nginx is functionally equivalent until you exceed 8,000 req/sec, at which point pool tuning becomes critical. Caddy shines for smaller sites, where automatic TLS, simpler configs, and reduced operational overhead justify the throughput penalty. Above 5K req/sec, however, the abstraction layers in Caddy—handler chaining, dynamic reloads, certificate rotation—compound into a measurable 25-35% performance gap.
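The article doesn't publish its pool settings, but the kind of tuning that matters past 8,000 req/sec looks roughly like this sketch; the socket path matches the Caddy config later in the piece, while the sizing numbers are assumptions for a 16-thread box, not the benchmark's actual pool:

```ini
; Illustrative PHP-FPM pool tuning for sustained high throughput.
; Values are assumptions, not the benchmark's exact configuration.
[www]
listen = /run/php/php8.4-fpm.sock

; A static pool avoids fork/reap churn under steady load; size it to
; (available RAM - overhead) / per-worker RSS.
pm = static
pm.max_children = 64

; Recycle workers periodically to bound memory growth.
pm.max_requests = 10000

; Expose the status page so listen-queue backlog is visible under load.
pm.status_path = /fpm-status
```

The key point is the `pm = static` choice: at steady state, dynamic process management adds scaling latency exactly when the pool is busiest.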
Test Setup: A Real-World Benchmark
We ran these tests on a Hetzner AX-52 server (AMD Ryzen 7 7700, 64GB DDR5, NVMe RAID-1) running Debian 13 with kernel 6.12 LTS. The application was a Laravel 12 service resolving product queries with a 50ms simulated database call, ensuring realistic performance characteristics. PHP 8.4.3 was configured with OPcache, JIT, and preloaded Composer autoloading.
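The OPcache/JIT setup mentioned above would typically look like the following php.ini sketch; the exact values and the preload script path are assumptions, not taken from the article:

```ini
; Common production-style OPcache/JIT settings (illustrative values).
opcache.enable = 1
opcache.memory_consumption = 256
opcache.validate_timestamps = 0            ; code is immutable during the run
opcache.jit = tracing
opcache.jit_buffer_size = 128M
opcache.preload = /var/www/app/preload.php ; hypothetical preload script path
opcache.preload_user = www-data
```

Disabling timestamp validation removes a stat call per include, which is measurable at these request rates.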
For the frontends:
- nginx + FastCGI: A minimal nginx config pointing to a custom Go FastCGI bridge.
- nginx + PHP-FPM: Standard nginx fronting PHP-FPM with both static and dynamic process managers.
- Caddy + PHP-FPM: The built-in php_fastcgi handler with automatic HTTPS (pre-staged certs for benchmarking).

Load was generated from sibling Hetzner boxes over the internal 10Gbit network to avoid NIC bottlenecks. Each test ran for 30 minutes after a 60-second warmup, with results aggregated from three runs.
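The article doesn't name its load generator; a run of this shape can be reproduced with a stock tool such as wrk. The commands below are a sketch, and the endpoint path is hypothetical:

```shell
# Hypothetical reproduction of one run; tool and endpoint are assumptions.
# 60-second warmup, results discarded:
wrk -t16 -c512 -d60s --latency https://bench.example.com/products/42 > /dev/null

# 30-minute measured run: 16 threads, 512 connections, latency histogram on:
wrk -t16 -c512 -d30m --latency https://bench.example.com/products/42
```

Aggregating three such runs, as the article does, smooths out scheduler and cache noise between runs.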
Configuration Highlights
The nginx FastCGI setup was stripped to essentials:
```nginx
worker_processes auto;
worker_rlimit_nofile 65535;

events {
    worker_connections 16384;
    use epoll;
    multi_accept on;
}

http {
    sendfile on;
    tcp_nopush on;
    tcp_nodelay on;
    keepalive_requests 10000;

    upstream fcgi_backend {
        server unix:/run/fcgi.sock;
        keepalive 256;
    }

    server {
        listen 443 ssl http2 reuseport;

        location / {
            fastcgi_pass fcgi_backend;
            fastcgi_keep_conn on;
            fastcgi_buffering off;
        }
    }
}
```
Caddy’s config, by contrast, is famously concise:
```caddyfile
bench.example.com {
    root * /var/www/app/public

    php_fastcgi unix//run/php/php8.4-fpm.sock {
        try_files {path} {path}/index.php =404
    }

    encode zstd gzip
    file_server
}
```
Key Takeaways
FastCGI’s wire format remains efficient at scale, avoiding HTTP framing overhead. PHP-FPM’s performance is nearly identical when tuned properly, but Caddy’s abstraction costs become significant above 5K req/sec. The tradeoff isn’t just about raw speed—it’s about operational complexity versus throughput.
Read the full article at novvista.com for the complete analysis with additional examples and benchmarks.
Originally published at NovVista