Rodrigo Vieira

I built a pure-PHP HTTP server that handles 1,000,000+ requests/second — no C extension

Every PHP developer has heard some version of it: "PHP doesn't do concurrency." The mental model is shared-nothing, one request per process, boot the framework, die, repeat. For anything that needs real throughput, the standard advice is Nginx + PHP-FPM — or, if you want true async, reach for Swoole, a C extension you compile into PHP.

I've spent a while building Bootgly, a base PHP framework, and its HTTP server runs as a long-running, event-driven process written in pure PHP — no C extension, no Go sidecar, no third-party runtime in its core.

On the TechEmpower /plaintext route it peaks at 1,076,709 requests per second, ahead of Swoole, and roughly 150× a Laravel + PHP-FPM stack.

This post covers the result, how it actually works under the hood, the honest caveats, and a single docker run you can paste to reproduce it.

The result first

These are peak req/s per route, each server measured at its own best worker count. Methodology is at the bottom — and it's the same for every framework, which is the only way a benchmark like this means anything.

| Route (req/s) | Bootgly | Swoole 6.2.0 (base) | Hyperf | ReactPHP | AMPHP | Laravel (Octane) |
|---|---|---|---|---|---|---|
| /plaintext | 1,076,709 | 964,908 | 358,576 | 267,158 | 99,093 | 11,482 |
| /json | 1,068,765 | 979,082 | 347,233 | 269,292 | 99,244 | 11,413 |
| /db (single query) | 88,304 | 95,718 | 75,883 | 43,190 | 29,008 | 8,094 |
| /query (20×) | 20,341 | 17,263 | 15,800 | 924 | 1,890 | 2,326 |
| /fortunes | 73,640 | 98,557 | 75,650 | 42,550 | 14,954 | 7,695 |
| /updates (20×) | 5,974 | 3,721 | 3,499 | 1,086 | 809 | 321 |

Figure: throughput (req/s) vs server-workers as the worker count sweeps from 1 to 24, for Bootgly (blue) and Swoole base mode (orange), across the six TechEmpower routes.

I'm not going to pretend Bootgly wins everything, but that single-number table actually undersells what's going on, so here's the honest version. Those figures are each server at its own best worker count; the curves above tell the real story.

Across the full sweep, Bootgly leads Swoole on /plaintext, /json, /query and /updates at every worker count; at the peaks in the table, the margins on /query and /updates are about 18% and 60%. The two database-render routes are a scaling story: Bootgly is ahead for most of the range, and Swoole only overtakes near full saturation. The curves cross around 21 workers on /db (Swoole ends just ~8% up at 24) and around 18 on /fortunes, where Swoole's coroutine scheduler pulls clearly ahead at high worker counts.

Against the other frameworks in the table, Bootgly leads on every route except /fortunes, where Hyperf's peak (75,650) edges out Bootgly's (73,640).
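If you want to check the peak-to-peak margins yourself, the arithmetic is small enough to script. This is a sketch that only uses the Bootgly and Swoole columns of the table; the `margin()` helper and the `$peaks` array are names made up for this example, and it says nothing about the shape of the curves between 1 and 24 workers.

```php
<?php
declare(strict_types=1);

// Peak req/s copied from the table above: [Bootgly, Swoole 6.2.0 (base)].
$peaks = [
    '/plaintext' => [1_076_709, 964_908],
    '/json'      => [1_068_765, 979_082],
    '/db'        => [88_304, 95_718],
    '/query'     => [20_341, 17_263],
    '/fortunes'  => [73_640, 98_557],
    '/updates'   => [5_974, 3_721],
];

// Relative margin of Bootgly over Swoole in percent (negative = Swoole ahead).
function margin(int $bootgly, int $swoole): float
{
    return ($bootgly / $swoole - 1.0) * 100.0;
}

foreach ($peaks as $route => [$bootgly, $swoole]) {
    printf("%-11s %+6.1f%%\n", $route, margin($bootgly, $swoole));
}
```

A positive number means Bootgly's peak is higher, a negative one means Swoole's is. Because each figure is a per-server peak, the two sides of a row are not necessarily measured at the same worker count.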

The headline that matters to me isn't "beats Swoole." Swoole is excellent and battle-tested. The headline is that a server in the same throughput class as a C extension is sitting here in plain PHP you can open and read.

What you actually write

Here is a complete HTTP server with a route and a 404 fallback:

use Bootgly\WPI\Nodes\HTTP_Server_CLI\Request;
use Bootgly\WPI\Nodes\HTTP_Server_CLI\Response;
use Bootgly\WPI\Nodes\HTTP_Server_CLI\Router;

return static function (Request $Request, Response $Response, Router $Router): Generator
{
   yield $Router->route('/', function (Request $Request, Response $Response) {
      return $Response(body: 'Hello World!');
   }, GET);

   // Catch-all 404
   yield $Router->route('/*', fn (Request $Request, Response $Response) =>
      $Response(code: 404, body: 'Not Found')
   );
};

Start it with the Bootgly CLI:

bootgly project Demo/HTTP_Server_CLI start

Now, an important clarification, because "a server in a dozen lines" would be a lie: that handful of lines is your application, not the server. The server underneath it is a complete piece of software — HTTP/1.1 framing, a router with compile-time parameter constraints, a middleware pipeline, sessions, authentication and authorization, request validation, streaming multipart uploads, and an async PostgreSQL DBAL + ORM — all built in. What's small is the surface you touch. What runs behind it is not small at all.

That's the whole point of the next section.

How it works under the hood

The trick isn't a trick. It's just architecture that PHP has been able to support for years, assembled deliberately.

It's a long-running process, not FPM

A normal PHP request lives and dies in milliseconds. Bootgly's server is a CLI process that boots once and stays resident. There's no per-request bootstrap, no re-parsing, no re-wiring the container on every hit. Opcache + JIT then compile the hot paths and keep them compiled (that alone is worth ~50% on this workload). This is the same insight Swoole, RoadRunner, and Laravel Octane are built on — Bootgly just does it without leaving PHP.
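To make the boot-once idea concrete, here is a toy contrast. It is not Bootgly's code: `App::boot()` is a hypothetical stand-in for an expensive bootstrap, and the counter exists only so the difference can be observed.

```php
<?php
declare(strict_types=1);

// Hypothetical stand-in for an expensive bootstrap (config, routes, container).
final class App
{
    public static int $boots = 0;

    /** @return array<string, string> route table */
    public static function boot(): array
    {
        self::$boots++;
        return ['/' => 'Hello World!'];
    }
}

// FPM-style: every request pays for the bootstrap again.
function handlePerRequest(string $path): string
{
    $routes = App::boot();
    return $routes[$path] ?? 'Not Found';
}

// Resident style: bootstrap once, then reuse the state for every request.
function bootResident(): Closure
{
    $routes = App::boot();
    return static fn (string $path): string => $routes[$path] ?? 'Not Found';
}

foreach (['/', '/', '/missing'] as $path) {
    handlePerRequest($path);
}
$perRequestBoots = App::$boots; // one bootstrap per request

$handle = bootResident();
foreach (['/', '/', '/missing'] as $path) {
    $handle($path);
}
$residentBoots = App::$boots - $perRequestBoots; // a single bootstrap in total
```

The flip side of staying resident is that state now survives between requests, so anything request-specific has to be reset or scoped explicitly.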

Multi-process workers, via pcntl and posix

On boot, a master process forks N worker processes with pcntl_fork() and distributes accepted connections across them, so all your cores are actually used. Each worker is a real OS process with its own memory — a crash in one doesn't take down the others. The master handles UNIX signals properly (SIGTERM, SIGHUP, SIGCHLD, SIGUSR1…) for graceful shutdown and reloads, and can drop privileges to www-data after binding a privileged port using posix_setuid() / posix_setgid(). In the benchmark above, the sweet spot was 24 workers on 24 logical CPUs.
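The fork-and-reap pattern described here can be sketched in a few lines. This is a minimal illustration assuming the pcntl extension on a Unix-like CLI, not Bootgly's actual master process: `runWorkers()` is a hypothetical helper, the workers do no socket work, and signal handling and the privilege drop are left out.

```php
<?php
declare(strict_types=1);

// Fork $count workers; each runs $work($index) and exits with its return value.
// Returns a map of worker index => exit status, as collected by the master.
function runWorkers(int $count, callable $work): array
{
    $pids = [];
    for ($i = 0; $i < $count; $i++) {
        $pid = pcntl_fork();
        if ($pid === -1) {
            throw new RuntimeException('pcntl_fork() failed');
        }
        if ($pid === 0) {
            // Child: separate address space from here on; never return to the caller.
            exit($work($i));
        }
        $pids[$pid] = $i; // master remembers which PID is which worker
    }

    // Master: reap every child so none is left behind as a zombie.
    $statuses = [];
    while ($pids !== []) {
        $pid = pcntl_wait($status);
        if ($pid <= 0) {
            break;
        }
        if (isset($pids[$pid]) && pcntl_wifexited($status)) {
            $statuses[$pids[$pid]] = pcntl_wexitstatus($status);
        }
        unset($pids[$pid]);
    }
    ksort($statuses);
    return $statuses;
}

$statuses = runWorkers(3, static fn (int $index): int => $index + 10);
```

In a real server the children would each enter an event loop instead of exiting, and the master would typically react to SIGCHLD to restart a worker that died.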

An event loop made of stream_select

Inside each worker is a classic non-blocking event loop. The listen socket comes from stream_socket_server(), every connection is flipped to non-blocking with stream_set_blocking($socket, false), and the worker waits on readiness with stream_select() — PHP's built-in select() wrapper. No libuv, no ext-event, no ext-swoole. Sockets get tuned at the syscall level (TCP_NODELAY, SO_KEEPALIVE), and TLS is handled with stream_socket_enable_crypto(). It is, genuinely, the kind of event loop you'd write in C — expressed in PHP's standard stream functions.
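Here is roughly what such a loop looks like when reduced to its skeleton. Treat it as a sketch under assumptions rather than Bootgly's implementation: `tick()` is a hypothetical helper, the response is hard-coded, there is no keep-alive or write queue, and the same-process client at the bottom exists only so the example runs end to end.

```php
<?php
declare(strict_types=1);

// One pass of the loop: wait for readable sockets, accept new connections, and
// answer any connection whose header block is complete. Returns responses sent.
function tick($listener, array &$clients, array &$buffers, int $timeoutUs = 200_000): int
{
    $read = [$listener, ...array_values($clients)];
    $write = $except = null;
    $ready = stream_select($read, $write, $except, 0, $timeoutUs);
    if ($ready === false || $ready === 0) {
        return 0; // interrupted or timed out
    }

    $served = 0;
    foreach ($read as $socket) {
        if ($socket === $listener) {
            $connection = stream_socket_accept($listener, 0.0);
            if ($connection !== false) {
                stream_set_blocking($connection, false);
                $clients[get_resource_id($connection)] = $connection;
                $buffers[get_resource_id($connection)] = '';
            }
            continue;
        }

        $id = get_resource_id($socket);
        $chunk = fread($socket, 8192);
        if ($chunk === false || ($chunk === '' && feof($socket))) {
            fclose($socket); // peer went away
            unset($clients[$id], $buffers[$id]);
            continue;
        }
        $buffers[$id] .= $chunk;
        if (str_contains($buffers[$id], "\r\n\r\n")) {
            $body = 'Hello World!';
            // A production loop would queue this and handle partial writes.
            fwrite($socket, "HTTP/1.1 200 OK\r\nContent-Length: " . strlen($body)
                . "\r\nConnection: close\r\n\r\n" . $body);
            fclose($socket);
            unset($clients[$id], $buffers[$id]);
            $served++;
        }
    }
    return $served;
}

// Listen on an ephemeral loopback port so the sketch cannot clash with anything.
$listener = stream_socket_server('tcp://127.0.0.1:0', $errorCode, $errorMessage);
if ($listener === false) {
    throw new RuntimeException("listen failed: $errorMessage");
}
stream_set_blocking($listener, false);

// Same-process client, only so the sketch can be run end to end.
$address = stream_socket_get_name($listener, false);
$client = stream_socket_client("tcp://$address", $errorCode, $errorMessage, 1.0);
stream_set_timeout($client, 2);
fwrite($client, "GET / HTTP/1.1\r\nHost: localhost\r\n\r\n");

$clients = $buffers = [];
for ($i = 0, $served = 0; $i < 20 && $served === 0; $i++) {
    $served += tick($listener, $clients, $buffers);
}
$response = stream_get_contents($client);
fclose($client);
fclose($listener);
```

The first pass accepts the connection and a later pass reads the request and replies, which is the whole readiness-driven idea: nothing in `tick()` ever waits on a single socket.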

HTTP is layered on top of TCP

This is the part I'm most fond of architecturally. Bootgly follows a strict layered design it calls I2P (Interface-to-Platform), and the HTTP server is quite literally a TCP server with HTTP bolted on through inheritance:

class HTTP_Server_CLI extends TCP_Server_CLI implements HTTP, Server

So the request lifecycle is clean and one-directional:

raw bytes
   → TCP_Server_CLI        (accept, non-blocking read/write, backpressure)
   → HTTP decoders         (request-line + header framing, chunked, multipart)
   → Request               (typed, per-connection, no superglobals)
   → Router → middleware → your handler
   → Response → HTTP encoders
   → TCP_Server_CLI        (non-blocking, backpressure-aware write)
   → wire

The TCP layer knows nothing about HTTP. The HTTP layer reuses every bit of the socket machinery below it. The same TCP foundation also powers a UDP server and the framework's HTTP/TCP/UDP clients — write the hard part once, reuse it everywhere.
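The layering can be illustrated without any sockets at all. In this sketch `TcpLayer` and `HttpLayer` are hypothetical names, not Bootgly's classes, and the "HTTP" here is only header framing plus a request-line check; the point is just that the base class moves bytes and the subclass alone gives them meaning.

```php
<?php
declare(strict_types=1);

// Hypothetical transport layer: buffers bytes per connection, knows nothing about HTTP.
abstract class TcpLayer
{
    /** @var array<int, string> unread bytes per connection id */
    protected array $buffers = [];

    // Called by the event loop whenever a connection delivers bytes.
    // Returns the bytes to write back, or null when more input is needed.
    public function receive(int $connection, string $bytes): ?string
    {
        $this->buffers[$connection] = ($this->buffers[$connection] ?? '') . $bytes;
        $reply = $this->decode($this->buffers[$connection]);
        if ($reply !== null) {
            unset($this->buffers[$connection]); // message consumed
        }
        return $reply;
    }

    // The protocol layer plugs in here.
    abstract protected function decode(string $buffer): ?string;
}

// Hypothetical protocol layer: adds HTTP/1.1 framing on top and nothing else.
final class HttpLayer extends TcpLayer
{
    protected function decode(string $buffer): ?string
    {
        if (!str_contains($buffer, "\r\n\r\n")) {
            return null; // header block incomplete
        }
        $requestLine = (string) strstr($buffer, "\r\n", true);
        $parts = explode(' ', $requestLine, 3);
        if (count($parts) !== 3) {
            return self::response(400, 'Bad Request');
        }
        [$method, $target] = $parts;

        return $method === 'GET' && $target === '/'
            ? self::response(200, 'OK', 'Hello World!')
            : self::response(404, 'Not Found');
    }

    private static function response(int $code, string $reason, ?string $body = null): string
    {
        $body ??= $reason;
        return "HTTP/1.1 $code $reason\r\nContent-Length: " . strlen($body) . "\r\n\r\n" . $body;
    }
}

$server = new HttpLayer();
$partial = $server->receive(1, "GET / HT");                      // not enough bytes yet
$complete = $server->receive(1, "TP/1.1\r\nHost: x\r\n\r\n");    // framing complete
$missing = $server->receive(2, "GET /nope HTTP/1.1\r\n\r\n");
```

Swapping `HttpLayer` for another subclass would give a different protocol on the same byte-handling base, which is the reuse the paragraph above is describing.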

Fibers for the work that would otherwise block

An event loop only stays fast if nothing blocks it. The classic PHP problem is the database call: in a synchronous worker, one slow query freezes everything that worker is serving. Bootgly uses PHP 8.1 Fibers to make that cooperative. A handler can defer work, and when it hits I/O — say, the native non-blocking PostgreSQL client talking to the connection pool — it yields control back to the loop instead of stalling the worker. The loop serves other connections and resumes the fiber when the socket is ready. That's why /query (20 sequential DB fetches) and /updates come out ahead: the concurrency is real, not faked by a thread pool.
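The cooperative hand-off is easier to see in a toy. This is a sketch of the Fiber mechanism only, under the assumption that each `Fiber::suspend()` stands in for "waiting on a database socket"; the scheduler below simply resumes fibers in turn, whereas a real loop would resume one only when its socket is ready. None of the names come from Bootgly.

```php
<?php
declare(strict_types=1);

$log = [];

// Hypothetical handler: every Fiber::suspend() marks a point where real code
// would be waiting for a query result.
$handler = static function (string $name, int $queries) use (&$log): void {
    for ($i = 1; $i <= $queries; $i++) {
        $log[] = "$name: query $i sent";
        Fiber::suspend(); // hand control back to the loop
        $log[] = "$name: query $i done";
    }
};

// Start two requests; each runs until its first suspension.
$fibers = [new Fiber($handler), new Fiber($handler)];
$fibers[0]->start('A', 2);
$fibers[1]->start('B', 1);

// Toy scheduler: round-robin over whatever is still suspended.
while ($fibers !== []) {
    foreach ($fibers as $key => $fiber) {
        if ($fiber->isTerminated()) {
            unset($fibers[$key]);
        } elseif ($fiber->isSuspended()) {
            $fiber->resume();
        }
    }
}

echo implode("\n", $log), "\n";
```

Request B finishes while request A is still between its first and second query, all inside one process and one thread.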

Zero third-party dependencies

Everything above uses only functions that ship with PHP itself: the stream_*, socket_*, pcntl_*, and posix_* families. There are no Composer packages in the framework core. The router, the decoders, the session handler, the test framework, the DBAL: all native. Your vendor/ directory for a Bootgly app's core is, effectively, empty.

Why pure PHP at all?

"Just use Swoole" is a fair question, so here's the honest answer.

Deployment. A C extension means a compile step, a PECL install, a matching ABI, and a CI pipeline that has to carry it. Bootgly's server runs anywhere PHP 8.4 runs — git clone, and go. No extension to build, nothing to keep in sync with your PHP minor version.

Supply chain. No third-party runtime in the core means a dramatically smaller attack surface and nothing to audit but the framework itself. After the last few years of dependency-chain incidents across every ecosystem, "you can read the entire stack" is a feature.

Understanding. This is the personal one. My motto, stolen from Feynman, is "What I cannot create, I do not understand." When the HTTP server, the TCP layer, the event loop, and the decoders are all PHP I wrote, there is no black box between my application and the socket. If something is slow or wrong, I can read straight down to the stream_select() call. That's worth a lot when you're debugging production at 2 a.m.

The honest caveats

If you take one thing from a benchmark post, take the caveats — they're how you tell whether to trust the numbers.

  • Bootgly is beta. The 1.0 release is close, but the public API is still being finalized. Don't ship it to production yet; pin a version and expect changes.
  • PHP 8.4+, Linux. It leans on modern PHP (Fibers, property hooks, asymmetric visibility) and is developed/tested on Debian-based Linux. Windows works under Docker.
  • Swoole wins the DB-render routes at saturation. At peak worker counts Swoole takes /db (barely — the curves cross around 21 workers) and /fortunes (clearly, above ~18 workers). Below those, Bootgly leads both. A mature C coroutine scheduler earns the high-saturation crown.
  • /updates at extreme worker counts is not fully characterized. Bootgly wins it up to roughly 24 to 32 workers (which covers the config above); at very high counts, concurrent batched UPDATEs can deadlock in PostgreSQL itself. That high-concurrency behavior still needs validation on real many-core hardware before I'd call it settled.
  • Synthetic routes are synthetic. TechEmpower routes are a fair cross-framework comparison, not your app. Always benchmark your real workload.

Reproduce it yourself

This is the part that makes it real. Every cross-framework opponent ships as a self-contained Docker image that bundles Bootgly + the opponent + PostgreSQL (which boots and seeds itself), so the whole benchmark is one command, zero host setup:

docker run --rm bootgly/bootgly_benchmarks:swoole \
   test benchmark HTTP_Server_CLI --opponents=bootgly,swoole-base --loads=techempower:*

Swap the image tag for workerman, reactphp, amphp, roadrunner, hyperf, or laravel-octane to run any other opponent.

Methodology: 24 logical CPUs (WSL2 / Ryzen 9 3900X), PHP 8.4.22, 514 connections, 10 s per route, DB_POOL_MAX=1 for every framework (identical per-worker DB footprint). Worker count was swept 1→24; figures are each framework's peak. Full comparison — features, performance, and the server's built-in security posture — lives on the Bootgly vs Swoole, Hyperf, ReactPHP, AMPHP & Laravel page, and the raw runs are in bootgly_benchmarks.

Where this is going

Bootgly is heading toward a 1.0 with the API frozen and the docs filled out. The HTTP server, the async DBAL + ORM, sessions, auth, and the middleware stack are already in — the work now is stabilization, not new foundations.

If a million requests per second in plain PHP is the kind of thing that makes you curious enough to read the source, the repo is here: github.com/bootgly/bootgly. A ⭐ helps more people find it, and I'm genuinely keen to hear where the design holds up and where it doesn't — especially from people who've shipped Swoole, RoadRunner, or Workerman in anger.

What would you stress-test first?
