The Problem
Every framework claims to be fast. Blog posts benchmark X vs Y with a single plaintext endpoint, one concurrency level, one metric. The results are interesting for five minutes, then a new version ships and everything changes.
The real question developers face is harder: which framework performs best for my workload? And nobody can answer that, because most benchmarks don't test real workloads.
What Is HttpArena?
HttpArena is an open-source benchmarking platform that tests HTTP frameworks across 16 different test profiles on dedicated, reproducible hardware. No cloud VMs. No noisy neighbors. Same machine, same load generator, same conditions for every framework.
The source is on GitHub and the results are live at mda2av.github.io/HttpArena.
Why 16 Test Profiles?
This is the core idea. A single "requests per second" number is almost meaningless without context. HttpArena tests frameworks across a range of realistic scenarios:
Connection Behavior
- Baseline at 512, 4K, 16K, and 32K concurrent connections — how does performance scale as you push connection counts higher?
- Pipelined — HTTP pipelining with 16 requests per connection
- Limited connections — connection reuse under constrained pools
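The pipelined profile is easy to reproduce in miniature. Here is a minimal sketch (not HttpArena's actual load generator) using only the Python standard library: start a local HTTP/1.1 server, write 16 requests into one connection before reading anything back, then count the responses that come out.

```python
import socket
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

class Handler(BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"  # keep-alive, required for pipelining

    def do_GET(self):
        body = b"ok"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):  # silence per-request logging
        pass

def pipelined_round(host, port, depth=16):
    """Send `depth` requests on one connection before reading any response."""
    request = f"GET / HTTP/1.1\r\nHost: {host}\r\n\r\n".encode()
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall(request * depth)  # all requests up front, no waiting
        buf = b""
        while buf.count(b"HTTP/1.1 200") < depth:
            chunk = sock.recv(65536)
            if not chunk:
                break
            buf += chunk
    return buf.count(b"HTTP/1.1 200")

if __name__ == "__main__":
    server = ThreadingHTTPServer(("127.0.0.1", 0), Handler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    print(pipelined_round("127.0.0.1", server.server_address[1]))  # → 16
    server.shutdown()
```

A real harness batches many such connections concurrently; the point of the sketch is only the shape of the traffic: requests go out back-to-back, and the server must drain them in order on a single kept-alive connection.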
Real Workloads
- JSON processing — parse a dataset, compute derived fields, serialize the response
- Compression — gzip a large payload on the fly
- Upload — handle incoming request bodies of varying sizes
- Database — SQLite queries under concurrent load
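To make the workload profiles concrete, here is a hedged sketch of the core logic such endpoints perform. The record shape, field names, and derived computation are illustrative, not HttpArena's actual endpoint spec, which lives in the repo.

```python
import gzip
import json

def process_json(raw: bytes) -> bytes:
    """Parse a list of records, compute a derived field, re-serialize.

    Illustrative only: the real dataset and derived fields are
    defined by the HttpArena endpoint spec.
    """
    records = json.loads(raw)
    for rec in records:
        # Example derived field: total = price * quantity
        rec["total"] = rec["price"] * rec["quantity"]
    return json.dumps(records).encode()

def compress(payload: bytes) -> bytes:
    """Gzip a payload on the fly; validation checks it actually shrinks."""
    return gzip.compress(payload, compresslevel=6)

payload = json.dumps([
    {"price": 2.5, "quantity": 4},
    {"price": 1.0, "quantity": 3},
]).encode()
result = json.loads(process_json(payload))
print(result[0]["total"])  # → 10.0
```

The point of these profiles is that the server does real CPU work per request (parsing, arithmetic, serialization, compression), so the benchmark measures more than raw socket throughput.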
Resilience
- Noisy — a mix of valid requests, bad methods, and nonexistent paths. Does the server stay stable?
- Mixed — all endpoint types hit concurrently. This is closest to real-world traffic.
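A noisy-traffic generator can be sketched in a few lines: each request is randomly a valid call, a rejected method, or a miss on a nonexistent route. The path and method names below are placeholders, not HttpArena's actual noisy profile.

```python
import random

VALID_PATHS = ["/json", "/upload"]     # illustrative endpoint names
BAD_METHODS = ["BREW", "TRACE"]        # methods the server should reject
MISSING_PATHS = ["/no-such-route"]     # should produce a 404

def noisy_request(rng: random.Random) -> bytes:
    """Pick a valid request, a bad method, or a missing path at random."""
    kind = rng.choice(["valid", "bad_method", "missing"])
    if kind == "valid":
        line = f"GET {rng.choice(VALID_PATHS)} HTTP/1.1"
    elif kind == "bad_method":
        line = f"{rng.choice(BAD_METHODS)} / HTTP/1.1"
    else:
        line = f"GET {rng.choice(MISSING_PATHS)} HTTP/1.1"
    return (line + "\r\nHost: test\r\n\r\n").encode()
```

Stability under this mix is the whole test: the server must keep answering valid requests correctly while rejecting the garbage, without crashing or leaking connections.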
Protocols
- HTTP/2 and HTTP/3 — for frameworks that support them
- Static file serving over H2
- gRPC — unary calls with and without TLS
- WebSocket — echo server performance
A framework that dominates at plaintext might fall apart under JSON serialization. One that handles 512 connections beautifully might choke at 32K. One that aces every individual test might have contention issues when all endpoints are hit simultaneously. You don't see any of this with single-profile benchmarks.
What Makes It Different
Reproducibility
Every framework runs in a Docker container on the same dedicated hardware. The Dockerfiles, source code, and test configurations are all in the repo. Anyone can clone it and reproduce the results.
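The Dockerfiles in the repo are the source of truth; as a purely hypothetical illustration of the multi-stage pattern, a Go framework entry might look like this (image names and paths are assumptions, not taken from the repo):

```dockerfile
# Build stage: compile with the full toolchain
FROM golang:1.22 AS build
WORKDIR /src
COPY . .
RUN CGO_ENABLED=0 go build -o /server .

# Runtime stage: minimal image containing only the binary
FROM gcr.io/distroless/static
COPY --from=build /server /server
EXPOSE 8080
ENTRYPOINT ["/server"]
```

Keeping the runtime image minimal means the benchmark measures the framework, not the weight of its build toolchain.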
Correctness First
Before any performance testing happens, every framework goes through an 18-point validation suite that checks:
- Correct arithmetic on query params and request bodies
- Anti-cheat with randomized inputs (no hardcoded responses)
- Proper HTTP status codes (404 for missing routes, 4xx for bad methods)
- Correct Content-Type headers
- Valid JSON processing with computed fields
- Gzip compression that actually compresses
- Resilience under malformed requests
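The anti-cheat idea can be sketched in a few lines: generate random operands, ask the handler for the result, and compare against an independently computed answer, so a hardcoded response cannot pass. The handler functions here are stand-ins, not HttpArena's actual validation suite.

```python
import random

def check_arithmetic(handler, trials=100, seed=1234):
    """Anti-cheat check: randomized inputs defeat hardcoded responses."""
    rng = random.Random(seed)
    for _ in range(trials):
        a, b = rng.randrange(1_000_000), rng.randrange(1_000_000)
        if handler(a, b) != a + b:  # independently computed expectation
            return False
    return True

def honest_handler(a, b):
    return a + b           # actually does the work

def cheating_handler(a, b):
    return 42              # hardcoded: would pass a fixed-input test

print(check_arithmetic(honest_handler), check_arithmetic(cheating_handler))
# → True False
```

The same principle extends to the other checks: randomized payloads for JSON processing, payloads whose compressed size is verified, and so on.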
If your framework doesn't pass validation, it doesn't get benchmarked. Performance numbers are useless if the server isn't doing the work correctly.
Apples to Apples
Every framework implements the same endpoints with the same behavior. The JSON endpoint processes the same dataset. The compression endpoint gzips the same payload. The database endpoint runs the same queries. The only variable is the framework itself.
Growing Framework List
We currently test 35+ frameworks across languages including Rust, Go, C, C++, Java, C#, JavaScript (Node, Bun, Deno), Python, Ruby, Lua, and more. New frameworks are being added regularly — recent additions include Crystal, Zig, Nim, Swift, and Gleam.
Add Your Framework
Adding a framework is straightforward:
- Create a Dockerfile — multi-stage build, minimal runtime image
- Implement the endpoints — /baseline1, /pipeline, /json, /compression, /upload, and optionally /db, /baseline2 (H2), /static/*
- Add a meta.json — declare which test profiles your framework subscribes to
- Open a PR — validation runs automatically in CI
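The exact meta.json schema is defined in the repo; as a purely hypothetical illustration of the idea (all field names here are assumptions), it might declare something like:

```json
{
  "name": "myframework",
  "language": "rust",
  "profiles": ["baseline", "pipeline", "json", "compression", "upload"]
}
```

Subscribing only to the profiles a framework actually supports keeps partial implementations honest: a framework without HTTP/3 support simply opts out of that profile rather than failing it.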
Look at any existing framework in the frameworks/ directory for a working example. The whole process takes about an hour if you know your framework well.
Who Is This For?
- Developers choosing a framework — see how candidates perform across diverse workloads, not just plaintext
- Framework authors — get multi-dimensional performance data and a standardized way to compare against the ecosystem
- Performance engineers — reproducible, open-source benchmarks you can run on your own hardware
- The curious — sometimes you just want to know how fast things can go
Check It Out
- Live results: mda2av.github.io/HttpArena
- Source & contribute: github.com/MDA2AV/HttpArena
We're building this in the open and actively welcoming contributions. If your favorite framework isn't represented yet, come add it. If you think our methodology could be better, open an issue. The goal is to give the community the most useful, honest benchmark data possible.