What if cURL let you easily run concurrent requests and benchmark your
endpoints? What if that same executable exposed tools to your coding
agents (Claude, Cursor) via MCP, streamed live performance updates, and
auto-exported benchmark runs to JSON or SQL?
Development speed is increasing rapidly, and testing and benchmarking are
becoming even more crucial for validating AI-generated and AI-assisted
code. Benchmarkr lets you easily orchestrate performance testing of your
API endpoints, whether you're catching regressions, sanity-checking a
refactor, or letting an agent validate the code it just wrote.
Why another HTTP tool?
There's no shortage of load-test tools — hey, ab, wrk, bombardier,
k6. They're great at what they do, but they live in a pre-agent world:
- No live feedback — you wait 30 seconds, then read a wall of text.
- No structured persistence — you pipe to a file, grep, repeat.
- No agent integration — your coding agent can't call them without shelling out and parsing free-form output.
Benchmarkr is a single Go binary that does three things well, with little configuration required out of the box:
- Runs concurrent HTTP benchmarks with live metrics in the terminal.
- Stores runs as JSON files or in Postgres or MySQL — configured once, reused forever.
- Ships an MCP server so Claude Code, Cursor, and any other MCP-compatible agent can benchmark your endpoints by name.
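Because runs land as structured JSON, they're easy to gate on in CI. As a sketch, here's how you might enforce a P95 budget with jq against a stored run — note the field names (`latency`, `p95_ms`) are my guess at a schema, not benchmarkr's actual output format:

```shell
# Hypothetical run export; this schema is illustrative, not benchmarkr's actual output.
cat > run.json <<'EOF'
{"endpoint":"/api/users","requests":1500,"latency":{"p50_ms":42,"p95_ms":180}}
EOF

# Fail the pipeline if P95 exceeds a 200 ms budget.
p95=$(jq '.latency.p95_ms' run.json)
if [ "$p95" -gt 200 ]; then
  echo "FAIL: p95 ${p95}ms over budget"
  exit 1
fi
echo "OK: p95 ${p95}ms"
```

The point is that a machine-readable artifact beats grepping terminal output: the same file a CI job asserts on is the one an agent can read.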
Install
# macOS / Linux
brew tap mack-overflow/tap
brew install benchmarkr
# Debian / Ubuntu
echo "deb [trusted=yes] https://apt.fury.io/mack-overflow/ /" \
| sudo tee /etc/apt/sources.list.d/benchmarkr.list
sudo apt update && sudo apt install benchmarkr
# RHEL / Fedora
sudo tee /etc/yum.repos.d/benchmarkr.repo <<EOF
[benchmarkr]
name=Benchmarkr
baseurl=https://yum.fury.io/mack-overflow/
enabled=1
gpgcheck=0
EOF
sudo yum install benchmarkr
Or grab a binary from the releases page if you'd
rather skip a package manager.
Your first benchmark
The smallest useful command:
benchmarkr run --url https://api.example.com/health
That fires a single worker for 10 seconds at a GET endpoint. You'll see requests, errors, and P50/P95 update live, then a final summary with throughput, latency percentiles, status-code breakdown, response sizes, and cache hit/miss counts.
Add concurrency and duration:
benchmarkr run \
--url https://api.example.com/users \
--concurrency 50 \
--duration 30
POST with headers and a body:
benchmarkr run \
--url https://api.example.com/users \
--method POST \
--header "Authorization: Bearer tok_xxx" \
--header "Content-Type: application/json" \
--body '{"name":"test"}'
Rate-limit so you don't accidentally DDoS staging:
benchmarkr run \
--url https://api.example.com/search \
--concurrency 5 \
--duration 20 \
--rate-limit 100 # max 100 req/s
Bypass the CDN cache to measure origin latency:
benchmarkr run \
--url https://cdn.example.com/asset.js \
--cache-mode bypass \
--duration 10
The MCP server — letting agents benchmark for you
This is the part I'm most excited about. Install the MCP companion binary:
brew install mack-overflow/tap/benchmarkr-mcp
Then wire it into your agent. For Claude Code, drop a .mcp.json in
your project root:
{
"mcpServers": {
"benchmarkr": {
"command": "benchmarkr-mcp"
}
}
}
And a short CLAUDE.md so the agent prefers it over reinventing the wheel:
# Benchmarking
Use the benchmarkr MCP tools (run_benchmark, get_benchmark_status,
stop_benchmark, list_endpoints) for all API benchmarking tasks.
Do not install or use external tools like hey, ab, or bombardier.
Cursor uses the same config under ~/.cursor/mcp.json, plus a .cursorrules file.
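For reference, a minimal ~/.cursor/mcp.json would mirror the Claude Code config shown earlier (this assumes benchmarkr-mcp is on your PATH):

```json
{
  "mcpServers": {
    "benchmarkr": {
      "command": "benchmarkr-mcp"
    }
  }
}
```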
Now you can say things like:
"Benchmark the /api/users endpoint at 20 concurrent workers for 30
seconds and tell me if the P95 is above 200ms."
…and the agent calls the run_benchmark tool directly, reads the
structured result, and answers in the context of your actual code. No more
"let me write a shell script for you" dance.
Where this is heading
A few things on the roadmap I'm looking for feedback on:
- compare_endpoints — run two URLs side-by-side and diff the metrics.
- regression_test — assert against a previous run's P95/P99 and return a pass/fail the agent can reason about.
- Scenario files — YAML-defined multi-step flows (login → fetch → post) instead of single-URL runs.
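To make the scenario-file idea concrete, here's a sketch of what one might look like — every key name here is hypothetical, since the format isn't designed yet:

```yaml
# Hypothetical scenario file: step and key names are illustrative only.
name: user-flow
steps:
  - name: login
    method: POST
    url: https://api.example.com/login
    body: '{"user":"test","pass":"secret"}'
  - name: fetch-profile
    method: GET
    url: https://api.example.com/me
  - name: create-post
    method: POST
    url: https://api.example.com/posts
    body: '{"title":"hello"}'
```

If you have opinions on what such a format should support (per-step assertions, variable capture between steps), I'd love to hear them.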