DEV Community

Kenneth Mckrola


Easily benchmark all your app's endpoints at once

Most "load tests" in real codebases are a curl pasted into a Slack thread. Someone runs it before a release, eyeballs the latency, and we ship. There's nothing version-controlled, nothing repeatable, and the next person to touch the service has no idea which endpoints are actually fast paths.

benchmarkr is a CLI (and MCP) tool that fixes exactly that part of the workflow. The piece I want to talk about in this post is the one that makes it click: a YAML config that lives in your repo and describes every endpoint you care about, the same way a package.json describes your dependencies.

The config

First, install the benchmarkr CLI if you haven't:

brew tap mack-overflow/tap
brew install benchmarkr

for Homebrew, or on Debian:

echo "deb [trusted=yes] https://apt.fury.io/mack-overflow/ /" \
  | sudo tee /etc/apt/sources.list.d/benchmarkr.list
sudo apt update
sudo apt install benchmarkr

(More installation guides are available in the project's docs.)

Next, run benchmarkr endpoints init in your project root and you get a benchmarkr.yaml you can commit:

version: 1

endpoints:
  - name: list-users
    method: GET
    url: ${API_BASE:-http://localhost:8080}/users
    headers:
      Authorization: Bearer ${API_TOKEN}
    defaults:
      concurrency: 10
      duration_seconds: 30

  - name: search-users
    method: GET
    url: ${API_BASE}/users/search
    params:
      q: "test"
      limit: "50"
    defaults:
      concurrency: 5
      duration_seconds: 15

  - name: create-order
    method: POST
    url: ${API_BASE}/orders
    headers:
      Authorization: Bearer ${API_TOKEN}
      Content-Type: application/json
    body:
      sku: "ABC-123"
      quantity: 1
    defaults:
      concurrency: 2
      duration_seconds: 10

A few things to notice:

  • Env var substitution. ${API_BASE} and ${API_BASE:-default} work the way they do in shell. A sibling .env file is auto-loaded but never overrides what's already in the environment, so the same file works on a laptop, in CI, and in staging.
  • Defaults travel with the endpoint. create-order runs at concurrency 2 for 10 seconds because that's what makes sense for a write path. list-users runs at concurrency 10. You set this once in the file you already review.
  • Discovery walks up from CWD. Run the CLI from any subdirectory and it finds the file, like git does.
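I haven't read benchmarkr's actual source, but the substitution rules described above are easy to pin down in a few lines of Python. This is a sketch of the semantics only; expand and load_env are illustrative names, not part of the tool:

```python
import re

def expand(value: str, env: dict) -> str:
    """Expand ${VAR} and ${VAR:-default} the way a POSIX shell would."""
    def repl(match):
        name, default = match.group(1), match.group(2)
        if env.get(name):  # set and non-empty: use the environment value
            return env[name]
        return default if default is not None else ""
    return re.sub(r"\$\{(\w+)(?::-([^}]*))?\}", repl, value)

def load_env(dotenv: dict, real_env: dict) -> dict:
    """Merge a .env file under the real environment: existing vars always win."""
    merged = dict(dotenv)
    merged.update(real_env)  # the real environment overrides .env values
    return merged

env = load_env({"API_BASE": "http://localhost:8080"}, {"API_TOKEN": "t0ken"})
expand("${API_BASE:-http://fallback}/users", env)   # → "http://localhost:8080/users"
expand("${MISSING:-http://fallback}/users", env)    # → "http://fallback/users"
```

The point of the load_env ordering is the portability claim above: the .env file only fills gaps, so exporting API_BASE in CI silently beats whatever a stray .env says.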

CLI endpoints list output

Running one endpoint

benchmarkr run -e list-users

That's it. Saved defaults apply. Any flag you pass on the command line wins; headers and params are merged. So when you're poking at production specifically, you can do:

benchmarkr run -e list-users \
  --header "X-Trace: debug-2026-04-28" \
  --concurrency 50

…without editing the committed file.
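To be concrete about that precedence, here's a sketch of the merge semantics as described (not benchmarkr's real code; effective_config is a made-up name): scalars from the CLI win outright, dict-valued fields like headers and params merge key-by-key.

```python
def effective_config(saved: dict, cli: dict) -> dict:
    """Merge an endpoint's saved defaults with CLI flags.

    Scalar fields (concurrency, duration) from the CLI replace the saved
    value; dict fields (headers, params) merge, CLI entries winning on
    conflicting keys.
    """
    merged = dict(saved)
    for key, value in cli.items():
        if isinstance(value, dict) and isinstance(saved.get(key), dict):
            merged[key] = {**saved[key], **value}
        else:
            merged[key] = value
    return merged

saved = {"concurrency": 10, "headers": {"Authorization": "Bearer secret"}}
cli = {"concurrency": 50, "headers": {"X-Trace": "debug-2026-04-28"}}
effective_config(saved, cli)
# → {'concurrency': 50,
#    'headers': {'Authorization': 'Bearer secret', 'X-Trace': 'debug-2026-04-28'}}
```

So the one-off production poke above keeps the committed Authorization header while adding its trace header and cranking concurrency, and nothing in the repo changes.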

Running all of them

This is where the YAML pays for itself. Because every endpoint is named and self-describing, you can hand the entire file to the CLI in one shot:

benchmarkr run --all

That runs every endpoint in benchmarkr.yaml in succession, applying each endpoint's saved defaults (concurrency, duration, headers, body, the whole config). Before each endpoint you get a [i/N] <name> header so it's obvious where you are; live p50/p95/p99 stats stream in for the active endpoint, and a final summary prints when the sweep finishes. --all is mutually exclusive with --url and --endpoint, and any flags you do pass (e.g. --store, --json, --rate-limit) apply to every run in the sweep.

For CI, this collapses the workflow step to one line:

# .github/workflows/perf.yml
- name: Benchmark every endpoint
  env:
    API_BASE: https://api.staging.example.com
    API_TOKEN: ${{ secrets.STAGING_API_TOKEN }}
    BENCH_CLOUD_TOKEN: ${{ secrets.BENCHMARKR_TOKEN }}
  run: benchmarkr run --all --store --json > perf-results.json

--json with --all emits an array — one entry per endpoint, with the same result shape as a single run — so you can pipe it straight into a regression check or upload it as a CI artifact:

[
  {
    "name": "list-users",
    "stop_reason": "completed",
    "duration": "30.001s",
    "stored": true,
    "result": { "requests": 12483, "p50_ms": 4, "p95_ms": 12, "p99_ms": 23, "errors_total": 0 }
  },
  {
    "name": "search-users",
    "stop_reason": "completed",
    "duration": "15.002s",
    "stored": true,
    "result": { "requests": 4127, "p50_ms": 18, "p95_ms": 47, "p99_ms": 92, "errors_total": 0 }
  },
  {
    "name": "create-order",
    "stop_reason": "completed",
    "duration": "10.001s",
    "stored": true,
    "result": { "requests": 312, "p50_ms": 41, "p95_ms": 88, "p99_ms": 121, "errors_total": 0 }
  }
]
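Here's one way that regression check could look. The budget numbers and the check function are mine, not anything benchmarkr ships; the only thing taken from the tool is the JSON shape above.

```python
import json
import sys

# Per-endpoint p95 budgets in ms. Illustrative numbers, not tool defaults.
BUDGETS_MS = {"list-users": 20, "search-users": 60, "create-order": 100}

def check(results: list) -> list:
    """Return human-readable failures; an empty list means the gate passes."""
    failures = []
    for entry in results:
        budget = BUDGETS_MS.get(entry["name"])
        if budget is None:
            continue  # no budget declared for this endpoint yet
        stats = entry["result"]
        if stats["p95_ms"] > budget:
            failures.append(f"{entry['name']}: p95 {stats['p95_ms']}ms > budget {budget}ms")
        if stats["errors_total"] > 0:
            failures.append(f"{entry['name']}: {stats['errors_total']} errors")
    return failures

# Usage in CI, after `benchmarkr run --all --json > perf-results.json`:
#   failures = check(json.load(open("perf-results.json")))
#   for line in failures:
#       print("FAIL", line)
#   sys.exit(1 if failures else 0)
```

Endpoints without a declared budget pass silently here, which matches the "add the endpoint first, tighten the budget later" workflow; you could just as easily make an unknown name a hard failure.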

You're not maintaining a separate list of "endpoints to benchmark" in your CI workflow and a list in your config. There's one list. Add a new endpoint to benchmarkr.yaml in the same PR that adds the route, and the next CI run picks it up automatically — no workflow edits, no shell loop to babysit.

Round-tripping with the cloud dashboard

The CLI gives you fast feedback. The dashboard gives you the long view — historical p95 charts, regression detection across versions, the kind of thing that's painful to wire up yourself.

The newest piece is import/export, so the YAML in your repo and the endpoints in the dashboard stay in sync without anyone having to maintain both:

  • Export from the dashboard. Open any endpoint and click Export for YAML or JSON. Or click Export all in the endpoints nav to dump every endpoint to one file you can drop into a fresh repo.

  • Import to the dashboard. Click Import, pick a benchmarkr.yaml, and endpoints upsert by (user, name). If the config changed, a new version is recorded — so you get a history of how each endpoint's load shape evolved.
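That upsert-and-version behavior is easy to picture as data. A minimal sketch of what I understand the dashboard to be doing, with invented names (EndpointRecord, upsert) and no claim about its real storage:

```python
from dataclasses import dataclass, field

@dataclass
class EndpointRecord:
    """Illustrative model: endpoints are keyed by (user, name)."""
    user: str
    name: str
    versions: list = field(default_factory=list)  # config history, oldest first

def upsert(store: dict, user: str, name: str, config: dict) -> EndpointRecord:
    record = store.setdefault((user, name), EndpointRecord(user, name))
    # Only record a new version when the config actually changed.
    if not record.versions or record.versions[-1] != config:
        record.versions.append(config)
    return record

store = {}
upsert(store, "kenneth", "list-users", {"concurrency": 10})
upsert(store, "kenneth", "list-users", {"concurrency": 10})  # re-import, unchanged: no-op
upsert(store, "kenneth", "list-users", {"concurrency": 25})  # changed: new version
len(store[("kenneth", "list-users")].versions)  # → 2
```

Re-importing an unchanged benchmarkr.yaml is a no-op, so you can import on every release without polluting the history; only real config changes show up as new versions.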

Import in Benchmarkr UI Nav

Export as YAML in UI

endpoint history

A workflow I've been using:

  1. Define endpoints in benchmarkr.yaml, commit them.
  2. CI runs benchmarkr run --all on every PR with --store and the cloud token, persisting results to the dashboard.
  3. Open the endpoint in the dashboard to see the trend line for that endpoint across the last N PRs.
  4. If somebody adds an endpoint via the dashboard UI for ad-hoc poking, Export → drop the file into the repo → it's now part of the CI matrix.

A note on the cloud dashboard

The cloud platform is currently in closed beta. We're planning to open it up to the public on a per-token basis in spring 2026 — if you'd like access at launch, you can join the waitlist.

The CLI itself is open source and works without the cloud — benchmarkr run, the YAML config, and even local result persistence don't require an account or a token. The dashboard, history charts, version pinning, and import/export are the parts gated behind beta access for now.

Why this is worth doing

The shift that matters isn't "run benchmarks in CI" — plenty of tools do that. It's having a single, reviewable file that says here are this service's endpoints and how we expect them to behave under load, sitting next to the code in the same PR.

Once that file exists:

  • New endpoints get a perf budget at the same moment they get a route handler.
  • Reviewers can see in the diff that a new write path is being benchmarked at concurrency 2, not 100, and push back if that's wrong.
  • CI gets a free regression signal across every endpoint, not just the one someone remembered to add to a script.
  • The dashboard gives you the historical view without anyone manually re-entering endpoints.

The repo already describes your API. This is just letting it benchmark itself.


benchmarkr is open source: brew install mack-overflow/tap/benchmarkr, or grab a release from the benchmarkr repo. Cloud dashboard beta access opens publicly per-token in spring 2026.
