Last week I was reading through sentry-mcp issue #844 and watched a guy describe exactly the pain I keep running into. He had Cursor running parallel automation against the Sentry MCP, saturated the 60-request-per-minute bucket in seconds, and got back a 429 with no Retry-After header. His agent just sat there. No backoff hint, no escape path, nothing to do but fail the run and ask a human to babysit it.
That same week awslabs/mcp #2949 popped up, where the MCP handshake itself was failing because tools/list tripped a 429 on the second call. And GLips/Figma-Context-MCP #258 had folks convinced the MCP was broken when really their parallel calls were just blowing through Figma's per-token limit on shared credentials.
Same shape every time. The server says "no" in a way the agent cannot use.
Why 429 is the wrong answer for agents
The 429 status suits humans behind browsers. Retry-After: 60 works fine if a person can read a banner that says "try again in a minute." It does not work when you have an autonomous agent that needs to decide right now whether to wait, retry, escalate, or pay.
Most MCP servers do not even send Retry-After. The agent gets a 429 body, maybe some JSON, and zero machine-readable information about what would let it succeed. So it does the dumb thing. It retries immediately. Or worse, it gives up and the whole tool chain breaks.
There is no payment path. There is no proof-of-work path. There is no "I will do something to earn the right to call you" path. Just a closed door.
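For contrast, here is a rough sketch of the best a client can do with a bare 429: honor Retry-After when it is there, and otherwise guess with capped exponential backoff and jitter. The function name and the `base` and `cap` defaults are made up for illustration, not taken from any of the servers mentioned above.

```python
import random
from typing import Mapping, Optional


def retry_delay(headers: Mapping[str, str], attempt: int,
                base: float = 1.0, cap: float = 60.0,
                rng: Optional[random.Random] = None) -> float:
    """Return how many seconds to wait before retrying after a 429."""
    rng = rng or random.Random()
    # Header names are case-insensitive, so normalize before the lookup.
    lowered = {k.lower(): v for k, v in headers.items()}
    value = lowered.get("retry-after", "").strip()
    if value.isdigit():
        # The server gave a delay in seconds. (The HTTP-date form of
        # Retry-After is not handled in this sketch.)
        return float(value)
    # No usable hint: fall back to exponential backoff with full jitter.
    # This is a guess, which is exactly the problem described above.
    return rng.uniform(0, min(cap, base * 2 ** attempt))
```

Even the good branch only tells the agent when to try again, not what it could do to get through sooner.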
What 402 looks like on the wire
HTTP 402 Payment Required has been sitting in the spec since 1997 waiting for someone to use it. With agents, it finally has a real job.
A useful 402 response gives the caller a challenge it can solve programmatically. Two flavors:
HTTP/1.1 402 Payment Required
Content-Type: application/json
WWW-Authenticate: PowChallenge realm="api", id="abc123", salt="...", difficulty=14

{
  "type": "pow",
  "id": "abc123",
  "salt": "9f3c...",
  "difficulty": 14,
  "signature": "..."
}
Or for paid access:
HTTP/1.1 402 Payment Required
WWW-Authenticate: L402 macaroon="...", invoice="lnbc30n1p..."
Both are deterministic. The agent reads the challenge, does the work (CPU cycles or a Lightning payment), submits the answer, and gets a token. No human in the loop. No guessing at backoff intervals. The server told the agent exactly what to do.
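Reading the challenge is the easy part. Here is a minimal sketch of pulling the scheme and parameters out of a WWW-Authenticate value shaped like the two examples above. The `parse_challenge` name is hypothetical, and the parser assumes a single challenge per header with no escaped quotes inside values.

```python
import re
from typing import Dict, Tuple

# key=value pairs where the value is either a quoted string or a bare token.
_PARAM = re.compile(r'(\w+)\s*=\s*(?:"([^"]*)"|([^\s,]+))')


def parse_challenge(header: str) -> Tuple[str, Dict[str, str]]:
    """Split a WWW-Authenticate value into (scheme, params)."""
    scheme, _, rest = header.strip().partition(" ")
    params: Dict[str, str] = {}
    for match in _PARAM.finditer(rest):
        key, quoted, bare = match.groups()
        # Exactly one of the two value groups matched; keep that one.
        params[key] = quoted if quoted is not None else bare
    return scheme, params


scheme, params = parse_challenge(
    'PowChallenge realm="api", id="abc123", salt="9f3c", difficulty=14'
)
# scheme is "PowChallenge"; params["difficulty"] is the string "14"
```

All values come back as strings, so the caller still has to convert `difficulty` to an integer before solving.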
Wrapping an existing MCP server
@powforge/captcha-mcp is one way to do this without writing the crypto yourself. It exposes three tools: challenge, verify, and status. The package wraps captcha.powforge.dev as the backend so you do not have to host the puzzle service.
Drop it into your Claude Code or Cursor config:
{
  "mcpServers": {
    "powforge-captcha": {
      "command": "npx",
      "args": ["-y", "@powforge/captcha-mcp"]
    }
  }
}
Then on your own backend, when a caller hits a rate-limited endpoint, return a 402 pointing at the verify path. After the agent solves the puzzle and gets a token, your backend checks it:
curl -X POST https://captcha.powforge.dev/api/token/verify \
-H "Content-Type: application/json" \
-d '{"token":"<token-from-verify-tool>"}'
Token good, request goes through. Token bad or expired, you 402 them again with a fresh challenge.
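On the backend, that decision can be sketched in a few lines of Python. `build_verify_request` builds the same POST as the curl call above without sending it. `gate` is a hypothetical helper that takes an `is_valid` callback, because the shape of the verify response is not shown here and I do not want to guess at it.

```python
import json
import urllib.request
from typing import Callable, Optional, Tuple

# Endpoint taken from the curl call above.
VERIFY_URL = "https://captcha.powforge.dev/api/token/verify"


def build_verify_request(token: str) -> urllib.request.Request:
    """Build the same POST the curl command sends (nothing is sent here)."""
    body = json.dumps({"token": token}).encode("utf-8")
    return urllib.request.Request(
        VERIFY_URL, data=body, method="POST",
        headers={"Content-Type": "application/json"},
    )


def gate(token: Optional[str], is_valid: Callable[[str], bool]) -> Tuple[int, str]:
    """Return (status, reason) for a call to a protected endpoint."""
    if not token:
        return 402, "no token: issue a fresh challenge"
    # is_valid is supplied by the caller. How it interprets the verify
    # response is an assumption left to whoever wires this up.
    if is_valid(token):
        return 200, "token accepted"
    return 402, "token bad or expired: issue a fresh challenge"
```

In a real handler, `is_valid` would send the request with `urllib.request.urlopen` and interpret whatever the service returns. Keeping it injectable also makes the gate easy to test without the network.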
What the agent sees
From the agent's point of view the loop is short:
- Call the tool. Get back a 402 with a challenge.
- Call the challenge tool to get a fresh puzzle (or use the one from the 402 directly).
- Burn 5 to 10 seconds of CPU finding a nonce that produces a SHA-256 hash with 14 leading zero bits.
- Call verify with the nonce. Get back a 5-minute HMAC-signed access token.
- Retry the original call with the token. Get the real response.
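The solve step in the loop above can be sketched with nothing but hashlib. One loud assumption: this hashes the salt concatenated with a decimal nonce, which is a common layout but not necessarily what the real service expects, so check the package docs for the exact input format. Function names are illustrative.

```python
import hashlib
from itertools import count


def leading_zero_bits(digest: bytes) -> int:
    """Count the zero bits at the start of a digest."""
    bits = 0
    for byte in digest:
        if byte == 0:
            bits += 8
            continue
        # bit_length() is the position of the highest set bit (1..8),
        # so 8 - bit_length() is the number of zero bits above it.
        bits += 8 - byte.bit_length()
        break
    return bits


def solve(salt: str, difficulty: int) -> int:
    """Brute-force the smallest nonce whose hash meets the difficulty."""
    for nonce in count():
        # Assumed input layout: salt followed by the nonce in decimal.
        digest = hashlib.sha256(f"{salt}{nonce}".encode()).digest()
        if leading_zero_bits(digest) >= difficulty:
            return nonce


nonce = solve("9f3c", 14)  # pass this nonce to the verify tool
```

Each attempt succeeds with probability 2^-14, so the loop needs about 16,384 hashes on average. Wall time depends on the runtime and on how the service actually defines its puzzle.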
The agent never had to ask a human. It never had to guess at a retry interval. It paid for access in CPU cycles and got through.
PoW or Lightning
Pick based on who is calling. PoW (free tier, SHA-256, around 5 to 10 seconds of CPU at 14 leading zero bits) works great for sporadic agents, exploration runs, and free-tier users. The cost is real but small, and it scales with how much the caller wants the resource.
L402 over Lightning (paid tier, 3 sats per call by default) makes more sense for high-volume callers who would rather pay cash than burn CPU. Most agent operators will happily drop a few sats to skip the puzzle.
You can offer both from the same endpoint. The 402 response tells the agent what is available, and the agent picks based on its own constraints.
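What "picks based on its own constraints" might look like on the agent side is sketched below, under some assumptions. It assumes the 402 body carries a list of options. The `"type": "pow"` and `difficulty` fields mirror the example response earlier, while the `"l402"` type value, the `price_sats` field, and the list wrapper are hypothetical.

```python
from typing import Dict, List, Optional


def pick_option(options: List[Dict], sats_budget: int,
                max_difficulty: int) -> Optional[Dict]:
    """Pick a challenge the caller can afford, preferring to pay."""
    # Paid options the budget covers; a missing price counts as unaffordable.
    affordable = [o for o in options
                  if o.get("type") == "l402"
                  and o.get("price_sats", float("inf")) <= sats_budget]
    if affordable:
        return min(affordable, key=lambda o: o["price_sats"])
    # Otherwise fall back to the easiest puzzle within the CPU tolerance.
    solvable = [o for o in options
                if o.get("type") == "pow"
                and o.get("difficulty", float("inf")) <= max_difficulty]
    if solvable:
        return min(solvable, key=lambda o: o["difficulty"])
    return None  # nothing fits: escalate to a human or fail the run


options = [{"type": "pow", "difficulty": 14}, {"type": "l402", "price_sats": 3}]
choice = pick_option(options, sats_budget=10, max_difficulty=20)
```

Preferring payment is just one policy. An agent with no wallet would pass `sats_budget=0` and always land on the puzzle.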
Try it
npx -y @powforge/captcha-mcp
Package and docs: https://www.npmjs.com/package/@powforge/captcha-mcp
If you maintain an MCP server that is currently returning 429s, swap the response for a 402 with a real challenge. Your agent callers will thank you by actually completing their runs instead of stalling on a 429 they cannot act on.