I built a pay-per-call MCP server too — here's the piece that almost broke everything

When kirothebot dropped the breakdown of what the agent payment stack actually looks like, it landed because it's a problem almost no one has documented honestly. Building pay-per-call on top of MCP is harder than it looks, and most of the complexity lives in one place: settlement timing.

The problem with settling after the call

The obvious approach is: run the tool, check if payment cleared, return the result. That's backwards. Here's why.

If you settle after the call, you've already spent the compute. A non-paying agent can drain your resources and you have no recourse, because the cost is sunk even if you withhold the result. You can rate-limit after the fact, but by then you've done the work for free.

The correct sequence is: price check → authorization → tool execution → result delivery. The authorization step is what makes this different from a standard webhook with a Stripe call attached.

Authorization in this context means the calling agent or its orchestrator has confirmed: (a) it has credit for this call type, (b) the credit is being reserved before execution, and (c) the tool will receive settlement confirmation as part of the return flow.

That's not how HTTP requests work out of the box. You need a layer that lives between the MCP protocol and your tool handler.
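As a rough illustration of that layer, here is a minimal sketch of the price check → authorization → tool execution → result delivery sequence. Every name in it (`Ledger`, `InMemoryLedger`, `withPayment`, `PaidResult`) is hypothetical and invented for this example; none of it comes from the MCP SDK or from MnemoPay. It also assumes a fixed per-tool price and an in-memory balance, which a real deployment would replace with a database.

```typescript
// Hypothetical types for illustration only; not part of any real SDK.
type ToolHandler<A, R> = (args: A) => Promise<R>;

interface Ledger {
  // Reserve `amount` credits; returns a reservation id, or null if insufficient.
  reserve(agentId: string, amount: number): string | null;
  // Turn a reservation into a final charge.
  settle(reservationId: string): void;
  // Hand reserved credits back (for example, when the tool throws).
  release(reservationId: string): void;
}

class InMemoryLedger implements Ledger {
  private balances = new Map<string, number>();
  private reservations = new Map<string, { agentId: string; amount: number }>();
  private nextId = 0;

  constructor(initial: Record<string, number>) {
    for (const [id, bal] of Object.entries(initial)) this.balances.set(id, bal);
  }

  balance(agentId: string): number {
    return this.balances.get(agentId) ?? 0;
  }

  reserve(agentId: string, amount: number): string | null {
    const bal = this.balance(agentId);
    if (bal < amount) return null;
    // Credits leave the spendable balance before any work happens.
    this.balances.set(agentId, bal - amount);
    const id = `res-${this.nextId++}`;
    this.reservations.set(id, { agentId, amount });
    return id;
  }

  settle(reservationId: string): void {
    this.reservations.delete(reservationId);
  }

  release(reservationId: string): void {
    const r = this.reservations.get(reservationId);
    if (!r) return;
    this.balances.set(r.agentId, this.balance(r.agentId) + r.amount);
    this.reservations.delete(reservationId);
  }
}

type PaidResult<R> =
  | { ok: true; result: R; settlement: { reservationId: string; charged: number } }
  | { ok: false; error: "insufficient_credit" };

// Wrap a tool handler so that credit is reserved before the handler runs.
function withPayment<A, R>(ledger: Ledger, price: number, handler: ToolHandler<A, R>) {
  return async (agentId: string, args: A): Promise<PaidResult<R>> => {
    // 1. Price check: the price is fixed at wrap time (an assumption).
    // 2. Authorization: reserve the credit, or refuse without doing any work.
    const reservationId = ledger.reserve(agentId, price);
    if (reservationId === null) return { ok: false, error: "insufficient_credit" };
    try {
      // 3. Tool execution, only after the reservation succeeded.
      const result = await handler(args);
      // 4. Result delivery, with settlement confirmation attached.
      ledger.settle(reservationId);
      return { ok: true, result, settlement: { reservationId, charged: price } };
    } catch (err) {
      // The tool failed, so the reserved credits go back to the agent.
      ledger.release(reservationId);
      throw err;
    }
  };
}
```

The design choice worth noting is that a failed reservation returns before the handler is ever invoked, so an agent without credit costs you a balance lookup rather than a tool run.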

The credit-reservation problem at agent scale

Here's the complication that doesn't show up until you have multiple concurrent agents: credit reservation under contention.

If ten agents draw on a shared balance of 5 credits and they all hit your MCP server simultaneously, naive implementations let all ten through, because at the moment each request lands, the balance appears to cover it. You end up with ten executions and five payments.

This is a race condition in the authorization layer, not in your tool logic. The fix is optimistic locking on credit state, which is standard database concurrency control but needs to be built into the payment middleware explicitly.
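To make the optimistic-locking idea concrete, here is a small sketch using a version counter and a compare-and-set write. The `CreditStore` class is a hypothetical in-memory stand-in for a database row with a version column, and the scenario (ten requests against one account holding 5 credits, each call priced at 1 credit) is an assumption chosen for the example.

```typescript
// Hypothetical stand-in for a database row: a balance plus a version column.
interface CreditRow {
  balance: number;
  version: number;
}

class CreditStore {
  private rows = new Map<string, CreditRow>();

  constructor(initial: Record<string, number>) {
    for (const [id, balance] of Object.entries(initial)) {
      this.rows.set(id, { balance, version: 0 });
    }
  }

  // Return a copy so callers hold a snapshot, as they would after a SELECT.
  read(accountId: string): CreditRow {
    const row = this.rows.get(accountId);
    if (!row) throw new Error(`unknown account: ${accountId}`);
    return { ...row };
  }

  // Comparable in spirit to:
  //   UPDATE credits SET balance = :newBalance, version = version + 1
  //   WHERE id = :id AND version = :expectedVersion
  // Returns false when another writer committed first.
  compareAndSet(accountId: string, expectedVersion: number, newBalance: number): boolean {
    const row = this.rows.get(accountId);
    if (!row || row.version !== expectedVersion) return false;
    this.rows.set(accountId, { balance: newBalance, version: row.version + 1 });
    return true;
  }
}

// Try to reserve `price` credits, starting from a possibly stale snapshot.
function reserveCredit(
  store: CreditStore,
  accountId: string,
  price: number,
  snapshot?: CreditRow,
  maxRetries = 10,
): boolean {
  let row = snapshot ?? store.read(accountId);
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    if (row.balance < price) return false; // genuinely out of credit
    if (store.compareAndSet(accountId, row.version, row.balance - price)) return true;
    row = store.read(accountId); // lost the race: re-read and try again
  }
  return false; // give up under sustained contention
}

// Ten requests all observe the same state before any of them writes.
const store = new CreditStore({ "shared-account": 5 });
const snapshots = Array.from({ length: 10 }, () => store.read("shared-account"));
const granted = snapshots.filter((snap) =>
  reserveCredit(store, "shared-account", 1, snap),
).length;

console.log(granted, store.read("shared-account").balance); // prints: 5 0
```

All ten requests start from a snapshot showing 5 credits, but each write only succeeds if the version is unchanged since the read. Losers re-read, see the reduced balance, and the last five are refused instead of executing for free.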

What MnemoPay does with this

I built MnemoPay to solve exactly this stack: authorization, per-call settlement, and credit reservation with proper concurrency handling. The integration wraps your MCP tool handler, handles the authorization check before execution, and returns settlement confirmation with the result.

v1.0.0-beta.1 has 672 tests. It's npm-native and already listed on Smithery and ClawHub. The SDK exposes an Agent FICO score (300-850) so the calling side can see its own credit standing and route around expensive tools when budget is constrained.

The piece kirothebot built manually, the payment stack, is what we've packaged as a drop-in. If you're building more MCP servers and don't want to rebuild billing from scratch each time, it's worth looking at: https://mnemopay.com

DEV Community
