DEV Community

Kilo Spark

What Your AI Agent Actually Sends to APIs (And Why You Should Care)

You built an MCP tool that creates Stripe charges. You tested it. It works. A customer gets double-charged. Now what?

You open your logs. The LLM decided to call your create_charge tool twice. Or maybe once with weird parameters. Or maybe your tool called Stripe's API three times because of retry logic you forgot about. You don't actually know, because you never saw the raw HTTP requests that left your machine.

This is the reality of building AI agent tooling right now: you're shipping code where the most important part — what actually goes over the wire — is invisible to you.

The black box problem

When you write a normal REST integration, you control the inputs. You write the fetch call, you set the headers, you know exactly what's being sent because you typed it.

AI agents flip this. The LLM decides when to call your tool, what arguments to pass, and sometimes how many times to call it. Your MCP tool or function-calling handler takes those arguments and fires off HTTP requests to third-party APIs. The gap between "what the LLM decided" and "what bytes hit Stripe's servers" is where bugs live.

And it's not a small gap.

Consider a typical MCP tool that manages Shopify products:

import os

import httpx
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("shopify-tools")
TOKEN = os.environ["SHOPIFY_ACCESS_TOKEN"]

@mcp.tool()
async def update_product(product_id: str, title: str, price: float):
    async with httpx.AsyncClient() as client:
        resp = await client.put(
            f"https://your-store.myshopify.com/admin/api/2024-01/products/{product_id}.json",
            headers={"X-Shopify-Access-Token": TOKEN},
            json={"product": {"title": title, "variants": [{"price": str(price)}]}}
        )
    return resp.json()

Looks simple. But when an agent calls this, you can't see:

  • Did the agent pass a valid product_id, or did it hallucinate one?
  • What did the full request body actually look like after serialization?
  • Did Shopify return a 200, or a 422 that your tool swallowed?
  • How long did the request take? Was it retried?

You could add logging. You could add print statements. You could wire up OpenTelemetry. But now you're instrumenting every tool, and you still can't see the actual bytes on the wire — just what your code thinks it sent.

What you actually want

You want curl -v for your AI agent's API calls. Something that shows you the real request and response, in real time, without changing your tool's code.

The simplest approach I've found: swap a base URL and watch everything in your browser.

Instead of your tool hitting https://api.stripe.com, it hits https://abc123.toran.sh. Toran forwards each request to the real API, records everything, and streams it to a live dashboard in your browser.

No install. No signup. No SDK to integrate. You change a base URL, and suddenly you can see everything.

What this looks like in practice

Go to toran.sh/try, enter the upstream API you want to inspect (e.g. https://your-store.myshopify.com), and you get a unique toran URL like https://abc123.toran.sh. Now swap it in:

# Before
BASE = "https://your-store.myshopify.com"

# After — point at your toran URL instead
BASE = os.environ.get("SHOPIFY_BASE_URL", "https://your-store.myshopify.com")
# Set the env var to your toran URL
export SHOPIFY_BASE_URL=https://abc123.toran.sh

That's it. Now every request your MCP tool makes to Shopify flows through toran, and you see the full request and response live in your browser.

No code changes beyond the base URL. When you're done debugging, unset the env var and you're back to production mode.

Why this matters more for AI agents than regular code

In traditional software, a bug in an API call is usually deterministic. Same input, same output, same broken request. You can reproduce it.

AI agents are stochastic. The LLM might call your tool differently every time. The arguments might be slightly wrong in ways you'd never think to test. The failure mode might be "it worked, but it sent the wrong thing" — the hardest kind of bug.

Being able to watch the actual HTTP traffic in real time turns debugging from archaeology into observation. You see the problem as it happens, not after your user reports it.

Some things I've caught this way:

  • An agent passing amount: 1000 (cents) vs amount: 10.00 (dollars) to Stripe — both "work," one costs 100x more
  • Retry logic firing three times because the first response was slow, not failed
  • Auth tokens being sent to the wrong endpoint because the agent mixed up two similar tools
  • A tool silently falling back to a default value when the agent passed null
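Watching the traffic finds these bugs; a guardrail prevents some of them. As a hypothetical example for the cents-vs-dollars trap above, the tool's contract can be made unambiguous by accepting only an integer number of cents, so an agent passing `10.00` fails loudly instead of charging the wrong amount (`validate_amount_cents` is an illustrative helper, not part of any SDK):

```python
def validate_amount_cents(amount: object) -> int:
    # bool is a subclass of int, so reject it explicitly.
    if isinstance(amount, bool) or not isinstance(amount, int):
        raise TypeError(
            f"amount must be an int (cents), got {type(amount).__name__}: {amount!r}"
        )
    if amount <= 0:
        raise ValueError(f"amount must be a positive number of cents, got {amount}")
    return amount

print(validate_amount_cents(1000))  # 1000 cents = $10.00
```

The point is to remove the ambiguity the LLM can exploit, not to trust it to pick the right unit.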

Try it

If you're building MCP tools or any AI agent that talks to external APIs, go to toran.sh/try. Enter your upstream URL, get a toran URL, swap it in, and see what your agent is actually doing.

# 1. Go to toran.sh/try, enter: https://api.openai.com
# 2. Get your URL: https://abc123.toran.sh
# 3. Swap it in:
export OPENAI_BASE_URL=https://abc123.toran.sh

No install. No signup. You'll probably be surprised by what you find.
