How do you know if your AI agent is working or just burning money?

#ai #agents #observability #devops

The moment you connect an MCP server, your coding agent stops being a thing that reads and writes in your repo and becomes something that can reach out and act. It can hit APIs, query databases, and execute tool calls. That's the entire appeal of the agentic era.

It is also the entire nightmare for anyone responsible for the cloud bill.

I've spent two decades watching engineers celebrate 'autonomy' right before they realize that autonomy in a loop equals an infinite recursion of API costs. When you move from simple LLM prompts to autonomous agents, you aren't just managing logic anymore; you are managing execution traces and token consumption. You need observability.

The Black Box Problem

When an agent fails—or worse, when it succeeds in a way that is incredibly expensive—you can't just look at the last chat message and guess what happened. You need to see the internals. This is why I was looking closely at the AgentOps MCP server recently.

I connected it to my local environment, and the first thing I realized is that observability in agents isn't about 'logs.' It's about traces, spans, and metrics. If you can't see the individual steps of a tool call, you don't have an agent; you have a black box running on your dime.

Peering into the Trace

The AgentOps MCP setup allows for much more than just high-level monitoring. Using tools like get_trace, I was able to pull specific execution details directly into my workflow. Instead of jumping between a browser and my IDE, I could inspect exactly what happened during a specific run.

If an agent hits a loop or an error, you can use get_span to drill down to a single operation. For example, I was testing a scenario where an agent used a 'web_search' tool. By inspecting the span by its ID, I could see exactly when the call started, what parameters were passed (such as query: "agent observability"), and precisely what it returned. This is how you debug complex agentic loops: you isolate the one operation that went sideways.
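To make "isolating the single operation" concrete, here is a minimal sketch of the checks I would run once the spans of a trace are in hand. The span records and their field names (span_id, name, start, end, params, status) are assumptions made up for illustration, not the actual response shape of get_span, and the helper functions are hypothetical.

```python
import json
from collections import Counter

# Hypothetical span records. Field names are illustrative assumptions,
# not the actual AgentOps response schema.
SPANS = [
    {"span_id": "s1", "name": "llm_call", "start": 0.0, "end": 1.1,
     "params": {}, "status": "ok"},
    {"span_id": "s2", "name": "web_search", "start": 1.1, "end": 2.0,
     "params": {"query": "agent observability"}, "status": "ok"},
    {"span_id": "s3", "name": "web_search", "start": 2.0, "end": 2.9,
     "params": {"query": "agent observability"}, "status": "ok"},
    {"span_id": "s4", "name": "web_search", "start": 2.9, "end": 3.8,
     "params": {"query": "agent observability"}, "status": "error"},
    {"span_id": "s5", "name": "llm_call", "start": 3.8, "end": 4.2,
     "params": {}, "status": "ok"},
]


def call_signature(span):
    """Identify a call by tool name plus its exact parameters."""
    return (span["name"], json.dumps(span["params"], sort_keys=True))


def find_repeated_calls(spans, threshold=3):
    """Return call signatures seen at least `threshold` times (a likely loop)."""
    counts = Counter(call_signature(s) for s in spans)
    return {sig: n for sig, n in counts.items() if n >= threshold}


def slowest_span(spans):
    """Return the span with the longest wall-clock duration."""
    return max(spans, key=lambda s: s["end"] - s["start"])


def failed_spans(spans):
    """Return the IDs of spans whose status is anything other than 'ok'."""
    return [s["span_id"] for s in spans if s["status"] != "ok"]


if __name__ == "__main__":
    print("repeated calls:", find_repeated_calls(SPANS))
    print("slowest span:", slowest_span(SPANS)["span_id"])
    print("failed spans:", failed_spans(SPANS))
```

Matching on tool name plus exact parameters is deliberately strict: an agent that retries the same search with the same query is almost certainly stuck, while one that varies the query may be making progress.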

The Financial Reality of Tokens

We need to talk about cost. We often focus on latency, but token usage is the silent killer of AI ROI.

One of the most useful parts of this MCP server is get_trace_metrics. I ran a trace (ID: trace_abc123) and pulled the metrics directly. The result was eye-opening: 1,450 tokens used (800 prompt, 650 completion) with an estimated cost of $0.028 over just 4.2 seconds across 5 spans.
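Those raw numbers get more useful once you normalize them. The sketch below takes the figures from that trace and derives a blended cost per 1,000 tokens and per-span averages. The TraceMetrics class and its field names are my own hypothetical container, not the structure get_trace_metrics returns.

```python
from dataclasses import dataclass


@dataclass
class TraceMetrics:
    """Hypothetical container for the metrics quoted above."""
    prompt_tokens: int
    completion_tokens: int
    cost_usd: float
    duration_s: float
    span_count: int

    @property
    def total_tokens(self) -> int:
        return self.prompt_tokens + self.completion_tokens

    def cost_per_1k_tokens(self) -> float:
        """Blended rate across prompt and completion tokens."""
        return self.cost_usd / self.total_tokens * 1000

    def tokens_per_span(self) -> float:
        return self.total_tokens / self.span_count

    def cost_per_span(self) -> float:
        return self.cost_usd / self.span_count


# Figures from the trace discussed in the article.
metrics = TraceMetrics(
    prompt_tokens=800,
    completion_tokens=650,
    cost_usd=0.028,
    duration_s=4.2,
    span_count=5,
)

if __name__ == "__main__":
    print(f"total tokens:       {metrics.total_tokens}")
    print(f"cost per 1k tokens: ${metrics.cost_per_1k_tokens():.4f}")
    print(f"tokens per span:    {metrics.tokens_per_span():.0f}")
    print(f"cost per span:      ${metrics.cost_per_span():.4f}")
```

The blended rate works out to a little under two cents per 1,000 tokens, and that is the number worth tracking over time, because it moves when the model, the prompt size, or the number of tool round-trips changes.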

When you are running a single trace, $0.028 is nothing. When you are running a fleet of agents in production performing thousands of these traces a day, that per-trace cost multiplies into a line item that can decide whether a feature is profitable or a massive loss. Being able to monitor this in real time directly from your agentic client, whether it's Claude or Cursor, is a game changer for DevOps teams.
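The arithmetic is worth doing explicitly. This is a back-of-the-envelope sketch that projects the $0.028 per-trace cost across a few daily volumes and works backwards from a budget. The volumes, the 30-day month, and the $1,000 budget are illustrative assumptions, not measurements.

```python
def project_cost(cost_per_trace_usd, traces_per_day, days=30):
    """Return (daily, monthly) spend, assuming a flat cost per trace."""
    daily = cost_per_trace_usd * traces_per_day
    return daily, daily * days


def max_traces_within_budget(cost_per_trace_usd, monthly_budget_usd, days=30):
    """Largest whole number of traces per day that stays inside the budget."""
    return int(monthly_budget_usd / (cost_per_trace_usd * days))


COST_PER_TRACE = 0.028  # from the trace metrics quoted in the article

if __name__ == "__main__":
    # Illustrative volumes, not real traffic numbers.
    for volume in (1_000, 10_000, 100_000):
        daily, monthly = project_cost(COST_PER_TRACE, volume)
        print(f"{volume:>7} traces/day -> ${daily:,.2f}/day, ${monthly:,.2f}/month")

    # Hypothetical $1,000 monthly budget.
    print("traces/day within budget:",
          max_traces_within_budget(COST_PER_TRACE, 1_000))
```

A flat per-trace cost is the optimistic case. An agent stuck in a loop produces traces that cost far more than the average, which is why the per-trace metrics matter more than the monthly total.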

Implementing Observability

If you are building with MCP, you shouldn't be flying blind. The workflow is straightforward:

1. Subscribe to the AgentOps server within your MCP-compatible client.
2. Provide your API key.
3. Use get_project to ensure you are hitting the right telemetry sink.

This setup isn't just for AI Engineers trying to debug code; it's for Product Managers who need to track usage patterns and optimize ROI, and DevOps teams who need to ensure that these autonomous agents aren't behaving like rogue processes in a cluster.

Stop treating your agents like magic boxes. Start treating them like the distributed systems they actually are. If you can't trace it, you shouldn't be running it.


Top comments (1)

Eleftheria Batsou • Jun 24

The cost question and the safety question turn out to be the same question: both are about what the agent can reach.

An agent burning money on redundant tool calls and an agent doing damage are both unbounded-reach problems.

We build Zerops partly around bounding that (the agent gets a real but isolated environment), so I'm biased, but measuring and limiting reach is the thing I'd put at the center of this. Good piece.