When a coding agent fails, the visible error is rarely the whole story.
You might see:
- a tool call that never ran
- a command repeated again and again
- a sudden token spike
- a provider rejecting a request with
400 Bad Request - an agent that says it edited a file but did not
- a long session that starts producing shallow or confused answers
The usual reaction is to tweak the prompt and try again.
Sometimes that works. But for agentic coding tools, guessing is not enough. You need to inspect what the agent actually sent to the model.
That is the problem ccglass is built for.
GitHub: https://github.com/jianshuo/ccglass
The debugging problem with coding agents
Modern coding agents are not simple chatbots.
Tools like Claude Code, Codex, OpenCode, CodeBuddy, Qoder, and similar systems usually run a loop like this:
user request
-> model request
-> tool call
-> local command / file read / edit / search
-> tool result
-> next model request
-> final answer
When something goes wrong, the bug can be in any part of that loop.
For example:
- The model never saw the tool schema you thought it saw.
- The tool schema was too large or malformed.
- The model returned a malformed tool call.
- The local client dropped part of the tool result.
- A huge tool result entered the next request and inflated token usage.
- The provider rejected a request shape that another provider accepts.
- A proxy or gateway translated Anthropic and OpenAI formats incorrectly.
You cannot debug that reliably from the final answer alone.
What to inspect first
When an agent behaves strangely, I usually want to see five things.
1. The system prompt
The system prompt often explains behavior that looks mysterious from the outside.
It may contain rules about:
- when to ask permission
- when to use tools
- how much work to do before stopping
- whether to run tests
- whether to preserve existing files
- how to summarize results
If the agent ignores your instruction, first check whether the system prompt is pushing it in a different direction.
2. The tool schema
Tool calling depends heavily on the schema sent to the model.
If a tool is described vaguely, has confusing parameter names, or contains a schema shape the provider does not like, the model may choose the wrong tool or produce invalid arguments.
This matters even more with MCP servers and custom tools.
The question is not "what did my code define?" The real question is:
What tool schema was actually sent in the model request?
3. The tool call
A tool call bug can come from the model, the client, or the provider adapter.
You want to inspect:
- tool name
- call id
- arguments
- malformed fields
- missing required fields
- whether the tool call was emitted as structured data or plain text
For example, if the model emits something that looks like a tool call but the client renders it as text, the agent may continue as if the tool ran even though no tool result exists.
4. The tool result
Tool results are often the hidden source of context bloat.
A single file read, search result, stack trace, or command output can add thousands of tokens to the next turn.
If the agent suddenly becomes expensive or confused, check what tool results were fed back into the model.
5. Token usage and latency
Token totals are useful, but per-request token usage is better.
You want to know:
- which request got expensive
- whether input, output, or cache tokens dominated
- whether a request was slow before the first token
- whether repeated turns reused the same large context
- whether a provider returned usage data at all
That is the difference between "this session was expensive" and "this specific tool result caused the spike."
Using ccglass for request-level debugging
ccglass is a local proxy and dashboard for coding-agent traffic.
It lets you inspect what supported agents actually send to the model:
- system prompts
- messages
- tool schemas
- tool calls
- tool results
- raw request and response bodies
- token/cache/cost
- latency
- turn-to-turn diffs
It works locally. It is open source.
Install:
npm install -g ccglass
Start it:
ccglass
Or choose a client directly:
ccglass claude
ccglass codex
ccglass opencode
ccglass qoder
ccglass codebuddy
For generic OpenAI-compatible or Anthropic-compatible clients, you can also run proxy-only mode:
ccglass proxy --provider openai
ccglass proxy --provider claude
Then point your client or IDE at the printed local base URL.
Example debugging workflow
Suppose an agent repeatedly fails to call a tool correctly.
Instead of changing the prompt first, inspect the actual request flow:
- Open the ccglass dashboard.
- Find the request where the model was expected to call the tool.
- Expand the system prompt and tool schema.
- Check whether the tool was visible to the model.
- Check the model response for the tool call.
- Check whether the tool result was paired correctly.
- Compare the next request to see what context was carried forward.
That gives you a factual answer to questions like:
- Did the model see the tool?
- Did it call the wrong tool?
- Were the arguments malformed?
- Did the client drop the tool result?
- Did the next turn include the right result?
Example: debugging token spikes
Another common problem:
Why did this one coding-agent session use so many tokens?
In ccglass, inspect the request list and session summary.
Look for:
- a request with unusually high input tokens
- a large tool result entering the next request
- many repeated requests with similar context
- cache usage that is lower than expected
- a slow request with high input size
Then use turn-to-turn diff to see what changed between two requests.
This is often more useful than looking only at the final cost.
Example: debugging provider 400 errors
Provider errors are another good use case.
If an Anthropic-compatible or OpenAI-compatible endpoint rejects a request, you need the exact payload.
Check:
- request body
- tool schema
- message order
- tool_use / tool_result pairing
- response or error body
- provider/model name
This is useful when working with:
- internal gateways
- OpenRouter
- Ollama-compatible endpoints
- Bedrock or Vertex routes
- Anthropic-compatible translation layers
- OpenAI-compatible coding-agent backends
The failure is often not "the model is bad." It is often a request-shape problem.
Exporting evidence
ccglass can export captured requests:
ccglass export <session>/<seq> --format raw
ccglass export <session>/<seq> --format md
ccglass export <session>/<seq> --format json
ccglass export <session>/<seq> --format har
That is useful when reporting bugs to an agent project, provider, or proxy maintainer.
Instead of saying:
The agent failed.
You can show:
This exact request contained this tool schema, this model response emitted this malformed tool call, and this provider returned this error.
That is much easier to debug.
A few practical notes
ccglass is not a universal network sniffer.
It works best when the client can be pointed at a local base URL or local proxy. For example, API-key based OpenAI-compatible and Anthropic-compatible traffic is a good fit.
Some clients have special transports. For example, Codex authenticated through ChatGPT login may use a WebSocket path that does not honor OPENAI_BASE_URL, so local base URL inspection will not see that traffic.
For CodeBuddy, ccglass uses a forward-proxy mode because CodeBuddy hardcodes its upstream endpoint.
Why this matters
As coding agents become more autonomous, debugging needs to move one layer deeper.
It is no longer enough to ask:
Did the agent produce the right diff?
You also need to ask:
What did the agent see, what tool did it choose, what result came back, and what context entered the next turn?
That is what ccglass tries to make visible.
GitHub:
https://github.com/jianshuo/ccglass
Install:
npm install -g ccglass
If you build with coding agents, request-level debugging is worth having in your toolbox.
Top comments (0)