AI coding agents are getting good enough that they no longer feel like autocomplete.
Tools like Claude Code, Codex, OpenCode, Cursor, Cline, and other agentic coding systems can read files, modify code, run commands, call tools, inspect errors, and continue working across multiple turns.
That is useful. It is also increasingly opaque.
When you ask an agent to fix a bug, you usually see the final answer and maybe a few tool calls in the terminal. But you often do not know:
- What system prompt did the model receive?
- Which messages were included in the request?
- Which tool schemas were shown to the model?
- Why did it choose one tool instead of another?
- What tool result was fed into the next turn?
- How many tokens did the task use?
- Was cache used?
- Which request was slow?
- How much did the session cost?
For small experiments, guessing is fine. For real work, guessing is a bad debugging strategy.
That is why I built ccglass.
GitHub: https://github.com/jianshuo/ccglass
What is ccglass?
ccglass is a local observability tool for AI coding agents.
It runs a local proxy, captures model requests and responses, and shows them in a web dashboard.
The goal is simple: make it easy to see what tools like Claude Code, Codex, DeepSeek-TUI, Kimi, OpenCode, Ollama, OpenRouter, and other agent clients are actually sending to the model.
ccglass can show:
- system prompts
- user and assistant message history
- tool schemas
- tool calls and tool results
- raw request and response bodies
- token usage
- cache usage
- estimated cost
- latency
- turn-to-turn diffs
It is not another coding agent. It is a visibility layer for the agents you already use.
Why not just use a normal HTTP proxy?
General-purpose tools like Charles, mitmproxy, or Proxyman are great, but AI coding agents can be awkward to inspect with traditional proxy setups.
Some clients do not reliably honor HTTP_PROXY / HTTPS_PROXY. Some have their own networking behavior. Some use provider-specific base URL settings. Patching the client is fragile.
ccglass takes a more targeted approach.
It starts a local proxy, then launches or configures the target agent with the right base URL environment variable, such as:
ANTHROPIC_BASE_URLOPENAI_BASE_URL
The agent sends model traffic to the local proxy. ccglass logs it, forwards it to the real upstream API, and renders the result in a dashboard.
That means you can inspect LLM traffic without installing a CA certificate, decrypting TLS, or modifying the client source code.
Quick start
Install ccglass with npm:
npm install -g ccglass
Then run:
ccglass
You can also start a specific client directly:
ccglass claude
ccglass codex
ccglass opencode
ccglass deepseek
ccglass kimi
For example:
ccglass codex
When it starts, ccglass prints a local dashboard URL:
dashboard: http://127.0.0.1:57633
Open that URL and you can watch requests appear as the agent works.
What can you debug with it?
1. Prompt and context problems
Sometimes an agent makes a bad decision because it did not see the context you expected.
With ccglass, you can inspect the actual messages sent to the model instead of guessing from the terminal output.
You can answer questions like:
- Did the file content make it into the request?
- Was the previous tool result included?
- Did a long conversation bury the important instruction?
- Did the system prompt constrain the behavior?
2. Tool call behavior
Coding agents are only as good as their tool loop.
ccglass lets you inspect the tools shown to the model, the tool call selected by the model, the arguments passed to the tool, and the result that was fed back into the next request.
That is useful when an agent:
- chooses the wrong tool
- repeats the same command
- fails to use an available tool
- gets confused by a tool schema
- behaves differently across providers
3. Token and cost spikes
Long-running agent sessions can burn tokens quickly.
ccglass shows token usage, cache usage, estimated cost, and latency per request and per session.
That makes it easier to spot:
- a huge tool result entering the context
- repeated large prompts
- low cache hit behavior
- slow requests
- expensive turns that did not add much value
4. Provider and proxy issues
If you use local gateways, OpenAI-compatible endpoints, Anthropic-compatible endpoints, OpenRouter, Ollama, Bedrock, Vertex, or internal proxies, request shape matters.
ccglass helps you compare what the client sent with what the upstream expected.
This is especially useful when debugging:
- custom
base_urlconfiguration - Anthropic vs OpenAI-compatible payload differences
- missing tool call fields
- malformed tool arguments
- provider-specific routing
Exporting requests
You can export captured requests for deeper inspection or bug reports:
ccglass export <session>/<seq> --format raw
ccglass export <session>/<seq> --format md
ccglass export <session>/<seq> --format json
ccglass export <session>/<seq> --format har
That makes it easier to attach useful evidence when reporting issues to an agent, provider, or gateway project.
Who is this for?
ccglass is useful if you:
- use Claude Code, Codex, OpenCode, or similar coding agents heavily
- build tools around coding agents
- maintain an LLM gateway or proxy
- debug prompt, context, or tool-call behavior
- want to understand token usage and cost
- compare different providers or agent clients
- care about local-first observability instead of sending traces to a SaaS service
A note on Codex
Codex has multiple auth and transport paths.
In API-key mode, routing through a configurable base URL works well for local proxy inspection.
When Codex is authenticated through ChatGPT login, some traffic may use a WebSocket path that does not honor OPENAI_BASE_URL. In that case, local proxy tools like ccglass cannot see that traffic.
That distinction matters when debugging Codex routing.
Why I think this matters
As coding agents become more capable, developers need better tools for understanding agent behavior.
The interesting question is no longer only:
Did the agent produce the right code?
It is also:
Why did the agent behave that way?
To answer that, you need visibility into prompts, context, tools, requests, latency, and cost.
ccglass is a small open-source step in that direction.
GitHub: https://github.com/jianshuo/ccglass
Install:
npm install -g ccglass
If you are building with coding agents and have ever wondered what they actually send to the model, give it a try.
Top comments (0)