DEV Community

侯垒
侯垒

Posted on

Stop Flying Blind with Coding Agents: Inspect Claude Code and Codex Requests with ccglass

AI coding agents are getting good enough that they no longer feel like autocomplete.

Tools like Claude Code, Codex, OpenCode, Cursor, Cline, and other agentic coding systems can read files, modify code, run commands, call tools, inspect errors, and continue working across multiple turns.

That is useful. It is also increasingly opaque.

When you ask an agent to fix a bug, you usually see the final answer and maybe a few tool calls in the terminal. But you often do not know:

  • What system prompt did the model receive?
  • Which messages were included in the request?
  • Which tool schemas were shown to the model?
  • Why did it choose one tool instead of another?
  • What tool result was fed into the next turn?
  • How many tokens did the task use?
  • Was cache used?
  • Which request was slow?
  • How much did the session cost?

For small experiments, guessing is fine. For real work, guessing is a bad debugging strategy.

That is why I built ccglass.

GitHub: https://github.com/jianshuo/ccglass

What is ccglass?

ccglass is a local observability tool for AI coding agents.

It runs a local proxy, captures model requests and responses, and shows them in a web dashboard.

The goal is simple: make it easy to see what tools like Claude Code, Codex, DeepSeek-TUI, Kimi, OpenCode, Ollama, OpenRouter, and other agent clients are actually sending to the model.

ccglass can show:

  • system prompts
  • user and assistant message history
  • tool schemas
  • tool calls and tool results
  • raw request and response bodies
  • token usage
  • cache usage
  • estimated cost
  • latency
  • turn-to-turn diffs

It is not another coding agent. It is a visibility layer for the agents you already use.

Why not just use a normal HTTP proxy?

General-purpose tools like Charles, mitmproxy, or Proxyman are great, but AI coding agents can be awkward to inspect with traditional proxy setups.

Some clients do not reliably honor HTTP_PROXY / HTTPS_PROXY. Some have their own networking behavior. Some use provider-specific base URL settings. Patching the client is fragile.

ccglass takes a more targeted approach.

It starts a local proxy, then launches or configures the target agent with the right base URL environment variable, such as:

  • ANTHROPIC_BASE_URL
  • OPENAI_BASE_URL

The agent sends model traffic to the local proxy. ccglass logs it, forwards it to the real upstream API, and renders the result in a dashboard.

That means you can inspect LLM traffic without installing a CA certificate, decrypting TLS, or modifying the client source code.

Quick start

Install ccglass with npm:

npm install -g ccglass
Enter fullscreen mode Exit fullscreen mode

Then run:

ccglass
Enter fullscreen mode Exit fullscreen mode

You can also start a specific client directly:

ccglass claude
ccglass codex
ccglass opencode
ccglass deepseek
ccglass kimi
Enter fullscreen mode Exit fullscreen mode

For example:

ccglass codex
Enter fullscreen mode Exit fullscreen mode

When it starts, ccglass prints a local dashboard URL:

dashboard: http://127.0.0.1:57633
Enter fullscreen mode Exit fullscreen mode

Open that URL and you can watch requests appear as the agent works.

What can you debug with it?

1. Prompt and context problems

Sometimes an agent makes a bad decision because it did not see the context you expected.

With ccglass, you can inspect the actual messages sent to the model instead of guessing from the terminal output.

You can answer questions like:

  • Did the file content make it into the request?
  • Was the previous tool result included?
  • Did a long conversation bury the important instruction?
  • Did the system prompt constrain the behavior?

2. Tool call behavior

Coding agents are only as good as their tool loop.

ccglass lets you inspect the tools shown to the model, the tool call selected by the model, the arguments passed to the tool, and the result that was fed back into the next request.

That is useful when an agent:

  • chooses the wrong tool
  • repeats the same command
  • fails to use an available tool
  • gets confused by a tool schema
  • behaves differently across providers

3. Token and cost spikes

Long-running agent sessions can burn tokens quickly.

ccglass shows token usage, cache usage, estimated cost, and latency per request and per session.

That makes it easier to spot:

  • a huge tool result entering the context
  • repeated large prompts
  • low cache hit behavior
  • slow requests
  • expensive turns that did not add much value

4. Provider and proxy issues

If you use local gateways, OpenAI-compatible endpoints, Anthropic-compatible endpoints, OpenRouter, Ollama, Bedrock, Vertex, or internal proxies, request shape matters.

ccglass helps you compare what the client sent with what the upstream expected.

This is especially useful when debugging:

  • custom base_url configuration
  • Anthropic vs OpenAI-compatible payload differences
  • missing tool call fields
  • malformed tool arguments
  • provider-specific routing

Exporting requests

You can export captured requests for deeper inspection or bug reports:

ccglass export <session>/<seq> --format raw
ccglass export <session>/<seq> --format md
ccglass export <session>/<seq> --format json
ccglass export <session>/<seq> --format har
Enter fullscreen mode Exit fullscreen mode

That makes it easier to attach useful evidence when reporting issues to an agent, provider, or gateway project.

Who is this for?

ccglass is useful if you:

  • use Claude Code, Codex, OpenCode, or similar coding agents heavily
  • build tools around coding agents
  • maintain an LLM gateway or proxy
  • debug prompt, context, or tool-call behavior
  • want to understand token usage and cost
  • compare different providers or agent clients
  • care about local-first observability instead of sending traces to a SaaS service

A note on Codex

Codex has multiple auth and transport paths.

In API-key mode, routing through a configurable base URL works well for local proxy inspection.

When Codex is authenticated through ChatGPT login, some traffic may use a WebSocket path that does not honor OPENAI_BASE_URL. In that case, local proxy tools like ccglass cannot see that traffic.

That distinction matters when debugging Codex routing.

Why I think this matters

As coding agents become more capable, developers need better tools for understanding agent behavior.

The interesting question is no longer only:

Did the agent produce the right code?

It is also:

Why did the agent behave that way?

To answer that, you need visibility into prompts, context, tools, requests, latency, and cost.

ccglass is a small open-source step in that direction.

GitHub: https://github.com/jianshuo/ccglass

Install:

npm install -g ccglass
Enter fullscreen mode Exit fullscreen mode

If you are building with coding agents and have ever wondered what they actually send to the model, give it a try.

Top comments (0)