侯垒

Posted on Jun 20

Stop Flying Blind with Coding Agents: Inspect Claude Code and Codex Requests with ccglass

#ai #opensource #productivity #devbugsmash

AI coding agents are getting good enough that they no longer feel like autocomplete.

Tools like Claude Code, Codex, OpenCode, Cursor, Cline, and other agentic coding systems can read files, modify code, run commands, call tools, inspect errors, and continue working across multiple turns.

That is useful. It is also increasingly opaque.

When you ask an agent to fix a bug, you usually see the final answer and maybe a few tool calls in the terminal. But you often do not know:

What system prompt did the model receive?
Which messages were included in the request?
Which tool schemas were shown to the model?
Why did it choose one tool instead of another?
What tool result was fed into the next turn?
How many tokens did the task use?
Was cache used?
Which request was slow?
How much did the session cost?

For small experiments, guessing is fine. For real work, guessing is a bad debugging strategy.

That is why I built ccglass.

GitHub: https://github.com/jianshuo/ccglass

What is ccglass?

ccglass is a local observability tool for AI coding agents.

It runs a local proxy, captures model requests and responses, and shows them in a web dashboard.

The goal is simple: make it easy to see what tools like Claude Code, Codex, DeepSeek-TUI, Kimi, OpenCode, Ollama, OpenRouter, and other agent clients are actually sending to the model.

ccglass can show:

system prompts
user and assistant message history
tool schemas
tool calls and tool results
raw request and response bodies
token usage
cache usage
estimated cost
latency
turn-to-turn diffs

It is not another coding agent. It is a visibility layer for the agents you already use.

Why not just use a normal HTTP proxy?

General-purpose tools like Charles, mitmproxy, or Proxyman are great, but AI coding agents can be awkward to inspect with traditional proxy setups.

Some clients do not reliably honor HTTP_PROXY / HTTPS_PROXY. Some have their own networking behavior. Some use provider-specific base URL settings. Patching the client is fragile.

ccglass takes a more targeted approach.

It starts a local proxy, then launches or configures the target agent with the right base URL environment variable, such as:

ANTHROPIC_BASE_URL
OPENAI_BASE_URL

The agent sends model traffic to the local proxy. ccglass logs it, forwards it to the real upstream API, and renders the result in a dashboard.

That means you can inspect LLM traffic without installing a CA certificate, decrypting TLS, or modifying the client source code.

Quick start

Install ccglass with npm:

npm install -g ccglass

Then run:

ccglass

You can also start a specific client directly:

ccglass claude
ccglass codex
ccglass opencode
ccglass deepseek
ccglass kimi

For example:

ccglass codex

When it starts, ccglass prints a local dashboard URL:

dashboard: http://127.0.0.1:57633

Open that URL and you can watch requests appear as the agent works.

What can you debug with it?

1. Prompt and context problems

Sometimes an agent makes a bad decision because it did not see the context you expected.

With ccglass, you can inspect the actual messages sent to the model instead of guessing from the terminal output.

You can answer questions like:

Did the file content make it into the request?
Was the previous tool result included?
Did a long conversation bury the important instruction?
Did the system prompt constrain the behavior?

2. Tool call behavior

Coding agents are only as good as their tool loop.

ccglass lets you inspect the tools shown to the model, the tool call selected by the model, the arguments passed to the tool, and the result that was fed back into the next request.

That is useful when an agent:

chooses the wrong tool
repeats the same command
fails to use an available tool
gets confused by a tool schema
behaves differently across providers

3. Token and cost spikes

Long-running agent sessions can burn tokens quickly.

ccglass shows token usage, cache usage, estimated cost, and latency per request and per session.

That makes it easier to spot:

a huge tool result entering the context
repeated large prompts
low cache hit behavior
slow requests
expensive turns that did not add much value

4. Provider and proxy issues

If you use local gateways, OpenAI-compatible endpoints, Anthropic-compatible endpoints, OpenRouter, Ollama, Bedrock, Vertex, or internal proxies, request shape matters.

ccglass helps you compare what the client sent with what the upstream expected.

This is especially useful when debugging:

custom base_url configuration
Anthropic vs OpenAI-compatible payload differences
missing tool call fields
malformed tool arguments
provider-specific routing

Exporting requests

You can export captured requests for deeper inspection or bug reports:

ccglass export <session>/<seq> --format raw
ccglass export <session>/<seq> --format md
ccglass export <session>/<seq> --format json
ccglass export <session>/<seq> --format har

That makes it easier to attach useful evidence when reporting issues to an agent, provider, or gateway project.

Who is this for?

ccglass is useful if you:

use Claude Code, Codex, OpenCode, or similar coding agents heavily
build tools around coding agents
maintain an LLM gateway or proxy
debug prompt, context, or tool-call behavior
want to understand token usage and cost
compare different providers or agent clients
care about local-first observability instead of sending traces to a SaaS service

A note on Codex

Codex has multiple auth and transport paths.

In API-key mode, routing through a configurable base URL works well for local proxy inspection.

When Codex is authenticated through ChatGPT login, some traffic may use a WebSocket path that does not honor OPENAI_BASE_URL. In that case, local proxy tools like ccglass cannot see that traffic.

That distinction matters when debugging Codex routing.

Why I think this matters

As coding agents become more capable, developers need better tools for understanding agent behavior.

The interesting question is no longer only:

Did the agent produce the right code?

It is also:

Why did the agent behave that way?

To answer that, you need visibility into prompts, context, tools, requests, latency, and cost.

ccglass is a small open-source step in that direction.