<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: 侯垒</title>
    <description>The latest articles on DEV Community by 侯垒 (@houleixx).</description>
    <link>https://dev.to/houleixx</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3976799%2Fc165c283-c505-4c96-ba45-7b87ea20771c.png</url>
      <title>DEV Community: 侯垒</title>
      <link>https://dev.to/houleixx</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/houleixx"/>
    <language>en</language>
    <item>
      <title>I built a local reverse proxy to see what Claude Code actually sends to Anthropic</title>
      <dc:creator>侯垒</dc:creator>
      <pubDate>Wed, 10 Jun 2026 07:08:09 +0000</pubDate>
      <link>https://dev.to/houleixx/i-built-a-local-reverse-proxy-to-see-what-claude-code-actually-sends-to-anthropic-5foo</link>
      <guid>https://dev.to/houleixx/i-built-a-local-reverse-proxy-to-see-what-claude-code-actually-sends-to-anthropic-5foo</guid>
      <description>&lt;h2&gt;
  
  
  The problem I couldn't solve
&lt;/h2&gt;

&lt;p&gt;I was spending ~$1,800/month on Claude Code.&lt;/p&gt;

&lt;p&gt;I had no idea where the money was going. I had no idea which prompts were 4,000-token monstrosities, which ones were 200-token gems, or which ones I'd accidentally repeated 3 times this week.&lt;/p&gt;

&lt;p&gt;I tried the obvious tools first:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;mitmproxy&lt;/strong&gt; — didn't work. Claude Code (and Codex, DeepSeek, Kimi, GLM, etc.) all ignore &lt;code&gt;HTTP_PROXY&lt;/code&gt; because they're native CLIs that open HTTPS sockets directly.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Charles&lt;/strong&gt; — same problem.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Langfuse / Helicone&lt;/strong&gt; — these are SaaS. You have to send your data to them. Not what I wanted.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Custom hooks&lt;/strong&gt; — limited to events the CLI exposes. I wanted the raw HTTP.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;I wanted &lt;strong&gt;a local, open-source, zero-account way to see what my coding agent was doing&lt;/strong&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  The solution: a local reverse proxy on the loopback
&lt;/h2&gt;

&lt;p&gt;The insight: every coding agent CLI talks to &lt;code&gt;api.anthropic.com&lt;/code&gt; (or similar). If I make it talk to &lt;code&gt;http://127.0.0.1:port&lt;/code&gt; instead, and have a tiny proxy on that port forward to the real API, &lt;strong&gt;the local hop is plain HTTP&lt;/strong&gt; — easy to log, no CA cert, no TLS pinning pain.&lt;/p&gt;

&lt;p&gt;That's it. That's the whole trick.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npm i &lt;span class="nt"&gt;-g&lt;/span&gt; ccglass
ccglass claude
&lt;span class="c"&gt;# → opens http://localhost:8123 in your browser&lt;/span&gt;
&lt;span class="c"&gt;# → real-time dashboard of every request&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
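
&lt;p&gt;To make the loopback idea concrete, here is a minimal sketch of such a proxy using only the Python standard library. It is not ccglass's actual implementation: the upstream URL, the port &lt;code&gt;8787&lt;/code&gt;, and the helper names are assumptions chosen for illustration, and it buffers each response instead of streaming it.&lt;/p&gt;

```python
import json
import urllib.error
import urllib.request
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

UPSTREAM = "https://api.anthropic.com"  # assumed upstream; swap in your provider
SECRET_HEADERS = {"authorization", "x-api-key"}  # masked before logging
HOP_HEADERS = {"host", "content-length", "accept-encoding"}  # not forwarded


def upstream_url(path):
    """Map a path received on the loopback port to the real API URL."""
    return UPSTREAM + path


def redact(headers):
    """Return a loggable copy of the headers with credentials masked."""
    return {k: ("[redacted]" if k.lower() in SECRET_HEADERS else v)
            for k, v in headers.items()}


class LoggingProxy(BaseHTTPRequestHandler):
    def do_POST(self):
        body = self.rfile.read(int(self.headers.get("Content-Length", 0)))
        # The local hop is plain HTTP, so the request is readable as-is.
        print(json.dumps({"path": self.path, "bytes": len(body),
                          "headers": redact(self.headers)}))
        fwd = {k: v for k, v in self.headers.items()
               if k.lower() not in HOP_HEADERS}
        req = urllib.request.Request(upstream_url(self.path), data=body,
                                     headers=fwd, method="POST")
        try:
            # TLS is only used on this second hop, towards the real API.
            with urllib.request.urlopen(req) as resp:
                status, payload = resp.status, resp.read()
                ctype = resp.headers.get("Content-Type", "application/json")
        except urllib.error.HTTPError as err:
            # Relay API error responses to the client unchanged.
            status, payload = err.code, err.read()
            ctype = err.headers.get("Content-Type", "application/json")
        self.send_response(status)
        self.send_header("Content-Type", ctype)
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)


def serve(port=8787):
    """Listen on the loopback interface only; the port is arbitrary."""
    ThreadingHTTPServer(("127.0.0.1", port), LoggingProxy).serve_forever()
```

&lt;p&gt;Calling &lt;code&gt;serve()&lt;/code&gt; and pointing a client's API base URL at &lt;code&gt;http://127.0.0.1:8787&lt;/code&gt; is enough to see each request logged as one JSON line. A real tool would also need to stream responses and persist the logs.&lt;/p&gt;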



&lt;h2&gt;
  
  
  What I learned in 30 days
&lt;/h2&gt;

&lt;p&gt;After running every Claude Code session through it, I found:&lt;/p&gt;

&lt;h3&gt;
  
  
  1. I had a 38% cache hit rate I didn't know about
&lt;/h3&gt;

&lt;p&gt;Only 38% of my prompt tokens were cache hits; the rest of the context I kept repeating was billed at full price. The dashboard made that visible. I rewrote my CLAUDE.md to front-load context, the cache hit rate rose to 70%, and my monthly bill dropped 35%.&lt;/p&gt;
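
&lt;p&gt;For reference, one plausible way to derive a cache hit rate from logged responses is sketched below. It assumes the Anthropic-style &lt;code&gt;usage&lt;/code&gt; fields &lt;code&gt;input_tokens&lt;/code&gt;, &lt;code&gt;cache_creation_input_tokens&lt;/code&gt;, and &lt;code&gt;cache_read_input_tokens&lt;/code&gt;. How ccglass computes its own figure is not stated here, and the sample numbers are invented.&lt;/p&gt;

```python
def cache_hit_rate(usages):
    """Share of prompt tokens that were read from cache across responses."""
    read = sum(u.get("cache_read_input_tokens", 0) for u in usages)
    written = sum(u.get("cache_creation_input_tokens", 0) for u in usages)
    fresh = sum(u.get("input_tokens", 0) for u in usages)
    total = read + written + fresh
    return read / total if total else 0.0


# Invented sample: the first turn writes the cache, the second reads it.
usages = [
    {"input_tokens": 200, "cache_creation_input_tokens": 3800,
     "cache_read_input_tokens": 0},
    {"input_tokens": 300, "cache_creation_input_tokens": 0,
     "cache_read_input_tokens": 3800},
]
print(round(cache_hit_rate(usages), 2))  # 0.47
```

&lt;p&gt;Counting tokens instead of requests matters because a single long cached prefix can outweigh many short uncached prompts.&lt;/p&gt;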

&lt;h3&gt;
  
  
  2. Per-provider cost varies 10x
&lt;/h3&gt;

&lt;p&gt;Same task:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Claude Sonnet 4.6: $0.42&lt;/li&gt;
&lt;li&gt;GPT-4o: $0.31&lt;/li&gt;
&lt;li&gt;DeepSeek: $0.04&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;I started picking a provider per task: Anthropic for quality, DeepSeek for bulk work.&lt;/p&gt;

&lt;h3&gt;
  
  
  3. Turn counts were higher than I thought
&lt;/h3&gt;

&lt;p&gt;I averaged 4.2 turns per task. After seeing the data, I rewrote my CLAUDE.md and the average dropped to 2.8. Less back-and-forth means lower cost and faster delivery.&lt;/p&gt;

&lt;h3&gt;
  
  
  4. MCP self-inspection is wild
&lt;/h3&gt;

&lt;p&gt;ccglass has an MCP server. When you run &lt;code&gt;ccglass claude&lt;/code&gt;, the agent can query its own request history &lt;em&gt;inside the chat&lt;/em&gt;. I asked Claude "what did I prompt you with 3 turns ago?" and it answered correctly.&lt;/p&gt;

&lt;h2&gt;
  
  
  What's supported
&lt;/h2&gt;

&lt;p&gt;16+ providers out of the box:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Coding agents&lt;/strong&gt;: Claude Code, Codex, OpenCode, CodeBuddy, Reasonix&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Pure LLM APIs&lt;/strong&gt;: Anthropic, OpenAI, DeepSeek, Kimi, GLM, OpenRouter&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Cloud&lt;/strong&gt;: AWS Bedrock, GCP Vertex AI&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Local&lt;/strong&gt;: Ollama, LM Studio&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  The limits (I want to be honest)
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Cursor subscription models&lt;/strong&gt; can't be intercepted (they use a server-side proxy).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;VS Code Continue&lt;/strong&gt; with built-in models: same.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;It's local-only by design.&lt;/strong&gt; No SaaS, no telemetry, no account. (If you want cloud, use Langfuse.)&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Open source
&lt;/h2&gt;

&lt;p&gt;GitHub: &lt;a href="https://github.com/jianshuo/ccglass" rel="noopener noreferrer"&gt;https://github.com/jianshuo/ccglass&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;460+ stars at time of writing. MIT licensed. PRs welcome.&lt;/p&gt;

&lt;p&gt;If you ship with Claude Code, Codex, or Kimi and have ever asked "where is my money going?", try it once. The data is eye-opening.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>claudecode</category>
      <category>opensource</category>
      <category>devtools</category>
    </item>
    <item>
      <title>Make Your AI Coding Agent Transparent - See What It Actually Sends to the Model</title>
      <dc:creator>侯垒</dc:creator>
      <pubDate>Wed, 10 Jun 2026 02:46:06 +0000</pubDate>
      <link>https://dev.to/houleixx/make-your-ai-coding-agent-transparent-see-what-it-actually-sends-to-the-model-4inn</link>
      <guid>https://dev.to/houleixx/make-your-ai-coding-agent-transparent-see-what-it-actually-sends-to-the-model-4inn</guid>
      <description>&lt;p&gt;If you've been using AI coding agents like Claude Code or Codex, you know how powerful they can be. But they also feel like a black box. What's actually in that system prompt? How much context is being sent every turn? Where are all my tokens going?&lt;/p&gt;

&lt;p&gt;I recently found a tool called ccglass that answers these questions beautifully, and I wanted to share my experience.&lt;/p&gt;

&lt;h2&gt;
  
  
  What is ccglass?
&lt;/h2&gt;

&lt;p&gt;ccglass is a lightweight local tool that acts as a reverse proxy between your AI coding agent and the model API. It intercepts all requests and responses, logs them, and displays them in a web dashboard.&lt;/p&gt;


&lt;h2&gt;
  
  
  Getting Started
&lt;/h2&gt;

&lt;p&gt;Installation is simple (Node.js 18+ required):&lt;/p&gt;

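&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npm install -g ccglass
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;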

&lt;p&gt;Then just run it and pick your agent:&lt;/p&gt;

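&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;ccglass
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;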

&lt;p&gt;Or specify directly:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;ccglass claude      # for Claude Code
ccglass codex       # for Codex
ccglass deepseek    # for DeepSeek-TUI
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;It will:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Start a local proxy server&lt;/li&gt;
&lt;li&gt;Set the right environment variables automatically&lt;/li&gt;
&lt;li&gt;Launch the agent for you&lt;/li&gt;
&lt;li&gt;Open the web dashboard&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;That's it! No CA certificates to install, no complicated setup.&lt;/p&gt;
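
&lt;p&gt;The "set the right environment variables" and "launch the agent" steps can be sketched roughly as follows. This is an illustration under assumptions, not ccglass's code: it relies on Claude Code honoring the &lt;code&gt;ANTHROPIC_BASE_URL&lt;/code&gt; environment variable, and the function names and port are hypothetical. Which variables ccglass actually sets for each agent is not described in this post.&lt;/p&gt;

```python
import os
import subprocess


def agent_env(proxy_port, base_env=None):
    """Copy the environment and point the Anthropic base URL at the proxy."""
    env = dict(os.environ if base_env is None else base_env)
    env["ANTHROPIC_BASE_URL"] = f"http://127.0.0.1:{proxy_port}"
    return env


def launch_claude(proxy_port):
    """Start the `claude` CLI as a child process that talks to the proxy."""
    return subprocess.run(["claude"], env=agent_env(proxy_port))
```

&lt;p&gt;Because only the child process gets the modified environment, the rest of your shell keeps talking to the real API directly.&lt;/p&gt;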

&lt;h2&gt;
  
  
  What You Can See
&lt;/h2&gt;

&lt;p&gt;The dashboard shows you everything:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;The Full System Prompt&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;This is probably the most interesting part. You get to see how different agents frame their instructions to the model. Claude Code's system prompt is fascinating to read!&lt;/p&gt;

&lt;ol start="2"&gt;
&lt;li&gt;Complete Message History&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;See the full context being sent each turn, how it evolves, and what gets kept vs. dropped.&lt;/p&gt;

&lt;ol start="3"&gt;
&lt;li&gt;Tool Schemas and Calls&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;See what tool definitions the agent provides to the model, and what tool calls the model makes in response.&lt;/p&gt;

&lt;ol start="4"&gt;
&lt;li&gt;Token and Cost Breakdown&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Know exactly how many tokens you're using, what's cached, and get cost estimates per request and per session.&lt;/p&gt;

&lt;p&gt;Token Summary &lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcoqn2p5y4umhy7rglkxs.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcoqn2p5y4umhy7rglkxs.png" alt=" " width="800" height="337"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ol start="5"&gt;
&lt;li&gt;Turn-by-Turn Diff View&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Compare requests to see exactly what changed between turns. Super useful for debugging why an agent started behaving differently.&lt;/p&gt;

&lt;h2&gt;
  
  
  Supported Agents
&lt;/h2&gt;

&lt;p&gt;The list is pretty extensive:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Claude Code (including Bedrock and Vertex modes)&lt;/li&gt;
&lt;li&gt;Codex (OpenAI)&lt;/li&gt;
&lt;li&gt;DeepSeek-TUI and Reasonix&lt;/li&gt;
&lt;li&gt;Kimi (Moonshot)&lt;/li&gt;
&lt;li&gt;OpenCode&lt;/li&gt;
&lt;li&gt;Ollama&lt;/li&gt;
&lt;li&gt;OpenRouter&lt;/li&gt;
&lt;li&gt;GLM/Zhipu&lt;/li&gt;
&lt;li&gt;CodeBuddy (VS Code/JetBrains plugins)&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  IDE Support
&lt;/h2&gt;

&lt;p&gt;If you use Cursor, Cline, Continue.dev, or a similar IDE or extension that lets you set a custom API base URL, you can use the proxy mode:&lt;/p&gt;

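&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;ccglass proxy --provider openai
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;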

&lt;p&gt;Then just point your IDE's API base URL to the local proxy address it gives you.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why I Like This
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;Learn from the best - See how production-grade agent systems are built&lt;/li&gt;
&lt;li&gt;Debug effectively - Understand why your agent is making certain choices&lt;/li&gt;
&lt;li&gt;Optimize costs - See where your tokens are actually going&lt;/li&gt;
&lt;li&gt;Local only - All logs stay on your machine (auth tokens are redacted by default)&lt;/li&gt;
&lt;li&gt;MIT licensed - Completely open source&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  Pro Tip: Export for Documentation
&lt;/h2&gt;

&lt;p&gt;You can export any request to raw HTTP, Markdown, JSON, or HAR format:&lt;/p&gt;

&lt;p&gt;ccglass export / --format md&lt;/p&gt;

&lt;p&gt;Great for documentation, bug reports, or just saving interesting prompts.&lt;/p&gt;

&lt;h2&gt;
  
  
  Try It Out
&lt;/h2&gt;

&lt;p&gt;If you're using any AI coding agent regularly, I highly recommend giving ccglass a try. It will change how you think about these tools.&lt;/p&gt;

&lt;p&gt;Install it now:&lt;/p&gt;

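&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npm install -g ccglass
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;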

&lt;p&gt;Check out the project: &lt;a href="https://github.com/jianshuo/ccglass" rel="noopener noreferrer"&gt;https://github.com/jianshuo/ccglass&lt;/a&gt;&lt;/p&gt;




&lt;p&gt;What's your favorite AI coding agent? Have you found any good tools for understanding them better? Let me know in the comments!&lt;/p&gt;

</description>
      <category>agents</category>
      <category>ai</category>
      <category>monitoring</category>
      <category>tooling</category>
    </item>
  </channel>
</rss>
