<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Zelys - DFK Helper</title>
    <description>The latest articles on DEV Community by Zelys - DFK Helper (@zelys_dfkhelper).</description>
    <link>https://dev.to/zelys_dfkhelper</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3943003%2Fbb26dc0b-8b1d-4d52-8783-04f040828ac7.jpg</url>
      <title>DEV Community: Zelys - DFK Helper</title>
      <link>https://dev.to/zelys_dfkhelper</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/zelys_dfkhelper"/>
    <language>en</language>
    <item>
      <title>One Tool That Cuts Token Costs 40-80% for Claude Code, Codex, opencode, and openclaw</title>
      <dc:creator>Zelys - DFK Helper</dc:creator>
      <pubDate>Wed, 20 May 2026 22:18:23 +0000</pubDate>
      <link>https://dev.to/zelys_dfkhelper/one-tool-that-cuts-token-costs-40-80-for-claude-code-codex-opencode-and-openclaw-hh2</link>
      <guid>https://dev.to/zelys_dfkhelper/one-tool-that-cuts-token-costs-40-80-for-claude-code-codex-opencode-and-openclaw-hh2</guid>
      <description>&lt;p&gt;&lt;strong&gt;The problem isn't your prompts&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;If you're running Claude Code, Codex, opencode, or openclaw and the API bill keeps climbing, you've probably tried writing tighter prompts. That's not where the waste is.&lt;/p&gt;

&lt;p&gt;Four structural patterns account for most of the token spend in a typical session:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Screenshots at full resolution&lt;/strong&gt;. The agent reads whatever images you paste or reference. A 3.3 MB screenshot from a high-DPI display lands in the model at full size. The model doesn't need native resolution to understand what's on screen.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Repeated file reads&lt;/strong&gt;. The agent re-reads files it already touched earlier in the session. A 600-line file read three times costs 1,800 lines' worth of tokens. There's no built-in session memory to stop the second or third read from costing full price.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Compaction that loses context&lt;/strong&gt;. When a session compacts, the summary doesn't know which files were actively edited or which symbols mattered, so the next request starts with the wrong picture and prompts more reads.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Bash output floods&lt;/strong&gt;. Every pytest, npm install, docker build, or git log dumps hundreds of lines of passing-test names, deprecation warnings, and progress bars. The model processes all of it at full token cost.&lt;/p&gt;

&lt;p&gt;These compound. On a session with 10+ file reads, a few images, and a test run, you're easily burning 3x the tokens you actually need.&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;token-goat fixes all four&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;token-goat (&lt;a href="https://github.com/DFKHelper/token-goat" rel="noopener noreferrer"&gt;https://github.com/DFKHelper/token-goat&lt;/a&gt;) is a hook daemon for Claude Code, Codex CLI, opencode, and openclaw. Install once; it handles the rest.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Image shrinking&lt;/strong&gt;. Intercepts screenshots before they reach the model and compresses them. A 3.3 MB PNG becomes 84 KB, 97.4% smaller.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Session-aware read hints&lt;/strong&gt;. Tracks every file the agent reads in the session. When the agent is about to re-read one, it gets a hint such as: "you read lines 1–420 of auth.py 12 minutes ago." Most re-reads stop there.&lt;/p&gt;
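
&lt;p&gt;To make the idea concrete, here is a minimal sketch of a per-session read log. It is not token-goat's actual code: the ReadTracker class, its method names, and the exact hint wording are all assumptions made for illustration.&lt;/p&gt;

```python
import time
from dataclasses import dataclass, field


@dataclass
class ReadTracker:
    """Hypothetical per-session log of file reads (illustration only)."""

    # Maps a file path to a list of (start_line, end_line, timestamp) tuples.
    reads: dict = field(default_factory=dict)

    def record(self, path, start, end, now=None):
        """Remember that lines start..end of path were read."""
        stamp = time.time() if now is None else now
        self.reads.setdefault(path, []).append((start, end, stamp))

    def hint(self, path, now=None):
        """Return a reminder if path was read earlier, otherwise None."""
        history = self.reads.get(path)
        if not history:
            return None
        start, end, stamp = history[-1]
        current = time.time() if now is None else now
        minutes = int((current - stamp) // 60)
        return f"you read lines {start}-{end} of {path} {minutes} minutes ago"


tracker = ReadTracker()
tracker.record("auth.py", 1, 420, now=0)
print(tracker.hint("auth.py", now=12 * 60))
```

&lt;p&gt;The now parameter exists only so the example is reproducible. A real hook would rely on the clock and would probably also check whether the file changed on disk since the last read.&lt;/p&gt;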

&lt;p&gt;&lt;strong&gt;Compaction assist&lt;/strong&gt;. Before the session compacts, a hook builds a structured manifest — edited files, accessed symbols, key reads — and injects it into the compaction context. The next request starts with the right picture.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Bash output compression&lt;/strong&gt;. Filters long-running command output before it hits the model. pytest goes from 150 passing-test lines to a failures-first view, 80–97% smaller. npm install collapses warnings by package. docker build keeps step headers and errors, drops the rest.&lt;/p&gt;
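
&lt;p&gt;As a rough illustration of the failures-first idea, the sketch below filters verbose pytest output. The function name and the summary wording are hypothetical, and token-goat's real filters are likely more thorough than this.&lt;/p&gt;

```python
def compress_pytest_output(text):
    """Put failing lines first and collapse passing tests into a count."""
    failures, other = [], []
    passed = 0
    for line in text.splitlines():
        words = line.split()
        if "PASSED" in words:
            passed += 1  # counted, but not kept
        elif "FAILED" in words or "ERROR" in words:
            failures.append(line)
        elif words:
            other.append(line)  # keep everything else, e.g. the summary line
    summary = [f"({passed} passing tests hidden)"] if passed else []
    return "\n".join(failures + other + summary)


sample = "\n".join([
    "tests/test_auth.py::test_login PASSED [ 33%]",
    "tests/test_auth.py::test_logout FAILED [ 66%]",
    "tests/test_auth.py::test_token PASSED [100%]",
    "1 failed, 2 passed in 0.12s",
])
print(compress_pytest_output(sample))
```

&lt;p&gt;On this sample, the two passing lines collapse into a single count while the failure and pytest's own summary line survive.&lt;/p&gt;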

&lt;p&gt;It's all automated, but you can also pull individual functions instead of whole files:&lt;/p&gt;

&lt;p&gt;&lt;code&gt;token-goat read "src/auth.py::login"&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;On a 2,000-line module, that's 85% fewer tokens than reading the full file.&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;The numbers&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;100K wasted tokens per session runs about $0.30. Five sessions a day, about 300 days a year, is $450/year. Savings at that scale come from eliminating structural waste, not from writing shorter prompts. token-goat is free.&lt;/p&gt;
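
&lt;p&gt;The per-session figure is easy to reproduce. The sketch below assumes a price of $3 per million input tokens, which is what $0.30 per 100K implies. The function name and the default price are illustrative, so substitute your own model's rate.&lt;/p&gt;

```python
def wasted_cost(tokens_per_session, sessions, usd_per_million_tokens=3.0):
    """Dollar cost of wasted input tokens across a number of sessions."""
    return tokens_per_session * sessions * usd_per_million_tokens / 1_000_000


# One session with 100K wasted tokens at the assumed $3 per million rate.
print(round(wasted_cost(100_000, 1), 2))
```

&lt;p&gt;Pass your own yearly session count as the second argument to get an annual estimate.&lt;/p&gt;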

&lt;p&gt;4 hours of use on my machine: 59.7 MB of data that never hit the model, 11.5 million tokens avoided. And that was just version 0.1.&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;Install&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Requires uv (&lt;a href="https://docs.astral.sh/uv/" rel="noopener noreferrer"&gt;https://docs.astral.sh/uv/&lt;/a&gt;).&lt;/p&gt;

&lt;p&gt;uv tool install token-goat&lt;br&gt;
token-goat install&lt;/p&gt;

&lt;p&gt;Works with Claude Code, Codex CLI, opencode, and openclaw. Windows, Linux, WSL, and macOS.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://github.com/DFKHelper/token-goat" rel="noopener noreferrer"&gt;https://github.com/DFKHelper/token-goat&lt;/a&gt;&lt;/p&gt;

</description>
      <category>claude</category>
      <category>opencode</category>
      <category>ai</category>
      <category>devtools</category>
    </item>
  </channel>
</rss>
