<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Stanislav Kremeň</title>
    <description>The latest articles on DEV Community by Stanislav Kremeň (@stkremen).</description>
    <link>https://dev.to/stkremen</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3964941%2Ff626e8f5-89da-43a5-bf81-05508fc5fe52.jpg</url>
      <title>DEV Community: Stanislav Kremeň</title>
      <link>https://dev.to/stkremen</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/stkremen"/>
    <language>en</language>
    <item>
      <title>How I Stopped Burning Tokens in Claude Code</title>
      <dc:creator>Stanislav Kremeň</dc:creator>
      <pubDate>Tue, 02 Jun 2026 16:08:58 +0000</pubDate>
      <link>https://dev.to/stkremen/how-i-stopped-burning-tokens-in-claude-code-2okd</link>
      <guid>https://dev.to/stkremen/how-i-stopped-burning-tokens-in-claude-code-2okd</guid>
      <description>&lt;p&gt;I use Claude Code every day. After about a month on a larger project, I hit a wall that anyone running something longer than a weekend script knows: keeping a project on track across dozens of sessions costs an absurd amount of tokens.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgkvgkjo1cw5b339ls0t9.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgkvgkjo1cw5b339ls0t9.gif" alt=" " width="600" height="460"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The problem isn't the coding itself. The problem is context. Every new session starts from zero - Claude doesn't know what happened in the last phase, what decisions were already made, where the project is heading. So you have to re-orient it every single time. And the more context you pile on, the sooner you hit the limit.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Where the tokens go&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;I tried GSD (Get Shit Done) - a clever tool that breaks a project into phases and keeps documentation along the way. But that documentation is also its weakness: it generates a pile of markdown files - RESEARCH.md, PLAN.md, VERIFICATION.md and more - and re-reads them into context at every step.&lt;/p&gt;

&lt;p&gt;I turned on a token counter and watched a single phase eat more context on reading its own bookkeeping than on the actual work. That felt wrong. Project state should be a thin index, not a book that Claude laboriously reads from scratch before every edit.&lt;br&gt;
So I wrote my own tool. It's called mini.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Design principle: state is a header, not a book&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The whole idea of mini fits into one sentence: keep minimal state and send Claude only the essentials.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Concretely:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;project.md - one page. What you're building, for whom, what the constraints are. Nothing more.&lt;/li&gt;
&lt;li&gt;state.json - a lightweight header: phase index, statuses, chosen models. The details of each phase (steps, report) live separately in .mini/phases/phase-{id}.json and are loaded only when needed.
When I then start work on a phase, Claude typically gets ~600–1000 tokens: a page of project.md + the current phase + five steps. No history of old phases, no old plans, no verification reports.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;And what if Claude needs to understand the existing code? It reads it itself via Read/Glob/Grep. That's far cheaper than stuffing the whole codebase into context up front - Claude pulls exactly the three files it's working on, not the entire src/.&lt;/p&gt;

&lt;p&gt;And this isn't just theory from a token counter. In practice, one entire phase - from proposal through implementation to closing it out - fits within &lt;strong&gt;10–15%&lt;/strong&gt; of the context window on Opus 4.8 (1M). That leaves a huge margin: Claude has room to read files, iterate and fix things without the window filling up and the session "choking" on its own history. That headroom is the whole point - a phase never runs out of room to think.&lt;/p&gt;

&lt;p&gt;Initializing a project: &lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;code&gt;init&lt;/code&gt;, or &lt;code&gt;import-gsd&lt;/code&gt; from a .planning/ directory&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;The loop: &lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;code&gt;next&lt;/code&gt; → &lt;code&gt;plan&lt;/code&gt; → &lt;code&gt;do&lt;/code&gt; → &lt;code&gt;done&lt;/code&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Working with mini is one repeating cycle. I'll show it on a small example - a REST API for todos:&lt;/p&gt;

&lt;p&gt;&lt;code&gt;mini init&lt;/code&gt; create the project - 4 questions, produces .mini/project.md and state.json&lt;br&gt;
&lt;code&gt;mini next&lt;/code&gt; Claude proposes the next phase → phase 1 - Health endpoint&lt;br&gt;
&lt;code&gt;mini plan&lt;/code&gt; break it into concrete, verifiable steps → Phase 1 broken down into 2 steps&lt;br&gt;
&lt;code&gt;mini do&lt;/code&gt; work the phase - opens an interactive Claude Code session&lt;br&gt;
&lt;code&gt;mini done&lt;/code&gt; close it - "does it work?" → moves state forward, commits, writes memory&lt;/p&gt;

&lt;p&gt;The key point is that state operations are done by non-trivially tested TypeScript, not by Claude. mini … - apply moves state.json, writes the report, closes the phase. Claude does only the agentic work in the session. That way the state can never break from a hallucination - the thing I hated about purely prompt-based solutions.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What helped me most: memory between phases&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;This is the detail that turned out to matter most. After closing a phase, mini writes a short record into .mini/memory/phase-{id}.md:&lt;br&gt;
what was done key decisions - why solution X over Y loose ends - what was left undone, what to watch out for&lt;/p&gt;

&lt;p&gt;And note - this does not call the Claude API. mini assembles that record directly in TypeScript from the phase data (metadata + the verbatim content of the discuss/report). It's free and instant. Claude gets involved only if you explicitly set a memory scope to a model.&lt;/p&gt;

&lt;p&gt;Memory complements git log with a layer you won't find there: not what changed, but why. And a short summary of the latest record is automatically poured into the prompt for mini next - so the proposal for the next phase knows where we left off. That was the moment Claude stopped proposing phases that ignored earlier decisions.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Semi-autonomous mode&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;When I want to move fast, I use mini auto:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;next → plan → do (acceptEdits) → done&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;It runs the whole phase in a single Claude session (not a restart per step - every restart means re-exploring the project with no added value) and stops only at the human checkpoint: "does it work?". In acceptEdits mode Claude doesn't ask before every file edit, but Bash still asks (unless you use  &lt;em&gt;- -allow-dangerously-skip-permissions&lt;/em&gt;) - no random rm -rf.&lt;/p&gt;

&lt;p&gt;Auto also has a cooperative stop signal (mini stop from a second terminal), so I can halt an autonomous run cleanly at a step boundary.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Slash commands right inside Claude Code&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The whole cycle doesn't have to run from the terminal. After mini install-commands I have /mini:* commands right inside Claude Code:&lt;/p&gt;

&lt;p&gt;&lt;code&gt;/mini:next&lt;/code&gt; - proposes and saves the next phase&lt;br&gt;
&lt;code&gt;/mini:plan&lt;/code&gt; - breaks the phase into steps&lt;br&gt;
&lt;code&gt;/mini:do &lt;/code&gt;- implements the phase and writes a report&lt;br&gt;
&lt;code&gt;/mini:done &lt;/code&gt;- human verification in the chat → moves state forward&lt;/p&gt;

&lt;p&gt;The body of those commands is thin - it just runs mini context , which prints the current prompt. The agentic work is done by Claude in the running session, the state operations by untouched TypeScript. No nested Claude inside the session.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;It runs on your subscription&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;No extra API keys, no separate billing. mini just runs claude as a subprocess - authentication is handled by Claude Code itself, based on how you have it configured. It runs on your Pro/Max account.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Try it&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;mini is free and open source (MIT). If burning tokens annoys you too, the fastest way to try it without touching ~/.claude:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;cd &lt;/span&gt;your-project
npx mini-orchestrator install-commands
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Repo with a demo GIF of the full cycle: &lt;a href="//github.com/czsoftcode/mini-orchestrator"&gt;github.com/czsoftcode/mini-orchestrator&lt;/a&gt;&lt;br&gt;
- - - &lt;br&gt;
I'm now looking for a few people to run it on a real project and tell me where it drags or where the onboarding is confusing. Honest "this is useless because X" feedback is just as welcome as bug reports - reach out in issues.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>claude</category>
      <category>devtools</category>
      <category>opensource</category>
    </item>
  </channel>
</rss>
