<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: YHH</title>
    <description>The latest articles on DEV Community by YHH (@esengine).</description>
    <link>https://dev.to/esengine</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3890710%2F36667688-2190-4222-a317-87b8f93e4306.jpeg</url>
      <title>DEV Community: YHH</title>
      <link>https://dev.to/esengine</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/esengine"/>
    <language>en</language>
    <item>
      <title>How a DeepSeek-only agent framework hit 85% prefix cache rate (and saved 93% vs Claude)</title>
      <dc:creator>YHH</dc:creator>
      <pubDate>Tue, 21 Apr 2026 11:54:44 +0000</pubDate>
      <link>https://dev.to/esengine/how-a-deepseek-only-agent-framework-hit-85-prefix-cache-rate-and-saved-93-vs-claude-5c9g</link>
      <guid>https://dev.to/esengine/how-a-deepseek-only-agent-framework-hit-85-prefix-cache-rate-and-saved-93-vs-claude-5c9g</guid>
      <description>&lt;p&gt;I've been running DeepSeek behind LangChain for a few months for a side project. Worked fine, except one day I noticed&lt;br&gt;
  something weird: DeepSeek's pricing page advertises &lt;strong&gt;cached input tokens at ~10% of the miss rate&lt;/strong&gt;, but my bills didn't&lt;br&gt;
  reflect that at all.&lt;/p&gt;

&lt;p&gt;I dug in. The cache is byte-prefix based. The moment your request's prefix differs from the previous one by even a single&lt;br&gt;
  character, you pay full price. And LangChain — along with every generic agent framework I checked — rebuilds the prompt&lt;br&gt;
  every turn. Timestamps get injected. History gets reordered. Tool schemas re-serialize with different whitespace. The prefix&lt;br&gt;
   drifts, the cache never hits.&lt;/p&gt;

&lt;p&gt;So I wrote something opinionated: &lt;strong&gt;Reasonix&lt;/strong&gt; — a TypeScript agent framework built &lt;strong&gt;only&lt;/strong&gt; for DeepSeek. No multi-provider&lt;br&gt;
   abstraction, no orchestration graph, no RAG. Just three things done deeply.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;📦 &lt;code&gt;npm install -g reasonix &amp;amp;&amp;amp; reasonix chat&lt;/code&gt;&lt;br&gt;
🔗 GitHub: &lt;a href="https://github.com/esengine/reasonix" rel="noopener noreferrer"&gt;esengine/reasonix&lt;/a&gt;&lt;br&gt;
📜 MIT License&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;The numbers up front&lt;/h2&gt;

&lt;p&gt;Measured against the live DeepSeek API, not marketing math:&lt;/p&gt;

&lt;table&gt;
&lt;thead&gt;&lt;tr&gt;&lt;th&gt;Scenario&lt;/th&gt;&lt;th&gt;Model&lt;/th&gt;&lt;th&gt;Turns&lt;/th&gt;&lt;th&gt;Cache hit&lt;/th&gt;&lt;th&gt;Cost&lt;/th&gt;&lt;th&gt;Same on Claude Sonnet 4.6&lt;/th&gt;&lt;th&gt;Savings&lt;/th&gt;&lt;/tr&gt;&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;&lt;td&gt;Multi-turn chat&lt;/td&gt;&lt;td&gt;&lt;code&gt;deepseek-chat&lt;/code&gt;&lt;/td&gt;&lt;td&gt;5&lt;/td&gt;&lt;td&gt;&lt;strong&gt;85.2%&lt;/strong&gt;&lt;/td&gt;&lt;td&gt;$0.000923&lt;/td&gt;&lt;td&gt;$0.015174&lt;/td&gt;&lt;td&gt;&lt;strong&gt;93.9%&lt;/strong&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;Tool-use (calculator)&lt;/td&gt;&lt;td&gt;&lt;code&gt;deepseek-chat&lt;/code&gt;&lt;/td&gt;&lt;td&gt;2&lt;/td&gt;&lt;td&gt;&lt;strong&gt;94.9%&lt;/strong&gt;&lt;/td&gt;&lt;td&gt;$0.000142&lt;/td&gt;&lt;td&gt;$0.003351&lt;/td&gt;&lt;td&gt;&lt;strong&gt;95.8%&lt;/strong&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;R1 reasoning + harvest&lt;/td&gt;&lt;td&gt;&lt;code&gt;deepseek-reasoner&lt;/code&gt;&lt;/td&gt;&lt;td&gt;1&lt;/td&gt;&lt;td&gt;72.7%&lt;/td&gt;&lt;td&gt;$0.006478&lt;/td&gt;&lt;td&gt;$0.044484&lt;/td&gt;&lt;td&gt;85.4%&lt;/td&gt;&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;

&lt;p&gt;Numbers come straight from &lt;code&gt;usage.prompt_cache_hit_tokens&lt;/code&gt; on real API responses. You can install Reasonix and verify in 2&lt;br&gt;
  minutes.&lt;/p&gt;
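&lt;p&gt;If you want to reproduce the ratio yourself, it's one division over the usage fields. A minimal sketch (the usage field names are the ones cited above; the &lt;code&gt;DeepSeekUsage&lt;/code&gt; type and the function are illustrative, not part of Reasonix):&lt;/p&gt;

```typescript
// Illustrative only: aggregate cache-hit ratio over a session's turns,
// computed from the usage fields DeepSeek returns on each response.
interface DeepSeekUsage {
  prompt_cache_hit_tokens: number;
  prompt_cache_miss_tokens: number;
  completion_tokens: number;
}

function cacheHitRatio(turns: DeepSeekUsage[]): number {
  let hit = 0;
  let total = 0;
  for (const u of turns) {
    hit += u.prompt_cache_hit_tokens;
    total += u.prompt_cache_hit_tokens + u.prompt_cache_miss_tokens;
  }
  return total === 0 ? 0 : hit / total;
}
```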

&lt;h2&gt;Pillar 1 — Cache-First Loop&lt;/h2&gt;

&lt;p&gt;The problem again: DeepSeek's cache only fires on identical byte prefix. Generic frameworks rebuild prompts, so the prefix&lt;br&gt;
  drifts, so the cache rarely hits.&lt;/p&gt;
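&lt;p&gt;To make the failure concrete, here's a toy check (not DeepSeek's actual cache logic) of how many bytes two serialized requests share: one injected timestamp near the top of the prompt, and almost nothing survives as a common prefix.&lt;/p&gt;

```typescript
// Toy illustration: length of the shared byte prefix between two
// serialized requests. The cache target is whatever prefix is identical.
function sharedPrefixBytes(a: string, b: string): number {
  const enc = new TextEncoder();
  const ab = enc.encode(a);
  const bb = enc.encode(b);
  let i = 0;
  while (i < ab.length && i < bb.length && ab[i] === bb[i]) i++;
  return i;
}

const stable = JSON.stringify({ system: "You are helpful.", history: ["hi"] });
const drifted = JSON.stringify({ system: "You are helpful. [2026-04-21]", history: ["hi"] });
// sharedPrefixBytes(stable, drifted) stops at the injected timestamp,
// so nearly the whole request bills as a cache miss.
```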

&lt;p&gt;The fix is structural. Every request's context gets partitioned into three regions with strict invariants:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;  ┌─────────────────────────────────────┐
  │ IMMUTABLE PREFIX                    │ ← frozen at session start
  │   system + tool_specs + few_shots   │   this is the cache target
  ├─────────────────────────────────────┤
  │ APPEND-ONLY LOG                     │ ← grows monotonically
  │   [user₁][assistant₁][tool₁]...     │   prior turns persist as the prefix
  ├─────────────────────────────────────┤
  │ VOLATILE SCRATCH                    │ ← reset each turn
  │   R1 thoughts, transient state      │   never sent upstream
  └─────────────────────────────────────┘
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In code, the prefix is hashed at construction and pinned. The log's &lt;code&gt;append()&lt;/code&gt; method refuses any mutation. The scratch gets&lt;br&gt;
   wiped at every turn boundary.&lt;/p&gt;

&lt;p&gt;That's it. That single discipline is enough to push cache hit rates to 85-95% on real sessions. Nothing else in the framework would matter if this were wrong.&lt;/p&gt;

&lt;h2&gt;Pillar 2 — R1 Thought Harvesting&lt;/h2&gt;

&lt;p&gt;DeepSeek's reasoning model &lt;code&gt;deepseek-reasoner&lt;/code&gt; (aka R1) emits extensive &lt;code&gt;reasoning_content&lt;/code&gt; — often 1000+ tokens of&lt;br&gt;
  step-by-step thinking. DeepSeek's own docs recommend &lt;strong&gt;not&lt;/strong&gt; feeding it back to the next turn (it hurts quality). So most&lt;br&gt;
  frameworks just display it or drop it.&lt;/p&gt;

&lt;p&gt;That's leaving a plan on the table. R1's reasoning trace is literally the model thinking out loud about subgoals,&lt;br&gt;
  hypotheses, and uncertainties. I pipe it through a cheap secondary V3 call in JSON mode and extract structured state:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;  &lt;span class="kr"&gt;interface&lt;/span&gt; &lt;span class="nx"&gt;TypedPlanState&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nl"&gt;subgoals&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;[];&lt;/span&gt;      &lt;span class="c1"&gt;// concrete intermediate objectives&lt;/span&gt;
    &lt;span class="nl"&gt;hypotheses&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;[];&lt;/span&gt;    &lt;span class="c1"&gt;// candidate approaches being weighed&lt;/span&gt;
    &lt;span class="nl"&gt;uncertainties&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;[];&lt;/span&gt; &lt;span class="c1"&gt;// things R1 flags as unclear&lt;/span&gt;
    &lt;span class="nl"&gt;rejectedPaths&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;[];&lt;/span&gt; &lt;span class="c1"&gt;// approaches considered and abandoned&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Here's R1 on a classic logic puzzle — "3 boxes with swapped labels; pick one fruit to determine all three contents":&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;  ‹ subgoals (3): enumerate label-content permutations · decide which box to sample · verify uniqueness
  ‹ hypotheses (3): sample from "apple" box · sample from "orange" box · sample from "mixed" box
  ‹ uncertainties (2): can a single pick uniquely determine all? · does "mixed" contain equal ratios?
  ‹ rejected (2): sampling from "apple" box (ambiguous) · sampling from "orange" box (symmetric)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Every field maps to actual content in R1's reasoning trace. V3 is cheap enough (~$0.0001/turn) that this is essentially&lt;br&gt;
  free. Opt-in via &lt;code&gt;reasonix chat --harvest&lt;/code&gt; or &lt;code&gt;/harvest on&lt;/code&gt; inside the TUI.&lt;/p&gt;
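&lt;p&gt;One practical wrinkle: JSON mode guarantees syntax, not shape, so the V3 reply still needs coercing into &lt;code&gt;TypedPlanState&lt;/code&gt;. A hedged sketch of that step (&lt;code&gt;coercePlanState&lt;/code&gt; is illustrative, not the framework's API):&lt;/p&gt;

```typescript
interface TypedPlanState {
  subgoals: string[];
  hypotheses: string[];
  uncertainties: string[];
  rejectedPaths: string[];
}

// Illustrative coercion: accept whatever the secondary model produced,
// keep only string arrays, and default every missing field to [].
function coercePlanState(raw: string): TypedPlanState {
  let parsed: unknown;
  try {
    parsed = JSON.parse(raw);
  } catch {
    parsed = {}; // unparseable reply degrades to an empty plan state
  }
  const obj = (parsed && typeof parsed === "object" ? parsed : {}) as Record<string, unknown>;
  const strings = (v: unknown): string[] =>
    Array.isArray(v) ? v.filter((x): x is string => typeof x === "string") : [];
  return {
    subgoals: strings(obj.subgoals),
    hypotheses: strings(obj.hypotheses),
    uncertainties: strings(obj.uncertainties),
    rejectedPaths: strings(obj.rejectedPaths),
  };
}
```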

&lt;h2&gt;Pillar 3 — Tool-Call Repair&lt;/h2&gt;

&lt;p&gt;DeepSeek has several known tool-use quirks that generic frameworks don't handle:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Deep or wide schemas drop arguments.&lt;/strong&gt; Tool schemas with more than ~10 leaf parameters or more than 2 levels of nesting
cause V3/R1 to silently omit fields.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;R1 leaks tool calls into &lt;code&gt;&amp;lt;think&amp;gt;&lt;/code&gt;.&lt;/strong&gt; The model writes tool-call JSON inside its reasoning trace and forgets to surface
it in the actual &lt;code&gt;tool_calls&lt;/code&gt; field.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;JSON gets truncated.&lt;/strong&gt; Long &lt;code&gt;arguments&lt;/code&gt; payloads hit &lt;code&gt;max_tokens&lt;/code&gt; mid-structure.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Call storms.&lt;/strong&gt; The model hammers the same tool with identical arguments in an infinite loop.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Reasonix's repair layer has four passes running on every turn:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;  &lt;span class="c1"&gt;// 1. Auto-flatten deep/wide schemas&lt;/span&gt;
  &lt;span class="nx"&gt;ToolRegistry&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;register&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
    &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;updateProfile&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;parameters&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;object&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;properties&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="na"&gt;user&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;object&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;properties&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
          &lt;span class="na"&gt;profile&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;object&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;properties&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;string&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
            &lt;span class="na"&gt;age&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;integer&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
          &lt;span class="p"&gt;}},&lt;/span&gt;
        &lt;span class="p"&gt;}},&lt;/span&gt;
      &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="na"&gt;fn&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="nx"&gt;user&lt;/span&gt; &lt;span class="p"&gt;})&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nf"&gt;updateInDB&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;user&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
  &lt;span class="p"&gt;});&lt;/span&gt;
  &lt;span class="c1"&gt;// Internally shown to the model as a flat schema:&lt;/span&gt;
  &lt;span class="c1"&gt;//   {"user.profile.name": "...", "user.profile.age": ...}&lt;/span&gt;
  &lt;span class="c1"&gt;// On dispatch, args re-nested back to { user: { profile: { ... } } }&lt;/span&gt;

  &lt;span class="c1"&gt;// 2. Scavenge: regex + JSON parser sweeps reasoning_content for missed calls&lt;/span&gt;
  &lt;span class="c1"&gt;// 3. Truncation recovery: close braces, trim trailing commas, fill dangling keys&lt;/span&gt;
  &lt;span class="c1"&gt;// 4. Storm breaker: sliding-window dedup of (tool, args) tuples&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;All four are always on. No user configuration.&lt;/p&gt;
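&lt;p&gt;The flatten/re-nest step in pass 1 is just dotted-path bookkeeping. A minimal sketch of what it might look like (helpers are illustrative, not Reasonix's exports):&lt;/p&gt;

```typescript
type Json = string | number | boolean | null | Json[] | { [k: string]: Json };
type JsonObj = { [k: string]: Json };

// Collapse nested objects into "a.b.c" keys for the model-facing schema.
function flattenArgs(obj: JsonObj, prefix = ""): JsonObj {
  const out: JsonObj = {};
  for (const [k, v] of Object.entries(obj)) {
    const key = prefix ? `${prefix}.${k}` : k;
    if (v !== null && typeof v === "object" && !Array.isArray(v)) {
      Object.assign(out, flattenArgs(v as JsonObj, key));
    } else {
      out[key] = v;
    }
  }
  return out;
}

// Re-nest "a.b.c" keys back into the shape the tool's fn expects on dispatch.
function nestArgs(flat: JsonObj): JsonObj {
  const out: JsonObj = {};
  for (const [path, v] of Object.entries(flat)) {
    const parts = path.split(".");
    let cur = out;
    for (let i = 0; i < parts.length - 1; i++) {
      cur = (cur[parts[i]] ??= {}) as JsonObj;
    }
    cur[parts[parts.length - 1]] = v;
  }
  return out;
}
```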

&lt;h2&gt;Bonus: Self-Consistency Branching&lt;/h2&gt;

&lt;p&gt;Here's the fun one. DeepSeek is roughly 20× cheaper than Claude Sonnet 4.6. That means &lt;strong&gt;three parallel R1 samples per turn&lt;br&gt;
  is still cheaper than a single Claude call&lt;/strong&gt;. What was a research luxury (self-consistency sampling) becomes a practical&lt;br&gt;
  default.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;  reasonix chat &lt;span class="nt"&gt;--branch&lt;/span&gt; 3
  &lt;span class="c"&gt;# or inside the TUI:&lt;/span&gt;
  &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; /preset max
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Three samples fire in parallel at temperatures 0.0 / 0.5 / 1.0. Each one's reasoning is harvested. The default selector&lt;br&gt;
  picks whichever sample has the fewest flagged &lt;code&gt;uncertainties&lt;/code&gt; (tie-break on shorter answer length — Occam's razor as a&lt;br&gt;
  heuristic).&lt;/p&gt;
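&lt;p&gt;The selector itself is tiny. A sketch of the rule as described, fewest uncertainties with a shorter-answer tie-break (illustrative names, not the real implementation):&lt;/p&gt;

```typescript
interface BranchSample {
  answer: string;
  uncertainties: string[]; // harvested from each sample's reasoning trace
}

// Returns the index of the winning sample: fewest flagged uncertainties,
// ties broken by shorter answer length.
function pickSample(samples: BranchSample[]): number {
  let best = 0;
  for (let i = 1; i < samples.length; i++) {
    const a = samples[i];
    const b = samples[best];
    if (
      a.uncertainties.length < b.uncertainties.length ||
      (a.uncertainties.length === b.uncertainties.length &&
        a.answer.length < b.answer.length)
    ) {
      best = i;
    }
  }
  return best;
}
```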

&lt;p&gt;TUI shows this live:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;  🔀 branched 3 samples → picked #1   #0 T=0.0 u=2   ▸#1 T=0.5 u=0   #2 T=1.0 u=3
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Anecdotally it lifts accuracy 10-15 percentage points on medium-difficulty reasoning, at roughly 1/5 the cost of a single&lt;br&gt;
  Claude pass. I haven't run a formal benchmark yet — that's next.&lt;/p&gt;

&lt;h2&gt;What it's explicitly not&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Not a LangChain replacement. No multi-provider, no graph orchestration, no RAG.&lt;/li&gt;
&lt;li&gt;Not a drop-in for OpenAI-compatible code. The whole point is DeepSeek-specific.&lt;/li&gt;
&lt;li&gt;Not production-ready. v0.0.6 pre-alpha, 135 passing tests, no formal benchmarks yet.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;Quick start&lt;/h2&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;  npm &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;-g&lt;/span&gt; reasonix
  reasonix chat
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;First launch prompts for your DeepSeek API key and saves it to &lt;code&gt;~/.reasonix/config.json&lt;/code&gt;. Sessions auto-persist: chat through two hours of work, quit, come back tomorrow, type &lt;code&gt;reasonix chat&lt;/code&gt;, and you're back where you left off.&lt;/p&gt;

&lt;p&gt;Inside the TUI, slash commands cover everything:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight console"&gt;&lt;code&gt;&lt;span class="go"&gt;  /preset fast|smart|max    one-tap config (fast = default)
&lt;/span&gt;&lt;span class="gp"&gt;  /model &amp;lt;id&amp;gt;&lt;/span&gt;&lt;span class="w"&gt;               &lt;/span&gt;deepseek-chat or deepseek-reasoner
&lt;span class="go"&gt;  /harvest [on|off]         Pillar 2 toggle
&lt;/span&gt;&lt;span class="gp"&gt;  /branch &amp;lt;N|off&amp;gt;&lt;/span&gt;&lt;span class="w"&gt;           &lt;/span&gt;N parallel samples &lt;span class="o"&gt;(&amp;gt;=&lt;/span&gt;2&lt;span class="o"&gt;)&lt;/span&gt;
&lt;span class="go"&gt;  /sessions                 list saved sessions
  /forget                   delete current session
  /help                     full list
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;No flag-soup to memorize. A command strip under the prompt shows the top-level commands at all times.&lt;/p&gt;

&lt;h2&gt;Library usage&lt;/h2&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;  &lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;CacheFirstLoop&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="nx"&gt;DeepSeekClient&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="nx"&gt;ImmutablePrefix&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="nx"&gt;ToolRegistry&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;reasonix&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;DeepSeekClient&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt; &lt;span class="c1"&gt;// reads DEEPSEEK_API_KEY&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;tools&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;ToolRegistry&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

  &lt;span class="nx"&gt;tools&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;register&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
    &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;add&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;parameters&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;object&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;properties&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;a&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;integer&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt; &lt;span class="na"&gt;b&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;integer&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
      &lt;span class="na"&gt;required&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;a&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;b&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="na"&gt;fn&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="nx"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;b&lt;/span&gt; &lt;span class="p"&gt;}:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;a&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;number&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="nl"&gt;b&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;number&lt;/span&gt; &lt;span class="p"&gt;})&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nx"&gt;a&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="nx"&gt;b&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="p"&gt;});&lt;/span&gt;

  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;loop&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;CacheFirstLoop&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
    &lt;span class="nx"&gt;client&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="nx"&gt;tools&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;prefix&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;ImmutablePrefix&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
      &lt;span class="na"&gt;system&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;You are a math helper.&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;toolSpecs&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;tools&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;specs&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
    &lt;span class="p"&gt;}),&lt;/span&gt;
    &lt;span class="na"&gt;harvest&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;branch&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;session&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;math-tutor&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="p"&gt;});&lt;/span&gt;

  &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="k"&gt;await &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;ev&lt;/span&gt; &lt;span class="k"&gt;of&lt;/span&gt; &lt;span class="nx"&gt;loop&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;step&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;What is 17 + 25?&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;ev&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;role&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;assistant_final&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;ev&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;content&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;

  &lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;loop&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;stats&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;summary&lt;/span&gt;&lt;span class="p"&gt;());&lt;/span&gt;
  &lt;span class="c1"&gt;// { turns: 2, totalCostUsd: 0.0003, savingsVsClaudePct: 94, cacheHitRatio: 0.87 }&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;Open questions I'd love feedback on&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Branching selector heuristic.&lt;/strong&gt; The default is &lt;code&gt;min(uncertainties.length)&lt;/code&gt; with length tie-break. That's obviously&lt;br&gt;
naive. What signals would you combine? Cross-sample answer similarity? Tool-call success rate per sample? An LLM-judge pass?&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Harvest cost/value trade-off.&lt;/strong&gt; The $0.0001/turn V3 call feels negligible but it's a floor on per-turn cost. Has anyone&lt;br&gt;
tried fine-tuning R1 to output structured plan state directly?&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Cache continuity across config changes.&lt;/strong&gt; Right now changing the system prompt mid-session invalidates the prefix&lt;br&gt;
cache. Is there a migration path that preserves the existing log's value?&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;




&lt;p&gt;Full source: &lt;a href="https://github.com/esengine/reasonix" rel="noopener noreferrer"&gt;github.com/esengine/reasonix&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Install: &lt;code&gt;npm install -g reasonix&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;Issues, PRs, and benchmarks especially welcome.&lt;/p&gt;

</description>
      <category>javascript</category>
      <category>ai</category>
      <category>opensource</category>
      <category>showdev</category>
    </item>
  </channel>
</rss>
