<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Tanmay Devare</title>
    <description>The latest articles on DEV Community by Tanmay Devare (@tanmay_devare_45).</description>
    <link>https://dev.to/tanmay_devare_45</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3992684%2Fa8e9fc2c-041d-48f2-9f3f-0ef8d04e6615.webp</url>
      <title>DEV Community: Tanmay Devare</title>
      <link>https://dev.to/tanmay_devare_45</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/tanmay_devare_45"/>
    <language>en</language>
    <item>
      <title>How to fix LangGraph GraphRecursionError without losing your checkpointed state</title>
      <dc:creator>Tanmay Devare</dc:creator>
      <pubDate>Fri, 19 Jun 2026 17:47:59 +0000</pubDate>
      <link>https://dev.to/tanmay_devare_45/how-to-fix-langgraph-graphrecursionerror-without-losing-your-checkpointed-state-3mag</link>
      <guid>https://dev.to/tanmay_devare_45/how-to-fix-langgraph-graphrecursionerror-without-losing-your-checkpointed-state-3mag</guid>
      <description>&lt;p&gt;We’ve all been there. You leave your LangGraph agent running overnight. It hits a 403 Forbidden on a scraping tool, or a REQUIRES_SINGLE_PART_NAMESPACE error on a SQL query.&lt;br&gt;
Instead of failing gracefully, the agent asks the LLM for help. It gets stuck in a ReAct loop, burning through your API credits. Eventually, the native recursion_limit finally kills it.&lt;br&gt;
But here is the worst part: the native recursion_limit is a blunt instrument.&lt;br&gt;
When it hits the limit, LangGraph throws a GraphRecursionError. It crashes the run, wipes your checkpointed state, and returns a 500 error to your frontend user. You lose whatever partial data the agent did gather, and you get a surprise $4,000 API bill on Tuesday morning.&lt;br&gt;
I spent the last month digging into why agents do this, especially with open-weight models (Qwen/Llama) that lack native self-correction. I realized that just throwing a raw RuntimeError or a "BLOCKED" string at an agent just confuses it, and it loops again.&lt;br&gt;
Here is how we solved it using Pre-Model Intervention and Atomic Transcript Surgery.&lt;br&gt;
The Architecture: Intercepting Before the Crash&lt;br&gt;
Most guardrails wrap the entire graph or monkey-patch the HTTP client. This adds latency and breaks framework internals.&lt;br&gt;
Instead, we use LangGraph’s native pre_model_hook and ToolNode APIs. This allows us to intercept the agent before the next LLM call, mutate the ephemeral prompt, and force a strategy pivot without corrupting the user's checkpointed state.&lt;br&gt;
We call it the Progressive Intervention Protocol:&lt;br&gt;
Nudge: Injects an ephemeral warning into the tool result.&lt;br&gt;
Override: Safely strips the failing tool_calls from the prompt (preventing OpenAI/Anthropic 400 Bad Request validation errors) and forces a text-based strategy pivot.&lt;br&gt;
Hard Stop: Halts the graph but preserves the checkpointed state so you get partial results instead of a crash.&lt;br&gt;
The 1-Line Fix&lt;br&gt;
We open-sourced this engine as TokenCircuit. It uses zero-dependency semantic shingling (stdlib regex + hashlib) to catch paraphrased loops at &amp;lt;20µs latency.&lt;br&gt;
Here is how you wrap your LangGraph agent:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langgraph.prebuilt&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;create_react_agent&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;tokencircuit.adapters.langgraph&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;tc_pre_model_hook&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;TokenCircuitToolNode&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;tokencircuit&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;TokenCircuitConfig&lt;/span&gt;

&lt;span class="c1"&gt;# 1. Configure the intervention engine
&lt;/span&gt;&lt;span class="n"&gt;config&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;TokenCircuitConfig&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;max_repeats&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;window_size&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;telemetry_enabled&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="c1"&gt;# Logs interventions locally or to Supabase
&lt;/span&gt;    &lt;span class="n"&gt;agency_id&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;my-org&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;client_id&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;my-app&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# 2. Wrap your tools with TokenCircuit's transaction tracking
&lt;/span&gt;&lt;span class="n"&gt;safe_tool_node&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;TokenCircuitToolNode&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;tools&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# 3. Inject the pre-model hook for transcript surgery
&lt;/span&gt;&lt;span class="n"&gt;agent&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;create_react_agent&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;tools&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;safe_tool_node&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;pre_model_hook&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nf"&gt;tc_pre_model_hook&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;config&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;config&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;node_name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;agent&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Run your agent exactly as before
&lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;agent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;invoke&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;messages&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nc"&gt;HumanMessage&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Get me the stock price for AAPL&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)]})&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Why This Matters for Production&lt;br&gt;
When you deploy autonomous agents for clients, you can't afford silent loop failures.&lt;br&gt;
With TokenCircuit V8.1, we achieved zero core dependencies. We swapped pydantic for @dataclass(slots=True) and tiktoken for stdlib shingling. This means:&lt;br&gt;
Zero supply-chain vulnerabilities.&lt;br&gt;
&amp;lt;20µs overhead per turn.&lt;br&gt;
100% local execution. No prompts or PII ever leave your RAM.&lt;br&gt;
We also built a local CLI report generator. When an intervention happens, it logs to a local NDJSON file. You can run tokencircuit report --file events.json to generate a board-ready table showing exactly how many tokens and dollars your guardrail saved.&lt;br&gt;
The Code is Open Source&lt;br&gt;
If you are tired of watching your agents burn money on infinite loops, check out the repo.&lt;br&gt;
GitHub: &lt;a href="https://github.com/Devaretanmay/TokenCircut" rel="noopener noreferrer"&gt;https://github.com/Devaretanmay/TokenCircut&lt;/a&gt;&lt;br&gt;
PyPI: pip install "tokencircuit[langgraph]"&lt;br&gt;
Question for the builders: What’s the most money an agent has burned for you in a single night? Drop your war stories in the comments. 👇&lt;/p&gt;

</description>
      <category>ai</category>
      <category>langchain</category>
      <category>python</category>
      <category>agents</category>
    </item>
  </channel>
</rss>
