<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Alex</title>
    <description>The latest articles on DEV Community by Alex (@alexwang).</description>
    <link>https://dev.to/alexwang</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F4014110%2F859c037d-e234-472f-9c3f-8d1f1ec04d53.jpg</url>
      <title>DEV Community: Alex</title>
      <link>https://dev.to/alexwang</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/alexwang"/>
    <language>en</language>
    <item>
      <title>Catch LLM multi-hop hallucinations with zero extra tokens</title>
      <dc:creator>Alex</dc:creator>
      <pubDate>Sat, 04 Jul 2026 06:59:07 +0000</pubDate>
      <link>https://dev.to/alexwang/catch-llm-multi-hop-hallucinations-with-zero-extra-tokens-dj4</link>
      <guid>https://dev.to/alexwang/catch-llm-multi-hop-hallucinations-with-zero-extra-tokens-dj4</guid>
      <description>&lt;p&gt;LLMs don't usually fail at facts. They fail at &lt;em&gt;composing&lt;/em&gt; facts.&lt;/p&gt;

&lt;p&gt;Ask a model "who is Alice's parent?" and it answers reliably. Ask "is Alice an&lt;br&gt;
ancestor of Dave?" — a conclusion that requires chaining three parent facts —&lt;br&gt;
and accuracy falls off a cliff. Here is DeepSeek's accuracy on CLUTRR, a public&lt;br&gt;
kinship-reasoning benchmark, broken down by the number of composition steps (hops):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;hop:      2     3     4     5     6     7     8
DeepSeek: 83%   42%   25%   25%   42%   17%   8%
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;83% at two hops. &lt;strong&gt;8% at eight hops.&lt;/strong&gt; And the failures aren't shy "I don't&lt;br&gt;
know" answers — the model confidently fabricates relatives that don't exist.&lt;/p&gt;

&lt;p&gt;The usual fixes all have the same problem: they fight an LLM with more LLM.&lt;br&gt;
Ask the model to double-check itself and you pay for a second call — I measured&lt;br&gt;
+110% tokens — and it &lt;em&gt;still&lt;/em&gt; hallucinates (34% precision on the check).&lt;br&gt;
Sample five chains-of-thought and vote, and you multiply your bill by five for&lt;br&gt;
a statistical improvement with no guarantee attached.&lt;/p&gt;

&lt;p&gt;There's an older, cheaper tool for this specific job.&lt;/p&gt;
&lt;h2&gt;Relations are matrices&lt;/h2&gt;

&lt;p&gt;If your facts are triples — &lt;code&gt;(alice, parent, bob)&lt;/code&gt; — then each relation is a&lt;br&gt;
boolean matrix &lt;code&gt;R&lt;/code&gt; where &lt;code&gt;R[i][j] = 1&lt;/code&gt; iff the fact holds. And then:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Composition is matrix multiplication.&lt;/strong&gt; "Grandparent" is literally
&lt;code&gt;R_parent · R_parent&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Transitive closure is a sum of powers.&lt;/strong&gt; "Ancestor" is &lt;code&gt;Σ R_parentᵏ&lt;/code&gt; over &lt;code&gt;k ≥ 1&lt;/code&gt;, thresholded back to 0/1.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;A claim is true iff a path exists&lt;/strong&gt; — and you can return the path as a
proof.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;None of this is new math (it's reachability, the Katz index, the Neumann&lt;br&gt;
series — decades old). The point is what it buys you when you bolt it onto an&lt;br&gt;
LLM: a verifier that accepts a multi-hop claim &lt;strong&gt;if and only if&lt;/strong&gt; a proof path&lt;br&gt;
exists in the facts. It cannot be sweet-talked into accepting a fabricated&lt;br&gt;
relation, it costs zero model tokens, and when it says yes, it shows its work.&lt;/p&gt;
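&lt;p&gt;To make the matrix view concrete, here is a minimal pure-Python sketch of composition, closure, and proof-path recovery. The helper names (&lt;code&gt;relation_matrix&lt;/code&gt;, &lt;code&gt;compose&lt;/code&gt;, &lt;code&gt;closure&lt;/code&gt;, &lt;code&gt;proof_path&lt;/code&gt;) are made up for illustration and are not the library's API; the library's internals may well differ.&lt;/p&gt;

```python
from collections import deque

# Hypothetical helpers for illustration only; not the grounded-reasoning API.


def relation_matrix(facts, relation, nodes):
    """Boolean matrix R with R[i][j] = 1 iff (nodes[i], relation, nodes[j]) is a fact."""
    index = {name: i for i, name in enumerate(nodes)}
    R = [[0] * len(nodes) for _ in nodes]
    for subj, rel, obj in facts:
        if rel == relation:
            R[index[subj]][index[obj]] = 1
    return R


def compose(A, B):
    """Boolean matrix product: entry (i, j) is 1 iff some k has A[i][k] and B[k][j]."""
    n = len(A)
    return [[int(any(A[i][k] and B[k][j] for k in range(n))) for j in range(n)]
            for i in range(n)]


def closure(R):
    """Transitive closure: boolean OR of R^1 .. R^n, where n is the number of nodes."""
    n = len(R)
    total = [row[:] for row in R]
    power = R
    for _ in range(n - 1):
        power = compose(power, R)
        total = [[int(total[i][j] or power[i][j]) for j in range(n)] for i in range(n)]
    return total


def proof_path(facts, relation, src, dst):
    """Breadth-first search for a chain of facts from src to dst; None if there is none."""
    edges = {}
    for subj, rel, obj in facts:
        if rel == relation:
            edges.setdefault(subj, []).append(obj)
    queue = deque([[src]])
    seen = set()
    while queue:
        path = queue.popleft()
        for nxt in edges.get(path[-1], []):
            if nxt == dst:
                return path + [nxt]
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None


facts = [("alice", "parent", "bob"),
         ("bob", "parent", "carol"),
         ("carol", "parent", "dave")]
nodes = ["alice", "bob", "carol", "dave"]

R = relation_matrix(facts, "parent", nodes)
grandparent = compose(R, R)   # two parent hops
ancestor = closure(R)         # one or more parent hops

print(grandparent[0][2])                              # alice, carol
print(ancestor[0][3])                                 # alice, dave
print(proof_path(facts, "parent", "alice", "dave"))
print(proof_path(facts, "parent", "alice", "zed"))
```

&lt;p&gt;The dense matrices are only there to mirror the algebra. On a real graph a sparse representation or a plain graph search does the same job, and the search is what hands back the proof path.&lt;/p&gt;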
&lt;h2&gt;Ten lines to a guaranteed check&lt;/h2&gt;


&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pip &lt;span class="nb"&gt;install &lt;/span&gt;grounded-reasoning
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;grounded_reasoning&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;GroundedReasoner&lt;/span&gt;

&lt;span class="n"&gt;gr&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;GroundedReasoner&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;gr&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;add_facts&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;
    &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;alice&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;parent&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;bob&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;bob&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;   &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;parent&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;carol&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;carol&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;parent&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;dave&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
&lt;span class="p"&gt;])&lt;/span&gt;

&lt;span class="n"&gt;gr&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;verify&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;alice&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;dave&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;via&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;parent&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="c1"&gt;# Verdict(grounded=True, proof=['alice', 'bob', 'carol', 'dave'], ...)
&lt;/span&gt;
&lt;span class="n"&gt;gr&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;verify&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;alice&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;zed&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;via&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;parent&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="c1"&gt;# Verdict(grounded=False, proof=None)   &amp;lt;- this is what a hallucination looks like
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;The typical integration is a post-filter: the LLM emits relational claims, and&lt;br&gt;
only the ones with an evidence path reach the user.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;llm_claims&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;alice&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;carol&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;parent&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;   &lt;span class="c1"&gt;# true composition
&lt;/span&gt;              &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;alice&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;zed&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;   &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;parent&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;   &lt;span class="c1"&gt;# fabricated
&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;claim&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;verdict&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;gr&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;filter_claims&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;llm_claims&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;KEPT&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;verdict&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;grounded&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;BLOCKED&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;claim&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Measured on real DeepSeek runs (48 multi-hop questions over supplied one-hop&lt;br&gt;
facts, two seeds): raw multi-hop precision was 33–38%, with dozens of invented&lt;br&gt;
names. After the filter: &lt;strong&gt;100% precision, zero correct answers dropped&lt;/strong&gt; —&lt;br&gt;
the filter provably never rejects a claim that has a real path. On the same&lt;br&gt;
CLUTRR benchmark charted at the top of this post, the grounded solver holds flat at ~100%&lt;br&gt;
from 2 to 10 hops.&lt;/p&gt;

&lt;p&gt;The worst raw precision I've seen anywhere in this repo came from a denser,&lt;br&gt;
deliberately anti-commonsense ontology — reversed and counter-intuitive&lt;br&gt;
"is-a" relations layered into the same hierarchy, forcing the model past&lt;br&gt;
whatever it thinks it already knows. Raw precision there fell to &lt;strong&gt;31%&lt;/strong&gt;, with 106&lt;br&gt;
fabricated edges. The filter caught &lt;strong&gt;106 of 106&lt;/strong&gt;, again with zero correct&lt;br&gt;
claims dropped.&lt;/p&gt;

&lt;p&gt;Detecting contradictions comes free, too. If a relation is supposed to be a&lt;br&gt;
hierarchy (is-a, part-of, causes) and the asserted facts contain a cycle,&lt;br&gt;
that's a certificate that something is wrong — and the library hands you the&lt;br&gt;
cycle:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;gr2&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;GroundedReasoner&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;gr2&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;add_facts&lt;/span&gt;&lt;span class="p"&gt;([(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;cat&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;is_a&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;mammal&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
               &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;mammal&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;is_a&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;animal&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
               &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;animal&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;is_a&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;cat&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)])&lt;/span&gt;      &lt;span class="c1"&gt;# oops
&lt;/span&gt;&lt;span class="n"&gt;gr2&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;contradictions&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;is_a&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="c1"&gt;# [['cat', 'mammal', 'animal']]
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
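&lt;p&gt;For readers curious how a cycle certificate can be produced, here is a rough sketch using a depth-first search with three node states. &lt;code&gt;find_cycle&lt;/code&gt; is a hypothetical helper written for this post, not the implementation behind &lt;code&gt;contradictions&lt;/code&gt;, and it returns only the first cycle it meets.&lt;/p&gt;

```python
# Hypothetical helper for illustration only; not the grounded-reasoning API.


def find_cycle(facts, relation):
    """Return one cycle in the relation's graph as a list of nodes, or None if acyclic."""
    edges = {}
    for subj, rel, obj in facts:
        if rel == relation:
            edges.setdefault(subj, []).append(obj)

    WHITE, GREY, BLACK = 0, 1, 2   # unvisited / on the current DFS path / finished
    colour = {}
    path = []

    def visit(node):
        colour[node] = GREY
        path.append(node)
        for nxt in edges.get(node, []):
            state = colour.get(nxt, WHITE)
            if state == GREY:
                # Back edge: nxt is already on the current path, so the path
                # from nxt to node plus this edge closes a cycle.
                return path[path.index(nxt):]
            if state == WHITE:
                cycle = visit(nxt)
                if cycle is not None:
                    return cycle
        colour[node] = BLACK
        path.pop()
        return None

    for start in list(edges):
        if colour.get(start, WHITE) == WHITE:
            cycle = visit(start)
            if cycle is not None:
                return cycle
    return None


is_a_facts = [("cat", "is_a", "mammal"),
              ("mammal", "is_a", "animal"),
              ("animal", "is_a", "cat")]

print(find_cycle(is_a_facts, "is_a"))
print(find_cycle(is_a_facts[:2], "is_a"))
```

&lt;p&gt;The recursion keeps the sketch short; a very deep hierarchy would want an explicit stack instead.&lt;/p&gt;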



&lt;h2&gt;"But I don't have a knowledge base"&lt;/h2&gt;

&lt;p&gt;Fair — most people don't. Two answers, both shipped:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Use the LLM's own facts against it.&lt;/strong&gt; Models are far more reliable on atomic&lt;br&gt;
one-hop facts than on composition. So: ask the model for its one-hop facts,&lt;br&gt;
build the closure locally, and reject any of the model's &lt;em&gt;multi-hop&lt;/em&gt; claims&lt;br&gt;
that contradict its &lt;em&gt;own&lt;/em&gt; closure. Self-contradiction is the hallucination&lt;br&gt;
signal. On a real taxonomy task with DeepSeek this took precision from 78% to&lt;br&gt;
100% using no external knowledge whatsoever. It comes with a proven two-sided bound on&lt;br&gt;
what it costs you in recall, and a documented failure condition: it breaks in&lt;br&gt;
domains where the model's atomic facts are themselves unreliable, such as&lt;br&gt;
"a whale is a fish" trap worlds.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Extracted, noisy graph? Trade the hard guarantee for a coverage guarantee.&lt;/strong&gt;&lt;br&gt;
If an LLM extracts the facts from raw text, edges go missing and the hard&lt;br&gt;
proof-path guarantee no longer applies. A split-conformal threshold over the&lt;br&gt;
operator's confidence score keeps a distribution-free guarantee of the form&lt;br&gt;
"≥ 90% of true claims are kept." In the end-to-end test, DeepSeek's extraction&lt;br&gt;
silently dropped 31% of the edges — and coverage still held at 93%. What&lt;br&gt;
degrades under noise is efficiency (more false positives), never validity.&lt;/p&gt;
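&lt;p&gt;As an illustration of the split-conformal idea, the sketch below picks a score cutoff from a held-out calibration set of claims known to be true. The function names and the scores are invented for the example, and it assumes the calibration scores are exchangeable with the scores of true claims at test time; the library's conformal mode may be implemented differently.&lt;/p&gt;

```python
import math
from fractions import Fraction

# Hypothetical helpers and made-up scores for illustration only.


def conformal_threshold(calibration_scores, alpha=0.1):
    """Score cutoff that keeps a new true claim with probability at least 1 - alpha.

    calibration_scores: confidence scores of held-out claims known to be true.
    """
    n = len(calibration_scores)
    # Take the k-th smallest calibration score with k = floor(alpha * (n + 1)).
    # Fraction avoids float rounding in the product.
    k = math.floor(Fraction(str(alpha)) * (n + 1))
    if k == 0:
        return float("-inf")   # too few calibration points: keep everything
    return sorted(calibration_scores)[k - 1]


def keep(claims_with_scores, threshold):
    """Keep the claims whose score reaches the threshold."""
    return [claim for claim, score in claims_with_scores if score >= threshold]


# 19 made-up calibration scores: 0.05, 0.10, ..., 0.95
calibration = [i / 20 for i in range(1, 20)]
threshold = conformal_threshold(calibration, alpha=0.1)

new_claims = [("claim_a", 0.9), ("claim_b", 0.5), ("claim_c", 0.05)]
print(threshold)
print(keep(new_claims, threshold))
```

&lt;p&gt;The guarantee is marginal: it holds on average over calibration and test draws, which is why noise shows up as a looser cutoff (more false positives) rather than as lost coverage.&lt;/p&gt;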
&lt;h2&gt;What this is not&lt;/h2&gt;

&lt;p&gt;Honesty section, because every tool post needs one:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Not a general reasoner.&lt;/strong&gt; It verifies relational/transitive claims against
a graph. It does nothing for free-standing factual errors ("the Eiffel Tower
is in Rome") — that's a different problem.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Not new math.&lt;/strong&gt; Reachability, Katz, conformal prediction, Horn logic — all
classical. The contribution is the packaging: a unification theorem (the
fuzzy-diffusion view, the operator view, and the spectral view are provably
the same operator, to zero numerical error), measured guarantees with their
token cost on a real LLM, and the no-KB / noisy-KB modes.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Only as good as the graph it's given.&lt;/strong&gt; If the graph is wrong, the hard
guarantee is about the graph, not about the world. The conformal mode
softens exactly this, and nothing else does.&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;Don't take my word for any of this&lt;/h2&gt;

&lt;p&gt;The whole point of a verification tool is that you shouldn't have to trust its&lt;br&gt;
author. Every number in this post is locked in by a test you can run offline:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;git clone https://github.com/ALEXaquarius/grounded-reasoning
&lt;span class="nb"&gt;cd &lt;/span&gt;grounded-reasoning &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; pip &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;-e&lt;/span&gt; &lt;span class="s2"&gt;".[dev]"&lt;/span&gt;
pytest tests/     &lt;span class="c"&gt;# 116 tests, no API key — includes numerical verification&lt;/span&gt;
                  &lt;span class="c"&gt;# of all seven theorems behind the guarantees&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Or run the &lt;a href="https://colab.research.google.com/github/ALEXaquarius/grounded-reasoning/blob/main/examples/quickstart.ipynb" rel="noopener noreferrer"&gt;Colab quickstart&lt;/a&gt;&lt;br&gt;
— it executes the theorem verifications live in about 30 seconds.&lt;/p&gt;

&lt;p&gt;The repo (MIT) ships the library, an OpenAI/Anthropic function-calling tool,&lt;br&gt;
and an MCP server, so an agent can call &lt;code&gt;verify_relation&lt;/code&gt; like any other tool:&lt;br&gt;
&lt;a href="https://github.com/ALEXaquarius/grounded-reasoning" rel="noopener noreferrer"&gt;https://github.com/ALEXaquarius/grounded-reasoning&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;If you can construct a case where the guard accepts a claim with no path — or&lt;br&gt;
rejects one that has a path — that would falsify the core theorem, and I very&lt;br&gt;
much want to see it.&lt;/p&gt;

</description>
      <category>llm</category>
      <category>python</category>
      <category>ai</category>
      <category>machinelearning</category>
    </item>
  </channel>
</rss>
