<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Vaani Sharma</title>
    <description>The latest articles on DEV Community by Vaani Sharma (@vaani_sharma_71ea6aa72cdd).</description>
    <link>https://dev.to/vaani_sharma_71ea6aa72cdd</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3875304%2F99e147ee-1fc9-4ce1-8268-82e6ef2dc8ec.png</url>
      <title>DEV Community: Vaani Sharma</title>
      <link>https://dev.to/vaani_sharma_71ea6aa72cdd</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/vaani_sharma_71ea6aa72cdd"/>
    <language>en</language>
    <item>
      <title>Rules Caught Nothing, Memory Caught Everything.</title>
      <dc:creator>Vaani Sharma</dc:creator>
      <pubDate>Mon, 13 Apr 2026 14:15:12 +0000</pubDate>
      <link>https://dev.to/vaani_sharma_71ea6aa72cdd/rules-caught-nothing-memory-caught-everything-9ni</link>
      <guid>https://dev.to/vaani_sharma_71ea6aa72cdd/rules-caught-nothing-memory-caught-everything-9ni</guid>
      <description>&lt;p&gt;Every invoice processing system has rules. "Flag amounts over $50,000 for manual review." "Reject invoices missing a vendor registration number." These are clear, manageable, and easy to apply.&lt;/p&gt;

&lt;p&gt;The problem is that most cases of invoice fraud, duplicate submissions, and billing mistakes don’t trigger these rules. They look like ordinary invoices. A vendor submitting a near-duplicate invoice, with a matching amount but a different invoice number, passes every field-level check. The pattern only becomes visible when you know the vendor's history.&lt;/p&gt;

&lt;p&gt;Building Finley's decision engine taught me how to blend rule-based checks with pattern detection that comes from experience. Here’s how these two layers work together.&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;The Decision Engine Structure&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;Finley reaches a decision in two steps: an analyzer that generates flags and checks, and a decision builder that interprets them.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Step 4: Contextual analysis&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;analysis&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;analyzeInvoice&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;extracted&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;memory&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="c1"&gt;// Step 5: Decision engine&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;decision&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;buildDecision&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;analysis&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The analyzer examines both the current invoice and the retrieved memories. The decision builder only receives the analysis output. This separation is important: the analyzer interprets, while the decision builder applies fixed logic. The decision builder itself is straightforward: given the same analysis, it produces the same decision every time.&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;Layer 1: Deterministic Checks&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;Some issues don’t require complex reasoning. A missing invoice number is always a problem. An amount that doesn’t match the total of its line items is also always a problem. These are field-level checks that run before any LLM call.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;checks&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
  &lt;span class="p"&gt;{&lt;/span&gt; 
    &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Invoice ID present&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; 
    &lt;span class="na"&gt;pass&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="o"&gt;!!&lt;/span&gt;&lt;span class="nx"&gt;extracted&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;invoiceId&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;severity&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;error&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; 
  &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="p"&gt;{&lt;/span&gt; 
    &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Amount matches line items&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; 
    &lt;span class="na"&gt;pass&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;Math&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;abs&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;lineItemSum&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="nx"&gt;extracted&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;totalAmount&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="mf"&gt;0.01&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;severity&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;warning&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;
  &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="c1"&gt;// ...&lt;/span&gt;
&lt;span class="p"&gt;];&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;These checks run quickly, yield clear results, and catch the obvious issues without using up API credits.&lt;/p&gt;
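
&lt;p&gt;The snippet above leaves out where &lt;em&gt;lineItemSum&lt;/em&gt; comes from. Here is a minimal, self-contained sketch of how the checks could be wired up. The function name &lt;em&gt;runDeterministicChecks&lt;/em&gt; and the &lt;em&gt;lineItems&lt;/em&gt; field are assumptions for illustration, not Finley's actual code.&lt;/p&gt;

```javascript
// Sketch only: runDeterministicChecks and the lineItems field are
// hypothetical names modeled on the checks shown above.
const TOLERANCE = 0.01; // allowed difference between total and line items

function runDeterministicChecks(extracted) {
  const lineItems = extracted.lineItems ?? [];
  // Add up the line item amounts so the total can be compared against them.
  const lineItemSum = lineItems.reduce((sum, item) => sum + item.amount, 0);

  return [
    {
      name: "Invoice ID present",
      pass: Boolean(extracted.invoiceId),
      severity: "error",
    },
    {
      name: "Amount matches line items",
      pass: TOLERANCE > Math.abs(lineItemSum - extracted.totalAmount),
      severity: "warning",
    },
  ];
}

// Made-up sample invoice for illustration.
const results = runDeterministicChecks({
  invoiceId: "INV-2025-0009",
  totalAmount: 47500,
  lineItems: [{ amount: 40000 }, { amount: 7500 }],
});
const failed = results.filter((check) => !check.pass);
console.log(failed.length === 0 ? "all checks passed" : failed);
```

&lt;p&gt;Because nothing here touches the network, the whole layer can run on every invoice before the analyzer prompt is even built.&lt;/p&gt;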

&lt;h2&gt;
  
  
  &lt;strong&gt;Layer 2: Memory-Backed Pattern Detection&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;The more intriguing layer involves what the LLM does with vendor memory. When Finley retrieves 9 previous interactions from &lt;a href="https://github.com/vectorize-io/hindsight" rel="noopener noreferrer"&gt;Hindsight&lt;/a&gt; for a vendor, these memories join the current invoice fields in the analyzer prompt.&lt;/p&gt;

&lt;p&gt;The analyzer can then identify patterns that no static rule would catch:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Duplicate detection with variation:&lt;/strong&gt; "INV-2025-0009 for ₹47,500—vendor submitted INV-2025-0007 for the same amount 3 weeks ago. Similar amounts from this vendor: 3 in the last 6 months, 2 with identical totals."&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Payment terms drift:&lt;/strong&gt; "Invoice states Net-30. Memory shows user has corrected this to Net-45 twice in the past. Vendor consistently invoices on incorrect terms."&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Rounding pattern:&lt;/strong&gt; "Amount is ₹47,500.00. Historical pattern for this vendor shows rounding errors of ₹0.50–₹2.00. This amount is clean, no flags."&lt;/p&gt;

&lt;p&gt;None of these patterns are hard-coded. They emerge from the LLM reasoning over the memory the agent has built up over time. This is the key benefit of &lt;a href="https://vectorize.io/what-is-agent-memory" rel="noopener noreferrer"&gt;agent memory&lt;/a&gt; in a business workflow: the agent improves at spotting vendor-specific issues without anyone needing to write new rules.&lt;/p&gt;
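
&lt;p&gt;To make "memories join the current invoice fields in the analyzer prompt" concrete, here is a rough sketch of the prompt assembly. The memory shape (&lt;em&gt;date&lt;/em&gt;, &lt;em&gt;summary&lt;/em&gt;), the function name, the vendor name, and the prompt wording are all assumptions; the real Hindsight response format and Finley's prompt are not shown in this post.&lt;/p&gt;

```javascript
// Sketch only: buildAnalyzerPrompt and the { date, summary } memory shape
// are hypothetical. No LLM or Hindsight call is made here.
function buildAnalyzerPrompt(extracted, memories) {
  const history = memories.length > 0
    ? memories.map((m, i) => `${i + 1}. [${m.date}] ${m.summary}`).join("\n")
    : "(no prior interactions on record)";

  return [
    "You are reviewing an invoice. Compare it against the vendor history.",
    "Report duplicates, payment terms drift, and unusual amounts.",
    "",
    "Current invoice:",
    JSON.stringify(extracted, null, 2),
    "",
    "Vendor history:",
    history,
  ].join("\n");
}

// Made-up sample data for illustration.
const prompt = buildAnalyzerPrompt(
  { invoiceId: "INV-2025-0009", vendor: "Acme Supplies", totalAmount: 47500 },
  [{ date: "2025-03-01", summary: "INV-2025-0007 for 47500 approved" }]
);
console.log(prompt);
```

&lt;p&gt;The point of the sketch is that no pattern logic lives in code: the history is simply placed next to the invoice, and the model does the comparison.&lt;/p&gt;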

&lt;h2&gt;
  
  
  &lt;strong&gt;The Flag/Check Distinction&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;The analysis output contains two separate lists: &lt;em&gt;flags&lt;/em&gt; and &lt;em&gt;checks&lt;/em&gt;. Flags indicate problems. Checks record the validations that ran and whether each passed.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nl"&gt;flags&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt; 
      &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;potential_duplicate&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;message&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Similar invoice amount submitted 3 weeks ago (INV-2025-0007)&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;severity&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;high&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;memoryBacked&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="p"&gt;],&lt;/span&gt;
  &lt;span class="nx"&gt;checks&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Vendor registered&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;pass&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Invoice date valid&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;pass&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="c1"&gt;// ...&lt;/span&gt;
  &lt;span class="p"&gt;],&lt;/span&gt;
  &lt;span class="nx"&gt;confidence&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;87&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;em&gt;memoryBacked&lt;/em&gt; field on flags is a significant design choice. It tells the decision builder, and the user, whether a flag comes from field-level validation (which is always dependable) or from memory-based pattern detection (which relies on the quality of the memory). A flag backed by nine high-quality previous interactions is more trustworthy than one backed by a single interaction.&lt;/p&gt;
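
&lt;p&gt;One way to surface that difference to a reviewer is a small labeling helper. This is only a sketch: &lt;em&gt;describeFlagSource&lt;/em&gt;, the &lt;em&gt;memoryCount&lt;/em&gt; argument, and the cutoff of 3 are assumptions chosen for illustration, not values from Finley.&lt;/p&gt;

```javascript
// Sketch only: describeFlagSource is hypothetical. memoryCount is assumed to
// be the number of retrieved interactions that supported the flag, and the
// cutoff of 3 is an arbitrary illustration.
const WELL_SUPPORTED_MIN = 3;

function describeFlagSource(flag, memoryCount) {
  if (!flag.memoryBacked) {
    return "field-level validation";
  }
  const strength = memoryCount >= WELL_SUPPORTED_MIN
    ? "well supported"
    : "weakly supported";
  return `memory-based pattern (${memoryCount} prior interactions, ${strength})`;
}

const duplicateFlag = { type: "potential_duplicate", severity: "high", memoryBacked: true };
console.log(describeFlagSource(duplicateFlag, 9));
console.log(describeFlagSource(duplicateFlag, 1));
```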

&lt;h2&gt;
  
  
  &lt;strong&gt;The Verdict Logic&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;&lt;em&gt;buildDecision&lt;/em&gt; translates the analysis output into a verdict based on clear thresholds:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Any &lt;em&gt;severity: "error"&lt;/em&gt; flag → &lt;em&gt;reject&lt;/em&gt;
&lt;/li&gt;
&lt;li&gt;Any &lt;em&gt;severity: "high"&lt;/em&gt; flag → &lt;em&gt;flag&lt;/em&gt;(i.e hold for review)&lt;/li&gt;
&lt;li&gt;Multiple &lt;em&gt;severity: "medium"&lt;/em&gt; flags → &lt;em&gt;flag&lt;/em&gt;
&lt;/li&gt;
&lt;li&gt;All checks pass and there are no significant flags → &lt;em&gt;approve&lt;/em&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The confidence score from the analyzer feeds into the result but doesn’t override the decision logic. A 90% confidence duplicate flag still results in a hold; the confidence is informational, not a deciding factor.&lt;/p&gt;
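
&lt;p&gt;The thresholds above translate into a few lines of code. This is a sketch rather than Finley's source, and it makes two assumptions: "multiple" means two or more medium flags, and any failed check has already been turned into a flag by the analyzer.&lt;/p&gt;

```javascript
// Sketch only. Assumptions: "multiple" medium flags means two or more, and
// failed checks have already been promoted to flags by the analyzer.
function buildDecision(analysis) {
  const flags = analysis.flags ?? [];
  const count = (severity) =>
    flags.filter((flag) => flag.severity === severity).length;

  let verdict = "approve";
  if (count("error") > 0) {
    verdict = "reject";
  } else if (count("high") > 0) {
    verdict = "flag"; // hold for review
  } else if (count("medium") > 1) {
    verdict = "flag";
  }

  // Confidence is passed through for display; it never changes the verdict.
  return { verdict, confidence: analysis.confidence };
}

const decision = buildDecision({
  flags: [{ type: "potential_duplicate", severity: "high", memoryBacked: true }],
  checks: [{ name: "Vendor registered", pass: true }],
  confidence: 90,
});
console.log(decision); // { verdict: 'flag', confidence: 90 }
```

&lt;p&gt;The function reads nothing but its argument, which is what makes the same analysis always produce the same verdict.&lt;/p&gt;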

&lt;h2&gt;
  
  
  &lt;strong&gt;What Doesn't Work&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;The current design has a real flaw: memory quality can decline if users consistently approve items that should be flagged. If an accountant approves duplicate invoices for months, the agent's memory fills with "approved" actions for duplicates. Future pattern detection will weaken because the historical signal becomes contradictory.&lt;/p&gt;

&lt;p&gt;The solution is to track feedback quality: flag cases where user actions repeatedly contradict agent recommendations, and highlight them to reviewers. We haven’t built this yet, but it’s the logical next step.&lt;/p&gt;
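
&lt;p&gt;Since this part is not built, the following is purely a sketch of what such tracking might look like. The record shape (&lt;em&gt;vendor&lt;/em&gt;, &lt;em&gt;recommended&lt;/em&gt;, &lt;em&gt;userAction&lt;/em&gt;), the function name, and the default threshold of 3 are all hypothetical.&lt;/p&gt;

```javascript
// Sketch only: findContradictedVendors and the record shape are hypothetical.
// It counts, per vendor, how often the user approved an invoice that the
// agent recommended holding or rejecting.
function findContradictedVendors(history, minContradictions = 3) {
  const counts = new Map();
  for (const entry of history) {
    if (entry.recommended === "approve") continue; // agent had no objection
    if (entry.userAction !== "approve") continue; // user agreed with the agent
    counts.set(entry.vendor, (counts.get(entry.vendor) ?? 0) + 1);
  }
  return [...counts]
    .filter(([, n]) => n >= minContradictions)
    .map(([vendor, n]) => ({ vendor, contradictions: n }));
}

// Made-up sample history for illustration.
const feedbackHistory = [
  { vendor: "Acme Supplies", recommended: "flag", userAction: "approve" },
  { vendor: "Acme Supplies", recommended: "flag", userAction: "approve" },
  { vendor: "Acme Supplies", recommended: "reject", userAction: "approve" },
  { vendor: "Acme Supplies", recommended: "approve", userAction: "approve" },
  { vendor: "Other Vendor", recommended: "flag", userAction: "approve" },
  { vendor: "Other Vendor", recommended: "flag", userAction: "reject" },
];
const needsReview = findContradictedVendors(feedbackHistory);
console.log(needsReview);
```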

&lt;p&gt;Another limitation is that memory retrieval from &lt;a href="https://hindsight.vectorize.io/" rel="noopener noreferrer"&gt;Hindsight&lt;/a&gt; returns only the top 20 most relevant entries. For vendors with many invoices, those 20 might not include the specific earlier duplicate that matters most. Better retrieval query design, such as filtering by invoice amount range, would help.&lt;/p&gt;
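
&lt;p&gt;To illustrate the amount-range idea without guessing at Hindsight's query API, here is a sketch that runs over a plain in-memory array of past invoices. The function name, the record shape, and the 5% tolerance are assumptions; in practice the filter would need to happen at retrieval time, before the top 20 cutoff is applied.&lt;/p&gt;

```javascript
// Sketch only: selectByAmountRange works on a plain array and is a stand-in
// for a retrieval-time filter. It is not a Hindsight API call.
function selectByAmountRange(records, amount, tolerancePct = 5, limit = 20) {
  const delta = amount * (tolerancePct / 100);
  return records
    .filter((r) => r.amount >= amount - delta)
    .filter((r) => amount + delta >= r.amount)
    .sort((a, b) => b.date.localeCompare(a.date)) // newest first (ISO dates)
    .slice(0, limit);
}

// Made-up sample records for illustration.
const pastInvoices = [
  { invoiceId: "INV-2025-0003", amount: 47000, date: "2025-01-10" },
  { invoiceId: "INV-2025-0005", amount: 10000, date: "2025-02-01" },
  { invoiceId: "INV-2025-0007", amount: 47500, date: "2025-03-01" },
];
const candidates = selectByAmountRange(pastInvoices, 47500);
console.log(candidates.map((r) => r.invoiceId));
```

&lt;p&gt;Narrowing by amount first means the limited slots go to invoices that could plausibly be duplicates, rather than to whatever happens to rank highest on general relevance.&lt;/p&gt;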

&lt;h2&gt;
  
  
  &lt;strong&gt;The Takeaway&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;Rules are necessary and straightforward. Pattern detection from memory is what truly makes the agent useful. The effective structure: run deterministic checks first, then give the LLM memory context so it can identify patterns that rules won’t catch. Keep the decision logic that sits on top of both layers simple and deterministic. Also, monitor whether user feedback strengthens or harms the memory the agent relies on.&lt;/p&gt;

&lt;p&gt;Finley is at &lt;a href="https://finley-rho.vercel.app" rel="noopener noreferrer"&gt;finley-rho.vercel.app&lt;/a&gt;.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>automation</category>
      <category>learning</category>
      <category>agents</category>
    </item>
  </channel>
</rss>
