<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Vaibhav Kushwaha</title>
    <description>The latest articles on DEV Community by Vaibhav Kushwaha (@vaibhav_kushwaha_e8eb243e).</description>
    <link>https://dev.to/vaibhav_kushwaha_e8eb243e</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3875272%2F8c34d3b7-cb9b-4632-b1eb-270942fd1232.png</url>
      <title>DEV Community: Vaibhav Kushwaha</title>
      <link>https://dev.to/vaibhav_kushwaha_e8eb243e</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/vaibhav_kushwaha_e8eb243e"/>
    <language>en</language>
    <item>
      <title>Getting Claude to Extract Invoice Data Reliably — What I Learned Building Finley</title>
      <dc:creator>Vaibhav Kushwaha</dc:creator>
      <pubDate>Sun, 12 Apr 2026 17:51:57 +0000</pubDate>
      <link>https://dev.to/vaibhav_kushwaha_e8eb243e/finley-j3h</link>
      <guid>https://dev.to/vaibhav_kushwaha_e8eb243e/finley-j3h</guid>
      <description>&lt;p&gt;Parsing invoices with an LLM sounds simple until you actually do it. Invoices from different vendors have different layouts, use different field names, and encode amounts in different formats. A senior accountant handles this through pattern recognition built over years. I had a weekend.&lt;/p&gt;

&lt;p&gt;Here's what I learned building the extraction layer for Finley, an AI accounts payable agent that processes invoices through an 8-step pipeline and uses &lt;a href="https://github.com/vectorize-io/hindsight" rel="noopener noreferrer"&gt;Hindsight&lt;/a&gt; to remember what it learns.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Extraction Problem Is Harder Than It Looks
&lt;/h2&gt;

&lt;p&gt;The naive approach — ask the LLM "extract the invoice number, vendor name, total amount, and payment terms" — works on clean PDFs from large vendors. It breaks on:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Scanned invoices with skewed text&lt;/li&gt;
&lt;li&gt;Invoices that embed total amounts inside paragraph descriptions&lt;/li&gt;
&lt;li&gt;Payment terms written as "30 days from receipt" vs. "Net 30" vs. "NET-30"&lt;/li&gt;
&lt;li&gt;GST/VAT breakdowns where the "total" field is ambiguous&lt;/li&gt;
&lt;/ul&gt;
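&lt;p&gt;As a rough sketch of why the payment-terms variation alone is annoying, here is a hypothetical regex fallback (not Finley's actual code, which delegates this translation to the LLM):&lt;/p&gt;

```javascript
// Hypothetical sketch: collapsing payment-terms variants to a "Net-30" style.
// Finley has the LLM do this normalization; a regex fallback might look like:
function normalizePaymentTerms(raw) {
  if (!raw) return null;
  // Branch 1 catches "30 days from receipt"; branch 2 catches "Net 30" / "NET-30".
  const match = raw.match(/(\d+)\s*day|net[\s-]*(\d+)/i);
  if (!match) return raw.trim();
  const days = match[1] || match[2];
  return "Net-" + days;
}
```

Even this toy version shows the problem: every new vendor phrasing means another regex branch, which is exactly the pattern-matching burden the LLM absorbs.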

&lt;p&gt;The real challenge isn't the extraction itself. It's getting structured, typed output that downstream code can depend on.&lt;/p&gt;

&lt;h2&gt;
  
  
  Structured Output Is the Only Viable Approach
&lt;/h2&gt;

&lt;p&gt;The extraction service in Finley sends the invoice content to Claude and explicitly requests a JSON schema in the response. The schema pins the field names, types, and acceptable values so the downstream analyzer doesn't have to guess.&lt;/p&gt;

&lt;p&gt;The extracted payload looks like:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;{
  vendorName: "Prakash Office Supplies Pvt. Ltd.",
  invoiceId: "INV-2025-0009",
  invoiceDate: "2025-01-15",
  totalAmount: 47500,
  paymentTerms: "Net-30",
  lineItems: [
    { description: "A4 Copy Paper (500 sheets)", quantity: 10, unitPrice: 450, total: 4500 }
    // ...
  ],
  currency: "INR"
}
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;&lt;code&gt;totalAmount&lt;/code&gt; is always a number. &lt;code&gt;paymentTerms&lt;/code&gt; is always a string in a normalized format. The LLM does the translation — "thirty days" → "Net-30" — so the analyzer never has to.&lt;/p&gt;
&lt;h2&gt;
  
  
  The Prompt Carries the Business Logic
&lt;/h2&gt;

&lt;p&gt;The extraction prompt does more than "extract fields." It encodes accounting-domain knowledge: what constitutes a valid invoice total, how to handle line item math discrepancies, which fields are optional vs. required. That domain context is what makes Claude useful here over a simpler regex approach.&lt;/p&gt;

&lt;p&gt;One thing that helped: including explicit examples of edge cases in the extraction prompt. Not just "extract payment terms" but "payment terms may appear as 'Net 30', 'NET-30', '30 days from invoice date', or 'payment due in 30 days' — normalize to 'Net-30' format." The LLM handles variation better when you name the variations.&lt;/p&gt;
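&lt;p&gt;Paraphrased, the edge-case section of such a prompt reads something like the following (this is an illustrative fragment, not Finley's exact prompt text):&lt;/p&gt;

```javascript
// Illustrative prompt fragment, paraphrased from the rules described above.
const EXTRACTION_RULES = [
  'paymentTerms may appear as "Net 30", "NET-30", "30 days from invoice date",',
  'or "payment due in 30 days" -- normalize all of them to "Net-30" format.',
  "totalAmount is the final payable amount, not the first number labeled total.",
  "Return null for fields the invoice does not contain; never omit or invent them.",
].join("\n");
```

Each line names a concrete variation rather than gesturing at "handle edge cases", which is the difference the paragraph above describes.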
&lt;h2&gt;
  
  
  Memory Changes What Extraction Can Do
&lt;/h2&gt;

&lt;p&gt;Here's where it gets interesting. Finley uses &lt;a href="https://hindsight.vectorize.io/" rel="noopener noreferrer"&gt;Hindsight agent memory&lt;/a&gt; — and memory affects the extraction step in a way I didn't anticipate when I started.&lt;/p&gt;

&lt;p&gt;After a few invoices from the same vendor, Finley has stored observations like "this vendor's invoices use non-standard payment terms that should be corrected to Net-45 per contract." On the next invoice from that vendor, that memory feeds into the analysis step, not the extraction step — but the effect is the same as if extraction had been smarter. The extracted &lt;code&gt;paymentTerms&lt;/code&gt; field comes back as "Net-30" (what the invoice says), and the analyzer then flags it using memory: "contract terms are Net-45, vendor consistently invoices Net-30, user has corrected this 3 times."&lt;/p&gt;

&lt;p&gt;This separation matters architecturally. Extraction reports facts. Analysis applies context. Memory is where the context lives.&lt;/p&gt;
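&lt;p&gt;The split can be sketched in a few lines. This is a simplified illustration of the division of labor, with invented names (&lt;code&gt;vendor&lt;/code&gt;, &lt;code&gt;contractTerms&lt;/code&gt;), not Finley's actual analyzer:&lt;/p&gt;

```javascript
// Sketch of the extraction/analysis split: extraction reports what the invoice
// says; analysis compares it against remembered context. Names are illustrative.
function analyze(extracted, memories) {
  const flags = [];
  for (const m of memories) {
    if (m.vendor === extracted.vendorName) {
      if (m.contractTerms !== extracted.paymentTerms) {
        flags.push(
          "Invoice says " + extracted.paymentTerms +
          " but contract terms are " + m.contractTerms
        );
      }
    }
  }
  return flags; // extraction stays untouched; context only shapes the flags
}
```

Note that the extracted payload is never mutated: the "Net-30" fact survives, and the Net-45 correction lives entirely in the analysis output.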
&lt;h2&gt;
  
  
  The Manual Data Path Is Underrated
&lt;/h2&gt;

&lt;p&gt;Finley has a second input path alongside file upload: a &lt;code&gt;manual&lt;/code&gt; JSON field that lets you pass pre-structured invoice data directly to the pipeline. We used this for demos, but it has a real production use case: if you already have invoice data from an ERP or structured email, you can skip LLM extraction entirely and still get memory retrieval, analysis, and a decision.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;manualData&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;body&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;manual&lt;/span&gt; &lt;span class="p"&gt;?&lt;/span&gt; &lt;span class="nx"&gt;JSON&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;parse&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;body&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;manual&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;extracted&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;manualData&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;extractInvoiceData&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;fileBuffer&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;file&lt;/span&gt;&lt;span class="p"&gt;?.&lt;/span&gt;&lt;span class="nx"&gt;mimetype&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That two-line fallback means the extraction layer is optional without changing the pipeline contract. Worth designing in from the start.&lt;/p&gt;

&lt;h2&gt;
  
  
  What Fails in Practice
&lt;/h2&gt;

&lt;p&gt;Three things go wrong most often in invoice extraction:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Multi-page totals.&lt;/strong&gt; If the summary total is on page 1 and the line items are on pages 2-3, a naive extraction might grab a subtotal instead of the final total. The prompt needs to explicitly instruct the LLM to find the final payable amount, not the first number labeled "total."&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Currency ambiguity.&lt;/strong&gt; "1,00,000" is a valid Indian number format (one lakh). To US-trained models, it looks like "100,000" with a weird comma. Explicitly calling out the currency in the prompt — and including examples — reduces this error.&lt;/p&gt;
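&lt;p&gt;Post-processing can sidestep part of the ambiguity: once grouping separators are stripped, "1,00,000" and "100,000" parse to the same value. A small hypothetical helper (not Finley's code) illustrates this:&lt;/p&gt;

```javascript
// Hypothetical sketch: parse amounts written with Indian digit grouping
// ("1,00,000" is one lakh, i.e. 100000). Stripping separators first means
// the comma placement no longer matters.
function parseAmount(raw) {
  const cleaned = String(raw).replace(/[,\s]/g, "");
  const value = Number(cleaned);
  return Number.isFinite(value) ? value : null;
}
```

This only fixes the numeric value, not the currency itself; knowing whether that value is INR or USD still has to come from the prompt and the extracted `currency` field.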

&lt;p&gt;&lt;strong&gt;Missing fields vs. inapplicable fields.&lt;/strong&gt; An invoice might not have payment terms because it's a cash sale. The LLM should return &lt;code&gt;null&lt;/code&gt; for missing optional fields, not omit them or invent plausible values. Prompt explicitly for null vs. omission behavior.&lt;/p&gt;
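&lt;p&gt;The null-vs-omission distinction is easy to check mechanically. A sketch of such a check, with a hypothetical helper name, assuming optional fields must be present as explicit &lt;code&gt;null&lt;/code&gt;s:&lt;/p&gt;

```javascript
// Sketch: distinguish "field present but null" (allowed) from "field omitted
// entirely" (a prompt violation). Illustrative, not Finley's actual validator.
function findOmittedFields(data, optionalFields) {
  return optionalFields.filter(function (field) {
    return !(field in data); // "in" sees explicit nulls, misses omitted keys
  });
}
```

If this list is ever non-empty, the prompt's null-vs-omission instruction is not being followed, and silently defaulting the value would hide exactly the kind of invented data you are trying to prevent.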

&lt;h2&gt;
  
  
  The Takeaway
&lt;/h2&gt;

&lt;p&gt;LLM invoice extraction is genuinely useful, but it requires treating the prompt as a schema specification, not a question. Define the output format precisely, encode domain edge cases explicitly, and separate extraction (facts) from analysis (context). Memory — through &lt;a href="https://vectorize.io/what-is-agent-memory" rel="noopener noreferrer"&gt;Hindsight&lt;/a&gt; in Finley's case — sits in the analysis layer, not the extraction layer, and that separation keeps both cleaner.&lt;/p&gt;

&lt;p&gt;You can see Finley running at &lt;a href="https://finley-rho.vercel.app" rel="noopener noreferrer"&gt;finley-rho.vercel.app&lt;/a&gt;.&lt;/p&gt;

</description>
    </item>
  </channel>
</rss>
