<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Mohit</title>
    <description>The latest articles on DEV Community by Mohit (@flux8labs).</description>
    <link>https://dev.to/flux8labs</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3960092%2F79babcb4-fd1f-4f3e-bb44-8e289a3949cc.jpg</url>
      <title>DEV Community: Mohit</title>
      <link>https://dev.to/flux8labs</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/flux8labs"/>
    <language>en</language>
    <item>
      <title>How We Automated Purchase Orders From Gmail to Tally Using GPT-4 (98% Extraction Accuracy)</title>
      <dc:creator>Mohit</dc:creator>
      <pubDate>Thu, 11 Jun 2026 12:30:00 +0000</pubDate>
      <link>https://dev.to/flux8labs/how-we-automated-purchase-orders-from-gmail-to-tally-using-gpt-4-98-extraction-accuracy-ceg</link>
      <guid>https://dev.to/flux8labs/how-we-automated-purchase-orders-from-gmail-to-tally-using-gpt-4-98-extraction-accuracy-ceg</guid>
      <description>&lt;p&gt;At 9:14am on a Tuesday, the system flagged an incoming purchase order from a large enterprise buyer as a duplicate.&lt;/p&gt;

&lt;p&gt;The PO had arrived in two separate emails over 48 hours, sent by different procurement contacts. Both covered the same batch of stainless steel flanges, with the same quantities and the same delivery window.&lt;/p&gt;

&lt;p&gt;Under the old system, a staff member would have read both, entered both into Tally, and allocated raw material stock twice. The first sign of the error would have been an inventory shortfall two weeks later.&lt;/p&gt;

&lt;p&gt;The client is a Jaipur-based precision manufacturer serving enterprise buyers in India and overseas. With roughly ₹60Cr in annual revenue, the company handled a steady flow of purchase orders from a demanding customer base.&lt;/p&gt;

&lt;p&gt;Every one of those orders arrived as a PDF in a shared Gmail inbox.&lt;/p&gt;

&lt;p&gt;Every one of those PDFs was read and entered into Tally by hand.&lt;/p&gt;

&lt;p&gt;This is the build log for the system built to replace that process.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Problem: PDFs in Gmail, Nobody Watching
&lt;/h2&gt;

&lt;p&gt;The procurement workflow before the build:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Emails arrive in a shared Gmail inbox.&lt;/li&gt;
&lt;li&gt;A staff member opens each attachment.&lt;/li&gt;
&lt;li&gt;Reads part numbers, quantities, delivery deadlines, and supplier codes.&lt;/li&gt;
&lt;li&gt;Manually enters everything into Tally.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;On a slow day, this took around 90 minutes.&lt;/p&gt;

&lt;p&gt;On heavier order days, it could stretch to 3–4 hours.&lt;/p&gt;

&lt;p&gt;The inbox had no workflow state:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;No processed flag&lt;/li&gt;
&lt;li&gt;No queue&lt;/li&gt;
&lt;li&gt;No audit trail outside Tally&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If the same PO arrived twice, the team would find out only by accident.&lt;/p&gt;

&lt;h3&gt;
  
  
  Duplicate Orders
&lt;/h3&gt;

&lt;p&gt;There was no detection mechanism.&lt;/p&gt;

&lt;p&gt;Two contacts at the same enterprise customer could send the same PO independently, and neither Gmail nor Tally would flag it.&lt;/p&gt;

&lt;h3&gt;
  
  
  No Operational Visibility
&lt;/h3&gt;

&lt;p&gt;Knowing which orders were due the following week required opening Tally and manually cross-referencing entries.&lt;/p&gt;

&lt;p&gt;There was:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;No dashboard&lt;/li&gt;
&lt;li&gt;No queue view&lt;/li&gt;
&lt;li&gt;No workload overview&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Manual Raw Material Calculations
&lt;/h3&gt;

&lt;p&gt;Once a PO was entered, another manual calculation followed:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Pulling specifications from the PDF&lt;/li&gt;
&lt;li&gt;Checking stock levels&lt;/li&gt;
&lt;li&gt;Estimating requirements&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This introduced a second opportunity for human error.&lt;/p&gt;




&lt;h2&gt;
  
  
  What Was Built: A Four-Stage Pipeline
&lt;/h2&gt;

&lt;p&gt;Each stage solved a different operational problem.&lt;/p&gt;

&lt;h3&gt;
  
  
  1. Gmail Push Notifications
&lt;/h3&gt;

&lt;p&gt;Instead of polling Gmail every few minutes, the system asks the Gmail API to watch the inbox and publish mailbox changes to a Google Cloud Pub/Sub topic. A push subscription on that topic calls the system's webhook whenever a new email arrives.&lt;/p&gt;

&lt;p&gt;Benefits:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Near real-time processing&lt;/li&gt;
&lt;li&gt;Lower infrastructure overhead&lt;/li&gt;
&lt;li&gt;Faster visibility for operations teams&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;A PO is processed before a staff member would have opened the email.&lt;/p&gt;
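
&lt;p&gt;As a rough illustration of the webhook's first step, the sketch below decodes a Pub/Sub push envelope carrying a Gmail notification. The function name &lt;code&gt;decode_gmail_push&lt;/code&gt; and the returned key names are assumptions made for this example, not the production code. It relies only on the documented shape of the notification: a base64url-encoded JSON payload with &lt;code&gt;emailAddress&lt;/code&gt; and &lt;code&gt;historyId&lt;/code&gt;.&lt;/p&gt;

```python
import base64
import json


def decode_gmail_push(envelope: dict) -> dict:
    """Decode a Pub/Sub push envelope that carries a Gmail notification."""
    message = envelope.get("message")
    if not isinstance(message, dict) or "data" not in message:
        raise ValueError("not a Pub/Sub push envelope")
    data = message["data"]
    # Restore any stripped base64 padding before decoding.
    padded = data + "=" * (-len(data) % 4)
    payload = json.loads(base64.urlsafe_b64decode(padded))
    return {
        "email_address": payload["emailAddress"],
        # Normalise to a string so it can be stored and compared consistently.
        "history_id": str(payload["historyId"]),
        "message_id": message.get("messageId"),
    }
```

&lt;p&gt;The notification does not contain the email itself. The &lt;code&gt;historyId&lt;/code&gt; acts as a cursor, so a handler along these lines would still need to ask the Gmail API which messages arrived since the last cursor it processed.&lt;/p&gt;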

&lt;h3&gt;
  
  
  2. GPT-4 Structured Extraction
&lt;/h3&gt;

&lt;p&gt;The system sends the purchase order to GPT-4 together with a strict JSON schema that the response must follow.&lt;/p&gt;

&lt;p&gt;Required fields include:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;PO number&lt;/li&gt;
&lt;li&gt;Supplier details&lt;/li&gt;
&lt;li&gt;Delivery date&lt;/li&gt;
&lt;li&gt;Line items&lt;/li&gt;
&lt;li&gt;Quantities&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The model returns structured JSON directly.&lt;/p&gt;

&lt;p&gt;This removes the need for:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Regex maintenance&lt;/li&gt;
&lt;li&gt;Template mapping&lt;/li&gt;
&lt;li&gt;Field-position assumptions&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;For scanned PDFs, the vision endpoint is used.&lt;/p&gt;

&lt;p&gt;For text PDFs, extracted text is sent directly.&lt;/p&gt;

&lt;p&gt;Both paths produce the same JSON output.&lt;/p&gt;
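
&lt;p&gt;To make the idea of a strict schema concrete, here is a minimal sketch of what one could look like for the fields listed above, plus a small check that runs on whatever JSON comes back. The field names (&lt;code&gt;po_number&lt;/code&gt;, &lt;code&gt;part_number&lt;/code&gt; and so on) and the helper &lt;code&gt;missing_fields&lt;/code&gt; are hypothetical. The actual schema used in the project is not reproduced here.&lt;/p&gt;

```python
# Hypothetical schema: field names are illustrative, not the production ones.
PO_SCHEMA = {
    "type": "object",
    "additionalProperties": False,
    "required": ["po_number", "supplier", "delivery_date", "line_items"],
    "properties": {
        "po_number": {"type": "string"},
        "supplier": {"type": "string"},
        "delivery_date": {"type": "string", "description": "ISO 8601 date"},
        "line_items": {
            "type": "array",
            "items": {
                "type": "object",
                "additionalProperties": False,
                "required": ["part_number", "quantity"],
                "properties": {
                    "part_number": {"type": "string"},
                    "quantity": {"type": "number"},
                },
            },
        },
    },
}


def missing_fields(po: dict) -> list[str]:
    """Return dotted paths of required fields absent from an extracted PO."""
    missing = [key for key in PO_SCHEMA["required"] if key not in po]
    item_required = PO_SCHEMA["properties"]["line_items"]["items"]["required"]
    for index, item in enumerate(po.get("line_items", [])):
        missing += [
            f"line_items[{index}].{key}" for key in item_required if key not in item
        ]
    return missing
```

&lt;p&gt;Keeping one schema for both the text path and the vision path is what lets the downstream stages ignore where the JSON came from.&lt;/p&gt;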

&lt;h3&gt;
  
  
  3. Tally Integration
&lt;/h3&gt;

&lt;p&gt;The extracted JSON is converted into a Tally-compatible XML purchase voucher and sent to Tally Prime's local HTTP server.&lt;/p&gt;

&lt;p&gt;Once accepted:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The voucher is created automatically.&lt;/li&gt;
&lt;li&gt;Inventory calculations run immediately.&lt;/li&gt;
&lt;li&gt;The order appears exactly as if a user entered it manually.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  4. Duplicate Detection
&lt;/h3&gt;

&lt;p&gt;Every incoming PO is stored in PostgreSQL.&lt;/p&gt;

&lt;p&gt;A normalized fingerprint is generated using:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Line items&lt;/li&gt;
&lt;li&gt;Quantities&lt;/li&gt;
&lt;li&gt;Delivery windows&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This allows the system to detect duplicate orders before they reach Tally.&lt;/p&gt;

&lt;p&gt;The duplicate mentioned earlier was caught using this mechanism.&lt;/p&gt;
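
&lt;p&gt;One way to build such a fingerprint is sketched below. It assumes the hypothetical field names &lt;code&gt;part_number&lt;/code&gt;, &lt;code&gt;quantity&lt;/code&gt; and &lt;code&gt;delivery_date&lt;/code&gt;, and it treats the delivery window as a single date for simplicity. The sender and the PO number are deliberately left out, so two emails from different contacts hash to the same value when the order content matches.&lt;/p&gt;

```python
import hashlib
import json


def po_fingerprint(po: dict) -> str:
    """Hash the order content, ignoring sender, PO number, and item order."""
    items = sorted(
        (str(item["part_number"]).strip().upper(), float(item["quantity"]))
        for item in po["line_items"]
    )
    # A canonical JSON string gives a stable input for the hash.
    canonical = json.dumps(
        {"items": items, "delivery_date": po["delivery_date"].strip()},
        sort_keys=True,
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

&lt;p&gt;Storing this value in its own PostgreSQL column would let a simple lookup answer the question "has this order been seen before?" ahead of any Tally call.&lt;/p&gt;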




&lt;h2&gt;
  
  
  The Technology Stack
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Backend
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;FastAPI&lt;/li&gt;
&lt;li&gt;PostgreSQL&lt;/li&gt;
&lt;li&gt;Gmail API&lt;/li&gt;
&lt;li&gt;Google Pub/Sub&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Frontend
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;React&lt;/li&gt;
&lt;li&gt;Framer Motion&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  AI Layer
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;GPT-4 Structured Outputs&lt;/li&gt;
&lt;li&gt;GPT-4 Vision&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  ERP Layer
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Tally Prime via XML imports over its HTTP server&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Working with the Tally API
&lt;/h2&gt;

&lt;p&gt;The official documentation is short on complete, working examples.&lt;/p&gt;

&lt;p&gt;The most reliable approach is:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Enable Tally's HTTP server.&lt;/li&gt;
&lt;li&gt;Generate XML vouchers in Tally's import format.&lt;/li&gt;
&lt;li&gt;POST them to Tally.&lt;/li&gt;
&lt;li&gt;Parse the acknowledgement response.&lt;/li&gt;
&lt;/ol&gt;
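
&lt;p&gt;Steps 3 and 4 above can be sketched with the Python standard library alone. Several details here are assumptions rather than facts from this build: the URL &lt;code&gt;http://localhost:9000&lt;/code&gt; (a commonly used default port), and the response tags &lt;code&gt;CREATED&lt;/code&gt;, &lt;code&gt;ERRORS&lt;/code&gt; and &lt;code&gt;LINEERROR&lt;/code&gt;, which should be checked against what your Tally version actually returns. The function names are illustrative.&lt;/p&gt;

```python
import urllib.request
import xml.etree.ElementTree as ET

# Assumed endpoint: Tally's HTTP server on a commonly used default port.
TALLY_URL = "http://localhost:9000"


def post_to_tally(xml_body: str, url: str = TALLY_URL, timeout: float = 10.0) -> str:
    """POST an XML envelope to Tally and return the raw response text."""
    request = urllib.request.Request(
        url,
        data=xml_body.encode("utf-8"),
        headers={"Content-Type": "text/xml; charset=utf-8"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")


def parse_ack(response_xml: str) -> dict:
    """Summarise an import response: counters plus any line-level error text."""
    root = ET.fromstring(response_xml)

    def counter(tag: str) -> int:
        # Missing or non-numeric counters are treated as zero.
        node = root.find(f".//{tag}")
        text = (node.text or "").strip() if node is not None else ""
        return int(text) if text.isdigit() else 0

    messages = [(node.text or "").strip() for node in root.iter("LINEERROR")]
    created = counter("CREATED")
    errors = counter("ERRORS")
    return {
        "created": created,
        "errors": errors,
        "messages": messages,
        "ok": created > 0 and errors == 0 and not messages,
    }
```

&lt;p&gt;Treating anything other than a clean "created, no errors" result as a failure keeps a rejected voucher from being mistaken for a posted one.&lt;/p&gt;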

&lt;h3&gt;
  
  
  Enable Tally HTTP Server
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Gateway of Tally
→ F12
→ Advanced Configuration
→ Enable ODBC/HTTP Server
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Minimal Purchase Voucher Example
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight xml"&gt;&lt;code&gt;&lt;span class="nt"&gt;&amp;lt;ENVELOPE&amp;gt;&lt;/span&gt;
  &lt;span class="nt"&gt;&amp;lt;HEADER&amp;gt;&lt;/span&gt;
    &lt;span class="nt"&gt;&amp;lt;TALLYREQUEST&amp;gt;&lt;/span&gt;Import Data&lt;span class="nt"&gt;&amp;lt;/TALLYREQUEST&amp;gt;&lt;/span&gt;
  &lt;span class="nt"&gt;&amp;lt;/HEADER&amp;gt;&lt;/span&gt;
  &lt;span class="nt"&gt;&amp;lt;BODY&amp;gt;&lt;/span&gt;
    &lt;span class="nt"&gt;&amp;lt;IMPORTDATA&amp;gt;&lt;/span&gt;
      &lt;span class="nt"&gt;&amp;lt;REQUESTDESC&amp;gt;&lt;/span&gt;
        &lt;span class="nt"&gt;&amp;lt;REPORTNAME&amp;gt;&lt;/span&gt;Vouchers&lt;span class="nt"&gt;&amp;lt;/REPORTNAME&amp;gt;&lt;/span&gt;
        &lt;span class="nt"&gt;&amp;lt;STATICVARIABLES&amp;gt;&lt;/span&gt;
          &lt;span class="nt"&gt;&amp;lt;SVCURRENTCOMPANY&amp;gt;&lt;/span&gt;[Company Name]&lt;span class="nt"&gt;&amp;lt;/SVCURRENTCOMPANY&amp;gt;&lt;/span&gt;
        &lt;span class="nt"&gt;&amp;lt;/STATICVARIABLES&amp;gt;&lt;/span&gt;
      &lt;span class="nt"&gt;&amp;lt;/REQUESTDESC&amp;gt;&lt;/span&gt;
      &lt;span class="nt"&gt;&amp;lt;REQUESTDATA&amp;gt;&lt;/span&gt;
        &lt;span class="nt"&gt;&amp;lt;TALLYMESSAGE&lt;/span&gt; &lt;span class="na"&gt;xmlns:UDF=&lt;/span&gt;&lt;span class="s"&gt;"TallyUDF"&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&lt;/span&gt;
          &lt;span class="nt"&gt;&amp;lt;VOUCHER&lt;/span&gt; &lt;span class="na"&gt;VCHTYPE=&lt;/span&gt;&lt;span class="s"&gt;"Purchase"&lt;/span&gt; &lt;span class="na"&gt;ACTION=&lt;/span&gt;&lt;span class="s"&gt;"Create"&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&lt;/span&gt;
            &lt;span class="nt"&gt;&amp;lt;DATE&amp;gt;&lt;/span&gt;[YYYYMMDD]&lt;span class="nt"&gt;&amp;lt;/DATE&amp;gt;&lt;/span&gt;
            &lt;span class="nt"&gt;&amp;lt;VOUCHERTYPENAME&amp;gt;&lt;/span&gt;Purchase&lt;span class="nt"&gt;&amp;lt;/VOUCHERTYPENAME&amp;gt;&lt;/span&gt;
            &lt;span class="nt"&gt;&amp;lt;PARTYLEDGERNAME&amp;gt;&lt;/span&gt;[Supplier Name]&lt;span class="nt"&gt;&amp;lt;/PARTYLEDGERNAME&amp;gt;&lt;/span&gt;
            &lt;span class="nt"&gt;&amp;lt;ALLLEDGERENTRIES.LIST&amp;gt;&lt;/span&gt;
              &lt;span class="nt"&gt;&amp;lt;LEDGERNAME&amp;gt;&lt;/span&gt;[Ledger]&lt;span class="nt"&gt;&amp;lt;/LEDGERNAME&amp;gt;&lt;/span&gt;
              &lt;span class="nt"&gt;&amp;lt;AMOUNT&amp;gt;&lt;/span&gt;[Amount]&lt;span class="nt"&gt;&amp;lt;/AMOUNT&amp;gt;&lt;/span&gt;
            &lt;span class="nt"&gt;&amp;lt;/ALLLEDGERENTRIES.LIST&amp;gt;&lt;/span&gt;
          &lt;span class="nt"&gt;&amp;lt;/VOUCHER&amp;gt;&lt;/span&gt;
        &lt;span class="nt"&gt;&amp;lt;/TALLYMESSAGE&amp;gt;&lt;/span&gt;
      &lt;span class="nt"&gt;&amp;lt;/REQUESTDATA&amp;gt;&lt;/span&gt;
    &lt;span class="nt"&gt;&amp;lt;/IMPORTDATA&amp;gt;&lt;/span&gt;
  &lt;span class="nt"&gt;&amp;lt;/BODY&amp;gt;&lt;/span&gt;
&lt;span class="nt"&gt;&amp;lt;/ENVELOPE&amp;gt;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
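
&lt;p&gt;The envelope above can be generated from extracted data rather than assembled by string concatenation. The sketch below uses &lt;code&gt;xml.etree.ElementTree&lt;/code&gt; so that special characters in supplier or ledger names are escaped automatically. The input keys (&lt;code&gt;date&lt;/code&gt;, &lt;code&gt;supplier&lt;/code&gt;, &lt;code&gt;ledger_entries&lt;/code&gt;) and the function name are hypothetical, and the input date is assumed to be in ISO 8601 form.&lt;/p&gt;

```python
import xml.etree.ElementTree as ET


def build_purchase_voucher(company: str, po: dict) -> str:
    """Render an extracted PO as a Tally import envelope (minimal fields only)."""
    envelope = ET.Element("ENVELOPE")
    header = ET.SubElement(envelope, "HEADER")
    ET.SubElement(header, "TALLYREQUEST").text = "Import Data"

    import_data = ET.SubElement(ET.SubElement(envelope, "BODY"), "IMPORTDATA")
    desc = ET.SubElement(import_data, "REQUESTDESC")
    ET.SubElement(desc, "REPORTNAME").text = "Vouchers"
    static_vars = ET.SubElement(desc, "STATICVARIABLES")
    ET.SubElement(static_vars, "SVCURRENTCOMPANY").text = company

    data = ET.SubElement(import_data, "REQUESTDATA")
    message = ET.SubElement(data, "TALLYMESSAGE", {"xmlns:UDF": "TallyUDF"})
    voucher = ET.SubElement(
        message, "VOUCHER", {"VCHTYPE": "Purchase", "ACTION": "Create"}
    )
    # The voucher date uses YYYYMMDD; strip the dashes from the ISO input.
    ET.SubElement(voucher, "DATE").text = po["date"].replace("-", "")
    ET.SubElement(voucher, "VOUCHERTYPENAME").text = "Purchase"
    ET.SubElement(voucher, "PARTYLEDGERNAME").text = po["supplier"]
    for entry in po["ledger_entries"]:
        row = ET.SubElement(voucher, "ALLLEDGERENTRIES.LIST")
        ET.SubElement(row, "LEDGERNAME").text = entry["ledger"]
        ET.SubElement(row, "AMOUNT").text = f'{entry["amount"]:.2f}'
    return ET.tostring(envelope, encoding="unicode")
```

&lt;p&gt;This mirrors only the minimal example. A real voucher carries more fields, and the company and ledger names must match existing masters in Tally for the import to be accepted.&lt;/p&gt;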



&lt;p&gt;The first working voucher took three days.&lt;/p&gt;

&lt;p&gt;Everything after that was extension work.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Admin Dashboard
&lt;/h2&gt;

&lt;p&gt;The dashboard became the team's primary operational interface.&lt;/p&gt;

&lt;h3&gt;
  
  
  Orders Panel
&lt;/h3&gt;

&lt;p&gt;Shows:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Status&lt;/li&gt;
&lt;li&gt;Supplier&lt;/li&gt;
&lt;li&gt;PO Number&lt;/li&gt;
&lt;li&gt;Delivery Deadline&lt;/li&gt;
&lt;li&gt;Part Count&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Duplicate orders are highlighted and blocked from entering Tally until reviewed.&lt;/p&gt;

&lt;h3&gt;
  
  
  Tally Panel
&lt;/h3&gt;

&lt;p&gt;Provides:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Revenue trends&lt;/li&gt;
&lt;li&gt;Receivables&lt;/li&gt;
&lt;li&gt;Payment status&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;All without opening Tally.&lt;/p&gt;

&lt;h3&gt;
  
  
  Operations Panel
&lt;/h3&gt;

&lt;p&gt;Handles:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Dispatch notes&lt;/li&gt;
&lt;li&gt;QA checklists&lt;/li&gt;
&lt;li&gt;Production milestones&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;All driven from the same purchase-order data.&lt;/p&gt;




&lt;h2&gt;
  
  
  What Failed First
&lt;/h2&gt;

&lt;p&gt;The original extraction system used:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;pdf-parse&lt;/li&gt;
&lt;li&gt;OCR&lt;/li&gt;
&lt;li&gt;Regex pipelines&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The assumption:&lt;/p&gt;

&lt;p&gt;Purchase order formats would remain stable.&lt;/p&gt;

&lt;p&gt;They didn't.&lt;/p&gt;

&lt;p&gt;Enterprise customers used:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Multiple templates&lt;/li&gt;
&lt;li&gt;Different table structures&lt;/li&gt;
&lt;li&gt;Scanned PDFs&lt;/li&gt;
&lt;li&gt;International formats&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The result:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;~60% success rate&lt;/li&gt;
&lt;li&gt;Frequent maintenance&lt;/li&gt;
&lt;li&gt;Silent extraction failures&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;One part-number transposition eventually led to production of an incorrect batch.&lt;/p&gt;

&lt;p&gt;That was the turning point.&lt;/p&gt;

&lt;p&gt;The system was rebuilt around GPT-4 structured outputs.&lt;/p&gt;

&lt;p&gt;Measured against historical purchase orders, extraction accuracy rose from roughly 60% to over 98%.&lt;/p&gt;




&lt;h2&gt;
  
  
  Three Lessons From the Project
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1. PDF Extraction Has a Clear ROI
&lt;/h3&gt;

&lt;p&gt;For teams processing purchase orders every week:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;AI costs remain small.&lt;/li&gt;
&lt;li&gt;Manual labor costs do not.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The business case becomes obvious once the numbers are compared.&lt;/p&gt;

&lt;h3&gt;
  
  
  2. Tally Integration Is Easier Than Its Reputation
&lt;/h3&gt;

&lt;p&gt;The difficult part is finding a working example.&lt;/p&gt;

&lt;p&gt;Once you successfully create one XML voucher, the rest is straightforward engineering.&lt;/p&gt;

&lt;h3&gt;
  
  
  3. Push Beats Polling
&lt;/h3&gt;

&lt;p&gt;The biggest benefit wasn't lower latency.&lt;/p&gt;

&lt;p&gt;It was trust.&lt;/p&gt;

&lt;p&gt;A dashboard updated within seconds becomes an operational tool.&lt;/p&gt;

&lt;p&gt;A dashboard updated every few minutes becomes a report.&lt;/p&gt;

&lt;p&gt;That difference determines whether teams actually adopt the system.&lt;/p&gt;




&lt;h2&gt;
  
  
  Could This Work For Your Business?
&lt;/h2&gt;

&lt;p&gt;If your team receives PDFs in Gmail and manually enters them into:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Tally&lt;/li&gt;
&lt;li&gt;ERP systems&lt;/li&gt;
&lt;li&gt;Inventory software&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;the workflow can likely be automated.&lt;/p&gt;

&lt;p&gt;The core pattern is repeatable:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Gmail
→ AI Extraction
→ Validation
→ ERP Integration
→ Operations Dashboard
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This project went from discovery call to production deployment in six weeks.&lt;/p&gt;




&lt;h2&gt;
  
  
  FAQ
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Does this work with Tally ERP 9?
&lt;/h3&gt;

&lt;p&gt;Yes.&lt;/p&gt;

&lt;p&gt;The XML import approach works with both:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Tally Prime&lt;/li&gt;
&lt;li&gt;Tally ERP 9&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The main requirement is enabling Tally's HTTP server.&lt;/p&gt;

&lt;h3&gt;
  
  
  Can it process scanned PDFs?
&lt;/h3&gt;

&lt;p&gt;Yes.&lt;/p&gt;

&lt;p&gt;Scanned PDFs use GPT-4 Vision.&lt;/p&gt;

&lt;p&gt;Text PDFs use direct extraction.&lt;/p&gt;

&lt;p&gt;Both produce the same structured output.&lt;/p&gt;

&lt;h3&gt;
  
  
  What if the AI extracts incorrect data?
&lt;/h3&gt;

&lt;p&gt;The system includes:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Confidence scoring&lt;/li&gt;
&lt;li&gt;Validation rules&lt;/li&gt;
&lt;li&gt;Human review queues&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Suspicious records are blocked before reaching Tally.&lt;/p&gt;

&lt;p&gt;This prevents errors from entering downstream operations.&lt;/p&gt;
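
&lt;p&gt;A gate of this kind might look like the sketch below. It is an illustration only: the threshold &lt;code&gt;MIN_CONFIDENCE&lt;/code&gt;, the individual rules, the field names, and the assumption that a confidence score arrives from an upstream step are all hypothetical and do not describe the production rules.&lt;/p&gt;

```python
from datetime import date

# Hypothetical threshold; the real cut-off is not stated in this article.
MIN_CONFIDENCE = 0.9


def review_reasons(po: dict, confidence: float, today: date) -> list[str]:
    """Return reasons to hold a PO for human review (empty list = safe to post)."""
    reasons = []
    if not confidence >= MIN_CONFIDENCE:
        reasons.append("low extraction confidence")
    if not po.get("po_number", "").strip():
        reasons.append("missing PO number")
    try:
        if today > date.fromisoformat(po.get("delivery_date", "")):
            reasons.append("delivery date in the past")
    except ValueError:
        reasons.append("unparseable delivery date")
    for item in po.get("line_items", []):
        if not item.get("quantity", 0) > 0:
            reasons.append(
                f"non-positive quantity for {item.get('part_number', '?')}"
            )
    if not po.get("line_items"):
        reasons.append("no line items")
    return reasons
```

&lt;p&gt;Returning the reasons, rather than a bare pass or fail, gives the reviewer in the queue something specific to check.&lt;/p&gt;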

</description>
      <category>casestudy</category>
      <category>oms</category>
      <category>automation</category>
    </item>
  </channel>
</rss>
