Mohit

Posted on Jun 11 • Originally published at flux8labs.com

How We Automated Purchase Orders From Gmail to Tally Using GPT-4 (98% Extraction Accuracy)

#casestudy #oms #automation

At 9:14am on a Tuesday, the system flagged an incoming purchase order from a large enterprise buyer as a duplicate.

The PO had arrived in two separate emails over 48 hours, sent by different procurement contacts. Both covered the same batch of stainless steel flanges, with the same quantities and the same delivery window.

Under the old system, a staff member would have read both, entered both into Tally, and allocated raw material stock twice. The first sign of the error would have been an inventory shortfall two weeks later.

The client is a Jaipur-based precision manufacturer serving enterprise buyers in India and overseas. At roughly ₹60Cr annual revenue, their team handled a steady flow of purchase orders across a demanding customer base.

Every one of those orders arrived as a PDF in a shared Gmail inbox.

Every one of those PDFs was read and entered into Tally by hand.

This is the build log for the system built to replace that process.

The Problem: PDFs in Gmail, Nobody Watching

The procurement workflow before the build:

Emails arrive in a shared Gmail inbox.
A staff member opens each attachment.
Reads part numbers, quantities, delivery deadlines, and supplier codes.
Manually enters everything into Tally.

On a slow day this took around 90 minutes.

On heavier order days, it could stretch to 3–4 hours.

The inbox had no workflow state:

No processed flag
No queue
No audit trail outside Tally

If the same PO arrived twice, the team would know only by accident.

Duplicate Orders

There was no detection mechanism.

Two contacts at the same enterprise customer could send the same PO independently, and neither Gmail nor Tally would flag it.

No Operational Visibility

Knowing which orders were due the following week required opening Tally and manually cross-referencing entries.

There was:

No dashboard
No queue view
No workload overview

Manual Raw Material Calculations

Once a PO was entered, another manual calculation followed:

Pulling specifications from the PDF
Checking stock levels
Estimating requirements

This introduced a second opportunity for human error.

What Was Built: A Four-Stage Pipeline

Each stage solved a different operational problem.

1. Gmail Push Notifications

Instead of polling Gmail every few minutes, the system asks Gmail to publish a notification to a Google Cloud Pub/Sub topic whenever a new email arrives. A push subscription on that topic then calls the system's webhook.

Benefits:

Near real-time processing
Lower infrastructure overhead
Faster visibility for operations teams

A PO is processed before a staff member would have opened the email.
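As a rough illustration of the webhook side, the sketch below decodes the envelope that a Pub/Sub push subscription delivers. The function name, the returned keys, and the sample values are assumptions for this example, not the project's actual code.

```python
import base64
import json


def parse_gmail_push(envelope: dict) -> dict:
    """Decode a Pub/Sub push envelope that carries a Gmail notification."""
    message = envelope["message"]
    data = message["data"]
    # Restore any stripped base64 padding before decoding.
    padded = data + "=" * (-len(data) % 4)
    payload = json.loads(base64.urlsafe_b64decode(padded).decode("utf-8"))
    return {
        "email_address": payload["emailAddress"],
        # historyId may arrive as a string or a number; normalise to int.
        "history_id": int(payload["historyId"]),
        "pubsub_message_id": message.get("messageId"),
    }


# Hand-built envelope for illustration (all values are made up).
notification = {"emailAddress": "orders@example.com", "historyId": "9876543"}
sample_envelope = {
    "message": {
        "data": base64.urlsafe_b64encode(
            json.dumps(notification).encode("utf-8")
        ).decode("ascii"),
        "messageId": "1234567890",
    },
    "subscription": "projects/example-project/subscriptions/gmail-po-push",
}
parsed = parse_gmail_push(sample_envelope)
```

The notification only says that the mailbox changed. A handler would typically follow up with the Gmail API's `users.history.list`, starting from a previously stored history ID, to find the new messages and their attachments.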

2. GPT-4 Structured Extraction

The system sends the purchase order to GPT-4 using a strict JSON schema.

Required fields include:

PO number
Supplier details
Delivery date
Line items
Quantities

The model returns structured JSON directly.

No:

Regex maintenance
Template mapping
Field-position assumptions

For scanned PDFs, the vision endpoint is used.

For text PDFs, extracted text is sent directly.

Both paths produce the same JSON output.
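To make the "strict JSON schema" idea concrete, here is a minimal sketch of what such a schema and a post-extraction sanity check might look like. The field names (`po_number`, `part_number`, and so on) are hypothetical, chosen to match the required fields listed above. `validate_po` is a hand-written check for illustration, not a full JSON Schema validator.

```python
from datetime import date

# Hypothetical schema; the real project's field names are not given in this post.
PO_SCHEMA = {
    "type": "object",
    "additionalProperties": False,
    "required": ["po_number", "supplier", "delivery_date", "line_items"],
    "properties": {
        "po_number": {"type": "string"},
        "supplier": {"type": "string"},
        "delivery_date": {"type": "string", "description": "YYYY-MM-DD"},
        "line_items": {
            "type": "array",
            "items": {
                "type": "object",
                "additionalProperties": False,
                "required": ["part_number", "quantity"],
                "properties": {
                    "part_number": {"type": "string"},
                    "quantity": {"type": "number"},
                },
            },
        },
    },
}


def validate_po(po: dict) -> list[str]:
    """Return a list of problems; an empty list means these checks passed."""
    problems = [f"missing field: {f}" for f in PO_SCHEMA["required"] if f not in po]
    try:
        date.fromisoformat(po.get("delivery_date", ""))
    except (TypeError, ValueError):
        problems.append("delivery_date is not an ISO date")
    if not po.get("line_items"):
        problems.append("no line items")
    for i, item in enumerate(po.get("line_items", [])):
        if not str(item.get("part_number", "")).strip():
            problems.append(f"line {i}: empty part_number")
        qty = item.get("quantity")
        if not isinstance(qty, (int, float)) or qty <= 0:
            problems.append(f"line {i}: quantity must be positive")
    return problems
```

Even with schema-constrained output, a check like this guards against values that are well-formed but wrong, such as a zero quantity or an unparseable date, before anything is written to Tally.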

3. Tally Integration

The extracted JSON is converted into a Tally-compatible XML purchase voucher and sent to Tally Prime's local HTTP server.

Once accepted:

The voucher is created automatically.
Inventory calculations run immediately.
The order appears exactly as if a user entered it manually.
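The conversion step can be sketched with Python's standard `xml.etree.ElementTree`, mirroring the minimal purchase voucher envelope shown later in this post. The function name and its parameters are assumptions for illustration. A real voucher would likely need more fields (inventory entries, balancing ledger lines) than this minimal shape.

```python
import xml.etree.ElementTree as ET
from datetime import date


def build_purchase_voucher(company: str, supplier: str, ledger: str,
                           amount: float, voucher_date: date) -> str:
    """Build the minimal Import Data envelope for one purchase voucher."""
    envelope = ET.Element("ENVELOPE")
    header = ET.SubElement(envelope, "HEADER")
    ET.SubElement(header, "TALLYREQUEST").text = "Import Data"

    import_data = ET.SubElement(ET.SubElement(envelope, "BODY"), "IMPORTDATA")
    request_desc = ET.SubElement(import_data, "REQUESTDESC")
    ET.SubElement(request_desc, "REPORTNAME").text = "Vouchers"
    static_vars = ET.SubElement(request_desc, "STATICVARIABLES")
    ET.SubElement(static_vars, "SVCURRENTCOMPANY").text = company

    request_data = ET.SubElement(import_data, "REQUESTDATA")
    message = ET.SubElement(request_data, "TALLYMESSAGE", {"xmlns:UDF": "TallyUDF"})
    voucher = ET.SubElement(
        message, "VOUCHER", {"VCHTYPE": "Purchase", "ACTION": "Create"}
    )
    ET.SubElement(voucher, "DATE").text = voucher_date.strftime("%Y%m%d")
    ET.SubElement(voucher, "VOUCHERTYPENAME").text = "Purchase"
    ET.SubElement(voucher, "PARTYLEDGERNAME").text = supplier
    entry = ET.SubElement(voucher, "ALLLEDGERENTRIES.LIST")
    ET.SubElement(entry, "LEDGERNAME").text = ledger
    ET.SubElement(entry, "AMOUNT").text = f"{amount:.2f}"

    # ElementTree escapes characters such as "&" in names automatically.
    return ET.tostring(envelope, encoding="unicode")
```

Building the tree programmatically rather than filling a string template means supplier or ledger names containing `&` or `<` cannot produce malformed XML.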

4. Duplicate Detection

Every incoming PO is stored in PostgreSQL.

A normalized fingerprint is generated using:

Line items
Quantities
Delivery windows

This allows the system to detect duplicate orders before they reach Tally.

The duplicate mentioned earlier was caught using this mechanism.
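A normalized fingerprint of this kind could look like the sketch below. The exact normalization rules and field names are assumptions. The idea is only that superficial differences (item order, casing, whitespace, a different sender) must not change the hash.

```python
import hashlib
import json


def po_fingerprint(po: dict) -> str:
    """Hash the order content so that re-sent copies of a PO collide."""
    # Sort items so line order does not matter; normalise case and whitespace.
    items = sorted(
        (item["part_number"].strip().upper(), float(item["quantity"]))
        for item in po["line_items"]
    )
    canonical = json.dumps(
        {"delivery_date": po["delivery_date"].strip(), "items": items},
        sort_keys=True,
        separators=(",", ":"),
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


first = {
    "delivery_date": "2024-07-01",
    "line_items": [
        {"part_number": "FL-100", "quantity": 250},
        {"part_number": "FL-200", "quantity": 40},
    ],
}
# Same order, re-sent with different line order, casing, and spacing.
second = {
    "delivery_date": "2024-07-01 ",
    "line_items": [
        {"part_number": " fl-200", "quantity": 40.0},
        {"part_number": "fl-100 ", "quantity": 250},
    ],
}
is_duplicate = po_fingerprint(first) == po_fingerprint(second)
```

Storing the fingerprint in an indexed PostgreSQL column would let the lookup happen before the voucher is built. Including a buyer identifier in the hashed content is worth considering, so that two different customers ordering the same parts are not flagged as duplicates.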

The Technology Stack

Backend

FastAPI
PostgreSQL
Gmail API
Google Cloud Pub/Sub

Frontend

React
Framer Motion

AI Layer

GPT-4 Structured Outputs
GPT-4 Vision

ERP Layer

Tally Prime via TDL XML imports

Working with the Tally API

The official documentation leaves a lot to be desired.

The most reliable approach is:

Enable Tally's HTTP server.
Generate TDL XML vouchers.
POST them to Tally.
Parse the acknowledgement response.

Enable Tally HTTP Server

Gateway of Tally
→ F12
→ Advanced Configuration
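→ Enable ODBC/HTTP Server

This is the menu path as it appears in Tally ERP 9. In Tally Prime, the equivalent setting is under F1 (Help) → Settings → Connectivity.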

Minimal Purchase Voucher Example

<ENVELOPE>
  <HEADER>
    <TALLYREQUEST>Import Data</TALLYREQUEST>
  </HEADER>
  <BODY>
    <IMPORTDATA>
      <REQUESTDESC>
        <REPORTNAME>Vouchers</REPORTNAME>
        <STATICVARIABLES>
          <SVCURRENTCOMPANY>[Company Name]</SVCURRENTCOMPANY>
        </STATICVARIABLES>
      </REQUESTDESC>
      <REQUESTDATA>
        <TALLYMESSAGE xmlns:UDF="TallyUDF">
          <VOUCHER VCHTYPE="Purchase" ACTION="Create">
            <DATE>[YYYYMMDD]</DATE>
            <VOUCHERTYPENAME>Purchase</VOUCHERTYPENAME>
            <PARTYLEDGERNAME>[Supplier Name]</PARTYLEDGERNAME>
            <ALLLEDGERENTRIES.LIST>
              <LEDGERNAME>[Ledger]</LEDGERNAME>
              <AMOUNT>[Amount]</AMOUNT>
            </ALLLEDGERENTRIES.LIST>
          </VOUCHER>
        </TALLYMESSAGE>
      </REQUESTDATA>
    </IMPORTDATA>
  </BODY>
</ENVELOPE>

The first working voucher took three days.

Everything after that was extension work.
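The POST and acknowledgement-parsing steps from the list above can be sketched with the standard library alone. The port (9000 is Tally's usual default) and the response tag names (`CREATED`, `ALTERED`, `ERRORS`, `LINEERROR`) are assumptions based on typical Tally import acknowledgements. Check them against a real response from your own installation.

```python
import urllib.request
import xml.etree.ElementTree as ET

TALLY_URL = "http://localhost:9000"  # assumption: default Tally HTTP port


def parse_import_response(xml_text: str) -> dict:
    """Pull the import counters out of Tally's acknowledgement XML."""
    root = ET.fromstring(xml_text)
    result = {}
    for tag in ("CREATED", "ALTERED", "ERRORS"):
        # root.iter finds the tag at any depth, whatever wrapper Tally uses.
        node = next(root.iter(tag), None)
        result[tag.lower()] = int(node.text) if node is not None and node.text else 0
    result["line_errors"] = [n.text for n in root.iter("LINEERROR") if n.text]
    return result


def post_voucher(xml_text: str, url: str = TALLY_URL, timeout: float = 10.0) -> dict:
    """POST one voucher envelope to Tally and return the parsed counters."""
    request = urllib.request.Request(
        url,
        data=xml_text.encode("utf-8"),
        headers={"Content-Type": "text/xml; charset=utf-8"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return parse_import_response(response.read().decode("utf-8"))


# Parsing a hand-written sample acknowledgement (no Tally instance needed).
sample_ack = "<RESPONSE><CREATED>1</CREATED><ALTERED>0</ALTERED><ERRORS>0</ERRORS></RESPONSE>"
ack = parse_import_response(sample_ack)
```

Treating anything other than one created voucher and zero errors as a failure keeps a rejected import from passing silently.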

The Admin Dashboard

The dashboard became the team's primary operational interface.

Orders Panel

Shows:

Status
Supplier
PO Number
Delivery Deadline
Part Count

Duplicate orders are highlighted and blocked from entering Tally until reviewed.

Tally Panel

Provides:

Revenue trends
Receivables
Payment status

Without opening Tally.

Operations Panel

Handles:

Dispatch notes
QA checklists
Production milestones

All driven from the same purchase-order data.

What Failed First

The original extraction system used:

pdf-parse
OCR
Regex pipelines

The assumption:

Purchase order formats would remain stable.

They didn't.

Enterprise customers used:

Multiple templates
Different table structures
Scanned PDFs
International formats

The result:

~60% success rate
Frequent maintenance
Silent extraction failures

One part-number transposition eventually led to production of an incorrect batch.

That was the turning point.

The system was rebuilt around GPT-4 structured outputs.

When validated against historical purchase orders, extraction accuracy rose from roughly 60% with the old pipeline to over 98% with the rebuilt one.

Three Lessons From the Project

1. PDF Extraction Has a Clear ROI

For teams processing purchase orders every week:

AI costs remain small.
Manual labor costs do not.

The business case becomes obvious once the numbers are compared.

2. Tally Integration Is Easier Than Its Reputation

The difficult part is finding a working example.

Once you successfully create one XML voucher, the rest is straightforward engineering.

3. Push Beats Polling

The biggest benefit wasn't lower latency.

It was trust.

A dashboard updated within seconds becomes an operational tool.

A dashboard updated every few minutes becomes a report.

That difference determines whether teams actually adopt the system.

Could This Work For Your Business?

If your team receives PDFs in Gmail and manually enters them into:

Tally
ERP systems
Inventory software

the workflow can likely be automated.

The core pattern is repeatable:

Gmail
→ AI Extraction
→ Validation
→ ERP Integration
→ Operations Dashboard

This project went from discovery call to production deployment in six weeks.

FAQ

Does this work with Tally ERP 9?

Yes.

The XML import approach works with both:

Tally Prime
Tally ERP 9

The main requirement is enabling Tally's HTTP server.

Can it process scanned PDFs?