
ALI


How I Built an AI Tool to Read Invoices and Predict GST — Without Losing My Mind

Why I Did This to Myself

Invoices are like bosses: they all look different, they all demand money, and they all ruin your day.
So I thought: “Why not let AI suffer instead of me?”
Boom. Invoice Digitization & Tax Prediction Tool.
Now my AI squints at messy PDFs and scans, while I sit back and pretend I’m productive.

The Tech Stack (aka Weapons of Mass Frustration)

  1. OCR → For turning “blurred coffee-stained PDF” into “blurred text file.”
  2. Regex → Because nothing screams fun like 200-character patterns.
  3. ML-ish GST Predictor → Basically a model that guesses taxes better than me on exam day.
  4. Python + pandas → To wrangle the chaos into something that looks like data.

How It Works

  • You upload an invoice.
  • OCR panics but spits out text.
  • Regex plays “Where’s Waldo” with GST numbers.
  • AI pretends to be an accountant.
  • You get structured output and tax predictions.
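
The steps above can be sketched as a tiny pipeline. This is a minimal illustration under loud assumptions, not the actual tool: `fake_ocr`, `flat_rate_predictor`, `InvoiceRecord`, and `process_invoice` are hypothetical names, the OCR step is stubbed with canned text, and the "predictor" is a flat 18% placeholder standing in for whatever the real model does.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Shape-only patterns; both are assumptions about what the OCR text looks like.
GSTIN_RE = re.compile(r"\d{2}[A-Z]{5}\d{4}[A-Z][1-9A-Z]Z[0-9A-Z]")
AMOUNT_RE = re.compile(r"₹\s*(\d+(?:\.\d+)?)")


@dataclass
class InvoiceRecord:
    gstin: Optional[str]
    amount: Optional[float]
    predicted_gst: Optional[float]


def fake_ocr(path: str) -> str:
    # Hypothetical stand-in for a real OCR call; ignores the path and
    # returns canned text so the sketch runs anywhere.
    return "Invoice #123 GSTIN: 22ABCDE1234F1Z5 Amount: ₹1200"


def flat_rate_predictor(amount: float) -> float:
    # Placeholder "model": a flat 18% rate, purely for illustration.
    return round(amount * 18 / 100, 2)


def process_invoice(path, ocr=fake_ocr, predict=flat_rate_predictor):
    text = ocr(path)                      # 1. OCR spits out text
    gstin = GSTIN_RE.search(text)         # 2. regex hunts for the GSTIN
    amount = AMOUNT_RE.search(text)       #    ...and for the amount
    value = float(amount.group(1)) if amount else None
    return InvoiceRecord(                 # 3. structured output + prediction
        gstin=gstin.group(0) if gstin else None,
        amount=value,
        predicted_gst=predict(value) if value is not None else None,
    )


print(process_invoice("invoice.pdf"))
```

Passing the OCR and prediction steps in as functions keeps the sketch testable without a scanner, a PDF, or a trained model on hand.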

Bugs That Almost Broke Me

OCR Hallucinations:

Me: “That’s ₹1200.”
OCR: “Did you mean ‘12008S’?”
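
One way to keep a misread like that away from the tax math is to reject any token that doesn't look like a plain amount. A minimal sketch, assuming amounts are digits with optional commas and at most two decimals; `parse_amount` and `AMOUNT_SHAPE` are hypothetical names, not part of the tool described here.

```python
import re
from typing import Optional

# Assumed amount shape: optional ₹, digits with optional commas,
# optional decimal part of one or two digits.
AMOUNT_SHAPE = re.compile(r"₹?\s*(\d[\d,]*(?:\.\d{1,2})?)")


def parse_amount(token: str) -> Optional[float]:
    """Return the amount as a float, or None if the token looks misread."""
    match = AMOUNT_SHAPE.fullmatch(token.strip())
    if match is None:
        return None  # stray letters or symbols, likely an OCR misread
    return float(match.group(1).replace(",", ""))


for token in ["₹1200", "1,200.50", "12008S"]:
    print(token, "->", parse_amount(token))
```

Returning `None` instead of a best guess means the suspicious value gets flagged for a human rather than silently taxed.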

Regex PTSD:
Finding GST numbers in text feels like hunting shiny Pokémon.

The BTS Incident:

OCR once read “18% GST” as “BTS.”
For 2 minutes, I thought I’d invented a K-Pop tax predictor.

Code Snippet: My Regex Therapy Session

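import re

# GSTIN shape: 2-digit state code, 10-character PAN (5 letters, 4 digits,
# 1 letter), 1 entity code, a literal "Z", and 1 check character.
GSTIN_PATTERN = r"\d{2}[A-Z]{5}\d{4}[A-Z][1-9A-Z]Z[0-9A-Z]"

def extract_gst(text):
    # Return the first GSTIN-shaped string in the text, or a fallback message.
    match = re.search(GSTIN_PATTERN, text)
    return match.group(0) if match else "No GST found, cry harder."

invoice_text = "Invoice #123 GSTIN: 22ABCDE1234F1Z5 Amount: ₹1200"
print(extract_gst(invoice_text))
# Output: 22ABCDE1234F1Z5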

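
The regex only checks the shape of a GSTIN, so an OCR misread that still fits the shape slips straight through. The final character is a check character, commonly described as a Luhn-style mod 36 scheme. The sketch below follows that description and is an assumption on my part, not something the tool is confirmed to do; `gstin_check_char` and `is_valid_gstin` are hypothetical names.

```python
import string

# Base-36 alphabet: digits 0-9 map to 0-9, letters A-Z map to 10-35.
ALPHABET = string.digits + string.ascii_uppercase


def gstin_check_char(first14: str) -> str:
    """Compute the check character for the first 14 characters of a GSTIN."""
    total = 0
    for index, char in enumerate(first14):
        factor = 1 if index % 2 == 0 else 2  # weights alternate 1, 2, 1, 2, ...
        product = ALPHABET.index(char) * factor
        total += product // 36 + product % 36
    return ALPHABET[(36 - total % 36) % 36]


def is_valid_gstin(gstin: str) -> bool:
    """True if the string is 15 base-36 characters and the checksum matches."""
    if len(gstin) != 15 or any(c not in ALPHABET for c in gstin):
        return False
    return gstin_check_char(gstin[:14]) == gstin[14]


print(is_valid_gstin("27AAPFU0939F1ZV"))  # a sample that satisfies this scheme
print(is_valid_gstin("22ABCDE1234F1Z5"))  # the made-up placeholder from the snippet
```

The placeholder GSTIN from the snippet fails this check, which is what you'd expect from an invented example. Running the check after the regex gives a cheap second filter for single-character OCR slips.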

Results (Sort of)

  • Structured PDFs? Works like a charm.
  • Low-quality scans? Good luck, buddy.
  • Predicts GST fairly well — which is more than I can say about me filing taxes.

What’s Next

  • Teaching it sarcasm so it can also roast invoices.
  • Export to Excel/CSV (for those who still trust spreadsheets).
  • Maybe SaaS — so small businesses can suffer too.
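
The CSV half of that export could be a few lines. This is a sketch using only the standard library so it runs anywhere; with the stack above, pandas' `DataFrame.to_csv` would be the more natural route. The field names, the `to_csv` helper, and the sample row are hypothetical, and the row is illustrative, not real output from the tool.

```python
import csv
import io

# Hypothetical record shaped like the tool's structured output.
records = [
    {"gstin": "22ABCDE1234F1Z5", "amount": 1200.0, "predicted_gst": 216.0},
]


def to_csv(rows) -> str:
    """Serialize a list of invoice dicts to CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=["gstin", "amount", "predicted_gst"])
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


print(to_csv(records))
```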

Final Thoughts

This project taught me:

  • AI isn’t about intelligence.
  • It’s about tricking machines into crying over messy data instead of you.
