DEV Community

Archit Mittal

Originally published at architmittal.com

How I Automated ITR Document Preparation With 40 Lines of Python

My chartered accountant called me last week. Five days before March 31.

His voice had that familiar panic — the same tone every CA in India carries between February and March. He needed my ITR documents. Form 16. AIS data. Twelve months of bank statements. The usual March-end package.

I told him to check his email. My script had already sent everything.

There was a long pause on the other end. Then: "What script?"

I explained that I had written a 40-line Python script that automatically pulls my Form 16, downloads my AIS (Annual Information Statement) data, grabs twelve months of bank statements from my bank's portal, formats everything exactly the way he likes it, and emails him the complete package every March. No manual downloads. No PDF renaming. No forgetting that one statement from August.

His response surprised me. He did not say "nice" or "that's cool." He said something far more interesting.

"Can you build this for my other 200 clients?"

That single conversation crystallised something I had been thinking about for months: the biggest automation opportunity in India is not in tech companies. It is in CA firms, brokerages, small finance teams, and the thousands of professionals who are brilliant at their core work but still copy-pasting PDFs at 2 AM during tax season.

The March-End Problem Nobody Talks About

Every year, approximately 7.5 crore (75 million) ITRs are filed in India. Behind a large share of those filings is a CA or tax professional who has to collect, verify, and organise documents from their clients. For a mid-sized CA firm handling 200-500 clients, this means:

  • Sending hundreds of reminder emails and WhatsApp messages asking for documents
  • Manually downloading Form 16 PDFs that clients forward in various formats
  • Cross-referencing AIS data with what clients actually reported
  • Chasing bank statements that always seem to be missing one or two months
  • Renaming and organising files into client-specific folders
  • Doing all of this in a 45-day window while also handling GST filings and audits

I have spoken with over a dozen CA firms in the last six months. The average time spent per client just on document collection and organisation is 45-90 minutes. Multiply that by 300 clients and you get 225-450 hours. At roughly 170 working hours a month, that is about one full-time employee, sometimes more, doing nothing but collecting and sorting PDFs for two months.
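
The arithmetic behind that range is short enough to check in a few lines. This sketch uses only the figures quoted above; the function name is mine.

```python
# Workload estimate from the figures quoted above.
MINUTES_PER_CLIENT = (45, 90)   # low and high estimates per client
CLIENTS = 300


def collection_hours(clients: int, minutes_per_client: int) -> float:
    """Total hours spent on document collection for a client base."""
    return clients * minutes_per_client / 60


low, high = (collection_hours(CLIENTS, m) for m in MINUTES_PER_CLIENT)
print(f"{low:.0f} to {high:.0f} hours per season")  # 225 to 450 hours per season
```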

That is not accounting work. That is data entry disguised as professional service.

What the Script Actually Does

Let me walk through the automation I built, because the simplicity is the entire point.

Step 1: Pull Form 16 Data

Employers generate Form 16 on TRACES and typically pass it on to employees through an HRMS portal or email. The script connects to the relevant portal using stored credentials, downloads the latest Form 16 Part A and Part B, and saves them with a standardised naming convention. For salaried individuals with a single employer, this can be as little as a single request, provided the portal exposes a way to download programmatically.
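
The download itself depends entirely on the portal, so I will not pretend to show it. The naming convention is the part that generalises. A minimal sketch, assuming a PAN-plus-financial-year scheme; the function name and the exact pattern are illustrative choices, not a standard:

```python
import re


def standard_filename(pan: str, financial_year: str, doc_type: str, ext: str = "pdf") -> str:
    """Build a predictable name such as ABCDE1234F_FY2023-24_form-16-part-a.pdf."""
    pan = pan.strip().upper()
    # A PAN is five letters, four digits, then one letter.
    if not re.fullmatch(r"[A-Z]{5}[0-9]{4}[A-Z]", pan):
        raise ValueError(f"not a valid PAN format: {pan!r}")
    # Lower-case the document type and collapse anything non-alphanumeric to a hyphen.
    slug = re.sub(r"[^a-z0-9]+", "-", doc_type.lower()).strip("-")
    return f"{pan}_FY{financial_year}_{slug}.{ext}"


print(standard_filename("abcde1234f", "2023-24", "Form 16 Part A"))
# ABCDE1234F_FY2023-24_form-16-part-a.pdf
```

Validating the PAN up front means a typo fails loudly instead of producing a file the CA cannot match to a client.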

Step 2: Download AIS Data

The Annual Information Statement from the Income Tax portal contains all reported financial transactions: salary, interest income, dividends, property transactions, high-value purchases. The script pulls this data and converts it into a readable summary. This is critical because AIS mismatches are a common trigger for ITR notices.
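
To make the "readable summary" idea concrete, here is a sketch that totals AIS entries by category and flags anything that differs from what the client declared. The row layout and field names are hypothetical; the real AIS export has its own, richer schema, and all figures below are made up.

```python
from collections import defaultdict

# Hypothetical AIS rows (made-up schema and amounts, in rupees).
ais_rows = [
    {"category": "Salary", "source": "Employer Pvt Ltd", "amount": 1200000},
    {"category": "Interest", "source": "Bank A savings", "amount": 8200},
    {"category": "Interest", "source": "Bank B fixed deposit", "amount": 41000},
    {"category": "Dividend", "source": "Company X", "amount": 3500},
]

# What the client told the CA, per category (also made up).
declared = {"Salary": 1200000, "Interest": 8200, "Dividend": 3500}


def summarise(rows):
    """Total the reported amounts per category."""
    totals = defaultdict(int)
    for row in rows:
        totals[row["category"]] += row["amount"]
    return dict(totals)


def find_mismatches(ais_totals, declared_totals, tolerance=0):
    """Return categories where AIS and declared totals differ by more than tolerance."""
    mismatches = {}
    for category in sorted(set(ais_totals) | set(declared_totals)):
        ais = ais_totals.get(category, 0)
        dec = declared_totals.get(category, 0)
        if abs(ais - dec) > tolerance:
            mismatches[category] = {"ais": ais, "declared": dec, "difference": ais - dec}
    return mismatches


totals = summarise(ais_rows)
mismatches = find_mismatches(totals, declared)
for category, detail in mismatches.items():
    print(f"{category}: AIS {detail['ais']} vs declared {detail['declared']}")
# Interest: AIS 49200 vs declared 8200
```

In this example the forgotten fixed deposit is exactly the kind of gap worth catching before filing.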

Step 3: Fetch Bank Statements

This is where most people lose the most time manually. The script connects to the bank's API (or uses their download portal) to pull twelve months of statements. It handles the date ranges automatically — April to March of the relevant financial year. No more downloading month by month.
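
The date-range handling is the one piece that works the same for every bank. A minimal sketch that generates the twelve April-to-March month ranges and reports which months are still missing; the function names are mine, and the actual statement download is left out because it is bank-specific.

```python
import calendar
from datetime import date


def financial_year_months(start_year: int):
    """Return (first_day, last_day) for each month from April start_year to March start_year + 1."""
    ranges = []
    for offset in range(12):
        month = (3 + offset) % 12 + 1            # 4, 5, ..., 12, 1, 2, 3
        year = start_year + (1 if month <= 3 else 0)
        last_day = calendar.monthrange(year, month)[1]
        ranges.append((date(year, month, 1), date(year, month, last_day)))
    return ranges


def missing_months(ranges, downloaded):
    """List months (as 'YYYY-MM') that have no saved statement yet."""
    return [start.strftime("%Y-%m") for start, _ in ranges
            if start.strftime("%Y-%m") not in downloaded]


months = financial_year_months(2023)             # FY 2023-24
print(months[0][0], "to", months[-1][1])         # 2023-04-01 to 2024-03-31

already_saved = {start.strftime("%Y-%m") for start, _ in months} - {"2023-08"}
print(missing_months(months, already_saved))     # ['2023-08']
```

Using `calendar.monthrange` keeps leap years correct, so February 2024 ends on the 29th without any special casing.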

Step 4: Format and Package

Every CA has their own preferred format. Some want everything in a single PDF. Some want separate folders. Some want an Excel summary sheet on top. The script formats the output based on a configuration file — change the config once and every client's package follows the same structure.
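
One way to drive the layout from configuration is to let the config decide where each file lands inside the package. The keys and folder names below are hypothetical, a sketch of the idea and not the format my script or any CA actually uses.

```python
from pathlib import PurePosixPath

# Hypothetical per-CA preferences.
CONFIG = {
    "layout": "folders",   # "folders" or "flat"
    "folder_by_doc_type": {
        "form16": "01_Form16",
        "ais": "02_AIS",
        "bank": "03_Bank_Statements",
    },
}


def package_path(client: str, doc_type: str, filename: str, config: dict) -> PurePosixPath:
    """Decide where a file goes inside the client's package."""
    root = PurePosixPath(client)
    if config["layout"] == "flat":
        return root / filename
    # Unknown document types fall into a catch-all folder instead of failing.
    folder = config["folder_by_doc_type"].get(doc_type, "99_Other")
    return root / folder / filename


print(package_path("ABCDE1234F", "bank", "2023-08.pdf", CONFIG))
# ABCDE1234F/03_Bank_Statements/2023-08.pdf
```

Switching a CA from folders to a flat layout is then a one-word change in the config, with no change to the code.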

Step 5: Email the Package

The final step is automatic delivery. The script compiles everything into a ZIP file and emails it to the CA with a checklist summary: which documents are included, which are missing (if any), and whether the AIS data has any discrepancies worth reviewing.
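
The packaging and checklist can be sketched with the standard library alone. The addresses, file names, and helper names here are placeholders, and the `send` function is defined but never called because the SMTP host, port, and credentials depend on your mail provider.

```python
import io
import smtplib
import zipfile
from email.message import EmailMessage


def build_zip(files: dict) -> bytes:
    """Zip in-memory files given as {archive_name: bytes}."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        for name, data in files.items():
            archive.writestr(name, data)
    return buffer.getvalue()


def build_checklist(expected, files) -> str:
    """One line per expected document, marked included or MISSING."""
    return "\n".join(
        f"[{'included' if name in files else 'MISSING'}] {name}" for name in expected
    )


def build_message(sender, recipient, client, expected, files) -> EmailMessage:
    """Compose the email: checklist as the body, ZIP as the attachment."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = f"ITR documents: {client}"
    msg.set_content(build_checklist(expected, files))
    msg.add_attachment(build_zip(files), maintype="application", subtype="zip",
                       filename=f"{client}_itr_documents.zip")
    return msg


def send(msg, host, port, username, password):
    """Deliver over SMTP with STARTTLS (not called in this sketch)."""
    with smtplib.SMTP(host, port) as server:
        server.starttls()
        server.login(username, password)
        server.send_message(msg)


# Placeholder content standing in for real downloads.
files = {"form16.pdf": b"%PDF-1.4 demo", "bank/2023-04.pdf": b"demo"}
expected = ["form16.pdf", "ais.json", "bank/2023-04.pdf"]
msg = build_message("me@example.com", "ca@example.com", "ABCDE1234F", expected, files)
print(build_checklist(expected, files))
```

Putting the checklist in the email body means the CA sees what is missing without opening the ZIP.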

Total execution time: under 3 minutes per client. Compare that to 45-90 minutes of manual work.

The Real Numbers: Why This Matters for CA Firms

Let me lay out the business case with real numbers, because abstract automation talk means nothing without specifics.

A CA firm with 300 clients spends roughly 300 hours on document collection if you assume an hour per client, which sits in the lower half of the 45-90 minute range. If a junior accountant costs the firm around ₹25,000 per month (about ₹150 per hour at roughly 167 working hours a month), that is ₹45,000 worth of labour, every tax season, on a task that adds zero intellectual value.

But the real cost is not the labour. It is the opportunity cost. Those 300 hours are 300 hours that could go towards actual tax planning, advisory work, and the kind of high-value services that clients pay a premium for. A CA billing ₹2,000 per hour for advisory work who spends 100 of those hours on document collection is leaving ₹2,00,000 on the table.
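
Both figures are easy to reproduce. This sketch uses only the rates and hours quoted above; the `format_inr` helper is my own addition to print amounts with Indian digit grouping, and it handles non-negative whole rupees only.

```python
HOURS_ON_COLLECTION = 300
JUNIOR_RATE = 150            # rupees per hour
ADVISORY_RATE = 2000         # rupees per hour
ADVISORY_HOURS_LOST = 100

labour_cost = HOURS_ON_COLLECTION * JUNIOR_RATE
opportunity_cost = ADVISORY_HOURS_LOST * ADVISORY_RATE


def format_inr(amount: int) -> str:
    """Format a non-negative integer with Indian grouping, e.g. 200000 -> '₹2,00,000'."""
    digits = str(amount)
    if len(digits) <= 3:
        return "₹" + digits
    head, tail = digits[:-3], digits[-3:]
    groups = []
    # After the last three digits, Indian notation groups in pairs.
    while len(head) > 2:
        groups.insert(0, head[-2:])
        head = head[:-2]
    groups.insert(0, head)
    return "₹" + ",".join(groups + [tail])


print(format_inr(labour_cost))        # ₹45,000
print(format_inr(opportunity_cost))   # ₹2,00,000
```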

The script I built cost me a Sunday afternoon. The version I am building for CA firms, with a proper dashboard, client management, and multi-bank support, should by my estimate save the average firm ₹3-5 lakh per tax season in recovered billable hours.

Why CA Firms Are the Perfect Automation Clients

After working with traders, e-commerce businesses, and SaaS companies on automation, I have come to believe that CA firms are the single most underserved market for business automation in India. Here is why:

Their work is highly repetitive. Tax filing follows the same steps for every client, every year. The inputs change but the process does not. This is automation's sweet spot.

They already have the data. Unlike businesses that need to first digitise their operations, CA firms already work with structured digital data — PDFs, Excel sheets, portal downloads. The data is there. It just needs to flow automatically.

The pain is seasonal but intense. March-end is not just busy. It is existentially stressful for most CA firms. Automation that alleviates even 30% of that March crunch is worth significant money to them.

They serve as multipliers. One CA firm automates their workflow and suddenly 300 of their clients benefit from faster, more accurate filings. The ripple effect is massive.

Beyond ITR: The Full CA Automation Stack

Document collection is just the starting point. Once a CA firm sees what automation can do for their ITR workflow, the next questions come naturally:

GST Reconciliation: Matching GSTR-2A with purchase registers is a monthly headache for every firm. A Python script that pulls both datasets and highlights mismatches saves 10-15 hours per month.
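
The core of such a reconciliation is a keyed comparison. A minimal sketch, assuming both datasets have already been loaded into lists of dictionaries; the field names, GSTINs, and amounts are made up, and real GSTR-2A exports carry many more columns.

```python
def reconcile(purchase_register, gstr2a, tolerance=1.0):
    """Compare invoices keyed by (supplier GSTIN, normalised invoice number)."""
    def index(rows):
        return {(r["gstin"], r["invoice_no"].strip().upper()): r for r in rows}

    books, portal = index(purchase_register), index(gstr2a)
    report = {"missing_in_2a": [], "missing_in_books": [], "amount_mismatch": []}
    for key in sorted(books.keys() | portal.keys()):
        if key not in portal:
            report["missing_in_2a"].append(key)
        elif key not in books:
            report["missing_in_books"].append(key)
        elif abs(books[key]["tax"] - portal[key]["tax"]) > tolerance:
            report["amount_mismatch"].append((key, books[key]["tax"], portal[key]["tax"]))
    return report


# Made-up sample data.
purchase_register = [
    {"gstin": "27AAAAA0000A1Z5", "invoice_no": "INV-001", "tax": 1800.0},
    {"gstin": "27AAAAA0000A1Z5", "invoice_no": "INV-002", "tax": 900.0},
    {"gstin": "29BBBBB1111B1Z3", "invoice_no": "b-17", "tax": 450.0},
]
gstr2a = [
    {"gstin": "27AAAAA0000A1Z5", "invoice_no": "INV-001", "tax": 1800.0},
    {"gstin": "29BBBBB1111B1Z3", "invoice_no": "B-17", "tax": 540.0},
    {"gstin": "29BBBBB1111B1Z3", "invoice_no": "B-18", "tax": 120.0},
]

report = reconcile(purchase_register, gstr2a)
for bucket, items in report.items():
    print(bucket, items)
```

Normalising the invoice number before matching avoids false mismatches from stray spaces or letter case, and the tolerance absorbs rounding differences of up to a rupee.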

TDS Certificate Collection: Chasing Form 16A from multiple deductors for a single client is one of the most frustrating parts of tax practice. Automated tracking and download changes the game.

Client Communication: Automated WhatsApp reminders that go out when documents are missing — not generic blasts, but personalised messages that say "We have your Form 16 but are still waiting for your HDFC Bank statement for October."
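
Generating that kind of message is a small templating job once you know what has arrived and what has not. This sketch only builds the text; how it gets delivered over WhatsApp is deliberately left out, and the function names and client name are illustrative.

```python
def join_items(items):
    """Join a list as 'a', 'a and b', or 'a, b and c'."""
    if len(items) == 1:
        return items[0]
    return ", ".join(items[:-1]) + " and " + items[-1]


def reminder_text(client_name, received, missing):
    """Return a personalised reminder, or None when nothing is missing."""
    if not missing:
        return None
    if received:
        return (f"Hi {client_name}, we have your {join_items(received)} "
                f"but are still waiting for your {join_items(missing)}.")
    return f"Hi {client_name}, we are still waiting for your {join_items(missing)}."


print(reminder_text("Priya", ["Form 16"], ["HDFC Bank statement for October"]))
# Hi Priya, we have your Form 16 but are still waiting for your HDFC Bank statement for October.
```

Returning `None` for complete clients makes it easy to skip them in a loop without sending an empty nudge.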

Audit Trail Generation: Every download, every email, every document version — automatically logged. When the tax department asks questions two years later, the firm has a complete digital paper trail.
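
A simple way to get such a trail is an append-only log with one JSON record per event, including a hash of the file so a later version can be told apart from the original. This is a sketch under my own assumptions about which fields matter; the record layout and function names are hypothetical.

```python
import hashlib
import json
from datetime import datetime, timezone


def audit_entry(action, client, filename, content: bytes, when=None):
    """Describe one event: what happened, to which file, and the file's SHA-256."""
    when = when or datetime.now(timezone.utc)
    return {
        "timestamp": when.isoformat(),
        "action": action,            # e.g. "download" or "email"
        "client": client,
        "file": filename,
        "sha256": hashlib.sha256(content).hexdigest(),
    }


def append_entry(log_path, entry):
    """Append the entry as one JSON line; earlier lines are never rewritten."""
    with open(log_path, "a", encoding="utf-8") as log:
        log.write(json.dumps(entry) + "\n")


entry = audit_entry("download", "ABCDE1234F", "form16.pdf", b"demo bytes",
                    when=datetime(2024, 3, 26, tzinfo=timezone.utc))
print(json.dumps(entry, indent=2))
```

Calling `append_entry("audit.jsonl", entry)` after every download or email builds up the trail one line at a time.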

Getting Started: What You Actually Need

You do not need to be a Python developer to begin automating your CA practice. Here is the minimum viable automation stack:

  • Python 3.x with the requests library for API calls, pandas for data formatting, and smtplib for email automation. That is it. No frameworks, no cloud infrastructure, no monthly subscription.
  • API access to your clients' financial portals. This is the tricky part: it requires client consent and proper credential management. Start with your own accounts to prove the concept.
  • A configuration file (even a simple Excel sheet works) that maps each client to their data sources, CA's preferred format, and delivery email.
  • 30 minutes of setup per client the first time. After that, every subsequent tax season is fully automated.
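
As an illustration of the configuration file, here is a sketch that reads a client sheet exported as CSV using only the standard library. The column names and sample rows are hypothetical; an Excel sheet would work just as well with a library that reads .xlsx files.

```python
import csv
import io

# Hypothetical client sheet, as it might look after exporting to CSV.
SAMPLE = """client_id,name,bank,form16_source,layout,ca_email
C001,Client One,Bank A,hrms,folders,ca@example.com
C002,Client Two,Bank B,email,flat,ca@example.com
"""

REQUIRED_COLUMNS = {"client_id", "name", "bank", "form16_source", "layout", "ca_email"}


def load_clients(text: str) -> dict:
    """Parse the sheet into {client_id: row}, failing early if a column is missing."""
    reader = csv.DictReader(io.StringIO(text))
    missing = REQUIRED_COLUMNS - set(reader.fieldnames or [])
    if missing:
        raise ValueError(f"config is missing columns: {sorted(missing)}")
    return {row["client_id"]: row for row in reader}


clients = load_clients(SAMPLE)
for client_id, row in clients.items():
    print(client_id, row["bank"], row["layout"], row["ca_email"])
```

Checking the columns up front turns a silently skipped client into an immediate, readable error.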

The Bigger Picture

Somewhere in India right now, a CA is manually downloading their 200th Form 16 of the season. They are brilliant at tax law. They could be advising clients on Section 80C optimisation or structuring HUF transactions. Instead, they are renaming PDFs.

This is not a technology problem. Python has been able to do this for over a decade. It is an awareness problem. Most CA firms do not know that a Sunday afternoon of scripting can reclaim hundreds of hours of their practice every year.

The finance automation market in India is not a future opportunity. It is a present one. The tools exist. The data is digital. The pain is real and recurring. The only missing piece is someone to connect the dots.

That is exactly what I do. I automate chaos.


Archit Mittal is the founder of Automate Algos. He helps businesses automate chaos using AI agents and custom workflows. Connect with him on LinkedIn @automate-archit.
