UPI processes over 16 billion transactions a month in India. If you are like most people, you pay through PhonePe for groceries, GPay for cabs, Paytm for utilities, and your bank app for rent. At every month-end, reconciling spending across all those apps is a nightmare.
I built a Python script that reads exported SMS transaction alerts from my phone, parses them into a clean pandas DataFrame, categorises spending, and prints a monthly breakdown — all in 60 lines, offline, at zero cost.
Here is exactly how it works.
The Problem: Five Apps, Zero Consolidated View
My typical month looks like this:
- Groceries: PhonePe (about ₹8,000)
- Food delivery: GPay (about ₹3,500)
- Cabs: Paytm (about ₹2,200)
- Rent: HDFC app (₹28,000)
- Bills: Multiple apps (about ₹6,500)
No single app shows everything. Export is locked behind paid "Pro" tiers. Third-party aggregators want permission to read every SMS on my phone. I did not want to hand my financial history to yet another vendor.
SMS is the answer. Every UPI transaction sends a bank-originated SMS. If I can export SMS and parse them locally, I own the data.
Step 1: Export SMS to CSV
Android phones can export SMS via "SMS Backup & Restore" or similar apps. You get an XML or CSV file. Drop it into a folder like ~/upi-data/sms.csv.
The file looks like:
date,address,body
2026-04-15,HDFCBK,"Rs.450.00 debited from A/c XX1234 on 15-APR for UPI/412839-BIGBASKET"
2026-04-15,AXISBK,"INR 120 spent on UPI to OLA-CABS ref 339400 on 15-Apr"
Step 2: The Python Script
import pandas as pd
import re
# UPI transaction regex patterns for major Indian banks
PATTERNS = {
'hdfc': r'Rs\.?([\d,]+\.?\d*)\s+debited.*UPI[/-]\w+.*-(\w+)',
'axis': r'INR\s+([\d,]+\.?\d*)\s+spent.*UPI\s+to\s+([\w-]+)',
'sbi': r'Rs\.?([\d,]+\.?\d*)\s+transferred.*UPI.*to\s+(\w+)',
'icici': r'Rs\.?([\d,]+\.?\d*)\s+debited.*UPI.*?(\w+)$',
}
CATEGORIES = {
'BIGBASKET': 'Groceries', 'GROFERS': 'Groceries', 'ZEPTO': 'Groceries',
'SWIGGY': 'Food', 'ZOMATO': 'Food', 'DOMINOS': 'Food',
'OLA': 'Transport', 'UBER': 'Transport', 'RAPIDO': 'Transport',
'NETFLIX': 'Entertainment', 'SPOTIFY': 'Entertainment', 'HOTSTAR': 'Entertainment',
'AIRTEL': 'Utilities', 'JIO': 'Utilities', 'ELECTRICITY': 'Utilities',
}
def parse_transaction(body, sender):
bank = (sender or '').lower()[:4]
for key, pattern in PATTERNS.items():
if key in bank:
match = re.search(pattern, body, re.IGNORECASE)
if match:
amount = float(match.group(1).replace(',', ''))
merchant = match.group(len(match.groups())).upper()
return amount, merchant
return None, None
def categorize(merchant):
if not merchant:
return 'Other'
for keyword, category in CATEGORIES.items():
if keyword in merchant.upper():
return category
return 'Uncategorized'
def build_report(csv_path):
df = pd.read_csv(csv_path)
df['date'] = pd.to_datetime(df['date'])
rows = []
for _, row in df.iterrows():
amount, merchant = parse_transaction(row['body'], row['address'])
if amount:
rows.append({
'date': row['date'],
'amount': amount,
'merchant': merchant,
'category': categorize(merchant)
})
report = pd.DataFrame(rows)
monthly = report.groupby([
report['date'].dt.to_period('M'),
'category'
])['amount'].sum().unstack(fill_value=0)
print("\n=== Monthly UPI Spending Report ===")
print(monthly.round(0).to_string())
print(f"\nTotal transactions: {len(report)}")
print(f"Total spent: Rs.{report['amount'].sum():,.0f}")
return report
if __name__ == "__main__":
report = build_report('sms.csv')
report.to_csv('upi_report.csv', index=False)
Step 3: Run It
pip install pandas
python upi_tracker.py
Sample output against my March–April 2026 data:
=== Monthly UPI Spending Report ===
category Food Groceries Transport Utilities Uncategorized
date
2026-03 3240 8450 2180 1890 5420
2026-04 2890 7920 1670 1890 4100
Total transactions: 84
Total spent: Rs.42,760
I went from "no idea where the money goes" to a clean monthly breakdown in under 30 seconds.
Why This Beats Third-Party Apps
Data stays on your machine. No SMS permissions granted to random apps. No servers uploading your transaction history.
Fully customisable. Do not like my categories? Edit the CATEGORIES dict. Want a chart? Three lines of matplotlib. Want to track only food spend? One filter.
Zero cost. Commercial expense trackers charge ₹200–500/month. This is free forever.
Scales well. I parsed three years of SMS (2,400+ transactions) in under four seconds on a 2019 MacBook Air.
Extensions I Layered On
After the base 60 lines, I added:
- Matplotlib chart — saves a PNG bar chart of category-wise monthly spend.
- SQLite cache — stores parsed transactions so re-runs are instant.
- Telegram bot — sends a weekly summary to my phone via a free bot token.
- Business-expense tag — marks GST-relevant transactions for year-end filing.
Each extension added about 15 lines of Python. The whole script is now ~150 lines and has replaced a ₹499/month expense-tracking app I was paying for.
Honest Limitations
SMS format changes. When a bank updates its template, the regex breaks. I add new patterns every 6–8 months.
Not every transaction triggers SMS. Some UPI auto-pay and recurring flows can be silent. I reconcile quarterly against the bank statement.
Bank-specific regex. I have patterns for HDFC, Axis, SBI, and ICICI. Other banks need 10 minutes of custom regex each.
What You Can Build Next
- A full personal-finance dashboard (Streamlit is perfect).
- ML-based merchant classification with scikit-learn on a few hundred labelled rows.
- A rules engine — "alert me if food spend crosses ₹5,000 in a week".
- Cross-reference with Zerodha or Upstox P&L for combined wealth tracking.
The core idea — your data, parsed by your own code, on your own machine — applies far beyond UPI. Start with 60 lines. Add what you actually need.
I'm Archit Mittal — I automate chaos for businesses. Follow me for daily automation content.
Top comments (0)