I'm a self-taught developer from South Africa, currently studying for my Bachelor of Accounting. Today I started a challenge: build and publish a new API every day for 21 days straight.
Day 1 is done. Here's exactly what I built, how it works, and how you can use it.
The Problem:
Every accounting app, expense tracker, and bookkeeping tool has the same pain point: invoices come in as PDFs, images, or plain text, and somebody has to pull the structured data out of them.
Vendor name. Invoice number. Line items. Tax. Total. Due date.
It's repetitive, error-prone work. And every developer building a finance tool ends up writing the same messy regex logic to solve it.
So I built an API that does it for them.
What I Built:
The Invoice & Receipt Parser API — send it raw invoice text, get back clean structured JSON.
Input:
{
"text": "Acme Corp\nInvoice No: INV-2024-0042\nDate: 15/03/2024\nDue: 15/04/2024\n\nWeb Design Services 2 $1500.00 $3000.00\nSEO Optimization 1 $800.00 $800.00\n\nSubtotal: $3800.00\nVAT 15%: $570.00\nTotal Due: $4370.00"
}
Output:
{
"success": true,
"data": {
"document_type": "invoice",
"vendor_name": "Acme Corp",
"invoice_number": "INV-2024-0042",
"dates": {
"invoice_date": "15/03/2024",
"due_date": "15/04/2024"
},
"currency": "USD",
"totals": {
"subtotal": 3800,
"tax_rate": 15,
"tax_amount": 570,
"discount": null,
"shipping": null,
"total": 4370
},
"line_items": [
{
"description": "Web Design Services",
"quantity": 2,
"unit_price": 1500,
"amount": 3000
},
{
"description": "SEO Optimization",
"quantity": 1,
"unit_price": 800,
"amount": 800
}
],
"confidence": {
"score": 100,
"level": "high"
}
}
}
No AI costs. No third-party services to call. It's plain Node.js and regex, which means near-zero running costs and sub-100ms response times.
The 6 Endpoints
| Method | Endpoint | Description |
|---|---|---|
| GET | /health | Check API is online |
| POST | /parse | Full extraction (all fields) |
| POST | /parse/totals | Financial totals only |
| POST | /parse/line-items | Line items only |
| POST | /parse/vendor | Vendor details only |
| POST | /validate | Completeness score + missing fields |
The lightweight endpoints are useful for high-volume pipelines where you only need one piece of data and don't want to pay for a full parse every time.
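To make the endpoint table concrete, here is a minimal sketch of a client call from Node.js 18+ (which ships a global `fetch`). The host name below is a placeholder, not the real one; use the host and key shown on the API's RapidAPI page. The helper names `buildParseRequest` and `parseInvoice` are made up for this example, and I'm assuming every POST endpoint accepts the same `{ "text": ... }` body and wraps its result in `data`, as `/parse` does above.

```javascript
// Placeholder host (an assumption): copy the real value from the RapidAPI page.
const API_HOST = "invoice-receipt-parser.example.com";

// Build the URL and fetch options for one of the POST endpoints.
function buildParseRequest(endpoint, text, apiKey) {
  return {
    url: `https://${API_HOST}${endpoint}`,
    options: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        // Standard RapidAPI authentication headers.
        "X-RapidAPI-Key": apiKey,
        "X-RapidAPI-Host": API_HOST,
      },
      body: JSON.stringify({ text }),
    },
  };
}

// Send the request and return the "data" object from the response.
async function parseInvoice(endpoint, text, apiKey) {
  const { url, options } = buildParseRequest(endpoint, text, apiKey);
  const res = await fetch(url, options);
  if (!res.ok) throw new Error(`Request failed with status ${res.status}`);
  const body = await res.json();
  return body.data;
}
```

If the assumptions hold, something like `await parseInvoice("/parse/totals", invoiceText, process.env.RAPIDAPI_KEY)` would return only the totals for a given invoice.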
How It Works
The core logic is a chain of regex-based extractors, each one focused on one thing.
function fullParse(text) {
  // Extract every field first; the confidence score depends on the results.
  const data = {
    document_type: detectDocType(text),
    vendor_name: extractVendor(text),
    invoice_number: extractInvoiceNumber(text),
    dates: extractDates(text),
    currency: extractCurrency(text),
    totals: extractTotals(text),
    line_items: extractLineItems(text),
    payment_info: extractPaymentInfo(text),
    contact: extractContact(text),
  };
  data.confidence = calculateConfidence(data);
  return data;
}
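The post doesn't show `calculateConfidence`, so here is one way such a scorer could look. This is a sketch, not the real implementation: the list of fields that count and the `high` / `medium` / `low` thresholds are my own assumptions, picked so that a fully parsed invoice like the sample above scores 100 and `"high"`.

```javascript
// Hypothetical list of top-level fields counted toward the score.
const KEY_FIELDS = ["vendor_name", "invoice_number", "currency"];

function calculateConfidence(data) {
  // One boolean per check: true when the field was extracted.
  const checks = [
    ...KEY_FIELDS.map((field) => data[field] != null),
    data.dates?.invoice_date != null,
    data.totals?.total != null,
    Array.isArray(data.line_items) && data.line_items.length > 0,
  ];
  const found = checks.filter(Boolean).length;
  const score = Math.round((found / checks.length) * 100);
  // Thresholds are illustrative only.
  const level = score >= 80 ? "high" : score >= 50 ? "medium" : "low";
  return { score, level };
}
```

A scorer like this is a completeness check rather than a correctness check: it says how many fields were found, not whether the values are right.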
Currency detection checks for 25+ currency codes and symbols:
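const CURRENCY_CODES = ["USD", "EUR", "GBP", "ZAR", "INR" /* ... */];
function extractCurrency(text) {
  // Prefer an explicit code such as "ZAR" when it appears as a whole word.
  for (const code of CURRENCY_CODES) {
    if (new RegExp(`\\b${code}\\b`).test(text)) return code;
  }
  // Fall back to the dollar sign, treated as USD.
  if (text.includes("$")) return "USD";
  return null; // no currency detected
}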
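The snippet above only shows the `$` fallback. A symbol lookup for other currencies could be sketched like this; the table contents, their order, and the name `detectCurrencyBySymbol` are assumptions for illustration, not the API's actual code.

```javascript
// Hypothetical symbol table; coverage and ordering are assumptions.
const SYMBOL_PATTERNS = [
  [/€/, "EUR"],
  [/£/, "GBP"],
  [/₹/, "INR"],
  [/\bR\s?\d/, "ZAR"], // rand amounts such as "R1 500.00"
  [/\$/, "USD"], // "$" is ambiguous (USD, CAD, AUD, ...); USD is a default guess
];

// Return the first currency whose symbol appears in the text, or null.
function detectCurrencyBySymbol(text) {
  for (const [pattern, code] of SYMBOL_PATTERNS) {
    if (pattern.test(text)) return code;
  }
  return null;
}
```

Checking explicit codes first and symbols second makes sense because a code like `ZAR` is unambiguous, while a bare `$` could belong to several currencies.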
One bug I caught during testing: the tax amount extractor was matching 15 from VAT 15% instead of the actual amount, 570.00. The fix was to require a decimal point in the match:
// BROKEN — matches "15" from "VAT 15%"
/(?:vat|tax)[\s:$]*([0-9,. ]+)/i
// FIXED — requires decimal format, skips percentages
/(?:vat|tax)[\s:%\d]*?[\s:$£€]+([0-9,]+\.[0-9]{2})/i
Always test with real messy invoice text before shipping.
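Here is a small script that runs both patterns from the tax bug against a few sample lines, so the difference is visible. The two regexes are copied from above; the helper `captureAmount` and the sample strings are mine.

```javascript
const BROKEN_TAX = /(?:vat|tax)[\s:$]*([0-9,. ]+)/i;
const FIXED_TAX = /(?:vat|tax)[\s:%\d]*?[\s:$£€]+([0-9,]+\.[0-9]{2})/i;

// Return the first capture group as a number, or null if nothing matched.
function captureAmount(pattern, line) {
  const match = line.match(pattern);
  if (!match) return null;
  // Strip thousands separators and stray spaces before converting.
  return Number(match[1].replace(/[,\s]/g, ""));
}

const samples = ["VAT 15%: $570.00", "Tax: 1,250.50", "VAT 15%: 570"];
for (const line of samples) {
  console.log(
    JSON.stringify(line),
    "broken:", captureAmount(BROKEN_TAX, line),
    "fixed:", captureAmount(FIXED_TAX, line)
  );
}
```

On the first sample the broken pattern captures 15 and the fixed one captures 570. The last sample shows the trade-off: because the fixed pattern requires two decimal places, a whole-number amount like `570` is not matched at all and comes back as `null`.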
Tech Stack
- Runtime: Node.js + Express
- Hosting: Railway (free tier)
- Marketplace: RapidAPI
- Dependencies: express, cors, helmet, morgan, express-rate-limit
Zero paid APIs. Zero AI costs. The whole thing costs less than $5/month to run.
Pricing on RapidAPI
| Plan | Price | Requests |
|---|---|---|
| Free | $0 | 10/month |
| Basic | $9.99/mo | 500/month |
| Pro | $29.99/mo | 5,000/month |
What I Extracted From This Build
My accounting background actually helped here. I knew exactly which fields matter on a real invoice: payment terms, VAT rates, SWIFT codes, IBAN numbers. That domain knowledge made the extractor more accurate than a generic solution would be.
It's a reminder that your background, whatever it is, is an advantage when building in the right niche.
Try It
The API is live on RapidAPI. Search for "Invoice Receipt Parser" or find my profile at https://rapidapi.com/user/ruanmul04.
Free tier gives you 10 requests/month to test it with your own invoices.
What's Next
Day 2 tomorrow — Password Strength & Security Scorer API.
If you want to follow the 21-day build challenge, follow me here on dev.to. I'll be posting every day with the full breakdown of what I built, why, and how.
Drop a comment if you're building APIs too. I'm always keen to connect with other developers doing the same thing. 🇿🇦
Built in South Africa. Sold globally.