I spent last Tuesday typing 47 invoices into a spreadsheet. By invoice number 30, I'd already caught two mistakes I'd made on earlier ones. One was a transposed number that would've thrown off a client reconciliation.
This is what bookkeeping still looks like at a lot of firms. You'd think we'd have moved past it by now.
The math on manual entry is ugly
A bookkeeper processing 100 documents a day at 4 minutes each burns 400 minutes, or roughly 6.7 hours, just on data entry. That's not analysis, not advisory work, not anything billable at a premium rate. Just typing numbers from one place into another.
The error rate makes it worse. Manual entry error rates sit somewhere between 1% and 4%, depending on whose research you trust. On a thousand transactions a month, that's 10 to 40 entries that need fixing. Some of those get caught during reconciliation. Some don't.
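If you want to plug in your own numbers, here's a quick back-of-the-envelope sketch of the arithmetic above. The function names are mine, and the inputs are just the figures from this section, so swap in your own volume and timing.

```python
def daily_entry_hours(docs_per_day: int, minutes_per_doc: float) -> float:
    """Hours per day spent purely on keying documents in."""
    return docs_per_day * minutes_per_doc / 60


def expected_errors(transactions: int, error_rate: float) -> float:
    """Expected number of bad entries at a given per-entry error rate."""
    return transactions * error_rate


# Figures from this section: 100 docs/day at 4 minutes each,
# and 1,000 transactions/month at a 1% to 4% error rate.
hours = daily_entry_hours(100, 4)
low = expected_errors(1000, 0.01)
high = expected_errors(1000, 0.04)

print(f"Data entry: {hours:.1f} hours/day")
print(f"Entries to fix per month: {low:.0f} to {high:.0f}")
```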
What's actually different about AI document processing now
OCR has been around forever. But old-school OCR just read characters off a page. It didn't understand what it was reading.
The newer tools do something different. They look at a document and figure out what kind of document it is, where the important fields are, and how those fields relate to each other. An invoice has vendor info at the top, line items in the middle, totals at the bottom. A bank statement has transaction rows with dates, descriptions, amounts. The AI picks up on these structures.
The part that matters most for accounting work is table detection. Invoices are basically tables. Statements are tables. If the tool can't read tables properly, it's useless for finance work. I've tried tools that extracted text fine but mangled every table into an unreadable mess.
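One cheap way to tell whether a tool mangled a table is to check that the extracted line items still add up to the invoice total. This is only a sketch of that idea, not how any particular tool works: the field names (`quantity`, `unit_price`) and the one-cent tolerance are assumptions I made up for illustration.

```python
from decimal import Decimal


def line_items_match_total(line_items, stated_total, tolerance=Decimal("0.01")):
    """Return True when quantity * unit_price across all rows equals the stated total."""
    computed = sum(
        (Decimal(str(row["quantity"])) * Decimal(str(row["unit_price"]))
         for row in line_items),
        Decimal("0"),
    )
    return abs(computed - Decimal(str(stated_total))) <= tolerance


# Hypothetical extracted rows; the keys are illustrative, not a real tool's schema.
rows = [
    {"description": "Consulting", "quantity": 3, "unit_price": "150.00"},
    {"description": "Travel", "quantity": 1, "unit_price": "82.50"},
]

print(line_items_match_total(rows, "532.50"))  # rows add up to the total
print(line_items_match_total(rows, "523.50"))  # transposed digits in the total
```

Using `Decimal` instead of floats keeps cents exact, which matters once you start comparing totals.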
Then there's verification. The good tools tell you when they're not sure about something. They'll flag a field with low confidence so you can check it yourself instead of silently getting it wrong.
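In practice, confidence flagging boils down to something like the sketch below. The data shape and the 0.85 cutoff are assumptions for illustration; real tools expose confidence in their own formats and you'd tune the threshold to your own tolerance for risk.

```python
REVIEW_THRESHOLD = 0.85  # assumed cutoff, not a value from any specific tool


def split_for_review(fields, threshold=REVIEW_THRESHOLD):
    """Separate extracted fields into auto-accepted and needs-a-human lists."""
    accepted, needs_review = [], []
    for field in fields:
        target = accepted if field["confidence"] >= threshold else needs_review
        target.append(field)
    return accepted, needs_review


# Hypothetical extraction result for one invoice.
extracted = [
    {"name": "vendor", "value": "Acme Supply", "confidence": 0.98},
    {"name": "invoice_total", "value": "1,240.00", "confidence": 0.62},
    {"name": "invoice_date", "value": "2024-03-05", "confidence": 0.91},
]

accepted, needs_review = split_for_review(extracted)
for field in needs_review:
    print(f"Check {field['name']}: {field['value']} (confidence {field['confidence']:.2f})")
```

The point is that a low-confidence total lands in front of a person instead of going straight into the ledger.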
What I looked for when testing tools
I had a few priorities:
- Does it handle the documents I actually deal with? Not just clean digital PDFs, but scanned copies, faxes (yes, still), and the occasional handwritten note.
- Can it read tables without destroying the row/column structure? This is the dealbreaker.
- What happens to my data after processing? I'm not uploading client financials to a tool that keeps copies.
- Does it output formats I can use? Excel and CSV at minimum. Bonus points for QuickBooks or Xero integration.
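On the output side, CSV is the lowest common denominator, so it's worth knowing what clean output should look like. Here's a minimal sketch using Python's standard `csv` module; the column names and sample rows are made up, not any tool's actual export format.

```python
import csv
import io


def rows_to_csv(rows, columns):
    """Serialize a list of dicts to CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=columns, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical extracted transactions.
invoice_rows = [
    {"date": "2024-03-05", "vendor": "Acme Supply", "amount": "1240.00"},
    {"date": "2024-03-07", "vendor": "Smith, Jones & Co", "amount": "315.75"},
]

csv_text = rows_to_csv(invoice_rows, ["date", "vendor", "amount"])
print(csv_text)
```

Note the vendor name with a comma in it. A proper CSV writer quotes that field so it doesn't split into two columns when you open the file in Excel, which is exactly the kind of thing a sloppy export gets wrong.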
What I've been using
I've been running documents through PDF2TEXT for the past few months. It handles the table extraction problem better than the other tools I tried, which is the main reason I stuck with it.
The zero-retention policy mattered to me. Documents get processed and deleted. Given the sensitivity of financial data, I wasn't comfortable with alternatives that stored files on their servers.
It outputs to Excel, CSV, and JSON. The QuickBooks and Xero integrations save an extra step for clients on those platforms. When it's unsure about a field, it flags it for review rather than guessing.
It's not perfect. Handwritten documents with bad penmanship still trip it up sometimes. But for typed and scanned docs, it's been solid enough that I stopped checking every single extraction.
The point
I'm not going to tell you AI is going to replace accountants. That take is tired and wrong. Accountants do work that requires judgment, context, and client relationships. None of that is going away.
But the data entry part? That's just mechanical work. It's the part of the job nobody went to school for, and it's the part that eats up the most hours. If a tool can do 90% of it correctly and flag the other 10% for you to check, that's a trade worth making.
Try it with a batch of 20 invoices. Time yourself doing them manually, then run the same batch through an AI tool. The difference is usually obvious enough that the decision makes itself.
What's your setup for handling document processing? Still manual, partially automated, fully automated? Curious what's working for people.