Hey dev.to community,
Last week I shipped a workflow I built for a friend who runs an online shop. He called me again a few days later with a new headache: he's drowning in Purchase Orders. Every single one gets opened by hand, the data typed into a Google Sheet, and that sheet uploaded into his ERP to update his numbers. Hours a week, pure copy-paste.
So I built him something to kill that step. He uploads the PO PDFs through a simple n8n form, and a structured Google Sheet comes out the other end. He just downloads it and pushes it to his ERP.
How it's set up:
The form accepts multiple PDFs at once, so he can batch a whole stack instead of doing them one by one. Each PO loops through on its own so nothing gets jumbled.
The extraction runs on the easybits Extractor node (@easybits/n8n-nodes-extractor). I set the field structure up in two parts: the header fields that appear once per PO (PO number, PO date, delivery date, mark for, PR number, reference no), plus an articles array for the line items, each holding article name, unit and quantity. That array is the key bit: it gives you one entry per row of the PO table, and I flatten it into one sheet row per article, with the header details repeated on each.
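To make the flattening step concrete, here is a minimal Python sketch of the idea. It is not the code from the workflow: the key names (po_number, articles, article_name and so on) and the shape of the extracted object are my assumptions for illustration, so check them against what the Extractor node actually returns for your field setup.

```python
from typing import Any

# Assumed key names for the fields described above; the real keys
# depend on how the fields are named in the extractor configuration.
HEADER_FIELDS = ["po_number", "po_date", "delivery_date",
                 "mark_for", "pr_number", "reference_no"]
ARTICLE_FIELDS = ["article_name", "unit", "quantity"]


def flatten_po(extracted: dict[str, Any]) -> list[dict[str, Any]]:
    """Turn one extracted PO into one row per article.

    The header values are copied onto every row so each sheet row
    is complete on its own.
    """
    header = {name: extracted.get(name) for name in HEADER_FIELDS}
    rows = []
    for article in extracted.get("articles") or []:
        row = dict(header)  # copy, so rows do not share one dict
        for name in ARTICLE_FIELDS:
            row[name] = article.get(name)
        rows.append(row)
    return rows


# Made-up sample data, only to show the shape of the result.
sample = {
    "po_number": "PO-1001",
    "po_date": "2024-05-02",
    "articles": [
        {"article_name": "Widget", "unit": "pcs", "quantity": 10},
        {"article_name": "Gasket", "unit": "box", "quantity": 2},
    ],
}
for row in flatten_po(sample):
    print(row)
```

A PO with two line items becomes two rows that share the same PO number and date. Header fields missing from the extraction come through as None rather than raising, which leaves them available for a later emptiness check.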
Two things I added because real documents are messy:
Error flagging. If any field comes back empty, the completion screen lists which document and which field didn't extract cleanly, so he knows exactly which PO to double-check instead of trusting it blindly.
Document name column. The original filename lands in the sheet next to every row, so if a number looks off he can jump straight back to the source PDF.
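Both safeguards can be sketched in a few lines of Python. Again, this is an illustration under assumptions rather than the workflow's actual code: the key names, the document_name column name and the message format are hypothetical.

```python
from typing import Any

# Assumed key names, matching the field list described earlier.
HEADER_FIELDS = ["po_number", "po_date", "delivery_date",
                 "mark_for", "pr_number", "reference_no"]
ARTICLE_FIELDS = ["article_name", "unit", "quantity"]


def is_empty(value: Any) -> bool:
    """Treat None and blank strings as empty; a quantity of 0 is not empty."""
    return value is None or (isinstance(value, str) and value.strip() == "")


def find_empty_fields(document_name: str, extracted: dict[str, Any]) -> list[str]:
    """List every field that came back empty, labelled with the source file."""
    problems = []
    for name in HEADER_FIELDS:
        if is_empty(extracted.get(name)):
            problems.append(f"{document_name}: {name}")
    articles = extracted.get("articles") or []
    if not articles:
        problems.append(f"{document_name}: articles")
    for index, article in enumerate(articles, start=1):
        for name in ARTICLE_FIELDS:
            if is_empty(article.get(name)):
                problems.append(f"{document_name}: articles[{index}].{name}")
    return problems


def add_document_name(rows: list[dict[str, Any]], document_name: str) -> list[dict[str, Any]]:
    """Return copies of the rows with the source filename in its own column."""
    return [{"document_name": document_name, **row} for row in rows]
```

The list returned by find_empty_fields is the kind of thing a completion screen could show, one line per document and field. Because every sheet row also carries document_name, a suspicious value can be traced back to its PDF without guessing.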
Workflow JSON is on GitHub: https://github.com/felix-sattler-easybits/n8n-workflows/blob/c38749a68fd6ea4ae6ebff41789d35cceaacdef1/easybits-purchase-order-extractor-workflow/easybits_purchase_order_extractor_workflow.json
Anyone else automating document-to-sheet data entry? Curious how you're handling the messy multi-line rows. That was the trickiest part to get right.
Best,
Felix