Managing invoices for freelance SEO projects gets repetitive fast, especially when you copy the data into spreadsheets by hand. I built a small Python script that converts an invoice PDF into a CSV file for quick bookkeeping and reporting.
Here’s the script:
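import csv

import pdfplumber


def extract_invoice_data(pdf_path):
    """Return the 'Label: value' lines of a PDF invoice as a dict."""
    with pdfplumber.open(pdf_path) as pdf:
        # extract_text() may return None for a page with no text (older
        # pdfplumber versions), so fall back to an empty string. Joining
        # with a newline keeps lines from adjacent pages separate.
        text = '\n'.join(page.extract_text() or '' for page in pdf.pages)

    # Simple parsing logic for common invoice fields
    data = {}
    for line in text.split('\n'):
        if ':' in line:
            # Split on the first colon only, so values may contain colons
            key, value = line.split(':', 1)
            data[key.strip()] = value.strip()
    return data


def save_to_csv(data, csv_path):
    """Write one invoice as a header row plus a single data row."""
    with open(csv_path, 'w', newline='', encoding='utf-8') as csvfile:
        writer = csv.DictWriter(csvfile, fieldnames=list(data.keys()))
        writer.writeheader()
        writer.writerow(data)


# Example usage
invoice_data = extract_invoice_data('invoice.pdf')
save_to_csv(invoice_data, 'invoice.csv')
print('Conversion complete!')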
What it does
Extracts text from invoice PDFs using pdfplumber
Detects simple key:value invoice fields
Saves structured data into a CSV file
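To make the key:value detection concrete, here is a minimal sketch that runs the same split-on-first-colon logic on a made-up block of text instead of a real PDF. The helper name `parse_key_value_lines` and the sample invoice text are assumptions for illustration only.

```python
def parse_key_value_lines(text):
    """Collect 'Label: value' lines into a dict, splitting on the first colon only."""
    data = {}
    for line in text.split("\n"):
        if ":" in line:
            key, value = line.split(":", 1)
            data[key.strip()] = value.strip()
    return data


# Made-up text standing in for what a PDF text extractor might return.
sample_text = (
    "ACME SEO Services\n"
    "Invoice Number: INV-001\n"
    "Date: 2024-01-15\n"
    "Due: 10:00 on 2024-02-01\n"
    "Total: $1,250.00"
)

parsed = parse_key_value_lines(sample_text)
print(parsed)
```

The first line has no colon, so it is skipped. The "Due" value keeps its own colon because only the first one is used as the separator. The flip side is that any line containing a colon is treated as a field, and a repeated label overwrites the earlier value.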
Best for
Freelancers
SEO agencies
Small businesses
Quick invoice processing tasks
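This works well for basic invoice layouts where each field sits on its own line as Label: value. For more complex invoices, regex patterns may be needed, and scanned PDFs need an OCR tool first, since pdfplumber only reads text that is already embedded in the file.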
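As a rough idea of what the regex route could look like, the sketch below pulls an invoice number and a total out of lines that have no colon at all. The patterns, the field names, and the sample text are all hypothetical; real invoices will need patterns tuned to their own wording.

```python
import re

# Hypothetical patterns for two common fields. Adjust them per invoice layout.
PATTERNS = {
    "invoice_number": re.compile(
        r"Invoice\s*(?:Number|No\.|#)\s*:?\s*([A-Za-z0-9-]+)", re.IGNORECASE
    ),
    # \b keeps "Subtotal" from matching as "Total".
    "total": re.compile(
        r"\bTotal\b\s*(?:Due)?\s*:?\s*\$?\s*([\d,]+\.\d{2})", re.IGNORECASE
    ),
}


def extract_fields(text):
    """Return the first match for each pattern, or None when a field is missing."""
    fields = {}
    for name, pattern in PATTERNS.items():
        match = pattern.search(text)
        fields[name] = match.group(1) if match else None
    return fields


# Made-up invoice text without "Label: value" lines.
sample_text = (
    "ACME SEO Services\n"
    "Invoice # INV-2024-007\n"
    "Subtotal 1,000.00\n"
    "Total Due $1,250.00"
)

fields = extract_fields(sample_text)
print(fields)
```

Returning None for a missing field, rather than dropping the key, keeps every invoice on the same set of columns when the results are written to CSV.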
If you’re processing large batches regularly, dedicated tools can save time, but for lightweight workflows this script has been pretty useful.
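For a small batch, one option is to collect one dict per invoice and write them all to a single CSV. The sketch below assumes the rows are already extracted and that different invoices may expose different fields, so it builds the header from the union of keys. The function name `save_many_to_csv` and the `invoices/` folder mentioned in the comment are hypothetical.

```python
import csv


def save_many_to_csv(rows, csv_path):
    """Write a list of dicts to one CSV, using the union of keys as the header."""
    # Preserve first-seen order so the columns stay predictable.
    fieldnames = []
    for row in rows:
        for key in row:
            if key not in fieldnames:
                fieldnames.append(key)

    with open(csv_path, "w", newline="", encoding="utf-8") as csvfile:
        # restval fills in fields that a given invoice does not have.
        writer = csv.DictWriter(csvfile, fieldnames=fieldnames, restval="")
        writer.writeheader()
        writer.writerows(rows)
    return fieldnames


# Possible usage with the script's extractor (folder name is an assumption):
#   import glob
#   rows = [extract_invoice_data(p) for p in sorted(glob.glob("invoices/*.pdf"))]
#   save_many_to_csv(rows, "invoices.csv")
```

Each invoice becomes one row, and a field missing from a given invoice is left as an empty cell instead of raising an error.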
What’s your preferred method for invoice data extraction?