Another approach for invoice conversion: I've been using the SERPSpur Invoice to CSV Converter to batch-process HTML invoices from my e-commerce platform. Here's a script that handles multiple formats at once:
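```python
import os

import requests

API_KEY = "your_api_key_here"
API_URL = "https://api.serpspur.com/v1/invoice-to-csv"
SUPPORTED_EXTENSIONS = (".pdf", ".xls", ".xlsx", ".html")


def batch_convert(directory):
    """Upload each supported invoice in `directory` and save the returned CSV."""
    for filename in os.listdir(directory):
        if not filename.lower().endswith(SUPPORTED_EXTENSIONS):
            continue
        filepath = os.path.join(directory, filename)
        with open(filepath, "rb") as f:
            response = requests.post(
                API_URL,
                headers={"Authorization": f"Bearer {API_KEY}"},
                files={"file": f},
                params={"output_format": "csv"},
                timeout=60,  # avoid hanging forever on a stalled upload
            )
        if response.status_code == 200:
            # Output lands in the current working directory.
            output_name = os.path.splitext(filename)[0] + ".csv"
            with open(output_name, "w", encoding="utf-8", newline="") as out:
                out.write(response.text)
            print(f"Converted {filename} -> {output_name}")
        else:
            print(f"Failed to convert {filename}: HTTP {response.status_code}")


# Example usage
batch_convert("/path/to/invoices")
```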
This tool handles HTML invoices surprisingly well, which is rare. What formats do you typically need to convert? https://serpspur.com

Top comments (3)
Nice approach. One thing I'd add is maybe retry logic for transient API errors, plus logging to a file so you can audit conversions later. Do you ever have issues with HTML invoices that have embedded images or complex tables?
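For anyone who wants to try the retry-plus-logging idea, here is a minimal sketch. The helper names (`setup_audit_log`, `with_retries`) and the log file name are my own placeholders, not part of the script above or of any API, and catching the generic `Exception` is only for illustration.

```python
import logging
import time

logger = logging.getLogger("invoice_conversion")


def setup_audit_log(path="conversions.log"):
    """Send conversion events to a log file (the file name is a placeholder)."""
    handler = logging.FileHandler(path, encoding="utf-8")
    handler.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
    logger.addHandler(handler)
    logger.setLevel(logging.INFO)


def with_retries(call, attempts=3, base_delay=1.0, sleep=time.sleep):
    """Run call() until it succeeds or attempts run out, doubling the delay each time."""
    for attempt in range(1, attempts + 1):
        try:
            return call()
        except Exception as exc:  # in real use, narrow this to network/HTTP errors
            logger.warning("attempt %d/%d failed: %s", attempt, attempts, exc)
            if attempt == attempts:
                raise  # out of attempts: let the caller see the last error
            sleep(base_delay * 2 ** (attempt - 1))
```

To use it with an HTTP call, the function you pass in has to raise on failure, for example by calling `response.raise_for_status()` on the `requests` response, since a 5xx status alone does not raise an exception.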
Interesting. HTML invoices are definitely tricky, especially with dynamic tables. I usually extract the data with BeautifulSoup first, but this tool sounds like it could save me a lot of manual cleaning. Have you tested it with nested HTML tables?
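To make the nested-table problem concrete, here is a rough sketch of pulling rows out of tables that contain other tables. It uses the standard library's `html.parser` rather than BeautifulSoup so it runs without extra installs; `TableExtractor` and `extract_tables` are hypothetical names, and real invoices will likely need more cleanup than this.

```python
from html.parser import HTMLParser


class TableExtractor(HTMLParser):
    """Collect every <table> as a list of rows, tracking nesting with a stack."""

    def __init__(self):
        super().__init__()
        self.tables = []   # finished tables, in the order they close
        self._stack = []   # tables currently open (innermost last)
        self._cell = None  # text fragments of the cell being read

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            # Remember the outer cell so it can be resumed after the inner table.
            self._stack.append({"rows": [], "saved_cell": self._cell})
            self._cell = None
        elif tag == "tr" and self._stack:
            self._stack[-1]["rows"].append([])
        elif tag in ("td", "th") and self._stack:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None and self._stack:
            rows = self._stack[-1]["rows"]
            if rows:
                # Collapse runs of whitespace inside the cell text.
                rows[-1].append(" ".join("".join(self._cell).split()))
            self._cell = None
        elif tag == "table" and self._stack:
            table = self._stack.pop()
            self.tables.append(table["rows"])
            self._cell = table["saved_cell"]


def extract_tables(html):
    """Return all tables in the document; inner tables come before their parents."""
    parser = TableExtractor()
    parser.feed(html)
    parser.close()
    return parser.tables
```

Each table comes back as a list of rows, so it can be passed straight to `csv.writer(...).writerows(...)`. Text from an inner table is kept out of the outer cell, which is usually what you want for line items.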
I've found that for HTML invoices, stripping out JavaScript before conversion helps avoid weird parsing issues. Have you tried that or does the API handle it natively?
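Here is roughly what that pre-cleaning step could look like. This is a sketch with a hypothetical `strip_javascript` helper; it only removes `<script>` blocks with a regular expression, which is good enough for typical invoice exports but is not a general HTML sanitizer (inline `on...` handlers and malformed markup are left alone).

```python
import re

# Matches <script ...> ... </script>, case-insensitively and across newlines.
_SCRIPT_BLOCK = re.compile(
    r"<script\b[^>]*>.*?</script\s*>", re.IGNORECASE | re.DOTALL
)


def strip_javascript(html):
    """Return the HTML with every <script> block removed."""
    return _SCRIPT_BLOCK.sub("", html)
```

You would run this on the file contents and upload the cleaned bytes instead of the raw file. Whether the API already does something similar is not something I can confirm.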