How We Automated Catalog Image Extraction using Computer Vision & FastAPI

#python #fastapi #computervision #automation

For businesses in the stone, marble, and interior design industries, managing digital catalog assets is a massive headache.

When a new product catalog arrives as a 100-page PDF, design teams spend hours manually cropping out individual tile samples to upload to their websites or inventory sheets.

To automate this, we built Tile Extractor, a parsing engine that isolates individual tile samples from raw catalog PDFs.

How it Works (Under the Hood)

PDF Ingestion: A FastAPI backend accepts multi-page PDF uploads. Each page is then processed with PyMuPDF to extract its raw vector content and high-resolution layout structure.
Object Detection & Border Cleaning: Instead of relying on slow, expensive cloud vision APIs, we run spatial heuristics locally with Pillow and OpenCV. The engine analyzes:
- Edge density to isolate individual tile boundaries.
- Aspect ratios to filter out page noise (like page numbers or logos).
- Color distributions using RGB histograms.
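
The three signals above can be sketched in a few lines. This is an illustrative approximation, not the production code: it uses NumPy in place of OpenCV to stay small, and every name and threshold (`edge_density`, `plausible_tile`, `rgb_histogram`, the 30.0 gradient cutoff, the 100 px minimum side, the 3:1 ratio limit) is an assumption chosen for the example.

```python
import numpy as np


def edge_density(gray: np.ndarray, threshold: float = 30.0) -> float:
    """Fraction of pixels whose gradient magnitude exceeds `threshold`.

    `gray` is a 2D grayscale array. The threshold is an assumed value.
    """
    grad_y, grad_x = np.gradient(gray.astype(np.float32))
    magnitude = np.hypot(grad_x, grad_y)
    return float((magnitude > threshold).mean())


def plausible_tile(width: int, height: int,
                   min_side: int = 100, max_ratio: float = 3.0) -> bool:
    """Reject regions that are too small or too elongated to be a tile.

    Page numbers and logo strips tend to be tiny or very wide.
    Both limits are assumptions for this sketch.
    """
    if min(width, height) < min_side:
        return False
    return max(width, height) / min(width, height) <= max_ratio


def rgb_histogram(rgb: np.ndarray, bins: int = 8) -> np.ndarray:
    """Concatenated per-channel histogram, normalized to sum to 1.

    `rgb` is an (H, W, 3) uint8 array.
    """
    channels = [
        np.histogram(rgb[..., c], bins=bins, range=(0, 256))[0]
        for c in range(3)
    ]
    hist = np.concatenate(channels).astype(np.float64)
    return hist / hist.sum()
```

A candidate region would pass through `plausible_tile` first because it is the cheapest check, with the pixel-level measures reserved for regions that survive it.
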
Lossless Cropping: Once a tile is identified and classified, the engine crops it directly from the PDF's high-resolution asset stream, so the output keeps the full source resolution.
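
To show what a lossless crop means in practice, here is a small sketch using Pillow. It assumes the page is already available as a Pillow image and that the detection step produced a bounding box. How the engine reads pixels from the PDF itself is not covered here, and `crop_to_png` is a hypothetical helper name.

```python
import io

from PIL import Image


def crop_to_png(page_image: Image.Image,
                box: tuple[int, int, int, int]) -> bytes:
    """Crop `box` (left, upper, right, lower) and encode it as PNG.

    PNG is a lossless format, so the cropped pixels are stored exactly
    as they appear in the source image.
    """
    tile = page_image.crop(box)
    buffer = io.BytesIO()
    tile.save(buffer, format="PNG")
    return buffer.getvalue()
```

Saving as JPEG instead would re-compress the tile and alter pixel values, which is why PNG is the natural output format for this step.
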
Batch ZIP Packaging: The isolated tile PNGs are packaged into a single ZIP file and returned to the user as one download.
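
The packaging step needs nothing beyond the standard library. The sketch below builds the archive in memory. The helper name `zip_tiles` and the `tile_001.png` naming scheme are assumptions for illustration.

```python
import io
import zipfile


def zip_tiles(tiles: list[bytes], prefix: str = "tile") -> bytes:
    """Pack PNG byte strings into an in-memory ZIP archive.

    ZIP_STORED skips recompression, since PNG data is already compressed.
    """
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_STORED) as archive:
        for index, png in enumerate(tiles, start=1):
            archive.writestr(f"{prefix}_{index:03d}.png", png)
    return buffer.getvalue()
```

In a FastAPI route, the returned bytes could be sent back with `Response(content=..., media_type="application/zip")`, so no temporary files are needed.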

Why it Matters for B2B Automation

What used to take a human designer 2 hours now takes our engine 5 seconds. By running computer vision algorithms locally instead of calling cloud APIs, we eliminate usage fees and keep client data fully private.

If your business manages product catalogs, you can try the tool for free here:

👉 Try Tile Extractor: https://tile-extractor.onrender.com
👉 Explore our work: https://anantalabs.app/