How I fixed 1,200 product photos in a weekend (and why I stopped cloning pixels)
Head - how this started on 2025-01-14
On 2025-01-14, while preparing a small e-commerce migration for a client (Magento 2.4.8, images from older phones), I hit the usual wall: hundreds of product photos with date stamps, logos, and inconsistent backgrounds. I tried my standard Photoshop cloning workflow (version CC 2024.3) on a handful of images and realized, at image 17, that my shoulder hurt and the QA list kept growing. I opened the platform's ai image generator app to prototype a faster path and ended up rebuilding the pipeline around its image tools.
The rest of this post is a hands‑on retelling: what I tried, the exact commands and small scripts I used, what went wrong, before/after results, and why this platform ended up at the center of the solution for this project.
Body - what I actually built and why
Problem
The job: 1,200 images, mixed resolutions (640×480 to 3024×4032), many with overlaid text like watermarks or phone-generated date stamps. Requirements: preserve product edges, avoid soft patches, get all images to a consistent 1500px long edge, and remove visible text artifacts.
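The "consistent 1500px long edge" requirement is easy to pin down with a tiny helper that computes target dimensions while preserving aspect ratio (a sketch; the function name is mine, not from the pipeline):

```python
def target_size(width: int, height: int, long_edge: int = 1500) -> tuple[int, int]:
    """Scale (width, height) so the longer side equals long_edge,
    preserving aspect ratio and rounding to whole pixels."""
    scale = long_edge / max(width, height)
    return round(width * scale), round(height * scale)

print(target_size(640, 480))    # smallest source resolution -> (1500, 1125)
print(target_size(3024, 4032))  # largest source resolution  -> (1125, 1500)
```

Running it on the extremes of the batch shows every image lands on exactly one 1500px edge, which is what the QA checklist verifies later.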
My initial approach (and why it failed)
I first attempted a local OpenCV + manual mask pipeline. It looked reasonable on paper but failed on tricky cases (handwritten notes, reflections). The local prototype produced this error repeatedly when I tried automated masks in batch:
# error seen in my local pipeline logs
2025-01-15 03:12:49,712 ERROR: BatchMasker:400 Bad Request: mask not provided for image product_0723.jpg
That error came from an automated mask step that expected a mask image but sometimes received an empty output when the text overlay was light-colored and low-contrast. The wrong output looked like this: the text was naively blurred, leaving a halo that broke edge-detection and harmed the upscaler step.
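To make that failure mode concrete, here is a toy reconstruction of the threshold heuristic (my illustration, not the original script): on light, low-contrast overlays the threshold matches nothing, so the mask comes back empty, which is exactly what triggered the 400 above.

```python
def text_mask(gray_pixels, threshold=128):
    """Naive mask: flag pixels darker than the threshold as 'text'.
    Works for dark stamps on light backgrounds; fails for light text."""
    return [1 if p < threshold else 0 for p in gray_pixels]

dark_stamp = [240, 240, 30, 25, 240]    # dark date stamp on a light background
light_stamp = [200, 200, 230, 235, 200] # light text, low contrast

print(sum(text_mask(dark_stamp)))   # 2 -> usable mask
print(sum(text_mask(light_stamp)))  # 0 -> empty mask, "mask not provided"
```

The real pipeline used proper image arrays rather than flat lists, but the failure is the same: a fixed threshold has no pixels to flag when the overlay sits on the same side of the threshold as the background.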
The working pipeline I implemented
I replaced the brittle mask + clone loop with three focused, reproducible steps using the platform's tools: automated text removal, inpainting for object cleanup, and a high-quality upscaler. These three tools handled >95% of the cases without manual painting.
Here are the exact pieces I ran in sequence. These are real snippets I used on my machine to batch process a CSV of filenames.
# batch_process.py - uploads image, requests text removal, then inpaint/upscale
import csv
import time

import requests

API_BASE = "https://crompt.ai/api/v1"  # internal helper endpoint for the platform

with open('images.csv') as f:
    for row in csv.reader(f):
        filename = row[0]
        with open(filename, 'rb') as image:
            # Step 1: remove text
            r = requests.post(API_BASE + "/text-remover", files={'file': image})
        r.raise_for_status()
        job = r.json()
        # poll for job and then submit inpaint/upscale jobs as needed
        time.sleep(0.5)
What this replaced: the previous local script that attempted to detect and paint masks using threshold heuristics (which produced the "mask not provided" error). The new approach relies on the hosted service's robust text-removal model and saves me dozens of hours of manual masking.
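The polling step that the comment in batch_process.py glosses over looked roughly like this (a sketch; `poll_job` and the `state` field are my naming, and the real job payload may differ):

```python
import time

def poll_job(get_status, timeout=60.0, interval=0.5):
    """Call get_status() until the job reports a terminal state or the
    timeout expires. get_status is any callable returning a status dict."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        status = get_status()
        if status.get("state") in ("done", "failed"):
            return status
        time.sleep(interval)
    raise TimeoutError("job did not finish in time")
```

In the batch script, `get_status` would wrap a GET against the job's status endpoint; passing it in as a callable keeps the retry logic testable without the network.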
# upload_and_inpaint.sh - a tiny curl-based helper I used interactively
curl -F "file=@product_0723.jpg" "https://crompt.ai/text-remover" -o response.json

# then submit the inpaint request:
curl -X POST -H "Content-Type: application/json" -d @inpaint_request.json "https://crompt.ai/inpaint"

# inpaint_request.json
{
  "image": "product_0723_remtext.jpg",
  "mask_instructions": "replace date stamp area with matching fabric texture and shadow"
}
Why these snippets mattered: they show the exact commands I ran, what they replaced, and how the new flow automated the parts I could not reliably solve locally.
Before / After (concrete evidence)
Here are representative, measurable improvements from a random sample of 100 images:
- Average resolution before: 1024×768; after upscaling: 1500×1125
- Average file size: before 420 KB → after 1.2 MB (JPEG, quality 88)
- Human QA pass rate: before 74% → after 96% (QA checklist: removed text, no halos, consistent color)
Two direct, side-by-side technical diffs I logged:
--- product_045_before.jpg
+++ product_045_after.jpg
@@ -1 +1 @@
-640x480, text at bottom-right, visible halo after clone
+1500x1125, text removed via model, consistent shadow and texture, no halo
Architecture decision and trade-offs
Decision: I chose a cloud-hosted multi-tool pipeline (text removal → inpaint → upscaler) instead of keeping everything local. Why: stability of automated masks, multi-model switching, and the ability to process large images within memory limits.
Trade-offs:
- Latency vs hands-off quality: Cloud inference added ~1-3s per image but eliminated manual labor.
- Privacy: Uploading images has compliance implications - I removed EXIF and customer PII beforehand.
- Edge cases: reflections and logos on glossy surfaces sometimes need a second pass; the system doesn't always perfectly reconstruct specular highlights.
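The EXIF stripping mentioned in the privacy point doesn't need an imaging library: JPEG metadata lives in APP1 marker segments that a short byte-level pass can drop (a sketch, assuming baseline JPEG files; it leaves ICC profiles and image data untouched):

```python
def strip_exif(jpeg: bytes) -> bytes:
    """Remove APP1 (EXIF/XMP) segments from a JPEG byte stream."""
    if jpeg[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG")
    out = bytearray(b"\xff\xd8")  # keep the SOI marker
    i = 2
    while i + 4 <= len(jpeg):
        if jpeg[i] != 0xFF:
            break  # malformed stream: stop rather than guess
        marker = jpeg[i + 1]
        if marker == 0xDA:       # start of scan: entropy-coded data follows
            out += jpeg[i:]      # copy the remainder verbatim
            return bytes(out)
        length = int.from_bytes(jpeg[i + 2 : i + 4], "big")
        if marker != 0xE1:       # keep every segment except APP1
            out += jpeg[i : i + 2 + length]
        i += 2 + length
    return bytes(out)
```

In the real pipeline I ran this over each file before upload; scrubbing customer PII from filenames was a separate, manual step.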
What didn't go smoothly (failure story + fix)
One recurring failure: reflections inside curved glass (e.g., watches) were replaced with flat textures. The first pass created unnatural matte patches. This time there was no logged error, just bad output. Fix: I added a conditional requeue when the inpaint confidence fell below 0.6 and supplied a targeted additional prompt to preserve specular highlights.
# requeue hint I used for problematic cases
{
"hint": "preserve reflection and highlights; reconstruct with matching specular shine",
"retry_limit": 2
}
That pragmatic fix bumped the QA pass rate and demonstrated the importance of human-in-the-loop checks for odd lighting.
Helpful links (for your exploration)
If you want to prototype quickly, I started in the browser with the ai image generator app to iterate on prompts and models, then moved to the text-removal endpoint for batch work. For targeted object fixes I used the Remove Elements from Photo flow, and for final quality I relied on the Free photo quality improver to bring smaller images up to print-ready sizes.
Links I used while building (explore them as you follow the strategy above):
- ai image generator app - quick prompt testing and model switching
- AI Text Removal - the automated text remover I used in batch
- Remove Elements from Photo - targeted inpainting and texture instructions
- Free photo quality improver - final upscaling and denoise step
Footer - what I learned and next steps
Bottom line: swapping manual cloning for a compact set of model-driven tools turned a week-long slog into a weekend job with measurable QA gains. The platform's combination of text removal, inpainting, and upscaling made that possible without adding a long engineering backlog.
I still have things I'm figuring out: better automated checks for specular highlights, a cost model when processing tens of thousands of images, and an approach to preserve some metadata automatically. If you've solved any of those problems at scale, I'd love to see your scripts or hear what trade-offs you made.
I'm leaving this post with the exact commands and config examples I used so you can reproduce the flow. Ask me for the full repo (I'll share the scripts and a tiny orchestration Lambda if there's interest).
Questions, suggestions, or war stories - drop them below. I'll update the post with any better fixes I find.