DEV Community

Fred Santos

Build a Complete AI Image Pipeline in 10 Lines of Python

Generate, remove background, resize, and extract text from images — all with a single API and no local GPU required.

The Problem: Too Many Image APIs

Every AI project that touches images ends up with the same sprawl:

  • Replicate or FAL for generation
  • remove.bg for background removal
  • Cloudinary or Imgix for resizing and transforms
  • Google Vision or AWS Textract for OCR

Four separate APIs. Four billing dashboards. Four SDKs. Four sets of rate limits. And you're not even done — you still need to stitch them together with download/upload loops between each step.

There's a better way.

IteraTools: One API, Entire Image Pipeline

IteraTools is a pay-per-use API that covers the whole image pipeline with one set of endpoints:

| Tool | Endpoint | Price |
|---|---|---|
| Generate image (Flux 1.1 Pro) | POST /image/generate | $0.005 |
| Generate image fast (Flux Schnell) | POST /image/fast | $0.002 |
| Remove background | POST /image/rembg | $0.003 |
| Resize / crop | POST /image/resize | $0.001 |
| Extract text (OCR) | POST /image/ocr | $0.002 |

One API key. One billing account. Costs only what you use.
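
To see what a run costs before writing any requests, the price table can be turned into a small lookup. This is a sketch that assumes each listed price is charged once per call; the `PRICES` dict and `pipeline_cost` helper are illustrative names, not part of the API.

```python
from decimal import Decimal

# Prices in USD copied from the table above (assumed to be per call).
PRICES = {
    "/image/generate": Decimal("0.005"),
    "/image/fast": Decimal("0.002"),
    "/image/rembg": Decimal("0.003"),
    "/image/resize": Decimal("0.001"),
    "/image/ocr": Decimal("0.002"),
}

def pipeline_cost(endpoints):
    """Sum the listed price of each endpoint, called once each."""
    return sum((PRICES[e] for e in endpoints), Decimal("0"))

# Generate, remove background, resize, OCR: once each.
full = pipeline_cost(["/image/generate", "/image/rembg", "/image/resize", "/image/ocr"])
# Same run, but with the cheaper Flux Schnell endpoint for generation.
fast = pipeline_cost(["/image/fast", "/image/rembg", "/image/resize", "/image/ocr"])

print(full)  # 0.011
print(fast)  # 0.008
```

`Decimal` is used instead of floats so that sub-cent prices add up exactly.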

A Real Example: Product Image Pipeline

Say you're building an e-commerce agent that:

  1. Generates a product image from a text description
  2. Removes the background, so the product can be placed on the plain white background that marketplace listings expect
  3. Resizes to 800x800 for consistency
  4. Extracts any text from the image for tagging

Here's the full pipeline in Python:

import base64

import requests

API_KEY = "it-XXXX-XXXX-XXXX"
BASE = "https://api.iteratools.com"
HEADERS = {"Authorization": f"Bearer {API_KEY}"}

def save_b64_image(b64_data, filename="image.png"):
    """Decode a base64 image, write it to disk, and return the file path."""
    with open(filename, "wb") as f:
        f.write(base64.b64decode(b64_data))
    return filename

def call(endpoint, payload):
    """POST JSON to an endpoint and return the parsed response, raising on HTTP errors."""
    # The 120 s timeout is an arbitrary choice; tune it for your workload.
    resp = requests.post(f"{BASE}{endpoint}", headers=HEADERS, json=payload, timeout=120)
    resp.raise_for_status()  # fail here rather than with a KeyError further down
    return resp.json()

# Step 1: Generate the product image
print("🎨 Generating product image...")
gen = call("/image/generate", {
    "prompt": "minimalist white ceramic coffee mug on a clean surface, product photography",
    "width": 1024,
    "height": 1024
})
image_b64 = gen["image"]  # base64 PNG
print(f"✅ Generated: {gen['width']}x{gen['height']}px")

# Step 2: Remove background
print("✂️ Removing background...")
rembg = call("/image/rembg", {
    "image": image_b64
})
transparent_b64 = rembg["image"]
print("✅ Background removed")

# Step 3: Resize to 800x800
print("📐 Resizing to 800x800...")
resized = call("/image/resize", {
    "image": transparent_b64,
    "width": 800,
    "height": 800,
    "fit": "contain"  # letterbox, keeps aspect ratio
})
final_b64 = resized["image"]
print("✅ Resized")

# Step 4: OCR any visible text
print("🔍 Extracting text...")
ocr = call("/image/ocr", {
    "image": image_b64  # check original for text
})
extracted_text = ocr.get("text", "")
print(f"✅ Text found: '{extracted_text[:100]}'")

# Save final image
save_b64_image(final_b64, "product_final.png")
print("🎉 Pipeline complete! Saved as product_final.png")
print("💰 Total cost: ~$0.011 USDC")

That's 10 lines of real logic (excluding helpers and prints). Total cost: $0.011.
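
Because each step hands a base64 image to the next, it can help to check the final result locally before publishing it. The sketch below reads the width and height straight from the PNG header using only the standard library. It assumes the resize endpoint returns a base64-encoded PNG, as the generate step is documented to do in the code above; `png_size` is a hypothetical helper name.

```python
import base64
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"

def png_size(b64_data: str) -> tuple[int, int]:
    """Return (width, height) read from the IHDR chunk of a base64-encoded PNG."""
    raw = base64.b64decode(b64_data)
    # A PNG starts with an 8-byte signature, then the IHDR chunk:
    # 4-byte length, 4-byte type, then width and height as big-endian uint32.
    if len(raw) < 24 or raw[:8] != PNG_SIGNATURE or raw[12:16] != b"IHDR":
        raise ValueError("not a PNG image")
    return struct.unpack(">II", raw[16:24])

# Demo with a hand-built PNG header (no pixel data needed for this check).
header = PNG_SIGNATURE + struct.pack(">I", 13) + b"IHDR" + struct.pack(">II", 800, 800)
sample = base64.b64encode(header).decode()
print(png_size(sample))  # (800, 800)
```

In the pipeline, `png_size(final_b64) == (800, 800)` would confirm that step 3 produced the expected canvas size.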

Works With MCP Too

If you're using Claude, Cursor, or any MCP-compatible agent, you can add all these tools with one install:

npx mcp-iteratools --api-key it-XXXX-XXXX-XXXX

Your agent can now generate, transform, and analyze images through natural language:

"Generate a product image of a leather wallet, remove the background, and tell me what text is visible on it."

An agent such as Claude can then chain the three tool calls (generate, remove background, OCR) on its own.
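
Many MCP clients, including Claude Desktop and Cursor, register servers through a JSON config file with an `mcpServers` map. The sketch below builds such an entry from the npx command shown above. It assumes that command starts a standard stdio MCP server; the `"iteratools"` label and the `mcp_server_entry` helper are arbitrary names chosen for illustration.

```python
import json

def mcp_server_entry(api_key: str) -> dict:
    """Build an `mcpServers` config entry that launches the server via npx."""
    return {
        "mcpServers": {
            "iteratools": {  # hypothetical label; any name works
                "command": "npx",
                "args": ["mcp-iteratools", "--api-key", api_key],
            }
        }
    }

# Print the JSON to paste into the client's MCP config file.
print(json.dumps(mcp_server_entry("it-XXXX-XXXX-XXXX"), indent=2))
```

Check your client's documentation for where its config file lives and whether it expects extra fields.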

Getting Started

  1. Get an API key (free, no credit card): POST https://api.iteratools.com/credits/keys
  2. Add credits via Stripe: at the prices listed above, $5 covers roughly 450 runs of the full four-step pipeline ($0.011 each)
  3. Start building: docs.iteratools.com

No account needed for the key — just an email. Pay only for what you use.
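
Step 1 can be sketched with the standard library alone. The endpoint URL comes from the list above, but the request body is a guess: the `"email"` field name is an assumption, so check docs.iteratools.com for the actual schema. The example only builds the request and leaves sending it to you.

```python
import json
import urllib.request

KEYS_URL = "https://api.iteratools.com/credits/keys"  # endpoint from step 1

def build_key_request(email: str) -> urllib.request.Request:
    """Prepare (but do not send) the POST request for a new API key."""
    # ASSUMPTION: the API expects a JSON body with an "email" field.
    body = json.dumps({"email": email}).encode("utf-8")
    return urllib.request.Request(
        KEYS_URL,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_key_request("you@example.com")
print(req.get_method(), req.full_url)
# To send it: urllib.request.urlopen(req, timeout=30), then parse the JSON response.
```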


Tags: python, ai, api, machinelearning, webdev
