Building an Automated AI Image Pipeline with MCP and Gemini

#mcp #ai #python #opensource

Running newsletters is a content treadmill. Articles need images. Images need prompts. Prompts need refinement. And somewhere in that loop, hours disappear.

I built gemini-image-mcp to solve this. It is an open-source MCP (Model Context Protocol) server that automates AI image generation, from prompt to published image. Here's how it works and why I built it.

The Problem

My workflow looked like this:

1. Write an article
2. Think of one or more image concepts
3. Manually prompt an image generator (including reference images)
4. Download the image
5. Convert to WebP for web performance
6. Upload to WordPress
7. Repeat for every article

Steps 2-6 were eating my time. I needed automation that fit inside my existing Claude Desktop workflow without jumping between tools.

Enter MCP

MCP is Anthropic's open standard for connecting AI models to external tools. Instead of copy-pasting between apps, MCP lets Claude call tools directly, such as an image generator, without leaving the conversation.

This meant I could build a server that Claude could talk to directly. No browser switching. No manual downloads. Just a conversation.
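
To make "Claude calls a tool" concrete, here is a minimal sketch of the message shape involved. MCP is built on JSON-RPC 2.0, and a tool invocation is a `tools/call` request carrying the tool name and its arguments. The tool name `generate_image`, its `prompt` argument, and the handler below are hypothetical stand-ins for illustration, not the actual interface of gemini-image-mcp.

```python
import json

# Hypothetical tool: a stand-in for a real image generation call.
def generate_image(prompt: str) -> str:
    return f"generated image for: {prompt}"

# Hypothetical registry mapping tool names to plain Python callables.
TOOLS = {"generate_image": generate_image}

def handle_request(raw: str) -> str:
    """Dispatch one JSON-RPC 2.0 `tools/call` request to a registered tool."""
    request = json.loads(raw)
    if request.get("method") != "tools/call":
        error = {"code": -32601, "message": "Method not found"}
        return json.dumps({"jsonrpc": "2.0", "id": request.get("id"), "error": error})
    params = request["params"]
    text = TOOLS[params["name"]](**params["arguments"])
    # Tool results carry a list of content items; text is the simplest kind.
    result = {"content": [{"type": "text", "text": text}]}
    return json.dumps({"jsonrpc": "2.0", "id": request["id"], "result": result})

# Roughly what an MCP client sends when the model decides to use a tool.
call = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {"name": "generate_image", "arguments": {"prompt": "abstract AI header"}},
}
response = json.loads(handle_request(json.dumps(call)))
print(response["result"]["content"][0]["text"])
```

In practice an MCP SDK handles the transport, tool listing, and error cases; the sketch only shows why no copy-pasting is needed once a tool is registered.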

The Architecture

The server is Python-based with three core modules:

gemini_image_server.py - Main MCP server handling tool calls
batch_manager.py - Queue management for batch operations
batch_generate.py - Batch image generation with rate limiting

Two Quality Tiers

Not every image needs to be 4K. I built in two tiers:

Pro Mode - Gemini 3 Pro Image Preview

Up to 4K resolution (defaults to 2K)
Supports up to 14 reference images
Better text rendering
Best for final production images

Fast Mode - Gemini 2.5 Flash Image

1K resolution
Faster generation
Great for iterations and testing
Significantly cheaper

This lets me prototype quickly in Fast mode and switch to Pro for final versions.
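
One way to model the two tiers is a small lookup table plus a resolution check. This is a sketch under assumptions: the model identifier strings are my guess at the API names for the two models named above, and the `Tier` class, `TIERS` table, and `resolve_resolution` helper are hypothetical, not code from the repository.

```python
from dataclasses import dataclass
from typing import Optional

RESOLUTIONS = ["1K", "2K", "4K"]  # ordered smallest to largest

@dataclass(frozen=True)
class Tier:
    model: str                            # assumed identifier; check Google's model list
    max_resolution: str
    default_resolution: str
    max_reference_images: Optional[int]   # None means "not stated in this post"

TIERS = {
    "pro": Tier("gemini-3-pro-image-preview", "4K", "2K", 14),
    "fast": Tier("gemini-2.5-flash-image", "1K", "1K", None),
}

def resolve_resolution(tier: Tier, requested: Optional[str] = None) -> str:
    """Return the resolution to use, rejecting requests above the tier's limit."""
    if requested is None:
        return tier.default_resolution
    if requested not in RESOLUTIONS:
        raise ValueError(f"unknown resolution: {requested}")
    if RESOLUTIONS.index(requested) > RESOLUTIONS.index(tier.max_resolution):
        raise ValueError(f"{requested} exceeds {tier.max_resolution} for {tier.model}")
    return requested

print(resolve_resolution(TIERS["fast"]))       # cheap iteration
print(resolve_resolution(TIERS["pro"], "4K"))  # final production image
```

Keeping the limits in data rather than in branching code means a new tier is a one-line change.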

The Batch System

This is where the real value comes in. Instead of generating images one at a time, the batch system:

Queues multiple image prompts
Lets you review the queue before generating
Generates all images in one run with rate limiting
Cuts API costs roughly in half

# Queue images
add_to_batch("Newsletter header - abstract AI visualization")
add_to_batch("Feature image - quantum computing concept")
add_to_batch("Sidebar graphic - robotics innovation")

# Review queue
view_batch_queue()

# Generate all at once
run_batch()
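
Under the hood, a queue with rate limiting can be quite small. The following is a minimal sketch of the idea, not the code in batch_manager.py: the `BatchQueue` class, its method names, and the `min_interval_s` parameter are hypothetical, and it does not model the cost savings mentioned above.

```python
import time
from typing import Callable, List

class BatchQueue:
    """Collect prompts, then generate them in one run with spacing between calls."""

    def __init__(self, min_interval_s: float = 2.0,
                 sleep: Callable[[float], None] = time.sleep,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.min_interval_s = min_interval_s
        self._sleep = sleep      # injectable so the limiter can be tested
        self._clock = clock
        self._prompts: List[str] = []

    def add(self, prompt: str) -> int:
        """Queue a prompt and return the new queue length."""
        self._prompts.append(prompt)
        return len(self._prompts)

    def view(self) -> List[str]:
        """Return a copy of the queue for review before generating."""
        return list(self._prompts)

    def run(self, generate: Callable[[str], str]) -> List[str]:
        """Call `generate` for each prompt, waiting between consecutive calls."""
        results = []
        last_call = None
        for prompt in self._prompts:
            if last_call is not None:
                wait = self.min_interval_s - (self._clock() - last_call)
                if wait > 0:
                    self._sleep(wait)
            last_call = self._clock()
            results.append(generate(prompt))
        self._prompts.clear()
        return results

queue = BatchQueue(min_interval_s=0.01)
queue.add("Newsletter header - abstract AI visualization")
queue.add("Feature image - quantum computing concept")
print(queue.view())
print(queue.run(lambda prompt: f"image for {prompt!r}"))
```

The review step is the reason for separating `add` from `run`: nothing is sent to the API until the queue has been inspected.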

WebP Conversion

Web performance matters. The server automatically converts generated PNGs to WebP using Pillow, which shrinks file sizes significantly with little visible quality loss. Smaller files also reduce disk usage on the server.
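
The conversion itself is only a few lines with Pillow. Here is a minimal sketch; the function names and the `quality=85` default are my own assumptions, not the server's actual settings.

```python
from pathlib import Path

def webp_path_for(png_path: str) -> Path:
    """Return the output path: same location and name, with a .webp suffix."""
    return Path(png_path).with_suffix(".webp")

def convert_to_webp(png_path: str, quality: int = 85) -> Path:
    """Convert a PNG to lossy WebP and return the new file's path."""
    # Imported lazily so webp_path_for stays usable without Pillow installed.
    from PIL import Image

    out = webp_path_for(png_path)
    with Image.open(png_path) as img:
        # `quality` runs from 0 to 100; lower values give smaller files.
        img.save(out, "WEBP", quality=quality)
    return out

print(webp_path_for("images/header.png"))
```

Calling `convert_to_webp("images/header.png")` would write `images/header.webp` next to the original, assuming Pillow is installed with WebP support.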

WordPress Integration

The final piece is direct upload to your WordPress media library via the REST API. No manual uploads, no resizing, no file management. Generated, converted, and published in one workflow.
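
For reference, uploading a file through the WordPress REST API means POSTing the raw bytes to the `/wp-json/wp/v2/media` endpoint with a `Content-Disposition` header naming the file. The sketch below assumes authentication with a WordPress application password; the function names and parameters are hypothetical and may not match how gemini-image-mcp does it.

```python
from pathlib import Path

def build_media_upload(site_url: str, file_path: str, data: bytes,
                       username: str, app_password: str) -> dict:
    """Build keyword arguments for requests.post() to upload a WebP image."""
    filename = Path(file_path).name
    return {
        "url": f"{site_url.rstrip('/')}/wp-json/wp/v2/media",
        "headers": {
            "Content-Disposition": f'attachment; filename="{filename}"',
            "Content-Type": "image/webp",
        },
        "data": data,
        "auth": (username, app_password),  # HTTP Basic auth
        "timeout": 30,
    }

def upload_webp(site_url: str, file_path: str,
                username: str, app_password: str) -> str:
    """Upload the file and return the public URL WordPress assigns to it."""
    import requests  # third-party; the only dependency in the install step

    data = Path(file_path).read_bytes()
    kwargs = build_media_upload(site_url, file_path, data, username, app_password)
    response = requests.post(**kwargs)
    response.raise_for_status()
    return response.json()["source_url"]

# Inspect the request without sending anything (example.com is a placeholder).
preview = build_media_upload("https://example.com/", "out/header.webp", b"", "editor", "app-pw")
print(preview["url"])
```

Splitting request construction from sending makes the upload easy to check without touching a live site.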

Getting Started

git clone https://github.com/PeeperFrog/gemini-image-mcp.git
cd gemini-image-mcp
cp config.json.example config.json
cp .env.example .env
pip install requests

Add your Gemini API key to .env, update paths in config.json, and add it to your MCP client. That's it.
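
If your MCP client is Claude Desktop, "adding it" means creating an entry under `mcpServers` in `claude_desktop_config.json`. The snippet below prints such an entry. The server name `gemini-image`, the `python` command, and the script path are placeholders I made up, so check the repository's README for the exact values.

```python
import json

def mcp_server_entry(name: str, command: str, server_script: str) -> dict:
    """Build an `mcpServers` entry that launches the server as a subprocess."""
    return {"mcpServers": {name: {"command": command, "args": [server_script]}}}

# Placeholder values: replace the path with wherever you cloned the repo.
entry = mcp_server_entry(
    "gemini-image",
    "python",
    "/path/to/gemini-image-mcp/gemini_image_server.py",
)
print(json.dumps(entry, indent=2))
```

Merge the printed object into your existing config file instead of overwriting it, then restart the client so it picks up the new server.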

Real-World Results

Running this for two newsletters, I've cut my image production time from roughly 30 minutes per article down to under 5 minutes. The batch system keeps API costs manageable, and the WordPress integration means published images appear in my media library automatically.

What's Next

Currently working on:

Expanded reference image library for workflows
Additional CMS integrations
More granular quality controls

Wrap Up

MCP is still early, but it's already changing how developers build AI workflows. gemini-image-mcp is one example of what's possible when you connect the dots between AI models and real-world tools.

Free, open-source, MIT licensed.

🔗 GitHub: PeeperFrog/gemini-image-mcp