Heather Scott (PeeperFrog)

Building an Automated AI Image Pipeline with MCP and Gemini

Running newsletters is a content treadmill. Articles need images. Images need prompts. Prompts need refinement. And somewhere in that loop, hours disappear.

I built gemini-image-mcp to solve this - an open-source MCP (Model Context Protocol) server that automates AI image generation from prompt to published image. Here's how it works and why I built it.

The Problem

My workflow looked like this:

  1. Write an article
  2. Come up with one or more image concepts
  3. Manually prompt an image generator (including reference images)
  4. Download the image
  5. Convert to WebP for web performance
  6. Upload to WordPress
  7. Repeat for every article

Steps 2-6 were eating my time. I needed automation that fit inside my existing Claude Desktop workflow without jumping between tools.

Enter MCP

MCP (Model Context Protocol) is Anthropic's open standard for connecting AI models to external tools. Instead of copy-pasting between apps, MCP lets Claude directly call tools - like an image generator - without leaving the conversation.

This meant I could build a server that Claude could talk to directly. No browser switching. No manual downloads. Just a conversation.
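
To make that concrete, here is a minimal sketch of the kind of message involved. MCP is built on JSON-RPC 2.0, and a client invokes a tool with a `tools/call` request that names the tool and passes its arguments. The tool name `generate_image`, its `prompt` argument, and the dispatcher below are hypothetical stand-ins for illustration, not the actual gemini-image-mcp interface.

```python
import json


def generate_image(prompt: str) -> str:
    """Hypothetical tool: a real server would call the image API here."""
    return f"(stub) would generate an image for: {prompt}"


# Hypothetical registry mapping tool names to plain Python functions.
TOOLS = {"generate_image": generate_image}


def _error(request_id, code: int, message: str) -> str:
    """Build a JSON-RPC 2.0 error response."""
    return json.dumps({"jsonrpc": "2.0", "id": request_id,
                       "error": {"code": code, "message": message}})


def handle_request(raw: str) -> str:
    """Dispatch one JSON-RPC 2.0 request and return the JSON response."""
    request = json.loads(raw)
    request_id = request.get("id")
    if request.get("method") != "tools/call":
        return _error(request_id, -32601, "Method not found")
    params = request.get("params", {})
    tool = TOOLS.get(params.get("name"))
    if tool is None:
        return _error(request_id, -32602, f"Unknown tool: {params.get('name')}")
    text = tool(**params.get("arguments", {}))
    # Tool results come back as a list of content items.
    result = {"content": [{"type": "text", "text": text}]}
    return json.dumps({"jsonrpc": "2.0", "id": request_id, "result": result})


# The request a client would send when the model decides to use the tool.
call = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {"name": "generate_image",
               "arguments": {"prompt": "abstract AI visualization"}},
}
response = json.loads(handle_request(json.dumps(call)))
print(response["result"]["content"][0]["text"])
```

In practice an MCP SDK handles this framing for you; the point is only that "Claude calls a tool" comes down to a small structured request and a structured reply.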

The Architecture

[Figure: architecture diagram of the gemini-image-mcp workflow. Claude Code connects via MCP to the gemini-image-mcp server, which calls the Gemini API to generate images. Generated images then follow two paths: the WebP Converter produces optimized images, and WordPress Upload delivers them to the Media Library.]

The server is Python-based with three core modules:

  • gemini_image_server.py - Main MCP server handling tool calls
  • batch_manager.py - Queue management for batch operations
  • batch_generate.py - Batch image generation with rate limiting

Two Quality Tiers

Not every image needs to be 4K. I built in two tiers:

Pro Mode - Gemini 3 Pro Image Preview

  • Up to 4K resolution (defaults to 2K)
  • Supports up to 14 reference images
  • Better text rendering
  • Best for final production images

Fast Mode - Gemini 2.5 Flash Image

  • 1K resolution
  • Faster generation
  • Great for iterations and testing
  • Significantly cheaper

This lets me prototype quickly in Fast mode and switch to Pro for final versions.
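
One way to picture the two tiers is as a small lookup table that turns a tier name into request settings. This is a sketch under assumptions: the model identifier strings, the `resolve` helper, and the field names are mine, and the reference-image limit for Fast mode is left unset because it is not stated above.

```python
from dataclasses import dataclass
from typing import Optional

SIZES = ["1K", "2K", "4K"]  # ordered smallest to largest


@dataclass(frozen=True)
class Tier:
    model: str                           # assumed API model identifier
    default_size: str
    max_size: str
    max_reference_images: Optional[int]  # None = not stated in this article


TIERS = {
    "pro": Tier("gemini-3-pro-image-preview", "2K", "4K", 14),
    "fast": Tier("gemini-2.5-flash-image", "1K", "1K", None),
}


def resolve(tier_name: str, size: Optional[str] = None,
            reference_count: int = 0) -> dict:
    """Validate a request against a tier and return the settings to use."""
    tier = TIERS[tier_name]
    size = size or tier.default_size
    if SIZES.index(size) > SIZES.index(tier.max_size):
        raise ValueError(f"{tier_name} tier supports up to {tier.max_size}")
    limit = tier.max_reference_images
    if limit is not None and reference_count > limit:
        raise ValueError(f"{tier_name} tier accepts at most {limit} references")
    return {"model": tier.model, "size": size}


draft = resolve("fast")                               # quick iteration
final = resolve("pro", size="4K", reference_count=3)  # production image
print(draft, final)
```

Keeping the limits in one table means a request that asks too much of a tier fails before any API call is made.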

The Batch System

This is where the real value comes in. Instead of generating images one at a time, the batch system:

  1. Queues multiple image prompts
  2. Lets you review the queue before generating
  3. Generates all images in one run with rate limiting
  4. Cuts API costs roughly in half

# Queue images
add_to_batch("Newsletter header - abstract AI visualization")
add_to_batch("Feature image - quantum computing concept")
add_to_batch("Sidebar graphic - robotics innovation")

# Review queue
view_batch_queue()

# Generate all at once
run_batch()
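
The queue-review-run pattern with rate limiting can be sketched in a few lines. The `BatchQueue` class, its method names, and the one-second default interval are illustrative assumptions rather than the code in batch_manager.py, and the image generator is passed in as a plain function so the example runs without an API key. It does not model the cost saving, which depends on how the API bills batch requests.

```python
import time
from typing import Callable, List


class BatchQueue:
    """Collect prompts, then generate them in one rate-limited run."""

    def __init__(self, generate: Callable[[str], str], min_interval: float = 1.0,
                 sleep: Callable[[float], None] = time.sleep,
                 clock: Callable[[], float] = time.monotonic):
        self._generate = generate
        self._min_interval = min_interval  # minimum seconds between calls
        self._sleep = sleep
        self._clock = clock
        self._prompts: List[str] = []

    def add(self, prompt: str) -> int:
        """Queue a prompt and return the new queue length."""
        self._prompts.append(prompt)
        return len(self._prompts)

    def view(self) -> List[str]:
        """Return a copy of the queue for review."""
        return list(self._prompts)

    def run(self) -> List[str]:
        """Generate every queued prompt in order, then empty the queue."""
        results = []
        last_call = None
        for prompt in self._prompts:
            if last_call is not None:
                # Wait out whatever remains of the interval since the last call.
                wait = self._min_interval - (self._clock() - last_call)
                if wait > 0:
                    self._sleep(wait)
            last_call = self._clock()
            results.append(self._generate(prompt))
        self._prompts.clear()
        return results


# Stub generator and zero interval so the demo finishes instantly.
queue = BatchQueue(generate=lambda p: f"image for '{p}'", min_interval=0.0)
queue.add("Newsletter header - abstract AI visualization")
queue.add("Feature image - quantum computing concept")
print(queue.view())
results = queue.run()
print(results)
```

Injecting `sleep` and `clock` is a testing convenience: the pacing logic can be checked without actually waiting.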

WebP Conversion

Web performance matters. The server automatically converts generated PNGs to WebP using Pillow, which significantly reduces file size without visible quality loss. Smaller files also take up less disk space on the server.
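
A conversion step along these lines takes only a few lines with Pillow. This is a minimal sketch, not the server's actual code: the `convert_to_webp` name and the quality setting of 85 are assumptions, and the demo builds a throwaway PNG so it runs without any real files.

```python
import tempfile
from pathlib import Path

from PIL import Image  # Pillow


def convert_to_webp(png_path: Path, quality: int = 85) -> Path:
    """Write a lossy WebP copy next to the PNG and return its path."""
    webp_path = png_path.with_suffix(".webp")
    with Image.open(png_path) as img:
        # quality=85 is an assumed default; tune it per image type.
        img.save(webp_path, "WEBP", quality=quality)
    return webp_path


# Demo on a generated image inside a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    png = Path(tmp) / "header.png"
    Image.new("RGB", (256, 256), "navy").save(png)
    webp = convert_to_webp(png)
    with Image.open(webp) as converted:
        converted_format = converted.format
    print(webp.name, converted_format,
          png.stat().st_size, "bytes ->", webp.stat().st_size, "bytes")
```

How much space you save depends heavily on the image, so it is worth printing both sizes, as the demo does, when choosing a quality setting.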

WordPress Integration

The final piece - direct upload to your WordPress media library via the WordPress REST API. No manual uploads, no resizing, no file management. Generated, converted, and published in one workflow.
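
For readers who have not used the WordPress REST API, an upload is a single authenticated POST of the raw file bytes to the `/wp-json/wp/v2/media` endpoint. The sketch below uses only the standard library and assumes authentication with a WordPress application password; the function names, site URL, and credentials are placeholders, and the project itself may structure this differently (for example with `requests`).

```python
import base64
import json
import urllib.request


def build_media_upload(site_url: str, username: str, app_password: str,
                       filename: str, data: bytes,
                       mime_type: str = "image/webp") -> urllib.request.Request:
    """Build (but do not send) a media upload request."""
    token = base64.b64encode(f"{username}:{app_password}".encode()).decode()
    return urllib.request.Request(
        url=f"{site_url.rstrip('/')}/wp-json/wp/v2/media",
        data=data,
        method="POST",
        headers={
            "Authorization": f"Basic {token}",
            "Content-Type": mime_type,
            # WordPress takes the stored file name from this header.
            "Content-Disposition": f'attachment; filename="{filename}"',
        },
    )


def send(request: urllib.request.Request) -> dict:
    """Send the request and return the created media item as a dict."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)


# Placeholder values; nothing is sent over the network here.
request = build_media_upload("https://example.com/", "editor", "app-password",
                             "header.webp", b"fake-webp-bytes")
print(request.get_method(), request.full_url)
```

On success WordPress responds with a JSON description of the new media item, including its `id` and `source_url`, which is what you would reference from a post.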

Getting Started

git clone https://github.com/PeeperFrog/gemini-image-mcp.git
cd gemini-image-mcp
cp config.json.example config.json
cp .env.example .env
pip install requests

Add your Gemini API key to .env, update paths in config.json, and add it to your MCP client. That's it.
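
If your MCP client is Claude Desktop, "adding it" means putting an entry under the `mcpServers` key of its `claude_desktop_config.json`. The snippet below generates such an entry as a rough guide. The `gemini-image` label, the `python3` command, the placeholder repository path, and the assumption that gemini_image_server.py sits at the repository root are all mine; check the repository's README for the exact values.

```python
import json
from pathlib import PurePosixPath


def mcp_server_entry(repo_dir: str, python: str = "python3") -> dict:
    """Build an mcpServers entry pointing at the server script."""
    script = PurePosixPath(repo_dir) / "gemini_image_server.py"  # assumed location
    return {
        "mcpServers": {
            "gemini-image": {  # arbitrary label shown in the client
                "command": python,
                "args": [str(script)],
            }
        }
    }


# Placeholder path: replace with wherever you cloned the repository.
entry = mcp_server_entry("/path/to/gemini-image-mcp")
print(json.dumps(entry, indent=2))
```

Merge the printed block into any existing `mcpServers` section rather than overwriting the file, then restart the client so it picks up the new server.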

Real-World Results

Running this for two newsletters, I've cut my image production time from roughly 30 minutes per article down to under 5 minutes. The batch system keeps API costs manageable, and the WordPress integration means published images appear in my media library automatically.

What's Next

Currently working on:

  • Expanded reference image library for workflows
  • Additional CMS integrations
  • More granular quality controls

Wrap Up

MCP is still early, but it's already changing how developers build AI workflows. gemini-image-mcp is one example of what's possible when you connect the dots between AI models and real-world tools.

Free, open-source, MIT licensed.

🔗 GitHub: PeeperFrog/gemini-image-mcp
