Heather Scott (PeeperFrog)

Posted on Feb 4 • Edited on Feb 6

Claude Desktop Image Generation

#ai #mcp #opensource #python

PeeperFrog Create

An open-source MCP server with Claude Skills that automates AI image generation across Gemini, OpenAI, and Together AI - from prompt to WordPress

Content creation has a bottleneck problem. You need images. Lots of them. Each one requires choosing a provider, crafting prompts, managing costs, converting formats, and uploading to your CMS. What if your AI assistant could handle all of that, right on your desktop?

I built PeeperFrog Create to solve this - an open-source MCP (Model Context Protocol) server that gives Claude (by Anthropic) access to three major AI image providers with intelligent routing, batch workflows, and direct WordPress publishing. Here's how it works and why it matters.

The Content Creator's Dilemma

When I started running newsletters, I faced a recurring problem:

1. Write article ✅
2. Research image requirements
3. Choose provider (Gemini? DALL-E? FLUX?)
4. Craft a prompt for that provider's strengths
5. Generate image
6. Download, convert to WebP
7. Upload to WordPress
8. Add to the article

Steps 2-8 took longer than writing the actual content. Multiply this across multiple articles per week, and you're spending hours on image pipeline management instead of creating.

Enter PeeperFrog Create

PeeperFrog Create is an MCP server that connects Claude directly to AI image generation services. But unlike single-provider solutions, it offers:

Multi-provider support: Gemini Pro, OpenAI DALL-E, Together AI (FLUX models)
Auto mode: The MCP server picks the best provider based on your budget and needs
Full control mode: Claude picks the best provider and controls all the parameters
Batch workflows: Queue multiple images, review, and generate them in one run (cost-effective)
Cost estimation and tracking: Know the approximate price before generating, and log it afterward
Skills system: Teach Claude best practices for each provider
WebP conversion: Automatic optimization for web delivery
WordPress integration: Direct upload to media library

Here's what that looks like in practice:

The Architecture: Why MCP Changes Everything

Model Context Protocol is Anthropic's open standard for connecting AI models to external tools. Instead of context switching between applications, MCP lets Claude call tools directly within the conversation.

Traditional workflow:

You → Claude → Copy prompt → Open DALL-E → Generate → Download → Convert → Upload

With MCP:

You → Claude → [MCP handles generation, conversion, upload] → Done

The server acts as a bridge between Claude and image generation APIs:
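A minimal sketch of that bridge, not PeeperFrog Create's actual code. MCP messages are JSON-RPC 2.0, and a tool invocation arrives as a `tools/call` request. The `generate_image` tool name, its arguments, and the stubbed handler are assumptions made for illustration; a real server would call a provider API at that point.

```python
import json

# Hypothetical tool handler: a real server would call a provider API here.
def generate_image(prompt: str, provider: str = "auto") -> dict:
    return {"status": "queued", "prompt": prompt, "provider": provider}

# Registry mapping tool names to Python callables (names are assumptions).
TOOLS = {"generate_image": generate_image}

def handle_request(raw: str) -> str:
    """Dispatch one JSON-RPC 2.0 `tools/call` request to a registered tool."""
    request = json.loads(raw)
    request_id = request.get("id")
    if request.get("method") != "tools/call":
        error = {"code": -32601, "message": "Method not found"}
        return json.dumps({"jsonrpc": "2.0", "id": request_id, "error": error})
    params = request["params"]
    tool = TOOLS.get(params["name"])
    if tool is None:
        error = {"code": -32602, "message": f"Unknown tool: {params['name']}"}
        return json.dumps({"jsonrpc": "2.0", "id": request_id, "error": error})
    result = tool(**params.get("arguments", {}))
    # MCP tool results carry a list of content items; here a single text item.
    payload = {"content": [{"type": "text", "text": json.dumps(result)}]}
    return json.dumps({"jsonrpc": "2.0", "id": request_id, "result": payload})

request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {"name": "generate_image", "arguments": {"prompt": "robots playing poker"}},
}
response = json.loads(handle_request(json.dumps(request)))
print(response["result"]["content"][0]["text"])
```

The point of the pattern is that Claude only ever sees tool names and JSON arguments; provider keys, HTTP calls, and file handling stay on the server side.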

Auto Mode: The Secret Weapon

Here's where PeeperFrog Create gets interesting. Each provider has different strengths:

Gemini Pro: Best for complex compositions, reference images, search grounding
OpenAI DALL-E: Excellent photorealism, reliable text rendering
Together FLUX: Cost-effective, fast iterations, artistic styles

Instead of leaving the choice to you, auto mode analyzes your request:

Example conversation:

You: "Create a professional infographic showing AI cost trends using peeperfrog-create."

Claude: "I'll create a professional infographic showing AI cost trends for you. Let me generate this using the image generation system."

Generation details:

Provider: Gemini Pro
Resolution: 2K (16:9 aspect ratio)
Cost: $0.14 USD

You: "Convert it to WebP."

Conversion results:

✓ Original PNG: 3.1 MB
✓ WebP version: 319 KB
✓ 89.7% file size reduction
✓ Quality: 85 (high quality retained)

The actual image this prompt created:

Five auto modes cover every scenario:

cheapest (max $0.003/MP): Minimize cost — dreamshaper, flux1-schnell
budget (max $0.01/MP): Decent quality, low cost — hidream-fast, juggernaut-pro
balanced (max $0.04/MP): Production use, good quality/cost — seedream3, flux2-dev, flux2-pro, imagen4
quality (max $0.08/MP): Premium quality — ideogram3, imagen4-ultra, flux1-kontext-max
best (no limit): Maximum quality — Gemini Pro, OpenAI Pro
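The tiers above can be read as a simple lookup from price ceiling to candidate models. The sketch below copies the ceilings and model names from that list; the function name and the "first tier that fits" rule are assumptions for illustration, since how the server ranks models inside a tier is not described here.

```python
from typing import Dict, List, Optional, Tuple

# Price ceilings (USD per megapixel) and candidate models, copied from the
# tier list above. None means the tier has no price ceiling.
AUTO_MODES: Dict[str, Tuple[Optional[float], List[str]]] = {
    "cheapest": (0.003, ["dreamshaper", "flux1-schnell"]),
    "budget": (0.01, ["hidream-fast", "juggernaut-pro"]),
    "balanced": (0.04, ["seedream3", "flux2-dev", "flux2-pro", "imagen4"]),
    "quality": (0.08, ["ideogram3", "imagen4-ultra", "flux1-kontext-max"]),
    "best": (None, ["Gemini Pro", "OpenAI Pro"]),
}

def lowest_mode_for(price_per_mp: float) -> str:
    """Return the first tier whose ceiling covers a per-megapixel price."""
    for name, (ceiling, _models) in AUTO_MODES.items():
        if ceiling is not None and price_per_mp <= ceiling:
            return name
    return "best"  # no ceiling, so it accepts any price

for price in (0.0027, 0.025, 0.08, 0.2):
    mode = lowest_mode_for(price)
    print(f"${price}/MP fits '{mode}': {AUTO_MODES[mode][1]}")
```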

Real-World Example: Robot Poker Scene

Let me show you what reference images can do. I wanted to create a promotional image for this article showing different robot designs playing poker. Here are the reference robots:

These are real commercial humanoid robots: Boston Dynamics' Atlas, various research platforms, Unitree's H1, and others. Each was an individual photo. I wanted an image that maintained their distinct designs while composing them into a coherent scene.

Using Gemini Pro with reference images:

# In the conversation with Claude:
"Create an image of these five robots playing poker."

Result:

Claude wrote the prompt and selected the model based on the need for reference images. Gemini Pro analyzed all five robots, understood their proportions and aesthetics, and composed them into a coherent scene with proper lighting, atmosphere, and context. This is the power of reference images - you get consistency across generated content while maintaining specific design requirements.

Try that with prompt-only generation, and you'll spend hours iterating to get five consistent robot designs that feel like they belong in the same universe.

The Skills System: Teaching Claude Best Practices

MCP provides the tools. Skills teach Claude how to use them effectively.

Each skill is a markdown file that guides Claude through specific workflows:

Available Skills

Core Image Generation:

image-generation: Overview of all tools and workflows
image-auto-mode: When to use auto mode vs manual control
image-manual-control: Advanced provider-specific options
image-queue-management: Batch workflow best practices
cost-estimation: Budget planning and provider comparison

Publishing Pipeline:

webp-conversion: Web optimization strategies
wordpress-upload: CMS integration patterns

Creative Guidance:

graphic-prompt-types: Reference guide for visual styles
example-brand-image-guidelines: Template for brand consistency

Skills work in both Claude Desktop (GUI) and Claude Code (CLI). Once installed, Claude automatically applies relevant knowledge without you needing to remember command syntax or provider limitations.

Example: You ask for "five images for a newsletter about quantum computing." When using the skills together, Claude can:

Check image-queue-management skill for best practices
Add five prompts to the batch queue
Use cost-estimation to show the total cost before generating
Wait for your approval
Generate all images in one batch run (reducing cost)
Convert to WebP when you request optimization
Upload to WordPress when you request publishing

Batch Workflows: Production at Scale

The batch system transforms how you handle multiple images:

The Problem:
Generating five images individually means five separate API calls, five interruptions, five manual downloads, and five uploads. That's about 20 minutes of context switching, all at full price.

The Solution:

1. Queue all five images with prompts
2. Review queue and estimated costs
3. Generate all in one batch (Gemini cost cut in half)
4. Convert all to WebP
5. Upload all to WordPress in bulk

Time: Under 5 minutes. Cost: Optimized through provider selection and batch processing. Mental overhead: Minimal.
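The queue, review, and generate steps can be sketched as a small in-memory queue. This is an illustration under stated assumptions, not the project's implementation: the class and method names are hypothetical, the $0.134 price is the Gemini Pro 2K figure quoted in this article's pricing section, and `run()` is a placeholder that returns fake filenames instead of calling a provider.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class QueuedImage:
    prompt: str
    price_usd: float  # immediate (non-batch) price for one image

@dataclass
class ImageQueue:
    batch_discount: float = 0.5  # assumed 50% batch discount
    items: List[QueuedImage] = field(default_factory=list)

    def add(self, prompt: str, price_usd: float) -> None:
        self.items.append(QueuedImage(prompt, price_usd))

    def estimate(self, batched: bool = True) -> float:
        """Total estimated cost in USD, with or without the batch discount."""
        total = sum(item.price_usd for item in self.items)
        if batched:
            total *= 1 - self.batch_discount
        return round(total, 4)

    def run(self) -> List[str]:
        """Placeholder for generation: one fake filename per prompt, then empty the queue."""
        names = [f"image_{i + 1}.png" for i in range(len(self.items))]
        self.items.clear()
        return names

queue = ImageQueue()
for topic in ["hero", "timeline", "comparison", "social", "thumbnail"]:
    queue.add(f"{topic} image for a newsletter", price_usd=0.134)

immediate = queue.estimate(batched=False)
batched = queue.estimate(batched=True)
print(f"Immediate: ${immediate}, batched: ${batched}")
files = queue.run()  # only after the estimate has been reviewed
print(files)
```

Keeping the estimate separate from `run()` is what makes the review step possible: nothing is generated, and nothing is billed, until the queue has been approved.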

Real example from my workflow:

Newsletter: "The $0.50 Intelligence Revolution."
Images needed:
- Hero image: Cost decline chart (Gemini, text/infographic)
- Figure 1: Timeline diagram (Gemini, complex layout)
- Figure 2: Comparison table (OpenAI, clear text)
- Social: Square version (Together FLUX, artistic)
- Thumbnail: Simplified hero (Together FLUX, fast)

Total cost: ~$0.40-0.45
Generation time: ~3 minutes
Manual time saved: ~25 minutes

The WordPress Pipeline: From Prompt to Published

The final piece - direct publishing:

Process:

Generate image (PNG format from provider)
Convert to WebP (90-94% size reduction for web optimization)
Upload to WordPress via REST API
Title set from filename; alt text/caption added manually in WordPress
Return media ID for insertion in posts

Configuration:

{
  "wordpress": {
    "https://yourblog.com": {
      "username": "your-username",
      "password": "your-app-password"
    }
  }
}

One API call uploads your optimized images directly to your media library, ready to insert into posts.
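A rough sketch of what that upload call can look like using only the standard library. It builds, but does not send, a POST to WordPress's REST media endpoint (`/wp-json/wp/v2/media`) with Basic authentication from an application password. The function name, filename, and placeholder bytes are assumptions for illustration; the project itself may structure this differently.

```python
import base64
import urllib.request

def build_media_upload(site: str, username: str, app_password: str,
                       filename: str, data: bytes) -> urllib.request.Request:
    """Build (but do not send) a POST to the WordPress REST media endpoint."""
    token = base64.b64encode(f"{username}:{app_password}".encode()).decode()
    return urllib.request.Request(
        url=f"{site.rstrip('/')}/wp-json/wp/v2/media",
        data=data,
        method="POST",
        headers={
            "Authorization": f"Basic {token}",
            "Content-Type": "image/webp",
            # WordPress takes the uploaded file's name from this header.
            "Content-Disposition": f'attachment; filename="{filename}"',
        },
    )

request = build_media_upload(
    "https://yourblog.com", "your-username", "your-app-password",
    "hero.webp", b"placeholder bytes, not a real image",
)
print(request.get_method(), request.full_url)
# Sending it would be urllib.request.urlopen(request); the JSON response
# includes the new media item's "id".
```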

Cost Comparison: Why Multi-Provider Matters

The price gap between providers is massive - up to 400x between the cheapest and most expensive options. Batch processing can also save 50% on Gemini Pro images, the most expensive option:

Gemini Pro Image ($0.134-0.24/image):

Up to 14 reference images (unique capability)
Search grounding for factually accurate images
Thinking levels for quality control
Up to 4K resolution (4096×4096)
Pricing: $0.134 for 2K, $0.24 for 4K
Batch API available: 50% discount ($0.067 for 2K, $0.12 for 4K)

OpenAI DALL-E / gpt-image-1 ($0.01-0.17/image):

Three quality tiers: Low ($0.01), Medium ($0.04), High ($0.17)
Superior photorealism at medium/high tiers
Excellent text rendering
Consistent style generation
Industry-standard quality

Together AI FLUX ($0.0027-0.08/MP):

FLUX.1 Schnell: $0.0027/MP (fastest, most cost-effective)
FLUX.1 Dev: $0.025/MP (balanced quality/cost)
FLUX.1.1 Pro: $0.04/MP (premium quality)
FLUX.1 Kontext Max: $0.08/MP (highest quality with editing)
Fast iterations for creative exploration
Artistic and illustration strengths

Auto mode considers all these factors when routing your request. Budget constraints? Style requirements? Resolution needs? It handles the decision automatically.
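Because the Together AI prices are quoted per megapixel while the other providers are quoted per image, comparing them means converting resolution to cost. A small sketch, assuming 1 MP means 1,000,000 pixels (the exact billing definition is an assumption) and using per-megapixel prices from the list above:

```python
def image_cost(width: int, height: int, price_per_mp: float) -> float:
    """Estimate cost in USD, assuming 1 MP = 1,000,000 pixels."""
    megapixels = (width * height) / 1_000_000
    return megapixels * price_per_mp

# Per-megapixel prices (USD) from the Together AI list above.
PRICES = {
    "FLUX.1 Schnell": 0.0027,
    "FLUX.1 Dev": 0.025,
    "FLUX.1 Kontext Max": 0.08,
}

for model, price in PRICES.items():
    print(f"{model}: ${image_cost(1024, 1024, price):.4f} for 1024x1024")
```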

Real Production Metrics

Time Savings (from my workflow):

Old workflow: ~30 minutes per article (5 images avg)
New workflow: ~5 minutes per article
Savings: ~83% reduction in image production time
Your results may vary based on complexity and iteration needs

Cost Efficiency:
Newsletter with 5 diverse images using auto mode balanced:

Hero (OpenAI medium): $0.04
Infographic (ideogram3): $0.06
Photo (seedream3): $0.02
Social (flux1-schnell): $0.003
Thumbnail (dreamshaper): $0.001
Total: ~$0.124 via smart routing

Compare to uniform provider costs:

Same 5 images all Gemini Pro 2K immediate: $0.67 (5 × $0.134)
Same 5 images all Gemini Pro 2K batched: $0.335 (5 × $0.067)
Same 5 images all OpenAI medium: $0.20 (5 × $0.04)
Savings: about 38-81% depending on the baseline, through multi-provider optimization and batching
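The comparison can be reproduced with a few lines of arithmetic. The sketch below sums the five line items as listed and compares that total against each uniform-provider baseline; the variable and function names are only illustrative.

```python
# Line items (USD) from the routed example above.
routed = {
    "Hero": 0.04,
    "Infographic": 0.06,
    "Photo": 0.02,
    "Social": 0.003,
    "Thumbnail": 0.001,
}

# Uniform-provider baselines for the same five images.
baselines = {
    "Gemini Pro 2K immediate": 5 * 0.134,
    "Gemini Pro 2K batched": 5 * 0.067,
    "OpenAI medium": 5 * 0.04,
}

def savings_pct(actual: float, baseline: float) -> float:
    """Percentage saved by paying `actual` instead of `baseline`."""
    return round(100 * (1 - actual / baseline), 1)

routed_total = round(sum(routed.values()), 3)
print(f"Routed total: ${routed_total}")
for name, cost in baselines.items():
    print(f"{name}: ${cost:.3f} -> {savings_pct(routed_total, cost)}% saved")
```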

Quality Improvements:

Reference images maintain brand consistency
Batch review prevents mistakes before generation
Cost visibility prevents budget overruns
Skills reduce prompt iteration cycles

Installation: 5 Minutes to Running

Prerequisites:

- Python 3.8+
- Claude Desktop or Claude Code
- API keys (one or more): Gemini, OpenAI, Together AI

Setup:

# Clone repository
git clone https://github.com/PeeperFrog/peeperfrog-create.git
cd peeperfrog-create/peeperfrog-create-mcp

# Configuration
cp config.json.example config.json
cp .env.example .env

# Install dependencies
python3 -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate
pip install requests

Configure MCP Client:

Find your settings file:

Claude Code: ~/.claude/settings.json
Claude Desktop (Linux): ~/.config/Claude/claude_desktop_config.json
Claude Desktop (macOS): ~/Library/Application Support/Claude/claude_desktop_config.json
Claude Desktop (Windows): %APPDATA%\Claude\claude_desktop_config.json

Add the server:

{
  "mcpServers": {
    "peeperfrog-create": {
      "command": "/path/to/peeperfrog-create/peeperfrog-create-mcp/venv/bin/python3",
      "args": ["/path/to/peeperfrog-create/peeperfrog-create-mcp/src/image_server.py"],
      "env": {
        "GEMINI_API_KEY": "your-key",
        "OPENAI_API_KEY": "your-key",
        "TOGETHER_API_KEY": "your-key"
      }
    }
  }
}

Restart Claude. Done.
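If your settings file already lists other MCP servers, the new entry has to be merged in rather than pasted over them. A minimal sketch of that merge on an in-memory dict; the helper name and the `other-server` entry are hypothetical, and reading or writing the actual file is left out on purpose.

```python
import json

def add_mcp_server(config: dict, name: str, python_path: str,
                   script_path: str, env: dict) -> dict:
    """Return a copy of a Claude config dict with one MCP server entry added."""
    updated = json.loads(json.dumps(config))  # deep copy via JSON round-trip
    servers = updated.setdefault("mcpServers", {})
    servers[name] = {"command": python_path, "args": [script_path], "env": env}
    return updated

# Hypothetical existing config with one unrelated server already present.
existing = {"mcpServers": {"other-server": {"command": "node", "args": ["server.js"]}}}

merged = add_mcp_server(
    existing,
    "peeperfrog-create",
    "/path/to/peeperfrog-create/peeperfrog-create-mcp/venv/bin/python3",
    "/path/to/peeperfrog-create/peeperfrog-create-mcp/src/image_server.py",
    {"GEMINI_API_KEY": "your-key"},
)
print(json.dumps(merged, indent=2))
```

Only the API keys for providers you actually use need to appear in `env`.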

Install Skills:

For Claude Desktop: Settings > Capabilities > Skills > Upload each SKILL.md from the skills/ folder.

For Claude Code:

cp -r skills/* ~/.claude/skills/

Use Cases Beyond Newsletters

While I built this for newsletter production, the system works for:

Content Marketing:

Blog hero images
Social media graphics
Email campaign visuals
Landing page assets

Documentation:

Technical diagrams
Architecture visualizations
Process flowcharts
Tutorial illustrations

E-commerce:

Product mockups
Lifestyle photography
Promotional graphics
Brand assets

Creative Projects:

Concept art
Storyboarding
Character design
World-building

The key: any workflow where you need multiple AI-generated images with consistent quality, controlled costs, and efficient delivery.

The Open Source Advantage

PeeperFrog Create is Apache 2.0 licensed. This means:

For individuals:

Use it for free forever
Modify for your needs
No vendor lock-in

For teams:

Deploy on your infrastructure
Customize provider routing
Add your own workflows
Integrate with existing tools

For developers:

Extend with new providers
Build custom Skills
Contribute improvements
Fork for specialized needs

The codebase is Python. The skills are markdown files. No complex dependencies. Easy to understand, easier to modify.

What's Next

Current development roadmap:

Near-term:

Image editing capabilities (inpainting, outpainting)
More Claude skills
Social media connections
Template system for common layouts
Additional provider support (Replicate, Stability AI)

Medium-term:

Video generation integration
Animation and motion graphics
Multi-image composition tools
Advanced cost management

Long-term:

Local model support
Fine-tuning integration
Custom model hosting
Enterprise collaboration features

Want to contribute? Issues and PRs welcome at github.com/PeeperFrog/peeperfrog-create.

The Bigger Picture: AI-Assisted Workflows

This project represents a larger shift in how we work with AI tools. The traditional model - AI as chatbot - is giving way to AI as workflow participant.

Old model:

Human → Think of task → Do task → AI assists with parts

New model:

Human → Describe outcome → AI orchestrates entire workflow → Human reviews

MCP enables this transition. Instead of Claude generating text that you copy-paste into tools, Claude directly operates tools based on conversation context.

Image generation is just one domain. The same pattern applies to:

Data analysis and visualization
Code generation and testing
Research and summarization
Content publishing and distribution
Project management and tracking

PeeperFrog Create proves the concept works. Your AI assistant can manage multi-provider services, handle complex workflows, optimize costs, and deliver production-ready results - all from a conversation.

Try It Today

1. Install the MCP server (5 minutes)
2. Add one API key (any provider)
3. Install the Skills (3 minutes)
4. Ask Claude to generate an image

That's it. No tutorials needed. The Skills teach Claude how to use the tools effectively. Auto mode handles the complex decisions. Batch workflows scale your production.

Within 10 minutes, you'll have an AI-powered image pipeline that would take days to build manually.

Repository: github.com/PeeperFrog/peeperfrog-create

Documentation: Full docs in the repo README and individual Skills

Community: Issues, discussions, and PRs welcome

Running newsletters? Content marketing? Building a CMS? Try PeeperFrog Create and cut your image costs in half and production time by 80%. Apache 2.0 licensed, free forever.

DEV Community