DEV Community

jesus manrique

Posted on • Originally published at guayoyo.tech

Images That Sell: Automated Multimedia Generation Without a Designer — Part 3 of 5

In the previous article you built an agent that generates on-brand copy with memory and context. But a social media post without an image is an invisible post.

The problem: a designer costs $500-$1500/month. Cloud AI tools (DALL-E, Midjourney) cost $20-$60/month each and don't understand your brand. Every image you generate looks different.

The solution: Stable Diffusion running on your infrastructure, with LoRAs that learn your visual identity. Costs $0 additional per image and produces brand-consistent results.

Why Stable Diffusion Instead of DALL-E or Midjourney?

| Comparison | DALL-E / Midjourney | Stable Diffusion self-hosted |
|---|---|---|
| Cost | $20-$60/month | $0/image (on a GPU you already have) |
| API automation | Yes | Yes (via ComfyUI) |
| Visual consistency | Low (loose prompts) | High (brand LoRAs) |
| Data privacy | Your images go to OpenAI | Everything stays on your server |
| Technical control | Limited | Full (prompts, models, dimensions) |

Stable Diffusion turns text into images. ComfyUI turns it into an automatable API. LoRAs turn generic results into brand content.

Step 1: Enable ComfyUI in API Mode

# Install with GPU support (NVIDIA)
git clone https://github.com/comfyanonymous/ComfyUI.git
cd ComfyUI
pip install -r requirements.txt

# Download a base model (SDXL recommended for quality)
# Place the .safetensors file in ComfyUI/models/checkpoints/

# Start the server (its HTTP API is enabled by default)
python main.py --listen 0.0.0.0 --port 8188

ComfyUI exposes REST endpoints:

  • POST /api/prompt — execute a workflow
  • GET /api/history/{prompt_id} — get results
  • GET /api/view?filename={name} — download the image
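
These endpoints can be driven from any HTTP client. A minimal Python sketch of the submit → poll → download cycle, assuming ComfyUI is reachable at `localhost:8188` (the workflow dict it submits is covered in Step 4):

```python
import json
import time
import urllib.request

COMFY = "http://localhost:8188"

def submit(workflow: dict) -> str:
    """POST a workflow to /api/prompt and return its prompt_id."""
    req = urllib.request.Request(
        f"{COMFY}/api/prompt",
        data=json.dumps({"prompt": workflow}).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["prompt_id"]

def wait_for_image(prompt_id: str, poll_s: float = 2.0) -> str:
    """Poll /api/history until the job appears, then return the first output filename."""
    while True:
        with urllib.request.urlopen(f"{COMFY}/api/history/{prompt_id}") as resp:
            history = json.load(resp)
        if prompt_id in history:
            for node in history[prompt_id]["outputs"].values():
                for image in node.get("images", []):
                    return image["filename"]
        time.sleep(poll_s)

def view_url(filename: str) -> str:
    """Build the /api/view download URL for a generated image."""
    return f"{COMFY}/api/view?filename={filename}"
```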

Step 2: Prompt Structure That Actually Works

A good prompt for social media isn't "a nice tech image." It's a precise technical instruction:

(quality_tags) subject_description, style_directive, 
lighting_setup, color_palette, camera_framing, 
negative_prompt

Real example for a Guayoyo Tech post:

masterpiece, best quality, 8k, professional photo of a 
modern developer workspace with multiple monitors showing 
code and dashboards, clean minimalist desk, warm ambient lighting 
from desk lamp, blue and teal accent colors (#1A73E8 #22D3EE), 
shallow depth of field, 1080x1080 square composition

Negative: lowres, bad anatomy, text, watermark, blurry, 
oversaturated, people, hands, messy desk, dark shadows

Universal quality tags:

masterpiece, best quality, 8k, highres, sharp focus, 
intricate details, professional lighting

Styles by content type:

  • Technical/DevOps: isometric view, clean technical diagram, blueprint aesthetic, dark mode UI
  • Business: corporate photography, glass office, professional atmosphere, natural window light
  • Abstract/Conceptual: digital art, abstract geometry, tech wave, gradient mesh, minimal
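
A small prompt builder can glue these pieces together automatically. One hypothetical way to assemble the quality tags, style directive, and brand palette (the `STYLES` mapping mirrors the list above):

```python
# Hypothetical prompt builder following the structure described above.
QUALITY_TAGS = "masterpiece, best quality, 8k, highres, sharp focus, professional lighting"

STYLES = {
    "technical": "isometric view, clean technical diagram, blueprint aesthetic, dark mode UI",
    "business": "corporate photography, glass office, professional atmosphere, natural window light",
    "abstract": "digital art, abstract geometry, tech wave, gradient mesh, minimal",
}

PALETTE = "blue and teal accents (#1A73E8, #22D3EE) on dark background (#0b1120)"

def build_prompt(subject: str, content_type: str) -> str:
    """Assemble (quality_tags), subject, style directive, and palette into one prompt."""
    return ", ".join([QUALITY_TAGS, subject, STYLES[content_type], PALETTE])
```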

Guayoyo Tech color palette:

blue and teal accents (#1A73E8, #22D3EE) on dark background (#0b1120)

Step 3: LoRAs — The Consistency Secret

A LoRA (Low-Rank Adaptation) is a mini-model that plugs into Stable Diffusion to teach it a specific concept: your logo, your visual style, your color palette.

Option A: Use public LoRAs (free)

Thousands are available on Civitai.com. For a modern tech style:

# Download popular LoRAs
wget https://civitai.com/api/download/models/XXXXX -O ComfyUI/models/loras/tech-style.safetensors

In the prompt: <lora:tech-style:0.8> modern developer workspace...
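
For automation it helps to inject that tag programmatically rather than hand-editing prompts. A tiny helper (name and weight are illustrative):

```python
def with_lora(prompt: str, lora_name: str, weight: float = 0.8) -> str:
    """Prefix a prompt with a <lora:name:weight> tag."""
    return f"<lora:{lora_name}:{weight}> {prompt}"
```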

Option B: Train your own LoRA (~$2 of cloud GPU time)

With 10-15 reference images of your brand (website screenshots, previous posts, color palette):

# Using Kohya SS (open-source tool)
git clone https://github.com/bmaltais/kohya_ss.git
cd kohya_ss
# Train with 10-15 reference images → ~30 min on T4 GPU

A LoRA trained on your assets produces images that look like your design team made them.

Step 4: ComfyUI Workflow as JSON API

ComfyUI uses JSON workflows. Export one from the UI and send it via API:

{
  "3": {
    "class_type": "KSampler",
    "inputs": {
      "seed": 42, "steps": 25, "cfg": 7.5,
      "sampler_name": "euler_ancestral", "scheduler": "normal",
      "denoise": 1.0, "model": ["4", 0],
      "positive": ["6", 0], "negative": ["7", 0],
      "latent_image": ["5", 0]
    }
  },
  "4": { "class_type": "CheckpointLoaderSimple", "inputs": { "ckpt_name": "sd_xl_base_1.0.safetensors" }},
  "5": { "class_type": "EmptyLatentImage", "inputs": { "width": 1080, "height": 1080, "batch_size": 1 }},
  "6": { "class_type": "CLIPTextEncode", "inputs": { "text": "YOUR PROMPT HERE", "clip": ["4", 1] }},
  "7": { "class_type": "CLIPTextEncode", "inputs": { "text": "YOUR NEGATIVE HERE", "clip": ["4", 1] }},
  "8": { "class_type": "VAEDecode", "inputs": { "samples": ["3", 0], "vae": ["4", 2] }},
  "9": { "class_type": "SaveImage", "inputs": { "filename_prefix": "guayoyo_post", "images": ["8", 0] }}
}
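In practice you template this JSON: load it once, then swap in the prompts and seed before each POST. A sketch where the node ids ("3", "6", "7") match the workflow above:

```python
import copy

def with_prompts(workflow: dict, positive: str, negative: str, seed: int = 42) -> dict:
    """Return a copy of the workflow with prompts and seed filled into nodes 6, 7, and 3."""
    wf = copy.deepcopy(workflow)
    wf["6"]["inputs"]["text"] = positive    # CLIPTextEncode (positive)
    wf["7"]["inputs"]["text"] = negative    # CLIPTextEncode (negative)
    wf["3"]["inputs"]["seed"] = seed        # KSampler
    return wf
```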

n8n HTTP Request node to trigger generation:

POST http://localhost:8188/api/prompt
Headers: Content-Type: application/json
Body: { "prompt": {{ $json.comfyui_workflow }} }

// Response: { "prompt_id": "abc-123" }

// Poll and get result:
GET http://localhost:8188/api/history/abc-123

// Download image:
GET http://localhost:8188/api/view?filename={{ $json.filename }}

Step 5: Dimensions by Platform

| Platform | Format | Dimensions |
|---|---|---|
| Instagram Feed (square) | 1:1 | 1080×1080 |
| Instagram Feed (portrait) | 4:5 | 1080×1350 |
| Instagram Story / Reel | 9:16 | 1080×1920 |
| TikTok | 9:16 | 1080×1920 |
| Facebook / LinkedIn | 1.91:1 | 1200×630 |

In the ComfyUI JSON, change width and height in the EmptyLatentImage node per platform.
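
Rather than keeping one JSON file per platform, you can patch a single workflow in place. A sketch, where node id "5" matches the EmptyLatentImage node from Step 4 and the dimensions come from the table above:

```python
import copy

# Per-platform dimensions, taken from the table above.
DIMENSIONS = {
    "instagram_square": (1080, 1080),
    "instagram_portrait": (1080, 1350),
    "instagram_story": (1080, 1920),
    "tiktok": (1080, 1920),
    "linkedin": (1200, 630),
}

def for_platform(workflow: dict, platform: str) -> dict:
    """Return a copy of the workflow with EmptyLatentImage sized for the platform."""
    wf = copy.deepcopy(workflow)
    width, height = DIMENSIONS[platform]
    wf["5"]["inputs"]["width"] = width
    wf["5"]["inputs"]["height"] = height
    return wf
```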

Step 6: Automatic Quality Validation

Not every generation is usable. Add a validation step using Python's Pillow library:

from PIL import Image, ImageStat
import io, requests

# Convert to RGB so the checks work for grayscale and RGBA outputs too
img = Image.open(io.BytesIO(requests.get(image_url, timeout=30).content)).convert("RGB")

width, height = img.size
if width < 1000 or height < 1000:
    raise ValueError("Image too small")

stat = ImageStat.Stat(img)

# Detect fully black or white images via average brightness
avg_brightness = sum(stat.mean) / 3
if avg_brightness < 10 or avg_brightness > 245:
    raise ValueError("Image solid black or white")

# Detect low contrast (likely flat or blurry) via per-channel standard deviation
if sum(stat.stddev) / 3 < 20:
    raise ValueError("Image contrast too low, likely blurry")

If validation fails, the flow auto-retries with a different seed.
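
That retry loop can be expressed as a thin wrapper. A sketch in which `generate` and `validate` are placeholders for the ComfyUI call and the Pillow checks above:

```python
def generate_with_retry(generate, validate, base_seed: int = 42, max_attempts: int = 3):
    """Call generate(seed); bump the seed by 1000 per attempt until validate passes."""
    last_error = None
    for attempt in range(max_attempts):
        seed = base_seed + attempt * 1000
        image = generate(seed)
        try:
            validate(image)  # expected to raise ValueError on a bad image
            return image, seed
        except ValueError as err:
            last_error = err
    raise RuntimeError(f"All {max_attempts} attempts failed: {last_error}")
```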

The Image Node in Your Pipeline

Your n8n flow gains several new nodes after the agent:

Agent (copy) → ComfyUI Prompt Builder → POST ComfyUI API 
→ Wait 10s → GET Result → Validation → 
  ├─ OK → WhatsApp Preview (Article 4)
  └─ Failed → Retry (seed + 1000)

In ~30 seconds you get an on-brand, validated image ready for preview. No designer. No watermark. No subscription.


Want a self-hosted content pipeline that generates brand-consistent visuals? At Guayoyo Tech, we design and implement the complete solution — on your infrastructure, with your data, under your control.
