
jesus manrique

Posted on • Originally published at guayoyo.tech

Generative AI: From Curiosity to Real Production — The Complete Pipeline — Part 1 of 5


January 2026. The meeting room smells of reheated coffee and frustration. Your marketing team publishes 3 posts per week on Instagram, 2 on TikTok, and the LinkedIn feed has been abandoned since November. You hired an agency charging $1,500/month for "organic content," but their copy sounds like a template and the images look like they came from Canva circa 2019. Meanwhile, ChatGPT writes you poems about your brand and Midjourney generates stunning space landscapes… that have nothing to do with your business.

There's a chasm between playing with AI and producing with AI. This series crosses it.


The Problem Isn't the Technology. It's the Pipeline.

Most companies "using AI for content" are stuck in the loose-prompt phase:

  1. Open ChatGPT
  2. Type "write me an Instagram post about productivity"
  3. Paste the result
  4. Ask a designer to make a "roughly related" image
  5. Publish manually

That doesn't scale. And you know it because you've tried.

What you need is a production pipeline: a chain of automated steps that takes a raw idea and transforms it, without constant human intervention, into publish-ready content across multiple platforms. And you need it running on your own infrastructure — not on OpenAI's cloud billing you per token, not on Midjourney costing $30/user/month.
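As a mental model, the pipeline is just a chain of stages where each stage consumes the previous stage's output. A minimal sketch (hypothetical function names and payloads, not real libraries; the real stages call Ollama, ComfyUI, and the platform APIs covered later in this series):

```python
# Hypothetical sketch of the pipeline this series builds: a raw idea
# flows through four stages and comes out as a publish-ready post.

def draft_copy(idea: str) -> dict:
    # Stage 1: a local LLM turns the idea into platform-ready copy.
    return {"idea": idea, "copy": f"Post about {idea}"}

def attach_image(brief: dict) -> dict:
    # Stage 2: an image generator adds a brand-consistent visual.
    return {**brief, "image": f"{brief['idea']}.png"}

def request_approval(post: dict) -> dict:
    # Stage 3: a human approves via WhatsApp before anything goes live.
    return {**post, "approved": True}

def publish(post: dict) -> dict:
    # Stage 4: push to Instagram/TikTok only if approved.
    return {**post, "published": post["approved"]}

def run_pipeline(idea: str) -> dict:
    result = idea
    for stage in (draft_copy, attach_image, request_approval, publish):
        result = stage(result)
    return result
```

Everything that follows in this series is about replacing those stub functions with real, self-hosted services.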


The Stack: What You'll Build in This Series

By the end of these 5 articles, you'll have:

| Component | Tool | Function |
|---|---|---|
| Local LLM | Ollama + Mistral 7B | Text generation (copy, hashtags, CTAs) |
| Orchestrator | n8n (self-hosted) | Automated workflows |
| Image generation | ComfyUI + Stable Diffusion | Brand-consistent images |
| Human approval | WhatsApp Business API | Review circuit before publishing |
| Publishing | Instagram Graph API + TikTok API | Automated multi-platform publishing |

All self-hosted. All under your control. No recurring API costs except what's strictly necessary.


Why Self-Hosted? The Dollar Difference

Let's run numbers for a real case: 1 daily Instagram post + 1 TikTok, with AI-generated copy and AI-created images.

Cloud Option (all paid APIs)

| Service | Monthly Cost |
|---|---|
| OpenAI GPT-4o API (~90K tokens/day) | $120 |
| Midjourney Pro (1 user) | $30 |
| Make.com (Business plan) | $36 |
| VPS for webhooks | $20 |
| Evolution API (WhatsApp, hosted) | $25 |
| **Monthly subtotal** | **$231** |

And this is optimistic. Scale to 3 daily multi-platform posts and you easily exceed $440/month.

Self-Hosted Option (our architecture)

| Service | Monthly Cost |
|---|---|
| Dedicated server, RTX 3090 + 64GB RAM (Hetzner AX102) | $60 |
| Evolution API (self-hosted) | $0 |
| n8n (self-hosted) | $0 |
| Ollama + Mistral 7B | $0 |
| ComfyUI + Stable Diffusion | $0 |
| Electricity (estimated) | $20 |
| **Monthly total** | **$80** |

Savings: $151/month at this volume. And the gap widens as you grow, because your costs stay flat: at 3 daily posts the cloud bill passes $440 while yours stays at $80, saving over $360/month. You can generate 10x more content without spending a cent more.

Note: The Hetzner AX102 server is just an example. Start with a modest machine if your volume is low. The key is having a GPU with at least 16GB VRAM.
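As a quick sanity check, here is the arithmetic behind both tables (figures taken from the tables above):

```python
# Monthly costs in USD, from the two tables above.
cloud = {
    "OpenAI GPT-4o API": 120,
    "Midjourney Pro": 30,
    "Make.com Business": 36,
    "VPS for webhooks": 20,
    "Evolution API (hosted)": 25,
}
self_hosted = {
    "Dedicated server (Hetzner AX102)": 60,
    "Electricity (estimated)": 20,
}

cloud_total = sum(cloud.values())              # 231
self_hosted_total = sum(self_hosted.values())  # 80
savings = cloud_total - self_hosted_total      # 151 at 1 post/day per platform

# At 3 daily multi-platform posts the cloud side passes $440 while the
# self-hosted side stays at $80, so savings then exceed $360/month.
```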


Step 1: Install Ollama (15 Minutes)

Ollama is the Swiss Army knife of local LLMs. Linux installation:

```bash
# Install Ollama
curl -fsSL https://ollama.com/install.sh | sh

# The installer usually starts Ollama as a service; if it isn't running, start it:
ollama serve

# (in another terminal) verify the CLI can reach the server
ollama list
```

Download the model we'll use — Mistral 7B, excellent balance of quality and speed:

```bash
ollama pull mistral:7b
```

Verify it works:

```bash
curl http://localhost:11434/api/generate -d '{
  "model": "mistral:7b",
  "prompt": "Write an Instagram hook about AI automation, max 15 words, professional but approachable tone.",
  "stream": false
}'
```

Expected response:

```json
{
  "model": "mistral:7b",
  "response": "Let AI handle the repetitive. You focus on what only you can do.",
  "done": true,
  ...
}
```

That was free. No API key. No token limits. No network latency. Your LLM, your server, your data.
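If you'd rather call Ollama from code than from curl, the same endpoint works from Python's standard library. A minimal sketch (no third-party client needed; same request fields as the curl example):

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default port

def build_payload(prompt: str, model: str = "mistral:7b") -> dict:
    # Same fields as the curl example: one non-streaming response.
    return {"model": model, "prompt": prompt, "stream": False}

def generate(prompt: str) -> str:
    # POST the payload and return only the generated text.
    req = urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(build_payload(prompt)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["response"]

if __name__ == "__main__":
    print(generate("Write an Instagram hook about AI automation."))
```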


Step 2: Install n8n with Docker (10 Minutes)

n8n is the pipeline brain — a workflow orchestrator with a visual interface and 400+ integrations.

```bash
# Create a directory for persistent data
mkdir -p ~/n8n-data

# Start n8n with Docker
docker run -d \
  --name n8n \
  --restart unless-stopped \
  -p 5678:5678 \
  -v ~/n8n-data:/home/node/.n8n \
  -e N8N_SECURE_COOKIE=false \
  -e N8N_HOST=localhost \
  -e N8N_PROTOCOL=http \
  n8nio/n8n:latest
```

Access http://localhost:5678 and complete the initial setup (create your admin user). In production you'll want HTTPS and external auth — we'll cover that in article 4.


Step 3: Connect Ollama to n8n

Inside n8n, go to Settings → Credentials → Add Credential:

  1. Search for "Ollama"
  2. In Base URL enter: http://localhost:11434 (or your server IP if Ollama is on another machine)
  3. Save as Ollama Local

Step 4: Your First Flow — Webhook → AI → Response

Create a new workflow in n8n. Add these nodes:

1. Webhook Node

  • HTTP Method: POST
  • Path: test-ai
  • Response Mode: Using 'Respond to Webhook' Node (required because we add that node below)

2. Ollama Chat Model Node

  • Credential: Ollama Local
  • Model: mistral:7b
  • Options → Temperature: 0.7

3. Respond to Webhook Node

  • Connect the Ollama node output to this one

The flow looks like:

```
[Webhook] → [Ollama Chat Model] → [Respond to Webhook]
```

Activate it (toggle "Active" top-right) and test it:

```bash
curl -X POST http://localhost:5678/webhook/test-ai \
  -H "Content-Type: application/json" \
  -d '{
    "topic": "enterprise automation with AI",
    "platform": "instagram"
  }'
```

Response:

```json
{
  "message": "67% of administrative tasks can be automated with AI. It's not magic — it's engineering. 🚀 #Automation #AI"
}
```

You just built your first content generation endpoint. In 30 minutes. Without writing a single line of code.
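Any internal tool can now trigger this flow over HTTP. A sketch of a Python caller, assuming the `test-ai` webhook path from above (the payload shape is whatever your workflow expects):

```python
import json
import urllib.request

N8N_WEBHOOK = "http://localhost:5678/webhook/test-ai"

def build_request(topic: str, platform: str) -> urllib.request.Request:
    # Same JSON body as the curl example above.
    body = json.dumps({"topic": topic, "platform": platform}).encode()
    return urllib.request.Request(
        N8N_WEBHOOK,
        data=body,
        headers={"Content-Type": "application/json"},
    )

def trigger(topic: str, platform: str) -> dict:
    # Fire the webhook and return n8n's JSON response.
    with urllib.request.urlopen(build_request(topic, platform)) as resp:
        return json.load(resp)

if __name__ == "__main__":
    print(trigger("enterprise automation with AI", "instagram"))
```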


What's Coming in This Series

| Article | What You'll Build |
|---|---|
| #2 — The Orchestrator Agent | An agent that understands your brand, audience, and tone. Structured JSON briefs instead of loose prompts. |
| #3 — Images That Sell | Brand-consistent image generation with Stable Diffusion + ComfyUI + custom LoRAs. |
| #4 — Approval Circuit | Human review system via WhatsApp: AI proposes, you approve (or request changes), it publishes. |
| #5 — Multi-Platform Publishing | Connection to Instagram, TikTok, and Google Sheets. Metrics, feedback loop, and scaling to daily content. |

The Difference Between Playing and Producing

Playing with AI is asking ChatGPT to write something and marveling at the result. Producing with AI is having a system that:

  • Receives a product or campaign brief
  • Generates copy adapted to your brand voice
  • Creates images consistent with your visual identity
  • Requests your approval via WhatsApp
  • Publishes automatically to all your platforms
  • Records metrics to improve the next iteration

That's what you're going to build. You don't need a 5-person team. You need a well-designed pipeline, a GPU server, and this article series.

Ready to stop playing and start producing?


Want to implement this in your company? At Guayoyo Tech we build self-hosted generative AI pipelines for companies that need to scale content production without scaling payroll. Let's talk →


Next article: The Orchestrator Agent: Teaching Your AI to Understand Your Brand and Execute Briefs →
