
jesus manrique

Posted on • Originally published at guayoyo.tech

Generative AI: From Curiosity to Real Production — The Complete Pipeline — Part 1 of 5


January 2026. The meeting room smells of reheated coffee and frustration. Your marketing team publishes 3 posts per week on Instagram, 2 on TikTok, and the LinkedIn feed has been abandoned since November. You hired an agency charging $1,500/month for "organic content," but their copy sounds like a template and the images look like they came from Canva circa 2019. Meanwhile, ChatGPT writes you poems about your brand and Midjourney generates stunning space landscapes… that have nothing to do with your business.

There's a chasm between playing with AI and producing with AI. This series crosses it.


The Problem Isn't the Technology. It's the Pipeline.

Most companies "using AI for content" are stuck in the loose-prompt phase:

  1. Open ChatGPT
  2. Type "write me an Instagram post about productivity"
  3. Paste the result
  4. Ask a designer to make a "roughly related" image
  5. Publish manually

That doesn't scale. And you know it because you've tried.

What you need is a production pipeline: a chain of automated steps that takes a raw idea and transforms it, without constant human intervention, into publish-ready content across multiple platforms. And you need it running on your own infrastructure — not on OpenAI's cloud billing you per token, not on Midjourney costing $30/user/month.
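As a mental model, the pipeline is just a chain of stages where each stage consumes the previous stage's output. A minimal sketch (hypothetical function names and payloads, not real libraries; the real stages call Ollama, ComfyUI, and the platform APIs covered later in this series):

```python
# Hypothetical sketch of the pipeline this series builds: a raw idea
# flows through four stages and comes out as a publish-ready post.

def draft_copy(idea: str) -> dict:
    # Stage 1: a local LLM turns the idea into platform-ready copy.
    return {"idea": idea, "copy": f"Post about {idea}"}

def attach_image(brief: dict) -> dict:
    # Stage 2: an image generator adds a brand-consistent visual.
    return {**brief, "image": f"{brief['idea']}.png"}

def request_approval(post: dict) -> dict:
    # Stage 3: a human approves via WhatsApp before anything goes live.
    return {**post, "approved": True}

def publish(post: dict) -> dict:
    # Stage 4: push to Instagram/TikTok only if approved.
    return {**post, "published": post["approved"]}

def run_pipeline(idea: str) -> dict:
    result = idea
    for stage in (draft_copy, attach_image, request_approval, publish):
        result = stage(result)
    return result
```

Everything that follows in this series is about replacing those stub functions with real, self-hosted services.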


The Stack: What You'll Build in This Series

By the end of these 5 articles, you'll have:

| Component | Tool | Function |
|---|---|---|
| Local LLM | Ollama + Mistral 7B | Text generation (copy, hashtags, CTAs) |
| Orchestrator | n8n (self-hosted) | Automated workflows |
| Image generation | ComfyUI + Stable Diffusion | Brand-consistent images |
| Human approval | WhatsApp Business API | Review circuit before publishing |
| Publishing | Instagram Graph API + TikTok API | Automated multi-platform publishing |

All self-hosted. All under your control. No recurring API costs except what's strictly necessary.


Why Self-Hosted? The Dollar Difference

Let's run numbers for a real case: 1 daily Instagram post + 1 TikTok, with AI-generated copy and AI-created images.

Cloud Option (all paid APIs)

| Service | Monthly Cost |
|---|---|
| OpenAI GPT-4o API (~90K tokens/day) | $120 |
| Midjourney Pro (1 user) | $30 |
| Make.com (Business plan) | $36 |
| VPS for webhooks | $20 |
| Evolution API (WhatsApp, hosted) | $25 |
| **Monthly subtotal** | **$231** |

And this is optimistic. Scale to 3 daily multi-platform posts and you easily exceed $440/month.

Self-Hosted Option (our architecture)

| Service | Monthly Cost |
|---|---|
| Dedicated server, RTX 3090 + 64GB RAM (Hetzner AX102) | $60 |
| Evolution API (self-hosted) | $0 |
| n8n (self-hosted) | $0 |
| Ollama + Mistral 7B | $0 |
| ComfyUI + Stable Diffusion | $0 |
| Electricity (estimated) | $20 |
| **Monthly total** | **$80** |

Savings: $151/month at this volume. And the gap widens as you grow, because your costs stay flat: at 3 daily posts the cloud bill passes $440 while yours stays at $80, saving over $360/month. You can generate 10x more content without spending a cent more.

Note: The Hetzner AX102 server is just an example. Start with a modest machine if your volume is low. The key is having a GPU with at least 16GB VRAM.
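As a quick sanity check, here is the arithmetic behind both tables (figures taken from the tables above):

```python
# Monthly costs in USD, from the two tables above.
cloud = {
    "OpenAI GPT-4o API": 120,
    "Midjourney Pro": 30,
    "Make.com Business": 36,
    "VPS for webhooks": 20,
    "Evolution API (hosted)": 25,
}
self_hosted = {
    "Dedicated server (Hetzner AX102)": 60,
    "Electricity (estimated)": 20,
}

cloud_total = sum(cloud.values())              # 231
self_hosted_total = sum(self_hosted.values())  # 80
savings = cloud_total - self_hosted_total      # 151 at 1 post/day per platform

# At 3 daily multi-platform posts the cloud side passes $440 while the
# self-hosted side stays at $80, so savings then exceed $360/month.
```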


Step 1: Install Ollama (15 Minutes)

Ollama is the Swiss Army knife of local LLMs. Linux installation:

```bash
# Install Ollama
curl -fsSL https://ollama.com/install.sh | sh

# The installer usually starts Ollama as a service; if it isn't running, start it:
ollama serve

# (in another terminal) verify the CLI can reach the server
ollama list
```

Download the model we'll use — Mistral 7B, excellent balance of quality and speed:

```bash
ollama pull mistral:7b
```

Verify it works:

```bash
curl http://localhost:11434/api/generate -d '{
  "model": "mistral:7b",
  "prompt": "Write an Instagram hook about AI automation, max 15 words, professional but approachable tone.",
  "stream": false
}'
```

Expected response:

```json
{
  "model": "mistral:7b",
  "response": "Let AI handle the repetitive. You focus on what only you can do.",
  "done": true,
  ...
}
```

That was free. No API key. No token limits. No network latency. Your LLM, your server, your data.
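If you'd rather call Ollama from code than from curl, the same endpoint works from Python's standard library. A minimal sketch (no third-party client needed; same request fields as the curl example):

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default port

def build_payload(prompt: str, model: str = "mistral:7b") -> dict:
    # Same fields as the curl example: one non-streaming response.
    return {"model": model, "prompt": prompt, "stream": False}

def generate(prompt: str) -> str:
    # POST the payload and return only the generated text.
    req = urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(build_payload(prompt)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["response"]

if __name__ == "__main__":
    print(generate("Write an Instagram hook about AI automation."))
```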


Step 2: Install n8n with Docker (10 Minutes)

n8n is the pipeline brain — a workflow orchestrator with a visual interface and 400+ integrations.

```bash
# Create a directory for persistent data
mkdir -p ~/n8n-data

# Start n8n with Docker
docker run -d \
  --name n8n \
  --restart unless-stopped \
  -p 5678:5678 \
  -v ~/n8n-data:/home/node/.n8n \
  -e N8N_SECURE_COOKIE=false \
  -e N8N_HOST=localhost \
  -e N8N_PROTOCOL=http \
  n8nio/n8n:latest
```

Access http://localhost:5678 and complete the initial setup (create your admin user). In production you'll want HTTPS and external auth — we'll cover that in article 4.


Step 3: Connect Ollama to n8n

Inside n8n, go to Settings → Credentials → Add Credential:

  1. Search for "Ollama"
  2. In Base URL enter: http://localhost:11434 (or your server IP if Ollama is on another machine)
  3. Save as Ollama Local

Step 4: Your First Flow — Webhook → AI → Response

Create a new workflow in n8n. Add these nodes:

1. Webhook Node

  • HTTP Method: POST
  • Path: test-ai
  • Response Mode: Using 'Respond to Webhook' Node (required because we add that node below)

2. Ollama Chat Model Node

  • Credential: Ollama Local
  • Model: mistral:7b
  • Options → Temperature: 0.7

3. Respond to Webhook Node

  • Connect the Ollama node output to this one

The flow looks like:

```
[Webhook] → [Ollama Chat Model] → [Respond to Webhook]
```

Activate it (toggle "Active" top-right) and test it:

```bash
curl -X POST http://localhost:5678/webhook/test-ai \
  -H "Content-Type: application/json" \
  -d '{
    "topic": "enterprise automation with AI",
    "platform": "instagram"
  }'
```

Response:

```json
{
  "message": "67% of administrative tasks can be automated with AI. It's not magic — it's engineering. 🚀 #Automation #AI"
}
```

You just built your first content generation endpoint. In 30 minutes. Without writing a single line of code.
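Any internal tool can now trigger this flow over HTTP. A sketch of a Python caller, assuming the `test-ai` webhook path from above (the payload shape is whatever your workflow expects):

```python
import json
import urllib.request

N8N_WEBHOOK = "http://localhost:5678/webhook/test-ai"

def build_request(topic: str, platform: str) -> urllib.request.Request:
    # Same JSON body as the curl example above.
    body = json.dumps({"topic": topic, "platform": platform}).encode()
    return urllib.request.Request(
        N8N_WEBHOOK,
        data=body,
        headers={"Content-Type": "application/json"},
    )

def trigger(topic: str, platform: str) -> dict:
    # Fire the webhook and return n8n's JSON response.
    with urllib.request.urlopen(build_request(topic, platform)) as resp:
        return json.load(resp)

if __name__ == "__main__":
    print(trigger("enterprise automation with AI", "instagram"))
```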


What's Coming in This Series

| Article | What You'll Build |
|---|---|
| #2 — The Orchestrator Agent | An agent that understands your brand, audience, and tone. Structured JSON briefs instead of loose prompts. |
| #3 — Images That Sell | Brand-consistent image generation with Stable Diffusion + ComfyUI + custom LoRAs. |
| #4 — Approval Circuit | Human review system via WhatsApp: AI proposes, you approve (or request changes), it publishes. |
| #5 — Multi-Platform Publishing | Connection to Instagram, TikTok, and Google Sheets. Metrics, feedback loop, and scaling to daily content. |

The Difference Between Playing and Producing

Playing with AI is asking ChatGPT to write something and marveling at the result. Producing with AI is having a system that:

  • Receives a product or campaign brief
  • Generates copy adapted to your brand voice
  • Creates images consistent with your visual identity
  • Requests your approval via WhatsApp
  • Publishes automatically to all your platforms
  • Records metrics to improve the next iteration

That's what you're going to build. You don't need a 5-person team. You need a well-designed pipeline, a GPU server, and this article series.

Ready to stop playing and start producing?


Want to implement this in your company? At Guayoyo Tech we build self-hosted generative AI pipelines for companies that need to scale content production without scaling payroll. Let's talk →


Next article: The Orchestrator Agent: Teaching Your AI to Understand Your Brand and Execute Briefs →
