CIZO
How We Architected an AI Engine That Generates 100+ Ad Creatives From a Single Brand Brief

A technical breakdown of the layered AI pipeline behind a scalable creative strategy system — and what developers can steal from it.


We recently completed an internal AI platform for a performance marketing team managing multiple brands simultaneously. Their core problem was simple to state, hard to solve: they needed dozens of ad creative variations per campaign, but creative production was slow, manual, and impossible to scale.

The system we designed — an AI Creative Strategy Engine — now takes raw brand inputs and produces structured advertising assets (hooks, scripts, image ads, video concepts, UGC scripts) at volume. Here's how we built it, what architecture decisions we made, and what we'd do differently.


The Problem Space

Performance marketing on Meta and TikTok is a volume game. You don't launch one ad — you launch 20–50 variations, let them compete, kill the losers, double down on winners, and repeat. The bottleneck was never strategy. It was production throughput.

The old workflow looked like this:

Brand Brief
    → Marketing Strategy Discussion (days)
    → Creative Team Brainstorming (days)
    → Copywriting & Script Writing (days)
    → Design / Video Production (days)
    → Limited Creative Variations (3–5)
    → A/B Testing

By the time you had testable assets, the market had moved. And scaling this linearly — hiring more writers, more designers — was not a viable answer.


The Architecture: A Layered Intelligence System

The key insight was to stop thinking about this as "AI helping humans write ads" and start thinking about it as a structured data pipeline where brand intelligence flows through transformation layers and emerges as deployable assets.

Here's the full system architecture:

┌─────────────────────────────────────────────┐
│              BRAND INPUTS                   │
│  Website · Product Info · Personas          │
│  Onboarding Forms · Call Transcripts        │
└──────────────────┬──────────────────────────┘
                   │
                   ▼
┌─────────────────────────────────────────────┐
│         BRAND INTELLIGENCE LAYER            │
│  Brand Voice Extraction                     │
│  Product Positioning                        │
│  Audience Understanding                     │
│  Messaging Framework                        │
└──────────────────┬──────────────────────────┘
                   │
                   ▼
┌─────────────────────────────────────────────┐
│        CREATIVE INTELLIGENCE LAYER          │
│  Angle Mining                               │
│  Hook Framework Generation                  │
│  Emotional Trigger Analysis                 │
│  Winning Ad Pattern Library                 │
└──────────────────┬──────────────────────────┘
                   │
                   ▼
┌─────────────────────────────────────────────┐
│            STRATEGY ENGINE                  │
│  Generates: Ad Hooks · Video Scripts        │
│  Creative Briefs · Campaign Concepts        │
│  Content Angles                             │
└──────────────────┬──────────────────────────┘
                   │
                   ▼
┌─────────────────────────────────────────────┐
│          GOVERNANCE / QA LAYER              │
│  Brand Consistency Check                    │
│  Messaging Validation                       │
│  Quality Scoring                            │
│  Structured Formatting                      │
└──────────────────┬──────────────────────────┘
                   │
                   ▼
┌─────────────────────────────────────────────┐
│          CREATIVE GENERATION                │
│  Static Image Ads · Video Ads               │
│  UGC Scripts · Creative Variants            │
└──────────────────┬──────────────────────────┘
                   │
                   ▼
┌─────────────────────────────────────────────┐
│     CAMPAIGN DEPLOYMENT + FEEDBACK LOOP     │
│  Ad Performance → Winning Creatives         │
│  → Influence Future Strategy Generation     │
└─────────────────────────────────────────────┘

Each layer has one job. Outputs are structured. Nothing flows to the next stage without passing validation. Let's break each one down.


Layer 1: Brand Intelligence Extraction

This is the ingestion layer. We feed it:

  • Brand website (scraped + chunked)
  • Product documentation
  • Customer personas
  • Onboarding form responses
  • Sales/marketing call transcripts (Whisper-transcribed)
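
The "scraped + chunked" step for website content can be sketched as a plain overlapping-window splitter. The chunk size and overlap below are illustrative defaults, not the production pipeline's actual values:

```python
def chunk_text(text: str, max_chars: int = 800, overlap: int = 100) -> list[str]:
    """Split scraped page text into overlapping chunks for downstream
    extraction. Overlap keeps sentences that straddle a boundary intact
    in at least one chunk."""
    if max_chars <= overlap:
        raise ValueError("max_chars must exceed overlap")
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + max_chars])
        start += max_chars - overlap
    return chunks
```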

The LLM task here is extraction and structuring, not generation. The prompt engineering goal is to produce a stable, reusable brand object:

{
  "brand_voice": "Direct, empowering, slightly irreverent",
  "core_positioning": "Recovery tech for serious athletes",
  "audience_segments": [
    {
      "id": "seg_01",
      "label": "Competitive weekend warriors",
      "pain_points": ["DOMS", "slow recovery", "missed training days"],
      "language_patterns": ["grind", "bounce back", "next session"]
    }
  ],
  "messaging_pillars": ["Speed of recovery", "Science-backed", "Used by pros"],
  "avoid": ["Medical claims", "Before/after framing", "Aggressive pricing language"]
}

This brand object persists and is referenced by every downstream layer. Consistency comes from the data model, not from re-prompting every time.
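
Pinning the brand object down as a typed model makes "single source of truth" enforceable: every layer parses against the same schema. A sketch with Pydantic, following the field names in the JSON example above (the real schema is presumably more extensive):

```python
from pydantic import BaseModel

class AudienceSegment(BaseModel):
    id: str
    label: str
    pain_points: list[str]
    language_patterns: list[str]

class BrandObject(BaseModel):
    brand_voice: str
    core_positioning: str
    audience_segments: list[AudienceSegment]
    messaging_pillars: list[str]
    avoid: list[str]
```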


Layer 2: Creative Intelligence — Angle Mining

Given the brand object, this layer generates a library of creative angles. An "angle" is a strategic lens through which to frame the product:

  • Pain-first: Lead with the problem (DOMS is killing your gains)
  • Social proof: Lead with credibility (Used by 40,000 athletes)
  • Curiosity: Lead with a surprising claim (Most foam rollers are doing it wrong)
  • Aspirational: Lead with the outcome (What if you recovered overnight?)
  • Contrarian: Challenge conventional wisdom (Ice baths might actually slow recovery)

Each angle maps to a set of emotional triggers pulled from a pattern library built from high-performing historical ads (CTR, ROAS, thumbstop rate).

def mine_angles(brand_object: dict, pattern_library: list[dict]) -> list[dict]:
    """
    Returns ranked creative angles for a brand,
    scored against historical pattern performance.
    """
    prompt = build_angle_mining_prompt(brand_object, pattern_library)
    raw = llm.complete(prompt)
    angles = parse_structured_output(raw, schema=AngleSchema)
    return rank_by_pattern_match(angles, pattern_library)

Layer 3: Strategy Engine — Hook + Script Generation

This is where volume happens. For each angle × audience segment combination, the engine generates:

  • 3–5 ad hooks (the first 3 seconds of a video or first line of copy)
  • Full video scripts (15s, 30s, 60s variants)
  • Static ad copy (headline + body + CTA combinations)
  • UGC creator briefs (instructions for human creators)

The combinatorial math starts working in your favour here: 5 angles × 3 audience segments × 4 hook variants yields 60 unique concepts from a single brand brief, before you've touched image or video generation.

Prompt structure for hook generation:

System: You are a direct-response copywriter specialising in paid social.
        Output ONLY valid JSON. No preamble.

User: Given this brand context: {brand_object}
      And this creative angle: {angle}
      And this audience segment: {segment}

      Generate 5 ad hooks. Each hook must:
      - Be under 8 words
      - Create immediate pattern interrupt
      - Match the brand voice exactly
      - Trigger the emotional lever: {angle.trigger}

      Return as: {"hooks": [{"text": str, "rationale": str, "format": "video|static"}]}
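Even with "Output ONLY valid JSON" in the system prompt, a small fraction of responses come back malformed, so the parse step deserves a guard. A hedged sketch — `llm_complete` is a placeholder for whichever client the pipeline actually uses:

```python
import json

def llm_complete(prompt: str) -> str:
    """Placeholder for the real LLM client call."""
    raise NotImplementedError("wire this to your model provider")

def parse_hooks(raw: str, max_retries: int = 2) -> list[dict]:
    """Parse the hook-generation response; on malformed JSON, ask the
    model to repair its own output before giving up."""
    for attempt in range(max_retries + 1):
        try:
            return json.loads(raw)["hooks"]
        except (json.JSONDecodeError, KeyError, TypeError):
            if attempt == max_retries:
                raise
            raw = llm_complete(
                'Repair this into valid JSON of shape {"hooks": [...]}:\n' + raw
            )
```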

Layer 4: Governance / QA Layer

This is the layer most teams skip — and the one that makes the system production-safe.

Every piece of generated content passes through three checks:

1. Brand consistency scoring

def score_brand_consistency(content: str, brand_object: dict) -> float:
    """
    Returns 0.0–1.0 score. Content below 0.75 is rejected and regenerated.
    Checks: voice match, avoid-list violations, pillar alignment.
    """

2. Regulatory / compliance check
For this client: no unsubstantiated health claims, no before/after framing (platform policy), no superlatives without evidence.
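
The avoid-list portion of the compliance check can start as a cheap lexical scan before any model-based review. A simplified sketch (the phrase list is illustrative; nuanced claims still need the model-based pass):

```python
import re

def find_avoid_violations(content: str, avoid_phrases: list[str]) -> list[str]:
    """Return avoid-list phrases appearing in the content (case-insensitive,
    whole-word). A pre-filter, not a substitute for model-based review."""
    hits = []
    lowered = content.lower()
    for phrase in avoid_phrases:
        pattern = r"\b" + re.escape(phrase.lower()) + r"\b"
        if re.search(pattern, lowered):
            hits.append(phrase)
    return hits
```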

3. Structured formatting validation
Output must conform to the asset schema before being written to the creative library. A video script without a defined hook segment, body, and CTA does not pass.

class CreativeAsset(BaseModel):
    id: str
    type: Literal["video_script", "static_copy", "ugc_brief"]
    angle_id: str
    segment_id: str
    hook: str
    body: str
    cta: str
    brand_consistency_score: float
    approved: bool

Rejected assets are automatically regenerated with failure reason injected into the prompt context. In practice, ~12% of first-pass outputs are rejected and regenerated successfully on retry.


Layer 5: Creative Generation

Approved strategy outputs flow into generation:

| Asset type | Tool / model |
| --- | --- |
| Static image ads | DALL-E 3 / Stable Diffusion (fine-tuned on brand assets) |
| Short-form video | Runway Gen-3 / Kling via API |
| UGC scripts | Passed to human creator network as structured briefs |
| Variant expansion | GPT-4o for copy variations on approved hooks |

The creative generation layer is intentionally modular. We wrap each provider behind an interface:

class CreativeGenerator(Protocol):
    def generate(self, brief: CreativeBrief) -> GeneratedAsset: ...

class RunwayGenerator(CreativeGenerator):
    def generate(self, brief: CreativeBrief) -> GeneratedAsset:
        # Runway-specific implementation
        ...

class StableDiffusionGenerator(CreativeGenerator):
    def generate(self, brief: CreativeBrief) -> GeneratedAsset:
        # SD-specific implementation
        ...

When a better video model ships next month, you swap the implementation. The pipeline doesn't care.
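
Dispatch then reduces to a registry keyed by asset type, so swapping a model is a one-line config change rather than a pipeline change. A runnable miniature of that idea — the briefs are simplified to plain dicts, and the generator bodies are placeholders for the real provider calls:

```python
from typing import Protocol

class CreativeGenerator(Protocol):
    def generate(self, brief: dict) -> dict: ...

class RunwayGenerator:
    def generate(self, brief: dict) -> dict:
        # Real implementation would call the Runway API here
        return {"provider": "runway", "brief": brief}

class StableDiffusionGenerator:
    def generate(self, brief: dict) -> dict:
        # Real implementation would run the fine-tuned SD pipeline here
        return {"provider": "stable_diffusion", "brief": brief}

# Swapping a model is an edit to this registry, nothing else.
GENERATORS: dict[str, CreativeGenerator] = {
    "video": RunwayGenerator(),
    "static_image": StableDiffusionGenerator(),
}

def generate_asset(asset_type: str, brief: dict) -> dict:
    return GENERATORS[asset_type].generate(brief)
```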


Layer 6: The Feedback Loop

This is the layer that turns a tool into a system that learns.

After campaigns run, performance data (CTR, thumbstop rate, ROAS, hook retention) flows back and annotates the creative assets in the library:

def update_pattern_library(
    asset: CreativeAsset, 
    performance: CampaignMetrics
) -> None:
    """
    Winning creatives (top quartile ROAS) are decomposed:
    - Angle type extracted
    - Hook structure tagged
    - Emotional trigger logged
    - Added to pattern library with performance weight
    """
    if performance.roas >= WINNING_THRESHOLD:
        pattern = decompose_winning_creative(asset)
        pattern_library.upsert(pattern, weight=performance.roas)

The pattern library is what the Creative Intelligence Layer (Layer 2) references. The system gets better at generating hooks with every campaign that runs. This compounding loop is the actual product moat.
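
Retrieval from that library can combine pgvector similarity with the stored performance weight. A hedged SQL sketch against the pgvector data layer — the table and column names are illustrative, and `%(embedding)s` binds the query vector:

```python
def winning_pattern_query(limit: int = 5) -> str:
    """Build the retrieval query: nearest patterns by cosine distance
    (pgvector's <=> operator), boosted by their ROAS-derived weight.
    Schema names here are illustrative, not the production schema."""
    return f"""
        SELECT id, angle_type, hook_structure, weight
        FROM pattern_library
        ORDER BY (embedding <=> %(embedding)s) / GREATEST(weight, 0.1)
        LIMIT {limit}
    """
```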


Tech Stack Summary

LLM Backbone:        GPT-4o (strategy/scripts) + Claude (QA/governance)
Orchestration:       LangChain / custom pipeline runner
Image Generation:    DALL-E 3 + Stable Diffusion (brand-finetuned)
Video Generation:    Runway Gen-3 API
Speech-to-Text:      OpenAI Whisper (for call transcript ingestion)
Data Layer:          PostgreSQL + pgvector (embeddings for pattern library)
API Layer:           FastAPI
Frontend:            Next.js (internal dashboard)
Queue:               Redis + Celery (async generation jobs)

What We Learned / What We'd Do Differently

What worked well:

  • The brand object as a persistent, reusable data structure. Every layer referencing a single source of truth eliminated inconsistency.
  • The governance layer. Building QA in as a pipeline stage (not a manual review step) was the right call for production safety.
  • Modular generator interfaces. We've already swapped two models since launch without touching pipeline logic.

What we underestimated:

  • Prompt versioning. Prompt changes break downstream output schemas. Treat prompts like code — version control them, test them, deploy them deliberately.
  • Latency in multi-step pipelines. When you chain 6 LLM calls sequentially, latency compounds. We ended up parallelising Layers 2 and 3 significantly and moving to async generation jobs.
  • The cost of regeneration. The 12% rejection-and-regeneration rate adds up. A tighter governance check earlier in the pipeline (pre-generation rather than post) would have been cheaper.
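
The latency fix mentioned above — parallelising independent calls instead of chaining them — is mostly a matter of fanning out with `asyncio.gather`. A minimal sketch, where `generate` is any async wrapper around an LLM request:

```python
import asyncio
from typing import Awaitable, Callable

async def fan_out(
    briefs: list[dict],
    generate: Callable[[dict], Awaitable[dict]],
) -> list[dict]:
    """Run independent generation requests concurrently; gather preserves
    input order, so results line up with briefs."""
    return await asyncio.gather(*(generate(b) for b in briefs))
```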

🔄 What we'd architect differently:

  • Move brand consistency scoring into the generation prompt context rather than as a post-generation filter. Prevention > correction.
  • Build the feedback loop instrumentation in from sprint 1, not as a v2 feature. The pattern library is only as good as the data flowing into it.

The Broader Point

The system works because it was designed as a data pipeline, not a chatbot with a marketing skin. Each layer has one responsibility, outputs are typed and validated, and the feedback loop creates compounding improvement over time.

These are not novel engineering ideas — they are software engineering fundamentals applied to AI workflows. The teams shipping durable AI products are the ones who treat LLMs as components in a system, not as magic boxes that answer questions.


At CIZO, we design and build AI-powered mobile applications — from architecture and LLM integration to deployment. If you're building an AI product and want to talk architecture, we're always up for a conversation.


Tags: #ai #llm #machinelearning #showdev #productivity

