Most tutorials about AI + web development show the same pattern: give the LLM a prompt, get back HTML, paste it somewhere. It works for prototypes. It breaks in production.
I've been building a system that generates full WordPress sites from text descriptions. Early on, I made the obvious choice: have the AI generate HTML. It took about two weeks to realize that was a terrible idea.
Here's what I learned, and the architecture I ended up with instead.
## Why generating HTML is a trap
When you ask an LLM to generate a web page, it produces markup. Something like:
```html
<div class="hero-section" style="background: #1a1a2e; padding: 80px 40px;">
  <h1 style="color: white; font-size: 48px;">Artisan Gelato</h1>
  <p style="color: #ccc;">Handcrafted flavors since 1987</p>
  <a href="/menu" class="btn" style="background: #e63946;">View Menu</a>
</div>
```
Looks fine. Now try to:
**Edit it with a visual editor.** Elementor, Gutenberg, or any page builder will choke on inline styles and non-standard class names. The user can't drag-and-drop edit something that wasn't built with the editor's data model.

**Update the theme.** When the theme updates, your generated HTML stays frozen in time. Fonts change, spacing changes, colors change — everywhere except your AI-generated sections.

**Make it responsive.** The AI generated desktop markup. Mobile? Tablet? You need media queries that reference the AI's arbitrary class names. Good luck maintaining that.

**Keep it consistent.** Generate 5 pages with 5 separate prompts. Each one uses slightly different class names, different heading hierarchies, different spacing values. There's no design system — just 5 independent HTML blobs.
HTML generation works for one-off demos. It doesn't work for production sites that need to be maintained, edited, and updated by non-technical users.
## The alternative: generate data, not markup
Instead of asking the AI to produce HTML, I have it produce a structured JSON schema that maps to WordPress constructs.
Here's the same hero section, but as structured data:
```json
{
  "type": "section",
  "template": "hero",
  "settings": {
    "background_color": "#1a1a2e",
    "padding": {"top": 80, "bottom": 80, "left": 40, "right": 40},
    "text_align": "center"
  },
  "elements": [
    {
      "type": "heading",
      "content": "Artisan Gelato",
      "tag": "h1",
      "style": {"color": "#ffffff", "font_size": 48}
    },
    {
      "type": "text",
      "content": "Handcrafted flavors since 1987",
      "style": {"color": "#cccccc"}
    },
    {
      "type": "button",
      "content": "View Menu",
      "url": "/menu",
      "style": {"background": "#e63946", "color": "#ffffff"}
    }
  ]
}
```
This JSON gets translated by a rendering layer into whatever the target system needs — Elementor widgets, Gutenberg blocks, WooCommerce products, WordPress custom fields.
## Why this works better
### 1. The visual editor understands it
When the JSON is translated into Elementor's data format, the result is a native Elementor section. The user opens the page builder and sees editable widgets — not a frozen HTML block. They can drag elements, change colors, edit text, exactly like any other Elementor page.
```
JSON schema → Translator → Elementor data model → Native editable page
```
The translator is a mapping layer that knows how to convert each JSON element type into its Elementor equivalent:
- `"type": "heading"` → Elementor Heading widget
- `"type": "text"` → Elementor Text Editor widget
- `"type": "button"` → Elementor Button widget
- `"type": "section"` with `"template": "hero"` → Elementor Section with a specific column structure
### 2. Theme changes don't break anything
The JSON doesn't contain theme-specific code. The translator reads the current theme's settings (fonts, colors, spacing defaults) and applies them during rendering. When the theme updates, the translator re-renders with the new settings. Nothing breaks.
### 3. Consistency is built in
Every page goes through the same translator. The same `"type": "heading"` always produces the same Elementor widget with the same base settings. No more five-pages-five-different-approaches problem.
### 4. The AI's job is simpler
Generating valid JSON with a known schema is much easier for an LLM than generating valid, semantic, accessible, responsive HTML. The error rate drops dramatically.
With HTML generation, about 8-12% of outputs had issues — unclosed tags, broken nesting, inaccessible markup. With JSON generation against a strict schema, the error rate is under 1%.
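A strict schema also makes failures machine-detectable before anything touches WordPress. As a sketch of what validating against the schema described in this article might look like (the function name and the hand-rolled rules are illustrative — a production system would likely use a full JSON Schema validator):

```python
# Allowed values taken from the schema defined later in the system prompt
VALID_TEMPLATES = {"hero", "features", "testimonials", "pricing", "cta",
                   "gallery", "team", "faq", "contact", "footer"}
VALID_ELEMENTS = {"heading", "text", "button", "image", "icon", "list",
                  "card", "column", "spacer", "divider", "form"}

def validate_section(section):
    """Return a list of schema violations (empty list means valid)."""
    errors = []
    if section.get("type") != "section":
        errors.append("root type must be 'section'")
    if section.get("template") not in VALID_TEMPLATES:
        errors.append(f"unknown template: {section.get('template')!r}")
    for i, el in enumerate(section.get("elements", [])):
        if el.get("type") not in VALID_ELEMENTS:
            errors.append(f"element {i}: unknown type {el.get('type')!r}")
        # Enforce the "NEVER include HTML tags" rule mechanically
        if "<" in str(el.get("content", "")):
            errors.append(f"element {i}: HTML tags are not allowed in content")
    return errors
```

Checks like these are cheap to run on every generation, and each violation message can be fed back to the model on retry.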
## The prompt architecture
Getting an LLM to produce reliable structured output requires more than "generate JSON." Here's the approach that works in production.
### System prompt: define the schema
```
You are a web architect. You generate website structures as JSON.

Every response must be valid JSON matching this schema exactly:
- Root: array of sections
- Each section: type, template, settings, elements
- Each element: type, content, style (optional), children (optional)

Valid section templates: hero, features, testimonials, pricing,
cta, gallery, team, faq, contact, footer

Valid element types: heading, text, button, image, icon, list,
card, column, spacer, divider, form

NEVER include HTML tags in content fields.
NEVER add properties not in the schema.
ALWAYS use the exact property names specified.
```
### User prompt: describe the project
```
Generate a website structure for:

"Artisan gelato shop in Florence. Seasonal ingredients from
local farms. Open since 1987. Need: homepage with hero,
menu page, about page with our story, contact page."

For each page, generate the complete section structure
with appropriate templates and real, relevant content.
```
### Key prompt engineering decisions
**Temperature 0.5 for structure, 0.7 for content.** I actually make two passes. First pass: generate the page structure (which sections, which templates, how many elements) at low temperature for consistency. Second pass: generate the actual copy at slightly higher temperature for creativity.

**Explicit constraints reduce errors.** "NEVER include HTML tags" catches the LLM's tendency to sneak in `<br>` and `<strong>` tags inside content fields. "NEVER add properties not in the schema" prevents hallucinated fields that the translator doesn't know how to handle.

**Examples in the system prompt.** I include 2-3 examples of correctly formatted output. This alone reduced schema violations by ~60%.
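The two-pass approach can be sketched as a small orchestration function. Everything here is illustrative: `call_llm` stands in for whatever client the production system uses, and the prompts are compressed versions of the real ones:

```python
import json

def generate_page(brief, call_llm):
    """Two-pass generation: structure at low temperature, copy at higher.

    `call_llm(prompt, temperature)` is any function returning the model's
    raw text response; the prompt wording below is a placeholder.
    """
    # Pass 1: sections and templates only, low temperature for consistency
    structure = json.loads(call_llm(
        "Return ONLY a JSON array of sections (type, template, elements "
        f"with empty content fields) for: {brief}", temperature=0.5))

    # Pass 2: fill in the copy, slightly higher temperature for creativity
    filled = json.loads(call_llm(
        "Fill every empty 'content' field in this structure with copy "
        f"for: {brief}\n{json.dumps(structure)}", temperature=0.7))
    return filled
```

Separating the passes also means a structural failure in pass 1 can be retried cheaply without regenerating any copy.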
## WooCommerce: where structured data really shines
For ecommerce, the gap between HTML generation and structured data is even bigger.
An AI-generated WooCommerce product as HTML is useless — it's just text that looks like a product page. An AI-generated product as structured data actually creates a WooCommerce product:
```json
{
  "type": "product",
  "name": "Pistachio Gelato - 500ml",
  "price": 8.50,
  "category": "Classic Flavors",
  "description": "Made with Bronte pistachios...",
  "sku": "GEL-PIST-500",
  "weight": 0.6,
  "stock_status": "instock",
  "images": ["pistachio-gelato.jpg"],
  "attributes": {
    "size": "500ml",
    "allergens": "Tree nuts, Milk"
  }
}
```
This gets imported directly via the WooCommerce REST API. The product appears in the shop with all its data — price, SKU, categories, attributes — correctly set. The user can manage it from the standard WooCommerce dashboard.
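As a sketch of that import step, the AI's product JSON has to be mapped onto the field names the WooCommerce v3 REST API expects before it can be POSTed to `/wp-json/wc/v3/products`. The function name is hypothetical, and note that the API wants prices as strings and categories as term IDs, so names must be resolved to IDs first:

```python
def to_wc_payload(product, category_ids):
    """Map the AI's product JSON onto a WooCommerce REST API payload.

    `category_ids` maps category names to existing WooCommerce term IDs
    (the v3 REST API references categories by ID, not name).
    """
    return {
        "name": product["name"],
        "type": "simple",
        "regular_price": str(product["price"]),  # WC expects prices as strings
        "sku": product["sku"],
        "weight": str(product["weight"]),
        "stock_status": product["stock_status"],
        "description": product["description"],
        "categories": [{"id": category_ids[product["category"]]}],
        "attributes": [
            {"name": key.capitalize(), "options": [value], "visible": True}
            for key, value in product.get("attributes", {}).items()
        ],
    }
```

The resulting dict is what gets sent (with consumer key/secret authentication) to the products endpoint.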
Compare that to a generated HTML product card. What do you do with it? Paste it somewhere? It doesn't connect to inventory, it doesn't process orders, and it doesn't show up in the WooCommerce product list.
## The translator in practice
The translator is the bridge between the AI's JSON output and WordPress. Here's a simplified version of how it works for Elementor:
```python
def json_to_elementor(section):
    """Convert a JSON section to an Elementor data structure."""
    template_map = {
        "hero": {"columns": 1, "min_height": 500},
        "features": {"columns": 3, "min_height": 0},
        "testimonials": {"columns": 1, "min_height": 0},
        "pricing": {"columns": 3, "min_height": 0},
    }
    element_map = {
        "heading": "heading",  # Elementor widget type
        "text": "text-editor",
        "button": "button",
        "image": "image",
        "icon": "icon",
    }

    config = template_map.get(section["template"], {})
    elementor_section = {
        "elType": "section",
        "settings": {
            "background_color": section["settings"].get("background_color"),
            "padding": section["settings"].get("padding"),
            "min_height": {"size": config.get("min_height", 0)},
        },
        "elements": [],  # widgets (production code nests them inside columns)
    }

    # Convert each JSON element to its Elementor widget equivalent
    for element in section["elements"]:
        widget = {
            "elType": "widget",
            "widgetType": element_map[element["type"]],
            "settings": translate_settings(element),
        }
        elementor_section["elements"].append(widget)

    return elementor_section
```
The actual production translator is more complex (handles nested elements, responsive settings, theme integration), but the principle is the same: map structured data to platform-native constructs.
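The `translate_settings` helper called above isn't shown in the simplified version. A minimal sketch might look like the following — the Elementor setting keys (`title`, `editor`, `text`, `link`, `title_color`) are approximations of the widget content fields, and a real translator would cover far more properties per widget type:

```python
def translate_settings(element):
    """Map a JSON element's content and style onto Elementor widget settings.

    Setting keys here are approximations of Elementor's per-widget fields;
    this is an illustration, not the production mapping.
    """
    # Each widget type stores its main content under a different setting key
    content_key = {"heading": "title", "text": "editor", "button": "text"}
    settings = {content_key.get(element["type"], "content"): element["content"]}

    if element["type"] == "button" and "url" in element:
        settings["link"] = {"url": element["url"]}

    style = element.get("style", {})
    if "color" in style:
        settings["title_color" if element["type"] == "heading"
                 else "text_color"] = style["color"]
    return settings
```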
## Error handling
Even with good prompts, LLMs occasionally produce invalid output. The system handles this with three layers:
**Layer 1: JSON validation.** Parse the response. If it's not valid JSON, retry with the same prompt (up to 2 retries). This catches ~90% of failures.

**Layer 2: Schema validation.** Validate against the expected schema. If a required field is missing or a value is the wrong type, fill in sensible defaults rather than failing. A missing `font_size` gets the theme default. A missing `background_color` gets transparent.

**Layer 3: Rendering validation.** After the translator produces Elementor data, validate that the output is well-formed. Check for empty sections, orphaned elements, or invalid widget types. Remove anything that would break the page builder.
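Composed together, the three layers form a small pipeline. The helper names (`call_llm`, `apply_defaults`, `prune_invalid`) and the default values are illustrative stand-ins for the production components:

```python
import json

DEFAULTS = {"font_size": 16, "background_color": "transparent"}

def safe_generate(prompt, call_llm, retries=2):
    """Layered error handling: parse, fill defaults, then prune."""
    for attempt in range(retries + 1):
        raw = call_llm(prompt)
        try:
            data = json.loads(raw)          # Layer 1: is it JSON at all?
            break
        except json.JSONDecodeError:
            continue                        # retry with the same prompt
    else:
        raise RuntimeError("model never produced valid JSON")

    data = apply_defaults(data)             # Layer 2: fill missing fields
    return prune_invalid(data)              # Layer 3: drop broken pieces

def apply_defaults(sections):
    """Merge schema defaults under whatever style the model did provide."""
    for section in sections:
        for el in section.get("elements", []):
            el["style"] = {**DEFAULTS, **el.get("style", {})}
    return sections

def prune_invalid(sections):
    """Drop empty sections that would confuse the page builder."""
    return [s for s in sections if s.get("elements")]
```

Each layer only has to handle the errors the previous one lets through, which keeps every individual check simple.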
In production, about 98.5% of generations succeed on the first attempt. Another 1% succeed on retry. The remaining 0.5% need manual review — usually because the AI generated content that's valid structurally but doesn't make sense contextually.
## Results
Some numbers from production:
| Metric | HTML generation | JSON + translator |
|---|---|---|
| Valid output rate | ~88% | ~98.5% |
| Editable in page builder | No | Yes |
| Theme-compatible | Fragile | Fully |
| Average generation time | ~45s | ~60s (two passes) |
| User edit rate after generation | ~15% | ~65% |
The last metric is the most telling. When users receive a generated page they can actually edit (because it's native Elementor), 65% of them make at least one modification. When they receive an HTML blob, only 15% try to change anything — most give up because they can't figure out how.
The extra 15 seconds of generation time (for the two-pass approach + translation) is worth it.
## Takeaways
If you're building anything that uses LLMs to generate web content for a CMS:
**Don't generate markup.** Generate structured data and translate it to the target platform's native format.

**Define a strict schema.** The tighter the schema, the more reliable the output. Give the LLM less room to improvise on structure.

**Use two passes.** Structure first (low temperature), content second (higher temperature). Mixing them in one pass increases both structural errors and content blandness.

**Build a robust translator.** The translator is where the real engineering lives. It needs to handle every edge case, fill defaults for missing values, and produce output that the visual editor treats as first-class.

**Validate at every layer.** JSON parsing, schema validation, rendering validation. Each layer catches a different class of errors.
The AI generates the what. The translator handles the how. Keeping them separate is what makes the system reliable at scale.
I've been running this architecture in production at Megify, where we generate WordPress sites and WooCommerce stores from text descriptions. If you want to see how the structured data maps to actual WordPress constructs, there's a technical overview here.
Happy to answer questions about the schema design, prompt engineering, or the translation layer.