DEV Community

quarktimes

I Stopped Fighting Prompts: Locking Down Markdown with Jinja2

We faced a recurring issue in our content generation pipeline: the LLM frequently produced malformed Markdown, such as unclosed code blocks and broken list nesting. Relying on prompt engineering alone became a game of whack-a-mole that we couldn't win.

The core problem? LLM output is probabilistic, and a prompt is only a "soft constraint." No matter how well you phrase it, a slight shift in token sampling can break the syntax and crash the frontend.

The Shift: Data vs. Presentation

We realized we were violating the Single Responsibility Principle. We were asking the model to do two jobs:

  1. Understand the content and generate data.
  2. Format that data into valid Markdown syntax.

Models are great at semantics but terrible at strict formatting rules. So, we decoupled them.
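Once the model is only responsible for data, that data can be checked before anything is rendered. Here is a minimal sketch of such a check, assuming a hypothetical article shape with a `title` and a list of `sections` (each with `heading` and `body`); the field names and the `parse_article` helper are illustrative, not the schema from our pipeline.

```python
import json


def parse_article(raw: str) -> dict:
    """Parse the model's JSON reply and verify the assumed article shape."""
    # json.JSONDecodeError is a subclass of ValueError, so callers can
    # catch ValueError for both malformed JSON and a wrong shape.
    data = json.loads(raw)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    if not isinstance(data.get("title"), str):
        raise ValueError("'title' must be a string")
    sections = data.get("sections")
    if not isinstance(sections, list) or not sections:
        raise ValueError("'sections' must be a non-empty list")
    for i, section in enumerate(sections):
        if not isinstance(section, dict):
            raise ValueError(f"sections[{i}] must be an object")
        for key in ("heading", "body"):
            if not isinstance(section.get(key), str):
                raise ValueError(f"sections[{i}].{key} must be a string")
    return data


# Hypothetical model reply, hard-coded for illustration
reply = (
    '{"title": "Jinja2 basics", "sections": '
    '[{"heading": "Why", "body": "Templates are deterministic."}]}'
)
article = parse_article(reply)
print(article["sections"][0]["heading"])
```

A reply that fails this check can be retried or rejected before it ever reaches the rendering step.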

Solution 1: Jinja2 for Deterministic Rendering

Instead of asking the LLM to write Markdown, we switched to JSON output and let Jinja2 handle the rendering.

Before (Probabilistic):

# LLM generates raw text - hope for the best
prompt = f"Write an article about {topic} in Markdown format."
response = llm.generate(prompt)

After (Deterministic):

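import json

# LLM outputs structured data only
prompt = f"Output data about {topic} in JSON format."
# Parse the reply (assuming llm.generate returns the raw JSON string);
# json.loads raises on invalid JSON instead of passing bad data along
json_data = json.loads(llm.generate(prompt))

# Jinja2 enforces the syntax
md_content = jinja_env.get_template('article.md').render(data=json_data)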

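This moved the formatting from a "maybe" to a "definitely." As long as the template is correct and the field values don't carry stray Markdown syntax of their own (an unescaped code fence, for example), the rendered Markdown is well-formed.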
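To make the rendering step concrete, here is a small sketch of what an `article.md` template could look like. The template body and the field names (`title`, `sections`, `heading`, `body`, `bullets`) are assumptions for illustration, not our production template. `StrictUndefined` is used so that a missing field raises an error instead of silently rendering as an empty string.

```python
from jinja2 import DictLoader, Environment, StrictUndefined

# Hypothetical template: the structure (headings, blank lines, list markers)
# lives here, so the model never has to emit Markdown syntax itself.
ARTICLE_TEMPLATE = """\
# {{ data.title }}
{% for section in data.sections %}

## {{ section.heading }}

{{ section.body }}

{% for item in section.bullets %}
- {{ item }}
{% endfor %}
{% endfor %}
"""

jinja_env = Environment(
    loader=DictLoader({"article.md": ARTICLE_TEMPLATE}),
    undefined=StrictUndefined,  # fail loudly on missing fields
    trim_blocks=True,           # drop the newline after each block tag
    autoescape=False,           # the output is Markdown, not HTML
)

# Example data, standing in for the parsed JSON from the model
json_data = {
    "title": "Jinja2 basics",
    "sections": [
        {
            "heading": "Why",
            "body": "Templates are deterministic.",
            "bullets": ["no syntax drift", "easy to test"],
        }
    ],
}

md_content = jinja_env.get_template("article.md").render(data=json_data)
print(md_content)
```

Because the template is ordinary code, it can be unit-tested once, which is not something a prompt allows.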

Solution 2: The Format Sanitizer Pipeline

Just in case (and for legacy compatibility), we added a post-processing layer with regex validation. It acts as a safety net for unclosed code fences.

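import re

FENCE = "`" * 3  # code-fence marker, built this way to keep a literal fence out of the snippet

def sanitize_markdown(text):
    # Count the lines that open or close a fenced code block
    fences = re.findall(rf"^[ \t]*{FENCE}", text, flags=re.MULTILINE)
    # An odd count means the last block was never closed, so close it
    if len(fences) % 2 == 1:
        text = text.rstrip("\n") + f"\n{FENCE}\n"
    return text

final_markdown = sanitize_markdown(llm_output)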
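A safety net can only repair what it can detect. As a complementary sketch (not our pipeline's actual code), the function below walks the text line by line and reports where an unclosed fence starts. It follows the CommonMark rule that a closing fence must use the same character as the opening one, be at least as long, and carry no info string. The name `find_unclosed_fence` is hypothetical.

```python
import re

# A fence line: up to 3 spaces of indent, then 3+ backticks or 3+ tildes
FENCE_RE = re.compile(r"^ {0,3}(`{3,}|~{3,})")


def find_unclosed_fence(text: str):
    """Return the 1-based line number of an unclosed opening fence, or None."""
    open_marker = None  # the fence string that opened the current block
    open_line = None    # the line it appeared on
    for lineno, line in enumerate(text.splitlines(), start=1):
        match = FENCE_RE.match(line)
        if not match:
            continue
        marker = match.group(1)
        if open_marker is None:
            # Not inside a block: this line opens one
            open_marker, open_line = marker, lineno
        elif (
            marker[0] == open_marker[0]
            and len(marker) >= len(open_marker)
            and line.strip() == marker
        ):
            # Same fence character, long enough, nothing else on the line
            open_marker = open_line = None
    return open_line


fence = "`" * 3  # avoids writing a literal fence inside this example
broken = f"Intro\n{fence}python\nprint('hi')\n"
closed = broken + fence + "\n"

print(find_unclosed_fence(broken))  # 2
print(find_unclosed_fence(closed))  # None
```

Logging the returned line number alongside the request gives a simple way to track how often the model's output still needs repair.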

Bonus: Handling Heterogeneous Data Sources

While fixing the text generation, we also noticed a logic gap in our stock data queries. We treated A-shares, ETFs, and Hong Kong stocks identically. This caused failures because:

  • ETFs need .SH or .SZ suffixes.
  • HK stocks require a separate auth API.

We implemented a router at the query entry point:

def get_stock_data(code):
    # Route HK stocks to their dedicated, separately authenticated API
    if is_hk_stock(code):
        return hk_api.get_price(code)

    # Append an exchange suffix if the code has none.
    # Note: this defaults to .SH; Shenzhen-listed codes need .SZ instead.
    if not code.endswith((".SH", ".SZ")):
        code = f"{code}.SH"

    return api.get_price(code)
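The routing decision can be separated from the API calls so it is testable on its own. The sketch below is one way to do that, and it rests on assumptions that should be verified against your data vendor: Hong Kong codes are taken to be at most five digits (or to carry a `.HK` suffix), and mainland six-digit codes are mapped to an exchange by their first digit (5, 6, 9 for Shanghai; 0, 1, 2, 3 for Shenzhen). The helpers `is_hk_stock`, `with_exchange_suffix`, and `route` are hypothetical stand-ins, not the functions from our codebase.

```python
def is_hk_stock(code: str) -> bool:
    """Assumption: HK codes have at most 5 digits or an explicit .HK suffix."""
    upper = code.upper()
    if upper.endswith(".HK"):
        return True
    return upper.isdigit() and len(upper) <= 5


def with_exchange_suffix(code: str) -> str:
    """Append .SH or .SZ to a bare 6-digit mainland code (assumed prefix rule)."""
    upper = code.upper()
    if upper.endswith((".SH", ".SZ")):
        return upper  # already qualified
    if len(upper) != 6 or not upper.isdigit():
        raise ValueError(f"unrecognized code: {code!r}")
    if upper[0] in "569":
        return upper + ".SH"
    if upper[0] in "0123":
        return upper + ".SZ"
    raise ValueError(f"cannot infer exchange for: {code!r}")


def route(code: str) -> tuple:
    """Return (api_name, normalized_code) without calling any API."""
    if is_hk_stock(code):
        return ("hk", code.upper())
    return ("mainland", with_exchange_suffix(code))


for sample in ("00700", "510300", "159915", "600519.SH"):
    print(sample, "->", route(sample))
```

Raising on codes that match neither rule keeps an unknown instrument from being silently sent to the wrong exchange.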

The Results

By shifting from "Prompt Optimization" to "Engineering Hard Constraints":

  • We processed 50k requests in 2 weeks.
  • Format error rate dropped from 3% to 0%.
  • P99 latency stayed at a manageable 200ms.

Key Takeaway

If you are fighting with LLMs to output perfect HTML or Markdown, stop. Use the LLM for what it's good at—generating structured JSON data—and use a template engine like Jinja2 to enforce the view layer. It turns a probabilistic headache into a deterministic pipeline.
