ThaSha
Building LLM Prompts From Enterprise Data in DataWeave: 2 Traps That Garbled My AI Output

I connected a MuleSoft API to an LLM last quarter for a support ticket classifier. The API call was easy — the MuleSoft AI Connector handles that. Building the prompt payload from enterprise data? That's where I spent 2 hours debugging escape sequences.

TL;DR

  • DataWeave transforms ticket data into structured LLM prompt payloads (system + user roles)
  • joinBy "\n" can reach the model as literal backslash-n characters rather than actual newlines, so the LLM sees one continuous line.
  • No token estimation → prompt consumes most of the context window → truncated response
  • The pattern builds system role, user role, model config, and structured response format in about a dozen lines

The Pattern: Enterprise Data to LLM Prompt

%dw 2.0
output application/json
var systemPrompt = "You are an enterprise support analyst."
var lines = payload.ticketHistory map (t) -> "- [$(upper(t.priority))] $(t.id): $(t.subject)"
var userPrompt = "Analyze tickets for $(payload.customer.name):\n" ++ (lines joinBy "\n")
---
{
  model: payload.model,
  max_tokens: payload.maxTokens,
  messages: [
    {role: "system", content: systemPrompt},
    {role: "user", content: userPrompt}
  ]
}

Input: customer object + ticket array + model config.
Output: ready-to-send LLM payload with system instructions and contextual user prompt.
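For reference, the input shape the script expects (field names taken from the script above; the concrete values here are invented):

```json
{
  "customer": { "name": "Acme Corp" },
  "model": "gpt-4o-mini",
  "maxTokens": 500,
  "ticketHistory": [
    { "id": "TK-101", "priority": "high", "subject": "API timeout" },
    { "id": "TK-098", "priority": "medium", "subject": "OAuth refresh failing" }
  ]
}
```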


100 production-ready DataWeave patterns with tests: mulesoft-cookbook on GitHub


Trap 1: joinBy "\n" Is Literal, Not a Newline

The prompt looks correct in the DataWeave Playground:

- [HIGH] TK-101: API timeout
- [MEDIUM] TK-098: OAuth refresh failing
- [LOW] TK-095: Batch stuck at 80 pct

But the actual JSON sent to the LLM contains:

"content": "- [HIGH] TK-101: API timeout\n- [MEDIUM] TK-098: OAuth refresh failing\n- [LOW] TK-095: Batch stuck at 80 pct"

Literal \n characters reaching the model, not newlines. (A single \n escape in a raw JSON body is normal and gets decoded by the parser; the problem is when the string is escaped twice, so the decoded content still contains backslash-n.) The LLM sees one continuous string and its analysis is garbled: it can't distinguish between tickets.

I spent 2 hours wondering why the classification was wrong before I checked the raw HTTP request body.

The fix: keep a real newline in the string (in DataWeave, "\n" inside a double-quoted string is an actual newline character) and make sure the payload is serialized only once. If the raw request body shows \\n with two backslashes, something upstream is re-escaping an already-serialized string.
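To see the difference side by side, a minimal Playground sketch (the ticket strings are made up): "\n" joins with a real newline, while "\\n" inserts the two literal characters backslash and n, which is what a double-escaped payload looks like to the model.

```dataweave
%dw 2.0
output application/json
var lines = ["- [HIGH] TK-101: API timeout", "- [MEDIUM] TK-098: OAuth refresh failing"]
---
{
  // real newline: the JSON writer escapes it exactly once on output
  joined: lines joinBy "\n",
  // literal backslash-n: how the garbled, double-escaped content reads
  doubleEscaped: lines joinBy "\\n"
}
```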

Trap 2: No Token Estimation

I injected 200 ticket summaries into one prompt. Each summary is ~20 tokens. That's 4,000 tokens just for the ticket list. max_tokens was set to 500 for the response.

The model's context window was 4,096 tokens. Prompt + response budget = 4,500 tokens. Didn't fit. The response was truncated mid-sentence.

The fix: Estimate prompt tokens before setting max_tokens:

// rough heuristic: roughly 4 characters per token for English text
var estimatedPromptTokens = ceil(sizeOf(userPrompt) / 4)
// keep a 100-token safety margin under the 4,096-token context window
var safeMaxTokens = max([0, 4096 - estimatedPromptTokens - 100])
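Wired into the full payload, it looks like this (a sketch: the 4,096-token window and the 4-characters-per-token ratio are assumptions that vary by model and tokenizer):

```dataweave
%dw 2.0
output application/json
var contextWindow = 4096
var lines = payload.ticketHistory map (t) -> "- $(t.id): $(t.subject)"
var userPrompt = "Analyze these tickets:\n" ++ (lines joinBy "\n")
// ~4 characters per token is a rough heuristic for English text
var estimatedPromptTokens = ceil(sizeOf(userPrompt) / 4)
---
{
  model: payload.model,
  // cap the response so prompt + response fit inside the window
  max_tokens: max([0, contextWindow - estimatedPromptTokens - 100]),
  messages: [{role: "user", content: userPrompt}]
}
```

For long histories, a tighter alternative is to drop or summarize the oldest tickets until the estimate fits, rather than only shrinking the response budget.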

When to Use This Pattern

| Use it when | Alternatives |
| --- | --- |
| MuleSoft AI Connector integration | Direct API call with HTTP requester |
| Structured enterprise data → LLM prompt | Hardcoded prompts (won't scale) |
| Dynamic context injection (tickets, customer data) | Static system prompts |
| Multiple LLM providers (swap model field) | Provider-specific SDK |

100 patterns with MUnit tests: github.com/shakarbisetty/mulesoft-cookbook

60-second video walkthroughs: youtube.com/@SanThaParv
