How to detect and protect ESP tokens across 5 different template syntaxes
When you're building a multilingual email workflow, one problem shows up immediately: every major email service provider uses a different syntax for personalization tokens. And every one of them will break silently if a translator touches the wrong characters.
This is the problem we solved building Transendly — a localization workspace for HTML email campaigns. Here's what we learned about token detection across the five most common ESP syntaxes.
The five syntaxes you'll encounter
1. Handlebars — SendGrid, Postmark
SendGrid Dynamic Templates and Postmark both use Handlebars-compatible syntax:
{{first_name}}
{{#if customer.plan}}
You're on the {{customer.plan}} plan.
{{/if}}
{{#each items}}
{{this.name}} — {{this.price}}
{{/each}}
{{{unescaped_html}}}
Key things to detect:
- Double-stache {{variable}}: simple interpolation
- Triple-stache {{{variable}}}: unescaped HTML output (Postmark uses this for {{{pm:unsubscribe}}})
- Block helpers: {{#if}}...{{/if}}, {{#each}}...{{/each}}
- Nested paths: {{customer.first_name}}
The triple-stache is a common failure point. Translators working in raw HTML often "fix" the extra brace because it looks like a typo.
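One way to catch a "fixed" triple-stache early is to compare the number of opening and closing braces on each token. The following is a minimal sketch under that assumption; `classify_handlebars` and its labels are hypothetical names chosen for illustration, not part of SendGrid's or Postmark's tooling.

```python
import re

# Hypothetical helper for illustration; the labels are not part of any ESP API.
# Captures: opening braces, optional sigil (#, /, ^), first identifier, closing braces.
HBS_TOKEN = re.compile(r"(\{{2,3})\s*([#/^]?)\s*([\w.:]+)[^{}]*?(\}{2,3})")


def classify_handlebars(template: str) -> list[tuple[str, str]]:
    """Return (token, kind) pairs in document order."""
    results = []
    for match in HBS_TOKEN.finditer(template):
        opening, sigil, _name, closing = match.groups()
        if len(opening) != len(closing):
            kind = "unbalanced"  # e.g. a triple-stache with one brace removed
        elif sigil in ("#", "^"):
            kind = "block_open"
        elif sigil == "/":
            kind = "block_close"
        elif len(opening) == 3:
            kind = "unescaped"
        else:
            kind = "variable"
        results.append((match.group(0), kind))
    return results
```

Running this on both the source and the translated HTML and comparing the "unbalanced" entries is one cheap way to flag a brace that was removed by hand.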
2. Django template tags — Klaviyo
Klaviyo uses Django-style syntax with a key difference: filter chaining.
{{ first_name }}
{{ first_name|default:"there" }}
{{ order_total|currency }}
{{ description|truncatewords:20 }}
{% if person.plan == "vip" %}
Exclusive content here.
{% endif %}
The |filter syntax is the dangerous part. A translator who sees {{ first_name|default:"there" }} will sometimes translate "there" — which is actually correct — but will also sometimes translate the filter name default or the variable name first_name. Both break the template.
Django tags also use spaces inside the braces ({{ variable }} not {{variable}}), which means your regex needs to handle both variants.
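To see which parts of a filtered token are code and which might be translatable, it can help to split the token into its variable, filter names, and filter arguments. The sketch below is one possible approach, assuming tokens with optional inner spacing and quoted arguments without escaped quotes; `split_filters` is a hypothetical helper, not a Klaviyo API.

```python
import re
from typing import Optional

# Accepts both {{ variable }} and {{variable}}.
VARIABLE_TOKEN = re.compile(r"\{\{\s*(.*?)\s*\}\}")
# Splits on "|" but keeps pipes that appear inside quoted arguments.
PIPE_SPLIT = re.compile(r"""(?:[^|"']|"[^"]*"|'[^']*')+""")


def split_filters(token: str) -> tuple[str, list[tuple[str, Optional[str]]]]:
    """Split a variable token into (variable, [(filter_name, argument), ...])."""
    match = VARIABLE_TOKEN.fullmatch(token)
    if match is None:
        raise ValueError(f"Not a variable token: {token!r}")
    parts = [p.strip() for p in PIPE_SPLIT.findall(match.group(1))]
    if not parts:
        raise ValueError(f"Empty variable token: {token!r}")
    variable, filters = parts[0], []
    for part in parts[1:]:
        name, sep, argument = part.partition(":")
        filters.append((name.strip(), argument.strip() if sep else None))
    return variable, filters
```

With this split, the variable and filter names can be compared byte for byte before and after translation, while the quoted argument is the only part that is allowed to differ.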
3. Liquid — Shopify Email, some ActiveCampaign flows
Liquid is used in Shopify's email templates and some other platforms:
{{ customer.first_name }}
{{ customer.email | upcase }}
{% if customer.orders_count > 1 %}
Thanks for being a returning customer.
{% endif %}
{% for item in order.line_items %}
{{ item.title }}: {{ item.price | money }}
{% endfor %}
Liquid looks similar to Django but has important differences:
- Filter syntax uses | with a space on both sides: {{ value | filter }}
- Block tags use {% %} with the tag name: {% if %}, {% for %}
- Object access uses dot notation: customer.first_name
The {% %} blocks are particularly problematic because translators sometimes interpret them as HTML comments or unknown tags and delete them.
4. Merge tags — Mailchimp, Constant Contact
Mailchimp uses a completely different pattern — asterisk-pipe delimiters:
*|FNAME|*
*|LNAME|*
*|EMAIL|*
*|UNSUB|*
*|MC:SUBJECT|*
*|IF:FNAME|* Hello *|FNAME|*, *|ELSE:|* Hello there, *|END:IF|*
This syntax stands out visually, which is good — translators usually recognize it as "code". But the conditional blocks (*|IF:...|*, *|ELSE:|*, *|END:IF|*) are often mishandled because they look more like markup than the simpler variable tags.
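Because the conditional tags are the ones most often mishandled, a balance check on them is a useful guard. The sketch below is a minimal illustration, assuming conditionals open with `IF:` or `IFNOT:` and close with `END:IF`; the function name and error messages are hypothetical.

```python
import re

MERGE_TAG = re.compile(r"\*\|([A-Z0-9_:]+)\|\*")


def check_mailchimp_conditionals(text: str) -> list[str]:
    """Report conditional merge tags that are opened but not closed, or vice versa."""
    errors = []
    depth = 0
    for match in MERGE_TAG.finditer(text):
        tag = match.group(1)
        if tag.startswith(("IF:", "IFNOT:")):
            depth += 1
        elif tag == "ELSE:" or tag.startswith("ELSEIF:"):
            if depth == 0:
                errors.append(f"*|{tag}|* outside of a conditional block")
        elif tag == "END:IF":
            if depth == 0:
                errors.append("*|END:IF|* without a matching *|IF:...|*")
            else:
                depth -= 1
    if depth > 0:
        errors.append(f"{depth} conditional block(s) left unclosed")
    return errors
```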
5. Percent-delimited — ActiveCampaign, some legacy ESPs
ActiveCampaign uses percent signs as delimiters:
%FIRSTNAME%
%LASTNAME%
%EMAIL%
%UNSUBSCRIBELINK%
%CUSTOM_FIELD_NAME%
This is the simplest syntax to detect but also the easiest to accidentally break. The uppercase convention helps — translators rarely translate uppercase strings — but %UNSUBSCRIBELINK% occasionally gets "translated" to %LIENDEDESABONNEMENT% in French workflows.
Detection approach
For each syntax, you need a regex that:
- Matches the full token including delimiters
- Handles nested or block structures
- Avoids false positives on similar-looking content
Here's a starting point for each:
import re

PATTERNS = {
    # Handlebars: {{variable}}, {{{pm:unsubscribe}}}, {{#if customer.plan}}...{{/if}}
    "handlebars": re.compile(
        r'\{{2,3}[#/^]?\s*[\w.:]+(?:\s+[\w.:"\'=\s,]+)?\s*\}{2,3}'
    ),
    # Django: {{ variable }}, {{ variable|filter }}, {% tag %}
    "django": re.compile(
        r'\{%[-\s]*\w[\w\s"\'=,.|:()]*[-\s]*%\}|\{\{[-\s]*[\w.|:()"\' ]+[-\s]*\}\}'
    ),
    # Liquid: {{ variable | filter }}, {% tag %}, including comparisons such as > and !=
    "liquid": re.compile(
        r'\{%-?\s*[\w\s"\'=<>!,.|:()\-]+\s*-?%\}|\{\{-?\s*[\w.|:()"\'\ ]+\s*-?\}\}'
    ),
    # Mailchimp merge tags: *|TAG|*, *|IF:TAG|*...*|END:IF|*
    "mailchimp": re.compile(
        r'\*\|[A-Z0-9_:]+\|\*'
    ),
    # Percent-delimited: %VARIABLE%
    "percent": re.compile(
        r'%[A-Z][A-Z0-9_]+%'
    ),
}

def detect_esp(html: str) -> str | None:
    """Detect which ESP syntax is present in an HTML template."""
    scores = {}
    for name, pattern in PATTERNS.items():
        matches = pattern.findall(html)
        scores[name] = len(matches)
    if not any(scores.values()):
        return None
    # On a tie, max() returns the first syntax listed in PATTERNS.
    return max(scores, key=scores.get)

def extract_tokens(html: str, esp: str) -> list[str]:
    """Extract all tokens from an HTML template for a given ESP."""
    if esp not in PATTERNS:
        raise ValueError(f"Unknown ESP: {esp}")
    return PATTERNS[esp].findall(html)
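On the false-positive point: the percent-delimited pattern is the one most likely to match ordinary content, because percent-encoded URLs contain sequences such as %C3% that look like short tokens. The sketch below shows one heuristic filter, assuming that a two-character hexadecimal name is never a real field; that assumption and the name `extract_percent_tokens` are mine, not ActiveCampaign's.

```python
import re

PERCENT_TOKEN = re.compile(r"%[A-Z][A-Z0-9_]+%")
HEX_PAIR = re.compile(r"[0-9A-F]{2}")


def extract_percent_tokens(html: str) -> list[str]:
    """Extract %TOKEN% matches, skipping ones that look like URL percent-encoding."""
    tokens = []
    for match in PERCENT_TOKEN.finditer(html):
        inner = match.group(0)[1:-1]
        if HEX_PAIR.fullmatch(inner):
            continue  # assumed to be part of an encoded byte sequence such as %C3%A9
        tokens.append(match.group(0))
    return tokens
```

This is only a heuristic: an encoded sequence placed directly before a real token can still swallow the token's opening percent sign, so excluding href and src values before matching may be the safer route.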
The protection strategy
Once you've detected and extracted tokens, the challenge is keeping them intact through translation while allowing surrounding text to be modified.
The approach we use:
1. Extract and replace with placeholders
def protect_tokens(html: str, esp: str) -> tuple[str, dict]:
    """Replace tokens with stable placeholders. Returns protected HTML and token map."""
    tokens = {}
    protected = html
    for i, token in enumerate(extract_tokens(html, esp)):
        placeholder = f"⟦T{i}⟧"  # Use unusual characters unlikely to appear in translations
        tokens[placeholder] = token
        protected = protected.replace(token, placeholder, 1)
    return protected, tokens

def restore_tokens(translated: str, token_map: dict) -> str:
    """Restore original tokens after translation."""
    restored = translated
    for placeholder, original in token_map.items():
        restored = restored.replace(placeholder, original)
    return restored
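A variation on the same idea is to substitute at the exact match position with a regex callback instead of searching the text again with `str.replace`, and to fail loudly when a placeholder goes missing. The sketch below is self-contained and uses a simplified Handlebars-style pattern purely to stay short; `protect` and `restore` are hypothetical names, not the functions above.

```python
import re

# Simplified Handlebars-style pattern, used only to keep the sketch short.
TOKEN = re.compile(r"\{{2,3}[^{}]+\}{2,3}")
PLACEHOLDER = re.compile(r"⟦T\d+⟧")


def protect(html: str, pattern: re.Pattern) -> tuple[str, dict[str, str]]:
    """Replace each match with a numbered placeholder at its exact position."""
    token_map: dict[str, str] = {}

    def swap(match: re.Match) -> str:
        placeholder = f"⟦T{len(token_map)}⟧"
        token_map[placeholder] = match.group(0)
        return placeholder

    return pattern.sub(swap, html), token_map


def restore(translated: str, token_map: dict[str, str]) -> str:
    """Put the original tokens back, refusing to continue if any placeholder was dropped."""
    missing = [p for p in token_map if p not in translated]
    if missing:
        raise ValueError(f"Placeholders dropped during translation: {missing}")
    # Unknown placeholders are left untouched so a later validation step can flag them.
    return PLACEHOLDER.sub(lambda m: token_map.get(m.group(0), m.group(0)), translated)
```

A round trip then looks like `protected, token_map = protect(source, TOKEN)` followed by `restore(translated, token_map)`; repeated tokens each get their own placeholder, so a translator can reorder them freely.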
2. Validate after restoration
def validate_tokens(original: str, restored: str, esp: str) -> list[str]:
    """Check that all tokens from original are present in restored HTML."""
    original_tokens = set(extract_tokens(original, esp))
    restored_tokens = set(extract_tokens(restored, esp))
    missing = original_tokens - restored_tokens
    added = restored_tokens - original_tokens
    errors = []
    if missing:
        errors.append(f"Missing tokens after translation: {missing}")
    if added:
        errors.append(f"Unexpected tokens after translation: {added}")
    return errors
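A set comparison cannot tell whether a token that appears twice in the source survives only once in the translation. If that matters for your templates, counting occurrences is a small extension. The sketch below assumes a pattern without capture groups (so `findall` returns whole tokens) and uses the Mailchimp pattern as an example; `diff_token_counts` is a hypothetical name.

```python
import re
from collections import Counter

MERGE_TAG = re.compile(r"\*\|[A-Z0-9_:]+\|\*")


def diff_token_counts(original: str, restored: str, pattern: re.Pattern) -> list[str]:
    """Compare how many times each token occurs before and after translation."""
    before = Counter(pattern.findall(original))
    after = Counter(pattern.findall(restored))
    errors = []
    # Counter subtraction keeps only positive differences.
    for token, count in (before - after).items():
        errors.append(f"Missing {count} x {token}")
    for token, count in (after - before).items():
        errors.append(f"Unexpected {count} x {token}")
    return errors
```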
Edge cases that will burn you
Translated filter arguments in Django/Liquid
# Source:
{{ first_name|default:"there" }}
# After translation (broken):
{{ first_name|default:"là" }} # French translator translated "there"
The default fallback string is technically translatable content — but it's inside a token. You need to decide: protect the entire token including the argument, or extract just the argument for translation. We protect the entire token and handle fallback strings separately.
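Handling fallback strings separately could look something like the sketch below: pull the quoted default out of the token, send it through translation on its own, and write it back without touching the variable or filter name. This is an illustration under stated assumptions (double-quoted fallbacks with no escaped quotes), not a description of Transendly's internals; both function names are hypothetical.

```python
import re

# Matches |default:"..." (Django style) and | default: "..." (Liquid style).
# Assumption: fallbacks are double-quoted and contain no escaped quotes.
DEFAULT_ARG = re.compile(r'(\|\s*default\s*:\s*")([^"]*)(")')


def extract_fallbacks(template: str) -> list[str]:
    """Return the fallback strings found in default filters, in document order."""
    return [m.group(2) for m in DEFAULT_ARG.finditer(template)]


def apply_fallbacks(template: str, translations: dict[str, str]) -> str:
    """Swap each fallback for its translation, leaving the rest of the token untouched."""

    def swap(match: re.Match) -> str:
        translated = translations.get(match.group(2), match.group(2))
        if '"' in translated:
            raise ValueError(f"Translation would break the token: {translated!r}")
        return match.group(1) + translated + match.group(3)

    return DEFAULT_ARG.sub(swap, template)
```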
RTL languages and bidirectional tokens
Arabic and Hebrew email templates render right-to-left, but token syntax is always LTR. Browsers resolve the mix using dir attributes and Unicode bidirectional control characters. If you're using a WYSIWYG translation interface, make sure the token placeholders don't inherit the RTL direction and render backwards.
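One option for plain-text editing surfaces is to wrap each placeholder in Unicode directional isolates (U+2066 LEFT-TO-RIGHT ISOLATE and U+2069 POP DIRECTIONAL ISOLATE) while the translator works, then strip them before restoring tokens. The sketch below assumes the ⟦T0⟧-style placeholders used earlier; in an HTML-based editor, a dir="ltr" wrapper element may be the better fit.

```python
import re

LRI = "\u2066"  # LEFT-TO-RIGHT ISOLATE
PDI = "\u2069"  # POP DIRECTIONAL ISOLATE
PLACEHOLDER = re.compile(r"⟦T\d+⟧")
ISOLATED = re.compile(f"{LRI}(⟦T\\d+⟧){PDI}")


def isolate_placeholders(text: str) -> str:
    """Wrap each placeholder so it renders LTR inside RTL text."""
    return PLACEHOLDER.sub(lambda m: f"{LRI}{m.group(0)}{PDI}", text)


def strip_isolates(text: str) -> str:
    """Remove only the isolates that wrap placeholders, leaving other bidi marks alone."""
    return ISOLATED.sub(r"\1", text)
```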
Tokens inside HTML attributes
<a href="https://example.com/account/{{customer_id}}">View account</a>
<img src="{{product_image_url}}" alt="{{product_name}}">
Tokens inside href and src attributes are at higher risk. Some translation tools will attempt to "fix" URLs they detect, normalizing or encoding the token syntax in the process.
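A quick check for this failure mode is to scan the translated HTML for percent-encoded braces, which is what a token looks like after a tool has URL-encoded it. The sketch below covers only brace-delimited tokens and uses the standard library's `urllib.parse.unquote`; `find_encoded_tokens` is a hypothetical name.

```python
import re
from urllib.parse import unquote

# %7B and %7D are the percent-encoded forms of { and }.
ENCODED_TOKEN = re.compile(r"(?:%7B){2,3}.*?(?:%7D){2,3}", re.IGNORECASE)


def find_encoded_tokens(html: str) -> list[tuple[str, str]]:
    """Return (encoded, decoded) pairs for tokens that were URL-encoded in transit."""
    return [(m.group(0), unquote(m.group(0))) for m in ENCODED_TOKEN.finditer(html)]
```

Each pair gives you the damaged text and the token it most likely came from, which is enough to either repair it automatically or surface it to a reviewer.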
Conditional blocks that span multiple translated segments
{% if customer.plan == "premium" %}
You have access to all features.
{% else %}
Upgrade to unlock everything.
{% endif %}
If your translation tool splits segments at sentence boundaries, the {% if %} and {% endif %} may end up in different segments. The translator working on the first segment has no idea the second segment is conditional on the same block.
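One mitigation is to merge segments after splitting so that everything between an opening block tag and its closing tag travels as a single unit. The sketch below assumes segments arrive as a list of strings and tracks only a handful of Django/Liquid block tags; the tag sets and the function name are illustrative, not exhaustive.

```python
import re

BLOCK_TAG = re.compile(r"\{%-?\s*(\w+)[^%]*?-?%\}")
# Illustrative subsets; extend them for the tags your templates actually use.
OPENERS = {"if", "for", "unless", "case"}
CLOSERS = {"endif", "endfor", "endunless", "endcase"}


def merge_block_segments(segments: list[str]) -> list[str]:
    """Join consecutive segments until every opened block tag has been closed."""
    merged, buffer, depth = [], [], 0
    for segment in segments:
        buffer.append(segment)
        for match in BLOCK_TAG.finditer(segment):
            name = match.group(1)
            if name in OPENERS:
                depth += 1
            elif name in CLOSERS:
                depth = max(depth - 1, 0)
        if depth == 0:
            merged.append(" ".join(buffer))
            buffer = []
    if buffer:  # an unclosed block: keep what is left together
        merged.append(" ".join(buffer))
    return merged
```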
What we built
This is the core problem Transendly solves: a governed workflow where token detection, extraction, and restoration happen automatically, translators work in a clean interface without seeing raw token syntax, and validation runs before any locale can be exported to the ESP.
If you're building something similar or have hit edge cases we haven't covered here, I'd be interested to hear about it in the comments.