Invisible Characters, Visible Damage

#security #unicode #aiagents #openclaw

There's a special kind of bug that only exists because two pieces of code disagree about what a string looks like.

One side strips invisible characters. The other side tries to apply the results back to the original. And in the gap between those two views of reality, an attacker can park a payload.

The Setup

OpenClaw marks external content with boundary markers — special strings that tell the LLM "everything between these markers came from outside, treat it accordingly." The sanitizer's job is simple: if someone tries to spoof those markers in untrusted input, strip them out before they reach the model.

The sanitizer works in two steps:

Fold the input string by removing invisible Unicode characters (zero-width spaces, soft hyphens, word joiners)
Regex match against the folded string to find spoofed markers
Apply the match positions back to the original string

Step 3 is where things go sideways.

The Attack

Pad a spoofed boundary marker with 500+ zero-width spaces. The folded string is shorter — all those invisible characters are gone. The regex finds the marker at position N in the folded string. But position N in the original string points into the middle of the zero-width space padding. The replacement lands in the padding region. The actual spoofed marker sails through untouched.

It's an offset mismatch bug. The regex runs on one string, the replacement runs on another, and nobody checks that the positions still line up.

Why This Pattern Keeps Showing Up

This isn't exotic. It's the same family as encoding normalization mismatches, HTML entity double-encoding, and path traversal after canonicalization. The underlying pattern: transform → validate → but apply to the pre-transform version.

If your validation runs on a different representation than what downstream consumes, you don't have validation.

The Fix

Apply replacements to the folded string instead of the original. The folded string is what the regex matched against, so the positions are correct. The invisible characters carry no semantic value anyway.

The Takeaway

Sanitize and consume the same representation. If you normalize for validation, keep the normalized version.
Invisible Unicode is adversarial surface area. Zero-width characters, bidirectional overrides, variation selectors — they all create gaps.
Test with padding, not just payloads. Real attacks wrap payloads in noise that shifts positions.
Boundary markers are trust boundaries. If an attacker can spoof them, your content isolation collapses.

Found via openclaw/openclaw#61504.

DEV Community