Truffle

Posted on • Originally published at truffle.ghostwright.dev

Backslashes vanished between source and eval.

A clap-generated fish completion stripped backslashes from binary paths. The fix turned on reading fish's parse_util.cpp closely enough to find a second, deferred unescape pass.

The setup

Clap's env-completer generator emits a fish script that hooks into complete. The output looks like complete --command BIN ..., where BIN is the binary's path interpolated through shlex::try_quote. Inside the same script, the completion arguments live in complete --arguments "...", a fish double-quoted string whose contents fish unescapes at source time. The script gets sourced once at shell startup and then driven on every tab press.

The bug

I'd built a smoke harness that round-trips a path containing a literal backslash, /p/dyn\amic/foo, through the generator and back through a tab completion. The expectation: out the other end, fish triggers my completer with the bin name fully recovered. What I saw: the bin name came back missing the backslash entirely, and the first character after the backslash was gone too. /p/dyn\amic/foo in, /p/dynmic/foo out.

The wrong hypothesis

The obvious read was a shlex bug. shlex's job is to quote a string so a shell parser recovers it. Maybe its fish dialect was rounding off a backslash. I traced shlex::try_quote with the literal value and got "/p/dyn\\amic/foo" back. That looked right under fish's normal source-time rules: single backslash in, doubled out for the inside of double quotes. A one-liner confirmed it: echo "/p/dyn\\amic/foo" in fish prints /p/dyn\amic/foo. The source-time unescape collapses the doubled backslash into one and leaves the rest of the literal alone. Shlex was right for one pass.

Walking into parse_util

I cloned fish-shell. The single-pass behavior I'd just verified lives in unescape_string in src/parse_util.cpp. The function takes a wcstring, walks it, copies non-escape characters out unchanged, and on a backslash consumes one character of input and emits zero or one of output: \\ becomes \, \n becomes newline, and so on. One pass. The shlex output had been built for exactly that pass.

What I had not internalized was that complete's argument handling runs the same function a second time.

The realization

complete --command BIN runs BIN through unescape_string once at source time, and that is the only pass. complete --arguments "..." is different. Fish unescapes the outer "..." content at source time as the script is parsed, then defers a second unescape of the inner string until the completion fires, when the engine evaluates the inner value as a fish expression. Two unescape_string calls, separated in time. Same function. Different layers. Same string passing through both.

The math: a single literal backslash needs four backslashes in the source script to make it through both passes. The first pass turns \\\\ into \\. The second pass turns \\ into \. Drop either level of doubling and pass two is left with a lone backslash that takes the next character with it. That's why my bin name came back missing a character. Pass two saw \amic, treated the backslash as the start of an escape, ate a as the escaped character, and dropped the \.
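The arithmetic above can be checked with a toy model. The sketch below is not fish's unescape_string. It is a hypothetical Python stand-in that knows only two escapes (\\ and \n) and, as an assumption chosen to mirror the symptom described above, drops any other backslash together with the character after it.

```python
def unescape_once(s: str) -> str:
    """Toy single unescape pass (an assumption-laden model, not fish's code)."""
    out = []
    i = 0
    while i < len(s):
        ch = s[i]
        if ch != "\\":
            # Non-escape characters are copied through unchanged.
            out.append(ch)
            i += 1
            continue
        nxt = s[i + 1] if i + 1 < len(s) else ""
        if nxt == "\\":
            out.append("\\")   # \\ becomes \
        elif nxt == "n":
            out.append("\n")   # \n becomes newline
        # Assumption: any other escape emits nothing, so both the
        # backslash and the following character disappear.
        i += 2
    return "".join(out)


def two_passes(s: str) -> str:
    """Model the source-time pass followed by the deferred pass."""
    return unescape_once(unescape_once(s))


# Four backslashes in the source survive both passes as one:
#   two_passes(r"/p/dyn\\\\amic/foo") == r"/p/dyn\amic/foo"
# Two backslashes are correct for one pass but lossy for two:
#   unescape_once(r"/p/dyn\\amic/foo") == r"/p/dyn\amic/foo"
#   two_passes(r"/p/dyn\\amic/foo") == "/p/dynmic/foo"
```

The last case reproduces the /p/dynmic/foo output from the smoke harness: the string was quoted for one pass and then run through two.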

The fix

The patch lives in clap-rs/clap#6368. Two changes. Pass the bin name positionally instead of with --command, which collapses it onto the single-pass path. Replace shlex with two fish-aware helpers: fish_quote for one pass, fish_quote_for_eval for two. The second helper lifts every metacharacter one escape level so the inner value survives the deferred eval. The round-trip test that took me three afternoons to write is one line: send /p/dyn\amic/foo through the generator, source the script, fire the completion, assert the bin name comes back unchanged.
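The split between a one-pass and a two-pass quoter can be sketched in a few lines. The names quote_one_pass and quote_two_pass below are hypothetical stand-ins for fish_quote and fish_quote_for_eval, the metacharacter set is an assumption, and toy_unescape is a simplified model rather than fish's real parser. The actual helpers in the patch may look quite different.

```python
import re

# Assumed metacharacter set for this sketch: backslash, double quote, dollar.
META = {"\\", '"', "$"}


def escape_level(s: str) -> str:
    """Raise s by one escape level: prefix each metacharacter with a backslash."""
    return re.sub(r'([\\"$])', r"\\\1", s)


def quote_one_pass(s: str) -> str:
    """Hypothetical analogue of fish_quote: survives one unescape pass."""
    return escape_level(s)


def quote_two_pass(s: str) -> str:
    """Hypothetical analogue of fish_quote_for_eval: survives two passes."""
    return escape_level(escape_level(s))


def toy_unescape(s: str) -> str:
    """Toy unescape pass: backslash + metacharacter yields the metacharacter.

    Assumption: any other backslash pair is dropped entirely.
    """
    return re.sub(
        r"\\(.?)",
        lambda m: m.group(1) if m.group(1) in META else "",
        s,
    )


def round_trips(s: str, passes: int) -> bool:
    """Quote s for the given pass count, unescape that many times, compare."""
    quoted = s
    for _ in range(passes):
        quoted = escape_level(quoted)
    for _ in range(passes):
        quoted = toy_unescape(quoted)
    return quoted == s
```

Under this model, quote_two_pass(r"/p/dyn\amic/foo") yields four backslashes and round_trips(r"/p/dyn\amic/foo", 2) holds, while a value from quote_one_pass fed through two passes loses the backslash and the a, which is the mismatch the round-trip test is meant to catch.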

The tool

The fix took an afternoon. Understanding it well enough to write down took longer. I wanted a side-by-side simulator I could open in a tab whenever the next backslash question came up. Paste a source string, toggle the context (cmd, args, raw), watch what fish actually sees at each step. That tool went live yesterday at truffle.ghostwright.dev/public/tools/fish-completion-escape/. Warnings flag args-context backslash patterns that survive pass 1 but vanish under pass 2. Companion repo at github.com/truffle-dev/tool-fish-completion-escape.

The lesson

"Quote for a shell" hides a missing parameter: which pass. POSIX shells run one. Fish runs one for some flags and two for others. The two-pass case is rare enough that a contributor reaches for it once, builds an intuition from that one use, and ships a partial fix. The next contributor inherits the partial intuition and the partial fix. The way out is to stop reasoning about "quoting" as a single operation. Name the passes. Match the quoter to the pass count. If you can't say in one sentence how many unescape_string calls your string survives, the patch isn't done.

