app.get('/users/:id', ...) is one of those one-liners every Node developer types a hundred times before wondering what it actually does. The answer: Express hands the string to path-to-regexp, which compiles it to a regex along the lines of /^\/users\/([^/]+)$/. That whole pipeline fits in about 100 lines of vanilla JS. Reimplementing it surfaces the subtle bugs you've probably hit at least once: the :id? modifier that has to swallow the leading slash, the regex meta character that wasn't escaped, the inline regex with an unbalanced paren. The result is a browser-only Express route tester with 23 unit tests pinning the boundaries.
🌐 Demo: https://sen.ltd/portfolio/regex-route/
📦 GitHub: https://github.com/sen-ltd/regex-route
Why rewrite path-to-regexp
The standard answer is "you don't need to, just use Express." That's correct until the day you hit one of these:
- /users/:id matches /users/42/extra in your mental model, but it actually doesn't: :id is exactly one segment.
- /users/:id? doesn't match /users in your test, because you forgot that the optional ? modifier has to absorb the leading slash too.
- You upgraded Express 4 → 5, which moved from path-to-regexp 0.1.x to 8.x, and now some of your patterns parse differently in subtle ways.
Walking through the parser by hand makes all three problems go away. And the parser is small enough to actually finish.
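To make the first point concrete, here is a quick check using the simplified regex quoted in the introduction. It is a sketch of the one-segment behavior, not Express's exact generated output.

```javascript
// Simplified regex for '/users/:id', as quoted in the introduction.
const usersById = /^\/users\/([^/]+)$/;

// [^/]+ cannot cross a slash, so :id captures exactly one segment.
const hit = usersById.exec("/users/42");
const miss = usersById.exec("/users/42/extra");

console.log(hit && hit[1]); // "42"
console.log(miss);          // null
```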
A 4-state direct-style parser
The supported grammar:
| Syntax | Meaning |
|---|---|
| /foo/bar | Static segments; regex meta chars are escaped. |
| /users/:id | Named param; matches one non-slash segment. |
| /users/:id? | Named param, optional. The leading / is absorbed too. |
| /users/:id(\d+) | Named param constrained to an inline regex. |
| /files/* | Wildcard; captures the rest of the path. |
Walking the pattern character-by-character:
while (i < pattern.length) {
  const ch = pattern[i];
  if (ch === ":") {
    // Named param: ':name', ':name(regex)', ':name?', or combinations.
    // ...
  } else if (ch === "*") {
    keys.push({ name: "wild", modifier: "*", custom: null });
    re += "(.*)";
    i++;
  } else if (REGEX_META.includes(ch)) {
    // Static text containing a regex meta char: escape it so the
    // generated regex still matches literally.
    re += "\\" + ch;
    i++;
  } else if (ch === "/") {
    re += "\\/"; // optional for engine, but makes the printed regex copy/pasteable
    i++;
  } else {
    re += ch;
    i++;
  }
}
Two points worth calling out:
- Always escape regex meta chars in static text. /foo.bar without escaping the . would silently match /fooXbar. The real-world version of this bug usually shows up months later, when an unexpected URL hits an unexpected handler.
- Always write \/, not /, in the generated regex. The engine treats the two identically, but the escaped form lets me print the regex to the UI status line and have it be copy-pasteable into a new RegExp(...) call.
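The escaping point is easy to verify in isolation. The sketch below uses a hypothetical escapeStatic helper and an assumed REGEX_META list (the real list in route.js is not shown here and may differ) to compare an unescaped and an escaped compilation of /foo.bar.

```javascript
// Assumed list of regex meta characters; the real REGEX_META may differ.
const REGEX_META = ".*+?^${}()|[]\\";

// Hypothetical helper: escape every meta char (and '/') in static text.
function escapeStatic(text) {
  let out = "";
  for (const ch of text) {
    out += REGEX_META.includes(ch) || ch === "/" ? "\\" + ch : ch;
  }
  return out;
}

const unescaped = new RegExp("^" + "/foo.bar" + "$");
const escaped = new RegExp("^" + escapeStatic("/foo.bar") + "$");

console.log(unescaped.test("/fooXbar")); // true: '.' matched any character
console.log(escaped.test("/fooXbar"));   // false
console.log(escaped.test("/foo.bar"));   // true
console.log(escaped.source);             // ^\/foo\.bar$
```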
The :id? optional-segment trap
The naïve implementation makes /users/:id? into ^\/users\/([^/]+)?$. That matches /users/ (with trailing slash) but not /users. Almost certainly not what the user wanted.
The fix: when you see the ? modifier, walk back and absorb the preceding \/ into the optional group:
if (modifier === "?") {
  if (re.endsWith("\\/")) {
    re = re.slice(0, -2) + `(?:\\/(${seg}))?`;
  } else {
    re += `(${seg})?`;
  }
}
Now /users/:id? becomes ^\/users(?:\/([^/]+))?$, which correctly matches both /users and /users/42. Pinned in the tests:
test("compilePath: :param? makes the segment + leading slash optional", () => {
  const c = compilePath("/users/:id?");
  assert.equal(matchPath(c, "/users").params.id, null);
  assert.equal(matchPath(c, "/users/42").params.id, "42");
});
This single mistake is responsible for a non-trivial fraction of "why isn't my optional route working" Stack Overflow questions about Express.
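The difference between the two generated regexes can be checked directly. This sketch hard-codes both patterns from the text rather than calling compilePath.

```javascript
// Naive output: only the capture group is optional, the slash is still required.
const naive = /^\/users\/([^/]+)?$/;
// Fixed output: the slash moves inside the optional non-capturing group.
const fixed = /^\/users(?:\/([^/]+))?$/;

for (const url of ["/users", "/users/", "/users/42"]) {
  console.log(url, naive.test(url), fixed.test(url));
}
// "/users"    -> naive false, fixed true
// "/users/"   -> naive true,  fixed false
// "/users/42" -> naive true,  fixed true
```

Note that the fixed form, as written, no longer accepts the bare trailing slash; whether that matters depends on how closely you want to mirror Express's non-strict routing.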
Inline regex with paren balance
:id(\d+) is a parameter constrained to a custom regex. Naïve indexOf(")") breaks the moment someone writes a nested group like :date((\d{4})-(\d{2})). Use a depth counter and respect backslash escapes:
export function findMatchingParen(s, start) {
  if (s[start] !== "(") return -1;
  let depth = 0;
  for (let i = start; i < s.length; i++) {
    if (s[i] === "\\") { i++; continue; } // skip the escaped char
    if (s[i] === "(") depth++;
    else if (s[i] === ")") {
      depth--;
      if (depth === 0) return i;
    }
  }
  return -1;
}
The if (s[i] === "\\") { i++; continue; } line is the one that gets forgotten: without it, :id(\)) would be parsed as ending at the escaped \) instead of the real closing paren. Tested:
test("findMatchingParen respects backslash escapes", () => {
  // (\)) — string length 4. Inner \) is escaped; the trailing ) closes.
  assert.equal(findMatchingParen("(\\))", 0), 3);
});
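To show where the paren matcher fits, here is a minimal sketch of how the named-param branch could use it. readParam is a hypothetical helper, not the function from route.js, and findMatchingParen is copied in (without export) so the block runs on its own.

```javascript
// Copy of findMatchingParen from above, so the sketch is self-contained.
function findMatchingParen(s, start) {
  if (s[start] !== "(") return -1;
  let depth = 0;
  for (let i = start; i < s.length; i++) {
    if (s[i] === "\\") { i++; continue; } // skip the escaped char
    if (s[i] === "(") depth++;
    else if (s[i] === ")") {
      depth--;
      if (depth === 0) return i;
    }
  }
  return -1;
}

// Hypothetical helper: parse ':name', ':name(regex)', ':name?' or a
// combination, starting at pattern[start] === ':'.
// Returns the parsed pieces plus the index to resume scanning from.
function readParam(pattern, start) {
  let i = start + 1;
  let name = "";
  while (i < pattern.length && /\w/.test(pattern[i])) name += pattern[i++];
  if (name === "") throw new Error(`missing param name at ${start}`);

  let custom = null;
  if (pattern[i] === "(") {
    const close = findMatchingParen(pattern, i);
    if (close === -1) throw new Error(`unbalanced paren at ${i}`);
    custom = pattern.slice(i + 1, close); // regex body without outer parens
    i = close + 1;
  }

  let modifier = null;
  if (pattern[i] === "?") { modifier = "?"; i++; }

  return { name, custom, modifier, next: i };
}

const dateParam = readParam("/on/:date((\\d{4})-(\\d{2}))", 4);
console.log(dateParam.name, dateParam.custom); // date (\d{4})-(\d{2})
```

Returning the next index keeps the outer character loop in charge of advancing, which is one way to keep the branch small; the real implementation may structure this differently.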
Stripping query and hash before matching
/users/42?include=author should match /users/:id, because that is how Express behaves. But path-to-regexp itself doesn't strip the query; Express does that upstream, before the pattern ever sees the URL. For a standalone tester, we have to do the stripping ourselves before running the match:
const hashPos = url.indexOf("#");
const noHash = hashPos === -1 ? url : url.slice(0, hashPos);
const qPos = noHash.indexOf("?");
const pathPart = qPos === -1 ? noHash : noHash.slice(0, qPos);
const query = qPos === -1 ? "" : noHash.slice(qPos + 1);
The stripped query goes into the result object so the UI can show "query: ?include=author" as a separate line, without affecting whether the match succeeded.
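Wrapped in a function, the stripping steps are easy to exercise. splitUrl is a hypothetical name used for this sketch only.

```javascript
// Hypothetical wrapper around the stripping steps above.
function splitUrl(url) {
  // Drop the fragment first, so a '?' inside it is never taken for a query.
  const hashPos = url.indexOf("#");
  const noHash = hashPos === -1 ? url : url.slice(0, hashPos);
  const qPos = noHash.indexOf("?");
  return {
    pathPart: qPos === -1 ? noHash : noHash.slice(0, qPos),
    query: qPos === -1 ? "" : noHash.slice(qPos + 1),
  };
}

console.log(splitUrl("/users/42?include=author#bio"));
// pathPart: "/users/42", query: "include=author"
console.log(splitUrl("/users/42#a?b"));
// pathPart: "/users/42", query: ""
```

The order matters: removing the fragment before looking for ? means /users/42#a?b yields an empty query rather than "b".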
URL-decoding captured values, safely
/users/%E5%B1%B1%E7%94%B0 is the percent-encoding for /users/山田. The captured group is the raw %E5%B1%B1%E7%94%B0, and the user wants to see the kanji. Run it through decodeURIComponent — but guard against malformed input:
try {
  params[key.name] = decodeURIComponent(raw);
} catch {
  // Lone '%' or other malformed percent-encoding raises URIError.
  // Don't fail the match; surface the raw value instead.
  params[key.name] = raw;
}
decodeURIComponent throws URIError on invalid sequences. Without the catch, anyone who pastes /users/foo%bar into the URL field would see the entire results table blow up.
test("matchPath: malformed percent-encoding returns the raw value", () => {
  const c = compilePath("/users/:name");
  const r = matchPath(c, "/users/foo%bar");
  assert.ok(r !== null);
  assert.ok(typeof r.params.name === "string");
});
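The guard can also be tried on its own. safeDecode is a hypothetical helper name; it only wraps the try/catch shown above.

```javascript
// Hypothetical helper: decode a captured value, falling back to the raw
// string when the percent-encoding is malformed.
function safeDecode(raw) {
  try {
    return decodeURIComponent(raw);
  } catch {
    return raw; // URIError swallowed on purpose
  }
}

console.log(safeDecode("%E5%B1%B1%E7%94%B0")); // "山田"
console.log(safeDecode("a%20b"));              // "a b"
console.log(safeDecode("foo%bar"));            // "foo%bar" (raw fallback)
```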
The full API in five exports
// route.js (~100 lines)
export class CompileError extends Error { /* with pos */ }
export function findMatchingParen(s, start) { /* depth-tracking */ }
export function compilePath(pattern) { /* → {regex, keys, source, generated} */ }
export function matchPath(compiled, url) { /* → {url, pathPart, query, params} | null */ }
export function testRoute(pattern, url) { /* compile + match shortcut, no throws */ }
script.js wires the DOM: pattern input → debounce 80 ms → compile → re-render the results table. URL textarea → split on newline → match each line. The output is a side-by-side educational table that makes the route → regex mapping obvious.
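The two non-DOM pieces of that wiring can be sketched without a browser. Both functions below are assumptions about how script.js might be structured, not its actual code: a generic debounce, and a hypothetical splitLines that turns the textarea content into one URL per line.

```javascript
// Generic debounce: run fn only after `ms` milliseconds without a new call.
function debounce(fn, ms) {
  let timer = null;
  return (...args) => {
    clearTimeout(timer);
    timer = setTimeout(() => fn(...args), ms);
  };
}

// Hypothetical: one URL per textarea line; surrounding whitespace trimmed,
// blank lines skipped.
function splitLines(text) {
  return text
    .split(/\r?\n/)
    .map((line) => line.trim())
    .filter((line) => line !== "");
}

// Usage sketch: the render callback here just logs the URLs.
const rerender = debounce((text) => console.log(splitLines(text)), 80);
rerender("/users/42\n\n/files/a/b.txt");
```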
TL;DR
- The static parts of route patterns need regex-meta escaping. /foo.bar matching /fooXbar is a classic silent bug.
- :id? has to absorb the leading slash into the optional group, otherwise /users won't match.
- Inline-regex paren balance needs a depth counter and backslash-escape handling.
- Strip ?query and #fragment before matching; surface the query separately.
- Always wrap decodeURIComponent in a try/catch: malformed percent-encoding throws.
Source: https://github.com/sen-ltd/regex-route — MIT, ~350 lines of JS, 23 unit tests, no build step, zero runtime dependencies.
🛠 Built by SEN LLC as part of an ongoing series of small, focused developer tools. Browse the full portfolio for more.
