SEN LLC

Posted on Apr 15

HTTP Cookies in the Wild Are a Mess — I Built a CLI That Eats Them

#typescript #node #cookies #tutorial

HTTP Cookies in the Wild Are a Mess — I Built a CLI That Eats Them

A zero-dependency TypeScript CLI that takes a raw Cookie: header from
devtools, curl -v, or an nginx log and pretty-prints every cookie, auto-
decoding JWTs, base64 blobs, and URL-encoded values in place.

📦 GitHub: https://github.com/sen-ltd/cookie-decode

I spend more time than I'd like in the middle of authentication bug hunts,
and those hunts all start the same way: I copy a line out of devtools or a
curl -v 2>&1 run and end up staring at something like this.

Cookie: session=9f8d2a1b4c5e6f7a8b9c0d1e2f3a4b5c; theme=dark; token=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJzdWIiOiIxMjM0NTY3ODkwIiwibmFtZSI6IkpvaG4gRG9lIiwiaWF0IjoxNTE2MjM5MDIyfQ.SflKxwRJSMeKKF2QT4fwpMeJf36POk6yJV_adQssw5c; lang=en%2DUS; cart=12

There's a session id in there, a JWT, a URL-encoded locale, a literal
counter, and the word "dark". To make sense of it I have to mentally split
the string on ;, eyeball each =, paste the JWT into jwt.io, paste the
locale into a URL decoder, and ignore the ones that look opaque. The
fifteenth time I did that in a week I wrote cookie-decode — a CLI that
does exactly that dance, in the terminal, with zero runtime dependencies.

This post is a walkthrough of what I learned writing it: RFC 6265 is kinder
than I remembered, JWT detection can be done by shape alone in about ten
lines, the "is this base64?" problem is mostly about ruling things out, and
the single biggest source of cookie confusion I see on StackOverflow is
people mixing up Cookie: with Set-Cookie:.

The problem

If you've ever debugged an authenticated request, you've had to answer some
version of what is in the cookie jar, actually? The jar can contain
anything — the spec is deliberately opaque — and apps often stuff four
different kinds of things in it:

Session ids: opaque hex or UUID-looking strings, typically 32–64 chars.
JWTs: three base64url segments joined by dots, often the actual access token for SPAs that gave up on Authorization: headers.
Base64 blobs: signed or encrypted session state from Rails, Django, express-session, or the dozens of frameworks that roll their own.
Tiny structured values: counters, locale codes, theme names, feature-flag bitmaps. These are the ones most likely to be URL-encoded.

When something is wrong — a CSRF failure, a SameSite issue, a JWT that
refuses to validate — you need to see all of it at once, not pick one cookie
to investigate at a time. That's the gap cookie-decode fills.

`Cookie:` is not `Set-Cookie:` and this is the #1 thing people get wrong

Before I wrote any parsing code I re-read RFC 6265 §4.2, because I kept
tripping over the same confusion: every "cookie parser" on npm has slightly
different opinions about attributes like Path, Domain, HttpOnly,
Secure, SameSite. And that makes sense — because those attributes don't
exist in the Cookie: header at all. They exist only in Set-Cookie:.

Set-Cookie: is a response header. The server sends it to tell the browser "here is a new cookie, and by the way it expires at 3pm and it's only valid on /api." It has metadata attributes.
Cookie: is a request header. The browser sends it back to the server on subsequent requests, and it contains only name=value pairs, separated by ;. That's it. There are no attributes, no metadata, no expiry. The browser already applied all the rules.

This turns the "parse a cookie header" problem into basically string
splitting. The whole parser for Cookie: is 60 lines, including empty-
value handling and tolerating the quirks people paste in. The parser for
Set-Cookie: is a whole other animal — that's why packages like
set-cookie-parser exist — but I don't need any of that here.

The parser

Here's what RFC 6265 §4.2.1 actually says:

cookie-header = "Cookie:" OWS cookie-string OWS
cookie-string = cookie-pair *( ";" SP cookie-pair )
cookie-pair   = cookie-name "=" cookie-value

In practice, browsers are lenient, so a parser that wants to eat whatever
humans paste at it has to be even more lenient. Here's the whole thing
(lightly trimmed for the post):

export interface Cookie { name: string; value: string; size: number; }

export function parseCookieHeader(header: string): ParseResult {
  const cookies: Cookie[] = [];
  const malformed: string[] = [];

  let input = header.trim();
  // devtools copies the literal "Cookie: " prefix — eat it.
  if (/^cookie\s*:/i.test(input)) {
    input = input.replace(/^cookie\s*:\s*/i, '');
  }
  if (input === '') return { cookies, malformed };

  for (const rawPart of input.split(';')) {
    const part = rawPart.trim();
    if (part === '') continue; // tolerate ";;" and trailing ";"

    const eq = part.indexOf('=');
    if (eq < 0) { malformed.push(part); continue; }

    const name = part.slice(0, eq).trim();
    if (name === '') { malformed.push(part); continue; }

    const rawValue = part.slice(eq + 1).trim();
    const value = unquote(rawValue); // strip matched DQUOTEs
    cookies.push({ name, value, size: Buffer.byteLength(value, 'utf8') });
  }
  return { cookies, malformed };
}

Things worth noting:

Split on ;, not on ;. The spec says "; " but real servers send ; without the space, especially when they concatenate cookies themselves. I split on ; and then trim each piece.
indexOf('='), not split('='). A cookie value is legally allowed to contain =. This usually happens with base64 padding. If you split on = you'll throw away a character per padding position, which is a fun bug to debug later.
Unquote paired "...", because quoted values are legal and values with spaces or commas in them will show up quoted.
Size is byte length, not char length. A cookie full of kanji gets much bigger than it looks, and cookie size matters because the 4 KB per-domain limit is the first thing you run into in production.

Malformed pairs (bare tokens, empty names) are preserved in a separate list
so I can warn about them without dropping them silently.

JWT detection by shape

A JWT is three base64url segments joined by dots: header.payload.signature.
The first two decode to JSON; the third is the signature, which is just
bytes. So detection is a regex plus one JSON parse:

const JWT_RE = /^[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+\.[A-Za-z0-9_-]*$/;

export function detectJwt(value: string) {
  if (!JWT_RE.test(value)) return null;
  const [h, p] = value.split('.');
  try {
    const header = JSON.parse(base64urlDecode(h).toString('utf8'));
    if (typeof header !== 'object' || !('alg' in header)) return null;
    const alg = String(header.alg);

    let payloadText: string;
    try {
      payloadText = JSON.stringify(JSON.parse(base64urlDecode(p).toString('utf8')));
    } catch {
      payloadText = base64urlDecode(p).toString('utf8');
    }
    return { alg, payload: payloadText };
  } catch {
    return null;
  }
}

There are two subtleties:

a.b.c without an alg isn't a JWT. A lot of random values match the dot-segmented shape — I've seen tracking ids and GA cookies that look like it. Requiring the first segment to decode to a JSON object with an alg field is a cheap and reliable filter.
Base64url, not base64. RFC 7515 §2 Appendix C defines a URL-safe variant: + → -, / → _, no = padding. Node's Buffer doesn't understand it natively, so I wrote a ~15-line base64urlDecode that re-adds padding and swaps the characters before calling Buffer.from(.., 'base64'). It's the same trick jwt-inspect uses; I lifted it directly.

This detector does not verify signatures. That's
jwt-inspect's job, and keeping
cookie-decode signature-ignorant is what lets it stay zero-dependency.

Base64 detection is mostly about ruling things out

This one gave me the most grief. "Does this string look like base64?" has
no precise answer, because the base64 alphabet is almost a superset of
English lowercase letters and digits. You have to add heuristics until the
false-positive rate drops to something tolerable. Mine:

Minimum length 8 (shorter strings match by accident constantly).
Alphabet: either standard base64 [A-Za-z0-9+/=] or base64url [A-Za-z0-9_-].
Require at least one digit or mixed case in the input. This rule alone rules out "password", "sessionid", "darktheme", and other lowercase English words that are technically valid base64.
For standard base64, length must be a multiple of 4, and padding must be ≤ 2 characters. For base64url, len % 4 == 1 is impossible.
Round-trip check: after decoding, re-encode and compare to the original (modulo padding). Real base64 round-trips exactly; random strings that happen to pass all the previous checks usually don't.

Even with all that, I still see occasional false positives on things like
high-entropy session ids, so the dispatcher checks session-like (UUID or
long hex) before base64 — the more specific label wins.

The decoded preview is also a heuristic. If the decoded bytes are mostly
printable ASCII (> 85%), I show them as text; otherwise I show a hex dump
truncated to 80 characters. Binary cookie values happen — protobufs,
compressed state, MsgPack — and showing 4 KB of raw bytes to the terminal is
not helpful.

The JSON formatter

The --format json output exists so you can pipe cookie-decode into
jq, which is what you actually want when you're scripting anything.
Nothing clever — just shape the output and let JSON.stringify handle it:

export function formatJson(items: FormattedCookie[], opts: FormatOptions) {
  const out = items.map(({ cookie, detection }) => ({
    name: cookie.name,
    value: cookie.value,
    size: cookie.size,
    hints: detection.hints,
    decoded: opts.decode ? detection.decoded : null,
  }));
  return JSON.stringify(out, null, 2) + '\n';
}

cookie-decode "session=abc; token=eyJhbGciOiJIUzI1NiJ9.eyJzdWIiOiJ4In0.sig" --format json \
  | jq '.[] | select(.hints[0] == "jwt") | .decoded'

That's the line I actually use in my own bash history. Pull out just the
JWTs from whatever the browser just sent.

Tradeoffs I picked on purpose

No signature verification. Use jwt-inspect (entry #138) for that. Keeping this tool pure-decoding means zero dependencies and no secret material lying around in shell history.
No Set-Cookie: support. If you paste a Set-Cookie into cookie-decode, it'll do its best, but it won't understand attributes like SameSite=Lax. That's a different tool. The README is loud about this because it's the question I expect to get.
Base64 heuristic false positives. A high-entropy alnum string that happens to have mixed case, a digit, and the right length can be labelled [base64] even if it isn't. I've accepted this — the decoded preview makes it obvious when it's wrong, and the alternative is labelling real base64 cookies as [plain].
No encryption support. Rails signed cookies, encrypted Django sessions, Laravel cookies — these are base64-then-ciphertext. I show the base64 decode, which is unhelpful bytes. That's by design: I'm not going to start shipping framework-specific secret handlers.
No cookie-attribute awareness. See above.

Try it in 30 seconds

docker build -t cookie-decode https://github.com/sen-ltd/cookie-decode.git

docker run --rm cookie-decode \
  "session=9f8d2a1b; theme=dark; token=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJzdWIiOiIxIn0.sig; lang=en%2DUS"

You should see each cookie tagged with its detected hint, the JWT rendered
with its decoded header, and the URL-encoded lang shown as en-US.

The whole thing is ~500 lines of strict TypeScript with 63 vitest cases,
runs as a non-root user in a 136 MB Alpine image, and has zero runtime
dependencies. It's one of a running series of zero-dep TS CLIs I'm writing
for the SEN LLC portfolio, next to jwt-inspect, hexview, and
webhook-signer. Each one solves a problem I hit repeatedly in debugging
and is small enough to read end-to-end in an afternoon.

Happy cookie hunting.

DEV Community

HTTP Cookies in the Wild Are a Mess — I Built a CLI That Eats Them

HTTP Cookies in the Wild Are a Mess — I Built a CLI That Eats Them

The problem

`Cookie:` is not `Set-Cookie:` and this is the #1 thing people get wrong

The parser

JWT detection by shape

Base64 detection is mostly about ruling things out

The JSON formatter

Tradeoffs I picked on purpose

Try it in 30 seconds

Top comments (0)

HTTP Cookies in the Wild Are a Mess — I Built a CLI That Eats Them

The problem

Cookie: is not Set-Cookie: and this is the #1 thing people get wrong

The parser

JWT detection by shape

Base64 detection is mostly about ruling things out

The JSON formatter

Tradeoffs I picked on purpose

Try it in 30 seconds

`Cookie:` is not `Set-Cookie:` and this is the #1 thing people get wrong