Visagan S

Posted on Mar 26

I Built a GlassWorm Detector — Here's How Invisible Unicode Attacks Actually Work

#security #javascript #vscode #opensource

Last week, I opened a VS Code extension file that looked perfectly normal. Five lines of clean JavaScript. A standard import, an activate function, a console.log. Nothing suspicious.

Except line 2, an apparently empty line, was carrying 246 bytes of invisible characters that encoded a hidden payload.

Not obfuscated. Not minified. Not buried in a dependency. Literally invisible. The characters were in the file, taking up space on disk, but my editor rendered them as nothing. A blank line. Empty air.

That's GlassWorm — the first self-propagating worm to use invisible Unicode characters to hide malware in VS Code extensions. It has infected 35,800+ machines across 5 waves since October 2025, compromised 151+ GitHub repositories, and as of March 2026, it's still spreading.

I spent the past week reverse-engineering the encoding technique, building detection tools, and creating an interactive educational demo. Everything is open-sourced. This article walks through what I found.

The trick in 60 seconds

Every character you type has a number — a Unicode code point. The letter A is U+0041. A space is U+0020. Your editor reads these numbers and draws something on screen.

But Unicode also has characters that are assigned numbers but draw nothing. They exist in the file. They take up bytes on disk. But when your editor tries to render them, it produces zero pixels. No dot, no space, no whitespace indicator — absolutely nothing.

GlassWorm abuses 16 of these characters, called Variation Selectors (U+FE00 through U+FE0F), to encode entire JavaScript payloads that slip past human code review and most diff tools and linters, because there is nothing visible to flag.
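To make the "has a number but draws nothing" idea concrete, here's a minimal sketch in JavaScript. The variable names are mine, picked only for illustration, and how the second string renders depends on your editor and font (it typically looks identical to the first).

```javascript
// A visible letter, and the same letter followed by one variation selector (U+FE01).
const visible = "A";
const withSelector = "A" + String.fromCodePoint(0xfe01);

// The two strings usually look the same on screen, but the data differs.
console.log(visible.length);            // 1
console.log(withSelector.length);       // 2
console.log(visible === withSelector);  // false

// Listing the code points reveals the hidden character.
const codePoints = [...withSelector].map(
  (ch) => "U+" + ch.codePointAt(0).toString(16).toUpperCase().padStart(4, "0")
);
console.log(codePoints.join(" "));      // U+0041 U+FE01
```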

How the encoding actually works

I'm going to walk through this with a real example. Our payload is a harmless console.log, but the technique is identical to what GlassWorm uses to steal your GitHub tokens.

Step 1: Every character is a byte

'c' = 0x63    'o' = 0x6F    'n' = 0x6E    's' = 0x73
'o' = 0x6F    'l' = 0x6C    'e' = 0x65    '.' = 0x2E

Nothing surprising. Each ASCII character maps to a single byte value (non-ASCII text would first be encoded into several bytes, for example as UTF-8).

Step 2: Split each byte into two nibbles

A nibble is 4 bits — half a byte. It can hold values 0 through 15 (0x0 to 0xF). Why does this matter? Because the next step only gives us 16 invisible characters to work with — exactly the range a nibble covers.

'c' = 0x63 = 01100011
                │         │
        ┌───────┘         └───────┐
   High nibble: 6            Low nibble: 3
   (byte >> 4)               (byte & 0x0F)

Two simple bitwise operations: shift right 4 for the high half, AND with 0x0F for the low half.
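Here's a small sketch of steps 1 and 2 for the letter 'c'. The helper name splitNibbles is my own, not something taken from GlassWorm or the toolkit.

```javascript
// Split one byte (0-255) into its high and low 4-bit halves.
function splitNibbles(byte) {
  return { high: byte >> 4, low: byte & 0x0f };
}

const byte = "c".charCodeAt(0);                  // 0x63
const { high, low } = splitNibbles(byte);

console.log(byte.toString(2).padStart(8, "0"));  // 01100011
console.log(high, low);                          // 6 3

// Reassembling the two halves gives the original byte back.
console.log(((high << 4) | low) === byte);       // true
```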

Step 3: Add 0xFE00 to each nibble

This is the core trick. Nibble 6 becomes 6 + 0xFE00 = U+FE06. Nibble 3 becomes 3 + 0xFE00 = U+FE03.

U+FE06 and U+FE03 are Variation Selector characters. They are real Unicode characters: they exist in the file and consume 3 bytes each on disk (UTF-8 encoding: EF B8 86 and EF B8 83), but most editors, terminals, and diff tools render them as absolutely nothing.

So the letter 'c' has become two invisible characters. Do this for every byte in your payload, and the entire thing vanishes.
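Steps 1 through 3 fit in a few lines. This is a sketch of the technique as described above, not code lifted from GlassWorm. The function name encodeInvisible is mine, and encoding the text to UTF-8 bytes first is an assumption I made so that non-ASCII input also works.

```javascript
// Encode a string as invisible variation selectors (U+FE00..U+FE0F).
// Assumption: the text is turned into UTF-8 bytes first (my choice for this sketch).
function encodeInvisible(text) {
  const bytes = new TextEncoder().encode(text);
  let out = "";
  for (const byte of bytes) {
    out += String.fromCodePoint(0xfe00 + (byte >> 4));   // high nibble
    out += String.fromCodePoint(0xfe00 + (byte & 0x0f)); // low nibble
  }
  return out;
}

const hidden = encodeInvisible("c");
console.log(hidden.length); // 2
console.log([...hidden].map((ch) => ch.codePointAt(0).toString(16))); // [ 'fe06', 'fe03' ]
```

Every input byte becomes exactly two output characters, so the encoded string is always twice as long as the payload in bytes.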

Step 4: Inject into the file

The invisible payload gets placed on what appears to be a blank line inside a normal-looking VS Code extension:

What VS Code shows you:          What's actually in the file:
─────────────────────────         ────────────────────────────
1 │ import * as vscode ...        1 │ import * as vscode ...
2 │                               2 │ ⚠ 82 INVISIBLE CHARACTERS
3 │ export function activate()    3 │ export function activate()
4 │   console.log('activated')    4 │   console.log('activated')
5 │ }                             5 │ }

Line 2 looks empty. It's not. It's carrying your entire malicious payload — 82 invisible characters encoding 41 bytes of executable JavaScript.
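The numbers are easy to check: 41 payload bytes become 82 characters, and at 3 bytes each in UTF-8 that is 246 bytes on disk. In the sketch below the payload is arbitrary filler that I chose only because it is 41 bytes long.

```javascript
// Build a stand-in "blank" line: 41 payload bytes -> 82 variation selectors.
// The byte value is filler; only the length matters here.
const payloadBytes = new Uint8Array(41).fill(0x63);
let blankLine = "";
for (const b of payloadBytes) {
  blankLine += String.fromCodePoint(0xfe00 + (b >> 4), 0xfe00 + (b & 0x0f));
}

console.log(blankLine.length);                            // 82 characters
console.log(new TextEncoder().encode(blankLine).length);  // 246 bytes as UTF-8

// Variation selectors are not whitespace, so trim() leaves them in place.
console.log(blankLine.trim().length);                     // 82
```

That last line is a useful hint for detection: a line that looks empty but has a non-zero length after trim() deserves a closer look.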

Step 5: The decoder brings it back

A small JavaScript function — visible in the source but innocuous-looking — reads each pair of invisible characters and reverses the encoding:

function decode(s) {
  let r = [];
  for (let i = 0; i < s.length; i += 2) {
    const high = s.codePointAt(i) - 0xFE00;     // invisible → nibble
    const low  = s.codePointAt(i + 1) - 0xFE00;
    r.push((high << 4) | low);                   // two nibbles → byte
  }
  return String.fromCharCode(...r);
}

// In real GlassWorm:
eval(decode(invisibleString));  // 💥 the hidden code executes

Nine lines. That's all it takes. Subtract 0xFE00 to get each nibble back, shift-and-OR to reassemble the byte, convert the bytes to characters, and eval() the result. Game over.
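If you want to inspect a suspicious line yourself, decode it without ever calling eval(). The sketch below is my own variant of the decoder: the name decodeInvisible, the filtering of non-selector characters, and the UTF-8 decoding step are all assumptions I added for safe inspection.

```javascript
// Decode variation selectors back to text WITHOUT executing the result.
function decodeInvisible(s) {
  // Keep only characters in U+FE00..U+FE0F and convert each to its nibble value.
  const nibbles = [...s]
    .map((ch) => ch.codePointAt(0) - 0xfe00)
    .filter((n) => n >= 0 && n <= 15);

  const bytes = [];
  for (let i = 0; i + 1 < nibbles.length; i += 2) {
    bytes.push((nibbles[i] << 4) | nibbles[i + 1]); // two nibbles -> one byte
  }
  return new TextDecoder().decode(Uint8Array.from(bytes));
}

// Hand-built input: the selector pairs for 'h' (0x68) and 'i' (0x69).
const sample = "\uFE06\uFE08\uFE06\uFE09";
console.log(decodeInvisible(sample)); // hi
```

Because visible characters are filtered out first, you can paste in a whole source line and still recover whatever is hidden in it.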

What GlassWorm actually does with this

The invisible payload isn't a console.log. It's a multi-stage RAT called the ZOMBI module that:

Steals credentials — NPM tokens, GitHub PATs, OpenVSX tokens, Git credentials, SSH keys
Drains crypto wallets — targets 49 different wallet extensions
Installs SOCKS proxies — turns your machine into a relay for criminal traffic
Deploys hidden VNC servers — full remote access to your workstation
Self-propagates — uses stolen tokens to compromise more extensions and packages, creating exponential growth

The C2 infrastructure is equally sophisticated: Solana blockchain transaction memos (can't be taken down), Google Calendar as a backup channel (bypasses network monitoring), and BitTorrent DHT for decentralized server discovery.

And it avoids infecting machines with Russian locale. Make of that what you will.

What I built

I wanted to understand this deeply enough to detect it, so I built a complete toolkit:

1. Detection scanner

A zero-dependency Python scanner that runs 8 detection rules:

# Scan your VS Code extensions right now
python3 glassworm_scanner.py ~/.vscode/extensions/ --verbose

# JSON output for CI/CD
python3 glassworm_scanner.py ./src --json

Among other things, it catches:

Variation selector clusters (3+ = suspicious, 10+ in JS = critical)
Decoder signatures near invisible characters
eval() + string construction patterns
Solana/blockchain C2 indicators in non-blockchain code
Suspicious postinstall lifecycle scripts
Mid-file BOM injection markers

Exit code 0 = clean, 1 = compromised. Drop it into any pipeline.
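To show the idea behind the first rule, here's a rough JavaScript sketch of a variation-selector cluster check. It is not the toolkit's scanner (that one is Python). The function scanText and the shape of its findings are hypothetical, and unlike the rule above this sketch ignores the file type. Only the 3+ and 10+ thresholds come from the list.

```javascript
// Hypothetical helper: flag runs of variation selectors (U+FE00..U+FE0F) in a text.
function scanText(text) {
  const findings = [];
  const run = /[\uFE00-\uFE0F]+/g;
  text.split("\n").forEach((line, index) => {
    for (const match of line.matchAll(run)) {
      const count = match[0].length;
      if (count < 3) continue; // a lone selector is normal, e.g. U+FE0F after an emoji
      findings.push({
        line: index + 1,
        column: match.index + 1,
        count,
        severity: count >= 10 ? "critical" : "suspicious",
      });
    }
  });
  return findings;
}

// Demo input: three lines, the middle one holding 12 invisible characters.
const sample = [
  "import * as vscode from 'vscode';",
  "\uFE06\uFE03".repeat(6),
  "export function activate() {}",
].join("\n");

console.log(scanText(sample));
```

For the demo input this reports a single critical finding on line 2 with a count of 12. A CI wrapper could exit with a non-zero code whenever the returned array is non-empty.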

2. Git pre-commit hook

cp pre-commit-hook.sh .git/hooks/pre-commit
chmod +x .git/hooks/pre-commit

Now any commit containing invisible Unicode payloads gets blocked before it hits your repo.

3. Interactive web explainer

This is the part I'm most proud of. A full interactive walkthrough where you can:

Type any string and watch it encode into invisible characters in real time
See the raw hex bytes on disk
Decode it back to prove the round-trip
Step through the nibble math character by character

Try it live: visagansp.github.io/glassworm-toolkit

4. Educational CLI demo

A 10-step terminal walkthrough that generates a safe "infected" file and scans it:

python3 glassworm_educational_demo.py

It creates a real file with invisible characters embedded in it, then runs the scanner to prove detection works. All payloads are harmless console.log statements.

Everything is on GitHub: github.com/visagansp/glassworm-toolkit

How to check if you're already infected

This takes 30 seconds:

# Clone the toolkit
git clone https://github.com/visagansp/glassworm-toolkit.git
cd glassworm-toolkit

# Scan your VS Code extensions
python3 glassworm_scanner.py ~/.vscode/extensions/ --extended

# Scan your node_modules
python3 glassworm_scanner.py /path/to/project/node_modules/

If you see CRITICAL findings, assume compromise. Rotate all tokens — NPM, GitHub, OpenVSX, SSH keys, cloud API keys — immediately. Check for rogue processes:

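# Look for SOCKS proxies and VNC servers GlassWorm installs
# (Linux; run with sudo to see process names for sockets you don't own)
netstat -tlnp | grep -E ':(1080|5900|5901|4444|8888)'
# The bracketed first letters stop grep from matching its own process
ps aux | grep -E 'node.*[s]ocks|[v]nc|[p]roxy'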

Prevention checklist

If you take away only a few things from this article, make it these:

1. Disable VS Code extension auto-update. GlassWorm's Wave 1 got onto 35,800 machines because extensions auto-updated to malicious versions. Go to Settings → extensions.autoUpdate → set to false.

2. Install the pre-commit hook. Two commands per repository (cp and chmod), or point git's core.hooksPath setting at a shared hooks directory to cover every repo you work in. Zero false positives on clean code in my testing.

3. Scan before you install. Before adding any extension or npm package, run the scanner on it. It takes seconds and needs zero dependencies.

4. Maintain an extension allowlist. If your team has more than a few developers, centralize which extensions are approved. Shadow extensions are how GlassWorm gets in.

5. Integrate into CI/CD. The scanner outputs JSON and uses exit codes. Plug it into GitHub Actions or GitLab CI in 5 lines of YAML.

What's next

GlassWorm is just one technique. The supply chain attack landscape is evolving fast. I'm working on a follow-up article covering the broader 2025–2026 attack timeline — from s1ngularity stealing AI assistant credentials, to Shai-Hulud's self-destructing npm worm, to the Trivy extension attack that weaponized AI coding assistants themselves through prompt injection. The common thread: developer environments are the new front line, and the attacks are getting creative in ways traditional security tooling wasn't built to handle.

If you found this useful, star the repo and share it with your team. The toolkit is MIT licensed — use it, fork it, improve it.

And go scan your extensions folder. Right now. You might be surprised what's hiding in the blank lines.

Toolkit: github.com/visagansp/glassworm-toolkit
Live demo: visagansp.github.io/glassworm-toolkit

I'm Visagan S — security engineer, pentester, and AI engineer. I build tools that help developers understand and defend against the attacks targeting their workflows. Connect with me on GitHub.

DEV Community