DEV Community: OptiRefine

How AuraWatch Finds Security Bugs Your Linter Will Never Catch

OptiRefine — Tue, 05 May 2026 20:14:10 +0000

How AuraWatch Finds Security Bugs Your Linter Will Never Catch

Most security tools for developers fall into one of two camps: static linters that pattern-match on surface-level code, or AI-powered scanners that guess based on training data. Both have the same fundamental problem — they can't reason about your code. They can tell you that eval() exists, but they can't tell you whether user input actually reaches it.

AuraWatch is built differently. It uses a six-engine deterministic analysis pipeline that constructs a full semantic model of your code and reasons about it mathematically. No guessing. No false positives from pattern matching. Provable results.

AuraWatch launches on the VS Code Marketplace on May 15th, 2026.

Here's how it works under the hood.

The Problem With Existing Tools

Consider this code:

app.post('/search', async (req, res) => {
    const { term } = req.body;
    const results  = await db.query(`SELECT * FROM products WHERE name = '${term}'`);
    res.json(results);
});

A regex-based linter might flag the template literal inside query() if it's lucky. But what about this:

app.post('/search', async (req, res) => {
    const term    = req.body.term;
    const cleaned = term.trim();
    const results = await db.query(`SELECT * FROM products WHERE name = '${cleaned}'`);
    res.json(results);
});

The data moved through an intermediate variable. Most linters miss this entirely because they're not tracking data flow — they're just looking for patterns. AuraWatch catches both because it builds a complete picture of how data moves through your code.

The Six-Engine Pipeline

Every scan runs six independent engines in parallel. Each engine looks at the code from a different angle. Their results are merged and deduplicated before being returned.

Engine 1 — LST Analyzer (Lossless Semantic Tree)

The foundation of AuraWatch is a Lossless Semantic Tree, built using ts-morph for JavaScript/TypeScript and LibCST for Python. Unlike a standard AST, an LST preserves every token in the original source including whitespace, comments, and formatting. This matters because transformers need to reconstruct the original code faithfully when applying fixes.

The LST engine runs nine rule-based analyzers over the tree:

eval() calls
dangerouslySetInnerHTML without DOMPurify
Hardcoded credentials in variable declarations
Hardcoded credentials in object literal properties (catches { password: 'secret' } inside database config objects — a pattern most tools miss)
SQL injection via template literals and string concatenation
child_process.exec() with dynamic arguments
Math.random() used to generate security-sensitive values
Unguarded database connection calls (null dereference)
Deep property chains without optional chaining

Each analyzer returns a structured vulnerability object with a unique ID, CWE classification, line number, severity, and a flag indicating whether a deterministic auto-fix is available.

Engine 2 — Type-Aware Taint Analysis

This is where AuraWatch goes beyond linting. The taint engine performs inter-procedural data flow analysis to track user-controlled data from its source to a dangerous sink.

Sources are any access to HTTP request data:

req.body.username
req.query.search
req.params.id
req.headers['x-custom']

Sinks are functions where tainted data causes harm:

eval()          // CWE-95: Code Injection
exec()          // CWE-78: OS Command Injection
db.query()      // CWE-89: SQL Injection
fs.readFile()   // CWE-22: Path Traversal
res.redirect()  // CWE-601: Open Redirect
innerHTML       // CWE-79: XSS

The engine works in three passes:

Source collection — walks all variable declarations and assignments to identify variables initialized from req.* values, including destructured patterns like const { username, password } = req.body
Taint propagation — marks any variable that flows from a tainted source as tainted itself
Sink matching — checks whether any argument to a known sink contains a tainted variable

When a tainted variable reaches a sink, AuraWatch reports which specific variables carried the taint and exactly where the data entered the program.

Engine 3 — JS-SMT Symbolic Constraint Solver

SMT (Satisfiability Modulo Theories) solving is traditionally the domain of formal verification — not developer tooling. AuraWatch brings a lightweight SMT constraint solver directly into the VS Code extension pipeline.

The engine maintains a range map for every integer variable it encounters, tracking the possible values each variable can take given the conditional guards in scope:

// After seeing: if (count > 0)
// The engine knows: count.min = 1

// After seeing: if (count < 10)
// The engine knows: count.max = 9

// Combined: count ∈ [1, 9]

With this range information, the engine can prove:

Division by zero:

function paginate(items, pageSize) {
    return items.length / pageSize; // SMT proves pageSize can be 0
}

Contradictory conditions (dead code):

if (score > 100) { ... }
if (score < 50)  { ... } // SMT proves these are mutually exclusive

Always-true auth bypasses:

if (isAdmin || true) { // SMT proves this always executes
    deleteAllUsers();
}

For Python, AuraWatch uses the Z3 SMT solver directly via its Python bindings, enabling full logical formula construction and satisfiability checking. The JavaScript engine uses a custom lightweight interval arithmetic solver that runs in-process without any native dependencies.

Engine 4 — Async Flow and Prototype Pollution Tracker

This engine handles two distinct but related categories of JavaScript-specific vulnerabilities.

Prototype pollution occurs when user-controlled data is used as an object key in an assignment:

// User sends: { "__proto__": { "isAdmin": true } }
function merge(target, source) {
    for (const key in source) {
        target[key] = source[key]; // key = "__proto__"
    }
}

If key equals __proto__, constructor, or prototype, this mutates the base Object prototype — affecting every object in the process. AuraWatch detects computed property assignments where the key is user-controlled, direct __proto__ mutations, and Object.assign() calls with untrusted sources.

Unhandled promise rejection crashes Node.js processes in production (since Node 15). The engine identifies promise chains that use .then() without .catch() at the expression statement level — a common oversight when converting callback-based code to async.

Engine 5 — Hallucination Defense

This engine is specifically designed for the AI-generated code era. Large language models frequently hallucinate npm package names — inventing packages that don't exist on the npm registry. When a developer installs an AI-suggested package and the real package doesn't exist, a malicious actor can publish a package with that exact name. This is a supply chain attack vector that barely existed five years ago.

AuraWatch maintains a curated registry of ~200 well-known legitimate packages. For every require() call and import statement it finds, it checks:

Is the package in the known-good registry?
Does the package name match a known typosquatting pattern?
If a package.json is provided in the scan request, is the package listed as a dependency?

Typosquatting patterns are matched against a curated list of confusable names:

lodas  → lodash
expres → express
axioss → axios
helmt  → helmet

A package flagged as potentially hallucinated gets a CRITICAL severity finding — because installing a non-existent package that gets claimed by an attacker is code execution on install via the postinstall lifecycle hook.

Engine 6 — Framework-Specific Sink Maps

Generic rules miss framework-specific vulnerabilities. This engine knows the dangerous patterns specific to Express, Fastify, Next.js, and React.

Express:

res.render(userInput) — Server-Side Template Injection when the template name is user-controlled
res.redirect(userInput) — Open Redirect when the destination URL comes from user input
res.send(\${userInput}) — Reflected XSS when user data appears in an HTML response

Next.js:

context.query values spread directly into page props in getServerSideProps without validation

Fastify:

reply.send(request.body) — echoing raw request body in responses that may be rendered as HTML

The framework is auto-detected from import statements — no configuration required.

Deterministic Auto-Fixing

When AuraWatch finds a vulnerability with a known safe transformation, it can fix it automatically using the same LST infrastructure it uses for analysis.

Fixers work as CST Transformers — they traverse the syntax tree and replace unsafe nodes with safe equivalents while preserving all surrounding formatting and comments:

Vulnerability	Auto-Fix
Hardcoded secret	Replaced with `process.env.VARIABLE_NAME \
{% raw %}`Math.random()` in security context	Replaced with `crypto.randomBytes(4).readUInt32BE(0)`
`dangerouslySetInnerHTML`	Wrapped with `DOMPurify.sanitize()`, import added
`eval(x)` (Python)	Replaced with `ast.literal_eval(x)`
`pickle.loads(x)` (Python)	Replaced with `json.loads(x)`
`yaml.load(x)` (Python)	Replaced with `yaml.safe_load(x)`
`subprocess.run(shell=True)` (Python)	`shell=True` argument removed
Unguarded async result	Property accesses converted to optional chaining (`?.`)

Fixes are presented as a unified diff in VS Code's native diff viewer before anything is written to disk. The user reviews the patch and explicitly accepts or rejects it — AuraWatch never modifies your files without confirmation.

Architecture: Two Engines, One Interface

AuraWatch runs as two separate Cloud Run services behind a single VS Code extension:

Python Engine (LibCST + Z3)

Handles .py files
Uses LibCST for lossless CST analysis
Uses Z3 for formal SMT solving
10 LST analyzers, 6 auto-fixers

JS/TS Engine (ts-morph + custom SMT)

Handles .js, .ts, .jsx, .tsx files
Uses ts-morph for full TypeScript-aware AST analysis
Custom interval arithmetic SMT solver
6 base analyzers + 4 advanced engines

The VS Code extension automatically routes each scan to the correct engine based on the active file's language. The API key, billing, and auth system are unified — one account works across both engines.

Authentication uses magic links — no passwords. When a user signs in, a time-limited token is written to Firestore, emailed to the user, and validated server-side when clicked. The resulting API key is stored in VS Code's native SecretStorage API (the OS keychain) — never in plaintext settings files.

What's Next

The roadmap includes:

GitHub Actions integration — run the full analysis pipeline as a CI check on every pull request
Inline diagnostics — surface findings as VS Code squiggles directly in the editor without opening the chat panel
Java and Go support — extending the LST engine to additional languages
Cross-function taint tracking — following data flow across function boundaries and module imports

Launch

AuraWatch launches on the VS Code Marketplace on May 15th, 2026. Follow along on Twitter/X and LinkedIn to get notified on launch day. If you work in security or DevSecOps and want to give early feedback, reach out — every piece of feedback shapes what gets built next.

The Code Review Comment I Left 47 Times

OptiRefine — Tue, 28 Apr 2026 00:33:22 +0000

I counted once. Not proud of it.

Forty-seven times across different PRs, different engineers, different companies — I left some version of the same comment. It usually looked like this:

"Hey — this nested loop is going to hurt you at scale. You're doing O(n²) work here. Can you use a set for the inner lookup instead?"

Sometimes the author got it immediately and fixed it in ten minutes. Sometimes I'd spend forty-five minutes in a thread explaining what O(n²) actually means in practice, with benchmarks, with examples, with a rewrite pasted directly into the comment. Sometimes the fix went in. Sometimes a slightly different version of the same pattern showed up in the next PR from the same person.

And every single time, I thought the same thing: this should not require a human.

The Thing Nobody Talks About in Code Review

Code review has this mythology around it. We talk about it like it's this high-value, high-skill practice where senior engineers pass down wisdom to junior ones. And sometimes it is. The architectural discussions, the "have you considered what happens when this service goes down" conversations, the stuff that requires genuine experience and judgment — that's worth a human's time.

But a massive chunk of code review isn't that. It's pattern matching. It's the same ten categories of problem appearing in slightly different costumes, over and over, in every codebase, at every company. Nested loops. Hardcoded credentials. Unused variables. Functions so tangled with conditionals that you need a flowchart to test them. Files left open without context managers. List comprehensions written as manual loops for no reason.

These aren't judgment calls. They have objectively correct answers. And yet we route them through humans anyway, which means they get caught inconsistently, they get caught late, and they generate the kind of review comments that quietly frustrate junior engineers who feel like they're failing when really they just weren't told the rules.

I got tired of being part of that system. So I built something to replace the parts of it that shouldn't need me.

What I Actually Built

OptiScan is a static analysis engine. You paste Python code in. It parses the code into a concrete syntax tree — not a string, not a token stream, an actual typed tree where every node is a specific kind of thing with specific kinds of children. Then it walks that tree and runs a set of analysis passes, each of which is looking for something specific.

The output is not a suggestion. It's a rewrite.

When the engine finds a nested loop doing pair comparisons — the O(n²) pattern I've left forty-seven comments about — it doesn't tell you it's bad. It replaces it. You get back working code that does the same thing in O(n) time using a HashSet for constant-time membership checks. You can copy it directly. You can run the benchmark in the browser to see the difference in milliseconds.

That's the thing I wanted to get right. A linter that tells you your code is slow and then leaves you to figure out the fix is only half useful. The fix is the point.

Let Me Show You What I Mean

Here's the function that ships as the default example:

def find_pairs(arr, target):
    result = []
    for i in range(len(arr)):
        for j in range(i+1, len(arr)):
            if arr[i] + arr[j] == target:
                result.append((arr[i], arr[j]))
    return result

dataset = list(range(500))
pairs = find_pairs(dataset, 150)

Clean enough. Readable. Does what it says. And for a dataset of 500 elements it probably runs fine on your laptop, which is exactly why it makes it to production.

But feed it 50,000 elements and suddenly you're doing 1.25 billion comparison operations. Feed it a million elements and you've brought a service to its knees over a function that looked completely harmless in review.

The AST engine parses this, detects the outer loop, detects the inner For node as the only direct child of the outer loop body, confirms there's an append call buried inside, identifies the collection name from the range(len(arr)) pattern, and rewrites:

arr_set = set(arr)
for num in arr:
    complement = target - num
    if complement in arr_set:
        result.append((num, complement))

One pass. O(1) set lookup. The benchmark in the browser — running both versions via Pyodide, a Python interpreter compiled to WebAssembly — shows this version running between 20 and 40 times faster on the default dataset. Not because I said so. Because you can run it yourself and watch the numbers.

The Security Part That Kept Surprising People

I added a security auditor because hardcoded secrets in source code is the other comment I've left too many times. But when I started building it I ran into something that made me realise most linters are only doing half the job.

The obvious case is this:

api_key = "sk-abc123def456"

Everyone knows that's bad. Most linters catch it. But what about this:

api_key = 1223

That's a real thing people do. Numeric API keys, numeric tokens, numeric "passwords" that are just IDs they didn't want to hardcode as strings for some reason. Most static analysis tools only check for string literals in sensitive variable assignments because that's the obvious case. They're checking the wrong layer — they're looking at the value type they expect rather than the variable name pattern combined with any literal at all.

The OptiScan security auditor checks both. It matches the variable name against a list of sensitive keywords — api_key, token, secret, password, credentials, stripe_key, twilio_token, twenty-odd others — and then checks whether the assigned value is any literal type: string, integer, float, all of it. If your variable name looks like it should be a secret, it shouldn't be assigned a literal. Ever. In any form.

It also catches eval() and exec() calls, subprocess with shell=True, AWS Access Key IDs embedded in string literals via regex, and a handful of dangerous module imports. The report comes back with line numbers and specific remediation instructions, not just a flag.

Cyclomatic Complexity Is One of Those Metrics That Sounds Boring Until You See It On Your Code

I'll be honest — I almost didn't add this. Cyclomatic complexity feels like the kind of thing you put in an enterprise tool so project managers have something to put in a quarterly report.

Then I ran it on some of my own old code.

The score is simple: start at 1, add 1 for every branch point. Every if. Every for. Every while. Every except. Every and and or in a boolean expression. Functions under 5 are LOW — easy to reason about, easy to write tests for. 6–10 is MEDIUM — manageable but worth a look. 11 and above is HIGH — that function is doing too many things and the test surface is enormous.

The thing about cyclomatic complexity is that it's not really about complexity. It's about testability. A function with a score of 14 has 14 independent paths through it. That's 14 test cases minimum for full branch coverage. In practice you're probably testing 3 of them and hoping the others behave. The score makes that problem visible in a way that reading the code doesn't always do.

Dead Code and Why It Accumulates

Every codebase I've worked in for more than a year has dead code in it. Functions that were replaced but not removed. Variables assigned in an early version of a feature that got refactored away. Imports for libraries that no longer get called.

It accumulates because removing it feels risky and finding it is annoying. You're never quite sure if something is actually unused or if you just can't see where it's used. So you leave it. Then the next engineer leaves it. Then it's been there for three years and nobody knows what it does or whether removing it will break something.

The dead code detector uses libcst's scope provider — actual scope analysis, not just text search — to track which names are defined and which names are referenced in the same module. Defined but never referenced means dead. It surfaces unused functions with their names, unused variable assignments, and unreachable code after return statements. Not heuristics. Not grep. Scope analysis.

On Building This Alone

I want to say something about the solo build experience because I think it's relevant to the tool itself.

When you're building something without a team, you have to be extremely honest about where the value actually is. You can't ship six half-finished features and call it a product. You have to pick the thing that is genuinely useful and make that thing work properly.

For me that meant the analysis had to be deterministic. Not "pretty good most of the time." Deterministic. If the security auditor says there's a hardcoded secret, there's a hardcoded secret. If the complexity engine says this is O(n²), it's O(n²). If the rewriter produces code, that code has to actually run and be correct.

That constraint forced some decisions that made the engine better. Using libcst instead of a language model for analysis. Testing the transformer output before marking anything as successfully converted — the bug where my transformer was silently deleting variable declarations because it marked a conversion as successful before checking if the conversion actually worked taught me that lesson very directly.

The engine is real. It has rough edges I'm filing down. But what it does, it does correctly.

Where It Goes From Here

GitHub PR integration is the one I'm working toward most urgently. Paste a pull request URL, get back an analysis of every Python file changed in the diff. That's the version that fits into a real engineering workflow without anyone having to change how they work.

After that: more language support, team workspaces so organisations can share scan history and audit reports, and a webhook mode so you can trigger scans automatically on push and route results to wherever your team already lives.

Try It

optirefine.qzz.io

Free tier available. No credit card. Paste something you've been staring at for too long and see what comes back.

If the engine misses something it should catch, or produces a rewrite that's wrong, I want to know. The best way to improve static analysis tooling is to throw real-world code at it — the messy, inconsistent, underdocumented code that actually exists in production, not the clean examples from tutorials.

That's the code that matters. That's what it's built for.

Solo-built. Actively maintained. Feedback welcome.