freerave
Regex vs. AI: Why I Moved My Logic to the Cloud (DotEnvy v1.4)

The Problem: Regex is "Dumb"
If you’ve ever built a secret scanner or a linter, you know the struggle. You write a Regex for an API key, and suddenly, every random SHA hash or CSS hex code gets flagged as a "Critical Security Risk."

I got tired of false positives in my VS Code extension, DotEnvy. I wanted a scanner that didn't just match patterns; I wanted one that understood context.

The Solution: A Hybrid Engine (Regex + AI)
In version 1.4, I introduced a new architecture. Instead of relying solely on client-side logic, I offloaded the heavy lifting to a custom Python LLM service hosted on Railway.

The Logic Flow:

Layer 1 (Local): Fast Regex & Entropy analysis runs in VS Code. If it looks safe, we ignore it.

Layer 2 (The Filter): If the local score is ambiguous (> 0.4), we hold off on flagging the value and escalate it instead.

Layer 3 (The Brain): The extension queries my Python backend. The LLM analyzes the variable name, surrounding code, and entropy to give a final verdict.
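
To make the first two layers concrete, here is a minimal sketch of local scoring and routing. Only the 0.4 threshold comes from the flow above. The pattern, the 50/50 weighting, the 6-bit normalization, and every name (KEY_LIKE, localScore, route) are assumptions for illustration, not DotEnvy's actual code.

```typescript
// Hypothetical sketch: names, pattern, and weights are illustrative assumptions.
type Verdict = "safe" | "escalate";

// Illustrative pattern: a run of 20+ key-like characters.
const KEY_LIKE = /^[A-Za-z0-9_-]{20,}$/;

// The only value taken from the post: scores above this are "ambiguous".
const AMBIGUOUS_THRESHOLD = 0.4;

// Shannon entropy of a string, in bits per character.
function shannonEntropy(value: string): number {
  if (value.length === 0) return 0;
  const counts: Record<string, number> = {};
  for (let i = 0; i < value.length; i++) {
    const ch = value.charAt(i);
    counts[ch] = (counts[ch] || 0) + 1;
  }
  let entropy = 0;
  Object.keys(counts).forEach((ch) => {
    const p = (counts[ch] || 0) / value.length;
    entropy -= p * (Math.log(p) / Math.LN2);
  });
  return entropy;
}

// Layer 1: combine a pattern hit and normalized entropy into a 0..1 score.
// Assumed weighting: half from the regex, half from entropy capped at 6 bits/char.
function localScore(value: string): number {
  const patternScore = KEY_LIKE.test(value) ? 0.5 : 0;
  const entropyScore = Math.min(shannonEntropy(value) / 6, 1) * 0.5;
  return patternScore + entropyScore;
}

// Layer 2: anything above the threshold is handed to the backend (Layer 3).
function route(value: string): Verdict {
  return localScore(value) > AMBIGUOUS_THRESHOLD ? "escalate" : "safe";
}
```

With these assumed weights, a short low-entropy value like "true" stays local, while a long mixed-case token is escalated for the LLM to judge in context.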

The "Secret" Sauce: The 80/20 Rule
Integrating the LLM was only about 20% of the new code, but it provides 80% of the intelligence.

Frontend: TypeScript (VS Code API) handling async/debounced events.

Backend: Python service on Railway.

Result: Zero false positives. It can tell the difference between a real Stripe key and a random test string.

Watch it in Action
I ran a full suite of integration tests to prove stability. The system now validates the connection, analyzes entropy, and confirms secrets in real time.

Under the Hood
Performance: I implemented debouncing so we don't spam the server on every keystroke.

Security: The API key is injected at build time, keeping the repo clean.

Reliability: I added a circuit breaker pattern. If the server is down, it falls back to local analysis seamlessly.
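
The debouncing and circuit breaker ideas above can be sketched roughly as follows. This is a simplified illustration, not the extension's real implementation: the names (debounce, CircuitBreaker, analyzeWithFallback), the failure limit, and the cooldown are all hypothetical, and the remote call is passed in as a plain function so the sketch does not assume any particular endpoint.

```typescript
// Hypothetical sketch: names and thresholds are assumptions, not DotEnvy's code.

// Debounce: run `fn` only after `waitMs` have passed without another call.
function debounce<A extends unknown[]>(
  fn: (...args: A) => void,
  waitMs: number,
): (...args: A) => void {
  let timer: ReturnType<typeof setTimeout> | undefined;
  return (...args: A) => {
    if (timer !== undefined) clearTimeout(timer);
    timer = setTimeout(() => fn(...args), waitMs);
  };
}

// Circuit breaker: after `maxFailures` consecutive failures, skip the server
// until `cooldownMs` have elapsed.
class CircuitBreaker {
  private failures = 0;
  private openedAt: number | null = null;
  private readonly maxFailures: number;
  private readonly cooldownMs: number;
  private readonly now: () => number;

  constructor(maxFailures: number, cooldownMs: number, now: () => number = () => Date.now()) {
    this.maxFailures = maxFailures;
    this.cooldownMs = cooldownMs;
    this.now = now;
  }

  // True when the remote call may be attempted.
  canRequest(): boolean {
    if (this.openedAt === null) return true;
    // After the cooldown, let requests through again; a failure re-opens the breaker.
    return this.now() - this.openedAt >= this.cooldownMs;
  }

  recordSuccess(): void {
    this.failures = 0;
    this.openedAt = null;
  }

  recordFailure(): void {
    this.failures += 1;
    if (this.failures >= this.maxFailures) this.openedAt = this.now();
  }
}

// Try the remote verdict; fall back to the local verdict when the breaker is
// open or the request throws.
async function analyzeWithFallback(
  breaker: CircuitBreaker,
  remote: () => Promise<boolean>,
  local: () => boolean,
): Promise<boolean> {
  if (!breaker.canRequest()) return local();
  try {
    const verdict = await remote();
    breaker.recordSuccess();
    return verdict;
  } catch {
    breaker.recordFailure();
    return local();
  }
}
```

In an extension, the debounced function would typically wrap the scan triggered by document-change events, so a burst of keystrokes produces one request once typing pauses.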

Let me know what you think about hybrid AI/Local architectures in the comments!
