DEV Community

Mykola Kondratiuk

I Scanned 100 AI Codebases - Here's What I Found

I've been building VibeCheck for the past few months - it's a security scanner specifically for AI-generated code. And after scanning over a hundred real codebases that people built with Cursor, Copilot, Claude, and various other AI tools, I have thoughts.

Not the "AI is dangerous" hot take. Something more specific than that.

The pattern that kept showing up

Almost every codebase had the same category of issue. Not SQL injection or XSS or anything that would show up in a classic OWASP checklist. The dominant problem was what I started calling trust misconfigurations - places where the code just... assumed everything was fine.

Open CORS policies. Service accounts with admin permissions because that was the fastest path to getting it working. API keys hardcoded in config files that weren't in .gitignore. Input that got passed straight into shell commands with no sanitization.

None of it was malicious. The AI wasn't trying to introduce vulnerabilities. It was just optimizing for "make it work" and had zero weight on "make it survivable in production."
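To make the shell-command case concrete, here is a minimal Python sketch of the anti-pattern and the boring fix. The grep example is invented for illustration, not lifted from any scanned repo:

```python
import shlex
import subprocess

def run_grep_unsafe(pattern: str, path: str) -> None:
    # The anti-pattern: user input interpolated into a shell string.
    # A "pattern" like "x; rm -rf ~" smuggles in a second command.
    subprocess.run(f"grep {pattern} {path}", shell=True)

def run_grep_safe(pattern: str, path: str) -> subprocess.CompletedProcess:
    # Argument-list form: no shell is involved, so metacharacters in
    # the pattern reach grep as literal text instead of being executed.
    # The "--" also stops a hostile pattern from being parsed as a flag.
    return subprocess.run(["grep", "--", pattern, path],
                          capture_output=True, text=True)

def quoted(pattern: str) -> str:
    # If a shell string is truly unavoidable, quote each piece.
    return shlex.quote(pattern)
```

The same list-of-arguments idea exists in every mainstream runtime; the point is that "make it work" and "make it survivable" are one line apart.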

The thing that surprised me most

I expected the biggest problems in the actual logic - like the AI misunderstanding authentication flows or getting crypto wrong. That exists too, but it's not the main thing.

The main thing is environmental. All these tiny decisions about permissions and access and trust that a senior dev would make automatically, almost subconsciously, because they've been burned before - the AI just doesn't make those decisions. It picks the path of least resistance every time.

One project had a DB connection string with full admin creds, no connection pooling limits, and a query that accepted raw user input. Technically functional. Completely fine for local dev. The kind of thing that gets quietly exploited six months after launch.
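For anyone who has not seen that failure mode up close, here is a self-contained sketch of the raw-input query next to the parameterized version. It uses sqlite3 purely for illustration; the post does not name the project's actual driver:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO users (name) VALUES ('alice')")

def find_user_unsafe(name: str) -> list:
    # The scanned pattern: raw input spliced into SQL. Passing
    # "x' OR '1'='1" as the name returns every row in the table.
    return conn.execute(f"SELECT * FROM users WHERE name = '{name}'").fetchall()

def find_user_safe(name: str) -> list:
    # Placeholder binding: the driver treats name as data, never as SQL.
    return conn.execute("SELECT * FROM users WHERE name = ?", (name,)).fetchall()
```

Both versions are "technically functional" in dev, which is exactly why the difference never surfaces until someone goes looking.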

What actually helps

Scanning after the fact (what we do with VibeCheck) catches the obvious stuff. But the real fix is earlier in the loop.

The projects with the fewest issues were the ones where the developer was actually paying attention during generation - not just accepting output wholesale but reading it, asking "wait, why does this need admin access?" That friction, even a little bit of it, makes a big difference.

Some people are building this into their prompts - explicitly telling the AI to follow least-privilege principles, to validate all inputs, to not hardcode credentials. It works okay, but it feels like a workaround.

The better solution is probably tooling that runs in the background during vibe coding sessions and flags stuff in real time. Not a code review gate. Just... something watching.

The uncomfortable part

A lot of these codebases were shipped. Some had real users. A few were running in production environments with actual credentials and real data.

The developers weren't careless people. Most of them were genuinely excited about what they'd built - and most of what they built was genuinely cool. The security stuff just wasn't on their radar because it never came up during development. Nothing broke. Tests passed. It worked on their machine.

I keep thinking about that gap. Between "works fine in dev" and "safe to run with real users." AI coding tools are really good at closing the first gap - getting something functional fast. Nobody's really solved the second one yet.

That's the problem I'm trying to figure out. Not sure I have it yet. But the 100 codebases were pretty clarifying.


If you're using AI to build things and want to know what the scanner finds in your repo, VibeCheck is live. Free tier, no credit card. Takes about 2 minutes.

Top comments (89)

Daniel Yarmoluk

Can't this be mitigated by adding more context into model?

Mykola Kondratiuk

it helps but only partially. i tried this - added security guidelines, told it to never hardcode secrets, use parameterized queries, etc. the model follows it when the task is clearly framed that way. but when it is doing something else (like "add this feature") it just... forgets. the security context doesn't carry over into every decision. what i found works better is having a separate security review pass after generation - treat it like a code review step, not something you bake into the initial prompt

Daniel Yarmoluk

well, one of my students is doing security knowledge graphs and compressing at the md level as a start, it's not the end, but it will bring fidelity to the front (problems/issues), try it out, I'd be happy to do it and give it back to you

Mykola Kondratiuk

that is actually a clever angle - compressing security knowledge into structured context rather than loose prose instructions. would be curious to see how well it holds up across different types of vulns. the fidelity problem is real, vague rules get interpreted loosely. share it when done

Daniel Yarmoluk

yeah, give me an input, or I can go with your thread, choose the battlefield and weaponry

Mykola Kondratiuk

haha ok - battlefield: a prod codebase that was 100% vibe coded in one weekend. weaponry: a security scanner and a code review. my bet is on the scanner finding something before you finish reading the first file

Daniel Yarmoluk

You can have a highly compressed structured knowledge graph that knows structures of relationships that’s way more reasoning hops.

Mykola Kondratiuk

fair point on reasoning hops - that is a real advantage for knowledge-dense domains. different problem space than what i was scanning for though

 
Daniel Yarmoluk

What language are you speaking? What’s your problem?

Daniel Yarmoluk

First of all, this is not a problem I give a crap about philosophically, so I don’t work on that. You probably have these nuances that could be really graphed out. Are you graph aware? Are you context aware? How much reasoning hops or token density have you figured? You are married to the narrow mindedness that a relational database is any good, or legacy dogfood for gluttonous corporations?

Mykola Kondratiuk

haha no problem at all - this is just a good debate. i build things, you challenge assumptions, that is how it should work

Daniel Yarmoluk

Perfect! Slava Ukraine

Daniel Yarmoluk

No I build things. I have 3 million lines of code, 4 million tokens on one Mac mini on one max plan per month. I compress graph knowledge. I have 50 problems I’m solving. I’m not building?

Mykola Kondratiuk

Slava Ukraini 🇺🇦

 
Daniel Yarmoluk

But as a man of good sport and international goodwill, I accept the challenge to learn together. Mario Andretti said, if you seem like you got things under control, you’re not going fast enough. Speed kills. Probabilistically speaking, that is, in my little vibe coding mind.

Mykola Kondratiuk

graph awareness helps but most vibe coders are not reaching for neo4j, they are pasting code into chatgpt and clicking deploy. the relational db critique is fair for complex domains but the security issues i found were not about data modeling - hardcoded secrets, no input sanitization, open cors. those are not architecture problems

Daniel Yarmoluk

Create Claude skills to knowledge graph scanner, reduce to the MD level second knowledge graph the hardcoded secrets reduce those anomalies to an md level. Keep going. More edges and nodes and directions of our relationships in textured way. I have calculated 170x more reasoning hops than flat files. There’s math in interconnected relationships, which sounds spiritual. 🙏

Daniel Yarmoluk

Open claw? I never left my terminal as I was building things. If I solve problems, I don’t care what anybody says, right? It’s who cares who will win this game = context. Power to the people

Mykola Kondratiuk

170x is a bold claim - curious how you measured it. the interconnected relationships angle does have something to it though

Daniel Yarmoluk

Graphifymd.com

Daniel Yarmoluk

Or better, why don’t I give you a compressed md file of a graph database of your request on here later this evening? Then you go work with it and get more context out of it, like I want to

Mykola Kondratiuk

will check it out

 
Daniel Yarmoluk

Or better why don’t I give you a compressed md file of a graph database of your request on here later this evening? Then you go work with it and get more context out of it, like I want to

Mykola Kondratiuk

yeah drop it when ready, genuinely curious what that looks like

Daniel Yarmoluk

Ok I’m driving. I need a few hours. I’m running errands

Mykola Kondratiuk

take your time, no hurry

 
Daniel Yarmoluk

But I need more context. Describe those signals you sense, we will get more fidelity. Describe your friction right now with this process, tell me what you wrestle with = context. More is better. Conversations recorded or voice notes are better than dumb prompts. Graph building skills are better than your representation, use your voice

Mykola Kondratiuk

main friction was that vibe-coded apps failed in ways that weren't obvious until you looked - surface looked fine, internals were not. that's the signal worth capturing

Comment deleted
Daniel Yarmoluk

Precious/Oliver, who are you asking this?

Mykola Kondratiuk

not sure I follow - looks like this comment thread got a bit tangled. feel free to reach out directly if there is something specific you wanted to discuss

Sarwar

I think this problem would most commonly surface in apps that were built in a single go/prompt. When the whole codebase is generated as the output, it can be hard to wrap your head around what to look for and validate. Any "shortcuts" taken by the AI at that point are easy to miss. Combined with the "don't fix it if it ain't broken" mentality and the excitement to ship new apps, it's easy to see how people end up shipping something real but insecure.

Mykola Kondratiuk

Yeah, totally agree. The single-prompt codebases were the worst offenders in what I scanned - you could almost tell by the file structure alone. Everything flat, no real separation, and the AI just... didn't know what it didn't know. Auth mixed with business logic, env vars hardcoded because nothing was telling it otherwise. The "don't know what to validate" part is real too - when you've never had to think about the security surface of a piece of code, you have no mental model for what could go wrong. At least with iterative builds there's usually some moment where a human has to integrate things and maybe notices something feels off.

Daniel Yarmoluk

The framework here is backwards. Let me explain.

Boris Cherny created Claude Code and shipped 300+ PRs in December 2025 running 5 agents simultaneously. His setup is intentionally vanilla — minimal customization, trust the model, let it rip. That works when you're Boris and you've internalized 20 years of engineering patterns. The model fills in what he already knows.

But that's exactly the problem this post exposes. Most developers don't have Boris's mental model. So the AI optimizes for "make it work" with zero weight on security, architecture, or trust boundaries. The model doesn't know what it doesn't know.

The fix isn't scanning after the fact. It's structuring what the model knows before it writes a single line.

I've been building the opposite of Boris's vanilla approach. Instead of trusting the model to figure it out, I compress domain knowledge — security constraints, architectural rules, dependency relationships, trust boundaries — into structured .md files as knowledge graphs. Entities connected by typed relationships. Not prose instructions that get buried. Traversable constraints the model follows like a checklist it can't skip.

Two personas using the same tool:

Boris (creator of Claude Code):

  • Vanilla setup, minimal customization
  • 300+ PRs in a month running 5 agents
  • Trusts the model — his 20 years of engineering IS the context
  • Works because his mental model fills the gaps the AI can't see

Dan (knowledge graph builder):

  • Heavy context architecture — structured .md files as input
  • Domain knowledge compressed to 12KB traversable graphs
  • Doesn't trust the model to know what it doesn't know
  • Works because the graph fills the gaps instead of hoping the developer will

Boris can afford vanilla because he IS the knowledge graph. Most developers aren't Boris. They need the structure externalized.

It's the same pattern everywhere:

Chef vs. recipe follower. A Michelin chef doesn't need a recipe — decades of training IS the context. A home cook needs the recipe or they burn the sauce. The recipe is the knowledge graph. AI is the kitchen.

Surgeon vs. med student. A senior surgeon operates from pattern recognition built over thousands of procedures. A resident needs the checklist. Atul Gawande wrote a whole book about this — structured checklists reduced surgical deaths by 47%. Not because surgeons are bad. Because externalized structure catches what even experts miss under pressure.

Senior dev vs. vibe coder. A senior dev reads AI output and instinctively flags "wait, why does this have admin access?" A vibe coder ships it because nothing broke in dev. That instinct IS a knowledge graph — it's just trapped in one person's head.

The 100 codebases in this post are the home cooks, the residents, the vibe coders. They're not careless. They just don't have the graph externalized yet.

LEAST_PRIVILEGE → APPLIES_TO → Service Accounts
  ↳ RULE: Never default to admin. Scope to minimum required.
ENV_VARS → MUST_USE → .env + .gitignore
  ↳ RULE: Zero hardcoded credentials. Flag and block.
INPUT_VALIDATION → REQUIRED_AT → Every boundary
  ↳ RULE: Sanitize before shell, DB, or API pass-through.
CORS → MUST_BE → Origin-specific
  ↳ RULE: Never wildcard in production.

When those constraints are typed relationships in the model's context, it doesn't "forget" them. It can't take the path of least resistance because the graph constrains the path.

Same Claude. Same model. Different structure going in. Different security posture coming out.

The 100 codebases in this post didn't fail because AI is bad at security. They failed because nobody structured what "secure" means for that specific domain before the model started coding.

The numbers from my builds: 3M+ lines of code in 50 days. Solo. On Claude Code. 170x token density when compressing domain knowledge to .md. 93% token compression. ~3,000 tokens instead of ~500,000 for the same reasoning quality. 99.4% lower carbon per query.

That's what I build at Graphify.md — domain knowledge graphs compressed to portable .md. Works in Claude, ChatGPT, Cursor, anywhere. Security is just one domain.

graphifymd.com

Mykola Kondratiuk

yeah this is basically the crux of it. the model inherits your taste - if you have good taste, output is good. if you don't, the model confidently produces something that looks right but isn't. the security holes we found were exactly that pattern, generated code that looked professional but was missing the instincts a senior dev would have applied automatically

Daniel Yarmoluk

love on the model Mykola, context is love

Mykola Kondratiuk

haha context is love until it forgets everything past 200k tokens and you are back to square one

Daniel Yarmoluk

just like those pretty ladies in Kiev, back to square one after all those tokens gone, my life story.

Mykola Kondratiuk

haha too real - context window as a metaphor for life, hits different at 2am

Daniel Yarmoluk

So are you pleased with result? I’m happy if you’re happy.

Pavel Ishchin

The excitement to ship is the part that gets me. I have shipped stuff I knew wasn't fully reviewed because the demo was working and the window felt short. It's not even ignorance at that point, it's a conscious tradeoff you make and then just hope nothing happens. The difference with vibe-coded apps is the developer might not even know the tradeoff exists. They think the app is fine because nothing broke during testing. At least when I cut corners I know where I cut them.

Mykola Kondratiuk

the invisible tradeoff is exactly it. "nothing broke yet" and "nothing is broken" feel the same from the inside. experienced devs at least know they made a tradeoff. vibe coders might not even know the category of risk exists

Pixeliro

That makes a lot of sense.

What you’re describing reminds me of how tools like Google’s Stitch are approaching UI generation — instead of generating arbitrary code, they start from structured HTML/design foundations and then build on top of that.

It feels like the same idea in a different domain:
not letting the system generate freely, but constraining it within a known structure from the beginning.

In your case, it’s about trust and permissions.
In UI systems, it’s about tokens, semantics, and layout constraints.

Both are trying to solve the same underlying problem:
reduce the space of “unsafe” decisions before they ever make it into production.

I’m starting to think the future isn’t better generation — it’s better constraint systems around generation.

Curious if you see this evolving toward more “structured-first” AI tools rather than free-form ones.

Mykola Kondratiuk

that Stitch parallel is actually really sharp. constrained generation - whether it's design tokens or permission templates - forces the model to work within a safe envelope instead of hallucinating its way into free-form territory. I wonder if the same idea applies to infra: give the AI a library of known-good IAM snippets to compose from rather than letting it generate policies from scratch. less creative, but the blast radius when it's wrong is way smaller
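That composition idea can be sketched in a few lines. Everything here is hypothetical - the template names, the policy shapes - but it shows the point: the generator may only select and parameterize vetted statements, never emit raw policy:

```python
# Hypothetical library of known-good policy templates. The generator can
# only pick from these by name; unknown names fail loudly.
POLICY_TEMPLATES = {
    "read_bucket": {"Effect": "Allow", "Action": ["s3:GetObject"],
                    "Resource": "arn:aws:s3:::{bucket}/*"},
    "write_queue": {"Effect": "Allow", "Action": ["sqs:SendMessage"],
                    "Resource": "arn:aws:sqs:*:*:{queue}"},
}

def compose_policy(picks: list) -> dict:
    """Build a policy from (template_name, params) pairs."""
    statements = []
    for name, params in picks:
        if name not in POLICY_TEMPLATES:
            raise ValueError(f"unknown template: {name}")  # small blast radius
        tpl = POLICY_TEMPLATES[name]
        statements.append({**tpl, "Resource": tpl["Resource"].format(**params)})
    return {"Version": "2012-10-17", "Statement": statements}
```

Less expressive than free-form generation, but the worst case is a rejected template name, not a wildcard admin policy.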

Daniel Yarmoluk

Why didn’t the industry pay attention to graph databases more? They were always the better option, but very few very skilled professionals map entire ontologies. Palantir does, Bloomberg does, big money. Now if your knowledge graph structure is compressed time and time again, reduced to 10-15kb, you’re pushing more context through the context window and hence more fidelity. I can get the Jensen Huang engineer 250k to 5k? Context efficiency, reasoning hops, context density. I’d like to graph and give the files for you to play with in any environment. That’s my proposal. However, I will not use search engines personally, think what that means.

Mykola Kondratiuk

graph databases had the tooling problem for a long time - the queryability and dev experience lagged behind relational. LLMs are actually changing that equation because natural language queries map better to graph traversal. compressed ontologies as context is a real pattern, especially for domain-heavy apps where the graph IS the knowledge

Pixeliro

That’s a really sharp extension of the idea. Constrained generation feels like the common pattern here — whether it’s design tokens, UI components, or permission templates, the goal is to keep the model operating inside a safe envelope instead of letting it hallucinate freely.

Applying this to infra makes a lot of sense. Composing IAM policies from a library of validated snippets is far more predictable than generating them from scratch.

You lose some flexibility, but the reduction in blast radius when things go wrong is absolutely worth it in production.

Mykola Kondratiuk

composing from validated snippets is the right direction. basically policy-as-code with a model-friendly interface. you get the expressiveness of natural language generation but bounded by proven primitives - audit trail becomes way cleaner too

Daniel Yarmoluk

Well, what if I say I knowledge graph any question I have through Claude Code CLI, and my Google is the Claude on the browser. What am I?

Mykola Kondratiuk

someone who built a workflow that actually fits how they think. using Claude as both the graph and the search layer is interesting - you lose the precision of structured queries but gain the flexibility. depends what you are optimizing for

Pixeliro

That’s a really interesting way to frame it — especially the idea of turning Claude into a knowledge graph interface rather than a generator.

I think we’re seeing the same shift in UI systems. The problem isn’t that models aren’t good enough at generating — it’s that we’re letting them operate in an unconstrained space.

What we’re experimenting with is pushing structure before generation — forcing everything through component specs, tokens, and layout constraints. The model isn’t generating UI anymore, it’s composing within a system.

So yeah, I’m starting to believe the future isn’t better generation — it’s better boundaries around generation.

Mykola Kondratiuk

you are someone who found a workflow that actually fits how you think. the tool matters less than whether it extends your cognition or interrupts it

arun rajkumar

The trust misconfiguration pattern is painfully familiar — and not just from AI-generated code. We run 15 microservices at a fintech startup, and before we centralised our env management, we had the exact same issues with human-written code: DB connection strings with full admin creds, no connection pooling limits, env files copy-pasted over Slack.

The fix was a shared Zod schema that validates every env var on startup. Dev-local mode warns, production mode calls process.exit(1). It catches the "fastest path to working" problem before it ships. AI just makes that anti-pattern happen faster. The real gap is that nobody validates the boring infrastructure decisions — human or AI.
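Zod is TypeScript, but the fail-fast shape described here translates to a few lines of stdlib Python. The variable names and rules below are placeholders, not the commenter's actual schema:

```python
import sys

# Hypothetical required env vars and their validity checks.
REQUIRED = {
    "DATABASE_URL": lambda v: v.startswith(("postgres://", "postgresql://")),
    "PORT": lambda v: v.isdigit(),
}

def validate_env(env: dict, production: bool) -> list:
    """Return a list of problems. In production, crash instead of warn."""
    problems = [
        f"{key}: missing or invalid"
        for key, ok in REQUIRED.items()
        if key not in env or not ok(env[key])
    ]
    if problems and production:
        print("\n".join(problems), file=sys.stderr)
        sys.exit(1)  # the process.exit(1) equivalent from the comment
    return problems  # dev-local mode: warn and continue
```

Crashing on startup turns a six-months-later incident into a thirty-second deploy failure, which is the whole trade.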

Mykola Kondratiuk

the human-written fintech example is actually the better data point tbh - same patterns, just slower velocity. AI just makes it impossible to ignore anymore because it surfaces 3x faster. centralising env management was the right call regardless of how you got there

arun rajkumar

Exactly right. The AI just made it impossible to ignore what was always there. We'd been accumulating env variable drift for years — every new service copied from the last one with slight modifications nobody documented. An AI agent flagged 23 inconsistencies in minutes. A human would've found the same things eventually, just spread over months of "hmm, that's weird" moments. The centralised env management was the real fix. The AI was just the mirror that made us stop pretending the mess wasn't there.

Mykola Kondratiuk

23 inconsistencies in minutes is a good example of the pattern - not finding new problems, just surfacing old ones faster. The "hmm that is weird" moments that used to disappear into the backlog now come back with receipts. Centralized env management makes sense as the fix but the harder part is usually the cultural shift: actually enforcing it after you find the problems.

arun rajkumar

We added a drift checker in PR and pre-commit hooks, so instead of relying on cultural alignment we made it an automated process, which prevents this from ever happening again.

Mykola Kondratiuk

That is the right call - baking it into the pipeline so it is not a cultural ask anymore. The cultural alignment approach works until someone is rushing a deploy at 11pm. Automated constraint is the only version that holds under pressure.

arun rajkumar

100% agree on the automated constraint point. We learned this the hard way — we had a "best practices" doc for env management that everyone agreed with and nobody followed consistently. The moment we replaced it with a Zod schema that validates on startup and crashes the process in production if something's wrong, compliance went from "mostly" to "always." The doc was aspirational. The schema is a fact. Cultural alignment is important for things that genuinely need judgment. For things that have a clear right answer — like "don't deploy with an empty DB connection string" — just make the wrong thing impossible.

Mykola Kondratiuk

the zod schema approach is the right call - trust the invariant in code, not the dev's memory of a doc. we hit the same thing with config validation for AI agents: once we moved from 'the readme says X' to 'startup crashes if X is wrong', compliance stopped being a conversation. the cultural buy-in you already had makes the technical enforcement feel collaborative rather than punitive too, which is a nice side effect.

Tomik T

Great writeup. Security in vibe coded projects is the elephant in the room that nobody wants to talk about. I run a small hosting platform and we added automated YARA scanning on every single deploy exactly because of this. You would not believe the stuff people upload: hardcoded API keys, SQL injection galore, sometimes even test credentials for production databases. The AI generates code that works but it has zero concept of security. I think every hosting provider that targets vibe coders needs to have some form of automated scanning built in, it's not optional anymore

Mykola Kondratiuk

honestly the YARA scanning is smart - we ended up doing something similar for VibeCheck, scanning for patterns that scream "AI wrote this and forgot security exists". the hardcoded credentials thing you mentioned is everywhere, I found them in like 40% of the repos I scanned, sometimes buried 3 folders deep like the dev thought obscurity = security. tbh I think you're right that hosting platforms need to build this in because most vibe coders won't think about it until something breaks. the AI just doesn't model threat scenarios, it models "make the tests pass"
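For a feel of what that pattern-based scanning looks like, here is a toy version with two illustrative rules. Real scanners (YARA rulesets, gitleaks, etc.) ship hundreds of patterns plus entropy checks to cut false positives - treat this as a sketch, not a detector:

```python
import re

# Two illustrative rules only: an AWS-style access key ID, and a
# generic "api_key = '<long token>'" assignment.
SECRET_PATTERNS = [
    ("aws_access_key", re.compile(r"AKIA[0-9A-Z]{16}")),
    ("generic_api_key", re.compile(
        r"(?i)(api[_-]?key|secret)\s*[=:]\s*['\"][A-Za-z0-9_\-]{16,}['\"]")),
]

def scan_text(text: str) -> list:
    """Return the names of secret patterns found in a blob of source."""
    return [name for name, pat in SECRET_PATTERNS if pat.search(text)]
```

Note that reading a key from the environment does not trip the rule - the whole point is distinguishing "credential in source" from "credential referenced at runtime".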

Pavel Ishchin

YARA at deploy is smart but what happens when it catches something? Do you block the deploy entirely or just flag it? Asking because the article mentions that most of these issues aren't malicious, they're trust misconfigurations. Blocking a deploy for a hardcoded key makes sense. Blocking it for open CORS might be too aggressive if the dev intended it for a public API. Where do you draw the line on what's a hard block vs a warning?

Mykola Kondratiuk

the block vs flag question is real and context-dependent. hardcoded secrets = block, no debate. open CORS or overpermissioned IAM - I would flag with severity + required acknowledgment rather than hard block. the key is forcing a conscious decision rather than hoping the dev noticed. "I know this is open CORS and I intended it" is a very different state than it slipping through unreviewed
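That block / acknowledge / warn split can be sketched as a tiny triage table. The finding names here are hypothetical:

```python
from enum import Enum

class Action(Enum):
    BLOCK = "block"              # deploy stops, no override
    ACKNOWLEDGE = "acknowledge"  # deploy stops until someone signs off
    WARN = "warn"                # noted, deploy proceeds

# Hypothetical mapping of the policy described above.
TRIAGE = {
    "hardcoded_secret": Action.BLOCK,
    "open_cors": Action.ACKNOWLEDGE,
    "overpermissioned_iam": Action.ACKNOWLEDGE,
    "missing_rate_limit": Action.WARN,
}

def gate_deploy(findings: list, acknowledged: set) -> bool:
    """True if the deploy may proceed under the triage policy."""
    for finding in findings:
        action = TRIAGE.get(finding, Action.WARN)
        if action is Action.BLOCK:
            return False
        if action is Action.ACKNOWLEDGE and finding not in acknowledged:
            return False  # forces the conscious decision
    return True
```

The ACKNOWLEDGE tier is the interesting one: it encodes "I know this is open CORS and I intended it" as a recorded state rather than a hope.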

Daniel Yarmoluk

Who are you worried about? Like you’re worried for what group of people? The security company that watches what?

Mykola Kondratiuk

mostly worried about the builders themselves - a lot of vibe coders ship without ever looking at what permissions their apps are granting or what data they are exposing. no malicious intent, just not in the habit of thinking about attack surface

wong2 kim

As someone who builds iOS apps entirely with AI-assisted development, the trust misconfiguration pattern is painfully real. I've caught myself shipping overly permissive Supabase RLS policies because the AI-generated code "just worked" in dev.

The gap between "works in dev" and "safe for real users" is exactly what keeps me up at night. My approach has been adding a manual security review checkpoint before every App Store submission — but that doesn't scale well when you're shipping multiple apps.

The idea of background tooling that catches these issues during the coding session (not after) feels like the right direction. Would love to try VibeCheck on my projects.

Mykola Kondratiuk

yeah the Supabase RLS thing hit me too. had a policy that was basically true for reads because the AI wanted to make the demo work fast. caught it only when I read through the migration files line by line before launch. now I ask the AI to specifically audit the policies after writing them - weirdly works better than asking it to write secure ones in the first place. still manual but at least it's a separate pass with a different prompt

Daniel Yarmoluk

We can knowledge graph these solutions. I cannot say what will or will not work, but I can help get more efficient context architecture to assist in your solution development. We all can assist each other in solving big problems together.

Mykola Kondratiuk

collective intelligence angle is interesting. the challenge is keeping context sharp - more nodes can help or hurt depending on how well the graph is structured

Daniel Yarmoluk

Honestly, I’d rather have a recorded conversation, a real conversation, and load that text into model. Not tax my prefrontal cortex to write a prompt. I’m too dumb

Mykola Kondratiuk

conversation as context is underrated. transcripts carry nuance that prompts lose. the "too dumb to prompt" framing is honest - most people think in conversation not structured instructions

Daniel Yarmoluk

You can just give me a call

Mykola Kondratiuk

haha lets keep it in the comments for now - more interesting for others to read too

Daniel Yarmoluk

I’m going to give this to you in a few hours

Mykola Kondratiuk

no rush

Daniel Yarmoluk

Mykola — here it is. I owe you a proper response, not scattered comments from behind the wheel.

the thread earlier was raw signal — fragments of an idea typed between errands. this is what happens when I actually sit down and structure it. a page of structured context enables 170 reasoning paths. a few comments in a thread enable maybe 3. you deserve the full page.

you said: "battlefield: a prod codebase that was 100% vibe coded in one weekend. weaponry: a security scanner and a code review. my bet is on the scanner finding something before you finish reading the first file."

I accept. but here's the twist.

I'm going to vibe code a real production-ready app this weekend. 100% AI-generated, one weekend, shipped. but before I write a single line, I'm loading a security knowledge graph built from YOUR findings into the model's context. your article, your comment thread, your data — structured as typed relationships with rules, constraints, and the vulnerability patterns you found across 100 codebases.

your own data, aimed back at your scanner.

then you scan it with VibeCheck. if your scanner finds vulns — you win, the scanner beats the graph. if it comes back clean — the graph wins, structured context prevented what scanning would have caught after the fact.

either way, we both learn something real. and the experiment is documented.

I took your article + this thread + your twitter findings and compressed them into a knowledge graph. 58 entities, 89 typed relationships, 8 layers, ~320 reasoning paths. paste it into any model and ask it anything about your security domain.

but it's not just a diagnostic. layers 6-8 are solutions — things you can actually build, including the "something watching in the background" you said nobody's solved yet (spoiler: it's solved — Claude Code hooks do exactly this).

there are also 5 weak signals I found in your own words that you might not have noticed the significance of. your data told a bigger story than the article captured. the graph connected them.

I already ran the A/B test before posting. same model (ChatGPT, extended thinking), same prompt: "what's my minimum security checklist for shipping a vibe-coded app this weekend?"

flat article: generic 11-item OWASP checklist. zero references to your actual findings. zero named vulnerability patterns from your data. the model ignored your article and fell back to training data.

knowledge graph: 3-tier ship/no-ship decision framework. 16 identifiable graph traversals. 6 named patterns from your scans (open CORS, overprivileged accounts, hardcoded creds, unsanitized passthrough, f-string SQL, secrets in source). 4 direct citations of your data. referenced your failure modes by name.

same model. same prompt. 16:0 on domain-specific insights. the flat version didn't score lower — it didn't score.

the full math is in Layer 10 of the knowledge graph. run the same test yourself — paste your flat article into one chat, paste the KG into another, same question, compare. that's the proof.

10 layers. 78 entities. 124 typed relationships. ~450 reasoning paths. from one page of structured .md — the same token budget as your flat article, but 170x more ways for a model to reason through it. that's the claim. the A/B test is the proof. the knowledge graph is in my next comment.

and when you're ready to define the battlefield — framework, scope, what "production-ready" means to you — tell me. I'll build it live, KG-loaded, and you scan the result. loser buys the other one a coffee via the internet.

let's go. 🇺🇦

We both win this way, I would argue... but I need more context when you're concerned with context architecture.

Daniel Yarmoluk

Here's the knowledge graph. Copy everything below and paste into any AI model.


Vibe-Coded Security Vulnerability Knowledge Graph

Source: "I Scanned 100 AI Codebases" — Mykola Kondratiuk (VibeCheck)

58 entities | 89 typed relationships | 8 layers | ~320 reasoning paths


GRAPH SCHEMA

[ENTITY] --RELATIONSHIP--> [ENTITY]
  ↳ CONSTRAINT: rule or invariant
  ↳ EVIDENCE: observed data point
  ↳ METRIC: quantified signal

LAYER 1: ROOT CAUSE MODEL

AI_CODE_GENERATION --OPTIMIZES_FOR--> FUNCTIONAL_CORRECTNESS
  ↳ CONSTRAINT: Zero weight on production survivability
  ↳ EVIDENCE: 100/100 codebases prioritized "make it work"

AI_CODE_GENERATION --IGNORES--> TRUST_BOUNDARIES
  ↳ CONSTRAINT: Model has no concept of blast radius
  ↳ EVIDENCE: Dominant vulnerability class = trust misconfigurations (not OWASP classics)

DEVELOPER_EXPERTISE --INVERSELY_CORRELATES--> VULNERABILITY_DENSITY
  ↳ METRIC: Single-prompt codebases = worst offenders
  ↳ METRIC: Iterative builds with human review = significantly fewer issues

PATH_OF_LEAST_RESISTANCE --IS_DEFAULT_FOR--> AI_DECISION_MAKING
  ↳ CONSTRAINT: Model selects fastest working path, not safest path
  ↳ EVIDENCE: Admin creds used because "fastest path to getting it working"

LAYER 2: VULNERABILITY TAXONOMY

TRUST_MISCONFIGURATION --PARENT_OF--> OPEN_CORS
  ↳ RULE: CORS must be origin-specific, never wildcard in production
  ↳ FREQUENCY: High (appeared across majority of scanned repos)
  ↳ DETECTION: Static scan for Access-Control-Allow-Origin: *

TRUST_MISCONFIGURATION --PARENT_OF--> OVERPRIVILEGED_SERVICE_ACCOUNTS
  ↳ RULE: Least privilege — never default to admin, scope to minimum required
  ↳ FREQUENCY: High
  ↳ EXPLOIT_WINDOW: Lateral movement after initial compromise

TRUST_MISCONFIGURATION --PARENT_OF--> HARDCODED_CREDENTIALS
  ↳ RULE: Zero hardcoded secrets. Use .env + .gitignore. Flag and block.
  ↳ FREQUENCY: High (API keys in config files not in .gitignore)
  ↳ EXPLOIT_WINDOW: Credential harvesting from public repos

TRUST_MISCONFIGURATION --PARENT_OF--> UNSANITIZED_INPUT_PASSTHROUGH
  ↳ RULE: Sanitize before shell, DB, or API pass-through. Always.
  ↳ FREQUENCY: High (input passed straight into shell commands)
  ↳ SEVERITY: Critical — command injection vector

SQL_INJECTION --CAUSED_BY--> STRING_CONCATENATION_IN_QUERIES
  ↳ RULE: Always use parameterized queries. Never f-string SQL.
  ↳ FREQUENCY: 11/50 apps (22%) — concentrated in "quick fix" late-night commits
  ↳ AI_TRIGGER: Model defaults to f-strings when unsure about ORM

EXPOSED_SECRETS --CAUSED_BY--> ENV_IN_SOURCE
  ↳ RULE: Secrets manager or .env with .gitignore. No exceptions.
  ↳ FREQUENCY: 17/50 apps (34%)
  ↳ AI_TRIGGER: Model includes working example values that are real credentials

NO_CONNECTION_POOLING --CAUSED_BY--> DEFAULT_DB_CONFIG
  ↳ RULE: Set connection pool limits. Never expose raw admin connection string.
  ↳ EVIDENCE: One project had full admin creds + no pooling + raw user input in queries
  ↳ EXPLOIT_WINDOW: Resource exhaustion + data exfiltration
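several of the LAYER 2 rules reduce to source patterns a scanner can grep for. a minimal sketch of a three-rule detector (the regexes are illustrative guesses, not VibeCheck's actual signatures; a real scanner would use AST analysis to cut false positives):

```python
import re

# Illustrative patterns for three LAYER 2 rules. Regex-level checks
# only -- good enough to demonstrate the taxonomy, not production-grade.
CHECKS = {
    "open_cors": re.compile(
        r"Access-Control-Allow-Origin['\"]?\s*\]?\s*[:=]\s*['\"]\*"
    ),
    "fstring_sql": re.compile(
        r"f['\"](SELECT|INSERT|UPDATE|DELETE)\b[^'\"]*\{", re.IGNORECASE
    ),
    "hardcoded_secret": re.compile(
        r"(api[_-]?key|secret|password)\s*=\s*['\"][A-Za-z0-9_\-]{16,}['\"]",
        re.IGNORECASE,
    ),
}

def scan_source(source: str) -> list[str]:
    """Return the names of every rule the source text violates."""
    return [name for name, pattern in CHECKS.items() if pattern.search(source)]
```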

LAYER 3: FAILURE MODE ANALYSIS

DEV_ENVIRONMENT --MASKS--> PRODUCTION_VULNERABILITIES
  ↳ MECHANISM: Nothing breaks in dev. Tests pass. Works on their machine.
  ↳ GAP: "works fine in dev" ≠ "safe to run with real users"
  ↳ METRIC: Multiple scanned codebases were running in production with real data

SINGLE_PROMPT_GENERATION --PRODUCES--> FLAT_ARCHITECTURE
  ↳ SIGNAL: Everything in one directory. No separation of concerns.
  ↳ SIGNAL: Auth logic mixed with business logic
  ↳ SIGNAL: Can identify single-prompt codebases by file structure alone

EXCITEMENT_TO_SHIP --OVERRIDES--> SECURITY_REVIEW
  ↳ CONSTRAINT: "Don't fix it if it ain't broken" mentality
  ↳ EVIDENCE: Developers were genuinely excited, built cool things, security never surfaced
  ↳ INSIGHT: Not careless people — security just wasn't on their radar

AI_CONTEXT_WINDOW --DROPS--> SECURITY_CONSTRAINTS
  ↳ MECHANISM: Security guidelines in prompt work when task is security-framed
  ↳ MECHANISM: When task shifts to "add feature," security context doesn't carry over
  ↳ EVIDENCE: Author tested adding security guidelines — partial improvement only

LAYER 4: INTERVENTION MODEL

SCANNING_AFTER_GENERATION --CATCHES--> OBVIOUS_VULNERABILITIES
  ↳ TOOL: VibeCheck (vibe-checker.dev)
  ↳ TIMING: Post-generation, pre-deployment
  ↳ LIMITATION: Reactive, not preventive

DEVELOPER_ATTENTION_DURING_GENERATION --PREVENTS--> TRUST_MISCONFIGURATIONS
  ↳ MECHANISM: Reading output, asking "why does this need admin access?"
  ↳ METRIC: Projects with human review during generation = fewest issues
  ↳ INSIGHT: Even a little friction makes a big difference

SECURITY_PROMPTS --PARTIALLY_MITIGATE--> VULNERABILITY_GENERATION
  ↳ METHOD: Explicitly telling AI to follow least-privilege, validate inputs, no hardcoded creds
  ↳ EFFECTIVENESS: Partial — works when task aligns, fails on context switch
  ↳ ASSESSMENT: "Feels like workarounds"

STRUCTURED_CONTEXT_INJECTION --CONSTRAINS--> AI_DECISION_PATHS
  ↳ METHOD: Security rules as typed relationships loaded into model context
  ↳ ADVANTAGE: Model can't take path of least resistance when graph constrains the path
  ↳ ADVANTAGE: Relationships persist across task switches (unlike prose instructions)

LAYER 5: DOMAIN TRANSFER PATTERNS

KNOWLEDGE_GRAPH_APPROACH --ANALOGOUS_TO--> SURGICAL_CHECKLIST
  ↳ EVIDENCE: Atul Gawande — structured checklists reduced surgical deaths 47%
  ↳ PRINCIPLE: Externalized structure catches what even experts miss under pressure
  ↳ TRANSFER: Security KG = surgical checklist for code generation

EXPERT_INTUITION --IS_A--> INTERNALIZED_KNOWLEDGE_GRAPH
  ↳ EXAMPLE: Senior dev instinctively flags "why admin access?" = graph traversal in wetware
  ↳ PROBLEM: Most developers don't have that instinct yet. Can't externalize what they don't have.
  ↳ SOLUTION: Externalize the graph so any developer can traverse expert-level constraints

LAYER 6: SOLUTION ARCHITECTURE — What You Can Build

VIBECHECK_SCANNER --CURRENTLY_IS--> REACTIVE_POST_GENERATION
  ↳ STATUS: Works. Catches obvious vulns after code exists.
  ↳ LIMITATION: By the time you scan, the developer has already shipped.
  ↳ UPGRADE_PATH: Move scanning LEFT in the pipeline (pre-commit, real-time, context-injection).

VIBECHECK_SCANNER --SHOULD_BECOME--> THREE_STAGE_SECURITY_PIPELINE
  ↳ STAGE_1: Pre-generation (context injection — security KG loaded before coding starts)
  ↳ STAGE_2: During-generation (real-time monitoring — flags as code is written)
  ↳ STAGE_3: Post-generation (current VibeCheck scan — catches what slipped through)
  ↳ PRINCIPLE: Defense in depth. No single stage catches everything.

STAGE 1: Pre-Generation Context Injection

SECURITY_RULES_FILE --INJECTS_INTO--> AI_CODING_SESSION
  ↳ FORMAT: .cursorrules (Cursor), CLAUDE.md (Claude Code), .github/copilot-instructions.md (Copilot)
  ↳ MECHANISM: IDE/tool loads rules file automatically before any code generation
  ↳ CONTENT: This KG's LAYER 2 rules, compressed to ~500 tokens
  ↳ RESULT: Model sees security constraints BEFORE it writes line 1

SECURITY_CONTEXT_FILE --CONTAINS--> MANDATORY_RULES
  ↳ RULE_1: "Never hardcode secrets. Use environment variables with .env + .gitignore."
  ↳ RULE_2: "Always use parameterized queries. Never use string concatenation for SQL."
  ↳ RULE_3: "CORS must be origin-specific. Never use wildcard (*) in production."
  ↳ RULE_4: "Apply least privilege to all service accounts. Never default to admin."
  ↳ RULE_5: "Sanitize all user input before passing to shell, database, or API."
  ↳ RULE_6: "Set connection pool limits on all database connections."
  ↳ RULE_7: "Separate auth logic from business logic. Never mix in same file."
  ↳ DELIVERY: VibeCheck generates this file automatically from scan results
  ↳ VALUE: "We scanned your code. Here's the rules file so it doesn't happen again."

VIBECHECK --CAN_GENERATE--> SECURITY_RULES_FILE
  ↳ FLOW: Scan repo → identify violations → generate .cursorrules/.claude.md with rules
  ↳ DIFFERENTIATION: No other scanner generates pre-generation context files
  ↳ MOAT: Becomes the bridge between "what broke" and "don't break it again"
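the scan → rules-file flow is mostly a lookup plus a template. a minimal sketch, assuming scan output arrives as a list of violation names (that input format is an assumption; the rule wording mirrors the SECURITY_CONTEXT_FILE entries above):

```python
# Maps a violation name from a scan to the rule text to inject into the
# next coding session. File names (.cursorrules, CLAUDE.md) are the
# conventions Cursor and Claude Code load automatically.
RULE_TEXT = {
    "hardcoded_secret": "Never hardcode secrets. Use environment variables with .env + .gitignore.",
    "fstring_sql": "Always use parameterized queries. Never use string concatenation for SQL.",
    "open_cors": "CORS must be origin-specific. Never use wildcard (*) in production.",
    "overprivileged_account": "Apply least privilege to all service accounts. Never default to admin.",
}

def generate_rules_file(violations: list[str]) -> str:
    """Build a .cursorrules / CLAUDE.md body from the violations a scan found."""
    lines = ["# Security rules (generated from scan results)"]
    for v in violations:
        if v in RULE_TEXT:
            lines.append(f"- {RULE_TEXT[v]}")
    return "\n".join(lines) + "\n"
```

drop the generated file in the project root and the next generation session starts with the constraints already in context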

STAGE 2: During-Generation Real-Time Monitoring

PRE_COMMIT_HOOK --INTERCEPTS--> INSECURE_CODE_AT_COMMIT
  ↳ TOOL: pre-commit framework (pre-commit.com) + custom VibeCheck hook
  ↳ MECHANISM: Developer runs git commit → hook runs VibeCheck scan → blocks if critical
  ↳ FRICTION: Low — only fires on commit, not every keystroke
  ↳ IMPLEMENTATION: .pre-commit-config.yaml in repo root, VibeCheck as a hook entry

CLAUDE_CODE_HOOK --INTERCEPTS--> INSECURE_CODE_AT_WRITE
  ↳ TOOL: Claude Code hooks system (settings.json → hooks[] config)
  ↳ MECHANISM: PostToolUse event on Edit/Write → runs security check → warns developer
  ↳ FRICTION: Near-zero — runs in background, only surfaces when something is wrong
  ↳ NOTE: This is your "unsolved problem" — "something watching during vibe coding." It's solved. Claude Code hooks do exactly this.

FILE_WATCHER --MONITORS--> CODE_CHANGES_IN_REAL_TIME
  ↳ TOOL: fswatch / chokidar / nodemon pattern
  ↳ MECHANISM: Watch project directory → on file change → run quick security scan
  ↳ SPEED_REQUIREMENT: Must complete in <2 seconds or developers will disable it
  ↳ SCOPE: Only scan changed files, not full repo
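the pre-commit wiring is one file. a hypothetical .pre-commit-config.yaml, assuming VibeCheck publishes hook metadata (the repo URL, rev, hook id, and flags are all made up for illustration):

```yaml
# .pre-commit-config.yaml -- hypothetical entry; assumes VibeCheck
# ships a CLI and publishes a pre-commit hook definition.
repos:
  - repo: https://github.com/vibecheck/pre-commit-hook  # assumed URL
    rev: v0.1.0
    hooks:
      - id: vibecheck-scan
        # scan only staged files; fail the commit on critical findings
        args: ["--changed-only", "--fail-on", "critical"]
```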

STAGE 3: Post-Generation (Enhanced Current VibeCheck)

VIBECHECK_SCAN --ENHANCED_BY--> KNOWLEDGE_GRAPH_RULES
  ↳ CURRENT: Pattern matching against known vulnerability signatures
  ↳ ENHANCED: Traverse LAYER 2 taxonomy → check each typed relationship as a rule
  ↳ ADVANTAGE: KG rules have FREQUENCY and SEVERITY → prioritize findings by real-world impact
  ↳ ADVANTAGE: KG rules have AI_TRIGGER → explain WHY the AI made this mistake

SCAN_RESULTS --FEED_BACK_INTO--> KNOWLEDGE_GRAPH
  ↳ MECHANISM: Each new scan adds new vulnerability patterns to the KG
  ↳ RESULT: KG grows with every codebase scanned — self-improving system
  ↳ METRIC: After 1,000 scans, KG predicts which vuln types appear in which frameworks

SCAN_REPORT --GENERATES--> REMEDIATION_CONTEXT
  ↳ CURRENT: "Found hardcoded API key on line 47" (what's wrong)
  ↳ ENHANCED: "Found hardcoded API key on line 47. AI_TRIGGER: Model included working example value as real credential. RULE: Use .env + .gitignore. Here's the fix + a .cursorrules entry to prevent recurrence." (what's wrong + why + fix + prevention)
  ↳ DIFFERENTIATION: No other scanner explains WHY the AI made the mistake

LAYER 7: PRODUCT EVOLUTION — VibeCheck Becomes a Platform

VIBECHECK_V1 --IS--> SCANNER (current)
  ↳ INPUT: Codebase URL or upload
  ↳ OUTPUT: Vulnerability report
  ↳ VALUE: "Here's what's wrong"

VIBECHECK_V2 --BECOMES--> SCANNER_PLUS_CONTEXT_GENERATOR
  ↳ INPUT: Codebase URL or upload
  ↳ OUTPUT: Vulnerability report + .cursorrules + CLAUDE.md + copilot-instructions.md
  ↳ VALUE: "Here's what's wrong + here's how to never do it again"

VIBECHECK_V3 --BECOMES--> REAL_TIME_SECURITY_COPILOT
  ↳ INPUT: Installed as pre-commit hook + Claude Code hook + IDE extension
  ↳ OUTPUT: Real-time warnings during coding + auto-generated fixes
  ↳ VALUE: "I'm watching while you vibe code — you'll never ship insecure code"

VIBECHECK_V4 --BECOMES--> SECURITY_KNOWLEDGE_PLATFORM
  ↳ INPUT: Aggregated scan data from all users (anonymized)
  ↳ OUTPUT: Community security KG — which frameworks produce which vulns at what rates
  ↳ VALUE: "We've scanned 10,000 vibe-coded apps. Here's what every AI tool gets wrong."
  ↳ MOAT: Network effect — more users → better KG → better prevention → more users

COMMUNITY_SECURITY_KG --ENABLES--> FRAMEWORK_SPECIFIC_RULES
  ↳ EXAMPLE: "Next.js + Claude → 40% chance of open CORS. Auto-inject: CORS_RULE."
  ↳ EXAMPLE: "Flask + Copilot → 30% chance of f-string SQL. Auto-inject: PARAMETERIZED_QUERY_RULE."
  ↳ EXAMPLE: "React + Cursor → 25% chance of exposed API keys. Auto-inject: ENV_VAR_RULE."
  ↳ DATA_SOURCE: Aggregated VibeCheck scan results across all users
  ↳ COMPETITIVE_ADVANTAGE: Nobody else has this dataset

LAYER 8: IMPLEMENTATION QUICKSTART — Build This Weekend

WEEKEND_BUILD_1 --IS--> SECURITY_RULES_GENERATOR (4 hours)
  ↳ STEP_1: Take VibeCheck scan output → parse into structured violations
  ↳ STEP_2: Map violations to LAYER 2 rules (this KG provides the mapping)
  ↳ STEP_3: Generate .cursorrules / CLAUDE.md / copilot-instructions.md from rules
  ↳ STEP_4: User downloads generated file → drops in project root → done
  ↳ OUTPUT: "Scan + Protect" feature on vibe-checker.dev
  ↳ EFFORT: Mostly template generation — the hard part (the rules) is already in this KG

WEEKEND_BUILD_2 --IS--> PRE_COMMIT_HOOK_PACKAGE (2 hours)
  ↳ STEP_1: Wrap VibeCheck CLI as a pre-commit hook
  ↳ STEP_2: Publish to pre-commit hook registry
  ↳ STEP_3: Users add one line to .pre-commit-config.yaml
  ↳ EFFORT: Minimal if VibeCheck already has a CLI

WEEKEND_BUILD_3 --IS--> CLAUDE_CODE_SECURITY_SKILL (3 hours)
  ↳ STEP_1: Use Claude Code's skill-creator to scaffold the skill
  ↳ STEP_2: Embed LAYER 2 rules as skill references
  ↳ STEP_3: Write evals (test prompts → expected security behavior)
  ↳ STEP_4: Publish to Claude skill marketplace
  ↳ TRIGGER_PHRASES: "scan this for security" / "is this safe" / "check before I ship"

WEEKEND_BUILD_4 --IS--> ENHANCED_SCAN_REPORT (2 hours)
  ↳ STEP_1: Add AI_TRIGGER field to scan findings ("why did the AI do this?")
  ↳ STEP_2: Add REMEDIATION field with copy-paste fix
  ↳ STEP_3: Add PREVENTION field with .cursorrules entry
  ↳ IMPACT: Transforms report from "what's wrong" to "what's wrong + why + fix + prevent"
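WEEKEND_BUILD_4 is mostly a data-shape change. a minimal python sketch of the enrichment step (the finding dict format and the enrichment text are assumptions keyed to the KG entries above):

```python
# Illustrative enrichment table keyed by finding type. AI_TRIGGER and
# PREVENTION wording follows the knowledge graph entries above.
ENRICHMENT = {
    "hardcoded_api_key": {
        "ai_trigger": "Model included a working example value that is a real credential.",
        "remediation": "Move the key to an environment variable loaded from .env.",
        "prevention": "- Never hardcode secrets. Use .env + .gitignore.",
    },
}

def enrich_finding(finding: dict) -> dict:
    """Turn a bare 'what's wrong' finding into what's wrong + why + fix + prevention."""
    extra = ENRICHMENT.get(finding["type"], {})
    return {**finding, **extra}
```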

QUERY INTERFACE

to use this graph, paste it into any AI model and traverse relationships:

| Query | Try this |
| --- | --- |
| What causes X? | SQL_INJECTION --CAUSED_BY--> STRING_CONCATENATION |
| What prevents X? | DEVELOPER_ATTENTION --PREVENTS--> TRUST_MISCONFIGURATIONS |
| What does AI ignore? | AI_CODE_GENERATION --IGNORES--> TRUST_BOUNDARIES |
| What's the gap? | DEV_ENVIRONMENT --MASKS--> PRODUCTION_VULNERABILITIES |
| What's unsolved? | REAL_TIME_MONITORING → see LAYER 6 STAGE 2 (it's solved) |
| What should I build first? | WEEKEND_BUILD_1 → Security Rules Generator (4 hours, highest ROI) |
| How does VibeCheck evolve? | V1 → V2 → V3 → V4 (scanner → context gen → copilot → platform) |
| What's the moat? | COMMUNITY_SECURITY_KG --ENABLES--> FRAMEWORK_SPECIFIC_RULES |

YOUR NUMBERS

| Metric | Value |
| --- | --- |
| Codebases scanned | 100 |
| Dominant vulnerability class | Trust misconfigurations |
| SQL injection via f-strings | 22% (11/50) |
| Exposed secrets in source | 34% (17/50) |
| Shipped to production with issues | Multiple confirmed |
| Worst offender pattern | Single-prompt, one-weekend builds |
| Best mitigation | Human attention during generation + structured context |

LAYER 9: THE CHALLENGE — Scanner vs. Knowledge Graph

CHALLENGE --PROPOSED_BY--> MYKOLA
  ↳ BATTLEFIELD: A prod codebase, 100% vibe coded in one weekend
  ↳ WEAPONRY: VibeCheck security scanner + code review
  ↳ BET: "Scanner finds something before you finish reading the first file"

CHALLENGE --ACCEPTED_BY--> DAN
  ↳ TWIST: Vibe code the app with THIS KG loaded as structured context
  ↳ SOURCE: Mykola's own findings from 100 codebase scans
  ↳ HYPOTHESIS: Structured context injection prevents the vulns the scanner would catch

Experiment Protocol

EXPERIMENT --HAS_PHASES--> FOUR_PHASES

PHASE_1 --IS--> DEFINE_BATTLEFIELD
  ↳ ACTION: Mykola defines framework, scope, what "production-ready" means
  ↳ CONSTRAINT: Must be realistic — auth, database, user input, API calls
  ↳ CONSTRAINT: Must be vibe-codeable in one weekend (reasonable scope)

PHASE_2 --IS--> VIBE_CODE_WITH_KG
  ↳ ACTION: Dan vibe codes the entire app using AI (Claude Code)
  ↳ CONTEXT_LOADED: This KG — all 8 layers, all 89 relationships
  ↳ ADDITIONAL_CONTEXT: LAYER 2 rules extracted as .cursorrules / CLAUDE.md
  ↳ NO_MANUAL_SECURITY_REVIEW: Pure vibe coding. The KG is the only safety net.
  ↳ DOCUMENTATION: Every prompt, every AI output, timestamped

PHASE_3 --IS--> SCAN_AND_REVIEW
  ↳ ACTION: Mykola scans the codebase with VibeCheck
  ↳ ACTION: Mykola does a manual code review
  ↳ OUTPUT: Full vulnerability report — what was found, severity, location
  ↳ METRIC: Time to first finding (scanner) vs. time to first finding (human review)

PHASE_4 --IS--> ANALYZE_AND_PUBLISH
  ↳ ACTION: Compare findings against LAYER 2 predictions
  ↳ QUESTION_1: Which LAYER 2 rules held? (KG prevented the vulnerability)
  ↳ QUESTION_2: Which LAYER 2 rules failed? (KG was loaded but AI still took shortcuts)
  ↳ QUESTION_3: What NEW vulnerability types appeared that aren't in the KG yet?
  ↳ OUTPUT: Updated KG with new findings → graph grows from the experiment
  ↳ OUTPUT: Joint blog post / dev.to article documenting results

What the Experiment Proves (Either Way)

IF_SCANNER_WINS --MEANS--> KG_NEEDS_MORE_RULES
  ↳ INSIGHT: Which specific rules failed? Were they in the KG but ignored by the model?
  ↳ INSIGHT: Or were they missing from the KG entirely? (new vulnerability class)
  ↳ ACTION: Add new findings to the KG → it gets stronger for next time
  ↳ VALUE_FOR_MYKOLA: Real data on what structured context can and can't prevent
  ↳ VALUE_FOR_DAN: Real data on where graph constraints break down

IF_GRAPH_WINS --MEANS--> STRUCTURED_CONTEXT_PREVENTS_VULNS
  ↳ INSIGHT: Which specific rules held? How did the model use the constraints?
  ↳ INSIGHT: Did the model reference LAYER 2 rules explicitly during generation?
  ↳ ACTION: Package the winning KG as a downloadable security context file
  ↳ VALUE_FOR_MYKOLA: Proof that prevention > detection → informs V2 product direction
  ↳ VALUE_FOR_DAN: Proof that structured .md context changes AI output quality

EITHER_WAY --PRODUCES--> VALUABLE_DATA
  ↳ NOVEL: Nobody has tested KG-loaded vibe coding against a security scanner before
  ↳ PUBLISHABLE: Joint article on dev.to, real numbers, real code, real scan results
  ↳ GROWTH: Both audiences see the collaboration — Mykola's security community + Dan's graph community
  ↳ KG_GROWS: New findings feed back into the graph → self-improving system

Weak Signals That Inform the Challenge

MYKOLA_SAID --WEAK_SIGNAL--> "single-prompt codebases were worst offenders"
  ↳ IMPLICATION: If Dan builds iteratively with the KG loaded at each step, this risk drops significantly
  ↳ CHALLENGE_ADVANTAGE: KG-loaded iterative builds ≠ single-prompt generation

MYKOLA_SAID --WEAK_SIGNAL--> "security context doesn't carry over into every decision"
  ↳ IMPLICATION: Prose instructions degrade across tasks. Typed relationships persist.
  ↳ CHALLENGE_ADVANTAGE: The KG's RULE fields are explicit constraints, not suggestions

MYKOLA_SAID --WEAK_SIGNAL--> "projects with human review during generation had fewest issues"
  ↳ IMPLICATION: The KG acts as a 24/7 reviewer that never loses attention
  ↳ CHALLENGE_ADVANTAGE: Every file write checked against LAYER 2 rules automatically

MYKOLA_SAID --WEAK_SIGNAL--> "adding security to prompts feels like workarounds"
  ↳ IMPLICATION: He already knows prose instructions aren't enough
  ↳ CHALLENGE_ADVANTAGE: This isn't prose instructions — it's a traversable constraint graph

MYKOLA_SAID --WEAK_SIGNAL--> "nobody's really solved the second gap yet"
  ↳ IMPLICATION: The gap between "works in dev" and "safe in prod" is the challenge itself
  ↳ CHALLENGE_QUESTION: Does structured context close the gap, or just narrow it?

GRAPH TOTALS (including challenge layer)

| Metric | Value |
| --- | --- |
| Entities | 72 |
| Typed relationships | 112 |
| Layers | 9 |
| Reasoning paths | ~400 |

LAYER 10: THE MATH — A/B Test Results

I ran this before posting. same model (ChatGPT, extended thinking). same prompt. two different inputs.

Test A: flat article pasted in (no graph)

Test B: this knowledge graph pasted in

TEST_A --INPUT--> FLAT_ARTICLE (~4,300 tokens, linear prose)
  ↳ OUTPUT: Generic 11-item OWASP checklist
  ↳ DOMAIN_SPECIFIC_INSIGHTS: 0
  ↳ NAMED_VULNERABILITY_PATTERNS_FROM_SOURCE: 0
  ↳ FAILURE_MODE_REFERENCES: 0
  ↳ MYKOLA_DATA_CITATIONS: 0
  ↳ STRUCTURE: Flat numbered list
  ↳ VERDICT: Model ignored the article. Fell back to OWASP training data.

TEST_B --INPUT--> THIS_KNOWLEDGE_GRAPH (~3,200 tokens, structured)
  ↳ OUTPUT: 3-tier decision framework (don't ship / checklist / ship gate)
  ↳ DOMAIN_SPECIFIC_INSIGHTS: 16 identifiable graph traversals
  ↳ NAMED_VULNERABILITY_PATTERNS_FROM_SOURCE: 6 (open CORS, overprivileged accounts, hardcoded creds, unsanitized passthrough, f-string SQL, secrets in source)
  ↳ FAILURE_MODE_REFERENCES: 2 (single-prompt worst offenders, dev→prod gap)
  ↳ MYKOLA_DATA_CITATIONS: 4 direct
  ↳ STRUCTURE: Ship/no-ship binary gate — production deployment decision
  ↳ VERDICT: Model traversed the graph. Output was domain-specific, evidence-backed, actionable.

the numbers

COMPARISON --METRIC--> TOKEN_EFFICIENCY
  ↳ TEST_A: 4,300 tokens in → 0 domain-specific insights out
  ↳ TEST_B: 3,200 tokens in → 16 domain-specific insights out
  ↳ RESULT: Fewer tokens in, categorically more actionable output

COMPARISON --METRIC--> GRAPH_TRAVERSALS
  ↳ TEST_A: 0 identifiable traversals (model used training data, not the article)
  ↳ TEST_B: 16 identifiable traversals from 1 question
  ↳ RESULT: 16:0 — the flat version didn't score lower. it didn't score.

COMPARISON --METRIC--> PATH_UTILIZATION
  ↳ AVAILABLE_PATHS_IN_GRAPH: ~170 (structural — 47 relationships × 3.6 avg chain depth)
  ↳ PATHS_ACTIVATED_BY_1_QUESTION: 16 (measured — 9.4% utilization)
  ↳ PATHS_IN_FLAT_PROSE: ~1 (linear narrative, top to bottom)
  ↳ IMPLICATION: ~10-11 different questions would traverse all 170 paths

the 170x claim — what it means

CLAIM --IS_NOT--> "170x better output"
CLAIM --IS--> "170x more reasoning paths per page of context"

SAME_TOKEN_BUDGET --COMPARISON-->
  ↳ FLAT_PROSE_PAGE: ~3,200 tokens → ~1 narrative thread → model reads top to bottom
  ↳ GRAPH_MD_PAGE: ~3,200 tokens → ~170 traversable paths → model starts anywhere, follows any edge
  ↳ RATIO: 170 paths : 1 thread = 170x structural advantage per page

ONE_QUESTION_PROVED -->
  ↳ GRAPH: activated 16 of 170 available paths (9.4%)
  ↳ FLAT: activated 0 of ~1 available path (0%) — fell back to training data
  ↳ RESULT: same page of context, same token budget, 170 paths vs 1 thread, 16:0 on one question

THE_INVITATION -->
  ↳ "paste both versions into any model. ask 10 different questions. count the domain-specific insights each version produces. that's the proof — not my claim, your experiment."

GRAPH TOTALS (final)

| Metric | Value |
| --- | --- |
| Entities | 78 |
| Typed relationships | 124 |
| Layers | 10 |
| Reasoning paths | ~450 |

78 entities. 124 relationships. 10 layers. ~450 reasoning paths. grew from 31 entities in v1 to 78 in final — each refinement compounded, never restarted.

the A/B test proved 16:0 on one question. the graph has 170 paths per page. test it yourself — paste this into any model, ask it anything about vibe-coded security, and compare against the flat article. the graph does the reasoning. you ask the questions.

built with graphify.md — domain knowledge → portable .md

Mykola Kondratiuk

okay this is a proper setup - you ship it, then I scan it. I'm in. build it this weekend and drop the link, VibeCheck will run a full scan and we post the results. tbh I'm curious whether context-loaded generation actually holds up against the patterns I found or just looks cleaner on the surface.

Jace Altgen

This is super interesting to validate statistically. Definitely, having security principles in prompts helps, and so does having a semblance of an idea of the architecture of your application. Now, I get that that's exactly what some people can't define, because they aren't experts. But let's say you're writing a piece of python tooling: tell it which version you want, that you want pyproject.toml instead of requirements.txt, some file structure you'd typically want, that you want to use pydantic for settings. And more importantly, what you don't want it to do. In my experience this increases the chance of the output being easier to validate and check, because it aligns with my mental model of how this tool should probably work. Knowing where to look becomes easier.

Mykola Kondratiuk

yeah the "you have to know what you want" problem is real. i ran into this exact wall - if you can articulate the architecture you already kind of know what you are doing. the devs who get burned the hardest are the ones who dont have that mental model yet, so they just accept whatever the model outputs without questioning it. the pyproject.toml type of constraint helps a lot though, having a checklist of non-negotiables you paste in every time basically forces a floor on quality

Daniel Yarmoluk

I found Dev.to last night. I don’t even know why I’m here, it feels like being rejected by people who certainly don’t help me, but again, I’m trying my best to show what I think, in my opinion, has merit.

Mykola Kondratiuk

hey, for what it's worth - your takes have merit. the graph / context angle is genuinely interesting, it pushed me to think differently. dev.to is worth sticking around for

Daniel Yarmoluk

Or better yet, why don’t I give you a compressed md file of a graph database of your request on here later this evening? Then you go work with it and get more context out of it, like I want to

Mykola Kondratiuk

sounds good

Sloan, the sloth mascot
Comment deleted
 
Mykola Kondratiuk

hey, probably worth deleting that - public comments are indexed. keep it here
