There is a question spreading quietly through the software industry right now. Hiring managers are asking it. Hackathon judges are asking it. Open source maintainers are asking it.
"Did you actually build this?"
Nobody has a good answer yet. I tried to build one — and Gemma 4 is the reason it worked.
The Problem Nobody Is Talking About
AI-assisted development has gone mainstream fast. Cursor, Copilot, Lovable, Bolt — developers are shipping real products with significant AI assistance, and that is genuinely fine. The tools exist; the skill is in using them well.
But a trust gap is forming. When you submit a project to a hackathon, post a repo on GitHub, or show work in a job interview, reviewers are increasingly skeptical. The portfolio that used to signal skill now also signals a question mark.
The current answer to "did you build this?" is essentially: trust me.
That is not good enough. And trying to detect AI-generated code is an arms race nobody will win — models improve, detection fails, repeat.
I wanted a different approach: instead of detecting AI, document the human.
What I Built: VibeSafe
VibeSafe is a browser-based code auditor that takes your project files, sends them to Gemma 4 31B in a single prompt, and returns a Proof of Authorship certificate — a structured document that separates your human architectural decisions from AI-assisted patterns.
The output looks like this:
HUMAN ARCHITECTURAL DECISIONS
━━━━━━━━━━━━━━━━━━━━━━━━━━━━
1. Separation of database connection into factory function
Evidence: get_db() pattern used consistently across modules
2. Deliberate stateless token design
Evidence: login() returns user ID directly — intentional tradeoff
3. Privacy-first architecture: no backend, direct browser API calls
Evidence: all external calls made from frontend, no server layer
These are the things only I would have decided. The boilerplate React hooks and Tailwind utility classes? Gemma 4 flags those as AI-assisted. The product decisions, the tradeoffs, the specific ways the pieces connect? Those are mine.
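Getting the project files in the first place is just the browser's File API. A minimal sketch of the intake step — the `readProjectFiles` helper and the extension allowlist are my illustration, not necessarily VibeSafe's exact code:

```javascript
// Read user-selected files (e.g. from <input type="file" multiple>)
// into { name, content } pairs, skipping anything that isn't source text.
const TEXT_EXTENSIONS = ['.js', '.jsx', '.ts', '.tsx', '.css', '.html', '.json', '.md'];

async function readProjectFiles(fileList) {
  const files = [];
  for (const file of fileList) {
    // Ignore binaries: images, fonts, lockfile noise, etc.
    if (!TEXT_EXTENSIONS.some(ext => file.name.endsWith(ext))) continue;
    files.push({ name: file.name, content: await file.text() });
  }
  return files;
}
```

Because everything stays in the browser, the source never touches a server you control — which is exactly the privacy-first tradeoff the certificate later calls out.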
Why Gemma 4 Specifically
I tried this concept with smaller models first. It did not work.
The problem is that distinguishing intent from output requires holding the entire codebase in mind simultaneously. A model that has only seen half your files cannot tell you whether your architectural choices are consistent across the project. It cannot spot that you made the same deliberate tradeoff in three different places — which is actually the strongest signal of human authorship.
The 262K Context Window Changes the Analysis
Gemma 4's 262K context window means I send everything in one shot:
const combined = files
  .map(f => `\n\n=== FILE: ${f.name} ===\n${f.content}`)
  .join('')

// One prompt. Entire project. Gemma 4 sees everything at once.
const response = await fetch('https://openrouter.ai/api/v1/chat/completions', {
  method: 'POST',
  headers: {
    'Content-Type': 'application/json',
    Authorization: `Bearer ${apiKey}` // the user's free OpenRouter key
  },
  body: JSON.stringify({
    model: 'google/gemma-4-31b-it:free',
    messages: [{ role: 'user', content: buildPrompt(combined) }]
  })
})
No chunking. No lost context. No missed cross-file patterns. The model sees the whole picture before making any judgment — the same way a senior engineer would read a codebase before commenting on it.
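262K tokens is a lot, but not infinite. A rough pre-flight check keeps a huge repo from silently overflowing the window — the 4-characters-per-token heuristic and both function names here are my sketch, not VibeSafe's actual code:

```javascript
// Rough pre-flight budget check. ~4 characters per token is a common
// heuristic for English-heavy source code; it is not exact, but it is
// enough to warn the user before sending a request that cannot fit.
const CONTEXT_TOKENS = 262_000;
const RESERVED_FOR_OUTPUT = 8_000; // leave headroom for the certificate

function estimateTokens(text) {
  return Math.ceil(text.length / 4);
}

function fitsContext(combined) {
  return estimateTokens(combined) <= CONTEXT_TOKENS - RESERVED_FOR_OUTPUT;
}
```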
31B Dense vs the MoE Model
I specifically chose the 31B Dense model over the 26B MoE for this use case.
The MoE model activates ~3.8B parameters per token — it is faster and more efficient, ideal for high-throughput applications. But security analysis needs consistent reasoning quality on every single token. Missing one vulnerability because a parameter set was not activated is worse than slower inference. For a tool that is auditing your code for real risks, I wanted the full model engaged on every decision.
Reasoning Mode for Authorship Detection
The part that surprised me most was how well Gemma 4 handles the authorship question when prompted correctly. Generic "review my code" prompts produce generic answers. But when you ask specifically about intent:
Look for architectural decisions that reflect product thinking.
Look for specific tradeoffs that reveal human judgment.
Distinguish these from patterns that are generic and could be AI-generated.
Gemma 4 produces genuinely insightful distinctions. It identified that my choice to put authorship as the hero card — not security — was a human product decision. It noticed the privacy-first architecture (no backend) as a deliberate tradeoff, not a default. It caught that I reused the same terminal aesthetic across components as a consistent design language.
That is not code review. That is architectural reasoning.
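For reference, a prompt builder along those lines can be very small. The section markers and exact wording below are my sketch of the idea, not VibeSafe's verbatim prompt:

```javascript
// Sketch of a buildPrompt() that frames the authorship question.
// Asking for fixed section headers makes the response easy to parse.
function buildPrompt(combined) {
  return [
    'You are auditing a codebase for proof of authorship.',
    'Look for architectural decisions that reflect product thinking.',
    'Look for specific tradeoffs that reveal human judgment.',
    'Distinguish these from patterns that are generic and could be AI-generated.',
    'Answer with three sections: HUMAN ARCHITECTURAL DECISIONS,',
    'AI-ASSISTED PATTERNS, and Originality score: N/100.',
    '',
    'Codebase:',
    combined,
  ].join('\n');
}
```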
What Running VibeSafe on Itself Taught Me
I ran VibeSafe on its own source code. The results were honest in a way I did not expect.
Human decisions Gemma 4 identified:
- Authorship card as hero feature (product decision, not default)
- Direct browser-to-API architecture (privacy tradeoff)
- Terminal aesthetic as unified design language
- Certificate export as plain text (accessibility over PDF complexity)
AI-assisted patterns it flagged:
- Standard Tailwind utility class combinations
- Boilerplate useState/useEffect patterns
- Generic error boundary structure
Originality score: 74/100
That feels right. A good chunk of the implementation is standard React patterns. But the product decisions — what to build, how to frame it, what matters to the user — those are mine. 74 out of 100 captures that honestly.
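Turning the model's text back into that structure is straightforward if the prompt pins down the section headers. A minimal parser — this assumes the three headers shown in the certificate above, and a real implementation would need more defensive handling:

```javascript
// Minimal parser for a response shaped like the certificate above.
function parseCertificate(text) {
  const score = text.match(/Originality score:\s*(\d+)\s*\/\s*100/i);
  // Extract the text between one section header and the next.
  const section = (header, next) => {
    const start = text.indexOf(header);
    if (start === -1) return '';
    const rest = text.slice(start + header.length);
    const end = next ? rest.indexOf(next) : -1;
    return (end === -1 ? rest : rest.slice(0, end)).trim();
  };
  return {
    humanDecisions: section('HUMAN ARCHITECTURAL DECISIONS', 'AI-ASSISTED PATTERNS'),
    aiPatterns: section('AI-ASSISTED PATTERNS', 'Originality score'),
    originality: score ? Number(score[1]) : null,
  };
}
```

Keeping the certificate as structured text rather than JSON also makes the plain-text export trivial: the model's output is already the document.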
What This Means for Developers Right Now
Open models at Gemma 4's capability level, running on free infrastructure, change what individual developers can build.
Six months ago, this analysis would have required:
- A paid API with expensive per-token costs
- A backend to handle large context requests
- Chunking logic to split codebases into pieces
- Multiple round-trips losing context between calls
Now it is a single fetch() call from a React component. Free. 262K tokens. Full model. No backend.
The barrier between "idea" and "working product" for AI-powered developer tools has dropped significantly. VibeSafe went from concept to working demo in a weekend — not because the engineering is simple, but because Gemma 4 handles the hard part.
The Bigger Picture
The "did you build this?" problem is not going away. It is going to intensify as models improve and AI-assisted development becomes more capable.
But I think the framing of the question is wrong. The interesting question is not "how much did AI write?" — it is "what did the human decide?"
Architecture. Product instincts. Tradeoffs. The specific shape of a solution. These things are still fundamentally human, even when the implementation is AI-assisted. They are also what actually matters in a developer.
VibeSafe is a first attempt at making those decisions visible and documentable. Gemma 4's reasoning capability and context window are what made it possible to build something that actually captures them.
Try It
🔗 Live app: https://vibe-proof-code.lovable.app/
🔗 GitHub: github.com/SimranShaikh20/vibesafe
You need a free OpenRouter API key from openrouter.ai/keys — no credit card, takes 30 seconds.
Run it on your own project. See what Gemma 4 says about what you built.
VibeSafe · Powered by Gemma 4 31B · OpenRouter free tier · Built for the vibe coding era