I code-reviewed 3 "vibe-coded" PRs last week. Every one had hardcoded API keys. So I built a grader.

#javascript #python #showdev #opensource

"Vibe coding" is everywhere. You prompt an AI, it writes your whole project, you ship it.

Last week I reviewed 3 PRs from vibe-coded projects. All three had hardcoded API keys in the source. Two had no tests. One had a raw eval() on user input.

So I built vibescore.

What it does

pip install vibescore
vibescore .

One command. Letter grade from A+ to F. Four dimensions:

Category	What it checks
Security	Hardcoded secrets, SQL injection, eval/exec, insecure defaults
Code Quality	Function length, complexity, nesting depth, type hint coverage
Dependencies	Pinning, lock files, deprecated packages, known CVEs
Testing	Test count vs LOC ratio, coverage setup, CI configuration
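
To give a feel for what a Security check of this kind involves, here is a minimal sketch built on Python's standard `ast` module. It is an illustration only, not vibescore's actual implementation: the function name `find_security_issues`, the `SECRET_HINTS` list, and the message wording are all assumptions made for this example.

```python
import ast

RISKY_CALLS = {"eval", "exec"}
# Hypothetical list of name fragments that suggest a secret
SECRET_HINTS = ("api_key", "apikey", "secret", "token", "password")


def find_security_issues(source: str) -> list[tuple[int, str]]:
    """Return sorted (line, message) pairs for risky calls and secret-like assignments."""
    issues = []
    for node in ast.walk(ast.parse(source)):
        # Direct calls such as eval(x) or exec(x)
        if (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Name)
            and node.func.id in RISKY_CALLS
        ):
            issues.append((node.lineno, f"call to {node.func.id}()"))
        # Assignments of a non-empty string literal to a secret-looking name
        elif isinstance(node, ast.Assign):
            value = node.value
            if (
                isinstance(value, ast.Constant)
                and isinstance(value.value, str)
                and value.value
            ):
                for target in node.targets:
                    if isinstance(target, ast.Name) and any(
                        hint in target.id.lower() for hint in SECRET_HINTS
                    ):
                        issues.append(
                            (node.lineno, f"hardcoded value in {target.id}")
                        )
    return sorted(issues)


sample = 'API_KEY = "sk-123"\nname = "bob"\nresult = eval(user_input)\n'
for line, message in find_security_issues(sample):
    print(line, message)
```

Working on the syntax tree rather than raw text means an `eval` mentioned in a comment or string is not flagged. A name-based heuristic like this one will still miss secrets stored under innocuous names, so treat it as a first pass.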

Example output

vibescore v0.4.0 — Project Report
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
  Security     B+  (no hardcoded secrets, 2 eval() calls found)
  Code Quality C   (4 functions >50 lines, low type hint coverage)
  Dependencies A-  (all pinned, lock file present)
  Testing      D   (3 tests for 2,400 LOC)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
  OVERALL      C+
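
The Code Quality line in the report mentions functions over 50 lines and type hint coverage. As a rough sketch of how such numbers could be computed with the standard `ast` module (again an assumption about the approach, not vibescore's real code; `quality_metrics` and its return keys are hypothetical):

```python
import ast


def quality_metrics(source: str, max_lines: int = 50) -> dict:
    """Count over-long functions and the share of fully annotated functions."""
    functions = [
        node
        for node in ast.walk(ast.parse(source))
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef))
    ]
    # end_lineno is available on AST nodes from Python 3.8 onward
    long_functions = [
        f.name for f in functions if f.end_lineno - f.lineno + 1 > max_lines
    ]

    def fully_annotated(func) -> bool:
        args = func.args
        params = args.posonlyargs + args.args + args.kwonlyargs
        # self/cls are conventionally left unannotated, so skip them;
        # *args and **kwargs are ignored in this simplified version
        params = [p for p in params if p.arg not in ("self", "cls")]
        return func.returns is not None and all(
            p.annotation is not None for p in params
        )

    annotated = sum(fully_annotated(f) for f in functions)
    coverage = annotated / len(functions) if functions else 1.0
    return {"long_functions": long_functions, "type_hint_coverage": coverage}
```

How raw numbers like these map to a letter grade is a separate design decision that this sketch does not attempt to reproduce.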

Supported languages

Python (AST-based analysis)
JavaScript/TypeScript (regex-based)
Rust (VC221-VC227: unwrap density, unsafe blocks, doc comments, clone detection)
Go (VC231-VC237: unchecked errors, goroutine leaks, naked returns, panic in library code)
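
For languages without an AST parser at hand, a regex-based pass is the usual fallback. The sketch below shows what such a check for hardcoded secrets in JavaScript source might look like. The pattern, the 8-character minimum, and the name `scan_js` are assumptions for illustration, not vibescore's actual rules.

```python
import re

# Matches e.g.  const apiKey = "abc12345"  or  token: 'abc12345'
SECRET_PATTERN = re.compile(
    r"""(?ix)
    \b(\w*(?:api_?key|secret|token|password)\w*)   # identifier that looks sensitive
    \s*[:=]\s*                                     # assignment or object property
    (["'])([^"'\s]{8,})\2                          # quoted literal, 8+ characters
    """
)


def scan_js(source: str) -> list[tuple[int, str]]:
    """Return (line number, identifier) for each line that looks like a hardcoded secret."""
    findings = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        match = SECRET_PATTERN.search(line)
        # Values read from the environment are the recommended pattern, so skip them
        if match and "process.env" not in line:
            findings.append((lineno, match.group(1)))
    return findings
```

Regex checks are fast and need no parser, but they work line by line and cannot tell code from comments or multi-line strings, so false positives and misses are expected.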

Extra features

vibescore --init-ci: generates a GitHub Actions workflow
vibescore --watch: re-scans on file changes in real time
vibescore --dashboard: historical grade tracking (Streamlit web UI)
vibescore --save-history: saves scan results for trend analysis
Zero dependencies. 201 tests.

Comparison

SonarQube: requires a Java server, complex setup, enterprise pricing
Codacy/CodeClimate: SaaS, requires account, sends code to servers
pylint/ruff: lint rules only, no security/testing/dependency analysis, no single grade
vibescore: one pip install, one command, local-only, zero deps, covers 4 dimensions with a letter grade

GitHub: github.com/stef41/vibescore
PyPI: pypi.org/project/vibescore

Feedback welcome — especially ideas for new check categories or language support.

Top comments (3)

Twisted-Code'r • Apr 12

This hits close to home: I built ShadowAudit for exactly this reason. Hardcoded API keys in prompts sent to LLMs are the same problem, one layer deeper.

vibescore catches keys before they're committed. ShadowAudit catches them before they reach an AI API at runtime. Different layers, same problem.

I'd love to explore whether these could complement each other. A vibescore check that also flags prompts likely to contain runtime secrets would be interesting.

See github.com/Jeffrin-dev/ShadowAudit if you want to look at the approach.

Laura Ashaley • Apr 11

This is honestly becoming a real pattern with "vibe coding" workflows: speed goes up, but basic security discipline disappears. Hardcoded API keys in PRs are one of those issues that should never make it past local development, yet they keep slipping through when AI-generated or rapid-prototype code bypasses proper review habits. Building a grader is a smart move because this isn't just a linting problem; it's a workflow enforcement problem. The real win here is shifting left: catching secrets, insecure configs, and unsafe patterns before code even reaches review.

Mario Gutierrez • Apr 15

Ohhh you should go to GitHub, it's the wild west in there...