"Vibe coding" is everywhere. You prompt an AI, it writes your whole project, you ship it.
Last week I reviewed three PRs from vibe-coded projects. All three had hardcoded API keys in the source. Two had no tests. One had a raw eval() on user input.
So I built vibescore.
## What it does

```
pip install vibescore
vibescore .
```
One command. Letter grade from A+ to F. Four dimensions:
| Category | What it checks |
|---|---|
| Security | Hardcoded secrets, SQL injection, eval/exec, insecure defaults |
| Code Quality | Function length, complexity, nesting depth, type hint coverage |
| Dependencies | Pinning, lock files, deprecated packages, known CVEs |
| Testing | Test count vs LOC ratio, coverage setup, CI configuration |
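To make the Security row concrete, here is a rough sketch of the kind of check it implies. This is not vibescore's actual implementation; the pattern, the `SECRET_RE` regex, and the `scan_file` helper are illustrative assumptions, and real rules would need to handle far more cases.

```python
# Hypothetical sketch of a secrets/eval check, NOT vibescore's real code.
import ast
import re
from pathlib import Path

# Rough pattern for obviously hardcoded credentials (assumption: real rules are broader).
SECRET_RE = re.compile(
    r"""(?i)(api[_-]?key|secret|token|password)\s*=\s*['"][^'"]{8,}['"]"""
)

def scan_file(path: Path) -> list[str]:
    """Return human-readable findings for one Python file."""
    findings = []
    source = path.read_text(encoding="utf-8", errors="replace")

    # Text-level pass: hardcoded secrets.
    for lineno, line in enumerate(source.splitlines(), start=1):
        if SECRET_RE.search(line):
            findings.append(f"{path}:{lineno}: possible hardcoded secret")

    # AST-level pass: eval()/exec() calls.
    try:
        tree = ast.parse(source)
    except SyntaxError:
        return findings
    for node in ast.walk(tree):
        if (isinstance(node, ast.Call)
                and isinstance(node.func, ast.Name)
                and node.func.id in {"eval", "exec"}):
            findings.append(f"{path}:{node.lineno}: {node.func.id}() call")
    return findings

if __name__ == "__main__":
    for p in Path(".").rglob("*.py"):
        for finding in scan_file(p):
            print(finding)
```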
## Example output

```
vibescore v0.4.0 — Project Report
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Security      B+  (no hardcoded secrets, 2 eval() calls found)
Code Quality  C   (4 functions >50 lines, low type hint coverage)
Dependencies  A-  (all pinned, lock file present)
Testing       D   (3 tests for 2,400 LOC)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
OVERALL       C+
```
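How the four dimension grades roll up into the overall letter isn't specified in the report, so here is a purely hypothetical sketch of one way to do it, assuming each dimension yields a 0-100 score and equal weights. The cutoffs, weights, and the `to_letter`/`overall` names are all assumptions, not vibescore's real scoring.

```python
# Hypothetical grade rollup; vibescore's actual weighting may differ.
GRADES = [(97, "A+"), (93, "A"), (90, "A-"), (87, "B+"), (83, "B"), (80, "B-"),
          (77, "C+"), (73, "C"), (70, "C-"), (67, "D+"), (60, "D"), (0, "F")]

def to_letter(score: float) -> str:
    """Map a 0-100 score to a letter grade using the cutoffs above."""
    for cutoff, letter in GRADES:
        if score >= cutoff:
            return letter
    return "F"

def overall(scores: dict[str, float]) -> str:
    # Equal weights as an assumption; a real tool might weight Security higher.
    return to_letter(sum(scores.values()) / len(scores))

print(overall({"security": 88, "quality": 74, "dependencies": 91, "testing": 62}))  # C+
```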
## Supported languages
- Python (AST-based analysis)
- JavaScript/TypeScript (regex-based)
- Rust (VC221-VC227: unwrap density, unsafe blocks, doc comments, clone detection)
- Go (VC231-VC237: unchecked errors, goroutine leaks, naked returns, panic in library code)
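For Python, "AST-based analysis" presumably means walking the parsed syntax tree rather than matching text. Below is a minimal sketch of one Code Quality style check (functions longer than 50 lines, matching the threshold in the example report). The `long_functions` helper and the threshold are assumptions for illustration, not vibescore's actual rule.

```python
# Sketch of an AST-based quality check; the 50-line threshold is an assumption.
import ast
from pathlib import Path

MAX_FUNC_LINES = 50

def long_functions(path: Path) -> list[str]:
    """Report functions whose source span exceeds MAX_FUNC_LINES."""
    tree = ast.parse(path.read_text(encoding="utf-8"), filename=str(path))
    findings = []
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            length = (node.end_lineno or node.lineno) - node.lineno + 1
            if length > MAX_FUNC_LINES:
                findings.append(f"{path}:{node.lineno}: {node.name} is {length} lines")
    return findings
```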
## Extra features

- `vibescore --init-ci`: generates a GitHub Actions workflow
- `vibescore --watch`: re-scans on file changes in real time
- `vibescore --dashboard`: historical grade tracking (Streamlit web UI)
- `vibescore --save-history`: saves scan results for trend analysis
- Zero dependencies. 201 tests.
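Given the zero-dependency claim, a watch mode like `--watch` can't lean on a file-watching library; a plausible approach is a stdlib polling loop. This is only a sketch of that idea, and `snapshot`, `watch`, and the `run_scan` callback are hypothetical names, not vibescore's API.

```python
# Hypothetical dependency-free watch loop; run_scan() stands in for the real scanner.
import time
from pathlib import Path

def snapshot(root: Path) -> dict[Path, float]:
    """Map every Python source file to its modification time."""
    return {p: p.stat().st_mtime for p in root.rglob("*.py") if p.is_file()}

def watch(root: Path, run_scan, interval: float = 1.0) -> None:
    """Re-run the scan whenever a file is added, removed, or modified."""
    seen = snapshot(root)
    run_scan(root)                      # initial scan
    while True:
        time.sleep(interval)
        current = snapshot(root)
        if current != seen:
            seen = current
            run_scan(root)

# Example: watch(Path("."), lambda root: print("rescanning", root))
```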
## Comparison
- SonarQube: requires a Java server, complex setup, enterprise pricing
- Codacy/CodeClimate: SaaS, requires account, sends code to servers
- pylint/ruff: lint rules only, no security/testing/dependency analysis, no single grade
- vibescore: one pip install, one command, local-only, zero deps, covers 4 dimensions with a letter grade
GitHub: github.com/stef41/vibescore
PyPI: pypi.org/project/vibescore
Feedback welcome — especially ideas for new check categories or language support.
## Top comments (3)
This hits close to home. I built ShadowAudit for exactly this reason: hardcoded API keys in prompts sent to LLMs are the same problem one layer deeper.

vibescore catches keys before they're committed. ShadowAudit catches them before they reach an AI API at runtime. Different layers, same problem.

Would love to explore whether these could complement each other. A vibescore check that also flags prompts likely to contain runtime secrets would be interesting.

github.com/Jeffrin-dev/ShadowAudit if you want to look at the approach.
This is honestly becoming a real pattern with "vibe coding" workflows: speed goes up, but basic security discipline disappears. Hardcoded API keys in PRs are one of those issues that should never make it past local development, yet they keep slipping through when AI-generated or rapid-prototype code bypasses proper review habits. Building a grader is a smart move because this isn't just a linting problem; it's a workflow-enforcement problem. The real win here is shifting left: catching secrets, insecure configs, and unsafe patterns before code even reaches review.
Ohhh you should go to GitHub, it's the wild west in there...