AI code generation is producing more production code than ever. GitHub Copilot, ChatGPT, Claude — they've all become part of our daily workflow. But here's the thing nobody talks about:
AI models reproduce security mistakes.
They've been trained on the open-source ecosystem, and that ecosystem has been making the same errors for decades — hardcoded API keys, SQL injection, eval calls, pickle deserialization. The AI doesn't know it's wrong. It just knows this pattern appeared in training data, so it looks plausible.
That's where truffle-scan comes in.
pip install truffle-scan
truffle-scan .
It's a security scanner that is deterministic (no ML), fast (under 2 seconds for most projects), and aims for zero false positives.
Why Another Security Scanner?
There are plenty of security tools out there: Bandit, Semgrep, SonarQube, Snyk. They're all good at what they do. But they share a few pain points:
- Bandit is Python-only and slow on large codebases
- Semgrep is powerful but has a steep learning curve for custom rules
- Snyk requires a SaaS account for full functionality
- Most tools produce noise — false positives that teams learn to ignore
truffle-scan takes a different approach: simple, fast, opinionated.
- One command to scan an entire project
- Deterministic pattern matching: no ML, and heuristic rules are gated behind a confidence threshold to keep false positives near zero
- Fast scans, typically well under a second on small projects (183ms on a 47-file project)
- CI-ready with JSON output and non-zero exit codes on findings
- Prioritized action plans — fix what matters first
What It Detects
truffle-scan organizes rules into six categories, spanning Python, JavaScript/TypeScript, and Go:
🔴 Credentials
Hardcoded secrets that should never reach your repo:
- AWS Access Keys (AKIA...)
- GitHub tokens (ghp_..., gho_...)
- Stripe API keys (sk_live_..., pk_live_...)
- Private keys (RSA, EC, DSA)
- Generic passwords and API secrets
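To make the idea concrete, here is a minimal sketch of prefix-based secret detection. The regexes and the find_secrets helper are assumptions written for this post from the publicly documented key prefixes; truffle-scan's actual rules may be stricter or looser.

```python
import re

# Illustrative patterns only; not copied from truffle-scan's rules.py.
SECRET_PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "github_token": re.compile(r"\bgh[po]_[A-Za-z0-9]{36}\b"),
    "stripe_live_key": re.compile(r"\b[sp]k_live_[0-9A-Za-z]{24,}\b"),
}


def find_secrets(text):
    """Return (line_number, pattern_name) pairs for every matching line."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for name, pattern in SECRET_PATTERNS.items():
            if pattern.search(line):
                hits.append((lineno, name))
    return hits


# AKIAIOSFODNN7EXAMPLE is the placeholder key used in AWS documentation.
sample = 'region = "us-east-1"\naws_key = "AKIAIOSFODNN7EXAMPLE"\n'
print(find_secrets(sample))  # [(2, 'aws_access_key_id')]
```

Anchoring on fixed prefixes and lengths is what keeps this kind of rule quiet: an ordinary string has to look a lot like a real key before it matches.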
🔴 Code Execution
Functions that allow arbitrary code execution:
- eval(): the classic Python footgun
- exec(): same danger, different name
- Function() constructor (JavaScript)
- os.system(): shell injection waiting to happen
- subprocess.call(..., shell=True): same problem
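For the Python entries in that list, the usual replacements are ast.literal_eval() and subprocess.run() with an argument list. The snippet below is a small sketch of both using only the standard library; the input strings are made up for illustration.

```python
import ast
import subprocess
import sys

# ast.literal_eval only accepts Python literals, so it cannot run code.
config = ast.literal_eval("{'retries': 3, 'hosts': ['a', 'b']}")
print(config["retries"])

try:
    ast.literal_eval("__import__('os').getcwd()")
except ValueError:
    print("rejected: not a literal")

# Passing a list (and no shell=True) means no shell ever parses the argument,
# so the ';' below is just a character in a string.
untrusted = "hello; echo injected"
result = subprocess.run(
    [sys.executable, "-c", "import sys; print(sys.argv[1])", untrusted],
    capture_output=True,
    text=True,
    check=True,
)
print(result.stdout.strip())
```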
🟠 Deserialization
Insecure deserialization can lead to remote code execution:
- pickle.loads() / pickle.load()
- yaml.load() without SafeLoader
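If it's not obvious why pickle is on this list, the following sketch shows the mechanism with a harmless payload. The Payload class is invented for this demo; a real attacker would return something far less friendly than print.

```python
import json
import pickle


class Payload:
    """Harmless stand-in for a malicious object."""

    def __reduce__(self):
        # pickle calls print(...) at load time. A real attack would
        # return a callable such as os.system with a shell command.
        return (print, ("code ran inside pickle.loads",))


blob = pickle.dumps(Payload())
loaded = pickle.loads(blob)  # runs the callable chosen by whoever built the blob
print(loaded)                # None: we got print's return value, not a Payload

# json.loads only ever builds dicts, lists, strings, numbers, booleans and None.
data = json.loads('{"user": "alice", "admin": false}')
print(data)
```

The takeaway is that loading a pickle means running code chosen by whoever produced the bytes, which is why a plain data format such as JSON is the safer default for untrusted input.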
🟡 Injection
Unvalidated user input reaching dangerous sinks:
- Raw SQL queries in Go (db.Raw(...))
- innerHTML assignments (XSS in JavaScript)
- document.write() (XSS)
- Unvalidated request.args / request.form
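The SQL case is the easiest one to demonstrate. The rule above targets Go, but the same mistake and the same fix look like this in Python; the table and input here are made up, and sqlite3 is used only because it ships with the standard library.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, is_admin INTEGER)")
conn.executemany("INSERT INTO users VALUES (?, ?)", [("alice", 1), ("bob", 0)])

user_input = "nobody' OR '1'='1"

# Vulnerable: the input becomes part of the SQL text and rewrites the WHERE clause.
unsafe_sql = f"SELECT name FROM users WHERE name = '{user_input}'"
unsafe_rows = conn.execute(unsafe_sql).fetchall()

# Safer: the driver binds the value, so it is only ever compared as a string.
safe_rows = conn.execute(
    "SELECT name FROM users WHERE name = ?", (user_input,)
).fetchall()

print(len(unsafe_rows), len(safe_rows))  # 2 0
```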
🔵 Crypto & Quality
- Math.random() for security-sensitive contexts
- Overlong lines (>100 chars)
- Deep nesting (depth > 4)
- TODO/FIXME markers
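A nesting check like the one in that list can be approximated with a short AST walk. This is a sketch under my own assumptions: which node types count as a level (NESTING_NODES) and the max_nesting helper are illustrative, not how truffle-scan necessarily measures depth.

```python
import ast

# Assumed definition of "a level of nesting" for this sketch.
NESTING_NODES = (ast.If, ast.For, ast.While, ast.With, ast.Try)


def max_nesting(node, depth=0):
    """Return the deepest level of nested block statements under node."""
    deepest = depth
    for child in ast.iter_child_nodes(node):
        child_depth = depth + 1 if isinstance(child, NESTING_NODES) else depth
        deepest = max(deepest, max_nesting(child, child_depth))
    return deepest


source = """
def f(items):
    for item in items:
        if item:
            while item > 0:
                item -= 1
"""
print(max_nesting(ast.parse(source)))  # 3
```

One caveat of this simple version: an elif is represented as an If nested inside another If, so long elif chains count as deeper than they look.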
How It Works Under the Hood
truffle-scan uses a dual-strategy approach:
AST Analysis (Python)
For Python files, it parses the abstract syntax tree and walks function calls. This is more accurate than regex because it understands the structure of the code:
# scanner.py (simplified)
import ast

def _check_python_ast(self, filepath, code, lines, rule):
    """Return a Finding for every call whose dotted name contains rule.pattern."""
    findings = []
    tree = ast.parse(code, filename=filepath)
    for node in ast.walk(tree):
        if isinstance(node, ast.Call):
            func_name = self._get_call_name(node)
            if rule.pattern in func_name:
                # Found a match: record where it is and how to fix it
                findings.append(Finding(
                    severity=rule.severity,
                    message=rule.message,
                    file=filepath,
                    line=node.lineno,
                    rule_id=rule.rule_id,
                    snippet=lines[node.lineno - 1].strip(),
                    recommendation=rule.recommendation,
                ))
    return findings
The _get_call_name method resolves dotted names like os.system or pickle.loads by walking the AST attribute chain — so it catches import os; os.system(...) without flagging a variable named system that happens to be nearby.
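The real _get_call_name isn't shown here, so the following is a minimal sketch of how such a helper could walk the attribute chain. The standalone get_call_name function is an assumption for illustration, not the project's code.

```python
import ast


def get_call_name(call):
    """Return the callee's dotted name, e.g. 'os.system', or '' if it has none."""
    parts = []
    node = call.func
    # os.path.join is Attribute(Attribute(Name('os'), 'path'), 'join'):
    # collect the attribute names from the outside in.
    while isinstance(node, ast.Attribute):
        parts.append(node.attr)
        node = node.value
    if isinstance(node, ast.Name):
        parts.append(node.id)
        return ".".join(reversed(parts))
    return ""  # e.g. get_handler()(x) or handlers[0](x)


source = "import os\nsystem = 'linux'\nos.system('ls')\nprint(system)\n"
calls = [n for n in ast.walk(ast.parse(source)) if isinstance(n, ast.Call)]
print(sorted(get_call_name(c) for c in calls))  # ['os.system', 'print']
```

Only ast.Call nodes are inspected, so the assignment to a variable named system never shows up as a candidate.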
Regex Patterns (JS/Go/General)
For JavaScript, TypeScript, Go, and cross-language patterns, it uses carefully crafted regex patterns. Each rule is defined as a dataclass:
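from dataclasses import dataclass

@dataclass
class Rule:
    rule_id: str
    name: str
    severity: Severity        # enum defined in models.py
    category: str
    message: str
    pattern: str              # regex source, or a call name for AST rules
    language: str
    is_regex: bool = True
    recommendation: str = ""
    confidence: float = 1.0   # 1.0 = deterministic match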
Each rule carries a confidence field. A value of 1.0 marks a deterministic match ("this is definitely an eval call"), while values below 1.0 mark heuristic matches ("this looks like a password"). The CLI only reports findings whose confidence is at or above the configured threshold, which keeps noise low.
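Here is a rough sketch of how a confidence threshold can gate regex rules. SimpleRule, the two demo patterns, and the 0.8 default are all hypothetical; the post doesn't state truffle-scan's real default threshold.

```python
import re
from dataclasses import dataclass


@dataclass
class SimpleRule:  # hypothetical, trimmed-down stand-in for Rule
    rule_id: str
    pattern: str
    confidence: float = 1.0


RULES = [
    SimpleRule("DEMO001", r"\beval\s*\("),                           # deterministic
    SimpleRule("DEMO002", r"(?i)password\s*=\s*['\"].+['\"]", 0.6),  # heuristic
]


def scan_line(line, min_confidence=0.8):
    """Return ids of rules that match the line and meet the threshold."""
    return [
        rule.rule_id
        for rule in RULES
        if rule.confidence >= min_confidence and re.search(rule.pattern, line)
    ]


print(scan_line("value = eval(user_input)"))                  # ['DEMO001']
print(scan_line('password = "hunter2"'))                      # []
print(scan_line('password = "hunter2"', min_confidence=0.5))  # ['DEMO002']
```

Lowering the threshold trades quiet output for broader coverage, so the same rule set can serve both a strict CI gate and a more exploratory local scan.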
Parallel Scanning
The scanner uses ThreadPoolExecutor to scan files in parallel (default: 8 workers):
# requires: from concurrent.futures import ThreadPoolExecutor, as_completed
with ThreadPoolExecutor(max_workers=self.max_workers) as pool:
    fut_map = {}
    for fp in files:
        lang = language or self._detect_language(fp)
        if not lang:
            continue  # unknown file type: nothing to scan
        fut = pool.submit(self._scan_file, str(fp), lang)
        fut_map[fut] = fp
    # Merge per-file results as each worker finishes
    for fut in as_completed(fut_map):
        file_result = fut.result()
        if file_result:
            for finding in file_result.findings:
                result.add(finding)
            result.lines_scanned += file_result.lines_scanned
Hidden files, __pycache__, node_modules, and build artifacts are skipped by default.
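File discovery with those exclusions could look something like the sketch below. The SKIP_DIRS set, the EXTENSIONS map, and iter_source_files are my assumptions; the scanner's real defaults may differ.

```python
import os
from pathlib import Path

# Assumed skip list and extension map for this sketch.
SKIP_DIRS = {"__pycache__", "node_modules", "build", "dist"}
EXTENSIONS = {".py": "python", ".js": "javascript", ".ts": "typescript", ".go": "go"}


def iter_source_files(root):
    """Yield (path, language) for scannable files under root."""
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune in place so os.walk never descends into skipped directories.
        dirnames[:] = [
            d for d in dirnames if d not in SKIP_DIRS and not d.startswith(".")
        ]
        for name in filenames:
            lang = EXTENSIONS.get(Path(name).suffix)
            if lang and not name.startswith("."):
                yield Path(dirpath) / name, lang
```

Pruning dirnames in place matters for speed: a large node_modules tree is never walked at all, instead of being walked and then filtered out.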
Risk Scoring
Each finding has a severity. The overall project score (0–100) is calculated as:
@property
def score(self) -> int:
    if not self.findings:
        return 0
    raw = sum(f.severity.score_value() for f in self.findings)
    capped = min(raw * 5, 100)
    return capped
Where severity values are: CRITICAL=4, HIGH=3, MEDIUM=2, LOW=1, INFO=0.
The verdict is:
- 0–14: ✅ Safe
- 15–39: ⚠️ Minor Issues
- 40–69: 🔍 Needs Review
- 70–100: 🚨 Dangerous
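To see how the formula and the verdict bands fit together, here is a small worked example. The project_score and verdict functions are standalone re-implementations of the rules described above, written for illustration rather than taken from models.py.

```python
SEVERITY_POINTS = {"CRITICAL": 4, "HIGH": 3, "MEDIUM": 2, "LOW": 1, "INFO": 0}


def project_score(severities):
    """Sum the severity points, multiply by 5, and cap at 100."""
    raw = sum(SEVERITY_POINTS[s] for s in severities)
    return min(raw * 5, 100)


def verdict(score):
    """Map a 0-100 score onto the verdict bands listed above."""
    if score >= 70:
        return "Dangerous"
    if score >= 40:
        return "Needs Review"
    if score >= 15:
        return "Minor Issues"
    return "Safe"


findings = ["CRITICAL", "HIGH", "MEDIUM"]  # 4 + 3 + 2 = 9 points
score = project_score(findings)            # min(9 * 5, 100) = 45
print(score, verdict(score))               # 45 Needs Review
```

Because every point is worth 5, a single critical finding scores 20 (Minor Issues), and five critical findings already hit the cap of 100.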
Getting Started
Installation
pip install truffle-scan
That's it. No dependencies beyond the Python standard library.
Scan a Project
# Scan current directory
truffle-scan .
# Scan a specific directory
truffle-scan /path/to/your/project
# Verbose output — show all findings with code snippets
truffle-scan . --verbose
# JSON output for CI pipelines
truffle-scan . --format json
# Get a prioritized fix plan
truffle-scan . --plan
Example Output
========================================================
Truffle Security Scan Report
========================================================
Verdict : 🚨 Dangerous
Score : 100/100
Files : 47
Duration : 183ms
Issues by severity:
🔴 Critical: 2
🟠 High: 5
🟡 Medium: 3
========================================================
🚨 Dangerous. Found 10 issues across 47 files.
========================================================
Action Plan Mode
The --plan flag goes beyond raw findings — it tells you what to fix first:
========================================================
📋 Truffle Action Plan
========================================================
🔴 Critical — fix immediately
─────────────────────────────────────────────────────
• Hardcoded AWS Access Key ID
config/aws_credentials.py:42 (GEN001)
💡 Rotate this key immediately. Store in AWS Secrets Manager.
• Arbitrary code execution via eval()
scripts/process.py:17 (PY001)
💡 Use ast.literal_eval() or a safer alternative.
🟠 High — fix this sprint
─────────────────────────────────────────────────────
• OS command execution via os.system()
scripts/process.py:21 (PY003)
💡 Use subprocess.run() with a list argument instead of a shell string.
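Conceptually, the plan is just the findings grouped by severity and ordered from worst to least severe. A minimal sketch of that ordering follows, using plain dicts and a hypothetical build_plan helper; the file names and rule IDs are the ones from the sample plan above.

```python
from collections import defaultdict

SEVERITY_ORDER = ["CRITICAL", "HIGH", "MEDIUM", "LOW", "INFO"]

# Findings as plain dicts for the sketch.
findings = [
    {"severity": "HIGH", "file": "scripts/process.py", "line": 21, "rule_id": "PY003"},
    {"severity": "CRITICAL", "file": "scripts/process.py", "line": 17, "rule_id": "PY001"},
    {"severity": "CRITICAL", "file": "config/aws_credentials.py", "line": 42, "rule_id": "GEN001"},
]


def build_plan(items):
    """Return one line per finding, most severe first, then by file and line."""
    grouped = defaultdict(list)
    for item in items:
        grouped[item["severity"]].append(item)
    plan = []
    for sev in SEVERITY_ORDER:
        for item in sorted(grouped[sev], key=lambda i: (i["file"], i["line"])):
            plan.append(f'{sev}: {item["file"]}:{item["line"]} ({item["rule_id"]})')
    return plan


for entry in build_plan(findings):
    print(entry)
```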
CI/CD Integration
Add it to your GitHub Actions workflow with a short job definition:
name: Security Scan
on: [pull_request]
jobs:
  scan:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.11"
      - run: pip install truffle-scan
      - run: truffle-scan . --format json
If the scanner finds any issues, it exits with code 1 — which fails the CI check. Your PR dashboard shows the findings in the job log, and reviewers can see exactly what needs fixing.
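If your CI system isn't GitHub Actions, the same exit-code contract is easy to wrap in a script. The run_gate helper below is a hypothetical sketch, and the two demo commands are stand-ins that simply exit with 0 and 1 so the snippet runs without the scanner installed.

```python
import subprocess
import sys


def run_gate(command):
    """Run a scanner command; return True when it exits with code 0."""
    completed = subprocess.run(command, capture_output=True, text=True)
    if completed.returncode != 0:
        # Surface the scanner's report so it shows up in the job log.
        sys.stderr.write(completed.stdout)
        sys.stderr.write(completed.stderr)
    return completed.returncode == 0


# Stand-in commands; in CI this would be ["truffle-scan", ".", "--format", "json"].
passing = [sys.executable, "-c", "raise SystemExit(0)"]
failing = [sys.executable, "-c", "print('1 finding'); raise SystemExit(1)"]
print(run_gate(passing), run_gate(failing))  # True False
```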
Architecture at a Glance
The codebase is intentionally minimal — about 1,200 lines total:
truffle-scan/
├── truffle_scan/
│ ├── __init__.py # Public API exports
│ ├── cli.py # Argument parser, output formatting
│ ├── models.py # Finding, ScanResult, Severity (dataclasses)
│ ├── scanner.py # Core engine: AST + regex scanning
│ ├── reporter.py # Human-readable report generation
│ └── rules.py # 30+ security rule definitions
├── tests/
│ ├── test_scanner.py
│ └── samples/
│ └── dangerous.py
├── pyproject.toml
└── README.md
Key files:
| File | Purpose |
|---|---|
| rules.py | All security rules: Python, JS, Go, and general multi-language patterns |
| scanner.py | Core engine with Scanner, CodeQualityAnalyzer, and GenericScanner classes |
| models.py | Data structures: Finding, ScanResult, Severity enum |
| reporter.py | Converts raw results into readable reports and JSON output |
| cli.py | CLI entry point with --verbose, --format, --plan flags |
Comparing with Other Tools
| Feature | truffle-scan | Bandit | Semgrep | Snyk |
|---|---|---|---|---|
| Installation | pip install | pip install | pip install | SaaS + CLI |
| Languages | Python, JS, Go | Python only | 30+ | 30+ |
| False positives | Near zero | Moderate | Low | Low |
| Scan speed | ~200ms | ~2-5s | ~1-3s | Varies |
| Custom rules | Coming soon | Via plugins | Native | Limited |
| Offline | ✅ | ✅ | ✅ | ❌ |
| Lockfile scanning | ❌ | ❌ | ❌ | ✅ |
truffle-scan isn't meant to replace these tools for deep analysis — it's meant to be the fast first pass that catches the most common, most dangerous issues before they reach production. Think of it as the ruff of security scanning: opinionated, fast, and zero-config.
Running It Yourself
# Try it on a sample project
git clone https://github.com/yizhuzhu222/TruffleKit-scan.git
cd TruffleKit-scan
pip install -e .
# Scan itself!
truffle-scan .
# Try the dangerous sample
truffle-scan tests/samples/dangerous.py --verbose
The dangerous sample includes intentional vulnerabilities — eval(), os.system(), pickle.loads(), yaml.load(), hardcoded passwords — so you can see all the severity levels in action.
What's Next?
truffle-scan is actively developed. Planned features include:
- Pre-commit hook — catch secrets before they're committed
- Custom rule files — write your own patterns via YAML/TOML
- Semgrep-style matching — structural pattern matching beyond regex
- More languages — Rust, Java, Ruby support
- VS Code extension — inline annotations while you code
Why Open Source?
Security tools should be transparent. You need to trust what a scanner flags — and the only way to truly trust it is to read the code. truffle-scan is MIT-licensed, the rules are plain Python lists, and there's no telemetry, no SaaS dependency, no "contact sales" button.
The CLI is free and offline forever. It's the open-source component of TruffleKit — an AI code security platform for small teams.
Try It Today
pip install truffle-scan
Scans your project, finds issues, tells you what to fix. In under 2 seconds.
GitHub: yizhizhu222/TruffleKit-scan
What security issues has your AI coding assistant generated lately? I'd love to hear your war stories in the comments.