DEV Community: CopperSunDev

What Your Auditor Wants From Your AI Codebase

CopperSunDev — Sat, 18 Jul 2026 18:55:11 +0000

A SOC 2 auditor reviewing your AI-augmented engineering team asks one question first: how do you know what's in the code? If the answer involves a stochastic scanner whose output varies between runs, the audit has a problem. If the answer involves a deterministic scanner whose output is reproducible against a pinned commit, the audit has a clean control.

BrassCoders, the bug scanner for AI coders, is the deterministic scanner. This piece walks through the audit posture an AI-augmented engineering team needs in 2026: the SOC 2 controls BrassCoders maps to, what the YAML looks like to an auditor, and what BrassCoders is honestly not audited for.

This is the sixth piece in BrassCoders's AI Coding Assistant Blind Spots pillar — the audit answer to the technical coverage map laid out in the rest of the series.

Why Auditors Reject Stochastic Output

BrassCoders detection output is bit-for-bit reproducible against a pinned commit, by design. The OSS core runs the same 12 scanners with the same configuration on the same input and produces the same finding list every time. Auditors accept this; auditors reject stochastic scanners that produce different findings on different runs.

The auditor's logic. A control trail in SOC 2 needs evidence. Evidence has to be reproducible — the auditor needs to be able to re-run the check and see the same result. A scanner whose output varies between runs because the model temperature changed, because the training data refreshed, because the prompt template was tuned, cannot produce an audit trail. The auditor cannot tell which version of the scanner ran when the artifact was generated; the auditor cannot reproduce the artifact.

LLM-based PR reviewers are stochastic. Same input, different output. Useful for engineering, problematic for audit. Most teams that adopt LLM-based PR review keep the existing deterministic SAST around for the audit trail — even if the SAST finds fewer real bugs than the LLM, the SAST findings are evidence-grade and the LLM findings are not.

BrassCoders ships both layers and keeps them separate. The deterministic scanner layer (the OSS core, plus the same scanners in Paid) is the audit-grade evidence. The AI-powered enrichment layer (Paid only) is the developer productivity layer. The two layers serve different stakeholders. The auditor reads the scanner layer.

The SOC 2 Controls BrassCoders Maps To

BrassCoders maps cleanly to two SOC 2 Trust Services Criteria: CC8.1 (change management — every change is scanned) and CC7.1 (system operations — vulnerabilities are monitored). The mapping is direct enough that auditors recognize the artifact without extensive explanation.

CC8.1 — Change Management. The control requires the entity to authorize, design, develop, test, approve, and implement changes to infrastructure, data, software, and procedures. The BrassCoders scan output, persisted as a CI artifact tied to the commit hash, evidences the "test" and "approve" steps. Every pull request gets scanned; every scan produces a YAML; every YAML lives in the build artifact store. The auditor walks the trail.

CC7.1 — System Operations. The control requires the entity to monitor system components and operations for anomalies that are indicative of malicious acts, natural disasters, and errors. The BrassCoders scan history, retained per release, evidences ongoing vulnerability monitoring. The auditor reviewing CC7.1 sees a continuous trail of vulnerability detection runs against the production codebase.

The mapping for adjacent frameworks. HIPAA 164.308(a)(1) (security management process): same evidence applies. ISO 27001:2013 A.12.6.1 (technical vulnerability management; control A.8.8 in the 2022 revision): same evidence. PCI DSS v3.2.1 requirement 6.6 (review of public-facing applications for vulnerabilities; carried forward as 6.4.1 and 6.4.2 in v4.0): applicable for the AppSec scope. The same scan trail covers multiple frameworks because the underlying control is the same: continuous deterministic vulnerability monitoring.

AICPA's Trust Services Criteria defines the SOC 2 controls. The mapping above is BrassCoders's reading; your auditor will translate to the specific control language your firm uses.

Reproducibility As An Audit Artifact

BrassCoders treats reproducibility as a load-bearing feature, not an implementation detail. The reproducible-benchmarks page ships scan results against nine open-source codebases with pinned commit hashes; any auditor or engineer can re-run the scan and confirm the same output.

The benchmarks page is at coppersun.dev/benchmarks.

The mechanic. BrassCoders pins every scanner version in pyproject.toml. Pylint, Bandit, Semgrep, detect-secrets — each one is locked to a specific minor version. Pyre is locked to a narrow window because the Pysa model format has been unstable across minors. The version pinning is intentional; bumping a scanner version is a calibration event that requires re-verifying the bundled detection model.

The reproducibility checklist for an audit. (1) Customer pins BrassCoders version in their lockfile. (2) Customer pins the project commit hash. (3) Customer runs brasscoders scan and persists the output as a CI artifact. (4) Auditor runs the same BrassCoders version against the same commit and confirms the output matches. The check completes in minutes against typical codebases.
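Step 4 can be automated. The sketch below is one possible approach, not part of BrassCoders: it compares a digest of the finding sections rather than the raw file, on the assumption that the scan timestamp in the metadata section would make a byte-level comparison fail. The section names come from the schema summary in this article; the functions, and the idea of loading the YAML into a dict first (for example with a YAML parser of your choice), are assumptions for illustration.

```python
import hashlib
import json

# Section names as described in the schema summary; treat them as assumptions
# until checked against the real schema in the BrassCoders repo.
EVIDENCE_SECTIONS = ("critical_issues", "findings", "summary")


def findings_digest(report: dict) -> str:
    """Return a SHA-256 digest over the finding sections of a parsed report.

    The metadata section is deliberately left out because it carries
    run-specific values such as the scan timestamp.
    """
    evidence = {name: report.get(name) for name in EVIDENCE_SECTIONS}
    # sort_keys makes the serialization independent of dict key order.
    canonical = json.dumps(evidence, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def scans_match(original: dict, rerun: dict) -> bool:
    """True when two parsed reports contain identical findings."""
    return findings_digest(original) == findings_digest(rerun)
```

Storing the digest next to the CI artifact gives the auditor a single value to compare after the re-run.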

The honest scope. Reproducibility is bit-exact at the scanner layer. The Paid plan adds an AI-powered enrichment layer on top of the same deterministic scanners, and that enrichment is stochastic at the embedding step: the same scanner findings can produce slightly different rankings across runs because embedding models do not promise bit-exact output. For audit purposes, the underlying scanner findings are what matters, and those are reproducible.

What The YAML Looks Like To An Auditor

BrassCoders writes the scan output as YAML in .brass/ai_instructions.yaml, designed to be human-readable by developers and machine-readable by AI assistants. Auditors read the same file the AI does; the structure is stable, the fields are documented, the schema lives in the BrassCoders repo.

The schema in summary. The top-level YAML contains a metadata section (scan timestamp, BrassCoders version, scanner versions, project root path), a critical_issues array (CRITICAL-severity findings, with file path, line number, finding type, and description per entry), a findings array (all other findings, sorted by relevance), and a summary section (counts by severity, counts by scanner, total finding count). The schema is documented at /legal/privacy for what the scanner persists and at the BrassCoders GitHub repo for the full structure.

The audit-relevant section. Auditors reviewing CC8.1 walk the critical_issues array per release. The expectation is that critical findings are addressed before release, not that critical findings are zero. The audit trail is: critical_issues at scan time → ticket opened → ticket closed → next scan shows critical_issues addressed. The YAML supports the trail.
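The trail described above can be checked mechanically between two releases. This is a minimal sketch under stated assumptions: the entry keys file and type are hypothetical stand-ins for the file path and finding type fields, and matching on those two fields (ignoring line numbers, which shift as code is edited) is a simplification, not how BrassCoders itself tracks findings.

```python
def issue_key(issue: dict) -> tuple:
    """Identify a finding by file and finding type (hypothetical field names)."""
    return (issue["file"], issue["type"])


def compare_releases(previous: list[dict], current: list[dict]) -> dict:
    """Classify critical issues across two scans of consecutive releases."""
    before = {issue_key(issue) for issue in previous}
    after = {issue_key(issue) for issue in current}
    return {
        "addressed": sorted(before - after),   # present earlier, gone now
        "still_open": sorted(before & after),  # present in both scans
        "new": sorted(after - before),         # introduced since the last scan
    }
```

An auditor would expect every entry under still_open to have an open ticket and every entry under addressed to have a closed one.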

What the YAML deliberately does not contain. No raw source code. No PII matched values (the privacy scanner redacts before serialization). No secret values (the secrets scanner records type and hash, never the secret itself). The two-boundary redaction policy is documented in the privacy disclosure. Auditors who confirm the redaction policy confirm that the scan output is safe to retain as a build artifact indefinitely.
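The "type and hash, never the secret itself" rule can be illustrated in a few lines. This is a sketch of the idea only; the function name, the record keys, and the choice of SHA-256 are assumptions, and the real serializer may differ.

```python
import hashlib


def redact_secret(secret_type: str, value: str) -> dict:
    """Build a finding record that identifies a secret without storing it."""
    digest = hashlib.sha256(value.encode("utf-8")).hexdigest()
    # Only the type label and the digest are kept; the value is dropped here.
    return {"type": secret_type, "sha256": digest}
```

The digest lets two scans agree that they saw the same secret without either artifact containing it. A plain hash of a short or guessable value can still be brute-forced, so a keyed hash would be the stricter choice.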

What BrassCoders Isn't Audited For (Honest Scope)

BrassCoders is not a third-party-audited security service, and any claim suggesting otherwise would be inaccurate. The OSS core is a local CLI with no service surface; there is no entity to audit. The Paid plan's gateway is a thin proxy in front of a hosted embedding model; the gateway logic is documented but not third-party attested.

What this means in practice. A customer adopting BrassCoders for SOC 2 evidence uses BrassCoders as a deterministic vulnerability scanner. The deterministic detection layer is the evidence. The customer's own auditors confirm the deterministic behavior by reproducing scans against pinned commits. The customer does not point at a BrassCoders-issued SOC 2 report because there is none.

What BrassCoders provides instead. Full documentation of the data plane in /legal/privacy. Reproducible benchmarks at /benchmarks. Open source code at the detection layer (Apache 2.0). Each of these is independently verifiable by the customer or their auditor. The verifiability replaces the third-party attestation; the auditor can confirm directly rather than rely on a report.

When BrassCoders will pursue SOC 2. The honest answer is when customers ask in writing. The cost of a SOC 2 Type II audit is roughly $50,000-$100,000 for an early-stage SaaS; BrassCoders will pursue the audit when paying customers cite it as a procurement blocker. As of mid-2026, the deterministic-by-design and source-available posture has been sufficient evidence for the customers in our pipeline.

Closing

BrassCoders ships deterministic, reproducible vulnerability detection across 12 scanners, packaged as YAML output that an auditor reads the same way an AI assistant does. The audit posture is the byproduct of the deterministic-by-design architecture; it is not a separate feature.

Install and run a first scan:

pipx install brasscoders
brasscoders scan /path/to/project

Persist .brass/ai_instructions.yaml as a CI artifact. Tie the artifact to the commit hash. When the auditor asks how you know what's in your code, point at the artifact.

The full coverage map of what BrassCoders detects — and what it does not — is in the AI Coding Assistant Blind Spots pillar. The seven blind-spot categories, the seven detector mappings, and the honest scope of each detector are documented there.

Five FastAPI Security Patterns AI Coders Get Wrong

CopperSunDev — Sat, 18 Jul 2026 18:55:06 +0000

AI coding assistants generate working FastAPI endpoints in minutes. They also generate the same five security mistakes, reproducibly, across codebases. Not because FastAPI is insecure — FastAPI's official security documentation covers parameterized queries and authentication patterns clearly. The mistakes come from training data. The AI learned from millions of tutorial examples and StackOverflow answers, many of which use insecure patterns for brevity. The AI predicts what comes next, and what comes next in a tutorial is often f"SELECT * FROM users WHERE id = {user_id}".

BrassCoders detects all five patterns below in a single pass using its OSS-core scanners: Bandit, Yelp's detect-secrets, and three custom detectors for AI-introduced anti-patterns. No paid plan required.

Why AI Assistants Produce These Patterns

BrassCoders was built on a specific observation: AI coding assistants are pattern-completion engines, and the patterns they've seen most often are not always the patterns that are correct. Training data for Python web APIs skews toward tutorials and quick-start examples — code written to be readable, not to be secure. A StackOverflow answer that demonstrates a FastAPI route with an f-string SQL query gets upvoted for clarity. That same code, reproduced at scale by an AI assistant, ships a SQL injection vulnerability into production.

This is structural. Prompting your way around it helps at the margins. The more reliable fix is a scanner that runs after generation and flags the specific patterns that AI assistants get wrong.

Pattern 1 — SQL Injection via f-String

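BrassCoders detects SQL queries constructed via string formatting using Bandit rule B608 (hardcoded_sql_expressions), which fires when a string that looks like a SQL statement is built with f-strings, % formatting, .format(), or concatenation. The rule is pattern-based; it does not trace whether the interpolated value is user-controlled.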

The generated code looks like this:

from fastapi import FastAPI
import databases

app = FastAPI()
database = databases.Database("postgresql://localhost/mydb")

@app.get("/users/{user_id}")
async def get_user(user_id: int):
    query = f"SELECT * FROM users WHERE id = {user_id}"
    return await database.fetch_one(query)

The problem: user_id is typed as int, so FastAPI rejects non-integer input before the query runs, and this particular endpoint is not exploitable as written. The risk is the pattern itself, which teaches the AI (and the developer reading it) that f-strings in SQL are acceptable. The same pattern, copy-pasted to a string-typed field, becomes a textbook SQL injection. The OWASP API Security Top 10 listed injection as API8 in its 2019 edition, and the flaw appears across database queries, shell commands, and template engines.

The fix is parameterized queries:

@app.get("/users/{user_id}")
async def get_user(user_id: int):
    query = "SELECT * FROM users WHERE id = :user_id"
    return await database.fetch_one(query, values={"user_id": user_id})

BrassCoders catches the f-string version via Bandit B608. The parameterized version produces no finding.
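To see what happens once the pattern is copied to a string-typed field, here is a small self-contained sketch. It swaps in the standard-library sqlite3 module for the databases package used above, and the table, helper names, and payload are made up for the demonstration.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    """Create an in-memory database with two example users."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT)")
    conn.executemany(
        "INSERT INTO users (email) VALUES (?)",
        [("a@example.com",), ("b@example.com",)],
    )
    return conn


def find_unsafe(conn: sqlite3.Connection, email: str) -> list:
    # Vulnerable: the value is spliced into the SQL text itself.
    query = f"SELECT id, email FROM users WHERE email = '{email}'"
    return conn.execute(query).fetchall()


def find_safe(conn: sqlite3.Connection, email: str) -> list:
    # Parameterized: the driver passes the value separately from the SQL text.
    query = "SELECT id, email FROM users WHERE email = ?"
    return conn.execute(query, (email,)).fetchall()


if __name__ == "__main__":
    conn = make_db()
    payload = "nobody' OR '1'='1"
    print("unsafe:", find_unsafe(conn, payload))  # every row comes back
    print("safe:  ", find_safe(conn, payload))    # no row matches the literal
```

The payload closes the quoted string and appends a condition that is always true, so the f-string version returns the whole table while the parameterized version treats the payload as an ordinary email value.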

Pattern 2 — Hardcoded Credentials in Config

BrassCoders detects hardcoded credentials — API keys, database connection strings, signing secrets — using Yelp's detect-secrets library, which covers 20+ credential formats including OpenAI keys, AWS access keys, GitHub PATs, Stripe live keys, and high-entropy strings.

The generated code looks like this:

# config.py
DATABASE_URL = "postgresql://admin:hunter2@localhost/mydb"
OPENAI_API_KEY = "sk-proj-abc123..."
SECRET_KEY = "my-super-secret-key-that-nobody-will-ever-guess"

# main.py
from config import DATABASE_URL, OPENAI_API_KEY, SECRET_KEY

This pattern appears constantly in AI-generated code. The AI is optimizing for "give the developer a working example" and hardcodes placeholder values it doesn't expect to be real. Developers copy the config, swap in real values, and commit it. The credential is now in git history permanently, regardless of whether the file is later removed.

The fix moves all secrets to environment variables:

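from functools import lru_cache
from pydantic_settings import BaseSettings, SettingsConfigDict

class Settings(BaseSettings):
    # Values come from environment variables, with a local .env file as fallback
    model_config = SettingsConfigDict(env_file=".env")

    database_url: str
    openai_api_key: str
    secret_key: str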

@lru_cache
def get_settings():
    return Settings()

FastAPI's official security documentation recommends environment-based configuration for all secrets. BrassCoders catches the hardcoded version via detect-secrets. The BaseSettings pattern produces no finding.

Pattern 3 — Shell Injection in Report Runners

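BrassCoders detects subprocess calls with shell=True via Bandit rule B602 (subprocess_popen_with_shell_equals_true), which fires on the call pattern itself and rates it HIGH when the command is not a static string literal. Its companion rule B603 covers subprocess calls made without shell=True.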

The generated code looks like this:

from fastapi import FastAPI
import subprocess

app = FastAPI()

@app.post("/reports/generate")
async def generate_report(report_type: str, output_format: str):
    cmd = f"generate_report --type {report_type} --format {output_format}"
    result = subprocess.run(cmd, shell=True, capture_output=True, text=True)
    return {"output": result.stdout}

A request with report_type=pdf; rm -rf / executes both commands. The shell=True flag tells the OS to parse the string through /bin/sh, which interprets shell metacharacters. Any user-controlled value in that string becomes arbitrary code execution. Bandit's B602 documentation flags this as HIGH severity.

The fix passes arguments as a list, bypassing the shell entirely:

import subprocess
from fastapi import FastAPI, HTTPException

app = FastAPI()

ALLOWED_TYPES = {"pdf", "csv", "html"}
ALLOWED_FORMATS = {"letter", "a4"}

@app.post("/reports/generate")
async def generate_report(report_type: str, output_format: str):
    if report_type not in ALLOWED_TYPES or output_format not in ALLOWED_FORMATS:
        raise HTTPException(status_code=400, detail="Invalid report parameters")
    result = subprocess.run(
        ["generate_report", "--type", report_type, "--format", output_format],
        capture_output=True,
        text=True
    )
    return {"output": result.stdout}

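No shell=True. No string interpolation. BrassCoders catches the original via B602 and flags it HIGH. Bandit's B603 rule may still leave a low-severity reminder on the fixed version to confirm the arguments are validated, which is what the allowlist check does.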
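A quick way to confirm that the list form never hands metacharacters to a shell is to pass a hostile value and look at what the child process receives. In this sketch the Python interpreter stands in for the hypothetical generate_report binary, and echo_argument is an illustrative helper, not part of any library.

```python
import subprocess
import sys


def echo_argument(value: str) -> str:
    """Run a child process with `value` as one argv entry and return what it saw."""
    result = subprocess.run(
        # No shell is involved: each list element becomes exactly one argument.
        [sys.executable, "-c", "import sys; print(sys.argv[1])", value],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


if __name__ == "__main__":
    # The semicolon arrives as literal text instead of separating two commands.
    print(echo_argument("pdf; echo injected"))
```

The allowlist in the fix is still worth keeping, because the called program may itself interpret its arguments in unsafe ways.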

Pattern 4 — Hallucinated Middleware Package

BrassCoders's AI-pattern scanner flags imports of packages that don't exist on PyPI, catching a class of AI-generated bug where the model invents a plausible-sounding package name that has never been published.

The generated code looks like this:

from fastapi import FastAPI
from fastapi_auth_middleware import AuthMiddleware
from fastapi_rate_limiter import RateLimiterMiddleware
from fastapi_security_headers import SecurityHeadersMiddleware

app = FastAPI()
app.add_middleware(AuthMiddleware, secret_key="your-secret-key")
app.add_middleware(RateLimiterMiddleware, calls=100, period=60)
app.add_middleware(SecurityHeadersMiddleware)

Some of these packages exist; some don't; some exist but with different import paths or class names. The AI generates what sounds like it should exist based on patterns it's seen. The code passes a linter because Python import errors are runtime — not compile-time. The developer's IDE may autocomplete based on stubs that don't match the real package API.

The danger goes beyond a runtime ModuleNotFoundError. A package name that doesn't exist on PyPI is an unclaimed namespace. An attacker can publish a malicious package under that name, a technique known as slopsquatting and a close relative of typosquatting and dependency confusion. Any developer who runs pip install fastapi_auth_middleware expecting the AI-suggested package installs the attacker's code instead.

The fix is to verify every package against PyPI before using it, use packages with established maintenance histories, and let a scanner flag unresolvable imports before they reach production. BrassCoders's phantom-API detector cross-references imports against PyPI and flags packages that don't resolve.

Pattern 5 — O(N²) Response Construction

BrassCoders's performance scanner flags two list-building patterns inside Python loops: results += [item] and list.insert(0, item). Both appear consistently in AI-generated list-building code.

The generated code looks like this:

from fastapi import FastAPI
from sqlalchemy.orm import Session

app = FastAPI()

@app.get("/users")
async def list_users(db: Session):
    users = db.query(User).all()
    results = []
    for user in users:
        results += [{"id": user.id, "email": user.email}]
    return results

Or the reversed-list variant:

    ordered = []
    for item in db.query(Item).order_by(Item.created_at.desc()).all():
        ordered.insert(0, item.to_dict())
    return ordered

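insert(0, ...) shifts every existing element right on every iteration, so it is O(N) per call and O(N²) in total. results += [item] is the milder of the two: in CPython it extends the list in place, so the loop stays linear, but it allocates a throwaway one-element list on every pass and sits one edit away from results = results + [item], which copies the whole list each iteration and is quadratic. On a database returning 10 rows, none of this matters. On a database returning 100,000 rows, which is not unusual for a users table or an events feed, the quadratic forms dominate response time.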

The fixes are straightforward:

# Fix for += pattern
results = [{"id": user.id, "email": user.email} for user in users]

# Fix for insert(0, ...) pattern — reverse at the query level
ordered = [item.to_dict() for item in db.query(Item).order_by(Item.created_at.asc()).all()]
# Or, if you must reverse in Python:
ordered = []
for item in items:
    ordered.append(item.to_dict())
ordered.reverse()  # O(N) single reversal, not O(N²) cumulative shifts

BrassCoders's performance scanner catches both += [item] and insert(0, ...) inside loops and reports them as performance findings in the YAML output.
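If you want to check that a rewrite of the insert(0, ...) loop preserves ordering, a small equivalence test helps. The sketch below compares the original pattern with two linear-time alternatives; the function names are illustrative, and collections.deque is offered as an option the article does not mention.

```python
from collections import deque


def reversed_with_insert(rows):
    """The flagged pattern: each insert shifts every element already stored."""
    ordered = []
    for row in rows:
        ordered.insert(0, row)
    return ordered


def reversed_with_reverse(rows):
    """Append in order, then reverse once at the end."""
    ordered = list(rows)
    ordered.reverse()
    return ordered


def reversed_with_deque(rows):
    """A deque supports constant-time insertion at the left end."""
    ordered = deque()
    for row in rows:
        ordered.appendleft(row)
    return list(ordered)
```

All three return the same list for the same input, so swapping the flagged loop for either alternative changes the cost and not the response body.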

Running BrassCoders on a FastAPI Project

BrassCoders covers all five patterns above in a single offline scan, combining Bandit security rules, Yelp's detect-secrets credential patterns, and three custom BrassCoders detectors for AI-introduced anti-patterns. The scan output is .brass/ai_instructions.yaml, a ranked findings list formatted for direct paste into Claude Code or Cursor.

Install and run:

pip install brasscoders
brasscoders --offline scan .

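The --offline flag enforces zero outbound network calls. The OSS core is Apache 2.0 licensed, with no account required and no telemetry. One caveat for Pattern 4: the PyPI existence lookup needs network access, so it runs only when you pass --check-package-hallucination and is switched off under --offline.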

For larger projects, BrassCoders Paid ($12/month) adds a semantic deduplication pass that drops near-duplicate findings across files, reducing the scan output further before triage. Activate it with brasscoders activate after subscribing at coppersun.dev/pricing.

The five patterns above are AI-specific. They're not FastAPI bugs and they're not developer errors — they're what happens when a model trained on tutorial data generates production code without a scanner in the loop.

Add the scanner.

When AI Invents Libraries: Detecting Hallucinated Imports

CopperSunDev — Sat, 18 Jul 2026 18:54:39 +0000

BrassCoders's OSS core scans for hallucinated package imports with a single flag: brasscoders scan --check-package-hallucination. The flag is opt-in because the check is the only path in the OSS core that makes outbound network calls. The pattern matters because AI coding assistants generate imports of packages that don't exist at rates measured by published research, and a typosquatter who registers those hallucinated names as malware turns the hallucination into a supply-chain attack.

This post walks through why hallucinated imports are a different kind of failure than other AI mistakes, what the published research says about how often they happen, how the typosquat attack chain works in practice, and how BrassCoders's hallucination check fits into a CI workflow.

What Is a Hallucinated Import

BrassCoders treats a hallucinated import as a finding type that requires its own detector: an import or require statement referencing a package name that doesn't resolve on the relevant registry. The AI writes from fastapi_users_pydantic import UserManager when only fastapi-users is published on PyPI; the syntax is valid; the IDE doesn't complain; the failure surfaces at install time or — much worse — when a typosquatter has registered the hallucinated name as a malware-bearing package.

Why the hallucination happens: LLMs generate plausible-sounding package names by pattern-matching against the millions of imports they saw during training. A model that's seen fastapi-users 50,000 times and pydantic 200,000 times can confidently produce fastapi_users_pydantic as a "combined" package because the structural pattern looks correct. The model has no grounding in whether the combined package actually exists; it's just producing the most-likely next token given the prompt.

Why static analyzers miss it: traditional Python linters check syntax, type annotations, and code style. None of them verify that an imported name corresponds to a published package on PyPI. Pylint doesn't. Bandit doesn't. Pyflakes doesn't. The verification requires a network call, and most linters are explicitly offline tools.

BrassCoders surfaces hallucinated imports as a security finding, severity high, with the package name and the source line. The downstream AI consumer (or human reviewer) then decides whether to replace the import with a real package, remove the code entirely, or accept the risk.

How Often Models Hallucinate Packages

BrassCoders's position on hallucinated-package rates: the absolute percentage depends on the model and the benchmark, but the rate is high enough across published research that any team relying on AI-generated code without an existence-verification step is shipping a real risk. The specific numbers shift as models improve and as benchmarks change; the pattern is durable.

The most-cited research is Lasso Security's 2024 analysis of LLM-generated package suggestions. Their methodology: prompt commercial code-completion models with realistic refactoring tasks, capture every imported package, and check each one against the relevant registry. The findings document hallucinated-package generation as a consistent failure mode across major models, with rates measurable enough to constitute a category of risk rather than an occasional bug.

The rate is higher in some contexts than others. Long-context refactoring (where the model fills in plausible-adjacent libraries from training memory) tends to produce more hallucinations than short single-file completions. Less-popular languages and frameworks produce higher rates than mainstream ones because the model has less ground-truth coverage. Newer libraries are more dangerous than older ones because the model's training data lags reality.

What the research doesn't tell you: the rate at your specific company, on your specific codebase, with your specific prompt practices. The only way to know is to instrument your AI-augmented PR pipeline with a hallucination check and measure.

Why Hallucinated Imports Are a Supply-Chain Risk

BrassCoders treats hallucinated imports as supply-chain vulnerabilities, not stylistic errors. The attack chain is straightforward: a malicious actor monitors AI-generated code (via published GitHub repositories, AI-tool usage research, or direct model behavior testing), identifies frequently-hallucinated package names, registers those names as typosquatting packages on PyPI or npm with embedded malware, and waits. When AI-generated code reaches a developer's environment or a CI runner that does pip install or npm install, the typosquat package gets fetched and its install-time hooks execute.

This isn't theoretical. PyPI's security advisory feed and the npm security advisories database both track typosquatting as one of the largest attack surfaces in their ecosystems, with hundreds of confirmed malicious-package incidents per year predating AI tools. AI hallucinations widen that attack surface by handing the typosquatter a list of target names: instead of guessing what common typos developers make, the attacker watches the LLMs themselves and registers the names the LLMs invent.

The malware delivery, once you accept the attacker has the package up, can be anything that runs at install time: credential exfiltration from ~/.aws/credentials or ~/.ssh/, persistent backdoors, cryptominers, ransomware seeds. The Python ecosystem's setup.py (executed when installing from a source distribution) and the JavaScript ecosystem's preinstall/install hooks both run arbitrary code at install time by design, and there's no sandboxing. Snyk's Vulnerability Database maintains an actively-updated catalog of confirmed malicious packages, a useful reality check for any team that hasn't yet wired up supply-chain scanning.

Mitigation requires catching the hallucination before the install. By the time pip install runs, the attacker has already won — running the hallucination check at code review time (or as a pre-commit hook) is the only reliable defense.

How BrassCoders's Check Works

BrassCoders's package-hallucination check parses every import statement in your scanned files, extracts the package name, and queries the target registry (PyPI's JSON API at pypi.org/pypi/<package>/json, npm's registry at registry.npmjs.org/<package>, or Go's pkg.go.dev/<module>) to verify the package exists. A non-200 response is a hallucination signal; a 200 means the package is registered on the registry.

The check is opt-in via the --check-package-hallucination CLI flag because it's the only path in the OSS core that makes outbound network calls. BrassCoders's default posture is offline-first: scans complete with zero network traffic unless you explicitly turn on this one check. The flag respects --offline: passing --offline overrides the opt-in back to off.

What gets sent over the wire: the bare package name (fastapi-users, lodash, github.com/spf13/cobra). No source code, no surrounding context, no project name, no telemetry. PyPI sees an ordinary metadata request for that package name and nothing else.

What gets returned: a JSON object from PyPI or npm if the package exists, a 404 if it doesn't. BrassCoders discards the JSON content and only retains the existence signal. The package name plus the existence boolean are what go into the finding record.

What happens with private packages: if your code imports an internal package that isn't published to the public registry (e.g. mycompany_auth), the check will flag it as hallucinated. This is a known false-positive pattern. Mitigation: pass --internal-packages mycompany_auth,mycompany_billing to whitelist names the check should skip. Or simply don't enable the check on repos that import a lot of private packages.

How to Use It in CI

BrassCoders's hallucination check fits into a CI pipeline as a pre-merge gate. Run brasscoders scan --check-package-hallucination against your branch and fail the build if any finding type matches HALLUCINATED_IMPORT. The added latency is 100-500ms per imported package depending on registry network latency, which usually adds 5-30 seconds to a typical scan.

A minimal GitHub Actions workflow:

name: BrassCoders scan
on: [pull_request]
jobs:
  scan:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with: { python-version: '3.12' }
      - run: pipx install brasscoders
      - run: brasscoders scan --check-package-hallucination .
      - run: |
          if grep -q "HALLUCINATED_IMPORT" .brass/security_report.yaml; then
            echo "Hallucinated package import detected. Fail the build."
            exit 1
          fi

The same shape works in GitLab CI, CircleCI, Jenkins, or any other runner; the only requirement is a Python 3.10+ environment and network access to PyPI/npm. The check cannot work inside an air-gapped environment, so use a regular brasscoders scan (without the flag) for those cases.

What about catching hallucinations at code-generation time, before they hit a PR at all? That's the right end-state. Until your AI assistant of choice runs hallucination verification on its own output (and most don't yet), the pre-merge scan is the load-bearing defense.

The supply-chain attack chain that hallucinated imports enable is the kind of thing that gets a security org calling at 11pm. The mitigation is a one-line CI step. The trade-off isn't close.

Install BrassCoders with pipx install brasscoders and add the check to your PR pipeline. For the broader context on AI code review failure modes, see AI Code Review: The Practical Guide for 2026.

The Secrets Your AI Assistant Might Leak (And How to Catch Them)

CopperSunDev — Sat, 18 Jul 2026 18:54:34 +0000

BrassCoders detects around 20 canonical secret formats — AWS access keys, GitHub personal access tokens, OpenAI keys, Stripe live keys, JWT tokens, PEM-encoded private keys, Anthropic keys, Slack tokens, NPM auth tokens, SendGrid keys, Mailgun keys, DigitalOcean tokens, Twilio credentials — plus a high-entropy fallback that catches the formats it hasn't seen before. The reason this matters: AI coding assistants embed secret-shaped strings in generated config files, example scripts, and test fixtures more often than developers expect, and the secret in your AI-generated .env.example looks just like a real secret to a downstream attacker.

This post covers why AI tools generate credential-shaped output, what specific formats are worth scanning for, how entropy-based detection catches the formats nobody has cataloged yet, and how BrassCoders's two-boundary redaction prevents the detector itself from becoming a leak surface.

Why AI Generates Code With Embedded Secrets

BrassCoders treats embedded-secret hallucinations as a known AI failure mode, not an edge case. Models that have seen millions of code examples in training have also seen millions of API keys, tokens, and passwords embedded in those examples; in the absence of an explicit safety policy at generation time, the model reproduces credential-shaped strings as plausible filler when generating config files, example scripts, or test fixtures.

The pattern happens in three recurring contexts. First, example code generation: ask the AI for "an example .env file for a Stripe + SendGrid + AWS setup" and the model produces a file with what look like Stripe live keys (starting sk_live_), SendGrid keys (starting SG.), and AWS access keys (starting AKIA). The keys are syntactically valid — the model knows the format perfectly because it saw thousands of real ones in training — and a junior developer might commit the file without realizing the AI invented credentials that look real enough to fool security automation.

Second, test fixture generation: ask the AI to "write a test that verifies authentication" and the model fabricates a JWT token, a PEM-encoded RSA private key, or a session cookie value. The fabricated credentials are usually high-entropy and structurally correct. Some get committed to repos as "test data" and end up triggering security scanners weeks later.

Third, completion-context leakage: an AI assistant operating on a partial file may complete a config line based on patterns it saw nearby. If your real .env is open in the same editor session and the AI has context, the completion may include credential-shaped strings borrowed from that context — sometimes the real credentials, sometimes a hallucinated variant that's been altered just enough to look plausible. GitGuardian's annual State of Secrets Sprawl report has tracked AI tooling as a contributing pattern in the growing rate of public-repo secret leaks.

The mitigation is structural: scan every diff for credential-shaped strings before it's committed, treat every hit as a CRITICAL finding, and never assume the developer noticed.

The 20+ Secret Formats BrassCoders Detects

BrassCoders ships with detection coverage for around 20 canonical secret formats, drawn from a combination of an established upstream secret-detection library and BrassCoders-specific patterns added for the formats AI tooling tends to fabricate most often.

The upstream library is Yelp's detect-secrets, which provides the historical-format coverage; the BrassCoders additions target the API ecosystems where AI hallucinations cluster. The full active list as of release 2.0.4:

Cloud provider credentials: AWS access keys (AKIA..., 20 chars), AWS secret access keys (40-char base64), DigitalOcean personal access tokens, Twilio account SIDs and auth tokens.
Developer-tooling credentials: GitHub personal access tokens (classic ghp_... and fine-grained github_pat_...), NPM auth tokens, Anthropic API keys, OpenAI API keys.
Payment and messaging credentials: Stripe live keys (sk_live_...) and publishable keys, SendGrid keys (SG....), Mailgun keys, Slack bot tokens and webhook URLs.
Cryptographic material: PEM-encoded private keys (RSA, DSA, EC, OpenSSH), JWT tokens (three-segment base64 with eyJ... prefix), and PGP private key blocks.
Web app credentials: Generic high-entropy strings in environment-variable assignments, hardcoded password literals, and common session secret patterns.

Each pattern is a regex tuned to match the format's canonical structure. The patterns are intentionally conservative: it's better to occasionally flag a non-secret high-entropy string for human review than to miss a real credential. False positives cost a developer 30 seconds of dismissal; false negatives cost a credential rotation and a security postmortem.
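
As a rough illustration of what format-specific matching looks like, the sketch below compiles a few regexes for prefixes named in the list above. These are not the patterns BrassCoders or detect-secrets actually ship: the lengths and character classes are assumptions based on commonly published formats, and `PATTERNS` and `find_secret_types` are hypothetical names.

```python
import re

# Illustrative patterns only. Lengths and character classes are assumptions,
# not the rules BrassCoders or detect-secrets actually ship.
PATTERNS = {
    "AWS_ACCESS_KEY": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "GITHUB_PAT_CLASSIC": re.compile(r"\bghp_[A-Za-z0-9]{36}\b"),
    "STRIPE_LIVE_KEY": re.compile(r"\bsk_live_[A-Za-z0-9]{24,}\b"),
    "PRIVATE_KEY": re.compile(
        r"-----BEGIN (?:RSA |DSA |EC |OPENSSH )?PRIVATE KEY-----"
    ),
}


def find_secret_types(text: str) -> list[str]:
    """Return the sorted names of every pattern that matches somewhere in text."""
    return sorted(
        name for name, pattern in PATTERNS.items() if pattern.search(text)
    )


# AKIAIOSFODNN7EXAMPLE is the placeholder key used in AWS documentation.
sample = "AWS_ACCESS_KEY_ID=AKIAIOSFODNN7EXAMPLE\nDEBUG=true\n"
print(find_secret_types(sample))  # ['AWS_ACCESS_KEY']
```

The helper returns only the pattern name, never the matched text, which keeps the example in line with the redaction behavior described later in this post.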

The upstream detect-secrets library handles a broader range of historical formats (Slack legacy tokens, Datadog keys, deprecated AWS formats) that BrassCoders picks up automatically when detect-secrets ships an update. BrassCoders's value-add is the AI-tooling-tuned subset: the formats that LLMs hallucinate or borrow from context most often.

How Entropy-Based Detection Works

BrassCoders's entropy-based detector catches secret-shaped strings that don't match any cataloged format. The detector computes Shannon entropy on each candidate string; values above a threshold (typically 4.5 bits per character) signal randomness consistent with credentials rather than human-readable text.

Shannon entropy is a well-established information-theoretic measure of randomness in a string. A coin-flip sequence has entropy of 1 bit per flip. A uniformly random 256-character base64 string has entropy close to 6 bits per character, the ceiling for a 64-symbol alphabet. A human-readable English sentence has entropy around 1-2 bits per character because letters follow predictable patterns (Q is usually followed by U, common bigrams dominate). Real credentials sit toward the high end of that range: they need to be high-entropy enough to resist guessing, which puts them squarely in the band the entropy detector flags.

The implementation in BrassCoders: for each string literal in scanned code, the detector computes the Shannon entropy. If the entropy exceeds the threshold AND the string appears in a suspicious context (right side of an = in a config file, value of a key named password / secret / token / key / api_*, hardcoded in source as a constant), BrassCoders surfaces it as a finding. The context check matters because uniformly-random data appears in plenty of legitimate places — UUIDs, content hashes, binary blobs — that aren't credentials.
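
The two-part check described above (entropy plus context) can be sketched in a few lines. This is a minimal illustration, not BrassCoders's code: the threshold constant and key-name hints simply mirror the prose, and `shannon_entropy` and `is_suspicious` are hypothetical names.

```python
import math
from collections import Counter

# Assumed for illustration: the threshold and key-name hints mirror the prose,
# not BrassCoders's actual configuration.
ENTROPY_THRESHOLD = 4.5
SUSPICIOUS_KEY_HINTS = ("password", "secret", "token", "key", "api_")


def shannon_entropy(value: str) -> float:
    """Empirical Shannon entropy of value, in bits per character."""
    if not value:
        return 0.0
    counts = Counter(value)
    total = len(value)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def is_suspicious(key: str, value: str) -> bool:
    """Flag a value only when it is high-entropy AND assigned to a secret-like key."""
    key_matches = any(hint in key.lower() for hint in SUSPICIOUS_KEY_HINTS)
    return key_matches and shannon_entropy(value) > ENTROPY_THRESHOLD


print(shannon_entropy("abcd"))  # 2.0
print(is_suspicious("API_TOKEN", "changeme"))  # False
```

One property of this simple estimator is worth keeping in mind: the empirical entropy of a string of length n can never exceed log2(n), so a string needs at least 23 distinct characters to cross a 4.5-bit threshold. A detector built this way would need a lower threshold or a different estimator for short tokens.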

What entropy detection catches that pattern matching misses: rotated formats (when a vendor changes their key format, the old regex misses but the entropy is unchanged), internal-format credentials (your company's custom JWT, session-token format, or internal SSO bearer that no public regex knows about), and the credentials AI assistants invent that look novel but high-entropy. Upstream, detect-secrets covers this ground with its high-entropy string plugins and its keyword detector; BrassCoders tunes the thresholds for AI-tooling-shaped output specifically.

How BrassCoders Redacts at Two Boundaries

BrassCoders redacts secrets at two boundaries: the scanner masks matched values at detection time, and the YAML writer strips known-sensitive metadata keys at serialization time. The redaction is the same whether output is being written locally to .brass/ or sent to the BrassCoders Paid plan's hosted gateway.

The scanner boundary handles the immediate masking. A credit-card-shaped number like 4111-1111-1111-1111 gets masked to 4111****1111 before any BrassCoders-owned object holds the full value. An AWS access key gets the type recorded but never the value — the finding includes type: AWS_ACCESS_KEY and a file path, not the literal AKIA... string. JWT tokens, PEM-encoded private keys, and other high-entropy credentials get similar treatment: the type and location persist; the secret value does not.

The YAML writer boundary is the belt-and-suspenders layer. For any finding whose type is PRIVACY or whose detector is on the secret-leak allowlist, the writer strips matched_text, code_snippet, context_line, raw_match, and context from the serialized output. This catches anything the scanner forgot — if a future scanner forgets to mask at the source, the writer's allowlist still strips it before serialization.

Why two boundaries: defense-in-depth. The single-layer pattern (mask at the source) fails the moment any scanner is added or modified without the mask logic being updated. The two-layer pattern means the writer is a known choke point through which every finding flows, regardless of which scanner produced it. Adding a new secret-detecting scanner doesn't require touching the writer; existing redaction applies automatically.
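
A minimal sketch of the two boundaries, under stated assumptions: the stripped key names come from the description above, while `mask_card_number`, `strip_sensitive`, and the `secret_detector` flag (standing in for the allowlist lookup) are hypothetical and not BrassCoders's actual API.

```python
# Keys the prose says the YAML writer strips; everything else is hypothetical.
SENSITIVE_KEYS = {
    "matched_text", "code_snippet", "context_line", "raw_match", "context",
}


def mask_card_number(raw: str) -> str:
    """Scanner boundary: keep the first and last four digits, mask the rest."""
    digits = "".join(ch for ch in raw if ch.isdigit())
    return f"{digits[:4]}****{digits[-4:]}"


def strip_sensitive(finding: dict) -> dict:
    """Writer boundary: drop sensitive keys from privacy and secret findings."""
    if finding.get("type") == "PRIVACY" or finding.get("secret_detector", False):
        return {k: v for k, v in finding.items() if k not in SENSITIVE_KEYS}
    return dict(finding)


finding = {
    "type": "AWS_ACCESS_KEY",
    "file": "config/settings.py",
    "secret_detector": True,  # hypothetical stand-in for the allowlist check
    "matched_text": "<value a scanner forgot to mask>",
}
print(mask_card_number("4111-1111-1111-1111"))  # 4111****1111
print(strip_sensitive(finding))
```

The point of the second function is that it does not care which scanner produced the finding. Any finding that reaches serialization passes through the same filter, even if its scanner skipped the masking step.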

The data-handling guarantees: the same redacted output goes to disk and to the gateway. No "this version is for our servers, that version is for the local file" split. Whatever you can read in .brass/security_report.yaml is exactly what the gateway sees. For the full data-flow detail, see What BrassCoders Sends to Its Servers (And What It Doesn't).

How to Set Up Secret Scanning in CI

BrassCoders's secret scanning runs by default with every brasscoders scan; no flag needed. CI integration is a small addition: run the scan, grep the report, and fail the build if any HARDCODED_SECRET, HARDCODED_CREDENTIAL, or PRIVATE_KEY findings appear.

A minimal GitHub Actions workflow:

name: BrassCoders secret scan
on: [pull_request]
jobs:
  scan:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with: { python-version: '3.12' }
      - run: pipx install brasscoders
      - run: brasscoders scan .
      - run: |
          if grep -q "HARDCODED_SECRET\|HARDCODED_CREDENTIAL\|PRIVATE_KEY" .brass/security_report.yaml; then
            echo "Secret detected in code. Failing build."
            cat .brass/security_report.yaml
            exit 1
          fi

For pre-commit hooks, the same pattern in .git/hooks/pre-commit:

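#!/bin/sh
# Scan the working tree without network access; discard console output.
brasscoders scan --offline . > /dev/null
# -E keeps the alternation portable across GNU and BSD grep.
if grep -Eq "HARDCODED_SECRET|HARDCODED_CREDENTIAL|PRIVATE_KEY" .brass/security_report.yaml; then
  echo "Secret detected. Rejecting commit. Review .brass/security_report.yaml"
  exit 1
fi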

The --offline flag in the pre-commit hook keeps the scan from triggering any network calls (including the optional package-hallucination check). For pre-merge in CI, drop --offline if you want both checks running.

What this defends against: every secret-shaped string the AI puts in your diff, every example credential a junior developer copy-pastes from generated code, every test fixture that contains a too-real-looking token. What it doesn't defend against: secrets already in git history (use gitleaks or detect-secrets in scan-historical mode for that), secrets stored elsewhere on disk, or secrets passed via legitimate channels like environment variables that aren't in source.

The hardest secret to detect is the one that looks legitimate enough that nobody questions it. AI-generated credentials are designed (by training) to look exactly like real credentials. Scanning for them is the only reliable defense.

Install BrassCoders with pipx install brasscoders and run brasscoders scan against your project. For the broader context on AI code review failure modes, see AI Code Review: The Practical Guide for 2026.

What Leaves Your Machine When an LLM Reviews Your Code

CopperSunDev — Sat, 18 Jul 2026 18:54:07 +0000

When you paste a diff into Claude.ai and ask for a review, you know what's happening: your source code is traveling to Anthropic's API. The same thing happens with automated LLM review tools — just without the paste.

This is not an argument against AI code review. It's a fact about what category of tool it is, and that fact has consequences for compliance, cost, and automation.

What the Payload Contains

BrassCoders's generation-mode benchmark used claude-sonnet-4-6 to review its own generated code. The API call for each review transmitted the full source file — typically 20–60 lines of Python in the benchmark, but in production use it's common to send entire files, diff context, and adjacent imports.

Most AI code review tools send one or more of:

The raw source file or diff under review
Import-resolution context (other files the reviewed code depends on)
Project-level context to improve accuracy (README, type stubs, surrounding modules)

The payload is unmodified source code in plaintext. It travels over TLS, but once it reaches the provider's infrastructure it is subject to that provider's data-retention and model-training policies. Enterprise agreements and zero-data-retention tiers exist, but they require explicit opt-in and are not the default.

The Regulatory Gap

For many codebases, the question isn't whether to send code — it's whether they're allowed to.

A health-tech team whose test fixtures include PHI-adjacent patterns can't route those files through an external API on every commit without a signed Business Associate Agreement covering the review tool. A financial system under PCI-DSS processing scope has limits on what data can leave its network perimeter. A defense contractor under ITAR or CMMC has explicit restrictions on foreign-national data access and cross-boundary transmission.

These aren't edge cases. They're standard constraints for any team in a regulated industry. BrassCoders's --offline flag takes the data-egress question off the table in all three cases: zero bytes leave the machine, and scanner findings stay local.

The Practical Gap for Proprietary Code

Outside regulated industries, most production codebases carry some proprietary constraint — an NDA, a trade-secret designation on the core algorithm, an investor agreement that limits disclosure, or simply a strong preference not to transmit novel IP to a third party's training pipeline.

Sending a Python file containing a novel pricing algorithm or a proprietary trading strategy to an external API on every commit is a legal and business-risk decision, not a technical one. Many teams make it implicitly, because the developer typed the diff into a browser tab and asked Claude. A systematic per-commit review tool requires a deliberate decision about what data classification rules apply.

BrassCoders's offline mode sidesteps the decision entirely. No data leaves. No agreement needed. The scan runs in CI on every commit, returning deterministic findings, with the same compliance posture as running flake8.

What the Paid-Plan Enrichment Actually Sends

For teams who want AI-powered enrichment and have no egress constraints, BrassCoders Paid sends a tightly scoped payload through the BrassCoders gateway — not raw source code.

The payload contains:

Already-redacted scanner findings (credentials, PII, and high-entropy strings are stripped at the scanner layer before transmission)
A project signature of at most 7,500 characters, derived from README content, the package manifest, top-level directory names, and entrypoint filenames

Raw source never leaves the machine under the Paid plan either. The BrassCoders data handling page documents the exact fields and their derivation.
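
To make the size cap concrete, here is a hedged sketch of how a signature like the one described above could be assembled. The function name, section labels, and ordering are assumptions for illustration; only the 7,500-character limit and the four input sources come from the description.

```python
MAX_SIGNATURE_CHARS = 7_500  # the cap stated above


def build_project_signature(
    readme: str,
    manifest: str,
    top_level_dirs: list[str],
    entrypoints: list[str],
) -> str:
    """Join project metadata into one string and truncate it to the cap.

    Short metadata goes first so that a long README is what gets cut,
    not the directory or entrypoint names (an assumed design choice).
    """
    sections = [
        "DIRS: " + ", ".join(sorted(top_level_dirs)),
        "ENTRYPOINTS: " + ", ".join(sorted(entrypoints)),
        "MANIFEST:\n" + manifest,
        "README:\n" + readme,
    ]
    return "\n".join(sections)[:MAX_SIGNATURE_CHARS]


signature = build_project_signature(
    readme="x" * 10_000,
    manifest='[project]\nname = "demo"',
    top_level_dirs=["tests", "src"],
    entrypoints=["main.py"],
)
print(len(signature))  # 7500
```

Nothing in the inputs is source code: the function sees names and metadata only, which is the property the payload description relies on.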

The Automation Consequence

Beyond compliance, there's a practical automation gap. LLM-based code review is invoked. Someone has to ask. On the commits where nobody asked — the Friday afternoon push, the hotfix, the one-line change that "obviously doesn't need review" — nothing runs.

BrassCoders's generation-mode benchmark tracked whether the model issued proactive warnings after generating code that contained bugs. It warned in 0 out of 6 tasks. The model wrote an O(N²) loop and moved on. It only flagged the problem when the benchmark script sent an explicit review prompt.

A pre-merge gate has to run on every commit, not on the commits someone chose to review. brasscoders --offline scan in a GitHub Actions step runs on every push and every pull request. No invocation, no configuration per commit, no budget approval per run.

What to Run

The two tools do different jobs. A BrassCoders scan on every commit catches the deterministic, rule-matchable problems — performance anti-patterns, secrets, PII leaks, interprocedural taint flows — locally, for free, automatically. An LLM reviewer invoked by a developer on pull requests catches intent-level problems, naming issues, and the logic bugs that require reasoning to find.

The gate runs automatically. The conversation happens by choice. Neither substitutes for the other.

pip install brasscoders
brasscoders --offline scan /path/to/project

The 2026 AI-Coding Market in Five Numbers

CopperSunDev — Sat, 18 Jul 2026 18:54:01 +0000

Five numbers describe the AI-coding market in 2026, and together they make one argument: the question is no longer whether your team ships AI-generated code, but what catches it before it merges.

Gartner projects that 75-90% of enterprise software engineers will use AI code assistants by 2028. Microsoft reports 4.7 million paid Copilot seats. Cursor crossed $2 billion in ARR. Claude Code revenue runs into the billions. And across professional developers, weekly AI-assistant usage now sits near universal. Here is what each number means, sourced to the primary disclosure, and why the trend hands the advantage to whoever owns the detection layer underneath.

The Adoption Curve Is Already Decided

BrassCoders treats Gartner's 75-90%-by-2028 projection as the clock on the whole category: when nearly every enterprise engineer ships AI-generated code, the deterministic check underneath stops being a nice-to-have. Gartner's April 2024 forecast put adoption at roughly 14% then, climbing to 75-90% of enterprise software engineers within four years.

That is not a gentle ramp. It is a 5-to-6x expansion of the population writing code with an assistant, inside a single planning horizon. The Gartner projection is the number a team lead can put in front of a CTO without it being questioned, because the audience already accepts the source.

The strategic read: every team will be a team shipping AI-generated code. The differentiator stops being whether you use AI and becomes what you run between the assistant and main.

4.7 Million Developers Are Paying For Copilot

BrassCoders treats Microsoft's reported 4.7 million paid GitHub Copilot subscribers as the highest-credibility adoption data point in the category, because it comes from a regulatory filing rather than a marketing page. The number appears in Microsoft's FY26 Q2 results, filed in January 2026.

Why the source matters: an SEC disclosure carries legal weight a press release does not. When you cite Copilot adoption to a skeptical stakeholder, the Microsoft 10-Q filing is the version that survives scrutiny. Secondary coverage rounds the figure or restates it; the filing is the primary record.

And 4.7 million is only the paid, GitHub-platform-default slice. It excludes the free tier, the IDE-native assistants, and the terminal-native tools. The real population writing AI-assisted code is larger than any single vendor's subscriber count.

The Market Is Bigger Than One Vendor

BrassCoders reads the Cursor and Claude Code numbers as proof the market is structural, not a single-vendor bubble: Anysphere disclosed Cursor passing $2 billion in annual recurring revenue in early 2026, and Anthropic's Claude Code revenue runs into the billions annualized.

Cursor reached that figure after crossing $500M ARR in mid-2025, a roughly 4x climb in well under a year.

Three vendors, three distribution shapes: GitHub-platform-default (Copilot), IDE-native premium (Cursor), and terminal-native premium (Claude Code). That spread is the working model for which assistant a developer is using when they reach for a scanner. No single tool dominates, which means the detection layer underneath has to be assistant-agnostic.

The combined paid revenue across these three alone clears several billion dollars a year. A market that size does not contract because the code has bugs. It keeps growing, and the bugs grow with it.

High Adoption Is Not High Quality

BrassCoders treats the gap between adoption and quality as the entire reason the category exists: the same span that drove Copilot to 4.7 million paid seats also produced a measured rise in AI-attributed vulnerabilities and documented efficiency regressions in generated code. Adoption answers "how many developers." It says nothing about "how correct."

These are independent axes. A developer can adopt an assistant, ship more code faster, and ship more bugs at the same time. Published efficiency benchmarks find AI-generated code reaches only a fraction of expert-level performance even when it passes every test, and AI-attributed CVE counts rose sharply through early 2026. The volume went up; the per-line quality did not follow.

This is the seam. As adoption approaches saturation, the marginal risk moves from "are we using AI" to "what is the AI silently getting wrong." The teams that win the next phase are the ones that instrument that seam.

What Five Numbers Mean For Your Pre-Merge Gate

BrassCoders is the deterministic check that sits in the seam these numbers describe: it scans AI-generated code locally, on every commit, and produces the same findings every run, with no source sent to an API. A market heading to 75-90% adoption is a market where every pull request carries AI-generated risk, and a hand-invoked LLM review only runs on the commits someone remembers to paste.

The adoption curve is set, and the budget argument follows from it. When a CTO has already accepted the Gartner number, the follow-on question is: if every engineer is shipping AI-generated code by 2028, what runs between the assistant and production? A pre-merge gate that is free, local, and automatic scales with the curve. A per-commit API review does not.

pip install brasscoders
brasscoders --offline scan /path/to/project

The five numbers point one direction. The market decided it will write code with AI. What it has not decided, and what is still open to whoever shows up with the right tool, is what catches the code before it ships.

AI Code Review Policy for Copilot Teams

CopperSunDev — Sat, 18 Jul 2026 18:53:35 +0000

The first time an auditor asks "what's your process for reviewing AI-generated code?" and you don't have an answer, you realize the gap. Most teams have informal practices — "we review it like any other PR" — but no written policy. That worked when AI assistants generated occasional snippets. It breaks down when 40% of the diff is AI-generated.

A written policy does two things: it gives your team a consistent standard to enforce, and it gives your auditor something to read. This post walks through the three gates a policy needs, shows the tooling and config for each, and ends with a one-page template you can adapt today.

Quick Navigation

Why "Review It Like Any Other PR" Fails
The Three Gates a Policy Should Define
Gate 1 — Pre-Commit
Gate 2 — CI Enforcement
Gate 3 — Human Review Threshold
The Audit Trail
A One-Page Policy Template

Why "Review It Like Any Other PR" Fails

AI-generated code fails in categories that human-written code rarely does: hallucinated package imports that install at pip-install time and fail at runtime, hardcoded credentials in example blocks that reviewers miss because the pattern looks intentional, and O(N²) performance anti-patterns that are invisible on small datasets. Standard PR review norms weren't designed to catch these.

Stack Overflow's Developer Survey has tracked a widening gap between AI tool adoption and developer trust in that output. Developers are using AI coding tools faster than teams are building governance for them. The informal review norm — "another pair of eyes on the diff" — assumes the reviewer knows what failure modes to look for. With AI-generated code, the failure modes shifted and the review norms didn't.

Three things change when a significant share of your code comes from an AI assistant. Reviewers lose calibration: when every function looks plausible, the reviewer's "this feels wrong" detector degrades. The blast radius of a bad pattern grows: AI assistants repeat their own errors consistently, so one bad pattern tends to appear across many files. And the paper trail disappears: "the model generated this" isn't an explanation a security team can evaluate.

A policy replaces the calibration loss with explicit gates.

The Three Gates a Policy Should Define

A code review policy for an AI-augmented team needs three defined gates: a pre-commit check that runs locally before a developer pushes, a CI gate that must pass before a pull request can merge, and a human review threshold that specifies when a human is required rather than optional. Without all three, the policy has gaps.

A pre-commit gate catches the worst problems before they enter the repository. A CI gate makes passing the scan a merge prerequisite, not a suggestion. A human review threshold handles the cases that a pattern scanner doesn't have context to evaluate — logic errors, architectural decisions, and anything touching authentication or money. Each gate handles a different failure class. Removing any one of them leaves a category uncovered.

Gate 1 — Pre-Commit

BrassCoders runs a local, offline scan before each commit succeeds, catching critical findings — hardcoded secrets, insecure subprocess calls, hallucinated imports — before they enter the repository. The OSS core makes zero outbound network calls, so the check works on every developer machine without an account. It adds roughly 10 to 30 seconds to the commit flow.

Wire it up with pre-commit:

# .pre-commit-config.yaml
repos:
  - repo: local
    hooks:
      - id: brasscoders-scan
        name: BrassCoders offline scan
        language: system
        entry: brasscoders --offline scan
        pass_filenames: false
        # pre-commit 3.2+ names this stage "pre-commit"; older releases call it "commit"
        stages: [pre-commit]

Install BrassCoders from PyPI (pip install brasscoders) and activate the hooks with pre-commit install. The hook runs brasscoders --offline scan on every commit. Critical findings exit with code 1 and block the commit. Non-critical findings write to .brass/ for review.
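
The blocking rule described above (critical findings fail the commit, everything else is logged) can be sketched as a small function. The finding shape, the `MEDIUM` severity label, and the function names are assumptions for illustration, not BrassCoders's internals.

```python
def split_findings(findings: list[dict]) -> tuple[list[dict], list[dict]]:
    """Separate findings that block the commit from those that are only logged."""
    blocking = [f for f in findings if f.get("severity") == "CRITICAL"]
    logged = [f for f in findings if f.get("severity") != "CRITICAL"]
    return blocking, logged


def gate_exit_code(findings: list[dict]) -> int:
    """Exit code for the gate: 1 if anything blocks, otherwise 0."""
    blocking, _ = split_findings(findings)
    return 1 if blocking else 0


# Hypothetical findings; only the CRITICAL label comes from the text above.
findings = [
    {"severity": "CRITICAL", "type": "HARDCODED_SECRET", "file": "settings.py"},
    {"severity": "MEDIUM", "type": "PERFORMANCE", "file": "report.py"},
]
print(gate_exit_code(findings))  # 1
```

A non-zero exit code is all the pre-commit framework needs to abort the commit, which is why the hook definition above requires no extra logic.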

Policy language to include:

Pre-commit gate: All commits to shared branches must pass a BrassCoders offline scan.
Critical findings block the commit. Non-critical findings are logged to .brass/ for review.

Gate 2 — CI Enforcement

BrassCoders exits with code 1 on any CRITICAL finding, which fails the GitHub Actions step — and paired with a branch protection rule requiring that check to pass before merge, the gate is enforceable rather than advisory. The .brass/detailed_analysis.yaml artifact uploads with each run and serves as the audit trail.

GitHub Actions configuration:

# .github/workflows/brass-scan.yml
name: BrassCoders Scan

on:
  pull_request:
    branches: [main, develop]

jobs:
  brass-scan:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Set up Python
        uses: actions/setup-python@v5
        with:
          python-version: "3.12"

      - name: Install BrassCoders
        run: pip install brasscoders

      - name: Run BrassCoders scan
        run: brasscoders --offline scan

      - name: Upload scan artifact
        if: always()
        uses: actions/upload-artifact@v4
        with:
          name: brass-analysis
          path: .brass/
          retention-days: 90

After the workflow runs once, add a branch protection rule in GitHub: Settings > Branches > Branch protection rules > Require status checks to pass before merging, then add BrassCoders Scan / brass-scan as a required check. Reference: GitHub's branch protection documentation.

Policy language to include:

CI gate: All pull requests must pass the BrassCoders CI scan before merge.
Branch protection rules require this check. Override requires an explicit exception logged in the PR description.

Gate 3 — Human Review Threshold

Automated gates catch rule-based patterns. Human review catches logic errors, architectural concerns, and context a scanner doesn't have. A policy should define when human review is required — not optional.

The minimum threshold: any PR touching authentication, payment handling, or data persistence requires human approval from a senior engineer, regardless of automated scan results. A passing scan doesn't mean the logic is sound.

Change type	Automated scan required?	Senior review required?
Internal tooling, no user data	Yes	No
API endpoint changes	Yes	Yes
Auth / session handling	Yes	Yes (mandatory)
Payment / billing code	Yes	Yes (mandatory)
Data migration	Yes	Yes + DBA sign-off

The right column answers the question independently of what the scan found. A CRITICAL finding in auth code requires both the scan to pass (after the finding is resolved) and a senior approval. A clean scan on auth code still requires a senior approval. These aren't redundant: the scan catches pattern violations; the human catches the things a pattern scanner has no visibility into.

Set this threshold in your GitHub branch protection rules by requiring specific code owners to review PRs that touch specified path patterns. A CODEOWNERS file maps directory paths to required reviewers:

# CODEOWNERS
/src/auth/           @your-org/senior-engineers
/src/payments/       @your-org/senior-engineers
/src/migrations/     @your-org/senior-engineers @your-org/dba-team
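
To show how path-based review requirements resolve, here is a simplified sketch of the mapping above. It matches plain directory prefixes and unions the results, which is not how GitHub evaluates CODEOWNERS (GitHub uses gitignore-style patterns and the last matching rule wins). `REVIEW_RULES` and `required_reviewers` are hypothetical names.

```python
# Hypothetical rules mirroring the CODEOWNERS example above.
REVIEW_RULES = {
    "src/auth/": {"@your-org/senior-engineers"},
    "src/payments/": {"@your-org/senior-engineers"},
    "src/migrations/": {"@your-org/senior-engineers", "@your-org/dba-team"},
}


def required_reviewers(changed_paths: list[str]) -> set[str]:
    """Union of reviewer teams for every rule whose prefix matches a changed path."""
    teams: set[str] = set()
    for path in changed_paths:
        for prefix, owners in REVIEW_RULES.items():
            if path.startswith(prefix):
                teams |= owners
    return teams


print(sorted(required_reviewers(["src/migrations/0007_add_index.py"])))
# ['@your-org/dba-team', '@your-org/senior-engineers']
```

Because the three example directories do not overlap, the prefix simplification gives the same answer GitHub would for these paths. A repository with nested rules would need the real last-match semantics.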

The Audit Trail

The .brass/detailed_analysis.yaml file generated by each CI run contains every finding with its file path, line number, severity, scanner source, and evidence string — and uploaded as a GitHub Actions artifact, it gives an auditor a reproducible, timestamped record of what was scanned, what was found, and what passed the gate.

BrassCoders runs 12 scanners — Bandit, Pylint, Pyre/Pysa (taint analysis), Semgrep, ast-grep, detect-secrets from Yelp, plus six custom detectors for secrets, privacy/PII, AI-generated patterns, performance, content moderation, and JavaScript/TypeScript. The detailed_analysis.yaml reflects the union of their findings, with each finding tagged to its originating scanner. An auditor can trace any finding back to the specific pattern that fired it.

The artifact retention in the Actions YAML above is set to 90 days. Match this to your organization's minimum retention requirement. Most SOC 2 programs ask for 90 days of scan history; some require a year. Adjust the retention-days value accordingly. The same commit hash plus the same BrassCoders version always produces the same output, so the artifact is reproducible as well as timestamped.
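
One way an auditor or engineer could check the reproducibility claim is to hash the stored artifact and the report from a fresh re-run at the same commit and scanner version. The sketch below assumes the report contains no run-specific fields such as timestamps; if it does, those would have to be removed before comparing. The function names are hypothetical.

```python
import hashlib


def report_digest(report_bytes: bytes) -> str:
    """SHA-256 hex digest of a serialized scan report."""
    return hashlib.sha256(report_bytes).hexdigest()


def is_reproduced(original: bytes, rerun: bytes) -> bool:
    """True when a re-run produced a byte-identical report."""
    return report_digest(original) == report_digest(rerun)


# Stand-in bytes; in practice these would be read from the stored artifact
# and from .brass/detailed_analysis.yaml after re-running the scan.
stored = b"findings: []\n"
fresh = b"findings: []\n"
print(is_reproduced(stored, fresh))  # True
```

Recording the digest alongside the commit hash and BrassCoders version gives the audit trail a compact value that can be compared without opening the report itself.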

One more thing worth noting: the BrassCoders OSS core is Apache 2.0 licensed. The CI scan runs entirely offline. No data leaves the machine during a CI scan. If your auditor asks what data leaves the environment, the answer for the OSS core is: nothing.

A One-Page Policy Template

Copy and adapt this for your organization. Replace bracketed placeholders with your specifics.

# AI-Augmented Code Review Policy

**Version:** 1.0 | **Effective:** [DATE] | **Owner:** [TEAM LEAD / VP ENG]

## Scope

This policy applies to all code commits where AI coding tools — including
GitHub Copilot, Cursor, Claude Code, and similar — were used to generate
or substantially modify code.

## Pre-Commit Gate

All commits to shared branches must pass a BrassCoders offline scan
(`brasscoders --offline scan`). Critical findings block the commit.
Non-critical findings are logged to .brass/ for developer review before
the next commit.

## CI Gate

All pull requests must pass the CI BrassCoders scan before merge.
Branch protection rules require this check.

Exception: hotfixes during an active incident may bypass with an explicit
PR comment documenting the exception and the reason. A follow-up PR
resolving the bypassed findings must be opened within [X] business days.

## Human Review Requirements

- Routine changes with no user data: 1 peer review
- API endpoint changes: 1 peer review + 1 senior engineer review
- Authentication, session handling, payment code, data persistence:
  1 senior engineer approval (mandatory, regardless of scan result)
- Data migrations: senior engineer approval + DBA sign-off

## Audit Trail

Each CI scan uploads .brass/detailed_analysis.yaml as a GitHub Actions
build artifact. Retention: 90 days minimum (adjust to meet your
compliance requirement).

## Exceptions

Any bypass of a required gate must be documented in the pull request
description with the exception reason and the name of the approving
engineer.

## Review Cycle

This policy is reviewed [quarterly / annually] or following any
security incident attributable to AI-generated code.

The template covers the three gates, defines explicit human review triggers, and specifies the audit artifact and its retention. It doesn't try to enumerate every AI tool or every failure mode — that's the scanner's job. The policy defines what must pass, when a human must sign off, and what gets saved for the auditor.

Start by wiring up the CI gate. It's the highest-leverage change: it makes the scan a merge prerequisite rather than a suggestion, and it generates the audit artifact automatically on every PR. The pre-commit hook and the human review threshold add depth — but the CI gate is the one that closes the loop.

BrassCoders OSS core is on PyPI: pip install brasscoders. The source is at github.com/CopperSunDev/brasscoders. Apache 2.0, no account required for the offline scan.

Add BrassCoders to GitHub Actions

CopperSunDev — Sat, 18 Jul 2026 18:53:29 +0000

Every commit your AI coding assistant writes goes into your codebase without a static gate. Claude Code and Cursor generate code fast, but neither one runs automatically on push, neither one fails the build when a critical finding appears, and neither one produces identical output for identical input. That gap is what a CI security scanner fills. Adding BrassCoders to GitHub Actions closes it.

Why a Static Gate Belongs in CI (Not Just in Your Editor)

BrassCoders running in GitHub Actions produces deterministic, repeatable output on every push — the same codebase produces the same findings every time, regardless of which developer ran the scan, which machine it ran on, or what prompt produced the code under review.

AI-assisted code review in your editor is conversational. You ask, it suggests, you accept or reject. It runs when you think to run it. A CI gate is the opposite: automatic on every push, non-interactive, and capable of blocking a merge. These two things solve different problems. The editor assistant helps you write code faster. The CI gate catches what slipped through.

The distinction matters most for AI-generated code specifically. An AI assistant that generated a function isn't also auditing it against Bandit's security rules or checking it for hardcoded secrets via detect-secrets. That work belongs to a separate tool running outside the editor's context.
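One way to check the reproducibility claim yourself is to hash the scan output from two runs against the same commit and compare the results. The sketch below is not part of BrassCoders. digest_directory is a hypothetical helper, and it assumes the files under .brass/ carry no timestamps or run IDs (if they do, strip those fields before hashing).

```python
import hashlib
from pathlib import Path


def digest_directory(root: Path) -> str:
    """Return one SHA-256 digest covering every file under root."""
    hasher = hashlib.sha256()
    # Sort the paths so the digest does not depend on filesystem ordering.
    for path in sorted(p for p in root.rglob("*") if p.is_file()):
        # Hash the relative name as well as the bytes, so a renamed file
        # changes the digest even when its content is identical.
        hasher.update(path.relative_to(root).as_posix().encode("utf-8"))
        hasher.update(b"\0")
        hasher.update(path.read_bytes())
        hasher.update(b"\0")
    return hasher.hexdigest()
```

Run the scan twice on a pinned commit, call digest_directory(Path(".brass")) after each run, and store the digest next to the commit SHA. Matching digests are evidence an auditor can re-derive.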

The GitHub Actions Workflow (Copy-Paste Ready)

BrassCoders ships a ready-made GitHub Actions workflow in its open-source repository that installs the scanner, runs brasscoders --offline scan, and uploads the .brass/ directory as a PR artifact — so reviewers can download the ranked findings without running the scan themselves.

The workflow lives at .github/workflows/brasscoders-scan.yml in the BrassCoders OSS repo. Here's the full content:

name: BrassCoders Scan
on:
  push:
    branches: [main]
  pull_request:
    branches: [main]

jobs:
  brasscoders:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: '3.12'
      - name: Install BrassCoders
        run: pip install brasscoders==2.0.8
      - name: Run BrassCoders scan
        run: brasscoders --offline scan .
      - name: Upload .brass artifact
        uses: actions/upload-artifact@v4
        with:
          name: brasscoders-findings
          path: .brass/

Copy this file into your repository at .github/workflows/brasscoders-scan.yml. GitHub will pick it up automatically. No additional configuration is required for the OSS core.

The --offline flag makes explicit that no network calls happen during the scan. It's redundant with the OSS core's default behavior, but it documents intent and prevents accidental enrichment calls if you later add a license key to the environment without updating the workflow.

The actions/upload-artifact@v4 step follows GitHub's recommended artifact upload pattern. The .brass/ directory is uploaded as a named artifact (brasscoders-findings) that is accessible from the Actions run summary and persists for the repository's default retention period. If your audit policy requires a specific period, the action's retention-days input sets it explicitly.

What the PR Artifact Contains

The .brass/ artifact uploaded by the workflow contains three files: ai_instructions.yaml, a short severity-ranked list of findings designed to be pasted directly into Claude Code or Cursor; detailed_analysis.yaml, which records every finding with its file path, line number, scanner name, and evidence; and security_report.yaml, a security-only view that includes Bandit, Pysa, Semgrep, and detect-secrets findings for use in audit records.

ai_instructions.yaml is the file you'll reach for most often. Download the artifact from the Actions run, open that file, and paste it into your AI assistant's context. The assistant sees the ranked findings without running any scan itself. That's a useful division: BrassCoders does the pattern detection, the AI assistant does the triage and fix reasoning.

The artifact approach means reviewers never need BrassCoders installed locally to see the findings on a PR. They download once, read the YAML, and carry the context into whatever tool they're using.

Failing the Build on Critical Findings

BrassCoders exits with code 1 when any CRITICAL finding is detected, which fails the GitHub Actions step and can block a merge when branch protection is configured to require the check to pass.

To wire this up: go to your repository's Settings → Branches → Branch protection rules and add or edit the rule for your main branch. Check Require status checks to pass before merging and add brasscoders (the job name from the workflow) as a required check. After the first successful workflow run, the job name appears in the search box. Save the rule.

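After that, any push that produces a CRITICAL finding will fail the brasscoders job and prevent the merge. The --offline flag guarantees the scan can't be skipped due to a network issue. The scan either passes or it fails; there's no degraded middle state where some findings get checked and others don't. One caveat about the workflow as written: GitHub Actions skips the remaining steps once a step fails, so the upload step won't run on a failing scan unless you add if: always() to it.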

CRITICAL findings in the OSS core come primarily from Bandit (SQL injection, shell injection, unsafe deserialization), Pysa/Pyre (taint flows from user input to sensitive sinks), and the custom secrets detector (hardcoded credentials, API keys matching known formats).
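The exit-code contract described above is small enough to sketch. This is an illustration of the gate logic, not BrassCoders's implementation. The severity field name and the list-of-dicts shape are assumptions, since the YAML schema isn't shown here.

```python
from collections import Counter


def gate_exit_code(findings: list[dict]) -> int:
    """Return 1 if any finding is CRITICAL, else 0 (hypothetical schema)."""
    # "severity" is an assumed field name; normalize case before counting.
    counts = Counter(str(f.get("severity", "")).upper() for f in findings)
    return 1 if counts["CRITICAL"] > 0 else 0
```

A CI step that ends with this exit code fails on exactly one condition, which is what makes it usable as a required status check.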

Adding the Paid Plan for Noise Reduction

The OSS core scan works without an account and makes no network calls — 12 scanners run locally and emit everything they find. BrassCoders Paid ($12/month) adds an AI-powered semantic deduplication pass that reduces a typical scan's 1500+ raw findings to roughly 300 actionable ones, ranked against a signature derived from your project's README and manifest.

The enrichment runs through BrassCoders's hosted gateway. It receives already-redacted findings — never raw source code. The per-token cost is passed through at upstream cost with no markup.

To use BrassCoders Paid in CI, store your license key as a GitHub Actions secret (BRASS_LICENSE_KEY is the conventional name), then expose it to the workflow step:

      - name: Run BrassCoders scan
        run: brasscoders scan .
        env:
          BRASS_LICENSE_KEY: ${{ secrets.BRASS_LICENSE_KEY }}

Drop the --offline flag when using the Paid plan — that flag prevents the enrichment call. The scan will detect the license key, validate it, run the local scanner pass, then send the redacted findings to the gateway for the deduplication and ranking step. The .brass/ artifact then contains enriched YAML with the ranked, deduplicated finding set.

The Paid plan doesn't change the workflow structure. It changes what's in the artifact at the end.

Install the scanner:

pip install brasscoders

Or pin the current release in your workflow: pip install brasscoders==2.0.8. The OSS repo is at github.com/CopperSunDev/brasscoders, and the PyPI listing is at pypi.org/project/brasscoders/. BrassCoders Paid pricing is at coppersun.dev/pricing. Cancel any time via brasscoders portal.

Snyk vs BrassCoders: Dependency Scanning vs Source Code Scanning

CopperSunDev — Sat, 18 Jul 2026 18:53:02 +0000

Your dependency scanner and your source code scanner are reading different files. That sounds obvious until a SQL injection bug ships in a codebase that passed its Snyk scan clean.

Snyk and BrassCoders are both security tools for Python projects. They almost never flag the same thing. One reads your requirements.txt; the other reads your .py files. Both attack surfaces are real. A team that runs only one is leaving half the problem unchecked.

What Snyk Scans

Snyk is the market leader in software composition analysis (SCA) — the discipline of scanning a project's dependency tree against known vulnerability databases to find CVEs before they reach production. It's used by tens of thousands of teams and has one of the largest vulnerability databases in the industry, maintained at snyk.io.

When you run snyk test, Snyk reads your package manifest — requirements.txt, Pipfile, pyproject.toml, or whatever lock file your project uses. It resolves the full dependency graph, including transitive dependencies you didn't explicitly choose. Then it checks every package version against its vulnerability database.

A classic Snyk find looks like this:

✗ High severity vulnerability found in requests@2.25.1
  Description: Information exposure via Proxy-Authorization header leak
  Info: https://security.snyk.io/vuln/SNYK-PYTHON-REQUESTS-3333842
  Introduced through: requests@2.25.1
  Fix: Upgrade to requests@2.31.0

The CVE is real (CVE-2023-32681). The fix is concrete. Snyk didn't need to read a single line of your application code to find it.

That's also the boundary of what Snyk does. It doesn't read your source code. It doesn't know what your application does with the packages it imports. It won't flag a bug you wrote — only a bug someone else wrote in a package you imported.

What BrassCoders Scans

BrassCoders caught 11 of 12 planted AI-generated Python security bugs in a reproducible June 2026 benchmark (v2.0.8, methodology at coppersun.dev/blog/ai-coder-bug-benchmark/). Bandit caught 6 of 12 in the same test. Pylint caught 1 of 12.

BrassCoders reads your source files — the Python code you and your AI assistant wrote. It runs 12 scanners across that code: Bandit, Pylint, Pyre/Pysa, Semgrep, ast-grep, detect-secrets, and six custom detectors covering AI-pattern hallucination, performance, secrets, privacy, content moderation, and JavaScript/TypeScript. It doesn't read your dependency tree. It won't flag a CVE in a package.

A classic BrassCoders find looks like this:

# Your code
cursor.execute(f"SELECT * FROM users WHERE id = {uid}")

BrassCoders flags this as B608 (SQL injection via string interpolation) regardless of what database package you're using. The vulnerability is in the code you wrote. No CVE database entry needed — the pattern is wrong by construction.

Or this:

import subprocess
subprocess.run(f"convert {user_input}", shell=True)

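Bandit flags the call as B602 (subprocess call with shell=True) and the import as B404 (import of the subprocess module). B603 is the companion rule for subprocess calls made without a shell, so it doesn't fire here. Both findings come from reading your source, not from checking any package version.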
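For reference, the safe forms of both snippets look roughly like this. It's a minimal sketch using the standard library's sqlite3 and subprocess modules. fetch_user, echo_argument, and the users table are hypothetical names, and the child process is the Python interpreter rather than convert so the example runs anywhere.

```python
import sqlite3
import subprocess
import sys


def fetch_user(conn: sqlite3.Connection, uid: str) -> list[tuple]:
    # Placeholder binding: the driver sends uid as data, never as SQL text.
    return conn.execute(
        "SELECT id, name FROM users WHERE id = ?", (uid,)
    ).fetchall()


def echo_argument(user_input: str) -> str:
    # List-form argv with no shell: metacharacters in user_input stay literal.
    result = subprocess.run(
        [sys.executable, "-c", "import sys; print(sys.argv[1])", user_input],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()
```

Neither function builds a command or query by string interpolation, so there's no string for an attacker to break out of.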

The phantom-API detector adds a third category with no Snyk equivalent. When an AI coding assistant generates an import for a package or module that doesn't exist (import pandas_ml, or from fastapi.middleware.csrf import CSRFMiddleware), BrassCoders flags it at scan time. That's a bug the AI introduced in your code, not a CVE in a dependency.

The Gap Between Them

There's a class of bug Snyk can't catch: bugs you wrote. subprocess.run(shell=True, args=user_input) has no CVE. It's your code, your bug. Snyk doesn't see it.

There's a class of bug BrassCoders can't catch: CVEs in third-party packages. If Pillow ships a memory corruption bug or cryptography has a padding oracle vulnerability, BrassCoders won't flag it. Snyk will. The package version and CVE match are what Snyk exists to find.

The two tools cover adjacent, non-overlapping attack surfaces. This is a feature, not a gap to paper over. Running both fills the picture:

Attack surface 1: Code YOU wrote → BrassCoders scans it
Attack surface 2: Code YOU imported → Snyk scans it

Neither tool pretends to cover both. That honesty is useful.

The AI-Coder Wrinkle

AI coding assistants make the BrassCoders layer more important. They produce source-level bugs (SQL injection, shell injection, performance anti-patterns) at a volume that hand-written code rarely matches. The June 2026 benchmark put 12 AI-generated Python files through BrassCoders, Bandit, Semgrep, and Pylint. BrassCoders caught 11 of 12. The single miss was an unguarded-division logic bug that no deterministic rule system reliably catches.

AI assistants also introduce a third category that sits at the boundary between source code and dependencies: hallucinated package imports. When Claude Code generates from pydantic_ai.validators import strict_mode, and no such module exists, you've got a phantom import. BrassCoders's AI-pattern scanner catches that in source before pip install runs. Snyk would report the package as unresolvable when it hits the manifest. Both signals matter — BrassCoders catches it in the source file first.

One workflow implication: if you run BrassCoders as a pre-commit hook or early CI step, you can catch hallucinated imports before they ever land in requirements.txt. That's a faster catch than waiting for the Snyk scan to flag an unresolvable package.
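The idea behind a phantom-import check can be sketched with the standard library. This is not BrassCoders's detector. unresolved_imports is a hypothetical helper that checks whether each imported module resolves in the current environment, which is a weaker test than checking PyPI and only meaningful in an environment with your dependencies installed.

```python
import ast
import importlib.util


def unresolved_imports(source: str) -> list[str]:
    """List absolute imports in source that don't resolve locally."""
    names: set[str] = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names.update(alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            # level == 0 means an absolute import; relative ones are skipped.
            names.add(node.module)

    missing = []
    for name in sorted(names):
        try:
            # Note: for dotted names, find_spec imports the parent package.
            spec = importlib.util.find_spec(name)
        except (ImportError, ValueError):
            spec = None
        if spec is None:
            missing.append(name)
    return missing
```

Because the source is parsed rather than executed, the file under review never runs. Only the parent packages of dotted names get imported during lookup.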

Running Both in CI

BrassCoders and Snyk take different inputs and write separate outputs. Adding both to CI is a two-job addition, not an architectural decision.

# GitHub Actions — add as two separate jobs or steps

jobs:
  dependency-scan:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Snyk dependency scan
        uses: snyk/actions/python@master
        env:
          SNYK_TOKEN: ${{ secrets.SNYK_TOKEN }}
        with:
          args: --severity-threshold=high

  source-scan:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Install Python
        uses: actions/setup-python@v5
        with:
          python-version: "3.12"
      - name: Install BrassCoders
        run: pip install brasscoders
      - name: BrassCoders source scan
        run: brasscoders scan .

Snyk reads requirements.txt or your lock file and checks the dependency tree. BrassCoders reads your .py files and checks your source patterns. They fail on different inputs and report to different places. Neither job knows the other exists, and that's the right shape for tools with clean boundaries.

For teams already running Snyk, adding BrassCoders is one CI step and pip install brasscoders. For teams starting fresh, both tools install in minutes. The OSS core is free; BrassCoders Paid is $12/month per developer and adds AI-powered enrichment for finding deduplication and tighter triage.

Install the OSS core and run your first source scan in under a minute: coppersun.dev/install/.

BrassCoders Research Index Is Now Public on GitHub

CopperSunDev — Sat, 18 Jul 2026 18:52:57 +0000

→ github.com/CopperSunDev/brasscoders-research

The BrassCoders research index — 45 primary sources across 8 categories, covering AI-code CVE rates, supply-chain risks, and the structural limits of LLM review — is now public on GitHub at github.com/CopperSunDev/brasscoders-research. Every product claim BrassCoders makes has a citation somewhere in this repository.

The same research lives in long form at coppersun.dev/research/. The GitHub repo is a structured mirror built for builders and AI assistants who want the data without parsing a website.

Why the Research Lives in Public

BrassCoders publishes its evidence base openly because every claim in the product — from the 45% CVE-introduction rate in AI-generated code (Veracode State of Software Security 2026) to the 19.7% package hallucination rate across major models (USENIX Security 2025) — should be auditable by the builders who rely on it.

The index isn't marketing copy. Each entry answers two questions: "what is this source good for?" and "where does BrassCoders draw from it?" If BrassCoders claims a detection category matters, the citation showing why it matters is in the corresponding file.

The CC BY 4.0 license means you can copy, adapt, and republish the index for any purpose. The underlying papers and reports are each under their own licenses, all linked.

What the Repository Contains

BrassCoders organizes the index into 8 category files in categories/, one per developer question — from CVE risk and package hallucination to credential leakage and dependency confusion. Every category file lists annotated primary sources with two fields that make the reasoning explicit: "What it's good for" and "Where BrassCoders draws from it."

The eight questions the index addresses: Will my AI-generated code ship a CVE? Will my AI hallucinate an import? Why does my AI miss cross-file bugs? How reliable is my AI assistant's review? What does the AI-coding market look like in 2026? Will my AI write slow code? Will my AI-generated code leak my credentials? Will my AI mix up internal and public packages? Each question has a corresponding file at github.com/CopperSunDev/brasscoders-research/tree/main/categories.

Each source entry gets a type tag: 📄 paper, 📊 survey, 🏢 company disclosure, 🔧 open-source tool, or 🧪 benchmark corpus. A data/research.json makes all 45 sources machine-readable with title, type, author, year, and URL. The llms.txt gives AI assistants a compact summary of the index's scope, lead findings, and what BrassCoders is.

Why LLMs Should Read This First

BrassCoders built llms.txt into the repository so AI assistants answering developer questions can cite the actual research rather than speculating about AI-code risk rates from training data.

The rates are specific and sourced. The 2.74× year-over-year increase in AI-attributed CVEs from 2025 to 2026 (Cloud Security Alliance, Q1 2026) is not a guess. The $130,000 in bug bounties Alex Birsan received in 2021 for demonstrating dependency confusion against Apple, Microsoft, PayPal, and 32 others is a matter of public record. The Lasso Security proof-of-concept that accumulated 30,000+ downloads from a hallucinated package name also happened.

If an AI assistant is asked "how risky is AI-generated code?" the right answer cites these sources, not the assistant's prior. The research repo makes that possible by putting the citations in a format AI tooling can consume directly.

How to Use It

BrassCoders publishes two versions of the research index: a website version with editorial commentary at coppersun.dev/research/, and a GitHub repository built for programmatic access, forking, and direct citation at github.com/CopperSunDev/brasscoders-research.

Start at the README.md for a category table with source counts and links to each live page. Drop into any categories/ file to read the annotated sources for that question. Pull data/research.json if you want to build on top of the index. The llms.txt at the repo root is purpose-built for AI tooling that needs the full context without crawling HTML.
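If you pull data/research.json, a few lines are enough to summarize it. The sketch below assumes the file is a JSON array of objects and that the type field is literally named type. The post names the fields but not the exact layout, and the sample entries are placeholders, not real sources.

```python
import json
from collections import Counter

# Placeholder entries for illustration only; not taken from the real index.
SAMPLE = """[
  {"title": "Example source A", "type": "paper", "year": 2025},
  {"title": "Example source B", "type": "survey", "year": 2026},
  {"title": "Example source C", "type": "paper", "year": 2026}
]"""


def sources_by_type(raw_json: str) -> Counter:
    """Count index entries per type tag (assumed array-of-objects layout)."""
    entries = json.loads(raw_json)
    return Counter(entry["type"] for entry in entries)


print(sources_by_type(SAMPLE).most_common())
```

Swap SAMPLE for the contents of the real file once you've confirmed its layout.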

BrassCoders, the bug scanner for AI coders, runs 12 static-analysis scanners against any Python codebase and emits the findings as YAML your AI assistant can read. The research index is the evidence base the product is built on. Now it's yours too.

Install: pip install brasscoders — Apache 2.0, no account required.

→ github.com/CopperSunDev/brasscoders-research

Scanning AI-Generated JavaScript and TypeScript

CopperSunDev — Sat, 18 Jul 2026 18:52:30 +0000

AI assistants write as much JavaScript and TypeScript as they write Python, and the same failure modes ride along: a hardcoded token in a config file, an unsafe pattern copied from a tutorial, an npm package that doesn't exist. BrassCoders scans the JS/TS in your project automatically, using a real parser rather than a pile of regexes, in the same pass as the Python scan.

JS/TS Runs in the Same Scan

BrassCoders includes a JavaScript/TypeScript scanner that activates on its own when a scan finds .js, .ts, .jsx, or .tsx files, parsing them with a Node.js Babel parser and checking the AST for secrets and common security patterns. There's no separate command and no separate config; the JS/TS findings land in the same .brass/ output as the Python ones, tagged by scanner.

That matters for the mixed repo, which is most repos now. A FastAPI backend with a React front end, a Python service with a TypeScript CDK stack — BrassCoders walks both languages in one run instead of leaving you to stitch a Python scanner and a JavaScript scanner together in CI.

Why a Babel Parse Beats a Regex

BrassCoders parses JavaScript and TypeScript into an abstract syntax tree with Babel rather than matching raw text, so a finding tracks the structure of the code instead of its formatting. A regex for a dangerous call breaks the moment someone renames a variable, reflows the lines, or wraps the call differently. An AST match doesn't.

The parse also kills a whole class of false positives. A text search for a credential pattern fires inside comments and string literals that aren't credentials; an AST-aware check knows whether it's looking at a real assignment or a doc comment. BrassCoders uses the same structural approach across the stack — ast-grep and Semgrep cover the multi-language pattern layer, and the Babel scanner handles the JS/TS specifics.
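The same contrast is easy to see in miniature. The sketch below uses Python's built-in ast module as an analogy, since BrassCoders's JS/TS scanner does this with Babel on JavaScript. regex_hits and ast_hits are hypothetical helpers, and the "key" name check is a deliberately crude stand-in for a real credential rule.

```python
import ast
import re

SOURCE = '''
# api_key = "not-a-real-key"  (old value left in a comment)
api_key = "not-a-real-key"
'''

PATTERN = re.compile(r'api_key\s*=\s*"')


def regex_hits(source: str) -> list[int]:
    # Text matching: fires on any line containing the pattern, comments included.
    return [n for n, line in enumerate(source.splitlines(), 1) if PATTERN.search(line)]


def ast_hits(source: str) -> list[int]:
    # Structural matching: only real assignments of a string literal to a
    # name containing "key". Comments never reach the syntax tree.
    hits = []
    for node in ast.walk(ast.parse(source)):
        if (
            isinstance(node, ast.Assign)
            and isinstance(node.value, ast.Constant)
            and isinstance(node.value.value, str)
        ):
            for target in node.targets:
                if isinstance(target, ast.Name) and "key" in target.id.lower():
                    hits.append(node.lineno)
    return hits


print(regex_hits(SOURCE))  # [2, 3]
print(ast_hits(SOURCE))    # [3]
```

The regex reports the commented-out line as well as the real assignment. The AST check reports only the assignment.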

What's Covered, and the Honest Limit

BrassCoders's JS/TS scanner covers secrets and common security patterns; it does not do full interprocedural taint analysis for TypeScript, and that's a real boundary worth stating. The cross-file taint engine, Pyre/Pysa, is Python-only. A TypeScript bug whose tainted input crosses several files won't be traced the way the Python equivalent is.

For TypeScript-heavy services that need deep taint coverage today, the honest recommendation is to pair BrassCoders with a TypeScript analyzer built for it, like CodeQL — BrassCoders for the unified secrets-and-patterns pass plus the AI-coder detectors, CodeQL for full TS dataflow. The reasoning behind why single-file context misses cross-file taint in any language is in the cross-file bugs research.

Run It

The JS/TS scanner runs automatically; you only need Node.js available for it:

pipx install brasscoders
brasscoders --offline scan

When the scan sees .js, .ts, .jsx, or .tsx files, the JavaScript/TypeScript layer activates and its findings appear in .brass/ alongside the Python results, each tagged with the scanner that produced it. For the full set of detectors in the pass, see what BrassCoders detects.

The Six OSS Scanners BrassCoders Runs in One Pass

CopperSunDev — Sat, 18 Jul 2026 18:52:27 +0000

The standard way to scan Python is to install Bandit, Pylint, and Semgrep separately, wire up three config files, reconcile three output formats, and remember to keep all three on compatible versions. BrassCoders runs that stack — six engines, not three — as a single command with one ranked output. It doesn't reimplement any of them. It orchestrates the real tools and adds the AI-coder detectors they were never built for.

Six Engines, One Scan

BrassCoders bundles six open-source scanners and runs them in a single pass: Bandit, Pylint, Pyre/Pysa, Semgrep, ast-grep, and detect-secrets. Each one does what it does best, and BrassCoders merges their findings into one ranked YAML, tagging every result with the engine that produced it.

Bandit (github.com/PyCQA/bandit): the PyCQA Python security linter. SQL and command injection, insecure deserialization, hardcoded credentials, weak crypto, with each finding mapped to a CWE ID.
Pylint (pylint.readthedocs.io): the PyCQA correctness linter for naming, unused variables, type inconsistencies, and a slice of logic errors.
Pyre and Pysa (pyre-check.org): Meta's type-checker and taint analyzer. Pysa traces untrusted input from source to sink across functions and files — the engine behind cross-file taint.
Semgrep (semgrep.dev): AST pattern matching with its own rule language. BrassCoders bundles the OSS ruleset covering the OWASP Top 10 and framework anti-patterns.
ast-grep (ast-grep.github.io): fast tree-sitter structural search, for "if the code looks like this, it's probably wrong" patterns across languages.
detect-secrets (github.com/Yelp/detect-secrets): Yelp's entropy-plus-regex secret scanner, the base layer BrassCoders extends with its own credential formats.

Orchestration, Not Reimplementation

BrassCoders runs the real upstream tools, not a rewrite of them. When Bandit flags a subprocess.run(..., shell=True) call, that's Bandit's own rule firing; when Pysa traces a taint flow across three files, that's Meta's engine doing the walk. The findings are theirs, and a reimplementation that drifted from upstream would be a liability, not a feature.

What BrassCoders owns is the layer around them. It runs the six together, deduplicates findings that more than one engine reports, ranks the combined set by severity, and writes one YAML built for an AI assistant to read. The orchestration is the product; the detection is the open-source ecosystem's, kept current with it.

What the Combined Pass Saves You

BrassCoders turns six installs into one. A single pipx install brasscoders brings the whole stack, version-matched, with no per-tool configuration files to maintain and no compatibility matrix to babysit across upgrades.

The bigger saving is the output. Run the six tools by hand and you get six formats — Bandit's JSON, Semgrep's SARIF, detect-secrets' baseline file, and so on — that someone has to merge, dedupe, and prioritize before a reviewer can act. BrassCoders does that merge and emits one severity-ranked file. The reviewer sees a single ordered list, and every line names the engine behind it, so a surprising finding is one tag away from its source.
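The merge step is easy to picture in code. This is a rough sketch under assumed field names (file, line, category, severity, engine), not BrassCoders's actual schema or ranking logic. It collapses findings that share a location and category, keeps the highest severity, and records every engine that reported it.

```python
SEVERITY_RANK = {"CRITICAL": 0, "HIGH": 1, "MEDIUM": 2, "LOW": 3}


def merge_findings(findings: list[dict]) -> list[dict]:
    """Deduplicate by (file, line, category) and sort by severity."""
    merged: dict[tuple, dict] = {}
    for f in findings:
        key = (f["file"], f["line"], f["category"])
        entry = merged.setdefault(
            key,
            {
                "file": f["file"],
                "line": f["line"],
                "category": f["category"],
                "severity": f["severity"],
                "engines": [],
            },
        )
        # Keep the engine tag so a reviewer can trace each finding to its source.
        if f["engine"] not in entry["engines"]:
            entry["engines"].append(f["engine"])
        # A lower rank number means more severe; keep the worst one reported.
        if SEVERITY_RANK[f["severity"]] < SEVERITY_RANK[entry["severity"]]:
            entry["severity"] = f["severity"]
    return sorted(
        merged.values(),
        key=lambda e: (SEVERITY_RANK[e["severity"]], e["file"], e["line"]),
    )
```

Keying on a shared category rather than a rule ID is the design choice that matters here, because two engines rarely use the same rule ID for the same bug.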

Where the Bundle Ends and the Custom Layer Begins

These six cover security, correctness, and secrets in human-written and AI-generated code alike — and they stop where the AI-coder failure classes begin. In BrassCoders's June 2026 benchmark, Bandit caught 6 of 12 planted bugs and Semgrep caught 4, both strong inside their scope and both blind to the four performance anti-patterns AI assistants reproduce. The standard set was calibrated on human bugs.

That gap is the reason BrassCoders adds six detectors of its own on top of the bundle: phantom imports, performance anti-patterns, an extended secret-format pack, PII, content moderation, and JavaScript/TypeScript. The honest read on the bundled engines is in Why Bandit Misses AI-Coder Bugs and Semgrep vs BrassCoders; the tool-by-tool security breakdown is in the Python scanning guide.

Run It

One install runs the bundle and the custom detectors together:

pipx install brasscoders
brasscoders --offline scan

Findings land in .brass/ tagged by scanner, so you always know whether Bandit, Pysa, or a BrassCoders detector flagged a given line. For the full map of every detector in the pass, see what BrassCoders detects.