Accidentally committing an API key, password, or private key is still one of the easiest ways to create a serious security incident.
The risk gets worse as development speeds up: larger diffs, faster iteration, and more code drafted by AI coding agents before a human reviews every line.
That is why I built keygate: a fast local pre-commit guardrail that scans only staged added lines and blocks likely secrets before they enter Git history.
pipx install keygate
keygate activate
That's it. keygate now runs automatically before every git commit.
- GitHub: https://github.com/kanekoyuichi/keygate
- PyPI: https://pypi.org/project/keygate/
- License: MIT
Why I built it
Accidentally writing an API key directly into code during development happens to everyone. The real problem is that once you git commit it, the value becomes part of Git history permanently.
- Even if you `git rm` it or force-push, the old SHA can still be used to retrieve it
- Once pushed to GitHub, bots can scrape it within seconds
- An AWS key can lead to a massive bill; an OpenAI key can drain your usage quota almost instantly
I needed a tool to stop this at the moment of commit. Existing tools like Gitleaks and TruffleHog are excellent, but they focus on full repository scanning and CI workflows. I wanted something optimized specifically for the local pre-commit experience.
More importantly, as we move into a world where AI agents write code, the need for an automatic check right before a commit only increases.
The AI agent angle
AI coding agents like Claude Code or Codex can generate large diffs quickly. The safest assumption is not that the agent is malicious, but that speed increases the chance of unnoticed sensitive values reaching a commit.
Specifically, AI agents tend to create situations like:
- Generating code that references `.env` or config examples and including their values
- Expanding sample values from READMEs or test fixtures as-is
- Inferring and completing `api_key` or `password`-looking values from surrounding context
- Producing large diffs in a single pass, before a human has a chance to review every line
A local guardrail becomes more valuable in that workflow, not less. That is why keygate is designed so that whether the code was written by a human or an AI, it applies the same check right before the commit.
Rather than a tool that only works for developers who carefully read the README, keygate provides JSON output and an agent-specific execution mode so that agents themselves can read the scan results and suggest fixes.
What keygate detects
keygate combines multiple signals instead of relying on a single regex:
Rule-based detection (known formats)
- AWS access keys (`AKIA*` / `ASIA*` / `AROA*`)
- OpenAI API keys (`sk-*`)
- GitHub tokens (`ghp_*`, fine-grained PATs)
- Slack tokens (`xoxb-*` / `xoxp-*`)
- Stripe keys (`sk_live_*` / `rk_live_*` / `pk_live_*`)
- SendGrid keys (`SG.*.*`)
- JWTs and PEM private keys (RSA / OpenSSH)
- URL credentials (`postgres://user:pass@host`, etc.)
Values like pk_live_* (which are meant to be public) or already-masked URL credentials like postgres://user:***@host are treated as WARN rather than immediately BLOCK. The goal is to catch dangerous things without blocking every documentation-friendly string.
Entropy detection
- Strings longer than 20 characters whose Shannon entropy exceeds a threshold in the 4.0–4.5 range
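To make the entropy signal concrete, here is a minimal sketch of a Shannon entropy check. It is an illustration only, not keygate's actual implementation: the function names, the bits-per-character unit, and the default threshold of 4.0 are assumptions chosen to match the description above.

```python
import math
from collections import Counter


def shannon_entropy(s: str) -> float:
    """Shannon entropy of a string, in bits per character."""
    if not s:
        return 0.0
    counts = Counter(s)
    n = len(s)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def looks_random(s: str, min_length: int = 20, threshold: float = 4.0) -> bool:
    """Hypothetical helper: True for long strings with high entropy.

    min_length and threshold mirror the numbers quoted in the post;
    the real tool may pick them differently per rule.
    """
    return len(s) > min_length and shannon_entropy(s) > threshold
```

A repeated character scores 0 bits, while a string of 32 distinct characters scores 5 bits, so ordinary identifiers and prose tend to stay below the threshold while random tokens exceed it.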
Context scoring
- Variable names like `api_key`, `password`, `secret_token` are tiered into HIGH and MID
- Paths like `.env`, `config.yaml`, `settings.py` are tiered similarly
- Assignment syntax (`NAME = "..."` / `export NAME=...`)
How scoring works
Instead of a binary match, keygate aggregates independent signals into a final score:
| Signal | Points |
|---|---|
| Regex rule match | +50 to +100 |
| High entropy | +20 |
| Keyword (HIGH): `secret`, `password`, `api_key`, etc. | +25 |
| Keyword (MID): `token`, `credential`, `auth` | +15 |
| Assignment syntax `NAME = "..."` | +15 |
| Very sensitive path (`.env`, etc.) | +20 |
| Sensitive path (`settings/`, `config/`, etc.) | +15 |
| Test file | -10 |
| `example`, `dummy`, etc. | -20 |
There is also a combo bonus: when no regex rule matches but several signals fire together, extra points are added:
- `keyword(HIGH/MID) + entropy` → +15
- `keyword(HIGH) + entropy + assignment syntax` → additional +15
This means an unknown secret format can still reach BLOCK if it has a suspicious variable name, random-looking characters, and assignment syntax.
When a known regex rule does match, the combo bonus is not stacked on top — the rule's own weight is used instead. This keeps the score explainable and avoids inflating it unnecessarily.
The final verdict:
- `block` at 70+
- `warn` at 40–69
- ignored below 40
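The scoring rules above can be sketched in a few lines of Python. This is a simplified model built only from the table and thresholds in this post: the `Signals` class, its field names, and the two functions are hypothetical. It also assumes that individual signals still add to a rule's weight when a regex rule matches (only the combo bonus is skipped) and that no upper cap is applied; keygate itself may behave differently on both points.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Signals:
    """Hypothetical container for the signals found on one added line."""
    rule_weight: int = 0            # 0 = no regex rule matched, else 50..100
    high_entropy: bool = False
    keyword: Optional[str] = None   # "HIGH", "MID", or None
    assignment: bool = False
    path: Optional[str] = None      # "VERY_SENSITIVE", "SENSITIVE", or None
    test_file: bool = False
    placeholder: bool = False       # "example", "dummy", etc.


def score(s: Signals) -> int:
    """Aggregate independent signals using the point values from the table."""
    total = s.rule_weight
    if s.high_entropy:
        total += 20
    if s.keyword == "HIGH":
        total += 25
    elif s.keyword == "MID":
        total += 15
    if s.assignment:
        total += 15
    if s.path == "VERY_SENSITIVE":
        total += 20
    elif s.path == "SENSITIVE":
        total += 15
    if s.test_file:
        total -= 10
    if s.placeholder:
        total -= 20
    # Combo bonus: only when no known regex rule matched.
    if s.rule_weight == 0 and s.keyword is not None and s.high_entropy:
        total += 15
        if s.keyword == "HIGH" and s.assignment:
            total += 15
    return total


def verdict(total: int) -> str:
    """Map a score to the final verdict: block at 70+, warn at 40-69."""
    if total >= 70:
        return "block"
    if total >= 40:
        return "warn"
    return "ignore"
```

Under these assumptions, an unknown-format value with a HIGH keyword, high entropy, and assignment syntax scores 25 + 20 + 15 + 15 + 15 = 90, which is enough to block even though no rule matched.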
Example output
When a likely secret is found, the commit is stopped:
[BLOCK] High confidence secret detected
File: config.py:12
Rule: aws-access-key
Score: 100
Reason:
AWS Access Key detected; sensitive context detected
Remediation:
- Remove the key from the code
- Rotate the AWS credentials immediately
- Use environment variables or AWS IAM roles instead
To ignore:
Add comment: # keygate: ignore reason="..."
Each finding includes:
- `File`: the file and line number
- `Rule`: which detection rule fired
- `Score`: severity (70+ blocks, 40–69 warns)
- `Remediation`: concrete steps to fix it
At the top of the output, a machine-readable summary line is also emitted:
[KEYGATE] status=block findings=1
This makes it easy for scripts or agents to parse the outcome without needing JSON mode.
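As a rough illustration, a wrapper script could pick that line out of the captured output like this. The parser assumes the line always consists of the `[KEYGATE]` prefix followed by space-separated `key=value` pairs; only `status` and `findings` appear in the example above, and `parse_summary` is a hypothetical helper, not part of keygate.

```python
import re
from typing import Dict, Optional

# Matches the machine-readable summary line, e.g. "[KEYGATE] status=block findings=1".
SUMMARY_RE = re.compile(r"^\[KEYGATE\]\s+(?P<fields>.+)$")


def parse_summary(output: str) -> Optional[Dict[str, str]]:
    """Return the key=value pairs from the first [KEYGATE] line, or None."""
    for line in output.splitlines():
        match = SUMMARY_RE.match(line.strip())
        if match:
            pairs = match.group("fields").split()
            return dict(p.split("=", 1) for p in pairs if "=" in p)
    return None
```

All values are returned as strings, so a caller that needs the count would convert it explicitly, for example `int(summary["findings"])`.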
Detection accuracy (internal evaluation)
Measured against a labeled corpus of 100 samples (50 known secrets + 50 benign strings):
| Metric | Result |
|---|---|
| Recall (real secrets detected) | 100.0% |
| Precision (detected items that were real secrets) | 80.6% |
| F1 | 89.3% |
| True Positives | 50 |
| False Negatives (missed secrets) | 0 |
| False Positives (benign strings flagged) | 12 |
| True Negatives | 38 |
The primary goal was to get False Negatives to zero. Missing a real secret is far more dangerous than an occasional extra prompt.
The 12 false positives included: masked URL credentials, placeholders, Stripe publishable keys, and empty API_KEY= assignments. These are not real secrets, but they look enough like secrets that surfacing them before commit is intentional — they can be suppressed individually with inline ignores, allowlists, or a baseline.
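The percentages in the table follow directly from the raw counts. The short check below recomputes them with the standard definitions of precision, recall, and F1; the function name is just for illustration.

```python
def classification_metrics(tp: int, fp: int, fn: int):
    """Return (precision, recall, f1) from confusion-matrix counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Counts from the table: 50 true positives, 12 false positives, 0 false negatives.
precision, recall, f1 = classification_metrics(tp=50, fp=12, fn=0)
print(f"precision={precision:.1%} recall={recall:.1%} f1={f1:.1%}")
```

With 50 / 62 for precision and 50 / 50 for recall, this reproduces the 80.6%, 100.0%, and 89.3% reported above.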
Built for developers and coding agents
keygate provides JSON output alongside human-readable CLI output:
keygate scan --format json
keygate scan --json
keygate scan --profile agent
- `--format json` outputs only JSON to stdout
- `--json` is an alias for the above
- `--profile agent` is a fixed mode for AI agents that always returns JSON
The JSON schema is stable: schema_version, status, summary, findings[]. Each finding includes rule_id, policy, score, verdict, file, line, message, and a masked snippet when available.
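Here is a sketch of how an agent or script might consume such a report. The field names are the ones listed above, but the sample payload is entirely made up: the value types, the shape of `summary`, the `policy` value, and the assumption that `verdict` uses the strings `block` and `warn` are guesses for illustration, so check them against real output before relying on them.

```python
import json
from typing import Any, Dict, List


def blocking_findings(raw: str) -> List[Dict[str, Any]]:
    """Return the findings whose verdict is "block" from a JSON report."""
    report = json.loads(raw)
    return [f for f in report.get("findings", []) if f.get("verdict") == "block"]


def describe(finding: Dict[str, Any]) -> str:
    """One-line description an agent could include in a fix proposal."""
    return "{file}:{line} [{rule_id}] score={score}: {message}".format(**finding)


# Hypothetical payload: field names come from the post, values are invented.
SAMPLE_REPORT = json.dumps({
    "schema_version": 1,
    "status": "block",
    "summary": {},  # shape not documented in this post
    "findings": [
        {"rule_id": "aws-access-key", "policy": "default", "score": 100,
         "verdict": "block", "file": "config.py", "line": 12,
         "message": "AWS Access Key detected"},
        {"rule_id": "url-credentials", "policy": "default", "score": 45,
         "verdict": "warn", "file": "README.md", "line": 3,
         "message": "Masked URL credentials"},
    ],
})

for finding in blocking_findings(SAMPLE_REPORT):
    print(describe(finding))
```

In practice the `raw` string would be the stdout of `keygate scan --format json` rather than a hard-coded sample.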
This is not JSON bolted on as an afterthought. It is designed from the start so that an agent can re-run the scan, parse the output mechanically, and propose fixes — closing the loop after a commit is blocked.
keygate also has a Claude Code plugin, so Claude can scan staged changes for secrets automatically before commits.
Handling false positives without breaking flow
A secret scanner is only useful if developers can live with it every day.
keygate includes three escape hatches for expected findings:
1. Inline ignore (per line)
api_key = "dummy-key-for-testing" # keygate: ignore reason="test data"
reason is required — so the intent is always documented in the code.
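For illustration, a check for this comment could look like the sketch below. The regular expression is an assumption based on the single example above (a `#` comment, the `keygate: ignore` marker, and a non-empty double-quoted `reason`); keygate's real parser may accept other comment styles or spacing.

```python
import re
from typing import Optional

# Assumed format, inferred from the example: # keygate: ignore reason="..."
IGNORE_RE = re.compile(r'#\s*keygate:\s*ignore\s+reason="(?P<reason>[^"]+)"')


def ignore_reason(line: str) -> Optional[str]:
    """Return the documented reason if the line carries a valid inline ignore."""
    match = IGNORE_RE.search(line)
    return match.group("reason") if match else None
```

Because the pattern requires a non-empty `reason`, a bare `# keygate: ignore` does not count as a valid ignore in this sketch, which mirrors the rule that the reason is mandatory.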
2. Allowlist (project-wide)
In keygate.toml:
[allowlist]
paths = ["vendor/*", "third_party/*"]
patterns = ["dummy", "example"]
Note: adding tests/* to the allowlist wholesale is not recommended — it would suppress real secrets that accidentally end up in test files.
3. Baseline (freeze existing findings)
keygate baseline create
This saves the current findings to .keygate.baseline.json as SHA-256 fingerprints. From that point, the same finding at the same location is suppressed. The raw secret value is never stored, so the baseline file is safe to commit.
{
"version": 1,
"entries": [
{
"fingerprint": "e5282a7860678bc768d280eb3e77d2ca8a44286357c743dd024d74fe0605fe09",
"file_path": "src/app/config.py",
"line_number": 42,
"rule_id": "url-credentials",
"created_at": "2026-04-22T09:30:00+00:00"
}
]
}
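The idea behind a fingerprint-based baseline can be sketched as follows. The post does not say which inputs keygate hashes, so the choice here (rule ID, file path, line number, and the matched value, joined with a NUL separator) is purely hypothetical, as are the function names. The point is only that a SHA-256 digest lets the tool recognize a known finding without storing the raw value.

```python
import hashlib
from typing import Any, Dict


def fingerprint(rule_id: str, file_path: str, line_number: int, value: str) -> str:
    """Hypothetical fingerprint: SHA-256 over the finding's identity.

    Only the hex digest is kept; the raw value never reaches the baseline file.
    """
    material = "\x00".join([rule_id, file_path, str(line_number), value])
    return hashlib.sha256(material.encode("utf-8")).hexdigest()


def is_baselined(fp: str, baseline: Dict[str, Any]) -> bool:
    """True if the fingerprint appears in a baseline shaped like the JSON above."""
    return any(entry.get("fingerprint") == fp for entry in baseline.get("entries", []))
```

Including the location in the hash matches the description that the same finding at the same location is suppressed: in this sketch, moving the value to another line produces a different fingerprint and the finding resurfaces.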
To add new findings to an existing baseline: keygate baseline update.
If the baseline is committed to the repository, a new team member who runs pipx install keygate && keygate activate will automatically pick up the same baseline.
How it is different from Gitleaks or TruffleHog
keygate is not a replacement for full repository, history, CI, or cloud secret scanning.
It is intentionally narrower: a lightweight local guardrail for the moment right before a commit is created.
| Tool | Best for |
|---|---|
| keygate | Fast local pre-commit checks on staged changes |
| Gitleaks | Full repository, history, CI, and configurable rule scanning |
| TruffleHog | Deep secret discovery and verification workflows |
Use keygate when you want a small commit-time check that developers will actually keep enabled.
What keygate intentionally does not do
These were explicit non-goals during design:
- Full repository scanning (not the job of a pre-commit hook)
- LLM-based judgment (offline, fast, and deterministic behavior takes priority)
- External API validation (no checking whether a token is actually valid)
- IDE plugins, SaaS integrations, or automatic secret rotation
The primary constraint is completing within 200–500ms locally, every single commit. No LLM calls or external API lookups. For server-side protection, keygate is meant to complement — not replace — pre-receive hooks and CI-level scanning.
Disclaimer
keygate is a last-line-of-defense net for human error, not a substitute for proper secret management.
- It does not guarantee complete detection (unknown formats and obfuscated values may pass through)
- False positives are not zero (managed via allowlist / baseline / inline ignore)
- `git commit --no-verify` bypasses it trivially (for organizational enforcement, combine with server-side controls)
- The correct practice is to keep secrets out of the repository entirely, using environment variables, secret managers, or KMS
Quick start
pipx install keygate
cd your-project
keygate activate
From that point on, every normal git commit gets a fast local secret check automatically.
You can also scan manually:
git add .
keygate scan
keygate scans git diff --cached — staged changes only.
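To show what scanning only staged added lines means in practice, here is a minimal sketch that extracts added lines, with their file and line number, from unified diff text. It is not keygate's code: `added_lines` is a hypothetical helper, and it handles only the common text-diff case (no renames with quoted paths, no binary files).

```python
import re
from typing import List, Tuple

# Hunk header, e.g. "@@ -1,2 +1,3 @@"; a missing count means 1.
HUNK_RE = re.compile(r"^@@ -\d+(?:,(\d+))? \+(\d+)(?:,(\d+))? @@")


def added_lines(diff_text: str) -> List[Tuple[str, int, str]]:
    """Return (path, new_line_number, text) for every added line in a unified diff."""
    results = []
    path = None
    old_left = new_left = 0  # lines still expected in the current hunk
    lineno = 0               # line number in the new version of the file
    for line in diff_text.splitlines():
        if old_left > 0 or new_left > 0:
            if line.startswith("+"):
                if path is not None:
                    results.append((path, lineno, line[1:]))
                lineno += 1
                new_left -= 1
            elif line.startswith("-"):
                old_left -= 1
            elif line.startswith("\\"):
                pass  # "\ No newline at end of file" marker
            else:  # context line, present in both versions
                lineno += 1
                old_left -= 1
                new_left -= 1
            continue
        match = HUNK_RE.match(line)
        if match:
            old_left = int(match.group(1)) if match.group(1) is not None else 1
            lineno = int(match.group(2))
            new_left = int(match.group(3)) if match.group(3) is not None else 1
        elif line.startswith("+++ "):
            target = line[4:]
            if target == "/dev/null":
                path = None  # file was deleted
            else:
                path = target[2:] if target.startswith("b/") else target
    return results
```

The input could come from something like `subprocess.run(["git", "diff", "--cached"], capture_output=True, text=True).stdout`. Tracking the hunk line counts keeps added lines whose content itself starts with `++ ` from being mistaken for file headers.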
Links
Feedback, issues, and PRs are very welcome.