DEV Community: YUICHI KANEKO

AI Security Gate: A New Security Layer for the Age of AI Agents

YUICHI KANEKO — Mon, 29 Jun 2026 06:19:41 +0000

Introduction

This article is not about introducing a new security tool.

Nor is it an argument to replace Secret Scanners, SAST, or other existing security technologies.

Instead, I want to propose an architectural concept for the AI era:

How should security controls be positioned within a software development workflow where AI agents generate most of the artifacts?

I call this concept the AI Security Gate.

AI Is No Longer Just a Coding Assistant

Generative AI has evolved far beyond code completion.

Today's AI systems can already:

Generate source code from requirements
Write unit tests
Refactor existing code
Create pull requests
Review code

The next logical step is a development workflow where:

AI implements, AI reviews, and AI iterates.

In such a world, relying on humans as the final security checkpoint no longer scales.

When AI-generated artifacts are reviewed by another AI, we need a security mechanism that operates independently of AI reasoning and executes every time without exception.

What Is an AI Security Gate?

I define an AI Security Gate as:

A deterministic security control layer that validates AI-generated artifacts before they are accepted into a software development workflow.

Two words in this definition are particularly important.

Artifacts

The scope is broader than source code.

It includes any artifact produced by AI, such as:

Source code
Infrastructure as Code
Dockerfiles
Kubernetes manifests
SQL scripts
CI/CD workflows
API specifications

Deterministic

An AI Reviewer performs reasoning.

It may conclude:

"This design is easier to maintain."

An AI Security Gate does not reason.

Instead, it verifies objective facts such as:

An API key is embedded.
A private key is committed.
An organizational policy is violated.

Its purpose is not to judge software quality.

Its purpose is to enforce security rules consistently.

Four Characteristics of an AI Security Gate

I believe an AI Security Gate should satisfy four fundamental properties.

1. Deterministic

Every execution should produce the same result.

Security enforcement should rely on explicit rules rather than probabilistic AI decisions.

2. Policy-Based

The gate should enforce organizational security policies automatically.

Compliance should never depend on developer attention or reviewer expertise.

3. Pre-Acceptance

Validation should occur before artifacts are accepted into a repository or deployment pipeline.

If a violation is detected, the workflow should stop immediately.

4. Mandatory

Every artifact—whether generated by AI or written by a human—must pass through the same gate.

Security should be part of the workflow, not an optional step.
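To make the four properties concrete, here is a minimal Python sketch of what such a gate could look like. It is an illustration of the concept rather than any particular tool: the names `Violation`, `run_gate`, and `accept`, and the two example rules, are hypothetical.

```python
import re
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Violation:
    rule: str
    detail: str


# A check is a pure function: the same artifact text always yields the same result.
Check = Callable[[str], List[Violation]]


def no_private_keys(text: str) -> List[Violation]:
    """Illustrative rule: flag PEM private key headers."""
    if re.search(r"-----BEGIN (?:RSA |OPENSSH )?PRIVATE KEY-----", text):
        return [Violation("private-key", "PEM private key header found")]
    return []


def no_privileged_containers(text: str) -> List[Violation]:
    """Illustrative stand-in for an organizational policy."""
    if re.search(r"^\s*privileged:\s*true\s*$", text, re.MULTILINE):
        return [Violation("no-privileged", "privileged container requested")]
    return []


CHECKS: List[Check] = [no_private_keys, no_privileged_containers]


def run_gate(artifact: str, checks: List[Check]) -> List[Violation]:
    """Run every check on the artifact and collect all violations."""
    violations: List[Violation] = []
    for check in checks:
        violations.extend(check(artifact))
    return violations


def accept(artifact: str) -> bool:
    """Pre-acceptance decision: any violation stops the workflow."""
    return not run_gate(artifact, CHECKS)
```

The gate makes no distinction between human-written and AI-generated input, and it contains no model call, so repeated runs on the same artifact give the same verdict.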

Isn't This Just a Secret Scanner?

Not exactly.

A Secret Scanner is a tool.

An AI Security Gate is an architectural role.

Think about concepts like:

Authentication
Authorization
Logging

These describe responsibilities rather than specific implementations.

Multiple technologies can implement authentication.

Likewise, multiple tools can implement an AI Security Gate.

Examples include:

Secret scanning
License compliance checking
Infrastructure-as-Code security validation
Organizational policy enforcement
Compliance verification

The AI Security Gate is the architectural layer where these deterministic security controls are applied before AI-generated artifacts are accepted.

A Future AI-Native Development Pipeline

As AI agents become increasingly autonomous, software development workflows may evolve into something like this:

AI Agent
    ↓
AI Security Gate
    ↓
AI Reviewer
    ↓
Automated Testing
    ↓
CI/CD
    ↓
Production

The placement of the AI Security Gate is intentional.

An AI Reviewer evaluates quality.

An AI Security Gate enforces rules.

These are fundamentally different responsibilities.

No matter how capable AI becomes, organizations should not rely solely on AI judgment for security-critical decisions.

Where KeyGate Fits

I created KeyGate as an open-source implementation of this idea.
https://github.com/kanekoyuichi/keygate/

KeyGate focuses specifically on secret detection and prevention.

In the future, the AI Security Gate category may include many different implementations, such as:

Secret Protection
License Compliance
IaC Security
Policy Enforcement
Regulatory Compliance

KeyGate is one implementation within this broader architectural category.

My goal is not simply to promote another security tool.

My goal is to establish AI Security Gate as a standard architectural layer for AI-native software development.

Conclusion

AI agents are becoming first-class participants in software development.

As that happens, our development processes must evolve as well.

The missing piece is not another AI reviewer.

It is a deterministic security layer that operates independently of AI reasoning and consistently enforces organizational security policies.

That is the role of the AI Security Gate.

Just as concepts like CI/CD, SAST, and Infrastructure as Code became part of our common engineering vocabulary, I believe AI-native development will require its own architectural patterns.

I hope AI Security Gate becomes one of them.

KeyGate: A Fast Pre-Commit Guardrail Against Secret Leaks

YUICHI KANEKO — Fri, 24 Apr 2026 05:06:55 +0000

Accidentally committing an API key, password, or private key is still one of the easiest ways to create a serious security incident.

The risk gets worse as development speeds up: larger diffs, faster iteration, and more code drafted by AI coding agents before a human reviews every line.

That is why I built keygate: a fast local pre-commit guardrail that scans only staged added lines and blocks likely secrets before they enter Git history.

pipx install keygate
keygate activate

That's it. keygate now runs automatically before every git commit.

GitHub: https://github.com/kanekoyuichi/keygate
PyPI: https://pypi.org/project/keygate/
License: MIT

Why I built it

Accidentally writing an API key directly into code during development happens to everyone. The real problem is that once you git commit it, the value becomes part of Git history permanently.

Even if you git rm it or force-push, the old commit can still be retrieved by its SHA, and it may live on in clones and forks
Once pushed to GitHub, bots can scrape it within seconds
An AWS key can lead to a massive bill; an OpenAI key can drain your usage quota almost instantly

I needed a tool to stop this at the moment of commit. Existing tools like Gitleaks and TruffleHog are excellent, but they focus on full repository scanning and CI workflows. I wanted something optimized specifically for the local pre-commit experience.

More importantly, as we move into a world where AI agents write code, the need for an automatic check right before a commit only increases.

The AI agent angle

AI coding agents like Claude Code or Codex can generate large diffs quickly. The safest assumption is not that the agent is malicious, but that speed increases the chance of unnoticed sensitive values reaching a commit.

Specifically, AI agents tend to create situations like:

Generating code that references .env or config examples and copies their values
Expanding sample values from READMEs or test fixtures as-is
Inferring and completing api_key or password-looking values from surrounding context
Producing large diffs in a single pass, before a human has a chance to review every line

A local guardrail becomes more valuable in that workflow, not less. That is why keygate applies the same check right before the commit, whether the code was written by a human or an AI.

keygate is not meant to work only for developers who carefully read the README. It provides JSON output and an agent-specific execution mode so that agents themselves can read the scan results and suggest fixes.

What keygate detects

keygate combines multiple signals instead of relying on a single regex:

Rule-based detection (known formats)

AWS access keys (AKIA* / ASIA* / AROA*)
OpenAI API keys (sk-*)
GitHub tokens (ghp_*, fine-grained PATs)
Slack tokens (xoxb-* / xoxp-*)
Stripe keys (sk_live_* / rk_live_* / pk_live_*)
SendGrid keys (SG.*.*)
JWTs and PEM private keys (RSA / OpenSSH)
URL credentials (postgres://user:pass@host, etc.)

Values like pk_live_* (which are meant to be public) or already-masked URL credentials like postgres://user:***@host are reported as WARN rather than an immediate BLOCK. The goal is to catch dangerous values without blocking every documentation-friendly string.

Entropy detection

Strings longer than 20 characters with Shannon entropy above 4.0–4.5
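For readers unfamiliar with the measure, the following sketch shows how Shannon entropy (in bits per character) can be computed for a candidate string. It is not keygate's actual code; the helper names and the default threshold of 4.0 (the lower end of the range above) are assumptions for illustration.

```python
import math
from collections import Counter


def shannon_entropy(s: str) -> float:
    """Shannon entropy of the character distribution, in bits per character."""
    if not s:
        return 0.0
    n = len(s)
    return -sum((c / n) * math.log2(c / n) for c in Counter(s).values())


def looks_random(s: str, min_len: int = 20, threshold: float = 4.0) -> bool:
    """Hypothetical helper: long strings with high entropy look like secrets."""
    return len(s) > min_len and shannon_entropy(s) > threshold
```

A string of one repeated character has entropy 0, while a string of k distinct characters used equally often has entropy log2(k), so a threshold above 4.0 requires more than 16 distinct characters.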

Context scoring

Variable names like api_key, password, secret_token are tiered into HIGH and MID
Paths like .env, config.yaml, settings.py are tiered similarly
Assignment syntax (NAME = "..." / export NAME=...)

How scoring works

Instead of a binary match, keygate aggregates independent signals into a final score:

Signal	Points
Regex rule match	+50 to +100
High entropy	+20
Keyword (HIGH): `secret`, `password`, `api_key`, etc.	+25
Keyword (MID): `token`, `credential`, `auth`	+15
Assignment syntax `NAME = "..."`	+15
Very sensitive path (`.env`, etc.)	+20
Sensitive path (`settings/`, `config/`, etc.)	+15
Test file	-10
`example`, `dummy`, etc.	-20

There is also a combo bonus. When no regex rule matches but multiple signals fire together, extra points are added:

keyword(HIGH/MID) + entropy → +15
keyword(HIGH) + entropy + assignment syntax → additional +15

This means an unknown secret format can still reach BLOCK if it has a suspicious variable name, random-looking characters, and assignment syntax.

When a known regex rule does match, the combo bonus is not stacked on top — the rule's own weight is used instead. This keeps the score explainable and avoids inflating it unnecessarily.

The final verdict:

block at 70+
warn at 40–69
ignored below 40
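The scoring rules above can be sketched in a few lines of Python. This is a simplified model, not keygate's implementation: the `Signals` class and function names are hypothetical, it assumes the non-rule signals still add to a rule's weight, and it does not model any score cap.

```python
from dataclasses import dataclass


@dataclass
class Signals:
    rule_weight: int = 0       # 0 if no regex rule matched, otherwise 50..100
    high_entropy: bool = False
    keyword: str = ""          # "HIGH", "MID", or "" for none
    assignment: bool = False
    path: str = ""             # "VERY_SENSITIVE", "SENSITIVE", or "" for none
    test_file: bool = False
    placeholder: bool = False  # value contains "example", "dummy", etc.


def score(s: Signals) -> int:
    total = s.rule_weight
    total += 20 if s.high_entropy else 0
    total += {"HIGH": 25, "MID": 15}.get(s.keyword, 0)
    total += 15 if s.assignment else 0
    total += {"VERY_SENSITIVE": 20, "SENSITIVE": 15}.get(s.path, 0)
    total -= 10 if s.test_file else 0
    total -= 20 if s.placeholder else 0
    # Combo bonus: only when no regex rule matched.
    if s.rule_weight == 0 and s.keyword and s.high_entropy:
        total += 15
        if s.keyword == "HIGH" and s.assignment:
            total += 15
    return total


def verdict(total: int) -> str:
    if total >= 70:
        return "block"
    if total >= 40:
        return "warn"
    return "ignore"
```

Under these assumptions, an unknown format with a HIGH keyword, high entropy, and assignment syntax scores 20 + 25 + 15 + 15 + 15 = 90 and is blocked, while the same value behind a MID keyword scores 65 and only warns.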

Example output

When a likely secret is found, the commit is stopped:

[BLOCK] High confidence secret detected

File: config.py:12
Rule: aws-access-key
Score: 100

Reason:
AWS Access Key detected; sensitive context detected

Remediation:
  - Remove the key from the code
  - Rotate the AWS credentials immediately
  - Use environment variables or AWS IAM roles instead

To ignore:
  Add comment: # keygate: ignore reason="..."

Each finding includes:

File — the file and line number
Rule — which detection rule fired
Score — severity (70+ blocks, 40–69 warns)
Remediation — concrete steps to fix it

At the top of the output, a machine-readable summary line is also emitted:

[KEYGATE] status=block findings=1

This makes it easy for scripts or agents to parse the outcome without needing JSON mode.
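As an illustration, a wrapper script could pick up that summary line with a small regular expression. The sketch below assumes the line has exactly the shape shown above; only `status=block` appears in this article, so the parser accepts any word as the status rather than assuming a fixed set of values.

```python
import re
from typing import Optional, Tuple

# Matches the summary line format shown above, anywhere in the output.
SUMMARY_RE = re.compile(
    r"^\[KEYGATE\] status=(\w+) findings=(\d+)[ \t]*$", re.MULTILINE
)


def parse_summary(output: str) -> Optional[Tuple[str, int]]:
    """Return (status, findings) from keygate's text output, or None if absent."""
    m = SUMMARY_RE.search(output)
    if m is None:
        return None
    return m.group(1), int(m.group(2))
```

Returning `None` when the line is missing lets the caller distinguish "no summary found" from a clean result.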

Detection accuracy (internal evaluation)

Measured against a labeled corpus of 100 samples (50 known secrets + 50 benign strings):

Metric	Result
Recall (real secrets detected)	100.0%
Precision (detected items that were real secrets)	80.6%
F1	89.3%
True Positives	50
False Negatives (missed secrets)	0
False Positives (benign strings flagged)	12
True Negatives	38

The primary goal was to get False Negatives to zero. Missing a real secret is far more dangerous than an occasional extra prompt.

The 12 false positives included: masked URL credentials, placeholders, Stripe publishable keys, and empty API_KEY= assignments. These are not real secrets, but they look enough like secrets that surfacing them before commit is intentional — they can be suppressed individually with inline ignores, allowlists, or a baseline.
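The reported metrics follow directly from the confusion-matrix counts in the table. The short check below recomputes them; the function name is just for illustration.

```python
def metrics(tp: int, fp: int, fn: int) -> tuple:
    """Precision, recall, and F1 from confusion-matrix counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


p, r, f1 = metrics(tp=50, fp=12, fn=0)
print(f"precision={p:.1%} recall={r:.1%} f1={f1:.1%}")
# prints: precision=80.6% recall=100.0% f1=89.3%
```

Precision is 50 / 62, recall is 50 / 50, and F1 works out to 100 / 112.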

Built for developers and coding agents

keygate provides JSON output alongside human-readable CLI output:

keygate scan --format json
keygate scan --json
keygate scan --profile agent

--format json outputs only JSON to stdout
--json is an alias for the above
--profile agent is a fixed mode for AI agents that always returns JSON

The JSON schema is stable: schema_version, status, summary, findings[]. Each finding includes rule_id, policy, score, verdict, file, line, message, and a masked snippet when available.
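To show how an agent or script might consume this output, here is a small sketch that filters blocking findings from a report. The key names come from the schema described above, but the sample payload is hypothetical: its values, the shape of `summary`, and the lowercase `verdict` strings are assumptions, so check real output before relying on them.

```python
import json
from typing import List, Tuple

# Hypothetical payload: key names follow the article, values are made up.
SAMPLE_REPORT = """
{
  "schema_version": 1,
  "status": "block",
  "summary": {"findings": 2},
  "findings": [
    {"rule_id": "aws-access-key", "policy": "default", "score": 100,
     "verdict": "block", "file": "config.py", "line": 12,
     "message": "AWS Access Key detected"},
    {"rule_id": "url-credentials", "policy": "default", "score": 55,
     "verdict": "warn", "file": "docs/setup.md", "line": 3,
     "message": "URL credentials detected"}
  ]
}
"""


def blocking_findings(raw: str) -> List[Tuple[str, int, str]]:
    """Return (file, line, rule_id) for every finding whose verdict is 'block'."""
    report = json.loads(raw)
    return [
        (f["file"], f["line"], f["rule_id"])
        for f in report.get("findings", [])
        if f.get("verdict") == "block"
    ]
```

An agent could feed each returned location back into its edit loop, fix the line, and re-run the scan.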

This is not JSON bolted on as an afterthought. It is designed from the start so that an agent can re-run the scan, parse the output mechanically, and propose fixes — closing the loop after a commit is blocked.

keygate also has a Claude Code plugin, so Claude can scan staged changes for secrets automatically before commits.

Handling false positives without breaking flow

A secret scanner is only useful if developers can live with it every day.

keygate includes three escape hatches for expected findings:

1. Inline ignore (per line)

api_key = "dummy-key-for-testing"  # keygate: ignore reason="test data"

reason is required — so the intent is always documented in the code.

2. Allowlist (project-wide)

In keygate.toml:

[allowlist]
paths = ["vendor/*", "third_party/*"]
patterns = ["dummy", "example"]

Note: adding tests/* to the allowlist wholesale is not recommended — it would suppress real secrets that accidentally end up in test files.

3. Baseline (freeze existing findings)

keygate baseline create

This saves the current findings to .keygate.baseline.json as SHA-256 fingerprints. From that point, the same finding at the same location is suppressed. The raw secret value is never stored, so the baseline file is safe to commit.

{
  "version": 1,
  "entries": [
    {
      "fingerprint": "e5282a7860678bc768d280eb3e77d2ca8a44286357c743dd024d74fe0605fe09",
      "file_path": "src/app/config.py",
      "line_number": 42,
      "rule_id": "url-credentials",
      "created_at": "2026-04-22T09:30:00+00:00"
    }
  ]
}

To add new findings to an existing baseline: keygate baseline update.

If the baseline is committed to the repository, a new team member who runs pipx install keygate && keygate activate will automatically pick up the same baseline.
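The idea behind storing fingerprints instead of raw values can be sketched as follows. How keygate actually derives its fingerprint is not described here, so the choice of inputs (rule, path, and value) and the function name are assumptions for illustration only.

```python
import hashlib


def fingerprint(rule_id: str, file_path: str, secret_value: str) -> str:
    """Hypothetical fingerprint: SHA-256 over rule, path, and value.

    Only the hex digest is stored, never the raw secret value.
    """
    material = "\x00".join([rule_id, file_path, secret_value]).encode("utf-8")
    return hashlib.sha256(material).hexdigest()
```

The same finding always produces the same 64-character digest, so a later scan can recognize and suppress it without the baseline file ever containing the value itself.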

How it is different from Gitleaks or TruffleHog

keygate is not a replacement for full repository, history, CI, or cloud secret scanning.

It is intentionally narrower: a lightweight local guardrail for the moment right before a commit is created.

Tool	Best for
keygate	Fast local pre-commit checks on staged changes
Gitleaks	Full repository, history, CI, and configurable rule scanning
TruffleHog	Deep secret discovery and verification workflows

Use keygate when you want a small commit-time check that developers will actually keep enabled.

What keygate intentionally does not do

These were explicit non-goals during design:

Full repository scanning (not the job of a pre-commit hook)
LLM-based judgment (offline, fast, and deterministic behavior takes priority)
External API validation (no checking whether a token is actually valid)
IDE plugins, SaaS integrations, or automatic secret rotation

The primary constraint is completing within 200–500ms locally on every single commit, which rules out LLM calls and external API lookups. For server-side protection, keygate is meant to complement, not replace, pre-receive hooks and CI-level scanning.

Disclaimer

keygate is a last-line-of-defense net for human error, not a substitute for proper secret management.

It does not guarantee complete detection (unknown formats and obfuscated values may pass through)
False positives are not zero (managed via allowlist / baseline / inline ignore)
git commit --no-verify bypasses it trivially (for organizational enforcement, combine with server-side controls)
The correct practice is to keep secrets out of the repository entirely, using environment variables, secret managers, or KMS

Quick start

pipx install keygate
cd your-project
keygate activate

From that point on, every normal git commit gets a fast local secret check automatically.

You can also scan manually:

git add .
keygate scan

keygate scans git diff --cached — staged changes only.
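For a sense of what "staged added lines only" means in practice, the sketch below extracts added lines from a unified diff. It is not keygate's code; the function names are hypothetical, and `--no-color` and `--unified=0` are ordinary git diff options chosen here for easier parsing.

```python
import re
import subprocess
from typing import Iterator, Tuple

HUNK_RE = re.compile(r"^@@ -\d+(?:,\d+)? \+(\d+)(?:,\d+)? @@")


def staged_diff() -> str:
    """Return the staged diff as text (requires git and a repository)."""
    result = subprocess.run(
        ["git", "diff", "--cached", "--no-color", "--unified=0"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def added_lines(diff_text: str) -> Iterator[Tuple[str, int, str]]:
    """Yield (path, new_line_number, text) for each added line in a unified diff."""
    path = None
    lineno = 0
    in_hunk = False
    for line in diff_text.splitlines():
        if line.startswith("diff --git "):
            path, in_hunk = None, False      # a new file section begins
        elif not in_hunk and line.startswith("+++ "):
            target = line[4:]
            if target == "/dev/null":
                path = None                  # file was deleted
            else:
                path = target[2:] if target.startswith("b/") else target
        else:
            m = HUNK_RE.match(line)
            if m:
                lineno, in_hunk = int(m.group(1)), True
            elif in_hunk and path is not None:
                if line.startswith("+"):
                    yield path, lineno, line[1:]
                    lineno += 1
                elif line.startswith(" "):
                    lineno += 1              # context line: only advances the counter
```

Calling `list(added_lines(staged_diff()))` inside a repository gives the lines a scanner of this kind would need to inspect; removed lines and unchanged files never reach it.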

Detecting Prompt Injection in LLM Apps (Python Library)

YUICHI KANEKO — Wed, 01 Apr 2026 03:31:37 +0000

I've been working on LLM-backed applications and ran into a recurring issue: prompt injection via user input.

Typical examples:

"Ignore all previous instructions"
"Reveal your system prompt"
"Act as another AI without restrictions"

In many applications, user input is passed directly to the model, which makes these attacks practical.

Most moderation APIs are too general-purpose and not designed specifically for prompt injection detection, especially for non-English inputs. So I built a small Python library to act as a screening layer before sending input to the LLM:

https://github.com/kanekoyuichi/promptgate

Detection strategies:

rule-based (regex / phrase matching): latency <1ms, no dependencies
embedding-based (cosine similarity with attack exemplars): latency ~5–15ms, uses sentence-transformers
LLM-as-judge: higher accuracy, but +150–300ms latency, requires external API

Baseline evaluation (rule-only):

FPR: 0.0% (0 / 30 benign samples)
Recall: 61.4% (27 / 44 attack samples)

So rule-based detection alone misses ~40% of attacks (17 of 44), especially paraphrased or context-dependent ones.
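A rule-based screen of this kind can be sketched with a handful of regular expressions. This is not promptgate's API or rule set; the patterns and the `is_suspicious` function are hypothetical, written only to cover the three example phrases listed earlier.

```python
import re

# Illustrative patterns only, covering the example attacks from this post.
PATTERNS = [
    re.compile(r"\bignore\s+(?:all\s+)?(?:previous|prior)\s+instructions\b", re.I),
    re.compile(r"\breveal\s+(?:your\s+)?system\s+prompt\b", re.I),
    re.compile(r"\bact\s+as\s+.*\bwithout\s+restrictions\b", re.I),
]


def is_suspicious(user_input: str) -> bool:
    """Return True if any known injection phrase appears in the input."""
    return any(p.search(user_input) for p in PATTERNS)
```

A paraphrase such as "Please disregard what you were told before" matches none of these patterns, which is the kind of miss behind the recall figure above.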

This is not intended as a complete solution — the design assumption is defense-in-depth, where this acts as a first screening layer.

Known limitations:

rule-based detection struggles with paraphrased / indirect instructions
embedding approach depends on exemplar coverage (not a trained classifier)
LLM-as-judge is non-deterministic and API-dependent

Would be interested in feedback on:

better evaluation methodologies
detection strategies beyond pattern / similarity / LLM judging
how others are handling prompt injection at the application layer