I have three repos that all want the same TODO hygiene rules. The first one got them after a review where I caught a TODO: fix this with no owner. The second one picked them up by copying CLAUDE.md from the first. The third one is where I noticed I was opening the second project's CLAUDE.md to copy it a third time.
That is the exact failure Level 3 is supposed to prevent.
The taxonomy is easy enough to write down. Level 2 is the project harness; Level 3 is the organization harness. The hard part is the split inside the repo. If both kinds of rule end up in the same files, the boundary collapses every time someone edits anything. Keystone draws the line in the file tree itself.
The split Keystone draws
A Keystone harness has two halves.
The project half is harness/corpus/ and harness/guides/. The team owns it. They edit it freely. It is the reasoning and the rules that describe this codebase: which test runner, which lint config, which idioms, which domain constraints.
The org half is harness/policies/. Each policy is its own namespace, owned by whoever published it. Policies arrive through keystone init --policy <ref> and update through keystone policy update <name>. Local edits inside a policy namespace block the next update unless --force is passed.
harness/
├── corpus/ # project reasoning (Level 2)
├── guides/ # project rules (Level 2)
├── policies/ # Level 3
│ ├── universal/ # ships with the binary
│ └── tacoda/ # installed via --policy
├── sensors/
├── adapters/
├── learning/
└── archive/
Two things make the boundary stick. First, every policy writes only inside its own subtree; the installer enforces that. Second, the lockfile records per-file hashes, so the update flow can tell whether a policy file has been edited locally. The structure is what keeps a Level 3 rule from quietly becoming a Level 2 rule.
Sensors deliberately do not live inside a policy. Sensors describe project tooling (lint, type-check, test). A policy can declare what must be checked; the project decides how. That is the right cut: principles distribute, commands do not.
The example: tacoda-policy
The shape is easier to see in a real one. tacoda-policy is the example repo I use to test the layer end-to-end. It carries my personal preferences around documentation and TODOs, and it installs as a policy named tacoda.
The repo is small on purpose:
tacoda-policy/
├── keystone-policy.yaml
├── README.md
└── policy/
└── harness/policies/tacoda/
├── corpus/
│ ├── documentation.md
│ └── todos.md
└── guides/
├── documentation.md
└── todos.md
The manifest is short. Name, version, description:
name: tacoda
version: 0.1.0
description: |
Example policy for tacoda projects. Layers on top of the keystone universal
baseline with personal/team preferences around documentation conciseness
and TODO hygiene.
The policy/ directory mirrors how the content lands inside a consumer's harness. The installer rejects anything outside the policy's own namespace (policy/harness/policies/tacoda/), so this repo cannot accidentally drop files into a consumer's project tree. That guarantee is what makes "trust this policy enough to install it" a smaller ask than "trust this script enough to run it."
What --policy does to your project
Run keystone init in a fresh repo with the policy flag:
keystone init --policy git+https://github.com/tacoda/tacoda-policy.git#v0.1.0
You get the full harness, plus the policy dropped into its namespace:
harness/policies/tacoda/
├── corpus/
│ ├── documentation.md
│ └── todos.md
└── guides/
├── documentation.md
└── todos.md
Same shape as the project layer, scoped to tacoda/. Two pairs. Each pair is a corpus file (the reasoning) and a guide file (the rules). Guides are ambient: the agent loads them every turn. Corpus is on-demand: the agent reaches a corpus file by following the forward-link in the paired guide when it needs the why.
Here is the documentation guide that lands in the project, trimmed to the golden rules:
## GOLDEN RULES
- **Apply the Hemingway test to every sentence.** Each sentence must change
what the reader does. If a sentence can be deleted without losing
instruction or context, delete it.
- **Default to no comment.** Add one only when the *why* is non-obvious.
- **Lead with the why.** Documentation explains motivation and non-obvious
behavior. Well-named identifiers explain *what*; comments explain *why*.
The corpus file holds the long-form reasoning: where the Hemingway test comes from, what gets cut, what stays, why bad docs are worse than no docs. The agent only pays for those tokens when it reaches for them.
That split matters for the same reason it matters at the project layer. Guides are the cheapest tokens you can spend; corpus is on-demand context that earns its keep when the agent has to make a judgment call.
Authoring your own policy
A policy is a git repo with two things at its root. The manifest:
name: my-org
version: 0.1.0
description: |
Org-wide engineering standards: vendor list, license rules, release gates.
And a policy/ directory that mirrors the consumer layout:
my-policy-repo/
├── keystone-policy.yaml
├── README.md
└── policy/
└── harness/policies/my-org/
├── corpus/
│ ├── vendors.md
│ └── licensing.md
└── guides/
├── vendors.md
└── licensing.md
The name in the manifest is the namespace. Every file under policy/ must live inside policy/harness/policies/my-org/; anything outside is rejected at install time. The README and the manifest sit at the repo root for humans, not consumers, and the installer ignores them.
A guide file in the policy looks the same as a project guide. Short, ambient, full of rules. The bottom line points at the corpus:
# Vendors — rules
The rules from [`corpus/vendors.md`](../corpus/vendors.md).
Loaded ambient; enforced by the [drift sensor](../../../../sensors/drift.md).
## GOLDEN RULES
- Approved cloud vendors: AWS, GCP, Cloudflare. Anything else needs review.
- Storage of customer data outside the approved list is a release blocker.
---
Traces to: [`corpus/vendors.md`](../corpus/vendors.md).
The corpus file is where the reasoning lives: why this vendor list, what the review process looks like, the contractual constraints that drove it. Same shape as project corpus. Same load behavior.
Updates without surprises
The interesting part of Level 3 is not the install. It is what happens the second time.
Each installed policy is pinned in harness/.keystone.lock:
policies:
tacoda:
source_ref: "git+https://github.com/tacoda/tacoda-policy.git#v0.1.0"
resolved_sha: "abc1234..."
policy_version: "0.1.0"
keystone_version: "0.9.0"
files:
"harness/policies/tacoda/guides/documentation.md": "sha256:..."
"harness/policies/tacoda/guides/todos.md": "sha256:..."
"harness/policies/tacoda/corpus/documentation.md": "sha256:..."
"harness/policies/tacoda/corpus/todos.md": "sha256:..."
Bump to a new ref:
keystone policy update tacoda '#v0.2.0'
Or re-resolve the current ref (useful when tracking #main):
keystone policy update tacoda
The hashes are the safety net. If a file inside the policy namespace has been edited locally since install, the update refuses to overwrite it. You either revert the local edit or pass --force and accept the loss. That refusal is what makes "policies are not project-authored" a real constraint, not a convention.
If a project genuinely needs to soften or extend a policy rule, the right move is at the project layer. Add a project guide under harness/guides/ that traces to the policy file by path and records the deviation:
# Documentation — project deviation
This project relaxes
[`policies/tacoda/guides/documentation.md`](../policies/tacoda/guides/documentation.md):
multi-paragraph docstrings are permitted on public API surfaces because the
docs site is generated from them.
The policy stays unmodified. The deviation lives where future readers will look. The lockfile keeps updating cleanly.
When a policy cannot be relaxed
The deviation pattern works for most rules. It does not work for the rules a project is not allowed to relax. A vendor list pinned by legal is not a suggestion. A license rule that gates a release is not a starting point for negotiation.
That is what strict is for. The flag lives in the policy manifest and defaults to false:
name: my-org
version: 0.2.0
strict: true
description: |
Org-wide engineering standards. Strict: project deviations do not apply.
A strict policy changes two things about how its guides reach the agent.
First, load order. Project guides under harness/guides/ normally load before policy guides, so a later project guide can sit on top of a policy rule and the agent reads the project's relaxation last. A strict policy reverses that: its guides load last and have the final word on any rule it covers.
Second, authority. The drift sensor reads strict: true and marks the policy's rules as non-overridable in the loaded context. A project-layer deviation guide that traces back to a strict policy file is treated as a violation, not a softening. The agent will not quietly apply it.
File-level behavior is unchanged. Local edits to a strict policy file still block updates the same way; --force still works the same way. Strict is not about file ownership; it is about precedence at agent runtime.
Use it sparingly. Most rules deserve the deviation door because most rules have edge cases worth respecting. Strict is for the rules whose edge cases have already been argued and lost: compliance, licensing, security posture. The flag exists so the policy author can name those rules as different in kind, not just different in topic.
A strict policy is a promise to the rest of the org. Defaulting to false keeps that promise rare enough to mean something.
Why the split is worth the trouble
The argument against splitting Level 2 and Level 3 is that it is one more concept to hold. The argument for it is that the wrong layer is the wrong owner, and the wrong owner is how rules rot.
A "use Vitest, not Jest" rule belongs to one project. A "every TODO names an owner" rule belongs to me across every project I touch. If both end up in the same CLAUDE.md, the next person to edit it has to guess which rules can travel and which cannot. They will guess wrong, and the file will drift further every quarter.
Keystone makes the answer structural. Project rules go in harness/guides/. Org rules go in harness/policies/<name>/guides/. The lockfile keeps the org rules in sync. The namespace rule keeps a policy from escaping its lane. The drift sensor enforces both layers the same way at agent runtime, because the agent does not care where a rule came from; only the humans editing the files do.
That is the rule worth naming: a harness is not just a place to put rules. It is a place to put rules where the right person can edit them. Level 3 is the layer that question gets answered at.
The example repo is tacoda-policy. The harness installer is keystone. The thing I keep returning to: the cost of a good split is one extra directory. The cost of no split is a file you stop trusting.
Top comments (0)