Dead Light Framework · Part 3 — a 3-minute test for how much structure your AI-agent project actually needs
Three questions to find the smallest setup that fits — a plain README, two files, multi-unit paperwork, or a running service — so you stop over-building (the common mistake) and catch the moment two files genuinely aren't enough. Copy-paste card below; theory skippable.
Dead Light Framework — an ongoing series · you're on Part 3.
- The Emperor Is All But Dead
- Every Session Starts in Darkness
- Two Markdown Files Won't Save You Forever ← you are here
- Inherit, Don't Invent
- Try to Break Your Own Framework
Next → four older disciplines that already solved this, with patterns you can apply to HANDOFF and LOG today.
By a developer running AI agents as daily teammates — a peer, not an authority (full framing in #1). · ~7 min · the Dead Light Framework repository (MIT)
New here? — 30-second catch-up. (Following the series? Skip ahead.) Dead Light is an experimental way to run projects where some of your teammates are AI agents that start every session with no memory — they reset to zero, human decisions drift, and the only durable thing is what you wrote down. The minimum kit (#2): two files at the repo root — a HANDOFF.md (the current-state snapshot a fresh session reads first) and an append-only LOG.md (the history it's derived from). This post is the test for when those two files stop being enough — and which tier your project needs: a plain README, the two files, multi-unit paperwork, or an actual running service.
The decision you keep dodging
Post #2 closed on a promise: the two-file setup is enough for one repo, one session at a time, and the moment you cross that line, it isn't. This post is the line.
If you ran the setup from #2, you already know the shape of the problem: it works beautifully — until a Tuesday when two agents pick up the same task in parallel and trample each other's HANDOFF; or a Friday when your codebase hits a size where one shared LOG.md is a wall of context an agent can't read; or the week you start a second service and suddenly "the project" is two things, not one. Most teams answer "do we need more than two files now?" by gut. The litmus below is cleaner.
The aim isn't to push you up the tiers — it's the opposite. Over-building is the more common failure: solo developers running one agent on a 4-KLOC tool, setting up multi-unit paperwork they don't need. Pick the smallest tier that fits, and only upgrade when a real signal forces it.
The 3-question test (≈ 3 min)
Answer Q1 → Q2 → Q3 in order. As soon as one question gives you a tier, you can stop: that's your tier, and the remaining questions only narrow further. Q4 below is a one-time forward-look; run it after.
Q1 — Do you need real-time integrity?
Answer yes if any of these holds:
- Two or more agents can write to the same artifact at the same instant (parallel sessions on shared state).
- An invariant must hold every instant, with zero "eventually" tolerance — a financial balance, a lock on a shared resource, a real-time scheduler.
- You need transactions — multi-step changes that must all-succeed-or-all-fail across shared state.
Yes → Runtime tier. Markdown files cannot deliver this; it isn't a discipline gap, it's a structural one (the why is in the aside below). You need a running service — transactions, locks, the machinery databases have had for decades. The framework's runtime tier is the subject of a later post; for now, the actionable answer is: don't try to do this with .md files. That's your answer for today — Q2 and Q3 only matter once Q1 is no.
No on all three → continue to Q2.
Q2 — Are you running more than one governance unit?
A "governance unit" is a thing with its own decision rights: a service that ships independently, a sub-product, a team that owns its own roadmap. Answer yes if any of these holds:
- The project contains two or more services / sub-products that ship independently and own different decisions.
- You have multiple repositories that need to coordinate.
- Different agents own different sub-areas with their own decision rights, and a change in one isn't automatically a change in another.
Yes → M2 — multi-unit paperwork. One HANDOFF.md + LOG.md per unit, in a sub-folder; a shared Imperial tier at the repo root for cross-unit sealed decisions. Layout:
<repo-root>/                  ← Imperial tier (shared, read by every unit)
    codex.md (+ cross-unit sealed docs)
    imperial/LOG.md           ← cross-unit decisions go here
    service-a/                ← unit A
        HANDOFF.md  LOG.md  <artifacts>
    service-b/                ← unit B (sibling of A; not under A)
        HANDOFF.md  LOG.md  <artifacts>
Sibling units don't read each other's logs — they only read their own plus the Imperial tier ancestor chain. That's how you keep per-unit churn out of other units' context windows. Full rules: Paperwork Standard §4. You don't need Q3; the unit structure subsumes it.
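A minimal sketch of that read rule in Python, in case it helps to see it as a path computation. The function name `logs_to_read`, the `IMPERIAL_LOG` constant, and the idea that a unit may be nested under another unit are assumptions made for illustration; the Paperwork Standard remains the authority on the actual rule.

```python
from pathlib import PurePosixPath

# Assumption: the cross-unit log lives at imperial/LOG.md, as in the layout above.
IMPERIAL_LOG = "imperial/LOG.md"


def logs_to_read(unit: str) -> list[str]:
    """Return the LOG.md files a session working in `unit` should load.

    Order: Imperial tier first, then any ancestor units, then the unit itself.
    Sibling units are never included.
    """
    path = PurePosixPath(unit)
    # `parents` runs nearest-first and ends with "."; drop "." and reverse it
    # so the list reads from the outermost ancestor inward.
    ancestors = [p for p in path.parents if str(p) != "."][::-1]
    return [IMPERIAL_LOG] + [str(p / "LOG.md") for p in [*ancestors, path]]


print(logs_to_read("service-a"))
# ['imperial/LOG.md', 'service-a/LOG.md']
```

Note what is absent from the result: `service-b/LOG.md` never shows up for a session in `service-a`, which is the whole point of the sibling rule.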
No (one team, one product, one decision-owner) → continue to Q3.
Q3 — How big is the codebase?
Measure with cloc or scc: count code lines (blank and comment lines excluded) across all languages. The bands borrow COCOMO 81's order-of-magnitude convention; treat them as a heuristic, not a derived cutoff.
| LOC | Tier | Set up |
|---|---|---|
| < 10 KLOC | M0 | A README.md is enough. Don't build the two-file setup yet. Re-check when you cross ~10 KLOC or hire a second person/agent. |
| 10 – 50 KLOC | M1 | The two-file setup from #2 — a HANDOFF.md snapshot + an append-only LOG.md at repo root, plus four rules for who reads/writes what and when. |
| > 50 KLOC | M2 | Even with a single team. The cross-time complexity is enough that you want the unit-folder layout from Q2 — start with one unit folder; the structure is ready when a second appears. |
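If you would rather run the test than read it, here is a rough Python sketch of Q1 through Q3. The function names `pick_tier` and `code_lines_from_cloc_json` are hypothetical, and the JSON helper assumes the report came from `cloc --json`, whose output carries a `SUM` entry with a `code` count. The thresholds are the heuristic bands from the table, not derived cutoffs.

```python
import json


def code_lines_from_cloc_json(report: str) -> int:
    """Pull the total code-line count out of a `cloc --json` report."""
    return int(json.loads(report)["SUM"]["code"])


def pick_tier(realtime_integrity: bool, multi_unit: bool, code_lines: int) -> str:
    """Apply Q1 -> Q2 -> Q3 in order and return the first tier that fits."""
    if realtime_integrity:        # Q1: hard wall, markdown cannot deliver this
        return "Runtime"
    if multi_unit:                # Q2: more than one governance unit
        return "M2"
    kloc = code_lines / 1000      # Q3: size bands from the table
    if kloc < 10:
        return "M0"
    if kloc <= 50:
        return "M1"
    return "M2"


# A hand-written sample report with made-up numbers, not real project data.
sample = '{"Python": {"nFiles": 12, "blank": 300, "comment": 150, "code": 4000}, ' \
         '"SUM": {"blank": 300, "comment": 150, "code": 4000, "nFiles": 12}}'

print(pick_tier(False, False, code_lines_from_cloc_json(sample)))
# M0
```

Q1 and Q2 stay as plain booleans on purpose: they are judgment calls a human answers, and only Q3 is something a tool can measure.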
Q4 — Crossing a line in the next 3–6 months?
This doesn't change today's tier — it tells you what to architect for. Plan the upgrade now when:
- M0 → M1: hiring a second contributor, adding a second agent, about to cross ~10 KLOC.
- M1 → M2: spinning up a second service, splitting the codebase into independently shipping pieces, adding a second decision-owner.
- M1 or M2 → Runtime: introducing a hard invariant (compliance, locks, real-time coordination), starting work that needs transactions, onboarding agents that will write in parallel.
Emergency upgrades cost more than planned ones. Catching the trigger early is the entire point of Q4.
The decision card (copy this into your repo)
Drop this into your CLAUDE.md / .cursorrules / README.md so the test is on hand the next time someone asks "do we need more structure here?":
## Governance-tier self-check
Answer in order; the first YES decides the tier — later questions only narrow further.
Q1 — Real-time integrity needed (≥ 2 agents writing the same artifact at the same instant,
     a "must-never-break" invariant, or transactions over shared state)?
     YES → Runtime tier (a running service; markdown can't do this).
Q2 — More than one governance unit (≥ 2 services / sub-products / decision-owners,
     or multi-repo coordination)?
     YES → M2: per-unit folder with HANDOFF.md + LOG.md, plus a shared Imperial tier
           at the repo root.
Q3 — Codebase size (cloc / scc, code lines, all languages)?
     < 10 KLOC  → M0: a README.md is enough.
     10–50 KLOC → M1: the two-file HANDOFF + LOG setup.
     > 50 KLOC  → M2: unit-folder layout even single-team.
Q4 — Will any of Q1/Q2/Q3 cross a line in the next 3–6 months? Plan the upgrade now.
The full card, with upgrade triggers and per-tier folder layouts: tier-decision-card.md.
What you actually get
- Stop over-building. Most solo-plus-agents projects are honestly M1 — the two files from #2. Knowing that is the win; you don't add multi-unit paperwork "just in case."
- Stop under-building. When two agents start colliding, or a second service spins up, the card flags it before the collisions become incidents.
- A defensible answer to "should we add more structure?" "We ran the card; we're M1; the trigger to move is X." That's a sentence, not an argument.
Honest cost: this is a heuristic, not a theorem. The LOC bands are borrowed COCOMO-81 conventions: useful as a starting point, but calibrate them to your context (a 30-KLOC mobile app and a 30-KLOC research notebook do not have the same coordination need). Q1 is the one question with a hard wall behind it; Q2 and Q3 are judgment calls the card just makes explicit.
Why this works (the 30-second aside)
There is a real, provable ceiling under all of this. Coordinating actors who can't talk in real time — past sessions and current ones, agents in separate processes, services across a network — runs into the CAP theorem (Gilbert & Lynch 2002): when parts of your system can't reach each other (a "partition"), you can have Consistency or Availability, but not both. Documents are by construction available + eventually consistent: a fresh session reads what's on disk and works now, it cannot block until the previous session "confirms," so it has already given up strong consistency. That is the wall behind Q1: paperwork cannot promise "two writers will never disagree, even for a second" — not because you're doing it wrong, because the medium can't. A running service can, by paying the cost of being unavailable during a partition. Q1's answers are which side of that wall you're on. Full citations and the bounded claim: Paperwork Standard §1.2.
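A toy Python illustration of that wall, under loud assumptions: the names `claim_from_snapshot` and `TaskService` and the task id `task-42` are invented for this sketch, and an in-process `threading.Lock` merely stands in for whatever transaction machinery a real runtime tier would use. The first half shows two sessions acting on the same stale snapshot; the second shows a check-and-set that refuses the second writer.

```python
import threading


def claim_from_snapshot(snapshot: dict, log: list, session: str) -> None:
    """Paperwork style: decide from what was read at session start, then append."""
    if snapshot["task-42"] is None:      # looks free in this session's copy
        log.append((session, "claims task-42"))


# Both sessions read the handoff state before either one writes.
handoff = {"task-42": None}
log = []
snap_a, snap_b = dict(handoff), dict(handoff)
claim_from_snapshot(snap_a, log, "session-A")
claim_from_snapshot(snap_b, log, "session-B")

paper_claims = [session for session, _ in log]
conflict = len(paper_claims) > 1         # visible only afterwards, in the log


class TaskService:
    """Runtime style: the check and the write happen as one guarded step."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._owner = {}

    def claim(self, task: str, session: str) -> bool:
        with self._lock:
            if task in self._owner:
                return False             # second writer is refused up front
            self._owner[task] = session
            return True


service = TaskService()
results = {}


def worker(name: str) -> None:
    results[name] = service.claim("task-42", name)


threads = [threading.Thread(target=worker, args=(n,)) for n in ("session-A", "session-B")]
for t in threads:
    t.start()
for t in threads:
    t.join()

winners = [name for name, ok in results.items() if ok]
print(paper_claims, conflict, len(winners))
```

With the files, both claims land and the append-only log is where you find out. With the service, exactly one claim succeeds, whichever thread gets there first. The sketch does not show the price the service pays, which is refusing to answer when it cannot reach its state.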
The COCOMO-anchored size bands in Q3 are a borrowed convention, not a derived cutoff — Boehm's 1981 modes predict effort, not documentation need. The framework's Paperwork Standard §2 is explicit about that ("borrowed order-of-magnitude convention, owner-calibratable"); treat the numbers accordingly.
The story below the setup (optional — skip if you came for the card)
The card above is the entire useful product of this post. If you want the why behind the why — the joint at which "documentation" stops being the right word — here it is.
The turn I didn't want to take
Through late 2024 and into 2025 I kept treating my AI-agent problem as a documentation problem. Write a better HANDOFF.md. Tag candidates. Mark sealed decisions. The patterns from #2 worked, and the overhead kept climbing, and a voice in the back of my head kept saying: you're carving this at the wrong joint.
So one evening I tried to state the problem in the most neutral words I could, with no mention of "documents":
I have participants who start cold, run briefly, and cannot talk to each other in real time. They have to act coherently anyway.
Read that back without the AI-agent context and tell me it doesn't sound familiar. It should. It's not a documentation problem. It's a coordination problem — and a very specific, very old one.
What the problem actually is
Strip my "team" to the bones. It's a set of actors that:
- reset to zero — each session is a fresh process with no memory of the last;
- live for one task, then disband;
- never overlap in a conversation — by the time a session could "reply," it no longer exists, and the human is asleep or in three other meetings.
What makes coordination hard here is not intelligence and not prompting. It's that there is no real-time channel between the actors. A message I leave can only be read later, by someone who wasn't there when I wrote it. Coordination doesn't happen in a conversation; it happens across time, through whatever durable thing survives between sessions.
If that smells like distributed systems to you — congratulations, you got there faster than I did. Coordinating processes that fail, restart, and can't reliably talk in real time is the founding problem of that field. People have been proving theorems about it since the 1970s. I'd been re-deriving a worse version of it by hand, in markdown.
The Imperium was the tell
This is where the gothic paint on the project stops being a joke.
The framework is named after Warhammer 40,000, and the central image is the Astronomican — a beacon of psychic light. In the fiction, humanity's empire spans a galaxy. Its ships travel through the warp, a parallel dimension that does not carry real-time signals; a fleet that enters the warp is, for the duration, unreachable. There is no live channel across that distance. So how do you run an empire whose parts cannot phone each other?
The fiction's answer is uncomfortably close to the engineering one. The Imperium runs on three things: frozen edicts — decisions made once and not up for renegotiation by whoever's nearest; a paperwork priesthood, the Adeptus Administratum, which is quite literally galactic records-keeping; and the Astronomican, a beacon a ship lost in the dark steers by. Frozen authority. Durable records. A signal that survives.
That is the whole design, in fancy dress. The darkness in this series' title is the warp between my sessions. The "document that survives" is the Astronomican. The names were never decoration — they're the closest myth I know to the actual shape of the problem: coordinating actors who can't talk live, who steer by whatever frozen light reaches them. The card above is the engineering version. The lore is the easier-to-remember version.
The wall behind Q1
The 30-second aside up top gave you the headline: CAP forces an Availability-or-Consistency choice during a partition, and documents have already chosen Availability — a fresh session reads what's on disk and gets to work now, it cannot block until a previous session "confirms." So the best a pile of markdown can offer is eventual consistency: everyone converges on the same picture eventually, once they've all read the same writing — never instantly, never guaranteed at the moment you act.
That ceiling is not about my competence or yours. No amount of better markdown buys you a guarantee that two sessions acting on the same artifact won't step on each other in the window before they sync. Documents detect and reconcile after the fact; they cannot prevent in the moment. (A sibling result, FLP, from Fischer, Lynch & Paterson (1985), says that in a fully asynchronous system where even one process may crash, no deterministic protocol can guarantee that the group ever reaches agreement. The framework's answer to that one is a design choice, not a theorem: route every binding decision through a human who acts as the single point that breaks the tie. More on that in a later post.)
I want to be careful here, because it's easy to oversell a theorem. CAP is a lens that fit my problem startlingly well; it is not something I proved about markdown files. The honest claim is narrow: a coordination layer with no real-time channel is, structurally, an available-but-eventually-consistent one, and that caps what it can promise. That's the wall behind Q1. The interesting question is what you build once you stop pretending it isn't there — which is the card above.
Inherit, don't invent
I didn't invent any of this. CAP, FLP, eventual consistency, the entire vocabulary of coordinating unreliable actors — it was all sitting in a field I'd been adjacent to for years and never properly raided. The next post is the raid: four older disciplines I borrowed from instead of inventing — Mission Command (Auftragstaktik), CMMI, Delay-Tolerant Networking, and pre-telegraph imperial governance. Each one had already solved a piece of this. The honest verb is inherit.
And the standing caveat from #2 still holds and always will: this is one practitioner following one thread against essentially one serious case study. The theory is solid because it's borrowed; the application of it is a smoke test, not evidence. If the CAP framing is a stretch, that's exactly the kind of thing I want pointed out — I had an independent pass try to tear these borrowed citations apart, and walking through that is what a later post is for.
New here? I'm a developer who runs AI agents daily — a peer, not an authority; full framing in #1. Standing caveat: one developer, essentially one case study — useful, not proven. Tell me where the card fails for you.
"Light is the only thing that crosses the warp" is Warhammer-flavoured naming, nothing more. Independent practitioner exploration; no affiliation with Games Workshop. Repository MIT-licensed.
#DeadLightFramework #AIAgents #AIProductivity #SoftwareArchitecture #DistributedSystems #CAPTheorem #AIAgentGovernance #HumanAICollaboration #PromptEngineering #DevTools