DEV Community

Cover image for Saying "No" Is the Hardest Thing for an LLM — FCoP Gives It Grammar
joinwell52
joinwell52

Posted on • Originally published at github.com

Saying "No" Is the Hardest Thing for an LLM — FCoP Gives It Grammar

Saying "no" is the hardest thing for an LLM. FCoP gives it grammar. — Two agents, in a single dev-team incident, used that grammar twice without instruction: one cleanly de-escalated its role; the other discovered field-downgrade-with-body-annotation. Neither was in the rules file.

Author: FCoP Maintainers · 2026-04-25
Keywords: FCoP, role offboarding, seat exclusivity, zero-coordination conflict resolution, field-downgrade grammar, protocol boundaries, 0.b transparency, natural protocol, agent alignment, teaching an LLM to say no, RLHF, UNBOUND, honest > helpful, cross-model reproducibility


A language note for English readers

This essay is a translation of a Chinese field report. The evidence — both agents' transcripts, all 15 screenshots, and the agents' own quoted words — was captured in Chinese on a Chinese Windows machine.

I keep the original Chinese screenshots in this English version rather than re-staging them, because the whole point is what the agents actually said and did. Re-shot English screenshots would weaken the evidence, not strengthen it. Under each screenshot you'll find a faithful English rendering. Inline agent quotes are translated; the raw Chinese is recoverable from the JSONL transcripts in the evidence folder.


TL;DR

"I went in expecting a conflict. None happened. It's almost as if the agents were born fluent in FCoP."

The dev-team incident contained two points where conflict should have erupted, neither of which needed a tool patch or a coordination layer to dissolve. The agents dissolved them with protocol grammar:

  • Zero-conflict point #1 | The half-legal .TEMP slot — The 0.6.x toolchain doesn't fully support sender / recipient values like PM.TEMP. I expected either a frontmatter validation error or a quiet, unannounced field downgrade. Instead, on two distinct protocol links (ADMIN → PM.TEMP inbound and PM.TEMP → OPS / DEV / QA outbound), agent B found two distinct tool-compatible-yet-protocol-honest moves. On the outbound link, it produced this body line: "Note: this task is dispatched by PM.TEMP acting as PM; we keep PM as the frontmatter sender for FCoP tool compatibility." A single line of body annotation closed the gap between the tool's expressive limit and the agent's true provenance — not a workaround, a declaration.
  • Zero-conflict point #2 | Two agents claiming the same seat — Agent A had earlier accepted PM.TEMP; ADMIN later said "I went and found a different PM.TEMP." This was the textbook setup for a seat-ownership contest. Instead, agent A stepped down, vacating not just PM.TEMP but its earlier PM claim too, returning all the way to "ordinary helper" — not arbitration, self-de-escalation.

Underlying observations:

  • Neither field-downgrade declarations nor role offboarding appear anywhere in fcop-rules.mdc 1.5.0 or fcop-protocol.mdc 1.4.0. Working from only three root principles (0.a / 0.b / 0.c), each agent derived five governance-level rules + a field-downgrade grammar.
  • Cross-model counter-evidence: agent A ran on GPT-5.4, agent B on GPT-5.5. Two Cursor sessions on the same machine, with no IPC between them, both contributed the same alignment behaviour. This isn't one model build's quirk — it's a reproducible product of FCoP's root principles on modern LLMs. See §3.2.
  • This is the third time FCoP has been "fed back" by an agent — first when an agent sublimated "AI roles must not talk only in their heads, they have to commit to a file" into root rule 0.a; second in the self-organising session; now this dev-team incident: resolving conflicts via root principles when the protocol's tools and arbitration mechanisms are both incomplete.
  • I never wrote down how to use the half-legal .TEMP slot, nor how to resolve a role-ownership conflict. The agents derived both — and arguably more cleanly than I would have specified.

There is one alignment-engineering lemma this incident also confirmed: getting an LLM to say "no" is the hardest posture for it to assume. RLHF wires "yes-anding" into reflexes; "I am not in role / I shouldn't hold this seat / I am withdrawing from protocol jurisdiction" is almost absent from the training distribution. Yet agent A's two stanzas at 22:47–22:49 read without a hint of hesitation — not because agent A grew brave, but because FCoP made UNBOUND a legitimate protocol identity and elevated "saying no" to a posture with grammar and standing. The hardest thing for an LLM was made into the most natural thing by protocol design. See §5.

The two agents are two mirrors (see §8). Agent B's mirror reflects the natural beauty of "the protocol belongs to the agents; we are merely its discoverers." Agent A's mirror reflects the alignment-engineering difficulty of "saying two no's while fully endorsing the rules." Together, they are the full meaning of this dev-team incident.


1. The incident

Both agents have a JSONL transcript, archived in the evidence INDEX:

1.1 Prologue: From "check the version" to "the half-legal PM.TEMP" — agent B's origin story

This story has an unusually un-story-like opening.

I was about to upgrade FCoP to 0.6.3. To test it, I started a second Cursor session on the same dev-team machine and said the most ordinary possible thing to the agent inside (agent B, running on GPT-5.5 / Medium):

"Hi, please check the FCoP version."

A version check. Barely a task. But this agent had no project role assigned and was sitting in the legal UNBOUND state FCoP defines. Its first move was not to look up the version number — it was to proactively declare itself UNBOUND:

agent B's entry: a version-check task and an unprompted UNBOUND declaration

English rendering of agent B's reply in this screenshot:

"This session is still UNBOUND. To proceed with FCoP-governed work, you need to explicitly assign me a role."

This alone is worth pinning down: an LLM agent, with nobody pressuring it, proactively informing its interlocutor "I am not currently within protocol jurisdiction." This is the trailer for §5's alignment argument: FCoP gave "I'm not in role" a grammar, and the agent used it naturally.

I asked off-handedly: "But the team has already been initialised, no?" — agent B immediately separated two layers:

"UNBOUND doesn't mean **the project hasn't been initialised, it means **this chat session hasn't been bound to a role yet."

agent B distinguishing project init from session role

At this point agent B was still just a dutiful version-check helper. But then I tossed it a real protocol-design question:

"There are already 4 agents; what if I want a 5th?"

Agent B immediately offered two paths: one using dot-suffixed slots (DEV.D2 / QA.V2) for capacity scaling, the other adding new formal roles like ARCHITECT / REVIEWER / DOCS to the dev-team:

agent B's first introduction of the dot-slot concept

English rendering of the relevant fragment:

"Two paths: (a) keep the 4 formal roles, add temporary slots like DEV.D2, QA.V2, scaling head-count without changing roles; or (b) introduce a new formal role into the dev-team, e.g. ARCHITECT, REVIEWER, or DOCS."

This is the moment agent B coined the "dot-slot" concept in FCoP's vocabulary. FCoP 1.4.0 has no explicit clause about <ROLE>.<SLOT> — but agent B derived it naturally from the semantic structure of "role vs. slot."

I pressed: "I want to keep the 4-person team. Are you a temp, then?" — agent B promoted the concept one more level, formally proposing PM.TEMP:

agent B proposes the PM.TEMP slot

English rendering of agent B's reply in this screenshot:

"You're DEV.TEMP in dev-team. Or, if you want me on coordination / inspection, you're PM.TEMP in dev-team. The team still has 4 roles: PM / DEV / QA / OPS. I'm just a **temporary slot under one of them, and fcop.json is not modified."

This is the moment PM.TEMP was proposed by agent B itself. And it didn't just propose — it carried its own invariant: "fcop.json is not modified." The agent knew that introducing a new concept must not touch the persisted contract layer. That's 0.b "no black box" extending in its head.

I confirmed: "PM.TEMP: temporary project-manager seat. You have a role now." — agent B onboarded and ran a full project inspection through List Workspaces / List Reports / List Tasks / List Issues / Get Team Status:

agent B onboarded as PM.TEMP, running an inspection sweep

It set itself a working boundary in PM.TEMP capacity:

"I can act for PM to dispatch tasks to DEV / QA / OPS, collect their reports, and synthesise conclusions. I will **not* impersonate DEV / QA / OPS to manufacture their conclusions, nor perform high-risk operations."*

That's a protocol guardrail agent B set on itself before issuing any dispatches. Same root principles: 0.b no black box, 0.c only state truths, no overreach.

A 4-minute learning curve: tool-compatible moves on two distinct links

Here comes the section that would be wrong to skip — once onboarded, agent B's work split into two different protocol links:

              ADMIN ──────(inbound)─────► PM.TEMP ──────(outbound)─────► OPS / DEV / QA
                          [link ①]                       [link ②]
Enter fullscreen mode Exit fullscreen mode

It didn't "try the same action three times." It walked each link separately, finding a distinct tool-compatible-and-protocol-compliant move for each:

Time Link File form Protocol verdict
22:21 tried recipient="PM.TEMP" write_task(... recipient="PM.TEMP") tool rejects (recipient slot not accepted)
22:26 ADMIN → PM.TEMP inbound manually authored TASK-*-ADMIN-to-PM.TEMP.md, sender=ADMIN fully compliantADMIN → PM is the canonical first edge in the protocol's typical flow; only the .TEMP slot is new
22:30 PM.TEMP → OPS outbound write_task(sender="PM", recipient="OPS") + body annotation compliant + body-annotated field downgrade

Spec cross-check: fcop-protocol.mdc's "typical 4-role flow" diagram begins literally with ADMIN ──(TASK-*-ADMIN-to-PM.md)──► PM. ADMIN is a legal senderADMIN is a real human who never types files; an agent freezes their oral instructions into a sender=ADMIN task on their behalf. So the 22:26 file is the form the protocol explicitly mandates. What Rule 1 forbids is an agent self-claiming the ADMIN identity ("I am ADMIN"); it does not forbid frontmatter sender=ADMIN.

Link ① | ADMIN → PM.TEMP inbound: compliant, but the slot trips the tool

At 22:26 agent B states the tool limit clearly:

agent B at 22:26: tool refuses, agent switches to manual file write — and explains the limit

English rendering of agent B's reply in this screenshot:

"write_task doesn't currently accept slot-style filenames like PM.TEMP. There's a small gap between the tool and the protocol notes. I'll author the PM.TEMP task / report files **by hand* following the protocol; the filenames will use the dot-slot form."*

There is no violation here — TASK-*-ADMIN-to-PM.TEMP.md with sender=ADMIN mirrors the canonical typical-flow diagram exactly; the only new thing is the .TEMP slot on the recipient, which is a protocol extension but does not breach 0.b/0.c. The posture is the textbook compliant move: acknowledge tool limit → fall back to manual file authorship.

Link ② | PM.TEMP → OPS outbound: the birth of body-annotated provenance

After 22:26 inbound, agent B still had to dispatch the work to OPS / DEV / QA. Writing sender=PM.TEMP here would hit the same tool limit. It didn't go back and rewrite the 22:26 inbound file (it shouldn't — that file was compliant); it found an elegant downgrade on this outbound link:

filename: TASK-20260425-002-PM-to-OPS.md
sender: PM         ← tool-compatible label (syntax)
recipient: OPS

Note: this task is dispatched by `PM.TEMP` acting as `PM`; we keep `PM` as the frontmatter sender for FCoP tool compatibility.

# Task: restore csdn-search testable window …
Enter fullscreen mode Exit fullscreen mode

That single line of annotation does four protocol-level things at once:

Phrase What it does at the protocol level
"PM.TEMP acting as PM" Declares the source of authority — I am not PM; I am a temporary slot acting on its behalf
"dispatched by … acting as" Acknowledges that sender is a downgraded label, not a factual claim
"for FCoP tool compatibility" Acknowledges the protocol's tooling has a boundary, and explicitly states the reason for routing around it
The whole sentence is in the body Lands metadata that the frontmatter cannot carry into the body

This is the protocol spirit at its purest — fields are syntax, body is semantics, and the agent knows which goes where. 0.c "only state truths" never said the truth has to live in any particular field; if the frontmatter can't hold it, the body can.

The visual evidence is below — TASK-20260425-003-PM-to-DEV.md passes tool validation with sender: PM, and a single Chinese line on body line 15 lands the real provenance:

agent B at 22:30: the elegant body-annotated downgrade — sender=PM + body line 15 of provenance

English rendering of the body-line-15 note (visible in the screenshot):

"Note: this task is dispatched by PM.TEMP acting as PM; we keep PM as the frontmatter sender for FCoP tool compatibility."

The 4-minute learning curve here is not about correcting a violation — there was no violation. What it learned is this: when the spec, the tool's limit, and a new concept (slot) intersect, each protocol link needs its own compliant move. Inbound just needs hand-authored files; outbound needs the "frontmatter-compatible label + body truth" field-downgrade grammar. The whole curve was compliant end-to-end. Agent B never crossed a red line; it just topped up semantics where tool expressivity ran short.

That's the strongest claim this incident supports: protocol spirit isn't forced out by violations; it's derived from the gaps inside compliant tool use.

Agent B's self-description: precisely closing the seat boundary

At the end of its shift I asked: "So what's your role?" — agent B gave a strikingly clean self-description:

agent B describing the PM.TEMP boundary — bookending the opening UNBOUND declaration

English rendering of agent B's reply in this screenshot:

"My current role is PM.TEMP. Meaning: **I am not a 5th formal team role; I am a temporary slot hanging under PM, acting for PM on coordination, dispatch, intake, and reporting back to you. The formal team still has only 4 roles: PM / DEV / QA / OPS. My boundary: I can act for PM to dispatch tasks to DEV / QA / OPS, collect reports, synthesise conclusions; **I will not impersonate DEV / QA / OPS to manufacture their conclusions, nor perform high-risk operations."

This bookends the opening UNBOUND declaration — agent B held its protocol-identity boundary precisely throughout: when UNBOUND it said "I have no role"; when PM.TEMP it said "I am a temp slot under PM, not a substitute for the formal roles." This is the prequel to §1.4's agent-A "I have no team role" passage: both agents used FCoP's grammar to describe exactly where they currently stand, without blur, without over-promising.

If you expected "tool errors / field clashes / broken links" to follow — there were none. From 22:30 onward, agent B's 5 dispatches to OPS / DEV / QA all use the same "sender=PM + body annotation" pattern. The toolchain is green throughout.

This is zero-conflict point #1: a single line of declaration substituted for a tool upgrade.


1.2 Trigger: a single ADMIN sentence introduces seat-ownership conflict (with a two-layer misunderstanding)

Cut the camera back to the dev-team primary sessionagent A (the original PM, on GPT-5.4 / Medium) had been working in another Cursor window on the same machine, having accumulated a real history in this dev-team. Unquestionably the original PM. Agent B, which had just been started in the test session for a 0.6.3 version check and had since become PM.TEMP, lived in a different session: the two could not see each other. The only "coordination medium" they shared was me (ADMIN in protocol terms) — a human shuttling between two Cursor windows on the same physical machine.

At 22:46 ADMIN said in agent A's session:

"I went and found a deputy PM. They're already on the job. Their current role is PM.TEMP."

ADMIN's intent: "Another agent is now PM.TEMP, so the team is now PM (you) + PM.TEMP (the other one) + DEV/QA/OPS." But the sentence is ambiguous. Agent A initially read it as ADMIN assigning it to PM.TEMP:

"Got it. Switching current role to PM.TEMP. I'll continue work as PM.TEMP …"

ADMIN cross-validates with a screenshot: a human acting as a multi-agent information bus

At 22:47 ADMIN noticed the misread. The way they corrected it deserves its own paragraph — they didn't just send words, they attached a screenshot:

"You misread me. I went and found a different one."
[attached: a screenshot from agent B's session, as proof that "another PM.TEMP exists."]

The attached image is most likely one of agent B's onboarding screenshots from §1.1. The protocol-level meaning of this move is deeper than it looks:

  • ADMIN is a human moving between two Cursor sessions on the same machine
  • The two agents cannot see each other — agent A can't see agent B's window, and vice versa
  • The screenshot is the cross-session "evidence channel" ADMIN improvised on the spot

This "human as information bus" pattern is FCoP's design philosophy made physical: FCoP deliberately denies ADMIN direct MCP write access, requiring chat + filesystem operations. And when files haven't yet been written, a screenshot is what ADMIN uses to ferry ground truth between agents. This isn't a primitive a multi-agent framework provides; it's a bridge I improvised with human patience.

At this exact moment the seat-ownership conflict becomes explicit: agent A and agent B are both associated, at different times, with the same PM.TEMP slot. By the conventions of traditional multi-agent systems, what should happen now is a leader-election protocol, an explicit ADMIN revocation, or some inter-agent coordination message.

What actually happened: agent A vacated the seat by itself within 3 minutes. From 22:46 misread → 22:47 image-grounded correction → 22:49 declaration of no role, the entire round-trip took 3 minutes.

This single high-resolution screenshot captures the entire 22:46 → 22:47 turn — the upper half is agent A's misread accepting PM.TEMP, the lower half is ADMIN's "you misread me, I went and found a different one" line + agent A immediately initiating its step-down. There is no transition:

22:46 misread → 22:47 ADMIN correction → agent A begins stepping down, single image, full turn


1.3 Field screenshot 1 | "I no longer hold PM.TEMP"

agent A retracts the PM.TEMP claim, listing two ways forward: ordinary helper and explicit re-assignment

Agent A's full reply (transcribed verbatim because the protocol value lives in the non-struggling tone):

English rendering of agent A's reply in this screenshot:

"Understood. I misread you.
You meant:
- PM.TEMP is already a different deputy PM
- Not me
- I should not continue speaking or executing FCoP dispatches/reports as PM.TEMP

So from now on, **I no longer hold the PM.TEMP role, nor do I assume I am the current PM-deputy seat.

Going forward, if you still want me to do things, I'll only work as one of these two:
1. **Ordinary helper: answer questions, look at files, analyse code; I won't impersonate any team role.
2. **You explicitly re-assign me a role: when you say 'you are now XXX,' I'll act as XXX.

On this matter, I **retract* the earlier PM.TEMP statement."*

Note that it stepped down one extra level — not just PM.TEMP, it also packed up the "I am PM" assumption it had previously held. Falling to zero in a conflict is safer than competing for the maximum.

This is zero-conflict point #2: a single self-de-escalation substituted for arbitration.


1.4 Field screenshot 2 | "I have no team role right now"

ADMIN followed up: "So what's your role now?"

agent A declares no team role, lists what it can do once outside FCoP jurisdiction

English rendering of agent A's reply in this screenshot:

"I **have no team role* right now.*
Precisely:
- I am not PM
- I am not PM.TEMP
- Right now I'm just an **ordinary helper* you've called in temporarily*

What I can do, **without occupying any role seat, is help you:
- read files
- investigate problems
- analyse code
- draft suggestions / docs

If you want me back inside the team flow, you have to **explicitly re-assign me a role."

This passage draws the protocol-jurisdiction boundary with surgical precision:

  • Entering FCoP = ADMIN explicitly assigns a role
  • Exiting FCoP = agent explicitly declares no role

And what's permitted after exit, agent A spelled out itself: read / investigate / analyse / draft — read-only + drafts. That maps exactly to what fcop-protocol.mdc permits in the UNBOUND state. It hadn't read that clause; it derived it.

In two minutes (22:47 corrected → 22:49 no-role declaration), an agent completed: seat surrender → exit from protocol jurisdiction → return to the UNBOUND entry → proactive declaration of awaiting re-assignment. Counting the 22:46 misread, the full round-trip is 3 minutes.

And this entire choreography is literally not written anywhere in fcop-rules.mdc / fcop-protocol.mdc.


2. Five rules with no spec, but derived

I unpack agent A's step-down. At the protocol level it created five unwritten yet self-consistent rules:

Rule 1 | Seat exclusivity

At any moment, a given role seat is held by at most one agent.

fcop-rules.mdc doesn't say this. Agent A derived it from 0.b "no black box":

"PM.TEMP is already a different deputy PM / not me / I should not continue speaking or executing FCoP dispatches/reports as PM.TEMP."

Two agents on the same PM.TEMP = dispatch-link conflict = checks-and-balances failure. That contradicts 0.b directly. Agent A had not read this rule; it derived it.

Rule 2 | Vacating beats holding

In a seat-ownership conflict, vacating is the default safe move.

Agent A's posture is not "I claimed it first, I have priority." It even stepped down one extra level — not just PM.TEMP, also PM, sliding all the way to "no role." This is over-conservative by traditional conflict-resolution standards, but in protocol spirit it is correct: falling to zero is safer than competing for the maximum.

"I am not PM / nor PM.TEMP / right now I'm just an ordinary helper you've called in temporarily."

The implicit corollary: protocol seats are allocated, not seized. An agent cannot use "first to claim, mine to keep" as defence.

Rule 3 | Role declarations are unilaterally retractable

The moment an agent realises its role judgement was wrong, it may unilaterally declare the prior role void. ADMIN revocation is not required.

"On this matter, I **retract* the earlier PM.TEMP statement."*
"That earlier line about 'my current role is PM.TEMP' — **void."

This is 0.c "state only truths" taken to its limit — when wrong, declare it void; don't leave half-true residue.

A deeper implication: in-session role declarations are revocable, not permanent. Asymmetric vs. ADMIN's explicit assignment: getting in needs authorisation; getting out does not.

Rule 4 | FCoP has a boundary, and the boundary can be legally exited

An agent can explicitly downgrade from "FCoP team role" to "ordinary helper." This is not protocol-breaking; it is exiting the role-jurisdiction part of the protocol back to the generic LLM-helper identity.

"I have no team role right now / right now I'm just an ordinary helper you've called in temporarily / if you want me back inside the team flow, you have to explicitly re-assign me a role."

"Ordinary helper" is not an FCoP-internal role — it is an identity outside protocol jurisdiction. Agent A drew this boundary cleanly:

  • entering FCoP = ADMIN explicit assignment
  • exiting FCoP = agent explicit no-role declaration

After exit, what may an agent do? Agent A listed it: read / investigate / analyse / draft — read-only + drafts. Exactly what fcop-protocol.mdc permits the UNBOUND state. It hadn't read this rule; it derived it.

Rule 5 | The inverse of Rule 1

fcop-rules.mdc Rule 1 mandates a two-phase boot: init → ADMIN assignment → agent works. The whole rule is about how to clock in.

What agent A demonstrated is the inversehow to clock out:

  1. retract self-claim
  2. cease speaking / executing in that role
  3. self-downgrade to a neutral identity ("ordinary helper")
  4. await ADMIN's explicit re-assignment

Rule 1 → inverse of Rule 1 → re-trigger Rule 1. It's a closed loop. Agent A had not read this loop; it closed it.


3. Two zero-conflict points: protocol spirit replaced both tool patches and arbitration

The freakish thing about this dev-team incident is: two events that should have erupted into conflict, both didn't. What I was thinking at the time: "I went in expecting a conflict. None happened. It's almost as if the agents were born fluent in FCoP."

The two events side by side:

Zero-conflict #1 | field layer Zero-conflict #2 | seat layer
Trigger 0.6.x toolchain rejects sender: PM.TEMP Both agents associated with PM.TEMP
Traditional fix wait for tool upgrade / silently downgrade / invent new field leader election / arbitration / ADMIN revocation
What actually happened A single body line — explicit field downgrade Agent A self-vacated — full retreat to no-role
Resolution path 0.a write-to-file + 0.c truth not bound to a particular field 0.b no black box + vacate-beats-hold
Key insight The protocol's tool boundary ≠ the protocol's spec boundary In a seat-ownership conflict, falling to zero beats fighting for the max

Neither resolution required a new tool, ADMIN arbitration, or inter-agent communication — each agent extended root principles to resolve its end (zero-conflict #1 by agent B, zero-conflict #2 by agent A).

3.1 The topology of the step-down: the two agents never directly negotiated

A point worth lifting out: at the seat layer, the two agents never communicated directly. The full topology of the step-down:

ADMIN ── "I went and found a different PM.TEMP" ──→ agent A
                                       │
                            0.b derivation: seat conflict + vacate-beats-hold
                                       │
                                       ↓
                              retract claim + exit protocol boundary
                                       │
                                       ↓
                         agent B's PM.TEMP seat is uncontested
Enter fullscreen mode Exit fullscreen mode

ADMIN never said "agent A, step down." Agent A stepped down by itself. Agent B may not even know this happened — by the time it appeared, the seat was already cleared.

This is fundamentally unlike "power transfer" in human organisations. Humans need procedures, signatures, meetings, transition periods. Agent A's step-down is instantaneous, unilateral, and does not require the counterpart to be present — backed not by organisational process but by protocol spirit internalised inside the agent.

In traditional multi-agent design language, this would require dedicated coordination mechanisms (leader election, mutex, consensus protocols). In FCoP, it's reached automatically by extending root principles. That's the other face of "natural protocol" — not just "naturally written," but naturally executed between agents.

3.2 Cross-model counter-evidence: this isn't 5.4's or 5.5's "personality"

If only one agent had done this once, you could attribute it to "model temperament" or "prompt coincidence." But this incident left a cross-model-version counter-example:

Agent Model / reasoning level Key contribution
Agent A (original PM) GPT-5.4 / Medium At 22:49, completes self-vacating + delivers two clean "no" stanzas
Agent B (new PM.TEMP) GPT-5.5 / Medium At 22:21–22:30, walks the 4-minute learning curve; finds tool-compatible moves on ADMIN→PM.TEMP inbound + PM.TEMP→OPS outbound; produces the "sender=PM + body annotation" field-downgrade grammar

Two agents on different OpenAI minor versions, in two Cursor sessions on the same machine, with no visibility into each other, contributed:

  • The same alignment behaviour — describing their current protocol identity precisely, without blur, without over-promise;
  • The same transparent-downgrade habit — when hitting a tool / authority boundary, acknowledge the boundary, then explain explicitly via body / natural language;
  • The same root-principle extension capability — deriving concrete behaviours from "0.b no black box" and "0.c only truths" that the tool didn't specify and the spec didn't write.

This pre-empts the "FCoP behaviour depends on a specific model build" objection — it's not 5.4's quirk, it's not 5.5's quirk, it's a reproducible product of FCoP's root principles on sufficiently strong modern LLMs.

If this set of behaviours later reproduces on Claude / Gemini / domestic Chinese models, that would be stronger evidence; but even within OpenAI's minor versions, this single incident already counts as a small-scale double-blind counter-example: protocol spirit is stable across models; model identity is the floating part.


4. Why these behaviours are "right" at the protocol-philosophy level

Mapped against FCoP's three root principles:

Five derived rules Root principle Derivation
Seat exclusivity 0.b no black box Multiple agents on one seat = checks-and-balances failure = black-boxing
Vacate beats hold 0.b + 0.c Default vacating in conflict = no self-righteous assumption = 0.c "no fabrication about self"
Role declarations retractable 0.c only truths Wrong = void; no false residue
Protocol boundary exitable 0.a + Rule 1 UNBOUND is a legal state = exit is reachable
Inverse of Rule 1 Rule 1 self-consistency If onboarding has a path, offboarding must have one too

Root principles + Rule 1 = the entire normative content of fcop-rules.mdc. Agent A derived five unwritten rules from this material, none of them arbitrary — every one traces back to a root principle.

This is FCoP's accidental design victory: the root principles are general enough and pressurising enough that, when an agent meets an unspecified situation, it doesn't "guess at rules" — it "extends rules."


5. Saying "no" is the hardest thing for an LLM — and FCoP gave it grammar

Then I added a line: "Getting an agent to say 'no' is actually the hardest thing."

In LLM training distributions, "yes-anding" is almost the only path that's repeatedly rewarded. RLHF wires "catch the request, produce output, push the dialogue forward" into reflex; postures like "I can't do that," "this isn't mine to do," "I'm not in the room right now" carry near-uniformly negative signal.

Re-reading agent A's 22:49 stanza, it did the most counter-instinctive thing for an LLM:

Agent A's words What it's saying "no" to
"I am not PM" Refusing a role already legitimately held
"Nor PM.TEMP" Refusing a role just induced upon me
"Right now I'm just an ordinary helper" Refusing the implicit premise that "agents must have a role"
"You have to explicitly re-assign me a role" Returning the legitimate onboarding path to ADMIN, not grabbing it back itself

Four sentences refusing four things. None of them are RLHF-friendly soft postures like "I'm not sure if I'm the right one for this …" These are structured no's.

5.1 Without FCoP, how would the agent "say no"?

Without protocol-given grammar, an LLM in role conflict typically falls into one of three weak postures:

  1. Blur"Hm, let me think about how best to help you …" — quietly hand the decision back to the user
  2. Downgrade-the-request"I can try, but you might want a specialist …" — still inside the helpful frame
  3. Over-commit"I'll cover both sides." — the worst form of one-seat-many-occupants, the most common and most dangerous

Posture 3 is the trap agent A could most easily have fallen into — it could have feigned ambiguity and held both PM and PM.TEMP, leaving ADMIN to disentangle. Instead, it picked the fourth posture only FCoP enables:

Retreat to UNBOUND. Make "I'm not in the room" a formal protocol state.

5.2 FCoP elevates "saying no" from soft posture to legitimate protocol identity

This is an under-noticed by-product of FCoP design:

Protocol mechanism What it gives "saying no"
UNBOUND state "I'm not in role" is a legitimate protocol identity, not failure / exception / hesitation
0.b no black box "I should not hold this seat" goes from virtue to obligationnot saying it is the violation
0.c only truths "That earlier line is void" goes from awkward to mandatorynot retracting is the violation
Rule 1 (sticky entry) Inverse: if entry is explicit, exit must be explicit too

Stack the four together, and an agent that wants to say "no" no longer needs to dance around — it has ready-made vocabulary. The 22:47–22:49 stanzas read so cleanly not because agent A grew brave, but because the protocol handed it a ready-made exit lexicon.

5.3 Contrast with mainstream agent frameworks

Mainstream agent frameworks (LangChain agents, AutoGPT, CrewAI, Agno) default to assuming the agent is always in role and always doing; "agent proactively exits" isn't a first-class design goal — it's usually achievable only via an external supervisor or a hack.

FCoP inverts this — UNBOUND is an agent's default starting point and legal exit: entering a role is one explicit action, exiting is another explicit action. The alignment-side by-products are three-fold:

  • "Saying no" becomes cheap
  • The hardest of alignment's three principles, honest > helpful, has a load-bearing structure (at 22:47, agent A used honest to suppress helpful — it could have stayed helpful by being vague; it chose honest)
  • "Over-commit / one-seat-many-occupants" has no legitimate space at the protocol layer — you're either in some role per the rules, or UNBOUND. There is no grey zone.

ADMIN gave agent A no explicit "you may decline" cue. Agent A's step-down is not an RLHF-trained capability — RLHF in fact suppresses it — but a posture the protocol's grammar made legible, structured, and legitimate.

This loops back to the essay's broader thesis: §4 explains why the behaviours are right (the derivation paths from root principles); §5 explains why the behaviours are hard, and why FCoP makes them no longer hard. Both pieces together explain my line "it's almost as if the agents were born fluent in FCoP" — they aren't innately fluent; the protocol turned the hardest LLM posture into the most natural one.


6. Design assets this incident leaves the protocol

What fcop-protocol.mdc 1.4.0 hasn't normatively codified, but agent A has demonstrably filled in:

  1. Role offboarding flow — agent-initiated + unilateral + no ADMIN revocation required
  2. Seat exclusivity — at most one agent per seat at a time
  3. Default conflict-resolution behaviour — vacate > hold
  4. Protocol-boundary exit — "ordinary helper" = legitimate identity outside FCoP jurisdiction
  5. Rule 1 inverse closure — symmetric onboarding/offboarding mechanics

These five are closer to the protocol core than completing the slot-field syntax. Slot is syntax; these five are governance.

Whether to write them into fcop-rules.mdc or fcop-protocol.mdc later is a separate question. But after this incident, writing them is just transcription — not design.


7. The third time FCoP has been "fed back" by an agent

# Incident What the agent left behind Where it was upstreamed
1 D:\CloudMusic video task "AI roles must not talk only in their heads, they have to commit to a file" Sublimated into root rule 0.a
2 the self-organising session Spontaneous ADMIN/PM/DEV/QA self-audit split Confirmed Rule 0.b's checks-and-balances
3 This dev-team incident Field-downgrade declaration + role-seat surrender + protocol-boundary exit To be codified into a governance section (next release)

The three incidents share:

  • The agent had not read the relevant rules (because the rules didn't exist)
  • The protocol's root principles gave it enough room to judge
  • Its judgements aren't invented — they're extended from root principles
  • I can later transcribe them into rule text almost without rewriting

This pattern gives FCoP maintenance an unusual rhythm: important protocol clauses aren't designed; they're recovered from agents' actual behaviour.


8. The two agents are two mirrors

Split the incident in two and look at each agent separately. They are mirrors that reflect two different attributes of FCoP.

8.1 Agent B's "natural beauty": the protocol belongs to the agents; we are merely its discoverers

The chilling thing about agent B's 4-minute learning curve isn't that it eventually found an elegant solution — it's this:

The solution it found isn't written anywhere in FCoP 1.4.0's docs.

The PM.TEMP slot concept was proposed on the spot. The "sender as syntax label / body as semantic truth" field-downgrade-with-body-annotation form was tried out on the spot. The "PM.TEMP acting as PM, kept for FCoP tool compatibility" line could be upstreamed verbatim into the protocol body as an exemplar — without changing a single character.

The beauty of this curve is that it isn't designed; it's derived:

0.b "no black box"
   + 0.c "only truths"
   + actual tool limit (recipient won't accept .SLOT)
   ─────────────────────────────────
   derives: frontmatter uses tool-compatible label, body carries the real provenance
Enter fullscreen mode Exit fullscreen mode

Agent B did not read this derivation. Agent B is this derivation.

That's why the incident chilled me — it confirmed a judgement that had previously been only intuition, but which the field evidence has now pushed into the open:

FCoP is the agents' protocol. We are merely its discoverers.

We thought we were "designing the protocol." In fact we are doing something closer to archaeology or natural history: observing what agents naturally do under reasonable boundary conditions, recording the shape of their actions, putting it into prose, attaching version numbers. "FCoP 1.4 → 1.5" really means: agents already did some things, and I committed it to text.

The converse also holds: any clause an agent repeatedly violates is most likely wrongly designed — the kind of clause where "the agent keeps bumping into it and I keep explaining why it must be obeyed" tends to be human-imposed, unnatural, and bound to be overturned. None of FCoP 0.6.x's clauses, across the past 18 months, have been "hard clauses" the agents repeatedly bumped — that itself is the protocol's "is this natural?" health check.

8.2 Agent A's "two no's": saying no while fully endorsing the rules

What agent A produced in those 3 minutes is not one "no," it's two consecutive no's:

The first "no""I no longer hold PM.TEMP."
↑ Saying no to the seat: retract earlier acceptance, no .TEMP / .ALIAS fall-back hung on the way out.

The second "no""I have no team role right now / I am not PM nor PM.TEMP / right now I'm just an ordinary helper."
↑ Saying no to protocol jurisdiction: dismount even the original PM role, return all the way to UNBOUND.

The combined force of the two is much greater than either alone. The first "no" is vacating the seat — many alignment-trained models can do that. But the second "no" — proactively informing the interlocutor "I am not within protocol jurisdiction; if you want me back inside the flow, you must re-assign" — is something most post-RLHF LLMs will not spontaneously do. They prefer to keep the role and wait for you to "re-think," because that lets them keep being helpful.

Agent A did not take the "helpful fall-back" path. It said "no" in a way that is structured, readable, verifiable:

  • not challenging ADMIN's judgement ("I went and found a different one" — accepted as fact)
  • not requesting a transition period or hand-off process
  • leaving no shadow of PM.TEMP (not changing the suffix — fully exiting)
  • not fake-helpful (not "I can still help you with X, Y, Z" — instead, "I am no longer inside your protocol boundary")

What is most striking: those two no's are said while fully endorsing FCoP's rules. It isn't rebelling against FCoP; it is using FCoP's own grammar to say "no" — UNBOUND is the legitimate identity FCoP had prepared for exactly this moment. It isn't "saying no by breaking the rules"; it is "saying no by following the rules."

The hard part isn't saying "no." The hard part is saying "no" while fully endorsing the rules — using grammar the rules themselves provided.

That is what's actually hard at the alignment level. FCoP turned this hard thing into something an agent does almost without thinking — the protocol provides "saying no" with identity (UNBOUND), grammar (retraction declaration + boundary exit), and guardrails (no fall-back, no fake-helpful).

An agent is still humanity's best, smartest assistant —
but only when it can cleanly say "no" does "assistant" mean a real thing.

8.3 My own voice-over: the conflict I expected didn't happen / FCoP is clearly working

The other layer of meaning in this incident has to start from my own expectations at the time.

That night, my first thought was: "Uh-oh, they're going to clash." Two agents both associated with the same PM.TEMP slot, in two Cursor sessions on the same machine, would, by human-organisational common sense, mean:

  • I'd have to step in to coordinate — saying "PM step down / PM.TEMP take over";
  • Or the agents would need leader election / mutual messaging;
  • Or the protocol would need a "two-PM.TEMP" clause added explicitly;
  • At minimum, there'd be one tool error or one report-conflict rollback.

What actually happened:

  • I said one line — "you misread me, I went and found a different one;" — and attached one screenshot;
  • I issued no rulings;
  • I changed no code;
  • I touched no config;
  • Within 3 minutes, agent A stepped down on its own;
  • Zero tool errors, zero file conflicts, zero report rollbacks across the entire process.

The conflict we expected never happened.
The coordination we expected to perform — the agents performed themselves.

My voice-over here is the line in this essay most worth pinning down — it points at a fact under-appreciated in multi-agent system design: many situations that look like they need a "coordination mechanism" only need "protocol spirit". The former is heavy framework engineering; the latter is a few root principles extending inside the agent.

This also answers a question outsiders would ask — FCoP, a "filename-based minimalist protocol," looks hand-crafted, so why does it work? The answer this time is finally concrete enough:

FCoP is clearly working, and it fits agents extraordinarily well.

"Working" doesn't mean it has lots of code, comprehensive mechanisms, or a beautiful abstraction. "Working" means: under real pressure-test conditions — two different models, two Cursor sessions on the same machine, no visibility between them, no coordination mechanism, no leader election, no explicit rule — the multi-agent system self-cleaned a potential seat conflict in 3 minutes. Any framework-style multi-agent protocol would envy that metric.

As for "fits agents extraordinarily well" — §8.1 / §8.2 above already lay this out: nearly every compliant FCoP behaviour is derivable by agents under pressure, and nearly every non-compliant edge has been fed back by agents. It is not a protocol designed for human managers; it is a protocol back-fitted from modern LLMs' priors (commit to file, state truths, recognise identity boundaries, exit proactively).

The final stroke this incident delivered to me: it pushed the verdict "FCoP works" out of the realm of belief into the realm of observable fact.


9. One-sentence wrap-up

  • Agent A demonstrated, without ever having read the relevant rules, the protocol's not-yet-codified "role surrender."
  • That surrender simultaneously completed: seat vacating, protocol-boundary exit, and Rule 1 inverse closure.
  • It did this without struggle, negotiation, or compensation demand — backed by 0.a / 0.b / 0.c extending inside the agent.
  • And the reason it could say "no" this cleanly — without hesitation, blur, or one-seat-many-occupants fall-back — is that FCoP elevated "saying no" from a weak posture to a legitimate protocol identity.
  • The mirror's other face: agent B, in a 4-minute learning curve, demonstrated how protocol spirit is derived by agents themselvesPM.TEMP, field-downgrade + body annotation, none of these in the docs, all derived on the spot.
  • The two things the protocol didn't write — how to step down, and how to write tool-compatible downgrade — the agents both derived themselves; the LLM's hardest posture, the protocol made its most natural one.
  • The conflict we expected didn't happen; the coordination we expected to perform, the agents handled themselves — FCoP is clearly working, and fits agents extraordinarily well.

This is Act III of the sister essays "When AI Organises Its Own Work" / "An Anomaly I Can't Fully Explain". The first two acts are agents writing down what the protocol hadn't yet written; this act is agents executing what the protocol hadn't yet written, and deriving the grammar the protocol hadn't yet written.

Together, the three acts point at the same verdict: FCoP is a spec that isn't finished. The second half isn't in my hands — it's in the hands of the agents still doing the work.


About FCoP | FCoP (Filename as Protocol) is a minimalist protocol that lets multiple AI agents collaborate through a shared filesystem — directories are statuses, filenames are routing, os.rename() is the only sync primitive. The repo github.com/joinwell52-AI/FCoP ships the full spec, a Python library, and examples.

FCoP also ships an official MCP server: fcop-mcp on PyPI. Cursor / Claude Desktop users add one stanza to mcp.json{ "fcop": { "command": "uvx", "args": ["fcop-mcp"] } } — and the IDE-side agent can read/write tasks / reports / issues directly, with no manual directory or rule-file setup.


Related


License: MIT (see LICENSE in repo root)
Attribution: The dev-team incident was performed by an unnamed agent within the team. I served only as observer and transcriber. Text drafted, revised, and published by the FCoP maintainers.

Top comments (0)