DEV Community

ibrahim kobeissy
ibrahim kobeissy

Posted on

I built a compiler for how AI agents should write to you

I kept correcting AI agents in the same ways: "too long," "answer first," "use a diagram," "assume I know the jargon." Each correction improved the current exchange, but the preference was not represented as durable system state.

I built /calibrate-comms to make that state explicit. It is an open-source skill inside an Obsidian vault, used by both Claude Code and Codex.

The model: nine operational dials

The skill does not try to discover a personality type. It calibrates nine choices that directly change how an answer is rendered:

Dial Practical question
Density Tight sections or full reasoning?
Sequence Answer first or chronological build-up?
Modality Prose or diagrams for relational content?
Abstraction Concrete example or principle first?
Tradeoff One recommendation or several options?
Detail Main path or edge cases too?
Jargon Define terms or assume expertise?
Tone Casual, neutral, or formal?
Context-giving Should the agent extract missing context or split an overloaded brief?

Prior → calibration → directives

The workflow has three stages:

L1 PRIOR → L2 SAMPLE REACTION → COMPILE → CLAUDE.md
    hypothesis        empirical override       shared by both agents
Enter fullscreen mode Exit fullscreen mode

Quick mode asks one bespoke forced-choice proxy per axis. Those questions are deliberately labelled as proxies, not validated psychometrics. Deep mode must fetch the exact items from supported open-access instruments before use; if the source cannot be obtained, the skill stays in Quick mode and makes no validation claim.

The prior is then challenged through pairwise samples. For sequence, the contrast looks like this:

Build-up first:
We traced the latency spike to N+1 queries, then found lazy loading in a loop—so the fix is eager loading.

Answer first:
Fix: eager-load the association. Why: lazy loading in a loop caused N+1 queries and the latency spike.
Enter fullscreen mode Exit fullscreen mode

The user's pick is revealed preference. If it contradicts the prior, the sample wins.

The compiler is the useful part

The final profile is not a score report. A deterministic band-to-rule table emits instructions an agent can execute:

- Bottom line up front: answer in sentence one, justification after.
- Use a table or diagram when content is structural or relational.
- Assume working fluency; define only non-obvious terms.
Enter fullscreen mode Exit fullscreen mode

Those directives are written into a delimited COMMS-PROFILE block in CLAUDE.md. In this vault, AGENTS.md points to the same instructions, so Claude Code and Codex receive one source of truth.

Interactions are resolved explicitly. Low density plus complete detail becomes "thorough but layered." A preference for both answer-first and multiple options becomes a one-line lean followed by the option breakdown. A visual replaces prose when the user wants both diagrams and brevity.

Calibration, not classification

The skill rejects DISC, MBTI, VARK, and deterministic user types. Supported psychometrics are used only as optional priors. Pairwise choices, direct edits, and live corrections have higher authority because they measure what actually lands.

That live loop matters. If the user says "too long" during real work, the density directive can be corrected without rerunning the full process. A calibration that cannot learn from use is just a quiz.

Privacy is part of the architecture

A filled communication profile describes a person, so it is personal data. The private vault keeps the ledger and compiled directives. The public template ships the skill, templates, and an empty placeholder; its sync verifier fails if a populated profile appears in the public copy.

The implementation is plain Markdown and prompt-defined skills. Inspect the model, challenge the mappings, or adapt the compiler:

https://github.com/ibrahimkobeissy/ai-second-brain-template

Top comments (0)