Mohamed Salah

Posted on May 20

How I made every AI coding agent on my machine follow the same rules with one markdown file

#ai #githubcopilot #productivity #opensource

I use Claude Code, GitHub Copilot, OpenAI Codex CLI, and Cursor — sometimes in the same day. Each one drifts the same way: skips planning, forgets tests, picks stale library versions, and asks twelve questions in the middle of a build. So I wrote one markdown template and a small installer that drops the same rules into every agent on my machine. Here is the design.

The drift problem

If you have ever said "build me a kanban app" to an AI coding agent, you have lived this:

It picks a stack you would never pick yourself
It scaffolds, then stops to ask "what database do you want?" three files in
It writes the happy path and skips error handling
It "forgets" to add tests until you remind it
It pulls a 3-year-old version of a library because that is what its training data remembers

None of these are intelligence problems. They are workflow problems. The model is genuinely capable; the protocol around it is missing.

The fragmentation problem

Every agent reads a different configuration file:

GitHub Copilot reads .github/copilot-instructions.md
Claude Code reads CLAUDE.md or ~/.claude/skills/<name>/SKILL.md
Cursor reads .cursorrules
Windsurf reads .windsurfrules
Codex CLI, Cursor, Aider, Windsurf all also read AGENTS.md

That is five files saying the same thing, drifting independently the moment one gets updated. Most people pick one tool and ignore the rest, which traps them in that tool. I wanted the opposite — to be able to swap agents like editors and keep the same operating rules.

The one-template-many-files solution

The whole project is a folder of markdown. One source-of-truth template (SKILL.md), then a small build step that materializes the agent-specific files from it. The result: every agent on my machine reads the same intake, the same protocol, the same quality gates.

The protocol itself is four steps:

1. One-shot intake

Instead of the agent asking questions mid-build, it asks them all upfront, in a single round, with sensible defaults derived from the one-line command. I reply "go" or override specific lines. Then it executes without interruption.

This is the biggest single improvement. It removes ~80% of mid-build derails.

2. Plan before code

The agent writes PLAN.md listing features, dependencies, and order. No code is written until the plan exists. That sounds obvious but most agents skip it by default when given an action-shaped prompt.

3. Ralph-style feature loop

For each feature: spec → failing test → implement → run → on-fail debug-and-retry → on-pass checkpoint → next feature. Maximum three retry passes before escalating back to me.

The retry budget matters. Without it, the agent will grind on a broken edge case for an hour. With it, the loop exits loudly when it is stuck instead of silently when I stop watching.

4. Binary quality gate

Before "done" is allowed: lint passes, types pass, tests pass, build passes. Any red and it goes back to the loop. No "mostly working" allowed.

Why it works the same across agents

Because the same SKILL.md is the source. The materialized agent-specific files are formatting wrappers around the same protocol body. The Copilot version has Copilot front-matter, the Claude version has Claude front-matter, the rules are byte-identical.

Side benefit: when I switch from Copilot to Claude Code to test something, I do not relearn anything. The agent recognizes the same skill, runs the same intake, hits the same gates.

Localhost-first by design

I made one explicit choice that surprises some people: this is localhost-first. There is no "deploy to Vercel" step, no Docker, no production paths in v1. The reason is simple — every "ship to prod" feature I have seen baked into a coding agent has been the source of half its bugs. Shipping to localhost is a deterministic problem. Shipping to production is a conversation. Keeping those separate makes the agent behave better.

The honest limits

Skill auto-discovery varies by model. Claude triggers it reliably. Codex needs me to type /myvibe. Copilot picks it up from the prompt directory.
It will not save you from a weak model. If the base model cannot write the code, no workflow rescues it. What this does is fail earlier and louder, which is itself an improvement.
It is opinionated. If you disagree with "failing test first" or "three retries then ask", you will not like it. That is fine.

Try it

The whole thing is open source, MIT, no telemetry, no paid tier. The installer is ~80 lines you can read before running.

Windows:

iwr https://raw.githubusercontent.com/Mohamed201389/myVibe/main/bootstrap.ps1 | iex

macOS / Linux:

curl -fsSL https://raw.githubusercontent.com/Mohamed201389/myVibe/main/bootstrap.sh | bash

Repo: https://github.com/Mohamed201389/myVibe

The file I would read first is INTAKE.md — that is the "no mid-build interrogation" promise, and the highest-leverage idea in the kit.

Originally written from running into the same drift for the hundredth time. Feedback welcome.

DEV Community