TL;DR:
- Strategy docs that live in Notion or in a slide deck rarely affect day-to-day decisions, because nobody reads them at the moment a decision happens. LLM agents have the same problem in a worse form: they reinvent generic best practice every session.
- STRATEGY.md as code is the pattern of writing strategy as a file the agent must load before acting. Persona, USP, brand voice, goals, operation mode, and explicit constraints become inputs to the LLM's reasoning, not slides for a quarterly review.
- The pattern generalizes. INCIDENT_RESPONSE.md for SREs, INVESTMENT_THESIS.md for funds, THREAT_MODEL.md for security teams — anywhere the local rules differ from the generic best practice an LLM was trained on, a contract file closes that gap.
I have a Notion page from 2024 titled "FY24 Marketing Strategy". It is 18 pages. Three people on the team can find it. Two have read it. One read it once, then forgot it existed. Last quarter we made a budget reallocation decision that contradicted page 7 of that document, and it took six weeks for someone to notice.
This is the normal state of strategy documents. They are written to be approved, not to be consulted. The act of writing them is what the organization wants; the act of reading them is what nobody has time for. The Notion page exists so that, in a future meeting, somebody can say "as discussed in our strategy doc". It does not exist so that, on a Tuesday afternoon when you are deciding whether to bid on a competitor's brand term, you reread page 7.
LLM agents inherit this problem and make it worse. Every new chat session starts from zero context. The agent has read the entire public internet but knows nothing about your business. Ask Claude or ChatGPT a question about your ad account, your codebase, or your investment thesis, and it will helpfully return whatever the average answer to that kind of question looks like. "For a B2B SaaS account, you should consider..." Yes. For a B2B SaaS account. Not for mine.
The fix is not "prompt better". The fix is to write the business-specific constraints down in a file the agent loads at the start of every session, and to design the agent's tools so that constraints in that file are evaluated against the actions it proposes. Strategy as a contract, not a deck. That is what STRATEGY.md as code means in practice.
## What goes in the file
The shape that has held up for me, after writing this kind of file for an ad-ops product (mureo) and watching agents use it across a few hundred sessions, is six sections. Each one is a different kind of constraint on what the LLM is allowed to recommend.
Persona answers "who are we selling to". The agent reads this and stops generating ad copy aimed at the wrong audience. Without a persona, an LLM defaults to the median small business; with one, it speaks to a specific buyer.
USP answers "what is the differentiator". The agent reads this and stops recommending tactics that flatten the differentiator into commodity competition. If your USP is "the only one with X", the agent should not be suggesting price-led headlines.
Brand voice answers "how do we sound". This one is the easiest to underestimate. An LLM with no brand voice constraint produces text that sounds like an LLM. With three sentences of "no exclamation marks, no superlatives, no metaphors about journeys", the output stops sounding generic.
Goals answer "what numbers matter and by when". A goal is Target | Deadline | Current | Priority. The agent reads this and prioritizes the metric you actually care about, not the metric most legible on a dashboard.
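In that shape, a goals section of the file might look like the following. The column names come from the format above; the values are invented for illustration, not taken from a real account:

```markdown
## Goals
| Target                    | Deadline   | Current | Priority |
|---------------------------|------------|---------|----------|
| Blended CPA ≤ ¥4,500      | 2026-06-30 | ¥5,100  | High     |
| 1,200 paid subs / quarter | 2026-06-30 | 870     | Medium   |
```

A table like this is also trivially parseable, which matters once the agent starts checking current metrics against targets instead of just reading them.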
Operation mode answers "what is the current posture". I use seven values internally: ONBOARDING_LEARNING, TURNAROUND_RESCUE, SCALE_EXPANSION, EFFICIENCY_STABILIZE, COMPETITOR_DEFENSE, CREATIVE_TESTING, LTV_QUALITY_FOCUS. The exact set is less interesting than the fact that there is a set, with rules about which actions each mode permits. Mode is what flips an agent from "find new keywords to test" to "do not change anything, the algorithm is still learning".
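"Rules about which actions each mode permits" can be as small as a lookup table the agent's tooling consults before surfacing a recommendation. A minimal sketch — the mode names are the ones listed above, but the permitted-action sets here are invented for illustration:

```python
# Sketch: gate proposed actions on the strategy's current operation mode.
# Mode names are real (from the post); the permission sets are illustrative.
PERMITTED_ACTIONS = {
    "ONBOARDING_LEARNING":  {"report", "log"},  # algorithm still learning: observe only
    "TURNAROUND_RESCUE":    {"report", "log", "audit", "pause_campaign"},
    "SCALE_EXPANSION":      {"report", "log", "raise_budget", "new_keywords"},
    "EFFICIENCY_STABILIZE": {"report", "log", "bid_adjust"},
    "COMPETITOR_DEFENSE":   {"report", "log", "bid_adjust", "new_keywords"},
    "CREATIVE_TESTING":     {"report", "log", "new_creative"},
    "LTV_QUALITY_FOCUS":    {"report", "log", "audience_refine"},
}

def action_allowed(mode: str, action: str) -> bool:
    """True if the current mode permits the proposed action class."""
    return action in PERMITTED_ACTIONS.get(mode, set())
```

With this in place, `action_allowed("ONBOARDING_LEARNING", "new_keywords")` comes back `False` — the mode that means "do not change anything" mechanically blocks the generic "find new keywords to test" suggestion.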
Constraints answer "what is forbidden, and why". This is where the document earns its keep. Generic best practice plus your specific exceptions. The constraints section in our demo STRATEGY.md for a fitness-app account, for example, contains three rules:
```
1. No competitor-name bidding on either platform.
   Past attempt cost JPY 600K/quarter and produced 8 subscriptions
   at JPY 75K CPS — 16x worse than baseline.

2. Conversion campaigns optimize for "Subscribed" (paid signup),
   never for "Started trial". Trial-volume optimization trains the
   ad platforms toward shallow signups whose downstream trial-to-paid
   rate falls below the 12% quality floor.

3. Meta Lookalike stack capped at 3 active variants, and the
   variants must be 1%, 2%, 5% (or smaller). Past stacking of LAL 7%
   and 10% caused audience overlap, frequency above 8x, and CTR
   collapse within 14 days.
```
A vanilla LLM has no idea about any of those rules. They are the residue of past mistakes. Each rule has a number and a reason attached, because rules without reasons get ignored the moment they look inconvenient. With this file loaded, the agent has access to the same accumulated organizational scar tissue that a senior team member would carry in their head — and unlike that team member, the agent loads it on every single decision.
What "as code" actually changes
A document is read by humans, sometimes. A contract file is loaded by an agent, always. The difference is mechanical: the file goes through a tool call into the LLM's context window before any reasoning happens.
In mureo's case, v0.8.0 (released 2026-05-02) shipped five MCP tools that expose STRATEGY.md and STATE.json to the agent — mureo_strategy_get, mureo_strategy_set, mureo_state_get, mureo_state_action_log_append, mureo_state_upsert_campaign. The names are boring on purpose. What matters is the side effect: any host that speaks MCP — Claude Desktop chat, Cowork, the web UI, any of them — can now read the strategy file without needing direct filesystem access. Read and write. The agent can also propose updates to the strategy when the human says "we just changed our quarterly target", and the file moves with the conversation.
The reason this matters more than "the agent reads a markdown file" might suggest: most LLM hosts outside Claude Code do not have Read and Write filesystem tools. If your strategy lives in a local file, the agent in the chat window literally cannot see it. Exposing the file through an MCP tool is the bridge. The contract becomes part of the agent's input regardless of which client it is talking to.
The other thing that moves once the file is a contract: the agent's recommendations stop being LLM-generic. Concretely, here is what /daily-check produces today against a synthetic D2C cosmetics scenario where Meta CPA has spiked because of a broken Pixel:
The single biggest story: Meta CPA is 5.2x Google CPA — well past the STRATEGY.md "50% sibling-channel divergence ⇒ diagnose before more spend" tripwire — and three prior manual cuts have worsened the curve.
Google Ads (last 30d): blended CPA ¥2,054. Healthy.
Meta Ads (last 30d): blended CPA ¥10,714 against a ≤¥4,500 target. Recommend: run /rescue (pixel / Conversions API audit) on Meta. Hold all Meta bid/budget moves until divergence is diagnosed.
Read that closely. The agent quoted a constraint by name — "50% sibling-channel divergence ⇒ diagnose before more spend" — that exists nowhere outside of this account's STRATEGY.md. A vanilla LLM looking at the same numbers would tell you Meta CPA is high and suggest pausing underperformers. That is exactly what the human manager in the demo did, three times over twenty-five days, and it made things worse, because pausing a campaign whose tracking is broken does not fix the tracking.
The contract was the thing standing between the LLM and the same mistake.
## A second example, less dramatic, more typical
The seasonality-trap demo is the loud kind of failure. The strategy-drift demo is the quiet kind, and honestly the more important one.
A subscription fitness app. STRATEGY.md says no competitor bidding, optimize for Subscribed not Trial, cap Meta Lookalike at three variants. A new growth manager joins on Day 30 and, over the next month, violates each rule. They launch a "Competitor Names" campaign. They flip the Meta optimization target from Subscribed to Started Trial. They add LAL 4%, 7%, 10% on top of the existing 1%, 2%, 5%.
None of the three actions appears in the dashboard as red. The competitor campaign generates apparent volume. The optimization swap inflates the result count because trials are easier than subscriptions. The bigger Lookalike stack reaches more people. Each violation is paired with a better-looking surface metric. The action_log in STATE.json shows zero new entries since Day 30. That silence is itself the diagnostic signal, because the strategy says all actions must be logged.
mureo's STRATEGY-vs-STATE compliance audit walks the constraints, reads the campaign list, and produces a violation report with start dates and JPY-impact estimates. Not because the agent is clever. Because the rules and the state are both in files the agent can load, and "is rule N violated?" is a check the agent can perform mechanically. The cleverness is upstream — in the decision to write the rules down with reasons attached, instead of relying on tribal knowledge that the new manager never absorbed.
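The mechanical core of that audit fits in a few lines. The three rules are the fitness-app constraints quoted earlier; the campaign-record fields here are invented for illustration, not mureo's actual schema:

```python
def audit(campaigns: list[dict]) -> list[str]:
    """Walk the contract's rules against campaign state; return violations.
    Rules are from the fitness-app STRATEGY.md example; field names are invented."""
    violations = []
    for c in campaigns:
        # Rule 1: no competitor-name bidding on either platform.
        if c.get("targets_competitor_names"):
            violations.append(f"rule 1: {c['name']} bids on competitor names")
        # Rule 2: conversion campaigns optimize for Subscribed, never Started trial.
        if c.get("objective") == "conversion" and c.get("optimization_event") != "Subscribed":
            violations.append(f"rule 2: {c['name']} optimizes for {c.get('optimization_event')}")
    # Rule 3: at most 3 active Meta Lookalike variants, each 5% or smaller.
    lals = [c for c in campaigns if c.get("audience_type") == "lookalike" and c.get("active")]
    if len(lals) > 3 or any(c.get("lal_percent", 0) > 5 for c in lals):
        violations.append("rule 3: Lookalike stack exceeds 3 variants or the 5% ceiling")
    return violations
```

Feed the new growth manager's three Day-30 changes through a check like this and all three surface, even though every one of them improved a dashboard metric.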
I find this scenario more useful for explaining the pattern than the seasonality one. Loud failures get caught eventually; the manager will eventually realize their cuts are not working. Quiet drift compounds for months because the surface metrics keep looking fine. A contract file is the only thing that catches it.
## Where else the pattern fits
Marketing is not special here. Anywhere an LLM agent operates and the local rules differ from the generic best practice it was trained on, the contract pattern works.
For SREs running incident response, an INCIDENT_RESPONSE.md codifies severity classification, escalation chains, the postmortem template, and the explicit anti-patterns ("do not page the on-call DBA for read replica lag below 30 seconds — it self-resolves"). An on-call agent reading that file at the start of a page makes calibrated decisions instead of fashionable ones.
For an investment fund, an INVESTMENT_THESIS.md codifies the thesis itself, position sizing rules, sell triggers, and the kind of trade the fund refuses to make regardless of expected value. An agent doing pre-meeting prep with that file in context surfaces concerns that match the fund's actual operating principles, not LinkedIn's.
For a security team, a THREAT_MODEL.md lists assets in priority order, threats per asset, and the mitigations already in place. An agent triaging a CVE alert can compare it against the model and decide whether the vuln matters for this threat surface, not whether it is generally bad.
The shape under all three is the same: a file the agent loads, written in a format the agent can parse, containing the local rules that overrule generic best practice. The format does not have to be markdown. It has to be loadable, parseable, and short enough that loading it does not blow the context window.
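"Loadable, parseable, and short" can itself be enforced at load time. A tiny sketch, assuming a plain-text contract and a character budget chosen here purely for illustration:

```python
MAX_CONTRACT_CHARS = 16_000  # illustrative budget, roughly 4k tokens at ~4 chars/token

def load_contract(path: str) -> str:
    """Load a contract file, refusing one too large to sit in every session."""
    with open(path, encoding="utf-8") as f:
        text = f.read()
    if len(text) > MAX_CONTRACT_CHARS:
        raise ValueError(f"contract is {len(text)} chars; trim it before loading")
    return text
```

Failing loudly at load time beats a contract that silently crowds everything else out of the context window.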
The hardest part is not the format. It is convincing yourself to write the rules down, with reasons attached, instead of trusting them to live in three senior people's heads.
## What does not work
Honest limitations, because the pattern is not magic.
The contract is exactly as good as the human who wrote it. A vague STRATEGY.md produces vague reasoning. "Be data-driven" is not a constraint. "Do not change bids on a campaign whose 7-day conversion volume is below 30 unless the action_log shows two prior weeks of stable spend" is a constraint. The first one cannot be checked; the second one can.
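The second constraint is checkable precisely because every term in it maps to a field the agent can read. A sketch of the check, with field names invented for illustration:

```python
def bid_change_allowed(campaign: dict, action_log: list[dict]) -> bool:
    """Implements: no bid changes on a campaign below 30 conversions/7d,
    unless the action_log shows two prior weeks of stable spend.
    Field names are illustrative, not a real schema."""
    if campaign["conversions_7d"] >= 30:
        return True
    stable_weeks = [e for e in action_log
                    if e.get("campaign") == campaign["name"]
                    and e.get("weekly_spend_stable")]
    return len(stable_weeks) >= 2
```

"Be data-driven" has no such translation, which is the whole test: if you cannot write the check, you have written a slogan, not a constraint.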
Tacit knowledge still escapes. A senior ad ops person knows the brand without writing it down. They know not to bid on certain compound terms because of a regulatory issue from 2019 that nobody bothered to document. The agent does not know any of that. Writing those things down is the work, and most of the work happens after the first version of the file ships, when the agent makes a recommendation that violates an unwritten rule and the human has to either correct the agent or update the file. The file gets better; the unwritten rules shrink.
A grounded agent is not automatically a careful agent. The agent can quote the constraint and still recommend a bad action — confident-sounding writing is the LLM's default. The contract reduces a class of mistakes (generic recommendations that ignore local rules); it does not eliminate the meta-class of "LLMs are sometimes confidently wrong". Approval gates and rollback paths still matter; the contract is the input layer, not the safety layer.
The file has to be maintained. A STRATEGY.md that lasts a year without edits is either a stable business or a neglected file. Most are the second one. The fix is to make the agent propose edits and the human approve them — mureo_strategy_set exists for that — but the discipline of actually doing it through quarterly review still has to come from a human who cares.
## A small thing, if you want to try the pattern
Pick the smallest constraint your team has paid for in past mistakes. One sentence. Write it down with a number attached and a one-line reason. Put it in a file your LLM-using tooling can read at the start of a session. See if the next decision the agent surfaces is meaningfully different.
If it is, you have the start of a contract. Add a constraint per week as new edges of "the agent gave a generic answer that does not fit our context" become visible. Within a quarter you will have something a senior teammate could read and say "yes, that is how we actually operate". At which point you have something the LLM can also read.
mureo is one implementation of this pattern, focused on ad ops. The repo is at github.com/logly/mureo; the strategy doc format is in docs/strategy-context.md; the demo scenarios that exercise it live in mureo/demo/scenarios/. I would be more interested in seeing the pattern picked up outside marketing than in seeing more marketing-specific implementations of it. If you build an INCIDENT_RESPONSE.md or INVESTMENT_THESIS.md flavor of the same idea, I am at @yoshinaga on X — show me what you ended up with.
Yoshinaga (founder, mureo)