<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Yuan</title>
    <description>The latest articles on DEV Community by Yuan (@baskduf).</description>
    <link>https://dev.to/baskduf</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3951772%2F7a4b25e8-20f7-41a0-b645-7bd63c4f6fd9.jpeg</url>
      <title>DEV Community: Yuan</title>
      <link>https://dev.to/baskduf</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/baskduf"/>
    <language>en</language>
    <item>
      <title>I stopped prompt-engineering my AI coding agent. I started engineering the repo instead.</title>
      <dc:creator>Yuan</dc:creator>
      <pubDate>Fri, 29 May 2026 02:43:50 +0000</pubDate>
      <link>https://dev.to/baskduf/i-stopped-prompt-engineering-my-ai-coding-agent-i-started-engineering-the-repo-instead-1i3e</link>
      <guid>https://dev.to/baskduf/i-stopped-prompt-engineering-my-ai-coding-agent-i-started-engineering-the-repo-instead-1i3e</guid>
      <description>&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhuqlec0j9voapewuocfp.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhuqlec0j9voapewuocfp.png" alt=" " width="800" height="447"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  The itch
&lt;/h2&gt;

&lt;p&gt;Quick hello first: I'm a developer based in South Korea, and English isn't my&lt;br&gt;
first language — so I'll keep this plain and let the code and the repo do most&lt;br&gt;
of the talking. (If a sentence reads a little stiff, that's the translation tax,&lt;br&gt;
not the idea.)&lt;/p&gt;

&lt;p&gt;If you've handed real work to an AI coding agent, you know the pattern.&lt;/p&gt;

&lt;p&gt;It edits the wrong file. It re-introduces an approach your team rejected three&lt;br&gt;
months ago. It invents a folder structure nobody agreed to. You correct it in&lt;br&gt;
the chat, it says "You're absolutely right," and then the next session it does&lt;br&gt;
the exact same thing — because the next session starts with none of that&lt;br&gt;
context.&lt;/p&gt;

&lt;p&gt;I spent weeks getting better at prompting. Better system messages, better&lt;br&gt;
examples, sharper instructions. It helped one conversation at a time. But every&lt;br&gt;
new task, every new branch, every teammate's agent started from zero again.&lt;/p&gt;

&lt;p&gt;At some point the framing flipped for me: &lt;strong&gt;prompt engineering improves one&lt;br&gt;
interaction. It does nothing for the environment that interaction happens in.&lt;/strong&gt;&lt;/p&gt;
&lt;h2&gt;
  
  
  The reframe: engineer the repo, not the prompt
&lt;/h2&gt;

&lt;p&gt;So I stopped optimizing prompts and started optimizing the &lt;em&gt;repository&lt;/em&gt;. The&lt;br&gt;
idea — I've been calling it &lt;strong&gt;harness engineering&lt;/strong&gt; — is to turn the implicit&lt;br&gt;
context an agent keeps missing into durable artifacts that live in the repo and&lt;br&gt;
survive every session.&lt;/p&gt;

&lt;p&gt;It comes down to four components, plus a way to keep them from rotting:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Component&lt;/th&gt;
&lt;th&gt;Job&lt;/th&gt;
&lt;th&gt;Typical files&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Instruction document&lt;/td&gt;
&lt;td&gt;Tell the agent how to behave&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;AGENTS.md&lt;/code&gt;, &lt;code&gt;CLAUDE.md&lt;/code&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Architecture constraints&lt;/td&gt;
&lt;td&gt;Block invalid structure before merge&lt;/td&gt;
&lt;td&gt;linters, type checks, import rules&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Feedback loops&lt;/td&gt;
&lt;td&gt;Correct behavior fast&lt;/td&gt;
&lt;td&gt;tests, CI, pre-commit, examples&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Knowledge store&lt;/td&gt;
&lt;td&gt;Preserve decisions and dead ends&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;docs/decisions&lt;/code&gt;, &lt;code&gt;docs/failures&lt;/code&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;And the part most people skip: &lt;strong&gt;garbage collection&lt;/strong&gt; — drift checks that fail&lt;br&gt;
when docs reference missing files, when temp files sneak in, when the structure&lt;br&gt;
wanders away from what you agreed to.&lt;/p&gt;

&lt;p&gt;The operating principle that ties it together:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Every recurring agent failure should become at least one durable artifact — a&lt;br&gt;
clearer instruction, an automated constraint, a test or CI check, a decision&lt;br&gt;
or failure record, or a drift check.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;You're not trying to make the agent perfect. You're trying to make the project&lt;br&gt;
&lt;strong&gt;easier to understand and harder to damage.&lt;/strong&gt;&lt;/p&gt;
&lt;h2&gt;
  
  
  The build
&lt;/h2&gt;

&lt;p&gt;I packaged this into a starter kit. Design constraints I gave myself:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Tool-agnostic.&lt;/strong&gt; It's prompt-first: you hand any agent the repo URL, it
reads the kit, and it adapts the pattern to &lt;em&gt;your&lt;/em&gt; stack. It is not locked to one
vendor's agent.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Boring on purpose.&lt;/strong&gt; MIT licensed, standard-library Python, plain Markdown.
No framework to install, and nothing that takes an afternoon to audit.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Conservative.&lt;/strong&gt; It inspects the target repo first and adds only the missing
pieces. It never blindly overwrites your files.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Because I kept tripping over English-only docs myself, I wrote the README in&lt;br&gt;
four languages — English, Korean, Japanese, and Chinese. If you've ever bounced&lt;br&gt;
off a great tool because its docs assumed your first language, you know why that&lt;br&gt;
mattered to me.&lt;/p&gt;

&lt;p&gt;A drift check is deliberately tiny, small enough to read in one sitting:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# scripts/check_docs_drift.py (excerpt)
# Fails when a doc links to a path that doesn't exist,
# so your README can't quietly rot.
&lt;/span&gt;&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;doc&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;reference_value&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;missing_paths&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Missing referenced path in &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;doc&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;relative_to&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;root&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;reference_value&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;missing_paths&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
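
&lt;p&gt;The excerpt only shows the reporting loop. As a rough, self-contained sketch of how &lt;code&gt;missing_paths&lt;/code&gt; could be collected, the code below assumes the docs are Markdown files and that references are ordinary Markdown links. The names &lt;code&gt;LINK_TARGET&lt;/code&gt;, &lt;code&gt;find_missing_paths&lt;/code&gt;, and &lt;code&gt;main&lt;/code&gt; are hypothetical and are not taken from the kit, whose real script may detect references differently.&lt;/p&gt;

```python
import re
from pathlib import Path

# Hypothetical pattern: the target of a Markdown link such as [text](docs/x.md).
LINK_TARGET = re.compile(r"\[[^\]]*\]\(([^)\s]+)\)")


def find_missing_paths(root):
    """Return (doc, target) pairs for local links whose target does not exist."""
    missing = []
    for doc in sorted(Path(root).rglob("*.md")):
        text = doc.read_text(encoding="utf-8")
        for target in LINK_TARGET.findall(text):
            # URLs, mail links, and in-page anchors cannot drift on disk; skip them.
            if "://" in target or target.startswith(("#", "mailto:")):
                continue
            # Drop a trailing "#section" before checking the file itself.
            path_part = target.split("#", 1)[0]
            if not (doc.parent / path_part).exists():
                missing.append((doc, target))
    return missing


def main(root):
    """Print every missing reference and return a CI-friendly exit code."""
    root = Path(root)
    missing_paths = find_missing_paths(root)
    for doc, reference_value in missing_paths:
        print(f"Missing referenced path in {doc.relative_to(root)}: {reference_value}")
    return 1 if missing_paths else 0
```

&lt;p&gt;Links are resolved relative to the document that contains them, which is an assumption of this sketch. A wrapper script would typically pass the repository root to &lt;code&gt;main&lt;/code&gt; and hand the result to &lt;code&gt;sys.exit&lt;/code&gt; so CI fails on drift.&lt;/p&gt;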



&lt;p&gt;The instruction file (&lt;code&gt;AGENTS.md&lt;/code&gt;) is just as plain — project overview,&lt;br&gt;
directory rules, exact commands, forbidden actions, PR behavior. Nothing magic.&lt;br&gt;
The value isn't in any one file; it's in having all four pillars present and&lt;br&gt;
enforceable.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fo4o4mv2j8upixtayvux5.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fo4o4mv2j8upixtayvux5.png" alt=" " width="799" height="288"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h2&gt;
  
  
  The dogfood: building a Django app &lt;em&gt;through&lt;/em&gt; the harness
&lt;/h2&gt;

&lt;p&gt;A methodology you only ever describe is a blog post. I wanted to know if it&lt;br&gt;
actually held up, so I took an &lt;strong&gt;empty folder&lt;/strong&gt; and built a small Django app —&lt;br&gt;
a board with post CRUD, ownership permissions, admin user management, search,&lt;br&gt;
pagination, comments — entirely through the harness workflow.&lt;/p&gt;

&lt;p&gt;What accumulated in the repo as I went:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;8 decision records&lt;/strong&gt; in &lt;code&gt;docs/decisions/&lt;/code&gt; — why generic-first, why this
Django layout, why post-ownership permissions work the way they do.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;CI&lt;/strong&gt; on GitHub Actions, running the same harness checks on every push that I run locally.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;.harness/source.json&lt;/code&gt;&lt;/strong&gt; pinning the exact kit commit the repo absorbed, so
"which version of the methodology is this repo on?" has a real answer.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The features weren't the point. The point was that every behavior change shipped&lt;br&gt;
&lt;em&gt;with&lt;/em&gt; its verification and its durable memory, automatically, because the&lt;br&gt;
harness made that the path of least resistance.&lt;/p&gt;
&lt;h2&gt;
  
  
  The moment it bit me — and why that's the best part
&lt;/h2&gt;

&lt;p&gt;Here's the beat I didn't plan for.&lt;/p&gt;

&lt;p&gt;I added CI. It immediately went red. The failure was inside my own drift&lt;br&gt;
checker:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Missing referenced path in docs/decisions/0002-initialize-django-config-project.md: .\.venv\Scripts\python.exe
Missing referenced path in docs/harness/adoption-report.md: .\.venv\Scripts\python.exe
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;My docs documented a Windows virtual-env command, &lt;code&gt;.\.venv\Scripts\python.exe&lt;/code&gt;.&lt;br&gt;
My drift checker saw the backslashes and "helpfully" decided it was a file&lt;br&gt;
reference — then checked whether that file existed. It existed &lt;strong&gt;on my Windows&lt;br&gt;
machine&lt;/strong&gt;, so it passed locally. It did &lt;strong&gt;not&lt;/strong&gt; exist in Linux CI. Green on my&lt;br&gt;
laptop, red in the cloud. The classic.&lt;/p&gt;

&lt;p&gt;The tool I built to catch drift had drifted.&lt;/p&gt;

&lt;p&gt;But this is exactly what the methodology is &lt;em&gt;for&lt;/em&gt;. I didn't just patch it and&lt;br&gt;
move on. I fixed the checker to recognize venv commands by executable name&lt;br&gt;
instead of path existence — and then I wrote it down, in&lt;br&gt;
&lt;code&gt;docs/failures/0001-docs-drift-windows-venv-command.md&lt;/code&gt;: context, symptoms, root&lt;br&gt;
cause, resolution, prevention. Now the next agent (or the next me, at 1 a.m.)&lt;br&gt;
that touches drift logic reads a one-page record instead of rediscovering the&lt;br&gt;
trap.&lt;/p&gt;
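
&lt;p&gt;To make "by executable name instead of path existence" concrete, here is a minimal sketch of that kind of rule. It is an illustration under assumptions, not the code from the repo: the function names and the sets &lt;code&gt;VENV_EXECUTABLES&lt;/code&gt; and &lt;code&gt;VENV_DIRS&lt;/code&gt; are hypothetical.&lt;/p&gt;

```python
from pathlib import PureWindowsPath

# Assumed interpreter names; the real checker may recognize a different list.
VENV_EXECUTABLES = {"python", "python3", "python.exe", "pip", "pip.exe"}
# Assumed virtual-env directory names.
VENV_DIRS = {".venv", "venv"}


def is_venv_command(token):
    """True when the token looks like a virtual-env executable on any OS."""
    # PureWindowsPath splits on both backslashes and forward slashes,
    # so one parser covers Windows-style and POSIX-style spellings.
    parts = PureWindowsPath(token).parts
    if not parts:
        return False
    name = parts[-1].lower()
    return name in VENV_EXECUTABLES and any(p in VENV_DIRS for p in parts[:-1])


def should_check_existence(token):
    """Only tokens that are not venv commands are treated as file references."""
    return not is_venv_command(token)
```

&lt;p&gt;Because the decision depends only on the text of the token, it gives the same answer on a Windows laptop and in Linux CI, which is the property the original check was missing.&lt;/p&gt;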

&lt;blockquote&gt;
&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqzs76ikbjy2lny8mjpnm.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqzs76ikbjy2lny8mjpnm.png" alt=" " width="800" height="743"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;A recurring failure became a durable artifact. The principle, applied to the&lt;br&gt;
tool itself.&lt;/p&gt;
&lt;h2&gt;
  
  
  The result (and an honest limit)
&lt;/h2&gt;

&lt;p&gt;I also built a quick diagnostic, &lt;code&gt;harness doctor&lt;/code&gt;, that scores how ready a repo&lt;br&gt;
is for reliable agent collaboration across five axes. Run against the dogfooded&lt;br&gt;
Django app, it reports:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Score: 83/100 (baseline evidence scan)
Grade: B+ (baseline)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;blockquote&gt;
&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fpkny395qk9pskqfr2jhp.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fpkny395qk9pskqfr2jhp.png" alt=" " width="800" height="434"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Now the honest part, because dev.to readers can smell a pitch: &lt;strong&gt;that score is a&lt;br&gt;
baseline evidence scan.&lt;/strong&gt; It checks that durable files and command patterns&lt;br&gt;
exist — it does not yet prove that adopting the harness &lt;em&gt;measurably&lt;/em&gt; reduces&lt;br&gt;
repeated agent mistakes. I have the measurement protocol (wrong-file edits,&lt;br&gt;
first-pass verification, repeated-mistake counts) but not enough before/after&lt;br&gt;
runs to claim a number. That's the next thing I'm working on, and it's the&lt;br&gt;
subject of a follow-up post.&lt;/p&gt;

&lt;p&gt;So: take this as a strong qualitative result — a real app, real decision and&lt;br&gt;
failure memory, a tool that caught its own bug — not as a benchmark. Yet.&lt;/p&gt;

&lt;h2&gt;
  
  
  Takeaway
&lt;/h2&gt;

&lt;p&gt;The shift that actually changed how my agents behave wasn't a cleverer prompt.&lt;br&gt;
It was treating the repository as the thing you engineer:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Every recurring agent failure should become a durable artifact.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;If your agent keeps making the same mistake, stop re-explaining it in chat.&lt;br&gt;
Write it into the repo once, in a form the next session can't miss.&lt;/p&gt;

&lt;p&gt;The kit is MIT-licensed and on GitHub:&lt;br&gt;
&lt;strong&gt;&lt;a href="https://github.com/baskduf/harness-starter-kit" rel="noopener noreferrer"&gt;https://github.com/baskduf/harness-starter-kit&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;To try it, open your repo with your agent and point it at that URL — it'll&lt;br&gt;
inspect your stack and add only the harness pieces you're missing. Stars,&lt;br&gt;
issues, and "this broke on my stack" reports are all genuinely welcome — and if&lt;br&gt;
you read the Korean, Japanese, or Chinese README and something sounds off, tell&lt;br&gt;
me; I maintain all four by hand.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Part 2 will be the measurement: can I show, with numbers, that a harnessed repo&lt;br&gt;
makes agents repeat fewer mistakes? Follow along if that's the post you actually&lt;br&gt;
want.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What stack would you try this on? Tell me in the comments — I'm collecting&lt;br&gt;
real targets for the measurement study.&lt;/strong&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>programming</category>
      <category>productivity</category>
      <category>opensource</category>
    </item>
    <item>
      <title>Harness Engineering: Stop Re-Prompting Your Coding Agent Every Session</title>
      <dc:creator>Yuan</dc:creator>
      <pubDate>Tue, 26 May 2026 05:10:05 +0000</pubDate>
      <link>https://dev.to/baskduf/harness-engineering-stop-re-prompting-your-coding-agent-every-session-4lec</link>
      <guid>https://dev.to/baskduf/harness-engineering-stop-re-prompting-your-coding-agent-every-session-4lec</guid>
      <description>&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7i93rhgr7i8k09e3x6z3.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7i93rhgr7i8k09e3x6z3.png" alt=" " width="800" height="505"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Every time I started a new agent session, I was re-explaining the same things.&lt;/p&gt;

&lt;p&gt;The architecture rules. The patterns to avoid. The decisions I'd already made. The approaches that already failed.&lt;/p&gt;

&lt;p&gt;The agent would forget everything and I'd be back to square one.&lt;/p&gt;

&lt;p&gt;My first instinct was to write better prompts. Longer, more detailed, more explicit. But that just made the problem worse — now I had a 200-line prompt to maintain, and the agent still forgot it all next session.&lt;/p&gt;

&lt;p&gt;The real problem isn't the prompt. &lt;strong&gt;Prompts are session-scoped. They disappear.&lt;/strong&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  A Different Approach: Harness Engineering
&lt;/h2&gt;

&lt;p&gt;Instead of putting rules in prompts, what if we put them in the repository itself?&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Prompting is temporary. Context is session-scoped. A harness is project-scoped.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;The idea is simple: every time an agent makes a recurring mistake, convert it into a durable artifact in the repo instead of re-prompting.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Agent repeats a bad pattern → add a lint rule that blocks it&lt;/li&gt;
&lt;li&gt;Agent forgets architecture decisions → write an ADR in &lt;code&gt;docs/decisions/&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Agent retries approaches that already failed → log them in &lt;code&gt;docs/failures/&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Agent ignores conventions → codify them in &lt;code&gt;AGENTS.md&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The repository gets smarter with every mistake. The agent reads the repo at the start of each session and picks up all the rules automatically — no prompting required.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Five Components of a Harness
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;1. &lt;code&gt;AGENTS.md&lt;/code&gt; — Instruction Document&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;A file the agent reads at the start of every session. Contains project overview, directory rules, forbidden patterns, test commands, and PR behavior. Think of it as a permanent briefing document.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. Architecture Constraints&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Automated rules that block invalid code before it merges. Linters, type checks, import boundaries, pre-commit hooks. If &lt;code&gt;AGENTS.md&lt;/code&gt; says "no direct DB access from routes", add a lint rule that enforces it.&lt;/p&gt;
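
&lt;p&gt;As a rough illustration of turning that sentence into a check, the sketch below uses Python's standard &lt;code&gt;ast&lt;/code&gt; module to flag forbidden imports in route code. The module prefix &lt;code&gt;app.db&lt;/code&gt; and the function name &lt;code&gt;forbidden_imports&lt;/code&gt; are made up for the example; in a real project you would more likely configure the import rules of the linter you already use.&lt;/p&gt;

```python
import ast

# Hypothetical module prefix that route code must not import directly.
FORBIDDEN_PREFIXES = ("app.db",)


def forbidden_imports(source, filename="routes.py"):
    """Return 'file:line: message' strings for each forbidden import in source."""
    problems = []
    for node in ast.walk(ast.parse(source, filename=filename)):
        if isinstance(node, ast.Import):
            modules = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom):
            # node.module is None for "from . import x"; treat that as no match.
            modules = [node.module or ""]
        else:
            continue
        for module in modules:
            if any(module == p or module.startswith(p + ".") for p in FORBIDDEN_PREFIXES):
                problems.append(
                    f"{filename}:{node.lineno}: routes must not import {module} directly"
                )
    return problems
```

&lt;p&gt;This sketch only matches on the imported module path, so a spelling like &lt;code&gt;from app import db&lt;/code&gt; would slip through. The messages carry a file name and line number so an agent can act on them without guessing.&lt;/p&gt;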

&lt;p&gt;&lt;strong&gt;3. Feedback Loops&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Signals the agent uses to self-correct. Test failures, CI failures, lint errors. A good harness gives the agent clear, actionable failure messages so it can fix its own mistakes.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;4. Knowledge Store (&lt;code&gt;docs/&lt;/code&gt;)&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Durable context that survives session resets:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;docs/decisions/&lt;/code&gt; — why certain architectural choices were made&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;docs/failures/&lt;/code&gt; — approaches already tried and rejected&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;docs/conventions/&lt;/code&gt; — project-specific coding rules&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;docs/domain/&lt;/code&gt; — business terminology and domain knowledge&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;5. Drift Checks&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Scripts that detect when the harness itself goes stale. Is &lt;code&gt;AGENTS.md&lt;/code&gt; referencing files that no longer exist? Are there temporary files that never got cleaned up? Drift checks catch this automatically.&lt;/p&gt;
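
&lt;p&gt;The temporary-file half of that idea can be sketched in a few lines. The patterns in &lt;code&gt;TEMP_PATTERNS&lt;/code&gt;, the skipped directories, and the name &lt;code&gt;find_temp_files&lt;/code&gt; are assumptions chosen for the example, not what the kit ships.&lt;/p&gt;

```python
import fnmatch
import os

# Assumed patterns for leftovers; adjust to what your project treats as temporary.
TEMP_PATTERNS = ("*.tmp", "*.bak", "*.orig", "*~")
# Directories that are expected to hold generated or third-party files.
SKIP_DIRS = {".git", ".venv", "node_modules"}


def find_temp_files(root):
    """Return sorted repo-relative paths of files that match a temp pattern."""
    found = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune in place so os.walk does not descend into skipped directories.
        dirnames[:] = [d for d in dirnames if d not in SKIP_DIRS]
        for name in filenames:
            if any(fnmatch.fnmatch(name, pattern) for pattern in TEMP_PATTERNS):
                found.append(os.path.relpath(os.path.join(dirpath, name), root))
    return sorted(found)
```

&lt;p&gt;A CI step could fail whenever the returned list is non-empty and print the paths, so the cleanup becomes part of the same change that created the files.&lt;/p&gt;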




&lt;h2&gt;
  
  
  harness-starter-kit
&lt;/h2&gt;

&lt;p&gt;I built a starter kit that applies this pattern to any existing project.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Usage:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Clone it inside your target repo:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;workspace/
└── target-repo/
    ├── harness-starter-kit/
    └── existing project files
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then give your coding agent this prompt:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Read ./harness-starter-kit first, then apply the harness engineering 
starter kit to this repository. Preserve existing architecture, tools, 
and conventions. Do not overwrite existing files without explaining why.
Finish with a short adoption report.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The agent inspects your repo, adapts to your existing tools (eslint, tsc, ruff, etc.), and installs only the missing harness files.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Key design decisions:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Non-destructive: never overwrites existing files without explaining why&lt;/li&gt;
&lt;li&gt;Tool-agnostic: works with whatever you already use&lt;/li&gt;
&lt;li&gt;Agent-agnostic: works with Claude Code, Cursor, Copilot, etc.&lt;/li&gt;
&lt;li&gt;Profiles available for generic, Python, and TypeScript projects&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Why This Matters
&lt;/h2&gt;

&lt;p&gt;Better prompting is a local fix. Harness engineering is a systemic fix.&lt;/p&gt;

&lt;p&gt;Every recurring agent failure should become at least one durable artifact — a clearer rule, an automated constraint, a test, a decision record, or a drift check. That's the core loop.&lt;/p&gt;

&lt;p&gt;The goal isn't to write the perfect prompt. It's to build a repository that makes the same mistakes increasingly unlikely over time.&lt;/p&gt;

&lt;p&gt;👉 &lt;a href="https://github.com/baskduf/harness-starter-kit" rel="noopener noreferrer"&gt;GitHub: harness-starter-kit&lt;/a&gt;&lt;/p&gt;




&lt;p&gt;Have you been dealing with agents repeating the same mistakes? How are you handling it?&lt;/p&gt;

</description>
      <category>ai</category>
      <category>productivity</category>
      <category>tooling</category>
      <category>opensource</category>
    </item>
  </channel>
</rss>
