Aming
I told my AI to build a feature. Did it? I had no idea.

TL;DR — I tried to "manage" AI by having it write decisions, todos, and constraints into markdown docs. After 56 files, I realized AI doesn't maintain document state. So I built aming-claw — a backlog database AI can actually read and write through MCP.


A bug I kept running into

I thought I was doing AI collaboration the right way.

Screenshot of docs/dev folder with 56 markdown files using proposal-, review-, and handoff- naming patterns

This is the docs/dev/ folder of my aming-claw project — 56 markdown files, all produced through AI collaboration:

  • proposal-* — new feature specs
  • review-* — design review records
  • handoff-* — state passed between sessions
  • Plus plan-, optimization-, interface-, manual-fix-...

Every file dated. Two months in, over a thousand pages of markdown. I figured the next AI session would read these. I figured I'd be able to search them too.

But there's one problem I can't engineer my way out of:

AI doesn't maintain document state.

  • proposal-graph-state-reconcile-and-chain-governance-modes.md — did this proposal ship? Which commit? Is it still valid?
  • handoff-2026-05-10-dashboard-semantic-hash-queue.md — did the next session actually pick up where this left off?
  • 18 proposals on file. Which are done, which got rejected, which are still alive? Grep through git log line by line?

I don't manually maintain the docs, so they rot. AI doesn't maintain them either — its context window only sees a tiny slice of the workspace, and the rest of those 56 files are invisible to it.

The more we talk, the more we write — and the further docs drift from code. Eventually you don't trust the docs, and you don't have time to read the code.


Why this happens

This isn't AI being lazy. It's a structural problem:

  1. Markdown is dead text. No state machine. "TODO" doesn't become "DONE" on its own. "Decision: use Redis" doesn't auto-expire when you flip back to in-memory three weeks later.
  2. AI context has a boundary. Each session sees ~200 lines of working code. Old docs never enter the window. Not in the window → can't be maintained.
  3. No traceable link between docs and code. Which TODO maps to which function? Once it's done, which commit landed it? Humans can't remember. AI doesn't look it up.

GitHub Issues, Notion, Linear — none of these help. AI can't see them, so they don't exist.

The core mismatch is this: humans want global state. AI sees only local present. Between them you need a living, traceable, AI-readable/writable state layer. Markdown isn't that layer.


How aming-claw solves it

I gave aming-claw a dedicated backlog database — a peer-level system to the code graph and event ledger, with its own schema, state machine, and query interface. Not stored in markdown. Not buried in code comments. Not dependent on an external issue tracker.

Each backlog entry is a structured record (todo / decision / constraint) with status, priority, source session, and a code reference (function name or file path). AI reads and writes it through MCP.

The flow:

1. You speak → it goes to the database, not a dead doc

In chat:

"Add a retry-after to the rate limiter on UserService.login"

Or: "Decision — use Redis instead of in-memory for caching"

aming-claw's MCP server intercepts those statements and writes directly into the backlog:

```yaml
target:    UserService.login   # function or file path
type:      todo | decision | constraint
status:    proposed
priority:  P1
source:    session-id-xyz
timestamp: 2026-05-16T10:23:45Z
```
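What "writes directly into the backlog" could look like under the hood — a minimal sketch using SQLite as the backing store. The table layout and function name are my assumptions for illustration, not aming-claw's real implementation:

```python
# Sketch: persisting an intercepted statement into a SQLite-backed backlog.
# Table and column names are illustrative, not aming-claw's schema.
import sqlite3
from datetime import datetime, timezone

def add_backlog_entry(db: sqlite3.Connection, target: str, entry_type: str,
                      priority: str, source: str) -> int:
    db.execute("""
        CREATE TABLE IF NOT EXISTS backlog (
            id         INTEGER PRIMARY KEY,
            target     TEXT NOT NULL,
            type       TEXT NOT NULL CHECK (type IN ('todo','decision','constraint')),
            status     TEXT NOT NULL DEFAULT 'proposed',
            priority   TEXT NOT NULL,
            source     TEXT NOT NULL,
            commit_sha TEXT,
            timestamp  TEXT NOT NULL
        )""")
    cur = db.execute(
        "INSERT INTO backlog (target, type, priority, source, timestamp) "
        "VALUES (?, ?, ?, ?, ?)",
        (target, entry_type, priority, source,
         datetime.now(timezone.utc).isoformat()),
    )
    db.commit()
    return cur.lastrowid

db = sqlite3.connect(":memory:")
row_id = add_backlog_entry(db, "UserService.login", "todo", "P1", "session-id-xyz")
```

The MCP tool handler would call something like this after parsing the chat statement; the key property is that the write lands in indexed, queryable state rather than appended markdown.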

Markdown is dead text. The backlog database is live state — schema, indexed, state-machined, AI-accessible. That's the difference.

2. Dashboard shows it instantly

Open the aming-claw dashboard — the left panel shows the new backlog entry. Click it — the right panel jumps to the function via the vscode:// protocol. Status chips are editable inline.

aming-claw dashboard backlog view showing multiple entries with priority, status, code references, and update timestamps

The backlog view — every entry has priority, status, code reference, and update timestamp. AI and you query the same source of truth.
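The "jump to the function" behavior is just a deep link. A minimal sketch of building one — the path and line number here are illustrative, not taken from the real dashboard code:

```python
# Sketch: building a vscode:// deep link from a backlog entry's code reference.
from urllib.parse import quote

def vscode_link(file_path: str, line: int = 1) -> str:
    # vscode://file/<absolute-path>:<line> opens the file at that line in VS Code
    return f"vscode://file/{quote(file_path)}:{line}"

link = vscode_link("/repo/agent/plugin_installer.py", 455)
# → "vscode://file//repo/agent/plugin_installer.py:455"
```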

3. State machine, automatic

```text
proposed → in_progress → done(commit hash) → verified
```
  • in_progress — AI started working on it
  • done — commit landed, hash automatically bound
  • verified — you reviewed it

Every state change is appended to an event ledger: which day, which session proposed it, which commit implemented it, who verified it — all queryable, all replayable.

4. AI reads the backlog itself, next time

Days later, in chat:

"Did we ever fix that Codex plugin Windows install bug?"

AI queries the backlog through MCP and returns:

```text
status:     FIXED, P0
commit:     0ad8c7e
fixed at:   2 days ago
file:       agent/plugin_installer.py (line 455)
change:     replaced regex pattern with callable replacement
```

No grepping git log. No asking a teammate. No "I think we did?"
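The kind of lookup the MCP query tool could run, sketched end to end — the schema and the seeded row are illustrative so the example is self-contained, not aming-claw's real data:

```python
# Sketch: answering "did we ever fix X?" with a query instead of grep.
# Schema and data are illustrative.
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("""CREATE TABLE backlog (
    target TEXT, status TEXT, commit_sha TEXT, fixed_at TEXT)""")
db.execute("INSERT INTO backlog VALUES (?, ?, ?, ?)",
           ("agent/plugin_installer.py", "done", "0ad8c7e", "2026-05-14"))

def lookup(db: sqlite3.Connection, keyword: str) -> list[tuple]:
    # A real tool might use full-text search over titles and descriptions;
    # matching the code reference is enough to show the shape.
    return db.execute(
        "SELECT target, status, commit_sha FROM backlog WHERE target LIKE ?",
        (f"%{keyword}%",)).fetchall()

rows = lookup(db, "plugin_installer")
```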


The key thing to notice: AI didn't "remember" this from conversation history. It queried the backlog database in real time through MCP. Even if this bug was raised three months ago, in a session that's long gone — AI still gets the current status + full commit trace.

That's the difference between dead markdown and a live state layer: the database is the memory, not the conversation.


This is just the start

Look back at the docs/dev/ screenshot — 56 markdown files, nobody knows which are alive.
Look at the dashboard screenshot — every backlog entry has status, commit, location.

The difference isn't the tool. It's whether information has state.

The backlog solves "did the AI build the feature I asked for?" — but AI collaboration has plenty of other holes I'm planning to fill in this series:

| Pain | Next article |
| --- | --- |
| AI edits one function, breaks 10 callers | Code graph + impact analysis |
| AI modifies code it shouldn't touch | Governance hints |
| What did AI even change this week? | Event ledger |
| Every session starts from zero | Project memory layer |

One article per pain point.


About aming-claw

  • GitHub: amingclawdev/aming-claw — open source
  • Next post: "AI breaks 10 callers when it edits one function" — coming this week
  • Hit me with issues if you've felt this pain

If "did the AI actually do that thing I asked?" sounds familiar, give the repo a star — it costs you nothing and tells me I'm not the only one.


This is part 1 of an "AI Collaboration Survival Guide" series — practical tools for the messy reality of building with AI agents.
