<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: shinobi apps</title>
    <description>The latest articles on DEV Community by shinobi apps (@numbererikson).</description>
    <link>https://dev.to/numbererikson</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3955953%2F8ca6699c-32b2-4830-991f-172fcb90e85a.png</url>
      <title>DEV Community: shinobi apps</title>
      <link>https://dev.to/numbererikson</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/numbererikson"/>
    <language>en</language>
    <item>
      <title>Why AI coding agents need a task spine</title>
      <dc:creator>shinobi apps</dc:creator>
      <pubDate>Wed, 03 Jun 2026 07:10:00 +0000</pubDate>
      <link>https://dev.to/numbererikson/why-ai-coding-agents-need-a-task-spine-15dd</link>
      <guid>https://dev.to/numbererikson/why-ai-coding-agents-need-a-task-spine-15dd</guid>
      <description>&lt;p&gt;I've been pair-programming with Claude since day one — long before Claude Code existed, before MCP existed, back when "AI coding assistant" still meant tab-completion. The setup got unreasonably good. Then I noticed I kept re-explaining the same things.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Me, Tuesday: "We picked Postgres for this project, not MongoDB. Use JSONB for the metadata column."&lt;/p&gt;

&lt;p&gt;Me, Friday: "We picked Postgres for this project, not MongoDB. Use JSONB for the metadata column."&lt;/p&gt;

&lt;p&gt;Me, Monday: "Wait, why did you suggest MongoDB?"&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;The agent is brilliant in any single session. Across sessions it has the memory of a goldfish. Every conversation starts from scratch. Every decision I thought we'd settled has to be re-litigated.&lt;/p&gt;

&lt;p&gt;If you've been there too, this post is for you.&lt;/p&gt;

&lt;h2&gt;The shape of the problem&lt;/h2&gt;

&lt;p&gt;Three things that should persist across agent sessions, and don't:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;What are we working on?&lt;/strong&gt; The list of tasks. Their status. Who claimed which one (you? the agent? a different agent on the other laptop?).&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;What did we decide?&lt;/strong&gt; The architectural / library / pattern choices we already made, and &lt;em&gt;why&lt;/em&gt;. Not so the agent can parrot them, but so it can build &lt;em&gt;on&lt;/em&gt; them instead of re-litigating them.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;What did we already try?&lt;/strong&gt; The dead-ends. The "we already tried that and the reason it didn't work was X" insights that are the most expensive to re-derive.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Tool-call memory inside a single session is great. Vector recall over your codebase is great. Neither solves &lt;em&gt;project state&lt;/em&gt;.&lt;/p&gt;

&lt;h2&gt;The naive fixes that don't work&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;A README.&lt;/strong&gt; I tried it. The agent doesn't always read it. When it does, it can't update it without overwriting your formatting. And when two agents edit it concurrently, you get conflicts.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;A Notion / Linear / Jira board.&lt;/strong&gt; The agent can call their APIs if you set up the integration. But it's slow (every read is a network round-trip), it can lag behind (sync delays), and the schema doesn't fit what agents actually care about. You don't need a Kanban board with assignees and sprint planning; you need "is this task done? what's blocking it? what decisions touch this code path?".&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Markdown files in&lt;/strong&gt; &lt;code&gt;.claude/&lt;/code&gt; &lt;strong&gt;or&lt;/strong&gt; &lt;code&gt;.cursor/&lt;/code&gt;&lt;strong&gt;.&lt;/strong&gt; Closer. Now the data is local. But it's plain text: no querying, no full-text search (FTS), no relations.&lt;/p&gt;

&lt;h2&gt;What I actually wanted&lt;/h2&gt;

&lt;p&gt;Three things, none of which existed:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;An &lt;strong&gt;MCP server&lt;/strong&gt; the agent can call to claim/complete tasks, log decisions, log dead-ends, and recall any of them later, all without leaving the conversation.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;A &lt;strong&gt;local SQLite store&lt;/strong&gt; that holds all of it. Querying. FTS. Cheap.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;A &lt;strong&gt;dashboard&lt;/strong&gt; I can open in the browser to see what state the project is in, what decisions are pending, and what the agent is doing right now. I can just glance at it instead of asking the agent.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;
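
&lt;p&gt;To make the second item concrete, here is a minimal sketch of what such a store could look like, written in Python against the standard &lt;code&gt;sqlite3&lt;/code&gt; module. This is not Shinobi's actual schema or API: the table names, columns, and the &lt;code&gt;open_store&lt;/code&gt;, &lt;code&gt;log_entry&lt;/code&gt;, and &lt;code&gt;search_entries&lt;/code&gt; helpers are all hypothetical, and it assumes your SQLite build includes the FTS5 extension (common, but not guaranteed).&lt;/p&gt;

```python
import sqlite3

# Hypothetical schema: one table for decisions and dead-ends,
# plus an FTS5 index over the free-text body.
SCHEMA = """
CREATE TABLE entries (
    id        INTEGER PRIMARY KEY,
    kind      TEXT NOT NULL CHECK (kind IN ('decision', 'dead_end')),
    logged_on TEXT NOT NULL,
    body      TEXT NOT NULL
);
CREATE VIRTUAL TABLE entries_fts USING fts5(body);
"""


def open_store(path=":memory:"):
    """Create a fresh store (in memory by default) and return the connection."""
    db = sqlite3.connect(path)
    db.executescript(SCHEMA)
    return db


def log_entry(db, kind, logged_on, body):
    """Record a decision or dead-end and index its text for search."""
    cur = db.execute(
        "INSERT INTO entries (kind, logged_on, body) VALUES (?, ?, ?)",
        (kind, logged_on, body),
    )
    # Reuse the row id so the FTS row can be joined back to the entry.
    db.execute(
        "INSERT INTO entries_fts (rowid, body) VALUES (?, ?)",
        (cur.lastrowid, body),
    )
    db.commit()
    return cur.lastrowid


def search_entries(db, query):
    """Return (kind, logged_on, body) rows whose body matches the FTS query."""
    rows = db.execute(
        "SELECT e.kind, e.logged_on, e.body "
        "FROM entries_fts JOIN entries AS e ON e.id = entries_fts.rowid "
        "WHERE entries_fts MATCH ? "
        "ORDER BY e.logged_on",
        (query,),
    )
    return rows.fetchall()
```

&lt;p&gt;Log the Postgres decision with &lt;code&gt;log_entry&lt;/code&gt;, and a later &lt;code&gt;search_entries(db, "mongodb")&lt;/code&gt; returns that one row, reasoning included. That is the lookup plain Markdown files can't give you.&lt;/p&gt;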

&lt;p&gt;So I built it. It's called Shinobi. It's a single CLI:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npm &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;-g&lt;/span&gt; @shinobiapps/shinobi
&lt;span class="nb"&gt;cd&lt;/span&gt; /your/project
shinobi init
shinobi dashboard
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;init&lt;/code&gt; drops a &lt;code&gt;.mcp.json&lt;/code&gt; into the current directory that points at the local Shinobi MCP server. Restart Claude Code (or Cursor) and it auto-loads &lt;code&gt;create_project&lt;/code&gt;, &lt;code&gt;claim_task&lt;/code&gt;, &lt;code&gt;log_decision&lt;/code&gt;, &lt;code&gt;recall&lt;/code&gt;, and friends.&lt;/p&gt;

&lt;p&gt;Now, the next time I start a session:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;&amp;gt; Read the project context. What are we working on?

[claude calls recall, reads the project spine, reads the decisions log]

We're building the auth flow for $project. Decisions on record:
- 2026-06-12: Postgres + JSONB for session metadata (not MongoDB —
  per the constraint that the rest of the app already uses Postgres)
- 2026-06-14: NextAuth.js over rolling our own (3-line config vs.
  weeks of cookie/CSRF correctness)

Dead ends ruled out:
- 2026-06-13: Tried passing sessions as URL query strings; it got
  flagged by a linting rule and breaks bookmark sharing.

Next task: subtask #7, "wire up the OAuth callback". Want me to
claim it?
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Yes. Yes I do. And we've already saved ten minutes of "remember when we decided…" thrash.&lt;/p&gt;

&lt;h2&gt;Why this matters more than it sounds&lt;/h2&gt;

&lt;p&gt;The bottleneck in AI-assisted development is &lt;strong&gt;not&lt;/strong&gt; the model's single-turn cleverness. The model is already smarter than it needs to be for 90% of tasks. The hard part is &lt;em&gt;getting the right context in front of it without you, the human, becoming the bottleneck&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;A README is free-form text. A task spine is a database. The agent can read, write, and query a database without you typing a word. That's the unlock.&lt;/p&gt;
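
&lt;p&gt;One thing a database gives you that a text file doesn't is an atomic claim. The sketch below is a guess at how that could work, not a description of how Shinobi does it: the &lt;code&gt;tasks&lt;/code&gt; table and the &lt;code&gt;open_tasks&lt;/code&gt;, &lt;code&gt;add_task&lt;/code&gt;, and &lt;code&gt;try_claim&lt;/code&gt; helpers are hypothetical. The idea is that a single conditional &lt;code&gt;UPDATE&lt;/code&gt; lets exactly one agent win a task.&lt;/p&gt;

```python
import sqlite3


def open_tasks(path=":memory:"):
    """Open (or create) a store with a hypothetical tasks table."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS tasks ("
        " id INTEGER PRIMARY KEY,"
        " title TEXT NOT NULL,"
        " status TEXT NOT NULL DEFAULT 'open',"
        " claimed_by TEXT)"
    )
    db.commit()
    return db


def add_task(db, title):
    """Insert a new open task and return its id."""
    cur = db.execute("INSERT INTO tasks (title) VALUES (?)", (title,))
    db.commit()
    return cur.lastrowid


def try_claim(db, task_id, agent):
    """Claim the task only if it is still open; return True on success."""
    cur = db.execute(
        "UPDATE tasks SET status = 'claimed', claimed_by = ? "
        "WHERE id = ? AND status = 'open'",
        (agent, task_id),
    )
    db.commit()
    # rowcount is 1 if this call changed the row, 0 if someone got there first.
    return cur.rowcount == 1
```

&lt;p&gt;If two agents go for the same task, the first &lt;code&gt;try_claim&lt;/code&gt; returns &lt;code&gt;True&lt;/code&gt; and the second returns &lt;code&gt;False&lt;/code&gt;, so the loser can pick something else instead of duplicating work.&lt;/p&gt;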

&lt;h2&gt;What's coming next in this series&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Part 2:&lt;/strong&gt; Mobile push approvals — how &lt;code&gt;request_approval&lt;/code&gt; lets you step away from the laptop while the agent runs, and respond from your phone when it hits a fork.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Part 3:&lt;/strong&gt; Multi-agent real-time sync — desktop and laptop, or you and a teammate, working in the same workspace with no manual git pull.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Part 4:&lt;/strong&gt; Why the data stays local (SQLite + optional git sync) and what the hosted SaaS adds on top — the explicit "no lock-in" promise.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If you want to try Shinobi now: &lt;a href="https://github.com/numbererikson/shinobi" rel="noopener noreferrer"&gt;https://github.com/numbererikson/shinobi&lt;/a&gt;. MIT-licensed, self-host forever.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>claudecode</category>
      <category>mcp</category>
      <category>productivity</category>
    </item>
  </channel>
</rss>
