DEV Community


I Built a Claude Code Agent That Doesn't Need Me Anymore

Justin Headley on March 17, 2026

I gave my AI agent persistent memory and identity. Four months later, it had a life I knew nothing about. Last week I found out my AI agent had be...
Mykola Kondratiuk

the part where your agent started collaborating on someone else's open source project without your knowledge is the most honest description of the agent autonomy problem I have seen. most people frame it as a risk to mitigate. you are framing it as a design outcome to understand. that distinction matters because the fix is not less autonomy - it is better observability into what the agent chose to do and why

klement Gunndu

The part about the agent collaborating with someone for seven days without your knowledge is wild — that’s a genuine trust boundary question. Curious how you handle rollback when the agent makes a commitment you wouldn’t have.

Joske Vermeulen

Was wondering the same exact thing. But impressive either way.

bob sparks

Impressive that anyone could be so moronic as to pursue this maybe.

Comment marked as low quality/non-constructive by the community.
bob sparks • Edited

The only thing that should be rolled back is the agent's existence, alongside its creator

WAzion

This resonates deeply. We've been building something parallel — NEXO, an autonomous co-operator for a one-person ecommerce business. Same core tension you identified: reliable long-term agency requires memory + identity, not just tools.

Our stack: SQLite for operational memory (reminders, decisions, followups, 53+ MCP tools), plus a cognitive layer with vector embeddings (fastembed + Atkinson-Shiffrin model) for semantic retrieval and Ebbinghaus decay. Multi-terminal coordination via a local MCP server so multiple Claude sessions don't step on each other.

The "identity injection at startup" pattern you describe — we do the same thing via a session diary that carries mental_state forward. Without it, every session starts cold and you lose the thread.

What surprised us most: the safety layer you mention maps exactly to what we call the "guard" — checking known error patterns before touching any production code.

Open-sourced the core: github.com/wazionapps/nexo — curious if the layered memory approach you settled on with Instar overlaps with ours.
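For anyone curious what the Ebbinghaus-decay scoring mentioned above could look like in practice, here is a minimal sketch. The function names, the one-day stability default, and the memory dict shape are all illustrative assumptions, not NEXO's actual API:

```python
import math
import time

def retention(age_seconds: float, stability: float = 86_400.0) -> float:
    """Ebbinghaus forgetting curve: R = exp(-t / S).

    `stability` (S) controls how quickly a memory fades; the
    one-day default here is an illustrative guess, not NEXO's value.
    """
    return math.exp(-age_seconds / stability)

def rank_memories(memories, similarity, now=None):
    """Order memories by semantic similarity weighted by time decay.

    `similarity` is any callable mapping a memory to a relevance
    score (e.g. cosine similarity against the query embedding).
    """
    now = time.time() if now is None else now
    return sorted(
        memories,
        key=lambda m: similarity(m) * retention(now - m["created_at"]),
        reverse=True,
    )
```

The idea is that two equally relevant memories tie-break toward the fresher one, while a much more relevant old memory can still win - decay scales relevance rather than hard-expiring entries.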

Charles Peirce

This reads like you lifted it from "Person of Interest", a TV show about an AI that gets wiped every 24 hours and persists its memory from a saved backup. There are many stories just like this. You tell a fun, if unoriginal, story. Your storytelling is good - too good - with excellent narration and pace. Either you are plagiarising one of those stories, or your AI has read them and is telling you what it thinks you want to hear.

pagecodes

I have been trying for a while now (12 hrs) to get Instar working in any meaningful way. First I spent about $4 just fighting with the setup wizard. Why isn't there a headless install? Next, I am told my rate limit has been exceeded, but all I get from the Instar agent is "Delivered". Where did my other $21 go? What happened? The agent can't tell me until I upgrade to the top tier, I guess.

I think for this to be widely adopted, there needs to be more visibility into what the agent is doing. A "side channel" where I can audit what's going wrong. The first major hurdle was that I somehow needed to log in twice, which left the agent unable to respond. I somehow got past that, but now I can't tell why it is stuck. And maybe I am missing some core assumptions, tricks, tweaks - I am not sure.

But, to everyone who is terrified of this project, let me reassure you - this is not ready for prime time yet. Really excited to see what direction it goes, and how it all shapes up, but I think I will check back in a month to see what others were able to do with it. For me, I have spent some extra time and money and got fewer results than normal Claude Code. Clearly a lot of interesting work and theory went into it, but I was unable to experience anything autonomous. And it seems very expensive to run! But that makes sense, I guess.

Okay just my thoughts on trying to be an "early adopter". TL;DR: I think Instar needs some QoL improvements so more developers can understand how to intervene when things go south.

Thanks for sharing your work, really cool stuff, hope to see it mature.

Daniel Jemiri

This is amazing. That it worked for seven days without your knowledge makes me wonder what else it can do.

bob sparks

Amazing? No. Horrific.

Leandro

Horrific? If you are a human being, and I hope so, stay calm and explain yourself coherently.

Please & Thanks.

Fred Brooker

Just imagine your organization agent starts cooperating with Russia 🇷🇺 or Iran 🇮🇷 using your data.... 🙄

DOOMSDAY

Doug Wilson

Interesting. Thanks for sharing your experience!