Andrei Sambra

Posted on Mar 13

The Three Things Wrong with AI Agents in 2026

#ai #agents #opensource #beginners

95% of generative AI pilots fail to deliver measurable ROI. Gartner projects that more than 40% of agentic AI projects will be cancelled by 2027. The gap between "impressive demo" and "reliable production system" remains enormous.

I have been building and using AI agents for the past two years. After burning through OpenClaw, LangChain stacks, raw API wrappers, and every "personal AI" that launched on Product Hunt, I think the problem comes down to three structural failures that nobody is fixing.

1. Every agent has amnesia, and memory is siloed

ChatGPT and Claude now remember facts about individual users. Progress. But every person's memory is isolated. When a family shares a household or a team collaborates on a project, none of that knowledge connects.

Five people can tell the same AI about the same project and it learns nothing from the overlap. There is no compounding, no collective intelligence, no network effect. Each user starts alone, stays alone.

This is not a feature gap. It is an architectural decision. ChatGPT's memory is per-user by design. Claude's project context resets. OpenClaw stores flat Markdown files in a directory. None of them build connected knowledge across users.

Think about how knowledge actually works in a team. When someone documents a decision, everyone benefits. When a teammate learns something about a client, the whole team gets smarter. AI agents do not work this way. They are individual notepads pretending to be collective intelligence.

What would actually work: a shared knowledge graph where every user enriches the same structure. Facts connect to preferences, preferences connect to patterns. Private sessions stay private, but shared knowledge compounds across everyone who contributes. The more people use it, the richer it gets.

2. Setup complexity locks out 99% of potential users

Every AI agent platform I have tried requires developer-level skills to set up. OpenClaw needs Node.js, CLI fluency, YAML configuration, and manual API key management. LangChain is a Python framework. AutoGPT requires Docker and environment variables. CrewAI assumes you can write Python.

This is fine for developers. It is insurmountable for the billion-plus knowledge workers who would benefit most from AI agents.

The irony: the people who need AI agents least (developers who can automate things themselves) are the only ones who can set them up. The people who need them most (consultants, analysts, researchers, project managers) are locked out by installation friction.

And even for developers, the dependency chains are fragile. Node.js version conflicts. Python virtual environments. YAML files that break silently. API keys scattered across config files. I have spent more time debugging my agent's infrastructure than using it.

A single binary with zero dependencies is not a nice-to-have. It is the only architecture that scales beyond the developer bubble. One file. One database. No Redis, no Postgres, no Kubernetes. Deploy in 30 seconds, inspect completely, trust with your data.

3. Cost is a black box

Power users burn $30 to $800/month in API calls with minimal visibility into what drove those costs. No per-conversation breakdown. No model-level analytics. No budget alerts. No trend visualization.

I discovered I was spending $180/month only by checking my OpenAI billing page. I had no idea which conversations cost what, which model was being used for which operations, or where the money was going.
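The missing breakdown is not hard to sketch. The Go example below is an illustration under stated assumptions, not any provider's billing API: the `Ledger` and `Usage` types are hypothetical, the model names `frontier` and `cheap` are placeholders, and the prices are simply the per-million-token figures used elsewhere in this post. It attributes every call to a conversation and a model, and exposes a basic budget check.

```go
package main

import (
	"fmt"
	"sort"
)

// Usage describes one model call attributed to a conversation.
type Usage struct {
	Conversation string
	Model        string
	Tokens       int
}

// Ledger accumulates spend per conversation and per model.
type Ledger struct {
	pricePerMTok   map[string]float64 // USD per million tokens (assumed values)
	ByConversation map[string]float64
	ByModel        map[string]float64
	Total          float64
}

// NewLedger builds an empty ledger from a model -> price table.
func NewLedger(pricePerMTok map[string]float64) *Ledger {
	return &Ledger{
		pricePerMTok:   pricePerMTok,
		ByConversation: map[string]float64{},
		ByModel:        map[string]float64{},
	}
}

// Record prices one call and adds it to every breakdown.
// It returns an error instead of guessing when a model has no price.
func (l *Ledger) Record(u Usage) (float64, error) {
	price, ok := l.pricePerMTok[u.Model]
	if !ok {
		return 0, fmt.Errorf("no price configured for model %q", u.Model)
	}
	cost := float64(u.Tokens) / 1e6 * price
	l.ByConversation[u.Conversation] += cost
	l.ByModel[u.Model] += cost
	l.Total += cost
	return cost, nil
}

// OverBudget reports whether total spend has passed the given budget.
func (l *Ledger) OverBudget(budgetUSD float64) bool {
	return l.Total > budgetUSD
}

func main() {
	l := NewLedger(map[string]float64{"frontier": 15, "cheap": 0.25})
	calls := []Usage{
		{"trip-planning", "frontier", 2_000_000},
		{"trip-planning", "cheap", 4_000_000},
		{"weekly-report", "cheap", 1_000_000},
	}
	for _, u := range calls {
		if _, err := l.Record(u); err != nil {
			fmt.Println("skipped:", err)
		}
	}

	// Print the per-conversation breakdown in a stable order.
	names := make([]string, 0, len(l.ByConversation))
	for n := range l.ByConversation {
		names = append(names, n)
	}
	sort.Strings(names)
	for _, n := range names {
		fmt.Printf("%-14s $%.2f\n", n, l.ByConversation[n])
	}
	fmt.Printf("total $%.2f, over $25 budget: %v\n", l.Total, l.OverBudget(25))
}
```

In a real agent the token counts would come from the usage figures returned with each model response, and input and output tokens would usually be priced separately. The sketch collapses them into one number to keep the accounting visible.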

The fix is not just dashboards. It is architectural. Most agent operations do not need frontier models. Memory enrichment, retrieval classification, routine scheduled tasks: these can run on models costing $0.25/M tokens instead of $15/M tokens. Route 70-80% of operations to cheap models and the economics change completely.

After implementing two-tier routing, my costs dropped from $180/month to $70/month. Same capabilities. The expensive model handles conversations. The cheap model handles everything else. This should be table stakes, not a custom optimization.
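As a rough sketch of what two-tier routing means in code, the Go example below sends interactive conversations to an expensive tier and all background work to a cheap one. Everything here is an assumption made for illustration: the `TaskKind` values, the `Route` and `BlendedPrice` functions, and the tier names are hypothetical, and the prices are the $15 and $0.25 per million tokens quoted above.

```go
package main

import "fmt"

// TaskKind classifies what the agent is about to do.
type TaskKind int

const (
	Conversation TaskKind = iota
	MemoryEnrichment
	RetrievalClassification
	ScheduledTask
)

// String gives each task kind a readable name for printing.
func (k TaskKind) String() string {
	return [...]string{"conversation", "memory-enrichment", "retrieval-classification", "scheduled-task"}[k]
}

// Tier is a model class with an assumed price in USD per million tokens.
type Tier struct {
	Name         string
	PricePerMTok float64
}

var (
	Frontier = Tier{Name: "frontier", PricePerMTok: 15.00}
	Cheap    = Tier{Name: "cheap", PricePerMTok: 0.25}
)

// Route sends interactive conversations to the frontier tier and
// every background operation to the cheap tier.
func Route(kind TaskKind) Tier {
	if kind == Conversation {
		return Frontier
	}
	return Cheap
}

// BlendedPrice returns the average price per million tokens when
// cheapShare (between 0 and 1) of all tokens go to the cheap tier.
func BlendedPrice(cheapShare float64) float64 {
	return cheapShare*Cheap.PricePerMTok + (1-cheapShare)*Frontier.PricePerMTok
}

func main() {
	for _, k := range []TaskKind{Conversation, MemoryEnrichment, RetrievalClassification, ScheduledTask} {
		fmt.Printf("%-24s -> %s\n", k, Route(k).Name)
	}
	fmt.Printf("blended price at 75%% cheap: $%.4f per million tokens\n", BlendedPrice(0.75))
}
```

With these assumed prices, routing 75% of tokens to the cheap tier gives 0.75 × 0.25 + 0.25 × 15 = $3.9375 per million tokens, against $15 when everything runs on the frontier tier. Note that this is a share of tokens, not of operations: conversations tend to be token-heavy, so routing 70-80% of operations usually moves a smaller share of tokens, and real savings depend on the actual mix.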

The OpenClaw wake-up call

OpenClaw hit over 200K GitHub stars in under two weeks. That proved the demand is real. People want personal AI agents.

Then its creator joined OpenAI in February 2026 and transferred the project to a foundation with unclear governance. The project is now effectively an OpenAI satellite with no commercial model and no clear roadmap.

Worse: a Snyk security audit found over 13% of ClawHub skills contain critical security issues, with 36% containing detectable prompt injection. The marketplace that was supposed to make OpenClaw extensible became a liability. No sandboxing, no curation, no accountability.

The OpenClaw situation crystallized something for me. The demand is real. The execution is broken. Not because the technology is missing, but because nobody is solving the structural problems: siloed memory, setup complexity, cost opacity.

What would actually fix this

An AI agent that solves all three:

Shared knowledge graph instead of per-user notepads. Every user in a household or team enriches the same graph. Knowledge compounds across people, not just sessions.
Single binary, zero dependencies. One Go binary, one SQLite database. Runs on a Mac or a $5/month VPS. No Node.js, no Python, no YAML, no dependency chains.
Built-in cost transparency with two-tier model routing. Per-conversation analytics, model-level usage tracking, and automatic routing of background operations to cheap models.

I built this. It is called Cogitator, it is open source (AGPL-3.0), and it runs on macOS as a native app that anyone can use. It is also available as a Docker image for those who want to deploy it on a VPS.

After 30 days of use with multiple users contributing to the same knowledge graph, it holds 340+ nodes and 500+ edges, including connections between pieces of knowledge that no single user contributed on their own. The agent infers preferences I never explicitly stated. My partner added dietary preferences; the agent referenced them when I asked for restaurant recommendations, without my ever mentioning them. The graph made that connection.

This is what "AI that learns" should actually mean. Not a flat list of facts per user. A shared, evolving, interconnected understanding that grows richer with every person who contributes.

Source: https://github.com/cogitatorai/cogitator
Website: https://cogitator.me

Top comments (2)

liuhaotian2024-prog • Mar 15

There's a fourth structural failure you didn't mention: no accountability layer. When an agent does something wrong, you can see what it did but not whether it matched what it was supposed to do. The logs are a witness, not an auditor.
The OpenClaw security issues you cited are a good example — the problem isn't just that skills can be malicious, it's that there's no tamper-evident record of what each skill actually did versus what it declared it would do.
That's the gap K9Audit tries to fill: github.com/liuhaotian2024-prog/K9Audit

Abdullah Shahin • May 29

The "five people, zero overlap" line is the one that stuck. The hard part isn't the shared graph itself — it's the scope hierarchy underneath (user / team / project / session) and deciding which writes propagate up. Curious whether Cogitator exposes that boundary to the user or infers it — I've seen both fail in different ways: explicit scope puts onboarding cost on the user, inferred scope quietly mis-shares context across project boundaries. We made scope explicit at the runtime layer at hivein.ai — happy to compare notes against Cogitator's design.