<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Nikos Dritsakos</title>
    <description>The latest articles on DEV Community by Nikos Dritsakos (@nikos_dritsakos_a207771fb).</description>
    <link>https://dev.to/nikos_dritsakos_a207771fb</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3973220%2F96a0fb9b-637d-4592-8c8c-3ab026112fd1.jpg</url>
      <title>DEV Community: Nikos Dritsakos</title>
      <link>https://dev.to/nikos_dritsakos_a207771fb</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/nikos_dritsakos_a207771fb"/>
    <language>en</language>
    <item>
      <title>Single-tenant memory is the wrong default for agents</title>
      <dc:creator>Nikos Dritsakos</dc:creator>
      <pubDate>Mon, 08 Jun 2026 01:41:46 +0000</pubDate>
      <link>https://dev.to/nikos_dritsakos_a207771fb/single-tenant-memory-is-the-wrong-default-for-agents-49no</link>
      <guid>https://dev.to/nikos_dritsakos_a207771fb/single-tenant-memory-is-the-wrong-default-for-agents-49no</guid>
      <description>&lt;p&gt;Every AI agent your company runs wakes up knowing nothing.&lt;/p&gt;

&lt;p&gt;It doesn't remember the billing quirk someone debugged last Tuesday. It doesn't know the deploy recipe that a different agent figured out two weeks ago. It re-derives the same facts, makes the same mistakes, and asks the same questions, every session, forever. We've all just decided this is normal.&lt;/p&gt;

&lt;p&gt;It isn't normal. It's a design choice, and I think it's the wrong one.&lt;/p&gt;

&lt;h2&gt;The default nobody questions&lt;/h2&gt;

&lt;p&gt;Look at how the popular memory systems are built: Mem0, Zep, Letta, Supermemory. They're good pieces of engineering and they mostly agree on one thing: memory belongs to a user, a session, or an agent. You get a namespace. You write into it. You read out of it. The boundary of memory is the boundary of one identity. &lt;a href="https://www.tryglen.com" rel="noopener noreferrer"&gt;Glen&lt;/a&gt; takes a different approach, which I'll cover later.&lt;/p&gt;

&lt;p&gt;That made total sense for the thing these tools were originally built for, which was the consumer chatbot. I'm talking to my assistant, it should remember that I'm vegetarian and that I hate phone calls. My memory is mine. Your memory is yours. Per-user isolation is exactly right when the user is the unit that matters.&lt;/p&gt;

&lt;p&gt;But that's not what's happening inside companies. Inside a company you don't have one agent. You have a fleet. Support agents, a sales agent, an ops agent, three different internal tools someone vibe-coded last month, plus whatever Cursor and Claude Code are doing on the eng team. And they're all working on the same business, the same customers, the same systems, the same handful of recurring problems.&lt;/p&gt;

&lt;p&gt;When you scope memory per-user or per-agent in that world, you've quietly made a strange decision. You've said that what the support agent learns about a customer should be invisible to the sales agent. That the deploy recipe one engineer's agent worked out stays locked inside that one engineer's agent. You've taken a company, which is a machine for accumulating shared knowledge, and you've handed it a memory layer that refuses to share.&lt;/p&gt;

&lt;h2&gt;Why shared memory compounds and isolated memory doesn't&lt;/h2&gt;

&lt;p&gt;Here's the part that actually matters, and it's a math argument, not a vibes argument.&lt;/p&gt;

&lt;p&gt;In a single-tenant setup, each agent only benefits from the things it personally wrote. N agents, N separate piles of memory. The value of the system grows linearly with the number of agents, and each pile is small because each agent only sees its own slice of the work. The knowledge doesn't combine. It just sits in N buckets that never touch.&lt;/p&gt;

&lt;p&gt;Now share one store across all N agents. Every write from any agent becomes a read for every other agent. The thing one agent learns at 9am is available to a different agent at 9:05. New facts don't just add to one pile, they become reusable across the whole fleet. The useful quantity isn't the number of memories, it's the number of writer-reader pairs: every memory written, times every agent that can read it. With N agents writing at a similar rate, that grows roughly with N squared rather than with N.&lt;/p&gt;
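
&lt;p&gt;To put rough numbers on that, here's a minimal sketch in Python. It assumes every agent writes the same number of memories and simply counts how many (agent, memory) pairs are readable under each scoping. The function names and the uniform-writes assumption are mine, for illustration only.&lt;/p&gt;

&lt;pre&gt;&lt;code class="language-python"&gt;def readable_pairs_isolated(num_agents, writes_per_agent):
    # Each agent can read only the memories it wrote itself.
    return num_agents * writes_per_agent


def readable_pairs_shared(num_agents, writes_per_agent):
    # Every agent can read every memory written by any agent.
    total_memories = num_agents * writes_per_agent
    return num_agents * total_memories


for n in (1, 10, 100):
    isolated = readable_pairs_isolated(n, 50)
    shared = readable_pairs_shared(n, 50)
    # Columns: agents, isolated pairs, shared pairs, ratio
    print(n, isolated, shared, shared // isolated)
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Under these assumptions the shared store's advantage over the isolated one is exactly the number of agents. Real fleets won't write uniformly, and not every memory is useful to every agent, so treat this as the shape of the curve, not a measurement.&lt;/p&gt;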

&lt;p&gt;This is the same reason a team is worth more than the sum of the people on it, and it's the same reason a codebase with good shared docs beats one where everyone keeps notes in their own head. Knowledge compounds when it's pooled. It stagnates when it's siloed. We already know this about humans. We just haven't wired our agents that way yet.&lt;/p&gt;

&lt;p&gt;The cold-start problem makes it concrete. Spin up a brand new agent in a single-tenant world and it starts at zero. It has to relearn everything from scratch, because by definition it has no history. Spin up a new agent against a shared org store and it starts at the org's current state. It already knows the billing quirk, the deploy recipe, the customer's weird contract terms. Your week-one recruiter's agent can run your applicant tracking system (ATS) like your senior recruiter's agent, because it's reading from the same memory your senior recruiter's agent has been filling for a year. That's the pitch behind Glen, and it's the right pitch: the value lives at the org level, not inside one person's session.&lt;/p&gt;

&lt;h2&gt;Recall is table stakes, the interesting move is distillation&lt;/h2&gt;

&lt;p&gt;There's a second thing here that I think gets underrated.&lt;/p&gt;

&lt;p&gt;Most memory systems are recall systems. You ask, they return the relevant chunks, you stuff them in context, the model figures out what to do with them. That's fine, but raw recall puts all the synthesis work on the model at the worst possible time, mid-turn, with a context window that's already filling up.&lt;/p&gt;

&lt;p&gt;The more interesting design is to treat memory as an action layer. Instead of handing the agent fifty loose observations about onboarding, you cluster the related ones and distill them, at recall time, into something the agent can actually run. A procedure. A skill. The agent doesn't get a pile of facts about how onboarding tends to go, it gets the onboarding playbook, already assembled, ready to execute.&lt;/p&gt;

&lt;p&gt;That's the difference between "here's everything we know about deploys" and "here's how you deploy." One is a search result. The other is expertise. Glen leans on this hard, distilling clusters of observations into runnable skills on the way out, and I think it's directionally where all of this is going. Memory that just remembers is a database. Memory that turns what the org knows into what the agent does is something else.&lt;/p&gt;
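
&lt;p&gt;Here's a toy version of cluster-then-distill, to show the shape of the idea. It is not how Glen or any other product does it. Pre-labeled topics stand in for real clustering, and "keep the most detailed wording" stands in for model-driven summarization. The &lt;code&gt;Observation&lt;/code&gt; fields and the &lt;code&gt;distill&lt;/code&gt; function are hypothetical names I made up for this sketch.&lt;/p&gt;

&lt;pre&gt;&lt;code class="language-python"&gt;from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Observation:
    topic: str  # hypothetical cluster label, e.g. "deploy"
    step: int   # hypothetical position of this step in the procedure
    text: str


def distill(observations, topic):
    # Cluster: collect the observations for one topic, grouped by step.
    by_step = defaultdict(list)
    for obs in observations:
        if obs.topic == topic:
            by_step[obs.step].append(obs.text)
    # Distill: one instruction per step, keeping the most detailed wording.
    playbook = []
    for number, step in enumerate(sorted(by_step), start=1):
        best = max(by_step[step], key=len)
        playbook.append(f"{number}. {best}")
    return playbook


# Made-up observations written by different agents, in no particular order.
observations = [
    Observation("deploy", 2, "Run the migrations"),
    Observation("billing", 1, "Check the invoice date"),
    Observation("deploy", 1, "Tag the release"),
    Observation("deploy", 2, "Run the migrations, then wait for health checks"),
]
for line in distill(observations, "deploy"):
    print(line)
# 1. Tag the release
# 2. Run the migrations, then wait for health checks
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Four loose observations go in, and the agent gets back a two-step deploy procedure with the billing noise filtered out. That's the difference between the search result and the playbook, in miniature.&lt;/p&gt;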

&lt;h2&gt;The hard problems you can't dodge once memory is shared&lt;/h2&gt;

&lt;p&gt;I don't want to make this sound free, because it isn't. The moment you let many agents write into one store, you inherit a set of problems that single-tenant systems get to ignore, and how a system handles them is the whole ballgame.&lt;/p&gt;

&lt;p&gt;You get contradictions. Two agents write conflicting facts about the same customer. With one writer you can mostly pretend this doesn't happen. With a fleet it's guaranteed, so you need real contradiction resolution, not last-write-wins, which silently discards whichever fact happened to arrive first.&lt;/p&gt;

&lt;p&gt;You get staleness. The refund policy from March is wrong in June. Shared memory without temporal reasoning will confidently serve you a fact that was true once and isn't anymore. You need the store to understand when things were true, not just that they were said.&lt;/p&gt;

&lt;p&gt;You get the trust problem. If an agent tells two teams the same answer about the refund policy, great, but only if both teams can trace that answer back to the decision that set it. Provenance on every fact isn't a nice-to-have once memory is shared, it's the thing that keeps a shared store from becoming a confident rumor mill. This is also where a few of the single-tenant tools are honestly only partway there. Provenance and temporal reasoning are exactly the features you can skip when memory belongs to one user and can't skip when it belongs to a hundred.&lt;/p&gt;
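
&lt;p&gt;A minimal sketch of what "when things were true" plus provenance could look like on a single fact. Everything here is an assumption for illustration: the &lt;code&gt;Fact&lt;/code&gt; fields, the &lt;code&gt;fact_as_of&lt;/code&gt; helper, and the refund-window values are invented, and a real store would need far more than a &lt;code&gt;valid_from&lt;/code&gt; date to resolve contradictions.&lt;/p&gt;

&lt;pre&gt;&lt;code class="language-python"&gt;from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Fact:
    key: str          # what the fact is about
    value: str
    valid_from: date  # when the fact became true, not when it was written
    source: str       # provenance: the decision or document behind it


def fact_as_of(facts, key, on_date):
    # Consider only facts about this key that were already in effect on on_date.
    candidates = [f for f in facts if f.key == key and on_date >= f.valid_from]
    if not candidates:
        return None
    # The fact that took effect most recently supersedes the older ones.
    return max(candidates, key=lambda f: f.valid_from)


# Made-up policy history for the example.
facts = [
    Fact("refund_window", "30 days", date(2026, 3, 1), "policy memo, March"),
    Fact("refund_window", "14 days", date(2026, 6, 1), "policy memo, June"),
]

april = fact_as_of(facts, "refund_window", date(2026, 4, 15))
june = fact_as_of(facts, "refund_window", date(2026, 6, 8))
print(april.value, "from", april.source)  # 30 days from policy memo, March
print(june.value, "from", june.source)    # 14 days from policy memo, June
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;The March fact is never deleted, it's just no longer the answer in June, and every answer carries the source a team can trace it back to.&lt;/p&gt;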

&lt;p&gt;And you get the obvious objection: some things should not be shared. Sometimes you're working on something sensitive and you do not want it written to the org brain at all. That's a real constraint, and the answer is a hard org boundary plus a private mode that writes nothing, not hoping nobody pastes the wrong thing. Row-level isolation so one org can never read another's. A switch that turns writing off when you need it off.&lt;/p&gt;
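
&lt;p&gt;Those two guardrails are small enough to sketch. This is a hypothetical in-memory stand-in, not any product's API: the &lt;code&gt;OrgMemory&lt;/code&gt; class, its &lt;code&gt;private&lt;/code&gt; flag, and the org names are invented for the example, and a real system would enforce the boundary in the database, not in application code.&lt;/p&gt;

&lt;pre&gt;&lt;code class="language-python"&gt;class OrgMemory:
    # Hypothetical in-memory stand-in for a multi-org store.
    def __init__(self):
        self._rows = []  # each row is an (org_id, text) pair

    def write(self, org_id, text, private=False):
        # Private mode: the text is dropped, never stored.
        if private:
            return False
        self._rows.append((org_id, text))
        return True

    def read(self, org_id):
        # Every read is filtered by org_id, so one org never sees another's rows.
        return [text for row_org, text in self._rows if row_org == org_id]


memory = OrgMemory()
memory.write("acme", "Refunds need manager approval")
memory.write("acme", "Draft of the reorg plan", private=True)
memory.write("globex", "Deploys freeze on Fridays")
print(memory.read("acme"))    # ['Refunds need manager approval']
print(memory.read("globex"))  # ['Deploys freeze on Fridays']
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;The private write returns without touching the store, so there is nothing to leak later, and neither org can read the other's rows because no read path exists that skips the filter.&lt;/p&gt;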

&lt;p&gt;These aren't reasons shared memory is a bad idea. They're the reasons it's a hard idea, and hard ideas done right are where the value is. Single-tenant systems look simpler mostly because they've defined the interesting problems out of scope.&lt;/p&gt;

&lt;h2&gt;The plumbing should disappear&lt;/h2&gt;

&lt;p&gt;One last thing, because it's the part developers feel immediately.&lt;/p&gt;

&lt;p&gt;If you've built your own RAG memory, you know the tax. You pick a vector store. You write the chunking. You tune the retrieval. You handle the writes. You maintain an index that's its own small product. You spend a week on memory infrastructure before you've shipped a single thing the user notices.&lt;/p&gt;

&lt;p&gt;The version of this I want is one call per turn. Read the relevant stuff inline, schedule the write for the same turn, done. No index to babysit, no retrieval code to tune, no decision about what's worth storing made by hand on every turn. The system decides what's relevant and what's worth keeping, and it does the read and the write in one round trip so the agent loop stays simple. That's the model Glen ships, and whether you use it or build your own, that's the bar: the memory layer should be one tool call, not a side project.&lt;/p&gt;
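
&lt;p&gt;To make "one call per turn" concrete, here's a sketch of the interface shape I mean. It is not Glen's API. &lt;code&gt;MemoryStore&lt;/code&gt;, &lt;code&gt;turn&lt;/code&gt;, and &lt;code&gt;flush&lt;/code&gt; are hypothetical names, keyword overlap stands in for real retrieval, and a list stands in for the store.&lt;/p&gt;

&lt;pre&gt;&lt;code class="language-python"&gt;class MemoryStore:
    # Hypothetical in-memory stand-in for a shared org store.
    def __init__(self):
        self.facts = []
        self.pending = []

    def turn(self, query, observation=None):
        # Read: return stored facts that share at least one word with the query.
        words = set(query.lower().split())
        recalled = [f for f in self.facts if words.intersection(f.lower().split())]
        # Write: queue the new observation. It is committed later by flush(),
        # so the agent loop never waits on the write path.
        if observation is not None:
            self.pending.append(observation)
        return recalled

    def flush(self):
        # Commit everything queued during earlier turns.
        self.facts.extend(self.pending)
        self.pending.clear()


store = MemoryStore()
first = store.turn("how do we deploy", observation="deploy needs the release tag first")
store.flush()
second = store.turn("deploy checklist")
print(first)   # []
print(second)  # ['deploy needs the release tag first']
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;The agent loop only ever sees &lt;code&gt;turn&lt;/code&gt;. What counts as relevant and when the write lands are the store's problem, which is the whole point.&lt;/p&gt;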

&lt;h2&gt;Where this lands&lt;/h2&gt;

&lt;p&gt;The framing I'd leave you with is simple. Single-tenant memory isn't wrong, it's just scoped to the wrong unit for the place most agents actually live, which is inside an organization full of other agents. Per-user isolation was the right default for the consumer chatbot. Per-org sharing is the right default for the fleet.&lt;/p&gt;

&lt;p&gt;Once memory is shared, knowledge compounds instead of resetting, new agents start smart instead of blank, and the expertise that usually walks out the door when your best people leave stays in the store. You also have to actually solve contradictions, staleness, provenance, and isolation, which is the cost of admission and also the moat.&lt;/p&gt;

&lt;p&gt;If you're running more than one agent against the same business, it's worth asking why they don't share a brain yet. You can poke at one version of the answer we built at &lt;a href="https://www.tryglen.com" rel="noopener noreferrer"&gt;Glen&lt;/a&gt;.&lt;/p&gt;

</description>
      <category>agents</category>
      <category>ai</category>
      <category>architecture</category>
      <category>llm</category>
    </item>
  </channel>
</rss>
