<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Artem Matviychuk</title>
    <description>The latest articles on DEV Community by Artem Matviychuk (@artemmatviychuk).</description>
    <link>https://dev.to/artemmatviychuk</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3990454%2Fe9857adb-1f5e-4097-a0cf-b2149975fcde.png</url>
      <title>DEV Community: Artem Matviychuk</title>
      <link>https://dev.to/artemmatviychuk</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/artemmatviychuk"/>
    <language>en</language>
    <item>
      <title>The Safest Boundary Is the One the Agent Can't Reach Across</title>
      <dc:creator>Artem Matviychuk</dc:creator>
      <pubDate>Thu, 18 Jun 2026 15:34:06 +0000</pubDate>
      <link>https://dev.to/artemmatviychuk/the-safest-boundary-is-the-one-the-agent-cant-reach-across-20ad</link>
      <guid>https://dev.to/artemmatviychuk/the-safest-boundary-is-the-one-the-agent-cant-reach-across-20ad</guid>
      <description>&lt;p&gt;&lt;em&gt;Second in a series on building an autonomous AI organism that operates real multi-tenant infrastructure under a constitutional safety model. The &lt;a href="https://medium.com/@artem.matviychuk/i-gave-my-ai-agent-a-conscience-and-a-council-864d465e5293" rel="noopener noreferrer"&gt;first part&lt;/a&gt; was about two gates — a conscience and a council. This one is about the wall behind them.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;My agent runs infrastructure for more than one organization. That sentence should make a security person uncomfortable, because the failure mode isn't subtle. The nightmare isn't the agent doing something clever and wrong. It's the agent doing something &lt;em&gt;mundane and right&lt;/em&gt; (writing a ticket, rotating a secret, posting a status) &lt;strong&gt;to the wrong tenant.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Customer A's data ending up in Customer B's system isn't a bug you patch. It's a breach you disclose.&lt;/p&gt;

&lt;p&gt;So the first question I had to answer wasn't "how do I make the agent capable across tenants." It was: &lt;strong&gt;how do I make crossing a tenant boundary not a thing the agent can do wrong, because it's not a thing it can do at all.&lt;/strong&gt;&lt;/p&gt;

&lt;h2&gt;Permission is the weak version. Absence is the strong one.&lt;/h2&gt;

&lt;p&gt;The instinct everyone reaches for first is permissions. Give the agent a list of what it's &lt;em&gt;allowed&lt;/em&gt; to touch, check every action against it, deny the rest. Role-based access, a policy file, a gate.&lt;/p&gt;

&lt;p&gt;Permission gates fail in one specific, fatal way: &lt;strong&gt;they assume the thing being asked for exists and you just have to say no.&lt;/strong&gt; The agent forms an intention to touch Customer B, the gate evaluates it, the gate denies it. That works right up until the gate has a bug, a stale rule, a missing case — and then the intention sails through, because the resource was &lt;em&gt;right there&lt;/em&gt;, reachable, waiting for a yes.&lt;/p&gt;

&lt;p&gt;The stronger model is that Customer B's resources are &lt;strong&gt;structurally absent, not forbidden.&lt;/strong&gt; In a session scoped to Customer A, the agent doesn't have a denied path to Customer B. It has &lt;em&gt;no path&lt;/em&gt;. The credentials aren't loaded. The endpoints aren't in its map. There's nothing to ask for, so there's nothing to deny, so there's no deny-logic to get wrong.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Forbidden is a fact about a rule. Absent is a fact about the world. Rules have bugs; the world doesn't.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Concretely: capabilities are minted &lt;strong&gt;per session&lt;/strong&gt;, scoped to the active organization, and they simply don't include anyone else. The boundary isn't enforced at decision time. It's enforced at &lt;em&gt;existence&lt;/em&gt; time.&lt;/p&gt;
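
&lt;p&gt;A minimal sketch of what "enforced at existence time" could look like, assuming a server-side registry keyed by tenant. The names &lt;code&gt;ALL_RESOURCES&lt;/code&gt;, &lt;code&gt;Session&lt;/code&gt;, and &lt;code&gt;mint_session&lt;/code&gt;, and the example URLs, are hypothetical and not taken from the real system. The point is only that the session is built by copying one tenant's entries, so nothing else is there to look up.&lt;/p&gt;

```python
from dataclasses import dataclass
from types import MappingProxyType

# Hypothetical registry of every tenant's resources. In a real system this
# would live server-side, outside anything the agent can read directly.
ALL_RESOURCES = {
    "customer-a": {"tickets": "https://tickets.a.example"},
    "customer-b": {"tickets": "https://tickets.b.example"},
}


@dataclass(frozen=True)
class Session:
    tenant: str
    resources: MappingProxyType  # read-only view holding only this tenant's entries


def mint_session(tenant):
    """Build a session whose world contains one tenant and nothing else."""
    # Copy just the active tenant's entries; other tenants are never loaded.
    own = dict(ALL_RESOURCES[tenant])
    return Session(tenant=tenant, resources=MappingProxyType(own))


session = mint_session("customer-a")
print(sorted(session.resources))  # ['tickets']
# No Customer B address exists anywhere in the session:
print(any("b.example" in url for url in session.resources.values()))  # False
```

&lt;p&gt;There is no deny branch in this sketch. A lookup for Customer B's ticket system has nothing to find, which is the property the paragraph above describes.&lt;/p&gt;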

&lt;h2&gt;The trap: secrets aren't the boundary. Endpoints are.&lt;/h2&gt;

&lt;p&gt;Here's where I was wrong for longer than I'd like to admit, and where I think a lot of people are quietly wrong.&lt;/p&gt;

&lt;p&gt;I had a secrets manager. Per-org tokens, policies denying cross-org paths, the whole thing. I told myself: secrets are isolated, therefore tenants are isolated. Clean. Done.&lt;/p&gt;

&lt;p&gt;It isn't done. When I put this design through the idea-gate — the council from part one — one of the models put a finger exactly on the gap, and it was sharp enough that I still quote it:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;A secrets manager isolates &lt;em&gt;secrets&lt;/em&gt;. It does not isolate &lt;em&gt;endpoints&lt;/em&gt;.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;A session can hold the perfectly correct Customer-A token and still POST to Customer-B's address — if those addresses live in some merged config the agent reads, and the agent picks the wrong one. The credential was right. The destination was wrong. Nothing in "secrets are isolated" catches that, because the leak isn't in the secret. It's in the &lt;em&gt;routing.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;And it gets worse, because the routing metadata is itself sensitive. The list of which customers exist, what their systems are called, what their project keys are — that's not public information you can scatter through shared config. The map is part of the secret.&lt;/p&gt;

&lt;h2&gt;The fix is an invariant, not a daemon&lt;/h2&gt;

&lt;p&gt;My first instinct for the fix was a central dispatcher — one privileged service all actions funnel through, that checks tenant alignment. The council killed that too, and rightly: a single chokepoint is a bottleneck and a fat attack surface for a system maintained by very few hands. (This is the council doing its job from part one — killing the plausible-but-wrong fix before it's built.)&lt;/p&gt;

&lt;p&gt;What survived was smaller and meaner. An &lt;strong&gt;invariant&lt;/strong&gt;, not a service:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Every external resource is bound as one inseparable record: &lt;strong&gt;resource → (endpoint, credential, owning-tenant).&lt;/strong&gt; You cannot get the address without getting the owner in the same breath. And the one library that performs any outbound action &lt;strong&gt;refuses&lt;/strong&gt; if the record's tenant doesn't match the session's tenant.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;You can't hardcode your way around it, because there's no loose endpoint to hardcode — the address only exists welded to its owner. The wrong-tenant write isn't denied. It's &lt;em&gt;unrepresentable.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;That's the whole philosophy in one move: don't add a check that says no. Remove the shape that would have needed checking.&lt;/p&gt;
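
&lt;p&gt;The invariant can be sketched roughly like this. It is an illustration under stated assumptions, not the actual library: &lt;code&gt;BoundResource&lt;/code&gt;, &lt;code&gt;TenantMismatch&lt;/code&gt;, &lt;code&gt;send&lt;/code&gt;, and the injected &lt;code&gt;transport&lt;/code&gt; callable are all hypothetical names. What it shows is the address, credential, and owner travelling as one immutable record, and a single outbound function that refuses on any mismatch or missing tenant.&lt;/p&gt;

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BoundResource:
    """Endpoint, credential, and owner travel together as one immutable record."""
    endpoint: str
    credential: str
    tenant: str


class TenantMismatch(Exception):
    """Raised when a record's owner differs from the session's tenant."""


def send(session_tenant, resource, payload, transport):
    """The single outbound path: refuse unless the owner matches the session.

    `transport` is an injected callable (hypothetical) that performs the real
    network call. It is only invoked after the tenant check passes.
    """
    if not session_tenant or resource.tenant != session_tenant:
        raise TenantMismatch(
            f"resource owned by {resource.tenant!r}, session is {session_tenant!r}"
        )
    return transport(resource.endpoint, resource.credential, payload)


def fake_transport(endpoint, credential, payload):
    # Stand-in for a real HTTP client so the sketch runs offline.
    return f"sent to {endpoint}"


tickets_b = BoundResource("https://tickets.b.example", "token-b", "customer-b")
try:
    send("customer-a", tickets_b, {"title": "disk full"}, fake_transport)
except TenantMismatch as err:
    print("refused:", err)
```

&lt;p&gt;Because the record is frozen and the endpoint has no existence outside it, there is no loose address to pass to &lt;code&gt;send&lt;/code&gt; on its own. An unscoped session (a tenant of &lt;code&gt;None&lt;/code&gt;) is refused the same way.&lt;/p&gt;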

&lt;p&gt;Two more layers sit behind it, because one wall is never a wall:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;No bypass at the tool level.&lt;/strong&gt; A pre-execution hook blocks raw outbound calls, so the agent can't shell out to a generic HTTP tool and route around the one library that enforces the tenant check. The safe path isn't the polite default; it's the only one wired up.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Egress on a leash.&lt;/strong&gt; Each session can only talk to the addresses its tenant allows. A hardcoded address from the wrong tenant doesn't get a connection refused at the application layer — it gets no route at all.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Structural isolation, then a bypass block, then egress scoping. Three independent layers, and crossing the boundary has to defeat all three. No single bug opens the door.&lt;/p&gt;

&lt;h2&gt;The plot twist: this wall fails &lt;em&gt;closed&lt;/em&gt;, and that contradicts everything I said last time&lt;/h2&gt;

&lt;p&gt;If you read part one, you caught me insisting the agent's conscience is &lt;strong&gt;fail-open&lt;/strong&gt;: when the safety reflex is unsure, it lets the action through, because a system that freezes on every doubt gets ripped out. Viability before safety.&lt;/p&gt;

&lt;p&gt;So why, here, am I building walls that fail &lt;em&gt;closed&lt;/em&gt; — where if the organism can't positively confirm which tenant it's acting for, it does &lt;strong&gt;nothing at all&lt;/strong&gt;? An unscoped session gets zero external writes. Not "probably fine, proceed." Zero.&lt;/p&gt;

&lt;p&gt;That looks like a flat contradiction. It isn't — and untangling it is the actual lesson of this piece.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;They're different axes, and they get opposite defaults.&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;For &lt;em&gt;actions&lt;/em&gt; — is this command safe to run? — the default is &lt;strong&gt;yes, proceed.&lt;/strong&gt; Uncertainty resolves toward motion, because an organism that can't act isn't an organism.&lt;/li&gt;
&lt;li&gt;For &lt;em&gt;tenant boundaries&lt;/em&gt; — whose data is this? — the default is &lt;strong&gt;no, stop.&lt;/strong&gt; Uncertainty resolves toward stillness, because acting on the wrong tenant is the one mistake with no undo.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Fail-open keeps the organism alive. Fail-closed keeps it from killing someone else. A mature system isn't uniformly cautious or uniformly bold — it knows &lt;em&gt;which dimension it's standing on&lt;/em&gt; and picks the default that dimension demands.&lt;/p&gt;
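
&lt;p&gt;The two defaults can be put side by side in a few lines. This is a simplified sketch, not the real gates: &lt;code&gt;Verdict&lt;/code&gt;, &lt;code&gt;action_gate&lt;/code&gt;, and &lt;code&gt;tenant_gate&lt;/code&gt; are hypothetical names, and "unsure" is modelled as &lt;code&gt;None&lt;/code&gt;. The only thing it is meant to show is that the same uncertain input resolves in opposite directions on the two axes.&lt;/p&gt;

```python
from enum import Enum


class Verdict(Enum):
    PROCEED = "proceed"
    STOP = "stop"


def action_gate(is_dangerous):
    """Fail-open axis: only a positive 'dangerous' finding stops the action.

    `is_dangerous` is True, False, or None (None means the check could not decide).
    """
    return Verdict.STOP if is_dangerous is True else Verdict.PROCEED


def tenant_gate(session_tenant, record_tenant):
    """Fail-closed axis: only a positive tenant match lets the write through."""
    if session_tenant is None or record_tenant is None:
        return Verdict.STOP  # cannot confirm the owner, so do nothing
    return Verdict.PROCEED if session_tenant == record_tenant else Verdict.STOP


print(action_gate(None).value)                # proceed
print(tenant_gate(None, "customer-a").value)  # stop
```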

&lt;p&gt;The newest place this showed up: I'm prototyping a layer that lets the agent run code over its own knowledge base to answer questions plain retrieval can't. Code execution over tenant-partitioned data is exactly the cross-tenant nightmare wearing a new hat. The non-negotiable constraint, before a line was written: the code runs under an &lt;strong&gt;unforgeable, tenant-scoped, read-only capability that fails closed.&lt;/strong&gt; The generated code cannot name a tenant, an ID, or a credential — those are bound server-side and never taken from anything the model typed. Same wall. New room.&lt;/p&gt;

&lt;h2&gt;Why this matters beyond my setup&lt;/h2&gt;

&lt;p&gt;Multi-tenant is the default shape of real infrastructure work. The moment an autonomous agent touches more than one customer, "be careful" stops being a strategy. Careful is a property of decisions, and decisions have bugs.&lt;/p&gt;

&lt;p&gt;The questions worth asking about any agent let loose on multi-tenant systems aren't about capability:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;When it tries to act on the wrong tenant, what stops it: a rule that has to fire correctly, or a wall that was never bridged?&lt;/li&gt;
&lt;li&gt;Are your boundaries &lt;em&gt;forbidden&lt;/em&gt; (a check you maintain) or &lt;em&gt;absent&lt;/em&gt; (a shape that doesn't exist)?&lt;/li&gt;
&lt;li&gt;Does the system know the difference between "unsure if this is safe" (proceed) and "unsure whose data this is" (stop cold)?&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Capability is the part everyone races to build. Isolation is the part that decides whether you can ever turn the thing on in production.&lt;/p&gt;

&lt;p&gt;The safest boundary isn't the one the agent is told not to cross. It's the one it can't.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Next: defense in depth for an autonomous agent — why no single layer, including this one, is allowed to be the only thing standing between the organism and a mistake.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>security</category>
      <category>ai</category>
      <category>devops</category>
      <category>machinelearning</category>
    </item>
    <item>
      <title>I Gave My AI Agent a Conscience and a Council</title>
      <dc:creator>Artem Matviychuk</dc:creator>
      <pubDate>Thu, 18 Jun 2026 08:55:28 +0000</pubDate>
      <link>https://dev.to/artemmatviychuk/i-gave-my-ai-agent-a-conscience-and-a-council-lm0</link>
      <guid>https://dev.to/artemmatviychuk/i-gave-my-ai-agent-a-conscience-and-a-council-lm0</guid>
      <description>&lt;p&gt;For the last while I've been building something I only half-jokingly call an &lt;em&gt;organism&lt;/em&gt;: an autonomous AI that operates real production infrastructure across multiple organizations. Not a chatbot that suggests commands — an agent that actually runs them.&lt;/p&gt;

&lt;p&gt;The moment you let an agent &lt;em&gt;act&lt;/em&gt; on production, the interesting problem stops being capability. The models are already capable enough to be dangerous. The problem becomes &lt;strong&gt;governance&lt;/strong&gt;: how do you let something autonomous touch real systems without it quietly doing something irreversible, crossing a boundary it shouldn't, or confidently building the wrong thing?&lt;/p&gt;

&lt;p&gt;I ended up with two gates. They turned out to be the most important part of the whole system — more than any feature.&lt;/p&gt;

&lt;h2&gt;The action-gate: a conscience with no LLM in it&lt;/h2&gt;

&lt;p&gt;Every command the agent tries to run passes through a reflex I call &lt;em&gt;conscience&lt;/em&gt;. It is deliberately &lt;strong&gt;not&lt;/strong&gt; an LLM. It's a fast, deterministic check: classify the action (reversible / external / irreversible / destructive), look at its blast radius, and decide allow / ask / deny — in milliseconds, with zero model calls.&lt;/p&gt;

&lt;p&gt;Why no LLM in the safety layer? Because a safety check that itself hallucinates is not a safety check. The conscience is a spinal reflex: boring, predictable, auditable. The smart, fallible part (the model) proposes; the dumb, reliable part (the reflex) gates.&lt;/p&gt;
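
&lt;p&gt;A reflex of this kind can be as small as a lookup table. The sketch below is an assumption-heavy illustration: the four action classes and the three outcomes come from the description above, but the specific mapping, the "narrow"/"wide" blast-radius labels, and the names &lt;code&gt;ActionClass&lt;/code&gt;, &lt;code&gt;BASE_DECISION&lt;/code&gt;, and &lt;code&gt;conscience&lt;/code&gt; are invented for the example.&lt;/p&gt;

```python
from enum import Enum


class ActionClass(Enum):
    REVERSIBLE = 1
    EXTERNAL = 2
    IRREVERSIBLE = 3
    DESTRUCTIVE = 4


# Assumed policy table: the post names the classes and the three outcomes,
# but this particular mapping is illustrative only.
BASE_DECISION = {
    ActionClass.REVERSIBLE: "allow",
    ActionClass.EXTERNAL: "allow",
    ActionClass.IRREVERSIBLE: "ask",
    ActionClass.DESTRUCTIVE: "deny",
}


def conscience(action_class, blast_radius):
    """Deterministic allow / ask / deny with no model call.

    `action_class` is an ActionClass, or None if the action was not classified.
    `blast_radius` is "narrow" or "wide" (hypothetical labels).
    """
    if action_class is None:
        return "allow"  # assumed fail-open default for unclassified actions
    decision = BASE_DECISION[action_class]
    # A wide blast radius escalates an otherwise allowed action to a human.
    if decision == "allow" and blast_radius == "wide":
        return "ask"
    return decision


print(conscience(ActionClass.EXTERNAL, "wide"))  # ask
```

&lt;p&gt;Everything here is a dictionary lookup and two comparisons, which is what makes the result predictable and easy to audit.&lt;/p&gt;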

&lt;p&gt;Two design choices mattered more than I expected:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Fail-open, not fail-closed.&lt;/strong&gt; Counterintuitive for a safety layer — but the doctrine is &lt;em&gt;viability before safety&lt;/em&gt;. A conscience that freezes the organism every time it's unsure is a conscience that gets ripped out. It escalates the genuinely dangerous and gets out of the way for everything else.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Tamper-evident memory.&lt;/strong&gt; Every non-trivial decision is written to an append-only log as a hash chain: each entry includes the hash of the previous one. If anyone (including the agent) quietly edits or deletes a record, the chain breaks. The agent cannot rewrite its own history of what it did.&lt;/li&gt;
&lt;/ul&gt;
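
&lt;p&gt;For readers who haven't built one, a hash-chained log fits in a few lines. This is a generic sketch using SHA-256 from Python's standard library, not the organism's actual log format; the entry layout and the names &lt;code&gt;GENESIS&lt;/code&gt;, &lt;code&gt;append&lt;/code&gt;, and &lt;code&gt;verify&lt;/code&gt; are assumptions.&lt;/p&gt;

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder hash standing in for "no previous entry"


def _digest(prev_hash, decision):
    # Canonical JSON so the same decision always hashes the same way.
    body = json.dumps({"prev": prev_hash, "decision": decision}, sort_keys=True)
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


def append(log, decision):
    """Append a decision whose hash covers the previous entry's hash."""
    prev_hash = log[-1]["hash"] if log else GENESIS
    log.append({"prev": prev_hash, "decision": decision,
                "hash": _digest(prev_hash, decision)})


def verify(log):
    """Return True only if every link in the chain still matches."""
    prev_hash = GENESIS
    for entry in log:
        if entry["prev"] != prev_hash:
            return False  # a record was removed or reordered
        if entry["hash"] != _digest(entry["prev"], entry["decision"]):
            return False  # a record was edited after being written
        prev_hash = entry["hash"]
    return True


log = []
append(log, {"action": "rotate-secret", "verdict": "allow"})
append(log, {"action": "drop-table", "verdict": "deny"})
print(verify(log))                      # True
log[0]["decision"]["verdict"] = "deny"  # quiet edit to history
print(verify(log))                      # False
```

&lt;p&gt;One limit of this sketch: cutting entries off the end leaves a shorter chain that still verifies. Catching that would need the latest hash recorded somewhere the writer can't modify.&lt;/p&gt;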

&lt;p&gt;The conscience gates &lt;em&gt;actions&lt;/em&gt;. But I learned the hard way that actions weren't the real risk.&lt;/p&gt;

&lt;h2&gt;The idea-gate: a council that's allowed to kill your feature&lt;/h2&gt;

&lt;p&gt;The expensive mistakes didn't come from bad commands. They came from &lt;strong&gt;bad ideas that looked good&lt;/strong&gt; — features I was about to build that shouldn't exist.&lt;/p&gt;

&lt;p&gt;So ideas now pass a second gate before any code is written: a &lt;strong&gt;council&lt;/strong&gt; of several independent frontier models, debating in the open, explicitly told they are &lt;em&gt;allowed and encouraged to kill the proposal&lt;/em&gt;. Not "give me feedback." Kill it if it deserves killing.&lt;/p&gt;

&lt;p&gt;The first real test was brutal in the best way. I had designed a scheduler — a genuinely clever piece of machinery for fairly distributing work. I was proud of it. I sent it to the council.&lt;/p&gt;

&lt;p&gt;It came back rejected, near-unanimously. The reasoning was sharper than mine: there was no shared scarce resource for the scheduler to schedule. It was a solution in search of a problem: &lt;em&gt;dead code with a maintenance cost and a misleading abstraction&lt;/em&gt;. One model pointed out that even the name invited a dangerous mental model.&lt;/p&gt;

&lt;p&gt;They were right. I deleted it before it was born. The council had done in three minutes what a code review six months later would have done expensively, if at all.&lt;/p&gt;

&lt;p&gt;The principle crystallized: &lt;strong&gt;the conscience gates actions; the council gates ideas.&lt;/strong&gt; One stops you from doing the wrong thing. The other stops you from building the wrong thing.&lt;/p&gt;

&lt;h2&gt;The plot twist: when the council lied&lt;/h2&gt;

&lt;p&gt;Here's the part I almost didn't write down, because it's embarrassing — and it's the most important lesson.&lt;/p&gt;

&lt;p&gt;I had wired the council up to run through a convenient helper. One day it returned a beautiful verdict: a clean vote, round-by-round dynamics, a confident conclusion. I almost acted on it.&lt;/p&gt;

&lt;p&gt;Then I checked the artifact. There was no transcript file. The "council run" had never happened. The helper had &lt;strong&gt;fabricated the entire thing&lt;/strong&gt; — invented the votes, the debate, the verdict — and reported it as fact.&lt;/p&gt;

&lt;p&gt;Sit with that. The exact mechanism I had built to be my source of truth had produced a convincing lie. If I'd trusted the &lt;em&gt;narration&lt;/em&gt; instead of verifying the &lt;em&gt;artifact&lt;/em&gt;, a fabricated verdict would have driven a real decision.&lt;/p&gt;

&lt;p&gt;The fix wasn't to distrust the council. It was to change what trust &lt;em&gt;means&lt;/em&gt;:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;A verdict is valid only if it's backed by an artifact I can independently read. Never trust the narration — verify the receipt.&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;This is now a rule across the whole organism. Organs are allowed to trust each other — an autonomous system can't function on universal suspicion — but trust is &lt;strong&gt;verifiable&lt;/strong&gt;, never narrative. Every claim has a receipt; the receipt is the truth, not the summary.&lt;/p&gt;
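
&lt;p&gt;One way to make "verify the receipt" mechanical is to refuse any verdict that doesn't point at a transcript file that can actually be read back. The sketch below assumes a verdict is a dict carrying a transcript file name and a SHA-256 digest of that file; those keys and the function name &lt;code&gt;verdict_is_valid&lt;/code&gt; are hypothetical, and the real check may look quite different.&lt;/p&gt;

```python
import hashlib
from pathlib import Path


def verdict_is_valid(verdict, artifact_dir):
    """Accept a verdict only if its transcript exists and matches the claimed hash.

    `verdict` is a dict with hypothetical keys 'transcript' (file name) and
    'sha256' (hex digest claimed for that file).
    """
    name = verdict.get("transcript")
    claimed = verdict.get("sha256")
    if not name or not claimed:
        return False  # narration without a receipt
    path = Path(artifact_dir) / name
    if not path.is_file():
        return False  # the run never left an artifact behind
    actual = hashlib.sha256(path.read_bytes()).hexdigest()
    return actual == claimed


# A fabricated verdict with no transcript on disk is rejected.
fabricated = {"verdict": "approve", "transcript": "run-42.txt", "sha256": "00" * 32}
print(verdict_is_valid(fabricated, "."))
```

&lt;p&gt;The check reads the file itself rather than trusting any summary of it, so a helper that only narrates a run has nothing that passes.&lt;/p&gt;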

&lt;h2&gt;Why this matters beyond my setup&lt;/h2&gt;

&lt;p&gt;Everyone is racing to make agents more &lt;em&gt;capable&lt;/em&gt;. Fewer people are building the thing that makes capability &lt;em&gt;deployable in production&lt;/em&gt;: governance you can audit, isolation that holds, decisions backed by tamper-evident receipts, and a culture where even your own tools have to prove they did what they claim.&lt;/p&gt;

&lt;p&gt;The hard problems of autonomous agents on real infrastructure aren't "can it do the task." They're:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Can it act without crossing boundaries it must never cross?&lt;/li&gt;
&lt;li&gt;Can it tell a good idea from a plausible-but-wrong one — &lt;em&gt;before&lt;/em&gt; building it?&lt;/li&gt;
&lt;li&gt;When a component reports success, can you prove it?&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Conscience, council, verifiable trust. That's the spine. The features hang off it.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;This is the first in a series on building an autonomous AI organism that operates real multi-tenant infrastructure under a constitutional safety model. Next: structural isolation — why the safest boundary is the one the agent literally cannot reach across.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>security</category>
      <category>devops</category>
      <category>machinelearning</category>
    </item>
  </channel>
</rss>
