<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Virginia Nyambura  Mwega</title>
    <description>The latest articles on DEV Community by Virginia Nyambura  Mwega (@virginiamwega2svg).</description>
    <link>https://dev.to/virginiamwega2svg</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F4005611%2Faf72765f-e3ee-4aff-8c58-be369191fedb.jpeg</url>
      <title>DEV Community: Virginia Nyambura  Mwega</title>
      <link>https://dev.to/virginiamwega2svg</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/virginiamwega2svg"/>
    <language>en</language>
    <item>
      <title>Your WHERE clause is not a security boundary (multi-tenant RAG with pgvector + RLS)</title>
      <dc:creator>Virginia Nyambura  Mwega</dc:creator>
      <pubDate>Sun, 28 Jun 2026 19:01:51 +0000</pubDate>
      <link>https://dev.to/virginiamwega2svg/your-where-clause-is-not-a-security-boundary-multi-tenant-rag-with-pgvector-rls-28fk</link>
      <guid>https://dev.to/virginiamwega2svg/your-where-clause-is-not-a-security-boundary-multi-tenant-rag-with-pgvector-rls-28fk</guid>
      <description>&lt;p&gt;&lt;em&gt;TL;DR: app-layer filtering is a single point of failure. Push tenant isolation into Postgres with RLS — and watch out for the security definer trap in your vector-search function.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Your WHERE clause is not a security boundary&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;My app is an AI wellness coach for parents. Every user's data is about the most private thing they have: how they're actually coping. Their check-ins, their bad nights, the things they'd never say out loud. The whole product runs on retrieval — when someone talks to the coach, the system pulls their relevant history out of a vector store and grounds the response in it.&lt;/p&gt;

&lt;p&gt;Which means the single scariest bug I can imagine isn't a crash. It's user A asking a question and the retrieval quietly returning a snippet of user B's private history. No error. No stack trace. Just one person's worst night surfacing in another person's conversation.&lt;/p&gt;

&lt;p&gt;In a multi-tenant app, that bug is one forgotten line of code away at all times. Here's how I make sure it can't happen — and the part of it that no tutorial warns you about.&lt;/p&gt;

&lt;p&gt;The obvious fix is a single point of failure&lt;/p&gt;

&lt;p&gt;The instinctive way to keep tenants apart is to filter in your query:&lt;/p&gt;

&lt;p&gt;``&lt;br&gt;
sql&lt;br&gt;
select * from embeddings&lt;br&gt;
where user_id = $current_user&lt;br&gt;
order by embedding &amp;lt;=&amp;gt; $query&lt;br&gt;
limit 5;&lt;/p&gt;

&lt;p&gt;``&lt;/p&gt;

&lt;p&gt;This works. It also relies on me, a tired human, remembering to write where user_id = ... on every single query that ever touches that table, forever, across every feature, including the ones I haven't built yet.&lt;/p&gt;

&lt;p&gt;That's not a security boundary. That's a promise. And the failure mode of a promise is that the day you forget it — or a new query path skips it, or a refactor drops it  there is nothing underneath to catch you. The app returns the wrong tenant's data and looks completely healthy doing it. That's exactly the shape of bug I caught in my own audit once. I didn't want to rely on never making it again.&lt;/p&gt;

&lt;p&gt;Isolation belongs in the database, not the application&lt;/p&gt;

&lt;p&gt;The fix is to move the boundary down a layer, into Postgres itself, using Row Level Security. RLS lets the database enforce which rows a user is even allowed to see, regardless of what the query asks for.&lt;/p&gt;

&lt;p&gt;``&lt;br&gt;
sql&lt;br&gt;
alter table embeddings enable row level security;&lt;/p&gt;

&lt;p&gt;create policy "Users read their own embeddings"&lt;br&gt;
on embeddings for select&lt;br&gt;
using (auth.uid() = user_id);&lt;/p&gt;

&lt;p&gt;``&lt;/p&gt;

&lt;p&gt;Now the rule isn't "please remember to filter." The rule is: this user physically cannot select another user's rows, because the database won't return them. A query that forgets the filter still comes back isolated, because the isolation isn't in the query anymore — it's in the table.&lt;/p&gt;

&lt;p&gt;This is defense in depth, the same principle security people have leaned on for decades. The app-layer filter is still there as the first line. RLS is the backstop that makes a mistake in that first line survivable instead of catastrophic. One layer can fail without the whole guarantee failing.&lt;/p&gt;

&lt;p&gt;The pgvector trap nobody mentions&lt;/p&gt;

&lt;p&gt;Here's where it gets interesting, and where I'd put real money that most "build RAG on Supabase" tutorials are quietly broken.&lt;/p&gt;

&lt;p&gt;Vector similarity search is usually wrapped in a SQL function — a match_documents-style RPC  so you can call it cleanly from your app and keep the ANN index happy:&lt;/p&gt;

&lt;p&gt;``&lt;br&gt;
sql&lt;br&gt;
create function match_user_docs(query_embedding vector(1536), match_count int)&lt;br&gt;
returns setof embeddings&lt;br&gt;
language sql&lt;br&gt;
as $$&lt;br&gt;
  select *&lt;br&gt;
  from embeddings&lt;br&gt;
  order by embedding &amp;lt;=&amp;gt; query_embedding&lt;br&gt;
  limit match_count;&lt;br&gt;
$$;&lt;/p&gt;

&lt;p&gt;``&lt;/p&gt;

&lt;p&gt;The footgun is the function's security mode. If you mark a function security definer — and a lot of copy-pasted vector-search examples do, to smooth over permissions — it runs with the definer's privileges and bypasses the caller's RLS entirely. You carefully set up Row Level Security on the table, then call it through a function that turns that protection off, and you'd never know: the function returns results, the app works in the demo, and every tenant's vectors are quietly reachable through that one call.&lt;/p&gt;

&lt;p&gt;The fix is boring and important: keep the search function security invoker so the caller's RLS still applies, or — if it genuinely has to be security definer — filter by auth.uid() inside the function and pin the search_path. The point is to never let the convenience wrapper become the hole in the wall you just built.&lt;/p&gt;

&lt;p&gt;One more wrinkle: filtering and approximate search fight a little&lt;/p&gt;

&lt;p&gt;There's a subtle performance interaction worth knowing. pgvector's index (HNSW or IVFFlat) does approximate nearest-neighbor search — it returns roughly the closest vectors, fast. Add RLS on top, and the isolation filter trims that candidate set down to the current tenant's rows.&lt;/p&gt;

&lt;p&gt;If you ask the index for the global top 5 and then isolation removes the ones that aren't yours, you can end up with fewer than 5 results — or, in a busy table, none. The pattern is to over-fetch: ask the index for more candidates than you need, so that after isolation you still have enough to ground a good answer. It's a small thing that only shows up under real multi-tenant load, which is exactly why it's worth saying out loud.&lt;/p&gt;

&lt;p&gt;The takeaway&lt;/p&gt;

&lt;p&gt;The model gets all the attention, but the part of an AI app that has to be certain is rarely the model. Here, it's the data boundary. And a boundary you enforce in application code is only as strong as your memory on your worst day.&lt;/p&gt;

&lt;p&gt;So I push it down to where it can't be forgotten. The app filters because it should. The database isolates because it must. One forgotten where clause should be a non-event, not a breach — and the only way to guarantee that is to stop trusting the query and start trusting the table.&lt;/p&gt;

</description>
      <category>database</category>
      <category>ai</category>
      <category>security</category>
      <category>postgres</category>
    </item>
  </channel>
</rss>
