PostgreSQL Row-Level Security Saved My SaaS From Bugs I Didn't Know I Had
I build Nokos, an AI note-taking app. Every user's memos, diaries, and coding sessions are stored in one PostgreSQL database. One authorization bug = one user sees another's private data.
Most apps have one layer of defense: application-level auth checks. We have two. The second layer — PostgreSQL Row-Level Security — has already caught bugs that our application code missed.
The Setup: One Function, Total Isolation
Our entire RLS system hinges on one PostgreSQL function:
CREATE OR REPLACE FUNCTION current_app_user_id() RETURNS UUID AS $$
SELECT NULLIF(current_setting('app.current_user_id', true), '')::UUID;
$$ LANGUAGE SQL STABLE SECURITY DEFINER;
Every table policy checks: WHERE user_id = current_app_user_id().
On every API request, we set the session variable inside a transaction:
export async function withRLS<T>(
userId: string,
callback: (tx: TransactionClient) => Promise<T>,
): Promise<T> {
return prisma.$transaction(async (tx) => {
await tx.$executeRaw`SELECT set_config('app.current_user_id', ${userId}, true)`;
return callback(tx);
});
}
The true in set_config makes it transaction-local. When the transaction ends, the variable resets. No leakage between requests.
The Bug That RLS Caught
We use fire-and-forget patterns for non-critical async work — like generating embeddings after a memo is saved:
// After memo is created, generate embedding async
generateEmbedding(contentText)
.then(async (embedding) => {
// BUG: This runs OUTSIDE the withRLS transaction
await prisma.$executeRawUnsafe(
`UPDATE memos SET embedding = $1::vector WHERE id = $2::uuid`,
embedding, memoId,
);
});
This looks correct. The UPDATE has the right memo ID. What could go wrong?
Everything. Because memos has FORCE ROW LEVEL SECURITY, and this code runs outside withRLS(). There's no app.current_user_id set. The function returns NULL. The policy evaluates user_id = NULL, which is always false.
Result: 0 rows updated. No error. No warning. Complete silence.
The embedding was generated, the API call to Gemini was paid for, and the UPDATE ran successfully — it just matched zero rows. PostgreSQL doesn't consider "zero rows updated" an error.
The fix:
generateEmbedding(contentText)
.then(async (embedding) => {
// FIXED: wrap in withRLS so the policy can match
await withRLS(userId, async (tx) => {
await tx.$executeRawUnsafe(
`UPDATE memos SET embedding = $1::vector WHERE id = $2::uuid`,
embedding, memoId,
);
});
});
Without RLS, this bug would have silently worked in dev and production. The UPDATE would hit the row directly, no policy check. We'd never know the auth boundary was missing — until someone found a way to exploit it.
FORCE vs NO FORCE: The Decision That Matters
PostgreSQL RLS has a subtle setting: FORCE ROW LEVEL SECURITY applies policies even to the table owner. Without FORCE, the owner role bypasses all policies.
We use FORCE on every table except five:
| Table | NO FORCE | Why |
|---|---|---|
users |
✓ | Auth middleware looks up users by firebase_uid before user_id is known |
user_usage |
✓ | Usage records are created during first login, before RLS context exists |
api_keys |
✓ | API key auth looks up keys by SHA-256 hash before user_id is known |
device_refresh_tokens |
✓ | Token refresh happens before user auth |
device_codes |
✓ | Device Authorization Flow — codes exist before any user is authenticated |
The pattern: tables that are accessed before authentication completes need NO FORCE. Everything else — memos, books, tags, chat sessions, media — uses FORCE.
This means if we accidentally run a query outside withRLS() on a FORCE table, it returns zero rows instead of leaking data. Default deny.
Two Roles: App vs Batch
We use two PostgreSQL roles:
-
nokos_app— Used by the API. Subject to RLS. -
nokos_batch— Used by batch jobs (diary generation, embedding backfill). Hasrow_security = off, completely bypassing RLS.
Why? Batch jobs need to process data across all users. The daily diary generator reads all users' memos to create personalized diaries. Running this through RLS would require setting app.current_user_id for each user in a loop — technically possible, but fragile and slow.
The trade-off: nokos_batch has no authorization boundary. We accept this because batch code runs in a controlled environment (Cloud Scheduler → Cloud Run endpoint with secret-based auth), never exposed to user input.
Policy Patterns
Direct ownership (most tables):
CREATE POLICY memos_select ON memos
FOR SELECT USING (user_id = current_app_user_id());
JOIN-based (junction tables without user_id):
CREATE POLICY memo_tags_select ON memo_tags
FOR SELECT USING (
EXISTS (SELECT 1 FROM memos
WHERE memos.id = memo_tags.memo_id
AND memos.user_id = current_app_user_id())
);
Open insert, restricted read (users table — anyone can sign up):
CREATE POLICY users_insert ON users
FOR INSERT WITH CHECK (true);
CREATE POLICY users_select ON users
FOR SELECT USING (id = current_app_user_id());
Each policy is idempotent (DROP POLICY IF EXISTS before CREATE POLICY), so we safely reapply them on every deployment.
What I'd Do Differently
Start with FORCE on every table. We initially had some tables without it and had to migrate. Starting strict is easier than tightening later.
Test the silent failure. We now have integration tests that verify: without
set_config, a query on a FORCE table returns zero rows. This catches the exact bug class described above.
it("Without set_config, non-superuser sees no data", async () => {
const result = await appPrisma.$transaction(async (tx) => {
// No set_config call — should see nothing
return tx.memo.findMany();
});
expect(result).toHaveLength(0);
});
-
Document the NO FORCE exceptions. Every
NO FORCEtable should have a comment explaining why. Future developers (or future AI agents writing your code) need to know the reasoning.
Five Takeaways
- RLS is your second line of defense, not a replacement for application auth. Both layers should exist.
-
FORCE ROW LEVEL SECURITYis the critical setting. Without it, your table owner bypasses all policies — which is most ORM connections. - Silent zero-row updates are the real danger. RLS doesn't throw errors; it just filters. Test for this explicitly.
-
Pre-auth tables need
NO FORCE. Any table accessed before you know the user_id must opt out of FORCE. - Separate roles for app vs batch. Don't hack around RLS in batch jobs — give them a role that bypasses it cleanly.
Try It
Nokos is protected by this exact RLS setup across 20+ tables. Free plan available — your data is isolated at the database level, not just the application level.
Have you implemented RLS in your project? What patterns worked for you?
This is article 3 in my series about building a SaaS with AI. Article 1: Zero Lines of Code. Article 2: Claude + Gemini Cost Split.
Launching on Product Hunt March 31st — follow for updates!
Top comments (0)