Ugur Aslim

Posted on Jun 29 • Originally published at uguraslim.com

SQLAlchemy Mapper Configuration for Tenant-Scoped Relationships: Preventing Accidental Cross-Tenant Data Access in ORM Lazy-Loaded Relations

#sqlalchemy #postgres #multitenancy #security

SQLAlchemy Mapper Configuration for Tenant-Scoped Relationships: Preventing Accidental Cross-Tenant Data Access in ORM Lazy-Loaded Relations

I've shipped multi-tenant systems where a single careless .organizations lazy-load returned data from every tenant in the database. That bug lived in production for three days before a security audit caught it. The query layer had tenant filters, but the ORM relationships bypassed them entirely.

This is the post I wish I'd had then.

The Problem: Silent Data Leaks Through Lazy-Loaded Relations

Here's a dangerous pattern I see constantly:

# models.py
class User(Base):
    __tablename__ = "users"
    id: Mapped[int] = mapped_column(primary_key=True)
    tenant_id: Mapped[int] = mapped_column(ForeignKey("tenants.id"))

    # This relationship has NO tenant filtering
    organizations: Mapped[List["Organization"]] = relationship(
        "Organization",
        secondary="user_organization",
        back_populates="users"
    )

class Organization(Base):
    __tablename__ = "organizations"
    id: Mapped[int] = mapped_column(primary_key=True)
    tenant_id: Mapped[int] = mapped_column(ForeignKey("tenants.id"))
    users: Mapped[List["User"]] = relationship(
        "User",
        secondary="user_organization",
        back_populates="organizations"
    )

# endpoint.py
@app.get("/api/users/{user_id}/organizations")
async def get_user_organizations(user_id: int, request: Request):
    user = db.query(User).filter(
        User.id == user_id,
        User.tenant_id == request.state.tenant_id  # ✅ Tenant filter here
    ).first()

    # ❌ BUG: This lazy-loads WITHOUT tenant filtering
    return {"organizations": user.organizations}

Why is this dangerous? Because when you access user.organizations, SQLAlchemy generates:

SELECT * FROM organizations 
WHERE id IN (SELECT organization_id FROM user_organization WHERE user_id = ?)
-- No tenant_id filter!

If a user somehow guesses another tenant's organization ID, they see it. The problem compounds: your query layer might have exhaustive tenant checks, but lazy-loaded relationships are invisible to that logic.

Why This Happens

SQLAlchemy's relationship loading is separate from your query filters. When you define a relationship, you're telling SQLAlchemy "here's the foreign key path." You're not telling it "and by the way, always filter by tenant_id."

The ORM assumes the database schema handles isolation (which is wrong for multi-tenancy). It also assumes you're not security-conscious enough to filter at the query level and the relationship level. It's a reasonable assumption in single-tenant systems. It's a catastrophic assumption in multi-tenant ones.

The Solution: Relationship-Level Tenant Filtering

I use three tactics, in order of preference:

1. Foreign Keys with Tenant Context (Best)

If your schema supports it, make relationships foreign key on (tenant_id, entity_id):

class UserOrganization(Base):
    __tablename__ = "user_organization"
    user_id: Mapped[int] = mapped_column(ForeignKey("users.id"), primary_key=True)
    user_tenant_id: Mapped[int] = mapped_column(
        ForeignKey("users.tenant_id"), primary_key=True
    )
    organization_id: Mapped[int] = mapped_column(ForeignKey("organizations.id"), primary_key=True)
    organization_tenant_id: Mapped[int] = mapped_column(
        ForeignKey("organizations.tenant_id"), primary_key=True
    )

    # Constraint that user and org must be in same tenant
    __table_args__ = (
        CheckConstraint("user_tenant_id = organization_tenant_id"),
    )

class User(Base):
    __tablename__ = "users"
    id: Mapped[int] = mapped_column(primary_key=True)
    tenant_id: Mapped[int] = mapped_column(ForeignKey("tenants.id"), primary_key=True)

    organizations: Mapped[List["Organization"]] = relationship(
        "Organization",
        secondary="user_organization",
        foreign_keys=[UserOrganization.organization_id, UserOrganization.organization_tenant_id],
        primaryjoin="and_(User.id==UserOrganization.user_id, User.tenant_id==UserOrganization.user_tenant_id)",
        secondaryjoin="and_(Organization.id==UserOrganization.organization_id, Organization.tenant_id==UserOrganization.organization_tenant_id)"
    )

Why? The database itself enforces tenant boundaries. A cross-tenant join becomes impossible. This is bulletproof but requires schema redesign.

2. Relationship Loaders with Tenant Context (Most Practical)

Use selectinload() or contains_eager() with tenant filtering in your query:

from sqlalchemy.orm import selectinload

# In your service layer
def get_user_with_organizations(user_id: int, tenant_id: int):
    return db.query(User).options(
        selectinload(User.organizations).where(Organization.tenant_id == tenant_id)
    ).filter(
        User.id == user_id,
        User.tenant_id == tenant_id
    ).first()

Why? You control the loader strategy per query. No lazy-loading accidents. But you must remember to use it everywhere—it's not automatic.

I hate that it's not automatic. This is why I prefer approach #3:

3. Mapper Events with Automatic Tenant Filtering (What I Use in CitizenApp)

Use SQLAlchemy's before_all_gets and mapper events to intercept relationship loading:

from sqlalchemy import event
from sqlalchemy.orm import Mapper
from flask import request

# Global tenant context (you'd use a proper context manager in FastAPI)
_tenant_context = None

def set_tenant_context(tenant_id: int):
    global _tenant_context
    _tenant_context = tenant_id

@event.listens_for(Mapper, "before_all_gets")
def receive_before_all_gets(mapper_instance, clause):
    """Automatically append tenant_id filter to all loads."""
    if _tenant_context is None:
        return clause

    # Only filter if the mapped class has tenant_id
    if hasattr(mapper_instance.class_, "tenant_id"):
        return clause.where(
            mapper_instance.class_.tenant_id == _tenant_context
        )
    return clause

# In your FastAPI dependency
@app.middleware("http")
async def tenant_middleware(request: Request, call_next):
    tenant_id = extract_tenant_from_token(request)  # Your auth logic
    set_tenant_context(tenant_id)
    response = await call_next(request)
    set_tenant_context(None)
    return response

# Now this is safe—lazy-loads are filtered automatically
user = db.query(User).filter(User.id == user_id).first()
safe_orgs = user.organizations  # Filtered by tenant!

Why I prefer this: Every relationship respects tenant context by default. No selectinload() boilerplate. No accidental cross-tenant queries.

Testing for the Regression

I ship a test that would catch the original bug:

def test_lazy_loaded_relationships_respect_tenant_isolation(db, tenant_1, tenant_2):
    user_t1 = User(id=1, tenant_id=tenant_1.id)
    org_t2 = Organization(id=1, tenant_id=tenant_2.id)

    db.add_all([user_t1, org_t2])
    db.commit()

    # Manually set tenant context to tenant_1
    set_tenant_context(tenant_1.id)

    user = db.query(User).filter(User.id == 1).first()

    # This should be empty because org belongs to tenant_2
    assert user.organizations == []

This test would fail with the original unsafe code. It passes with the mapper event approach.

Gotcha: Context Propagation in Async Code

Using a global _tenant_context variable breaks with async. In FastAPI/async contexts, use contextvars:

from contextvars import ContextVar

_tenant_context: ContextVar[int | None] = ContextVar("tenant_id", default=None)

def set_tenant_context(tenant_id: int):
    _tenant_context.set(tenant_id)

@event.listens_for(Mapper, "before_all_gets")
def receive_before_all_gets(mapper_instance, clause):
    tenant_id = _tenant_context.get()
    if tenant_id is None:
        return clause
    # ...

Context vars automatically isolate across async tasks. That three-day bug? Partly because I didn't know about contextvars yet.

Final Thoughts

Lazy-loading is convenient until it's a security breach. In CitizenApp, I enforce tenant filtering at three levels: the query, the relationship loader, and the mapper. Redundancy feels paranoid until an engineer is tired at 11 PM and forgets one of the safeguards.

Make tenant isolation automatic and impossible to bypass. Your future self auditing production logs will thank you.

DEV Community

SQLAlchemy Mapper Configuration for Tenant-Scoped Relationships: Preventing Accidental Cross-Tenant Data Access in ORM Lazy-Loaded Relations

SQLAlchemy Mapper Configuration for Tenant-Scoped Relationships: Preventing Accidental Cross-Tenant Data Access in ORM Lazy-Loaded Relations

The Problem: Silent Data Leaks Through Lazy-Loaded Relations

Why This Happens

The Solution: Relationship-Level Tenant Filtering

1. Foreign Keys with Tenant Context (Best)

2. Relationship Loaders with Tenant Context (Most Practical)

3. Mapper Events with Automatic Tenant Filtering (What I Use in CitizenApp)

Testing for the Regression

Gotcha: Context Propagation in Async Code

Final Thoughts

Top comments (0)