A deep dive into database migration patterns, normalization strategies, and leveraging Supabase's powerful features when moving from document stores to a modern PostgreSQL platform.
TL;DR
We migrated a production system from MongoDB to Supabase:
| Metric | Value |
|---|---|
| Collections Migrated | 65 |
| Tables Created | 282 |
| Rows Migrated | ~80,000 |
| Foreign Keys | 217 |
| Indexes | 491 |
| RLS Policies | 156 |
| Migration Approach | Zero downtime |
This guide covers the complete journey—from understanding why we chose Supabase, through the technical implementation, to leveraging Supabase's unique features post-migration.
Table of Contents
- Why We Chose Supabase
- The Challenge: Document vs Relational Paradigm
- Architecture Overview
- Normalization Strategies
- Data Type Mapping
- The Transformation Engine
- Handling Complex Patterns
- Row Level Security for Multi-Tenancy
- Indexing Strategy
- Foreign Key Strategy
- Leveraging Supabase Features
- Error Handling & Data Quality
- Performance Optimizations
- Lessons Learned
- Conclusion
Why We Chose Supabase
The Business Case
After months of running MongoDB in production, several pain points emerged. We evaluated multiple options and chose Supabase for these reasons:
1. ACID Compliance with Developer Experience
Supabase gives us PostgreSQL's rock-solid transactional guarantees wrapped in a modern developer experience. The dashboard, client libraries, and tooling made the transition smoother than raw PostgreSQL.
2. Built-in Row Level Security (RLS)
Our multi-tenant SaaS needed tenant isolation. MongoDB required application-level enforcement everywhere. Supabase's RLS policies enforce isolation at the database level—impossible to bypass.
-- One policy, complete tenant isolation
CREATE POLICY tenant_isolation ON entities
USING (organization_id = (auth.jwt()->>'org_id')::uuid);
3. Realtime Subscriptions
We had a custom change stream implementation on MongoDB. Supabase Realtime gave us this out of the box with WebSocket support and client library integration.
4. Complex Reporting Needs
Our analytics team struggled with MongoDB aggregations. Reports that are a few lines of standard SQL in PostgreSQL required complex multi-stage aggregation pipelines in MongoDB. With Supabase, we can write plain SQL and even use the built-in pg_graphql extension.
5. Integrated Auth
Supabase Auth integrates seamlessly with RLS. User context flows from authentication through to database policies without custom middleware.
6. Cost Optimization
MongoDB's memory-mapped storage consumed significantly more RAM. Supabase's pricing model (based on PostgreSQL) was more predictable and cost-effective for our workload.
7. Edge Functions & Storage
Moving to Supabase meant we could consolidate our file storage (previously S3) and serverless functions into one platform.
The Challenge: Document vs Relational Paradigm
Before diving into implementation, it's crucial to understand the fundamental paradigm shift we're dealing with.
MongoDB's Document Model
A typical MongoDB document looks like this:
{
"_id": "507f1f77bcf86cd799439011",
"name": "Enterprise Account",
"contacts": [
{ "email": "alice@example.com", "role": "admin" },
{ "email": "bob@example.com", "role": "viewer" }
],
"settings": {
"theme": "dark",
"notifications": {
"email": true,
"sms": false
}
},
"tags": ["premium", "enterprise", "active"],
"customFields": {
"industry": "Technology",
"employeeCount": 500
}
}
This single document contains:
- Nested objects (settings, notifications)
- Arrays of objects (contacts)
- Simple arrays (tags)
- Dynamic fields (customFields with arbitrary keys)
Supabase's Relational Model
The same data in Supabase requires multiple tables:
-- Main table
CREATE TABLE accounts (
id TEXT PRIMARY KEY,
organization_id UUID NOT NULL REFERENCES organizations(id),
name TEXT NOT NULL,
created_at TIMESTAMPTZ DEFAULT NOW(),
updated_at TIMESTAMPTZ DEFAULT NOW()
);
-- Enable RLS
ALTER TABLE accounts ENABLE ROW LEVEL SECURITY;
-- 1:1 relationship for nested object
CREATE TABLE account_settings (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
account_id TEXT REFERENCES accounts(id) ON DELETE CASCADE,
theme TEXT DEFAULT 'light',
UNIQUE(account_id)
);
-- Deeper nesting
CREATE TABLE account_notification_settings (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
settings_id UUID REFERENCES account_settings(id) ON DELETE CASCADE,
email BOOLEAN DEFAULT true,
sms BOOLEAN DEFAULT false,
UNIQUE(settings_id)
);
-- Array of objects
CREATE TABLE account_contacts (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
account_id TEXT REFERENCES accounts(id) ON DELETE CASCADE,
email TEXT NOT NULL,
role TEXT,
ordinal INTEGER NOT NULL -- Preserves array order
);
-- Simple array
CREATE TABLE account_tags (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
account_id TEXT REFERENCES accounts(id) ON DELETE CASCADE,
tag TEXT NOT NULL,
ordinal INTEGER NOT NULL
);
-- Dynamic fields (key-value pattern)
CREATE TABLE account_custom_fields (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
account_id TEXT REFERENCES accounts(id) ON DELETE CASCADE,
field_key TEXT NOT NULL,
field_value TEXT,
value_type TEXT NOT NULL -- For type preservation
);
One document became six tables. Multiply that pattern across the whole schema and you can see how 65 collections became 282 tables.
Architecture Overview
Our migration architecture followed a three-phase approach:
Phase 1: Schema Creation
We used Supabase CLI migrations to manage our schema:
# Create a new migration
supabase migration new create_entities_table
# Apply migrations
supabase db push
Example migration file (supabase/migrations/01_entities.sql):
-- Create main table
CREATE TABLE IF NOT EXISTS entities (
id TEXT PRIMARY KEY,
organization_id UUID NOT NULL,
name TEXT,
status TEXT DEFAULT 'active',
created_at TIMESTAMPTZ DEFAULT NOW(),
updated_at TIMESTAMPTZ DEFAULT NOW()
);
-- Enable RLS (required for Supabase security model)
ALTER TABLE entities ENABLE ROW LEVEL SECURITY;
-- Create child table for metadata
CREATE TABLE IF NOT EXISTS entity_metadata (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
entity_id TEXT REFERENCES entities(id) ON DELETE CASCADE,
key TEXT NOT NULL,
value TEXT,
ordinal INTEGER DEFAULT 0
);
ALTER TABLE entity_metadata ENABLE ROW LEVEL SECURITY;
Phase 2: Data Migration
The bulk data migration reads from BSON backup files (not live MongoDB) to avoid impacting production, and writes to Supabase over its direct PostgreSQL connection string.
Phase 3: Constraints & Security
Foreign keys, indexes, and RLS policies are created after all data is loaded. This is critical for performance.
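A condensed sketch of that post-load ordering (constraint and policy names here are illustrative, not the exact ones we used):
-- 1. Foreign keys, after orphan cleanup
ALTER TABLE entity_metadata
  ADD CONSTRAINT fk_entity_metadata_entity
  FOREIGN KEY (entity_id) REFERENCES entities(id) ON DELETE CASCADE;
-- 2. Indexes, including the columns RLS will filter on
CREATE INDEX IF NOT EXISTS idx_entities_org ON entities(organization_id);
-- 3. RLS policies last, once the supporting indexes exist
CREATE POLICY tenant_isolation ON entities
  FOR ALL
  USING (organization_id = (auth.jwt()->>'org_id')::uuid);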
Normalization Strategies
This section is the heart of the migration. Understanding these patterns will help you tackle any document-to-relational transformation.
First Normal Form: Eliminating Arrays
The Problem: MongoDB arrays have no place in a normalized relational schema. PostgreSQL does offer array types, but they break first normal form and complicate joins and constraints.
The Solution: Create child tables with an ordinal column to preserve order.
Before (MongoDB):
{
"_id": "123",
"name": "Widget",
"categories": ["electronics", "gadgets", "sale"]
}
After (Supabase):
-- products table
| id | name |
|-----|--------|
| 123 | Widget |
-- product_categories table
| id | product_id | category | ordinal |
|--------------------------------------|------------|-------------|---------|
| a1b2c3d4-e5f6-7890-abcd-ef1234567890 | 123 | electronics | 0 |
| b2c3d4e5-f6a7-8901-bcde-f12345678901 | 123 | gadgets | 1 |
| c3d4e5f6-a7b8-9012-cdef-123456789012 | 123 | sale | 2 |
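A minimal DDL sketch for the child table behind those rows (table and column names simply follow the example above):
-- Child table for the categories array
CREATE TABLE product_categories (
  id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  product_id TEXT REFERENCES products(id) ON DELETE CASCADE,
  category TEXT NOT NULL,
  ordinal INTEGER NOT NULL -- Preserves the original array order
);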
Transformation Code Pattern:
def transform_array_field(parent_id: str, array: list, field_name: str) -> list:
"""Transform a MongoDB array into rows for a Supabase child table."""
rows = []
for ordinal, value in enumerate(array):
rows.append({
"parent_id": parent_id,
field_name: value,
"ordinal": ordinal
})
return rows
Second Normal Form: Extracting Nested Objects
The Problem: Embedded documents in MongoDB need their own tables.
The Solution: Create child tables with foreign keys. Use UNIQUE constraints for 1:1 relationships.
Naming Convention: parent_nested_field (e.g., user_profile, order_shipping_address)
Before (MongoDB):
{
"_id": "user_456",
"email": "user@example.com",
"profile": {
"bio": "Software engineer",
"avatar": "https://example.com/avatar.jpg",
"social": {
"twitter": "@user",
"github": "user"
}
}
}
After (Supabase):
-- users table
| id | email |
|----------|-------------------|
| user_456 | user@example.com |
-- user_profiles table (1:1 with UNIQUE constraint)
| id | user_id | bio | avatar |
|-----|----------|-------------------|-------------------------------|
| uuid| user_456 | Software engineer | https://example.com/avatar.jpg|
-- user_profile_social table (1:1 with profile)
| id | profile_id | twitter | github |
|-----|------------|---------|--------|
| uuid| uuid | @user | user |
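To reassemble the original document shape, the joins look roughly like this (a sketch assuming the table and column names above):
SELECT
  u.id,
  u.email,
  p.bio,
  p.avatar,
  s.twitter,
  s.github
FROM users u
LEFT JOIN user_profiles p ON p.user_id = u.id            -- 1:1 via UNIQUE(user_id)
LEFT JOIN user_profile_social s ON s.profile_id = p.id   -- 1:1 via UNIQUE(profile_id)
WHERE u.id = 'user_456';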
Third Normal Form: Transitive Dependencies
The Problem: Multi-level nesting creates transitive dependencies.
The Solution: Each level gets its own table with references only to its immediate parent.
Example: 4-Level Deep Structure
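The concrete document isn't reproduced here, but the resulting table chain looks like this hypothetical four-level sketch, where each table references only its immediate parent:
CREATE TABLE orders (
  id TEXT PRIMARY KEY
  -- level 1 scalar columns omitted
);
CREATE TABLE order_shipping (
  id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  order_id TEXT REFERENCES orders(id) ON DELETE CASCADE,
  UNIQUE(order_id) -- level 2, 1:1 with its parent
);
CREATE TABLE order_shipping_address (
  id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  shipping_id UUID REFERENCES order_shipping(id) ON DELETE CASCADE,
  UNIQUE(shipping_id) -- level 3
);
CREATE TABLE order_shipping_address_geo (
  id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  address_id UUID REFERENCES order_shipping_address(id) ON DELETE CASCADE,
  UNIQUE(address_id) -- level 4
);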
Transformation Pattern:
def flatten_nested_structure(doc: dict, table_name: str, parent_id: str = None, depth: int = 0) -> dict:
    """Recursively flatten nested structures into separate table rows."""
    results = {}
    # Extract scalar fields for the current level
    current_row = {k: v for k, v in doc.items()
                   if not isinstance(v, (dict, list))}
    if parent_id:
        current_row["parent_id"] = parent_id
    # Children reference this row's id (fall back to the Mongo _id)
    row_id = current_row.get("id") or str(doc.get("_id", ""))
    results[table_name] = [current_row]
    # Process nested objects
    for key, value in doc.items():
        if isinstance(value, dict):
            child_table = f"{table_name}_{key}"
            child_results = flatten_nested_structure(
                value,
                child_table,
                parent_id=row_id,
                depth=depth + 1
            )
            for name, rows in child_results.items():
                results.setdefault(name, []).extend(rows)
        elif isinstance(value, list):
            # Handle arrays (covered in 1NF section)
            pass
    return results
Handling Dynamic/Schemaless Data
The Problem: MongoDB allows arbitrary fields. Supabase requires defined columns.
The Solution: Key-value pattern with type discriminators, or JSONB columns for query-heavy dynamic data.
Option 1: Key-Value Table (Better for simple lookups)
CREATE TABLE entity_custom_fields (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
entity_id TEXT REFERENCES entities(id) ON DELETE CASCADE,
field_path TEXT NOT NULL, -- Dot notation for nested: "address.city"
field_key TEXT NOT NULL, -- Final key name
field_value TEXT, -- Stored as text
value_type TEXT NOT NULL, -- 'string', 'number', 'boolean', 'null'
ordinal INTEGER DEFAULT 0
);
-- Index for efficient lookups
CREATE INDEX idx_entity_custom_fields_lookup
ON entity_custom_fields(entity_id, field_key);
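A hedged query sketch against the key-value table, using the value_type discriminator so the cast is safe:
SELECT e.*
FROM entities e
JOIN entity_custom_fields f ON f.entity_id = e.id
WHERE f.field_key = 'employeeCount'
  AND f.value_type = 'number'
  AND f.field_value::numeric > 100;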
Option 2: JSONB Column (Better for complex queries)
-- Add JSONB column for flexible querying
ALTER TABLE entities ADD COLUMN custom_data JSONB DEFAULT '{}';
-- Create GIN index for JSONB queries
CREATE INDEX idx_entities_custom_data ON entities USING GIN(custom_data);
-- Query example
SELECT * FROM entities
WHERE custom_data->>'industry' = 'Technology'
AND (custom_data->>'employeeCount')::int > 100;
Transformation Code:
def transform_dynamic_fields(
parent_id: str,
fields: dict,
path_prefix: str = ""
) -> list:
"""Transform dynamic fields into key-value rows with type preservation."""
rows = []
ordinal = 0
for key, value in fields.items():
current_path = f"{path_prefix}.{key}" if path_prefix else key
if isinstance(value, dict):
# Recurse into nested objects
rows.extend(transform_dynamic_fields(parent_id, value, current_path))
else:
# Determine type
if value is None:
value_type = "null"
str_value = None
elif isinstance(value, bool):
value_type = "boolean"
str_value = str(value).lower()
elif isinstance(value, (int, float)):
value_type = "number"
str_value = str(value)
else:
value_type = "string"
str_value = str(value)
rows.append({
"entity_id": parent_id,
"field_path": current_path,
"field_key": key,
"field_value": str_value,
"value_type": value_type,
"ordinal": ordinal
})
ordinal += 1
return rows
Data Type Mapping
Understanding type conversions is critical. Here's our complete mapping for Supabase:
| MongoDB Type | Supabase Type | Conversion Notes |
|---|---|---|
| ObjectId | TEXT | Hex string (24 chars). Consider UUID for new IDs |
| String | TEXT | Direct mapping |
| Int32 | INTEGER | Direct mapping |
| Int64 / Long | BIGINT | Direct mapping |
| Double | DOUBLE PRECISION | Direct mapping |
| Boolean | BOOLEAN | Direct mapping |
| Date | TIMESTAMPTZ | Always timezone-aware in Supabase |
| Null | NULL | Handle explicitly in code |
| Array | Child Table | With ordinal column |
| Embedded Document | Child Table | FK with CASCADE |
| Binary | BYTEA or Supabase Storage | Large files → Storage |
| Decimal128 | NUMERIC | For precise decimals |
| Mixed/Dynamic | JSONB or KV Table | JSONB preferred for queries |
Special Handling: IDs
Supabase prefers UUIDs for new records. For migrated data, we preserve MongoDB ObjectIds as TEXT:
def convert_object_id(oid) -> str:
"""Convert MongoDB ObjectId to string for Supabase."""
if oid is None:
return None
return str(oid) # Returns 24-character hex string
For new tables or columns, use Supabase's UUID generation:
CREATE TABLE new_entities (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
legacy_mongo_id TEXT UNIQUE, -- Preserve old ID for reference
-- ...
);
Special Handling: Dates
Supabase uses TIMESTAMPTZ which is timezone-aware:
from datetime import datetime, timezone
def convert_date(mongo_date) -> datetime:
"""Convert MongoDB date to Supabase timestamptz."""
if mongo_date is None:
return None
if isinstance(mongo_date, datetime):
# Ensure timezone awareness
if mongo_date.tzinfo is None:
return mongo_date.replace(tzinfo=timezone.utc)
return mongo_date
# Handle milliseconds since epoch
return datetime.fromtimestamp(mongo_date / 1000, tz=timezone.utc)
Special Handling: Binary/Files
Large binary data should go to Supabase Storage instead of the database:
from supabase import create_client
supabase = create_client(SUPABASE_URL, SUPABASE_KEY)
def migrate_file(file_data: bytes, filename: str, bucket: str = "migrated-files"):
"""Upload binary data to Supabase Storage."""
path = f"legacy/{filename}"
result = supabase.storage.from_(bucket).upload(path, file_data)
# Return the public URL or path for database reference
return supabase.storage.from_(bucket).get_public_url(path)
The Transformation Engine
BSON Reader Layer
We read directly from BSON backup files rather than querying live MongoDB:
BSON Reading Implementation:
import struct
import bson
def read_bson_file(filepath: str):
"""Generator that yields documents from a BSON file."""
with open(filepath, 'rb') as f:
while True:
# Read 4-byte document length (little-endian)
length_bytes = f.read(4)
if len(length_bytes) < 4:
break # EOF
doc_length = struct.unpack('<i', length_bytes)[0]
# Read rest of document (length includes the 4 bytes we already read)
doc_body = f.read(doc_length - 4)
if len(doc_body) < doc_length - 4:
raise ValueError("Unexpected EOF in BSON file")
# Parse BSON document
full_doc = length_bytes + doc_body
yield bson.decode(full_doc)
Connecting to Supabase for Migration
For bulk migration, connect directly to Supabase's PostgreSQL:
import psycopg2
from urllib.parse import urlparse
# Use the direct database connection string from Supabase dashboard
# Settings > Database > Connection string > URI
DATABASE_URL = "postgresql://postgres.[project-ref]:[password]@aws-0-[region].pooler.supabase.com:6543/postgres"
def get_supabase_connection():
"""Get direct PostgreSQL connection to Supabase."""
return psycopg2.connect(DATABASE_URL)
Recursive Flattening Algorithm
The core transformation logic handles arbitrary nesting depth:
from dataclasses import dataclass, field
from typing import Any
import re
@dataclass
class FlattenResult:
"""Result of flattening a document."""
main_row: dict
child_tables: dict = field(default_factory=dict)
def flatten_document(
doc: dict,
table_name: str,
parent_id: str = None,
parent_key: str = None,
max_depth: int = 10,
current_depth: int = 0
) -> FlattenResult:
"""
Recursively flatten a MongoDB document into Supabase table rows.
Args:
doc: The document to flatten
table_name: Name of the current table
parent_id: ID of the parent row (for child tables)
parent_key: Foreign key column name
max_depth: Maximum nesting depth (prevents infinite recursion)
current_depth: Current recursion depth
Returns:
FlattenResult with main row and child table data
"""
if current_depth > max_depth:
raise ValueError(f"Max nesting depth exceeded: {max_depth}")
result = FlattenResult(main_row={}, child_tables={})
# Handle parent reference
if parent_id and parent_key:
result.main_row[parent_key] = parent_id
# Get document ID
doc_id = str(doc.get('_id', ''))
for key, value in doc.items():
if key == '_id':
result.main_row['id'] = str(value)
continue
if value is None:
result.main_row[to_snake_case(key)] = None
elif isinstance(value, dict):
# Nested object -> child table
child_table = f"{table_name}_{to_snake_case(key)}"
child_result = flatten_document(
value,
child_table,
parent_id=doc_id,
parent_key=f"{table_name}_id",
max_depth=max_depth,
current_depth=current_depth + 1
)
result.child_tables[child_table] = [child_result.main_row]
# Merge nested child tables
for ct_name, ct_rows in child_result.child_tables.items():
result.child_tables.setdefault(ct_name, []).extend(ct_rows)
elif isinstance(value, list):
# Array -> child table with ordinal
child_table = f"{table_name}_{to_snake_case(key)}"
result.child_tables[child_table] = []
for ordinal, item in enumerate(value):
if isinstance(item, dict):
child_result = flatten_document(
item,
child_table,
parent_id=doc_id,
parent_key=f"{table_name}_id",
max_depth=max_depth,
current_depth=current_depth + 1
)
child_result.main_row['ordinal'] = ordinal
result.child_tables[child_table].append(child_result.main_row)
# Merge nested child tables
for ct_name, ct_rows in child_result.child_tables.items():
result.child_tables.setdefault(ct_name, []).extend(ct_rows)
else:
# Simple value array
result.child_tables[child_table].append({
f"{table_name}_id": doc_id,
"value": convert_value(item),
"ordinal": ordinal
})
else:
# Scalar value
result.main_row[to_snake_case(key)] = convert_value(value)
return result
def to_snake_case(name: str) -> str:
"""Convert camelCase to snake_case."""
s1 = re.sub('(.)([A-Z][a-z]+)', r'\1_\2', name)
return re.sub('([a-z0-9])([A-Z])', r'\1_\2', s1).lower()
def convert_value(value: Any) -> Any:
"""Convert MongoDB values to Supabase-compatible values."""
from bson import ObjectId
from datetime import datetime
if isinstance(value, ObjectId):
return str(value)
elif isinstance(value, datetime):
return value # psycopg2 handles datetime
elif isinstance(value, bytes):
return value # BYTEA
elif isinstance(value, bool):
return value
elif isinstance(value, (int, float)):
return value
else:
return str(value) if value is not None else None
Batch Processing Strategy
Inserting one row at a time is catastrophically slow. We batch inserts with configurable thresholds:
Batch Insert Implementation:
from typing import List
from threading import Lock
class BatchInserter:
"""Efficient batch insert manager for Supabase PostgreSQL."""
def __init__(
self,
connection,
table_name: str,
columns: List[str],
batch_size: int = 1000,
on_conflict: str = "DO NOTHING"
):
self.conn = connection
self.table = table_name
self.columns = columns
self.batch_size = batch_size
self.on_conflict = on_conflict
self.buffer: List[tuple] = []
self.total_inserted = 0
self._lock = Lock()
def add(self, row: dict):
"""Add a row to the buffer, flushing if threshold reached."""
values = tuple(row.get(col) for col in self.columns)
self.buffer.append(values)
if len(self.buffer) >= self.batch_size:
self.flush()
def flush(self):
"""Execute batch insert for buffered rows."""
if not self.buffer:
return
with self._lock:
placeholders = ', '.join(['%s'] * len(self.columns))
columns_str = ', '.join(self.columns)
query = f"""
INSERT INTO {self.table} ({columns_str})
VALUES ({placeholders})
ON CONFLICT {self.on_conflict}
"""
with self.conn.cursor() as cur:
cur.executemany(query, self.buffer)
self.total_inserted += len(self.buffer)
self.conn.commit()
self.buffer.clear()
def __enter__(self):
return self
def __exit__(self, exc_type, exc_val, exc_tb):
self.flush() # Flush remaining on context exit
Handling Complex Patterns
Graph Structures (Workflow Nodes/Edges)
Many applications store graph data in documents; workflows with nodes and edges are a common example.
Supabase Schema:
CREATE TABLE workflows (
id TEXT PRIMARY KEY,
organization_id UUID NOT NULL,
name TEXT,
created_at TIMESTAMPTZ DEFAULT NOW()
);
ALTER TABLE workflows ENABLE ROW LEVEL SECURITY;
CREATE TABLE workflow_nodes (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
workflow_id TEXT REFERENCES workflows(id) ON DELETE CASCADE,
node_type TEXT,
position_x DOUBLE PRECISION,
position_y DOUBLE PRECISION,
label TEXT,
config JSONB DEFAULT '{}', -- Flexible node configuration
ordinal INTEGER
);
ALTER TABLE workflow_nodes ENABLE ROW LEVEL SECURITY;
CREATE TABLE workflow_edges (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
workflow_id TEXT REFERENCES workflows(id) ON DELETE CASCADE,
source_node_id TEXT, -- References node by original ID
target_node_id TEXT,
edge_type TEXT,
ordinal INTEGER
);
ALTER TABLE workflow_edges ENABLE ROW LEVEL SECURITY;
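Reassembling a workflow for the UI is then two straightforward queries (a sketch assuming the schema above; 'wf_123' is a placeholder ID):
-- Nodes in their original order
SELECT id, node_type, label, position_x, position_y, config
FROM workflow_nodes
WHERE workflow_id = 'wf_123'
ORDER BY ordinal;
-- Edges connecting those nodes (by the node IDs preserved from the source documents)
SELECT source_node_id, target_node_id, edge_type
FROM workflow_edges
WHERE workflow_id = 'wf_123'
ORDER BY ordinal;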
Multi-Tenant Architecture
Every table includes organization scoping. This is where Supabase's RLS shines (covered in next section):
-- Every table follows this pattern
CREATE TABLE entities (
id TEXT PRIMARY KEY,
organization_id UUID NOT NULL, -- Tenant scoping
-- ... other columns
);
-- RLS policy enforces tenant isolation
CREATE POLICY tenant_isolation ON entities
FOR ALL
USING (organization_id = (auth.jwt()->>'org_id')::uuid);
Audit Trail Preservation
Audit logs work beautifully with Supabase's JSONB support:
CREATE TABLE audit_logs (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
organization_id UUID NOT NULL,
resource_type TEXT NOT NULL,
resource_id TEXT NOT NULL,
action TEXT NOT NULL, -- 'create', 'update', 'delete'
actor_id UUID REFERENCES auth.users(id),
actor_type TEXT, -- 'user', 'system', 'api'
timestamp TIMESTAMPTZ DEFAULT NOW(),
changes JSONB, -- Before/after diff
metadata JSONB -- Additional context
);
ALTER TABLE audit_logs ENABLE ROW LEVEL SECURITY;
-- Efficient queries
CREATE INDEX idx_audit_resource ON audit_logs(resource_type, resource_id);
CREATE INDEX idx_audit_org_time ON audit_logs(organization_id, timestamp DESC);
CREATE INDEX idx_audit_changes ON audit_logs USING GIN(changes);
Row Level Security for Multi-Tenancy
This is one of Supabase's killer features. Instead of filtering in every query, we enforce isolation at the database level.
Basic Tenant Isolation Pattern
-- Enable RLS on the table
ALTER TABLE entities ENABLE ROW LEVEL SECURITY;
-- Create policy using JWT claims
CREATE POLICY tenant_isolation ON entities
FOR ALL
USING (
organization_id = (
SELECT (auth.jwt()->>'org_id')::uuid
)
);
Policy Patterns by Operation
-- SELECT: Users see only their organization's data
CREATE POLICY select_own_org ON entities
FOR SELECT
USING (organization_id = (auth.jwt()->>'org_id')::uuid);
-- INSERT: New records must belong to the user's organization
CREATE POLICY insert_own_org ON entities
FOR INSERT
WITH CHECK (organization_id = (auth.jwt()->>'org_id')::uuid);
-- UPDATE: Can only update own organization's records
CREATE POLICY update_own_org ON entities
FOR UPDATE
USING (organization_id = (auth.jwt()->>'org_id')::uuid)
WITH CHECK (organization_id = (auth.jwt()->>'org_id')::uuid);
-- DELETE: Can only delete own organization's records
CREATE POLICY delete_own_org ON entities
FOR DELETE
USING (organization_id = (auth.jwt()->>'org_id')::uuid);
Role-Based Access Within Tenants
-- Admins can do everything within their org
CREATE POLICY admin_full_access ON entities
FOR ALL
USING (
organization_id = (auth.jwt()->>'org_id')::uuid
AND auth.jwt()->>'role' = 'admin'
);
-- Regular users can only view
CREATE POLICY user_read_only ON entities
FOR SELECT
USING (
organization_id = (auth.jwt()->>'org_id')::uuid
AND auth.jwt()->>'role' = 'user'
);
Service Role Bypass for Migration
During migration, use the service role to bypass RLS:
# For migration: use service role key (bypasses RLS)
# This is the "service_role" key from Supabase dashboard
SUPABASE_SERVICE_KEY = "eyJ..."
# Service role connection bypasses RLS
# Use direct database connection with service credentials
Post-Migration: Verify RLS
After migration, verify RLS is working:
-- Check which policies exist
SELECT tablename, policyname, cmd, qual
FROM pg_policies
WHERE schemaname = 'public';
-- Test as a specific user (in Supabase SQL editor)
SET request.jwt.claims = '{"org_id": "org_123", "role": "user"}';
SELECT * FROM entities; -- Should only see org_123's data
Indexing Strategy
Proper indexing is critical for Supabase performance:
Index Categories
| Category | Example | Index Type | Notes |
|---|---|---|---|
| Primary Key | id | B-Tree (automatic) | Use UUID for new tables |
| Foreign Key | parent_id | B-Tree | Critical for JOINs |
| Tenant Scope | organization_id | B-Tree | Required for RLS performance |
| Status | status | B-Tree | Often combined with org |
| Temporal | created_at | B-Tree DESC | For recent-first queries |
| JSONB | custom_data | GIN | For containment queries |
| Text Search | name | GIN (tsvector) | For full-text search |
Index Creation Script Pattern
-- Foreign key indexes (critical for JOIN performance)
CREATE INDEX IF NOT EXISTS idx_entity_items_entity_id
ON entity_items(entity_id);
-- Tenant-scoped queries (CRITICAL for RLS performance)
CREATE INDEX IF NOT EXISTS idx_entities_org
ON entities(organization_id);
-- Combined index for common query patterns
CREATE INDEX IF NOT EXISTS idx_entities_org_status
ON entities(organization_id, status);
-- Temporal queries
CREATE INDEX IF NOT EXISTS idx_entities_created
ON entities(created_at DESC);
-- JSONB containment queries
CREATE INDEX IF NOT EXISTS idx_entities_custom_data
ON entities USING GIN(custom_data);
-- JSONB specific key queries (expression index)
CREATE INDEX IF NOT EXISTS idx_entities_custom_industry
ON entities((custom_data->>'industry'));
Index Count by Category
Our final index breakdown:
| Category | Count | Percentage |
|---|---|---|
| Foreign Key | 217 | 44% |
| Tenant Scope | 89 | 18% |
| Status/Type | 72 | 15% |
| Temporal | 58 | 12% |
| Composite | 41 | 8% |
| JSONB/GIN | 14 | 3% |
| Total | 491 | 100% |
Foreign Key Strategy
Phased Constraint Creation
Critical insight: Creating foreign keys during data load is extremely slow. Each INSERT triggers constraint validation.
Our approach:
- Create tables without foreign keys
- Migrate all data (using service role)
- Clean orphaned references
- Add foreign keys
- Create indexes
- Enable and create RLS policies
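For very large tables, one option for the "Add foreign keys" step is PostgreSQL's NOT VALID / VALIDATE CONSTRAINT pattern, which adds the constraint without scanning existing rows up front (a sketch with hypothetical names, run after orphan cleanup):
-- Add the constraint without validating existing rows (fast, brief lock)
ALTER TABLE entity_items
  ADD CONSTRAINT fk_entity_items_entity
  FOREIGN KEY (entity_id) REFERENCES entities(id)
  ON DELETE CASCADE
  NOT VALID;
-- Validate in a separate step; this scans the table but uses a weaker lock
ALTER TABLE entity_items
  VALIDATE CONSTRAINT fk_entity_items_entity;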
Orphan Cleanup
Before adding foreign keys, clean up invalid references:
-- Find orphans: child references non-existent parent
SELECT child.id, child.parent_id
FROM child_table child
LEFT JOIN parent_table parent ON child.parent_id = parent.id
WHERE child.parent_id IS NOT NULL
AND parent.id IS NULL;
-- Clean orphans: set to NULL
UPDATE child_table child
SET parent_id = NULL
WHERE parent_id IS NOT NULL
AND NOT EXISTS (
SELECT 1 FROM parent_table
WHERE id = child.parent_id
);
Cascade Strategy Decision Matrix
| Relationship | ON DELETE | ON UPDATE | Reasoning |
|---|---|---|---|
| Strong ownership | CASCADE | CASCADE | Child meaningless without parent |
| Soft reference | SET NULL | CASCADE | Preserve child, clear reference |
| Lookup reference | RESTRICT | CASCADE | Prevent invalid state |
| Audit/history | NO ACTION | CASCADE | Never delete audit records |
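A sketch of how these choices look in DDL (table and constraint names are illustrative):
-- Strong ownership: items are meaningless without their parent entity
ALTER TABLE entity_items
  ADD CONSTRAINT fk_entity_items_entity
  FOREIGN KEY (entity_id) REFERENCES entities(id)
  ON DELETE CASCADE ON UPDATE CASCADE;
-- Soft reference: keep the task, just clear the assignee
ALTER TABLE tasks
  ADD CONSTRAINT fk_tasks_assignee
  FOREIGN KEY (assignee_id) REFERENCES members(id)
  ON DELETE SET NULL ON UPDATE CASCADE;
-- Lookup reference: block deleting a status that is still in use
ALTER TABLE entities
  ADD CONSTRAINT fk_entities_status
  FOREIGN KEY (status_id) REFERENCES statuses(id)
  ON DELETE RESTRICT ON UPDATE CASCADE;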
Leveraging Supabase Features
After migration, you can take advantage of Supabase's full feature set.
Realtime Subscriptions
Replace MongoDB change streams with Supabase Realtime:
import { createClient } from '@supabase/supabase-js'
const supabase = createClient(SUPABASE_URL, SUPABASE_ANON_KEY)
// Subscribe to changes on a table
const subscription = supabase
.channel('entities-changes')
.on(
'postgres_changes',
{
event: '*', // INSERT, UPDATE, DELETE
schema: 'public',
table: 'entities',
filter: `organization_id=eq.${orgId}`
},
(payload) => {
console.log('Change received:', payload)
// Handle the change
}
)
.subscribe()
Enable Realtime for Tables
-- Enable realtime for specific tables
ALTER PUBLICATION supabase_realtime ADD TABLE entities;
ALTER PUBLICATION supabase_realtime ADD TABLE entity_items;
Supabase Client for Application Code
Replace MongoDB driver with Supabase client:
// Before: MongoDB
const entities = await db.collection('entities')
.find({ organizationId: orgId, status: 'active' })
.toArray()
// After: Supabase (RLS handles org filtering automatically!)
const { data: entities, error } = await supabase
.from('entities')
.select('*')
.eq('status', 'active')
Edge Functions for Complex Operations
Replace MongoDB aggregations with Edge Functions:
// supabase/functions/calculate-metrics/index.ts
import { serve } from 'https://deno.land/std@0.168.0/http/server.ts'
import { createClient } from 'https://esm.sh/@supabase/supabase-js@2'
serve(async (req) => {
const supabase = createClient(
Deno.env.get('SUPABASE_URL')!,
Deno.env.get('SUPABASE_SERVICE_ROLE_KEY')!
)
// Complex aggregation that was painful in MongoDB
const { data, error } = await supabase.rpc('calculate_entity_metrics')
return new Response(JSON.stringify(data), {
headers: { 'Content-Type': 'application/json' }
})
})
Database Functions for Complex Queries
-- Create a function for complex aggregations
CREATE OR REPLACE FUNCTION calculate_entity_metrics(org_id UUID)
RETURNS TABLE (
status TEXT,
count BIGINT,
avg_items NUMERIC
) AS $$
BEGIN
RETURN QUERY
SELECT
e.status,
COUNT(*)::BIGINT,
AVG(item_count)::NUMERIC
FROM entities e
LEFT JOIN (
SELECT entity_id, COUNT(*) as item_count
FROM entity_items
GROUP BY entity_id
) items ON e.id = items.entity_id
WHERE e.organization_id = org_id
GROUP BY e.status;
END;
$$ LANGUAGE plpgsql SECURITY DEFINER;
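Calling it from SQL looks like this (the UUID is a placeholder); from the client, the same function is reachable via supabase.rpc with the org_id parameter:
SELECT status, count, avg_items
FROM calculate_entity_metrics('00000000-0000-0000-0000-000000000000'::uuid);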
Storage Integration
Move binary data to Supabase Storage:
// Upload file
const { data, error } = await supabase.storage
.from('documents')
.upload(`${orgId}/${entityId}/${filename}`, file)
// Get public URL
const { data: { publicUrl } } = supabase.storage
.from('documents')
.getPublicUrl(`${orgId}/${entityId}/${filename}`)
// Store URL in database
await supabase
.from('entity_attachments')
.insert({ entity_id: entityId, file_url: publicUrl })
Error Handling & Data Quality
Graceful Degradation
A single bad document should not stop the entire migration:
import logging
from dataclasses import dataclass
from threading import Lock
@dataclass
class MigrationStats:
"""Thread-safe migration statistics."""
documents_processed: int = 0
documents_failed: int = 0
rows_inserted: int = 0
_lock: Lock = None
def __post_init__(self):
self._lock = Lock()
def record_success(self, rows: int):
with self._lock:
self.documents_processed += 1
self.rows_inserted += rows
def record_failure(self):
with self._lock:
self.documents_processed += 1
self.documents_failed += 1
def migrate_with_error_handling(bson_path: str, db_connection) -> MigrationStats:
"""Migrate collection with graceful error handling."""
stats = MigrationStats()
logger = logging.getLogger(__name__)
for doc in read_bson_file(bson_path):
try:
result = flatten_document(doc, 'entities')
rows_inserted = insert_all_rows(db_connection, result)
stats.record_success(rows_inserted)
except Exception as e:
stats.record_failure()
doc_id = doc.get('_id', 'unknown')
logger.error(f"Failed to migrate document {doc_id}: {e}")
# Save failed documents for retry
save_failed_document(doc, str(e))
return stats
Data Validation
MongoDB allows messy data. Sanitize during transformation:
from typing import Optional
def sanitize_text(value: Optional[str]) -> Optional[str]:
"""Sanitize text for Supabase insertion."""
if value is None:
return None
# Remove null bytes (PostgreSQL doesn't allow \x00)
value = value.replace('\x00', '')
# Ensure valid UTF-8
value = value.encode('utf-8', errors='replace').decode('utf-8')
# Trim excessive whitespace
value = ' '.join(value.split())
return value if value else None
Performance Optimizations
Parallel Collection Processing
Collections are independent—process them in parallel:
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import List
def migrate_all_collections(
collection_paths: List[str],
db_connection_factory,
max_workers: int = 4
) -> dict:
"""Migrate multiple collections in parallel."""
results = {}
with ThreadPoolExecutor(max_workers=max_workers) as executor:
futures = {}
for path in collection_paths:
conn = db_connection_factory()
future = executor.submit(migrate_collection, path, conn)
futures[future] = path
for future in as_completed(futures):
path = futures[future]
try:
stats = future.result()
results[path] = stats
print(f"✓ {path}: {stats.documents_processed} docs")
except Exception as e:
print(f"✗ {path}: {e}")
results[path] = None
return results
Supabase Connection Pooling
Use Supabase's connection pooler for better performance:
# Use pooler connection string (port 6543) instead of direct (port 5432)
# Transaction mode for short queries
POOLER_URL = "postgresql://postgres.[ref]:[pass]@aws-0-[region].pooler.supabase.com:6543/postgres"
# Session mode for migrations (maintains state)
DIRECT_URL = "postgresql://postgres.[ref]:[pass]@db.[ref].supabase.co:5432/postgres"
Deferred Constraint Validation
Disable triggers during bulk load:
-- Before migration (run as superuser/service role)
SET session_replication_role = replica;
-- ... perform bulk insert ...
-- After migration
SET session_replication_role = DEFAULT;
Lessons Learned
1. Plan for Orphaned References
Problem: MongoDB doesn't enforce referential integrity. We found thousands of references to deleted documents.
Solution: Build orphan detection into your pipeline. Run cleanup before adding foreign keys.
2. RLS Performance Requires Indexes
Problem: RLS policies that filter by organization_id are slow without proper indexes.
Solution: Always create indexes on columns used in RLS policies BEFORE enabling policies.
-- Create index BEFORE policy
CREATE INDEX idx_entities_org ON entities(organization_id);
-- Then create policy
CREATE POLICY tenant_isolation ON entities
USING (organization_id = (auth.jwt()->>'org_id')::uuid);
3. Test with Production-Scale Data
Problem: A migration that works with 1,000 documents may fail with 100,000.
Solution: Test with full production data volume. Monitor memory and query performance.
4. Use Service Role for Migration
Problem: RLS policies block migration inserts if organization_id doesn't match JWT.
Solution: Use service role key during migration (bypasses RLS), then verify RLS works after.
5. Supabase CLI is Your Friend
Solution: Use Supabase CLI for:
- Schema migrations (supabase migration new)
- Local development (supabase start)
- Type generation (supabase gen types typescript)
6. Handle Time Zones Explicitly
Problem: MongoDB dates might be in different time zones.
Solution: Always use TIMESTAMPTZ and explicitly convert to UTC during migration.
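If a column accidentally ends up as a naive TIMESTAMP, it can be converted in place; a sketch assuming the stored values are already UTC:
ALTER TABLE entities
  ALTER COLUMN created_at TYPE TIMESTAMPTZ
  USING created_at AT TIME ZONE 'UTC';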
Results & Metrics
Migration Statistics
| Metric | Value |
|---|---|
| Source Collections | 65 |
| Target Tables | 282 |
| Total Rows | ~80,000 |
| Foreign Keys | 217 |
| Indexes | 491 |
| RLS Policies | 156 |
| Orphaned References Cleaned | 2,847 |
| Failed Documents | 12 |
| Success Rate | 99.98% |
Performance Comparison (Post-Migration)
| Query Type | MongoDB | Supabase | Improvement |
|---|---|---|---|
| Simple lookup | 5ms | 2ms | 60% |
| Aggregation | 250ms | 45ms | 82% |
| Join (3 tables) | N/A* | 15ms | - |
| Reporting query | 2.5s | 180ms | 93% |
| Realtime updates | Custom code | Built-in | ∞ |
*MongoDB required multiple queries and application-level joins.
Developer Experience Improvements
| Aspect | MongoDB | Supabase |
|---|---|---|
| Multi-tenant security | Application code | Database RLS |
| Realtime | Custom change streams | Built-in |
| Type safety | Manual | Generated types |
| Auth integration | Separate system | Integrated |
| Dashboard | MongoDB Compass | Supabase Studio |
Conclusion
Migrating from MongoDB to Supabase is more than a database change—it's an upgrade to a complete backend platform. The relational model brings data integrity, RLS brings security, and Supabase's ecosystem brings developer productivity.
Key Takeaways:
- Plan normalization carefully. Each nested structure needs a strategy.
- Embrace RLS. It's not just security—it simplifies your application code.
- Phase your migration. Schema → Data → Constraints → RLS policies.
- Use the full platform. Realtime, Storage, Edge Functions, and Auth are included.
- Test at scale. Performance characteristics change with data volume.
Next Steps After Migration
- Validation: Compare counts, verify data integrity, test RLS policies
- Update Application: Switch to Supabase client, remove manual tenant filtering
- Enable Realtime: Add subscriptions where needed
- Migrate Files: Move binary data to Supabase Storage
- Monitor: Use Supabase dashboard for query performance
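For the validation step above, a simple spot check is comparing per-table row counts against document and array counts from the BSON export (table names here are examples):
SELECT 'entities' AS table_name, COUNT(*) AS row_count FROM entities
UNION ALL
SELECT 'entity_metadata', COUNT(*) FROM entity_metadata
UNION ALL
SELECT 'entity_custom_fields', COUNT(*) FROM entity_custom_fields;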
This post is based on a real production migration. Numbers have been rounded, and specific table names have been generalized.
Questions about your own migration? Drop them in the comments!