Prince Raj

From Documents to Relations: A Complete Guide to Migrating MongoDB to Supabase at Scale

A deep dive into database migration patterns, normalization strategies, and leveraging Supabase's powerful features when moving from document stores to a modern PostgreSQL platform.


TL;DR

We migrated a production system from MongoDB to Supabase:

| Metric | Value |
|---|---|
| Collections Migrated | 65 |
| Tables Created | 282 |
| Rows Migrated | ~80,000 |
| Foreign Keys | 217 |
| Indexes | 491 |
| RLS Policies | 156 |
| Migration Approach | Zero downtime |

This guide covers the complete journey—from understanding why we chose Supabase, through the technical implementation, to leveraging Supabase's unique features post-migration.


Table of Contents

  1. Why We Chose Supabase
  2. The Challenge: Document vs Relational Paradigm
  3. Architecture Overview
  4. Normalization Strategies
  5. Data Type Mapping
  6. The Transformation Engine
  7. Handling Complex Patterns
  8. Row Level Security for Multi-Tenancy
  9. Indexing Strategy
  10. Foreign Key Strategy
  11. Leveraging Supabase Features
  12. Error Handling & Data Quality
  13. Performance Optimizations
  14. Lessons Learned
  15. Conclusion

Why We Chose Supabase

The Business Case

After months of running MongoDB in production, several pain points emerged. We evaluated multiple options and chose Supabase for these reasons:

1. ACID Compliance with Developer Experience

Supabase gives us PostgreSQL's rock-solid transactional guarantees wrapped in a modern developer experience. The dashboard, client libraries, and tooling made the transition smoother than raw PostgreSQL.

2. Built-in Row Level Security (RLS)

Our multi-tenant SaaS needed tenant isolation. MongoDB required application-level enforcement everywhere. Supabase's RLS policies enforce isolation at the database level, so application code can't silently skip a tenant filter.

-- One policy, complete tenant isolation
CREATE POLICY tenant_isolation ON entities
    USING (organization_id = (auth.jwt()->>'org_id')::uuid);

3. Realtime Subscriptions

We had a custom change stream implementation on MongoDB. Supabase Realtime gave us this out of the box with WebSocket support and client library integration.

4. Complex Reporting Needs

Our analytics team struggled with MongoDB aggregations. Queries that are a few lines of SQL in PostgreSQL required complex multi-stage pipelines in MongoDB and still took seconds. With Supabase, we can write standard SQL and even use the built-in pg_graphql extension.

5. Integrated Auth

Supabase Auth integrates seamlessly with RLS. User context flows from authentication through to database policies without custom middleware.

6. Cost Optimization

MongoDB needed to keep most of its working set in RAM, which kept pushing us toward bigger instances. Supabase's pricing model (built on PostgreSQL) was more predictable and cost-effective for our workload.

7. Edge Functions & Storage

Moving to Supabase meant we could consolidate our file storage (previously S3) and serverless functions into one platform.


The Challenge: Document vs Relational Paradigm

Before diving into implementation, it's crucial to understand the fundamental paradigm shift we're dealing with.

(Diagram: mapping a MongoDB document to relational tables)

MongoDB's Document Model

A typical MongoDB document looks like this:

{
  "_id": "507f1f77bcf86cd799439011",
  "name": "Enterprise Account",
  "contacts": [
    { "email": "alice@example.com", "role": "admin" },
    { "email": "bob@example.com", "role": "viewer" }
  ],
  "settings": {
    "theme": "dark",
    "notifications": {
      "email": true,
      "sms": false
    }
  },
  "tags": ["premium", "enterprise", "active"],
  "customFields": {
    "industry": "Technology",
    "employeeCount": 500
  }
}

This single document contains:

  • Nested objects (settings, notifications)
  • Arrays of objects (contacts)
  • Simple arrays (tags)
  • Dynamic fields (customFields with arbitrary keys)

Supabase's Relational Model

The same data in Supabase requires multiple tables:

-- Main table
CREATE TABLE accounts (
    id TEXT PRIMARY KEY,
    organization_id UUID NOT NULL REFERENCES organizations(id),
    name TEXT NOT NULL,
    created_at TIMESTAMPTZ DEFAULT NOW(),
    updated_at TIMESTAMPTZ DEFAULT NOW()
);

-- Enable RLS
ALTER TABLE accounts ENABLE ROW LEVEL SECURITY;

-- 1:1 relationship for nested object
CREATE TABLE account_settings (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    account_id TEXT REFERENCES accounts(id) ON DELETE CASCADE,
    theme TEXT DEFAULT 'light',
    UNIQUE(account_id)
);

-- Deeper nesting
CREATE TABLE account_notification_settings (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    settings_id UUID REFERENCES account_settings(id) ON DELETE CASCADE,
    email BOOLEAN DEFAULT true,
    sms BOOLEAN DEFAULT false,
    UNIQUE(settings_id)
);

-- Array of objects
CREATE TABLE account_contacts (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    account_id TEXT REFERENCES accounts(id) ON DELETE CASCADE,
    email TEXT NOT NULL,
    role TEXT,
    ordinal INTEGER NOT NULL  -- Preserves array order
);

-- Simple array
CREATE TABLE account_tags (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    account_id TEXT REFERENCES accounts(id) ON DELETE CASCADE,
    tag TEXT NOT NULL,
    ordinal INTEGER NOT NULL
);

-- Dynamic fields (key-value pattern)
CREATE TABLE account_custom_fields (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    account_id TEXT REFERENCES accounts(id) ON DELETE CASCADE,
    field_key TEXT NOT NULL,
    field_value TEXT,
    value_type TEXT NOT NULL  -- For type preservation
);

One document became six tables. Multiply that across 65 collections and you can see how we ended up with 282 tables.


Architecture Overview

Our migration architecture followed a three-phase approach:

(Diagram: three-phase migration structure)

Phase 1: Schema Creation

We used Supabase CLI migrations to manage our schema:

# Create a new migration
supabase migration new create_entities_table

# Apply migrations
supabase db push

Example migration file (supabase/migrations/01_entities.sql):

-- Create main table
CREATE TABLE IF NOT EXISTS entities (
    id TEXT PRIMARY KEY,
    organization_id UUID NOT NULL,
    name TEXT,
    status TEXT DEFAULT 'active',
    created_at TIMESTAMPTZ DEFAULT NOW(),
    updated_at TIMESTAMPTZ DEFAULT NOW()
);

-- Enable RLS (required for Supabase security model)
ALTER TABLE entities ENABLE ROW LEVEL SECURITY;

-- Create child table for metadata
CREATE TABLE IF NOT EXISTS entity_metadata (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    entity_id TEXT REFERENCES entities(id) ON DELETE CASCADE,
    key TEXT NOT NULL,
    value TEXT,
    ordinal INTEGER DEFAULT 0
);

ALTER TABLE entity_metadata ENABLE ROW LEVEL SECURITY;

Phase 2: Data Migration

The bulk data migration reads from BSON backup files (not live MongoDB) to avoid impacting production. On the write side, we connect straight to the underlying PostgreSQL database using the connection string from the Supabase dashboard.

Phase 3: Constraints & Security

Foreign keys, indexes, and RLS policies are created after all data is loaded. This is critical for performance.
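
In practice, that post-load pass looks roughly like this (table and constraint names here are illustrative, not our exact schema):

-- 1. Add foreign keys only after the data is in place
ALTER TABLE entity_items
    ADD CONSTRAINT fk_entity_items_entity
    FOREIGN KEY (entity_id) REFERENCES entities(id) ON DELETE CASCADE;

-- 2. Index the columns used by joins and RLS policies
CREATE INDEX IF NOT EXISTS idx_entity_items_entity_id ON entity_items(entity_id);
CREATE INDEX IF NOT EXISTS idx_entities_org ON entities(organization_id);

-- 3. Only then create the RLS policies
CREATE POLICY tenant_isolation ON entities
    USING (organization_id = (auth.jwt()->>'org_id')::uuid);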


Normalization Strategies

This section is the heart of the migration. Understanding these patterns will help you tackle any document-to-relational transformation.

First Normal Form: Eliminating Arrays

The Problem: MongoDB arrays have no direct equivalent in a normalized relational schema (PostgreSQL has array columns, but they don't give you per-element constraints or easy joins).

The Solution: Create child tables with an ordinal column to preserve order.

(Diagram: parent-child table structure with an ordinal column)

Before (MongoDB):

{
  "_id": "123",
  "name": "Widget",
  "categories": ["electronics", "gadgets", "sale"]
}

After (Supabase):

-- products table
| id  | name   |
|-----|--------|
| 123 | Widget |

-- product_categories table
| id                                   | product_id | category    | ordinal |
|--------------------------------------|------------|-------------|---------|
| a1b2c3d4-e5f6-7890-abcd-ef1234567890 | 123        | electronics | 0       |
| b2c3d4e5-f6a7-8901-bcde-f12345678901 | 123        | gadgets     | 1       |
| c3d4e5f6-a7b8-9012-cdef-123456789012 | 123        | sale        | 2       |

Transformation Code Pattern:

def transform_array_field(parent_id: str, array: list, field_name: str) -> list:
    """Transform a MongoDB array into rows for a Supabase child table."""
    rows = []
    for ordinal, value in enumerate(array):
        rows.append({
            "parent_id": parent_id,
            field_name: value,
            "ordinal": ordinal
        })
    return rows
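
For the Widget document above, a quick call (hypothetical usage, not part of the migration pipeline) produces the three category rows:

# Hypothetical usage for the Widget example
rows = transform_array_field("123", ["electronics", "gadgets", "sale"], "category")
# [{'parent_id': '123', 'category': 'electronics', 'ordinal': 0},
#  {'parent_id': '123', 'category': 'gadgets', 'ordinal': 1},
#  {'parent_id': '123', 'category': 'sale', 'ordinal': 2}]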

Second Normal Form: Extracting Nested Objects

The Problem: Embedded documents in MongoDB need their own tables.

The Solution: Create child tables with foreign keys. Use UNIQUE constraints for 1:1 relationships.

Naming Convention: parent_nested_field (e.g., user_profile, order_shipping_address)

(Diagram: child table naming convention)

Before (MongoDB):

{
  "_id": "user_456",
  "email": "user@example.com",
  "profile": {
    "bio": "Software engineer",
    "avatar": "https://example.com/avatar.jpg",
    "social": {
      "twitter": "@user",
      "github": "user"
    }
  }
}

After (Supabase):

-- users table
| id       | email             |
|----------|-------------------|
| user_456 | user@example.com  |

-- user_profiles table (1:1 with UNIQUE constraint)
| id  | user_id  | bio               | avatar                        |
|-----|----------|-------------------|-------------------------------|
| uuid| user_456 | Software engineer | https://example.com/avatar.jpg|

-- user_profile_social table (1:1 with profile)
| id  | profile_id | twitter | github |
|-----|------------|---------|--------|
| uuid| uuid       | @user   | user   |
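
The DDL behind those tables follows the same pattern as the accounts example earlier; a minimal sketch, assuming the column names shown above:

CREATE TABLE user_profiles (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    user_id TEXT REFERENCES users(id) ON DELETE CASCADE,
    bio TEXT,
    avatar TEXT,
    UNIQUE(user_id)      -- Enforces the 1:1 relationship
);

CREATE TABLE user_profile_social (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    profile_id UUID REFERENCES user_profiles(id) ON DELETE CASCADE,
    twitter TEXT,
    github TEXT,
    UNIQUE(profile_id)   -- 1:1 with the profile
);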

Third Normal Form: Transitive Dependencies

The Problem: Multi-level nesting creates transitive dependencies.

The Solution: Each level gets its own table with references only to its immediate parent.

Example: 4-Level Deep Structure

(Diagram: a 4-level nested structure split into four tables)

Transformation Pattern:

import uuid

def flatten_nested_structure(doc: dict, table_name: str,
                             parent_id: str = None, depth: int = 0) -> dict:
    """Recursively flatten nested objects into per-table lists of rows."""
    results: dict = {}

    # Extract scalar fields for the current level
    current_row = {k: v for k, v in doc.items()
                   if not isinstance(v, (dict, list))}

    # Use the Mongo _id if present, otherwise generate one,
    # so deeper levels can reference this row
    current_row["id"] = str(current_row.pop("_id", None) or uuid.uuid4())

    if parent_id:
        current_row["parent_id"] = parent_id

    results[table_name] = [current_row]

    # Process nested objects: each level gets its own table,
    # referencing only its immediate parent
    for key, value in doc.items():
        if isinstance(value, dict):
            child_table = f"{table_name}_{key}"
            child_results = flatten_nested_structure(
                value,
                child_table,
                parent_id=current_row["id"],
                depth=depth + 1
            )
            for ct_name, ct_rows in child_results.items():
                results.setdefault(ct_name, []).extend(ct_rows)
        elif isinstance(value, list):
            # Handle arrays (covered in the 1NF section)
            pass

    return results

Handling Dynamic/Schemaless Data

The Problem: MongoDB allows arbitrary fields. Supabase requires defined columns.

The Solution: Key-value pattern with type discriminators, or JSONB columns for query-heavy dynamic data.

(Diagram: key-value vs JSONB storage for dynamic fields)

Option 1: Key-Value Table (Better for simple lookups)

CREATE TABLE entity_custom_fields (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    entity_id TEXT REFERENCES entities(id) ON DELETE CASCADE,
    field_path TEXT NOT NULL,      -- Dot notation for nested: "address.city"
    field_key TEXT NOT NULL,       -- Final key name
    field_value TEXT,              -- Stored as text
    value_type TEXT NOT NULL,      -- 'string', 'number', 'boolean', 'null'
    ordinal INTEGER DEFAULT 0
);

-- Index for efficient lookups
CREATE INDEX idx_entity_custom_fields_lookup
    ON entity_custom_fields(entity_id, field_key);
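
Reading a value back means filtering on field_key and casting by value_type; a sketch of a typed lookup (the entity ID is a placeholder):

-- Fetch employeeCount for one entity, cast back to a number
SELECT field_value::numeric AS employee_count
FROM entity_custom_fields
WHERE entity_id = 'some-entity-id'
  AND field_key = 'employeeCount'
  AND value_type = 'number';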

Option 2: JSONB Column (Better for complex queries)

-- Add JSONB column for flexible querying
ALTER TABLE entities ADD COLUMN custom_data JSONB DEFAULT '{}';

-- Create GIN index for JSONB queries
CREATE INDEX idx_entities_custom_data ON entities USING GIN(custom_data);

-- Query example
SELECT * FROM entities
WHERE custom_data->>'industry' = 'Technology'
  AND (custom_data->>'employeeCount')::int > 100;

Transformation Code:

def transform_dynamic_fields(
    parent_id: str,
    fields: dict,
    path_prefix: str = ""
) -> list:
    """Transform dynamic fields into key-value rows with type preservation."""
    rows = []
    ordinal = 0

    for key, value in fields.items():
        current_path = f"{path_prefix}.{key}" if path_prefix else key

        if isinstance(value, dict):
            # Recurse into nested objects
            rows.extend(transform_dynamic_fields(parent_id, value, current_path))
        else:
            # Determine type
            if value is None:
                value_type = "null"
                str_value = None
            elif isinstance(value, bool):
                value_type = "boolean"
                str_value = str(value).lower()
            elif isinstance(value, (int, float)):
                value_type = "number"
                str_value = str(value)
            else:
                value_type = "string"
                str_value = str(value)

            rows.append({
                "entity_id": parent_id,
                "field_path": current_path,
                "field_key": key,
                "field_value": str_value,
                "value_type": value_type,
                "ordinal": ordinal
            })
            ordinal += 1

    return rows

Data Type Mapping

Understanding type conversions is critical. Here's our complete mapping for Supabase:

| MongoDB Type | Supabase Type | Conversion Notes |
|---|---|---|
| ObjectId | TEXT | Hex string (24 chars); consider UUID for new IDs |
| String | TEXT | Direct mapping |
| Int32 | INTEGER | Direct mapping |
| Int64 / Long | BIGINT | Direct mapping |
| Double | DOUBLE PRECISION | Direct mapping |
| Boolean | BOOLEAN | Direct mapping |
| Date | TIMESTAMPTZ | Always timezone-aware in Supabase |
| Null | NULL | Handle explicitly in code |
| Array | Child table | With ordinal column |
| Embedded document | Child table | FK with CASCADE |
| Binary | BYTEA or Supabase Storage | Large files → Storage |
| Decimal128 | NUMERIC | For precise decimals |
| Mixed/Dynamic | JSONB or KV table | JSONB preferred for queries |

Special Handling: IDs

Supabase prefers UUIDs for new records. For migrated data, we preserve MongoDB ObjectIds as TEXT:

def convert_object_id(oid) -> str:
    """Convert MongoDB ObjectId to string for Supabase."""
    if oid is None:
        return None
    return str(oid)  # Returns 24-character hex string

For new tables or columns, use Supabase's UUID generation:

CREATE TABLE new_entities (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    legacy_mongo_id TEXT UNIQUE,  -- Preserve old ID for reference
    -- ...
);

Special Handling: Dates

Supabase uses TIMESTAMPTZ which is timezone-aware:

from datetime import datetime, timezone

def convert_date(mongo_date) -> datetime:
    """Convert MongoDB date to Supabase timestamptz."""
    if mongo_date is None:
        return None
    if isinstance(mongo_date, datetime):
        # Ensure timezone awareness
        if mongo_date.tzinfo is None:
            return mongo_date.replace(tzinfo=timezone.utc)
        return mongo_date
    # Handle milliseconds since epoch
    return datetime.fromtimestamp(mongo_date / 1000, tz=timezone.utc)
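
Special Handling: Decimals

Decimal128 maps to NUMERIC. A minimal conversion sketch, assuming the bson package's Decimal128 type:

from decimal import Decimal
from bson.decimal128 import Decimal128

def convert_decimal(value):
    """Convert MongoDB Decimal128 to Python Decimal (psycopg2 adapts it to NUMERIC)."""
    if value is None:
        return None
    if isinstance(value, Decimal128):
        return value.to_decimal()
    return Decimal(str(value))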

Special Handling: Binary/Files

Large binary data should go to Supabase Storage instead of the database:

from supabase import create_client

supabase = create_client(SUPABASE_URL, SUPABASE_KEY)

def migrate_file(file_data: bytes, filename: str, bucket: str = "migrated-files"):
    """Upload binary data to Supabase Storage."""
    path = f"legacy/{filename}"
    result = supabase.storage.from_(bucket).upload(path, file_data)
    # Return the public URL or path for database reference
    return supabase.storage.from_(bucket).get_public_url(path)

The Transformation Engine

BSON Reader Layer

We read directly from BSON backup files rather than querying live MongoDB:

(Diagram: BSON file layout with length-prefixed documents)

BSON Reading Implementation:

import struct
import bson

def read_bson_file(filepath: str):
    """Generator that yields documents from a BSON file."""
    with open(filepath, 'rb') as f:
        while True:
            # Read 4-byte document length (little-endian)
            length_bytes = f.read(4)
            if len(length_bytes) < 4:
                break  # EOF

            doc_length = struct.unpack('<i', length_bytes)[0]

            # Read rest of document (length includes the 4 bytes we already read)
            doc_body = f.read(doc_length - 4)
            if len(doc_body) < doc_length - 4:
                raise ValueError("Unexpected EOF in BSON file")

            # Parse BSON document
            full_doc = length_bytes + doc_body
            yield bson.decode(full_doc)

Connecting to Supabase for Migration

For bulk migration, connect directly to Supabase's PostgreSQL:

import psycopg2
from urllib.parse import urlparse

# Connection string from the Supabase dashboard
# Settings > Database > Connection string > URI
# (pooler URI shown here; see Performance Optimizations for pooler vs direct connections)
DATABASE_URL = "postgresql://postgres.[project-ref]:[password]@aws-0-[region].pooler.supabase.com:6543/postgres"

def get_supabase_connection():
    """Get direct PostgreSQL connection to Supabase."""
    return psycopg2.connect(DATABASE_URL)

Recursive Flattening Algorithm

The core transformation logic handles arbitrary nesting depth:

from dataclasses import dataclass, field
from typing import Any
import re
import uuid

@dataclass
class FlattenResult:
    """Result of flattening a document."""
    main_row: dict
    child_tables: dict = field(default_factory=dict)

def flatten_document(
    doc: dict,
    table_name: str,
    parent_id: str = None,
    parent_key: str = None,
    max_depth: int = 10,
    current_depth: int = 0
) -> FlattenResult:
    """
    Recursively flatten a MongoDB document into Supabase table rows.

    Args:
        doc: The document to flatten
        table_name: Name of the current table
        parent_id: ID of the parent row (for child tables)
        parent_key: Foreign key column name
        max_depth: Maximum nesting depth (prevents infinite recursion)
        current_depth: Current recursion depth

    Returns:
        FlattenResult with main row and child table data
    """
    if current_depth > max_depth:
        raise ValueError(f"Max nesting depth exceeded: {max_depth}")

    result = FlattenResult(main_row={}, child_tables={})

    # Handle parent reference
    if parent_id and parent_key:
        result.main_row[parent_key] = parent_id

    # Get document ID; child documents often have no _id, so generate one
    # so that grandchildren can still reference this row
    doc_id = str(doc.get('_id') or uuid.uuid4())
    result.main_row.setdefault('id', doc_id)

    for key, value in doc.items():
        if key == '_id':
            result.main_row['id'] = str(value)
            continue

        if value is None:
            result.main_row[to_snake_case(key)] = None

        elif isinstance(value, dict):
            # Nested object -> child table
            child_table = f"{table_name}_{to_snake_case(key)}"
            child_result = flatten_document(
                value,
                child_table,
                parent_id=doc_id,
                parent_key=f"{table_name}_id",
                max_depth=max_depth,
                current_depth=current_depth + 1
            )
            result.child_tables[child_table] = [child_result.main_row]
            # Merge nested child tables
            for ct_name, ct_rows in child_result.child_tables.items():
                result.child_tables.setdefault(ct_name, []).extend(ct_rows)

        elif isinstance(value, list):
            # Array -> child table with ordinal
            child_table = f"{table_name}_{to_snake_case(key)}"
            result.child_tables[child_table] = []

            for ordinal, item in enumerate(value):
                if isinstance(item, dict):
                    child_result = flatten_document(
                        item,
                        child_table,
                        parent_id=doc_id,
                        parent_key=f"{table_name}_id",
                        max_depth=max_depth,
                        current_depth=current_depth + 1
                    )
                    child_result.main_row['ordinal'] = ordinal
                    result.child_tables[child_table].append(child_result.main_row)
                    # Merge nested child tables
                    for ct_name, ct_rows in child_result.child_tables.items():
                        result.child_tables.setdefault(ct_name, []).extend(ct_rows)
                else:
                    # Simple value array
                    result.child_tables[child_table].append({
                        f"{table_name}_id": doc_id,
                        "value": convert_value(item),
                        "ordinal": ordinal
                    })

        else:
            # Scalar value
            result.main_row[to_snake_case(key)] = convert_value(value)

    return result


def to_snake_case(name: str) -> str:
    """Convert camelCase to snake_case."""
    s1 = re.sub('(.)([A-Z][a-z]+)', r'\1_\2', name)
    return re.sub('([a-z0-9])([A-Z])', r'\1_\2', s1).lower()


def convert_value(value: Any) -> Any:
    """Convert MongoDB values to Supabase-compatible values."""
    from bson import ObjectId
    from datetime import datetime

    if isinstance(value, ObjectId):
        return str(value)
    elif isinstance(value, datetime):
        return value  # psycopg2 handles datetime
    elif isinstance(value, bytes):
        return value  # BYTEA
    elif isinstance(value, bool):
        return value
    elif isinstance(value, (int, float)):
        return value
    else:
        return str(value) if value is not None else None

Batch Processing Strategy

Inserting one row at a time is catastrophically slow. We batch inserts with configurable thresholds:

(Diagram: batch processing flow)

Batch Insert Implementation:

from typing import List
from threading import Lock

from psycopg2.extras import execute_values

class BatchInserter:
    """Efficient batch insert manager for Supabase PostgreSQL."""

    def __init__(
        self,
        connection,
        table_name: str,
        columns: List[str],
        batch_size: int = 1000,
        on_conflict: str = "DO NOTHING"
    ):
        self.conn = connection
        self.table = table_name
        self.columns = columns
        self.batch_size = batch_size
        self.on_conflict = on_conflict
        self.buffer: List[tuple] = []
        self.total_inserted = 0
        self._lock = Lock()

    def add(self, row: dict):
        """Add a row to the buffer, flushing if threshold reached."""
        values = tuple(row.get(col) for col in self.columns)
        self.buffer.append(values)

        if len(self.buffer) >= self.batch_size:
            self.flush()

    def flush(self):
        """Execute a single multi-row INSERT for all buffered rows."""
        if not self.buffer:
            return

        with self._lock:
            columns_str = ', '.join(self.columns)

            query = f"""
                INSERT INTO {self.table} ({columns_str})
                VALUES %s
                ON CONFLICT {self.on_conflict}
            """

            with self.conn.cursor() as cur:
                # execute_values expands %s into one multi-row VALUES list,
                # avoiding executemany's one-round-trip-per-row overhead
                execute_values(cur, query, self.buffer)

            self.total_inserted += len(self.buffer)
            self.conn.commit()
            self.buffer.clear()

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_val, exc_tb):
        self.flush()  # Flush remaining on context exit
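
A typical call site (the table and columns here are illustrative) uses the context manager so the final partial batch is flushed automatically:

# Hypothetical usage: stream flattened rows into a child table
with BatchInserter(conn, "account_tags",
                   columns=["account_id", "tag", "ordinal"],
                   batch_size=1000) as inserter:
    for row in tag_rows:   # rows produced by the flattening step
        inserter.add(row)
# __exit__ flushes whatever is left in the buffer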

Handling Complex Patterns

Graph Structures (Workflow Nodes/Edges)

Many applications store graph data in documents. Workflows with nodes and edges are a common example:

(Diagram: workflow stored as nodes and edges)

Supabase Schema:

CREATE TABLE workflows (
    id TEXT PRIMARY KEY,
    organization_id UUID NOT NULL,
    name TEXT,
    created_at TIMESTAMPTZ DEFAULT NOW()
);

ALTER TABLE workflows ENABLE ROW LEVEL SECURITY;

CREATE TABLE workflow_nodes (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    workflow_id TEXT REFERENCES workflows(id) ON DELETE CASCADE,
    node_type TEXT,
    position_x DOUBLE PRECISION,
    position_y DOUBLE PRECISION,
    label TEXT,
    config JSONB DEFAULT '{}',  -- Flexible node configuration
    ordinal INTEGER
);

ALTER TABLE workflow_nodes ENABLE ROW LEVEL SECURITY;

CREATE TABLE workflow_edges (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    workflow_id TEXT REFERENCES workflows(id) ON DELETE CASCADE,
    source_node_id TEXT,  -- References node by original ID
    target_node_id TEXT,
    edge_type TEXT,
    ordinal INTEGER
);

ALTER TABLE workflow_edges ENABLE ROW LEVEL SECURITY;

Multi-Tenant Architecture

Every table includes organization scoping. This is where Supabase's RLS shines (covered in the next section):

-- Every table follows this pattern
CREATE TABLE entities (
    id TEXT PRIMARY KEY,
    organization_id UUID NOT NULL,  -- Tenant scoping
    -- ... other columns
);

-- RLS policy enforces tenant isolation
CREATE POLICY tenant_isolation ON entities
    FOR ALL
    USING (organization_id = (auth.jwt()->>'org_id')::uuid);

Audit Trail Preservation

Audit logs work beautifully with Supabase's JSONB support:

CREATE TABLE audit_logs (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    organization_id UUID NOT NULL,
    resource_type TEXT NOT NULL,
    resource_id TEXT NOT NULL,
    action TEXT NOT NULL,           -- 'create', 'update', 'delete'
    actor_id UUID REFERENCES auth.users(id),
    actor_type TEXT,                -- 'user', 'system', 'api'
    timestamp TIMESTAMPTZ DEFAULT NOW(),
    changes JSONB,                  -- Before/after diff
    metadata JSONB                  -- Additional context
);

ALTER TABLE audit_logs ENABLE ROW LEVEL SECURITY;

-- Efficient queries
CREATE INDEX idx_audit_resource ON audit_logs(resource_type, resource_id);
CREATE INDEX idx_audit_org_time ON audit_logs(organization_id, timestamp DESC);
CREATE INDEX idx_audit_changes ON audit_logs USING GIN(changes);
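
With the GIN index on changes, containment queries stay fast. A sketch of a typical lookup (this assumes our convention of storing per-field old/new values in the diff, which isn't anything Supabase mandates):

-- Find every audit entry where an entity was moved to 'archived'
SELECT resource_id, actor_id, timestamp
FROM audit_logs
WHERE resource_type = 'entity'
  AND changes @> '{"status": {"new": "archived"}}'
ORDER BY timestamp DESC;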

Row Level Security for Multi-Tenancy

This is one of Supabase's killer features. Instead of filtering in every query, we enforce isolation at the database level.

Basic Tenant Isolation Pattern

-- Enable RLS on the table
ALTER TABLE entities ENABLE ROW LEVEL SECURITY;

-- Create policy using JWT claims
CREATE POLICY tenant_isolation ON entities
    FOR ALL
    USING (
        organization_id = (
            SELECT (auth.jwt()->>'org_id')::uuid
        )
    );

Policy Patterns by Operation

-- SELECT: Users see only their organization's data
CREATE POLICY select_own_org ON entities
    FOR SELECT
    USING (organization_id = (auth.jwt()->>'org_id')::uuid);

-- INSERT: New records must belong to the user's organization
CREATE POLICY insert_own_org ON entities
    FOR INSERT
    WITH CHECK (organization_id = (auth.jwt()->>'org_id')::uuid);

-- UPDATE: Can only update own organization's records
CREATE POLICY update_own_org ON entities
    FOR UPDATE
    USING (organization_id = (auth.jwt()->>'org_id')::uuid)
    WITH CHECK (organization_id = (auth.jwt()->>'org_id')::uuid);

-- DELETE: Can only delete own organization's records
CREATE POLICY delete_own_org ON entities
    FOR DELETE
    USING (organization_id = (auth.jwt()->>'org_id')::uuid);

Role-Based Access Within Tenants

-- Admins can do everything within their org
CREATE POLICY admin_full_access ON entities
    FOR ALL
    USING (
        organization_id = (auth.jwt()->>'org_id')::uuid
        AND auth.jwt()->>'role' = 'admin'
    );

-- Regular users can only view
CREATE POLICY user_read_only ON entities
    FOR SELECT
    USING (
        organization_id = (auth.jwt()->>'org_id')::uuid
        AND auth.jwt()->>'role' = 'user'
    );

Service Role Bypass for Migration

During migration, use the service role to bypass RLS:

# For migration: use the service role key (bypasses RLS)
# This is the "service_role" key from the Supabase dashboard
from supabase import create_client

SUPABASE_SERVICE_KEY = "eyJ..."

# Clients created with the service role key are not subject to RLS policies;
# the direct PostgreSQL connection used for bulk loading also bypasses them
admin_client = create_client(SUPABASE_URL, SUPABASE_SERVICE_KEY)

Post-Migration: Verify RLS

After migration, verify RLS is working:

-- Check which policies exist
SELECT tablename, policyname, cmd, qual
FROM pg_policies
WHERE schemaname = 'public';

-- Test as a specific user (in the Supabase SQL editor)
BEGIN;
SET LOCAL ROLE authenticated;
SELECT set_config('request.jwt.claims',
    '{"org_id": "00000000-0000-0000-0000-000000000001", "role": "user"}', true);
SELECT * FROM entities;  -- Should only return that organization's data
ROLLBACK;

Indexing Strategy

Proper indexing is critical for Supabase performance:

(Diagram: index categories)

Index Categories

| Category | Example | Index Type | Notes |
|---|---|---|---|
| Primary Key | id | B-Tree (automatic) | Use UUID for new tables |
| Foreign Key | parent_id | B-Tree | Critical for JOINs |
| Tenant Scope | organization_id | B-Tree | Required for RLS performance |
| Status | status | B-Tree | Often combined with org |
| Temporal | created_at | B-Tree DESC | For recent-first queries |
| JSONB | custom_data | GIN | For containment queries |
| Text Search | name | GIN (tsvector) | For full-text search |

Index Creation Script Pattern

-- Foreign key indexes (critical for JOIN performance)
CREATE INDEX IF NOT EXISTS idx_entity_items_entity_id
    ON entity_items(entity_id);

-- Tenant-scoped queries (CRITICAL for RLS performance)
CREATE INDEX IF NOT EXISTS idx_entities_org
    ON entities(organization_id);

-- Combined index for common query patterns
CREATE INDEX IF NOT EXISTS idx_entities_org_status
    ON entities(organization_id, status);

-- Temporal queries
CREATE INDEX IF NOT EXISTS idx_entities_created
    ON entities(created_at DESC);

-- JSONB containment queries
CREATE INDEX IF NOT EXISTS idx_entities_custom_data
    ON entities USING GIN(custom_data);

-- JSONB specific key queries (expression index)
CREATE INDEX IF NOT EXISTS idx_entities_custom_industry
    ON entities((custom_data->>'industry'));

Index Count by Category

Our final index breakdown:

| Category | Count | Percentage |
|---|---|---|
| Foreign Key | 217 | 44% |
| Tenant Scope | 89 | 18% |
| Status/Type | 72 | 15% |
| Temporal | 58 | 12% |
| Composite | 41 | 8% |
| JSONB/GIN | 14 | 3% |
| Total | 491 | 100% |

Foreign Key Strategy

Phased Constraint Creation

Critical insight: Creating foreign keys during data load is extremely slow. Each INSERT triggers constraint validation.

Our approach:

  1. Create tables without foreign keys
  2. Migrate all data (using service role)
  3. Clean orphaned references
  4. Add foreign keys
  5. Create indexes
  6. Enable and create RLS policies

(Diagrams: phases 1-2, schema creation and data load; phase 3, cleanup and security)

Orphan Cleanup

Before adding foreign keys, clean up invalid references:

-- Find orphans: child references non-existent parent
SELECT child.id, child.parent_id
FROM child_table child
LEFT JOIN parent_table parent ON child.parent_id = parent.id
WHERE child.parent_id IS NOT NULL
  AND parent.id IS NULL;

-- Clean orphans: set to NULL
UPDATE child_table child
SET parent_id = NULL
WHERE parent_id IS NOT NULL
  AND NOT EXISTS (
    SELECT 1 FROM parent_table
    WHERE id = child.parent_id
  );

Cascade Strategy Decision Matrix

| Relationship | ON DELETE | ON UPDATE | Reasoning |
|---|---|---|---|
| Strong ownership | CASCADE | CASCADE | Child meaningless without parent |
| Soft reference | SET NULL | CASCADE | Preserve child, clear reference |
| Lookup reference | RESTRICT | CASCADE | Prevent invalid state |
| Audit/history | NO ACTION | CASCADE | Never delete audit records |

Leveraging Supabase Features

After migration, you can take advantage of Supabase's full feature set.

Realtime Subscriptions

Replace MongoDB change streams with Supabase Realtime:

import { createClient } from '@supabase/supabase-js'

const supabase = createClient(SUPABASE_URL, SUPABASE_ANON_KEY)

// Subscribe to changes on a table
const subscription = supabase
  .channel('entities-changes')
  .on(
    'postgres_changes',
    {
      event: '*',  // INSERT, UPDATE, DELETE
      schema: 'public',
      table: 'entities',
      filter: `organization_id=eq.${orgId}`
    },
    (payload) => {
      console.log('Change received:', payload)
      // Handle the change
    }
  )
  .subscribe()

Enable Realtime for Tables

-- Enable realtime for specific tables
ALTER PUBLICATION supabase_realtime ADD TABLE entities;
ALTER PUBLICATION supabase_realtime ADD TABLE entity_items;

Supabase Client for Application Code

Replace MongoDB driver with Supabase client:

// Before: MongoDB
const entities = await db.collection('entities')
  .find({ organizationId: orgId, status: 'active' })
  .toArray()

// After: Supabase (RLS handles org filtering automatically!)
const { data: entities, error } = await supabase
  .from('entities')
  .select('*')
  .eq('status', 'active')

Edge Functions for Complex Operations

Replace MongoDB aggregations with Edge Functions:

// supabase/functions/calculate-metrics/index.ts
import { serve } from 'https://deno.land/std@0.168.0/http/server.ts'
import { createClient } from 'https://esm.sh/@supabase/supabase-js@2'

serve(async (req) => {
  const supabase = createClient(
    Deno.env.get('SUPABASE_URL')!,
    Deno.env.get('SUPABASE_SERVICE_ROLE_KEY')!
  )

  // Complex aggregation that was painful in MongoDB
  const { orgId } = await req.json()
  const { data, error } = await supabase.rpc('calculate_entity_metrics', { org_id: orgId })

  if (error) {
    return new Response(JSON.stringify({ error: error.message }), { status: 500 })
  }

  return new Response(JSON.stringify(data), {
    headers: { 'Content-Type': 'application/json' }
  })
})

Database Functions for Complex Queries

-- Create a function for complex aggregations
CREATE OR REPLACE FUNCTION calculate_entity_metrics(org_id UUID)
RETURNS TABLE (
    status TEXT,
    count BIGINT,
    avg_items NUMERIC
) AS $$
BEGIN
    RETURN QUERY
    SELECT
        e.status,
        COUNT(*)::BIGINT,
        AVG(item_count)::NUMERIC
    FROM entities e
    LEFT JOIN (
        SELECT entity_id, COUNT(*) as item_count
        FROM entity_items
        GROUP BY entity_id
    ) items ON e.id = items.entity_id
    WHERE e.organization_id = org_id
    GROUP BY e.status;
END;
$$ LANGUAGE plpgsql SECURITY DEFINER;
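
The function can be called from SQL directly or through the client's rpc helper; a quick sketch (the UUID is a placeholder):

-- From SQL
SELECT * FROM calculate_entity_metrics('00000000-0000-0000-0000-000000000001');

-- From the JS client:
-- const { data } = await supabase.rpc('calculate_entity_metrics', { org_id: orgId })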

Storage Integration

Move binary data to Supabase Storage:

// Upload file
const { data, error } = await supabase.storage
  .from('documents')
  .upload(`${orgId}/${entityId}/${filename}`, file)

// Get public URL
const { data: { publicUrl } } = supabase.storage
  .from('documents')
  .getPublicUrl(`${orgId}/${entityId}/${filename}`)

// Store URL in database
await supabase
  .from('entity_attachments')
  .insert({ entity_id: entityId, file_url: publicUrl })

Error Handling & Data Quality

Graceful Degradation

A single bad document should not stop the entire migration:

import logging
from dataclasses import dataclass
from threading import Lock

@dataclass
class MigrationStats:
    """Thread-safe migration statistics."""
    documents_processed: int = 0
    documents_failed: int = 0
    rows_inserted: int = 0
    _lock: Lock = None

    def __post_init__(self):
        self._lock = Lock()

    def record_success(self, rows: int):
        with self._lock:
            self.documents_processed += 1
            self.rows_inserted += rows

    def record_failure(self):
        with self._lock:
            self.documents_processed += 1
            self.documents_failed += 1


def migrate_with_error_handling(bson_path: str, db_connection) -> MigrationStats:
    """Migrate collection with graceful error handling."""
    stats = MigrationStats()
    logger = logging.getLogger(__name__)

    for doc in read_bson_file(bson_path):
        try:
            result = flatten_document(doc, 'entities')
            rows_inserted = insert_all_rows(db_connection, result)
            stats.record_success(rows_inserted)

        except Exception as e:
            stats.record_failure()
            doc_id = doc.get('_id', 'unknown')
            logger.error(f"Failed to migrate document {doc_id}: {e}")

            # Save failed documents for retry
            save_failed_document(doc, str(e))

    return stats
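
save_failed_document and insert_all_rows are helpers the snippet above assumes; here's a minimal sketch of the former, writing failures to a JSONL dead-letter file for later retry:

import json
from bson import json_util  # handles ObjectId/datetime serialization

def save_failed_document(doc: dict, error: str, path: str = "failed_documents.jsonl"):
    """Append a failed document and its error to a JSONL file for retry."""
    with open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps({"error": error, "document": doc},
                           default=json_util.default) + "\n")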

Data Validation

MongoDB allows messy data. Sanitize during transformation:

from typing import Optional

def sanitize_text(value: Optional[str]) -> Optional[str]:
    """Sanitize text for Supabase insertion."""
    if value is None:
        return None

    # Remove null bytes (PostgreSQL doesn't allow \x00)
    value = value.replace('\x00', '')

    # Ensure valid UTF-8
    value = value.encode('utf-8', errors='replace').decode('utf-8')

    # Trim excessive whitespace
    value = ' '.join(value.split())

    return value if value else None

Performance Optimizations

Parallel Collection Processing

Collections are independent—process them in parallel:

from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import List

def migrate_all_collections(
    collection_paths: List[str],
    db_connection_factory,
    max_workers: int = 4
) -> dict:
    """Migrate multiple collections in parallel."""
    results = {}

    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        futures = {}

        for path in collection_paths:
            conn = db_connection_factory()
            future = executor.submit(migrate_collection, path, conn)
            futures[future] = path

        for future in as_completed(futures):
            path = futures[future]
            try:
                stats = future.result()
                results[path] = stats
                print(f"{path}: {stats.documents_processed} docs")
            except Exception as e:
                print(f"{path}: {e}")
                results[path] = None

    return results

Supabase Connection Pooling

Use Supabase's connection pooler for better performance:

# Use pooler connection string (port 6543) instead of direct (port 5432)
# Transaction mode for short queries
POOLER_URL = "postgresql://postgres.[ref]:[pass]@aws-0-[region].pooler.supabase.com:6543/postgres"

# Direct connection (session state preserved) for migrations and other long-lived work
DIRECT_URL = "postgresql://postgres:[pass]@db.[ref].supabase.co:5432/postgres"

Deferred Constraint Validation

Disable triggers during bulk load:

-- Before migration (run as superuser/service role)
SET session_replication_role = replica;

-- ... perform bulk insert ...

-- After migration
SET session_replication_role = DEFAULT;

Lessons Learned

1. Plan for Orphaned References

Problem: MongoDB doesn't enforce referential integrity. We found thousands of references to deleted documents.

Solution: Build orphan detection into your pipeline. Run cleanup before adding foreign keys.

2. RLS Performance Requires Indexes

Problem: RLS policies that filter by organization_id are slow without proper indexes.

Solution: Always create indexes on columns used in RLS policies BEFORE enabling policies.

-- Create index BEFORE policy
CREATE INDEX idx_entities_org ON entities(organization_id);

-- Then create policy
CREATE POLICY tenant_isolation ON entities
    USING (organization_id = (auth.jwt()->>'org_id')::uuid);

3. Test with Production-Scale Data

Problem: A migration that works with 1,000 documents may fail with 100,000.

Solution: Test with full production data volume. Monitor memory and query performance.

4. Use Service Role for Migration

Problem: RLS policies block migration inserts if organization_id doesn't match JWT.

Solution: Use service role key during migration (bypasses RLS), then verify RLS works after.

5. Supabase CLI is Your Friend

Solution: Use Supabase CLI for:

  • Schema migrations (supabase migration new)
  • Local development (supabase start)
  • Type generation (supabase gen types typescript)

6. Handle Time Zones Explicitly

Problem: MongoDB dates might be in different time zones.

Solution: Always use TIMESTAMPTZ and explicitly convert to UTC during migration.


Results & Metrics

Migration Statistics

| Metric | Value |
|---|---|
| Source Collections | 65 |
| Target Tables | 282 |
| Total Rows | ~80,000 |
| Foreign Keys | 217 |
| Indexes | 491 |
| RLS Policies | 156 |
| Orphaned References Cleaned | 2,847 |
| Failed Documents | 12 |
| Success Rate | 99.98% |

Performance Comparison (Post-Migration)

| Query Type | MongoDB | Supabase | Improvement |
|---|---|---|---|
| Simple lookup | 5ms | 2ms | 60% |
| Aggregation | 250ms | 45ms | 82% |
| Join (3 tables) | N/A* | 15ms | - |
| Reporting query | 2.5s | 180ms | 93% |
| Realtime updates | Custom code | Built-in | - |

*MongoDB required multiple queries and application-level joins.

Developer Experience Improvements

| Aspect | MongoDB | Supabase |
|---|---|---|
| Multi-tenant security | Application code | Database RLS |
| Realtime | Custom change streams | Built-in |
| Type safety | Manual | Generated types |
| Auth integration | Separate system | Integrated |
| Dashboard | MongoDB Compass | Supabase Studio |

Conclusion

Migrating from MongoDB to Supabase is more than a database change—it's an upgrade to a complete backend platform. The relational model brings data integrity, RLS brings security, and Supabase's ecosystem brings developer productivity.

Key Takeaways:

  1. Plan normalization carefully. Each nested structure needs a strategy.

  2. Embrace RLS. It's not just security—it simplifies your application code.

  3. Phase your migration. Schema → Data → Constraints → RLS policies.

  4. Use the full platform. Realtime, Storage, Edge Functions, and Auth are included.

  5. Test at scale. Performance characteristics change with data volume.

Next Steps After Migration

  1. Validation: Compare counts, verify data integrity, test RLS policies (see the count-check sketch after this list)
  2. Update Application: Switch to Supabase client, remove manual tenant filtering
  3. Enable Realtime: Add subscriptions where needed
  4. Migrate Files: Move binary data to Supabase Storage
  5. Monitor: Use Supabase dashboard for query performance
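
A minimal sketch of the count check (names are illustrative; row counts won't match documents 1:1 because each document fans out into several rows, so compare document counts against the parent table):

# Hypothetical validation: BSON document count vs parent-table row count
import psycopg2

def compare_counts(bson_path: str, table: str, database_url: str):
    source_count = sum(1 for _ in read_bson_file(bson_path))  # BSON reader from earlier
    with psycopg2.connect(database_url) as conn, conn.cursor() as cur:
        cur.execute(f"SELECT COUNT(*) FROM {table}")
        target_count = cur.fetchone()[0]
    return source_count, target_count

src, dst = compare_counts("backup/entities.bson", "entities", DATABASE_URL)
print(f"entities: {src} documents -> {dst} rows")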


This post is based on a real production migration. Numbers have been rounded, and specific table names have been generalized.

Questions about your own migration? Drop them in the comments!

