Roberto Araneda Espinoza for Gophers CL

Posted on Mar 8

Multi-Tenant SaaS in Go: A Practical Guide to Row-Level Security in PostgreSQL

#go #postgres #security #saas

You are building a SaaS application. Multiple customers share the same database. Customer A must never see Customer B's data. A single missing WHERE tenant_id = ? and you have a data breach.

This article shows how to implement PostgreSQL Row-Level Security (RLS) in Go so that the database itself enforces tenant isolation: transparently, automatically, and without relying on developers to remember a filter in every query.

Why RLS?

There are four common approaches to multi-tenancy:

Approach	How it works	Risk
App-level `WHERE`	Add `WHERE tenant_id = ?` to every query	One missing filter = breach
Schema-per-tenant	Separate schema per customer	Migration hell
Database-per-tenant	Separate DB per customer	Connection pooling nightmare
Row-Level Security	PostgreSQL enforces visibility per transaction	Requires transactional discipline

RLS wins because the security boundary moves from application code (where humans make mistakes) to the database engine (which applies policies mechanically on every query).

Architecture

The tenant ID flows through three layers:

HTTP Request ──▶ Middleware ──▶ Go Context ──▶ PostgreSQL TX
(JWT/Header)     (Extract)     (Propagate)    (SET LOCAL)
                                                   │
                                              RLS Policy
                                              (Enforce)

Every database operation runs inside a transaction where SET LOCAL sets the current tenant. The RLS policies then filter rows automatically. Your application code never writes WHERE tenant_id = ?.

Step 1: The Tenant Context Package

A small package that manages tenant identity through context.Context:

package tenant

import (
    "context"
    "errors"
    "regexp"
)

type contextKey string

const tenantIDKey contextKey = "tenant_id"

var (
    ErrNoTenant      = errors.New("no tenant ID in context")
    ErrInvalidTenant = errors.New("invalid tenant ID format")
)

var tenantIDRegex = regexp.MustCompile(`^[a-zA-Z0-9][a-zA-Z0-9._-]{0,63}$`)

func ValidateTenantID(id string) error {
    if id == "" {
        return ErrNoTenant
    }
    if !tenantIDRegex.MatchString(id) {
        return ErrInvalidTenant
    }
    return nil
}

func WithTenantID(ctx context.Context, id string) (context.Context, error) {
    if err := ValidateTenantID(id); err != nil {
        return ctx, err
    }
    return context.WithValue(ctx, tenantIDKey, id), nil
}

func GetTenantID(ctx context.Context) string {
    if v, ok := ctx.Value(tenantIDKey).(string); ok {
        return v
    }
    return ""
}

Three design decisions matter here:

Unexported context key: The contextKey type is private, so no other package can construct the key. All reads and writes go through GetTenantID and WithTenantID, and WithTenantID validates the ID.
Regex validation: The pattern ^[a-zA-Z0-9][a-zA-Z0-9._-]{0,63}$ enforces the format and rejects quotes, spaces, and the other characters an SQL injection would need. Defense in depth.
Empty string = single-tenant: With no tenant in the context, the system runs without the extra transaction. Same codebase for single-tenant and multi-tenant.
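To make the validation rule concrete, here is a minimal sketch that runs the same pattern against a few candidate IDs. The helper name isValidTenantID is hypothetical (a bool-returning stand-in for ValidateTenantID), and the sample IDs are made up for illustration.

```go
package main

import (
	"fmt"
	"regexp"
)

// Same pattern as the tenant package above.
var tenantIDRegex = regexp.MustCompile(`^[a-zA-Z0-9][a-zA-Z0-9._-]{0,63}$`)

// isValidTenantID is a hypothetical bool-returning stand-in for ValidateTenantID.
func isValidTenantID(id string) bool {
	return tenantIDRegex.MatchString(id)
}

func main() {
	candidates := []string{
		"acme",
		"tenant-42",
		"a.b_c",
		"",                          // empty: rejected
		"-leading-dash",             // must start with a letter or digit
		"x'; DROP TABLE orders; --", // quotes and spaces are not in the allowed set
	}
	for _, id := range candidates {
		fmt.Printf("%-30q valid=%v\n", id, isValidTenantID(id))
	}
}
```

The pattern allows one leading alphanumeric character plus up to 63 more, so the longest accepted ID is 64 characters, which matches the VARCHAR(64) column used later in the schema.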

Step 2: HTTP Middleware

Extract the tenant from two sources, JWT tokens and HTTP headers, with cross-validation.

import (
    "errors"
    "net/http"
    "strings"
)

var ErrTenantMismatch = errors.New("tenant ID mismatch between header and JWT")

type MiddlewareConfig struct {
    ValidateToken      func(token string) (tenantID string, err error)
    HeaderName         string // e.g. "X-Tenant-ID"
    AllowAnonymous     bool
    DefaultTenant      string
    RequireTenantInJWT bool   // prevents header spoofing
}

func Middleware(cfg MiddlewareConfig) func(http.Handler) http.Handler {
    return func(next http.Handler) http.Handler {
        return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
            var headerTenant, jwtTenant string

            // Source 1: HTTP header
            if cfg.HeaderName != "" {
                headerTenant = r.Header.Get(cfg.HeaderName)
            }

            // Source 2: JWT bearer token
            if cfg.ValidateToken != nil {
                if auth := r.Header.Get("Authorization"); strings.HasPrefix(auth, "Bearer ") {
                    var err error
                    jwtTenant, err = cfg.ValidateToken(strings.TrimPrefix(auth, "Bearer "))
                    if err != nil {
                        http.Error(w, "invalid token", http.StatusUnauthorized)
                        return
                    }
                }
            }

            // Cross-check: if both sources provide a tenant, they must match
            if headerTenant != "" && jwtTenant != "" && headerTenant != jwtTenant {
                http.Error(w, ErrTenantMismatch.Error(), http.StatusForbidden)
                return
            }

            // JWT takes priority (cryptographically verified)
            tenantID := jwtTenant
            if tenantID == "" {
                tenantID = headerTenant
            }

            // Anti-spoofing
            if cfg.RequireTenantInJWT && jwtTenant == "" && headerTenant != "" {
                http.Error(w, "tenant must be in JWT", http.StatusUnauthorized)
                return
            }

            // Fallback
            if tenantID == "" {
                if cfg.DefaultTenant != "" {
                    tenantID = cfg.DefaultTenant
                } else if !cfg.AllowAnonymous {
                    http.Error(w, "tenant required", http.StatusUnauthorized)
                    return
                }
            }

            ctx := r.Context()
            if tenantID != "" {
                var err error
                ctx, err = WithTenantID(ctx, tenantID)
                if err != nil {
                    http.Error(w, err.Error(), http.StatusBadRequest)
                    return
                }
            }

            next.ServeHTTP(w, r.WithContext(ctx))
        })
    }
}

The cross-check rejects a request that carries a valid JWT for Tenant A but sets the X-Tenant-ID header to Tenant B, instead of silently picking one of the two. RequireTenantInJWT goes further: it forces the tenant to come from the JWT, never from a header that any client can forge.
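The decision order of the middleware can be restated as a small pure function, which makes the cases easy to read and to test. This is only a sketch: resolveTenant is a hypothetical name, and it leaves out the DefaultTenant and AllowAnonymous fallback to stay short.

```go
package main

import (
	"fmt"
	"net/http"
)

// resolveTenant is a hypothetical, side-effect-free restatement of the
// middleware's checks. It returns the chosen tenant and an HTTP status.
// The DefaultTenant / AllowAnonymous fallback is intentionally omitted.
func resolveTenant(headerTenant, jwtTenant string, requireJWT bool) (string, int) {
	// Both sources present but different: reject.
	if headerTenant != "" && jwtTenant != "" && headerTenant != jwtTenant {
		return "", http.StatusForbidden
	}
	// Header-only tenant while the JWT is mandatory: reject.
	if requireJWT && jwtTenant == "" && headerTenant != "" {
		return "", http.StatusUnauthorized
	}
	// The JWT wins when present; otherwise fall back to the header.
	if jwtTenant != "" {
		return jwtTenant, http.StatusOK
	}
	return headerTenant, http.StatusOK
}

func main() {
	cases := []struct {
		header, jwt string
		requireJWT  bool
	}{
		{"tenant-a", "tenant-a", true},  // both agree
		{"tenant-b", "tenant-a", false}, // mismatch
		{"tenant-b", "", true},          // header only, JWT required
		{"tenant-b", "", false},         // header only, JWT optional
	}
	for _, c := range cases {
		tenant, status := resolveTenant(c.header, c.jwt, c.requireJWT)
		fmt.Printf("header=%q jwt=%q requireJWT=%v -> tenant=%q status=%d\n",
			c.header, c.jwt, c.requireJWT, tenant, status)
	}
}
```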

Step 3: Database Schema

Every table includes tenant_id as part of its primary key:

CREATE TABLE orders (
    tenant_id   VARCHAR(64)   NOT NULL,
    id          BIGSERIAL     NOT NULL,
    customer_id BIGINT        NOT NULL,
    total       NUMERIC(12,2) NOT NULL,
    status      VARCHAR(32)   NOT NULL DEFAULT 'pending',
    created_at  TIMESTAMPTZ   NOT NULL DEFAULT NOW(),

    PRIMARY KEY (tenant_id, id)
);

Why in the PK?

Order #1001 in Tenant A and Order #1001 in Tenant B are different records
The primary-key B-tree on (tenant_id, id) naturally supports tenant-filtered queries
Composite keys work with dynamic tenants, with no partitioning overhead

Index design

Always put tenant_id as the leading column:

-- CORRECT: index range scan per tenant
CREATE INDEX orders_tenant_status_idx
    ON orders (tenant_id, status, created_at DESC);

-- WRONG: scans matching entries for every tenant, then filters
CREATE INDEX orders_status_tenant_idx
    ON orders (status, created_at DESC, tenant_id);

Step 4: RLS Policies

Enable RLS and create a policy bound to a PostgreSQL session variable:

ALTER TABLE orders ENABLE ROW LEVEL SECURITY;
ALTER TABLE orders FORCE ROW LEVEL SECURITY;

CREATE POLICY tenant_isolation ON orders
    USING (tenant_id = current_setting('app.current_tenant', true))
    WITH CHECK (tenant_id = current_setting('app.current_tenant', true));

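How it works:

USING: SELECT, UPDATE, and DELETE only see rows where tenant_id matches the session variable
WITH CHECK: INSERT and UPDATE are rejected if the new row's tenant_id does not match
current_setting(..., true): returns NULL when the variable is not set. Since NULL = anything never evaluates to true in SQL, zero rows are visible by default (fail-closed). On a pooled connection where an earlier transaction already ran SET LOCAL, the variable may read as an empty string rather than NULL; that also matches no row, because tenant_id is never empty
FORCE: applies RLS to the table owner as well. Without it, an application user that owns the table bypasses RLS. Superusers and roles with BYPASSRLS bypass RLS even with FORCE, so the application must not connect as one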
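The fail-closed behavior of the policy predicate can be illustrated with a toy model in Go. This is not how PostgreSQL evaluates policies internally; rowVisible and visibleRows are hypothetical names, and a nil pointer stands in for SQL NULL.

```go
package main

import "fmt"

// rowVisible models the predicate tenant_id = current_setting(..., true).
// A nil setting stands for SQL NULL: NULL = x is never true, so the row is hidden.
func rowVisible(rowTenant string, setting *string) bool {
	if setting == nil {
		return false
	}
	return rowTenant == *setting
}

// visibleRows returns the tenant IDs of the rows that pass the predicate.
func visibleRows(rowTenants []string, setting *string) []string {
	var out []string
	for _, t := range rowTenants {
		if rowVisible(t, setting) {
			out = append(out, t)
		}
	}
	return out
}

func main() {
	rows := []string{"tenant-a", "tenant-b", "tenant-a"}

	a := "tenant-a"
	empty := ""

	fmt.Println("setting = tenant-a:", visibleRows(rows, &a))
	fmt.Println("setting = NULL:    ", visibleRows(rows, nil))
	fmt.Println("setting = '':      ", visibleRows(rows, &empty))
}
```

Both the unset (NULL) case and the empty-string case end up with no visible rows, which is the property the policy relies on.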

Automating it in Go

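// enableRLS enables and forces RLS on each table and creates its policy.
// Table names are interpolated into the DDL unquoted, so they must come
// from trusted code, never from user input.
func enableRLS(ctx context.Context, db *sql.DB, tables []string) error {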
    for _, table := range tables {
        stmts := []string{
            fmt.Sprintf(`ALTER TABLE %s ENABLE ROW LEVEL SECURITY`, table),
            fmt.Sprintf(`ALTER TABLE %s FORCE ROW LEVEL SECURITY`, table),
        }
        for _, s := range stmts {
            if _, err := db.ExecContext(ctx, s); err != nil {
                return fmt.Errorf("enabling RLS on %s: %w", table, err)
            }
        }

        // Drop any previous version first so the function can be re-run safely.
        drop := fmt.Sprintf(
            `DROP POLICY IF EXISTS tenant_isolation_%s ON %s`, table, table,
        )
        if _, err := db.ExecContext(ctx, drop); err != nil {
            return fmt.Errorf("dropping policy on %s: %w", table, err)
        }

        policy := fmt.Sprintf(`
            CREATE POLICY tenant_isolation_%s ON %s
                USING (tenant_id = current_setting('app.current_tenant', true))
                WITH CHECK (tenant_id = current_setting('app.current_tenant', true))
        `, table, table)
        if _, err := db.ExecContext(ctx, policy); err != nil {
            return fmt.Errorf("creating policy on %s: %w", table, err)
        }
    }
    return nil
}
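Because enableRLS builds DDL with fmt.Sprintf, the table names end up in the SQL text verbatim. One possible hardening, sketched below, is to quote identifiers before interpolating them. The helpers quoteIdent and rlsStatements are hypothetical; quoteIdent follows PostgreSQL's quoted-identifier rule (wrap in double quotes, double any embedded double quote).

```go
package main

import (
	"fmt"
	"strings"
)

// quoteIdent wraps a name in double quotes and doubles embedded quotes,
// following PostgreSQL's quoted-identifier syntax. Hypothetical helper.
func quoteIdent(name string) string {
	return `"` + strings.ReplaceAll(name, `"`, `""`) + `"`
}

// rlsStatements returns the DDL statements for one table, in execution order.
func rlsStatements(table string) []string {
	t := quoteIdent(table)
	p := quoteIdent("tenant_isolation_" + table)
	return []string{
		"ALTER TABLE " + t + " ENABLE ROW LEVEL SECURITY",
		"ALTER TABLE " + t + " FORCE ROW LEVEL SECURITY",
		"DROP POLICY IF EXISTS " + p + " ON " + t,
		"CREATE POLICY " + p + " ON " + t +
			" USING (tenant_id = current_setting('app.current_tenant', true))" +
			" WITH CHECK (tenant_id = current_setting('app.current_tenant', true))",
	}
}

func main() {
	for _, stmt := range rlsStatements("orders") {
		fmt.Println(stmt + ";")
	}
	// A hostile name stays inside a single quoted identifier.
	fmt.Println(quoteIdent(`orders"; DROP TABLE users; --`))
}
```

Two caveats with this sketch: quoted identifiers are case-sensitive, and a schema-qualified name such as public.orders would be treated as one identifier, so schema and table would need to be quoted separately.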

Step 5: The tenantQuerier Pattern (SET LOCAL)

This is the core of the whole design. Every database operation activates the tenant context via SET LOCAL inside a transaction.

type dbQuerier interface {
    ExecContext(ctx context.Context, query string, args ...any) (sql.Result, error)
    QueryContext(ctx context.Context, query string, args ...any) (*sql.Rows, error)
    QueryRowContext(ctx context.Context, query string, args ...any) *sql.Row
}

func tenantQuerier(ctx context.Context, db *sql.DB) (dbQuerier, func(), error) {
    tenantID := tenant.GetTenantID(ctx)
    if tenantID == "" {
        return db, func() {}, nil // single-tenant: no transaction needed
    }

    tx, err := db.BeginTx(ctx, &sql.TxOptions{ReadOnly: true})
    if err != nil {
        return nil, nil, fmt.Errorf("begin tenant tx: %w", err)
    }

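    // SET does not accept bind parameters, so use set_config with
    // is_local = true, which has the same transaction scope as SET LOCAL.
    if _, err := tx.ExecContext(ctx,
        "SELECT set_config('app.current_tenant', $1, true)", tenantID,
    ); err != nil {
        tx.Rollback()
        return nil, nil, fmt.Errorf("set tenant: %w", err)
    }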

    return tx, func() { tx.Commit() }, nil
}
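One detail worth spelling out: PostgreSQL's SET command takes a literal value and does not accept bind parameters such as $1. The built-in function set_config(name, value, is_local) does, and with is_local = true it has the same transaction scope as SET LOCAL. The sketch below shows that call behind a small interface; setTenant, execer, recorder, and recordSetTenant are hypothetical names, and the recorder is a fake that only captures what would be sent to the database.

```go
package main

import (
	"context"
	"database/sql"
	"fmt"
)

// execer is the subset of *sql.Tx needed to set the tenant.
type execer interface {
	ExecContext(ctx context.Context, query string, args ...any) (sql.Result, error)
}

// setTenant scopes app.current_tenant to the current transaction.
// set_config(..., true) behaves like SET LOCAL but takes the value
// as a bind parameter.
func setTenant(ctx context.Context, tx execer, tenantID string) error {
	_, err := tx.ExecContext(ctx,
		"SELECT set_config('app.current_tenant', $1, true)", tenantID)
	if err != nil {
		return fmt.Errorf("set tenant: %w", err)
	}
	return nil
}

// recorder is a fake execer that remembers the last statement it received.
type recorder struct {
	query string
	args  []any
}

func (r *recorder) ExecContext(_ context.Context, query string, args ...any) (sql.Result, error) {
	r.query, r.args = query, args
	return nil, nil
}

// recordSetTenant runs setTenant against the fake and returns what was sent.
func recordSetTenant(tenantID string) (string, []any, error) {
	rec := &recorder{}
	err := setTenant(context.Background(), rec, tenantID)
	return rec.query, rec.args, err
}

func main() {
	query, args, err := recordSetTenant("tenant-a")
	if err != nil {
		panic(err)
	}
	fmt.Println("query:", query)
	fmt.Println("args: ", args)
}
```

In real code the execer would be the *sql.Tx returned by db.BeginTx, so the tenant ID always travels as a parameter and never as part of the SQL text.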

Why `SET LOCAL` and not `SET`?

The most important decision in the whole implementation.

┌──────────────────────────────────────────────────────┐
│            Connection Pool (database/sql)            │
│                                                      │
│  ┌────────┐  ┌────────┐  ┌────────┐  ┌────────┐      │
│  │ Conn 1 │  │ Conn 2 │  │ Conn 3 │  │ Conn 4 │      │
│  └────────┘  └────────┘  └────────┘  └────────┘      │
│      ▲           ▲           ▲           ▲           │
│  Goroutine A Goroutine B Goroutine C Goroutine D     │
│  (Tenant X)  (Tenant Y)  (Tenant X)  (Tenant Z)      │
└──────────────────────────────────────────────────────┘

database/sql reuses connections across goroutines. If you use SET (session scope), the variable persists on the connection after it returns to the pool. The next goroutine that picks up that connection inherits the wrong tenant.

Command	Scope	Pool safety
`SET app.current_tenant = 'X'`	Session	UNSAFE: leaks into the next request
`SET LOCAL app.current_tenant = 'X'`	Transaction	SAFE: the value ends with the transaction

SET LOCAL is cleared automatically on commit/rollback. No manual cleanup, no race conditions.
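The difference between the two scopes can be simulated without a database. The following toy model is an illustration only: the conn type and its methods are hypothetical and merely mimic how a session-level value survives a transaction while a transaction-level value does not.

```go
package main

import "fmt"

// conn is a toy model of one pooled connection.
type conn struct {
	session string // value set with SET (survives the transaction)
	local   string // value set with SET LOCAL (cleared at commit/rollback)
	inTx    bool
}

func (c *conn) begin() { c.inTx = true }

// set models SET: the value sticks to the connection.
func (c *conn) set(v string) { c.session = v }

// setLocal models SET LOCAL: it only has an effect inside a transaction.
func (c *conn) setLocal(v string) {
	if c.inTx {
		c.local = v
	}
}

// commit ends the transaction and discards the transaction-local value.
func (c *conn) commit() {
	c.inTx = false
	c.local = ""
}

// current models what current_setting would read on this connection.
func (c *conn) current() string {
	if c.inTx && c.local != "" {
		return c.local
	}
	return c.session
}

// nextBorrowerSees serves one request for tenant-x, returns the connection
// to the pool, and reports what the next borrower reads before it sets
// anything itself.
func nextBorrowerSees(useLocal bool) string {
	c := &conn{}
	c.begin()
	if useLocal {
		c.setLocal("tenant-x")
	} else {
		c.set("tenant-x")
	}
	c.commit() // the connection goes back to the pool here

	c.begin() // another goroutine borrows the same connection
	defer c.commit()
	return c.current()
}

func main() {
	fmt.Printf("after SET:       next borrower sees %q\n", nextBorrowerSees(false))
	fmt.Printf("after SET LOCAL: next borrower sees %q\n", nextBorrowerSees(true))
}
```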

Why read-only transactions?

&sql.TxOptions{ReadOnly: true} provides:

Data cannot be modified by accident
PostgreSQL knows up front that the transaction will not write (the performance gain is modest, since a transaction that never writes is already cheap)
The transaction exists only to carry SET LOCAL, not to group writes

Step 6: Using the Pattern

Reading

func GetOrder(ctx context.Context, db *sql.DB, orderID int64) (*Order, error) {
    q, done, err := tenantQuerier(ctx, db)
    if err != nil {
        return nil, err
    }
    defer done()

    // No WHERE tenant_id: RLS handles it
    var o Order
    err = q.QueryRowContext(ctx,
        `SELECT id, customer_id, total, status, created_at
         FROM orders WHERE id = $1`,
        orderID,
    ).Scan(&o.ID, &o.CustomerID, &o.Total, &o.Status, &o.CreatedAt)

    if errors.Is(err, sql.ErrNoRows) {
        return nil, ErrNotFound
    }
    if err != nil {
        return nil, err
    }
    return &o, nil
}

If Tenant A requests Order #1001 but it belongs to Tenant B, RLS returns zero rows and the caller gets a 404, indistinguishable from a nonexistent order.
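At the HTTP layer, this means the handler should map ErrNotFound to 404 and never to 403, since a 403 would reveal that the order exists under another tenant. A minimal sketch of that mapping follows; statusFor and errDB are hypothetical names introduced for illustration.

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
)

// ErrNotFound mirrors the sentinel error returned by GetOrder above.
var ErrNotFound = errors.New("not found")

// errDB is a made-up infrastructure error used only in this example.
var errDB = errors.New("connection refused")

// statusFor maps repository errors to HTTP status codes. A row hidden by
// RLS and a row that does not exist both surface as ErrNotFound, so both
// become 404.
func statusFor(err error) int {
	switch {
	case err == nil:
		return http.StatusOK
	case errors.Is(err, ErrNotFound):
		return http.StatusNotFound
	default:
		return http.StatusInternalServerError
	}
}

func main() {
	wrapped := fmt.Errorf("get order 1001: %w", ErrNotFound)

	fmt.Println("nil error:      ", statusFor(nil))
	fmt.Println("wrapped missing:", statusFor(wrapped))
	fmt.Println("database error: ", statusFor(errDB))
}
```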

Writing

Writes already use transactions. Set the tenant once at the start:

func CreateOrder(ctx context.Context, db *sql.DB, order *Order, items []OrderItem) error {
    tenantID := tenant.GetTenantID(ctx)

    tx, err := db.BeginTx(ctx, nil)
    if err != nil {
        return err
    }
    defer tx.Rollback()

    if tenantID != "" {
        // set_config(..., true) is the parameterizable form of SET LOCAL.
        if _, err := tx.ExecContext(ctx,
            "SELECT set_config('app.current_tenant', $1, true)", tenantID,
        ); err != nil {
            return fmt.Errorf("set tenant: %w", err)
        }
    }

    var orderID int64
    err = tx.QueryRowContext(ctx,
        `INSERT INTO orders (tenant_id, customer_id, total, status)
         VALUES ($1, $2, $3, $4) RETURNING id`,
        tenantID, order.CustomerID, order.Total, order.Status,
    ).Scan(&orderID)
    if err != nil {
        return err
    }

    for _, item := range items {
        _, err := tx.ExecContext(ctx,
            `INSERT INTO order_items (tenant_id, order_id, product_id, quantity, price)
             VALUES ($1, $2, $3, $4, $5)`,
            tenantID, orderID, item.ProductID, item.Quantity, item.Price,
        )
        if err != nil {
            return err
        }
    }

    return tx.Commit()
}

Listing

func ListOrders(ctx context.Context, db *sql.DB, status string, limit int) ([]Order, error) {
    q, done, err := tenantQuerier(ctx, db)
    if err != nil {
        return nil, err
    }
    defer done()

    rows, err := q.QueryContext(ctx,
        `SELECT id, customer_id, total, status, created_at
         FROM orders WHERE status = $1
         ORDER BY created_at DESC LIMIT $2`,
        status, limit,
    )
    if err != nil {
        return nil, err
    }
    defer rows.Close()

    var orders []Order
    for rows.Next() {
        var o Order
        if err := rows.Scan(&o.ID, &o.CustomerID, &o.Total, &o.Status, &o.CreatedAt); err != nil {
            return nil, err
        }
        orders = append(orders, o)
    }
    return orders, rows.Err()
}

No tenant filter in the SQL. RLS takes care of it.

Step 7: Repository Pattern (Production)

Encapsulate the pattern for production code:

type OrderRepo struct {
    db              *sql.DB
    multiTenantMode bool
}

func NewOrderRepo(db *sql.DB, multiTenant bool) *OrderRepo {
    return &OrderRepo{db: db, multiTenantMode: multiTenant}
}

func (r *OrderRepo) querier(ctx context.Context) (dbQuerier, func(), error) {
    if !r.multiTenantMode {
        return r.db, func() {}, nil
    }
    return tenantQuerier(ctx, r.db)
}

func (r *OrderRepo) beginTx(ctx context.Context) (*sql.Tx, error) {
    tx, err := r.db.BeginTx(ctx, nil)
    if err != nil {
        return nil, err
    }

    if r.multiTenantMode {
        tenantID := tenant.GetTenantID(ctx)
        if tenantID != "" {
            if _, err := tx.ExecContext(ctx,
                "SELECT set_config('app.current_tenant', $1, true)", tenantID,
            ); err != nil {
                tx.Rollback()
                return nil, fmt.Errorf("set tenant: %w", err)
            }
        }
    }

    return tx, nil
}

func (r *OrderRepo) GetByID(ctx context.Context, id int64) (*Order, error) {
    q, done, err := r.querier(ctx)
    if err != nil {
        return nil, err
    }
    defer done()

    var o Order
    err = q.QueryRowContext(ctx,
        `SELECT id, customer_id, total, status FROM orders WHERE id = $1`, id,
    ).Scan(&o.ID, &o.CustomerID, &o.Total, &o.Status)

    if errors.Is(err, sql.ErrNoRows) {
        return nil, ErrNotFound
    }
    if err != nil {
        return nil, err
    }
    return &o, nil
}

The multiTenantMode flag gives zero overhead when multi-tenancy is disabled: querier() returns the *sql.DB directly.

Security Analysis

Threat model

Threat	Mitigation
Missing `WHERE tenant_id`	RLS enforced at the DB level
Header spoofing	`RequireTenantInJWT` forces the JWT
Header/JWT mismatch	Cross-validation rejects the request
SQL injection via tenant ID	Regex + tenant ID passed as a bind parameter
Variable leaking in the pool	`SET LOCAL` is cleared automatically
Table owner bypasses RLS	`FORCE ROW LEVEL SECURITY` (superusers and `BYPASSRLS` roles still bypass it, so the app must not connect as one)
Tenant not set	`current_setting` → NULL → zero rows

Fail-closed guarantee

Every failure leads to zero data access, never to cross-tenant access:

Variable not set → NULL → no rows visible
Invalid JWT → middleware rejects before reaching the DB
Header and JWT disagree → 403

As long as the application connects with a role that is subject to RLS (not a superuser, no BYPASSRLS), a forgotten filter or an unset tenant cannot expose cross-tenant data.

Performance

Mode	Read	Write
Single-tenant	Direct query	Regular TX
Multi-tenant	Read-only TX + `SET LOCAL` + query + commit	Regular TX + `SET LOCAL`

The cost: each read now issues BEGIN, SET LOCAL, and COMMIT around the query, and each of those is a round-trip to the server. SET LOCAL itself is typically sub-millisecond on the server: it operates on in-memory GUC variables, with no disk I/O. In practice the statement cost is small next to the network latency of the query itself, but the extra round-trips do add up on high-latency links.

Testing

Basic isolation

func TestOrderIsolation(t *testing.T) {
    db := setupTestDB(t)
    if err := enableRLS(context.Background(), db, []string{"orders"}); err != nil {
        t.Fatal(err)
    }

    ctxA, _ := tenant.WithTenantID(context.Background(), "tenant-a")
    createTestOrder(t, db, ctxA, "order-1")

    ctxB, _ := tenant.WithTenantID(context.Background(), "tenant-b")
    createTestOrder(t, db, ctxB, "order-2")

    // Tenant A only sees its own order
    orders := listOrders(t, db, ctxA)
    assert.Len(t, orders, 1)
    assert.Equal(t, "order-1", orders[0].Name)

    // Tenant B only sees its own order
    orders = listOrders(t, db, ctxB)
    assert.Len(t, orders, 1)
    assert.Equal(t, "order-2", orders[0].Name)
}

Concurrent isolation

func TestConcurrentTenantIsolation(t *testing.T) {
    db := setupTestDB(t)
    if err := enableRLS(context.Background(), db, []string{"orders"}); err != nil {
        t.Fatal(err)
    }

    for i := 0; i < 10; i++ {
        tid := fmt.Sprintf("tenant-%d", i)
        ctx, _ := tenant.WithTenantID(context.Background(), tid)
        createTestOrder(t, db, ctx, fmt.Sprintf("order-for-%s", tid))
    }

    var wg sync.WaitGroup
    errs := make(chan error, 100)

    for i := 0; i < 100; i++ {
        wg.Add(1)
        go func(n int) {
            defer wg.Done()
            tid := fmt.Sprintf("tenant-%d", n%10)
            ctx, _ := tenant.WithTenantID(context.Background(), tid)

            orders := listOrders(t, db, ctx)
            if len(orders) != 1 {
                errs <- fmt.Errorf("tenant %s: expected 1 order, got %d", tid, len(orders))
                return
            }
            if !strings.Contains(orders[0].Name, tid) {
                errs <- fmt.Errorf("tenant %s: order name %q doesn't match", tid, orders[0].Name)
            }
        }(i)
    }
    wg.Wait()
    close(errs)

    for err := range errs {
        t.Error(err)
    }
}

100 goroutines sharing the same connection pool. SET LOCAL prevents cross-contamination. Note that failures are reported through a channel: t.Fatal/t.FailNow are not safe to call from goroutines other than the one running the test, so the listOrders helper must not call them either.
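The error-collection technique used in the concurrent test can be extracted into a small reusable helper. The sketch below is independent of the database; runConcurrently and errCheck are hypothetical names, and the check function stands in for the per-tenant query and assertion.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// errCheck is a made-up error used only to demonstrate the helper.
var errCheck = errors.New("check failed")

// runConcurrently runs check(i) in n goroutines and returns every error
// they produced. Errors travel over a buffered channel so that no
// goroutine blocks on send and nothing calls t.Fatal off the test goroutine.
func runConcurrently(n int, check func(i int) error) []error {
	var wg sync.WaitGroup
	errs := make(chan error, n)

	for i := 0; i < n; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			if err := check(i); err != nil {
				errs <- err
			}
		}(i)
	}

	wg.Wait()
	close(errs) // safe: every sender has finished

	var out []error
	for err := range errs {
		out = append(out, err)
	}
	return out
}

func main() {
	// Fail every tenth check to show that all failures are collected.
	failures := runConcurrently(100, func(i int) error {
		if i%10 == 0 {
			return fmt.Errorf("goroutine %d: %w", i, errCheck)
		}
		return nil
	})
	fmt.Println("failures collected:", len(failures))
}
```

In a test, the caller would loop over the returned slice and call t.Error for each entry from the test goroutine itself.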

5 Common Mistakes

1. `SET` instead of `SET LOCAL`

// BAD: session scope (equivalent to SET), leaks through the pool
tx.ExecContext(ctx, "SELECT set_config('app.current_tenant', $1, false)", tenantID)

// GOOD: transaction scope (equivalent to SET LOCAL), cleared automatically
tx.ExecContext(ctx, "SELECT set_config('app.current_tenant', $1, true)", tenantID)

2. Forgetting `FORCE ROW LEVEL SECURITY`

-- Without FORCE, the table owner bypasses RLS
ALTER TABLE orders ENABLE ROW LEVEL SECURITY;

-- With FORCE, even the owner is subject to policies
ALTER TABLE orders FORCE ROW LEVEL SECURITY;

3. Not parameterizing `SET LOCAL`

// BAD: SQL injection
tx.ExecContext(ctx, fmt.Sprintf("SET LOCAL app.current_tenant = '%s'", tenantID))

// GOOD: parameterized. SET itself does not accept bind parameters,
// so use set_config with is_local = true
tx.ExecContext(ctx, "SELECT set_config('app.current_tenant', $1, true)", tenantID)

4. `tenant_id` last in indexes

-- BAD: scans all tenants
CREATE INDEX idx ON orders (status, tenant_id);

-- GOOD: scans only one tenant
CREATE INDEX idx ON orders (tenant_id, status);

5. Forgetting `defer done()`

// BAD: the TX can be left open
q, done, err := tenantQuerier(ctx, db)

// GOOD: defer immediately
q, done, err := tenantQuerier(ctx, db)
if err != nil { return err }
defer done()

Summary

The full pattern in four lines:

// Error handling omitted for brevity.
tx, _ := db.BeginTx(ctx, &sql.TxOptions{ReadOnly: true})                    // 1. Begin TX
tx.ExecContext(ctx, "SELECT set_config('app.current_tenant', $1, true)", t) // 2. Activate tenant (SET LOCAL scope)
rows, _ := tx.QueryContext(ctx, "SELECT * FROM orders")                     // 3. Query (RLS filters)
tx.Commit()                                                                 // 4. Automatic cleanup

PostgreSQL does the rest. No WHERE tenant_id = ?. No app-level filtering. No breaches from forgotten filters.

SET LOCAL + database/sql connection pooling is the safe combination. SET leaks state. SET LOCAL does not. That is the whole difference between a secure system and a data breach.

We are GophersCL, the Go community in Chile. Follow us on dev.to for more content about Go in Latin America.
