DEV Community

Olivia Craft

Cursor Rules for PostgreSQL: The Complete Guide to AI-Assisted Database Development

PostgreSQL is the database where "it works in staging" is a promise that survives exactly as long as the data volume in staging resembles production — which is to say, almost never. The service ships. The dashboard loads in 200ms on 50k rows. Two months later, at 20 million rows, the same endpoint times out at 30 seconds because the query does a sequential scan on a table that's larger than shared_buffers, because the index somebody added is on (created_at) when every query filters by (tenant_id, created_at), because SELECT * pulls back a jsonb column averaging 40KB that nobody needed, because the migration that added a NOT NULL column ran ALTER TABLE ... SET DEFAULT and rewrote every row while holding an ACCESS EXCLUSIVE lock, and because the "connection pool" is a fresh pg.Client per request so under load the database sits at 500 connections and rejects new ones while the pods wait. The database doesn't get slower — the query plan changes. And by the time you notice, the explanation is buried in fifteen commits by five people and three migrations that each seemed fine in isolation.

Then you add an AI assistant.

Cursor and Claude Code were trained on a decade of PostgreSQL content, most of it from tutorials that teach correct SQL but do not teach EXPLAIN, NOT NULL, foreign key indexes, or pg_stat_statements. Ask for "a table for orders with line items," and you get CREATE TABLE orders (id SERIAL PRIMARY KEY, user_id INT, total FLOAT): SERIAL instead of BIGINT GENERATED ALWAYS AS IDENTITY, INT instead of a typed reference with ON DELETE semantics, FLOAT for money, no created_at, no NOT NULL, no unique constraint on the natural key. Ask for "insert a user," and you get INSERT INTO users(email) VALUES('${email}') with string interpolation because the example on Stack Overflow does it. Ask for "find users by email," and you get SELECT * FROM users WHERE LOWER(email) = LOWER($1), which bypasses the plain index on email because LOWER() defeats index usage and no one wrote a functional index.

The fix is .cursorrules — one file in the repo that tells the AI what idiomatic, production-grade PostgreSQL looks like. Eight rules below, each with the failure mode, the rule, and a before/after. Copy-paste .cursorrules at the end. Examples use the pg Node driver; the rules apply equally to asyncpg (Python), pgx (Go), sqlx (Rust), and to ORM layers like Prisma, Drizzle, and Kysely.


How Cursor Rules Work for PostgreSQL Projects

Cursor reads project rules from two locations: .cursorrules (a single file at the repo root, still supported) and .cursor/rules/*.mdc (modular files with frontmatter, recommended). For PostgreSQL I recommend modular rules so the schema/DDL conventions don't bleed into application query code, and so test conventions are scoped to integration test files only:

.cursor/rules/
  pg-schema.mdc         # types, constraints, identifiers, naming
  pg-migrations.mdc     # forward-only, non-blocking DDL, expand/contract
  pg-indexes.mdc        # compound indexes, partial, covering, FK indexes
  pg-queries.mdc        # EXPLAIN, no SELECT *, bounded result sets
  pg-pool.mdc           # pool sizing, no per-request connect, pgBouncer
  pg-transactions.mdc   # explicit BEGIN, isolation, advisory locks
  pg-driver.mdc         # parameterized everything, typed rows
  pg-testing.mdc        # real Postgres (testcontainers), truncate vs transaction

Frontmatter controls activation: globs: ["**/*.{sql,ts,js,py}", "**/migrations/**"] with alwaysApply: false. Now the rules.
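As a concrete sketch, a pg-queries.mdc might look like this (the description, globs, and bullet wording here are illustrative, not prescriptive):

```
---
description: Query-writing conventions for PostgreSQL
globs: ["src/**/*.{ts,js}", "**/*.sql"]
alwaysApply: false
---

- EXPLAIN (ANALYZE, BUFFERS) every new query before merge.
- No SELECT *; list columns explicitly.
- Every query that can return many rows gets LIMIT + keyset pagination.
```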


Rule 1: Schema Design — Types That Match the Domain, Constraints That Enforce Invariants

The first sin of AI-generated PostgreSQL is always the same: SERIAL primary keys (superseded since PG10 by identity columns), FLOAT for monetary amounts, VARCHAR(255) out of MySQL muscle memory (in Postgres TEXT is the same storage, without the arbitrary cap), TIMESTAMP without time zone for a column that's clearly a point in time, INT foreign keys with no REFERENCES, no NOT NULL because "we'll fill it in later," and no CHECK constraints anywhere. The schema is the contract between every service and the data; it is the one place where you can enforce invariants that no amount of application code will catch. Cursor won't add a CHECK (total > 0) unless you tell it to.

The rule:

IDENTIFIERS
- Primary keys: `id BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY`
  (or UUID where distribution / public exposure matters). Never SERIAL.
- UUIDs use `uuid` type and `gen_random_uuid()` (requires `pgcrypto`
  extension). Never text columns containing UUIDs.
- Foreign keys ALWAYS declare REFERENCES with explicit ON DELETE
  behavior (RESTRICT by default; CASCADE / SET NULL only when intentional).

TYPES
- Money: NUMERIC(precision, scale), never FLOAT/DOUBLE. Rates: NUMERIC.
- Timestamps: TIMESTAMPTZ (always). Never TIMESTAMP without time zone.
  Store UTC; convert at the display boundary.
- Strings: TEXT, constrained by CHECK (char_length(x) BETWEEN a AND b).
  Never VARCHAR(n) unless the length is a real domain constraint.
- Enums: native PG ENUM when values are stable; VARCHAR+CHECK when they
  may evolve; lookup TABLE + FK when values need metadata.
- JSON: JSONB (never JSON). Only when the shape is genuinely variable;
  columns are cheaper than JSONB keys.

CONSTRAINTS
- NOT NULL by default. Nullability is an explicit, justified decision.
- CHECK constraints for domain invariants: total >= 0,
  status IN ('pending','paid','cancelled'), email matches regex.
- UNIQUE constraints for every natural key (email, tenant+slug).
- created_at TIMESTAMPTZ NOT NULL DEFAULT now() on every business table;
  updated_at TIMESTAMPTZ NOT NULL DEFAULT now() with a trigger.

NAMING
- snake_case, plural tables (users, orders), singular columns.
- FK: `<referenced>_id` (user_id, order_id). Indexes: `<table>_<cols>_idx`.
  Constraints: `<table>_<cols>_key` / `<table>_<cols>_check`.
- Schemas: `app` for production tables; `app_audit` for audit; `app_temp`
  for ephemera. Never tables in `public` beyond extensions.

Before — Mongo-in-Postgres, no constraints, FLOAT money:

CREATE TABLE orders (
  id SERIAL PRIMARY KEY,
  user_id INT,
  total FLOAT,
  status VARCHAR(255),
  created_at TIMESTAMP
);

user_id has no FK — delete a user and orphaned orders persist. total stored as FLOAT means 0.1 + 0.2 != 0.3 in any aggregation. status can be literally anything. created_at loses timezone.

After — typed, constrained, auditable:

CREATE TYPE order_status AS ENUM ('pending', 'paid', 'shipped', 'cancelled', 'refunded');

CREATE TABLE app.orders (
  id          BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
  user_id     BIGINT NOT NULL REFERENCES app.users(id) ON DELETE RESTRICT,
  total       NUMERIC(12, 2) NOT NULL CHECK (total >= 0),
  currency    CHAR(3) NOT NULL CHECK (currency ~ '^[A-Z]{3}$'),
  status      order_status NOT NULL DEFAULT 'pending',
  created_at  TIMESTAMPTZ NOT NULL DEFAULT now(),
  updated_at  TIMESTAMPTZ NOT NULL DEFAULT now()
);

CREATE TRIGGER orders_set_updated_at
  BEFORE UPDATE ON app.orders FOR EACH ROW
  EXECUTE FUNCTION app.set_updated_at();

The schema now enforces invariants the application can't lie about. total < 0 is impossible. status is closed. Timezone is preserved.
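The trigger above assumes an app.set_updated_at() function already exists; a minimal version is a one-liner in plpgsql (sketch, define once per schema):

```sql
CREATE OR REPLACE FUNCTION app.set_updated_at()
RETURNS trigger AS $$
BEGIN
  -- Overwrite whatever the client sent; updated_at is server-owned.
  NEW.updated_at := now();
  RETURN NEW;
END;
$$ LANGUAGE plpgsql;
```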


Rule 2: Migrations — Forward-Only, Idempotent, Non-Blocking Under Load

The most dangerous migration is the one that ran fine on 50k rows in staging and held an ACCESS EXCLUSIVE lock for 47 minutes on the 20M-row production table. Cursor writes ALTER TABLE orders ADD COLUMN region TEXT NOT NULL DEFAULT 'US' — which on PG11+ is fast for the default, but ALTER TABLE orders ADD COLUMN region TEXT NOT NULL with a later UPDATE ... SET region = ... is a full table rewrite under lock. Worse: CREATE INDEX without CONCURRENTLY blocks writes for the duration. Worst: ALTER TABLE ... DROP COLUMN in the same deploy as the code that still reads that column. The rule is expand/contract: ship the additive change, deploy the code, ship the contraction.

The rule:

FORWARD-ONLY
- No `down` migrations in production. "Down" is a new forward migration
  that reverts. Rollbacks happen via `revert_<timestamp>_...` files.
- Every migration has a unique monotonic id (timestamp prefix) and is
  idempotent where possible (CREATE TABLE IF NOT EXISTS is not safe —
  schemas must match; use a tool like sqlx/migrate/node-pg-migrate/atlas
  that tracks applied migrations).

NON-BLOCKING DDL
- `CREATE INDEX CONCURRENTLY` — never bare CREATE INDEX on a populated
  production table. CONCURRENTLY cannot run in a transaction, so isolate
  it in its own migration file.
- `ADD COLUMN ... DEFAULT ...` is fast on PG11+ for a constant default.
  Safe. `ADD COLUMN ... DEFAULT non_constant()` is a table rewrite —
  use expand/contract.
- `ADD COLUMN ... NOT NULL` without a default fails outright on a
  populated table, and `SET NOT NULL` scans the whole table under lock.
  Do it in three steps: ADD nullable -> backfill in batches ->
  SET NOT NULL (on PG12+ a validated CHECK (col IS NOT NULL) lets
  SET NOT NULL skip the scan).
- `ALTER COLUMN ... TYPE ...` often rewrites. Cast-free changes (e.g.
  VARCHAR->TEXT) are fast; others need a new column + backfill + drop.

EXPAND/CONTRACT for breaking changes
  1. Expand: add new column/table/constraint as nullable/optional.
  2. Dual-write application: write both old and new.
  3. Backfill old data.
  4. Dual-read -> read only new.
  5. Contract: drop old column in a SEPARATE migration, SEPARATE deploy.

DEFAULTS
- Every migration starts with `SET lock_timeout = '5s';
  SET statement_timeout = '60min';` (or per-project). Avoid 10am-on-a-
  Tuesday surprises.
- Destructive operations (DROP TABLE, DROP COLUMN, TRUNCATE) require
  a `-- INTENT: irreversible data loss` comment and reviewer signoff.
  Never paired with a non-destructive change in the same file.

TESTING
- Every migration runs in CI against a real Postgres with representative
  data volume (not an empty schema). Migration + rollback-forward are
  both tested.

Before — single-step NOT NULL addition, blocking lock, no timeout:

ALTER TABLE orders ADD COLUMN region TEXT NOT NULL DEFAULT 'US';
UPDATE orders SET region = user_region(user_id); -- seq scan, hours
CREATE INDEX ON orders(region);                   -- blocks writes

The UPDATE runs a function per row and locks the table; the CREATE INDEX blocks writes on the same table. The production deploy stalls.

After — three migrations, concurrent index, batched backfill:

-- 20260422_100000_add_region_nullable.sql
SET lock_timeout = '5s';
ALTER TABLE app.orders ADD COLUMN region TEXT;

-- 20260422_100100_index_region_concurrently.sql
-- NOTE: must NOT be in a transaction
CREATE INDEX CONCURRENTLY IF NOT EXISTS orders_region_idx ON app.orders(region);

-- 20260422_101500_backfill_region.sql
-- run as a job, in batches of 5k, not a single UPDATE
-- pseudo-SQL; real backfill is app-code
UPDATE app.orders SET region = 'US' WHERE id BETWEEN :lo AND :hi AND region IS NULL;

-- 20260423_090000_region_not_null.sql  (after backfill complete + deploy)
SET lock_timeout = '5s';
ALTER TABLE app.orders ALTER COLUMN region SET NOT NULL;

Each migration is short, acquires a minimal lock, and is safe to re-run.
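The backfill step above says "real backfill is app-code"; one way to sketch that app code (names like `backfillRegion` and the 50ms pause are illustrative assumptions, and `exec` stands in for `pool.query` with a short statement_timeout):

```typescript
// App-side batched backfill: walk the id space in fixed-size ranges so
// each UPDATE touches a bounded number of rows and holds locks briefly.
type Range = { lo: number; hi: number };

export function batchRanges(maxId: number, batchSize: number): Range[] {
  const ranges: Range[] = [];
  for (let lo = 1; lo <= maxId; lo += batchSize) {
    ranges.push({ lo, hi: Math.min(lo + batchSize - 1, maxId) });
  }
  return ranges;
}

export async function backfillRegion(
  maxId: number,
  batchSize: number,
  // exec is injected so the loop is testable; in production it would be
  // pool.query against the real database.
  exec: (sql: string, params: number[]) => Promise<void>,
): Promise<void> {
  for (const { lo, hi } of batchRanges(maxId, batchSize)) {
    await exec(
      `UPDATE app.orders SET region = 'US'
         WHERE id BETWEEN $1 AND $2 AND region IS NULL`,
      [lo, hi],
    );
    // A short pause between batches keeps replication lag in check.
    await new Promise(r => setTimeout(r, 50));
  }
}
```

The range walk keys on the primary key, so each batch is an index-range UPDATE rather than a repeated seq scan.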


Rule 3: Indexes — Compound, Ordered to Match Queries, Partial Where It Helps

A table without the right index is a table with the wrong index. Cursor adds CREATE INDEX ON orders(created_at) because "you'll want to query by date," but the actual query is WHERE tenant_id = $1 AND created_at > $2 — which walks the created_at range and then discards every other tenant's rows one by one, because the index has no tenant_id to lead with. The rule for compound indexes is ESR (Equality, Sort, Range): columns used for equality come first, then the column used for sort/order, then the range column. Plus: foreign keys always get an index on the referencing side (Postgres does NOT auto-index them; only the primary key side is indexed), partial indexes for sparse WHERE clauses can shrink an index by 10×, and covering indexes (INCLUDE) let queries return from index-only scans.

The rule:

CREATE AN INDEX ONLY FOR A REAL QUERY
- Every new index links to a PR where a query uses it. `EXPLAIN` confirms
  it's picked. No speculative indexes — they slow writes without helping
  reads.
- Remove unused indexes. `pg_stat_user_indexes` shows `idx_scan = 0`
  after a month in production -> drop it.

ORDER THE COLUMNS RIGHT (ESR)
- Equality columns first, then sort column, then range column.
- WHERE tenant_id = $1 AND user_id = $2 AND created_at > $3 ORDER BY
  created_at DESC -> INDEX ON (tenant_id, user_id, created_at DESC).
- Multi-column indexes serve multiple queries only when the query uses
  a left-prefix.

FOREIGN KEYS
- Every FK gets an index on the REFERENCING column. Postgres does not
  auto-index it. Missing FK index = slow joins + DELETE on parent
  table is O(n).

PARTIAL INDEXES
- WHERE clauses on a frequently-queried boolean / status subset ->
  CREATE INDEX ... WHERE status = 'pending'. 5-10x smaller, much
  faster.

UNIQUE INDEXES = CONSTRAINTS
- UNIQUE (email) is both an index and a constraint. Use for natural
  keys. LOWER(email) needs a functional unique index:
  CREATE UNIQUE INDEX users_email_lower_key ON users (LOWER(email)).

COVERING / INCLUDE
- Hot read-path queries that return a few columns from an indexed
  lookup: INCLUDE (col1, col2) enables index-only scans.

GIN/GIST/BRIN for specific cases
- GIN for array/jsonb/tsvector. GIST for ranges/geometry. BRIN for
  append-mostly time-series tables (tiny index, large scan still
  filters by min/max per block range).

Before — single-column index, FK unindexed, query does bitmap + seq:

CREATE INDEX ON orders(created_at);
-- Query:
SELECT id FROM orders
  WHERE tenant_id = $1 AND status = 'pending' AND created_at > $2
  ORDER BY created_at DESC LIMIT 50;
-- Plan: Bitmap Heap Scan, filters tenant_id + status in-memory.
-- 200ms for 50k tenants, 20s at tenant scale.

After — ESR-ordered partial compound index, index-only scan, stable latency:

CREATE INDEX orders_tenant_status_created_idx
  ON app.orders (tenant_id, created_at DESC)
  INCLUDE (id)
  WHERE status = 'pending';

EXPLAIN ANALYZE
  SELECT id FROM app.orders
  WHERE tenant_id = $1 AND status = 'pending' AND created_at > $2
  ORDER BY created_at DESC LIMIT 50;
-- Plan: Index Only Scan using orders_tenant_status_created_idx
-- 0.3ms regardless of table size.

Partial index is 10× smaller, compound order matches the query, INCLUDE enables index-only scan.
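The "remove unused indexes" rule is checkable directly from the stats views; a query along these lines (a sketch — column names are from pg_stat_user_indexes, the size ordering is a convenience) surfaces the candidates:

```sql
-- Indexes never scanned since stats were last reset, biggest first.
-- Exclude unique/PK indexes before dropping: they enforce constraints.
SELECT schemaname, relname, indexrelname, idx_scan,
       pg_size_pretty(pg_relation_size(indexrelid)) AS size
FROM pg_stat_user_indexes
WHERE idx_scan = 0
ORDER BY pg_relation_size(indexrelid) DESC;
```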


Rule 4: Query Writing — EXPLAIN Before Merge, No SELECT *, Bound The Result Set

The reason production SQL is slow is rarely that SQL is slow. It is usually that a query returns 10× more rows than the application uses, a join is on an unindexed column, or an ORM emits N+1 queries, one per row. Cursor writes SELECT * FROM orders JOIN users ON users.id = orders.user_id because it reads naturally, but the result set has every column of both tables when the route only needs orders.id and users.email. Add no LIMIT and a test fixture of 100 rows, and you never notice until production has 100M. The rule is to treat every query as a physical plan — read the plan before merge, keep projections narrow, and never run an unbounded query.

The rule:

EXPLAIN EVERY NEW QUERY BEFORE MERGE
- PR template asks for EXPLAIN (ANALYZE, BUFFERS) output against a
  representative dataset (prod-sized clone or pg_restore'd sample).
- Seq Scan on a table > 10k rows without a filter is a red flag.
  Nested Loop with > 10k outer rows is a red flag.

NO SELECT *
- List columns explicitly. Prevents accidental wide returns when new
  columns are added. Makes pg_stat_statements readable.

BOUND EVERY RESULT SET
- Every query that can return many rows has a LIMIT + pagination.
  Keyset pagination (`WHERE id > :cursor ORDER BY id LIMIT :n`) over
  OFFSET. OFFSET is O(n) at the offset.
- COUNT(*) over a large table is expensive. Use approximations
  (pg_class.reltuples) or precomputed counters for dashboards.

AVOID N+1
- Log every query at debug. Tests assert query count for hot paths
  (e.g. `expect(await countQueries(() => getFeed(userId))).toBeLessThan(5)`).
- Use IN / ANY for batch loads. JOIN when you need the joined data;
  otherwise separate query + in-memory merge.

FUNCTIONS THAT KILL INDEXES
- Don't wrap indexed columns in functions: WHERE lower(email) = $1
  needs a functional index OR uses citext.
- Don't cast the indexed side: WHERE created_at::date = ... forces
  seq scan. Use WHERE created_at >= :day AND created_at < :day+1.

WINDOW FUNCTIONS / CTES
- CTEs were optimization fences pre-PG12 (always materialized). Use
  WITH ... AS MATERIALIZED only when the fence is intentional;
  otherwise PG12+ inlines them.
- Window functions for ranking, running totals, deduplication.

pg_stat_statements
- Enable it in production. Review top-10 mean_exec_time every week
  (or when a latency alert fires).

Before — SELECT *, unbounded join, no EXPLAIN, function on indexed column:

SELECT * FROM orders o
JOIN users u ON u.id = o.user_id
WHERE LOWER(u.email) = LOWER($1);

LOWER() kills the users(email) index. Returns every column. No limit — multiple orders per user could return many rows.

After — explicit columns, citext or functional index, bounded:

-- Schema:
CREATE EXTENSION IF NOT EXISTS citext;
ALTER TABLE app.users ALTER COLUMN email TYPE citext;
CREATE UNIQUE INDEX users_email_key ON app.users(email);

-- Query:
SELECT o.id, o.total, o.status, u.email
FROM app.orders o
JOIN app.users u ON u.id = o.user_id
WHERE u.email = $1
ORDER BY o.id DESC
LIMIT 50;

EXPLAIN (ANALYZE, BUFFERS)  -- confirmed: Index Scan using users_email_key + nested loop
...;

Index used. Bounded. Projection narrow. Plan reviewed.
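The keyset-pagination rule from the list above can be sketched as a small helper. The cursor here is just the last row's id, base64url-encoded so it's opaque to clients; the function and type names are illustrative:

```typescript
// Keyset pagination: WHERE id < cursor ORDER BY id DESC LIMIT n walks
// newest-to-oldest without the O(offset) cost of OFFSET.
export function encodeCursor(id: string): string {
  return Buffer.from(id, 'utf8').toString('base64url');
}
export function decodeCursor(cursor: string): string {
  return Buffer.from(cursor, 'base64url').toString('utf8');
}

export function buildPageQuery(cursor: string | null, limit: number) {
  // First page omits the predicate; later pages filter past the cursor.
  const sql = cursor
    ? 'SELECT id, total FROM app.orders WHERE id < $1 ORDER BY id DESC LIMIT $2'
    : 'SELECT id, total FROM app.orders ORDER BY id DESC LIMIT $1';
  const params = cursor ? [decodeCursor(cursor), limit] : [limit];
  return { sql, params };
}
```

The caller returns `encodeCursor(lastRow.id)` as the next-page token; the shape of the WHERE clause matches the primary-key index, so every page is an index range scan.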


Rule 5: Connection Pooling — Size To The Database, Never Connect Per Request

Postgres connections are expensive: each one is an OS process with ~10MB of memory. A typical Postgres tuned for 200 connections falls over at 800. The canonical AI-generated pattern is const client = new Client(...); await client.connect(); await client.query(...); await client.end(); inside every request handler — which at 500 requests/second is 500 connect handshakes per second. The handshakes alone saturate the DB. The fix is a single Pool per process, sized against the database's actual connection limit divided by the number of pods, and — for high-fanout serverless or short-lived pods — a connection pooler (pgBouncer in transaction mode) in front of Postgres.

The rule:

ONE POOL PER PROCESS
- `new Pool({ ... })` once, at composition root. Every query goes
  through the pool. No `new Client()` per request.
- Pool size: conservative. If DB max_connections = 200 and you have
  10 pods, pool.max <= 15 per pod (leaves headroom for superuser,
  migrations, pgBouncer).

IDLE TIMEOUTS
- idleTimeoutMillis: 30_000 releases sockets a LB may have cut.
  connectionTimeoutMillis: 5_000. statement_timeout: 30s (or as
  appropriate per query).
- keep-alive on the socket so LB TCP idle timeout doesn't silently
  kill the connection mid-pool.

pgBouncer IN PRODUCTION
- Transaction-mode pooling with short idle connections reduces the
  effective connection count at Postgres to a fraction.
- Compatibility: no session-scoped features (LISTEN/NOTIFY, prepared
  statements by default, session GUCs). The driver must be configured
  accordingly.
- RDS Proxy / Neon / Supabase provide managed poolers; same rules.

NEVER HOLD A CONNECTION ACROSS AWAITS YOU DON'T NEED
- Acquire from pool just before query; release immediately after.
  `pool.query(sql, params)` is preferred over manual client acquire.
- For a transaction: acquire a client, BEGIN, work, COMMIT, release
  in a try/finally. Never leak a client.

MONITOR
- Export pool stats (pool.totalCount, pool.idleCount, pool.waitingCount).
  Waiting > 0 means the pool is undersized OR a slow query is holding
  connections.
- pg_stat_activity: active long-running queries, idle-in-transaction
  sessions (which are a red flag — always commit or rollback promptly).

Before — per-request client, connection storm at load:

app.get('/orders', async (req, res) => {
  const client = new Client({ connectionString: process.env.DATABASE_URL });
  await client.connect();
  const { rows } = await client.query('SELECT id, total FROM orders WHERE user_id=$1', [req.user.id]);
  await client.end();
  res.json(rows);
});

500 req/s = 500 handshakes/s. DB rejects connections; 5xx spikes.

After — single pool, explicit timeouts, pool.query:

// infra/db.ts
import { Pool } from 'pg';
export const pool = new Pool({
  connectionString: config.DATABASE_URL,
  max: config.DB_POOL_MAX,               // e.g. 10
  idleTimeoutMillis: 30_000,
  connectionTimeoutMillis: 5_000,
  statement_timeout: 30_000,
  application_name: config.APP_NAME,
});

// routes/orders.ts
router.get('/orders', asyncHandler(async (req, res) => {
  const { rows } = await pool.query<OrderSummary>(
    'SELECT id, total FROM app.orders WHERE user_id = $1 ORDER BY id DESC LIMIT 50',
    [req.user!.id],
  );
  res.json(rows);
}));

One handshake per pod startup. Pool queues requests when the DB is saturated. Stats exportable.
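The sizing arithmetic from the rule (max_connections divided across pods, minus headroom) is worth encoding so it lives in config rather than folklore. A sketch, where the default reserve of 20 slots for superuser/migrations/pgBouncer is an assumption to tune per environment:

```typescript
// Pool sizing: share the server's connection budget across pods after
// reserving slots for superuser access, migrations, and poolers.
export function poolMaxPerPod(
  maxConnections: number,
  pods: number,
  reserved = 20, // assumption: tune to your environment
): number {
  const usable = maxConnections - reserved;
  // Never return 0: a pod with no connections can't even health-check.
  return Math.max(1, Math.floor(usable / pods));
}
```

With max_connections = 200 and 10 pods this yields 18 per pod, in line with the "pool.max <= 15" guidance above once you round down further for safety.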


Rule 6: Transactions — Explicit Scope, Correct Isolation, Never Idle

Every ORM quietly runs most queries outside a transaction, and most SQL tutorials say nothing about isolation levels. Cursor writes await db.query('INSERT INTO accounts ...'); await db.query('UPDATE ledger ...'); — two separate statements, two separate transactions, and a crash between them leaves the database inconsistent. The next version wraps them in BEGIN/COMMIT but forgets ROLLBACK on error, so on exception the client is returned to the pool with an open transaction that becomes "idle in transaction" — holding locks until it's reaped. The third version uses SERIALIZABLE isolation because "it's the safest" and under load half the queries abort with serialization failures. The rule is: transactions are explicit, short, correctly isolated, and released in a finally.

The rule:

EXPLICIT TRANSACTIONS
- A `withTransaction(fn)` helper is the only way to open a transaction:
  acquires a client, BEGINs, runs fn(client), COMMITs on success,
  ROLLBACKs on throw, releases client in finally. Never manual
  BEGIN/COMMIT scattered across handlers.
- Transactions are SHORT. No awaiting a third-party HTTP call inside
  a transaction — it holds locks and a DB connection. Do the work
  outside; transaction is the DB-only boundary.

ISOLATION
- Default is READ COMMITTED (Postgres default). Enough for most reads.
- REPEATABLE READ when a transaction reads the same table twice and
  must see a consistent snapshot.
- SERIALIZABLE for money-moving / invariant-preserving writes
  (transfer between accounts). Expect occasional serialization
  failures (SQLSTATE 40001) — caller retries with backoff.
- Mark the isolation on the BEGIN: `BEGIN ISOLATION LEVEL SERIALIZABLE
  READ WRITE;`.

ROW LOCKS
- SELECT ... FOR UPDATE when the row is going to be updated in the
  same transaction (pessimistic). SKIP LOCKED for queue-like patterns.
- Keep read-only work out of long transactions: an open transaction
  pins a snapshot and a pooled connection even when it only reads.

ADVISORY LOCKS
- pg_try_advisory_xact_lock(key) for cross-transaction coordination
  without a physical row (e.g. "only one worker runs this job").
  Keys are bigint; use a hashed namespace + id.

NO IDLE IN TRANSACTION
- `idle_in_transaction_session_timeout` set at the database (e.g. 60s)
  so a leaked tx dies rather than blocking vacuum forever.
- App code observes long-running transactions via pg_stat_activity.

SAVEPOINTS
- For nested operations that may fail without aborting the outer tx:
  SAVEPOINT sp_x; ... RELEASE | ROLLBACK TO sp_x.

Before — implicit mini-transactions, broken invariant, leaked client:

await pool.query('UPDATE accounts SET balance = balance - $1 WHERE id=$2', [amount, from]);
await pool.query('UPDATE accounts SET balance = balance + $1 WHERE id=$2', [amount, to]);
// Crash between = money lost / duplicated.

No atomicity, no isolation, no lock. The classic double-entry bookkeeping failure mode.

After — withTransaction helper, serializable retry, short scope:

export async function withTransaction<T>(
  fn: (tx: PoolClient) => Promise<T>,
  opts: { isolation?: 'READ COMMITTED' | 'REPEATABLE READ' | 'SERIALIZABLE' } = {},
): Promise<T> {
  const client = await pool.connect();
  try {
    await client.query(`BEGIN${opts.isolation ? ` ISOLATION LEVEL ${opts.isolation}` : ''}`);
    const result = await fn(client);
    await client.query('COMMIT');
    return result;
  } catch (err) {
    await client.query('ROLLBACK').catch(() => {});
    throw err;
  } finally {
    client.release();
  }
}

export async function transfer(from: string, to: string, amount: string) {
  return withRetry(() =>
    withTransaction(async tx => {
      const debited = await tx.query(
        'UPDATE app.accounts SET balance = balance - $1 WHERE id = $2 AND balance >= $1',
        [amount, from],
      );
      if (debited.rowCount === 0) throw new ConflictError('insufficient funds');
      await tx.query('UPDATE app.accounts SET balance = balance + $1 WHERE id = $2', [amount, to]);
    }, { isolation: 'SERIALIZABLE' }),
  );
}

Atomic. Correctly isolated. Client always released. Retries serialization failures.
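The transfer above calls a withRetry helper that isn't shown. A minimal sketch: retry only on SQLSTATE 40001 (serialization failure), with capped attempts and jittered exponential backoff. The `err.code` check matches the error shape node-postgres throws; the attempt and delay defaults are assumptions:

```typescript
// Retry wrapper for SERIALIZABLE transactions: serialization failures
// are expected under contention and safe to retry from the top.
export async function withRetry<T>(
  fn: () => Promise<T>,
  maxAttempts = 3,
  baseDelayMs = 50,
): Promise<T> {
  for (let attempt = 1; ; attempt++) {
    try {
      return await fn();
    } catch (err: any) {
      // Anything other than 40001 (or too many attempts) propagates.
      if (err?.code !== '40001' || attempt >= maxAttempts) throw err;
      const delay = baseDelayMs * 2 ** (attempt - 1) * (0.5 + Math.random());
      await new Promise(r => setTimeout(r, delay));
    }
  }
}
```

Because the whole transaction re-runs, fn must be side-effect-free outside the database — which is another reason to keep HTTP calls out of the transaction boundary.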


Rule 7: Parameterized Everything — No String Interpolation, Typed Rows Out

Every SQL injection bug in history is a string interpolation. Cursor writes db.query('SELECT * FROM users WHERE email = \'' + email + '\'') because the example in the tutorial did. Even with a "clean" input it is wrong — numbers become strings, dates become strings, booleans become strings, and the server can't reuse prepared statements because the SQL text differs per call. Use parameter placeholders. Full stop. And on the way out: typed row shapes — pool.query<UserRow>(...) with a Zod or io-ts validation at the boundary so runtime drift from schema evolution surfaces in tests, not in production.

The rule:

PARAMETERIZED ALWAYS
- `pool.query('... WHERE x = $1', [value])`. Never string concatenation,
  never template literals with values, never `format()` helpers.
- Identifiers that MUST be dynamic (rare) use a safelist + an
  identifier quoting helper (pg-format's %I). Never interpolate
  user-provided identifiers.

BULK
- Batch INSERTs: a single statement with `unnest($1::text[], $2::int[])`
  or a driver's bulk insert helper. N×1-row INSERTs in a loop is
  N round-trips.
- UPSERT: INSERT ... ON CONFLICT DO UPDATE SET ... WHERE .... Explicit
  conflict target.

TYPED ROW SHAPES
- Every query function returns a typed result:
  `async function findUser(id): Promise<User | null>`, where User is
  a project-wide type.
- ORM / query-builder: Drizzle, Kysely, Prisma provide typed queries.
  For raw pg, a Zod schema at the boundary parses rows:
  `UserRowSchema.parse(rows[0])` — catches schema drift.

NULL HANDLING
- COALESCE only with typed defaults. `COALESCE(a, 0)` fine; COALESCE(a, '')
  for nullable text is a smell — usually the column should be NOT NULL.
- `IS DISTINCT FROM` / `IS NOT DISTINCT FROM` for NULL-safe equality.

ENUM MAPPING
- DB enum <-> TS union type. Zod's z.nativeEnum or z.enum at the
  boundary. Never use string literals inline for status values in
  application code.

DATES
- Driver converts TIMESTAMPTZ <-> JS Date (UTC). Never format dates
  into query strings. Never parse DB timestamps manually.

Before — concatenated query, SQL injection, untyped rows:

async function getOrders(userId, status) {
  const sql = `SELECT * FROM orders WHERE user_id = ${userId} AND status = '${status}'`;
  const { rows } = await pool.query(sql);
  return rows;
}

status = "' OR '1'='1" returns every order in the database. rows is any[].

After — parameters, typed result, schema validation at the boundary:

const OrderRowSchema = z.object({
  id: z.string(),
  user_id: z.string(),
  total: z.string().transform(s => new Decimal(s)),
  status: z.enum(['pending', 'paid', 'shipped', 'cancelled', 'refunded']),
  created_at: z.date(),
});
type OrderRow = z.infer<typeof OrderRowSchema>;

export async function getOrders(userId: string, status: OrderStatus): Promise<OrderRow[]> {
  const { rows } = await pool.query(
    `SELECT id, user_id, total, status, created_at
       FROM app.orders
      WHERE user_id = $1 AND status = $2
      ORDER BY id DESC LIMIT 100`,
    [userId, status],
  );
  return z.array(OrderRowSchema).parse(rows);
}

SQL-injection-free. Plan cached. Rows typed. Schema drift becomes a test failure.
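The bulk-insert rule above (one statement via unnest instead of N round-trips) can be sketched as a builder that pivots row objects into one array per column. The table, columns, and `NewOrder` shape are illustrative:

```typescript
// Bulk insert via unnest: one parameterized statement regardless of
// row count. Each $n is a whole column's worth of values.
type NewOrder = { userId: string; total: string; status: string };

export function buildBulkInsert(orders: NewOrder[]) {
  const sql = `
    INSERT INTO app.orders (user_id, total, status)
    SELECT * FROM unnest($1::bigint[], $2::numeric[], $3::order_status[])
  `;
  // Pivot rows into column arrays; unnest zips them back into rows.
  const params = [
    orders.map(o => o.userId),
    orders.map(o => o.total),
    orders.map(o => o.status),
  ];
  return { sql, params };
}
```

The explicit `::type[]` casts matter: they tell Postgres how to interpret each array, including the custom order_status enum.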


Rule 8: TypeScript/Node Integration — Schema Is The Source Of Truth

A PostgreSQL project in TypeScript has three sources of truth that can drift: the SQL DDL, the application types, and the query helpers. Cursor generates all three independently — a migration that adds NOT NULL to a column whose TS type is still optional; a findUser that returns { id: number } when the column is BIGINT (JavaScript Numbers silently lose precision above 2^53); a query that selects a column that doesn't exist because a refactor renamed it. The fix is a single source of truth: the SQL schema — with a tool (Kysely + introspection, Drizzle with pull, PgTyped, sqlc) that generates TS types from the live database, so the TS compiler sees every schema change.

The rule:

SOURCE OF TRUTH IS THE DATABASE
- Schema lives in migrations. TS types are GENERATED from the DB
  (or from the migration files) — never hand-written to mirror the
  schema.
- Tools:
    Drizzle: `drizzle-kit pull` or codegen from the schema file.
    Kysely: kysely-codegen introspects the DB.
    Prisma: schema.prisma IS the truth; `prisma migrate` generates SQL
      AND TS.
    PgTyped / sqlc: write SQL, generate typed functions.
  Pick one; don't mix.

NUMERIC / BIGINT
- BIGINT comes back as `string` from node-postgres by default (safe).
  Never parse to Number if the column can exceed 2^53. Use bigint
  (native) or a Decimal type for monetary amounts.
- Configure pg types explicitly: `types.setTypeParser(20, val => BigInt(val))`
  or keep as string and parse at the boundary.

DATES
- Leave pg's default Date parsing unless the column is `date` (no
  time) — then parse to a LocalDate / plain string. Don't let
  Postgres timezone semantics leak into JS without intention.

JSONB
- JSONB -> unknown in TS. Validate on read with Zod: the column may
  have any shape (historical data, partial writes). Tests include a
  "schema drift" fixture with old-shape documents.

NULL VS UNDEFINED
- Postgres NULL -> JS null (driver default). `undefined` vs `null` is
  a common TS bug; prefer explicit null at the DB boundary, undefined
  only for "not loaded".

MIGRATIONS + TYPES IN CI
- CI step: spin up a Postgres container, apply all migrations,
  regenerate types, check `git diff` is empty. If the schema changes
  and types don't, CI fails.

QUERY LOCATIONS
- Raw SQL in `src/db/queries/*.sql` with PgTyped, OR query-builder
  calls in `src/db/queries/*.ts`. Never inline ad-hoc SQL in a
  handler.
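The NUMERIC / BIGINT rule is easiest to see at the boundary. Here is a dependency-free sketch (the function names are mine, not an established API); with node-postgres you would register the parser via `types.setTypeParser(20, parseBigIntColumn)`, where OID 20 is int8:

```typescript
// BIGINT arrives from node-postgres as a string by default. Convert to
// native bigint at the boundary, and refuse lossy downcasts to Number.
const MAX_SAFE = BigInt(Number.MAX_SAFE_INTEGER);

export function parseBigIntColumn(raw: string): bigint {
  if (!/^-?\d+$/.test(raw)) throw new Error(`not a BIGINT literal: ${raw}`);
  return BigInt(raw);
}

// Downcast to number only when provably safe (e.g. for a JSON API that
// cannot carry bigint). Throws instead of silently losing precision.
export function toSafeNumber(value: bigint): number {
  if (value > MAX_SAFE || value < -MAX_SAFE) {
    throw new Error(`BIGINT ${value} exceeds Number.MAX_SAFE_INTEGER`);
  }
  return Number(value);
}
```

The point is the failure mode: an id of 9007199254740993 round-trips through bigint but throws on a Number downcast, instead of quietly becoming 9007199254740992.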

Before — hand-written types, BIGINT truncation, drift on rename:

interface User { id: number; email: string; }
// Migration later renames email -> email_address. This interface still compiles.
async function getUser(id: number) {
  const { rows } = await pool.query('SELECT * FROM users WHERE id=$1', [id]);
  return rows[0] as User; // at runtime the row is { id, email_address }
}

The cast hides the rename, and id: number invites silent precision loss: BIGINT values above 2^53 cannot be represented exactly as a JavaScript Number.

After — generated types, runtime validation, CI guards drift:

// Generated by kysely-codegen from the live DB:
export interface Database {
  users: {
    id: ColumnType<string, string | undefined, never>; // BIGINT -> string
    email: ColumnType<string, string, string>;
    created_at: ColumnType<Date, string | undefined, never>;
  };
}

export async function getUser(id: string) {
  const row = await db
    .selectFrom('users')
    .select(['id', 'email', 'created_at'])
    .where('id', '=', id)
    .executeTakeFirst();
  return row ?? null;
}

Rename the column in a migration, re-run codegen, the handler fails to compile until updated. BIGINT stays a string. Types and schema don't drift.
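The JSONB rule — validate on read, tolerate historical shapes — looks like this in practice. A dependency-free type-guard sketch (in a real codebase this would be a Zod schema); the preferences column and its old/new document shapes are hypothetical:

```typescript
// Current shape of a hypothetical `preferences` JSONB column.
type Preferences = { theme: string; locale: string };

export function parsePreferences(raw: unknown): Preferences {
  if (typeof raw !== "object" || raw === null) {
    throw new Error("preferences: expected a JSON object");
  }
  const doc = raw as Record<string, unknown>;
  if (typeof doc.theme !== "string") {
    throw new Error("preferences: missing theme");
  }
  // Historical rows predate the locale field: default instead of failing,
  // so old-shape documents don't take the read path down.
  const locale = typeof doc.locale === "string" ? doc.locale : "en";
  return { theme: doc.theme, locale };
}
```

The "schema drift" fixture the rule asks for is just an old-shape document fed through this parser; it should succeed with the default, not throw.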


The Complete .cursorrules File

Drop this in the repo root as .cursorrules; Cursor picks it up automatically. For Claude Code, reference the same file from CLAUDE.md.

# PostgreSQL — Production Patterns

## Schema Design
- PK: BIGINT GENERATED ALWAYS AS IDENTITY (or UUID with pgcrypto).
  Never SERIAL.
- Money: NUMERIC(p, s); never FLOAT/DOUBLE. Timestamps: TIMESTAMPTZ
  always. Text: TEXT with CHECK length; never VARCHAR(n) unless length
  is a domain constraint.
- FKs always declare REFERENCES and explicit ON DELETE.
- NOT NULL by default. CHECK constraints for domain invariants.
  UNIQUE for natural keys. created_at / updated_at on every business
  table (updated_at via trigger).
- Identifiers: snake_case, plural tables. Schema `app`, never
  `public` for business tables.

## Migrations
- Forward-only. Rollbacks are new forward migrations.
- Every migration: `SET lock_timeout = '5s'`. CREATE INDEX uses
  CONCURRENTLY (own file, no tx).
- ADD COLUMN NOT NULL without constant default = full rewrite ->
  expand/contract (add nullable, backfill in batches, SET NOT NULL).
- Destructive operations in dedicated migrations with INTENT comment.
- Every migration tested in CI against representative data.

## Indexes
- Only for a real query in the same PR. Remove unused indexes.
- ESR ordering: equality, sort, range.
- Every FK gets an index on the referencing column.
- Partial indexes for hot subsets (`WHERE status = 'pending'`).
- INCLUDE for covering indexes / index-only scans.

## Query Writing
- EXPLAIN (ANALYZE, BUFFERS) on new queries. No Seq Scan on
  large tables without justification.
- Never SELECT *. List columns.
- Every query has a LIMIT; keyset pagination over OFFSET.
- No N+1 — test the query count on hot paths.
- Don't wrap indexed columns in functions (use citext / functional
  indexes).
- pg_stat_statements reviewed weekly.

## Connection Pooling
- One pg.Pool per process, sized conservatively (DB max /
  pods * safety).
- idleTimeoutMillis, connectionTimeoutMillis, statement_timeout
  explicit. Keep-alive on.
- pgBouncer transaction-mode in production; driver configured for
  compatibility (no session features).
- pool.query preferred. Manual client acquire ONLY for transactions,
  with finally-release.

## Transactions
- withTransaction(fn, { isolation }) is the only tx helper.
- Transactions are short; no 3rd-party HTTP inside.
- Isolation: READ COMMITTED default; SERIALIZABLE for invariants ->
  caller retries on 40001.
- SELECT FOR UPDATE / SKIP LOCKED for queues.
- idle_in_transaction_session_timeout set at the DB.

## Parameterized + Typed
- Always $1, $2 parameters. Never string interpolation. Dynamic
  identifiers only via %I with a safelist.
- Bulk INSERT via unnest arrays or driver bulk helpers.
- Typed result rows; Zod (or codegen) validates at the boundary.
- BIGINT as string (or bigint). Never truncate to Number.

## TS/Node Integration
- DB is the source of truth for types — generate via Drizzle /
  Kysely codegen / Prisma / PgTyped. Never hand-mirror.
- CI re-runs codegen and fails on drift.
- JSONB validated on read with Zod, including historical shapes.
- Null at the DB boundary, undefined only for "not loaded".
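The withTransaction helper the Transactions rules mandate can be sketched as follows. The interfaces are structural, so pg's Pool and PoolClient satisfy them; the function name, options shape, and retry count are assumptions, not an established API:

```typescript
// Minimal transaction wrapper: BEGIN with explicit isolation, COMMIT on
// success, ROLLBACK on error, release always, retry on serialization
// failure (SQLSTATE 40001) by re-running the whole callback.
interface TxClient {
  query(sql: string, params?: unknown[]): Promise<unknown>;
  release(): void;
}
interface TxPool { connect(): Promise<TxClient>; }

export async function withTransaction<T>(
  pool: TxPool,
  fn: (client: TxClient) => Promise<T>,
  opts: { isolation?: "READ COMMITTED" | "SERIALIZABLE"; retries?: number } = {},
): Promise<T> {
  const { isolation = "READ COMMITTED", retries = 3 } = opts;
  for (let attempt = 0; ; attempt++) {
    const client = await pool.connect();
    try {
      await client.query(`BEGIN ISOLATION LEVEL ${isolation}`);
      const result = await fn(client);
      await client.query("COMMIT");
      return result;
    } catch (err) {
      await client.query("ROLLBACK").catch(() => {});
      // 40001 = serialization_failure: safe to retry the whole transaction.
      const code = (err as { code?: string }).code;
      if (code === "40001" && attempt < retries) continue;
      throw err;
    } finally {
      client.release();
    }
  }
}
```

Because the retry re-runs fn from the top, the callback must be idempotent up to the COMMIT — which is exactly why the rules ban third-party HTTP calls inside it.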

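The Connection Pooling rules translate to a config object like this — a sketch, not a tuned production config. Values assume roughly 10 app pods against max_connections = 200; node-postgres accepts this shape for new Pool(config):

```typescript
// Illustrative pg.Pool settings. Every timeout is explicit; nothing waits
// forever, and idle-in-transaction sessions get killed server-side.
export const poolConfig = {
  max: 16,                                     // ~ (200 * 0.8) / 10 pods
  idleTimeoutMillis: 30_000,                   // recycle idle clients
  connectionTimeoutMillis: 5_000,              // fail fast under exhaustion
  keepAlive: true,
  statement_timeout: 10_000,                   // server-side cap per statement (ms)
  idle_in_transaction_session_timeout: 15_000, // kill abandoned transactions (ms)
};
```

The sizing formula is the interesting part: the pool max is derived from the database's limit divided across pods, with headroom — never a number picked per-service in isolation.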
End-to-End Example: A Tenant-Scoped Orders Feed

Without rules: SERIAL ids, unindexed joins, SELECT *, per-request connect, hand-built types.

CREATE TABLE orders (id SERIAL, user_id INT, total FLOAT, status VARCHAR(255), created_at TIMESTAMP);
async function feed(userId) {
  const client = new Client(...); await client.connect();
  const { rows } = await client.query(`SELECT * FROM orders WHERE user_id = ${userId}`);
  await client.end();
  return rows;
}

With rules: typed schema, partial compound index, parameterized query, pool, generated types.

CREATE INDEX orders_tenant_user_created_idx
  ON app.orders (tenant_id, user_id, created_at DESC)
  WHERE status IN ('paid', 'shipped');
export async function feed(tenantId: string, userId: string): Promise<OrderRow[]> {
  const { rows } = await pool.query(
    `SELECT id, total, status, created_at
       FROM app.orders
      WHERE tenant_id = $1 AND user_id = $2 AND status IN ('paid','shipped')
      ORDER BY created_at DESC LIMIT 50`,
    [tenantId, userId],
  );
  return z.array(OrderRowSchema).parse(rows);
}

A single ordered scan on the partial compound index (add INCLUDE (id, total, status) to make it a true index-only scan). Parameterized. Types generated. Latency stays flat as the table grows.
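Page two of this feed should use keyset pagination, not OFFSET 50. That needs a cursor encoding "where the last page ended"; a minimal sketch, assuming the feed's (created_at DESC, id DESC) sort — the encoding and names are illustrative:

```typescript
// Keyset cursor for the feed above. The next page filters on the last
// row's sort key instead of skipping rows:
//   WHERE tenant_id = $1 AND user_id = $2
//     AND (created_at, id) < ($3, $4)
//   ORDER BY created_at DESC, id DESC LIMIT 50
export interface FeedCursor { createdAt: string; id: string; }

export function encodeCursor(c: FeedCursor): string {
  return Buffer.from(JSON.stringify(c)).toString("base64url");
}

export function decodeCursor(token: string): FeedCursor {
  const parsed = JSON.parse(Buffer.from(token, "base64url").toString("utf8"));
  if (typeof parsed?.createdAt !== "string" || typeof parsed?.id !== "string") {
    throw new Error("malformed cursor");
  }
  return { createdAt: parsed.createdAt, id: parsed.id };
}
```

Unlike OFFSET, the row-value comparison lets Postgres seek straight into the index, so page 400 costs the same as page 1. Appending id to the sort key keeps the cursor unambiguous when two orders share a created_at.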


Get the Full Pack

These eight rules cover the PostgreSQL patterns where AI assistants consistently reach for the wrong tutorial. Drop them into .cursorrules and the next SQL you generate will be constrained, indexed, bounded, typed, and pooled — without having to re-prompt.

If you want the expanded pack — these eight plus rules for partitioning, materialized views, logical replication, row-level security, full-text search, pg_notify patterns, pgvector / semantic search, audit via triggers, and testing with Testcontainers + truncation strategies — it is bundled in Cursor Rules Pack v2 ($27, one payment, lifetime updates). Drop it in your repo, stop fighting your AI, ship PostgreSQL you would actually merge.
