Cursor Rules for PostgreSQL: The Complete Guide to AI-Assisted PostgreSQL Development
PostgreSQL is the database where "it works in staging" is a promise that survives exactly as long as the data volume in staging resembles production — which is to say, almost never. The service ships. The dashboard loads in 200ms on 50k rows. Two months later, at 20 million rows, the same endpoint times out at 30 seconds because the query does a sequential scan on a table that's larger than shared_buffers, because the index somebody added is on (created_at) when every query filters by tenant_id, created_at, because SELECT * pulls back a jsonb column averaging 40KB that nobody needed, because the migration that added a NOT NULL column with a volatile default rewrote every row while holding an ACCESS EXCLUSIVE lock, and because the "connection pool" is a fresh pg.Client per request so under load the database sits at 500 connections and rejects new ones while the pods wait. The database doesn't get slower — the query plan changes. And by the time you notice, the explanation is buried in fifteen commits by five people and three migrations that each seemed fine in isolation.
Then you add an AI assistant.
The models behind Cursor and Claude Code were trained on a decade of PostgreSQL content, most of it from tutorials that teach correct SQL but do not teach EXPLAIN, NOT NULL, foreign key indexes, or pg_stat_statements. Ask for "a table for orders with line items," and you get CREATE TABLE orders (id SERIAL PRIMARY KEY, user_id INT, total FLOAT) — SERIAL instead of BIGINT GENERATED ALWAYS AS IDENTITY, INT instead of a typed reference with ON DELETE semantics, FLOAT for money, no created_at, no NOT NULL, no unique constraint on the natural key. Ask for "insert a user," and you get INSERT INTO users(email) VALUES('${email}') with string interpolation because the example on Stack Overflow does it. Ask for "find users by email," and you get SELECT * FROM users WHERE LOWER(email) = LOWER($1) against a plain index on email that never matches, because LOWER() on the column defeats the index and no one wrote a functional one.
The fix is .cursorrules — one file in the repo that tells the AI what idiomatic, production-grade PostgreSQL looks like. Eight rules below, each with the failure mode, the rule, and a before/after. Copy-paste .cursorrules at the end. Examples use the pg Node driver; the rules apply equally to asyncpg (Python), pgx (Go), sqlx (Rust), and to ORM layers like Prisma, Drizzle, and Kysely.
How Cursor Rules Work for PostgreSQL Projects
Cursor reads project rules from two locations: .cursorrules (a single file at the repo root, still supported) and .cursor/rules/*.mdc (modular files with frontmatter, recommended). For PostgreSQL I recommend modular rules so the schema/DDL conventions don't bleed into application query code, and so test conventions are scoped to integration test files only:
.cursor/rules/
pg-schema.mdc # types, constraints, identifiers, naming
pg-migrations.mdc # forward-only, non-blocking DDL, expand/contract
pg-indexes.mdc # compound indexes, partial, covering, FK indexes
pg-queries.mdc # EXPLAIN, no SELECT *, bounded result sets
pg-pool.mdc # pool sizing, no per-request connect, pgBouncer
pg-transactions.mdc # explicit BEGIN, isolation, advisory locks
pg-driver.mdc # parameterized everything, typed rows
pg-testing.mdc # real Postgres (testcontainers), truncate vs transaction
Frontmatter controls activation: globs plus alwaysApply decide when a rule file is attached. A minimal pg-schema.mdc header (field names follow Cursor's .mdc format; the description text is illustrative):
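---
description: PostgreSQL schema, types, and DDL conventions
globs: ["**/*.{sql,ts,js,py}", "**/migrations/**"]
alwaysApply: false
---
Now the rules.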
Rule 1: Schema Design — Types That Match the Domain, Constraints That Enforce Invariants
The first sin of AI-generated PostgreSQL is always the same: SERIAL primary keys (superseded since PG10 by identity columns), FLOAT for monetary amounts, VARCHAR(255) out of MySQL muscle memory (in Postgres TEXT is the same storage, without the arbitrary cap), TIMESTAMP without time zone for a column that's clearly a point in time, INT foreign keys with no REFERENCES, no NOT NULL because "we'll fill it in later," and no CHECK constraints anywhere. The schema is the contract between every service and the data; it is the one place where you can enforce invariants that no amount of application code will catch. Cursor won't add a CHECK (total > 0) unless you tell it to.
The rule:
IDENTIFIERS
- Primary keys: `id BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY`
(or UUID where distribution / public exposure matters). Never SERIAL.
- UUIDs use `uuid` type and `gen_random_uuid()` (built-in since PG13;
on older versions it needs the `pgcrypto` extension). Never text
columns containing UUIDs.
- Foreign keys ALWAYS declare REFERENCES with explicit ON DELETE
behavior (RESTRICT by default; CASCADE / SET NULL only when intentional).
TYPES
- Money: NUMERIC(precision, scale), never FLOAT/DOUBLE. Rates: NUMERIC.
- Timestamps: TIMESTAMPTZ (always). Never TIMESTAMP without time zone.
Store UTC; convert at the display boundary.
- Strings: TEXT, constrained by CHECK (char_length(x) BETWEEN a AND b).
Never VARCHAR(n) unless the length is a real domain constraint.
- Enums: native PG ENUM when values are stable; TEXT+CHECK when they
may evolve; lookup TABLE + FK when values need metadata.
- JSON: JSONB (never JSON). Only when the shape is genuinely variable;
columns are cheaper than JSONB keys.
CONSTRAINTS
- NOT NULL by default. Nullability is an explicit, justified decision.
- CHECK constraints for domain invariants: total >= 0,
status IN ('pending','paid','cancelled'), email matches regex.
- UNIQUE constraints for every natural key (email, tenant+slug).
- created_at TIMESTAMPTZ NOT NULL DEFAULT now() on every business table;
updated_at TIMESTAMPTZ NOT NULL DEFAULT now() with a trigger.
NAMING
- snake_case, plural tables (users, orders), singular columns.
- FK: `<referenced>_id` (user_id, order_id). Indexes: `<table>_<cols>_idx`.
Constraints: `<table>_<cols>_key` / `<table>_<cols>_check`.
- Schemas: `app` for production tables; `app_audit` for audit; `app_temp`
for ephemera. Never tables in `public` beyond extensions.
Before — Mongo-in-Postgres, no constraints, FLOAT money:
CREATE TABLE orders (
id SERIAL PRIMARY KEY,
user_id INT,
total FLOAT,
status VARCHAR(255),
created_at TIMESTAMP
);
user_id has no FK — delete a user and orphaned orders persist. total stored as FLOAT means 0.1 + 0.2 != 0.3 in any aggregation. status can be literally anything. created_at loses timezone.
After — typed, constrained, auditable:
CREATE TYPE order_status AS ENUM ('pending', 'paid', 'shipped', 'cancelled', 'refunded');
CREATE TABLE app.orders (
id BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
user_id BIGINT NOT NULL REFERENCES app.users(id) ON DELETE RESTRICT,
total NUMERIC(12, 2) NOT NULL CHECK (total >= 0),
currency CHAR(3) NOT NULL CHECK (currency ~ '^[A-Z]{3}$'),
status order_status NOT NULL DEFAULT 'pending',
created_at TIMESTAMPTZ NOT NULL DEFAULT now(),
updated_at TIMESTAMPTZ NOT NULL DEFAULT now()
);
CREATE TRIGGER orders_set_updated_at
BEFORE UPDATE ON app.orders FOR EACH ROW
EXECUTE FUNCTION app.set_updated_at();
The schema now enforces invariants the application can't lie about. total < 0 is impossible. status is closed. Timezone is preserved.
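The CREATE TRIGGER above assumes an app.set_updated_at() function that the example never defines. A minimal sketch, using the standard plpgsql pattern (ship it in the migration before the trigger):
CREATE OR REPLACE FUNCTION app.set_updated_at()
RETURNS trigger
LANGUAGE plpgsql AS $$
BEGIN
  NEW.updated_at := now(); -- overwrite whatever the client sent
  RETURN NEW;
END;
$$;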
Rule 2: Migrations — Forward-Only, Idempotent, Non-Blocking Under Load
The most dangerous migration is the one that ran fine on 50k rows in staging and held an ACCESS EXCLUSIVE lock for 47 minutes on the 20M-row production table. Cursor writes ALTER TABLE orders ADD COLUMN region TEXT NOT NULL DEFAULT 'US' — which on PG11+ is fast for a constant default. But ADD COLUMN with a volatile default rewrites the whole table under lock, and backfilling a new column with one giant UPDATE ... SET region = ... holds row locks for hours. Worse: CREATE INDEX without CONCURRENTLY blocks writes for the duration. Worst: ALTER TABLE ... DROP COLUMN in the same deploy as the code that still reads that column. The rule is expand/contract: ship the additive change, deploy the code, ship the contraction.
The rule:
FORWARD-ONLY
- No `down` migrations in production. "Down" is a new forward migration
that reverts. Rollbacks happen via `revert_<timestamp>_...` files.
- Every migration has a unique monotonic id (timestamp prefix) and is
idempotent where possible (CREATE TABLE IF NOT EXISTS is not safe —
schemas must match; use a tool like sqlx/migrate/node-pg-migrate/atlas
that tracks applied migrations).
NON-BLOCKING DDL
- `CREATE INDEX CONCURRENTLY` — never bare CREATE INDEX on a populated
production table. CONCURRENTLY cannot run in a transaction, so isolate
it in its own migration file.
- `ADD COLUMN ... DEFAULT ...` is fast on PG11+ for a constant default.
Safe. `ADD COLUMN ... DEFAULT non_constant()` is a table rewrite —
use expand/contract.
- `ADD COLUMN ... NOT NULL` without a default fails outright on a
populated table, and a later `SET NOT NULL` scans the whole table
under an ACCESS EXCLUSIVE lock. Do it in steps: ADD nullable ->
backfill in batches -> SET NOT NULL (PG12+ skips the scan if a
validated CHECK (col IS NOT NULL) constraint already exists).
- `ALTER COLUMN ... TYPE ...` often rewrites. Cast-free changes (e.g.
VARCHAR->TEXT) are fast; others need a new column + backfill + drop.
EXPAND/CONTRACT for breaking changes
1. Expand: add new column/table/constraint as nullable/optional.
2. Dual-write application: write both old and new.
3. Backfill old data.
4. Dual-read -> read only new.
5. Contract: drop old column in a SEPARATE migration, SEPARATE deploy.
DEFAULTS
- Every migration starts with `SET lock_timeout = '5s';
SET statement_timeout = '60min';` (or per-project). Avoid 10am-on-a-
Tuesday surprises.
- Destructive operations (DROP TABLE, DROP COLUMN, TRUNCATE) require
a `-- INTENT: irreversible data loss` comment and reviewer signoff.
Never paired with a non-destructive change in the same file.
TESTING
- Every migration runs in CI against a real Postgres with representative
data volume (not an empty schema). Migration + rollback-forward are
both tested.
Before — single-step NOT NULL addition, blocking lock, no timeout:
ALTER TABLE orders ADD COLUMN region TEXT NOT NULL DEFAULT 'US';
UPDATE orders SET region = user_region(user_id); -- seq scan, hours
CREATE INDEX ON orders(region); -- blocks writes
The UPDATE calls a function per row and holds row locks on every row it touches in one hours-long transaction; the CREATE INDEX blocks writes on the same table. The production deploy stalls.
After — three migrations, concurrent index, batched backfill:
-- 20260422_100000_add_region_nullable.sql
SET lock_timeout = '5s';
ALTER TABLE app.orders ADD COLUMN region TEXT;
-- 20260422_100100_index_region_concurrently.sql
-- NOTE: must NOT be in a transaction
CREATE INDEX CONCURRENTLY IF NOT EXISTS orders_region_idx ON app.orders(region);
-- 20260422_101500_backfill_region.sql
-- run as a job, in batches of 5k, not a single UPDATE
-- pseudo-SQL; real backfill is app-code
UPDATE app.orders SET region = 'US' WHERE id BETWEEN :lo AND :hi AND region IS NULL;
-- 20260423_090000_region_not_null.sql (after backfill complete + deploy)
SET lock_timeout = '5s';
ALTER TABLE app.orders ALTER COLUMN region SET NOT NULL;
Each migration is short, acquires a minimal lock, and is safe to re-run.
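The "run as a job, in batches" note above is doing real work; here is a sketch of what that app-code backfill can look like, assuming the shared pool from Rule 5 (the function name and the 100ms pause are illustrative):
import { pool } from '../infra/db'; // illustrative path

// Keyset-batched backfill: each UPDATE touches at most `batch` rows, so row
// locks stay short and autovacuum can keep up between iterations.
export async function backfillOrderRegion(batch = 5_000): Promise<void> {
  let lastId = 0n;
  for (;;) {
    const { rows } = await pool.query<{ id: string }>(
      `WITH candidates AS (
         SELECT id FROM app.orders
         WHERE id > $1 AND region IS NULL
         ORDER BY id
         LIMIT $2
       )
       UPDATE app.orders o
       SET region = 'US'
       FROM candidates c
       WHERE o.id = c.id
       RETURNING o.id`,
      [lastId.toString(), batch],
    );
    if (rows.length === 0) return; // nothing left to backfill
    // RETURNING order is not guaranteed; advance the cursor to the max id seen.
    for (const r of rows) if (BigInt(r.id) > lastId) lastId = BigInt(r.id);
    await new Promise(res => setTimeout(res, 100)); // brief pause between batches
  }
}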
Rule 3: Indexes — Compound, Ordered to Match Queries, Partial Where It Helps
A table without the right index is a table with the wrong index. Cursor adds CREATE INDEX ON orders(created_at) because "you'll want to query by date," but the actual query is WHERE tenant_id = $1 AND created_at > $2 — which reads every tenant's rows in the date range because the leading column of the index is wrong. The rule for compound indexes is ESR (Equality, Sort, Range): columns used for equality come first, then the column used for sort/order, then the range column. Plus: foreign keys always get an index on the referencing side (Postgres does NOT auto-index them; only the primary key side is indexed), partial indexes for sparse WHERE clauses reduce index size by 10×, and covering indexes (INCLUDE) let queries return from index-only scans.
The rule:
CREATE AN INDEX ONLY FOR A REAL QUERY
- Every new index links to a PR where a query uses it. `EXPLAIN` confirms
it's picked. No speculative indexes — they slow writes without helping
reads.
- Remove unused indexes. `pg_stat_user_indexes` shows `idx_scan = 0`
after a month in production -> drop it.
ORDER THE COLUMNS RIGHT (ESR)
- Equality columns first, then sort column, then range column.
- WHERE tenant_id = $1 AND user_id = $2 AND created_at > $3 ORDER BY
created_at DESC -> INDEX ON (tenant_id, user_id, created_at DESC).
- Multi-column indexes serve multiple queries only when the query uses
a left-prefix.
FOREIGN KEYS
- Every FK gets an index on the REFERENCING column. Postgres does not
auto-index it. Missing FK index = slow joins + DELETE on parent
table is O(n).
PARTIAL INDEXES
- WHERE clauses on a frequently-queried boolean / status subset ->
CREATE INDEX ... WHERE status = 'pending'. 5-10x smaller, much
faster.
UNIQUE INDEXES = CONSTRAINTS
- UNIQUE (email) is both an index and a constraint. Use for natural
keys. LOWER(email) needs a functional unique index:
CREATE UNIQUE INDEX users_email_lower_key ON users (LOWER(email)).
COVERING / INCLUDE
- Hot read-path queries that return a few columns from an indexed
lookup: INCLUDE (col1, col2) enables index-only scans.
GIN/GIST/BRIN for specific cases
- GIN for array/jsonb/tsvector. GIST for ranges/geometry. BRIN for
append-mostly time-series tables (tiny index, large scan still
filters by min/max per block range).
Before — single-column index, FK unindexed, query does bitmap + seq:
CREATE INDEX ON orders(created_at);
-- Query:
SELECT id FROM orders
WHERE tenant_id = $1 AND status = 'pending' AND created_at > $2
ORDER BY created_at DESC LIMIT 50;
-- Plan: Bitmap Heap Scan, filters tenant_id + status in-memory.
-- 200ms for 50k tenants, 20s at tenant scale.
After — ESR-ordered partial compound index, index-only scan, stable latency:
CREATE INDEX orders_tenant_status_created_idx
ON app.orders (tenant_id, created_at DESC)
INCLUDE (id)
WHERE status = 'pending';
EXPLAIN ANALYZE
SELECT id FROM app.orders
WHERE tenant_id = $1 AND status = 'pending' AND created_at > $2
ORDER BY created_at DESC LIMIT 50;
-- Plan: Index Only Scan using orders_tenant_status_created_idx
-- 0.3ms regardless of table size.
Partial index is 10× smaller, compound order matches the query, INCLUDE enables index-only scan.
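The "remove unused indexes" check from the rule is one catalog query. A sketch; treat the output as a review list, not an auto-drop list (unique indexes are excluded because they enforce constraints even at zero scans):
import { pool } from '../infra/db'; // illustrative path

// Indexes with zero scans since the last stats reset, largest first.
export async function unusedIndexes() {
  const { rows } = await pool.query(`
    SELECT s.schemaname,
           s.relname      AS table_name,
           s.indexrelname AS index_name,
           pg_size_pretty(pg_relation_size(s.indexrelid)) AS size
    FROM pg_stat_user_indexes s
    JOIN pg_index i ON i.indexrelid = s.indexrelid
    WHERE s.idx_scan = 0
      AND NOT i.indisunique           -- unique indexes are constraints
    ORDER BY pg_relation_size(s.indexrelid) DESC
  `);
  return rows;
}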
Rule 4: Query Writing — EXPLAIN Before Merge, No SELECT *, Bound The Result Set
The reason production SQL is slow is rarely that SQL is slow. It is usually that a query returns 10× more rows than the application uses, a join is on an unindexed column, or an ORM emits N+1 queries, one per row. Cursor writes SELECT * FROM orders JOIN users ON users.id = orders.user_id because it reads naturally, but the result set has every column of both tables when the route only needs orders.id and users.email. Add no LIMIT and a test fixture of 100 rows, and you never notice until production has 100M. The rule is to treat every query as a physical plan — read the plan before merge, keep projections narrow, and never run an unbounded query.
The rule:
EXPLAIN EVERY NEW QUERY BEFORE MERGE
- PR template asks for EXPLAIN (ANALYZE, BUFFERS) output against a
representative dataset (prod-sized clone or pg_restore'd sample).
- Seq Scan on a table > 10k rows without a filter is a red flag.
Nested Loop with > 10k outer rows is a red flag.
NO SELECT *
- List columns explicitly. Prevents accidental wide returns when new
columns are added. Makes pg_stat_statements readable.
BOUND EVERY RESULT SET
- Every query that can return many rows has a LIMIT + pagination.
Keyset pagination (`WHERE id > :cursor ORDER BY id LIMIT :n`) over
OFFSET. OFFSET is O(n) at the offset.
- COUNT(*) over a large table is expensive. Use approximations
(pg_class.reltuples) or precomputed counters for dashboards.
AVOID N+1
- Log every query at debug. Tests assert query count for hot paths
(e.g. `expect(await countQueries(() => getFeed(userId))).toBeLessThan(5)`).
- Use IN / ANY for batch loads. JOIN when you need the joined data;
otherwise separate query + in-memory merge.
FUNCTIONS THAT KILL INDEXES
- Don't wrap indexed columns in functions: WHERE lower(email) = $1
needs a functional index OR uses citext.
- Don't cast the indexed side: WHERE created_at::date = ... forces
seq scan. Use WHERE created_at >= :day AND created_at < :day+1.
WINDOW FUNCTIONS / CTES
- CTEs are an optimization fence when materialized (the default
before PG12). PG12+ inlines them; write WITH ... AS MATERIALIZED
only when the fence is intentional.
- Window functions for ranking, running totals, deduplication.
pg_stat_statements
- Enable it in production. Review top-10 mean_exec_time every week
(or when a latency alert fires).
Before — SELECT *, unbounded join, no EXPLAIN, function on indexed column:
SELECT * FROM orders o
JOIN users u ON u.id = o.user_id
WHERE LOWER(u.email) = LOWER($1);
LOWER() kills the users(email) index. Returns every column. No limit — multiple orders per user could return many rows.
After — explicit columns, citext or functional index, bounded:
-- Schema:
CREATE EXTENSION IF NOT EXISTS citext;
ALTER TABLE app.users ALTER COLUMN email TYPE citext;
CREATE UNIQUE INDEX users_email_key ON app.users(email);
-- Query:
SELECT o.id, o.total, o.status, u.email
FROM app.orders o
JOIN app.users u ON u.id = o.user_id
WHERE u.email = $1
ORDER BY o.id DESC
LIMIT 50;
EXPLAIN (ANALYZE, BUFFERS) -- confirmed: Index Scan using users_email_key + nested loop
...;
Index used. Bounded. Projection narrow. Plan reviewed.
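Keyset pagination from the rule, made concrete. A sketch that fetches one extra row to detect whether a next page exists; the pool and columns follow the running orders example:
import { pool } from '../infra/db'; // illustrative path

// Keyset pagination: cost is O(limit) at any depth, unlike OFFSET, which
// walks and discards every skipped row.
export async function ordersPage(userId: string, cursor: string | null, limit = 50) {
  // Two static statements (no value interpolation) so each gets a stable plan.
  const sql = cursor === null
    ? `SELECT id, total, status, created_at FROM app.orders
       WHERE user_id = $1 ORDER BY id DESC LIMIT $2`
    : `SELECT id, total, status, created_at FROM app.orders
       WHERE user_id = $1 AND id < $3 ORDER BY id DESC LIMIT $2`;
  const params = cursor === null ? [userId, limit + 1] : [userId, limit + 1, cursor];
  const { rows } = await pool.query<{ id: string; total: string; status: string; created_at: Date }>(sql, params);
  const page = rows.slice(0, limit);  // the extra row only signals "more"
  const nextCursor = rows.length > limit ? page[page.length - 1].id : null;
  return { page, nextCursor };
}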
Rule 5: Connection Pooling — Size To The Database, Never Connect Per Request
Postgres connections are expensive: each one is an OS process with ~10MB of memory. A typical Postgres tuned for 200 connections falls over at 800. The canonical AI-generated pattern is const client = new Client(...); await client.connect(); await client.query(...); await client.end(); inside every request handler — which at 500 requests/second is 500 connect handshakes per second. The handshakes alone saturate the DB. The fix is a single Pool per process, sized against the database's actual connection limit divided by the number of pods, and — for high-fanout serverless or short-lived pods — a connection pooler (pgBouncer in transaction mode) in front of Postgres.
The rule:
ONE POOL PER PROCESS
- `new Pool({ ... })` once, at composition root. Every query goes
through the pool. No `new Client()` per request.
- Pool size: conservative. If DB max_connections = 200 and you have
10 pods, pool.max <= 15 per pod (leaves headroom for superuser,
migrations, pgBouncer).
IDLE TIMEOUTS
- idleTimeoutMillis: 30_000 releases sockets a LB may have cut.
connectionTimeoutMillis: 5_000. statement_timeout: 30s (or as
appropriate per query).
- keep-alive on the socket so LB TCP idle timeout doesn't silently
kill the connection mid-pool.
pgBouncer IN PRODUCTION
- Transaction-mode pooling with short idle connections reduces the
effective connection count at Postgres to a fraction.
- Compatibility: no session-scoped features (LISTEN/NOTIFY, prepared
statements by default, session GUCs). The driver must be configured
accordingly.
- RDS Proxy / Neon / Supabase provide managed poolers; same rules.
NEVER HOLD A CONNECTION ACROSS AWAITS YOU DON'T NEED
- Acquire from pool just before query; release immediately after.
`pool.query(sql, params)` is preferred over manual client acquire.
- For a transaction: acquire a client, BEGIN, work, COMMIT, release
in a try/finally. Never leak a client.
MONITOR
- Export pool stats (pool.totalCount, pool.idleCount, pool.waitingCount).
Waiting > 0 means the pool is undersized OR a slow query is holding
connections.
- pg_stat_activity: active long-running queries, idle-in-transaction
sessions (which are a red flag — always commit or rollback promptly).
Before — per-request client, connection storm at load:
app.get('/orders', async (req, res) => {
const client = new Client({ connectionString: process.env.DATABASE_URL });
await client.connect();
const { rows } = await client.query('SELECT id, total FROM orders WHERE user_id=$1', [req.user.id]);
await client.end();
res.json(rows);
});
500 req/s = 500 handshakes/s. DB rejects connections; 5xx spikes.
After — single pool, explicit timeouts, pool.query:
// infra/db.ts
import { Pool } from 'pg';
export const pool = new Pool({
connectionString: config.DATABASE_URL,
max: config.DB_POOL_MAX, // e.g. 10
idleTimeoutMillis: 30_000,
connectionTimeoutMillis: 5_000,
statement_timeout: 30_000,
application_name: config.APP_NAME,
});
// routes/orders.ts
router.get('/orders', asyncHandler(async (req, res) => {
const { rows } = await pool.query<OrderSummary>(
'SELECT id, total FROM app.orders WHERE user_id = $1 ORDER BY id DESC LIMIT 50',
[req.user!.id],
);
res.json(rows);
}));
One handshake per pod startup. Pool queues requests when the DB is saturated. Stats exportable.
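Exporting the pool gauges from the MONITOR bullet takes a few lines. A sketch; gauge() is a stand-in for whatever metrics client the service uses:
import { pool } from '../infra/db'; // illustrative path
import { gauge } from '../infra/metrics'; // hypothetical metrics client

// pg.Pool exposes live counters; sample them on an interval.
setInterval(() => {
  gauge('pg_pool_total', pool.totalCount);     // open connections
  gauge('pg_pool_idle', pool.idleCount);       // idle and ready in the pool
  gauge('pg_pool_waiting', pool.waitingCount); // queued callers: alert if sustained > 0
}, 10_000).unref(); // don't keep the process alive just for sampling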
Rule 6: Transactions — Explicit Scope, Correct Isolation, Never Idle
Every ORM quietly runs most queries outside a transaction, and most SQL tutorials say nothing about isolation levels. Cursor writes await db.query('INSERT INTO accounts ...'); await db.query('UPDATE ledger ...'); — two separate statements, two separate transactions, and a crash between them leaves the database inconsistent. The next version wraps them in BEGIN/COMMIT but forgets ROLLBACK on error, so on exception the client is returned to the pool with an open transaction that becomes "idle in transaction" — holding locks until it's reaped. The third version uses SERIALIZABLE isolation because "it's the safest" and aborts half the transactions with serialization failures under load. The rule is: transactions are explicit, short, correctly isolated, and released in a finally.
The rule:
EXPLICIT TRANSACTIONS
- A `withTransaction(fn)` helper is the only way to open a transaction:
acquires a client, BEGINs, runs fn(client), COMMITs on success,
ROLLBACKs on throw, releases client in finally. Never manual
BEGIN/COMMIT scattered across handlers.
- Transactions are SHORT. No awaiting a third-party HTTP call inside
a transaction — it holds locks and a DB connection. Do the work
outside; transaction is the DB-only boundary.
ISOLATION
- Default is READ COMMITTED (Postgres default). Enough for most reads.
- REPEATABLE READ when a transaction reads the same table twice and
must see a consistent snapshot.
- SERIALIZABLE for money-moving / invariant-preserving writes
(transfer between accounts). Expect occasional serialization
failures (SQLSTATE 40001) — caller retries with backoff.
- Mark the isolation on the BEGIN: `BEGIN ISOLATION LEVEL SERIALIZABLE
READ WRITE;`.
ROW LOCKS
- SELECT ... FOR UPDATE when the row is going to be updated in the
same transaction (pessimistic). SKIP LOCKED for queue-like patterns.
- Don't open a transaction around plain reads for no reason: even
when it takes no row locks, the long-lived snapshot holds back
vacuum.
ADVISORY LOCKS
- pg_try_advisory_xact_lock(key) for cross-transaction coordination
without a physical row (e.g. "only one worker runs this job").
Keys are bigint; use a hashed namespace + id.
NO IDLE IN TRANSACTION
- `idle_in_transaction_session_timeout` set at the database (e.g. 60s)
so a leaked tx dies rather than blocking vacuum forever.
- App code observes long-running transactions via pg_stat_activity.
SAVEPOINTS
- For nested operations that may fail without aborting the outer tx:
SAVEPOINT sp_x; ... RELEASE | ROLLBACK TO sp_x.
Before — implicit mini-transactions, broken invariant, leaked client:
await pool.query('UPDATE accounts SET balance = balance - $1 WHERE id=$2', [amount, from]);
await pool.query('UPDATE accounts SET balance = balance + $1 WHERE id=$2', [amount, to]);
// Crash between = money lost / duplicated.
No atomicity, no isolation, no lock. The classic double-entry bookkeeping failure mode.
After — withTransaction helper, serializable retry, short scope:
export async function withTransaction<T>(
fn: (tx: PoolClient) => Promise<T>,
opts: { isolation?: 'READ COMMITTED' | 'REPEATABLE READ' | 'SERIALIZABLE' } = {},
): Promise<T> {
const client = await pool.connect();
try {
await client.query(`BEGIN${opts.isolation ? ` ISOLATION LEVEL ${opts.isolation}` : ''}`);
const result = await fn(client);
await client.query('COMMIT');
return result;
} catch (err) {
await client.query('ROLLBACK').catch(() => {});
throw err;
} finally {
client.release();
}
}
export async function transfer(from: string, to: string, amount: string) {
return withRetry(() =>
withTransaction(async tx => {
const debited = await tx.query(
'UPDATE app.accounts SET balance = balance - $1 WHERE id = $2 AND balance >= $1',
[amount, from],
);
if (debited.rowCount === 0) throw new ConflictError('insufficient funds');
await tx.query('UPDATE app.accounts SET balance = balance + $1 WHERE id = $2', [amount, to]);
}, { isolation: 'SERIALIZABLE' }),
);
}
Atomic. Correctly isolated. Client always released. Retries serialization failures.
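The transfer example calls a withRetry that is not shown above. A minimal sketch that retries only serialization failures (SQLSTATE 40001) and deadlocks (40P01), with jittered backoff:
const RETRYABLE = new Set(['40001', '40P01']); // serialization_failure, deadlock_detected

export async function withRetry<T>(fn: () => Promise<T>, attempts = 3): Promise<T> {
  for (let attempt = 1; ; attempt++) {
    try {
      return await fn();
    } catch (err) {
      const code = (err as { code?: string }).code; // pg errors carry the SQLSTATE here
      if (attempt >= attempts || !code || !RETRYABLE.has(code)) throw err;
      // Jittered backoff, then re-run the WHOLE transaction from BEGIN.
      await new Promise(res => setTimeout(res, 50 * attempt + Math.random() * 100));
    }
  }
}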
Rule 7: Parameterized Everything — No String Interpolation, Typed Rows Out
Every SQL injection bug in history is a string interpolation. Cursor writes db.query('SELECT * FROM users WHERE email = \'' + email + '\'') because the example in the tutorial did. Even with a "clean" input it is wrong — numbers become strings, dates become strings, booleans become strings, and the planner doesn't cache the plan because the SQL text differs per call. Use parameter placeholders. Full stop. And on the way out: typed row shapes — pool.query<UserRow>(...) with a Zod or io-ts validation at the boundary so runtime drift from schema evolution surfaces in tests, not in production.
The rule:
PARAMETERIZED ALWAYS
- `pool.query('... WHERE x = $1', [value])`. Never string concatenation,
never template literals with values, never `format()` helpers.
- Identifiers that MUST be dynamic (rare) use a safelist + an
identifier quoting helper (pg-format's %I). Never interpolate
user-provided identifiers.
BULK
- Batch INSERTs: a single statement with `unnest($1::text[], $2::int[])`
or a driver's bulk insert helper. N×1-row INSERTs in a loop is
N round-trips.
- UPSERT: INSERT ... ON CONFLICT DO UPDATE SET ... WHERE .... Explicit
conflict target.
TYPED ROW SHAPES
- Every query function returns a typed result:
`async function findUser(id): Promise<User | null>`, where User is
a project-wide type.
- ORM / query-builder: Drizzle, Kysely, Prisma provide typed queries.
For raw pg, a Zod schema at the boundary parses rows:
`UserRowSchema.parse(rows[0])` — catches schema drift.
NULL HANDLING
- COALESCE only with typed defaults. `COALESCE(a, 0)` fine; COALESCE(a, '')
for nullable text is a smell — usually the column should be NOT NULL.
- `IS DISTINCT FROM` / `IS NOT DISTINCT FROM` for NULL-safe equality.
ENUM MAPPING
- DB enum <-> TS union type. Zod's z.nativeEnum or z.enum at the
boundary. Never use string literals inline for status values in
application code.
DATES
- Driver converts TIMESTAMPTZ <-> JS Date (UTC). Never format dates
into query strings. Never parse DB timestamps manually.
Before — concatenated query, SQL injection, untyped rows:
async function getOrders(userId, status) {
const sql = `SELECT * FROM orders WHERE user_id = ${userId} AND status = '${status}'`;
const { rows } = await pool.query(sql);
return rows;
}
status = "' OR '1'='1" returns every order in the database. rows is any[].
After — parameters, typed result, schema validation at the boundary:
const OrderRowSchema = z.object({
id: z.string(),
user_id: z.string(),
total: z.string().transform(s => new Decimal(s)),
status: z.enum(['pending', 'paid', 'shipped', 'cancelled', 'refunded']),
created_at: z.date(),
});
type OrderRow = z.infer<typeof OrderRowSchema>;
export async function getOrders(userId: string, status: OrderStatus): Promise<OrderRow[]> {
const { rows } = await pool.query(
`SELECT id, user_id, total, status, created_at
FROM app.orders
WHERE user_id = $1 AND status = $2
ORDER BY id DESC LIMIT 100`,
[userId, status],
);
return z.array(OrderRowSchema).parse(rows);
}
SQL-injection-free. Plan cached. Rows typed. Schema drift becomes a test failure.
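The unnest bulk insert from the BULK bullet, concretely. A sketch against a hypothetical app.order_items table: one statement, one round-trip, any N:
import { pool } from '../infra/db'; // illustrative path

type NewItem = { orderId: string; sku: string; qty: number };

// Parallel arrays unnest positionally into rows: one INSERT for N items.
export async function insertOrderItems(items: NewItem[]): Promise<void> {
  if (items.length === 0) return;
  await pool.query(
    `INSERT INTO app.order_items (order_id, sku, qty)
     SELECT * FROM unnest($1::bigint[], $2::text[], $3::int[])`,
    [items.map(i => i.orderId), items.map(i => i.sku), items.map(i => i.qty)],
  );
}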
Rule 8: TypeScript/Node Integration — Schema Is The Source Of Truth
A PostgreSQL project in TypeScript has three sources of truth that can drift: the SQL DDL, the application types, and the query helpers. Cursor generates all three independently — a migration that adds NOT NULL to a column whose TS type is still optional; a findUser that returns { id: number } when the column is BIGINT (JavaScript silently truncates at 2^53); a query that selects a column that doesn't exist because a refactor renamed it. The fix is a single source of truth: the SQL schema — with a tool (Kysely + introspection, Drizzle with pull, PgTyped, sqlc) that generates TS types from the live database, so the TS compiler sees every schema change.
The rule:
SOURCE OF TRUTH IS THE DATABASE
- Schema lives in migrations. TS types are GENERATED from the DB
(or from the migration files) — never hand-written to mirror the
schema.
- Tools:
Drizzle: `drizzle-kit pull` or codegen from the schema file.
Kysely: kysely-codegen introspects the DB.
Prisma: schema.prisma IS the truth; `prisma migrate` generates SQL
AND TS.
PgTyped / sqlc: write SQL, generate typed functions.
Pick one; don't mix.
NUMERIC / BIGINT
- BIGINT comes back as `string` from node-postgres by default (safe).
Never parse to Number if the column can exceed 2^53. Use bigint
(native) or a Decimal type for monetary amounts.
- Configure pg types explicitly: `types.setTypeParser(20, val => BigInt(val))`
or keep as string and parse at the boundary.
DATES
- Leave pg's default Date parsing unless the column is `date` (no
time) — then parse to a LocalDate / plain string. Don't let
Postgres timezone semantics leak into JS without intention.
JSONB
- JSONB -> unknown in TS. Validate on read with Zod: the column may
have any shape (historical data, partial writes). Tests include a
"schema drift" fixture with old-shape documents.
NULL VS UNDEFINED
- Postgres NULL -> JS null (driver default). `undefined` vs `null` is
a common TS bug; prefer explicit null at the DB boundary, undefined
only for "not loaded".
MIGRATIONS + TYPES IN CI
- CI step: spin up a Postgres container, apply all migrations,
regenerate types, check `git diff` is empty. If the schema changes
and types don't, CI fails.
QUERY LOCATIONS
- Raw SQL in `src/db/queries/*.sql` with PgTyped, OR query-builder
calls in `src/db/queries/*.ts`. Never inline ad-hoc SQL in a
handler.
Before — hand-written types, BIGINT truncation, drift on rename:
interface User { id: number; email: string; }
// Migration later renames email -> email_address. This interface still compiles.
async function getUser(id: number) {
const { rows } = await pool.query('SELECT id, email FROM users WHERE id=$1', [id]);
return rows[0] as User; // now at runtime: { id, email_address }
}
Cast hides the rename. id: number silently truncates BIGINTs > 2^53.
After — generated types, runtime validation, CI guards drift:
// Generated by kysely-codegen from the live DB:
export interface Database {
users: {
id: ColumnType<string, string | undefined, never>; // BIGINT -> string
email: ColumnType<string, string, string>;
created_at: ColumnType<Date, string | undefined, never>;
};
}
export async function getUser(id: string) {
const row = await db
.selectFrom('users')
.select(['id', 'email', 'created_at'])
.where('id', '=', id)
.executeTakeFirst();
return row ?? null;
}
Rename the column in a migration, re-run codegen, the handler fails to compile until updated. BIGINT stays a string. Types and schema don't drift.
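The setTypeParser configuration the NUMERIC / BIGINT bullet describes belongs in one startup module. A sketch (OIDs 20 and 1700 are the stable built-in ids for int8 and numeric):
import { types } from 'pg';

// int8 (BIGINT, OID 20): opt into native bigint instead of the string default.
// Only do this if nothing downstream JSON.stringifies these values naively.
types.setTypeParser(20, (val) => BigInt(val));

// numeric (OID 1700): keep as string; hand it to a Decimal at the boundary.
// Never parseFloat — it silently loses precision.
types.setTypeParser(1700, (val) => val);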
The Complete .cursorrules File
Drop this in the repo root. Cursor picks it up automatically; point Claude Code at it from CLAUDE.md.
# PostgreSQL — Production Patterns
## Schema Design
- PK: BIGINT GENERATED ALWAYS AS IDENTITY (or UUID via built-in
gen_random_uuid(); pgcrypto on PG < 13). Never SERIAL.
- Money: NUMERIC(p, s); never FLOAT/DOUBLE. Timestamps: TIMESTAMPTZ
always. Text: TEXT with CHECK length; never VARCHAR(n) unless length
is a domain constraint.
- FKs always declare REFERENCES and explicit ON DELETE.
- NOT NULL by default. CHECK constraints for domain invariants.
UNIQUE for natural keys. created_at / updated_at on every business
table (updated_at via trigger).
- Identifiers: snake_case, plural tables. Schema `app`, never
`public` for business tables.
## Migrations
- Forward-only. Rollbacks are new forward migrations.
- Every migration: `SET lock_timeout = '5s'`. CREATE INDEX uses
CONCURRENTLY (own file, no tx).
- ADD COLUMN NOT NULL without constant default = full rewrite ->
expand/contract (add nullable, backfill in batches, SET NOT NULL).
- Destructive operations in dedicated migrations with INTENT comment.
- Every migration tested in CI against representative data.
## Indexes
- Only for a real query in the same PR. Remove unused indexes.
- ESR ordering: equality, sort, range.
- Every FK gets an index on the referencing column.
- Partial indexes for hot subsets (`WHERE status = 'pending'`).
- INCLUDE for covering indexes / index-only scans.
## Query Writing
- EXPLAIN (ANALYZE, BUFFERS) on new queries. No Seq Scan on
large tables without justification.
- Never SELECT *. List columns.
- Every query has a LIMIT; keyset pagination over OFFSET.
- No N+1 — test the query count on hot paths.
- Don't wrap indexed columns in functions (use citext / functional
indexes).
- pg_stat_statements reviewed weekly.
## Connection Pooling
- One pg.Pool per process, sized conservatively (DB max /
pods * safety).
- idleTimeoutMillis, connectionTimeoutMillis, statement_timeout
explicit. Keep-alive on.
- pgBouncer transaction-mode in production; driver configured for
compatibility (no session features).
- pool.query preferred. Manual client acquire ONLY for transactions,
with finally-release.
## Transactions
- withTransaction(fn, { isolation }) is the only tx helper.
- Transactions are short; no 3rd-party HTTP inside.
- Isolation: READ COMMITTED default; SERIALIZABLE for invariants ->
caller retries on 40001.
- SELECT FOR UPDATE / SKIP LOCKED for queues.
- idle_in_transaction_session_timeout set at the DB.
## Parameterized + Typed
- Always $1, $2 parameters. Never string interpolation. Dynamic
identifiers only via %I with a safelist.
- Bulk INSERT via unnest arrays or driver bulk helpers.
- Typed result rows; Zod (or codegen) validates at the boundary.
- BIGINT as string (or bigint). Never truncate to Number.
## TS/Node Integration
- DB is the source of truth for types — generate via Drizzle /
Kysely codegen / Prisma / PgTyped. Never hand-mirror.
- CI re-runs codegen and fails on drift.
- JSONB validated on read with Zod, including historical shapes.
- Null at the DB boundary, undefined only for "not loaded".
End-to-End Example: A Tenant-Scoped Orders Feed
Without rules: SERIAL ids, unindexed joins, SELECT *, per-request connect, hand-built types.
CREATE TABLE orders (id SERIAL, user_id INT, total FLOAT, status VARCHAR(255), created_at TIMESTAMP);
async function feed(userId) {
const client = new Client(...); await client.connect();
const { rows } = await client.query(`SELECT * FROM orders WHERE user_id = ${userId}`);
await client.end();
return rows;
}
With rules: typed schema, partial compound index, parameterized query, pool, generated types.
CREATE INDEX orders_tenant_user_created_idx
ON app.orders (tenant_id, user_id, created_at DESC)
WHERE status IN ('paid', 'shipped');
export async function feed(tenantId: string, userId: string): Promise<OrderRow[]> {
const { rows } = await pool.query(
`SELECT id, total, status, created_at
FROM app.orders
WHERE tenant_id = $1 AND user_id = $2 AND status IN ('paid','shipped')
ORDER BY created_at DESC LIMIT 50`,
[tenantId, userId],
);
return z.array(OrderRowSchema).parse(rows);
}
Partial index scan. Parameterized. Types generated. Latency flat as the table grows.
Get the Full Pack
These eight rules cover the PostgreSQL patterns where AI assistants consistently reach for the wrong tutorial. Drop them into .cursorrules and the next SQL you generate will be constrained, indexed, bounded, typed, and pooled — without having to re-prompt.
If you want the expanded pack — these eight plus rules for partitioning, materialized views, logical replication, row-level security, full-text search, pg_notify patterns, pgvector / semantic search, audit via triggers, and testing with Testcontainers + truncation strategies — it is bundled in Cursor Rules Pack v2 ($27, one payment, lifetime updates). Drop it in your repo, stop fighting your AI, ship PostgreSQL you would actually merge.