A practical, developer-friendly guide to PostgreSQL with real, copy-pasteable code. This article covers core concepts, brief architecture highlights, and—most importantly—hands-on examples: DDL, DML, indexes, JSONB, full-text search, partitioning, transactions, and short client examples in Node.js and Python. Use it as a quick reference.
Why PostgreSQL (short)
PostgreSQL is a powerful open-source object-relational database. It gives you SQL standards compliance, extensibility (custom types, operators), excellent JSON support (jsonb), robust concurrency via MVCC, a strong extension ecosystem (PostGIS, TimescaleDB), and many index types.
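A quick taste of that extensibility, as a minimal sketch (the type, domain, and table names here are just illustrations):
-- a custom enum type and a constrained domain
CREATE TYPE post_status AS ENUM ('draft', 'published', 'archived');
CREATE DOMAIN email_address AS TEXT CHECK (VALUE ~ '^[^@]+@[^@]+$');
-- use them like built-in types
CREATE TABLE status_demo (
  id SERIAL PRIMARY KEY,
  contact email_address,
  status post_status NOT NULL DEFAULT 'draft'
);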
Quick setup (commands)
Run these in a Unix shell (assuming postgres user exists):
# start psql as postgres user
sudo -u postgres psql
-- inside psql: create DB and user
CREATE DATABASE demo;
CREATE USER demo_user WITH PASSWORD 'demo_pass';
GRANT ALL PRIVILEGES ON DATABASE demo TO demo_user;
\q
Connect as the new user:
psql -h localhost -U demo_user -d demo
# or
PGPASSWORD=demo_pass psql -h localhost -U demo_user -d demo
Schema: Blog example
We’ll build a small blog schema to demonstrate features.
-- create tables
CREATE TABLE authors (
id SERIAL PRIMARY KEY,
username TEXT NOT NULL UNIQUE,
email TEXT NOT NULL UNIQUE,
created_at TIMESTAMP WITH TIME ZONE DEFAULT now()
);
CREATE TABLE posts (
id BIGSERIAL PRIMARY KEY,
author_id INT NOT NULL REFERENCES authors(id) ON DELETE CASCADE,
title TEXT NOT NULL,
body TEXT NOT NULL,
tags TEXT[], -- simple array of tags
metadata JSONB, -- flexible JSON metadata
published_at TIMESTAMP WITH TIME ZONE,
created_at TIMESTAMP WITH TIME ZONE DEFAULT now()
);
Basic CRUD
Insert author and post:
INSERT INTO authors (username, email) VALUES ('alice', 'alice@example.com') RETURNING id;
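-- the next insert assumes the RETURNING above produced id 1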
INSERT INTO posts (author_id, title, body, tags, metadata, published_at)
VALUES
(1, 'Hello Postgres', 'This is the body', ARRAY['postgres','db'], '{"views": 0, "likes": 0}', now())
RETURNING id;
Select with JOIN:
SELECT p.id, p.title, a.username, p.published_at
FROM posts p
JOIN authors a ON a.id = p.author_id
WHERE p.published_at IS NOT NULL
ORDER BY p.published_at DESC
LIMIT 10;
Update and delete:
UPDATE posts SET metadata = jsonb_set(metadata, '{views}', ( (metadata->>'views')::int + 1 )::text::jsonb )
WHERE id = 1;
DELETE FROM posts WHERE id = 123;
Indexes & EXPLAIN
Create some commonly used indexes:
-- index on foreign key for faster joins
CREATE INDEX idx_posts_author ON posts(author_id);
-- index on published_at for range queries/sorting
CREATE INDEX idx_posts_published_at ON posts(published_at);
-- GIN index for tags array
CREATE INDEX idx_posts_tags_gin ON posts USING GIN (tags);
-- GIN index for jsonb metadata or text search on JSON fields
CREATE INDEX idx_posts_metadata_gin ON posts USING GIN (metadata);
Use EXPLAIN ANALYZE to inspect plans:
EXPLAIN ANALYZE
SELECT * FROM posts WHERE tags @> ARRAY['postgres'];
Read the output: look for Index Scan vs. Seq Scan, compare estimated vs. actual row counts, and check the timing. Use it to decide whether to add indexes or rewrite queries.
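Adding BUFFERS to the invocation is a useful habit; with the GIN index on tags in place (created above), you should see a Bitmap Index Scan on idx_posts_tags_gin instead of a Seq Scan:
EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM posts WHERE tags @> ARRAY['postgres'];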
JSONB: storing and querying JSON
Store structured data and query it efficiently.
-- add more metadata
UPDATE posts SET metadata = metadata || '{"comments": [{"user":"bob","text":"nice!"}], "views": 10}' WHERE id = 1;
-- read a scalar from JSONB
SELECT metadata->>'views' AS views_text FROM posts WHERE id = 1;
-- query by JSON key
SELECT * FROM posts WHERE metadata->>'views' IS NOT NULL AND (metadata->>'views')::int > 100;
-- find posts where metadata has a nested field
SELECT * FROM posts WHERE metadata @> '{"comments": [{"user": "bob"}] }';
Combine jsonb with a GIN index for speed. The default GIN index created earlier handles both key-existence (?) and containment (@>) queries; the jsonb_path_ops operator class builds a smaller, faster index that supports containment only:
-- smaller and faster, but only for @> containment queries
CREATE INDEX idx_posts_metadata_path_ops ON posts USING GIN (metadata jsonb_path_ops);
Full-text search (simple)
Add a tsvector column or compute on the fly.
-- add a tsvector column for efficient search
ALTER TABLE posts ADD COLUMN search_vector tsvector;
-- populate it (concatenate title and body)
UPDATE posts SET search_vector = to_tsvector('english', coalesce(title,'') || ' ' || coalesce(body,''));
-- keep it updated with trigger
CREATE FUNCTION posts_search_vector_update() RETURNS trigger AS $$
BEGIN
NEW.search_vector := to_tsvector('english', coalesce(NEW.title,'') || ' ' || coalesce(NEW.body,''));
RETURN NEW;
END
$$ LANGUAGE plpgsql;
CREATE TRIGGER trg_posts_search_vector BEFORE INSERT OR UPDATE
ON posts FOR EACH ROW EXECUTE FUNCTION posts_search_vector_update();
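-- Hedged alternative (PostgreSQL 12+): a STORED generated column keeps the
-- vector current without a trigger. The column name below is hypothetical,
-- chosen so it does not clash with search_vector above.
ALTER TABLE posts ADD COLUMN search_vector_gen tsvector
  GENERATED ALWAYS AS (to_tsvector('english', coalesce(title,'') || ' ' || coalesce(body,''))) STORED;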
-- index the vector
CREATE INDEX idx_posts_search_vector ON posts USING GIN (search_vector);
-- search
SELECT id, title, ts_rank(search_vector, plainto_tsquery('english', 'postgres tutorial')) AS rank
FROM posts
WHERE search_vector @@ plainto_tsquery('english', 'postgres tutorial')
ORDER BY rank DESC
LIMIT 10;
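The "compute on the fly" variant mentioned above needs no extra column; it is fine for small tables, but it cannot use the GIN index built here:
SELECT id, title
FROM posts
WHERE to_tsvector('english', coalesce(title,'') || ' ' || coalesce(body,''))
      @@ plainto_tsquery('english', 'postgres tutorial');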
Window Functions & Analytics
Example: top authors by number of posts with running totals.
SELECT author_id, cnt,
SUM(cnt) OVER (ORDER BY cnt DESC ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS running_total
FROM (
SELECT author_id, COUNT(*) AS cnt
FROM posts
GROUP BY author_id
) sub
ORDER BY cnt DESC;
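Another handy pattern, sketched as an illustration: each author's most recent published post via ROW_NUMBER():
SELECT id, author_id, title
FROM (
  SELECT p.id, p.author_id, p.title,
         ROW_NUMBER() OVER (PARTITION BY p.author_id ORDER BY p.published_at DESC) AS rn
  FROM posts p
  WHERE p.published_at IS NOT NULL
) ranked
WHERE rn = 1;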
Transactions & LOCKs (safe updates)
Use transactions to ensure atomic updates:
BEGIN;
-- example: increment view count safely
UPDATE posts
SET metadata = jsonb_set(metadata, '{views}', ((metadata->>'views')::int + 1)::text::jsonb)
WHERE id = 1;
-- do other related writes...
COMMIT;
If you need stronger isolation or to avoid race conditions, you can lock a row:
BEGIN;
SELECT * FROM posts WHERE id = 1 FOR UPDATE;
-- now safe to read/modify and write back
COMMIT;
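For the "stronger isolation" route, you can instead run the whole transaction at the SERIALIZABLE level; a minimal sketch (your application must be prepared to retry on serialization failures):
BEGIN ISOLATION LEVEL SERIALIZABLE;
-- a conflicting concurrent transaction makes COMMIT fail with a
-- serialization error instead of silently interleaving
UPDATE posts
SET metadata = jsonb_set(metadata, '{views}', ((metadata->>'views')::int + 1)::text::jsonb)
WHERE id = 1;
COMMIT;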
Partitioning (time-based)
Partition posts by year (declarative partitioning):
-- create partitioned table
CREATE TABLE posts_partitioned (
  id bigserial,
  author_id int NOT NULL,
  title text,
  body text,
  created_at timestamptz NOT NULL DEFAULT now(),
  PRIMARY KEY (id, created_at) -- a PK on a partitioned table must include the partition key
) PARTITION BY RANGE (created_at);
-- create partitions
CREATE TABLE posts_2024 PARTITION OF posts_partitioned
FOR VALUES FROM ('2024-01-01') TO ('2025-01-01');
CREATE TABLE posts_2025 PARTITION OF posts_partitioned
FOR VALUES FROM ('2025-01-01') TO ('2026-01-01');
Inserts are routed to the correct partition automatically. Partitioning improves query performance on large tables and simplifies maintenance (for example, dropping an old partition instead of bulk-deleting rows).
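A quick sanity check of routing and partition pruning, against the tables above:
-- this row is routed to posts_2025 automatically
INSERT INTO posts_partitioned (author_id, title, body, created_at)
VALUES (1, 'Partitioned post', 'Body text', '2025-06-15');
-- the plan should touch only posts_2025 (partition pruning)
EXPLAIN SELECT * FROM posts_partitioned
WHERE created_at >= '2025-01-01' AND created_at < '2026-01-01';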
Logical replication example (basic)
Set up a publication on primary:
-- on primary
CREATE PUBLICATION my_publication FOR TABLE posts, authors;
On the subscriber:
-- create subscriber (runs on subscriber)
CREATE SUBSCRIPTION my_subscription
CONNECTION 'host=primary_host port=5432 dbname=demo user=replicator password=secret'
PUBLICATION my_publication;
This keeps the published tables replicated via logical replication. Note: replication also needs server-side configuration (WAL settings and a pg_hba.conf entry for the replication user); see the server docs and the sketch below.
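A minimal sketch of those prerequisites on the primary (wal_level takes effect after a restart; the replicator role matches the CONNECTION string above, and you still need a pg_hba.conf entry allowing it to connect):
-- on the primary; requires a server restart
ALTER SYSTEM SET wal_level = 'logical';
-- role used in the subscription's CONNECTION string
CREATE ROLE replicator WITH LOGIN REPLICATION PASSWORD 'secret';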
Short client examples
Node.js (using pg)
// npm install pg
const { Pool } = require('pg');

const pool = new Pool({
  user: 'demo_user',
  host: 'localhost',
  database: 'demo',
  password: 'demo_pass',
  port: 5432
});

async function getRecentPosts() {
  const res = await pool.query(
    `SELECT p.id, p.title, a.username
     FROM posts p JOIN authors a ON a.id = p.author_id
     WHERE p.published_at IS NOT NULL
     ORDER BY p.published_at DESC LIMIT 10`
  );
  return res.rows;
}

getRecentPosts().then(rows => {
  console.log(rows);
  pool.end();
}).catch(err => {
  console.error(err);
  pool.end();
});
Use connection pooling (PgBouncer) for production.
Python (using psycopg)
# pip install "psycopg[binary]"
import psycopg

dsn = "dbname=demo user=demo_user password=demo_pass host=localhost"

with psycopg.connect(dsn) as conn:
    with conn.cursor() as cur:
        cur.execute("""
            SELECT p.id, p.title, a.username
            FROM posts p
            JOIN authors a ON a.id = p.author_id
            WHERE p.published_at IS NOT NULL
            ORDER BY p.published_at DESC LIMIT 10
        """)
        for row in cur.fetchall():
            print(row)
Tips & Best Practices (practical)
- Use proper data types: smallint/int/bigint where appropriate; timestamptz for timezone-aware times.
- Index the columns you filter and join on, but avoid excessive indexes, which slow writes.
- Use EXPLAIN ANALYZE before optimizing; measure before changing.
- VACUUM / autovacuum: keep it healthy; tune it if you have many updates/deletes.
- Avoid long transactions that hold old row versions and cause bloat.
- Use prepared statements or parameterized queries to avoid SQL injection and boost planner stability.
- Monitor pg_stat_activity to find blocking queries and long-running transactions (see the query sketch after this list).
- Backups: use pg_basebackup for physical or pg_dump for logical backups.
- Use connection pooling (PgBouncer) to handle many short-lived connections.
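Here is a hedged monitoring sketch against pg_stat_activity that surfaces long transactions and who is blocking whom:
SELECT pid,
       now() - xact_start AS xact_age,
       state,
       wait_event_type,
       pg_blocking_pids(pid) AS blocked_by,
       left(query, 80) AS query
FROM pg_stat_activity
WHERE state <> 'idle'
ORDER BY xact_age DESC NULLS LAST;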
Advanced directions (where to go next)
- PostGIS for geospatial workloads.
- Logical replication + logical decoding for CDC (change data capture).
- TimescaleDB for time-series.
- Custom extensions (create types, operators).
- Parallel queries and server tuning (work_mem, shared_buffers, effective_cache_size, etc.).
- Security: roles, row level security (RLS), TLS (a minimal RLS sketch follows this list).
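As a taste of RLS, a minimal sketch; app.current_author_id is a hypothetical custom setting your application would set per session:
ALTER TABLE posts ENABLE ROW LEVEL SECURITY;
-- each author sees only their own rows; set per session with e.g.
-- SET app.current_author_id = '1';  (hypothetical setting name)
CREATE POLICY posts_own_rows ON posts
  USING (author_id = current_setting('app.current_author_id', true)::int);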
Conclusion
PostgreSQL is more than a database—it’s a platform you can extend and tune for a wide variety of workloads. Above you have a compact, practical set of examples: schema, indexes, JSONB, full-text, partitioning, transactions, and client code. Copy the snippets, run them locally, and use EXPLAIN ANALYZE to see how queries behave on your data.