A practical, developer-friendly guide to PostgreSQL with real, copy-pasteable code. This article covers core concepts, brief architecture highlights, and—most importantly—hands-on examples: DDL, DML, indexes, JSONB, full-text search, partitioning, transactions, and short client examples in Node.js and Python. Use it as a quick reference.
Why PostgreSQL (short)
PostgreSQL is a powerful open-source object-relational database. It gives you SQL standards compliance, extensibility (custom types, operators), excellent JSON support (jsonb), robust concurrency via MVCC, a strong extension ecosystem (PostGIS, TimescaleDB), and many index types.
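A quick taste of that extensibility, as a minimal sketch (the type, domain, and table names here are just illustrations):
-- a custom enum type and a constrained domain
CREATE TYPE post_status AS ENUM ('draft', 'published', 'archived');
CREATE DOMAIN email_address AS TEXT CHECK (VALUE ~ '^[^@]+@[^@]+$');
-- use them like built-in types
CREATE TABLE status_demo (
  id SERIAL PRIMARY KEY,
  contact email_address,
  status post_status NOT NULL DEFAULT 'draft'
);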
Quick setup (commands)
Run these in a Unix shell (assuming postgres user exists):
# start psql as postgres user
sudo -u postgres psql
-- inside psql: create DB and user
CREATE DATABASE demo;
CREATE USER demo_user WITH PASSWORD 'demo_pass';
GRANT ALL PRIVILEGES ON DATABASE demo TO demo_user;
\q
Connect as the new user:
psql -h localhost -U demo_user -d demo
# or
PGPASSWORD=demo_pass psql -h localhost -U demo_user -d demo
Schema: Blog example
We’ll build a small blog schema to demonstrate features.
-- create tables
CREATE TABLE authors (
id SERIAL PRIMARY KEY,
username TEXT NOT NULL UNIQUE,
email TEXT NOT NULL UNIQUE,
created_at TIMESTAMP WITH TIME ZONE DEFAULT now()
);
CREATE TABLE posts (
id BIGSERIAL PRIMARY KEY,
author_id INT NOT NULL REFERENCES authors(id) ON DELETE CASCADE,
title TEXT NOT NULL,
body TEXT NOT NULL,
tags TEXT[], -- simple array of tags
metadata JSONB, -- flexible JSON metadata
published_at TIMESTAMP WITH TIME ZONE,
created_at TIMESTAMP WITH TIME ZONE DEFAULT now()
);
Basic CRUD
Insert author and post:
INSERT INTO authors (username, email) VALUES ('alice', 'alice@example.com') RETURNING id;
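-- the next insert assumes the RETURNING above produced id 1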
INSERT INTO posts (author_id, title, body, tags, metadata, published_at)
VALUES
(1, 'Hello Postgres', 'This is the body', ARRAY['postgres','db'], '{"views": 0, "likes": 0}', now())
RETURNING id;
Select with JOIN:
SELECT p.id, p.title, a.username, p.published_at
FROM posts p
JOIN authors a ON a.id = p.author_id
WHERE p.published_at IS NOT NULL
ORDER BY p.published_at DESC
LIMIT 10;
Update and delete:
UPDATE posts SET metadata = jsonb_set(metadata, '{views}', ( (metadata->>'views')::int + 1 )::text::jsonb )
WHERE id = 1;
DELETE FROM posts WHERE id = 123;
Indexes & EXPLAIN
Create some commonly used indexes:
-- index on foreign key for faster joins
CREATE INDEX idx_posts_author ON posts(author_id);
-- index on published_at for range queries/sorting
CREATE INDEX idx_posts_published_at ON posts(published_at);
-- GIN index for tags array
CREATE INDEX idx_posts_tags_gin ON posts USING GIN (tags);
-- GIN index for jsonb metadata or text search on JSON fields
CREATE INDEX idx_posts_metadata_gin ON posts USING GIN (metadata);
Use EXPLAIN ANALYZE to inspect plans:
EXPLAIN ANALYZE
SELECT * FROM posts WHERE tags @> ARRAY['postgres'];
Read the output: look for Index Scan vs. Seq Scan, compare estimated vs. actual row counts, and check the timing. Use it to decide whether to add indexes or rewrite queries.
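Adding BUFFERS to the invocation is a useful habit; with the GIN index on tags in place (created above), you should see a Bitmap Index Scan on idx_posts_tags_gin instead of a Seq Scan:
EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM posts WHERE tags @> ARRAY['postgres'];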
JSONB: storing and querying JSON
Store structured data and query it efficiently.
-- add more metadata
UPDATE posts SET metadata = metadata || '{"comments": [{"user":"bob","text":"nice!"}], "views": 10}' WHERE id = 1;
-- read a scalar from JSONB
SELECT metadata->>'views' AS views_text FROM posts WHERE id = 1;
-- query by JSON key
SELECT * FROM posts WHERE metadata->>'views' IS NOT NULL AND (metadata->>'views')::int > 100;
-- find posts where metadata has a nested field
SELECT * FROM posts WHERE metadata @> '{"comments": [{"user": "bob"}] }';
Combine jsonb with a GIN index for speed. The default GIN index created earlier handles both key-existence (?) and containment (@>) queries; the jsonb_path_ops operator class builds a smaller, faster index that supports containment only:
-- smaller and faster, but only for @> containment queries
CREATE INDEX idx_posts_metadata_path_ops ON posts USING GIN (metadata jsonb_path_ops);
Full-text search (simple)
Add a tsvector column or compute on the fly.
-- add a tsvector column for efficient search
ALTER TABLE posts ADD COLUMN search_vector tsvector;
-- populate it (concatenate title and body)
UPDATE posts SET search_vector = to_tsvector('english', coalesce(title,'') || ' ' || coalesce(body,''));
-- keep it updated with trigger
CREATE FUNCTION posts_search_vector_update() RETURNS trigger AS $$
BEGIN
NEW.search_vector := to_tsvector('english', coalesce(NEW.title,'') || ' ' || coalesce(NEW.body,''));
RETURN NEW;
END
$$ LANGUAGE plpgsql;
CREATE TRIGGER trg_posts_search_vector BEFORE INSERT OR UPDATE
ON posts FOR EACH ROW EXECUTE FUNCTION posts_search_vector_update();
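-- Hedged alternative (PostgreSQL 12+): a STORED generated column keeps the
-- vector current without a trigger. The column name below is hypothetical,
-- chosen so it does not clash with search_vector above.
ALTER TABLE posts ADD COLUMN search_vector_gen tsvector
  GENERATED ALWAYS AS (to_tsvector('english', coalesce(title,'') || ' ' || coalesce(body,''))) STORED;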
-- index the vector
CREATE INDEX idx_posts_search_vector ON posts USING GIN (search_vector);
-- search
SELECT id, title, ts_rank(search_vector, plainto_tsquery('english', 'postgres tutorial')) AS rank
FROM posts
WHERE search_vector @@ plainto_tsquery('english', 'postgres tutorial')
ORDER BY rank DESC
LIMIT 10;
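The "compute on the fly" variant mentioned above needs no extra column; it is fine for small tables, but it cannot use the GIN index built here:
SELECT id, title
FROM posts
WHERE to_tsvector('english', coalesce(title,'') || ' ' || coalesce(body,''))
      @@ plainto_tsquery('english', 'postgres tutorial');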
Window Functions & Analytics
Example: top authors by number of posts with running totals.
SELECT author_id, cnt,
SUM(cnt) OVER (ORDER BY cnt DESC ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS running_total
FROM (
SELECT author_id, COUNT(*) AS cnt
FROM posts
GROUP BY author_id
) sub
ORDER BY cnt DESC;
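Another handy pattern, sketched as an illustration: each author's most recent published post via ROW_NUMBER():
SELECT id, author_id, title
FROM (
  SELECT p.id, p.author_id, p.title,
         ROW_NUMBER() OVER (PARTITION BY p.author_id ORDER BY p.published_at DESC) AS rn
  FROM posts p
  WHERE p.published_at IS NOT NULL
) ranked
WHERE rn = 1;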
Transactions & LOCKs (safe updates)
Use transactions to ensure atomic updates:
BEGIN;
-- example: increment view count safely
UPDATE posts
SET metadata = jsonb_set(metadata, '{views}', ((metadata->>'views')::int + 1)::text::jsonb)
WHERE id = 1;
-- do other related writes...
COMMIT;
If you need stronger isolation or to avoid race conditions, you can lock a row:
BEGIN;
SELECT * FROM posts WHERE id = 1 FOR UPDATE;
-- now safe to read/modify and write back
COMMIT;
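For the "stronger isolation" route, you can instead run the whole transaction at the SERIALIZABLE level; a minimal sketch (your application must be prepared to retry on serialization failures):
BEGIN ISOLATION LEVEL SERIALIZABLE;
-- a conflicting concurrent transaction makes COMMIT fail with a
-- serialization error instead of silently interleaving
UPDATE posts
SET metadata = jsonb_set(metadata, '{views}', ((metadata->>'views')::int + 1)::text::jsonb)
WHERE id = 1;
COMMIT;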
Partitioning (time-based)
Partition posts by year (declarative partitioning):
-- create partitioned table
CREATE TABLE posts_partitioned (
  id bigserial,
  author_id int NOT NULL,
  title text,
  body text,
  created_at timestamptz NOT NULL DEFAULT now(),
  PRIMARY KEY (id, created_at) -- a PK on a partitioned table must include the partition key
) PARTITION BY RANGE (created_at);
-- create partitions
CREATE TABLE posts_2024 PARTITION OF posts_partitioned
FOR VALUES FROM ('2024-01-01') TO ('2025-01-01');
CREATE TABLE posts_2025 PARTITION OF posts_partitioned
FOR VALUES FROM ('2025-01-01') TO ('2026-01-01');
Inserts are routed to the correct partition automatically. Partitioning improves query performance on large tables and simplifies maintenance (for example, dropping an old partition instead of bulk-deleting rows).
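A quick sanity check of routing and partition pruning, against the tables above:
-- this row is routed to posts_2025 automatically
INSERT INTO posts_partitioned (author_id, title, body, created_at)
VALUES (1, 'Partitioned post', 'Body text', '2025-06-15');
-- the plan should touch only posts_2025 (partition pruning)
EXPLAIN SELECT * FROM posts_partitioned
WHERE created_at >= '2025-01-01' AND created_at < '2026-01-01';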
Logical replication example (basic)
Set up a publication on primary:
-- on primary
CREATE PUBLICATION my_publication FOR TABLE posts, authors;
On the subscriber:
-- create subscriber (runs on subscriber)
CREATE SUBSCRIPTION my_subscription
CONNECTION 'host=primary_host port=5432 dbname=demo user=replicator password=secret'
PUBLICATION my_publication;
This keeps the published tables replicated via logical replication. Note: replication also needs server-side configuration (WAL settings and a pg_hba.conf entry for the replication user); see the server docs and the sketch below.
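A minimal sketch of those prerequisites on the primary (wal_level takes effect after a restart; the replicator role matches the CONNECTION string above, and you still need a pg_hba.conf entry allowing it to connect):
-- on the primary; requires a server restart
ALTER SYSTEM SET wal_level = 'logical';
-- role used in the subscription's CONNECTION string
CREATE ROLE replicator WITH LOGIN REPLICATION PASSWORD 'secret';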
Short client examples
Node.js (using pg)
// npm install pg
const { Pool } = require('pg');

const pool = new Pool({
  user: 'demo_user',
  host: 'localhost',
  database: 'demo',
  password: 'demo_pass',
  port: 5432
});

async function getRecentPosts() {
  const res = await pool.query(
    `SELECT p.id, p.title, a.username
     FROM posts p JOIN authors a ON a.id = p.author_id
     WHERE p.published_at IS NOT NULL
     ORDER BY p.published_at DESC LIMIT 10`
  );
  return res.rows;
}

getRecentPosts().then(rows => {
  console.log(rows);
  pool.end();
}).catch(err => {
  console.error(err);
  pool.end();
});
Use connection pooling (PgBouncer) for production.
Python (using psycopg)
# pip install "psycopg[binary]"
import psycopg

dsn = "dbname=demo user=demo_user password=demo_pass host=localhost"

with psycopg.connect(dsn) as conn:
    with conn.cursor() as cur:
        cur.execute("""
            SELECT p.id, p.title, a.username
            FROM posts p
            JOIN authors a ON a.id = p.author_id
            WHERE p.published_at IS NOT NULL
            ORDER BY p.published_at DESC LIMIT 10
        """)
        for row in cur.fetchall():
            print(row)
Tips & Best Practices (practical)
- Use proper data types: smallint/int/bigint where appropriate; timestamptz for timezone-aware times.
- Index the columns you filter and join on, but avoid excessive indexes, which slow writes.
- Use EXPLAIN ANALYZE before optimizing; measure before changing.
- VACUUM / autovacuum: keep it healthy; tune it if you have many updates/deletes.
- Avoid long transactions that hold old row versions and cause bloat.
- Use prepared statements or parameterized queries to avoid SQL injection and boost planner stability.
- Monitor pg_stat_activity to find blocking queries and long-running transactions (see the query sketch after this list).
- Backups: use pg_basebackup for physical or pg_dump for logical backups.
- Use connection pooling (PgBouncer) to handle many short-lived connections.
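Here is a hedged monitoring sketch against pg_stat_activity that surfaces long transactions and who is blocking whom:
SELECT pid,
       now() - xact_start AS xact_age,
       state,
       wait_event_type,
       pg_blocking_pids(pid) AS blocked_by,
       left(query, 80) AS query
FROM pg_stat_activity
WHERE state <> 'idle'
ORDER BY xact_age DESC NULLS LAST;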
Advanced directions (where to go next)
- PostGIS for geospatial workloads.
- Logical replication + logical decoding for CDC (change data capture).
- TimescaleDB for time-series.
- Custom extensions (create types, operators).
- Parallel queries and server tuning (work_mem, shared_buffers, effective_cache_size, etc.).
- Security: roles, row level security (RLS), TLS (a minimal RLS sketch follows this list).
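As a taste of RLS, a minimal sketch; app.current_author_id is a hypothetical custom setting your application would set per session:
ALTER TABLE posts ENABLE ROW LEVEL SECURITY;
-- each author sees only their own rows; set per session with e.g.
-- SET app.current_author_id = '1';  (hypothetical setting name)
CREATE POLICY posts_own_rows ON posts
  USING (author_id = current_setting('app.current_author_id', true)::int);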
Conclusion
PostgreSQL is more than a database—it’s a platform you can extend and tune for a wide variety of workloads. Above you have a compact, practical set of examples: schema, indexes, JSONB, full-text, partitioning, transactions, and client code. Copy the snippets, run them locally, and use EXPLAIN ANALYZE to see how queries behave on your data.