Dan Draper for CipherStash

Posted on Jun 10 • Originally published at cipherstash.com

Securing PostgreSQL, in the order an attacker would try things

#postgres #security #database #tutorial

Postgres has a well-earned reputation for security. Roles, privileges, host-based authentication, SCRAM. The primitives are all there, and they're good.

And yet databases keep leaking. 82% of data breaches involve human error.
The engine is rarely the weak point. The configuration around it is.

Most hardening guides list fixes in no particular order, as if every gap were equally
likely to be the one that gets you. I think it's more useful to think like the other
side. So this walkthrough is ordered the way an attacker would actually probe your
database: the front door, the keys, the side channels you're leaking through, and
finally the data itself. Every step comes with the commands to do it. You should be
able to work through the whole list against a real database in an afternoon.

Door one: who can connect, and as what

An attacker's first question is the simplest one: can I reach this database at all,
and what will it let me do once I'm in?

Start at the network. Postgres decides who may connect, from where, and how they must
authenticate in pg_hba.conf. Restrict connections to the IP ranges that genuinely
need them and you've already closed off the casual scan. The
official documentation
covers the format.

Then assume someone gets a connection anyway, because eventually someone will. What
they can do is governed by roles
and privileges, and the
principle is boring but unbeatable: grant the minimum.

-- A role for the sales team, with exactly the access they need
CREATE ROLE sales_team;
ALTER ROLE sales_team LOGIN;
GRANT SELECT ON products TO sales_team;
GRANT SELECT ON orders TO sales_team;

Privileges drift. People change teams, contractors leave, and the grants stay. Audit
them on a schedule:

SELECT grantee, privilege_type, table_schema, table_name, is_grantable
FROM information_schema.role_table_grants
WHERE grantee = 'sales_team';

And revoke what no longer belongs:

REVOKE SELECT ON sensitive_data FROM temporary_contractors;

Door two: passwords that hold up

If the network lets them talk to Postgres, credentials are next. Two things matter
more than the rest.

First, make sure passwords are hashed and exchanged with SCRAM rather than MD5, and
set them in a way that never echoes plaintext into a terminal or a shell history:

SET password_encryption = 'scram-sha-256';

\password

The \password prompt in psql masks your input and passes it to the server already
encrypted. That beats typing ALTER USER ... PASSWORD 'hunter2' into a console that
keeps history.

Second, put a clock on credentials:

-- Set password expiration for a user
ALTER USER user1 VALID UNTIL '2026-12-31';

One detail that surprises people: Postgres does not roll that date forward for you.
Expiry is something you have to operationalize, not a switch you flip once.

Door three: the leak you configured yourself

Counterintuitively, one of the most common ways sensitive data escapes Postgres
involves no attacker at all. You write it to disk yourself, in the logs.

log_statement = 'all'

That single line copies every SQL statement into the log files, including the INSERT
that carried a customer's SSN and the ALTER USER that carried a password. Logs tend
to live with weaker access controls than the database, get shipped to third-party
aggregators, and linger in backups. A carefully secured table doesn't help if the
plaintext went to /var/log on the way in.

Safer defaults:

# Set log format and level in postgresql.conf
log_statement = 'none'
log_line_prefix = 'time=%t, pid=%p %q db=%d, usr=%u, client=%h , app=%a, line=%l '

You keep the operational metadata (who connected, from where, when) without recording
statement contents. The
logging documentation
covers the prefix tokens.

Wherever the logs land, treat them as data: restrict access, rotate, archive.

# Specify secure log location and rotation settings
log_directory = '/var/log/postgresql/'
log_filename = 'postgresql.log'
log_rotation_age = 1d
log_rotation_size = 0

And if what you actually need is an audit trail rather than ad hoc logging,
PGAudit does the job properly, with session and
object-level auditing, and pgaudit_analyze
loads the results back into a database for analysis.

Door four: assume they get in

Everything so far keeps people out. This layer decides what they get when that fails
anyway. A stolen snapshot, a leaked backup, an over-privileged role nobody revoked.
What does the data look like then?

The built-in answer is pgcrypto,
and it's worth understanding exactly why it falls short.

CREATE EXTENSION IF NOT EXISTS pgcrypto;

CREATE TABLE sensitive_data (
  id   serial PRIMARY KEY,
  name text,
  ssn  bytea
);

-- Insert data with server-side encryption
INSERT INTO sensitive_data (name, ssn)
VALUES ('Alice', pgp_sym_encrypt('123-45-6789', 'encryption_key'));

Alice's SSN is now ciphertext in the table. But look at where the key is: in the SQL,
on the server. Decryption happens server-side too:

SELECT name, pgp_sym_decrypt(ssn, 'encryption_key') AS decrypted_ssn
FROM sensitive_data;

If the server is compromised, the keys are compromised with it, and the encryption
bought you nothing. There's a performance tax as well: any query that needs to search
or sort that column has to decrypt it row by row first. This is a large part of why
pgcrypto is considered a poor fit for critical use.

Encrypt in the application, keep your queries

The fix for the key problem is to encrypt before the data ever reaches Postgres, so
the server never holds plaintext or key material. The catch has always been search:
WHERE email = ? is useless when ciphertext is random.
CipherStash Stack addresses both at once.
Data is encrypted in your application, and the values stored in Postgres carry
encrypted index structures (HMACs for equality, bloom filters for free-text search,
Order-Revealing-Encryption blocks for ranges) that the planner uses the way it uses
any other index.

A minimal end-to-end setup:

-- 1. Encrypted columns use the eql_v2_encrypted type
CREATE TABLE users (
  id    SERIAL PRIMARY KEY,
  email eql_v2_encrypted NOT NULL,
  ssn   eql_v2_encrypted NOT NULL
);

-- 2. An index per query pattern you actually use
CREATE INDEX idx_users_email_eq ON users USING HASH (eql_v2.hmac_256(email));
CREATE INDEX idx_users_ssn_eq   ON users USING HASH (eql_v2.hmac_256(ssn));

// 3. Describe what's encrypted in TypeScript — the source of truth
import { encryptedTable, encryptedColumn } from '@cipherstash/stack/schema'
import { Encryption } from '@cipherstash/stack'

const users = encryptedTable('users', {
  email: encryptedColumn('email').equality(),
  ssn:   encryptedColumn('ssn').equality(),
})

// 4. Initialise the client — keys come from your AWS KMS via ZeroKMS,
//    not from the database server.
const encryption = await Encryption({ schemas: [users] })

// 5. Insert + query — plaintext never enters Postgres
const encrypted = await encryption.encrypt('alice@example.com', {
  table: users, column: users.email,
})
await db.query('INSERT INTO users (email) VALUES ($1::jsonb)', [encrypted.data])

const term = await encryption.encryptQuery('alice@example.com', {
  table: users, column: users.email, queryType: 'equality',
})
const result = await db.query(
  'SELECT * FROM users WHERE email = $1::eql_v2_encrypted', [term.data],
)
const decrypted = await encryption.bulkDecryptModels(result.rows)

Per-value keys are derived on demand from a root key held in your own AWS KMS
account, brokered by ZeroKMS,
so there is no key on the database server for an attacker to find. A pg_dump of
that table contains ciphertext and opaque index structures. Nothing else. The stolen
snapshot, the over-privileged role, and the leaked SQL log all see the same thing.

(One naming note in case you go looking: older guides reference
@cipherstash/protectjs or @cipherstash/jseql. Those are predecessor libraries.
@cipherstash/stack is the current surface for new code.)

The full walkthrough, including free-text search and range queries, is in
Searchable encryption in Postgres: a working guide.

And if you can't change the application

Some databases serve code you can't modify: a legacy app, a BI tool, a third-party
client connecting over the wire. CipherStash Proxy
covers that case at the connection layer. It speaks the Postgres wire protocol on
both sides, so clients connect to it exactly as they would connect to Postgres.
Encrypted columns are configured in the Proxy rather than the application; queries
with predicates on those columns are rewritten to run over the encrypted indexes, and
results are decrypted on the way back.

Same eql_v2_encrypted column types, same indexes, either way. The choice between
Stack and Proxy is only about where encryption happens: in your app, or in front of
your database.

Door five: the copies you forgot about

Every backup is a complete copy of your database with none of its access controls.
Lock the files down:

# Restrict file permissions to owner only
chmod 600 backup_file.dump

And encrypt backups as they're created, not as a cleanup job afterward:

export BACKUP_PASSWORD="your_password_here"
pg_dump -d <dbname> -h localhost | openssl enc -aes-256-cbc -salt -pass env:BACKUP_PASSWORD -out dbdump.sql.encrypted

If backups go to cloud storage, use a service that supports client-side encryption so
the provider never holds the decryption key. And note the quiet bonus from door four:
if your sensitive columns are encrypted application-side, the dump was never
plaintext to begin with. That's the nicest property a backup can have.

Door six: the exploit you already know about

Stay current with PostgreSQL releases. It's
the least glamorous item here and the one with the clearest payoff, because most
real-world exploitation targets vulnerabilities that were patched long before the
attack. An unapplied update is a door you know is open, and so does everyone else.

What's behind the last door

No checklist makes you breach-proof, and I won't pretend this one does. What it
changes is the cost of each failure. If the network rule fails, privileges contain
it. If privileges fail, the logs hold nothing worth stealing. And if everything
fails, if someone walks out with the database files themselves, application-side
encryption decides whether they're holding your customers' data or noise.

Work through the doors in order. The attacker will.

:wq

DEV Community