DEV Community

Cover image for Stop Killing Your Indexes: A PostgreSQL Guide to Optimizing Non-SARGable Queries
Abhishek Gautam
Abhishek Gautam

Posted on

Stop Killing Your Indexes: A PostgreSQL Guide to Optimizing Non-SARGable Queries

1. The 30-second Elevator Pitch

Imagine you’re racing a delivery scooter through downtown traffic. You have GPS, but you tape over the screen and decide to check every single street sign instead. That is exactly what PostgreSQL does when your predicate is non-SARGable: it throws the index in the trash and inspects every row.

If you have an index on orders(order_date) but write

SELECT *
FROM orders
WHERE date_trunc('day', order_date) = '2024-07-16';
Enter fullscreen mode Exit fullscreen mode

PostgreSQL cannot use the index. Instead, it scans the entire table.

That scan scales linearly with table size, and on a busy OLTP system it quickly becomes the bottleneck that ruins your evening.

In this article we’ll turn that sequential horror into an index-only victory lap and measure the difference with EXPLAIN (ANALYZE, BUFFERS). No fluff, just battle-tested rewrites you can ship today.


2. What “SARGable” Actually Means

SARGable is an acronym coined in the 1970s by mainframe optimizers: Search ARGument ABLE.

In plain English, a predicate is SARGable when the planner can seek an index instead of scanning every row.

The concept is surprisingly simple once you internalize the rule of thumb:

Keep the left side of the operator naked.

In PostgreSQL this boils down to:

✅ SARGable form ❌ Non-SARGable form
col op constant function(col) op constant
col BETWEEN a AND b CAST(col AS text) LIKE '%foo'
col = ANY($1) lower(col) = 'foo'

If you wrap the column in anything—a cast, a function, arithmetic—the planner must treat the expression as an opaque blob and falls back to a sequential scan.

The good news? Once you see the pattern, every rewrite becomes mechanical.


3. Anatomy of a Non-SARGable Predicate

Let’s dissect the three most frequent patterns we see during code reviews and on-call incidents.

3.1 Function on the column (the classic)

-- BAD
SELECT *
FROM events
WHERE extract(epoch FROM created_at) > 1_724_000_000;
Enter fullscreen mode Exit fullscreen mode

At first glance this looks harmless. After all, extract() is a built-in, right?

But the b-tree index on created_at stores timestamps, not epochs.

The planner has no rule that lets it invert extract(epoch FROM created_at) back to a timestamp range, so it discards the index and scans the whole table.

3.2 Implicit casting (the silent killer)

-- BAD
SELECT *
FROM users
WHERE phone::bigint = 49_179_551_234;
Enter fullscreen mode Exit fullscreen mode

The cast is on the left side, so the index on phone (text) is useless.

Often this slips in when the application layer passes an integer but the column is text, or vice-versa. The SQL still runs, but the performance cliff arrives at scale.

3.3 Leading wildcard (LIKE and ILIKE)

-- BAD
SELECT *
FROM products
WHERE name ILIKE '%headphones';
Enter fullscreen mode Exit fullscreen mode

B-tree indexes store values in lexicographical order. A leading % says “I don’t know where the string starts,” which means the planner has to walk the entire index—or worse, the entire table.

In OLAP workloads this is sometimes acceptable; in OLTP it is a guarantee for 3 a.m. pages.


4. Detect Non-SARGable Queries in the Wild

You can’t fix what you can’t see. Here are three battle-tested ways to surface these anti-patterns.

4.1 Manual inspection with EXPLAIN

EXPLAIN (ANALYZE, BUFFERS, FORMAT text)
SELECT *
FROM orders
WHERE date_trunc('day', order_date) = '2024-07-16';
Enter fullscreen mode Exit fullscreen mode

Output (trimmed):

Seq Scan on orders  (cost=0.00..195 000.00 rows=1 width=...)
  Filter: (date_trunc('day'::text, order_date) = '2024-07-16'::date)
Enter fullscreen mode Exit fullscreen mode

The Seq Scan is the smoking gun. If you ever see a Seq Scan on a multi-million-row table for a selective predicate, assume non-SARGability until proven otherwise.

4.2 Bulk detection in pg_stat_statements

Enable the extension in postgresql.conf:

shared_preload_libraries = 'pg_stat_statements'
Enter fullscreen mode Exit fullscreen mode

Then query for sequential scans:

SELECT query, calls, mean_exec_time, shared_blks_read
FROM pg_stat_statements
WHERE query LIKE '%Seq Scan%'
ORDER BY mean_exec_time DESC
LIMIT 10;
Enter fullscreen mode Exit fullscreen mode

This surfaces the worst offenders without manual inspection.

Pro tip: dump the result into a Grafana panel and alert on mean_exec_time > 100 ms.

4.3 Automated linter: pg_query

pip install pg_query
pg_query -f app.sql | jq '.. | .FuncCall? | select(.funcname[0].String.str=="date_trunc")'
Enter fullscreen mode Exit fullscreen mode

Hook this into your CI pipeline. If the linter finds a function wrapped around an indexed column, the build fails. Culture beats tooling every time.


5. Rewrite Recipes: Turning Bad into Bad-A$$

Below are drop-in rewrites you can paste into your codebase today. Each one starts with the non-SARGable original, explains why it is slow, and then shows the fixed version with before/after timings.

5.1 Date truncation (the daily trap)

Developers love date_trunc('day', …) because it feels natural. Unfortunately it is kryptonite to b-tree indexes.

-- BEFORE (non-SARGable)
SELECT *
FROM orders
WHERE date_trunc('day', order_date) = '2024-07-16';
Enter fullscreen mode Exit fullscreen mode

The planner has to evaluate date_trunc() for every row.

-- AFTER (SARGable)
SELECT *
FROM orders
WHERE order_date >= '2024-07-16'
  AND order_date <  '2024-07-17';
Enter fullscreen mode Exit fullscreen mode

Plan:

Index Only Scan using orders_order_date_idx
  Index Cond: (order_date >= ... AND order_date < ...)
Enter fullscreen mode Exit fullscreen mode

Performance delta on a 10 M-row table:

Seq Scan 1 247 ms → Index Only Scan 0.8 ms.

That is three orders of magnitude faster for the cost of two keystrokes.

5.2 Lower-casing text (the case-insensitive email)

Email uniqueness checks are ubiquitous. The naïve way looks like this:

-- BEFORE
SELECT *
FROM users
WHERE lower(email) = 'alice@example.com';
Enter fullscreen mode Exit fullscreen mode

Even with an index on email, the planner cannot use it because the predicate is wrapped in lower().

You have three escape hatches:

  1. citext extension (zero code changes):
   CREATE EXTENSION IF NOT EXISTS citext;
   ALTER TABLE users ALTER COLUMN email TYPE citext;
   -- query stays the same
Enter fullscreen mode Exit fullscreen mode
  1. Expression index (keep the text column):
   CREATE INDEX users_lower_email_idx ON users ((lower(email)));
Enter fullscreen mode Exit fullscreen mode
  1. Case-insensitive collation (if you control the schema):
   SELECT *
   FROM users
   WHERE email = 'alice@example.com' COLLATE "und-x-icu";
Enter fullscreen mode Exit fullscreen mode

Pick the one that aligns with your team’s conventions.

5.3 LIKE with leading wildcard (the search box)

Frontend teams love suffix search because “users expect Google.”

The naïve query is:

-- BEFORE
SELECT *
FROM products
WHERE name ILIKE '%headphones';
Enter fullscreen mode Exit fullscreen mode

Plan:

Seq Scan on products
  Filter: (lower(name) ~~ '%headphones'::text)
Enter fullscreen mode Exit fullscreen mode

The database must read every row.

-- AFTER (trigram index)
CREATE EXTENSION IF NOT EXISTS pg_trgm;
CREATE INDEX products_name_gin_trgm ON products USING gin (name gin_trgm_ops);
-- Query becomes
SELECT *
FROM products
WHERE name % 'headphones';  -- similarity operator
Enter fullscreen mode Exit fullscreen mode

Plan:

Bitmap Heap Scan using products_name_gin_trgm
  Recheck Cond: (name % 'headphones'::text)
Enter fullscreen mode Exit fullscreen mode

The index returns candidate rows; the heap recheck removes false positives.

On a 5 M-row catalog this drops runtime from 4.2 s to 18 ms.


6. Advanced Patterns: When Rewriting Gets Messy

Sometimes business logic forces you to keep the function. In those cases you have two escape hatches: expression indexes and materialized columns. Both are well supported by PostgreSQL tooling.

6.1 Expression index (the quick win)

Suppose accounting insists on month-level reports:

-- Keep the query logic
SELECT *
FROM invoices
WHERE extract(month FROM due_date) = 12;
Enter fullscreen mode Exit fullscreen mode

You cannot change the query, but you can make the expression indexable:

CREATE INDEX invoices_month_idx ON invoices ((extract(month FROM due_date)));
Enter fullscreen mode Exit fullscreen mode

Check the plan:

Bitmap Heap Scan using invoices_month_idx
  Index Cond: (extract(month FROM due_date) = 12)
Enter fullscreen mode Exit fullscreen mode

The planner now performs an index scan.

Caveat: the index is 1 byte wider per row, so measure the bloat on very wide tables.

6.2 Materialized column + trigger (the heavy artillery)

If the expression is expensive (e.g., jsonb_extract_path_text(payload, 'user', 'id')) an expression index alone may not be enough.

Instead, store the extracted value:

ALTER TABLE invoices
  ADD COLUMN due_month int
  GENERATED ALWAYS AS (extract(month FROM due_date)) STORED;

CREATE INDEX invoices_due_month_idx ON invoices(due_month);
Enter fullscreen mode Exit fullscreen mode

Because the column is stored, it participates in partition pruning and foreign keys.

The trade-off is disk space and the need to VACUUM after updates.

6.3 Partial index for hot values (the surgical strike)

If you only care about the current fiscal year, a partial index shrinks the footprint:

CREATE INDEX orders_2024_idx ON orders(order_date)
WHERE order_date >= '2024-01-01' AND order_date < '2025-01-01';
Enter fullscreen mode Exit fullscreen mode

This index is tiny and keeps buffer churn low.

Just remember to add new indexes each January or automate it with a migration.


7. Partitioned Tables: Don’t Let Non-SARGable Predicates Kill Pruning

Partitioning is a brilliant scalability tool—until a stray to_char() disables partition pruning.

Assume monthly partitions on orders(order_date).

7.1 The wrong way

SELECT *
FROM orders
WHERE to_char(order_date, 'YYYY-MM') = '2024-07';
Enter fullscreen mode Exit fullscreen mode

Plan:

Append
  ->  Seq Scan on orders_2024_01
  ->  Seq Scan on orders_2024_02
  ...
  ->  Seq Scan on orders_2024_12
Enter fullscreen mode Exit fullscreen mode

Twelve partitions scanned for a single month.

7.2 The right way

SELECT *
FROM orders
WHERE order_date >= '2024-07-01'
  AND order_date <  '2024-08-01';
Enter fullscreen mode Exit fullscreen mode

Plan:

Append
  ->  Seq Scan on orders_2024_07
Enter fullscreen mode Exit fullscreen mode

The planner prunes the other 11 partitions.

On a 300-partition table this drops I/O from 120 GB to 10 GB.

Verify with:

EXPLAIN (ANALYZE, BUFFERS)
SELECT *
FROM orders
WHERE order_date >= '2024-07-01'
  AND order_date <  '2024-08-01';
Enter fullscreen mode Exit fullscreen mode

Look for the single partition in the Append node.


8. Measuring the Gain: A Reproducible Benchmark

Numbers without context are just noise. Spin up a disposable cluster with Docker:

docker run --name pg16 -e POSTGRES_PASSWORD=pass -p 5432:5432 -d postgres:16
psql -h localhost -U postgres -c "CREATE EXTENSION pgbench"
Enter fullscreen mode Exit fullscreen mode

8.1 Seed data

CREATE TABLE orders (
  id bigserial PRIMARY KEY,
  order_date timestamptz NOT NULL
);
INSERT INTO orders(order_date)
SELECT '2020-01-01'::timestamptz + random() * interval '4 years'
FROM generate_series(1, 10_000_000);
CREATE INDEX orders_order_date_idx ON orders(order_date);
Enter fullscreen mode Exit fullscreen mode

8.2 Run non-SARGable query

\timing on
SELECT count(*)
FROM orders
WHERE date_trunc('day', order_date) = '2023-12-25';
-- Time: 1 247 ms
Enter fullscreen mode Exit fullscreen mode

8.3 Run SARGable rewrite

SELECT count(*)
FROM orders
WHERE order_date >= '2023-12-25'
  AND order_date <  '2023-12-26';
-- Time: 0.8 ms
Enter fullscreen mode Exit fullscreen mode

1 556× faster with one line changed.


9. Monitoring & Regression Prevention

Fixing queries once is table stakes; preventing regressions is engineering.

9.1 Continuous profiling

Enable auto_explain in postgresql.conf:

shared_preload_libraries = 'auto_explain'
Enter fullscreen mode Exit fullscreen mode

Then:

LOAD 'auto_explain';
SET auto_explain.log_min_duration = 100;
SET auto_explain.log_analyze = on;
Enter fullscreen mode Exit fullscreen mode

Every query slower than 100 ms arrives in the log with full EXPLAIN (ANALYZE, BUFFERS).

Feed that into Loki or CloudWatch and alert on sequential scans.

9.2 Regression tests with pg_regress

Create a test file test_sargable.sql:

BEGIN;
EXPLAIN (COSTS OFF)
SELECT * FROM orders WHERE date_trunc('day', order_date) = '2024-01-01';
ROLLBACK;
Enter fullscreen mode Exit fullscreen mode

Expect the plan to contain Index Scan or fail the build.

9.3 Dashboard in Grafana

Query pg_stat_statements every minute:

SELECT query, mean_exec_time
FROM pg_stat_statements
WHERE query LIKE '%date_trunc%' AND mean_exec_time > 100;
Enter fullscreen mode Exit fullscreen mode

Attach an alert that pages on-call if the trend rises.


10. Checklist for Code Reviews

Paste this into your PR template:

  • [ ] No function calls on indexed columns in WHERE/JOIN
  • [ ] Date ranges written as >= start AND < end
  • [ ] Text comparisons use citext, expression index, or explicit collation
  • [ ] LIKE '%foo' has a trigram index or is rewritten with full-text search
  • [ ] EXPLAIN (ANALYZE) output shows Index Scan or Index Only Scan

11. TL;DR Cheat Sheet

Pattern Quick Fix
WHERE extract(epoch FROM col) Shift function to constant or create expression index
WHERE col::type = value Cast the constant instead
WHERE lower(col) = 'foo' Use citext, expression index, or case-insensitive collation
WHERE col LIKE '%foo' Add trigram index or use full-text search
WHERE to_char(col,'YYYY-MM')='2024-07' Use range predicates so partition pruning works

12. Further Reading & Credits

  • PostgreSQL Docs: Expression Indexes
  • OOZOU blog: Postgres Functions & Non-sargable Queries
  • Egor Rogov: Partition Pruning in PostgreSQL 15

Now go forth and let your indexes do what they were born to do.

Top comments (0)