Abhishek Gautam

Posted on Jul 16

Stop Killing Your Indexes: A PostgreSQL Guide to Optimizing Non-SARGable Queries

#database #postgres #performance #programming

1. The 30-second Elevator Pitch

Imagine you’re racing a delivery scooter through downtown traffic. You have GPS, but you tape over the screen and decide to check every single street sign instead. That is exactly what PostgreSQL does when your predicate is non-SARGable: it throws the index in the trash and inspects every row.

If you have an index on orders(order_date) but write

SELECT *
FROM orders
WHERE date_trunc('day', order_date) = '2024-07-16';

PostgreSQL cannot use the index. Instead, it scans the entire table.

That scan scales linearly with table size, and on a busy OLTP system it quickly becomes the bottleneck that ruins your evening.

In this article we’ll turn that sequential horror into an index-only victory lap and measure the difference with EXPLAIN (ANALYZE, BUFFERS). No fluff, just battle-tested rewrites you can ship today.

2. What “SARGable” Actually Means

SARGable is an acronym coined in the 1970s by mainframe optimizers: Search ARGument ABLE.

In plain English, a predicate is SARGable when the planner can seek an index instead of scanning every row.

The concept is surprisingly simple once you internalize the rule of thumb:

Keep the left side of the operator naked.

In PostgreSQL this boils down to:

✅ SARGable form	❌ Non-SARGable form
`col op constant`	`function(col) op constant`
`col BETWEEN a AND b`	`CAST(col AS text) LIKE '%foo'`
`col = ANY($1)`	`lower(col) = 'foo'`

If you wrap the column in anything—a cast, a function, arithmetic—the planner must treat the expression as an opaque blob and falls back to a sequential scan.

The good news? Once you see the pattern, every rewrite becomes mechanical.

3. Anatomy of a Non-SARGable Predicate

Let’s dissect the three most frequent patterns we see during code reviews and on-call incidents.

3.1 Function on the column (the classic)

-- BAD
SELECT *
FROM events
WHERE extract(epoch FROM created_at) > 1_724_000_000;

At first glance this looks harmless. After all, extract() is a built-in, right?

But the b-tree index on created_at stores timestamps, not epochs.

The planner has no rule that lets it invert extract(epoch FROM created_at) back to a timestamp range, so it discards the index and scans the whole table.

3.2 Implicit casting (the silent killer)

-- BAD
SELECT *
FROM users
WHERE phone::bigint = 49_179_551_234;

The cast is on the left side, so the index on phone (text) is useless.

Often this slips in when the application layer passes an integer but the column is text, or vice-versa. The SQL still runs, but the performance cliff arrives at scale.

3.3 Leading wildcard (LIKE and ILIKE)

-- BAD
SELECT *
FROM products
WHERE name ILIKE '%headphones';

B-tree indexes store values in lexicographical order. A leading % says “I don’t know where the string starts,” which means the planner has to walk the entire index—or worse, the entire table.

In OLAP workloads this is sometimes acceptable; in OLTP it is a guarantee for 3 a.m. pages.

4. Detect Non-SARGable Queries in the Wild

You can’t fix what you can’t see. Here are three battle-tested ways to surface these anti-patterns.

4.1 Manual inspection with `EXPLAIN`

EXPLAIN (ANALYZE, BUFFERS, FORMAT text)
SELECT *
FROM orders
WHERE date_trunc('day', order_date) = '2024-07-16';

Output (trimmed):

Seq Scan on orders  (cost=0.00..195 000.00 rows=1 width=...)
  Filter: (date_trunc('day'::text, order_date) = '2024-07-16'::date)

The Seq Scan is the smoking gun. If you ever see a Seq Scan on a multi-million-row table for a selective predicate, assume non-SARGability until proven otherwise.

4.2 Bulk detection in `pg_stat_statements`

Enable the extension in postgresql.conf:

shared_preload_libraries = 'pg_stat_statements'

Then query for sequential scans:

SELECT query, calls, mean_exec_time, shared_blks_read
FROM pg_stat_statements
WHERE query LIKE '%Seq Scan%'
ORDER BY mean_exec_time DESC
LIMIT 10;

This surfaces the worst offenders without manual inspection.

Pro tip: dump the result into a Grafana panel and alert on mean_exec_time > 100 ms.

4.3 Automated linter: `pg_query`

pip install pg_query
pg_query -f app.sql | jq '.. | .FuncCall? | select(.funcname[0].String.str=="date_trunc")'

Hook this into your CI pipeline. If the linter finds a function wrapped around an indexed column, the build fails. Culture beats tooling every time.

5. Rewrite Recipes: Turning Bad into Bad-A$$

Below are drop-in rewrites you can paste into your codebase today. Each one starts with the non-SARGable original, explains why it is slow, and then shows the fixed version with before/after timings.

5.1 Date truncation (the daily trap)

Developers love date_trunc('day', …) because it feels natural. Unfortunately it is kryptonite to b-tree indexes.

-- BEFORE (non-SARGable)
SELECT *
FROM orders
WHERE date_trunc('day', order_date) = '2024-07-16';

The planner has to evaluate date_trunc() for every row.

-- AFTER (SARGable)
SELECT *
FROM orders
WHERE order_date >= '2024-07-16'
  AND order_date <  '2024-07-17';

Plan:

Index Only Scan using orders_order_date_idx
  Index Cond: (order_date >= ... AND order_date < ...)

Performance delta on a 10 M-row table:

Seq Scan 1 247 ms → Index Only Scan 0.8 ms.

That is three orders of magnitude faster for the cost of two keystrokes.

5.2 Lower-casing text (the case-insensitive email)

Email uniqueness checks are ubiquitous. The naïve way looks like this:

-- BEFORE
SELECT *
FROM users
WHERE lower(email) = 'alice@example.com';

Even with an index on email, the planner cannot use it because the predicate is wrapped in lower().

You have three escape hatches:

citext extension (zero code changes):

   CREATE EXTENSION IF NOT EXISTS citext;
   ALTER TABLE users ALTER COLUMN email TYPE citext;
   -- query stays the same

Expression index (keep the text column):

   CREATE INDEX users_lower_email_idx ON users ((lower(email)));

Case-insensitive collation (if you control the schema):

   SELECT *
   FROM users
   WHERE email = 'alice@example.com' COLLATE "und-x-icu";

Pick the one that aligns with your team’s conventions.

5.3 LIKE with leading wildcard (the search box)

Frontend teams love suffix search because “users expect Google.”

The naïve query is:

-- BEFORE
SELECT *
FROM products
WHERE name ILIKE '%headphones';

Plan:

Seq Scan on products
  Filter: (lower(name) ~~ '%headphones'::text)

The database must read every row.

-- AFTER (trigram index)
CREATE EXTENSION IF NOT EXISTS pg_trgm;
CREATE INDEX products_name_gin_trgm ON products USING gin (name gin_trgm_ops);
-- Query becomes
SELECT *
FROM products
WHERE name % 'headphones';  -- similarity operator

Plan:

Bitmap Heap Scan using products_name_gin_trgm
  Recheck Cond: (name % 'headphones'::text)

The index returns candidate rows; the heap recheck removes false positives.

On a 5 M-row catalog this drops runtime from 4.2 s to 18 ms.

6. Advanced Patterns: When Rewriting Gets Messy

Sometimes business logic forces you to keep the function. In those cases you have two escape hatches: expression indexes and materialized columns. Both are well supported by PostgreSQL tooling.

6.1 Expression index (the quick win)

Suppose accounting insists on month-level reports:

-- Keep the query logic
SELECT *
FROM invoices
WHERE extract(month FROM due_date) = 12;

You cannot change the query, but you can make the expression indexable:

CREATE INDEX invoices_month_idx ON invoices ((extract(month FROM due_date)));

Check the plan:

Bitmap Heap Scan using invoices_month_idx
  Index Cond: (extract(month FROM due_date) = 12)

The planner now performs an index scan.

Caveat: the index is 1 byte wider per row, so measure the bloat on very wide tables.

6.2 Materialized column + trigger (the heavy artillery)

If the expression is expensive (e.g., jsonb_extract_path_text(payload, 'user', 'id')) an expression index alone may not be enough.

Instead, store the extracted value:

ALTER TABLE invoices
  ADD COLUMN due_month int
  GENERATED ALWAYS AS (extract(month FROM due_date)) STORED;

CREATE INDEX invoices_due_month_idx ON invoices(due_month);

Because the column is stored, it participates in partition pruning and foreign keys.

The trade-off is disk space and the need to VACUUM after updates.

6.3 Partial index for hot values (the surgical strike)

If you only care about the current fiscal year, a partial index shrinks the footprint:

CREATE INDEX orders_2024_idx ON orders(order_date)
WHERE order_date >= '2024-01-01' AND order_date < '2025-01-01';

This index is tiny and keeps buffer churn low.

Just remember to add new indexes each January or automate it with a migration.

7. Partitioned Tables: Don’t Let Non-SARGable Predicates Kill Pruning

Partitioning is a brilliant scalability tool—until a stray to_char() disables partition pruning.

Assume monthly partitions on orders(order_date).

7.1 The wrong way

SELECT *
FROM orders
WHERE to_char(order_date, 'YYYY-MM') = '2024-07';

Plan:

Append
  ->  Seq Scan on orders_2024_01
  ->  Seq Scan on orders_2024_02
  ...
  ->  Seq Scan on orders_2024_12

Twelve partitions scanned for a single month.

7.2 The right way

SELECT *
FROM orders
WHERE order_date >= '2024-07-01'
  AND order_date <  '2024-08-01';

Plan:

Append
  ->  Seq Scan on orders_2024_07

The planner prunes the other 11 partitions.

On a 300-partition table this drops I/O from 120 GB to 10 GB.

Verify with:

EXPLAIN (ANALYZE, BUFFERS)
SELECT *
FROM orders
WHERE order_date >= '2024-07-01'
  AND order_date <  '2024-08-01';

Look for the single partition in the Append node.

8. Measuring the Gain: A Reproducible Benchmark

Numbers without context are just noise. Spin up a disposable cluster with Docker:

docker run --name pg16 -e POSTGRES_PASSWORD=pass -p 5432:5432 -d postgres:16
psql -h localhost -U postgres -c "CREATE EXTENSION pgbench"

8.1 Seed data

CREATE TABLE orders (
  id bigserial PRIMARY KEY,
  order_date timestamptz NOT NULL
);
INSERT INTO orders(order_date)
SELECT '2020-01-01'::timestamptz + random() * interval '4 years'
FROM generate_series(1, 10_000_000);
CREATE INDEX orders_order_date_idx ON orders(order_date);

8.2 Run non-SARGable query

\timing on
SELECT count(*)
FROM orders
WHERE date_trunc('day', order_date) = '2023-12-25';
-- Time: 1 247 ms

8.3 Run SARGable rewrite

SELECT count(*)
FROM orders
WHERE order_date >= '2023-12-25'
  AND order_date <  '2023-12-26';
-- Time: 0.8 ms

1 556× faster with one line changed.

9. Monitoring & Regression Prevention

Fixing queries once is table stakes; preventing regressions is engineering.

9.1 Continuous profiling

Enable auto_explain in postgresql.conf:

shared_preload_libraries = 'auto_explain'

Then:

LOAD 'auto_explain';
SET auto_explain.log_min_duration = 100;
SET auto_explain.log_analyze = on;

Every query slower than 100 ms arrives in the log with full EXPLAIN (ANALYZE, BUFFERS).

Feed that into Loki or CloudWatch and alert on sequential scans.

9.2 Regression tests with `pg_regress`

Create a test file test_sargable.sql:

BEGIN;
EXPLAIN (COSTS OFF)
SELECT * FROM orders WHERE date_trunc('day', order_date) = '2024-01-01';
ROLLBACK;

Expect the plan to contain Index Scan or fail the build.

9.3 Dashboard in Grafana

Query pg_stat_statements every minute:

SELECT query, mean_exec_time
FROM pg_stat_statements
WHERE query LIKE '%date_trunc%' AND mean_exec_time > 100;

Attach an alert that pages on-call if the trend rises.

10. Checklist for Code Reviews

Paste this into your PR template:

[ ] No function calls on indexed columns in WHERE/JOIN
[ ] Date ranges written as >= start AND < end
[ ] Text comparisons use citext, expression index, or explicit collation
[ ] LIKE '%foo' has a trigram index or is rewritten with full-text search
[ ] EXPLAIN (ANALYZE) output shows Index Scan or Index Only Scan

11. TL;DR Cheat Sheet

Pattern	Quick Fix
`WHERE extract(epoch FROM col)`	Shift function to constant or create expression index
`WHERE col::type = value`	Cast the constant instead
`WHERE lower(col) = 'foo'`	Use `citext`, expression index, or case-insensitive collation
`WHERE col LIKE '%foo'`	Add trigram index or use `full-text search`
`WHERE to_char(col,'YYYY-MM')='2024-07'`	Use range predicates so partition pruning works

12. Further Reading & Credits

PostgreSQL Docs: Expression Indexes
OOZOU blog: Postgres Functions & Non-sargable Queries
Egor Rogov: Partition Pruning in PostgreSQL 15

Now go forth and let your indexes do what they were born to do.

DEV Community

Stop Killing Your Indexes: A PostgreSQL Guide to Optimizing Non-SARGable Queries

1. The 30-second Elevator Pitch

2. What “SARGable” Actually Means

3. Anatomy of a Non-SARGable Predicate

3.1 Function on the column (the classic)

3.2 Implicit casting (the silent killer)

3.3 Leading wildcard (LIKE and ILIKE)

4. Detect Non-SARGable Queries in the Wild

4.1 Manual inspection with `EXPLAIN`

4.2 Bulk detection in `pg_stat_statements`

4.3 Automated linter: `pg_query`

5. Rewrite Recipes: Turning Bad into Bad-A$$

5.1 Date truncation (the daily trap)

5.2 Lower-casing text (the case-insensitive email)

5.3 LIKE with leading wildcard (the search box)

6. Advanced Patterns: When Rewriting Gets Messy

6.1 Expression index (the quick win)

6.2 Materialized column + trigger (the heavy artillery)

6.3 Partial index for hot values (the surgical strike)

7. Partitioned Tables: Don’t Let Non-SARGable Predicates Kill Pruning

7.1 The wrong way

7.2 The right way

8. Measuring the Gain: A Reproducible Benchmark

8.1 Seed data

8.2 Run non-SARGable query

8.3 Run SARGable rewrite

9. Monitoring & Regression Prevention

9.1 Continuous profiling

9.2 Regression tests with `pg_regress`

9.3 Dashboard in Grafana

10. Checklist for Code Reviews

11. TL;DR Cheat Sheet

12. Further Reading & Credits

Top comments (0)

1. The 30-second Elevator Pitch

2. What “SARGable” Actually Means

3. Anatomy of a Non-SARGable Predicate

3.1 Function on the column (the classic)

3.2 Implicit casting (the silent killer)

3.3 Leading wildcard (LIKE and ILIKE)

4. Detect Non-SARGable Queries in the Wild

4.1 Manual inspection with EXPLAIN

4.2 Bulk detection in pg_stat_statements

4.3 Automated linter: pg_query

5. Rewrite Recipes: Turning Bad into Bad-A$$

5.1 Date truncation (the daily trap)

5.2 Lower-casing text (the case-insensitive email)

5.3 LIKE with leading wildcard (the search box)

6. Advanced Patterns: When Rewriting Gets Messy

6.1 Expression index (the quick win)

6.2 Materialized column + trigger (the heavy artillery)

6.3 Partial index for hot values (the surgical strike)

7. Partitioned Tables: Don’t Let Non-SARGable Predicates Kill Pruning

7.1 The wrong way

7.2 The right way

8. Measuring the Gain: A Reproducible Benchmark

8.1 Seed data

8.2 Run non-SARGable query

8.3 Run SARGable rewrite

9. Monitoring & Regression Prevention

9.1 Continuous profiling

9.2 Regression tests with pg_regress

9.3 Dashboard in Grafana

10. Checklist for Code Reviews

11. TL;DR Cheat Sheet

12. Further Reading & Credits

4.1 Manual inspection with `EXPLAIN`

4.2 Bulk detection in `pg_stat_statements`

4.3 Automated linter: `pg_query`

9.2 Regression tests with `pg_regress`