Postgres Query Sorting Traps

Chris Milne — Thu, 13 Feb 2025 21:09:20 +0000

The Problem...

Recently, in our AWS RDS based Postgres database, we saw the emergence of request timeouts (>30 sec) occurring in relation to preview loads of one of our larger database queries. In our system, a preview load fetches up to 20 rows of data via a standard http request and is therefore bound to a 30 second timeout (full report loads are handed off to an event cue with more generous timeouts).

That is to say, this problem was around a query that was terminated with limit 20...

The Query

At first glance, the query seemed harmless:

SELECT id, name, created_at, description
FROM some_table
JOIN ... DO MANY JOINS ...
LEFT JOIN ... MORE JOINS AGAIN ...
LATERAL JOIN ... RE-USED CALCULATED FIELDS ...
WHERE some_filter = 'some_value'
ORDER BY description
LIMIT 20;

Given that LIMIT 20 was in place, our expectation was that this query would quickly return results. However, it was consistently timing out.

The Investigation

Through using the Postgres query planer, we realised that the key detail we overlooked was that ORDER BY was being applied to a non-indexed text column (description). This meant that Postgres had to:

Scan all rows matching some_filter – a full table scan or index scan, depending on filters.
Sort the entire result set based on description, which is an expensive operation for text fields.
Apply the LIMIT after sorting was complete.

In essence, Postgres couldn't just grab the first 20 rows — it had to evaluate all potential matches, sort them, and only then select the top 20. This resulted in an unexpectedly expensive query plan, regardless of the 20 row limit.

The Fix

Since the sort order was only needed for consistency and not for any business logic, we changed the query to order by the primary key instead:

SELECT id, name, created_at, description
FROM some_table
JOIN ... DO MANY JOINS ...
LEFT JOIN ... MORE JOINS AGAIN ...
LATERAL JOIN ... RE-USED CALCULATED FIELDS ...
WHERE some_filter = 'some_value'
ORDER BY id
LIMIT 20;

Because id is the primary key and indexed, Postgres could efficiently retrieve and return the top 20 rows without a costly sorting operation. The result? The query execution time dropped from 30+ seconds to milliseconds.

Lessons Learned

ORDER BY isn't free — even on a seemingly small dataset, sorting can introduce significant overhead if done on a non-indexed column.
LIMIT doesn't prevent unnecessary work—the database must still evaluate all potential rows before limiting the results (seems obvious when you say it like that!).
Indexes matter — if sorting is necessary, ensure the column is indexed appropriately to avoid full-table sorting.
Question your sort requirements — if ordering is only for consistency rather than user-facing logic, sorting by an indexed column can be a simple but powerful optimization.

Final Thoughts

This was a great reminder that even small queries can hide performance pitfalls. If you're facing slow queries, always check the query plan (EXPLAIN ANALYZE) and ensure you're not accidentally forcing Postgres to do more work than necessary.

Have you run into similar performance surprises? Have I missed a key detail in my analysis? Let me know in the comments!