<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Chris Milne</title>
    <description>The latest articles on DEV Community by Chris Milne (@cmilne84).</description>
    <link>https://dev.to/cmilne84</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F271710%2F9b81dd61-b24c-4da4-9567-ecec22a737f1.jpg</url>
      <title>DEV Community: Chris Milne</title>
      <link>https://dev.to/cmilne84</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/cmilne84"/>
    <language>en</language>
    <item>
      <title>Postgres Query Sorting Traps</title>
      <dc:creator>Chris Milne</dc:creator>
      <pubDate>Thu, 13 Feb 2025 21:09:20 +0000</pubDate>
      <link>https://dev.to/cmilne84/postgres-query-sorting-traps-4aa7</link>
      <guid>https://dev.to/cmilne84/postgres-query-sorting-traps-4aa7</guid>
      <description>&lt;h2&gt;
  
  
  The Problem...
&lt;/h2&gt;

&lt;p&gt;Recently, in our AWS RDS based Postgres database, we saw the emergence of request timeouts (&amp;gt;30 sec) occurring in relation to preview loads of one of our larger database queries. In our system, a preview load fetches up to 20 rows of data via a standard http request and is therefore bound to a 30 second timeout (full report loads are handed off to an event cue with more generous timeouts).&lt;/p&gt;

&lt;p&gt;That is to say, this problem was around a query that was terminated with &lt;code&gt;limit 20&lt;/code&gt;...&lt;/p&gt;

&lt;h3&gt;
  
  
  The Query
&lt;/h3&gt;

&lt;p&gt;At first glance, the query seemed harmless:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;created_at&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;description&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;some_table&lt;/span&gt;
&lt;span class="k"&gt;JOIN&lt;/span&gt; &lt;span class="p"&gt;...&lt;/span&gt; &lt;span class="k"&gt;DO&lt;/span&gt; &lt;span class="n"&gt;MANY&lt;/span&gt; &lt;span class="n"&gt;JOINS&lt;/span&gt; &lt;span class="p"&gt;...&lt;/span&gt;
&lt;span class="k"&gt;LEFT&lt;/span&gt; &lt;span class="k"&gt;JOIN&lt;/span&gt; &lt;span class="p"&gt;...&lt;/span&gt; &lt;span class="k"&gt;MORE&lt;/span&gt; &lt;span class="n"&gt;JOINS&lt;/span&gt; &lt;span class="n"&gt;AGAIN&lt;/span&gt; &lt;span class="p"&gt;...&lt;/span&gt;
&lt;span class="k"&gt;LATERAL&lt;/span&gt; &lt;span class="k"&gt;JOIN&lt;/span&gt; &lt;span class="p"&gt;...&lt;/span&gt; &lt;span class="n"&gt;RE&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;USED&lt;/span&gt; &lt;span class="n"&gt;CALCULATED&lt;/span&gt; &lt;span class="n"&gt;FIELDS&lt;/span&gt; &lt;span class="p"&gt;...&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;some_filter&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'some_value'&lt;/span&gt;
&lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;description&lt;/span&gt;
&lt;span class="k"&gt;LIMIT&lt;/span&gt; &lt;span class="mi"&gt;20&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Given that LIMIT 20 was in place, our expectation was that this query would quickly return results. However, it was consistently timing out.&lt;/p&gt;

&lt;h3&gt;
  
  
  The Investigation
&lt;/h3&gt;

&lt;p&gt;Through using the Postgres query planer, we realised that the key detail we overlooked was that ORDER BY was being applied to a non-indexed text column (&lt;code&gt;description&lt;/code&gt;). This meant that Postgres had to:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Scan all rows matching &lt;code&gt;some_filter&lt;/code&gt; – a full table scan or index scan, depending on filters.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Sort the entire result set based on description, which is an expensive operation for text fields.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Apply the LIMIT after sorting was complete.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In essence, Postgres couldn't just grab the first 20 rows — it had to evaluate all potential matches, sort them, and only then select the top 20. This resulted in an unexpectedly expensive query plan, regardless of the 20 row limit.&lt;/p&gt;

&lt;h3&gt;
  
  
  The Fix
&lt;/h3&gt;

&lt;p&gt;Since the sort order was only needed for consistency and not for any business logic, we changed the query to order by the primary key instead:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;created_at&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;description&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;some_table&lt;/span&gt;
&lt;span class="k"&gt;JOIN&lt;/span&gt; &lt;span class="p"&gt;...&lt;/span&gt; &lt;span class="k"&gt;DO&lt;/span&gt; &lt;span class="n"&gt;MANY&lt;/span&gt; &lt;span class="n"&gt;JOINS&lt;/span&gt; &lt;span class="p"&gt;...&lt;/span&gt;
&lt;span class="k"&gt;LEFT&lt;/span&gt; &lt;span class="k"&gt;JOIN&lt;/span&gt; &lt;span class="p"&gt;...&lt;/span&gt; &lt;span class="k"&gt;MORE&lt;/span&gt; &lt;span class="n"&gt;JOINS&lt;/span&gt; &lt;span class="n"&gt;AGAIN&lt;/span&gt; &lt;span class="p"&gt;...&lt;/span&gt;
&lt;span class="k"&gt;LATERAL&lt;/span&gt; &lt;span class="k"&gt;JOIN&lt;/span&gt; &lt;span class="p"&gt;...&lt;/span&gt; &lt;span class="n"&gt;RE&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;USED&lt;/span&gt; &lt;span class="n"&gt;CALCULATED&lt;/span&gt; &lt;span class="n"&gt;FIELDS&lt;/span&gt; &lt;span class="p"&gt;...&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;some_filter&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'some_value'&lt;/span&gt;
&lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;id&lt;/span&gt;
&lt;span class="k"&gt;LIMIT&lt;/span&gt; &lt;span class="mi"&gt;20&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Because &lt;code&gt;id&lt;/code&gt; is the primary key and indexed, Postgres could efficiently retrieve and return the top 20 rows without a costly sorting operation. The result? The query execution time dropped from 30+ seconds to milliseconds.&lt;/p&gt;

&lt;h3&gt;
  
  
  Lessons Learned
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;ORDER BY isn't free — even on a seemingly small dataset, sorting can introduce significant overhead if done on a non-indexed column.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;LIMIT doesn't prevent unnecessary work—the database must still evaluate all potential rows before limiting the results (seems obvious when you say it like that!).&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Indexes matter — if sorting is necessary, ensure the column is indexed appropriately to avoid full-table sorting.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Question your sort requirements — if ordering is only for consistency rather than user-facing logic, sorting by an indexed column can be a simple but powerful optimization.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Final Thoughts
&lt;/h3&gt;

&lt;p&gt;This was a great reminder that even small queries can hide performance pitfalls. If you're facing slow queries, always check the query plan (&lt;code&gt;EXPLAIN ANALYZE&lt;/code&gt;) and ensure you're not accidentally forcing Postgres to do more work than necessary.&lt;/p&gt;

&lt;p&gt;Have you run into similar performance surprises? Have I missed a key detail in my analysis? Let me know in the comments!&lt;/p&gt;

</description>
      <category>postgres</category>
      <category>database</category>
      <category>performance</category>
      <category>sql</category>
    </item>
  </channel>
</rss>
