<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Vivek Kumar</title>
    <description>The latest articles on DEV Community by Vivek Kumar (@vivekdraxlr).</description>
    <link>https://dev.to/vivekdraxlr</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3870326%2F6ff71248-2d18-4571-9d3e-9cd0e728f3a2.png</url>
      <title>DEV Community: Vivek Kumar</title>
      <link>https://dev.to/vivekdraxlr</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/vivekdraxlr"/>
    <language>en</language>
    <item>
      <title>Real-Time Metrics Dashboards with SQL: Patterns That Actually Scale</title>
      <dc:creator>Vivek Kumar</dc:creator>
      <pubDate>Mon, 15 Jun 2026 09:03:05 +0000</pubDate>
      <link>https://dev.to/vivekdraxlr/real-time-metrics-dashboards-with-sql-patterns-that-actually-scale-3i73</link>
      <guid>https://dev.to/vivekdraxlr/real-time-metrics-dashboards-with-sql-patterns-that-actually-scale-3i73</guid>
      <description>&lt;p&gt;You ship a SaaS app. A customer asks for a dashboard. "Just show us live numbers," they say. Easy.&lt;/p&gt;

&lt;p&gt;Three weeks later, your &lt;code&gt;SELECT COUNT(*) FROM events WHERE created_at &amp;gt; now() - interval '24 hours'&lt;/code&gt; query is taking eight seconds, your database CPU is pegged, and every dashboard auto-refresh slows down the rest of the app. Sound familiar?&lt;/p&gt;

&lt;p&gt;Real-time metrics dashboards look trivial on the product roadmap and turn into a swamp the moment real data shows up. The good news: most of the hard parts have well-understood SQL patterns. You don't need a separate streaming database, a Kafka cluster, or a six-figure analytics vendor to ship a fast, live dashboard. You just need to stop treating "real-time" as one problem.&lt;/p&gt;

&lt;p&gt;This post walks through the SQL patterns that hold up when traffic grows: pre-aggregation, materialized views, change streaming with &lt;code&gt;LISTEN&lt;/code&gt;/&lt;code&gt;NOTIFY&lt;/code&gt;, and how to pick the right refresh strategy for each tile. Examples use PostgreSQL, but the patterns work in MySQL, SQL Server, or any modern relational database.&lt;/p&gt;

&lt;h2&gt;
  
  
  What "real-time" actually means
&lt;/h2&gt;

&lt;p&gt;Before writing a single query, ask: how fresh does this number need to be?&lt;/p&gt;

&lt;p&gt;A lot of "real-time" requirements melt away under that question. A signup funnel doesn't need sub-second updates. An admin status page might need true streaming. A revenue chart for last quarter doesn't need any freshness at all. Group the tiles on your dashboard into three buckets:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
&lt;th&gt;Bucket&lt;/th&gt;
&lt;th&gt;Freshness&lt;/th&gt;
&lt;th&gt;Typical pattern&lt;/th&gt;
&lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
&lt;td&gt;Historical&lt;/td&gt;
&lt;td&gt;Hours to days&lt;/td&gt;
&lt;td&gt;Pre-aggregated rollup tables, refreshed nightly&lt;/td&gt;
&lt;/tr&gt;
    &lt;tr&gt;
&lt;td&gt;Near real-time&lt;/td&gt;
&lt;td&gt;Seconds to minutes&lt;/td&gt;
&lt;td&gt;Materialized views refreshed on a schedule&lt;/td&gt;
&lt;/tr&gt;
    &lt;tr&gt;
&lt;td&gt;True real-time&lt;/td&gt;
&lt;td&gt;Sub-second&lt;/td&gt;
&lt;td&gt;LISTEN/NOTIFY or change-data capture pushed to clients&lt;/td&gt;
&lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Most dashboards are 80% historical, 15% near real-time, and 5% true real-time. The reason your dashboard is slow is almost always that you put a tile in the wrong bucket.&lt;/p&gt;

&lt;h2&gt;
  
  
  Pattern 1: Pre-aggregation for historical tiles
&lt;/h2&gt;

&lt;p&gt;The cheapest query is one you've already run. For anything that doesn't need to change minute-to-minute, roll it up into a separate table.&lt;/p&gt;

&lt;p&gt;Say you have a 200 million row &lt;code&gt;events&lt;/code&gt; table and you want a chart of daily active users for the last 90 days. Running the raw query every page load is malpractice:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- DON'T DO THIS on every dashboard refresh&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;date_trunc&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'day'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;created_at&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="k"&gt;day&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
       &lt;span class="k"&gt;COUNT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;DISTINCT&lt;/span&gt; &lt;span class="n"&gt;user_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;dau&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;events&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;created_at&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="n"&gt;now&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;interval&lt;/span&gt; &lt;span class="s1"&gt;'90 days'&lt;/span&gt;
&lt;span class="k"&gt;GROUP&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
&lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Instead, build a rollup table and refresh it once a day:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;daily_active_users&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="k"&gt;day&lt;/span&gt;        &lt;span class="nb"&gt;DATE&lt;/span&gt; &lt;span class="k"&gt;PRIMARY&lt;/span&gt; &lt;span class="k"&gt;KEY&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;dau&lt;/span&gt;        &lt;span class="nb"&gt;INTEGER&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;computed_at&lt;/span&gt; &lt;span class="n"&gt;TIMESTAMPTZ&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt; &lt;span class="k"&gt;DEFAULT&lt;/span&gt; &lt;span class="n"&gt;now&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="c1"&gt;-- Run nightly via pg_cron, your job runner, or a scheduled task&lt;/span&gt;
&lt;span class="k"&gt;INSERT&lt;/span&gt; &lt;span class="k"&gt;INTO&lt;/span&gt; &lt;span class="n"&gt;daily_active_users&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;day&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;dau&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;date_trunc&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'day'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;created_at&lt;/span&gt;&lt;span class="p"&gt;)::&lt;/span&gt;&lt;span class="nb"&gt;date&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="k"&gt;day&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
       &lt;span class="k"&gt;COUNT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;DISTINCT&lt;/span&gt; &lt;span class="n"&gt;user_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;events&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;created_at&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="k"&gt;current_date&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;interval&lt;/span&gt; &lt;span class="s1"&gt;'2 days'&lt;/span&gt;
  &lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="n"&gt;created_at&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt;  &lt;span class="k"&gt;current_date&lt;/span&gt;
&lt;span class="k"&gt;GROUP&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
&lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;CONFLICT&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;day&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;DO&lt;/span&gt; &lt;span class="k"&gt;UPDATE&lt;/span&gt;
  &lt;span class="k"&gt;SET&lt;/span&gt; &lt;span class="n"&gt;dau&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;EXCLUDED&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;dau&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="n"&gt;computed_at&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;now&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Your dashboard query becomes a trivial scan of 90 rows:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="k"&gt;day&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;dau&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;daily_active_users&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="k"&gt;day&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="k"&gt;current_date&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="mi"&gt;90&lt;/span&gt;
&lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="k"&gt;day&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Latency drops from seconds to under a millisecond. This is the same trick used inside almost every analytics product you've ever paid for.&lt;/p&gt;

&lt;h2&gt;
  
  
  Pattern 2: Materialized views for near real-time
&lt;/h2&gt;

&lt;p&gt;For tiles that need to be a few minutes fresh — open tickets, signups today, MRR this hour — a materialized view gives you the same precompute benefit with less plumbing.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="n"&gt;MATERIALIZED&lt;/span&gt; &lt;span class="k"&gt;VIEW&lt;/span&gt; &lt;span class="n"&gt;mv_signups_today&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;plan&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
       &lt;span class="k"&gt;COUNT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;                          &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;signups&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
       &lt;span class="k"&gt;COUNT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;FILTER&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;paid&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;      &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;paid_signups&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
       &lt;span class="k"&gt;MAX&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;created_at&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;                   &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;last_signup_at&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;users&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;created_at&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="k"&gt;current_date&lt;/span&gt;
&lt;span class="k"&gt;GROUP&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;plan&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;UNIQUE&lt;/span&gt; &lt;span class="k"&gt;INDEX&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;mv_signups_today&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;plan&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Refresh it on a schedule:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="n"&gt;REFRESH&lt;/span&gt; &lt;span class="n"&gt;MATERIALIZED&lt;/span&gt; &lt;span class="k"&gt;VIEW&lt;/span&gt; &lt;span class="n"&gt;CONCURRENTLY&lt;/span&gt; &lt;span class="n"&gt;mv_signups_today&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;CONCURRENTLY&lt;/code&gt; is the magic word — it lets readers keep querying the old data while the new snapshot is built. Without it, your dashboard locks every time the view refreshes. And &lt;code&gt;CONCURRENTLY&lt;/code&gt; requires a unique index on the view, which is why we created one above.&lt;/p&gt;

&lt;p&gt;How often to refresh depends on the tile. A &lt;code&gt;pg_cron&lt;/code&gt; job that runs every minute is fine for most "live" SaaS metrics:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;cron&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;schedule&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'refresh-signups'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'* * * * *'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="err"&gt;$$&lt;/span&gt;&lt;span class="n"&gt;REFRESH&lt;/span&gt; &lt;span class="n"&gt;MATERIALIZED&lt;/span&gt; &lt;span class="k"&gt;VIEW&lt;/span&gt; &lt;span class="n"&gt;CONCURRENTLY&lt;/span&gt; &lt;span class="n"&gt;mv_signups_today&lt;/span&gt;&lt;span class="err"&gt;$$&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Vanilla PostgreSQL materialized views always do a full rebuild — there's no built-in incremental mode. For big tables you'll want incremental view maintenance via an extension like &lt;code&gt;pg_ivm&lt;/code&gt;, or a streaming database that maintains views incrementally for you.&lt;/p&gt;

&lt;h2&gt;
  
  
  Pattern 3: True real-time with LISTEN/NOTIFY
&lt;/h2&gt;

&lt;p&gt;Some tiles genuinely need to update the instant something happens — an order placed, a build finished, a user joined a call. Polling for those is wasteful, especially across many concurrent dashboard users. PostgreSQL has a built-in pub/sub mechanism that's perfect for this.&lt;/p&gt;

&lt;p&gt;A trigger emits a notification whenever a row changes:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;OR&lt;/span&gt; &lt;span class="k"&gt;REPLACE&lt;/span&gt; &lt;span class="k"&gt;FUNCTION&lt;/span&gt; &lt;span class="n"&gt;notify_order_change&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="k"&gt;RETURNS&lt;/span&gt; &lt;span class="k"&gt;TRIGGER&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="err"&gt;$$&lt;/span&gt;
&lt;span class="k"&gt;BEGIN&lt;/span&gt;
  &lt;span class="n"&gt;PERFORM&lt;/span&gt; &lt;span class="n"&gt;pg_notify&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="s1"&gt;'orders_channel'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;json_build_object&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
      &lt;span class="s1"&gt;'op'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;     &lt;span class="n"&gt;TG_OP&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="s1"&gt;'id'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;     &lt;span class="k"&gt;NEW&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="s1"&gt;'amount'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;NEW&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;amount&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="s1"&gt;'status'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;NEW&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;status&lt;/span&gt;
    &lt;span class="p"&gt;)::&lt;/span&gt;&lt;span class="nb"&gt;text&lt;/span&gt;
  &lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="k"&gt;RETURN&lt;/span&gt; &lt;span class="k"&gt;NEW&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;END&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="err"&gt;$$&lt;/span&gt; &lt;span class="k"&gt;LANGUAGE&lt;/span&gt; &lt;span class="n"&gt;plpgsql&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;TRIGGER&lt;/span&gt; &lt;span class="n"&gt;orders_notify&lt;/span&gt;
&lt;span class="k"&gt;AFTER&lt;/span&gt; &lt;span class="k"&gt;INSERT&lt;/span&gt; &lt;span class="k"&gt;OR&lt;/span&gt; &lt;span class="k"&gt;UPDATE&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;orders&lt;/span&gt;
&lt;span class="k"&gt;FOR&lt;/span&gt; &lt;span class="k"&gt;EACH&lt;/span&gt; &lt;span class="k"&gt;ROW&lt;/span&gt; &lt;span class="k"&gt;EXECUTE&lt;/span&gt; &lt;span class="k"&gt;FUNCTION&lt;/span&gt; &lt;span class="n"&gt;notify_order_change&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Your app subscribes once per process and fans the update out to every dashboard client over Server-Sent Events or WebSockets:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# pseudo-Python with asyncpg
&lt;/span&gt;&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;pool&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;acquire&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;conn&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;conn&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;add_listener&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;orders_channel&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;on_order_change&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;asyncio&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;Future&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;  &lt;span class="c1"&gt;# keep the listener alive
&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;on_order_change&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;conn&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;pid&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;channel&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;payload&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;data&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;loads&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;payload&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="nf"&gt;broadcast_to_dashboard_clients&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now a dashboard with 50 active viewers makes one database connection for the listener, instead of 50 clients each polling every few seconds. This is the same idea Supabase Realtime productizes through a managed WebSocket layer.&lt;/p&gt;

&lt;h2&gt;
  
  
  Pattern 4: Time-bucketed aggregations that match the chart
&lt;/h2&gt;

&lt;p&gt;Even with rollups in place, the SQL that drives a chart should match the chart's resolution. A line chart for the last 24 hours needs about 1,440 data points at minute granularity — anything finer just slows things down and looks worse.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;date_trunc&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'minute'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;created_at&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;bucket&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
       &lt;span class="k"&gt;COUNT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;                          &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;event_count&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
       &lt;span class="k"&gt;COUNT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;DISTINCT&lt;/span&gt; &lt;span class="n"&gt;user_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;           &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;unique_users&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;events&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;created_at&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="n"&gt;now&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;interval&lt;/span&gt; &lt;span class="s1"&gt;'24 hours'&lt;/span&gt;
&lt;span class="k"&gt;GROUP&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
&lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For longer windows, drop the resolution. A monthly chart at minute granularity is 43,200 rows — your browser will choke. Hour buckets give 720 rows, daily buckets give 30. Match the bucket size to the window:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
&lt;th&gt;Window&lt;/th&gt;
&lt;th&gt;Bucket&lt;/th&gt;
&lt;th&gt;Approximate rows&lt;/th&gt;
&lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
&lt;td&gt;Last hour&lt;/td&gt;
&lt;td&gt;10 seconds&lt;/td&gt;
&lt;td&gt;360&lt;/td&gt;
&lt;/tr&gt;
    &lt;tr&gt;
&lt;td&gt;Last 24 hours&lt;/td&gt;
&lt;td&gt;1 minute&lt;/td&gt;
&lt;td&gt;1,440&lt;/td&gt;
&lt;/tr&gt;
    &lt;tr&gt;
&lt;td&gt;Last 7 days&lt;/td&gt;
&lt;td&gt;1 hour&lt;/td&gt;
&lt;td&gt;168&lt;/td&gt;
&lt;/tr&gt;
    &lt;tr&gt;
&lt;td&gt;Last 30 days&lt;/td&gt;
&lt;td&gt;1 day&lt;/td&gt;
&lt;td&gt;30&lt;/td&gt;
&lt;/tr&gt;
    &lt;tr&gt;
&lt;td&gt;Last year&lt;/td&gt;
&lt;td&gt;1 week&lt;/td&gt;
&lt;td&gt;52&lt;/td&gt;
&lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Pair this with the right index — usually a B-tree on &lt;code&gt;created_at&lt;/code&gt;, and a &lt;code&gt;BRIN&lt;/code&gt; index if the table is append-mostly and large — and these queries stay under 100 ms even on billions of rows.&lt;/p&gt;

&lt;h2&gt;
  
  
  Common mistakes that kill dashboards
&lt;/h2&gt;

&lt;p&gt;A few patterns to avoid:&lt;/p&gt;

&lt;p&gt;The first is &lt;strong&gt;counting everything on every load&lt;/strong&gt;. &lt;code&gt;SELECT COUNT(*) FROM orders&lt;/code&gt; on a 50 million row table will scan the table and ruin your day. Either keep a running counter in a small summary table, or use &lt;code&gt;pg_class.reltuples&lt;/code&gt; for an estimate when exact accuracy doesn't matter.&lt;/p&gt;

&lt;p&gt;The second is &lt;strong&gt;&lt;code&gt;SELECT *&lt;/code&gt; in dashboard queries&lt;/strong&gt;. Every extra column is bytes over the wire and memory pressure on the database. Pick the columns you need.&lt;/p&gt;

&lt;p&gt;The third is &lt;strong&gt;N+1 dashboard queries&lt;/strong&gt;. A dashboard with 12 tiles often issues 12 separate queries. With a 50 ms round trip, that's 600 ms of latency before anything renders. Batch related tiles into a single query using CTEs, or run them concurrently from the application.&lt;/p&gt;

&lt;p&gt;The fourth is &lt;strong&gt;refreshing a materialized view without &lt;code&gt;CONCURRENTLY&lt;/code&gt;&lt;/strong&gt;. The first time it happens during business hours, you'll know. The view takes an &lt;code&gt;ACCESS EXCLUSIVE&lt;/code&gt; lock and every other query waits.&lt;/p&gt;

&lt;p&gt;The fifth is &lt;strong&gt;forgetting time zones&lt;/strong&gt;. &lt;code&gt;date_trunc('day', created_at)&lt;/code&gt; uses the session time zone, so a customer in Mumbai and one in Berlin will see different "today" numbers. Always be explicit: &lt;code&gt;date_trunc('day', created_at AT TIME ZONE 'UTC')&lt;/code&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Putting it together
&lt;/h2&gt;

&lt;p&gt;A practical dashboard on PostgreSQL alone looks like this: historical tiles (last 90 days revenue, all-time signups) query nightly rollup tables; near real-time tiles (signups today, open tickets, MRR this week) read from materialized views refreshed every minute; the one or two true real-time tiles (active users right now, last order) subscribe to a &lt;code&gt;LISTEN&lt;/code&gt;/&lt;code&gt;NOTIFY&lt;/code&gt; channel and update via SSE.&lt;/p&gt;

&lt;p&gt;That stack handles thousands of concurrent dashboard users on commodity hardware. When you outgrow it — usually around tens of millions of events per day — you can graduate to TimescaleDB continuous aggregates, ClickHouse, or a streaming database without rewriting your dashboard logic.&lt;/p&gt;

&lt;h2&gt;
  
  
  Key takeaways
&lt;/h2&gt;

&lt;p&gt;Real-time is a spectrum, not a setting. Sort your tiles by freshness requirement first, and most of your performance problems disappear. Pre-aggregate anything historical. Use materialized views with &lt;code&gt;CONCURRENTLY&lt;/code&gt; for near real-time. Reach for &lt;code&gt;LISTEN&lt;/code&gt;/&lt;code&gt;NOTIFY&lt;/code&gt; only when you genuinely need sub-second updates. Match bucket granularity to chart resolution. And test with realistic data volumes — a dashboard that's fast on 10,000 rows tells you nothing about how it behaves on 100 million.&lt;/p&gt;

&lt;h2&gt;
  
  
  Over to you
&lt;/h2&gt;

&lt;p&gt;What's the slowest dashboard query you've ever shipped, and what fixed it? Did you end up with a rollup table, a materialized view, a streaming database, or did you just throw more hardware at it? Drop a comment — especially if you've battle-tested one of these patterns at scale, or if you use a different one entirely.&lt;/p&gt;

&lt;p&gt;If you're building this inside your own SaaS product, the patterns above will get you far. When you want to stop hand-rolling chart components and SQL editors, tools like Draxlr, Metabase, or Cube can handle the embedding layer so you can focus on the queries.&lt;/p&gt;

</description>
      <category>sql</category>
      <category>database</category>
      <category>postgres</category>
      <category>analytics</category>
    </item>
    <item>
      <title>Row-Level Security for Embedded Dashboards: A Practical Postgres Guide</title>
      <dc:creator>Vivek Kumar</dc:creator>
      <pubDate>Mon, 01 Jun 2026 07:07:59 +0000</pubDate>
      <link>https://dev.to/vivekdraxlr/row-level-security-for-embedded-dashboards-a-practical-postgres-guide-bn2</link>
      <guid>https://dev.to/vivekdraxlr/row-level-security-for-embedded-dashboards-a-practical-postgres-guide-bn2</guid>
      <description>&lt;p&gt;You just shipped your first embedded dashboard. Customers can log into your SaaS, click "Analytics," and see live charts of &lt;em&gt;their&lt;/em&gt; data. Great. Then the support emails roll in.&lt;/p&gt;

&lt;p&gt;"Why can my finance lead see the executive comp report?"&lt;br&gt;
"Our intern can see everyone's pipeline."&lt;br&gt;
"This customer can see another customer's invoice in a CSV export — please fix immediately."&lt;/p&gt;

&lt;p&gt;Embedded analytics has a security model that is subtly different from your normal app authorization. Dashboards run many queries you didn't write — exports, drill-downs, ad-hoc filters, AI-generated SQL. The "just add a &lt;code&gt;WHERE&lt;/code&gt; clause" trick that works in your handcrafted API endpoints does not work here, because there is no single endpoint anymore. Every query is a potential leak.&lt;/p&gt;

&lt;p&gt;This article is about how to use &lt;strong&gt;Postgres Row-Level Security (RLS)&lt;/strong&gt; to make embedded dashboards safe by default. We'll cover the embed-token-to-session-context flow, per-user roles within a tenant, column masking, performance, and the connection pooler gotcha that quietly breaks RLS in production.&lt;/p&gt;
&lt;h2&gt;
  
  
  Why "WHERE tenant_id = ?" isn't enough
&lt;/h2&gt;

&lt;p&gt;The classic pattern is to stamp every row with a &lt;code&gt;tenant_id&lt;/code&gt; and add &lt;code&gt;WHERE tenant_id = :current_tenant&lt;/code&gt; to every query. That works when &lt;em&gt;you&lt;/em&gt; write every query.&lt;/p&gt;

&lt;p&gt;But in an embedded dashboard:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Users can build their own queries through the UI.&lt;/li&gt;
&lt;li&gt;An AI assistant might generate SQL on the fly.&lt;/li&gt;
&lt;li&gt;CSV exports run separate queries.&lt;/li&gt;
&lt;li&gt;A drill-down chart may join three tables, any one of which could be missing the filter.&lt;/li&gt;
&lt;li&gt;Power users might paste raw SQL into a query console.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;You don't want the safety of your customers' data to depend on every developer (and every LLM) remembering the right filter. RLS moves that enforcement into the database itself — the storage layer literally refuses to return rows the current session isn't allowed to see.&lt;/p&gt;
&lt;h2&gt;
  
  
  The model: enable RLS, write a policy, set context per request
&lt;/h2&gt;

&lt;p&gt;The three moving parts:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Enable RLS&lt;/strong&gt; on the table.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Write a policy&lt;/strong&gt; that uses a session variable to decide which rows are visible.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;At request time&lt;/strong&gt;, your backend sets that session variable from the verified embed token (JWT) before running any dashboard SQL.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Here's the minimum viable setup for a SaaS with an &lt;code&gt;orders&lt;/code&gt; table:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- 1. Enable RLS on the table&lt;/span&gt;
&lt;span class="k"&gt;ALTER&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;orders&lt;/span&gt; &lt;span class="n"&gt;ENABLE&lt;/span&gt; &lt;span class="k"&gt;ROW&lt;/span&gt; &lt;span class="k"&gt;LEVEL&lt;/span&gt; &lt;span class="k"&gt;SECURITY&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="c1"&gt;-- 2. Create a policy keyed on a session-local setting&lt;/span&gt;
&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="n"&gt;POLICY&lt;/span&gt; &lt;span class="n"&gt;tenant_isolation&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;orders&lt;/span&gt;
  &lt;span class="k"&gt;USING&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;tenant_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;current_setting&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'app.current_tenant'&lt;/span&gt;&lt;span class="p"&gt;)::&lt;/span&gt;&lt;span class="n"&gt;uuid&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="c1"&gt;-- 3. (Optional but recommended) force RLS even for table owners&lt;/span&gt;
&lt;span class="k"&gt;ALTER&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;orders&lt;/span&gt; &lt;span class="k"&gt;FORCE&lt;/span&gt; &lt;span class="k"&gt;ROW&lt;/span&gt; &lt;span class="k"&gt;LEVEL&lt;/span&gt; &lt;span class="k"&gt;SECURITY&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The variable name must contain a period — &lt;code&gt;app.current_tenant&lt;/code&gt;, &lt;code&gt;jwt.claims.user_id&lt;/code&gt;, etc. — so Postgres knows it's a custom setting and not a system one.&lt;/p&gt;

&lt;p&gt;Then, in your backend, before running &lt;em&gt;any&lt;/em&gt; dashboard query for that request:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SET&lt;/span&gt; &lt;span class="k"&gt;LOCAL&lt;/span&gt; &lt;span class="n"&gt;app&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;current_tenant&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'8c1c...e9a3'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;SET LOCAL&lt;/code&gt; scopes the value to the current transaction, which is exactly what you want — it auto-clears when the transaction ends, so a stray query later in the request can't leak it.&lt;/p&gt;

&lt;h2&gt;
  
  
  Wiring it to the embed JWT
&lt;/h2&gt;

&lt;p&gt;The embed flow most teams use looks like this. The dashboard tool (or your own iframe) receives a signed JWT minted by your backend. That JWT carries the user's tenant and role claims. Before your backend opens a query, it verifies the JWT and pushes the claims into the session:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Pseudocode — runs at the start of every dashboard request
&lt;/span&gt;&lt;span class="n"&gt;claims&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;verify_jwt&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;request&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;token&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;             &lt;span class="c1"&gt;# tenant_id, user_id, role
&lt;/span&gt;
&lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;db&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;transaction&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;tx&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;tx&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;execute&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;SELECT set_config(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;app.current_tenant&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;, %s, true)&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;claims&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;tenant_id&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]],&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;tx&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;execute&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;SELECT set_config(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;app.current_user&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;, %s, true)&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;claims&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;user_id&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]],&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;tx&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;execute&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;SELECT set_config(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;app.current_role&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;, %s, true)&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;claims&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]],&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="c1"&gt;# Now run the dashboard query — RLS does the rest
&lt;/span&gt;    &lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;tx&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;execute&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;dashboard_query&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The third argument &lt;code&gt;true&lt;/code&gt; to &lt;code&gt;set_config&lt;/code&gt; is the transaction-local flag, equivalent to &lt;code&gt;SET LOCAL&lt;/code&gt;. From this point on, every query the dashboard fires — including ones written by an end user or an AI — runs under the right identity.&lt;/p&gt;

&lt;h2&gt;
  
  
  Going beyond multi-tenancy: per-user roles inside a tenant
&lt;/h2&gt;

&lt;p&gt;Most teams start with &lt;code&gt;tenant_id&lt;/code&gt; filtering and stop there. But the real questions show up &lt;em&gt;inside&lt;/em&gt; a tenant:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;A sales rep should see only their own pipeline.&lt;/li&gt;
&lt;li&gt;A regional manager should see their region.&lt;/li&gt;
&lt;li&gt;The CRO should see everything in the company.&lt;/li&gt;
&lt;li&gt;Finance should see revenue but not employee salaries.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;You can express all of that in a single policy:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="n"&gt;POLICY&lt;/span&gt; &lt;span class="n"&gt;orders_visibility&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;orders&lt;/span&gt;
  &lt;span class="k"&gt;USING&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;tenant_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;current_setting&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'app.current_tenant'&lt;/span&gt;&lt;span class="p"&gt;)::&lt;/span&gt;&lt;span class="n"&gt;uuid&lt;/span&gt;
    &lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
      &lt;span class="n"&gt;current_setting&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'app.current_role'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;IN&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'admin'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'cro'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
      &lt;span class="k"&gt;OR&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;current_setting&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'app.current_role'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'manager'&lt;/span&gt;
          &lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="n"&gt;region&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;current_setting&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'app.current_region'&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
      &lt;span class="k"&gt;OR&lt;/span&gt; &lt;span class="n"&gt;owner_user_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;current_setting&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'app.current_user'&lt;/span&gt;&lt;span class="p"&gt;)::&lt;/span&gt;&lt;span class="n"&gt;uuid&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now an embedded chart that runs &lt;code&gt;SELECT region, SUM(amount) FROM orders GROUP BY region&lt;/code&gt; returns different rows for the intern, the manager, and the CRO — without a single line of application code changing.&lt;/p&gt;

&lt;h2&gt;
  
  
  Column-level masking for sensitive fields
&lt;/h2&gt;

&lt;p&gt;RLS hides rows. Sometimes you need to hide &lt;em&gt;columns&lt;/em&gt; — show the row, but mask one field. A common pattern is a view that wraps the base table:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;VIEW&lt;/span&gt; &lt;span class="n"&gt;orders_safe&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt;
  &lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;tenant_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;amount&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;region&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="k"&gt;CASE&lt;/span&gt;
    &lt;span class="k"&gt;WHEN&lt;/span&gt; &lt;span class="n"&gt;current_setting&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'app.current_role'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;IN&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'admin'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'finance'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
      &lt;span class="k"&gt;THEN&lt;/span&gt; &lt;span class="n"&gt;customer_email&lt;/span&gt;
    &lt;span class="k"&gt;ELSE&lt;/span&gt; &lt;span class="n"&gt;regexp_replace&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;customer_email&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'(^.).*(@.*$)'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'&lt;/span&gt;&lt;span class="se"&gt;\1&lt;/span&gt;&lt;span class="s1"&gt;***&lt;/span&gt;&lt;span class="se"&gt;\2&lt;/span&gt;&lt;span class="s1"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="k"&gt;END&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;customer_email&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;created_at&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;orders&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Point your embedded dashboard at &lt;code&gt;orders_safe&lt;/code&gt; instead of &lt;code&gt;orders&lt;/code&gt; and the masking happens for free. Combine with RLS on the underlying table and you get row + column protection in one model.&lt;/p&gt;

&lt;p&gt;A heads-up: views run with the privileges of their &lt;em&gt;creator&lt;/em&gt; by default, which can bypass RLS unexpectedly. Either create the view as a non-superuser, or in modern Postgres set &lt;code&gt;security_invoker = true&lt;/code&gt; on the view so it runs as the calling user.&lt;/p&gt;

&lt;h2&gt;
  
  
  Real-world example: a customer-facing usage dashboard
&lt;/h2&gt;

&lt;p&gt;Say you're building a usage dashboard for an API product. Each customer should see their own request counts, latency, and errors. Inside the customer, only org admins should see per-user breakdowns:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;ALTER&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;api_events&lt;/span&gt; &lt;span class="n"&gt;ENABLE&lt;/span&gt; &lt;span class="k"&gt;ROW&lt;/span&gt; &lt;span class="k"&gt;LEVEL&lt;/span&gt; &lt;span class="k"&gt;SECURITY&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="n"&gt;POLICY&lt;/span&gt; &lt;span class="n"&gt;api_events_read&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;api_events&lt;/span&gt;
  &lt;span class="k"&gt;FOR&lt;/span&gt; &lt;span class="k"&gt;SELECT&lt;/span&gt;
  &lt;span class="k"&gt;USING&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;tenant_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;current_setting&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'app.current_tenant'&lt;/span&gt;&lt;span class="p"&gt;)::&lt;/span&gt;&lt;span class="n"&gt;uuid&lt;/span&gt;
    &lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
      &lt;span class="n"&gt;current_setting&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'app.current_role'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'org_admin'&lt;/span&gt;
      &lt;span class="k"&gt;OR&lt;/span&gt; &lt;span class="n"&gt;user_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;current_setting&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'app.current_user'&lt;/span&gt;&lt;span class="p"&gt;)::&lt;/span&gt;&lt;span class="n"&gt;uuid&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;INDEX&lt;/span&gt; &lt;span class="n"&gt;api_events_tenant_user_idx&lt;/span&gt;
  &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;api_events&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;tenant_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;user_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;created_at&lt;/span&gt; &lt;span class="k"&gt;DESC&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;A regular user's dashboard query like:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;date_trunc&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'hour'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;created_at&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;hour&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
       &lt;span class="k"&gt;COUNT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;requests&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
       &lt;span class="k"&gt;AVG&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;latency_ms&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;avg_latency&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;api_events&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;created_at&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;now&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;interval&lt;/span&gt; &lt;span class="s1"&gt;'24 hours'&lt;/span&gt;
&lt;span class="k"&gt;GROUP&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
&lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;…silently returns only that user's events. An org admin gets the org-wide view from the same SQL. Your dashboard code, your AI SQL assistant, and your CSV exporter all stay identical.&lt;/p&gt;

&lt;h2&gt;
  
  
  Performance: don't blow up your dashboards
&lt;/h2&gt;

&lt;p&gt;RLS is fast when the policy predicate is &lt;strong&gt;indexable&lt;/strong&gt;. The policy &lt;code&gt;tenant_id = current_setting(...)&lt;/code&gt; becomes part of every query plan; if &lt;code&gt;tenant_id&lt;/code&gt; is the leading column of your indexes, you're fine. Things that quietly tank performance:&lt;/p&gt;

&lt;p&gt;A &lt;code&gt;current_setting()&lt;/code&gt; call that isn't marked &lt;code&gt;STABLE&lt;/code&gt; will be re-evaluated per row. Wrap it in a &lt;code&gt;STABLE&lt;/code&gt; SQL function and call that in the policy:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;FUNCTION&lt;/span&gt; &lt;span class="n"&gt;current_tenant&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;RETURNS&lt;/span&gt; &lt;span class="n"&gt;uuid&lt;/span&gt;
&lt;span class="k"&gt;LANGUAGE&lt;/span&gt; &lt;span class="k"&gt;sql&lt;/span&gt; &lt;span class="k"&gt;STABLE&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt;
&lt;span class="err"&gt;$$&lt;/span&gt; &lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;current_setting&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'app.current_tenant'&lt;/span&gt;&lt;span class="p"&gt;)::&lt;/span&gt;&lt;span class="n"&gt;uuid&lt;/span&gt; &lt;span class="err"&gt;$$&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="n"&gt;POLICY&lt;/span&gt; &lt;span class="n"&gt;tenant_isolation&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;orders&lt;/span&gt;
  &lt;span class="k"&gt;USING&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;tenant_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;current_tenant&lt;/span&gt;&lt;span class="p"&gt;());&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Subquery-style policies that join other tables can be expensive. Prefer denormalising a &lt;code&gt;tenant_id&lt;/code&gt; or &lt;code&gt;owner_id&lt;/code&gt; onto the table over a policy like &lt;code&gt;EXISTS (SELECT 1 FROM memberships WHERE ...)&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;Make sure your composite indexes lead with the column the RLS policy filters on — that's almost always &lt;code&gt;tenant_id&lt;/code&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Common mistakes and gotchas
&lt;/h2&gt;

&lt;p&gt;The &lt;strong&gt;connection pooler trap&lt;/strong&gt; is the one that bites everyone. If you use PgBouncer in transaction or statement mode, &lt;code&gt;SET&lt;/code&gt; (without &lt;code&gt;LOCAL&lt;/code&gt;) persists across requests on the same backend connection, leaking context between tenants. Always use &lt;code&gt;SET LOCAL&lt;/code&gt; (or &lt;code&gt;set_config(..., true)&lt;/code&gt;) inside an explicit transaction.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Superusers and &lt;code&gt;BYPASSRLS&lt;/code&gt; ignore policies entirely.&lt;/strong&gt; Your application role must &lt;em&gt;not&lt;/em&gt; be a superuser and must not have &lt;code&gt;BYPASSRLS&lt;/code&gt;. Test as a normal application role, not as &lt;code&gt;postgres&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Forgetting &lt;code&gt;FORCE ROW LEVEL SECURITY&lt;/code&gt;&lt;/strong&gt; means the table owner still sees everything. If your migrations or background jobs run as the owner, they bypass policies — sometimes that's what you want, but it's worth being explicit.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;code&gt;USING&lt;/code&gt; vs &lt;code&gt;WITH CHECK&lt;/code&gt;.&lt;/strong&gt; &lt;code&gt;USING&lt;/code&gt; filters what's read, &lt;code&gt;WITH CHECK&lt;/code&gt; validates what's written. If you only specify &lt;code&gt;USING&lt;/code&gt; on an &lt;code&gt;INSERT&lt;/code&gt;/&lt;code&gt;UPDATE&lt;/code&gt; policy, a user could write rows tagged with another tenant's id. Specify both for write operations.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Views inherit creator privileges.&lt;/strong&gt; If you create a view as &lt;code&gt;postgres&lt;/code&gt;, queries against it can see everything. Either create views as a non-superuser, or set &lt;code&gt;security_invoker = true&lt;/code&gt; (Postgres 15+).&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;No fallback policy.&lt;/strong&gt; If no policy matches, the default behavior is to &lt;em&gt;deny&lt;/em&gt; — which is good for security but can look like "the dashboard is empty" in dev. Add a clear deny-by-default policy and a verbose error path so missing context fails loudly during development, not silently in prod.&lt;/p&gt;

&lt;h2&gt;
  
  
  Key takeaways
&lt;/h2&gt;

&lt;p&gt;Embedded dashboards run too many queries — including ones written by users and LLMs — for application-level filtering to be safe. Row-Level Security pushes the rules into the database, where every query is filtered automatically.&lt;/p&gt;

&lt;p&gt;The pattern is small: enable RLS on the table, write a policy that reads &lt;code&gt;current_setting('app.something')&lt;/code&gt;, and have your backend set those settings inside a &lt;code&gt;SET LOCAL&lt;/code&gt; transaction after verifying the embed JWT. From there, you can layer per-user roles, column masking, and view-level transforms without touching dashboard code.&lt;/p&gt;

&lt;p&gt;Get the connection-pool detail right, keep your indexes aligned with your policy predicates, and test as a non-superuser role. Do that and your embedded analytics layer becomes one of the safest parts of your stack instead of the scariest.&lt;/p&gt;

&lt;h2&gt;
  
  
  Over to you
&lt;/h2&gt;

&lt;p&gt;If you're shipping embedded dashboards today, how are you handling tenant + per-user isolation? Are you using RLS, application-layer filtering, schema-per-tenant, or a BI tool's own permission model? Drop a comment with the pattern (and the gotchas) you've hit — especially around AI-generated SQL, where the "the LLM might forget the filter" problem makes RLS feel inevitable.&lt;/p&gt;

</description>
      <category>sql</category>
      <category>database</category>
      <category>postgres</category>
      <category>webdev</category>
    </item>
    <item>
      <title>KPI Tracking with SQL: A Practical Starter Kit for SaaS Developers</title>
      <dc:creator>Vivek Kumar</dc:creator>
      <pubDate>Fri, 22 May 2026 04:50:54 +0000</pubDate>
      <link>https://dev.to/vivekdraxlr/kpi-tracking-with-sql-a-practical-starter-kit-for-saas-developers-49h3</link>
      <guid>https://dev.to/vivekdraxlr/kpi-tracking-with-sql-a-practical-starter-kit-for-saas-developers-49h3</guid>
      <description>&lt;p&gt;You've shipped the product. Users are signing up. Subscriptions are flowing in. Now your founder, your investors, or your own curiosity is asking: &lt;em&gt;how are we actually doing?&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Most teams reach for a BI tool or a spreadsheet at this point. But if your data lives in a SQL database—Postgres, MySQL, Redshift, BigQuery—you already have everything you need to answer the most important product and business questions. No third-party analytics event pipeline required.&lt;/p&gt;

&lt;p&gt;This guide gives you practical SQL queries for the KPIs that matter most in a SaaS product: revenue, engagement, retention, and growth. These aren't toy examples—they use realistic table names you'd find in a real product codebase, and they come with the gotchas that trip most developers up the first time.&lt;/p&gt;




&lt;h2&gt;
  
  
  Your Assumed Schema
&lt;/h2&gt;

&lt;p&gt;To keep the queries grounded, here's the database schema we'll work against throughout:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
&lt;th&gt;Table&lt;/th&gt;
&lt;th&gt;Key Columns&lt;/th&gt;
&lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
&lt;td&gt;&lt;code&gt;users&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;id&lt;/code&gt;, &lt;code&gt;email&lt;/code&gt;, &lt;code&gt;created_at&lt;/code&gt;, &lt;code&gt;plan_id&lt;/code&gt;
&lt;/td&gt;
&lt;/tr&gt;
    &lt;tr&gt;
&lt;td&gt;&lt;code&gt;subscriptions&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;id&lt;/code&gt;, &lt;code&gt;user_id&lt;/code&gt;, &lt;code&gt;status&lt;/code&gt;, &lt;code&gt;amount_cents&lt;/code&gt;, &lt;code&gt;billing_cycle&lt;/code&gt;, &lt;code&gt;started_at&lt;/code&gt;, &lt;code&gt;canceled_at&lt;/code&gt;
&lt;/td&gt;
&lt;/tr&gt;
    &lt;tr&gt;
&lt;td&gt;&lt;code&gt;payments&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;id&lt;/code&gt;, &lt;code&gt;user_id&lt;/code&gt;, &lt;code&gt;amount_cents&lt;/code&gt;, &lt;code&gt;paid_at&lt;/code&gt;, &lt;code&gt;status&lt;/code&gt;
&lt;/td&gt;
&lt;/tr&gt;
    &lt;tr&gt;
&lt;td&gt;&lt;code&gt;events&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;id&lt;/code&gt;, &lt;code&gt;user_id&lt;/code&gt;, &lt;code&gt;event_name&lt;/code&gt;, &lt;code&gt;occurred_at&lt;/code&gt;
&lt;/td&gt;
&lt;/tr&gt;
    &lt;tr&gt;
&lt;td&gt;&lt;code&gt;trials&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;id&lt;/code&gt;, &lt;code&gt;user_id&lt;/code&gt;, &lt;code&gt;started_at&lt;/code&gt;, &lt;code&gt;converted_at&lt;/code&gt;
&lt;/td&gt;
&lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Adapt column names to match your own schema, but the patterns will transfer directly.&lt;/p&gt;




&lt;h2&gt;
  
  
  1. Monthly Recurring Revenue (MRR)
&lt;/h2&gt;

&lt;p&gt;MRR is the single most-watched number in any subscription business. It answers: &lt;em&gt;how much predictable revenue do we have right now?&lt;/em&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt;
  &lt;span class="n"&gt;DATE_TRUNC&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'month'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;CURRENT_DATE&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="k"&gt;month&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="k"&gt;SUM&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;amount_cents&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;          &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;mrr&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;subscriptions&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;status&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'active'&lt;/span&gt;
  &lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="n"&gt;billing_cycle&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'monthly'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If you have annual subscribers, normalize their value to monthly by dividing by 12:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt;
  &lt;span class="k"&gt;SUM&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="k"&gt;CASE&lt;/span&gt; &lt;span class="n"&gt;billing_cycle&lt;/span&gt;
      &lt;span class="k"&gt;WHEN&lt;/span&gt; &lt;span class="s1"&gt;'monthly'&lt;/span&gt; &lt;span class="k"&gt;THEN&lt;/span&gt; &lt;span class="n"&gt;amount_cents&lt;/span&gt;
      &lt;span class="k"&gt;WHEN&lt;/span&gt; &lt;span class="s1"&gt;'annual'&lt;/span&gt;  &lt;span class="k"&gt;THEN&lt;/span&gt; &lt;span class="n"&gt;amount_cents&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="mi"&gt;12&lt;/span&gt;
    &lt;span class="k"&gt;END&lt;/span&gt;
  &lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;mrr&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;subscriptions&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;status&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'active'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Gotcha:&lt;/strong&gt; Don't double-count. If a user can have multiple subscriptions (e.g. per-seat billing), make sure your &lt;code&gt;subscriptions&lt;/code&gt; table represents the correct unit. Summing at the user level first and then aggregating is safer when your data model is complex.&lt;/p&gt;




&lt;h2&gt;
  
  
  2. Churn Rate
&lt;/h2&gt;

&lt;p&gt;Churn tells you how fast you're losing customers. Monthly churn is typically calculated as:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Churn Rate = Customers Lost This Month ÷ Customers at Start of Month&lt;br&gt;
&lt;/p&gt;
&lt;/blockquote&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;WITH&lt;/span&gt; &lt;span class="n"&gt;start_of_month&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="k"&gt;COUNT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;DISTINCT&lt;/span&gt; &lt;span class="n"&gt;user_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;total&lt;/span&gt;
  &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;subscriptions&lt;/span&gt;
  &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;status&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'active'&lt;/span&gt;
    &lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="n"&gt;started_at&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="n"&gt;DATE_TRUNC&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'month'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;CURRENT_DATE&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;),&lt;/span&gt;
&lt;span class="n"&gt;churned&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="k"&gt;COUNT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;DISTINCT&lt;/span&gt; &lt;span class="n"&gt;user_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;lost&lt;/span&gt;
  &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;subscriptions&lt;/span&gt;
  &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;status&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'canceled'&lt;/span&gt;
    &lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="n"&gt;canceled_at&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="n"&gt;DATE_TRUNC&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'month'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;CURRENT_DATE&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="n"&gt;canceled_at&lt;/span&gt;  &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="n"&gt;DATE_TRUNC&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'month'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;CURRENT_DATE&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;INTERVAL&lt;/span&gt; &lt;span class="s1"&gt;'1 month'&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt;
  &lt;span class="n"&gt;ROUND&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;lost&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="k"&gt;NULLIF&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;total&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;monthly_churn_pct&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;start_of_month&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;churned&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Gotcha:&lt;/strong&gt; Using &lt;code&gt;NULLIF(total, 0)&lt;/code&gt; prevents division-by-zero errors, which will otherwise silently crash your query or return &lt;code&gt;NULL&lt;/code&gt; without warning. A 2–3% monthly churn rate is a common benchmark to aim below; above 5% is a red flag for most B2B SaaS.&lt;/p&gt;




&lt;h2&gt;
  
  
  3. Daily Active Users (DAU) and the Stickiness Ratio
&lt;/h2&gt;

&lt;p&gt;DAU tells you engagement. But the ratio of DAU to MAU—your "stickiness ratio"—tells you whether users are building habits.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- DAU: today&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt;
  &lt;span class="k"&gt;COUNT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;DISTINCT&lt;/span&gt; &lt;span class="n"&gt;user_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;dau&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;events&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;occurred_at&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="nb"&gt;date&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;CURRENT_DATE&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="c1"&gt;-- MAU: last 30 days&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt;
  &lt;span class="k"&gt;COUNT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;DISTINCT&lt;/span&gt; &lt;span class="n"&gt;user_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;mau&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;events&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;occurred_at&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="k"&gt;CURRENT_DATE&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;INTERVAL&lt;/span&gt; &lt;span class="s1"&gt;'30 days'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Combine them to get the stickiness ratio:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;WITH&lt;/span&gt; &lt;span class="n"&gt;dau&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="k"&gt;COUNT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;DISTINCT&lt;/span&gt; &lt;span class="n"&gt;user_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;daily&lt;/span&gt;
  &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;events&lt;/span&gt;
  &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;occurred_at&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="nb"&gt;date&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;CURRENT_DATE&lt;/span&gt;
&lt;span class="p"&gt;),&lt;/span&gt;
&lt;span class="n"&gt;mau&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="k"&gt;COUNT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;DISTINCT&lt;/span&gt; &lt;span class="n"&gt;user_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;monthly&lt;/span&gt;
  &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;events&lt;/span&gt;
  &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;occurred_at&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="k"&gt;CURRENT_DATE&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;INTERVAL&lt;/span&gt; &lt;span class="s1"&gt;'30 days'&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt;
  &lt;span class="n"&gt;ROUND&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;daily&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="k"&gt;NULLIF&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;monthly&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;stickiness_pct&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;dau&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;mau&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;A stickiness ratio above 20% is generally considered healthy. Slack and Facebook sit north of 50%—most B2B tools are happy at 25–40%.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Gotcha:&lt;/strong&gt; Be precise about what counts as an "active" event. Including page loads inflates the number. Stick to intentional actions: creating a record, running a query, submitting a form.&lt;/p&gt;




&lt;h2&gt;
  
  
  4. Average Revenue Per User (ARPU)
&lt;/h2&gt;

&lt;p&gt;ARPU helps you understand whether your pricing is working and how different segments compare.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt;
  &lt;span class="n"&gt;ROUND&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="k"&gt;SUM&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;amount_cents&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="k"&gt;NULLIF&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;COUNT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;DISTINCT&lt;/span&gt; &lt;span class="n"&gt;user_id&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="mi"&gt;2&lt;/span&gt;
  &lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;arpu&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;subscriptions&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;status&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'active'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Break it down by plan to see which tier drives the most value:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt;
  &lt;span class="n"&gt;u&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;plan_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="k"&gt;COUNT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;DISTINCT&lt;/span&gt; &lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;user_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;                             &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;customers&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;ROUND&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;AVG&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;amount_cents&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;                &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;avg_revenue&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;subscriptions&lt;/span&gt; &lt;span class="n"&gt;s&lt;/span&gt;
&lt;span class="k"&gt;JOIN&lt;/span&gt; &lt;span class="n"&gt;users&lt;/span&gt; &lt;span class="n"&gt;u&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;u&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;user_id&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;status&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'active'&lt;/span&gt;
&lt;span class="k"&gt;GROUP&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;u&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;plan_id&lt;/span&gt;
&lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;avg_revenue&lt;/span&gt; &lt;span class="k"&gt;DESC&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  5. Trial-to-Paid Conversion Rate
&lt;/h2&gt;

&lt;p&gt;If you have a freemium or free trial model, conversion rate is often more valuable than new signup counts—it's what actually drives revenue.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;WITH&lt;/span&gt; &lt;span class="n"&gt;trial_cohort&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="k"&gt;COUNT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;DISTINCT&lt;/span&gt; &lt;span class="n"&gt;user_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;started&lt;/span&gt;
  &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;trials&lt;/span&gt;
  &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;started_at&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="k"&gt;CURRENT_DATE&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;INTERVAL&lt;/span&gt; &lt;span class="s1"&gt;'30 days'&lt;/span&gt;
&lt;span class="p"&gt;),&lt;/span&gt;
&lt;span class="n"&gt;converted&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="k"&gt;COUNT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;DISTINCT&lt;/span&gt; &lt;span class="n"&gt;user_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;paid&lt;/span&gt;
  &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;trials&lt;/span&gt;
  &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;started_at&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="k"&gt;CURRENT_DATE&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;INTERVAL&lt;/span&gt; &lt;span class="s1"&gt;'30 days'&lt;/span&gt;
    &lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="n"&gt;converted_at&lt;/span&gt; &lt;span class="k"&gt;IS&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt;
  &lt;span class="n"&gt;ROUND&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;paid&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="k"&gt;NULLIF&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;started&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;conversion_pct&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;trial_cohort&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;converted&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Gotcha:&lt;/strong&gt; Be careful comparing trials from &lt;em&gt;this&lt;/em&gt; month to conversions from &lt;em&gt;this&lt;/em&gt; month. Many trials take 14–30 days to convert. A better approach is to look at the conversion rate for trials that &lt;em&gt;started&lt;/em&gt; 30–60 days ago, giving them time to complete their journey.&lt;/p&gt;




&lt;h2&gt;
  
  
  6. Revenue Growth Rate (Month-over-Month)
&lt;/h2&gt;

&lt;p&gt;Revenue growth tells you if the business is accelerating, stalling, or shrinking. Track it month-over-month.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;WITH&lt;/span&gt; &lt;span class="n"&gt;monthly_revenue&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="k"&gt;SELECT&lt;/span&gt;
    &lt;span class="n"&gt;DATE_TRUNC&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'month'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;paid_at&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="k"&gt;month&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="k"&gt;SUM&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;amount_cents&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;    &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;revenue&lt;/span&gt;
  &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;payments&lt;/span&gt;
  &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;status&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'succeeded'&lt;/span&gt;
  &lt;span class="k"&gt;GROUP&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt;
  &lt;span class="k"&gt;month&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;revenue&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;LAG&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;revenue&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;OVER&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="k"&gt;month&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;prev_month_revenue&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;ROUND&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;revenue&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;LAG&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;revenue&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;OVER&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="k"&gt;month&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;
    &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="k"&gt;NULLIF&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;LAG&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;revenue&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;OVER&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="k"&gt;month&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="mi"&gt;1&lt;/span&gt;
  &lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;mom_growth_pct&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;monthly_revenue&lt;/span&gt;
&lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="k"&gt;month&lt;/span&gt; &lt;span class="k"&gt;DESC&lt;/span&gt;
&lt;span class="k"&gt;LIMIT&lt;/span&gt; &lt;span class="mi"&gt;6&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This query uses a window function (&lt;code&gt;LAG&lt;/code&gt;) to pull the previous month's value and compute the percentage change. The result looks something like:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
&lt;th&gt;month&lt;/th&gt;
&lt;th&gt;revenue&lt;/th&gt;
&lt;th&gt;prev_month_revenue&lt;/th&gt;
&lt;th&gt;mom_growth_pct&lt;/th&gt;
&lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
&lt;td&gt;2026-05-01&lt;/td&gt;
&lt;td&gt;48,200&lt;/td&gt;
&lt;td&gt;41,500&lt;/td&gt;
&lt;td&gt;16.1%&lt;/td&gt;
&lt;/tr&gt;
    &lt;tr&gt;
&lt;td&gt;2026-04-01&lt;/td&gt;
&lt;td&gt;41,500&lt;/td&gt;
&lt;td&gt;36,800&lt;/td&gt;
&lt;td&gt;12.8%&lt;/td&gt;
&lt;/tr&gt;
    &lt;tr&gt;
&lt;td&gt;2026-03-01&lt;/td&gt;
&lt;td&gt;36,800&lt;/td&gt;
&lt;td&gt;34,100&lt;/td&gt;
&lt;td&gt;7.9%&lt;/td&gt;
&lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h2&gt;
  
  
  7. Feature Adoption Rate
&lt;/h2&gt;

&lt;p&gt;If you're building a product, you need to know which features people actually use.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt;
  &lt;span class="n"&gt;event_name&lt;/span&gt;                                                &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;feature&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="k"&gt;COUNT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;DISTINCT&lt;/span&gt; &lt;span class="n"&gt;user_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;                                   &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;users_used&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="k"&gt;COUNT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;DISTINCT&lt;/span&gt; &lt;span class="n"&gt;user_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;events&lt;/span&gt;
   &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;occurred_at&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="k"&gt;CURRENT_DATE&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;INTERVAL&lt;/span&gt; &lt;span class="s1"&gt;'30 days'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;total_active_users&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;ROUND&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="k"&gt;COUNT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;DISTINCT&lt;/span&gt; &lt;span class="n"&gt;user_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt;
    &lt;span class="k"&gt;NULLIF&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
      &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="k"&gt;COUNT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;DISTINCT&lt;/span&gt; &lt;span class="n"&gt;user_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;events&lt;/span&gt;
       &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;occurred_at&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="k"&gt;CURRENT_DATE&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;INTERVAL&lt;/span&gt; &lt;span class="s1"&gt;'30 days'&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
      &lt;span class="mi"&gt;0&lt;/span&gt;
    &lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="mi"&gt;1&lt;/span&gt;
  &lt;span class="p"&gt;)&lt;/span&gt;                                                         &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;adoption_pct&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;events&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;occurred_at&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="k"&gt;CURRENT_DATE&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;INTERVAL&lt;/span&gt; &lt;span class="s1"&gt;'30 days'&lt;/span&gt;
  &lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="n"&gt;event_name&lt;/span&gt; &lt;span class="k"&gt;IN&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'export_report'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'invite_teammate'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'create_dashboard'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'connect_datasource'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;GROUP&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;event_name&lt;/span&gt;
&lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;adoption_pct&lt;/span&gt; &lt;span class="k"&gt;DESC&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Common Mistakes to Avoid
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;1. Forgetting time zones.&lt;/strong&gt; &lt;code&gt;CURRENT_DATE&lt;/code&gt; returns a date in your server's time zone. If your users are spread across time zones, you may be cutting off events. Store timestamps in UTC and always convert explicitly.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. Counting users vs. counting events.&lt;/strong&gt; Many queries should use &lt;code&gt;COUNT(DISTINCT user_id)&lt;/code&gt;, not &lt;code&gt;COUNT(*)&lt;/code&gt;. One user generating 100 events should not count as 100 active users.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;3. Including test accounts.&lt;/strong&gt; Seed data, internal team emails, and test subscriptions will skew every metric. Add a &lt;code&gt;WHERE email NOT LIKE '%@yourcompany.com%'&lt;/code&gt; filter or flag test accounts in your users table with an &lt;code&gt;is_test&lt;/code&gt; column.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;4. Pulling raw SQL into spreadsheets.&lt;/strong&gt; Once your KPI queries are working, don't copy-paste results into a spreadsheet every week. That workflow breaks. Connect a dashboard tool directly to your database to keep numbers fresh and reduce manual error.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;5. Ignoring failed payments.&lt;/strong&gt; When calculating MRR from &lt;code&gt;payments&lt;/code&gt;, always filter by &lt;code&gt;status = 'succeeded'&lt;/code&gt;. Failed payments are not revenue, and including them will inflate your numbers.&lt;/p&gt;




&lt;h2&gt;
  
  
  Key Takeaways
&lt;/h2&gt;

&lt;p&gt;SQL is one of the most powerful tools you have for understanding your own product. You don't need a data warehouse or a dedicated analyst to get started—you need a database connection and the right queries.&lt;/p&gt;

&lt;p&gt;Start with these seven metrics: MRR, churn rate, DAU/MAU stickiness, ARPU, trial conversion, revenue growth, and feature adoption. Each one answers a specific question about whether your product is healthy and growing.&lt;/p&gt;

&lt;p&gt;Build them once, pin them to a dashboard you check weekly, and you'll make better product decisions with far less guesswork.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;How do you track KPIs at your company? Do you query your database directly, use a BI tool, or pull from a third-party analytics service? Drop your setup in the comments — I'd love to hear what works for your team.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>sql</category>
      <category>database</category>
      <category>analytics</category>
      <category>webdev</category>
    </item>
    <item>
      <title>From English to SQL: How LLMs Actually Understand Your Database Schema</title>
      <dc:creator>Vivek Kumar</dc:creator>
      <pubDate>Thu, 21 May 2026 05:28:35 +0000</pubDate>
      <link>https://dev.to/vivekdraxlr/from-english-to-sql-how-llms-actually-understand-your-database-schema-424o</link>
      <guid>https://dev.to/vivekdraxlr/from-english-to-sql-how-llms-actually-understand-your-database-schema-424o</guid>
      <description>&lt;p&gt;You type "show me total revenue by customer for last month" into a chat interface. A second later, out pops a perfectly formatted SQL query against your &lt;code&gt;orders&lt;/code&gt; and &lt;code&gt;customers&lt;/code&gt; tables. It feels like magic.&lt;/p&gt;

&lt;p&gt;But it's not. And once you understand the mechanics — specifically, how the model reads and interprets your database schema — you'll know exactly why text-to-SQL tools sometimes get it right, sometimes get it embarrassingly wrong, and what you can do to tip the odds in your favor.&lt;/p&gt;

&lt;p&gt;This post breaks down the internal mechanics of schema understanding in LLM-powered text-to-SQL systems, with practical advice for developers building or integrating these tools.&lt;/p&gt;




&lt;h2&gt;
  
  
  The First Problem: The LLM Doesn't Know Your Database
&lt;/h2&gt;

&lt;p&gt;Out of the box, an LLM has no idea that your SaaS product has a &lt;code&gt;subscriptions&lt;/code&gt; table, or that &lt;code&gt;mrr&lt;/code&gt; lives in &lt;code&gt;billing_events&lt;/code&gt;, or that you store &lt;code&gt;user_id&lt;/code&gt; in lowercase snake_case everywhere except the one legacy table where it's &lt;code&gt;UserID&lt;/code&gt;. &lt;/p&gt;

&lt;p&gt;That knowledge has to be injected into the model's prompt — every single time. The process of selecting &lt;em&gt;which&lt;/em&gt; schema information to include, and how to format it, is called &lt;strong&gt;schema linking&lt;/strong&gt;. It's the most critical (and most commonly fumbled) part of any text-to-SQL pipeline.&lt;/p&gt;

&lt;p&gt;A typical schema injection looks like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- Passed to the LLM in the system or user prompt:&lt;/span&gt;

&lt;span class="k"&gt;Table&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;users&lt;/span&gt;
&lt;span class="n"&gt;Columns&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;id&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;integer&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;PK&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;email&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;text&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;created_at&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;timestamp&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;plan&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;text&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;Table&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;orders&lt;/span&gt;
&lt;span class="n"&gt;Columns&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;id&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;integer&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;PK&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;user_id&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;integer&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;FK&lt;/span&gt; &lt;span class="err"&gt;→&lt;/span&gt; &lt;span class="n"&gt;users&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;amount_cents&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;integer&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;status&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;text&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;created_at&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;timestamp&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;Table&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;subscriptions&lt;/span&gt;
&lt;span class="n"&gt;Columns&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;id&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;integer&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;PK&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;user_id&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;integer&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;FK&lt;/span&gt; &lt;span class="err"&gt;→&lt;/span&gt; &lt;span class="n"&gt;users&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;plan_id&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;integer&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;started_at&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;timestamp&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;cancelled_at&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;timestamp&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;nullable&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The model reads this and builds an internal understanding of your data model — which tables exist, what columns they have, how they relate to each other, and what data types to expect.&lt;/p&gt;




&lt;h2&gt;
  
  
  Step 1: Schema Parsing — What the Model Actually Sees
&lt;/h2&gt;

&lt;p&gt;When a user asks "which customers churned last month?", the LLM doesn't just search for the word "churned" in your schema. It reasons through what "churned" likely means given the available columns: probably a &lt;code&gt;cancelled_at&lt;/code&gt; timestamp that falls within the last 30 days, joined against &lt;code&gt;users&lt;/code&gt; via &lt;code&gt;subscriptions&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;This reasoning — connecting semantic intent in the question to structural elements in the schema — is called &lt;strong&gt;schema linking&lt;/strong&gt;. It involves three sub-problems:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. Table selection:&lt;/strong&gt; Which tables are relevant to this question? For large databases with hundreds of tables, the model must filter down to the handful that matter. Dumping your entire schema into the prompt doesn't scale — a 300-table database easily exceeds a model's effective context window.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. Column selection:&lt;/strong&gt; Within those tables, which columns are needed? This is where naming matters enormously. &lt;code&gt;amt&lt;/code&gt; is ambiguous. &lt;code&gt;amount_cents&lt;/code&gt; is clear. &lt;code&gt;ts&lt;/code&gt; is useless. &lt;code&gt;created_at&lt;/code&gt; is useful.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;3. Join inference:&lt;/strong&gt; How do the relevant tables connect? The model needs to reconstruct the relationship graph — ideally from explicit foreign key declarations in the schema you provide, rather than guessing from column name patterns alone.&lt;/p&gt;




&lt;h2&gt;
  
  
  Step 2: Why Column Names and Table Names Matter More Than You Think
&lt;/h2&gt;

&lt;p&gt;Here's a test. Give an LLM these two schemas and ask "how many users signed up last week?"&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Schema A:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;Table&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;u&lt;/span&gt;
&lt;span class="n"&gt;Columns&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;uid&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;em&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ts&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;pl&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Schema B:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;Table&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;users&lt;/span&gt;
&lt;span class="n"&gt;Columns&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;email&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;created_at&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;plan&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Schema B wins every time. Natural language tokens in column names (&lt;code&gt;email&lt;/code&gt;, &lt;code&gt;created_at&lt;/code&gt;, &lt;code&gt;plan&lt;/code&gt;) map cleanly to words a model understands. Abbreviations break that mapping.&lt;/p&gt;

&lt;p&gt;Research from IBM and others has shown that schema-aware models — those given rich metadata including table descriptions, column definitions, and sample values — outperform models handed raw DDL by a significant margin.&lt;/p&gt;

&lt;p&gt;The practical takeaway: &lt;strong&gt;invest in good naming conventions, and pass them into your text-to-SQL pipeline explicitly.&lt;/strong&gt; If you're stuck with legacy short column names, add descriptions.&lt;/p&gt;




&lt;h2&gt;
  
  
  Step 3: How to Structure Schema for Maximum LLM Comprehension
&lt;/h2&gt;

&lt;p&gt;Here's an example of a well-formatted schema prompt that a text-to-SQL system might use:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;You are a SQL expert. Use the following schema to answer the user's question.

### Schema

Table: users
Description: One row per registered user.
Columns:
  - id (integer, primary key)
  - email (text) — the user's login email
  - created_at (timestamp) — when the account was created
  - plan (text) — current plan: 'free', 'pro', 'enterprise'

Table: orders
Description: Purchases made by users.
Columns:
  - id (integer, primary key)
  - user_id (integer) — foreign key → users.id
  - amount_cents (integer) — order total in cents (divide by 100 for dollars)
  - status (text) — 'pending', 'completed', 'refunded'
  - created_at (timestamp)

### Question
Which users on the 'pro' plan have made more than 3 orders in the past 30 days?
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Notice what's included beyond just column names:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Table descriptions&lt;/strong&gt; that explain business purpose&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Inline comments&lt;/strong&gt; on ambiguous columns (units, enumerations)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Explicit foreign key relationships&lt;/strong&gt; spelled out in plain language&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This is the difference between a mediocre text-to-SQL result and one you'd actually run in production.&lt;/p&gt;




&lt;h2&gt;
  
  
  Step 4: The Large Schema Problem — When You Can't Fit Everything
&lt;/h2&gt;

&lt;p&gt;Most real-world databases have far more tables than you'd ever want in a single prompt. Cramming 200 table definitions into the context window creates two problems: cost, and accuracy. Models are measurably worse at attending to schema details when surrounded by irrelevant tables.&lt;/p&gt;

&lt;p&gt;The solution is &lt;strong&gt;retrieval-augmented schema selection&lt;/strong&gt;. Before calling the LLM, a retrieval step filters the schema:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Pseudocode for schema-aware RAG in a text-to-SQL pipeline
&lt;/span&gt;
&lt;span class="n"&gt;user_question&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;show me monthly revenue by plan for the last 6 months&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;

&lt;span class="c1"&gt;# 1. Embed the user question
&lt;/span&gt;&lt;span class="n"&gt;question_embedding&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;embed&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;user_question&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# 2. Search a vector index of table + column descriptions
&lt;/span&gt;&lt;span class="n"&gt;relevant_tables&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;vector_search&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;index&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;schema_index&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;question_embedding&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;top_k&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# 3. Build a focused schema prompt with only relevant tables
&lt;/span&gt;&lt;span class="n"&gt;schema_context&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;format_schema&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;relevant_tables&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# 4. Call the LLM with a focused prompt
&lt;/span&gt;&lt;span class="n"&gt;sql&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;llm&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;complete&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;system_prompt&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Use this schema:&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;schema_context&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;user_prompt&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;user_question&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Systems like RASL (Retrieval Augmented Schema Linking) from Amazon use this approach to handle databases with thousands of tables, achieving dramatically better accuracy than naive full-schema injection.&lt;/p&gt;

&lt;p&gt;The key insight: &lt;strong&gt;the right schema in the prompt beats a larger model every time.&lt;/strong&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  Step 5: Foreign Keys Are the Secret Weapon
&lt;/h2&gt;

&lt;p&gt;Joins are where text-to-SQL models fail most visibly. A model might correctly identify that you need both &lt;code&gt;users&lt;/code&gt; and &lt;code&gt;orders&lt;/code&gt;, then produce a broken query because it guessed the wrong join column.&lt;/p&gt;

&lt;p&gt;The fix is explicit: always include foreign key relationships in your schema prompt. Compare these two approaches:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Vague:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Table: orders
Columns: id, user_id, amount_cents, created_at
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Explicit:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Table: orders
Columns:
  - id (integer, primary key)
  - user_id (integer) — foreign key → users.id; never null
  - amount_cents (integer)
  - created_at (timestamp)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;When foreign keys are declared, models reliably produce correct JOINs:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- Generated correctly when FK is explicit:&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; 
  &lt;span class="n"&gt;u&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;email&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="k"&gt;COUNT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;o&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;order_count&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="k"&gt;SUM&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;o&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;amount_cents&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;total_revenue&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;users&lt;/span&gt; &lt;span class="n"&gt;u&lt;/span&gt;
&lt;span class="k"&gt;JOIN&lt;/span&gt; &lt;span class="n"&gt;orders&lt;/span&gt; &lt;span class="n"&gt;o&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;o&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;user_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;u&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;o&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;status&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'completed'&lt;/span&gt;
  &lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="n"&gt;o&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;created_at&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="n"&gt;NOW&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;INTERVAL&lt;/span&gt; &lt;span class="s1"&gt;'30 days'&lt;/span&gt;
&lt;span class="k"&gt;GROUP&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;u&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;email&lt;/span&gt;
&lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;total_revenue&lt;/span&gt; &lt;span class="k"&gt;DESC&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Without the FK hint, you're relying on the model to guess that &lt;code&gt;user_id&lt;/code&gt; in &lt;code&gt;orders&lt;/code&gt; maps to &lt;code&gt;id&lt;/code&gt; in &lt;code&gt;users&lt;/code&gt;. It usually gets this right — but "usually" isn't good enough for production.&lt;/p&gt;




&lt;h2&gt;
  
  
  Common Mistakes Developers Make
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Passing raw DDL without descriptions.&lt;/strong&gt; Your CREATE TABLE statements are a start, but they're optimized for the database engine, not for an LLM. Add comments and descriptions.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- Don't just pass this:&lt;/span&gt;
&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;ev&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="n"&gt;id&lt;/span&gt; &lt;span class="nb"&gt;SERIAL&lt;/span&gt; &lt;span class="k"&gt;PRIMARY&lt;/span&gt; &lt;span class="k"&gt;KEY&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;uid&lt;/span&gt; &lt;span class="nb"&gt;INT&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;typ&lt;/span&gt; &lt;span class="nb"&gt;VARCHAR&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;50&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
  &lt;span class="n"&gt;ts&lt;/span&gt; &lt;span class="n"&gt;TIMESTAMPTZ&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="c1"&gt;-- Also include:&lt;/span&gt;
&lt;span class="c1"&gt;-- Table: events (alias: ev)&lt;/span&gt;
&lt;span class="c1"&gt;-- Description: user activity events for product analytics&lt;/span&gt;
&lt;span class="c1"&gt;-- uid → users.id, typ = 'click' | 'page_view' | 'signup' | 'purchase'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Including every table for every query.&lt;/strong&gt; Token-bloat hurts accuracy. Filter schema by relevance before injecting.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Ignoring enum values.&lt;/strong&gt; If a column like &lt;code&gt;status&lt;/code&gt; has a fixed set of values, list them. The model needs to know that &lt;code&gt;'completed'&lt;/code&gt; is a valid status, not &lt;code&gt;'done'&lt;/code&gt; or &lt;code&gt;'success'&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;No sample data.&lt;/strong&gt; Even a single example row per table dramatically improves accuracy for edge cases — especially date formats, unit conventions, and enum spellings.&lt;/p&gt;




&lt;h2&gt;
  
  
  Key Takeaways
&lt;/h2&gt;

&lt;p&gt;The gap between a text-to-SQL tool that impresses and one you'd trust in production almost always comes down to schema quality, not model quality. Here's the short version:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Use descriptive table and column names; avoid abbreviations&lt;/li&gt;
&lt;li&gt;Declare foreign key relationships explicitly in your schema prompt&lt;/li&gt;
&lt;li&gt;Include column descriptions for ambiguous fields (units, enums, nullable meaning)&lt;/li&gt;
&lt;li&gt;Filter schema by relevance using vector search for large databases&lt;/li&gt;
&lt;li&gt;Add sample values or enum lists for categorical columns&lt;/li&gt;
&lt;li&gt;Never assume the model will infer what you didn't tell it&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The LLM is doing surprisingly sophisticated reasoning — it's matching your English words to your schema structure, inferring joins, and constructing syntactically valid SQL all in one pass. Give it the right schema context, and it'll give you queries you can actually use.&lt;/p&gt;




&lt;p&gt;Have you built a text-to-SQL system or integrated one into your product? I'd love to hear how you handle schema injection — drop a comment below with your approach, especially if you've tackled the large-schema problem in an interesting way.&lt;/p&gt;

</description>
      <category>sql</category>
      <category>database</category>
      <category>ai</category>
      <category>tutorial</category>
    </item>
    <item>
      <title>How to Prompt AI Tools to Write Accurate SQL Queries (And Why Most Developers Get This Wrong)</title>
      <dc:creator>Vivek Kumar</dc:creator>
      <pubDate>Wed, 20 May 2026 17:03:43 +0000</pubDate>
      <link>https://dev.to/vivekdraxlr/how-to-prompt-ai-tools-to-write-accurate-sql-queries-and-why-most-developers-get-this-wrong-55k4</link>
      <guid>https://dev.to/vivekdraxlr/how-to-prompt-ai-tools-to-write-accurate-sql-queries-and-why-most-developers-get-this-wrong-55k4</guid>
      <description>&lt;p&gt;If you've tried asking ChatGPT, Claude, or any AI SQL assistant to generate a query and gotten back something that looked plausible but was subtly wrong — you're not alone. The frustrating part is it often &lt;em&gt;runs&lt;/em&gt;. The database returns rows, the numbers look reasonable, and you ship it. Three days later, someone points out the totals are off by 20%.&lt;/p&gt;

&lt;p&gt;The problem isn't the AI. The problem is the prompt.&lt;/p&gt;

&lt;p&gt;Text-to-SQL works remarkably well when you give the model what it actually needs. According to AWS's benchmarking, GPT-4-class models achieve a &lt;strong&gt;94% first-try success rate&lt;/strong&gt; on ad-hoc analytics queries when the schema and foreign key constraints are properly provided. Without that context? You're closer to 60%. The difference is entirely in how you prompt.&lt;/p&gt;

&lt;p&gt;This guide covers the practical techniques — schema context, few-shot examples, business term definitions, and chain-of-thought decomposition — that separate accurate AI-generated SQL from the kind that silently lies to you.&lt;/p&gt;




&lt;h2&gt;
  
  
  Why AI Gets SQL Wrong (It's Not the Model's Fault)
&lt;/h2&gt;

&lt;p&gt;When you ask an AI "give me last month's revenue by plan tier," the model has to make a series of guesses:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Which table holds revenue? &lt;code&gt;orders&lt;/code&gt;? &lt;code&gt;subscriptions&lt;/code&gt;? &lt;code&gt;invoices&lt;/code&gt;?&lt;/li&gt;
&lt;li&gt;What column tracks the amount? &lt;code&gt;total&lt;/code&gt;? &lt;code&gt;amount_cents&lt;/code&gt;? &lt;code&gt;mrr&lt;/code&gt;?&lt;/li&gt;
&lt;li&gt;What does "last month" mean in your timezone?&lt;/li&gt;
&lt;li&gt;Is "revenue" recognized revenue, gross, or net of refunds?&lt;/li&gt;
&lt;li&gt;What column stores "plan tier"?&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Without your schema, the model invents plausible-sounding answers to all of these. It will write syntactically valid SQL that is semantically wrong for your specific database.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;The key insight:&lt;/strong&gt; From the model's perspective, the schema &lt;em&gt;is&lt;/em&gt; the problem space. You wouldn't ask a contractor to remodel your kitchen without giving them the floor plan.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  Technique 1: Always Include Your Schema (But Not All of It)
&lt;/h2&gt;

&lt;p&gt;The most impactful single change you can make is providing your table definitions in the prompt.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The naive approach&lt;/strong&gt; — dumping your entire database schema — backfires. Enterprise databases with 100+ tables drown the model in noise and push important tables out of its effective context window. The right move is to include only the tables relevant to the question.&lt;/p&gt;

&lt;p&gt;Here's a solid schema context block to include in your prompt:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="n"&gt;You&lt;/span&gt; &lt;span class="k"&gt;are&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt; &lt;span class="k"&gt;SQL&lt;/span&gt; &lt;span class="n"&gt;expert&lt;/span&gt; &lt;span class="n"&gt;working&lt;/span&gt; &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt; &lt;span class="n"&gt;PostgreSQL&lt;/span&gt; &lt;span class="k"&gt;database&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;

&lt;span class="n"&gt;Relevant&lt;/span&gt; &lt;span class="n"&gt;tables&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;

&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;subscriptions&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="n"&gt;id&lt;/span&gt;          &lt;span class="nb"&gt;SERIAL&lt;/span&gt; &lt;span class="k"&gt;PRIMARY&lt;/span&gt; &lt;span class="k"&gt;KEY&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;user_id&lt;/span&gt;     &lt;span class="nb"&gt;INTEGER&lt;/span&gt; &lt;span class="k"&gt;REFERENCES&lt;/span&gt; &lt;span class="n"&gt;users&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
  &lt;span class="n"&gt;plan&lt;/span&gt;        &lt;span class="nb"&gt;TEXT&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;         &lt;span class="c1"&gt;-- 'starter', 'pro', 'enterprise'&lt;/span&gt;
  &lt;span class="n"&gt;status&lt;/span&gt;      &lt;span class="nb"&gt;TEXT&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;         &lt;span class="c1"&gt;-- 'active', 'cancelled', 'trialing'&lt;/span&gt;
  &lt;span class="n"&gt;mrr_cents&lt;/span&gt;   &lt;span class="nb"&gt;INTEGER&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;      &lt;span class="c1"&gt;-- monthly recurring revenue in cents&lt;/span&gt;
  &lt;span class="n"&gt;started_at&lt;/span&gt;  &lt;span class="n"&gt;TIMESTAMPTZ&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;cancelled_at&lt;/span&gt; &lt;span class="n"&gt;TIMESTAMPTZ&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;users&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="n"&gt;id&lt;/span&gt;          &lt;span class="nb"&gt;SERIAL&lt;/span&gt; &lt;span class="k"&gt;PRIMARY&lt;/span&gt; &lt;span class="k"&gt;KEY&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;email&lt;/span&gt;       &lt;span class="nb"&gt;TEXT&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;created_at&lt;/span&gt;  &lt;span class="n"&gt;TIMESTAMPTZ&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;country&lt;/span&gt;     &lt;span class="nb"&gt;TEXT&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="k"&gt;Write&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt; &lt;span class="k"&gt;SQL&lt;/span&gt; &lt;span class="n"&gt;query&lt;/span&gt; &lt;span class="k"&gt;to&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;your&lt;/span&gt; &lt;span class="n"&gt;question&lt;/span&gt; &lt;span class="n"&gt;here&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Notice the inline comments on columns like &lt;code&gt;plan&lt;/code&gt; and &lt;code&gt;status&lt;/code&gt;. These are crucial — they tell the model the exact set of values to filter on. Without them, the model might write &lt;code&gt;WHERE plan = 'professional'&lt;/code&gt; instead of &lt;code&gt;WHERE plan = 'pro'&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Practical tip:&lt;/strong&gt; If you don't know which tables are relevant upfront, ask the model first: &lt;em&gt;"Given this question, which tables from the following list would you need?"&lt;/em&gt; Then provide only those.&lt;/p&gt;




&lt;h2&gt;
  
  
  Technique 2: Define Your Business Terms
&lt;/h2&gt;

&lt;p&gt;Business vocabulary rarely maps cleanly to column names. "Revenue" might mean &lt;code&gt;mrr_cents&lt;/code&gt;. "Active users" might mean users who logged in within 30 days, not users with &lt;code&gt;status = 'active'&lt;/code&gt;. "Churn" could be calculated six different ways.&lt;/p&gt;

&lt;p&gt;Add a &lt;strong&gt;glossary block&lt;/strong&gt; to your prompt for any domain-specific terms:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Business term definitions:
- "Active subscriber": a user with subscriptions.status = 'active'
- "MRR": SUM(mrr_cents) / 100.0, expressed in dollars
- "Last month": the full calendar month before the current month
  (e.g., if today is May 15 2026, last month = April 1–30 2026)
- "Churned": status changed from 'active' to 'cancelled' in the period
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This is the step most developers skip. And it's the reason queries that look right return numbers that don't match your finance team's spreadsheet.&lt;/p&gt;




&lt;h2&gt;
  
  
  Technique 3: Use Few-Shot Examples
&lt;/h2&gt;

&lt;p&gt;Few-shot prompting — showing the model one or two example question/query pairs — dramatically improves accuracy on complex queries. It teaches the model your style, your naming conventions, and the kinds of JOINs your schema requires.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="n"&gt;Here&lt;/span&gt; &lt;span class="k"&gt;are&lt;/span&gt; &lt;span class="n"&gt;two&lt;/span&gt; &lt;span class="n"&gt;example&lt;/span&gt; &lt;span class="n"&gt;queries&lt;/span&gt; &lt;span class="k"&gt;to&lt;/span&gt; &lt;span class="n"&gt;guide&lt;/span&gt; &lt;span class="n"&gt;your&lt;/span&gt; &lt;span class="k"&gt;output&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;

&lt;span class="c1"&gt;-- Question: How many new subscribers signed up in March 2026?&lt;/span&gt;
&lt;span class="c1"&gt;-- Query:&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="k"&gt;COUNT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;subscriptions&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;started_at&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="s1"&gt;'2026-03-01'&lt;/span&gt;
  &lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="n"&gt;started_at&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="s1"&gt;'2026-04-01'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="c1"&gt;-- Question: What is total MRR by plan for active subscriptions?&lt;/span&gt;
&lt;span class="c1"&gt;-- Query:&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt;
  &lt;span class="n"&gt;plan&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="k"&gt;SUM&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;mrr_cents&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;mrr_dollars&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;subscriptions&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;status&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'active'&lt;/span&gt;
&lt;span class="k"&gt;GROUP&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;plan&lt;/span&gt;
&lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;mrr_dollars&lt;/span&gt; &lt;span class="k"&gt;DESC&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="c1"&gt;-- Now answer this question: [your new question]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The examples serve as a template. The model learns that you use &lt;code&gt;&amp;gt;=&lt;/code&gt; / &lt;code&gt;&amp;lt;&lt;/code&gt; for date ranges (not &lt;code&gt;BETWEEN&lt;/code&gt;), that you divide cents by 100.0, and that you prefer &lt;code&gt;ORDER BY&lt;/code&gt; descending. You'll get output that fits into your codebase without needing cleanup.&lt;/p&gt;




&lt;h2&gt;
  
  
  Technique 4: Break Complex Questions into Steps (Chain-of-Thought)
&lt;/h2&gt;

&lt;p&gt;For multi-step analytical queries — cohort analysis, funnel calculations, retention — don't ask for the whole query at once. Ask the model to reason through it first.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="n"&gt;I&lt;/span&gt; &lt;span class="n"&gt;need&lt;/span&gt; &lt;span class="k"&gt;to&lt;/span&gt; &lt;span class="n"&gt;calculate&lt;/span&gt; &lt;span class="mi"&gt;30&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="k"&gt;day&lt;/span&gt; &lt;span class="n"&gt;retention&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;users&lt;/span&gt; &lt;span class="n"&gt;who&lt;/span&gt; &lt;span class="nb"&gt;signed&lt;/span&gt; &lt;span class="n"&gt;up&lt;/span&gt; &lt;span class="k"&gt;in&lt;/span&gt; &lt;span class="n"&gt;April&lt;/span&gt; &lt;span class="mi"&gt;2026&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;
&lt;span class="n"&gt;Retention&lt;/span&gt; &lt;span class="n"&gt;means&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;they&lt;/span&gt; &lt;span class="n"&gt;had&lt;/span&gt; &lt;span class="k"&gt;at&lt;/span&gt; &lt;span class="n"&gt;least&lt;/span&gt; &lt;span class="n"&gt;one&lt;/span&gt; &lt;span class="n"&gt;active&lt;/span&gt; &lt;span class="k"&gt;session&lt;/span&gt; &lt;span class="k"&gt;in&lt;/span&gt; &lt;span class="n"&gt;the&lt;/span&gt; &lt;span class="mi"&gt;30&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="k"&gt;day&lt;/span&gt; &lt;span class="k"&gt;window&lt;/span&gt;
&lt;span class="k"&gt;after&lt;/span&gt; &lt;span class="n"&gt;signup&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;tracked&lt;/span&gt; &lt;span class="k"&gt;in&lt;/span&gt; &lt;span class="n"&gt;the&lt;/span&gt; &lt;span class="nv"&gt;`events`&lt;/span&gt; &lt;span class="k"&gt;table&lt;/span&gt; &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;event_type&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'session_start'&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;

&lt;span class="k"&gt;Before&lt;/span&gt; &lt;span class="n"&gt;writing&lt;/span&gt; &lt;span class="n"&gt;the&lt;/span&gt; &lt;span class="k"&gt;SQL&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt; &lt;span class="n"&gt;What&lt;/span&gt; &lt;span class="n"&gt;intermediate&lt;/span&gt; &lt;span class="n"&gt;steps&lt;/span&gt; &lt;span class="k"&gt;are&lt;/span&gt; &lt;span class="n"&gt;needed&lt;/span&gt;&lt;span class="o"&gt;?&lt;/span&gt;
&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt; &lt;span class="n"&gt;Which&lt;/span&gt; &lt;span class="n"&gt;tables&lt;/span&gt; &lt;span class="n"&gt;will&lt;/span&gt; &lt;span class="n"&gt;you&lt;/span&gt; &lt;span class="k"&gt;join&lt;/span&gt;&lt;span class="o"&gt;?&lt;/span&gt;
&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt; &lt;span class="n"&gt;How&lt;/span&gt; &lt;span class="n"&gt;will&lt;/span&gt; &lt;span class="n"&gt;you&lt;/span&gt; &lt;span class="n"&gt;define&lt;/span&gt; &lt;span class="n"&gt;the&lt;/span&gt; &lt;span class="mi"&gt;30&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="k"&gt;day&lt;/span&gt; &lt;span class="k"&gt;window&lt;/span&gt;&lt;span class="o"&gt;?&lt;/span&gt;

&lt;span class="k"&gt;Then&lt;/span&gt; &lt;span class="k"&gt;write&lt;/span&gt; &lt;span class="n"&gt;the&lt;/span&gt; &lt;span class="k"&gt;final&lt;/span&gt; &lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;When you ask the model to think through the steps explicitly, it catches its own mistakes before producing the final SQL. This mirrors what an experienced developer does mentally before writing a complex query. The chain-of-thought approach consistently outperforms direct generation on multi-table, multi-condition queries.&lt;/p&gt;

&lt;p&gt;Here's what the output might look like for that retention query:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;WITH&lt;/span&gt; &lt;span class="n"&gt;april_signups&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;created_at&lt;/span&gt;
  &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;users&lt;/span&gt;
  &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;created_at&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="s1"&gt;'2026-04-01'&lt;/span&gt;
    &lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="n"&gt;created_at&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="s1"&gt;'2026-05-01'&lt;/span&gt;
&lt;span class="p"&gt;),&lt;/span&gt;
&lt;span class="n"&gt;retained&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="k"&gt;DISTINCT&lt;/span&gt; &lt;span class="n"&gt;u&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;
  &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;april_signups&lt;/span&gt; &lt;span class="n"&gt;u&lt;/span&gt;
  &lt;span class="k"&gt;JOIN&lt;/span&gt; &lt;span class="n"&gt;events&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;user_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;u&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;
    &lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;event_type&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'session_start'&lt;/span&gt;
    &lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;occurred_at&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="n"&gt;u&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;created_at&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;INTERVAL&lt;/span&gt; &lt;span class="s1"&gt;'1 day'&lt;/span&gt;
    &lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;occurred_at&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;=&lt;/span&gt; &lt;span class="n"&gt;u&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;created_at&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;INTERVAL&lt;/span&gt; &lt;span class="s1"&gt;'30 days'&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt;
  &lt;span class="k"&gt;COUNT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;DISTINCT&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;                        &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;total_signups&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="k"&gt;COUNT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;DISTINCT&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;                        &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;retained_users&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;ROUND&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="k"&gt;COUNT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;DISTINCT&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;)::&lt;/span&gt;&lt;span class="nb"&gt;NUMERIC&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt;
    &lt;span class="k"&gt;NULLIF&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;COUNT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;DISTINCT&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
  &lt;span class="p"&gt;)&lt;/span&gt;                                           &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;retention_pct&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;april_signups&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt;
&lt;span class="k"&gt;LEFT&lt;/span&gt; &lt;span class="k"&gt;JOIN&lt;/span&gt; &lt;span class="n"&gt;retained&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;A direct one-shot prompt rarely produces something this structurally sound. Decomposing the problem does.&lt;/p&gt;




&lt;h2&gt;
  
  
  Technique 5: Ask for Explanation and Validation
&lt;/h2&gt;

&lt;p&gt;After receiving a query, always ask the model to explain it:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;Explain&lt;/span&gt; &lt;span class="n"&gt;what&lt;/span&gt; &lt;span class="n"&gt;this&lt;/span&gt; &lt;span class="n"&gt;query&lt;/span&gt; &lt;span class="n"&gt;does&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;step&lt;/span&gt; &lt;span class="k"&gt;by&lt;/span&gt; &lt;span class="n"&gt;step&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;
&lt;span class="k"&gt;Are&lt;/span&gt; &lt;span class="n"&gt;there&lt;/span&gt; &lt;span class="k"&gt;any&lt;/span&gt; &lt;span class="n"&gt;edge&lt;/span&gt; &lt;span class="n"&gt;cases&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;nulls&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;division&lt;/span&gt; &lt;span class="k"&gt;by&lt;/span&gt; &lt;span class="n"&gt;zero&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;time&lt;/span&gt; &lt;span class="k"&gt;zone&lt;/span&gt; &lt;span class="n"&gt;assumptions&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;that&lt;/span&gt; &lt;span class="n"&gt;could&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;incorrect&lt;/span&gt; &lt;span class="n"&gt;results&lt;/span&gt;&lt;span class="o"&gt;?&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This serves two purposes: it helps you catch logical errors before running the query in production, and it forces the model to re-examine its own output. Models often catch their own mistakes during this review pass.&lt;/p&gt;

&lt;p&gt;For example, the model might flag:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;em&gt;"This query uses &lt;code&gt;CURRENT_DATE&lt;/code&gt; which assumes UTC. If your database runs in a different timezone, you may want &lt;code&gt;CURRENT_DATE AT TIME ZONE 'America/New_York'&lt;/code&gt;."&lt;/em&gt;&lt;/li&gt;
&lt;li&gt;&lt;em&gt;"If &lt;code&gt;mrr_cents&lt;/code&gt; is NULL for any rows, &lt;code&gt;SUM()&lt;/code&gt; will silently exclude them. You may want &lt;code&gt;COALESCE(mrr_cents, 0)&lt;/code&gt;."&lt;/em&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;These are exactly the gotchas that cause silent data quality issues.&lt;/p&gt;




&lt;h2&gt;
  
  
  Common Mistakes to Avoid
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Vague questions.&lt;/strong&gt; "Show me user activity" will produce a vague, probably useless query. "Show me the count of distinct users who triggered at least one &lt;code&gt;purchase&lt;/code&gt; event in the last 7 days, grouped by their &lt;code&gt;plan&lt;/code&gt;" will produce something accurate and useful.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;No schema, but expecting column-level accuracy.&lt;/strong&gt; The model will guess, and its guesses will be syntactically valid. That's what makes them dangerous.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Ignoring dialect differences.&lt;/strong&gt; If you're on BigQuery, asking a generic model for date functions might get you &lt;code&gt;DATE_TRUNC&lt;/code&gt; when you need &lt;code&gt;DATE_TRUNC(date, MONTH)&lt;/code&gt;. Always tell the model your database flavor (PostgreSQL, MySQL, BigQuery, Snowflake, etc.).&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Trusting output that executes without errors.&lt;/strong&gt; A query that runs is not a query that's correct. Always sanity-check with a small sample (&lt;code&gt;LIMIT 100&lt;/code&gt;), compare a few rows to known-good data, and check totals against a source of truth.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Not iterating.&lt;/strong&gt; The first query you get back is a first draft. Paste it back in with corrections: &lt;em&gt;"This is close, but I need the results grouped by week, not month, and I only want users in the 'pro' plan."&lt;/em&gt; Iteration gets you to accuracy faster than trying to write the perfect prompt on the first try.&lt;/p&gt;




&lt;h2&gt;
  
  
  Key Takeaways
&lt;/h2&gt;

&lt;p&gt;The gap between 60% and 94% AI SQL accuracy comes down to what you put into the prompt:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Include your schema&lt;/strong&gt; — just the relevant tables, with column comments for values&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Define business terms&lt;/strong&gt; — don't let the model guess what "revenue" or "active" means&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Add 1–2 example queries&lt;/strong&gt; — establish your style and naming conventions&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Decompose complex questions&lt;/strong&gt; — ask for reasoning steps before the final query&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Ask for explanation and edge-case review&lt;/strong&gt; — let the model catch its own mistakes&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Always specify your database flavor&lt;/strong&gt; — PostgreSQL, MySQL, BigQuery, etc.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;AI-powered SQL generation is genuinely useful once you treat prompting as a craft, not an afterthought. The developers who get the most out of these tools aren't the ones using the fanciest models — they're the ones who've learned to give the model what it actually needs to succeed.&lt;/p&gt;




&lt;p&gt;What techniques have worked well for you when prompting AI for SQL? Have you found ways to handle large, complex schemas effectively? Drop your approach in the comments — I'd love to compare notes.&lt;/p&gt;

</description>
      <category>sql</category>
      <category>database</category>
      <category>ai</category>
      <category>tutorial</category>
    </item>
    <item>
      <title>Multi-Tenant SQL Reporting: How to Show Each Customer Only Their Own Data</title>
      <dc:creator>Vivek Kumar</dc:creator>
      <pubDate>Wed, 13 May 2026 18:43:48 +0000</pubDate>
      <link>https://dev.to/vivekdraxlr/multi-tenant-sql-reporting-how-to-show-each-customer-only-their-own-data-1762</link>
      <guid>https://dev.to/vivekdraxlr/multi-tenant-sql-reporting-how-to-show-each-customer-only-their-own-data-1762</guid>
      <description>&lt;p&gt;You're three months into your SaaS. Customers want dashboards. The first one is easy — Acme Corp wants to see &lt;em&gt;their&lt;/em&gt; orders, &lt;em&gt;their&lt;/em&gt; signups, &lt;em&gt;their&lt;/em&gt; revenue. You write a quick &lt;code&gt;WHERE tenant_id = 42&lt;/code&gt; and ship it.&lt;/p&gt;

&lt;p&gt;Six months later you have 400 tenants, 30 reports, an AI assistant that writes SQL on the fly, and an engineer who one Tuesday morning forgets the &lt;code&gt;WHERE&lt;/code&gt; clause on a single query. Now Acme Corp can see Globex's data. That's a Monday-morning headline you don't want.&lt;/p&gt;

&lt;p&gt;This article is about how to architect multi-tenant SQL reporting so a single forgotten clause never becomes a breach. We'll cover tenant schema design, Postgres Row-Level Security, how to wire it to JWTs, indexing for performance, and the gotchas that bite people in production.&lt;/p&gt;

&lt;h2&gt;
  
  
  The three multi-tenant patterns (and why one wins for reporting)
&lt;/h2&gt;

&lt;p&gt;Before any SQL, pick your isolation model. There are three classic options:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
&lt;th&gt;Pattern&lt;/th&gt;
&lt;th&gt;Isolation&lt;/th&gt;
&lt;th&gt;Cost&lt;/th&gt;
&lt;th&gt;Best for&lt;/th&gt;
&lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
&lt;td&gt;Database-per-tenant&lt;/td&gt;
&lt;td&gt;Strongest&lt;/td&gt;
&lt;td&gt;Highest (ops, migrations, connections)&lt;/td&gt;
&lt;td&gt;Enterprise, regulated industries&lt;/td&gt;
&lt;/tr&gt;
    &lt;tr&gt;
&lt;td&gt;Schema-per-tenant&lt;/td&gt;
&lt;td&gt;Strong&lt;/td&gt;
&lt;td&gt;High (schemas multiply fast)&lt;/td&gt;
&lt;td&gt;Mid-market with &amp;lt;1000 tenants&lt;/td&gt;
&lt;/tr&gt;
    &lt;tr&gt;
&lt;td&gt;Shared schema with tenant_id&lt;/td&gt;
&lt;td&gt;Logical&lt;/td&gt;
&lt;td&gt;Lowest&lt;/td&gt;
&lt;td&gt;Most B2B SaaS&lt;/td&gt;
&lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;For 90% of SaaS apps, &lt;strong&gt;shared schema with a &lt;code&gt;tenant_id&lt;/code&gt; column&lt;/strong&gt; is the right call. It's cheap, it scales, and Postgres gives you the tools to make it safe. Everything below assumes this pattern.&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 1: Tag every row with a tenant_id
&lt;/h2&gt;

&lt;p&gt;Every table that stores customer data needs a tenant discriminator. No exceptions. Even your &lt;code&gt;events&lt;/code&gt;, &lt;code&gt;audit_logs&lt;/code&gt;, and &lt;code&gt;ai_query_history&lt;/code&gt; tables.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;orders&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="n"&gt;id&lt;/span&gt;           &lt;span class="n"&gt;BIGSERIAL&lt;/span&gt; &lt;span class="k"&gt;PRIMARY&lt;/span&gt; &lt;span class="k"&gt;KEY&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;tenant_id&lt;/span&gt;    &lt;span class="nb"&gt;BIGINT&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt; &lt;span class="k"&gt;REFERENCES&lt;/span&gt; &lt;span class="n"&gt;tenants&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
  &lt;span class="n"&gt;customer_id&lt;/span&gt;  &lt;span class="nb"&gt;BIGINT&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;amount_cents&lt;/span&gt; &lt;span class="nb"&gt;INT&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;status&lt;/span&gt;       &lt;span class="nb"&gt;TEXT&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;created_at&lt;/span&gt;   &lt;span class="n"&gt;TIMESTAMPTZ&lt;/span&gt; &lt;span class="k"&gt;DEFAULT&lt;/span&gt; &lt;span class="n"&gt;now&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;INDEX&lt;/span&gt; &lt;span class="n"&gt;orders_tenant_created_idx&lt;/span&gt;
  &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;orders&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;tenant_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;created_at&lt;/span&gt; &lt;span class="k"&gt;DESC&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Two things to notice: &lt;code&gt;tenant_id&lt;/code&gt; is &lt;code&gt;NOT NULL&lt;/code&gt; (a null tenant means an orphan row that bypasses your filters), and the index leads with &lt;code&gt;tenant_id&lt;/code&gt;. Every reporting query will filter by tenant, so this becomes your bread-and-butter index.&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 2: Enable Row-Level Security as a safety net
&lt;/h2&gt;

&lt;p&gt;Application-level filtering is fine until someone forgets. Postgres Row-Level Security (RLS) enforces the filter at the database, so even a raw &lt;code&gt;SELECT * FROM orders&lt;/code&gt; returns only the rows the current tenant is allowed to see.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- 1. Turn on RLS&lt;/span&gt;
&lt;span class="k"&gt;ALTER&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;orders&lt;/span&gt; &lt;span class="n"&gt;ENABLE&lt;/span&gt; &lt;span class="k"&gt;ROW&lt;/span&gt; &lt;span class="k"&gt;LEVEL&lt;/span&gt; &lt;span class="k"&gt;SECURITY&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="c1"&gt;-- 2. Define the policy&lt;/span&gt;
&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="n"&gt;POLICY&lt;/span&gt; &lt;span class="n"&gt;tenant_isolation&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;orders&lt;/span&gt;
  &lt;span class="k"&gt;FOR&lt;/span&gt; &lt;span class="k"&gt;ALL&lt;/span&gt;
  &lt;span class="k"&gt;USING&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;tenant_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;current_setting&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'app.tenant_id'&lt;/span&gt;&lt;span class="p"&gt;)::&lt;/span&gt;&lt;span class="nb"&gt;BIGINT&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="k"&gt;WITH&lt;/span&gt; &lt;span class="k"&gt;CHECK&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;tenant_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;current_setting&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'app.tenant_id'&lt;/span&gt;&lt;span class="p"&gt;)::&lt;/span&gt;&lt;span class="nb"&gt;BIGINT&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;USING&lt;/code&gt; clause filters rows you can read; the &lt;code&gt;WITH CHECK&lt;/code&gt; clause prevents you from inserting or updating rows into another tenant. Together they cover both directions.&lt;/p&gt;

&lt;p&gt;Now every connection has to set &lt;code&gt;app.tenant_id&lt;/code&gt; before issuing queries:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SET&lt;/span&gt; &lt;span class="k"&gt;LOCAL&lt;/span&gt; &lt;span class="n"&gt;app&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;tenant_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'42'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="k"&gt;count&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;orders&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;  &lt;span class="c1"&gt;-- automatically scoped to tenant 42&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;SET LOCAL&lt;/code&gt; ties the setting to the current transaction, so it can't leak across requests on a pooled connection — important.&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 3: Wire the tenant context to your auth layer
&lt;/h2&gt;

&lt;p&gt;The setting has to come from a trusted source. The standard pattern is to extract it from a verified JWT:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Pseudocode — runs on every request inside a transaction
&lt;/span&gt;&lt;span class="n"&gt;claims&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;jwt&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;decode&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;request&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;token&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;public_key&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;tenant_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;claims&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;tenant_id&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

&lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;db&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;transaction&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;tx&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;tx&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;execute&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;SET LOCAL app.tenant_id = %s&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;tenant_id&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
    &lt;span class="c1"&gt;# All queries inside this transaction are now tenant-scoped
&lt;/span&gt;    &lt;span class="nf"&gt;run_report&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;tx&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Two non-negotiables:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Decode the JWT server-side with a verified signature.&lt;/strong&gt; Never trust a &lt;code&gt;tenant_id&lt;/code&gt; coming from a query parameter or request body.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Use &lt;code&gt;SET LOCAL&lt;/code&gt;, not &lt;code&gt;SET&lt;/code&gt;.&lt;/strong&gt; Connection pools recycle connections; &lt;code&gt;SET&lt;/code&gt; persists, &lt;code&gt;SET LOCAL&lt;/code&gt; doesn't.&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  Step 4: Reporting queries that "just work"
&lt;/h2&gt;

&lt;p&gt;With RLS on, your reporting queries get simpler — you don't write &lt;code&gt;WHERE tenant_id = ?&lt;/code&gt; in every query. Compare:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- Before RLS — you must remember every time&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;date_trunc&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'day'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;created_at&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="k"&gt;day&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
       &lt;span class="k"&gt;count&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;orders&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
       &lt;span class="k"&gt;sum&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;amount_cents&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;revenue&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;orders&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;tenant_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="err"&gt;$&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;
  &lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="n"&gt;created_at&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="n"&gt;now&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;interval&lt;/span&gt; &lt;span class="s1"&gt;'30 days'&lt;/span&gt;
&lt;span class="k"&gt;GROUP&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
&lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="c1"&gt;-- After RLS — same query, scoped automatically&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;date_trunc&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'day'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;created_at&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="k"&gt;day&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
       &lt;span class="k"&gt;count&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;orders&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
       &lt;span class="k"&gt;sum&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;amount_cents&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;revenue&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;orders&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;created_at&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="n"&gt;now&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;interval&lt;/span&gt; &lt;span class="s1"&gt;'30 days'&lt;/span&gt;
&lt;span class="k"&gt;GROUP&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
&lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This matters most for &lt;strong&gt;AI-generated SQL&lt;/strong&gt; and &lt;strong&gt;user-defined reports&lt;/strong&gt;. If an AI assistant writes a query against your warehouse, you don't want safety to depend on whether it remembered the tenant filter. RLS makes the database itself the last line of defense.&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 5: Handle cross-tenant admin queries explicitly
&lt;/h2&gt;

&lt;p&gt;Eventually you'll want internal queries that span tenants — billing rollups, churn cohorts, support tooling. Don't disable RLS; create a separate admin role:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;ROLE&lt;/span&gt; &lt;span class="n"&gt;app_admin&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;ALTER&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;orders&lt;/span&gt; &lt;span class="k"&gt;FORCE&lt;/span&gt; &lt;span class="k"&gt;ROW&lt;/span&gt; &lt;span class="k"&gt;LEVEL&lt;/span&gt; &lt;span class="k"&gt;SECURITY&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="n"&gt;POLICY&lt;/span&gt; &lt;span class="n"&gt;admin_all_access&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;orders&lt;/span&gt;
  &lt;span class="k"&gt;FOR&lt;/span&gt; &lt;span class="k"&gt;ALL&lt;/span&gt;
  &lt;span class="k"&gt;TO&lt;/span&gt; &lt;span class="n"&gt;app_admin&lt;/span&gt;
  &lt;span class="k"&gt;USING&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;true&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;FORCE ROW LEVEL SECURITY&lt;/code&gt; makes RLS apply even to the table owner, which prevents your migrations user from accidentally reading everyone's data. The &lt;code&gt;app_admin&lt;/code&gt; role has its own explicit policy.&lt;/p&gt;

&lt;h2&gt;
  
  
  Indexing for tenant-scoped queries
&lt;/h2&gt;

&lt;p&gt;A common surprise: after adding RLS, queries that used to be fast get slow. The fix is almost always indexes that lead with &lt;code&gt;tenant_id&lt;/code&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- Good — supports the most common reporting pattern&lt;/span&gt;
&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;INDEX&lt;/span&gt; &lt;span class="n"&gt;orders_tenant_status_created_idx&lt;/span&gt;
  &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;orders&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;tenant_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;status&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;created_at&lt;/span&gt; &lt;span class="k"&gt;DESC&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="c1"&gt;-- Also useful for tenant-scoped joins&lt;/span&gt;
&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;INDEX&lt;/span&gt; &lt;span class="n"&gt;events_tenant_user_idx&lt;/span&gt;
  &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;events&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;tenant_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;user_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;created_at&lt;/span&gt; &lt;span class="k"&gt;DESC&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Rule of thumb: any index you'd create on a single-tenant table, prefix with &lt;code&gt;tenant_id&lt;/code&gt; for a multi-tenant one. Postgres can then satisfy both the RLS predicate and your &lt;code&gt;ORDER BY&lt;/code&gt;/filter from the same index.&lt;/p&gt;

&lt;h2&gt;
  
  
  Common gotchas
&lt;/h2&gt;

&lt;p&gt;Here's what bites people in production:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Forgetting to set &lt;code&gt;app.tenant_id&lt;/code&gt;.&lt;/strong&gt; If RLS is on and the setting isn't set, queries return zero rows — not an error. Wrap your transaction setup so missing context throws loudly in development.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Using &lt;code&gt;SET&lt;/code&gt; instead of &lt;code&gt;SET LOCAL&lt;/code&gt;.&lt;/strong&gt; On a pooled connection, the next request inherits the previous tenant's context. This is a real, observed bug — and a nasty one because the symptom is "wrong customer's data shown."&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;RLS on the parent table but not on partitions.&lt;/strong&gt; Partitioned tables need RLS enabled on each partition, or queries skip the policy entirely. Easy to miss.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Cross-tenant joins.&lt;/strong&gt; A query like &lt;code&gt;SELECT * FROM orders o JOIN users u ON o.user_id = u.id&lt;/code&gt; works fine if both tables have RLS. But if &lt;code&gt;users&lt;/code&gt; doesn't, you can leak data through the join. Enable RLS on every tenant-scoped table — not just the ones you think are sensitive.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Bypassing RLS for views.&lt;/strong&gt; Views run with the privileges of their creator by default. Use &lt;code&gt;CREATE VIEW … WITH (security_invoker = true)&lt;/code&gt; (Postgres 15+) so RLS evaluates against the calling user, not the view owner.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;No tenant_id on derived tables.&lt;/strong&gt; Materialized views and reporting tables get rebuilt nightly and someone forgets the tenant column. Now your "fast dashboard" returns everyone's data. Always carry tenant_id through.&lt;/p&gt;

&lt;h2&gt;
  
  
  Key takeaways
&lt;/h2&gt;

&lt;p&gt;For multi-tenant SQL reporting, shared schema with &lt;code&gt;tenant_id&lt;/code&gt; is the pragmatic default. Row-Level Security turns that filter from a convention you have to remember into a constraint the database enforces. Pair it with JWT-driven session context, indexes that lead with &lt;code&gt;tenant_id&lt;/code&gt;, and an explicit admin role for cross-tenant queries.&lt;/p&gt;

&lt;p&gt;The payoff is that your reporting layer — whether you build it yourself, embed a third-party dashboard, or let an AI assistant write queries on top of your schema — becomes safe by default. The database itself refuses to leak data, no matter what query lands on it.&lt;/p&gt;

&lt;h2&gt;
  
  
  Over to you
&lt;/h2&gt;

&lt;p&gt;How are you handling tenant isolation in your reporting layer today — RLS, application filters, schema-per-tenant, or something else? Have you hit a multi-tenant gotcha that wasn't on this list? Drop it in the comments — I'd love to compare notes.&lt;/p&gt;

</description>
      <category>sql</category>
      <category>database</category>
      <category>postgres</category>
      <category>webdev</category>
    </item>
    <item>
      <title>Don't Trust AI-Generated SQL Blindly: A Developer's Validation Checklist</title>
      <dc:creator>Vivek Kumar</dc:creator>
      <pubDate>Tue, 12 May 2026 09:44:13 +0000</pubDate>
      <link>https://dev.to/vivekdraxlr/dont-trust-ai-generated-sql-blindly-a-developers-validation-checklist-5f9g</link>
      <guid>https://dev.to/vivekdraxlr/dont-trust-ai-generated-sql-blindly-a-developers-validation-checklist-5f9g</guid>
      <description>&lt;p&gt;AI coding assistants can knock out a SQL query in seconds. You describe what you want in plain English, and out comes a &lt;code&gt;SELECT&lt;/code&gt; statement that looks completely reasonable. So you copy it, paste it into your app or dashboard query runner, and hit execute.&lt;/p&gt;

&lt;p&gt;This is where things go wrong.&lt;/p&gt;

&lt;p&gt;Unlike a syntax error that crashes loudly, a &lt;em&gt;semantically wrong&lt;/em&gt; SQL query runs successfully and returns a result. The result looks plausible. You ship the dashboard. A week later, someone notices the revenue numbers are 15% off because the AI joined on the wrong foreign key. The silent failure is the dangerous one.&lt;/p&gt;

&lt;p&gt;This post is a practical checklist for validating AI-generated SQL before it touches your production database — whether you're building a reporting feature, an embedded dashboard, or using a text-to-SQL tool internally.&lt;/p&gt;




&lt;h2&gt;
  
  
  Why AI Gets SQL Wrong
&lt;/h2&gt;

&lt;p&gt;Large language models generate SQL by predicting likely token sequences based on training data. They don't &lt;em&gt;execute&lt;/em&gt; the query or check your actual schema — they hallucinate a plausible-looking result. The four failure patterns that show up most often are:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. Schema hallucination&lt;/strong&gt; — the model invents column names that don't exist, or references the right column from the wrong table.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. Faulty joins&lt;/strong&gt; — missing &lt;code&gt;ON&lt;/code&gt; clauses (producing accidental Cartesian products), wrong join keys, or incorrect join types (&lt;code&gt;LEFT&lt;/code&gt; vs &lt;code&gt;INNER&lt;/code&gt;).&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;3. Aggregation mistakes&lt;/strong&gt; — applying &lt;code&gt;COUNT&lt;/code&gt; or &lt;code&gt;SUM&lt;/code&gt; to the wrong column, forgetting &lt;code&gt;GROUP BY&lt;/code&gt; columns, or confusing &lt;code&gt;WHERE&lt;/code&gt; (row-level) with &lt;code&gt;HAVING&lt;/code&gt; (group-level).&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;4. Missing filters&lt;/strong&gt; — dropping a &lt;code&gt;WHERE&lt;/code&gt; clause entirely, or missing a tenant/user scoping condition that you absolutely need for correctness and security.&lt;/p&gt;

&lt;p&gt;A 2026 benchmark study found that even top models only achieve ~78% execution accuracy on zero-shot text-to-SQL tasks. That means roughly 1 in 5 queries is wrong — often in ways that won't throw an error.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Validation Checklist
&lt;/h2&gt;

&lt;h3&gt;
  
  
  ✅ 1. Check Every Column Against Your Actual Schema
&lt;/h3&gt;

&lt;p&gt;Before anything else, look up every column name in the generated query against your real schema. Don't assume — AI models frequently confuse similar column names like &lt;code&gt;user_id&lt;/code&gt; vs &lt;code&gt;account_id&lt;/code&gt;, or reference &lt;code&gt;created_at&lt;/code&gt; on a table that uses &lt;code&gt;created_date&lt;/code&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- AI generated this&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;u&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;o&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;total_amount&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;users&lt;/span&gt; &lt;span class="n"&gt;u&lt;/span&gt;
&lt;span class="k"&gt;JOIN&lt;/span&gt; &lt;span class="n"&gt;orders&lt;/span&gt; &lt;span class="n"&gt;o&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;u&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;user_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;o&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;user_id&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;o&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;status&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'completed'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="c1"&gt;-- Your schema actually uses:&lt;/span&gt;
&lt;span class="c1"&gt;-- users.id (not users.user_id)&lt;/span&gt;
&lt;span class="c1"&gt;-- orders.customer_id (not orders.user_id)&lt;/span&gt;
&lt;span class="c1"&gt;-- orders.amount (not orders.total_amount)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Result if run as-is: either a syntax error (best case) or a silent Cartesian product (worst case). The fix is one minute of schema cross-referencing.&lt;/p&gt;




&lt;h3&gt;
  
  
  ✅ 2. Trace Every JOIN Condition
&lt;/h3&gt;

&lt;p&gt;For each &lt;code&gt;JOIN&lt;/code&gt;, ask yourself:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Is the &lt;code&gt;ON&lt;/code&gt; condition using the actual primary key → foreign key relationship?&lt;/li&gt;
&lt;li&gt;Is the join type (&lt;code&gt;INNER&lt;/code&gt;, &lt;code&gt;LEFT&lt;/code&gt;, &lt;code&gt;RIGHT&lt;/code&gt;) what you actually want?&lt;/li&gt;
&lt;li&gt;Could this produce duplicate rows?
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- Potentially dangerous: AI used INNER JOIN, dropping users without orders&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;u&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;COUNT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;o&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;order_count&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;users&lt;/span&gt; &lt;span class="n"&gt;u&lt;/span&gt;
&lt;span class="k"&gt;INNER&lt;/span&gt; &lt;span class="k"&gt;JOIN&lt;/span&gt; &lt;span class="n"&gt;orders&lt;/span&gt; &lt;span class="n"&gt;o&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;u&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;o&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;customer_id&lt;/span&gt;
&lt;span class="k"&gt;GROUP&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;u&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="c1"&gt;-- Correct if you want ALL users, including those with zero orders:&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;u&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;COUNT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;o&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;order_count&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;users&lt;/span&gt; &lt;span class="n"&gt;u&lt;/span&gt;
&lt;span class="k"&gt;LEFT&lt;/span&gt; &lt;span class="k"&gt;JOIN&lt;/span&gt; &lt;span class="n"&gt;orders&lt;/span&gt; &lt;span class="n"&gt;o&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;u&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;o&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;customer_id&lt;/span&gt;
&lt;span class="k"&gt;GROUP&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;u&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;INNER JOIN&lt;/code&gt; version silently excludes users with no orders. If you're building a "user activity" report, that's a significant data gap.&lt;/p&gt;




&lt;h3&gt;
  
  
  ✅ 3. Verify Aggregation Logic
&lt;/h3&gt;

&lt;p&gt;Check that:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;GROUP BY&lt;/code&gt; includes all non-aggregated columns in &lt;code&gt;SELECT&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;SUM&lt;/code&gt; and &lt;code&gt;COUNT&lt;/code&gt; are applied to the right columns&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;HAVING&lt;/code&gt; is used for filtering aggregated results (not &lt;code&gt;WHERE&lt;/code&gt;)
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- Bug: filtering on an aggregate in WHERE (will error in most databases)&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;customer_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;SUM&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;amount&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;total&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;orders&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="k"&gt;SUM&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;amount&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;1000&lt;/span&gt;
&lt;span class="k"&gt;GROUP&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;customer_id&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="c1"&gt;-- Correct:&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;customer_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;SUM&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;amount&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;total&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;orders&lt;/span&gt;
&lt;span class="k"&gt;GROUP&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;customer_id&lt;/span&gt;
&lt;span class="k"&gt;HAVING&lt;/span&gt; &lt;span class="k"&gt;SUM&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;amount&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;AI tools make this exact mistake surprisingly often, especially when you phrase the question as "show me customers who spent more than $1000."&lt;/p&gt;




&lt;h3&gt;
  
  
  ✅ 4. Check for Missing Tenant/User Scoping
&lt;/h3&gt;

&lt;p&gt;If you're building a multi-tenant SaaS app or any system where different users should only see their own data, &lt;strong&gt;this is the most critical check&lt;/strong&gt;. AI tools have no idea your &lt;code&gt;events&lt;/code&gt; table contains data from 500 different customers.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- AI generates this when asked "show me top pages by view count"&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;page_url&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;COUNT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;views&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;page_events&lt;/span&gt;
&lt;span class="k"&gt;GROUP&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;page_url&lt;/span&gt;
&lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;views&lt;/span&gt; &lt;span class="k"&gt;DESC&lt;/span&gt;
&lt;span class="k"&gt;LIMIT&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="c1"&gt;-- What you actually need in a multi-tenant app:&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;page_url&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;COUNT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;views&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;page_events&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;account_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="n"&gt;current_account_id&lt;/span&gt;  &lt;span class="c1"&gt;-- ← THIS is what the AI forgot&lt;/span&gt;
&lt;span class="k"&gt;GROUP&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;page_url&lt;/span&gt;
&lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;views&lt;/span&gt; &lt;span class="k"&gt;DESC&lt;/span&gt;
&lt;span class="k"&gt;LIMIT&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Running the AI version against production would expose every customer's data to every other customer. Always add tenant scoping filters as a mandatory review step.&lt;/p&gt;




&lt;h3&gt;
  
  
  ✅ 5. Run Against a Staging Database First
&lt;/h3&gt;

&lt;p&gt;Never run an untested AI-generated query directly on production — especially anything that writes data. For &lt;code&gt;SELECT&lt;/code&gt; queries, use a staging copy of your database and compare:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Row counts (are they in the expected ballpark?)&lt;/li&gt;
&lt;li&gt;Known values (does the result match what you'd expect for a test account?)&lt;/li&gt;
&lt;li&gt;Edge cases (what happens with NULL values? What about accounts with no data?)
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- Spot-check: does this query return the right count for a known account?&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="k"&gt;COUNT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;orders&lt;/span&gt; &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;customer_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;42&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;  &lt;span class="c1"&gt;-- you know this customer has 7 orders&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If the result is &lt;code&gt;7&lt;/code&gt;, you're on the right track. If it's &lt;code&gt;0&lt;/code&gt; or &lt;code&gt;700&lt;/code&gt;, something is wrong with the query logic.&lt;/p&gt;




&lt;h3&gt;
  
  
  ✅ 6. Watch Out for NULL Handling
&lt;/h3&gt;

&lt;p&gt;AI-generated queries often miss NULL edge cases. The &lt;code&gt;=&lt;/code&gt; operator doesn't work with NULL — it silently excludes rows.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- This will miss rows where discount_code IS NULL&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;orders&lt;/span&gt; &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;discount_code&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="s1"&gt;'PROMO10'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="c1"&gt;-- Correct if you want all orders without that specific code, including nulls:&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;orders&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;discount_code&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="s1"&gt;'PROMO10'&lt;/span&gt; &lt;span class="k"&gt;OR&lt;/span&gt; &lt;span class="n"&gt;discount_code&lt;/span&gt; &lt;span class="k"&gt;IS&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This is especially common in &lt;code&gt;WHERE&lt;/code&gt; filters on optional fields like coupon codes, referral sources, or plan tiers.&lt;/p&gt;




&lt;h3&gt;
  
  
  ✅ 7. For Write Operations: Always Dry-Run First
&lt;/h3&gt;

&lt;p&gt;If the AI generates an &lt;code&gt;UPDATE&lt;/code&gt;, &lt;code&gt;INSERT&lt;/code&gt;, or &lt;code&gt;DELETE&lt;/code&gt;, wrap it in a transaction and inspect the affected rows before committing:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;BEGIN&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="k"&gt;UPDATE&lt;/span&gt; &lt;span class="n"&gt;subscriptions&lt;/span&gt;
&lt;span class="k"&gt;SET&lt;/span&gt; &lt;span class="n"&gt;status&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'cancelled'&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;trial_ends_at&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="n"&gt;NOW&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="n"&gt;status&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'trialing'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="c1"&gt;-- Check what would be affected before committing:&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="k"&gt;COUNT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;subscriptions&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;trial_ends_at&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="n"&gt;NOW&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="n"&gt;status&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'trialing'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="k"&gt;ROLLBACK&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;  &lt;span class="c1"&gt;-- Don't commit until you're sure&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This gives you a preview of the blast radius before any data changes are permanent.&lt;/p&gt;




&lt;h2&gt;
  
  
  A Simple Pre-Run Mental Model
&lt;/h2&gt;

&lt;p&gt;Before executing any AI-generated query, ask yourself:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;Question&lt;/th&gt;
      &lt;th&gt;What to check&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;Do all columns exist?&lt;/td&gt;
      &lt;td&gt;Cross-reference against schema&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Are all JOINs correct?&lt;/td&gt;
      &lt;td&gt;Verify keys and join types&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Is aggregation right?&lt;/td&gt;
      &lt;td&gt;Check GROUP BY, HAVING vs WHERE&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Is tenant data scoped?&lt;/td&gt;
      &lt;td&gt;Look for account_id / user_id filters&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Are NULLs handled?&lt;/td&gt;
      &lt;td&gt;Check optional column filters&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Is this a write operation?&lt;/td&gt;
      &lt;td&gt;Use BEGIN/ROLLBACK to preview&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h2&gt;
  
  
  Common Mistakes to Watch For
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Trusting output that "looks right"&lt;/strong&gt; — plausible-looking results are the most dangerous kind of wrong&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Skipping staging for read-only queries&lt;/strong&gt; — even SELECTs can cause problems (performance, data leakage) if they're wrong&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Not providing schema context to the AI&lt;/strong&gt; — the better your prompt (including table definitions and sample values), the more accurate the output&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Running AI queries against production without row limits&lt;/strong&gt; — add &lt;code&gt;LIMIT 100&lt;/code&gt; while testing to avoid accidentally returning or processing millions of rows&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Key Takeaways
&lt;/h2&gt;

&lt;p&gt;AI tools are genuinely useful for writing SQL faster — especially for repetitive patterns and boilerplate joins. But they're drafting assistants, not a replacement for your understanding of the data model. Every AI-generated query deserves at least a 30-second review against the checklist above.&lt;/p&gt;

&lt;p&gt;The silent failure — a query that runs cleanly but returns wrong data — is what makes SQL validation non-negotiable. A 15% revenue discrepancy discovered a week after shipping is a bad day. A 30-second schema check is not.&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;Do you have a validation workflow you follow for AI-generated SQL?&lt;/strong&gt; Share it in the comments — I'd love to hear how other teams handle this, especially in multi-tenant or high-stakes reporting contexts.&lt;/p&gt;

</description>
      <category>sql</category>
      <category>database</category>
      <category>ai</category>
      <category>webdev</category>
    </item>
    <item>
      <title>Beyond Basics: Mastering SQL's LEAD, LAG, RANK, DENSE_RANK, and NTILE</title>
      <dc:creator>Vivek Kumar</dc:creator>
      <pubDate>Mon, 11 May 2026 13:58:12 +0000</pubDate>
      <link>https://dev.to/vivekdraxlr/beyond-basics-mastering-sqls-lead-lag-rank-denserank-and-ntile-3i8j</link>
      <guid>https://dev.to/vivekdraxlr/beyond-basics-mastering-sqls-lead-lag-rank-denserank-and-ntile-3i8j</guid>
      <description>&lt;p&gt;You've learned that window functions exist. You've run a &lt;code&gt;ROW_NUMBER()&lt;/code&gt; or two. But there's a whole tier of power hiding just one step further — functions that let you compare rows against their neighbors, assign competition-style rankings, and slice datasets into meaningful percentile buckets. No self-joins. No correlated subqueries. No tears.&lt;/p&gt;

&lt;p&gt;This article goes deep on five advanced window functions: &lt;code&gt;LEAD&lt;/code&gt;, &lt;code&gt;LAG&lt;/code&gt;, &lt;code&gt;RANK&lt;/code&gt;, &lt;code&gt;DENSE_RANK&lt;/code&gt;, and &lt;code&gt;NTILE&lt;/code&gt;. Each one solves a specific class of problem you'll hit constantly in analytics, reporting, and data engineering. We'll use realistic scenarios — e-commerce orders, employee salaries, and product sales — so you can see exactly where these functions shine.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Prerequisite:&lt;/strong&gt; This article assumes you already know what the &lt;code&gt;OVER()&lt;/code&gt; clause and &lt;code&gt;PARTITION BY&lt;/code&gt; are. If not, check out the intro to SQL window functions first.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  The LEAD and LAG Functions: Looking Forward and Backward
&lt;/h2&gt;

&lt;p&gt;&lt;code&gt;LAG&lt;/code&gt; retrieves a value from a &lt;em&gt;previous&lt;/em&gt; row. &lt;code&gt;LEAD&lt;/code&gt; retrieves a value from a &lt;em&gt;future&lt;/em&gt; row. Both are essential for time-series comparisons where you need to know "how did this value change from last period?"&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Syntax:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="n"&gt;LAG&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;column&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;offset&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;default_value&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;OVER&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;PARTITION&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="p"&gt;...&lt;/span&gt; &lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="p"&gt;...)&lt;/span&gt;
&lt;span class="n"&gt;LEAD&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;column&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;offset&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;default_value&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;OVER&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;PARTITION&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="p"&gt;...&lt;/span&gt; &lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="p"&gt;...)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;offset&lt;/code&gt; — how many rows back (LAG) or forward (LEAD) to look. Defaults to 1.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;default_value&lt;/code&gt; — what to return when there's no row at that offset (instead of NULL).&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Example 1: Month-over-Month Revenue Change
&lt;/h3&gt;

&lt;p&gt;You have a &lt;code&gt;monthly_revenue&lt;/code&gt; table tracking each month's sales:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;monthly_revenue&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="k"&gt;month&lt;/span&gt;       &lt;span class="nb"&gt;DATE&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;product_id&lt;/span&gt;  &lt;span class="nb"&gt;INT&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;revenue&lt;/span&gt;     &lt;span class="nb"&gt;NUMERIC&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;12&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;To calculate the change from the previous month for each product:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt;
  &lt;span class="k"&gt;month&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;product_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;revenue&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;LAG&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;revenue&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;OVER&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="k"&gt;PARTITION&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;product_id&lt;/span&gt;
    &lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="k"&gt;month&lt;/span&gt;
  &lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;prev_month_revenue&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;revenue&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;LAG&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;revenue&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;OVER&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="k"&gt;PARTITION&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;product_id&lt;/span&gt;
    &lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="k"&gt;month&lt;/span&gt;
  &lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;revenue_change&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;monthly_revenue&lt;/span&gt;
&lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;product_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;month&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Result (sample rows):&lt;/strong&gt;&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
&lt;th&gt;month&lt;/th&gt;
&lt;th&gt;product_id&lt;/th&gt;
&lt;th&gt;revenue&lt;/th&gt;
&lt;th&gt;prev_month_revenue&lt;/th&gt;
&lt;th&gt;revenue_change&lt;/th&gt;
&lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
&lt;td&gt;2026-01-01&lt;/td&gt;
&lt;td&gt;101&lt;/td&gt;
&lt;td&gt;12000.00&lt;/td&gt;
&lt;td&gt;0.00&lt;/td&gt;
&lt;td&gt;12000.00&lt;/td&gt;
&lt;/tr&gt;
    &lt;tr&gt;
&lt;td&gt;2026-02-01&lt;/td&gt;
&lt;td&gt;101&lt;/td&gt;
&lt;td&gt;15400.00&lt;/td&gt;
&lt;td&gt;12000.00&lt;/td&gt;
&lt;td&gt;3400.00&lt;/td&gt;
&lt;/tr&gt;
    &lt;tr&gt;
&lt;td&gt;2026-03-01&lt;/td&gt;
&lt;td&gt;101&lt;/td&gt;
&lt;td&gt;13800.00&lt;/td&gt;
&lt;td&gt;15400.00&lt;/td&gt;
&lt;td&gt;-1600.00&lt;/td&gt;
&lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Notice the &lt;code&gt;default_value&lt;/code&gt; of &lt;code&gt;0&lt;/code&gt; prevents the first row from showing &lt;code&gt;NULL&lt;/code&gt;. That's a small touch that makes downstream reporting much cleaner.&lt;/p&gt;

&lt;h3&gt;
  
  
  Example 2: Detecting Customer Churn Risk with LEAD
&lt;/h3&gt;

&lt;p&gt;You want to flag customers who placed an order but then didn't order again within 90 days:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt;
  &lt;span class="n"&gt;customer_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;order_date&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;LEAD&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;order_date&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;OVER&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="k"&gt;PARTITION&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;customer_id&lt;/span&gt;
    &lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;order_date&lt;/span&gt;
  &lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;next_order_date&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="k"&gt;CASE&lt;/span&gt;
    &lt;span class="k"&gt;WHEN&lt;/span&gt; &lt;span class="n"&gt;LEAD&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;order_date&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;OVER&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
           &lt;span class="k"&gt;PARTITION&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;customer_id&lt;/span&gt; &lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;order_date&lt;/span&gt;
         &lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;IS&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;
      &lt;span class="k"&gt;OR&lt;/span&gt; &lt;span class="n"&gt;LEAD&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;order_date&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;OVER&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
           &lt;span class="k"&gt;PARTITION&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;customer_id&lt;/span&gt; &lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;order_date&lt;/span&gt;
         &lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;order_date&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;90&lt;/span&gt;
    &lt;span class="k"&gt;THEN&lt;/span&gt; &lt;span class="s1"&gt;'At Risk'&lt;/span&gt;
    &lt;span class="k"&gt;ELSE&lt;/span&gt; &lt;span class="s1"&gt;'Active'&lt;/span&gt;
  &lt;span class="k"&gt;END&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;churn_status&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;orders&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This is the kind of query that would previously require a self-join on &lt;code&gt;orders o1 JOIN orders o2 ON o1.customer_id = o2.customer_id AND o2.order_date &amp;gt; o1.order_date&lt;/code&gt; — messy, slow, and hard to read.&lt;/p&gt;




&lt;h2&gt;
  
  
  RANK vs DENSE_RANK vs ROW_NUMBER: Choosing the Right Ranking Function
&lt;/h2&gt;

&lt;p&gt;These three functions all assign row numbers, but they handle &lt;em&gt;ties&lt;/em&gt; very differently. Picking the wrong one is a silent bug that poisons your reports.&lt;/p&gt;

&lt;p&gt;Here's the core difference on a single dataset:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt;
  &lt;span class="n"&gt;employee_name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;department&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;salary&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;ROW_NUMBER&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;   &lt;span class="n"&gt;OVER&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;PARTITION&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;department&lt;/span&gt; &lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;salary&lt;/span&gt; &lt;span class="k"&gt;DESC&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;row_num&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;RANK&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;         &lt;span class="n"&gt;OVER&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;PARTITION&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;department&lt;/span&gt; &lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;salary&lt;/span&gt; &lt;span class="k"&gt;DESC&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;rnk&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;DENSE_RANK&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;   &lt;span class="n"&gt;OVER&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;PARTITION&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;department&lt;/span&gt; &lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;salary&lt;/span&gt; &lt;span class="k"&gt;DESC&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;dense_rnk&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;employees&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Result (Engineering department):&lt;/strong&gt;&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
&lt;th&gt;employee_name&lt;/th&gt;
&lt;th&gt;salary&lt;/th&gt;
&lt;th&gt;row_num&lt;/th&gt;
&lt;th&gt;rnk&lt;/th&gt;
&lt;th&gt;dense_rnk&lt;/th&gt;
&lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
&lt;td&gt;Alice&lt;/td&gt;
&lt;td&gt;120000&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;/tr&gt;
    &lt;tr&gt;
&lt;td&gt;Bob&lt;/td&gt;
&lt;td&gt;105000&lt;/td&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;/tr&gt;
    &lt;tr&gt;
&lt;td&gt;Carol&lt;/td&gt;
&lt;td&gt;105000&lt;/td&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;/tr&gt;
    &lt;tr&gt;
&lt;td&gt;Dan&lt;/td&gt;
&lt;td&gt;98000&lt;/td&gt;
&lt;td&gt;4&lt;/td&gt;
&lt;td&gt;4&lt;/td&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;ROW_NUMBER&lt;/code&gt; always gives unique numbers — ties are broken arbitrarily. Use this for pagination.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;RANK&lt;/code&gt; skips numbers after ties — Bob and Carol are both rank 2, so Dan is rank 4 (not 3). Use this for competition-style rankings ("top 3 scores").&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;DENSE_RANK&lt;/code&gt; never skips — Bob and Carol are both rank 2, and Dan is rank 3. Use this when gaps in ranking would be confusing or misleading.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Example 3: Top-Earning Employee Per Department
&lt;/h3&gt;

&lt;p&gt;A very common analytics requirement — "show me the top earner in each department":&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;WITH&lt;/span&gt; &lt;span class="n"&gt;ranked_employees&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="k"&gt;SELECT&lt;/span&gt;
    &lt;span class="n"&gt;employee_name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;department&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;salary&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;DENSE_RANK&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="n"&gt;OVER&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
      &lt;span class="k"&gt;PARTITION&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;department&lt;/span&gt;
      &lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;salary&lt;/span&gt; &lt;span class="k"&gt;DESC&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;salary_rank&lt;/span&gt;
  &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;employees&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;employee_name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;department&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;salary&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;ranked_employees&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;salary_rank&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Why &lt;code&gt;DENSE_RANK&lt;/code&gt; instead of &lt;code&gt;RANK&lt;/code&gt; here? Because if two people tie for top salary, you want &lt;em&gt;both&lt;/em&gt; to appear. &lt;code&gt;RANK = 1&lt;/code&gt; and &lt;code&gt;DENSE_RANK = 1&lt;/code&gt; both return tied rows, but with &lt;code&gt;DENSE_RANK&lt;/code&gt; you avoid the counterintuitive gap at rank 2.&lt;/p&gt;




&lt;h2&gt;
  
  
  NTILE: Slicing Data into Buckets
&lt;/h2&gt;

&lt;p&gt;&lt;code&gt;NTILE(n)&lt;/code&gt; divides ordered rows into &lt;code&gt;n&lt;/code&gt; roughly equal groups and assigns each row a bucket number (1 through n). This is the window function for percentile analysis, quartiles, and cohort segmentation.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Syntax:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="n"&gt;NTILE&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;OVER&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;PARTITION&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="p"&gt;...&lt;/span&gt; &lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="p"&gt;...)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Example 4: Segmenting Customers by Purchase Value
&lt;/h3&gt;

&lt;p&gt;Imagine you want to label customers as Bronze, Silver, Gold, or Platinum based on their total annual spend:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;WITH&lt;/span&gt; &lt;span class="n"&gt;customer_spend&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="k"&gt;SELECT&lt;/span&gt;
    &lt;span class="n"&gt;customer_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="k"&gt;SUM&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;order_total&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;annual_spend&lt;/span&gt;
  &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;orders&lt;/span&gt;
  &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="k"&gt;EXTRACT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;YEAR&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;order_date&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;2025&lt;/span&gt;
  &lt;span class="k"&gt;GROUP&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;customer_id&lt;/span&gt;
&lt;span class="p"&gt;),&lt;/span&gt;
&lt;span class="n"&gt;customer_tiers&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="k"&gt;SELECT&lt;/span&gt;
    &lt;span class="n"&gt;customer_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;annual_spend&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;NTILE&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;OVER&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;annual_spend&lt;/span&gt; &lt;span class="k"&gt;ASC&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;spend_quartile&lt;/span&gt;
  &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;customer_spend&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt;
  &lt;span class="n"&gt;customer_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;annual_spend&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="k"&gt;CASE&lt;/span&gt; &lt;span class="n"&gt;spend_quartile&lt;/span&gt;
    &lt;span class="k"&gt;WHEN&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="k"&gt;THEN&lt;/span&gt; &lt;span class="s1"&gt;'Bronze'&lt;/span&gt;
    &lt;span class="k"&gt;WHEN&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt; &lt;span class="k"&gt;THEN&lt;/span&gt; &lt;span class="s1"&gt;'Silver'&lt;/span&gt;
    &lt;span class="k"&gt;WHEN&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt; &lt;span class="k"&gt;THEN&lt;/span&gt; &lt;span class="s1"&gt;'Gold'&lt;/span&gt;
    &lt;span class="k"&gt;WHEN&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt; &lt;span class="k"&gt;THEN&lt;/span&gt; &lt;span class="s1"&gt;'Platinum'&lt;/span&gt;
  &lt;span class="k"&gt;END&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;tier&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;customer_tiers&lt;/span&gt;
&lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;annual_spend&lt;/span&gt; &lt;span class="k"&gt;DESC&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Result (sample):&lt;/strong&gt;&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
&lt;th&gt;customer_id&lt;/th&gt;
&lt;th&gt;annual_spend&lt;/th&gt;
&lt;th&gt;tier&lt;/th&gt;
&lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
&lt;td&gt;C-4821&lt;/td&gt;
&lt;td&gt;48320.00&lt;/td&gt;
&lt;td&gt;Platinum&lt;/td&gt;
&lt;/tr&gt;
    &lt;tr&gt;
&lt;td&gt;C-3019&lt;/td&gt;
&lt;td&gt;31005.00&lt;/td&gt;
&lt;td&gt;Gold&lt;/td&gt;
&lt;/tr&gt;
    &lt;tr&gt;
&lt;td&gt;C-7733&lt;/td&gt;
&lt;td&gt;12400.00&lt;/td&gt;
&lt;td&gt;Silver&lt;/td&gt;
&lt;/tr&gt;
    &lt;tr&gt;
&lt;td&gt;C-2210&lt;/td&gt;
&lt;td&gt;1890.00&lt;/td&gt;
&lt;td&gt;Bronze&lt;/td&gt;
&lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;One important caveat: if the total number of rows isn't evenly divisible by &lt;code&gt;n&lt;/code&gt;, the higher-numbered buckets get one fewer row. This is mathematically correct but can surprise you if you expect perfectly even groups.&lt;/p&gt;

&lt;h3&gt;
  
  
  Example 5: Identifying Performance Outliers with NTILE
&lt;/h3&gt;

&lt;p&gt;In a data pipeline, you want to flag the slowest 25% of queries for investigation:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt;
  &lt;span class="n"&gt;query_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;execution_ms&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;NTILE&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;OVER&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;execution_ms&lt;/span&gt; &lt;span class="k"&gt;DESC&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;slowness_quartile&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;query_logs&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;logged_at&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="n"&gt;NOW&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;INTERVAL&lt;/span&gt; &lt;span class="s1"&gt;'7 days'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Filter &lt;code&gt;WHERE slowness_quartile = 1&lt;/code&gt; and you have your slowest 25% — no hardcoded thresholds, no percentile math in application code.&lt;/p&gt;




&lt;h2&gt;
  
  
  Common Mistakes and Gotchas
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;1. Missing ORDER BY inside OVER()&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
&lt;code&gt;LEAD&lt;/code&gt;, &lt;code&gt;LAG&lt;/code&gt;, &lt;code&gt;RANK&lt;/code&gt;, and &lt;code&gt;DENSE_RANK&lt;/code&gt; are meaningless without an &lt;code&gt;ORDER BY&lt;/code&gt; inside &lt;code&gt;OVER()&lt;/code&gt;. The database will either error out or return unpredictable results. Always specify it explicitly.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. Confusing the outer ORDER BY with the window ORDER BY&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
The &lt;code&gt;ORDER BY&lt;/code&gt; inside &lt;code&gt;OVER(...)&lt;/code&gt; determines the &lt;em&gt;window ordering&lt;/em&gt; — which row is "previous" or "next." The &lt;code&gt;ORDER BY&lt;/code&gt; at the end of the query determines the &lt;em&gt;display order&lt;/em&gt;. These are independent. Your results might look sorted even without an outer &lt;code&gt;ORDER BY&lt;/code&gt;, but never rely on that.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;3. RANK vs DENSE_RANK for filtering&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
Using &lt;code&gt;WHERE rank = 2&lt;/code&gt; after &lt;code&gt;RANK()&lt;/code&gt; might return zero rows if rank 2 was skipped due to a tie. &lt;code&gt;DENSE_RANK&lt;/code&gt; avoids this. When filtering by rank in a CTE, think carefully about which behavior you want.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;4. NULLs in LAG/LEAD&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
The first row for each partition has no previous row, so &lt;code&gt;LAG&lt;/code&gt; returns &lt;code&gt;NULL&lt;/code&gt; by default. Always provide a sensible default value (the third argument) if downstream calculations can't handle &lt;code&gt;NULL&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;5. NTILE and uneven groups&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
&lt;code&gt;NTILE(3)&lt;/code&gt; on 10 rows gives groups of sizes 4, 3, 3 — not 3.33 each. Document this behavior if you're presenting "even segments" to stakeholders.&lt;/p&gt;




&lt;h2&gt;
  
  
  Key Takeaways
&lt;/h2&gt;

&lt;p&gt;All five of these functions share the same &lt;code&gt;OVER()&lt;/code&gt; structure you already know — what changes is &lt;em&gt;what they compute&lt;/em&gt;. In summary:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;LAG / LEAD&lt;/strong&gt; — access values from neighboring rows, ideal for time-series deltas and gap detection&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;RANK&lt;/strong&gt; — competition-style ranking with gaps after ties, best for "top N" filtering&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;DENSE_RANK&lt;/strong&gt; — ranking without gaps after ties, best when missing rank numbers would be confusing&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;ROW_NUMBER&lt;/strong&gt; — always unique, best for pagination and deduplication&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;NTILE(n)&lt;/strong&gt; — bucket rows into n groups, best for percentiles and cohort segmentation&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The real superpower here is combining them. Use &lt;code&gt;LAG&lt;/code&gt; to calculate MoM change, then &lt;code&gt;NTILE&lt;/code&gt; to segment products into growth quartiles, then &lt;code&gt;DENSE_RANK&lt;/code&gt; to rank within each quartile — all in a single query using CTEs.&lt;/p&gt;




&lt;h2&gt;
  
  
  What's Next?
&lt;/h2&gt;

&lt;p&gt;Try rewriting a query you currently use that relies on a self-join for row comparison. Chances are &lt;code&gt;LAG&lt;/code&gt; or &lt;code&gt;LEAD&lt;/code&gt; collapses it into something half the length. Share what you built in the comments — I'd love to see real-world use cases you've solved with these functions!&lt;/p&gt;

</description>
      <category>sql</category>
      <category>database</category>
      <category>postgres</category>
      <category>analytics</category>
    </item>
    <item>
      <title>Beyond Basic Indexes: Mastering Partial, Composite, and Covering Indexes in SQL</title>
      <dc:creator>Vivek Kumar</dc:creator>
      <pubDate>Mon, 11 May 2026 04:21:41 +0000</pubDate>
      <link>https://dev.to/vivekdraxlr/beyond-basic-indexes-mastering-partial-composite-and-covering-indexes-in-sql-2led</link>
      <guid>https://dev.to/vivekdraxlr/beyond-basic-indexes-mastering-partial-composite-and-covering-indexes-in-sql-2led</guid>
      <description>&lt;p&gt;Most developers know that adding an index speeds up queries. But slapping a basic index on every column you filter by is a recipe for bloated storage, slower writes, and no real performance gain. The real power comes from &lt;em&gt;choosing the right type of index for the job&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;In this article, we'll go beyond the basics and explore three advanced indexing strategies: &lt;strong&gt;partial indexes&lt;/strong&gt;, &lt;strong&gt;composite indexes&lt;/strong&gt;, and &lt;strong&gt;covering indexes&lt;/strong&gt;. Each solves a different class of performance problem, and knowing when to reach for each one separates good database work from great database work.&lt;/p&gt;

&lt;p&gt;We'll use PostgreSQL throughout, but these concepts apply (with some syntax variation) to MySQL, SQL Server, and other major databases.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Setup: An E-Commerce Orders Table
&lt;/h2&gt;

&lt;p&gt;Let's ground everything in a realistic scenario. Imagine an &lt;code&gt;orders&lt;/code&gt; table with tens of millions of rows:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;orders&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="n"&gt;id&lt;/span&gt;          &lt;span class="n"&gt;BIGSERIAL&lt;/span&gt; &lt;span class="k"&gt;PRIMARY&lt;/span&gt; &lt;span class="k"&gt;KEY&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;customer_id&lt;/span&gt; &lt;span class="nb"&gt;BIGINT&lt;/span&gt;        &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;status&lt;/span&gt;      &lt;span class="nb"&gt;VARCHAR&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;20&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;   &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="c1"&gt;-- 'pending', 'shipped', 'delivered', 'cancelled'&lt;/span&gt;
  &lt;span class="n"&gt;total&lt;/span&gt;       &lt;span class="nb"&gt;NUMERIC&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;created_at&lt;/span&gt;  &lt;span class="n"&gt;TIMESTAMPTZ&lt;/span&gt;   &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt; &lt;span class="k"&gt;DEFAULT&lt;/span&gt; &lt;span class="n"&gt;now&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
  &lt;span class="n"&gt;region&lt;/span&gt;      &lt;span class="nb"&gt;VARCHAR&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;50&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;   &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Over time, the table grows to 50 million rows. Most orders are &lt;code&gt;'delivered'&lt;/code&gt; or &lt;code&gt;'cancelled'&lt;/code&gt; — a tiny fraction (maybe 2%) are &lt;code&gt;'pending'&lt;/code&gt; or &lt;code&gt;'shipped'&lt;/code&gt;. This distribution is the key to understanding why generic indexes often fall short.&lt;/p&gt;




&lt;h2&gt;
  
  
  Partial Indexes: Index Only What You Query
&lt;/h2&gt;

&lt;p&gt;A &lt;strong&gt;partial index&lt;/strong&gt; is built over a &lt;em&gt;subset&lt;/em&gt; of rows, defined by a &lt;code&gt;WHERE&lt;/code&gt; condition. It's one of the most underused features in PostgreSQL.&lt;/p&gt;

&lt;h3&gt;
  
  
  The Problem
&lt;/h3&gt;

&lt;p&gt;Your support dashboard queries pending and shipped orders constantly:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;customer_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;total&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;orders&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;status&lt;/span&gt; &lt;span class="k"&gt;IN&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'pending'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'shipped'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;created_at&lt;/span&gt; &lt;span class="k"&gt;DESC&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;A plain index on &lt;code&gt;status&lt;/code&gt; would index all 50 million rows, even though 98% of them will never match this query. That's 50 million index entries serving 1 million actual lookups.&lt;/p&gt;

&lt;h3&gt;
  
  
  The Solution: A Partial Index
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;INDEX&lt;/span&gt; &lt;span class="n"&gt;idx_orders_active&lt;/span&gt;
  &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;orders&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;created_at&lt;/span&gt; &lt;span class="k"&gt;DESC&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;status&lt;/span&gt; &lt;span class="k"&gt;IN&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'pending'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'shipped'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now PostgreSQL only indexes the ~1 million "active" rows. The index is 98% smaller, fits more easily in memory, and stays faster even as the full table grows — because delivered/cancelled orders never bloat it.&lt;/p&gt;

&lt;h3&gt;
  
  
  Query Plan Comparison
&lt;/h3&gt;

&lt;p&gt;Before (sequential scan on 50M rows):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="n"&gt;Seq&lt;/span&gt; &lt;span class="n"&gt;Scan&lt;/span&gt; &lt;span class="k"&gt;on&lt;/span&gt; &lt;span class="n"&gt;orders&lt;/span&gt;  &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cost&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;00&lt;/span&gt;&lt;span class="p"&gt;..&lt;/span&gt;&lt;span class="mi"&gt;1245000&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;00&lt;/span&gt; &lt;span class="k"&gt;rows&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1100000&lt;/span&gt; &lt;span class="n"&gt;width&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;32&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="n"&gt;Filter&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;status&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;ANY&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'{pending,shipped}'&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="nb"&gt;text&lt;/span&gt;&lt;span class="p"&gt;[]))&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;After (index scan on 1M-entry partial index):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;Index&lt;/span&gt; &lt;span class="n"&gt;Scan&lt;/span&gt; &lt;span class="k"&gt;using&lt;/span&gt; &lt;span class="n"&gt;idx_orders_active&lt;/span&gt; &lt;span class="k"&gt;on&lt;/span&gt; &lt;span class="n"&gt;orders&lt;/span&gt;  &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cost&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;43&lt;/span&gt;&lt;span class="p"&gt;..&lt;/span&gt;&lt;span class="mi"&gt;9834&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;12&lt;/span&gt; &lt;span class="k"&gt;rows&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1100000&lt;/span&gt; &lt;span class="n"&gt;width&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;32&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  When to Use Partial Indexes
&lt;/h3&gt;

&lt;p&gt;Partial indexes shine when:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;A column has highly skewed distribution (most queries target a small minority of rows)&lt;/li&gt;
&lt;li&gt;You filter on "active" or "open" records and the closed ones pile up over time&lt;/li&gt;
&lt;li&gt;You want a unique constraint only for non-null values: &lt;code&gt;CREATE UNIQUE INDEX ON users (email) WHERE email IS NOT NULL;&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Composite Indexes: Column Order Is Everything
&lt;/h2&gt;

&lt;p&gt;A &lt;strong&gt;composite index&lt;/strong&gt; (also called a multi-column index) covers more than one column. Done right, it eliminates multiple sequential scans. Done wrong, it wastes space and never gets used.&lt;/p&gt;

&lt;h3&gt;
  
  
  The Left-to-Right Rule
&lt;/h3&gt;

&lt;p&gt;PostgreSQL can use a composite index only when the query filters on a &lt;em&gt;prefix&lt;/em&gt; of the indexed columns. Given:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;INDEX&lt;/span&gt; &lt;span class="n"&gt;idx_orders_customer_status_date&lt;/span&gt;
  &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;orders&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;customer_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;status&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;created_at&lt;/span&gt; &lt;span class="k"&gt;DESC&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;Query predicate&lt;/th&gt;
      &lt;th&gt;Uses index?&lt;/th&gt;
      &lt;th&gt;Why&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;code&gt;WHERE customer_id = 42&lt;/code&gt;&lt;/td&gt;
      &lt;td&gt;✅ Yes&lt;/td&gt;
      &lt;td&gt;Matches leading column&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;code&gt;WHERE customer_id = 42 AND status = 'pending'&lt;/code&gt;&lt;/td&gt;
      &lt;td&gt;✅ Yes&lt;/td&gt;
      &lt;td&gt;Matches first two columns&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;code&gt;WHERE customer_id = 42 AND status = 'pending' AND created_at &amp;gt; '2026-01-01'&lt;/code&gt;&lt;/td&gt;
      &lt;td&gt;✅ Yes (full use)&lt;/td&gt;
      &lt;td&gt;Matches all three columns&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;code&gt;WHERE status = 'pending'&lt;/code&gt;&lt;/td&gt;
      &lt;td&gt;❌ No&lt;/td&gt;
      &lt;td&gt;Skips leading column&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;code&gt;WHERE created_at &amp;gt; '2026-01-01'&lt;/code&gt;&lt;/td&gt;
      &lt;td&gt;❌ No&lt;/td&gt;
      &lt;td&gt;Skips first two columns&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  Column Ordering Strategy
&lt;/h3&gt;

&lt;p&gt;Follow this rule of thumb for column order in a composite index:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Equality columns first&lt;/strong&gt; — columns used with &lt;code&gt;=&lt;/code&gt; or &lt;code&gt;IN&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Range columns next&lt;/strong&gt; — columns used with &lt;code&gt;&amp;gt;&lt;/code&gt;, &lt;code&gt;&amp;lt;&lt;/code&gt;, &lt;code&gt;BETWEEN&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Sort columns last&lt;/strong&gt; — columns used in &lt;code&gt;ORDER BY&lt;/code&gt;
&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;So for this query:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;total&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;orders&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;customer_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;42&lt;/span&gt;
  &lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="n"&gt;status&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'pending'&lt;/span&gt;
  &lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="n"&gt;created_at&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;now&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;interval&lt;/span&gt; &lt;span class="s1"&gt;'30 days'&lt;/span&gt;
&lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;created_at&lt;/span&gt; &lt;span class="k"&gt;DESC&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The ideal index is exactly what we created: &lt;code&gt;(customer_id, status, created_at DESC)&lt;/code&gt;. PostgreSQL can narrow by &lt;code&gt;customer_id&lt;/code&gt; (equality), then narrow further by &lt;code&gt;status&lt;/code&gt; (equality), then scan only the recent range of &lt;code&gt;created_at&lt;/code&gt; — all within the index structure.&lt;/p&gt;

&lt;h3&gt;
  
  
  Avoid Index Proliferation
&lt;/h3&gt;

&lt;p&gt;One well-designed composite index often replaces two or three single-column indexes. If you have &lt;code&gt;idx_customer_id&lt;/code&gt; and &lt;code&gt;idx_status&lt;/code&gt; separately, queries needing both columns force a "bitmap AND" merge — two index lookups joined in memory. A single composite index eliminates that overhead.&lt;/p&gt;




&lt;h2&gt;
  
  
  Covering Indexes: Make the Index the Answer
&lt;/h2&gt;

&lt;p&gt;A &lt;strong&gt;covering index&lt;/strong&gt; is one that contains &lt;em&gt;every column a query needs&lt;/em&gt; — so PostgreSQL never has to touch the actual table rows. This is called an &lt;strong&gt;index-only scan&lt;/strong&gt;, and it's the fastest possible read path.&lt;/p&gt;

&lt;h3&gt;
  
  
  The Problem: Heap Fetches
&lt;/h3&gt;

&lt;p&gt;Normally, after an index lookup, PostgreSQL must fetch the actual row from the table heap to retrieve non-indexed columns. On large tables with many concurrent writes, those heap fetches cause random I/O that kills performance.&lt;/p&gt;

&lt;h3&gt;
  
  
  The INCLUDE Clause
&lt;/h3&gt;

&lt;p&gt;PostgreSQL 11+ introduced the &lt;code&gt;INCLUDE&lt;/code&gt; clause to add "payload" columns to an index without making them part of the sort key:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;INDEX&lt;/span&gt; &lt;span class="n"&gt;idx_orders_customer_covering&lt;/span&gt;
  &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;orders&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;customer_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;status&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="n"&gt;INCLUDE&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;total&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;created_at&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now this query runs entirely off the index — no heap access at all:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;total&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;created_at&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;orders&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;customer_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;42&lt;/span&gt;
  &lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="n"&gt;status&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'shipped'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  INCLUDE vs Adding to the Key
&lt;/h3&gt;

&lt;p&gt;Why not just add &lt;code&gt;total&lt;/code&gt; and &lt;code&gt;created_at&lt;/code&gt; to the index key itself? Because:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Non-key columns aren't sorted&lt;/strong&gt;, so they can't help filter or sort — they just live as payload in the leaf pages&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Smaller B-tree height&lt;/strong&gt; — fewer levels to traverse since non-key columns only appear in leaf nodes, not interior nodes&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Avoids over-indexing&lt;/strong&gt; — the key remains selective; &lt;code&gt;INCLUDE&lt;/code&gt; columns just come along for the ride&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  Seeing It in Action
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;EXPLAIN&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;ANALYZE&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;BUFFERS&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;total&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;created_at&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;orders&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;customer_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;42&lt;/span&gt;
  &lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="n"&gt;status&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'shipped'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Without &lt;code&gt;INCLUDE&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;Index&lt;/span&gt; &lt;span class="n"&gt;Scan&lt;/span&gt; &lt;span class="k"&gt;using&lt;/span&gt; &lt;span class="n"&gt;idx_orders_customer_status&lt;/span&gt; &lt;span class="k"&gt;on&lt;/span&gt; &lt;span class="n"&gt;orders&lt;/span&gt;
  &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cost&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;56&lt;/span&gt;&lt;span class="p"&gt;..&lt;/span&gt;&lt;span class="mi"&gt;892&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;34&lt;/span&gt; &lt;span class="k"&gt;rows&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;847&lt;/span&gt; &lt;span class="n"&gt;width&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;24&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;actual&lt;/span&gt; &lt;span class="nb"&gt;time&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;031&lt;/span&gt;&lt;span class="p"&gt;..&lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;821&lt;/span&gt; &lt;span class="k"&gt;rows&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;847&lt;/span&gt; &lt;span class="n"&gt;loops&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="n"&gt;Buffers&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;shared&lt;/span&gt; &lt;span class="n"&gt;hit&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;423&lt;/span&gt; &lt;span class="k"&gt;read&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;634&lt;/span&gt;   &lt;span class="err"&gt;←&lt;/span&gt; &lt;span class="mi"&gt;634&lt;/span&gt; &lt;span class="n"&gt;heap&lt;/span&gt; &lt;span class="n"&gt;blocks&lt;/span&gt; &lt;span class="k"&gt;read&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;With &lt;code&gt;INCLUDE (total, created_at)&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;Index&lt;/span&gt; &lt;span class="k"&gt;Only&lt;/span&gt; &lt;span class="n"&gt;Scan&lt;/span&gt; &lt;span class="k"&gt;using&lt;/span&gt; &lt;span class="n"&gt;idx_orders_customer_covering&lt;/span&gt; &lt;span class="k"&gt;on&lt;/span&gt; &lt;span class="n"&gt;orders&lt;/span&gt;
  &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cost&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;56&lt;/span&gt;&lt;span class="p"&gt;..&lt;/span&gt;&lt;span class="mi"&gt;112&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;18&lt;/span&gt; &lt;span class="k"&gt;rows&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;847&lt;/span&gt; &lt;span class="n"&gt;width&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;24&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;actual&lt;/span&gt; &lt;span class="nb"&gt;time&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;025&lt;/span&gt;&lt;span class="p"&gt;..&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;893&lt;/span&gt; &lt;span class="k"&gt;rows&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;847&lt;/span&gt; &lt;span class="n"&gt;loops&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="n"&gt;Buffers&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;shared&lt;/span&gt; &lt;span class="n"&gt;hit&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;52&lt;/span&gt; &lt;span class="k"&gt;read&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;      &lt;span class="err"&gt;←&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt; &lt;span class="n"&gt;heap&lt;/span&gt; &lt;span class="n"&gt;blocks&lt;/span&gt; &lt;span class="k"&gt;read&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That's a 5× reduction in execution time and zero heap I/O.&lt;/p&gt;




&lt;h2&gt;
  
  
  Combining All Three: The Power Move
&lt;/h2&gt;

&lt;p&gt;You can combine all three strategies. Here's a partial covering composite index:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;INDEX&lt;/span&gt; &lt;span class="n"&gt;idx_orders_active_by_region&lt;/span&gt;
  &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;orders&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;region&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;created_at&lt;/span&gt; &lt;span class="k"&gt;DESC&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="n"&gt;INCLUDE&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;customer_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;total&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;status&lt;/span&gt; &lt;span class="k"&gt;IN&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'pending'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'shipped'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This index:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Only indexes active orders (partial — keeps it small)&lt;/li&gt;
&lt;li&gt;Supports filtering by region and sorting by date (composite — ordered key)&lt;/li&gt;
&lt;li&gt;Returns &lt;code&gt;customer_id&lt;/code&gt; and &lt;code&gt;total&lt;/code&gt; without a heap fetch (covering — zero table I/O)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;A query like this will hit this index perfectly:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;customer_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;total&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;created_at&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;orders&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;region&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'EU-West'&lt;/span&gt;
  &lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="n"&gt;status&lt;/span&gt; &lt;span class="k"&gt;IN&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'pending'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'shipped'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;created_at&lt;/span&gt; &lt;span class="k"&gt;DESC&lt;/span&gt;
&lt;span class="k"&gt;LIMIT&lt;/span&gt; &lt;span class="mi"&gt;50&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Common Gotchas
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Over-indexing writes more than you read.&lt;/strong&gt; Every index is updated on every &lt;code&gt;INSERT&lt;/code&gt;, &lt;code&gt;UPDATE&lt;/code&gt;, and &lt;code&gt;DELETE&lt;/code&gt;. An extra 5 indexes on a high-write table can make inserts 2–3× slower. Audit with &lt;code&gt;pg_stat_user_indexes&lt;/code&gt; to find unused indexes:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;indexrelname&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;idx_scan&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;idx_tup_read&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;pg_stat_user_indexes&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;schemaname&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'public'&lt;/span&gt;
&lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;idx_scan&lt;/span&gt; &lt;span class="k"&gt;ASC&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Partial index predicates must match exactly.&lt;/strong&gt; PostgreSQL uses a partial index only when the query's &lt;code&gt;WHERE&lt;/code&gt; clause implies (or matches) the index predicate. &lt;code&gt;WHERE status = 'pending'&lt;/code&gt; won't use an index defined as &lt;code&gt;WHERE status IN ('pending', 'shipped')&lt;/code&gt; — you need &lt;code&gt;WHERE status IN ('pending', 'shipped')&lt;/code&gt; or the planner must be able to prove the subset relationship.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;code&gt;INCLUDE&lt;/code&gt; columns don't help WHERE or ORDER BY.&lt;/strong&gt; They are non-key payload only — PostgreSQL can't use them to narrow or sort. If you filter or sort by a column, it must be in the key part of the index, not in &lt;code&gt;INCLUDE&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Build indexes concurrently on live tables.&lt;/strong&gt; Creating an index on a large table locks out writes. Use &lt;code&gt;CREATE INDEX CONCURRENTLY&lt;/code&gt; to build the index without blocking:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;INDEX&lt;/span&gt; &lt;span class="n"&gt;CONCURRENTLY&lt;/span&gt; &lt;span class="n"&gt;idx_orders_active_by_region&lt;/span&gt;
  &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;orders&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;region&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;created_at&lt;/span&gt; &lt;span class="k"&gt;DESC&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="n"&gt;INCLUDE&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;customer_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;total&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;status&lt;/span&gt; &lt;span class="k"&gt;IN&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'pending'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'shipped'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Key Takeaways
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Partial indexes&lt;/strong&gt; target a narrow, frequently queried subset of rows. They stay small and fast even as the full table balloons in size.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Composite indexes&lt;/strong&gt; cover multi-column queries efficiently — but column order matters. Equality predicates first, range predicates second, sort columns last.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Covering indexes&lt;/strong&gt; (with &lt;code&gt;INCLUDE&lt;/code&gt;) allow index-only scans, eliminating costly random I/O to the table heap.&lt;/li&gt;
&lt;li&gt;Combining all three gives you a precision instrument: small, ordered, and self-contained.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The best index strategy always starts with understanding your query patterns. Profile with &lt;code&gt;EXPLAIN (ANALYZE, BUFFERS)&lt;/code&gt;, identify where heap reads or sequential scans dominate, then choose the index type that surgically removes that bottleneck.&lt;/p&gt;




&lt;p&gt;Have you tuned a slow query with one of these techniques? Drop a comment below — especially if you've hit a surprising edge case. And if you're new to indexing fundamentals, check out the earlier post in this series on &lt;a href="https://dev.to"&gt;basic SQL indexes&lt;/a&gt; before diving deeper.&lt;/p&gt;

</description>
      <category>sql</category>
      <category>database</category>
      <category>postgres</category>
      <category>performance</category>
    </item>
    <item>
      <title>SQL Views Explained: Write Once, Query Forever</title>
      <dc:creator>Vivek Kumar</dc:creator>
      <pubDate>Thu, 07 May 2026 03:01:49 +0000</pubDate>
      <link>https://dev.to/vivekdraxlr/sql-views-explained-write-once-query-forever-khc</link>
      <guid>https://dev.to/vivekdraxlr/sql-views-explained-write-once-query-forever-khc</guid>
      <description>&lt;p&gt;You've written the same 12-line JOIN query for the third time this week. It lives in your reporting script, your admin dashboard, and your data export job. Every time the business logic changes, you update it in three places — and inevitably miss one.&lt;/p&gt;

&lt;p&gt;SQL views were built to fix exactly this. A view is a named, stored query that you can treat like a table. Define your logic once, then reference it from anywhere. This article walks through everything you need to know to start using views effectively: what they are, how to create them, real-world scenarios where they shine, and the gotchas that trip people up.&lt;/p&gt;




&lt;h2&gt;
  
  
  What Is a SQL View?
&lt;/h2&gt;

&lt;p&gt;A view is a virtual table based on the result of a &lt;code&gt;SELECT&lt;/code&gt; statement. It doesn't store data itself — it stores the &lt;em&gt;query&lt;/em&gt;. Every time you &lt;code&gt;SELECT&lt;/code&gt; from a view, the database runs that underlying query and returns fresh results.&lt;/p&gt;

&lt;p&gt;Here's the simplest possible view:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;VIEW&lt;/span&gt; &lt;span class="n"&gt;active_customers&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;email&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;created_at&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;customers&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;status&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'active'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now instead of repeating that filter everywhere, you just do:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;active_customers&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The database sees through the view and runs the full underlying query — you get current data, every time.&lt;/p&gt;




&lt;h2&gt;
  
  
  Creating, Replacing, and Dropping Views
&lt;/h2&gt;

&lt;h3&gt;
  
  
  CREATE VIEW
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;VIEW&lt;/span&gt; &lt;span class="n"&gt;monthly_revenue&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt;
  &lt;span class="n"&gt;DATE_TRUNC&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'month'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;order_date&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="k"&gt;month&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="k"&gt;SUM&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;total_amount&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;               &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;revenue&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="k"&gt;COUNT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;                        &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;order_count&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;orders&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;status&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'completed'&lt;/span&gt;
&lt;span class="k"&gt;GROUP&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;DATE_TRUNC&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'month'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;order_date&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="k"&gt;month&lt;/span&gt; &lt;span class="k"&gt;DESC&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Once this exists, any analyst on your team can run:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;monthly_revenue&lt;/span&gt; &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="k"&gt;month&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="s1"&gt;'2026-01-01'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;No need to remember the aggregation logic — it's baked into the view.&lt;/p&gt;

&lt;h3&gt;
  
  
  CREATE OR REPLACE VIEW
&lt;/h3&gt;

&lt;p&gt;To update a view's definition without dropping and recreating it:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;OR&lt;/span&gt; &lt;span class="k"&gt;REPLACE&lt;/span&gt; &lt;span class="k"&gt;VIEW&lt;/span&gt; &lt;span class="n"&gt;monthly_revenue&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt;
  &lt;span class="n"&gt;DATE_TRUNC&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'month'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;order_date&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="k"&gt;month&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="k"&gt;SUM&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;total_amount&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;               &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;revenue&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="k"&gt;COUNT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;                        &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;order_count&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="k"&gt;AVG&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;total_amount&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;               &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;avg_order_value&lt;/span&gt;  &lt;span class="c1"&gt;-- new column added&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;orders&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;status&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'completed'&lt;/span&gt;
&lt;span class="k"&gt;GROUP&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;DATE_TRUNC&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'month'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;order_date&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="k"&gt;month&lt;/span&gt; &lt;span class="k"&gt;DESC&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  DROP VIEW
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;DROP&lt;/span&gt; &lt;span class="k"&gt;VIEW&lt;/span&gt; &lt;span class="n"&gt;IF&lt;/span&gt; &lt;span class="k"&gt;EXISTS&lt;/span&gt; &lt;span class="n"&gt;monthly_revenue&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Use &lt;code&gt;IF EXISTS&lt;/code&gt; to avoid errors if the view doesn't exist yet.&lt;/p&gt;




&lt;h2&gt;
  
  
  Real-World Use Cases
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1. Simplifying Complex Reports
&lt;/h3&gt;

&lt;p&gt;Imagine you have an e-commerce schema: &lt;code&gt;orders&lt;/code&gt;, &lt;code&gt;order_items&lt;/code&gt;, &lt;code&gt;products&lt;/code&gt;, &lt;code&gt;customers&lt;/code&gt;. A typical report query might join all four tables and apply several filters. Put that in a view:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;VIEW&lt;/span&gt; &lt;span class="n"&gt;order_summary&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt;
  &lt;span class="n"&gt;o&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;            &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;order_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="k"&gt;c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;          &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;customer_name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="k"&gt;c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;email&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;o&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;order_date&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;o&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;status&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="k"&gt;SUM&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;oi&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;quantity&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;oi&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;unit_price&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;total_amount&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="k"&gt;COUNT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;oi&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;                     &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;item_count&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;orders&lt;/span&gt; &lt;span class="n"&gt;o&lt;/span&gt;
&lt;span class="k"&gt;JOIN&lt;/span&gt; &lt;span class="n"&gt;customers&lt;/span&gt;  &lt;span class="k"&gt;c&lt;/span&gt;  &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="k"&gt;c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;o&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;customer_id&lt;/span&gt;
&lt;span class="k"&gt;JOIN&lt;/span&gt; &lt;span class="n"&gt;order_items&lt;/span&gt; &lt;span class="n"&gt;oi&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;oi&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;order_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;o&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;
&lt;span class="k"&gt;GROUP&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;o&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;email&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;o&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;order_date&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;o&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;status&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now your dashboard queries look clean:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- Top 10 customers by spending this year&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;customer_name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;email&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;SUM&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;total_amount&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;lifetime_value&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;order_summary&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;order_date&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="s1"&gt;'2026-01-01'&lt;/span&gt;
&lt;span class="k"&gt;GROUP&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;customer_name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;email&lt;/span&gt;
&lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;lifetime_value&lt;/span&gt; &lt;span class="k"&gt;DESC&lt;/span&gt;
&lt;span class="k"&gt;LIMIT&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  2. Hiding Sensitive Columns
&lt;/h3&gt;

&lt;p&gt;You want your analytics team to query the &lt;code&gt;employees&lt;/code&gt; table but not see salaries or social security numbers. Create a view that exposes only safe columns:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;VIEW&lt;/span&gt; &lt;span class="n"&gt;employees_public&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt;
  &lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;first_name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;last_name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;department&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;job_title&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;hire_date&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;employees&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="c1"&gt;-- salary, ssn, and bank_account columns are never exposed&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Grant access to the view, not the base table:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;GRANT&lt;/span&gt; &lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;employees_public&lt;/span&gt; &lt;span class="k"&gt;TO&lt;/span&gt; &lt;span class="n"&gt;analytics_role&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This is one of the most practical uses of views in production systems.&lt;/p&gt;

&lt;h3&gt;
  
  
  3. Maintaining Backward Compatibility
&lt;/h3&gt;

&lt;p&gt;Your application queries a table called &lt;code&gt;user_profiles&lt;/code&gt;. You need to refactor the schema — split it into &lt;code&gt;users&lt;/code&gt; and &lt;code&gt;profile_details&lt;/code&gt;. Create a view with the old name that reconstructs the original shape:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;VIEW&lt;/span&gt; &lt;span class="n"&gt;user_profiles&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt;
  &lt;span class="n"&gt;u&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;u&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;email&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;u&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;created_at&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;pd&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;bio&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;pd&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;avatar_url&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;pd&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="k"&gt;location&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;users&lt;/span&gt; &lt;span class="n"&gt;u&lt;/span&gt;
&lt;span class="k"&gt;JOIN&lt;/span&gt; &lt;span class="n"&gt;profile_details&lt;/span&gt; &lt;span class="n"&gt;pd&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;pd&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;user_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;u&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Old queries keep working. You can migrate downstream consumers at your own pace.&lt;/p&gt;

&lt;h3&gt;
  
  
  4. Pre-Filtering for Multi-Tenant Apps
&lt;/h3&gt;

&lt;p&gt;In a SaaS app, data is often segmented by &lt;code&gt;tenant_id&lt;/code&gt;. Views can bake in row-level filtering:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;VIEW&lt;/span&gt; &lt;span class="n"&gt;tenant_orders&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;orders&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;tenant_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;current_setting&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'app.current_tenant'&lt;/span&gt;&lt;span class="p"&gt;)::&lt;/span&gt;&lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;(PostgreSQL supports this pattern using session-level settings.)&lt;/p&gt;




&lt;h2&gt;
  
  
  Updatable Views
&lt;/h2&gt;

&lt;p&gt;In some situations you can run &lt;code&gt;INSERT&lt;/code&gt;, &lt;code&gt;UPDATE&lt;/code&gt;, and &lt;code&gt;DELETE&lt;/code&gt; directly on a view. This works when the view is simple enough that the database can unambiguously map the change back to a single base table row.&lt;/p&gt;

&lt;p&gt;A view is generally updatable if it:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Selects from a &lt;strong&gt;single base table&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Does &lt;strong&gt;not&lt;/strong&gt; use &lt;code&gt;DISTINCT&lt;/code&gt;, &lt;code&gt;GROUP BY&lt;/code&gt;, &lt;code&gt;HAVING&lt;/code&gt;, &lt;code&gt;UNION&lt;/code&gt;, or aggregate functions&lt;/li&gt;
&lt;li&gt;Does &lt;strong&gt;not&lt;/strong&gt; include computed columns&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Example of an updatable view:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;VIEW&lt;/span&gt; &lt;span class="n"&gt;active_products&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;price&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;stock_quantity&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;products&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;is_active&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;TRUE&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="c1"&gt;-- This works because it maps cleanly to one row in `products`&lt;/span&gt;
&lt;span class="k"&gt;UPDATE&lt;/span&gt; &lt;span class="n"&gt;active_products&lt;/span&gt; &lt;span class="k"&gt;SET&lt;/span&gt; &lt;span class="n"&gt;price&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;29&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;99&lt;/span&gt; &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;42&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  WITH CHECK OPTION
&lt;/h3&gt;

&lt;p&gt;Add &lt;code&gt;WITH CHECK OPTION&lt;/code&gt; to prevent updates that would push rows &lt;em&gt;out of&lt;/em&gt; the view's filter:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;VIEW&lt;/span&gt; &lt;span class="n"&gt;active_products&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;price&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;stock_quantity&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;products&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;is_active&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;TRUE&lt;/span&gt;
&lt;span class="k"&gt;WITH&lt;/span&gt; &lt;span class="k"&gt;CHECK&lt;/span&gt; &lt;span class="k"&gt;OPTION&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="c1"&gt;-- This would fail — it would make the row invisible to the view&lt;/span&gt;
&lt;span class="k"&gt;UPDATE&lt;/span&gt; &lt;span class="n"&gt;active_products&lt;/span&gt; &lt;span class="k"&gt;SET&lt;/span&gt; &lt;span class="n"&gt;is_active&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;FALSE&lt;/span&gt; &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;42&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="c1"&gt;-- ERROR: new row violates check option for view "active_products"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Views vs. CTEs: When to Use Which
&lt;/h2&gt;

&lt;p&gt;Both views and CTEs (covered in a &lt;a href="///dev.to%20articles/2026-04-10-sql-cte-mastering-common-table-expressions.md"&gt;previous article&lt;/a&gt;) let you name and reuse query logic. The key difference:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;Feature&lt;/th&gt;
      &lt;th&gt;View&lt;/th&gt;
      &lt;th&gt;CTE&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;Persistence&lt;/td&gt;
      &lt;td&gt;Stored in database permanently&lt;/td&gt;
      &lt;td&gt;Exists only during one query&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Reusability&lt;/td&gt;
      &lt;td&gt;Available to all queries, all users&lt;/td&gt;
      &lt;td&gt;Local to the single SQL statement&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Access control&lt;/td&gt;
      &lt;td&gt;Can be granted/revoked independently&lt;/td&gt;
      &lt;td&gt;Not applicable&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Best for&lt;/td&gt;
      &lt;td&gt;Shared, frequently used logic&lt;/td&gt;
      &lt;td&gt;Ad-hoc, one-off query breakdowns&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Recursive queries&lt;/td&gt;
      &lt;td&gt;No&lt;/td&gt;
      &lt;td&gt;Yes&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;Rule of thumb:&lt;/strong&gt; If you're writing the same subquery in multiple queries or multiple applications, make it a view. If you're just breaking up a complex single query for readability, use a CTE.&lt;/p&gt;




&lt;h2&gt;
  
  
  Common Mistakes and Gotchas
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1. Assuming Views Cache Results
&lt;/h3&gt;

&lt;p&gt;Views do &lt;strong&gt;not&lt;/strong&gt; store data. Every query against a view re-executes the underlying SQL. If that underlying query is slow, your view will be slow too. Views give you reusability, not performance — for that, look at &lt;strong&gt;materialized views&lt;/strong&gt; (a topic for another day).&lt;/p&gt;

&lt;h3&gt;
  
  
  2. Nesting Views Too Deeply
&lt;/h3&gt;

&lt;p&gt;It's tempting to build views on top of views on top of views. Resist this. Deep view nesting makes debugging painful and can confuse the query planner. If your view references another view which references another view, the optimizer has to unpack all of them before building an execution plan. Keep hierarchies shallow (two levels max).&lt;/p&gt;

&lt;h3&gt;
  
  
  3. Putting Too Much Logic Into One View
&lt;/h3&gt;

&lt;p&gt;Views encourage DRY code, but that doesn't mean one monster view should serve every use case. A view that joins 8 tables for every query will be slow for callers who only need 2 of those tables. Create targeted views for specific, well-understood use cases.&lt;/p&gt;

&lt;h3&gt;
  
  
  4. Forgetting That Schema Changes Can Break Views
&lt;/h3&gt;

&lt;p&gt;If you drop a column from a base table that a view depends on, the view breaks — often silently until someone queries it. Most databases won't warn you at &lt;code&gt;ALTER TABLE&lt;/code&gt; time. Build a habit of checking view dependencies before modifying base tables.&lt;/p&gt;

&lt;p&gt;In PostgreSQL, you can check what views depend on a table:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;viewname&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;definition&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;pg_views&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;definition&lt;/span&gt; &lt;span class="k"&gt;ILIKE&lt;/span&gt; &lt;span class="s1"&gt;'%your_table_name%'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  5. Trying to Pass Parameters Into Views
&lt;/h3&gt;

&lt;p&gt;Views are static — they can't accept parameters the way stored procedures can. If you find yourself creating dozens of near-identical views (&lt;code&gt;orders_2024&lt;/code&gt;, &lt;code&gt;orders_2025&lt;/code&gt;, &lt;code&gt;orders_jan&lt;/code&gt;, &lt;code&gt;orders_feb&lt;/code&gt;), that's a sign you want a parameterized query, a function, or a filtered query against a single general-purpose view instead.&lt;/p&gt;




&lt;h2&gt;
  
  
  Key Takeaways
&lt;/h2&gt;

&lt;p&gt;SQL views are a powerful tool for keeping your database logic clean, consistent, and secure. In summary:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;A view is a &lt;strong&gt;named, stored SELECT query&lt;/strong&gt; — not stored data&lt;/li&gt;
&lt;li&gt;Use them to &lt;strong&gt;centralize shared query logic&lt;/strong&gt; so changes propagate everywhere at once&lt;/li&gt;
&lt;li&gt;Use them to &lt;strong&gt;restrict access&lt;/strong&gt; by exposing only safe columns to certain roles&lt;/li&gt;
&lt;li&gt;Use &lt;code&gt;CREATE OR REPLACE&lt;/code&gt; to update view definitions without downtime&lt;/li&gt;
&lt;li&gt;Use &lt;code&gt;WITH CHECK OPTION&lt;/code&gt; to enforce constraints on updatable views&lt;/li&gt;
&lt;li&gt;Prefer &lt;strong&gt;CTEs for one-off breakdowns&lt;/strong&gt;, &lt;strong&gt;views for recurring shared logic&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Watch out for deep nesting, bloated view definitions, and silent schema-change failures&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Try It Yourself
&lt;/h2&gt;

&lt;p&gt;Take one query you write repeatedly — a filter, a join, a report — and convert it into a view today. Then ask: who else on your team could benefit from having access to it?&lt;/p&gt;

&lt;p&gt;Have you run into situations where views saved you from repetition, or caused unexpected performance headaches? Drop a comment below — I'd love to hear how you're using them in production.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;This is part of an ongoing series on SQL databases. Previous articles covered &lt;a href="https://dev.to/"&gt;CTEs&lt;/a&gt;, &lt;a href="https://dev.to/"&gt;window functions&lt;/a&gt;, &lt;a href="https://dev.to/"&gt;indexes&lt;/a&gt;, and more. Follow along for new posts every few days.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>sql</category>
      <category>database</category>
      <category>postgres</category>
      <category>beginners</category>
    </item>
    <item>
      <title>SQL String Functions: Clean, Transform, and Tame Messy Data Like a Pro</title>
      <dc:creator>Vivek Kumar</dc:creator>
      <pubDate>Wed, 06 May 2026 04:45:52 +0000</pubDate>
      <link>https://dev.to/vivekdraxlr/sql-string-functions-clean-transform-and-tame-messy-data-like-a-pro-52h6</link>
      <guid>https://dev.to/vivekdraxlr/sql-string-functions-clean-transform-and-tame-messy-data-like-a-pro-52h6</guid>
      <description>&lt;p&gt;Real-world data is messy. Users type their names in ALL CAPS, phone numbers come in five different formats, email addresses have trailing spaces, and city names are spelled three different ways. Before you can analyze or display that data meaningfully, you need to clean and reshape it — and SQL string functions are one of the most powerful tools in your toolkit for doing exactly that.&lt;/p&gt;

&lt;p&gt;In this guide, we'll go beyond the basics. You'll see how string functions work &lt;em&gt;together&lt;/em&gt; to solve real data problems — the kind that show up constantly in reporting pipelines, backend APIs, and data migrations.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Sample Database
&lt;/h2&gt;

&lt;p&gt;We'll use a realistic &lt;code&gt;customers&lt;/code&gt; table throughout this article:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;customers&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="n"&gt;customer_id&lt;/span&gt;   &lt;span class="nb"&gt;SERIAL&lt;/span&gt; &lt;span class="k"&gt;PRIMARY&lt;/span&gt; &lt;span class="k"&gt;KEY&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;first_name&lt;/span&gt;    &lt;span class="nb"&gt;VARCHAR&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
  &lt;span class="n"&gt;last_name&lt;/span&gt;     &lt;span class="nb"&gt;VARCHAR&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
  &lt;span class="n"&gt;email&lt;/span&gt;         &lt;span class="nb"&gt;VARCHAR&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;255&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
  &lt;span class="n"&gt;phone&lt;/span&gt;         &lt;span class="nb"&gt;VARCHAR&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;30&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
  &lt;span class="n"&gt;city&lt;/span&gt;          &lt;span class="nb"&gt;VARCHAR&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
  &lt;span class="n"&gt;country_code&lt;/span&gt;  &lt;span class="nb"&gt;CHAR&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;And here's the kind of data you might find in it after a messy data import:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;customer_id&lt;/th&gt;
      &lt;th&gt;first_name&lt;/th&gt;
      &lt;th&gt;last_name&lt;/th&gt;
      &lt;th&gt;email&lt;/th&gt;
      &lt;th&gt;phone&lt;/th&gt;
      &lt;th&gt;city&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;ALICE&lt;/td&gt;
&lt;td&gt;  Johnson  &lt;/td&gt;
&lt;td&gt;alice@example.com  &lt;/td&gt;
&lt;td&gt;(555) 123-4567&lt;/td&gt;
&lt;td&gt;new york&lt;/td&gt;
&lt;/tr&gt;
    &lt;tr&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;bob&lt;/td&gt;
&lt;td&gt;Smith&lt;/td&gt;
&lt;td&gt;BOB@EXAMPLE.COM&lt;/td&gt;
&lt;td&gt;555.987.6543&lt;/td&gt;
&lt;td&gt;Los Angeles &lt;/td&gt;
&lt;/tr&gt;
    &lt;tr&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;td&gt;María&lt;/td&gt;
&lt;td&gt;García&lt;/td&gt;
&lt;td&gt; maria@example.com&lt;/td&gt;
&lt;td&gt;5552345678&lt;/td&gt;
&lt;td&gt;CHICAGO&lt;/td&gt;
&lt;/tr&gt;
    &lt;tr&gt;
&lt;td&gt;4&lt;/td&gt;
&lt;td&gt;James&lt;/td&gt;
&lt;td&gt;NULL&lt;/td&gt;
&lt;td&gt;james@example.com&lt;/td&gt;
&lt;td&gt;+1-555-876-5432&lt;/td&gt;
&lt;td&gt;Houston&lt;/td&gt;
&lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Classic import chaos. Let's fix it.&lt;/p&gt;




&lt;h2&gt;
  
  
  TRIM: Killing the Invisible Enemies
&lt;/h2&gt;

&lt;p&gt;Leading and trailing whitespace is the cockroach of data problems — invisible, persistent, and it breaks things at the worst moments. &lt;code&gt;TRIM&lt;/code&gt; removes it.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- Remove spaces from both sides (default behavior)&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="k"&gt;TRIM&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;last_name&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;clean_last_name&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;customers&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;clean_last_name
---------------
Johnson
Smith
García
(null)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You can also target only one side:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt;
  &lt;span class="n"&gt;LTRIM&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;email&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;no_leading_space&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;RTRIM&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;email&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;no_trailing_space&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;customers&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Gotcha:&lt;/strong&gt; &lt;code&gt;TRIM&lt;/code&gt; only removes spaces by default. If your data has tabs (&lt;code&gt;\t&lt;/code&gt;) or newlines (&lt;code&gt;\n&lt;/code&gt;) — common in CSV imports — you need &lt;code&gt;REGEXP_REPLACE&lt;/code&gt; (covered below).&lt;/p&gt;




&lt;h2&gt;
  
  
  UPPER and LOWER: Standardizing Case
&lt;/h2&gt;

&lt;p&gt;Mixed-case data makes comparisons a nightmare. &lt;code&gt;alice@example.com&lt;/code&gt; and &lt;code&gt;ALICE@example.com&lt;/code&gt; will fail an equality check even though they're the same address.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- Normalize all emails to lowercase&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt;
  &lt;span class="n"&gt;customer_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="k"&gt;LOWER&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;TRIM&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;email&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;normalized_email&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;customers&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="n"&gt;customer_id&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="n"&gt;normalized_email&lt;/span&gt;
&lt;span class="c1"&gt;------------+-------------------&lt;/span&gt;
&lt;span class="mi"&gt;1&lt;/span&gt;           &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="n"&gt;alice&lt;/span&gt;&lt;span class="o"&gt;@&lt;/span&gt;&lt;span class="n"&gt;example&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;com&lt;/span&gt;
&lt;span class="mi"&gt;2&lt;/span&gt;           &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="n"&gt;bob&lt;/span&gt;&lt;span class="o"&gt;@&lt;/span&gt;&lt;span class="n"&gt;example&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;com&lt;/span&gt;
&lt;span class="mi"&gt;3&lt;/span&gt;           &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="n"&gt;maria&lt;/span&gt;&lt;span class="o"&gt;@&lt;/span&gt;&lt;span class="n"&gt;example&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;com&lt;/span&gt;
&lt;span class="mi"&gt;4&lt;/span&gt;           &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="n"&gt;james&lt;/span&gt;&lt;span class="o"&gt;@&lt;/span&gt;&lt;span class="n"&gt;example&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;com&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For display names, you often want title case. SQL doesn't have a built-in &lt;code&gt;INITCAP&lt;/code&gt; function in MySQL, but PostgreSQL does:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- PostgreSQL only&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt;
  &lt;span class="n"&gt;INITCAP&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;LOWER&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;first_name&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;display_first&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;INITCAP&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;LOWER&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;last_name&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;  &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;display_last&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;customers&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="n"&gt;display_first&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="n"&gt;display_last&lt;/span&gt;
&lt;span class="c1"&gt;--------------+-------------&lt;/span&gt;
&lt;span class="n"&gt;Alice&lt;/span&gt;         &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="n"&gt;Johnson&lt;/span&gt;
&lt;span class="n"&gt;Bob&lt;/span&gt;           &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="n"&gt;Smith&lt;/span&gt;
&lt;span class="n"&gt;Mar&lt;/span&gt;&lt;span class="err"&gt;í&lt;/span&gt;&lt;span class="n"&gt;a&lt;/span&gt;         &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="n"&gt;Garc&lt;/span&gt;&lt;span class="err"&gt;í&lt;/span&gt;&lt;span class="n"&gt;a&lt;/span&gt;
&lt;span class="n"&gt;James&lt;/span&gt;         &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;null&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In MySQL, you'd build your own with &lt;code&gt;CONCAT(UPPER(LEFT(name,1)), LOWER(SUBSTRING(name,2)))&lt;/code&gt;.&lt;/p&gt;




&lt;h2&gt;
  
  
  CONCAT: Building Strings From Parts
&lt;/h2&gt;

&lt;p&gt;&lt;code&gt;CONCAT&lt;/code&gt; joins strings together. Use it for display names, slugs, full addresses — anywhere you need to assemble a value from multiple columns.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- Build a full display name&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt;
  &lt;span class="n"&gt;CONCAT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;INITCAP&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;LOWER&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;TRIM&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;first_name&lt;/span&gt;&lt;span class="p"&gt;))),&lt;/span&gt;
    &lt;span class="s1"&gt;' '&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;INITCAP&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;LOWER&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;TRIM&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;last_name&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;
  &lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;full_name&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;customers&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;last_name&lt;/span&gt; &lt;span class="k"&gt;IS&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="n"&gt;full_name&lt;/span&gt;
&lt;span class="c1"&gt;-----------&lt;/span&gt;
&lt;span class="n"&gt;Alice&lt;/span&gt; &lt;span class="n"&gt;Johnson&lt;/span&gt;
&lt;span class="n"&gt;Bob&lt;/span&gt; &lt;span class="n"&gt;Smith&lt;/span&gt;
&lt;span class="n"&gt;Mar&lt;/span&gt;&lt;span class="err"&gt;í&lt;/span&gt;&lt;span class="n"&gt;a&lt;/span&gt; &lt;span class="n"&gt;Garc&lt;/span&gt;&lt;span class="err"&gt;í&lt;/span&gt;&lt;span class="n"&gt;a&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;||&lt;/code&gt; operator is the ANSI-standard alternative to &lt;code&gt;CONCAT()&lt;/code&gt; and works in PostgreSQL:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="k"&gt;TRIM&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;first_name&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="s1"&gt;' '&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="k"&gt;TRIM&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;last_name&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;full_name&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;customers&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Gotcha:&lt;/strong&gt; In MySQL, &lt;code&gt;CONCAT(value, NULL)&lt;/code&gt; returns &lt;code&gt;NULL&lt;/code&gt; — a single null column poisons the whole concatenation. Use &lt;code&gt;CONCAT_WS&lt;/code&gt; (concat with separator) or &lt;code&gt;COALESCE&lt;/code&gt; to guard against this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- Safe version that handles NULLs&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;CONCAT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;first_name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;' '&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;COALESCE&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;last_name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;''&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;full_name&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;customers&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  REPLACE: Fixing Formatted Junk
&lt;/h2&gt;

&lt;p&gt;Our phone numbers are a mess — parentheses, dots, dashes, country codes. &lt;code&gt;REPLACE&lt;/code&gt; lets you strip unwanted characters one at a time.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- Strip common phone number formatting characters&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt;
  &lt;span class="n"&gt;phone&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;original&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="k"&gt;REPLACE&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="k"&gt;REPLACE&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
      &lt;span class="k"&gt;REPLACE&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="k"&gt;REPLACE&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
          &lt;span class="k"&gt;REPLACE&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;phone&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'('&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;''&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="s1"&gt;')'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;''&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
      &lt;span class="s1"&gt;'-'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;''&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="s1"&gt;'.'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;''&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
  &lt;span class="s1"&gt;' '&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;''&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;digits_only&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;customers&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
&lt;th&gt;original&lt;/th&gt;
&lt;th&gt;digits_only&lt;/th&gt;
&lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
&lt;td&gt;(555) 123-4567&lt;/td&gt;
&lt;td&gt;5551234567&lt;/td&gt;
&lt;/tr&gt;
    &lt;tr&gt;
&lt;td&gt;555.987.6543&lt;/td&gt;
&lt;td&gt;5559876543&lt;/td&gt;
&lt;/tr&gt;
    &lt;tr&gt;
&lt;td&gt;5552345678&lt;/td&gt;
&lt;td&gt;5552345678&lt;/td&gt;
&lt;/tr&gt;
    &lt;tr&gt;
&lt;td&gt;+1-555-876-5432&lt;/td&gt;
&lt;td&gt;+15558765432&lt;/td&gt;
&lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;For the last row, you'd still need to handle the &lt;code&gt;+1&lt;/code&gt; country prefix — which is where &lt;code&gt;REGEXP_REPLACE&lt;/code&gt; shines.&lt;/p&gt;




&lt;h2&gt;
  
  
  REGEXP_REPLACE: The Power Tool
&lt;/h2&gt;

&lt;p&gt;When &lt;code&gt;REPLACE&lt;/code&gt; isn't enough, regular expressions let you target patterns instead of literal characters.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- PostgreSQL: strip ALL non-digit characters from phone numbers&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt;
  &lt;span class="n"&gt;phone&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;REGEXP_REPLACE&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;phone&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'[^0-9]'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;''&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'g'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;digits_only&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;customers&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="n"&gt;phone&lt;/span&gt;            &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="n"&gt;digits_only&lt;/span&gt;
&lt;span class="c1"&gt;-----------------+------------&lt;/span&gt;
&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;555&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="mi"&gt;123&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;4567&lt;/span&gt;   &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="mi"&gt;5551234567&lt;/span&gt;
&lt;span class="mi"&gt;555&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;987&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;6543&lt;/span&gt;     &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="mi"&gt;5559876543&lt;/span&gt;
&lt;span class="mi"&gt;5552345678&lt;/span&gt;       &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="mi"&gt;5552345678&lt;/span&gt;
&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;555&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;876&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;5432&lt;/span&gt;  &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="mi"&gt;15558765432&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In MySQL 8+, use &lt;code&gt;REGEXP_REPLACE(phone, '[^0-9]', '')&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Gotcha:&lt;/strong&gt; Regex is powerful but slow on large tables. Don't run &lt;code&gt;REGEXP_REPLACE&lt;/code&gt; on a column inside a &lt;code&gt;WHERE&lt;/code&gt; clause if you can help it — you'll kill index usage and force a full scan. Clean your data in a migration, not on every query.&lt;/p&gt;




&lt;h2&gt;
  
  
  SUBSTRING: Extracting Exactly What You Need
&lt;/h2&gt;

&lt;p&gt;&lt;code&gt;SUBSTRING&lt;/code&gt; (or &lt;code&gt;SUBSTR&lt;/code&gt;) extracts part of a string by position.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- Extract domain from email addresses&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt;
  &lt;span class="n"&gt;email&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="k"&gt;SUBSTRING&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;email&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="k"&gt;POSITION&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'@'&lt;/span&gt; &lt;span class="k"&gt;IN&lt;/span&gt; &lt;span class="n"&gt;email&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="k"&gt;domain&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;customers&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="n"&gt;email&lt;/span&gt;                   &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="k"&gt;domain&lt;/span&gt;
&lt;span class="c1"&gt;------------------------+------------&lt;/span&gt;
&lt;span class="n"&gt;alice&lt;/span&gt;&lt;span class="o"&gt;@&lt;/span&gt;&lt;span class="n"&gt;example&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;com&lt;/span&gt;       &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="n"&gt;example&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;com&lt;/span&gt;
&lt;span class="n"&gt;bob&lt;/span&gt;&lt;span class="o"&gt;@&lt;/span&gt;&lt;span class="n"&gt;example&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;com&lt;/span&gt;         &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="n"&gt;example&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;com&lt;/span&gt;
&lt;span class="n"&gt;maria&lt;/span&gt;&lt;span class="o"&gt;@&lt;/span&gt;&lt;span class="n"&gt;example&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;com&lt;/span&gt;       &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="n"&gt;example&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;com&lt;/span&gt;
&lt;span class="n"&gt;james&lt;/span&gt;&lt;span class="o"&gt;@&lt;/span&gt;&lt;span class="n"&gt;example&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;com&lt;/span&gt;       &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="n"&gt;example&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;com&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;A more practical example — extracting year from a date stored as text (not ideal, but it happens):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- Pull the year from 'YYYY-MM-DD' stored as text&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt;
  &lt;span class="n"&gt;order_ref&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="k"&gt;SUBSTRING&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;order_ref&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;order_year&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;orders&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;order_ref&lt;/span&gt; &lt;span class="k"&gt;LIKE&lt;/span&gt; &lt;span class="s1"&gt;'____-__-__%'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;SUBSTRING(string, start, length)&lt;/code&gt; is the MySQL/SQL-standard syntax; PostgreSQL also supports &lt;code&gt;SUBSTRING(string FROM start FOR length)&lt;/code&gt;.&lt;/p&gt;




&lt;h2&gt;
  
  
  LENGTH and POSITION: Measuring and Locating
&lt;/h2&gt;

&lt;p&gt;&lt;code&gt;LENGTH&lt;/code&gt; tells you how many characters a string has. &lt;code&gt;POSITION&lt;/code&gt; (or &lt;code&gt;CHARINDEX&lt;/code&gt; in SQL Server) tells you where a substring appears.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- Find customers with suspiciously short email addresses&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;customer_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;email&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;LENGTH&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;email&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;email_length&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;customers&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="k"&gt;LENGTH&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;TRIM&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;email&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="c1"&gt;-- Find where the '@' sits in each email&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;email&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;POSITION&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'@'&lt;/span&gt; &lt;span class="k"&gt;IN&lt;/span&gt; &lt;span class="n"&gt;email&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;at_position&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;customers&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Together, &lt;code&gt;POSITION&lt;/code&gt; and &lt;code&gt;SUBSTRING&lt;/code&gt; let you split strings on a delimiter without regex — useful for parsing values like &lt;code&gt;"Smith, John"&lt;/code&gt; or &lt;code&gt;"Houston, TX"&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- Split "city, state" into two columns&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt;
  &lt;span class="k"&gt;location&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="k"&gt;TRIM&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;SUBSTRING&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;location&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="k"&gt;FOR&lt;/span&gt; &lt;span class="k"&gt;POSITION&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;','&lt;/span&gt; &lt;span class="k"&gt;IN&lt;/span&gt; &lt;span class="k"&gt;location&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;city&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="k"&gt;TRIM&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;SUBSTRING&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;location&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="k"&gt;POSITION&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;','&lt;/span&gt; &lt;span class="k"&gt;IN&lt;/span&gt; &lt;span class="k"&gt;location&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;       &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="k"&gt;state&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;offices&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Putting It All Together: A Real Cleanup Query
&lt;/h2&gt;

&lt;p&gt;Here's a single query that normalizes the entire &lt;code&gt;customers&lt;/code&gt; table for a reporting export:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt;
  &lt;span class="n"&gt;customer_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;INITCAP&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;LOWER&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;TRIM&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;first_name&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;                         &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;first_name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;INITCAP&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;LOWER&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;TRIM&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;COALESCE&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;last_name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;''&lt;/span&gt;&lt;span class="p"&gt;))))&lt;/span&gt;            &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;last_name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="k"&gt;LOWER&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;TRIM&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;email&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;                                       &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;email&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;REGEXP_REPLACE&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;phone&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'[^0-9]'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;''&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'g'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;                 &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;phone_digits&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;INITCAP&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;LOWER&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;TRIM&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;city&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;                               &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;city&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="k"&gt;UPPER&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;country_code&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;                                      &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;country_code&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;customers&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The result:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;customer_id&lt;/th&gt;
      &lt;th&gt;first_name&lt;/th&gt;
      &lt;th&gt;last_name&lt;/th&gt;
      &lt;th&gt;email&lt;/th&gt;
      &lt;th&gt;phone_digits&lt;/th&gt;
      &lt;th&gt;city&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;Alice&lt;/td&gt;
&lt;td&gt;Johnson&lt;/td&gt;
&lt;td&gt;alice@example.com&lt;/td&gt;
&lt;td&gt;5551234567&lt;/td&gt;
&lt;td&gt;New York&lt;/td&gt;
&lt;/tr&gt;
    &lt;tr&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;Bob&lt;/td&gt;
&lt;td&gt;Smith&lt;/td&gt;
&lt;td&gt;bob@example.com&lt;/td&gt;
&lt;td&gt;5559876543&lt;/td&gt;
&lt;td&gt;Los Angeles&lt;/td&gt;
&lt;/tr&gt;
    &lt;tr&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;td&gt;María&lt;/td&gt;
&lt;td&gt;García&lt;/td&gt;
&lt;td&gt;maria@example.com&lt;/td&gt;
&lt;td&gt;5552345678&lt;/td&gt;
&lt;td&gt;Chicago&lt;/td&gt;
&lt;/tr&gt;
    &lt;tr&gt;
&lt;td&gt;4&lt;/td&gt;
&lt;td&gt;James&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;td&gt;james@example.com&lt;/td&gt;
&lt;td&gt;15558765432&lt;/td&gt;
&lt;td&gt;Houston&lt;/td&gt;
&lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h2&gt;
  
  
  Common Mistakes to Avoid
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Assuming NULLs are empty strings.&lt;/strong&gt; &lt;code&gt;LENGTH(NULL)&lt;/code&gt; returns &lt;code&gt;NULL&lt;/code&gt;, not 0. Always wrap nullable columns in &lt;code&gt;COALESCE(column, '')&lt;/code&gt; before passing them to string functions.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Chaining REPLACE instead of using REGEXP_REPLACE.&lt;/strong&gt; Five nested &lt;code&gt;REPLACE&lt;/code&gt; calls are harder to read and maintain than a single regex pattern. If you find yourself nesting more than two, switch to regex.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Running string transformations in WHERE clauses on large tables.&lt;/strong&gt; &lt;code&gt;WHERE LOWER(email) = 'alice@example.com'&lt;/code&gt; prevents the database from using an index on &lt;code&gt;email&lt;/code&gt;. Either store emails already lowercased, or create a functional index: &lt;code&gt;CREATE INDEX idx_email_lower ON customers (LOWER(email))&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Forgetting character vs. byte length.&lt;/strong&gt; In MySQL, &lt;code&gt;LENGTH()&lt;/code&gt; returns byte count — a multi-byte UTF-8 character (like &lt;code&gt;é&lt;/code&gt; or &lt;code&gt;中&lt;/code&gt;) counts as 2–3 bytes. Use &lt;code&gt;CHAR_LENGTH()&lt;/code&gt; when you care about the number of visible characters.&lt;/p&gt;




&lt;h2&gt;
  
  
  Key Takeaways
&lt;/h2&gt;

&lt;p&gt;SQL string functions aren't just cosmetic — they're essential data engineering tools. A few patterns worth remembering:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Clean at storage time, not query time&lt;/strong&gt; — normalize data when it enters your database, not on every &lt;code&gt;SELECT&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Layer functions&lt;/strong&gt; — &lt;code&gt;INITCAP(LOWER(TRIM(name)))&lt;/code&gt; is perfectly valid and readable SQL&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Use &lt;code&gt;COALESCE&lt;/code&gt; defensively&lt;/strong&gt; — any function receiving &lt;code&gt;NULL&lt;/code&gt; will return &lt;code&gt;NULL&lt;/code&gt;, so guard early&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Prefer &lt;code&gt;REGEXP_REPLACE&lt;/code&gt; for pattern-based cleanup&lt;/strong&gt; — it's more maintainable than chained &lt;code&gt;REPLACE&lt;/code&gt; calls&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Watch performance&lt;/strong&gt; — function calls in &lt;code&gt;WHERE&lt;/code&gt; clauses can kill index efficiency on large tables&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  What's Next?
&lt;/h2&gt;

&lt;p&gt;String functions pair beautifully with SQL's &lt;code&gt;UPDATE&lt;/code&gt; statement — you can run a one-time data cleanup across your whole table:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;UPDATE&lt;/span&gt; &lt;span class="n"&gt;customers&lt;/span&gt;
&lt;span class="k"&gt;SET&lt;/span&gt;
  &lt;span class="n"&gt;email&lt;/span&gt;      &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;LOWER&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;TRIM&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;email&lt;/span&gt;&lt;span class="p"&gt;)),&lt;/span&gt;
  &lt;span class="n"&gt;first_name&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;INITCAP&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;LOWER&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;TRIM&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;first_name&lt;/span&gt;&lt;span class="p"&gt;))),&lt;/span&gt;
  &lt;span class="n"&gt;city&lt;/span&gt;       &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;INITCAP&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;LOWER&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;TRIM&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;city&lt;/span&gt;&lt;span class="p"&gt;)));&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Have you run into a particularly gnarly data cleaning challenge in SQL? Drop it in the comments — I'd love to see how you solved it, or we can work through it together.&lt;/p&gt;

</description>
      <category>sql</category>
      <category>database</category>
      <category>postgres</category>
      <category>dataengineering</category>
    </item>
    <item>
      <title>SQL Date &amp; Time Functions: A Practical Guide for Real-World Queries</title>
      <dc:creator>Vivek Kumar</dc:creator>
      <pubDate>Mon, 04 May 2026 05:53:12 +0000</pubDate>
      <link>https://dev.to/vivekdraxlr/sql-date-time-functions-a-practical-guide-for-real-world-queries-5hef</link>
      <guid>https://dev.to/vivekdraxlr/sql-date-time-functions-a-practical-guide-for-real-world-queries-5hef</guid>
      <description>&lt;p&gt;Working with dates and times is one of those things that sounds simple until you're staring at a timestamp column wondering why your "last 30 days" query returns unexpected results. Date and time functions are essential for analytics, reporting, and any application that tracks events over time — yet the subtle differences between database engines trip up even experienced developers.&lt;/p&gt;

&lt;p&gt;In this guide, we'll cover the most useful SQL date/time functions across &lt;strong&gt;PostgreSQL&lt;/strong&gt; and &lt;strong&gt;MySQL&lt;/strong&gt; — two of the most widely used databases — with practical, real-world examples. By the end, you'll know how to confidently filter by time ranges, calculate durations, group data by period, and avoid the most common pitfalls.&lt;/p&gt;




&lt;h2&gt;
  
  
  Why Date/Time Functions Matter
&lt;/h2&gt;

&lt;p&gt;Think about how many queries involve time: "Show me orders placed in the last 7 days," "Calculate how long customers have been subscribed," "Group revenue by month." Nearly every meaningful business question has a temporal dimension.&lt;/p&gt;

&lt;p&gt;The challenge is that SQL databases don't all speak the same date dialect. PostgreSQL has &lt;code&gt;DATE_TRUNC&lt;/code&gt; and &lt;code&gt;AGE&lt;/code&gt;. MySQL has &lt;code&gt;DATE_FORMAT&lt;/code&gt; and &lt;code&gt;TIMESTAMPDIFF&lt;/code&gt;. Standard SQL gives you &lt;code&gt;EXTRACT&lt;/code&gt; and &lt;code&gt;CURRENT_TIMESTAMP&lt;/code&gt;. Knowing what's available — and what differs — saves you hours of debugging.&lt;/p&gt;

&lt;p&gt;We'll use a consistent example throughout: an e-commerce database with &lt;code&gt;orders&lt;/code&gt; and &lt;code&gt;customers&lt;/code&gt; tables.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- Our example schema&lt;/span&gt;
&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;customers&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="n"&gt;customer_id&lt;/span&gt;   &lt;span class="nb"&gt;INT&lt;/span&gt; &lt;span class="k"&gt;PRIMARY&lt;/span&gt; &lt;span class="k"&gt;KEY&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;name&lt;/span&gt;          &lt;span class="nb"&gt;VARCHAR&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
  &lt;span class="n"&gt;joined_at&lt;/span&gt;     &lt;span class="nb"&gt;TIMESTAMP&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;orders&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="n"&gt;order_id&lt;/span&gt;      &lt;span class="nb"&gt;INT&lt;/span&gt; &lt;span class="k"&gt;PRIMARY&lt;/span&gt; &lt;span class="k"&gt;KEY&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;customer_id&lt;/span&gt;   &lt;span class="nb"&gt;INT&lt;/span&gt; &lt;span class="k"&gt;REFERENCES&lt;/span&gt; &lt;span class="n"&gt;customers&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;customer_id&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
  &lt;span class="n"&gt;placed_at&lt;/span&gt;     &lt;span class="nb"&gt;TIMESTAMP&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;shipped_at&lt;/span&gt;    &lt;span class="nb"&gt;TIMESTAMP&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;total_amount&lt;/span&gt;  &lt;span class="nb"&gt;DECIMAL&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  1. Getting the Current Date and Time
&lt;/h2&gt;

&lt;p&gt;The most basic need: "what time is it right now?" Every major database supports these, but syntax varies slightly:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- PostgreSQL&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;NOW&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;                  &lt;span class="c1"&gt;-- current timestamp with time zone&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="k"&gt;CURRENT_TIMESTAMP&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;      &lt;span class="c1"&gt;-- same as NOW(), SQL standard&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="k"&gt;CURRENT_DATE&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;           &lt;span class="c1"&gt;-- date only (no time)&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="k"&gt;CURRENT_TIME&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;           &lt;span class="c1"&gt;-- time only&lt;/span&gt;

&lt;span class="c1"&gt;-- MySQL&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;NOW&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;                  &lt;span class="c1"&gt;-- current datetime&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;CURDATE&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;              &lt;span class="c1"&gt;-- date only&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;CURTIME&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;              &lt;span class="c1"&gt;-- time only&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;UTC_TIMESTAMP&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;        &lt;span class="c1"&gt;-- current UTC datetime (very useful!)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Real-world use:&lt;/strong&gt; Find all orders placed today.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- PostgreSQL&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;order_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;placed_at&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;total_amount&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;orders&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;placed_at&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="nb"&gt;date&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;CURRENT_DATE&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="c1"&gt;-- MySQL&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;order_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;placed_at&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;total_amount&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;orders&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="nb"&gt;DATE&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;placed_at&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;CURDATE&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; Casting a timestamp to &lt;code&gt;date&lt;/code&gt; (or using &lt;code&gt;DATE()&lt;/code&gt;) forces the database to discard the time component before comparing — otherwise &lt;code&gt;placed_at = CURRENT_DATE&lt;/code&gt; would fail because a timestamp like &lt;code&gt;2026-04-28 14:33:00&lt;/code&gt; is not equal to the date &lt;code&gt;2026-04-28&lt;/code&gt;.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  2. Extracting Parts of a Date
&lt;/h2&gt;

&lt;p&gt;Sometimes you need just the year, month, or hour from a timestamp. &lt;code&gt;EXTRACT&lt;/code&gt; is the SQL-standard way to do this, supported by both PostgreSQL and MySQL:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- Standard SQL — works in both PostgreSQL and MySQL&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt;
  &lt;span class="n"&gt;order_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;placed_at&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="k"&gt;EXTRACT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;YEAR&lt;/span&gt;  &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;placed_at&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;order_year&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="k"&gt;EXTRACT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;MONTH&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;placed_at&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;order_month&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="k"&gt;EXTRACT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;DOW&lt;/span&gt;   &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;placed_at&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;day_of_week&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="c1"&gt;-- 0=Sunday in PostgreSQL&lt;/span&gt;
  &lt;span class="k"&gt;EXTRACT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;HOUR&lt;/span&gt;  &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;placed_at&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;order_hour&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;orders&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Real-world use:&lt;/strong&gt; Group orders by month to see monthly revenue trends.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt;
  &lt;span class="k"&gt;EXTRACT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;YEAR&lt;/span&gt;  &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;placed_at&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;yr&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="k"&gt;EXTRACT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;MONTH&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;placed_at&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;mo&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="k"&gt;COUNT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;                      &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;order_count&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="k"&gt;SUM&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;total_amount&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;             &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;revenue&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;orders&lt;/span&gt;
&lt;span class="k"&gt;GROUP&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;yr&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;mo&lt;/span&gt;
&lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;yr&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;mo&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;MySQL shorthand functions:&lt;/strong&gt; MySQL also offers dedicated extraction functions that are a bit more readable:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- MySQL-specific shortcuts&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt;
  &lt;span class="nb"&gt;YEAR&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;placed_at&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;    &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;order_year&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="k"&gt;MONTH&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;placed_at&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;   &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;order_month&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="k"&gt;DAY&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;placed_at&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;     &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;order_day&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;HOUR&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;placed_at&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;    &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;order_hour&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;DAYNAME&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;placed_at&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;weekday_name&lt;/span&gt;   &lt;span class="c1"&gt;-- e.g. "Tuesday"&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;orders&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  3. Date Arithmetic with INTERVAL
&lt;/h2&gt;

&lt;p&gt;Adding or subtracting time from a date is a common need: "all orders from the last 30 days," "subscriptions expiring in the next 7 days," etc.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- PostgreSQL: use INTERVAL keyword&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;orders&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;placed_at&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="n"&gt;NOW&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;INTERVAL&lt;/span&gt; &lt;span class="s1"&gt;'30 days'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="c1"&gt;-- MySQL: use DATE_SUB() or INTERVAL notation&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;orders&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;placed_at&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="n"&gt;NOW&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;INTERVAL&lt;/span&gt; &lt;span class="mi"&gt;30&lt;/span&gt; &lt;span class="k"&gt;DAY&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="c1"&gt;-- Also valid in MySQL:&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;orders&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;placed_at&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="n"&gt;DATE_SUB&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;NOW&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="n"&gt;INTERVAL&lt;/span&gt; &lt;span class="mi"&gt;30&lt;/span&gt; &lt;span class="k"&gt;DAY&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;INTERVAL supports a wide range of units:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- PostgreSQL examples&lt;/span&gt;
&lt;span class="n"&gt;NOW&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;INTERVAL&lt;/span&gt; &lt;span class="s1"&gt;'1 year'&lt;/span&gt;
&lt;span class="n"&gt;NOW&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;INTERVAL&lt;/span&gt; &lt;span class="s1"&gt;'3 months'&lt;/span&gt;
&lt;span class="n"&gt;NOW&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;INTERVAL&lt;/span&gt; &lt;span class="s1"&gt;'2 hours 30 minutes'&lt;/span&gt;
&lt;span class="s1"&gt;'2026-01-01'&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="nb"&gt;date&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;INTERVAL&lt;/span&gt; &lt;span class="s1"&gt;'90 days'&lt;/span&gt;

&lt;span class="c1"&gt;-- MySQL examples&lt;/span&gt;
&lt;span class="n"&gt;DATE_ADD&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;NOW&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="n"&gt;INTERVAL&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="nb"&gt;YEAR&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;DATE_ADD&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;NOW&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="n"&gt;INTERVAL&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt; &lt;span class="k"&gt;MONTH&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;DATE_ADD&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;shipped_at&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;INTERVAL&lt;/span&gt; &lt;span class="mi"&gt;14&lt;/span&gt; &lt;span class="k"&gt;DAY&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;   &lt;span class="c1"&gt;-- estimated delivery date&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Real-world use:&lt;/strong&gt; Flag orders that shipped more than 14 days ago but never had a follow-up.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- PostgreSQL&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;order_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;customer_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;shipped_at&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;orders&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;shipped_at&lt;/span&gt; &lt;span class="k"&gt;IS&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;
  &lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="n"&gt;shipped_at&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="n"&gt;NOW&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;INTERVAL&lt;/span&gt; &lt;span class="s1"&gt;'14 days'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  4. Calculating Differences Between Dates
&lt;/h2&gt;

&lt;p&gt;How many days between two dates? How long has a customer been with us? These questions require date difference functions.&lt;/p&gt;

&lt;h3&gt;
  
  
  DATEDIFF (MySQL)
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- MySQL: DATEDIFF returns difference in days&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt;
  &lt;span class="n"&gt;order_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;placed_at&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;shipped_at&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;DATEDIFF&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;shipped_at&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;placed_at&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;days_to_ship&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;orders&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;shipped_at&lt;/span&gt; &lt;span class="k"&gt;IS&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  TIMESTAMPDIFF (MySQL)
&lt;/h3&gt;

&lt;p&gt;&lt;code&gt;TIMESTAMPDIFF&lt;/code&gt; is more flexible — you choose the unit:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- MySQL: specify the unit you want&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt;
  &lt;span class="k"&gt;c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;customer_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="k"&gt;c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="k"&gt;c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;joined_at&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;TIMESTAMPDIFF&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;YEAR&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="k"&gt;c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;joined_at&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;NOW&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;years_as_customer&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;TIMESTAMPDIFF&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;MONTH&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;joined_at&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;NOW&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;months_as_customer&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;TIMESTAMPDIFF&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;DAY&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;   &lt;span class="k"&gt;c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;joined_at&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;NOW&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;days_as_customer&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;customers&lt;/span&gt; &lt;span class="k"&gt;c&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  AGE (PostgreSQL)
&lt;/h3&gt;

&lt;p&gt;PostgreSQL has the elegant &lt;code&gt;AGE()&lt;/code&gt; function that returns a human-readable interval:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- PostgreSQL&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt;
  &lt;span class="n"&gt;customer_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;joined_at&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;AGE&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;NOW&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="n"&gt;joined_at&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;          &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;customer_age&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="c1"&gt;-- e.g. "2 years 3 mons 12 days"&lt;/span&gt;
  &lt;span class="k"&gt;EXTRACT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;DAY&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;AGE&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;NOW&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="n"&gt;joined_at&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;           &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;days_as_customer&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;customers&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Real-world use:&lt;/strong&gt; Find customers who joined more than 1 year ago but have placed no order in the last 90 days (candidates for a win-back campaign).&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- PostgreSQL&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt;
  &lt;span class="k"&gt;c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;customer_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="k"&gt;c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="k"&gt;c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;joined_at&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="k"&gt;MAX&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;o&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;placed_at&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;last_order_date&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;customers&lt;/span&gt; &lt;span class="k"&gt;c&lt;/span&gt;
&lt;span class="k"&gt;LEFT&lt;/span&gt; &lt;span class="k"&gt;JOIN&lt;/span&gt; &lt;span class="n"&gt;orders&lt;/span&gt; &lt;span class="n"&gt;o&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="k"&gt;c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;customer_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;o&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;customer_id&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="k"&gt;c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;joined_at&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="n"&gt;NOW&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;INTERVAL&lt;/span&gt; &lt;span class="s1"&gt;'1 year'&lt;/span&gt;
&lt;span class="k"&gt;GROUP&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="k"&gt;c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;customer_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;joined_at&lt;/span&gt;
&lt;span class="k"&gt;HAVING&lt;/span&gt; &lt;span class="k"&gt;MAX&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;o&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;placed_at&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="n"&gt;NOW&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;INTERVAL&lt;/span&gt; &lt;span class="s1"&gt;'90 days'&lt;/span&gt;
    &lt;span class="k"&gt;OR&lt;/span&gt; &lt;span class="k"&gt;MAX&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;o&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;placed_at&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;IS&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  5. Truncating Dates with DATE_TRUNC (PostgreSQL)
&lt;/h2&gt;

&lt;p&gt;&lt;code&gt;DATE_TRUNC&lt;/code&gt; is a PostgreSQL superpower for grouping data by time periods. It rounds a timestamp &lt;em&gt;down&lt;/em&gt; to the start of the given period:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;DATE_TRUNC&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'month'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;NOW&lt;/span&gt;&lt;span class="p"&gt;());&lt;/span&gt;
&lt;span class="c1"&gt;-- Result: 2026-04-01 00:00:00+00&lt;/span&gt;

&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;DATE_TRUNC&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'week'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="n"&gt;NOW&lt;/span&gt;&lt;span class="p"&gt;());&lt;/span&gt;
&lt;span class="c1"&gt;-- Result: 2026-04-27 00:00:00+00 (start of current week, Monday)&lt;/span&gt;

&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;DATE_TRUNC&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'year'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="n"&gt;NOW&lt;/span&gt;&lt;span class="p"&gt;());&lt;/span&gt;
&lt;span class="c1"&gt;-- Result: 2026-01-01 00:00:00+00&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Supported precision values include: &lt;code&gt;microseconds&lt;/code&gt;, &lt;code&gt;milliseconds&lt;/code&gt;, &lt;code&gt;second&lt;/code&gt;, &lt;code&gt;minute&lt;/code&gt;, &lt;code&gt;hour&lt;/code&gt;, &lt;code&gt;day&lt;/code&gt;, &lt;code&gt;week&lt;/code&gt;, &lt;code&gt;month&lt;/code&gt;, &lt;code&gt;quarter&lt;/code&gt;, &lt;code&gt;year&lt;/code&gt;, &lt;code&gt;decade&lt;/code&gt;, &lt;code&gt;century&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Real-world use:&lt;/strong&gt; Weekly revenue report with clean week boundaries:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- PostgreSQL&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt;
  &lt;span class="n"&gt;DATE_TRUNC&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'week'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;placed_at&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;week_start&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="k"&gt;COUNT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;                        &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;orders&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="k"&gt;SUM&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;total_amount&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;               &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;revenue&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="k"&gt;AVG&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;total_amount&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;               &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;avg_order_value&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;orders&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;placed_at&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="n"&gt;NOW&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;INTERVAL&lt;/span&gt; &lt;span class="s1"&gt;'12 weeks'&lt;/span&gt;
&lt;span class="k"&gt;GROUP&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;DATE_TRUNC&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'week'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;placed_at&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;week_start&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In MySQL, you can approximate this with &lt;code&gt;DATE_FORMAT&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- MySQL: group by week using DATE_FORMAT&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt;
  &lt;span class="n"&gt;DATE_FORMAT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;placed_at&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'%Y-%u'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;year_week&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="c1"&gt;-- e.g. "2026-17"&lt;/span&gt;
  &lt;span class="k"&gt;COUNT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;                          &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;orders&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="k"&gt;SUM&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;total_amount&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;                 &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;revenue&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;orders&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;placed_at&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="n"&gt;DATE_SUB&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;NOW&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="n"&gt;INTERVAL&lt;/span&gt; &lt;span class="mi"&gt;12&lt;/span&gt; &lt;span class="n"&gt;WEEK&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;GROUP&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;DATE_FORMAT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;placed_at&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'%Y-%u'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;year_week&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  6. Formatting Dates for Display
&lt;/h2&gt;

&lt;p&gt;Raw timestamps aren't user-friendly. Format them for reports and APIs:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- PostgreSQL: TO_CHAR&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;TO_CHAR&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;placed_at&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'Month DD, YYYY'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;   &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;display_date&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;  &lt;span class="c1"&gt;-- "April 28, 2026"&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;TO_CHAR&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;placed_at&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'YYYY-MM-DD HH24:MI'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;display_dt&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;   &lt;span class="c1"&gt;-- "2026-04-28 14:33"&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;TO_CHAR&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;placed_at&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'Day'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;               &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;weekday&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;       &lt;span class="c1"&gt;-- "Tuesday"&lt;/span&gt;

&lt;span class="c1"&gt;-- MySQL: DATE_FORMAT&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;DATE_FORMAT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;placed_at&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'%M %d, %Y'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;     &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;display_date&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;  &lt;span class="c1"&gt;-- "April 28, 2026"&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;DATE_FORMAT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;placed_at&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'%Y-%m-%d %H:%i'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;display_dt&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;   &lt;span class="c1"&gt;-- "2026-04-28 14:33"&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;DATE_FORMAT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;placed_at&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'%W'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;             &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;weekday&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;       &lt;span class="c1"&gt;-- "Tuesday"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Common Mistakes and Gotchas
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;1. Applying a function to an indexed column kills performance&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;If you have an index on &lt;code&gt;placed_at&lt;/code&gt;, this query can't use it:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- ❌ Wrapping the column in a function prevents index usage&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="nb"&gt;DATE&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;placed_at&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;CURDATE&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;          &lt;span class="c1"&gt;-- MySQL&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;DATE_TRUNC&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'day'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;placed_at&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;...&lt;/span&gt;   &lt;span class="c1"&gt;-- PostgreSQL&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Instead, use a range comparison that leaves the column untouched:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- ✅ Range filter — the index on placed_at can be used&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;placed_at&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="s1"&gt;'2026-04-28 00:00:00'&lt;/span&gt;
  &lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="n"&gt;placed_at&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt;  &lt;span class="s1"&gt;'2026-04-29 00:00:00'&lt;/span&gt;

&lt;span class="c1"&gt;-- Or dynamically:&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;placed_at&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="k"&gt;CURRENT_DATE&lt;/span&gt;
  &lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="n"&gt;placed_at&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt;  &lt;span class="k"&gt;CURRENT_DATE&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;INTERVAL&lt;/span&gt; &lt;span class="s1"&gt;'1 day'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;2. Ignoring time zones&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;code&gt;NOW()&lt;/code&gt; in PostgreSQL returns &lt;code&gt;timestamptz&lt;/code&gt; (with time zone). If your app stores timestamps in UTC but your &lt;code&gt;NOW()&lt;/code&gt; uses a local session timezone, you can get wildly wrong results. Always be explicit:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- PostgreSQL: force UTC&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;placed_at&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="n"&gt;NOW&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;AT&lt;/span&gt; &lt;span class="nb"&gt;TIME&lt;/span&gt; &lt;span class="k"&gt;ZONE&lt;/span&gt; &lt;span class="s1"&gt;'UTC'&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;INTERVAL&lt;/span&gt; &lt;span class="s1"&gt;'24 hours'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In MySQL, use &lt;code&gt;UTC_TIMESTAMP()&lt;/code&gt; instead of &lt;code&gt;NOW()&lt;/code&gt; when you want UTC regardless of session settings.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;3. NULL shipped_at breaks duration calculations&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;If &lt;code&gt;shipped_at&lt;/code&gt; is NULL (order not yet shipped), a DATEDIFF or subtraction involving it returns NULL — not an error, just silent NULL propagation. Always handle this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- MySQL&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt;
  &lt;span class="n"&gt;order_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;COALESCE&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;DATEDIFF&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;shipped_at&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;placed_at&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="s1"&gt;'Not yet shipped'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;fulfillment_days&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;orders&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;4. DATEDIFF argument order differs by database&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;MySQL: &lt;code&gt;DATEDIFF(end_date, start_date)&lt;/code&gt; → positive if end is after start&lt;/li&gt;
&lt;li&gt;SQL Server: &lt;code&gt;DATEDIFF(unit, start_date, end_date)&lt;/code&gt; → note the unit comes first!&lt;/li&gt;
&lt;li&gt;PostgreSQL: use subtraction directly: &lt;code&gt;end_date - start_date&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This trips up developers switching between databases constantly.&lt;/p&gt;




&lt;h2&gt;
  
  
  Summary: Key Takeaways
&lt;/h2&gt;

&lt;p&gt;Working with dates in SQL is a skill that pays dividends across virtually every query you'll write for analytics or reporting. The core ideas to remember:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Use &lt;code&gt;NOW()&lt;/code&gt; / &lt;code&gt;CURRENT_TIMESTAMP&lt;/code&gt; for the current moment; &lt;code&gt;CURRENT_DATE&lt;/code&gt; for today's date&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;EXTRACT()&lt;/code&gt; pulls out specific date parts (year, month, day of week) — it's standard SQL&lt;/li&gt;
&lt;li&gt;Use &lt;code&gt;INTERVAL&lt;/code&gt; for date arithmetic — keep the indexed column bare on the left side of your &lt;code&gt;WHERE&lt;/code&gt; clause&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;DATEDIFF&lt;/code&gt; / &lt;code&gt;TIMESTAMPDIFF&lt;/code&gt; (MySQL) and &lt;code&gt;AGE()&lt;/code&gt; / simple subtraction (PostgreSQL) handle duration calculations&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;DATE_TRUNC&lt;/code&gt; (PostgreSQL) is your best friend for grouping by week/month/quarter&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;DATE_FORMAT&lt;/code&gt; (MySQL) and &lt;code&gt;TO_CHAR&lt;/code&gt; (PostgreSQL) format dates for display&lt;/li&gt;
&lt;li&gt;Always consider NULLs and time zones — they're the source of most date-related bugs&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  What's Your Favorite Date Function?
&lt;/h2&gt;

&lt;p&gt;Date functions are deceptively powerful. Once you're comfortable with &lt;code&gt;DATE_TRUNC&lt;/code&gt; for time-series aggregation or &lt;code&gt;TIMESTAMPDIFF&lt;/code&gt; for cohort analysis, your SQL queries go from useful to genuinely insightful.&lt;/p&gt;

&lt;p&gt;Drop a comment below — what's the most useful date/time query pattern you use regularly? I'd love to see what real-world problems you're solving! And if you found this helpful, share it with a colleague who's wrestling with a "last 30 days" filter. 🗓️&lt;/p&gt;

</description>
      <category>sql</category>
      <category>database</category>
      <category>postgres</category>
      <category>mysql</category>
    </item>
  </channel>
</rss>
