<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Collins Njeru</title>
    <description>The latest articles on DEV Community by Collins Njeru (@cnew_aerospace_85c7b7d3cb).</description>
    <link>https://dev.to/cnew_aerospace_85c7b7d3cb</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3827043%2F2c4c96fc-a5b3-4c95-9122-12dfd0fdf4bd.png</url>
      <title>DEV Community: Collins Njeru</title>
      <link>https://dev.to/cnew_aerospace_85c7b7d3cb</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/cnew_aerospace_85c7b7d3cb"/>
    <language>en</language>
    <item>
      <title>From Chaos to Clarity: How SQL Transforms Raw Data into Powerful Stories</title>
      <dc:creator>Collins Njeru</dc:creator>
      <pubDate>Mon, 27 Apr 2026 05:33:32 +0000</pubDate>
      <link>https://dev.to/cnew_aerospace_85c7b7d3cb/from-chaos-to-clarity-how-sql-transforms-raw-data-into-powerful-stories-3gmg</link>
      <guid>https://dev.to/cnew_aerospace_85c7b7d3cb/from-chaos-to-clarity-how-sql-transforms-raw-data-into-powerful-stories-3gmg</guid>
      <description>&lt;p&gt;&lt;strong&gt;By Collins Njeru&lt;/strong&gt;&lt;/p&gt;




&lt;p&gt;I still remember the first time I ran a query that actually meant something. Not a tutorial exercise with fake names and random numbers , but a real question, against real data, that came back with a real answer. That moment is hard to describe. It feels less like writing code and more like having a conversation with your database.&lt;/p&gt;

&lt;p&gt;That is what SQL really is. A conversation. And like any conversation, the quality of what you get back depends entirely on how well you frame what you are asking.&lt;/p&gt;

&lt;p&gt;This article is about learning to ask better questions. We will use a retail sales database - customers, products, sales records, inventory  and work through the full range of SQL from the basics all the way to the kind of queries that make a hiring manager lean forward in their chair.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Database We Are Working With
&lt;/h2&gt;

&lt;p&gt;Before writing a single query, you need to understand your data. That sounds obvious, but you would be surprised how many people skip this step and wonder why their results look off.&lt;/p&gt;

&lt;p&gt;Our database has four tables:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;customers&lt;/strong&gt; - 50 records. Each one has a name, email, phone number, the date they registered, and a membership tier: Bronze, Silver, or Gold.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;products&lt;/strong&gt; - 15 items spread across Electronics, Appliances, and Accessories. Prices range from a $25 wireless mouse all the way to a $1,500 smart TV.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;sales&lt;/strong&gt; - 15 transactions. Each one links a customer to a product, records how many units were sold, the date, and the total amount.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;inventory&lt;/strong&gt; -  stock levels for each product.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Small dataset. Big concepts. The patterns you learn here work exactly the same way when you are dealing with millions of rows.&lt;/p&gt;




&lt;h2&gt;
  
  
  Start Simple, Stay Curious
&lt;/h2&gt;

&lt;p&gt;Every SQL query starts the same way , you ask for something, and the database gives it back to you.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;customers&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;product_name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;price&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;products&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;price&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;500&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That second query is where things start getting interesting. You are not just pulling data , you are filtering it. You are asking a specific question: &lt;em&gt;which products cost more than $500?&lt;/em&gt; That shift in thinking, from "give me everything" to "give me exactly this," is the foundation of good data analysis.&lt;/p&gt;

&lt;p&gt;Aggregates take it a step further. Instead of looking at individual rows, you start summarising them:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="k"&gt;COUNT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;total_products&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;products&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="k"&gt;AVG&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;price&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;average_price&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;products&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="k"&gt;SUM&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;total_amount&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;total_revenue&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;sales&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;One thing that trips people up early on is the difference between &lt;code&gt;WHERE&lt;/code&gt; and &lt;code&gt;HAVING&lt;/code&gt;. They look similar and both filter data, but they work at completely different stages. &lt;code&gt;WHERE&lt;/code&gt; runs before any grouping happens , it narrows down the rows going in. &lt;code&gt;HAVING&lt;/code&gt; runs after grouping , it filters the results that came out. Mix them up and you either get an error or, worse, results that look right but are actually wrong.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;product_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;SUM&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;quantity_sold&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;total_qty&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;sales&lt;/span&gt;
&lt;span class="k"&gt;GROUP&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;product_id&lt;/span&gt;
&lt;span class="k"&gt;HAVING&lt;/span&gt; &lt;span class="k"&gt;SUM&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;quantity_sold&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  JOINs — Where the Real Magic Happens
&lt;/h2&gt;

&lt;p&gt;Here is the thing about relational databases: the interesting answers almost never live in a single table. You need to pull things together. That is what JOINs do.&lt;/p&gt;

&lt;p&gt;The most common one is the INNER JOIN , it only returns rows where there is a match on both sides:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; 
    &lt;span class="k"&gt;c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;first_name&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="s1"&gt;' '&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="k"&gt;c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;last_name&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;full_name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;product_name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;quantity_sold&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;customers&lt;/span&gt; &lt;span class="k"&gt;c&lt;/span&gt;
&lt;span class="k"&gt;JOIN&lt;/span&gt; &lt;span class="n"&gt;sales&lt;/span&gt; &lt;span class="n"&gt;s&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="k"&gt;c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;customer_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;customer_id&lt;/span&gt;
&lt;span class="k"&gt;JOIN&lt;/span&gt; &lt;span class="n"&gt;products&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;product_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;product_id&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That one query answers: who bought what, and how much of it? Three tables, one clean result.&lt;/p&gt;

&lt;p&gt;LEFT JOINs are for when you need to keep rows even if there is no match on the other side. The classic use case is finding gaps , things that should have happened but did not:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- Products that have never been sold&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;products&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;
&lt;span class="k"&gt;LEFT&lt;/span&gt; &lt;span class="k"&gt;JOIN&lt;/span&gt; &lt;span class="n"&gt;sales&lt;/span&gt; &lt;span class="n"&gt;s&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;product_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;product_id&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;product_id&lt;/span&gt; &lt;span class="k"&gt;IS&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This is called an anti-join. You are not looking for what matches,you are looking for what does not. It is one of the most useful patterns in SQL and one of the least talked about in beginner tutorials.&lt;/p&gt;

&lt;p&gt;There is also the self-join, which sounds strange until you need it. Joining a table to itself lets you compare rows within the same dataset , like finding all pairs of customers who share the same membership tier:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; 
    &lt;span class="n"&gt;c1&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;first_name&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="s1"&gt;' '&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="n"&gt;c1&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;last_name&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;customer1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;c2&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;first_name&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="s1"&gt;' '&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="n"&gt;c2&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;last_name&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;customer2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;c1&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;membership_status&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;customers&lt;/span&gt; &lt;span class="n"&gt;c1&lt;/span&gt;
&lt;span class="k"&gt;JOIN&lt;/span&gt; &lt;span class="n"&gt;customers&lt;/span&gt; &lt;span class="n"&gt;c2&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;c1&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;membership_status&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;c2&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;membership_status&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;c1&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;customer_id&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="n"&gt;c2&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;customer_id&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That &lt;code&gt;c1.customer_id &amp;lt; c2.customer_id&lt;/code&gt; at the end is easy to miss but important — without it, you get every customer paired with themselves, and every pair listed twice.&lt;/p&gt;




&lt;h2&gt;
  
  
  Subqueries — Answering Questions Inside Questions
&lt;/h2&gt;

&lt;p&gt;Some questions cannot be answered in one shot. You need to calculate something first, then use that result to filter or compare. That is where subqueries come in.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- Products priced above their own category average&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;product_name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;category&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;price&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;products&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;price&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="k"&gt;AVG&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;p2&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;price&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;products&lt;/span&gt; &lt;span class="n"&gt;p2&lt;/span&gt;
    &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;p2&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;category&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;category&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The inner query runs first and figures out the average price per category. The outer query then compares each product against that benchmark. This is called a correlated subquery because the inner part depends on the outer part , specifically on &lt;code&gt;p.category&lt;/code&gt;. It is powerful, though on very large tables it can be slow. That is a trade-off worth knowing.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;NOT EXISTS&lt;/code&gt; is one of the cleanest constructs in SQL for its clarity:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- Customers who never made a purchase&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;customers&lt;/span&gt; &lt;span class="k"&gt;c&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;EXISTS&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;sales&lt;/span&gt; &lt;span class="n"&gt;s&lt;/span&gt;
    &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;customer_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;customer_id&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Notice the &lt;code&gt;SELECT 1&lt;/code&gt; inside , it does not matter what you select there. You are not asking for data. You are just checking: does a matching row exist? If it does not, the customer makes the cut. Clean, readable, efficient.&lt;/p&gt;




&lt;h2&gt;
  
  
  CTEs — Writing SQL You Can Actually Read Later
&lt;/h2&gt;

&lt;p&gt;Nested subqueries work, but they get ugly fast. Stack a few of them and your query starts looking like a puzzle that even you cannot solve two weeks later.&lt;/p&gt;

&lt;p&gt;CTEs — Common Table Expressions  fix that. You define named, temporary result sets using the &lt;code&gt;WITH&lt;/code&gt; keyword and refer to them cleanly in the main query. The logic does not change. The readability does.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Without CTE:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;customer_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;SUM&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;total_amount&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;total_spent&lt;/span&gt;
    &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;sales&lt;/span&gt;
    &lt;span class="k"&gt;GROUP&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;customer_id&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;sub&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;total_spent&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="k"&gt;AVG&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;total_spent&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="k"&gt;SUM&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;total_amount&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;total_spent&lt;/span&gt;
        &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;sales&lt;/span&gt; &lt;span class="k"&gt;GROUP&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;customer_id&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;inner_sub&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;With CTE:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;WITH&lt;/span&gt; &lt;span class="n"&gt;customer_spending&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;customer_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;SUM&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;total_amount&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;total_spent&lt;/span&gt;
    &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;sales&lt;/span&gt;
    &lt;span class="k"&gt;GROUP&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;customer_id&lt;/span&gt;
&lt;span class="p"&gt;),&lt;/span&gt;
&lt;span class="n"&gt;avg_spending&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="k"&gt;AVG&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;total_spent&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;avg_spent&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;customer_spending&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;customer_spending&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;total_spent&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;avg_spent&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;avg_spending&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Same result. The CTE version reads almost like plain English ,here is the spending per customer, here is the average, now show me everyone above it. That is the kind of code that survives handovers and 3am debugging sessions.&lt;/p&gt;




&lt;h2&gt;
  
  
  Window Functions — The Thing That Changes How You Think About SQL
&lt;/h2&gt;

&lt;p&gt;If there is one part of SQL that genuinely changes the way you approach data problems, it is window functions. Once you understand them, a whole category of questions that used to feel complicated becomes straightforward.&lt;/p&gt;

&lt;p&gt;The key difference from &lt;code&gt;GROUP BY&lt;/code&gt;: window functions compute across rows without collapsing them. You still see every individual row , you just get extra computed columns alongside them.&lt;/p&gt;

&lt;p&gt;Ranking is the simplest example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; 
    &lt;span class="n"&gt;customer_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="k"&gt;SUM&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;total_amount&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;total_spent&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;RANK&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="n"&gt;OVER&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="k"&gt;SUM&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;total_amount&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;DESC&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;rank&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;sales&lt;/span&gt;
&lt;span class="k"&gt;GROUP&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;customer_id&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Every customer keeps their own row. You just get a rank column added.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;NTILE&lt;/code&gt; is great for segmentation:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; 
    &lt;span class="n"&gt;customer_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="k"&gt;SUM&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;total_amount&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;total_spent&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;NTILE&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;OVER&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="k"&gt;SUM&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;total_amount&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;DESC&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;quartile&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;sales&lt;/span&gt;
&lt;span class="k"&gt;GROUP&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;customer_id&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Split your customers into four equal groups by spending. Bucket 1 is your top 25%. That is a customer segmentation model in six lines of SQL.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;LAG&lt;/code&gt; and &lt;code&gt;LEAD&lt;/code&gt; let you look at the row before or after the current one , incredibly useful for trend analysis:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; 
    &lt;span class="n"&gt;sale_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;sale_date&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;total_amount&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;LAG&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;total_amount&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;OVER&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;sale_date&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;previous_sale&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;LEAD&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;total_amount&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;OVER&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;sale_date&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;next_sale&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;sales&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;And running totals — a staple of any financial report:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; 
    &lt;span class="n"&gt;sale_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;sale_date&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;total_amount&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="k"&gt;SUM&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;total_amount&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;OVER&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;sale_date&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;running_total&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;sales&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;PARTITION BY&lt;/code&gt; is where things get really interesting. It lets you run window functions independently within groups , like ranking products inside each category separately:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; 
    &lt;span class="n"&gt;product_name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;category&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;price&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;RANK&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="n"&gt;OVER&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;PARTITION&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;category&lt;/span&gt; &lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;price&lt;/span&gt; &lt;span class="k"&gt;DESC&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;rank_in_category&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;products&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Each category gets its own ranking that resets to 1. Electronics and Appliances are ranked independently. People often try to solve this kind of thing with complicated joins or multiple queries, when a single window function handles it cleanly.&lt;/p&gt;




&lt;h2&gt;
  
  
  CASE WHEN — Making Your Data Tell a Story
&lt;/h2&gt;

&lt;p&gt;Raw numbers are fine for machines. People need context. &lt;code&gt;CASE WHEN&lt;/code&gt; is how you translate numbers into meaning:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; 
    &lt;span class="n"&gt;product_name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;price&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="k"&gt;CASE&lt;/span&gt;
        &lt;span class="k"&gt;WHEN&lt;/span&gt; &lt;span class="n"&gt;price&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;1000&lt;/span&gt; &lt;span class="k"&gt;THEN&lt;/span&gt; &lt;span class="s1"&gt;'Premium'&lt;/span&gt;
        &lt;span class="k"&gt;WHEN&lt;/span&gt; &lt;span class="n"&gt;price&lt;/span&gt; &lt;span class="k"&gt;BETWEEN&lt;/span&gt; &lt;span class="mi"&gt;500&lt;/span&gt; &lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="mi"&gt;1000&lt;/span&gt; &lt;span class="k"&gt;THEN&lt;/span&gt; &lt;span class="s1"&gt;'Standard'&lt;/span&gt;
        &lt;span class="k"&gt;ELSE&lt;/span&gt; &lt;span class="s1"&gt;'Budget'&lt;/span&gt;
    &lt;span class="k"&gt;END&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;price_category&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;products&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You can do the same for customers based on their total spending:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; 
    &lt;span class="k"&gt;c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;customer_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="k"&gt;c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;first_name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;COALESCE&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;SUM&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;total_amount&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;total_spent&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="k"&gt;CASE&lt;/span&gt;
        &lt;span class="k"&gt;WHEN&lt;/span&gt; &lt;span class="n"&gt;COALESCE&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;SUM&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;total_amount&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;20000&lt;/span&gt; &lt;span class="k"&gt;THEN&lt;/span&gt; &lt;span class="s1"&gt;'VIP'&lt;/span&gt;
        &lt;span class="k"&gt;WHEN&lt;/span&gt; &lt;span class="n"&gt;COALESCE&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;SUM&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;total_amount&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;BETWEEN&lt;/span&gt; &lt;span class="mi"&gt;10000&lt;/span&gt; &lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="mi"&gt;20000&lt;/span&gt; &lt;span class="k"&gt;THEN&lt;/span&gt; &lt;span class="s1"&gt;'Regular'&lt;/span&gt;
        &lt;span class="k"&gt;ELSE&lt;/span&gt; &lt;span class="s1"&gt;'New'&lt;/span&gt;
    &lt;span class="k"&gt;END&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;customer_level&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;customers&lt;/span&gt; &lt;span class="k"&gt;c&lt;/span&gt;
&lt;span class="k"&gt;LEFT&lt;/span&gt; &lt;span class="k"&gt;JOIN&lt;/span&gt; &lt;span class="n"&gt;sales&lt;/span&gt; &lt;span class="n"&gt;s&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="k"&gt;c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;customer_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;customer_id&lt;/span&gt;
&lt;span class="k"&gt;GROUP&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="k"&gt;c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;customer_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;first_name&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Notice the &lt;code&gt;COALESCE&lt;/code&gt; wrapping the &lt;code&gt;SUM&lt;/code&gt;. Customers with zero purchases would return &lt;code&gt;NULL&lt;/code&gt; from the sum  &lt;code&gt;COALESCE&lt;/code&gt; catches that and turns it into &lt;code&gt;0&lt;/code&gt; before the &lt;code&gt;CASE&lt;/code&gt; logic runs. Small detail, big difference in your results.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Production Stuff Nobody Talks About Enough
&lt;/h2&gt;

&lt;p&gt;Writing queries is one skill. Building something reliable that other people can use is another. These are the tools that close that gap.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Stored functions&lt;/strong&gt; let you wrap logic into something reusable and callable:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;OR&lt;/span&gt; &lt;span class="k"&gt;REPLACE&lt;/span&gt; &lt;span class="k"&gt;FUNCTION&lt;/span&gt; &lt;span class="n"&gt;get_customer_spending&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cust_id&lt;/span&gt; &lt;span class="nb"&gt;INT&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;RETURNS&lt;/span&gt; &lt;span class="nb"&gt;NUMERIC&lt;/span&gt; &lt;span class="k"&gt;LANGUAGE&lt;/span&gt; &lt;span class="n"&gt;plpgsql&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="err"&gt;$$&lt;/span&gt;
&lt;span class="k"&gt;DECLARE&lt;/span&gt; &lt;span class="n"&gt;total&lt;/span&gt; &lt;span class="nb"&gt;NUMERIC&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;BEGIN&lt;/span&gt;
    &lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;COALESCE&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;SUM&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;total_amount&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;INTO&lt;/span&gt; &lt;span class="n"&gt;total&lt;/span&gt;
    &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;sales&lt;/span&gt; &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;customer_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;cust_id&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;RETURN&lt;/span&gt; &lt;span class="n"&gt;total&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;END&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="err"&gt;$$&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now instead of rewriting that aggregation every time, you call &lt;code&gt;get_customer_spending(42)&lt;/code&gt;. One place to maintain if anything ever changes.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Indexes&lt;/strong&gt; are a performance tool most beginners ignore until their queries start timing out on real data:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;INDEX&lt;/span&gt; &lt;span class="n"&gt;IF&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;EXISTS&lt;/span&gt; &lt;span class="n"&gt;idx_sales_product_id&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;sales&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;product_id&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This tells PostgreSQL to build a sorted lookup structure on that column. JOINs and WHERE filters on indexed columns can run orders of magnitude faster. The trade-off is slightly slower inserts , but for read-heavy analytical workloads, that trade is almost always worth it.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Views&lt;/strong&gt; are saved queries that behave like tables, great for hiding complexity:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;OR&lt;/span&gt; &lt;span class="k"&gt;REPLACE&lt;/span&gt; &lt;span class="k"&gt;VIEW&lt;/span&gt; &lt;span class="n"&gt;product_revenue&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;product_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;SUM&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;total_amount&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;total_revenue&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;sales&lt;/span&gt;
&lt;span class="k"&gt;GROUP&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;product_id&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;And &lt;strong&gt;transactions&lt;/strong&gt; are how you protect your data when multiple things need to happen together:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;BEGIN&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;INSERT&lt;/span&gt; &lt;span class="k"&gt;INTO&lt;/span&gt; &lt;span class="n"&gt;sales&lt;/span&gt; &lt;span class="k"&gt;VALUES&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;CURRENT_DATE&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1999&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;98&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="k"&gt;UPDATE&lt;/span&gt; &lt;span class="n"&gt;products&lt;/span&gt; &lt;span class="k"&gt;SET&lt;/span&gt; &lt;span class="n"&gt;stock_quantity&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;stock_quantity&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt; &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;product_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;COMMIT&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Either both statements execute, or neither does. A sale gets recorded and stock gets updated  or nothing changes at all. That guarantee matters more than most people realise until the day a partial update corrupts their production data.&lt;/p&gt;




&lt;h2&gt;
  
  
  On Dirty Data — It Will Humble You
&lt;/h2&gt;

&lt;p&gt;No matter how well-designed a database is, real-world data finds a way to be messy. NULLs show up where you did not expect them. Phone numbers are empty strings instead of NULL. Emails have trailing spaces or inconsistent capitalisation. These things happen constantly.&lt;/p&gt;

&lt;p&gt;SQL gives you the tools:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;COALESCE&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;email&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'No Email Provided'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;email&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;customers&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="k"&gt;LOWER&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;email&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;email&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;customers&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="k"&gt;TRIM&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;first_name&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;clean_name&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;customers&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="k"&gt;CASE&lt;/span&gt; &lt;span class="k"&gt;WHEN&lt;/span&gt; &lt;span class="n"&gt;phone_number&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;''&lt;/span&gt; &lt;span class="k"&gt;THEN&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt; &lt;span class="k"&gt;ELSE&lt;/span&gt; &lt;span class="n"&gt;phone_number&lt;/span&gt; &lt;span class="k"&gt;END&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;phone&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;customers&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This is not glamorous work. But dirty data going into an aggregation means wrong numbers going into a report, which means bad decisions being made somewhere upstream. Treating data cleaning seriously is not optional , it is part of the craft.&lt;/p&gt;




&lt;h2&gt;
  
  
  Three Bugs Worth Learning From
&lt;/h2&gt;

&lt;p&gt;These mistakes show up constantly, even in experienced developers' work:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Forgetting to include non-aggregated columns in GROUP BY:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- Breaks&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;product_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;product_name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;SUM&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;total_amount&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;sales&lt;/span&gt; &lt;span class="k"&gt;GROUP&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;product_id&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="c1"&gt;-- Works&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;product_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;product_name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;SUM&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;total_amount&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;sales&lt;/span&gt; &lt;span class="n"&gt;s&lt;/span&gt;
&lt;span class="k"&gt;JOIN&lt;/span&gt; &lt;span class="n"&gt;products&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;product_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;product_id&lt;/span&gt;
&lt;span class="k"&gt;GROUP&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;product_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;product_name&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Using WHERE to filter an aggregate result:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- Breaks&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;product_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;SUM&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;total_amount&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;sales&lt;/span&gt; &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="k"&gt;SUM&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;total_amount&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;1000&lt;/span&gt; &lt;span class="k"&gt;GROUP&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;product_id&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="c1"&gt;-- Works&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;product_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;SUM&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;total_amount&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;sales&lt;/span&gt;
&lt;span class="k"&gt;GROUP&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;product_id&lt;/span&gt; &lt;span class="k"&gt;HAVING&lt;/span&gt; &lt;span class="k"&gt;SUM&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;total_amount&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Getting the JOIN condition backwards:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- Wrong — no error thrown, just nonsense results&lt;/span&gt;
&lt;span class="k"&gt;JOIN&lt;/span&gt; &lt;span class="n"&gt;products&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;customer_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;product_id&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="c1"&gt;-- Right&lt;/span&gt;
&lt;span class="k"&gt;JOIN&lt;/span&gt; &lt;span class="n"&gt;products&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;product_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;product_id&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That last one is especially sneaky. It does not crash. It just quietly returns results that make no logical sense, and you might not notice until someone is staring at a report asking why the numbers look strange.&lt;/p&gt;




&lt;h2&gt;
  
  
  Where to Go From Here
&lt;/h2&gt;

&lt;p&gt;SQL has been around since the 1970s and there is a good reason it has outlasted almost every technology trend that has come and gone in that time. It is precise, it is expressive, and it is built around a simple idea , your data has answers in it, and SQL is how you ask for them.&lt;/p&gt;

&lt;p&gt;The gap between someone who is just starting and someone who is genuinely good at this is not really about syntax. Syntax is easy to look up. The gap is in how you think. Whether you can take a business question, break it into logical steps, and translate each one into a query. That kind of thinking builds up with practice, with reading other people's SQL, and honestly with making enough mistakes that the right patterns start to feel natural.&lt;/p&gt;

&lt;p&gt;Get comfortable with JOINs until they stop feeling complicated. Then go learn window functions properly , that is probably the single investment in SQL that pays off fastest. Everything else tends to fall into place after that.&lt;/p&gt;

&lt;p&gt;The data is already there. It is just waiting to be asked the right questions.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Written by **Collins Njeru&lt;/em&gt;* — Nairobi, Kenya.*&lt;/p&gt;

</description>
      <category>database</category>
      <category>node</category>
      <category>sql</category>
      <category>deeplearning</category>
    </item>
    <item>
      <title>5 SQL FUNCTIONS EVERY BEGINNER SHOULD KNOW</title>
      <dc:creator>Collins Njeru</dc:creator>
      <pubDate>Mon, 20 Apr 2026 08:23:01 +0000</pubDate>
      <link>https://dev.to/cnew_aerospace_85c7b7d3cb/5-sql-functions-every-beginner-should-know-6o6</link>
      <guid>https://dev.to/cnew_aerospace_85c7b7d3cb/5-sql-functions-every-beginner-should-know-6o6</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;"Data is the new oil — and SQL is the refinery."&lt;/em&gt;&lt;br&gt;&lt;br&gt;
— Anonymous Data Engineer&lt;/p&gt;
&lt;/blockquote&gt;




&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fimages.unsplash.com%2Fphoto-1544383835-bda2bc66a55d%3Fw%3D1200%26auto%3Dformat%26fit%3Dcrop%26q%3D80" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fimages.unsplash.com%2Fphoto-1544383835-bda2bc66a55d%3Fw%3D1200%26auto%3Dformat%26fit%3Dcrop%26q%3D80" alt="SQL Banner" width="1200" height="833"&gt;&lt;/a&gt;&lt;br&gt;
&lt;em&gt;Image: A developer working with databases — the everyday environment of an SQL practitioner.&lt;/em&gt;&lt;/p&gt;


&lt;h2&gt;
  
  
  Table of Contents
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;Introduction&lt;/li&gt;
&lt;li&gt;What Is SQL?&lt;/li&gt;
&lt;li&gt;Function 1 — COUNT()&lt;/li&gt;
&lt;li&gt;Function 2 — SUM()&lt;/li&gt;
&lt;li&gt;Function 3 — AVG()&lt;/li&gt;
&lt;li&gt;Function 4 — UPPER() and LOWER()&lt;/li&gt;
&lt;li&gt;Function 5 — CONCAT()&lt;/li&gt;
&lt;li&gt;Combining Functions Together&lt;/li&gt;
&lt;li&gt;Best Practices&lt;/li&gt;
&lt;li&gt;Conclusion&lt;/li&gt;
&lt;/ol&gt;


&lt;h2&gt;
  
  
  Introduction
&lt;/h2&gt;

&lt;p&gt;If you have ever opened a spreadsheet and thought, &lt;em&gt;"There has to be a better way to manage this data,"&lt;/em&gt; — you were right. That better way is &lt;strong&gt;SQL&lt;/strong&gt; (Structured Query Language), the industry-standard language for talking to databases.&lt;/p&gt;

&lt;p&gt;Whether you are a data analyst, a backend developer, a business intelligence professional, or simply a curious learner, SQL is one of the most valuable skills you can add to your toolkit. It is estimated that over &lt;strong&gt;75% of companies&lt;/strong&gt; rely on relational databases for their core operations, and every single one of them uses SQL in some form.&lt;/p&gt;

&lt;p&gt;However, learning SQL can feel overwhelming at first. There are dozens of functions, keywords, clauses, and syntax rules to absorb. The good news? You do not need to know everything to be effective. In fact, mastering just &lt;strong&gt;five core SQL functions&lt;/strong&gt; will empower you to answer real business questions, clean messy data, and produce meaningful reports.&lt;/p&gt;

&lt;p&gt;In this article, we will explore those five essential SQL functions in depth — with clear syntax breakdowns, real-world examples, practical use cases, and tips to help you avoid common beginner mistakes.&lt;/p&gt;

&lt;p&gt;Let us get started.&lt;/p&gt;


&lt;h2&gt;
  
  
  What Is SQL?
&lt;/h2&gt;

&lt;p&gt;Before diving into the functions, it helps to have a brief refresher on what SQL actually is and how it operates.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;SQL&lt;/strong&gt; stands for &lt;strong&gt;Structured Query Language&lt;/strong&gt;. It is a domain-specific language designed for managing and manipulating data stored in &lt;strong&gt;relational database management systems (RDBMS)&lt;/strong&gt; such as:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;MySQL&lt;/li&gt;
&lt;li&gt;PostgreSQL&lt;/li&gt;
&lt;li&gt;Microsoft SQL Server&lt;/li&gt;
&lt;li&gt;SQLite&lt;/li&gt;
&lt;li&gt;Oracle Database&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fimages.unsplash.com%2Fphoto-1558494949-ef010cbdcc31%3Fw%3D1200%26auto%3Dformat%26fit%3Dcrop%26q%3D80" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fimages.unsplash.com%2Fphoto-1558494949-ef010cbdcc31%3Fw%3D1200%26auto%3Dformat%26fit%3Dcrop%26q%3D80" alt="Database Diagram" width="1200" height="673"&gt;&lt;/a&gt;&lt;br&gt;
&lt;em&gt;Image: A visual representation of how relational databases organize data into tables with rows and columns.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;SQL operates on a simple principle: data is organized into &lt;strong&gt;tables&lt;/strong&gt; (similar to spreadsheets), where each table has &lt;strong&gt;columns&lt;/strong&gt; (attributes/fields) and &lt;strong&gt;rows&lt;/strong&gt; (records/entries). You write queries to &lt;strong&gt;select&lt;/strong&gt;, &lt;strong&gt;filter&lt;/strong&gt;, &lt;strong&gt;sort&lt;/strong&gt;, &lt;strong&gt;group&lt;/strong&gt;, &lt;strong&gt;insert&lt;/strong&gt;, &lt;strong&gt;update&lt;/strong&gt;, or &lt;strong&gt;delete&lt;/strong&gt; data from these tables.&lt;/p&gt;

&lt;p&gt;SQL commands are generally categorized into:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Category&lt;/th&gt;
&lt;th&gt;Full Name&lt;/th&gt;
&lt;th&gt;Examples&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;DQL&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Data Query Language&lt;/td&gt;
&lt;td&gt;&lt;code&gt;SELECT&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;DML&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Data Manipulation Language&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;INSERT&lt;/code&gt;, &lt;code&gt;UPDATE&lt;/code&gt;, &lt;code&gt;DELETE&lt;/code&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;DDL&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Data Definition Language&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;CREATE&lt;/code&gt;, &lt;code&gt;ALTER&lt;/code&gt;, &lt;code&gt;DROP&lt;/code&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;DCL&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Data Control Language&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;GRANT&lt;/code&gt;, &lt;code&gt;REVOKE&lt;/code&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;For this article, we will focus on &lt;strong&gt;SQL functions&lt;/strong&gt;, which are built-in operations that process and return values from your data. Specifically, we will cover the five most beginner-friendly and most commonly used SQL functions.&lt;/p&gt;


&lt;h2&gt;
  
  
  Function 1 — &lt;code&gt;COUNT()&lt;/code&gt;
&lt;/h2&gt;
&lt;h3&gt;
  
  
  What It Does
&lt;/h3&gt;

&lt;p&gt;The &lt;code&gt;COUNT()&lt;/code&gt; function returns the &lt;strong&gt;number of rows&lt;/strong&gt; that match a specified condition. It is one of the most frequently used SQL functions and is essential for generating summary statistics and reports.&lt;/p&gt;
&lt;h3&gt;
  
  
  Syntax
&lt;/h3&gt;


&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="k"&gt;COUNT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;column_name&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="k"&gt;table_name&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;condition&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;ul&gt;
&lt;li&gt;Use &lt;code&gt;COUNT(*)&lt;/code&gt; to count &lt;strong&gt;all rows&lt;/strong&gt;, including those with &lt;code&gt;NULL&lt;/code&gt; values.&lt;/li&gt;
&lt;li&gt;Use &lt;code&gt;COUNT(column_name)&lt;/code&gt; to count rows where that column is &lt;strong&gt;not NULL&lt;/strong&gt;.&lt;/li&gt;
&lt;/ul&gt;


&lt;h3&gt;
  
  
  Real-World Example
&lt;/h3&gt;

&lt;p&gt;Imagine you work at an e-commerce company and you have an &lt;code&gt;orders&lt;/code&gt; table like this:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;order_id&lt;/th&gt;
&lt;th&gt;customer_name&lt;/th&gt;
&lt;th&gt;product&lt;/th&gt;
&lt;th&gt;status&lt;/th&gt;
&lt;th&gt;amount&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;Alice Njeri&lt;/td&gt;
&lt;td&gt;Laptop&lt;/td&gt;
&lt;td&gt;Completed&lt;/td&gt;
&lt;td&gt;85000&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;Brian Otieno&lt;/td&gt;
&lt;td&gt;Headphones&lt;/td&gt;
&lt;td&gt;Pending&lt;/td&gt;
&lt;td&gt;4500&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;td&gt;Carol Wanjiku&lt;/td&gt;
&lt;td&gt;Phone&lt;/td&gt;
&lt;td&gt;Completed&lt;/td&gt;
&lt;td&gt;35000&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;4&lt;/td&gt;
&lt;td&gt;David Kamau&lt;/td&gt;
&lt;td&gt;Keyboard&lt;/td&gt;
&lt;td&gt;Cancelled&lt;/td&gt;
&lt;td&gt;2800&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;5&lt;/td&gt;
&lt;td&gt;Eve Achieng&lt;/td&gt;
&lt;td&gt;Monitor&lt;/td&gt;
&lt;td&gt;Completed&lt;/td&gt;
&lt;td&gt;22000&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;Question:&lt;/strong&gt; How many orders are in the table?&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="k"&gt;COUNT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;total_orders&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;orders&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Result:&lt;/strong&gt;&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;total_orders&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;5&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;Question:&lt;/strong&gt; How many orders have the status &lt;code&gt;'Completed'&lt;/code&gt;?&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="k"&gt;COUNT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;completed_orders&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;orders&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;status&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'Completed'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Result:&lt;/strong&gt;&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;completed_orders&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h3&gt;
  
  
  Use Cases for &lt;code&gt;COUNT()&lt;/code&gt;
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Counting total customers in a database&lt;/li&gt;
&lt;li&gt;Counting how many products are out of stock&lt;/li&gt;
&lt;li&gt;Counting the number of users who signed up this month&lt;/li&gt;
&lt;li&gt;Counting failed transactions in a payment system&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Common Beginner Mistake
&lt;/h3&gt;

&lt;p&gt;Beginners often confuse &lt;code&gt;COUNT(*)&lt;/code&gt; with &lt;code&gt;COUNT(column_name)&lt;/code&gt;. Remember:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- Counts ALL rows (even if some columns are NULL)&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="k"&gt;COUNT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;employees&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="c1"&gt;-- Counts only rows where 'email' is NOT NULL&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="k"&gt;COUNT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;email&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;employees&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Function 2 — &lt;code&gt;SUM()&lt;/code&gt;
&lt;/h2&gt;

&lt;h3&gt;
  
  
  What It Does
&lt;/h3&gt;

&lt;p&gt;The &lt;code&gt;SUM()&lt;/code&gt; function calculates the &lt;strong&gt;total sum&lt;/strong&gt; of a numeric column. It is indispensable for financial reporting, sales tracking, inventory management, and any scenario where you need an aggregate total.&lt;/p&gt;

&lt;h3&gt;
  
  
  Syntax
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="k"&gt;SUM&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;column_name&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="k"&gt;table_name&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;condition&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h3&gt;
  
  
  Real-World Example
&lt;/h3&gt;

&lt;p&gt;Using the same &lt;code&gt;orders&lt;/code&gt; table from above:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Question:&lt;/strong&gt; What is the total revenue from all completed orders?&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="k"&gt;SUM&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;amount&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;total_revenue&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;orders&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;status&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'Completed'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Result:&lt;/strong&gt;&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;total_revenue&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;142000&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;This tells you that completed orders have brought in a total of &lt;strong&gt;KES 142,000&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;You can also use &lt;code&gt;SUM()&lt;/code&gt; with &lt;code&gt;GROUP BY&lt;/code&gt; to break totals down by category:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;status&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;SUM&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;amount&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;revenue_by_status&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;orders&lt;/span&gt;
&lt;span class="k"&gt;GROUP&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;status&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Result:&lt;/strong&gt;&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;status&lt;/th&gt;
&lt;th&gt;revenue_by_status&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Completed&lt;/td&gt;
&lt;td&gt;142000&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Pending&lt;/td&gt;
&lt;td&gt;4500&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Cancelled&lt;/td&gt;
&lt;td&gt;2800&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fimages.unsplash.com%2Fphoto-1551288049-bebda4e38f71%3Fw%3D1200%26auto%3Dformat%26fit%3Dcrop%26q%3D80" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fimages.unsplash.com%2Fphoto-1551288049-bebda4e38f71%3Fw%3D1200%26auto%3Dformat%26fit%3Dcrop%26q%3D80" alt="Data Analysis" width="1200" height="800"&gt;&lt;/a&gt;&lt;br&gt;
&lt;em&gt;Image: Data analysis and financial summaries are classic use cases for the SUM() function.&lt;/em&gt;&lt;/p&gt;
&lt;h3&gt;
  
  
  Use Cases for &lt;code&gt;SUM()&lt;/code&gt;
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Total sales revenue for a period&lt;/li&gt;
&lt;li&gt;Total inventory value in a warehouse&lt;/li&gt;
&lt;li&gt;Total hours logged by employees in a week&lt;/li&gt;
&lt;li&gt;Total amount of discounts applied to orders&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;
  
  
  Common Beginner Mistake
&lt;/h3&gt;

&lt;p&gt;&lt;code&gt;SUM()&lt;/code&gt; only works with &lt;strong&gt;numeric columns&lt;/strong&gt;. Attempting it on a text column will throw an error. Also, &lt;code&gt;SUM()&lt;/code&gt; &lt;strong&gt;ignores NULL values&lt;/strong&gt; automatically — which is usually the desired behavior, but good to know!&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- This will FAIL if 'product' is a text column&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="k"&gt;SUM&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;product&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;orders&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="c1"&gt;--  Error&lt;/span&gt;

&lt;span class="c1"&gt;-- This is CORRECT&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="k"&gt;SUM&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;amount&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;orders&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="c1"&gt;--  Works&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Function 3 — &lt;code&gt;AVG()&lt;/code&gt;
&lt;/h2&gt;

&lt;h3&gt;
  
  
  What It Does
&lt;/h3&gt;

&lt;p&gt;The &lt;code&gt;AVG()&lt;/code&gt; function calculates the &lt;strong&gt;arithmetic mean (average)&lt;/strong&gt; of a set of numeric values. Rather than knowing the total, &lt;code&gt;AVG()&lt;/code&gt; tells you the typical or central value — which is often more meaningful for decision-making.&lt;/p&gt;

&lt;h3&gt;
  
  
  Syntax
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="k"&gt;AVG&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;column_name&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="k"&gt;table_name&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;condition&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h3&gt;
  
  
  Real-World Example
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Question:&lt;/strong&gt; What is the average order amount across all orders?&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="k"&gt;AVG&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;amount&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;average_order_value&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;orders&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Result:&lt;/strong&gt;&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;average_order_value&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;29860.00&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;This tells the business that, on average, each order is worth approximately &lt;strong&gt;KES 29,860&lt;/strong&gt; — a useful figure for forecasting and setting minimum order targets.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Combining with GROUP BY:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;status&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;AVG&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;amount&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;avg_amount_by_status&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;orders&lt;/span&gt;
&lt;span class="k"&gt;GROUP&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;status&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Result:&lt;/strong&gt;&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;status&lt;/th&gt;
&lt;th&gt;avg_amount_by_status&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Completed&lt;/td&gt;
&lt;td&gt;47333.33&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Pending&lt;/td&gt;
&lt;td&gt;4500.00&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Cancelled&lt;/td&gt;
&lt;td&gt;2800.00&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h3&gt;
  
  
  Use Cases for &lt;code&gt;AVG()&lt;/code&gt;
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Average salary across a department&lt;/li&gt;
&lt;li&gt;Average student score in an exam&lt;/li&gt;
&lt;li&gt;Average delivery time for shipments&lt;/li&gt;
&lt;li&gt;Average customer rating for a product&lt;/li&gt;
&lt;li&gt;Average page views per day on a website&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Common Beginner Mistake
&lt;/h3&gt;

&lt;p&gt;Like &lt;code&gt;SUM()&lt;/code&gt;, &lt;code&gt;AVG()&lt;/code&gt; ignores &lt;code&gt;NULL&lt;/code&gt; values. This means if some rows in your column are &lt;code&gt;NULL&lt;/code&gt;, the average is calculated only from the non-null rows — which may or may not be what you want.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- If 5 rows exist but only 3 have values, AVG uses 3 rows&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="k"&gt;AVG&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;rating&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;products&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="c1"&gt;-- Only non-NULL ratings are averaged&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If you need to include &lt;code&gt;NULL&lt;/code&gt; as zero in your average, use &lt;code&gt;COALESCE()&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="k"&gt;AVG&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;COALESCE&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;rating&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;products&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Function 4 — &lt;code&gt;UPPER()&lt;/code&gt; and &lt;code&gt;LOWER()&lt;/code&gt;
&lt;/h2&gt;

&lt;h3&gt;
  
  
  What They Do
&lt;/h3&gt;

&lt;p&gt;These are &lt;strong&gt;string functions&lt;/strong&gt; that convert text to either &lt;strong&gt;all uppercase&lt;/strong&gt; (&lt;code&gt;UPPER()&lt;/code&gt;) or &lt;strong&gt;all lowercase&lt;/strong&gt; (&lt;code&gt;LOWER()&lt;/code&gt;). While they seem simple, they are critical for data cleaning, standardization, and consistent formatting across databases.&lt;/p&gt;

&lt;h3&gt;
  
  
  Syntax
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- Convert to uppercase&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="k"&gt;UPPER&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;column_name&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="k"&gt;table_name&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="c1"&gt;-- Convert to lowercase&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="k"&gt;LOWER&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;column_name&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="k"&gt;table_name&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h3&gt;
  
  
  Real-World Example
&lt;/h3&gt;

&lt;p&gt;Consider a &lt;code&gt;customers&lt;/code&gt; table where email addresses and names have been entered inconsistently:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;customer_id&lt;/th&gt;
&lt;th&gt;name&lt;/th&gt;
&lt;th&gt;email&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;alice njeri&lt;/td&gt;
&lt;td&gt;&lt;a href="mailto:ALICE@GMAIL.COM"&gt;ALICE@GMAIL.COM&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;BRIAN OTIENO&lt;/td&gt;
&lt;td&gt;&lt;a href="mailto:brian@yahoo.com"&gt;brian@yahoo.com&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;td&gt;Carol Wanjiku&lt;/td&gt;
&lt;td&gt;&lt;a href="mailto:CAROL@OUTLOOK.COM"&gt;CAROL@OUTLOOK.COM&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;Problem:&lt;/strong&gt; The data is inconsistent — some names are lowercase, some uppercase, and emails are mixed.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Solution:&lt;/strong&gt; Use &lt;code&gt;LOWER()&lt;/code&gt; to standardize emails and &lt;code&gt;UPPER()&lt;/code&gt; to format names properly.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- Standardize all emails to lowercase&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;customer_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
       &lt;span class="k"&gt;UPPER&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;formatted_name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
       &lt;span class="k"&gt;LOWER&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;email&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;standardized_email&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;customers&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Result:&lt;/strong&gt;&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;customer_id&lt;/th&gt;
&lt;th&gt;formatted_name&lt;/th&gt;
&lt;th&gt;standardized_email&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;ALICE NJERI&lt;/td&gt;
&lt;td&gt;&lt;a href="mailto:alice@gmail.com"&gt;alice@gmail.com&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;BRIAN OTIENO&lt;/td&gt;
&lt;td&gt;&lt;a href="mailto:brian@yahoo.com"&gt;brian@yahoo.com&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;td&gt;CAROL WANJIKU&lt;/td&gt;
&lt;td&gt;&lt;a href="mailto:carol@outlook.com"&gt;carol@outlook.com&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;You can also use &lt;code&gt;LOWER()&lt;/code&gt; in a &lt;code&gt;WHERE&lt;/code&gt; clause for &lt;strong&gt;case-insensitive searching&lt;/strong&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- Find customer regardless of how their email was entered&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;customers&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="k"&gt;LOWER&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;email&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'alice@gmail.com'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This ensures your search works whether the email is stored as &lt;code&gt;ALICE@GMAIL.COM&lt;/code&gt;, &lt;code&gt;Alice@Gmail.Com&lt;/code&gt;, or &lt;code&gt;alice@gmail.com&lt;/code&gt;.&lt;/p&gt;




&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fimages.unsplash.com%2Fphoto-1504868584819-f8e8b4b6d7e3%3Fw%3D1200%26auto%3Dformat%26fit%3Dcrop%26q%3D80" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fimages.unsplash.com%2Fphoto-1504868584819-f8e8b4b6d7e3%3Fw%3D1200%26auto%3Dformat%26fit%3Dcrop%26q%3D80" alt="Data Cleaning" width="1200" height="795"&gt;&lt;/a&gt;&lt;br&gt;
&lt;em&gt;Image: Data cleaning and standardization — exactly what UPPER() and LOWER() help accomplish.&lt;/em&gt;&lt;/p&gt;
&lt;h3&gt;
  
  
  Use Cases for &lt;code&gt;UPPER()&lt;/code&gt; and &lt;code&gt;LOWER()&lt;/code&gt;
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Standardizing email addresses before sending bulk emails&lt;/li&gt;
&lt;li&gt;Normalizing user input for case-insensitive login authentication&lt;/li&gt;
&lt;li&gt;Formatting customer names consistently for reports&lt;/li&gt;
&lt;li&gt;Ensuring uniform data before comparing or joining tables&lt;/li&gt;
&lt;li&gt;Cleaning imported CSV data that has inconsistent casing&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;
  
  
  Common Beginner Mistake
&lt;/h3&gt;

&lt;p&gt;These functions do &lt;strong&gt;not modify the actual data&lt;/strong&gt; in the database. They only transform the output in your query result. To permanently change the data, you need an &lt;code&gt;UPDATE&lt;/code&gt; statement:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- This only changes the display output, NOT the stored data:&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="k"&gt;LOWER&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;email&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;customers&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="c1"&gt;--  Display only&lt;/span&gt;

&lt;span class="c1"&gt;-- This permanently updates the stored data:&lt;/span&gt;
&lt;span class="k"&gt;UPDATE&lt;/span&gt; &lt;span class="n"&gt;customers&lt;/span&gt;
&lt;span class="k"&gt;SET&lt;/span&gt; &lt;span class="n"&gt;email&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;LOWER&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;email&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt; &lt;span class="c1"&gt;--  Permanent change&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Function 5 — &lt;code&gt;CONCAT()&lt;/code&gt;
&lt;/h2&gt;

&lt;h3&gt;
  
  
  What It Does
&lt;/h3&gt;

&lt;p&gt;The &lt;code&gt;CONCAT()&lt;/code&gt; function &lt;strong&gt;joins two or more strings together&lt;/strong&gt; into a single string. It is incredibly useful for combining columns, formatting output, building dynamic messages, and creating full names or addresses from separate fields.&lt;/p&gt;

&lt;h3&gt;
  
  
  Syntax
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- Standard CONCAT()&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;CONCAT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;string1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;string2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;string3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;...)&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="k"&gt;table_name&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="c1"&gt;-- Alternative using || operator (works in PostgreSQL and SQLite)&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;string1&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="n"&gt;string2&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="k"&gt;table_name&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h3&gt;
  
  
  Real-World Example
&lt;/h3&gt;

&lt;p&gt;Suppose your &lt;code&gt;employees&lt;/code&gt; table stores first and last names in separate columns:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;emp_id&lt;/th&gt;
&lt;th&gt;first_name&lt;/th&gt;
&lt;th&gt;last_name&lt;/th&gt;
&lt;th&gt;department&lt;/th&gt;
&lt;th&gt;city&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;Alice&lt;/td&gt;
&lt;td&gt;Njeri&lt;/td&gt;
&lt;td&gt;Engineering&lt;/td&gt;
&lt;td&gt;Nairobi&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;Brian&lt;/td&gt;
&lt;td&gt;Otieno&lt;/td&gt;
&lt;td&gt;Marketing&lt;/td&gt;
&lt;td&gt;Mombasa&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;td&gt;Carol&lt;/td&gt;
&lt;td&gt;Wanjiku&lt;/td&gt;
&lt;td&gt;Finance&lt;/td&gt;
&lt;td&gt;Nairobi&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;Question:&lt;/strong&gt; Generate a full name and a formatted label for each employee.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt;
    &lt;span class="n"&gt;emp_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;CONCAT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;first_name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;' '&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;last_name&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;full_name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;CONCAT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;first_name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;' works in '&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;department&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;' — '&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;city&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;employee_summary&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;employees&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Result:&lt;/strong&gt;&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;emp_id&lt;/th&gt;
&lt;th&gt;full_name&lt;/th&gt;
&lt;th&gt;employee_summary&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;Alice Njeri&lt;/td&gt;
&lt;td&gt;Alice works in Engineering — Nairobi&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;Brian Otieno&lt;/td&gt;
&lt;td&gt;Brian works in Marketing — Mombasa&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;td&gt;Carol Wanjiku&lt;/td&gt;
&lt;td&gt;Carol works in Finance — Nairobi&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;You can also combine &lt;code&gt;CONCAT()&lt;/code&gt; with &lt;code&gt;UPPER()&lt;/code&gt; for even more powerful string formatting:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt;
    &lt;span class="n"&gt;CONCAT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;UPPER&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;first_name&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="s1"&gt;' '&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;UPPER&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;last_name&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;full_name_caps&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;employees&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Result:&lt;/strong&gt;&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;full_name_caps&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;ALICE NJERI&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;BRIAN OTIENO&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;CAROL WANJIKU&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h3&gt;
  
  
  Use Cases for &lt;code&gt;CONCAT()&lt;/code&gt;
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Building full names from &lt;code&gt;first_name&lt;/code&gt; and &lt;code&gt;last_name&lt;/code&gt; columns&lt;/li&gt;
&lt;li&gt;Creating a full address from &lt;code&gt;street&lt;/code&gt;, &lt;code&gt;city&lt;/code&gt;, and &lt;code&gt;country&lt;/code&gt; fields&lt;/li&gt;
&lt;li&gt;Generating personalized email greetings like &lt;code&gt;"Dear Alice Njeri,"&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Combining a product code with a category code to form a SKU&lt;/li&gt;
&lt;li&gt;Building dynamic SQL strings within stored procedures&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Common Beginner Mistake
&lt;/h3&gt;

&lt;p&gt;&lt;code&gt;CONCAT()&lt;/code&gt; treats &lt;strong&gt;NULL values differently&lt;/strong&gt; depending on the database:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- In MySQL, CONCAT() returns NULL if ANY argument is NULL:&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;CONCAT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'Hello'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'World'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt; &lt;span class="c1"&gt;-- Returns: NULL &lt;/span&gt;

&lt;span class="c1"&gt;-- Use COALESCE() or CONCAT_WS() to handle NULLs safely:&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;CONCAT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'Hello'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;COALESCE&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;''&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="s1"&gt;'World'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt; &lt;span class="c1"&gt;-- Returns: HelloWorld &lt;/span&gt;

&lt;span class="c1"&gt;-- CONCAT_WS (Concatenate With Separator) skips NULLs automatically:&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;CONCAT_WS&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;' '&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;first_name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;middle_name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;last_name&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;employees&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="c1"&gt;-- &lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Combining Functions Together
&lt;/h2&gt;

&lt;p&gt;The true power of SQL functions emerges when you &lt;strong&gt;combine them&lt;/strong&gt; in a single query. Here is an example that uses &lt;code&gt;COUNT()&lt;/code&gt;, &lt;code&gt;SUM()&lt;/code&gt;, &lt;code&gt;AVG()&lt;/code&gt;, and &lt;code&gt;CONCAT()&lt;/code&gt; together in one query to produce a rich report:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt;
    &lt;span class="n"&gt;department&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="k"&gt;COUNT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;total_employees&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="k"&gt;SUM&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;salary&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;total_salary_budget&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;ROUND&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;AVG&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;salary&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;average_salary&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;CONCAT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'Dept: '&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;UPPER&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;department&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;department_label&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;employees&lt;/span&gt;
&lt;span class="k"&gt;GROUP&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;department&lt;/span&gt;
&lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;total_salary_budget&lt;/span&gt; &lt;span class="k"&gt;DESC&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This single query tells you — for each department:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;How many employees exist&lt;/li&gt;
&lt;li&gt;The total salary budget&lt;/li&gt;
&lt;li&gt;The average salary&lt;/li&gt;
&lt;li&gt;A nicely formatted department label&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This is the kind of query you would write to build a management dashboard or a financial summary report.&lt;/p&gt;




&lt;h2&gt;
  
  
  Best Practices
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1. Always Use Aliases (&lt;code&gt;AS&lt;/code&gt;)
&lt;/h3&gt;

&lt;p&gt;Use the &lt;code&gt;AS&lt;/code&gt; keyword to give your function outputs meaningful names. It makes your results readable and your queries professional.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- Hard to read:&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="k"&gt;COUNT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="k"&gt;SUM&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;amount&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="k"&gt;AVG&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;amount&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;orders&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="c1"&gt;-- Much better:&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; 
    &lt;span class="k"&gt;COUNT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;total_orders&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="k"&gt;SUM&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;amount&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;total_revenue&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="k"&gt;AVG&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;amount&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;avg_order_value&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;orders&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  2. Handle NULL Values Proactively
&lt;/h3&gt;

&lt;p&gt;Use &lt;code&gt;COALESCE()&lt;/code&gt; or &lt;code&gt;IFNULL()&lt;/code&gt; to deal with &lt;code&gt;NULL&lt;/code&gt; values before passing them to aggregate functions, especially when calculating averages or concatenating strings.&lt;/p&gt;

&lt;h3&gt;
  
  
  3. Use &lt;code&gt;GROUP BY&lt;/code&gt; with Aggregate Functions
&lt;/h3&gt;

&lt;p&gt;When using &lt;code&gt;COUNT()&lt;/code&gt;, &lt;code&gt;SUM()&lt;/code&gt;, or &lt;code&gt;AVG()&lt;/code&gt; alongside non-aggregate columns, always pair them with a &lt;code&gt;GROUP BY&lt;/code&gt; clause.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;--  This will cause an error in most databases:&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;department&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;COUNT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;employees&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="c1"&gt;--  This is correct:&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;department&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;COUNT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;employees&lt;/span&gt; &lt;span class="k"&gt;GROUP&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;department&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  4. Test on Small Datasets First
&lt;/h3&gt;

&lt;p&gt;Before running aggregate queries on millions of rows, test on a limited subset using &lt;code&gt;LIMIT&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;orders&lt;/span&gt; &lt;span class="k"&gt;LIMIT&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  5. Comment Your Queries
&lt;/h3&gt;

&lt;p&gt;Use &lt;code&gt;--&lt;/code&gt; for single-line comments and &lt;code&gt;/* */&lt;/code&gt; for multi-line comments to document your queries:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- Get a revenue summary by order status&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; 
    &lt;span class="n"&gt;status&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="k"&gt;COUNT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;order_count&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;     &lt;span class="c1"&gt;-- Number of orders per status&lt;/span&gt;
    &lt;span class="k"&gt;SUM&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;amount&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;total_amount&lt;/span&gt;  &lt;span class="c1"&gt;-- Total value per status&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;orders&lt;/span&gt;
&lt;span class="k"&gt;GROUP&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;status&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;SQL does not have to be intimidating. By mastering just five core functions — &lt;code&gt;COUNT()&lt;/code&gt;, &lt;code&gt;SUM()&lt;/code&gt;, &lt;code&gt;AVG()&lt;/code&gt;, &lt;code&gt;UPPER()&lt;/code&gt;/&lt;code&gt;LOWER()&lt;/code&gt;, and &lt;code&gt;CONCAT()&lt;/code&gt; — you already have the tools to:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Summarize and analyze data&lt;/li&gt;
&lt;li&gt;Clean and standardize messy datasets&lt;/li&gt;
&lt;li&gt;Build meaningful reports and dashboards&lt;/li&gt;
&lt;li&gt;Answer real business questions with confidence&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>sql</category>
      <category>beginners</category>
      <category>luxdevhq</category>
      <category>harunmbaabu</category>
    </item>
    <item>
      <title>SQL IS NOT BAD - It's Running the World And Most Developers Don't Know Half of It....</title>
      <dc:creator>Collins Njeru</dc:creator>
      <pubDate>Fri, 17 Apr 2026 12:44:00 +0000</pubDate>
      <link>https://dev.to/cnew_aerospace_85c7b7d3cb/sql-is-not-bad-its-running-the-world-and-most-developers-dont-know-half-of-it-2jjo</link>
      <guid>https://dev.to/cnew_aerospace_85c7b7d3cb/sql-is-not-bad-its-running-the-world-and-most-developers-dont-know-half-of-it-2jjo</guid>
      <description>&lt;h2&gt;
  
  
  &lt;em&gt;"Every database, every app, every swipe of your card — SQL is there."&lt;/em&gt;
&lt;/h2&gt;

&lt;h3&gt;
  
  
  The language that outlived every trend since 1974
&lt;/h3&gt;

&lt;h4&gt;
  
  
  Written for developers, analysts, and data enthusiasts at every level
&lt;/h4&gt;

&lt;h5&gt;
  
  
  From zero to advanced — in one article
&lt;/h5&gt;

&lt;h6&gt;
  
  
  Last updated: 2026 | PostgreSQL-first, broadly compatible
&lt;/h6&gt;




&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;"In an era obsessed with the newest JavaScript framework or the latest AI buzzword, SQL has done something remarkable , it has outlived every trend thrown at it since&lt;/em&gt; &lt;strong&gt;&lt;em&gt;1974.&lt;/em&gt;&lt;/strong&gt;"*&lt;/p&gt;
&lt;/blockquote&gt;




&lt;p&gt;NoSQL was supposed to &lt;strong&gt;kill it&lt;/strong&gt;.&lt;br&gt;&lt;br&gt;
Big Data was supposed to &lt;strong&gt;replace it&lt;/strong&gt;.&lt;br&gt;&lt;br&gt;
Pandas, Spark, and cloud warehouses were going to &lt;strong&gt;make it irrelevant&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Yet here we are. SQL is baked into everything from &lt;strong&gt;PostgreSQL&lt;/strong&gt; and &lt;strong&gt;MySQL&lt;/strong&gt; to &lt;strong&gt;Snowflake&lt;/strong&gt;, &lt;strong&gt;BigQuery&lt;/strong&gt;, and &lt;strong&gt;DuckDB&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;&lt;del&gt;SQL is dying.&lt;/del&gt; ← False. SQL is thriving.&lt;/p&gt;

&lt;p&gt;If you work with data — &lt;strong&gt;&lt;em&gt;in any capacity&lt;/em&gt;&lt;/strong&gt; — SQL fluency is not optional. It's oxygen.&lt;/p&gt;


&lt;h2&gt;
  
  
  Table of Contents
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;The Anatomy of a SQL Query&lt;/li&gt;
&lt;li&gt;JOINs Demystified&lt;/li&gt;
&lt;li&gt;Aggregation and GROUP BY&lt;/li&gt;
&lt;li&gt;Window Functions — SQL's Hidden Superpower&lt;/li&gt;
&lt;li&gt;Subqueries vs CTEs&lt;/li&gt;
&lt;li&gt;Indexing — Why Your Queries Are Slow&lt;/li&gt;
&lt;li&gt;Transactions and ACID&lt;/li&gt;
&lt;li&gt;10 SQL Mistakes Hurting Your Performance&lt;/li&gt;
&lt;li&gt;Advanced SQL Patterns&lt;/li&gt;
&lt;li&gt;The Future of SQL&lt;/li&gt;
&lt;/ol&gt;


&lt;h2&gt;
  
  
  1. The Anatomy of a SQL Query
&lt;/h2&gt;

&lt;p&gt;Before we run, let's walk. A SQL query has a very specific &lt;strong&gt;logical execution order&lt;/strong&gt; that trips up even &lt;em&gt;experienced&lt;/em&gt; developers.&lt;/p&gt;

&lt;p&gt;The order you &lt;strong&gt;&lt;em&gt;write&lt;/em&gt;&lt;/strong&gt; a query:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;COUNT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;total_orders&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;customers&lt;/span&gt;
&lt;span class="k"&gt;JOIN&lt;/span&gt; &lt;span class="n"&gt;orders&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;customers&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;orders&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;customer_id&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;customers&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;country&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'Kenya'&lt;/span&gt;
&lt;span class="k"&gt;GROUP&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt;
&lt;span class="k"&gt;HAVING&lt;/span&gt; &lt;span class="k"&gt;COUNT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;
&lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;total_orders&lt;/span&gt; &lt;span class="k"&gt;DESC&lt;/span&gt;
&lt;span class="k"&gt;LIMIT&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The order the database &lt;strong&gt;&lt;em&gt;executes&lt;/em&gt;&lt;/strong&gt; it:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;1. FROM       → Identify the tables
2. JOIN       → Combine related data
3. WHERE      → Filter rows (before grouping)
4. GROUP BY   → Collapse rows into groups
5. HAVING     → Filter groups (after aggregation)
6. SELECT     → Choose columns to return
7. ORDER BY   → Sort the result
8. LIMIT      → Cut result to N rows
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key insight:&lt;/strong&gt; &lt;code&gt;WHERE&lt;/code&gt; runs &lt;em&gt;before&lt;/em&gt; &lt;code&gt;GROUP BY&lt;/code&gt;. This means you &lt;strong&gt;cannot&lt;/strong&gt; filter on an aggregated column using &lt;code&gt;WHERE&lt;/code&gt;. That's what &lt;code&gt;HAVING&lt;/code&gt; is for.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Nested tip:&lt;/strong&gt; Get this order wrong, and you'll spend hours debugging a query that looks perfectly fine.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  What You Should Know Before Writing Any Query
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Always&lt;/strong&gt; understand the shape of your data first&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Never&lt;/strong&gt; assume column types — check them&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Use&lt;/strong&gt; &lt;code&gt;EXPLAIN&lt;/code&gt; before running on large tables&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Limit&lt;/strong&gt; your results while exploring&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Your SQL Query Checklist
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;[x] Understand the execution order&lt;/li&gt;
&lt;li&gt;[x] Know which table is the "driving" table&lt;/li&gt;
&lt;li&gt;[x] Add a &lt;code&gt;LIMIT&lt;/code&gt; when exploring&lt;/li&gt;
&lt;li&gt;[ ] Run &lt;code&gt;EXPLAIN ANALYZE&lt;/code&gt; to check the query plan&lt;/li&gt;
&lt;li&gt;[ ] Check indexes on &lt;code&gt;JOIN&lt;/code&gt; and &lt;code&gt;WHERE&lt;/code&gt; columns&lt;/li&gt;
&lt;li&gt;[ ] Peer-review complex queries before production&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  2. JOINs Demystified
&lt;/h2&gt;

&lt;p&gt;JOINs are the &lt;strong&gt;heart&lt;/strong&gt; of relational databases. If you understand them deeply, you understand SQL.&lt;/p&gt;

&lt;h3&gt;
  
  
  The Main JOIN Types
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;JOIN Type&lt;/th&gt;
&lt;th&gt;What It Returns&lt;/th&gt;
&lt;th&gt;Use Case&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;INNER JOIN&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Rows matching in &lt;strong&gt;both&lt;/strong&gt; tables&lt;/td&gt;
&lt;td&gt;Most common join&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;LEFT JOIN&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;All left rows + matched right rows&lt;/td&gt;
&lt;td&gt;Find unmatched records&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;RIGHT JOIN&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;All right rows + matched left rows&lt;/td&gt;
&lt;td&gt;Rare — prefer LEFT JOIN&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;FULL OUTER JOIN&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;All rows from both tables&lt;/td&gt;
&lt;td&gt;Complete data comparison&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;CROSS JOIN&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Cartesian product of both tables&lt;/td&gt;
&lt;td&gt;Combinations/permutations&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;SELF JOIN&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;A table joined to itself&lt;/td&gt;
&lt;td&gt;Hierarchies, org charts&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  Visual Mental Model
&lt;/h3&gt;

&lt;p&gt;Think of two overlapping circles:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;INNER JOIN&lt;/strong&gt; → &lt;em&gt;only&lt;/em&gt; the overlap&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;LEFT JOIN&lt;/strong&gt; → entire left circle + overlap&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;FULL OUTER JOIN&lt;/strong&gt; → both circles entirely&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  A Common Real-World Example
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- Find customers who have NEVER placed an order&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="k"&gt;c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;email&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;customers&lt;/span&gt; &lt;span class="k"&gt;c&lt;/span&gt;
&lt;span class="k"&gt;LEFT&lt;/span&gt; &lt;span class="k"&gt;JOIN&lt;/span&gt; &lt;span class="n"&gt;orders&lt;/span&gt; &lt;span class="n"&gt;o&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="k"&gt;c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;o&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;customer_id&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;o&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;customer_id&lt;/span&gt; &lt;span class="k"&gt;IS&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;blockquote&gt;
&lt;p&gt;The trick of using &lt;code&gt;LEFT JOIN ... WHERE right_table.id IS NULL&lt;/code&gt; to find &lt;em&gt;non-matching&lt;/em&gt; rows is one of the most &lt;strong&gt;elegant patterns in SQL&lt;/strong&gt;. Learn it. Use it often.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  The SELF JOIN — More Useful Than You Think
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- Find employees and their managers (from the same table)&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; 
    &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;employee&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;m&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;manager&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;employees&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;
&lt;span class="k"&gt;LEFT&lt;/span&gt; &lt;span class="k"&gt;JOIN&lt;/span&gt; &lt;span class="n"&gt;employees&lt;/span&gt; &lt;span class="n"&gt;m&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;manager_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;m&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  3. Aggregation and GROUP BY
&lt;/h2&gt;

&lt;p&gt;Aggregation is where SQL transforms raw rows into &lt;strong&gt;&lt;em&gt;business intelligence&lt;/em&gt;&lt;/strong&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  Core Aggregate Functions
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt;
    &lt;span class="n"&gt;department&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="k"&gt;COUNT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;                    &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;total_staff&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="k"&gt;AVG&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;salary&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;                 &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;avg_salary&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="k"&gt;MAX&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;salary&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;                 &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;highest_salary&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="k"&gt;MIN&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;salary&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;                 &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;lowest_salary&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="k"&gt;SUM&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;salary&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;                 &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;total_payroll&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="k"&gt;COUNT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;DISTINCT&lt;/span&gt; &lt;span class="n"&gt;job_title&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;   &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;unique_roles&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;employees&lt;/span&gt;
&lt;span class="k"&gt;GROUP&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;department&lt;/span&gt;
&lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;total_payroll&lt;/span&gt; &lt;span class="k"&gt;DESC&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  The Golden Rule of GROUP BY
&lt;/h3&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Every column in &lt;code&gt;SELECT&lt;/code&gt; must &lt;em&gt;either&lt;/em&gt; be in &lt;code&gt;GROUP BY&lt;/code&gt; &lt;em&gt;or&lt;/em&gt; be inside an aggregate function.&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h4&gt;
  
  
  Wrong vs Right
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;--  WRONG — 'name' is neither grouped nor aggregated&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;department&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;COUNT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;employees&lt;/span&gt;
&lt;span class="k"&gt;GROUP&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;department&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="c1"&gt;--  CORRECT&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;department&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;COUNT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;headcount&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;employees&lt;/span&gt;
&lt;span class="k"&gt;GROUP&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;department&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Filtering Groups with HAVING
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- Only departments with more than 10 employees earning above $50,000&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;department&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;COUNT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;headcount&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;employees&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;salary&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;50000&lt;/span&gt;          &lt;span class="c1"&gt;-- filters rows BEFORE grouping&lt;/span&gt;
&lt;span class="k"&gt;GROUP&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;department&lt;/span&gt;
&lt;span class="k"&gt;HAVING&lt;/span&gt; &lt;span class="k"&gt;COUNT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;          &lt;span class="c1"&gt;-- filters groups AFTER aggregation&lt;/span&gt;
&lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;headcount&lt;/span&gt; &lt;span class="k"&gt;DESC&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  4. Window Functions — SQL's Hidden Superpower
&lt;/h2&gt;

&lt;p&gt;If there is &lt;strong&gt;one&lt;/strong&gt; feature that separates &lt;em&gt;SQL beginners&lt;/em&gt; from &lt;strong&gt;&lt;em&gt;SQL professionals&lt;/em&gt;&lt;/strong&gt;, it is window functions.&lt;/p&gt;

&lt;p&gt;Window functions let you perform calculations across related rows — &lt;strong&gt;without collapsing&lt;/strong&gt; the result like &lt;code&gt;GROUP BY&lt;/code&gt; does.&lt;/p&gt;

&lt;h3&gt;
  
  
  The Syntax
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="n"&gt;function_name&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="n"&gt;OVER&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="k"&gt;PARTITION&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="k"&gt;column&lt;/span&gt;     &lt;span class="c1"&gt;-- Optional: group rows into windows&lt;/span&gt;
    &lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="k"&gt;column&lt;/span&gt;         &lt;span class="c1"&gt;-- Optional: define order within the window&lt;/span&gt;
    &lt;span class="k"&gt;ROWS&lt;/span&gt; &lt;span class="k"&gt;BETWEEN&lt;/span&gt; &lt;span class="p"&gt;...&lt;/span&gt;        &lt;span class="c1"&gt;-- Optional: define the window frame&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Ranking Functions
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt;
    &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;department&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;salary&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;RANK&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;        &lt;span class="n"&gt;OVER&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;PARTITION&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;department&lt;/span&gt; &lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;salary&lt;/span&gt; &lt;span class="k"&gt;DESC&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;rank&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;DENSE_RANK&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;  &lt;span class="n"&gt;OVER&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;PARTITION&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;department&lt;/span&gt; &lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;salary&lt;/span&gt; &lt;span class="k"&gt;DESC&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;dense_rank&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;ROW_NUMBER&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;  &lt;span class="n"&gt;OVER&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;PARTITION&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;department&lt;/span&gt; &lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;salary&lt;/span&gt; &lt;span class="k"&gt;DESC&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;row_num&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;employees&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Difference Between Ranking Functions
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Function&lt;/th&gt;
&lt;th&gt;Handles Ties&lt;/th&gt;
&lt;th&gt;Example Output&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;ROW_NUMBER()&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Always unique&lt;/td&gt;
&lt;td&gt;1, 2, 3, 4&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;RANK()&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Skips after tie&lt;/td&gt;
&lt;td&gt;1, 1, 3, 4&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;DENSE_RANK()&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;No skip after tie&lt;/td&gt;
&lt;td&gt;1, 1, 2, 3&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  Running Totals and Moving Averages
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- Running total of sales with a 7-day moving average&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt;
    &lt;span class="n"&gt;sale_date&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;amount&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="k"&gt;SUM&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;amount&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;OVER&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;sale_date&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;running_total&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="k"&gt;AVG&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;amount&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;OVER&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;sale_date&lt;/span&gt;
        &lt;span class="k"&gt;ROWS&lt;/span&gt; &lt;span class="k"&gt;BETWEEN&lt;/span&gt; &lt;span class="mi"&gt;6&lt;/span&gt; &lt;span class="k"&gt;PRECEDING&lt;/span&gt; &lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="k"&gt;CURRENT&lt;/span&gt; &lt;span class="k"&gt;ROW&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;moving_7day_avg&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;sales&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  LAG and LEAD — Comparing to Neighboring Rows
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- Month-over-month revenue growth&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt;
    &lt;span class="k"&gt;month&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;revenue&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;LAG&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;revenue&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  &lt;span class="n"&gt;OVER&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="k"&gt;month&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;prev_month&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;LEAD&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;revenue&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;OVER&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="k"&gt;month&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;next_month&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;revenue&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;LAG&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;revenue&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;OVER&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="k"&gt;month&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;growth&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;monthly_revenue&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Window functions are the reason SQL is still king for analytics.&lt;/strong&gt; No other tool lets you express this kind of computation as cleanly.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  5. Subqueries vs CTEs
&lt;/h2&gt;

&lt;h3&gt;
  
  
  The Subquery — Powerful but Messy at Scale
&lt;/h3&gt;

&lt;p&gt;A query &lt;em&gt;nested inside&lt;/em&gt; another query:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;total_spent&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;customer_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;SUM&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;amount&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;total_spent&lt;/span&gt;
    &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;orders&lt;/span&gt;
    &lt;span class="k"&gt;GROUP&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;customer_id&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;customer_totals&lt;/span&gt;
&lt;span class="k"&gt;JOIN&lt;/span&gt; &lt;span class="n"&gt;customers&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;customers&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;customer_totals&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;customer_id&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;total_spent&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="k"&gt;AVG&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;total_spent&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="k"&gt;SUM&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;amount&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;total_spent&lt;/span&gt;
        &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;orders&lt;/span&gt;
        &lt;span class="k"&gt;GROUP&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;customer_id&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;sub&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This works — but grows &lt;em&gt;unreadable&lt;/em&gt; fast.&lt;/p&gt;

&lt;h3&gt;
  
  
  The CTE (Common Table Expression) — Clean, Named, Reusable
&lt;/h3&gt;

&lt;p&gt;CTEs let you &lt;strong&gt;name intermediate results&lt;/strong&gt; using the &lt;code&gt;WITH&lt;/code&gt; keyword:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;WITH&lt;/span&gt; &lt;span class="n"&gt;customer_totals&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;customer_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;SUM&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;amount&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;total_spent&lt;/span&gt;
    &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;orders&lt;/span&gt;
    &lt;span class="k"&gt;GROUP&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;customer_id&lt;/span&gt;
&lt;span class="p"&gt;),&lt;/span&gt;
&lt;span class="n"&gt;avg_spend&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="k"&gt;AVG&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;total_spent&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;avg_val&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;customer_totals&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; 
    &lt;span class="k"&gt;c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; 
    &lt;span class="n"&gt;ct&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;total_spent&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;customers&lt;/span&gt; &lt;span class="k"&gt;c&lt;/span&gt;
&lt;span class="k"&gt;JOIN&lt;/span&gt; &lt;span class="n"&gt;customer_totals&lt;/span&gt; &lt;span class="n"&gt;ct&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="k"&gt;c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;ct&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;customer_id&lt;/span&gt;
&lt;span class="k"&gt;CROSS&lt;/span&gt; &lt;span class="k"&gt;JOIN&lt;/span&gt; &lt;span class="n"&gt;avg_spend&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;ct&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;total_spent&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;avg_spend&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;avg_val&lt;/span&gt;
&lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;ct&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;total_spent&lt;/span&gt; &lt;span class="k"&gt;DESC&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Prefer CTEs&lt;/strong&gt; over nested subqueries for anything beyond one level. Your teammates — &lt;em&gt;and your future self&lt;/em&gt; — will thank you.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  Recursive CTEs — Conquering Hierarchical Data
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- Traverse an employee org chart tree&lt;/span&gt;
&lt;span class="k"&gt;WITH&lt;/span&gt; &lt;span class="k"&gt;RECURSIVE&lt;/span&gt; &lt;span class="n"&gt;org_chart&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="c1"&gt;-- Base case: the CEO (no manager)&lt;/span&gt;
    &lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;manager_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;depth&lt;/span&gt;
    &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;employees&lt;/span&gt;
    &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;manager_id&lt;/span&gt; &lt;span class="k"&gt;IS&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;

    &lt;span class="k"&gt;UNION&lt;/span&gt; &lt;span class="k"&gt;ALL&lt;/span&gt;

    &lt;span class="c1"&gt;-- Recursive case: employees joined to their managers&lt;/span&gt;
    &lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;manager_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;oc&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;depth&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
    &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;employees&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;
    &lt;span class="k"&gt;JOIN&lt;/span&gt; &lt;span class="n"&gt;org_chart&lt;/span&gt; &lt;span class="n"&gt;oc&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;manager_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;oc&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;depth&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;org_chart&lt;/span&gt; &lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;depth&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  When to Use Each
&lt;/h4&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Subquery&lt;/strong&gt; — one-off, simple, single-level nesting&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;CTE&lt;/strong&gt; — multi-step logic, readable complex queries&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Recursive CTE&lt;/strong&gt; — trees, graphs, hierarchies&lt;/li&gt;
&lt;/ol&gt;




&lt;h2&gt;
  
  
  6. Indexing — Why Your Queries Are Slow
&lt;/h2&gt;

&lt;p&gt;No SQL article is complete without talking about &lt;strong&gt;performance&lt;/strong&gt;. And no performance topic matters more than &lt;strong&gt;&lt;em&gt;indexing&lt;/em&gt;&lt;/strong&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  What Is an Index?
&lt;/h3&gt;

&lt;p&gt;An index is a &lt;strong&gt;separate data structure&lt;/strong&gt; (usually a B-tree) that lets the database find rows &lt;em&gt;without scanning every record&lt;/em&gt;. Think of it as the index in the back of a textbook.&lt;/p&gt;

&lt;h3&gt;
  
  
  The Speed Difference Is Staggering
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- Without index — scanning 50 million rows&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;customers&lt;/span&gt; &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;email&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'alice@example.com'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="c1"&gt;-- ⏱ Time: 8.4 seconds&lt;/span&gt;

&lt;span class="c1"&gt;-- Create the index&lt;/span&gt;
&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;INDEX&lt;/span&gt; &lt;span class="n"&gt;idx_customers_email&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;customers&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;email&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="c1"&gt;-- With index — direct lookup&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;customers&lt;/span&gt; &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;email&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'alice@example.com'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="c1"&gt;-- ⏱ Time: 0.003 seconds&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  When to Create an Index
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Columns used frequently in &lt;code&gt;WHERE&lt;/code&gt; clauses&lt;/li&gt;
&lt;li&gt;Columns used in &lt;code&gt;JOIN&lt;/code&gt; conditions&lt;/li&gt;
&lt;li&gt;Columns used in &lt;code&gt;ORDER BY&lt;/code&gt; or &lt;code&gt;GROUP BY&lt;/code&gt; on large tables&lt;/li&gt;
&lt;li&gt;Foreign key columns&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  When Indexes &lt;em&gt;Hurt&lt;/em&gt;
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Columns with very low cardinality (e.g., &lt;code&gt;is_active&lt;/code&gt; = TRUE/FALSE only)&lt;/li&gt;
&lt;li&gt;Tables written to far more often than they're read&lt;/li&gt;
&lt;li&gt;Too many indexes slow down &lt;code&gt;INSERT&lt;/code&gt;, &lt;code&gt;UPDATE&lt;/code&gt;, and &lt;code&gt;DELETE&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  EXPLAIN — Your Best Debugging Friend
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;EXPLAIN&lt;/span&gt; &lt;span class="k"&gt;ANALYZE&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="k"&gt;c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;COUNT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;o&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;order_count&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;customers&lt;/span&gt; &lt;span class="k"&gt;c&lt;/span&gt;
&lt;span class="k"&gt;JOIN&lt;/span&gt; &lt;span class="n"&gt;orders&lt;/span&gt; &lt;span class="n"&gt;o&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="k"&gt;c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;o&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;customer_id&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="k"&gt;c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;country&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'Kenya'&lt;/span&gt;
&lt;span class="k"&gt;GROUP&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="k"&gt;c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;blockquote&gt;
&lt;p&gt;Use &lt;code&gt;EXPLAIN ANALYZE&lt;/code&gt; to see whether the database does a full table scan (&lt;code&gt;Seq Scan&lt;/code&gt;) or uses an index (&lt;code&gt;Index Scan&lt;/code&gt;). &lt;strong&gt;Never optimize blindly.&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  7. Transactions and ACID
&lt;/h2&gt;

&lt;p&gt;Every time you move money, book a seat, or submit an order — you're trusting &lt;strong&gt;transactions&lt;/strong&gt; to make sure nothing breaks mid-operation.&lt;/p&gt;

&lt;h3&gt;
  
  
  The ACID Properties
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Property&lt;/th&gt;
&lt;th&gt;Meaning&lt;/th&gt;
&lt;th&gt;Real-World Guarantee&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;
&lt;strong&gt;A&lt;/strong&gt;tomicity&lt;/td&gt;
&lt;td&gt;All-or-nothing&lt;/td&gt;
&lt;td&gt;No partial transfers&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;
&lt;strong&gt;C&lt;/strong&gt;onsistency&lt;/td&gt;
&lt;td&gt;Valid state → valid state&lt;/td&gt;
&lt;td&gt;Rules always enforced&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;
&lt;strong&gt;I&lt;/strong&gt;solation&lt;/td&gt;
&lt;td&gt;Concurrent transactions don't interfere&lt;/td&gt;
&lt;td&gt;No dirty reads&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;
&lt;strong&gt;D&lt;/strong&gt;urability&lt;/td&gt;
&lt;td&gt;Committed data survives crashes&lt;/td&gt;
&lt;td&gt;Your data is safe&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  A Bank Transfer — The Classic Example
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;BEGIN&lt;/span&gt; &lt;span class="n"&gt;TRANSACTION&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="c1"&gt;-- Step 1: Deduct from sender&lt;/span&gt;
&lt;span class="k"&gt;UPDATE&lt;/span&gt; &lt;span class="n"&gt;accounts&lt;/span&gt; &lt;span class="k"&gt;SET&lt;/span&gt; &lt;span class="n"&gt;balance&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;balance&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="mi"&gt;1000&lt;/span&gt; &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;user_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'alice'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="c1"&gt;-- Step 2: Add to receiver&lt;/span&gt;
&lt;span class="k"&gt;UPDATE&lt;/span&gt; &lt;span class="n"&gt;accounts&lt;/span&gt; &lt;span class="k"&gt;SET&lt;/span&gt; &lt;span class="n"&gt;balance&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;balance&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mi"&gt;1000&lt;/span&gt; &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;user_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'bob'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="c1"&gt;-- If both succeed:&lt;/span&gt;
&lt;span class="k"&gt;COMMIT&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="c1"&gt;-- If anything fails:&lt;/span&gt;
&lt;span class="c1"&gt;-- ROLLBACK;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Without transactions:&lt;/strong&gt; a server crash between Step 1 and Step 2 means Alice &lt;em&gt;loses&lt;/em&gt; money that Bob &lt;em&gt;never receives&lt;/em&gt;.&lt;br&gt;&lt;br&gt;
&lt;strong&gt;With transactions:&lt;/strong&gt; either both steps happen, or neither does.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  8. 10 SQL Mistakes Hurting Your Performance
&lt;/h2&gt;

&lt;h3&gt;
  
  
  &lt;del&gt;SELECT *&lt;/del&gt; — Never in Production
&lt;/h3&gt;

&lt;p&gt;Always name the columns you need. &lt;code&gt;SELECT *&lt;/code&gt; fetches &lt;em&gt;unnecessary&lt;/em&gt; data, breaks when schemas change, and prevents index-only scans.&lt;/p&gt;

&lt;h3&gt;
  
  
  The Top 10 Offenders
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;SELECT *&lt;/code&gt; in production code&lt;/strong&gt; — fetch only what you need&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Not filtering before joining&lt;/strong&gt; — &lt;code&gt;WHERE&lt;/code&gt; early, not after millions of rows are joined&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Functions on indexed columns in WHERE&lt;/strong&gt;
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;   &lt;span class="c1"&gt;--  Kills the index&lt;/span&gt;
   &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="nb"&gt;YEAR&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;created_at&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;2024&lt;/span&gt;

   &lt;span class="c1"&gt;--  Preserves the index&lt;/span&gt;
   &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;created_at&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="s1"&gt;'2024-01-01'&lt;/span&gt; &lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="n"&gt;created_at&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="s1"&gt;'2025-01-01'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;HAVING&lt;/code&gt; for non-aggregated filters&lt;/strong&gt; — use &lt;code&gt;WHERE&lt;/code&gt; before grouping&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;OR&lt;/code&gt; on indexed columns&lt;/strong&gt; — can prevent index use; try &lt;code&gt;UNION ALL&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;NOT IN&lt;/code&gt; with NULLs&lt;/strong&gt; — returns &lt;em&gt;no rows&lt;/em&gt; if subquery has a NULL; use &lt;code&gt;NOT EXISTS&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Missing indexes on foreign keys&lt;/strong&gt; — always index FK columns used in JOINs&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;N+1 queries&lt;/strong&gt; — never query inside a loop; use JOINs or batch queries&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Implicit type conversions&lt;/strong&gt; — comparing &lt;code&gt;VARCHAR&lt;/code&gt; to integers forces a cast on every row&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Forgetting &lt;code&gt;LIMIT&lt;/code&gt; while exploring&lt;/strong&gt; — 3 rows in dev can be 30 million in prod&lt;/li&gt;
&lt;/ol&gt;




&lt;h2&gt;
  
  
  9. Advanced SQL Patterns
&lt;/h2&gt;

&lt;h3&gt;
  
  
  UPSERT — Insert or Update in One Statement
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- PostgreSQL syntax&lt;/span&gt;
&lt;span class="k"&gt;INSERT&lt;/span&gt; &lt;span class="k"&gt;INTO&lt;/span&gt; &lt;span class="n"&gt;users&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;email&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;login_count&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;VALUES&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;42&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'alice@example.com'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;CONFLICT&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;DO&lt;/span&gt; &lt;span class="k"&gt;UPDATE&lt;/span&gt;
&lt;span class="k"&gt;SET&lt;/span&gt; &lt;span class="n"&gt;login_count&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;users&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;login_count&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;last_seen&lt;/span&gt;   &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;NOW&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Pivot with CASE WHEN
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- Turn rows into columns&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt;
    &lt;span class="n"&gt;product&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="k"&gt;SUM&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;CASE&lt;/span&gt; &lt;span class="k"&gt;WHEN&lt;/span&gt; &lt;span class="n"&gt;quarter&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'Q1'&lt;/span&gt; &lt;span class="k"&gt;THEN&lt;/span&gt; &lt;span class="n"&gt;revenue&lt;/span&gt; &lt;span class="k"&gt;ELSE&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt; &lt;span class="k"&gt;END&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;Q1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="k"&gt;SUM&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;CASE&lt;/span&gt; &lt;span class="k"&gt;WHEN&lt;/span&gt; &lt;span class="n"&gt;quarter&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'Q2'&lt;/span&gt; &lt;span class="k"&gt;THEN&lt;/span&gt; &lt;span class="n"&gt;revenue&lt;/span&gt; &lt;span class="k"&gt;ELSE&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt; &lt;span class="k"&gt;END&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;Q2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="k"&gt;SUM&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;CASE&lt;/span&gt; &lt;span class="k"&gt;WHEN&lt;/span&gt; &lt;span class="n"&gt;quarter&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'Q3'&lt;/span&gt; &lt;span class="k"&gt;THEN&lt;/span&gt; &lt;span class="n"&gt;revenue&lt;/span&gt; &lt;span class="k"&gt;ELSE&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt; &lt;span class="k"&gt;END&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;Q3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="k"&gt;SUM&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;CASE&lt;/span&gt; &lt;span class="k"&gt;WHEN&lt;/span&gt; &lt;span class="n"&gt;quarter&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'Q4'&lt;/span&gt; &lt;span class="k"&gt;THEN&lt;/span&gt; &lt;span class="n"&gt;revenue&lt;/span&gt; &lt;span class="k"&gt;ELSE&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt; &lt;span class="k"&gt;END&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;Q4&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;sales&lt;/span&gt;
&lt;span class="k"&gt;GROUP&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;product&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Find Duplicates
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;email&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;COUNT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;occurrences&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;users&lt;/span&gt;
&lt;span class="k"&gt;GROUP&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;email&lt;/span&gt;
&lt;span class="k"&gt;HAVING&lt;/span&gt; &lt;span class="k"&gt;COUNT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
&lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;occurrences&lt;/span&gt; &lt;span class="k"&gt;DESC&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Gap Detection — Finding Missing IDs
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;id&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;gap_start&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;orders&lt;/span&gt; &lt;span class="n"&gt;o&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;EXISTS&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;orders&lt;/span&gt; &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;o&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="n"&gt;id&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="k"&gt;MAX&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;orders&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Percentile Calculation
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt;
    &lt;span class="n"&gt;department&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;PERCENTILE_CONT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;WITHIN&lt;/span&gt; &lt;span class="k"&gt;GROUP&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;salary&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;median_salary&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;PERCENTILE_CONT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;9&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;WITHIN&lt;/span&gt; &lt;span class="k"&gt;GROUP&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;salary&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;p90_salary&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;employees&lt;/span&gt;
&lt;span class="k"&gt;GROUP&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;department&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  10. The Future of SQL
&lt;/h2&gt;

&lt;p&gt;Far from fading, SQL is &lt;strong&gt;evolving aggressively&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt; &lt;strong&gt;DuckDB&lt;/strong&gt; — &lt;em&gt;in-process&lt;/em&gt; analytics SQL that runs on your laptop with serious performance&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;Snowflake / BigQuery / Redshift&lt;/strong&gt; — cloud data warehouses where SQL scales to &lt;em&gt;petabytes&lt;/em&gt;
&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;dbt (data build tool)&lt;/strong&gt; — SQL as software engineering, with version control and testing&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;SQL on JSON&lt;/strong&gt; — modern databases now query deeply nested JSON &lt;em&gt;natively&lt;/em&gt;
&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;AI-assisted SQL&lt;/strong&gt; — LLMs generate SQL, but knowing it deeply makes you 10x better at &lt;em&gt;validating&lt;/em&gt; that output&lt;/li&gt;
&lt;/ul&gt;

&lt;blockquote&gt;
&lt;p&gt;SQL will be the &lt;strong&gt;&lt;em&gt;last&lt;/em&gt;&lt;/strong&gt; programming language to die.&lt;br&gt;&lt;br&gt;
And it won't.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  Your SQL Learning Roadmap
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Beginner
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;[ ] SELECT, FROM, WHERE basics&lt;/li&gt;
&lt;li&gt;[ ] INNER JOIN and LEFT JOIN&lt;/li&gt;
&lt;li&gt;[ ] GROUP BY and aggregate functions&lt;/li&gt;
&lt;li&gt;[ ] ORDER BY and LIMIT&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Intermediate
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;[ ] All JOIN types&lt;/li&gt;
&lt;li&gt;[ ] Subqueries and CTEs&lt;/li&gt;
&lt;li&gt;[ ] HAVING vs WHERE&lt;/li&gt;
&lt;li&gt;[ ] NULL handling&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Advanced
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;[ ] Window functions (RANK, LAG, LEAD, running totals)&lt;/li&gt;
&lt;li&gt;[ ] Recursive CTEs&lt;/li&gt;
&lt;li&gt;[ ] Indexing strategy and EXPLAIN&lt;/li&gt;
&lt;li&gt;[ ] Transactions and isolation levels&lt;/li&gt;
&lt;li&gt;[ ] Query optimization techniques&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;SQL is not just a tool. It's a &lt;strong&gt;way of thinking about data&lt;/strong&gt; — relationally, declaratively, and &lt;em&gt;powerfully&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;The developers and analysts who master SQL don't just write better queries. They:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Think more &lt;em&gt;clearly&lt;/em&gt; about data problems&lt;/li&gt;
&lt;li&gt;Debug &lt;em&gt;faster&lt;/em&gt;
&lt;/li&gt;
&lt;li&gt;Communicate &lt;em&gt;better&lt;/em&gt; across engineering, product, and business teams&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The language is &lt;strong&gt;50+ years old&lt;/strong&gt; and still deeply relevant. That is not an accident — it's &lt;strong&gt;&lt;em&gt;proof&lt;/em&gt;&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Start writing better SQL today. The data is waiting.&lt;/strong&gt;&lt;/p&gt;




</description>
    </item>
    <item>
      <title>TAMING DATA CHAOS IN POWER BI: A Guide to Joins, Relationships, and Schemas</title>
      <dc:creator>Collins Njeru</dc:creator>
      <pubDate>Sun, 29 Mar 2026 19:07:40 +0000</pubDate>
      <link>https://dev.to/cnew_aerospace_85c7b7d3cb/taming-data-chaos-in-power-bi-a-guide-to-joins-relationships-and-schemas-53me</link>
      <guid>https://dev.to/cnew_aerospace_85c7b7d3cb/taming-data-chaos-in-power-bi-a-guide-to-joins-relationships-and-schemas-53me</guid>
      <description>&lt;p&gt;Data modeling is the backbone of effective analytics in Power BI. It defines how tables connect, interact, and provide meaningful insights. Without a proper model, even the most advanced visuals can mislead. This article explores SQL joins, Power BI relationships, schemas, and common modeling practices using a &lt;strong&gt;customer dataset&lt;/strong&gt; as an example.&lt;/p&gt;




&lt;h2&gt;
  
  
  What is Data Modeling?
&lt;/h2&gt;

&lt;p&gt;Data modeling is the process of structuring data to represent real-world entities and their relationships. In Power BI, this involves:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Tables&lt;/strong&gt;: Fact tables (transactions, metrics) and Dimension tables (descriptive attributes).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Relationships&lt;/strong&gt;: Logical connections between tables.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Schemas&lt;/strong&gt;: The overall design of how tables are organized.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;A well-designed model ensures that filters, measures, and visuals behave as expected. Poor modeling often leads to incorrect totals, duplicated counts, or slow performance.&lt;/p&gt;




&lt;h2&gt;
  
  
  Example Dataset
&lt;/h2&gt;

&lt;p&gt;We’ll use two simple tables:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Customers&lt;/strong&gt;: &lt;code&gt;CustomerID&lt;/code&gt;, &lt;code&gt;Name&lt;/code&gt;, &lt;code&gt;Region&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Orders&lt;/strong&gt;: &lt;code&gt;OrderID&lt;/code&gt;, &lt;code&gt;CustomerID&lt;/code&gt;, &lt;code&gt;OrderDate&lt;/code&gt;, &lt;code&gt;Amount&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This dataset is small, but it illustrates the principles that scale to enterprise-level models.&lt;/p&gt;




&lt;h2&gt;
  
  
  SQL Joins Explained
&lt;/h2&gt;

&lt;p&gt;Joins combine data from multiple tables based on a common key. In Power BI, joins are performed in &lt;strong&gt;Power Query&lt;/strong&gt; using &lt;em&gt;Merge Queries&lt;/em&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  1. INNER JOIN
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Definition&lt;/strong&gt;: Returns rows with matching keys in both tables.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Example&lt;/strong&gt;: Customers who placed orders.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Diagram (ASCII)&lt;/strong&gt;:&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Use Case&lt;/strong&gt;: Useful when analyzing only active customers.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  2. LEFT JOIN
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Definition&lt;/strong&gt;: Returns all rows from the left table and matching rows from the right.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Example&lt;/strong&gt;: All customers, with orders if they exist.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Use Case&lt;/strong&gt;: Identify customers who have not placed orders.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  3. RIGHT JOIN
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Definition&lt;/strong&gt;: Returns all rows from the right table and matching rows from the left.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Example&lt;/strong&gt;: All orders, with customer details if available.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Use Case&lt;/strong&gt;: Ensures no order is excluded even if customer data is missing.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  4. FULL OUTER JOIN
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Definition&lt;/strong&gt;: Returns all rows when there is a match in either table.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Example&lt;/strong&gt;: All customers and all orders, matched where possible.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Use Case&lt;/strong&gt;: Data reconciliation across systems.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  5. LEFT ANTI JOIN
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Definition&lt;/strong&gt;: Returns rows from the left table that have no match in the right.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Example&lt;/strong&gt;: Customers who never placed an order.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Use Case&lt;/strong&gt;: Marketing campaigns targeting inactive customers.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  6. RIGHT ANTI JOIN
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Definition&lt;/strong&gt;: Returns rows from the right table that have no match in the left.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Example&lt;/strong&gt;: Orders without a customer record.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Use Case&lt;/strong&gt;: Detecting data quality issues.&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Power BI Relationships
&lt;/h2&gt;

&lt;p&gt;Relationships define how tables interact in the &lt;strong&gt;Model View&lt;/strong&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  Types of Relationships
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;One-to-Many (1:M)&lt;/strong&gt;: One customer → many orders. Most common.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Many-to-Many (M:M)&lt;/strong&gt;: Both sides can have multiple matches. Requires bridge tables.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;One-to-One (1:1)&lt;/strong&gt;: Rare. One employee → one profile.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Cardinality
&lt;/h3&gt;

&lt;p&gt;Cardinality defines the uniqueness of values in a relationship. For example, &lt;code&gt;CustomerID&lt;/code&gt; is unique in &lt;code&gt;Customers&lt;/code&gt; but repeats in &lt;code&gt;Orders&lt;/code&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  Active vs Inactive Relationships
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Active&lt;/strong&gt;: Default relationship used in visuals.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Inactive&lt;/strong&gt;: Can be activated using DAX functions like:
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;CALCULATE(
    SUM(Orders[Amount]),
    USERELATIONSHIP(Customers[CustomerID], Orders[CustomerID])
)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Cross-Filter Direction
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Single&lt;/strong&gt;: Filters flow one way (e.g., Customers → Orders).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Both&lt;/strong&gt;: Filters flow both ways, useful for complex models but can cause ambiguity.&lt;/li&gt;
&lt;/ul&gt;




&lt;h3&gt;
  
  
  Joins vs Relationships
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Joins&lt;/strong&gt;: Combine data during query (Power Query).&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Relationships&lt;/strong&gt;: Define logical connections in the data model (Model View).&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Think of joins as data preparation and relationships as data modeling. Both are essential, but they serve different purposes&lt;/strong&gt;.&lt;/p&gt;




&lt;h3&gt;
  
  
  Fact vs Dimension Tables
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Fact Tables&lt;/strong&gt;: Contain metrics (sales, revenue). Example: Orders.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Dimension Tables&lt;/strong&gt;: Contain descriptive attributes (customer, product). Example: Customers.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Separating facts and dimensions improves clarity and performance. Facts answer “what happened?” while dimensions answer “who, what, when, where?&lt;/strong&gt;.&lt;/p&gt;




&lt;h2&gt;
  
  
  Schemas in Power BI
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1. Star Schema
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Structure&lt;/strong&gt;: Central fact table connected to dimension tables.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Use Case&lt;/strong&gt;: Best practice for performance and clarity.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Example&lt;/strong&gt;: Orders linked to Customers, Products, Dates**&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  2. Snowflake Schema
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Structure&lt;/strong&gt;: Dimensions normalized into multiple related tables.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Use Case&lt;/strong&gt;: When dimensions have hierarchical attributes.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Example&lt;/strong&gt;: Customers linked to Regions and Countries.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  3. Flat Table (DLAT)
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Structure&lt;/strong&gt;: All data in one table.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Use Case&lt;/strong&gt;: Quick prototypes, but poor for scalability.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Role-Playing Dimensions
&lt;/h2&gt;

&lt;p&gt;Sometimes the same dimension is used multiple times. Example:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Date Dimension&lt;/strong&gt;: Used for Order Date, Ship Date, and Delivery Date.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Solution&lt;/strong&gt;: Duplicate the dimension table and rename accordingly:,&lt;code&gt;Date_Order&lt;/code&gt;,&lt;code&gt;Date_Ship&lt;/code&gt;,&lt;code&gt;Date_Delivery&lt;/code&gt;&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;This avoids ambiguity and allows precise filtering&lt;/strong&gt;.&lt;/p&gt;




&lt;h3&gt;
  
  
  Common Modeling Issues
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Ambiguous relationships&lt;/strong&gt;: Multiple paths between tables can confuse filters.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Circular references&lt;/strong&gt;: Loops in relationships cause errors.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Performance bottlenecks&lt;/strong&gt;: Using flat tables or M:M relationships excessively.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Inactive filters&lt;/strong&gt;: Forgetting to activate relationships in DAX.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Step-by-Step in Power BI
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Load Data&lt;/strong&gt;: Import Customers and Orders.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Power Query Joins&lt;/strong&gt;: Use Merge Queries for SQL-style joins.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Model View&lt;/strong&gt;: Define relationships (CustomerID → CustomerID).&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Schema Design&lt;/strong&gt;: Organize into star or snowflake schemas.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Validate: Build visuals&lt;/strong&gt; (e.g., total sales by region) to confirm filters work.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Hands-On DAX Examples
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Total Sales by Region
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="n"&gt;Total&lt;/span&gt; &lt;span class="n"&gt;Sales&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;SUM&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Orders&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;Amount&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Customers Without Orders
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="n"&gt;Inactive&lt;/span&gt; &lt;span class="n"&gt;Customers&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt;
&lt;span class="n"&gt;CALCULATETABLE&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
&lt;span class="n"&gt;Customers&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="k"&gt;NOT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;RELATEDTABLE&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Orders&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Sales by Order Date vs Ship Date
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="n"&gt;Sales&lt;/span&gt; &lt;span class="k"&gt;by&lt;/span&gt; &lt;span class="n"&gt;Ship&lt;/span&gt; &lt;span class="nb"&gt;Date&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt;
&lt;span class="n"&gt;CALCULATE&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
&lt;span class="k"&gt;SUM&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Orders&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;Amount&lt;/span&gt;&lt;span class="p"&gt;]),&lt;/span&gt;
&lt;span class="n"&gt;USERELATIONSHIP&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Orders&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;ShipDate&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="nb"&gt;Date&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;Date&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h3&gt;
  
  
  Real-Life Use Cases
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Retail&lt;/strong&gt;: Identify customers who haven’t purchased recently (LEFT ANTI JOIN).&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Finance&lt;/strong&gt;: Reconcile transactions across systems (FULL OUTER JOIN).&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Logistics&lt;/strong&gt;: Track shipments using role-playing Date dimensions.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Marketing&lt;/strong&gt;: Segment customers by region and purchase behavior.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;




&lt;h3&gt;
  
  
  Conclusion
&lt;/h3&gt;

&lt;p&gt;Data modeling in Power BI is about clarity, efficiency, and accuracy. By mastering joins, relationships, schemas, and best practices, you ensure that your dashboards tell the right story. Whether you’re building a star schema or handling role-playing dimensions, thoughtful modeling is the key to reliable insights.&lt;/p&gt;

&lt;p&gt;With a clean model, you can confidently answer questions like:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Which regions have the highest sales?&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Which customers are inactive?&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;How do shipping delays affect revenue?&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Power BI provides the tools,your job is to design the model that makes the data speak.&lt;/p&gt;

</description>
      <category>webdev</category>
      <category>luxdevhq</category>
      <category>programming</category>
      <category>database</category>
    </item>
    <item>
      <title>LINUX AS THE NERVOUS SYSTEM OF DATA ENGINEERING</title>
      <dc:creator>Collins Njeru</dc:creator>
      <pubDate>Sat, 28 Mar 2026 21:26:57 +0000</pubDate>
      <link>https://dev.to/cnew_aerospace_85c7b7d3cb/linux-as-the-nervous-system-of-data-engineering-12hk</link>
      <guid>https://dev.to/cnew_aerospace_85c7b7d3cb/linux-as-the-nervous-system-of-data-engineering-12hk</guid>
      <description>&lt;p&gt;Data engineering is the backbone of modern data-driven organizations. It enables the collection, transformation, and delivery of data at scale. While tools like &lt;strong&gt;Apache Spark&lt;/strong&gt;, &lt;strong&gt;Hadoop&lt;/strong&gt;, and &lt;strong&gt;Kafka&lt;/strong&gt; are essential, the operating system powering these tools is equally critical.  &lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Linux&lt;/strong&gt; has emerged as the preferred OS due to its stability, scalability, flexibility, and open-source nature. This article explores Linux’s role in real-world data engineering, including essential skills, workflow management, tool integration, cloud deployment, and practical examples.&lt;/p&gt;




&lt;h2&gt;
  
  
  WHY LINUX DOMINATES DATA ENINEERING
&lt;/h2&gt;

&lt;p&gt;Linux has become the de facto standard for data engineers due to several key advantages:&lt;/p&gt;

&lt;h3&gt;
  
  
  Open-Source Flexibility
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Fully customizable for specific workloads
&lt;/li&gt;
&lt;li&gt;Kernel can be optimized for performance
&lt;/li&gt;
&lt;li&gt;Lightweight distributions work well for containerized workflows
&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Stability and Uptime
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Runs continuously with minimal downtime
&lt;/li&gt;
&lt;li&gt;Ideal for mission-critical production pipelines
&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Cost-Effectiveness
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Free to use, reducing infrastructure costs
&lt;/li&gt;
&lt;li&gt;Scales easily without expensive licenses
&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Community Support
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Extensive documentation, forums, and troubleshooting resources
&lt;/li&gt;
&lt;li&gt;Large community of contributors and developers&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  CORE LINUX SKILLS FOR DATA ENGINEERS
&lt;/h2&gt;




&lt;h3&gt;
  
  
  1. FILE SYSTEM NAVIGATION
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# List files&lt;/span&gt;
&lt;span class="nb"&gt;ls&lt;/span&gt;

&lt;span class="c"&gt;# Change directory&lt;/span&gt;
&lt;span class="nb"&gt;cd&lt;/span&gt; /path/to/directory

&lt;span class="c"&gt;# Show current working directory&lt;/span&gt;
&lt;span class="nb"&gt;pwd&lt;/span&gt;

&lt;span class="c"&gt;# Find files&lt;/span&gt;
find /path/to/search &lt;span class="nt"&gt;-name&lt;/span&gt; &lt;span class="s2"&gt;"dataset.csv"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h3&gt;
  
  
  2. PROCESS MANAGEMENT
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Show all running processes&lt;/span&gt;
ps aux

&lt;span class="c"&gt;# Monitor system resource usage&lt;/span&gt;
top

&lt;span class="c"&gt;# Kill a specific process&lt;/span&gt;
&lt;span class="nb"&gt;kill&lt;/span&gt; &amp;lt;pid&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h3&gt;
  
  
  3. SHELL SCRIPTING
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;#!/bin/bash&lt;/span&gt;
&lt;span class="c"&gt;# Download and process data&lt;/span&gt;
wget http://example.com/dataset.csv
python process_data.py dataset.csv
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h3&gt;
  
  
  4.Permission &amp;amp; Ownership
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Change file permissions&lt;/span&gt;
&lt;span class="nb"&gt;chmod &lt;/span&gt;755 my_file.txt

&lt;span class="c"&gt;# Change file ownership&lt;/span&gt;
&lt;span class="nb"&gt;chown &lt;/span&gt;user:group my_file.txt
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h3&gt;
  
  
  5.Package Management
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Install a package on Debian-based systems&lt;/span&gt;
&lt;span class="nb"&gt;sudo &lt;/span&gt;apt &lt;span class="nb"&gt;install &lt;/span&gt;package-name

&lt;span class="c"&gt;# Update all packages&lt;/span&gt;
&lt;span class="nb"&gt;sudo &lt;/span&gt;apt update &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="nb"&gt;sudo &lt;/span&gt;apt upgrade
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  lINUX IN DATA PIPELINES
&lt;/h2&gt;




&lt;h3&gt;
  
  
  1. Scheduling Tasks with Cron
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Edit cron jobs&lt;/span&gt;
crontab &lt;span class="nt"&gt;-e&lt;/span&gt;

&lt;span class="c"&gt;# Schedule a pipeline to run every hour&lt;/span&gt;
0 &lt;span class="k"&gt;*&lt;/span&gt; &lt;span class="k"&gt;*&lt;/span&gt; &lt;span class="k"&gt;*&lt;/span&gt; &lt;span class="k"&gt;*&lt;/span&gt; /home/user/data_pipeline.sh
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h3&gt;
  
  
  2. Automating ETL with Shell Scripts
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;#!/bin/bash&lt;/span&gt;
&lt;span class="c"&gt;# Download data&lt;/span&gt;
wget http://example.com/data.csv

&lt;span class="c"&gt;# Transform data&lt;/span&gt;
&lt;span class="nb"&gt;awk&lt;/span&gt; &lt;span class="nt"&gt;-F&lt;/span&gt;, &lt;span class="s1"&gt;'{print $1, $2, $3}'&lt;/span&gt; data.csv &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; transformed_data.csv

&lt;span class="c"&gt;# Load into PostgreSQL&lt;/span&gt;
psql &lt;span class="nt"&gt;-U&lt;/span&gt; user &lt;span class="nt"&gt;-d&lt;/span&gt; dbname &lt;span class="nt"&gt;-c&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\c&lt;/span&gt;&lt;span class="s2"&gt;opy my_table FROM transformed_data.csv WITH CSV"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h3&gt;
  
  
  3. Logging Pipeline Output
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;#!/bin/bash&lt;/span&gt;
&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;date&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;&lt;span class="s2"&gt;: Pipeline started"&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&amp;gt;&lt;/span&gt; /var/log/data_pipeline.log
python etl_script.py &lt;span class="o"&gt;&amp;gt;&amp;gt;&lt;/span&gt; /var/log/data_pipeline.log 2&amp;gt;&amp;amp;1
&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;date&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;&lt;span class="s2"&gt;: Pipeline finished"&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&amp;gt;&lt;/span&gt; /var/log/data_pipeline.log
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  INTEGRATION WITH DATA ENGINEERING TOOLS
&lt;/h2&gt;




&lt;h3&gt;
  
  
  1.Apache Hadoop
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Execute a Hadoop job&lt;/span&gt;
hadoop jar /usr/local/hadoop/hadoop-examples.jar wordcount /input /output
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h3&gt;
  
  
  2.Apache Kafka
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Start Zookeeper&lt;/span&gt;
bin/zookeeper-server-start.sh config/zookeeper.properties

&lt;span class="c"&gt;# Start Kafka broker&lt;/span&gt;
bin/kafka-server-start.sh config/server.properties

&lt;span class="c"&gt;# Produce messages&lt;/span&gt;
bin/kafka-console-producer.sh &lt;span class="nt"&gt;--topic&lt;/span&gt; my_topic &lt;span class="nt"&gt;--bootstrap-server&lt;/span&gt; localhost:9092

&lt;span class="c"&gt;# Consume messages&lt;/span&gt;
bin/kafka-console-consumer.sh &lt;span class="nt"&gt;--topic&lt;/span&gt; my_topic &lt;span class="nt"&gt;--from-beginning&lt;/span&gt; &lt;span class="nt"&gt;--bootstrap-server&lt;/span&gt; localhost:9092
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h3&gt;
  
  
  3.Apache Spark
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Submit a Spark job&lt;/span&gt;
spark-submit &lt;span class="nt"&gt;--master&lt;/span&gt; &lt;span class="nb"&gt;local&lt;/span&gt;&lt;span class="o"&gt;[&lt;/span&gt;4] etl_spark_job.py
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h3&gt;
  
  
  4.Docker &amp;amp; Kubernetes
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Build Docker image&lt;/span&gt;
docker build &lt;span class="nt"&gt;-t&lt;/span&gt; mydataengineerimage &lt;span class="nb"&gt;.&lt;/span&gt;

&lt;span class="c"&gt;# Run Docker container&lt;/span&gt;
docker run &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="nt"&gt;--name&lt;/span&gt; data_pipeline_container mydataengineerimage

&lt;span class="c"&gt;# Deploy Kubernetes resources&lt;/span&gt;
kubectl apply &lt;span class="nt"&gt;-f&lt;/span&gt; data_pipeline_deployment.yaml
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  LINUX IN CLOUD AND BIG DATA ENVIRONMENTS
&lt;/h2&gt;




&lt;h3&gt;
  
  
  1.Cloud Servers and Virtual Machines
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Launch Ubuntu VM on AWS&lt;/span&gt;
aws ec2 run-instances &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--image-id&lt;/span&gt; ami-0abcdef1234567890 &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--count&lt;/span&gt; 1 &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--instance-type&lt;/span&gt; t2.medium &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--key-name&lt;/span&gt; MyKeyPair &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--security-group-ids&lt;/span&gt; sg-0123456789abcdef0 &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--subnet-id&lt;/span&gt; subnet-6e7f829e
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h3&gt;
  
  
  2.Monitoring System Resources
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# CPU usage&lt;/span&gt;
top

&lt;span class="c"&gt;# Memory usage&lt;/span&gt;
free &lt;span class="nt"&gt;-h&lt;/span&gt;

&lt;span class="c"&gt;# Disk usage&lt;/span&gt;
&lt;span class="nb"&gt;df&lt;/span&gt; &lt;span class="nt"&gt;-h&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h3&gt;
  
  
  3.Debugging and Troubleshooting
&lt;/h3&gt;

&lt;h3&gt;
  
  
  Cheking logs
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# View system logs&lt;/span&gt;
&lt;span class="nb"&gt;tail&lt;/span&gt; &lt;span class="nt"&gt;-f&lt;/span&gt; /var/log/syslog

&lt;span class="c"&gt;# View pipeline logs&lt;/span&gt;
&lt;span class="nb"&gt;tail&lt;/span&gt; &lt;span class="nt"&gt;-f&lt;/span&gt; /var/log/data_pipeline.log
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h3&gt;
  
  
  Killing Stuck Processes
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Find process ID&lt;/span&gt;
ps aux | &lt;span class="nb"&gt;grep &lt;/span&gt;etl_script.py

&lt;span class="c"&gt;# Kill process&lt;/span&gt;
&lt;span class="nb"&gt;kill&lt;/span&gt; &lt;span class="nt"&gt;-9&lt;/span&gt; &amp;lt;pid&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  CHALLENGES OF USING LINUX IN DATA ENGINEERING
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Steep learning Curve&lt;/strong&gt; : Command-line usage can be intimidating for beginners&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Debugging Complexity&lt;/strong&gt; : Requires familiarity with logs,permisssions and processes&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Automation Dependency&lt;/strong&gt; : Heavy reliance on scripts and CLI tools&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  CONCLUSION
&lt;/h2&gt;

&lt;p&gt;Linux is essential for real-world data engineering. It provides the foundation for stable, scalable, and efficient data pipelines. By mastering Linux skills, data engineers can:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Build robust ETL pipelines&lt;/li&gt;
&lt;li&gt;Integrate seamlessly with Hadoop,spark and Kafka&lt;/li&gt;
&lt;li&gt;Deploy applications in cloud and containerized environments&lt;/li&gt;
&lt;li&gt;Monitor and Troubleshoot complex workflows&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;IN today's  data-driven world,linux is more than an operating system,it is a critical enabler of modern data engineering&lt;/strong&gt;&lt;/p&gt;

</description>
      <category>dataengineering</category>
      <category>luxdevhq</category>
      <category>harunmbaabu</category>
      <category>deeplearning</category>
    </item>
    <item>
      <title>Should you join Data Engineering?A guide to the tools you'll use</title>
      <dc:creator>Collins Njeru</dc:creator>
      <pubDate>Mon, 16 Mar 2026 12:11:50 +0000</pubDate>
      <link>https://dev.to/cnew_aerospace_85c7b7d3cb/should-you-join-data-engineeringa-guide-to-the-tools-youll-use-3g9a</link>
      <guid>https://dev.to/cnew_aerospace_85c7b7d3cb/should-you-join-data-engineeringa-guide-to-the-tools-youll-use-3g9a</guid>
      <description>&lt;h1&gt;
  
  
  Introduction
&lt;/h1&gt;

&lt;p&gt;Many aspiring technologists find themselves at a crossroad:&lt;em&gt;is data engineering the right career path for me&lt;/em&gt;.The hesitation often comes from uncertainty about the tools and technologies involved. This article breaks down the core categories of data engineering tools, giving you a clear picture of what you’ll be working with if you decide to join the field.&lt;/p&gt;

&lt;h2&gt;
  
  
  Core categories of data engineering tools
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1.Data ingestion &amp;amp; Integration
&lt;/h3&gt;

&lt;p&gt;Data engineering starts with collecting information from multiple sources&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Fivetran /Stitch/ Hevo Data&lt;/strong&gt; : Automate extraction from SaaS apps and databases&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbuer5qmjrjmrv9gekx00.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbuer5qmjrjmrv9gekx00.png" alt="Data Ingestion Tools" width="800" height="625"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Apache Kafka&lt;/strong&gt; : Real-time streaming and event-driven pipelines. &lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7n64rxs8jihlf9lowd40.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7n64rxs8jihlf9lowd40.png" alt="Apache Kafka" width="800" height="305"&gt;&lt;/a&gt; &lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Apache Nifi&lt;/strong&gt; : Flow-based ingestion and routing.  &lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fnkp8paozhq8sp2r3juvj.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fnkp8paozhq8sp2r3juvj.png" alt="Apache Nifi" width="800" height="372"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  2.Data storage &amp;amp; Warehousing
&lt;/h3&gt;

&lt;p&gt;Once data is ingested, it needs a reliable home.  &lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Snowflake&lt;/strong&gt;:Cloud-native warehouse with scalability.  &lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fr72zj801ohtj4xwj1z3a.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fr72zj801ohtj4xwj1z3a.png" alt="Data Storages" width="800" height="533"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Google BigQuery&lt;/strong&gt;:Serverless, highly scalable analytics warehouse. &lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fzjetn4hfo9t7m8w6o89h.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fzjetn4hfo9t7m8w6o89h.png" alt="Google BigQuery" width="800" height="264"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Amazon Redshift&lt;/strong&gt; :AWS-based warehouse optimized for queries. &lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fzirht9zu25fianpqxn3t.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fzirht9zu25fianpqxn3t.png" alt="Amazon Redshift" width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  3.Data processing &amp;amp; transformation
&lt;/h3&gt;

&lt;p&gt;Raw data must be cleaned and transformed before use.  &lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Apache spark&lt;/strong&gt;:Distributed computing for batch and streaming.  &lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9887dmu7nsadfgwq6lnb.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9887dmu7nsadfgwq6lnb.png" alt="Apache Spark" width="800" height="533"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Hadoop&lt;/strong&gt;:Large-scale storage and batch processing.  &lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6k346z24f26vp1wx0o7m.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6k346z24f26vp1wx0o7m.png" alt="Hadoop" width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Dbt (Data Build Tool)&lt;/strong&gt;:SQL-based transformations for analytics teams.  &lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgub2u92loxrtkjl3rly9.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgub2u92loxrtkjl3rly9.png" alt="Data Build Tool" width="800" height="227"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  4. Workflow &amp;amp; orchestration
&lt;/h3&gt;

&lt;p&gt;Pipelines need automation and scheduling.  &lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Apache Airflow&lt;/strong&gt;:Workflow automation and DAG scheduling.  &lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Flyoz7btyz41f6msg56p6.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Flyoz7btyz41f6msg56p6.png" alt="Apache Airflow" width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Prefect/luigi&lt;/strong&gt; :Alternatives for managing complex workflows.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwpiuirvyluadhbwyt7d9.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwpiuirvyluadhbwyt7d9.png" alt="Prefect/Luigi" width="800" height="474"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  5.Infrastructure &amp;amp; Deployment
&lt;/h3&gt;

&lt;p&gt;Behind the scenes, infrastructure ensures scalability.  &lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Docker &amp;amp; Kubernetes&lt;/strong&gt;:Containerization and orchestration.  &lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0v4ls7uza7hsqgg3gems.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0v4ls7uza7hsqgg3gems.png" alt="Docker &amp;amp; Kubernetes" width="800" height="462"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Terraform&lt;/strong&gt;:Infrastructure as Code for cloud resources.  &lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fytaeidhhf9pcs28s1cls.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fytaeidhhf9pcs28s1cls.png" alt="Terraform" width="800" height="551"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  6.Monitoring &amp;amp; Quality
&lt;/h3&gt;

&lt;p&gt;Data must be trustworthy and pipelines reliable.  &lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Great expectations&lt;/strong&gt; :Data validation and quality checks.  &lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Datadog / Prometheus&lt;/strong&gt; :Monitoring pipelines and infrastructure&lt;/p&gt;

&lt;h2&gt;
  
  
  Key Considerations
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Scalability&lt;/strong&gt;: Spark and Snowflake excel with large datasets.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Real-Time vs Batch&lt;/strong&gt;: Kafka is unmatched for streaming; Hadoop and Spark dominate batch workloads. &lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Cloud Integration&lt;/strong&gt;: Align tools with your provider (AWS  Redshift, GCP  BigQuery, Azure Synapse ). &lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Cost&lt;/strong&gt;:Open-source tools are free but require setup; managed services reduce overhead but add licensing costs.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;Joining data engineering means stepping into a field where you’ll design the backbone of modern businesses. The tools may seem overwhelming at first, but each one solves a specific problem together, they form a powerful toolkit. If you’re excited about building systems that move, store, and transform data at scale, then data engineering isn’t just a career option; it’s a future-proof calling.&lt;/p&gt;

</description>
      <category>dataengineering</category>
      <category>kafka</category>
      <category>apachespark</category>
      <category>snowflake</category>
    </item>
  </channel>
</rss>
