<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Brian Muriithi</title>
    <description>The latest articles on DEV Community by Brian Muriithi (@brian_muriithi).</description>
    <link>https://dev.to/brian_muriithi</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3845919%2F9208a620-02fe-41cf-a528-8ef6e2d6a379.png</url>
      <title>DEV Community: Brian Muriithi</title>
      <link>https://dev.to/brian_muriithi</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/brian_muriithi"/>
    <language>en</language>
    <item>
      <title>SQL Window Functions and CTEs</title>
      <dc:creator>Brian Muriithi</dc:creator>
      <pubDate>Tue, 21 Apr 2026 20:23:06 +0000</pubDate>
      <link>https://dev.to/brian_muriithi/sql-window-functions-and-ctes-m4g</link>
      <guid>https://dev.to/brian_muriithi/sql-window-functions-and-ctes-m4g</guid>
      <description>&lt;p&gt;You can do quite a lot with some of the most popular SQL queries such as SELECT, WHERE, and JOIN. But there exists some question that you cannot answer with these queries such as &lt;em&gt;What is each student's mark compared to their own average?" or "Can we rank results without losing any rows?"&lt;/em&gt; &lt;/p&gt;

&lt;p&gt;That is where window functions and CTEs come in. They sound intimidating, but once they click, you will wonder how you ever wrote SQL without them.&lt;/p&gt;

&lt;h2&gt;
  
  
  Part 1 - Window Functions
&lt;/h2&gt;

&lt;h3&gt;
  
  
  What even is a "window"?
&lt;/h3&gt;

&lt;p&gt;Think about a regular &lt;code&gt;GROUP BY&lt;/code&gt; query. When you group data, the individual rows collapse into summary rows. You lose the detail.&lt;/p&gt;

&lt;p&gt;A window function is different. It lets you run a calculation &lt;em&gt;across a set of rows&lt;/em&gt; while still keeping every single row in your result. The "window" is just the group of rows the function is looking at for each calculation.&lt;/p&gt;

&lt;p&gt;Here is a simple way to picture it: imagine you have a classroom of students with exam scores. You want to see each student's score AND the class average on the same row. With &lt;code&gt;GROUP BY&lt;/code&gt; alone, you cannot do that - you get one row for the average, not one row per student. A window function gives you both.&lt;/p&gt;

&lt;h3&gt;
  
  
  The basic syntax
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt;
    &lt;span class="k"&gt;column_name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;FUNCTION_NAME&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="n"&gt;OVER&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;PARTITION&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="k"&gt;column&lt;/span&gt; &lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="k"&gt;column&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="k"&gt;table_name&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;OVER()&lt;/code&gt; clause is what makes it a window function. Everything inside &lt;code&gt;OVER()&lt;/code&gt; defines the window - which rows to include and in what order.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;PARTITION BY&lt;/code&gt; - splits rows into groups (like GROUP BY, but rows are kept)&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;ORDER BY&lt;/code&gt; - sets the order within each group&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  ROW_NUMBER, RANK, and DENSE_RANK
&lt;/h3&gt;

&lt;p&gt;These three are probably the most common window functions you will use early on. They all assign numbers to rows, but they handle ties differently.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt;
    &lt;span class="n"&gt;student_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;marks&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;ROW_NUMBER&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;  &lt;span class="n"&gt;OVER&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;marks&lt;/span&gt; &lt;span class="k"&gt;DESC&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;row_num&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;RANK&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;        &lt;span class="n"&gt;OVER&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;marks&lt;/span&gt; &lt;span class="k"&gt;DESC&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;rank_num&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;DENSE_RANK&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;  &lt;span class="n"&gt;OVER&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;marks&lt;/span&gt; &lt;span class="k"&gt;DESC&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;dense_rank_num&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;exam_results&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Say two students both scored 85. Here is what each function does:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;marks&lt;/th&gt;
&lt;th&gt;ROW_NUMBER&lt;/th&gt;
&lt;th&gt;RANK&lt;/th&gt;
&lt;th&gt;DENSE_RANK&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;90&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;85&lt;/td&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;85&lt;/td&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;78&lt;/td&gt;
&lt;td&gt;4&lt;/td&gt;
&lt;td&gt;4&lt;/td&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;ROW_NUMBER&lt;/code&gt; just counts - every row gets a unique number, no matter what&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;RANK&lt;/code&gt; gives tied rows the same number, then skips ahead (no rank 3 here)&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;DENSE_RANK&lt;/code&gt; gives tied rows the same number but does not skip - the next rank is always one higher
Which one to use depends on the context. For leaderboards where gaps matter, use &lt;code&gt;RANK&lt;/code&gt;. For position tracking where you need every position filled, use &lt;code&gt;DENSE_RANK&lt;/code&gt;.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  PARTITION BY - running calculations per group
&lt;/h3&gt;

&lt;p&gt;This is where window functions really shine. &lt;code&gt;PARTITION BY&lt;/code&gt; lets you do calculations per group without collapsing your data.&lt;/p&gt;

&lt;p&gt;Here is a real example - showing each exam result alongside that student's personal average:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt;
    &lt;span class="n"&gt;student_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;marks&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;ROUND&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;AVG&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;marks&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;OVER&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;PARTITION&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;student_id&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;student_avg&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;exam_results&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Result:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;student_id&lt;/th&gt;
&lt;th&gt;marks&lt;/th&gt;
&lt;th&gt;student_avg&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;72&lt;/td&gt;
&lt;td&gt;79.00&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;86&lt;/td&gt;
&lt;td&gt;79.00&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;91&lt;/td&gt;
&lt;td&gt;88.50&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;86&lt;/td&gt;
&lt;td&gt;88.50&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Every row stays. Each student's average is calculated only across their own rows.&lt;/p&gt;

&lt;h3&gt;
  
  
  LAG - looking at the previous row
&lt;/h3&gt;

&lt;p&gt;&lt;code&gt;LAG()&lt;/code&gt; lets you pull a value from a previous row into the current row. It is very useful for tracking progress over time.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt;
    &lt;span class="n"&gt;student_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;marks&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;LAG&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;marks&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;OVER&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;PARTITION&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;student_id&lt;/span&gt; &lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;exam_date&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;previous_mark&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;marks&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;LAG&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;marks&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;OVER&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;PARTITION&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;student_id&lt;/span&gt; &lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;exam_date&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;improvement&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;exam_results&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;student_id&lt;/th&gt;
&lt;th&gt;marks&lt;/th&gt;
&lt;th&gt;previous_mark&lt;/th&gt;
&lt;th&gt;improvement&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;72&lt;/td&gt;
&lt;td&gt;NULL&lt;/td&gt;
&lt;td&gt;NULL&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;86&lt;/td&gt;
&lt;td&gt;72&lt;/td&gt;
&lt;td&gt;14&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;86&lt;/td&gt;
&lt;td&gt;NULL&lt;/td&gt;
&lt;td&gt;NULL&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;91&lt;/td&gt;
&lt;td&gt;86&lt;/td&gt;
&lt;td&gt;5&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The first result per student is &lt;code&gt;NULL&lt;/code&gt; because there is no previous row to look back at. That is expected and totally fine.&lt;/p&gt;

&lt;h3&gt;
  
  
  NTILE - splitting data into bands
&lt;/h3&gt;

&lt;p&gt;&lt;code&gt;NTILE(n)&lt;/code&gt; divides your rows into &lt;code&gt;n&lt;/code&gt; roughly equal groups. It is great for performance banding.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt;
    &lt;span class="n"&gt;result_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;marks&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;NTILE&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;OVER&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;marks&lt;/span&gt; &lt;span class="k"&gt;DESC&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;band&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;exam_results&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This gives you:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Band 1 - top third of scores&lt;/li&gt;
&lt;li&gt;Band 2 - middle third&lt;/li&gt;
&lt;li&gt;Band 3 - bottom third&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Part 2 - CTEs (Common Table Expressions)
&lt;/h2&gt;

&lt;h3&gt;
  
  
  What is a CTE?
&lt;/h3&gt;

&lt;p&gt;A CTE is basically a named, temporary result set that you define at the top of your query and then use below it - like giving a subquery a proper name.&lt;/p&gt;

&lt;p&gt;The syntax looks like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;WITH&lt;/span&gt; &lt;span class="n"&gt;cte_name&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="p"&gt;...&lt;/span&gt;
    &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="p"&gt;...&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;cte_name&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That &lt;code&gt;WITH&lt;/code&gt; keyword kicks it off. Everything inside the parentheses is the CTE - it runs first, gets a name, and then your main query can reference it like a regular table.&lt;/p&gt;

&lt;h3&gt;
  
  
  Why bother? Can't I just use a subquery?
&lt;/h3&gt;

&lt;p&gt;You can. But CTEs are much easier to read and debug, especially when queries get long. Compare these two approaches:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;With a subquery:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;student_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;avg_mark&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;student_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ROUND&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;AVG&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;marks&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;avg_mark&lt;/span&gt;
    &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;exam_results&lt;/span&gt;
    &lt;span class="k"&gt;GROUP&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;student_id&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;student_averages&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;avg_mark&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;75&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;With a CTE:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;WITH&lt;/span&gt; &lt;span class="n"&gt;student_averages&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;student_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ROUND&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;AVG&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;marks&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;avg_mark&lt;/span&gt;
    &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;exam_results&lt;/span&gt;
    &lt;span class="k"&gt;GROUP&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;student_id&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;student_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;avg_mark&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;student_averages&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;avg_mark&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;75&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Same result. But the CTE version reads like a story - first we build the averages, then we filter them. When you come back to this query a week later, you will thank yourself.&lt;/p&gt;

&lt;h3&gt;
  
  
  A practical CTE example
&lt;/h3&gt;

&lt;p&gt;Say we want to find patients from our hospital database who have had more than one appointment, and we want to show their name alongside the count.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;WITH&lt;/span&gt; &lt;span class="n"&gt;appointment_counts&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="k"&gt;SELECT&lt;/span&gt;
        &lt;span class="n"&gt;patient_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="k"&gt;COUNT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;appointment_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;total_appointments&lt;/span&gt;
    &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;appointments&lt;/span&gt;
    &lt;span class="k"&gt;GROUP&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;patient_id&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt;
    &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;full_name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;ac&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;total_appointments&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;patients&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;
&lt;span class="k"&gt;JOIN&lt;/span&gt; &lt;span class="n"&gt;appointment_counts&lt;/span&gt; &lt;span class="n"&gt;ac&lt;/span&gt;
    &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;patient_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;ac&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;patient_id&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;ac&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;total_appointments&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
&lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;ac&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;total_appointments&lt;/span&gt; &lt;span class="k"&gt;DESC&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The CTE handles the counting. The main query handles the joining and filtering. Each part does one job - and you can read it top to bottom without getting lost.&lt;/p&gt;

&lt;h2&gt;
  
  
  Quick recap
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Window functions:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Run calculations across rows without collapsing them&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;OVER()&lt;/code&gt; is what makes a function a window function&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;PARTITION BY&lt;/code&gt; groups rows, &lt;code&gt;ORDER BY&lt;/code&gt; sets the order within the group&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;ROW_NUMBER&lt;/code&gt;, &lt;code&gt;RANK&lt;/code&gt;, &lt;code&gt;DENSE_RANK&lt;/code&gt; for ranking&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;LAG&lt;/code&gt; for comparing to previous rows&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;NTILE&lt;/code&gt; for banding data into groups
&lt;strong&gt;CTEs:&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Named temporary result sets defined with &lt;code&gt;WITH&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Make long queries much easier to read and debug&lt;/li&gt;
&lt;li&gt;Can be chained - one CTE can reference a previous one&lt;/li&gt;
&lt;li&gt;Great for separating logic into clear steps
Both tools do one important thing - they make your SQL easier to reason about. And when queries are easy to read, they are easy to fix, easy to share, and easy to build on.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;em&gt;Follow me along on this journey as we explore Data Science and Analytics&lt;/em&gt;&lt;/p&gt;

</description>
      <category>sql</category>
      <category>beginners</category>
      <category>database</category>
      <category>postgressql</category>
    </item>
    <item>
      <title>Introduction to Databases with SQL</title>
      <dc:creator>Brian Muriithi</dc:creator>
      <pubDate>Mon, 13 Apr 2026 15:47:16 +0000</pubDate>
      <link>https://dev.to/brian_muriithi/introduction-to-databases-with-sql-30h7</link>
      <guid>https://dev.to/brian_muriithi/introduction-to-databases-with-sql-30h7</guid>
      <description>&lt;p&gt;&lt;em&gt;A hands-on walkthrough of DDL, DML, DQL using a real school project&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;I recently completed my first SQL project assignment: building a working school database from scratch for Nairobi Academy (&lt;em&gt;PS data used in this project is fictitious&lt;/em&gt;). That meant creating tables, populating them with data, running queries, and writing logic to label and categorize results.&lt;/p&gt;

&lt;p&gt;This article walks through what I learned, the SQL concepts behind it, and code from the actual project. Whether you're just starting out or refreshing your fundamentals, I hope this makes things clear.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Big Picture: DDL vs DML vs DQL
&lt;/h2&gt;

&lt;p&gt;Before writing a single line of SQL, it helps to understand the different categories of commands you'll use constantly.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Firq7enhzdrpicpz5q1ms.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Firq7enhzdrpicpz5q1ms.png" alt="SQL categories" width="800" height="400"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;DDL (Data Definition Language)&lt;/strong&gt; is for building and modifying the &lt;em&gt;structure&lt;/em&gt; of a database. Think of it as the architecture work - laying the foundation before you move any furniture in. The main commands are &lt;code&gt;CREATE&lt;/code&gt;, &lt;code&gt;ALTER&lt;/code&gt;, &lt;code&gt;DROP&lt;/code&gt;, and &lt;code&gt;TRUNCATE&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;DML (Data Manipulation Language)&lt;/strong&gt; is for working with the &lt;em&gt;data inside&lt;/em&gt; those structures. Once the rooms are built, DML is how you fill them, rearrange them, or clear them out. The main commands are &lt;code&gt;INSERT&lt;/code&gt;, &lt;code&gt;UPDATE&lt;/code&gt;, and &lt;code&gt;DELETE&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;Here's a simple way to remember the difference:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Category&lt;/th&gt;
&lt;th&gt;Question it answers&lt;/th&gt;
&lt;th&gt;Commands&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;DDL&lt;/td&gt;
&lt;td&gt;What does the database look like?&lt;/td&gt;
&lt;td&gt;CREATE, ALTER, DROP, TRUNCATE&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;DML&lt;/td&gt;
&lt;td&gt;What data is in the database?&lt;/td&gt;
&lt;td&gt;INSERT, UPDATE, DELETE&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;There's also &lt;strong&gt;DQL (Data Query Language)&lt;/strong&gt; - specifically &lt;code&gt;SELECT&lt;/code&gt; - which is how you read data. We also do have &lt;strong&gt;TCL (Transaction Control Language)&lt;/strong&gt; which includes commands like &lt;code&gt;COMMIT&lt;/code&gt;, &lt;code&gt;SAVEPOINT&lt;/code&gt;, and &lt;code&gt;ROLLBACK&lt;/code&gt;. Finally we do have &lt;strong&gt;DCL (Data Control Language)&lt;/strong&gt; with commands like &lt;code&gt;GRANT&lt;/code&gt; and &lt;code&gt;REVOKE&lt;/code&gt;. We will not be covering these last two in this walkthrough. &lt;/p&gt;

&lt;h2&gt;
  
  
  Part 1 - Building the Database (DDL in Action)
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Setting Up a Schema
&lt;/h3&gt;

&lt;p&gt;A schema is a logical container that groups related tables together. For this project, everything lives inside &lt;code&gt;nairobi_academy&lt;/code&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;SCHEMA&lt;/span&gt; &lt;span class="n"&gt;nairobi_academy&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;SET&lt;/span&gt; &lt;span class="n"&gt;search_path&lt;/span&gt; &lt;span class="k"&gt;TO&lt;/span&gt; &lt;span class="n"&gt;nairobi_academy&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;SET search_path&lt;/code&gt; tells PostgreSQL to look inside &lt;code&gt;nairobi_academy&lt;/code&gt; by default, so you don't have to prefix every table name.&lt;/p&gt;

&lt;h3&gt;
  
  
  Creating the Tables
&lt;/h3&gt;

&lt;p&gt;The school has three core tables: &lt;code&gt;students&lt;/code&gt;, &lt;code&gt;subjects&lt;/code&gt;, and &lt;code&gt;exam_results&lt;/code&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;students&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;student_id&lt;/span&gt; &lt;span class="nb"&gt;INT&lt;/span&gt; &lt;span class="k"&gt;PRIMARY&lt;/span&gt; &lt;span class="k"&gt;KEY&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;first_name&lt;/span&gt; &lt;span class="nb"&gt;VARCHAR&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;50&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;last_name&lt;/span&gt;  &lt;span class="nb"&gt;VARCHAR&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;50&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;gender&lt;/span&gt;     &lt;span class="nb"&gt;VARCHAR&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="n"&gt;date_of_birth&lt;/span&gt; &lt;span class="nb"&gt;DATE&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="k"&gt;class&lt;/span&gt;      &lt;span class="nb"&gt;VARCHAR&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="n"&gt;city&lt;/span&gt;       &lt;span class="nb"&gt;VARCHAR&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;50&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;A few design decisions worth noting here:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;student_id INT PRIMARY KEY&lt;/code&gt; - every row needs a unique identifier. Primary keys cannot be NULL and must be unique across the entire table.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;NOT NULL&lt;/code&gt; on &lt;code&gt;first_name&lt;/code&gt; and &lt;code&gt;last_name&lt;/code&gt; - a student record without a name is useless, so the database enforces this at the structural level.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;gender VARCHAR(1)&lt;/code&gt; - a single character stores &lt;code&gt;'M'&lt;/code&gt; or &lt;code&gt;'F'&lt;/code&gt; efficiently.&lt;/li&gt;
&lt;li&gt;Phone numbers (when they appeared later) are stored as &lt;code&gt;VARCHAR&lt;/code&gt;, not a number type. This matters because phone numbers can start with a zero and may include symbols like &lt;code&gt;+&lt;/code&gt;. Storing them as integers would break both of those cases.
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;subjects&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;subject_id&lt;/span&gt;   &lt;span class="nb"&gt;INT&lt;/span&gt; &lt;span class="k"&gt;PRIMARY&lt;/span&gt; &lt;span class="k"&gt;KEY&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;subject_name&lt;/span&gt; &lt;span class="nb"&gt;VARCHAR&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt; &lt;span class="k"&gt;UNIQUE&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;department&lt;/span&gt;   &lt;span class="nb"&gt;VARCHAR&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;50&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="n"&gt;teacher_name&lt;/span&gt; &lt;span class="nb"&gt;VARCHAR&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="n"&gt;credits&lt;/span&gt;      &lt;span class="nb"&gt;INT&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;exam_results&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;result_id&lt;/span&gt;  &lt;span class="nb"&gt;INT&lt;/span&gt; &lt;span class="k"&gt;PRIMARY&lt;/span&gt; &lt;span class="k"&gt;KEY&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;student_id&lt;/span&gt; &lt;span class="nb"&gt;INT&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;subject_id&lt;/span&gt; &lt;span class="nb"&gt;INT&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;marks&lt;/span&gt;      &lt;span class="nb"&gt;INT&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;exam_date&lt;/span&gt;  &lt;span class="nb"&gt;DATE&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;grade&lt;/span&gt;      &lt;span class="nb"&gt;VARCHAR&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;UNIQUE&lt;/code&gt; on &lt;code&gt;subject_name&lt;/code&gt; means the database won't allow two subjects with the same name - good data hygiene.&lt;/p&gt;

&lt;h3&gt;
  
  
  Modifying Tables with ALTER
&lt;/h3&gt;

&lt;p&gt;After the initial build, the school needed to add a phone number column, then rename a column, and then remove the phone number column entirely. This is a normal project requirement change.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- Add a column&lt;/span&gt;
&lt;span class="k"&gt;ALTER&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;students&lt;/span&gt; &lt;span class="k"&gt;ADD&lt;/span&gt; &lt;span class="k"&gt;COLUMN&lt;/span&gt; &lt;span class="n"&gt;phone_number&lt;/span&gt; &lt;span class="nb"&gt;VARCHAR&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;20&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="c1"&gt;-- Rename a column&lt;/span&gt;
&lt;span class="k"&gt;ALTER&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;subjects&lt;/span&gt; &lt;span class="k"&gt;RENAME&lt;/span&gt; &lt;span class="k"&gt;COLUMN&lt;/span&gt; &lt;span class="n"&gt;credits&lt;/span&gt; &lt;span class="k"&gt;TO&lt;/span&gt; &lt;span class="n"&gt;credit_hours&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="c1"&gt;-- Remove a column&lt;/span&gt;
&lt;span class="k"&gt;ALTER&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;students&lt;/span&gt; &lt;span class="k"&gt;DROP&lt;/span&gt; &lt;span class="k"&gt;COLUMN&lt;/span&gt; &lt;span class="n"&gt;phone_number&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;ALTER TABLE&lt;/code&gt; is the tool for any structural change to an existing table. The key thing to remember: dropping a column is permanent. The data in it is gone.&lt;/p&gt;

&lt;h2&gt;
  
  
  Part 2 - Filling the Database (DML in Action)
&lt;/h2&gt;

&lt;h3&gt;
  
  
  INSERT - Adding Data
&lt;/h3&gt;

&lt;p&gt;Once the structure exists, it's time to populate it. Here's the student data going in:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;INSERT&lt;/span&gt; &lt;span class="k"&gt;INTO&lt;/span&gt; &lt;span class="n"&gt;students&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;student_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;first_name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;last_name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;gender&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;date_of_birth&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;class&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;city&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;VALUES&lt;/span&gt;
    &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="s1"&gt;'Amina'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;   &lt;span class="s1"&gt;'Wanjiku'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'F'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'2008-03-12'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'Form 3'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'Nairobi'&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="s1"&gt;'Brian'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;   &lt;span class="s1"&gt;'Ochieng'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'M'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'2008-07-25'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'Form 4'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'Mombasa'&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="s1"&gt;'Cynthia'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'Mutua'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;   &lt;span class="s1"&gt;'F'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'2008-11-25'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'Form 3'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'Kisumu'&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="s1"&gt;'David'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;   &lt;span class="s1"&gt;'Kamau'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;   &lt;span class="s1"&gt;'M'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'2008-02-18'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'Form 4'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'Nairobi'&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="s1"&gt;'Esther'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="s1"&gt;'Akinyi'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="s1"&gt;'F'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'2008-06-30'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'Form 2'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'Nakuru'&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;6&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="s1"&gt;'Felix'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;   &lt;span class="s1"&gt;'Otieno'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="s1"&gt;'M'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'2008-09-14'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'Form 2'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'Eldoret'&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;7&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="s1"&gt;'Grace'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;   &lt;span class="s1"&gt;'Mwangi'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="s1"&gt;'F'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'2008-01-22'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'Form 3'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'Nairobi'&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;8&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="s1"&gt;'Hassan'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="s1"&gt;'Adbi'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;    &lt;span class="s1"&gt;'M'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'2008-04-09'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'Form 4'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'Mombasa'&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;9&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="s1"&gt;'Ivy'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;     &lt;span class="s1"&gt;'Chebet'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="s1"&gt;'F'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'2008-12-10'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'Form 2'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'Nakuru'&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'James'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;   &lt;span class="s1"&gt;'Kariuki'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'M'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'2008-08-17'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'Form 3'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'Nairobi'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;One &lt;code&gt;INSERT&lt;/code&gt; statement can add multiple rows at once by separating each row's values with a comma. This is much cleaner than writing ten separate statements.&lt;/p&gt;

&lt;h3&gt;
  
  
  UPDATE - Correcting Records
&lt;/h3&gt;

&lt;p&gt;Two things needed fixing after the initial data entry: a student moved cities, and an exam mark was recorded wrong.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- Esther moved from Nakuru to Nairobi&lt;/span&gt;
&lt;span class="k"&gt;UPDATE&lt;/span&gt; &lt;span class="n"&gt;students&lt;/span&gt;
&lt;span class="k"&gt;SET&lt;/span&gt; &lt;span class="n"&gt;city&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'Nairobi'&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;student_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="c1"&gt;-- Fix an incorrectly entered mark&lt;/span&gt;
&lt;span class="k"&gt;UPDATE&lt;/span&gt; &lt;span class="n"&gt;exam_results&lt;/span&gt;
&lt;span class="k"&gt;SET&lt;/span&gt; &lt;span class="n"&gt;marks&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;59&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;result_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;WHERE&lt;/code&gt; clause here is critical. Without it, every single row in the table would be updated. Always double-check your &lt;code&gt;WHERE&lt;/code&gt; condition before running an &lt;code&gt;UPDATE&lt;/code&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  DELETE - Removing a Record
&lt;/h3&gt;

&lt;p&gt;One exam result was cancelled entirely:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;DELETE&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;exam_results&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;result_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;9&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Same rule applies - &lt;code&gt;WHERE&lt;/code&gt; is your safety net. &lt;code&gt;DELETE FROM exam_results&lt;/code&gt; with no condition would wipe the entire table.&lt;/p&gt;

&lt;h2&gt;
  
  
  Part 3 - Querying the Data
&lt;/h2&gt;

&lt;p&gt;With data in place, the real work begins: asking questions of the database.&lt;/p&gt;

&lt;h3&gt;
  
  
  Basic Filtering with WHERE
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- All Form 4 students&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;students&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'Form 4'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="c1"&gt;-- Female students only&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;students&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;gender&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'F'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="c1"&gt;-- Exam results with marks of 70 or above&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;exam_results&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;marks&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="mi"&gt;70&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;WHERE&lt;/code&gt; filters which rows come back. Only rows where the condition is &lt;code&gt;TRUE&lt;/code&gt; are returned.&lt;/p&gt;

&lt;h3&gt;
  
  
  Combining Conditions: AND / OR
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- Form 3 students who are specifically in Nairobi&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;students&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;city&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'Nairobi'&lt;/span&gt; &lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'Form 3'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="c1"&gt;-- Students in Form 2 or Form 4&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;students&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'Form 2'&lt;/span&gt; &lt;span class="k"&gt;OR&lt;/span&gt; &lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'Form 4'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;AND&lt;/code&gt; means both conditions must be true. &lt;code&gt;OR&lt;/code&gt; means at least one must be true.&lt;/p&gt;

&lt;p&gt;One important tip: when mixing &lt;code&gt;AND&lt;/code&gt; and &lt;code&gt;OR&lt;/code&gt; in the same query, use parentheses to make your logic explicit. Without them, SQL's precedence rules may not do what you expect.&lt;/p&gt;

&lt;h2&gt;
  
  
  Part 4 - BETWEEN, IN, and LIKE
&lt;/h2&gt;

&lt;p&gt;These three operators make filtering much easier.&lt;/p&gt;

&lt;h3&gt;
  
  
  BETWEEN - Range Queries
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- Exam results with marks between 50 and 80 (inclusive)&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;exam_results&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;marks&lt;/span&gt; &lt;span class="k"&gt;BETWEEN&lt;/span&gt; &lt;span class="mi"&gt;50&lt;/span&gt; &lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="mi"&gt;80&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="c1"&gt;-- Exams that took place over a specific date range&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;exam_results&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;exam_date&lt;/span&gt; &lt;span class="k"&gt;BETWEEN&lt;/span&gt; &lt;span class="s1"&gt;'2024-03-15'&lt;/span&gt; &lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="s1"&gt;'2024-03-18'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;BETWEEN&lt;/code&gt; is inclusive on both ends - so &lt;code&gt;BETWEEN 50 AND 80&lt;/code&gt; includes both 50 and 80 themselves. It works on numbers and dates equally well.&lt;/p&gt;

&lt;h3&gt;
  
  
  IN and NOT IN - Membership Checks
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- Students from specific cities&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;students&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;city&lt;/span&gt; &lt;span class="k"&gt;IN&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'Nairobi'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'Mombasa'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'Kisumu'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="c1"&gt;-- Students NOT in Form 2 or Form 3&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;students&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;IN&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'Form 2'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'Form 3'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;IN&lt;/code&gt; is cleaner than writing multiple &lt;code&gt;OR&lt;/code&gt; conditions. Instead of &lt;code&gt;WHERE city = 'Nairobi' OR city = 'Mombasa' OR city = 'Kisumu'&lt;/code&gt;, you get one compact line.&lt;/p&gt;

&lt;h3&gt;
  
  
  LIKE - Pattern Matching
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- Students whose first name starts with 'A' or 'E'&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;students&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;first_name&lt;/span&gt; &lt;span class="k"&gt;LIKE&lt;/span&gt; &lt;span class="s1"&gt;'A%'&lt;/span&gt;
    &lt;span class="k"&gt;OR&lt;/span&gt; &lt;span class="n"&gt;first_name&lt;/span&gt; &lt;span class="k"&gt;LIKE&lt;/span&gt; &lt;span class="s1"&gt;'E%'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="c1"&gt;-- Subjects containing the word 'Studies'&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;subject_name&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;subjects&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;subject_name&lt;/span&gt; &lt;span class="k"&gt;LIKE&lt;/span&gt; &lt;span class="s1"&gt;'%Studies%'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;%&lt;/code&gt; is a wildcard that matches any sequence of characters. So:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;'A%'&lt;/code&gt; means "starts with A"&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;'%Studies%'&lt;/code&gt; means "contains the word Studies anywhere"&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;'%s'&lt;/code&gt; means "ends with s"&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Part 5 - COUNT
&lt;/h2&gt;

&lt;p&gt;Sometimes you don't need the rows themselves - you just need to know how many there are.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- How many students are in Form 3?&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="k"&gt;COUNT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;class&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;form_3_students&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;students&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'Form 3'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="c1"&gt;-- How many exam results scored 70 or above?&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="k"&gt;COUNT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;above_70_students&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;exam_results&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;marks&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="mi"&gt;70&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;COUNT(*)&lt;/code&gt; counts every row that matches the condition. &lt;code&gt;COUNT(column_name)&lt;/code&gt; counts rows where that column is not NULL. For most counting purposes, &lt;code&gt;COUNT(*)&lt;/code&gt; is what you want.&lt;/p&gt;

&lt;h2&gt;
  
  
  Part 6 - CASE WHEN: Adding Logic to Queries
&lt;/h2&gt;

&lt;p&gt;&lt;code&gt;CASE WHEN&lt;/code&gt; lets you create new columns based on conditions - like an &lt;code&gt;if/else&lt;/code&gt; block inside a SQL query.&lt;/p&gt;

&lt;h3&gt;
  
  
  Labeling Exam Performance
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt;
    &lt;span class="n"&gt;result_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;marks&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;grade&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="k"&gt;CASE&lt;/span&gt;
        &lt;span class="k"&gt;WHEN&lt;/span&gt; &lt;span class="n"&gt;marks&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="mi"&gt;80&lt;/span&gt; &lt;span class="k"&gt;THEN&lt;/span&gt; &lt;span class="s1"&gt;'Distinction'&lt;/span&gt;
        &lt;span class="k"&gt;WHEN&lt;/span&gt; &lt;span class="n"&gt;marks&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="mi"&gt;60&lt;/span&gt; &lt;span class="k"&gt;THEN&lt;/span&gt; &lt;span class="s1"&gt;'Merit'&lt;/span&gt;
        &lt;span class="k"&gt;WHEN&lt;/span&gt; &lt;span class="n"&gt;marks&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="mi"&gt;40&lt;/span&gt; &lt;span class="k"&gt;THEN&lt;/span&gt; &lt;span class="s1"&gt;'Pass'&lt;/span&gt;
        &lt;span class="k"&gt;ELSE&lt;/span&gt; &lt;span class="s1"&gt;'Fail'&lt;/span&gt;
    &lt;span class="k"&gt;END&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;performance&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;exam_results&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;SQL evaluates each &lt;code&gt;WHEN&lt;/code&gt; condition from top to bottom and stops at the first one that's true. So a student with 85 marks hits &lt;code&gt;marks &amp;gt;= 80&lt;/code&gt; first and gets 'Distinction', even though &lt;code&gt;marks &amp;gt;= 60&lt;/code&gt; is also technically true for them.&lt;/p&gt;

&lt;h3&gt;
  
  
  Classifying Students by Year Group
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt;
    &lt;span class="n"&gt;student_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;first_name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="k"&gt;class&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="k"&gt;CASE&lt;/span&gt;
        &lt;span class="k"&gt;WHEN&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'Form 3'&lt;/span&gt; &lt;span class="k"&gt;OR&lt;/span&gt; &lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'Form 4'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;THEN&lt;/span&gt; &lt;span class="s1"&gt;'Senior'&lt;/span&gt;
        &lt;span class="k"&gt;ELSE&lt;/span&gt; &lt;span class="s1"&gt;'Junior'&lt;/span&gt;
    &lt;span class="k"&gt;END&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;student_level&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;students&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Notice the parentheses around &lt;code&gt;class = 'Form 3' OR class = 'Form 4'&lt;/code&gt;. Without them inside a &lt;code&gt;CASE WHEN&lt;/code&gt; that might have other conditions, the logic can go wrong. It's a small thing that can cause subtle bugs, so wrapping &lt;code&gt;OR&lt;/code&gt; conditions in parentheses is a good habit.&lt;/p&gt;

&lt;p&gt;I hope you learned a thing or two from this walkthrough. If you like to learn more about Data Science and Analytic, please give me a follow. &lt;/p&gt;

&lt;p&gt;&lt;em&gt;This article was written as part of a SQL assignment for the Nairobi Academy project. The full SQL script is available on GitHub. Link on my bio&lt;/em&gt;&lt;/p&gt;

</description>
      <category>sql</category>
      <category>data</category>
      <category>database</category>
      <category>postgres</category>
    </item>
    <item>
      <title>How to Publish and Share a Power BI Report Online</title>
      <dc:creator>Brian Muriithi</dc:creator>
      <pubDate>Mon, 06 Apr 2026 01:28:59 +0000</pubDate>
      <link>https://dev.to/brian_muriithi/how-to-publish-and-share-a-power-bi-report-online-316j</link>
      <guid>https://dev.to/brian_muriithi/how-to-publish-and-share-a-power-bi-report-online-316j</guid>
      <description>&lt;p&gt;As we continue to explore Data Science and Analytics, this weeks deliverable was to work on an electronics sales dashboard in PowerBI and publish it online. This is a common request in data analytics so in this article we will walk through the full process of publishing a Power BI report to the web, from clicking "Publish" in Desktop to getting a shareable link that anyone can open in a browser. &lt;/p&gt;

&lt;h2&gt;
  
  
  The Project in Brief
&lt;/h2&gt;

&lt;p&gt;Before jumping into publishing, here is a quick overview of what the report contains so you have context for the screenshots:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Dataset:&lt;/strong&gt; 100 electronics retail transactions across 10 products, 9 cities (USA and Canada), and 3 sales representatives.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Data preparation:&lt;/strong&gt; The raw Excel data was loaded into Power Query where it was cleaned, validated, and normalized. Canadian Dollar transactions were converted to USD at a rate of 0.74. A star schema was built with six dimension tables (Products, Customers, Cities, SalesReps, PaymentMethod, DateTable) linked to a central fact table.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;DAX measures:&lt;/strong&gt; Total Revenue, Total Profit, Total Orders, Average Profit Margin, Average Shipping Cost, and a calculated Date Table using the &lt;code&gt;CALENDAR()&lt;/code&gt; function.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Dashboard pages:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Page 1 - Executive Dashboard:&lt;/strong&gt; KPI cards, sales by product, revenue by category, sales rep distribution, payment method breakdown, and interactive slicers.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fe589dmv8uv0mdr4ha78e.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fe589dmv8uv0mdr4ha78e.png" alt="Dashboard page" width="800" height="475"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Page 2 - Product Analysis:&lt;/strong&gt; Revenue and profit comparison by product, geographic maps, profit margin rankings.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fj5o34iss775nefitd6gf.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fj5o34iss775nefitd6gf.png" alt="Product Analysis" width="800" height="476"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Page 3 - Sales Reps and Customers:&lt;/strong&gt; Customer rankings, sales rep performance combo chart, customer age distribution.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8wlrz5yb8al94oi697nb.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8wlrz5yb8al94oi697nb.png" alt="Customers and Sales" width="800" height="475"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;With the report complete in Power BI Desktop, the next step is getting it online.&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 1: Publish from Power BI Desktop
&lt;/h2&gt;

&lt;p&gt;Open your &lt;code&gt;.pbix&lt;/code&gt; file in Power BI Desktop. In the top ribbon, go to &lt;strong&gt;Home&lt;/strong&gt; and click the &lt;strong&gt;Publish&lt;/strong&gt; button on the right side of the toolbar.&lt;/p&gt;

&lt;p&gt;If you are not signed in, Power BI will prompt you to sign in with your Microsoft account. Use the account associated with your Power BI Service subscription.&lt;/p&gt;

&lt;p&gt;Once signed in, a dialog box appears asking you to select a destination workspace. &lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fb4hu1vs9dmf66rg97wob.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fb4hu1vs9dmf66rg97wob.png" alt="Publish a Report" width="800" height="522"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;You will see a list of available workspaces. If you have not created any custom workspaces, you will only see &lt;strong&gt;My workspace&lt;/strong&gt;, which is the default personal workspace that comes with every Power BI account.&lt;/p&gt;

&lt;p&gt;Select the workspace you want and click &lt;strong&gt;Select&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Power BI Desktop will upload the report. This usually takes a few seconds depending on the file size. When it finishes, you get a success confirmation.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fpyq3un8aijpuf6jhjwjq.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fpyq3un8aijpuf6jhjwjq.png" alt="Publishing Sucessful" width="502" height="307"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Click the &lt;strong&gt;"Open in Power BI"&lt;/strong&gt; link to jump directly to your report in the browser, or click &lt;strong&gt;Got it&lt;/strong&gt; to stay in Desktop.&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 2: Verify in Power BI Service
&lt;/h2&gt;

&lt;p&gt;Head over to &lt;a href="https://app.powerbi.com" rel="noopener noreferrer"&gt;app.powerbi.com&lt;/a&gt; and navigate to the workspace where you published. You should see your report listed there.&lt;/p&gt;

&lt;p&gt;Open the report and spend a few minutes checking everything:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Do all visuals render correctly?&lt;/li&gt;
&lt;li&gt;Do the slicers filter across all charts as expected?&lt;/li&gt;
&lt;li&gt;Do navigation buttons (if you added any) work?&lt;/li&gt;
&lt;li&gt;Are the KPI numbers matching what you see in Desktop?&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fstdrnp4if6r671ipeqrv.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fstdrnp4if6r671ipeqrv.png" alt="Published Report" width="800" height="379"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;In the screenshot above, you can see the report is live in the Power BI Service. The left sidebar shows the Pages panel listing all four pages (Dashboard, Products, Sales Reps and Custom...). The report is fully interactive - slicers, tooltips, and filtering all work just like they do in Desktop.&lt;/p&gt;

&lt;p&gt;Notice the toolbar at the top: &lt;strong&gt;File&lt;/strong&gt;, &lt;strong&gt;Export&lt;/strong&gt;, &lt;strong&gt;Share&lt;/strong&gt;, &lt;strong&gt;Explore&lt;/strong&gt;, &lt;strong&gt;Set alert&lt;/strong&gt;, &lt;strong&gt;Monitor&lt;/strong&gt;, and &lt;strong&gt;Edit&lt;/strong&gt;. These are the options available to you as the report owner. The ones we care about for sharing are &lt;strong&gt;File&lt;/strong&gt;, &lt;strong&gt;Export&lt;/strong&gt;, and &lt;strong&gt;Share&lt;/strong&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 3: Share the Report
&lt;/h2&gt;

&lt;p&gt;This is where it gets interesting, and also where you might run into some friction depending on your account type. There are several ways to share a Power BI report, each with different access requirements.&lt;/p&gt;

&lt;h3&gt;
  
  
  Publish to Web (Public Embed)
&lt;/h3&gt;

&lt;p&gt;This is the simplest option for making your report accessible to anyone without requiring them to have a Power BI account.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Path:&lt;/strong&gt; File &amp;gt; Embed report &amp;gt; Publish to web (public)&lt;/p&gt;

&lt;p&gt;This generates two things:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;A &lt;strong&gt;direct URL&lt;/strong&gt; that opens the report in any browser&lt;/li&gt;
&lt;li&gt;An &lt;strong&gt;iframe embed code&lt;/strong&gt; that you can paste into a website, blog, or portfolio page&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The direct link looks something like:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;https://app.powerbi.com/view?r=eyJrIjoi...
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The iframe code looks like:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight html"&gt;&lt;code&gt;&lt;span class="nt"&gt;&amp;lt;iframe&lt;/span&gt;
  &lt;span class="na"&gt;title=&lt;/span&gt;&lt;span class="s"&gt;"Electronics Sales Dashboard"&lt;/span&gt;
  &lt;span class="na"&gt;width=&lt;/span&gt;&lt;span class="s"&gt;"1140"&lt;/span&gt;
  &lt;span class="na"&gt;height=&lt;/span&gt;&lt;span class="s"&gt;"541.25"&lt;/span&gt;
  &lt;span class="na"&gt;src=&lt;/span&gt;&lt;span class="s"&gt;"https://app.powerbi.com/view?r=eyJrIjoi..."&lt;/span&gt;
  &lt;span class="na"&gt;frameborder=&lt;/span&gt;&lt;span class="s"&gt;"0"&lt;/span&gt;
  &lt;span class="na"&gt;allowFullScreen=&lt;/span&gt;&lt;span class="s"&gt;"true"&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&lt;/span&gt;
&lt;span class="nt"&gt;&amp;lt;/iframe&amp;gt;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Anyone with the link can view and interact with the report. No sign-in required.&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 4: Embed on a Website (If Publish to Web Worked)
&lt;/h2&gt;

&lt;p&gt;If you got the embed code from "Publish to web", you can host your report on any website. Here are two common approaches:&lt;/p&gt;

&lt;h3&gt;
  
  
  Simple HTML Page
&lt;/h3&gt;

&lt;p&gt;Create an &lt;code&gt;index.html&lt;/code&gt; file:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight html"&gt;&lt;code&gt;&lt;span class="cp"&gt;&amp;lt;!DOCTYPE html&amp;gt;&lt;/span&gt;
&lt;span class="nt"&gt;&amp;lt;html&lt;/span&gt; &lt;span class="na"&gt;lang=&lt;/span&gt;&lt;span class="s"&gt;"en"&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&lt;/span&gt;
&lt;span class="nt"&gt;&amp;lt;head&amp;gt;&lt;/span&gt;
    &lt;span class="nt"&gt;&amp;lt;meta&lt;/span&gt; &lt;span class="na"&gt;charset=&lt;/span&gt;&lt;span class="s"&gt;"UTF-8"&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&lt;/span&gt;
    &lt;span class="nt"&gt;&amp;lt;meta&lt;/span&gt; &lt;span class="na"&gt;name=&lt;/span&gt;&lt;span class="s"&gt;"viewport"&lt;/span&gt; &lt;span class="na"&gt;content=&lt;/span&gt;&lt;span class="s"&gt;"width=device-width, initial-scale=1.0"&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&lt;/span&gt;
    &lt;span class="nt"&gt;&amp;lt;title&amp;gt;&lt;/span&gt;Electronics Sales Dashboard&lt;span class="nt"&gt;&amp;lt;/title&amp;gt;&lt;/span&gt;
    &lt;span class="nt"&gt;&amp;lt;style&amp;gt;&lt;/span&gt;
        &lt;span class="nt"&gt;body&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="nl"&gt;margin&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="m"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
            &lt;span class="nl"&gt;font-family&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Arial&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;sans-serif&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
            &lt;span class="nl"&gt;background&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="m"&gt;#f5f5f5&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="nc"&gt;.header&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="nl"&gt;background&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="m"&gt;#1B2A4A&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
            &lt;span class="nl"&gt;color&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="no"&gt;white&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
            &lt;span class="nl"&gt;padding&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="m"&gt;20px&lt;/span&gt; &lt;span class="m"&gt;40px&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="nc"&gt;.header&lt;/span&gt; &lt;span class="nt"&gt;h1&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="nl"&gt;margin&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="m"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
            &lt;span class="nl"&gt;font-size&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="m"&gt;24px&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="nc"&gt;.header&lt;/span&gt; &lt;span class="nt"&gt;p&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="nl"&gt;margin&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="m"&gt;8px&lt;/span&gt; &lt;span class="m"&gt;0&lt;/span&gt; &lt;span class="m"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
            &lt;span class="nl"&gt;font-size&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="m"&gt;14px&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
            &lt;span class="nl"&gt;color&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="m"&gt;#B5D4F4&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="nc"&gt;.container&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="nl"&gt;max-width&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="m"&gt;1200px&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
            &lt;span class="nl"&gt;margin&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="m"&gt;24px&lt;/span&gt; &lt;span class="nb"&gt;auto&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
            &lt;span class="nl"&gt;padding&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="m"&gt;0&lt;/span&gt; &lt;span class="m"&gt;20px&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="nt"&gt;iframe&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="nl"&gt;width&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="m"&gt;100%&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
            &lt;span class="nl"&gt;height&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="m"&gt;600px&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
            &lt;span class="nl"&gt;border&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;none&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
            &lt;span class="nl"&gt;border-radius&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="m"&gt;8px&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="nt"&gt;&amp;lt;/style&amp;gt;&lt;/span&gt;
&lt;span class="nt"&gt;&amp;lt;/head&amp;gt;&lt;/span&gt;
&lt;span class="nt"&gt;&amp;lt;body&amp;gt;&lt;/span&gt;
    &lt;span class="nt"&gt;&amp;lt;div&lt;/span&gt; &lt;span class="na"&gt;class=&lt;/span&gt;&lt;span class="s"&gt;"header"&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&lt;/span&gt;
        &lt;span class="nt"&gt;&amp;lt;h1&amp;gt;&lt;/span&gt;Electronics Sales Dashboard&lt;span class="nt"&gt;&amp;lt;/h1&amp;gt;&lt;/span&gt;
        &lt;span class="nt"&gt;&amp;lt;p&amp;gt;&lt;/span&gt;Power BI Report -- January 2026&lt;span class="nt"&gt;&amp;lt;/p&amp;gt;&lt;/span&gt;
    &lt;span class="nt"&gt;&amp;lt;/div&amp;gt;&lt;/span&gt;
    &lt;span class="nt"&gt;&amp;lt;div&lt;/span&gt; &lt;span class="na"&gt;class=&lt;/span&gt;&lt;span class="s"&gt;"container"&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&lt;/span&gt;
        &lt;span class="nt"&gt;&amp;lt;iframe&lt;/span&gt;
          &lt;span class="na"&gt;title=&lt;/span&gt;&lt;span class="s"&gt;"Electronics Sales Dashboard"&lt;/span&gt;
          &lt;span class="na"&gt;src=&lt;/span&gt;&lt;span class="s"&gt;"YOUR_EMBED_URL_HERE"&lt;/span&gt;
          &lt;span class="na"&gt;allowFullScreen=&lt;/span&gt;&lt;span class="s"&gt;"true"&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&lt;/span&gt;
        &lt;span class="nt"&gt;&amp;lt;/iframe&amp;gt;&lt;/span&gt;
    &lt;span class="nt"&gt;&amp;lt;/div&amp;gt;&lt;/span&gt;
&lt;span class="nt"&gt;&amp;lt;/body&amp;gt;&lt;/span&gt;
&lt;span class="nt"&gt;&amp;lt;/html&amp;gt;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Replace &lt;code&gt;YOUR_EMBED_URL_HERE&lt;/code&gt; with the actual embed URL from Power BI.&lt;/p&gt;

&lt;h3&gt;
  
  
  Host on GitHub Pages
&lt;/h3&gt;

&lt;p&gt;If your project is already in a GitHub repo, you can host the embed page for free using GitHub Pages:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Create a subfolder in your repo (e.g., &lt;code&gt;electronics-sales/&lt;/code&gt;).&lt;/li&gt;
&lt;li&gt;Put the &lt;code&gt;index.html&lt;/code&gt; file inside it.&lt;/li&gt;
&lt;li&gt;Push to GitHub.&lt;/li&gt;
&lt;li&gt;Go to &lt;strong&gt;Settings &amp;gt; Pages&lt;/strong&gt; in your repo.&lt;/li&gt;
&lt;li&gt;Set the source to deploy from your main branch.&lt;/li&gt;
&lt;li&gt;After a minute or two, your page will be live at &lt;code&gt;https://yourusername.github.io/repo-name/electronics-sales/&lt;/code&gt;.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;If your repo contains multiple projects, each subfolder with its own &lt;code&gt;index.html&lt;/code&gt; becomes a separate page under the same base URL. You only need to enable GitHub Pages once per repo.&lt;/p&gt;

&lt;h2&gt;
  
  
  Summary
&lt;/h2&gt;

&lt;p&gt;The publishing workflow in Power BI goes like this:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Build&lt;/strong&gt; the report in Power BI Desktop.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Publish&lt;/strong&gt; to a workspace in the Power BI Service.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Verify&lt;/strong&gt; that everything works in the browser.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Share&lt;/strong&gt; using one of the available methods (Publish to web, direct share, or export).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Embed&lt;/strong&gt; on a website or GitHub Pages if you want a standalone hosted version.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;I hope you enjoyed this guide. If you would like to learn more about the PowerBI project, the insights gained and the original data used, check out my GitHub repo that contains the project: &lt;a href="https://github.com/BrianMunene96/LuxDevHQ_Assignments/blob/main/README_PowerBI_Project.md" rel="noopener noreferrer"&gt;Electronics Sale Dashboard&lt;/a&gt;&lt;/p&gt;

</description>
      <category>data</category>
      <category>analytics</category>
      <category>datascience</category>
      <category>analyst</category>
    </item>
    <item>
      <title>Understanding Data Modeling in Power BI: Joins, Relationships, and Schemas Explained</title>
      <dc:creator>Brian Muriithi</dc:creator>
      <pubDate>Sun, 29 Mar 2026 21:21:50 +0000</pubDate>
      <link>https://dev.to/brian_muriithi/understanding-data-modeling-in-power-bi-joins-relationships-and-schemas-explained-5a33</link>
      <guid>https://dev.to/brian_muriithi/understanding-data-modeling-in-power-bi-joins-relationships-and-schemas-explained-5a33</guid>
      <description>&lt;p&gt;If you have ever opened Power BI, loaded a couple of tables, and then wondered why your numbers look wrong or your visuals are not filtering the way you expect, the answer is almost always the same: data modeling. It is literally the backbone of everything in Power BI, and getting it right early on saves you from a world of pain later.&lt;/p&gt;

&lt;h2&gt;
  
  
  What is Data Modeling?
&lt;/h2&gt;

&lt;p&gt;Data modeling is the process of organizing your data into a structure that Power BI can work with efficiently. It involves deciding which tables you need, how those tables connect to each other, and what role each table plays in your analysis.&lt;/p&gt;

&lt;p&gt;Think of it this way. If you dump all your data into one massive flat table, Power BI will technically work, but it will be slow, hard to maintain, and prone to producing incorrect calculations. A proper data model splits your data into logical, connected tables. This makes your reports faster, your DAX calculations simpler, and your results more reliable.&lt;/p&gt;

&lt;p&gt;A data model in Power BI consists of three main elements: tables (which hold your data from one or more sources), relationships (which define how those tables connect to each other), and measures (which are the calculations you write in DAX to analyze your data). Get the first two right, and the third becomes much easier.&lt;/p&gt;

&lt;h2&gt;
  
  
  Understanding Joins
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F39pq9ck09i6wsr80xgxo.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F39pq9ck09i6wsr80xgxo.jpg" alt="Joins visualized RadCad.com copyright" width="737" height="432"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Joins are operations that combine two tables based on matching values in one or more columns. Let us walk through each join type using a simple example. Imagine you have two tables:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Sales table&lt;/strong&gt; with columns: Date, CustomerID, Amount&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Customers table&lt;/strong&gt; with columns: CustomerID, CustomerName, City&lt;/p&gt;

&lt;p&gt;The &lt;code&gt;CustomerID&lt;/code&gt; column exists in both tables and serves as the &lt;code&gt;key&lt;/code&gt; that connects them.&lt;/p&gt;

&lt;h3&gt;
  
  
  Inner Join
&lt;/h3&gt;

&lt;p&gt;An inner join returns only the rows where both tables have a matching value. If a sale has a CustomerID that does not exist in the Customers table, that sale is dropped. If a customer has no matching sales, that customer is also dropped. You get the intersection of both tables and nothing else. This is best to use when you only care about records that exist in both tables. For example, showing only customers who have actually made a purchase.&lt;/p&gt;

&lt;h3&gt;
  
  
  Left Outer Join
&lt;/h3&gt;

&lt;p&gt;A left outer join returns all rows from the left (first) table and the matching rows from the right (second) table. If a row in the left table has no match in the right table, the right-side columns come back as null.&lt;/p&gt;

&lt;p&gt;When to use it: This is the most common join type. Use it when you want to keep all your primary records (like all sales) and just add supplementary information from another table (like customer names). You do not want to lose sales data just because a customer record is missing.&lt;/p&gt;

&lt;h3&gt;
  
  
  Right Outer Join
&lt;/h3&gt;

&lt;p&gt;A right outer join is the mirror of a left outer join. It returns all rows from the right table and the matching rows from the left table. Unmatched rows on the left side come back as null.&lt;/p&gt;

&lt;p&gt;When to use it: Less common than a left join since you can usually just swap the table order. But it is useful when you want to keep all records from a reference table (like all customers) even if they have no transactions yet.&lt;/p&gt;

&lt;h3&gt;
  
  
  Full Outer Join
&lt;/h3&gt;

&lt;p&gt;A full outer join returns all rows from both tables. Where there is a match, the columns are filled in from both sides. Where there is no match on either side, you get nulls.&lt;/p&gt;

&lt;p&gt;When to use it: When you need a complete picture of both datasets. For instance, reconciling two lists to see which records exist in one, the other, or both.&lt;/p&gt;

&lt;h3&gt;
  
  
  Left Anti Join
&lt;/h3&gt;

&lt;p&gt;A left anti join returns only the rows from the left table that have no matching rows in the right table. It is essentially the opposite of an inner join for the left side.&lt;/p&gt;

&lt;p&gt;When to use it: Finding orphaned records. For example, trying to see all sales where the CustomerID does not exist in the Customers table." This is great for data quality checks and identifying gaps in your data.&lt;/p&gt;

&lt;h3&gt;
  
  
  Right Anti Join
&lt;/h3&gt;

&lt;p&gt;A right anti join returns only the rows from the right table that have no matching rows in the left table. When to use it: The flip side of a left anti. Say you want to see all customers who have never made a purchase. Useful for things like identifying dormant customers or products that have never been sold.&lt;/p&gt;

&lt;h2&gt;
  
  
  How to Perform Joins in Power BI
&lt;/h2&gt;

&lt;p&gt;Joins in Power BI happen in Power Query, not in the model itself. Here is how to do it step by step:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;Open Power BI Desktop and load your tables. Go to &lt;strong&gt;Home &amp;gt; Transform Data&lt;/strong&gt; to open the Power Query Editor.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Select the table you want to be your left (primary) table from the Queries pane on the left.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Go to &lt;strong&gt;Home &amp;gt; Merge Queries&lt;/strong&gt;. You will see two options: "Merge Queries" (modifies the current query) and "Merge Queries as New" (creates a new query with the merged result). For most cases, "Merge Queries as New" is the safer choice since it preserves your original tables.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;In the Merge dialog, your selected table appears at the top as the left table. Use the dropdown to pick your right table.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Click the column in each table that serves as the join key. Power Query will highlight the selected columns.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;At the bottom of the dialog, choose your &lt;strong&gt;Join Kind&lt;/strong&gt; from the dropdown: Left Outer, Right Outer, Full Outer, Inner, Left Anti, or Right Anti.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Click OK. Power Query creates a new column containing the matched rows as nested tables.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Click the expand icon on the new column header to choose which columns from the right table you want to bring in.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;That is it. Power Query handles the join logic, and the resulting merged table flows into your data model.&lt;/p&gt;

&lt;h2&gt;
  
  
  Understanding Relationships in Power BI
&lt;/h2&gt;

&lt;p&gt;Joins and relationships are not the same thing, even though they both involve connecting tables. &lt;code&gt;Joins&lt;/code&gt; (in Power Query) physically combine two tables into one before the data reaches the model. The result is a single merged table. &lt;code&gt;Relationships&lt;/code&gt; (in the data model) keep tables separate but define a logical connection between them. Power BI uses these connections at query time to filter and aggregate data across tables without merging them together. This is the preferred approach for data modeling in Power BI because it keeps your model clean, efficient, and flexible.&lt;/p&gt;

&lt;h3&gt;
  
  
  Cardinality Types
&lt;/h3&gt;

&lt;p&gt;Cardinality describes how many rows on one side of a relationship match rows on the other side.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;One-to-Many (1:M)&lt;/strong&gt; is the most common and recommended type. One row in the lookup table matches many rows in the data table. For example, one customer can have many orders. The "one" side has unique values (like CustomerID in a Customers dimension table), and the "many" side can have duplicates (like CustomerID in a Sales fact table).&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fs6e3mnqogywhjuxbjhap.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fs6e3mnqogywhjuxbjhap.png" alt="One to many relationship" width="800" height="446"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Many-to-One (M:1)&lt;/strong&gt; is just the reverse direction of one-to-many. It is the same relationship, just described from the other table's perspective.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F841i498vb93ani3ny2ra.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F841i498vb93ani3ny2ra.png" alt="many to one relationship" width="800" height="436"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;One-to-One (1:1)&lt;/strong&gt; means each row in one table matches exactly one row in the other table. This is uncommon and often signals that the two tables could be combined into one. It sometimes shows up when you split a wide table for organizational reasons.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fn6bxyrvtcdbjawpzwqse.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fn6bxyrvtcdbjawpzwqse.png" alt="one to one relationship" width="800" height="436"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Many-to-Many (M:M)&lt;/strong&gt; means both columns can contain duplicate values. This is the trickiest type and should be used sparingly. It is sometimes necessary when you are working with tables at different levels of granularity. For example, if you have sales targets stored at the product category level but your product dimension is at the individual product level. &lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fi2s8co7pwtdk40faqqdv.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fi2s8co7pwtdk40faqqdv.png" alt="Many to many relationship" width="800" height="436"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Cross-Filter Direction
&lt;/h3&gt;

&lt;p&gt;Cross-filter direction controls how filters flow between related tables.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Single&lt;/strong&gt; means filters flow in one direction only, from the "one" side to the "many" side. This is the default and recommended setting for most relationships. When you filter on a dimension table (like selecting a specific customer), that filter flows through to the fact table (showing only that customer's sales). But filtering the fact table does not flow back to filter the dimension table.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fc4tj8rd8c8wctxx75k7l.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fc4tj8rd8c8wctxx75k7l.png" alt="single direction" width="800" height="400"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Both&lt;/strong&gt; (bidirectional) means filters flow in both directions. This can be useful in specific scenarios, like when you need a slicer on a fact table to filter a dimension. But bidirectional filtering can cause performance issues and create ambiguous filter paths, so use it only when you genuinely need it.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxf6frl2902eb5qv8z4kg.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxf6frl2902eb5qv8z4kg.png" alt="double direction" width="800" height="379"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Active vs. Inactive Relationships
&lt;/h3&gt;

&lt;p&gt;Power BI only allows one active relationship between any two tables at a time. In the Model view, active relationships show as solid lines, while inactive relationships appear as dashed lines.&lt;/p&gt;

&lt;p&gt;Why would you have an inactive relationship? The most common reason is role-playing dimensions. Suppose you have a Date table and a Sales table, but your Sales table has three date columns: OrderDate, ShipDate, and DeliveryDate. You can create three relationships from the Date table to the Sales table, but only one can be active at a time. The other two become inactive.&lt;/p&gt;

&lt;p&gt;To use an inactive relationship in a DAX calculation, you use the &lt;code&gt;USERELATIONSHIP()&lt;/code&gt; function:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="n"&gt;Total&lt;/span&gt; &lt;span class="n"&gt;Shipped&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;CALCULATE&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;SUM&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Sales&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;Amount&lt;/span&gt;&lt;span class="p"&gt;]),&lt;/span&gt; &lt;span class="n"&gt;USERELATIONSHIP&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Sales&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;ShipDate&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="nb"&gt;Date&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;Date&lt;/span&gt;&lt;span class="p"&gt;]))&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This tells Power BI to temporarily activate the ShipDate relationship for that specific calculation.&lt;/p&gt;

&lt;h2&gt;
  
  
  How to Create Relationships in Power BI
&lt;/h2&gt;

&lt;p&gt;There are several ways to create and manage relationships in Power BI Desktop:&lt;/p&gt;

&lt;h3&gt;
  
  
  Method 1: Drag and Drop in Model View
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;Click the &lt;strong&gt;Model view&lt;/strong&gt; icon on the left sidebar of Power BI Desktop (it looks like a diagram).&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;You will see all your tables displayed as cards with their columns listed.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;To create a relationship, simply click on a column in one table and drag it to the matching column in another table.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Power BI will auto-detect the cardinality and cross-filter direction. You can double-click the relationship line to edit these settings.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  Method 2: Manage Relationships Dialog
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;Go to the &lt;strong&gt;Modeling tab&lt;/strong&gt; in the ribbon and click &lt;strong&gt;Manage Relationships&lt;/strong&gt;.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;This opens a dialog that lists all existing relationships. You can see the tables, columns, cardinality, and whether each relationship is active.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Click &lt;strong&gt;New&lt;/strong&gt; to create a relationship. Select your two tables and the columns to join on. Set the cardinality and cross-filter direction.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Click &lt;strong&gt;OK&lt;/strong&gt; to save.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  Method 3: Autodetect
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;In the Manage Relationships dialog, click &lt;strong&gt;Autodetect&lt;/strong&gt;.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Power BI will scan your tables for columns with matching names and data and attempt to create relationships automatically.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;This is convenient but always review the results. Autodetect can create incorrect relationships if column names are coincidentally similar.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  Method 4: Model Explorer (Properties Pane)
&lt;/h3&gt;

&lt;p&gt;In the Model view, you can also right-click on the Relationships section in the Model Explorer pane and select &lt;strong&gt;New relationship&lt;/strong&gt;. This opens a streamlined Properties pane where you fill in the details. This approach generates fewer queries, which matters when working with large datasets or DirectQuery connections.&lt;/p&gt;

&lt;h2&gt;
  
  
  Fact Tables vs. Dimension Tables
&lt;/h2&gt;

&lt;p&gt;Understanding the difference between fact tables and dimension tables is fundamental to building a good data model.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Fact tables&lt;/strong&gt; store your transactional or event data. These are the tables with the numbers you want to analyze: sales amounts, order quantities, revenue figures, website clicks. Fact tables tend to be tall (many rows) and relatively narrow (fewer columns). Each row represents a single event or transaction. Fact tables contain foreign keys that link to dimension tables.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Dimension tables&lt;/strong&gt; store the descriptive context around your facts. They answer the "who, what, where, when" questions: customer names, product categories, store locations, dates. Dimension tables tend to be wide (many descriptive columns) and shorter (fewer rows compared to facts). Each row represents a unique entity, and the key column has unique values.&lt;/p&gt;

&lt;p&gt;A simple way to think about it: if you are summing, counting, or averaging a column, it probably belongs in a fact table. If you are filtering or grouping by a column, it probably belongs in a dimension table.&lt;/p&gt;

&lt;h2&gt;
  
  
  Schemas: Star vs Snowflake, and Flat Table
&lt;/h2&gt;

&lt;p&gt;A schema is the overall shape of your data model, meaning how your fact and dimension tables are arranged and connected.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fk4g98vv6fdqylutdena8.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fk4g98vv6fdqylutdena8.jpg" alt="Copyright Exaso: Star vs snowflake" width="800" height="533"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Star Schema
&lt;/h3&gt;

&lt;p&gt;The star schema is the gold standard for Power BI and the one Microsoft recommends. It gets its name from the way it looks: a central fact table surrounded by dimension tables, like a star.&lt;/p&gt;

&lt;p&gt;In a star schema, each dimension table connects directly to the fact table through a single relationship. The dimension tables are denormalized, meaning they contain all their descriptive attributes in a single table rather than splitting them into sub-tables.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Example:&lt;/strong&gt; A retail star schema might have a Sales fact table in the center, connected to Customer, Product, Date, and Store dimension tables. If you want to filter sales by product category, you go through the Product dimension. If you want to filter by month, you go through the Date dimension.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Why it works well in Power BI:&lt;/strong&gt; Star schemas minimize the number of joins needed for queries, which improves performance. They also make the model easy to understand: every dimension table has a one-to-many relationship with the fact table, filters flow naturally from dimensions to facts, and DAX calculations behave predictably.&lt;/p&gt;

&lt;h3&gt;
  
  
  Snowflake Schema
&lt;/h3&gt;

&lt;p&gt;A snowflake schema is a variation of the star schema where one or more dimension tables are normalized. This means a dimension table is broken into related sub-tables instead of keeping everything in one table.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Example:&lt;/strong&gt; Instead of a single Product dimension with columns for ProductName, Category, and SubCategory, you might have a Product table linked to a Category table linked to a SubCategory table. Each level is its own table.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;When it comes up:&lt;/strong&gt; Snowflake schemas often appear when you import data directly from a normalized relational database. The structure makes sense for storage efficiency in a database, but in Power BI, it adds complexity. More tables mean more relationships, more joins at query time, and more chances for filters to break or behave unexpectedly.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Recommendation:&lt;/strong&gt; If you encounter a snowflake schema in your source data, consider flattening the snowflaked dimensions in Power Query before loading them into the model. Merge those sub-tables back together so that each dimension is a single table. Your model will be simpler and will perform better.&lt;/p&gt;

&lt;h3&gt;
  
  
  Flat Table (Single Table / DLAT)
&lt;/h3&gt;

&lt;p&gt;A flat table model puts everything into one big table. Every column, both transactional data and descriptive attributes, are side by side. No relationships needed because there is only one table.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;When people use it:&lt;/strong&gt; This is very common for people coming from an Excel background, where everything lives on one worksheet. It works fine for very small datasets or quick one-off analyses.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Why it is problematic at scale:&lt;/strong&gt; Flat tables repeat descriptive data across every row (the customer name appears on every transaction for that customer), which bloats the file size. They also make DAX calculations harder to write and can lead to performance issues as the data grows. Power BI's engine is optimized for star schemas, not flat tables. If your "flat table" model starts misbehaving or slowing down, restructuring into a star schema is usually the fix.&lt;/p&gt;

&lt;h2&gt;
  
  
  Role-Playing Dimensions
&lt;/h2&gt;

&lt;p&gt;A role-playing dimension is a single dimension table that plays multiple roles in the model. The Date table is the classic example.&lt;/p&gt;

&lt;p&gt;Consider a Sales table with OrderDate, ShipDate, and DeliveryDate. All three columns relate to the same Date dimension, but each relationship represents a different analytical perspective. You might want to analyze sales by order date for revenue trends, by ship date for logistics, and by delivery date for customer experience metrics.&lt;/p&gt;

&lt;p&gt;In Power BI, you handle this by creating three relationships between the Date table and the Sales table (one for each date column). Only one can be active; the others are inactive. You then use &lt;code&gt;USERELATIONSHIP()&lt;/code&gt; in your DAX measures to activate the specific relationship you need for each calculation.&lt;/p&gt;

&lt;p&gt;In some situations, an analyst may prefer to create separate copies of the Date table (OrderDate table, ShipDate table, DeliveryDate table) to avoid dealing with inactive relationships entirely. Both approaches work. Separate tables are simpler for report consumers who are using slicers, while a single table with &lt;code&gt;USERELATIONSHIP()&lt;/code&gt; keeps the model more compact.&lt;/p&gt;

&lt;h2&gt;
  
  
  Common Data Modeling Issues (and How to Fix Them)
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Circular Dependencies
&lt;/h3&gt;

&lt;p&gt;This happens when your relationships create a loop. Table A connects to Table B, Table B connects to Table C, and Table C connects back to Table A. Power BI will flag this and prevent you from creating the relationship.&lt;/p&gt;

&lt;p&gt;Fix: Restructure your model to break the loop. Usually this means removing one of the relationships or rethinking which tables need to be directly connected.&lt;/p&gt;

&lt;h3&gt;
  
  
  Ambiguous Relationships
&lt;/h3&gt;

&lt;p&gt;When there are multiple paths between two tables, Power BI does not know which relationship path to use for filtering. This shows up as unexpected aggregation results or error messages.&lt;/p&gt;

&lt;p&gt;Fix: Ensure there is only one active path between any two tables. Deactivate extra relationships and use &lt;code&gt;USERELATIONSHIP()&lt;/code&gt; in DAX when you need the alternate path.&lt;/p&gt;

&lt;h3&gt;
  
  
  Many-to-Many Pitfalls
&lt;/h3&gt;

&lt;p&gt;Many-to-many relationships can produce inflated or incorrect totals because Power BI cannot determine a unique mapping between the tables.&lt;/p&gt;

&lt;p&gt;Fix: Introduce a bridge table (sometimes called a junction table) that breaks the many-to-many into two one-to-many relationships. This is cleaner and gives you more control over the results.&lt;/p&gt;

&lt;h3&gt;
  
  
  Bidirectional Filters Causing Issues
&lt;/h3&gt;

&lt;p&gt;Setting cross-filter direction to "Both" on multiple relationships can create performance problems and ambiguous filter paths.&lt;/p&gt;

&lt;p&gt;Fix: Default to single-direction filtering. Only switch to bidirectional when you have a specific use case that requires it, and test thoroughly to confirm the numbers are correct.&lt;/p&gt;

&lt;h3&gt;
  
  
  Missing or Incorrect Relationships
&lt;/h3&gt;

&lt;p&gt;If Power BI is not filtering your visuals as expected, check the Model view. A missing relationship between two tables means filters will not flow between them. An incorrect join key (like joining on the wrong column) will produce mismatched or blank results.&lt;/p&gt;

&lt;p&gt;Fix: Open Model view, verify all relationships exist and connect on the correct columns, and double-check the cardinality indicators (1 and * symbols on the relationship lines).&lt;/p&gt;

&lt;h3&gt;
  
  
  Conclusion
&lt;/h3&gt;

&lt;p&gt;Data modeling is one of those topics that does not always get the attention it deserves, but it is arguably the most important skill in Power BI. A report is only as good as the model behind it.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;I hope you enjoyed reading this and picked up some new concepts. Please leave a comment below if anything was unclear. If you would like to learn more about Analytics and Data Science, give me a follow&lt;/em&gt;&lt;/p&gt;

</description>
      <category>data</category>
      <category>datascience</category>
      <category>sql</category>
      <category>tutorial</category>
    </item>
    <item>
      <title>How Excel is Used in Real-World Data Analysis</title>
      <dc:creator>Brian Muriithi</dc:creator>
      <pubDate>Fri, 27 Mar 2026 22:05:20 +0000</pubDate>
      <link>https://dev.to/brian_muriithi/how-excel-is-used-in-real-world-data-analysis-je8</link>
      <guid>https://dev.to/brian_muriithi/how-excel-is-used-in-real-world-data-analysis-je8</guid>
      <description>&lt;p&gt;&lt;em&gt;A practical walkthrough from beginner to building a dashboard&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftdcivnxdeb9gqnqugpn3.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftdcivnxdeb9gqnqugpn3.png" alt="Final Dashboard" width="800" height="478"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;I have always known Excel as a spreadsheet application that I have mostly used to complete class assignments and do simple calculations. I have recently learnt that Excel is more than that. Excel is not just a spreadsheet application. It is a powerful tool for cleaning data, analyzing it, and presenting insights in a way that supports decision-making.&lt;/p&gt;

&lt;p&gt;As part of this learning journey, I worked on a real project: building a &lt;strong&gt;Jumia Product Performance Dashboard&lt;/strong&gt;. Throughout this article, I will use examples from this project to explain how Excel is used in real-world data analysis, from loading raw data to building an interactive dashboard.&lt;/p&gt;




&lt;h2&gt;
  
  
  Getting Started with Excel
&lt;/h2&gt;

&lt;p&gt;The first step was installing and opening Excel on desktop. Once launched, the interface introduces several important components:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Workbook&lt;/strong&gt;: The entire Excel file
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Worksheet (Sheet)&lt;/strong&gt;: Individual tabs within a workbook
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Rows and Columns&lt;/strong&gt;: Where data is stored
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Cells&lt;/strong&gt;: The intersection of rows and columns
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Ribbon&lt;/strong&gt;: Contains all Excel tools and features
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Formula Bar&lt;/strong&gt;: Used to enter and edit formulas
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Understanding this layout makes it easier to navigate Excel and perform tasks efficiently.&lt;/p&gt;




&lt;h2&gt;
  
  
  Loading Data into Excel
&lt;/h2&gt;

&lt;p&gt;One of the first practical things I learned was how to load external data. Excel can open CSV files, text files, and data from various other sources. The most straightforward way to load a CSV is to just open it directly in Excel through File &amp;gt; Open, or you can use the Data tab and click "From Text/CSV".&lt;/p&gt;

&lt;p&gt;When importing, Excel gives you a preview and lets you set things like the delimiter (comma, tab, semicolon), the data types for each column, and the encoding.&lt;/p&gt;

&lt;p&gt;For my project, I worked with a dataset of product listings from Jumia, one of the largest e-commerce platforms in Africa. The dataset had columns for product name, current price, old price, discount percentage, number of reviews, and customer ratings. Nothing too complicated on the surface, but as I quickly learned, raw data is almost never clean enough to analyze straight away.&lt;/p&gt;

&lt;h2&gt;
  
  
  Data Cleaning: The Part Nobody Talks About Enough
&lt;/h2&gt;

&lt;p&gt;This is where I spent most of my time, and I think that is true for most data work in general. The Jumia dataset had several issues that needed fixing before I could do anything meaningful with it.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Handling missing values.&lt;/strong&gt; Some products were missing reviews or ratings. In Excel, you can spot these by filtering columns or using functions like &lt;code&gt;COUNTBLANK()&lt;/code&gt; to count empty cells. Depending on the situation, you might fill in missing values with an average, remove the rows entirely, or leave them and account for it in your analysis. The &lt;strong&gt;Reviews&lt;/strong&gt; and &lt;strong&gt;Rating&lt;/strong&gt; columns had missing values. Instead of leaving them blank, I used:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Find and Replace (Ctrl + H)&lt;/strong&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;To replace:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Blank reviews → &lt;code&gt;"No Reviews"&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Blank ratings → &lt;code&gt;"No Rating"&lt;/code&gt; &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Removing duplicates.&lt;/strong&gt; The dataset had some duplicate product entries. Excel has a built-in "Remove Duplicates" feature under the Data tab that makes this pretty painless. You select the columns you want to check for duplicates, click the button, and Excel tells you how many duplicates it found and removed. I went from having some repeated products down to 109 unique entries.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Converting data types.&lt;/strong&gt; This was a big one. The price columns came in with "KSh" (Kenyan Shillings) as text stuck to the numbers. You cannot do math on text, so I needed to strip out "KSh". The Find and Replace tool (Ctrl+H) was a lifesaver again. I searched for "KSh" and replaced it with nothing. Then I changed the data type to &lt;code&gt;Currency&lt;/code&gt;.  &lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Cleaning the discount column.&lt;/strong&gt; The discount percentages also needed to be converted from text to numeric format. Again, Find and Replace came in handy, along with the &lt;code&gt;LEFT()&lt;/code&gt; and &lt;code&gt;RIGHT()&lt;/code&gt; functions for extracting parts of text strings.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Fixing the rating column.&lt;/strong&gt; Ratings were stored as text like "4.5 out of 5" instead of just the number 4.5. I used text functions to extract just the numeric portion.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Converting negative reviews.&lt;/strong&gt; Some review counts came through as negative numbers, which does not make sense for a count of reviews. The &lt;code&gt;ABS()&lt;/code&gt; function quickly converted those to positive values. Find and Replace can also be used to achieve the same.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The TRIM function&lt;/strong&gt; was another one I used a lot. &lt;code&gt;TRIM()&lt;/code&gt; removes extra spaces from text, which is one of those invisible problems that can cause formulas to break or lookups to fail. If a product name has a trailing space, Excel treats it as a different value than the same name without the space, which can mess up your analysis without you even realizing it.&lt;/p&gt;

&lt;h2&gt;
  
  
  Building New Columns: Data Enrichment
&lt;/h2&gt;

&lt;p&gt;Once the data was clean, the next step was to create new columns that would make the analysis richer.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Discount Amount&lt;/strong&gt; was straightforward. It is just the Old Price minus the Current Price, giving you the actual shilling value of the discount rather than just the percentage.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Rating Category&lt;/strong&gt; used an &lt;code&gt;IF&lt;/code&gt; formula to classify each product:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;"Poor" for ratings below 3&lt;/li&gt;
&lt;li&gt;"Average" for ratings between 3 and 4.4&lt;/li&gt;
&lt;li&gt;"Excellent" for ratings of 4.5 and above&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In Excel, that looks something like:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;=IF(H2&amp;lt;3,"Poor",IF(H2&amp;lt;4.5,"Average","Excellent"))
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Discount Category&lt;/strong&gt; followed the same logic:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;"Low Discount" for discounts below 20%&lt;/li&gt;
&lt;li&gt;"Medium Discount" for discounts between 20% and 40%&lt;/li&gt;
&lt;li&gt;"High Discount" for discounts above 40%&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;These new categories turned raw numbers into meaningful groupings that I could later use for pivot tables and dashboard filters.&lt;/p&gt;

&lt;h2&gt;
  
  
  Understanding Data Types (A Simple but Powerful Tip)
&lt;/h2&gt;

&lt;p&gt;One simple but very useful trick I learned is how to quickly identify data types in Excel:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Numbers are aligned to the right&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Text is aligned to the left&lt;/strong&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F86hu33rnj6bhrskt42ji.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F86hu33rnj6bhrskt42ji.png" alt="Notice how text data is left aligned while numbers are right aligned within columns" width="800" height="156"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This helps you immediately spot issues. For example, if a column that should contain numbers is left-aligned, it means Excel is treating it as text, which can break calculations.&lt;/p&gt;




&lt;h2&gt;
  
  
  Useful Functions I Picked Up Along the Way
&lt;/h2&gt;

&lt;p&gt;Beyond the cleaning functions, here are some of the Excel functions that came up repeatedly during this project:&lt;/p&gt;

&lt;p&gt;&lt;code&gt;AVERAGE()&lt;/code&gt; calculates the mean of a range. I used this to find that the average product rating was about 3.9 out of 5, the average discount was 36%, and the average current price was around KSh 1,210.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;COUNT()&lt;/code&gt; and &lt;code&gt;COUNTA()&lt;/code&gt; count the number of cells with data. &lt;code&gt;COUNT()&lt;/code&gt; works for numbers only, while &lt;code&gt;COUNTA()&lt;/code&gt; counts any non-empty cell.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;MAX()&lt;/code&gt; and &lt;code&gt;MIN()&lt;/code&gt; find the highest and lowest values. The most expensive product was a 32-piece Portable Cordless Drill Set at KSh 3,750, while the cheapest was a 3-piece Knitting Crochet Needle Set at just KSh 38.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;SUM()&lt;/code&gt; adds up values in a range. The total number of reviews across all 109 products was 723.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;COUNTIF()&lt;/code&gt; and &lt;code&gt;SUMIF()&lt;/code&gt; are conditional versions that only count or sum values that meet a specific condition. These were useful for breaking down totals by category.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;VLOOKUP()&lt;/code&gt; searches for a value in one column and returns a corresponding value from another column. While I did not use it heavily in this particular project, it is one of those functions that comes up constantly in real-world Excel work.&lt;/p&gt;

&lt;h2&gt;
  
  
  Pivot Tables: Where Excel Gets Really Powerful
&lt;/h2&gt;

&lt;p&gt;If there is one feature that changed how I think about Excel, it is pivot tables. A pivot table lets you summarize, reorganize, and analyze large datasets without writing complex formulas. You drag and drop fields into rows, columns, values, and filters, and Excel does the aggregation for you.&lt;/p&gt;

&lt;p&gt;For example, I created pivot tables that showed:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The count of products in each discount category (59 in High Discount, 32 in Medium, 18 in Low)&lt;/li&gt;
&lt;li&gt;The count of products in each rating category (23 Excellent, 22 Average, 12 Poor)&lt;/li&gt;
&lt;li&gt;The average rating and review count broken down by discount level&lt;/li&gt;
&lt;li&gt;The top 10 products by discount percentage and by number of reviews&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;To create a pivot table, you go to &lt;code&gt;Insert &amp;gt; PivotTable&lt;/code&gt;, select your data range, and choose where you want the table placed. Then you drag fields from the field list into the areas you need. It is surprisingly intuitive once you try it a couple of times.&lt;/p&gt;

&lt;p&gt;The pivot tables served as the data source for most of the charts on my dashboard, which is exactly how professionals use them. Instead of pointing a chart directly at raw data, you point it at a pivot table that already has the summary you need. This keeps things clean and makes updates easier.&lt;/p&gt;

&lt;h2&gt;
  
  
  Charts and Visualizations
&lt;/h2&gt;

&lt;p&gt;Excel offers a solid range of chart types, and choosing the right one matters for communicating your findings clearly.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Bar and column charts&lt;/strong&gt; work well for comparing values across categories. I used a bar chart to show the top 10 products by discount percentage, which made it easy to see that the 6-in-1 Bottle Can Opener and Creative Owl Shape items had the highest discounts at over 60%.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Horizontal bar charts&lt;/strong&gt; are great when product names are long (which they often are in e-commerce data). My "Top 10 Products by Number of Reviews" chart used this format so you could actually read the product names without them getting cut off. The 120W Cordless Vacuum Cleaner led the pack with 69 reviews. I used the same chart to plot of ratings versus number of reviews which showed that the highest-reviewed products (177 reviews) had a 4.6 rating, which is above average but not the highest. Meanwhile, products with a perfect 5.0 rating only had 12 reviews. So higher ratings do not automatically mean more reviews, and that is a useful insight to consider for product strategy.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Pie and donut charts&lt;/strong&gt; show proportions. I used pie charts to show product distribution across discount categories and rating categories. For example, 54% of products fell into the High Discount bucket, while only 17% were in Low Discount.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Line charts&lt;/strong&gt; are useful for showing trends. I used one to plot discount percentage against number of reviews, which revealed an interesting pattern: the relationship is not linear. The product with the most reviews (118) had a 49% discount, but there were also products with much lower discounts that still had strong engagement.&lt;/p&gt;

&lt;h2&gt;
  
  
  Building the Dashboard
&lt;/h2&gt;

&lt;p&gt;All of this work came together in the dashboard. A good dashboard gives someone a quick, high-level view of the data and lets them drill down when they want more detail.&lt;/p&gt;

&lt;p&gt;My Jumia Product Performance Dashboard has four KPI cards across the top:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Total Number of Products: 109&lt;/li&gt;
&lt;li&gt;Average Rating: 3.9&lt;/li&gt;
&lt;li&gt;Average Discount Percentage: 36%&lt;/li&gt;
&lt;li&gt;Total Number of Reviews: 723&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Below that, there are six charts covering the different angles of analysis, from top products by discount and reviews, to distribution charts showing how products spread across categories, to relationship charts exploring whether discounts and ratings actually drive customer engagement.&lt;/p&gt;

&lt;p&gt;I also added slicers for Rating Category and Discount Category. Slicers are visual filters that you can click to instantly filter the data across multiple charts. So if someone wants to see only the "Excellent" rated products, they click that slicer button and every chart connected to the underlying pivot tables updates. It makes the dashboard interactive without any coding.&lt;/p&gt;

&lt;h2&gt;
  
  
  Key Insights from the Analysis
&lt;/h2&gt;

&lt;p&gt;Working through this dataset taught me that numbers rarely tell a simple story. Here are a few things that stood out:&lt;/p&gt;

&lt;p&gt;The average discount is 36%, but more than half of all products (54%) have discounts above 40%. That suggests heavy discounting is the norm on the platform.&lt;/p&gt;

&lt;p&gt;Higher discounts do not automatically lead to more reviews. The product with the highest number of reviews had a 49% discount, which is above average, but there were plenty of products with steep discounts that had very few reviews. Other factors like product type, visibility, and quality probably matter more.&lt;/p&gt;

&lt;p&gt;Similarly, higher-rated products tend to get more reviews, but it is not a guarantee. Products with the most reviews (177) had a solid 4.6 rating, while products with a perfect 5.0 score only had 12 reviews. &lt;/p&gt;

&lt;p&gt;Products rated "Poor" (below 3) made up only about 11% of the catalog, which suggests either a baseline of quality or that poorly rated products get removed from the platform over time.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why Excel Still Matters
&lt;/h2&gt;

&lt;p&gt;After a week of working with Excel intensively, I can see why it remains a n important tool for analysts across industries. It is accessible. You do not need to set up a development environment or learn a programming language to start analyzing data. The visual, grid-based interface makes it easy to spot patterns and anomalies just by scrolling through your data.&lt;/p&gt;

&lt;p&gt;It is also surprisingly powerful. Between formulas, pivot tables, conditional formatting, and the charting tools, you can go from raw data to a polished dashboard without ever leaving the application. For small to medium datasets, this is often all you need.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fddujiw14z2voeaiwjupu.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fddujiw14z2voeaiwjupu.png" alt="Final Dashboard" width="800" height="478"&gt;&lt;/a&gt;&lt;/p&gt;

</description>
      <category>beginners</category>
      <category>tutorial</category>
      <category>data</category>
      <category>analytics</category>
    </item>
  </channel>
</rss>
