<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Kahuthu Muriuki </title>
    <description>The latest articles on DEV Community by Kahuthu Muriuki  (@muriuki_kahuthu_54).</description>
    <link>https://dev.to/muriuki_kahuthu_54</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3818823%2Fe1d7d0d0-ce24-426d-98f5-ea676dc849ba.png</url>
      <title>DEV Community: Kahuthu Muriuki </title>
      <link>https://dev.to/muriuki_kahuthu_54</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/muriuki_kahuthu_54"/>
    <language>en</language>
    <item>
      <title>SQL Functions You Will Actually Use in Data Work</title>
      <dc:creator>Kahuthu Muriuki </dc:creator>
      <pubDate>Sat, 18 Apr 2026 22:21:28 +0000</pubDate>
      <link>https://dev.to/muriuki_kahuthu_54/sql-functions-you-will-actually-use-in-data-work-24g1</link>
      <guid>https://dev.to/muriuki_kahuthu_54/sql-functions-you-will-actually-use-in-data-work-24g1</guid>
      <description>&lt;p&gt;Most SQL tutorials stop at SELECT, WHERE, and GROUP BY. That covers retrieval, but it does not cover the layer of work that happens between raw data and a meaningful result. In financial data environments — transaction records, reconciliation tables, KYC logs — the real analytical work depends on functions and operators that transform, filter, combine, and rank data before it becomes useful.&lt;/p&gt;

&lt;p&gt;This article covers six categories of SQL functionality that come up repeatedly in practice: row-level functions, date and time handling, string manipulation, joins, window functions, and set operators. Each section includes syntax and examples grounded in the kind of data you encounter in financial and operational contexts.&lt;/p&gt;




&lt;h2&gt;
  
  
  1. Row-Level Functions
&lt;/h2&gt;

&lt;p&gt;Row-level functions operate on individual records one at a time. They do not aggregate — they transform or evaluate each row in isolation.&lt;/p&gt;

&lt;p&gt;The most commonly used ones fall into three groups: conditional logic, null handling, and type conversion.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Conditional Logic — CASE&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt;
  &lt;span class="n"&gt;transaction_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;amount&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="k"&gt;CASE&lt;/span&gt;
    &lt;span class="k"&gt;WHEN&lt;/span&gt; &lt;span class="n"&gt;amount&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="mi"&gt;100000&lt;/span&gt; &lt;span class="k"&gt;THEN&lt;/span&gt; &lt;span class="s1"&gt;'High Value'&lt;/span&gt;
    &lt;span class="k"&gt;WHEN&lt;/span&gt; &lt;span class="n"&gt;amount&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="mi"&gt;10000&lt;/span&gt;  &lt;span class="k"&gt;THEN&lt;/span&gt; &lt;span class="s1"&gt;'Mid Range'&lt;/span&gt;
    &lt;span class="k"&gt;ELSE&lt;/span&gt; &lt;span class="s1"&gt;'Standard'&lt;/span&gt;
  &lt;span class="k"&gt;END&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;transaction_tier&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;transactions&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;CASE evaluates each row against a set of conditions and returns the first match. It works anywhere in a query — SELECT, WHERE, ORDER BY, and inside aggregate functions.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Null Handling — COALESCE and ISNULL&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt;
  &lt;span class="n"&gt;customer_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;COALESCE&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;phone_number&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;email&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'No contact on file'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;contact_detail&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;customers&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;COALESCE returns the first non-null value from a list. This is useful when records have multiple optional fields and you need to surface whichever one is populated. ISNULL (SQL Server) or IFNULL (MySQL) handles the simpler two-value version.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Type Conversion — CAST and CONVERT&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt;
  &lt;span class="k"&gt;CAST&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;account_balance&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="nb"&gt;DECIMAL&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;15&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;balance&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="k"&gt;CAST&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;transaction_date&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="nb"&gt;DATE&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;txn_date&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;accounts&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Data pulled from flat files or external systems often arrives with the wrong data type. CAST forces a column into the type you need before filtering or calculation.&lt;/p&gt;




&lt;h2&gt;
  
  
  2. Date and Time Functions
&lt;/h2&gt;

&lt;p&gt;Date logic is one of the areas where SQL gets used most heavily in financial and operational data work. Reporting periods, transaction timestamps, ageing calculations, and SLA tracking all depend on correct date handling.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Getting the Current Date&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;GETDATE&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;        &lt;span class="c1"&gt;-- SQL Server: returns current date and time&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="k"&gt;CURRENT_DATE&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;     &lt;span class="c1"&gt;-- Standard SQL / PostgreSQL: returns date only&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;NOW&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;            &lt;span class="c1"&gt;-- MySQL / PostgreSQL: returns date and time&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Extracting Parts of a Date&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt;
  &lt;span class="n"&gt;transaction_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="nb"&gt;YEAR&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;transaction_date&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;txn_year&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="k"&gt;MONTH&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;transaction_date&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;txn_month&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="k"&gt;DAY&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;transaction_date&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;   &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;txn_day&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;transactions&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In PostgreSQL, use EXTRACT:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="k"&gt;EXTRACT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;MONTH&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;transaction_date&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;txn_month&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;transactions&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Calculating the Difference Between Dates&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- SQL Server&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt;
  &lt;span class="n"&gt;loan_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;DATEDIFF&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;DAY&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;disbursement_date&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;repayment_date&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;days_to_repay&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;loan_records&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="c1"&gt;-- PostgreSQL&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt;
  &lt;span class="n"&gt;loan_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;repayment_date&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;disbursement_date&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;days_to_repay&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;loan_records&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Adding or Subtracting Time&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- SQL Server&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;DATEADD&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;MONTH&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;GETDATE&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;quarter_ahead&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="c1"&gt;-- PostgreSQL&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="k"&gt;CURRENT_DATE&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;INTERVAL&lt;/span&gt; &lt;span class="s1"&gt;'3 months'&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;quarter_ahead&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Formatting Dates for Display&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- SQL Server&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;FORMAT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;transaction_date&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'dd-MMM-yyyy'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;formatted_date&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;transactions&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="c1"&gt;-- MySQL&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;DATE_FORMAT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;transaction_date&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'%d-%b-%Y'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;formatted_date&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;transactions&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Date formatting matters when reports are consumed by non-technical audiences who expect dates in a specific regional format.&lt;/p&gt;




&lt;h2&gt;
  
  
  3. String Functions
&lt;/h2&gt;

&lt;p&gt;String functions clean, reshape, and extract text data. In practice, this comes up constantly — member names with inconsistent casing, account numbers with leading spaces, reference codes that need to be split or concatenated.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;UPPER, LOWER, and TRIM&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt;
  &lt;span class="k"&gt;UPPER&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;first_name&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;first_name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="k"&gt;LOWER&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;email&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;       &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;email&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="k"&gt;TRIM&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;account_ref&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;account_ref&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;members&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;TRIM removes leading and trailing spaces. LTRIM and RTRIM handle one side at a time.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;LEN and SUBSTRING&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt;
  &lt;span class="n"&gt;account_number&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;LEN&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;account_number&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;char_count&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="k"&gt;SUBSTRING&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;account_number&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;account_prefix&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;accounts&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;SUBSTRING(column, start, length) extracts a portion of a string. In MySQL the function is SUBSTR. This is useful for parsing structured codes — product categories embedded in reference numbers, branch identifiers in account strings, and similar patterns.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;REPLACE and CHARINDEX&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- Remove hyphens from ID numbers&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="k"&gt;REPLACE&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;id_number&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'-'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;''&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;clean_id&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;kyc_records&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="c1"&gt;-- Find position of a character&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;CHARINDEX&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'@'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;email&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;at_position&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;customers&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Concatenation&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- SQL Server / MySQL&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;CONCAT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;first_name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;' '&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;last_name&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;full_name&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;members&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="c1"&gt;-- SQL Server also supports the || operator in some versions&lt;/span&gt;
&lt;span class="c1"&gt;-- PostgreSQL uses ||&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;first_name&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="s1"&gt;' '&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="n"&gt;last_name&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;full_name&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;members&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;LIKE for Pattern Matching&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;transactions&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;reference_code&lt;/span&gt; &lt;span class="k"&gt;LIKE&lt;/span&gt; &lt;span class="s1"&gt;'TXN-%'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;%&lt;/code&gt; wildcard matches any sequence of characters. &lt;code&gt;_&lt;/code&gt; matches a single character. These are used heavily in compliance work where you are scanning transaction references or flagging records that match a particular naming pattern.&lt;/p&gt;




&lt;h2&gt;
  
  
  4. JOINs
&lt;/h2&gt;

&lt;p&gt;JOINs combine rows from two or more tables based on a related column. Understanding which join type to use determines whether you get matched records, all records from one side, or everything from both sides.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;INNER JOIN&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Returns only rows where the condition is met in both tables.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt;
  &lt;span class="k"&gt;c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;customer_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="k"&gt;c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;full_name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;t&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;transaction_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;t&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;amount&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;customers&lt;/span&gt; &lt;span class="k"&gt;c&lt;/span&gt;
&lt;span class="k"&gt;INNER&lt;/span&gt; &lt;span class="k"&gt;JOIN&lt;/span&gt; &lt;span class="n"&gt;transactions&lt;/span&gt; &lt;span class="n"&gt;t&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="k"&gt;c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;customer_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;t&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;customer_id&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This returns customers who have at least one transaction. Customers with no transaction history do not appear.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;LEFT JOIN&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Returns all rows from the left table, and matching rows from the right. Where there is no match, columns from the right table return NULL.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt;
  &lt;span class="k"&gt;c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;customer_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="k"&gt;c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;full_name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;t&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;transaction_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;t&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;amount&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;customers&lt;/span&gt; &lt;span class="k"&gt;c&lt;/span&gt;
&lt;span class="k"&gt;LEFT&lt;/span&gt; &lt;span class="k"&gt;JOIN&lt;/span&gt; &lt;span class="n"&gt;transactions&lt;/span&gt; &lt;span class="n"&gt;t&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="k"&gt;c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;customer_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;t&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;customer_id&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Use this when you need to identify records with no match — customers who have never transacted, accounts with no KYC record, loans with no repayment entries.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;RIGHT JOIN&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The mirror of LEFT JOIN. Returns all rows from the right table.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt;
  &lt;span class="k"&gt;c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;customer_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;t&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;transaction_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;t&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;amount&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;customers&lt;/span&gt; &lt;span class="k"&gt;c&lt;/span&gt;
&lt;span class="k"&gt;RIGHT&lt;/span&gt; &lt;span class="k"&gt;JOIN&lt;/span&gt; &lt;span class="n"&gt;transactions&lt;/span&gt; &lt;span class="n"&gt;t&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="k"&gt;c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;customer_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;t&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;customer_id&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In practice, RIGHT JOIN is less common because you can always rewrite it as a LEFT JOIN by swapping the table order.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;FULL OUTER JOIN&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Returns all rows from both tables. Where there is no match on either side, NULLs fill the gaps.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt;
  &lt;span class="k"&gt;c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;customer_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="k"&gt;c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;full_name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;t&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;transaction_id&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;customers&lt;/span&gt; &lt;span class="k"&gt;c&lt;/span&gt;
&lt;span class="k"&gt;FULL&lt;/span&gt; &lt;span class="k"&gt;OUTER&lt;/span&gt; &lt;span class="k"&gt;JOIN&lt;/span&gt; &lt;span class="n"&gt;transactions&lt;/span&gt; &lt;span class="n"&gt;t&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="k"&gt;c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;customer_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;t&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;customer_id&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Useful for reconciliation queries where you need to see both unmatched customers and unmatched transactions in one result set.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;CROSS JOIN&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Produces a Cartesian product — every row in the first table paired with every row in the second.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt;
  &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;product_name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;region_name&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;products&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;
&lt;span class="k"&gt;CROSS&lt;/span&gt; &lt;span class="k"&gt;JOIN&lt;/span&gt; &lt;span class="n"&gt;regions&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This is not something you use for retrieval in most contexts, but it is practical for generating combinations — pairing every product with every region to pre-populate a reporting matrix, for example.&lt;/p&gt;




&lt;h2&gt;
  
  
  5. Window Functions
&lt;/h2&gt;

&lt;p&gt;Window functions perform calculations across a set of rows that are related to the current row, without collapsing the result into a single aggregate value. The row count in your output stays the same — which is what makes window functions different from GROUP BY.&lt;/p&gt;

&lt;p&gt;The syntax always includes OVER(), which defines the window.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;ROW_NUMBER&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Assigns a unique sequential number to each row within a partition.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt;
  &lt;span class="n"&gt;customer_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;transaction_date&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;amount&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;ROW_NUMBER&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="n"&gt;OVER&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;PARTITION&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;customer_id&lt;/span&gt; &lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;transaction_date&lt;/span&gt; &lt;span class="k"&gt;DESC&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;txn_rank&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;transactions&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Where &lt;code&gt;txn_rank = 1&lt;/code&gt;, you have each customer's most recent transaction. This is a common pattern for pulling the latest record per entity.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;RANK and DENSE_RANK&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt;
  &lt;span class="n"&gt;agent_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;total_collections&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;RANK&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;       &lt;span class="n"&gt;OVER&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;total_collections&lt;/span&gt; &lt;span class="k"&gt;DESC&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;rank_with_gaps&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;DENSE_RANK&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="n"&gt;OVER&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;total_collections&lt;/span&gt; &lt;span class="k"&gt;DESC&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;rank_no_gaps&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;agent_performance&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;RANK leaves gaps after tied positions (1, 2, 2, 4). DENSE_RANK does not (1, 2, 2, 3). The choice depends on whether the gaps matter for how results are consumed.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;SUM, AVG, and COUNT as Window Functions&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt;
  &lt;span class="n"&gt;transaction_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;customer_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;amount&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="k"&gt;SUM&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;amount&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;OVER&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;PARTITION&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;customer_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;customer_total&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="k"&gt;AVG&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;amount&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;OVER&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;PARTITION&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;customer_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;customer_avg&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;transactions&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This keeps every transaction row visible while also showing the customer-level total and average alongside each record — something GROUP BY cannot do without a subquery.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Running Totals&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt;
  &lt;span class="n"&gt;transaction_date&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;amount&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="k"&gt;SUM&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;amount&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;OVER&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;transaction_date&lt;/span&gt; &lt;span class="k"&gt;ROWS&lt;/span&gt; &lt;span class="k"&gt;BETWEEN&lt;/span&gt; &lt;span class="n"&gt;UNBOUNDED&lt;/span&gt; &lt;span class="k"&gt;PRECEDING&lt;/span&gt; &lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="k"&gt;CURRENT&lt;/span&gt; &lt;span class="k"&gt;ROW&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;running_total&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;transactions&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Running totals built this way are cleaner than self-joins and easier to read in a query review.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;LAG and LEAD&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt;
  &lt;span class="n"&gt;transaction_date&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;amount&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;LAG&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;amount&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  &lt;span class="n"&gt;OVER&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;transaction_date&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;previous_txn_amount&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;LEAD&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;amount&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;OVER&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;transaction_date&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;next_txn_amount&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;transactions&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;LAG looks back at the previous row. LEAD looks ahead. Both are useful for period-over-period comparisons without needing a self-join.&lt;/p&gt;




&lt;h2&gt;
  
  
  6. SET Operators
&lt;/h2&gt;

&lt;p&gt;SET operators combine the results of two or more SELECT statements. They operate on result sets rather than individual tables, which makes them different from JOINs.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;UNION&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Combines two result sets and removes duplicate rows.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;customer_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;full_name&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;retail_customers&lt;/span&gt;
&lt;span class="k"&gt;UNION&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;customer_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;full_name&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;business_customers&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Both SELECT statements must return the same number of columns in the same order, with compatible data types.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;UNION ALL&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Same as UNION but keeps duplicates. Runs faster because there is no deduplication step.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;transaction_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;amount&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'Q1'&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;quarter&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;transactions_q1&lt;/span&gt;
&lt;span class="k"&gt;UNION&lt;/span&gt; &lt;span class="k"&gt;ALL&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;transaction_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;amount&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'Q2'&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;quarter&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;transactions_q2&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Use UNION ALL when you are certain duplicates are not an issue, or when the source tables are structured to avoid them — combining quarterly partitions into a full-year view, for example.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;INTERSECT&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Returns only rows that appear in both result sets.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;customer_id&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;savings_accounts&lt;/span&gt;
&lt;span class="k"&gt;INTERSECT&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;customer_id&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;loan_accounts&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This identifies customers who hold both products — useful for cross-sell analysis or eligibility filtering.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;EXCEPT (or MINUS in Oracle/MySQL)&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Returns rows from the first result set that do not appear in the second.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;customer_id&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;savings_accounts&lt;/span&gt;
&lt;span class="k"&gt;EXCEPT&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;customer_id&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;loan_accounts&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This surfaces savings account holders who do not have a loan — a segment you might target for a lending product campaign, or flag for a financial inclusion review.&lt;/p&gt;




&lt;h2&gt;
  
  
  Key Points to Take Away
&lt;/h2&gt;

&lt;p&gt;These six categories cover most of the transformation work that sits between a raw database and a finished analysis. A few things worth noting as you work with them:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Row-level functions and aggregates work at different layers.&lt;/strong&gt; CASE and COALESCE operate on each row individually. SUM and AVG collapse rows. Mixing them requires understanding whether your logic belongs in SELECT, WHERE, or HAVING.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Date functions are not portable across databases.&lt;/strong&gt; GETDATE() is SQL Server. NOW() is MySQL/PostgreSQL. CURRENT_DATE is ANSI standard. If your queries move between environments, this is where they will break first.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Window functions do not filter rows — they add columns.&lt;/strong&gt; If you want to use a window function result as a filter condition, wrap it in a subquery or CTE. You cannot reference a window function alias directly in a WHERE clause.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;JOIN type choice affects row count.&lt;/strong&gt; An INNER JOIN on a one-to-many relationship multiplies rows. A LEFT JOIN on a table with nulls keeps records you might not expect. Test against a small known dataset before running against production volumes.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;UNION vs UNION ALL is a performance decision as much as a logic one.&lt;/strong&gt; Deduplication has a cost. Where duplicates are structurally impossible — different source tables, different time periods — UNION ALL is the right default.&lt;/p&gt;

&lt;p&gt;Understanding how these functions interact gives you the control to move from data retrieval into actual data work.&lt;/p&gt;

</description>
      <category>beginners</category>
      <category>database</category>
      <category>sql</category>
    </item>
    <item>
      <title>SQL Fundamentals: DDL, DML, and Practical Data Manipulation</title>
      <dc:creator>Kahuthu Muriuki </dc:creator>
      <pubDate>Tue, 14 Apr 2026 13:59:29 +0000</pubDate>
      <link>https://dev.to/muriuki_kahuthu_54/sql-fundamentals-ddl-dml-and-practical-data-manipulation-3bic</link>
      <guid>https://dev.to/muriuki_kahuthu_54/sql-fundamentals-ddl-dml-and-practical-data-manipulation-3bic</guid>
      <description>&lt;p&gt;I have spent a good portion of my career dealing with structured data — member records at SACCOs, trade logs at a securities brokerage, and KYC verification tables on payments platforms. What all of these have in common is that they sit in relational databases, and the language you use to build, fill, query, and maintain those databases is SQL.&lt;/p&gt;

&lt;p&gt;This week's assignment brought me back to the basics: DDL, DML, filtering, and conditional logic. Here is what I worked through and how it connects to the kind of data work I do day to day.&lt;/p&gt;




&lt;h2&gt;
  
  
  DDL and DML — Two Sides of the Same Coin
&lt;/h2&gt;

&lt;p&gt;SQL commands fall into categories depending on what they act on. The two that matter most when you are starting out are DDL and DML.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;DDL (Data Definition Language)&lt;/strong&gt; is about structure. It deals with the skeleton of your database — tables, columns, data types, constraints. The main DDL commands are &lt;code&gt;CREATE&lt;/code&gt;, &lt;code&gt;ALTER&lt;/code&gt;, and &lt;code&gt;DROP&lt;/code&gt;. When you run a DDL statement, you are changing what the database looks like, not what is stored inside it. Think of it as drawing up the blueprint for a filing cabinet before you start stuffing folders into it.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;DML (Data Manipulation Language)&lt;/strong&gt; is about the actual records. Once DDL has set up the structure, DML is how you put data in, pull data out, change it, or remove it. The core DML commands are &lt;code&gt;INSERT&lt;/code&gt;, &lt;code&gt;UPDATE&lt;/code&gt;, &lt;code&gt;DELETE&lt;/code&gt;, and &lt;code&gt;SELECT&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;The distinction matters because mixing them up causes real problems. I have seen junior analysts on a team attempt to &lt;code&gt;INSERT&lt;/code&gt; into a table that did not exist yet — they skipped the DDL step entirely. Structure first, data second. That order never changes.&lt;/p&gt;




&lt;h2&gt;
  
  
  How I Used CREATE, INSERT, UPDATE, and DELETE
&lt;/h2&gt;

&lt;h3&gt;
  
  
  CREATE
&lt;/h3&gt;

&lt;p&gt;For the assignment, I started by defining the tables I would need. In my case, I set up a &lt;code&gt;members&lt;/code&gt; table, a &lt;code&gt;loan_products&lt;/code&gt; table, and a &lt;code&gt;transactions&lt;/code&gt; table — modelled loosely on SACCO operations I have worked with before.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;members&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;member_id&lt;/span&gt; &lt;span class="nb"&gt;INT&lt;/span&gt; &lt;span class="k"&gt;PRIMARY&lt;/span&gt; &lt;span class="k"&gt;KEY&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;first_name&lt;/span&gt; &lt;span class="nb"&gt;VARCHAR&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;50&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="n"&gt;last_name&lt;/span&gt; &lt;span class="nb"&gt;VARCHAR&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;50&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="n"&gt;gender&lt;/span&gt; &lt;span class="nb"&gt;CHAR&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="n"&gt;date_of_birth&lt;/span&gt; &lt;span class="nb"&gt;DATE&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;branch&lt;/span&gt; &lt;span class="nb"&gt;VARCHAR&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;50&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="n"&gt;kyc_status&lt;/span&gt; &lt;span class="nb"&gt;VARCHAR&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;20&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="n"&gt;date_joined&lt;/span&gt; &lt;span class="nb"&gt;DATE&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Every column has a defined data type, and &lt;code&gt;member_id&lt;/code&gt; is set as the primary key so each record is uniquely identifiable. This is the same logic I followed when modelling dimension tables in Power BI — the grain of the table has to be clear from the start.&lt;/p&gt;

&lt;h3&gt;
  
  
  INSERT
&lt;/h3&gt;

&lt;p&gt;With the table in place, I populated it with records.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;INSERT&lt;/span&gt; &lt;span class="k"&gt;INTO&lt;/span&gt; &lt;span class="n"&gt;members&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;member_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;first_name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;last_name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;gender&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;date_of_birth&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;branch&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;kyc_status&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;date_joined&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;VALUES&lt;/span&gt;
&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'Amina'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'Wanjiku'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'F'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'1990-03-12'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'Westlands'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'Verified'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'2019-06-15'&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'Brian'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'Ochieng'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'M'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'1987-07-25'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'Mombasa CBD'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'Verified'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'2018-01-10'&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'Cynthia'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'Mutua'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'F'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'1992-11-05'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'Kisumu Town'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'Pending'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'2021-04-22'&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'David'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'Kamau'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'M'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'1985-02-18'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'Westlands'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'Verified'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'2017-09-03'&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'Esther'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'Akinyi'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'F'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'1995-06-30'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'Nakuru East'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'Rejected'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'2022-08-11'&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;6&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'Felix'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'Otieno'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'M'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'1988-09-14'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'Eldoret Central'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'Verified'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'2020-02-28'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In a production environment I would be loading this from a CSV or an ETL pipeline, but the logic is the same — each &lt;code&gt;INSERT&lt;/code&gt; statement maps values to the columns defined in the &lt;code&gt;CREATE&lt;/code&gt; step.&lt;/p&gt;

&lt;h3&gt;
  
  
  UPDATE
&lt;/h3&gt;

&lt;p&gt;Records change. A member moves branches, a KYC status gets resolved, a transaction amount needs correction. &lt;code&gt;UPDATE&lt;/code&gt; handles that.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;UPDATE&lt;/span&gt; &lt;span class="n"&gt;members&lt;/span&gt;
&lt;span class="k"&gt;SET&lt;/span&gt; &lt;span class="n"&gt;branch&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'Kilimani'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;kyc_status&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'Verified'&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;member_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This corrects Cynthia's branch and moves her KYC status from 'Pending' to 'Verified'. The &lt;code&gt;WHERE&lt;/code&gt; clause is critical here — run an &lt;code&gt;UPDATE&lt;/code&gt; without it and you overwrite every row in the table. I have seen that happen on a live SACCO database. It is not a mistake you make twice.&lt;/p&gt;

&lt;h3&gt;
  
  
  DELETE
&lt;/h3&gt;

&lt;p&gt;Sometimes records need to go. A duplicate entry, a test row left in production, or a member account that was created in error.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;DELETE&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;members&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;member_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Same rule applies: always use &lt;code&gt;WHERE&lt;/code&gt; with &lt;code&gt;DELETE&lt;/code&gt;. Omitting it wipes the entire table clean.&lt;/p&gt;




&lt;h2&gt;
  
  
  Filtering with WHERE
&lt;/h2&gt;

&lt;p&gt;The &lt;code&gt;WHERE&lt;/code&gt; clause is how you narrow down results to only the rows that matter. Without it, every &lt;code&gt;SELECT&lt;/code&gt;, &lt;code&gt;UPDATE&lt;/code&gt;, and &lt;code&gt;DELETE&lt;/code&gt; hits the full table.&lt;/p&gt;

&lt;p&gt;Here are the operators I used most in the assignment:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Equality (&lt;code&gt;=&lt;/code&gt;)&lt;/strong&gt; — straightforward exact match.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;members&lt;/span&gt; &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;branch&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'Westlands'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This returns only members registered at the Westlands branch.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Greater than (&lt;code&gt;&amp;gt;&lt;/code&gt;)&lt;/strong&gt; — useful for numerical or date comparisons.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;transactions&lt;/span&gt; &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;amount&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;50000&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In a SACCO context, this pulls transactions above KES 50,000 — the kind of threshold that triggers additional AML checks.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;BETWEEN&lt;/strong&gt; — filters within a range, inclusive on both ends.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;transactions&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;transaction_date&lt;/span&gt; &lt;span class="k"&gt;BETWEEN&lt;/span&gt; &lt;span class="s1"&gt;'2024-01-01'&lt;/span&gt; &lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="s1"&gt;'2024-03-31'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;I used this to isolate Q1 2024 transactions. Date range filtering comes up constantly in financial reporting.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;IN&lt;/strong&gt; — matches against a list of values.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;members&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;branch&lt;/span&gt; &lt;span class="k"&gt;IN&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'Westlands'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'Mombasa CBD'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'Kisumu Town'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Cleaner than writing three separate &lt;code&gt;OR&lt;/code&gt; conditions. I use &lt;code&gt;IN&lt;/code&gt; a lot when pulling data for specific branches or account types.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;LIKE&lt;/strong&gt; — pattern matching on text fields.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;members&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;last_name&lt;/span&gt; &lt;span class="k"&gt;LIKE&lt;/span&gt; &lt;span class="s1"&gt;'W%'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;%&lt;/code&gt; wildcard matches any sequence of characters. This finds all members whose last name starts with 'W'. The &lt;code&gt;_&lt;/code&gt; wildcard matches a single character if you need more precision.&lt;/p&gt;

&lt;p&gt;All of these operators can be combined with &lt;code&gt;AND&lt;/code&gt; and &lt;code&gt;OR&lt;/code&gt; to build more specific filters. The key thing is that &lt;code&gt;WHERE&lt;/code&gt; keeps your queries targeted — you get back what you need and nothing more.&lt;/p&gt;




&lt;h2&gt;
  
  
  Using CASE WHEN to Transform Data
&lt;/h2&gt;

&lt;p&gt;Raw data is rarely presentation-ready. &lt;code&gt;CASE WHEN&lt;/code&gt; lets you apply conditional logic inside a query, similar to nested IF statements in Excel but running directly on the database.&lt;/p&gt;

&lt;p&gt;In the assignment, I used it to classify transaction amounts into risk tiers:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt;
    &lt;span class="n"&gt;member_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;amount&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="k"&gt;CASE&lt;/span&gt;
        &lt;span class="k"&gt;WHEN&lt;/span&gt; &lt;span class="n"&gt;amount&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="mi"&gt;100000&lt;/span&gt; &lt;span class="k"&gt;THEN&lt;/span&gt; &lt;span class="s1"&gt;'High Value'&lt;/span&gt;
        &lt;span class="k"&gt;WHEN&lt;/span&gt; &lt;span class="n"&gt;amount&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="mi"&gt;50000&lt;/span&gt; &lt;span class="k"&gt;THEN&lt;/span&gt; &lt;span class="s1"&gt;'Medium Value'&lt;/span&gt;
        &lt;span class="k"&gt;WHEN&lt;/span&gt; &lt;span class="n"&gt;amount&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="mi"&gt;10000&lt;/span&gt; &lt;span class="k"&gt;THEN&lt;/span&gt; &lt;span class="s1"&gt;'Standard'&lt;/span&gt;
        &lt;span class="k"&gt;ELSE&lt;/span&gt; &lt;span class="s1"&gt;'Micro'&lt;/span&gt;
    &lt;span class="k"&gt;END&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;risk_tier&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;transactions&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This adds a computed column called &lt;code&gt;risk_tier&lt;/code&gt; to the result set without changing the underlying table. In compliance work, this kind of classification feeds directly into suspicious transaction reports — you tag records by threshold, then route them for review.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;CASE WHEN&lt;/code&gt; evaluates top to bottom and stops at the first match, so the order of conditions matters. A transaction of KES 120,000 hits the first condition and gets labelled 'High Value' — it does not fall through to the next one.&lt;/p&gt;

&lt;p&gt;I also used it to flag KYC status into a binary ready/not-ready indicator:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt;
    &lt;span class="n"&gt;first_name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;last_name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;kyc_status&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="k"&gt;CASE&lt;/span&gt;
        &lt;span class="k"&gt;WHEN&lt;/span&gt; &lt;span class="n"&gt;kyc_status&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'Verified'&lt;/span&gt; &lt;span class="k"&gt;THEN&lt;/span&gt; &lt;span class="s1"&gt;'Active'&lt;/span&gt;
        &lt;span class="k"&gt;ELSE&lt;/span&gt; &lt;span class="s1"&gt;'Restricted'&lt;/span&gt;
    &lt;span class="k"&gt;END&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;account_status&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;members&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This is the kind of derived column I would build into a Power BI data model as a calculated column — except here it runs at the SQL layer before the data even reaches the BI tool.&lt;/p&gt;




&lt;h2&gt;
  
  
  Reflection
&lt;/h2&gt;

&lt;p&gt;The SQL covered this week is not new to me conceptually — I have written plenty of queries in the course of my work. But going back to the fundamentals and writing each statement out deliberately forced me to think about things I normally take for granted.&lt;/p&gt;

&lt;p&gt;What caught my attention most was how much of my day-to-day data work maps directly to these four operations. Every member onboarding is an &lt;code&gt;INSERT&lt;/code&gt;. Every KYC status update is an &lt;code&gt;UPDATE&lt;/code&gt;. Every duplicate cleanup is a &lt;code&gt;DELETE&lt;/code&gt;. And every report starts with a &lt;code&gt;SELECT&lt;/code&gt; and a &lt;code&gt;WHERE&lt;/code&gt; clause.&lt;/p&gt;

&lt;p&gt;The part I found most useful was working through &lt;code&gt;CASE WHEN&lt;/code&gt; with financial thresholds. In my compliance work, transaction classification is not optional — regulators expect it. Writing those conditions as SQL rather than handling them in Excel or Power Query is cleaner and more auditable.&lt;/p&gt;

&lt;p&gt;One thing that tripped me up briefly was column aliasing with &lt;code&gt;AS&lt;/code&gt; inside &lt;code&gt;CASE WHEN&lt;/code&gt; — I initially placed the alias in the wrong position, outside the &lt;code&gt;END&lt;/code&gt; keyword. Small syntax issue, but it threw an error that took a minute to track down. SQL is unforgiving about placement, and that is a good discipline to have when you are writing queries that run against production data.&lt;/p&gt;

&lt;p&gt;Overall, a solid week. The fundamentals are the foundation — everything from joins to window functions to stored procedures builds on top of &lt;code&gt;CREATE&lt;/code&gt;, &lt;code&gt;INSERT&lt;/code&gt;, &lt;code&gt;SELECT&lt;/code&gt;, and &lt;code&gt;WHERE&lt;/code&gt;.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Lawrence Kahuthu Muriuki — Data &amp;amp; Finance Professional, Nairobi&lt;/em&gt;&lt;/p&gt;

</description>
      <category>database</category>
      <category>sql</category>
      <category>analytics</category>
      <category>data</category>
    </item>
    <item>
      <title>How to Publish a Power BI Report and Embed It in a Website</title>
      <dc:creator>Kahuthu Muriuki </dc:creator>
      <pubDate>Sun, 05 Apr 2026 18:55:24 +0000</pubDate>
      <link>https://dev.to/muriuki_kahuthu_54/how-to-publish-a-power-bi-report-and-embed-it-in-a-website-1ion</link>
      <guid>https://dev.to/muriuki_kahuthu_54/how-to-publish-a-power-bi-report-and-embed-it-in-a-website-1ion</guid>
      <description>&lt;p&gt;Working with data professionally means that building a report is only half the job. The other half is making sure the right people can actually see it — without having to log into a dedicated analytics platform every time they need an update. That is where publishing and embedding come in.&lt;/p&gt;

&lt;p&gt;In my work across data-heavy environments, I have found that one of the most practical things you can do with a finished Power BI report is push it to the web. Whether you are sharing dashboards with a broader team or surfacing insights on an internal site, the steps are straightforward once you know the flow. This guide walks through the full process — from setting up a workspace to getting your report live on a webpage.&lt;/p&gt;




&lt;h2&gt;
  
  
  What Is Power BI?
&lt;/h2&gt;

&lt;p&gt;Power BI is a data analytics tool that takes raw data and turns it into interactive reports and visual dashboards. It connects to a wide range of data sources and gives you the ability to explore trends, track KPIs, and present findings in a format that non-technical stakeholders can actually engage with. It sits somewhere between a reporting tool and a full business intelligence platform — practical enough for day-to-day analysis, and capable enough for more complex data modelling work.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Publishing and Embedding Process at a Glance
&lt;/h2&gt;

&lt;p&gt;The overall flow has four stages:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Creating a workspace in the Power BI service&lt;/li&gt;
&lt;li&gt;Uploading and publishing your report from Power BI Desktop&lt;/li&gt;
&lt;li&gt;Generating an embed code from the published report&lt;/li&gt;
&lt;li&gt;Pasting that code into an HTML file to display the report on a website&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Let us go through each stage step by step.&lt;/p&gt;




&lt;h2&gt;
  
  
  1. Creating a Workspace
&lt;/h2&gt;

&lt;p&gt;A workspace in Power BI is a shared environment where you organise and host your reports, dashboards, and datasets. Think of it as a project folder in the cloud — one that can be accessed and managed by multiple team members.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Step 1:&lt;/strong&gt; Open your browser and navigate to &lt;a href="https://app.powerbi.com/home?experience=power-bi" rel="noopener noreferrer"&gt;https://app.powerbi.com&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Step 2:&lt;/strong&gt; Sign in with your account credentials.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fouci4eyhf7zqho52k8sy.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fouci4eyhf7zqho52k8sy.png" alt="Sign in to Power BI" width="605" height="462"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Step 3:&lt;/strong&gt; Once logged in, find the left-hand navigation panel and click on &lt;strong&gt;Workspaces&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Felq5etye5tzkfjvlbmu0.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Felq5etye5tzkfjvlbmu0.png" alt=" " width="58" height="611"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Step 4:&lt;/strong&gt; Click &lt;strong&gt;+ New Workspace&lt;/strong&gt; to begin creating your workspace.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjyf6omu2tp4k5kzyqwqf.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjyf6omu2tp4k5kzyqwqf.png" alt="Create a new workspace" width="401" height="143"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Step 5:&lt;/strong&gt; A side panel will appear. Enter a name and a brief description for your workspace, then click &lt;strong&gt;Apply&lt;/strong&gt; to save it. Give it a name that clearly reflects the project or report it will host — this saves confusion later, especially if you manage multiple workspaces.&lt;/p&gt;




&lt;h2&gt;
  
  
  2. Uploading and Publishing Your Report
&lt;/h2&gt;

&lt;p&gt;With your workspace ready, the next step is getting your report from your local machine into the Power BI service.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Step 1:&lt;/strong&gt; Open the Power BI Desktop file (&lt;code&gt;.pbix&lt;/code&gt;) containing your report on your computer.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Step 2:&lt;/strong&gt; At the top right of the Power BI Desktop interface, click the &lt;strong&gt;Publish&lt;/strong&gt; button.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fl8h8awwizs6ggopfiqbv.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fl8h8awwizs6ggopfiqbv.png" alt="Publish button in Power BI Desktop" width="89" height="135"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Step 3:&lt;/strong&gt; A dialog box will appear prompting you to choose a destination. Select the workspace you just created.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqs27rxtr3d44tuprsnk3.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqs27rxtr3d44tuprsnk3.png" alt=" " width="714" height="533"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Once the upload finishes, your report becomes accessible through the Power BI service. You will see a confirmation screen indicating the publish was successful.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxur482z858nf5kpjgr5r.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxur482z858nf5kpjgr5r.png" alt=" " width="612" height="352"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  3. Generating the Embed Code
&lt;/h2&gt;

&lt;p&gt;Power BI gives you a way to embed reports directly into external web pages using an iframe snippet. Here is how to get that code.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Step 1:&lt;/strong&gt; Open the published report in the Power BI service (navigate to your workspace and click on the report).&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fi71tbycx3twthhr5fdrr.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fi71tbycx3twthhr5fdrr.png" alt=" " width="800" height="85"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Step 2:&lt;/strong&gt; In the top menu, click &lt;strong&gt;File&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjfwvs5yp9tba5jn8dapq.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjfwvs5yp9tba5jn8dapq.png" alt=" " width="800" height="463"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Step 3:&lt;/strong&gt; Hover over &lt;strong&gt;Embed report&lt;/strong&gt;, then click &lt;strong&gt;Website or portal&lt;/strong&gt; from the submenu that appears.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fujieeiw5ak8ndapx4amt.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fujieeiw5ak8ndapx4amt.png" alt="Embed report option" width="596" height="495"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Step 4:&lt;/strong&gt; A pop-up will display your embed code — an iframe snippet that you can copy directly.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8taoqqo6m9pbxwtmwhfv.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8taoqqo6m9pbxwtmwhfv.png" alt=" " width="700" height="361"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Copy that code. You will need it in the next step. It is worth noting that this option makes your report publicly accessible to anyone with the link, so be deliberate about what data you expose this way.&lt;/p&gt;




&lt;h2&gt;
  
  
  4. Embedding the Report on a Website
&lt;/h2&gt;

&lt;p&gt;With the embed code in hand, the final step is placing it inside an HTML file so that the report renders in a browser.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Step 1:&lt;/strong&gt; Visit &lt;a href="https://www.w3schools.com/html/" rel="noopener noreferrer"&gt;https://www.w3schools.com/html/&lt;/a&gt; and copy one of the basic HTML page templates to use as your starting structure.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdf49lztysbtu77uvm479.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdf49lztysbtu77uvm479.png" alt="HTML template reference" width="456" height="380"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Step 2:&lt;/strong&gt; Open a text editor of your choice. On your desktop, create a new folder and give it a name — this will hold your HTML file.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqgk0g2cc8o3k72r190c8.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqgk0g2cc8o3k72r190c8.png" alt="Setting up the project folder in a text editor" width="508" height="707"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Step 3:&lt;/strong&gt; In the text editor, open that folder and create a new file with a &lt;code&gt;.html&lt;/code&gt; extension. Paste the HTML template you copied from the reference site into the file.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9j100p4fupyb6fcw2z52.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9j100p4fupyb6fcw2z52.png" alt="Creating the HTML file" width="800" height="528"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Step 4:&lt;/strong&gt; Inside the HTML file, locate the body section and replace the placeholder content with your Power BI embed code. Save the file.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Step 5:&lt;/strong&gt; Navigate to the folder you created on your desktop and open the HTML file in your browser.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6jt2o00fa9lu2anuwyl0.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6jt2o00fa9lu2anuwyl0.png" alt="Report loading in browser" width="800" height="369"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Step 6:&lt;/strong&gt; Sign in when prompted. Once authenticated, your Power BI report will render directly on the page.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fti76r14cttta4j21re9x.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fti76r14cttta4j21re9x.png" alt=" " width="800" height="448"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  Key Insights
&lt;/h2&gt;

&lt;p&gt;A few things worth keeping in mind as you work through this process:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The "Publish to web" option is public by default.&lt;/strong&gt; When you use the &lt;em&gt;Website or portal&lt;/em&gt; embed path, the report is visible to anyone who accesses that page — no login required on the viewer's end. This works fine for general-purpose dashboards, but if your report contains sensitive data (financial records, client information, operational metrics), consider authenticated embed options instead, which require a Power BI Pro or Premium licence.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Workspaces determine access and governance.&lt;/strong&gt; The workspace you publish to controls who can view, edit, or manage the report within the Power BI service. Set up workspace access roles deliberately before sharing.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The embed code is tied to the published report.&lt;/strong&gt; If you update and republish the report, the embed code typically remains valid and reflects the latest version automatically. You do not need to regenerate it every time you make changes to the underlying report.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;This is a starting point, not the only approach.&lt;/strong&gt; The method covered here — using a static HTML file — is the quickest way to test that your embed works. In a production environment, you would drop the same iframe code into your CMS, website template, or internal portal instead. The embed behaviour is the same regardless of where it sits.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Data currency depends on your refresh schedule.&lt;/strong&gt; The report on your website reflects whatever data was loaded the last time Power BI refreshed the dataset. If you need the embedded report to stay current, configure a scheduled refresh in the Power BI service settings for your dataset.&lt;/p&gt;




&lt;p&gt;Publishing a Power BI report and embedding it on a web page is a practical skill that closes the gap between analysis and audience. The built-in tooling handles most of the heavy lifting — the main thing you are doing is connecting the right pieces in the right order.&lt;/p&gt;

</description>
      <category>analytics</category>
      <category>microsoft</category>
      <category>tutorial</category>
      <category>webdev</category>
    </item>
    <item>
      <title>Understanding Data Modeling in Power BI: Joins, Relationships, and Schemas Explained.</title>
      <dc:creator>Kahuthu Muriuki </dc:creator>
      <pubDate>Sun, 05 Apr 2026 10:52:06 +0000</pubDate>
      <link>https://dev.to/muriuki_kahuthu_54/understanding-data-modeling-in-power-bi-joins-relationships-and-schemas-explained-4c5b</link>
      <guid>https://dev.to/muriuki_kahuthu_54/understanding-data-modeling-in-power-bi-joins-relationships-and-schemas-explained-4c5b</guid>
      <description>&lt;h2&gt;
  
  
  Introduction
&lt;/h2&gt;

&lt;p&gt;Working across fintech environments — raw data rarely comes in a shape that is ready for analysis. You are almost always dealing with multiple tables, spread across different systems, each holding a piece of the full picture. Before you can build a single meaningful report in Power BI, you need to understand how those pieces connect.&lt;/p&gt;

&lt;p&gt;That is what data modelling is about. It is the process of organising data into structured tables and defining how those tables relate to each other, so that reports are accurate, filters work correctly, and performance does not collapse under the weight of a large dataset. In Power BI, this modelling work sits between loading your data and building your visuals — and if you skip or rush it, everything downstream gets harder.&lt;/p&gt;

&lt;p&gt;This article walks through the key building blocks: joins, relationships, cardinality, schemas, and a few practical mistakes worth avoiding.&lt;/p&gt;




&lt;h2&gt;
  
  
  1. Joins: Combining Tables at the Query Level
&lt;/h2&gt;

&lt;p&gt;Before we get into how Power BI manages relationships, it helps to understand joins — because joins are how most data professionals first learn to connect tables, especially when working in SQL or within Power Query.&lt;/p&gt;

&lt;p&gt;A join merges data from two tables into a single result set based on a shared column. Think of a payments table and a member accounts table in a SACCO system. Both tables share a member ID. A join uses that shared column to bring relevant records together.&lt;/p&gt;

&lt;p&gt;But not every join behaves the same way, and choosing the wrong one can silently distort your analysis.&lt;/p&gt;

&lt;h3&gt;
  
  
  Inner Join
&lt;/h3&gt;

&lt;p&gt;An inner join returns only the records where a match exists in both tables. If a member account has no corresponding payment record, it is excluded entirely. This is useful when you only care about records with complete data on both sides.&lt;/p&gt;

&lt;h3&gt;
  
  
  Left Join (Left Outer Join)
&lt;/h3&gt;

&lt;p&gt;A left join keeps every record from the left table and pulls in matching records from the right table. Where there is no match, the right-side columns are left blank. This is particularly useful in compliance contexts — for example, when you need a full list of registered customers and want to flag those without a corresponding KYC document on file.&lt;/p&gt;

&lt;h3&gt;
  
  
  Right Join (Right Outer Join)
&lt;/h3&gt;

&lt;p&gt;The mirror of a left join. All records from the right table are retained, and matching records from the left are attached. Non-matching left-side records are dropped.&lt;/p&gt;

&lt;h3&gt;
  
  
  Full Outer Join
&lt;/h3&gt;

&lt;p&gt;Keeps everything from both tables, matched or not. Gaps appear where there is no corresponding record on either side. This is helpful when you want a complete picture — for instance, a reconciliation view showing all expected transactions alongside all actual ones, with clear gaps where discrepancies exist.&lt;/p&gt;

&lt;h3&gt;
  
  
  Left Anti Join
&lt;/h3&gt;

&lt;p&gt;Returns records from the left table that have no match in the right table. In a brokerage context, this could surface client accounts that have never placed a trade — useful for identifying inactive accounts during a portfolio review.&lt;/p&gt;

&lt;h3&gt;
  
  
  Right Anti Join
&lt;/h3&gt;

&lt;p&gt;Returns records from the right table with no match on the left. This could identify transactions that reference account IDs not found in your master client register — a red flag in any compliance or AML screening process.&lt;/p&gt;




&lt;h3&gt;
  
  
  Where to Apply Joins in Power BI
&lt;/h3&gt;

&lt;p&gt;Joins in Power BI live in Power Query, accessed via &lt;strong&gt;Home → Transform Data → Power Query Editor&lt;/strong&gt;. From there, you select the two tables, choose the matching columns, and pick your join type. The result is a merged table that can be loaded into your model.&lt;/p&gt;

&lt;p&gt;The important thing to note is that joins are a data preparation step. They physically combine records into a single table. That is different from relationships, which we will cover next.&lt;/p&gt;




&lt;h2&gt;
  
  
  2. Relationships: The Backbone of Your Data Model
&lt;/h2&gt;

&lt;p&gt;Once data is loaded into Power BI, the preferred way to connect tables is through relationships — not by repeatedly merging them. A relationship is a defined link between two tables based on a shared column. Define it once, and Power BI uses it automatically across every visual you build.&lt;/p&gt;

&lt;p&gt;Consider a SACCO reporting scenario: you have a transactions table recording every deposit, withdrawal, and loan repayment, and a members table holding member details. A relationship on the member ID column means Power BI knows how to connect a member's name to their transaction history, without you writing a join every time you build a chart.&lt;/p&gt;

&lt;p&gt;This is the fundamental difference between joins and relationships:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Joins&lt;/strong&gt; are temporary and create a physical merged table. They increase data volume.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Relationships&lt;/strong&gt; are persistent and logical. They keep tables separate but connected, and Power BI resolves the link at query time.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;For large datasets — say, transaction logs from a payments platform  — keeping tables separated through relationships rather than merged through joins is far more efficient.&lt;/p&gt;




&lt;h3&gt;
  
  
  Creating Relationships in Power BI
&lt;/h3&gt;

&lt;p&gt;Power BI will attempt to auto-detect relationships when column names match across tables. This is convenient but unreliable. I have seen it try to link two columns both named &lt;code&gt;date&lt;/code&gt;, from entirely unrelated tables, producing nonsensical filter behaviour. Always review auto-detected relationships and confirm or correct them manually.&lt;/p&gt;

&lt;p&gt;There are two ways to create relationships:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Model View&lt;/strong&gt; — drag a column from one table and drop it onto the matching column in another. A line appears between the tables showing the link.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Manage Relationships&lt;/strong&gt; — available on the Home ribbon. This dialogue lets you create, edit, and delete relationships in one place, and gives you explicit control over cardinality and filter direction.&lt;/li&gt;
&lt;/ol&gt;




&lt;h2&gt;
  
  
  3. Cardinality: Defining How Tables Relate
&lt;/h2&gt;

&lt;p&gt;Cardinality describes the numerical nature of the relationship between two tables — specifically, how records on one side correspond to records on the other.&lt;/p&gt;

&lt;h3&gt;
  
  
  One-to-Many (Most Common)
&lt;/h3&gt;

&lt;p&gt;One record in the first table corresponds to many records in the second. In a securities brokerage setup, one client maps to many trades. The clients table holds one row per client; the trades table holds one row per trade, with the client ID repeated across multiple rows. This is the relationship type you will use most often — between dimension tables and fact tables.&lt;/p&gt;

&lt;h3&gt;
  
  
  Many-to-One
&lt;/h3&gt;

&lt;p&gt;The same relationship viewed from the other direction. Many trades belong to one client. Power BI treats one-to-many and many-to-one as equivalent, depending on which table you anchor from.&lt;/p&gt;

&lt;h3&gt;
  
  
  One-to-One
&lt;/h3&gt;

&lt;p&gt;Each record in one table matches exactly one record in another. This is uncommon but does come up — for example, linking an employee record to their biometric access profile where there is a strict one-person-one-profile constraint.&lt;/p&gt;

&lt;h3&gt;
  
  
  Many-to-Many
&lt;/h3&gt;

&lt;p&gt;Multiple records in one table can relate to multiple records in another. This occurs in scenarios like a many-to-many product bundling setup, where one product can appear in multiple bundles, and each bundle can contain multiple products. These relationships require careful handling. Power BI supports them, but they can produce unexpected filter behaviour if not understood properly. Where possible, introduce a bridge table to resolve the many-to-many into two one-to-many relationships.&lt;/p&gt;




&lt;h2&gt;
  
  
  4. Cross-Filter Direction
&lt;/h2&gt;

&lt;p&gt;When a relationship exists, Power BI needs to know which direction filters should travel. This is the cross-filter direction setting.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Single direction&lt;/strong&gt; is the default and recommended approach. Filters flow from dimension tables toward the fact table. Select a specific security in a product's dimension, and the trade fact table updates to show only transactions involving that security.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Both directions&lt;/strong&gt; allow filters to propagate in either direction between tables. This can be necessary in some advanced reporting scenarios, but it introduces complexity. Filters can cascade in unexpected ways across the model, making results harder to predict and debug. Use bidirectional filtering sparingly and document why it is needed when you do.&lt;/p&gt;




&lt;h3&gt;
  
  
  Active vs. Inactive Relationships
&lt;/h3&gt;

&lt;p&gt;Power BI allows multiple relationships between two tables, but only one can be active at a time. The active relationship is the one Power BI uses by default in visuals.&lt;/p&gt;

&lt;p&gt;Inactive relationships are not ignored — they exist in the model and can be called on explicitly using DAX functions like &lt;code&gt;USERELATIONSHIP()&lt;/code&gt;. A common use case is date tables, where you might have separate relationships for trade date, settlement date, and value date, but only one can be active in the default context.&lt;/p&gt;




&lt;h2&gt;
  
  
  5. Fact Tables and Dimension Tables
&lt;/h2&gt;

&lt;p&gt;Most professional data models are built around a clear separation between fact tables and dimension tables.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Fact tables&lt;/strong&gt; store events or transactions. They record what happened — a payment processed, a trade executed, a loan disbursed, a member's deposit posted. Each row in a fact table is a single event. Fact tables tend to be long (many rows) and relatively narrow — they hold numeric measures and foreign keys, not descriptive details.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Dimension tables&lt;/strong&gt; provide context for those events. They answer the who, what, where, and when. A members table, a products table, a branches table, a calendar table — these are all dimension tables. They tend to be shorter (fewer rows) and wider, with descriptive attributes that give your visuals meaningful labels.&lt;/p&gt;

&lt;p&gt;The relationship between dimension tables and the fact table is almost always one-to-many: one member, many transactions. This structure is the foundation of the star schema.&lt;/p&gt;




&lt;h2&gt;
  
  
  6. Data Modelling Schemas
&lt;/h2&gt;

&lt;p&gt;A schema is the overall structure of how your tables are arranged and connected. Choosing the right schema affects model performance, report flexibility, and how easy the model is to maintain as data grows.&lt;/p&gt;

&lt;h3&gt;
  
  
  Flat Table
&lt;/h3&gt;

&lt;p&gt;Before any modelling happens, data often arrives as a flat table — one large sheet with every column crammed in: customer name, branch, product, transaction amount, date, currency, all in one place. This is common with data exports from core banking systems or when scraping transaction logs.&lt;/p&gt;

&lt;p&gt;Flat tables are readable at a glance, but they scale poorly. Every time a customer name or branch label appears in a transaction, it is repeated in full. If a branch is renamed, you update hundreds or thousands of rows. Storage grows faster than it should, and query performance suffers. Flat tables are a starting point, not a destination.&lt;/p&gt;

&lt;h3&gt;
  
  
  Star Schema
&lt;/h3&gt;

&lt;p&gt;The star schema is the workhorse of Power BI data modelling. One central fact table sits at the middle, with dimension tables connected directly around it. Viewed in Model View, it literally looks like a star.&lt;/p&gt;

&lt;p&gt;In a payments context, your central fact table might hold every payment transaction — amount, currency, timestamp, sender ID, receiver ID, method ID. Surrounding it are dimension tables: a members/accounts table, a currency reference table, a payment methods table, a calendar table. Each dimension connects directly to the fact table on a one-to-many basis.&lt;/p&gt;

&lt;p&gt;The advantages are real:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Descriptive data is stored once in dimension tables instead of being repeated in every transaction row.&lt;/li&gt;
&lt;li&gt;The model is easy to read. You can look at the diagram and immediately understand how everything connects.&lt;/li&gt;
&lt;li&gt;Power BI performs well with this structure. Filters propagate predictably from dimensions to the fact table.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;For most reporting work — dashboards, KPI monitoring, regulatory reporting — the star schema is the right choice.&lt;/p&gt;

&lt;h3&gt;
  
  
  Snowflake Schema
&lt;/h3&gt;

&lt;p&gt;The snowflake schema extends the star by normalising the dimension tables. Instead of one flat members dimension, you might split it into a members table, a regions table, and a countries table — each linking to the next. This breaks down hierarchical data into its component layers.&lt;/p&gt;

&lt;p&gt;Normalisation reduces redundancy further and ensures data consistency. If Kenya is spelt "Kenya" in the countries table and linked by ID everywhere else, you never have to worry about "kenya" or "KENYA" appearing elsewhere.&lt;/p&gt;

&lt;p&gt;The tradeoff is complexity. More tables mean more relationships to manage, more potential for misconfiguration, and more joins for Power BI to resolve at query time. Snowflake schemas make sense for very large datasets with deep hierarchies, or when the source data is already structured that way from a warehousing system. For most operational dashboards and management reporting, the star schema performs better and is much easier to maintain.&lt;/p&gt;




&lt;h2&gt;
  
  
  7. Common Mistakes and How to Avoid Them
&lt;/h2&gt;

&lt;p&gt;A few problems come up regularly when building data models, particularly if you are moving quickly or working with unfamiliar data.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Missing relationships&lt;/strong&gt; are the most common. If your tables are not properly linked, visuals behave unexpectedly — a slicer that should filter your transactions does nothing, or totals appear correct but do not break down properly. Always confirm that every connection between a fact table and its dimensions is in place.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Wrong cardinality&lt;/strong&gt; can produce duplicated or inflated numbers. Setting a one-to-many relationship as many-to-many when the data does not warrant it causes Power BI to count records multiple times. Before confirming a relationship, verify whether your key column truly has unique values on the "one" side.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Descriptive columns in the fact table&lt;/strong&gt; bloat the model unnecessarily. A member's name, branch label, or product category has no business sitting in every transaction row. That information belongs in dimension tables. Keep fact tables lean: foreign keys and numeric measures only.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Overusing bidirectional filters&lt;/strong&gt; creates ambiguity. In a model with several connected tables, bidirectional filters can cause cascading filter propagation that makes results unpredictable. Default to single-direction filters unless you have a specific, well-understood reason to enable both.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Unnecessary snowflaking&lt;/strong&gt; adds complexity without proportionate benefit. If a simple star schema serves your reporting needs, building a multi-level snowflake structure because it feels more rigorous will only make the model harder to troubleshoot. Start simple and add complexity only when the data or the reporting requirements genuinely demand it.&lt;/p&gt;




&lt;h2&gt;
  
  
  8. Putting It Together: A Practical Walkthrough
&lt;/h2&gt;

&lt;p&gt;Say you are building a member activity dashboard for a SACCO. You have three tables: &lt;code&gt;transactions&lt;/code&gt;, &lt;code&gt;members&lt;/code&gt;, and &lt;code&gt;loan_products&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Step 1 — Load data:&lt;/strong&gt; Import all three tables into Power BI via Power Query.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Step 2 — Clean data:&lt;/strong&gt; In Power Query, remove duplicate records, standardise column names (use &lt;code&gt;member_id&lt;/code&gt; consistently, not &lt;code&gt;ID&lt;/code&gt; in one table and &lt;code&gt;MemberID&lt;/code&gt; in another), and handle any null values in key columns.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Step 3 — Build relationships:&lt;/strong&gt; In Model View, link &lt;code&gt;transactions[member_id]&lt;/code&gt; to &lt;code&gt;members[member_id]&lt;/code&gt; with a one-to-many relationship. Link &lt;code&gt;transactions[product_id]&lt;/code&gt; to &lt;code&gt;loan_products[product_id]&lt;/code&gt; the same way.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Step 4 — Apply star schema:&lt;/strong&gt; &lt;code&gt;transactions&lt;/code&gt; is your fact table. &lt;code&gt;members&lt;/code&gt; and &lt;code&gt;loan_products&lt;/code&gt; are your dimension tables.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Step 5 — Build visuals:&lt;/strong&gt; Now you can create a bar chart showing total disbursements by loan product, a table breaking down transaction volumes by member tier, and a slicer filtering all visuals by branch — all from a single clean model.&lt;/p&gt;




&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;Data modelling is where reporting either earns its reliability or loses it. The mechanics — joins, relationships, cardinality, schemas — are not just theoretical abstractions. They directly determine whether a slicer filters correctly, whether a total is accurate, and whether a model can handle a growing dataset without grinding to a halt.&lt;/p&gt;

&lt;p&gt;The star schema, clear one-to-many relationships, and lean fact tables are the foundation of any model worth building on. Get the structure right before you start building visuals, and the visuals become straightforward. Get it wrong, and you end up debugging report behaviour instead of actually analysing data.&lt;/p&gt;




</description>
      <category>datamodelling</category>
      <category>analytics</category>
      <category>dataviz</category>
    </item>
    <item>
      <title>How Excel is Used in Real-World Data Analysis</title>
      <dc:creator>Kahuthu Muriuki </dc:creator>
      <pubDate>Sun, 29 Mar 2026 06:39:53 +0000</pubDate>
      <link>https://dev.to/muriuki_kahuthu_54/how-excel-is-used-in-real-world-data-analysis-538b</link>
      <guid>https://dev.to/muriuki_kahuthu_54/how-excel-is-used-in-real-world-data-analysis-538b</guid>
      <description>&lt;h2&gt;
  
  
  Introduction
&lt;/h2&gt;

&lt;p&gt;Excel is not always the flashiest choice. There are more specialised tools — Power BI for dashboards, Python for large-scale processing, SQL for database queries. I use several of them. But Excel remains the place where raw data first lands, where quick checks happen, and where non-technical stakeholders can engage with findings without needing a login or a training session.&lt;/p&gt;




&lt;h2&gt;
  
  
  What Is Excel, and Why Does It Still Matter?
&lt;/h2&gt;

&lt;p&gt;Microsoft Excel is a spreadsheet application that organises data in rows and columns, supports calculation through formulas and functions, and produces visual output through charts and pivot tables. What keeps it relevant despite faster, more capable alternatives is its low barrier to entry and near-universal presence in organisations. Every finance department, every compliance team, and every operations desk I have worked in has had Excel open.&lt;/p&gt;

&lt;p&gt;For anyone entering data-related work, Excel builds the right instincts — understanding data types, spotting inconsistencies, and thinking about how records relate to each other. The skills transfer directly to more advanced tools later on.&lt;/p&gt;




&lt;h2&gt;
  
  
  Where Excel Shows Up in Real Work
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Financial Reporting and Reconciliation
&lt;/h3&gt;

&lt;p&gt;In financial services, Excel is where P&amp;amp;L summaries are assembled, cash flow projections are modelled, and month-end variance analyses are performed. Functions like &lt;code&gt;SUMIF&lt;/code&gt;, &lt;code&gt;VLOOKUP&lt;/code&gt;, and nested &lt;code&gt;IF&lt;/code&gt; statements do most of the heavy lifting.&lt;/p&gt;

&lt;h3&gt;
  
  
  Compliance Tracking and KYC Management
&lt;/h3&gt;

&lt;p&gt;A well-built Excel tracker for compliance work typically uses data validation to restrict entries to approved values, conditional formatting to flag overdue reviews in red, and &lt;code&gt;DATEDIF&lt;/code&gt; formulas to calculate how many days remain before a document expires. These are not complex features, but they prevent errors that cost time and create regulatory exposure.&lt;/p&gt;




&lt;h2&gt;
  
  
  Data Cleaning: The Part Nobody Talks About Enough
&lt;/h2&gt;

&lt;p&gt;In practice, most data arriving in Excel is not clean. Client names have inconsistent spacing, dates are stored as text, and duplicate entries appear across merged files from different systems. Cleaning this data before any analysis is not optional — it directly determines whether the conclusions drawn are accurate.&lt;/p&gt;

&lt;h3&gt;
  
  
  Features Regularly used
&lt;/h3&gt;

&lt;h4&gt;
  
  
  TRIM and CLEAN
&lt;/h4&gt;

&lt;p&gt;&lt;code&gt;TRIM&lt;/code&gt; removes extra spaces that accumulate when data is exported from systems like Salesforce or Zendesk. &lt;code&gt;CLEAN&lt;/code&gt; removes non-printable characters that sometimes appear in data pulled from legacy platforms. Both are quiet functions that prevent downstream errors.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;=TRIM(A2)      — removes leading, trailing and double spaces
=CLEAN(A2)     — removes non-printable characters from text
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  Find and Replace — &lt;code&gt;Ctrl+H&lt;/code&gt;
&lt;/h4&gt;

&lt;p&gt;Particularly useful when a field has inconsistent values — for example, a country column where the same country appears as &lt;code&gt;KE&lt;/code&gt;, &lt;code&gt;Kenya&lt;/code&gt;, and &lt;code&gt;kenya&lt;/code&gt;. A few Find and Replace passes standardise the field before any counting or filtering happens.&lt;/p&gt;

&lt;h4&gt;
  
  
  Text to Columns
&lt;/h4&gt;

&lt;p&gt;When a field contains combined data — a full name in one cell, a date and a reference number separated by a hyphen — Text to Columns splits it into usable parts. I have used this frequently when pulling client data from onboarding systems that concatenate fields.&lt;/p&gt;

&lt;h4&gt;
  
  
  Remove Duplicates
&lt;/h4&gt;

&lt;p&gt;In compliance work, duplicate partner records are a real risk. The Remove Duplicates function, combined with conditional formatting to highlight matches first, is a reliable way to check data quality before running analysis.&lt;/p&gt;

&lt;h4&gt;
  
  
  Data Validation
&lt;/h4&gt;

&lt;p&gt;The best way to reduce cleaning work is to prevent bad data from entering in the first place. Data Validation restricts cells to approved values, specific number ranges, or particular date formats. When I build input templates for teams, data validation is always included.&lt;/p&gt;




&lt;h2&gt;
  
  
  Transforming Data into Something Usable
&lt;/h2&gt;

&lt;p&gt;Once data is clean, it often needs reshaping before it can be analysed. Transformation in Excel covers formatting for readability, converting data types, and restructuring how records are organised.&lt;/p&gt;

&lt;h3&gt;
  
  
  Formatting for Readability
&lt;/h3&gt;

&lt;p&gt;Column widths that cut off values, missing headers, inconsistent number formats — these are not cosmetic issues. They slow down work and cause misreadings. Bolding header rows, applying consistent number formatting, and using freeze panes to keep headers visible while scrolling are small habits that save time across large datasets.&lt;/p&gt;

&lt;h3&gt;
  
  
  Data Type Conversion
&lt;/h3&gt;

&lt;p&gt;Numbers stored as text are a persistent problem in exported data. They look correct, but will not respond to &lt;code&gt;SUM&lt;/code&gt; or &lt;code&gt;AVERAGE&lt;/code&gt;. The &lt;code&gt;VALUE&lt;/code&gt; function converts them. Similarly, dates stored as text need conversion before date functions will work correctly.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;=VALUE(A2)       — converts text that looks like a number into an actual number
=DATEVALUE(A2)   — converts text that looks like a date into a date serial number
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Text Standardisation
&lt;/h3&gt;

&lt;p&gt;In datasets containing client names or country entries, inconsistent capitalisation creates grouping errors. &lt;code&gt;PROPER&lt;/code&gt;, &lt;code&gt;UPPER&lt;/code&gt;, and &lt;code&gt;LOWER&lt;/code&gt; standardise text fields so that pivot tables and &lt;code&gt;COUNTIF&lt;/code&gt; formulas group records correctly.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;=PROPER(A2)    — capitalises the first letter of each word
=UPPER(A2)     — converts all text to capitals
=LOWER(A2)     — converts all text to lower case
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Analysing Data: From Raw Numbers to Decisions
&lt;/h2&gt;

&lt;p&gt;The analytical stage is where the work becomes visible to others. In financial and compliance roles, this means producing numbers that someone will act on — a funding decision, a risk escalation, a process change. The formulas used here need to be correct, and the logic behind them needs to be defensible.&lt;/p&gt;

&lt;h3&gt;
  
  
  Core Statistical Functions
&lt;/h3&gt;

&lt;p&gt;The foundational functions cover the majority of day-to-day analytical needs in operations and finance work:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;=AVERAGE(range)&lt;/code&gt; — mean value across a set, used for KPI benchmarking&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;=MEDIAN(range)&lt;/code&gt; — middle value, more reliable than average when outliers are present&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;=COUNT(range)&lt;/code&gt; — counts numeric entries; &lt;code&gt;COUNTA&lt;/code&gt; counts all non-empty cells&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;=SUM(range)&lt;/code&gt; — total of a range; &lt;code&gt;SUMIF&lt;/code&gt; adds a condition&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;=MAX(range)&lt;/code&gt; and &lt;code&gt;=MIN(range)&lt;/code&gt; — identify the ceiling and floor of a dataset&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Lookup Functions
&lt;/h3&gt;

&lt;p&gt;&lt;code&gt;VLOOKUP&lt;/code&gt; and its more capable successor &lt;code&gt;XLOOKUP&lt;/code&gt; are essential when reconciling data across multiple sources — matching partner IDs against a reference table, pulling account names from a separate register, or checking whether a client appears on a restricted list.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;=XLOOKUP(lookup_value, lookup_array, return_array)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;XLOOKUP&lt;/code&gt; is the version worth learning now. It handles left-hand lookups, returns custom values when no match is found, and is less sensitive to column order changes than &lt;code&gt;VLOOKUP&lt;/code&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  Conditional Logic
&lt;/h3&gt;

&lt;p&gt;&lt;code&gt;IF&lt;/code&gt; statements allow analysis to respond to conditions in the data. In compliance work, this might mean classifying clients into risk tiers based on transaction volume, or flagging records where a required document field is empty.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;=IF(D2&amp;gt;1000000,"High Risk",IF(D2&amp;gt;100000,"Medium Risk","Low Risk"))
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;IFS&lt;/code&gt; simplifies the logic when there are multiple conditions to evaluate, avoiding deeply nested &lt;code&gt;IF&lt;/code&gt; functions that become difficult to read and maintain.&lt;/p&gt;

&lt;h3&gt;
  
  
  Pivot Tables
&lt;/h3&gt;

&lt;p&gt;Pivot tables are the fastest route from a large dataset to a summary. In a compliance tracker with hundreds of partner records, a pivot table can show the count of partners by risk tier, by country, and by document status in seconds — without touching the original data.&lt;/p&gt;

&lt;p&gt;The key discipline is keeping the source data clean and consistently structured. A pivot table is only as reliable as the data feeding it. When I build reporting templates, the source data tab and the pivot summary tab are always separate, and the source data has validated, consistent entries.&lt;/p&gt;

&lt;p&gt;Pivot tables are also the starting point for dashboards — once the summary logic is working correctly in a pivot, the chart built from it will update automatically when new data is added.&lt;/p&gt;




&lt;h2&gt;
  
  
  Formula Quick Reference
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Category&lt;/th&gt;
&lt;th&gt;Formula&lt;/th&gt;
&lt;th&gt;What It Does&lt;/th&gt;
&lt;th&gt;Where I Use It&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Statistical&lt;/td&gt;
&lt;td&gt;&lt;code&gt;=AVERAGE(range)&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Calculates mean values&lt;/td&gt;
&lt;td&gt;Client KPI dashboards&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Statistical&lt;/td&gt;
&lt;td&gt;&lt;code&gt;=COUNT / COUNTA&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Counts entries in a range&lt;/td&gt;
&lt;td&gt;Onboarding completion tracking&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Lookup&lt;/td&gt;
&lt;td&gt;&lt;code&gt;=VLOOKUP / XLOOKUP&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Retrieves data across tables&lt;/td&gt;
&lt;td&gt;Matching partner records in KYC files&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Logical&lt;/td&gt;
&lt;td&gt;&lt;code&gt;=IF / IFS&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Returns results based on a condition&lt;/td&gt;
&lt;td&gt;Risk-tier classification&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Text&lt;/td&gt;
&lt;td&gt;&lt;code&gt;=TRIM / PROPER&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Cleans and normalises text&lt;/td&gt;
&lt;td&gt;Standardising client name fields&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Date&lt;/td&gt;
&lt;td&gt;&lt;code&gt;=DATEDIF / TODAY()&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Calculates date intervals&lt;/td&gt;
&lt;td&gt;Monitoring SLA and review deadlines&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Aggregation&lt;/td&gt;
&lt;td&gt;&lt;code&gt;=SUMIF / COUNTIF&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Conditional sum or count&lt;/td&gt;
&lt;td&gt;Flagging overdue compliance cases&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h2&gt;
  
  
  Data Visualisation: Making Findings Accessible
&lt;/h2&gt;

&lt;p&gt;Not everyone who needs to understand the data wants to scroll through a spreadsheet. Charts and dashboards translate findings into a form that supports faster decisions in meetings, presentations, and reports.&lt;/p&gt;

&lt;h3&gt;
  
  
  Chart Types and When to Use Them
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Bar and Column Charts&lt;/strong&gt; — suited for comparing values across categories, such as monthly onboarding volumes across different markets or contrasting resolution times across client segments.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Line Charts&lt;/strong&gt; — best for showing change over time. In operational reporting, line charts work well for tracking weekly ticket volumes, daily transaction counts, or month-on-month revenue trends.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Pie Charts&lt;/strong&gt; — useful for showing proportional composition, such as the share of total cases by risk category. They become harder to read with more than five or six segments.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Scatter Plots&lt;/strong&gt; — helpful when exploring relationships between two variables, for example, whether higher transaction values correlate with longer onboarding times.&lt;/p&gt;

&lt;h3&gt;
  
  
  Building a Dashboard
&lt;/h3&gt;

&lt;p&gt;A functional Excel dashboard connects pivot tables and charts into a single view, using slicers and dropdown controls to filter without modifying the underlying data. Keep the layout consistent, limit the number of metrics on a single page, and make the filters obvious to someone who did not build it.&lt;/p&gt;




&lt;h2&gt;
  
  
  Closing Thoughts
&lt;/h2&gt;

&lt;p&gt;Excel rewards the person who takes the time to understand it properly. The gap between someone who uses Excel to store numbers and someone who uses it to drive decisions is not about knowing more functions — it is about understanding the data, asking the right questions, and building outputs that other people can trust and act on.&lt;/p&gt;

&lt;p&gt;The tool itself is not the skill; the skill is knowing how to move from raw data to a conclusion that holds up under scrutiny. That is a habit worth building early, and Excel is still one of the better places to build it.&lt;/p&gt;

</description>
      <category>learning</category>
      <category>analytics</category>
    </item>
  </channel>
</rss>
