<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Kithokoi James</title>
    <description>The latest articles on DEV Community by Kithokoi James (@kithokoi).</description>
    <link>https://dev.to/kithokoi</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3708626%2F223669ba-d227-4cd7-84ab-39d5ed490828.jpeg</url>
      <title>DEV Community: Kithokoi James</title>
      <link>https://dev.to/kithokoi</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/kithokoi"/>
    <language>en</language>
    <item>
      <title>Connecting Power BI to a SQL Database From Scratch: A Step by Step Guide</title>
      <dc:creator>Kithokoi James</dc:creator>
      <pubDate>Mon, 16 Mar 2026 20:14:50 +0000</pubDate>
      <link>https://dev.to/kithokoi/how-to-connect-power-bi-to-a-sql-database-from-scratch-25f7</link>
      <guid>https://dev.to/kithokoi/how-to-connect-power-bi-to-a-sql-database-from-scratch-25f7</guid>
      <description>&lt;p&gt;Power BI is a data analysis and visualization tool. It helps turn raw data into charts, reports, and dashboards that are easier to read and use for decisions. &lt;/p&gt;

&lt;p&gt;Power BI Desktop is the part most beginners start with because it is where you connect to data, clean it, build relationships, and design reports. It is widely used in data analysis and business intelligence because it can pull data from many sources, shape that data, and turn it into interactive visuals.&lt;/p&gt;

&lt;p&gt;Companies connect Power BI to databases because databases are where business data usually lives. A SQL database stores data in structured tables which makes it easy to manage, update, and query large sets of data. When Power BI connects directly to that database, analysts can work with more reliable and more up-to-date data.&lt;/p&gt;

&lt;p&gt;In this article, we will use PostgreSQL as the SQL database example. We will look at two common situations: connecting Power BI to a local PostgreSQL database on your own PC, and connecting it to a cloud PostgreSQL database hosted on Aiven. Then we will look at how to load tables and create relationships so you can Power BI can analyze the data properly&lt;/p&gt;

&lt;h2&gt;
  
  
  How Power BI connects to a local PostgreSQL database
&lt;/h2&gt;

&lt;p&gt;The truth is that connecting Power BI to a local PostgreSQL database is simple.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Step 1: Open PowerBI and Navigate to Get Data&lt;/strong&gt; &lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fv8emanrykba1mbb5hy2s.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fv8emanrykba1mbb5hy2s.png" alt=" " width="800" height="423"&gt;&lt;/a&gt;&lt;br&gt;
This is always the first step, open the Power BI Desktop app and, on the home ribbon, click &lt;em&gt;Get Data&lt;/em&gt;. In the list of data sources, choose PostgreSQL database. This opens the connection window where Power BI asks for the server and database details. &lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Step 2: Choose PostgreSQL Database&lt;/strong&gt; &lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ft1yjudom3phsqknoad0p.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ft1yjudom3phsqknoad0p.png" alt=" " width="800" height="424"&gt;&lt;/a&gt;&lt;br&gt;
Since we are connecting to a local PostgreSQL database, select postgresql on the list. A pop up will show up with two slots, server and database. The server name tells Power BI where the database is running. If PostgreSQL is installed on your own computer, the server name will be &lt;em&gt;localhost&lt;/em&gt;, which simply means “this machine.” &lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Step 3: Enter the Server and Database Details&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F17x6fxdugm96w8slq2zf.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F17x6fxdugm96w8slq2zf.png" alt=" " width="800" height="423"&gt;&lt;/a&gt;&lt;br&gt;
Once open, you will enter the server name, the database name, and your login credentials. If your PostgreSQL setup uses the default port, it will work with &lt;em&gt;localhost&lt;/em&gt;, though in some cases you may also see it written as &lt;em&gt;localhost:5432&lt;/em&gt;. The database name is the specific database you want Power BI to access, such as a sales or inventory database. After that, Power BI will ask for your PostgreSQL username and password. &lt;/p&gt;

&lt;p&gt;These credentials are important because they control access to the database and make sure only authorized users can connect. Once these details are entered correctly, Power BI will reach the database and reveal the available tables.&lt;/p&gt;

&lt;p&gt;After that, Power BI asks how you want to connect. There is two options, import and directQuery. Here, import is easier because Power BI loads the data into the report model, which tends to be simpler and faster to work with as you learn. Then choose Database authentication and enter your PostgreSQL username and password. &lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Step 4: Select and Load&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fuov92k6owlmfitzomc2r.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fuov92k6owlmfitzomc2r.jpg" alt=" " width="800" height="517"&gt;&lt;/a&gt;&lt;br&gt;
Once you sign in successfully, power BI will take you to the navigator, where you will see the available tables and views in your database. From there, you can either click Load to bring the data in immediately or Transform Data if you want to clean or reshape it first.&lt;/p&gt;

&lt;p&gt;So connecting locally involves, opening Power BI Desktop, choosing PostgreSQL as the source, entering localhost and your database name, signing in with your PostgreSQL credentials, and then loading the tables you want.&lt;/p&gt;

&lt;h2&gt;
  
  
  How to Connect PowerBI to a Cloud PostgreSQL Database on Aiven
&lt;/h2&gt;

&lt;p&gt;To connect to a cloud database is very similar in principle, but you will need conection string from &lt;a href="https://aiven.io/" rel="noopener noreferrer"&gt;aiven&lt;/a&gt;. &lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Step 1: Login to Aiven&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Flht9vizg7zfilrzvgkof.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Flht9vizg7zfilrzvgkof.png" alt=" " width="800" height="380"&gt;&lt;/a&gt;&lt;br&gt;
Open your aiven console and go to services on the left panel. Select postgreSQL as it's the one we are using. Once opened, you will see the connection details needed to connect with powerBI.&lt;/p&gt;

&lt;p&gt;These details are the cloud version of what localhost represented in the local example. This simply means instead of pointing to your own machine, you are now pointing Power BI to a remote PostgreSQL server hosted by Aiven.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Step 2: Establish Connection With PowerBI&lt;/strong&gt; &lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fd96txrxud10de1xaiwea.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fd96txrxud10de1xaiwea.png" alt=" " width="800" height="380"&gt;&lt;/a&gt;&lt;br&gt;
To connect, navigate to the connection information area on aiven postgreSQL, you will see the details Power BI needs: the host, port, database name, username, and password.&lt;/p&gt;

&lt;p&gt;In Power BI Desktop, go again to home button, then to get data and finnaly to postgreSQL database. In the Server field, enter the Aiven host name. You will specify the port, and add it to the host with a colon, such as &lt;em&gt;&lt;em&gt;your-host.aivencloud.com:12345&lt;/em&gt;&lt;/em&gt;. &lt;/p&gt;

&lt;p&gt;Then enter the database name. After that, choose Database authentication and enter the Aiven username and password. If you do this right and it's succesful, Power BI will show the navigator so you can pick the tables you want to load.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Step 3: Download the SSL Certificate&lt;/strong&gt; &lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frkj19j45718lsmtem9g0.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frkj19j45718lsmtem9g0.png" alt=" " width="800" height="377"&gt;&lt;/a&gt;&lt;br&gt;
When you connect Power BI to a cloud database like Aiven PostgreSQL, SSL certificates matter because the connection is happening over the internet rather than only on your own computer. That means the data is traveling through external networks, so it needs protection. &lt;/p&gt;

&lt;p&gt;The SSL certificate helps create a secure connection by encrypting the data as it moves between Power BI and the database, which makes it much harder for anyone else to read or interfere with it. It also helps your computer confirm that it is connecting to the real Aiven database server and not a fake or unsafe one. This is especially important when the database contains business data such as customer information, sales records, or product details. &lt;/p&gt;

&lt;p&gt;To download it, go to your Aiven service’s Overview page, find the Connection information section, and click CA Certificate to download it. This certificate helps keep the connection secure and makes sure Power BI connects to the real database server.&lt;/p&gt;

&lt;p&gt;The process is that simple, with the cloud version as you will use simmilar process as the local one: open Power BI, choose PostgreSQL, enter the server and database, sign in, and load tables. The main difference is that for Aiven you work with remote host details instead of localhost.&lt;/p&gt;

&lt;h2&gt;
  
  
  How to Load Data and Create Relationships Between Tables
&lt;/h2&gt;

&lt;p&gt;Once the connection is working, the next step is loading your tables. &lt;/p&gt;

&lt;p&gt;Suppose your database has these tables: customers, products, sales, and inventory. In the navigator, select those tables and click Load. Power BI imports them into the data model, and in many cases it will try to detect relationships automatically, If not, you can create them yourself.&lt;/p&gt;

&lt;p&gt;This is where basic data modeling starts. A relationship tells Power BI how one table connects to another. For example, sales might have a customer_id column that links to customers.customer_id, and sales.product_id might link to products.product_id. If your inventory table tracks stock by product, then inventory.product_id would also connect to products.product_id. &lt;/p&gt;

&lt;p&gt;These links matter because Power BI uses them to calculate totals correctly and to make filters work properly across visuals. These relationships are necessary when you want accurate analysis across multiple tables.&lt;/p&gt;

&lt;p&gt;A useful beginner way to think about this is that some tables describe things, while others record activity. customers and products are descriptive tables. sales is a transaction table because it records what happened. &lt;/p&gt;

&lt;p&gt;In Power BI this kind of setup is a star schema, where descriptive tables connect to fact-style tables for better usability and performance. That is  one reason it is usually smarter to connect customers and products to sales than to randomly connect every table to every other table.&lt;/p&gt;

&lt;p&gt;If Power BI does not create the relationships automatically, go to Model view. This view will shows all tables and the lines between them. You can open Modeling &amp;gt; Manage relationships and choose New to create a relationship manually, or in some cases drag a column from one table to the matching column in another table. Power BI also expects at least one side of a relationship to contain unique values. So customers.customer_id should be unique in the customers table, and products.product_id should be unique in the products table.&lt;/p&gt;

&lt;p&gt;When these relationships are set correctly, Power BI will answer questions properly. For example, it will show total sales by customer, sales by product, or compare product sales against inventory. Without the relationships, the report will show wrong totals, duplicated counts, or visuals that do not filter each other the way it is expected. &lt;/p&gt;

&lt;h2&gt;
  
  
  Why SQL Skills Are Important for Power BI Analysts
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;SQL helps analysts get the right data.&lt;br&gt;
In many cases, the data used in Power BI comes from SQL databases. Knowing SQL helps analysts pull only the data they need instead of bringing in everything, which keeps the work cleaner and easier to manage.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;SQL makes it easier to combine and prepare data.&lt;br&gt;
Business data is often spread across different tables, such as customers, products, and sales. SQL helps analysts join those tables, filter records, and organize the data before building reports.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;SQL helps create more accurate analysis.&lt;br&gt;
Analysts often need totals, averages, counts, or grouped results before they start building visuals. SQL helps do this in a clear way, which makes the final dashboard more reliable and easier to trust.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;Knowing how to connect Power BI to a SQL database is a basic but very important skill. It lets you move from static files into real business data. Whether the database is local PostgreSQL on your machine or a cloud PostgreSQL service like Aiven, the idea is the same: connect securely, load the right tables, and build the relationships Power BI needs for correct analysis.&lt;/p&gt;

</description>
      <category>sql</category>
      <category>sqlserver</category>
      <category>database</category>
      <category>analytics</category>
    </item>
    <item>
      <title>Advanced SQL for Data Analytics: Advanced Techniques Every Data Analyst Should Know</title>
      <dc:creator>Kithokoi James</dc:creator>
      <pubDate>Wed, 11 Mar 2026 19:28:28 +0000</pubDate>
      <link>https://dev.to/kithokoi/how-to-install-postgresql-for-beginners-a-step-by-step-guide-480b</link>
      <guid>https://dev.to/kithokoi/how-to-install-postgresql-for-beginners-a-step-by-step-guide-480b</guid>
      <description>&lt;h2&gt;
  
  
  Introduction
&lt;/h2&gt;

&lt;p&gt;SQL is one of the most important skills for anyone working with data. Whether you are a data analyst, data scientist, or data engineer, chances are that a large portion of your work involves querying data stored in databases.&lt;/p&gt;

&lt;p&gt;Most people start learning SQL by writing simple queries. They learn how to retrieve data, filter rows, and perform basic aggregations. While these skills are important, they only represent the first step.&lt;/p&gt;

&lt;p&gt;In real-world data analytics, analysts are rarely asked simple questions like &lt;em&gt;“show me all rows in a table.”&lt;/em&gt; Instead, businesses ask questions that require deeper analysis:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Which customers generate the most revenue?&lt;/li&gt;
&lt;li&gt;How has sales performance changed over time?&lt;/li&gt;
&lt;li&gt;Which products are becoming more popular?&lt;/li&gt;
&lt;li&gt;What patterns exist in user behavior?&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Answering questions like these requires more than basic SQL knowledge. Analysts need to apply &lt;strong&gt;advanced SQL techniques&lt;/strong&gt; to transform raw data into meaningful insights.&lt;/p&gt;

&lt;p&gt;In this article, we will explore several advanced SQL concepts and how they are used in real-world scenarios. The goal is not only to understand the syntax but also to understand &lt;strong&gt;how analysts think when solving data problems&lt;/strong&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  From Basic Queries to Real Data Questions
&lt;/h2&gt;

&lt;p&gt;When learning SQL for the first time, many people start with simple queries such as:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;orders&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This query retrieves all records from a table. While it is useful for exploring data, it does not provide much insight.&lt;/p&gt;

&lt;p&gt;In a real company, an analyst might be asked questions such as:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Who are our highest spending customers?&lt;/li&gt;
&lt;li&gt;What is our average order value?&lt;/li&gt;
&lt;li&gt;Which products generate the most revenue?&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;To answer these questions, analysts must summarize and analyze data rather than simply retrieve it.&lt;/p&gt;

&lt;p&gt;For example, if a company wants to understand total revenue generated by each customer, we might write a query like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;customer_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;SUM&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;order_amount&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;total_spent&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;orders&lt;/span&gt;
&lt;span class="k"&gt;GROUP&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;customer_id&lt;/span&gt;
&lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;total_spent&lt;/span&gt; &lt;span class="k"&gt;DESC&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This query calculates the total spending per customer and ranks them from highest to lowest.&lt;/p&gt;

&lt;h2&gt;
  
  
  Understanding How Business Data Is Structured
&lt;/h2&gt;

&lt;p&gt;Before writing advanced SQL queries, it is important to understand how real-world data is organized.&lt;/p&gt;

&lt;p&gt;Most companies store information in multiple related tables rather than a single dataset.&lt;/p&gt;

&lt;p&gt;For example, an e-commerce company might have the following tables:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Customers:&lt;/strong&gt;&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;customer_id&lt;/th&gt;
&lt;th&gt;first_name&lt;/th&gt;
&lt;th&gt;last_name&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;Orders:&lt;/strong&gt;&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;order_id&lt;/th&gt;
&lt;th&gt;customer_id&lt;/th&gt;
&lt;th&gt;order_date&lt;/th&gt;
&lt;th&gt;order_amount&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;Products:&lt;/strong&gt;&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;product_id&lt;/th&gt;
&lt;th&gt;product_name&lt;/th&gt;
&lt;th&gt;category&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Each table contains a different set of information. To perform meaningful analysis, analysts will need to combine data from multiple tables.&lt;/p&gt;

&lt;p&gt;This is where joins become essential.&lt;/p&gt;

&lt;h2&gt;
  
  
  Joining Tables to Understand Customer Behavior
&lt;/h2&gt;

&lt;p&gt;A JOIN allows us to combine information from different tables using a shared key.&lt;/p&gt;

&lt;p&gt;For example, suppose we want to see which customers placed orders and how much they spent.&lt;/p&gt;

&lt;p&gt;We can join the customers table with the orders table using the customer_id.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;customers&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;o&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;order_date&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;o&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;order_amount&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;orders&lt;/span&gt;
&lt;span class="k"&gt;JOIN&lt;/span&gt; &lt;span class="n"&gt;customers&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="k"&gt;c&lt;/span&gt;
    &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;o&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;customer_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;customer_id&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This query connects customer information with order data.&lt;/p&gt;

&lt;p&gt;Joins are extremely important in data analysis because real insights often require combining multiple datasets.&lt;/p&gt;

&lt;p&gt;For example, analysts may join tables to:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Connect purchases to customer profiles&lt;/li&gt;
&lt;li&gt;Link product sales with product categories&lt;/li&gt;
&lt;li&gt;Combine website activity with user accounts&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Without joins, much of the valuable context in business data would remain hidden.&lt;/p&gt;

&lt;h2&gt;
  
  
  Turning Raw Data Into Metrics with Aggregations
&lt;/h2&gt;

&lt;p&gt;Businesses rarely care about individual rows of data. Instead, they focus on metrics that summarize performance.&lt;/p&gt;

&lt;p&gt;For example, a company might want to track:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;total revenue
-average order value&lt;/li&gt;
&lt;li&gt;number of customers&lt;/li&gt;
&lt;li&gt;sales per product category&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;SQL aggregation functions allow us to calculate these metrics.&lt;/p&gt;

&lt;p&gt;Some of the most commonly used functions include:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;SUM()&lt;/strong&gt; – calculates totals&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;AVG()&lt;/strong&gt; – calculates averages&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;COUNT()&lt;/strong&gt; – counts records&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;MAX()&lt;/strong&gt; and &lt;strong&gt;MIN()&lt;/strong&gt; – identify extreme values&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;For example, suppose we want to calculate total revenue generated by each product category.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;category&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;SUM&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;order_amount&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;total_revenue&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;orders&lt;/span&gt;
&lt;span class="k"&gt;JOIN&lt;/span&gt; &lt;span class="n"&gt;products&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;p&lt;/span&gt;
    &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;o&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;product_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;product_id&lt;/span&gt;
&lt;span class="k"&gt;GROUP&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;category&lt;/span&gt;
&lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;total_revenue&lt;/span&gt; &lt;span class="k"&gt;DESC&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This type of query helps businesses understand which product categories generate the most income.&lt;/p&gt;

&lt;p&gt;Aggregation queries like this form the foundation of many dashboards and reports used by companies.&lt;/p&gt;

&lt;h2&gt;
  
  
  Window Functions: A Powerful Analytical Tool
&lt;/h2&gt;

&lt;p&gt;One of the most powerful features in SQL is the window function.&lt;/p&gt;

&lt;p&gt;Window functions allow analysts to perform calculations across related rows while still keeping each individual row in the result.&lt;/p&gt;

&lt;p&gt;This is different from aggregation queries, which combine rows into grouped summaries.&lt;/p&gt;

&lt;p&gt;For example, imagine a company wants to rank customers based on their total spending.&lt;/p&gt;

&lt;p&gt;We can use a window function to calculate the ranking.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;customer_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;total_spent&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;RANK&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="n"&gt;OVER&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;total_spent&lt;/span&gt; &lt;span class="k"&gt;DESC&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;spending_rank&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;customer_sales&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The RANK() function assigns a ranking based on total spending.&lt;/p&gt;

&lt;p&gt;Window functions are widely used in data analytics because they make it possible to perform calculations such as:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Rankings&lt;/li&gt;
&lt;li&gt;Running totals&lt;/li&gt;
&lt;li&gt;Moving averages&lt;/li&gt;
&lt;li&gt;Comparisons between rows&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;For example, analysts often use window functions to calculate cumulative revenue over time.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt;
    &lt;span class="n"&gt;order_date&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="k"&gt;SUM&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;order_amount&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;OVER&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;order_date&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;cumulative_revenue&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;orders&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This query shows how revenue accumulates day by day.&lt;/p&gt;

&lt;p&gt;These types of calculations are essential for understanding trends and growth patterns.&lt;/p&gt;

&lt;h2&gt;
  
  
  Subqueries
&lt;/h2&gt;

&lt;p&gt;Sometimes we need to calculate a value and then use it inside another query.&lt;/p&gt;

&lt;p&gt;This is where subqueries become useful.&lt;/p&gt;

&lt;p&gt;A subquery is simply a query nested inside another query.&lt;/p&gt;

&lt;p&gt;For example, suppose a company wants to identify customers who spend more than the average customer.&lt;/p&gt;

&lt;p&gt;First we must calculate the average spending, then compare each customer against that value.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;customer_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;total_spent&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;customer_sales&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;total_spent&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt; &lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="k"&gt;AVG&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;total_spent&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;customer_sales&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The inner query calculates the average spending across all customers.&lt;br&gt;
The outer query then filters customers whose spending is greater than that average.&lt;/p&gt;

&lt;p&gt;Subqueries allow analysts to break complex problems into smaller logical steps.&lt;/p&gt;
&lt;h2&gt;
  
  
  Common Table Expressions
&lt;/h2&gt;

&lt;p&gt;As SQL queries become more complex, they become difficult to read and maintain.&lt;/p&gt;

&lt;p&gt;A useful technique for solving this problem is the Common Table Expression (CTE).&lt;/p&gt;

&lt;p&gt;A CTE allows to create a temporary result set that can be referenced later in the query.&lt;/p&gt;

&lt;p&gt;For example, suppose we want to calculate total spending per customer and then rank them.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;WITH&lt;/span&gt; &lt;span class="n"&gt;customer_sales&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="k"&gt;SELECT&lt;/span&gt;
        &lt;span class="n"&gt;customer_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="k"&gt;SUM&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;order_amount&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;total_spent&lt;/span&gt;
    &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;orders&lt;/span&gt;
    &lt;span class="k"&gt;GROUP&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;customer_id&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;SELECT&lt;/span&gt;
    &lt;span class="n"&gt;customer_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;total_spent&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;RANK&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="n"&gt;OVER&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;total_spent&lt;/span&gt; &lt;span class="k"&gt;DESC&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;rank&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;customer_sales&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In this example:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;The CTE calculates total spending per customer.&lt;/li&gt;
&lt;li&gt;The main query ranks customers based on that spending.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;This structure makes the query easier to understand because each step represents a clear transformation of the data.&lt;/p&gt;

&lt;h2&gt;
  
  
  Analyzing Trends
&lt;/h2&gt;

&lt;p&gt;Many business questions involve understanding how data changes over time.&lt;/p&gt;

&lt;p&gt;For example:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;How has revenue changed month to month?&lt;/li&gt;
&lt;li&gt;Which days generate the most sales?&lt;/li&gt;
&lt;li&gt;Are users becoming more active or less active?&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;SQL makes it possible to analyze these trends directly within the database.&lt;/p&gt;

&lt;p&gt;For example, we might calculate monthly revenue using a date function.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt;
    &lt;span class="n"&gt;DATE_TRUNC&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'month'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;order_date&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="k"&gt;month&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="k"&gt;SUM&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;order_amount&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;monthly_revenue&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;orders&lt;/span&gt;
&lt;span class="k"&gt;GROUP&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="k"&gt;month&lt;/span&gt;
&lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="k"&gt;month&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This query groups orders by month and calculates the total revenue for each month.&lt;/p&gt;

&lt;p&gt;Time-based analysis helps businesses understand patterns such as:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Growth trends&lt;/li&gt;
&lt;li&gt;Seasonal demand&lt;/li&gt;
&lt;li&gt;Unexpected changes in performance&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;These insights help decision-makers plan strategies and allocate resources effectively.&lt;/p&gt;

&lt;h2&gt;
  
  
  Writing Efficient Queries for Large Datasets
&lt;/h2&gt;

&lt;p&gt;As datasets grow larger, query performance becomes increasingly important.&lt;/p&gt;

&lt;p&gt;A poorly written query might work fine on a small dataset but become extremely slow when processing millions of records.&lt;/p&gt;

&lt;p&gt;One common technique for improving performance is using indexes.&lt;/p&gt;

&lt;p&gt;An index works similarly to the index in a book. Instead of scanning every row in a table, the database can quickly locate the rows it needs.&lt;/p&gt;

&lt;p&gt;For example, creating an index on the customer_id column can improve the speed of queries that filter by customer.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;INDEX&lt;/span&gt; &lt;span class="n"&gt;idx_customer_id&lt;/span&gt;
&lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;orders&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;customer_id&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In addition to indexing, analysts can improve performance by:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Filtering data early using WHERE&lt;/li&gt;
&lt;li&gt;Avoiding unnecessary joins
-Selecting only required columns instead of SELECT *&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Efficient queries help ensure that dashboards, reports, and analytics pipelines run smoothly.&lt;/p&gt;

&lt;h2&gt;
  
  
  SQL in the Real Workflow of Data Teams
&lt;/h2&gt;

&lt;p&gt;In modern organizations, SQL is used across many different roles.&lt;/p&gt;

&lt;p&gt;Data analysts use SQL to explore datasets and build reports.&lt;/p&gt;

&lt;p&gt;Data scientists use SQL to extract and prepare data for machine learning models.&lt;/p&gt;

&lt;p&gt;Data engineers use SQL when building data pipelines and transforming raw data into structured tables.&lt;/p&gt;

&lt;p&gt;Even when other tools are involved, SQL usually plays a central role. Data may eventually be visualized in tools like Tableau or Power BI, but those tools often rely on SQL queries running behind the scenes.&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;Advanced SQL is an essential skill for anyone working with data.&lt;/p&gt;

&lt;p&gt;While basic SQL allows us to retrieve information from databases, advanced techniques enable us to perform deeper analysis and answer meaningful business questions.&lt;/p&gt;

&lt;p&gt;Techniques such as joins, aggregations, window functions, subqueries, and Common Table Expressions allow analysts to transform raw data into insights that organizations can act upon.&lt;/p&gt;

&lt;p&gt;More importantly, mastering SQL helps analysts think more clearly about data problems. Instead of simply retrieving rows from tables, you learn how to structure questions, transform datasets, and uncover patterns that would otherwise remain hidden.&lt;/p&gt;

&lt;p&gt;The goal of SQL is not just writing queries, it is understanding data well enough to tell meaningful stories with it.&lt;/p&gt;

</description>
      <category>sql</category>
      <category>data</category>
      <category>analytics</category>
      <category>database</category>
    </item>
    <item>
      <title>SQL Joins and Window Functions: A Beginner Friendly Guide.</title>
      <dc:creator>Kithokoi James</dc:creator>
      <pubDate>Sun, 01 Mar 2026 22:17:33 +0000</pubDate>
      <link>https://dev.to/kithokoi/sql-joins-and-window-functions-a-beginner-friendly-guide-iii</link>
      <guid>https://dev.to/kithokoi/sql-joins-and-window-functions-a-beginner-friendly-guide-iii</guid>
      <description>&lt;p&gt;If you’re learning data analytics, SQL will quickly become your daily tool. &lt;/p&gt;

&lt;p&gt;But then, two topics decide whether you’ll feel confident or constantly confused: joins and window functions. Joins are how you combine related data across tables, and window functions are how you do calculations across a set of table rows, running totals, and comparisons without destroying the detail in your result set. &lt;/p&gt;

&lt;p&gt;When you understand these two well, SQL stops feeling like memorizing commands and starts feeling like reasoning.&lt;/p&gt;

&lt;p&gt;A simple way to remember the difference is this: a join helps you bring in columns from another table, while a window function helps you compute insights across rows while keeping every row visible. &lt;/p&gt;

&lt;p&gt;In this article we will look at both of them and see how we can work with them.&lt;/p&gt;

&lt;h2&gt;
  
  
  What are Joins and How do Tables Connect to Tell a Story
&lt;/h2&gt;

&lt;p&gt;A database is usually split into multiple tables because it's cleaner and prevents repeatition of data. For example, customers are stored in one table, orders in another, and products in another. &lt;/p&gt;

&lt;p&gt;That structure is good design, but it creates a practical question, how do you view a complete story, like “show me each order with the customer name and the items inside it”? And that's where joins come into play.&lt;/p&gt;

&lt;p&gt;A join works by matching rows across tables using a condition written after ON. Most of the time, that condition matches a foreign key to a primary key. Briefly, one table stores a reference to another table, and SQL uses that reference to connect them.&lt;/p&gt;

&lt;p&gt;Imagine four common tables: &lt;em&gt;customers(customer_id, name), orders(order_id, customer_id, order_date, total_amount), order_items(order_id, product_id, qty, unit_price),&lt;/em&gt; and &lt;em&gt;products(product_id, product_name)&lt;/em&gt;. An order belongs to a customer, and order items belong to an order. Those “belongs to” relationships are exactly what joins are built for.&lt;/p&gt;

&lt;h3&gt;
  
  
  Types of Joins
&lt;/h3&gt;

&lt;p&gt;-1. INNER JOINS&lt;br&gt;
An inner join returns only the rows where a match exists on both sides. If you join orders to customers, SQL will keep only the orders that successfully match a customer record. This is the join you use when you only want valid, connected data.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;o&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;order_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;o&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;order_date&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;customer_name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;o&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;total_amount&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;orders&lt;/span&gt; &lt;span class="n"&gt;o&lt;/span&gt;
&lt;span class="k"&gt;INNER&lt;/span&gt; &lt;span class="k"&gt;JOIN&lt;/span&gt; &lt;span class="n"&gt;customers&lt;/span&gt; &lt;span class="k"&gt;c&lt;/span&gt;
  &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;o&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;customer_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;customer_id&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This query produces one row per order, assuming each order has exactly one customer. If an order points to a customer ID that doesn’t exist, that order won’t show up, because INNER JOIN keeps only confirmed matches.&lt;/p&gt;

&lt;p&gt;-2. LEFT JOIN&lt;br&gt;
A left join keeps every row from the left table, even when there is no match on the right. &lt;/p&gt;

&lt;p&gt;When no match exists, the right-side columns become NULL. This join is extremely common because it lets you keep your “main list” intact.&lt;/p&gt;

&lt;p&gt;For example, if you want to list all customers and show any orders they might have, you’d use a LEFT JOIN. Customers with no orders still appear, which is often exactly what you need when analyzing conversion or engagement.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="k"&gt;c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;customer_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;o&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;order_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;o&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;order_date&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;customers&lt;/span&gt; &lt;span class="k"&gt;c&lt;/span&gt;
&lt;span class="k"&gt;LEFT&lt;/span&gt; &lt;span class="k"&gt;JOIN&lt;/span&gt; &lt;span class="n"&gt;orders&lt;/span&gt; &lt;span class="n"&gt;o&lt;/span&gt;
  &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;o&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;customer_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;customer_id&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;With this querry, you’ll see customers who have never ordered, and their order columns will be NULL&lt;/p&gt;

&lt;p&gt;-3. RIGHT JOIN&lt;br&gt;
A right join is the same idea as a LEFT JOIN but reversed. It keeps everything from the right table. It’s not wrong, but many analysts avoid it because it’s hard to read. If you swap the table order, you can express it as a LEFT JOIN and keep your queries consistent.&lt;/p&gt;
&lt;h2&gt;
  
  
  The Join Mistakes People Make That Destroy Accuracy
&lt;/h2&gt;

&lt;p&gt;Joins are not difficult because the syntax is hard. Joins are difficult because it’s easy to get correct looking results that are mathematically wrong.&lt;/p&gt;

&lt;p&gt;The first major trap is row multiplication. When you join a “one” table to a “many” table, you will repeat the “one” row for each matching “many” row. For example, one order can have multiple order items. If you join orders to order_items, the order row repeats once per item. That repetition is correct, but it becomes confusing when you sum values from the repeated side.&lt;/p&gt;

&lt;p&gt;Here’s a common mistake: calculating revenue from orders.total_amount after joining to items.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="k"&gt;SUM&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;o&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;total_amount&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;revenue&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;orders&lt;/span&gt; &lt;span class="n"&gt;o&lt;/span&gt;
&lt;span class="k"&gt;JOIN&lt;/span&gt; &lt;span class="n"&gt;order_items&lt;/span&gt; &lt;span class="n"&gt;oi&lt;/span&gt;
  &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;o&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;order_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;oi&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;order_id&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If an order has five items, o.total_amount appears five times, and your revenue becomes inflated. The fix here is simple once you see it, don’t sum a value from a table after it has been multiplied. Either sum directly from orders without joining, or calculate revenue from items using &lt;em&gt;qty * unit_price.&lt;/em&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="k"&gt;SUM&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;oi&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;qty&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;oi&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;unit_price&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;revenue&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;order_items&lt;/span&gt; &lt;span class="n"&gt;oi&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The second trap is turning a LEFT JOIN into an INNER JOIN by accident. &lt;br&gt;
This happens when you filter right-table columns in the WHERE clause. A LEFT JOIN produces NULL for missing matches, but a WHERE filter removes those NULL rows.&lt;/p&gt;

&lt;p&gt;This query looks reasonable, but it’s misleading:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="k"&gt;c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;o&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;order_id&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;customers&lt;/span&gt; &lt;span class="k"&gt;c&lt;/span&gt;
&lt;span class="k"&gt;LEFT&lt;/span&gt; &lt;span class="k"&gt;JOIN&lt;/span&gt; &lt;span class="n"&gt;orders&lt;/span&gt; &lt;span class="n"&gt;o&lt;/span&gt;
  &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;o&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;customer_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;customer_id&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;o&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;order_date&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="nb"&gt;DATE&lt;/span&gt; &lt;span class="s1"&gt;'2026-01-01'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Customers with no orders have &lt;em&gt;o.order_date = NULL&lt;/em&gt;, and they get filtered out. To keep customers even when there’s no order, move the filter into the join condition.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="k"&gt;c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;o&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;order_id&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;customers&lt;/span&gt; &lt;span class="k"&gt;c&lt;/span&gt;
&lt;span class="k"&gt;LEFT&lt;/span&gt; &lt;span class="k"&gt;JOIN&lt;/span&gt; &lt;span class="n"&gt;orders&lt;/span&gt; &lt;span class="n"&gt;o&lt;/span&gt;
  &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;o&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;customer_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;customer_id&lt;/span&gt;
 &lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="n"&gt;o&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;order_date&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="nb"&gt;DATE&lt;/span&gt; &lt;span class="s1"&gt;'2026-01-01'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The third trap is joining on unstable fields like names. Names can repeat, change, and contain typos. In analytics, you should join using IDs or real keys whenever possible.&lt;/p&gt;

&lt;h2&gt;
  
  
  Window Functions
&lt;/h2&gt;

&lt;p&gt;Window functions solve a different problem. Many times, you want to calculate something “across a set of rows,” but you don’t want to collapse the result the way GROUP BY does.&lt;/p&gt;

&lt;p&gt;If you run:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;customer_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;SUM&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;total_amount&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;orders&lt;/span&gt;
&lt;span class="k"&gt;GROUP&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;customer_id&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You’ll get one row per customer. That’s useful, but you lose the order-level detail. Window functions let you compute totals, ranks, and comparisons while keeping each order row visible.&lt;/p&gt;

&lt;p&gt;The core idea is simple, a window function performs a calculation over a “window” of related rows defined by &lt;em&gt;OVER(...)&lt;/em&gt;. You can split rows into groups using &lt;em&gt;PARTITION BY&lt;/em&gt; and define ordering inside each group using &lt;em&gt;ORDER BY.&lt;/em&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  How Joins and Window Functions Work Together
&lt;/h2&gt;

&lt;p&gt;In real analytics, you often join first to bring in context, then apply window functions to compute insights. &lt;/p&gt;

&lt;p&gt;You might join orders to customers to get names, then rank customers by spending, or compute running totals per customer over time.&lt;/p&gt;

&lt;p&gt;The key is to stay clear on what each tool is doing. Joins change the shape of your data by combining rows from tables. Window functions add calculations across related rows without collapsing them. If you keep that difference in your head, your SQL becomes far more predictable.&lt;/p&gt;

&lt;h2&gt;
  
  
  Summary: The Key Ideas to Remember
&lt;/h2&gt;

&lt;p&gt;Joins connect tables. INNER JOIN keeps only matching rows, while LEFT JOIN keeps all rows from the left and fills missing matches with NULLs. Joins can multiply rows, so you must be careful when summing after a join, otherwise your totals may be inflated. &lt;/p&gt;

&lt;p&gt;Also, filtering right-table columns in a WHERE clause after a LEFT JOIN often defeats the purpose of the LEFT JOIN, so filters should be placed in the ON clause when you need to keep unmatched left rows.&lt;/p&gt;

&lt;p&gt;Window functions calculate across related rows while keeping each row in the result. PARTITION BY defines the group, ORDER BY defines the sequence, and window frames define how many rows are included in calculations.&lt;/p&gt;

</description>
      <category>sql</category>
      <category>data</category>
      <category>database</category>
    </item>
    <item>
      <title>How Analysts Translate Messy Data, DAX, and Dashboards into Action Using Power BI</title>
      <dc:creator>Kithokoi James</dc:creator>
      <pubDate>Fri, 13 Feb 2026 11:04:41 +0000</pubDate>
      <link>https://dev.to/kithokoi/how-analysts-translate-messy-data-dax-and-dashboards-into-action-using-power-bi-1392</link>
      <guid>https://dev.to/kithokoi/how-analysts-translate-messy-data-dax-and-dashboards-into-action-using-power-bi-1392</guid>
      <description>&lt;p&gt;Most organizations have plenty of data.&lt;/p&gt;

&lt;p&gt;What they don’t have is usable data.&lt;/p&gt;

&lt;p&gt;It’s scattered across Excel files, sales systems, finance tools, CRMs, and attachments. The data might have names that don’t match, dates that are broken and missing values, with every department swearing their version is the correct one.&lt;/p&gt;

&lt;p&gt;That’s why so many “data-driven” companies still make decisions based on gut feel.&lt;/p&gt;

&lt;p&gt;Because raw data is not insight and this is where analysts come into place, to clean data and make it make sense.&lt;/p&gt;

&lt;p&gt;That’s what analysts do.&lt;/p&gt;

&lt;p&gt;They take messy, disconnected data and refine it into something leaders can actually use.&lt;/p&gt;

&lt;p&gt;Cleaned data help you as a business understand:&lt;br&gt;
-Where you are? &lt;br&gt;
-What changed?&lt;br&gt;
-What’s working? &lt;br&gt;
-What should be done next?&lt;/p&gt;

&lt;p&gt;And Power BI is one of the most practical tools for that job.&lt;/p&gt;

&lt;p&gt;In this article, i will explain in simple language, how analysts use Power BI to:&lt;/p&gt;

&lt;p&gt;-clean messy data (without writing code)&lt;/p&gt;

&lt;p&gt;-build a model that doesn’t break&lt;/p&gt;

&lt;p&gt;-use DAX to calculate real business metrics&lt;/p&gt;

&lt;p&gt;-create dashboards people actually understand&lt;/p&gt;

&lt;p&gt;-turn insights into actions that move the business&lt;/p&gt;

&lt;p&gt;Let’s walk through the same process professionals use, step by step but explained like you’re starting from zero.&lt;/p&gt;
&lt;h2&gt;
  
  
  Phase 1: Cleaning Messy Data with Power Query
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fq2y0rra8n6fgvcpqawxn.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fq2y0rra8n6fgvcpqawxn.png" alt=" " width="800" height="412"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;If the foundation is messy, the dashboard will lie even if it looks beautiful.&lt;/p&gt;

&lt;p&gt;Power Query is the “data cleaning room” in Power BI.&lt;/p&gt;

&lt;p&gt;This is where analysts fix the everyday problems that destroy accuracy.&lt;/p&gt;

&lt;p&gt;First: Connect to your data&lt;/p&gt;

&lt;p&gt;Power BI can pull data from different sources like Excel sheets and CSV files among other sources.&lt;/p&gt;

&lt;p&gt;The goal is always to collect the right data and make it consistent by spotting problems and working on them. This makes sure everything is okay before starting to work on the data.&lt;/p&gt;

&lt;p&gt;The most common cleaning fixes (the ones you’ll use all the time)&lt;br&gt;
1) Standardize names - Analysts standardize this into one clean value so reports don’t split totals across fake categories.&lt;/p&gt;

&lt;p&gt;2) Fix data types&lt;/p&gt;

&lt;p&gt;Power BI needs to know what each column is:&lt;/p&gt;

&lt;p&gt;Date should be Date (not text)&lt;/p&gt;

&lt;p&gt;-Sales should be Decimal Number (not text)&lt;/p&gt;

&lt;p&gt;-Quantity should be Whole Number (not “10 units”)&lt;/p&gt;

&lt;p&gt;If you skip this, DAX calculations become harder, filters behave strangely, and time-based charts break.&lt;/p&gt;

&lt;p&gt;3) Remove duplicates&lt;/p&gt;

&lt;p&gt;If a sales transaction appears twice, your dashboard confidently reports revenue that never existed.&lt;/p&gt;

&lt;p&gt;Analysts remove duplicates (or identify why they exist) before building anything.&lt;/p&gt;

&lt;p&gt;4) Drop unnecessary columns&lt;/p&gt;

&lt;p&gt;More columns = slower model.&lt;/p&gt;

&lt;p&gt;If you don’t use it for analysis, remove it. Your reports will refresh faster and your model becomes easier to understand.&lt;/p&gt;

&lt;p&gt;Cleaning is what makes your dashboard trustworthy.&lt;/p&gt;
&lt;h2&gt;
  
  
  Phase 2: Data Modeling
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frcdbpf5ol4tw9ce37gt3.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frcdbpf5ol4tw9ce37gt3.png" alt=" " width="800" height="504"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Once your data is clean, you organize it.&lt;/p&gt;

&lt;p&gt;This is where beginners often struggle because they load five tables and hope Power BI figures it out.&lt;/p&gt;

&lt;p&gt;Power BI doesn’t “figure it out.”&lt;/p&gt;

&lt;p&gt;It follows your model.&lt;/p&gt;

&lt;p&gt;The star schema &lt;br&gt;
A professional model usually looks like this:&lt;/p&gt;

&lt;p&gt;Fact table = the “events” table (transactions)&lt;/p&gt;

&lt;p&gt;Dimension tables = the “labels/context” tables (customers, products, dates, regions)&lt;/p&gt;

&lt;p&gt;And a star schema is where the fact table sits in the middle and dimensions connect around it like points of a star.&lt;/p&gt;

&lt;p&gt;Analysts love this structure because it makes everything work smoothly:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;
-slicers filter correctly

-calculations stay accurate

-performance is faster

-your model stays clean as data grows
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Relationships tell Power BI how tables connect. Most of the time, you want one-to-many:&lt;/p&gt;

&lt;p&gt;One product in DimProduct - many rows in FactSales&lt;/p&gt;

&lt;p&gt;One customer in DimCustomer - many rows in FactSales&lt;/p&gt;

&lt;p&gt;If relationships are wrong, filters don’t flow properly, and numbers become unreliable.&lt;/p&gt;

&lt;h2&gt;
  
  
  Phase 3: Using DAX
&lt;/h2&gt;

&lt;p&gt;DAX is the formula language in Power BI.&lt;/p&gt;

&lt;p&gt;The three types of DAX work analysts do most&lt;br&gt;
1) Basic totals&lt;/p&gt;

&lt;p&gt;Examples: Total Sales,Total Profit and Total Orders&lt;/p&gt;

&lt;p&gt;These sound simple and they are, but they become the building blocks for everything else.&lt;/p&gt;

&lt;p&gt;2) Time intelligence&lt;/p&gt;

&lt;p&gt;-This is where analysts answer questions like:&lt;/p&gt;

&lt;p&gt;-Are we growing compared to last year?&lt;/p&gt;

&lt;p&gt;-How much have we sold this year so far?&lt;/p&gt;

&lt;p&gt;-Which months are trending down?&lt;/p&gt;

&lt;p&gt;Common metrics: YTD (Year-to-Date), YoY (Year-over-Year) growth&lt;/p&gt;

&lt;p&gt;This is what turns a dashboard into something leaders actually use.&lt;/p&gt;

&lt;p&gt;3) Conditional logic&lt;/p&gt;

&lt;p&gt;This is where you label performance:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;-If profit margin &amp;gt; 20% → “Excellent”

-If margin between 10–20% → “Okay”

-If margin &amp;lt; 10% → “Needs attention”
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Functions like IF and SWITCH help you do that.&lt;/p&gt;

&lt;h2&gt;
  
  
  Phase 4: Dashboards That Tell a Story
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fittdfmsvqejl4o420j49.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fittdfmsvqejl4o420j49.png" alt=" " width="800" height="386"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Most dashboards fail for one reason, they show data, but they don’t communicate.&lt;/p&gt;

&lt;p&gt;A good dashboard answers questions quickly and makes the next decision obvious.&lt;/p&gt;

&lt;p&gt;The simple visual hierarchy analysts Use:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;-KPI cards for headline numbers (Sales, Profit, Margin)

-Line charts for trends over time
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Bar/column charts for category comparisons (regions, products)&lt;/p&gt;

&lt;p&gt;Tables when exact detail matters&lt;/p&gt;

&lt;p&gt;A simple rule helps:&lt;/p&gt;

&lt;p&gt;-If it’s time, use a line.&lt;/p&gt;

&lt;p&gt;-If it’s comparison, use bars.&lt;/p&gt;

&lt;p&gt;-If it’s a headline, use a card.&lt;/p&gt;

&lt;p&gt;Let people explore without calling you. Power BI dashboards become powerful when users can interact. On this, you can use slicers to filter (region, product, month).&lt;/p&gt;

&lt;p&gt;This changes dashboards from "static reports" into “self-service insight tools.”&lt;/p&gt;

&lt;p&gt;And this reduces the number of “Can you break this down?” messages you get.&lt;/p&gt;

&lt;h2&gt;
  
  
  Phase 5: Turning Insights Into Action
&lt;/h2&gt;

&lt;p&gt;This is the phase that matters most because a dashboard that doesn’t change decisions is just decoration.&lt;/p&gt;

&lt;p&gt;Analysts aim to move from:&lt;/p&gt;

&lt;p&gt;What happened to, why did it happen?, and the to, what should we do now?&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion:
&lt;/h2&gt;

&lt;p&gt;Power BI can connect data, clean it, model it, calculate it, and visualize it.&lt;/p&gt;

&lt;p&gt;But the real value comes from the person who uses it with discipline:&lt;/p&gt;

&lt;p&gt;-cleaning data so it can be trusted&lt;/p&gt;

&lt;p&gt;-modeling data so it behaves correctly&lt;/p&gt;

&lt;p&gt;-writing DAX so metrics match the business&lt;/p&gt;

&lt;p&gt;-Designing dashboards people can read in seconds&lt;/p&gt;

&lt;p&gt;-Translating insights into actions leaders can take&lt;/p&gt;

</description>
      <category>data</category>
      <category>analyst</category>
      <category>dax</category>
    </item>
    <item>
      <title>A Beginner’s Guide to Schemas and Data Modeling in Power BI</title>
      <dc:creator>Kithokoi James</dc:creator>
      <pubDate>Sun, 01 Feb 2026 22:15:17 +0000</pubDate>
      <link>https://dev.to/kithokoi/a-beginners-guide-to-schemas-and-data-modeling-in-power-bi-2cl1</link>
      <guid>https://dev.to/kithokoi/a-beginners-guide-to-schemas-and-data-modeling-in-power-bi-2cl1</guid>
      <description>&lt;p&gt;Power BI is often introduced as a visual analytics tool, but its effectiveness depends far more on data structure than on charts or formulas. &lt;/p&gt;

&lt;p&gt;Beginners mostly encounter issues such as incorrect totals, broken filters, or inconsistent results and assume the problem lies in DAX or visuals. In practice, these problems almost always originate from poor understanding of schemas and data modeling.&lt;/p&gt;

&lt;p&gt;In this article I will give an explanation of schemas and data modeling in Power BI. I will defines concepts, explains why they matter, and show how they affect everything.&lt;/p&gt;

&lt;h2&gt;
  
  
  Definition
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Schema&lt;/strong&gt;: The logical structure of tables and relationships that defines how data is organized for analysis.&lt;br&gt;&lt;br&gt;
&lt;strong&gt;Data modeling&lt;/strong&gt;: The process of designing and implementing that structure so analytical tools can correctly interpret and aggregate data.&lt;/p&gt;

&lt;h2&gt;
  
  
  What a Schema Means in Power BI
&lt;/h2&gt;

&lt;p&gt;In Power BI, a schema is not a separate object or configuration. It is the resulting structure formed by:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Tables loaded into the model&lt;/li&gt;
&lt;li&gt;The relationships between those tables&lt;/li&gt;
&lt;li&gt;How filters move between tables&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;A schema answers fundamental questions:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;What does each table represent?&lt;/li&gt;
&lt;li&gt;How does one table relate to another?&lt;/li&gt;
&lt;li&gt;When I filter one table, what other tables should be affected?&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Power BI is built for analysis, not for storing raw data. They are optimized for querying, aggregation, and slicing. Because of this, the way data is structured matters more.&lt;/p&gt;

&lt;h2&gt;
  
  
  What is Data Modeling
&lt;/h2&gt;

&lt;p&gt;Data modeling is the process of preparing data so Power BI can analyze it correctly.&lt;/p&gt;

&lt;p&gt;This includes:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Separating numbers from descriptions&lt;/li&gt;
&lt;li&gt;Making sure each table has a clear purpose&lt;/li&gt;
&lt;li&gt;Defining clear relationships between tables&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Data modeling is not just loading data and hoping it works. Raw data usually comes from systems designed for recording transactions, not for answering analytical questions. Data modeling reshapes that raw data into something Power BI can understand.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why Schemas Matter in Analytical Models
&lt;/h2&gt;

&lt;p&gt;Power BI evaluates measures by applying filters through relationships. The structure of the schema determines:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Which rows are included in a calculation&lt;/li&gt;
&lt;li&gt;How slicers affect visuals&lt;/li&gt;
&lt;li&gt;Whether totals align with detail rows&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;An improperly designed schema brings out confusion. For example:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Multiple filter paths between tables can cause unpredictable results&lt;/li&gt;
&lt;li&gt;Many-to-many relationships can distort aggregations&lt;/li&gt;
&lt;li&gt;Bidirectional filters can introduce hidden dependencies&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3bsc3z31l3voer219atg.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3bsc3z31l3voer219atg.png" alt=" " width="800" height="533"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;A good schema removes confusion. Power BI knows exactly how tables relate, and calculations behave consistently.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Most Important Schema: Star Schema
&lt;/h2&gt;

&lt;p&gt;The most common and recommended structure in Power BI is the star schema.&lt;/p&gt;

&lt;h3&gt;
  
  
  What Is a Star Schema?
&lt;/h3&gt;

&lt;p&gt;A star schema organizes data into two main types of tables:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Fact tables&lt;/li&gt;
&lt;li&gt;Dimension tables&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftjz6kl00t9uyfaayw196.webp" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftjz6kl00t9uyfaayw196.webp" alt=" " width="800" height="833"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The fact table sits in the center. Dimension tables connect to it, forming a star-like shape.&lt;/p&gt;

&lt;h3&gt;
  
  
  Fact Tables
&lt;/h3&gt;

&lt;p&gt;Fact tables store:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Events or transactions&lt;/li&gt;
&lt;li&gt;Numeric values you want to analyze&lt;/li&gt;
&lt;li&gt;Keys that connect to dimension tables&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Examples:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Sales transactions&lt;/li&gt;
&lt;li&gt;Orders&lt;/li&gt;
&lt;li&gt;Website visits&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Fact tables answer the question: “What happened?”&lt;/p&gt;

&lt;h3&gt;
  
  
  Dimension Tables
&lt;/h3&gt;

&lt;p&gt;Dimension tables store:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Descriptive information&lt;/li&gt;
&lt;li&gt;Categories and labels&lt;/li&gt;
&lt;li&gt;One row per unique item&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Examples:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Products&lt;/li&gt;
&lt;li&gt;Customers&lt;/li&gt;
&lt;li&gt;Dates&lt;/li&gt;
&lt;li&gt;Locations&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Dimension tables answer the question: “Describe it.”&lt;/p&gt;

&lt;h2&gt;
  
  
  How Power BI Uses This Structure
&lt;/h2&gt;

&lt;p&gt;When you filter a report, Power BI:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Applies the filter to a dimension table&lt;/li&gt;
&lt;li&gt;Uses relationships to pass that filter to the fact table&lt;/li&gt;
&lt;li&gt;Calculates results using only the filtered rows&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;This works best when:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Relationships are one-to-many&lt;/li&gt;
&lt;li&gt;Filters flow from dimension tables to the fact table&lt;/li&gt;
&lt;li&gt;There is only one clear path between tables&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The star schema supports this behavior naturally.&lt;/p&gt;

&lt;h2&gt;
  
  
  A Simple Example
&lt;/h2&gt;

&lt;p&gt;Imagine you are analyzing sales data.&lt;/p&gt;

&lt;p&gt;A good model would include:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;A Sales table with revenue and IDs&lt;/li&gt;
&lt;li&gt;A Product table with product names and categories&lt;/li&gt;
&lt;li&gt;A Customer table with customer details&lt;/li&gt;
&lt;li&gt;A Date table for time analysis&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Each of these dimension tables connects directly to the Sales table. None of the dimensions connect to each other.&lt;/p&gt;

&lt;p&gt;This design makes it easy to filter sales by product, customer, or date without confusion.&lt;/p&gt;

&lt;h2&gt;
  
  
  Common Beginner Mistakes
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Using One Large Table
&lt;/h3&gt;

&lt;p&gt;Beginners often keep all data in one table. This causes duplication and limits flexibility. It works at first but creates problems as reports grow.&lt;/p&gt;

&lt;h3&gt;
  
  
  Many Relationships
&lt;/h3&gt;

&lt;p&gt;These relationships often lead to incorrect totals. They usually mean the data has not been modeled at the right level of detail.&lt;/p&gt;

&lt;h3&gt;
  
  
  Too Many Bidirectional Filters
&lt;/h3&gt;

&lt;p&gt;Bidirectional filters can cause hidden logic issues. They make models hard to understand.&lt;/p&gt;

&lt;p&gt;These mistakes happen because beginners focus on visuals before structure.&lt;/p&gt;

&lt;h2&gt;
  
  
  Who is This For and When to Use It
&lt;/h2&gt;

&lt;p&gt;Understanding schemas and data modeling is essential for:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Anyone learning Power BI for data analysis&lt;/li&gt;
&lt;li&gt;Analysts building dashboards and reports&lt;/li&gt;
&lt;li&gt;Users working with measures and aggregations&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This approach should be used for analytical reporting. It is not meant for data storage or transaction processing.&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;Schemas and data modeling define how Power BI understands your data. They affect accuracy, performance, and reliability.&lt;/p&gt;

&lt;p&gt;For beginners, learning these concepts early prevents many common problems. Once the data is structured correctly, Power BI becomes simpler, faster, and more predictable. The quality of your analysis depends on the quality of your model and that starts with the schema.&lt;/p&gt;

</description>
      <category>data</category>
      <category>analytics</category>
      <category>beginners</category>
    </item>
    <item>
      <title>Introduction to MS Excel for Data Analytics</title>
      <dc:creator>Kithokoi James</dc:creator>
      <pubDate>Sun, 25 Jan 2026 16:18:46 +0000</pubDate>
      <link>https://dev.to/kithokoi/introduction-to-ms-excel-for-data-analytics-41kh</link>
      <guid>https://dev.to/kithokoi/introduction-to-ms-excel-for-data-analytics-41kh</guid>
      <description>&lt;p&gt;Data is everywhere. Knowing  or unknowingly we interact with data on a daily basis, from sales numbers and customer lists to website traffic. Data helps us understand what is really happening behind the scenes.&lt;/p&gt;

&lt;p&gt;One of the easiest and most practical tools to start with in data analytics is MS Excel.&lt;/p&gt;

&lt;p&gt;Excel is more than a spreadsheet for storing numbers. It is a powerful, flexible tool that helps you organize data, clean it, analyze it, and turn it into insights—without needing any programming knowledge.&lt;/p&gt;

&lt;p&gt;In article, we will dive deep into MS Excel for data analytics using simple, beginner-friendly language. If you are new to data analysis, this guide will help you understand how Excel fits into the process and how its features are used in real life.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why Microsoft Excel Is Ideal for Beginners
&lt;/h2&gt;

&lt;p&gt;Excel is often the first tool people use when they start working with data and for good reason.&lt;/p&gt;

&lt;p&gt;It is easy to access, widely used in businesses, and simple to understand. You can see your data clearly, interact with it directly, and get results instantly.&lt;/p&gt;

&lt;p&gt;Most importantly, Excel allows beginners to learn how to ask data questions and find answers before moving on to advanced tools.&lt;/p&gt;

&lt;h2&gt;
  
  
  Understanding the Excell Workspace
&lt;/h2&gt;

&lt;p&gt;When you open Excel, you see a large grid called a &lt;strong&gt;worksheet&lt;/strong&gt;.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Columns&lt;/strong&gt; represent different variables (such as Name, Date, or Sales). Denoted by letters A,B,C,D onwards&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Rows&lt;/strong&gt; represent individual records (such as one customer or one transaction).They are denoted by numbers 1,2,3 and so on&lt;/li&gt;
&lt;li&gt;Each box is a &lt;strong&gt;cell&lt;/strong&gt;, identified by a column letter and row number (for example, A1).&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5ow3mbnu3bklja59xlbh.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5ow3mbnu3bklja59xlbh.png" alt=" " width="800" height="425"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;At the top is the &lt;strong&gt;Ribbon&lt;/strong&gt;, where you find tools for formatting, formulas, charts, and data analysis.&lt;/p&gt;

&lt;h2&gt;
  
  
  Structuring Data the Right Way
&lt;/h2&gt;

&lt;p&gt;Good data analysis starts with good structure.&lt;/p&gt;

&lt;p&gt;Each column should contain only one type of data, and each row should represent one complete entry. This makes your data easier to sort, filter, and analyze.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2dyu583sznj13hdvkqtq.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2dyu583sznj13hdvkqtq.png" alt=" " width="800" height="302"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This simple structure is the foundation of everything you will do in Excell.&lt;/p&gt;

&lt;h2&gt;
  
  
  Sorting and Filtering: Making Sense of Large Data
&lt;/h2&gt;

&lt;p&gt;Sorting and filtering are often the first real data analysis skills beginners learn.&lt;/p&gt;

&lt;p&gt;Sorting helps you arrange data—for example, highest sales to lowest or newest dates first.&lt;br&gt;
Filtering allows you to focus on specific parts of your data, such as sales from one region or records above a certain value.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwlj4024e7b1cck09n58j.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwlj4024e7b1cck09n58j.png" alt=" " width="800" height="391"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;These tools help you explore data without changing or damaging the original dataset.&lt;/p&gt;

&lt;h2&gt;
  
  
  Using Formulas to Analyze Data
&lt;/h2&gt;

&lt;p&gt;Formulas are what transform Excell from a data table into a data analytics tool. You do not need to learn hundreds of formulas—just a few key ones can take you very far.&lt;/p&gt;

&lt;h3&gt;
  
  
  Basic Analytical Formulas
&lt;/h3&gt;

&lt;p&gt;Some formulas are used in almost every analysis:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;SUM&lt;/strong&gt; – Adds values together&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;AVERAGE&lt;/strong&gt; – Calculates the mean&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;COUNT&lt;/strong&gt; – Counts the number of entries&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;MAX and MIN&lt;/strong&gt; – Find highest and lowest values&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fsqaqpce340bslukctygm.webp" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fsqaqpce340bslukctygm.webp" alt=" " width="750" height="341"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;These formulas help you summarize large datasets quickly.&lt;/p&gt;

&lt;h2&gt;
  
  
  Combining and Cleaning Data with Text Formulas
&lt;/h2&gt;

&lt;p&gt;Data does not always come in a perfect format. Names may be split, codes may be separated, or text may be inconsistent. This is where text formulas become useful.&lt;/p&gt;

&lt;h3&gt;
  
  
  CONCATENATE (or CONCAT)
&lt;/h3&gt;

&lt;p&gt;This formula joins text from multiple cells into one.&lt;/p&gt;

&lt;p&gt;For example, you can combine &lt;em&gt;First Name&lt;/em&gt; and &lt;em&gt;Last Name&lt;/em&gt; into a single column.&lt;/p&gt;

&lt;p&gt;This is especially useful when preparing data for reports or databases.&lt;/p&gt;

&lt;h2&gt;
  
  
  Finding and Matching Data with VLOOKUP
&lt;/h2&gt;

&lt;p&gt;One of the most important formulas in Excell for data analytics is &lt;strong&gt;VLOOKUP&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;VLOOKUP helps you find related information from another table. For example:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Matching employee IDs to names&lt;/li&gt;
&lt;li&gt;Finding product prices from a price list&lt;/li&gt;
&lt;li&gt;Linking sales data to product categories&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This formula allows you to work with multiple datasets at once, which is a common task in real-world data analysis.&lt;/p&gt;

&lt;h2&gt;
  
  
  Logical Thinking with IF Statements
&lt;/h2&gt;

&lt;p&gt;Data analysis often involves decision-making. The &lt;strong&gt;IF&lt;/strong&gt; formula helps you apply logic to your data.&lt;/p&gt;

&lt;p&gt;For example:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Mark sales as “High” or “Low”&lt;/li&gt;
&lt;li&gt;Identify students as “Pass” or “Fail”&lt;/li&gt;
&lt;li&gt;Flag customers as “Active” or “Inactive”&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F40jp2nb4w50q8yc3sdi1.webp" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F40jp2nb4w50q8yc3sdi1.webp" alt=" " width="800" height="357"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;IF statements help you turn raw numbers into meaningful categories.&lt;/p&gt;

&lt;h2&gt;
  
  
  Counting and Summarizing with COUNTIF and SUMIF
&lt;/h2&gt;

&lt;p&gt;When analyzing data, you often want to count or sum values that meet specific conditions.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;COUNTIF&lt;/strong&gt; counts entries that meet a rule&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;SUMIF&lt;/strong&gt; adds values that meet a rule&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;For example:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Total sales for one region&lt;/li&gt;
&lt;li&gt;Number of customers from one country&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fa8l50e1ihi12f1mkq5f2.webp" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fa8l50e1ihi12f1mkq5f2.webp" alt=" " width="800" height="204"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;These formulas are extremely useful for quick insights.&lt;/p&gt;

&lt;h2&gt;
  
  
  Visualizing Data with Charts
&lt;/h2&gt;

&lt;p&gt;Charts help turn numbers into stories.&lt;/p&gt;

&lt;p&gt;Excell allows you to create charts that show:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Trends over time&lt;/li&gt;
&lt;li&gt;Comparisons between categories&lt;/li&gt;
&lt;li&gt;Proportions of a whole&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Good charts make your analysis easier to understand and easier to explain.&lt;/p&gt;

&lt;h2&gt;
  
  
  Cleaning Data for Accurate Analysis
&lt;/h2&gt;

&lt;p&gt;Real-world data is often messy. Excell includes tools that help you clean and prepare data, such as:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Removing duplicates&lt;/li&gt;
&lt;li&gt;Fixing inconsistent text&lt;/li&gt;
&lt;li&gt;Formatting dates and numbers&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Clean data leads to reliable results, which is essential in analytics.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why Excell Still Matters in Data Analytics
&lt;/h2&gt;

&lt;p&gt;Even with modern tools like Python, SQL, and Power BI, Excell remains relevant. Many professionals still rely on it for:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Exploratory analysis&lt;/li&gt;
&lt;li&gt;Reporting&lt;/li&gt;
&lt;li&gt;Business decisions&lt;/li&gt;
&lt;li&gt;Data preparation&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Excell is often the first step before data moves into more advanced systems.&lt;/p&gt;

&lt;h2&gt;
  
  
  Final Thoughts
&lt;/h2&gt;

&lt;p&gt;Microsoft Excell is one of the best tools for beginners entering data analytics. It helps you organize data, analyze it, and discover insights using simple tools and formulas.&lt;/p&gt;

&lt;p&gt;You do not need to master everything at once. Start with basic formulas, explore your data, and build confidence step by step.&lt;/p&gt;

</description>
      <category>beginners</category>
      <category>excell</category>
      <category>analytics</category>
    </item>
    <item>
      <title>Essential Git Bash Commands for Pushing and Pulling Code on GitHub</title>
      <dc:creator>Kithokoi James</dc:creator>
      <pubDate>Sun, 18 Jan 2026 09:12:45 +0000</pubDate>
      <link>https://dev.to/kithokoi/essential-git-bash-commands-for-pushing-and-pulling-code-on-github-40mb</link>
      <guid>https://dev.to/kithokoi/essential-git-bash-commands-for-pushing-and-pulling-code-on-github-40mb</guid>
      <description>&lt;p&gt;If you are like me, using gitbash was a nightmare for the first time.&lt;/p&gt;

&lt;p&gt;Most beginners don’t struggle because Git is “too complex.” They struggle because they don’t know:&lt;/p&gt;

&lt;p&gt;-Which command to run,&lt;/p&gt;

&lt;p&gt;-In what order,&lt;/p&gt;

&lt;p&gt;-And what each command actually does.&lt;/p&gt;

&lt;p&gt;If you are interacting with gitbash for the first time,don't worry, because in this article, i will help you understand how towork on GitHub using Git Bash — confidently and without guessing.&lt;/p&gt;

&lt;h2&gt;
  
  
  What is Gitbash
&lt;/h2&gt;

&lt;p&gt;Git Bash is a command-line interface that lets you run Git commands on your local machine.&lt;/p&gt;

&lt;p&gt;It’s not GitHub, it’s not a programming language, and it’s not optional if you want to use Git from the terminal. Git Bash is simply the environment where Git commands are executed.&lt;/p&gt;

&lt;p&gt;You use Git Bash every time you want to pull code from GitHub or push your local changes back to it.&lt;/p&gt;

&lt;h2&gt;
  
  
  How to set up Gitbash
&lt;/h2&gt;

&lt;p&gt;This is a very important step before you start working on anything.&lt;/p&gt;

&lt;p&gt;You need to configure 2 things:&lt;br&gt;
-Your git username&lt;br&gt;
-Your git email&lt;/p&gt;

&lt;p&gt;To configure your user name you will use this command:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;git --config global user.name "yourname"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;And, to for your email, this is the right command to use:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;git --config global user.email "your email
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This is very important because everytime you want to work on anything Git uses this information to link your commits to your account.&lt;/p&gt;

&lt;h2&gt;
  
  
  Getting a Repository into Git Bash
&lt;/h2&gt;

&lt;p&gt;This is where most beginners get confused. There are only ways to go about this:&lt;/p&gt;

&lt;p&gt;-1. Cloning an Existing GitHub Repository&lt;br&gt;
This works if there is an already existing project on github that you want to clone. To do this, we use this command:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;git clone &amp;lt;repository-url&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This command downloads the project to your local machine and creates a local repository.&lt;/p&gt;

&lt;p&gt;-2. Using Git Bash Inside an Existing Local Project&lt;br&gt;
This works when the project exists on your local computer. To work on it, you innitialize git using:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;git init
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This command creates a local repository which is not yet connected to github, to connect it:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;git remote add origin &amp;lt;repository-url&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;And this works when you build projects locally and you want to push them to Github&lt;/p&gt;

&lt;h2&gt;
  
  
  The Core Workflow: From Local Changes to GitHub
&lt;/h2&gt;

&lt;p&gt;To work seamlessly on github, there are things you need to learn, if you miss this, working on github will prove to be a nightmare.&lt;/p&gt;

&lt;h3&gt;
  
  
  To check the current state of your project
&lt;/h3&gt;

&lt;p&gt;Before doing anything, run this to check the current status of your project:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;git status
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This commands helps you know which files changed and which ones are untracked. Its very important to run this command before you pull, before you commit and before pushing.&lt;/p&gt;

&lt;h3&gt;
  
  
  Pulling Code from GitHub Using Git Bash
&lt;/h3&gt;

&lt;p&gt;Pulling always comes before pushing. Git pull downloads changes from github and applies them to your local code which prevents conflitcts.&lt;/p&gt;

&lt;p&gt;To pull changes:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;git pull origin main
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You can replace main with your branch name if needed. &lt;/p&gt;

&lt;p&gt;You pull when you have a project you want to start working on or when you want to push your own changes.&lt;/p&gt;

&lt;h3&gt;
  
  
  How to Make Changes Locally (Before Pushing)
&lt;/h3&gt;

&lt;p&gt;You cannot push raw file changes, git only pushes commits. To stage a change after editing changes, use:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;git add
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Pushing Code to GitHub Using Git Bash
&lt;/h3&gt;

&lt;p&gt;This is where beginners aim at. What gitpush does is that it sends your local commits to github, here is the basic push command:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;git push -u origin &amp;lt;branch-name&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  What Are The Common Push and Pull Errors
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Rejected - this happens when you try to push without pulling first. The right order is to pull then push first.&lt;/li&gt;
&lt;li&gt;Authentication failed - This happens when git can't verify its you.
-Nothing to push - This will happen when you didnt commit anything.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The Correct Push–Pull Order&lt;br&gt;
As a beginner there is a sequence you need to master, in this order:&lt;br&gt;
Pull, change, add, commit, push.&lt;/p&gt;

&lt;p&gt;This order is very inportant as it keeps your code in sync, prevents rejected pushes and minimizes conflicts.&lt;/p&gt;

&lt;p&gt;Breaking this order will lead to errors everytime.&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;Understanding git might seem hard when you are starting, but it gets easier and better with time.&lt;/p&gt;

&lt;p&gt;You dont get good at it by memorizing commands.You perfect by running the same commands on different real repeatedly.&lt;/p&gt;

&lt;p&gt;The aim as a beginner should be to practise with actual repositories, make mistakes, read the errors and git will start making sense.&lt;/p&gt;

&lt;p&gt;Once you start moving, Gitbash starts feeling predictable.&lt;/p&gt;

</description>
      <category>datascience</category>
      <category>github</category>
      <category>git</category>
    </item>
  </channel>
</rss>
