<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Maxwel Waweru</title>
    <description>The latest articles on DEV Community by Maxwel Waweru (@maxwel_waweru_28).</description>
    <link>https://dev.to/maxwel_waweru_28</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3828000%2Ff21e93f7-4676-4df0-a158-340f371eeb55.png</url>
      <title>DEV Community: Maxwel Waweru</title>
      <link>https://dev.to/maxwel_waweru_28</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/maxwel_waweru_28"/>
    <language>en</language>
    <item>
      <title>How to Publish a Power BI Report and Embed It into a Website</title>
      <dc:creator>Maxwel Waweru</dc:creator>
      <pubDate>Sun, 05 Apr 2026 11:17:45 +0000</pubDate>
      <link>https://dev.to/maxwel_waweru_28/how-to-publish-a-power-bi-report-and-embed-it-into-a-website-1can</link>
      <guid>https://dev.to/maxwel_waweru_28/how-to-publish-a-power-bi-report-and-embed-it-into-a-website-1can</guid>
      <description>&lt;h2&gt;
  
  
  &lt;strong&gt;Introduction&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;Power BI is a business analytics service developed by Microsoft that allows users to visualize data, create interactive reports, and share insights across an organization.&lt;/p&gt;

&lt;p&gt;Throughout this course, you have learned about Power BI queries, DAX, data modeling, joins, charts, dashboards, and reporting. Now, it is time to take your skills a step further by sharing your work with the world.&lt;/p&gt;

&lt;p&gt;In this article, you will learn how to publish a Power BI report and embed it directly into a website. The process involves four main steps: creating a workspace, publishing your report, generating the embed code, and embedding the report on a website.&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;Step 1: Create a Workspace&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;Open your web browser and go to &lt;strong&gt;app.powerbi.com&lt;/strong&gt;. Sign in.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Click &lt;strong&gt;Workspaces&lt;/strong&gt; on the left navigation pane&lt;/li&gt;
&lt;li&gt;Click &lt;strong&gt;Create workspace&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Enter a name (e.g., &lt;em&gt;Sales Report Workspace&lt;/em&gt;)&lt;/li&gt;
&lt;li&gt;Click &lt;strong&gt;Save&lt;/strong&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjxjmw482non21pzzjmep.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjxjmw482non21pzzjmep.png" width="800" height="533"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This workspace acts as a container for storing your reports and dashboards.&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;Step 2: Upload and Publish Your Report&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;Open your report in Power BI Desktop.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Click &lt;strong&gt;Publish&lt;/strong&gt; on the Home ribbon&lt;/li&gt;
&lt;li&gt;Select your workspace&lt;/li&gt;
&lt;li&gt;Click &lt;strong&gt;Select&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Wait for the success confirmation&lt;/li&gt;
&lt;li&gt;Open the report in Power BI Service&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgcoqr3lzgo333jyw9mho.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgcoqr3lzgo333jyw9mho.png" width="800" height="533"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Your report is now available online in the Power BI Service.&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;Step 3: Generate the Embed Code&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;In Power BI Service:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Navigate to your workspace&lt;/li&gt;
&lt;li&gt;Click the &lt;strong&gt;Reports&lt;/strong&gt; tab&lt;/li&gt;
&lt;li&gt;Click the &lt;strong&gt;More options (three dots)&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Select &lt;strong&gt;Embed report → Publish to web (public)&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Click &lt;strong&gt;Create embed code&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Click &lt;strong&gt;Publish&lt;/strong&gt; to confirm&lt;/li&gt;
&lt;li&gt;Copy the iframe embed code&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fura61hsadnmaxae4nfbt.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fura61hsadnmaxae4nfbt.png" width="800" height="534"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Note that this method makes your report publicly accessible: anyone who has the link can view it without signing in.&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;Step 4: Embed the Report on a Website&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;Create an &lt;strong&gt;index.html&lt;/strong&gt; file and add the following code:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight html"&gt;&lt;code&gt;&lt;span class="cp"&gt;&amp;lt;!DOCTYPE html&amp;gt;&lt;/span&gt;
&lt;span class="nt"&gt;&amp;lt;html&amp;gt;&lt;/span&gt;
&lt;span class="nt"&gt;&amp;lt;head&amp;gt;&amp;lt;title&amp;gt;&lt;/span&gt;Power BI Report&lt;span class="nt"&gt;&amp;lt;/title&amp;gt;&amp;lt;/head&amp;gt;&lt;/span&gt;
&lt;span class="nt"&gt;&amp;lt;body&amp;gt;&lt;/span&gt;
    &lt;span class="nt"&gt;&amp;lt;h1&amp;gt;&lt;/span&gt;Sales Dashboard&lt;span class="nt"&gt;&amp;lt;/h1&amp;gt;&lt;/span&gt;
    &lt;span class="nt"&gt;&amp;lt;iframe&lt;/span&gt; &lt;span class="na"&gt;width=&lt;/span&gt;&lt;span class="s"&gt;"100%"&lt;/span&gt; &lt;span class="na"&gt;height=&lt;/span&gt;&lt;span class="s"&gt;"600"&lt;/span&gt;
        &lt;span class="na"&gt;src=&lt;/span&gt;&lt;span class="s"&gt;"YOUR_EMBED_URL_HERE"&lt;/span&gt;
        &lt;span class="na"&gt;frameborder=&lt;/span&gt;&lt;span class="s"&gt;"0"&lt;/span&gt;
        &lt;span class="na"&gt;allowFullScreen&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&lt;/span&gt;
    &lt;span class="nt"&gt;&amp;lt;/iframe&amp;gt;&lt;/span&gt;
&lt;span class="nt"&gt;&amp;lt;/body&amp;gt;&lt;/span&gt;
&lt;span class="nt"&gt;&amp;lt;/html&amp;gt;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Replace the &lt;code&gt;src&lt;/code&gt; placeholder with the URL from the &lt;code&gt;src&lt;/code&gt; attribute of the iframe code you copied in Step 3&lt;/li&gt;
&lt;li&gt;Open the file in a browser&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbgju649oiymft3x63hnn.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbgju649oiymft3x63hnn.png" width="800" height="534"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The report will now display directly on your website.&lt;/p&gt;
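
&lt;p&gt;If the report does not show up, a common cause is pasting the whole iframe snippet (or leaving the placeholder) where only the URL belongs. Here is a minimal sketch of a sanity check, assuming Publish to web links have the shape &lt;code&gt;https://app.powerbi.com/view?r=...&lt;/code&gt;. The function name and the sample values are made up for illustration.&lt;/p&gt;

```python
from urllib.parse import urlparse, parse_qs


def looks_like_publish_to_web_url(url):
    """Return True if url matches the assumed shape of a Publish to web link."""
    parts = urlparse(url)
    query = parse_qs(parts.query)
    return (
        parts.scheme == "https"
        and parts.netloc == "app.powerbi.com"
        and parts.path == "/view"
        and "r" in query  # the report token is assumed to travel in the "r" parameter
    )


# Hypothetical values, not real embed links
print(looks_like_publish_to_web_url("https://app.powerbi.com/view?r=EXAMPLE_TOKEN"))  # True
print(looks_like_publish_to_web_url("not-a-url"))  # False
```

&lt;p&gt;This only checks the shape of the link. It does not confirm that the report exists or that the embed code is still active.&lt;/p&gt;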

&lt;h2&gt;
  
  
  &lt;strong&gt;Key Insights&lt;/strong&gt;
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;“Publish to web” makes reports public and should not be used for sensitive data&lt;/li&gt;
&lt;li&gt;Workspaces help organize reports and dashboards&lt;/li&gt;
&lt;li&gt;Embedded reports pick up changes when you republish, though cached views may take a while to refresh&lt;/li&gt;
&lt;li&gt;iframe embedding works across most platforms&lt;/li&gt;
&lt;li&gt;Public embedding is free but lacks security controls&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;For secure embedding, consider Power BI Embedded or SharePoint Online.&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;Conclusion&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;You have successfully learned how to publish a Power BI report and embed it into a website. The process involves creating a workspace, publishing your report, generating an embed code, and embedding it into your site.&lt;/p&gt;

</description>
      <category>data</category>
      <category>javascript</category>
      <category>website</category>
      <category>tutorial</category>
    </item>
    <item>
      <title>Data Science vs Data Analysis vs Machine Learning.</title>
      <dc:creator>Maxwel Waweru</dc:creator>
      <pubDate>Sat, 04 Apr 2026 10:27:34 +0000</pubDate>
      <link>https://dev.to/maxwel_waweru_28/data-science-vs-data-analysis-vs-machine-learning-pen</link>
      <guid>https://dev.to/maxwel_waweru_28/data-science-vs-data-analysis-vs-machine-learning-pen</guid>
      <description>&lt;p&gt;
&lt;strong&gt;What's the difference? (And why you've been using them wrong)&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The Job Interview That Confused Me&lt;/strong&gt;&lt;br&gt;
I'll never forget my first data-related job interview.&lt;/p&gt;

&lt;p&gt;The recruiter asked: "So, do you have experience in Data Science?"&lt;/p&gt;

&lt;p&gt;I said yes.&lt;/p&gt;

&lt;p&gt;Then: "What about Data Analysis?"&lt;/p&gt;

&lt;p&gt;I said yes again.&lt;/p&gt;

&lt;p&gt;Then: "And Machine Learning?"&lt;/p&gt;

&lt;p&gt;I hesitated. Aren't they all the same thing?&lt;/p&gt;

&lt;p&gt;Spoiler: They're not. But for years, I used these terms like they were interchangeable. And honestly? Most beginners do the same.&lt;/p&gt;

&lt;p&gt;Today, I'm going to clear up this confusion once and for all – using one simple analogy you'll never forget.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The Coffee Shop Analogy&lt;/strong&gt;&lt;br&gt;
Imagine you run a small coffee shop.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Data Analysis (What happened?)&lt;/strong&gt;&lt;br&gt;
You look at your sales records from last month.&lt;/p&gt;

&lt;p&gt;"We sold 500 lattes and 300 cappuccinos."&lt;/p&gt;

&lt;p&gt;&lt;em&gt;"Sales peak between 8-10 AM."&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;"December had 20% higher sales than November."&lt;/p&gt;

&lt;p&gt;Data Analysis answers: What happened? and Why did it happen?&lt;/p&gt;

&lt;p&gt;It looks at the past. It describes. It summarizes.&lt;/p&gt;
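
&lt;p&gt;To make that concrete, here is a small sketch of "describe and summarize" in plain Python. The sales records are invented for illustration, and the variable names are my own.&lt;/p&gt;

```python
from collections import Counter

# Made-up sales records for illustration: (drink, hour of day it was sold)
sales = [
    ("latte", 8), ("latte", 9), ("cappuccino", 9),
    ("latte", 14), ("cappuccino", 8), ("latte", 9),
]

# What happened? Count units per drink and per hour.
drinks_sold = Counter(drink for drink, _ in sales)
sales_by_hour = Counter(hour for _, hour in sales)

print(drinks_sold["latte"])          # 4
print(sales_by_hour.most_common(1))  # [(9, 3)] -> the busiest hour and its sales count
```

&lt;p&gt;Nothing here predicts anything. It only counts what already happened, which is exactly the job of Data Analysis.&lt;/p&gt;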

&lt;p&gt;&lt;strong&gt;Machine Learning (What will happen?)&lt;/strong&gt;&lt;br&gt;
You take those same sales records and build a system that predicts the future.&lt;/p&gt;

&lt;p&gt;"Tomorrow, we'll likely sell 45 lattes."&lt;/p&gt;

&lt;p&gt;"Customers who buy a muffin usually also buy a cappuccino."&lt;/p&gt;

&lt;p&gt;"If it rains, soup sales go up 30%."&lt;/p&gt;

&lt;p&gt;Machine Learning answers: What will happen next?&lt;/p&gt;

&lt;p&gt;It learns from past data to make predictions.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Data Science (The big picture)&lt;/strong&gt;&lt;br&gt;
You combine everything – analysis, predictions, business strategy, and technical systems – to run a better coffee shop.&lt;/p&gt;

&lt;p&gt;You analyze past sales (Data Analysis)&lt;/p&gt;

&lt;p&gt;You predict future demand (Machine Learning)&lt;/p&gt;

&lt;p&gt;You decide to hire an extra barista for morning rush (Business Action)&lt;/p&gt;

&lt;p&gt;You build a system that automatically orders milk when stock is low (Deployment)&lt;/p&gt;

&lt;p&gt;Data Science is the entire universe. Data Analysis and Machine Learning are tools inside it.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The Side-by-Side Comparison&lt;/strong&gt;&lt;br&gt;
Let me break this down side by side.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Data Analysis (Descriptive &amp;amp; Diagnostic)&lt;/strong&gt;&lt;br&gt;
Main Question: What happened? Why?&lt;br&gt;
Focus: Past and present data&lt;br&gt;
Output: Reports, dashboards, charts, visualizations&lt;br&gt;
Key Skills: SQL, Excel, Statistics&lt;br&gt;
Complexity: Low to Medium &lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Machine Learning (Predictive)&lt;/strong&gt;&lt;br&gt;
Main Question: What will happen?&lt;br&gt;
Focus: Future predictions, finding patterns in data&lt;br&gt;
Output: Predictive models, algorithms&lt;br&gt;
Key Skills: Python, Algorithms, Math&lt;br&gt;
Complexity: Medium to High &lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Data Science (Comprehensive)&lt;/strong&gt;&lt;br&gt;
Main Question: How do we create value from data?&lt;br&gt;
Focus: The entire process (collecting, cleaning, analyzing, modeling)&lt;br&gt;
Output: Models + insights + systems + strategy&lt;br&gt;
Key Skills: All of the above + Business strategy + Deployment (Cloud)&lt;br&gt;
Complexity: High (covers everything) &lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;strong&gt;Key Differences in the Data&lt;/strong&gt;&lt;br&gt;
Human involvement: High in Analysis (interpretation), Medium in Machine Learning (guided learning).&lt;br&gt;
Data types: Data Analysis deals mostly with structured data, while Data Science often handles both structured and unstructured data.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The Venn Diagram (Visual Learners, This Is For You)&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Faytsd2u438fv7aj02vo0.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Faytsd2u438fv7aj02vo0.png" alt="Venn diagram explaining the relationship between Data Science, Data Analysis, and Machine Learning" width="800" height="769"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Data Science sits in the middle – borrowing tools from both Data Analysis and Machine Learning, while adding business context and deployment.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Real-World Examples&lt;/strong&gt;&lt;br&gt;
Let me show you how these three play out in different jobs.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Example 1: E-commerce (Amazon)&lt;/strong&gt;&lt;/p&gt;

&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;&lt;th&gt;Role&lt;/th&gt;&lt;th&gt;What they do&lt;/th&gt;&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;&lt;td&gt;Data Analyst&lt;/td&gt;&lt;td&gt;"Last month, shoes were the top-selling category. Returns increased by 5%."&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;Machine Learning Engineer&lt;/td&gt;&lt;td&gt;"I built a model that recommends products based on your browsing history."&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;Data Scientist&lt;/td&gt;&lt;td&gt;"I led the project to reduce cart abandonment. We analyzed behavior, built a prediction model, deployed a 'you forgot this' email system, and measured a 10% recovery rate."&lt;/td&gt;&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;

&lt;p&gt;&lt;strong&gt;Example 2: Healthcare (Hospital)&lt;/strong&gt;&lt;/p&gt;

&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;&lt;th&gt;Role&lt;/th&gt;&lt;th&gt;What they do&lt;/th&gt;&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;&lt;td&gt;Data Analyst&lt;/td&gt;&lt;td&gt;"In Q3, patient wait times averaged 25 minutes. ER visits peaked on weekends."&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;Machine Learning Engineer&lt;/td&gt;&lt;td&gt;"I created a model that predicts which patients are at high risk of readmission within 30 days."&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;Data Scientist&lt;/td&gt;&lt;td&gt;"I built an early warning system for sepsis. It collects vital signs (data), predicts deterioration (ML), alerts nurses (deployment), and tracks how many lives were saved (evaluation)."&lt;/td&gt;&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;

&lt;p&gt;&lt;strong&gt;Example 3: Sports (Football Team)&lt;/strong&gt;&lt;/p&gt;

&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;&lt;th&gt;Role&lt;/th&gt;&lt;th&gt;What they do&lt;/th&gt;&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;&lt;td&gt;Data Analyst&lt;/td&gt;&lt;td&gt;"Our striker scores 70% of his goals in the second half. The team concedes most goals between minutes 30-45."&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;Machine Learning Engineer&lt;/td&gt;&lt;td&gt;"I built a model that predicts injury risk based on player workload and fatigue data."&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;Data Scientist&lt;/td&gt;&lt;td&gt;"I created a player recruitment system. It analyzes past transfers (analysis), predicts future performance (ML), and helps the coach decide who to sign (decision)."&lt;/td&gt;&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;

&lt;p&gt;&lt;strong&gt;Which One Should You Learn First?&lt;/strong&gt;&lt;br&gt;
This is the question everyone asks.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Start with Data Analysis if:&lt;/strong&gt;&lt;br&gt;
You're completely new to data&lt;/p&gt;

&lt;p&gt;You work with Excel or SQL already&lt;/p&gt;

&lt;p&gt;You want quick, practical results&lt;/p&gt;

&lt;p&gt;You prefer reports and dashboards over code&lt;/p&gt;

&lt;p&gt;Learning path: Excel → SQL → Statistics → Visualization&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Jump to Machine Learning if:&lt;/strong&gt;&lt;br&gt;
You already know Python&lt;/p&gt;

&lt;p&gt;You're excited about predictions and AI&lt;/p&gt;

&lt;p&gt;You enjoy math and algorithms&lt;/p&gt;

&lt;p&gt;You want to build models&lt;/p&gt;

&lt;p&gt;Learning path: Python → Pandas → Scikit-learn → Basic ML algorithms&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Go for Data Science if:&lt;/strong&gt;&lt;br&gt;
You want the full picture&lt;/p&gt;

&lt;p&gt;You're aiming for a senior or leadership role&lt;/p&gt;

&lt;p&gt;You like mixing business + tech + statistics&lt;/p&gt;

&lt;p&gt;You want to deploy real systems&lt;/p&gt;

&lt;p&gt;Learning path: All of the above + Deployment + Big Data tools&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The Honest Truth&lt;/strong&gt;&lt;br&gt;
Here's what nobody tells you.&lt;/p&gt;

&lt;p&gt;Job titles are messy.&lt;/p&gt;

&lt;p&gt;Some companies call their Data Analysts "Data Scientists." Some call their ML engineers "Data Scientists." There's no police enforcing these definitions.&lt;/p&gt;

&lt;p&gt;What matters is skills, not titles.&lt;/p&gt;

&lt;p&gt;Can you clean messy data?&lt;/p&gt;

&lt;p&gt;Can you find insights?&lt;/p&gt;

&lt;p&gt;Can you build a simple prediction?&lt;/p&gt;

&lt;p&gt;Can you explain it to a non-technical person?&lt;/p&gt;

&lt;p&gt;Master those, and the title doesn't matter.&lt;/p&gt;

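&lt;p&gt;&lt;strong&gt;Quick Summary Table (Save This)&lt;/strong&gt;&lt;/p&gt;

&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;&lt;th&gt;You want to...&lt;/th&gt;&lt;th&gt;That's...&lt;/th&gt;&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;&lt;td&gt;Describe what happened&lt;/td&gt;&lt;td&gt;Data Analysis&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;Understand why it happened&lt;/td&gt;&lt;td&gt;Data Analysis&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;Predict what will happen&lt;/td&gt;&lt;td&gt;Machine Learning&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;Group similar things together&lt;/td&gt;&lt;td&gt;Machine Learning&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;Clean and prepare data&lt;/td&gt;&lt;td&gt;Both!&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;Build a dashboard&lt;/td&gt;&lt;td&gt;Data Analysis&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;Deploy a model to production&lt;/td&gt;&lt;td&gt;Data Science&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;Solve a business problem with data&lt;/td&gt;&lt;td&gt;Data Science&lt;/td&gt;&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;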

&lt;p&gt;&lt;strong&gt;Your Turn&lt;/strong&gt;&lt;br&gt;
Here's a simple exercise.&lt;/p&gt;

&lt;p&gt;Think of a problem you care about. Could be:&lt;/p&gt;

&lt;p&gt;Predicting which movies you'll like (ML)&lt;/p&gt;

&lt;p&gt;Analyzing your monthly spending (Analysis)&lt;/p&gt;

&lt;p&gt;Building a system to find cheap flights (Data Science)&lt;/p&gt;

&lt;p&gt;Which one is it? Drop your answer in the comments.&lt;/p&gt;

&lt;h3&gt;
  
  
  Next Up in the Series
&lt;/h3&gt;

&lt;p&gt;Up next: Getting Started with Python for Data Science – the exact steps to set up your environment and write your first data script.&lt;/p&gt;

&lt;p&gt;In the works:&lt;/p&gt;

&lt;p&gt;Introduction to Jupyter Notebook&lt;/p&gt;

&lt;p&gt;Top Free Tools Every Beginner Should Know&lt;/p&gt;

&lt;p&gt;What is Data Cleaning and Why It Matters&lt;/p&gt;

&lt;p&gt;Did this clear things up? Hit ❤️ if you'll never confuse these terms again.&lt;/p&gt;

&lt;p&gt;I'm Maxwel Waweru, writing daily beginner guides. Follow me so you don't miss tomorrow's Python setup guide!&lt;/p&gt;

&lt;p&gt;Previously in this series:&lt;/p&gt;

&lt;p&gt;What is Data Science? A Simple Beginner's Guide&lt;/p&gt;

&lt;p&gt;Understanding the Data Science Lifecycle&lt;/p&gt;

</description>
      <category>datascience</category>
      <category>machinelearning</category>
      <category>comparison</category>
      <category>ai</category>
    </item>
    <item>
      <title>Understanding the Data Science Lifecycle: From messy data to real-world impact – a step-by-step journey</title>
      <dc:creator>Maxwel Waweru</dc:creator>
      <pubDate>Fri, 03 Apr 2026 20:39:20 +0000</pubDate>
      <link>https://dev.to/maxwel_waweru_28/understanding-the-data-science-lifecycle-from-messy-data-to-real-world-impact-a-step-by-step-58bf</link>
      <guid>https://dev.to/maxwel_waweru_28/understanding-the-data-science-lifecycle-from-messy-data-to-real-world-impact-a-step-by-step-58bf</guid>
      <description>&lt;p&gt;&lt;strong&gt;The 6 PM Realization&lt;/strong&gt;&lt;br&gt;
Picture this.&lt;/p&gt;

&lt;p&gt;You've just learned what Data Science is (maybe from the previous article in this series). You're excited. You open your laptop, download a dataset, and… freeze.&lt;/p&gt;

&lt;p&gt;Where do I even start?&lt;/p&gt;

&lt;p&gt;Do you clean the data first? Build a model? Make a chart? Call it a day and watch Netflix?&lt;/p&gt;

&lt;p&gt;I've been there. Most beginners think Data Science is a single leap from "I have data" to "I have answers."&lt;/p&gt;

&lt;p&gt;It's not.&lt;/p&gt;

&lt;p&gt;It's a journey with several clear stops along the way. And once you understand the map, the whole process becomes 10x less intimidating.&lt;/p&gt;

&lt;p&gt;Let me walk you through it.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What is the Data Science Lifecycle?&lt;/strong&gt;&lt;br&gt;
Think of it like building a house.&lt;/p&gt;

&lt;p&gt;You wouldn't start by painting the walls. You'd first:&lt;/p&gt;

&lt;p&gt;Talk to the family (understand the need)&lt;/p&gt;

&lt;p&gt;Draw a blueprint (plan)&lt;/p&gt;

&lt;p&gt;Lay the foundation (prepare)&lt;/p&gt;

&lt;p&gt;Build the structure (create)&lt;/p&gt;

&lt;p&gt;Inspect the work (evaluate)&lt;/p&gt;

&lt;p&gt;Hand over the keys (deploy)&lt;/p&gt;

&lt;p&gt;The Data Science Lifecycle is exactly that—a structured, repeatable process for turning raw data into real value.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The 6 Stages (Your Roadmap)&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Stage 1&lt;/strong&gt;: Problem Definition (The "Why")&lt;br&gt;
Before you write a single line of code, you must answer one question:&lt;/p&gt;

&lt;p&gt;What problem are we trying to solve?&lt;/p&gt;

&lt;p&gt;Bad question: "Let's use AI on our customer data!"&lt;/p&gt;

&lt;p&gt;Good question: "Why are 20% of our customers leaving within the first 30 days?"&lt;/p&gt;

&lt;p&gt;What happens here:&lt;/p&gt;

&lt;p&gt;Talk to business stakeholders&lt;/p&gt;

&lt;p&gt;Define success (e.g., "reduce churn by 15%")&lt;/p&gt;

&lt;p&gt;Set clear goals&lt;/p&gt;

&lt;p&gt;Beginner tip: A well-defined problem is 50% of the solution. Don't skip this.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Stage 2&lt;/strong&gt;: Data Collection (The "Where")&lt;br&gt;
Now you need the raw ingredients.&lt;/p&gt;

&lt;p&gt;Where does data live?&lt;/p&gt;

&lt;p&gt;Databases (SQL)&lt;/p&gt;

&lt;p&gt;CSV/Excel files&lt;/p&gt;

&lt;p&gt;APIs (Twitter, weather, etc.)&lt;/p&gt;

&lt;p&gt;Web scraping&lt;/p&gt;

&lt;p&gt;Surveys&lt;/p&gt;

&lt;p&gt;Example: To understand customer churn, you might collect:&lt;/p&gt;

&lt;p&gt;Customer demographics&lt;/p&gt;

&lt;p&gt;Purchase history&lt;/p&gt;

&lt;p&gt;Support ticket logs&lt;/p&gt;

&lt;p&gt;Website activity&lt;/p&gt;

&lt;p&gt;Beginner tip: Start with ready-made datasets from Kaggle or Google Dataset Search. Don't worry about collecting your own data yet.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Stage 3&lt;/strong&gt;: Data Preparation (The "80%" – Seriously)&lt;br&gt;
This is the least glamorous but most important stage.&lt;/p&gt;

&lt;p&gt;Why? Real-world data is messy. Really messy.&lt;/p&gt;

&lt;p&gt;Common problems you'll find:&lt;/p&gt;

&lt;p&gt;Missing values (blanks)&lt;/p&gt;

&lt;p&gt;Duplicates&lt;/p&gt;

&lt;p&gt;Inconsistent formatting (e.g., "NY", "New York", "new york")&lt;/p&gt;

&lt;p&gt;Outliers (a 200-year-old customer?)&lt;/p&gt;

&lt;p&gt;Incorrect data types (dates stored as text)&lt;/p&gt;

&lt;p&gt;What you'll do:&lt;/p&gt;

&lt;p&gt;Clean missing data&lt;/p&gt;

&lt;p&gt;Remove duplicates&lt;/p&gt;

&lt;p&gt;Standardize formats&lt;/p&gt;

&lt;p&gt;Handle outliers&lt;/p&gt;

&lt;p&gt;Create new features (e.g., "age" from "birthdate")&lt;/p&gt;

&lt;p&gt;Beginner tip: Spend time here. A clean dataset leads to good models. Garbage in = garbage out.&lt;/p&gt;
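
&lt;p&gt;Here is a rough sketch of those cleaning steps in plain Python, so you can see them without installing anything. The rows, the alias table, and the age cutoff of 120 are all assumptions I made up for the example. In practice you would more likely reach for a library like pandas.&lt;/p&gt;

```python
# Made-up customer rows showing the problems listed above
rows = [
    {"name": "Ann", "city": "NY", "age": 34},
    {"name": "Ann", "city": "NY", "age": 34},          # duplicate
    {"name": "Bob", "city": "new york", "age": None},  # missing value
    {"name": "Cy", "city": "New York", "age": 200},    # implausible outlier
]

CITY_ALIASES = {"ny": "New York", "new york": "New York"}  # assumed mapping
MAX_PLAUSIBLE_AGE = 120                                    # assumed cutoff


def clean(rows):
    """Standardize city names, blank out implausible ages, drop duplicates."""
    seen, cleaned = set(), []
    for row in rows:
        city = CITY_ALIASES.get(row["city"].strip().lower(), row["city"])
        age = row["age"]
        if age is not None and age > MAX_PLAUSIBLE_AGE:
            age = None  # treat an implausible value as missing
        key = (row["name"], city, age)
        if key in seen:
            continue  # an identical row was already kept
        seen.add(key)
        cleaned.append({"name": row["name"], "city": city, "age": age})
    return cleaned


for row in clean(rows):
    print(row)
```

&lt;p&gt;Four messy rows go in and three consistent ones come out. What to do with the remaining missing ages (drop the rows, fill in a typical value, or leave them) depends on the problem you defined in Stage 1.&lt;/p&gt;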

&lt;p&gt;&lt;strong&gt;Stage 4&lt;/strong&gt;: Modeling (The "AI" Part – Finally!)&lt;br&gt;
This is what everyone thinks Data Science is.&lt;/p&gt;

&lt;p&gt;You use algorithms to find patterns or make predictions.&lt;/p&gt;

&lt;p&gt;Common algorithms for beginners:&lt;/p&gt;

&lt;p&gt;Linear Regression (predicting numbers, like house prices)&lt;/p&gt;

&lt;p&gt;Logistic Regression (predicting categories, like spam or not spam)&lt;/p&gt;

&lt;p&gt;Decision Trees (simple if-then rules)&lt;/p&gt;

&lt;p&gt;What happens here:&lt;/p&gt;

&lt;p&gt;Split data into training and testing sets&lt;/p&gt;

&lt;p&gt;Train the model on past data&lt;/p&gt;

&lt;p&gt;Make predictions on new data&lt;/p&gt;

&lt;p&gt;Beginner tip: Don't get lost in complex algorithms. Start with simple ones. They often work surprisingly well.&lt;/p&gt;
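
&lt;p&gt;The three steps above (split, train, predict) can be sketched with nothing but the standard library. This is a hand-rolled least-squares line, not what a library like scikit-learn does internally, and the data is invented so that price is exactly 2 × size + 1.&lt;/p&gt;

```python
import random


def fit_line(points):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Made-up data: (house size, price) following price = 2 * size + 1
data = [(size, 2 * size + 1) for size in range(1, 21)]

random.Random(0).shuffle(data)      # fixed seed so the split is repeatable
train, test = data[:16], data[16:]  # 80% for training, 20% held out

slope, intercept = fit_line(train)                      # train on past data
predictions = [slope * x + intercept for x, _ in test]  # predict on unseen data

for (size, actual), predicted in zip(test, predictions):
    print(size, actual, round(predicted, 2))
```

&lt;p&gt;Because the toy data is perfectly linear, the predictions match the held-out prices. Real data is noisy, which is why the next stage exists.&lt;/p&gt;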

&lt;p&gt;&lt;strong&gt;Stage 5&lt;/strong&gt;: Evaluation (The "Did It Work?")&lt;br&gt;
You've built a model. Great. But is it any good?&lt;/p&gt;

&lt;p&gt;Questions you ask:&lt;/p&gt;

&lt;p&gt;How accurate are the predictions?&lt;/p&gt;

&lt;p&gt;Does it work on data it hasn't seen before?&lt;/p&gt;

&lt;p&gt;Is it better than a random guess?&lt;/p&gt;

&lt;p&gt;Common metrics (don't panic – they're simple):&lt;/p&gt;

&lt;p&gt;Accuracy: What percentage did it get right?&lt;/p&gt;

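&lt;p&gt;Precision/Recall: Of the cases the model flagged, how many were right (precision)? Of the real cases, how many did it catch (recall)? (For fraud detection, you usually care more about recall, catching as much fraud as possible, even if that means some false alarms)&lt;/p&gt;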

&lt;p&gt;Beginner tip: Always test on data the model hasn't seen during training. Otherwise, it's like giving a student the answer key before the exam.&lt;/p&gt;
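
&lt;p&gt;These metrics are just counting. Below is a minimal sketch for a yes/no model, using made-up labels where 1 means "fraud" and 0 means "not fraud". It assumes the model flags at least one case and that at least one real case exists, otherwise the divisions would fail.&lt;/p&gt;

```python
def evaluate(actual, predicted):
    """Compute accuracy, precision, and recall for 0/1 labels."""
    pairs = list(zip(actual, predicted))
    true_pos = sum(1 for a, p in pairs if a and p)       # flagged and really fraud
    false_pos = sum(1 for a, p in pairs if not a and p)  # flagged but not fraud
    false_neg = sum(1 for a, p in pairs if a and not p)  # fraud that was missed
    correct = sum(1 for a, p in pairs if a == p)
    return {
        "accuracy": correct / len(pairs),
        "precision": true_pos / (true_pos + false_pos),
        "recall": true_pos / (true_pos + false_neg),
    }


# Made-up test-set labels for illustration
actual = [1, 0, 0, 1, 0, 0, 1, 0]
predicted = [1, 0, 0, 0, 0, 1, 1, 0]

print(evaluate(actual, predicted))
```

&lt;p&gt;Here the model gets 6 of 8 right (accuracy 0.75), but it only catches 2 of the 3 real fraud cases (recall about 0.67). That gap is why accuracy alone can be misleading.&lt;/p&gt;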

&lt;p&gt;&lt;strong&gt;Stage 6&lt;/strong&gt;: Deployment (The "Real World")&lt;br&gt;
A model on your laptop is worthless.&lt;/p&gt;

&lt;p&gt;It needs to go where decisions are made.&lt;/p&gt;

&lt;p&gt;What deployment looks like:&lt;/p&gt;

&lt;p&gt;A dashboard (e.g., "Customer churn risk" updated daily)&lt;/p&gt;

&lt;p&gt;An API (other apps can call your model)&lt;/p&gt;

&lt;p&gt;A simple report emailed to the team&lt;/p&gt;

&lt;p&gt;Beginner tip: For your first projects, "deployment" can mean sharing a Jupyter Notebook or creating a simple visualization. Don't overcomplicate it.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The Feedback Loop (Important!)&lt;/strong&gt;&lt;br&gt;
Notice the dashed arrow in the diagram?&lt;/p&gt;

&lt;p&gt;Once you deploy, you learn. The model makes mistakes. Business needs change. New data arrives.&lt;/p&gt;

&lt;p&gt;So you go back to Stage 1 and start again.&lt;/p&gt;

&lt;p&gt;Data Science is never "done." It's a cycle of continuous improvement.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;A Real-World Example&lt;/strong&gt;&lt;br&gt;
Let's walk through a quick example.&lt;/p&gt;

&lt;p&gt;Problem: An e-commerce store wants to predict which customers will buy again next month.&lt;/p&gt;

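&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;&lt;th&gt;Stage&lt;/th&gt;&lt;th&gt;What Happens&lt;/th&gt;&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;&lt;td&gt;1. Problem Definition&lt;/td&gt;&lt;td&gt;"Increase repeat purchases by 10%"&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;2. Data Collection&lt;/td&gt;&lt;td&gt;Purchase history, browsing behavior, email engagement&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;3. Data Preparation&lt;/td&gt;&lt;td&gt;Remove inactive accounts, fill missing ages, standardize dates&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;4. Modeling&lt;/td&gt;&lt;td&gt;Train a simple classification model (will buy / won't buy)&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;5. Evaluation&lt;/td&gt;&lt;td&gt;Model predicts correctly 85% of the time&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;6. Deployment&lt;/td&gt;&lt;td&gt;Add a "high risk of churn" badge to the internal dashboard&lt;/td&gt;&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;

&lt;p&gt;Then: The marketing team sends special offers to high-risk customers. Repeat purchases go up. The model gets retrained with new data. The cycle continues.&lt;/p&gt;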

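&lt;p&gt;&lt;strong&gt;Common Beginner Mistakes&lt;/strong&gt;&lt;/p&gt;

&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;&lt;th&gt;Mistake&lt;/th&gt;&lt;th&gt;What to Do Instead&lt;/th&gt;&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;&lt;td&gt;Starting with modeling&lt;/td&gt;&lt;td&gt;Start with problem definition&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;Skipping data cleaning&lt;/td&gt;&lt;td&gt;Embrace it – it's often said to be 80% of the work&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;Testing on training data&lt;/td&gt;&lt;td&gt;Always hold out a test set&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;Perfecting one stage before moving on&lt;/td&gt;&lt;td&gt;Iterate. Go through the whole cycle quickly first, then improve&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;Forgetting deployment&lt;/td&gt;&lt;td&gt;Ask early: "How will this be used?"&lt;/td&gt;&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;

&lt;p&gt;&lt;strong&gt;Your Turn&lt;/strong&gt;&lt;br&gt;
You don't need to master all 6 stages at once.&lt;/p&gt;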

&lt;p&gt;Start small. Pick a simple dataset. Go through each stage – even if it's messy. You'll learn more from one full cycle than from ten tutorials.&lt;/p&gt;

&lt;p&gt;Next step: Tomorrow, we'll compare Data Science vs Data Analysis vs Machine Learning – so you never mix them up again.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Quick Recap&lt;/strong&gt;&lt;br&gt;
The Data Science Lifecycle is a structured process, not random hacking&lt;/p&gt;

&lt;p&gt;6 stages: Problem → Collect → Prepare → Model → Evaluate → Deploy&lt;/p&gt;

&lt;p&gt;Data preparation often takes the biggest share of the time, commonly cited as around 80% (and that's normal)&lt;/p&gt;

&lt;p&gt;Modeling is just one piece of the puzzle&lt;/p&gt;

&lt;p&gt;The cycle never ends – you continuously improve&lt;/p&gt;

&lt;p&gt;Found this helpful? Hit the ❤️ or 🦄 to help other beginners find their way.&lt;/p&gt;

&lt;p&gt;Question for you: Which stage sounds most intimidating to you right now? Drop a comment below – I'd love to help.&lt;/p&gt;

&lt;p&gt;I'm Maxwel Waweru, writing daily beginner guides on data science, analytics, and AI. Follow me so you don't miss tomorrow's post!&lt;/p&gt;

&lt;p&gt;Previously in this series: What is Data Science? A Simple Beginner's Guide&lt;br&gt;
Coming up: Data Science vs Data Analysis vs Machine Learning&lt;/p&gt;

</description>
      <category>datascience</category>
      <category>beginners</category>
      <category>tutorial</category>
      <category>learning</category>
    </item>
    <item>
      <title>What is Data Science? A Simple Beginner’s Guide</title>
      <dc:creator>Maxwel Waweru</dc:creator>
      <pubDate>Tue, 31 Mar 2026 19:34:20 +0000</pubDate>
      <link>https://dev.to/maxwel_waweru_28/what-is-data-science-a-simple-beginners-guide-le6</link>
      <guid>https://dev.to/maxwel_waweru_28/what-is-data-science-a-simple-beginners-guide-le6</guid>
      <description>&lt;p&gt;&lt;strong&gt;The 3 AM Confession&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Let me tell you a secret.&lt;/p&gt;

&lt;p&gt;When I first started learning to code, I thought Data Science was magic. I pictured a hooded figure in a dark room, lines of green code scrolling down a screen, whispering incantations like pandas.DataFrame() and fit_transform().&lt;/p&gt;

&lt;p&gt;I thought you had to be a mathematician, a statistician, and a senior developer all rolled into one.&lt;/p&gt;

&lt;p&gt;The truth? Data Science is not magic. It is storytelling.&lt;/p&gt;

&lt;p&gt;If you are a developer looking to dip your toes into this field, or a complete beginner wondering where to start, forget the complex math for a second. Let’s strip away the buzzwords and look at what Data Science actually is.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. The Recipe Analogy (The Easiest Way to Understand)&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Imagine you want to bake the perfect chocolate chip cookie.&lt;/p&gt;

&lt;p&gt;· Data: You have a notebook filled with past attempts. Some cookies were burnt; some were gooey. You have notes on oven temperatures, types of chocolate, and how long you chilled the dough. This is your Raw Data.&lt;br&gt;
· Data Science: You look through this notebook. You notice that every time the oven was above 375°F, the cookies burnt. But when you used dark chocolate and chilled the dough for 24 hours, they were perfect.&lt;br&gt;
· The Output: You write a new recipe. You now know exactly what to do to get a perfect cookie every single time. You can even predict that if a friend uses margarine instead of butter, the cookies will spread too thin.&lt;/p&gt;

&lt;p&gt;Data Science is exactly this process, but for business problems.&lt;/p&gt;

&lt;p&gt;You take messy ingredients (raw data), you analyze past experiments (exploratory analysis), you find hidden patterns (machine learning), and you produce a recipe (an actionable insight or a model) that tells you what to do next.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. Breaking Down the Buzzwords&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;If we look at the technical definition, Data Science sits at the intersection of three distinct worlds.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;graph TD
    A[Math &amp;amp; Statistics] --- B(Data Science)
    B --- C[Programming &amp;amp; Databases]
    B --- D[Domain Expertise]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Let’s break that down in human terms:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. Math &amp;amp; Statistics (The Logic)&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;This isn’t about solving calculus problems on a whiteboard. It’s about asking: "Is this pattern real, or did it happen by accident?"&lt;/p&gt;

&lt;p&gt;· Beginner take: You need to know the difference between average (mean) and the middle value (median). You need to know how to spot a lie (bias). You don’t need a PhD to start.&lt;/p&gt;
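
&lt;p&gt;To make the mean versus median point concrete, here is a minimal sketch using Python's built-in statistics module. The salary figures are made up for illustration; the last one is a deliberate outlier.&lt;/p&gt;

```python
import statistics

# Hypothetical monthly salaries; the last value is an outlier.
salaries = [3000, 3200, 3100, 2900, 50000]

mean_salary = statistics.mean(salaries)      # pulled upward by the outlier
median_salary = statistics.median(salaries)  # the middle value once sorted

print(f"mean={mean_salary}, median={median_salary}")
```

&lt;p&gt;The single large value drags the mean far above what most people in the list earn, while the median stays close to the typical value. That gap is often the first hint that an "average" is hiding something.&lt;/p&gt;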

&lt;p&gt;&lt;strong&gt;2. Programming &amp;amp; Databases (The Toolbox)&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;This is where you feel at home. We use code to clean the "dirty" data and build models.&lt;/p&gt;

&lt;p&gt;· Python is the language of choice (though R is great too).&lt;br&gt;
· SQL is non-negotiable. Most data lives in databases; you have to know how to ask for it.&lt;br&gt;
· Beginner take: If you can write a for loop and use SELECT * FROM, you have enough to start learning the rest.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;3. Domain Expertise (The Context)&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;This is the secret sauce. If you are analyzing healthcare data but don’t know what "blood pressure" means, your model will fail.&lt;/p&gt;

&lt;p&gt;· Beginner take: You don't need to be a doctor. But you must understand why the data exists. Data Science is useless without context.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;3. The Data Science Workflow (What the Job Actually Looks Like)&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Most beginners think the job is 100% building AI models. In reality, building the model is a small slice of the work, often estimated at around 10%. Here is the actual workflow:&lt;/p&gt;

&lt;p&gt;Step 1: Ask the Right Question&lt;/p&gt;

&lt;p&gt;Before writing a single line of code, you ask: "What problem are we solving?"&lt;/p&gt;

&lt;p&gt;· Bad question: "How do we use AI?"&lt;br&gt;
· Good question: "Why are we losing customers in the first month?"&lt;/p&gt;

&lt;p&gt;Step 2: Data Collection &amp;amp; Cleaning (The 80% Rule)&lt;/p&gt;

&lt;p&gt;If you take one thing away from this article, let it be this: Data Scientists are often said to spend about 80% of their time cleaning data. The exact figure varies, but cleaning is where most of the hours go.&lt;/p&gt;

&lt;p&gt;· You will find missing values (NaN).&lt;br&gt;
· You will find duplicates.&lt;br&gt;
· You will find dates formatted as text.&lt;br&gt;
· Your job here is to turn chaos into a tidy table.&lt;/p&gt;
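
&lt;p&gt;The three problems above can be sketched in plain Python. This is only an illustration: the rows, the column names, and the decision to drop (rather than fill in) rows with a missing amount are all assumptions made for the example.&lt;/p&gt;

```python
from datetime import datetime

# Hypothetical raw rows: one duplicate, one missing amount, dates stored as text.
raw_rows = [
    {"order_id": "1", "amount": "250.0", "date": "2026-03-01"},
    {"order_id": "2", "amount": "", "date": "2026-03-02"},
    {"order_id": "1", "amount": "250.0", "date": "2026-03-01"},
    {"order_id": "3", "amount": "99.5", "date": "2026-03-05"},
]

def clean(rows):
    """Return a tidy list of rows with proper types."""
    seen = set()
    tidy = []
    for row in rows:
        key = tuple(sorted(row.items()))
        if key in seen:          # skip exact duplicates
            continue
        seen.add(key)
        if row["amount"] == "":  # skip rows with a missing amount
            continue
        tidy.append({
            "order_id": int(row["order_id"]),
            "amount": float(row["amount"]),
            # turn the text date into a real date object
            "date": datetime.strptime(row["date"], "%Y-%m-%d").date(),
        })
    return tidy

clean_rows = clean(raw_rows)
print(clean_rows)
```

&lt;p&gt;In practice you would usually do this with pandas, but the logic is the same: remove duplicates, decide what to do with gaps, and convert text into real numbers and dates.&lt;/p&gt;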

&lt;p&gt;Step 3: Exploration (EDA)&lt;/p&gt;

&lt;p&gt;You play with the data. You make charts.&lt;/p&gt;

&lt;p&gt;· Does more screen time correlate with lower test scores? (Plot a scatter plot).&lt;br&gt;
· Which product sells the most? (Plot a bar chart).&lt;br&gt;
  This is where you find the "story."&lt;/p&gt;
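
&lt;p&gt;Alongside the scatter plot, a single number can summarize how strongly two columns move together. Below is a minimal sketch of the Pearson correlation coefficient written by hand; the screen-time and test-score values are invented for the example.&lt;/p&gt;

```python
import math

def pearson(xs, ys):
    """Pearson correlation: +1 is a perfect upward line, -1 a perfect downward line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical data: daily screen time (hours) and test scores for five students.
screen_hours = [1, 2, 3, 4, 5]
test_scores = [90, 85, 80, 70, 65]

r = pearson(screen_hours, test_scores)
print(round(r, 2))
```

&lt;p&gt;A value close to -1 suggests that scores fall as screen time rises in this toy sample. It says nothing about cause, which is exactly why the "is this pattern real?" question from the statistics section matters.&lt;/p&gt;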

&lt;p&gt;Step 4: Modeling (The "AI" Part)&lt;/p&gt;

&lt;p&gt;This is where you use algorithms (like Linear Regression, Random Forests, or Neural Networks) to make predictions.&lt;/p&gt;

&lt;p&gt;· Example: Based on past data, this customer will probably cancel their subscription next week.&lt;/p&gt;

&lt;p&gt;Step 5: Deployment &amp;amp; Communication&lt;/p&gt;

&lt;p&gt;If you build a model that stays on your laptop, it is useless. You have to deploy it (put it in the cloud) or create a dashboard. Most importantly, you have to explain to the CEO why they should listen to your model.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;4. Common Myths (Let’s Clear the Air)&lt;/strong&gt;&lt;/p&gt;

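&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Myth&lt;/th&gt;
&lt;th&gt;Reality&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;You need to be a genius at math.&lt;/td&gt;
&lt;td&gt;You need to be curious. Basic statistics and logical thinking get you 90% of the way. The libraries (scikit-learn) do the heavy math for you.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;You need a PhD.&lt;/td&gt;
&lt;td&gt;Some research roles do, but most industry roles care about your portfolio. Can you solve problems? That’s what matters.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Data Science is just about AI.&lt;/td&gt;
&lt;td&gt;No. Most Data Science is about descriptive analytics. "What happened last quarter and why?" AI is just a small (but fun) subset.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;It’s a solo job.&lt;/td&gt;
&lt;td&gt;It’s incredibly collaborative. You work with engineers, product managers, and business leaders constantly.&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;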

&lt;p&gt;&lt;strong&gt;5. How to Start (Without Overwhelming Yourself)&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;If you want to transition into Data Science, don’t try to learn everything at once. Do this instead:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Learn Python Basics: If you already know JavaScript or Java, Python will feel like writing English. Focus on pandas (for data) and matplotlib (for charts).&lt;/li&gt;
&lt;li&gt;Master SQL: Go to W3Schools or LeetCode and practice SQL until you can join tables in your sleep.&lt;/li&gt;
&lt;li&gt;Do a "Full" Project: Don’t just follow a tutorial. Find a dataset on Kaggle (e.g., Titanic or Housing Prices). Try to predict something.
· Fail.
· Google the error.
· Fix it.
· This is the real learning loop.&lt;/li&gt;
&lt;li&gt;Share Your Work: Write a post on Dev.to showing your first chart. Put the code on GitHub. This builds your portfolio faster than any certificate.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;strong&gt;Conclusion&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Data Science is not about having a crystal ball.&lt;/p&gt;

&lt;p&gt;It is about taking the messiest parts of a business (or a kitchen), using code to clean it up, using statistics to find the truth, and using storytelling to get people to act on it.&lt;/p&gt;

&lt;p&gt;You don't have to know everything today. You just have to ask one question: "I wonder why that happens?"&lt;/p&gt;

&lt;p&gt;If you have that curiosity, you already have the hardest skill to teach. The code is just the tool you use to find the answer.&lt;/p&gt;

&lt;p&gt;Ready to take the next step?&lt;br&gt;
Drop a comment below if you want a follow-up post on "Understanding the Data Science Lifecycle."&lt;/p&gt;

&lt;p&gt;I’m Maxwel Waweru, a Data Scientist who believes in breaking down complex topics into simple stories. Follow me for more.&lt;/p&gt;

</description>
      <category>datascience</category>
      <category>beginners</category>
      <category>python</category>
      <category>career</category>
    </item>
    <item>
      <title>UNDERSTANDING DATA MODELING IN POWER BI: JOINS, RELATIONSHIPS AND SCHEMAS EXPLAINED</title>
      <dc:creator>Maxwel Waweru</dc:creator>
      <pubDate>Tue, 31 Mar 2026 18:28:06 +0000</pubDate>
      <link>https://dev.to/maxwel_waweru_28/understanding-data-modeling-in-power-bijoinsrelationships-and-schemas-explained-23ep</link>
      <guid>https://dev.to/maxwel_waweru_28/understanding-data-modeling-in-power-bijoinsrelationships-and-schemas-explained-23ep</guid>
      <description>&lt;p&gt;In modern Business Intelligence, a beautiful report is only as good as the data model behind it. Power BI is a powerful tool, but without a solid understanding of how tables connect, you’ll quickly run into incorrect totals, sluggish performance, and confusing filter behavior.&lt;/p&gt;

&lt;p&gt;This article serves as your complete guide to mastering data modeling in Power BI. We will break down the technicalities of SQL joins, explain the unique nature of Power BI relationships, explore star schemas, and walk through where to actually build these elements inside the Power BI interface.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. What is Data Modeling?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Data modeling is the process of defining how data tables connect to each other. It involves structuring raw data (often from multiple sources) into a logical, unified framework that is optimized for reporting.&lt;/p&gt;

&lt;p&gt;A good data model ensures that:&lt;/p&gt;

&lt;p&gt;· Accuracy: Filters propagate correctly to show the right numbers.&lt;br&gt;
· Performance: Reports load quickly (compression and optimized queries).&lt;br&gt;
· Usability: End-users can intuitively click through reports without encountering errors.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. SQL Joins Explained (The Foundation)&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Before we look at Power BI Relationships, we must understand Joins. In Power BI, joins are performed in the Power Query Editor (the data preparation stage) using Merge Queries. They combine two tables horizontally based on a matching column, resulting in a single flattened table.&lt;/p&gt;

&lt;p&gt;Here are the six essential joins (the four classic SQL joins plus the left and right anti joins), visualized and explained with a real-world example of Sales and Customers.&lt;/p&gt;

&lt;p&gt;The Setup&lt;/p&gt;

&lt;p&gt;· Sales Table: Contains SaleID, CustomerID, and Amount.&lt;br&gt;
· Customers Table: Contains CustomerID and CustomerName.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. INNER JOIN&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Returns only rows where there is a match in both tables.&lt;/p&gt;

&lt;p&gt;· Use Case: Finding sales that belong to existing, valid customers. If a sale has a CustomerID not found in the Customers table, it is excluded.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;[Sales] ---(Match)---&amp;gt; [Customers]
Result: Only Sales associated with a known Customer.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;2. LEFT JOIN (LEFT OUTER)&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Returns all rows from the left table (Sales) and the matched rows from the right table (Customers). If no match, results are null.&lt;/p&gt;

&lt;p&gt;· Use Case: Keeping all sales transactions, even if the customer record was deleted or the ID is missing. This is the most common join in Power Query to preserve fact data.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;[Sales] ---------------------&amp;gt; [Customers]
Result: All Sales. Customer Name appears if exists; otherwise, blank.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;3. RIGHT JOIN (RIGHT OUTER)&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Returns all rows from the right table (Customers) and the matched rows from the left table (Sales). This is the logical opposite of a LEFT JOIN.&lt;/p&gt;

&lt;p&gt;· Use Case: Finding all customers, regardless of whether they have made a purchase.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;[Sales] &amp;lt;--------------------- [Customers]
Result: All Customers. Sales data appears only if they bought something.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;4. FULL OUTER JOIN&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Returns all rows from both tables. Where there is a match, they are combined; where there is not, missing sides are filled with null.&lt;/p&gt;

&lt;p&gt;· Use Case: Merging two systems to see a complete master list, such as merging legacy CRM data with a new CRM to see all records.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;[Sales] &amp;lt;----------------------&amp;gt; [Customers]
Result: Every sale and every customer. Unmatched sales have null names; unmatched customers have null amounts.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;5. LEFT ANTI JOIN&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Returns only rows from the left table that have no match in the right table.&lt;/p&gt;

&lt;p&gt;· Use Case: Data cleansing. Finding orphaned records (sales with missing Customer IDs) to flag or delete them.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;[Sales] ---(No Match)---&amp;gt; [Customers]
Result: Only sales with invalid CustomerIDs.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;6. RIGHT ANTI JOIN&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Returns only rows from the right table that have no match in the left table.&lt;/p&gt;

&lt;p&gt;· Use Case: Finding inactive customers (customers who exist in the CRM but have never made a sale).&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;[Sales] &amp;lt;---(No Match)--- [Customers]
Result: Customers with zero sales.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
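
&lt;p&gt;To see how the six join kinds differ on the same data, here is a rough sketch in plain Python. It is not how Power Query works internally; the merge function, the sample rows, and the customer names are all hypothetical, and the sketch assumes CustomerID is unique in the Customers table.&lt;/p&gt;

```python
# Hypothetical sample data matching the Sales and Customers setup above.
sales = [
    {"SaleID": 1, "CustomerID": 10, "Amount": 100},
    {"SaleID": 2, "CustomerID": 11, "Amount": 250},
    {"SaleID": 3, "CustomerID": 99, "Amount": 75},   # no such customer
]
customers = [
    {"CustomerID": 10, "CustomerName": "Amina"},
    {"CustomerID": 11, "CustomerName": "Brian"},
    {"CustomerID": 12, "CustomerName": "Chloe"},     # never bought anything
]

def merge(left, right, key, kind):
    """Combine two lists of dicts on `key`, mimicking one join kind."""
    right_by_key = {row[key]: row for row in right}  # assumes unique keys on the right
    left_keys = {row[key] for row in left}
    left_cols = [c for c in left[0] if c != key]
    right_cols = [c for c in right[0] if c != key]
    out = []
    for row in left:
        match = right_by_key.get(row[key])
        if match is not None and kind in ("inner", "left", "right", "full"):
            out.append({**row, **match})                          # matched pair
        elif match is None and kind in ("left", "full"):
            out.append({**row, **{c: None for c in right_cols}})  # left row, nulls on the right
        elif match is None and kind == "left_anti":
            out.append(dict(row))                                 # left row with no match
    for row in right:
        if row[key] not in left_keys:
            if kind in ("right", "full"):
                out.append({**{c: None for c in left_cols}, **row})  # right row, nulls on the left
            elif kind == "right_anti":
                out.append(dict(row))                                # right row with no match
    return out

for kind in ("inner", "left", "right", "full", "left_anti", "right_anti"):
    print(kind, len(merge(sales, customers, "CustomerID", kind)))
```

&lt;p&gt;Running the loop on this toy data shows the row count changing with the join kind, which is the quickest way to check that you picked the one you meant.&lt;/p&gt;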



&lt;p&gt;&lt;strong&gt;3. Power BI Relationships vs. SQL Joins&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;This is the most critical distinction to understand. Do not confuse a SQL Join with a Power BI Relationship.&lt;/p&gt;

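&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Feature&lt;/th&gt;
&lt;th&gt;SQL Join (Power Query)&lt;/th&gt;
&lt;th&gt;Power BI Relationship (Model View)&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Result&lt;/td&gt;
&lt;td&gt;Creates a single new table.&lt;/td&gt;
&lt;td&gt;Tables remain separate but connected.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Storage&lt;/td&gt;
&lt;td&gt;Data is duplicated (denormalized). Increases file size.&lt;/td&gt;
&lt;td&gt;Data is stored once (normalized). Optimizes compression.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Filtering&lt;/td&gt;
&lt;td&gt;No dynamic filtering. It’s a static merge.&lt;/td&gt;
&lt;td&gt;Dynamic. Filters flow across tables automatically based on the relationship.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Best For&lt;/td&gt;
&lt;td&gt;Lookup operations, adding columns to fact tables, or final staging.&lt;/td&gt;
&lt;td&gt;Creating star schemas, row-level security, and complex calculations (DAX).&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;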

&lt;p&gt;Rule of Thumb: Use Joins in Power Query to bring attributes into a table (e.g., adding "Product Name" to the Sales table). Use Relationships in the Model View to connect dimensions to facts.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;4. Power BI Relationships Deep Dive&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Relationships are the glue of the data model. You define them in the Model View.&lt;/p&gt;

&lt;p&gt;Cardinality&lt;/p&gt;

&lt;p&gt;· One-to-Many (1:M): The most common. A single row in one table (e.g., Products) relates to many rows in another (e.g., Sales). The filter flows from the "One" side to the "Many" side.&lt;br&gt;
· One-to-One (1:1): Rare. Typically the result of splitting a wide table in two, for example for security purposes. One-to-one relationships always cross-filter in both directions.&lt;br&gt;
· Many-to-Many (M:M): Complex. Either both tables have duplicate values, or a bridging table is involved. Power BI now supports M:M relationships natively, but they require careful management to avoid performance hits.&lt;/p&gt;

&lt;p&gt;Cross-Filter Direction&lt;/p&gt;

&lt;p&gt;· Single: The default. Filters propagate from the "One" side to the "Many" side.&lt;br&gt;
· Both (Bidirectional): Allows filtering in both directions. Useful for row-level security or bridging tables, but overuse can create ambiguous filter paths (circular dependencies).&lt;/p&gt;

&lt;p&gt;Active vs. Inactive Relationships&lt;/p&gt;

&lt;p&gt;You can only have one active path of propagation between two tables at a time.&lt;/p&gt;

&lt;p&gt;· Active: Solid line. Used by default for filters.&lt;br&gt;
· Inactive: Dotted line. Not used by default. You must activate it in DAX using the USERELATIONSHIP() function.&lt;br&gt;
· Use Case: A "Sales" table might have an active relationship to the "Date" table on its order date column, and an inactive relationship to the same "Date" table on its ship date column, which a measure can switch on to analyze shipments or shipping delays.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;5. Fact vs. Dimension Tables&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;A proper model relies on distinguishing these two types of tables.&lt;/p&gt;

&lt;p&gt;· Fact Tables (Transactions):&lt;br&gt;
  · Contain quantitative data (numbers, measures).&lt;br&gt;
  · Think: Sales Amount, Quantity, Revenue.&lt;br&gt;
  · Usually long and narrow (many rows, few columns).&lt;br&gt;
  · Change frequently (every transaction).&lt;br&gt;
· Dimension Tables (Descriptions):&lt;br&gt;
  · Contain descriptive data (text, attributes).&lt;br&gt;
  · Think: Customer Name, Product Category, Date.&lt;br&gt;
  · Usually short and wide (few rows, many columns).&lt;br&gt;
  · Change slowly (e.g., a customer changes address).&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;6. Data Schemas: Star, Snowflake, and Flat Table&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;How you arrange your Facts and Dimensions defines your schema.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. Star Schema&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The gold standard for Power BI.&lt;/p&gt;

&lt;p&gt;· Structure: A central Fact table surrounded directly by Dimension tables.&lt;br&gt;
· Look: Like a star.&lt;br&gt;
· Advantages: Optimal for Power BI VertiPaq engine; fastest performance; simplest for DAX calculations.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;        [Customer]
            |
[Product] — [Sales] — [Date]
            |
        [Store]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
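
&lt;p&gt;The idea that a filter on a dimension "flows" to the fact table can be sketched in a few lines of Python. This is only an analogy for what the relationship does, not Power BI's engine; the tables, column values, and the total_sales helper are hypothetical.&lt;/p&gt;

```python
# Hypothetical star-schema tables: one dimension (products) and one fact (sales).
products = [
    {"ProductID": 1, "Category": "Bikes"},
    {"ProductID": 2, "Category": "Helmets"},
]
sales = [
    {"SaleID": 1, "ProductID": 1, "Amount": 500},
    {"SaleID": 2, "ProductID": 2, "Amount": 40},
    {"SaleID": 3, "ProductID": 1, "Amount": 450},
]

def total_sales(category):
    """Filter the dimension first, then let that filter reach the fact rows via the key."""
    allowed_ids = {p["ProductID"] for p in products if p["Category"] == category}
    return sum(s["Amount"] for s in sales if s["ProductID"] in allowed_ids)

bikes_total = total_sales("Bikes")
helmets_total = total_sales("Helmets")
print(bikes_total, helmets_total)
```

&lt;p&gt;Notice that the two tables are never merged. The category lives once in the dimension, and the shared ProductID key is enough to carry the filter from the "one" side to the "many" side.&lt;/p&gt;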



&lt;p&gt;&lt;strong&gt;2. Snowflake Schema&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;· Structure: Dimensions are normalized. For example, Product table links to Subcategory, which links to Category.&lt;br&gt;
· Disadvantage: Increases the number of tables; can slow down performance compared to Star; more complex to navigate.&lt;br&gt;
· When to use: Rarely in Power BI. Only use if the source data is strictly structured this way and flattening it in Power Query is too complex.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;3. Flat Table (Denormalized)&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;· Structure: A single table containing both facts and dimensions (e.g., Date, Product, Sales all in one row).&lt;br&gt;
· Advantage: Simple for beginners; no relationships to set up.&lt;br&gt;
· Disadvantage: Massive file size (high duplication); difficult to maintain; limited analytical complexity (time intelligence becomes hard). Avoid this unless your data is very small and static.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;7. Advanced Concepts &amp;amp; Common Issues&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Role-Playing Dimensions&lt;/p&gt;

&lt;p&gt;A single dimension table used to filter multiple columns in a fact table.&lt;/p&gt;

&lt;p&gt;· Example: A Date dimension used to filter Order Date, Ship Date, and Delivery Date.&lt;br&gt;
· Implementation: You cannot have three active relationships to the same table. Keep one active (e.g., Order Date) and use USERELATIONSHIP in DAX for the others.&lt;/p&gt;

&lt;p&gt;Common Modeling Issues&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. Ambiguity&lt;/strong&gt;: Creating bidirectional relationships that create multiple paths between tables. Power BI will throw a warning. Resolve by using single direction or bridging tables.&lt;br&gt;
&lt;strong&gt;2. Many-to-Many Ambiguity&lt;/strong&gt;: Having two M:M relationships leads to inconsistent totals. Use a bridge/concordance table instead.&lt;br&gt;
&lt;strong&gt;3. Referential Integrity&lt;/strong&gt;: Having sales rows with CustomerID = Null that break the relationship. Either fix the data source or use a "No Match" row in the dimension table.&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;8. Step-by-Step: Where to Build This in Power BI&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;A. Creating Joins (Power Query Editor)&lt;/em&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Go to Home &amp;gt; Transform Data.&lt;/li&gt;
&lt;li&gt;In the Power Query Editor, select the table you want to merge into (e.g., Sales).&lt;/li&gt;
&lt;li&gt;Click Home &amp;gt; Merge Queries.&lt;/li&gt;
&lt;li&gt;Select the second table (e.g., Customers).&lt;/li&gt;
&lt;li&gt;Select the matching columns (e.g., CustomerID).&lt;/li&gt;
&lt;li&gt;Choose the Join Kind (Inner, Left Outer, etc.).&lt;/li&gt;
&lt;li&gt;Click OK. Expand the new column to bring in the data (e.g., CustomerName).&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;em&gt;B. Creating Relationships (Model View)&lt;/em&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Click the Model View icon on the left side of the screen.&lt;/li&gt;
&lt;li&gt;Drag and Drop: Click the column in the Dimension table (e.g., Products[ProductID]) and drag it to the column in the Fact table (e.g., Sales[ProductID]).&lt;/li&gt;
&lt;li&gt;Manage Relationships:
· Go to Modeling &amp;gt; Manage Relationships.
· Click New.
· Select the two tables and columns.
· Set Cardinality (e.g., Many to One) and Cross Filter Direction (Single).&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;em&gt;C. Managing Inactive Relationships&lt;/em&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;In Model View, double-click the line between two tables.&lt;/li&gt;
&lt;li&gt;Uncheck "Make this relationship active."&lt;/li&gt;
&lt;li&gt;To use it in a measure:
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;   Sales by Ship Date = 
   CALCULATE(
       SUM(Sales[Amount]),
       -- 'Date' needs single quotes because Date is a reserved word in DAX.
       USERELATIONSHIP(Sales[ShipDateKey], 'Date'[DateKey])
   )
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Conclusion&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Mastering data modeling is the line between a Power BI user and a Power BI professional. Remember the hierarchy:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Structure: Use Joins in Power Query to shape your data and add necessary attributes.&lt;/li&gt;
&lt;li&gt;Connect: Use Relationships in Model View to define business logic.&lt;/li&gt;
&lt;li&gt;Optimize: Aim for a Star Schema with One-to-Many relationships flowing from Dimensions to Facts.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;By understanding the difference between a static SQL join and a dynamic Power BI relationship, and by properly designing your fact and dimension tables, you will build models that are fast, accurate, and scalable.&lt;/p&gt;

</description>
      <category>powerbi</category>
      <category>sql</category>
      <category>datascience</category>
      <category>beginners</category>
    </item>
    <item>
      <title>HOW EXCEL IS USED IN REAL-WORLD DATA ANALYSIS</title>
      <dc:creator>Maxwel Waweru</dc:creator>
      <pubDate>Sun, 29 Mar 2026 18:26:25 +0000</pubDate>
      <link>https://dev.to/maxwel_waweru_28/how-excel-is-used-in-real-world-data-analysis-2402</link>
      <guid>https://dev.to/maxwel_waweru_28/how-excel-is-used-in-real-world-data-analysis-2402</guid>
      <description>&lt;p&gt;When people hear data analysis, they often imagine complex Python scripts, SQL queries, or advanced BI tools. However, in reality, the foundation of data analysis across most organizations—from global corporations to local startups—is Microsoft Excel.&lt;/p&gt;

&lt;p&gt;If you're starting your journey into data, Excel is not optional—it is a core skill required across operations, finance, marketing, and analytics roles.&lt;/p&gt;

&lt;p&gt;In this article, I’ll explain how Excel is used in real-world scenarios using a product performance dataset from Jumia, one of Africa’s largest e-commerce platforms.&lt;/p&gt;

&lt;h2&gt;
  
  
  What Is Excel?
&lt;/h2&gt;

&lt;p&gt;Microsoft Excel is far more than a spreadsheet tool. It is a multi-functional analytical platform that serves as:&lt;/p&gt;

&lt;p&gt;A database – storing large volumes of structured data&lt;/p&gt;

&lt;p&gt;A calculation engine – performing simple to complex computations&lt;/p&gt;

&lt;p&gt;A data transformation tool – cleaning and reshaping messy datasets&lt;/p&gt;

&lt;p&gt;A visualization platform – building dashboards and reports&lt;/p&gt;

&lt;h2&gt;
  
  
  Real-World Applications of Excel
&lt;/h2&gt;

&lt;h2&gt;
  
  
  1. Marketing: Evaluating Pricing Strategy
&lt;/h2&gt;

&lt;p&gt;A marketing team needs to determine whether discounts drive engagement.&lt;/p&gt;

&lt;p&gt;Task: Analyze the relationship between discount percentage and customer reviews&lt;/p&gt;

&lt;p&gt;Solution: Using PivotTables and scatter plots to identify trends.&lt;/p&gt;

&lt;p&gt;Insight: Moderate-to-high discounts significantly increased customer engagement.&lt;/p&gt;

&lt;h2&gt;
  
  
  2. Merchandising: Identifying Underperforming Products
&lt;/h2&gt;

&lt;p&gt;The goal is to flag products that are discounted but poorly rated.&lt;/p&gt;

&lt;p&gt;Task: Detect products with high discounts and low ratings&lt;/p&gt;

&lt;p&gt;Solution: Conditional Formatting to highlight risk products instantly&lt;/p&gt;

&lt;h2&gt;
  
  
  3. Operations: Data Cleaning
&lt;/h2&gt;

&lt;p&gt;Raw data is rarely usable in its original form.&lt;/p&gt;

&lt;p&gt;Task: Convert text-based values into numeric format&lt;/p&gt;

&lt;p&gt;Solution:&lt;/p&gt;

&lt;p&gt;Find &amp;amp; Replace to clean currency values&lt;/p&gt;

&lt;p&gt;TEXTSPLIT to extract ratings&lt;/p&gt;

&lt;h2&gt;
  
  
  Key Excel Features Used
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;1. Data Cleaning&lt;/strong&gt;&lt;br&gt;
Removed “KSh” and commas from price fields&lt;/p&gt;

&lt;p&gt;Extracted numeric ratings using TEXTSPLIT&lt;/p&gt;

&lt;p&gt;Used ABS() to correct negative values&lt;/p&gt;
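
&lt;p&gt;For readers who think in code, the same three cleaning steps can be sketched in Python. This mirrors the idea behind the Excel steps rather than reproducing them; the sample strings (such as "KSh 1,299" and "4.5 out of 5") and the helper names are assumptions, since the exact layout of the raw dataset is not shown here.&lt;/p&gt;

```python
def clean_price(text):
    """Strip the 'KSh' prefix and thousands separators, then convert to a number."""
    return float(text.replace("KSh", "").replace(",", "").strip())

def extract_rating(text):
    """Keep the first space-separated piece, similar to splitting text on a delimiter."""
    return float(text.split(" ")[0])

# Hypothetical raw values as they might appear in a scraped product sheet.
price = clean_price("KSh 1,299")
rating = extract_rating("4.5 out of 5")
discount = abs(-35)  # same idea as ABS(): a stray negative sign is removed

print(price, rating, discount)
```

&lt;p&gt;Whether you use Find and Replace in Excel or a small function like this, the goal is identical: turn text that looks like a number into a value you can actually calculate with.&lt;/p&gt;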

&lt;p&gt;&lt;strong&gt;2. Calculated Columns&lt;/strong&gt;&lt;br&gt;
Discount Amount = Old Price – Current Price&lt;/p&gt;

&lt;p&gt;Rating Classification using nested IF statements&lt;/p&gt;

&lt;p&gt;Discount categorization (Low, Medium, High)&lt;/p&gt;
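
&lt;p&gt;The calculated columns follow simple rules that can be written out as a sketch in Python. The cut-off values and the rating labels below are illustrative assumptions, not the thresholds used in the actual workbook.&lt;/p&gt;

```python
def discount_amount(old_price, current_price):
    """Discount Amount = Old Price minus Current Price."""
    return old_price - current_price

def discount_category(discount_pct):
    """Bucket a discount percentage; the 20 and 40 cut-offs are assumed for illustration."""
    if discount_pct >= 40:
        return "High"
    elif discount_pct >= 20:
        return "Medium"
    return "Low"

def rating_class(rating):
    """Equivalent of a nested IF; labels and cut-offs are hypothetical."""
    if rating >= 4.5:
        return "Excellent"
    elif rating >= 3.5:
        return "Good"
    return "Poor"

amount = discount_amount(2000, 1500)
category = discount_category(25)
label = rating_class(4.7)
print(amount, category, label)
```

&lt;p&gt;Each branch is checked from the top down, exactly as a nested IF formula is evaluated in Excel, so the order of the conditions matters.&lt;/p&gt;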

&lt;p&gt;&lt;strong&gt;3. PivotTables&lt;/strong&gt;&lt;br&gt;
Used to summarize:&lt;/p&gt;

&lt;p&gt;Average ratings by discount category&lt;/p&gt;

&lt;p&gt;Discount distribution across products&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;4. XLOOKUP&lt;/strong&gt;&lt;br&gt;
Used to map:&lt;/p&gt;

&lt;p&gt;Top-performing products&lt;/p&gt;

&lt;p&gt;Associated prices and names&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;5. Conditional Formatting&lt;/strong&gt;&lt;br&gt;
Highlighted:&lt;/p&gt;

&lt;p&gt;High-discount, low-rating products&lt;/p&gt;

&lt;p&gt;Immediate visual alerts for decision-making&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;6. Dashboard Creation&lt;/strong&gt;&lt;br&gt;
Built an interactive dashboard with:&lt;/p&gt;

&lt;p&gt;Charts linked to PivotTables&lt;/p&gt;

&lt;p&gt;Slicers for filtering data dynamically&lt;/p&gt;

&lt;p&gt;Scatter plots to reveal pricing patterns&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Key Insights from the Analysis&lt;/strong&gt;&lt;br&gt;
Discounts increase engagement: High-discount products generated significantly more reviews&lt;/p&gt;

&lt;p&gt;Discounts don’t guarantee quality: Some heavily discounted products had poor ratings&lt;/p&gt;

&lt;p&gt;Optimal pricing range: Products discounted between 20–40% achieved the highest ratings&lt;/p&gt;

&lt;p&gt;Strong customer satisfaction: The majority of products had ratings above 4.5&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The Excel Mindset&lt;/strong&gt;&lt;br&gt;
Effective Excel usage goes beyond formulas:&lt;/p&gt;

&lt;p&gt;Maintain a clean workflow (Raw → Working → Dashboard)&lt;/p&gt;

&lt;p&gt;Use Excel Tables for dynamic updates&lt;/p&gt;

&lt;p&gt;Document formulas for future reference&lt;/p&gt;

&lt;p&gt;Focus on clarity and reproducibility&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Conclusion&lt;/strong&gt;&lt;br&gt;
Excel remains one of the most powerful and accessible tools for data analysis. It bridges the gap between raw data and actionable business insights.&lt;/p&gt;

&lt;p&gt;From cleaning messy datasets to building dashboards and uncovering trends, Excel enables end-to-end analysis without requiring advanced programming skills.&lt;/p&gt;

&lt;p&gt;For anyone entering data-related fields, mastering Excel is one of the highest-ROI skills you can develop.&lt;/p&gt;

</description>
      <category>excel</category>
      <category>dataanalysis</category>
      <category>beginners</category>
      <category>tutorial</category>
    </item>
  </channel>
</rss>
