<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Joan Wambui</title>
    <description>The latest articles on DEV Community by Joan Wambui (@wambuijoan).</description>
    <link>https://dev.to/wambuijoan</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3850284%2F48390040-4e1d-4e25-bbce-eca16461d9d3.png</url>
      <title>DEV Community: Joan Wambui</title>
      <link>https://dev.to/wambuijoan</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/wambuijoan"/>
    <language>en</language>
    <item>
      <title>SQL fundamentals: DDL, DML, filtering, and CASE WHEN</title>
      <dc:creator>Joan Wambui</dc:creator>
      <pubDate>Mon, 13 Apr 2026 16:06:42 +0000</pubDate>
      <link>https://dev.to/wambuijoan/sql-fundamentals-ddl-dml-filtering-and-case-when-2dfe</link>
      <guid>https://dev.to/wambuijoan/sql-fundamentals-ddl-dml-filtering-and-case-when-2dfe</guid>
      <description>&lt;h2&gt;
  
  
  What is SQL?
&lt;/h2&gt;

&lt;p&gt;SQL (Structured Query Language) is the standard language for interacting with relational databases. Whether you are storing customer records, tracking exam results, or analysing transactions, SQL is how you talk to the database.&lt;/p&gt;




&lt;h2&gt;
  
  
  The five command categories
&lt;/h2&gt;

&lt;p&gt;SQL commands are grouped into five categories:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Category&lt;/th&gt;
&lt;th&gt;Its Function&lt;/th&gt;
&lt;th&gt;Commands&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;DDL (Data Definition Language)&lt;/td&gt;
&lt;td&gt;Defines and modifies the database structure&lt;/td&gt;
&lt;td&gt;CREATE, ALTER, DROP, TRUNCATE&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;DML (Data Manipulation Language)&lt;/td&gt;
&lt;td&gt;Manages/manipulates data inside tables&lt;/td&gt;
&lt;td&gt;INSERT, UPDATE, DELETE&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;DQL (Data Query Language)&lt;/td&gt;
&lt;td&gt;Retrieves data from the tables&lt;/td&gt;
&lt;td&gt;SELECT&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;TCL (Transaction Control Language)&lt;/td&gt;
&lt;td&gt;Manages transactions&lt;/td&gt;
&lt;td&gt;COMMIT, ROLLBACK, SAVEPOINT&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;DCL (Data Control Language)&lt;/td&gt;
&lt;td&gt;Controls access and permissions&lt;/td&gt;
&lt;td&gt;GRANT, REVOKE&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;This article focuses on DDL, DML, and DQL.&lt;/p&gt;




&lt;h2&gt;
  
  
  DDL (Data Definition Language)
&lt;/h2&gt;

&lt;p&gt;DDL creates the schemas, tables and columns. It defines the shape of your data before any data exists. It has four commands, as mentioned in the table: CREATE, ALTER, DROP and TRUNCATE.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;CREATE TABLE&lt;/code&gt; sets up a table with its columns, data types, and constraints. &lt;code&gt;ALTER TABLE&lt;/code&gt; modifies an existing table by adding columns, renaming them, or dropping ones that are no longer needed. &lt;code&gt;TRUNCATE TABLE&lt;/code&gt; removes the contents of the table; however, it retains the table structure. &lt;code&gt;DROP TABLE&lt;/code&gt; removes the entire table.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;A lightbulb moment:&lt;/strong&gt; &lt;code&gt;CREATE TABLE&lt;/code&gt; wraps its column definitions in parentheses. &lt;code&gt;ALTER TABLE&lt;/code&gt; does not as the instruction follows directly after the table name.&lt;br&gt;
&lt;/p&gt;
&lt;/blockquote&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- Creation of a student table&lt;/span&gt;
&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;students&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;student_id&lt;/span&gt;    &lt;span class="nb"&gt;INT&lt;/span&gt; &lt;span class="k"&gt;PRIMARY&lt;/span&gt; &lt;span class="k"&gt;KEY&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;first_name&lt;/span&gt;    &lt;span class="nb"&gt;VARCHAR&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;50&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;last_name&lt;/span&gt;     &lt;span class="nb"&gt;VARCHAR&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;50&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;gender&lt;/span&gt;        &lt;span class="nb"&gt;VARCHAR&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="n"&gt;date_of_birth&lt;/span&gt; &lt;span class="nb"&gt;DATE&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="k"&gt;class&lt;/span&gt;         &lt;span class="nb"&gt;VARCHAR&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="n"&gt;city&lt;/span&gt;          &lt;span class="nb"&gt;VARCHAR&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;50&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- Addition of a phone number column&lt;/span&gt;
&lt;span class="k"&gt;ALTER&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;students&lt;/span&gt;
&lt;span class="k"&gt;ADD&lt;/span&gt; &lt;span class="k"&gt;COLUMN&lt;/span&gt; &lt;span class="n"&gt;phone_number&lt;/span&gt; &lt;span class="nb"&gt;VARCHAR&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;20&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  DML (Data Manipulation Language)
&lt;/h2&gt;

&lt;p&gt;&lt;code&gt;INSERT INTO&lt;/code&gt; adds rows to a table. You list the target columns, follow with &lt;code&gt;VALUES&lt;/code&gt;, and can insert multiple rows in one statement by separating them with commas. Always quote dates. Without quotes, PostgreSQL reads &lt;code&gt;2008-03-12&lt;/code&gt; as arithmetic (2008 − 3 − 12 = 1993).&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;INSERT INTO students (student_id, first_name, gender, date_of_birth, class, city)
VALUES
    (1, 'Amina', 'F', '2008-03-12', 'Form 3', 'Nairobi'),
    (2, 'Brian', 'M', '2007-07-25', 'Form 4', 'Mombasa');
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;&lt;br&gt;
plaintext&lt;/p&gt;

&lt;p&gt;&lt;code&gt;UPDATE&lt;/code&gt; modifies existing rows. &lt;code&gt;DELETE&lt;/code&gt; removes them. Both require a &lt;code&gt;WHERE&lt;/code&gt; clause. Without it, every row in the table is affected, not just the one you intended.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;-- Updating a student's current city of residence:
UPDATE students
SET city = 'Nairobi'
WHERE student_id = 5;

-- Deleting an exam result from a student whose id is 9:
DELETE FROM exam_results
WHERE result_id = 9;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;&lt;br&gt;
plaintext&lt;/p&gt;


&lt;h2&gt;
  
  
  WHERE clause
&lt;/h2&gt;

&lt;p&gt;SQL has a variety of operators that allow for various filtering conditions. The &lt;code&gt;WHERE&lt;/code&gt; clause, when used with these operators, allows you to target specific rows.&lt;/p&gt;

&lt;p&gt;Some operators include:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;BETWEEN&lt;/code&gt; filters a range and is inclusive on both ends.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;IN&lt;/code&gt; checks against a list of values. It is cleaner than chaining multiple &lt;code&gt;OR&lt;/code&gt; conditions.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;LIKE&lt;/code&gt; matches text patterns, where &lt;code&gt;%&lt;/code&gt; represents any sequence of characters.
&lt;/li&gt;
&lt;/ul&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;WHERE marks BETWEEN 50 AND 80
WHERE city IN ('Nairobi', 'Mombasa', 'Kisumu')
WHERE class NOT IN ('Form 1', 'Form 2')
WHERE first_name LIKE 'A%' OR first_name LIKE 'E%'
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
&lt;p&gt;&lt;br&gt;
plaintext&lt;/p&gt;


&lt;h2&gt;
  
  
  CASE WHEN
&lt;/h2&gt;

&lt;p&gt;&lt;code&gt;CASE WHEN&lt;/code&gt; lets you apply conditional logic directly inside a query, generating a new column based on rules you define — without touching the underlying table.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;-- Exam results labelling:

SELECT *,
    CASE
        WHEN marks &amp;gt;= 80 THEN 'Distinction'
        WHEN marks &amp;gt;= 60 THEN 'Merit'
        WHEN marks &amp;gt;= 40 THEN 'Pass'
        ELSE 'Fail'
    END AS performance
FROM exam_results;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;PostgreSQL evaluates conditions top to bottom and stops at the first match, so order matters. The &lt;code&gt;ELSE&lt;/code&gt; clause covers anything that falls outside your defined conditions. Without it, unmatched rows return &lt;code&gt;NULL&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;In &lt;code&gt;CASE WHEN&lt;/code&gt;, the result exists only in the query output. The table structure is never changed.&lt;/p&gt;

&lt;p&gt;What makes &lt;code&gt;CASE WHEN&lt;/code&gt; powerful is that it lets you enhance your data at query time without needing extra columns in your schema. For reporting and analysis, that flexibility is very useful.&lt;/p&gt;

</description>
      <category>beginners</category>
      <category>database</category>
      <category>sql</category>
      <category>tutorial</category>
    </item>
    <item>
      <title>How I Published My Power BI Report and Embedded It on a Website</title>
      <dc:creator>Joan Wambui</dc:creator>
      <pubDate>Sun, 05 Apr 2026 22:07:04 +0000</pubDate>
      <link>https://dev.to/wambuijoan/how-i-published-my-power-bi-report-and-embedded-it-on-a-website-22bb</link>
      <guid>https://dev.to/wambuijoan/how-i-published-my-power-bi-report-and-embedded-it-on-a-website-22bb</guid>
      <description>&lt;p&gt;I recently completed a Power BI project, an electronics sales dashboard built from a raw CSV file. Cleaning the data, modeling the tables, writing DAX measures, all of that happened in Power BI Desktop.&lt;/p&gt;

&lt;p&gt;This article covers what happens next: getting that report out of your computer and onto the internet where it can actually be used.&lt;/p&gt;

&lt;h2&gt;
  
  
  1. Create a Workspace
&lt;/h2&gt;

&lt;p&gt;Before publishing anything, you need a workspace. In Power BI Service, a workspace is essentially a folder that holds your reports and datasets.&lt;br&gt;
Go to app.powerbi.com and sign in. On the left navigation panel, click Workspaces, then click “New workspace”.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fx62zj0gz7rdhqx6i4m7b.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fx62zj0gz7rdhqx6i4m7b.png" alt=" " width="769" height="339"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  2. Publish from Power BI Desktop
&lt;/h2&gt;

&lt;p&gt;Go back to Power BI Desktop, press Ctrl + S to save your file, then go to the Home tab in the ribbon and click Publish on the right side.&lt;br&gt;&lt;br&gt;
A dialogue box appears asking where to publish. Select the workspace you just created and click Select.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1ftzwv9cticukeeutu0z.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1ftzwv9cticukeeutu0z.png" alt=" " width="690" height="166"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  3. Generate the Embed Code
&lt;/h2&gt;

&lt;p&gt;Now that the report is live in Service, you need the embed code to place it on a website.&lt;br&gt;
With the report open in the browser, click File in the top menu. Hover over Embed report and select Publish to web (public). &lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fiba534l3ohyn5nyxygpe.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fiba534l3ohyn5nyxygpe.png" alt=" " width="690" height="166"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;You can also embed the report using the other option “Website or portal” copy the iframe  code&lt;/p&gt;

&lt;h2&gt;
  
  
  4. Embed It in a Webpage
&lt;/h2&gt;

&lt;p&gt;Open any HTML file in VS Code or a text editor of choice. Paste the iframe code where you want the report to appear. Save the file and open it in a browser. &lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fizl3054myokwurloeqj4.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fizl3054myokwurloeqj4.png" alt=" " width="800" height="245"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Your report is now published and the link is ready to share.&lt;/p&gt;

</description>
      <category>analytics</category>
      <category>beginners</category>
      <category>microsoft</category>
      <category>tutorial</category>
    </item>
    <item>
      <title>Understanding Data Modeling in Power BI: Joins, Relationships, and Schemas Explained</title>
      <dc:creator>Joan Wambui</dc:creator>
      <pubDate>Tue, 31 Mar 2026 03:15:27 +0000</pubDate>
      <link>https://dev.to/wambuijoan/understanding-data-modeling-in-power-bi-joins-relationships-and-schemas-explained-42ed</link>
      <guid>https://dev.to/wambuijoan/understanding-data-modeling-in-power-bi-joins-relationships-and-schemas-explained-42ed</guid>
      <description>&lt;h2&gt;
  
  
  What Is Data Modeling?
&lt;/h2&gt;

&lt;p&gt;Data modeling refers to how one organises and connects their data so it can be analysed correctly. In Power BI, it defines how tables relate to each other, how filters move across visuals, and how your measures calculate results. &lt;/p&gt;

&lt;h2&gt;
  
  
  Joins
&lt;/h2&gt;

&lt;p&gt;Joins combine data from two tables based on a shared column. In Power BI, joins happen in Power Query Editor before data enters the model.&lt;br&gt;
&lt;strong&gt;INNER JOIN&lt;/strong&gt; - Returns only rows with a match in both tables. &lt;br&gt;
&lt;strong&gt;LEFT JOIN&lt;/strong&gt; - Returns all rows from the left table only.&lt;br&gt;
&lt;strong&gt;RIGHT JOIN&lt;/strong&gt; - Returns all rows from the right table only.&lt;br&gt;
&lt;strong&gt;FULL OUTER JOIN&lt;/strong&gt; - Returns everything from both tables. Useful for spotting data gaps.&lt;br&gt;
&lt;strong&gt;LEFT ANTI JOIN&lt;/strong&gt; - Returns only rows from the left table with no match on the right. &lt;br&gt;
&lt;strong&gt;RIGHT ANTI JOIN&lt;/strong&gt; - Returns only rows from the right table with no match on the left.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffv4zgp0gqu88tr3z8urh.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffv4zgp0gqu88tr3z8urh.png" alt="Joins in Power BI" width="800" height="496"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Power BI Relationships
&lt;/h2&gt;

&lt;p&gt;Relationships connect tables inside the model without merging them. They control how filters move between tables when you interact with visuals.&lt;br&gt;
Cardinality defines how rows relate across tables:&lt;br&gt;
• &lt;strong&gt;One-to-Many (1:M)&lt;/strong&gt; - One customer, many orders. The most common type.&lt;br&gt;
• &lt;strong&gt;Many-to-Many (M:M)&lt;/strong&gt; - Requires careful handling, best resolved with a bridge table.&lt;br&gt;
• &lt;strong&gt;One-to-One (1:1)&lt;/strong&gt; - Usually means the tables can be merged.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Active vs Inactive relationship&lt;/strong&gt; - Only one active relationship is allowed between two tables. Inactive relationships are triggered using &lt;code&gt;USERELATIONSHIP()&lt;/code&gt; in DAX when needed.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Cross-filter direction&lt;/strong&gt; - Single direction is the default and safest. Bidirectional filters flow both ways but can cause performance issues if overused.&lt;/p&gt;

&lt;h2&gt;
  
  
  Joins vs Relationships
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbhezdncrv1p9196pt15m.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbhezdncrv1p9196pt15m.png" alt="Joins vs Relationships" width="776" height="172"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Fact Tables vs Dimension Tables
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Fact tables&lt;/strong&gt; hold measurable data such as sales and transactions. They are long with many rows and link out to dimension tables via foreign keys.&lt;br&gt;
&lt;strong&gt;Dimension tables&lt;/strong&gt; hold descriptive data such as customer names, product categories, dates. They give context to the numbers in your fact table.&lt;br&gt;
In simple terms, your fact table is the center. Dimension tables surround it.&lt;/p&gt;

&lt;h2&gt;
  
  
  Schemas
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Star Schema&lt;/strong&gt; - One fact table connected directly to multiple dimension tables. Fast, clean, and what Power BI is optimised for. This should be used by default.&lt;br&gt;
Snowflake Schema - Dimension tables broken into sub-dimensions. This is more complex and slower in Power BI. It is recommended for one to use only it when source data is already normalised.&lt;br&gt;
&lt;strong&gt;Flat Table&lt;/strong&gt; - Everything in one table. This is simple but it creates redundancy and performance problems at scale.&lt;br&gt;
&lt;strong&gt;Pro Tip:&lt;/strong&gt; &lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;A clean data model is what makes a Power BI report trustworthy. Start with a star schema, separate your facts from your dimensions, define your relationships carefully, and always build a proper Date table. The model is invisible to the end user, but it is what everything depends on.&lt;/p&gt;
&lt;/blockquote&gt;

</description>
      <category>beginners</category>
      <category>database</category>
      <category>microsoft</category>
      <category>tutorial</category>
    </item>
    <item>
      <title>How Linux is Used in Real-World Data Engineering</title>
      <dc:creator>Joan Wambui</dc:creator>
      <pubDate>Mon, 30 Mar 2026 02:46:26 +0000</pubDate>
      <link>https://dev.to/wambuijoan/how-linux-is-used-in-real-world-data-engineering-451p</link>
      <guid>https://dev.to/wambuijoan/how-linux-is-used-in-real-world-data-engineering-451p</guid>
      <description>&lt;p&gt;Linux has been such a common word in the tech journey, a mountain of a word for beginners, but a simple and helpful foundation once you grasp it. I had heard it before but couldn't tell you what it actually was. Was it a programming language? A tool? The same thing as Bash or Git? For a while I used all four interchangeably. A quiet kind of confusion, the type you don't realise you have until something breaks and you don't know where to look.&lt;/p&gt;

&lt;p&gt;The moment it clicked wasn't dramatic. It was sitting at a terminal during my data engineering class, connected to a remote server, and realising the environment I was operating in was Linux. Not a language. Not a tool I had installed. The environment itself.&lt;/p&gt;

&lt;h2&gt;
  
  
  What Actually Is Linux?
&lt;/h2&gt;

&lt;p&gt;Linux is an operating environment, similar to Windows or macOS, that sits beneath your tools, files, terminal sessions, and pipelines. Bash is the language you speak inside it. Git is a version control tool that runs inside it. GitHub is the cloud platform where your Git repositories live. Four different things, four different layers, blurring together because you encounter them all at once through the same black terminal window.&lt;/p&gt;

&lt;p&gt;This matters because data engineering assumes Linux fluency without announcing it.&lt;/p&gt;

&lt;h2&gt;
  
  
  How Linux Shows Up in Real Data Engineering Work
&lt;/h2&gt;

&lt;p&gt;When you provision a virtual machine on Azure or AWS, you land in a Linux environment. When your pipeline runs on a scheduler, it is running on a Linux server. When you work with Docker containers, they are built on Linux. It is rarely the thing being discussed, but it is always underneath the thing being discussed.&lt;/p&gt;

&lt;p&gt;In practice, data engineers use Linux to navigate directories where raw data lands and processed outputs go. They write Bash scripts to automate repetitive ingestion tasks (which include ingesting a file, validating its structure, moving it to a staging folder, logging the result) and schedule them with cron to run automatically.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Step 1: Navigate to where raw data lands&lt;/span&gt;
&lt;span class="nb"&gt;cd&lt;/span&gt; /data/raw

&lt;span class="c"&gt;# Step 2: Run your ingestion script&lt;/span&gt;
bash ingest.sh

&lt;span class="c"&gt;# Step 3: Schedule it to run automatically every day at 6am&lt;/span&gt;
0 6 &lt;span class="k"&gt;*&lt;/span&gt; &lt;span class="k"&gt;*&lt;/span&gt; &lt;span class="k"&gt;*&lt;/span&gt; /scripts/ingest.sh
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;They monitor running jobs, check resource usage, and kill processes that hang. When something breaks on a production server, Linux is the environment you are debugging in. Knowing it is not optional.&lt;/p&gt;

&lt;p&gt;Linux will not be the most exciting thing you learn in data engineering. It does not have a sleek interface or a compelling pitch. But it is the ground where real data work happens, on servers, in the cloud, inside containers, underneath every tool you will eventually depend on. The clearer your understanding of what it is, the faster everything else makes sense.&lt;/p&gt;

</description>
      <category>beginners</category>
      <category>linux</category>
      <category>dataengineering</category>
      <category>cloud</category>
    </item>
  </channel>
</rss>
