<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Damaa-C</title>
    <description>The latest articles on DEV Community by Damaa-C (@damaac).</description>
    <link>https://dev.to/damaac</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3716394%2F5905ce42-5198-46f2-90e5-6bd748e11c54.png</url>
      <title>DEV Community: Damaa-C</title>
      <link>https://dev.to/damaac</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/damaac"/>
    <language>en</language>
    <item>
      <title>How to Connect Power BI to a SQL (PostgreSQL) Database and Build a Unified Dashboard</title>
      <dc:creator>Damaa-C</dc:creator>
      <pubDate>Thu, 19 Mar 2026 06:46:00 +0000</pubDate>
      <link>https://dev.to/damaac/how-to-connect-power-bi-to-a-sql-postgresql-database-and-build-a-unified-dashboard-1e5h</link>
      <guid>https://dev.to/damaac/how-to-connect-power-bi-to-a-sql-postgresql-database-and-build-a-unified-dashboard-1e5h</guid>
      <description>&lt;h2&gt;
  
  
  Introduction
&lt;/h2&gt;

&lt;p&gt;Power BI is a business intelligence (BI) and data visualization tool from Microsoft. It enables analysts and business users to transform raw data into interactive dashboards and reports. Companies use Power BI to analyze sales, customer behavior, inventory trends, and other critical business metrics.&lt;/p&gt;

&lt;p&gt;SQL databases, like PostgreSQL, are widely used to store and manage structured data. Connecting Power BI to SQL databases allows analysts to retrieve, clean, transform, and model data efficiently, creating a single source of truth for dashboards and KPIs.&lt;/p&gt;

&lt;p&gt;In this article, we’ll walk through:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Connecting Power BI to local and cloud PostgreSQL databases&lt;/li&gt;
&lt;li&gt;Loading raw data and applying transformations&lt;/li&gt;
&lt;li&gt;Combining datasets from multiple sources&lt;/li&gt;
&lt;li&gt;Adding a Date Dimension table (DimDate) for KPIs and filtering&lt;/li&gt;
&lt;li&gt;Preparing a dashboard-ready model for GitHub and reporting&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  What You Need Before Connecting
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt; Power BI Desktop installed&lt;/li&gt;
&lt;li&gt;PostgreSQL database connection details (server, port, database, username, password) &lt;/li&gt;
&lt;li&gt;For cloud connections: SSL certificate (CA certificate)&lt;/li&gt;
&lt;li&gt;Model view showing relationships across cloud and local tables&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  Connecting to a Local PostgreSQL Database
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;Open Power BI Desktop.&lt;/li&gt;
&lt;li&gt;Click &lt;code&gt;Get Data → PostgreSQL database&lt;/code&gt;.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmuja5c62us91l6vuhe3r.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmuja5c62us91l6vuhe3r.png" alt=" " width="800" height="600"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F20klv8w166r4cc41iml0.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F20klv8w166r4cc41iml0.png" alt=" " width="800" height="600"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Enter your connection info:&lt;/li&gt;
&lt;/ol&gt;

&lt;ul&gt;
&lt;li&gt;Server:
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt; localhost:5432 (replace 5432 with your PostgreSQL port)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;Database: Name of your database&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmutt8pxcsxiq4dxb8ug2.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmutt8pxcsxiq4dxb8ug2.png" alt=" " width="800" height="600"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Enter your username and password.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fz8u5wi3x3dhg6fzipr7w.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fz8u5wi3x3dhg6fzipr7w.png" alt=" " width="800" height="600"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Click OK to connect.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; The &lt;code&gt;localhost:port&lt;/code&gt; syntax ensures Power BI connects to the correct PostgreSQL server port.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;In the Navigator window, select tables to load, then click Load (or Transform Data).&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqhaeme1626hoc82bzloi.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqhaeme1626hoc82bzloi.png" alt=" " width="800" height="600"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Foighsih3wdlxiipqr02w.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Foighsih3wdlxiipqr02w.png" alt=" " width="800" height="600"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Connecting to a Cloud PostgreSQL Database (Aiven or Similar)
&lt;/h2&gt;

&lt;p&gt;Cloud databases require secure SSL connections.&lt;/p&gt;

&lt;h3&gt;
  
  
  Guide to Aiven Connection
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;From your cloud provider, obtain:&lt;/li&gt;
&lt;/ol&gt;

&lt;ul&gt;
&lt;li&gt;Host&lt;/li&gt;
&lt;li&gt;Port&lt;/li&gt;
&lt;li&gt;Database name&lt;/li&gt;
&lt;li&gt;Username and password&lt;/li&gt;
&lt;li&gt;SSL certificate (CA certificate)&lt;/li&gt;
&lt;li&gt;Install the CA certificate on your system:&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Open the downloaded .crt or .pem file.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Install it in the trusted root certificate store:&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;code&gt;Windows: Right-click → Install Certificate → Place in “Trusted Root Certification Authorities”&lt;br&gt;
&lt;/code&gt;&lt;br&gt;
&lt;code&gt;Linux/macOS: Follow OS instructions to add to the system or user trust store&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;Installing the CA certificate allows Power BI to validate the server’s identity and establish a secure connection.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;In Power BI Desktop:&lt;/li&gt;
&lt;/ol&gt;

&lt;ul&gt;
&lt;li&gt;Go to &lt;code&gt;Get Data → PostgreSQL database&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Enter Server
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;your-cloud-host:5432
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;and Database&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fi8woqdjcyo7gbuqy2j30.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fi8woqdjcyo7gbuqy2j30.png" alt=" " width="800" height="600"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Check Use SSL certificate and select the installed certificate&lt;/li&gt;
&lt;li&gt;Enter your username and password&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fzw3s255xopek538qzdyz.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fzw3s255xopek538qzdyz.png" alt=" " width="800" height="600"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;If successful, it opens your aiven cloud database as shown below
&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fnjvy6d7xhlcn0hudte0r.png" alt=" " width="800" height="600"&gt;
&lt;/li&gt;
&lt;li&gt;Choose tables from your database&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fe3406ibiobq3temr53iy.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fe3406ibiobq3temr53iy.png" alt=" " width="800" height="600"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Click OK and load the tables&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  Load Raw Data First
&lt;/h4&gt;

&lt;p&gt;It’s important to load raw data before applying transformations:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Load all tables (customers, products, sales, inventory) first.&lt;/li&gt;
&lt;li&gt;Review the raw data to assess missing values, incorrect or inconsistent formats and unnecessary columns&lt;/li&gt;
&lt;li&gt;This helps you understand which fields need cleaning and transformation before building dashboards.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Filtering and Transformation
&lt;/h2&gt;

&lt;p&gt;Use Power BI’s Power Query Editor for data preparation:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Remove unnecessary columns – keep only fields needed for analysis.&lt;/li&gt;
&lt;li&gt;Rename columns – make names descriptive.&lt;/li&gt;
&lt;li&gt;Filter rows – e.g., remove canceled sales or test records.&lt;/li&gt;
&lt;li&gt;Change data types – ensure numeric and date fields are correct.&lt;/li&gt;
&lt;li&gt;Create calculated columns – e.g., total sales = quantity × price.&lt;/li&gt;
&lt;li&gt;Handle missing values – fill, replace, or remove nulls.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Creating Relationships Between Tables
&lt;/h2&gt;

&lt;p&gt;Once data is cleaned:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Switch to Model View.&lt;/li&gt;
&lt;li&gt;Create relationships:&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;`sales.customer_id → customers.customer_id&lt;/p&gt;

&lt;p&gt;sales.product_id → products.product_id&lt;/p&gt;

&lt;p&gt;inventory.product_id → products.product_id`&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Foz1v03qw42h4pz9g8cfh.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Foz1v03qw42h4pz9g8cfh.png" alt=" " width="800" height="600"&gt;&lt;/a&gt;&lt;br&gt;
Validate relationships for correct cardinality and cross-filtering.&lt;/p&gt;

&lt;p&gt;Relationships ensure Power BI can accurately summarize data across multiple tables.&lt;/p&gt;

&lt;h2&gt;
  
  
  Data Modeling Basics
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;Star schema&lt;br&gt;
Fact table (sales) connected to dimension tables (customers, products, inventory).&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Cardinality&lt;br&gt;
One-to-many or many-to-one relationships.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Filter directions&lt;br&gt;
Define single or bidirectional filters depending on your analysis needs.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Proper modeling ensures dashboards are interactive and accurate.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8s7yqhb8frrf01gcvxyj.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8s7yqhb8frrf01gcvxyj.png" alt=" " width="800" height="600"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Why SQL Skills Matter for Power BI Analysts
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;Data extraction: Retrieve exactly the data you need.&lt;/li&gt;
&lt;li&gt;Filtering &amp;amp; aggregation: Pre-process data for better performance.&lt;/li&gt;
&lt;li&gt;Data cleaning &amp;amp; transformation: Prepare data before loading it into Power BI.&lt;/li&gt;
&lt;li&gt;Performance optimization: Efficient queries improve dashboard responsiveness.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;By combining SQL knowledge with Power BI, analysts can build powerful, reliable dashboards that drive business decisions.&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;Connecting PostgreSQL to Power BI empowers analysts to turn structured data into actionable insights. Whether using local or cloud databases, establishing secure and reliable connections ensures that the data you analyze is accurate and up-to-date. By first loading raw data, applying necessary transformations, and creating relationships, including a Date table for time-based analysis, you build a robust data model ready for interactive dashboards.&lt;/p&gt;

&lt;p&gt;Mastering this workflow not only simplifies KPI calculations and trend analysis but also highlights the importance of SQL skills for preparing and managing data. Ultimately, integrating PostgreSQL with Power BI provides a seamless bridge between data storage and visualization, enabling smarter, data-driven business decisions.&lt;/p&gt;

</description>
      <category>discuss</category>
      <category>database</category>
      <category>postgres</category>
      <category>dataengineering</category>
    </item>
    <item>
      <title>Mastering SQL Joins and Window Functions: A Practical Guide with Example Data</title>
      <dc:creator>Damaa-C</dc:creator>
      <pubDate>Fri, 06 Mar 2026 19:53:54 +0000</pubDate>
      <link>https://dev.to/damaac/mastering-sql-joins-and-window-functions-a-practical-guide-with-example-data-4mcn</link>
      <guid>https://dev.to/damaac/mastering-sql-joins-and-window-functions-a-practical-guide-with-example-data-4mcn</guid>
      <description>&lt;h2&gt;
  
  
  Introduction
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;SQL&lt;/strong&gt; is a powerful language for managing and analyzing relational databases. Two essential concepts for data analysis are &lt;code&gt;Joins&lt;/code&gt; and &lt;code&gt;Window Functions&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Joins&lt;/strong&gt; allow you to combine rows from multiple tables based on related columns.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Window Functions&lt;/strong&gt; perform calculations across a set of rows related to the current row, enabling ranking, cumulative sums, moving averages, and more.&lt;/p&gt;

&lt;p&gt;In this guide, we’ll create a sample database with customers, products, sales, and inventory tables, populate them with data, and demonstrate joins and window functions with real examples.&lt;/p&gt;

&lt;h2&gt;
  
  
  Create Database and Schema
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;-- Create a new database
CREATE DATABASE business_db;

-- Connect to the database
\c business_db;

-- Create schema
CREATE SCHEMA assignment;

-- Set schema for this session
SET search_path TO assignment;

-- Verify schema
SHOW search_path;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Create Tables and Insert Data
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Customers Table&lt;/strong&gt;&lt;br&gt;
&lt;code&gt;CREATE TABLE customers (&lt;br&gt;
    customer_id INT PRIMARY KEY,&lt;br&gt;
    first_name VARCHAR(50),&lt;br&gt;
    last_name VARCHAR(50),&lt;br&gt;
    email VARCHAR(100),&lt;br&gt;
    phone_number VARCHAR(50),&lt;br&gt;
    registration_date DATE,&lt;br&gt;
    membership_status VARCHAR(10)&lt;br&gt;
);&lt;/code&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;INSERT INTO customers 
(customer_id, first_name, last_name, email, phone_number, registration_date, membership_status) 
VALUES
(1, 'Karen', 'Molina', 'gonzalezkimberly@glass.com', '(728)697-1206', '2020-08-27', 'Bronze'),
(2, 'Elizabeth', 'Archer', 'tramirez@gmail.com', '778.104.6553', '2023-08-28', 'Silver'),
(3, 'Roberta', 'Massey', 'davislori@gmail.com', '+1-365-606-7458x399', '2024-06-12', 'Bronze'),
(4, 'Jacob', 'Adams', 'andrew72@hotmail.com', '246-459-1425x462', '2023-02-10', 'Gold'),
(5, 'Cynthia', 'Lowery', 'suarezkiara@ramsey.com', '001-279-688-8177x4015', '2020-11-13', 'Silver');
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Products Table&lt;/strong&gt;&lt;br&gt;
&lt;code&gt;CREATE TABLE products (&lt;br&gt;
    product_id INT PRIMARY KEY,&lt;br&gt;
    product_name VARCHAR(100),&lt;br&gt;
    category VARCHAR(50),&lt;br&gt;
    price DECIMAL(10,2),&lt;br&gt;
    supplier VARCHAR(100),&lt;br&gt;
    stock_quantity INT&lt;br&gt;
);&lt;/code&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;INSERT INTO products
(product_id, product_name, category, price, supplier, stock_quantity)
VALUES
(1, 'Laptop', 'Electronics', 999.99, 'Dell', 50),
(2, 'Smartphone', 'Electronics', 799.99, 'Samsung', 150),
(3, 'Washing Machine', 'Appliances', 499.99, 'LG', 30),
(4, 'Headphones', 'Accessories', 199.99, 'Sony', 100),
(5, 'Refrigerator', 'Appliances', 1200.00, 'Whirlpool', 40);
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Sales Table&lt;/strong&gt;&lt;br&gt;
&lt;code&gt;CREATE TABLE sales (&lt;br&gt;
    sale_id INT PRIMARY KEY,&lt;br&gt;
    customer_id INT,&lt;br&gt;
    product_id INT,&lt;br&gt;
    quantity_sold INT,&lt;br&gt;
    sale_date DATE,&lt;br&gt;
    total_amount DECIMAL(10,2),&lt;br&gt;
    FOREIGN KEY (customer_id) REFERENCES customers(customer_id),&lt;br&gt;
    FOREIGN KEY (product_id) REFERENCES products(product_id)&lt;br&gt;
);&lt;/code&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;INSERT INTO sales
(sale_id, customer_id, product_id, quantity_sold, sale_date, total_amount)
VALUES
(1, 1, 1, 1, '2023-07-15', 999.99),
(2, 2, 2, 2, '2023-08-20', 1599.98),
(3, 3, 3, 1, '2023-09-10', 499.99),
(4, 4, 4, 3, '2023-07-25', 599.97),
(5, 5, 5, 1, '2023-06-18', 1200.00);
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Inventory Table&lt;/strong&gt;&lt;br&gt;
&lt;code&gt;CREATE TABLE inventory (&lt;br&gt;
    product_id INT PRIMARY KEY,&lt;br&gt;
    stock_quantity INT,&lt;br&gt;
    FOREIGN KEY (product_id) REFERENCES products(product_id)&lt;br&gt;
);&lt;/code&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;INSERT INTO inventory
(product_id, stock_quantity)
VALUES
(1, 50),
(2, 150),
(3, 30),
(4, 100),
(5, 40);
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  SQL Joins
&lt;/h2&gt;

&lt;h3&gt;
  
  
  INNER JOIN
&lt;/h3&gt;

&lt;p&gt;Returns only rows with matching values in both tables.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;SELECT c.first_name, c.last_name, s.total_amount
FROM customers c
INNER JOIN sales s ON c.customer_id = s.customer_id;

 ## Shows customers who have made purchases.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  LEFT JOIN
&lt;/h3&gt;

&lt;p&gt;Returns all rows from the left table, with &lt;code&gt;NULLs&lt;/code&gt; for unmatched right table rows.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;SELECT c.first_name, c.last_name, s.total_amount
FROM customers c
LEFT JOIN sales s ON c.customer_id = s.customer_id;

 ## Show all customers, even those with no purchases.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  SELF JOIN
&lt;/h3&gt;

&lt;p&gt;Join a table to itself.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;SELECT c.first_name AS customer1, m.first_name AS customer2, c.membership_status
FROM customers c
INNER JOIN customers m ON c.membership_status = m.membership_status
WHERE c.customer_id &amp;lt; m.customer_id;

## Find pairs of customers with the same membership status.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Window Functions
&lt;/h2&gt;

&lt;h3&gt;
  
  
  RANK()
&lt;/h3&gt;

&lt;p&gt;Assigns ranks to rows in a partition.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;SELECT 
    c.first_name,
    c.last_name,
    SUM(s.total_amount) AS total_spent,
    RANK() OVER(ORDER BY SUM(s.total_amount) DESC) AS customer_rank
FROM customers c
JOIN sales s ON c.customer_id = s.customer_id
GROUP BY c.first_name, c.last_name;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  DENSE_RANK()
&lt;/h3&gt;

&lt;p&gt;Assigns ranks to rows without gaps for ties.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;SELECT 
    c.first_name,
    c.last_name,
    SUM(s.total_amount) AS total_spent,
    DENSE_RANK() OVER(ORDER BY SUM(s.total_amount) DESC) AS dense_rank
FROM customers c
JOIN sales s ON c.customer_id = s.customer_id
GROUP BY c.first_name, c.last_name;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Difference from RANK(): If two customers tie for 1st place, RANK() skips 2, giving the next rank as 3, while DENSE_RANK() gives 2.&lt;/p&gt;

&lt;h3&gt;
  
  
  ROW_NUMBER()
&lt;/h3&gt;

&lt;p&gt;Assigns a unique sequential number to rows.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;SELECT 
    c.first_name,
    c.last_name,
    s.sale_date,
    ROW_NUMBER() OVER(PARTITION BY c.customer_id ORDER BY s.sale_date) AS purchase_order
FROM customers c
JOIN sales s ON c.customer_id = s.customer_id;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Cumulative SUM()
&lt;/h3&gt;

&lt;p&gt;Calculate running totals without collapsing rows.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;SELECT 
    p.product_name,
    s.sale_date,
    SUM(s.quantity_sold) OVER(PARTITION BY p.product_id ORDER BY s.sale_date) AS cumulative_sales
FROM products p
JOIN sales s ON p.product_id = s.product_id;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Top Customers per Membership Tier
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;SELECT 
    c.first_name,
    c.last_name,
    c.membership_status,
    SUM(s.total_amount) AS total_spent,
    RANK() OVER(PARTITION BY c.membership_status ORDER BY SUM(s.total_amount) DESC) AS tier_rank
FROM customers c
JOIN sales s ON c.customer_id = s.customer_id
GROUP BY c.first_name, c.last_name, c.membership_status
ORDER BY c.membership_status, tier_rank;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Sample Queries for Analysis
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt; Total Sales Per Product
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;SELECT p.product_name, SUM(s.quantity_sold) AS total_sales
FROM products p
JOIN sales s ON p.product_id = s.product_id
GROUP BY p.product_name
ORDER BY total_sales DESC;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt; Customers with Purchases Over $1000
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;
SELECT first_name, last_name
FROM customers
WHERE customer_id IN (SELECT customer_id FROM sales WHERE total_amount &amp;gt; 1000);
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt; Products Low in Stock
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;SELECT p.product_name, i.stock_quantity
FROM products p
JOIN inventory i ON p.product_id = i.product_id
WHERE i.stock_quantity &amp;lt; 50;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;Joins combine data across tables and are essential for querying normalized databases.&lt;/p&gt;

&lt;p&gt;Window Functions perform calculations over a set of rows without collapsing them, enabling ranking, cumulative totals, and analytics within groups.&lt;/p&gt;

&lt;p&gt;Together, they allow powerful data analysis, such as finding top customers, cumulative sales trends, and product performance.&lt;/p&gt;

</description>
      <category>learning</category>
      <category>dataengineering</category>
      <category>data</category>
      <category>developer</category>
    </item>
    <item>
      <title>How Analysts Translate Messy Data, DAX, and Dashboards into Action Using Power BI</title>
      <dc:creator>Damaa-C</dc:creator>
      <pubDate>Tue, 10 Feb 2026 12:21:30 +0000</pubDate>
      <link>https://dev.to/damaac/how-analysts-translate-messy-data-dax-and-dashboards-into-action-using-power-bi-1hp2</link>
      <guid>https://dev.to/damaac/how-analysts-translate-messy-data-dax-and-dashboards-into-action-using-power-bi-1hp2</guid>
      <description>&lt;h2&gt;
  
  
  Introduction
&lt;/h2&gt;

&lt;p&gt;In real-world analytics projects, data is rarely clean or analysis-ready. Analysts often receive data from multiple sources with missing values, inconsistent formats, duplicates, and unclear relationships. Power BI provides an end-to-end analytics platform that enables analysts to clean messy data, build strong data models, write meaningful DAX measures, and design dashboards that translate insights into action. This article explains how analysts achieve this process using Power BI, with reference to a Hospital and Pharmacy dataset.&lt;/p&gt;

&lt;h2&gt;
  
  
  Developing the Analytics Mindset
&lt;/h2&gt;

&lt;p&gt;Effective analysis begins with the right mindset. Analysts must understand the business problem before working with the data. In a hospital and pharmacy environment, decision-makers may want answers to questions such as: How many patients are visiting the hospital? Which departments are busiest? Which drugs are most prescribed? Power BI is used not just to visualize data, but to support informed decision-making through evidence-based insights.&lt;/p&gt;

&lt;h2&gt;
  
  
  Cleaning and Transforming Messy Data Using Power Query
&lt;/h2&gt;

&lt;p&gt;Messy data is one of the biggest challenges in analytics. Hospital datasets often contain duplicate patient records, inconsistent date formats, missing department names, and incorrect data types. Power Query is used to clean and transform this data before analysis begins.&lt;/p&gt;

&lt;p&gt;Using Power Query, analysts:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Remove duplicate and irrelevant records&lt;/li&gt;
&lt;li&gt;Standardize column names&lt;/li&gt;
&lt;li&gt;Convert data types to correct formats&lt;/li&gt;
&lt;li&gt;Handle missing or null values&lt;/li&gt;
&lt;li&gt;Filter data to retain only what is necessary&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;These steps ensure that the dataset is accurate, consistent, and reliable. Importantly, Power Query transformations are repeatable, meaning the same cleaning steps can be applied when new data is added.&lt;/p&gt;

&lt;h2&gt;
  
  
  Data Modeling and Relationships
&lt;/h2&gt;

&lt;p&gt;After data cleaning, analysts build a data model. A well-designed data model improves report performance and ensures accurate calculations. Best practices such as separating fact tables from dimension tables and using a star schema are applied.&lt;/p&gt;

&lt;p&gt;In the Hospital and Pharmacy dataset:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Fact tables include patient visits and pharmacy transactions&lt;/li&gt;
&lt;li&gt;Dimension tables include dates, departments, diseases, and drugs&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7ic2dcmx521agxyox9h9.jpeg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7ic2dcmx521agxyox9h9.jpeg" alt=" " width="800" height="449"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Clear relationships between these tables allow Power BI visuals and DAX measures to behave correctly across filters and slicers.&lt;/p&gt;

&lt;h2&gt;
  
  
  Using DAX to Create Business Metrics
&lt;/h2&gt;

&lt;p&gt;DAX (Data Analysis Expressions) enables analysts to create calculated measures that answer specific business questions. Unlike basic calculations, DAX measures are dynamic and respond to user interaction within reports.&lt;/p&gt;

&lt;p&gt;Examples of DAX measures used in the analysis include:&lt;/p&gt;

&lt;h3&gt;
  
  
  Total Patient Visits
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Total Patient Visits = COUNT('PatientVisits'[VisitID])
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Total Pharmacy Sales
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Total Pharmacy Sales = SUM('PharmacySales'[TotalAmount])
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Average Daily Patient Visits
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;
Average Daily Visits =
AVERAGEX(
    VALUES('Date'[Date]),
    [Total Patient Visits]
)

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Total Prescriptions
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Total Prescriptions = SUM('PharmacySales'[Quantity])
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;These measures help quantify hospital activity, track pharmacy performance, and identify trends over time. By using DAX, analysts move beyond raw data to meaningful metrics that support decision-making.&lt;/p&gt;

&lt;h2&gt;
  
  
  Selecting Appropriate Visuals
&lt;/h2&gt;

&lt;p&gt;Choosing the right visuals is critical for effective communication. Analysts select visuals based on the type of insight they want to present. In Power BI:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;Line charts&lt;/code&gt;are used to show patient visit &lt;code&gt;trends over time&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;Bar charts&lt;/code&gt; compare departments, diseases, or drugs&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;KPI cards&lt;/code&gt; highlight key metrics such as total visits and sales&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;Tables&lt;/code&gt; and &lt;code&gt;matrices&lt;/code&gt; provide detailed breakdowns&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The focus is on clarity and simplicity, ensuring that insights are easily understood by stakeholders.&lt;/p&gt;

&lt;h2&gt;
  
  
  Dashboard Design and Data Storytelling
&lt;/h2&gt;

&lt;p&gt;Dashboards are not just collections of charts; they tell a story. Analysts design dashboards to guide users from high-level summaries to more detailed insights. Layout, spacing, and logical flow are carefully considered.&lt;/p&gt;

&lt;p&gt;In the hospital dashboard, users can first view overall patient volumes and pharmacy sales, then drill down into department performance and disease patterns. This storytelling approach allows decision-makers to quickly identify issues and opportunities.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1qobz7v76l5yubph2vly.jpeg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1qobz7v76l5yubph2vly.jpeg" alt=" " width="800" height="449"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Translating Insights into Action
&lt;/h2&gt;

&lt;p&gt;The ultimate goal of analytics is action. Insights generated from Power BI dashboards enable hospital management to:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Allocate staff to high-demand departments&lt;/li&gt;
&lt;li&gt;Monitor disease trends for better planning&lt;/li&gt;
&lt;li&gt;Optimize pharmacy stock levels&lt;/li&gt;
&lt;li&gt;Improve operational efficiency and service delivery&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;By translating data into insights, Power BI supports informed, data-driven decisions.&lt;/p&gt;

&lt;p&gt;Power BI enables analysts to transform messy data into actionable insights through a structured process of data cleaning, modeling, DAX calculations, and dashboard design. By combining technical skills with an analytics mindset, analysts bridge the gap between raw hospital and pharmacy data and real-world decisions. This approach supports better planning, efficiency, and outcomes in healthcare environments.&lt;/p&gt;

</description>
      <category>learning</category>
      <category>dataengineering</category>
      <category>analytics</category>
    </item>
    <item>
      <title>Practical Data Modeling in Power BI: Star and Snowflake Schemas Explained</title>
      <dc:creator>Damaa-C</dc:creator>
      <pubDate>Tue, 03 Feb 2026 10:17:47 +0000</pubDate>
      <link>https://dev.to/damaac/practical-data-modeling-in-power-bi-star-and-snowflake-schemas-explained-18e6</link>
      <guid>https://dev.to/damaac/practical-data-modeling-in-power-bi-star-and-snowflake-schemas-explained-18e6</guid>
      <description>&lt;h2&gt;
  
  
  Introduction
&lt;/h2&gt;

&lt;p&gt;In Power BI projects, many reporting issues such as slow performance, incorrect totals, or complex DAX formulas often stem from one root cause; poor data modeling. While visuals and measures usually get most of the attention, the data model is the true foundation of any reliable analytical solution.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Data modeling&lt;/strong&gt; is the process of structuring data in a way that supports efficient analysis, accurate relationships, and meaningful insights. In Power BI, this typically means organizing data into fact tables and dimension tables, following proven data warehousing principles.&lt;/p&gt;

&lt;p&gt;As &lt;em&gt;Ralph Kimball&lt;/em&gt; explains in &lt;em&gt;The Data Warehouse Toolkit&lt;/em&gt;:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Dimensions provide the 'who, what, when, where, why, and how'&lt;br&gt;
context surrounding business process events.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;This article provides a practical, beginner-friendly guide to data modeling in Power BI, using Sales and Fact Budget CSV datasets as working examples. The explanations are guided by concepts from Ralph Kimball’s &lt;em&gt;The Data Warehouse Toolkit&lt;/em&gt; and practical demonstrations inspired by the Pragmatic Works Power BI Data Modeling video&lt;a href="https://www.youtube.com/watch?v=air7T8wCYkU" rel="noopener noreferrer"&gt;&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why Data Modeling Is Important in Power BI
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Improved report performance&lt;/li&gt;
&lt;li&gt;Simpler and more readable DAX measures&lt;/li&gt;
&lt;li&gt;Accurate filtering and aggregations&lt;/li&gt;
&lt;li&gt;Easier maintenance and scalability&lt;/li&gt;
&lt;li&gt;Consistent business logic across reports&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Power BI is not just a visualization tool; it is also an analytical engine. Without a proper model, even the best visuals can produce misleading results.&lt;/p&gt;

&lt;h2&gt;
  
  
  Business Scenario and Datasets Used
&lt;/h2&gt;

&lt;p&gt;To demonstrate practical data modeling concepts, this article uses two related fact tables;Sales table represents actual transactional sales data and Fact Budget table represents planned or budgeted values.&lt;/p&gt;

&lt;p&gt;These datasets allow analysis of actual performance versus planned targets, which is a common real-world business scenario. By modeling these tables correctly, we can compare revenue against budget, calculate variances, and evaluate performance trends over time.&lt;/p&gt;

&lt;p&gt;Both fact tables share common &lt;code&gt;descriptive data&lt;/code&gt; such as &lt;br&gt;
&lt;code&gt;products, dates, and markets&lt;/code&gt;, making them ideal for demonstrating star and snowflake schemas in Power BI.&lt;/p&gt;

&lt;h2&gt;
  
  
  How We Arrive at a Data Model in Power BI
&lt;/h2&gt;

&lt;p&gt;Before any relationships are created in the Model view, a good data model begins in Power Query. This is where raw data is shaped, cleaned, and organized into fact and dimension tables. The steps below describe a practical, repeatable approach used in real-world Power BI projects.&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 1: Load the Source Data
&lt;/h3&gt;

&lt;p&gt;The process starts by loading the source files:&lt;/p&gt;

&lt;p&gt;Open Power BI Desktop&lt;br&gt;
Select &lt;code&gt;Get Data&lt;/code&gt;&lt;br&gt;
Choose the appropriate source e.g &lt;code&gt;Text(CSV, Excel, database, etc.)&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;In this example, we load:&lt;/p&gt;

&lt;p&gt;Sales data (actual transactions)&lt;/p&gt;

&lt;p&gt;FactBudget data (planned or budget figures)&lt;/p&gt;

&lt;p&gt;Once loaded, we select Transform Data to open Power Query.&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 2: Identify Facts and Dimensions
&lt;/h3&gt;

&lt;p&gt;With the data visible in Power Query, the next step is to determine Which tables represent business processes (facts)&lt;br&gt;
and Which attributes describe those processes (dimensions)&lt;/p&gt;

&lt;p&gt;The Sales and FactBudget tables are kept as fact tables, while descriptive fields such as Product, Date, Market, or Department are candidates for dimension tables.&lt;/p&gt;

&lt;p&gt;This approach follows Kimball’s principle of separating measurements from descriptive context.&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 3: Create Dimension Tables from Fact Data
&lt;/h3&gt;

&lt;p&gt;Rather than importing separate dimension files, Power BI allows dimensions to be created directly from fact tables.&lt;/p&gt;

&lt;p&gt;For each required dimension:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Duplicate the fact table (Right-click → Duplicate)&lt;/li&gt;
&lt;li&gt;Rename the duplicated table (e.g., DimProduct, DimMarket)&lt;/li&gt;
&lt;li&gt;Unselect all columns, then select only the columns relevant to that dimension&lt;/li&gt;
&lt;li&gt;Use Remove Duplicates to ensure one unique row per dimension member&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;For example:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;DimProduct keeps only Product-related columns e.g productID, product, category, segment, unit cost, unit price.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;DimCustomer keeps only Customer columns e.g customer ID, email,customer name, zip code, city, country, state. &lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This results in clean, compact dimension tables that connect efficiently to fact tables.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fe9zasuou64qh6ka6ys9v.jpeg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fe9zasuou64qh6ka6ys9v.jpeg" alt=" " width="800" height="449"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 4: Creating a Proper Date Dimension (DimDate)
&lt;/h3&gt;

&lt;p&gt;Using a dedicated Date dimension is a core data warehousing best practice. While Power BI allows the use of date columns directly from fact tables, this approach is limited and not recommended for analytical models.&lt;/p&gt;

&lt;p&gt;A true Date dimension allows analysts to answer questions such as:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Is this date a weekday or weekend?&lt;/li&gt;
&lt;li&gt;What month or quarter does it belong to?&lt;/li&gt;
&lt;li&gt;How do holidays affect performance?&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;To achieve this, we create a Date table using Power Query, based on a function by Devin Knight.&lt;/p&gt;

&lt;h3&gt;
  
  
  Adding the Date Dimension Using Power Query
&lt;/h3&gt;

&lt;p&gt;The steps are:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Go to Transform Data → Power Query Editor&lt;/li&gt;
&lt;li&gt;Select Home → New Source → Blank Query&lt;/li&gt;
&lt;li&gt;Open Advanced Editor&lt;/li&gt;
&lt;li&gt;Paste the Date dimension function code
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;//Create Date Dimension
(StartDate as date, EndDate as date)=&amp;gt;

let
    //Capture the date range from the parameters
    StartDate = #date(Date.Year(StartDate), Date.Month(StartDate), 
    Date.Day(StartDate)),
    EndDate = #date(Date.Year(EndDate), Date.Month(EndDate), 
    Date.Day(EndDate)),

    //Get the number of dates that will be required for the table
    GetDateCount = Duration.Days(EndDate - StartDate),

    //Take the count of dates and turn it into a list of dates
    GetDateList = List.Dates(StartDate, GetDateCount, 
    #duration(1,0,0,0)),

    //Convert the list into a table
    DateListToTable = Table.FromList(GetDateList, 
    Splitter.SplitByNothing(), {"Date"}, null, ExtraValues.Error),

    //Create various date attributes from the date column
    //Add Year Column
    YearNumber = Table.AddColumn(DateListToTable, "Year", 
    each Date.Year([Date])),

    //Add Quarter Column
    QuarterNumber = Table.AddColumn(YearNumber , "Quarter", 
    each "Q" &amp;amp; Number.ToText(Date.QuarterOfYear([Date]))),

    //Add Week Number Column
    WeekNumber= Table.AddColumn(QuarterNumber , "Week Number", 
    each Date.WeekOfYear([Date])),

    //Add Month Number Column
    MonthNumber = Table.AddColumn(WeekNumber, "Month Number", 
    each Date.Month([Date])),

    //Add Month Name Column
    MonthName = Table.AddColumn(MonthNumber , "Month", 
    each Date.ToText([Date],"MMMM")),

    //Add Day of Week Column
    DayOfWeek = Table.AddColumn(MonthName , "Day of Week", 
    each Date.ToText([Date],"dddd"))

in
    DayOfWeek 
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;Rename the query (e.g., fnDimDate)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;After saving the function:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;code&gt;Right-click the function → Invoke&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;Provide a Start Date and End Date
Power BI generates a full Date dimension table&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fohjl3z3w48cax3h54q9b.jpeg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fohjl3z3w48cax3h54q9b.jpeg" alt=" " width="800" height="449"&gt;&lt;/a&gt;&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxexfl3fjz8bjqmrga5en.jpeg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxexfl3fjz8bjqmrga5en.jpeg" alt=" " width="800" height="449"&gt;&lt;/a&gt;&lt;br&gt;
This approach is based on the method described by Devin Knight: Creating a Date Dimension with Power Query&lt;a href="https://devinknightsql.com/2015/06/16/creating-a-date-dimension-with-power-query/" rel="noopener noreferrer"&gt;date power query&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Why Use a Date Dimension Instead of a Fact Date Column?
&lt;/h2&gt;

&lt;p&gt;Relying on a raw date column from a fact table limits analytical capability. A dedicated Date dimension:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Enables advanced time intelligence&lt;/li&gt;
&lt;li&gt;Provides consistent filtering across multiple fact tables&lt;/li&gt;
&lt;li&gt;Allows identification of weekdays, weekends, holidays, and fiscal periods&lt;/li&gt;
&lt;li&gt;Improves model clarity and reusability&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This is why both Sales and FactBudget tables connect to the same DimDate table in the model.&lt;/p&gt;

&lt;h2&gt;
  
  
  Understanding Fact Tables
&lt;/h2&gt;

&lt;p&gt;Fact tables store quantitative, measurable data generated by business processes. In this example:&lt;/p&gt;

&lt;h3&gt;
  
  
  Sales Fact Table
&lt;/h3&gt;

&lt;p&gt;The Sales table contains transactional metrics such as:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Sales amount&lt;/li&gt;
&lt;li&gt;Quantity sold&lt;/li&gt;
&lt;li&gt;Revenue&lt;/li&gt;
&lt;li&gt;Profit&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The grain of the Sales table is defined at the level of a specific transaction, typically by product, date, and market.&lt;/p&gt;

&lt;h3&gt;
  
  
  FactBudget Table
&lt;/h3&gt;

&lt;p&gt;The FactBudget table stores planned or forecasted metrics such as:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Budgeted sales amount&lt;/li&gt;
&lt;li&gt;Budgeted revenue&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Unlike transactional data, budget data is often recorded at a higher level (for example, monthly or by department), which influences how it is modeled.&lt;/p&gt;

&lt;h3&gt;
  
  
  Dimension Tables
&lt;/h3&gt;

&lt;p&gt;Dimension tables contain descriptive attributes that give meaning to numeric facts. According to Ralph Kimball:&lt;/p&gt;

&lt;p&gt;“Dimensions provide the descriptive context for facts.”&lt;/p&gt;

&lt;p&gt;Common dimensions used in this model include:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;DimDate – when the transaction occurred&lt;/li&gt;
&lt;li&gt;DimProduct – what was sold&lt;/li&gt;
&lt;li&gt;DimCustomer / Market – who bought it and where
as shown in the image below;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1qzfqrva5umk8xol7s70.jpeg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1qzfqrva5umk8xol7s70.jpeg" alt=" " width="800" height="449"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Dimensions answer critical business questions:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Who made the purchase?&lt;/li&gt;
&lt;li&gt;What product was sold?&lt;/li&gt;
&lt;li&gt;When did it occur?&lt;/li&gt;
&lt;li&gt;Where did it happen?&lt;/li&gt;
&lt;li&gt;How or why did it occur (channel, promotion, etc.)&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  Date Dimension Theory and Implementation
&lt;/h2&gt;

&lt;p&gt;Using a dedicated Date dimension is a best practice in data modeling. Instead of relying on raw date columns from fact tables, a Date table provides:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Consistent time-based analysis&lt;/li&gt;
&lt;li&gt;Support for time intelligence functions&lt;/li&gt;
&lt;li&gt;Clear relationships across multiple fact tables&lt;/li&gt;
&lt;li&gt;Typical Date Attributes&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;A Date dimension commonly includes:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;DateKey (e.g., YYYYMMDD)&lt;/li&gt;
&lt;li&gt;Full date&lt;/li&gt;
&lt;li&gt;Year&lt;/li&gt;
&lt;li&gt;Quarter&lt;/li&gt;
&lt;li&gt;Month number&lt;/li&gt;
&lt;li&gt;Month name&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In Power BI, a Date table can be generated using DAX. This article includes a Date table created using custom date code, which is then related to both the Sales and FactBudget tables.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fkxmiotl41jfckixytis9.jpeg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fkxmiotl41jfckixytis9.jpeg" alt=" " width="800" height="449"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;In the table below, you can see dimdate relation to factsales table in the star schema.&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F51n2ul26hkg51u2rx349.jpeg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F51n2ul26hkg51u2rx349.jpeg" alt=" " width="800" height="600"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Surrogate Keys and Relationship Design
&lt;/h2&gt;

&lt;p&gt;In analytical models, surrogate (index) keys are preferred over textual fields. Examples include:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;DateKey&lt;/li&gt;
&lt;li&gt;ProductKey&lt;/li&gt;
&lt;li&gt;CustomerKey&lt;/li&gt;
&lt;li&gt;catseg id
as shown below;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4ezs5fckrvp2tjccafag.jpeg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4ezs5fckrvp2tjccafag.jpeg" alt=" " width="800" height="449"&gt;&lt;/a&gt;&lt;br&gt;
Surrogate keys improve:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Performance&lt;/li&gt;
&lt;li&gt;Relationship consistency&lt;/li&gt;
&lt;li&gt;Integration across multiple fact tables&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;They are especially useful when combining data from different systems or when natural keys are inconsistent or complex.&lt;/p&gt;

&lt;h2&gt;
  
  
  STAR SCHEMA
&lt;/h2&gt;

&lt;p&gt;The star schema is the most common and recommended modeling pattern in Power BI.&lt;/p&gt;

&lt;p&gt;In the Sales model:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The Sales fact table sits at the center&lt;/li&gt;
&lt;li&gt;Dimension tables surround it&lt;/li&gt;
&lt;li&gt;Relationships are one-to-many&lt;/li&gt;
&lt;li&gt;Filters flow from dimensions to facts&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This structure simplifies reporting and ensures efficient query performance.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5wvlgij2tduvp9za7tdv.jpeg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5wvlgij2tduvp9za7tdv.jpeg" alt=" " width="800" height="600"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Snowflake Schema
&lt;/h2&gt;

&lt;p&gt;A &lt;code&gt;snowflake schema&lt;/code&gt; occurs when dimension tables are further normalized into additional related tables.&lt;/p&gt;

&lt;p&gt;In the FactBudget model:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Budget data may link to higher-level entities such as departments or regions&lt;/li&gt;
&lt;li&gt;These dimensions connect to other dimension tables rather than directly to the fact table&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;While snowflake schemas add complexity, they are sometimes necessary, particularly for planning and budgeting data.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frt7fuh4v1072zpajhgqk.jpeg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frt7fuh4v1072zpajhgqk.jpeg" alt=" " width="800" height="449"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Integrating Star and Snowflake Schemas
&lt;/h2&gt;

&lt;p&gt;Power BI allows multiple fact tables to coexist within a single model when they share common dimensions.&lt;/p&gt;

&lt;p&gt;In this example:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Sales uses a star schema&lt;/li&gt;
&lt;li&gt;FactBudget uses a snowflake schema&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Both connect through shared dimensions such as &lt;code&gt;Date and Product&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;This integration enables:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Actual vs budget comparisons&lt;/li&gt;
&lt;li&gt;Variance analysis&lt;/li&gt;
&lt;li&gt;Performance tracking across time and products&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3rgq3ch643i6yr1fucwl.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3rgq3ch643i6yr1fucwl.png" alt=" " width="800" height="533"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;In conclusion, Power BI data modeling plays a crucial role in transforming raw data into meaningful insights. By structuring data effectively through relationships, calculated columns, measures, and hierarchies, users can create dynamic and interactive reports that support informed decision-making. Proper data modeling ensures data consistency, accuracy, and performance efficiency, allowing organizations to analyze trends, identify patterns, and make data-driven decisions with confidence. Mastery of Power BI’s data modeling capabilities not only enhances analytical capabilities but also empowers users to communicate insights visually, bridging the gap between complex data and actionable knowledge.&lt;/p&gt;

</description>
      <category>analytics</category>
      <category>dataengineering</category>
      <category>microsoft</category>
      <category>tutorial</category>
    </item>
    <item>
      <title>Introduction to Linux for Data Engineers, Including Practical Use of Vi and Nano with Examples</title>
      <dc:creator>Damaa-C</dc:creator>
      <pubDate>Sun, 25 Jan 2026 11:07:13 +0000</pubDate>
      <link>https://dev.to/damaac/introduction-to-linux-for-data-engineers-including-practical-use-of-vi-and-nano-with-examples-2g5</link>
      <guid>https://dev.to/damaac/introduction-to-linux-for-data-engineers-including-practical-use-of-vi-and-nano-with-examples-2g5</guid>
      <description>&lt;h2&gt;
  
  
  Overview
&lt;/h2&gt;

&lt;p&gt;Linux is the backbone of most modern data engineering systems. From cloud servers and big data platforms to ETL pipelines and data warehouses, Linux provides the environment where data engineers build, deploy, and manage data workflows. This article introduces Linux from a beginner’s perspective, explains why it is important for data engineers, and demonstrates basic Linux usage with a strong focus on text editing using &lt;strong&gt;Vi&lt;/strong&gt; and &lt;strong&gt;Nano&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;This guide is written for beginners with no prior Linux experience required.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why Linux Is Important for Data Engineers?
&lt;/h2&gt;

&lt;p&gt;Most data engineering tools run on Linux. Tools such as &lt;strong&gt;Apache Hadoop, Spark, Kafka, Airflow, Docker, and Kubernetes&lt;/strong&gt; are primarily designed for Linux environments.&lt;/p&gt;

&lt;p&gt;Here’s why Linux matters:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Server dominance – Most servers in the cloud (AWS, Azure, GCP) run on Linux. &lt;/li&gt;
&lt;li&gt;Performance and stability – Linux handles large-scale data processing efficiently.&lt;/li&gt;
&lt;li&gt; Automation-friendly – Powerful command-line tools for scripting and scheduling jobs.&lt;/li&gt;
&lt;li&gt;Open source – Free, customizable, and widely supported&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;As a data engineer, you will often:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Connect to Linux servers via SSH&lt;/li&gt;
&lt;li&gt;Edit configuration files&lt;/li&gt;
&lt;li&gt;Run data processing scripts&lt;/li&gt;
&lt;li&gt;Monitor logs and system resources&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Understanding Linux is therefore a core skill.&lt;/p&gt;

&lt;h2&gt;
  
  
  Understanding the Linux Terminal
&lt;/h2&gt;

&lt;p&gt;The terminal, also called the command line or shell, allows you to interact with the Linux system by typing commands.&lt;br&gt;
Example of terminal output;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;damaris@ubuntu:~$
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This shows:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Username: &lt;strong&gt;damaris&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Machine name: &lt;strong&gt;ubuntu&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Current directory: &lt;strong&gt;~ (home directory)&lt;/strong&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Basic Linux Commands for Beginners
&lt;/h2&gt;

&lt;h2&gt;
  
  
  1. Check for current directory
&lt;/h2&gt;

&lt;p&gt;To check for current directory in the terminal, use the following command;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;pwd
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Output: &lt;code&gt;/home/damaris&lt;/code&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  2. List files and folders
&lt;/h2&gt;

&lt;p&gt;To list files in terminal use this command;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;ls # listing files
ls -l ## to list folders
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  3. Create a directory
&lt;/h2&gt;

&lt;p&gt;Use &lt;code&gt;mkdir&lt;/code&gt; command to create a directory.&lt;/p&gt;

&lt;p&gt;Example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;mkdir data_projects
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  4. Navigate Between Directories
&lt;/h2&gt;

&lt;p&gt;Use &lt;code&gt;cd&lt;/code&gt; command to navigate through directories.&lt;br&gt;
Let's use the file &lt;code&gt;mkdir data_projects&lt;/code&gt; we created to navigate the directory.&lt;br&gt;
For example;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;cd data_projects
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;To exit a directory, use &lt;code&gt;cd ..&lt;/code&gt;&lt;br&gt;
Example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;cd data_projects
ls #lists files in the directory
cd ..
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  5. Create a File
&lt;/h2&gt;

&lt;p&gt;Use &lt;code&gt;touch&lt;/code&gt; command to create a file.&lt;br&gt;
Example;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;touch sample.txt
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  6. View File Content
&lt;/h2&gt;

&lt;p&gt;To view contexts or content of a file, use &lt;code&gt;cat&lt;/code&gt; command.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;cat sample.txt
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Why Text Editors Matter in Data Engineering
&lt;/h2&gt;

&lt;p&gt;As a data engineer, you will constantly edit SQL scripts, Python files, Shell scripts and Configuration files (YAML, JSON, .conf)&lt;/p&gt;

&lt;p&gt;Linux provides powerful terminal-based text editors. The most common are &lt;strong&gt;Vi/Vim&lt;/strong&gt; and &lt;strong&gt;Nano&lt;/strong&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Using the Nano Editor in Ubuntu (Detailed Beginner Guide)
&lt;/h2&gt;

&lt;h2&gt;
  
  
  What Is Nano?
&lt;/h2&gt;

&lt;p&gt;Nano is a simple, beginner-friendly text editor that runs inside the Ubuntu terminal. Unlike Vi/Vim, Nano does not use modes, which makes it much easier for new Linux users to learn and use.&lt;/p&gt;

&lt;p&gt;For data engineers, Nano is commonly used to:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Edit configuration files (.conf, .yaml, .json)&lt;/li&gt;
&lt;li&gt;Write quick notes or scripts&lt;/li&gt;
&lt;li&gt;Modify ETL pipeline settings&lt;/li&gt;
&lt;li&gt;Edit files on remote Linux servers&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Nano is preinstalled on most Ubuntu systems, so you don’t need to install anything.&lt;/p&gt;

&lt;h2&gt;
  
  
  Opening the Terminal in Ubuntu
&lt;/h2&gt;

&lt;p&gt;Before using Nano, you need to open the terminal.&lt;/p&gt;

&lt;p&gt;You can do this in any of the following ways:&lt;/p&gt;

&lt;p&gt;&lt;code&gt;Press Ctrl + Alt + T&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;Search for Terminal in the Applications menu&lt;/p&gt;

&lt;p&gt;You will see something like:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;damaris@ubuntu:~$
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This means you are in your home directory.&lt;br&gt;
nano data_notes.txt&lt;/p&gt;
&lt;h2&gt;
  
  
  Creating a File Using Nano
&lt;/h2&gt;

&lt;p&gt;To create a new file using Nano, type:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;nano data_notes.txt
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then press &lt;strong&gt;Enter&lt;/strong&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  What Happens Next?
&lt;/h2&gt;

&lt;p&gt;If the file does not exist, Nano creates it.&lt;/p&gt;

&lt;p&gt;If the file already exists, Nano opens it for editing.&lt;/p&gt;

&lt;p&gt;You will now see the Nano editor screen.&lt;/p&gt;

&lt;h2&gt;
  
  
  Understanding the Nano Editor Interface
&lt;/h2&gt;

&lt;p&gt;When Nano opens, the screen has three main parts:&lt;/p&gt;

&lt;h3&gt;
  
  
  1. Main Editing Area (Center)
&lt;/h3&gt;

&lt;p&gt;This is where you type your text.&lt;/p&gt;

&lt;p&gt;Example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;This file contains notes for our data engineering project.
Source: MySQL
Destination: Data Warehouse
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  2. Status Bar (Bottom)
&lt;/h3&gt;

&lt;p&gt;At the bottom of the screen, you’ll see something like:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;^G Get Help   ^O Write Out   ^W Where Is   ^K Cut   ^X Exit

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;^&lt;/code&gt; symbol means the &lt;strong&gt;Ctrl key&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;So:&lt;/p&gt;

&lt;p&gt;&lt;code&gt;^O&lt;/code&gt; means &lt;code&gt;Ctrl + O&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;&lt;code&gt;^X&lt;/code&gt; means &lt;code&gt;Ctrl + X&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;This shortcut list is one of Nano’s biggest advantages.&lt;/p&gt;

&lt;h3&gt;
  
  
  3. File Name Display (Top)
&lt;/h3&gt;

&lt;p&gt;At the top, Nano shows the file name you are editing:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;GNU nano 6.2         data_notes.txt
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Typing Text in Nano&lt;/p&gt;

&lt;p&gt;Nano starts in editing mode immediately.&lt;/p&gt;

&lt;p&gt;You can begin typing right away without pressing any special keys.&lt;/p&gt;

&lt;p&gt;Example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;ETL Pipeline Notes
------------------
Extract data from PostgreSQL
Transform data using Python
Load data into the warehouse

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;There is no insert mode or command mode like in Vi.&lt;/p&gt;

&lt;h2&gt;
  
  
  Saving a File in Nano
&lt;/h2&gt;

&lt;p&gt;To save your work:&lt;/p&gt;

&lt;p&gt;&lt;code&gt;Press Ctrl + O (Write Out)&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;Nano will ask:&lt;/p&gt;

&lt;p&gt;&lt;code&gt;File Name to Write: data_notes.txt&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;Press &lt;strong&gt;Enter&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Your file is now saved.&lt;/p&gt;

&lt;h2&gt;
  
  
  Exiting Nano
&lt;/h2&gt;

&lt;p&gt;To exit Nano:&lt;/p&gt;

&lt;p&gt;&lt;code&gt;Ctrl + X&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;If You Have Unsaved Changes&lt;/p&gt;

&lt;p&gt;Nano will ask:&lt;/p&gt;

&lt;p&gt;&lt;code&gt;Save modified buffer?&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Press Y&lt;/strong&gt; → Save changes&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Press N&lt;/strong&gt; → Exit without saving&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Press Ctrl + C&lt;/strong&gt; → Cancel exit&lt;/p&gt;

&lt;h3&gt;
  
  
  Opening an Existing File with Nano
&lt;/h3&gt;

&lt;p&gt;To edit an existing file:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;nano data_notes.txt
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This opens the file so you can modify it.&lt;/p&gt;

&lt;h3&gt;
  
  
  Editing Text in Nano
&lt;/h3&gt;

&lt;p&gt;Moving the Cursor&lt;/p&gt;

&lt;p&gt;You can move around using:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Arrow keys &lt;code&gt;↑ ↓ ← →&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Page Up / Page Down&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  Deleting Text
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;Backspace → Delete previous character&lt;/li&gt;
&lt;li&gt;Delete key → Delete next character&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  Cutting and Pasting Text
&lt;/h4&gt;

&lt;p&gt;Cut a Line&lt;br&gt;
&lt;code&gt;Ctrl + K&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;This cuts the entire line.&lt;/p&gt;
&lt;h4&gt;
  
  
  Paste a Line
&lt;/h4&gt;

&lt;p&gt;&lt;code&gt;Ctrl + U&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;This pastes the last cut text.&lt;/p&gt;
&lt;h2&gt;
  
  
  Searching for Text in Nano
&lt;/h2&gt;

&lt;p&gt;To search within a file:&lt;br&gt;
&lt;code&gt;Ctrl + W&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;Type the word you want to find and &lt;strong&gt;press Enter&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;warehouse
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Practical Example: Editing a Configuration File
&lt;/h2&gt;

&lt;p&gt;Imagine you are a data engineer editing a pipeline configuration file.&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 1: Open the file
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;nano etl_config.conf
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Step 2: Add configuration details
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;source_database=mysql
source_host=localhost
destination=warehouse
batch_size=500
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Step 3: Save and exit
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Ctrl + O → Enter

Ctrl + X
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Viewing the File from Terminal&lt;/p&gt;

&lt;p&gt;After exiting Nano, you can confirm the file content using:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;cat etl_config.conf
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Output:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;source_database=mysql
source_host=localhost
destination=warehouse
batch_size=500
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Common Nano Shortcuts (Beginner Must-Know)
&lt;/h2&gt;

&lt;p&gt;Shortcut    Action&lt;br&gt;
&lt;code&gt;Ctrl + O&lt;/code&gt; Save file&lt;br&gt;
&lt;code&gt;Ctrl + X&lt;/code&gt; Exit Nano&lt;br&gt;
&lt;code&gt;Ctrl + K&lt;/code&gt; Cut line&lt;br&gt;
&lt;code&gt;Ctrl + U&lt;/code&gt;      Paste&lt;br&gt;
&lt;code&gt;Ctrl + W&lt;/code&gt;      Search&lt;br&gt;
&lt;code&gt;Ctrl + G&lt;/code&gt; Help&lt;/p&gt;
&lt;h2&gt;
  
  
  Using the Vi Editor in Ubuntu (Detailed Beginner Guide)
&lt;/h2&gt;
&lt;h2&gt;
  
  
  What Is Vi?
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Vi&lt;/strong&gt; is a powerful text editor available on almost every Linux system, including Ubuntu. Unlike Nano, Vi works using modes, which can feel confusing at first but make Vi extremely efficient once learned.&lt;/p&gt;

&lt;p&gt;For data engineers, Vi is important because:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;It is always available on servers (even minimal installs)&lt;/li&gt;
&lt;li&gt;It is fast and lightweight&lt;/li&gt;
&lt;li&gt;It is widely used for editing configuration files and scripts&lt;/li&gt;
&lt;li&gt;Many tools default to Vi&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If you connect to a remote Linux server, Vi is almost always there.&lt;/p&gt;
&lt;h2&gt;
  
  
  Opening the Terminal in Ubuntu
&lt;/h2&gt;

&lt;p&gt;Open the terminal using:&lt;/p&gt;

&lt;p&gt;&lt;code&gt;Ctrl + Alt + T&lt;/code&gt;, or&lt;/p&gt;

&lt;p&gt;Search for Terminal in Applications&lt;/p&gt;

&lt;p&gt;You will see something like:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;damaris@ubuntu:~$
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Opening or Creating a File with Vi
&lt;/h2&gt;

&lt;p&gt;To open or create a file using Vi:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;vi pipeline_config.txt
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If the file does not exist → Vi creates it&lt;/p&gt;

&lt;p&gt;If the file exists → Vi opens it for editing&lt;/p&gt;

&lt;p&gt;You are now inside the Vi editor.&lt;/p&gt;

&lt;h2&gt;
  
  
  Understanding Vi Modes (Very Important)
&lt;/h2&gt;

&lt;p&gt;Vi has three main modes. Most beginner confusion comes from not knowing which mode they are in.&lt;/p&gt;

&lt;h2&gt;
  
  
  1. Normal Mode (Default)
&lt;/h2&gt;

&lt;p&gt;This is the mode Vi starts in&lt;/p&gt;

&lt;p&gt;Used for navigation and commands&lt;/p&gt;

&lt;p&gt;You cannot type text here&lt;/p&gt;

&lt;p&gt;If you try typing, nothing appears — this is normal.&lt;/p&gt;

&lt;h2&gt;
  
  
  2. Insert Mode
&lt;/h2&gt;

&lt;p&gt;Used for typing text&lt;/p&gt;

&lt;p&gt;You must enter this mode manually&lt;/p&gt;

&lt;p&gt;To enter Insert mode:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;i
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You will see something like:&lt;/p&gt;

&lt;p&gt;&lt;code&gt;-- INSERT --&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;at the bottom of the screen.&lt;/p&gt;

&lt;h2&gt;
  
  
  3. Command Mode
&lt;/h2&gt;

&lt;p&gt;Used to save, quit, or exit without saving&lt;/p&gt;

&lt;p&gt;Activated by typing : in Normal mode&lt;/p&gt;

&lt;h2&gt;
  
  
  Typing Text in Vi (Insert Mode)
&lt;/h2&gt;

&lt;p&gt;Step-by-step example:&lt;/p&gt;

&lt;p&gt;Open the file:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;vi pipeline_config.txt
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Press:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;i
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Type the text:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;
source_database=postgres
host=localhost
port=5432
destination=data_warehouse
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You are now editing the file.&lt;/p&gt;

&lt;h2&gt;
  
  
  Exiting Insert Mode
&lt;/h2&gt;

&lt;p&gt;To stop typing and return to Normal mode:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Esc
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Always press &lt;strong&gt;Esc&lt;/strong&gt; before saving or quitting.&lt;/p&gt;

&lt;h2&gt;
  
  
  Saving a File in Vi
&lt;/h2&gt;

&lt;p&gt;Make sure you are in Normal mode (press &lt;strong&gt;Esc&lt;/strong&gt;)&lt;/p&gt;

&lt;p&gt;Type:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;:w
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Press &lt;strong&gt;Enter&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;This saves the file but keeps Vi open.&lt;/p&gt;

&lt;h2&gt;
  
  
  Saving and Exiting Vi
&lt;/h2&gt;

&lt;p&gt;To save and exit at the same time:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;:wq
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then press &lt;strong&gt;Enter&lt;/strong&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Exiting Vi Without Saving
&lt;/h2&gt;

&lt;p&gt;If you want to quit without saving changes:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;:q!
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This is useful if you make a mistake.&lt;/p&gt;

&lt;p&gt;Navigating Inside a File&lt;br&gt;
Using Arrow Keys&lt;/p&gt;

&lt;p&gt;Most Ubuntu versions support arrow keys for movement.&lt;/p&gt;
&lt;h2&gt;
  
  
  Using Vi Keys (Optional but Powerful)
&lt;/h2&gt;

&lt;p&gt;&lt;code&gt;h&lt;/code&gt; → left&lt;/p&gt;

&lt;p&gt;&lt;code&gt;l&lt;/code&gt; → right&lt;/p&gt;

&lt;p&gt;&lt;code&gt;j&lt;/code&gt; → down&lt;/p&gt;

&lt;p&gt;&lt;code&gt;k&lt;/code&gt; → up&lt;/p&gt;
&lt;h2&gt;
  
  
  Deleting Text in Vi
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Delete a Character -&lt;code&gt;x&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Delete a Line- &lt;code&gt;dd&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Undo a Change- &lt;code&gt;u&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;
  
  
  Searching for Text in Vi
&lt;/h2&gt;

&lt;p&gt;To search for a word:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;/warehouse
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Press Enter.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;To move to the next match:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;n
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Practical Example: Editing a Configuration File on a Server
&lt;/h2&gt;

&lt;p&gt;Imagine you are logged into a production server.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;ssh user@data-server-ip
cd /opt/etl/config
vi etl_config.conf
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Inside Vi, press i and add:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;batch_size=1000
retry_count=3
log_level=INFO
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Esc
:wq
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This is a real-world daily task for data engineers.&lt;/p&gt;

&lt;h2&gt;
  
  
  Viewing the File After Exiting Vi
&lt;/h2&gt;

&lt;p&gt;Back in the terminal:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;cat etl_config.conf
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Output:&lt;/p&gt;

&lt;p&gt;&lt;code&gt;batch_size=1000&lt;br&gt;
retry_count=3&lt;br&gt;
log_level=INFO&lt;/code&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Common Vi Commands (Beginner Cheat Sheet)
&lt;/h2&gt;

&lt;p&gt;Command Action&lt;br&gt;
&lt;code&gt;i&lt;/code&gt;    Insert mode&lt;br&gt;
&lt;code&gt;Esc&lt;/code&gt;  Normal mode&lt;br&gt;
&lt;code&gt;:w&lt;/code&gt;   Save&lt;br&gt;
&lt;code&gt;:q&lt;/code&gt;   Quit&lt;br&gt;
&lt;code&gt;:wq&lt;/code&gt;  Save and quit&lt;br&gt;
&lt;code&gt;:q!&lt;/code&gt;  Quit without saving&lt;br&gt;
&lt;code&gt;dd&lt;/code&gt;   Delete line&lt;br&gt;
&lt;code&gt;u&lt;/code&gt;    Undo&lt;br&gt;
&lt;code&gt;/text&lt;/code&gt;    Search&lt;/p&gt;

&lt;p&gt;Through this article, we explored the importance of Linux in data engineering, practiced essential Linux commands, and demonstrated practical text editing using the Nano and Vi editors on Ubuntu. Nano provides a simple and beginner-friendly way to create and edit files, while Vi offers powerful features that are widely used in professional and production environments. Learning both editors prepares beginners for real-world tasks such as editing ETL configurations, scripts, and log files on local or remote servers.&lt;/p&gt;

&lt;p&gt;In conclusion, mastering Linux basics along with Nano and Vi is a strong first step toward a successful data engineering journey. With continued practice, these skills become second nature and form the foundation for working with advanced data tools, automation, and large-scale data pipelines.&lt;/p&gt;

</description>
      <category>beginners</category>
      <category>dataengineering</category>
      <category>linux</category>
      <category>tutorial</category>
    </item>
    <item>
      <title>Understanding Git: How it tracks, pushes and pulls code on Ubuntu</title>
      <dc:creator>Damaa-C</dc:creator>
      <pubDate>Sun, 18 Jan 2026 09:41:54 +0000</pubDate>
      <link>https://dev.to/damaac/understanding-git-how-it-tracks-pushes-and-pulls-code-on-ubuntu-4aai</link>
      <guid>https://dev.to/damaac/understanding-git-how-it-tracks-pushes-and-pulls-code-on-ubuntu-4aai</guid>
      <description>&lt;p&gt;Before we dive into how Git tracks changes and handles pushing and pulling code, let’s first understand what Git is, what it does, and how it works together with GitHub.&lt;/p&gt;

&lt;h2&gt;
  
  
  What is Git?
&lt;/h2&gt;

&lt;p&gt;Git is a version control system. This means it works on your computer first, even without internet. Think of it as a time machine for your code.&lt;/p&gt;

&lt;p&gt;Git allows you to track changes to your files, save versions of your project, revert to earlier states and collaborate safely with others.&lt;br&gt;
&lt;strong&gt;GitHub&lt;/strong&gt;, on the other hand, is a remote platform where Git repositories are stored online.&lt;/p&gt;
&lt;h2&gt;
  
  
  Installing Git and linking it to GitHub
&lt;/h2&gt;

&lt;p&gt;Before installing Git, first go to your web browser and create a &lt;a href="https://github.com/" rel="noopener noreferrer"&gt;GitHub&lt;/a&gt; account. Go to your terminal after creating the account.&lt;/p&gt;


&lt;h3&gt;
  
  
  Step 1: Install Git
&lt;/h3&gt;


&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo &lt;/span&gt;apt update

Install Git:

&lt;span class="nb"&gt;sudo &lt;/span&gt;apt &lt;span class="nb"&gt;install &lt;/span&gt;git &lt;span class="nt"&gt;-y&lt;/span&gt;

Verify Git installation:

git &lt;span class="nt"&gt;--version&lt;/span&gt;

&lt;span class="sb"&gt;`&lt;/span&gt;Shows Git is successfully installed on Ubuntu.&lt;span class="sb"&gt;`&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;h3&gt;
  
  
  Step 2: Configure Git Identity
&lt;/h3&gt;


&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;Set your name:

git config &lt;span class="nt"&gt;--global&lt;/span&gt; user.name &lt;span class="s2"&gt;"Your_Name"&lt;/span&gt;

Set your email:

git config &lt;span class="nt"&gt;--global&lt;/span&gt; user.email &lt;span class="s2"&gt;"your_email@gmail.com"&lt;/span&gt;

Check configuration:

git config &lt;span class="nt"&gt;--list&lt;/span&gt;

user.name&lt;span class="o"&gt;=&lt;/span&gt;Your Name
user.email&lt;span class="o"&gt;=&lt;/span&gt;your_email@gmail.com&lt;span class="sb"&gt;`&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;Shows Git configuration including username and email.&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;user.name=Your Name → Git knows who is making commits. This name appears in the commit history.

user.email=your_email@gmail.com → Git associates your commits with this email. It must match your GitHub account email.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
&lt;p&gt;This confirms Git is configured correctly. Without this, Git will not track who is making changes, and commits may not appear properly on GitHub.&lt;/p&gt;
&lt;h3&gt;
  
  
  Step 3: Create SSH Key and Link to GitHub
&lt;/h3&gt;


&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Generate a secure SSH key:

ssh-keygen -t ed25519 -C "your_email@gmail.com"

Start SSH agent:

eval "$(ssh-agent -s)"

Add your key to the agent:

ssh-add ~/.ssh/id_ed25519

Display your public key:

cat ~/.ssh/id_ed25519.pub

Test the connection:

ssh -T git@github.com
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;Example of wrong input:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;ssh -T git@github.com
`Permission denied (publickey).`
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Correct output:&lt;/p&gt;

&lt;p&gt;&lt;code&gt;Hi username! You've successfully authenticated, but GitHub does not provide shell access.&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;Shows a successful authentication with GitHub via SSH.&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 4: Create a GitHub Repository
&lt;/h3&gt;

&lt;p&gt;Create a new repository on GitHub and do not initialize with README. This will be your remote repository.&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 5: Create Local Project Directory
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Create a new folder and navigate into it:

mkdir beginner-git-project
cd beginner-git-project

Shows folder creation and navigation.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Step 6: Initialize Git
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;git init

Shows that the folder is now a Git repository.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Step 7: Create a File
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Create a README file:

touch README.md
echo "# Beginner Git Project" &amp;gt;&amp;gt; README.md

Check repository status:

git status

Example of wrong input:

git statuz

Output:

`git: 'statuz' is not a git command. See 'git --help'.`
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Step 8: Stage Changes
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;git add .

Check status:

git status

Shows staged files ready to commit.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Step 9: Commit Changes
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;git commit -m "Initial commit"

Example of wrong input:

git commit -m Initial commit

Output:

`error: pathspec 'Initial' did not match any file(s) known to git`

Correct syntax requires quotes around the commit message.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Step 10: Connect Local Repo to GitHub
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Add the remote repository:

git remote add origin git@github.com:username/beginner-git-project.git
git remote -v

Shows the remote repository connection.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Step 11: Push Code to GitHub
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Rename branch to main:

git branch -M main

Push your code:

git push -u origin main

Example of wrong push command:

git push
`fatal: No configured push destination.`

This occurs if the remote wasn’t added.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Step 12: Pull Changes from GitHub
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;git pull origin main

Downloads the latest changes from GitHub and merges them into your local project.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Step 13: View Project History
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;git log

Exit the log view:

q
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Why Git Matters (Data Science &amp;amp; Data Engineering)
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Track notebooks &amp;amp; pipelines&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Collaborate safely&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Revert mistakes&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Production-ready workflows&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Git allows developers, data scientists, and engineers to track code changes, collaborate safely, and recover from mistakes. By understanding each command and its purpose, beginners can confidently manage projects and work professionally using Git and GitHub.&lt;/p&gt;

&lt;p&gt;Learning Git on Ubuntu empowers you to manage projects confidently and collaborate effectively. Whether you’re working on ETL pipelines, machine learning models, or data analysis notebooks, Git ensures your work is organized, secure, and scalable.&lt;/p&gt;

</description>
      <category>webdev</category>
      <category>github</category>
      <category>git</category>
      <category>dataengineering</category>
    </item>
  </channel>
</rss>
