<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Cyrus Ndungu</title>
    <description>The latest articles on DEV Community by Cyrus Ndungu (@cyrus_ndungu_79376c09c059).</description>
    <link>https://dev.to/cyrus_ndungu_79376c09c059</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3261040%2Fb6cd5295-d904-4deb-97d9-cedb861e843d.jpg</url>
      <title>DEV Community: Cyrus Ndungu</title>
      <link>https://dev.to/cyrus_ndungu_79376c09c059</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/cyrus_ndungu_79376c09c059"/>
    <language>en</language>
    <item>
      <title>Power BI and PostgreSQL: Connecting Your Database for Data Analysis</title>
      <dc:creator>Cyrus Ndungu</dc:creator>
      <pubDate>Sun, 08 Mar 2026 17:23:01 +0000</pubDate>
      <link>https://dev.to/cyrus_ndungu_79376c09c059/power-bi-and-postgresql-connecting-your-database-for-data-analysis-133c</link>
      <guid>https://dev.to/cyrus_ndungu_79376c09c059/power-bi-and-postgresql-connecting-your-database-for-data-analysis-133c</guid>
      <description>&lt;p&gt;In the current world, businesses around the world generate a lot of data daily. The ability to make sense of data in a quick and accurate manner can be the difference between a thriving company and a retrogressing one. This is where Power BI comes in.&lt;/p&gt;

&lt;p&gt;Power BI is a business intelligence tool made by Microsoft, primarily used for data visualizations. It allows individuals and organizations to connect to several data sources, transform data into meaningful insights, and present those insights using interactive dashboards and reports. It provides the tools to convert numbers into stories that decision makers can then act on.&lt;/p&gt;

&lt;p&gt;The drag-and-drop interface makes the tool accessible to non-technical users, while DAX (Data Analysis Expressions) and Power Query give data professionals the flexibility needed for complex transformations and calculations.&lt;/p&gt;




&lt;h2&gt;
  
  
  Connecting Power BI to a Local PostgreSQL Database
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Step 1: Launch Power BI Desktop
&lt;/h3&gt;

&lt;p&gt;Open the Power BI Desktop application on your machine.&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 2: Select Get Data
&lt;/h3&gt;

&lt;p&gt;On the &lt;strong&gt;Home&lt;/strong&gt; ribbon at the top of the screen, click the &lt;strong&gt;Get Data&lt;/strong&gt; button. This opens a menu that gives you access to Power BI's wide range of supported data connectors.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fn8l2uzdnwm6jm64wvwue.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fn8l2uzdnwm6jm64wvwue.png" alt="Get Data button in the Home ribbon" width="218" height="563"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 3: Choose PostgreSQL Database
&lt;/h3&gt;

&lt;p&gt;In the &lt;strong&gt;Get Data&lt;/strong&gt; window, type &lt;code&gt;PostgreSQL&lt;/code&gt; in the search bar. Select it and click &lt;strong&gt;Connect&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxzs3jty7c1im2o1sx8b4.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxzs3jty7c1im2o1sx8b4.png" alt="PostgreSQL option in the Get Data window" width="676" height="656"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 4: Enter the Server Name and Database Name
&lt;/h3&gt;

&lt;p&gt;A dialog box will appear asking for the server address, which is typically &lt;code&gt;localhost&lt;/code&gt; for a local installation. To specify a particular port, append it to the server name (e.g. &lt;code&gt;localhost:5433&lt;/code&gt;).&lt;/p&gt;

&lt;p&gt;You will also be required to enter the name of the specific database you want to connect to.&lt;/p&gt;

&lt;p&gt;Additionally, you will be presented with the option to choose a &lt;strong&gt;Data Connectivity mode&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Import&lt;/strong&gt; — Loads a copy of the data into Power BI's internal model. It is generally faster; however, it requires scheduled refreshes to stay up to date.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;DirectQuery&lt;/strong&gt; — Queries the database in real time whenever a report interaction occurs. This is useful for large datasets or when you need always-current data.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Step 5: Provide Your Credentials
&lt;/h3&gt;

&lt;p&gt;After the above steps, you will be asked to authenticate with the database. Select &lt;strong&gt;Database&lt;/strong&gt; as the credential type, enter your PostgreSQL username and password, then click &lt;strong&gt;Connect&lt;/strong&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 6: Select and Load the Tables
&lt;/h3&gt;

&lt;p&gt;After Step 5, the &lt;strong&gt;Navigator&lt;/strong&gt; window will open, displaying a list of all the tables and views available in your PostgreSQL database. Check the boxes next to the tables you want to bring into Power BI, then load them.&lt;/p&gt;




&lt;h2&gt;
  
  
  Connecting Power BI to a Cloud PostgreSQL Database (Microsoft Azure)
&lt;/h2&gt;

&lt;p&gt;Today, most production environments run on cloud infrastructure. Azure Database for PostgreSQL is Microsoft's fully managed cloud database service. To connect to it, follow the steps below.&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 1: Get Connection Details
&lt;/h3&gt;

&lt;p&gt;Log in to the &lt;strong&gt;Azure Portal&lt;/strong&gt;, navigate to your PostgreSQL resource, and find the following details under &lt;strong&gt;Connection Strings&lt;/strong&gt; or &lt;strong&gt;Overview&lt;/strong&gt;:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqassgy8fy4xjjp4qko5d.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqassgy8fy4xjjp4qko5d.png" alt="Azure PostgreSQL connection details" width="800" height="212"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 2: Configure Firewall Access
&lt;/h3&gt;

&lt;p&gt;Go to &lt;strong&gt;Networking&lt;/strong&gt; in your Azure PostgreSQL resource and add your IP address under &lt;strong&gt;Firewall Rules&lt;/strong&gt;. If you are connecting through the Power BI Service, enable &lt;strong&gt;Allow access to Azure services&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fo3bbmnwfk7dkmi4bxv9j.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fo3bbmnwfk7dkmi4bxv9j.png" alt="Azure Firewall Rules configuration" width="800" height="194"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 3: SSL and Secure Connections
&lt;/h3&gt;

&lt;p&gt;SSL encrypts data travelling between Power BI and the database over the internet, protecting sensitive information from interception. Azure uses a trusted root certificate authority, and Power BI handles this automatically in most cases.&lt;/p&gt;
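
&lt;p&gt;If a connection fails with an SSL error, one way to verify that the server accepts encrypted connections from your machine is to test with &lt;code&gt;psql&lt;/code&gt; before troubleshooting in Power BI (the host, database, and user below are placeholders):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;psql "host=myserver.postgres.database.azure.com port=5432 dbname=mydb user=myuser sslmode=require"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;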

&lt;p&gt;Once these three steps are complete, open Power BI and follow the same steps as before — &lt;strong&gt;Get Data → PostgreSQL Database&lt;/strong&gt; — then enter your Azure host, database name, and credentials to connect.&lt;/p&gt;




&lt;h2&gt;
  
  
  Data Modelling in Power BI
&lt;/h2&gt;

&lt;p&gt;After connecting and loading your tables, navigate to &lt;strong&gt;Model View&lt;/strong&gt; in Power BI. This view displays your tables as cards and allows you to define how they relate to one another. Power BI can detect some relationships automatically; however, you should review and create any that are missing by dragging a column from one table to the matching column in another table.&lt;/p&gt;

&lt;h3&gt;
  
  
  Why Does Data Modelling Matter?
&lt;/h3&gt;

&lt;p&gt;When relationships are properly defined, Power BI knows how to filter data across tables automatically, ensuring accurate and consistent results across your reports and dashboards.&lt;/p&gt;




&lt;h2&gt;
  
  
  Why Are SQL Skills Essential for Power BI Analysis?
&lt;/h2&gt;

&lt;p&gt;Power BI provides a very useful visual interface for building dashboards. However, SQL is the language that communicates with relational databases, and it plays an important role at every stage of the Power BI workflow. Key roles include:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Retrieving the Right Data&lt;/strong&gt; — When connecting to a database like PostgreSQL, Power BI gives you the option to load entire tables or write a custom SQL query to retrieve only the data you need.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Filtering Datasets Before Loading&lt;/strong&gt; — The &lt;code&gt;WHERE&lt;/code&gt; clause allows analysts to filter data at the database level, reducing the volume of data loaded into Power BI.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Performing Aggregations&lt;/strong&gt; — This is achieved using functions like &lt;code&gt;SUM&lt;/code&gt;, &lt;code&gt;COUNT&lt;/code&gt;, &lt;code&gt;MIN&lt;/code&gt;, and &lt;code&gt;MAX&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Preparing and Shaping Data&lt;/strong&gt; — SQL allows analysts to join multiple tables, create derived columns, handle null values, cast data types, and reshape data into the format required by Power BI.&lt;/li&gt;
&lt;/ul&gt;
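
&lt;p&gt;As a sketch of these ideas combined, the query below (against a hypothetical &lt;code&gt;sales&lt;/code&gt; and &lt;code&gt;customers&lt;/code&gt; schema) joins, filters, and aggregates at the database level, so Power BI loads only the summarized result:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;-- Summarize 2025 revenue per customer region before loading into Power BI
SELECT c.region,
       COUNT(*) AS orders,
       SUM(s.amount) AS total_revenue
FROM sales s
JOIN customers c ON c.customer_id = s.customer_id
WHERE s.order_date BETWEEN DATE '2025-01-01' AND DATE '2025-12-31'
GROUP BY c.region;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;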

</description>
      <category>data</category>
      <category>analytics</category>
    </item>
    <item>
      <title>SQL Joins and Window Functions</title>
      <dc:creator>Cyrus Ndungu</dc:creator>
      <pubDate>Mon, 02 Mar 2026 16:57:29 +0000</pubDate>
      <link>https://dev.to/cyrus_ndungu_79376c09c059/sql-joins-and-window-functions-1g7a</link>
      <guid>https://dev.to/cyrus_ndungu_79376c09c059/sql-joins-and-window-functions-1g7a</guid>
      <description>&lt;p&gt;Joins and window functions are two of the most powerful tools in SQL. Understanding them well is really key in the journey of becoming a data professional.&lt;br&gt;
A join combines rows from two or more tables based on a related column. The most common is the INNER JOIN, which returns only rows where a match exists in both tables. If you need to preserve all rows from one side regardless of a match, you use a LEFT or RIGHT JOIN — NULLs fill in where no match is found. A FULL OUTER JOIN goes further, returning all rows from both tables with NULLs on either side where matches are missing. Less common but worth knowing, a CROSS JOIN produces every possible combination of rows between two tables, and a SELF JOIN joins a table to itself — handy for hierarchical data like employee-manager relationships.&lt;br&gt;
Here's a join query combining employees with their departments:&lt;/p&gt;

&lt;p&gt;&lt;code&gt;SELECT e.name, d.name AS department, e.salary&lt;br&gt;
FROM employees e&lt;br&gt;
LEFT JOIN departments d ON e.dept_id = d.id;&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;Window functions compute values across a set of rows related to the current row — without collapsing the result like GROUP BY does. They use the OVER() clause, with PARTITION BY to define groups and ORDER BY to control row ordering within each group.&lt;/p&gt;

&lt;p&gt;Ranking functions like RANK(), DENSE_RANK(), and ROW_NUMBER() are among the most used. The difference is: RANK() skips numbers after a tie, while DENSE_RANK() does not. Aggregate functions like SUM() and AVG() can also be used as window functions. Add an ORDER BY inside OVER() and they become cumulative, perfect for running totals. LAG and LEAD let you look at previous or next row values without a self-join, making period-over-period comparisons simple. Functions like FIRST_VALUE and NTILE round out the toolkit for benchmarking and bucketing data into equal groups.&lt;br&gt;
Below is an example of a window function showing each student's score alongside the class average and their rank, without losing any rows, assuming a table of students and their exam scores:&lt;br&gt;
&lt;code&gt;SELECT&lt;br&gt;
    name,&lt;br&gt;
    subject,&lt;br&gt;
    score,&lt;br&gt;
    ROUND(AVG(score) OVER (PARTITION BY subject), 2) AS class_average,&lt;br&gt;
    RANK() OVER (PARTITION BY subject ORDER BY score DESC) AS class_rank&lt;br&gt;
FROM student_scores;&lt;/code&gt;&lt;/p&gt;
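
&lt;p&gt;As a further sketch, LAG makes period-over-period comparisons simple. Assuming a hypothetical &lt;code&gt;monthly_sales&lt;/code&gt; table with &lt;code&gt;month&lt;/code&gt; and &lt;code&gt;revenue&lt;/code&gt; columns, the query below compares each month against the previous one:&lt;/p&gt;

&lt;p&gt;&lt;code&gt;SELECT&lt;br&gt;
    month,&lt;br&gt;
    revenue,&lt;br&gt;
    LAG(revenue) OVER (ORDER BY month) AS previous_month,&lt;br&gt;
    revenue - LAG(revenue) OVER (ORDER BY month) AS change&lt;br&gt;
FROM monthly_sales;&lt;/code&gt;&lt;/p&gt;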

</description>
      <category>datascience</category>
      <category>data</category>
    </item>
    <item>
      <title>How analysts translate messy data, DAX, and dashboards into action using Power BI</title>
      <dc:creator>Cyrus Ndungu</dc:creator>
      <pubDate>Mon, 09 Feb 2026 04:34:49 +0000</pubDate>
      <link>https://dev.to/cyrus_ndungu_79376c09c059/how-analysts-translate-messy-data-dax-and-dashboards-into-action-using-power-bi-99o</link>
      <guid>https://dev.to/cyrus_ndungu_79376c09c059/how-analysts-translate-messy-data-dax-and-dashboards-into-action-using-power-bi-99o</guid>
      <description>&lt;p&gt;Power BI is a decision pipeline: it takes messy, siloed inputs and turns them into measurable, operational outcomes. The difference between dashboards that look nice and dashboards that change behavior is an outcomes-first approach, disciplined data work, clear semantic modeling, purposeful DAX, and operational integration.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Start with the decision&lt;/li&gt;
&lt;/ol&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Define the decision the dashboard must enable, who will act, and the concrete action expected.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Translate that into one primary KPI, supporting metrics, and clear thresholds tied to actions.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;ol&gt;
&lt;li&gt;Tame messy data (Power Query)&lt;/li&gt;
&lt;/ol&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Profile sources to find nulls, inconsistent types, and outliers.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Clean and standardize fields (dates, categories, numeric types).&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Deduplicate and reconcile records; use fuzzy matching where needed.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Make transforms repeatable and traceable with parameters, functions, and source metadata.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Use incremental refresh and early aggregation for large volumes.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;ol&gt;
&lt;li&gt;Build a trustworthy model&lt;/li&gt;
&lt;/ol&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Structure data in a star-schema: facts for events/transactions and dimensions for entities.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Provide a dedicated Date table and mark it.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Prefer single-direction relationships and reduce unnecessary cardinality.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Remove unused columns and maintain clear, documented relationships.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;ol&gt;
&lt;li&gt;Use DAX for business logic (measures over columns)&lt;/li&gt;
&lt;/ol&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Encapsulate dynamic, context-sensitive calculations as measures so results respond correctly to filters and visuals.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Keep DAX readable and performant: use variables, avoid needless row-by-row iteration, and handle edge cases (e.g., divisions by zero).&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Document complex measures so maintainers and stakeholders understand the logic.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
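
&lt;p&gt;A minimal sketch of these habits, assuming a hypothetical Sales fact table: variables hold intermediate results once, and DIVIDE handles the zero-denominator edge case:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Average Order Value =
VAR TotalRevenue = SUM(Sales[Revenue])          // total over the current filter context
VAR OrderCount   = DISTINCTCOUNT(Sales[OrderID])
RETURN
    DIVIDE(TotalRevenue, OrderCount)            // returns BLANK instead of an error when OrderCount is 0
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;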

&lt;ol&gt;
&lt;li&gt;Design dashboards to prompt action&lt;/li&gt;
&lt;/ol&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Lead with a one-page decision view: the KPI, its trend, and the top drivers.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Make next steps explicit: display the owner, required action, and conditional highlights tied to thresholds.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Enable quick drill paths and focused views for investigation without overwhelming the top-level page.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;ol&gt;
&lt;li&gt;Operationalize insights&lt;/li&gt;
&lt;/ol&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Connect dashboards to workflows: alerts, subscriptions, and Power Automate flows that create tickets, notify teams, or update trackers.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Surface accountability: owners, status, and an action log on or linked from the dashboard.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Ensure viewers can quickly move from insight to a recorded action.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;ol&gt;
&lt;li&gt;Performance, governance, and quality&lt;/li&gt;
&lt;/ol&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Optimize model size and query performance by removing unused fields, using aggregates, and tuning DAX.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Apply row-level security, document data sources and transformations, and use deployment pipelines or version control for PBIX assets.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Test ETL and measures with representative data and regression checks after changes.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;ol&gt;
&lt;li&gt;Measure impact and iterate&lt;/li&gt;
&lt;/ol&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Track usage and business outcomes: who uses the dashboard, what actions were taken, and whether KPIs moved.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Treat dashboards as products: collect feedback, prioritize improvements, and release updates with measurable goals.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Wrap-up&lt;/p&gt;

&lt;p&gt;Turning messy data into action with Power BI requires technical rigor and operational design. Clean, auditable ETL; a clear semantic model; robust, documented DAX; and dashboards built around decisions — connected to workflows and ownership — are the levers that move teams from insight to impact.&lt;/p&gt;

</description>
      <category>data</category>
    </item>
    <item>
      <title>A pianist’s take on Power BI: Schemas &amp; data modelling made musical 🎹</title>
      <dc:creator>Cyrus Ndungu</dc:creator>
      <pubDate>Sun, 01 Feb 2026 17:55:49 +0000</pubDate>
      <link>https://dev.to/cyrus_ndungu_79376c09c059/a-pianists-take-on-power-bi-schemas-data-modelling-made-musical-2hhm</link>
      <guid>https://dev.to/cyrus_ndungu_79376c09c059/a-pianists-take-on-power-bi-schemas-data-modelling-made-musical-2hhm</guid>
      <description>&lt;p&gt;Hi — I’m someone who spends more than a little time at the keyboard. When I arrange a tune I think about structure (intro, verse, chorus, bridge) and how the parts fit together so the melody breathes. Data modelling in Power BI is the same kind of craft: if the foundation is good, the report performs and the insights sing. Below I’ll walk you through schemas, fact &amp;amp; dimension tables, relationships, and why good modelling matters — in plain, friendly language with practical tips you can use right away.&lt;/p&gt;




&lt;p&gt;What you’ll get from this article&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Clear definitions of fact tables, dimension tables, star and snowflake schemas&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;How relationships work in Power BI (direction, cardinality, many-to-many)&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Why modelling affects performance and correctness (real-world examples)&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;A practical, step-by-step recipe to design a clean Power BI model&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Quick checklist and troubleshooting tips&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;




&lt;p&gt;Facts and dimensions — the melody and the harmony&lt;/p&gt;

&lt;p&gt;Think of a report like a song:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;The fact table is the melody — the events you measure (sales, clicks, shipments).&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;The dimension tables are the harmonies — the context (dates, customers, products, regions).&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Example: a simple sales model&lt;/p&gt;

&lt;p&gt;FactSales (the melody)&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;OrderID, OrderLineID, DateKey, CustomerKey, ProductKey, Quantity, Revenue&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;DimDate (harmony)&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;DateKey, FullDate, Month, Quarter, Year&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;DimCustomer (harmony)&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;CustomerKey, CustomerName, Segment, Region&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;DimProduct (harmony)&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;ProductKey, ProductName, Category, Brand&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Key characteristics:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Fact = many rows, numeric measures, foreign keys to dims.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Dimension = relatively few rows, descriptive attributes, primary key.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;When fact and dims are aligned by consistent keys and grain, queries are simple and correct.&lt;/p&gt;




&lt;p&gt;Star schema — the classic pop song (simple &amp;amp; fast)&lt;/p&gt;

&lt;p&gt;A star schema has one central fact table with dimension tables radiating out. It’s the most common and recommended pattern for Power BI.&lt;/p&gt;

&lt;p&gt;Visual (ASCII):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;DimDate
         |
DimCustomer — FactSales — DimProduct
         |
      DimRegion
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Why star schema works well in Power BI&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Fewer joins → faster queries in VertiPaq (the in-memory engine).&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;DAX measures are simpler because relationships are straightforward.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Good for aggregation (rollups by date, product, customer).&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Easy for report consumers to understand.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;When to use it: Most analytics and reporting scenarios where speed and simplicity matter.&lt;/p&gt;




&lt;p&gt;Snowflake schema — the classical piece (normalized)&lt;/p&gt;

&lt;p&gt;A snowflake schema normalizes dimensions into multiple tables. Example: DimProduct → DimCategory → DimSubcategory.&lt;/p&gt;

&lt;p&gt;Why you might choose snowflake:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Less redundancy; easier to maintain when attributes change frequently across many items.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Smaller dimension tables (sometimes saving storage).&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Why it can slow things down in Power BI:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;More joins increase query complexity and can slow VertiPaq queries.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;DAX can become more complex when traversing normalized hierarchies.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Rule of thumb: prefer star for analytical models in Power BI. Use snowflake only when normalization gives clear maintenance or governance benefits.&lt;/p&gt;




&lt;p&gt;Relationships — the chord progressions of your model&lt;/p&gt;

&lt;p&gt;Relationships tell Power BI how tables connect. Important concepts:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Cardinality: One-to-many (1:*), many-to-one (*:1), many-to-many (*:*). Most common is 1:* (dimension → fact).&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Cross-filter direction: single or both. Single is safer and faster; both (bidirectional) can be convenient but may introduce ambiguity and performance issues.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Active vs inactive relationships: only active relationships filter by default. USERELATIONSHIP in DAX can activate an inactive relationship in a calculation.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Relationship keys: use surrogate numeric keys (integers) for best performance; avoid text-based keys for relationships if possible.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Examples (DAX)&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;Basic measure&lt;/p&gt;

&lt;p&gt;Total Revenue = SUM(FactSales[Revenue])&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Use an alternate date relationship&lt;/p&gt;

&lt;p&gt;Total Sales by Ship Date =&lt;br&gt;
CALCULATE(&lt;br&gt;
  [Total Revenue],&lt;br&gt;
  USERELATIONSHIP(DimDate[DateKey], FactSales[ShipDateKey])&lt;br&gt;
)&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Many-to-many: use a bridge table or composite model to avoid ambiguous filters and double counting.&lt;/p&gt;




&lt;p&gt;Grain matters — set the right level for facts&lt;/p&gt;

&lt;p&gt;The “grain” of a fact table defines what a single row represents:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Order line (one row per SKU per order)&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Order header (one row per order)&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Daily aggregated sales (one row per product per day)&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If granularity is inconsistent across tables or measures, you’ll get wrong numbers (double counts, weird averages). Always:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Decide the grain early.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Keep the fact table at the lowest necessary grain for your reports.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Use aggregated tables for faster summary reports if needed.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;




&lt;p&gt;Performance — why modelling makes or breaks speed&lt;/p&gt;

&lt;p&gt;Power BI uses VertiPaq: a columnar, in-memory engine with dictionary encoding and compression. Good modelling optimizes those internals.&lt;/p&gt;

&lt;p&gt;Practical performance rules&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Remove unnecessary columns (they increase memory).&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Prefer numeric surrogate keys — smaller dictionaries and faster joins.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Reduce cardinality where possible (high-cardinality columns are expensive).&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Use star schema so queries join fewer tables.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Keep dimension attributes that you use in visuals; move rarely used attributes to a separate table.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Use Import mode for best performance; DirectQuery has a runtime dependency on the source and limits optimization.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Use incremental refresh for large fact tables.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Advanced tools&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Aggregations: create pre-aggregated summary tables for high-level reports and let Power BI route queries to them.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Composite models &amp;amp; Dual storage mode: combine Import and DirectQuery, use dual to optimize lookup tables.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;VertiPaq Analyzer or Power BI Performance Analyzer to find bottlenecks.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Concrete benefit: a well-designed star model can reduce dataset size drastically and cut query times from minutes to seconds.&lt;/p&gt;




&lt;p&gt;Accuracy — avoid the false harmonies&lt;/p&gt;

&lt;p&gt;Bad modelling doesn’t just slow you down — it misleads.&lt;/p&gt;

&lt;p&gt;Common accuracy pitfalls&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Duplicate or inconsistent dimension keys (e.g., “John Smith” vs “john smith”) → wrong joins and inflated counts.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Mixing granularities in measures (summing order lines then counting orders without DISTINCT) → double counts.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Using many-to-many without careful bridging → incorrect aggregations.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Relying on bi-directional filters to “fix” an issue — it may mask poor model design.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;How to validate&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Do spot checks: compare totals in your fact table vs a simple SUM in the model.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Verify DISTINCTCOUNT(OrderID) between source and model.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Use small test measures to assert expected behavior before building complex visuals.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
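
&lt;p&gt;The validation ideas above can be expressed as small test measures (table and column names here are illustrative):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;// Compare these against a trusted extract from the source system
Check Total Revenue = SUM(FactSales[Revenue])
Check Order Count   = DISTINCTCOUNT(FactSales[OrderID])
Check Row Count     = COUNTROWS(FactSales)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;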




&lt;p&gt;Practical recipe: build a clean Power BI model (step-by-step)&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
In Power Query: clean &amp;amp; shape&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;- Remove unused columns.

- Standardize keys and text (trim, clean, proper case).

- Ensure dates are real date types.

- Aggregate if you can reduce grain safely.

- Create surrogate keys if necessary (e.g., ProductKey).
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;ol&gt;
&lt;li&gt;
Create a Date table (essential)&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;- Generate a full date dimension with Year, Quarter, Month, Fiscal columns.

- Mark it as Date table in Model view (Modeling → Mark as Date table).
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;ol&gt;
&lt;li&gt;
Load fact(s) and dimension(s)&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;- FactSales should have foreign keys to dims.

- Ensure keys are the right data type (whole number for IDs).
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;ol&gt;
&lt;li&gt;
In Model view:&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;- Create one-to-many relationships from dimension → fact.

- Set cross-filter to single direction unless you have a specific reason.

- Hide technical key columns from report view.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;ol&gt;
&lt;li&gt;
Create measures (not calculated columns) where possible&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;- Measures calculate on the fly and are memory efficient.


Total Revenue = SUM(FactSales[Revenue])
Orders = DISTINCTCOUNT(FactSales[OrderID])
Average Order Value = DIVIDE([Total Revenue], [Orders])
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;ol&gt;
&lt;li&gt;
Test for correctness&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;- Compare totals to source extracts.

- Validate a few sample customers/products/dates.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;ol&gt;
&lt;li&gt;
Optimize&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;- Remove unused columns/tables.

- Consider aggregations for very large datasets.

- Use incremental refresh for historical fact data.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;




&lt;p&gt;Handling special cases&lt;/p&gt;

&lt;p&gt;Slowly Changing Dimensions (SCD)&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Type 1: overwrite attributes (current view only)&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Type 2: store history with row-effective dates or version keys — useful when you need historical reporting at the same grain as facts.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
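&lt;p&gt;A Type 2 dimension might look like this (column names are hypothetical, for illustration):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;CustomerKey | CustomerID | Segment | ValidFrom  | ValidTo    | IsCurrent
1           | C001       | Retail  | 2023-01-01 | 2024-06-30 | 0
2           | C001       | Premium | 2024-07-01 | 9999-12-31 | 1
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;The fact table stores the surrogate &lt;code&gt;CustomerKey&lt;/code&gt; that was current at the time of each transaction, so historical reports stay accurate.&lt;/p&gt;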

&lt;h3&gt;
  
  
  Role-playing dimensions
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;The Date dimension can play several roles: order date, ship date, invoice date. Keep separate foreign keys in the fact table and use USERELATIONSHIP for the alternate measures.&lt;/li&gt;
&lt;/ul&gt;
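&lt;p&gt;A minimal sketch of the alternate-date pattern, assuming an inactive relationship exists on a &lt;code&gt;FactSales[ShipDateKey]&lt;/code&gt; column:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Revenue by Ship Date =
CALCULATE (
    [Total Revenue],
    USERELATIONSHIP ( FactSales[ShipDateKey], DimDate[DateKey] )
)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;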

&lt;h3&gt;
  
  
  Many-to-many
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Use a bridge (junction) table or composite model; avoid ad-hoc bidirectional relationships.&lt;/li&gt;
&lt;/ul&gt;
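&lt;p&gt;One hedged pattern: keep the physical relationships single-direction and activate bidirectional filtering only inside the measures that need the bridge (table and column names here are hypothetical):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Revenue by Segment =
CALCULATE (
    [Total Revenue],
    CROSSFILTER ( BridgeCustomerSegment[CustomerKey], DimCustomer[CustomerKey], BOTH )
)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;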

&lt;h3&gt;
  
  
  DirectQuery &amp;amp; Composite models
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;DirectQuery keeps data at source (good for real-time but slower).&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Composite models allow mixing Import and DirectQuery to get the best of both worlds.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Quick checklist — tune-up before publishing
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Grain of fact table defined and documented&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Star schema (or justified snowflake) in place&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Date table present and marked as such&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Relationships 1:* with single direction (unless required)&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Numeric surrogate keys used for joins&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Unused columns removed &amp;amp; hidden from report view&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Measures created for aggregation (not unnecessary calculated columns)&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Test totals match source system for several samples&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Performance validated (Performance Analyzer)&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Troubleshooting common issues (quick tips)
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Wrong totals? Check relationship direction and active relationships.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Duplicate counts? Check grain and use DISTINCTCOUNT.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Slow visuals? Remove high-cardinality columns from visuals, consider aggregation.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Many-to-many confusion? Introduce a bridge table and use measures carefully.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Final chord — why this matters
&lt;/h2&gt;

&lt;p&gt;Good modelling is the sheet music for your data. When you model well:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Reports are fast and responsive (your audience stays engaged).&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Numbers are correct and trustworthy (your stakeholders have confidence).&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;DAX stays readable and maintainable (you can iterate quickly).&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Future changes are easier — like modulating into a new key without breaking the song.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Start simple: build a clean star schema, treat the date table as sacred, use measures, and optimize only where you need to. As a keyboard player, I know how freeing it feels to have the skeleton of a good chord progression — you can improvise wonders on top. The same is true for your data: solid structure unlocks creativity.&lt;/p&gt;

</description>
      <category>powerbi</category>
      <category>data</category>
    </item>
    <item>
      <title>A Guide to Git and GitHub for Data Analysts</title>
      <dc:creator>Cyrus Ndungu</dc:creator>
      <pubDate>Sat, 17 Jan 2026 08:59:54 +0000</pubDate>
      <link>https://dev.to/cyrus_ndungu_79376c09c059/a-guide-to-git-and-github-for-data-analysts-2n1a</link>
      <guid>https://dev.to/cyrus_ndungu_79376c09c059/a-guide-to-git-and-github-for-data-analysts-2n1a</guid>
      <description>&lt;h2&gt;
  
  
  A Guide to Git and GitHub for Data Analysts
&lt;/h2&gt;

&lt;p&gt;In the world of software engineering, writing code is only half the battle. The other half is managing that code—tracking its evolution, collaborating with others, and preventing data loss, which can be catastrophic. This is where &lt;strong&gt;Version Control&lt;/strong&gt; comes in.&lt;/p&gt;

&lt;h2&gt;
  
  
  1. What is Git and Why Version Control Matters
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Version Control&lt;/strong&gt; is a system that records changes to a file or set of files over time so that you can recall specific versions later.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Git&lt;/strong&gt; is a &lt;em&gt;Distributed Version Control System (DVCS)&lt;/em&gt;. Unlike a central server where files are locked, every developer's computer has a full copy of the code history.&lt;/p&gt;

&lt;h3&gt;
  
  
  Why is this important?
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;The "Undo" Button:&lt;/strong&gt; If you break your code at 2:00 AM, you can instantly revert the project to the state it was in at 10:00 PM. Isn't that exciting?&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Collaboration:&lt;/strong&gt; Multiple data analysts can work on the same file simultaneously, and Git merges (combines) their changes for you.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Branching:&lt;/strong&gt; You can create parallel universes (branches) to test crazy ideas without breaking the main working code.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Context:&lt;/strong&gt; It tells you &lt;em&gt;who&lt;/em&gt; wrote a line of code, &lt;em&gt;when&lt;/em&gt;, and importantly, &lt;em&gt;why&lt;/em&gt; (via commit messages).&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note on Git vs. GitHub:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Git&lt;/strong&gt; is the tool (the software installed on your machine).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;GitHub&lt;/strong&gt; is the service (a website that hosts Git repositories in the cloud). Think of it as: Git is MP3, GitHub is Spotify.&lt;/li&gt;
&lt;/ul&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  2. How to Track Changes (The Git Workflow)
&lt;/h2&gt;

&lt;p&gt;Tracking changes in Git follows a three-stage process. Imagine you are packing a moving truck:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Working Directory:&lt;/strong&gt; Where you edit files (the rooms of your house).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Staging Area (Index):&lt;/strong&gt; Where you choose what to save (the boxes you pack).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Repository (HEAD):&lt;/strong&gt; The local history of saved snapshots, or commits (the loaded truck), stored in the hidden &lt;code&gt;.git&lt;/code&gt; folder.&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  The Commands
&lt;/h3&gt;

&lt;p&gt;First, initialize Git in your project folder:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;git init
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Check the status of your files (your "dashboard"):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;git status
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Step A: Staging&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Move changes from the Working Directory to the Staging Area.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Add a specific file&lt;/span&gt;
git add main.py

&lt;span class="c"&gt;# OR add all changed files in the current directory&lt;/span&gt;
git add &lt;span class="nb"&gt;.&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Step B: Committing&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Seal the snapshot. This creates a permanent record in the history graph (a node in the tree).&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;git commit &lt;span class="nt"&gt;-m&lt;/span&gt; &lt;span class="s2"&gt;"Implement the quadratic formula function"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;The &lt;code&gt;-m&lt;/code&gt; flag allows you to write a message.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Best Practice:&lt;/strong&gt; Write messages in the imperative mood (e.g., "Add feature" not "Added feature").&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  3. How to Push Code to GitHub
&lt;/h2&gt;

&lt;p&gt;"Pushing" is the act of uploading your local repository history to a remote server (GitHub).&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Prerequisite:&lt;/strong&gt; Create a new &lt;strong&gt;empty&lt;/strong&gt; repository on GitHub.com.&lt;/p&gt;

&lt;h3&gt;
  
  
  Step A: Connect Local to Remote
&lt;/h3&gt;

&lt;p&gt;You need to tell your local Git where the GitHub server is. We usually name the remote server &lt;code&gt;origin&lt;/code&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;git remote add origin https://github.com/cyrusz55/my-project.git
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Step B: Push the Code
&lt;/h3&gt;

&lt;p&gt;Send your committed changes up to GitHub.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;git push &lt;span class="nt"&gt;-u&lt;/span&gt; origin main
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;origin&lt;/code&gt;: The destination (GitHub).&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;main&lt;/code&gt;: The branch you are sending (the default branch was historically named &lt;code&gt;master&lt;/code&gt;; new repositories now default to &lt;code&gt;main&lt;/code&gt;).&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;-u&lt;/code&gt;: Sets the "upstream." After doing this once, you can simply type &lt;code&gt;git push&lt;/code&gt; in the future.&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  4. How to Pull Code from GitHub
&lt;/h2&gt;

&lt;p&gt;"Pulling" is downloading data from GitHub to your computer. There are two scenarios for this.&lt;/p&gt;

&lt;h3&gt;
  
  
  Scenario A: Starting from scratch (&lt;code&gt;git clone&lt;/code&gt;)
&lt;/h3&gt;

&lt;p&gt;If you are on a new computer or joining a new project, you need to download the entire repository history.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;git clone https://github.com/cyrusz55/my-project.git
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This single command runs &lt;code&gt;git init&lt;/code&gt;, creates the &lt;code&gt;origin&lt;/code&gt; remote link, and downloads the entire history in one go.&lt;/p&gt;

&lt;h3&gt;
  
  
  Scenario B: Updating existing code (&lt;code&gt;git pull&lt;/code&gt;)
&lt;/h3&gt;

&lt;p&gt;If you already have the folder, but your teammate pushed new code (or you pushed code from a different computer), you need to update your current setup.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;git pull origin main
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This fetches the new changes and immediately merges them into your local files.&lt;/p&gt;
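&lt;p&gt;Under the hood, &lt;code&gt;git pull&lt;/code&gt; is a two-step operation: &lt;code&gt;git fetch&lt;/code&gt; (download) followed by &lt;code&gt;git merge&lt;/code&gt; (combine). The sketch below demonstrates it end to end using a throwaway local "remote" (all paths and names are made up for the demo):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;set -e
tmp=$(mktemp -d)

# Build a throwaway "remote" repository with one commit
git init --quiet "$tmp/remote"
git -C "$tmp/remote" config user.email demo@example.com
git -C "$tmp/remote" config user.name "Demo"
echo "first" &amp;gt; "$tmp/remote/notes.txt"
git -C "$tmp/remote" add notes.txt
git -C "$tmp/remote" commit --quiet -m "Add notes"

# Clone it, then add a second commit on the remote
git clone --quiet "$tmp/remote" "$tmp/local"
echo "second" &amp;gt;&amp;gt; "$tmp/remote/notes.txt"
git -C "$tmp/remote" commit --quiet -am "Update notes"

# git pull = git fetch (download) + git merge (combine)
git -C "$tmp/local" fetch --quiet origin
git -C "$tmp/local" merge --quiet FETCH_HEAD
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;After the merge, the local clone has both commits, exactly as a single &lt;code&gt;git pull&lt;/code&gt; would have produced.&lt;/p&gt;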




&lt;h2&gt;
  
  
  Summary Cheatsheet
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Goal&lt;/th&gt;
&lt;th&gt;Command&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Start Git&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;git init&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Check status&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;git status&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Stage files&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;git add .&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Save snapshot&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;git commit -m "message"&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Download repo&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;git clone &amp;lt;url&amp;gt;&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Upload changes&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;git push&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Update local&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;git pull&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;p&gt;&lt;em&gt;Happy coding! 🚀&lt;/em&gt;&lt;/p&gt;

</description>
      <category>git</category>
      <category>github</category>
      <category>data</category>
      <category>luxdevhq</category>
    </item>
    <item>
      <title>How Excel is Used in Real-World Data Analysis</title>
      <dc:creator>Cyrus Ndungu</dc:creator>
      <pubDate>Thu, 12 Jun 2025 21:35:43 +0000</pubDate>
      <link>https://dev.to/cyrus_ndungu_79376c09c059/how-excel-is-used-in-real-world-data-analysis-1gjg</link>
      <guid>https://dev.to/cyrus_ndungu_79376c09c059/how-excel-is-used-in-real-world-data-analysis-1gjg</guid>
      <description>&lt;h1&gt;
  
  
  My First Week Exploring Excel: Turning Numbers into Insight
&lt;/h1&gt;

&lt;p&gt;Hello, my name is &lt;strong&gt;Cyrus Ndung'u&lt;/strong&gt;. Over the past week, I’ve been immersing myself in the vast and fascinating world of data—specifically &lt;strong&gt;Microsoft Excel&lt;/strong&gt;. The experience has been exciting and deeply engaging. Even the few challenges I encountered made the journey more interesting, because each obstacle pushed me to learn something new and rewarding.&lt;/p&gt;




&lt;h2&gt;
  
  
  Introduction to Excel
&lt;/h2&gt;

&lt;p&gt;As a pianist and music lover, part of my responsibility is transforming scattered notes into a harmonious symphony. In a similar way, Excel helps transform raw numbers into meaningful insights that drive real-world decisions.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Microsoft Excel&lt;/strong&gt; is a powerful spreadsheet application and a cornerstone of data analysis across many industries. It enables users to:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;input and organize data clearly,&lt;/li&gt;
&lt;li&gt;perform calculations using formulas and functions,&lt;/li&gt;
&lt;li&gt;visualize information with charts and formatting,&lt;/li&gt;
&lt;li&gt;and analyze trends to support better decision-making.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Excel blends structure and creativity—allowing analysts to explore data with both precision and imagination.&lt;/p&gt;




&lt;h2&gt;
  
  
  Real-World Applications of Excel in Data Analysis
&lt;/h2&gt;

&lt;p&gt;Excel is widely used because it can support real, practical work in many fields. Here are a few common examples:&lt;/p&gt;

&lt;h3&gt;
  
  
  1) Business Decision-Making
&lt;/h3&gt;

&lt;p&gt;Organizations rely on Excel to analyze sales trends, track performance metrics, and forecast growth. Managers can compare results across departments, identify patterns, and make strategic decisions that influence overall success.&lt;/p&gt;

&lt;h3&gt;
  
  
  2) Financial Reporting and Analysis
&lt;/h3&gt;

&lt;p&gt;Financial professionals use Excel for budgeting, financial modeling, and reporting to stakeholders. It supports tasks such as analyzing investment portfolios, calculating ratios, and producing detailed summaries that guide major business and investment choices.&lt;/p&gt;

&lt;h3&gt;
  
  
  3) Marketing Performance Tracking
&lt;/h3&gt;

&lt;p&gt;Marketing teams use Excel to evaluate campaign effectiveness through metrics like conversion rates, customer acquisition cost, and return on investment (ROI). With structured tracking, teams can improve strategies by learning what works—and what doesn’t.&lt;/p&gt;




&lt;h2&gt;
  
  
  Essential Excel Features for Data Analysis
&lt;/h2&gt;

&lt;p&gt;Excel offers many features that strengthen analytical work. Three that stand out are:&lt;/p&gt;

&lt;h3&gt;
  
  
  VLOOKUP
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;VLOOKUP&lt;/strong&gt; helps search for a value in a table and return related information from another column. It is especially useful when combining data from multiple sheets or retrieving specific records quickly.&lt;/p&gt;
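&lt;p&gt;As a small illustration (the sheet and range names here are made up): to pull a product's price from a &lt;code&gt;Products&lt;/code&gt; sheet where column A holds the product ID and column C the price, you could write:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;=VLOOKUP(A2, Products!A:C, 3, FALSE)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;The &lt;code&gt;FALSE&lt;/code&gt; argument forces an exact match, which is almost always what you want when joining records.&lt;/p&gt;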

&lt;h3&gt;
  
  
  Pivot Tables
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Pivot tables&lt;/strong&gt; are powerful summarization tools that help analysts reorganize, group, and filter large datasets. They make it easier to view data from multiple perspectives and uncover trends that aren’t obvious in raw tables.&lt;/p&gt;

&lt;h3&gt;
  
  
  Conditional Formatting
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Conditional formatting&lt;/strong&gt; uses visual cues—such as colors, icons, and data bars—to highlight patterns, trends, and outliers. It helps important insights stand out immediately, making analysis faster and clearer.&lt;/p&gt;




&lt;h2&gt;
  
  
  Personal Reflection
&lt;/h2&gt;

&lt;p&gt;Learning Excel has changed how I see data. Where I once saw overwhelming spreadsheets full of numbers, I now see &lt;strong&gt;stories waiting to be told&lt;/strong&gt;. I’ve realized that effective data analysis is more than technical skill—it also requires creative problem-solving.&lt;/p&gt;

&lt;p&gt;In many ways, data feels like music: it has rhythm, patterns, and relationships. Just as musical compositions follow structure and harmony, datasets also contain patterns that can be discovered and interpreted. This connection has made learning Excel more intuitive and enjoyable for me.&lt;/p&gt;

&lt;p&gt;Turning raw information into meaningful insights feels remarkably similar to creating music. Both require:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;understanding underlying structure,&lt;/li&gt;
&lt;li&gt;recognizing patterns,&lt;/li&gt;
&lt;li&gt;and presenting results in a way that resonates with an audience.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This perspective has strengthened my curiosity and motivation to keep learning—especially to explore more advanced Excel features. This journey has taught me that data analysis isn’t just about formulas and numbers; it’s about discovering what the data is saying and using those insights to make better decisions in a data-driven world.&lt;/p&gt;




</description>
      <category>machinelearning</category>
      <category>database</category>
      <category>datascience</category>
    </item>
  </channel>
</rss>
