<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Robert Njuguna</title>
    <description>The latest articles on DEV Community by Robert Njuguna (@robert_njuguna).</description>
    <link>https://dev.to/robert_njuguna</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3708682%2Fe501d077-e806-48ac-8ef6-41c3b67dc59e.png</url>
      <title>DEV Community: Robert Njuguna</title>
      <link>https://dev.to/robert_njuguna</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/robert_njuguna"/>
    <language>en</language>
    <item>
      <title>ETL vs ELT: Which One Should You Use and Why?</title>
      <dc:creator>Robert Njuguna</dc:creator>
      <pubDate>Tue, 14 Apr 2026 10:28:09 +0000</pubDate>
      <link>https://dev.to/robert_njuguna/etl-vs-elt-which-one-should-you-use-and-why-4gd2</link>
      <guid>https://dev.to/robert_njuguna/etl-vs-elt-which-one-should-you-use-and-why-4gd2</guid>
      <description>&lt;h2&gt;
  
  
  ETL and ELT: Which to use and why?
&lt;/h2&gt;

&lt;h3&gt;
  
  
  What Is ETL?
&lt;/h3&gt;

&lt;p&gt;ETL is an abbreviation that means "Extract, Transform, Load." It involves the transfer of data between two locations. "Extract" refers to a term that typically means to draw raw data out of a source system. These sources may be a database, spreadsheet, API, or web application. The term "transform" states that it cleans the data and restructures it before sending it to its destination. "Load" is used to refer to loading the cleaned data into a storage mechanism such as a data warehouse.&lt;/p&gt;

&lt;p&gt;Consider the ETL as laundry prior to stuffing it in a suitcase. Clean first and put away.&lt;/p&gt;

&lt;p&gt;A retailing company gathers sales information from five branches of stores. The data is stored in a different format and is stored in each branch. The ETL process obtains all this data, corrects the formatting disparities, and eliminates any duplicates, then loads a single clean, consolidated table into the central database of the company. Reporting by analysts can now be done without messy data.&lt;/p&gt;

&lt;p&gt;ETL has been utilized for decades. Its popularity came at a time when storage was costly, and companies could not afford to store raw, unprocessed data. It was also time-saving to transform data prior to loading it and maintain clean warehouses.&lt;/p&gt;

&lt;h3&gt;
  
  
  What Is ELT?
&lt;/h3&gt;

&lt;p&gt;ELT is used as an abbreviation to mean "Extract, Load, Transform." The processes are nearly similar to ETL, except that these last two processes are reversed. Data are extracted from the source and loaded into the destination system in their raw form and transformed within that destination system.&lt;/p&gt;

&lt;p&gt;ELT can be imagined as putting all your clothes in the suitcase and sorting them out when you get to the hotel.&lt;/p&gt;

&lt;p&gt;One tech startup transmits millions of logs of user activities daily in the company's mobile application. The group puts all those crude logs in a cloud data warehouse such as BigQuery. After the data is in, data analysts then write SQL queries to clean and convert the data into useful reports. The raw data remains accessible to anybody in need of it in the future.&lt;/p&gt;

&lt;p&gt;ELT was made popular by the fact that cloud storage became very cheap, and cloud data warehouses became very powerful. Snowflake, Google BigQuery, and Amazon Redshift are examples of platforms that have the ability to perform large transformations through their own internal computing capabilities.&lt;/p&gt;

&lt;h2&gt;
  
  
  The major differences between ETL and ELT.
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1. The location of the transformation.
&lt;/h3&gt;

&lt;p&gt;In ETL, conversion occurs out of the destination system. The heavy lifting is done by a separate tool or server prior to the arrival of the data. Transformations, in ELT, occur within the destination system once the data has been loaded.&lt;/p&gt;

&lt;p&gt;This is important due to its impact on speed, cost, and flexibility.&lt;/p&gt;

&lt;h3&gt;
  
  
  2. Speed of Loading
&lt;/h3&gt;

&lt;p&gt;ETL may be slow to load since the data has to be cleaned prior to being loaded into the warehouse. ELT is much faster in loading data as raw data are directly transferred into the storage with no pre-processing.&lt;/p&gt;

&lt;h3&gt;
  
  
  3. Data Storage
&lt;/h3&gt;

&lt;p&gt;Only clean, transformed data is stored in ETL. Transformation usually results in the discarding of the raw version. Raw data is always stored in ELT, and thus, the original information does not get lost. Later, teams can re-transform the same data with other rules.&lt;/p&gt;

&lt;h3&gt;
  
  
  4. Flexibility
&lt;/h3&gt;

&lt;p&gt;ETL pipelines are stiff. In case of a change in business rules, the pipeline will have to be constructed again. ELT is more lenient. As the raw data remains intact, the analysts can add new transformation queries without having to touch the pipeline itself.&lt;/p&gt;

&lt;h3&gt;
  
  
  5. Cost
&lt;/h3&gt;

&lt;p&gt;ETL needs an additional transformation server or a tool, which increases the costs. ELT is based on the computing power of the cloud warehouse itself, which can be less expensive, depending on the size of operations.&lt;/p&gt;

&lt;h3&gt;
  
  
  6. Skill Requirements
&lt;/h3&gt;

&lt;p&gt;ETL frequently demands engineers familiar with particular software, such as Informatica or Talend. ELT frequently involves individuals familiar with SQL, a more common skill.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Feature&lt;/th&gt;
&lt;th&gt;ETL&lt;/th&gt;
&lt;th&gt;ELT&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Order&lt;/td&gt;
&lt;td&gt;Extract → Transform → Load&lt;/td&gt;
&lt;td&gt;Extract → Load → Transform&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Where Transform?&lt;/td&gt;
&lt;td&gt;Outside warehouse&lt;/td&gt;
&lt;td&gt;Inside warehouse&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Raw Data Stored?&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Speed to Load&lt;/td&gt;
&lt;td&gt;Slower&lt;/td&gt;
&lt;td&gt;Faster&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Flexibility&lt;/td&gt;
&lt;td&gt;Lower&lt;/td&gt;
&lt;td&gt;Higher&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Best For&lt;/td&gt;
&lt;td&gt;On-premise systems&lt;/td&gt;
&lt;td&gt;Cloud-based systems&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Common Tools&lt;/td&gt;
&lt;td&gt;Informatica, Talend, SSIS&lt;/td&gt;
&lt;td&gt;dbt, BigQuery, Snowflake&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h2&gt;
  
  
  ETL in real-world applications.
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1. Banking and Finance
&lt;/h3&gt;

&lt;p&gt;Banks process transactions made on ATMs, mobile applications, and branch counters. All these data are in various formats. All the records are standardized by an ETL pipeline, and any unfinished ones are removed and then loaded into the reporting database. Banks cannot just afford to store bad information, as it directly impacts financial reports and regulatory compliance.&lt;/p&gt;

&lt;h3&gt;
  
  
  2. Healthcare
&lt;/h3&gt;

&lt;p&gt;Patient data is gathered by the various systems at hospitals. There is a system that holds lab results, a system that holds prescriptions, and a system that holds appointment history. ETL consolidates all such data in a single clean patient record. The step of transformation is important, as any incorrect data format in a medical record can lead to grave consequences.&lt;/p&gt;

&lt;h3&gt;
  
  
  3. Legacy System Migration
&lt;/h3&gt;

&lt;p&gt;ETL is usually utilized when a firm is changing its database system to a new one. The existing data is taken out, cleaned to suit the new system, and finally inserted into the new database.&lt;/p&gt;

&lt;h2&gt;
  
  
  ELT use cases in the real world.
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1. E-commerce Analytics
&lt;/h3&gt;

&lt;p&gt;Even an online shop such as a Jumia or Amazon-type store gathers billions of clicks, searches, and purchases on a daily basis. All of this raw data is directly stored in a cloud warehouse. The next step is writing SQL queries by data teams to identify trends, such as the most commonly viewed products prior to a purchase. The raw data remains open to analysis in the future.&lt;/p&gt;

&lt;h3&gt;
  
  
  2. Social Media Platforms
&lt;/h3&gt;

&lt;p&gt;ELT is highly utilized in platforms that follow the user behavior, likes, shares, and watch time. The amount of data is too huge to convert prior to loading. The only feasible way is to load raw data quickly and to transform data at a later stage within the warehouse.&lt;/p&gt;

&lt;h3&gt;
  
  
  3. Start-ups and emerging firms.
&lt;/h3&gt;

&lt;p&gt;Early-stage companies are not always sure of the questions that they will pose to their data in the future. ELT allows storage of raw data and enables users to explore. On the occurrence of a new business query, analysts write new transformation queries without the reconstruction of any pipeline.&lt;/p&gt;

&lt;h2&gt;
  
  
  Tools Used in ETL
&lt;/h2&gt;

&lt;p&gt;One of the most robust and reliable ETL tools is &lt;a href="https://www.informatica.com/download.html" rel="noopener noreferrer"&gt;Informatica PowerCenter&lt;/a&gt;. It is popular in big businesses in the banking and insurance sectors.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://learn.microsoft.com/en-us/sql/integration-services/sql-server-integration-services?view=sql-server-ver17" rel="noopener noreferrer"&gt;The Microsoft SSIS (SQL Server Integration Services)&lt;/a&gt;is an inbuilt part of the Microsoft ecosystem. SQL Server databases are already in use by companies, so they are likely to select SSIS to perform ETL workflows.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://www.talend.com/" rel="noopener noreferrer"&gt;Talend&lt;/a&gt; is an ETL tool that has free and paid versions and is an open-source tool. It is linked to hundreds of data sources and allows a visual interface to be used to construct transformation logic.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://nifi.apache.org/" rel="noopener noreferrer"&gt;Apache NiFi&lt;/a&gt; is a system created to transfer data between systems. It manages data routing, transformation, and scaling of data.&lt;/p&gt;

&lt;p&gt;Another free alternative that is being used by many mid-sized businesses to create ETL pipelines without incurring high licensing fees is Pentaho.&lt;/p&gt;

&lt;h2&gt;
  
  
  Tools Used in ELT
&lt;/h2&gt;

&lt;p&gt;The most used tool of transformation steps during ELT is the &lt;a href="https://www.getdbt.com/" rel="noopener noreferrer"&gt;dbt (Data Build Tool)&lt;/a&gt;. SQL models are written in dbt by data engineers, and dbt executes such transformations directly on the data warehouse, along with automatically tracing documentation and data lineage.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://cloud.google.com/bigquery" rel="noopener noreferrer"&gt;Google BigQuery&lt;/a&gt; is a cloud data warehouse capable of housing raw data and performing large-scale SQL transformations at scale. A lot of companies rely on BigQuery as the heart of their ELT.&lt;/p&gt;

&lt;p&gt;Another example of a cloud-based warehouse is &lt;a href="https://www.snowflake.com/en/" rel="noopener noreferrer"&gt;Snowflake&lt;/a&gt;, where storage and computing capabilities are independently handled. The design renders it highly economical for ELT processes.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://aws.amazon.com/redshift/" rel="noopener noreferrer"&gt;Amazon Redshift&lt;/a&gt; is a cloud data warehouse by AWS. It can be easily integrated with other AWS services and is a powerful option for teams already on the Amazon cloud.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://www.fivetran.com/" rel="noopener noreferrer"&gt;Fivetran&lt;/a&gt; and &lt;a href="https://airbyte.com" rel="noopener noreferrer"&gt;Airbyte&lt;/a&gt;/ are software that manage the ELT extract and load stages. They are linked to hundreds of data sources and automatically synchronize raw data to the warehouse. DBT then performs the transformation step.&lt;/p&gt;

&lt;h2&gt;
  
  
  Which to use and which to leave.
&lt;/h2&gt;

&lt;p&gt;These questions need to be asked prior to selecting an approach.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Is the amount of data extremely large?&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;High volumes prefer ELT since unprocessed information can be loaded in the shortest amount of time and processed in the future with the help of the computing ability of the warehouse.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Is storage cheap?&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;It is cheap today to store data in the cloud. ELT is much more appropriate in clouds. Storage: On-premise storage is more costly, and the ETL's strategy of only storing clean data remains valid for older systems.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Is there a high rate of business rule change?&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Regular amendments of the rules are in favor of ELT. The raw data remains the same, and additional transformations can be written without modification of the pipeline.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Do you have any concerns over data privacy?&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;ETL has the capability to cover up or anonymize sensitive data even prior to its coming into the warehouse. This is critical in the fields of health care and finance, where uncontrolled keeping of raw personal data should not be allowed.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;What are the skills of the team?&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;SQL-based teams have the ability to work more swiftly with ELT and dbt. Experienced teams that are familiar with certain ETL tools might find it easier to adhere to the ETL model.&lt;/p&gt;

&lt;h2&gt;
  
  
  A real-life application of a combination of the two.
&lt;/h2&gt;

&lt;p&gt;There are numerous actual businesses that adopt a hybrid model. ETL may be used by a company to strip out any personally identifiable information in customer records prior to loading them. It is then sensitively cleaned up, and the rest of the data is stored in the warehouse as raw data with ELT making all further transformations. This offers the best of both worlds: the protection of data privacy, which ETL provides, and flexibility, which ELT provides.&lt;/p&gt;

&lt;h2&gt;
  
  
  Final Thoughts
&lt;/h2&gt;

&lt;p&gt;ETL and ELT are solutions to the same issue in dissimilar ways. ETL standardizes data prior to storage and is most effective in structured settings that have rigid data guidelines. ELT is a data storage that is transformed subsequently and is therefore the best fit for cloud platforms that deal with large and fast-moving data. The decision to make is based on the tools available in the team, the amount of data, the storage system, and the frequency of business requirements. Each method is valid, and each is extensively utilized in the data engineering industry today.&lt;/p&gt;

</description>
      <category>dataengineering</category>
      <category>database</category>
      <category>data</category>
      <category>luxdev</category>
    </item>
    <item>
      <title>SQL Joins and Window Functions: A Practical Guide.</title>
      <dc:creator>Robert Njuguna</dc:creator>
      <pubDate>Fri, 06 Mar 2026 13:28:54 +0000</pubDate>
      <link>https://dev.to/robert_njuguna/sql-joins-and-window-functions-a-practical-guide-1m49</link>
      <guid>https://dev.to/robert_njuguna/sql-joins-and-window-functions-a-practical-guide-1m49</guid>
      <description>&lt;p&gt;SQL is an efficient language that is used to manage and analyze database data contained in relational databases. Joins and window functions are considered to be among the most significant tools of high-level data querying.&lt;/p&gt;

&lt;h1&gt;
  
  
  SQL Joins - Fusion of Different Tables.
&lt;/h1&gt;

&lt;p&gt;The SQL used to put together rows of two or more tables is a JOIN in SQL. It is done by joining 2 or more tables using a common column in both tables being joined. Joins are important to work with normalized databases, in which data is distributed across a number of tables to minimize redundancy.&lt;/p&gt;

&lt;h1&gt;
  
  
  Types of Joins
&lt;/h1&gt;

&lt;h2&gt;
  
  
  INNER JOIN
&lt;/h2&gt;

&lt;p&gt;Returns only the rows that have matching values in both tables.&lt;/p&gt;

&lt;p&gt;Sample: Selecting the names of the employees and the name of the department in which they work. There will only be the employees in a department; employees who have no department or departments that have no employees will not be returned.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;d&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;department_name&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;employees&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;
&lt;span class="k"&gt;INNER&lt;/span&gt; &lt;span class="k"&gt;JOIN&lt;/span&gt; &lt;span class="n"&gt;departments&lt;/span&gt; &lt;span class="n"&gt;d&lt;/span&gt;
&lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;department_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;d&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;department_id&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  LEFT JOIN (or LEFT OUTER JOIN)
&lt;/h2&gt;

&lt;p&gt;Gives back all the rows of the left table and the matching rows from the right table. In case no match is found, NULLs are shown.&lt;br&gt;
Sample: Giving a list of all the employees, including those who are not assigned to any department.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;d&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;department_name&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;employees&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;
&lt;span class="k"&gt;LEFT&lt;/span&gt; &lt;span class="k"&gt;JOIN&lt;/span&gt; &lt;span class="n"&gt;departments&lt;/span&gt; &lt;span class="n"&gt;d&lt;/span&gt;
&lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;department_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;d&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;department_id&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  RIGHT JOIN (or right outer join)
&lt;/h2&gt;

&lt;p&gt;Returns all rows from the right table and the matching rows from the left table.&lt;br&gt;
Example: Selecting all departments, even those that have no workers/employees.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;d&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;department_name&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;departments&lt;/span&gt; &lt;span class="n"&gt;d&lt;/span&gt;
&lt;span class="k"&gt;RIGHT&lt;/span&gt; &lt;span class="k"&gt;JOIN&lt;/span&gt; &lt;span class="n"&gt;employees&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;
&lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;department_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;d&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;department_id&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  FULL JOIN (full outer join)
&lt;/h2&gt;

&lt;p&gt;Returns everything from the joined tables.&lt;/p&gt;

&lt;p&gt;Example: Combining the employees and department tables.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;d&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;department_name&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;employees&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;
&lt;span class="k"&gt;FULL&lt;/span&gt; &lt;span class="k"&gt;OUTER&lt;/span&gt; &lt;span class="k"&gt;JOIN&lt;/span&gt; &lt;span class="n"&gt;departments&lt;/span&gt; &lt;span class="n"&gt;d&lt;/span&gt;
&lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;department_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;d&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;department_id&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  CROSS JOIN
&lt;/h2&gt;

&lt;p&gt;Gets the Cartesian product of two tables. Each row of the first table is multiplied by each row of the second one. Essentially, if table one has 5 rows and row 2 has 4 rows, you would expect a total of 20 rows in the cross-joined table.&lt;/p&gt;

&lt;p&gt;Example: Generating the combinations of employees and departments.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;employees&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;departments&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;department_name&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;employees&lt;/span&gt;
&lt;span class="k"&gt;CROSS&lt;/span&gt; &lt;span class="k"&gt;JOIN&lt;/span&gt; &lt;span class="n"&gt;departments&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;One thing about cross joins, it doesn't need the on statement.&lt;/p&gt;

&lt;h1&gt;
  
  
  Window Functions: Row-Level Calculations
&lt;/h1&gt;

&lt;p&gt;Window functions allow you to perform calculations across a set of rows that are related to the current row.&lt;/p&gt;

&lt;h2&gt;
  
  
  Common Window Functions
&lt;/h2&gt;

&lt;h2&gt;
  
  
  ROW_NUMBER()
&lt;/h2&gt;

&lt;p&gt;Assigns a unique sequential integer to rows within a partition.&lt;br&gt;
Example: Ranking employees by salary within each department.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;d&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;department_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;department_name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;salary&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ROW_NUMBER&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="n"&gt;OVER&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;PARTITION&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;department_name&lt;/span&gt; &lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;salary&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;salary_rank&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;employees&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;
&lt;span class="k"&gt;INNER&lt;/span&gt; &lt;span class="k"&gt;JOIN&lt;/span&gt; &lt;span class="n"&gt;departments&lt;/span&gt; &lt;span class="n"&gt;d&lt;/span&gt;
&lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;department_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;d&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;department_id&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  RANK() and DENSE_RANK()
&lt;/h2&gt;

&lt;p&gt;RANK gives a sequence to rows, but in the case of rows with a similar value, it skips ranks based on ties, while DENSE_RANK doesn't skip ranks even in a tie.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;department_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;RANK&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="n"&gt;OVER&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;PARTITION&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;department_id&lt;/span&gt; &lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;salary&lt;/span&gt; &lt;span class="k"&gt;DESC&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;rank&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;employees&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  SUM(), AVG, MIN, and MAX as Window Functions
&lt;/h2&gt;

&lt;p&gt;Carries out aggregate inspections on a window of rows.&lt;br&gt;
Example: Calculating the total salary expense per department while still showing individual employees.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;department_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;salary&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;SUM&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;salary&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;OVER&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;PARTITION&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;department_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;total_salary&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;employees&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  LEAD() and LAG()
&lt;/h2&gt;

&lt;p&gt;Accesses subsequent or previous row values within a partition.&lt;/p&gt;

&lt;p&gt;Example: Comparing the salary of each of the employees with both the past and the future salary in the dataset.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;salary&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="n"&gt;LAG&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;salary&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;OVER&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;salary&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;previous_salary&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="n"&gt;LEAD&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;salary&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;OVER&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;salary&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;next_salary&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;employees&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;em&gt;TIP&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Figuring out how to rank, do totals, calculate moving averages, and make comparisons between current and previous values, window functions are best suited for analytics efforts where row-level information cannot be lost.&lt;/em&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  The Joins and Window Functions Combination.
&lt;/h2&gt;

&lt;p&gt;Joins and window functions tend to be used together when it comes to more intricate analytics. For instance, you could join employees with departments, then rank salaries within each department:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;d&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;department_name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;salary&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;RANK&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="n"&gt;OVER&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;PARTITION&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;department_id&lt;/span&gt; &lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;salary&lt;/span&gt; &lt;span class="k"&gt;DESC&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;salary_rank&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;employees&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;
&lt;span class="k"&gt;JOIN&lt;/span&gt; &lt;span class="n"&gt;departments&lt;/span&gt; &lt;span class="n"&gt;d&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;department_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;d&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;department_id&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This query provides a ranked view of salaries for each department, combining relational data retrieval with advanced row-wise computation.&lt;/p&gt;

&lt;h1&gt;
  
  
  Conclusion
&lt;/h1&gt;

&lt;p&gt;Joins and window functions are highly applicable in SQL. Joins make it easy and effective to combine data from several tables. Windows functions allow one to compute calculations of rows without losing any detail. A combination of the two will provide more powerful insights on your datasets. They are useful in ranking, totals, comparisons, and analytics. The practice of these functions will make your SQL queries strong. Working with data involves working with learning joins as well as window functions, which is a crucial part of the work of any data professional.&lt;/p&gt;

</description>
      <category>sql</category>
      <category>datascience</category>
      <category>database</category>
    </item>
    <item>
      <title>How Analysts Translate Messy Data, DAX, and Dashboards into Action Using Power BI</title>
      <dc:creator>Robert Njuguna</dc:creator>
      <pubDate>Mon, 09 Feb 2026 15:14:31 +0000</pubDate>
      <link>https://dev.to/robert_njuguna/how-analysts-translate-messy-data-dax-and-dashboards-into-action-using-power-bi-1c0k</link>
      <guid>https://dev.to/robert_njuguna/how-analysts-translate-messy-data-dax-and-dashboards-into-action-using-power-bi-1c0k</guid>
      <description>&lt;p&gt;Data is rarely received in an ideal state. In the physical world it is incomplete, contradictory, replicated, and dispersed among various systems. However, business remains keen to find such straight answers: What is driving costs? Where are we losing revenue? How can performance improve? An analyst is supposed to fill that gap. Power BI offers the arsenal, which enables analysts to transform sloppy data into dependable information and, eventually, into action.&lt;/p&gt;

&lt;p&gt;A Power BI analyst is the intermediary between the source systems and decision-makers. Through this process, the application of organized data cleaning, dimensional modelling, data aggregation functions, and intentional visualization can allow the analysts to transform initial information into trustworthy knowledge that drives decision-making.&lt;/p&gt;

&lt;h2&gt;
  
  
  Power Query and the ETL Foundation.
&lt;/h2&gt;

&lt;p&gt;Analysis life cycle commences in Power Query, where Extract, Transform, Load (ETL) standardizes the raw inputs.&lt;/p&gt;

&lt;p&gt;Examples include hospital and pharmacy data, which usually include:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;The dates are recorded as a text rather than a date.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Different systems use different names of drugs.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Blank values, negative values (where they should not be negative, like transaction costs, cost of medicine or even quantity of medicine given to patients).&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Duplicate invoice numbers
&lt;/h3&gt;

&lt;p&gt;The analyst uses transformation steps to impose data types, trim and clean text, eliminate duplicates, substitute nulls, and derive attributes. It is possible to break down a single Date column into Year, Quarter, Month, and Day to expand the time intelligence in the future.&lt;/p&gt;

&lt;p&gt;In power Bi the Time intelligence functions are mainly used to clean the Date column&lt;/p&gt;

&lt;p&gt;For instance:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Year (Date)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Month (Date)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Day (Date)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;These Time intelligence functions returns only the Year, month number, or Day number of the month respectively of the desired date on a separate column.&lt;/p&gt;

&lt;h3&gt;
  
  
  Star Schema and Dimensional Modeling.
&lt;/h3&gt;

&lt;p&gt;Once prepared, the analyst models the data with the dimensional modeling methods, the star schema being the most common.&lt;/p&gt;

&lt;p&gt;A transactional record may be contained in a central fact table like&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Prescription ID&lt;/li&gt;
&lt;li&gt;Patient ID&lt;/li&gt;
&lt;li&gt;Product ID&lt;/li&gt;
&lt;li&gt;Date&lt;/li&gt;
&lt;li&gt;Quantity&lt;/li&gt;
&lt;li&gt;Unit Price&lt;/li&gt;
&lt;li&gt;Cost&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Dimensions tables surrounds the facts table and give descriptive information:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Dim Patient - age, gender, type of insurance.&lt;/li&gt;
&lt;li&gt;Dim Product—name of drug, brand name, therapeutic classification.&lt;/li&gt;
&lt;li&gt;Dim Date—fiscal periods, weekdays, and months.&lt;/li&gt;
&lt;li&gt;Dim Department - ward, facility, region.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The relationships are generally one-to-many, where the dimensions flow to the facts. The structure lessen redundancy as well as make aggregations predictable. In the absence of this design, totals can be counted twice, filters can fail, and there will be worse performance.&lt;/p&gt;

&lt;h3&gt;
  
  
  DAX: Encoding Business Logic
&lt;/h3&gt;

&lt;p&gt;DAX is applied to develop standardized metrics with an appropriate model.&lt;/p&gt;

&lt;p&gt;The analyst defines it instead of computing revenue in several ways.&lt;br&gt;
For instance:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Total Revenue = SUMX(FactSales, FactSales[Quantity] * FactSales[Unit Price])

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Profitability may be defined as&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Gross Profit = TOTAL Revenue [Cost] - SUM (FactSales[Cost])

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Since DAX works in a filter environment, analysts can immediately observe the performance by month, department, or type of drug without needing to rewrite formulas.&lt;/p&gt;

&lt;p&gt;This guarantees one version of the truth in the organization.&lt;/p&gt;

&lt;h3&gt;
  
  
  Architecture and Visualization.
&lt;/h3&gt;

&lt;p&gt;Good dashboards are not created but designed. The layouts created by analysts are built in accordance with information hierarchy: KPIs are on top, diagnostics are in the middle, and details are at the bottom.&lt;/p&gt;

&lt;p&gt;For instance:&lt;/p&gt;

&lt;p&gt;There are cards showing Total Revenue, Total Prescriptions, and Gross Margin.&lt;/p&gt;

&lt;p&gt;Line charts depict trends of utilization on a daily or monthly basis.&lt;/p&gt;

&lt;p&gt;The bar charts are used to compare products (say products utilization per departments) or (departments found in each county).&lt;/p&gt;

&lt;p&gt;Color logic (e.g., red decline, green improvement) is used to provide the user with a quick way to understand performance. Date, facility, or Location (county) slicers allow the presenter to control the interactivity without making the audience feel bombarded.&lt;/p&gt;

&lt;p&gt;It is aimed at rapid thinking and directed investigation.&lt;/p&gt;

&lt;p&gt;Between the Insight and Operational Action.&lt;/p&gt;

&lt;p&gt;A dashboard may indicate that some drugs are in large stock but not being utilized. By cutting further, the analyst can find that demand declined following a treatment regimen alteration.&lt;/p&gt;

&lt;p&gt;The same understanding can stimulate the following procedures, like changing the quantities of procurement, renegotiating contracts with suppliers, or reallocating inventory between locations.&lt;/p&gt;

&lt;p&gt;In this case, analytics transforms the reporting of the past to the future strategy.&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;Power BI allows an analyst to combine ETL, dimensional modeling, DAX computation, and visualization into one decision system. The layers are based on each other: clean data is used to come up with good models and good visuals, good models are used to come up with good measures, and good measures and visuals are used to make good decisions. When properly implemented, dashboards are working tools that minimize uncertainties, enhance productivity, and generate quantifiable business performance. This is the how chaotic data is brought into action.&lt;/p&gt;

</description>
      <category>powerfuldevs</category>
      <category>datascience</category>
      <category>analytics</category>
      <category>luxdev</category>
    </item>
    <item>
      <title>Schemas and Data Modeling in Power BI</title>
      <dc:creator>Robert Njuguna</dc:creator>
      <pubDate>Tue, 03 Feb 2026 18:55:09 +0000</pubDate>
      <link>https://dev.to/robert_njuguna/schemas-and-data-modeling-in-power-bi-m32</link>
      <guid>https://dev.to/robert_njuguna/schemas-and-data-modeling-in-power-bi-m32</guid>
      <description>&lt;h2&gt;
  
  
  Schemas and Data Modeling in Power BI
&lt;/h2&gt;

&lt;p&gt;Data modeling is an important step in Power BI as it defines the way of organizing, associating, and analyzing data. An effective data model enhances the performance of a report, guarantees correct output, and simplifies it to create meaningful visualizations for the users. The major ideas in the Power BI data modeling are schemas, fact and dimension tables, and relationships.&lt;/p&gt;

&lt;h3&gt;
  
  
  Star Schema
&lt;/h3&gt;

&lt;p&gt;The most commonly used and most recommended data model in Power BI is the star schema. A star schema has one central fact table and several dimension tables that link to the fact table, forming a star.&lt;/p&gt;

&lt;p&gt;The fact table is filled with quantifiable values like the amount of sales made, the number of patients, or the yield.&lt;/p&gt;

&lt;h4&gt;
  
  
  Example
&lt;/h4&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvob0h5iow9n2x6qaayck.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvob0h5iow9n2x6qaayck.png" alt="Star schema Visualization using Sales data" width="800" height="573"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The information stored in dimension tables is descriptive and includes dates, locations, products, hospitals, or the type of crops.&lt;/p&gt;

&lt;p&gt;All the dimension tables are directly related to the fact table through a one-to-many relationship. Star schemas are simple to maintain and comprehend and are very effective for the in-memory engine of Power BI. The number of joins is reduced; therefore, reports load quicker, and DAX calculations are more effective.&lt;/p&gt;

&lt;h3&gt;
  
  
  Snowflake Schema
&lt;/h3&gt;

&lt;p&gt;The snowflake schema is a variation of the star schema that is more advanced. The use of this model entails the normalization of dimension tables into several related tables. As an illustration, a location dimension can be divided into country, region, and city tables.&lt;/p&gt;

&lt;h4&gt;
  
  
  Example
&lt;/h4&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Flq6skf6j2m65md93u65w.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Flq6skf6j2m65md93u65w.png" alt="Snowflake schema visualization" width="800" height="314"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Although snowflake schemas minimize redundancy in data, they add more relationships and complexity. In Power BI, this has the ability to slow performance and complicate DAX formulas to write and debug. Owing to this reason, snowflake schemas are neither popular nor usually sought-after except when the dataset is large or already structured in that manner.&lt;/p&gt;

&lt;h2&gt;
  
  
  Fact and Dimension Tables
&lt;/h2&gt;

&lt;p&gt;To master good data modeling, it is important to understand the distinction between fact and dimension tables:&lt;/p&gt;

&lt;h3&gt;
  
  
  Fact tables
&lt;/h3&gt;

&lt;p&gt;Essentially consists of numerical values (metrics or measures).&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Have many rows&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;em&gt;Examples: admissions, sales transactions, and crop production records.&lt;/em&gt;&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Dimension tables
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Hold descriptive characteristics.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Have fewer rows.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;em&gt;Examples: date, hospital, department, crop type, county.&lt;/em&gt;&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Common measures in Power BI are &lt;strong&gt;totals, averages, and ratios&lt;/strong&gt; generated in &lt;strong&gt;fact tables&lt;/strong&gt;, whereas &lt;strong&gt;dimensions&lt;/strong&gt; are used to slice and filter the data.&lt;/p&gt;

&lt;h2&gt;
  
  
  Relationships in Power BI
&lt;/h2&gt;

&lt;p&gt;Relationships indicate the connection between tables. Power BI has a &lt;strong&gt;one-to-many relationship&lt;/strong&gt;, the most prevalent type of relationship.&lt;/p&gt;

&lt;p&gt;Good practices for relationships include:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Single-direction (dimension to fact filtering).&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;It is better to avoid many-to-many relationships.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Ensuring Matching data types and clean keys.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Distorted or complicated associations may cause incorrect totals, undesirable filtering, and inefficient execution.&lt;/p&gt;

&lt;h2&gt;
  
  
  Significance of Good Data Modeling.
&lt;/h2&gt;

&lt;p&gt;Good data modeling is important for the following reasons:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Performance—Properly modeled models, in particular star schemas, save memory and increase report response time.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Precision—Relationships are clear so that the measures are calculated, and filters act as expected.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Usability—Clean models allow easier report building as the user is not confused.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Scalability—A well-structured data model is capable of adding more data and data measures without significant redesign.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In practical data like hospital records or agricultural data of Kenyan crops, it is good to model the data so that the analysts can make credible information to aid in decision-making.&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;Effective Power BI reporting depends on data modeling. With the help of star schemas, well-defined fact and dimension tables, and straightforward relationships, one obtains better results in terms of performance and accuracy of the analysis. The snowflake schema has a place, but Power BI suits best with clean and intuitive models. One of the ways to guarantee quality dashboards and reliable business intelligence results is investing in suitable data modeling.&lt;/p&gt;

</description>
      <category>techtalks</category>
      <category>analytics</category>
      <category>luxdev</category>
      <category>data</category>
    </item>
    <item>
      <title>Introduction to Excel For DATA Analysis</title>
      <dc:creator>Robert Njuguna</dc:creator>
      <pubDate>Tue, 27 Jan 2026 17:02:17 +0000</pubDate>
      <link>https://dev.to/robert_njuguna/introduction-to-excel-for-data-analysis-2f04</link>
      <guid>https://dev.to/robert_njuguna/introduction-to-excel-for-data-analysis-2f04</guid>
      <description>&lt;h1&gt;
  
  
  Introduction
&lt;/h1&gt;

&lt;p&gt;The most common data analytics tool that is popular with beginners is Microsoft Excel. It is simple to operate, versatile, and strong enough to carry out simple data analysis functions without possessing any programming skills.&lt;/p&gt;

&lt;p&gt;In this article, I present MS Excel as an analytics tool used in data analysis and how it may be utilized to analyse data using straightforward examples. This tutorial will assist the beginners who are new to Excel and wish to know how it can assist in the analysis of data.&lt;/p&gt;

&lt;h1&gt;
  
  
  Data Analysis
&lt;/h1&gt;

&lt;p&gt;Data analysis is the process of collecting, cleaning, analysing data to discover important information, trends, and patterns in order to make informed decision.&lt;/p&gt;

&lt;h2&gt;
  
  
  Data Collection
&lt;/h2&gt;

&lt;p&gt;Data collection is the process of obtaining both primary and secondary data for the purpose of analysis. This can either be in the field (&lt;strong&gt;primary&lt;/strong&gt;) or already stored data from databases (&lt;strong&gt;Secondary data&lt;/strong&gt;).&lt;/p&gt;

&lt;p&gt;Once collected, data is stored in a tool(Excel) for analysis. This is called data entry. For example, say you are a HR analyst, and you wish to analyse employee data using Excel. You have to first obtain the employee data from the relevant databases and store it in an excel file as shown below:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgze3k4b1sivwjum07ybv.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgze3k4b1sivwjum07ybv.png" alt=" " width="800" height="500"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Data Cleaning
&lt;/h2&gt;

&lt;p&gt;Data cleaning involves ridding data of any inconsistencies which would otherwise negatively impact the analysis process. This can involve:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Data formatting&lt;/li&gt;
&lt;li&gt;Getting rid of duplicates (&lt;strong&gt;Mainly done in the unique identifier Column&lt;/strong&gt;)&lt;/li&gt;
&lt;li&gt;Handling Outliers&lt;/li&gt;
&lt;li&gt;Handling Blanks/Missing Values&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Data Formatting
&lt;/h3&gt;

&lt;p&gt;This is the process of correcting any formatting problem.&lt;br&gt;
For instance in the HR dataset, the Hire date Column the date data doesn't currently fit in the Hire date column. See Below:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvn6pwldtnrnpbapi0rmi.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvn6pwldtnrnpbapi0rmi.png" alt=" " width="787" height="532"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This is fixed by extending the Hire date Column. This is done by right clicking the &lt;strong&gt;+&lt;/strong&gt; sign at the furthest end of the column.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdwzda3l225pbr6y3aopz.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdwzda3l225pbr6y3aopz.png" alt=" " width="800" height="348"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The number formatting also comes into play, For example, the Employee ID column in the HR dataset is initially formatted into number format rather than text format.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;How do you know this?&lt;/em&gt; - &lt;em&gt;In excel, text is always aligned to the left, while Date and number formats are always aligned to the right for each cell in a column.&lt;/em&gt;  &lt;/p&gt;

&lt;p&gt;To fix this, the Format option is utilized in excel. Select the Employee_ID column Go to:&lt;/p&gt;

&lt;p&gt;Home&amp;gt;Font&amp;gt;Click the furthest Arrow&amp;gt;Format Cells&amp;gt;Text&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Felwg56ej8q9ht0b0295b.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Felwg56ej8q9ht0b0295b.png" alt=" " width="800" height="650"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;h3&gt;
  
  
  Getting rid of duplicates
&lt;/h3&gt;

&lt;p&gt;This is done by first conditional formatting the Unique Identifier column, where you highlight the duplicates. Once the duplicates are visualized you can use logical arguments to suggest how to deal with duplicates.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Select the Unique Identifier Column:&lt;/li&gt;
&lt;li&gt;Home&amp;gt;Styles&amp;gt;Conditional Formatting&amp;gt;Highlight Cell Rules&amp;gt;Duplicate Values&lt;/li&gt;
&lt;li&gt;and then highlight with your preferred colour.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F49tnyg6tk8y1e6g9y5lo.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F49tnyg6tk8y1e6g9y5lo.png" alt=" " width="598" height="232"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Once highlighted, the next step if applying you criteria. For instance, In our example for the &lt;strong&gt;HR dataset&lt;/strong&gt;, we can choose to keep the oldest Unique Identifier (Basically, the first ID entry and delete subsequent duplicate entries).&lt;/p&gt;
&lt;h3&gt;
  
  
  Handling Blanks/Missing Values
&lt;/h3&gt;

&lt;p&gt;Missing Values can be as a result of data entry problems, or unavailable data for that particular Cell. Say for example you have Sales data set that contains countries and their respective cities. If the a particular country doesn't have data in the city column. How do we handle this?&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;We can fill the empty cells with unavailable to prevent adding incorrect values/Cities for a particular set of data.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Next, for values in a particular column, we can fill empty cells using the averages, median or mode of other columns. This is decided by what the data represents. E.g., ratings(median), Salary(average, median) e.t.c.&lt;/p&gt;

&lt;p&gt;This is done by first computing the arithmetic's of these particular columns:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;= Average ([Cell range])
= Median ([Cell range])
= Mode ([Cell range])

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Handling Outliers
&lt;/h3&gt;

&lt;h4&gt;
  
  
  1. Identify Outliers
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;Boxplots (most common)&lt;/li&gt;
&lt;li&gt;Z-scores (values &amp;gt; ±3 are often outliers)&lt;/li&gt;
&lt;li&gt;IQR method&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Lower bound = Q1 − 1.5 × IQR&lt;/p&gt;

&lt;p&gt;Upper bound = Q3 + 1.5 × IQR&lt;/p&gt;

&lt;h4&gt;
  
  
  2. Verify the Outliers
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;Before removing anything, check:&lt;/li&gt;
&lt;li&gt;Is it a data entry error? (extra zero, wrong unit)&lt;/li&gt;
&lt;li&gt;Is it a valid but extreme value?&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  3. Remove Outliers
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;Use this only if the value is an error or irrelevant.&lt;/li&gt;
&lt;li&gt;Delete the row&lt;/li&gt;
&lt;li&gt;Filter values outside acceptable ranges&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  4. Transform the Data
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Reduce the impact instead of removing:&lt;/li&gt;
&lt;li&gt;Log transformation&lt;/li&gt;
&lt;li&gt;Square root transformation&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Data Analysis
&lt;/h2&gt;

&lt;p&gt;This is the process of obtaining imortant information/insights from cleaned and transformed data&lt;/p&gt;

&lt;h3&gt;
  
  
  1. Descriptive Statistics (Start Here)
&lt;/h3&gt;

&lt;p&gt;This summarizes your data so you understand it.&lt;/p&gt;

&lt;p&gt;In Excel:&lt;/p&gt;

&lt;p&gt;Assume your cleaned prices are in Column B.&lt;/p&gt;

&lt;h3&gt;
  
  
  Mean (Average)
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;=AVERAGE(B:B)

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Median
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;=MEDIAN(B:B)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Minimum &amp;amp; Maximum
&lt;/h3&gt;

&lt;p&gt;=MIN(B:B)&lt;br&gt;
=MAX(B:B)&lt;/p&gt;

&lt;h3&gt;
  
  
  Standard Deviation
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;=STDEV(B:B)

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Interpretation example:&lt;br&gt;
“The average product price is KSh 1,666, with most prices clustered around the mean, as indicated by a standard deviation of …”&lt;/p&gt;

&lt;h3&gt;
  
  
  2. Identify Outliers (IQR Method – Practical)
&lt;/h3&gt;

&lt;p&gt;This shows whether any prices are unusually high or low.&lt;/p&gt;

&lt;p&gt;Step-by-step:&lt;/p&gt;

&lt;h4&gt;
  
  
  Q1
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;= QUARTILE(B:B,1)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  Q3
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;= QUARTILE(B:B,3)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  IQR
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;= Q3-Q1
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  Lower Bound
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;=Q1-1.5*IQR

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Upper Bound
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;=Q3+1.5*IQR
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Any price below the lower bound or above the upper bound is an outlier.&lt;/p&gt;

&lt;h1&gt;
  
  
  Pivot Tables and Visualizations
&lt;/h1&gt;

&lt;h2&gt;
  
  
  PIVOT TABLES
&lt;/h2&gt;

&lt;p&gt;Pivot tables are used to summarize data in excel. Pivot tables come in hand when one wants to group dataset using a particular criteria or to aggregate a given set of grouped data. For example, You may want to calculate the total revenue by a given region, or the average salaries of employees by Region. This can be easily done using pivot tables.&lt;/p&gt;

&lt;p&gt;To create one, you select the entire data range&amp;gt;Insert&amp;gt; Pivot table&amp;gt;From table Range.&lt;/p&gt;

&lt;p&gt;This inserts the pivot table in a new worksheet or a pre-created worksheet&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fri7kzbp357bvfbzouho4.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fri7kzbp357bvfbzouho4.png" alt=" " width="336" height="604"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The next steps involves summarizing the desired data criteria. For example, Average Salary by departments as shown below:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbdtu90j0c05ya3wc68xj.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbdtu90j0c05ya3wc68xj.png" alt=" " width="800" height="385"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Visualization&lt;/p&gt;

&lt;p&gt;This can either be done by raw data, or data from the pivot table. From raw data, a column or columns is/are selected, and a chart is inserted.&lt;/p&gt;

&lt;p&gt;This are some of the charts that can be created using raw data in excel:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5cldqrxptiorm4pzpb9n.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5cldqrxptiorm4pzpb9n.png" alt=" " width="208" height="736"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The charts can also be created using the pivot tables, by heading to:&lt;/p&gt;

&lt;p&gt;Pivot table analyze &amp;gt; pivot charts,&lt;/p&gt;

&lt;p&gt;and selecting the desired chart.&lt;/p&gt;

&lt;h1&gt;
  
  
  Dashboards
&lt;/h1&gt;

&lt;p&gt;Lastly, once the visualizations are complete, one can decide to present the entire visual, KPI's on one panel. This is called a dashboard. The charts are clearly and systematically arranged to tell a story about the data. The dashboards are also important since all the required information about your data can be placed in one place which makes it easier for presentation:&lt;/p&gt;

&lt;p&gt;See an example below:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ff4svs85zitla2ueexlna.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ff4svs85zitla2ueexlna.png" alt=" " width="800" height="605"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The above dashboard clearly and visually presents a story about the employees data.&lt;/p&gt;

&lt;h1&gt;
  
  
  Conclusion
&lt;/h1&gt;

&lt;p&gt;To sum up, Microsoft Excel is an effective and easy-to-use data analysis tool. It enables one to gather, clean, analyze, and visualize information without having to know how to program. Excel is useful because it allows users to discover valuable insights and trends through its data formatting capabilities, missing values, duplicates, descriptive statistics, pivot tables, and charts. Dashboards also improve knowledge as they display critical information in a single transparent display. Altogether, Excel is a great place to start with when someone wishes to learn more about data analysis and how to use simple and practical tools to make an informed decision.&lt;/p&gt;

</description>
      <category>luxdev</category>
      <category>programming</category>
      <category>datascience</category>
      <category>learning</category>
    </item>
    <item>
      <title>Understanding GIT: Version Control, Push and Pull Code, and Tracking changes</title>
      <dc:creator>Robert Njuguna</dc:creator>
      <pubDate>Sun, 18 Jan 2026 10:15:09 +0000</pubDate>
      <link>https://dev.to/robert_njuguna/understanding-git-version-control-push-and-pull-code-and-tracking-changes-1hje</link>
      <guid>https://dev.to/robert_njuguna/understanding-git-version-control-push-and-pull-code-and-tracking-changes-1hje</guid>
      <description>&lt;h1&gt;
  
  
  1. This article tackles:
&lt;/h1&gt;

&lt;ul&gt;
&lt;li&gt;What is Version Control&lt;/li&gt;
&lt;li&gt;How Git Tracks Changes&lt;/li&gt;
&lt;li&gt;How to Push and Pull code&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  1.1 What is Version Control
&lt;/h2&gt;

&lt;p&gt;For instance say you write an essay and there are different versions, e.g., Final_doc_1.docx, final_doc_2.docx. This can quickly become messy.&lt;/p&gt;

&lt;p&gt;When it comes to GIT, Version control solves this by automatically tracking and updating changes without saving two different versions. Git helps in:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Saving each change Automatically&lt;/li&gt;
&lt;li&gt;Letting one go back to previous versions&lt;/li&gt;
&lt;li&gt;Seeing who changed any part of the document and when.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  1.1.1 Saving each change Automatically
&lt;/h3&gt;

&lt;p&gt;This is done through:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;git add .
git commit -m "saved changes"

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  1.1.2 Letting one go back to previous versions
&lt;/h3&gt;

&lt;p&gt;To see previous versions of commit, one can view through:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;git log

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This displays the log of the commit, giving the commit Id, name of the author and the Date of the commit. For instance:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;commit: abc3wef
Author: Robert
Date: Mon Jan 15 10:30:00 2026

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Therefore, in order to go back to a specific version of the commit, we use:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;git checkout &amp;lt;commit-id&amp;gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Moreover, when one wants to get the most recent version of the commit, we use the more general version:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;git checkout main

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  1.1.3 Showing who changed what and when
&lt;/h3&gt;

&lt;p&gt;Showing who changed what and when is done using the code:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;git log --oneline

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This displays a short, easy-to-read summary of your project’s commit history.&lt;/p&gt;

&lt;p&gt;Next, to see what was changed the command:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;git show

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;is used to display all the exact changes done to the files.&lt;/p&gt;

&lt;p&gt;Last, but not least, the code:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;git blame filename.txt

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;is used to show line by line history of the changes made(who changed each line) &lt;/p&gt;

&lt;h2&gt;
  
  
  1.2 How Git Tracks Changes
&lt;/h2&gt;

&lt;p&gt;Git does not automatically save every change you make. Instead, it tracks changes using three simple stages. Understanding these stages is the key to using Git correctly.&lt;/p&gt;

&lt;p&gt;The Three Stages of Git&lt;/p&gt;

&lt;h3&gt;
  
  
  1.2.1 Working Directory
&lt;/h3&gt;

&lt;p&gt;This is where you edit your files.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;You write code&lt;/li&gt;
&lt;li&gt;You delete or modify files&lt;/li&gt;
&lt;li&gt;Git notices the changes but does not save them yet&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;To check changes, one uses:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;git status

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Git gives you a clear summary of what’s happening in your repository.&lt;br&gt;
Example output:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;On branch main
Changes not staged for commit:
  modified: index.html

Untracked files:
  script.js

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  1.2.3 Staging area
&lt;/h3&gt;

&lt;p&gt;This is where you tell Git the changes you want to save. This is done through:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;git add .

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Basically, you are preparing Git for a commit and telling Git the changes you wish to keep.&lt;/p&gt;

&lt;h3&gt;
  
  
  1.2.4  Repository (Commit History)
&lt;/h3&gt;

&lt;p&gt;This is where Git saves the staged files Once you commit. The commit is done by:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;git commit -m "New files added"

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Once this is done, the committed staged files are now permanently saved in Git's history.&lt;/p&gt;

&lt;p&gt;Git knows what has changed by comparing the last commit with the current files. This can be displayed in Git using the code:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;git diff

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This displays the added or removed lines, or any modified code.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;In simple terms, think of Git like a camera, where the &lt;strong&gt;Working Directory&lt;/strong&gt; is the scene you are setting up in order to capture a picture, &lt;strong&gt;Staging Area&lt;/strong&gt; is what you frame in the photo and &lt;strong&gt;Commit&lt;/strong&gt; is the photo you save.&lt;/em&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  1.3 How to Push and Pull Code in Git
&lt;/h2&gt;

&lt;p&gt;Basically, when one is working with Git the code you are writing exists in two places &lt;strong&gt;Local repository&lt;/strong&gt;(your computer) and &lt;strong&gt;Remote Repository&lt;/strong&gt; (GitHub). Essentially, pushing and pulling keeps these two locations in sync.&lt;/p&gt;

&lt;h3&gt;
  
  
  1.3.1 What is push ?
&lt;/h3&gt;

&lt;p&gt;When you are done committing any changes on a file, pushing basically saves the files into the &lt;strong&gt;Remote Repository&lt;/strong&gt;(GitHub). This is done by using the code:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;git push

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In short, this is like uploading your saved work.&lt;/p&gt;

&lt;h3&gt;
  
  
  1.3.2 What does "Pull" mean?
&lt;/h3&gt;

&lt;p&gt;Pull basically downloads the saved files from the &lt;strong&gt;Remote Repository&lt;/strong&gt; into your &lt;strong&gt;Local Repository&lt;/strong&gt;(into your computer). This is done if you are occasionally collaborating and others may have committed changes in the last version you interacted with.&lt;/p&gt;

&lt;p&gt;This is done using the code:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;git pull

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In short, this is like downloading updates. If you don't pull first before pushing Git may block your push or You may see a merge conflict. That’s why pulling first is a good habit.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Fun twist: The name Git comes from British slang meaning “someone who is stubborn”—which is fitting, because Git never forgets your changes.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>luxdev</category>
      <category>programming</category>
      <category>git</category>
      <category>github</category>
    </item>
  </channel>
</rss>
