<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Shaban Ibrahim</title>
    <description>The latest articles on DEV Community by Shaban Ibrahim (@shabex).</description>
    <link>https://dev.to/shabex</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3708629%2Fde01968e-ff12-49be-8946-3dbb0cdcece3.jpeg</url>
      <title>DEV Community: Shaban Ibrahim</title>
      <link>https://dev.to/shabex</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/shabex"/>
    <language>en</language>
    <item>
      <title>ETL vs ELT: Which One Should You Use and Why?</title>
      <dc:creator>Shaban Ibrahim</dc:creator>
      <pubDate>Sun, 12 Apr 2026 18:46:18 +0000</pubDate>
      <link>https://dev.to/shabex/etl-vs-elt-which-one-should-you-use-and-why-4hoi</link>
      <guid>https://dev.to/shabex/etl-vs-elt-which-one-should-you-use-and-why-4hoi</guid>
      <description>&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fo3p8ifi1im7tkbh1ly81.webp" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fo3p8ifi1im7tkbh1ly81.webp" alt="article_head" width="800" height="419"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h1&gt;
  
  
  Introduction.
&lt;/h1&gt;

&lt;p&gt;It is estimated that the global big data analytics market will grow by an estimated 165.5% by the year 2032. With this rapid growth in the data market, there is an increase in demand for processing the growing amount of data. It is here that ETL (Extract, Transform and Load) and ELT (Extract, Load and Transform) processes are essential. But as much as these processes sound the same, they are very different, and the differences can be confusing.&lt;/p&gt;

&lt;p&gt;ETL has been there for so long, long before the advent of ELT. For decades, ETL has been giving value to businesses from meaningful insights. ELT is a prodigy of the cloud revolution that will enable users to handle data at scale.&lt;/p&gt;

&lt;h3&gt;
  
  
  ETL vs ELT
&lt;/h3&gt;

&lt;h4&gt;
  
  
  What is ETL (Extract, Transform, Load)
&lt;/h4&gt;

&lt;p&gt;As the acronym suggests, this stands for Extract, Transform and Load. This is a traditional data integration approach in which data from multiple sources is consolidated in a central system. These sources could include CRM, E-Commerce Websites and Helpdesk data or even more.&lt;/p&gt;

&lt;h4&gt;
  
  
  &lt;strong&gt;&lt;em&gt;How ETL Works&lt;/em&gt;&lt;/strong&gt;
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;&lt;em&gt;Extract(E):&lt;/em&gt;&lt;/strong&gt; Here, data is pulled from various sources e.g databases, APIs, files&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;&lt;em&gt;Transform(T):&lt;/em&gt;&lt;/strong&gt; It is here where data cleaning and modifications are done before storage.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;&lt;em&gt;Load(L):&lt;/em&gt;&lt;/strong&gt;After transforming, the data has to be loaded in a database or warehouse for consumption&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fkn9wyrkblbxl6fft9901.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fkn9wyrkblbxl6fft9901.png" alt="image01" width="800" height="434"&gt;&lt;/a&gt;&lt;br&gt;
The diagram above illustrates how ETL works&lt;/p&gt;

&lt;h4&gt;
  
  
  What is ELT Works
&lt;/h4&gt;

&lt;p&gt;As the acronym suggests, this stands for Extract, Load and Transform. This is a mutant of ETL, a modern approach in which data is first loaded into the warehouse and then transformed using powerful cloud engines.&lt;/p&gt;

&lt;p&gt;ELT is mostly popular with cloud-based services and service providers such as Amazon Web Services, Microsoft Azure and Google Cloud. It is preferred because of its ability to handle and process large data, its flexibility and its ability to be developed faster as compared to an ETL.&lt;/p&gt;

&lt;h4&gt;
  
  
  &lt;strong&gt;&lt;em&gt;How ELT Works&lt;/em&gt;&lt;/strong&gt;
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;&lt;em&gt;Extract(E):&lt;/em&gt;&lt;/strong&gt; Here, data is pulled from various sources &lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;&lt;em&gt;Load(L):&lt;/em&gt;&lt;/strong&gt;The raw data is loaded into the system, whereby sensitive data are either masked, encrypted or dropped.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;&lt;em&gt;Transform(T):&lt;/em&gt;&lt;/strong&gt; It is here where data cleaning and modifications are done using the target system's computing resources.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ft4pktblnoznzu5c53z5g.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ft4pktblnoznzu5c53z5g.png" alt="Image2" width="800" height="434"&gt;&lt;/a&gt;&lt;br&gt;
The diagram above illustrates how ELT works&lt;/p&gt;

&lt;h3&gt;
  
  
  Differences between ETL and ELT
&lt;/h3&gt;

&lt;p&gt;One of the key and noticeable differences between the two processes is the point at which data is transformed and how the warehouse retains the data. ETL transforms data outside the warehouse in a different server, while ELT transform data within the warehouse itself. Also, it should be clear that ETL does not transmit or move raw data into the warehouse, while ELT, on the other hand, moves raw data into the pipeline.&lt;/p&gt;

&lt;p&gt;ELT processes data faster as compared to the ELT process because ETL involves preliminary transformation before loading data into the warehouse, thereby making it difficult to scale, and hence, as the size of the data grows, the performance slows down. On the other hand, ELD loads the data directly into the target warehouse, saving time and easing scalability since transformation is done in parallel.&lt;/p&gt;

&lt;p&gt;Data ingestion in ETL is slowed down as a result of transforming data on a separate server before loading. On the other hand, ELT delivers faster ingestion since the process of loading and transformation can be done simultaneously.&lt;/p&gt;

&lt;p&gt;When it comes to processing unstructured data, ELT is the best since it provides superior processing of structured, semi-structured and unstructured data as compared to ETL, which is best when it comes to structured data only.&lt;/p&gt;

&lt;p&gt;We can firmly confirm that ELT outdoes ETL in many aspects, such as speed, cost, privacy, maintenance, flexibility, volume, and many other aspects and that it is a process that is going to come in handy with the everyday advancements in the data world.&lt;/p&gt;

&lt;h2&gt;
  
  
  Real-World Use Case for ETL
&lt;/h2&gt;

&lt;p&gt;In a world where data is growing constantly and evolving ETL is an efficient way of data handling because of its ability to solve key data management problems by ensuring data accuracy, consistency, and availability, which is key for decision-making.&lt;/p&gt;

&lt;p&gt;ETL enables &lt;strong&gt;&lt;em&gt;real-time data analysis for business insights&lt;/em&gt;&lt;/strong&gt;. Businesses need to accurately make decisions in a dynamic business environment. ETL ensures data is extracted, transformed and loaded as it's generated, allowing businesses to respond to market changes, optimize supply chain and track customer behaviours instantly.&lt;/p&gt;

&lt;p&gt;ETL has facilitated &lt;strong&gt;&lt;em&gt;migration of data from legacy systems to modern platforms&lt;/em&gt;&lt;/strong&gt;. The use of ETL has ensured safe migration of data from one system to another without losing data integrity and consistency.&lt;/p&gt;

&lt;p&gt;ETL process can &lt;strong&gt;&lt;em&gt;integrate and transform customer data from multiple touchpoints&lt;/em&gt;&lt;/strong&gt;. In the case of an e-commerce business, customer data is so valuable, especially when it comes to offering personalized experience.&lt;/p&gt;

&lt;p&gt;The manufacturing sector is also a major user of ETL process, especially when it comes to predictive maintenance to reduce downtime and prevent costly breakdowns. ETL &lt;strong&gt;&lt;em&gt;processes collect and transform data from IoT sensors and machinery to help predict when required maintenance is needed.&lt;/em&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;em&gt;Data governance and compliance&lt;/em&gt;&lt;/strong&gt; is another area where ETL process can come in handy. Institutions and sectors that handle sensitive data, such as healthcare, finance or security sectors, must comply with strict regulatory requirements when it comes to data governance. Through ETL, data is transformed and loaded in compliance with the laid-out regulations, making ETL instrumental when it comes to the implementation of data governance policies and data security.&lt;/p&gt;

&lt;p&gt;We see that ETL, although it is a legacy process it is very crucial when it comes to day-to-day data handling, and it is crucial in ensuring accurate decision-making and ensuring data integrity is maintained.&lt;/p&gt;

&lt;h2&gt;
  
  
  Real-World Use Case for ELT
&lt;/h2&gt;

&lt;p&gt;ELT has emerged as an alternative in modern data architecture and is being adopted by many businesses since it offers much more compared to ETL.&lt;/p&gt;

&lt;p&gt;ELT pipelines let businesses extract customer data from all their channels, ad partners and marketing platforms, load the data into a cloud data warehouse and transform when needed. This helps when it comes to &lt;strong&gt;&lt;em&gt;building a unified customer profile&lt;/em&gt;&lt;/strong&gt;. Unified data enables faster, more profitable decisions, new revenue streams and stronger customer loyalty through personalized experiences.&lt;/p&gt;

&lt;p&gt;Banks, payment processors and other fintech companies use ELT process to &lt;strong&gt;&lt;em&gt;detect fraud and assess risk in real time&lt;/em&gt;&lt;/strong&gt; across millions of transactions. This has helped to avoid scams and protect customers.&lt;/p&gt;

&lt;p&gt;Medium and large enterprises in retail and manufacturing use ELT to &lt;strong&gt;&lt;em&gt;optimize their supply chains and inventory levels across warehouses, stores and distribution channels&lt;/em&gt;&lt;/strong&gt;. This is done through the creation of an efficient supply chain control tower in cloud data platforms.&lt;/p&gt;

&lt;p&gt;The healthcare industry can use ELT to &lt;strong&gt;&lt;em&gt;securely combine structured and unstructured data at scale&lt;/em&gt;&lt;/strong&gt;. This will help the healthcare providers and hospital systems to improve patient outcomes and operational efficiency.&lt;/p&gt;

&lt;p&gt;ELT is tasked with &lt;strong&gt;&lt;em&gt;data migration and consolidation into cloud data warehouses or lake houses&lt;/em&gt;&lt;/strong&gt;. This helps to solve many business problems, especially when it comes to consolidating data from multiple sources and systems, as well as upgrading to a more agile analytics environment.&lt;/p&gt;

&lt;p&gt;Enterprises across multiple industries use ELT-based consolidation of data to create a single source of truth. This results in a scalable, low-maintenance environment that is supportive of advanced analytics.&lt;/p&gt;

&lt;h3&gt;
  
  
  ETL TOOLS
&lt;/h3&gt;

&lt;p&gt;When it comes to data integration, having the right tools makes all the difference. Some of the ETL tools available include:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;&lt;em&gt;Matillion&lt;/em&gt;&lt;/strong&gt; - cloud-based data integration platform with AI function, designed to simplify and accelerate ETL processes.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;em&gt;IBM DataStage&lt;/em&gt;&lt;/strong&gt; - Designed to support data integration across multiple sources and targets.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;em&gt;Informatica PowerCenter&lt;/em&gt;&lt;/strong&gt; - An enterprise-grade ETL platform used by businesses to guarantee robust and efficient data integration across various sources.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;em&gt;Talend Data Fabric&lt;/em&gt;&lt;/strong&gt; - a tool that provides a range of data integration management solutions.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;em&gt;Astera Centerprise&lt;/em&gt;&lt;/strong&gt; - This tool simplifies complex ETL processes with an intuitive, no-code approach. &lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  ELT TOOLS
&lt;/h3&gt;

&lt;p&gt;Unlike ETL, ELT processes leverage the computation power of cloud data warehouses. Here are some of the ELT tools;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;&lt;em&gt;Azure Data Factory&lt;/em&gt;&lt;/strong&gt; - Is a cloud-based data integration service that automates the movement and transformation of data from various sources to destinations.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;em&gt;Google Cloud Dataflow&lt;/em&gt;&lt;/strong&gt; - Is a fully managed stream and batch data processing service that enables users to develop and execute data processing pipelines with ease.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;em&gt;AWS Glue&lt;/em&gt;&lt;/strong&gt; - is a serverless data integration service that can be used for analytics, machine learning and application development.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;em&gt;Rivery&lt;/em&gt;&lt;/strong&gt; - Is a cloud-native ELT platform that automates data integration, transformation and orchestration without the need for infrastructure management.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;em&gt;Airbyte&lt;/em&gt;&lt;/strong&gt; - An open source data integration platform that simplifies ELT process by providing pre-built connectors for seamless data movement across various points.&lt;/li&gt;
&lt;/ul&gt;

&lt;h1&gt;
  
  
  Conclusion
&lt;/h1&gt;

&lt;p&gt;ETL (Extract, Transform, Load) and ELT (Extract, Load, Transform) are two approaches used to move and prepare data for analysis, but they differ mainly in when the transformation happens. In ETL, data is cleaned and structured before it is loaded into a storage system, making it ideal for environments where data quality and consistency are critical from the start. This approach is commonly used in traditional systems where storage and processing power are limited, and only refined data is needed for reporting.&lt;/p&gt;

&lt;p&gt;ELT, on the other hand, loads raw data directly into a data warehouse first and then performs transformations within that system. This method takes advantage of modern cloud platforms that offer strong processing capabilities, allowing teams to store large volumes of data and transform it as needed. ELT is more flexible and faster for data ingestion, making it well-suited for big data analytics, real-time dashboards, and data science work where raw data exploration is important.&lt;/p&gt;

&lt;p&gt;Choosing between ETL and ELT depends on your specific needs. If your priority is strict data quality, control, and working with structured systems, ETL is often the better choice. However, if you need scalability, speed, and flexibility—especially in cloud-based environments—ELT is usually more effective. In many modern setups, organizations combine both approaches to balance control with performance and adaptability.&lt;/p&gt;

</description>
      <category>etl</category>
      <category>elt</category>
      <category>datapipeline</category>
    </item>
    <item>
      <title>FROM SQL TO POWER BI FOR ANALYSIS</title>
      <dc:creator>Shaban Ibrahim</dc:creator>
      <pubDate>Mon, 16 Mar 2026 04:51:17 +0000</pubDate>
      <link>https://dev.to/shabex/from-sql-to-power-bi-for-analysis-10ei</link>
      <guid>https://dev.to/shabex/from-sql-to-power-bi-for-analysis-10ei</guid>
      <description>&lt;h1&gt;
  
  
  Introduction
&lt;/h1&gt;

&lt;p&gt;As a data analyst, you are probably interacting with &lt;strong&gt;Microsoft Power BI&lt;/strong&gt; in your everyday operations, because it is one of the most powerful business intelligence and data visualisation tools ever developed. &lt;strong&gt;Power BI&lt;/strong&gt; is known for its ability to transform raw data and generate meaningful insights, reports and analytics through creating interactive dashboards for consumption by the end users of the data. Most companies and businesses rely on it to analyse trends, monitor performance and support data-driven decisions.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;SQL databases&lt;/strong&gt; play a critical role in storing and managing analytical data in modern organisations. Structured Query Language (SQL) databases are designed to organise large volumes of structured data into tables consisting of rows and columns. This structure makes it easier to store, retrieve, and manipulate data efficiently, which is essential for data analysis and business decision-making.&lt;/p&gt;

&lt;p&gt;SQL databases form the foundation of many modern data systems, including data warehouses and analytics platforms. Databases such as PostgreSQL, MySQL, and Microsoft SQL Server are widely used to store structured analytical data that can later be analysed using statistical tools, machine learning models, or reporting platforms. Power BI has the ability to connect directly to such databases to allow analysts to access real-time data for data visualisation and reporting.&lt;/p&gt;

&lt;h1&gt;
  
  
  Linking Power BI
&lt;/h1&gt;

&lt;h2&gt;
  
  
  1. To a Local PostgreSQL Database
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;You start by opening your Power BI Desktop, as it is the main environment for dashboard and report creation.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;strong&gt;2. Pulling your data from the database of your choice&lt;/strong&gt;&lt;br&gt;
Click 'Get Data', then in the drop-down, you will click "more", which will lead you to another interface where you will choose "database", then choose "PostGreSQL", then connect.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3rblbuefqp17a09k03xo.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3rblbuefqp17a09k03xo.png" alt="Image 01" width="800" height="467"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxy4ot5wybvv3esloxuc9.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxy4ot5wybvv3esloxuc9.png" alt="Image 02" width="800" height="547"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fzqsxeceq2fb62dq3qrb7.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fzqsxeceq2fb62dq3qrb7.png" alt="Image 03" width="800" height="558"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;3. Connecting to PostgreSQL Database.&lt;/strong&gt; &lt;br&gt;
Here, you will be prompted to enter database connection details.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgj2e88az4wq8h8ia9dfq.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgj2e88az4wq8h8ia9dfq.png" alt="Image 04" width="800" height="405"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;In the dialogue box, you add the server_name, which is &lt;strong&gt;&lt;em&gt;localhost:5432&lt;/em&gt;&lt;/strong&gt; (default server name for locally connected PostgreSQL database) and the name of the database you would like to connect to. In my case, the database is &lt;strong&gt;&lt;em&gt;walmart_db&lt;/em&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7p7qq7nt1i1htywrp9x1.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7p7qq7nt1i1htywrp9x1.png" alt="Image 05" width="800" height="481"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;4. Configuration with your database&lt;/strong&gt; &lt;br&gt;
You will be directed to another dialogue table that will be used for configuration purposes.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fncyahuy9ilz40bdr65cj.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fncyahuy9ilz40bdr65cj.png" alt="Image 06" width="800" height="446"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Since you are connecting from your local environment, the default connection details for PostgreSQL are&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight properties"&gt;&lt;code&gt;&lt;span class="py"&gt;Host&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s"&gt;localhost&lt;/span&gt;
&lt;span class="py"&gt;Port&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s"&gt;5432&lt;/span&gt;
&lt;span class="py"&gt;Username&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s"&gt;postgres&lt;/span&gt;
&lt;span class="py"&gt;Database&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s"&gt;postgres&lt;/span&gt;
&lt;span class="py"&gt;Password&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s"&gt;your_password&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fe4zx63m9wbzpnyqyi4sc.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fe4zx63m9wbzpnyqyi4sc.png" alt="Image 07" width="796" height="428"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;5. Loading your data from your database to Power BI&lt;/strong&gt; &lt;br&gt;
After filling in the right credentials and connecting, once connected successfully, a navigator window will appear, displaying the tables available in PostgreSQL. &lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fyrtmrbthcsolupa7t3li.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fyrtmrbthcsolupa7t3li.png" alt="Image 08" width="800" height="622"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Click "Load" to import the selected tables into  Power BI&lt;br&gt;
In our case, we have loaded our table that has more than 9000 rows&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxtmw8m45iven4y15hsd3.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxtmw8m45iven4y15hsd3.png" alt="Image 09" width="800" height="445"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;h2&gt;
  
  
  2. To a Cloud PostgreSQL Database
&lt;/h2&gt;

&lt;p&gt;Modern systems databases are hosted in the cloud environment as opposed to the traditional local environment for management, because it offers flexibility, scalability, and reliability compared to traditional systems. &lt;/p&gt;

&lt;p&gt;For organizations working with large analytical datasets, cloud databases provide a flexible and powerful foundation for data-driven decision-making. It also improves scalability, accessibility, cost efficiency, reliability, and integration with advanced analytics tools.&lt;/p&gt;

&lt;p&gt;One such cloud platform that provides fully managed open-source data infrastructure services is &lt;strong&gt;Aiven&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. Obtain Connection Details&lt;/strong&gt;&lt;br&gt;
On your &lt;strong&gt;Aiven&lt;/strong&gt; page after creating PostgreSQL service, you can retrieve connection details such as &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Host&lt;/li&gt;
&lt;li&gt;Port&lt;/li&gt;
&lt;li&gt;Database Name&lt;/li&gt;
&lt;li&gt;Username&lt;/li&gt;
&lt;li&gt;Password
These details will be used in Power BI for the connection&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F36puorxsq5tzktwk4j96.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F36puorxsq5tzktwk4j96.png" alt="Image 10" width="800" height="428"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. Download SSL Certificate&lt;/strong&gt;&lt;br&gt;
An SSL certificate (Secure Sockets Layer certificate) is a digital certificate that enables encrypted and secure communication between your application and database, protecting credentials and data during transmission.&lt;/p&gt;

&lt;p&gt;From the Aiven dashboard, locate the CA certificate and download it. Save it locally.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Furvlt8as6jkgbfxmj7ww.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Furvlt8as6jkgbfxmj7ww.png" alt="Image 11" width="800" height="431"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;After down loading the CA certificate, you create a folder in your computer and store the CA certificate there and rename it from &lt;code&gt;ca.pem&lt;/code&gt; to &lt;code&gt;ca.crt&lt;/code&gt;. Power BI will automatically detect the &lt;code&gt;ca.crt&lt;/code&gt;, enable SSL and verify server certificate.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;3. Connect to Power BI&lt;/strong&gt;&lt;br&gt;
Click 'Get Data', then in the drop-down, you will click "more", which will lead you to another interface where you will choose "database", then choose "PostGreSQL", then connect.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3rblbuefqp17a09k03xo.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3rblbuefqp17a09k03xo.png" alt="Image 01" width="800" height="467"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxy4ot5wybvv3esloxuc9.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxy4ot5wybvv3esloxuc9.png" alt="Image 02" width="800" height="547"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fzqsxeceq2fb62dq3qrb7.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fzqsxeceq2fb62dq3qrb7.png" alt="Image 03" width="800" height="558"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;3. Connecting to PostgreSQL Database.&lt;/strong&gt; &lt;br&gt;
Here, you will be prompted to enter database connection details.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgj2e88az4wq8h8ia9dfq.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgj2e88az4wq8h8ia9dfq.png" alt="Image 04" width="800" height="405"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;In the dialogue box, you add the server_name, which is &lt;strong&gt;&lt;em&gt;pg-2f14f89-shabsibrah-9c1c.c.aivencloud.com:10116&lt;/em&gt;&lt;/strong&gt; (server name from Aiven and the port number) and the name of the database you would like to connect to. In my case, the database is &lt;strong&gt;&lt;em&gt;luxsales&lt;/em&gt;&lt;/strong&gt; from DBeaver, click &lt;em&gt;OK&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcqb6j6fz05floir4ew4f.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcqb6j6fz05floir4ew4f.png" alt="Image 12" width="800" height="474"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;You will go back to your Aiven and copy the user_name and password, and fill in the dialogue box and click "Connect".&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqij4xlpomqf5rcn7r6fn.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqij4xlpomqf5rcn7r6fn.png" alt="Image 13" width="800" height="428"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fynyor4vi8gl7y4vsdopx.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fynyor4vi8gl7y4vsdopx.png" alt="Image 14" width="800" height="454"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;4. Loading your data from your database to Power BI&lt;/strong&gt; &lt;br&gt;
After filling in the right credentials and authenticating, once it connects successfully, a navigator window will appear, displaying all the tables available in that particular database. Select the tables that you need and load them into Power BI&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2c2twdr9r00nrxq1d8j4.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2c2twdr9r00nrxq1d8j4.png" alt="Image 15" width="800" height="631"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;5. Loaded Tables and Relationships between them&lt;/strong&gt;&lt;br&gt;
Once te tables have been loaded to your Power BI the next step is to carry out data modelling by detecting and establishing relationships that exist between the loaded tables.&lt;/p&gt;

&lt;p&gt;This is first done by understanding what each table communicates&lt;/p&gt;

&lt;p&gt;Customer table --------&amp;gt; Gives information about the customer&lt;br&gt;
Products table --------&amp;gt; Gives information about the products&lt;br&gt;
Invetory table --------&amp;gt; Gives details on the stock levels&lt;br&gt;
Sales table -----------&amp;gt; This is the fact table tells more on sales tranactions.&lt;/p&gt;

&lt;p&gt;Power BI automatically attempts to detect and establish the relationships that exist between the tables such that it connects table using primary key from dimension table to the corresponding foreign key in a facts table or foreign key of another dimension table.&lt;/p&gt;

&lt;p&gt;For examples&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;sales.customer_id -&amp;gt; customers.customer_id&lt;/li&gt;
&lt;li&gt;sales.product_id -&amp;gt; products.product_id&lt;/li&gt;
&lt;li&gt;inventory.product_id -&amp;gt; products.product_id&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The result can be seen on the model view with a diagram displaying the final model created.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fyuoq071n4v5mfw2w8fby.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fyuoq071n4v5mfw2w8fby.png" alt="Image 16" width="800" height="362"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Data Modelling is a very powerful process since it helps in combining data correctly, proper filtering of data across the tables and finally ensuring that accurate reports are built at the end.&lt;/p&gt;
&lt;h2&gt;
  
  
  Power in combining SQL with Power BI
&lt;/h2&gt;

&lt;p&gt;SQL has proven to be one of the most powerful analysis tools to date, it comes in very handy especially when you are dealing with large datasets. This is well completemented with Power BI which is one of the most powerful tools of data visualisation and transformation capabilities.&lt;/p&gt;

&lt;p&gt;With use of proper queries SQL can effectively retrieve data effeciently from very large databases.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="o"&gt;&amp;lt;/&amp;gt;&lt;/span&gt;&lt;span class="k"&gt;SQL&lt;/span&gt;
&lt;span class="k"&gt;Select&lt;/span&gt; &lt;span class="n"&gt;product_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;sum&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;amount&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;sum_rev&lt;/span&gt;
&lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="n"&gt;sales&lt;/span&gt;
&lt;span class="k"&gt;group&lt;/span&gt; &lt;span class="k"&gt;by&lt;/span&gt; &lt;span class="n"&gt;product_id&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;SQL helps analysts filter datasets before loading them into Power BI. Instead of importing an entire database, analysts can use SQL &lt;code&gt;WHERE&lt;/code&gt; clauses to limit the data to a specific period, region, or category&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="o"&gt;&amp;lt;/&amp;gt;&lt;/span&gt;&lt;span class="k"&gt;SQL&lt;/span&gt;
&lt;span class="k"&gt;Select&lt;/span&gt; &lt;span class="n"&gt;gender&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;count&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;customer_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;count_client&lt;/span&gt;
&lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="n"&gt;sales&lt;/span&gt;
&lt;span class="k"&gt;group&lt;/span&gt; &lt;span class="k"&gt;by&lt;/span&gt; &lt;span class="n"&gt;gender&lt;/span&gt;
&lt;span class="k"&gt;where&lt;/span&gt; &lt;span class="n"&gt;gender&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nv"&gt;"Male"&lt;/span&gt; 
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;SQL enables analysts to perform aggregations and calculations directly within the database. Using functions such as &lt;code&gt;SUM&lt;/code&gt;, &lt;code&gt;COUNT&lt;/code&gt;, &lt;code&gt;AVG&lt;/code&gt;, and &lt;code&gt;GROUP BY&lt;/code&gt;, analysts can compute metrics like total sales, average product prices, or the number of customers per region before the data reaches Power BI.&lt;/p&gt;

&lt;p&gt;Additionally, SQL is extremely useful for preparing and transforming data. Analysts frequently need to join multiple tables, clean inconsistent values, or create intermediate datasets. SQL features such as &lt;code&gt;JOIN&lt;/code&gt;, &lt;code&gt;CASE&lt;/code&gt;, &lt;code&gt;CTE&lt;/code&gt; (Common Table Expressions), and subqueries allow analysts to shape the data into a structure that works well with Power BI’s data model.&lt;/p&gt;

&lt;p&gt;Finally, SQL complements Power BI by enabling analysts to build reliable and scalable data pipelines. When data is properly prepared using SQL, the dashboards built in Power BI become simpler, faster, and easier to maintain.&lt;/p&gt;

&lt;p&gt;By combining the power of SQL for data extraction and transformation with visualization capabilities in Microsoft Power BI, analysts can build insightful dashboards that support better business decision-making.&lt;/p&gt;

&lt;h1&gt;
  
  
  Conclusion
&lt;/h1&gt;

&lt;p&gt;While Power BI is a very powerful, most effective business intelligence platform for transforming database to interactive dashboards and reports, connecting it with SQL databases such as PostgreSQL analysts can access large volumes of data and analyse it effeciently.&lt;/p&gt;

</description>
      <category>microsoftpowerbi</category>
      <category>pgsql</category>
      <category>businessintelligence</category>
      <category>database</category>
    </item>
    <item>
      <title>UNDERSTANDING SQL JOIN AND WINDOW FUNCTIONS</title>
      <dc:creator>Shaban Ibrahim</dc:creator>
      <pubDate>Tue, 10 Mar 2026 14:06:32 +0000</pubDate>
      <link>https://dev.to/shabex/understanding-sql-join-and-window-functions-431d</link>
      <guid>https://dev.to/shabex/understanding-sql-join-and-window-functions-431d</guid>
      <description>&lt;h1&gt;
  
  
  INTRODUCTION
&lt;/h1&gt;

&lt;p&gt;SQL is one of the most powerful tools that has stood the test of time when it comes to the world of data. It has proved to be so reliable and flexible enough to work on any relational data. It has many functionalities within it that make data manipulation easy to navigate.&lt;br&gt;
Among the functionalities are the join and window functions.&lt;/p&gt;
&lt;h2&gt;
  
  
  JOIN FUNCTION
&lt;/h2&gt;

&lt;p&gt;This comes in handy when dealing with data spread across multiple tables. The join function is the one used when one wants to analyse data from different tables and perform data analysis or data engineering. It uses related columns from the different normalised tables and combines the rows. It can be used in two or more tables. There are different types of joins, including:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. Inner Join&lt;/strong&gt;&lt;br&gt;
This is one of the most common forms of join that is frequently used. It helps to join two tables based on the rows that have matching values in both tables. This means that if there are values that do not match in the two tables, they will be dropped from the resultant joined tables.&lt;/p&gt;

&lt;p&gt;The syntax of an inner join looks like&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;select&lt;/span&gt; &lt;span class="n"&gt;columns&lt;/span&gt;
&lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="n"&gt;table1&lt;/span&gt;
&lt;span class="k"&gt;inner&lt;/span&gt; &lt;span class="k"&gt;join&lt;/span&gt; &lt;span class="n"&gt;table2&lt;/span&gt;
&lt;span class="k"&gt;on&lt;/span&gt; &lt;span class="n"&gt;table1&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="k"&gt;column&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;table2&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="k"&gt;column&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Assuming you have products and sales tables&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;select&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;product_name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;price&lt;/span&gt;
&lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="n"&gt;products&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;
&lt;span class="k"&gt;join&lt;/span&gt; &lt;span class="n"&gt;sales&lt;/span&gt; &lt;span class="n"&gt;s&lt;/span&gt;
&lt;span class="k"&gt;on&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;product_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;product_id&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;From the above example, you could see that the two tables are relatable by the product_id column in both tables, that make it easy to join the two tables.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Flh4.googleusercontent.com%2FZaF77tpvsDLgQgWHqOdbzJVPDJ2EAmp4OSSmbk0sSFe4TcsS3jtxY7EmyfMQ3ta6WIs7bg_ZvKsUZOHkXETHRltRXC5YNi4brzpchn4HBzq4dThms2jAyU9E2KohfEL7j0fOevT5PeMdOdKEJj24C1Y" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Flh4.googleusercontent.com%2FZaF77tpvsDLgQgWHqOdbzJVPDJ2EAmp4OSSmbk0sSFe4TcsS3jtxY7EmyfMQ3ta6WIs7bg_ZvKsUZOHkXETHRltRXC5YNi4brzpchn4HBzq4dThms2jAyU9E2KohfEL7j0fOevT5PeMdOdKEJj24C1Y" alt="inner join"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. Left Join&lt;/strong&gt;&lt;br&gt;
Also called &lt;strong&gt;&lt;em&gt;Left Outer Join&lt;/em&gt;&lt;/strong&gt;, this one returns every row on the left table and only the matching columns on the right table. This means there will be null values returned from the right tables against the unmatched and unreturned values on the left table.&lt;/p&gt;

&lt;p&gt;The syntax of a left join looks like&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;select&lt;/span&gt; &lt;span class="n"&gt;columns&lt;/span&gt;
&lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="n"&gt;table1&lt;/span&gt;
&lt;span class="k"&gt;left&lt;/span&gt; &lt;span class="k"&gt;join&lt;/span&gt; &lt;span class="n"&gt;table2&lt;/span&gt;
&lt;span class="k"&gt;on&lt;/span&gt; &lt;span class="n"&gt;table1&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="k"&gt;column&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;table2&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="k"&gt;column&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Assuming you have products and sales tables&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;select&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;product_name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;price&lt;/span&gt;
&lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="n"&gt;products&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;
&lt;span class="k"&gt;left&lt;/span&gt; &lt;span class="k"&gt;join&lt;/span&gt; &lt;span class="n"&gt;sales&lt;/span&gt; &lt;span class="n"&gt;s&lt;/span&gt;
&lt;span class="k"&gt;on&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;product_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;product_id&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;From the above examples, all the rows from the products table will be returned, and on matching rows(product_id) from the sales department will be returned. The excess rows from the products table will have null values in the sales department.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Flh6.googleusercontent.com%2FI7BWNmU-rtwMozqKzbWBRgnk2sIv1a1FGElwOheS4ybu8o8erqvNR8Z57CsHndxMpdKlUq8jqaDqyUt7pR775-lSupnm_Cqe5nyncxH3eh0MTf3IA2cWz_8rnMWyDXFIrf-z_MM18zMVs_rQuiIigOw" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Flh6.googleusercontent.com%2FI7BWNmU-rtwMozqKzbWBRgnk2sIv1a1FGElwOheS4ybu8o8erqvNR8Z57CsHndxMpdKlUq8jqaDqyUt7pR775-lSupnm_Cqe5nyncxH3eh0MTf3IA2cWz_8rnMWyDXFIrf-z_MM18zMVs_rQuiIigOw" alt="left Join"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. Right Join&lt;/strong&gt;&lt;br&gt;
Also called &lt;strong&gt;&lt;em&gt;Right Outer Join&lt;/em&gt;&lt;/strong&gt;, this one returns every row on the right table and only the matching columns on the left table. This means there will be null values returned from the left tables against the unmatched and unreturned values on the right table.&lt;/p&gt;

&lt;p&gt;The syntax of a right join looks like&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;select&lt;/span&gt; &lt;span class="n"&gt;columns&lt;/span&gt;
&lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="n"&gt;table1&lt;/span&gt;
&lt;span class="k"&gt;right&lt;/span&gt; &lt;span class="k"&gt;join&lt;/span&gt; &lt;span class="n"&gt;table2&lt;/span&gt;
&lt;span class="k"&gt;on&lt;/span&gt; &lt;span class="n"&gt;table1&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="k"&gt;column&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;table2&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="k"&gt;column&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Assuming you have products and sales tables&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;select&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;product_name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;price&lt;/span&gt;
&lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="n"&gt;products&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;
&lt;span class="k"&gt;right&lt;/span&gt; &lt;span class="k"&gt;join&lt;/span&gt; &lt;span class="n"&gt;sales&lt;/span&gt; &lt;span class="n"&gt;s&lt;/span&gt;
&lt;span class="k"&gt;on&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;product_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;product_id&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;From the above examples, all the rows from the products table will be returned, and on matching rows(product_id) from the sales department will be returned. The excess rows from the products table will have null values in the sales department.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Flh4.googleusercontent.com%2FklmC4Kg_LYE9Ic7TO7Yw8IBJMHAx9GBeBm2RU2w3lUM_-TtHUfqYOXgDzoQMCGFLmBVhSeaauM9gUhQUwbt9IZvW7zy1TwLGGONoq8Kn0rPeZmR4txw2LfBctj2TNCPAYUrxe3IgWudpI5FOZ8EWrX-yMa-aVR12tNLX9e_4ggagP1lYGwxA1w-B" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Flh4.googleusercontent.com%2FklmC4Kg_LYE9Ic7TO7Yw8IBJMHAx9GBeBm2RU2w3lUM_-TtHUfqYOXgDzoQMCGFLmBVhSeaauM9gUhQUwbt9IZvW7zy1TwLGGONoq8Kn0rPeZmR4txw2LfBctj2TNCPAYUrxe3IgWudpI5FOZ8EWrX-yMa-aVR12tNLX9e_4ggagP1lYGwxA1w-B" alt="right join"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;3. Full Outer Join&lt;/strong&gt;&lt;br&gt;
This one combines both the &lt;strong&gt;&lt;em&gt;Left outer join&lt;/em&gt;&lt;/strong&gt; and the &lt;strong&gt;&lt;em&gt;Right outer join&lt;/em&gt;&lt;/strong&gt;. This one shows all the rows from both tables, including unmatched rows from both tables.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;select&lt;/span&gt; &lt;span class="n"&gt;columns&lt;/span&gt;
&lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="n"&gt;table1&lt;/span&gt;
&lt;span class="k"&gt;full&lt;/span&gt; &lt;span class="k"&gt;outer&lt;/span&gt; &lt;span class="k"&gt;join&lt;/span&gt; &lt;span class="n"&gt;table2&lt;/span&gt;
&lt;span class="k"&gt;on&lt;/span&gt; &lt;span class="n"&gt;table1&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="k"&gt;column&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;table2&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="k"&gt;column&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Assuming you have products and sales tables&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;select&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;product_name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;price&lt;/span&gt;
&lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="n"&gt;products&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;
&lt;span class="k"&gt;full&lt;/span&gt; &lt;span class="k"&gt;outer&lt;/span&gt; &lt;span class="k"&gt;join&lt;/span&gt; &lt;span class="n"&gt;sales&lt;/span&gt; &lt;span class="n"&gt;s&lt;/span&gt;
&lt;span class="k"&gt;on&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;product_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;product_id&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fe621krcwr3f1bvivzspj.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fe621krcwr3f1bvivzspj.png" alt="Full Outer Join"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;5. Cross Join&lt;/strong&gt;&lt;br&gt;
This one returns every row from the first table combined with every row from the second, i.e., a cartesian product. Assuming you have 3 customers and 3 orders, the resulting table will have 9 rows.&lt;/p&gt;

&lt;p&gt;The syntax of a cross join looks like&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;select&lt;/span&gt; &lt;span class="n"&gt;columns&lt;/span&gt;
&lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="n"&gt;table1&lt;/span&gt;
&lt;span class="k"&gt;cross&lt;/span&gt; &lt;span class="k"&gt;join&lt;/span&gt; &lt;span class="n"&gt;table2&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Assuming you have products and sales tables&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;select&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;product_name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;price&lt;/span&gt;
&lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="n"&gt;products&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;
&lt;span class="k"&gt;cross&lt;/span&gt; &lt;span class="k"&gt;join&lt;/span&gt; &lt;span class="n"&gt;sales&lt;/span&gt; &lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fencrypted-tbn0.gstatic.com%2Fimages%3Fq%3Dtbn%3AANd9GcQp7N5YVLgNPJkKIKnZe1ip8KYvYepon-IDww%26s" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fencrypted-tbn0.gstatic.com%2Fimages%3Fq%3Dtbn%3AANd9GcQp7N5YVLgNPJkKIKnZe1ip8KYvYepon-IDww%26s" alt="cross join"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;6. Self Join&lt;/strong&gt;&lt;br&gt;
This is where the table is joined to itself; this is mostly used when one wants to compare rows within the same table. For this to be effective, table aliases are used to treat the same table as two separate tables in a query.&lt;/p&gt;

&lt;p&gt;The syntax of a self-join looks like&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;select&lt;/span&gt; &lt;span class="n"&gt;columns&lt;/span&gt;
&lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="n"&gt;table1&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;A&lt;/span&gt;
&lt;span class="k"&gt;self&lt;/span&gt; &lt;span class="k"&gt;join&lt;/span&gt; &lt;span class="n"&gt;table1&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;B&lt;/span&gt;
&lt;span class="k"&gt;on&lt;/span&gt; &lt;span class="n"&gt;A&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="k"&gt;column&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;B&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="k"&gt;column&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;A and B are aliases representing the same table1&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;select&lt;/span&gt; &lt;span class="n"&gt;E&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;M&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;
&lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="n"&gt;Employees&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;E&lt;/span&gt;
&lt;span class="k"&gt;self&lt;/span&gt; &lt;span class="k"&gt;join&lt;/span&gt; &lt;span class="n"&gt;Employees&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;M&lt;/span&gt;
&lt;span class="k"&gt;on&lt;/span&gt; &lt;span class="n"&gt;E&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;manager_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;M&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;employee_id&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Faouno30h4ei7pexyvaxp.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Faouno30h4ei7pexyvaxp.jpg" alt="self join"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;7 Natural Join&lt;/strong&gt;&lt;br&gt;
This is a type of join that combines two tables with the same name and compatible data types. This one does not need to join using a condition &lt;em&gt;ON&lt;/em&gt;  because it automatically detects common columns and uses them to match rows.&lt;/p&gt;

&lt;p&gt;The syntax of a self-join looks like&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;select&lt;/span&gt; &lt;span class="n"&gt;columns&lt;/span&gt;
&lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="n"&gt;table1&lt;/span&gt; 
&lt;span class="k"&gt;natural&lt;/span&gt; &lt;span class="k"&gt;join&lt;/span&gt; &lt;span class="n"&gt;table2&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Assuming you have products and sales tables&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;select&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;
&lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="n"&gt;products&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;
&lt;span class="k"&gt;natural&lt;/span&gt; &lt;span class="k"&gt;join&lt;/span&gt; &lt;span class="n"&gt;sales&lt;/span&gt; &lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffxaza8sbqcesd806t07f.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffxaza8sbqcesd806t07f.png" alt="Natural Joins"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The downside of natural joins is that if the tables have common multiple columns with the same name, SQL will join using all the columns, producing unexpected results.&lt;/p&gt;

&lt;h2&gt;
  
  
  Window Functions
&lt;/h2&gt;

&lt;p&gt;These are functions that perform calculations across a set of rows related to the current rows without grouping them into a single result. They return results for each row, unlike the aggregate functions, such as &lt;em&gt;sum()&lt;/em&gt; or &lt;em&gt;avg()&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;The window functions use &lt;strong&gt;&lt;em&gt;over()&lt;/em&gt;&lt;/strong&gt; clause to define the windows or a set of rows.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;select&lt;/span&gt; &lt;span class="n"&gt;emp_name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
       &lt;span class="n"&gt;department&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
       &lt;span class="n"&gt;salary&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
       &lt;span class="k"&gt;avg&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;salary&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="n"&gt;over&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;partition&lt;/span&gt; &lt;span class="k"&gt;by&lt;/span&gt; &lt;span class="n"&gt;department&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;avg_department_salary&lt;/span&gt;
&lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="n"&gt;employees&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This one calculates the average salary for each department while returning the name of each employee.&lt;/p&gt;

&lt;p&gt;Among the common window functions are &lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. Row_Number()&lt;/strong&gt;&lt;br&gt;
This one assigns a unique number to each row starting from 1, following the specified order by &lt;strong&gt;&lt;em&gt;order by&lt;/em&gt;&lt;/strong&gt; clause.&lt;br&gt;
In the case of &lt;strong&gt;&lt;em&gt;partition by&lt;/em&gt;&lt;/strong&gt;, the starting number will reset on each new partition.&lt;/p&gt;

&lt;p&gt;when using &lt;strong&gt;&lt;em&gt;row_number()&lt;/em&gt;&lt;/strong&gt; without resetting the row number&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;select&lt;/span&gt; 
      &lt;span class="n"&gt;order_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="n"&gt;product_name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="n"&gt;price&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="n"&gt;row_number&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="n"&gt;over&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;order&lt;/span&gt; &lt;span class="k"&gt;by&lt;/span&gt; &lt;span class="n"&gt;price&lt;/span&gt; &lt;span class="k"&gt;desc&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;row_num&lt;/span&gt;
&lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="n"&gt;orders&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;when using &lt;strong&gt;&lt;em&gt;row_number()&lt;/em&gt;&lt;/strong&gt; while resetting the row number&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;select&lt;/span&gt; 
      &lt;span class="n"&gt;order_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="n"&gt;product_name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="n"&gt;price&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="n"&gt;row_number&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="n"&gt;over&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;partition&lt;/span&gt; &lt;span class="k"&gt;by&lt;/span&gt; &lt;span class="n"&gt;order_id&lt;/span&gt; &lt;span class="k"&gt;order&lt;/span&gt; &lt;span class="k"&gt;by&lt;/span&gt; &lt;span class="n"&gt;price&lt;/span&gt; &lt;span class="k"&gt;desc&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;row_num&lt;/span&gt;
&lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="n"&gt;orders&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;2. Rank()&lt;/strong&gt;&lt;br&gt;
Assigns ranks to rows but allows gaps when there are ties, for example, whenever ranking say order_id based on prices, if the first two order_ids have the same price will rank 1, the next order will be ranked 3rd.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;select&lt;/span&gt; 
      &lt;span class="n"&gt;order_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="n"&gt;product_name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="n"&gt;price&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="n"&gt;rank&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="n"&gt;over&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;order&lt;/span&gt; &lt;span class="k"&gt;by&lt;/span&gt; &lt;span class="n"&gt;price&lt;/span&gt; &lt;span class="k"&gt;desc&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;order_rank&lt;/span&gt;
&lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="n"&gt;orders&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;3. Dense_Rank()&lt;/strong&gt;&lt;br&gt;
As opposed to &lt;strong&gt;&lt;em&gt;Rank()&lt;/em&gt;&lt;/strong&gt;, this one does not allow gaps between ranks even if there are ties.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;select&lt;/span&gt; 
      &lt;span class="n"&gt;order_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="n"&gt;product_name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="n"&gt;price&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="n"&gt;dense_rank&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="n"&gt;over&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;order&lt;/span&gt; &lt;span class="k"&gt;by&lt;/span&gt; &lt;span class="n"&gt;price&lt;/span&gt; &lt;span class="k"&gt;desc&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;order_rank&lt;/span&gt;
&lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="n"&gt;orders&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;4. Lead()&lt;/strong&gt;&lt;br&gt;
This one is used to access the next rows in a particular table, useful for comparing or predicting the next period&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;select&lt;/span&gt;
      &lt;span class="nb"&gt;date&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="n"&gt;sale_amount&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="n"&gt;lead&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;date&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="n"&gt;over&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;order&lt;/span&gt; &lt;span class="k"&gt;by&lt;/span&gt; &lt;span class="nb"&gt;date&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;next_day&lt;/span&gt;
&lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="n"&gt;sale&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;4. Lag()&lt;/strong&gt;&lt;br&gt;
This one is used to access the previous row values in a particular table, useful for comparing with the previous period&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;select&lt;/span&gt;
      &lt;span class="nb"&gt;date&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="n"&gt;sale_amount&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="n"&gt;lag&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;date&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="n"&gt;over&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;order&lt;/span&gt; &lt;span class="k"&gt;by&lt;/span&gt; &lt;span class="nb"&gt;date&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;previous_day&lt;/span&gt;
&lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="n"&gt;sale&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;5. Ntile()&lt;/strong&gt;&lt;br&gt;
This one is mostly used when dividing a result into a specific number of equal groups. Each row will be assigned a specific group number based on the order specified in the query. It is commonly used when categorizing groups such as &lt;strong&gt;&lt;em&gt;quartile, percentiles or rankings.&lt;/em&gt;&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;select&lt;/span&gt; 
     &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; 
     &lt;span class="n"&gt;salary&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
     &lt;span class="n"&gt;ntile&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="n"&gt;over&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;order&lt;/span&gt; &lt;span class="k"&gt;by&lt;/span&gt; &lt;span class="n"&gt;salary&lt;/span&gt; &lt;span class="k"&gt;desc&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;salary_group&lt;/span&gt;
&lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="n"&gt;employee&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h1&gt;
  
  
  Conclusion
&lt;/h1&gt;

&lt;p&gt;SQL joins and window functions are powerful tools that enable analysts to work efficiently with complex datasets. Joins allow us to bring together data stored across multiple tables, while window functions provide deeper analytical capabilities without losing row-level detail. Mastering both techniques significantly improves the ability to analyse, transform, and derive insights from data.&lt;/p&gt;

</description>
      <category>sql</category>
      <category>postgressql</category>
      <category>joins</category>
      <category>windowfunctions</category>
    </item>
    <item>
      <title>ANALYSTS TURNING MESSY DATA INTO ACTIONABLE INSIGHT</title>
      <dc:creator>Shaban Ibrahim</dc:creator>
      <pubDate>Sun, 08 Feb 2026 07:29:56 +0000</pubDate>
      <link>https://dev.to/shabex/analysts-turning-messy-data-into-actionable-insight-3d99</link>
      <guid>https://dev.to/shabex/analysts-turning-messy-data-into-actionable-insight-3d99</guid>
      <description>&lt;h1&gt;
  
  
  Introduction
&lt;/h1&gt;

&lt;p&gt;Have you ever come across a fancy, attractive and well-packaged dashboard in your company, media, internet or even some presentation? They might have been sales dashboards, Attrition dashboards, customer churn dashboard or even financial reporting dashboards. At first, you are curious to understand what the dashboard is trying to communicate to its users, and then you become more curious about how the dashboard was created and who created it.&lt;br&gt;
As a junior data analyst or an enthusiast in the world of data, you are obviously curious and anxious to understand what the life of an analyst is, and their daily routine. Well, at least 75% of the work of a data analyst is to clean messy data.&lt;/p&gt;
&lt;h2&gt;
  
  
  Messy Data
&lt;/h2&gt;

&lt;p&gt;It is normal and very okay for a data analyst to first deal with messy data before continuing with other analysis tasks. Thismight be messy in that it is full of inconsistencies, duplicates, incomplete or even spread across different sources that are not compatible. For example, you are dealing with sales figures from different departments, regions using different naming conventions, dates stored as texts, currencies not standardised, and a missing key field, such as ID. As an analyst, you need to address the messy data first for the accuracy of your analysis and to avoid wrong and misleading insights.&lt;/p&gt;
&lt;h2&gt;
  
  
  What to do as an analyst
&lt;/h2&gt;

&lt;p&gt;As the data analyst that you are, you need to start by understanding the context of the data and the data itself. Understand the sources, the users and the question you are to answer as an analyst. Understanding some of these concepts will guide you in the process of cleaning your data for correct insights and accurate analysis. Note that not all inconsistencies should bother you; what matters most are those that will inform your final analysis and make it accurate.&lt;/p&gt;
&lt;h2&gt;
  
  
  PowerQuery
&lt;/h2&gt;

&lt;p&gt;Now you might be asking yourself how data cleaning is done and how cumbersome the process is. One thing as an Analyst, you must have come across is &lt;strong&gt;&lt;em&gt;PowerQuery&lt;/em&gt;&lt;/strong&gt;. As a data analyst, understanding Power Query should be mandatory because that is where most of the data analyst's time is spent. &lt;br&gt;
PowerQuery is so powerful in that it is used for:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1- Cleaning and Transforming of Data&lt;/strong&gt;&lt;br&gt;
It is a very powerful tool that is used for cleaning, transforming anf combining data in a structured and repeatable way.&lt;br&gt;
It is at the PowerQuery where tasks such as removal of duplicates, fixing data types, handling missing values and standardisation of text fields are done. Here, you can also merge data from different sources.&lt;br&gt;
On the same PowerQuery tasks, such as adding, removing columns and adding new columns to a data set are done. And the good thing about this is that every activity carried out within the PowerQuery is recorded for reference and also to easily track where a mistake was made during the cleaning and transformation stage.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2- Data Modelling&lt;/strong&gt;&lt;br&gt;
 After the data has been cleaned, the data set now proceeds to the next stage of analysis, a very crucial phase for that matter, data modelling. Data Modelling in &lt;strong&gt;&lt;em&gt;Power BI&lt;/em&gt;&lt;/strong&gt; is where relationships are defined, especially when working with more than one data table. The data tables are organised in a way that clearly shows how the business operates. Here is where the analyst chooses between a star schema and a snowflake schema, but for simplicity and efficiency, most analysts prefer the former to the latter. &lt;br&gt;
As an analyst, keep in mind that a well-defined data model is going to make analysis easier, faster and more reliable. So the analyst should be keen on the relationship direction, cardinality and granularity to ensure that the model answers the business question accurately.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;3-Using DAX&lt;/strong&gt;&lt;br&gt;
&lt;strong&gt;&lt;em&gt;DAX (Data Analysis Expression)&lt;/em&gt;&lt;/strong&gt;. It is like a formula language used in Power BI and Power Pivot to create calculations on your data, kind of like Excel formulas, but on steroids. It is used by analysts to turn structured data into meaningful metrics.&lt;/p&gt;

&lt;p&gt;DAX is very useful when calculating measures such as total sales, growth rates, rolling averages, periodical performance comparison e.t.c&lt;br&gt;
It also comes in handy when creating tables and columns in your PowerQuery&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;DAX for Calculated Measures

Total Sales = SUM(Sales[Amount])

Total Sales LY =
CALCULATE(
    [Total Sales],
    SAMEPERIODLASTYEAR(DimDate[Date])
)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;DAX for Calculated Column

Profit Status =
IF(
    Sales[Profit] &amp;gt; 0,
    "Profit",
    "Loss"
)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;DAX for Calculated Table

DimDate =
ADDCOLUMNS(
    CALENDAR (DATE(2018,1,1), DATE(2026,12,31)),
    "Year", YEAR([Date]),
    "Month", FORMAT([Date], "MMMM"),
    "Month No", MONTH([Date]),
    "YearMonth", FORMAT([Date], "YYYY-MM")
)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;DAX can be very powerful if the analyst understands the dynamism that exists between the row context and filter context in order to achieve accuracy and efficiency. DAX is more of a translation of business needs into calculations that update dynamically and consistently without breaking across the entire report.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;4- Visualisation&lt;/strong&gt;&lt;br&gt;
After cleaning the data, modelling and using DAX to make sense out of your data, as an analyst, you take the next step of your analysis journey, which is visualisation. Here is where you design your dashboard into a communication tool that answers specific questions. The analyst is guided by the business need and the question he intends to answer using the dashboard.&lt;/p&gt;

&lt;p&gt;A well-built and effective dashboard highlights Key Performance Indicators (KPI's), trends over time, and makes it easy to identify outliers. Here, the analyst should understand the type of visualisation to be used at a particular time to answer a specific question, for instance, line charts for trends, bar charts for comparisons, and tables for details.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;5- Generating Insights&lt;/strong&gt; &lt;br&gt;
Analyst goes through the entire process, goes beyond dashboards to explain what the data is saying, answers the questions, identifies risks and opportunities, and makes recommendations.&lt;/p&gt;

&lt;p&gt;Integrating dashboards into an analyst's daily workflows, team meetings, performance reviews and operational check-ins makes them more of an accountability tool rather than just a reporting tool.&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;Through careful data cleaning and transformation, modelling, relevant and accurate DAX, and intentional dashboard designs analyst transforms chaos into clarity, turning messy data into an insightful, well-organised dashboard. This helps inform the present and shape the future.&lt;br&gt;
When done purposefully, the users see more answers and direction rather than seeing messy data, complex DAX formulas and technical models.&lt;/p&gt;

&lt;p&gt;That is how PowerQuery give the data analysts the powers he enjoys. &lt;/p&gt;

</description>
      <category>dax</category>
      <category>powerbi</category>
      <category>dashboard</category>
    </item>
    <item>
      <title>DATA MODELLING IN POWER BI</title>
      <dc:creator>Shaban Ibrahim</dc:creator>
      <pubDate>Sun, 01 Feb 2026 10:13:36 +0000</pubDate>
      <link>https://dev.to/shabex/data-modelling-in-power-bi-11h4</link>
      <guid>https://dev.to/shabex/data-modelling-in-power-bi-11h4</guid>
      <description>&lt;h2&gt;
  
  
  Data Modelling in Power BI
&lt;/h2&gt;

&lt;p&gt;For any successful solution when it comes to visualisation and analysis using &lt;strong&gt;&lt;em&gt;Power BI&lt;/em&gt;&lt;/strong&gt;, the user has to be well versed with &lt;em&gt;Data Modelling&lt;/em&gt; and understand how to navigate around it because that is the core foundation. The model determines a lot, especially when it comes to the outcome of the analysis in terms of performance, usability, DAX complexity and correctness of the results. &lt;/p&gt;

&lt;h2&gt;
  
  
  Facts and Dimensional Tables
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Fact Tables&lt;/strong&gt;&lt;br&gt;
These can be referred to as master tables; they store measurable and quantitative data. They contain numeric columns and mostly have the largest number of rows compared to the dimension tables, and have foreign keys that connect to the dimension tables. In cases where there are no dimension tables and an analyst wants to do a detailed analysis, they can use the fact table to generate a dimension table or tables.&lt;/p&gt;

&lt;p&gt;Examples of fact tables are the sales table, the transaction table, the inventory movement, and the customer table.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Dimension Tables&lt;/strong&gt;&lt;br&gt;
These tables provide context and detailed descriptions for the fact tables; they can contain textual or categorical data. The primary keys of dimension tables are the ones that form foreign keys in the fact table. They have fewer rows compared to the fact tables, and most of their attributes are used for filtering, grouping and slicing.&lt;/p&gt;

&lt;p&gt;An example of a dimension table includes: a date table, a geographic table, and a products table.&lt;/p&gt;
&lt;h2&gt;
  
  
  Star Schema
&lt;/h2&gt;

&lt;p&gt;A start schema can be described as a data modelling pattern where a fact table is connected to multiple dimension tables, and the dimension tables are &lt;em&gt;NOT&lt;/em&gt; connected to each other. A start schema is characterised by a single fact table, a one-to-many relationship from the dimension tables to the fact tables and denormalised dimension tables.&lt;/p&gt;

&lt;p&gt;For Examples a model might have FactSales, DimDate, DimCustomer, DimProduct e.t.c&lt;/p&gt;

&lt;p&gt;The start schema is commonly used in Power BI because it simplifies the model understanding, can predict filter behaviour, has simple and more reliable DAX calculations and reduces the risks of ambiguity.&lt;/p&gt;

&lt;p&gt;This is a recommended and preferred modelling approach.&lt;/p&gt;
&lt;h2&gt;
  
  
  Snowflake Schema
&lt;/h2&gt;

&lt;p&gt;This is a modelling pattern that is the mutation of the star schema and has its variations, where dimension tables are normalised into multiple related tables.&lt;/p&gt;

&lt;p&gt;The snowflake schema is characterised by the fact that, apart from the dimension tables connecting to the fact table alone, they are connected to other sub-dimension tables, making the structure hierarchical in nature.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;FactSales --&amp;gt; DimProducts
DimProducts --&amp;gt; DimProductsCategory
DimProductCategory --&amp;gt; DimProductGroup
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The snowflake schema is less preferred because it has a more complex relationship chain, making the model more complex, harder to understand and maintain, might degrade performance and makes the DAX calculation complex as a result of indirect filter paths.&lt;/p&gt;

&lt;p&gt;The snowflake schema dimensions are usually flattened into a single denormalised dimension, i.e., start schema, to improve performance and usability.&lt;/p&gt;

&lt;h2&gt;
  
  
  Relationship in Power BI
&lt;/h2&gt;

&lt;p&gt;A relationship connects tables so Power BI knows how data in one table relates to data in another. Without relationships, Power BI treats tables as unrelated. Power Bi supports several types of relationships&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. Cardinality&lt;/strong&gt;&lt;br&gt;
It defines how rows match between tables:&lt;br&gt;
The Include :&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;One-to-Many (most common and recommended)&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Many-to-One&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;One-to-One&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Many-to-Many&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;2. Active and Inactive Relationships&lt;/strong&gt;&lt;br&gt;
Only one active relationship can exist between the two tables at a time, and one can activate an inactive relationship by a DAX syntax &lt;code&gt;USEREALTIONSHIP&lt;/code&gt;&lt;br&gt;
Excessive use of inactive relationships can increase the risk of errors and the complexity of the model.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Active → solid line
Inactive → dotted line
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;3. Cross Filter Direction&lt;/strong&gt;&lt;br&gt;
This one controls the flow of the filter&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Single (Recommended)&lt;br&gt;
Filters go from dimension → fact&lt;br&gt;
Example: Customers → Sales&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Both&lt;br&gt;
Filters flow both ways, and this can cause ambiguity&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Importance of a good Data Model
&lt;/h2&gt;

&lt;p&gt;A good data model goes a long way to ensure optimal performance and critical reporting, especially when decision-making is needed. The good model will first of all reduce complexity, especially when joining using start schema, having fewer and necessary tables would lead to faster queries, and denormalisation improves compression. &lt;/p&gt;

&lt;p&gt;A good data model also goes along way to ensure accurate reporting by ensuring clear relationships and dimensions are established, which helps to ensure filter propagation and consistent slicing and grouping. This also helps to reduce DAX complexities and clean measures that are easy to read, maintain and understand.&lt;/p&gt;

&lt;p&gt;The downside of a poor model is slow visuals, timeouts and excessive resource utilisation, misleading calculations and inconsistent results across visuals.&lt;/p&gt;

&lt;p&gt;To achieve better and more accurate results, always stick to these principles:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Prefer star schema over snowflake schema&lt;/li&gt;
&lt;li&gt;Separate facts and dimension tables clearly.&lt;/li&gt;
&lt;li&gt;Use a one-to-many relationship&lt;/li&gt;
&lt;li&gt;Keep filter direction singular where possible.&lt;/li&gt;
&lt;li&gt;Flat dimension for reporting&lt;/li&gt;
&lt;li&gt;Design the model before writing DAX&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;When it comes to Power BI, Data Modelling is a primary task since it forms the foundation of successful analytics. Being conversant with the schemas, properly defining facts and dimension tables, and proper establishment of relationships goes a great length to impact performance, accuracy and usability of the model. BI Developers can build robust, scalable and reliable reports that can deliver trustworthy insights if they stick to some of the principles we have mentioned.&lt;/p&gt;

</description>
      <category>powerbi</category>
      <category>starschema</category>
      <category>snowflakeschema</category>
      <category>datamodelling</category>
    </item>
    <item>
      <title>LINUX FOR A ROOKIE DATA ENGINEERING STUDENT</title>
      <dc:creator>Shaban Ibrahim</dc:creator>
      <pubDate>Sun, 25 Jan 2026 19:09:53 +0000</pubDate>
      <link>https://dev.to/shabex/linux-for-a-rookie-data-engineering-student-255h</link>
      <guid>https://dev.to/shabex/linux-for-a-rookie-data-engineering-student-255h</guid>
      <description>&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fn2xpttisw3l5fz2dqddd.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fn2xpttisw3l5fz2dqddd.jpg" alt="Cover page" width="620" height="388"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Introduction
&lt;/h2&gt;

&lt;p&gt;As a student of &lt;strong&gt;&lt;em&gt;Data Engineering&lt;/em&gt;&lt;/strong&gt;, learning and understanding the fundamentals of &lt;strong&gt;&lt;em&gt;Linux&lt;/em&gt;&lt;/strong&gt; is a &lt;strong&gt;MUST&lt;/strong&gt;. As a matter of fact, for one to smoothly learn and grow in the field of &lt;strong&gt;Data Engineering&lt;/strong&gt; they have to be good at &lt;strong&gt;Linux&lt;/strong&gt;. &lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What is Linux&lt;/strong&gt;&lt;br&gt;
&lt;strong&gt;Linux&lt;/strong&gt; is an open-source operating system mostly used in servers, cloud platforms and data systems to:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Run applications and services.&lt;/li&gt;
&lt;li&gt;Process and manage large amounts of data&lt;/li&gt;
&lt;li&gt;Host websites and backend systems&lt;/li&gt;
&lt;li&gt;Automate tasks and workflows (using scripts and schedulers)&lt;/li&gt;
&lt;li&gt;Support cloud infrastructure (virtual machines, containers)&lt;/li&gt;
&lt;li&gt;Ensure stability, security, and high performance for systems that must run 24/7&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In short, &lt;em&gt;Linux&lt;/em&gt; provides a secure and reliable environment to efficiently and continuously run applications, data pipelines, and cloud services. &lt;/p&gt;
&lt;h2&gt;
  
  
  Why Linux For Data Engineers
&lt;/h2&gt;

&lt;p&gt;Most of the core daily operations of a &lt;strong&gt;Data Engineer (DE)&lt;/strong&gt; are carried out on &lt;em&gt;Linux&lt;/em&gt; as most of the Data Systems run on it.&lt;br&gt;
These might include operations like&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. Running Data Pipeline&lt;/strong&gt;&lt;br&gt;
Data Pipelines such as ETL/ELT are usually handled on &lt;em&gt;Linux&lt;/em&gt; servers, which include ingesting data from APIs, processing large files, transforming data using &lt;strong&gt;Python&lt;/strong&gt; or &lt;strong&gt;Spark&lt;/strong&gt; and also loading data into data warehouses.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. Automation and Scheduling&lt;/strong&gt;&lt;br&gt;
With &lt;em&gt;Linux&lt;/em&gt; tools such as &lt;code&gt;cron&lt;/code&gt;, you can schedule jobs and use &lt;code&gt;bash scripts&lt;/code&gt; to automate tasks e.g weekly logs cleaning, archive data periodically and run scripts on schedules that have been set.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;3. Handling Big Data&lt;/strong&gt;&lt;br&gt;
To handle large data, you need to have frameworks that run only on &lt;em&gt;Linux&lt;/em&gt;, such as &lt;strong&gt;Hadoop&lt;/strong&gt; for distributed storage and processing, &lt;strong&gt;Spark&lt;/strong&gt; for fast processing of large data, &lt;strong&gt;Kafka&lt;/strong&gt; for streaming the data and &lt;strong&gt;Airflow&lt;/strong&gt; for workflow orchestration which is the process of organizing, scheduling, and managing multiple tasks so they run in the correct order and at the right time and with complete reliability from start to finish.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;4. Working with Cloud Infrastructure&lt;/strong&gt;&lt;br&gt;
Most of the cloud infrastructures that run on major cloud platforms such as &lt;strong&gt;Amazon Web Services (AWS)&lt;/strong&gt;, &lt;strong&gt;Google Cloud Platform (GCP)&lt;/strong&gt;, and &lt;strong&gt;Microsoft Azure&lt;/strong&gt; offer &lt;em&gt;Linux&lt;/em&gt; run infrastructure such as &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Virtual Machines &lt;strong&gt;VMs&lt;/strong&gt; - Ubuntu, Red Hat, Debian
-Containers &amp;amp; Orchestration – Docker, Kubernetes&lt;/li&gt;
&lt;li&gt;Big Data Services – Hadoop, Spark, Kafka clusters&lt;/li&gt;
&lt;li&gt;Databases – MySQL, PostgreSQL, MongoDB, Cassandra&lt;/li&gt;
&lt;li&gt;Data Warehouses – BigQuery engines, Redshift nodes (Linux-based)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;5. File and Data Management&lt;/strong&gt;&lt;br&gt;
With &lt;em&gt;Linux&lt;/em&gt; you can effectively and efficiently handle large files and perform tasks such as moving massive datasets, compressing files, searching logs and streaming data. All of these tasks are done by executing commands such as &lt;code&gt;ls&lt;/code&gt;, &lt;code&gt;cd&lt;/code&gt;, &lt;code&gt;cp&lt;/code&gt;, &lt;code&gt;grep&lt;/code&gt;, &lt;code&gt;mv&lt;/code&gt; e.t.c&lt;/p&gt;
&lt;h2&gt;
  
  
  Running Linux Terminal and Its Commands
&lt;/h2&gt;

&lt;p&gt;Running a Linux terminal means using text commands to control a Linux system, either locally or on a server.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. On a Linux machine (Ubuntu, etc.)&lt;/strong&gt;&lt;br&gt;
Press &lt;code&gt;Ctrl&lt;/code&gt; + &lt;code&gt;Alt&lt;/code&gt; + &lt;code&gt;T&lt;/code&gt;&lt;br&gt;
Or search “Terminal” in applications&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. On Windows (Most common)&lt;/strong&gt;&lt;br&gt;
&lt;strong&gt;Option A:&lt;/strong&gt; Windows Subsystem for Linux (WSL) &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Install WSL&lt;/li&gt;
&lt;li&gt;Open Ubuntu from the Start Menu&lt;/li&gt;
&lt;li&gt;This gives you a real Linux terminal inside Windows&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Option B:&lt;/strong&gt; Git Bash&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Install Git&lt;/li&gt;
&lt;li&gt;Open Git Bash&lt;/li&gt;
&lt;li&gt;Linux-like commands (not full Linux, but useful) &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;3.On macOS&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Open Terminal (Spotlight → Terminal)&lt;/li&gt;
&lt;li&gt;macOS is Unix-based, very similar to Linux&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;4. On a Remote Server (Cloud/Linux server)&lt;/strong&gt;&lt;br&gt;
Use SSH:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;ssh username@server_ip
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This opens a Linux terminal on a remote machine.&lt;/p&gt;

&lt;h2&gt;
  
  
  Basic Linux Commands
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Accessing Server&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Access remote server, you will need the server username, the server ip_address and the password for the server&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;ssh username@server_ip
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F19xhztwbwn47rts8slao.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F19xhztwbwn47rts8slao.png" alt="Image001" width="589" height="471"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Update and upgrade server if and when necessary&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;sudo apt update &amp;amp;&amp;amp; sudo apt upgrade
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Check the version of the ubuntu server you are using&lt;/strong&gt;*&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;lsb_release -a
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0huzg3iif2fpa6vta322.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0huzg3iif2fpa6vta322.png" alt="Image 02" width="582" height="465"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;To understand the specifications of your VM understand the space usage and remaining storage&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;df -h
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fm9j9gkbl9g3f5pgjd0za.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fm9j9gkbl9g3f5pgjd0za.png" alt="Image 03" width="598" height="475"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;To see the list of all files in the server&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;ls
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fr5tdo50svmfqpg764wa3.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fr5tdo50svmfqpg764wa3.png" alt="Image 04" width="593" height="465"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Red&lt;/strong&gt; - Zipped Files&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Blue&lt;/strong&gt; - Folders&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;White&lt;/strong&gt; - Files&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;To see the list of all files hidden and unhidden in the server&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;ls -a
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fewbvjc6ufaljw61gjouf.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fewbvjc6ufaljw61gjouf.png" alt="Image 04A" width="599" height="472"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Print your current directory&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;pwd
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhb3ysz1ny464c9an8c4s.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhb3ysz1ny464c9an8c4s.png" alt="Image 05" width="588" height="465"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;To Add another user in your server&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;sudo adduser 'username'
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1x6nsbwh59zjkojl33g3.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1x6nsbwh59zjkojl33g3.png" alt="Image 06" width="588" height="470"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Changing from the Super User 'Root' to the regular user n the server and changing directory&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;su 'username'
cd
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fkmmpzvl8q0eli12t3m33.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fkmmpzvl8q0eli12t3m33.png" alt="Image 07" width="582" height="310"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Creating Directories and Files and navigating between them&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;mkdir - Create a directory
touch - Create an empty file
cd 'mkdir' - To access or open your directory 
cd .. to move one step back from your current location
cd + space - To go back to the end of the path
cp - copy files
mv - To move/Rename files
rm - To delete a file
rm -r - To delete a folder
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F26y3bpns28dicq84o8fa.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F26y3bpns28dicq84o8fa.png" alt="Image 08" width="589" height="309"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcxvi37m16q6ya0tiaot3.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcxvi37m16q6ya0tiaot3.png" alt="Image 09" width="602" height="471"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Copying file from the local machine to the server&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;cp 'file_name' user_name@ip:path_to_the_serve_loaction_of_choice
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdrqjfwsknhj7ps5ijkab.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdrqjfwsknhj7ps5ijkab.png" alt="Image 10" width="584" height="472"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fy4suvrlw4ecy7kapws65.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fy4suvrlw4ecy7kapws65.png" alt="Image 11" width="597" height="469"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Copying file from the server to the local machine&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;scp username@remote_host:/remote/path/to/file /local/path/to/destination
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fkiwy9ch9hkfs8og8vvqy.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fkiwy9ch9hkfs8og8vvqy.png" alt="Image 12" width="584" height="371"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F91u2itfiu33mzu9z9pbs.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F91u2itfiu33mzu9z9pbs.png" alt="Image 13" width="600" height="375"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Copying folder from the local machine to the server&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;scp -r /local/path/to/folder ibrahim@157.245.209.236:/home/ibrahim/

scp -r MyMusicFolder ibrahim@157.245.209.236:/home/ibrahim/

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Copying folder from the server to the local machine&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;scp -r ibrahim@157.245.209.236:/home/ibrahim/MyMusicFolder /local/destination/

scp -r ibrahim@157.245.209.236:/home/ibrahim/MyMusicFolder ~/Downloads/
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You can also rename the folder during transfer:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;scp -r ibrahim@157.245.209.236:/home/ibrahim/MyMusicFolder ~/Downloads/NewFolderName
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For large folders, consider adding -C to compress during transfer (faster for slow connections):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;scp -r -C MyMusicFolder ibrahim@157.245.209.236:/home/ibrahim/
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Copying files from the internet to your server&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;wget 'link'
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdf088rtyiexnfat7iytu.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdf088rtyiexnfat7iytu.png" alt="Image 1" width="580" height="370"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Writing and Reading line on an empty file in the server&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;echo 'The line you wish to write' &amp;gt;&amp;gt; file_name

cat 'file name' - Read a file
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fd01x19d8oi8rydn8algr.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fd01x19d8oi8rydn8algr.png" alt="Image 2" width="585" height="377"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Editing Using Nano and Vi
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Nano&lt;/strong&gt; is a simple, beginner-friendly text editor you use directly in the Linux terminal. It comes in handy when editing files, writing scripts and viewing changes to files on the servers.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;nano app.py - to open nano interface
Ctl + O - save
Ctrl + x -exit

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If the file doesn't exist, Nano creates the file.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F41a9fianejcgbnbh1834.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F41a9fianejcgbnbh1834.png" alt="Image 4" width="581" height="380"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Vim&lt;/strong&gt; is a modal text editor used in the &lt;strong&gt;Linux&lt;/strong&gt; terminal and is widely used in servers, cloud machines, and containers.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;vim app.py - to open vim interface
i --&amp;gt; insert mode
Type your text
Esc --&amp;gt; back to Linux
:w --&amp;gt; Save
:q --&amp;gt; Quit
:wq --&amp;gt; save and quit
:q! --&amp;gt; quit without saving
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffg96dk6zxu8ys027liee.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffg96dk6zxu8ys027liee.png" alt="Image04" width="577" height="371"&gt;&lt;/a&gt;&lt;/p&gt;

</description>
      <category>linux</category>
      <category>dataengineering</category>
      <category>vim</category>
      <category>datascience</category>
    </item>
    <item>
      <title>HOW TO GIT IT</title>
      <dc:creator>Shaban Ibrahim</dc:creator>
      <pubDate>Sun, 18 Jan 2026 03:49:41 +0000</pubDate>
      <link>https://dev.to/shabex/how-to-git-it-38kp</link>
      <guid>https://dev.to/shabex/how-to-git-it-38kp</guid>
      <description>&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmywqh7j0ndmmy338a714.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmywqh7j0ndmmy338a714.jpg" alt="Git Logo" width="800" height="450"&gt;&lt;/a&gt;  &lt;/p&gt;

&lt;h1&gt;
  
  
  UNDERSTANDING GIT AND GITHUB
&lt;/h1&gt;

&lt;p&gt;You remember the first day you tried your hand in tech, maybe in &lt;em&gt;software engineering, data science, web development or data engineering&lt;/em&gt;? Your trainer or technical mentor insisted that you have a &lt;strong&gt;GitHub account&lt;/strong&gt;, and they made it a mandatory requirement for you to have one as industry best practice. I am sure you were wondering what GitHub is and how it will be important in your tech journey.&lt;/p&gt;

&lt;p&gt;Many moons later, you came to realise that actually Github is like a treasure chest for any tech person, although before you got the grasp of it, you were confused and didn't understand how it works. &lt;/p&gt;

&lt;p&gt;&lt;strong&gt;GitHub&lt;/strong&gt; is an online platform that hosts &lt;strong&gt;git repositories&lt;/strong&gt;, among many other platforms such as &lt;strong&gt;Gitlab&lt;/strong&gt; and &lt;strong&gt;Bitbucket&lt;/strong&gt;. It helps users to store, share and collaborate on code. Basically, the treasure chest of coders.&lt;/p&gt;

&lt;h2&gt;
  
  
  Create Github account
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://github.com/" rel="noopener noreferrer"&gt;To create &lt;em&gt;Github&lt;/em&gt; account,click here&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Now what is &lt;strong&gt;&lt;em&gt;Git&lt;/em&gt;&lt;/strong&gt;?
&lt;/h2&gt;

&lt;p&gt;&lt;em&gt;Git&lt;/em&gt; is simply a time machine for your code. This means it is a tool that helps you track changes in your code over time, whereby you can easily save different versions of your work, go back to earlier versions if something breaks, and collaborate with others without overwriting each other’s work. This makes &lt;strong&gt;git&lt;/strong&gt; a &lt;strong&gt;version control system&lt;/strong&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Installing Git
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://git-scm.com" rel="noopener noreferrer"&gt;To install &lt;em&gt;Git&lt;/em&gt;, you can click the link&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  What are some of the components of &lt;em&gt;Git&lt;/em&gt;?
&lt;/h2&gt;

&lt;p&gt;We need to understand some of the components of &lt;em&gt;Git&lt;/em&gt; and what they mean before understanding how to navigate inside or around it.&lt;br&gt;
Some of the main components of &lt;strong&gt;Git&lt;/strong&gt; are;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Repository (Repo)&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Working Directory&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Staging Area&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Commit&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;
  
  
  a) Repository
&lt;/h3&gt;

&lt;p&gt;This is a particular project in &lt;strong&gt;Git&lt;/strong&gt; that is being tracked. It &lt;em&gt;contains your files, change history and git configuration&lt;/em&gt;, all of which are contained in the &lt;code&gt;.git&lt;/code&gt; folder.&lt;br&gt;
For example&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;my-project/
|-- app.py
|--README.md
|--.git/
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;&lt;em&gt;How to Create Your Repository on git&lt;/em&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Create a project folder&lt;br&gt;
&lt;code&gt;mkdir my-project&lt;/code&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Navigate into your project folder by&lt;br&gt;
&lt;code&gt;cd my-project&lt;/code&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Initialise &lt;em&gt;Git&lt;/em&gt; by &lt;br&gt;
&lt;code&gt;git init&lt;/code&gt;&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  b) Working Directory
&lt;/h3&gt;

&lt;p&gt;The working directory is the folder on your computer where you:&lt;em&gt;Write code, Edit files, and delete or add files&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;Git continuously checks this directory for changes.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;em&gt;You can track changes in *Git&lt;/em&gt; by&lt;/strong&gt;* &lt;code&gt;git status&lt;/code&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  c) Staging Area
&lt;/h3&gt;

&lt;p&gt;Now this is the area where you instruct, and direct &lt;em&gt;Git&lt;/em&gt; to save a particular change that you might have made in your code, because not all changes are saved automatically, you have to choose what to save.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;em&gt;To stage a specific file:&lt;/em&gt;&lt;/strong&gt;&lt;br&gt;
&lt;br&gt;
 &lt;code&gt;touch app.py&lt;br&gt;
touch README.md&lt;/code&gt;&lt;br&gt;
&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;em&gt;To stage everything:&lt;/em&gt;&lt;/strong&gt; &lt;code&gt;git add .&lt;/code&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  d) Commit
&lt;/h2&gt;

&lt;p&gt;This is a snapshot of your code at a specific point in time, and commits are characterised by &lt;em&gt;unique id, message describing the change and a time stamp&lt;/em&gt;. Call them &lt;strong&gt;Checkpoints&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;em&gt;To add a commit:&lt;/em&gt;&lt;/strong&gt;  &lt;code&gt;git commit -m "Add the logic describing the change"&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;em&gt;To see past commits:&lt;/em&gt;&lt;/strong&gt; &lt;code&gt;git log&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;Below are the image demonstrations on how to go about operating git&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fa8pujllnhrg9voiqrewe.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fa8pujllnhrg9voiqrewe.png" alt="Image 1" width="662" height="425"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F136fd75t4ygidpo9re8m.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F136fd75t4ygidpo9re8m.png" alt="Image 2" width="616" height="481"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Connecting Git to GitHub Remotely
&lt;/h2&gt;

&lt;p&gt;After creating your repository on &lt;strong&gt;&lt;em&gt;Git&lt;/em&gt;&lt;/strong&gt; you will need to connect it remotely to your GitHub account.&lt;/p&gt;

&lt;p&gt;In your &lt;strong&gt;&lt;em&gt;Github account&lt;/em&gt;&lt;/strong&gt;, you will need to create a new repository in which you will name.&lt;br&gt;
An SSH key will be created in the form of:&lt;br&gt;
&lt;code&gt;git@github.com:Shabex/Data-Eng.git&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;Copy the SSH Key and use the following syntax you can link your git to GitHub remotely:&lt;br&gt;
&lt;code&gt;git remote add origin git@github.com:Shabex/Data-Eng.git&lt;/code&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Pushing Your Code
&lt;/h2&gt;

&lt;p&gt;This is how you send your local commits to the remote repository in &lt;strong&gt;Github&lt;/strong&gt;&lt;br&gt;
&lt;code&gt;git push -u origin main&lt;/code&gt;&lt;br&gt;
After pushing your code, they become visible on &lt;strong&gt;GitHub&lt;/strong&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Pulling Your Codes
&lt;/h2&gt;

&lt;p&gt;This is how you send your changes from the remote repository to my local computer &lt;br&gt;
&lt;code&gt;git pull origin main&lt;/code&gt;&lt;br&gt;
This comes in handy when you are working on a shared project, switching computers or updating your local copy of the file.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fw4rj2a0cjnz0gf5dwaq3.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fw4rj2a0cjnz0gf5dwaq3.png" alt="Image 3" width="627" height="477"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7uqg40bker507emx1027.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7uqg40bker507emx1027.png" alt="Image 4" width="627" height="486"&gt;&lt;/a&gt;&lt;/p&gt;

</description>
      <category>luxdev</category>
      <category>git</category>
      <category>onboarding</category>
      <category>dataengineering</category>
    </item>
  </channel>
</rss>
