<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Gathuru_M</title>
    <description>The latest articles on DEV Community by Gathuru_M (@gathurum).</description>
    <link>https://dev.to/gathurum</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F912835%2Fe5214cca-5a55-402c-8806-eabe9d7fc72c.png</url>
      <title>DEV Community: Gathuru_M</title>
      <link>https://dev.to/gathurum</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/gathurum"/>
    <language>en</language>
    <item>
      <title>TaskFlow API vs. Traditional Operators: Practical Airflow ETL Pipeline</title>
      <dc:creator>Gathuru_M</dc:creator>
      <pubDate>Sun, 07 Jun 2026 19:56:12 +0000</pubDate>
      <link>https://dev.to/gathurum/taskflow-api-vs-traditional-operators-practical-airflow-etl-pipeline-962</link>
      <guid>https://dev.to/gathurum/taskflow-api-vs-traditional-operators-practical-airflow-etl-pipeline-962</guid>
      <description>&lt;p&gt;In my &lt;a href="https://dev.to/gathurum/demystifying-apache-airflow-why-data-engineers-dont-just-use-cron-jobs-2pmk"&gt;last article&lt;/a&gt;, we went over the foundational pillars of Apache Airflow—DAGs, Tasks, and why orchestration beats manual scripts.&lt;/p&gt;

&lt;p&gt;For this practical Airflow project, we will build&lt;br&gt;
an ETL pipeline to aggregate market data from Massive API. But instead of just writing it one way, we look into writing &lt;strong&gt;two different versions of the same DAG&lt;/strong&gt;:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;The Traditional Approach:&lt;/strong&gt; Using classic standard operators (&lt;code&gt;PythonOperator&lt;/code&gt;) and manual XCom pulling/pushing.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;The Modern Approach:&lt;/strong&gt; Using the &lt;strong&gt;TaskFlow API&lt;/strong&gt; (&lt;code&gt;@dag&lt;/code&gt; and &lt;code&gt;@task&lt;/code&gt; decorators) to make the code clean and Pythonic.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;If you’re confused about the difference or wondering which one you should use in your projects, let’s break down both.&lt;/p&gt;


&lt;h2&gt;
  
  
  The Goal: Aggregating Data from Massive API
&lt;/h2&gt;

&lt;p&gt;The pipeline does three basic things:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Extract:&lt;/strong&gt; Pulls raw daily open/close asset data from our market API client.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Transform:&lt;/strong&gt; Normalizes the heavy, nested JSON payload into a clean structure using &lt;code&gt;pandas&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Load:&lt;/strong&gt; Inserts the structured records into a cloud PostgreSQL database.&lt;/li&gt;
&lt;/ul&gt;


&lt;h2&gt;
  
  
  Approach 1: The Traditional Way (Standard Operators)
&lt;/h2&gt;

&lt;p&gt;When Airflow was first built, you had to explicitly define every single task using an Operator class and manually stitch them together using the bitshift operators (&lt;code&gt;&amp;gt;&amp;gt;&lt;/code&gt;).&lt;/p&gt;

&lt;p&gt;The trickiest part here is data sharing. Because tasks run in isolation, we have to use explicit &lt;strong&gt;XComs&lt;/strong&gt; (cross-communications) to pass our API payload from the extract task to the transform task.&lt;/p&gt;

&lt;p&gt;Here is how the traditional DAG looks:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;airflow&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;DAG&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;airflow.operators.python&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;PythonOperator&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;datetime&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;datetime&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;timedelta&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;pandas&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;pd&lt;/span&gt;
&lt;span class="c1"&gt;# (Assume our custom API extraction and DB loading logic are imported here)
&lt;/span&gt;
&lt;span class="n"&gt;default_args&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;owner&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;my_name&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;start_date&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;datetime&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;2026&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;6&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;retries&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;retry_delay&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;timedelta&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;minutes&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;_extract&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ti&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="c1"&gt;# Fetching data from our massive asset API
&lt;/span&gt;    &lt;span class="n"&gt;raw_data&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ticker&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;BTC&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;price&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;65000&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;timestamp&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;2026-06-07T00:00:00Z&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt; 
    &lt;span class="c1"&gt;# We MUST explicitly push to XCom so the next task can see it
&lt;/span&gt;    &lt;span class="n"&gt;ti&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;xcom_push&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;raw_market_data&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;value&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;raw_data&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;_transform&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ti&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="c1"&gt;# We MUST explicitly pull from XCom
&lt;/span&gt;    &lt;span class="n"&gt;raw_data&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;ti&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;xcom_pull&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;raw_market_data&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;task_ids&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;extract_task&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Transforming data for: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;raw_data&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;ticker&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="c1"&gt;# Transformation logic goes here...
&lt;/span&gt;    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;raw_data&lt;/span&gt;  &lt;span class="c1"&gt;# Returning automatically pushes to default XCom
&lt;/span&gt;
&lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="nc"&gt;DAG&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;dag_id&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;market_data_traditional_v1&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;default_args&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;default_args&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;schedule_interval&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;@hourly&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;catchup&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;False&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;dag&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;

    &lt;span class="n"&gt;extract_task&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;PythonOperator&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;task_id&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;extract_task&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;python_callable&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;_extract&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="n"&gt;transform_task&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;PythonOperator&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;task_id&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;transform_task&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;python_callable&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;_transform&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# Stitching the pipeline dependency together
&lt;/span&gt;    &lt;span class="n"&gt;extract_task&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;transform_task&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  The Verdict on Traditional:
&lt;/h3&gt;

&lt;p&gt;It works perfectly, but it feels heavy. You have to write a lot of boilerplate code just to set up the tasks, and passing data requires explicitly passing the Task Instance (&lt;code&gt;ti&lt;/code&gt;) and remembering exact &lt;code&gt;task_ids&lt;/code&gt; string names. If you typo a string, your whole pipeline crashes.&lt;/p&gt;




&lt;h2&gt;
  
  
  Approach 2: The Modern Way (TaskFlow API)
&lt;/h2&gt;

&lt;p&gt;Introduced in Airflow 2.0, the &lt;strong&gt;TaskFlow API&lt;/strong&gt; uses Python decorators (&lt;code&gt;@dag&lt;/code&gt; and &lt;code&gt;@task&lt;/code&gt;). It fundamentally changes how we write pipelines by making Airflow handle XComs silently behind the scenes.&lt;/p&gt;

&lt;p&gt;Look at how clean this version is:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;airflow.decorators&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;dag&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;task&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;datetime&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;datetime&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;timedelta&lt;/span&gt;

&lt;span class="n"&gt;default_args&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;owner&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;my_name&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;start_date&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;datetime&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;2026&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;6&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;retries&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="nd"&gt;@dag&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;dag_id&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;market_data_taskflow_v1&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;default_args&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;default_args&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;schedule&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;@hourly&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;catchup&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;False&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;market_data_etl&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;

    &lt;span class="nd"&gt;@task&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;extract&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
        &lt;span class="n"&gt;raw_data&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ticker&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;BTC&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;price&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;65000&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;timestamp&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;2026-06-07T00:00:00Z&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;raw_data&lt;/span&gt;  &lt;span class="c1"&gt;# No manual xcom_push! Airflow handles it.
&lt;/span&gt;
    &lt;span class="nd"&gt;@task&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;transform&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;raw_data&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;dict&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="c1"&gt;# No manual xcom_pull! We just treat it like a regular Python variable.
&lt;/span&gt;        &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Transforming data for: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;raw_data&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;ticker&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;raw_data&lt;/span&gt;

    &lt;span class="c1"&gt;# This single clean line sets up dependencies AND passes data!
&lt;/span&gt;    &lt;span class="n"&gt;market_data&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;extract&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="nf"&gt;transform&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;market_data&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Instantiate the DAG
&lt;/span&gt;&lt;span class="n"&gt;market_data_etl_dag&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;market_data_etl&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  The Verdict on TaskFlow:
&lt;/h3&gt;

&lt;p&gt;This feels like writing native Python! You don’t have to instantiate &lt;code&gt;PythonOperator&lt;/code&gt; manually, and dependencies are implicitly built simply by passing outputs into inputs (&lt;code&gt;transform(extract())&lt;/code&gt;).&lt;/p&gt;




&lt;h2&gt;
  
  
  Seeing Both in the Airflow UI
&lt;/h2&gt;

&lt;p&gt;Once I deployed both DAG files to my local environment running on WSL, they popped up immediately in my Airflow Web UI dashboard.&lt;/p&gt;

&lt;p&gt;Even though the code looks completely different, &lt;strong&gt;Airflow creates the exact same visual graph architecture&lt;/strong&gt; for both under the hood.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Traditional DAG&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhu79qtemb4ibphjhk3gv.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhu79qtemb4ibphjhk3gv.png" alt="Traditional DAG" width="800" height="541"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;When I ran them, checking the logs for the TaskFlow DAG showed how cleanly it handled the data context without any missing parameters.&lt;/p&gt;




&lt;h2&gt;
  
  
  Which One Should You Use?
&lt;/h2&gt;

&lt;p&gt;After building this massive API project using both paradigms, here is my takeaway for fellow beginners:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Use TaskFlow API&lt;/strong&gt; whenever you are working strictly with Python functions (&lt;code&gt;PythonOperator&lt;/code&gt;). It reduces boilerplate code significantly, makes your code highly readable, and saves you from the headache of managing raw XCom keys.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Use Standard Operators&lt;/strong&gt; when you need to interact with external tools directly via specialized operators (like &lt;code&gt;PostgresOperator&lt;/code&gt;, &lt;code&gt;S3CreateObjectOperator&lt;/code&gt;, or &lt;code&gt;BashOperator&lt;/code&gt;). TaskFlow is amazing for Python-native workflows, but standard classes are still essential for interacting with cloud infra and heavy enterprise databases.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  What's Next?
&lt;/h2&gt;

&lt;p&gt;Now that our pipeline can be scheduled and orchestrated automatically, another major question popped up: &lt;strong&gt;Where do we deploy this safely so it runs the exact same way on a remote server as it does on my local machine?&lt;/strong&gt; Right now, everything is relying on my specific local python setup and WSL configuration. If I send this code to a teammate, they might hit a wall of environment errors.&lt;/p&gt;

&lt;p&gt;In the next part of this series, we are diving into &lt;strong&gt;Docker&lt;/strong&gt; to containerize our database and ETL processes so they can run seamlessly anywhere!&lt;/p&gt;

&lt;p&gt;Which style do you prefer writing in Airflow? Let me know your thoughts in the comments!&lt;/p&gt;

</description>
      <category>automation</category>
      <category>dataengineering</category>
      <category>python</category>
      <category>tutorial</category>
    </item>
    <item>
      <title>Demystifying Apache Airflow: Why Data Engineers Don't Just Use Cron Jobs</title>
      <dc:creator>Gathuru_M</dc:creator>
      <pubDate>Sun, 07 Jun 2026 18:02:19 +0000</pubDate>
      <link>https://dev.to/gathurum/demystifying-apache-airflow-why-data-engineers-dont-just-use-cron-jobs-2pmk</link>
      <guid>https://dev.to/gathurum/demystifying-apache-airflow-why-data-engineers-dont-just-use-cron-jobs-2pmk</guid>
      <description>&lt;p&gt;If you followed my &lt;a href="https://dev.to/gathurum/etl-pipeline-fetching-real-time-news-data-with-python-and-postgres-2c88"&gt;last post&lt;/a&gt;, we successfully built an ETL pipeline that fetched data from the News API, cleaned it with &lt;code&gt;pandas&lt;/code&gt;, and loaded it into a PostgreSQL database. It felt amazing to watch it run successfully in the terminal.&lt;/p&gt;

&lt;p&gt;But what if the News API goes down for 10 minutes? or what if my laptop is asleep when the script is supposed to run? &lt;/p&gt;

&lt;p&gt;In the real world, you can't just sit at your laptop and manually click "Run" on a Python script every day. You need automation, monitoring, and a way to handle failures. That is exactly where &lt;strong&gt;Apache Airflow&lt;/strong&gt; comes in.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Problem: The "Crashing Script" Nightmare
&lt;/h2&gt;

&lt;p&gt;Before tools like Airflow, developers relied heavily on &lt;strong&gt;Cron jobs&lt;/strong&gt; (a built-in Linux tool used to schedule scripts at specific times). Cron is great for simple things, but it has huge blind spots for data engineering:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;No Dependency Management:&lt;/strong&gt; If your "Transform" script takes longer than usual, your "Load" script might start running before the data is even ready, causing a massive crash.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Lack of Visibility:&lt;/strong&gt; If a script fails at 3 AM, you won't know until you check the logs manually or notice empty tables the next morning.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;No Easy Retries:&lt;/strong&gt; If a network glitch causes an API call to fail, Cron won't automatically try again 5 minutes later. You have to handle that messy logic yourself in Python.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Airflow solves all of this by acting as the &lt;strong&gt;workflow orchestrator&lt;/strong&gt;. It doesn't actually store or process your data; instead, it acts as the manager telling your scripts exactly &lt;em&gt;when&lt;/em&gt; to run, &lt;em&gt;in what order&lt;/em&gt;, and &lt;em&gt;what to do&lt;/em&gt; if something breaks.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Core Concepts Explained
&lt;/h2&gt;

&lt;p&gt;Let’s break down the four most important concepts using a simple analogy: &lt;strong&gt;Baking a Cake&lt;/strong&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  1. The DAG (Directed Acyclic Graph)
&lt;/h3&gt;

&lt;p&gt;Think of a &lt;strong&gt;DAG&lt;/strong&gt; as the entire recipe for your cake.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Directed:&lt;/strong&gt; It has a clear starting point and moves in a specific direction (you can't frost the cake before you bake it).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Acyclic:&lt;/strong&gt; It cannot go in circles. Step C cannot loop back and trigger Step A, otherwise your pipeline would run forever.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Graph:&lt;/strong&gt; It's just a structural map of how your steps link together.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In data engineering, your DAG is the blueprint of your entire ETL pipeline.&lt;/p&gt;

&lt;h3&gt;
  
  
  2. Operators
&lt;/h3&gt;

&lt;p&gt;If the DAG is the recipe, &lt;strong&gt;Operators&lt;/strong&gt; are the kitchen appliances. They are the pre-built templates that define &lt;em&gt;what&lt;/em&gt; actually gets done. Airflow provides different types of operators:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;PythonOperator&lt;/code&gt;: Used to execute a piece of Python code (like our &lt;code&gt;transform_data&lt;/code&gt; function).&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;PostgresOperator&lt;/code&gt;: Used to run SQL queries directly inside a Postgres database.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;BashOperator&lt;/code&gt;: Used to run command-line terminal scripts.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  3. Tasks
&lt;/h3&gt;

&lt;p&gt;A &lt;strong&gt;Task&lt;/strong&gt; is an operator that has been given specific instructions. It’s a single node inside your DAG. For example, using a &lt;code&gt;PythonOperator&lt;/code&gt; to run &lt;code&gt;extract_data()&lt;/code&gt; becomes the &lt;strong&gt;"Extract Task"&lt;/strong&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  4. XComs (Cross-Communications)
&lt;/h3&gt;

&lt;p&gt;In our standalone Python script, passing data was easy: we just returned a value from one function and passed it into the next (&lt;code&gt;cleaned_df = transform_data(raw_data)&lt;/code&gt;).&lt;/p&gt;

&lt;p&gt;In Airflow, tasks run completely independently. They can't easily talk to each other. &lt;strong&gt;XComs&lt;/strong&gt; are like little sticky notes that tasks use to pass small amounts of data or metadata down the line. One task "pushes" a note, and the next task "pulls" it.&lt;/p&gt;




&lt;h2&gt;
  
  
  A Quick Peek at the Airflow UI
&lt;/h2&gt;

&lt;p&gt;The absolute best part of Apache Airflow is its user interface. Instead of staring at text scrolling through a dark terminal window, Airflow gives you a beautiful visual dashboard where you can see your pipelines running in real-time.&lt;/p&gt;

&lt;p&gt;When a pipeline runs successfully, the tasks turn a satisfying dark green. If a task fails, it turns red, making it incredibly easy to spot exactly where your pipeline broke.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Airflow Web UI Dags page &lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fo3fzovxjgyvpcb07ukxy.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fo3fzovxjgyvpcb07ukxy.png" alt="Airflow Web UI Dags page Example" width="799" height="327"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  What We’re Doing Next
&lt;/h2&gt;

&lt;p&gt;Now that we understand the foundational pillars of Airflow—DAGs, Operators, Tasks, and why we use them over simple cron jobs—it’s time to get our hands dirty.&lt;/p&gt;

&lt;p&gt;In my next article, we are going to break down an ETL project into Airflow Tasks and watch it run automatically inside the Airflow UI.&lt;/p&gt;

&lt;p&gt;Are you using Airflow or which other tools do you prefer for orchestration?&lt;/p&gt;

</description>
    </item>
    <item>
      <title>ETL Pipeline: Fetching Real-Time News Data with Python and Postgres</title>
      <dc:creator>Gathuru_M</dc:creator>
      <pubDate>Sun, 07 Jun 2026 17:25:39 +0000</pubDate>
      <link>https://dev.to/gathurum/etl-pipeline-fetching-real-time-news-data-with-python-and-postgres-2c88</link>
      <guid>https://dev.to/gathurum/etl-pipeline-fetching-real-time-news-data-with-python-and-postgres-2c88</guid>
      <description>&lt;p&gt;The best way to actually understand data engineering is to build something that breaks, fix it, and watch it successfully run.&lt;/p&gt;

&lt;p&gt;In this article, we build an ETL pipeline that pulls data from the &lt;strong&gt;News API&lt;/strong&gt;, cleans it up using &lt;code&gt;pandas&lt;/code&gt;, and loads it into a local &lt;strong&gt;PostgreSQL&lt;/strong&gt; database.&lt;/p&gt;

&lt;p&gt;If you are a beginner Python developer or just getting into data engineering, this one is for you!&lt;/p&gt;




&lt;h2&gt;
  
  
  The Goal &amp;amp; The Architecture
&lt;/h2&gt;

&lt;p&gt;Before writing a single line of code, let’s look at what we are actually trying to achieve:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Extract:&lt;/strong&gt; Connect to the News API using Python, fetch the top headlines about technology, and pull down the raw data.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Transform:&lt;/strong&gt; The raw data comes back as a messy, nested JSON object. We'll use &lt;code&gt;pandas&lt;/code&gt; to flatten it, pick the columns we actually care about, handle missing values, and format the dates.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Load:&lt;/strong&gt; Connect to a local PostgreSQL database and append our clean data into a structured table.&lt;/li&gt;
&lt;/ol&gt;




&lt;h2&gt;
  
  
  Step 1: Setting Up the Database
&lt;/h2&gt;

&lt;p&gt;First, we need a place for our data to live. I used a PostgreSQL instance running on the cloud with Aiven.&lt;/p&gt;

&lt;p&gt;We need a clean target table. Here is the SQL script I used to create a simple &lt;code&gt;news_articles&lt;/code&gt; table. Notice how we have to be careful with our data types (like using &lt;code&gt;TIMESTAMP&lt;/code&gt; for dates and &lt;code&gt;TEXT&lt;/code&gt; for long URLs).&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;IF&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;EXISTS&lt;/span&gt; &lt;span class="n"&gt;news_articles&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;id&lt;/span&gt; &lt;span class="nb"&gt;SERIAL&lt;/span&gt; &lt;span class="k"&gt;PRIMARY&lt;/span&gt; &lt;span class="k"&gt;KEY&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="k"&gt;source&lt;/span&gt; &lt;span class="nb"&gt;VARCHAR&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="n"&gt;author&lt;/span&gt; &lt;span class="nb"&gt;VARCHAR&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;150&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="n"&gt;title&lt;/span&gt; &lt;span class="nb"&gt;TEXT&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;description&lt;/span&gt; &lt;span class="nb"&gt;TEXT&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;url&lt;/span&gt; &lt;span class="nb"&gt;TEXT&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;published_at&lt;/span&gt; &lt;span class="nb"&gt;TIMESTAMP&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;extracted_at&lt;/span&gt; &lt;span class="nb"&gt;TIMESTAMP&lt;/span&gt; &lt;span class="k"&gt;DEFAULT&lt;/span&gt; &lt;span class="k"&gt;CURRENT_TIMESTAMP&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Below is a diagram of the news_articles table created.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgp3bvuf9haczrig9pd43.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgp3bvuf9haczrig9pd43.png" alt="A diagram of the news_articles table created" width="748" height="565"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 2: The Python ETL Script
&lt;/h2&gt;

&lt;p&gt;To keep things clean and modular, I broke the code down into three distinct functions representing &lt;strong&gt;E&lt;/strong&gt;, &lt;strong&gt;T&lt;/strong&gt;, and &lt;strong&gt;L&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Make sure you have your dependencies installed before running this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pip &lt;span class="nb"&gt;install &lt;/span&gt;requests pandas psycopg2-binary sqlalchemy 

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Here is the full, documented script:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;requests&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;pandas&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;pd&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;datetime&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;datetime&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;psycopg2&lt;/span&gt;

&lt;span class="c1"&gt;# Configuration
&lt;/span&gt;&lt;span class="n"&gt;API_KEY&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;YOUR_NEWS_API_KEY&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="n"&gt;URL&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;https://newsapi.org/v2/everything?q=technology&amp;amp;apiKey=&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;API_KEY&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;

&lt;span class="n"&gt;DB_PARAMS&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;host&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;localhost&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;database&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;your_db_name&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;user&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;your_username&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;password&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;your_password&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;port&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;5432&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;extract_data&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;--- Starting Extraction ---&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;requests&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;URL&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;status_code&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="mi"&gt;200&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;data&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="n"&gt;articles&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;articles&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[])&lt;/span&gt;
        &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Successfully extracted &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;articles&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; articles.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;articles&lt;/span&gt;
    &lt;span class="k"&gt;else&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;raise&lt;/span&gt; &lt;span class="nc"&gt;Exception&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;API Request Failed with status code: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;status_code&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Transforming Data
&lt;/span&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;pandas&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;pd&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;transform_data&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;articles&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;

    &lt;span class="c1"&gt;# Create an empty list to store clean articles after looping
&lt;/span&gt;    &lt;span class="n"&gt;cleaned_data&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;

    &lt;span class="c1"&gt;# Parse through each dictionary to extract what we need
&lt;/span&gt;    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;article&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;articles&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;clean_article&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;source&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;article&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;source&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{}).&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;name&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Unknown&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
            &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;author&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;article&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;author&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
            &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;title&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;article&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;title&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
            &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;description&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;article&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;description&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
            &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;url&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;article&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;url&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
            &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;publishedAt&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;article&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;publishedAt&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;

        &lt;span class="n"&gt;cleaned_data&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;clean_article&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="n"&gt;df&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;pd&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;DataFrame&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cleaned_data&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# Rename column to match Postgres fields
&lt;/span&gt;    &lt;span class="n"&gt;df&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;rename&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;columns&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;publishedAt&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;published_at&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;})&lt;/span&gt;

    &lt;span class="c1"&gt;# Handle missing values  
&lt;/span&gt;    &lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;author&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;author&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nf"&gt;fillna&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Unknown&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;description&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;description&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nf"&gt;fillna&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;No Description&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

  &lt;span class="c1"&gt;# Format dates
&lt;/span&gt;    &lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;published_at&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;pd&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;to_datetime&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;published_at&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;

    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Transformation complete!&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;df&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;load_data&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;

    &lt;span class="n"&gt;db_URI&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getenv&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;URI&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;try&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;engine&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;create_engine&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;db_URI&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

        &lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;to_sql&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="n"&gt;name&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;news_articles&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;con&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;engine&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;schema&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;news_api&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;if_exists&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;append&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;index&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;False&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Data loaded successfully to &lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;news_articles&lt;/span&gt;&lt;span class="sh"&gt;'"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;except&lt;/span&gt; &lt;span class="nb"&gt;Exception&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Failed to load data to the database:&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Run the pipeline
&lt;/span&gt;&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;__name__&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;__main__&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;try&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;raw_data&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;extract_data&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="n"&gt;cleaned_df&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;transform_data&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;raw_data&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="nf"&gt;load_data&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cleaned_df&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ETL Pipeline Finished Successfully! 🎉&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;except&lt;/span&gt; &lt;span class="nb"&gt;Exception&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Pipeline failed: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Step 3: Running the Pipeline and Verifying the Results
&lt;/h2&gt;

&lt;p&gt;When I first ran this script, I ran into a classic beginner issue: the date format coming from the API included a &lt;code&gt;Z&lt;/code&gt; at the end (e.g., &lt;code&gt;2026-06-07T06:00:00Z&lt;/code&gt;), which caused my local database to complain until I used &lt;code&gt;pd.to_datetime()&lt;/code&gt; to safely parse it.&lt;/p&gt;

&lt;p&gt;But once those quirks were ironed out, running the script in the terminal yielded beautiful logs:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;![Successful execution](https://dev-to-uploads.s3.amazonaws.com/uploads/articles/vhq6zbzsct5asbdocdcf.png)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;To verify it actually worked, I hopped back into my database client and ran a simple query:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;source_name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;author&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;title&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;published_at&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;news_articles&lt;/span&gt; &lt;span class="k"&gt;LIMIT&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;And there it was—real live web data, organized neatly into my own database tables!&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4zq9qdz2if544qtmq7t2.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4zq9qdz2if544qtmq7t2.png" alt="First 5 rows of extracted data from Postgres" width="799" height="319"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  Key Takeaways from My First Project
&lt;/h2&gt;

&lt;p&gt;Building this taught me a few massive lessons that you don't get just by reading textbooks:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;API Data is messy:&lt;/strong&gt; You can almost never load API responses directly into a database. Nested dictionaries (like the &lt;code&gt;source&lt;/code&gt; field in this project) require explicit flattening.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Idempotency matters:&lt;/strong&gt; If I run this script twice right now, it will duplicate all the articles. As I move forward, I need to look into how to handle duplicates (like checking for existing URLs before inserting).&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  What's Next?
&lt;/h2&gt;

&lt;p&gt;Running this manually via a Python script is great for practice, but what if I want this data updated every single morning at 6 AM while I'm asleep? I can't just sit here and click "Run" manually.&lt;/p&gt;

&lt;p&gt;That is where &lt;strong&gt;Data Orchestration&lt;/strong&gt; comes in. In my next article, we are going to look at &lt;strong&gt;Apache Airflow&lt;/strong&gt; and how we can take this exact code and turn it into an automated, scheduled workflow!&lt;/p&gt;

&lt;p&gt;Have you built an ETL pipeline before? What was the trickiest data cleaning issue you faced? Let’s chat in the comments!&lt;/p&gt;

</description>
      <category>dataengineering</category>
      <category>api</category>
      <category>etl</category>
      <category>beginners</category>
    </item>
    <item>
      <title>ETL vs. ELT: Which Approach Should You Use and Why?</title>
      <dc:creator>Gathuru_M</dc:creator>
      <pubDate>Thu, 14 May 2026 20:20:27 +0000</pubDate>
      <link>https://dev.to/gathurum/etl-vs-elt-which-approach-should-you-use-and-why-386d</link>
      <guid>https://dev.to/gathurum/etl-vs-elt-which-approach-should-you-use-and-why-386d</guid>
      <description>&lt;h3&gt;
  
  
  1. Introduction
&lt;/h3&gt;

&lt;p&gt;Understanding a company's data architecture can feel overwhelming, but once you cut out the noise, you will notice one of two operations taking place: &lt;strong&gt;ETL&lt;/strong&gt; or &lt;strong&gt;ELT&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;These two operations are the backbone of how data moves from a source (like an app or a database) to a destination (like a data warehouse). While they sound almost identical, the order of the letters changes everything about how a company manages its data. &lt;br&gt;
In this article, we will break down both approaches for better understanding and I will give my take on which is better.&lt;/p&gt;




&lt;h3&gt;
  
  
  2. What is ETL? (Extract, Transform, Load)
&lt;/h3&gt;

&lt;p&gt;ETL is the traditional way of handling data that follows the process:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Extract:&lt;/strong&gt; Pull data from various sources (Excel, SQL databases, APIs).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Transform:&lt;/strong&gt; Before the data reaches its final home, it is cleaned and formatted in a &lt;strong&gt;Staging Layer&lt;/strong&gt; (a temporary storage area). Business logic is applied here to make the data "useful."&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Load:&lt;/strong&gt; The cleaned, "ready-to-use" data is finally saved in the destination.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Key Characteristic:&lt;/strong&gt; The data is transformed &lt;em&gt;before&lt;/em&gt; it is stored.&lt;br&gt;
&lt;strong&gt;Common Tools:&lt;/strong&gt; Microsoft SSIS, Talend, Informatica.&lt;/p&gt;




&lt;h3&gt;
  
  
  3. What is ELT? (Extract, Load, Transform)
&lt;/h3&gt;

&lt;p&gt;ELT is the more modern approach, with the same process as you would guess but in a different order.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Extract:&lt;/strong&gt; Pull the raw data from the sources.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Load:&lt;/strong&gt; Instead of cleaning it first, you move the &lt;strong&gt;raw data&lt;/strong&gt; directly into a high-capacity storage system, like a Data Lake or a Data Warehouse (BigQuery, Snowflake).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Transform:&lt;/strong&gt; You perform the cleaning and modelling &lt;em&gt;after&lt;/em&gt; the data is already in its destination.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Key Characteristic:&lt;/strong&gt; Data grows into a historical archive. Since the raw data is always there, you can go back and re-transform it differently next year if your business needs change.&lt;br&gt;
&lt;strong&gt;Common Tools:&lt;/strong&gt; Fivetran or Airbyte (for loading), and &lt;strong&gt;dbt&lt;/strong&gt; (for the transformation part).&lt;/p&gt;

&lt;p&gt;&lt;em&gt;&lt;strong&gt;Comparison Table&lt;/strong&gt;&lt;/em&gt;&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Feature&lt;/th&gt;
&lt;th&gt;ETL&lt;/th&gt;
&lt;th&gt;ELT&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Order&lt;/td&gt;
&lt;td&gt;Transform before Loading&lt;/td&gt;
&lt;td&gt;Load before Transforming&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Storage&lt;/td&gt;
&lt;td&gt;Uses temporary Staging&lt;/td&gt;
&lt;td&gt;Uses permanent Data Lake&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Flexibility&lt;/td&gt;
&lt;td&gt;Rigid / Fixed&lt;/td&gt;
&lt;td&gt;Highly Flexible&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Best For&lt;/td&gt;
&lt;td&gt;On-premise / Small data&lt;/td&gt;
&lt;td&gt;Cloud / Big Data&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  4. Which is Better?
&lt;/h3&gt;

&lt;p&gt;With cloud storage becoming cheaper and databases becoming more powerful, &lt;strong&gt;ELT would be the preferred option.&lt;/strong&gt; Here’s why:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Scalability:&lt;/strong&gt; ELT can handle massive "Big Data" sets that would crash a traditional ETL staging server.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Flexibility:&lt;/strong&gt; Because you store the &lt;em&gt;raw&lt;/em&gt; data first, you never lose information. In ETL, if you don't "transform" a column, it's gone. In ELT, you can decide to use that column later.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Speed:&lt;/strong&gt; You can load data as often as you want without waiting for complex cleaning scripts to finish.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;While ETL is still used for highly sensitive data or older systems, &lt;strong&gt;ELT is the best approach for modern, cloud-based data engineering.&lt;/strong&gt; It is more scalable, flexible, and allows for much deeper historical analysis.&lt;/p&gt;

</description>
      <category>architecture</category>
      <category>beginners</category>
      <category>data</category>
      <category>dataengineering</category>
    </item>
    <item>
      <title>How to connect Power BI to a PostgreSQL Database</title>
      <dc:creator>Gathuru_M</dc:creator>
      <pubDate>Thu, 14 May 2026 17:20:31 +0000</pubDate>
      <link>https://dev.to/gathurum/connecting-power-bi-to-a-postgresql-database-a-practical-guide-4p6i</link>
      <guid>https://dev.to/gathurum/connecting-power-bi-to-a-postgresql-database-a-practical-guide-4p6i</guid>
      <description>&lt;h2&gt;
  
  
  Introduction
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Power BI&lt;/strong&gt; is a business analytics platform developed by Microsoft. It is mostly used by organizations to transform raw data from &lt;strong&gt;multiple sources&lt;/strong&gt; into interactive visualizations that support &lt;strong&gt;data-driven decision-making.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Data sitting in various places in an organization—such as databases, Excel files, and cloud storage—can be collectively imported into Power BI to create a wholesome report. This allows data analysts and developers to extract insights that would otherwise be hidden in rows of text.&lt;/p&gt;

&lt;p&gt;Most companies connect Power BI directly to databases for the following reasons:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Real-time Data Access:&lt;/strong&gt; It gives visibility of the most current data available.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Automation:&lt;/strong&gt; Reports can be developed and scheduled to refresh instantly, removing the need for manual updates.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Efficiency:&lt;/strong&gt; It saves analysts time by eliminating the need to generate new reports every time the underlying data changes.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;PostgreSQL&lt;/strong&gt; is a primary example of a SQL database. These are commonly used in companies as they offer a structured and secure environment to store, manage, and retrieve large volumes of data effectively.&lt;/p&gt;

&lt;p&gt;In this article, we will walk through connecting a Postgres Database to Power BI Desktop.&lt;/p&gt;




&lt;h3&gt;
  
  
  Part 1: Connecting to a Local PostgreSQL Database
&lt;/h3&gt;

&lt;p&gt;To connect to a Postgres database sitting on your own machine:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Open Power BI Desktop.&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;On the Home ribbon, click &lt;strong&gt;Get Data&lt;/strong&gt;, select &lt;strong&gt;PostgreSQL database&lt;/strong&gt; from the list, and click &lt;strong&gt;Connect&lt;/strong&gt;.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Connection Details:&lt;/strong&gt; For the Server, type &lt;code&gt;localhost&lt;/code&gt;. This tells Power BI to look at your own machine.&lt;br&gt;
&lt;strong&gt;Database Name:&lt;/strong&gt; Enter the specific name of your database (e.g., &lt;code&gt;sales_db&lt;/code&gt;).&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3ltj8lca5jqt6n9727p2.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3ltj8lca5jqt6n9727p2.png" alt="Enter Connection Details" width="800" height="491"&gt;&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Credentials:&lt;/strong&gt; When prompted, enter the username and password you created during PostgreSQL installation.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;After successful authentication, a popup appears showing the &lt;strong&gt;existing&lt;/strong&gt; tables in your database. Select the tables you need and click &lt;strong&gt;Load&lt;/strong&gt; to bring them into the Power BI environment.&lt;/p&gt;




&lt;h3&gt;
  
  
  Part 2: Moving to the Cloud (Aiven PostgreSQL)
&lt;/h3&gt;

&lt;p&gt;In most real work environments, data lives in the cloud so everyone can access it from anywhere. For this guide, we will be using &lt;strong&gt;Aiven&lt;/strong&gt;. Connecting from a cloud database requires a few extra security steps.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;How to connect:&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Open Power BI Desktop.&lt;/strong&gt; Click &lt;strong&gt;Get Data&lt;/strong&gt;, select &lt;strong&gt;PostgreSQL database&lt;/strong&gt;, and click &lt;strong&gt;Connect&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Obtain Details:&lt;/strong&gt; From your Aiven console, copy your &lt;strong&gt;Host, Port, Database Name, Username,&lt;/strong&gt; and &lt;strong&gt;Password&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Download the Certificate:&lt;/strong&gt; In the list of connection details, download the &lt;code&gt;ca.pem&lt;/code&gt; file by clicking the download icon.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;strong&gt;&lt;em&gt;Why SSL?&lt;/em&gt;&lt;/strong&gt;&lt;br&gt;
 When connecting to a cloud database, we use &lt;strong&gt;SSL certificates&lt;/strong&gt;. This acts as a secure tunnel for your data as it travels over the internet. SSL encrypts the connection so that malicious actors cannot "intercept" your credentials or your company's private data.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ft9ti3p8m8z3u577iu7ez.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ft9ti3p8m8z3u577iu7ez.png" alt="Database Details" width="800" height="320"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Enter database details:&lt;/strong&gt; Fill in the details obtained from Aiven.&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F19zh0gp0ujf0kqjtn1h3.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F19zh0gp0ujf0kqjtn1h3.png" alt="Power BI Connection Details" width="800" height="395"&gt;&lt;/a&gt;&lt;br&gt;
&lt;em&gt;Under server name, enter the host name and the port in the format: `host:port&lt;/em&gt;`&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;SSL Configuration:&lt;/strong&gt; You may receive an error because Power BI cannot automatically verify the cloud's CA certificate.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1da5e99ye0q27z16ffhj.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1da5e99ye0q27z16ffhj.png" alt="Error Msssage" width="703" height="349"&gt;&lt;/a&gt;&lt;br&gt;
To resolve this, manually import the certificate to your Windows machine:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Press &lt;strong&gt;Windows + R&lt;/strong&gt;, type &lt;code&gt;certmgr.msc&lt;/code&gt;, and press Enter.&lt;/li&gt;
&lt;li&gt;Select &lt;strong&gt;Trusted Root Certification Authorities&lt;/strong&gt;, then click &lt;strong&gt;Certificates&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Right-click the folder, choose &lt;strong&gt;All Tasks&lt;/strong&gt; &amp;gt; &lt;strong&gt;Import&lt;/strong&gt;.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Browse to your &lt;code&gt;ca.pem&lt;/code&gt; file (change file type to "All Files" to see it). Select it and click &lt;strong&gt;Next&lt;/strong&gt;.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Finish the wizard. You will get a popup saying "Import Successful."&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fj37oxbkqvhatqfm0tk1w.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fj37oxbkqvhatqfm0tk1w.png" alt="Windows cerificates page" width="800" height="572"&gt;&lt;/a&gt;&lt;br&gt;
&lt;strong&gt;Restart Power BI&lt;/strong&gt; to apply the changes. Your connection will now be successful, and your tables will be ready for loading.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;




&lt;h3&gt;
  
  
  Part 3: Building the Data Model
&lt;/h3&gt;

&lt;p&gt;Once your tables (&lt;strong&gt;Customers, Products, Sales, and Inventory&lt;/strong&gt;) are loaded, they appear in the "Data" pane. Now you must connect them.&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftum56wleb27cbljqakze.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftum56wleb27cbljqakze.png" alt="Loaded Data on Power BI" width="800" height="608"&gt;&lt;/a&gt;&lt;br&gt;
&lt;strong&gt;Data Modeling&lt;/strong&gt; is the process of telling Power BI how these tables relate to one another. For example, the &lt;code&gt;Customer_ID&lt;/code&gt; in your &lt;strong&gt;Sales&lt;/strong&gt; table should link to the &lt;code&gt;ID&lt;/code&gt; in your &lt;strong&gt;Customers&lt;/strong&gt; table.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;The Benefit:&lt;/strong&gt; By creating these relationships, you can filter a chart by "Customer Name" and see exactly what "Products" they bought across all "Sales."&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Power BI often creates these automatically, which you can verify in the &lt;strong&gt;Model View&lt;/strong&gt; pane.&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fzvhm1agrcyovn6835qtd.png" alt="Relationships on Power BI" width="800" height="372"&gt;
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Conclusion: Why SQL is a Superpower for Power BI Analysts
&lt;/h3&gt;

&lt;p&gt;You might wonder, "If I have Power BI, do I still need SQL?" The answer is &lt;strong&gt;yes&lt;/strong&gt;. SQL skills are important because they allow you to:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Filter at the source:&lt;/strong&gt; Instead of bringing 1 million rows into Power BI and slowing it down, you can write a SQL query to only bring in the specific data you need.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Data Preparation:&lt;/strong&gt; You can perform complex aggregations and clean up messy data before it even reaches your dashboard.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Mastering the connection between SQL and Power BI is what turns a basic report into a powerful, automated business tool.&lt;/p&gt;

</description>
      <category>postgressql</category>
      <category>powerapps</category>
      <category>dataengineering</category>
      <category>beginners</category>
    </item>
    <item>
      <title>SQL Joins Explained: Case Example</title>
      <dc:creator>Gathuru_M</dc:creator>
      <pubDate>Mon, 02 Mar 2026 18:43:59 +0000</pubDate>
      <link>https://dev.to/gathurum/sql-joins-explained-case-example-2l3</link>
      <guid>https://dev.to/gathurum/sql-joins-explained-case-example-2l3</guid>
      <description>&lt;p&gt;Structured Query Language(SQL) is a computer language for storing, manipulating, and retrieving data stored in a relational database.&lt;/p&gt;

&lt;p&gt;SQL Joins are like clauses used to combine records from two or more tables in a database, just as the name(join) means.&lt;/p&gt;

&lt;p&gt;In this article, you will learn, using a case example, the different types of joins, when, and how they are used. Be sure to check for “&lt;strong&gt;bonus joins”&lt;/strong&gt; included at the end of the article.&lt;/p&gt;

&lt;h2&gt;
  
  
  Example Data
&lt;/h2&gt;

&lt;p&gt;We will use data from these 2 tables to show various ways to display employees from John Smith's company.&lt;/p&gt;

&lt;h3&gt;
  
  
  Departments Table
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Department_Id&lt;/th&gt;
&lt;th&gt;Department_Name&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;Executive&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;HR&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;td&gt;Sales&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;4&lt;/td&gt;
&lt;td&gt;Support&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;5&lt;/td&gt;
&lt;td&gt;Sales&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;6&lt;/td&gt;
&lt;td&gt;Research&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  Employees Table
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Employee_Id&lt;/th&gt;
&lt;th&gt;Full_Name&lt;/th&gt;
&lt;th&gt;Department_Id&lt;/th&gt;
&lt;th&gt;Job_Role&lt;/th&gt;
&lt;th&gt;Manager_Id&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;John Smith&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;CEO&lt;/td&gt;
&lt;td&gt;Null&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;Sarah Goodes&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;CFO&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;td&gt;Wayne Ablett&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;CIO&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;4&lt;/td&gt;
&lt;td&gt;Michelle Carey&lt;/td&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;HR Manager&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;5&lt;/td&gt;
&lt;td&gt;Chris Matthews&lt;/td&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;td&gt;Sales Manager&lt;/td&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;6&lt;/td&gt;
&lt;td&gt;Andrew Judd&lt;/td&gt;
&lt;td&gt;4&lt;/td&gt;
&lt;td&gt;Development Manager&lt;/td&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;7&lt;/td&gt;
&lt;td&gt;Danielle McLeod&lt;/td&gt;
&lt;td&gt;5&lt;/td&gt;
&lt;td&gt;Support Manager&lt;/td&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;8&lt;/td&gt;
&lt;td&gt;Matthew Swan&lt;/td&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;HR Representative&lt;/td&gt;
&lt;td&gt;4&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;9&lt;/td&gt;
&lt;td&gt;Stephanie Richardson&lt;/td&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;Salesperson&lt;/td&gt;
&lt;td&gt;5&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;10&lt;/td&gt;
&lt;td&gt;Tony Grant&lt;/td&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;td&gt;Salesperson&lt;/td&gt;
&lt;td&gt;5&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;11&lt;/td&gt;
&lt;td&gt;Jenna Lockett&lt;/td&gt;
&lt;td&gt;4&lt;/td&gt;
&lt;td&gt;Front-End Developer&lt;/td&gt;
&lt;td&gt;6&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;12&lt;/td&gt;
&lt;td&gt;Michael Dunstall&lt;/td&gt;
&lt;td&gt;4&lt;/td&gt;
&lt;td&gt;Back-End Developer&lt;/td&gt;
&lt;td&gt;6&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;13&lt;/td&gt;
&lt;td&gt;Jane Voss&lt;/td&gt;
&lt;td&gt;4&lt;/td&gt;
&lt;td&gt;Back-End Developer&lt;/td&gt;
&lt;td&gt;6&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;14&lt;/td&gt;
&lt;td&gt;Anthony Hird&lt;/td&gt;
&lt;td&gt;Null&lt;/td&gt;
&lt;td&gt;Support&lt;/td&gt;
&lt;td&gt;7&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;15&lt;/td&gt;
&lt;td&gt;Natalie Rocca&lt;/td&gt;
&lt;td&gt;5&lt;/td&gt;
&lt;td&gt;Support&lt;/td&gt;
&lt;td&gt;7&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The code below shows the syntax for writing a JOIN:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;columns&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;table1&lt;/span&gt;
&lt;span class="k"&gt;JOIN&lt;/span&gt; &lt;span class="n"&gt;table2&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;table1&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;column1&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;table2&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;column1&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Types of Joins&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;There are four major types of joins in SQL, as listed below:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Inner Join&lt;/li&gt;
&lt;li&gt;Left Join&lt;/li&gt;
&lt;li&gt;Right Join&lt;/li&gt;
&lt;li&gt;Full Join&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;strong&gt;&lt;em&gt;Inner Join&lt;/em&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;It creates a result table that displays information &lt;strong&gt;common&lt;/strong&gt; between two tables based on a shared piece of information. &lt;/p&gt;

&lt;p&gt;It is the most important and frequently used of the joins.&lt;/p&gt;

&lt;p&gt;In this case, we could use it to display employees from the departments listed in the first table.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;employees&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Full_Name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;employees&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Job_Role&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;departments&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Department_Name&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;departments&lt;/span&gt;
&lt;span class="k"&gt;INNER&lt;/span&gt; &lt;span class="k"&gt;JOIN&lt;/span&gt; &lt;span class="n"&gt;employees&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;departments&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Department_Id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;employees&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Department_Id&lt;/span&gt;
 &lt;span class="c1"&gt;-- You can replace the keyword INNER JOIN with JOIN&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You should get the results below&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;Result&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Full_Name&lt;/th&gt;
&lt;th&gt;Job_Role&lt;/th&gt;
&lt;th&gt;Department_Name&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;John Smith&lt;/td&gt;
&lt;td&gt;CEO&lt;/td&gt;
&lt;td&gt;Executive&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Sarah Goodes&lt;/td&gt;
&lt;td&gt;CFO&lt;/td&gt;
&lt;td&gt;Executive&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Wayne Ablett&lt;/td&gt;
&lt;td&gt;CIO&lt;/td&gt;
&lt;td&gt;Executive&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Michelle Carey&lt;/td&gt;
&lt;td&gt;HR Manager&lt;/td&gt;
&lt;td&gt;HR&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Chris Matthews&lt;/td&gt;
&lt;td&gt;Sales Manager&lt;/td&gt;
&lt;td&gt;Sales&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Andrew Judd&lt;/td&gt;
&lt;td&gt;Development Manager&lt;/td&gt;
&lt;td&gt;Development&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Danielle McLeod&lt;/td&gt;
&lt;td&gt;Support Manager&lt;/td&gt;
&lt;td&gt;Support&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Matthew Swan&lt;/td&gt;
&lt;td&gt;HR Representative&lt;/td&gt;
&lt;td&gt;HR&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Stephanie Richardson&lt;/td&gt;
&lt;td&gt;Salesperson&lt;/td&gt;
&lt;td&gt;HR&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Tony Grant&lt;/td&gt;
&lt;td&gt;Salesperson&lt;/td&gt;
&lt;td&gt;Sales&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Jenna Lockett&lt;/td&gt;
&lt;td&gt;Front-End Developer&lt;/td&gt;
&lt;td&gt;Development&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Michael Dunstall&lt;/td&gt;
&lt;td&gt;Back-End Developer&lt;/td&gt;
&lt;td&gt;Development&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Jane Voss&lt;/td&gt;
&lt;td&gt;Back-End Developer&lt;/td&gt;
&lt;td&gt;Development&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Natalie Rocca&lt;/td&gt;
&lt;td&gt;Support&lt;/td&gt;
&lt;td&gt;Support&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;&lt;em&gt;Left Join&lt;/em&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;A &lt;em&gt;LEFT JOIN&lt;/em&gt; returns &lt;strong&gt;all&lt;/strong&gt; rows from the left table, plus matched values from the right table or NULL in case of no match.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Note: The left table refers to the table that appears before the "LEFT JOIN" keywords in your SQL query. Same case applies when using RIGHT JOIN&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;We could use left join to retrieve a list of all employees along with their department names. If an employee doesn't belong to a department, display NULL&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;employees&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Full_Name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;departments&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Department_Name&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;employees&lt;/span&gt; 
&lt;span class="k"&gt;LEFT&lt;/span&gt; &lt;span class="k"&gt;JOIN&lt;/span&gt; &lt;span class="n"&gt;departments&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;departments&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Department_Id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;employees&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Department_Id&lt;/span&gt;
&lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;employees&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Full_Name&lt;/span&gt; &lt;span class="k"&gt;ASC&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You should get the results below&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;Result&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Employee_Id&lt;/th&gt;
&lt;th&gt;Full_Name&lt;/th&gt;
&lt;th&gt;Department_Name&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;John Smith&lt;/td&gt;
&lt;td&gt;Executive&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;Sarah Goodes&lt;/td&gt;
&lt;td&gt;Executive&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;td&gt;Wayne Ablett&lt;/td&gt;
&lt;td&gt;Executive&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;4&lt;/td&gt;
&lt;td&gt;Michelle Carey&lt;/td&gt;
&lt;td&gt;HR&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;5&lt;/td&gt;
&lt;td&gt;Chris Matthews&lt;/td&gt;
&lt;td&gt;SALES&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;6&lt;/td&gt;
&lt;td&gt;Andrew Judd&lt;/td&gt;
&lt;td&gt;Development&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;7&lt;/td&gt;
&lt;td&gt;Danielle McLeod&lt;/td&gt;
&lt;td&gt;Support&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;8&lt;/td&gt;
&lt;td&gt;Matthew Swan&lt;/td&gt;
&lt;td&gt;HR&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;9&lt;/td&gt;
&lt;td&gt;Stephanie Richardson&lt;/td&gt;
&lt;td&gt;HR&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;10&lt;/td&gt;
&lt;td&gt;Tony Grant&lt;/td&gt;
&lt;td&gt;SALES&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;11&lt;/td&gt;
&lt;td&gt;Jenna Lockett&lt;/td&gt;
&lt;td&gt;Development&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;12&lt;/td&gt;
&lt;td&gt;Michael Dunstall&lt;/td&gt;
&lt;td&gt;Development&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;13&lt;/td&gt;
&lt;td&gt;Jane Voss&lt;/td&gt;
&lt;td&gt;Development&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;14&lt;/td&gt;
&lt;td&gt;Anthony Hird&lt;/td&gt;
&lt;td&gt;NULL&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;15&lt;/td&gt;
&lt;td&gt;Natalie Rocca&lt;/td&gt;
&lt;td&gt;Support&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;You can use the COALESCE() function to replace NULL values with more relatable user-defined values.&lt;/p&gt;

&lt;p&gt;i.e. If an employee doesn't belong to a department, display "No Department" instead.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;employees&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Employee_Id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;employees&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Full_Name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; 
&lt;span class="n"&gt;COALESCE&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;departments&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Department_Name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;"No Department"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;Department_Name&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;employees&lt;/span&gt; 
&lt;span class="k"&gt;LEFT&lt;/span&gt; &lt;span class="k"&gt;JOIN&lt;/span&gt; &lt;span class="n"&gt;departments&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;departments&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Department_Id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;employees&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Department_Id&lt;/span&gt;
&lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;employees&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Full_Name&lt;/span&gt; &lt;span class="k"&gt;ASC&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You should get the results below&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;Result&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Employee_Id&lt;/th&gt;
&lt;th&gt;Full_Name&lt;/th&gt;
&lt;th&gt;Department_Name&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;John Smith&lt;/td&gt;
&lt;td&gt;Executive&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;Sarah Goodes&lt;/td&gt;
&lt;td&gt;Executive&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;td&gt;Wayne Ablett&lt;/td&gt;
&lt;td&gt;Executive&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;4&lt;/td&gt;
&lt;td&gt;Michelle Carey&lt;/td&gt;
&lt;td&gt;HR&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;5&lt;/td&gt;
&lt;td&gt;Chris Matthews&lt;/td&gt;
&lt;td&gt;SALES&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;6&lt;/td&gt;
&lt;td&gt;Andrew Judd&lt;/td&gt;
&lt;td&gt;Development&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;7&lt;/td&gt;
&lt;td&gt;Danielle McLeod&lt;/td&gt;
&lt;td&gt;Support&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;8&lt;/td&gt;
&lt;td&gt;Matthew Swan&lt;/td&gt;
&lt;td&gt;HR&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;9&lt;/td&gt;
&lt;td&gt;Stephanie Richardson&lt;/td&gt;
&lt;td&gt;HR&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;10&lt;/td&gt;
&lt;td&gt;Tony Grant&lt;/td&gt;
&lt;td&gt;SALES&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;11&lt;/td&gt;
&lt;td&gt;Jenna Lockett&lt;/td&gt;
&lt;td&gt;Development&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;12&lt;/td&gt;
&lt;td&gt;Michael Dunstall&lt;/td&gt;
&lt;td&gt;Development&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;13&lt;/td&gt;
&lt;td&gt;Jane Voss&lt;/td&gt;
&lt;td&gt;Development&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;14&lt;/td&gt;
&lt;td&gt;Anthony Hird&lt;/td&gt;
&lt;td&gt;No Department&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;15&lt;/td&gt;
&lt;td&gt;Natalie Rocca&lt;/td&gt;
&lt;td&gt;Support&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;&lt;em&gt;Right Join&lt;/em&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The &lt;em&gt;SQL RIGHT JOIN&lt;/em&gt; returns all rows from the right table, even if there are no matches&lt;br&gt;
in the left table. Not so different from the LEFT JOIN.&lt;/p&gt;

&lt;p&gt;In this case, we could view employees and their various departments&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;Department_Name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="n"&gt;COALESCE&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;employees&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Full_Name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;"No Employee"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;Full_Name&lt;/span&gt; 
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;employees&lt;/span&gt; 
&lt;span class="k"&gt;RIGHT&lt;/span&gt; &lt;span class="k"&gt;JOIN&lt;/span&gt; &lt;span class="n"&gt;departments&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;departments&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Department_Id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;employees&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Department_Id&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You should get the results below&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;Result&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Department_Name&lt;/th&gt;
&lt;th&gt;Full_Name&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Executive&lt;/td&gt;
&lt;td&gt;Wayne Ablett&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Executive&lt;/td&gt;
&lt;td&gt;Sarah Goodes&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Executive&lt;/td&gt;
&lt;td&gt;John Smith&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;HR&lt;/td&gt;
&lt;td&gt;Stephanie Richardson&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;HR&lt;/td&gt;
&lt;td&gt;Matthew Swan&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;HR&lt;/td&gt;
&lt;td&gt;Michelle Carey&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Sales&lt;/td&gt;
&lt;td&gt;Tony Grant&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Sales&lt;/td&gt;
&lt;td&gt;Chris Matthews&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Development&lt;/td&gt;
&lt;td&gt;Jane Voss&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Development&lt;/td&gt;
&lt;td&gt;Michael Dunstall&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Development&lt;/td&gt;
&lt;td&gt;Jenna Lockett&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Development&lt;/td&gt;
&lt;td&gt;Andrew Judd&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Support&lt;/td&gt;
&lt;td&gt;Natalie Rocca&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Support&lt;/td&gt;
&lt;td&gt;Danielle McLeod&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Research&lt;/td&gt;
&lt;td&gt;No Employee&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;&lt;em&gt;Full Join&lt;/em&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The &lt;em&gt;SQL FULL JOIN&lt;/em&gt; combines the results of both left and right outer joins.&lt;/p&gt;

&lt;p&gt;The joined table will contain all records from both tables and fill in NULLs for missing&lt;br&gt;
matches on either side.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;employees&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Employee_Id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;employees&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Full_Name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;employees&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Job_Role&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; 
&lt;span class="n"&gt;departments&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Department_Id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;departments&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Department_Name&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;employees&lt;/span&gt; 
&lt;span class="k"&gt;FULL&lt;/span&gt; &lt;span class="k"&gt;JOIN&lt;/span&gt; &lt;span class="n"&gt;departments&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;departments&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Department_Id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;employees&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Department_Id&lt;/span&gt;
&lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;employees&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Employee_Id&lt;/span&gt; &lt;span class="k"&gt;ASC&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You should get the results below&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;Result&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;EMPLOYEE_ID&lt;/th&gt;
&lt;th&gt;FULL_NAME&lt;/th&gt;
&lt;th&gt;JOB_ROLE&lt;/th&gt;
&lt;th&gt;DEPARTMENT_ID&lt;/th&gt;
&lt;th&gt;DEPARTMENT_NAME&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;John Smith&lt;/td&gt;
&lt;td&gt;CEO&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;Executive&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;Sarah Goodes&lt;/td&gt;
&lt;td&gt;CFO&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;Executive&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;td&gt;Wayne Ablett&lt;/td&gt;
&lt;td&gt;CIO&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;Executive&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;4&lt;/td&gt;
&lt;td&gt;Michelle Carey&lt;/td&gt;
&lt;td&gt;HR Manager&lt;/td&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;HR&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;5&lt;/td&gt;
&lt;td&gt;Chris Matthews&lt;/td&gt;
&lt;td&gt;Sales Manager&lt;/td&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;td&gt;Sales&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;6&lt;/td&gt;
&lt;td&gt;Andrew Judd&lt;/td&gt;
&lt;td&gt;Development Manager&lt;/td&gt;
&lt;td&gt;4&lt;/td&gt;
&lt;td&gt;Development&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;7&lt;/td&gt;
&lt;td&gt;Danielle McLeod&lt;/td&gt;
&lt;td&gt;Support Manager&lt;/td&gt;
&lt;td&gt;5&lt;/td&gt;
&lt;td&gt;Support&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;8&lt;/td&gt;
&lt;td&gt;Matthew Swan&lt;/td&gt;
&lt;td&gt;HR Representative&lt;/td&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;HR&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;9&lt;/td&gt;
&lt;td&gt;Stephanie Richardson&lt;/td&gt;
&lt;td&gt;Salesperson&lt;/td&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;HR&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;10&lt;/td&gt;
&lt;td&gt;Tony Grant&lt;/td&gt;
&lt;td&gt;Salesperson&lt;/td&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;td&gt;Sales&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;11&lt;/td&gt;
&lt;td&gt;Jenna Lockett&lt;/td&gt;
&lt;td&gt;Front-End Developer&lt;/td&gt;
&lt;td&gt;4&lt;/td&gt;
&lt;td&gt;Development&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;12&lt;/td&gt;
&lt;td&gt;Michael Dunstall&lt;/td&gt;
&lt;td&gt;Back-End Developer&lt;/td&gt;
&lt;td&gt;4&lt;/td&gt;
&lt;td&gt;Development&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;13&lt;/td&gt;
&lt;td&gt;Jane Voss&lt;/td&gt;
&lt;td&gt;Back-End Developer&lt;/td&gt;
&lt;td&gt;4&lt;/td&gt;
&lt;td&gt;Development&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;14&lt;/td&gt;
&lt;td&gt;Anthony Hird&lt;/td&gt;
&lt;td&gt;Support&lt;/td&gt;
&lt;td&gt;NULL&lt;/td&gt;
&lt;td&gt;NULL&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;15&lt;/td&gt;
&lt;td&gt;Natalie Rocca&lt;/td&gt;
&lt;td&gt;Support&lt;/td&gt;
&lt;td&gt;5&lt;/td&gt;
&lt;td&gt;Support&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;NULL&lt;/td&gt;
&lt;td&gt;NULL&lt;/td&gt;
&lt;td&gt;NULL&lt;/td&gt;
&lt;td&gt;6&lt;/td&gt;
&lt;td&gt;Research&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Bonus Joins
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;&lt;em&gt;Self Join&lt;/em&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;A &lt;em&gt;SELF JOIN&lt;/em&gt; is typically not a join type but a special way of joining a table to itself. You may want to combine rows in a table based on a related column present in the table.&lt;/p&gt;

&lt;p&gt;It is commonly used when you need to traverse a hierarchical structure where each row references another row in the same table &lt;/p&gt;

&lt;p&gt;OR&lt;/p&gt;

&lt;p&gt;When comparing data within rows in the same table.&lt;/p&gt;

&lt;p&gt;In this case, we will use self-join in the employees’ table to display the employee-manager relationship. Each employee record contains a reference to the manager's ID, allowing us to retrieve more information about the managers by adding another column, “Manager_Name”. In Data Science, this is an example of &lt;strong&gt;&lt;em&gt;feature engineering.&lt;/em&gt;&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt;
&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Employee_Id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Full_Name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Job_Role&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="n"&gt;m&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Employee_Id&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;Manager_Id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="n"&gt;m&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Full_Name&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;Manager_Name&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;employees&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;employees&lt;/span&gt; &lt;span class="n"&gt;m&lt;/span&gt; 
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Manager_Id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;m&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Employee_Id&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You should get the results below&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;Result&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;EMPLOYEE_ID&lt;/th&gt;
&lt;th&gt;FULL_NAME&lt;/th&gt;
&lt;th&gt;JOB_ROLE&lt;/th&gt;
&lt;th&gt;DEPARTMENT_ID&lt;/th&gt;
&lt;th&gt;MANAGER_NAME&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;Sarah Goodes&lt;/td&gt;
&lt;td&gt;CFO&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;John Smith&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;td&gt;Wayne Ablett&lt;/td&gt;
&lt;td&gt;CIO&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;John Smith&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;4&lt;/td&gt;
&lt;td&gt;Michelle Carey&lt;/td&gt;
&lt;td&gt;HR Manager&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;John Smith&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;5&lt;/td&gt;
&lt;td&gt;Chris Matthews&lt;/td&gt;
&lt;td&gt;Sales Manager&lt;/td&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;Sarah Goodes&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;6&lt;/td&gt;
&lt;td&gt;Andrew Judd&lt;/td&gt;
&lt;td&gt;Development Manager&lt;/td&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;td&gt;Wayne Ablett&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;7&lt;/td&gt;
&lt;td&gt;Danielle McLeod&lt;/td&gt;
&lt;td&gt;Support Manager&lt;/td&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;td&gt;Wayne Ablett&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;8&lt;/td&gt;
&lt;td&gt;Matthew Swan&lt;/td&gt;
&lt;td&gt;HR Representative&lt;/td&gt;
&lt;td&gt;4&lt;/td&gt;
&lt;td&gt;Michelle Carey&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;9&lt;/td&gt;
&lt;td&gt;Stephanie Richardson&lt;/td&gt;
&lt;td&gt;Salesperson&lt;/td&gt;
&lt;td&gt;5&lt;/td&gt;
&lt;td&gt;Chris Matthews&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;10&lt;/td&gt;
&lt;td&gt;Tony Grant&lt;/td&gt;
&lt;td&gt;Salesperson&lt;/td&gt;
&lt;td&gt;5&lt;/td&gt;
&lt;td&gt;Chris Matthews&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;11&lt;/td&gt;
&lt;td&gt;Jenna Lockett&lt;/td&gt;
&lt;td&gt;Front-End Developer&lt;/td&gt;
&lt;td&gt;6&lt;/td&gt;
&lt;td&gt;Andrew Judd&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;12&lt;/td&gt;
&lt;td&gt;Michael Dunstall&lt;/td&gt;
&lt;td&gt;Back-End Developer&lt;/td&gt;
&lt;td&gt;6&lt;/td&gt;
&lt;td&gt;Andrew Judd&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;13&lt;/td&gt;
&lt;td&gt;Jane Voss&lt;/td&gt;
&lt;td&gt;Back-End Developer&lt;/td&gt;
&lt;td&gt;6&lt;/td&gt;
&lt;td&gt;Andrew Judd&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;14&lt;/td&gt;
&lt;td&gt;Anthony Hird&lt;/td&gt;
&lt;td&gt;Support&lt;/td&gt;
&lt;td&gt;7&lt;/td&gt;
&lt;td&gt;Danielle McLeod&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;15&lt;/td&gt;
&lt;td&gt;Natalie Rocca&lt;/td&gt;
&lt;td&gt;Support&lt;/td&gt;
&lt;td&gt;7&lt;/td&gt;
&lt;td&gt;Danielle McLeod&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That's most of what you need to know about SQL Joins. Hopefully, you found this information helpful. Feel free to share it with anyone who's having a hard time with Joins, and keep learning! 😊&lt;/p&gt;

&lt;p&gt;If you have any questions, please leave them in the comments section below.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Thanks for reading!&lt;/strong&gt;&lt;/p&gt;

</description>
      <category>sql</category>
      <category>datascience</category>
      <category>beginners</category>
      <category>dataengineering</category>
    </item>
    <item>
      <title>Turning Messy Data into Business Action</title>
      <dc:creator>Gathuru_M</dc:creator>
      <pubDate>Mon, 09 Feb 2026 02:07:28 +0000</pubDate>
      <link>https://dev.to/gathurum/turning-messy-data-into-business-action-3jj8</link>
      <guid>https://dev.to/gathurum/turning-messy-data-into-business-action-3jj8</guid>
      <description>&lt;h2&gt;
  
  
  How Analysts Translate Messy Data, DAX, and Dashboards into Action Using Power BI
&lt;/h2&gt;

&lt;p&gt;As a data analyst, whether in the corporate world or working on a personal project, you will find that data analysis is often driven by the &lt;strong&gt;need to solve a problem&lt;/strong&gt; or wanting an accurate, insightful view of data to make a better decision.&lt;/p&gt;

&lt;h3&gt;
  
  
  1. Define the Objective
&lt;/h3&gt;

&lt;p&gt;First, we need to come up with the objective of our analysis.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Ask Questions:&lt;/strong&gt; Write down the specific questions you want answers to from the analysis.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Theme Your Dashboard:&lt;/strong&gt; If you need a dashboard, design one that maintains a clear theme. Don’t just show every kind of data to the user—focus on what matters.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  2. Gathering the Data
&lt;/h3&gt;

&lt;p&gt;The data we need is often scattered. You might need to:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Scrape it online.&lt;/li&gt;
&lt;li&gt;Use data from survey results.&lt;/li&gt;
&lt;li&gt;Combine several Excel sheets.&lt;/li&gt;
&lt;li&gt;Fetch data from a database.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;So, how do analysts translate this &lt;strong&gt;Messy Data&lt;/strong&gt; into action? &lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Power BI&lt;/strong&gt; allows you to access data from various sources, load it, transform it, and analyze it—all in the same workspace.&lt;/p&gt;

&lt;p&gt;In Power BI, we follow the &lt;strong&gt;ELT approach&lt;/strong&gt; (Extract, Load, Transform).&lt;/p&gt;




&lt;h2&gt;
  
  
  The ELT Process
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Extract and Load
&lt;/h3&gt;

&lt;p&gt;First, we start by &lt;strong&gt;Extracting&lt;/strong&gt; the required data. &lt;br&gt;
When you launch Power BI, you get a prompt to select a data source. Simply select where your data lies and import it into Power BI by &lt;strong&gt;Loading&lt;/strong&gt; it there.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fyvo322w5lt36jliquox1.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fyvo322w5lt36jliquox1.png" alt="Power BI Data Sources" width="800" height="176"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Transform
&lt;/h3&gt;

&lt;p&gt;Once all your data is loaded, the next step is &lt;strong&gt;Cleaning&lt;/strong&gt;. You want to remove all inconsistencies in your data by ensuring:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Correct Data Types:&lt;/strong&gt; All fields (Dates, Decimals, Text) must be set correctly.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Handle Missing Values:&lt;/strong&gt; Deal with nulls or blanks by either removing rows or filling them where appropriate.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Splitting/Merging:&lt;/strong&gt; Split or merge columns where necessary (e.g., splitting an "Address" column to get separate "Location" and "Street" fields).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Standardization:&lt;/strong&gt; Ensure consistency (e.g., making sure all currency data is in either Dollars or Shillings across the entire dataset).&lt;/li&gt;
&lt;/ol&gt;




&lt;h2&gt;
  
  
  Data Modelling: The Integration Step
&lt;/h2&gt;

&lt;p&gt;After the data is clean and transformed, the best way to integrate it is through &lt;strong&gt;Data Modelling&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;As discussed in my previous article &lt;a href="https://dev.to/gathurum/how-to-implement-data-modelling-in-power-bi-4p2h"&gt;Link&lt;/a&gt; , data modelling helps you create a clear structure and establish necessary relationships before you begin any plotting or calculations. This involves creating a proper &lt;strong&gt;Star Schema&lt;/strong&gt;, consisting of a Fact table and Dimension tables.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; Always make sure to click &lt;strong&gt;"Close &amp;amp; Apply"&lt;/strong&gt; to save all changes made in the Power Query Editor before you start building your reports.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  DAX
&lt;/h2&gt;

&lt;p&gt;Once your data model is set up, you might find that the raw data doesn't provide all the answers immediately. This is where &lt;strong&gt;DAX (Data Analysis Expressions)&lt;/strong&gt; comes in. &lt;br&gt;
DAX is the formula language of Power BI, allowing us to calculate and generate new information in two primary ways:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;&lt;em&gt;Measures:&lt;/em&gt;&lt;/strong&gt; Use DAX to create summary aggregations. These calculate values on the fly, such as Total Sales, Year-over-Year Growth, Profit Margin, or Average Selling Price.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;&lt;em&gt;Calculated Columns:&lt;/em&gt;&lt;/strong&gt; Generate new columns within your tables to provide more granular information. For example, you could create a "Profit Status" column that labels each row as "Profitable" or "Loss" based on a calculation.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;DAX respects your model’s relationships. When you create a measure and plot it in a visual, it automatically reacts to the "Filter Context." This means the numbers will automatically filter and update based on the Dimensions (such as Date, Category, or Location) you use in your report.&lt;/p&gt;

&lt;h2&gt;
  
  
  Visualizing for Answers
&lt;/h2&gt;

&lt;p&gt;A well-modeled dataset allows us to answer any question we might have. The best way to do this is to plot &lt;strong&gt;Visuals&lt;/strong&gt;. Why?&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Spot Patterns Fast:&lt;/strong&gt; Visuals help you and your audience see trends or patterns immediately.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Tell a Story:&lt;/strong&gt; After answering your initial questions, creating a dashboard tells a story about the problem you want to solve.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;A great dashboard should:&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Describe the problem clearly.&lt;/li&gt;
&lt;li&gt;Illustrate insights found in the data.&lt;/li&gt;
&lt;li&gt;Suggest a potential solution to the problem.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;By following this flow, your analysis remains relevant and provides genuine value, rather than just showing the dashboard user what they already know.&lt;/p&gt;




</description>
      <category>analytics</category>
      <category>beginners</category>
      <category>ai</category>
      <category>dataengineering</category>
    </item>
    <item>
      <title>How to Implement Data Modelling in Power BI</title>
      <dc:creator>Gathuru_M</dc:creator>
      <pubDate>Mon, 02 Feb 2026 21:25:14 +0000</pubDate>
      <link>https://dev.to/gathurum/how-to-implement-data-modelling-in-power-bi-4p2h</link>
      <guid>https://dev.to/gathurum/how-to-implement-data-modelling-in-power-bi-4p2h</guid>
      <description>&lt;p&gt;In this article we will explore the fundamental concepts of data modelling and specifically how to implement data modelling within Power BI for effective data analysis. &lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Data Modelling&lt;/strong&gt; is simply structuring or setting up your data in tables, relationships, and access to data for analysis scenarios.&lt;/p&gt;

&lt;p&gt;Often you'll find data distributed in several sheets or systems since, that is how it's maintained. If lucky your data could be in a neat setup &lt;em&gt;(Data Warehouse)&lt;/em&gt; where most of the structuring has already been done.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;em&gt;Case Scenario&lt;/em&gt;:
&lt;/h3&gt;

&lt;p&gt;Here is some sample data, to help make sense of this concept:&lt;br&gt;
&lt;a href="https://www.youtube.com/redirect?event=video_description&amp;amp;redir_token=QUFFLUhqbllqX1p4bnBVbUhVTVozSTJvMEFxZWV1azRWZ3xBQ3Jtc0ttdG53eFJac2w2am9Lb3lDVWlrMm10UFVkRkgyZFRHX25OaU9RcmlmUmpIRHlpZ05WcjR0WjlnUWN6S1BjcWZzQXRoX0ItbjJ3U3FKWFY5NndzQmQ0VkZNcjJrNGY4SzFOek5ROUFxQTRFdS1iUWROaw&amp;amp;q=https%3A%2F%2Fchandoo.org%2Fwp%2Fwp-content%2Fuploads%2F2024%2F10%2Fsample-chocolate-sales-data-all.xlsx&amp;amp;v=4ePNrdxWtY0" rel="noopener noreferrer"&gt;Sample Chocolate Sales Data&lt;/a&gt;&lt;/p&gt;




&lt;h3&gt;
  
  
  Understanding the Tables
&lt;/h3&gt;

&lt;p&gt;In our case, we have 5 tables.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Fact Table:&lt;/strong&gt; One table that contains pointers to other tables, but has no specific information. In our case, the &lt;em&gt;shipments table&lt;/em&gt;.&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Dimension Tables:&lt;/strong&gt; Each explains one dimension of data. What is happening from the perspective of each dimension/entity. These will be the rest of our tables.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Calendar Table:&lt;/strong&gt; A special Dimension table found in most data that holds the time component of analysis, i.e., [Timeseries, Forecasting or Trend analysis].&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;




&lt;h3&gt;
  
  
  Doing this in Power BI
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Load Data:&lt;/strong&gt; Load the data in Power BI by selecting Excel Workbook as the data source. &lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fidjk4u8k5m4pr41v418v.png" alt="Loading Data to Power BI" width="800" height="609"&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Automatic Modelling:&lt;/strong&gt; Power BI automatically tries to match the tables and create a model; you can see this in the &lt;strong&gt;Model View&lt;/strong&gt;. &lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbrrodf06j67g5i12v5y4.png" alt="Data Modelling in Power BI" width="800" height="487"&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Manual Relationship Setup:&lt;/strong&gt; 🤔 However, not all the data was modelled automatically. To finish up, drag the &lt;strong&gt;'Shipdate'&lt;/strong&gt; field from the Shipment table onto &lt;strong&gt;'cal_date'&lt;/strong&gt; in the Calendar table to create a new relationship and complete the look.&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fs48l55io96b3356hagye.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fs48l55io96b3356hagye.png" alt="Relationships in Power BI" width="765" height="649"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; Notice that the schema now looks &lt;em&gt;Star-shaped&lt;/em&gt;. This kind of model with one fact table and multiple dimension tables is called a &lt;strong&gt;Star Schema&lt;/strong&gt;. The process of setting up this structure is &lt;strong&gt;Data Modelling&lt;/strong&gt;.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Manage Relationships:&lt;/strong&gt; You can view and edit all connections by clicking on &lt;strong&gt;Manage Relationships&lt;/strong&gt;. This menu shows the relationship status, specifically whether they are &lt;strong&gt;Active&lt;/strong&gt; or &lt;strong&gt;Inactive&lt;/strong&gt;.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;




&lt;h3&gt;
  
  
  Table Characteristics
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Relationship Design:&lt;/strong&gt; The Products table shows each product in a row with more information from the Shipments table indicating a Many to One Relationship.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Fact Table Design:&lt;/strong&gt; The shipments table has more records as compared to the dimension tables in our data. &lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Therefore, Fact tables are engineered to host many records and may also have fewer columns as compared to dimension tables.&lt;/p&gt;

&lt;h3&gt;
  
  
  Conclusion
&lt;/h3&gt;

&lt;p&gt;Effective data modelling is the backbone of any powerful analysis. By transforming messy, multi-sheet data into a structured Star Schema, everything will feel fast and intuitive; when it's wrong, you spend more time fixing "broken" numbers than actually analyzing data.&lt;/p&gt;

</description>
      <category>dataengineering</category>
      <category>datascience</category>
      <category>bigdata</category>
      <category>beginners</category>
    </item>
    <item>
      <title>Introduction to Linux for Data Engineers, Including Practical Use of Vi and Nano with Examples</title>
      <dc:creator>Gathuru_M</dc:creator>
      <pubDate>Mon, 26 Jan 2026 05:08:41 +0000</pubDate>
      <link>https://dev.to/gathurum/introduction-to-linux-for-data-engineers-including-practical-use-of-vi-and-nano-with-examples-2ck0</link>
      <guid>https://dev.to/gathurum/introduction-to-linux-for-data-engineers-including-practical-use-of-vi-and-nano-with-examples-2ck0</guid>
      <description>&lt;h2&gt;
  
  
  Why Every Data Engineer Needs to Make Friends with Linux
&lt;/h2&gt;

&lt;p&gt;Diving into Data Engineering, you’re probably well into writing some Python scripts or SQL queries. But there's mentions of "The Server" or "Production," and suddenly everyone is talking about Linux.&lt;/p&gt;

&lt;p&gt;If you’re wondering why you can’t just keep using Windows or macOS for everything, you’re not alone. Let’s break down &lt;strong&gt;why Linux&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;If you look at regular people browsing the web, about 70-75% of them use Windows. It’s comfortable, right? But in the "under-the-hood" world, where we have servers that actually run the internet and process massive datasets, Linux owns about 90% to 100% of that space.&lt;/p&gt;

&lt;p&gt;Knowing Linux isn't just a "nice to have"; it’s your secret weapon. If you can navigate a Linux terminal, you instantly become more hireable because you can actually manage the tools you build.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Here is why Linux is a big deal:&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;It’s where your work actually lives: You might write code on your laptop, but your data pipelines (the stuff that moves and cleans data) will almost certainly live on a Linux server. If it’s in "Production," it’s on Linux.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Security: In data engineering, you’re handling sensitive information — names, emails, credit card digits. Linux is built like a fortress. It handles permissions and privacy way better than the standard consumer OS.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;The Command Line makes you a pro: Typing commands might feel like you're in an old hacker movie, but it’s actually way faster than clicking through menus. Mastering the command line makes you faster, more confident, and—honestly—just a better engineer.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;It’s built for speed: Linux is "lightweight." It doesn’t waste energy on background apps you don't need. This means your data pipelines run faster and more efficiently.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;




&lt;h2&gt;
  
  
  Part 2: Getting Set Up (The WSL2 Guide)
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;On Linux/Mac&lt;/strong&gt;: You’re already set! Just search for "Terminal" in your apps. &lt;/p&gt;

&lt;p&gt;&lt;strong&gt;On Windows&lt;/strong&gt;: The best way to do this is through WSL2 (Windows Subsystem for Linux).&lt;/p&gt;

&lt;h3&gt;
  
  
  How to install WSL2:
&lt;/h3&gt;

&lt;p&gt;&lt;em&gt;Enable the Feature:&lt;/em&gt; Go to your Windows Search Bar and Type 'Turn on Windows Features On or Off'.&lt;br&gt;
Find "Windows Subsystem for Linux" in the list, check the box, and click OK.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1e1nyel0zvl68ppsongd.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1e1nyel0zvl68ppsongd.png" alt=" " width="693" height="609"&gt;&lt;/a&gt;&lt;br&gt;
Open PowerShell as Administrator and type: &lt;br&gt;
&lt;code&gt;wsl --install&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Restart:&lt;/em&gt; You must restart your computer for the changes to take effect.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Finalize:&lt;/em&gt; After rebooting, a terminal will open. Follow the prompts to create a Username and Password. (Note: The password won't show characters as you type—this is normal!).&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9m6cz9yh8dt61a1dqib0.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9m6cz9yh8dt61a1dqib0.png" alt=" " width="800" height="196"&gt;&lt;/a&gt;&lt;/p&gt;


&lt;h2&gt;
  
  
  Part 3: The Essential Commands
&lt;/h2&gt;

&lt;p&gt;We first need to know how to navigate this environment. Here are some basic commands you can try out.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;pwd&lt;/code&gt; (Print Working Directory):&lt;/strong&gt; Tells you exactly which folder you are currently in.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;ls&lt;/code&gt; (List):&lt;/strong&gt; Shows you what’s inside your current folder.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;cd&lt;/code&gt; (Change Directory):&lt;/strong&gt; Your "walking" command. Use &lt;code&gt;cd ..&lt;/code&gt; to go back one folder.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;mkdir folder_name&lt;/code&gt; (Make Directory):&lt;/strong&gt; Creates a new folder.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;touch filename&lt;/code&gt;:&lt;/strong&gt; Creates a brand new, empty file.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;rm filename&lt;/code&gt; (Remove):&lt;/strong&gt; Deletes a file. &lt;strong&gt;Be careful:&lt;/strong&gt; there is no "Undo."&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;
  
  
  Pro Productivity Tips
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Tab Completion:&lt;/strong&gt; Start typing a name and hit &lt;code&gt;Tab&lt;/code&gt;. Linux will finish the word for you.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;The Up Arrow:&lt;/strong&gt; Hit the up arrow to see commands you typed previously.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;clear&lt;/code&gt;:&lt;/strong&gt; Wipes the screen clean.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;Ctrl + C&lt;/code&gt;:&lt;/strong&gt; The "Emergency Stop" button if a command is stuck.&lt;/li&gt;
&lt;/ul&gt;


&lt;h2&gt;
  
  
  Part 4: Real-World Practice
&lt;/h2&gt;

&lt;p&gt;We’re going to create a workspace and download a real dataset (the famous Iris flower dataset).&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Create a folder and redirect to it&lt;/span&gt;
&lt;span class="nb"&gt;mkdir &lt;/span&gt;linux_practice
&lt;span class="nb"&gt;cd &lt;/span&gt;linux_practice
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Create your own file
&lt;/h3&gt;

&lt;p&gt;Before we download anything, let's create a small file to store settings or write small notes.&lt;br&gt;
&lt;code&gt;touch my_notes.txt&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;If you run &lt;code&gt;ls&lt;/code&gt;, you’ll see your new empty file sitting there.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcohssn19djuh2jjq7lok.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcohssn19djuh2jjq7lok.png" alt=" " width="800" height="289"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Download the data using 'wget'
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;wget https://raw.githubusercontent.com/dataprofessor/data/master/iris.csv
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If you run &lt;code&gt;ls&lt;/code&gt; again, you’ll see your new csv file added to the list.&lt;/p&gt;

&lt;h3&gt;
  
  
  Peek at the top 10 rows
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;head -n 10 iris.csv
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fsgb224ic7e9mplyr9tzl.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fsgb224ic7e9mplyr9tzl.png" alt=" " width="800" height="311"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  Part 5: Mastering the Editors (Nano vs. Vim)
&lt;/h2&gt;

&lt;p&gt;On a remote server, you don't have VS Code. You have to edit files inside the terminal.&lt;/p&gt;

&lt;h3&gt;
  
  
  1. Nano:
&lt;/h3&gt;

&lt;p&gt;Nano is like a very basic version of Notepad that lives in your terminal.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;How to use it:&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Type &lt;code&gt;nano iris.csv&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Editing:&lt;/strong&gt; Use your arrow keys to move the cursor and just start typing.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;The Bottom Menu:&lt;/strong&gt; See those &lt;code&gt;^&lt;/code&gt; symbols? That means the &lt;strong&gt;Ctrl&lt;/strong&gt; key.&lt;/li&gt;
&lt;li&gt;To Save: Press &lt;code&gt;Ctrl + O&lt;/code&gt; (Write Out), then hit &lt;strong&gt;Enter&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;To Exit: Press &lt;code&gt;Ctrl + X&lt;/code&gt;.&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  2. Vim:
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;How to use it:&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Type &lt;code&gt;vim iris.csv&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Normal Mode:&lt;/strong&gt; You start here. You &lt;em&gt;cannot&lt;/em&gt; type text yet. This mode is for moving around and running commands.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Insert Mode:&lt;/strong&gt; To start typing, press the letter &lt;strong&gt;&lt;code&gt;i&lt;/code&gt;&lt;/strong&gt;. You’ll see &lt;code&gt;-- INSERT --&lt;/code&gt; at the bottom. Now you can edit the file.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Saving your work:&lt;/strong&gt; * First, hit the &lt;strong&gt;&lt;code&gt;Esc&lt;/code&gt;&lt;/strong&gt; key to leave "Insert Mode" and go back to "Normal Mode."&lt;/li&gt;
&lt;li&gt;Type &lt;strong&gt;&lt;code&gt;:wq&lt;/code&gt;&lt;/strong&gt; and hit &lt;strong&gt;Enter&lt;/strong&gt;. (&lt;code&gt;w&lt;/code&gt; means save/write, &lt;code&gt;q&lt;/code&gt; means quit.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;The Emergency Exit:&lt;/strong&gt; If you messed up and want to leave &lt;em&gt;without&lt;/em&gt; saving, hit &lt;code&gt;Esc&lt;/code&gt;, then type &lt;strong&gt;&lt;code&gt;:q!&lt;/code&gt;&lt;/strong&gt; and hit &lt;strong&gt;Enter&lt;/strong&gt;.&lt;/li&gt;
&lt;/ol&gt;




&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;You navigated a server, pulled data, and edited files using the terminal.👏&lt;/p&gt;

&lt;p&gt;You can also use a "cheat sheet" to guide you for the first few weeks. Happy coding!&lt;/p&gt;

</description>
      <category>linux</category>
      <category>beginners</category>
      <category>dataengineering</category>
      <category>datascience</category>
    </item>
    <item>
      <title>Want to Learn Git and GitHub? A Step-by-Step Guide to Version Control</title>
      <dc:creator>Gathuru_M</dc:creator>
      <pubDate>Sun, 18 Jan 2026 13:09:29 +0000</pubDate>
      <link>https://dev.to/gathurum/want-to-learn-git-and-github-a-step-by-step-guide-to-version-control-dko</link>
      <guid>https://dev.to/gathurum/want-to-learn-git-and-github-a-step-by-step-guide-to-version-control-dko</guid>
      <description>&lt;p&gt;&lt;em&gt;If you’re like me, you probably heard people talking about "pushing to main" or "merging branches" and thought, “What on earth are they talking about?” &lt;br&gt;
It feels like everyone knows this secret language. But honestly? Once you get the hang of it, Git and GitHub are just a fancy way of making sure you don't accidentally delete your entire project.&lt;/em&gt;😊&lt;/p&gt;

&lt;p&gt;Let’s break it down.&lt;/p&gt;

&lt;p&gt;This guide will walk you through the fundamentals of Git and GitHub. You'll learn how to install them, connect them, and use them to manage your code effectively.&lt;/p&gt;
&lt;h2&gt;
  
  
  What is Git?
&lt;/h2&gt;

&lt;p&gt;Git is a &lt;strong&gt;version control system&lt;/strong&gt;. Think of it as a smart history book for your code. Every time you make changes to your project, Git can record them. This means you can:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Track modifications:&lt;/strong&gt; See exactly what changes were made, when, and by whom.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Revert:&lt;/strong&gt; Easily go back to an earlier state of your code if something goes wrong.&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;
  
  
  What is GitHub?
&lt;/h2&gt;

&lt;p&gt;GitHub is a &lt;strong&gt;web-based hosting service for Git repositories&lt;/strong&gt;, like the cloud (think Google Drive).&lt;br&gt;
While Git is the tool you use on your computer to manage versions, GitHub is like a cloud storage and collaboration platform for your Git projects. It allows you to:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Store:&lt;/strong&gt; Keep a backup if your laptop decides to take a permanent nap.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Share:&lt;/strong&gt; Make your projects accessible to others.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Collaborate:&lt;/strong&gt; Provide a central hub for others to work on a project with you without overwriting each other's work.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Showcase:&lt;/strong&gt; Create a portfolio of your projects.&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;
  
  
  Step 1: Installing Git Bash
&lt;/h2&gt;

&lt;p&gt;Git Bash is a command-line interface (CLI) for Windows where you type commands to talk to Git.&lt;/p&gt;
&lt;h3&gt;
  
  
  For Windows:
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt; &lt;strong&gt;Download Git Bash:&lt;/strong&gt; Go to the official Git website: &lt;a href="https://git-scm.com/download/win" rel="noopener noreferrer"&gt;https://git-scm.com/download/win&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;Run the installer:&lt;/strong&gt; Follow the on-screen instructions. You can generally stick with the default options, but here is something to keep in mind:

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;"Choosing the default editor used by Git":&lt;/strong&gt; You can keep the default (Vim) or choose another editor you're familiar with (like VS Code, Notepad++, etc.).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;"Adjusting the name of the initial branch":&lt;/strong&gt; Choose "Let Git decide" (it usually defaults to &lt;code&gt;main&lt;/code&gt; now).&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;Verify the installation:&lt;/strong&gt; Open the Git Bash application, type &lt;code&gt;git --version&lt;/code&gt; and press Enter. You should see the Git version number.&lt;/li&gt;
&lt;/ol&gt;
&lt;h2&gt;
  
  
  Step 2: Configuring Git
&lt;/h2&gt;

&lt;p&gt;Before you start using Git, you need to tell it who you are.&lt;br&gt;
This information will be attached to your commits to track who is making changes.&lt;/p&gt;

&lt;p&gt;While still on Git Bash and run these two commands, replacing the placeholders with your name and email:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;git config &lt;span class="nt"&gt;--global&lt;/span&gt; user.name &lt;span class="s2"&gt;"Your Name"&lt;/span&gt;
git config &lt;span class="nt"&gt;--global&lt;/span&gt; user.email &lt;span class="s2"&gt;"your.email@example.com"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You can verify your settings with:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;git config &lt;span class="nt"&gt;--global&lt;/span&gt; &lt;span class="nt"&gt;--list&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Step 3: Connecting the Two (The Handshake)
&lt;/h2&gt;

&lt;p&gt;You will first need a GitHub account. &lt;br&gt;
If you don't have one already, head over to &lt;a href="https://github.com/join" rel="noopener noreferrer"&gt;https://github.com/join&lt;/a&gt; and sign up.&lt;br&gt;
Follow the prompts to create your username, password, and verify your email.&lt;/p&gt;

&lt;p&gt;Now, to make your computer talk to GitHub without asking for your password every five seconds, we use something called an &lt;strong&gt;SSH Key&lt;/strong&gt;.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Generate a new SSH key:&lt;/strong&gt; In Git Bash, type:
&lt;/li&gt;
&lt;/ul&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;ssh-keygen &lt;span class="nt"&gt;-t&lt;/span&gt; ed25519 &lt;span class="nt"&gt;-C&lt;/span&gt; &lt;span class="s2"&gt;"your@email.com"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;Just keep hitting Enter until it stops.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Start the SSH agent:&lt;/strong&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;eval&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;ssh-agent &lt;span class="nt"&gt;-s&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; 
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Add your SSH key to the SSH agent:&lt;/strong&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;ssh-add ~/.ssh/id_ed25519
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Copy your public SSH key:&lt;/strong&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;cat&lt;/span&gt; ~/.ssh/id_ed25519.pub
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;A bunch of random letters and numbers will pop up. Highlight and copy them.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Pass this to GitHub:&lt;/strong&gt; 
Go to your GitHub settings, look for &lt;strong&gt;SSH and GPG keys&lt;/strong&gt;, click &lt;strong&gt;New SSH Key&lt;/strong&gt;, and paste that wall of text in there.&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;
  
  
  Step 4: Your First Project (The Fun Part)
&lt;/h2&gt;

&lt;p&gt;Let’s say you have a folder on your computer called &lt;code&gt;my-cool-app&lt;/code&gt;.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Tell Git to watch this folder:&lt;/strong&gt; Inside Git Bash, go to that folder and type
&lt;/li&gt;
&lt;/ol&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;git init
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Make a file:&lt;/strong&gt; Create a simple text file called &lt;code&gt;readme.txt&lt;/code&gt; and write "Hello world" inside it.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;The "Add and Commit":&lt;/strong&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;git add &lt;span class="nb"&gt;.&lt;/span&gt;
git commit &lt;span class="nt"&gt;-m&lt;/span&gt; &lt;span class="s2"&gt;"My first save point"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
&lt;p&gt;&lt;br&gt;
 (This tells Git, "Hey, look at all the changes I just made!").&lt;/p&gt;
&lt;h2&gt;
  
  
  Step 5: Sending it to the Cloud (Pushing)
&lt;/h2&gt;

&lt;p&gt;Now, go to GitHub, create a "New Repository," and name it whatever you want. GitHub will give you a link that looks like &lt;code&gt;git@github.com:yourname/your-repo.git&lt;/code&gt;.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Link them up:&lt;/strong&gt; Type,
&lt;/li&gt;
&lt;/ul&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;git remote add origin &lt;span class="o"&gt;[&lt;/span&gt;PASTE YOUR LINK HERE]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Send it!:&lt;/strong&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;git push &lt;span class="nt"&gt;-u&lt;/span&gt; origin main
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;Refresh your GitHub page, and boom! Your code is on the internet.&lt;/p&gt;
&lt;h2&gt;
  
  
  Step 6: Pulling it Back
&lt;/h2&gt;

&lt;p&gt;If you ever go to a different computer, or if a friend changes your code on GitHub, you need to bring those changes back to your machine.&lt;/p&gt;

&lt;p&gt;Just type:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;git pull origin main
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;It’s like syncing your phone—it grabs the newest version from the cloud and puts it on your computer.&lt;/p&gt;

&lt;h2&gt;
  
  
  Wrapping Up
&lt;/h2&gt;

&lt;p&gt;That’s pretty much the "Big Three":&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Add&lt;/strong&gt; (Pick what you want to save)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Commit&lt;/strong&gt; (Save it)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Push&lt;/strong&gt; (Send it to GitHub)&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Don't worry if you forget the commands at first. I still have to Google them half the time. &lt;br&gt;
Just keep playing around with them, and eventually, it’ll feel like second nature. &lt;br&gt;
Happy coding!🥳&lt;/p&gt;

</description>
      <category>beginners</category>
      <category>github</category>
      <category>git</category>
      <category>datascience</category>
    </item>
    <item>
      <title>SQL Joins Explained: Case Example</title>
      <dc:creator>Gathuru_M</dc:creator>
      <pubDate>Wed, 13 Sep 2023 01:55:44 +0000</pubDate>
      <link>https://dev.to/gathurum/sql-joins-explained-case-example-2kgd</link>
      <guid>https://dev.to/gathurum/sql-joins-explained-case-example-2kgd</guid>
      <description>&lt;p&gt;Structured Query Language(SQL) is a computer language for storing, manipulating, and retrieving data stored in a relational database.&lt;/p&gt;

&lt;p&gt;SQL Joins are like clauses used to combine records from two or more tables in a database, just as the name(join) means.&lt;/p&gt;

&lt;p&gt;In this article, you will learn, using a case example, the different types of joins, when, and how they are used. Be sure to check for “&lt;strong&gt;bonus joins”&lt;/strong&gt; included at the end of the article.&lt;/p&gt;

&lt;h2&gt;
  
  
  Example Data
&lt;/h2&gt;

&lt;p&gt;We will use data from these 2 tables to show various ways to display employees from John Smith's company.&lt;/p&gt;

&lt;h3&gt;
  
  
  Departments Table
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Department_Id&lt;/th&gt;
&lt;th&gt;Department_Name&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;Executive&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;HR&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;td&gt;Sales&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;4&lt;/td&gt;
&lt;td&gt;Support&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;5&lt;/td&gt;
&lt;td&gt;Sales&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;6&lt;/td&gt;
&lt;td&gt;Research&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  Employees Table
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Employee_Id&lt;/th&gt;
&lt;th&gt;Full_Name&lt;/th&gt;
&lt;th&gt;Department_Id&lt;/th&gt;
&lt;th&gt;Job_Role&lt;/th&gt;
&lt;th&gt;Manager_Id&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;John Smith&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;CEO&lt;/td&gt;
&lt;td&gt;Null&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;Sarah Goodes&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;CFO&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;td&gt;Wayne Ablett&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;CIO&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;4&lt;/td&gt;
&lt;td&gt;Michelle Carey&lt;/td&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;HR Manager&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;5&lt;/td&gt;
&lt;td&gt;Chris Matthews&lt;/td&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;td&gt;Sales Manager&lt;/td&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;6&lt;/td&gt;
&lt;td&gt;Andrew Judd&lt;/td&gt;
&lt;td&gt;4&lt;/td&gt;
&lt;td&gt;Development Manager&lt;/td&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;7&lt;/td&gt;
&lt;td&gt;Danielle McLeod&lt;/td&gt;
&lt;td&gt;5&lt;/td&gt;
&lt;td&gt;Support Manager&lt;/td&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;8&lt;/td&gt;
&lt;td&gt;Matthew Swan&lt;/td&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;HR Representative&lt;/td&gt;
&lt;td&gt;4&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;9&lt;/td&gt;
&lt;td&gt;Stephanie Richardson&lt;/td&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;Salesperson&lt;/td&gt;
&lt;td&gt;5&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;10&lt;/td&gt;
&lt;td&gt;Tony Grant&lt;/td&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;td&gt;Salesperson&lt;/td&gt;
&lt;td&gt;5&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;11&lt;/td&gt;
&lt;td&gt;Jenna Lockett&lt;/td&gt;
&lt;td&gt;4&lt;/td&gt;
&lt;td&gt;Front-End Developer&lt;/td&gt;
&lt;td&gt;6&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;12&lt;/td&gt;
&lt;td&gt;Michael Dunstall&lt;/td&gt;
&lt;td&gt;4&lt;/td&gt;
&lt;td&gt;Back-End Developer&lt;/td&gt;
&lt;td&gt;6&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;13&lt;/td&gt;
&lt;td&gt;Jane Voss&lt;/td&gt;
&lt;td&gt;4&lt;/td&gt;
&lt;td&gt;Back-End Developer&lt;/td&gt;
&lt;td&gt;6&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;14&lt;/td&gt;
&lt;td&gt;Anthony Hird&lt;/td&gt;
&lt;td&gt;Null&lt;/td&gt;
&lt;td&gt;Support&lt;/td&gt;
&lt;td&gt;7&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;15&lt;/td&gt;
&lt;td&gt;Natalie Rocca&lt;/td&gt;
&lt;td&gt;5&lt;/td&gt;
&lt;td&gt;Support&lt;/td&gt;
&lt;td&gt;7&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The code below shows the syntax for writing a JOIN:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;columns&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;table1&lt;/span&gt;
&lt;span class="k"&gt;JOIN&lt;/span&gt; &lt;span class="n"&gt;table2&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;table1&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;column1&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;table2&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;column1&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Types of Joins&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;There are four major types of joins in SQL, as listed below:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Inner Join&lt;/li&gt;
&lt;li&gt;Left Join&lt;/li&gt;
&lt;li&gt;Right Join&lt;/li&gt;
&lt;li&gt;Full Join&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;strong&gt;&lt;em&gt;Inner Join&lt;/em&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;It creates a result table that displays information &lt;strong&gt;common&lt;/strong&gt; between two tables based on a shared piece of information. &lt;/p&gt;

&lt;p&gt;It is the most important and frequently used of the joins.&lt;/p&gt;

&lt;p&gt;In this case, we could use it to display employees from the departments listed in the first table.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;employees&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Full_Name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;employees&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Job_Role&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;departments&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Department_Name&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;departments&lt;/span&gt;
&lt;span class="k"&gt;INNER&lt;/span&gt; &lt;span class="k"&gt;JOIN&lt;/span&gt; &lt;span class="n"&gt;employees&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;departments&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Department_Id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;employees&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Department_Id&lt;/span&gt;
 &lt;span class="c1"&gt;-- You can replace the keyword INNER JOIN with JOIN&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You should get the results below&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;Result&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Full_Name&lt;/th&gt;
&lt;th&gt;Job_Role&lt;/th&gt;
&lt;th&gt;Department_Name&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;John Smith&lt;/td&gt;
&lt;td&gt;CEO&lt;/td&gt;
&lt;td&gt;Executive&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Sarah Goodes&lt;/td&gt;
&lt;td&gt;CFO&lt;/td&gt;
&lt;td&gt;Executive&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Wayne Ablett&lt;/td&gt;
&lt;td&gt;CIO&lt;/td&gt;
&lt;td&gt;Executive&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Michelle Carey&lt;/td&gt;
&lt;td&gt;HR Manager&lt;/td&gt;
&lt;td&gt;HR&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Chris Matthews&lt;/td&gt;
&lt;td&gt;Sales Manager&lt;/td&gt;
&lt;td&gt;Sales&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Andrew Judd&lt;/td&gt;
&lt;td&gt;Development Manager&lt;/td&gt;
&lt;td&gt;Development&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Danielle McLeod&lt;/td&gt;
&lt;td&gt;Support Manager&lt;/td&gt;
&lt;td&gt;Support&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Matthew Swan&lt;/td&gt;
&lt;td&gt;HR Representative&lt;/td&gt;
&lt;td&gt;HR&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Stephanie Richardson&lt;/td&gt;
&lt;td&gt;Salesperson&lt;/td&gt;
&lt;td&gt;HR&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Tony Grant&lt;/td&gt;
&lt;td&gt;Salesperson&lt;/td&gt;
&lt;td&gt;Sales&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Jenna Lockett&lt;/td&gt;
&lt;td&gt;Front-End Developer&lt;/td&gt;
&lt;td&gt;Development&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Michael Dunstall&lt;/td&gt;
&lt;td&gt;Back-End Developer&lt;/td&gt;
&lt;td&gt;Development&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Jane Voss&lt;/td&gt;
&lt;td&gt;Back-End Developer&lt;/td&gt;
&lt;td&gt;Development&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Natalie Rocca&lt;/td&gt;
&lt;td&gt;Support&lt;/td&gt;
&lt;td&gt;Support&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;&lt;em&gt;Left Join&lt;/em&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;A &lt;em&gt;LEFT JOIN&lt;/em&gt; returns &lt;strong&gt;all&lt;/strong&gt; rows from the left table, plus matched values from the right table or NULL in case of no match.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Note: The left table refers to the table that appears before the "LEFT JOIN" keywords in your SQL query. Same case applies when using RIGHT JOIN&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;We could use left join to retrieve a list of all employees along with their department names. If an employee doesn't belong to a department, display NULL&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;employees&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Full_Name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;departments&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Department_Name&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;employees&lt;/span&gt; 
&lt;span class="k"&gt;LEFT&lt;/span&gt; &lt;span class="k"&gt;JOIN&lt;/span&gt; &lt;span class="n"&gt;departments&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;departments&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Department_Id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;employees&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Department_Id&lt;/span&gt;
&lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;employees&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Full_Name&lt;/span&gt; &lt;span class="k"&gt;ASC&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You should get the results below&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;Result&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Employee_Id&lt;/th&gt;
&lt;th&gt;Full_Name&lt;/th&gt;
&lt;th&gt;Department_Name&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;John Smith&lt;/td&gt;
&lt;td&gt;Executive&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;Sarah Goodes&lt;/td&gt;
&lt;td&gt;Executive&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;td&gt;Wayne Ablett&lt;/td&gt;
&lt;td&gt;Executive&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;4&lt;/td&gt;
&lt;td&gt;Michelle Carey&lt;/td&gt;
&lt;td&gt;HR&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;5&lt;/td&gt;
&lt;td&gt;Chris Matthews&lt;/td&gt;
&lt;td&gt;SALES&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;6&lt;/td&gt;
&lt;td&gt;Andrew Judd&lt;/td&gt;
&lt;td&gt;Development&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;7&lt;/td&gt;
&lt;td&gt;Danielle McLeod&lt;/td&gt;
&lt;td&gt;Support&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;8&lt;/td&gt;
&lt;td&gt;Matthew Swan&lt;/td&gt;
&lt;td&gt;HR&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;9&lt;/td&gt;
&lt;td&gt;Stephanie Richardson&lt;/td&gt;
&lt;td&gt;HR&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;10&lt;/td&gt;
&lt;td&gt;Tony Grant&lt;/td&gt;
&lt;td&gt;SALES&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;11&lt;/td&gt;
&lt;td&gt;Jenna Lockett&lt;/td&gt;
&lt;td&gt;Development&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;12&lt;/td&gt;
&lt;td&gt;Michael Dunstall&lt;/td&gt;
&lt;td&gt;Development&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;13&lt;/td&gt;
&lt;td&gt;Jane Voss&lt;/td&gt;
&lt;td&gt;Development&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;14&lt;/td&gt;
&lt;td&gt;Anthony Hird&lt;/td&gt;
&lt;td&gt;NULL&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;15&lt;/td&gt;
&lt;td&gt;Natalie Rocca&lt;/td&gt;
&lt;td&gt;Support&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;You can use the COALESCE() function to replace NULL values with more relatable user-defined values.&lt;/p&gt;

&lt;p&gt;i.e. If an employee doesn't belong to a department, display "No Department" instead.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;employees&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Employee_Id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;employees&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Full_Name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; 
&lt;span class="n"&gt;COALESCE&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;departments&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Department_Name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;"No Department"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;Department_Name&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;employees&lt;/span&gt; 
&lt;span class="k"&gt;LEFT&lt;/span&gt; &lt;span class="k"&gt;JOIN&lt;/span&gt; &lt;span class="n"&gt;departments&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;departments&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Department_Id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;employees&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Department_Id&lt;/span&gt;
&lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;employees&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Full_Name&lt;/span&gt; &lt;span class="k"&gt;ASC&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You should get the results below&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;Result&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Employee_Id&lt;/th&gt;
&lt;th&gt;Full_Name&lt;/th&gt;
&lt;th&gt;Department_Name&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;John Smith&lt;/td&gt;
&lt;td&gt;Executive&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;Sarah Goodes&lt;/td&gt;
&lt;td&gt;Executive&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;td&gt;Wayne Ablett&lt;/td&gt;
&lt;td&gt;Executive&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;4&lt;/td&gt;
&lt;td&gt;Michelle Carey&lt;/td&gt;
&lt;td&gt;HR&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;5&lt;/td&gt;
&lt;td&gt;Chris Matthews&lt;/td&gt;
&lt;td&gt;SALES&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;6&lt;/td&gt;
&lt;td&gt;Andrew Judd&lt;/td&gt;
&lt;td&gt;Development&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;7&lt;/td&gt;
&lt;td&gt;Danielle McLeod&lt;/td&gt;
&lt;td&gt;Support&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;8&lt;/td&gt;
&lt;td&gt;Matthew Swan&lt;/td&gt;
&lt;td&gt;HR&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;9&lt;/td&gt;
&lt;td&gt;Stephanie Richardson&lt;/td&gt;
&lt;td&gt;HR&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;10&lt;/td&gt;
&lt;td&gt;Tony Grant&lt;/td&gt;
&lt;td&gt;SALES&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;11&lt;/td&gt;
&lt;td&gt;Jenna Lockett&lt;/td&gt;
&lt;td&gt;Development&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;12&lt;/td&gt;
&lt;td&gt;Michael Dunstall&lt;/td&gt;
&lt;td&gt;Development&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;13&lt;/td&gt;
&lt;td&gt;Jane Voss&lt;/td&gt;
&lt;td&gt;Development&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;14&lt;/td&gt;
&lt;td&gt;Anthony Hird&lt;/td&gt;
&lt;td&gt;No Department&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;15&lt;/td&gt;
&lt;td&gt;Natalie Rocca&lt;/td&gt;
&lt;td&gt;Support&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;&lt;em&gt;Right Join&lt;/em&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The &lt;em&gt;SQL RIGHT JOIN&lt;/em&gt; returns all rows from the right table, even if there are no matches&lt;br&gt;
in the left table. Not so different from the LEFT JOIN.&lt;/p&gt;

&lt;p&gt;In this case, we could view employees and their various departments&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;Department_Name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="n"&gt;COALESCE&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;employees&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Full_Name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;"No Employee"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;Full_Name&lt;/span&gt; 
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;employees&lt;/span&gt; 
&lt;span class="k"&gt;RIGHT&lt;/span&gt; &lt;span class="k"&gt;JOIN&lt;/span&gt; &lt;span class="n"&gt;departments&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;departments&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Department_Id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;employees&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Department_Id&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You should get the results below&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;Result&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Department_Name&lt;/th&gt;
&lt;th&gt;Full_Name&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Executive&lt;/td&gt;
&lt;td&gt;Wayne Ablett&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Executive&lt;/td&gt;
&lt;td&gt;Sarah Goodes&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Executive&lt;/td&gt;
&lt;td&gt;John Smith&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;HR&lt;/td&gt;
&lt;td&gt;Stephanie Richardson&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;HR&lt;/td&gt;
&lt;td&gt;Matthew Swan&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;HR&lt;/td&gt;
&lt;td&gt;Michelle Carey&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Sales&lt;/td&gt;
&lt;td&gt;Tony Grant&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Sales&lt;/td&gt;
&lt;td&gt;Chris Matthews&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Development&lt;/td&gt;
&lt;td&gt;Jane Voss&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Development&lt;/td&gt;
&lt;td&gt;Michael Dunstall&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Development&lt;/td&gt;
&lt;td&gt;Jenna Lockett&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Development&lt;/td&gt;
&lt;td&gt;Andrew Judd&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Support&lt;/td&gt;
&lt;td&gt;Natalie Rocca&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Support&lt;/td&gt;
&lt;td&gt;Danielle McLeod&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Research&lt;/td&gt;
&lt;td&gt;No Employee&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;&lt;em&gt;Full Join&lt;/em&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The &lt;em&gt;SQL FULL JOIN&lt;/em&gt; combines the results of both left and right outer joins.&lt;/p&gt;

&lt;p&gt;The joined table will contain all records from both tables and fill in NULLs for missing&lt;br&gt;
matches on either side.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;employees&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Employee_Id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;employees&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Full_Name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;employees&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Job_Role&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; 
&lt;span class="n"&gt;departments&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Department_Id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;departments&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Department_Name&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;employees&lt;/span&gt; 
&lt;span class="k"&gt;FULL&lt;/span&gt; &lt;span class="k"&gt;JOIN&lt;/span&gt; &lt;span class="n"&gt;departments&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;departments&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Department_Id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;employees&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Department_Id&lt;/span&gt;
&lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;employees&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Employee_Id&lt;/span&gt; &lt;span class="k"&gt;ASC&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You should get the results below&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;Result&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;EMPLOYEE_ID&lt;/th&gt;
&lt;th&gt;FULL_NAME&lt;/th&gt;
&lt;th&gt;JOB_ROLE&lt;/th&gt;
&lt;th&gt;DEPARTMENT_ID&lt;/th&gt;
&lt;th&gt;DEPARTMENT_NAME&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;John Smith&lt;/td&gt;
&lt;td&gt;CEO&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;Executive&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;Sarah Goodes&lt;/td&gt;
&lt;td&gt;CFO&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;Executive&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;td&gt;Wayne Ablett&lt;/td&gt;
&lt;td&gt;CIO&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;Executive&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;4&lt;/td&gt;
&lt;td&gt;Michelle Carey&lt;/td&gt;
&lt;td&gt;HR Manager&lt;/td&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;HR&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;5&lt;/td&gt;
&lt;td&gt;Chris Matthews&lt;/td&gt;
&lt;td&gt;Sales Manager&lt;/td&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;td&gt;Sales&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;6&lt;/td&gt;
&lt;td&gt;Andrew Judd&lt;/td&gt;
&lt;td&gt;Development Manager&lt;/td&gt;
&lt;td&gt;4&lt;/td&gt;
&lt;td&gt;Development&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;7&lt;/td&gt;
&lt;td&gt;Danielle McLeod&lt;/td&gt;
&lt;td&gt;Support Manager&lt;/td&gt;
&lt;td&gt;5&lt;/td&gt;
&lt;td&gt;Support&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;8&lt;/td&gt;
&lt;td&gt;Matthew Swan&lt;/td&gt;
&lt;td&gt;HR Representative&lt;/td&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;HR&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;9&lt;/td&gt;
&lt;td&gt;Stephanie Richardson&lt;/td&gt;
&lt;td&gt;Salesperson&lt;/td&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;HR&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;10&lt;/td&gt;
&lt;td&gt;Tony Grant&lt;/td&gt;
&lt;td&gt;Salesperson&lt;/td&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;td&gt;Sales&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;11&lt;/td&gt;
&lt;td&gt;Jenna Lockett&lt;/td&gt;
&lt;td&gt;Front-End Developer&lt;/td&gt;
&lt;td&gt;4&lt;/td&gt;
&lt;td&gt;Development&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;12&lt;/td&gt;
&lt;td&gt;Michael Dunstall&lt;/td&gt;
&lt;td&gt;Back-End Developer&lt;/td&gt;
&lt;td&gt;4&lt;/td&gt;
&lt;td&gt;Development&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;13&lt;/td&gt;
&lt;td&gt;Jane Voss&lt;/td&gt;
&lt;td&gt;Back-End Developer&lt;/td&gt;
&lt;td&gt;4&lt;/td&gt;
&lt;td&gt;Development&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;14&lt;/td&gt;
&lt;td&gt;Anthony Hird&lt;/td&gt;
&lt;td&gt;Support&lt;/td&gt;
&lt;td&gt;NULL&lt;/td&gt;
&lt;td&gt;NULL&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;15&lt;/td&gt;
&lt;td&gt;Natalie Rocca&lt;/td&gt;
&lt;td&gt;Support&lt;/td&gt;
&lt;td&gt;5&lt;/td&gt;
&lt;td&gt;Support&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;NULL&lt;/td&gt;
&lt;td&gt;NULL&lt;/td&gt;
&lt;td&gt;NULL&lt;/td&gt;
&lt;td&gt;6&lt;/td&gt;
&lt;td&gt;Research&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Bonus Joins
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;&lt;em&gt;Self Join&lt;/em&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;A &lt;em&gt;SELF JOIN&lt;/em&gt; is typically not a join type but a special way of joining a table to itself. You may want to combine rows in a table based on a related column present in the table.&lt;/p&gt;

&lt;p&gt;It is commonly used when you need to traverse a hierarchical structure where each row references another row in the same table &lt;/p&gt;

&lt;p&gt;OR&lt;/p&gt;

&lt;p&gt;When comparing data within rows in the same table.&lt;/p&gt;

&lt;p&gt;In this case, we will use self-join in the employees’ table to display the employee-manager relationship. Each employee record contains a reference to the manager's ID, allowing us to retrieve more information about the managers by adding another column, “Manager_Name”. In Data Science, this is an example of &lt;strong&gt;&lt;em&gt;feature engineering.&lt;/em&gt;&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt;
&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Employee_Id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Full_Name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Job_Role&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="n"&gt;m&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Employee_Id&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;Manager_Id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="n"&gt;m&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Full_Name&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;Manager_Name&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;employees&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;employees&lt;/span&gt; &lt;span class="n"&gt;m&lt;/span&gt; 
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Manager_Id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;m&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Employee_Id&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You should get the results below&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;Result&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;EMPLOYEE_ID&lt;/th&gt;
&lt;th&gt;FULL_NAME&lt;/th&gt;
&lt;th&gt;JOB_ROLE&lt;/th&gt;
&lt;th&gt;DEPARTMENT_ID&lt;/th&gt;
&lt;th&gt;MANAGER_NAME&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;Sarah Goodes&lt;/td&gt;
&lt;td&gt;CFO&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;John Smith&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;td&gt;Wayne Ablett&lt;/td&gt;
&lt;td&gt;CIO&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;John Smith&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;4&lt;/td&gt;
&lt;td&gt;Michelle Carey&lt;/td&gt;
&lt;td&gt;HR Manager&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;John Smith&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;5&lt;/td&gt;
&lt;td&gt;Chris Matthews&lt;/td&gt;
&lt;td&gt;Sales Manager&lt;/td&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;Sarah Goodes&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;6&lt;/td&gt;
&lt;td&gt;Andrew Judd&lt;/td&gt;
&lt;td&gt;Development Manager&lt;/td&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;td&gt;Wayne Ablett&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;7&lt;/td&gt;
&lt;td&gt;Danielle McLeod&lt;/td&gt;
&lt;td&gt;Support Manager&lt;/td&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;td&gt;Wayne Ablett&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;8&lt;/td&gt;
&lt;td&gt;Matthew Swan&lt;/td&gt;
&lt;td&gt;HR Representative&lt;/td&gt;
&lt;td&gt;4&lt;/td&gt;
&lt;td&gt;Michelle Carey&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;9&lt;/td&gt;
&lt;td&gt;Stephanie Richardson&lt;/td&gt;
&lt;td&gt;Salesperson&lt;/td&gt;
&lt;td&gt;5&lt;/td&gt;
&lt;td&gt;Chris Matthews&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;10&lt;/td&gt;
&lt;td&gt;Tony Grant&lt;/td&gt;
&lt;td&gt;Salesperson&lt;/td&gt;
&lt;td&gt;5&lt;/td&gt;
&lt;td&gt;Chris Matthews&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;11&lt;/td&gt;
&lt;td&gt;Jenna Lockett&lt;/td&gt;
&lt;td&gt;Front-End Developer&lt;/td&gt;
&lt;td&gt;6&lt;/td&gt;
&lt;td&gt;Andrew Judd&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;12&lt;/td&gt;
&lt;td&gt;Michael Dunstall&lt;/td&gt;
&lt;td&gt;Back-End Developer&lt;/td&gt;
&lt;td&gt;6&lt;/td&gt;
&lt;td&gt;Andrew Judd&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;13&lt;/td&gt;
&lt;td&gt;Jane Voss&lt;/td&gt;
&lt;td&gt;Back-End Developer&lt;/td&gt;
&lt;td&gt;6&lt;/td&gt;
&lt;td&gt;Andrew Judd&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;14&lt;/td&gt;
&lt;td&gt;Anthony Hird&lt;/td&gt;
&lt;td&gt;Support&lt;/td&gt;
&lt;td&gt;7&lt;/td&gt;
&lt;td&gt;Danielle McLeod&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;15&lt;/td&gt;
&lt;td&gt;Natalie Rocca&lt;/td&gt;
&lt;td&gt;Support&lt;/td&gt;
&lt;td&gt;7&lt;/td&gt;
&lt;td&gt;Danielle McLeod&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That's most of what you need to know about SQL Joins. Hopefully, you found this information helpful. Feel free to share it with anyone who's having a hard time with Joins, and keep learning! 😊&lt;/p&gt;

&lt;p&gt;If you have any questions, please leave them in the comments section below.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Thanks for reading!&lt;/strong&gt;&lt;/p&gt;

</description>
      <category>sql</category>
      <category>datascience</category>
      <category>joins</category>
      <category>programming</category>
    </item>
    <item>
      <title>Data Engineering 102: Introduction to Python for Data Engineering</title>
      <dc:creator>Gathuru_M</dc:creator>
      <pubDate>Sun, 11 Sep 2022 20:08:07 +0000</pubDate>
      <link>https://dev.to/gathurum/data-engineering-102-introduction-to-python-for-data-engineering-4m2l</link>
      <guid>https://dev.to/gathurum/data-engineering-102-introduction-to-python-for-data-engineering-4m2l</guid>
      <description>&lt;p&gt;Python is today’s most popular programming language with endless applications in various fields. It is ideally suited for deployment, analysis, and maintenance thanks to its flexible and dynamic nature. &lt;br&gt;
It is one of the crucial skills required in the field of Data Engineering, to create Data Pipelines, set up Statistical Models, and perform a thorough analysis of them.&lt;br&gt;
To start using python, you'll need it installed in the Operating system you're currently using be it Linux, Mac OS, or the most common Windows. &lt;br&gt;
Python provides an ample amount of libraries and packages for various applications. These are the top 5 Python for Data Engineering packages. They include:&lt;/p&gt;

&lt;p&gt;-Pandas&lt;br&gt;
-pygrametl&lt;br&gt;
-petl&lt;br&gt;
-Beautiful Soup&lt;br&gt;
-SciPy&lt;/p&gt;

&lt;p&gt;But firstly, in Learning python, these are some of the topics you should perhaps start with:-&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1.MATH EXPRESSIONS&lt;/strong&gt;&lt;br&gt;
Syntax Math Meaning&lt;br&gt;
a+b a+b addition&lt;br&gt;
a-b a-b) subtraction&lt;br&gt;
a*b a\times b\ multiplication&lt;br&gt;
a/b a\div b\, division (see note below)&lt;br&gt;
a//b a\div b\ division - in python 2.2 &amp;amp; abv&lt;br&gt;
a%b a mod b modulo&lt;br&gt;
-a -a negation&lt;br&gt;
abs(a) |a| absolute value&lt;br&gt;
a**b a^{b} exponent&lt;br&gt;
math.sqrt square root&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. Strings&lt;/strong&gt;&lt;br&gt;
Strings in python are surrounded by either single quotation marks or double quotation marks.&lt;br&gt;
e.g.&lt;br&gt;
'hello' is the same as "hello".&lt;/p&gt;

&lt;p&gt;You can display a string literal with the print() function&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;3. Variables&lt;/strong&gt;&lt;br&gt;
Variables are containers for storing data values.&lt;br&gt;
A variable is created the moment you first assign a value to it.&lt;br&gt;
x = 5&lt;br&gt;
y = "John"&lt;br&gt;
print(x)&lt;br&gt;
print(y)&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;4.Loops&lt;/strong&gt;&lt;br&gt;
Python provides three ways for executing the loops&lt;/p&gt;

&lt;p&gt;&lt;em&gt;(a)While Loop:&lt;/em&gt;&lt;br&gt;
In python, a while loop is used to execute a block of statements repeatedly until a given condition is satisfied. And when the condition becomes false, the line immediately after the loop in the program is executed.&lt;/p&gt;

&lt;p&gt;while expression:&lt;br&gt;
statement(s)&lt;/p&gt;

&lt;p&gt;&lt;em&gt;(b) For in Loop:&lt;/em&gt;&lt;br&gt;
For loops are used for sequential traversal. For example: traversing a list or string or array etc. In Python, there is no C style for loop, i.e., for (i=0; i&amp;lt;n; i++). There is “for in” loop which is similar to for each loop in other languages. Let us learn how to use for in loop for sequential traversals&lt;/p&gt;

&lt;p&gt;for iterator_var in sequence:&lt;br&gt;
statements(s)&lt;/p&gt;

&lt;p&gt;&lt;em&gt;(c)Nested Loops:&lt;/em&gt;&lt;br&gt;
Python programming language allows using one loop inside another loop. The following section shows a few examples to illustrate the concept.&lt;/p&gt;

&lt;p&gt;Syntax:&lt;/p&gt;

&lt;p&gt;while expression:&lt;br&gt;
while expression:&lt;br&gt;
statement(s)&lt;br&gt;
statement(s)&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;5. Functions&lt;/strong&gt;&lt;br&gt;
A function is a block of code that only runs when it is called.&lt;/p&gt;

&lt;p&gt;The basic syntax is:&lt;br&gt;
def my_function():&lt;br&gt;
print("Hello from a function")&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;6. List, Tuples, Dictionary, and sets&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;It's also important to learn how to connect databases with:-&lt;br&gt;
1.BOTO3&lt;br&gt;
2.Psycopg2&lt;br&gt;
3.MySQL&lt;/p&gt;

&lt;p&gt;So, as long as there is data to process, data engineers will be in demand. I wish you all the best as you choose to pursue this journey.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Thanks for reading!&lt;/strong&gt;&lt;br&gt;
Any questions? Leave your comment below to start fantastic discussions!&lt;/p&gt;

</description>
      <category>codenewbie</category>
      <category>career</category>
      <category>data</category>
    </item>
  </channel>
</rss>
