<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Magichu Njoroge</title>
    <description>The latest articles on DEV Community by Magichu Njoroge (@magichu_njoroge_9123627a6).</description>
    <link>https://dev.to/magichu_njoroge_9123627a6</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3965279%2Fc8df8991-181f-48c8-8f92-4e94930d1fe6.png</url>
      <title>DEV Community: Magichu Njoroge</title>
      <link>https://dev.to/magichu_njoroge_9123627a6</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/magichu_njoroge_9123627a6"/>
    <language>en</language>
    <item>
      <title>UNDERSTANDING DATA MODELING, SCHEMAS, RELATIONSHIPS, AND JOINS.</title>
      <dc:creator>Magichu Njoroge</dc:creator>
      <pubDate>Tue, 30 Jun 2026 11:22:24 +0000</pubDate>
      <link>https://dev.to/magichu_njoroge_9123627a6/understanding-data-modeling-schemas-relationships-and-joins-1d63</link>
      <guid>https://dev.to/magichu_njoroge_9123627a6/understanding-data-modeling-schemas-relationships-and-joins-1d63</guid>
      <description>&lt;h2&gt;
  
  
  Data Modeling in Power BI: Schemas, Relationships, and Joins Explained
&lt;/h2&gt;

&lt;p&gt;Learning Power BI was a very exciting part of this journey, and my first time on model view almost changed that. I genuinely had no idea what I was looking at. But with good tutoring, practice, and patience, I finally got the hang of it. Almost... With every wall or obstacle hit, I write about it for the next person and for my deeper understanding. This is data modeling as how I understand it.&lt;/p&gt;




&lt;h2&gt;
  
  
  What Is Data Modeling?
&lt;/h2&gt;

&lt;p&gt;Data modeling is the process of creating visual representations of the connections between data structures, with information about the individual attributes contained within those data structures.&lt;/p&gt;

&lt;p&gt;In simple terms, a data model answers three questions:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;What tables do I have?&lt;/li&gt;
&lt;li&gt;How do they relate to each other?&lt;/li&gt;
&lt;li&gt;Which direction do filters travel between them?&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Get this right and your visuals update instantly; your numbers are accurate, and adding new data is easy. Get it wrong and you end up with reports that show inflated totals, filters that don't work, and dashboards that are a nightmare to maintain.&lt;/p&gt;

&lt;p&gt;A well-designed data model helps&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Understand data requirements.&lt;/li&gt;
&lt;li&gt;Ensure proper structure for reporting.&lt;/li&gt;
&lt;li&gt;Align with organizational goals.&lt;/li&gt;
&lt;li&gt;Maintain data integrity.&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  The Two Types of Tables: Fact and Dimension
&lt;/h2&gt;

&lt;p&gt;Every data model is built from two kinds of tables. Understanding &lt;br&gt;
the difference between them is the single most important foundation &lt;br&gt;
in data modeling.&lt;/p&gt;
&lt;h3&gt;
  
  
  Fact Tables
&lt;/h3&gt;

&lt;p&gt;The fact table is at the center of the &lt;em&gt;star schema&lt;/em&gt; and stores the core transactional data you want to analyze, such as sales records, orders, or financial transactions. Each row in the fact table is unique and contains keys that link it to related &lt;em&gt;dimension tables&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;Fact tables are usually:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Very long (thousands or millions of rows)&lt;/li&gt;
&lt;li&gt;Full of numbers (quantities, amounts, counts)&lt;/li&gt;
&lt;li&gt;Connected to multiple other tables via ID columns&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;
  
  
  Dimension Tables
&lt;/h3&gt;

&lt;p&gt;A &lt;em&gt;dimension table&lt;/em&gt; is connected to the fact table and stores the context around the data, the who, what, when, and where. Customers. Products. Dates. Stores.&lt;/p&gt;

&lt;p&gt;Dimension tables are usually&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Short but wide (fewer rows, more descriptive columns)&lt;/li&gt;
&lt;li&gt;Full of text and categories&lt;/li&gt;
&lt;li&gt;The tables you filter and slice by in your reports
&lt;/li&gt;
&lt;/ul&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;┌─────────────────────────────────────────────────────────────┐
│                  FACT vs DIMENSION                          │
├──────────────────────────┬──────────────────────────────────┤
│     FACT TABLE           │     DIMENSION TABLE              │
│     (Sales_Fact)         │     (Dim_Product)                │
├──────────────────────────┼──────────────────────────────────┤
│  SaleID       (PK)       │  ProductID     (PK)              │
│  CustomerID   (FK) ──────┼──▶ CustomerID                    │
│  ProductID    (FK) ──────┼──▶ ProductID                     │
│  DateID       (FK)       │  ProductName                     │
│  StoreID      (FK)       │  Category                        │
│  Quantity                │  Price                           │
│  SalesAmount             │  Supplier                        │
├──────────────────────────┼──────────────────────────────────┤
│  Rows: Millions          │  Rows: Hundreds or Thousands     │
│  Updates: Daily          │  Updates: Rarely                 │
│  Contains: Numbers       │  Contains: Descriptions          │
└──────────────────────────┴──────────────────────────────────┘
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;A good way to remember it: the fact table is &lt;em&gt;what happened&lt;/em&gt;. The dimension table is &lt;em&gt;everything around what happened&lt;/em&gt;.&lt;/p&gt;


&lt;h2&gt;
  
  
  Schemas: The Blueprint of Your Model
&lt;/h2&gt;

&lt;p&gt;This is where most beginners, myself included, get lost first. &lt;br&gt;
A schema refers to the structure and organization of data within a data model. Schemas define how data is connected and related within the model. There are two main types of schemas one interacts with in Power BI.&lt;/p&gt;
&lt;h3&gt;
  
  
  The Star Schema
&lt;/h3&gt;

&lt;p&gt;The &lt;em&gt;star schema&lt;/em&gt; is the recommended approach for Power BI. One fact table sits at the center. All dimension tables connect directly to it. When you step back and look at it, it resembles a star; the fact table is the center, and the dimensions are the points radiating outward.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;                    ┌─────────────────┐
                    │   Dim_Date      │
                    │─────────────────│
                    │ DateID    (PK)  │
                    │ Day             │
                    │ Month           │
                    │ Quarter         │
                    │ Year            │
                    └────────┬────────┘
                             │ 1
                             │
       ┌─────────────────────┼─────────────────────┐
       │ 1                   │ N                   │ 1
┌──────┴──────────┐  ┌───────▼──────────┐  ┌──────┴──────────┐
│  Dim_Customer   │  │   Fact_Sales     │  │  Dim_Product    │
│─────────────────│  │──────────────────│  │─────────────────│
│ CustomerID (PK) ├─▶│ SaleID    (PK)   │◀─┤ ProductID  (PK) │
│ FullName        │  │ CustomerID (FK)  │  │ ProductName     │
│ Email           │  │ ProductID  (FK)  │  │ Category        │
│ County          │  │ DateID     (FK)  │  │ UnitPrice       │
│ Phone           │  │ StoreID    (FK)  │  │ Supplier        │
└─────────────────┘  │ Quantity         │  └─────────────────┘
                     │ SalesAmount      │
       ┌─────────────│ Discount         │
       │ 1           └──────────────────┘
┌──────┴──────────┐
│   Dim_Store     │
│─────────────────│
│ StoreID    (PK) │
│ StoreName       │
│ Town            │
│ Region          │
└─────────────────┘
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Why is the star schema the gold standard?&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Simplifies queries by clearly defining relationships between facts and dimensions.&lt;/li&gt;
&lt;li&gt;Reduces data redundancy through organized dimension tables.&lt;/li&gt;
&lt;li&gt;Improves performance for large datasets and complex analytics.&lt;/li&gt;
&lt;li&gt;It is easy for anyone to look at the model and understand it.&lt;/li&gt;
&lt;li&gt;Adding a new dimension table later is straightforward.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  The Snowflake Schema
&lt;/h3&gt;

&lt;p&gt;The &lt;em&gt;snowflake schema&lt;/em&gt; is when the dimension table is split up into multiple related sub-tables. It is an extension of the star schema. A product dimension, for example, &lt;br&gt;
might split into a separate category table and a separate supplier table.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;┌──────────────────┐        ┌──────────────────┐
│  Dim_Category    │        │  Dim_Supplier    │
│──────────────────│        │──────────────────│
│ CategoryID  (PK) │        │ SupplierID  (PK) │
│ CategoryName     │        │ SupplierName     │
└────────┬─────────┘        └────────┬─────────┘
         │ 1                         │ 1
         │                           │
         ▼ N                         ▼ N
┌──────────────────────────────────────────────┐
│              Dim_Product                     │
│──────────────────────────────────────────────│
│ ProductID    (PK)                            │
│ ProductName                                  │
│ CategoryID   (FK) ──▶ Dim_Category           │
│ SupplierID   (FK) ──▶ Dim_Supplier           │
│ UnitPrice                                    │
└───────────────────────┬──────────────────────┘
                        │ 1
                        │
                        ▼ N
               ┌────────────────────┐
               │    Fact_Sales      │
               │────────────────────│
               │ SaleID      (PK)   │
               │ ProductID   (FK)   │
               │ CustomerID  (FK)   │
               │ SalesAmount        │
               └────────────────────┘
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;┌──────────────────────┬───────────────────────────┐
│                      │  STAR      │  SNOWFLAKE   │
├──────────────────────┼────────────┼──────────────┤
│ Query Speed          │ Faster     │ Slower       │
│ Ease of Use          │ Simpler    │ More complex │
│ Power BI Performance │ Ideal      │ Not ideal    |
│ Storage              │ More space │ Less space   │
└──────────────────────┴────────────┴──────────────┘
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;blockquote&gt;
&lt;p&gt;Stick with the star schema in Power BI. It is what Microsoft recommends and what most professional models use.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  Relationships: Connecting Your Tables
&lt;/h2&gt;

&lt;p&gt;A relationship is used to define how tables are linked to each other, which helps to analyze and visualize data across multiple tables seamlessly. There are several types of relationships, as it will be discussed later. They appear as literal lines with a number on each end.&lt;/p&gt;

&lt;h3&gt;
  
  
  Primary Keys and Foreign Keys
&lt;/h3&gt;

&lt;p&gt;Every relationship is built on two column types:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Primary Key (PK)&lt;/strong&gt; — uniquely identifies every row in a table. No duplicates. Example: &lt;code&gt;ProductID&lt;/code&gt; in the Products table.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Foreign Key (FK)&lt;/strong&gt; — a column in another table that references that primary key. Example: &lt;code&gt;ProductID&lt;/code&gt; in the Sales table, pointing back to which product was sold.&lt;br&gt;
&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;  Dim_Product                        Fact_Sales
┌──────────────────────┐           ┌──────────────────────┐
│ ProductID  ← PK      │─────1─────│ ProductID  ← FK      │
│ ProductName          │           │ SaleID               │
│ Category             │     N     │ CustomerID           │
│ UnitPrice            │◀──────────│ Quantity             │
└──────────────────────┘           │ SalesAmount          │
                                   └──────────────────────┘
One product can appear in many sales rows — this is 1:N
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Types of relationships
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;One-to-Many (1:N) — the most common&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;One row in the dimension table connects to many rows in the fact table. This is the backbone of every star schema.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;  Dim_Customer                     Fact_Sales
┌─────────────────┐              ┌─────────────────┐
│ CustomerID: C01 │──────────────│ SaleID: 1001    │
│ Name: Wanjiru   │    1 : N     │ CustomerID: C01 │
└─────────────────┘       │      ├─────────────────┤
                           │      │ SaleID: 1002    │
                           └─────▶│ CustomerID: C01 │
                                  ├─────────────────┤
                                  │ SaleID: 1003    │
                                  │ CustomerID: C01 │
                                  └─────────────────┘
Wanjiru appears once as a customer but has made three purchases.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;One-to-One (1:1) — rare&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Each row in one table matches exactly one row in another. Used &lt;br&gt;
mostly when splitting a very wide table for performance reasons.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;  Dim_Employee               Dim_EmployeePrivate
┌──────────────────┐        ┌────────────────────────┐
│ EmployeeID: E01  │────────│ EmployeeID: E01        │
│ Name: Kamau      │  1:1   │ NationalID: 12345678   │
│ Department: IT   │        │ EmergencyContact: ...  │
└──────────────────┘        └────────────────────────┘
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Many-to-Many (N:N) — handle carefully&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Many rows in Table A match many rows in Table B. Power BI can &lt;br&gt;
handle this, but it often leads to ambiguous results. The clean solution is a bridge table.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;  Dim_Student              Dim_Course
┌───────────────┐         ┌──────────────────┐
│ StudentID     │         │ CourseID         │
│ StudentName   │         │ CourseName       │
└───────┬───────┘         └────────┬─────────┘
        │ 1                        │ 1
        │                          │
        ▼ N    Bridge_Enrollment   ▼ N
        └─────▶┌─────────────────┐◀┘
               │ StudentID  (FK) │
               │ CourseID   (FK) │
               │ EnrolledDate    │
               └─────────────────┘
The bridge table resolves the many-to-many into two 1:N links.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Joins: What Happens Behind the Scenes
&lt;/h2&gt;

&lt;p&gt;When Power BI evaluates a visual that pulls data from multiple tables, it performs a join; it combines rows from two tables based on a shared column. You never write the join yourself in Power BI, but knowing what type of join is happening helps you understand why some rows appear and others don't.&lt;/p&gt;

&lt;h3&gt;
  
  
  Inner Join — Only Matching Rows
&lt;/h3&gt;

&lt;p&gt;Returns only the rows that have a match in both tables. If a row in one table has no match in the other, it does not appear from the results.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;  Fact_Sales             Dim_Product
┌───────────────────┐   ┌───────────────────┐
│ SaleID │ProductID │   │ ProductID │ Name  │
│ S001   │  P10     │   │ P10       │ Rice  │
│ S002   │  P20     │   │ P20       │ Sugar │
│ S003   │  P99     │   └───────────────────┘
└───────────────────┘

        INNER JOIN on ProductID
        ▼
┌──────────────────────────────────┐
│ SaleID  │ ProductID │ Name       │
│ S001    │ P10       │ Rice       │
│ S002    │ P20       │ Sugar      │
└──────────────────────────────────┘
Sale S003 disappears — ProductID P99 has no match in Dim_Product
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Left Join — Keep Everything from the Left
&lt;/h3&gt;

&lt;p&gt;Returns all rows from the left table, with matching data from the right table. Non-matching rows get a blank value rather than being dropped.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;        LEFT JOIN on ProductID
        ▼
┌──────────────────────────────────┐
│ SaleID  │ ProductID │ Name       │
│ S001    │ P10       │ Rice       │
│ S002    │ P20       │ Sugar      │
│ S003    │ P99       │ (blank)    │ ← kept, but no product name
└──────────────────────────────────┘
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;blockquote&gt;
&lt;p&gt;In Power BI, when you define a relationship between two tables, it uses a Left Join by default, all rows from the dimension side are preserved, and fact rows without a matching dimension value show as blank rather than disappearing entirely.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  Cross-Filter Direction: Which Way Do Filters Travel?
&lt;/h2&gt;

&lt;p&gt;When you click on a value in one visual, say, you click "Nairobi" on a map, Power BI filters every other visual on the page. The direction that filter travels between tables is controlled by the cross-filter direction setting on each relationship.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Single Direction (Default)&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Filters flow from the dimension table toward the fact table only. &lt;br&gt;
This is the safe, recommended default.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;  Dim_Product                        Fact_Sales
┌─────────────────────┐             ┌──────────────────────┐
│ Category = "Flour"  │────filter──▶│ Shows only Flour     │
│                     │  one way    │ sales rows           │
└─────────────────────┘             └──────────────────────┘
                     Filter does NOT travel back 
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Bidirectional&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Filters flow both ways. This sounds useful but can create circular filter paths and slow your report down significantly.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;  Dim_Product          ◀────────────▶        Fact_Sales
┌─────────────────────┐  both ways  ┌──────────────────────┐
│ ProductName         │◀───────────▶│ SalesAmount          │
└─────────────────────┘             └──────────────────────┘
Use only when you have a specific, tested reason to do so
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;blockquote&gt;
&lt;p&gt;One thing that caught me off guard while learning this was assuming that because Power BI drew the relationship lines automatically, the model was correct. Sometimes it connected the wrong columns. You should always double-check the relationships before going any further to prevent the waste of time and resources.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  Quick Reference Cheat Sheet
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;┌─────────────────────────────────────────────────────────────┐
│           POWER BI DATA MODELING CHEAT SHEET                │
├─────────────────────┬───────────────────────────────────────┤
│ CONCEPT             │ WHAT IT MEANS                         │
├─────────────────────┼───────────────────────────────────────┤
│ Fact Table          │ Stores transactions — the numbers     │
│ Dimension Table     │ Stores context — the descriptions     │
│ Star Schema         │ Fact at center, dimensions around it  │
│ Snowflake Schema    │ Dimensions split into sub-tables      │
│ Primary Key (PK)    │ Uniquely identifies every row         │
│ Foreign Key (FK)    │ References a PK in another table      │
│ 1:N Relationship    │ One dimension row → many fact rows    │
│ N:N Relationship    │ Needs a bridge table to resolve       │
│ Inner Join          │ Only rows that match in both tables   │
│ Left Join           │ All rows from left + matches on right │
│ Single Filter       │ Filters flow dimension → fact         │
│ Bidirectional       │ Filters flow both ways (use carefully)│
├─────────────────────┴───────────────────────────────────────┤
│ 
└─────────────────────────────────────────────────────────────┘
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;As someone still early in this journey, I'd appreciate comments, suggestions, and corrections, as they are my real learning curve and the most valuable part of my journey.&lt;/p&gt;




</description>
      <category>beginners</category>
      <category>database</category>
      <category>microsoft</category>
      <category>tutorial</category>
    </item>
    <item>
      <title>A Data Engineer's First Day on a Linux Server: SSH, PostgreSQL &amp; Everything In Between</title>
      <dc:creator>Magichu Njoroge</dc:creator>
      <pubDate>Sun, 14 Jun 2026 12:36:57 +0000</pubDate>
      <link>https://dev.to/magichu_njoroge_9123627a6/a-data-engineers-first-day-on-a-linux-server-ssh-postgresql-everything-in-between-3032</link>
      <guid>https://dev.to/magichu_njoroge_9123627a6/a-data-engineers-first-day-on-a-linux-server-ssh-postgresql-everything-in-between-3032</guid>
      <description>&lt;h2&gt;
  
  
  From Zero to Data Engineering: The First Mile on a Thousand-Mile Road That Refuses to Be Straight.
&lt;/h2&gt;

&lt;p&gt;With just basic and level-one intermediate tech skills and from a non-tech background, when I got this assignment, I actually thought it would be as easy as class examples, but it was far from that. I encountered a series of errors, mistakes, and confusion, and I'm sure that's how the guy in class did it moments, each showing something new. I'm writing exactly how it all took place. &lt;/p&gt;

&lt;h2&gt;
  
  
  The Environment
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Server OS:&lt;/strong&gt; Ubuntu 24.04.4 LTS (DigitalOcean cloud host)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Database:&lt;/strong&gt; PostgreSQL 16.14&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Local Machine:&lt;/strong&gt; Windows (Powershell)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Dataset:&lt;/strong&gt; kenya_counties.csv - 1,000 records generated via Mockaroo&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Tools:&lt;/strong&gt; SSH, SCP, psql, bash&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Section 1: Getting into the server
&lt;/h2&gt;

&lt;p&gt;Establishing a remote connection to the cloud host using SSH was the first step.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;ssh root@159.65.222.96
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Easy enough. Using root credentials, I connected to port 22, the default SSH port.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;What is SSH?&lt;/strong&gt; The Secure Shell(SSH) protocol is a method for securely sending commands to a computer over an unsecured network. Used for controlling servers remotely, for managing infrastructure, and transferring files.&lt;a href="https://www.cloudflare.com/learning/access-management/what-is-ssh/" rel="noopener noreferrer"&gt;CLICK HERE To read more on &lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fn2tvg2ccqmvbw2i4sf2v.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fn2tvg2ccqmvbw2i4sf2v.png" alt="Screenshot: ssh root.png—successful login showing Ubuntu 24.04.4 banner" width="800" height="555"&gt;&lt;/a&gt; &lt;/p&gt;

&lt;h2&gt;
  
  
  Section 2: Creating a User - Things start being real; my first error.
&lt;/h2&gt;

&lt;p&gt;The instructions given clearly stated the format for creating a Linux user account was first name + surname initials. My name is Magichu Njoroge. So the intended username was &lt;code&gt;MagichuN&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;I ran.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo &lt;/span&gt;adduser MagichuN
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then came the first wall: &lt;br&gt;
Please enter a username matching the regular expression...&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fiyidpnyc7bh4ft7euu1u.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fiyidpnyc7bh4ft7euu1u.png" alt="Error when creating user with capital letters" width="800" height="93"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Root Cause:&lt;/strong&gt; From my research, since this was a first, Linux enforces username rules. Usernames must all be lowercase. Capital letters are rejected at the system security layer to prevent directory pathing conflicts. &lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Fix:&lt;/strong&gt; I simply changed to lowercase:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo &lt;/span&gt;adduser magichun
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fj3lh1nifnkfbvzxmdp0v.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fj3lh1nifnkfbvzxmdp0v.png" alt="Successfully creating user magichun" width="799" height="387"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Worked like a charm. Got a home directory set up under &lt;code&gt;/home/magichun&lt;/code&gt;the user context registered.&lt;/p&gt;

&lt;p&gt;With the user now recognized by the system, I promoted the user to the sudo group so it could run administrative commands:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo &lt;/span&gt;usermod &lt;span class="nt"&gt;-aG&lt;/span&gt; &lt;span class="nb"&gt;sudo &lt;/span&gt;magichun
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;What is a &lt;code&gt;sudo&lt;/code&gt; Group?&lt;/strong&gt; The sudo group is the de facto standard for granting elevated privileges in modern Linux distributions. Its purpose is tightly linked to the sudo command, which allows users to execute commands with the permissions of another user (typically the root user, the superuser with unrestricted access). Read more: &lt;a href="https://www.dotlinux.net/blog/distinguishing-between-the-sudo-and-admin-groups-in-linux-roles-and-usage/" rel="noopener noreferrer"&gt;CLICK HERE To read more on &lt;code&gt;sudo&lt;/code&gt; and &lt;code&gt;admin&lt;/code&gt; groups in Linux&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fnp54japvb1xhlay46bpj.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fnp54japvb1xhlay46bpj.png" alt="Adding user magichun to the sudo group with usermod command" width="800" height="58"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Then counterchecked the user existed and had correct group membership:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;id &lt;/span&gt;magichun
&lt;span class="nb"&gt;cat&lt;/span&gt; /etc/passwd | &lt;span class="nb"&gt;grep &lt;/span&gt;magichun
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fk75js832ng8c8hg2wj17.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fk75js832ng8c8hg2wj17.png" alt="Verifying user magichu group membership" width="800" height="95"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Finally, I switched from root into my new user to confirm everything was working:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;su magichun
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftjt9n5nni8iva5hkas7d.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftjt9n5nni8iva5hkas7d.png" alt="Switching active terminal session from root to magichun" width="638" height="85"&gt;&lt;/a&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;whoami&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9nrzau9uzgakdbbz341j.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9nrzau9uzgakdbbz341j.png" alt="Confirming active identity is magichun" width="520" height="105"&gt;&lt;/a&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;pwd&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F27sawl6v6jo3g5o1d4di.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F27sawl6v6jo3g5o1d4di.png" alt="Confirm working directory" width="472" height="98"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Why not just hang onto root?&lt;/strong&gt; Not ideal. Security best practice leans on minimal access rights. Root wields total control, one slip, whole machine at risk. Working as a named user with &lt;code&gt;sudo&lt;/code&gt; access is the professional standard.&lt;a href="https://www.zdnet.com/article/logging-in-as-root-on-linux-heres-why-that-disaster-waiting-to-happen/" rel="noopener noreferrer"&gt;CLICK HERE To read more&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  Section 3: PostgreSQL - The Role Wall
&lt;/h2&gt;

&lt;p&gt;Back as root, I checked what lived on the server already:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;psql &lt;span class="nt"&gt;--version&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3qatx2ni3yuiytlgbpu3.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3qatx2ni3yuiytlgbpu3.png" alt="PostgreSQL 16.14 version confirmed on the server" width="800" height="116"&gt;&lt;/a&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo &lt;/span&gt;systemctl status postgresql
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6zu2c3i22rg6ogy9j0zu.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6zu2c3i22rg6ogy9j0zu.png" alt="PostgreSQL service active and running" width="800" height="164"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;PostgreSQL 16.14 was already installed and running in an active state. No installation needed.&lt;/p&gt;

&lt;p&gt;Now it was time to set up the database. Had to switch to the postgres admin shell.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo&lt;/span&gt; &lt;span class="nt"&gt;-i&lt;/span&gt; &lt;span class="nt"&gt;-u&lt;/span&gt; postgres
psql
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fyt59qsskb86jmf1letmm.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fyt59qsskb86jmf1letmm.png" alt="PostgreSQL 16.14 version confirmed on the server" width="798" height="149"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;After that, I attempted to log back as my new user &lt;code&gt;magichun&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;psql &lt;span class="nt"&gt;-U&lt;/span&gt; magichun &lt;span class="nt"&gt;-d&lt;/span&gt; magichun
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then trouble showed up for the second time:&lt;br&gt;
psql:error:connection to server on socket...failed: FATAL: role"magichun" does not exist&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fk6ylvoo53v4trckcdl32.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fk6ylvoo53v4trckcdl32.png" alt="FATAL error - role magichun does not exist in PostgreSQL" width="800" height="435"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;It took me a minute to figure out what was going on; I had to seek help from a friend. This is what I understood from what he told me (I'm open to much simpler explanations): creating a Linux user does not automatically create a matching PostgreSQL role. One lives in the machine. The other only knows what you tell it. A user that exists in Linux is invisible to PostgreSQL until you explicitly register it there.&lt;br&gt;
And that's exactly what I did&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;ROLE&lt;/span&gt; &lt;span class="n"&gt;magichun&lt;/span&gt; &lt;span class="k"&gt;WITH&lt;/span&gt; &lt;span class="n"&gt;LOGIN&lt;/span&gt; &lt;span class="n"&gt;PASSWORD&lt;/span&gt; &lt;span class="s1"&gt;'StrongPassword123'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;ALTER&lt;/span&gt; &lt;span class="k"&gt;ROLE&lt;/span&gt; &lt;span class="n"&gt;magichun&lt;/span&gt; &lt;span class="k"&gt;CREATEDB&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;DATABASE&lt;/span&gt; &lt;span class="n"&gt;magichun&lt;/span&gt; &lt;span class="k"&gt;OWNER&lt;/span&gt; &lt;span class="n"&gt;magichun&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fw22ycfujhsuray2v8ttv.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fw22ycfujhsuray2v8ttv.png" alt="Creating PostgreSQL role and database for user magichun" width="800" height="435"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;After that horrifying experience, I logged back as magichun and created the staging schema:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="err"&gt;\&lt;/span&gt;&lt;span class="k"&gt;c&lt;/span&gt; &lt;span class="n"&gt;magichun&lt;/span&gt;
&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;SCHEMA&lt;/span&gt; &lt;span class="n"&gt;staging&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Confirmed with:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="err"&gt;\&lt;/span&gt;&lt;span class="n"&gt;dn&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This experience made me realize I didn't really understand the database operations and needed to revisit and do much practice. All the corrections I was led on to were what to do, and I would be lying if I said I corrected them all by myself.&lt;/p&gt;

&lt;h2&gt;
  
  
  Section 4: Generating and Moving Real Data
&lt;/h2&gt;

&lt;p&gt;Rather than inserting manual dummy records, I used &lt;strong&gt;Mockaroo&lt;/strong&gt; to generate a realistic dataset. &lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Mockaroo:&lt;/strong&gt; A free test data generator and API mocking tool. It lets you create custom CSV, JSON, SQL, and Excel datasets, perfect for testing database pipelines. &lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;What is a data pipeline?&lt;/strong&gt; A data pipeline is a set of tools and processes for collecting, processing, and delivering data from one or more sources to a destination where it can be analyzed and used. &lt;a href="https://www.geeksforgeeks.org/software-engineering/overview-of-data-pipeline/" rel="noopener noreferrer"&gt;CLICK HERE To read more on data pipelines.&lt;/a&gt; &lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;My dataset: &lt;code&gt;kenya_counties.csv&lt;/code&gt; - 1,000 records with these columns:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Column&lt;/th&gt;
&lt;th&gt;Type&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;id&lt;/td&gt;
&lt;td&gt;Integer (Primary Key)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;first_name&lt;/td&gt;
&lt;td&gt;VARCHAR&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;last_name&lt;/td&gt;
&lt;td&gt;VARCHAR&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;email&lt;/td&gt;
&lt;td&gt;VARCHAR&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;gender&lt;/td&gt;
&lt;td&gt;VARCHAR&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;ip_address&lt;/td&gt;
&lt;td&gt;VARCHAR&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;county_name&lt;/td&gt;
&lt;td&gt;VARCHAR&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;population&lt;/td&gt;
&lt;td&gt;Integer&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;area_sq_km&lt;/td&gt;
&lt;td&gt;Decimal&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;population_density&lt;/td&gt;
&lt;td&gt;Decimal&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;average_age&lt;/td&gt;
&lt;td&gt;Decimal&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  The SCP Error (Running It From the Wrong Place)
&lt;/h3&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;What is SCP?:&lt;/strong&gt; The SCP (Secure Copy Protocol) command is used to securely transfer files and directories between systems over a network. It works over SSH, which means all transferred data is encrypted. It is commonly used by system administrators to move backup files, fetch logs, copy configuration files, and transfer data between servers safely and quickly. &lt;a href="https://www.geeksforgeeks.org/linux-unix/scp-command-in-linux-with-examples/" rel="noopener noreferrer"&gt;CLICK HERE To read more on SCP&lt;/a&gt;&lt;br&gt;
&lt;/p&gt;
&lt;/blockquote&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;scp &lt;span class="s2"&gt;"D:&lt;/span&gt;&lt;span class="se"&gt;\D&lt;/span&gt;&lt;span class="s2"&gt;atasets&lt;/span&gt;&lt;span class="se"&gt;\k&lt;/span&gt;&lt;span class="s2"&gt;enya_counties.csv"&lt;/span&gt; magichun@159.65.222.96:/home/magichun/
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Error:&lt;br&gt;
ssh: could not resolve hostname d: Temporary failure in name resolution&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fswe9pite6avaub2e4x4r.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fswe9pite6avaub2e4x4r.png" alt="SCP error when run from inside the SSH session" width="800" height="192"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Root Cause:&lt;/strong&gt; I used the command on the remote server's terminal; it had no context. &lt;code&gt;D:\&lt;/code&gt; means - that's a Windows drive path on my local machine. For SCP to work, it needs to run from the machine that holds the source file. &lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Fix:&lt;/strong&gt; Ran Windows PowerShell locally.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;scp &lt;span class="s2"&gt;"D:&lt;/span&gt;&lt;span class="se"&gt;\D&lt;/span&gt;&lt;span class="s2"&gt;atasets&lt;/span&gt;&lt;span class="se"&gt;\k&lt;/span&gt;&lt;span class="s2"&gt;enya_counties.csv"&lt;/span&gt; magichun@159.65.222.96:/home/magichun/
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fu2zu7ocsje63vltnlhck.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fu2zu7ocsje63vltnlhck.png" alt="Successfully transferring kenya_counties.csv from local machine to server" width="800" height="157"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Then verified on the server:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;ls&lt;/span&gt; &lt;span class="nt"&gt;-lh&lt;/span&gt; /home/magichun/
&lt;span class="nb"&gt;head&lt;/span&gt; &lt;span class="nt"&gt;-5&lt;/span&gt; kenya_counties.csv
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frfitit80ppmrbqvk2g2s.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frfitit80ppmrbqvk2g2s.png" alt="Confirming kenya_counties.csv arrived on the server with ls -lh" width="796" height="61"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2mx5vvgpdvs3dlz9ianl.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2mx5vvgpdvs3dlz9ianl.png" alt="Previewing first 5 rows of kenya_counties.csv with head command" width="797" height="101"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;File existed: 89K, headers confirmed.&lt;/p&gt;

&lt;h3&gt;
  
  
  Creating the Table
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="err"&gt;\&lt;/span&gt;&lt;span class="k"&gt;c&lt;/span&gt; &lt;span class="n"&gt;magichun&lt;/span&gt; 
&lt;span class="k"&gt;SET&lt;/span&gt; &lt;span class="n"&gt;search_path&lt;/span&gt; &lt;span class="k"&gt;TO&lt;/span&gt; &lt;span class="n"&gt;staging&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;staging&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;kenya&lt;/span&gt; &lt;span class="n"&gt;_counties&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;id&lt;/span&gt; &lt;span class="nb"&gt;INT&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;first_name&lt;/span&gt; &lt;span class="nb"&gt;VARCHAR&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="n"&gt;last_name&lt;/span&gt; &lt;span class="nb"&gt;VARCHAR&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="n"&gt;email&lt;/span&gt; &lt;span class="nb"&gt;VARCHAR&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;150&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="n"&gt;gender&lt;/span&gt; &lt;span class="nb"&gt;VARCHAR&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;20&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="n"&gt;ip_address&lt;/span&gt; &lt;span class="nb"&gt;VARCHAR&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;20&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="n"&gt;county_name&lt;/span&gt; &lt;span class="nb"&gt;VARCHAR&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="n"&gt;population&lt;/span&gt; &lt;span class="nb"&gt;INT&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;area_sq_km&lt;/span&gt; &lt;span class="nb"&gt;DECIMAL&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="n"&gt;population_density&lt;/span&gt; &lt;span class="nb"&gt;DECIMAL&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="n"&gt;average_age&lt;/span&gt; &lt;span class="nb"&gt;DECIMAL&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fidt7z2o8a98fyuxa7hcc.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fidt7z2o8a98fyuxa7hcc.png" alt="Creating kenya_counties table inside the staging schema" width="799" height="400"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  The COPY Permission Wall
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;COPY&lt;/span&gt; &lt;span class="n"&gt;staging&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;kenya_counties&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="s1"&gt;'/home/magichun/kenya_counties.csv'&lt;/span&gt;
&lt;span class="k"&gt;DELIMITER&lt;/span&gt; &lt;span class="s1"&gt;','&lt;/span&gt; &lt;span class="n"&gt;CSV&lt;/span&gt; &lt;span class="n"&gt;HEADER&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Error:&lt;br&gt;
ERROR: could not open file "/home/magichun/kenya_counties.csv" for reading: Permission denied&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F95vjsfl2lxjla6d4y9wq.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F95vjsfl2lxjla6d4y9wq.png" alt="COPY command permission denied error when reading from home directory" width="800" height="435"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Another error that had me in a chokehold: I didn't understand what was going on until I realized it was an issue with permissions.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Linux Permissions&lt;/strong&gt; Linux file permissions form the foundation of the system's security model. They define who can read, write, or execute files and directories. &lt;a href="https://www.geeksforgeeks.org/linux-unix/set-file-permissions-linux/" rel="noopener noreferrer"&gt;CLICK HERE To read more on Linux Permissions&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;strong&gt;Fix - Two steps:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;chmod &lt;/span&gt;755 /home/magichun
&lt;span class="nb"&gt;chmod &lt;/span&gt;644 /home/magichun/kenya_counties.csv 
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then used the client-side &lt;code&gt;\copy&lt;/code&gt;wrapper instead, which streams the file through the active user session rather than the server process:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="err"&gt;\&lt;/span&gt;&lt;span class="k"&gt;copy&lt;/span&gt; &lt;span class="n"&gt;staging&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;kenya_counties&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="s1"&gt;'/home/magichun/kenya_counties.csv'&lt;/span&gt;
&lt;span class="k"&gt;DELIMITER&lt;/span&gt; &lt;span class="s1"&gt;','&lt;/span&gt; &lt;span class="n"&gt;CSV&lt;/span&gt; &lt;span class="n"&gt;HEADER&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Result: &lt;code&gt;copy 1000&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0lj3qcid1t6g7h4h21di.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0lj3qcid1t6g7h4h21di.png" alt="Setting directory and file permissions with chmod 755 and 644" width="800" height="435"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Verifying the Data
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;county_name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;population_density&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;staging&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;kenya_counties&lt;/span&gt;
&lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;population_density&lt;/span&gt; &lt;span class="k"&gt;DESC&lt;/span&gt;
&lt;span class="k"&gt;LIMIT&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Mombasa came out as the county with the highest density.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffzt5158fhjfeoio75vgy.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffzt5158fhjfeoio75vgy.png" alt="Query output showing ingested kenya counties data" width="800" height="262"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;COPY vs \copy:&lt;/strong&gt; Felt the need to touch on this as it confused me at first and it might help someone else confused about it.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. COPY (SQL Command)&lt;br&gt;
-&lt;/strong&gt;Runs on the PostgreSQL server side.&lt;br&gt;
-&lt;strong&gt;File path is relative to the server&lt;/strong&gt;, not your local machine &lt;br&gt;
-Requires the PostgreSQL server process to have &lt;strong&gt;read/write&lt;/strong&gt;&lt;br&gt;
permissions to that file path.&lt;/p&gt;


&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;COPY&lt;/span&gt; &lt;span class="n"&gt;my_table&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="s1"&gt;'/var/lib/postgresql/data/input.csv'&lt;/span&gt; &lt;span class="n"&gt;CSV&lt;/span&gt; &lt;span class="n"&gt;HEADER&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;br&gt;
sql&lt;br&gt;
-This will fail if the file is on your local machine but not on the server.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. \copy (psql Meta-command)&lt;br&gt;
-&lt;/strong&gt;Runs on the client side** (inside the psql terminal)&lt;br&gt;
-Reads/writes files &lt;strong&gt;from your local machine&lt;/strong&gt; where psql is running&lt;br&gt;
-Uses the &lt;strong&gt;COPY&lt;/strong&gt; command internally, but streams data between the client and server.&lt;/p&gt;


&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="err"&gt;\&lt;/span&gt;&lt;span class="k"&gt;copy&lt;/span&gt; &lt;span class="n"&gt;my_table&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="s1"&gt;'C:/Users/you/data.csv'&lt;/span&gt; &lt;span class="n"&gt;CSV&lt;/span&gt; &lt;span class="n"&gt;HEADER&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;


&lt;p&gt;-Works even if the PostgreSQL server is remote, because the file is used locally&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Rule of thumb&lt;/strong&gt;&lt;br&gt;
-If your file is &lt;strong&gt;on the server&lt;/strong&gt;, use &lt;strong&gt;COPY&lt;/strong&gt;.&lt;br&gt;
-If your file is &lt;strong&gt;on your local machine,&lt;/strong&gt; use &lt;strong&gt;\copy&lt;/strong&gt; in psql.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  Section 5: The Reverse Pipeline - Exporting Back to windows.
&lt;/h2&gt;

&lt;p&gt;The final step completed the loop, moving the file back to my local machine&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="err"&gt;\&lt;/span&gt;&lt;span class="k"&gt;copy&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;county_name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;population_density&lt;/span&gt; 
       &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;staging&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;kenya_counties&lt;/span&gt;
       &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;population_density&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;500&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;TO&lt;/span&gt; &lt;span class="s1"&gt;'/home/magichun/high_density_counties.csv'&lt;/span&gt;
&lt;span class="k"&gt;WITH&lt;/span&gt; &lt;span class="n"&gt;CSV&lt;/span&gt; &lt;span class="n"&gt;HEADER&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcq054u81r41tpkei9lsx.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcq054u81r41tpkei9lsx.png" alt="Exporting high density counties to CSV file on the server" width="800" height="60"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Output: &lt;code&gt;COPY 205&lt;/code&gt; - 205 high-density records exported&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;scp &lt;span class="nt"&gt;-p&lt;/span&gt; 22 magichu@159.65.222.96:/home/magichun/high_density_counties.csv &lt;span class="s2"&gt;"D:&lt;/span&gt;&lt;span class="se"&gt;\D&lt;/span&gt;&lt;span class="s2"&gt;atasets"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;100% download confirmed&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fb9hmeyn4zun28huksk4k.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fb9hmeyn4zun28huksk4k.png" alt="scp file download from server to local machine" width="800" height="428"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Here's a quick reference of the commands used throughout this assignment:
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;#&lt;/th&gt;
&lt;th&gt;Command&lt;/th&gt;
&lt;th&gt;Purpose&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;&lt;code&gt;ssh root@&amp;lt;IP&amp;gt;&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Remote server login&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;&lt;code&gt;sudo adduser magichun&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Create a new system user&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;td&gt;&lt;code&gt;sudo usermod -aG sudo magichun&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Grant admin privileges&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;4&lt;/td&gt;
&lt;td&gt;&lt;code&gt;id magichun&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Verify user and group membership&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;5&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;cat /etc/passwd \&lt;/code&gt; &lt;code&gt;grep magichun&lt;/code&gt;
&lt;/td&gt;
&lt;td&gt;Confirm user in system registry&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;6&lt;/td&gt;
&lt;td&gt;&lt;code&gt;su magichun&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Switch active user&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;7&lt;/td&gt;
&lt;td&gt;&lt;code&gt;whoami&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Check current identity&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;8&lt;/td&gt;
&lt;td&gt;&lt;code&gt;pwd&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Print working directory&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;9&lt;/td&gt;
&lt;td&gt;&lt;code&gt;psql --version&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Check PostgreSQL version&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;10&lt;/td&gt;
&lt;td&gt;&lt;code&gt;sudo systemctl status postgresql&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Check service status&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;11&lt;/td&gt;
&lt;td&gt;&lt;code&gt;sudo -i -u postgres&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Switch to postgres admin&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;12&lt;/td&gt;
&lt;td&gt;&lt;code&gt;ls -lh /home/magichun/&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;List files with sizes&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;13&lt;/td&gt;
&lt;td&gt;&lt;code&gt;head -5 kenya_counties.csv&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Preview file contents&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;14&lt;/td&gt;
&lt;td&gt;&lt;code&gt;chmod 755 /home/magichun&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Open directory permissions&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;15&lt;/td&gt;
&lt;td&gt;&lt;code&gt;chmod 644 /home/magichun/kenya_counties.csv&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Set file read permissions&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;16&lt;/td&gt;
&lt;td&gt;&lt;code&gt;scp "D:\file.csv" user@IP:/path/&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Upload file to server&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;17&lt;/td&gt;
&lt;td&gt;&lt;code&gt;scp -P 22 user@IP:/path/file.csv "D:\dest"&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Download file from server&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Clearly, this assignment wasn't a walk in the park, including writing this article, but I pushed through, and that's where growth happens. Perhaps the biggest lesson was realizing that some of the skills I was most confident in were the very ones that exposed my blind spots.&lt;br&gt;
Working on the errors and mistakes through research and external help really brought growth to my skills and confidence. &lt;br&gt;
As someone still early in this journey, I'd appreciate comments, suggestions, and corrections, as they are my real learning curve and the most valuable part of my journey.&lt;/p&gt;

&lt;p&gt;*This article documents a real setup task, configuring servers and pulling data, a project completed as part of the LuxDevHQ Data Engineering Program.&lt;/p&gt;

&lt;h2&gt;
  
  
  Github Repository
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://github.com/lone-sudo/Linux-fundamentals-data-engineering" rel="noopener noreferrer"&gt;View the full project on Github&lt;/a&gt;&lt;/p&gt;

</description>
      <category>beginners</category>
      <category>postgres</category>
      <category>dataengineering</category>
      <category>linux</category>
    </item>
  </channel>
</rss>
