<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: David Mwandairo</title>
    <description>The latest articles on DEV Community by David Mwandairo (@david_mwandairo_777f888b4).</description>
    <link>https://dev.to/david_mwandairo_777f888b4</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3952232%2F91399198-fc4d-4e4b-94f6-42ce0687951b.png</url>
      <title>DEV Community: David Mwandairo</title>
      <link>https://dev.to/david_mwandairo_777f888b4</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/david_mwandairo_777f888b4"/>
    <language>en</language>
    <item>
      <title>Connecting Data the Right Way: Modeling, Relationships, and Schema Design in Power BI</title>
      <dc:creator>David Mwandairo</dc:creator>
      <pubDate>Mon, 29 Jun 2026 01:39:51 +0000</pubDate>
      <link>https://dev.to/david_mwandairo_777f888b4/connecting-data-the-right-way-modeling-relationships-and-schema-design-in-power-bi-4602</link>
      <guid>https://dev.to/david_mwandairo_777f888b4/connecting-data-the-right-way-modeling-relationships-and-schema-design-in-power-bi-4602</guid>
      <description>&lt;p&gt;As a data professional, one thing I have had to accept is that raw data rarely comes in one neat package. Most of the time it is scattered, messy, and spread across tables that do not naturally talk to each other. Have you ever stayed up late staring at several spreadsheets, trying to figure out how they relate to each other? Power BI was built to help with exactly this: it gives you the tools to create relationships between separate, messy sets of data. Before building beautiful dashboards with Power BI, however, you need to understand what is happening beneath the surface.&lt;/p&gt;

&lt;p&gt;This article examines the importance of &lt;strong&gt;data modelling&lt;/strong&gt;, &lt;strong&gt;relationships&lt;/strong&gt;, &lt;strong&gt;schemas&lt;/strong&gt; and &lt;strong&gt;joins&lt;/strong&gt; in Power BI. Understanding how these concepts work together and getting them right makes all the difference in creating insightful dashboards.&lt;/p&gt;

&lt;h2&gt;
  
  
  Beyond the Spreadsheet: Giving Your Data a Voice through Meaningful Relationships
&lt;/h2&gt;

&lt;p&gt;Data modelling involves structuring and organizing your data so that Power BI can make sense of it. It's similar to setting up a filing system before you start working. Without a sensible structure in place, your reports will either break or produce results that cannot be trusted.&lt;/p&gt;

&lt;p&gt;In Power BI, the data model resides in the &lt;strong&gt;Model View&lt;/strong&gt;, where all your tables and the connections between them are visibly laid out. A good data model answers one simple question: &lt;em&gt;how does each piece of data relate to every other piece?&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;In this article, we will use one relatable example. Imagine you own an online bookstore and your data lives in three different tables:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Customers&lt;/strong&gt; - stores customer names, emails and locations.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Orders&lt;/strong&gt; - stores order IDs, dates, amounts, and customer references.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Books&lt;/strong&gt; - stores book titles, authors, genres, and prices.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Each table is useful on its own but limited. A good data model connects them so that Power BI understands the full story: who bought what, when, and for how much.&lt;/p&gt;

&lt;h2&gt;
  
  
  Using Joins in Power BI to Merge Tables
&lt;/h2&gt;

&lt;p&gt;A &lt;strong&gt;join&lt;/strong&gt; is an operation that combines rows from two tables based on a related column they share. This happens in Power BI within the &lt;strong&gt;Power Query Editor&lt;/strong&gt;, which is the data transformation workspace before the data enters the model. To perform a join in Power Query, you navigate to:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Home &amp;gt; Merge Queries
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You then select the two tables that you would like to combine and choose the matching columns. For our bookstore, you would match the &lt;strong&gt;Customer_ID&lt;/strong&gt; column in the Orders table with the &lt;strong&gt;Customer_ID&lt;/strong&gt; column in the Customers table. &lt;/p&gt;

&lt;p&gt;There are six types of joins in Power BI and picking the right one is critical:&lt;br&gt;
&lt;em&gt;&lt;strong&gt;Note&lt;/strong&gt;: The left table is the primary table you will choose and make reference to. The right table will be the table you'd like to establish a connection with.&lt;/em&gt;&lt;/p&gt;
&lt;h3&gt;
  
  
  Inner Join
&lt;/h3&gt;

&lt;p&gt;This join returns only the rows where there is a match in &lt;u&gt;both&lt;/u&gt; tables. In our bookstore data, only customers who have placed at least one order will appear in the results.&lt;/p&gt;
&lt;h3&gt;
  
  
  Left Outer Join
&lt;/h3&gt;

&lt;p&gt;This join returns &lt;u&gt;all rows&lt;/u&gt; from the left table and only the matching rows from the right table. From our bookstore data, it will return all the customers, even the ones with no orders yet.&lt;/p&gt;
&lt;h3&gt;
  
  
  Right Outer Join
&lt;/h3&gt;

&lt;p&gt;This is the opposite of a left outer join. It returns all rows from the right table and only the matching rows from the left. From our bookstore example, it will return all the orders even if the customer record is missing.&lt;/p&gt;
&lt;h3&gt;
  
  
  Full Outer Join
&lt;/h3&gt;

&lt;p&gt;This join returns &lt;u&gt;everything&lt;/u&gt; from both tables whether they match or not and missing values will appear as null. From our data, it will return every customer and every order regardless of a match.&lt;/p&gt;
&lt;h3&gt;
  
  
  Left Anti Join
&lt;/h3&gt;

&lt;p&gt;This join returns only the rows from the left table that have &lt;u&gt;no match&lt;/u&gt; in the right table. From our bookstore data, it will return the customers who have never placed an order, which makes it useful for targeting inactive users.&lt;/p&gt;
&lt;h3&gt;
  
  
  Right Anti Join
&lt;/h3&gt;

&lt;p&gt;This join returns the rows from the right that have no match in the left table. In our case, it will return the orders with no associated customer record.&lt;/p&gt;

&lt;p&gt;Selecting the wrong type of join can subtly corrupt your reports, so always pause and ask yourself what you actually need to see.&lt;/p&gt;
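&lt;p&gt;To make the six join kinds concrete, here is a minimal Python sketch that mimics them on toy data. It is not how Power Query works internally; the sample rows and the &lt;code&gt;merge&lt;/code&gt; helper are hypothetical, and Customers is assumed to be the left table with Orders on the right.&lt;/p&gt;

```python
# Toy bookstore rows; the IDs and names are made up for illustration.
customers = [
    {"Customer_ID": 1, "Name": "Amina"},
    {"Customer_ID": 2, "Name": "Brian"},
    {"Customer_ID": 3, "Name": "Chao"},
]
orders = [
    {"Order_ID": 101, "Customer_ID": 1},
    {"Order_ID": 102, "Customer_ID": 1},
    {"Order_ID": 103, "Customer_ID": 4},  # no matching customer record
]


def merge(left, right, key, kind):
    """Combine two lists of dict rows in the spirit of the six join kinds."""
    left_keys = {row[key] for row in left}
    right_keys = {row[key] for row in right}
    # Row pairs whose key value appears on both sides.
    matched = [
        {**lrow, **rrow}
        for lrow in left
        for rrow in right
        if lrow[key] == rrow[key]
    ]
    # Unmatched rows simply lack the other table's columns here;
    # Power Query would show those columns as null.
    left_only = [row for row in left if row[key] not in right_keys]
    right_only = [row for row in right if row[key] not in left_keys]
    results = {
        "inner": matched,
        "left_outer": matched + left_only,
        "right_outer": matched + right_only,
        "full_outer": matched + left_only + right_only,
        "left_anti": left_only,
        "right_anti": right_only,
    }
    return results[kind]


for kind in ["inner", "left_outer", "right_outer",
             "full_outer", "left_anti", "right_anti"]:
    rows = merge(customers, orders, "Customer_ID", kind)
    print(kind, len(rows))
```

&lt;p&gt;Comparing the row counts of the different kinds on a small sample like this is a quick sanity check before committing to a join in a real report.&lt;/p&gt;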
&lt;h2&gt;
  
  
  Connecting the Dots: How Relationships Shape Power BI Data Models
&lt;/h2&gt;

&lt;p&gt;While joins are necessary in connecting tables during the data preparation stage, &lt;strong&gt;relationships&lt;/strong&gt; connect tables at the model level, thus allowing them to interact during analysis. In the &lt;strong&gt;Model View&lt;/strong&gt;, relationships are created by dragging a column from one table and dropping it onto the matching column in another table. The relationship between them will be represented by a line that Power BI will draw between the two tables.&lt;/p&gt;
&lt;h3&gt;
  
  
  Types of Relationships
&lt;/h3&gt;
&lt;h4&gt;
  
  
  One-to-Many (1:*)
&lt;/h4&gt;

&lt;p&gt;In this relationship type, one record in table X relates to multiple records in table Y, but each record in table Y relates to only one record in table X. For example, one customer can make many orders but each order belongs to one customer.&lt;/p&gt;
&lt;h4&gt;
  
  
  Many-to-Many (*:*)
&lt;/h4&gt;

&lt;p&gt;Several records in table X can relate to multiple records in table Y, and vice versa. Power BI supports this relationship type, but it is complex and can return unexpected results if not handled carefully. For example, one book can appear in many orders and a single order can contain many books.&lt;/p&gt;
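&lt;p&gt;One common way to handle a many-to-many case (an assumption here, not something this article prescribes) is a bridge table with one row per order and book pair, so that each side becomes a one-to-many relationship. The sketch below uses a hypothetical &lt;code&gt;order_lines&lt;/code&gt; table and made-up IDs.&lt;/p&gt;

```python
# Hypothetical bridge table: one row per (order, book) pair.
order_lines = [
    {"Order_ID": 101, "Book_ID": "B1"},
    {"Order_ID": 101, "Book_ID": "B2"},
    {"Order_ID": 102, "Book_ID": "B1"},
]


def books_in_order(order_id):
    """All books that belong to one order (order to many books)."""
    return sorted(
        line["Book_ID"] for line in order_lines if line["Order_ID"] == order_id
    )


def orders_with_book(book_id):
    """All orders that contain one book (book to many orders)."""
    return sorted(
        line["Order_ID"] for line in order_lines if line["Book_ID"] == book_id
    )


print(books_in_order(101))
print(orders_with_book("B1"))
```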
&lt;h4&gt;
  
  
  One-to-One (1:1)
&lt;/h4&gt;

&lt;p&gt;Each record in table X matches exactly one record in table Y. For example, each customer has exactly one loyalty profile record.&lt;/p&gt;
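&lt;p&gt;The relationship type follows from whether the key values are unique on each side. The sketch below illustrates that idea with a hypothetical &lt;code&gt;cardinality&lt;/code&gt; helper and made-up key columns; it is not Power BI's own detection logic.&lt;/p&gt;

```python
from collections import Counter


def side(values):
    """Return "1" if every key value is unique in the column, else "*"."""
    counts = Counter(values)
    return "1" if all(n == 1 for n in counts.values()) else "*"


def cardinality(left_keys, right_keys):
    """Describe the relationship between two key columns, e.g. "1:*"."""
    return side(left_keys) + ":" + side(right_keys)


# Made-up key columns for the bookstore example.
customer_ids = [1, 2, 3]             # Customers: one row per customer
order_customer_ids = [1, 1, 2]       # Orders: a customer can repeat
loyalty_customer_ids = [1, 2, 3]     # Loyalty profiles: one per customer
order_line_book_ids = ["B1", "B1", "B2"]
order_line_order_ids = [101, 102, 101]

print(cardinality(customer_ids, order_customer_ids))
print(cardinality(customer_ids, loyalty_customer_ids))
print(cardinality(order_line_book_ids, order_line_order_ids))
```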
&lt;h3&gt;
  
  
  Cross-Filter Direction
&lt;/h3&gt;

&lt;p&gt;The cross-filter direction controls how filters flow between tables when you interact with a report. There are two options:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Single Direction&lt;/strong&gt; - Filters flow in one direction, i.e. from the lookup table (the "one" side) to the data table (the "many" side). This is the usual default.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Bidirectional&lt;/strong&gt; - Filters flow in both directions. If overused, this can produce confusing results.&lt;/li&gt;
&lt;/ul&gt;
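&lt;p&gt;The two directions can be sketched in plain Python. This is only an illustration of the idea under made-up data; the function names and sample rows are hypothetical, and Power BI applies these filters for you through the relationship.&lt;/p&gt;

```python
# Made-up lookup table (Customers) and data table (Orders).
customers = [
    {"Customer_ID": 1, "Location": "Nairobi"},
    {"Customer_ID": 2, "Location": "Mombasa"},
]
orders = [
    {"Order_ID": 101, "Customer_ID": 1, "Amount": 20.0},
    {"Order_ID": 102, "Customer_ID": 2, "Amount": 35.0},
    {"Order_ID": 103, "Customer_ID": 1, "Amount": 15.0},
]


def orders_for_location(location):
    """Single direction: a filter on Customers reaches Orders."""
    selected_ids = {
        c["Customer_ID"] for c in customers if c["Location"] == location
    }
    return [o for o in orders if o["Customer_ID"] in selected_ids]


def locations_for_orders(order_ids):
    """Reverse flow, which only a bidirectional filter would allow."""
    customer_ids = {
        o["Customer_ID"] for o in orders if o["Order_ID"] in order_ids
    }
    return sorted(
        c["Location"] for c in customers if c["Customer_ID"] in customer_ids
    )


nairobi_orders = orders_for_location("Nairobi")
print(sum(o["Amount"] for o in nairobi_orders))
print(locations_for_orders({102}))
```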
&lt;h2&gt;
  
  
  How Schema Design Shapes Your Data Model
&lt;/h2&gt;

&lt;p&gt;A &lt;strong&gt;schema&lt;/strong&gt; is the basic layout and structure of a data model. The schema you choose affects both the performance and the readability of your Power BI reports. Two schema types are commonly used in Power BI:&lt;/p&gt;
&lt;h3&gt;
  
  
  Star Schema ⭐
&lt;/h3&gt;

&lt;p&gt;This schema consists of one central &lt;strong&gt;Fact Table&lt;/strong&gt; surrounded by multiple &lt;strong&gt;Dimension Tables&lt;/strong&gt;. The fact table holds the numerical, measurable data, while the dimension tables hold the descriptive, categorical data. The diagram below shows a star schema for our bookstore:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;        [Customers]
             |
[Books] — [Orders] — [Dates]
             |
         [Locations]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Table Type&lt;/th&gt;
&lt;th&gt;Example&lt;/th&gt;
&lt;th&gt;Contains&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Fact Table&lt;/td&gt;
&lt;td&gt;Orders&lt;/td&gt;
&lt;td&gt;Order amounts, quantities, revenue&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Dimension Table&lt;/td&gt;
&lt;td&gt;Customers&lt;/td&gt;
&lt;td&gt;Names, emails, locations&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Dimension Table&lt;/td&gt;
&lt;td&gt;Books&lt;/td&gt;
&lt;td&gt;Titles, authors, genres, prices&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Dimension Table&lt;/td&gt;
&lt;td&gt;Dates&lt;/td&gt;
&lt;td&gt;Day, month, quarter, year&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The star schema is fast and easy to understand. Power BI's engine is optimized to work with this structure, hence reports will load faster and DAX calculations will work more efficiently. &lt;/p&gt;
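&lt;p&gt;As a rough illustration of why this layout is convenient, the sketch below sums a measure from the fact table and groups it by an attribute from a dimension table with a single lookup. The data and the &lt;code&gt;revenue_by_genre&lt;/code&gt; function are hypothetical; in Power BI this would be a DAX measure sliced by a dimension column.&lt;/p&gt;

```python
from collections import defaultdict

# Dimension table: descriptive attributes, one row per book (made-up data).
books = {
    "B1": {"Title": "Book One", "Genre": "Sci-Fi"},
    "B2": {"Title": "Book Two", "Genre": "Romance"},
}

# Fact table: measurable events, possibly many rows per book.
orders = [
    {"Order_ID": 101, "Book_ID": "B1", "Amount": 12.0},
    {"Order_ID": 102, "Book_ID": "B2", "Amount": 8.0},
    {"Order_ID": 103, "Book_ID": "B1", "Amount": 12.0},
]


def revenue_by_genre():
    """Sum the fact table's Amount, grouped by the dimension's Genre."""
    totals = defaultdict(float)
    for order in orders:
        genre = books[order["Book_ID"]]["Genre"]  # one hop: fact to dimension
        totals[genre] += order["Amount"]
    return dict(totals)


print(revenue_by_genre())
```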

&lt;h3&gt;
  
  
  Snowflake Schema ❄️
&lt;/h3&gt;

&lt;p&gt;This schema extends the star schema by breaking the dimension tables down into further related tables. Its branching layout resembles a snowflake, as shown below:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;[Author Details] — [Books] — [Orders] — [Customers] — [Customer Segments]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The snowflake schema may save storage space and reduce data redundancy, but it adds complexity to the data model. Filters have to travel through more relationships to reach their destination, which can slow down performance and make the model harder to maintain. &lt;br&gt;
&lt;em&gt;&lt;strong&gt;Pro-tip: Stick with the star schema when starting out with Power BI because it is the most used in Power BI projects.&lt;/strong&gt;&lt;/em&gt;&lt;/p&gt;
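&lt;p&gt;The extra hop can be seen in a small sketch. Here a hypothetical &lt;code&gt;authors&lt;/code&gt; table has been split out of &lt;code&gt;books&lt;/code&gt;, so grouping orders by the author's country needs two lookups instead of one. All data and names are made up for illustration.&lt;/p&gt;

```python
# Sub-dimension split out of Books (made-up data).
authors = {
    "A1": {"Name": "Author One", "Country": "Kenya"},
    "A2": {"Name": "Author Two", "Country": "Nigeria"},
}

# Dimension table that now only references the author.
books = {
    "B1": {"Title": "Book One", "Author_ID": "A1"},
    "B2": {"Title": "Book Two", "Author_ID": "A2"},
}

# Fact table.
orders = [
    {"Order_ID": 101, "Book_ID": "B1", "Amount": 12.0},
    {"Order_ID": 102, "Book_ID": "B2", "Amount": 8.0},
    {"Order_ID": 103, "Book_ID": "B1", "Amount": 12.0},
]


def revenue_by_author_country():
    """Group order amounts by country, following two relationships."""
    totals = {}
    for order in orders:
        book = books[order["Book_ID"]]         # hop 1: fact to dimension
        author = authors[book["Author_ID"]]    # hop 2: dimension to sub-dimension
        country = author["Country"]
        totals[country] = totals.get(country, 0.0) + order["Amount"]
    return totals


print(revenue_by_author_country())
```

&lt;p&gt;In a star schema the country would sit directly on the Books dimension, and the second lookup would disappear.&lt;/p&gt;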

&lt;h2&gt;
  
  
  Why Do We Need to Know This?
&lt;/h2&gt;

&lt;p&gt;Charts and visuals are exciting to create when we showcase our Power BI projects, but a poorly structured data model will produce bad results. Typical symptoms include filters that behave strangely and numbers that do not add up in DAX calculations. &lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;A solid foundation in the creation of good data models will save you from hours of debugging. - David Mwandairo&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Getting your &lt;strong&gt;joins&lt;/strong&gt;, &lt;strong&gt;relationships&lt;/strong&gt; and &lt;strong&gt;schema&lt;/strong&gt; right from the beginning results in:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Simpler DAX calculations.&lt;/li&gt;
&lt;li&gt;Faster report performance.&lt;/li&gt;
&lt;li&gt;Dashboards that are easier to scale and maintain.&lt;/li&gt;
&lt;li&gt;Accurate and trustworthy numbers.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The data model is the foundation of a good Power BI report. When you build it well, everything else will become considerably easier.&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;Power BI is a capable Business Intelligence tool. It just requires you to understand the fundamentals, and it will treat you well. Joins, modelling and relationships aren't the most exciting topics, but understanding them well separates data professionals whose reports merely look good from those whose reports actually work.&lt;/p&gt;

&lt;p&gt;To get the most out of the star schema, use joins intentionally, keep the one-to-many relationships you build clean, and always ask yourself whether your data reflects real-world logic. Once that foundation is solid, the dashboards and insights will take care of themselves.&lt;/p&gt;

</description>
      <category>learning</category>
      <category>data</category>
      <category>modelling</category>
      <category>powerbi</category>
    </item>
    <item>
      <title>Linux Fundamentals for Data Engineering</title>
      <dc:creator>David Mwandairo</dc:creator>
      <pubDate>Mon, 15 Jun 2026 22:07:34 +0000</pubDate>
      <link>https://dev.to/david_mwandairo_777f888b4/linux-fundamentals-for-data-engineering-1hii</link>
      <guid>https://dev.to/david_mwandairo_777f888b4/linux-fundamentals-for-data-engineering-1hii</guid>
      <description>&lt;h2&gt;
  
  
  Introduction
&lt;/h2&gt;

&lt;p&gt;The Linux operating system has become the go-to platform for data engineering workloads in the modern data landscape. Some of the prominent data tools it powers include:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Amazon Redshift, a cloud-based data warehouse that runs on a customized version of Linux.&lt;/li&gt;
&lt;li&gt;Apache Spark, a big data framework that runs on Linux clusters.&lt;/li&gt;
&lt;li&gt;Apache Airflow, an orchestration tool that is deployed on Linux servers.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Since these industry-standard tools depend on Linux to run, proficiency in Linux is now a core competency. Data engineers need to know how to navigate the file system, manipulate files and automate data workflows from the command line interface. This article covers some of the basic commands that a Linux beginner should know.&lt;/p&gt;

&lt;h2&gt;
  
  
  Connecting to a Remote Server
&lt;/h2&gt;

&lt;p&gt;To demonstrate the commands, I first connect to a remote Linux server using the secure shell (&lt;code&gt;ssh&lt;/code&gt;) command. Since I use the Fedora Linux distribution, I did not have to install any proprietary software to access the server; I can do it directly from the terminal. To connect through a specific port, I added &lt;code&gt;-p 22&lt;/code&gt; after the server IP address (22 is the default SSH port) and entered the password to log in.&lt;br&gt;
&lt;code&gt;ssh root@159.65.222.96 -p 22&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F60ds694phi7hizwoawwf.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F60ds694phi7hizwoawwf.png" alt="Logging in to the remote server" width="800" height="700"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Once I logged in, I first refreshed the server's package index using the command &lt;code&gt;sudo apt update&lt;/code&gt;, so that the latest package versions are available for installation. Package management commands differ between Linux distributions. In this case, we use &lt;code&gt;apt&lt;/code&gt; because the server runs Ubuntu.&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F39nt92krqopslw7yfcpn.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F39nt92krqopslw7yfcpn.png" alt="Updating the server" width="800" height="839"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;After completing the update, I tried to create my own user account on the server using the command &lt;code&gt;sudo adduser DavidM&lt;/code&gt;, to isolate my workflow.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fuc8v6g6g4f20mt3olw9u.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fuc8v6g6g4f20mt3olw9u.png" alt="Adding the user" width="800" height="319"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;I got an error because the user name did not follow the server's default naming convention, but I wanted to keep the name as it is to make it distinct. To bypass the check, I used the command &lt;code&gt;sudo adduser DavidMw --force-badname&lt;/code&gt; and proceeded to create my user account.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1cdwy7fi8gsxnv8w86vk.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1cdwy7fi8gsxnv8w86vk.png" alt="Force add my specific user name" width="800" height="640"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;To switch to my user, I run the command &lt;code&gt;su DavidMw&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgqeqexp1j5orj45wo9xq.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgqeqexp1j5orj45wo9xq.png" alt="Switch to my user account" width="799" height="86"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Since my account is active, I can alternatively log in to the server directly, without going through the root user, as shown below. To do this, I run the command &lt;code&gt;ssh DavidMw@159.65.222.96 -p 22&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvliafbbpk3pw2us80msn.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvliafbbpk3pw2us80msn.png" alt="Direct log in to my user account" width="800" height="700"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;h3&gt;
  
  
  Checking the PostgreSQL Version
&lt;/h3&gt;

&lt;p&gt;To check the installed PostgreSQL version, I run the command &lt;code&gt;psql --version&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8r4p0hemn2bh6zd6a0u4.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8r4p0hemn2bh6zd6a0u4.png" alt="Check Postgresql version" width="799" height="214"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Next, I check the status of the PostgreSQL service using the command &lt;code&gt;sudo systemctl status postgresql&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fteq1heo971qnvr7p6lzj.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fteq1heo971qnvr7p6lzj.png" alt="Check psql server status" width="800" height="363"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Now that I have confirmed that the service is active, I switch to the &lt;code&gt;postgres&lt;/code&gt; system user using the command &lt;code&gt;sudo -i -u postgres&lt;/code&gt;. After that, I open the PostgreSQL interactive terminal using the command &lt;code&gt;psql&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F81pst57diahqrgsz4g34.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F81pst57diahqrgsz4g34.png" alt="Accessing the Postgres user interface" width="800" height="363"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;h4&gt;
  
  
  Creating a Database and Schema
&lt;/h4&gt;

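&lt;p&gt;Now that I am at the &lt;code&gt;psql&lt;/code&gt; prompt, I can create my database using the SQL statement &lt;code&gt;CREATE DATABASE DavidMw;&lt;/code&gt;. Because the name is not quoted, PostgreSQL folds it to lowercase, so the database is stored as &lt;code&gt;davidmw&lt;/code&gt;.&lt;/p&gt;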

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fzyczfdrskmhibff0u5mw.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fzyczfdrskmhibff0u5mw.png" alt="Create database" width="799" height="327"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;To confirm whether it has been created, I run the command &lt;code&gt;\l&lt;/code&gt; and I am able to see it as shown below. To exit the list of databases, I press 'q'.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fg3w6w5pflc1xaxemlk1a.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fg3w6w5pflc1xaxemlk1a.png" alt="List database command" width="799" height="174"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fi7mvdbil9iig5u3dx8yy.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fi7mvdbil9iig5u3dx8yy.png" alt="Database list" width="800" height="1000"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Next, I create a schema named 'staging' within my database, into which I will upload my data. To do that, I first connect to my database using the &lt;code&gt;\c&lt;/code&gt; command followed by the database name, then I run the statement &lt;code&gt;CREATE SCHEMA staging;&lt;/code&gt;. &lt;br&gt;
&lt;strong&gt;&lt;em&gt;Note: Always end each SQL statement with a semicolon (;) so that psql knows where the statement ends.&lt;/em&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fk47vr0b0f08znh28461c.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fk47vr0b0f08znh28461c.png" alt="create schema" width="800" height="282"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;To view the schema, I run the command &lt;code&gt;\dn&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Flcsjkrih5npvbywy2ovl.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Flcsjkrih5npvbywy2ovl.png" alt="View schema" width="800" height="468"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;h4&gt;
  
  
  Upload Data to a Schema
&lt;/h4&gt;

&lt;p&gt;To upload data to a schema, we use DBeaver, a universal database management tool. We start by connecting it to the PostgreSQL server following the steps below.&lt;br&gt;
First, I create a new connection in DBeaver.&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3q3f1w458z1rh0ed4fco.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3q3f1w458z1rh0ed4fco.png" alt="create new connection" width="800" height="533"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Next, I set up the connection using the details I created previously.&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fje3bgho96kkeuanhmx1x.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fje3bgho96kkeuanhmx1x.png" alt="test connection" width="800" height="533"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Next, I establish my connection and connect to my database.&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fimyd6d7u6xkwzsua5sax.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fimyd6d7u6xkwzsua5sax.png" alt="connection established" width="800" height="533"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;To import the data, I right-click the staging schema and choose the "Import Data" option, after which I select the file I want to upload.&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F51gk6zg2khkmvkkse02c.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F51gk6zg2khkmvkkse02c.png" alt="import data" width="800" height="533"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;In this case, I select the '&lt;em&gt;salary_data.csv&lt;/em&gt;' file and continue.&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fr7zs9vxz8fm4vhmcmrg1.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fr7zs9vxz8fm4vhmcmrg1.png" alt="select salarydata.csv" width="800" height="533"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fuzkfwo7yzpsqmn4a8yu0.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fuzkfwo7yzpsqmn4a8yu0.png" alt="select salarydata.csv" width="800" height="533"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;After a few seconds, the data is uploaded.&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ft72wgj2ktnyka1697w6g.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ft72wgj2ktnyka1697w6g.png" alt="Data imported" width="800" height="533"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;h2&gt;
  
  
  Common Linux Commands
&lt;/h2&gt;

&lt;p&gt;The core Linux commands are used for creating, editing and navigating through files. Files live inside directories, so a data engineer should always know their current location in the file system. The command for this is &lt;code&gt;pwd&lt;/code&gt; (print working directory), as shown below:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwbq6jiyf176b3kq8px59.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwbq6jiyf176b3kq8px59.png" alt="pwd command" width="800" height="680"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;To see the contents of a directory, we use the &lt;code&gt;ls&lt;/code&gt; (list) command. To create a new directory, we use the &lt;code&gt;mkdir&lt;/code&gt; (make directory) command, used here to create 'newfolder', as shown below:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fuv8uy610kclb9y3ys19o.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fuv8uy610kclb9y3ys19o.png" alt="ls and mkdir" width="798" height="177"&gt;&lt;/a&gt;&lt;/p&gt;
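&lt;p&gt;The navigation commands covered so far can be tried together in a minimal sketch like the one below. It assumes a throwaway directory created with &lt;code&gt;mktemp -d&lt;/code&gt; (not part of the original walkthrough) so that no real files are touched; the variable names &lt;code&gt;workdir&lt;/code&gt; and &lt;code&gt;listing&lt;/code&gt; are illustrative only.&lt;/p&gt;

```shell
# Assumption: work in a temporary directory so the sketch is safe to run anywhere.
workdir=$(mktemp -d)
cd "$workdir"

pwd               # print the absolute path of the current directory
mkdir newfolder   # create a directory named 'newfolder'
ls                # list the contents; 'newfolder' now appears

# Keep the listing in a variable so it can be inspected later.
listing=$(ls)
```

&lt;p&gt;Running &lt;code&gt;pwd&lt;/code&gt; before and after a &lt;code&gt;cd&lt;/code&gt; is a quick way to confirm that a command is about to run in the intended directory.&lt;/p&gt;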

&lt;p&gt;When working with remote servers, files and directories can be uploaded directly with the &lt;code&gt;scp&lt;/code&gt; (secure copy) command. I first navigated into the folder containing the files using the &lt;code&gt;cd&lt;/code&gt; (change directory) command. Uploading a folder differs from uploading a file only in the &lt;code&gt;-r&lt;/code&gt; (recursive) option added after &lt;code&gt;scp&lt;/code&gt;. In my case, I uploaded the '&lt;em&gt;hotel_data.csv&lt;/em&gt;' and '&lt;em&gt;salary_data.csv&lt;/em&gt;' files and the '&lt;em&gt;stocks&lt;/em&gt;' folder to my instance of the server using the process below.&lt;br&gt;
&lt;strong&gt;&lt;em&gt;Note: Run the scp command from the command line of the origin device (the machine that holds the files).&lt;/em&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fybtk0dhaf72aeb7dbpv3.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fybtk0dhaf72aeb7dbpv3.png" alt="scp upload file and folder" width="799" height="621"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7zx5xes1m0mw6m9b4yqg.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7zx5xes1m0mw6m9b4yqg.png" alt="list scp file and folder" width="800" height="491"&gt;&lt;/a&gt;&lt;/p&gt;
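&lt;p&gt;As a rough sketch of the upload syntax, the snippet below builds the two &lt;code&gt;scp&lt;/code&gt; commands and only prints them (a dry run), since the real connection details are not given in this article. The user name, IP address and destination path are hypothetical placeholders; substitute your own before running the printed commands.&lt;/p&gt;

```shell
# Hypothetical connection details; replace all three with your own.
remote_user="user"
remote_host="203.0.113.10"   # address from a range reserved for documentation
remote_dir="/home/user"

# A single file needs no extra option.
file_cmd="scp hotel_data.csv ${remote_user}@${remote_host}:${remote_dir}/"

# A folder needs -r so that its contents are copied recursively.
dir_cmd="scp -r stocks ${remote_user}@${remote_host}:${remote_dir}/"

# Dry run: print the commands instead of executing them.
echo "$file_cmd"
echo "$dir_cmd"
```

&lt;p&gt;The general pattern is the source path first, followed by the destination written as user, host and remote path.&lt;/p&gt;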

&lt;p&gt;To create files in Linux, one can use either the &lt;code&gt;echo&lt;/code&gt; or &lt;code&gt;touch&lt;/code&gt; command. The &lt;code&gt;echo&lt;/code&gt; command prints text, which can be redirected into a file to create it with that content, while the &lt;code&gt;touch&lt;/code&gt; command creates an empty file. Below is an example of how I created the '&lt;em&gt;file1.txt&lt;/em&gt;' and '&lt;em&gt;file2.py&lt;/em&gt;' files using the two commands respectively.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftczlsgsykxr004npscih.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftczlsgsykxr004npscih.png" alt="echo and touch commands" width="800" height="348"&gt;&lt;/a&gt;&lt;/p&gt;
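&lt;p&gt;A minimal sketch of the two approaches is shown below. The text written into '&lt;em&gt;file1.txt&lt;/em&gt;' is an assumption made for illustration, and the temporary directory is only there to keep the example self-contained.&lt;/p&gt;

```shell
# Assumption: work in a temporary directory so no existing files are overwritten.
workdir=$(mktemp -d)
cd "$workdir"

# echo prints the text; the > operator redirects it into file1.txt,
# creating the file if it does not exist (and overwriting it if it does).
echo "Hello, Linux" > file1.txt

# touch creates file2.py as an empty file.
touch file2.py
```

&lt;p&gt;Because the redirection overwrites an existing file, it is worth checking the file name before pressing Enter.&lt;/p&gt;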

&lt;p&gt;To add content to the empty '&lt;em&gt;file2.py&lt;/em&gt;' file, I opened it in the vim editor by running the &lt;code&gt;vi&lt;/code&gt; command as shown below. Press "i" to enter insert mode and type the content. Once done, press "Esc", then type ":wq" and press "Enter" to save the changes and quit.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6uk175yfno7tlotsmgqc.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6uk175yfno7tlotsmgqc.png" alt="vi command" width="798" height="190"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fimcwgx2v18u3m7jah515.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fimcwgx2v18u3m7jah515.png" alt="Enter data and quit" width="800" height="713"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;To view the changes made, use the command &lt;code&gt;cat&lt;/code&gt; followed by the file name.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fip10roxjix2pctf92597.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fip10roxjix2pctf92597.png" alt="cat command" width="799" height="197"&gt;&lt;/a&gt;&lt;/p&gt;
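&lt;p&gt;The sketch below shows &lt;code&gt;cat&lt;/code&gt; on one file and then on two files at once, in which case the contents are printed one after the other. The file contents are made up for illustration and are written with &lt;code&gt;echo&lt;/code&gt; rather than an interactive editor so that the example can run unattended.&lt;/p&gt;

```shell
# Assumption: temporary directory and sample contents, for illustration only.
workdir=$(mktemp -d)
cd "$workdir"
echo "first file" > file1.txt
echo "print('second file')" > file2.py

# Print the contents of a single file.
cat file1.txt

# Print several files in the order given.
combined=$(cat file1.txt file2.py)
echo "$combined"
```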

&lt;p&gt;Directory contents can be listed in different formats. The plain &lt;code&gt;ls&lt;/code&gt; command shows only the names of the files in a directory. For more detail, use the following options:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;ls&lt;/span&gt; &lt;span class="nt"&gt;-l&lt;/span&gt;   &lt;span class="c"&gt;# long listing format: permissions, owner, size and modification time&lt;/span&gt;
&lt;span class="nb"&gt;ls&lt;/span&gt; &lt;span class="nt"&gt;-la&lt;/span&gt;  &lt;span class="c"&gt;# long listing format for all files, including hidden ones&lt;/span&gt;
&lt;span class="nb"&gt;ls&lt;/span&gt; &lt;span class="nt"&gt;-a&lt;/span&gt;   &lt;span class="c"&gt;# lists all files, including hidden ones (names starting with a dot)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fx48hsd1ywno6r9nm3azw.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fx48hsd1ywno6r9nm3azw.png" alt="listing files" width="800" height="708"&gt;&lt;/a&gt;&lt;/p&gt;
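&lt;p&gt;To see how the options differ, the sketch below creates one visible and one hidden file in a temporary directory and captures each listing. The file names are hypothetical; hidden files are simply those whose names start with a dot.&lt;/p&gt;

```shell
# Assumption: temporary directory with two sample files, for illustration only.
workdir=$(mktemp -d)
cd "$workdir"
touch visible.txt .hidden.txt

plain=$(ls)        # names only; hidden files are left out
all=$(ls -a)       # names only; hidden files and the . and .. entries are included
long=$(ls -l)      # one detailed line per visible file
long_all=$(ls -la) # detailed lines for every entry, hidden ones included

echo "$plain"
echo "$all"
echo "$long"
echo "$long_all"
```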

&lt;p&gt;Copying and moving files are also common tasks in Linux, handled by the &lt;code&gt;cp&lt;/code&gt; and &lt;code&gt;mv&lt;/code&gt; commands respectively. The &lt;code&gt;cp&lt;/code&gt; command comes in handy when creating backup or duplicate files. The &lt;code&gt;mv&lt;/code&gt; command moves files from one folder to another and can also rename them. Directories can be copied by adding &lt;code&gt;-r&lt;/code&gt; after the &lt;code&gt;cp&lt;/code&gt; command, as shown below.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmoq2ma264lx8k78b90hf.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmoq2ma264lx8k78b90hf.png" alt="copy file" width="800" height="248"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9g086hsjq8u5vg8fe8rx.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9g086hsjq8u5vg8fe8rx.png" alt="copy file into directory" width="800" height="322"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2ccw9fzjbkjh24q98mhh.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2ccw9fzjbkjh24q98mhh.png" alt="mv rename" width="800" height="148"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5ww9x6k22fsvtuueekmx.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5ww9x6k22fsvtuueekmx.png" alt="mv move to folder" width="798" height="184"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fv42p3udeatygdqjtw18z.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fv42p3udeatygdqjtw18z.png" alt="mv to rename in different folder" width="798" height="190"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4ixu33m1hyolqmf4j69z.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4ixu33m1hyolqmf4j69z.png" alt="cp directory" width="799" height="281"&gt;&lt;/a&gt;&lt;/p&gt;
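&lt;p&gt;The copy, move and rename operations from the screenshots above can be sketched in text form as follows. The file and folder names here are hypothetical and the commands run inside a temporary directory.&lt;/p&gt;

```shell
# Assumption: temporary directory and sample file names, for illustration only.
workdir=$(mktemp -d)
cd "$workdir"
echo "sample" > data.csv

cp data.csv data_backup.csv   # duplicate a file under a new name
mkdir archive
mv data_backup.csv archive/   # move the copy into the 'archive' folder
mv data.csv renamed.csv       # rename a file in place
cp -r archive archive_copy    # copy a directory and everything in it
```

&lt;p&gt;Note that &lt;code&gt;mv&lt;/code&gt; leaves no copy behind at the old location, whereas &lt;code&gt;cp&lt;/code&gt; keeps the original untouched.&lt;/p&gt;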

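&lt;p&gt;To delete files, we use the &lt;code&gt;rm&lt;/code&gt; command. Folders can be deleted by adding &lt;code&gt;-r&lt;/code&gt; to the &lt;code&gt;rm&lt;/code&gt; command, which removes the folder and its contents recursively. Deletion with &lt;code&gt;rm&lt;/code&gt; is permanent, with no recycle bin to restore from. To move up a directory, we use the &lt;code&gt;cd ..&lt;/code&gt; command.&lt;/p&gt;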

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fa8cuktclcsshrb31jh90.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fa8cuktclcsshrb31jh90.png" alt="delete files" width="799" height="284"&gt;&lt;/a&gt;&lt;/p&gt;
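&lt;p&gt;A small sketch of deleting a file, deleting a folder and moving up one level is given below. It assumes a temporary directory with hypothetical names so that nothing of value can be removed by accident.&lt;/p&gt;

```shell
# Assumption: temporary directory and sample names, for illustration only.
workdir=$(mktemp -d)
cd "$workdir"
mkdir project
mkdir project/old_logs
touch project/notes.txt project/old_logs/a.log

cd project
rm notes.txt      # delete a single file
rm -r old_logs    # delete a folder together with its contents
cd ..             # move up one level, back to the parent of 'project'
```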

&lt;h4&gt;
  
  
  File Permissions
&lt;/h4&gt;

&lt;p&gt;Setting file permissions helps data engineers control who can read, modify or run particular files. Permissions are displayed as a 10-character string such as &lt;code&gt;-rw-rw-r--&lt;/code&gt;. The first character represents the file type: "&lt;strong&gt;-&lt;/strong&gt;" for a regular file, "&lt;strong&gt;d&lt;/strong&gt;" for a directory and "&lt;strong&gt;l&lt;/strong&gt;" for a link. The next 3 characters represent the file owner's permissions, the middle 3 the group's permissions and the last 3 the permissions of all other users. Within each set, "r" means read, "w" means write, "x" means execute and "-" means the permission is not granted. There are two ways in which file permissions can be assigned, as shown in the tables below.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgkw0jr308ia72qh8qp5l.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgkw0jr308ia72qh8qp5l.png" alt="chown syntax 1" width="800" height="643"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1adv2hevi3v9t6kskajc.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1adv2hevi3v9t6kskajc.png" alt="chown syntax 2" width="799" height="335"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The example below shows how file permissions are set on the command line using the &lt;code&gt;chmod&lt;/code&gt; command.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8lrz81han651l2oaa6c0.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8lrz81han651l2oaa6c0.png" alt="chmod in action" width="800" height="580"&gt;&lt;/a&gt;&lt;/p&gt;
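&lt;p&gt;For a text version, the sketch below sets permissions with &lt;code&gt;chmod&lt;/code&gt; in both numeric and symbolic form and reads back the resulting permission string. The file name '&lt;em&gt;script.sh&lt;/em&gt;' and the chosen modes are assumptions for illustration. In numeric form each digit is the sum of read (4), write (2) and execute (1) for the owner, group and others in that order.&lt;/p&gt;

```shell
# Assumption: temporary directory and sample file, for illustration only.
workdir=$(mktemp -d)
cd "$workdir"
touch script.sh

chmod 664 script.sh   # numeric form: 6=rw-, 6=rw-, 4=r--  gives -rw-rw-r--
ls -l script.sh

chmod u+x script.sh   # symbolic form: add execute for the owner (u)
chmod 750 script.sh   # numeric form: 7=rwx, 5=r-x, 0=---  gives -rwxr-x---

# Keep only the first 10 characters of the long listing: the permission string.
perms=$(ls -l script.sh | cut -c1-10)
echo "$perms"
```

&lt;p&gt;The numeric form replaces all three permission sets at once, while the symbolic form changes only the part that is named, which makes it convenient for small adjustments.&lt;/p&gt;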

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;Mastering the Linux fundamentals transforms a data engineer from someone who just runs scripts into someone who can design, build and deploy robust data systems. These commands handle tasks such as file navigation and handling, remote access, file permissions and text processing, which form the foundation of data engineering. With frequent practice, the command line becomes a powerful tool for running data engineering tasks at scale.&lt;/p&gt;

</description>
      <category>linux</category>
      <category>dataengineering</category>
      <category>cli</category>
      <category>learning</category>
    </item>
  </channel>
</rss>
