
Odhiambo

How Linux is Used in Real-World Data Engineering (For Beginners)

Prerequisites

1. An Ubuntu server in the cloud that is reachable through a public IP address
2. Access to admin user credentials (password or private key for key-based authentication)

Linux For The Cloud

Cloud infrastructure is increasingly central to how firms manage their data. From storage to automation to analysis, many data engineering tools and resources are now deployed in the cloud.

Working in Linux environments is a necessary, if often unstated, skill for data engineers. Many of the tools and processes you will come across in the data engineering lifecycle assume you are comfortable on Linux. Linux is close to the default choice for cloud servers, largely because it is lightweight and well suited to cloud workloads.

This article therefore focuses on navigating and running commands from the Linux terminal. It gives you a beginner-friendly introduction to working from the terminal interface.

Accessing the server

You need to be able to access the server remotely to do routine management and maintenance tasks. Open a terminal and type the following command:

ssh root@143.110.225.134

Replace the IP address after the @ sign with your server's public IP address. The first time you connect, SSH will ask whether to add the server to your local list of known hosts. Type yes and press Enter to continue.

root@143.110.225.134's password: 
Welcome to Ubuntu 24.04.4 LTS (GNU/Linux 6.8.0-71-generic x86_64)

 * Documentation:  https://help.ubuntu.com
 * Management:     https://landscape.canonical.com
 * Support:        https://ubuntu.com/pro

 System information as of Wed Mar 25 14:54:56 UTC 2026

  System load:  0.0               Processes:             137
  Usage of /:   4.8% of 47.39GB   Users logged in:       4
  Memory usage: 20%               IPv4 address for eth0: 143.110.225.134
  Swap usage:   0%                IPv4 address for eth0: 10.48.0.5

Expanded Security Maintenance for Applications is not enabled.

0 updates can be applied immediately.

Enable ESM Apps to receive additional future security updates.
See https://ubuntu.com/esm or run: sudo pro status


*** System restart required ***
Last login: Wed Mar 25 14:35:08 2026 from 41.99.105.22
root@ubuntu:~#

NOTE: The values above are fictional and for demonstration only.

The prompt root@ubuntu shows that we are logged in as the user root and that the name of the host/server machine is ubuntu. If you run the pwd command, you should see that you are inside root's home folder, /root, as shown below.

root@ubuntu:~# pwd
/root
root@ubuntu:~# 
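Each part of the prompt can also be printed with a command. The snippet below is a minimal sketch, assuming a POSIX shell; the variable names are only for illustration, and the values in the comments are examples rather than guaranteed output.

```shell
# Rebuild a prompt-like string from the commands that report each part.
current_user="$(id -un)"    # name of the logged-in user, e.g. root
current_host="$(uname -n)"  # host name of the machine, e.g. ubuntu
current_dir="$(pwd)"        # full path of the current folder, e.g. /root

echo "${current_user}@${current_host}:${current_dir}"
```

Unlike the prompt, pwd always prints the full path of the folder you are in.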

Creating a user

Now let us create a user called odhiambo on this system. We use the _adduser_ command as shown below.

root@ubuntu:~# sudo adduser odhiambo

You will be prompted to set and confirm the user's password. Several optional prompts follow, such as the user's full name and phone number. We skip these by pressing Enter at each prompt.

Giving the user administrative privileges

By default, the user odhiambo does not have permission to run administrative tasks on the system. We grant these privileges by adding odhiambo to the sudo group with the following command:

root@ubuntu:~# sudo usermod -aG sudo odhiambo

The leading sudo runs the command with administrative privileges. Because we are already logged in as root it is not strictly required here, but it is needed when a regular user with sudo rights runs the same command.

The usermod command modifies the settings of an existing user account.

The -a flag means append and the -G flag names the supplementary group to add. Together, -aG adds the user to the group without removing them from any groups they already belong to.

The group the user is being added to is called sudo. On Ubuntu, members of this group can run commands with administrative privileges, but they must prefix each such command with _sudo_.

The username being modified is odhiambo.
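To confirm that a change like this took effect, we can list a user's groups with id. The helper below is a small sketch and not part of the original steps; the function name user_in_group is made up for this example, and it assumes group names contain no spaces.

```shell
# Succeed if the user named in $1 belongs to the group named in $2.
user_in_group() {
    # id -nG prints the user's group names separated by spaces;
    # tr puts one name per line so grep can match a whole line exactly.
    id -nG "$1" 2>/dev/null | tr ' ' '\n' | grep -qxF -- "$2"
}

# Shown with the current user so it runs on any machine.
# On the server you would call: user_in_group odhiambo sudo
me="$(id -un)"
if user_in_group "$me" sudo; then
    echo "$me is in the sudo group"
else
    echo "$me is not in the sudo group"
fi
```

Note that a session which was already open when the group was added keeps its old group list, so log out and back in before testing sudo as the new user.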

Logging in with our user account

We can now log in to the server as our new user.

ssh odhiambo@143.110.225.134

The terminal prompt should now show that we are logged in as odhiambo.

odhiambo@ubuntu:~$

A glimpse into the Linux file structure

For new Linux users, one of the most important things to understand is the Linux file structure. The top-most level of the folder structure typically looks as shown below:

odhiambo@ubuntu:~$ ls /
bin                home               mnt   sbin.usr-is-merged  usr
bin.usr-is-merged  lib                opt   snap                var
boot               lib64              proc  srv
cdrom              lib.usr-is-merged  root  swap.img
dev                lost+found         run   sys
etc                media              sbin  tmp
odhiambo@ubuntu:~$ 

Here we have used the ls command, which lists the files in a folder. The / argument indicates that we want to view the files in the root, i.e. _top-most level_, folder. While this file structure is not the focus of this article, I want to talk about the home folder.

Notice that our prompt when we logged in as odhiambo included a tilde ~, i.e. odhiambo@ubuntu:~$. This is shorthand showing that we are in our user's home folder, in this case odhiambo's home folder.

A user's home folder is different from the home folder at the root level. Each user account created with adduser gets a folder with its name inside the root-level home folder, e.g. /home/odhiambo. So the ~ in the prompt odhiambo@ubuntu:~$ stands for the path /home/odhiambo.
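The tilde shorthand also works in commands, not only in the prompt. The lines below are a small sketch, assuming a POSIX shell with the HOME variable set; the orders folder name is only an example path and does not have to exist.

```shell
# The shell replaces an unquoted ~ with the current user's home folder.
home_from_tilde=~
home_from_var="$HOME"

echo "Tilde expands to: $home_from_tilde"
echo "HOME is set to:   $home_from_var"

# ~ can start a longer path too: this prints the path of a folder
# called orders inside the home folder.
echo ~/orders
```

Logged in as odhiambo, the first two lines would both show /home/odhiambo.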

Handling files

A data engineer works a lot with files on the terminal. This includes editing configuration files and creating scripts from the terminal. Knowing the basics of file handling in a Linux server environment is important.

1. Creating files

The most basic way to create a file is using the touch command with the file name.

touch orders.txt

This creates an empty file called orders.txt. If the file already exists, touch leaves its contents alone and only updates its timestamp.

touch orders.txt Manual.md suppliers.txt

This creates several files at once, one for each name given.
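The two touch commands can be tried safely in a scratch folder. This is a minimal sketch; the workdir variable and the sample line written to the file are made up for the example.

```shell
workdir="$(mktemp -d)"       # scratch folder so no real files are affected
cd "$workdir" || exit 1

touch orders.txt Manual.md suppliers.txt
ls                           # lists the three new, empty files

echo "order 1001" > orders.txt
touch orders.txt             # the file exists, so only its timestamp changes
cat orders.txt               # prints: order 1001
```

Running touch on an existing file is therefore safe: it does not empty or overwrite the file.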

2. Deleting files

Use rm to remove a file. The deletion is permanent: there is no recycle bin to restore the file from.

rm orders.txt

3. Listing files

We have already used this command above. The ls command lists files and folders.

ls

With no arguments, ls lists the files and folders in the current folder, leaving out hidden ones. Many terminals show files and folders in different colors. Files may also have an extension.

ls -a

This lists all files and folders, including hidden ones. Hidden files and folders are those whose names begin with a dot.
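The difference between the two commands is easy to see in a scratch folder. In this sketch the file name .env is only a hypothetical example of a hidden file.

```shell
workdir="$(mktemp -d)"       # scratch folder for the demonstration
cd "$workdir" || exit 1
touch orders.txt .env        # .env starts with a dot, so it is hidden

visible="$(ls)"              # plain ls skips names that start with a dot
with_hidden="$(ls -a)"       # ls -a includes them

echo "ls shows:"
echo "$visible"
echo "ls -a shows:"
echo "$with_hidden"
```

The ls -a output also contains the entries . and .., which stand for the current folder and its parent folder.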

4. Moving files

Moving a file does not leave a copy behind. We use the mv command followed by the source file and then the destination path.

mv orders.txt completed/

This moves orders.txt into a folder called completed inside the current path. Because of the trailing slash, the command fails if the folder does not exist instead of renaming the file to completed.
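One way to avoid that failure is to create the folder first. The sketch below runs in a scratch folder and uses mkdir -p, a command not covered in this article, which creates a folder and does nothing if it already exists.

```shell
workdir="$(mktemp -d)"       # scratch folder for the demonstration
cd "$workdir" || exit 1
touch orders.txt

# The folder is missing, so this move fails and the file stays put.
mv orders.txt completed/ 2>/dev/null || echo "move failed: completed/ does not exist"

mkdir -p completed           # create the destination folder
mv orders.txt completed/     # now the move succeeds
ls completed                 # prints: orders.txt
```

After the second mv, orders.txt exists only inside completed.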

5. Copying files

The cp command is used to copy files.

cp orders.txt completed/

This copies orders.txt into a folder called completed in the current path and leaves the original in place. If the folder does not exist, the command fails.
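To check that a copy really matches the original, the two files can be compared. This is a small sketch in a scratch folder; it uses the cmp command, which is not covered in this article, and the file content is made up for the example.

```shell
workdir="$(mktemp -d)"       # scratch folder for the demonstration
cd "$workdir" || exit 1
mkdir completed
echo "order 1001" > orders.txt

cp orders.txt completed/

# cmp prints nothing and succeeds when the two files are identical.
cmp orders.txt completed/orders.txt && echo "copy matches the original"
```

Unlike mv, both orders.txt and completed/orders.txt exist afterwards.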

scp orders.txt odhiambo@143.110.225.134:/home/odhiambo/orders/

This copies orders.txt from our local machine into the orders folder on the server, at the path shown. Note: You will be prompted for the user's password on the server, unless key-based authentication is set up. The scp command copies files securely over the network using SSH.

File permissions

The last thing we will look at is a brief overview of file permissions in Linux. Use the ls -l command to view file permissions.

odhiambo@ubuntu:~/$ ls -l
total 0
-rw-rw-r-- 1 odhiambo odhiambo 0 Mar 27 13:04 orders.txt
odhiambo@ubuntu:~/$ 

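Permissions in Linux are read, write and execute, with the numeric values 4, 2 and 1 respectively. A value of 7 therefore means all three permissions are granted (4+2+1), while 6 means read and write only (4+2).

In addition, the ls -l command lists permissions as three groups of three characters, for the user (owner), the group and others, in that order. r, w and x correspond to the read, write and execute permissions, and a dash means the permission is not granted. In the listing above, -rw-rw-r-- means the owner and the group can read and write orders.txt, while everyone else can only read it. The leading dash marks a regular file. In numbers, this is 664.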
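The arithmetic can be written out as a short script. This is an illustrative sketch only: the function names triplet_value and mode_digits are made up for this example, and the input is the nine permission characters from ls -l without the leading file-type character.

```shell
# Convert one rwx group (e.g. "rw-") to its numeric value.
triplet_value() {
    value=0
    case "$1" in r??) value=$((value + 4)) ;; esac   # read    = 4
    case "$1" in ?w?) value=$((value + 2)) ;; esac   # write   = 2
    case "$1" in ??x) value=$((value + 1)) ;; esac   # execute = 1
    echo "$value"
}

# Convert nine permission characters to three digits: user, group, other.
mode_digits() {
    user_part="$(printf '%s' "$1" | cut -c1-3)"
    group_part="$(printf '%s' "$1" | cut -c4-6)"
    other_part="$(printf '%s' "$1" | cut -c7-9)"
    echo "$(triplet_value "$user_part")$(triplet_value "$group_part")$(triplet_value "$other_part")"
}

mode_digits "rw-rw-r--"      # prints 664
mode_digits "rwxr-xr-x"      # prints 755
```

The first call uses the permissions of orders.txt from the listing above.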

The image below shows a summary of this.

Permissions in Linux

photo from bytebytego.com
cover photo by pressfoto on freepik
