DEV Community

ClintWithK


Getting Started With Linux For Data Engineering

Many beginners find Linux intimidating, especially because most operations are done through the terminal. This fear is completely normal, but the good news is that once you understand the basics, Linux becomes simple, powerful, and even enjoyable to use.

Linux comes in different flavors known as distributions (distros). Some common ones include:

  • Ubuntu
  • Arch Linux
  • Fedora
  • Parrot OS
  • Red Hat

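Although these distros may look different, the core commands you’ll use daily work the same way across them. The main difference you will notice is the package manager: apt, used in Step 2, belongs to Debian-based distros such as Ubuntu.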

In this article, we’ll focus on a minimal but critical set of Linux commands that every beginner, and especially every aspiring Data Engineer, should know.

We’ll be working specifically with Ubuntu Server, as it is widely used in data engineering, cloud environments, and production systems.


What We’ll Cover

By the end of this article, you’ll be comfortable with:

  • Updating and upgrading the system
  • Navigating directories (folders)
  • Listing directory contents
  • Reading and editing files using the terminal
  • Copying and moving files
  • Logging into servers using SSH
  • Transferring files using SCP and SFTP

Let’s explore these basics together and move you from beginner toward intermediate.


Step 1: Identify Your Linux System

Run the command below:

uname -a

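This command prints a single line that includes:

  • The kernel name and the machine’s hostname
  • The Linux kernel release and version
  • The system architecture (for example x86_64)

It’s a quick way to confirm what system you’re working on, especially when logged into remote servers. Note that it does not name the distribution; on most distros that is recorded in the file /etc/os-release.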
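To make the output easier to read, the same information can be pulled out one field at a time. The sketch below is only an illustration: the variable names are made up, and it assumes /etc/os-release exists, which is true on Ubuntu and most modern distros but is checked before use.

```shell
# Break the `uname -a` line into individual fields.
kernel_name=$(uname -s)      # kernel name, e.g. Linux
kernel_release=$(uname -r)   # kernel release string
machine_arch=$(uname -m)     # hardware architecture, e.g. x86_64

echo "Kernel:       $kernel_name $kernel_release"
echo "Architecture: $machine_arch"

# uname does not report the distribution. Most modern distros,
# including Ubuntu, describe themselves in /etc/os-release.
if [ -r /etc/os-release ]; then
    distro_name=$(. /etc/os-release && echo "$PRETTY_NAME")
else
    distro_name="unknown (no /etc/os-release found)"
fi
echo "Distribution: $distro_name"
```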


Step 2: Update and Upgrade the System

Keeping your system updated is one of the most important habits in Linux.

Update Package Lists

sudo apt update

This command refreshes the local list of available packages and their versions.
It does not install or upgrade anything by itself; it makes sure the next install or upgrade sees the newest versions.

Upgrade Installed Packages

sudo apt upgrade

This upgrades every installed package that has a newer version available in the configured repositories.
Think of this as patching the OS.
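Between the two commands, it can be useful to preview which packages would change before committing to the upgrade. The sketch below uses apt list --upgradable, a command the article does not cover, and guards it so that it does nothing harmful on a system without apt. The apt_status variable is made up for illustration.

```shell
# Preview pending upgrades without changing anything on the system.
if command -v apt >/dev/null 2>&1; then
    apt_status="available"
    # Lists installed packages that have a newer version in the package lists.
    # Run `sudo apt update` first so the lists are fresh.
    apt list --upgradable 2>/dev/null || true
else
    apt_status="missing"
    echo "apt not found: this system is probably not Debian or Ubuntu based"
fi
```

If the list looks reasonable, sudo apt upgrade applies it.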


Step 3: Navigating Directories

Linux uses the word directory for what graphical desktops call a folder; the two terms mean the same thing.

Check Your Current Directory

pwd

List Files and Directories

ls

To see more details:

ls -l

To include hidden files:

ls -a

Move Between Directories

cd directory_name

Go back one level:

cd ..

Go to your home directory:

cd ~
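The navigation commands above are easiest to learn by trying them in a directory you can safely throw away. Here is a minimal practice session. The directory names are made up, and mktemp, mkdir -p and touch are extra commands used only to set up the playground.

```shell
# Create a throwaway directory tree to practise in.
practice_dir=$(mktemp -d)
mkdir -p "$practice_dir/projects/etl/raw"
touch "$practice_dir/projects/etl/.env"   # hidden file: the name starts with a dot

cd "$practice_dir/projects/etl"
pwd          # prints the absolute path of the current directory
ls           # shows raw, but not .env
ls -a        # also shows .env, plus . (here) and .. (parent)

cd raw       # relative path: one level down
cd ..        # back up to etl
current_dir=$(pwd)

cd "$practice_dir"   # an absolute path works from anywhere
```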

Step 4: Reading Files from the Terminal

View File Contents

cat filename.txt

For long files, use less, which shows one screen at a time (press q to quit):

less filename.txt
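With data files you often only need a quick look at the start or end rather than the whole thing. The sketch below builds a small made-up CSV and inspects it with head, tail and wc, three standard commands that complement cat and less but are not covered in this article.

```shell
# Build a tiny sample CSV (made-up data, for illustration only).
sample_file=$(mktemp)
printf 'id,name,amount\n1,alice,250\n2,bob,120\n3,carol,90\n' > "$sample_file"

cat "$sample_file"                            # print the whole file

header=$(head -n 1 "$sample_file")            # first line only
last_row=$(tail -n 1 "$sample_file")          # last line only
line_count=$(( $(wc -l < "$sample_file") ))   # number of lines in the file

echo "Header:     $header"
echo "Last row:   $last_row"
echo "Line count: $line_count"
```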

Step 5: Editing Files Using Nano and Vi

Using Nano (Beginner-Friendly)

nano filename.txt

Basic commands:

  • Type to edit the text directly
  • Press Ctrl + O, then Enter, to save
  • Press Ctrl + X to exit

Using Vi / Vim (Advanced but Powerful)

vi filename.txt

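Basic commands (vi opens in normal mode, where keys are commands rather than text):

  • Press i → Insert mode, where you can type text
  • Press Esc → Exit insert mode
  • Type :wq and press Enter → Save and quit
  • Type :q! and press Enter → Exit without saving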


Step 6: Copying and Moving Files

Copy Files

cp file1.txt /path/to/destination/

Copy a directory and everything inside it (-r means recursive):

cp -r folder1 /path/to/destination/

Move or Rename Files

mv oldname.txt newname.txt

Move files:

mv file.txt /new/location/
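To see how copying, moving and renaming differ, here is a short practice run in a temporary directory. All file and directory names are made up for illustration.

```shell
# Work in a throwaway directory so nothing important is touched.
work_dir=$(mktemp -d)
cd "$work_dir"

mkdir raw archive
echo "sample data" > raw/sales.csv

cp raw/sales.csv raw/sales_backup.csv              # copy: the original stays in place
cp -r raw raw_snapshot                             # copy the directory and its contents
mv raw/sales_backup.csv archive/                   # move into another directory
mv archive/sales_backup.csv archive/sales_old.csv  # rename in place

ls -R    # list everything recursively to check the result
```

Both cp and mv overwrite an existing destination file without asking. Adding -i makes them prompt before overwriting.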

Step 7: Logging into a Server Using SSH

SSH allows you to securely access remote servers.

ssh username@server_ip

Example:

ssh ubuntu@192.168.1.10

SSH is heavily used in:

  • Cloud platforms

  • Data pipelines

  • Production servers
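On real servers the ssh command usually carries a few options. The sketch below assembles such a command from variables and prints it instead of running it, so it is safe to try without a server. The user and address reuse the example above; the key path and port are assumptions to replace with your own values.

```shell
# Hypothetical connection details; replace with your own.
remote_user="ubuntu"
remote_host="192.168.1.10"          # example address used above
ssh_port=22                         # 22 is the default SSH port
key_file="$HOME/.ssh/id_ed25519"    # assumed location of your private key

# -i selects the private key and -p the port. A command placed after the
# host runs once on the server and returns, instead of opening a shell.
ssh_command="ssh -i $key_file -p $ssh_port $remote_user@$remote_host uname -a"

# Printed rather than executed.
echo "$ssh_command"
```

Copying the printed line into a terminal with real values would run uname -a on the server and show the result locally.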


Step 8: File Transfer Using SCP and SFTP

Copy Files Using SCP

scp file.txt username@server_ip:/remote/path/

Copy a file from the server to the current local directory (the trailing . means "here"):

scp username@server_ip:/remote/file.txt .

Using SFTP

sftp username@server_ip

Common SFTP commands:

  • ls
  • get filename
  • put filename
  • exit
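The SFTP commands above are typed interactively, but the same commands can be saved in a file and replayed with sftp -b, which is handy for repeatable transfers. This is a sketch under assumptions: report.csv is a made-up file name, username@server_ip is the placeholder used above, and the final command is printed rather than executed so the example works without a server.

```shell
# Write the SFTP commands to a batch file, one per line.
batch_file=$(mktemp)
cat > "$batch_file" <<'EOF'
cd /remote/path
put file.txt
get report.csv
bye
EOF

# -b reads commands from the file instead of the keyboard.
# Batch mode cannot ask for a password, so it needs key-based login.
sftp_command="sftp -b $batch_file username@server_ip"

echo "$sftp_command"   # printed, not executed
cat "$batch_file"      # show the commands that would run
```

In batch mode sftp stops at the first command that fails, which makes a broken transfer easy to notice.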

Why Linux Matters for Data Engineers

As a Data Engineer, Linux is unavoidable:

  • Most data systems run on Linux
  • Cloud servers are Linux-based
  • Automation and pipelines rely on terminal commands

Mastering these basics gives you:

  • Confidence
  • Speed
  • Control over systems

Conclusion

Linux might feel overwhelming at first, but you don’t need to know everything at once. Start with the basics, practice daily, and build confidence gradually.


Happy Coding!!!
