John Wakaba

Getting Started With Linux for Data Engineers (With Vi and Nano Examples)

If you're getting into data engineering, there's one skill that keeps
showing up everywhere: Linux.

Whether you're working with cloud servers, big data tools, or pipeline
automation, Linux is almost always running behind the scenes. The good
news? You don't need to be a Linux wizard to get started.

In this guide, we'll break down:

  • Why data engineers need Linux
  • Basic commands you'll actually use
  • How to edit files using Vi
  • How to edit files using Nano
  • Real-world examples

Why Should Data Engineers Learn Linux?

Here's the honest truth: most production data systems run on Linux
servers.

When you deploy Spark jobs, schedule Airflow pipelines, or manage
databases, you'll likely connect to a Linux machine.

It's Built for Performance

Linux handles heavy, long-running workloads well, which makes it a good
fit for big data processing.

It's Highly Customizable

Since Linux is open source, companies can tailor it to their
infrastructure.

It Runs the Cloud

Most AWS, Azure, and Google Cloud servers run Linux.

It Supports Automation

Data engineers constantly automate workflows using shell scripts.
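
To make that concrete, here is a minimal sketch of the kind of script you might automate: it counts the data rows in every CSV file in a folder. The folder name incoming, the sample files, and report.txt are all made up for this example.

```shell
#!/bin/bash
# Work in a throwaway directory so the example is safe to run anywhere.
cd "$(mktemp -d)" || exit 1

# Hypothetical input: two small CSV files in a folder called "incoming".
mkdir incoming
printf 'id,name\n1,alice\n2,bob\n' > incoming/users.csv
printf 'id,total\n1,9.99\n' > incoming/orders.csv

# Loop over every CSV file and record how many data rows it has
# (total lines minus the header line).
for file in incoming/*.csv; do
    lines=$(wc -l < "$file")
    rows=$((lines - 1))
    echo "$file rows=$rows"
done > report.txt

cat report.txt
```

A real pipeline would point the loop at an actual data folder, but the shape (loop, measure, write a report) stays the same.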


Linux Commands Every Beginner Should Know

Check Where You Are

pwd

List Files

ls

Move Between Folders

cd folder_name

Create a Folder

mkdir data_project

Create a File

touch notes.txt

Read a File

cat notes.txt
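
Put together, the commands above might look like this in one short session. This is only a sketch: the mktemp line and the echo that writes a line into the file are additions for the example, so the session is safe to run and cat has something to show.

```shell
# Work in a throwaway directory so nothing important is touched.
cd "$(mktemp -d)" || exit 1

mkdir data_project               # create a folder
cd data_project                  # move into it
pwd                              # confirm where you are
touch notes.txt                  # create an empty file
echo "first note" > notes.txt    # write one line into the file
ls                               # list files: shows notes.txt
cat notes.txt                    # print the file contents
cd ..                            # go back up one folder
```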

Why Text Editors Matter in Linux

When you log into a server, there's usually no graphical editor like VS
Code or Notepad.

Instead, you use terminal editors like:

  • Vi (powerful but tricky)
  • Nano (simple and beginner-friendly)


Using Vi (The Power Tool)

Open or Create a File

vi sample.txt

Enter Insert Mode

Press i and start typing.

Save and Exit

Press ESC to leave insert mode, then type the following and press Enter:

:wq

Exit Without Saving

Press ESC, then type the following and press Enter to discard your changes:

:q!

Example Script

vi pipeline.sh

Add:

#!/bin/bash
echo "Pipeline started"
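
Saving the script is only half the job, because a new file is not executable by default. Here is a minimal sketch of the usual next step. The script is written with a heredoc instead of Vi only so the example runs without opening an editor, and the temporary directory is an assumption of the example.

```shell
# Work in a throwaway directory so nothing important is touched.
cd "$(mktemp -d)" || exit 1

# Create the same two-line script without opening an editor.
cat > pipeline.sh <<'EOF'
#!/bin/bash
echo "Pipeline started"
EOF

bash pipeline.sh        # run it through bash; no execute permission needed
chmod +x pipeline.sh    # mark the file as executable
./pipeline.sh           # now it can be run directly
```

Both ways of running it print the same message. The chmod step matters once schedulers or other scripts need to call pipeline.sh directly.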

Using Nano (The Friendly Editor)

Open a File

nano notes.txt

Save Your Work

Press the following, then press Enter to confirm the file name:

CTRL + O

Exit Nano

CTRL + X

Example Config

nano config.conf

Add:

database=postgres
username=admin
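
A key=value file like this is easy for a shell script to read. As a rough sketch (one of several ways to do it), grep can pick out a line and cut can keep the part after the equals sign. The variable names and the final message are made up for the example.

```shell
# Work in a throwaway directory so nothing important is touched.
cd "$(mktemp -d)" || exit 1

# Recreate the example config file without opening an editor.
printf 'database=postgres\nusername=admin\n' > config.conf

# Find the line that starts with the key, then keep the text after "=".
database=$(grep '^database=' config.conf | cut -d '=' -f 2)
username=$(grep '^username=' config.conf | cut -d '=' -f 2)

echo "Connecting to $database as $username"
```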

Real-Life Data Engineering Scenario

You may need to:

  • Update Airflow configuration
  • Fix pipeline scripts
  • Modify database credentials
  • Check logs

Commands might include:

nano airflow.cfg

or

vi pipeline.sh
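
Checking logs usually does not need an editor at all. Here is a small sketch using tail and grep. The file name pipeline.log and its contents are hypothetical and are created here only so the commands have something to read.

```shell
# Work in a throwaway directory so nothing important is touched.
cd "$(mktemp -d)" || exit 1

# Hypothetical log file with made-up entries.
printf 'INFO start\nERROR db timeout\nINFO retry\nINFO done\n' > pipeline.log

tail -n 2 pipeline.log                         # show the last 2 lines
grep 'ERROR' pipeline.log                      # show only lines containing ERROR
error_count=$(grep -c 'ERROR' pipeline.log)    # count the matching lines
echo "errors found: $error_count"
```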

Pro Tips for Beginners

  • Always back up files before editing
  • Practice Vi commands slowly
  • Use Nano when learning
  • Learn basic shell commands daily
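
The first tip is worth turning into a habit. Here is a minimal sketch of a backup-then-edit routine. The .bak suffix is just a common convention, and the echo line stands in for an edit you would normally make in Nano or Vi.

```shell
# Work in a throwaway directory so nothing important is touched.
cd "$(mktemp -d)" || exit 1

# Stand-in for a real config file; the contents are made up.
printf 'database=postgres\n' > config.conf

cp config.conf config.conf.bak                # keep a copy before editing
echo 'username=admin' >> config.conf          # stand-in for an edit in Nano or Vi
diff config.conf.bak config.conf || true      # review the change (diff exits non-zero when files differ)
cp config.conf.bak config.conf                # restore the backup if the edit was wrong
```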

Final Thoughts

Linux is part of the foundation of modern data infrastructure.

Learning Linux commands and text editors gives you confidence when
working with production servers and cloud platforms.

Start with Nano.
Grow into Vi.
Practice consistently.
