
Alvin Obonyo


Why is Linux important to Data Engineers?

Linux is an open-source operating system kernel. A Linux distribution packages that kernel with tools and applications into a complete operating system, which makes it easier for programmers to use. There are many distributions, such as Ubuntu (ubuntu.com), Parrot (parrotsec.org) and Kali Linux, the last two aimed at security work. Different programmers and developers prefer different distributions.
Although the distributions differ, they share most of the same core commands, so skills learned on one carry over easily to the others.

Why is Linux important to Data Engineers?

  • It makes a data engineer's day-to-day work easier, for example when working with Docker or SQL, and the file system is easy to navigate from the command line.
  • It's open source and cost-effective: most distributions, and many of the tools that run on them, are free to use, unlike some other operating systems.
  • It performs well and is efficient when handling massive volumes of data in big data workloads.
  • It's flexible and easy to automate with scripting, e.g., Bash, so a data engineer can script data movement and integrate it with ETL tools.
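To make the scripting point concrete, here is a minimal sketch of an extract-transform-load step written as a shell script. The file names, columns and sample rows are made up for illustration, and a real pipeline would load the result into a database or hand it to an ETL tool rather than just printing it.

```shell
# Minimal ETL sketch. All file names and data below are hypothetical.
set -eu

workdir="$(mktemp -d)"   # scratch directory for this demo

# Extract: write a small sample CSV (stands in for a real data export).
printf 'id,city,amount\n1,Nairobi,250\n2,Mombasa,100\n3,Nairobi,50\n' > "$workdir/sales.csv"

# Transform: skip the header row and sum the amount column per city.
awk -F',' 'NR > 1 { total[$2] += $3 } END { for (c in total) print c "," total[c] }' \
    "$workdir/sales.csv" | sort > "$workdir/sales_by_city.csv"

# Load: print the result; a real job would pass this file to the next stage.
cat "$workdir/sales_by_city.csv"
```

A script like this can be saved to a file and scheduled, which is where the automation benefit comes from.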

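Practical usage of Vi, e.g., creating and editing files
Vi is a powerful editor with a steep learning curve, and it is available on almost every Linux system.
To open a file in vi, type vi followed by the file name, e.g., vi notes.txt. If the file does not exist yet, vi creates it when you save.
To start typing, press i to enter insert mode; press Esc to return to command mode.

To quit vi, type :q and press Enter (use :q! to quit without saving your changes).
To save a file, type :w and press Enter. To save and quit in one step, type ZZ (uppercase) or :wq.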

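Linux commands are also used alongside Git and GitHub in the same terminal, for example:
mkdir - a shell command that creates directories, e.g., for a new project
git add - stages new or changed files for the next commit
git commit and git push - commit records the staged changes locally, and push uploads those commits to the GitHub repository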
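The commands above can be sketched as one short session. The project name, file, commit message and author details are hypothetical, and the push is left commented out because it needs a remote (assumed here to be called origin, with a branch named main) that this demo does not create.

```shell
# Hypothetical project and file names; runs in a throwaway directory.
set -eu

repo="$(mktemp -d)/demo-project"
mkdir -p "$repo/data"              # mkdir creates the project directories
cd "$repo"

git init -q                        # turn the directory into a Git repository
echo "id,amount" > data/sample.csv

git add data/sample.csv            # stage the new file for the next commit
# Author details are passed inline so the demo works without global Git config.
git -c user.name="Demo User" -c user.email="demo@example.com" \
    commit -q -m "Add sample data" # record the staged change locally

# git push origin main             # would upload the commit to GitHub (remote and branch are assumptions)
```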

Basic Linux commands Data Engineers should be aware of:

  • pwd - Shows current directory
  • ls - Lists files and folders
  • cd - Changes directory
  • cd .. - Moves up to the parent directory
  • touch - Creates an empty file
  • mv - Moves or renames files
  • rm - Deletes files
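As a quick way to try the commands in this list, the sketch below walks through them in a temporary directory so nothing important is touched. The directory and file names are invented for the example, and mkdir is used once to have a folder to move into.

```shell
# Practice run in a throwaway directory; names are hypothetical.
set -eu

sandbox="$(mktemp -d)"
cd "$sandbox"

pwd                       # show the current directory
mkdir reports             # create a folder to work in
cd reports                # change into it
touch draft.txt           # create an empty file
ls                        # list files: shows draft.txt
mv draft.txt final.txt    # rename the file
cd ..                     # move back up to the parent directory
rm reports/final.txt      # delete the file (rm does not use a recycle bin)
ls reports                # the folder is now empty
```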

Conclusion
It's easy for a newbie to get started in Data Engineering. Passion, dedication, an open mind and a willingness to work through problems are all a learner needs.
