Victor Kamau

My Journey at LuxDevHQ: Version Control (The Basics)

Pictorial Representation of Version Control

Version Control with Git

TL;DR

This post is a guide to the basics of version control in software development and collaborative projects.

To skip ahead, see the Installation section to install and configure a version control system (VCS), or the GitHub section to connect with an online VCS repository service.

Introduction

In a previous post, we set up a Windows virtual machine using VirtualBox. This left us ready to begin working on data science projects.

However, data science, like most tech fields, boils down to code and files. We must therefore consider a number of questions about them, including:

  1. How do we store them?
  2. How do we track changes made?
  3. How can we access them across different machines?
  4. How can we share them with collaborators?

This is where version control comes to save the day.

Version Control / Source Control

Version control is defined as "the practice of tracking and managing changes to software code" [1]. Essentially, it is a mechanism for keeping track of changes made to computer files, particularly source code files, with the ability to view old versions and optionally revert a file to a preferred previous version.

It has been around in some form for as long as people have collaborated on computers. Early attempts had collaborators sharing files and projects through flash drives, emails and shared folders. Most of these approaches were clunky, with no built-in way of detecting who changed which files and folders, or when. Teams ended up inventing strange conventions, e.g. folders named project, project1, project-latest etc. Accidental overwrites were also common: users often lost their work when others working on the same files and folders posted their changes to the common storage area after them.

Ultimately, all these problems led to the rise of Version Control Systems (VCS). These are software tools used mostly in, but not limited to, software development and collaborative projects to automatically track and manage changes to source code files.

Benefits of such systems include:

  • Automated functionality to record and track every update to the codebase.
  • Enhanced collaboration without fear of accidental permanent overwrites.
  • Easy reversion to previous versions of specific files and folders (or the project as a whole).
  • Ingrained, structured history which simplifies auditing of the project's progression.
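
As a preview of the last two benefits, here is a minimal sketch using Git (the tool this article focuses on, installed in a later section). The file name notes.txt, the commit messages and the "Jane Doe" identity are hypothetical, and the throwaway HOME and temporary folder are assumptions made so the sketch does not touch any real settings or projects.

```shell
# Throwaway HOME so the sketch does not modify your real Git settings (assumption).
export HOME="$(mktemp -d)"
git config --global user.name "Jane Doe"            # hypothetical identity
git config --global user.email "jane@example.com"   # hypothetical identity

# Work in a temporary repository.
repo="$(mktemp -d)"
cd "$repo"
git init --quiet

# Record two versions of the same file.
echo "first draft" > notes.txt
git add notes.txt
git commit --quiet -m "Add notes"

echo "second draft with more detail" > notes.txt
git commit --quiet -am "Update notes"

# Structured history: one subject line per recorded change, newest first.
history="$(git log --format=%s)"
echo "$history"

# Revert the file to the version stored in the previous commit.
git checkout --quiet HEAD~1 -- notes.txt
restored="$(cat notes.txt)"
echo "$restored"
```

Nothing is lost by the revert: the newer version remains in the recorded history and can be brought back the same way.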

A number of popular VCS tools are used by individuals, teams and enterprises globally, including Mercurial, Subversion, Fossil and CVS. The most popular one, however, is Git, which is the focus of this article and the VCS in use at LuxDevHQ.

Git

Git is a lightning-fast, free and open source distributed version control system designed to handle everything from small to very large projects with speed and efficiency [2].

It was originally developed by Linus Torvalds, the creator of Linux, as the version control system for the development of the Linux kernel. It has since grown to be used, by some survey estimates, by over 93% of developers worldwide. It can be installed on most conventional operating systems as well as in environments such as Docker.

Installation

To install Git on your machine, head over to the Install page on the Git website for guidance on your specific OS.

For Windows Users

Download the latest version of Git for Windows from the Git website. It includes Git Bash, a command line interface (CLI) within which one can run Git commands.

Git Install for Windows

Upon completion of the download, double click on the file to trigger installation. Click Yes when prompted for User Account Control (UAC) permission. This will open the Git Setup Wizard.

Most of the default configurations should suffice, so you can click "Next" throughout the process. (Should you encounter a configuration that you wish to change, feel free to change it.) Once done, the installation will run for a few seconds. Upon completion, check the Launch Git Bash option, uncheck View Release Notes, then click Finish. This should open a window similar to the one below.

Git Bash Window

For macOS Users

The easiest way to install Git on macOS is using Homebrew, a tool that simplifies installation of software tools on macOS and Linux. If you don't yet have Homebrew on your computer, download and install it first.

Once Homebrew is ready, run brew install git in Terminal (macOS' CLI). This will install Git on macOS and have it ready for use.

For Linux Users

A number of Linux distributions come with Git already installed. However, if yours doesn't (run git --version in the Terminal to confirm), follow the installation instructions for your particular distribution on the Git website.

Setup

Before continuing, confirm that Git is properly installed and functional by running:
    git --version

If the output is similar to the following, then you are ready to begin setting up Git.

Git Version Output
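
The same check can be scripted. The sketch below is a minimal example that relies only on the standard command -v shell builtin; the variable name git_status is hypothetical.

```shell
# Check whether the git executable is on the PATH before continuing.
if command -v git >/dev/null 2>&1; then
    git_status="installed: $(git --version)"
else
    git_status="missing"
fi
echo "Git is $git_status"
```

If this reports that Git is missing, revisit the installation steps for your OS before moving on.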

First off, there needs to be a global user configured to use Git (this can be overridden within individual repositories). To do this, run the following commands:

  • User's name: git config --global user.name "<Your Name>"
  • User's email address: git config --global user.email "<your.email@emaildomain>"

NB: Replace <Your Name> and <your.email@emaildomain> with actual name and email address, respectively.

Once done, run git config --list to confirm that the configurations have been applied correctly.
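
To see how the global user and a per-repository override interact, here is a minimal sketch. The names and email addresses are hypothetical, and the throwaway HOME is an assumption made so the example does not modify your real Git settings.

```shell
# Throwaway HOME so the sketch does not touch your real ~/.gitconfig (assumption).
export HOME="$(mktemp -d)"

# Global identity, used by every repository unless overridden.
git config --global user.name "Jane Doe"            # hypothetical name
git config --global user.email "jane@example.com"   # hypothetical email

# Read single values back instead of scanning the full --list output.
configured_name="$(git config --global user.name)"
configured_email="$(git config --global user.email)"
echo "$configured_name <$configured_email>"

# Override the email inside one repository only.
repo="$(mktemp -d)"
git -C "$repo" init --quiet
git -C "$repo" config user.email "jane@work.example"   # hypothetical work email

# Inside that repository, the local value takes precedence over the global one.
effective_email="$(git -C "$repo" config user.email)"
echo "$effective_email"
```

The global value is left unchanged by the override, so other repositories keep using it.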

Repository Setup

Now, let's move on to creating and configuring a Git repository. A repository is a central location in which data is stored and managed. Within the context of a software project, it can be thought of as the root folder containing the source files and code.

First, we shall create our project folder:

  1. Create a folder within the CLI
    mkdir -p test-project

    NB: -p instructs the CLI to create any intermediate parent folder that may be missing.

  2. Navigate into the new folder
    cd test-project

  3. Create a sample README
    touch README.md

  4. Populate the README with some informative message
    echo "This is a test project showing how to set up a Git repository" >> README.md

    NB: You can confirm that the content has been written into the README file using the following command
    cat README.md

  5. Since this is a data science project, let's create a sample Python script that simply prints a greeting to the CLI
    touch test.py && echo "print('Hello, World')" >> test.py

    NB: You can combine multiple commands using the && operator.

  6. Let's now list the content of the current folder (i.e. test-project) to confirm that all our files are in place
    ls .

    If you see both of our newly created files (i.e. README.md and test.py), then you are good to continue.

    NB: . in the command ls . represents our current working folder.

NB: You can find a cheat sheet of common Linux commands that can run in the CLI here.
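
The folder setup steps above can be sketched as a single script. Running it inside a temporary directory is an assumption of this sketch, made so that repeated runs start from an empty folder (the >> operator appends, so rerunning the steps in an existing folder would duplicate the lines).

```shell
# Start from a fresh temporary directory (assumption of this sketch).
workdir="$(mktemp -d)"
cd "$workdir"

# Steps 1 and 2: create the project folder and move into it.
mkdir -p test-project
cd test-project

# Steps 3 and 4: create the README and add a message to it.
touch README.md
echo "This is a test project showing how to set up a Git repository" >> README.md

# Step 5: create the sample Python script.
touch test.py && echo "print('Hello, World')" >> test.py

# Step 6: list the folder content to confirm both files exist.
file_list="$(ls .)"
echo "$file_list"
```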

Now, let's turn our folder into a Git repository:

  1. Initialize the repository: git init
  2. Stage the created files in preparation for committing: git add .
  3. Commit the staged files with a short descriptive message: git commit -m "Initial Commit"

The repository is now ready to be connected to an online Git repository service.
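
To confirm that the commit was recorded, the three Git steps can be run end to end and then checked. This is a minimal sketch: the throwaway HOME, the temporary folder and the "Jane Doe" identity are assumptions made so the example is self-contained, and the variable names are hypothetical.

```shell
# Throwaway HOME and identity so the sketch runs without touching real settings (assumption).
export HOME="$(mktemp -d)"
git config --global user.name "Jane Doe"            # hypothetical identity
git config --global user.email "jane@example.com"   # hypothetical identity

# Recreate the sample project in a temporary location.
repo="$(mktemp -d)/test-project"
mkdir -p "$repo" && cd "$repo"
echo "This is a test project showing how to set up a Git repository" > README.md
echo "print('Hello, World')" > test.py

# The three steps: initialize, stage, commit.
git init --quiet
git add .
git commit --quiet -m "Initial Commit"

# Checks: one commit exists, both files are tracked, nothing is left uncommitted.
commit_count="$(git rev-list --count HEAD)"
tracked_files="$(git ls-files)"
pending_changes="$(git status --porcelain)"
echo "commits: $commit_count"
echo "$tracked_files"
```

An empty pending_changes value means the working folder matches the last commit exactly.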

GitHub

Git, while being a distributed VCS, is largely a tool running on the user's local machine. As such, particularly in collaborative projects, there needs to be a mechanism through which a Git repository can be accessed and used by different team members remotely.
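
The mechanism can be illustrated without any online service. In the minimal sketch below, a bare repository in a local folder stands in for the hosted copy; this stand-in, the folder names (shared.git, collaborator-copy) and the "Jane Doe" identity are all assumptions for illustration, not how a hosting platform is actually set up.

```shell
# Throwaway HOME and identity so the sketch is self-contained (assumption).
export HOME="$(mktemp -d)"
git config --global user.name "Jane Doe"            # hypothetical identity
git config --global user.email "jane@example.com"   # hypothetical identity

sandbox="$(mktemp -d)"
cd "$sandbox"

# A bare repository stores history only, with no working files.
# Here it plays the role of the shared, remotely accessible copy.
git init --quiet --bare shared.git

# First collaborator: create a project and publish it to the shared copy.
mkdir test-project
cd test-project
git init --quiet
echo "This is a test project showing how to set up a Git repository" > README.md
git add .
git commit --quiet -m "Initial Commit"
git remote add origin "$sandbox/shared.git"
git push --quiet origin HEAD

# Second collaborator: obtain a full copy, history included, from the shared one.
cd "$sandbox"
git clone --quiet shared.git collaborator-copy
cloned_readme="$(cat collaborator-copy/README.md)"
echo "$cloned_readme"
```

Pushing HEAD publishes whichever branch is currently checked out, so the sketch does not need to assume a particular default branch name.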

Fortunately, there exist a number of services that provide such functionality. They are typically web-based platforms that host Git repositories, providing a centralized location for storing code, tracking changes, and enabling collaboration among developers. Furthermore, these platforms offer additional functionality including:

  • Continuous Integration and Continuous Deployment (CI/CD) automation
  • Access Control and Security on repositories
  • Collaboration Tools such as issue tracking, pull requests etc.
  • Integrations with third-party apps and services among others.

Many of them are free to sign up for and use, although charges typically apply for certain features. Examples include GitHub, GitLab and Bitbucket.

Of these services, the most widely used is GitHub. It was released in February 2008 and acquired by Microsoft in October 2018 [3]. It has over a billion repositories and is what will be used for both this article and the entirety of the course.

Account Creation

To get started with GitHub, you need a registered account, which you can create by following the steps below:

  1. Visit GitHub's website and click the Sign up button in the top right corner. This will redirect you to the Account Creation page.

GitHub Account Creation Page

Fill in the form with your details to proceed. Choose a distinctive username that you can remember. Alternatively, you can sign up using a Google or Apple account for a simpler but still secure process.


  1. Atlassian. (2026, Jan 16). What is version control: Atlassian Git Tutorial https://www.atlassian.com/git/tutorials/what-is-version-control 

  2. Git. (2026, Jan 18). Git https://git-scm.com 

  3. Git and GitHub Use, Collaboration, and Workflow. (2026, Jan 18). History of GitHub https://pslmodels.github.io/Git-Tutorial/content/background/GitHubHistory.html 