Whether you're a data engineer building pipelines or a data scientist working on models, keeping track of your code changes is crucial. That's where Git and GitHub come in. In this guide, I'll walk you through setting up Git and mastering the basics of version control.
1. Installing Git Bash
Windows Users
1. Visit
2. Download the Windows installer
3. Run the installer with these recommended settings:
- Select "Use Git from Git Bash only"
- Choose "Checkout Windows-style, commit Unix-style line endings"
- Use MinTTY as the terminal emulator
- Enable file system caching
Mac Users (requires Homebrew):
brew install git
Linux Users:
sudo apt-get install git # Debian/Ubuntu
sudo yum install git # CentOS/RHEL (recent Fedora releases use dnf instead of yum)
Verify installation by opening Git Bash/Terminal and typing:
git --version
2. Connecting Git to Your GitHub Account
Step 1: Configure Your Identity
git config --global user.name "Your Name"
git config --global user.email "your.email@example.com"
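To confirm the identity was saved, you can read the values back. Here is a minimal sketch, not the only way to do it. The `demo_git` helper and the temporary HOME directory are my own assumptions, there purely so the demo never touches your real ~/.gitconfig. The name and email are the same placeholders used above.

```shell
# Assumption: a throwaway HOME keeps this demo away from your real ~/.gitconfig.
demo_home="$(mktemp -d)"

# Hypothetical helper: run git with HOME pointed at the throwaway directory.
demo_git() {
  env HOME="$demo_home" XDG_CONFIG_HOME="$demo_home/.config" git "$@"
}

# Same two commands as above, written into the throwaway config.
demo_git config --global user.name "Your Name"
demo_git config --global user.email "your.email@example.com"

# Read the values back exactly as Git stored them.
configured_name="$(demo_git config --global --get user.name)"
configured_email="$(demo_git config --global --get user.email)"
echo "$configured_name <$configured_email>"
```

On your own machine you would skip the helper and run `git config --global --get user.name` and `git config --global --get user.email` directly.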
Step 2: Generate SSH Key (Secure Connection)
ssh-keygen -t rsa -b 4096 -C "your.email@example.com"
Press Enter to accept the default file location, then create a passphrase when prompted.
Step 3: Add SSH Key to GitHub
- View your public key:
cat ~/.ssh/id_rsa.pub
- Copy the entire output
- Go to GitHub → Settings → SSH and GPG keys → New SSH key
- Paste your key and save
Step 4: Test Connection
ssh -T git@github.com
You should see: "Hi username! You've successfully authenticated, but GitHub does not provide shell access."
5. Understanding Version Control: The What & Why
What is Version Control?
Think of it as a time machine for your code. Every change is saved, so you can:
- Track who made what changes
- Revert to previous versions
- Work on features without breaking your main code
- Collaborate without overwriting others' work
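To make the time machine idea concrete, here is a small sketch you can run in a scratch directory. It assumes nothing about your own projects: the file name `pipeline.py`, the branch name `feature/clean`, and the "Demo User" identity are all made up for illustration. It commits a file, changes it on a branch, merges that change, then undoes it with a revert.

```shell
# Throwaway repository so nothing here touches a real project.
repo="$(mktemp -d)"
cd "$repo"
git init -q
# Placeholder identity, set for this repository only.
git config user.name "Demo User"
git config user.email "demo@example.com"
# Remember the default branch name (it may be "main" or "master").
main_branch="$(git symbolic-ref --short HEAD)"

echo "rows = load()" > pipeline.py
git add pipeline.py
git commit -q -m "Add load step"

# Work on a feature without touching the main line of code.
git checkout -q -b feature/clean
echo "rows = clean(rows)" >> pipeline.py
git commit -q -am "Add clean step"

# Back on the main branch, the file is still in its original state.
git checkout -q "$main_branch"
main_lines="$(wc -l < pipeline.py | tr -d ' ')"

# Bring the feature in, then undo it with a new commit that reverses it.
git merge -q feature/clean
merged_lines="$(wc -l < pipeline.py | tr -d ' ')"
git revert --no-edit HEAD > /dev/null
reverted_lines="$(wc -l < pipeline.py | tr -d ' ')"

# Who changed what, newest first.
git --no-pager log --format='%h %an %s'
```

Notice that the revert does not erase history. It adds a third commit that reverses the second, so the log still shows everything that happened.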
6. Why Data Professionals Need Git:
- Reproducibility: Track exactly which version of the code produced which results
- Collaboration: Multiple team members can work on the same project
- Experimentation: Try new approaches without fear of breaking working code
- Documentation: Commit messages explain why changes were made
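One way to get that reproducibility in practice is to save the commit ID next to whatever your code produces. The sketch below is just one possible convention, not a standard. The `params.txt` file, the `threshold=0.5` value, and the `results/run_info.txt` path are hypothetical names I picked for the example.

```shell
# Throwaway repository with a placeholder identity.
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"

# Hypothetical parameter file standing in for your real code.
echo "threshold=0.5" > params.txt
git add params.txt
git commit -q -m "Set threshold to 0.5"

# Capture the exact commit, and whether there are uncommitted changes.
commit_id="$(git rev-parse HEAD)"
if [ -n "$(git status --porcelain)" ]; then
  worktree_state="dirty"
else
  worktree_state="clean"
fi

# Store that provenance alongside the (hypothetical) results.
mkdir results
printf 'commit=%s\nworktree=%s\n' "$commit_id" "$worktree_state" > results/run_info.txt
cat results/run_info.txt
```

Later, `git checkout` with the recorded commit ID takes you back to the code behind those results, and `git log -1` on that commit shows the message explaining why the change was made.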