Your hard drive dies at 2 AM. No warning. Just a clicking sound and then silence.
You had three months of scraping code on that drive. Your Scrapy spiders. Your data cleaning scripts. Your PostgreSQL pipelines. Everything you built following tutorials and blogs.
You never backed it up. Why would you? It was all on your computer. Safe. Until it wasn't.
Or maybe this happens instead. You're working on your laptop. You solve a tricky bug. Perfect. You close the laptop and go to sleep. Next morning, you're at your desktop computer. You want to continue working.
But the code is on your laptop. You could email it to yourself. Or use a USB drive. Or upload to Dropbox. But then you'd have two versions. Which one is newer? Did you edit both? How do you keep them in sync?
This is why GitHub exists.
GitHub is Git in the cloud. You push your code from your computer to GitHub. Now it's backed up. Your hard drive can explode. Your laptop can get stolen. Your code is safe on GitHub's servers.
Plus, you can pull that code to any computer. Work on your laptop. Push to GitHub. Pull to your desktop. Keep working. As long as you push and pull, everything stays in sync.
GitHub isn't just backup storage. It's also a social network for code. Millions of developers share projects. You can see their code. Copy it. Learn from it. Build on it. And they can see yours (if you want them to).
Let me show you how it works.
What GitHub Actually Is (And What It's Not)
Let's clear up confusion first.
Git: Software on your computer that tracks code changes (blog 1)
GitHub: A website that stores your Git repositories in the cloud
Think of it this way:
- Git = Microsoft Word (the software that edits documents)
- GitHub = Google Drive (the cloud storage for those documents)
You can use Git without GitHub (local only, like blog 1). You can't use GitHub without Git (it needs Git to understand your code).
What GitHub gives you:
- Cloud backup (your code is safe)
- Multi-computer sync (work anywhere)
- Collaboration (work with others)
- Portfolio (show employers your code)
- Open source access (millions of free projects)
- Free hosting (for simple websites)
What GitHub is NOT:
- Not just Google Drive for code (it understands your commit history, not just your files)
- Not automatic (you manually push/pull)
- Not required (Git works without it)
- Not the only option (GitLab, Bitbucket exist too)
For most people, GitHub is the default choice. It's free for unlimited public and private repositories. It has the most users. It has the best tools.
Creating a GitHub Account
Go to github.com
Click "Sign up"
Pick a username carefully. This shows up in URLs and on your profile. Future employers will see it.
Good usernames:
- yourname (john-smith)
- firstname-lastname (sarah-jones)
- professional handle (dev-mike, code-ninja-sarah)
Bad usernames:
- xxcoolcoder420xx
- 1337hacker
- randomnumbers12345
Use your real email. You'll need it for verification and notifications.
Choose the free plan. It includes:
- Unlimited public repositories
- Unlimited private repositories
- Unlimited collaborators
- 2,000 GitHub Actions minutes/month for private repositories
You don't need paid plans unless you're a company.
Verify your email. GitHub will send you a confirmation link.
You're in. You now have a GitHub account.
Understanding Repositories
A repository (repo) is a project folder in Git. It contains:
- Your code files
- Git history (all commits)
- Configuration files
- README (project description)
On your computer: my_scraper/ folder with .git inside
On GitHub: The same folder, stored in the cloud
When you "push" to GitHub, you upload your local commits. When you "pull" from GitHub, you download the commits your computer doesn't have yet.
Your First Repository on GitHub
Let's create a repo on GitHub and connect it to your local Git project.
Option 1: Create on GitHub First (Easier for Beginners)
Step 1: Create New Repository
- Click the + icon (top right)
- Select "New repository"
- Repository name: my-first-scraper
- Description: "Learning Git and GitHub"
- Public or Private: Choose "Public" (anyone can see) or "Private" (only you)
- Check "Add a README file"
- Click "Create repository"
You now have a repo on GitHub. It has one file: README.md
Step 2: Clone to Your Computer
"Cloning" means downloading a copy from GitHub to your computer.
# Go to where you want the project
cd ~/projects
# Clone the repository
git clone https://github.com/YOUR-USERNAME/my-first-scraper.git
# Go into the folder
cd my-first-scraper
Replace YOUR-USERNAME with your actual GitHub username.
Check what you got:
ls -la
You'll see:
- README.md (the file GitHub created)
- .git/ (the Git folder)
This is a Git repository. It's connected to GitHub. Any commits you make can be pushed to the cloud.
Option 2: Push Existing Local Repo (For Blog 1 Projects)
You already have a local Git project from blog 1. Let's put it on GitHub.
Step 1: Create Empty Repository on GitHub
- Click + → "New repository"
- Name: price-scraper
- Description: "E-commerce price scraper"
- Public or Private: Choose either
- DO NOT check "Add a README file" (you already have files)
- Click "Create repository"
GitHub shows you instructions. We'll follow them.
Step 2: Connect Your Local Repo
# Go to your existing project
cd ~/my_scraper
# Add GitHub as a remote
git remote add origin https://github.com/YOUR-USERNAME/price-scraper.git
# Verify it's connected
git remote -v
Output:
origin https://github.com/YOUR-USERNAME/price-scraper.git (fetch)
origin https://github.com/YOUR-USERNAME/price-scraper.git (push)
"origin" is the default name for your GitHub connection.
Step 3: Push Your Code to GitHub
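# If your local branch is still called "master", rename it first:
#   git branch -M main
# Push your commits to GitHub (-u links your local main to origin/main,
# so a plain "git push" works from now on)
git push -u origin main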
If this is your first time, Git will ask for your GitHub username and password.
Important: GitHub no longer accepts passwords for Git operations. You need a Personal Access Token (PAT).
Getting a Personal Access Token:
- Go to GitHub Settings → Developer settings → Personal access tokens → Tokens (classic)
- Generate new token
- Name: "Git operations"
- Expiration: 90 days (or longer)
- Select scopes: Check "repo" (all sub-items)
- Generate token
- Copy it immediately (you won't see it again)
When Git asks for your password, paste the token instead.
Your code is now on GitHub! Go to https://github.com/YOUR-USERNAME/price-scraper and you'll see all your files.
The Push/Pull Workflow
This is your new daily routine.
Making Changes Locally
# Edit your files
echo "print('New feature')" >> scraper.py
# Check what changed
git status
# Add and commit (like blog 1)
git add scraper.py
git commit -m "Added new feature"
Your commit is local only. GitHub doesn't have it yet.
Pushing to GitHub
# Upload your commits to GitHub
git push
Output:
Counting objects: 3, done.
Delta compression using up to 4 threads.
Compressing objects: 100% (2/2), done.
Writing objects: 100% (3/3), 345 bytes | 345.00 KiB/s, done.
Total 3 (delta 0), reused 0 (delta 0)
To https://github.com/YOUR-USERNAME/price-scraper.git
a1b2c3d..d4e5f6a main -> main
Refresh your GitHub repo page. Your new commit is there.
Pulling from GitHub
Let's say you made changes on another computer and pushed them. Or a teammate pushed changes. You need to download them.
# Download latest commits from GitHub
git pull
Output:
remote: Counting objects: 3, done.
remote: Compressing objects: 100% (2/2), done.
remote: Total 3 (delta 0), reused 3 (delta 0)
Unpacking objects: 100% (3/3), done.
From https://github.com/YOUR-USERNAME/price-scraper
d4e5f6a..07b8c9d main -> origin/main
Updating d4e5f6a..07b8c9d
Fast-forward
scraper.py | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
Your local code now matches GitHub.
The Daily Pattern
# Start of day: Get latest changes
git pull
# Work on your code
# (edit files)
# Commit your changes (like blog 1)
git add .
git commit -m "Implemented login feature"
# Push to GitHub (backup)
git push
# End of day: Everything is backed up on GitHub
Always pull before you push. This catches most conflicts early, while they are still small.
The README File (Your Project's Front Page)
When someone visits your GitHub repo, the first thing they see is README.md.
This file describes your project. What it does. How to use it. Why it exists.
Creating a Good README
# Create README in your project
touch README.md
Basic template:
# Price Scraper
A Python scraper that extracts product prices from e-commerce websites.
## What It Does
- Scrapes product names and prices
- Saves data to JSON
- Handles pagination
- Includes error handling
## Installation
```bash
pip install -r requirements.txt
```
## Usage
```bash
python scraper.py
```
## Requirements
- Python 3.8+
- BeautifulSoup4
- Requests
## Example Output
```json
[
  {
    "name": "Laptop Pro 15",
    "price": 1299.99
  }
]
```
## Author
Your Name - [GitHub](https://github.com/YOUR-USERNAME)
Save this as README.md in your project root.
git add README.md
git commit -m "Added README with project documentation"
git push
Visit your GitHub repo. The README appears on the front page, formatted nicely.
Why READMEs matter:
- First impression of your project
- Shows you can document code
- Helps others (and future you) understand the project
- Employers read them when evaluating candidates
Spend 10 minutes writing a good README. It's worth it.
Cloning Other People's Projects
GitHub has millions of open source projects. You can download and use them.
Finding Projects
Search on GitHub:
- Search bar (top): "python web scraper"
- Explore: github.com/explore
- Trending: github.com/trending
Popular scraping projects:
- Scrapy: github.com/scrapy/scrapy
- BeautifulSoup: crummy.com/software/BeautifulSoup (the official source lives on Launchpad, not GitHub)
- Selenium: github.com/SeleniumHQ/selenium
Cloning a Project
Let's clone Scrapy's source code (just to look at it).
# Clone Scrapy
git clone https://github.com/scrapy/scrapy.git
# Go into the folder
cd scrapy
# Look around
ls -la
You now have Scrapy's entire source code on your computer. You can read it. Learn from it. Modify it (locally).
You cannot push changes back to their repo. You don't have permission. That's what "forking" is for (blog 4).
Using Cloned Projects
Most projects have a README explaining installation.
# Typical pattern
cd project-name
pip install -r requirements.txt
python main.py
Follow the README instructions.
Public vs Private Repositories
When creating a repo, you choose visibility.
Public Repositories
Anyone can see:
- Your code
- Your commits
- Your README
- Everything
Use public repos for:
- Portfolio projects (show employers)
- Open source contributions
- Learning projects you want to share
- Tutorials and examples
Don't put in public repos:
- API keys or passwords
- Company/client code
- Personal information
- Embarrassing early projects (just kidding, everyone's early code is bad)
Private Repositories
Only you (and invited collaborators) can see.
Use private repos for:
- Client work
- Company projects
- Personal tools you don't want public
- Learning projects you're not ready to share
You can make a repo public later. Start private if unsure.
Settings → General → Danger Zone → Change visibility
GitHub as a Portfolio
Employers look at your GitHub. Here's how to make it impressive.
What Employers Want to See
- Active commits (regular contributions, not one big dump)
- Good READMEs (you can document code)
- Clean code (readable, organized)
- Real projects (not just tutorials)
- Variety (different languages, tools)
Building Your Portfolio
Start with 3-5 solid projects:
- Web scraper (shows data collection skills)
- Data analysis (Pandas, visualizations)
- Automation tool (solves a real problem)
- API project (Flask or FastAPI)
- Your best work (whatever you're proud of)
For each project:
- Good README (what, why, how)
- Clean code (not messy)
- Requirements.txt (dependencies)
- Example output (screenshots, sample data)
- License file (MIT is a common choice)
Pin your best repos:
- Go to your GitHub profile
- Click "Customize your pins"
- Select up to 6 of your best projects
- These show first on your profile
What NOT to Do
- Don't commit passwords or API keys
- Don't upload huge files (GitHub rejects files over 100 MB)
- Don't copy someone else's code without credit
- Don't have empty repos with no commits
- Don't use offensive or unprofessional repo names
Your GitHub is your resume. Treat it professionally.
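Want to catch oversized files before you push? Here's a rough sketch of a checker. The function name find_large_files and the decision to skip the .git folder are my own choices for illustration, and the 100 MB constant is only an approximation of GitHub's limit.

```python
from pathlib import Path

# Approximation of GitHub's per-file limit; adjust if you want an earlier warning
GITHUB_FILE_LIMIT = 100 * 1024 * 1024


def find_large_files(root, limit_bytes=GITHUB_FILE_LIMIT):
    """Return (path, size) pairs for files under root larger than limit_bytes."""
    root = Path(root)
    large = []
    for path in root.rglob("*"):
        # Skip Git's own storage; only the working files matter here
        if ".git" in path.relative_to(root).parts:
            continue
        if path.is_file():
            size = path.stat().st_size
            if size > limit_bytes:
                large.append((path, size))
    # Biggest offenders first
    return sorted(large, key=lambda item: item[1], reverse=True)


# Example usage from your project folder:
#   for path, size in find_large_files("."):
#       print(f"{size / 1_000_000:.1f} MB  {path}")
```

If it finds anything, add those files to .gitignore before you commit. Removing a big file after it's in history is much more work.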
GitHub Features You'll Use
Issues
Track bugs, features, and tasks.
Creating an issue:
- Go to repo → Issues tab
- Click "New issue"
- Title: "Bug: Scraper crashes on invalid URLs"
- Description: Explain the problem
- Submit
Why use issues:
- Remember bugs to fix
- Track feature requests
- Organize work (especially in teams)
- Show potential employers you maintain projects
Releases
Tag specific versions of your code.
Creating a release:
- Go to repo → Releases
- "Create a new release"
- Tag: v1.0.0
- Title: "First stable release"
- Description: What's included
- Publish
Why releases matter:
- Users know which version is stable
- You can reference specific versions
- Shows project maturity
GitHub Pages
Free website hosting for static sites.
Setting up Pages:
- Create repo named YOUR-USERNAME.github.io
- Add index.html
- Push to GitHub
- Visit https://YOUR-USERNAME.github.io
Your website is live. Free.
Use cases:
- Personal portfolio site
- Project documentation
- Blog
- Resume site
Working from Multiple Computers
This is where GitHub really shines.
Scenario: Laptop and Desktop
On your laptop:
# Make changes
echo "print('Laptop code')" >> scraper.py
# Commit and push
git add scraper.py
git commit -m "Added feature on laptop"
git push
On your desktop:
# Pull the changes
git pull
# Your desktop now has the laptop's changes
cat scraper.py
# Output includes: print('Laptop code')
Make changes on desktop:
echo "print('Desktop code')" >> scraper.py
git add scraper.py
git commit -m "Continued work on desktop"
git push
Back on laptop:
git pull
# Now laptop has desktop changes
Everything stays in sync. No manual file copying. No emailing code to yourself.
The Golden Rule
Always pull before you start working.
# Every time you sit down to code
git pull
# Do your work
# Commit and push when done
git add .
git commit -m "Today's work"
git push
This prevents most sync issues.
Common GitHub Workflows
Starting Your Day
cd ~/my-project
git pull # Get latest changes
git status # Check everything is clean
# Start coding
After Making Progress
git add .
git commit -m "Implemented X feature"
git push # Backup to GitHub
Before Leaving Your Computer
git status # Anything uncommitted?
git add .
git commit -m "End of day commit"
git push # Make sure GitHub has everything
Checking Your Backup
Visit your repo on github.com. See your latest commit. Your code is backed up.
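You can also check from the terminal. Running git status --porcelain --branch prints a machine-friendly summary: a header line starting with ## that names your branch and its upstream, then one line per changed file. The sketch below reads that text and tells you whether anything is still only on your computer. The function names backup_status and is_backed_up are made up for this example.

```python
import re


def backup_status(porcelain_output):
    """Summarize the text printed by `git status --porcelain --branch`."""
    lines = porcelain_output.splitlines()
    # Header looks like "## main...origin/main [ahead 2]" when an upstream is set
    header = lines[0] if lines and lines[0].startswith("## ") else ""
    changed = [line for line in lines if line and not line.startswith("## ")]
    ahead = re.search(r"\[ahead (\d+)", header)
    return {
        "has_upstream": "..." in header,
        "uncommitted_files": len(changed),
        "unpushed_commits": int(ahead.group(1)) if ahead else 0,
    }


def is_backed_up(status):
    """True when there is nothing left to commit or push."""
    return (
        status["has_upstream"]
        and status["uncommitted_files"] == 0
        and status["unpushed_commits"] == 0
    )


sample = "## main...origin/main [ahead 2]\n M scraper.py\n?? notes.txt\n"
print(backup_status(sample))
```

One caveat: Git only knows what it saw on GitHub the last time you fetched, pulled, or pushed. That's fine for answering "did I push my own work?", which is the question here.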
Handling Common Issues
Issue 1: Forgot to Pull, Made Changes
You edited files. Tried to push. Git says:
! [rejected] main -> main (fetch first)
error: failed to push some refs
What happened: Someone (or you on another computer) pushed to GitHub. Your local copy is outdated.
Fix:
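# Pull the changes
git pull
# If no conflicts, it merges automatically
# (Newer Git versions may first ask how to reconcile divergent branches;
#  "git pull --no-rebase" picks a regular merge)
# Now push
git push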
Issue 2: Merge Conflict
You and someone else edited the same lines of code.
Git shows:
CONFLICT (content): Merge conflict in scraper.py
Automatic merge failed; fix conflicts and then commit the result.
Fix:
Open the file. You'll see:
print('Hello')
<<<<<<< HEAD
print('Your change')
=======
print('Their change')
>>>>>>> a1b2c3d4
print('Goodbye')
Decide what to keep:
print('Hello')
print('Your change') # Kept yours
print('Goodbye')
Remove the markers (<<<<<<<, =======, >>>>>>>).
git add scraper.py
git commit -m "Resolved merge conflict"
git push
We'll cover conflicts in detail in blog 3.
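It's easy to miss a marker in a long file and commit it by accident. Here's a small sketch that scans text for leftover conflict markers before you run git add. The function name find_conflict_markers is my own, and the check is deliberately simple: it only looks at how each line starts.

```python
def find_conflict_markers(text):
    """Return (line_number, line) pairs for lines that look like Git conflict markers."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        # Markers always sit at the start of a line
        if line.startswith(("<<<<<<<", ">>>>>>>")) or line.rstrip() == "=======":
            hits.append((number, line))
    return hits


conflicted = """print('Hello')
<<<<<<< HEAD
print('Your change')
=======
print('Their change')
>>>>>>> a1b2c3d4
print('Goodbye')
"""

for number, line in find_conflict_markers(conflicted):
    print(f"line {number}: {line}")
```

An empty result means the file is clean. To check a real file, read it first, for example with pathlib.Path("scraper.py").read_text().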
Issue 3: Pushed Sensitive Data (API Keys)
You accidentally committed your API key and pushed it.
Immediate actions:
- Rotate the key immediately (get a new one from the service)
- Remove it from your code
- Commit and push
# Remove the key from code
# (edit the file)
git add config.py
git commit -m "Removed API key"
git push
But the key is still in Git history. Anyone can see old commits.
For now, just rotate the key (make it invalid). We'll cover removing from history in blog 5.
Prevention:
Never commit API keys. Use environment variables:
# Don't do this
api_key = "sk_live_abc123xyz"
# Do this
import os
api_key = os.getenv('API_KEY')
Set the environment variable outside Git:
export API_KEY="sk_live_abc123xyz"
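One catch with os.getenv: if the variable isn't set, it quietly returns None, and your scraper fails later with a confusing error. A small helper can fail early with a clear message instead. This is just a sketch, and require_env is a name I made up for it.

```python
import os


def require_env(name):
    """Return an environment variable's value, or raise a clear error if it is missing."""
    value = os.getenv(name)
    if not value:
        raise RuntimeError(
            f"Environment variable {name} is not set. "
            f"Set it in your shell before running the script."
        )
    return value


# Example usage (assumes you exported API_KEY as shown above):
#   api_key = require_env("API_KEY")
```

Your code stays free of secrets, and anyone who clones your repo finds out right away which variable they need to set.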
Using GitHub on Your Phone
GitHub has a mobile app (iOS and Android).
What you can do:
- View code
- Read issues
- Review pull requests
- Check commit history
- Get notifications
What you can't do:
- Edit code directly
- Push commits
Why it's useful:
- Check notifications on the go
- Review code during commute
- Stay updated on projects
Download from App Store or Google Play.
GitHub Desktop (Alternative to Command Line)
Don't like terminal commands? Try GitHub Desktop.
Download: desktop.github.com
What it does:
- Visual interface for Git
- Clone repos with clicks
- Commit with buttons
- Push/pull visually
- See file changes side-by-side
Same operations, different interface:
- Command line: git add . && git commit -m "message" && git push
- GitHub Desktop: Click "Commit to main" → Click "Push origin"
Use whatever you prefer. The result is identical.
Best Practices for GitHub
Commit Messages
Good:
- "Added login validation"
- "Fixed crash when URL is empty"
- "Updated README with installation steps"
Bad:
- "stuff"
- "changes"
- "asdfasdf"
Write messages future you will understand.
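If you want a safety net, you can sketch a tiny checker for messages like the bad ones above. Everything here is an assumption for illustration: the function name message_problems, the extra entries in the vague-word list ("update", "fix", "wip"), and the 72-character limit, which is a common convention rather than a Git rule.

```python
# The first three come from the "Bad" list above; the rest are my additions
VAGUE_SUBJECTS = {"stuff", "changes", "asdfasdf", "update", "fix", "wip"}


def message_problems(message):
    """Return a list of reasons a commit message might confuse future you."""
    lines = message.strip().splitlines()
    subject = lines[0].strip() if lines else ""
    if not subject:
        return ["message is empty"]
    problems = []
    if subject.lower() in VAGUE_SUBJECTS:
        problems.append(f"'{subject}' says nothing about what changed")
    elif len(subject.split()) < 3:
        problems.append("subject is very short; say what changed and where")
    if len(subject) > 72:
        problems.append("subject is longer than 72 characters")
    return problems


print(message_problems("Added login validation"))
print(message_problems("stuff"))
```

An empty list means the message passed. Treat the rules as a starting point and tune them to your own habits.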
Commit Frequency
Too infrequent:
- One commit with 100 file changes
- Message: "Everything"
- Can't undo specific features
Too frequent:
- Commit after every line
- 50 commits in 10 minutes
- Clutters history
Just right:
- Commit when you complete a logical unit
- One feature = one commit
- 2-3 hours of work on one thing = usually one commit
- End of day = at least one commit
Repository Organization
Good structure:
my-project/
├── README.md
├── requirements.txt
├── .gitignore
├── src/
│ ├── scraper.py
│ └── utils.py
├── tests/
│ └── test_scraper.py
└── data/
└── .gitkeep
Bad structure:
my-project/
├── untitled1.py
├── scraper_old.py
├── scraper_new.py
├── test.py
├── asdf.py
└── README.txt
Organize before you push.
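If you start a lot of projects, you can script the good structure instead of creating it by hand each time. Here's a minimal sketch using pathlib. The scaffold function and the LAYOUT list are my own naming, and the file list simply mirrors the tree shown above.

```python
from pathlib import Path

# Mirrors the "good structure" tree; edit to taste
LAYOUT = [
    "README.md",
    "requirements.txt",
    ".gitignore",
    "src/scraper.py",
    "src/utils.py",
    "tests/test_scraper.py",
    "data/.gitkeep",
]


def scaffold(root):
    """Create the project skeleton under root and return the files that were created."""
    root = Path(root)
    created = []
    for relative in LAYOUT:
        path = root / relative
        path.parent.mkdir(parents=True, exist_ok=True)
        # Never overwrite a file that already exists
        if not path.exists():
            path.touch()
            created.append(relative)
    return created


# Example usage:
#   scaffold("my-project")
```

Running it twice is safe. The second run creates nothing and returns an empty list.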
Real Example: Scraper Portfolio Project
Let's create a complete scraper project on GitHub.
Step 1: Create Local Project
mkdir amazon-price-tracker
cd amazon-price-tracker
# Initialize Git
git init
# Create project structure
touch README.md
touch requirements.txt
touch scraper.py
touch config.py.example
mkdir data
touch data/.gitkeep
Step 2: Write the Code
# scraper.py
import json
from datetime import datetime

import requests
from bs4 import BeautifulSoup


def scrape_product(url):
    """Scrape product name and price from URL"""
    headers = {
        'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36'
    }
    # Fail fast on slow responses and HTTP errors instead of parsing an error page
    response = requests.get(url, headers=headers, timeout=10)
    response.raise_for_status()
    soup = BeautifulSoup(response.content, 'html.parser')

    # Extract data (selectors depend on site)
    name_tag = soup.select_one('#productTitle')
    price_tag = soup.select_one('.a-price-whole')
    if name_tag is None or price_tag is None:
        raise ValueError(f'Could not find product name or price at {url}')

    return {
        'name': name_tag.text.strip(),
        'price': price_tag.text.strip(),
        'timestamp': datetime.now().isoformat(),
        'url': url
    }


def save_data(data, filename='data/prices.json'):
    """Save scraped data to JSON"""
    # Load earlier results so each run appends to the history
    try:
        with open(filename, 'r') as f:
            existing = json.load(f)
    except FileNotFoundError:
        existing = []
    existing.append(data)
    with open(filename, 'w') as f:
        json.dump(existing, f, indent=2)


if __name__ == '__main__':
    url = 'https://amazon.com/product-url'
    product_data = scrape_product(url)
    save_data(product_data)
    print(f'Saved: {product_data["name"]} - ${product_data["price"]}')
# requirements.txt
requests==2.31.0
beautifulsoup4==4.12.2
# README.md
# Amazon Price Tracker
Track product prices on Amazon over time.
## Features
- Scrapes product name and price
- Stores historical data
- JSON output format
- Configurable via environment variables
## Installation
```bash
pip install -r requirements.txt
```
## Usage
```bash
python scraper.py
```
## Data Storage
Data is saved to `data/prices.json` in this format:
```json
[
  {
    "name": "Product Name",
    "price": "99.99",
    "timestamp": "2024-03-15T14:30:00",
    "url": "https://amazon.com/..."
  }
]
```
## Configuration
Copy `config.py.example` to `config.py` and add your settings.
## License
MIT
Step 3: First Commit
git add .
git commit -m "Initial commit: Amazon price tracker"
Step 4: Create GitHub Repository
- Go to github.com
- New repository
- Name: amazon-price-tracker
- Public
- No README (you have one)
- Create
Step 5: Push to GitHub
git remote add origin https://github.com/YOUR-USERNAME/amazon-price-tracker.git
# git init may have named your branch "master"; rename it so it matches GitHub
git branch -M main
git push -u origin main
Step 6: Add More Features Over Time
# Add email alerts
# (edit scraper.py)
git add scraper.py
git commit -m "Added email alerts when price drops"
git push
# Add error handling
# (edit scraper.py)
git add scraper.py
git commit -m "Added error handling for network issues"
git push
# Update README
# (edit README.md)
git add README.md
git commit -m "Updated README with email alert documentation"
git push
Each commit shows your progress. Your GitHub history tells the story of building the project.
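What might the "price drops" feature look like? Here's one possible sketch that works on the history list stored in data/prices.json. It assumes entries are in the order they were saved and that prices are strings like "99.99" or "1,299.". The helper names parse_price and latest_price_drop are hypothetical, and sending the actual email is left out.

```python
def parse_price(text):
    """Convert scraped price text such as '1,299.' or '99.99' to a float."""
    cleaned = text.replace(",", "").strip().rstrip(".")
    return float(cleaned)


def latest_price_drop(history, url):
    """Return (previous, current) if the newest price for url is lower than the one before it."""
    # History is assumed to be chronological because new entries are appended
    prices = [parse_price(entry["price"]) for entry in history if entry["url"] == url]
    if len(prices) >= 2 and prices[-1] < prices[-2]:
        return prices[-2], prices[-1]
    return None


history = [
    {"name": "Product Name", "price": "99.99", "url": "https://amazon.com/product-url"},
    {"name": "Product Name", "price": "89.50", "url": "https://amazon.com/product-url"},
]
print(latest_price_drop(history, "https://amazon.com/product-url"))
```

A feature like this fits nicely in its own commit, with a message that says what it does.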
Exploring Other People's Code
GitHub is a massive library of free code.
Learning from Popular Projects
Visit these repos and read the code:
Scrapy:
- github.com/scrapy/scrapy
- See how professionals structure scrapers
- Learn advanced patterns
Requests:
- github.com/psf/requests
- Beautiful Python code
- Great documentation
Pandas:
- github.com/pandas-dev/pandas
- Complex but educational
- See how libraries work internally
How to Learn
- Star interesting repos (bookmark them)
- Read the README (understand what it does)
- Browse the code (click through files)
- Look at issues (see common problems)
- Read recent commits (see how it evolved)
Don't try to understand everything. Just browse. You'll pick up patterns.
What's Next?
You now know GitHub basics. Your code is backed up. You can work from multiple computers. You have a portfolio.
But you're still working alone. What if you want to:
- Try new features without breaking working code?
- Experiment safely?
- Work on multiple things simultaneously?
- Collaborate with others?
That's where branches come in.
Blog 3 will cover:
- Creating branches (parallel universes for your code)
- Switching between branches
- Merging branches
- Handling merge conflicts
- Pull requests (proposing changes)
- Real collaboration workflow
Branches sound complicated. They're not. They're just copies of your code where you can break things safely.
Summary
GitHub is Git in the cloud plus social features.
Core workflow:
- Create repo on GitHub (or push existing local repo)
- git pull (get latest changes)
- Make changes locally
- git add . and git commit -m "message"
- git push (backup to GitHub)
Key commands:
- git clone - Download repo from GitHub
- git pull - Get latest changes from GitHub
- git push - Upload your commits to GitHub
- git remote add origin URL - Connect local repo to GitHub
Why GitHub matters:
- Backup (hard drives die)
- Multi-computer sync (laptop + desktop)
- Portfolio (show employers)
- Collaboration (work with others)
- Learning (millions of open source projects)
Best practices:
- Always pull before pushing
- Write clear commit messages
- Create good READMEs
- Keep sensitive data out of repos
- Commit regularly (at least daily)
GitHub isn't complicated. It's just Git with a cloud backup. Clone, push, pull. That's the whole workflow.
Your code is now immortal. Hard drives can die. Laptops can be stolen. Your code lives forever on GitHub.
Next up: Blog 3 - "Branches: How to Break Things Without Breaking Things"
We'll learn how to experiment with code safely, work on multiple features simultaneously, and collaborate without chaos.
Resources:
- GitHub documentation: https://docs.github.com
- GitHub Skills: https://skills.github.com
- GitHub Desktop: https://desktop.github.com
- GitHub Mobile: https://github.com/mobile