DEV Community

Shivakumar
Version Control System & Git

This study note is arranged in a logical order for a Senior DevOps learning path. It covers version control concepts, Git fundamentals and advanced features, and CI/CD integration, with explanations, code snippets, and comparisons.


Comprehensive DevOps Study Guide: Version Control, Git, & CI/CD

Part 1: Version Control Systems (VCS) Overview

What is a VCS?

A Version Control System (VCS), also known as source control, is a software tool that tracks and manages changes to a set of files over time. It is most commonly used in software development to manage source code, but it can track changes for any collection of files.

Core Functions

At its most basic level, a VCS acts like a "time machine" for your project. It allows you to:

  • Track History: Record every change (addition, deletion, modification) made to files, including who made the change and why (via commit messages).
  • Collaborate: Enable multiple people to work on the same project simultaneously without overwriting each other's work.
  • Revert: Roll back the entire project or specific files to a previous state if a bug is introduced.
  • Branch & Merge: Create separate lines of development (branches) to work on features or fixes in isolation, then integrate them back into the main project (merge).

Types of Version Control Systems

There are two primary architectures for version control:

1. Centralized Version Control Systems (CVCS)
In a centralized system (e.g., Subversion/SVN, Perforce), there is a single server that contains the master copy of the project and all version history.

  • Workflow: Developers check out a specific version of a file from the server, modify it, and commit it back.
  • Risk: The central server is a single point of failure. If it goes down, no one can collaborate or save versioned changes.

2. Distributed Version Control Systems (DVCS)
In a distributed system (e.g., Git, Mercurial), clients don't just check out the latest snapshot of the files; they mirror the entire repository.

  • Workflow: Every developer has a full copy of the project history on their local machine.
  • Redundancy: If the server dies, any of the client repositories can be used to restore the server.
  • Offline Capability: Developers can commit changes and view history without a network connection.

Role in DevOps

For a DevOps engineer, VCS is not just about code history; it is the foundation of the CI/CD pipeline:

  1. Source of Truth: It holds the "Infrastructure as Code" (Terraform, Ansible), ensuring infrastructure changes are tracked just like application code.
  2. Automation Trigger: A "push" to the VCS is the standard trigger for automated builds, tests, and deployments (CI/CD).

Part 2: Git Fundamentals & Architecture

What is Git?

Git is the most widely used modern version control system. It is a Distributed Version Control System (DVCS) created by Linus Torvalds in 2005 to support the development of the Linux kernel. Unlike older systems, it prioritizes speed, data integrity, and non-linear workflows.

Core Architecture: The Three Stages

Understanding Git requires understanding the three states that your files can reside in:

  1. Working Directory: This is your actual workspace—the files you see, edit, and delete on your computer.
  2. Staging Area (Index): A "holding zone" where you prepare your next commit. You select specific changes from your working directory to include (e.g., "I want to commit file A but not file B yet").
  3. Local Repository (.git directory): This is where Git permanently stores the committed snapshots (history) of your project on your machine.

The Workflow:
Modify files → git add (Move to Staging) → git commit (Save to Local Repo)
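
The three areas can be seen in a small throwaway repository. The following is a minimal sketch, assuming Git is installed; the temporary directory, file names, and demo identity are illustrative and not part of the original notes.

```shell
set -e
# Create a throwaway repository (path and identity are illustrative).
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"

# 1. Working directory: create two files.
echo "alpha" > a.txt
echo "beta" > b.txt

# 2. Staging area: stage only a.txt ("commit file A but not file B yet").
git add a.txt
git status --short   # a.txt is listed as added, b.txt as untracked

# 3. Local repository: save the staged snapshot.
git commit -q -m "Add a.txt"

# Only the staged file is part of the commit.
committed=$(git ls-tree --name-only HEAD)
echo "Files in the commit: $committed"
```

After the commit, b.txt is still sitting in the working directory as an untracked file, which is exactly the separation the staging area provides.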

Essential Commands Cheat Sheet

  • git init: Initializes a new empty Git repository in your current folder.
  • git clone <url>: Copies an existing repository to your local machine.
  • git status: Shows the state of your working directory (modified, staged, or untracked files).
  • git add <file>: Moves changes from the Working Directory to the Staging Area.
  • git commit -m "msg": Saves the staged snapshot to the Local Repository.
  • git push: Uploads local commits to a remote repository (like GitHub).
  • git pull: Downloads changes from a remote repository and merges them.
  • git branch: Lists, creates, or deletes branches.
  • git checkout <branch>: Switches to a different branch (Git 2.23 and later also offer git switch <branch> for this).
  • git merge <branch>: Joins the specified branch into your current branch.

Git vs. GitHub

  • Git is the software tool installed on your local computer to manage version control.
  • GitHub (and GitLab, Bitbucket) is a hosting service on the web. It hosts the Git repositories so teams can push/pull code to a central location.

Part 3: Advanced Git Concepts (Senior Level)

1. History Hygiene: Merge vs. Rebase

  • Merge (git merge): Creates a "merge commit" that ties two branches together.
    • Pro: Non-destructive. Existing commits are untouched, so the exact history is preserved.
    • Con: Can create a "messy" history with lots of merge commits.

  • Rebase (git rebase): Replays your branch's commits on top of the tip of the main branch. It rewrites history so that it looks as if you wrote your code after the latest changes in main.
    • Pro: Creates a linear history that is easier to read and to search with tools such as git bisect.
    • Con: Destructive. Every replayed commit gets a new hash, so the old commits are replaced.

  • The Golden Rule: Never rebase commits that you have already pushed to a shared branch, because collaborators may have built on the old hashes. Only rebase branches that nobody else is using.
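
The difference can be observed directly. The sketch below is one possible demonstration, assuming Git 2.28 or later (for git init -b); the branch name merge-demo and the file names are made up for illustration.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q -b main
git config user.name "Demo User"
git config user.email "demo@example.com"

echo "base" > base.txt
git add base.txt
git commit -q -m "Base commit"

# A feature branch gets one commit...
git checkout -q -b feature
echo "feature" > feature.txt
git add feature.txt
git commit -q -m "Feature work"
feature_before=$(git rev-parse feature)

# ...while main moves ahead, so the two branches diverge.
git checkout -q main
echo "main" > main.txt
git add main.txt
git commit -q -m "Main work"

# Option A: merge. A merge commit joins the histories; old hashes survive.
git checkout -q -b merge-demo main
git merge -q --no-edit feature
merge_commits=$(git rev-list --merges --count merge-demo)

# Option B: rebase. The feature commit is replayed on top of main.
git checkout -q feature
git rebase -q main
feature_after=$(git rev-parse feature)
rebase_merge_commits=$(git rev-list --merges --count feature)

echo "merge commits after merge:  $merge_commits"
echo "merge commits after rebase: $rebase_merge_commits"
echo "feature tip before rebase:  $feature_before"
echo "feature tip after rebase:   $feature_after"
```

The merged branch contains one merge commit and still includes the original feature commit, while the rebased branch has no merge commit and a different tip hash. That changed hash is why rebasing shared commits causes trouble for collaborators.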

2. Interactive Rebase (git rebase -i)

Used to "polish" your work before a code review.

  • Squash: Meld multiple commits into one single, clean commit.
  • Reword: Change a commit message.
  • Drop: Delete a commit entirely.
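
Interactive rebase normally opens an editor with a "todo" list of commits. To keep the example scriptable, the sketch below edits that list with sed through the GIT_SEQUENCE_EDITOR environment variable and accepts the combined commit message with GIT_EDITOR=true. This is an illustration under those assumptions; in day-to-day use you would edit the list by hand.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"

echo "base" > app.txt
git add app.txt
git commit -q -m "Initial commit"

# Three small work-in-progress commits that should become one.
for step in 1 2 3; do
    echo "step $step" >> app.txt
    git commit -q -am "WIP step $step"
done
before=$(git rev-list --count HEAD)

# Keep the first "pick" and turn the remaining ones into "squash".
GIT_SEQUENCE_EDITOR="sed -i.bak '2,\$s/^pick/squash/'" \
GIT_EDITOR=true \
git rebase -i HEAD~3 >/dev/null 2>&1

after=$(git rev-list --count HEAD)
echo "commits before: $before, after: $after"
```

The file contents are unchanged; only the history is condensed from four commits to two.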

3. Debugging with git bisect

If a bug was introduced sometime in the last 500 commits, git bisect uses a binary search to find the bad commit in about 9 test steps (log2 of 500 is roughly 9) instead of checking up to 500 commits one by one.

  • Workflow: You define a "bad" commit (current) and a "good" commit (past). Git automatically checks out the middle commit for you to test, repeating until it pinpoints the culprit.
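
The manual good/bad loop can also be automated with git bisect run, which treats exit code 0 from a test command as "good" and a non-zero code as "bad". The sketch below builds an eight-commit toy history in which the sixth commit adds a marker line; the marker, file name, and grep-based test are illustrative stand-ins for a real test suite.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"

# Eight commits; commit 6 introduces the "bug" (a marker line).
for i in 1 2 3 4 5 6 7 8; do
    echo "change $i" >> app.txt
    if [ "$i" -eq 6 ]; then
        echo "BUG" >> app.txt
    fi
    git add app.txt
    git commit -q -m "commit $i"
done

# The first commit is known to be good, the current HEAD is bad.
good=$(git rev-list --max-parents=0 HEAD)
git bisect start HEAD "$good" >/dev/null

# The test passes (exit 0) only when the marker is absent.
git bisect run sh -c '! grep -q BUG app.txt' >/dev/null

# refs/bisect/bad now points at the first bad commit.
culprit=$(git log -1 --format=%s refs/bisect/bad)
echo "First bad commit: $culprit"

# Return to the branch that was checked out before bisecting.
git bisect reset >/dev/null 2>&1
```

With this history the script reports "commit 6" after testing only a few of the eight commits.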

4. The Safety Net: git reflog

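Git keeps a local log (the reflog) of every movement of the HEAD pointer, so it can still find commits that have been "deleted" or are no longer part of any branch. The reflog exists only on your machine and its entries expire eventually (by default after 90 days, or 30 days for commits that are no longer reachable).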

  • Usage: If you accidentally did a "hard reset" and lost a commit, use git reflog to find the commit hash and restore it.
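
A minimal sketch of that recovery, assuming a throwaway repository with two commits (the file name and messages are illustrative):

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"

echo "one" > notes.txt
git add notes.txt
git commit -q -m "first"
echo "two" >> notes.txt
git commit -q -am "second"
lost=$(git rev-parse HEAD)

# The accident: a hard reset throws away the latest commit.
git reset -q --hard HEAD~1

# The reflog still records where HEAD pointed before the reset.
git --no-pager reflog
recovered=$(git rev-parse 'HEAD@{1}')

# Restore the branch to the "lost" commit.
git reset -q --hard "$recovered"
echo "Recovered commit: $(git log -1 --format=%s)"
```

Here HEAD@{1} means "where HEAD was one move ago". In a real incident you would read the git reflog output and pick the entry just before the mistake.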

Part 4: Strategic Workflows & Guardrails

1. Cherry-Picking (git cherry-pick)

The act of picking a specific commit from one branch and applying it to another, while leaving the rest of the branch behind.

  • Analogy: Taking only the milk from one shopping cart and putting it in another, leaving the eggs and bread behind.
  • DevOps Use Case: Pulling a specific hotfix from a feature branch into production immediately, without deploying the unfinished feature.
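
The hotfix scenario can be sketched as follows, assuming Git 2.28 or later (for git init -b). The branch layout, file names, and commit messages are invented for the example.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q -b main
git config user.name "Demo User"
git config user.email "demo@example.com"

echo "v1" > app.txt
git add app.txt
git commit -q -m "Initial release"

# The feature branch holds unfinished work plus an urgent fix.
git checkout -q -b feature
echo "half-done" > feature.txt
git add feature.txt
git commit -q -m "WIP: unfinished feature"
echo "patched" > hotfix.txt
git add hotfix.txt
git commit -q -m "Hotfix: patch critical bug"
fix=$(git rev-parse HEAD)

# Apply only the hotfix commit to main.
git checkout -q main
git cherry-pick "$fix" >/dev/null

picked=$(git log -1 --format=%s)
echo "Latest commit on main: $picked"
ls
```

After the cherry-pick, main contains hotfix.txt but not feature.txt. The picked commit is a copy with a new hash, so the fix will exist on both branches as two separate commits.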

2. Git Hooks (Automation)

Hooks are scripts that run automatically on specific events to enforce standards.

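  • Client-Side (pre-commit): Runs before the commit is saved. Used for linters or security checks. Client-side hooks are not copied by git clone and can be skipped with git commit --no-verify, so treat them as a convenience rather than enforcement.
  • Server-Side (pre-receive): Runs on the server when a client pushes code. Used to reject bad pushes; clients cannot bypass it.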

Practical Example: The "No AWS Keys" Pre-Commit Hook
A script to block commits containing AWS Access Keys.
File location: .git/hooks/pre-commit (the file must be executable, e.g. chmod +x .git/hooks/pre-commit)

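#!/bin/bash
# Block commits whose staged content contains an AWS Access Key ID.
# Access Key IDs start with "AKIA" followed by 16 uppercase letters or digits.
FORBIDDEN='AKIA[0-9A-Z]{16}'

# --cached searches the index (the content that will be committed),
# not the working tree. -n prints line numbers, -I skips binary files.
if git grep --cached -nI -E "$FORBIDDEN"; then
    echo "🚨 SECURITY ALERT: AWS Access Key detected in staged files!" >&2
    echo "Remove the key (and rotate it) before committing." >&2
    exit 1
fi
exit 0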

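To see the hook in action, the sketch below installs a compact variant of it in a throwaway repository and tries two commits. It uses the example key from AWS's own documentation (AKIAIOSFODNN7EXAMPLE), not a real credential; the file name config.ini and the /bin/sh shebang are choices made for this illustration.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"

# Install a compact variant of the hook and make it executable.
mkdir -p .git/hooks
cat > .git/hooks/pre-commit <<'EOF'
#!/bin/sh
if git grep --cached -E 'AKIA[0-9A-Z]{16}'; then
    echo "AWS Access Key detected in staged files!" >&2
    exit 1
fi
exit 0
EOF
chmod +x .git/hooks/pre-commit

# Attempt 1: stage a file that contains the documented example key.
echo 'aws_access_key_id = AKIAIOSFODNN7EXAMPLE' > config.ini
git add config.ini
if git commit -q -m "Add config" >/dev/null 2>&1; then
    with_key="committed"
else
    with_key="blocked"
fi

# Attempt 2: remove the key and try again.
echo 'aws_access_key_id = read-from-environment' > config.ini
git add config.ini
if git commit -q -m "Add config without secrets" >/dev/null 2>&1; then
    without_key="committed"
else
    without_key="blocked"
fi

echo "Commit with key:    $with_key"
echo "Commit without key: $without_key"
```

The first commit is rejected because the hook exits with status 1, and the second goes through once the key is gone.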

3. Branching Strategies

A. GitFlow (The Classic)

  • Structure: main (release), develop (integration), feature/*, release/*, hotfix/*.
  • Pros: Safe, clear separation of stable vs. WIP code.
  • Cons: Complex, slow, prone to "Merge Hell".

B. Trunk-Based Development (The DevOps Standard)

  • Structure: main (trunk) is the only long-lived branch. Developers integrate small changes into it at least daily, either directly or through short-lived branches.
  • Requirement: Feature Flags (toggles) are used to hide unfinished code in production.
  • Pros: Fast, enables true CI/CD.
  • Cons: Requires high discipline and automated testing.

Part 5: CI/CD Integration

The Professional Workflow Loop

  1. Branch: Create a focused isolated environment.
  2. Work: Make changes and commit locally.
  3. Push: Upload branch to GitHub.
  4. Pull Request (PR): Ask for review (Automation triggers here).
  5. Merge: Integrate into main after checks pass.
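
The Git side of this loop can be rehearsed offline. In the sketch below a local bare repository stands in for GitHub, and the merge that the hosting service would perform after review is simulated with a local merge and push. It assumes Git 2.28 or later (for git init -b); the branch name feature/login-page and the file names are hypothetical.

```shell
set -e
work=$(mktemp -d)

# A bare repository plays the role of the remote host.
git init -q --bare "$work/remote.git"

# The developer's local repository.
git init -q -b main "$work/dev"
cd "$work/dev"
git config user.name "Demo User"
git config user.email "demo@example.com"
git remote add origin "$work/remote.git"
echo "# Demo" > README.md
git add README.md
git commit -q -m "Initial commit"
git push -q -u origin main

# 1. Branch: create a focused, isolated line of work.
git checkout -q -b feature/login-page

# 2. Work: make changes and commit locally.
echo "login form" > login.txt
git add login.txt
git commit -q -m "Add login page"
feature_tip=$(git rev-parse HEAD)

# 3. Push: upload the branch to the remote.
git push -q -u origin feature/login-page

# 4. Pull Request: opened on the hosting service, not with a Git command.

# 5. Merge: normally done by the host after checks pass; simulated here.
git checkout -q main
git merge -q --no-ff --no-edit feature/login-page
git push -q origin main

git --no-pager log --oneline --graph
```

Step 4 has no plain Git equivalent, which is why it appears only as a comment. It is also the point where CI checks usually run.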

Automation Tool 1: GitHub Actions (Modern/Cloud)

  • Concept: You define the pipeline in a YAML file inside the repo (.github/workflows/). GitHub spins up a runner to execute it.
  • Setup: No server setup is required when you use GitHub-hosted runners.

Example Pipeline (ci-pipeline.yml):

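name: DevOps CI Pipeline
on:
  push:
    branches: [ "main" ]
  pull_request:                     # also run the checks on PRs targeting main
    branches: [ "main" ]
jobs:
  quality-check:
    runs-on: ubuntu-latest          # GitHub-hosted runner
    steps:
      - uses: actions/checkout@v4   # check out the repository
      - name: Run Script Check
        run: echo "Running tests..."   # placeholder for the real test command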


Automation Tool 2: Jenkins (Enterprise/Legacy)

  • Architecture: Jenkins is an external server. It relies on Webhooks.
  • Workflow:
    1. User pushes code.
    2. GitHub sends a webhook (HTTP POST) to Jenkins.
    3. Jenkins wakes up, clones the repo, and executes the Jenkinsfile.

Comparison: GitHub Actions vs. Jenkins

| Feature   | GitHub Actions                | Jenkins                               |
|-----------|-------------------------------|---------------------------------------|
| Setup     | Zero setup (SaaS).            | Heavy setup (Dedicated Server).       |
| Language  | YAML (Simple, Declarative).   | Groovy (Scripted, Complex).           |
| Execution | Cloud Runners.                | Your own Build Agents.                |
| Best For  | Modern cloud-native projects. | Legacy enterprise, massive pipelines. |
