DEV Community

John Ajera

Sync GitHub User or Organization Repositories with a Simple Script

Keeping local clones of every repository under a GitHub user or organization up to date is tedious, especially when there are many of them. With a short Bash script, you can automate cloning and syncing all of those repositories. This article walks you through the script and how it works.

Why Automate Repository Syncing?

Manually cloning and pulling updates for multiple repositories is time-consuming and prone to errors. Automating this process offers the following benefits:

  • Time-Saving: Quickly fetch and update all repositories without manual intervention.
  • Consistency: Ensure every local clone is up to date with its remote.
  • Efficiency: Discover repositories through the GitHub API instead of maintaining a list by hand.

The Script: Clone and Sync Repositories

Here's the script to automate cloning and syncing repositories:

#!/bin/bash

# Function to sync repositories for a GitHub organization or user
sync_github_repos() {
  local github_entity="$1"  # GitHub organization or user name passed as the first argument
  local base_dir="$2"       # Base directory to store repositories passed as the second argument
  local entity_type="orgs"  # Default entity type is set to organization

  # Validate input arguments
  if [ -z "$github_entity" ]; then
    echo "Error: No GitHub entity provided. Please specify an organization or user."
    exit 1
  fi

  if [ -z "$base_dir" ]; then
    echo "Error: No base directory provided. Please specify a valid path."
    exit 1
  fi

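  # Determine whether the entity is an organization or a user via the GitHub API.
  # The name is passed to jq with --arg rather than spliced into the filter, and
  # the comparison ignores case because GitHub logins are case-insensitive.
  if curl -s "https://api.github.com/orgs/$github_entity" | jq -e --arg name "$github_entity" 'has("login") and (.login | ascii_downcase) == ($name | ascii_downcase)' > /dev/null 2>&1; then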
    entity_type="orgs"
  elif curl -s -o /dev/null -w "%{http_code}" "https://api.github.com/users/$github_entity" | grep -q '^200$'; then
    entity_type="users"
  else
    echo "Error: $github_entity is neither a valid GitHub organization nor a user."
    exit 1
  fi

  # Ensure the base directory exists
  mkdir -p "$base_dir/$github_entity"

  # Function to clone or update a repository
  sync_repo() {
    local repo_url="$1"  # Repository URL passed to the function
    local repo_name      # Declared separately so basename's exit status is not masked by local
    repo_name=$(basename "$repo_url" .git)  # Extract repository name from the URL
    local repo_dir="$base_dir/$github_entity/$repo_name"  # Full path to the local repository

    # Check if the repository directory exists locally
    if [ -d "$repo_dir" ]; then
      echo "Updating $repo_name..."
      git -C "$repo_dir" pull --rebase  # Update the repository using git pull with rebase
    else
      echo "Cloning $repo_name..."
      git clone "$repo_url" "$repo_dir"  # Clone the repository if it does not exist
    fi
  }

  # Fetch and synchronize repositories for the given entity
  echo "Fetching repositories for $entity_type: $github_entity..."
  local page=1  # Start with the first page of results
  local repos repo_urls repo_url  # Keep loop variables out of the global scope

  # Loop through paginated results to fetch all repositories
  while true; do
    # Fetch repository data using the GitHub API
    repos=$(curl -s "https://api.github.com/$entity_type/$github_entity/repos?per_page=100&page=$page")

    # Validate if the response contains an array
    if ! echo "$repos" | jq -e 'type == "array"' > /dev/null 2>&1; then
      echo "Error: Unable to fetch repositories. Check your API rate limits or credentials."
      exit 1
    fi

    # Extract repository SSH URLs
    repo_urls=$(echo "$repos" | jq -r '.[].ssh_url')

    # Exit the loop if no repositories are found
    [ -z "$repo_urls" ] && break

    # Iterate over each repository URL and sync it
    for repo_url in $repo_urls; do
      sync_repo "$repo_url"
    done

    ((page++))  # Increment the page number for the next iteration
  done
}

# Check if arguments are provided and run the script
if [ "$#" -lt 2 ]; then
  echo "Usage: $0 <GitHub_Entity> <Base_Directory>"
  exit 1
fi

# Pass arguments to the function
sync_github_repos "$1" "$2"
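
One caveat about the update path in the script above: git pull --rebase refuses to run when a clone has staged or unstaged changes to tracked files (unless Git's autostash option is enabled), so the pull fails for any repository with work in progress. The following is a minimal sketch of a guard, assuming you would rather skip such repositories than stash their changes. The helper names has_local_changes and safe_update are hypothetical and not part of the original script.

```shell
#!/bin/bash

# Hypothetical helper: succeeds (exit 0) when tracked files in the repository
# have staged or unstaged changes. Untracked files are ignored here because
# they usually do not stop `git pull --rebase`.
has_local_changes() {
  local repo_dir="$1"
  [ -n "$(git -C "$repo_dir" status --porcelain --untracked-files=no)" ]
}

# Hypothetical helper: pull with rebase only when the working tree is clean,
# otherwise report the repository on stderr and move on.
safe_update() {
  local repo_dir="$1"
  if has_local_changes "$repo_dir"; then
    echo "Skipping $(basename "$repo_dir"): uncommitted changes" >&2
    return 0
  fi
  git -C "$repo_dir" pull --rebase
}
```

In the script, the call to git pull --rebase inside sync_repo could be swapped for safe_update "$repo_dir" if this behavior suits your workflow.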

How It Works

  1. Set Up Organization and Base Directory
- The first argument is the GitHub organization or user name (e.g., jdevto).
- The second argument is the base directory; repositories are cloned or updated under <Base_Directory>/<GitHub_Entity>/.
- The script queries the GitHub API to determine whether the name belongs to an organization or a user.
  2. Clone or Sync Repositories
- The sync_repo function checks whether a repository already exists locally.
- If it does, the function updates it with git pull --rebase. Otherwise, it clones the repository.
  3. Fetch Repository List
- The script uses the GitHub API to fetch the entity's repositories in pages of 100.
- It calls sync_repo for each repository on a page and stops when a page comes back empty.
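
The steps above can be tried offline with a small dry run. This is only a sketch: fetch_page is a stand-in for the GitHub API call and returns made-up URLs, and plan_repo and plan_all are hypothetical names that print what the script would do instead of running git.

```shell
#!/bin/bash

# Stand-in for the GitHub API call: prints one SSH URL per line for the
# requested page and nothing once the pages run out. The URLs are made up.
fetch_page() {
  case "$1" in
    1) printf '%s\n' "git@github.com:example-org/alpha.git" \
                     "git@github.com:example-org/beta.git" ;;
    2) printf '%s\n' "git@github.com:example-org/gamma.git" ;;
    *) ;;  # empty page: signals the end of the listing
  esac
}

# Decide what the sync step would do for one repository, without running git.
plan_repo() {
  local repo_url="$1" base_dir="$2"
  local repo_name
  repo_name=$(basename "$repo_url" .git)  # .../alpha.git -> alpha
  if [ -d "$base_dir/$repo_name" ]; then
    echo "update $repo_name"
  else
    echo "clone $repo_name"
  fi
}

# Walk the pages until an empty one comes back, planning each repository.
plan_all() {
  local base_dir="$1" page=1 repo_urls repo_url
  while true; do
    repo_urls=$(fetch_page "$page")
    [ -z "$repo_urls" ] && break
    for repo_url in $repo_urls; do
      plan_repo "$repo_url" "$base_dir"
    done
    page=$((page + 1))
  done
}
```

Calling plan_all with a directory that already contains a beta subdirectory prints "clone alpha", "update beta", and "clone gamma", one per line, which mirrors the clone-or-update decision and the empty-page stop condition in the real script.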

Prerequisites

Before using the script, ensure the following:

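  • Git Installed: The script uses Git commands like git clone and git pull.
  • SSH Access to GitHub: The script clones over SSH (it reads each repository's ssh_url), so an SSH key registered with your GitHub account is required.
  • GitHub API Access: The script fetches repositories using GitHub's REST API. As written, it sends unauthenticated requests, which list only public repositories and are limited to 60 requests per hour. To include private repositories, add authentication (e.g., a Personal Access Token) to the curl calls.
  • curl and jq Installed: The script uses curl to call the API and jq to parse the JSON responses.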
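
The prerequisites above can be checked before the script does any work. The sketch below is one possible approach, not part of the original script: missing_tools and auth_header are hypothetical helper names, and GITHUB_TOKEN is an assumed environment variable name for a Personal Access Token.

```shell
#!/bin/bash

# Print the name of every listed tool that is not on PATH, one per line.
missing_tools() {
  local tool
  for tool in "$@"; do
    command -v "$tool" > /dev/null 2>&1 || echo "$tool"
  done
}

# Print an Authorization header when a token is available, else nothing.
# GITHUB_TOKEN is an assumed variable name; the original script reads no token.
auth_header() {
  if [ -n "${GITHUB_TOKEN:-}" ]; then
    echo "Authorization: Bearer $GITHUB_TOKEN"
  fi
}

# Example wiring (commented out so that loading this file does nothing):
#   missing=$(missing_tools git curl jq)
#   if [ -n "$missing" ]; then
#     echo "Missing required tools: $missing" >&2
#     exit 1
#   fi
#   header=$(auth_header)
#   if [ -n "$header" ]; then
#     curl -s -H "$header" "https://api.github.com/orgs/$github_entity/repos?per_page=100&page=1"
#   else
#     curl -s "https://api.github.com/orgs/$github_entity/repos?per_page=100&page=1"
#   fi
```

Reading the token from the environment keeps it out of the script file and out of the command line.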

Run the Script Directly Without Cloning

You can run the script without first cloning the repository that hosts it (jdevto/cli-tools) by using the following command:

bash <(curl -s https://raw.githubusercontent.com/jdevto/cli-tools/main/scripts/sync_github_repos.sh) <GitHub_Entity> <Base_Directory>

For example, to sync every jdevto repository into a jdevto folder under the current directory:

bash <(curl -s https://raw.githubusercontent.com/jdevto/cli-tools/main/scripts/sync_github_repos.sh) jdevto .

Key Takeaways

  • Automating repetitive tasks like syncing repositories saves time and reduces errors.
  • The GitHub API makes it easy to interact with your organization's repositories programmatically.
  • With small tweaks, this script can handle additional tasks like branch switching or archiving inactive repositories.

Now you can bring all your GitHub repositories up to date with a single command. Happy automating!
