DEV Community

João Vitor

Using GNU parallel to save time

As a developer I'm constantly reading code, especially from GitHub.

I often clone every repo from a given user or organization, and I've been using GNU parallel to save time when cloning and updating that code.

List repos

Clone missing repos

Update repos from a directory in parallel


There are more details in the parallel_commands readme.


The workflows and scripts described below assume that you already have a GITHUB_TOKEN environment variable capable of listing GitHub repositories and that parallel_commands/bin is in your PATH.


update-repos.sh is the one I use most, every day.

It does a git pull --rebase and git fetch --prune in all repositories inside a directory in parallel. This can save you and your team lots of time.
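To make the idea concrete, here is a minimal sketch of that kind of update. It is not the actual update-repos.sh from parallel_commands; the function names list_git_repos and update_repos, the default of 8 jobs, and the one-level-deep directory layout are all assumptions made for illustration.

```shell
# Hypothetical sketch, not the real update-repos.sh.

# Print every immediate subdirectory of "$1" that contains a .git directory.
list_git_repos() {
  local base="${1:-.}"
  local d
  for d in "$base"/*/; do
    [ -d "${d}.git" ] && printf '%s\n' "${d%/}"
  done
  return 0
}

# Run git fetch --prune and git pull --rebase in each repo found under "$1",
# with at most "$2" jobs at a time (8 is an arbitrary default).
update_repos() {
  local base="${1:-.}"
  local jobs="${2:-8}"
  list_git_repos "$base" |
    parallel -j "$jobs" 'git -C {} fetch --prune && git -C {} pull --rebase; echo "job {#} done: {}"'
}
```

Calling `update_repos ~/github-orgs/PacktPublishing 25` would then refresh every clone in that directory, 25 at a time.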


Offline studying

PacktPublishing and oreillymedia have lots of quality repos to be used in studies.

Imagine that you want to study Kubernetes and clone every PacktPublishing repository that contains kubernetes in its name.

Filtering PacktPublishing for kubernetes shows 63 repositories. Cloning 63 repositories by hand would be slow at best.

Here is a nicer way of doing this:

1- Create a list of repos from PacktPublishing

You can use the static file in this gist.

mkdir -p ~/github-orgs/PacktPublishing
cd ~/github-orgs/PacktPublishing
curl -s -L -o PacktPublishing.txt https://gist.githubusercontent.com/joaovitor/e1658abc0946ec9e5528c533f2f502a8/raw/88df9a43c2f9253d686b4ca6eb45bf50c047678b/PacktPublishing.txt

Or generate it with github-print-organization-repos.sh.

This one takes a while: PacktPublishing has more than 6000 repositories, and github-print-organization-repos.sh makes around 600 GitHub API calls to generate the file with the list of repos.

mkdir -p ~/github-orgs/PacktPublishing
cd ~/github-orgs/PacktPublishing
gh_owner=$(basename "$(pwd)"); github-print-organization-repos.sh "${gh_owner}" "${gh_owner}.txt"
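For readers curious about what a listing script like this has to do, here is a rough sketch of paginating through an organization's repositories with the GitHub REST API. It is not the real github-print-organization-repos.sh: the function names, the use of jq, the page size of 100, and the choice to print one clone URL per line are all assumptions, and the real script's output format may differ.

```shell
# Hypothetical sketch, not the real github-print-organization-repos.sh.
# Assumes curl and jq are installed and GITHUB_TOKEN is set.

# Build the REST API URL for one page of an organization's repositories.
repos_page_url() {
  local org="$1"
  local page="$2"
  local per_page="${3:-100}"
  printf 'https://api.github.com/orgs/%s/repos?per_page=%s&page=%s\n' "$org" "$per_page" "$page"
}

# Print one clone URL per line, requesting pages until one comes back empty
# (an empty result also ends the loop if a request fails).
list_org_repos() {
  local org="$1"
  local page=1
  local batch
  while :; do
    batch=$(curl -fsSL \
      -H "Authorization: Bearer ${GITHUB_TOKEN}" \
      -H "Accept: application/vnd.github+json" \
      "$(repos_page_url "$org" "$page")" | jq -r '.[].clone_url')
    [ -z "$batch" ] && break
    printf '%s\n' "$batch"
    page=$((page + 1))
  done
}
```

With these definitions, `list_org_repos PacktPublishing > PacktPublishing.txt` would write the list to a file.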

2- Clone the missing repos that match kubernetes

cd ~/github-orgs/PacktPublishing
gh_owner=$(basename "$(pwd)"); grep -Ei kubernetes "${gh_owner}.txt" | parallel -j 25 'clone-missing.sh {}; echo job {#} completed {};'
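The "clone only what is missing" step can be sketched as follows. This is not the actual clone-missing.sh; the function names repo_dir_name and clone_missing are hypothetical, and the sketch assumes each line of the list is a clone URL or an owner/name string.

```shell
# Hypothetical sketch, not the real clone-missing.sh.

# Derive the local directory name from a clone URL or owner/name string.
repo_dir_name() {
  local name="${1##*/}"          # drop everything up to the last slash
  printf '%s\n' "${name%.git}"   # drop a trailing .git, if any
}

# Clone "$1" into the current directory unless the target directory exists.
clone_missing() {
  local url="$1"
  local dir
  dir=$(repo_dir_name "$url")
  if [ -d "$dir" ]; then
    echo "skip $dir (already cloned)"
  else
    git clone --quiet "$url" "$dir"
  fi
}
```

Under bash, a function like this can be handed to GNU parallel after `export -f repo_dir_name clone_missing`, for example `grep -Ei kubernetes PacktPublishing.txt | parallel -j 25 clone_missing {}`. Skipping existing directories makes the command safe to re-run after an interrupted batch.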

3- Study offline
