DEV Community

lbonanomi
lbonanomi

Posted on • Edited on

5 1

Tallying Your Github Repo Views

You: are a data scientist harvesting Github visitor traffic stats for analysis.
I: am thirsty and insecure.
We are both interested in tallying Github's unique-visitor logs for all of our repos.

Collecting traffic data

Github offers a REST endpoint to query referrers, most-popular files, views (total or unique) and clone traffic, and most well-appointed Linux hosts offer the curl and jq utilities for polling and parsing the returned JSON.

#!/bin/bash

BUFF=$(mktemp)

# Query the Github `user` endpoint to get a list of the user's repos. 
# This is neater than hard-coding a username or URL into this script.
#
MYREPOS=$(curl -sn https://api.github.com/user | jq .repos_url | tr -d '"')

# Query $MYREPOS for a list of repository URLs and mutate them into 
# `https://api.github.com/$user/$repo/traffic/views` 
#
curl -sn $MYREPOS | jq .[].url | tr -d '"' | while read URL
do
    curl -sn "$URL/traffic/views" | jq -c .views[] | while read view;
    do
        # Query the repo's `views` URL and parse returned JSON 
        # for unique-visitor counts by date

        VIEWDATE=$(echo "$view" | jq .timestamp | awk -F"T" '{ print $1 }')
        VIEWERS=$(echo "$view" | jq .uniques)

        echo "$VIEWDATE, $VIEWERS, $URL" | tr -d '"'
    done
done > $BUFF

# Now we should have a temp file that looks like:
#
# 2019-05-11, 1, https://api.github.com/repos/lbonanomi/go
# 2019-05-12, 1, https://api.github.com/repos/lbonanomi/go
# 2019-05-13, 1, https://api.github.com/repos/lbonanomi/go
#

# Let's leverage `seq` and GNU `date` to make a list of dates for the 
# last 10 days and build a CSV

seq 0 9 | tac | while read SINCE
do
    # datestamp
    date -d "$SINCE days ago" +%Y-%m-%d | tr "\n" "\t"

    # If there's no-data report a '0'
    grep -q $(date -d "$SINCE days ago" +%Y-%m-%d) $BUFF || printf "0"

    # Normally grep-before-awk makes me psychotic, but I think this is clearer.
    grep $(date -d "$SINCE days ago" +%Y-%m-%d) $BUFF|awk -F"," '{t=t+$2} END{print t}'
done

# Give a hoot and delete temp files when they're done.
rm $BUFF

Running our script should give us a tsv-formatted count of unique visitors to our personal Github presence in the last 10 days:

2019-05-16  5
2019-05-17  9
2019-05-18  1
2019-05-19  2
2019-05-20  0
2019-05-21  2
2019-05-22  2
2019-05-23  8
2019-05-24  3
2019-05-25  4

Wouldn't this look handsome with spark graph or a line graph?

Heroku

Simplify your DevOps and maximize your time.

Since 2007, Heroku has been the go-to platform for developers as it monitors uptime, performance, and infrastructure concerns, allowing you to focus on writing code.

Learn More

Top comments (0)

Billboard image

The Next Generation Developer Platform

Coherence is the first Platform-as-a-Service you can control. Unlike "black-box" platforms that are opinionated about the infra you can deploy, Coherence is powered by CNC, the open-source IaC framework, which offers limitless customization.

Learn more

👋 Kindness is contagious

Please leave a ❤️ or a friendly comment on this post if you found it helpful!

Okay