Introduction
This is my first article of 2024. Happy New Year to all of you.
I have to begin with a confession. I planned to put this article out more than a month ago as a Christmas present, but I was spread too thin, wearing multiple hats.
While this did not end up as the holiday present that I wanted to share with the open source community, I hope that it is a welcome present in early 2024.
GitHub activity is near and dear to open source project owners and power users. It is motivating for me to see a continuous stream of star and fork activity. 🤩
In this blog, I will share my workflow to build a simple streaming data pipeline using Fluvio to collect and aggregate Star 🌟 and Fork 🎏 activity from GitHub, and to build bots on Slack and/or Discord that send automated updates on Star and Fork changes.
Navigation
About Fluvio
Fluvio Open Source
Configurations
Pick the workflow that is relevant to you.
Shortcut: All the files, configs, and the commands in the blog are in this Git Repository
Workflow
- Fluvio Local Setup - Self Hosted
- API Keys and Secrets
- Inbound GitHub Connector
- Outbound Discord Connector
- Outbound Slack Connector
Fluvio Docs Reference
Fluvio Open Source
Fluvio is a data streaming system written in Rust and WebAssembly. We have been building Fluvio for nearly 5 years. It is built from the ground up as a cloud native distributed streaming system. We are building ferociously to release stateful stream processing and time window based materializations, with support for several WebAssembly compatible languages, soon!
We have been tracking various GitHub activities including Stars and Forks on our Fluvio Open Source repository using Fluvio. We have channels on our company Slack and community Discord where we get continuous and real-time alerts on Star 🌟 and Fork 🎏 activity.
Of course, you can get a lot more than stars and forks.
If you would like a full-blown GitHub dashboard that you can implement, ask for it in the comments and I will open source it if folks want it.
Following this blog, you will have your very own streaming data pipeline to get Star 🌟 and Fork 🎏 activity on your own Slack or Discord channel. Let's go!
Configurations
Fluvio Cluster Deployment: This tutorial will show you how to self host everything locally. In another tutorial, I will share this workflow using our managed cloud if you'd like to try that flow.
GitHub API: The inbound data comes from the GitHub API and you will need your GitHub API Key to get higher query limits.
As per GitHub docs, GitHub Apps authenticating with an installation access token use the installation's minimum rate limit of 5,000 requests per hour. If the installation is on a GitHub Enterprise Cloud organization, the installation has a rate limit of 15,000 requests per hour.
If there is no access token then you would be limited to 60 requests per hour. The default config of this blog will assume a slower frequency to work without the access token.
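The rate limits above translate directly into a lower bound for the polling interval. Here is a minimal sketch of that arithmetic in shell; the function name min_interval_seconds is my own, not part of Fluvio or GitHub tooling, and it assumes the connector makes exactly one request per interval.

```shell
# Smallest whole-second polling interval that stays within an hourly request budget.
# min_interval_seconds is a hypothetical helper name used only for this sketch.
min_interval_seconds() {
  limit_per_hour=$1
  # Ceiling division: spread 3600 seconds over the allowed number of requests.
  echo $(( (3600 + limit_per_hour - 1) / limit_per_hour ))
}

min_interval_seconds 60     # no access token: prints 60
min_interval_seconds 5000   # with an access token: prints 1
```

Anything slower than the printed value leaves headroom for other requests made from the same account or IP address.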
SlackBot / DiscordBot: In terms of the bots sending you updates, you can have them run on Discord or Slack or both. For the bots there is a pretty simple workflow to create applications to interact with the Slack and Discord APIs. I will share the relevant workflow in this blog and provide references for the official docs at the bottom of the post.
Fluvio Local Setup - Self Hosted
Install and launch Fluvio
Fluvio installation is managed by the Fluvio Version Manager, fvm for short. To install Fluvio, run the command:
curl -fsS https://hub.infinyon.cloud/install/install.sh | bash
Copy and run the last line of the install log on the terminal to add the install directory to the PATH variable.
Start a local Fluvio cluster
fluvio cluster start
Once the cluster is running, you will need to download the connectors and the smart modules. I like to organize them in a single working directory.
mkdir github-stargazer-local && cd "$_"
Create a free account on InfinyOn Cloud
InfinyOn Cloud Hub is a repository of pre-built connectors, smartmodules and other workflow components.
Create a free account using the InfinyOn Cloud sign-up page to access the InfinyOn Hub.
Download connectors
We have a full-blown development kit to build connectors or integrations for practically any custom data source or sink. For this workflow we will use a couple of prebuilt connectors to accomplish our task.
Search available connectors:
cdk hub list
You would see output like this:
CONNECTOR Visibility
infinyon-labs/graphite-sink@0.1.2 public
infinyon/duckdb-sink@0.1.0 public
infinyon/http-sink@0.2.6 public
infinyon/http-source@0.3.1 public
infinyon/ic-webhook-source@0.1.2 public
infinyon/kafka-sink@0.2.7 public
infinyon/kafka-source@0.2.5 public
infinyon/mqtt-source@0.2.5 public
infinyon/sql-sink@0.3.3 public
Download the http source and sink connectors:
http source
cdk hub download infinyon/http-source@0.3.1
http sink
cdk hub download infinyon/http-sink@0.2.6
Download smart modules
Smart Modules are WebAssembly based data transformers. Similar to connectors, we have a full-blown development kit to build custom data transformation logic. In this case we will use a couple of prebuilt smart modules.
Search available smart modules:
fluvio hub sm list
You would see output like this:
SMARTMODULE Visibility
infinyon-labs/array-map-json@0.1.0 public
infinyon-labs/dedup-filter@0.0.2 public
infinyon-labs/json-formatter@0.1.0 public
infinyon-labs/key-gen-json@0.1.0 public
infinyon-labs/regex-map-json@0.1.1 public
infinyon-labs/regex-map@0.1.0 public
infinyon-labs/rss-json@0.1.0 public
infinyon-labs/stars-forks-changes@0.1.2 public
infinyon/jolt@0.3.0 public
infinyon/json-sql@0.2.1 public
infinyon/regex-filter@0.1.0 public
Download the stars-forks-changes and jolt smart modules:
stars-forks-changes
fluvio hub sm download infinyon-labs/stars-forks-changes@0.1.2
jolt
fluvio hub sm download infinyon/jolt@0.3.0
That's all you will need for setup. Topics will be created automatically by the connectors.
If you want to remove the Fluvio cluster at any point, first shut down the connectors (commands in the sections below), then run:
fluvio cluster delete
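The teardown order can be sketched as a small script. This is only a sketch: the run and teardown helpers and the RUN variable are names I made up for illustration, and the connector names are the meta.name values from the configs later in this post. By default it only prints the commands so you can review them first.

```shell
# Connector names are the meta.name values from the configs in this post.
CONNECTORS="github-stars-inbound discord-stars-outbound slack-stars-outbound"

# Print each command by default; set RUN=1 to actually execute it.
run() {
  if [ "${RUN:-0}" = "1" ]; then "$@"; else echo "$@"; fi
}

# Shut down every connector first, then delete the cluster.
teardown() {
  for name in $CONNECTORS; do
    run cdk deploy shutdown --name "$name"
  done
  run fluvio cluster delete
}

teardown
```

If you only deployed one of the two outbound connectors, drop the other name from CONNECTORS before running with RUN=1.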
Apps and API Keys:
We will store the API keys and secrets in a file for the self hosted deployment.
Create a file named secrets.txt using your favourite text editor and add the following variables.
Fill in each value once you have created the corresponding key in the sections below.
DISCORD_TOKEN=
SLACK_TOKEN=
GITHUB_TOKEN=
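If you prefer the terminal to a text editor, the same file can be created with a few lines of shell. This is a sketch under my own assumptions: the existence check and the chmod 600 step are precautions I am adding, not something the connectors require.

```shell
# Create secrets.txt with empty placeholders unless it already exists,
# so that tokens you have already filled in are not overwritten.
if [ ! -e secrets.txt ]; then
  cat > secrets.txt <<'EOF'
DISCORD_TOKEN=
SLACK_TOKEN=
GITHUB_TOKEN=
EOF
fi

# Restrict the file to the current user, since it will hold credentials.
chmod 600 secrets.txt
```

It is also worth keeping secrets.txt out of version control if your working directory is a Git repository.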
Inbound GitHub Connector
To connect to the GitHub API, you can create an API key based on the documentation here: https://docs.github.com/en/authentication/keeping-your-account-and-data-secure/managing-your-personal-access-tokens
Create a file and call it github.yaml with the following configuration pattern:
You need to put in the API endpoint of your repository and adjust the interval. With an API key you can poll as often as every 1s; without one, 60s is the fastest interval that stays within the limit of 60 requests per hour. The configuration below uses 60s.
apiVersion: 0.1.0
meta:
  version: 0.3.1
  name: github-stars-inbound
  type: http-source
  topic: github-stars
http:
  endpoint: 'https://api.github.com/repos/[your_org/your_repo]'
  method: GET
  interval: 60s
transforms:
  - uses: infinyon/jolt@0.3.0
    with:
      spec:
        - operation: shift
          spec:
            "stargazers_count": "stars"
            "forks_count": "forks"
Run the http-source connector using the configuration with the following command:
You can skip the secrets parameter if you have not set a GitHub API Key.
cdk deploy start --ipkg infinyon-http-source-0.3.1.ipkg --config github.yaml --secrets secrets.txt
All configuration in the context of Fluvio data flows is YAML based. We will have one configuration file for each source and sink system to deploy the connectors.
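If the connector does not seem to produce records, it can help to check the endpoint outside Fluvio. Below is a small sketch; github_repo_endpoint is a hypothetical helper name, and your_org and your_repo are placeholders for your own values. The commented curl line assumes the API returns pretty-printed JSON, one field per line.

```shell
# Build the GitHub REST API endpoint for a repository.
# The owner and repo arguments are placeholders for your own values.
github_repo_endpoint() {
  printf 'https://api.github.com/repos/%s/%s\n' "$1" "$2"
}

# Example (needs network access): fetch the repository document and
# show the two counters that the jolt transform picks up.
# curl -fsS "$(github_repo_endpoint your_org your_repo)" | grep -E '"(stargazers_count|forks_count)"'
```

The printed URL is the value that goes into the endpoint field of github.yaml.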
Inspect or Shut Down the GitHub Connector:
If you want to see the status, you can run
cdk deploy list
If you want to shut down the connector, you can run
cdk deploy shutdown --name github-stars-inbound
Outbound Discord Connector
To get the alerts for stars and forks in a Discord channel you need:
- a Discord server with admin access
- a new or existing Discord Channel for the alerts
- a Discord Application
Create a Discord webhook (see the Discord Apps docs): go into Server Settings -> Integrations -> New Webhook, name the webhook, pick the channel, and copy the webhook URL.
The Discord token is the part of the webhook URL that follows https://discord.com/api/webhooks/. Add it to secrets.txt as DISCORD_TOKEN.
Create a configuration file to connect to Discord - call it discord.yaml:
apiVersion: 0.1.0
meta:
  version: 0.2.6
  name: discord-stars-outbound
  type: http-sink
  topic: github-stars
  secrets:
    - name: DISCORD_TOKEN
http:
  endpoint: "https://discord.com/api/webhooks/${{ secrets.DISCORD_TOKEN }}"
  headers:
    - "Content-Type: application/json"
transforms:
  - uses: infinyon-labs/stars-forks-changes@0.1.2
    lookback:
      last: 1
  - uses: infinyon/jolt@0.3.0
    with:
      spec:
        - operation: shift
          spec:
            # Discord webhooks read the message body from "content" (Slack uses "text")
            "result": "content"
Start the Discord Connector:
cdk deploy start --ipkg infinyon-http-sink-0.2.6.ipkg --config discord.yaml --secrets secrets.txt
You will now receive notifications on Stars and Forks activity in the Discord channel that you chose.
Inspect or Shut Down the Discord Connector:
If you want to see the status, you can run
cdk deploy list
If you want to shut down the connector, you can run
cdk deploy shutdown --name discord-stars-outbound
Outbound Slack Connector
To get the alerts for stars and forks in a Slack channel you need:
- a slack workspace with admin access
- a new or existing Slack Channel for the alerts
- a Slack Application
Create a Slack Application: Follow the steps in the quickstart to create a Slack application, activate Incoming Webhooks under Features, install the app in your workspace, and copy the webhook URL.
The Slack token is the part of the webhook URL that follows https://hooks.slack.com/services/. Add it to secrets.txt as SLACK_TOKEN.
Create a configuration file to connect to Slack - call it slack.yaml:
apiVersion: 0.1.0
meta:
  version: 0.2.6
  name: slack-stars-outbound
  type: http-sink
  topic: github-stars
  secrets:
    - name: SLACK_TOKEN
http:
  endpoint: "https://hooks.slack.com/services/${{ secrets.SLACK_TOKEN }}"
  headers:
    - "Content-Type: application/json"
transforms:
  - uses: infinyon-labs/stars-forks-changes@0.1.2
    lookback:
      last: 1
  - uses: infinyon/jolt@0.3.0
    with:
      spec:
        - operation: shift
          spec:
            "result": "text"
Start the Slack Connector:
cdk deploy start --ipkg infinyon-http-sink-0.2.6.ipkg --config slack.yaml --secrets secrets.txt
You will start receiving notifications on Stars and Forks activity in the Slack channel that you chose.
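If nothing shows up in the channel, you can test the webhook on its own, without waiting for a new star. The sketch below builds the same one-field JSON body that the sink posts after the jolt shift. The helper names json_escape and webhook_payload are my own, and the message text is a made-up example rather than the actual output of the stars-forks-changes smart module.

```shell
# Escape backslashes and double quotes so the message is safe inside a JSON string.
# (Newlines and other control characters are not handled in this sketch.)
json_escape() {
  printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g'
}

# Build a one-field JSON body: the first argument is the field name,
# the second is the message, e.g. {"text":"..."} for Slack.
webhook_payload() {
  printf '{"%s":"%s"}\n' "$1" "$(json_escape "$2")"
}

webhook_payload text 'Stars: 10'   # prints {"text":"Stars: 10"}
```

To send it, pass the payload to curl with the same header as in slack.yaml, for example: curl -X POST -H "Content-Type: application/json" -d "$(webhook_payload text 'Hello from Fluvio')" "https://hooks.slack.com/services/$SLACK_TOKEN", with SLACK_TOKEN set in your shell.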
Inspect or Shut Down the Slack Connector:
If you want to see the status, you can run
cdk deploy list
If you want to shut down the connector, you can run
cdk deploy shutdown --name slack-stars-outbound
Fluvio Docs Reference
Here are some relevant docs that you can look at for further context:
InfinyOn Docs Master
GitHub to Slack
GitHub to Discord
Fluvio CLI Docs