This mini project shows the Air Quality Index (Industrial Air Pollution) from various locations in India in a Tableau Public dashboard.
The data is sourced from the data.gov.in API, cleaned with Python, and loaded into Google Sheets.
The data is updated daily (or manually) using GitHub Actions or AWS Lambda.
My Workflow
GitHub Action "run-python.yml"
- Google credentials are stored in GitHub Actions Environment Secrets.
- The data.gov.in API key is also stored in GitHub Actions Environment Secrets.
- A GitHub Action is created that can run manually or on a schedule [12PM UTC (8PM SGT)].
- Python 3.9 is set up using actions/setup-python@v2.3.0, and the pip packages are cached.
- Dependencies are installed using py-actions/py-dependency-install@v2 based on requirements.txt (google-auth, pygsheets & pandas).
- The Environment Secrets are exported to environment variables.
- Finally, the Python script is run with the environment variables passed as parameters (a sketch of the parameter handling follows this list).
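The script itself is not shown in this post, but a minimal sketch of how those two parameters might be read on the Python side could look like this (the function name, variable names, and the assumption that the Google credential secret holds a JSON string are mine, not taken from the actual script):

import json
import sys

# Hypothetical sketch: read the two positional parameters passed in by the workflow step.
# argv[1] holds the data.gov.in API key; argv[2] is assumed to hold the Google
# service-account credentials as a JSON string.
def read_params(argv=None):
    argv = sys.argv if argv is None else argv
    if len(argv) < 3:
        raise SystemExit('Usage: python "Air Quality Index India.py" <api_key> <gdrive_credentials_json>')
    api_key = argv[1]
    gdrive_credentials = json.loads(argv[2])
    return api_key, gdrive_credentials

if __name__ == "__main__":
    api_key, gdrive_credentials = read_params()
    # Avoid echoing secrets in the Actions log; only confirm they arrived.
    print(f"Received API key ({len(api_key)} chars) and a credentials object with {len(gdrive_credentials)} fields.")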
Submission Category:
Wacky Wildcards
- I tried to use this workflow as a replacement for (or complement to) the AWS Lambda function that runs the Python script at 8AM SGT.
Yaml File or Link to Code
run-python.yml
# This is a basic workflow to help you get started with Actions
name: Run Python

# Controls when the action will run.
on:
  schedule:
    # Run at 12PM UTC (8PM SGT)
    - cron: '0 12 * * *'
  # Allows you to run this workflow manually from the Actions tab
  workflow_dispatch:

# A workflow run is made up of one or more jobs that can run sequentially or in parallel
jobs:
  # This workflow contains a single job called "build"
  build:
    # Use the environment (named "env") defined in the GitHub repository settings
    environment: env
    # The type of runner that the job will run on
    runs-on: ubuntu-latest
    # Steps represent a sequence of tasks that will be executed as part of the job
    steps:
      # Checks out your repository under $GITHUB_WORKSPACE, so your job can access it
      - name: Checkout
        uses: actions/checkout@v2

      # Set up a Python 3.9 environment and cache pip packages.
      # This action sets up a Python environment for use in actions by:
      #   - optionally installing and adding to PATH a version of Python from the tools cache
      #   - optionally caching dependencies for pip and pipenv
      - name: Setup Python 3.9
        uses: actions/setup-python@v2.3.0
        with:
          python-version: '3.9'
          cache: 'pip'

      # Install the dependencies listed in requirements.txt.
      # This action installs Python package dependencies from a user-defined requirements.txt
      # path with pip, and installs/updates pip, setuptools, and wheel before the install.
      # A Python package environment report is displayed at the end of the action's execution.
      - name: Install dependencies
        uses: py-actions/py-dependency-install@v2

      # Run a bash shell and pass the environment secrets to the Python script as parameters
      - name: Get Parameters & Run Air Quality Index India Python script
        shell: bash
        env:
          GOVINAPIKEY: ${{ secrets.DATA_GOV_IN_API_KEY }}
          GDRIVEAPIKEY: ${{ secrets.GDRIVE_API_CREDENTIALS }}
        run: |
          python "Air Quality Index India.py" "$GOVINAPIKEY" "$GDRIVEAPIKEY"
Full repository is here:
AdhirKirtikar / Air-Quality-Index-India-GitHub-Actions
Repo for 2021 GitHub Actions Hackathon on DEV
GitHub Action "run-python.yml"
- Google credentials are stored in GitHub Actions Environment Secrets.
- data.gov.in API Key is also stored in GitHub Actions Environment Secrets.
- A GitHub Action is created that can run manually or on a schedule [12PM UTC (8PM SGT)]
- Python 3.9 is setup using actions/setup-python@v2.3.0 and the pip packages are cached
- Dependencies are installed using py-actions/py-dependency-install@v2 based on requirements.txt (google auth, pygsheets & pandas).
- The Environment Secrets are exported to environment variables.
- Finally, the Python script is run with the environment variables passed as parameters.
Python script "Air Quality Index India.py"
- The script connects to data.gov.in API using API Key passed as a parameter.
- Then it pulls the latest AQI data for India and stores in pandas dataframe.
- The data is cleaned, formatted and the columns are renamed. Nulls are replaced by 0.
- Google sheet isโฆ
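Putting those bullets together, here is a minimal sketch of what the fetch, clean, and upload flow might look like. It sticks to the packages named in requirements.txt (pandas and pygsheets) plus the standard library; the API resource ID, column renames, and spreadsheet name are placeholders, not the project's actual values:

import json
import sys
import tempfile
import urllib.parse
import urllib.request

import pandas as pd
import pygsheets

# Placeholder resource ID; the real AQI dataset ID on data.gov.in would go here.
API_URL = "https://api.data.gov.in/resource/<resource-id>"

def fetch_aqi(api_key: str) -> pd.DataFrame:
    # Pull the latest AQI records for India as JSON and load them into a pandas DataFrame.
    query = urllib.parse.urlencode({"api-key": api_key, "format": "json", "limit": 5000})
    with urllib.request.urlopen(f"{API_URL}?{query}") as resp:
        records = json.load(resp).get("records", [])
    return pd.DataFrame(records)

def clean(df: pd.DataFrame) -> pd.DataFrame:
    # Rename columns (example names only), coerce readings to numbers, and replace nulls with 0.
    df = df.rename(columns={"pollutant_id": "Pollutant", "pollutant_avg": "Average"})
    df["Average"] = pd.to_numeric(df["Average"], errors="coerce")
    return df.fillna(0)

def upload(df: pd.DataFrame, credentials_json: str) -> None:
    # Write the service-account JSON (passed in as a string) to a temp file and authorize pygsheets with it.
    with tempfile.NamedTemporaryFile("w", suffix=".json", delete=False) as f:
        f.write(credentials_json)
        creds_path = f.name
    client = pygsheets.authorize(service_account_file=creds_path)
    sheet = client.open("Air Quality Index India")  # placeholder spreadsheet name
    worksheet = sheet.sheet1
    worksheet.clear()
    worksheet.set_dataframe(df, start="A1")

if __name__ == "__main__":
    # Parameters arrive in the same order as in the workflow's run step.
    api_key, credentials_json = sys.argv[1], sys.argv[2]
    upload(clean(fetch_aqi(api_key)), credentials_json)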
Additional Resources / Info
Tableau Public Dashboard that uses the data from the generated Google Sheets: "Air Quality - Pollutant Index - India"