DEV Community

Adhir Kirtikar

Air Quality - Pollutant Index - India

This mini project shows the Air Quality Index (industrial air pollution) for various locations in India in a Tableau Public dashboard.
A Python script pulls the data from the data.gov.in API, cleans it, and loads it into Google Sheets.
The data is refreshed daily on a schedule, or manually on demand, using GitHub Actions or AWS Lambda.

My Workflow

GitHub Action "run-python.yml"

  • Google credentials are stored in GitHub Actions Environment Secrets.
  • data.gov.in API Key is also stored in GitHub Actions Environment Secrets.
  • A GitHub Action is created that can run manually or on a schedule [12PM UTC (8PM SGT)].
  • Python 3.9 is set up using actions/setup-python@v2.3.0 and the pip packages are cached.
  • Dependencies are installed using py-actions/py-dependency-install@v2 based on requirements.txt (google auth, pygsheets & pandas).
  • The Environment Secrets are exported to environment variables.
  • Finally, the Python script is run with the environment variables passed as parameters.
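The last step hands the two secrets to the script as positional command-line arguments. How the script reads them is not shown in this post, so the following is only a minimal sketch of one way to do it; the function name `read_parameters` is hypothetical.

```python
import sys


def read_parameters(argv):
    """Return (data.gov.in API key, Google credentials) from the argument list.

    Hypothetical helper: the real script's argument handling is not shown here.
    """
    if len(argv) != 3:
        raise SystemExit(
            'usage: python "Air Quality Index India.py" '
            "<data.gov.in API key> <Google credentials>"
        )
    # argv[0] is the script name; the workflow passes the two secrets after it.
    return argv[1], argv[2]


# Example call with placeholder values (in the workflow this would be sys.argv).
api_key, google_credentials = read_parameters(["script.py", "demo-key", "demo-creds"])
```

Failing fast on a wrong argument count makes a missing secret show up as a clear error in the workflow log instead of a confusing API failure later on.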

Submission Category:

Wacky Wildcards

  • I built this workflow to replace or complement the AWS Lambda function that runs the Python script at 8AM SGT.
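GitHub Actions cron schedules are written in UTC, so it helps to check how the schedule lines up with the Lambda run in Singapore time. The sketch below assumes SGT is a fixed UTC+8 offset (Singapore has no daylight saving time); the helper name `utc_hour_to_sgt` is made up for illustration.

```python
from datetime import datetime, timedelta, timezone

# Singapore Time is UTC+8 all year round.
SGT = timezone(timedelta(hours=8), "SGT")


def utc_hour_to_sgt(hour_utc):
    """Convert an hour of the day in UTC to the wall-clock time in SGT."""
    moment = datetime(2021, 1, 1, hour_utc, 0, tzinfo=timezone.utc)
    return moment.astimezone(SGT).strftime("%H:%M")


action_run = utc_hour_to_sgt(12)  # cron '0 12 * * *' -> "20:00" SGT
lambda_run = utc_hour_to_sgt(0)   # 00:00 UTC -> "08:00" SGT
```

If both the Lambda function and the workflow stay enabled, the two runs would sit 12 hours apart.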

YAML File or Link to Code

run-python.yml

# This is a basic workflow to help you get started with Actions

name: Run Python

# Controls when the action will run. 
on:
  schedule:
    # run at 12PM UTC (8PM SGT)
    - cron: '0 12 * * *'

  # Allows you to run this workflow manually from the Actions tab
  workflow_dispatch:

# A workflow run is made up of one or more jobs that can run sequentially or in parallel
jobs:
  # This workflow contains a single job called "build"
  build:
    # use environment (named as "env") defined in the GitHub repository settings
    environment: env

    # The type of runner that the job will run on
    runs-on: ubuntu-latest

    # Steps represent a sequence of tasks that will be executed as part of the job
    steps:
      # Checks-out your repository under $GITHUB_WORKSPACE, so your job can access it
      -
        name: Checkout
        uses: actions/checkout@v2

      # Set up Python 3.9 environment and cache pip packages
      - 
        name: Setup Python 3.9
        uses: actions/setup-python@v2.3.0
        with:
          python-version: '3.9'
          cache: 'pip'
        # This action sets up a Python environment for use in actions by:
        #   optionally installing and adding to PATH a version of Python that is already installed in the tools cache.
        #   optionally caching dependencies for pip and pipenv.

      # Install dependencies mentioned in the requirements.txt
      - 
        name: Install dependencies
        uses: py-actions/py-dependency-install@v2
        # This GitHub Action installs Python package dependencies from a user-defined requirements.txt file path 
        # with pip, setuptools, and wheel installs/updates during execution. 
        # A Python package environment report is displayed at the end of Action execution.
        # Uses path requirements.txt and updates pip, setuptools, and wheel before the install.

      # Expose the environment secrets as environment variables and pass them to the Python script as arguments
      -
        name: Get Parameters & Run Air Quality Index India Python script
        shell: bash
        env:
          GOVINAPIKEY: ${{ secrets.DATA_GOV_IN_API_KEY }}
          GDRIVEAPIKEY: ${{ secrets.GDRIVE_API_CREDENTIALS }}
        run: |
          python "Air Quality Index India.py" "$GOVINAPIKEY" "$GDRIVEAPIKEY"

Full repository is here:

AdhirKirtikar / Air-Quality-Index-India-GitHub-Actions

Repo for 2021 GitHub Actions Hackathon on DEV

Python script "Air Quality Index India.py"

  • The script connects to the data.gov.in API using the API key passed as a parameter.
  • It then pulls the latest AQI data for India and stores it in a pandas DataFrame.
  • The data is cleaned and formatted, and the columns are renamed. Nulls are replaced by 0.
  • Google sheet is…
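The cleaning step described above (rename columns, replace nulls with 0) can be sketched roughly as follows. The real script does this with pandas; this version uses plain Python dictionaries to stay dependency-free. The column names in `RENAMES`, the set of values treated as missing, and the sample record are all hypothetical, since the actual mapping is not shown in this post.

```python
# Hypothetical raw-to-clean column names; the real mapping is not shown in the post.
RENAMES = {"pollutant_id": "Pollutant", "pollutant_avg": "Average"}

# Values treated as "null" in this sketch (an assumption, not the script's rule).
MISSING = (None, "", "NA")


def clean_records(records):
    """Rename columns and replace missing values with 0 in a list of dicts."""
    cleaned = []
    for record in records:
        row = {}
        for key, value in record.items():
            name = RENAMES.get(key, key)  # keep the name if no rename is defined
            row[name] = 0 if value in MISSING else value
        cleaned.append(row)
    return cleaned


# Made-up sample record, for illustration only.
sample = [{"city": "Mumbai", "pollutant_id": "PM2.5", "pollutant_avg": None}]
cleaned_sample = clean_records(sample)
```

With pandas, the equivalent would typically be a `rename(columns=...)` followed by `fillna(0)` on the DataFrame.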

Additional Resources / Info

Tableau Public Dashboard that uses the data from the generated Google Sheets: "Air Quality - Pollutant Index - India"
