DEV Community

Alex Spinov
I Automated My Entire Data Pipeline for $0 (Python + GitHub Actions + Free APIs)

My data pipeline used to cost $47/month:

  • $5 DigitalOcean droplet
  • $12 Airtable Pro
  • $30 Zapier automation

Now it costs $0. Here's how.

The Architecture

GitHub Actions (free cron) → Python scraper → JSON files in repo → GitHub Pages

No database. No server. No paid tools. Everything runs on GitHub's free tier.

What It Does

Every day at 8am UTC:

  1. Fetches cryptocurrency prices (CoinGecko API — free, no key)
  2. Fetches stock market data (Alpha Vantage — free key)
  3. Checks economic indicators (FRED — free key)
  4. Saves everything to JSON files
  5. Auto-commits to the repo
  6. GitHub Pages serves the data as a static API

The Scraper

import requests
import json
import os
from datetime import datetime

os.makedirs('data', exist_ok=True)

def save(filename, data):
    with open(f'data/{filename}', 'w') as f:
        json.dump(data, f, indent=2)
    print(f'Saved {filename}')

# 1. Crypto prices
crypto = requests.get(
    'https://api.coingecko.com/api/v3/simple/price',
    params={'ids': 'bitcoin,ethereum,solana', 'vs_currencies': 'usd', 'include_24hr_change': 'true'},
    timeout=10
).json()
save('crypto.json', {'timestamp': datetime.utcnow().isoformat(), 'prices': crypto})

# 2. Stock prices (need free API key from alphavantage.co)
API_KEY = os.environ.get('ALPHA_VANTAGE_KEY', 'demo')
for symbol in ['AAPL', 'MSFT', 'GOOGL']:
    stock = requests.get(
        'https://www.alphavantage.co/query',
        params={'function': 'GLOBAL_QUOTE', 'symbol': symbol, 'apikey': API_KEY},
        timeout=10
    ).json()
    save(f'stock_{symbol.lower()}.json', stock)

# 3. Economic data
FRED_KEY = os.environ.get('FRED_KEY', '')
if FRED_KEY:
    for series in ['CPIAUCSL', 'UNRATE', 'GDP']:
        data = requests.get(
            'https://api.stlouisfed.org/fred/series/observations',
            params={'series_id': series, 'api_key': FRED_KEY, 'file_type': 'json', 'limit': 5, 'sort_order': 'desc'},
            timeout=10
        ).json()
        save(f'econ_{series.lower()}.json', data)

print(f'Pipeline complete: {datetime.utcnow()}')
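One caveat: the script assumes every request succeeds, and a scheduled job runs unattended, so one transient API hiccup fails the whole day's run. A small retry wrapper helps (a sketch; `fetch_json` and its parameters are my own names, not part of the original pipeline):

```python
import time
import requests

def fetch_json(url, params=None, retries=3, backoff=2.0, getter=requests.get):
    """GET a JSON endpoint with a timeout and simple linear backoff.

    `getter` is injectable so the retry logic can be tested without
    hitting the network; it defaults to requests.get.
    """
    for attempt in range(retries):
        try:
            resp = getter(url, params=params, timeout=10)
            resp.raise_for_status()
            return resp.json()
        except (requests.RequestException, ValueError):
            if attempt == retries - 1:
                raise  # out of attempts: let the Action fail visibly
            time.sleep(backoff * (attempt + 1))
```

Swapping the bare `requests.get(...).json()` calls for `fetch_json(...)` means a flaky endpoint costs a few seconds of retries instead of a missing data point.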

The GitHub Action

name: Daily Data Pipeline
on:
  schedule:
    - cron: '0 8 * * *'  # Daily at 8am UTC
  workflow_dispatch:

jobs:
  pipeline:
    runs-on: ubuntu-latest
    permissions:
      contents: write  # newer repos default GITHUB_TOKEN to read-only
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: '3.11'
      - run: pip install requests
      - run: python pipeline.py
        env:
          ALPHA_VANTAGE_KEY: ${{ secrets.ALPHA_VANTAGE_KEY }}
          FRED_KEY: ${{ secrets.FRED_KEY }}
      - name: Commit & Push
        run: |
          git config user.name 'Pipeline Bot'
          git config user.email 'bot@github.com'
          git add data/
          git diff --staged --quiet || git commit -m 'Data update'
          git push

Accessing the Data

Enable GitHub Pages on your repo → your data is available at:

https://yourusername.github.io/repo-name/data/crypto.json

Free static API. No server.
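Consuming it from another script is a single GET (a sketch; the base URL is a placeholder, and `fetch_latest`/`btc_price` are hypothetical helper names built around the JSON structure `save()` writes above):

```python
import requests

def fetch_latest(base_url):
    """Download the pipeline's most recent crypto snapshot from GitHub Pages."""
    resp = requests.get(f'{base_url}/data/crypto.json', timeout=10)
    resp.raise_for_status()
    return resp.json()

def btc_price(snapshot):
    """Pull the bitcoin USD price out of the saved structure:
    {'timestamp': ..., 'prices': {'bitcoin': {'usd': ...}, ...}}"""
    return snapshot['prices']['bitcoin']['usd']

# Usage (with your own Pages URL):
# snap = fetch_latest('https://yourusername.github.io/repo-name')
# print(f"BTC: ${btc_price(snap):,.2f}")
```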

Cost Comparison

| Component  | Before            | After               |
|------------|-------------------|---------------------|
| Server     | $5/mo (DO)        | $0 (GitHub Actions) |
| Database   | $12/mo (Airtable) | $0 (JSON + Git)     |
| Automation | $30/mo (Zapier)   | $0 (GitHub Actions) |
| Total      | $47/mo            | $0                  |
| Yearly     | $564              | $0                  |

Limitations

  • GitHub Actions free tier: 2,000 min/month (plenty for daily runs)
  • Storage: data lives in git, and each daily run is a commit (free until the repo hits GitHub's ~5GB soft limit)
  • No real-time: minimum cron interval is 5 minutes
  • Cold start: Actions can delay 5-15 min from scheduled time

For most data collection needs, these limitations don't matter.
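A side effect of the commit-per-run design: the repo doubles as a crude time-series store. Something like this (a sketch; `price_history` is my own helper, assuming the JSON layout the scraper writes) walks the git history of crypto.json back into a series:

```python
import json
import subprocess

def price_history(path='data/crypto.json', limit=30):
    """Yield (timestamp, BTC price) pairs, newest first, by reading
    each historical version of the file straight out of git."""
    shas = subprocess.run(
        ['git', 'log', f'-{limit}', '--format=%H', '--', path],
        capture_output=True, text=True, check=True
    ).stdout.split()
    for sha in shas:
        # `git show <sha>:<path>` prints the file as it was at that commit
        blob = subprocess.run(
            ['git', 'show', f'{sha}:{path}'],
            capture_output=True, text=True, check=True
        ).stdout
        snap = json.loads(blob)
        yield snap['timestamp'], snap['prices']['bitcoin']['usd']
```

Run inside a clone of the data repo, `list(price_history())` gives you the last month of prices with no database in sight.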

Template

Fork my template: github-action-scraper-template

Free APIs used: CoinGecko (no key), Alpha Vantage, and FRED, all covered above.
What data would you track with a free daily pipeline?
