My data pipeline used to cost $47/month:
- $5 DigitalOcean droplet
- $12 Airtable Pro
- $30 Zapier automation
Now it costs $0. Here's how.
## The Architecture
GitHub Actions (free cron) → Python scraper → JSON files in repo → GitHub Pages
No database. No server. No paid tools. Everything runs on GitHub's free tier.
## What It Does
Every day at 8am UTC:
- Fetches cryptocurrency prices (CoinGecko API — free, no key)
- Fetches stock market data (Alpha Vantage — free key)
- Checks economic indicators (FRED — free key)
- Saves everything to JSON files
- Auto-commits to the repo
- GitHub Pages serves the data as a static API
## The Scraper

```python
import requests
import json
import os
from datetime import datetime

os.makedirs('data', exist_ok=True)

def save(filename, data):
    """Write a dict as pretty-printed JSON under data/."""
    with open(f'data/{filename}', 'w') as f:
        json.dump(data, f, indent=2)
    print(f'Saved {filename}')

# 1. Crypto prices (CoinGecko — no key required)
crypto = requests.get(
    'https://api.coingecko.com/api/v3/simple/price',
    params={'ids': 'bitcoin,ethereum,solana', 'vs_currencies': 'usd', 'include_24hr_change': 'true'}
).json()
save('crypto.json', {'timestamp': datetime.utcnow().isoformat(), 'prices': crypto})

# 2. Stock prices (needs a free API key from alphavantage.co)
API_KEY = os.environ.get('ALPHA_VANTAGE_KEY', 'demo')
for symbol in ['AAPL', 'MSFT', 'GOOGL']:
    stock = requests.get(
        'https://www.alphavantage.co/query',
        params={'function': 'GLOBAL_QUOTE', 'symbol': symbol, 'apikey': API_KEY}
    ).json()
    save(f'stock_{symbol.lower()}.json', stock)

# 3. Economic data (skipped entirely if no FRED key is set)
FRED_KEY = os.environ.get('FRED_KEY', '')
if FRED_KEY:
    for series in ['CPIAUCSL', 'UNRATE', 'GDP']:
        data = requests.get(
            'https://api.stlouisfed.org/fred/series/observations',
            params={'series_id': series, 'api_key': FRED_KEY, 'file_type': 'json', 'limit': 5, 'sort_order': 'desc'}
        ).json()
        save(f'econ_{series.lower()}.json', data)

print(f'Pipeline complete: {datetime.utcnow()}')
```
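The `save()` helper is the only plumbing in the script. A quick sanity check of it in isolation, run in a throwaway temp directory so it doesn't touch your repo:

```python
import json
import os
import tempfile

# Same helper as in pipeline.py, exercised standalone.
def save(filename, data):
    with open(f'data/{filename}', 'w') as f:
        json.dump(data, f, indent=2)
    print(f'Saved {filename}')

os.chdir(tempfile.mkdtemp())       # work in a scratch directory
os.makedirs('data', exist_ok=True)

save('demo.json', {'ok': True, 'count': 2})

with open('data/demo.json') as f:
    restored = json.load(f)
```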
## The GitHub Action

```yaml
name: Daily Data Pipeline
on:
  schedule:
    - cron: '0 8 * * *'  # Daily at 8am UTC
  workflow_dispatch:

permissions:
  contents: write  # lets the built-in GITHUB_TOKEN push the data commits

jobs:
  pipeline:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: '3.11'
      - run: pip install requests
      - run: python pipeline.py
        env:
          ALPHA_VANTAGE_KEY: ${{ secrets.ALPHA_VANTAGE_KEY }}
          FRED_KEY: ${{ secrets.FRED_KEY }}
      - name: Commit & Push
        run: |
          git config user.name 'Pipeline Bot'
          git config user.email 'bot@github.com'
          git add data/
          git diff --staged --quiet || git commit -m 'Data update'
          git push
```
## Accessing the Data
Enable GitHub Pages on your repo → your data is available at:
`https://yourusername.github.io/repo-name/data/crypto.json`
Free static API. No server.
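Any client can then consume the JSON directly. As a sketch, here's parsing a payload shaped like what `pipeline.py` writes to `crypto.json` (the numbers below are placeholders, not real prices):

```python
import json

# Placeholder payload matching the shape save() writes to data/crypto.json.
raw = """
{
  "timestamp": "2024-01-01T08:00:00",
  "prices": {"bitcoin": {"usd": 40000.0, "usd_24h_change": 1.2}}
}
"""
payload = json.loads(raw)

btc = payload["prices"]["bitcoin"]
summary = f"BTC ${btc['usd']:,.0f} ({btc['usd_24h_change']:+.1f}% over 24h)"
print(summary)
```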
## Cost Comparison
| Component | Before | After |
|---|---|---|
| Server | $5/mo (DO) | $0 (GitHub Actions) |
| Database | $12/mo (Airtable) | $0 (JSON + Git) |
| Automation | $30/mo (Zapier) | $0 (GitHub Actions) |
| Total | $47/mo | $0 |
| Yearly | $564 | $0 |
## Limitations
- GitHub Actions free tier: 2,000 min/month (plenty for daily runs)
- Storage: use git — each data point is a commit (free until repo hits 5GB)
- No real-time: minimum cron interval is 5 minutes
- Cold start: Actions can delay 5-15 min from scheduled time
For most data collection needs, these limitations don't matter.
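One caveat worth handling: the scraper as written has no error handling, so a single flaky API response fails the whole run. A minimal retry wrapper (`with_retries` is a hypothetical helper, not part of the original script) that each `requests.get` call could be wrapped in:

```python
import time

def with_retries(fn, attempts=3, delay=1.0):
    """Call fn(), retrying on any exception with a fixed delay between tries."""
    for i in range(attempts):
        try:
            return fn()
        except Exception:
            if i == attempts - 1:
                raise  # out of attempts: re-raise the last error
            time.sleep(delay)

# Demo with a stub that fails twice, then succeeds — in the pipeline you'd
# pass e.g. lambda: requests.get(url, params=...).json() instead.
calls = {"n": 0}
def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("transient failure")
    return {"status": "ok"}

result = with_retries(flaky, attempts=3, delay=0)
```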
## Template
Fork my template: github-action-scraper-template
What data would you track with a free daily pipeline?