Rotich Kelly

How I Automated Crypto Price Tracking with Apache Airflow & CoinGecko

"Because waking up at 2am to check Bitcoin prices isn’t scalable."

The Idea Behind the Project
Crypto markets never sleep, and neither should your data pipeline. I built a fully automated ETL pipeline using Apache Airflow that extracts hourly snapshots of crypto data from CoinGecko, stores them in a PostgreSQL database (on Aiven), and sets the stage for real-time analysis, dashboards, and trading models.

Whether you're tracking Bitcoin moonshots or studying volume dips in altcoins, this pipeline's got your back.

What It Actually Does
Every hour, my pipeline:

  • Pulls real-time data from the CoinGecko API for 15 top
    cryptocurrencies (BTC, ETH, SOL, etc.)

  • Captures price, market cap, and trading volume

  • Stores the data in a time-series friendly PostgreSQL table: crypto.crypto_prices

  • Retries automatically on failure and logs each step in Airflow’s UI

No manual refresh. No scripts to rerun. Just clean, structured data—hour after hour.
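The extract step boils down to flattening one item of CoinGecko's `/coins/markets` response into a database row. The field names below (`current_price`, `market_cap`, `total_volume`) match that endpoint's JSON; the helper name and row shape are my own sketch, not necessarily what the repo uses:

```python
from datetime import datetime, timezone

# Live data would come from, e.g.:
#   https://api.coingecko.com/api/v3/coins/markets?vs_currency=usd&ids=bitcoin,ethereum,solana

def to_row(coin: dict) -> tuple:
    """Flatten one /coins/markets item into a
    (name, symbol, price, market_cap, total_volume, timestamp) row."""
    return (
        coin["name"],
        coin["symbol"].upper(),
        coin["current_price"],
        coin["market_cap"],
        coin["total_volume"],
        datetime.now(timezone.utc).isoformat(timespec="seconds"),
    )

# Sample item, shaped like the live API response:
sample = {
    "name": "Bitcoin",
    "symbol": "btc",
    "current_price": 71231,
    "market_cap": 1_400_000_000_000,
    "total_volume": 32_000_000_000,
}
row = to_row(sample)
print(row[:2])  # ('Bitcoin', 'BTC')
```

Doing the flattening in one pure function keeps the API parsing testable without a database or network connection.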

Stack Breakdown

  • Apache Airflow: DAG orchestration + scheduling

  • Python 3.12: Core logic and scripting

  • CoinGecko API: Crypto market data

  • PostgreSQL (Aiven): Cloud-hosted database storage

  • psycopg2: PostgreSQL database connector

  • python-dotenv: Secure secret management via .env

How I Built It
This project lives in my GitHub repo, and it’s fully reproducible:

Clone the repo
git clone https://github.com/KellyKiprop/Crypto-price-pipeline.git
cd Crypto-price-pipeline
Create a virtual environment & install dependencies


python -m venv venv
source venv/bin/activate # Or venv\Scripts\activate on Windows
pip install -r requirements.txt
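The authoritative pin list lives in the repo's requirements.txt; a minimal set for this stack would look roughly like the following (exact packages and versions may differ, so defer to the repo):

```
apache-airflow
requests
psycopg2-binary
python-dotenv
```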

Set up environment variables
Create a .env file in the root folder with your PostgreSQL config:


DB_NAME=defaultdb
DB_HOST=your-db-host.aivencloud.com
DB_USER=avnadmin
DB_PASSWORD=yourpassword
DB_PORT=17440
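Those variables map directly onto the keyword arguments `psycopg2.connect()` accepts. A minimal sketch (the helper name is mine; in the real project, python-dotenv's `load_dotenv()` would populate `os.environ` from the .env file first):

```python
import os

# In the project itself:
#   from dotenv import load_dotenv
#   load_dotenv()  # reads .env into os.environ

def db_settings(env=os.environ) -> dict:
    """Collect the PostgreSQL connection kwargs psycopg2.connect() expects."""
    return {
        "dbname": env["DB_NAME"],
        "host": env["DB_HOST"],
        "user": env["DB_USER"],
        "password": env["DB_PASSWORD"],
        "port": int(env["DB_PORT"]),
        "sslmode": "require",  # Aiven PostgreSQL enforces TLS connections
    }

# With the .env values above loaded:
cfg = db_settings({
    "DB_NAME": "defaultdb",
    "DB_HOST": "your-db-host.aivencloud.com",
    "DB_USER": "avnadmin",
    "DB_PASSWORD": "yourpassword",
    "DB_PORT": "17440",
})
print(cfg["port"])  # 17440
```

Connecting is then just `psycopg2.connect(**db_settings())`, and no credentials ever land in source control.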

Initialize and start Airflow

airflow db init
airflow webserver --port 8080   # run in one terminal
airflow scheduler               # run in a second terminal

Trigger the DAG from the UI
Visit localhost:8080, toggle coin_price_etl_dag, and trigger a run.
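Structurally, the DAG behind that toggle is an hourly schedule with retries around a single extract-and-load callable. This is a hypothetical sketch (task names and the helper are mine; only the dag_id comes from the post), using the Airflow 2.4+ `schedule` parameter:

```python
from datetime import datetime, timedelta

from airflow import DAG
from airflow.operators.python import PythonOperator

def extract_and_load():
    ...  # call CoinGecko, flatten rows, insert into crypto.crypto_prices

with DAG(
    dag_id="coin_price_etl_dag",
    start_date=datetime(2025, 1, 1),
    schedule="@hourly",  # one snapshot per hour
    catchup=False,       # don't backfill missed hours on startup
    default_args={"retries": 3, "retry_delay": timedelta(minutes=5)},
) as dag:
    PythonOperator(task_id="extract_and_load", python_callable=extract_and_load)
```

The `retries`/`retry_delay` defaults are what give the "retries automatically on failure" behavior, and every attempt is logged per-task in Airflow's UI.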

Testing the Results
Check the crypto.crypto_prices table in your database — you should see hourly updates like:

name     symbol  price   market_cap  total_volume  timestamp
Bitcoin  BTC     71,231  ...         ...           2025-06-07T12:00:00Z
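A plausible schema for that table, written as hypothetical DDL (the repo's actual definition may differ): a composite key on (symbol, timestamp) makes the table naturally time-series friendly and deduplicates repeated runs for the same hour.

```sql
CREATE SCHEMA IF NOT EXISTS crypto;

CREATE TABLE IF NOT EXISTS crypto.crypto_prices (
    name          TEXT        NOT NULL,
    symbol        TEXT        NOT NULL,
    price         NUMERIC,
    market_cap    NUMERIC,
    total_volume  NUMERIC,
    timestamp     TIMESTAMPTZ NOT NULL,
    PRIMARY KEY (symbol, timestamp)
);
```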

You can now plug this into:

  • Grafana dashboards

  • ML models for price prediction

  • Long-term trend analysis
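For example, a trend-analysis query is a plain GROUP BY over the snapshots; a Grafana panel or a model's feature pipeline can consume the same shape:

```sql
-- Hourly average BTC price over the last 7 days
SELECT date_trunc('hour', timestamp) AS hour,
       AVG(price) AS avg_price
FROM crypto.crypto_prices
WHERE symbol = 'BTC'
  AND timestamp > now() - interval '7 days'
GROUP BY hour
ORDER BY hour;
```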

Why This Matters
  • It’s a hands-on intro to Airflow and ETL pipelines

  • Crypto data is noisy and dynamic — perfect for learning real-time data workflows

  • You’ll walk away with a reusable pattern for any public API + DB project

About the Author
Kelly Kiprop
kipropkelly4@gmail.com

This project uses Airflow to extract hourly crypto data from CoinGecko and store it in PostgreSQL. It’s cloud-ready, beginner-friendly, and built for real-time insight.
