Kenyan households face soaring food prices and inflation. By building a robust ETL pipeline, dimensional model and visual dashboards, I analyzed how inflation impacts staple food prices across Kenyan markets from 2006 to 2024.
Why This Project Matters
- Food consistently makes up 30–40% of Kenya’s consumer basket, driving inflation volatility.  
- By analyzing monthly market data, we uncover inflation’s impacts on food security and policy effectiveness.
Project Overview
Full project code on Github
Tech stack:
- Python
- Pandas
- PostgreSQL (star schema)
- Apache Airflow
- Grafana
Data sources:
- Food prices: HDX CSV from WFP/Humanitarian Data Exchange
- Inflation: World Bank CPI API for Kenya (2006–2024)
Star Schema:
The cleaned data is structured using a star schema to enable efficient querying and dashboarding. At its center is the fact_prices table, which records transactional price data and connects to dimension tables for time, market and commodity information.
ETL Pipeline Overview
a. Extraction: Collect CSV (food prices, with daily / monthly
sampling) and JSON from World Bank (CPI series).
b. Transformation: Clean dates, normalize numeric types, drop nulls
and dedupe.
c. Schema model: Build dimension and fact tables using Python + psycopg2.
d. Automation: Used Airflow (daily runs) with DAGs etl_food_prices.py and etl_cpi.py.
e. Loading: Tables are loaded into capstone schema in PostgreSQL.
Visualizing in Grafana
By connecting Grafana to PostgreSQL, I crafted multiple dashboards using SQL:
a. Yearly CPI vs Food Prices (2006–2024)
SELECT
TO_TIMESTAMP(d.year::TEXT,'YYYY') AS time,
ROUND(AVG(f.price::NUMERIC),2) AS avg_food_price,
c.cpi::NUMERIC AS cpi
FROM capstone.fact_food_prices f
JOIN capstone.date_dim d ON f.date_key = d.date_key
JOIN capstone.fact_cpi c ON c.year = d.year
WHERE d.year::INT BETWEEN 2006 AND 2024
GROUP BY d.year, c.cpi
ORDER BY d.year;
b. Monthly Price Seasonality (All Years)
SELECT
d.month,
ROUND(AVG(f.price::NUMERIC),2) AS avg_price
FROM capstone.fact_food_prices f
JOIN capstone.date_dim d ON f.date_key = d.date_key
GROUP BY d.month
ORDER BY d.month::INT;
c. Top 10 Most Expensive Commodities (2006–2024)
SELECT
c.commodity,
ROUND(AVG(f.price::NUMERIC),2) AS avg_price
FROM capstone.fact_food_prices f
JOIN capstone.commodity_dim c ON f.commodity_id = c.commodity_id
JOIN capstone.date_dim d ON f.date_key = d.date_key
WHERE d.year::INT BETWEEN 2006 AND 2024
GROUP BY c.commodity
ORDER BY avg_price DESC
LIMIT 10;
d. Market Heatmap (requires latitude/longitude)
SELECT
m.market, m.latitude, m.longitude,
ROUND(AVG(f.price::NUMERIC),2) AS avg_price
FROM capstone.fact_food_prices f
JOIN capstone.market_dim m ON f.market_id = m.market_id
JOIN capstone.date_dim d ON f.date_key = d.date_key
WHERE d.year::INT BETWEEN 2006 AND 2024
GROUP BY m.market, m.latitude, m.longitude
HAVING COUNT(*) > 10;
Dynamic Filtering in Grafana
You can enhance interactivity using:
- Variable placeholders ($commodity, $market), passed into WHERE clauses.
-
Grafana $__timeFrom() / $__timeTo()
to use dashboard time-range picker (e.g. 2008–2022).
Example:
WHERE c.commodity = '$commodity'
AND TO_TIMESTAMP(d.year || '-' || d.month || '-01','YYYY-MM-DD')
BETWEEN $__timeFrom() AND $__timeTo()
Grafana Dashboard
Insights & Observations
- Food prices ,notably maize, legumes and dairy, steadily rose, especially around crises in 2008 and 2020.
- CPI mirrored food price trends but was smoothed, showing lagged effects.
- Seasonal cycles were evident: vegetable prices spiked July–September across years.
- Nairobi vs Mombasa vs Kisumu market differences stood out in the geographic heatmap.
Impact
- Policy relevance: Local inflation trends inform policymakers and NGOs targeting vulnerable regions.
- Data-driven planning: Seasonal insights assist supply chain managers and food security planners.
- Extendable: This framework can integrate price forecasts or poverty impact analysis.
⸻
Post Insights & Next Steps
- I used Airflow to operationalize daily data loads and schema tables.
Future enhancements:
- Build forecasting models.
- Add more granular seasonal dashboard filtering
Conclusion
This project showcases how thoughtful data modeling can transform raw information into actionable insights. By combining clean architecture with real-world relevance, we not only improve technical workflows but also create tools that can support smarter decisions in critical sectors.
Top comments (0)