What if you could understand an entire nation's economic trajectory in a single interactive dashboard? That's exactly what I built with KenyaVista — a Python tool that pulls 20+ years of economic data from the World Bank, analyzes trends, forecasts the future, and generates a stunning interactive HTML report.
In this article, I'll walk through the architecture, the statistical methods, and the key design decisions that make this project both analytically rigorous and recruiter-friendly.
Why This Project?
As a data professional based in Kenya, I wanted to build something that combines:
- Real-world data from authoritative sources (World Bank)
- Statistical rigor — CAGR, trend analysis, forecasting with confidence intervals
- Beautiful visualization — interactive Plotly charts, not static matplotlib
- Software engineering — modular architecture, CLI, tests, CI/CD
The result: a tool that fetches, analyzes, forecasts, and visualizes Kenya's economy in one command.
Architecture
┌─────────────────────────────────────────┐
│           CLI (click + rich)            │
└──────────────┬──────────────────────────┘
               │
┌──────────────▼──────────────────────────┐
│          Data Fetcher (httpx)           │
│            World Bank API v2            │
└──────────────┬──────────────────────────┘
               │
    ┌──────────┼──────────┬──────────┐
    │          │          │          │
┌───▼────┐ ┌───▼────┐ ┌───▼────┐ ┌───▼────┐
│Analyzer│ │Forecast│ │Compare │ │Insights│
└───┬────┘ └───┬────┘ └───┬────┘ └───┬────┘
    │          │          │          │
    └──────────┴─────┬────┴──────────┘
                     │
      ┌──────────────▼──────────────┐
      │     Dashboard Generator     │
      │    Plotly + Tailwind CSS    │
      └─────────────────────────────┘
The system is built as 6 focused modules, each doing one thing well:
- Fetcher — pulls data from World Bank API v2
- Analyzer — computes CAGR, trends, YoY changes, statistical summaries
- Forecaster — ensemble of Linear + Holt's Exponential Smoothing
- Comparator — ranks Kenya against 6 African peers
- Insights — algorithmically identifies key findings
- Dashboard — generates interactive Plotly HTML
Data Layer: World Bank API
The World Bank API is a goldmine of free, well-structured data. Here's how to fetch any indicator:
import httpx

def fetch_indicator(country_codes, indicator, date_range="2000:2024"):
    # The API accepts several ISO codes separated by semicolons
    countries = ";".join(country_codes)
    url = f"https://api.worldbank.org/v2/country/{countries}/indicator/{indicator}"
    params = {"format": "json", "date": date_range, "per_page": 500}
    with httpx.Client() as client:
        resp = client.get(url, params=params, timeout=30)
        resp.raise_for_status()  # fail loudly on HTTP errors
        data = resp.json()
    # A successful response is [metadata, rows]; an error payload has one element
    if len(data) < 2 or data[1] is None:
        return []
    records = []
    for entry in data[1]:
        if entry["value"] is not None:  # skip years with no observation
            records.append({
                "country_code": entry["countryiso3code"],
                "year": int(entry["date"]),
                "value": float(entry["value"]),
            })
    return records
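One detail worth making explicit is the shape of the response: a two-element list holding paging metadata followed by the rows. The sketch below parses one decoded page offline and reports how many pages exist, so a caller could loop when a query exceeds per_page. The function name parse_page and the sample payload are illustrative assumptions, not code from KenyaVista, and the sample values are synthetic.

```python
def parse_page(payload):
    """Return (records, total_pages) from one decoded World Bank JSON page."""
    # Error responses arrive as a one-element list, and rows can be null
    if not isinstance(payload, list) or len(payload) < 2 or payload[1] is None:
        return [], 0
    meta, rows = payload
    records = [
        {
            "country_code": row["countryiso3code"],
            "year": int(row["date"]),
            "value": float(row["value"]),
        }
        for row in rows
        if row["value"] is not None
    ]
    return records, int(meta.get("pages", 1))


# Synthetic payload in the API's [metadata, rows] layout
sample = [
    {"page": 1, "pages": 1, "per_page": 500, "total": 2},
    [
        {"countryiso3code": "KEN", "date": "2023", "value": 1.5},
        {"countryiso3code": "KEN", "date": "2022", "value": None},
    ],
]
records, pages = parse_page(sample)
```

If pages comes back greater than 1, the same request can be repeated with a page parameter until every page has been read.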
KenyaVista tracks 18 indicators across 6 dimensions:
| Dimension | Indicators |
|---|---|
| 💰 GDP & Growth | GDP, GDP Growth %, GDP per Capita |
| 📊 Trade & Finance | Exports, Imports, Total Reserves |
| 👥 Demographics | Population, Growth Rate, Urbanization, Life Expectancy |
| 📚 Education | Literacy Rate, Education Spending |
| 🏥 Health | Health Spending, Child Mortality, Maternal Mortality |
| 🌐 Technology | Internet Users, Mobile Subs, Electricity Access |
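A table like this maps naturally onto a small configuration dictionary. The sketch below covers a subset of the indicators using standard World Bank series codes; whether KenyaVista uses exactly these codes, and the names INDICATORS, COUNTRIES and indicator_url, are assumptions made for illustration.

```python
# Partial, illustrative mapping: dimension -> {label: World Bank series code}
INDICATORS = {
    "GDP & Growth": {
        "GDP (current US$)": "NY.GDP.MKTP.CD",
        "GDP growth (annual %)": "NY.GDP.MKTP.KD.ZG",
        "GDP per capita (current US$)": "NY.GDP.PCAP.CD",
    },
    "Demographics": {
        "Population, total": "SP.POP.TOTL",
        "Life expectancy at birth": "SP.DYN.LE00.IN",
    },
    "Health": {
        "Under-5 mortality (per 1,000 live births)": "SH.DYN.MORT",
    },
    "Technology": {
        "Internet users (% of population)": "IT.NET.USER.ZS",
        "Mobile subscriptions (per 100 people)": "IT.CEL.SETS.P2",
        "Access to electricity (% of population)": "EG.ELC.ACCS.ZS",
    },
}

# ISO3 codes for Kenya and the six peers named in this article
COUNTRIES = ["KEN", "TZA", "UGA", "RWA", "ETH", "NGA", "ZAF"]


def indicator_url(country_codes, indicator):
    """Build the API v2 URL for one indicator and several countries."""
    joined = ";".join(country_codes)
    return f"https://api.worldbank.org/v2/country/{joined}/indicator/{indicator}"


url = indicator_url(["KEN", "TZA"], INDICATORS["Demographics"]["Population, total"])
```

Keeping the codes in one place means adding an indicator is a one-line change rather than a new function.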
Analysis Engine
CAGR (Compound Annual Growth Rate)
The most important single-number summary of a time series:
def compute_cagr(start_value, end_value, years):
    if start_value <= 0 or end_value <= 0 or years <= 0:
        return None
    return (end_value / start_value) ** (1 / years) - 1
For Kenya's GDP: from ~$12.7B (2000) to ~$104B (2023), that's 23 years of compounding and a CAGR of about 9.6%, impressive by any standard.
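As a worked check, the sketch below plugs the rounded endpoints quoted above into the formula; they give roughly 9.6%, and small differences from any quoted figure come down to rounding of the endpoints. It also shows a year-over-year helper, since the analyzer reports YoY changes as well. The name yoy_changes and the (year, value) tuple layout are assumptions for illustration.

```python
def compute_cagr(start_value, end_value, years):
    if start_value <= 0 or end_value <= 0 or years <= 0:
        return None
    return (end_value / start_value) ** (1 / years) - 1


def yoy_changes(series):
    """Fractional change between consecutive (year, value) points."""
    return [
        (year, (value - prev_value) / prev_value)
        for (_, prev_value), (year, value) in zip(series, series[1:])
        if prev_value != 0
    ]


# 2000 -> 2023 spans 23 compounding periods, not 24 data points
cagr = compute_cagr(12.7, 104.0, 2023 - 2000)
print(f"{cagr:.1%}")  # 9.6%

changes = yoy_changes([(2000, 100.0), (2001, 110.0)])
```

Counting periods rather than data points matters: dividing by 24 instead of 23 would understate the growth rate.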
Trend Detection
I use linear regression to determine if an indicator is increasing, decreasing, or flat:
import numpy as np

def compute_trend_direction(values, flat_tol=0.0):
    # values: list of (year, value) pairs
    years = np.array([v[0] for v in values], dtype=float)
    vals = np.array([v[1] for v in values], dtype=float)
    x_mean, y_mean = np.mean(years), np.mean(vals)
    ss_xy = np.sum((years - x_mean) * (vals - y_mean))
    ss_xx = np.sum((years - x_mean) ** 2)
    if ss_xx == 0:
        return None  # fewer than two distinct years: slope is undefined
    slope = ss_xy / ss_xx
    # R² tells us how well the linear model fits
    y_pred = slope * years + (y_mean - slope * x_mean)
    ss_res = np.sum((vals - y_pred) ** 2)
    ss_tot = np.sum((vals - y_mean) ** 2)
    r_squared = 1 - (ss_res / ss_tot) if ss_tot > 0 else 0
    # flat_tol is an assumed cutoff: slopes within it count as "flat"
    if abs(slope) <= flat_tol:
        direction = "flat"
    else:
        direction = "increasing" if slope > 0 else "decreasing"
    return {"slope": slope, "r_squared": r_squared, "direction": direction}
R² measures how much of the variation in the series the straight line explains, so it indicates how cleanly linear the trend is. Kenya's internet adoption has an R² > 0.95, a very clean upward trend.
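To see what R² adds beyond the slope, the sketch below fits two synthetic series that both rise over time: one perfectly linear, one zig-zagging around the same line. It uses only the standard library; linear_fit and both series are made up for illustration and are not KenyaVista data.

```python
def linear_fit(points):
    """Least-squares slope, intercept and R² for (year, value) pairs."""
    n = len(points)
    x_mean = sum(x for x, _ in points) / n
    y_mean = sum(y for _, y in points) / n
    ss_xx = sum((x - x_mean) ** 2 for x, _ in points)
    ss_xy = sum((x - x_mean) * (y - y_mean) for x, y in points)
    slope = ss_xy / ss_xx
    intercept = y_mean - slope * x_mean
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in points)
    ss_tot = sum((y - y_mean) ** 2 for _, y in points)
    r_squared = 1 - ss_res / ss_tot if ss_tot > 0 else 0.0
    return {"slope": slope, "intercept": intercept, "r_squared": r_squared}


# Synthetic series: same underlying line, with and without a +/-5 zig-zag
clean = [(2000 + i, 2 * i + 1) for i in range(10)]
noisy = [(2000 + i, 2 * i + 1 + (5 if i % 2 else -5)) for i in range(10)]

clean_fit = linear_fit(clean)
noisy_fit = linear_fit(noisy)
```

Both fits report a positive slope, but only the clean series has an R² near 1, which is the signal that a straight-line extrapolation is reasonable.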
Forecasting: Ensemble Approach
I combine two complementary methods:
1. Linear Regression Forecast
Extends the historical trend with prediction intervals:
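import numpy as np

def linear_forecast(values, horizon=5):
    # values: list of (year, value) pairs; needs at least 3 points
    years = np.array([v[0] for v in values], dtype=float)
    vals = np.array([v[1] for v in values], dtype=float)
    n = len(years)
    x_mean = years.mean()
    ss_xx = np.sum((years - x_mean) ** 2)
    # Fit linear model by ordinary least squares
    slope = np.sum((years - x_mean) * (vals - vals.mean())) / ss_xx
    intercept = vals.mean() - slope * x_mean
    # Residual standard error with n - 2 degrees of freedom
    residuals = vals - (slope * years + intercept)
    se = np.sqrt(np.sum(residuals ** 2) / (n - 2))
    last_year = int(years.max())
    forecasts = []
    for i in range(1, horizon + 1):
        year = last_year + i
        predicted = slope * year + intercept
        # 95% prediction interval using the normal approximation (z = 1.96)
        margin = 1.96 * se * np.sqrt(1 + 1 / n + (year - x_mean) ** 2 / ss_xx)
        forecasts.append({
            "year": year, "value": predicted,
            "lower": predicted - margin,
            "upper": predicted + margin
        })
    return forecasts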
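The square-root term in the margin is what makes the interval widen as the forecast moves away from the centre of the historical data. The standard-library sketch below isolates that term on a synthetic series; prediction_margin and the numbers are illustrative assumptions, and z = 1.96 is the same normal approximation used above (a t quantile would be wider for short series).

```python
import math


def prediction_margin(years, vals, target_year, z=1.96):
    """Half-width of the approximate 95% prediction interval at target_year."""
    n = len(years)
    x_mean = sum(years) / n
    y_mean = sum(vals) / n
    ss_xx = sum((x - x_mean) ** 2 for x in years)
    slope = sum((x - x_mean) * (y - y_mean) for x, y in zip(years, vals)) / ss_xx
    intercept = y_mean - slope * x_mean
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(years, vals))
    se = math.sqrt(ss_res / (n - 2))  # residual standard error
    return z * se * math.sqrt(1 + 1 / n + (target_year - x_mean) ** 2 / ss_xx)


# Synthetic, roughly linear series
years = list(range(2010, 2020))
vals = [3.0, 3.4, 3.1, 3.9, 4.2, 4.0, 4.8, 5.1, 4.9, 5.6]

# Margins for 1 to 5 years past the last observation
margins = [prediction_margin(years, vals, 2019 + h) for h in range(1, 6)]
```

Each extra year of horizon increases the margin, which is why the shaded band on a forecast chart fans out.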
2. Holt's Double Exponential Smoothing
Captures level and trend momentum:
def exponential_smoothing_forecast(values, alpha=0.3, beta=0.1, horizon=5):
    # values: observations in chronological order (at least two)
    level = values[0]
    trend = values[1] - values[0]
    for val in values:
        prev_level = level
        level = alpha * val + (1 - alpha) * (level + trend)
        trend = beta * (level - prev_level) + (1 - beta) * trend
    # Forecast: level + trend * steps_ahead
    return [level + trend * h for h in range(1, horizon + 1)]
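The snippet fixes alpha and beta by hand. One common alternative, shown here only as a sketch, is to pick them by minimising the one-step-ahead squared error over a small grid. The helpers holt_sse and pick_params, the grid values, and the choice to start the loop at the second observation are all assumptions of this example, not KenyaVista's code.

```python
from itertools import product


def holt_sse(values, alpha, beta):
    """Sum of squared one-step-ahead errors for Holt's linear method."""
    level, trend = values[0], values[1] - values[0]
    sse = 0.0
    for val in values[1:]:
        error = val - (level + trend)  # forecast made before seeing val
        sse += error ** 2
        prev_level = level
        level = alpha * val + (1 - alpha) * (level + trend)
        trend = beta * (level - prev_level) + (1 - beta) * trend
    return sse


def pick_params(values, grid=(0.1, 0.3, 0.5, 0.7, 0.9)):
    """Return the (alpha, beta) pair from the grid with the lowest SSE."""
    return min(product(grid, grid), key=lambda ab: holt_sse(values, *ab))


# Synthetic series with a level shift halfway through
series = [10.0, 11.0, 12.5, 13.0, 18.0, 19.5, 21.0, 22.0]
best_alpha, best_beta = pick_params(series)
```

On a perfectly linear series every parameter pair gives zero error, so tuning only matters when the data has bends or shifts like the one above.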
Ensemble
The final forecast averages both methods for the point estimate and takes the envelope of their intervals (the lower of the two lower bounds and the higher of the two upper bounds):
avg_value = (linear_pred + holt_pred) / 2
lower = min(linear_lower, holt_lower)
upper = max(linear_upper, holt_upper)
This tends to be more robust than either method alone: linear catches the long-term trend, Holt's adapts to recent momentum. The envelope interval is deliberately conservative rather than a calibrated 95% band.
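Applied year by year, the three lines above become a small combining function. The sketch below assumes both forecasters return lists of dicts with year, value, lower and upper keys, which matches the linear forecast shown earlier but is an assumption for the Holt side; combine_forecasts is a hypothetical name and the inputs are synthetic.

```python
def combine_forecasts(linear, holt):
    """Average point forecasts and take the envelope of the two intervals."""
    combined = []
    for lin, hw in zip(linear, holt):
        if lin["year"] != hw["year"]:
            raise ValueError("forecasts must cover the same years in order")
        combined.append({
            "year": lin["year"],
            "value": (lin["value"] + hw["value"]) / 2,
            "lower": min(lin["lower"], hw["lower"]),
            "upper": max(lin["upper"], hw["upper"]),
        })
    return combined


# Synthetic one-year example
linear = [{"year": 2024, "value": 10.0, "lower": 8.0, "upper": 12.0}]
holt = [{"year": 2024, "value": 12.0, "lower": 9.0, "upper": 15.0}]
ensemble = combine_forecasts(linear, holt)
```

The year check guards against silently averaging forecasts for different years when the two methods are run with different horizons.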
Peer Comparison
Kenya doesn't exist in a vacuum. Comparing with neighbors provides context:
- 🇹🇿 Tanzania, 🇺🇬 Uganda, 🇷🇼 Rwanda, 🇪🇹 Ethiopia (East African peers)
- 🇳🇬 Nigeria, 🇿🇦 South Africa (continental benchmarks)
The comparator module ranks Kenya for each indicator and generates a radar chart showing strengths and weaknesses:
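def compare_countries(records, indicator_code, year):
    # records: rows for one indicator, shaped like fetch_indicator's output
    by_country = {}
    for rec in records:
        by_country.setdefault(rec["country_code"], {})[rec["year"]] = rec["value"]
    results = []
    for country_code, values in by_country.items():
        value = values.get(year)
        if value is not None:  # skip countries with no data for that year
            results.append({"country_code": country_code,
                            "indicator": indicator_code, "value": value})
    results.sort(key=lambda x: x["value"], reverse=True)
    for i, r in enumerate(results):
        r["rank"] = i + 1
    return results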
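Sorting in descending order puts the largest value first, which suits GDP or internet access but not indicators such as mortality, where a lower value is the better outcome. A minimal sketch of a direction-aware ranking follows; rank_values, the higher_is_better flag and the placeholder country codes are assumptions of this example rather than the project's API.

```python
def rank_values(values_by_country, higher_is_better=True):
    """Map each country code to its rank, with rank 1 as the best performer."""
    ordered = sorted(
        values_by_country.items(),
        key=lambda item: item[1],
        reverse=higher_is_better,
    )
    return {code: rank for rank, (code, _) in enumerate(ordered, start=1)}


# Placeholder codes and synthetic values
values = {"AAA": 10.0, "BBB": 30.0, "CCC": 20.0}

ranks_high = rank_values(values)                         # e.g. GDP per capita
ranks_low = rank_values(values, higher_is_better=False)  # e.g. child mortality
```

The flag could be stored next to each indicator's code so that the radar chart and ranking table agree on what "leading" means.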
Automated Insights
The insights engine scans all analyses and flags notable findings:
- Milestones: "Kenya's population surpassed 50 million"
- Growth leaders: "Internet users grew 15.2% annually"
- Health progress: "Child mortality dropped by 56%"
- Ranking highlights: "Kenya leads peers in mobile subscriptions"
def generate_insights(analyses, forecasts, rank_summary):
    insights = []
    _add_growth_insights(analyses, insights)
    _add_decline_insights(analyses, insights)
    _add_milestone_insights(analyses, insights)
    _add_ranking_insights(rank_summary, insights)
    _add_forecast_insights(forecasts, analyses, insights)
    return sorted(insights, key=severity_order)[:15]
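To make one of these helpers concrete, here is a sketch of how a milestone rule and the severity sort might look. The severity labels, the SEVERITY_ORDER mapping, the function names and the series are all hypothetical; the real helpers in KenyaVista may be structured differently.

```python
# Hypothetical severity labels; lower number sorts first
SEVERITY_ORDER = {"high": 0, "medium": 1, "low": 2}


def first_crossing(series, threshold):
    """First year a (year, value) series reaches threshold from below, else None."""
    for (_, prev), (year, cur) in zip(series, series[1:]):
        if prev < threshold <= cur:
            return year
    return None


def milestone_insight(name, series, threshold, severity="high"):
    year = first_crossing(series, threshold)
    if year is None:
        return None
    return {"severity": severity, "text": f"{name} surpassed {threshold:,} in {year}"}


# Synthetic series, for illustration only
population = [(2018, 48_000_000), (2019, 49_500_000), (2020, 51_000_000)]

insights = [
    {"severity": "low", "text": "placeholder finding"},
    milestone_insight("Population", population, 50_000_000),
]
ranked = sorted(insights, key=lambda item: SEVERITY_ORDER[item["severity"]])
```

Each rule returns either a small dict or None, which keeps the rules independent and easy to unit test.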
The Dashboard
The HTML dashboard is the showpiece — a single self-contained file with:
- KPI cards at the top (GDP, Population, Life Expectancy, etc.)
- Insight cards with color-coded severity
- Ranking table + radar chart
- Per-indicator sections with time series + peer comparison charts
Everything uses Plotly for interactivity (zoom, hover tooltips, toggle traces) and Tailwind CSS for responsive layout.
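The core idea of a single-file dashboard can be sketched without any third-party packages: serialise the chart data to JSON and hand it to Plotly.js loaded from a CDN. This is a simplified stand-in, not the project's generator, which presumably uses the Plotly Python library. The function name, the pinned Plotly.js version in the URL and the Tailwind classes are assumptions for illustration.

```python
import html
import json


def render_chart_page(title, years, values, div_id="chart"):
    """Return a standalone HTML page with one interactive line chart."""
    trace = {"x": years, "y": values, "type": "scatter",
             "mode": "lines+markers", "name": title}
    layout = {"title": {"text": title}, "margin": {"t": 40}}
    safe_title = html.escape(title)
    return f"""<!DOCTYPE html>
<html>
<head>
  <meta charset="utf-8">
  <title>{safe_title}</title>
  <script src="https://cdn.plot.ly/plotly-2.27.0.min.js"></script>
  <script src="https://cdn.tailwindcss.com"></script>
</head>
<body class="bg-slate-50 p-6">
  <h1 class="text-2xl font-bold mb-4">{safe_title}</h1>
  <div id="{div_id}" class="bg-white rounded shadow"></div>
  <script>
    Plotly.newPlot("{div_id}", {json.dumps([trace])}, {json.dumps(layout)});
  </script>
</body>
</html>"""


# Synthetic data; write the result to a file and open it in a browser
page = render_chart_page("Demo indicator", [2000, 2001, 2002], [1.0, 2.0, 4.0])
```

Because the scripts come from a CDN, the file needs an internet connection when opened; embedding the library inline would trade file size for offline use.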
Running It
# Install
pip install -r requirements.txt && pip install -e .
# Full pipeline
kenyavista pipeline
# Or step by step
kenyavista fetch
kenyavista dashboard data/kenya_data.json
kenyavista summary data/kenya_data.json
kenyavista rankings data/kenya_data.json
Key Findings
From the 2,880 data points analyzed:
- GDP grew from $12.7B to $104B (2000–2023), a CAGR of ~9.6%
- Internet users surged from <1% to 40%+ — the fastest-growing indicator
- Mobile subscriptions exceed 100 per 100 people (more phones than people!)
- Child mortality dropped about 62%, from 108 to ~41 per 1,000 live births
- Life expectancy increased from 51 to 62 years
- Electricity access jumped from 16% to 76%
Testing
45 tests covering all modules, running in under 0.3 seconds:
pytest tests/ -v
# 45 passed in 0.28s
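For a flavour of what such tests can look like, here is a sketch of three pytest-style cases for the CAGR helper. They are written for this article as an illustration and are not copied from the project's test suite; the function is repeated so the file runs on its own.

```python
import math


def compute_cagr(start_value, end_value, years):
    if start_value <= 0 or end_value <= 0 or years <= 0:
        return None
    return (end_value / start_value) ** (1 / years) - 1


def test_cagr_doubling_in_one_year():
    # Doubling over a single period is 100% growth
    assert math.isclose(compute_cagr(100, 200, 1), 1.0)


def test_cagr_flat_series_is_zero():
    assert compute_cagr(50, 50, 10) == 0


def test_cagr_rejects_non_positive_inputs():
    assert compute_cagr(0, 100, 5) is None
    assert compute_cagr(100, -1, 5) is None
    assert compute_cagr(100, 100, 0) is None
```

Pure functions like this need no network access or fixtures, which is one reason a whole suite can finish in a fraction of a second.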
What I Learned
- World Bank API is an excellent free data source — no auth, reliable, well-documented
- Averaging two simple forecasting methods tends to be more robust than relying on either one alone
- Self-contained HTML dashboards (Plotly + Tailwind via CDN) are powerful for portfolio projects
- Automated insights add narrative to numbers — much more engaging than raw charts
- Modular architecture makes each piece independently testable
Links
- 🔗 GitHub: github.com/hajirufai/kenyavista
- 🔗 LinkedIn: linkedin.com/in/hajirufai
Have questions about the statistical methods or want to adapt this for another country? Drop a comment below!