Haji Rufai

Posted on May 25 • Originally published at github.com

Building a Kenya Economic Intelligence Dashboard with Python, Plotly & World Bank Data

#plotly #datascience #analytics #python

What if you could understand an entire nation's economic trajectory in a single interactive dashboard? That's exactly what I built with KenyaVista — a Python tool that pulls 20+ years of economic data from the World Bank, analyzes trends, forecasts the future, and generates a stunning interactive HTML report.

In this article, I'll walk through the architecture, the statistical methods, and the key design decisions that make this project both analytically rigorous and recruiter-friendly.

Why This Project?

As a data professional based in Kenya, I wanted to build something that combines:

Real-world data from authoritative sources (World Bank)
Statistical rigor — CAGR, trend analysis, forecasting with confidence intervals
Beautiful visualization — interactive Plotly charts, not static matplotlib
Software engineering — modular architecture, CLI, tests, CI/CD

The result: a tool that fetches, analyzes, forecasts, and visualizes Kenya's economy in one command.

Architecture

┌─────────────────────────────────────────┐
│            CLI (click + rich)            │
└──────────────┬──────────────────────────┘
               │
┌──────────────▼──────────────────────────┐
│       Data Fetcher (httpx)               │
│       World Bank API v2                  │
└──────────────┬──────────────────────────┘
               │
    ┌──────────┼──────────┬──────────┐
    │          │          │          │
┌───▼────┐ ┌──▼─────┐ ┌──▼─────┐ ┌─▼──────┐
│Analyzer│ │Forecast│ │Compare │ │Insights│
└───┬────┘ └──┬─────┘ └──┬─────┘ └─┬──────┘
    │         │          │          │
    └─────────┴────┬─────┴──────────┘
                   │
     ┌─────────────▼───────────────┐
     │    Dashboard Generator       │
     │    Plotly + Tailwind CSS     │
     └─────────────────────────────┘

The system is built as 6 focused modules, each doing one thing well:

Fetcher — pulls data from World Bank API v2
Analyzer — computes CAGR, trends, YoY changes, statistical summaries
Forecaster — ensemble of Linear + Holt's Exponential Smoothing
Comparator — ranks Kenya against 6 African peers
Insights — algorithmically identifies key findings
Dashboard — generates interactive Plotly HTML

Data Layer: World Bank API

The World Bank API is a goldmine of free, well-structured data. Here's how to fetch any indicator:

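import httpx

def fetch_indicator(country_codes, indicator, date_range="2000:2024"):
    # Multiple countries are separated by semicolons in the URL path
    countries = ";".join(country_codes)
    url = f"https://api.worldbank.org/v2/country/{countries}/indicator/{indicator}"
    # Only the first page is read, so per_page must cover countries x years
    params = {"format": "json", "date": date_range, "per_page": 500}

    with httpx.Client() as client:
        resp = client.get(url, params=params, timeout=30)
        resp.raise_for_status()
        data = resp.json()

    # The payload is [paging_metadata, rows]; error responses carry no rows
    if not isinstance(data, list) or len(data) < 2 or data[1] is None:
        return []

    records = []
    for entry in data[1]:
        # Skip years with no reported observation
        if entry["value"] is not None:
            records.append({
                "country_code": entry["countryiso3code"],
                "year": int(entry["date"]),
                "value": float(entry["value"]),
            })
    return records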
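The parsing step can be exercised without a network call. The sketch below applies the same filtering to a hand-written payload in the API's two-element shape (paging metadata, then observations). `parse_response` is a hypothetical helper name and the sample values are made up, not real World Bank figures.

```python
def parse_response(data):
    """Turn a decoded World Bank v2 JSON payload into flat records.

    Hypothetical helper: `data` is expected to look like
    [paging_metadata, list_of_observations].
    """
    # Error responses carry a single message object and no rows.
    if not isinstance(data, list) or len(data) < 2 or data[1] is None:
        return []
    records = []
    for entry in data[1]:
        if entry["value"] is None:  # no observation reported for that year
            continue
        records.append({
            "country_code": entry["countryiso3code"],
            "year": int(entry["date"]),
            "value": float(entry["value"]),
        })
    return records


# Made-up sample payload in the API's shape (values are illustrative only).
sample = [
    {"page": 1, "pages": 1, "per_page": 500, "total": 3},
    [
        {"countryiso3code": "KEN", "date": "2002", "value": 3.0},
        {"countryiso3code": "KEN", "date": "2001", "value": None},
        {"countryiso3code": "KEN", "date": "2000", "value": 1.5},
    ],
]
records = parse_response(sample)
print(records)
```

Keeping the parsing separate from the HTTP call like this is one way to make the fetcher easy to unit test.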

KenyaVista tracks 18 indicators across 6 dimensions:

Dimension	Indicators
💰 GDP & Growth	GDP, GDP Growth %, GDP per Capita
📊 Trade & Finance	Exports, Imports, Total Reserves
👥 Demographics	Population, Growth Rate, Urbanization, Life Expectancy
📚 Education	Literacy Rate, Education Spending
🏥 Health	Health Spending, Child Mortality, Maternal Mortality
🌐 Technology	Internet Users, Mobile Subs, Electricity Access
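Each indicator in the table corresponds to a World Bank indicator code. As a rough sketch, a small mapping like the one below could drive the fetch loop. The codes shown are real World Bank codes, but which codes and labels KenyaVista actually uses is an assumption, and `build_url` is a hypothetical helper that mirrors the URL built in `fetch_indicator`.

```python
# Illustrative subset only: real World Bank indicator codes, but the exact
# set and labels used by KenyaVista are an assumption.
INDICATORS = {
    "NY.GDP.MKTP.CD": "GDP (current US$)",
    "NY.GDP.MKTP.KD.ZG": "GDP growth (annual %)",
    "NY.GDP.PCAP.CD": "GDP per capita (current US$)",
    "SP.POP.TOTL": "Population, total",
    "SP.DYN.LE00.IN": "Life expectancy at birth (years)",
    "IT.NET.USER.ZS": "Individuals using the Internet (% of population)",
}

# ISO 3166-1 alpha-3 codes for Kenya and the six comparison countries.
COUNTRIES = ["KEN", "TZA", "UGA", "RWA", "ETH", "NGA", "ZAF"]


def build_url(country_codes, indicator):
    """Build the request URL the same way fetch_indicator does."""
    countries = ";".join(country_codes)
    return f"https://api.worldbank.org/v2/country/{countries}/indicator/{indicator}"


for code, label in INDICATORS.items():
    print(label, "->", build_url(COUNTRIES, code))
```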

Analysis Engine

CAGR (Compound Annual Growth Rate)

A compact single-number summary of how fast a series grew over a period:

def compute_cagr(start_value, end_value, years):
    if start_value <= 0 or end_value <= 0 or years <= 0:
        return None
    return (end_value / start_value) ** (1 / years) - 1

For Kenya's GDP: from ~$12.7B (2000) to ~$104B (2023) is 23 years of compounding, which works out to a CAGR of about 9.6%. That is impressive by any standard.
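To make the formula concrete: CAGR is the constant yearly rate r that satisfies end = start × (1 + r)^years. The sketch below checks it on a simple case and adds a year-over-year helper. `yoy_changes` is a hypothetical function, not taken from the KenyaVista source, and the series is made up.

```python
def compute_cagr(start_value, end_value, years):
    """Compound annual growth rate as a fraction, or None if undefined."""
    if start_value <= 0 or end_value <= 0 or years <= 0:
        return None
    return (end_value / start_value) ** (1 / years) - 1


def yoy_changes(values):
    """Year-over-year percentage changes for a list of (year, value) pairs.

    Hypothetical helper; pairs are assumed to be sorted by year.
    """
    changes = []
    for (_, prev), (year, curr) in zip(values, values[1:]):
        if prev != 0:  # a percentage change from zero is undefined
            changes.append((year, (curr - prev) / prev * 100))
    return changes


# A series that doubles over 10 years grows about 7.18% per year.
print(f"{compute_cagr(100, 200, 10):.2%}")  # 7.18%

# Made-up series: roughly +10% and then +20%.
print(yoy_changes([(2000, 100.0), (2001, 110.0), (2002, 132.0)]))
```

CAGR only looks at the two endpoints, so the year-over-year view is a useful complement when the path between them is bumpy.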

Trend Detection

I use linear regression to determine if an indicator is increasing, decreasing, or flat:

import numpy as np

def compute_trend_direction(values):
    # values: list of (year, value) pairs
    years = np.array([v[0] for v in values], dtype=float)
    vals = np.array([v[1] for v in values], dtype=float)

    x_mean, y_mean = np.mean(years), np.mean(vals)
    ss_xy = np.sum((years - x_mean) * (vals - y_mean))
    ss_xx = np.sum((years - x_mean) ** 2)

    # Guard against a single data point (ss_xx would be zero)
    slope = ss_xy / ss_xx if ss_xx > 0 else 0.0
    # R² tells us how well the linear model fits
    y_pred = slope * years + (y_mean - slope * x_mean)
    ss_res = np.sum((vals - y_pred) ** 2)
    ss_tot = np.sum((vals - y_mean) ** 2)
    r_squared = 1 - (ss_res / ss_tot) if ss_tot > 0 else 0

    # "flat" only covers an exactly zero slope; a tolerance could be used instead
    if slope > 0:
        direction = "increasing"
    elif slope < 0:
        direction = "decreasing"
    else:
        direction = "flat"

    return {"slope": slope, "r_squared": r_squared, "direction": direction}

The R² value tells us how much of the variation a straight line explains, so it works as a rough check on how reliable the trend is. Kenya's internet adoption has an R² > 0.95, a very clean upward trend.
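To see what high and low R² look like in practice, here is a small sketch on synthetic data (not World Bank figures). It uses `numpy.polyfit` as an independent way to get the slope; `fit_trend` is a hypothetical helper name.

```python
import numpy as np


def fit_trend(years, vals):
    """Least-squares slope and R² via numpy.polyfit (hypothetical helper)."""
    years = np.asarray(years, dtype=float)
    vals = np.asarray(vals, dtype=float)
    slope, intercept = np.polyfit(years, vals, 1)
    residuals = vals - (slope * years + intercept)
    ss_tot = np.sum((vals - vals.mean()) ** 2)
    r_squared = 1 - np.sum(residuals ** 2) / ss_tot if ss_tot > 0 else 0.0
    return float(slope), float(r_squared)


years = list(range(2000, 2010))

# Synthetic data for illustration only.
steady = [5.0 + 2.0 * (y - 2000) for y in years]           # perfectly linear
zigzag = [10.0 + (3.0 if y % 2 else -3.0) for y in years]  # no real trend

print(fit_trend(years, steady))  # slope close to 2, R² close to 1
print(fit_trend(years, zigzag))  # small slope, R² close to 0
```

The zigzag series still gets a slightly positive slope, which is why the direction label is best read together with R².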

Forecasting: Ensemble Approach

I combine two complementary methods:

1. Linear Regression Forecast

Extends the historical trend with prediction intervals:

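import numpy as np

def linear_forecast(values, horizon=5):
    # values: list of (year, value) pairs; needs at least 3 points
    years = np.array([v[0] for v in values], dtype=float)
    vals = np.array([v[1] for v in values], dtype=float)
    n, x_mean, y_mean = len(years), years.mean(), vals.mean()
    ss_xx = np.sum((years - x_mean) ** 2)

    # Fit linear model by ordinary least squares
    slope = np.sum((years - x_mean) * (vals - y_mean)) / ss_xx
    intercept = y_mean - slope * x_mean
    # Residual standard error with n - 2 degrees of freedom
    se = np.sqrt(np.sum((vals - (slope * years + intercept)) ** 2) / (n - 2))

    last_year = int(years[-1])
    forecasts = []
    for i in range(1, horizon + 1):
        year = last_year + i
        predicted = slope * year + intercept
        # Approximate 95% prediction interval (1.96 is the normal quantile)
        margin = 1.96 * se * np.sqrt(1 + 1/n + (year - x_mean)**2 / ss_xx)
        forecasts.append({
            "year": year, "value": predicted,
            "lower": predicted - margin,
            "upper": predicted + margin
        })
    return forecasts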

2. Holt's Double Exponential Smoothing

Captures level and trend momentum:

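def exponential_smoothing_forecast(values, alpha=0.3, beta=0.1, horizon=5):
    # values: plain list of observations in year order (at least 2)
    level = values[0]
    trend = values[1] - values[0]

    # The first observation seeds the level, so updates start from the second
    for val in values[1:]:
        prev_level = level
        level = alpha * val + (1 - alpha) * (level + trend)
        trend = beta * (level - prev_level) + (1 - beta) * trend

    # Forecast: level + trend * steps_ahead
    return [level + trend * step for step in range(1, horizon + 1)]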
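A handy sanity check for Holt's method is an exactly linear series: with the level seeded from the first observation and the trend from the first difference, the forecast should simply continue the line. The standalone sketch below shows this. `holt_forecast` is a hypothetical name and the input is synthetic.

```python
def holt_forecast(values, alpha=0.3, beta=0.1, horizon=3):
    """Holt's linear (double exponential) smoothing on a plain list.

    Hypothetical standalone version: the first observation seeds the level,
    the first difference seeds the trend, and updates start from the second
    observation.
    """
    if len(values) < 2:
        raise ValueError("need at least two observations")
    level = values[0]
    trend = values[1] - values[0]
    for val in values[1:]:
        prev_level = level
        level = alpha * val + (1 - alpha) * (level + trend)
        trend = beta * (level - prev_level) + (1 - beta) * trend
    return [level + trend * step for step in range(1, horizon + 1)]


# Synthetic series rising by 2 per step; forecasts should be close to 18, 20, 22.
print(holt_forecast([10.0, 12.0, 14.0, 16.0]))
```

On real data, `alpha` and `beta` control how quickly the level and trend react to new observations; the defaults here are the ones from the article, not tuned values.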

Ensemble

The final forecast averages both methods for the point estimate and uses the widest interval:

avg_value = (linear_pred + holt_pred) / 2
lower = min(linear_lower, holt_lower)
upper = max(linear_upper, holt_upper)

The idea is that linear regression captures the long-term trend while Holt's method adapts to recent momentum, so the average is less exposed to the weaknesses of either method on its own.
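Put together, the combination step might look like the sketch below. It assumes both forecasts arrive as lists of dicts with the keys "year", "value", "lower" and "upper"; that shape and the name `ensemble_forecast` are assumptions for illustration, and the numbers are made up.

```python
def ensemble_forecast(linear, holt):
    """Combine two forecasts given as lists of dicts with the keys
    "year", "value", "lower" and "upper" (an assumed shape).
    """
    combined = []
    for lin, hol in zip(linear, holt):
        if lin["year"] != hol["year"]:
            raise ValueError("forecasts must cover the same years")
        combined.append({
            "year": lin["year"],
            # Point estimate: simple average of the two methods
            "value": (lin["value"] + hol["value"]) / 2,
            # Interval: envelope of both intervals (the widest bounds)
            "lower": min(lin["lower"], hol["lower"]),
            "upper": max(lin["upper"], hol["upper"]),
        })
    return combined


# Made-up numbers for illustration only.
linear = [{"year": 2025, "value": 100.0, "lower": 90.0, "upper": 110.0}]
holt = [{"year": 2025, "value": 104.0, "lower": 95.0, "upper": 116.0}]
print(ensemble_forecast(linear, holt))
# [{'year': 2025, 'value': 102.0, 'lower': 90.0, 'upper': 116.0}]
```

Taking the envelope of the two intervals is a conservative choice: it is easy to explain, but the result should not be read as a calibrated 95% interval.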

Peer Comparison

Kenya doesn't exist in a vacuum. Comparing with neighbors provides context:

🇹🇿 Tanzania, 🇺🇬 Uganda, 🇷🇼 Rwanda, 🇪🇹 Ethiopia (East African peers)
🇳🇬 Nigeria, 🇿🇦 South Africa (continental benchmarks)

The comparator module ranks Kenya for each indicator and generates a radar chart showing strengths and weaknesses:

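def compare_countries(records, indicator_code, year):
    # Assumes records are the fetch_indicator output for indicator_code:
    # dicts with country_code, year and value
    by_country = {}
    for r in records:
        by_country.setdefault(r["country_code"], {})[r["year"]] = r["value"]

    results = []
    for country_code, values in by_country.items():
        value = values.get(year)
        if value is not None:  # skip countries with no data for that year
            results.append({"country_code": country_code, "value": value})

    # Highest value gets rank 1
    results.sort(key=lambda x: x["value"], reverse=True)
    for i, r in enumerate(results):
        r["rank"] = i + 1
    return results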
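One design point worth noting: sorting in descending order puts the highest value first, which is the wrong way round for indicators such as child mortality where lower is better. A direction-aware variant could look like this sketch. The indicator names in `LOWER_IS_BETTER` are placeholders, `rank_countries` is a hypothetical function, and the values are made up.

```python
# Assumption: which indicators count as "lower is better" is a design choice;
# these names are placeholders, not identifiers from KenyaVista.
LOWER_IS_BETTER = {"child_mortality", "maternal_mortality"}


def rank_countries(values_by_country, indicator):
    """Rank countries for one indicator and year.

    values_by_country maps country code -> value (None when missing).
    Rank 1 is the best performer given the indicator's direction.
    """
    present = {cc: v for cc, v in values_by_country.items() if v is not None}
    descending = indicator not in LOWER_IS_BETTER
    ordered = sorted(present.items(), key=lambda kv: kv[1], reverse=descending)
    return [
        {"country_code": cc, "value": v, "rank": i}
        for i, (cc, v) in enumerate(ordered, start=1)
    ]


# Made-up values for illustration only.
sample = {"KEN": 41.0, "TZA": 47.0, "UGA": None, "RWA": 38.0}
print(rank_countries(sample, "child_mortality"))
```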

Automated Insights

The insights engine scans all analyses and flags notable findings:

Milestones: "Kenya's population surpassed 50 million"
Growth leaders: "Internet users grew 15.2% annually"
Health progress: "Child mortality dropped by 56%"
Ranking highlights: "Kenya leads peers in mobile subscriptions"

def generate_insights(analyses, forecasts, rank_summary):
    insights = []
    _add_growth_insights(analyses, insights)
    _add_decline_insights(analyses, insights)
    _add_milestone_insights(analyses, insights)
    _add_ranking_insights(rank_summary, insights)
    _add_forecast_insights(forecasts, analyses, insights)
    return sorted(insights, key=severity_order)[:15]
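As an illustration of what one of these rules might do, here is a minimal sketch of a milestone check: find the first year a series crosses a threshold from below and turn it into an insight record. The function names and the insight dict shape are assumptions, not the KenyaVista implementation, and the series is made up.

```python
def find_milestone(series, threshold):
    """Return the first year a series reaches `threshold` from below, else None.

    series is a list of (year, value) pairs sorted by year.
    """
    for (_, prev), (year, curr) in zip(series, series[1:]):
        if prev < threshold <= curr:
            return year
    return None


def milestone_insight(name, series, threshold, severity="info"):
    """Build one insight dict (the dict shape is an assumption)."""
    year = find_milestone(series, threshold)
    if year is None:
        return None
    return {
        "severity": severity,
        "text": f"{name} surpassed {threshold:,} in {year}",
    }


# Made-up series for illustration only.
series = [(2001, 48.0), (2002, 49.5), (2003, 50.9)]
print(milestone_insight("Example indicator", series, 50))
# {'severity': 'info', 'text': 'Example indicator surpassed 50 in 2003'}
```

Rules of this kind are pure functions over the analysis results, which keeps them easy to test one at a time.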

The Dashboard

The HTML dashboard is the showpiece — a single self-contained file with:

KPI cards at the top (GDP, Population, Life Expectancy, etc.)
Insight cards with color-coded severity
Ranking table + radar chart
Per-indicator sections with time series + peer comparison charts

Everything uses Plotly for interactivity (zoom, hover tooltips, toggle traces) and Tailwind CSS for responsive layout.

Running It

# Install
pip install -r requirements.txt && pip install -e .

# Full pipeline
kenyavista pipeline

# Or step by step
kenyavista fetch
kenyavista dashboard data/kenya_data.json
kenyavista summary data/kenya_data.json
kenyavista rankings data/kenya_data.json

Key Findings

From the 2,880 data points analyzed:

GDP grew from $12.7B to $104B (2000–2023), a CAGR of ~9.6%
Internet users surged from <1% to 40%+ — the fastest-growing indicator
Mobile subscriptions exceed 100 per 100 people (more phones than people!)
Child mortality dropped from 108 to ~41 per 1,000 live births, a decline of roughly 62%
Life expectancy increased from 51 to 62 years
Electricity access jumped from 16% to 76%

Testing

45 tests covering all modules, running in under 0.3 seconds:

pytest tests/ -v
# 45 passed in 0.28s

What I Learned

World Bank API is an excellent free data source — no auth, reliable, well-documented
Ensemble forecasting hedges against the weaknesses of any single method
Self-contained HTML dashboards (Plotly + Tailwind via CDN) are powerful for portfolio projects
Automated insights add narrative to numbers — much more engaging than raw charts
Modular architecture makes each piece independently testable
