DEV Community

Gabriel Mahia

5 arXiv-Backed AI Implementations for East Africa — and Why We Built Them First

The question wasn't "what can we build?" It was "what does the research say is most needed, most impactful, and not yet built?"

We scanned arXiv, IMF Working Papers, WHO guidelines, and PLOS One — then shipped 5 tools across GitHub in one session. Here's the selection logic and what each tool does.


The Selection Framework

Three filters applied to every candidate idea:

1. Research-backed: A peer-reviewed or institutional paper (arXiv, IMF, WHO, NBER) had already proven the core methodology worked.

2. First in East Africa: No deployable version of this specific adaptation existed for Kenya or East Africa.

3. Buildable immediately: Could be shipped as a Python/Streamlit/FastMCP project within one session using free-tier APIs.

The universe of possibilities I considered: satellite poverty mapping, community health AI, diaspora fintech, federated learning for health, natural language querying of government data, multi-agent debate for civic education, crop disease detection, economic nowcasting, AI literacy gaps, constitutional rights access, mobile money fraud detection, and more.

Here's what made the cut and why.


1. Shamba Scan AI — Crop Disease Detection in Swahili

GitHub: gabrielmahia/shamba-scan-ai

Research basis: Mohanty et al. (2016), arXiv:1604.03169 — PlantVillage dataset (54,306 images, 14 crops, 26 diseases, 99.35% accuracy). Springer Nature (2026) comprehensive AI plant disease review.

The gap: Kenya loses ~14.1% of annual crop yield to disease — matching the global average but hitting harder because extension officers are scarce (1 per 3,000+ farmers in some counties). Smartphone penetration crossed 50% in Kenya in 2024. Existing tools (PlantVillage app, Nuru) are English-only and require app installation.

What we built: Upload a photo of a diseased leaf → Gemini Vision diagnosis → disease name in Swahili → treatment steps → prevention → KALRO referral if serious.
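The flow above can be sketched as a small pipeline. This is a minimal illustration, not the repo's code: the `Diagnosis` fields, the severity scale, and `build_report` are hypothetical names, and the vision-model call is replaced by a stub so the example runs offline.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Diagnosis:
    disease_sw: str                 # disease name in Swahili, as returned by the model
    severity: str                   # "mild", "moderate", or "serious" (assumed scale)
    treatment: List[str] = field(default_factory=list)
    prevention: List[str] = field(default_factory=list)


def build_report(image_bytes: bytes,
                 diagnose: Callable[[bytes], Diagnosis]) -> dict:
    """Run the photo -> diagnosis -> advice -> referral steps in order."""
    if not image_bytes:
        raise ValueError("empty image upload")
    d = diagnose(image_bytes)       # in a real app this would call a vision model
    return {
        "disease": d.disease_sw,
        "treatment": d.treatment,
        "prevention": d.prevention,
        # Refer to KALRO only when the case is flagged as serious.
        "refer_to_kalro": d.severity == "serious",
    }


def fake_diagnose(_: bytes) -> Diagnosis:
    """Stand-in for the vision model so the sketch runs without an API key."""
    return Diagnosis("<jina la ugonjwa>", "serious",
                     ["<hatua ya matibabu>"], ["<hatua ya kinga>"])


report = build_report(b"\x89PNG...", fake_diagnose)
print(report["refer_to_kalro"])
```

Passing the model call in as a function keeps the referral rule testable without network access.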

Why first: No Swahili-native crop disease detection app exists in Kenya. Zero.


2. Kenya Nowcast — County Economic Tracker via Satellite

GitHub: gabrielmahia/kenya-nowcast

Research basis: IMF Working Paper 2026/020, "Nowcasting Economic Growth with Machine Learning and Satellite Data" (Fotopoulou et al., January 2026). PLOS One (2025): VIIRS nighttime lights (NTL) data across 34 Sub-Saharan African countries, 2004-2019. Henderson, Storeygard & Weil (NBER 2009): the original paper proving the NTL-GDP correlation at subnational levels.

The gap: In Sub-Saharan Africa, the average interval between nationally representative economic surveys is 6.5 years. In Kenya, county-level data updates annually at best. The IMF paper shows nighttime light satellite data can nowcast GDP with accuracy that rivals traditional models — and updates continuously.

What we built: An economic health dashboard for all 47 Kenya counties, with scores modeled on VIIRS satellite proxy methodology. This is the first time this research has been expressed as a deployable tool for Kenya.
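One plausible way to turn nighttime-light radiance into a per-county score is sketched below. It is an assumption about the scoring step, not the dashboard's actual formula: `ntl_scores` is a hypothetical helper, and the radiance values are made up for the demo.

```python
import math
from typing import Dict


def ntl_scores(mean_radiance: Dict[str, float]) -> Dict[str, float]:
    """Map mean nighttime-light radiance per county to a 0-100 score.

    Uses log(1 + radiance) so a few very bright urban counties do not
    flatten everyone else, then min-max scales across counties.
    """
    logged = {c: math.log1p(r) for c, r in mean_radiance.items()}
    lo, hi = min(logged.values()), max(logged.values())
    if hi == lo:                      # all counties identical: no spread to rank
        return {c: 50.0 for c in logged}
    return {c: round(100 * (v - lo) / (hi - lo), 1) for c, v in logged.items()}


# Hypothetical radiance values (nW/cm^2/sr), not real VIIRS measurements.
demo = {"Nairobi": 12.0, "Kisumu": 3.0, "Turkana": 0.2}
scores = ntl_scores(demo)
print(scores)
```

The brightest county lands at 100 and the dimmest at 0, so the score is a relative ranking rather than an absolute GDP estimate.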

Why chosen over poverty mapping: More actionable for county governments and development orgs making allocation decisions.


3. Haki Debate AI — Multi-Agent Constitutional Rights Debate

GitHub: gabrielmahia/haki-debate-ai

Research basis: Liang et al. (2023), arXiv:2305.19118 — "Encouraging Divergent Thinking in Large Language Models through Debate." Shows multi-agent debate produces more accurate, less biased reasoning than single-model responses. "Position: The Right to AI" (2025), arXiv:2501.17899 — argues AI access is a civil right, especially for marginalized communities. arXiv:2511.02752 — documents 20% lower AI adoption in Swahili-language countries.

The gap: Access to legal information in Kenya is deeply asymmetric. Citizens often don't know their constitutional rights. When they do, they don't know how to apply them. Existing civic education is English-first.

What we built: Two AI agents argue opposing sides of a constitutional question — government position vs. citizen rights — then a third synthesizes. Covers Article 31 (privacy), 33 (expression), 40 (land), 41 (labor), 43 (health). In Swahili, English, or both.
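The three-agent structure can be sketched as follows. The role prompts, the `run_debate` function, and the `Ask` callable type are illustrative assumptions; the model call is stubbed so the example runs without an API key.

```python
from typing import Callable

Ask = Callable[[str, str], str]   # (system_role, prompt) -> reply


def run_debate(question: str, ask: Ask, language: str = "English") -> dict:
    """Two opposing agents argue, then a third synthesizes both positions."""
    gov = ask("Argue the government's position.",
              f"[{language}] {question}")
    # The second agent sees the first argument so it can rebut it directly.
    citizen = ask("Argue the citizen-rights position.",
                  f"[{language}] {question}\nOpposing argument: {gov}")
    synthesis = ask("Summarize both sides neutrally.",
                    f"[{language}] {question}\nGovernment: {gov}\nCitizen: {citizen}")
    return {"government": gov, "citizen": citizen, "synthesis": synthesis}


def echo_model(role: str, prompt: str) -> str:
    """Offline stand-in for an LLM call."""
    return f"({role}) reply to {len(prompt)} chars"


result = run_debate("Can police search my phone without a warrant?", echo_model)
print(sorted(result))
```

Swapping `echo_model` for a real client is the only change needed to run the same sequence against a hosted model.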

Why the debate format: A single-model answer can be subtly biased toward authority. The adversarial format forces both sides to be articulated, which produces a more balanced understanding for the user.


4. remit-mcp — Diaspora Remittance Intelligence MCP Server

GitHub: gabrielmahia/remit-mcp
Install: pip install remit-mcp

Research basis: World Bank Remittance Prices Worldwide database — global public dataset. World Bank Migration & Development Brief 2025 — Kenya received USD 4.2B in remittances in 2024. SDG 10.c target: reduce remittance costs to 3% by 2030. Current global average: 6.3%. Some Kenya corridors: 8-9%.

The gap: No AI-native tool exists for African diaspora remittance optimization. 35% of fees go to intermediaries on some corridors. The World Bank publishes corridor data publicly, but it's not accessible via any MCP server or AI agent.

What we built: FastMCP server with compare_remittance_corridors, estimate_savings, and list_corridors tools. AI agents (Claude, GPT-4, Gemini) can now query Kenya corridor costs and recommend the cheapest provider.
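The arithmetic behind these tools is simple enough to sketch in plain Python. The function names come from the list above, but the signatures, the provider names, and the cost rates are assumptions for illustration and are not taken from the package or the World Bank data.

```python
from typing import Dict, List, Tuple

# Hypothetical total-cost rates (fee + FX margin) as a share of the amount sent.
# Placeholder values, not figures from the World Bank database.
DEMO_CORRIDORS: Dict[str, Dict[str, float]] = {
    "US->KE": {"provider_a": 0.085, "provider_b": 0.045, "provider_c": 0.030},
}


def list_corridors() -> List[str]:
    """Return the corridor codes available in the demo table."""
    return sorted(DEMO_CORRIDORS)


def compare_remittance_corridors(corridor: str, amount: float) -> List[Tuple[str, float]]:
    """Return (provider, cost in sender currency) pairs, cheapest first."""
    rates = DEMO_CORRIDORS[corridor]
    return sorted(((p, round(amount * r, 2)) for p, r in rates.items()),
                  key=lambda pair: pair[1])


def estimate_savings(corridor: str, amount: float, current_provider: str) -> float:
    """Cost difference between the current provider and the cheapest one."""
    costs = dict(compare_remittance_corridors(corridor, amount))
    return round(costs[current_provider] - min(costs.values()), 2)


print(compare_remittance_corridors("US->KE", 200.0)[0])
print(estimate_savings("US->KE", 200.0, "provider_a"))
```

With these placeholder rates, sending USD 200 costs USD 17 at 8.5% and USD 6 at 3%, so switching saves USD 11 per transfer.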

Why first: This is the first MCP server for African diaspora remittances anywhere.


5. Afya CHW AI — Community Health Worker Co-Pilot

GitHub: gabrielmahia/afya-chw-ai

Research basis: arXiv:2408.17216 "Democratizing AI in Africa: Federated Learning for Low-Resource Edge Devices" — proves AI works on Raspberry Pi-class hardware for African health applications. "Edge Intelligence Unleashed" (2025) — Journal of Edge Computing survey of LLM deployment in constrained environments. arXiv:2601.09716 (2026) — Swahili identified as a language where AI tools have high impact due to speaker population and low existing tool coverage. WHO ANC guidelines (2016).

The gap: Kenya has 105,000 Community Health Workers (CHWs). In rural areas the patient-to-CHW ratio exceeds 3,000:1. CHWs operate far from facilities with no real-time clinical support. Their training materials are in English. Their patients speak Swahili.

What we built: Embedded Kenya MOH protocols (fever, cough, diarrhoea, ANC, malnutrition) in Swahili. Fast symptom triage → danger sign detection → referral triggers → immediate action → follow-up. Works on low-bandwidth mobile connections.
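The triage sequence can be sketched as a rule check. This is only an illustration of the control flow: the danger-sign set, the action strings, and the `triage` function are placeholders, not the embedded MOH protocol content.

```python
from typing import Dict, List, Set

# Placeholder danger signs for illustration only; a real deployment would load
# these from the official protocol text rather than hard-code them.
DANGER_SIGNS: Set[str] = {"convulsions", "unable_to_drink", "vomits_everything", "lethargic"}


def triage(complaint: str, observed_signs: List[str]) -> Dict[str, object]:
    """Symptom triage -> danger-sign detection -> referral trigger -> action."""
    found = sorted(DANGER_SIGNS.intersection(observed_signs))
    refer = bool(found)               # any danger sign triggers a referral
    return {
        "complaint": complaint,
        "danger_signs": found,
        "refer_now": refer,
        "action": "refer to nearest facility" if refer else "treat at home and follow up",
    }


case = triage("fever", ["cough", "convulsions"])
print(case["refer_now"], case["danger_signs"])
```

Because the check is a plain set lookup, it needs no network call, which fits the low-bandwidth setting described above.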

Why first: No Swahili-language AI tool exists for Kenya CHWs. The research base is solid; nobody had built the adaptation.


Why These 5 Over Everything Else

The main things I passed on:

  • Federated learning for health — excellent research (arXiv:2408.17216) but requires infrastructure (multiple hospital servers) not available for a Streamlit deployment
  • Mobile money fraud detection — high impact but requires M-PESA transaction history data we don't have
  • Text-to-SQL for government data — good idea but lower urgency vs. the above
  • Natural language radio summarization — interesting but less immediate impact

The 5 I chose share three characteristics: the research is published and replicable, the data gap is real and documented, and the tool is genuinely useful to a non-technical user within 60 seconds.


Full portfolio: gabrielmahia.github.io

All five repos: MIT licensed, mobile-first, DEMO data clearly labeled, AST-validated before push.
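"AST-validated" here can be read as parsing every Python file before pushing. A minimal sketch of such a check, assuming that reading; the `ast_validate` helper is hypothetical and may differ from what the repos actually run.

```python
import ast
from pathlib import Path
from typing import List


def ast_validate(root: str) -> List[str]:
    """Return one error string per Python file under root that fails to parse."""
    errors = []
    for path in sorted(Path(root).rglob("*.py")):
        try:
            # ast.parse only checks syntax; it does not import or execute the file.
            ast.parse(path.read_text(encoding="utf-8"), filename=str(path))
        except SyntaxError as exc:
            errors.append(f"{path}:{exc.lineno}: {exc.msg}")
    return errors
```

Calling `ast_validate(".")` from a pre-push hook and aborting when the list is non-empty would catch syntax errors, though not runtime or logic errors.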
