{
"title": "Building FundFlow: How We Match African Startups with Investors at Scale (And Why Your Database Choice Matters)",
"content": "# Building FundFlow: How We Match African Startups with Investors at Scale (And Why Your Database Choice Matters)\n\nWhen SharkFlow started building FundFlow—our investor matching platform—we faced a problem that most Silicon Valley startups never encounter: 60% of our users are on 2G networks, and a significant portion have data plans measured in MB, not GB.\n\nThis article walks through how we designed FundFlow's backend to solve real African infrastructure constraints while matching thousands of startups with investors in real time. If you're building for emerging markets, this is for you.\n\n## The Problem We're Solving\n\nAfrica has 400M+ unbanked people, but we also have **vibrant startup ecosystems**—Nairobi's Silicon Savannah pumps out unicorns, Lagos is a fintech powerhouse, and Kigali's tech scene is exploding. Yet finding capital remains inefficient.\n\nTraditional investor matching platforms (Crunchbase, AngelList) assume:\n- Fast internet\n- Desktop-first usage\n- Large file uploads\n- Real-time push notifications\n\nNone of these assumptions hold in Nairobi's traffic or rural Uganda. FundFlow needed to work differently.\n\n## Architecture Decision #1: API-First, Mobile-Native\n\nWe started with a REST API, but quickly pivoted to **GraphQL with aggressive caching**. Here's why:\n\nA startup founder in Kampala might check FundFlow on their 3G connection during lunch. With REST, fetching a founder profile (basic info + funding needs + similar investors + recent updates) would require 4-5 separate calls. Each call = latency + battery drain.\n\nWith GraphQL, one query:\n\n```
graphql\nquery GetFounderMatchProfile($id: ID!) {\n  founder(id: $id) {\n    name\n    companyStage\n    fundingNeeded\n    matchedInvestors(limit: 10) {\n      id\n      name\n      checkSize\n      focusAreas\n      responseRate\n    }\n    similarFounders(limit: 5) {\n      id\n      company\n    }\n  }\n}\n
```\n\nOne request, one response. We then layer **HTTP/2 server push** to proactively send investor updates without the founder asking.\n\n### Response Optimization\n\nBut here's the crucial part: we **gzip everything** and implement a custom compression layer for JSON payloads:\n\n```
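typescript\n// Custom compression for slow networks\n// Dependency-free sketch: drop nulls and swap long keys for short aliases\nconst ALIASES = new Map<string, string>([\n  ['fundingNeeded', 'fn'],\n  ['companyStage', 'cs'],\n  ['responseRate', 'rr']\n]);\n\nconst compressForMobile = (data: unknown): unknown => {\n  if (Array.isArray(data)) return data.map((item) => compressForMobile(item));\n  if (data === null || typeof data !== 'object') return data;\n  const crushed: Record<string, unknown> = {};\n  for (const [key, value] of Object.entries(data)) {\n    // Remove null values, use shorter keys\n    if (value === null || value === undefined) continue;\n    crushed[ALIASES.get(key) ?? key] = compressForMobile(value);\n  }\n  return crushed;\n};\n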
```\n\nOn 2G, this reduces payload size by 60-70%. It matters when someone's data plan costs KSH 100/GB.\n\n## Architecture Decision #2: Database Strategy for Emerging Markets\n\nThis is where most startups fail. Using PostgreSQL is great for consistency, but we needed:\n\n1. **Fast reads for investor matching** (needs to be sub-500ms)\n2. **Offline-first capability** (users lose connection constantly)\n3. **Cheap scaling** (we're not burning VC cash on cloud bills)\n\nOur stack:\n\n- **PostgreSQL** (primary data store, Nairobi-hosted on Linode): Startups, investors, funding history\n- **Redis** (matching cache + session store): Real-time investor/founder matching\n- **DuckDB** (edge analytics): Local processing on phones for pattern matching\n\n### The Matching Algorithm\n\nMatching 15,000 startups with 3,000 investors requires smart indexing. We use **vector similarity** on founder + investor profiles:\n\n```
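python\n# Simplified matching logic\nimport numpy as np\nfrom sklearn.metrics.pairwise import cosine_similarity\n\nclass InvestorMatcher:\n    def __init__(self, redis_client):\n        self.redis = redis_client\n\n    def get_matches(self, founder_id: str, limit: int = 10):\n        # Fetch the cached founder vector (precomputed nightly)\n        raw = self.redis.get(f'vector:founder:{founder_id}')\n        if raw is None:\n            return []\n        # Assumption: vectors are stored in Redis as raw float32 bytes\n        founder_vector = np.frombuffer(raw, dtype=np.float32)\n\n        # Get all investor vectors, keeping IDs and rows aligned\n        investors = self.redis.hgetall('vectors:investors')\n        if not investors:\n            return []\n        investor_ids = list(investors.keys())\n        investor_matrix = np.stack(\n            [np.frombuffer(v, dtype=np.float32) for v in investors.values()]\n        )\n\n        # Compute similarity between the founder and every investor\n        similarities = cosine_similarity([founder_vector], investor_matrix)[0]\n\n        # Return the IDs of the top matches, best first\n        top_indices = np.argsort(similarities)[-limit:][::-1]\n        return [investor_ids[i] for i in top_indices]\n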
```\n\nVectors are precomputed nightly (not in real-time), reducing compute load by 80%. This is crucial when your servers are in a data center without unlimited power.\n\n## Architecture Decision #3: M-Pesa Integration for Founder Trust\n\nHere's a feature that only works in Africa: We verify founders through M-Pesa transaction history.\n\nWhy? Because in Kenya, M-Pesa transactions are a proxy for business legitimacy. If you've moved KSH 10M+ through M-Pesa in the last year, you probably exist and have customers.\n\n```
typescript\n// Simplified M-Pesa verification\nimport axios from 'axios';\n\nclass MPesaVerifier {\n async verifyFounder(\n phoneNumber: string,\n mp