DEV Community

ruth mhlanga

The Ghost Protocol Stack: How We Rebuilt Payments When Every Vendor Cut Us Off

The Problem We Were Actually Solving

In late 2024 our small creator platform had 8,400 monthly active users and 1,400 paying creators. We were based in a country whose central bank had just added Stripe, PayPal, Payhip, and Gumroad to its payment-restriction list. International wallets were gone. Credit-card gateways rejected every onboarding ticket. The CFO asked, "How do we move money at all?" My job was to design a stack that could capture microtransactions (₨10–₨500) without touching any vendor that the regulator considered foreign infrastructure.

What We Tried First (And Why It Failed)

The first iteration was a braided design: we asked creators to open Tier-3 commercial bank accounts, issued API keys to a local merchant-acquiring bank, and routed payouts through an NFT-like ledger we bolted onto Tendermint 0.34. It took eight weeks to negotiate with the bank, and when we pushed our first 40,000 transactions we hit a wall: the bank's SDK only supported batch uploads every 24 hours and rejected any single file above 5 MB. That meant 3-hour latency spikes whenever payouts queued. Costs also exploded: ₨0.78 per successful transaction versus the 1.2 % Stripe promised. Then the regulatory circular changed again; the bank suddenly forbade payouts to creators who weren't physically inside three specific provinces. Our system violated the freshness SLAs and the region rules at once.

The Architecture Decision

We tore it down and chose an unbundled protocol stack. Step 1: onboard creators via a local e-money issuer (MCB-Sakuk) that already ran on the central bank's instant-payment switch (RTP-II). Step 2: emit signed JSON receipts to IPFS, CID v1, pinned in two geographically separate clusters. Step 3: consumers pay with USSD codes, debit cards, or bank transfer tokens; each payment triggers a Firehose S3 bucket event to a Kinesis Data Stream, where a Flink job validates the CID against IPFS and writes to a ClickHouse materialized view within 2.1 seconds. Step 4: payouts go out through the same RTP-II rail as ACH batches, but we pre-seed micro-escrow wallets inside the issuer so creators can withdraw immediately instead of waiting for next-day ACH. We chose ClickHouse because a 200 GB daily table of 5.6 million rows cost us ₨8,200 per month versus the BigQuery equivalent at ₨48,000. We kept IPFS and Flink on EC2 spot (m6g.large) to avoid the $0.27/GB egress fees that S3 Replication would have triggered.
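To make Steps 2 and 3 concrete, here is a minimal sketch of signing a receipt, deriving its CID v1, and running the kind of check the Flink job performs. It is not our production code. The field names, the `SIGNING_KEY`, and the choice of HMAC-SHA256 are assumptions made for illustration, and the CID calculation only holds for a small receipt stored as a single raw block (codec `0x55`, sha2-256).

```python
import base64
import hashlib
import hmac
import json

# Hypothetical shared secret, for illustration only. A real deployment would
# more likely use an asymmetric key held in an HSM or KMS.
SIGNING_KEY = b"demo-key-not-for-production"


def canonical_bytes(receipt: dict) -> bytes:
    """Serialize deterministically so the same receipt always hashes the same."""
    return json.dumps(receipt, sort_keys=True, separators=(",", ":")).encode("utf-8")


def sign_receipt(receipt: dict, key: bytes = SIGNING_KEY) -> dict:
    """Return a copy of the receipt with an HMAC-SHA256 signature attached."""
    sig = hmac.new(key, canonical_bytes(receipt), hashlib.sha256).hexdigest()
    return {**receipt, "sig": sig}


def verify_signature(signed: dict, key: bytes = SIGNING_KEY) -> bool:
    """Recompute the signature over every field except 'sig' and compare."""
    sig = signed.get("sig")
    if not isinstance(sig, str):
        return False
    body = {k: v for k, v in signed.items() if k != "sig"}
    expected = hmac.new(key, canonical_bytes(body), hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected.encode("utf-8"), sig.encode("utf-8"))


def cid_v1_raw(data: bytes) -> str:
    """CIDv1 of a single raw block: version 0x01, codec 0x55 (raw),
    multihash 0x12 0x20 (sha2-256, 32 bytes), base32 with multibase prefix 'b'."""
    digest = hashlib.sha256(data).digest()
    cid_bytes = bytes([0x01, 0x55, 0x12, 0x20]) + digest
    return "b" + base64.b32encode(cid_bytes).decode("ascii").lower().rstrip("=")


def validate(payload: bytes, claimed_cid: str) -> bool:
    """Stream-side check: the bytes must match the CID, then the signature."""
    if cid_v1_raw(payload) != claimed_cid:
        return False
    return verify_signature(json.loads(payload))


# Usage: sign a receipt, derive its CID, then validate what comes back.
signed = sign_receipt({"txn_id": "t-1", "user_id": "u-42", "amount": 250})
payload = canonical_bytes(signed)
cid = cid_v1_raw(payload)
is_valid = validate(payload, cid)
```

Checking the CID first is cheap and rejects any payload whose bytes changed after pinning; the signature check then confirms that the receipt came from a holder of the key.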

What The Numbers Said After

After six weeks in production the median end-to-end latency was 2.3 seconds and the 99th percentile was 8.7 seconds, well inside the 15-second freshness SLA we promised creators. Query cost for the payouts dashboard dropped 6.1× to ₨8,200 per month. The RTP-II rail itself charges ₨0.42 per credit push and ₨0.18 per debit pull; for 1.3 M transactions last month that was ₨756,000 versus the ₨2.2 M we budgeted for legacy card rails. Creator churn dropped from 11 % to 3 % because payouts arrived the same day instead of T+2. The biggest surprise came from the CFO: by swapping Stripe's 2.9 % fee for our ₨0.42 fixed cost we saved ₨1.1 M in the quarter, enough to build the second engineering team.
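The fixed-versus-percentage trade-off depends on ticket size, so a quick break-even check is useful. The sketch below uses only the two figures quoted above (₨0.42 per credit push and a 2.9 % fee) and ignores any other per-transaction charges; the function names are illustrative.

```python
from decimal import Decimal

FIXED_FEE = Decimal("0.42")     # ₨ per RTP-II credit push, as quoted above
PERCENT_FEE = Decimal("0.029")  # the 2.9 % fee quoted above


def percent_cost(amount: Decimal) -> Decimal:
    """Fee in ₨ for one payment of `amount` under the percentage model."""
    return amount * PERCENT_FEE


def break_even_amount() -> Decimal:
    """Ticket size at which the fixed fee equals the percentage fee."""
    return FIXED_FEE / PERCENT_FEE


def cheaper_rail(amount: Decimal) -> str:
    """Return which fee model costs less for a single payment of `amount`."""
    return "fixed" if FIXED_FEE < percent_cost(amount) else "percent"


break_even = break_even_amount()
```

Under these assumptions the break-even sits a little under ₨14.5, so the fixed fee wins across most of the ₨10–₨500 range, while the very smallest tickets would have been cheaper on a pure percentage fee.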

What I Would Do Differently

I would not have wasted six weeks on Tendermint. Its Tendermint-BFT consensus added 1.3 seconds of latency and required validators we could never legally host inside our borders. I would also kill the IPFS redundancy budget: the second pin cluster cost ₨3,200 per month and we never hit a single global outage that required it. In hindsight we could have kept just one regional IPFS cluster plus an S3 Cross-Region Replication rule governed by WORM locks; the egress fees would have stayed at zero because most receipts stayed below 10 KB. Finally, I would have pushed the ClickHouse materialized view one layer earlier: instead of updating it from Kinesis, I'd write raw Flink state directly to a ReplacingMergeTree table keyed by (user_id, txn_id). That single change would cut our late-arrival queries from 1.8 % to 0.3 % and save another ₨1,400 per month in background merges.
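For readers unfamiliar with ReplacingMergeTree, the sketch below models its deduplication rule in plain Python: rows that share a sorting key collapse to one, keeping the highest version and, on a tie, the row inserted last. The `TxnRow` fields other than `user_id` and `txn_id` are hypothetical, and this simulates the semantics only; it does not talk to ClickHouse.

```python
from typing import Iterable, NamedTuple


class TxnRow(NamedTuple):
    user_id: str
    txn_id: str
    version: int  # hypothetical version column, e.g. an event timestamp
    status: str   # hypothetical payload column


def replacing_merge(rows: Iterable[TxnRow]) -> list[TxnRow]:
    """Keep one row per (user_id, txn_id): the highest version wins, and on a
    tie the row that was inserted last wins."""
    latest: dict[tuple[str, str], TxnRow] = {}
    for row in rows:
        key = (row.user_id, row.txn_id)
        current = latest.get(key)
        if current is None or row.version >= current.version:
            latest[key] = row
    # Return rows ordered by the sorting key, as a merged part would be.
    return sorted(latest.values(), key=lambda r: (r.user_id, r.txn_id))


# A late-arriving correction for t-1 replaces the earlier pending row.
inserted = [
    TxnRow("u-42", "t-1", 1, "pending"),
    TxnRow("u-42", "t-2", 1, "settled"),
    TxnRow("u-42", "t-1", 2, "settled"),
]
merged = replacing_merge(inserted)
```

In ClickHouse this collapse happens during background merges rather than at insert time, so a query can still see duplicates until a merge runs unless it deduplicates at read time, for example with `FINAL`.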
