Jack Pritom Soren

Payment System Design at Scale

What really happens when Maria taps “Confirm Ride”?

Maria has an important meeting in 15 minutes.

She doesn’t have cash.

She opens Uber. Requests a ride. Gets dropped off.

The payment? Invisible. Instant. Effortless.

But behind that single tap is one of the most complex distributed systems in modern software.

Today, we’re breaking it down.

Not just “how to charge a card.”

But how to build a secure, reliable, scalable payment system that can process millions of rides per day.


The Illusion of Simplicity

From the user’s perspective:

Trip ends → $20 charged → Done.

From the backend’s perspective:

  • Securely collect payment details
  • Avoid storing sensitive card data
  • Prevent fraud
  • Handle bank outages
  • Split money across multiple parties
  • Maintain financial correctness
  • Reconcile mismatches
  • Survive retries and timeouts
  • Support global scale

This is not a feature.

This is infrastructure.


1️⃣ The First Problem: You Can’t Store Card Data

When a user enters:

  • Card number
  • CVV
  • Expiry

Storing it directly means:

  • Heavy PCI DSS compliance
  • Massive breach risk
  • Legal exposure

So what do modern systems do?

Tokenization.

The mobile app integrates a payment provider SDK (Stripe, Adyen, etc.).

Instead of sending card details to your servers:

  1. The SDK sends card data directly to the provider.
  2. The provider returns a token.
  3. You store only that token.

The token acts as a reusable, scoped permission to charge the card.

If someone steals it?

It’s useless outside your merchant account.

Security solved. (Mostly.)
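A minimal sketch of what "store only the token" can look like on the backend. The payload shape (`provider`, `token`, `last4`) and the function names are assumptions for illustration, not any provider's real SDK response.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StoredPaymentMethod:
    """What the backend persists: a provider token plus display-only metadata."""
    rider_id: str
    provider: str  # e.g. "stripe" or "adyen"
    token: str     # opaque reference issued by the provider
    last4: str     # safe to show in the UI
    # Deliberately absent: full card number, CVV, expiry.


def save_payment_method(db: dict, rider_id: str, sdk_result: dict) -> StoredPaymentMethod:
    """Persist only token fields from a (hypothetical) SDK callback payload."""
    forbidden = {"card_number", "cvv"}
    leaked = forbidden.intersection(sdk_result)
    if leaked:
        # Raw card data reaching this code path means the integration is wrong.
        raise ValueError(f"raw card data must never reach the backend: {sorted(leaked)}")
    method = StoredPaymentMethod(
        rider_id, sdk_result["provider"], sdk_result["token"], sdk_result["last4"]
    )
    db[rider_id] = method  # a dict stands in for a real database table
    return method
```

The guard is a cheap tripwire: if card fields ever show up server-side, the request fails loudly instead of being stored quietly.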


2️⃣ Authorization vs Capture (Where Things Get Subtle)

When the ride ends, you don’t just “charge.”

You typically:

Step 1: Authorize

Check that the card has sufficient funds and place a hold on the amount.

Step 2: Capture

Actually move the money.

Why split it?

Because:

  • The ride price may change.
  • You may need to adjust the final fare.
  • You don’t want unpaid rides.

Large systems often authorize early (estimated fare) and capture later (final fare).

Small detail.

Massive architectural impact.
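The two-step flow can be sketched as a tiny state machine. This is an illustration only: the class and state names are made up, and it assumes a capture may not exceed the authorized hold (real provider rules on over-capture vary).

```python
from enum import Enum


class State(Enum):
    CREATED = "created"
    AUTHORIZED = "authorized"
    CAPTURED = "captured"


class Payment:
    """Tracks one ride payment through authorize -> capture."""

    def __init__(self, ride_id: str):
        self.ride_id = ride_id
        self.state = State.CREATED
        self.authorized_cents = 0
        self.captured_cents = 0

    def authorize(self, estimated_cents: int) -> None:
        """Place a hold for the estimated fare."""
        if self.state is not State.CREATED:
            raise RuntimeError(f"cannot authorize from {self.state.value}")
        self.authorized_cents = estimated_cents
        self.state = State.AUTHORIZED

    def capture(self, final_cents: int) -> None:
        """Move the money for the final fare, which may be below the hold."""
        if self.state is not State.AUTHORIZED:
            raise RuntimeError(f"cannot capture from {self.state.value}")
        if final_cents > self.authorized_cents:
            # Assumption: a larger final fare needs a fresh authorization.
            raise ValueError("final fare exceeds the authorized hold")
        self.captured_cents = final_cents
        self.state = State.CAPTURED
```

Capturing twice, or capturing without a hold, fails immediately. That is the point of modeling the states explicitly.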


3️⃣ The Money Doesn’t Go Rider → Driver

This is critical.

The rider does NOT pay the driver directly.

Instead:

Rider → Uber Merchant Account → Split →
    → Driver
    → Uber Commission
    → Taxes
    → Fees

Why?

Because:

  • You need commission control.
  • You must handle taxes.
  • You need dispute handling.
  • You need fraud protection.

Direct peer-to-peer payments would break accounting.
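A rough sketch of the split step, using integer cents so rounding never creates or destroys money. The rates and the flat fee are hypothetical parameters, not real commission or tax figures.

```python
def split_fare(total_cents: int, commission_bps: int, tax_bps: int, fee_cents: int) -> dict:
    """Split a fare into driver, commission, taxes, and fees.

    Rates are in basis points (1 bps = 0.01%). The driver receives the
    remainder, so the parts always sum to the total.
    """
    commission = total_cents * commission_bps // 10_000
    taxes = total_cents * tax_bps // 10_000
    driver = total_cents - commission - taxes - fee_cents
    if driver < 0:
        raise ValueError("deductions exceed the fare")
    return {"driver": driver, "commission": commission, "taxes": taxes, "fees": fee_cents}


# A $20.00 fare with 20% commission, 5% tax, and a $0.60 processing fee.
parts = split_fare(2000, commission_bps=2000, tax_bps=500, fee_cents=60)
assert sum(parts.values()) == 2000
```

Giving the remainder to one party is a design choice: it guarantees the split is exact even when percentages do not divide evenly.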


4️⃣ The Hidden Hero: Internal Ledger System

Here’s what most engineers underestimate:

You cannot rely on your payment provider as your source of truth.

You must build your own ledger service.

A simplified double-entry example:

Account  | Debit | Credit
Rider    | $20   |
Driver   |       | $15
Platform |       | $5

Every movement is recorded.

Why double-entry?

Because money cannot disappear.

If debits ≠ credits → something is broken.

At scale, this is the difference between:

  • “Works fine”
  • “Lost $3M silently”
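Here is a minimal sketch of that invariant, using the $20 ride from the example. The `Entry` type and `post_transaction` function are illustrative names; a real ledger would also need persistence, transaction IDs, and immutability guarantees.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    account: str
    debit_cents: int = 0
    credit_cents: int = 0


def post_transaction(ledger: list, entries: list) -> None:
    """Append entries only if total debits equal total credits."""
    total_debits = sum(e.debit_cents for e in entries)
    total_credits = sum(e.credit_cents for e in entries)
    if total_debits != total_credits:
        raise ValueError(
            f"unbalanced transaction: debits={total_debits} credits={total_credits}"
        )
    ledger.extend(entries)


ledger = []
post_transaction(ledger, [
    Entry("rider", debit_cents=2000),
    Entry("driver", credit_cents=1500),
    Entry("platform", credit_cents=500),
])
```

An unbalanced set of entries never reaches the ledger, so a bug upstream shows up as a rejected write rather than missing money.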

5️⃣ Reliability: External Systems Will Fail

Your payment system depends on:

  • Banks
  • Card networks
  • Payment providers
  • Network calls

All of them fail.

Common nightmare scenario:

  • Authorization succeeds.
  • Capture request times out.
  • You retry.
  • Customer gets double-charged.

Solution?

Idempotency keys.

Each logical payment operation carries a unique key (e.g., one derived from ride_id) that stays the same across every retry.

If retried, provider recognizes the key and avoids duplicate processing.

Without idempotency?

You will double-charge users.

And you will lose trust.
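To make the mechanism concrete, here is a toy in-memory provider that honors idempotency keys. `FakeProvider` and the key format are invented for the sketch; real providers implement this on their side and document their own key rules.

```python
class FakeProvider:
    """In-memory stand-in for a provider that deduplicates by idempotency key."""

    def __init__(self):
        self._results = {}    # idempotency key -> stored result
        self.charges_made = 0

    def capture(self, idempotency_key: str, amount_cents: int) -> dict:
        if idempotency_key in self._results:
            # Retry of an operation already processed: replay the stored result.
            return self._results[idempotency_key]
        self.charges_made += 1
        result = {"charge_id": f"ch_{self.charges_made}", "amount_cents": amount_cents}
        self._results[idempotency_key] = result
        return result


provider = FakeProvider()
key = "ride_42:capture"  # same key for the original call and every retry
first = provider.capture(key, 2000)
retry = provider.capture(key, 2000)  # e.g. sent after a timeout
```

The retry returns the original charge instead of creating a second one. Including the operation type in the key keeps an authorize and a capture for the same ride from colliding.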


6️⃣ Smart Retries (Not Blind Retries)

Not all failures are equal.

Error              | Retry?
Network timeout    | Yes
Rate limit         | Yes
Insufficient funds | No
Fraud blocked      | No

Blind retries create chaos.

Intelligent retries create resilience.
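One way to sketch this: classify the error first, then retry only the transient ones with exponential backoff. The error codes, `PaymentError`, and the delay values are assumptions for illustration, and retries like these are only safe when paired with idempotency keys.

```python
import time

# Transient failures worth retrying; everything else fails immediately.
RETRYABLE = {"network_timeout", "rate_limit"}


class PaymentError(Exception):
    def __init__(self, code: str):
        super().__init__(code)
        self.code = code


def call_with_retries(operation, max_attempts=4, base_delay=0.5, sleep=time.sleep):
    """Run operation(), retrying retryable errors with exponential backoff."""
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except PaymentError as err:
            if err.code not in RETRYABLE or attempt == max_attempts:
                raise  # permanent failure, or out of attempts
            sleep(base_delay * 2 ** (attempt - 1))  # 0.5s, 1s, 2s, ...
```

Passing `sleep` as a parameter keeps the function testable without real waiting. Production code would typically add jitter to the delay as well.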


7️⃣ Fraud Layer (Before Money Moves)

Before charging:

  • Velocity checks
  • Device fingerprinting
  • Location mismatch
  • Behavioral anomaly detection

Some cases trigger:

  • 3D Secure
  • OTP verification
  • Manual review

Payment systems are also fraud systems.

Ignore this, and chargebacks will destroy margins.
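As one example, a velocity check can be sketched as a sliding window per card token. The class name and thresholds are hypothetical; real fraud systems combine many signals like this one.

```python
from collections import defaultdict, deque


class VelocityCheck:
    """Allow at most max_attempts per token within a sliding time window."""

    def __init__(self, max_attempts: int, window_seconds: float):
        self.max_attempts = max_attempts
        self.window_seconds = window_seconds
        self._attempts = defaultdict(deque)  # token -> timestamps of allowed attempts

    def allow(self, card_token: str, now: float) -> bool:
        recent = self._attempts[card_token]
        # Drop attempts that have aged out of the window.
        while recent and now - recent[0] >= self.window_seconds:
            recent.popleft()
        if len(recent) >= self.max_attempts:
            return False  # too many attempts: block or escalate to review
        recent.append(now)
        return True
```

A `False` result does not have to mean a hard decline. It could just as well route the payment to 3D Secure or manual review.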


8️⃣ Refunds Aren’t Simple

A refund isn’t just a “reversed transaction.”

It requires:

  1. Updating the internal ledger
  2. Issuing the refund request to the provider
  3. Adjusting the driver’s balance
  4. Handling the case where the driver payout has already completed

Sometimes, the platform absorbs a temporary loss.

Complexity compounds over time.
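A sketch of how the ledger side of a full refund might look, including the case where the driver was already paid. The account names (especially `driver_receivable`) are assumptions; actual accounting treatment depends on the business and its finance team.

```python
def refund_entries(fare_cents: int, driver_cents: int, payout_completed: bool) -> list:
    """Return balanced (account, side, amount_cents) entries for a full refund."""
    platform_cents = fare_cents - driver_cents
    entries = [
        ("rider", "credit", fare_cents),         # rider gets the full fare back
        ("platform", "debit", platform_cents),   # platform gives up its share
    ]
    if payout_completed:
        # The driver already holds the money: the platform fronts it for now
        # and records an amount to recover from future earnings.
        entries.append(("driver_receivable", "debit", driver_cents))
    else:
        # Payout not sent yet: simply reduce the driver's pending balance.
        entries.append(("driver", "debit", driver_cents))
    return entries
```

Either way, debits equal credits. The only difference is whether the driver's share comes out of a pending balance or becomes money the platform has to recover later.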


9️⃣ Driver Payouts: A Different System

Charging cards is one system.

Paying drivers is another.

Most platforms:

  • Aggregate earnings daily
  • Settle weekly
  • Offer instant payout (for a fee)

Scheduled payouts typically run over bank rails like ACH or SEPA.

Completely different from card networks.

Two financial systems under one product.
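The aggregate-then-settle pattern can be sketched in a few lines. The tuple layout and function names are illustrative only, and the sketch ignores fees, holds, and currency.

```python
from collections import defaultdict
from datetime import date


def aggregate_daily(earnings) -> dict:
    """Sum (driver_id, day, cents) records into per-driver, per-day totals."""
    totals = defaultdict(int)
    for driver_id, day, cents in earnings:
        totals[(driver_id, day)] += cents
    return dict(totals)


def weekly_payout(daily_totals: dict, driver_id: str, week_days) -> int:
    """Amount to settle for one driver over the given days."""
    return sum(daily_totals.get((driver_id, d), 0) for d in week_days)


earnings = [
    ("driver_1", date(2024, 1, 1), 1500),
    ("driver_1", date(2024, 1, 1), 1000),
    ("driver_1", date(2024, 1, 2), 700),
    ("driver_2", date(2024, 1, 1), 900),
]
daily = aggregate_daily(earnings)
```

Settling from daily totals rather than individual rides means one bank transfer per driver per cycle instead of one per trip.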


🔟 Reconciliation (Where Adults Work)

Every night:

  • Pull reports from payment provider
  • Compare with internal ledger
  • Identify mismatches

If mismatch:

  • Flag for review
  • Trigger investigation

Without reconciliation?

Small inconsistencies compound into millions.
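A bare-bones version of the nightly comparison might look like this. It assumes both sides can be reduced to a mapping of charge ID to amount in cents, which glosses over real report formats, currencies, and timing differences.

```python
def reconcile(provider_report: dict, ledger: dict) -> list:
    """Compare charge_id -> amount_cents mappings and list every mismatch.

    Each mismatch is (charge_id, reason, provider_amount, ledger_amount).
    """
    mismatches = []
    for charge_id in sorted(provider_report.keys() | ledger.keys()):
        provider_amount = provider_report.get(charge_id)
        ledger_amount = ledger.get(charge_id)
        if provider_amount is None:
            mismatches.append((charge_id, "missing_at_provider", None, ledger_amount))
        elif ledger_amount is None:
            mismatches.append((charge_id, "missing_in_ledger", provider_amount, None))
        elif provider_amount != ledger_amount:
            mismatches.append((charge_id, "amount_mismatch", provider_amount, ledger_amount))
    return mismatches
```

Every returned row is something for a human or an automated job to investigate. An empty list is the only good outcome.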


1️⃣1️⃣ Scaling to Millions of Rides

At high scale:

  • 1M+ rides/day
  • 1000+ transactions per second at peak

You need:

  • Stateless payment services
  • Event-driven architecture
  • Message queues (Kafka/PubSub)
  • Horizontal scaling

Instead of:

Ride → Immediate Charge

You use:

RideCompleted Event → Payment Queue → Worker → Provider

Decoupling prevents cascading failures.
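The shape of that pipeline, sketched with Python's standard-library `queue.Queue` standing in for Kafka or Pub/Sub. The event fields and the `charge` callback are assumptions; a real worker would also handle failures, acknowledgements, and dead-letter queues.

```python
import queue


def drain_payment_queue(payment_queue: queue.Queue, charge, results: list) -> None:
    """Worker loop: take RideCompleted events off the queue and charge each one."""
    while True:
        try:
            event = payment_queue.get_nowait()
        except queue.Empty:
            return  # nothing left to process
        results.append(charge(event["ride_id"], event["amount_cents"]))
        payment_queue.task_done()


# The ride service only publishes events; it never calls the provider itself.
payment_queue = queue.Queue()
payment_queue.put({"type": "RideCompleted", "ride_id": "ride_1", "amount_cents": 2000})
payment_queue.put({"type": "RideCompleted", "ride_id": "ride_2", "amount_cents": 1350})
```

If the provider is slow or down, events wait in the queue while rides keep completing. That is the decoupling.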


1️⃣2️⃣ Multi-Provider Strategy

Never depend on one payment provider.

You implement:

  • Primary provider
  • Secondary fallback

With an abstraction layer:

charge(amount, token)

Underneath, routing logic decides where to send it.

Because outages are not “if.”

They are “when.”
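A minimal sketch of that routing layer. `StubProvider`, `PaymentRouter`, and `ProviderUnavailable` are invented names, and the sketch assumes the token works at every provider (for example via a provider-neutral vault), which is often not true of provider-specific tokens.

```python
class ProviderUnavailable(Exception):
    """The provider could not be reached, so no charge was accepted."""


class StubProvider:
    """Hypothetical provider client used only for this illustration."""

    def __init__(self, name: str, healthy: bool = True):
        self.name = name
        self.healthy = healthy

    def charge(self, amount_cents: int, token: str) -> dict:
        if not self.healthy:
            raise ProviderUnavailable(self.name)
        return {"provider": self.name, "amount_cents": amount_cents}


class PaymentRouter:
    """Tries providers in order: primary first, then fallbacks."""

    def __init__(self, providers: list):
        self.providers = providers

    def charge(self, amount_cents: int, token: str) -> dict:
        last_error = None
        for provider in self.providers:
            try:
                return provider.charge(amount_cents, token)
            except ProviderUnavailable as err:
                # Fail over only when the provider clearly did not process
                # the request; an ambiguous timeout could mean it did.
                last_error = err
        raise ProviderUnavailable("all providers failed") from last_error
```

Callers see a single `charge(amount, token)` call. Which provider actually handled it is a routing detail they never need to know.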


What Looks Simple Is Actually Distributed Finance

A ride payment system is not:

  • Just API calls
  • Just token storage
  • Just Stripe integration

It is:

  • Distributed systems
  • Financial accounting
  • Legal compliance
  • Fault tolerance
  • Fraud modeling
  • Bank integrations
  • Event-driven infrastructure

That’s why payment infrastructure is one of the hardest backend domains in the world.


Final Thought

When Maria stepped out of that ride, she didn’t think about:

  • Idempotency keys
  • Double-entry accounting
  • Multi-provider failover
  • Fraud scoring
  • Reconciliation pipelines

She just walked into her meeting.

That’s the goal.

Great engineering makes complexity invisible.


If you’re building systems:

Don’t just design features.

Design for:

  • Failure
  • Scale
  • Auditability
  • Correctness

Because money systems don’t forgive mistakes.

