In most restaurants, billing looks simple from the outside: take an order, calculate totals, print a bill. In reality, billing sits at the center of a noisy, highly concurrent system.
At peak hours, multiple things happen at once:
- Captains create or modify orders
- The kitchen updates item status
- Discounts or taxes change mid-order
- Inventory updates happen asynchronously
- Network latency spikes
- Printers misbehave at the worst possible moment
The requirement sounds small: billing must never block operations. But the implication is big. If billing slows down, the queue grows. If the queue grows, staff panic. If staff panic, they bypass the system.
This article shares real-world lessons from designing a real-time billing system under these conditions.
No theory. Just constraints, mistakes, and hard-earned trade-offs.
Constraints (The Ones That Actually Matter)
Before touching architecture, we had to accept some non-negotiable constraints.
Peak load is unpredictable: Lunch rush, dinner rush, festival days, weekends — traffic is bursty. A system that works fine at 20 bills/hour can collapse at 200.
Latency tolerance is near zero: A billing screen that freezes for 2 seconds feels broken to staff. Humans perceive slowness faster than engineers expect.
Hardware is inconsistent: Low-end Android devices, old printers, mixed network quality. You cannot assume ideal conditions.
Data correctness > elegance: A wrong total is worse than a slow UI. Financial data must be correct, auditable, and replayable.
These constraints shaped every decision that followed.
What Went Wrong (Early Mistakes)
Mistake 1: Treating billing as a synchronous operation
Our first approach tightly coupled:
- Order creation
- Tax calculation
- Inventory update
- Bill generation
- Print trigger

All in one request. Under load, a slow printer or inventory lock would block billing entirely. The UI froze because the backend was “doing the right thing.”

Lesson: Billing is not one action. It’s a pipeline.
Mistake 2: Recalculating everything on every change
Every time an item was added or removed, we recomputed:
- Subtotals
- Taxes
- Discounts
- Round-offs

This worked in isolation but failed under concurrency. Multiple rapid edits caused race conditions and inconsistent totals.

Lesson: Idempotent, incremental calculations beat full recomputation.
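The sketch below shows the shape of that idea: line items are keyed by id, the subtotal is adjusted incrementally, and re-applying the same edit is harmless. The types and names are illustrative assumptions, not our production code.

```typescript
// Illustrative sketch: incremental, idempotent line-item updates.
// Amounts are integer cents to avoid floating-point drift.

type LineItem = { id: string; unitPrice: number; qty: number };

type BillState = {
  items: Map<string, LineItem>; // keyed by line-item id
  subtotal: number;             // derived incrementally, in cents
};

function emptyBill(): BillState {
  return { items: new Map(), subtotal: 0 };
}

// Applying the same item twice with the same id is an update, not a
// double count: the old contribution is subtracted before the new one is added.
function upsertItem(bill: BillState, item: LineItem): BillState {
  const previous = bill.items.get(item.id);
  const previousAmount = previous ? previous.unitPrice * previous.qty : 0;
  const nextAmount = item.unitPrice * item.qty;

  const items = new Map(bill.items);
  items.set(item.id, item);
  return { items, subtotal: bill.subtotal - previousAmount + nextAmount };
}

function removeItem(bill: BillState, itemId: string): BillState {
  const previous = bill.items.get(itemId);
  if (!previous) return bill; // removing a missing item is safe (idempotent)

  const items = new Map(bill.items);
  items.delete(itemId);
  return { items, subtotal: bill.subtotal - previous.unitPrice * previous.qty };
}

// Usage: rapid, repeated edits converge to the same subtotal.
let bill = emptyBill();
bill = upsertItem(bill, { id: "li-1", unitPrice: 250, qty: 2 });
bill = upsertItem(bill, { id: "li-1", unitPrice: 250, qty: 2 }); // duplicate edit
console.log(bill.subtotal); // 500, not 1000
```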
Mistake 3: Assuming the network is reliable
We initially assumed the backend was the source of truth. When the network dropped, billing stalled.
Staff didn’t wait. They wrote bills manually.
Lesson: If your system pauses, humans route around it. Permanently.
Solution Approach (High-Level, No Secrets)
The final design wasn’t fancy. It was defensive.
1. Event-driven billing model
Instead of “generate bill,” we moved to billing events:
- ITEM_ADDED
- ITEM_REMOVED
- DISCOUNT_APPLIED
- TAX_UPDATED
- BILL_FINALIZED

Each event is immutable and timestamped. The bill is a projection of these events (a minimal sketch of that projection follows the list below).
Why this helped:
- Easy to replay
- Easy to audit
- Partial failures don’t corrupt state
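Here is a minimal sketch of the projection. Only the event names come from the list above; the field names and the projection shape are assumptions made for illustration.

```typescript
// Illustrative sketch: the bill as a projection over immutable events.

type BillingEvent =
  | { type: "ITEM_ADDED"; at: number; itemId: string; amount: number }
  | { type: "ITEM_REMOVED"; at: number; itemId: string }
  | { type: "DISCOUNT_APPLIED"; at: number; amount: number }
  | { type: "TAX_UPDATED"; at: number; ratePercent: number }
  | { type: "BILL_FINALIZED"; at: number };

type BillProjection = {
  items: Map<string, number>; // itemId -> amount in cents
  discount: number;
  taxRatePercent: number;
  finalized: boolean;
};

// Replaying the same event log always yields the same projection,
// which is what makes audits and replays cheap.
function project(events: BillingEvent[]): BillProjection {
  const bill: BillProjection = {
    items: new Map(),
    discount: 0,
    taxRatePercent: 0,
    finalized: false,
  };

  for (const event of events) {
    if (bill.finalized) break; // nothing mutates a finalized bill
    switch (event.type) {
      case "ITEM_ADDED":
        bill.items.set(event.itemId, event.amount);
        break;
      case "ITEM_REMOVED":
        bill.items.delete(event.itemId);
        break;
      case "DISCOUNT_APPLIED":
        bill.discount += event.amount;
        break;
      case "TAX_UPDATED":
        bill.taxRatePercent = event.ratePercent;
        break;
      case "BILL_FINALIZED":
        bill.finalized = true;
        break;
    }
  }
  return bill;
}
```

Because events are append-only, a partial failure at worst leaves a shorter log; it never leaves a half-updated bill.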
2. Separate “fast path” and “slow path”
We split operations into two categories:
Fast path (must be instant):
- UI updates
- Line-item totals
- Running subtotal
Slow path (can lag slightly):
- Inventory sync
- Printer communication
- Analytics
- Remote sync

Billing completion only depends on the fast path.
Key idea: Never block the fast path on external systems.
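A rough sketch of the split, assuming a simple in-memory queue for slow-path work; the queue, the retry delay, and the stand-in printer/inventory functions are illustrative, not our actual integrations.

```typescript
// Illustrative sketch: the fast path returns as soon as local state is
// updated; slow-path work is queued and drained in the background.

type SlowTask = () => Promise<void>;

const slowQueue: SlowTask[] = [];
let draining = false;

// Fast path: must never await printers, inventory, or the network.
function finalizeBillFastPath(billId: string, totalCents: number): void {
  // 1. Update local state / the UI model (synchronous, in-memory here).
  console.log(`bill ${billId} finalized locally at ${totalCents} cents`);

  // 2. Enqueue slow-path work instead of doing it inline.
  enqueueSlow(() => printBill(billId));
  enqueueSlow(() => syncInventory(billId));
}

function enqueueSlow(task: SlowTask): void {
  slowQueue.push(task);
  if (!draining) void drain();
}

async function drain(): Promise<void> {
  draining = true;
  while (slowQueue.length > 0) {
    const task = slowQueue.shift()!;
    try {
      await task();
    } catch {
      slowQueue.push(task); // retry later; failures never touch the fast path
      await new Promise((resolve) => setTimeout(resolve, 1_000));
    }
  }
  draining = false;
}

// Stand-ins for real integrations.
async function printBill(billId: string): Promise<void> {
  console.log(`printing ${billId}`);
}
async function syncInventory(billId: string): Promise<void> {
  console.log(`syncing inventory for ${billId}`);
}
```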
3. Local-first with eventual sync
The device maintains a local ledger:
- Bills are finalized locally
- Each finalized bill gets a local unique ID
- Sync happens asynchronously

Conflict resolution is simple:
- Bills are append-only
- No bill is ever edited after finalization

This eliminated entire classes of network-related failures.
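A simplified sketch of the local ledger, assuming a Node-style runtime on the device and a hypothetical `upload` callback for sync; real storage and retry handling are more involved.

```typescript
// Illustrative sketch: an append-only local ledger with asynchronous sync.

import { randomUUID } from "node:crypto";

type FinalizedBill = {
  localId: string;     // generated on the device, never reused
  totalCents: number;
  finalizedAt: number; // epoch millis
  synced: boolean;
};

// Append-only: bill contents are never edited after finalization;
// only the sync flag flips once the server has the record.
const ledger: FinalizedBill[] = [];

function finalizeLocally(totalCents: number): FinalizedBill {
  const bill: FinalizedBill = {
    localId: randomUUID(),
    totalCents,
    finalizedAt: Date.now(),
    synced: false,
  };
  ledger.push(bill);
  return bill; // billing is "done" from the staff's point of view
}

// Runs whenever connectivity allows; re-sending is safe because the
// server can deduplicate on localId.
async function syncPending(upload: (bill: FinalizedBill) => Promise<void>) {
  for (const bill of ledger) {
    if (bill.synced) continue;
    try {
      await upload(bill);
      bill.synced = true;
    } catch {
      return; // network is down; try again on the next sync cycle
    }
  }
}
```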
4. Deterministic calculation engine
We moved all calculations into a deterministic module:
- Same inputs always produce the same output
- No floating-point surprises
- Explicit rounding rules
- Versioned tax logic

This allowed:
- Safe replays
- Backward compatibility
- Debugging past bills reliably
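A small sketch of what deterministic, versioned calculation can look like; the tax rates, version labels, and rounding rule below are made up for illustration.

```typescript
// Illustrative sketch: a deterministic, versioned calculation step.
// Integer cents, explicit rounding, and a version tag stored with the bill.

type TaxRule = { version: string; ratePercent: number };

// Historical rules stay available so old bills replay with the logic
// that was in force when they were finalized. Values here are invented.
const TAX_RULES: Record<string, TaxRule> = {
  "2023-04": { version: "2023-04", ratePercent: 5 },
  "2024-01": { version: "2024-01", ratePercent: 12 },
};

// Half-up rounding on integer cents: no floating-point surprises.
function roundHalfUp(numerator: number, denominator: number): number {
  return Math.floor((numerator + denominator / 2) / denominator);
}

function computeTotal(subtotalCents: number, taxVersion: string): number {
  const rule = TAX_RULES[taxVersion];
  if (!rule) throw new Error(`unknown tax version: ${taxVersion}`);
  const taxCents = roundHalfUp(subtotalCents * rule.ratePercent, 100);
  return subtotalCents + taxCents;
}

// Same inputs, same output, regardless of when or where it runs.
console.log(computeTotal(1999, "2024-01")); // 1999 + 240 = 2239
```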
5. Idempotent operations everywhere
Every billing action includes an idempotency key.
If the same event is sent twice:
- It is safely ignored
- Or merged without side effects

This mattered during retries, crashes, and reconnects.
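A minimal sketch of idempotent application, assuming a hypothetical in-memory key store; a real system would persist applied keys so deduplication survives a crash.

```typescript
// Illustrative sketch: deduplicating billing actions by idempotency key.

type BillingAction = { idempotencyKey: string; type: string; payload: unknown };

const appliedKeys = new Set<string>();

// Returns true if the action was applied, false if it was a duplicate.
function applyOnce(action: BillingAction, apply: (a: BillingAction) => void): boolean {
  if (appliedKeys.has(action.idempotencyKey)) {
    return false; // already processed: retries and reconnects are harmless
  }
  apply(action);
  appliedKeys.add(action.idempotencyKey);
  return true;
}

// Usage: the same action sent twice only takes effect once.
const action = { idempotencyKey: "bill-42:ITEM_ADDED:li-7", type: "ITEM_ADDED", payload: {} };
applyOnce(action, (a) => console.log("applied", a.type)); // applied ITEM_ADDED
applyOnce(action, (a) => console.log("applied", a.type)); // duplicate, ignored
```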
Performance Decisions That Actually Helped
Avoid shared locks
We stopped locking “the bill” as a whole. Instead:
- Line items are updated independently
- Totals are derived, not locked
Precompute where humans wait
Humans wait on screen transitions, not background syncs. We optimized perceived performance, not raw throughput.
Backpressure instead of failure
If the system is under stress:
- Slow non-critical features
- Never drop billing actions

Dropping logs is acceptable. Dropping bills is not.
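A toy sketch of that policy, with made-up queue names and thresholds: billing actions are always accepted, while lower-priority work is shed once the backlog grows.

```typescript
// Illustrative sketch: shed non-critical work under load, never billing.

type Priority = "billing" | "analytics" | "logs";

const queues: Record<Priority, unknown[]> = { billing: [], analytics: [], logs: [] };
const SOFT_LIMIT = 500; // total queued tasks before shedding starts (arbitrary)

function enqueue(priority: Priority, task: unknown): boolean {
  const backlog =
    queues.billing.length + queues.analytics.length + queues.logs.length;

  // Billing actions are never dropped, no matter the backlog.
  if (priority === "billing") {
    queues.billing.push(task);
    return true;
  }

  // Under stress, non-critical work is dropped instead of slowing billing.
  if (backlog >= SOFT_LIMIT) {
    return false; // acceptable for logs and analytics
  }

  queues[priority].push(task);
  return true;
}
```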
Lessons Learned
Billing is a trust system: Once staff distrust billing totals, no UI improvement will fix it.
Real-time does not mean synchronous: Real-time means predictable latency, not doing everything at once.
Auditability beats cleverness: If you can’t explain a bill 6 months later, the design is wrong.
Humans optimize faster than code: If the system slows them down, they will invent workarounds immediately.
Final Takeaway
A billing system survives peak hours because it is forgiving.
Forgiving of:
- Network failures
- Hardware limitations
- Human behavior
- Operational chaos
Design for failure first. Performance follows naturally.