Intro
In the previous parts, we built the foundation. In a real production environment, we face tougher challenges: system crashes, data loss, and database bottlenecks. Here is how we solve them.
1. Guaranteeing Atomicity with Lua Scripts
When checking stock and issuing a coupon, we must ensure these two steps happen as one single unit. If another request sneaks in between, we might over-issue coupons.
The Solution:
We use a Redis Lua script. Redis runs each script atomically: while the script is executing, no other command can run, so nothing can slip in between the stock check and the issue step. That closes the window for race conditions.
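As a minimal sketch of the idea, the check-and-issue step could look like the following. The key layout (a stock counter plus a set of users who received the coupon), the return codes, and the helper name `try_issue_coupon` are assumptions made for illustration, not the exact script from this series. The helper only needs a client object with an `eval(script, numkeys, *keys_and_args)` method, which is the shape redis-py clients expose.

```python
# Lua source that Redis executes atomically.
# KEYS[1] = stock counter, KEYS[2] = set of users who were issued a coupon,
# ARGV[1] = user id. Key layout and return codes are illustrative assumptions.
ISSUE_COUPON_LUA = """
local stock = tonumber(redis.call('GET', KEYS[1]) or '0')
if stock <= 0 then
    return 0  -- sold out, nothing is changed
end
redis.call('DECR', KEYS[1])
redis.call('SADD', KEYS[2], ARGV[1])
return 1  -- coupon issued
"""


def try_issue_coupon(client, stock_key: str, issued_key: str, user_id: str) -> bool:
    """Run the check-and-issue script; True means a coupon was issued.

    `client` is expected to behave like a redis-py client, i.e. to expose
    eval(script, numkeys, *keys_and_args).
    """
    result = client.eval(ISSUE_COUPON_LUA, 2, stock_key, issued_key, user_id)
    return result == 1
```

With redis-py installed, a call might look like `try_issue_coupon(redis.Redis(), "coupon:stock", "coupon:issued", "user-42")`. Because the check and the decrement live in one script, there is no moment where two requests both see the last coupon as available.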
2. Database Protection: Throttling & Batching
Even though Redis is fast, the database (MySQL) can still become the bottleneck. If we send 10,000 INSERT queries at once, the DB connection pool can be exhausted.
The Solution:
- Message Queue (Kafka/RabbitMQ): We send the results to a queue first.
- Worker Throttling: A background worker pulls data from the queue at a controlled speed (e.g., 200 records/sec).
- Batch Insert: Instead of 1,000 separate queries, we group them into one single INSERT query to reduce I/O overhead.
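The throttling and batching steps above can be sketched roughly as follows. This is an illustration under assumptions, not the worker from this series: the table `coupon_issues`, its columns, and the function names are hypothetical, `rows` stands in for messages already pulled from the queue, and `execute` stands in for something like a MySQL cursor's `execute(sql, params)` using the `%s` placeholder style.

```python
import time
from typing import Callable, Iterable, Iterator, List, Sequence, Tuple

Row = Tuple[int, int]  # (user_id, coupon_id): a hypothetical schema


def build_batch_insert(rows: Sequence[Row]) -> Tuple[str, List[int]]:
    """Build one multi-row INSERT with placeholders instead of one query per row."""
    placeholders = ", ".join(["(%s, %s)"] * len(rows))
    sql = f"INSERT INTO coupon_issues (user_id, coupon_id) VALUES {placeholders}"
    params = [value for row in rows for value in row]
    return sql, params


def batches(rows: Iterable[Row], batch_size: int) -> Iterator[List[Row]]:
    """Group incoming rows into lists of at most batch_size."""
    batch: List[Row] = []
    for row in rows:
        batch.append(row)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        yield batch  # final, partially filled batch


def drain_throttled(
    rows: Iterable[Row],
    execute: Callable[[str, List[int]], None],
    batch_size: int = 100,
    max_rows_per_sec: float = 200.0,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Write rows in batches without exceeding max_rows_per_sec on average."""
    written = 0
    for batch in batches(rows, batch_size):
        started = time.monotonic()
        sql, params = build_batch_insert(batch)
        execute(sql, params)
        written += len(batch)
        # Each batch "earns" len(batch) / rate seconds; sleep off what is left.
        remaining = len(batch) / max_rows_per_sec - (time.monotonic() - started)
        if remaining > 0:
            sleep(remaining)
    return written
```

With the defaults, a batch of 100 rows is followed by a pause of up to 0.5 seconds, which keeps the average at or below 200 records/sec while issuing two INSERT statements per second instead of 200.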
3. Handling Failures: Eventual Consistency
What happens if the Redis operation succeeds, but the database save fails? Redis now says the coupon was issued while the database has no record of it. This is a data integrity issue.
The Strategy:
- Retry with Back-off: The system automatically retries the save operation, waiting longer between each attempt so a struggling database gets time to recover.
- Dead Letter Queue (DLQ): If it fails after 3-5 tries, the data is moved to a DLQ. Engineers can then check the logs and manually fix the data.
- The Goal: We accept that data might be delayed for a few seconds, but we ensure it is eventually consistent.
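The retry and DLQ strategy above might be sketched like this. It is a simplified illustration under assumptions: `save` stands in for the database write, a plain Python list stands in for a real dead letter queue, and the function name, the 0.5-second base delay, and the doubling schedule are hypothetical choices rather than values from this series.

```python
import time
from typing import Any, Callable, Dict, List


def save_with_retry(
    record: Any,
    save: Callable[[Any], None],
    dead_letters: List[Dict[str, Any]],
    max_attempts: int = 3,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Try to save a record; after max_attempts failures, park it in the DLQ.

    Returns True if the save eventually succeeded, False if the record
    was moved to dead_letters for manual inspection.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            save(record)
            return True
        except Exception as exc:  # a real worker would catch narrower errors
            if attempt == max_attempts:
                # Out of retries: keep the record and the reason for engineers.
                dead_letters.append(
                    {"record": record, "error": repr(exc), "attempts": attempt}
                )
                return False
            # Exponential back-off: 0.5s, 1s, 2s, ... with the defaults.
            sleep(base_delay * 2 ** (attempt - 1))
    return False  # not reached; kept so every path returns a bool
```

The record is never silently dropped: it is either saved, possibly a few seconds late, or kept in the DLQ together with the last error, which is what makes the system eventually consistent rather than lossy.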
Conclusion
Building a high-performance system is a balance between speed and safety. By using Redis for speed and message queues for safety, we can build a backend that stays responsive under heavy load and recovers from failures without losing data.