How We Handle Concurrency Control in Financial Systems
A Story About Building Bulletproof Data Integrity
The Problem: When Data Integrity Breaks Down
It's the end of a busy financial period. Two team members are working on the same critical financial record—one is finalizing it, the other just discovered an error and is making corrections.
Both click "Save" at nearly the same time. The system accepts both changes.
Later, during a review, someone notices the data doesn't look right. It's neither what the first person entered nor what the second person corrected—it's a corrupted mix of both. Worse, the audit trail is incomplete. No one can tell what happened or when.
This is the nightmare scenario that keeps financial system architects awake at night.
Why Financial Systems Are Different
In a social media app, if two users accidentally overwrite each other's comments, it's annoying. In a financial system, data integrity isn't just important—it's legally mandated.
When you're dealing with money, regulatory compliance, and financial reporting that could affect shareholder decisions or SEC filings, you can't have:
- Lost Updates: One person's approved transaction being silently overwritten by another's edit
- Inconsistent State: A transaction being approved for financial reporting while someone else is still modifying it
- Audit Trail Gaps: Missing records of who changed what and when—a regulatory compliance nightmare
- Compliance Violations: Inaccurate financial reports that could trigger investigations, fines, or worse
- Cascading Errors: Wrong figures feeding into quarterly reports, tax calculations, and investor statements
In financial systems, every cent must be accounted for, every change must be tracked, and data integrity is non-negotiable.
Our Mission: Protecting Financial Data Integrity
After witnessing the chaos that uncontrolled concurrent access can cause, we set out to build a system with one core principle: First come, first served—and everyone else gets told exactly what's happening.
Our philosophy is simple:
- First Come, First Served: The first user to start an operation gets to complete it
- No Silent Overwrites: If a second user tries to update based on outdated data, we reject the operation with a clear error message—forcing them to refresh, review the latest changes, and then make their update based on current data
- Clarity for Others: Anyone who tries to modify the same data gets a clear, actionable error message
- Zero Tolerance for Data Loss: We'd rather block an operation than risk corrupting financial records
Three War Stories: When Concurrency Goes Wrong
Story #1: The Race Condition
Two team members receive an alert about an error in a financial record. They both open it simultaneously and start making corrections.
User A saves their changes. A few seconds later, User B saves.
What should happen?
User A's save goes through. User B gets a clear message: "This record was modified by another user while you were editing. Please refresh and try again."
User B refreshes, sees the fix is already done, and continues their work.
What could go wrong without protection?
Without concurrency control, both saves might succeed. The final data could be a mix of both changes, or worse—one person's entire update could be silently overwritten, causing data loss in financial records.
Story #2: The Moving Target
A supervisor is reviewing a financial record for approval. The data looks good, so they click "Approve."
But there's a problem: while the supervisor had the approval screen open, another user discovered an error and was actively updating that same record.
What should happen?
The system blocks the approval attempt with a message: "This record is currently being modified by another user. Please wait and try again."
Why this matters in financial systems:
The supervisor was about to approve data that was actively being changed. In financial systems, approving a record locks it for regulatory reporting. If they approved incomplete or incorrect data, it could cascade into financial statements, tax calculations, and compliance reports—creating serious regulatory risks.
Story #3: The Time Traveler's Mistake
An approver opens a financial record to review it. They get interrupted by a meeting, leaving their browser tab open for 30 minutes.
While they're away, another user discovers an error and updates the record with corrected values.
The approver returns and, without refreshing, clicks "Approve"—still looking at the old data on their screen.
What should happen?
The system detects they're trying to approve an outdated version. They get a message: "This record has been modified since you opened it. Please refresh to see the latest version before approving."
The financial compliance angle:
The approver made a decision based on stale data. In financial systems, approvers must see current, accurate data before making decisions. Approving outdated data isn't just a technical bug—it's a control failure that auditors flag during compliance reviews.
The Solution: Two Locks for Two Problems
Looking at our three stories, we noticed something interesting: they represent two fundamentally different concurrency problems.
Stories #1 and #2 are about concurrent operations—multiple people trying to modify or approve the same record at the same time. We need to prevent them from stepping on each other's toes.
Story #3 is about version conflicts—someone making decisions based on outdated data. We need to detect when data has changed since they last looked at it.
Different problems require different solutions:
| Problem Type | Solution | Which Stories |
|---|---|---|
| Concurrent Operations | Pessimistic Locking (Redis) | #1, #2 |
| Version Conflicts | Optimistic Locking | #3 |
Solution #1: Pessimistic Locking (For Concurrent Operations)
The Challenge
When two users both try to edit Transaction #A2547, or when a supervisor tries to approve a record that someone else is actively editing, we need to physically prevent them from accessing the same record at the same time. One person gets the lock; everyone else waits.
Think of it like a bathroom door lock—only one person at a time, and everyone else can see it's occupied.
Two Ways to Lock: Database vs Redis
We considered two approaches:
Option 1: Redis Distributed Locks
Before any user touches a record, we check Redis: "Is anyone else working on this record?" If yes, they wait. If no, we create a lock entry in Redis indicating someone is editing it.
Advantages:
- Works across multiple servers
- Supports batch approval jobs that run for 15+ minutes
- Locks automatically expire if something crashes
- Doesn't tie up database connections
Downsides:
- We need to run Redis (one more thing to maintain)
- We have to handle lock logic carefully in code
Option 2: Database Row Locks (SELECT FOR UPDATE)
Use the database's built-in locking with SELECT FOR UPDATE. When a user queries a record for editing, the database locks that row until they're done.
Advantages:
- No extra infrastructure needed
- Automatic cleanup when transaction commits
- Database handles deadlocks automatically
Downsides:
- Keeps database connections busy during long operations
- Doesn't work for async batch jobs (can't hold a lock across job queues)
- Under heavy load, we could run out of database connections
Why We Chose Redis
We went with Redis for one critical reason: batch operations.
Financial systems often need to process hundreds or thousands of records at once (like batch approvals). These operations run as background jobs that might take 15-30 minutes. Database locks can't survive across job queue boundaries—the HTTP request ends, the database transaction commits, and the lock is gone before the background job even starts.
With Redis, we can:
- Acquire the lock when the user initiates a batch operation
- Store the lock token in the database
- Pass it to the background job via message queue
- Have the job release the lock when done
Plus, for financial systems, we'd rather sacrifice a bit of infrastructure complexity than risk exhausting our database connection pool during critical processing periods.
Solution #2: Optimistic Locking (For Version Conflicts)
The Problem with Stale Data
Remember Story #3? An approver opened a record, got interrupted, and came back 30 minutes later to approve it—not knowing another user had updated it in the meantime.
We can't lock the record for 30 minutes while someone is away. That would block everyone else from working on it. Instead, we use "optimistic locking"—we assume conflicts are rare, but we verify the data hasn't changed before committing.
How We Detect Version Changes
We track versions two different ways, depending on how the database table works:
Strategy 1: ID-Based Versioning (For Audit Tables)
Some financial tables never delete or overwrite data—for audit compliance. Every edit creates a new record with a new ID, and we mark the old one as deleted.
When someone tries to approve:
- Their browser sends: "I want to approve record ID abc123"
- Backend checks: "What's the current active record?"
- If the current record has a different ID (someone created a new version), we reject the approval
- They get told: "This has been modified. Please review the latest version."
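A minimal sketch of the ID-based check in Go (the function and error names here are illustrative, not the actual implementation):

```go
package main

import (
	"errors"
	"fmt"
)

// ErrStaleVersion is returned when the client references a superseded record.
var ErrStaleVersion = errors.New("record has been modified; please review the latest version")

// checkIDVersion compares the record ID the client is looking at against the
// ID of the currently active (non-deleted) record for the same entity.
// In audit tables every edit creates a new row, so an ID mismatch means a
// newer version exists.
func checkIDVersion(clientRecordID, currentActiveID string) error {
	if clientRecordID != currentActiveID {
		return ErrStaleVersion
	}
	return nil
}

func main() {
	// Client tries to approve abc123, but a correction created def456.
	if err := checkIDVersion("abc123", "def456"); err != nil {
		fmt.Println("approval rejected:", err)
	}
}
```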
Strategy 2: Timestamp-Based Versioning (For Regular Tables)
For tables that update in place, we use the updated_at timestamp as a version number.
When someone tries to approve:
- Their browser sends: "I want to approve, and I'm looking at the version from [timestamp]"
- Backend checks the current updated_at timestamp
- If timestamps don't match → reject the approval
- They refresh and see the latest data
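The timestamp comparison can be sketched the same way (names are illustrative; note that equality, not ordering, is checked, so any intervening write is detected):

```go
package main

import (
	"fmt"
	"time"
)

// Sample timestamps for the demo below.
var (
	sampleOpenedAt  = time.Date(2024, 3, 1, 9, 0, 0, 0, time.UTC)
	sampleUpdatedAt = sampleOpenedAt.Add(12 * time.Minute) // edited while the approver was away
)

// checkTimestampVersion treats updated_at as a version number: the approval
// only proceeds if the timestamp the client saw matches the current one.
func checkTimestampVersion(clientSaw, current time.Time) bool {
	return clientSaw.Equal(current)
}

func main() {
	fmt.Println(checkTimestampVersion(sampleOpenedAt, sampleOpenedAt))  // true: no change
	fmt.Println(checkTimestampVersion(sampleOpenedAt, sampleUpdatedAt)) // false: stale view, reject
}
```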
How Both Locks Work Together
The two mechanisms form a complete defense system. Every operation goes through both checks:
┌─────────────────────────────────────────────────────────────┐
│ User Request │
│ (Edit/Approve Record) │
└────────────────────────┬────────────────────────────────────┘
│
▼
┌──────────────────────┐
│ Pessimistic Lock │
│ (Redis Lock Check) │
└──────────┬───────────┘
│
Lock Acquired?
│ │
Yes No
│ │
│ └──► Return Error:
│ "Record is being modified"
│
▼
┌──────────────────────┐
│ Optimistic Lock │
│ (Version Check) │
└──────────┬───────────┘
│
Version Match?
│ │
Yes No
│ │
│ └──► Release Lock
│ Return Error:
│ "Version conflict detected"
│
▼
┌──────────────────────┐
│ Perform Operation │
│ (Update/Approve) │
└──────────┬───────────┘
│
▼
┌──────────────────────┐
│ Release Lock │
└──────────────────────┘
Step 1: Pessimistic Lock catches concurrent operations happening right now.
Step 2: Optimistic Lock catches changes that happened earlier while the user was away.
Together, they ensure financial data integrity from every angle.
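The two-step flow above can be sketched in Go with an in-memory map standing in for Redis (the real system uses a Redis client; the map and function names here are illustrative only):

```go
package main

import (
	"errors"
	"fmt"
)

var (
	ErrLocked          = errors.New("record is being modified by another user")
	ErrVersionConflict = errors.New("version conflict detected: please refresh")
)

// memLocks stands in for the Redis keyspace: lock key -> owner token.
var memLocks = map[string]string{}

func acquire(key, token string) bool {
	if _, held := memLocks[key]; held {
		return false
	}
	memLocks[key] = token
	return true
}

func release(key, token string) {
	if memLocks[key] == token { // only the owner may release
		delete(memLocks, key)
	}
}

// approveRecord runs both checks in order: pessimistic lock first,
// then the optimistic version check, then the operation itself.
func approveRecord(key, token, clientVersion, currentVersion string) error {
	if !acquire(key, token) {
		return ErrLocked // step 1: someone is working on it right now
	}
	if clientVersion != currentVersion {
		release(key, token)
		return ErrVersionConflict // step 2: data changed while the user was away
	}
	// ... perform the approval here ...
	release(key, token)
	return nil
}

func main() {
	fmt.Println(approveRecord("lock_event_transaction_A2547", "tok1", "v1", "v1")) // nil: approved
	fmt.Println(approveRecord("lock_event_transaction_A2547", "tok2", "v1", "v2")) // version conflict
}
```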
Implementation Details: How We Built It
This section walks through the actual implementation of our Redis-based locking system.
Choosing the Right Redis Library
We had two options for Redis locking in Go:
- bsm/redislock - Simple, works great with a single Redis master
- go-redsync/redsync - Implements the Redlock algorithm for multi-master Redis clusters
We chose bsm/redislock because our Redis deployment is single-master. For multi-master setups, you'd want go-redsync to handle the distributed consensus problem.
How the Lock System Works
Every lock in Redis follows a simple pattern:
Lock Key Format: lock_event_{resource}_{entity_id}
For example: lock_event_transaction_A2547 when a user is editing Transaction #A2547.
Lock Lifetime (TTL):
- Quick edits: 10 seconds
- Data imports: 30 seconds
- Batch approvals: 15 minutes
Retry Strategy: If the lock is busy, we retry 3 times with exponential backoff (50ms, 100ms, 200ms). After that, we tell the user someone else is working on it.
Lock Metadata: We store what operation is holding the lock (create/update/delete/approve). This lets us give users helpful error messages like "This record is being approved" instead of generic "Resource locked" errors.
How Locks Work in Practice
When a user tries to edit a financial record, here's what happens:
- System generates a lock key based on the record identifier
- Check Redis: Is this locked? If yes, what operation is holding it?
- If available, create the lock with a unique token and store what operation is happening
- Set TTL so it auto-expires (prevents orphaned locks if something crashes)
The lock stored in Redis contains:
- A unique key identifying the specific record
- A random token proving ownership
- Metadata about the operation type (edit/approve/delete)
The token ensures only the lock owner can release it. The operation metadata helps show helpful error messages ("Record is being edited" instead of generic "Resource locked").
Two Ways to Release Locks
Pattern 1: Auto-Release (For Quick Operations)
For normal edits that finish in a few seconds:
- Acquire the lock
- Do the update
- Automatically release when done (even if something crashes)
- TTL: 10-30 seconds
Examples: Editing a field, updating an amount, creating a new record
Pattern 2: Manual Release (For Background Jobs)
For batch operations that take 15+ minutes:
The Problem: When a user initiates a large batch operation, the web request returns immediately, but the actual processing happens in a background job. If we auto-release the lock when the web request finishes, the lock is gone before the job even starts.
The Solution:
- Web request acquires the lock
- Store the lock token in the database
- Pass the token to the background job via message queue
- Background job releases the lock when it finishes
This way, the lock survives across the process boundary. If the job crashes, the lock expires after 15 minutes (TTL).
Safe Lock Release with Lua Script:
The manual release uses a Lua script to safely release locks. According to Redis distributed locks documentation, this is the correct way to avoid accidentally releasing another client's lock:
if redis.call("get", KEYS[1]) == ARGV[1] then
    return redis.call("del", KEYS[1])
else
    return 0
end
This script ensures we only delete the lock if the token matches—preventing us from accidentally releasing a lock that belongs to another process.
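The script's compare-and-delete semantics can be mirrored in plain Go as an in-memory stand-in (not the Redis call itself — atomicity in the real system comes from Redis executing the Lua script as a single unit):

```go
package main

import "fmt"

// locks stands in for the Redis keyspace: lock key -> owner token.
var locks = map[string]string{}

// safeRelease deletes the lock only if the caller's token matches the stored
// one, mirroring the Lua script above. Like the script, it returns 1 on
// delete and 0 otherwise. In real Redis the get and del must run atomically
// via the script; this map version is only for illustration.
func safeRelease(key, token string) int {
	if locks[key] == token {
		delete(locks, key)
		return 1
	}
	return 0
}

func main() {
	locks["lock_event_transaction_A2547"] = "token-A"

	fmt.Println(safeRelease("lock_event_transaction_A2547", "token-B")) // 0: wrong owner, lock kept
	fmt.Println(safeRelease("lock_event_transaction_A2547", "token-A")) // 1: owner releases
}
```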
When locks get released:
- Job completes successfully → Released immediately
- Job fails after max retries → Released (can retry later with fresh lock)
- System crashes → Redis auto-expires after TTL
Lessons Learned: Building Concurrency Control for Financial Systems
When to Use Which Lock
Use Pessimistic Locking (Redis) when:
- Multiple users are actively editing the same records right now
- You need to block concurrent operations completely
- Operations might take a while or run in background jobs
- You need locks to survive across different servers/processes
Use Optimistic Locking (Version Check) when:
- You want to detect if data changed while user was away
- Conflicts are rare and you don't want to block everyone
- Operations are quick and you just need to verify data freshness at commit time
- You want defense-in-depth alongside pessimistic locks
What We Got Right
- TTL on everything - No orphaned locks if something crashes
- Exponential backoff retries - Give legitimate operations a chance to finish
- Operation metadata in locks - Users get helpful error messages
- Two-phase approach - Pessimistic + Optimistic catches all scenarios
- Lock monitoring - Track acquisition times, contention rates, timeouts
- Graceful Redis failures - Circuit breakers prevent cascading failures
The Trade-offs We Made
Performance vs Safety: Yes, locking adds latency. But in financial systems, correctness matters more than speed. We'd rather users wait a fraction of a second than risk data corruption.
Complexity vs Reliability: Redis adds infrastructure to maintain. But it's worth it to avoid database connection exhaustion and support async workflows.
Fine-grained locks: We lock individual records, not entire tables. This reduces contention but requires careful key design.
Final Thoughts
In financial systems, data integrity isn't optional. Every record must be accurate, every change must be tracked, and every concurrent access must be controlled.
The two-lock approach—pessimistic for real-time conflicts, optimistic for stale data—gives us defense in depth. And by choosing Redis over database locks, we can support the long-running batch operations that financial workflows require.
Is it more complex than no locking? Absolutely. Is it worth it? When dealing with financial data, regulatory compliance, and audit trails, the answer is always yes.