
Vinicius Porto


Finding Race Conditions Before Production: How `race_guard` Helps Ruby and Rails Apps Surface Unsafe Concurrency

Most Rails race conditions do not look dramatic at first. They look like normal application code.

A request reads a balance, adds a value, and saves it. A model validates uniqueness before insert. A job is enqueued inside a transaction. The same scheduled task starts on two hosts. Each step is reasonable in isolation, but each quietly assumes that nothing else is running at the same time, and under real concurrency that assumption can break.

That is the problem race_guard is designed to expose.

race_guard is not trying to make your entire Ruby application thread-safe by magic. It is better described as a concurrency smoke alarm: it helps you find the places where your code assumes serialization, atomicity, or uniqueness already exists, then points you toward the correct hard guarantee.

Use database constraints for correctness, locks for serialization, idempotency keys for retries, and race_guard to find the code paths that forgot they needed one of those tools.

Why Race Conditions Still Matter in Rails

Rails makes many things feel sequential, but production is rarely sequential.

A typical app may have:

  • Puma handling multiple requests in parallel.
  • Sidekiq running jobs across many threads and processes.
  • Multiple app containers serving the same database.
  • Scheduled jobs that can overlap during deploys or retries.
  • External side effects like emails, webhooks, and HTTP calls.
  • ActiveRecord transactions that commit later than the code around them suggests.

The classic failure is a read-modify-write flow:

wallet = Wallet.find_by!(user_id: user_id)
wallet.balance += 10
wallet.save!

If two workers do this at the same time, both can read the same original value. Each computes a new value from stale state, and the later save overwrites the earlier one. The code is valid Ruby. The bug only exists because two correct executions interleaved.
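
The interleaving is easy to reproduce without a database. The sketch below is only an illustration: `Row` is a hypothetical in-memory stand-in for a wallet row, and two queues force both workers to finish reading before either one writes.

```ruby
# In-memory stand-in for a database row; no ActiveRecord involved.
Row = Struct.new(:balance)
row = Row.new(100)

read_done = Queue.new # each worker signals after its read
proceed   = Queue.new # the main thread releases workers to write

workers = 2.times.map do
  Thread.new do
    stale = row.balance       # read
    read_done << true
    proceed.pop               # wait until both workers have read
    row.balance = stale + 10  # write a value computed from stale state
  end
end

2.times { read_done.pop }     # both reads have happened
2.times { proceed << true }   # now let both writes run
workers.each(&:join)

puts row.balance # prints 110: one of the two +10 updates was lost
```

Each thread behaved correctly on its own. The lost update comes entirely from the ordering.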

race_guard exists to make those interleavings visible earlier, especially in development, test, and CI.

What race_guard Does

At its core, race_guard provides a small reporting and detection framework for concurrency risks in Ruby and Rails apps.

It includes:

  • A central RaceGuard.report event pipeline.
  • Pluggable reporters for logs, files, JSON, and webhooks.
  • Thread-local context to understand what code is currently protected or inside a transaction.
  • Rails and ActiveRecord integrations.
  • Detectors for risky patterns such as read-modify-write, commit safety, missing unique indexes, and shared mutable state.
  • A distributed execution guard for coordinating work across threads, processes, and hosts.

The default posture is intentionally conservative. The gem is designed to be useful in development and test first, where teams can surface unsafe patterns without changing production behavior. Production usage is opt-in.

Detecting Read-Modify-Write Bugs

One of the most common concurrency bugs in Rails is reading a value, changing it in Ruby, and saving the result.

For example:

wallet = Wallet.find(id)
wallet.balance = wallet.balance + amount
wallet.save!

That code assumes no other worker changed the same row between the read and the write.

race_guard can watch configured ActiveRecord models and report when an attribute is read and then later persisted through a pattern that looks like read-modify-write. The goal is not to prove every possible race. The goal is to catch the dangerous shape before it becomes an incident.
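
To make the "dangerous shape" concrete, here is a minimal sketch of how such a detector could work in principle. It is not race_guard's implementation or API: `TrackedRecord`, `read`, `write`, and `warnings` are hypothetical names, and the record is a plain hash rather than an ActiveRecord model.

```ruby
require "set"

# Hypothetical tracker; not race_guard's real implementation or API.
class TrackedRecord
  attr_reader :warnings

  def initialize(attrs)
    @attrs = attrs.dup
    @read = Set.new     # attributes read since the last save
    @suspect = Set.new  # attributes assigned after having been read
    @warnings = []
  end

  def read(name)
    @read << name
    @attrs.fetch(name)
  end

  def write(name, value)
    @suspect << name if @read.include?(name)
    @attrs[name] = value
  end

  # On save, report every attribute that was read and then assigned.
  def save
    @suspect.each { |name| @warnings << "possible read-modify-write on #{name}" }
    @read.clear
    @suspect.clear
    true
  end
end

wallet = TrackedRecord.new(balance: 100, label: "main")
wallet.write(:balance, wallet.read(:balance) + 10) # read, then write
wallet.write(:label, "primary")                    # write only, not flagged
wallet.save
puts wallet.warnings.inspect
```

A heuristic like this will produce false positives, which is why the output is a report rather than an error.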

A safer version might use SQL-level atomicity:

Wallet.where(id: id).update_all(["balance = balance + ?", amount])

Or pessimistic locking:

wallet.with_lock do
  wallet.balance += amount
  wallet.save!
end

The important point is that race_guard does not replace the fix. It helps you find where the fix is needed.
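
For intuition about what serialization buys you, here is an in-process analogy. It is only a sketch: a `Mutex` coordinates threads inside one Ruby process, so it is not a substitute for a row lock shared across processes and hosts, and `Account` is a hypothetical in-memory stand-in for a row.

```ruby
# In-memory stand-in for a row; the Mutex plays the role of the row lock.
Account = Struct.new(:balance)
account = Account.new(0)
lock = Mutex.new

threads = 8.times.map do
  Thread.new do
    1_000.times do
      # The read and the write happen as one unit, so no update is lost.
      lock.synchronize do
        current = account.balance
        account.balance = current + 1
      end
    end
  end
end
threads.each(&:join)

puts account.balance # prints 8000
```

The read-modify-write is still there. What changed is that only one worker at a time can be in the middle of it.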

Validations Are Not Constraints

Another common Rails trap is relying on application-level uniqueness validation:

validates :email, uniqueness: true

This is useful for user experience, but it is not a concurrency guarantee. Two requests can both pass validation before either insert commits.

The durable fix is a unique database index:

add_index :users, :email, unique: true

race_guard includes static analysis that compares model uniqueness validations against schema indexes. If a model declares uniqueness but the database does not enforce it, the gem can report that mismatch.

This is a good example of the gem’s philosophy: the database should enforce correctness, while race_guard helps identify places where the application is relying on a weaker guarantee.
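
The comparison itself is simple once both sides are available as data. The sketch below assumes the validations and indexes have already been collected into plain hashes. The table names, the hash layout, and `missing_unique_indexes` are all hypothetical, and this is not how race_guard necessarily does it.

```ruby
# Hypothetical input data; a real check would read these from models and schema.
UNIQUENESS_VALIDATIONS = {
  "users"    => [["email"]],
  "accounts" => [["slug"], ["owner_id", "name"]]
}.freeze

UNIQUE_INDEXES = {
  "users"    => [["email"]],
  "accounts" => [["owner_id", "name"]]
}.freeze

# Returns [table, columns] pairs that are validated as unique but have no
# unique index over exactly the same set of columns.
def missing_unique_indexes(validations, indexes)
  validations.flat_map do |table, column_sets|
    indexed = indexes.fetch(table, []).map(&:sort)
    column_sets
      .reject { |cols| indexed.include?(cols.sort) }
      .map { |cols| [table, cols] }
  end
end

missing = missing_unique_indexes(UNIQUENESS_VALIDATIONS, UNIQUE_INDEXES)
missing.each do |table, cols|
  puts "#{table}: uniqueness validated on #{cols.join(', ')} but no unique index"
end
```

This version requires an exact column match, which is a simplification: a unique index on a subset of the validated columns would also enforce the rule.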

Commit Safety: Side Effects Before the Transaction Is Real

A subtler class of bugs happens when code triggers side effects before the database transaction commits.

For example:

ApplicationRecord.transaction do
  order.update!(status: "paid")
  ReceiptMailer.receipt(order).deliver_later
end

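If the transaction rolls back after the mail job is enqueued, the outside world may observe something that never actually committed. Even when the transaction does commit, a fast worker can pick up the job before the commit lands and read the order in its old state.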

Similar issues can happen with:

  • perform_later
  • mail delivery
  • HTTP calls
  • webhooks
  • external API requests

race_guard can intercept these operations and report when they happen inside an open transaction. The safer pattern is to defer side effects until after commit, using Rails mechanisms such as after_commit or an explicit outbox pattern.

Again, the gem does not pretend to own your architecture. It makes the risky timing visible.
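
One way to picture the detection is a transaction depth counter kept per thread, checked wherever a side effect is about to happen. The sketch below is an assumption about the general technique, not race_guard's code: `TxGuard`, `transaction`, and `side_effect` are hypothetical names, and the "transaction" only tracks nesting.

```ruby
# Hypothetical guard; names are illustrative, not race_guard's API.
module TxGuard
  KEY = :tx_guard_depth

  def self.depth
    Thread.current[KEY] || 0
  end

  # Stand-in for a database transaction: only tracks nesting depth.
  def self.transaction
    Thread.current[KEY] = depth + 1
    yield
  ensure
    Thread.current[KEY] = depth - 1
  end

  # Collected reports; shared and unsynchronized, which is fine for a sketch.
  def self.reports
    @reports ||= []
  end

  # Call at the point where a side effect (enqueue, mail, HTTP) would happen.
  def self.side_effect(name)
    reports << "#{name} inside open transaction" if depth.positive?
    yield if block_given?
  end
end

TxGuard.transaction do
  TxGuard.side_effect("ReceiptMailer.receipt") { } # flagged
end
TxGuard.side_effect("ReceiptMailer.receipt") { }   # not flagged

puts TxGuard.reports.inspect
```

Only the call made while the depth is above zero is reported. The same call after the block exits is treated as safe.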

Distributed Execution Guard

Some race conditions are not about two threads touching one row. They are about two processes doing the same work at the same time.

Examples:

  • A scheduled job starts on two app hosts.
  • A deploy overlaps with an old worker still running.
  • A retry begins while the original job is still active.
  • Two Sidekiq processes pick up equivalent work.

For this, race_guard now includes a distributed execution guard:

RaceGuard.distributed_once("nightly-report", resource: Date.today) do
  ReportGenerator.run!
end

The API coordinates execution by building a stable lock key from the logical name and optional resource. With Redis, it uses SET NX EX to claim a lock with a TTL, and Lua compare-and-delete to release only if the owner token matches.

That gives you at-most-one active execution while the lock is valid.

This wording matters. It is not exactly-once execution. TTLs expire. Workers can crash. Networks can split. Redis can be unavailable. If a duplicate run would cause lasting damage, the work itself has to be idempotent, and that still takes idempotency keys, durable state, or database constraints.

But for many operational tasks, “only one active runner right now” is exactly the coordination layer teams need.
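
The claim and release steps can be sketched without a Redis server. Everything below is a hypothetical stand-in: `FakeLockStore` imitates the semantics of SET NX EX and compare-and-delete in memory, and `run_once` is an illustrative helper, not the gem's `distributed_once` implementation.

```ruby
require "securerandom"

# In-memory stand-in for Redis; a real setup would use SET NX EX and a Lua script.
class FakeLockStore
  def initialize
    @mutex = Mutex.new
    @entries = {} # key => [owner_token, expires_at]
  end

  # Like SET key token NX EX ttl: succeeds only if the key is absent or expired.
  def claim(key, token, ttl, now: Time.now)
    @mutex.synchronize do
      _owner, expires_at = @entries[key]
      return false if expires_at && expires_at > now
      @entries[key] = [token, now + ttl]
      true
    end
  end

  # Compare-and-delete: releases only if the caller still owns the lock.
  def release(key, token)
    @mutex.synchronize do
      owner, = @entries[key]
      return false unless owner == token
      @entries.delete(key)
      true
    end
  end
end

# Hypothetical helper; not the gem's distributed_once implementation.
def run_once(store, name, resource: nil, ttl: 60)
  key = ["lock", name, resource].compact.join(":")
  token = SecureRandom.hex(16)
  return :skipped unless store.claim(key, token, ttl)
  begin
    yield
    :ran
  ensure
    store.release(key, token)
  end
end

store = FakeLockStore.new
results = []
outer = run_once(store, "nightly-report", resource: "2024-01-01") do
  # A second runner arriving while the first is active is turned away.
  results << run_once(store, "nightly-report", resource: "2024-01-01") { :never }
end
results << outer
puts results.inspect # prints [:skipped, :ran]

# TTL expiry: a stale owner loses the lock and its release becomes a no-op.
store.claim("lock:job", "a", 60)
taken_over = store.claim("lock:job", "b", 60, now: Time.now + 120)
stale_release = store.release("lock:job", "a")
```

The last three lines show why this is not exactly-once: after the TTL passes, owner "b" takes over even though "a" may still be running, and the token check only stops "a" from deleting a lock it no longer holds.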

Reporting Instead of Guessing

Every detector feeds into RaceGuard.report, which produces structured events. That means teams can decide how strict each signal should be.

During early adoption, you might log warnings locally. In CI, you might write JSON reports and fail builds for high-confidence issues. In production, you might enable only specific low-noise checks and ship events to your observability stack.

That flexibility is important because concurrency tools can become noisy if they pretend every risk is equally severe. race_guard is meant to be adopted gradually.
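
As a rough picture of what "decide how strict each signal should be" can mean, here is a tiny event pipeline with pluggable sinks and a raise threshold. It is a sketch under assumptions: `MiniReporter`, `add_sink`, `raise_at`, and the detector names are hypothetical and do not describe race_guard's configuration API.

```ruby
require "json"

# Hypothetical event pipeline; names are illustrative, not race_guard's API.
class MiniReporter
  SEVERITIES = { info: 0, warn: 1, error: 2 }.freeze

  class Violation < StandardError; end

  def initialize(raise_at: nil)
    @sinks = []
    @raise_at = raise_at # e.g. :error in CI, nil to only report
  end

  def add_sink(&block)
    @sinks << block
    self
  end

  # Sends the event to every sink, then raises if it meets the threshold.
  def report(detector:, severity:, message:)
    event = { detector: detector, severity: severity, message: message }
    @sinks.each { |sink| sink.call(event) }
    if @raise_at && SEVERITIES.fetch(severity) >= SEVERITIES.fetch(@raise_at)
      raise Violation, message
    end
    event
  end
end

lines = []
ci = MiniReporter.new(raise_at: :error)
ci.add_sink { |event| lines << JSON.generate(event) }

ci.report(detector: "commit_safety", severity: :warn, message: "enqueue in transaction")
begin
  ci.report(detector: "index_integrity", severity: :error, message: "users.email has no unique index")
rescue MiniReporter::Violation => e
  puts "build would fail: #{e.message}"
end
puts lines
```

Both events reach the JSON sink, but only the one at or above the threshold raises. Locally, the same reporter built without `raise_at` would just log.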

A practical rollout looks like this:

  1. Enable it in development and test.
  2. Start with reporting, not raising.
  3. Fix obvious issues such as missing unique indexes.
  4. Add focused tests for known risky flows.
  5. Raise severity in CI for patterns your team considers unacceptable.
  6. Enable production signals only for carefully chosen detectors.

What It Does Not Promise

The most important part of a concurrency tool is being honest about its boundaries.

race_guard does not make Rails code automatically safe. It does not replace database constraints. It does not guarantee exactly-once execution. It does not prove the absence of every race condition.

Instead, it helps teams discover unsafe concurrency assumptions in code that otherwise looks normal.

That is valuable because many race conditions are not caused by ignorance. They are caused by missing feedback. The code passes tests because the tests are sequential. It works locally because there is only one process. It fails in production because production finally supplies the interleaving.

The Bigger Idea

The best concurrency strategy is layered.

Use unique indexes when uniqueness matters. Use atomic SQL when updating counters. Use row locks when a transaction must serialize access. Use after-commit hooks or outbox patterns for side effects. Use idempotency keys for retries. Use distributed locks when only one active process should run a task.

race_guard sits beside those tools. It helps identify where they are missing.

That makes it less of a lock library and more of an engineering feedback system. It gives Ruby and Rails teams a way to see the concurrency assumptions hidden in everyday code, before users are the ones who discover them.
