<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Pavel Myslik</title>
    <description>The latest articles on DEV Community by Pavel Myslik (@pavelmyslik).</description>
    <link>https://dev.to/pavelmyslik</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3768672%2F8ccf3a00-0919-405d-8c47-543df44a35e5.png</url>
      <title>DEV Community: Pavel Myslik</title>
      <link>https://dev.to/pavelmyslik</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/pavelmyslik"/>
    <language>en</language>
    <item>
      <title>Fixing Production Data in Rails: Lessons from a 6,000-Row Backfill</title>
      <dc:creator>Pavel Myslik</dc:creator>
      <pubDate>Wed, 25 Mar 2026 13:45:03 +0000</pubDate>
      <link>https://dev.to/pavelmyslik/fixing-production-data-in-rails-lessons-from-a-6000-row-backfill-54ca</link>
      <guid>https://dev.to/pavelmyslik/fixing-production-data-in-rails-lessons-from-a-6000-row-backfill-54ca</guid>
      <description>&lt;p&gt;Fixing a bug in production is usually straightforward. But what happens when the data in the database itself is broken? That’s when things get tricky.&lt;/p&gt;

&lt;p&gt;I learned this when a routine deploy quietly broke &lt;code&gt;confirmed_at&lt;/code&gt; on our &lt;code&gt;Order&lt;/code&gt; records. By the time anyone noticed, around &lt;strong&gt;6,000 rows&lt;/strong&gt; were affected — and dashboards, emails, and downstream services all depended on that field. &lt;/p&gt;

&lt;p&gt;The two-line code fix shipped in minutes, but the data backfill required a lot more thought.&lt;/p&gt;




&lt;h2&gt;
  
  
  The First Naive Attempt
&lt;/h2&gt;

&lt;p&gt;Once we understood the scope of the problem, the first instinct was obvious: fix it fast. Most Rails developers will reach for one of two things here — a quick one-liner in the console, or a small migration that updates the data directly.&lt;/p&gt;

&lt;p&gt;Something like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ruby"&gt;&lt;code&gt;&lt;span class="no"&gt;Order&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;where&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="ss"&gt;confirmed_at: &lt;/span&gt;&lt;span class="kp"&gt;nil&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="ss"&gt;status: &lt;/span&gt;&lt;span class="s2"&gt;"confirmed"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
     &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;update_all&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="ss"&gt;confirmed_at: &lt;/span&gt;&lt;span class="no"&gt;Time&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;current&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;Note: In reality, determining the correct value for &lt;code&gt;confirmed_at&lt;/code&gt; was more complicated — we had to derive it from related records and business logic. For the purposes of this article, &lt;code&gt;Time.current&lt;/code&gt; keeps the examples simple and focused on the backfill pattern itself.&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;It looks clean. One line, no fuss. After a two-minute code fix, it feels like the natural next step.&lt;/p&gt;

&lt;p&gt;And this is exactly where production backfills can go wrong.&lt;/p&gt;




&lt;h2&gt;
  
  
  Why This Is Dangerous in Production
&lt;/h2&gt;

&lt;p&gt;At first glance, the one-liner looks harmless — simple, fast, and it works perfectly in development. The problem starts when you run it on real production data.&lt;/p&gt;

&lt;p&gt;There are a few risks that are easy to overlook:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;No visibility into progress.&lt;/strong&gt; Once the query starts, you have no idea how many records have been updated or how many are left.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;No safe way to restart.&lt;/strong&gt; If the process stops halfway (a timeout, a deploy, a dropped connection), you don't know what state the data is in.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;No way to preview the change.&lt;/strong&gt; There's no dry run. Running it once already changes the data.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Hard to test properly.&lt;/strong&gt; You can't write a meaningful test for a console one-liner or an inline migration. If the logic is wrong, you find out in production.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Large transactions put pressure on the database.&lt;/strong&gt; Updating thousands of rows in a single query can hold locks on all of those rows until it finishes and slow down the app for real users.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The problem isn't the syntax. It's that once it starts, you've handed over control — and when other systems depend on this data being correct, that's a risk you don't want to take.&lt;/p&gt;




&lt;h2&gt;
  
  
  A Safer Approach
&lt;/h2&gt;

&lt;p&gt;Instead of trying to fix everything in one query, we decided to treat the backfill like real code.&lt;/p&gt;

&lt;p&gt;Not a console one-liner.&lt;br&gt;
Not an inline migration.&lt;br&gt;
A small Ruby class that lives in the repository and can be tested like anything else.&lt;/p&gt;

&lt;p&gt;Something like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;lib/backfills/backfill_confirmed_at.rb
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The idea is simple:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;the script only triggers the backfill&lt;/li&gt;
&lt;li&gt;all the logic lives in a dedicated class&lt;/li&gt;
&lt;li&gt;the class can be tested safely before anything runs in production&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This keeps the backfill predictable, restartable, and — most importantly — verifiable before it touches a single row.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Backfill Class
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ruby"&gt;&lt;code&gt;&lt;span class="c1"&gt;# lib/backfills/backfill_confirmed_at.rb&lt;/span&gt;

&lt;span class="k"&gt;module&lt;/span&gt; &lt;span class="nn"&gt;Backfills&lt;/span&gt;
  &lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;BackfillConfirmedAt&lt;/span&gt;
    &lt;span class="nb"&gt;attr_reader&lt;/span&gt; &lt;span class="ss"&gt;:dry_run&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="ss"&gt;:logger&lt;/span&gt;

    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;initialize&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="ss"&gt;dry_run: &lt;/span&gt;&lt;span class="kp"&gt;false&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="ss"&gt;logger: &lt;/span&gt;&lt;span class="no"&gt;Rails&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;logger&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
      &lt;span class="vi"&gt;@dry_run&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;dry_run&lt;/span&gt;
      &lt;span class="vi"&gt;@logger&lt;/span&gt;  &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;logger&lt;/span&gt;
    &lt;span class="k"&gt;end&lt;/span&gt;

    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;run&lt;/span&gt;
      &lt;span class="n"&gt;log&lt;/span&gt; &lt;span class="s2"&gt;"Starting | mode: &lt;/span&gt;&lt;span class="si"&gt;#{&lt;/span&gt;&lt;span class="n"&gt;dry_run&lt;/span&gt; &lt;span class="p"&gt;?&lt;/span&gt; &lt;span class="s1"&gt;'DRY RUN'&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s1"&gt;'LIVE'&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;

      &lt;span class="n"&gt;processed&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;

      &lt;span class="n"&gt;orders&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;find_each&lt;/span&gt; &lt;span class="k"&gt;do&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt;&lt;span class="n"&gt;order&lt;/span&gt;&lt;span class="o"&gt;|&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;dry_run&lt;/span&gt;
          &lt;span class="n"&gt;log&lt;/span&gt; &lt;span class="s2"&gt;"[DRY RUN] Would update Order #&lt;/span&gt;&lt;span class="si"&gt;#{&lt;/span&gt;&lt;span class="n"&gt;order&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;id&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
        &lt;span class="k"&gt;else&lt;/span&gt;
          &lt;span class="n"&gt;order&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;update_columns&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="ss"&gt;confirmed_at: &lt;/span&gt;&lt;span class="n"&gt;confirmed_at&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
          &lt;span class="n"&gt;log&lt;/span&gt; &lt;span class="s2"&gt;"Updated Order #&lt;/span&gt;&lt;span class="si"&gt;#{&lt;/span&gt;&lt;span class="n"&gt;order&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;id&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
        &lt;span class="k"&gt;end&lt;/span&gt;

        &lt;span class="n"&gt;processed&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;

      &lt;span class="k"&gt;rescue&lt;/span&gt; &lt;span class="no"&gt;StandardError&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;
        &lt;span class="n"&gt;log&lt;/span&gt; &lt;span class="s2"&gt;"Error processing Order &lt;/span&gt;&lt;span class="si"&gt;#{&lt;/span&gt;&lt;span class="n"&gt;order&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;id&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;: &lt;/span&gt;&lt;span class="si"&gt;#{&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;message&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
      &lt;span class="k"&gt;end&lt;/span&gt;

      &lt;span class="n"&gt;log&lt;/span&gt; &lt;span class="s2"&gt;"Done. Processed: &lt;/span&gt;&lt;span class="si"&gt;#{&lt;/span&gt;&lt;span class="n"&gt;processed&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
    &lt;span class="k"&gt;end&lt;/span&gt;

    &lt;span class="kp"&gt;private&lt;/span&gt;

    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;message&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;logger&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;info&lt;/span&gt; &lt;span class="s2"&gt;"[Backfills::BackfillConfirmedAt] &lt;/span&gt;&lt;span class="si"&gt;#{&lt;/span&gt;&lt;span class="n"&gt;message&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;

    &lt;span class="c1"&gt;# find_each always batches by primary key, so no explicit order is needed&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;orders&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="no"&gt;Order&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;where&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="ss"&gt;confirmed_at: &lt;/span&gt;&lt;span class="kp"&gt;nil&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="ss"&gt;status: &lt;/span&gt;&lt;span class="s2"&gt;"confirmed"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# In reality, this method held more complex logic for deriving the date&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;confirmed_at&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="no"&gt;Time&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;current&lt;/span&gt;
  &lt;span class="k"&gt;end&lt;/span&gt;
&lt;span class="k"&gt;end&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Testing the Backfill Class
&lt;/h2&gt;

&lt;p&gt;One of the biggest advantages of extracting the logic into a class is that you can test it like any other Ruby code — before running anything in production.&lt;/p&gt;

&lt;p&gt;Here are a few tests that cover the most important cases:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ruby"&gt;&lt;code&gt;&lt;span class="c1"&gt;# spec/lib/backfills/backfill_confirmed_at_spec.rb&lt;/span&gt;

&lt;span class="no"&gt;RSpec&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;describe&lt;/span&gt; &lt;span class="no"&gt;Backfills&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="no"&gt;BackfillConfirmedAt&lt;/span&gt; &lt;span class="k"&gt;do&lt;/span&gt;
  &lt;span class="n"&gt;let&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="ss"&gt;:null_logger&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="no"&gt;Logger&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;new&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kp"&gt;nil&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;

  &lt;span class="n"&gt;describe&lt;/span&gt; &lt;span class="s2"&gt;"#run"&lt;/span&gt; &lt;span class="k"&gt;do&lt;/span&gt;
    &lt;span class="n"&gt;let!&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="ss"&gt;:order&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="n"&gt;create&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="ss"&gt;:order&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="ss"&gt;status: &lt;/span&gt;&lt;span class="s2"&gt;"confirmed"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="ss"&gt;confirmed_at: &lt;/span&gt;&lt;span class="kp"&gt;nil&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="n"&gt;context&lt;/span&gt; &lt;span class="s2"&gt;"when dry_run is enabled"&lt;/span&gt; &lt;span class="k"&gt;do&lt;/span&gt;
      &lt;span class="n"&gt;it&lt;/span&gt; &lt;span class="s2"&gt;"does not update any records"&lt;/span&gt; &lt;span class="k"&gt;do&lt;/span&gt;
        &lt;span class="n"&gt;described_class&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;new&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="ss"&gt;dry_run: &lt;/span&gt;&lt;span class="kp"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="ss"&gt;logger: &lt;/span&gt;&lt;span class="n"&gt;null_logger&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;run&lt;/span&gt;

        &lt;span class="n"&gt;expect&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;order&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;reload&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;confirmed_at&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;to&lt;/span&gt; &lt;span class="n"&gt;be_nil&lt;/span&gt;
      &lt;span class="k"&gt;end&lt;/span&gt;
    &lt;span class="k"&gt;end&lt;/span&gt;

    &lt;span class="n"&gt;context&lt;/span&gt; &lt;span class="s2"&gt;"when order is confirmed with missing confirmed_at"&lt;/span&gt; &lt;span class="k"&gt;do&lt;/span&gt;
      &lt;span class="n"&gt;it&lt;/span&gt; &lt;span class="s2"&gt;"updates confirmed_at"&lt;/span&gt; &lt;span class="k"&gt;do&lt;/span&gt;
        &lt;span class="n"&gt;described_class&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;new&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="ss"&gt;logger: &lt;/span&gt;&lt;span class="n"&gt;null_logger&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;run&lt;/span&gt;

        &lt;span class="n"&gt;expect&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;order&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;reload&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;confirmed_at&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;not_to&lt;/span&gt; &lt;span class="n"&gt;be_nil&lt;/span&gt;
      &lt;span class="k"&gt;end&lt;/span&gt;
    &lt;span class="k"&gt;end&lt;/span&gt;

    &lt;span class="n"&gt;context&lt;/span&gt; &lt;span class="s2"&gt;"when order already has confirmed_at"&lt;/span&gt; &lt;span class="k"&gt;do&lt;/span&gt;
      &lt;span class="n"&gt;let&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="ss"&gt;:timestamp&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;days&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;ago&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;

      &lt;span class="n"&gt;before&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="n"&gt;order&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;update&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="ss"&gt;confirmed_at: &lt;/span&gt;&lt;span class="n"&gt;timestamp&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;

      &lt;span class="n"&gt;it&lt;/span&gt; &lt;span class="s2"&gt;"does not overwrite it"&lt;/span&gt; &lt;span class="k"&gt;do&lt;/span&gt;
        &lt;span class="n"&gt;described_class&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;new&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="ss"&gt;logger: &lt;/span&gt;&lt;span class="n"&gt;null_logger&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;run&lt;/span&gt;

        &lt;span class="n"&gt;expect&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;order&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;reload&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;confirmed_at&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;to&lt;/span&gt; &lt;span class="n"&gt;be_within&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;second&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;of&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;timestamp&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
      &lt;span class="k"&gt;end&lt;/span&gt;
    &lt;span class="k"&gt;end&lt;/span&gt;

    &lt;span class="n"&gt;context&lt;/span&gt; &lt;span class="s2"&gt;"when order has a different status"&lt;/span&gt; &lt;span class="k"&gt;do&lt;/span&gt;
      &lt;span class="n"&gt;before&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="n"&gt;order&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;update&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="ss"&gt;status: &lt;/span&gt;&lt;span class="s2"&gt;"pending"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;

      &lt;span class="n"&gt;it&lt;/span&gt; &lt;span class="s2"&gt;"does not touch it"&lt;/span&gt; &lt;span class="k"&gt;do&lt;/span&gt;
        &lt;span class="n"&gt;described_class&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;new&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="ss"&gt;logger: &lt;/span&gt;&lt;span class="n"&gt;null_logger&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;run&lt;/span&gt;

        &lt;span class="n"&gt;expect&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;order&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;reload&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;confirmed_at&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;to&lt;/span&gt; &lt;span class="n"&gt;be_nil&lt;/span&gt;
      &lt;span class="k"&gt;end&lt;/span&gt;
    &lt;span class="k"&gt;end&lt;/span&gt;
  &lt;span class="k"&gt;end&lt;/span&gt;
&lt;span class="k"&gt;end&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Notice that we pass a &lt;code&gt;null_logger&lt;/code&gt; to keep the test output clean — no need to see backfill logs while running the test suite.&lt;/p&gt;

&lt;p&gt;These tests won't catch every edge case, but they give you enough confidence to run the backfill knowing the core logic has been verified.&lt;/p&gt;




&lt;h2&gt;
  
  
  Running the Backfill
&lt;/h2&gt;

&lt;p&gt;The script becomes very simple — it only decides whether we run a dry run or the real update:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ruby"&gt;&lt;code&gt;&lt;span class="c1"&gt;# script/backfill_confirmed_at.rb&lt;/span&gt;

&lt;span class="n"&gt;dry_run&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="no"&gt;ARGV&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;include?&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"--dry-run"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="no"&gt;Backfills&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="no"&gt;BackfillConfirmedAt&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;new&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="ss"&gt;dry_run: &lt;/span&gt;&lt;span class="n"&gt;dry_run&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;run&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Always Start With a Dry Run
&lt;/h3&gt;

&lt;p&gt;Before touching any production data, always preview what would change first:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;rails runner script/backfill_confirmed_at.rb &lt;span class="nt"&gt;--dry-run&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The output will look something like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;[Backfills::BackfillConfirmedAt] Starting | mode: DRY RUN
[Backfills::BackfillConfirmedAt] [DRY RUN] Would update Order #10042
[Backfills::BackfillConfirmedAt] [DRY RUN] Would update Order #10051
[Backfills::BackfillConfirmedAt] [DRY RUN] Would update Order #10063
...
[Backfills::BackfillConfirmedAt] Done. Processed: 6000
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This gives you a chance to verify a few things before anything is written:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Is the scope correct?&lt;/strong&gt; Spot-check a few IDs directly in the database.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Does the number of affected records match your expectations?&lt;/strong&gt; If you expected 6,000 rows but the dry run shows 10,000, something is wrong.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Are there any surprising records?&lt;/strong&gt; Maybe some orders have a status you didn't account for.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;A dry run costs you a few minutes.&lt;br&gt;
A botched live backfill can cost you hours.&lt;/p&gt;
&lt;h3&gt;
  
  
  Then Run It For Real
&lt;/h3&gt;

&lt;p&gt;When everything looks correct:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;rails runner script/backfill_confirmed_at.rb
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;If you want even more control, consider extending the backfill class with a &lt;code&gt;limit:&lt;/code&gt; option to test on a small subset first, or a configurable &lt;code&gt;batch_size:&lt;/code&gt; for larger datasets.&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;
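
&lt;p&gt;As a rough sketch of what those two options could look like, the snippet below uses a hypothetical in-memory &lt;code&gt;Row&lt;/code&gt; struct and a &lt;code&gt;BackfillSketch&lt;/code&gt; class in place of the real &lt;code&gt;Order&lt;/code&gt; scope, so it runs without Rails. The names and defaults are assumptions for illustration, not the code we shipped.&lt;/p&gt;

```ruby
# Hypothetical in-memory stand-in for the Order scope, so the sketch runs without Rails.
Row = Struct.new(:id, :status, :confirmed_at)

class BackfillSketch
  attr_reader :updated_ids

  # limit: and batch_size: are the illustrative options from the note above.
  def initialize(rows, limit: nil, batch_size: 1000)
    @rows = rows
    @limit = limit
    @batch_size = batch_size
    @updated_ids = []
  end

  def run
    # Same scope as the article: confirmed orders with a missing confirmed_at.
    scope = @rows.select { |row| [row.status, row.confirmed_at] == ["confirmed", nil] }
    # Cap the run to a small subset when a limit is given.
    scope = scope.first(@limit) if @limit

    # Walk the scope in slices of batch_size rows.
    scope.each_slice(@batch_size) do |batch|
      batch.each do |row|
        row.confirmed_at = Time.now
        @updated_ids.push(row.id)
      end
    end

    @updated_ids.size
  end
end

rows = (1..10).map { |id| Row.new(id, "confirmed", nil) }
puts BackfillSketch.new(rows, limit: 3, batch_size: 2).run
```

&lt;p&gt;In the real class, the same idea would probably map to passing &lt;code&gt;batch_size:&lt;/code&gt; through to &lt;code&gt;find_each&lt;/code&gt; and applying &lt;code&gt;limit&lt;/code&gt; to the scope.&lt;/p&gt;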




&lt;h2&gt;
  
  
  How to Safely Stop the Running Backfill
&lt;/h2&gt;

&lt;p&gt;One of the hidden benefits of this approach is that you can stop the script at any time — without worrying about leaving the data in a broken state.&lt;/p&gt;

&lt;p&gt;Because &lt;code&gt;find_each&lt;/code&gt; loads records in batches and each &lt;code&gt;update_columns&lt;/code&gt; call is committed on its own, every update is independent. If you stop the script after 2,000 records, those 2,000 rows are correctly updated and stay that way. The remaining records are simply untouched.&lt;/p&gt;

&lt;p&gt;To stop the script, a simple &lt;code&gt;Ctrl+C&lt;/code&gt; is enough. When you're ready to continue, just run it again. The scope automatically skips already-updated records and picks up where it left off.&lt;/p&gt;

&lt;p&gt;This is the key difference from &lt;code&gt;update_all&lt;/code&gt; or an inline migration. A large single query either completes fully or rolls back entirely. If something goes wrong halfway through, you're back to square one.&lt;/p&gt;

&lt;p&gt;With batch processing, interrupting is not a failure. It's just a pause.&lt;/p&gt;
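
&lt;p&gt;To make the pause-and-resume behaviour concrete, here is a minimal sketch that runs without Rails. The &lt;code&gt;Record&lt;/code&gt; struct, the &lt;code&gt;backfill&lt;/code&gt; method, and the &lt;code&gt;stop_after:&lt;/code&gt; parameter are hypothetical stand-ins: the struct plays the role of the orders table and &lt;code&gt;stop_after:&lt;/code&gt; simulates pressing &lt;code&gt;Ctrl+C&lt;/code&gt;.&lt;/p&gt;

```ruby
# Hypothetical in-memory rows standing in for the orders table.
Record = Struct.new(:id, :confirmed_at)

# Processes only rows still missing confirmed_at; stop_after simulates an interruption.
def backfill(records, stop_after: nil)
  processed = 0
  records.select { |r| r.confirmed_at.nil? }.each do |record|
    break if processed == stop_after
    # Each write stands alone, like one committed UPDATE.
    record.confirmed_at = Time.now
    processed += 1
  end
  processed
end

records = (1..5).map { |id| Record.new(id, nil) }
first_pass = backfill(records, stop_after: 2)   # interrupted after two rows
second_pass = backfill(records)                 # rerun handles the remaining three
puts "first: #{first_pass}, second: #{second_pass}"
```

&lt;p&gt;The second call needs no bookkeeping about where the first one stopped, because the scope itself only selects rows that are still broken.&lt;/p&gt;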




&lt;h2&gt;
  
  
  The Workflow We Follow Now
&lt;/h2&gt;

&lt;p&gt;&lt;em&gt;Not every data fix needs this treatment. If you're updating a handful of records, a quick one-liner in the console is perfectly fine.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;But once you're dealing with thousands of rows — especially when other systems depend on that data — it's worth slowing down and following a simple process.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. Understand the scope first&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Before writing a single line of code, run a &lt;code&gt;COUNT&lt;/code&gt; query on production. Know exactly how many records are affected and confirm what the correct value should be.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. Write the backfill class&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Extract all logic into a dedicated class in &lt;code&gt;lib/backfills/&lt;/code&gt;. Add a &lt;code&gt;dry_run&lt;/code&gt; option and logging from the start.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;3. Test it&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Write at least a few basic tests before running anything. This is the step that's easiest to skip under pressure, and the one you'll be most grateful for.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;4. Dry run on production&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Run with &lt;code&gt;--dry-run&lt;/code&gt; first and review the output carefully. Verify the record count matches your expectations and spot-check a few IDs directly in the database.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;5. Run it for real&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Only when the dry run looks correct. Monitor the logs as it runs — that's what they're there for.&lt;/p&gt;




&lt;h2&gt;
  
  
  Final Thoughts
&lt;/h2&gt;

&lt;p&gt;The irony of this whole situation was that the code fix took two minutes. The data fix took almost a day — not because it was technically hard, but because we wanted to do it right.&lt;/p&gt;

&lt;p&gt;That ratio is worth remembering. A careless backfill can easily cause more damage than the original bug. Treating it like real code — with tests, dry runs, and logging — is not over-engineering. It's just respect for the data your users depend on.&lt;/p&gt;

&lt;p&gt;Next time you're staring at thousands of broken records, resist the one-liner. Take the extra hour. Your future self will thank you.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;What does your backfill process look like? I'm curious whether others reach for a similar pattern — or something completely different. Let me know in the comments.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>rails</category>
      <category>ruby</category>
      <category>database</category>
      <category>webdev</category>
    </item>
    <item>
      <title>I Thought My Rails Query Was Fine — Until NULL Ate My Data</title>
      <dc:creator>Pavel Myslik</dc:creator>
      <pubDate>Wed, 11 Mar 2026 13:02:16 +0000</pubDate>
      <link>https://dev.to/pavelmyslik/i-thought-my-rails-query-was-fine-until-null-ate-my-data-13ca</link>
      <guid>https://dev.to/pavelmyslik/i-thought-my-rails-query-was-fine-until-null-ate-my-data-13ca</guid>
      <description>&lt;p&gt;I ran into this while working on a task where I needed to process all &lt;code&gt;contracts&lt;/code&gt; that were not coming from SAP.&lt;/p&gt;

&lt;p&gt;At first, the query looked perfectly fine:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ruby"&gt;&lt;code&gt;&lt;span class="no"&gt;Contract&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;where&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;not&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="ss"&gt;source: &lt;/span&gt;&lt;span class="s1"&gt;'sap'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then I double-checked the result by counting returned objects — and something didn’t add up.&lt;/p&gt;

&lt;p&gt;You write a Rails query.&lt;br&gt;
It looks correct.&lt;br&gt;
It runs without errors.&lt;/p&gt;

&lt;p&gt;But it's quietly hiding records from you.&lt;/p&gt;

&lt;p&gt;Welcome to one of the most common — and maybe most dangerous — SQL gotchas.&lt;/p&gt;


&lt;h3&gt;
  
  
  The Setup
&lt;/h3&gt;

&lt;p&gt;Imagine you have a &lt;code&gt;contracts&lt;/code&gt; table with a &lt;code&gt;source&lt;/code&gt; column — which is &lt;strong&gt;nullable&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Some contracts have:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;code&gt;source = 'sap'&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;source = 'web'&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;and some have &lt;code&gt;NULL&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Now you want to count all contracts that are &lt;strong&gt;not&lt;/strong&gt; from &lt;code&gt;'sap'&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;Sounds simple:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ruby"&gt;&lt;code&gt;&lt;span class="no"&gt;Contract&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;where&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;not&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="ss"&gt;source: &lt;/span&gt;&lt;span class="s1"&gt;'sap'&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;count&lt;/span&gt;
&lt;span class="c1"&gt;# =&amp;gt; 68&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;68 contracts. Looks reasonable.&lt;/p&gt;

&lt;p&gt;But to be sure, I double-checked using Ruby:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ruby"&gt;&lt;code&gt;&lt;span class="no"&gt;Contract&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;all&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;reject&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt;&lt;span class="n"&gt;c&lt;/span&gt;&lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="n"&gt;c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;source&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="s1"&gt;'sap'&lt;/span&gt; &lt;span class="p"&gt;}.&lt;/span&gt;&lt;span class="nf"&gt;size&lt;/span&gt;
&lt;span class="c1"&gt;# =&amp;gt; 310&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Wait... &lt;strong&gt;310?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Where did 242 records go?&lt;/p&gt;

&lt;h3&gt;
  
  
  The Problem: NULL Is Not a Value
&lt;/h3&gt;

&lt;p&gt;In SQL, &lt;code&gt;NULL&lt;/code&gt; does not mean &lt;em&gt;empty&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;It means &lt;strong&gt;unknown&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;That distinction changes everything.&lt;/p&gt;

&lt;p&gt;When SQL evaluates:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;source&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="s1"&gt;'sap'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;there are three possible outcomes:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="s1"&gt;'web'&lt;/span&gt;     &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="s1"&gt;'sap'&lt;/span&gt;  &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="k"&gt;TRUE&lt;/span&gt;  &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;included&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  
&lt;span class="s1"&gt;'sap'&lt;/span&gt;     &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="s1"&gt;'sap'&lt;/span&gt;  &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="k"&gt;FALSE&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;excluded&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  
&lt;span class="k"&gt;NULL&lt;/span&gt;      &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="s1"&gt;'sap'&lt;/span&gt;  &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;  &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;excluded&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That third row is the trap.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;NULL != 'sap'&lt;/code&gt; does not return &lt;code&gt;TRUE&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;It returns &lt;code&gt;NULL&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;And in SQL’s three-valued logic, the &lt;code&gt;WHERE&lt;/code&gt; clause only keeps rows where the condition is &lt;strong&gt;TRUE&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;FALSE&lt;/code&gt; is excluded.&lt;br&gt;
&lt;code&gt;NULL&lt;/code&gt; is also excluded.&lt;/p&gt;

&lt;p&gt;Silently.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;This isn’t just an issue with &lt;code&gt;!=&lt;/code&gt;. Any comparison or negation, such as &lt;code&gt;&amp;lt;&lt;/code&gt;, &lt;code&gt;&amp;gt;&lt;/code&gt;, &lt;code&gt;NOT LIKE&lt;/code&gt;, or &lt;code&gt;NOT IN&lt;/code&gt;, silently excludes rows where the column is &lt;code&gt;NULL&lt;/code&gt;.&lt;/p&gt;
&lt;/blockquote&gt;
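
&lt;p&gt;To make the three-valued logic concrete, here is a minimal plain-Ruby sketch. It is not how any database is implemented: &lt;code&gt;sql_not_equal&lt;/code&gt; is a hypothetical helper, &lt;code&gt;nil&lt;/code&gt; stands in for &lt;code&gt;NULL&lt;/code&gt;, and the sample data is made up.&lt;/p&gt;

```ruby
# Hypothetical helper: models SQL's `a != b`, with nil standing in for NULL.
def sql_not_equal(a, b)
  return nil if a.nil? || b.nil? # any comparison with NULL is unknown
  a != b
end

# Made-up sample data: two 'web' rows, one 'sap' row, two NULL rows.
sources = ['web', 'sap', nil, 'web', nil]

# WHERE keeps a row only when the condition is exactly TRUE.
kept_by_sql = sources.select { |s| sql_not_equal(s, 'sap') == true }

# Plain Ruby treats nil as just another value, so nil rows survive.
kept_by_ruby = sources.reject { |s| s == 'sap' }

puts kept_by_sql.size  # prints 2
puts kept_by_ruby.size # prints 4
```

&lt;p&gt;The gap between the two counts is exactly the number of &lt;code&gt;nil&lt;/code&gt; entries, which mirrors the missing rows in the query above.&lt;/p&gt;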

&lt;h3&gt;
  
  
  The Fix
&lt;/h3&gt;

&lt;p&gt;Here are three common ways to handle this safely, depending on your database and preference:&lt;/p&gt;

&lt;h4&gt;
  
  
  Option 1: Rails-style &lt;code&gt;.or&lt;/code&gt; with &lt;code&gt;where.not&lt;/code&gt;
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ruby"&gt;&lt;code&gt;&lt;span class="no"&gt;Contract&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;where&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;not&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="ss"&gt;source: &lt;/span&gt;&lt;span class="s1"&gt;'sap'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;or&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="no"&gt;Contract&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;where&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="ss"&gt;source: &lt;/span&gt;&lt;span class="kp"&gt;nil&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
        &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;count&lt;/span&gt;
&lt;span class="c1"&gt;# =&amp;gt; 310&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;Rails-native syntax.&lt;/li&gt;
&lt;li&gt;Includes all &lt;code&gt;NULL&lt;/code&gt; values.&lt;/li&gt;
&lt;li&gt;Works on any database supported by Rails.&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  Option 2: Explicitly include NULLs in SQL with &lt;code&gt;OR&lt;/code&gt;
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ruby"&gt;&lt;code&gt;&lt;span class="no"&gt;Contract&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;where&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"source != ? OR source IS NULL"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'sap'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;count&lt;/span&gt;
&lt;span class="c1"&gt;# =&amp;gt; 310&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;Simple SQL pattern.&lt;/li&gt;
&lt;li&gt;Works reliably across most databases (PostgreSQL, MySQL, SQLite).&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  Option 3: Use &lt;code&gt;IS DISTINCT FROM&lt;/code&gt; (PostgreSQL)
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ruby"&gt;&lt;code&gt;&lt;span class="no"&gt;Contract&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;where&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"source IS DISTINCT FROM ?"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'sap'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;count&lt;/span&gt;
&lt;span class="c1"&gt;# =&amp;gt; 310&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
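&lt;li&gt;Part of the SQL standard, but not available everywhere: PostgreSQL supports it, while MySQL offers the null-safe &lt;code&gt;&amp;lt;=&amp;gt;&lt;/code&gt; operator instead.&lt;/li&gt;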
&lt;li&gt;
&lt;code&gt;NULL&lt;/code&gt; is treated like a regular value.&lt;/li&gt;
&lt;/ul&gt;
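
&lt;p&gt;Why do Options 2 and 3 return the same rows? The plain-Ruby sketch below models both conditions. All three helpers are hypothetical illustrations with &lt;code&gt;nil&lt;/code&gt; standing in for &lt;code&gt;NULL&lt;/code&gt;; none of them is a database or Rails API.&lt;/p&gt;

```ruby
# Hypothetical helpers modelling SQL semantics; nil stands in for NULL.
def sql_not_equal(a, b)
  return nil if a.nil? || b.nil? # comparison with NULL is unknown
  a != b
end

# Three-valued OR: TRUE beats unknown, unknown beats FALSE.
def sql_or(x, y)
  return true if x == true || y == true
  return nil if x.nil? || y.nil?
  false
end

# IS DISTINCT FROM never yields NULL; it compares NULL like a value.
def sql_is_distinct_from(a, b)
  a != b
end

sources = ['web', 'sap', nil]

# Option 2: source != 'sap' OR source IS NULL
with_or = sources.select { |s| sql_or(sql_not_equal(s, 'sap'), s.nil?) == true }

# Option 3: source IS DISTINCT FROM 'sap'
with_distinct = sources.select { |s| sql_is_distinct_from(s, 'sap') == true }

puts with_or == with_distinct # prints true
```

&lt;p&gt;In this model, the &lt;code&gt;OR&lt;/code&gt; branch rescues the unknown result for &lt;code&gt;nil&lt;/code&gt; rows, while the distinct comparison never produces an unknown result in the first place.&lt;/p&gt;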

&lt;h3&gt;
  
  
  The Takeaway
&lt;/h3&gt;

&lt;p&gt;&lt;code&gt;NULL&lt;/code&gt; didn’t break my query. It behaved exactly as SQL intended.&lt;br&gt;&lt;br&gt;
That’s what makes it dangerous — no error, no warning, no sign anything went wrong. Just missing data, quietly waiting for you to notice.&lt;/p&gt;

&lt;p&gt;Once you understand that &lt;code&gt;NULL&lt;/code&gt; means &lt;strong&gt;unknown&lt;/strong&gt;, not &lt;em&gt;empty&lt;/em&gt;, the behavior starts to make sense.&lt;br&gt;&lt;br&gt;
Until then… it can silently cost you &lt;strong&gt;242 records&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Have you ever shipped a bug caused by &lt;code&gt;NULL&lt;/code&gt; hiding in a query?&lt;/em&gt;&lt;/p&gt;

</description>
      <category>rails</category>
      <category>webdev</category>
      <category>sql</category>
      <category>postgres</category>
    </item>
    <item>
      <title>Stop Using .any? the Wrong Way in Rails</title>
      <dc:creator>Pavel Myslik</dc:creator>
      <pubDate>Tue, 24 Feb 2026 15:30:27 +0000</pubDate>
      <link>https://dev.to/pavelmyslik/stop-using-any-the-wrong-way-in-rails-429e</link>
      <guid>https://dev.to/pavelmyslik/stop-using-any-the-wrong-way-in-rails-429e</guid>
      <description>&lt;p&gt;A single block passed to &lt;code&gt;.any?&lt;/code&gt; can silently load thousands of records into memory.&lt;/p&gt;

&lt;p&gt;No warnings. No errors. Just unnecessary objects.&lt;/p&gt;

&lt;p&gt;And most Rails developers don’t notice it.&lt;/p&gt;

&lt;p&gt;You’ve probably used both &lt;code&gt;.any?&lt;/code&gt; and &lt;code&gt;.exists?&lt;/code&gt; in your Rails app without thinking twice. They both answer the same simple question:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;Is there at least one record?&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;But under the hood, they can behave very differently.&lt;/p&gt;

&lt;p&gt;In this article, we’ll look at what actually happens when you call each method, when to use which, and how to avoid a common performance trap.&lt;/p&gt;




&lt;h3&gt;
  
  
  The Basics: Same Query, Same Result
&lt;/h3&gt;

&lt;p&gt;If you just need to check whether a relation contains any records at all, both &lt;code&gt;.any?&lt;/code&gt; and &lt;code&gt;.exists?&lt;/code&gt; generate the same efficient query.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ruby"&gt;&lt;code&gt;&lt;span class="n"&gt;user&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;posts&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;any?&lt;/span&gt;
&lt;span class="c1"&gt;# SELECT 1 AS one FROM "posts" WHERE "posts"."user_id" = 1 LIMIT 1&lt;/span&gt;

&lt;span class="n"&gt;user&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;posts&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;exists?&lt;/span&gt;
&lt;span class="c1"&gt;# SELECT 1 AS one FROM "posts" WHERE "posts"."user_id" = 1 LIMIT 1&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;No objects are loaded into memory.&lt;br&gt;
No full table scan.&lt;/p&gt;

&lt;p&gt;Both methods ask the database a simple yes/no question and return immediately after finding the first match.&lt;/p&gt;

&lt;p&gt;The same applies when you chain &lt;code&gt;.where&lt;/code&gt; — as long as you don’t pass a block to &lt;code&gt;.any?&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ruby"&gt;&lt;code&gt;&lt;span class="n"&gt;user&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;posts&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;where&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="ss"&gt;published: &lt;/span&gt;&lt;span class="kp"&gt;true&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;any?&lt;/span&gt;
&lt;span class="c1"&gt;# SELECT 1 AS one FROM "posts"&lt;/span&gt;
&lt;span class="c1"&gt;# WHERE "posts"."user_id" = 1 AND "posts"."published" = true&lt;/span&gt;
&lt;span class="c1"&gt;# LIMIT 1&lt;/span&gt;

&lt;span class="n"&gt;user&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;posts&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;where&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="ss"&gt;published: &lt;/span&gt;&lt;span class="kp"&gt;true&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;exists?&lt;/span&gt;
&lt;span class="c1"&gt;# SELECT 1 AS one FROM "posts"&lt;/span&gt;
&lt;span class="c1"&gt;# WHERE "posts"."user_id" = 1 AND "posts"."published" = true&lt;/span&gt;
&lt;span class="c1"&gt;# LIMIT 1&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;So if this is all you need, pick whichever reads better in your code.&lt;/p&gt;

&lt;p&gt;There’s no performance difference here.&lt;/p&gt;

&lt;h3&gt;
  
  
  Where Things Get Dangerous: &lt;code&gt;.any?&lt;/code&gt; With a Block
&lt;/h3&gt;

&lt;p&gt;The moment you pass a block to &lt;code&gt;.any?&lt;/code&gt;, Rails completely changes its behavior.&lt;/p&gt;

&lt;p&gt;Instead of asking the database, it loads every matching record into memory and filters in Ruby:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ruby"&gt;&lt;code&gt;&lt;span class="n"&gt;user&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;posts&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;any?&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt;&lt;span class="n"&gt;post&lt;/span&gt;&lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="n"&gt;post&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;published?&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="c1"&gt;# SELECT "posts".* FROM "posts" WHERE "posts"."user_id" = 1&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That single line:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Loads &lt;strong&gt;all posts&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Instantiates an ActiveRecord object for each one&lt;/li&gt;
&lt;li&gt;Iterates over them in Ruby&lt;/li&gt;
&lt;li&gt;Just to return a single boolean&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;It might look harmless in development.&lt;/p&gt;

&lt;p&gt;But in production?&lt;/p&gt;

&lt;p&gt;If a user has 50,000 posts, you just loaded 50,000 objects into memory to check if one of them is published.&lt;/p&gt;

&lt;h3&gt;
  
  
  Why This Happens
&lt;/h3&gt;

&lt;p&gt;Here’s how &lt;code&gt;.any?&lt;/code&gt; is implemented in &lt;a href="https://github.com/rails/rails/blob/main/activerecord/lib/active_record/relation.rb#L401-L406" rel="noopener noreferrer"&gt;Rails&lt;/a&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ruby"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;any?&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;args&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="kp"&gt;false&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="vi"&gt;@none&lt;/span&gt;

  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="k"&gt;super&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;args&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;present?&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="nb"&gt;block_given?&lt;/span&gt;
  &lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="n"&gt;empty?&lt;/span&gt;
&lt;span class="k"&gt;end&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If a block is given, Rails delegates to &lt;code&gt;Enumerable#any?&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;And &lt;code&gt;Enumerable#any?&lt;/code&gt; iterates with &lt;code&gt;each&lt;/code&gt;, which forces the relation to load every record into memory first.&lt;/p&gt;

&lt;p&gt;You’ve effectively moved filtering from SQL to Ruby.&lt;/p&gt;
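
&lt;p&gt;The dispatch can be imitated in plain Ruby. The sketch below is a toy stand-in, not ActiveRecord: &lt;code&gt;FakeRelation&lt;/code&gt; and its counters are hypothetical names, and the counters only mark where a real relation would run a cheap existence query versus a full load.&lt;/p&gt;

```ruby
# Hypothetical stand-in for a relation; this is not ActiveRecord.
class FakeRelation
  include Enumerable

  attr_reader :full_loads, :existence_checks

  def initialize(rows)
    @rows = rows
    @full_loads = 0
    @existence_checks = 0
  end

  # Enumerable is built on each, so every block-based call lands here.
  def each
    @full_loads += 1 # stands in for SELECT "posts".* FROM "posts" ...
    @rows.each { |row| yield row }
  end

  def empty?
    @existence_checks += 1 # stands in for SELECT 1 AS one ... LIMIT 1
    @rows.empty?
  end

  # Same shape as the method quoted above, minus the @none guard.
  def any?(*args)
    return super if !args.empty? || block_given?
    !empty?
  end
end

posts = FakeRelation.new([{ published: false }, { published: true }])

posts.any?                             # no block: existence check only
posts.any? { |post| post[:published] } # block: falls through to each

puts posts.existence_checks # prints 1
puts posts.full_loads       # prints 1
```

&lt;p&gt;In this toy version the block form never touches the cheap path, which is the same trade the real method makes.&lt;/p&gt;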

&lt;h3&gt;
  
  
  The Fix: Let the Database Do the Work
&lt;/h3&gt;

&lt;p&gt;Push the condition into SQL:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ruby"&gt;&lt;code&gt;&lt;span class="n"&gt;user&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;posts&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;exists?&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="ss"&gt;published: &lt;/span&gt;&lt;span class="kp"&gt;true&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="c1"&gt;# SELECT 1 AS one FROM "posts"&lt;/span&gt;
&lt;span class="c1"&gt;# WHERE "posts"."user_id" = 1 AND "posts"."published" = true&lt;/span&gt;
&lt;span class="c1"&gt;# LIMIT 1&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;One row&lt;/li&gt;
&lt;li&gt;One column&lt;/li&gt;
&lt;li&gt;Stops at the first match&lt;/li&gt;
&lt;li&gt;No object instantiation&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Same result. Much lower cost.&lt;/p&gt;

&lt;h3&gt;
  
  
   When &lt;code&gt;.any?&lt;/code&gt; Is Actually the Right Choice
&lt;/h3&gt;

&lt;p&gt;There is one important exception.&lt;/p&gt;

&lt;p&gt;If the relation is &lt;strong&gt;already loaded&lt;/strong&gt;, &lt;code&gt;.any?&lt;/code&gt; will not hit the database again.&lt;/p&gt;

&lt;p&gt;For example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ruby"&gt;&lt;code&gt;&lt;span class="n"&gt;users&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="no"&gt;User&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;includes&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="ss"&gt;:posts&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;users&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;each&lt;/span&gt; &lt;span class="k"&gt;do&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt;&lt;span class="n"&gt;user&lt;/span&gt;&lt;span class="o"&gt;|&lt;/span&gt;
  &lt;span class="n"&gt;user&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;posts&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;any?&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt;&lt;span class="n"&gt;post&lt;/span&gt;&lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="n"&gt;post&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;published?&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="c1"&gt;# no extra query&lt;/span&gt;
&lt;span class="k"&gt;end&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Because &lt;code&gt;posts&lt;/code&gt; were preloaded, &lt;code&gt;.any?&lt;/code&gt; works entirely in memory.&lt;/p&gt;

&lt;p&gt;In this case:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;.any?&lt;/code&gt; → no additional query&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;.exists?&lt;/code&gt; → forces a new SQL query&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;So using &lt;code&gt;.exists?&lt;/code&gt; here could actually introduce unnecessary database calls — potentially even an N+1 pattern.&lt;/p&gt;

&lt;p&gt;Rails internally checks whether the relation is loaded.&lt;br&gt;
If it is, &lt;code&gt;.any?&lt;/code&gt; behaves like a normal Ruby collection.&lt;/p&gt;

&lt;h3&gt;
  
  
  A Better Rule of Thumb
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Relation &lt;strong&gt;not&lt;/strong&gt; loaded → prefer &lt;code&gt;.exists?&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Relation &lt;strong&gt;already&lt;/strong&gt; loaded → &lt;code&gt;.any?&lt;/code&gt; is perfectly fine&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;.any?&lt;/code&gt; with a block → avoid on relations that are not already loaded&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;.present?&lt;/code&gt; for existence → avoid on unloaded relations, since it loads every record first&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Final Thought
&lt;/h3&gt;

&lt;p&gt;Before calling &lt;code&gt;.any?&lt;/code&gt;, ask yourself:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;Am I checking existence — or am I about to load an entire collection?&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Small differences in ActiveRecord APIs can have real production impact.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Have you ever spotted this in a real codebase?&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;This is part of a small series exploring subtle ActiveRecord behaviors that can impact performance.&lt;/p&gt;

</description>
      <category>rails</category>
      <category>ruby</category>
      <category>performance</category>
      <category>sql</category>
    </item>
  </channel>
</rss>
