Rails find_each/ find_in_batches pitfalls

#rails #activerecord

Every rails-developer knows about method find_each or find_in_batches (if you check the implementation for the first one you find that it uses the second one under the hood) from ActiveRecord. Even more, it is a good practice to use these methods when you need to iterate throw a big amount of records in DB.
If you don't familiar with it you can check this https://apidock.com/rails/ActiveRecord/Batches/find_each.

Looks like, we always can/must use find_each instead of each when we need to iterate throw some bunch of records. We also thought this way. But one of our applications struggled with DB: load for DB was around 95%, it was obvious that DB is a bottleneck.

I'd started to analyze slow-query logs in MySQL and reports in NewRelic. It was quite simple to find the place in the code which produce this SQL query:

 Survey.select(:id, :offer_uuid).active.where(provider: provider).where.not(survey_id: current_surveys).find_in_batches(batch_size: 10) do |survey_batch|
  DeactivateSurveysService.perform(survey_batch, provider)
end

Because of using where.not MySQL should fetch all records in order to sort them by id (this sorting used in find_each/find_in_batches) and only after this takes 10 records - it can very slow if a scope of the records is big how we had.

CONCLUSION

It is good practice using find_each/find_in_batches in order to not consume a lot of memory but need to be aware of what resulting SQL is.