Every rails-developer knows about method
find_in_batches (if you check the implementation for the first one you find that it uses the second one under the hood) from ActiveRecord. Even more, it is a good practice to use these methods when you need to iterate throw a big amount of records in DB.
If you don't familiar with it you can check this https://apidock.com/rails/ActiveRecord/Batches/find_each.
Looks like, we always can/must use
find_each instead of
each when we need to iterate throw some bunch of records. We also thought this way. But one of our applications struggled with DB: load for DB was around 95%, it was obvious that DB is a bottleneck.
I'd started to analyze slow-query logs in MySQL and reports in NewRelic. It was quite simple to find the place in the code which produce this SQL query:
Survey.select(:id, :offer_uuid).active.where(provider: provider).where.not(survey_id: current_surveys).find_in_batches(batch_size: 10) do |survey_batch| DeactivateSurveysService.perform(survey_batch, provider) end
Because of using
where.not MySQL should fetch all records in order to sort them by id (this sorting used in
find_each/find_in_batches) and only after this takes 10 records - it can very slow if a scope of the records is big how we had.
It is good practice using
find_in_batches in order to not consume a lot of memory but need to be aware of what resulting SQL is.