DEV Community

appaloosa.io
appaloosa.io

Posted on

3 1

Ruby memory, ActiveRecord and Draper

At Appaloosa we often process CSV. When we process thousands of CSV lines of users that need to be added to (or removed from) a group, we are seeing a huge increase in memory. Sometimes the Heroku dynos cannot manage it.

We will explain how we jump from this when processing 4_000 users (look at blue line but also time):

To that:

memory profiler

We used stackprof, memory_profiler. We look at tool like speedscope for stackprof results and always it point at heavy use of ActiveRecord. Also we noticed in “Allocated String Report” of memory_profiler a string allocated multiples times.

With 4000 users:

Allocated String Report
-----------------------------------
   1619100  "decorated?"
   1619100  .rvm/gems/ruby-2.5.3@app/gems/activerecord-5.2.1/lib/active_record/attribute_methods.rb:279

"decorated?" where does it come from?

It’s instantiated here: .rvm/gems/ruby-2.5.3@app/gems/activerecord-5.2.1/lib/active_record/attribute_methods.rb:281

def respond_to?(name, include_private = false)
return false unless super
case name
when :to_partial_path
name = "to_partial_path".freeze
when :to_model
name = "to_model".freeze
else
name = name.to_s # ◀️◀️◀️◀️◀️ here
end
#...

If we add a puts caller when "decorated?" is instantiated we have this:

draper-3.0.1/lib/draper/decoratable/equality.rb:21:in `test_for_decorator'
draper-3.0.1/lib/draper/decoratable/equality.rb:8:in `=='
activerecord-5.2.1/lib/active_record/associations/collection_association.rb:281:in `index'
activerecord-5.2.1/lib/active_record/associations/collection_association.rb:281:in `add_to_target'
activerecord-5.2.1/lib/active_record/associations/collection_association.rb:437:in `block in concat_records'
activerecord-5.2.1/lib/active_record/associations/collection_association.rb:435:in `each'
activerecord-5.2.1/lib/active_record/associations/collection_association.rb:435:in `concat_records'
activerecord-5.2.1/lib/active_record/associations/has_many_association.rb:116:in `concat_records'
activerecord-5.2.1/lib/active_record/associations/has_many_through_association.rb:27:in `concat_records'
activerecord-5.2.1/lib/active_record/associations/collection_association.rb:119:in `block in concat'
activerecord-5.2.1/lib/active_record/associations/collection_association.rb:134:in `block in transaction'
activerecord-5.2.1/lib/active_record/connection_adapters/abstract/database_statements.rb:254:in `block in transaction'
activerecord-5.2.1/lib/active_record/connection_adapters/abstract/transaction.rb:239:in `block in within_new_transaction'
ruby-2.5.3/lib/ruby/2.5.0/monitor.rb:226:in `mon_synchronize'
activerecord-5.2.1/lib/active_record/connection_adapters/abstract/transaction.rb:236:in `within_new_transaction'
activerecord-5.2.1/lib/active_record/connection_adapters/abstract/database_statements.rb:254:in `transaction'
activerecord-5.2.1/lib/active_record/transactions.rb:212:in `transaction'
activerecord-5.2.1/lib/active_record/associations/collection_association.rb:133:in `transaction'
activerecord-5.2.1/lib/active_record/associations/collection_association.rb:119:in `concat'
activerecord-5.2.1/lib/active_record/associations/has_many_through_association.rb:21:in `concat'
activerecord-5.2.1/lib/active_record/associations/collection_proxy.rb:1056:in `<<'
app/models/assign_users_to_group_line.rb:53:in `block in create_and_add_to_group'
app/models/assign_users_to_group_line.rb:51:in `tap'
app/models/assign_users_to_group_line.rb:51:in `create_and_add_to_group'
app/models/assign_users_to_group_line.rb:31:in `create_from'
app/workers/assign_users_to_group_worker.rb:94:in `add_line'
app/workers/assign_users_to_group_worker.rb:57:in `block in parse_csv_and_import_users'

First we didn’t understand how we can jump from Draper to active_record/associations/collection_association.rb like:

.rvm/gems/ruby-2.5.3@app/gems/draper-3.0.1/lib/draper/decoratable/equality.rb:8:in `=='
.rvm/gems/ruby-2.5.3@app/gems/activerecord-5.2.1/lib/active_record/associations/collection_association.rb:281:in `index'

The code on Draper part is:

module Draper
module Decoratable
module Equality
# Compares self with a possibly-decorated object.
#
# @return [Boolean]
def ==(other) # ◀️◀️◀️◀️◀️◀️◀️◀️◀️◀️◀️◀️◀️◀️◀️ called from ActiveRecord 🧐
super || Equality.test_for_decorator(self, other)
end
# Compares an object to a possibly-decorated object.
#
# @return [Boolean]
def self.test(object, other)
return object == other if object.is_a?(Decoratable)
object == other || test_for_decorator(object, other)
end
# @private
def self.test_for_decorator(object, other)
other.respond_to?(:decorated?) && other.decorated? && # <<<<<<<<<<< call 1 - decorated?
other.respond_to?(:object) && test(object, other.object)
end
end
end
end

Then the code in active_record/associations/collection_association.rb:281 looks like:

def add_to_target(record, skip_callbacks = false, &block)
if association_scope.distinct_value
index = @target.index(record) # ◀️◀️◀️◀️◀️◀️◀️◀️◀️◀️◀️◀️◀️◀️◀️
end
replace_on_target(record, index, skip_callbacks, &block)
end

We jump from index = @target.index(record) to Draper::Decoratable::Equality.== 🤔.

But how?

Christophe Maximin (thanks a lot, Christophe) came with this answer:

(Wild guess by just reading the AR code) Because if the same user (same primary key) is already in users, it just replaces that old object with the new object you’re passing with <<, which is why it needs to know the index of the old one. And to know if records are the same, it needs to call ==.

If you want a full example of how it is called, you can use [TracePoint](https://ruby-doc.org/core-2.5.3/TracePoint.html). Without and with having == in Integer:

x=[1,2,3];
TracePoint.new(:call, :c_call){ |tp|
puts "#{tp.defined_class} #{tp.method_id} #{tp.path}:#{tp.lineno}"
}.enable { x.index(2) }
# Array index (irb):4
# => 1
x=[1,2,3];
class Integer; def ==(_); super; end; end
# => :==
TracePoint.new(:call, :c_call){ |tp|
puts "#{tp.defined_class} #{tp.method_id} #{tp.path}:#{tp.lineno}"
}.enable { x.index(2) }
# Array index (irb):9
# Integer == (irb):5
# Comparable == (irb):5
# Integer <=> (irb):5
# => 1

Tracepoint “Establishes proc as the handler for tracing” calls :call: for call to Ruby method :c_call: for a call to a C-language routine

What happens on a real app?

To be sure we were not testing a side effect of our app, we created a new rails app, with 2 models, Draper installed. Then we monkey patched add_to_target for ActiveRecord.

module CoreExtensions
module ActiveRecord
module Associations
module CollectionAssociation
def add_to_target(record, skip_callbacks = false, &block)
puts "#" * 60
puts "lib/core_extensions.rb:7\n"
puts caller
puts
TracePoint.new(:call, :c_call){ |tp|
puts tp.defined_class
puts " -> #{tp.method_id}"
puts " -> #{tp.path}:#{tp.lineno}"
}.enable { @target.index(record) }
puts
puts "*" * 60
super
end
end
end
end
end

We put the code to reproduce the concatenation in a rake task.

If we add two records rails concat_records:comments[2] on the second record we have this interesting result:

-> index
-> code/draper_allocations/lib/core_extensions.rb:17
Draper::Decoratable::Equality
-> ==
-> .rvm/gems/ruby-2.5.3/gems/draper-3.0.1/lib/draper/decoratable/equality.rb:7
ActiveRecord::Core
-> ==
-> .rvm/gems/ruby-2.5.3/gems/activerecord-5.2.1/lib/active_record/core.rb:420
BasicObject
-> ==
-> .rvm/gems/ruby-2.5.3/gems/activerecord-5.2.1/lib/active_record/core.rb:421
Kernel
-> class
-> .rvm/gems/ruby-2.5.3/gems/activerecord-5.2.1/lib/active_record/core.rb:422
........
view raw block10.Total hosted with ❤ by GitHub

We can see that we are calling index, then Draper::Decoratable::Equality (that override ==), then ActiveRecord:Core, etc...

The issue is when the array of record we concat contains lot’s of ActiveRecord objects. Every time we add one element, we redo all of those calls. Also Draper in is Equality method add more calls, and one of the call adds :decorated? string. All of this chain have a cost.

Let’s measure it.

Measuring allocation with Draper, without Draper, without concatenation

With Draper (branch)

Draper gem installed. No decorators defined but == is overridden by Draper as we can see.

article_1 = Article.create
article_2 = Article.create
TracePoint.new(:call, :c_call){ |tp|
puts tp.defined_class
puts " -> #{tp.method_id}"
puts " -> #{tp.path}:#{tp.lineno}"
}.enable { article_1 == article_2 }
::Decoratable::Equality
-> ==
-> .rvm/gems/ruby-2.5.3/gems/draper-3.0.1/lib/draper/decoratable/equality.rb:7
ActiveRecord::Core
-> ==
-> .rvm/gems/ruby-2.5.3/gems/activerecord-5.2.1/lib/active_record/core.rb:420
BasicObject
-> ==
-> .rvm/gems/ruby-2.5.3/gems/activerecord-5.2.1/lib/active_record/core.rb:421
view raw block11.Total hosted with ❤ by GitHub

If we run a measure with memory_profiler: rails concat_records:comments[2000]. We get:

Total allocated: 327.93 MB (4862043 objects)

Total retained:  17.67 MB (172370 objects)

And we can see also:

Allocated String Report

-----------------------------------

1999000  "decorated?"

1999000  .rvm/gems/ruby-2.5.3/gems/activerecord-5.2.1/lib/active_record/attribute_methods.rb:282

Without Draper (branch)

First we don’t have any more the very present allocated string "decorated?". First allocated string is ”” with 40185 occurrences. That’s a gain of 1 958 815 strings.

Allocated String Report

-----------------------------------

40185  ""

24000  .rvm/gems/ruby-2.5.3/gems/activesupport-5.2.1/lib/active_support/tagged_logging.rb:23

If we run a measure with memory_profiler rails concat_records:comments[2000]. We get:

Total allocated: 247.97 MB (2863011 objects)

Total retained:  17.67 MB (172370 objects)

It’s little bit better.

Without concat but with Draper (branch)

Same as before without Draper we don’t have any more very present allocated string "decorated?":

Allocated String Report

-----------------------------------

40185  ""

24000  .rvm/gems/ruby-2.5.3/gems/activesupport-5.2.1/lib/active_support/tagged_logging.rb:23

And…

Total allocated: 218.73 MB (2377978 objects)

Total retained:  31.95 kB (284 objects)

This is a huge gain but we don’t validate that we don’t create a duplicated entry into the join table. But this is something easy to handle.

How this can happening?

  • If their is a distinct on the association scope like this:
class Post < ActiveRecord::Base
has_many :comments, -> { distinct }
end
class Comment < ActiveRecord::Base
belongs_to :post
end
view raw block18.rb hosted with ❤ by GitHub
  • Then when ActiveRecord needs to concatenate. It will call @target.index(record). If @target have a lot's of elements this will consume more some CPU and memory.

  • And if Draper is installed this will add extra cost because Draper override == called from index. This override will allocate extra strings we don't need ("decorated?").

Looking in index plus using Draper is a logarithmic cost for iteration per second and memory. More element you have in "target" slower it gets.

So careful when using distinct in a scope. Also careful with Draper.

Here is a diagram of what is happening

Diagram of index search and Draper overrideDiagram of index search and Draper override

In our code base the patch looks like:

Thanks

Thanks to Appaloosa Team, Christophe Maximin, Ryuta Kamizono (kamipo), Nate Berkopec and the Rails Speed Community Slack.

👋

You should subscribe to our blog to be notified whenever a new article is available!

This article was written by Benoit Tigeot of Appaloosa’s dev team.

Sentry image

See why 4M developers consider Sentry, “not bad.”

Fixing code doesn’t have to be the worst part of your day. Learn how Sentry can help.

Learn more

Top comments (0)

Heroku

This site is built on Heroku

Join the ranks of developers at Salesforce, Airbase, DEV, and more who deploy their mission critical applications on Heroku. Sign up today and launch your first app!

Get Started

👋 Kindness is contagious

Please leave a ❤️ or a friendly comment on this post if you found it helpful!

Okay