ActiveRecord is one of those many things in Rails that can be both good and bad depending on the perspective, the size of the project, how it's used, etc.
It's loved when it performs a really good job on abstracting away the specifics of the DB being used (like any good ORM should do), when it makes really easy to define and bootstrap a fully functional app with the domain classes and the persistence mechanism in place, when it helps keeping the data valid and consistent through validators, when it facilitates the migration and rollback of DB schema changes, etc.
It's hated when it makes really easy to tighly couple the application domain and logic to the persistence layer, when it faciliates the proliferation of God objects since it's so easy to put more and more behavior inside the record models and pass them around everywhere, when your domain classes become a big ball of mud knowing too much about everything and with hooks everyhwere, when it makes your tests slow, etc.
But one thing that it should strive, apart from the pros and cons, is to maintain consistency on its behavior and usage. Recently I noticed a weird behavior with the
belongs_to association which is somewhat inconsistent with the other associations and hurts predictabiliy when using this association. In fact, it's such an easy-to-reproduce scenario that I don't know how I didn't notice that before, after years of developing with Rails.
Attention: the following examples that I'll use are unlikely scenarios that are not expected to happen on most applications. In case it happens, it's most likely because there is a bug in place. We should anyway be prepared for it.
Suppose you have the following in your Rails application:
create_table :products do |t| t.string :title end create_table :reviews do |t| t.references :product t.text :comment end class Product < ApplicationRercord has_many :reviews end class Review < ApplicationRecord belongs_to :product end
Suppose now that someone just removed the
reviews.product_id column but forgot to remove the related
has_many :reviews association from
Product. The next time we hit
product.reviews, what it should happen? If you're expecting an error to be raised, you got it right.
product.reviews # => ActiveRecord::StatementInvalid ([db_specific_error_class]: no such column: reviews.product_id)
Moreover, suppose that, instead of removing just the FK column from
reviews, the whole table is removed along with the model class. In this case, an error is still raised, although different, since the association's target class definition cannot be found.
product.reviews # => NameError (uninitialized constant Product::Review)
These two behaviors help a lot on tracking issues in the app, pointing remaining pieces of code that were forgotten somehow in the process of removing a feature from the app. The next time these pieces of code are executed, a big error would pop up in the error monitoring tool and would hardly slip under the radar.
Let's go back to the first scenario where the
reviews.product_id column was removed. Now suppose that the assocation
belongs_to :product in
Review was not removed (very unlikely, but still). What would then happen the next time the we hit
review.product? If you like consistency and guessed an error, you got that wrong, unfortunately.
review.product # => nil
That's right. It just silently returns
nil. The same
nil that would be returned if the
product_id column were still there but empty.
It gets even more interesting in the scenario where the whole table/model
Product is removed. What happens now with
review.product? If you're smart and noticed where this going and guessed
nil, you got it right.
Product.take # => NameError (uninitialized constant Product) review.product # => nil
Super weird. By returning
nil, we would never be able to spot these issues early. They would pass by silently and days could go by producing unexpected results until someone notices something is wrong. It also just does not make sense at all. At least for me. If the underlying DB column does not exist, why would the association still be returning
nil? We could just easily pollute an entire model with fake associations without any consequences.
class Review < ApplicationRecord belongs_to :i_dont belongs_to :care belongs_to :about belongs_to :valid belongs_to :associations end %i[i_dont care about valid associations].map do |association| review.send association end # => [nil, nil, nil, nil, nil]
In my opinion this goes very much against the robustness of
active_record itself and all its ways to keep data consistent and valid, acting as a mirrored version of the DB schema on steroids, to keep wrong and incorrect data out.
Assuming active_record latest version: 6.0.1
Digging deeply into active_record's source code, there is this method
find_target? in the
def find_target? !loaded? && foreign_key_present? && klass end
Its sole purpose is to check whether the target association needs to be found and loaded given the current conditions. This method is overriden in this class, meaning that the other associations use a more abstract and slightly different version to check if the target association must be loaded.
From the code, the target
belongs_to association is only loaded if it's not yet loaded AND if the foreign key is present AND if the association class is defined.
The problem is that
foreign_key_present? only checks whether the attribute is present. For sake of simplification, it would be something like
@foo_id.present?. It returns false either when the FK column exists but the value is empty or if the FK does not exist at all, since no verification to check whether the column exists is being made. The condition returns eagerly with
false and the target association is never tried to be loaded, returning
nil in the very end. That's why also it returns
nil even when class definitions are missing, since the
klass part of the condition is never evaluated and the error of
uninitialized constant Foo is never raised.
By simply switching the condition to
!loaded? && klass && foreign_key_present?, we partially solve the problem.
I opened a WIP pull request to trigger the discussion and check whether this behavior is intentionally and by design, since after I fixed this issue locally, a bunch of unit tests failed. I realized then that there are a lot of tests relying on dynamic
belongs_to associations lacking the corresponding FKs in the test suite.
I couldn't find no other information across the web depicting this issue. Let me know in the comments if you have thoughts or more information with reasons why this is the way it is.