DEV Community

Come with me on a journey through this website's source code by way of a bug

rhymes on October 09, 2018

This article started because I thought that researching a particular bug could be useful to understand devto's source code a little better. So I li...
Collapse
 
dmfay profile image
Dian Fay

O/RM methods like destroy_all really need to be renamed destroy_slowly_and_painfully_use_sql_if_you_can, I swear. This isn't the first time I've seen something like this. Batching ids to delete is one option; if you can just run a script or prepared statement to delete reactions by the article/post id that's even better since the engine can use an index on the foreign key (although the difference may be negligible at the scales we're talking about currently). If the reaction deletion from users' feeds is in-band it probably doesn't need to be, and managing the cache of reactions to a dead article is obviously superfluous. No idea why it's saving other articles since there isn't any metrics aggregation but that's already out of band like you mentioned.

Collapse
 
rhymes profile image
rhymes

O/RM methods like destroy_all really need to be renamed destroy_slowly_and_painfully_use_sql_if_you_can, I swear.

Ahahah, yeah pretty much.

This isn't the first time I've seen something like this

Nope, definitely not. A lot of ORM like AR encourage their users to surrender to all possible callbacks and many of the gems used by Rails applications hook into those callbacks without the developer knowing so after a while it's not uncommon to be hit by this issue.

Batching ids to delete is one option; if you can just run a script or prepared statement to delete reactions by the article/post id that's even better since the engine can use an index on the foreign key (although the difference may be negligible at the scales we're talking about currently)

Yeah, a post is never going to have millions of reactions, at least not for a while. By the way there's no foreign key between the two tables (articles and reactions) so no index to take advantage of.

Collapse
 
dmfay profile image
Dian Fay

You could add an index without the proper foreign key constraint, or vice versa, but without looking at the data model my guess is that having both is probably best in this scenario.

Thread Thread
 
rhymes profile image
rhymes

You can't easily add a foreign key between reactions and articles because of the "polymorphic" nature of these relation. A reaction can be attached to a comment, to an article and other stuff I guess. So what Rails does it setup a pair of fields called reactable_id and reactable_type like this:

PracticalDeveloper_development=# \d reactions
                                          Table "public.reactions"
     Column     |            Type             | Collation | Nullable |                Default
----------------+-----------------------------+-----------+----------+---------------------------------------
 id             | integer                     |           | not null | nextval('reactions_id_seq'::regclass)
 category       | character varying           |           |          |
 created_at     | timestamp without time zone |           | not null |
 points         | double precision            |           |          | 1.0
 reactable_id   | integer                     |           |          |
 reactable_type | character varying           |           |          |
 updated_at     | timestamp without time zone |           | not null |
 user_id        | integer                     |           |          |
Indexes:
    "reactions_pkey" PRIMARY KEY, btree (id)
    "index_reactions_on_category" btree (category)
    "index_reactions_on_reactable_id" btree (reactable_id)
    "index_reactions_on_reactable_type" btree (reactable_type)
    "index_reactions_on_user_id" btree (user_id)

On top of those it mounts what it calles polymorphic associations. The gist is that every select triggered by something like article.reactions becomes a SELECT * FROM reactions WHERE reactable_type = 'Article' AND reactable_id = 1234, same for the other queries

Thread Thread
 
dmfay profile image
Dian Fay

Gross. The site runs on Postgres though iirc, so you could establish a foreign key relationship between reactions and reactables as a parent table extended by articles, posts, and so on. But I don't know if ActiveRecord would play nicely with that kind of specialized structure.

Thread Thread
 
rhymes profile image
rhymes

No you can't use Postgresql inheritance in Rails, not easily.

There's a concept of single table inheritance but it's all virtual. The various entities share the same table and are distinguished by a type column. You can find the explanation here medium.com/@dcordz/single-table-in... - I've used it once and it was used to map entities with differed only in name (sensors with a uniform API)

Thread Thread
 
dmfay profile image
Dian Fay

Another option is a complex junction table with reaction_id, article_id, post_id, and a CHECK constraint ensuring that only one reactable foreign key can have a value. But that does add enough complexity to at least give me pause.

The indexing in the current model could still be improved since reactable_id is unreliable on its own and reactable_type is low-cardinality. A single index on (reactable_type, reactable_id) would be much more useful (including for bulk delete!).

Thread Thread
 
rhymes profile image
rhymes

Yeah, I think the index on both keys is a good comprise

Thread Thread
 
dmfay profile image
Dian Fay

It might be nice longterm to merge reactables if they're similar enough. Articles are almost exactly posts without parents, after all. Ancillary information can go into a specialization table with an optional 1:1 relationship.

Collapse
 
ben profile image
Ben Halpern

This is fabulous, I’ll have some thoughts to add color in the morning.

There are a couple main themes here for me:

  • what makes sense in one context does not in another, which is part of why ActiveRecord callbacks are not well-loved in general.
  • Certain “secondary” actions, like deleting articles, remain under-cared for amidst the focus on developing actions taken more frequently, and perhaps with more on the line.

This is exactly why we open sourced. Dives like this help make the code awesome.

Collapse
 
rhymes profile image
rhymes

Thanks Ben! Take your time, I know it's a lot :-)

what makes sense in one context does not in another, which is part of why ActiveRecord callbacks are not well-loved in general

Yeah, count me in. I tend to move the callback logic in explicit "service objects" or something like that, like your "labor" folder. It makes testing a little bit easier as well I guess. These kind of refactorings definitely require lots of digging, testing and code breaking because you have to go counter the Rails way of trusting implicit side effects and make them explicit. The gain is that code usually becomes as side effect free as possible and more understandable for new comers because it tends to be narrated in the controller:

def destroy
  CacheBuster.bust(article)
  SearchSeeker.byebye(article)
  RelationshipCounselor.drop_or_ghost_everyone_and_everything(article)
  article.destroy!
end

something like that :D

Collapse
 
ben profile image
Ben Halpern

We currently do use this sort of pattern, and even give it its own folder app/services, but we only use it sometimes.

Right now I'd say we're sort of caught in between "hacked to get things to work" and "decent architecture" state (most code could probably be described that way).

The problem is probably that we've sort of gone half-way with some of these things.

The fact that we have the labor folder and the services folder (and the liquid_tags) and the models folders is indicative of some indecision combined with lack of taking the time to actually take the time for dedicated large-scale refactoring.

There are different schools of thoughts in terms of folder structure here (should it all go in models, with subfolders?)

If you have any opinions here, I'm all ears.

PS, we actually define our own timeout at about 9 seconds. RACK_TIMEOUT accepts some a Rails-magicky ENV variable called RACK_TIMEOUT_SERVICE_TIMEOUT. We could extend this longer, but everything we do really should finish sooner, so it would be better to fix the code IMO.

Thread Thread
 
rhymes profile image
rhymes

The fact that we have the labor folder and the services folder (and the liquid_tags) and the models folders is indicative of some indecision combined with lack of taking the time to actually take the time for dedicated large-scale refactoring.

I think it's normal :) A way to refactor could be choosing one style and slowly convert the rest to it

There are different schools of thoughts in terms of folder structure here (should it all go in models, with subfolders?)

Dunno. I've never liked the "fat model" approach because it leads to giant files with a lot of different business logic. For example, where do you put a method that operates on a user and a comment? In user? In comment? 42?

One thing that I like is having "third party services" clients in the same folder, especially if they are used in multiple part of the business logic.

I've also used presenters in the past to group the display logic and (some) query objects to hide complex queries and be able to test them in isolation.

These are a few articles on the topic of how to structure or refactor Rails code:

Another thing I would consider is to make more explicit what is an asynchronous job. The code base right now uses delayed_job directly with its various .delay, _without_delay and handle_asynchronously. Refactoring all of this into Active Jobs would have the effect of make jobs explicit and produce a nice app/jobs folder as a central repository for all the async stuff. This way you can test it in isolation if needed or if in the future you outgrow delayed job you know where to go to make the changes.

PS, we actually define our own timeout at about 9 seconds. RACK_TIMEOUT accepts some a Rails-magicky ENV variable called RACK_TIMEOUT_SERVICE_TIMEOUT.

Ah! Good to know :) Well, it's entirely possible that the wall is hit at 9 seconds then.

We could extend this longer, but everything we do really should finish sooner, so it would be better to fix the code IMO.

Agreed

Thread Thread
 
rhymes profile image
rhymes

I know it's a lot, but the great thing about refactoring is that it can be (mostly) done one step a time :-)