Late last year, we introduced a means of more easily configuring the feed results; you can read more about the first experiment here.
I say “easier” in that it still requires staring at Ruby code and cutting a pull request. But the factors that go into placement in the feed are more transparent; that is, if you consider lots of set theory and combinatorial mathematics transparent.
A current constraint is that the implementation was developed to compete with the incumbent. In that time, we replaced the incumbent.
In this post I want to share the different levers that exist.
Below is a list of the 12 levers that presently exist for tweaking the feed:
- Weight to give based on the relative age of the article.
- Weight to give for the number of comments on the article from other users that the given user follows.
- Weight to give to the number of comments on the article.
- Weight to give based on the difference between experience level of the article and given user.
- Weight to give for featured or unfeatured articles.
- Weight to give when the given user follows the article's author.
- Weight to give when the given user follows the article's organization.
- Weight to give an article based on its most recent comment.
- Weight to give for the number of intersecting tags the given user follows and the article has.
- Weight to give for privileged users' reactions.
- Weight to give for the number of reactions on the article.
- Weight to give based on spamminess of the article.
Note: I copied these factors and descriptions from the code comments. As I’ve lived with this, I like the idea of renaming “factor” to “lever”.
The names of these “levers” are presently defined in the Articles::Feeds::WeightedQueryStrategy::SCORING_METHOD_CONFIGURATIONS constant. The names are not sacred nor all that important, except as a quick means to understand their description. Likewise, the descriptions are the current code comments for the above Ruby constant.
For each article that we query, each lever will assign that article a value between 0 and 1. We then multiply each of the lever values together to come up with a “relative rank”. Last, we sort articles from the greatest “relative rank” down to the lowest.
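To make the multiply-then-sort idea concrete, here is a minimal sketch in plain Ruby. It is not the actual implementation, which builds SQL; the article names, lever names, and lever values are all made up for illustration.

```ruby
# Hypothetical lever values for three articles; each value is between 0 and 1.
# These lever names are illustrative and are not the names used in the codebase.
ARTICLES = {
  "article_a" => { age: 0.9, comments: 1.0, following_author: 1.0 },
  "article_b" => { age: 1.0, comments: 0.5, following_author: 1.0 },
  "article_c" => { age: 0.8, comments: 0.8, following_author: 0.0 }
}.freeze

# Multiply every lever value together to get the "relative rank".
def relative_rank(lever_values)
  lever_values.values.reduce(1.0) { |product, value| product * value }
end

# Sort from the greatest "relative rank" down to the lowest.
RANKED = ARTICLES
  .map { |name, levers| [name, relative_rank(levers)] }
  .sort_by { |_name, rank| -rank }

RANKED.each { |name, rank| puts format("%-10s %.3f", name, rank) }
```

Note how a single lever returning 0 (as for article_c in this sketch) pushes the whole “relative rank” to 0, regardless of the other levers.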
The above levers were our best effort to look at the constellation of data associated with each article. There may be more levers we can develop. And we can certainly configure each of these levers to create a “catalog” of configurations.
Multiplying the lever values requires careful consideration; if one lever returns 0.9, then the greatest “relative rank” the article could have is 0.9. If two levers return 0.9, the greatest “relative rank” is 0.81 (i.e. 0.9 × 0.9). If all 12 levers return 0.9, we’d have a 0.28 “relative rank”. If all 12 levers return 0.8, we’d have a 0.069 “relative rank”.
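The arithmetic above is easy to check. A small sketch, assuming all 12 levers return the same value (the method name is hypothetical):

```ruby
LEVER_COUNT = 12

# The "relative rank" when every lever returns the same value.
def uniform_rank(value, lever_count = LEVER_COUNT)
  value**lever_count
end

[1.0, 0.9, 0.8].each do |value|
  puts format("all levers at %.1f => %.3f", value, uniform_rank(value))
end
# all levers at 1.0 => 1.000
# all levers at 0.9 => 0.282
# all levers at 0.8 => 0.069
```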
For a given query, we apply the factors consistently. So we could change our assumption of values between 0 and 1. We could also move from multiplying factors to adding.
The greater deviation is to move from multiplication to addition. Moving to addition (but keeping the 0 to 1 range) means that if one lever drops from 1 to 0, the “relative rank” would fall by at most 8.33% of its maximum (i.e. there are 12 levers, and 1 ÷ 12 is 0.0833).
Another consideration is that each of these levers is independent of the others. I suppose we could start creating dependencies, but at the moment my brain hurts thinking about how we’d do that with the given implementation and given how we build the SQL.
In any case, the levers are envisioned as having bounded ranges. That’s something we want to continue.
My hope in sharing this is to shine a bit of light onto the feed. It’s something that I, as the lead for Content Experience, am interested in stewarding along to improve the overall experience of folks engaging on DEV and across other Forems.
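A quick comparison of the two approaches, using made-up values: eleven levers at 1.0 and one lever at 0. Dividing the sum by the number of levers, to keep the result between 0 and 1, is an assumption of this sketch and not something we’ve decided on.

```ruby
# Eleven levers at 1.0 and one lever at 0.0 (hypothetical values).
lever_values = Array.new(11, 1.0) + [0.0]

# Current approach: multiply the lever values together.
multiplied = lever_values.reduce(1.0) { |product, value| product * value }

# Alternative: add the lever values, then divide by the lever count so the
# result stays in the 0 to 1 range (this normalization is an assumption).
added = lever_values.sum / lever_values.size

puts format("multiplied: %.4f", multiplied)
puts format("added:      %.4f", added)
# multiplied: 0.0000
# added:      0.9167
```

With multiplication, the single 0 removes the article from contention; with addition, it costs the article only one twelfth of its possible rank.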