Originally posted on Medium https://medium.com/@eyevinntechnology/personalize-the-experience-37f991db2745
One of the major trends in video streaming services over the last few years has been to personalize the content shown to each consumer. This can be either editorial recommendations for specific groups of users, or automatically generated lists of assets that you're expected to like based on earlier consumption or behavior.
How hard can it be?
How hard can it be, you might ask yourself?
Well, ask yourself these questions:
- Do I like everything in a specific genre?
- Do I like everything with a specific actor?
- Do I like everything that my friends like?
And to dive even deeper:
- Do I watch the same movies in the summer as during the holidays?
- Do I watch the same movies when it's sunny outside as when it rains?
- Do I watch the same kind of assets in the morning as in the evening?
It's hard, isn't it?
The expected output of a recommendation might, and probably will, vary depending on the context you are in right now, and therefore we need to provide recommendations that take that context into account.
Which solutions exist?
Metadata match
Looking at the recommendation solutions out there, the quality and complexity vary a lot. The most basic one, which nearly all streaming services start with, is a basic metadata matching of the assets in the consumer's consumption history against the rest of the catalog. I.e. if the consumer has watched an American action movie with Brad Pitt, they will probably get recommendations for other action movies, preferably with Brad Pitt.
But are we that simple as humans? Is the main actor so important that we choose which movie to watch based on it?
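To make the idea concrete, here is a minimal sketch of metadata matching. It assumes each asset carries `genres` and `actors` arrays; those field names, the sample catalog and the scoring by number of shared tags are illustrative assumptions, not taken from any real service.

```javascript
// Hypothetical catalog; the field names are assumptions for illustration.
const catalog = [
  { id: "a1", genres: ["Action"], actors: ["Brad Pitt"] },
  { id: "a2", genres: ["Action"], actors: ["Brad Pitt", "Angelina Jolie"] },
  { id: "a3", genres: ["Drama"], actors: ["Brad Pitt"] },
  { id: "a4", genres: ["Comedy"], actors: ["Jim Carrey"] },
];

// Count how many genre and actor tags two assets share.
function sharedTags(a, b) {
  const count = (xs, ys) => xs.filter((x) => ys.includes(x)).length;
  return count(a.genres, b.genres) + count(a.actors, b.actors);
}

// Score every unwatched asset by its tag overlap with the watch history.
function metadataMatch(watchedIds, assets) {
  const watched = assets.filter((a) => watchedIds.includes(a.id));
  return assets
    .filter((a) => !watchedIds.includes(a.id))
    .map((a) => ({
      id: a.id,
      score: watched.reduce((sum, w) => sum + sharedTags(a, w), 0),
    }))
    .filter((r) => r.score > 0)
    .sort((x, y) => y.score - x.score);
}

console.log(metadataMatch(["a1"], catalog));
// → [ { id: "a2", score: 2 }, { id: "a3", score: 1 } ]
```

Note how the Brad Pitt drama still shows up just because of the actor, which is exactly the weakness questioned above.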
Consumption match
The next step, following the metadata match recommendations, is often to match consumption history between users. This is the classic "Because you've watched X" carousel that a lot of streaming services show.
From my point of view this is a far better approach, and it gets even better over time as the dataset grows, both in videos watched and in consumers to compare against. If I have watched 10 movies that my friend has also watched, there's a big chance that we will both like a few more of the same ones.
But do we like everything that we've watched?
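A "Because you've watched X" carousel could be sketched roughly like this. The watch histories and the plain co-watch count are assumptions made for illustration; a real service would likely normalize for popularity and work on far larger data.

```javascript
// Hypothetical watch histories, keyed by user id.
const histories = {
  anna: ["x", "y", "z"],
  ben: ["x", "y"],
  cleo: ["x", "w"],
  dan: ["q"],
};

// For a seed asset, count how often each other asset appears in the
// history of users who also watched the seed.
function becauseYouWatched(seedId, allHistories) {
  const counts = new Map();
  for (const watched of Object.values(allHistories)) {
    if (!watched.includes(seedId)) continue;
    for (const assetId of new Set(watched)) {
      if (assetId === seedId) continue;
      counts.set(assetId, (counts.get(assetId) || 0) + 1);
    }
  }
  // Most frequently co-watched first.
  return [...counts.entries()]
    .sort((a, b) => b[1] - a[1])
    .map(([assetId, count]) => ({ assetId, count }));
}

console.log(becauseYouWatched("x", histories));
```

Here "y" ranks first because two of the three users who watched "x" also watched it. The sketch treats every watched asset as a positive signal, which is the assumption the question above puts in doubt.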
Ratings match
If we don't like everything we watch, how can we compare consumption patterns? Maybe we need some kind of rating system that contains both positives and, maybe more importantly, negatives?
Should we auto-detect what we think is a negative? Is a bounce after just a few minutes a negative? It could be that you want to continue later on, and you might just as well watch an entire movie that isn't that great, just to pass the time.
Should we implement a rating feature, where you vote thumbs up or down depending on your experience of the movie? Or will the threshold be too high, as it requires action from the consumer?
My approach
Of course there is no holy grail here, or someone would have implemented it by now. Still, I think you can implement a quite good solution by combining the ideas we've touched on here.
What if we:
- Implement a bounce within X minutes as a negative.
- Remove this negative if the user continues to watch later on.
- Implement a finished video as a positive.
- Remove this positive if the user rates it as negative.
- Add the ability to vote thumbs up and down on the asset.
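The rules above can be sketched as a small function that translates player and voting events into rating actions. The event names, the event shape and the 5-minute threshold are all hypothetical; only the four action names (like, dislike, unlike, undislike) come from this article.

```javascript
// Threshold for "bounce within X minutes"; the value is an assumption.
const BOUNCE_THRESHOLD_MINUTES = 5;

// Translate a playback or voting event into rating actions.
// Event names and shape are hypothetical, for illustration only.
function actionsForEvent(event) {
  switch (event.type) {
    case "stopped":
      // Leaving early counts as an implicit negative.
      return event.minutesWatched < BOUNCE_THRESHOLD_MINUTES ? ["dislike"] : [];
    case "resumed":
      // Coming back later cancels the implicit negative.
      return ["undislike"];
    case "finished":
      // Watching to the end counts as an implicit positive.
      return ["like"];
    case "thumbsUp":
      // An explicit positive overrides an implicit negative.
      return ["undislike", "like"];
    case "thumbsDown":
      // An explicit negative overrides an implicit positive.
      return ["unlike", "dislike"];
    default:
      return [];
  }
}

console.log(actionsForEvent({ type: "stopped", minutesWatched: 2 }));
// → [ "dislike" ]
```

Keeping this mapping in one place makes it easy to tune the threshold, or to drop the implicit signals entirely, without touching the recommendation engine.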
Finally, we can also add some filtering on the output of the recommendations. Should we really recommend typical Christmas movies and series during the rest of the year?
We might also add some bonus points if the metadata matches my preferences. I.e. this will only change the order of the recommendations already there, not generate new ones.
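As a rough sketch of this post-processing step: the `season` and `genres` fields, the sample metadata and the bonus weight of 1.5 are assumptions chosen for illustration.

```javascript
// Hypothetical asset metadata; field names are assumptions.
const assets = {
  m1: { genres: ["Comedy"], season: "christmas" },
  m2: { genres: ["Drama"] },
  m3: { genres: ["Action"] },
  m4: { genres: ["Action", "Comedy"] },
};

// Drop seasonal titles outside their season, then give a small bonus to
// assets matching the user's preferred genres. Only the order changes;
// nothing new is added to the list.
function filterAndRerank(recommendedIds, metadata, preferredGenres, currentSeason) {
  return recommendedIds
    .filter((id) => {
      const season = metadata[id].season;
      return !season || season === currentSeason;
    })
    .map((id, index, kept) => {
      const base = kept.length - index; // higher means recommended earlier
      const matches = metadata[id].genres.filter((g) => preferredGenres.includes(g)).length;
      return { id, score: base + 1.5 * matches };
    })
    .sort((a, b) => b.score - a.score)
    .map((r) => r.id);
}

console.log(filterAndRerank(["m1", "m2", "m3", "m4"], assets, ["Action"], "summer"));
// → [ "m3", "m2", "m4" ]
```

In summer the Christmas comedy is dropped and the action movie moves up one position. With "christmas" as the current season, the same call keeps "m1" at the top.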
Implementation
As the positives and negatives are the same backend-wise, even if triggered by different events, the implementation will be quite simple.
We want to be able to input a few events - like, dislike, unlike and undislike - i.e. set a rating and remove it later on if needed.
As our metadata filtering will be quite simple, we will apply it to the output generated by our recommendation engine.
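To illustrate how little state the four events need, here is a minimal in-memory sketch. The `RatingStore` class is hypothetical and is not how any particular library stores its data; a real service would persist this in a database.

```javascript
// Minimal in-memory rating store (hypothetical, for illustration only).
class RatingStore {
  constructor() {
    this.likes = new Map(); // userId -> Set of liked asset ids
    this.dislikes = new Map(); // userId -> Set of disliked asset ids
  }

  // Return the user's set in the given map, creating it on first use.
  static setFor(map, userId) {
    if (!map.has(userId)) map.set(userId, new Set());
    return map.get(userId);
  }

  like(userId, assetId) {
    // A like replaces any existing dislike for the same asset.
    RatingStore.setFor(this.dislikes, userId).delete(assetId);
    RatingStore.setFor(this.likes, userId).add(assetId);
  }

  dislike(userId, assetId) {
    // A dislike replaces any existing like for the same asset.
    RatingStore.setFor(this.likes, userId).delete(assetId);
    RatingStore.setFor(this.dislikes, userId).add(assetId);
  }

  unlike(userId, assetId) {
    RatingStore.setFor(this.likes, userId).delete(assetId);
  }

  undislike(userId, assetId) {
    RatingStore.setFor(this.dislikes, userId).delete(assetId);
  }
}

const store = new RatingStore();
store.like("user1", "movie1"); // e.g. triggered when the video finished
store.dislike("user1", "movie1"); // the user later votes thumbs down
console.log(store.likes.get("user1").has("movie1")); // → false
console.log(store.dislikes.get("user1").has("movie1")); // → true
```

Whether a positive came from a finished video or an explicit thumbs up makes no difference to the store, which is why the backend stays simple.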
Example
We at Eyevinn have chosen to build a simple example project that manages these inputs and generates recommendations from them. It uses the Jaccard coefficient to determine the similarity between users, and k-nearest neighbors to create recommendations from the resulting set of users.
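For readers unfamiliar with these two techniques, here is a simplified, standalone sketch of how they can be combined. It is not the module's actual code: it only looks at likes (ignoring dislikes), and the data, function names and scoring by summed similarity are assumptions for illustration.

```javascript
// Hypothetical liked-asset sets per user.
const likes = {
  me: new Set(["a", "b", "c"]),
  u1: new Set(["a", "b", "c", "d"]),
  u2: new Set(["a", "e"]),
  u3: new Set(["x", "y"]),
};

// Jaccard coefficient: size of the intersection over size of the union.
function jaccard(a, b) {
  const intersection = [...a].filter((x) => b.has(x)).length;
  const union = new Set([...a, ...b]).size;
  return union === 0 ? 0 : intersection / union;
}

// Pick the k most similar users, then score each asset the target has not
// liked by the summed similarity of the neighbors who liked it.
function recommend(userId, allLikes, k) {
  const mine = allLikes[userId];
  const neighbors = Object.keys(allLikes)
    .filter((other) => other !== userId)
    .map((other) => ({ other, similarity: jaccard(mine, allLikes[other]) }))
    .filter((n) => n.similarity > 0)
    .sort((x, y) => y.similarity - x.similarity)
    .slice(0, k);

  const scores = new Map();
  for (const { other, similarity } of neighbors) {
    for (const assetId of allLikes[other]) {
      if (mine.has(assetId)) continue;
      scores.set(assetId, (scores.get(assetId) || 0) + similarity);
    }
  }
  return [...scores.entries()].sort((x, y) => y[1] - x[1]).map(([assetId]) => assetId);
}

console.log(jaccard(likes.me, likes.u1)); // → 0.75
console.log(recommend("me", likes, 2)); // → [ "d", "e" ]
```

User u1 shares three of four assets with "me" and therefore weighs more than u2, so "d" ranks above "e".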
The project is a Node module that you can install through npm and use as the engine in your recommendation API, for example in Express routes as shown below.
Inputs
const eyeRecommender = require("@eyevinn/eye-recommender");
...
app.post("/vote/:userId/:action/:assetId", async (req, res) => {
  const userId = req.params.userId;
  const action = req.params.action;
  const assetId = req.params.assetId;
  switch (action) {
    case "like":
      await eyeRecommender.input.like(userId, assetId);
      break;
    case "dislike":
      await eyeRecommender.input.dislike(userId, assetId);
      break;
    case "unlike":
      // Removes a previously set like.
      // Typically used if you've triggered a like on video end, but the user votes dislike later on.
      await eyeRecommender.input.unlike(userId, assetId);
      break;
    case "undislike":
      // Removes a previously set dislike.
      // Typically used if you've triggered a dislike on bounce, but the user keeps watching or votes like.
      await eyeRecommender.input.undislike(userId, assetId);
      break;
    default:
      // Reject unknown actions instead of silently reporting success.
      return res.sendStatus(400);
  }
  res.set("Cache-Control", "public, no-cache");
  res.sendStatus(200);
});
Get recommendations and statistics
const eyeRecommender = require("@eyevinn/eye-recommender");
...
app.get("/recommendations/:userId", async (req, res) => {
  const userId = req.params.userId;
  // Get an array of recommendations in order from the most recommended to the least
  const recommendations = await eyeRecommender.statistics.recommendationsForUser(userId);
  // Here you would apply some filtering if you want to
  // Fetch assets by the asset id's, filter the metadata etc.
  res.send(recommendations);
});
Contributions
We built this purely to evaluate the options available and to look into how the current solutions could be evolved in a cheap, scalable way. We've made it open source for you to use and contribute to.
We would love engagement, opinions and contributions, with code as well as ideas.