Hello everyone,
I have just started working on my first project, on ranking influencers, and I have a few questions about it.
DESCRIPTION
The objective of this project is to rank social influencers by their influential power. The idea is to calculate a score for each influencer from a set of metrics and rank them by that score.
Metrics collected:
- username
- categories (the niche the influencer is in)
- influencer_type
- followers
- follower grow, follower_growth_rate
- highlightReelCount, igtvVideoCount, postsCount
- avg_likes, avg_comments
- likes_comments_ratio (comments per 100 likes, used as an authenticity indicator)
- engagement_rate
- authentic_engagement (the number of likes and comments that come from real people)
- post_per_week
- etc
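As a small worked example of one derived metric in the list, "comments per 100 likes" can be computed as sketched below. The function name and the handling of zero likes are assumptions made for illustration, and the numbers are made up.

```python
def likes_comments_ratio(avg_likes: float, avg_comments: float) -> float:
    """Comments per 100 likes, as described in the metric list."""
    if avg_likes <= 0:
        # Assumption: an account with no likes gets a ratio of 0.
        return 0.0
    return 100.0 * avg_comments / avg_likes


# Made-up numbers for illustration only.
example_ratio = likes_comments_ratio(avg_likes=2000, avg_comments=50)
print(example_ratio)
```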
Here's what the collected data looks like:
And here are the expected results:
I have tried a number of approaches for the ranking algorithm in my project, as follows:
a) Regression model
b) Classification model
c) Machine learning models such as SVM, Decision Tree, and Deep Neural Network
d) Learning-to-rank algorithms such as CatBoost
e) Any other suitable algorithm
QUESTION
Which of the algorithms above would be most appropriate for this project? If any of you have come across a similar project or have ideas about it, your suggestions would be much appreciated. Thank you in advance!
Top comments (2)
I guess a regression model is the way to go. Try scikit-learn's decision tree regressor or CatBoost's regression models. CatBoost is very good for data with many categorical features.
Also, you might want to consider which features to include in your regression model; I think username is not appropriate here. And don't forget to record the metrics (MSE, MAE, RMSE, etc.) so you can compare which model performs best.
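Here is a minimal sketch of that workflow with scikit-learn's `DecisionTreeRegressor`. It assumes you already have a numeric influence score per influencer to train on. All the data below is synthetic and the score formula is invented purely so the example runs; swap in your own columns.

```python
import numpy as np
from sklearn.metrics import mean_absolute_error, mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

# Synthetic stand-in data: every value here is made up for illustration.
rng = np.random.default_rng(0)
n = 200
usernames = [f"user_{i}" for i in range(n)]
followers = rng.integers(1_000, 1_000_000, size=n)
engagement_rate = rng.uniform(0.1, 10.0, size=n)
post_per_week = rng.uniform(0.5, 14.0, size=n)

# Hypothetical target: assumes a known influence score exists for training.
score = 10 * np.log10(followers) + 5 * engagement_rate + rng.normal(0, 1, size=n)

# username is an identifier, not a feature, so it is left out of X.
X = np.column_stack([followers, engagement_rate, post_per_week])
X_train, X_test, y_train, y_test = train_test_split(
    X, score, test_size=0.25, random_state=0
)

model = DecisionTreeRegressor(max_depth=5, random_state=0)
model.fit(X_train, y_train)
pred = model.predict(X_test)

# Record the error metrics so different models can be compared.
mse = mean_squared_error(y_test, pred)
mae = mean_absolute_error(y_test, pred)
rmse = float(np.sqrt(mse))
print(f"MSE={mse:.3f} MAE={mae:.3f} RMSE={rmse:.3f}")

# Rank every influencer by predicted score (rank 1 = highest score).
predicted_all = model.predict(X)
order = np.argsort(-predicted_all, kind="stable")
ranking = [
    (rank + 1, usernames[i], float(predicted_all[i]))
    for rank, i in enumerate(order)
]
print(ranking[:5])
```

The same metric code works unchanged for any other regressor, so you can loop over a few models and compare the numbers side by side.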
You might want to try some hyperparameter search too (grid search, random search, etc.) to fine-tune your model. I suggest using MLflow to keep track of your experiments.
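A grid search could look roughly like the sketch below, using scikit-learn's `GridSearchCV`. The data is synthetic and the parameter grid is just an example, not a recommendation. Experiment tracking with MLflow is left out to keep the example short.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.tree import DecisionTreeRegressor

# Synthetic data for illustration only; replace with your own features/target.
rng = np.random.default_rng(0)
X = rng.uniform(0, 1, size=(200, 3))
y = 3 * X[:, 0] + 2 * X[:, 1] + rng.normal(0, 0.1, size=200)

# Example grid: these values are arbitrary starting points.
param_grid = {
    "max_depth": [2, 4, 6],
    "min_samples_leaf": [1, 5, 10],
}

search = GridSearchCV(
    DecisionTreeRegressor(random_state=0),
    param_grid,
    scoring="neg_root_mean_squared_error",  # higher is better, so RMSE is negated
    cv=5,
)
search.fit(X, y)

best_params = search.best_params_
best_rmse = -search.best_score_  # flip the sign back to a plain RMSE
print(best_params, round(best_rmse, 3))
```

For random search, `RandomizedSearchCV` from the same module takes a similar set of arguments plus `n_iter`.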
Okie sure, really appreciate your reply. Will try out these few approaches!