Vishnu Chilamakuru

Originally published at Medium


How to build Content Recommendation Using Elastic Search

When we read an article on a news website, Medium, dev.to, and so on, we usually see extra sections such as Recommended Articles or Similar Articles. These list a few more articles that match the content of the one we are reading, or that are suggested based on our reading history.

Broadly, article recommendation can be done in two ways.

  • Collaborative Filtering
  • Content similarity

Collaborative Filtering:

  • Collaborative filtering is a method of making automatic predictions (filtering) about the interests of a user by collecting preferences or taste information from many users (collaborating).

  • This can be implemented using machine learning techniques that identify groups of users with similar behaviour.

  • Another way of identifying similar groups of users is to use a graph database such as Neo4j or ArangoDB, where you can build a graph of users connected via their interests, activity on the website, purchase patterns, and so on, and then identify correlated user groups.
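As a rough illustration of the collaborative idea, here is a minimal Python sketch. It is not a production recommender: the read histories, user names, and the choice of Jaccard similarity are all assumptions made for the example.

```python
def jaccard(a, b):
    """Jaccard similarity between two sets: |a & b| / |a | b|."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend_for(user, reads, top_n=3):
    """Suggest unread articles, weighted by how similar each other user is."""
    seen = reads[user]
    scores = {}
    for other, other_reads in reads.items():
        if other == user:
            continue
        sim = jaccard(seen, other_reads)
        if sim == 0.0:
            continue  # users with nothing in common contribute nothing
        for article in other_reads - seen:
            scores[article] = scores.get(article, 0.0) + sim
    # Highest score first; ties broken by article id for a stable order
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [article for article, _ in ranked[:top_n]]


# Hypothetical read history: user -> set of article ids
reads = {
    "alice": {1121, 1817, 2031},
    "bob": {1121, 1817, 1948},
    "carol": {1552},
}

print(recommend_for("alice", reads))  # [1948]
```

Here alice and bob share two articles, so bob's remaining article is suggested to alice, while carol has no overlap with anyone and gets no suggestions.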

Content Similarity:

  • Content similarity is the degree of similarity between two articles, based on their textual content (the terms appearing in them).

  • This can be implemented using information retrieval techniques such as BoW (Bag of Words), TF-IDF, etc.
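To make the TF-IDF idea concrete, here is a small pure-Python sketch that turns a few short texts into TF-IDF vectors and compares them with cosine similarity. The tokenizer, the smoothed IDF formula, and the sample sentences are assumptions chosen for the example; this is not how Elasticsearch computes its scores.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def tfidf_vectors(docs):
    """Build one {term: weight} dict per document using a smoothed IDF."""
    tokenized = [tokenize(d) for d in docs]
    n = len(tokenized)
    # Document frequency: in how many documents each term appears
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        vectors.append(
            {t: c * (math.log((1 + n) / (1 + df[t])) + 1) for t, c in tf.items()}
        )
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


# Illustrative sentences, loosely based on the article titles in this post
docs = [
    "Amla and Bavuma put South Africa in control",
    "Amla makes century as South Africa build lead",
    "Messi scores as Argentina beat Bolivia",
]
v0, v1, v2 = tfidf_vectors(docs)
print(cosine(v0, v1) > cosine(v0, v2))  # True
```

The two cricket sentences share the terms "amla", "south" and "africa", so they score higher against each other than against the football sentence, which shares no terms with the first one.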

In this blog post, I will explain how to implement content similarity using Elasticsearch, whose More Like This query uses TF-IDF to pick an article's most representative terms and then searches for other articles that contain them. I took a sample articles dataset from Kaggle (articles from thenews website) for this exercise.

This dataset contains 2692 articles, of which 1408 are sports-related and the remaining 1284 are business-related. I will explore the sports articles in this post.

Let’s look at the interesting part: the implementation.


Below is a sample article stored in Elasticsearch, which is about cricket.

Image from Firstpost
{
"id": 1121,
"content": "PRETORIA: A patient century stand between Hashim Amla and Temba Bavuma put South Africa in a commanding position at 223 for four at tea on day four of the final Test against England on Monday.Amla, a regular thorn in England´s side, was not out on 96 as he chases a second century in the match with Bavuma unbeaten on 63. The hosts led by 356 runs on a pitch that has turn, movement and uneven bounce.The pair have put on 117 for the fifth wicket and kept England wicketless between lunch and tea after the touring side had made good inroads in the opening session when a fired-up James Anderson grabbed two early wickets and Ben Stokes another.South Africa, who are likely to be down to three front-line bowlers in the second innings with seamer Kyle Abbott struggling due to a hamstring problem, must decide what target they want to set England with four sessions remaining.They will be mindful of overworking their depleted attack as they chase a consolation victory in the test with England already having claimed the series.The previous highest chase successful in Test matches at Centurion Park was 251 by England in 2000, though the pitch was only used for two days in that match after rain spoiled the contest and the teams decided to forfeit an innings each to force a result.Anderson had earlier taken his number of test wickets to 433, one short of Indian great Kapil Dev in sixth place on the all-time list.He first induced a rash drive from opener Stephen Cook (25) that provided a catch for wicketkeeper Jonny Bairstow and two balls later grabbed the big scalp of home captain AB de Villiers, lbw without scoring.It was the third duck in a row for De Villiers, comfortably his worst run of form in Test cricket since he made his debut in 2004.Stokes removed JP Duminy (29) caught behind by Bairstow to leave South Africa sweating at 106 for four, before Amla and Bavuma edged the home side ahead again. (Reuters)",
"date": "1/25/2016",
"title": "Amla Bavuma put South Africa in driving ",
"category": "sports"
}
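For readers who want to load similar documents themselves, the sketch below shows one way to turn a list of article dicts into the newline-delimited payload expected by the Elasticsearch Bulk API. The index name `news_articles` is taken from the query later in this post; reusing each article's `id` field as the document `_id`, and the helper name `to_bulk_lines`, are assumptions for the example.

```python
import json


def to_bulk_lines(articles, index_name="news_articles"):
    """Build a Bulk API body: one action line, then one source line, per article."""
    lines = []
    for article in articles:
        # Assumption: the article's own "id" is reused as the document _id
        action = {"index": {"_index": index_name, "_id": str(article["id"])}}
        lines.append(json.dumps(action))
        lines.append(json.dumps(article))
    # The Bulk API requires the body to end with a newline
    return "\n".join(lines) + "\n"


# Trimmed, illustrative article (content shortened for the example)
articles = [
    {
        "id": 1121,
        "title": "Amla Bavuma put South Africa in driving ",
        "category": "sports",
    },
]
payload = to_bulk_lines(articles)
print(payload)
```

The resulting string can be sent as the body of a `POST /_bulk` request with the `application/x-ndjson` content type.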

The above article is mostly about Hashim Amla, Temba Bavuma, South Africa, England, and Test cricket. Now, let’s see the top 4 articles matching this content (article id 1121, as shown above).

Below are the recommended articles which have similar content:

  • “title”: “Injured Amla stands firm as South Africa build lead”
  • “title”: “Amla makes century De Villiers falls for 88”
  • “title”: “Amla and Stephen Cook lead South Africa to 329/5”
  • “title”: “Root Stokes fire up England ” (Eng Vs SA Test match)

As you can see, most of the recommended articles are in context with the original article (id 1121).

I used Elasticsearch’s More Like This query to identify articles whose content matches that of the current article.

Sample Elastic Search More Like This Query:

GET /news_articles/_search
{
  "from": 0,
  "size": 4,
  "query": {
    "more_like_this": {
      "fields": ["title", "content"],
      "like": [{ "_id": 1121 }],
      "min_term_freq": 1,
      "min_doc_freq": 5,
      "max_query_terms": 20
    }
  }
}
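If you want to generate this query for any article id, a small helper along these lines may be useful. It is only a sketch: the function name `build_mlt_query` and its keyword arguments are made up for the example, and it just builds the request body as a Python dict without contacting Elasticsearch.

```python
import json


def build_mlt_query(article_id, fields=("title", "content"), size=4,
                    min_term_freq=1, min_doc_freq=5, max_query_terms=20):
    """Build a More Like This search body for one already-indexed article."""
    return {
        "from": 0,
        "size": size,
        "query": {
            "more_like_this": {
                # Fields whose text is compared
                "fields": list(fields),
                # Reference an indexed document by id (ids are strings in Elasticsearch)
                "like": [{"_id": str(article_id)}],
                # Ignore terms occurring fewer times than this in the source article
                "min_term_freq": min_term_freq,
                # Ignore terms that appear in fewer documents than this
                "min_doc_freq": min_doc_freq,
                # Upper bound on how many terms are selected for the generated query
                "max_query_terms": max_query_terms,
            }
        },
    }


body = build_mlt_query(1121)
print(json.dumps(body, indent=2))
```

The printed JSON can be pasted under `GET /news_articles/_search`, or passed as the request body by whichever HTTP or Elasticsearch client you use.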

Let’s look at one more example, this time for football news.

Image from Business Insider
{
"id": 1817,
"content": "Argentina had a scare when Lionel Messi had to be substituted after incurring a back injury during a 1-0 win over rugged Honduras in a Copa America Centenario warm-up friendly on Friday.</strongMessi walked off in pain in the 64th minute in San Juan after a clash with Honduras substitute Oliver Morazan 10 days before Argentina's opening Group D match against Chile.The team later dispelled fears of a serious injury, saying on their Twitter account that Messi had “suffered bruising on the left of his lower back and ribs.”Gonzalo Higuain scored the only goal for Argentina.In other warm-up matches for the June 3-26 tournament in the United States, Edinson Cavani struck twice as Uruguay overcame Trinidad and Tobago 3-1 and title holders Chile were upset 2-1 by Jamaica.Higuain struck after half an hour for Argentina when he received a low cross from the left by Marcos Rojo with his back to goal, turned central defender Maynor Figueroa and chipped left-footed over goalkeeper Donis Escober.Argentina meet Chile in Santa Clara, California, on June 6 and also face Bolivia and Panama in Group D.Uruguay, without the injured Luis Suarez, came from a goal down after Jomal Williams had put the Trinidadians ahead in the seventh minute in Montevideo.Cavani equalized with a 26th-minute penalty for a foul by Weslie John on midfielder Nicolas Lodeiro and put Uruguay ahead in the 39th with a shot from the right at a corner.Matias Vecino, who made his debut against Brazil in a World Cup qualifier in March, increased Uruguay’s lead seven minutes after halftime with his first international goal.Uruguay face Mexico in Glendale, Arizona on June 5 before also meeting Jamaica and Venezuela in Group C.Jamaica, who meet Venezuela in Chicago on June 5, stunned a Chile side fielding several reserves in Vina del Mar scoring through Clayton Donaldson and Simon Dawkins before substitute Nicolas Castillo pulled one back late on.",
"date": "5/28/2016",
"title": "Messi worries Argentina back injury",
"category": "sports"
}

The above article is mostly about Lionel Messi and Argentina. Now, let’s see the top 4 articles matching this content (article id 1817, as shown above).

Below are the recommended articles which have similar content:

{
"id": 2031,
"content": "Foxborough, United States: Lionel Messi scored a record-equalling 54th international goal as Argentina outclassed Venezuela 4-1 to book their place in the semi-finals of the Copa America Centenario here Saturday.The Barcelona superstar scored his fourth goal of the tournament in the 60th minute at the Gillette Stadium to equal Gabriel Batistuta´s long-time Argentina international goalscoring record.Messi also created a goal for Gonzalo Higuain, who scored twice, and second half substitute Erik Lamela as Argentina set up a semi-final against the United States in Houston on Tuesday.Messi is determined to lead Argentina to their first trophy in 23 years after a series of agonizing near-misses which include defeats in the finals of last year´s Copa America and the 2014 World Cup final.Messi´s genius created the opening goal after eight minutes, with the playmaker lofting a sublime curling pass to pick out Higuain´s run into the area.Napoli forward Higuain still had plenty to do but skillfully directed a first time strike past Venezuela goalkeeper Dani Hernandez for a wonderful goal.Argentina remained firmly in control but were jolted in the 27th minute when Nicolas Gaitan picked up a needless yellow card for a kick on Alexander Gonzalez that will see him suspended for the semi-final.Argentina´s dominance clearly had Venezuela rattled, and the Vinotinto´s defensive unease led to the next goal.A slack backpass from Arquimedes Figuera went straight to Higuain in space, and the striker coolly rounded Hernandez to stroke home his second of the night to make it 2-0 after 29 minutes.But just when it looked as if the floodgates would open, Venezuela rallied impressively and had Argentina rocking with a series of chances.<br/> <br/> West Bromwich Albion striker Salomon Rondon forced a save from Sergio Romero and hit the woodwork in quick succession.Romero was left scrambling to save once again just before half-time, when fullback Rolf Feltscher´s long-range shot 
deflected off Gabriel Mercado and looped up and over the Argentina keeper.Venezuela looked to have a justifiable claim for a penalty when Higuain brought down Josef Martinez in the box but Mexican referee Roberto Garcia was unmoved. <br/> Seconds later though Garcia was pointing to the spot after Romero clumsily bundled over Martinez for a clear penalty.But Venezuela´s pressure went unrewarded when Luis Manuel Seijas tried a risky Panenka down the middle which Romero read and gathered with ease.Venezuela were unable to renew their period of pressure in the second half, and Messi effectively put the contest to bed with his record-equalling goal in the 60th minute, which came after Argentina pressed high to force a turnover.Gaitan won the tackle and passed to Messi, who quickly played a one-two with the Benfica winger before poking his finish between Hernandez´s legs.Rondon pulled a goal back with a header in the 70th minute, but Messi´s response was instant, threading a pass to Lamela whose shot crept in at Hernandez´s near post.",
"date": "6/19/2016",
"title": "Messi record as Argentina thrash Venezu",
"category": "sports"
},
{
"id": 1948,
"content": "CHICAGO: Lionel Messi scored a magical hat-trick in 19 minutes as Argentina cruised into the Copa America Centenario quarter-finals with a 5-0 rout of Panama on Friday.Messi had been forced to delay his debut in the tournament after a slow recovery from a lower back injury, with Argentina coach Gerardo Martino surprisingly naming him amongst the substitutes once more.But Messi wasted no time in stamping his genius on the tournament after coming on in the 61st minute to replace Augusto Fernandez with Argentina leading the Group D game 1-0.An electrifying cameo from the Barcelona superstar began with his first goal of the tournament on 68 minutes, finishing with aplomb after Gonzalo Higuain´s shot bounced off a Panamanian defender into his path.But there was no question of luck with his second, with Messi curling a wonderful free-kick into the top corner to bring the Soldier Field crowd to its feet in the 78th minute.With Panama´s heads down, Messi then duly completed his quick-fire hat-trick -- skipping clear of a marker inside the penalty area to finish emphatically for 4-0.There was still time for Messi to have a hand in Argentina´s fifth goal of the evening, providing a sublime pass in the build-up to Sergio Aguero´s close range headed effort in the 90th minute.Aguero´s Manchester City team-mate Nicolas Otamendi had opened the scoring for Argentina in the seventh minute, heading home an Angel Di Maria free-kick from close range.The victory sees Argentina, convincing 2-1 winners over Chile in their opening game, guaranteed a spot in the last eight with one group game remaining.The only blemish on the evening was an injury to Angel Di Maria, who limped off in the first half to extend his miserable record of injuries in major tournaments.Argentina are now firmly on course to top Group D, with only one game against pointless Bolivia to come.Friday´s win marked a dazzling return to the spotlight for Messi, who is determined to help lead Argentina to their 
first major international tournament victory for 23 years.Messi, a veteran of Argentina´s agonizing defeats in the 2014 World Cup final and 2015 Copa America final, had arrived in the United States nursing a back injury sustained during a friendly victory against Honduras last month.He was late joining up with the squad after returning to Spain to give evidence in his tax fraud trial.",
"date": "6/11/2016",
"title": "Magical Messi grabs hat trick as Argentina romp into quarter",
"category": "sports"
},
{
"id": 1552,
"content": "CORDOBA, ARGENTINA: Lionel Messi scored his 50th international goal to help Argentina beat Bolivia 2-0 in a 2018 World Cup qualifier on Tuesday.</strongThe win put Argentina third in the South American group with 11 points from six matches, two points behind joint leaders Ecuador and Uruguay.Ecuador suffered their first loss in the group, going down 3-1 in Colombia, while Uruguay scored a 1-0 home win over Peru.Argentina defender Gabriel Mercado, who scored the winner in Thursday’s 2-1 away victory over Chile, opened the scoring in the 21st minute against Bolivia.Argentina were awarded a penalty in the 29th minute after Ronald Eguino fouled midfielder Ever Banega, and Messi made no mistake from the spot to net his 50th goal for Argentina, putting him within six of Gabriel Batistuta’s record.“This (campaign) is very long and what’s important is that we won and remain on course,” Messi told TyC Sports. “I’m happy with goal number 50 but more because we won and this helps to keep growing.”Argentina coach Gerardo Martino said it had been crucial to get the two wins.“It was fundamental to get the six points,” he told TyC Sports of the two wins. “I think we played a serious match and won well.”Argentina could have gone ahead after 10 seconds with a quick attack down the left from the opening kickoff but Angel Di Maria’s shot was blocked by goalkeeper Carlos Lampe and Banega hit the bar with the rebound.Di Maria limped off after half an hour and was replaced by Angel Correa.Argentina squandered several other chances to score against a side they put 12 goals past in two friendlies last year. They are a point ahead of Colombia and Chile, who came from behind to beat Venezuela 4-1.Brazil and Paraguay have nine points after their 2-2 draw in Asuncion.The top four teams in the 10-nation group qualify for the 2018 World Cup finals, while the fifth-placed side goes into an intercontinental playoff.",
"date": "3/30/2016",
"title": "Messi scores 50th Argentina goal in 2 0 win over Bolivi",
"category": "sports"
},
{
"id": 2076,
"content": "NEW JERSEY: Chile face Argentina in Sunday's Copa America final for the right to call themselves South America's dominant team but perhaps an even bigger question for football fans the world over regards whether Lionel Messi can finally win a major international title.</strongThe Barcelona forward has won every trophy possible with the Spanish club but he has lost three finals with Argentina, including in 2014 World Cup Final in 2014 and the Copa America last year.Sunday's game against Chile in New Jersey gives Messi a chance to end both his personal hoodoo and that of Argentina, who have not won a major title since lifting the Copa America in Ecuador in 1993.Getting to three finals in a row is impressive, said Messi, whose first decider was a 3-0 loss to Brazil in the 2007 Copa America. I hope we can win the Cup that we so desire.Argentina lost to Chile on penalties in the final last year and Messi said the squad were better prepared this time around.You learn all the time, said Messi, who turned 29 on Friday.We have been working together for another year, we are stronger as a group and we've really grown in a lot of ways.The five-times world player of the year has been outstanding at the Centenary Copa America, even though he played the first three games as a substitute after injuring his back in a warm-up game.His sublime free kick in the 4-0 win over the United States took him on to 55 goals and above Gabriel Batistuta as Argentina's all-time leading goalscorer.Messi called the performance against the U.S. 
perfect and he will not have forgotten that Argentina beat Chile 2-1 in their opening match on June 6.RED HOT CHILEHowever, the Chileans have improved since, beating Bolivia and Panama before hammering Mexico 7-0 in what was undoubtedly the performance of the tournament.Coach Juan Antonio Pizzi, who replaced Jorge Sampaoli in January, is now settled in the job and he has Chile playing the same high-paced pressing and super-fast counter attacks that make them such an exciting team to watch.This team has created an identity, the Argentine-born Pizzi said after the semi-final.It's a group of winners, I can see that just talking with them. That's not because they win games because we don't win every time but in their heads they are convinced they are going to win. That mentality allows them to grow stronger and gives them the confidence to keep going.Chile will have the dynamic Artur Vidal back after suspension and Pizzi hopes central midfielder Marcelo Diaz will recover from the muscle injury that kept him out the 2-0 win over Colombia in the semi-final.They are on a high and confident that they will do the double over their neighbours.But they will not have their home fans behind them this time and there is one other detail. The last time Messi played at the MetLife stadium was also in June and also against a South American side.Argentina beat Brazil 4-3 in a friendly in 2012. Messi scored a hat-trick.",
"date": "6/25/2016",
"title": "Messi primed to end Argentina drought Copa fi",
"category": "sports"
}
  • “title”: “Messi record as Argentina thrash Venezuela”
  • “title”: “Magical Messi grabs hat trick as Argentina romp into quarter”
  • “title”: “Messi scores 50th Argentina goal in 2 0 win over Bolivia”
  • “title”: “Messi primed to end Argentina drought Copa fi”
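A title list like the one above can be pulled out of the search response with a few lines of Python. The sketch below assumes the standard response layout, where matches sit under `hits.hits` and each hit carries its document in `_source`; the sample response is trimmed by hand for illustration and is not actual output.

```python
def titles_from_response(response):
    """Return (id, title) pairs from a search response, in ranked order."""
    hits = response.get("hits", {}).get("hits", [])
    return [(hit["_id"], hit["_source"].get("title", "").strip()) for hit in hits]


# Hand-trimmed, illustrative response (only the fields used here)
sample_response = {
    "hits": {
        "hits": [
            {"_id": "2031", "_source": {"title": "Messi record as Argentina thrash Venezu"}},
            {"_id": "1948", "_source": {"title": "Magical Messi grabs hat trick as Argentina romp into quarter"}},
        ]
    }
}

for doc_id, title in titles_from_response(sample_response):
    print(doc_id, title)
```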

Almost all the recommended articles are in context with the original article (id 1817).

Conclusion:

Overall, the Elasticsearch More Like This query helps you identify articles with similar content quickly and efficiently, and it can give you decent recommendations based on text content alone.

Thank you for your time!
If you liked the article, please share it. You can also follow me on Twitter for more posts on technology, startups, and leadership.
