<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Matea Pesic</title>
    <description>The latest articles on DEV Community by Matea Pesic (@matea16).</description>
    <link>https://dev.to/matea16</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F982523%2Ff3c738f8-b112-4444-ba4f-395c82ef0344.jpeg</url>
      <title>DEV Community: Matea Pesic</title>
      <link>https://dev.to/matea16</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/matea16"/>
    <language>en</language>
    <item>
      <title>Security Analysis with JupiterOne’s Starbase and Memgraph</title>
      <dc:creator>Matea Pesic</dc:creator>
      <pubDate>Tue, 22 Aug 2023 16:36:38 +0000</pubDate>
      <link>https://dev.to/memgraph/security-analysis-with-jupiterones-starbase-and-memgraph-138f</link>
      <guid>https://dev.to/memgraph/security-analysis-with-jupiterones-starbase-and-memgraph-138f</guid>
      <description>&lt;p&gt;Starbase is an open-source graph-based security analysis tool that unifies all of JupiterOne’s integrations into one. It collects assets and relationships from services and systems, including cloud infrastructure, SaaS applications, security controls, and more, into an intuitive graph visualization. With over 115 open-source graph integrations, Starbase collaborates with your existing toolkit enabling easy and insightful cyber security analysis.&lt;/p&gt;

&lt;p&gt;In this article, we’ll dig into Starbase, guiding you through the setup of two example integrations and enabling Starbase to work with Memgraph for easy ingestion and visualization of your graph data.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--p3ARwUhY--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://public-assets.memgraph.com/security-analysis-with-starbase-and-memgraph/starbase-logo.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--p3ARwUhY--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://public-assets.memgraph.com/security-analysis-with-starbase-and-memgraph/starbase-logo.png" alt="starbase logo" width="726" height="722"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Prerequisites
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;Installed &lt;a href="https://yarnpkg.com/"&gt;Yarn&lt;/a&gt; package manager.&lt;/li&gt;
&lt;li&gt;Installed &lt;a href="https://nodejs.org/en"&gt;Node.js&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;A running Memgraph instance—visit Memgraph’s docs for instructions on how to &lt;a href="https://memgraph.com/docs/memgraph/installation"&gt;install&lt;/a&gt; and &lt;a href="https://memgraph.com/docs/memgraph/connect-to-memgraph"&gt;connect&lt;/a&gt; to Memgraph.&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  Setting up Starbase
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;To kick-start your Starbase setup, first, you need to clone the &lt;a href="https://github.com/JupiterOne/starbase"&gt;JupiterOne/Starbase&lt;/a&gt; repo into your local directory and ensure you have &lt;strong&gt;Yarn&lt;/strong&gt; and &lt;strong&gt;Node.js&lt;/strong&gt; installed.&lt;/li&gt;
&lt;li&gt;Once you’ve successfully cloned the repository and installed the prerequisites, place yourself in the terminal in the directory where you cloned the repo and run the &lt;code&gt;yarn&lt;/code&gt; command. The command installs all of the necessary project dependencies.&lt;/li&gt;
&lt;li&gt;The next step is setting up configurations for your integration of choice. You can find a list of all integrations on JupiterOne’s GitHub repo. Moving forward, we are going to explore two options for possible integration, &lt;a href="https://github.com/jupiterone/graph-zoom"&gt;Zoom&lt;/a&gt;, and &lt;a href="https://github.com/jupiterone/graph-github"&gt;GitHub&lt;/a&gt;.&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  Setting up integrations
&lt;/h2&gt;

&lt;p&gt;In order to set up an integration, you need to register an account in the system the integration targets for ingestion and obtain the necessary API credentials. Starbase leverages credentials from external services to authenticate and collect data. When Starbase is started, it reads configuration data from a single configuration file named &lt;code&gt;config.yaml&lt;/code&gt; at the root of the project.&lt;/p&gt;

&lt;h3&gt;
  
  
  Zoom integration
&lt;/h3&gt;

&lt;p&gt;In order to configure the Zoom integration, we need to create a Zoom app to retrieve the needed credentials: &lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Go to the &lt;a href="https://marketplace.zoom.us/"&gt;Zoom App Marketplace&lt;/a&gt; and sign into your Zoom account.&lt;/li&gt;
&lt;li&gt;In the top right corner, go to the Develop dropdown menu and select Build App.&lt;/li&gt;
&lt;li&gt;Choose to create an &lt;strong&gt;OAuth&lt;/strong&gt; type of app.&lt;/li&gt;
&lt;li&gt;Take note of your &lt;strong&gt;Account ID&lt;/strong&gt;, &lt;strong&gt;Client ID&lt;/strong&gt;, and &lt;strong&gt;Client secret&lt;/strong&gt; which we’ll need for the configuration file later on.&lt;/li&gt;
&lt;li&gt;In the Scopes section, add &lt;code&gt;group:read:admin&lt;/code&gt;, &lt;code&gt;role:read:admin&lt;/code&gt;, &lt;code&gt;user:read:admin&lt;/code&gt;, and &lt;code&gt;account:read:admin&lt;/code&gt;.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;After you’ve successfully created your Zoom App, open up the starbase repo in your editor of choice and create your &lt;code&gt;config.yaml&lt;/code&gt; file. This is an example of a &lt;code&gt;config.yaml&lt;/code&gt; file for Zoom integration:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="nx"&gt;integrations&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
  &lt;span class="o"&gt;-&lt;/span&gt;
    &lt;span class="nx"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;graph&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="nx"&gt;zoom&lt;/span&gt;
    &lt;span class="nx"&gt;instanceId&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;testInstanceId&lt;/span&gt;
    &lt;span class="nx"&gt;directory&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;integrations&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="nx"&gt;graph&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="nx"&gt;zoom&lt;/span&gt;
    &lt;span class="nx"&gt;gitRemoteUrl&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;https&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="c1"&gt;//github.com/JupiterOne/graph-zoom.git&lt;/span&gt;
    &lt;span class="nx"&gt;config&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
      &lt;span class="nx"&gt;ACCOUNT_ID&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nx"&gt;ACCOUNT_ID&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;
      &lt;span class="nx"&gt;CLIENT_ID&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nx"&gt;CLIENT_ID&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;
      &lt;span class="nx"&gt;CLIENT_SECRET&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nx"&gt;CLIENT_SECRET&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;
      &lt;span class="nx"&gt;SCOPES&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;group:read:admin role:read:admin user:read:admin account:read:admin&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  GitHub integration
&lt;/h3&gt;

&lt;p&gt;In order to configure GitHub integration, we need to create a GitHub app to retrieve the needed credentials: &lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Go to &lt;a href="https://github.com/settings/apps"&gt;GitHub Apps&lt;/a&gt; and choose to create a new GitHub App.&lt;/li&gt;
&lt;li&gt;Name your app, enter a homepage URL (in this case, you can use JupiterOne’s &lt;a href="https://github.com/JupiterOne/starbase"&gt;Starbase repo URL&lt;/a&gt;), uncheck the webhook, and adjust the permissions. The following permissions need to be set to &lt;strong&gt;read-only&lt;/strong&gt;:&lt;br&gt;
- &lt;em&gt;Repository Permissions&lt;/em&gt;: Actions, Environments, Issues, Pull Requests, and Secrets&lt;br&gt;
- &lt;em&gt;Organization Permissions&lt;/em&gt;: Administration, Members, and Secrets&lt;br&gt;
The rest of the permissions are &lt;strong&gt;No access&lt;/strong&gt; by default.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Read-only access to Secrets doesn’t expose the actual secret values; it only exposes metadata showing that the secrets exist.&lt;/p&gt;

&lt;ol start="3"&gt;
&lt;li&gt;Select &lt;strong&gt;Any account&lt;/strong&gt; and create your GitHub App.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;After you’ve successfully created your GitHub App, open up the cloned Starbase repository in your editor of choice and create your &lt;code&gt;config.yaml&lt;/code&gt; file. Generate your private key and retrieve other needed credentials from the GitHub App you previously created. Below is an example of a &lt;code&gt;config.yaml&lt;/code&gt; file for a GitHub integration:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="nx"&gt;integrations&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
  &lt;span class="o"&gt;-&lt;/span&gt;
    &lt;span class="nx"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;graph&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="nx"&gt;github&lt;/span&gt;
    &lt;span class="nx"&gt;instanceId&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;testInstanceId&lt;/span&gt;
    &lt;span class="nx"&gt;directory&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;integrations&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="nx"&gt;graph&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="nx"&gt;github&lt;/span&gt;
    &lt;span class="nx"&gt;gitRemoteUrl&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;https&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="c1"&gt;//github.com/JupiterOne/graph-github.git&lt;/span&gt;
    &lt;span class="nx"&gt;config&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
      &lt;span class="nx"&gt;GITHUB_APP_ID&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nx"&gt;GITHUB_APP_ID&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;
      &lt;span class="nx"&gt;GITHUB_APP_LOCAL_PRIVATE_KEY_PATH&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nx"&gt;YOURPATH&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;/&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nx"&gt;YOURFILENAME&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;.private-key.pem
      &lt;span class="nx"&gt;INSTALLATION_ID&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nx"&gt;INSTALLATION_ID&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;
      &lt;span class="nx"&gt;GITHUB_API_BASE_URL&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;https&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="c1"&gt;//api.github.com&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Use Starbase with Memgraph
&lt;/h2&gt;

&lt;p&gt;After you’ve successfully created your &lt;code&gt;config.yaml&lt;/code&gt; file, the last step is to adjust your queries to work with Memgraph. To do that, follow these steps:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;First, open a terminal in the folder where you cloned the Starbase repo and run the &lt;code&gt;yarn starbase setup&lt;/code&gt; command. It clones or updates all integrations listed in the &lt;code&gt;config.yaml&lt;/code&gt; file and installs the dependencies for each integration.&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Run your Memgraph instance. Follow the instructions from Memgraph’s docs on how to connect to Memgraph, or if you are using Docker, simply run the following command:&lt;br&gt;
&lt;code&gt;docker run -it -p 3000:3000 -p 7444:7444 -p 7687:7687 memgraph/memgraph-platform&lt;/code&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;By modifying just a single line of code, you are ready to use Starbase with Memgraph. Inside the &lt;code&gt;neo4jGraphStore.js&lt;/code&gt; file, locate the &lt;code&gt;addEntities()&lt;/code&gt; function. To enable compatibility with Memgraph, replace the following line:&lt;br&gt;
&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;runCypherCommand&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;`CREATE INDEX index_&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;entity&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;_type&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt; IF NOT EXISTS FOR (n:&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;entity&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;_type&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;) ON (n._key, n._integrationInstanceID);`&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;With:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;runCypherCommand&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;`CREATE INDEX index_&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;entity&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;_type&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt; IF NOT EXISTS FOR (n:&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;entity&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;_type&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;) ON (n._key, n._integrationInstanceID);`&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;runCypherCommand&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
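
&lt;p&gt;For orientation, Memgraph creates label-property indexes with statements of the form &lt;code&gt;CREATE INDEX ON :Label(property);&lt;/code&gt;. The following is a minimal Python sketch, assuming one single-property index per field; the helper name &lt;code&gt;memgraph_index_statements&lt;/code&gt; is hypothetical and this is not Starbase’s own code. It only builds the statement strings for an entity type:&lt;/p&gt;

```python
def memgraph_index_statements(label, properties):
    """Build one Memgraph label-property index statement per property.

    Hypothetical helper for illustration; it only formats strings and
    does not talk to a database.
    """
    return [f"CREATE INDEX ON :{label}({prop});" for prop in properties]


# Example: the two properties indexed by the original Neo4j statement.
statements = memgraph_index_statements(
    "github_user", ["_key", "_integrationInstanceID"]
)
for statement in statements:
    print(statement)
```

&lt;p&gt;Each generated string could then be passed to a query runner one at a time, since every statement covers a single property.&lt;/p&gt;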



&lt;ol start="4"&gt;
&lt;li&gt;You are all set to use Starbase with Memgraph. The instance is listening on port 7687, as defined in the code.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The final step is to run the &lt;code&gt;yarn starbase run&lt;/code&gt; command. Afterward, launch your browser and go to &lt;code&gt;localhost:3000&lt;/code&gt; to access &lt;strong&gt;Memgraph Lab&lt;/strong&gt; or open your desktop version to explore and visualize your graph data.&lt;/p&gt;

&lt;h3&gt;
  
  
  Explore your dataset
&lt;/h3&gt;

&lt;p&gt;Below, we’ve provided a few query examples that demonstrate how you can dig into your dataset and extract valuable insights. The following examples assume the use of GitHub integration.&lt;/p&gt;

&lt;p&gt;With the following query, you retrieve three of the GitHub users extracted from an organization:&lt;/p&gt;

&lt;p&gt;&lt;code&gt;MATCH (n:github_user) RETURN n LIMIT 3;&lt;/code&gt; &lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--9sx93pZM--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://public-assets.memgraph.com/security-analysis-with-starbase-and-memgraph/github-user.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--9sx93pZM--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://public-assets.memgraph.com/security-analysis-with-starbase-and-memgraph/github-user.png" alt="github user" width="800" height="287"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;If you also want to determine which code owners of organization repositories grant access to outside contributors, execute the following query:&lt;/p&gt;

&lt;p&gt;&lt;code&gt;MATCH (account:github_account) - [e:OWNS] -&amp;gt; (repo:github_repo) - [f:ALLOWS] -&amp;gt; (user:github_user {role: 'OUTSIDE'})&lt;br&gt;
RETURN account, repo, user, e, f;&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--DM8YN9Pf--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://public-assets.memgraph.com/security-analysis-with-starbase-and-memgraph/github-user-awesome-code.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--DM8YN9Pf--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://public-assets.memgraph.com/security-analysis-with-starbase-and-memgraph/github-user-awesome-code.png" alt="github user awesome code" width="800" height="287"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Takeaways
&lt;/h2&gt;

&lt;p&gt;Starbase is a powerful tool that simplifies security analysis by unifying integrations into a user-friendly graph view, enhancing cybersecurity insights. Incorporating Memgraph for data ingestion adds another dimension by enhancing its capabilities and visualizing your data. If you are curious about graphs and would like to learn more, make sure to check out our &lt;a href="https://memgraph.com/blog"&gt;blog&lt;/a&gt; and join our community on &lt;a href="https://discord.gg/memgraph"&gt;Discord&lt;/a&gt;.&lt;/p&gt;

</description>
    </item>
    <item>
      <title>Lost in Documentation? Let Our Docs Recommendation System Guide You Along!</title>
      <dc:creator>Matea Pesic</dc:creator>
      <pubDate>Thu, 01 Dec 2022 14:25:52 +0000</pubDate>
      <link>https://dev.to/memgraph/lost-in-documentation-let-our-docs-recommendation-system-guide-you-along-4aai</link>
      <guid>https://dev.to/memgraph/lost-in-documentation-let-our-docs-recommendation-system-guide-you-along-4aai</guid>
      <description>&lt;p&gt;You’re probably familiar with the situation of trying to read through the new documentation, which can often be messy and hard to navigate through. And at a certain point, you find yourself stuck and unsure about the next page to visit in your learning process. With the guidance of the Docs Recommendation System, the decision of choosing the next page is just one click away. With more than ten recommendations to choose from, it’s now easier to keep track of similar documents that suit your needs!&lt;/p&gt;

&lt;p&gt;In this blog post, you can find out how to scrape documentation of your choice, extract content from the webpage and use it to build a recommendation engine with the help of various algorithms like &lt;a href="https://en.wikipedia.org/wiki/Tf%E2%80%93idf" rel="noopener noreferrer"&gt;TF-IDF&lt;/a&gt;, &lt;a href="https://memgraph.com/docs/mage/query-modules/python/node2vec" rel="noopener noreferrer"&gt;node2vec&lt;/a&gt; and &lt;a href="https://memgraph.com/docs/mage/algorithms/machine-learning-graph-analytics/link-prediction-algorithm" rel="noopener noreferrer"&gt;link prediction&lt;/a&gt;. Also, if you want to find out which page is the most influential within the documentation, there is a &lt;a href="https://memgraph.com/docs/mage/query-modules/cpp/pagerank" rel="noopener noreferrer"&gt;PageRank&lt;/a&gt; algorithm to help you out. &lt;/p&gt;

&lt;h2&gt;
  
  
  Prerequisites and application setup
&lt;/h2&gt;

&lt;p&gt;The application consists of two main parts - the backend built with Python Flask and the frontend built with React.&lt;/p&gt;

&lt;p&gt;To build and start the application, you will need:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;The &lt;a href="https://memgraph.com/docs/mage/installation" rel="noopener noreferrer"&gt;MAGE graph library&lt;/a&gt; - for graph algorithms&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://memgraph.com/docs/gqlalchemy/installation" rel="noopener noreferrer"&gt;GQLAlchemy&lt;/a&gt; - a Python driver and Object Graph Mapper (OGM)&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://memgraph.com/docs/memgraph/reference-guide/deployment/docker" rel="noopener noreferrer"&gt;Docker&lt;/a&gt; - for building and running the application in a container&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://nodejs.org/en/" rel="noopener noreferrer"&gt;Node.js&lt;/a&gt; - for creating the React app&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://memgraph.com/lab" rel="noopener noreferrer"&gt;Memgraph Lab&lt;/a&gt; for graph visualizations.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;To start the application, follow these steps:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Clone our project’s &lt;a href="https://github.com/memgraph/docs-recommendation-system" rel="noopener noreferrer"&gt;Git repository&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Position yourself in the local repo directory&lt;/li&gt;
&lt;li&gt;Start Docker&lt;/li&gt;
&lt;li&gt;Open a new command prompt and run the following commands:&lt;br&gt;
4.1. &lt;code&gt;docker-compose build&lt;/code&gt;&lt;br&gt;
4.2. &lt;code&gt;docker-compose up&lt;/code&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Finally, you can check out the application at &lt;code&gt;localhost:3000&lt;/code&gt;. If everything goes as planned, the application should look like this:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fpublic-assets.memgraph.com%2Flost-in-documentation%2Flost-in-documentation_memgraph_11.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fpublic-assets.memgraph.com%2Flost-in-documentation%2Flost-in-documentation_memgraph_11.png" alt="image alt"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;As you can see, there are two possible inputs. The first one is the URL of the desired documentation and that’s an obligatory input. Below that, there is an option of adding some text in case you want to find documents that are similar to it. And that's it! Keep in mind that if you don’t provide the second input, then the recommendations will be based on the extracted text from the HTML of the input URL. &lt;/p&gt;

&lt;p&gt;So far we have everything up and running, and before we show you the final output, let’s see what’s really happening under the hood.&lt;/p&gt;

&lt;h2&gt;
  
  
  Scrape away
&lt;/h2&gt;

&lt;p&gt;The first step, after you click on the RECOMMEND button, is to scrape the documentation provided with the input URL. From that URL, the HTML content is extracted by using Python’s &lt;a href="https://www.crummy.com/software/BeautifulSoup/" rel="noopener noreferrer"&gt;BeautifulSoup&lt;/a&gt; library. In that content, we look for the &lt;code&gt;href&lt;/code&gt; attribute of anchor tags in order to find the other links. The links are first compared with the domain of the input URL (we want recommendations to stay within the same documentation), then validated, and are finally ready for further processing. &lt;/p&gt;
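
&lt;p&gt;The link-filtering step can be sketched roughly as follows. To stay dependency-free, this sketch uses the standard library’s &lt;code&gt;html.parser&lt;/code&gt; in place of BeautifulSoup, and the names &lt;code&gt;LinkCollector&lt;/code&gt; and &lt;code&gt;same_domain_links&lt;/code&gt; are assumptions made for illustration, not the project’s actual code:&lt;/p&gt;

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collect the href attribute of every anchor tag."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def same_domain_links(base_url, html):
    """Return absolute, de-duplicated links that share base_url's domain."""
    collector = LinkCollector()
    collector.feed(html)
    domain = urlparse(base_url).netloc
    links = []
    for href in collector.hrefs:
        # Resolve relative links and drop in-page fragments.
        absolute = urljoin(base_url, href).split("#")[0]
        if urlparse(absolute).netloc == domain and absolute not in links:
            links.append(absolute)
    return links
```

&lt;p&gt;Comparing the &lt;code&gt;netloc&lt;/code&gt; part of each resolved link with that of the input URL is one simple way to keep recommendations inside the same documentation.&lt;/p&gt;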

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fpublic-assets.memgraph.com%2Flost-in-documentation%2Flost-in-documentation_memgraph_22.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fpublic-assets.memgraph.com%2Flost-in-documentation%2Flost-in-documentation_memgraph_22.jpg" alt="image alt"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;After we manage to get all links to web pages within the documentation, it’s time to start getting the textual content of those pages. To achieve that, we use the &lt;a href="https://pypi.org/project/jusText/" rel="noopener noreferrer"&gt;jusText&lt;/a&gt; Python module, which helps to remove the unnecessary boilerplate content such as headers, sidebars, footers, navigation links, etc. What's left is the actual text with full sentences that is used to create our recommendation models. &lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fpublic-assets.memgraph.com%2Flost-in-documentation%2Flost-in-documentation_memgraph_33.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fpublic-assets.memgraph.com%2Flost-in-documentation%2Flost-in-documentation_memgraph_33.jpg" alt="image alt"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Next, the final step in text pre-processing is to convert the text to lowercase, split it into words, discard all stopwords and numbers, and lemmatize each word to its base form called lemma, since we don’t want our recommendation engine to treat words like, for example, “playing” and “play” differently.&lt;/p&gt;
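
&lt;p&gt;A minimal sketch of this clean-up, assuming a tiny hand-written stopword list and leaving lemmatization out (it needs a language resource such as a lemmatizer from an NLP library), could look like this. The function name &lt;code&gt;preprocess&lt;/code&gt; is hypothetical:&lt;/p&gt;

```python
import re

# Tiny illustrative stopword list; a real pipeline would use a fuller one.
STOPWORDS = {"a", "an", "and", "the", "is", "of", "to", "in"}


def preprocess(text):
    """Lowercase the text, keep alphabetic tokens only, drop stopwords.

    Numbers disappear because the pattern matches letters only.
    Lemmatization is intentionally not performed in this sketch.
    """
    tokens = re.findall(r"[a-z]+", text.lower())
    return [token for token in tokens if token not in STOPWORDS]
```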

&lt;p&gt;Now that the text is “cleaned up” and ready to use, it’s time to actually get the most important part - the keywords from each document. And that's where our first model comes into play.&lt;/p&gt;

&lt;h2&gt;
  
  
  TF-IDF algorithm - the baseline model
&lt;/h2&gt;

&lt;p&gt;Some of you may have heard of or used the NLP algorithm called TF-IDF (term frequency-inverse document frequency). Long story short, the algorithm describes the importance of words in a document amongst a collection of documents (corpus) by computing a numerical value for each word. As the name may suggest, the algorithm takes into account two main factors: tf score and idf score. Those two are defined as follows:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fpublic-assets.memgraph.com%2Flost-in-documentation%2Ff66_memgraph.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fpublic-assets.memgraph.com%2Flost-in-documentation%2Ff66_memgraph.jpg" alt="image alt"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Finally, we can compute the tf-idf value as &lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fpublic-assets.memgraph.com%2Flost-in-documentation%2Ff5_memgraph.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fpublic-assets.memgraph.com%2Flost-in-documentation%2Ff5_memgraph.png" alt="image alt"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Since we have the tf-idf scores, we can use them to get the most important words from each document, i.e. the keywords which we will need for the node2vec algorithm.&lt;/p&gt;
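
&lt;p&gt;As a rough illustration, the scoring and keyword selection can be sketched in plain Python. This assumes the common textbook definitions (tf as the term count divided by the document length, idf as the logarithm of the number of documents divided by the number of documents containing the term), which may differ in detail from the formulas pictured above. The function names are hypothetical:&lt;/p&gt;

```python
import math
from collections import Counter


def tf_idf(documents):
    """Return one {word: tf-idf score} dict per tokenized document."""
    n_docs = len(documents)
    # Document frequency: in how many documents each word occurs.
    doc_freq = Counter(word for doc in documents for word in set(doc))
    scores = []
    for doc in documents:
        counts = Counter(doc)
        scores.append({
            word: (count / len(doc)) * math.log(n_docs / doc_freq[word])
            for word, count in counts.items()
        })
    return scores


def top_keywords(doc_scores, k=3):
    """Return the k highest-scoring words of one document."""
    return sorted(doc_scores, key=doc_scores.get, reverse=True)[:k]
```

&lt;p&gt;A word that appears in every document gets an idf of zero under this definition, so it never ranks as a keyword.&lt;/p&gt;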

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fpublic-assets.memgraph.com%2Flost-in-documentation%2Flost-in-documentation_memgraph_44.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fpublic-assets.memgraph.com%2Flost-in-documentation%2Flost-in-documentation_memgraph_44.jpg" alt="image alt"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Having computed the tf-idf scores for each word among all documents, it’s time to calculate the similarity between the base document (extracted from the HTML of the page provided with the input URL, or the text of the second input if provided) and the rest of the documentation. That job is pretty straightforward. Each document is represented by a vector whose elements are the tf-idf scores of each word within the documentation. Since we now have an abstract representation of a document as a vector, we can compute the &lt;a href="https://en.wikipedia.org/wiki/Cosine_similarity" rel="noopener noreferrer"&gt;cosine similarity&lt;/a&gt; between the vectors and find out which documents are most similar to the base document. And that’s it! We have the baseline model, which recommends other similar web pages within the documentation. Now it’s time to build the other two models using &lt;a href="https://memgraph.com/docs/mage/algorithms" rel="noopener noreferrer"&gt;MAGE graph algorithms&lt;/a&gt; and compare the results with this model.&lt;/p&gt;
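
&lt;p&gt;The ranking step can be sketched as below, assuming each document vector is stored sparsely as a dictionary of word scores. The names &lt;code&gt;cosine_similarity&lt;/code&gt; and &lt;code&gt;rank_by_similarity&lt;/code&gt; are chosen for illustration and are not taken from the project:&lt;/p&gt;

```python
import math


def cosine_similarity(vec_a, vec_b):
    """Cosine similarity of two sparse vectors given as {word: score} dicts."""
    dot = sum(score * vec_b.get(word, 0.0) for word, score in vec_a.items())
    norm_a = math.sqrt(sum(score * score for score in vec_a.values()))
    norm_b = math.sqrt(sum(score * score for score in vec_b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # an all-zero vector has no direction to compare
    return dot / (norm_a * norm_b)


def rank_by_similarity(base, others):
    """Return page names sorted from most to least similar to base."""
    return sorted(
        others,
        key=lambda name: cosine_similarity(base, others[name]),
        reverse=True,
    )
```

&lt;p&gt;Words missing from a dictionary count as zero, which matches a dense vector over the whole vocabulary without storing the zeros.&lt;/p&gt;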

&lt;p&gt;Keywords are finally extracted using the TF-IDF keyword extraction and each page obtained from the HTML content of the input URL now has its own set of most important words. It’s time to put them to use with the Magic of Graphs!&lt;/p&gt;

&lt;h2&gt;
  
  
  Build a graph in Memgraph’s database
&lt;/h2&gt;

&lt;p&gt;First, let’s do all necessary imports and create an instance of the database:&lt;br&gt;
&lt;/p&gt;

&lt;p&gt;&lt;code&gt;from gqlalchemy import Field, Memgraph, Node, Relationship&lt;/code&gt;&lt;br&gt;
&lt;/p&gt;

&lt;p&gt;After that, we instantiate Memgraph and, using the &lt;a href="https://memgraph.com/docs/gqlalchemy/how-to-guides#object-graph-mapper" rel="noopener noreferrer"&gt;Object Graph Mapper&lt;/a&gt;, create classes representing the nodes and the relationships between them. We create &lt;code&gt;WebPage&lt;/code&gt; nodes connected by &lt;code&gt;SimilarTo&lt;/code&gt; relationships:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;memgraph = Memgraph()

class WebPage(Node):
    url: str = Field(index=True, exist=True, unique=True, db=memgraph)
    name: str = Field(index=True, exist=True, unique=True, db=memgraph)

class SimilarTo(Relationship, type="SIMILARTO"):
    pass
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Next, we need to save the nodes of the graph in the database. Each node represents one link scraped from the HTML content of the input URL. The properties of each node are its &lt;code&gt;name&lt;/code&gt;, which is a shortened version of the link, and its &lt;code&gt;url&lt;/code&gt;. Nodes are created and saved with:&lt;br&gt;
&lt;/p&gt;

&lt;p&gt;&lt;code&gt;WebPage(name=name, url=url).save(memgraph)&lt;/code&gt;&lt;br&gt;
&lt;/p&gt;

&lt;p&gt;Now that we populated Memgraph’s database with nodes, how do we create relationships between them?&lt;br&gt;
As mentioned, each page has its own set of keywords. Using those keywords we create a similarity matrix between each page using &lt;a href="https://en.wikipedia.org/wiki/Jaccard_index" rel="noopener noreferrer"&gt;Jaccard similarity&lt;/a&gt;. In other words, the more the keywords of pages intersect, the more similar they are. Relationships are now created based on that similarity matrix:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;s_node = WebPage(url=start_url, name=start_name).load(db=memgraph)
e_node = WebPage(url=end_url, name=end_name).load(db=memgraph)

if start_url != end_url:
    similar_rel = SimilarTo(
        _start_node_id=s_node._id,
        _end_node_id=e_node._id,
    ).save(memgraph)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
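
&lt;p&gt;To make the Jaccard step more concrete, here is a minimal sketch of how such a similarity matrix could be built from per-page keyword sets. The function names, the sample keywords, and the &lt;code&gt;threshold&lt;/code&gt; cut-off are assumptions made for illustration; they are not taken from the project’s code:&lt;/p&gt;

```python
from itertools import combinations
from typing import Dict, Set, Tuple


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard index: size of the intersection divided by size of the union."""
    union = a.union(b)
    if not union:
        return 0.0
    return len(a.intersection(b)) / len(union)


def similarity_matrix(keywords: Dict[str, Set[str]]) -> Dict[Tuple[str, str], float]:
    """Score every unordered pair of pages by keyword overlap."""
    return {
        (p, q): jaccard(keywords[p], keywords[q])
        for p, q in combinations(sorted(keywords), 2)
    }


# Hypothetical keyword sets, for illustration only.
keywords = {
    "cypher-manual": {"query", "cypher", "match"},
    "gqlalchemy": {"query", "python", "ogm"},
    "mage": {"algorithm", "pagerank", "python"},
}

matrix = similarity_matrix(keywords)

# Assumed cut-off: only pairs scoring at least this much would get a relationship.
threshold = 0.1
similar_pairs = [pair for pair, score in matrix.items() if score >= threshold]
print(similar_pairs)
```

&lt;p&gt;Each pair in &lt;code&gt;similar_pairs&lt;/code&gt; would then be a candidate for a &lt;code&gt;SimilarTo&lt;/code&gt; relationship like the one saved above.&lt;/p&gt;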



&lt;p&gt;In the picture below, you can see the implicit schema of the database:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fpublic-assets.memgraph.com%2Flost-in-documentation%2Flost-in-documentation_memgraph_51.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fpublic-assets.memgraph.com%2Flost-in-documentation%2Flost-in-documentation_memgraph_51.png" alt="image alt"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Finding the best recommendations
&lt;/h2&gt;

&lt;p&gt;Once the graph is structured, the rest of the work is made easier by the already implemented &lt;a href="https://memgraph.com/docs/mage/algorithms" rel="noopener noreferrer"&gt;MAGE graph algorithms&lt;/a&gt;, such as &lt;a href="https://memgraph.com/docs/mage/algorithms/machine-learning-graph-analytics/node2vec-algorithm" rel="noopener noreferrer"&gt;node2vec&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;If you are unfamiliar with node2vec or node embeddings in general, we suggest reading the blog posts &lt;a href="https://memgraph.com/blog/introduction-to-node-embedding" rel="noopener noreferrer"&gt;introduction to node embedding&lt;/a&gt; (to learn what node embeddings are) and &lt;a href="https://memgraph.com/blog/how-node2vec-works" rel="noopener noreferrer"&gt;how node2vec works&lt;/a&gt; (to understand the algorithm in depth).&lt;/p&gt;

&lt;p&gt;The node2vec module has a procedure called &lt;code&gt;set_embeddings()&lt;/code&gt;, which we will use to store each node’s embedding, a list of numbers, as a node property. All we need to do is execute the following query:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;CALL node2vec.set_embeddings(False, 2.0, 0.5, 4, 5, 2) YIELD *;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If you are unsure about what parameters to use, check out &lt;a href="https://memgraph.com/docs/mage/query-modules/python/node2vec" rel="noopener noreferrer"&gt;node2vec parameters&lt;/a&gt;.&lt;br&gt;
Now that the embeddings are set as one of the node properties, we can start with the calculations for the recommendations.&lt;br&gt;
First, we need to get those embeddings from Memgraph’s database:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
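&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;typing&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Any&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Dict&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;List&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Tuple&lt;/span&gt;

&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;gqlalchemy&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Match&lt;/span&gt;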

&lt;span class="c1"&gt;# exports "embedding" property from all nodes in Memgraph db    
&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;get_embeddings_as_properties&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;Tuple&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;List&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;Dict&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Any&lt;/span&gt;&lt;span class="p"&gt;]],&lt;/span&gt; &lt;span class="n"&gt;Dict&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;List&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;float&lt;/span&gt;&lt;span class="p"&gt;]]]:&lt;/span&gt;
    &lt;span class="n"&gt;embeddings&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Dict&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;List&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;float&lt;/span&gt;&lt;span class="p"&gt;]]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{}&lt;/span&gt;

    &lt;span class="n"&gt;results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;list&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="nc"&gt;Match&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;node&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;WebPage&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;variable&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;node&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;return_&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;node.name&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;n_name&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;node.url&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;n_url&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;node.embedding&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;n_embedding&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;})&lt;/span&gt;
        &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;execute&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;results&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;embeddings&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;n_name&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;n_embedding&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;results&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;embeddings&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Using the retrieved embeddings, we can now calculate the similarity between nodes. For every pair of node embeddings, we calculate the cosine similarity to check how similar the two embeddings are and store the results in an adjacency matrix. And that’s it! All that’s left to do is use the resulting matrix to retrieve our best recommendations!&lt;/p&gt;
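
&lt;p&gt;As a rough illustration of this step, the sketch below computes pairwise cosine similarity from a dictionary shaped like the &lt;code&gt;embeddings&lt;/code&gt; value returned by &lt;code&gt;get_embeddings_as_properties()&lt;/code&gt; and picks the closest nodes. The helper names &lt;code&gt;cosine_similarity&lt;/code&gt;, &lt;code&gt;similarity_matrix&lt;/code&gt;, and &lt;code&gt;recommend&lt;/code&gt;, as well as the sample vectors, are hypothetical and use only the standard library:&lt;/p&gt;

```python
import math
from typing import Dict, List


def cosine_similarity(u: List[float], v: List[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


def similarity_matrix(embeddings: Dict[str, List[float]]) -> Dict[str, Dict[str, float]]:
    """Pairwise cosine similarity between all distinct nodes, keyed by node name."""
    return {
        a: {b: cosine_similarity(ea, eb) for b, eb in embeddings.items() if b != a}
        for a, ea in embeddings.items()
    }


def recommend(name: str, matrix: Dict[str, Dict[str, float]], k: int = 3) -> List[str]:
    """The k nodes most similar to `name`, most similar first."""
    row = matrix[name]
    return sorted(row, key=row.get, reverse=True)[:k]


# Made-up 2-dimensional embeddings, for illustration only.
embeddings = {"a": [1.0, 0.0], "b": [0.9, 0.1], "c": [0.0, 1.0]}
matrix = similarity_matrix(embeddings)
print(recommend("a", matrix, k=2))
```

&lt;p&gt;Here the matrix is a nested dictionary rather than a numeric array, which keeps the example dependency-free; a real application might prefer a vectorized library for larger graphs.&lt;/p&gt;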

&lt;h2&gt;
  
  
  Link prediction
&lt;/h2&gt;

&lt;p&gt;Now, let's use the node2vec algorithm once again, this time for link prediction. First, we need to retrieve the edges from the database and split them into train and test sets. We can do that using the &lt;a href="https://scikit-learn.org/stable/" rel="noopener noreferrer"&gt;scikit-learn library&lt;/a&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
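&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;typing&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;List&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Tuple&lt;/span&gt;

&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;gqlalchemy&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Match&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Node&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;sklearn&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;model_selection&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;train_test_split&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;get_all_edges&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;List&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;Tuple&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;Node&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Node&lt;/span&gt;&lt;span class="p"&gt;]]:&lt;/span&gt;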
    &lt;span class="n"&gt;results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Match&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; \
        &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;node&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;labels&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;WebPage&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;variable&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;node_a&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; \
        &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;to&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;relationship_type&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;SIMILAR_TO&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;variable&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;edge&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; \
        &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;node&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;labels&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;WebPage&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;variable&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;node_b&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; \
        &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;return_&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; \
        &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;execute&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;[(&lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;node_a&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;node_b&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;results&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

&lt;span class="c1"&gt;# split edges in train, test group
&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;split_edges_train_test&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;edges&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;List&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;Tuple&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;Node&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Node&lt;/span&gt;&lt;span class="p"&gt;]],&lt;/span&gt; &lt;span class="n"&gt;test_size&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;float&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;0.2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;Tuple&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;List&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;Tuple&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;Node&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Node&lt;/span&gt;&lt;span class="p"&gt;]],&lt;/span&gt; &lt;span class="n"&gt;List&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;Tuple&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;Node&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Node&lt;/span&gt;&lt;span class="p"&gt;]]]):&lt;/span&gt;
    &lt;span class="n"&gt;edges_train&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;edges_test&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;train_test_split&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;edges&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;test_size&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;test_size&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;random_state&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nf"&gt;int&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;time&lt;/span&gt;&lt;span class="p"&gt;()))&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;edges_train&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;edges_test&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;remove_edges&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;edges&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;List&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;Tuple&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;Node&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Node&lt;/span&gt;&lt;span class="p"&gt;]])&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;edge&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;edges&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;query&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Match&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; \
                &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;node&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;labels&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;WebPage&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;edge&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; \
                &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;to&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;relationship_type&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;SIMILAR_TO&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;variable&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; \
                &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;node&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;labels&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;WebPage&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;edge&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; \
                &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;delete&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; \
                &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;execute&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We split the edges because we need a way to check whether the algorithm correctly predicts new edges from existing ones. To test it, we remove some of the existing edges and make predictions based on the remaining ones.&lt;/p&gt;

&lt;p&gt;We will randomly remove 20% of the edges. These will be our test set. We will leave all the nodes in the graph; it doesn’t matter that some of them may end up completely disconnected.&lt;br&gt;
Next, we will run node2vec again on the remaining edges to get new node embeddings and use these embeddings to predict new edges.&lt;/p&gt;

&lt;p&gt;After edge splitting and embedding calculations, we calculate cosine similarity and adjacency matrix once again to retrieve new predictions and recommendations.&lt;/p&gt;
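
&lt;p&gt;One possible way to check such predictions against the held-out edges is sketched below. It assumes undirected node pairs, a &lt;code&gt;similarity&lt;/code&gt; dictionary keyed by node-name pairs, and precision at &lt;em&gt;k&lt;/em&gt; as the metric; these choices and all names in the snippet are illustrative assumptions rather than the project’s actual evaluation code:&lt;/p&gt;

```python
from typing import Dict, FrozenSet, List, Set, Tuple

Edge = FrozenSet[str]  # an undirected edge, stored as a pair of node names


def predict_edges(
    similarity: Dict[Tuple[str, str], float], train_edges: Set[Edge], k: int
) -> List[Edge]:
    """Top-k most similar node pairs that are not already connected in the train graph."""
    candidates = {
        frozenset(pair): score
        for pair, score in similarity.items()
        if frozenset(pair) not in train_edges
    }
    return sorted(candidates, key=candidates.get, reverse=True)[:k]


def precision_at_k(predicted: List[Edge], test_edges: Set[Edge]) -> float:
    """Share of predicted edges that were actually held out in the test set."""
    if not predicted:
        return 0.0
    hits = sum(1 for edge in predicted if edge in test_edges)
    return hits / len(predicted)


# Made-up similarity scores and edge sets, for illustration only.
similarity = {("a", "b"): 0.9, ("a", "c"): 0.8, ("b", "c"): 0.1, ("a", "d"): 0.7}
train_edges = {frozenset({"a", "b"})}
test_edges = {frozenset({"a", "c"})}

predicted = predict_edges(similarity, train_edges, k=2)
print(precision_at_k(predicted, test_edges))
```

&lt;p&gt;In this toy example, one of the two predicted edges appears in the test set, so half of the predictions are counted as correct.&lt;/p&gt;
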
&lt;h2&gt;
  
  
  PageRank
&lt;/h2&gt;

&lt;p&gt;Sometimes, you may want to measure the importance or popularity of certain web pages. In our case, we wanted to find out which page within the documentation is the most important one. This is where the &lt;a href="https://memgraph.com/docs/mage/algorithms/traditional-graph-analytics/pagerank-algorithm" rel="noopener noreferrer"&gt;PageRank&lt;/a&gt; algorithm comes in handy! Below is the actual method implemented in the backend of our application:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="nd"&gt;@app.route&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;/pagerank&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;get_pagerank&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Call the Pagerank procedure and return top 30 in descending order.&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="k"&gt;try&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;list&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="nc"&gt;Call&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;pagerank.get&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;yield_&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
            &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;with_&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;node&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;rank&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
            &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;return_&lt;/span&gt;&lt;span class="p"&gt;([(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;node.name&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;node_name&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;rank&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
            &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;order_by&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;properties&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;rank&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Order&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;DESC&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
            &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;limit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;30&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;execute&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt;

        &lt;span class="n"&gt;page_rank_dict&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;dict&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="n"&gt;page_rank_list&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;list&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;results&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;node_name&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;node_name&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
            &lt;span class="n"&gt;rank&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;float&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;rank&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
            &lt;span class="n"&gt;page_rank_dict&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;name&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;node_name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;rank&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;rank&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
            &lt;span class="n"&gt;dict_copy&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;page_rank_dict&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;copy&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
            &lt;span class="n"&gt;page_rank_list&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;dict_copy&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

        &lt;span class="n"&gt;res&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;page_rank&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;page_rank_list&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;make_response&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;res&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;HTTPStatus&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;OK&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;except&lt;/span&gt; &lt;span class="nb"&gt;Exception&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;log&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;info&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Fetching page ranks using PageRank went wrong.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;log&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;info&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="nf"&gt;return &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;""&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;HTTPStatus&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;INTERNAL_SERVER_ERROR&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The image below shows a visualization of the algorithm’s output, where a larger node indicates a more popular page within the documentation.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fpublic-assets.memgraph.com%2Flost-in-documentation%2Flost-in-documentation_memgraph_71.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fpublic-assets.memgraph.com%2Flost-in-documentation%2Flost-in-documentation_memgraph_71.png" alt="image alt"&gt;&lt;/a&gt;&lt;/p&gt;
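
&lt;p&gt;For intuition about what a PageRank score measures, here is a small textbook power-iteration sketch. It is not how MAGE implements &lt;code&gt;pagerank.get&lt;/code&gt;; the damping factor of 0.85, the iteration count, and the toy graph are assumptions chosen for illustration:&lt;/p&gt;

```python
from typing import Dict, List


def pagerank(
    graph: Dict[str, List[str]], damping: float = 0.85, iterations: int = 50
) -> Dict[str, float]:
    """Textbook power iteration over an adjacency list {page: [pages it links to]}.

    Every linked page must also appear as a key of `graph`.
    """
    nodes = list(graph)
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Every page starts each round with the "random jump" share.
        new_rank = {node: (1.0 - damping) / n for node in nodes}
        for node, targets in graph.items():
            if targets:
                # Split this page's rank evenly among the pages it links to.
                share = damping * rank[node] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # Dangling page: spread its rank evenly over all pages.
                for other in nodes:
                    new_rank[other] += damping * rank[node] / n
        rank = new_rank
    return rank


# Toy link structure, for illustration only.
graph = {"home": ["docs"], "docs": ["home", "mage"], "mage": ["docs"]}
ranks = pagerank(graph)
print(max(ranks, key=ranks.get))
```

&lt;p&gt;In this toy graph, &lt;code&gt;docs&lt;/code&gt; is linked from both other pages, so it ends up with the highest rank, which matches the intuition behind the larger nodes in the visualization.&lt;/p&gt;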

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;A recommendation system can be a very useful tool for finding relevant suggestions for users and efficiently reducing search time. We hope you learned how to build a recommendation engine by using Memgraph’s graph algorithms and database or at least got an idea for building something similar on your own.&lt;/p&gt;

&lt;p&gt;If this was interesting to you, check out the &lt;a href="https://github.com/memgraph/docs-recommendation-system" rel="noopener noreferrer"&gt;GitHub repo&lt;/a&gt; for more details. If you want to give feedback, share your thoughts, or talk about anything related to the project or Memgraph in general, let us know on our &lt;a href="https://memgr.ph/discord" rel="noopener noreferrer"&gt;Discord server&lt;/a&gt;.&lt;/p&gt;

</description>
      <category>memgraph</category>
      <category>recommendationsystem</category>
      <category>graphdatabase</category>
      <category>internship</category>
    </item>
  </channel>
</rss>
