<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Nikolaos Katomeris</title>
    <description>The latest articles on DEV Community by Nikolaos Katomeris (@nicktheway).</description>
    <link>https://dev.to/nicktheway</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F390106%2F2273268a-d420-48e8-8298-650148131a33.png</url>
      <title>DEV Community: Nikolaos Katomeris</title>
      <link>https://dev.to/nicktheway</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/nicktheway"/>
    <language>en</language>
    <item>
      <title>Going multi-core, a pagerank example</title>
      <dc:creator>Nikolaos Katomeris</dc:creator>
      <pubDate>Mon, 25 May 2020 02:19:26 +0000</pubDate>
      <link>https://dev.to/nicktheway/going-multi-core-a-pagerank-example-23p6</link>
      <guid>https://dev.to/nicktheway/going-multi-core-a-pagerank-example-23p6</guid>
      <description>&lt;h2&gt;
  
  
  Parallelizing a serial pagerank implementation in C
&lt;/h2&gt;

&lt;p&gt;Looking back through the years of studying Electrical and Computer Engineering and among many challenging projects, the last of the four total &lt;strong&gt;parallel and distributed systems&lt;/strong&gt; project stands out.&lt;/p&gt;

&lt;p&gt;The task was to parallelize a serial algorithm using one of the APIs we had used on previous projects during the semester, namely one of &lt;strong&gt;pthreads&lt;/strong&gt;/&lt;strong&gt;cilk&lt;/strong&gt;/&lt;strong&gt;openMP&lt;/strong&gt;, &lt;strong&gt;cuda&lt;/strong&gt; or &lt;strong&gt;openMPI&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Being a summer project, I chose to go with a single system CPU-parallelization API -and eventually openMP- so that I could work easily offline on my laptop without the need to access a cluster or a system with a specific GPU.&lt;/p&gt;

&lt;p&gt;The project I went for was &lt;strong&gt;Pagerank calculation using the Gauss-Seidel method&lt;/strong&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Pagerank
&lt;/h2&gt;

&lt;p&gt;Pagerank was the original algorithm used by Google to rank websites so that it returns better search results. In brief, what it does, is to calculate a value for every page based on its in-going and out-going links. The basic idea being that the more a website is targeted by others the better the chance the content will be useful and what the user is looking for. &lt;/p&gt;

&lt;p&gt;In other words, each link is like a special "vote" for the targeted website and the higher the rank of the website, the more its votes count for. &lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--oVzhCFm3--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/3kraaqpd4be3l4mpbmtr.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--oVzhCFm3--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/3kraaqpd4be3l4mpbmtr.png" alt="Websites linking to each other!" title="Websites linking to each other!"&gt;&lt;/a&gt;&lt;br&gt;Websites linking to each other - source: Wikipedia
  &lt;/p&gt;

&lt;p&gt;This may look like a cyclic dependency at first glance as each website's votes are weighted by its rating which is in turn affected by the incoming votes of other websites, that are also affected by their rating etc. &lt;/p&gt;

&lt;p&gt;However, the math works out and the resulting rating of each website corresponds to the probability for an "internet traveler" to be at that particular website while moving around randomly through the available hyperlinks. A small chance of teleportation to any website is also applied, so that no website is left out and the traveler's behavior resembles more that of a real user browsing the web.&lt;/p&gt;

&lt;h2&gt;
  
  
  Creating a serial algorithm
&lt;/h2&gt;

&lt;p&gt;In order to create an algorithm to solve our problem we first need a way to visualize and store the program's data. For this particular case, the intuitive way to do that is with a graph. Imagine each website being a different graph node and every link being a directed edge from the source to the target website.&lt;/p&gt;

&lt;p&gt;Storing a graph in memory can be done in multiple ways and the best way will vary depending on the particular algorithm and system requirements. In any case, although the in memory representation of the graph may vary, they are all equivalent from a math perspective.&lt;/p&gt;

&lt;p&gt;In our case, the traveler's trip can be simulated with a &lt;em&gt;Markov Chain&lt;/em&gt; and reach a solution by solving a linear system derived from the Markov Chain's properties.&lt;/p&gt;

&lt;h2&gt;
  
  
  Gauss-Seidel Method
&lt;/h2&gt;

&lt;p&gt;Having the linear system ready we can now solve it with a variety of methods, in this project the Gauss-Seidel iterative method was used.&lt;/p&gt;

&lt;p&gt;In this method, the solution vector &lt;strong&gt;x&lt;/strong&gt; for the system's table &lt;strong&gt;A&lt;/strong&gt; is calculated many times until it converges to a value. Specifically, the &lt;em&gt;k+1&lt;/em&gt; step's solution vector is derived from the equality bellow:&lt;/p&gt;

&lt;p&gt;

&lt;/p&gt;
&lt;div class="katex-element"&gt;
  &lt;span class="katex-display"&gt;&lt;span class="katex"&gt;&lt;span class="katex-mathml"&gt;xi(k+1)=1Aii(bi−∑j=1i−1Aijxj(k+1)−∑j=i+1NAijxj(k))
x_i(k+1) = \frac{1}{A_{ii}}\bigg(b_i-\sum_{j=1}^{i-1}A_{ij}x_j(k+1)-\sum_{j=i+1}^{N}A_{ij}x_j(k)\bigg)
&lt;/span&gt;&lt;span class="katex-html"&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord"&gt;&lt;span class="mord mathdefault"&gt;x&lt;/span&gt;&lt;span class="msupsub"&gt;&lt;span class="vlist-t vlist-t2"&gt;&lt;span class="vlist-r"&gt;&lt;span class="vlist"&gt;&lt;span&gt;&lt;span class="pstrut"&gt;&lt;/span&gt;&lt;span class="sizing reset-size6 size3 mtight"&gt;&lt;span class="mord mathdefault mtight"&gt;i&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="vlist-s"&gt;​&lt;/span&gt;&lt;/span&gt;&lt;span class="vlist-r"&gt;&lt;span class="vlist"&gt;&lt;span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="mopen"&gt;(&lt;/span&gt;&lt;span class="mord mathdefault"&gt;k&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;span class="mbin"&gt;+&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord"&gt;1&lt;/span&gt;&lt;span class="mclose"&gt;)&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;span class="mrel"&gt;=&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord"&gt;&lt;span class="mopen nulldelimiter"&gt;&lt;/span&gt;&lt;span class="mfrac"&gt;&lt;span class="vlist-t vlist-t2"&gt;&lt;span class="vlist-r"&gt;&lt;span class="vlist"&gt;&lt;span&gt;&lt;span class="pstrut"&gt;&lt;/span&gt;&lt;span class="mord"&gt;&lt;span class="mord"&gt;&lt;span class="mord mathdefault"&gt;A&lt;/span&gt;&lt;span class="msupsub"&gt;&lt;span class="vlist-t vlist-t2"&gt;&lt;span class="vlist-r"&gt;&lt;span class="vlist"&gt;&lt;span&gt;&lt;span class="pstrut"&gt;&lt;/span&gt;&lt;span class="sizing reset-size6 size3 mtight"&gt;&lt;span class="mord mtight"&gt;&lt;span class="mord mathdefault mtight"&gt;i&lt;/span&gt;&lt;span class="mord mathdefault mtight"&gt;i&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="vlist-s"&gt;​&lt;/span&gt;&lt;/span&gt;&lt;span class="vlist-r"&gt;&lt;span class="vlist"&gt;&lt;span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span&gt;&lt;span class="pstrut"&gt;&lt;/span&gt;&lt;span class="frac-line"&gt;&lt;/span&gt;&lt;/span&gt;&lt;span&gt;&lt;span class="pstrut"&gt;&lt;/span&gt;&lt;span class="mord"&gt;&lt;span class="mord"&gt;1&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="vlist-s"&gt;​&lt;/span&gt;&lt;/span&gt;&lt;span class="vlist-r"&gt;&lt;span class="vlist"&gt;&lt;span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="mclose nulldelimiter"&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="mord"&gt;&lt;span class="delimsizing size3"&gt;(&lt;/span&gt;&lt;/span&gt;&lt;span class="mord"&gt;&lt;span class="mord mathdefault"&gt;b&lt;/span&gt;&lt;span class="msupsub"&gt;&lt;span class="vlist-t vlist-t2"&gt;&lt;span class="vlist-r"&gt;&lt;span class="vlist"&gt;&lt;span&gt;&lt;span class="pstrut"&gt;&lt;/span&gt;&lt;span class="sizing reset-size6 size3 mtight"&gt;&lt;span class="mord mathdefault mtight"&gt;i&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="vlist-s"&gt;​&lt;/span&gt;&lt;/span&gt;&lt;span class="vlist-r"&gt;&lt;span class="vlist"&gt;&lt;span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;span class="mbin"&gt;−&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mop op-limits"&gt;&lt;span class="vlist-t vlist-t2"&gt;&lt;span class="vlist-r"&gt;&lt;span class="vlist"&gt;&lt;span&gt;&lt;span class="pstrut"&gt;&lt;/span&gt;&lt;span class="sizing reset-size6 size3 mtight"&gt;&lt;span class="mord mtight"&gt;&lt;span class="mord mathdefault mtight"&gt;j&lt;/span&gt;&lt;span class="mrel mtight"&gt;=&lt;/span&gt;&lt;span class="mord mtight"&gt;1&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span&gt;&lt;span class="pstrut"&gt;&lt;/span&gt;&lt;span&gt;&lt;span class="mop op-symbol large-op"&gt;∑&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span&gt;&lt;span class="pstrut"&gt;&lt;/span&gt;&lt;span class="sizing reset-size6 size3 mtight"&gt;&lt;span class="mord mtight"&gt;&lt;span class="mord mathdefault mtight"&gt;i&lt;/span&gt;&lt;span class="mbin mtight"&gt;−&lt;/span&gt;&lt;span class="mord mtight"&gt;1&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="vlist-s"&gt;​&lt;/span&gt;&lt;/span&gt;&lt;span class="vlist-r"&gt;&lt;span class="vlist"&gt;&lt;span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;span class="mord"&gt;&lt;span class="mord mathdefault"&gt;A&lt;/span&gt;&lt;span class="msupsub"&gt;&lt;span class="vlist-t vlist-t2"&gt;&lt;span class="vlist-r"&gt;&lt;span class="vlist"&gt;&lt;span&gt;&lt;span class="pstrut"&gt;&lt;/span&gt;&lt;span class="sizing reset-size6 size3 mtight"&gt;&lt;span class="mord mtight"&gt;&lt;span class="mord mathdefault mtight"&gt;i&lt;/span&gt;&lt;span class="mord mathdefault mtight"&gt;j&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="vlist-s"&gt;​&lt;/span&gt;&lt;/span&gt;&lt;span class="vlist-r"&gt;&lt;span class="vlist"&gt;&lt;span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="mord"&gt;&lt;span class="mord mathdefault"&gt;x&lt;/span&gt;&lt;span class="msupsub"&gt;&lt;span class="vlist-t vlist-t2"&gt;&lt;span class="vlist-r"&gt;&lt;span class="vlist"&gt;&lt;span&gt;&lt;span class="pstrut"&gt;&lt;/span&gt;&lt;span class="sizing reset-size6 size3 mtight"&gt;&lt;span class="mord mathdefault mtight"&gt;j&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="vlist-s"&gt;​&lt;/span&gt;&lt;/span&gt;&lt;span class="vlist-r"&gt;&lt;span class="vlist"&gt;&lt;span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="mopen"&gt;(&lt;/span&gt;&lt;span class="mord mathdefault"&gt;k&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;span class="mbin"&gt;+&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord"&gt;1&lt;/span&gt;&lt;span class="mclose"&gt;)&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;span class="mbin"&gt;−&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mop op-limits"&gt;&lt;span class="vlist-t vlist-t2"&gt;&lt;span class="vlist-r"&gt;&lt;span class="vlist"&gt;&lt;span&gt;&lt;span class="pstrut"&gt;&lt;/span&gt;&lt;span class="sizing reset-size6 size3 mtight"&gt;&lt;span class="mord mtight"&gt;&lt;span class="mord mathdefault mtight"&gt;j&lt;/span&gt;&lt;span class="mrel mtight"&gt;=&lt;/span&gt;&lt;span class="mord mathdefault mtight"&gt;i&lt;/span&gt;&lt;span class="mbin mtight"&gt;+&lt;/span&gt;&lt;span class="mord mtight"&gt;1&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span&gt;&lt;span class="pstrut"&gt;&lt;/span&gt;&lt;span&gt;&lt;span class="mop op-symbol large-op"&gt;∑&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span&gt;&lt;span class="pstrut"&gt;&lt;/span&gt;&lt;span class="sizing reset-size6 size3 mtight"&gt;&lt;span class="mord mtight"&gt;&lt;span class="mord mathdefault mtight"&gt;N&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="vlist-s"&gt;​&lt;/span&gt;&lt;/span&gt;&lt;span class="vlist-r"&gt;&lt;span class="vlist"&gt;&lt;span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;span class="mord"&gt;&lt;span class="mord mathdefault"&gt;A&lt;/span&gt;&lt;span class="msupsub"&gt;&lt;span class="vlist-t vlist-t2"&gt;&lt;span class="vlist-r"&gt;&lt;span class="vlist"&gt;&lt;span&gt;&lt;span class="pstrut"&gt;&lt;/span&gt;&lt;span class="sizing reset-size6 size3 mtight"&gt;&lt;span class="mord mtight"&gt;&lt;span class="mord mathdefault mtight"&gt;i&lt;/span&gt;&lt;span class="mord mathdefault mtight"&gt;j&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="vlist-s"&gt;​&lt;/span&gt;&lt;/span&gt;&lt;span class="vlist-r"&gt;&lt;span class="vlist"&gt;&lt;span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="mord"&gt;&lt;span class="mord mathdefault"&gt;x&lt;/span&gt;&lt;span class="msupsub"&gt;&lt;span class="vlist-t vlist-t2"&gt;&lt;span class="vlist-r"&gt;&lt;span class="vlist"&gt;&lt;span&gt;&lt;span class="pstrut"&gt;&lt;/span&gt;&lt;span class="sizing reset-size6 size3 mtight"&gt;&lt;span class="mord mathdefault mtight"&gt;j&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="vlist-s"&gt;​&lt;/span&gt;&lt;/span&gt;&lt;span class="vlist-r"&gt;&lt;span class="vlist"&gt;&lt;span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="mopen"&gt;(&lt;/span&gt;&lt;span class="mord mathdefault"&gt;k&lt;/span&gt;&lt;span class="mclose"&gt;)&lt;/span&gt;&lt;span class="mord"&gt;&lt;span class="delimsizing size3"&gt;)&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
&lt;/div&gt;


&lt;p&gt;The important thing about the expression above is that each next solution vector (step &lt;strong&gt;k+1&lt;/strong&gt;) depends on its previous (step &lt;strong&gt;k&lt;/strong&gt;) thus making the whole algorithm serial.&lt;/p&gt;

&lt;h2&gt;
  
  
  Identifying parallelization opportunities
&lt;/h2&gt;

&lt;p&gt;After creating the serial algorithm it is important to first develop it and test it before trying to derive a parallel one from it. When this is done, we can start looking for ways to make the algorithm run on multiple cores.&lt;/p&gt;

&lt;p&gt;But that may not be always possible with the selected serial algorithm.&lt;/p&gt;

&lt;p&gt;This is also the case on our example here, the Gauss-Seidel method on a first glance looks completely serial. Before questioning our life choices that led as to this project, discard all our work and select a different project, it is a good idea to rethink about the problem and look for other parallelization opportunities that we may be missing.&lt;/p&gt;

&lt;p&gt;Usually, the way this works is by trying to identify "independent patterns" on the task. Meaning, trying to split the problem in parts that can be solved or at least partially solved independently from the others. In our particular case, a such pattern is that the websites-links graph isn't a &lt;em&gt;connected graph&lt;/em&gt; - meaning there exist groups of websites that only contain links within the group.&lt;/p&gt;

&lt;p&gt;From the above observation we conclude that there shouldn't be a problem to calculate the rating of the websites within different groups independently and indeed this is the case.&lt;/p&gt;

&lt;h2&gt;
  
  
  Developing and testing the parallel algorithm
&lt;/h2&gt;

&lt;p&gt;Great! Now we have a plan on our parallel approach. All that's left to do is develop it, test it and hopefully enjoy the speed up!&lt;/p&gt;

&lt;p&gt;In this project, I used a simple graph coloring algorithm to identify independent groups of nodes that can be assigned to different threads and be picked up by different CPU cores. Finally, the CPU cores synced after every step, in order for the algorithm to behave exactly like the serial Gauss-Seidel one. In a non for-education project, that would not be needed and the resulting algorithm would be using a merge of the Gauss-Seidel and Jacobi methods.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--OWmwGSmy--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/rk452wsmtdej05hn6una.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--OWmwGSmy--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/rk452wsmtdej05hn6una.png" alt="Time comparison between the serial and the parallel algorithms." title="Iteration step times with the parallel algorithm on an 8 core Intel Xeon E5420 @2.5GHz and 8GB ram."&gt;&lt;/a&gt;&lt;br&gt;Iteration step times with the parallel algorithm on an 8 core Intel Xeon E5420 @2.5GHz and 8GB ram.
  &lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--SC7BNUo3--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/8zmajappdmliyoymmox0.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--SC7BNUo3--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/8zmajappdmliyoymmox0.png" alt="Speed up gained by using the parallel algorithm instead of the serial one." title="Iteration step speed up with the parallel algorithm on an 8 core Intel Xeon E5420 @2.5GHz and 8GB ram."&gt;&lt;/a&gt;&lt;br&gt;Iteration step speed up with the parallel algorithm on an 8 core Intel Xeon E5420 @2.5GHz and 8GB ram.
  &lt;/p&gt;

&lt;p&gt;The results above correspond to a web graph of &lt;strong&gt;875,713&lt;/strong&gt; websites and &lt;strong&gt;4,563,235&lt;/strong&gt; links and the programs were compiled using &lt;strong&gt;GCC v.5.4.0&lt;/strong&gt; with the &lt;strong&gt;-O3&lt;/strong&gt; optimization flag enabled. The solution vector's convergence is also shown and is identical for both the parallel and serial versions of the program.&lt;/p&gt;

&lt;h2&gt;
  
  
  Link to Code
&lt;/h2&gt;


&lt;div class="ltag-github-readme-tag"&gt;
  &lt;div class="readme-overview"&gt;
    &lt;h2&gt;
      &lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--vWogaON8--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://practicaldev-herokuapp-com.freetls.fastly.net/assets/github-logo-28d89282e0daa1e2496205e2f218a44c755b0dd6536bbadf5ed5a44a7ca54716.svg" alt="GitHub logo"&gt;
      &lt;a href="https://github.com/nicktheway"&gt;
        nicktheway
      &lt;/a&gt; / &lt;a href="https://github.com/nicktheway/Pagerank"&gt;
        Pagerank
      &lt;/a&gt;
    &lt;/h2&gt;
    &lt;h3&gt;
      A parallel C implementation using the Gauss Seidel method.
    &lt;/h3&gt;
  &lt;/div&gt;
  &lt;div class="ltag-github-body"&gt;
    
&lt;div id="readme" class="md"&gt;
&lt;h1&gt;
Pagerank Calculator&lt;/h1&gt;
&lt;h2&gt;
A C implementation of the pagerank algorithm with synchronous and asynchronous Gauss-Seidel iterations.&lt;/h2&gt;
&lt;p&gt;The algorithm needs the CRS representation of a graph in memory in order to compute its pagerank
Methods for loading web-graphs into the memory that way are provided for a specific file structure
Therefore, in order to load any web-graph, you might need to write a method that converts
your web-graph file into a file of that structure
For an example of such method check the already implemented
one that converts files like the ones &lt;a href="https://snap.stanford.edu/data/" rel="nofollow"&gt;here&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;For the parallel algorithm, a simple edge coloring algorithm has been written in a way that makes the
algorithm equivalent to the serial one. Gauss-Seidel convergence speed depends on the order of the nodes too
The best way to order the nodes is NP but there are heuristic greedy ways to do it.
Non such way has been…&lt;/p&gt;
&lt;/div&gt;
  &lt;/div&gt;
  &lt;div class="gh-btn-container"&gt;&lt;a class="gh-btn" href="https://github.com/nicktheway/Pagerank"&gt;View on GitHub&lt;/a&gt;&lt;/div&gt;
&lt;/div&gt;


&lt;h2&gt;
  
  
  Notable headaches
&lt;/h2&gt;

&lt;p&gt;Here is a non-exclusive list of issues I had to overcome while working on this project:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;How do I store and work with a table of up to about a million rows and columns? &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Solution&lt;/strong&gt;: Thankfully the table is sparse -&amp;gt; Used a compressed row storage (CRS) format.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;How do I support different formats of input data?&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Solution&lt;/strong&gt;: Created a special type of file optimized to be used by the program, then created a library for managing this new type of file. Can now work with any type of input by just providing a map function to the provided conversion program.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Introducing the traveler's teleport ability makes it possible for the traveler to jump from any node to any other, doesn't that break the sparsity of the table?&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Indeed it does, but as the teleport probability is equal for every node, there is a workaround equation that works with the original -no teleportation included- sparse table.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Programming language headaches.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;For the node-coloring I wished I could use C++ to make use of the standard library's vector structure, in the end I ended up creating a custom vector structure in C. &lt;/p&gt;

&lt;h2&gt;
  
  
  Tools that helped
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://git-scm.com/"&gt;Git&lt;/a&gt;, even if used in a basic level, having version control usually makes life easier, this project was no exception.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://www.doxygen.nl/"&gt;Doxygen&lt;/a&gt;, very easy way to generate code documentation. &lt;a href="https://nicktheway.github.io/Pagerank/"&gt;Here&lt;/a&gt; is this project's code doc.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://github.com/"&gt;Github&lt;/a&gt;, version control and hosting the docs solution.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://www.gnu.org/software/make/manual/make.html"&gt;GNU make&lt;/a&gt;, the standard compilation command wrapper.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://www.latex-project.org/"&gt;LaTeX&lt;/a&gt;, creating the project's report. All the latex files are included inside the git repo, &lt;a href="https://github.com/nicktheway/Pagerank/blob/master/latex/report.pdf"&gt;here&lt;/a&gt; is the produced PDF. Although it is in Greek, the references titles are in English and can be found in the last page.&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  Thanks for reading and congratulations for your graduation everyone!
&lt;/h4&gt;

</description>
      <category>octograd2020</category>
      <category>c</category>
      <category>pagerank</category>
      <category>multicore</category>
    </item>
  </channel>
</rss>
