<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Jovanny Cruz</title>
    <description>The latest articles on DEV Community by Jovanny Cruz (@jovannypcg).</description>
    <link>https://dev.to/jovannypcg</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F50982%2F568ab3d8-5386-424f-9dbf-40dcc3eca365.jpeg</url>
      <title>DEV Community: Jovanny Cruz</title>
      <link>https://dev.to/jovannypcg</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/jovannypcg"/>
    <language>en</language>
    <item>
      <title>Back to the Future in Clojure</title>
      <dc:creator>Jovanny Cruz</dc:creator>
      <pubDate>Sun, 09 Feb 2020 01:05:44 +0000</pubDate>
      <link>https://dev.to/jovannypcg/back-to-the-future-in-clojure-4i29</link>
      <guid>https://dev.to/jovannypcg/back-to-the-future-in-clojure-4i29</guid>
      <description>&lt;p&gt;Parallel processing is an exciting topic, especially when talking about heavy processes that can be split into smaller tasks, executed in isolation, aggregated, and presented as a single result.&lt;/p&gt;

&lt;p&gt;Clojure provides different ways to accomplish this kind of job, and &lt;em&gt;futures&lt;/em&gt; are a handy way to split a big task across parallel threads. Their results are eventually gathered by the process that started them, which aggregates them to produce the same outcome faster.&lt;/p&gt;

&lt;h1&gt;
  
  
  Talking Futures
&lt;/h1&gt;

&lt;p&gt;You can think of a &lt;em&gt;future&lt;/em&gt; as a piece of code that runs on its own, independent thread. More specifically, &lt;code&gt;future&lt;/code&gt; is a Clojure macro that takes a body of expressions and executes them in another thread. The macro returns a reference to the triggered &lt;em&gt;future&lt;/em&gt;, which allows us to interact with it via some helper functions:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://clojuredocs.org/clojure.core/future_q"&gt;future?&lt;/a&gt;: Checks whether the provided argument is a &lt;em&gt;future&lt;/em&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://clojuredocs.org/clojure.core/future-done_q"&gt;future-done?&lt;/a&gt;: Takes a &lt;em&gt;future&lt;/em&gt; as argument and checks whether it has completed.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://clojuredocs.org/clojure.core/future-cancel"&gt;future-cancel&lt;/a&gt;: Attempts to cancel the given &lt;em&gt;future&lt;/em&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://clojuredocs.org/clojure.core/future-cancelled_q"&gt;future-cancelled?&lt;/a&gt;: Checks whether the &lt;em&gt;future&lt;/em&gt; passed as argument was cancelled.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The result of a &lt;em&gt;future&lt;/em&gt; is cached once its body has finished, so later calls to &lt;code&gt;deref&lt;/code&gt; return the stored value without running the body again. We can also ask for the outcome before the &lt;em&gt;future&lt;/em&gt; has finished by calling &lt;code&gt;(deref my-future)&lt;/code&gt;. In this case, the thread that calls &lt;code&gt;deref&lt;/code&gt; blocks until the &lt;em&gt;future&lt;/em&gt; is done.&lt;/p&gt;
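
&lt;p&gt;As a minimal sketch of the helper functions listed above, a &lt;em&gt;future&lt;/em&gt; can be created, inspected and dereferenced as follows. The name &lt;code&gt;slow-answer&lt;/code&gt; and the 200 ms sleep are illustrative assumptions, not part of the article's code.&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight clojure"&gt;&lt;code&gt;;; Hypothetical example: simulate a slow computation on another thread.
(def slow-answer
  (future
    (Thread/sleep 200) ; pretend to do expensive work
    42))

(future? slow-answer)      ; true, it was created by the future macro
(future-done? slow-answer) ; most likely false right after creation

;; deref with a timeout (in ms) returns the fallback value instead of
;; blocking until the body has finished.
(deref slow-answer 10 :not-ready-yet)

;; Plain deref (or the @ reader macro) blocks until the body has finished.
(deref slow-answer)             ; 42
(future-done? slow-answer)      ; true
(future-cancelled? slow-answer) ; false, it completed normally
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;The three-argument form of &lt;code&gt;deref&lt;/code&gt; is handy when the caller should not wait indefinitely for a slow task.&lt;/p&gt;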

&lt;h1&gt;
  
  
  Getting the Stargazers from GitHub
&lt;/h1&gt;

&lt;p&gt;We are interested in getting the total number of stars that &lt;a href="https://github.com/clojure"&gt;Clojure&lt;/a&gt; has on GitHub. That is, the sum of stars from each repository that belongs to the Clojure account.&lt;/p&gt;

&lt;p&gt;This information can be readily fetched from the &lt;a href="https://developer.github.com/v3/"&gt;GitHub API&lt;/a&gt;. The &lt;code&gt;GET /users/:username/repos&lt;/code&gt; &lt;a href="https://developer.github.com/v3/repos/#list-user-repositories"&gt;endpoint&lt;/a&gt; already responds with the star count of each repository. We could just loop over the returned JSON array, get the star counts and sum them up.&lt;/p&gt;

&lt;p&gt;For the sake of this article, let's assume the endpoint above does not return the star counts, only the names of the repositories, so that we have to hit the &lt;code&gt;GET /repos/:owner/:repo&lt;/code&gt; &lt;a href="https://developer.github.com/v3/repos/#get"&gt;endpoint&lt;/a&gt; to get each star count. This means that for each repository name, a request to &lt;code&gt;GET /repos/:owner/:repo&lt;/code&gt; will be launched.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--dER5blRD--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/qokm249rxkosa3u6fi94.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--dER5blRD--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/qokm249rxkosa3u6fi94.png" alt="" title="Aggregate Repositories Synchronously"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;As the image above suggests, getting the stars from each repository is a sequential, synchronous task, and each request might take a considerable amount of time to complete. Once all requests have successfully responded, the &lt;code&gt;reduce&lt;/code&gt; function comes into play to sum up all the stars.&lt;/p&gt;

&lt;p&gt;The following function, &lt;code&gt;get-star-count&lt;/code&gt;, executes the HTTP request synchronously to get the repository details and then extracts the &lt;code&gt;stargazers-count&lt;/code&gt; attribute from the response, which represents the star count.&lt;/p&gt;


&lt;div class="ltag_gist-liquid-tag"&gt;
  
&lt;/div&gt;


&lt;p&gt;Now, let’s take a look at the &lt;code&gt;sum-stars&lt;/code&gt; function. It provides two arities: one that takes only the username and another that also requires a sequence of repository names:&lt;/p&gt;


&lt;div class="ltag_gist-liquid-tag"&gt;
  
&lt;/div&gt;


&lt;p&gt;Note that the first arity in line 3 calls the 2-arity function and sends the result of &lt;code&gt;github/repo-names&lt;/code&gt; as the second argument (&lt;code&gt;repos&lt;/code&gt;). This involves an additional HTTP call, which spares us from entering the repo names by hand. We can keep this additional HTTP call from interfering with &lt;code&gt;sum-stars&lt;/code&gt; by either providing a predefined sequence containing the repository names or storing the result of &lt;code&gt;github/repo-names&lt;/code&gt; in a var and then passing it in to the function.&lt;/p&gt;

&lt;p&gt;Each repository is passed in to &lt;code&gt;get-star-count&lt;/code&gt; in line 6, blocking the main thread until each single request is finished synchronously. Then the results are aggregated by &lt;code&gt;reduce&lt;/code&gt;.&lt;/p&gt;
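
&lt;p&gt;As a rough, self-contained sketch of the synchronous flow described above (not the article's original code), the HTTP request can be replaced by a stand-in that sleeps before answering. The &lt;code&gt;fake-stars&lt;/code&gt; map, its numbers and the 100 ms delay are assumptions made for illustration, and only the 2-arity version of &lt;code&gt;sum-stars&lt;/code&gt; is shown.&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight clojure"&gt;&lt;code&gt;;; Hypothetical stand-in for the GitHub API: repository name to star count.
;; The numbers are made up for illustration.
(def fake-stars {"repo-a" 10, "repo-b" 20, "repo-c" 30})

(defn get-star-count
  "Simulates one blocking HTTP request by sleeping before answering."
  [username repo]
  (Thread/sleep 100) ; pretend network latency
  (get fake-stars repo 0))

(defn sum-stars
  "Fetches the star count of every repo, one after another, and sums them."
  [username repos]
  (reduce + (map (fn [repo] (get-star-count username repo)) repos)))

(sum-stars "some-user" (keys fake-stars)) ; 60, after roughly 3 x 100 ms
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;Because every simulated request runs on the calling thread, the total time grows linearly with the number of repositories.&lt;/p&gt;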

&lt;h1&gt;
  
  
  Futures in Action
&lt;/h1&gt;

&lt;p&gt;Now we are going to implement the same functionality using &lt;em&gt;futures&lt;/em&gt;. This time, for each repository a &lt;em&gt;future&lt;/em&gt; that executes &lt;code&gt;GET /repos/:owner/:repo&lt;/code&gt; will be launched.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--0qK1ZUdK--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/q0e1reu017gtwbsuikks.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--0qK1ZUdK--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/q0e1reu017gtwbsuikks.png" alt="" title="Aggregate Repositories Asynchronously"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Every HTTP request runs in parallel, as shown by the image above. The &lt;em&gt;futures&lt;/em&gt; are triggered in point A using the &lt;code&gt;future&lt;/code&gt; macro. Once they all are ready with their response, a &lt;em&gt;list comprehension&lt;/em&gt; can be utilized to make them meet in point B in order to be reduced to a single result.&lt;/p&gt;

&lt;p&gt;Note that some requests might take longer than others, as they depend on the network throughput, the latency of the GitHub server or even the speed of the processor running the threads. So the total time is roughly determined by the slowest request, since &lt;code&gt;sum-stars&lt;/code&gt; cannot finish before that request does.&lt;/p&gt;


&lt;div class="ltag_gist-liquid-tag"&gt;
  
&lt;/div&gt;


&lt;p&gt;The 1-arity function remains pretty much the same as in the synchronous version. All the repos of the given username are passed in to the 2-arity function.&lt;/p&gt;

&lt;p&gt;The interesting part begins in line 5, with the &lt;code&gt;for&lt;/code&gt; expression, a list comprehension that produces a sequence of &lt;em&gt;futures&lt;/em&gt; (line 6). Each &lt;em&gt;future&lt;/em&gt; in the sequence is a reference to a task working independently on its own thread. Having this sequence of &lt;em&gt;futures&lt;/em&gt; allows us to interact with them using the helper functions explained earlier or simply to &lt;code&gt;deref&lt;/code&gt; them and get the result.&lt;/p&gt;

&lt;p&gt;The pipeline in line 7 shows the process to aggregate each result. &lt;code&gt;deref&lt;/code&gt; is mapped over the sequence of &lt;em&gt;futures&lt;/em&gt;, which blocks the main thread until each parallel thread is done. Once derefed, the sequence of &lt;em&gt;futures&lt;/em&gt; becomes a sequence of actual star counts. Then they are passed in to the &lt;code&gt;reduce&lt;/code&gt; function that simply sums them up.&lt;/p&gt;
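
&lt;p&gt;The same idea can be sketched in a self-contained way by simulating the HTTP request with a sleep. This is not the article's original code: &lt;code&gt;fake-stars&lt;/code&gt;, its numbers and the 100 ms delay are assumptions, and the sketch swaps the &lt;code&gt;for&lt;/code&gt; comprehension for the eager &lt;code&gt;mapv&lt;/code&gt;. Since &lt;code&gt;for&lt;/code&gt; returns a lazy sequence, using &lt;code&gt;mapv&lt;/code&gt; guarantees that every &lt;em&gt;future&lt;/em&gt; has been started before the first &lt;code&gt;deref&lt;/code&gt; blocks.&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight clojure"&gt;&lt;code&gt;;; Hypothetical stand-in for the GitHub API, with made-up numbers.
(def fake-stars {"repo-a" 10, "repo-b" 20, "repo-c" 30})

(defn get-star-count
  "Simulates one blocking HTTP request by sleeping before answering."
  [username repo]
  (Thread/sleep 100) ; pretend network latency
  (get fake-stars repo 0))

(defn sum-stars-async
  "Starts one future per repo, then waits for all of them and sums the results."
  [username repos]
  (let [;; mapv is eager, so every future is running before any deref happens.
        futures (mapv (fn [repo] (future (get-star-count username repo))) repos)]
    ;; Each deref blocks until its future is done; the waits overlap.
    (reduce + (map deref futures))))

(sum-stars-async "some-user" (keys fake-stars)) ; 60, after roughly 100 ms
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;When code like this runs as a script rather than at the REPL, calling &lt;code&gt;(shutdown-agents)&lt;/code&gt; at the end lets the JVM exit promptly, because &lt;em&gt;futures&lt;/em&gt; run on a thread pool shared with agents.&lt;/p&gt;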

&lt;h1&gt;
  
  
  Benchmarking
&lt;/h1&gt;

&lt;p&gt;Let’s see how fast the sync and async versions work. In order to do that, we are going to use the handy &lt;code&gt;time&lt;/code&gt; macro that is part of &lt;code&gt;clojure.core&lt;/code&gt;.&lt;/p&gt;
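
&lt;p&gt;As a small illustration, &lt;code&gt;time&lt;/code&gt; evaluates an expression, prints how long it took and returns the value of the expression. The &lt;code&gt;elapsed-ms&lt;/code&gt; helper below is a hypothetical addition, not part of &lt;code&gt;clojure.core&lt;/code&gt; or of the article's code, for cases where the elapsed time is needed as a number.&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight clojure"&gt;&lt;code&gt;;; time prints the elapsed wall-clock time and returns the value of the expression.
(time (do (Thread/sleep 100) :done))
;; prints an "Elapsed time: ... msecs" line and returns :done

;; Hypothetical helper: measure elapsed milliseconds as a number instead of printing.
(defn elapsed-ms
  "Calls the zero-argument function f and returns how long it took, in milliseconds."
  [f]
  (let [start (System/nanoTime)]
    (f)
    (/ (double (- (System/nanoTime) start)) 1000000.0)))

(elapsed-ms (fn [] (Thread/sleep 100))) ; a bit over 100.0
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;A single run measured this way includes network and JVM warm-up noise, so the numbers are best read as rough indications.&lt;/p&gt;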

&lt;h2&gt;
  
  
  Considerations
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Executed on MacBook Pro, macOS Catalina 10.15.1, 2.2GHz 6-Core Intel i7, 16 GB 2400 MHz DDR4.&lt;/li&gt;
&lt;li&gt;The &lt;a href="https://github.com/clojure"&gt;Clojure&lt;/a&gt; account on GitHub is used for the tests.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;github/repo-names&lt;/code&gt; returns at most 30 repos, even though &lt;a href="https://github.com/clojure"&gt;Clojure&lt;/a&gt; owns 85+.&lt;/li&gt;
&lt;li&gt;Based on the point above, the tests consider 30 repositories from the &lt;a href="https://github.com/clojure"&gt;Clojure&lt;/a&gt; account.
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight clojure"&gt;&lt;code&gt;&lt;span class="n"&gt;user=&amp;gt;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;username&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s"&gt;"clojure"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="n"&gt;user=&amp;gt;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;repos&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;github/repo-names&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;username&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="n"&gt;user=&amp;gt;&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="n"&gt;user=&amp;gt;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;time&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;core/sum-stars&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;username&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;repos&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="s"&gt;"Elapsed time: 13841.095288 msecs"&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="mi"&gt;20263&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="n"&gt;user=&amp;gt;&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="n"&gt;user=&amp;gt;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;time&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;core/sum-stars-async&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;username&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;repos&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="s"&gt;"Elapsed time: 587.558128 msecs"&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="mi"&gt;20263&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;As you can see, both versions return the same result, &lt;code&gt;20263&lt;/code&gt;, which represents the total stars of the first 30 repositories of Clojure on GitHub. But the synchronous version takes almost 14 seconds, compared to about 588 ms for the asynchronous call, which makes the asynchronous version more than 23 times faster in this run.&lt;/p&gt;

&lt;h1&gt;
  
  
  Conclusion
&lt;/h1&gt;

&lt;p&gt;Parallel processing provides a powerful way to split a heavy task into smaller, independent problems whose results can be aggregated once they all are done. &lt;em&gt;Futures&lt;/em&gt; in Clojure are an easy-to-use approach to accomplish this kind of task.&lt;/p&gt;

&lt;p&gt;Next time you are dealing with a function that is taking too long to complete, consider using &lt;em&gt;futures&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;Do not forget to clone and play around with the &lt;a href="https://github.com/jovannypcg/parallel-processing"&gt;code of this article&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Happy coding!&lt;/p&gt;

&lt;p&gt;The cover image is a Photo by &lt;a href="https://unsplash.com/@jamie452?utm_source=unsplash&amp;amp;utm_medium=referral&amp;amp;utm_content=creditCopyText"&gt;Jamie Street&lt;/a&gt; on &lt;a href="https://unsplash.com/s/photos/parallel?utm_source=unsplash&amp;amp;utm_medium=referral&amp;amp;utm_content=creditCopyText"&gt;Unsplash&lt;/a&gt;&lt;/p&gt;

</description>
      <category>clojure</category>
      <category>functional</category>
    </item>
  </channel>
</rss>
