<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: jesusramirezs</title>
    <description>The latest articles on DEV Community by jesusramirezs (@jesusramirezs).</description>
    <link>https://dev.to/jesusramirezs</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F460550%2F251833ac-1816-4f1c-bf1e-f41e7f6069af.jpeg</url>
      <title>DEV Community: jesusramirezs</title>
      <link>https://dev.to/jesusramirezs</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/jesusramirezs"/>
    <language>en</language>
    <item>
      <title>“Scroll Restoration”, React Router and my custom solution for React Studyboard</title>
      <dc:creator>jesusramirezs</dc:creator>
      <pubDate>Mon, 21 Dec 2020 10:13:03 +0000</pubDate>
      <link>https://dev.to/jesusramirezs/scroll-restoration-react-router-and-my-custom-solution-for-react-studyboard-14bg</link>
      <guid>https://dev.to/jesusramirezs/scroll-restoration-react-router-and-my-custom-solution-for-react-studyboard-14bg</guid>
      <description>&lt;p&gt;I keep working on improvements for &lt;a href="https://dev.to/jesusramirezs/react-studyboard-react-hooks-redux-5h5p"&gt;&lt;strong&gt;React Studyboard&lt;/strong&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Github repository&lt;/strong&gt;: &lt;a href="https://github.com/jesusramirezs/react-studyboard" rel="noopener noreferrer"&gt;https://github.com/jesusramirezs/react-studyboard&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;I would like to write in this article about:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;“scrollRestoration”&lt;/strong&gt; and React Router.&lt;/li&gt;
&lt;li&gt;My solution to resume reading a text at the point where it was last left.&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  1. “scrollRestoration” and React Router
&lt;/h2&gt;

&lt;p&gt;According to developer.mozilla, “The scrollRestoration property of History interface allows web applications to explicitly set default scroll restoration behavior on history navigation.” (&lt;a href="https://developer.mozilla.org/en-US/docs/Web/API/History/scrollRestoration" rel="noopener noreferrer"&gt;https://developer.mozilla.org/en-US/docs/Web/API/History/scrollRestoration&lt;/a&gt;).&lt;/p&gt;

&lt;p&gt;This browser feature has raised some debate in the past when using &lt;strong&gt;React Router&lt;/strong&gt;, especially when it leads to unwanted behavior. For example, in a &lt;strong&gt;SPA&lt;/strong&gt; (Single Page Application), when we navigate through React Router from one "page" to another, the browser keeps the scroll position of the first page on the next one, instead of starting at the top of the new page, which would be more logical and natural.&lt;/p&gt;

&lt;p&gt;See, for example, the following conversation from some time ago, where the problem was detected and a solution began to emerge:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://github.com/ReactTraining/react-router/issues/252" rel="noopener noreferrer"&gt;https://github.com/ReactTraining/react-router/issues/252&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;There are times when it is desirable to keep this behavior and other times when it is not.&lt;/p&gt;

&lt;p&gt;After some time trying to address the issue with partial solutions, officially, React Router has chosen not to offer support to control this property. According to the documentation:&lt;/p&gt;

&lt;p&gt;&lt;em&gt;"In earlier versions of React Router we provided out-of-the-box support for scroll restoration and people have been asking for it ever since....Because browsers are starting to handle the “default case” and apps have varying scrolling needs, we don’t ship with default scroll management."&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;(&lt;a href="https://reactrouter.com/web/guides/scroll-restoration" rel="noopener noreferrer"&gt;https://reactrouter.com/web/guides/scroll-restoration&lt;/a&gt;)&lt;/p&gt;

&lt;p&gt;As a result, when automatic scroll restoration is not wanted, especially in SPAs, developers must write their own solution, as described in the same guide or in examples like this one:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://spectrum.chat/react/general/handling-scroll-position-on-route-changes%7E1e897a67-c05f-40c0-946b-d459d93615bf" rel="noopener noreferrer"&gt;https://spectrum.chat/react/general/handling-scroll-position-on-route-changes~1e897a67-c05f-40c0-946b-d459d93615bf&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  2. My solution to resume reading a text at the point where it was last left
&lt;/h2&gt;

&lt;p&gt;So, for example, in my case, to prevent this behavior in the most reliable way, I have placed the following code in the "header" component to switch the “scrollRestoration” property of “window.history” to manual:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;    &lt;span class="nf"&gt;useEffect&lt;/span&gt;&lt;span class="p"&gt;(()&lt;/span&gt;&lt;span class="o"&gt;=&amp;gt;&lt;/span&gt;  &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;scrollRestoration&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="k"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;window&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;history&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="nb"&gt;window&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;history&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;scrollRestoration&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;manual&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;
          &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;},[]);&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;And for those components where I want the page to be displayed from a scrolling position at the top of the page, I use the following code:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;    &lt;span class="nf"&gt;useEffect&lt;/span&gt;&lt;span class="p"&gt;(()&lt;/span&gt;&lt;span class="o"&gt;=&amp;gt;&lt;/span&gt;  &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="nb"&gt;window&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;scrollTo&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
      &lt;span class="p"&gt;},[]);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;But there is a particular case in which I find it necessary to maintain the browser scroll position &lt;strong&gt;when visiting a page for the second time: the article page, which is the essential page in the app. Thus, when I want to resume reading an article, which could be long&lt;/strong&gt;, I find it convenient that the browser positions me at the point where I last stopped reading, something like a virtual bookmark.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2F7hcre23btg3qu93a9xmu.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2F7hcre23btg3qu93a9xmu.png" alt="Alt Text"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;I consider this functionality vital since it contributes to significantly improving the app's user experience by maintaining control over the reading and saving the reader time every time he or she returns to any of the articles.&lt;/p&gt;

&lt;p&gt;Also, I think it is interesting that &lt;strong&gt;the list of articles in a category or section shows the progress made in reading each of them&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;In addition, &lt;strong&gt;the solution adopted for this problem can also be used so that when you click on an annotation, the application not only navigates to the article but also positions us exactly at the paragraph to which it refers&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;The solution seems simple: store in Redux (the state manager I use in the project) the last scroll position of each article from the last visit to the page, read, for example, from the &lt;strong&gt;window.pageYOffset&lt;/strong&gt; property, and when returning to the page, call scrollTo with the previously stored position.&lt;/p&gt;

&lt;p&gt;This &lt;strong&gt;window.pageYOffset&lt;/strong&gt; property is monitored to show a thin reading progress bar at the top of the page.&lt;/p&gt;
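
&lt;p&gt;As a rough sketch of how a progress value for such a bar could be derived from the scroll offset, the function below turns an offset into a percentage. The name readingProgressPercent and its parameters are assumptions made for illustration; the project's own calcProgress() is not shown in this article and may work differently.&lt;/p&gt;

```javascript
// Hypothetical helper, not taken from the project: converts a scroll offset
// into a 0-100 reading progress value.
function readingProgressPercent(scrollTop, scrollHeight, viewportHeight) {
  // Distance the page can actually scroll; zero when everything fits on screen.
  const scrollable = Math.max(scrollHeight - viewportHeight, 0);
  if (scrollable === 0) {
    return 100; // nothing to scroll, so the whole article is already visible
  }
  const ratio = scrollTop / scrollable;
  // Clamp to [0, 1] to absorb overscroll values reported by some browsers.
  return Math.round(Math.min(Math.max(ratio, 0), 1) * 100);
}

// In a browser the inputs would come from the document, for example:
// readingProgressPercent(window.pageYOffset,
//   document.documentElement.scrollHeight, window.innerHeight);
console.log(readingProgressPercent(500, 3000, 1000)); // 25
```

&lt;p&gt;Keeping the calculation free of DOM access makes it easy to check with plain numbers before wiring it to a scroll listener.&lt;/p&gt;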

&lt;p&gt;But this simple solution has some problems:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;The app allows you to modify the preferred font used in the articles' text and its size. If these properties are modified between two visits to the same article, the scroll position may no longer be correct, since the height of each line will probably have changed.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;If the author modifies the article's content between two reading sessions, adding new text or images, or, as foreseen for future features, the content is dynamically enriched with contributions from other readers, a reading position based on an offset will no longer be valid either.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Therefore, it makes more sense to mark the last reading position based on the paragraphs visible in the browser at a given time rather than on the offset.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In the article.component, the text is divided into “paragraphs” (which may contain text or other content such as images or video).&lt;/p&gt;

&lt;p&gt;Each of these paragraphs is managed by a &lt;strong&gt;TextBlock&lt;/strong&gt; component (pending renaming to a more appropriate name). &lt;/p&gt;

&lt;p&gt;The reason for this design decision is that it keeps unrelated functionality separate, making the code more readable. This &lt;strong&gt;TextBlock&lt;/strong&gt; component deals with things like highlighting text, formatting Markdown, and displaying or editing annotations.&lt;/p&gt;

&lt;p&gt;Each TextBlock instance is embedded in a component called &lt;strong&gt;VisibilitySensor&lt;/strong&gt;, provided by the &lt;strong&gt;“react-visibility-sensor”&lt;/strong&gt; package.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2Fvk18r9zs2mvai43cme49.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2Fvk18r9zs2mvai43cme49.png" alt="Alt Text"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This package provides a very useful feature for our purposes: it detects when a component becomes visible or invisible in the browser or inside another component depending on the scroll position.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nx"&gt;VisibilitySensor&lt;/span&gt; &lt;span class="nx"&gt;scrollCheck&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="nx"&gt;scrollThrottle&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="nx"&gt;partialVisibility&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="nx"&gt;onChange&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nf"&gt;visibilityChange&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;key&lt;/span&gt;&lt;span class="p"&gt;)}&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Each time a component's visibility changes, we check whether it is due to an upward or downward scroll and thus determine which paragraph is the first active one on the page:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;visibilityChange&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;key&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;isVisible&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;


      &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;previous_scroll&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;lastScroll&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;current&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
      &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;new_scroll&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;window&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;pageYOffset&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="nb"&gt;document&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;documentElement&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;scrollTop&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="nb"&gt;document&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;body&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;scrollTop&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

      &lt;span class="k"&gt;if&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;new_scroll&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="nx"&gt;previous_scroll&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;

        &lt;span class="k"&gt;if&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="nx"&gt;isVisible&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="nx"&gt;new_scroll&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="nx"&gt;previous_scroll&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;isVisible&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="nx"&gt;new_scroll&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="nx"&gt;previous_scroll&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;

          &lt;span class="nf"&gt;dispatch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;updateProgressAtReadingStatus&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;&lt;span class="na"&gt;articleId&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;article&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;articleId&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;progress&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;calcProgress&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="na"&gt;textBlockId&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;key&lt;/span&gt;&lt;span class="p"&gt;}));&lt;/span&gt;

          &lt;span class="nx"&gt;lastScrollTime&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;current&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;Date&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;now&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
          &lt;span class="nx"&gt;lastScroll&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;current&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;new_scroll&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;

      &lt;span class="p"&gt;}&lt;/span&gt;
      &lt;span class="nx"&gt;lastScroll&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;current&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;window&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;pageYOffset&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="nb"&gt;document&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;documentElement&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;scrollTop&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="nb"&gt;document&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;body&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;scrollTop&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

    &lt;span class="p"&gt;}&lt;/span&gt;


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
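
&lt;p&gt;The decision taken inside visibilityChange can also be read as a small pure function. The sketch below mirrors the inner condition of the handler; shouldBookmark is a hypothetical name used only for illustration and is not part of the project.&lt;/p&gt;

```javascript
// Sketch only: decides whether the block that just changed visibility should
// become the stored reading position. shouldBookmark is a hypothetical name.
function shouldBookmark(isVisible, previousScroll, newScroll) {
  const scrollingDown = newScroll > previousScroll;
  const scrollingUp = previousScroll > newScroll;
  if (scrollingDown) {
    // Moving down, blocks leave the viewport through its top edge,
    // so a block that just became invisible marks the reading position.
    return !isVisible;
  }
  if (scrollingUp) {
    // Moving up, blocks enter the viewport through its top edge,
    // so a block that just became visible marks the reading position.
    return isVisible;
  }
  return false; // no vertical movement, keep the current bookmark
}

console.log(shouldBookmark(false, 100, 250)); // true
console.log(shouldBookmark(true, 100, 250)); // false
```

&lt;p&gt;Separating the rule from the dispatch call makes it possible to check each scroll direction without rendering any component.&lt;/p&gt;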



&lt;p&gt;Then, the identifier of this new active paragraph is sent to Redux:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="nf"&gt;dispatch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;updateProgressAtReadingStatus&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;&lt;span class="na"&gt;articleId&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;article&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;articleId&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;progress&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;calcProgress&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="na"&gt;textBlockId&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;key&lt;/span&gt;&lt;span class="p"&gt;}));&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;And here is the second part of all this. Once we resume reading the article, the first thing we do is read the identifier of the last active paragraph from Redux:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="nf"&gt;useSelector&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;state&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nf"&gt;selectArticleLastTextBlockId&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;article&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;articleId&lt;/span&gt;&lt;span class="p"&gt;)(&lt;/span&gt;&lt;span class="nx"&gt;state&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="nx"&gt;equality_selector&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
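
&lt;p&gt;To make the round trip through Redux more concrete, here is a minimal sketch of an action creator, a reducer and a curried selector that would fit the calls shown above. The names updateProgressAtReadingStatus and selectArticleLastTextBlockId come from the snippets in this article, but the action type, the state shape and the readingStatus slice name are assumptions, so the real project may organise them differently.&lt;/p&gt;

```javascript
// Assumed action type and state shape; not taken from the project.
const UPDATE_PROGRESS = "readingStatus/updateProgress";

const updateProgressAtReadingStatus = ({ articleId, progress, textBlockId }) => ({
  type: UPDATE_PROGRESS,
  payload: { articleId, progress, textBlockId },
});

// Reducer: keeps one entry per article, replaced on every update.
function readingStatusReducer(state = {}, action) {
  if (action.type !== UPDATE_PROGRESS) {
    return state;
  }
  const { articleId, progress, textBlockId } = action.payload;
  return { ...state, [articleId]: { progress, textBlockId } };
}

// Curried selector, matching the call shape used with useSelector.
// Assumes the reducer is mounted under the "readingStatus" key.
const selectArticleLastTextBlockId = (articleId) => (state) => {
  const entry = state.readingStatus[articleId];
  return entry ? entry.textBlockId : null;
};

const readingStatus = readingStatusReducer(
  undefined,
  updateProgressAtReadingStatus({ articleId: "a1", progress: 40, textBlockId: "p7" })
);
console.log(selectArticleLastTextBlockId("a1")({ readingStatus })); // p7
```

&lt;p&gt;Returning null for an article that has never been opened lets the component fall back to the top of the page.&lt;/p&gt;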



&lt;p&gt;And then scroll to the position of that paragraph:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="nx"&gt;scrollRef&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;current&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;scrollIntoView&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;behavior&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;smooth&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;block&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;start&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;});&lt;/span&gt; 
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Here is an interesting discussion about &lt;strong&gt;scrollIntoView&lt;/strong&gt;: &lt;/p&gt;

&lt;p&gt;&lt;a href="https://stackoverflow.com/questions/48634459/scrollintoview-block-vs-inline/48635751#48635751" rel="noopener noreferrer"&gt;https://stackoverflow.com/questions/48634459/scrollintoview-block-vs-inline/48635751#48635751&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;My conclusion is that a seemingly simple feature requires some development effort and creativity. Thanks to the numerous components available, it is possible to arrive at acceptable solutions in a short time.&lt;/p&gt;

&lt;p&gt;Thanks for reading this article. Any feedback will be greatly appreciated.&lt;/p&gt;

&lt;p&gt;Connect with me on &lt;a href="https://twitter.com/jesusramirezs" rel="noopener noreferrer"&gt;Twitter&lt;/a&gt; or &lt;a href="https://www.linkedin.com/in/jesusramirezserran/?locale=en_US" rel="noopener noreferrer"&gt;LinkedIn&lt;/a&gt;&lt;/p&gt;

</description>
      <category>react</category>
      <category>javascript</category>
      <category>webdev</category>
      <category>showdev</category>
    </item>
    <item>
      <title>React StudyBoard (React, Hooks, Redux...)</title>
      <dc:creator>jesusramirezs</dc:creator>
      <pubDate>Tue, 24 Nov 2020 16:53:40 +0000</pubDate>
      <link>https://dev.to/jesusramirezs/react-studyboard-react-hooks-redux-5h5p</link>
      <guid>https://dev.to/jesusramirezs/react-studyboard-react-hooks-redux-5h5p</guid>
      <description>&lt;p&gt;A &lt;strong&gt;React&lt;/strong&gt; webapp to publish and study extended content in Markdown format organized in articles and categories and allowing annotations.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Github repository&lt;/strong&gt;: &lt;a href="https://github.com/jesusramirezs/react-studyboard"&gt;https://github.com/jesusramirezs/react-studyboard&lt;/a&gt;&lt;br&gt;
Please submit bug fixes via pull requests &amp;amp; feedback via issues.&lt;/p&gt;

&lt;h2&gt;
  
  
  Purpose
&lt;/h2&gt;

&lt;p&gt;With this app, I intend to build an example that uses some of the latest trends in real React apps (Redux, hooks...) and that, besides serving an educational purpose, offers attractive functionality.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--I1iKkSGX--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/ins4hp38jgfls8oogjgs.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--I1iKkSGX--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/ins4hp38jgfls8oogjgs.png" alt="Alt Text"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;When I thought about developing &lt;strong&gt;React StudyBoard&lt;/strong&gt;, I imagined an app where you could publish extensive articles on a particular study topic and organize them into sections or categories, which would be useful for studying. I wanted this app to be helpful as an educational and informative tool, not only for simple reading, and for this, it had to allow:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Using &lt;strong&gt;Markdown&lt;/strong&gt; for more friendly text formatting.&lt;/li&gt;
&lt;li&gt;Keeping a record of what has been read so far.&lt;/li&gt;
&lt;li&gt;Continuing to read a text from the last point where it was left.&lt;/li&gt;
&lt;li&gt;Maintaining an index of the next readings to be addressed by the student.&lt;/li&gt;
&lt;li&gt;Adapting the characteristics of the text to the &lt;strong&gt;reader's preferences&lt;/strong&gt; (font type, size...)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Highlighting&lt;/strong&gt; important text for the reader.&lt;/li&gt;
&lt;li&gt;Adding and organizing &lt;strong&gt;annotations&lt;/strong&gt; (also in Markdown format) to any text within the article.
&lt;/li&gt;
&lt;li&gt;Annotations must also support uploaded images (for now to Imgur).&lt;/li&gt;
&lt;li&gt;Being able to add &lt;strong&gt;tags&lt;/strong&gt; to any annotation. &lt;/li&gt;
&lt;li&gt;Editing annotations.&lt;/li&gt;
&lt;li&gt;Displaying the annotations made just by moving the cursor over the text without interrupting the reading flow.&lt;/li&gt;
&lt;li&gt;Quickly accessing a list of all the annotations, in reverse chronological order of editing, from any of the articles, and navigating from them to the point in the article to which they refer.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This is the first version, and later in this article, I will tell you about the next tasks to be tackled in future versions.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--Zuil4scK--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/9z87frap1whhivgbe6k2.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--Zuil4scK--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/9z87frap1whhivgbe6k2.png" alt="Alt Text"&gt;&lt;/a&gt;&lt;br&gt;
&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--8aE4T2bM--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/tjvjoq29j3x22kxyefot.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--8aE4T2bM--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/tjvjoq29j3x22kxyefot.png" alt="Alt Text"&gt;&lt;/a&gt;&lt;br&gt;
&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--Kbi18iYV--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/38do4jyszh1huv2l5bff.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--Kbi18iYV--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/38do4jyszh1huv2l5bff.png" alt="Alt Text"&gt;&lt;/a&gt;&lt;br&gt;
&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--GmNE5foA--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/ye23fkphqcts1jjlfkha.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--GmNE5foA--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/ye23fkphqcts1jjlfkha.png" alt="Alt Text"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Getting started
&lt;/h2&gt;

&lt;p&gt;To get the frontend running locally:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Clone this repo &lt;code&gt;git clone https://github.com/jesusramirezs/react-studyboard.git&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;npm install&lt;/code&gt; or &lt;code&gt;yarn&lt;/code&gt; to install all required dependencies&lt;/li&gt;
&lt;li&gt;Optional: Edit the config-data.js file with your Firebase credentials and your Imgur API keys&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;npm start&lt;/code&gt;/ &lt;code&gt;yarn start&lt;/code&gt; to start the local server (this project uses create-react-app)&lt;/li&gt;
&lt;li&gt;App should now be running on &lt;code&gt;http://localhost:3000/&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Featuring
&lt;/h2&gt;

&lt;p&gt;The project makes use of the following:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;React Hooks&lt;/li&gt;
&lt;li&gt;React Redux&lt;/li&gt;
&lt;li&gt;React Suite components&lt;/li&gt;
&lt;li&gt;Styled components&lt;/li&gt;
&lt;li&gt;Firebase authentication&lt;/li&gt;
&lt;li&gt;Markdown-to-jsx&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Features
&lt;/h2&gt;

&lt;p&gt;The code is reasonably easy to follow and understand. It is divided into pages and components, each of them in a separate folder; I think they are as simple and decoupled as possible so that we do not add excessive levels to the code. The same has been done with different &lt;strong&gt;Redux&lt;/strong&gt; stores.  &lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--7PXHUceL--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/9xthss3nx00ki1cg1vku.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--7PXHUceL--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/9xthss3nx00ki1cg1vku.png" alt="Alt Text"&gt;&lt;/a&gt;&lt;br&gt;
&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--bW1RXjC0--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/7hcre23btg3qu93a9xmu.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--bW1RXjC0--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/7hcre23btg3qu93a9xmu.png" alt="Alt Text"&gt;&lt;/a&gt;&lt;br&gt;
&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--cnjjHuRV--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/zlcc90lo3u7u7n35hbyp.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--cnjjHuRV--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/zlcc90lo3u7u7n35hbyp.png" alt="Alt Text"&gt;&lt;/a&gt;   &lt;/p&gt;

&lt;p&gt;All content (sections and articles) is stored in two &lt;strong&gt;JSON files&lt;/strong&gt; that are easy to maintain and organize: one for categories and one for articles.&lt;/p&gt;

&lt;p&gt;The Markdown formatting is applied using the &lt;strong&gt;Markdown-to-jsx&lt;/strong&gt; component, version 6.11.4; I must mention that the latest version of this package has generated some errors that are still to be solved.&lt;/p&gt;

&lt;p&gt;This component supports a different rendering function for each format, and specific functions have been implemented (in text-block.component.jsx) for:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;paragraph
&lt;/li&gt;
&lt;li&gt;list elements&lt;/li&gt;
&lt;li&gt;titles (h1...h6)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The &lt;strong&gt;tag-input&lt;/strong&gt; component is used to enter tags in the annotation form and unique colors have been set aside for three specific tags so that they are visually easy to identify:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;re-read&lt;/li&gt;
&lt;li&gt;question&lt;/li&gt;
&lt;li&gt;highlight&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;All shared state between components in the app is managed through &lt;strong&gt;React-Redux&lt;/strong&gt;, and all access to that shared state is done through selectors.&lt;/p&gt;

&lt;p&gt;Redux stores the most varied information:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The visible or hidden state of the side panels&lt;/li&gt;
&lt;li&gt;The reading progress point of each article and the last article read.&lt;/li&gt;
&lt;li&gt;All content: articles and categories&lt;/li&gt;
&lt;li&gt;Content of the reading list&lt;/li&gt;
&lt;li&gt;All text portions highlighted&lt;/li&gt;
&lt;li&gt;Annotations&lt;/li&gt;
&lt;li&gt;User preferences (preferred font and size)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The system uses local storage to persist user data: almost everything stored in Redux, except the content itself.&lt;/p&gt;

&lt;p&gt;So far, this could be enough, but obviously, it has its limitations, and in the next version, the app will probably use Firebase as cloud storage.&lt;/p&gt;

&lt;p&gt;An authentication mechanism has been implemented with user/password credentials and &lt;strong&gt;Google Auth&lt;/strong&gt;, but only for educational purposes and to support cloud storage and the sharing of content and annotations between users in a future version.&lt;/p&gt;

&lt;p&gt;I am not a graphic designer, so I have tried to keep the style as simple as possible. To do this, I have used:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Tachyons CSS&lt;/strong&gt; as the main style base. &lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Styled Components&lt;/strong&gt; to apply the styles to some of the components.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;React Suite&lt;/strong&gt; for some particular components: drawer, progress bar.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;There are &lt;strong&gt;still many points of improvement and evolution in the project&lt;/strong&gt;:&lt;/p&gt;

&lt;h3&gt;
  
  
  From the functional point of view.
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Allow highlighting of any word selection, not just whole paragraphs, and allow annotations on them.&lt;/li&gt;
&lt;li&gt;Allow the sharing of notes between different students.&lt;/li&gt;
&lt;li&gt;Allow several tabs to keep reading several articles at once. Perhaps use a splitter in the reading panel to have two or more articles active.&lt;/li&gt;
&lt;li&gt;Improve the management of image uploads to the cloud.&lt;/li&gt;
&lt;li&gt;Add night mode for reading.&lt;/li&gt;
&lt;li&gt;Filter the side panel annotations according to tags. For example: display only "questions" or "re-readings".&lt;/li&gt;
&lt;li&gt;The possibility of publishing your articles (summaries, reflections) and dynamically integrating notes on other articles into the content.&lt;/li&gt;
&lt;li&gt;The possibility to export/import annotations in the JSON file.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;em&gt;EDIT: Dec 19, 2020&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;v1.1:&lt;/p&gt;

&lt;p&gt;Accomplished: Filter the side panel annotations according to tags. For example: display only "questions" or "re-readings".&lt;br&gt;
Accomplished: Allow highlighting of any word selection, not just whole paragraphs.&lt;br&gt;
Accomplished: Improved behaviour of the scroll restoration mechanism.&lt;/p&gt;

&lt;h3&gt;
  
  
  From the technical point of view.
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;PropTypes for type verification.&lt;/li&gt;
&lt;li&gt;Improve the naming of some components.&lt;/li&gt;
&lt;li&gt;Improve the communication mechanism between components, e.g., Article and Annotation Form.&lt;/li&gt;
&lt;li&gt;Use a database system for storage of items (instead of JSON files), statuses, and annotations. Perhaps based on Apollo and GraphQL.&lt;/li&gt;
&lt;li&gt;Integrate a complete testing system into the project.&lt;/li&gt;
&lt;li&gt;In-depth performance review.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Credits
&lt;/h2&gt;

&lt;p&gt;All texts have been generated using &lt;a href="https://www.blindtextgenerator.com/"&gt;https://www.blindtextgenerator.com/&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;All images come from the Open Access initiative of The Metropolitan Museum of Art:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://www.metmuseum.org/about-the-met/policies-and-documents/open-access"&gt;https://www.metmuseum.org/about-the-met/policies-and-documents/open-access&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Thanks for reading this article. Any feedback will be greatly appreciated.&lt;/p&gt;

&lt;p&gt;Connect with me on &lt;a href="https://twitter.com/jesusramirezs"&gt;Twitter&lt;/a&gt; or &lt;a href="https://www.linkedin.com/in/jesusramirezserran/?locale=en_US"&gt;LinkedIn&lt;/a&gt;&lt;/p&gt;

</description>
      <category>javascript</category>
      <category>react</category>
      <category>webdev</category>
      <category>showdev</category>
    </item>
    <item>
      <title>First steps in text processing with NLTK: text tokenization and analysis</title>
      <dc:creator>jesusramirezs</dc:creator>
      <pubDate>Tue, 17 Nov 2020 14:49:59 +0000</pubDate>
      <link>https://dev.to/jesusramirezs/first-steps-in-text-processing-with-nltk-text-tokenization-and-analysis-2591</link>
      <guid>https://dev.to/jesusramirezs/first-steps-in-text-processing-with-nltk-text-tokenization-and-analysis-2591</guid>
      <description>&lt;p&gt;I have already had the opportunity to talk about &lt;strong&gt;NLTK&lt;/strong&gt; in two of my previous articles (&lt;a href="https://dev.to/jesusramirezs/why-is-worth-considering-django-for-your-web-project-in-2020-dh7"&gt;link#1&lt;/a&gt;, &lt;a href="https://dev.to/jesusramirezs/when-a-technology-draws-the-perfect-constellation-for-your-educational-project-46ng"&gt;link#2&lt;/a&gt;).&lt;/p&gt;

&lt;p&gt;In this article, I would like to review some possibilities of &lt;strong&gt;NLTK&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;The kind of examples discussed in articles like this one fall under what is called natural language processing (NLP). We can apply these techniques to different categories of texts to obtain very varied results:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Automatic summaries&lt;/li&gt;
&lt;li&gt;Sentiment analysis&lt;/li&gt;
&lt;li&gt;Keyword extraction for search engines&lt;/li&gt;
&lt;li&gt;Content recommendation&lt;/li&gt;
&lt;li&gt;Opinion research (in marketplaces, aggregators, etc...) &lt;/li&gt;
&lt;li&gt;Offensive language filters&lt;/li&gt;
&lt;li&gt;...&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;NLTK is not the only package in this field. There are alternatives for these types of tasks, such as:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Apache OpenNLP.&lt;/li&gt;
&lt;li&gt;Stanford NLP suite.&lt;/li&gt;
&lt;li&gt;Gate NLP library.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;To start experimenting, we first install NLTK from the command line with pip, which is very simple:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;pip&lt;/span&gt; &lt;span class="n"&gt;install&lt;/span&gt; &lt;span class="n"&gt;nltk&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Once the NLTK library is installed, we can download additional data packages from the Python interpreter, such as the Punkt sentence tokenizer:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;nltk&lt;/span&gt;

&lt;span class="n"&gt;nltk&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;download&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;'punkt'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;One of the most important things to do before tackling any natural language processing task is "text tokenization". This phase is critical: without it, the text is much harder to process.&lt;/p&gt;

&lt;p&gt;Tokenization, also known as &lt;strong&gt;text segmentation&lt;/strong&gt; or &lt;strong&gt;lexical analysis&lt;/strong&gt;, consists of dividing a text or string into smaller parts such as sentences, words, or symbols. The result of the tokenization process is a list of tokens.&lt;/p&gt;

&lt;p&gt;NLTK includes both a &lt;strong&gt;sentence tokenizer&lt;/strong&gt; and a &lt;strong&gt;word tokenizer&lt;/strong&gt;. A text can be split into sentences; sentences can be tokenized into words, etc.&lt;/p&gt;
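
&lt;p&gt;To make the idea concrete before turning to NLTK, here is a minimal sketch of two-level tokenization that uses only regular expressions from the standard library. The names &lt;strong&gt;naive_sentences&lt;/strong&gt; and &lt;strong&gt;naive_words&lt;/strong&gt; are hypothetical and are not part of NLTK. This is a deliberately naive stand-in: it will mis-split abbreviations such as "e.g.", which is the kind of case a trained tokenizer like Punkt is designed to handle.&lt;/p&gt;

```python
import re


def naive_sentences(text):
    """Split text into sentences at '.', '!' or '?' (naive: no abbreviation handling)."""
    # Each match is a run of non-terminator characters plus any terminators that follow.
    pieces = re.findall(r"[^.!?]+[.!?]*", text)
    return [piece.strip() for piece in pieces if piece.strip()]


def naive_words(sentence):
    """Split a sentence into word tokens and single punctuation tokens."""
    return re.findall(r"\w+|[^\w\s]", sentence)


sample = "Zeno taught in Athens. His school met at the Stoa."
for sentence in naive_sentences(sample):
    print(naive_words(sentence))
```

&lt;p&gt;The output is one list of tokens per sentence, which is the same shape of result that the NLTK functions produce.&lt;/p&gt;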

&lt;p&gt;We have, for example, this text (from Wikipedia - Stoicism):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;para&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"Stoicism is a school of Hellenistic philosophy founded by Zeno of Citium in Athens in the early 3rd century BC. It is a philosophy of personal ethics informed by its system of logic and its views on the natural world. According to its teachings, as social beings, the path to eudaimonia (happiness, or blessedness) is found in accepting the moment as it presents itself, by not allowing oneself to be controlled by the desire for pleasure or by the fear of pain, by using one's mind to understand the world and to do one's part in nature's plan, and by working together and treating others fairly and justly."&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Tokenizing in Python is simple. From the NLTK library we import the &lt;strong&gt;sent_tokenize&lt;/strong&gt; function, which returns a list with one token for each sentence.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;nltk.tokenize&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;sent_tokenize&lt;/span&gt;

&lt;span class="n"&gt;tokenized_l1&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;sent_tokenize&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;para&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;tokenized_l1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;and we will get the following result:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s"&gt;'Stoicism is a school of Hellenistic philosophy founded by Zeno of Citium in Athens in the early 3rd century BC.'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'It is a philosophy of personal ethics informed by its system of logic and its views on the natural world.'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"According to its teachings, as social beings, the path to eudaimonia (happiness, or blessedness) is found in accepting the moment as it presents itself, by not allowing oneself to be controlled by the desire for pleasure or by the fear of pain, by using one's mind to understand the world and to do one's part in nature's plan, and by working together and treating others fairly and justly."&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Likewise, we can tokenize a sentence to obtain a list of words:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;nltk.tokenize&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;word_tokenize&lt;/span&gt;

&lt;span class="n"&gt;sentence1&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;tokenized_l1&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;word_tokenize&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sentence1&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;

&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s"&gt;'Stoicism'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'is'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'a'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'school'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'of'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'Hellenistic'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'philosophy'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'founded'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'by'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'Zeno'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'of'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'Citium'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'in'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'Athens'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'in'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'the'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'early'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'3rd'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'century'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'BC'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'.'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Well, let's do something a little closer to a real case; for example, extract some statistics from an article. We can take the content of a web page and then analyze its text to draw some conclusions.&lt;/p&gt;

&lt;p&gt;For this, we can use &lt;strong&gt;urllib.request&lt;/strong&gt; to get the HTML content of our target page:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;urllib.request&lt;/span&gt;

&lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;urllib&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;request&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;urlopen&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;'https://en.wikipedia.org/wiki/Stoicism'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;html&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;read&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;html&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
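
&lt;p&gt;Note that &lt;strong&gt;read()&lt;/strong&gt; returns bytes rather than a string, which is why the printed value starts with b'. A minimal sketch of decoding it into text follows; the helper name &lt;strong&gt;decode_html&lt;/strong&gt; is hypothetical, and UTF-8 as the fallback encoding is an assumption.&lt;/p&gt;

```python
def decode_html(raw, charset=None, fallback="utf-8"):
    """Decode raw response bytes into text, replacing bytes that cannot be decoded."""
    return raw.decode(charset or fallback, errors="replace")


# With the response object from the previous block, this could be called as:
#   html_text = decode_html(html, response.headers.get_content_charset())
# Here a literal byte string stands in for the downloaded page.
sample = "Stoicism was founded by Zeno (Ζήνων).".encode("utf-8")
print(decode_html(sample))
```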



&lt;p&gt;and use &lt;strong&gt;BeautifulSoup&lt;/strong&gt;, a very useful Python library for extracting data from HTML and XML documents with different levels of filtering and noise removal. We can extract only the text of the page, without HTML markup, by using &lt;strong&gt;get_text()&lt;/strong&gt; or a custom solution like the one in the example code below.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;pip&lt;/span&gt; &lt;span class="n"&gt;install&lt;/span&gt; &lt;span class="n"&gt;beautifulsoup4&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;bs4&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;BeautifulSoup&lt;/span&gt;

&lt;span class="n"&gt;soup&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;BeautifulSoup&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;html&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="s"&gt;"html.parser"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;title&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;soup&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;select&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"#firstHeading"&lt;/span&gt;&lt;span class="p"&gt;)[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;

&lt;span class="n"&gt;paragraphs&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;soup&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;select&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"p"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;intro&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;'&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;'&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;join&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt; &lt;span class="n"&gt;para&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;para&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;paragraphs&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;]])&lt;/span&gt;

&lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;intro&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Finally, we can go on to convert the text obtained into tokens by dividing the text as described above:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;tokens&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;word_tokenize&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;intro&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;print&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;tokens&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;From here, we can apply different tools to “standardize” our token set. For example, to convert all tokens to lowercase:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;new_tokens&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;token&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;tokens&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;new_token&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;token&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;lower&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;new_tokens&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;new_token&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;tokens&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;new_tokens&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;...remove punctuation:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;re&lt;/span&gt;
&lt;span class="n"&gt;new_tokens&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;token&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;tokens&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;new_token&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;re&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;sub&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;r'[^\w\s]'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;''&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;token&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;new_token&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="s"&gt;''&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;new_tokens&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;new_token&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;tokens&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;new_tokens&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
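
&lt;p&gt;The two steps above follow the same pattern (build a new list, then reassign it), so they can be folded into one helper. This is a minimal sketch: &lt;strong&gt;normalize_tokens&lt;/strong&gt; is a hypothetical name, and the regular expression is the same one used in the punctuation step.&lt;/p&gt;

```python
import re


def normalize_tokens(tokens):
    """Lowercase each token, strip punctuation characters, and drop empty results."""
    cleaned = (re.sub(r"[^\w\s]", "", token.lower()) for token in tokens)
    return [token for token in cleaned if token]


print(normalize_tokens(["Stoicism", "is", "a", "School", ",", "one's", "BC", "."]))
```

&lt;p&gt;Tokens made only of punctuation disappear, while a token such as "one's" keeps its letters and loses the apostrophe.&lt;/p&gt;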



&lt;p&gt;...replace numbers with their textual representation using Inflect, a library that generates plurals, singular nouns, ordinals, and indefinite articles, and converts numbers to words:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;pip&lt;/span&gt; &lt;span class="n"&gt;install&lt;/span&gt; &lt;span class="n"&gt;inflect&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;inflect&lt;/span&gt;
&lt;span class="n"&gt;p&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;inflect&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;engine&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;new_tokens&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;token&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;tokens&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;token&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;isdigit&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
        &lt;span class="n"&gt;new_token&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;number_to_words&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;token&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;new_tokens&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;new_token&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;else&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;new_tokens&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;token&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;tokens&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;new_tokens&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;... and remove stopwords, which are words that add little meaning to the text.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;nltk&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;download&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;'stopwords'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;nltk.corpus&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;stopwords&lt;/span&gt;

&lt;span class="n"&gt;stopwords&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;words&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;'english'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s"&gt;'i'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'me'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'my'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'myself'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'we'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'our'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'ours'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'ourselves'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'you'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"you're"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"you've"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"you'll"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"you'd"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'your'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'yours'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'yourself'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'yourselves'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'he'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'him'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'his'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'himself'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'she'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"she's"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'her'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'hers'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; 
&lt;span class="s"&gt;'herself'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'it'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"it's"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'its'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'itself'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'they'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'them'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'their'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'theirs'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'themselves'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'what'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'which'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'who'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'whom'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'this'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'that'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"that'll"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'these'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'those'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'am'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'is'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'are'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'was'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'were'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'be'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span 
class="s"&gt;'been'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'being'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'have'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'has'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'had'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'having'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'do'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'does'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'did'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'doing'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'a'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'an'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'the'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'and'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'but'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'if'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'or'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'because'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'as'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'until'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'while'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'of'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'at'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'by'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'for'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'with'&lt;/span&gt;&lt;span 
class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'about'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'against'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'between'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'into'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'through'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'during'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'before'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'after'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'above'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'below'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'to'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'from'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'up'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'down'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'in'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'out'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'on'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'off'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'over'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'under'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'again'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'further'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'then'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'once'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'here'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span 
class="s"&gt;'there'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'when'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'where'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'why'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'how'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'all'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'any'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'both'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'each'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'few'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'more'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'most'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'other'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'some'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'such'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'no'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'nor'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'not'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'only'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'own'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'same'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'so'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'than'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'too'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'very'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'s'&lt;/span&gt;&lt;span 
class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'t'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'can'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'will'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'just'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'don'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"don't"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'should'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"should've"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'now'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'d'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'ll'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'m'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'o'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'re'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'ve'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'y'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'ain'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'aren'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"aren't"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'couldn'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"couldn't"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'didn'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"didn't"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'doesn'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"doesn't"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span 
class="s"&gt;'hadn'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"hadn't"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'hasn'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"hasn't"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'haven'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"haven't"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'isn'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"isn't"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'ma'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'mightn'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"mightn't"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'mustn'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"mustn't"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'needn'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"needn't"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'shan'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"shan't"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'shouldn'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"shouldn't"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'wasn'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"wasn't"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'weren'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"weren't"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;'won'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"won't"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span 
class="s"&gt;'wouldn'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"wouldn't"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

&lt;span class="c1"&gt;# Build the stop-word set once instead of reloading the list for every token&lt;/span&gt;
&lt;span class="n"&gt;stop_words&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;set&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;stopwords&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;words&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;'english'&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="n"&gt;tokens&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;token&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;token&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;tokens&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;token&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;stop_words&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



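&lt;p&gt;Finally, lemmatization reduces each word to its dictionary form (lemma), so that inflections such as verbal conjugations no longer count as separate tokens. Because the code passes &lt;strong&gt;pos='v'&lt;/strong&gt;, every token is lemmatized as a verb, so most noun plurals are left unchanged.&lt;br&gt;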
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;nltk.stem&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;WordNetLemmatizer&lt;/span&gt; 
&lt;span class="n"&gt;lemmatizer&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;WordNetLemmatizer&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;lemmas&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;token&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;tokens&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;lemma&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;lemmatizer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;lemmatize&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;token&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;pos&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s"&gt;'v'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;lemmas&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;lemma&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;tokens&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;lemmas&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;I used tokenization and lemmatization extensively in my &lt;a href="https://dev.to/jesusramirezs/when-a-technology-draws-the-perfect-constellation-for-your-educational-project-46ng"&gt;last project&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Once this standardization process is done, we can move on to some simple analysis. For example, we can calculate the frequency distribution of the tokens with NLTK's &lt;strong&gt;FreqDist()&lt;/strong&gt; class:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;freq&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;nltk&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;FreqDist&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;tokens&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="n"&gt;val&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;freq&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;items&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;

    &lt;span class="k"&gt;print&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="s"&gt;':'&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;val&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;material&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;
&lt;span class="n"&gt;found&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;
&lt;span class="n"&gt;ethical&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;
&lt;span class="n"&gt;phrase&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;
&lt;span class="n"&gt;present&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;
&lt;span class="n"&gt;others&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;
&lt;span class="n"&gt;justly&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;
&lt;span class="n"&gt;teach&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;
&lt;span class="n"&gt;athens&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;
&lt;span class="n"&gt;natural&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;
&lt;span class="n"&gt;especially&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;
&lt;span class="n"&gt;corruptions&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;
&lt;span class="n"&gt;health&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;
&lt;span class="n"&gt;live&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;
&lt;span class="n"&gt;believe&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;
&lt;span class="n"&gt;sufficient&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;
&lt;span class="n"&gt;emotionally&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;
&lt;span class="n"&gt;misfortune&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;
&lt;span class="n"&gt;citium&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;
&lt;span class="n"&gt;moral&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;
&lt;span class="n"&gt;behave&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;
&lt;span class="n"&gt;sage&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;
&lt;span class="n"&gt;mind&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;
&lt;span class="n"&gt;everything&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;
&lt;span class="n"&gt;zeno&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;
&lt;span class="n"&gt;radical&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;
&lt;span class="n"&gt;epictetus&lt;/span&gt;&lt;span class="err"&gt;—&lt;/span&gt;&lt;span class="n"&gt;emphasized&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;
&lt;span class="n"&gt;pain&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;
&lt;span class="n"&gt;pleasure&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;
&lt;span class="n"&gt;moment&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;
&lt;span class="n"&gt;blessedness&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;
&lt;span class="n"&gt;allow&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;
&lt;span class="n"&gt;hold&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;
&lt;span class="n"&gt;philosophy&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;
&lt;span class="n"&gt;also&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;
&lt;span class="n"&gt;two&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;
&lt;span class="n"&gt;know&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;
&lt;span class="n"&gt;rule&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;
&lt;span class="n"&gt;adiaphora&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;
&lt;span class="n"&gt;people&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;
&lt;span class="n"&gt;fear&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;
&lt;span class="n"&gt;bc&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;
&lt;span class="n"&gt;use&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;
&lt;span class="n"&gt;vicious&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;
&lt;span class="n"&gt;together&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;
&lt;span class="n"&gt;accord&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;
&lt;span class="n"&gt;maintain&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;
&lt;span class="n"&gt;fairly&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;
&lt;span class="n"&gt;say&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;
&lt;span class="n"&gt;prohairesis&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;
&lt;span class="n"&gt;act&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;
&lt;span class="n"&gt;ethics&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;
&lt;span class="n"&gt;form&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;
&lt;span class="n"&gt;work&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;
&lt;span class="n"&gt;nature&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;
&lt;span class="n"&gt;would&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;
&lt;span class="n"&gt;human&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;
&lt;span class="n"&gt;social&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;
&lt;span class="n"&gt;wealth&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;
&lt;span class="n"&gt;stoicism&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;
&lt;span class="n"&gt;understand&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;
&lt;span class="n"&gt;three&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;
&lt;span class="n"&gt;one&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;
&lt;span class="n"&gt;mean&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;
&lt;span class="n"&gt;think&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;
&lt;span class="n"&gt;system&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;
&lt;span class="n"&gt;best&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;
&lt;span class="n"&gt;judgment&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;
&lt;span class="n"&gt;oneself&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;
&lt;span class="n"&gt;pleasure&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;
&lt;span class="n"&gt;accept&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;
&lt;span class="n"&gt;hellenistic&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;
&lt;span class="n"&gt;school&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;
&lt;span class="n"&gt;virtue&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;
&lt;span class="n"&gt;personal&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;
&lt;span class="n"&gt;person&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;
&lt;span class="n"&gt;errors&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;
&lt;span class="n"&gt;century&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;
&lt;span class="n"&gt;call&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;
&lt;span class="n"&gt;equally&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;
&lt;span class="n"&gt;eudaimonia&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;
&lt;span class="n"&gt;order&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;
&lt;span class="n"&gt;free&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;
&lt;span class="n"&gt;find&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;
&lt;span class="n"&gt;similar&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;
&lt;span class="n"&gt;destructive&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;
&lt;span class="n"&gt;calm&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;
&lt;span class="n"&gt;plan&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;
&lt;span class="n"&gt;value&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;
&lt;span class="n"&gt;alongside&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;
&lt;span class="n"&gt;indication&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;
&lt;span class="n"&gt;accordance&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;
&lt;span class="n"&gt;view&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;
&lt;span class="n"&gt;certain&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;
&lt;span class="n"&gt;good&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;
&lt;span class="n"&gt;seneca&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;
&lt;span class="n"&gt;bad&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;
&lt;span class="n"&gt;many&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;
&lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;
&lt;span class="n"&gt;consider&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;
&lt;span class="n"&gt;root&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;
&lt;span class="n"&gt;belief&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;
&lt;span class="n"&gt;resilient&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;
&lt;span class="n"&gt;path&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;
&lt;span class="n"&gt;truly&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;
&lt;span class="n"&gt;control&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;
&lt;span class="n"&gt;include&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;
&lt;span class="n"&gt;logic&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;
&lt;span class="n"&gt;early&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;
&lt;span class="n"&gt;desire&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;
&lt;span class="n"&gt;world&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;
&lt;span class="n"&gt;life&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;
&lt;span class="n"&gt;inform&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;
&lt;span class="n"&gt;major&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;
&lt;span class="n"&gt;happiness&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;
&lt;span class="n"&gt;part&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;
&lt;span class="n"&gt;tradition&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;
&lt;span class="n"&gt;though&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;
&lt;span class="n"&gt;emotions&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;
&lt;span class="n"&gt;treat&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;
&lt;span class="n"&gt;since&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;
&lt;span class="n"&gt;stoics&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;
&lt;span class="n"&gt;individual&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;
&lt;span class="n"&gt;aim&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;
&lt;span class="n"&gt;aristotelian&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;
&lt;span class="n"&gt;be&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;
&lt;span class="n"&gt;upon&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;
&lt;span class="n"&gt;external&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;
&lt;span class="n"&gt;stoic&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;
&lt;span class="n"&gt;approach&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
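&lt;p&gt;The listing above is in no particular order. As a minimal sketch of ranking tokens by frequency, the example below uses Python's &lt;strong&gt;collections.Counter&lt;/strong&gt;, the standard-library class that NLTK's FreqDist inherits from, so the same &lt;strong&gt;most_common()&lt;/strong&gt; call should also work on &lt;strong&gt;freq&lt;/strong&gt;. The &lt;strong&gt;sample_tokens&lt;/strong&gt; list and the &lt;strong&gt;freq_sketch&lt;/strong&gt; name are made up for illustration.&lt;/p&gt;

```python
from collections import Counter

# Hypothetical token list standing in for the `tokens` built earlier in the article.
sample_tokens = ["virtue", "nature", "virtue", "stoic", "nature", "virtue"]

# Counter offers the same counting interface that nltk.FreqDist inherits.
freq_sketch = Counter(sample_tokens)

# most_common(n) returns the n most frequent (token, count) pairs, highest first.
for token, count in freq_sketch.most_common(2):
    print(f"{token}: {count}")
```

&lt;p&gt;Sorting this way makes the dominant terms of a text visible at a glance, which is the same information the plot further down conveys.&lt;/p&gt;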



&lt;p&gt;Finally, we can plot the result. NLTK's plotting relies on matplotlib, so it has to be installed first.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;pip&lt;/span&gt; &lt;span class="n"&gt;install&lt;/span&gt; &lt;span class="n"&gt;matplotlib&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;freq&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;plot&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;20&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;cumulative&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;False&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--jPubRKd6--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/j7le9b13y75f910w54sj.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--jPubRKd6--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/j7le9b13y75f910w54sj.png" alt="Frequency plot of the 20 most common tokens"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This first analysis can help us classify a text and decide how we could index or frame the article within a content aggregator.&lt;/p&gt;

&lt;p&gt;We can apply formal techniques to this classification, such as a &lt;strong&gt;Naive Bayes classifier&lt;/strong&gt;. In its simplest form, it uses the conditional probabilities of the words in a text to determine which category the text belongs to. The algorithm is called "naive" because it treats each word's conditional probability as if it were independent of the others. Once we have the conditional probability of each word in the text, we multiply them together, along with the prior probability of the category, to estimate how likely the text is to belong to that category.&lt;/p&gt;

&lt;p&gt;I would like to cover a simple example of applying a &lt;strong&gt;Naive Bayes classifier&lt;/strong&gt; in another article.&lt;/p&gt;
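&lt;p&gt;In the meantime, the calculation described above can be sketched in a few lines of plain Python. This is only a toy illustration: the training data, the function names &lt;strong&gt;train&lt;/strong&gt; and &lt;strong&gt;classify&lt;/strong&gt;, and the use of add-one (Laplace) smoothing and log-probabilities are assumptions made here, not part of NLTK or of this article's code.&lt;/p&gt;

```python
import math
from collections import Counter

# Hypothetical toy training data: (tokens, category) pairs invented for illustration.
training = [
    (["virtue", "stoic", "ethics"], "philosophy"),
    (["nature", "virtue", "reason"], "philosophy"),
    (["python", "library", "code"], "programming"),
    (["code", "function", "python"], "programming"),
]

def train(samples):
    """Count words per category and documents per category."""
    word_counts = {}
    doc_counts = Counter()
    vocabulary = set()
    for tokens, category in samples:
        doc_counts[category] += 1
        word_counts.setdefault(category, Counter()).update(tokens)
        vocabulary.update(tokens)
    return word_counts, doc_counts, vocabulary

def classify(tokens, word_counts, doc_counts, vocabulary):
    """Return the category with the highest joint (log) probability."""
    total_docs = sum(doc_counts.values())
    scores = {}
    for category, counts in word_counts.items():
        # Prior probability of the category
        score = math.log(doc_counts[category] / total_docs)
        total_words = sum(counts.values())
        for token in tokens:
            # Add-one smoothing avoids a zero probability for unseen words
            p = (counts[token] + 1) / (total_words + len(vocabulary))
            # Summing logs is equivalent to multiplying the probabilities
            score += math.log(p)
        scores[category] = score
    return max(scores, key=scores.get)

word_counts, doc_counts, vocabulary = train(training)
print(classify(["virtue", "nature"], word_counts, doc_counts, vocabulary))
```

&lt;p&gt;Working with sums of logarithms rather than a direct product is a common design choice, because multiplying many small probabilities quickly underflows to zero on longer texts.&lt;/p&gt;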

&lt;p&gt;To finish this article, we can tackle a task that looks difficult but turns out to be easy in Python, at least at a basic level: sentiment analysis. With NLTK we can analyze the sentiment of each sentence in a text. Sentiment analysis is a technique based on natural language processing that aims to obtain subjective information from a series of texts or documents.&lt;/p&gt;

&lt;p&gt;To do this, we must download the following package:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;nltk&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;download&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;'vader_lexicon'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This package provides the lexicon for VADER (Valence Aware Dictionary and sEntiment Reasoner), a model for text sentiment analysis that is sensitive to both the polarity (positive or negative) and the intensity of emotion.&lt;/p&gt;

&lt;p&gt;Next, we split the text to be analyzed into sentences with a tokenization step, so that each sentence can be scored separately.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;nltk.tokenize&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;sent_tokenize&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;nltk.sentiment.vader&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;SentimentIntensityAnalyzer&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;text&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"The habit of reading is one of the greatest resources of mankind; and we enjoy reading books that belong to us much more than if they are borrowed. A borrowed book is like a guest in the house; it must be treated with punctiliousness, with a certain considerate formality. You must see that it sustains no damage; it must not suffer while under your roof. You cannot leave it carelessly, you cannot mark it, you cannot turn down the pages, you cannot use it familiarly. And then, some day, although this is seldom done, you really ought to return it..."&lt;/span&gt;

&lt;span class="c1"&gt;# Split the paragraph into sentences once the text has been defined&lt;/span&gt;
&lt;span class="n"&gt;tokenized_text&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;sent_tokenize&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;(William Lyon Phelps, "The Pleasure of Books", from &lt;a href="http://www.historyplace.com/speeches/phelps.htm"&gt;http://www.historyplace.com/speeches/phelps.htm&lt;/a&gt;)&lt;/p&gt;

&lt;p&gt;Finally, we instantiate the sentiment analyzer and apply it to each sentence.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;analyzer&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;SentimentIntensityAnalyzer&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;sentence&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;tokenized_text&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sentence&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;scores&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;analyzer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;polarity_scores&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sentence&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;key&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;scores&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;': '&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;scores&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
        &lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;As a result, we can examine each sentence separately.&lt;br&gt;
These are some example results:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;The&lt;/span&gt; &lt;span class="n"&gt;habit&lt;/span&gt; &lt;span class="n"&gt;of&lt;/span&gt; &lt;span class="n"&gt;reading&lt;/span&gt; &lt;span class="ow"&gt;is&lt;/span&gt; &lt;span class="n"&gt;one&lt;/span&gt; &lt;span class="n"&gt;of&lt;/span&gt; &lt;span class="n"&gt;the&lt;/span&gt; &lt;span class="n"&gt;greatest&lt;/span&gt; &lt;span class="n"&gt;resources&lt;/span&gt; &lt;span class="n"&gt;of&lt;/span&gt; &lt;span class="n"&gt;mankind&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="ow"&gt;and&lt;/span&gt; &lt;span class="n"&gt;we&lt;/span&gt; &lt;span class="n"&gt;enjoy&lt;/span&gt; &lt;span class="n"&gt;reading&lt;/span&gt; &lt;span class="n"&gt;books&lt;/span&gt; &lt;span class="n"&gt;that&lt;/span&gt; &lt;span class="n"&gt;belong&lt;/span&gt; &lt;span class="n"&gt;to&lt;/span&gt; &lt;span class="n"&gt;us&lt;/span&gt; &lt;span class="n"&gt;much&lt;/span&gt; &lt;span class="n"&gt;more&lt;/span&gt; &lt;span class="n"&gt;than&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;they&lt;/span&gt; &lt;span class="n"&gt;are&lt;/span&gt; &lt;span class="n"&gt;borrowed&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;
&lt;span class="n"&gt;neg&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt;  &lt;span class="mf"&gt;0.0&lt;/span&gt;

&lt;span class="n"&gt;pos&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt;  &lt;span class="mf"&gt;0.222&lt;/span&gt;

&lt;span class="n"&gt;neu&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt;  &lt;span class="mf"&gt;0.778&lt;/span&gt;

&lt;span class="n"&gt;compound&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt;  &lt;span class="mf"&gt;0.8126&lt;/span&gt;

&lt;span class="n"&gt;A&lt;/span&gt; &lt;span class="n"&gt;borrowed&lt;/span&gt; &lt;span class="n"&gt;book&lt;/span&gt; &lt;span class="ow"&gt;is&lt;/span&gt; &lt;span class="n"&gt;like&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt; &lt;span class="n"&gt;guest&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;the&lt;/span&gt; &lt;span class="n"&gt;house&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="n"&gt;it&lt;/span&gt; &lt;span class="n"&gt;must&lt;/span&gt; &lt;span class="n"&gt;be&lt;/span&gt; &lt;span class="n"&gt;treated&lt;/span&gt; &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;punctiliousness&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt; &lt;span class="n"&gt;certain&lt;/span&gt; &lt;span class="n"&gt;considerate&lt;/span&gt; &lt;span class="n"&gt;formality&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;
&lt;span class="n"&gt;neg&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt;  &lt;span class="mf"&gt;0.0&lt;/span&gt;

&lt;span class="n"&gt;pos&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt;  &lt;span class="mf"&gt;0.333&lt;/span&gt;

&lt;span class="n"&gt;neu&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt;  &lt;span class="mf"&gt;0.667&lt;/span&gt;

&lt;span class="n"&gt;compound&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt;  &lt;span class="mf"&gt;0.7579&lt;/span&gt;

&lt;span class="n"&gt;You&lt;/span&gt; &lt;span class="n"&gt;must&lt;/span&gt; &lt;span class="n"&gt;see&lt;/span&gt; &lt;span class="n"&gt;that&lt;/span&gt; &lt;span class="n"&gt;it&lt;/span&gt; &lt;span class="n"&gt;sustains&lt;/span&gt; &lt;span class="n"&gt;no&lt;/span&gt; &lt;span class="n"&gt;damage&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="n"&gt;it&lt;/span&gt; &lt;span class="n"&gt;must&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="n"&gt;suffer&lt;/span&gt; &lt;span class="k"&gt;while&lt;/span&gt; &lt;span class="n"&gt;under&lt;/span&gt; &lt;span class="n"&gt;your&lt;/span&gt; &lt;span class="n"&gt;roof&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;
&lt;span class="n"&gt;neg&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt;  &lt;span class="mf"&gt;0.254&lt;/span&gt;

&lt;span class="n"&gt;pos&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt;  &lt;span class="mf"&gt;0.134&lt;/span&gt;

&lt;span class="n"&gt;neu&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt;  &lt;span class="mf"&gt;0.612&lt;/span&gt;

&lt;span class="n"&gt;compound&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt;  &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mf"&gt;0.3716&lt;/span&gt;

&lt;span class="n"&gt;You&lt;/span&gt; &lt;span class="n"&gt;cannot&lt;/span&gt; &lt;span class="n"&gt;leave&lt;/span&gt; &lt;span class="n"&gt;it&lt;/span&gt; &lt;span class="n"&gt;carelessly&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;you&lt;/span&gt; &lt;span class="n"&gt;cannot&lt;/span&gt; &lt;span class="n"&gt;mark&lt;/span&gt; &lt;span class="n"&gt;it&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;you&lt;/span&gt; &lt;span class="n"&gt;cannot&lt;/span&gt; &lt;span class="n"&gt;turn&lt;/span&gt; &lt;span class="n"&gt;down&lt;/span&gt; &lt;span class="n"&gt;the&lt;/span&gt; &lt;span class="n"&gt;pages&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;you&lt;/span&gt; &lt;span class="n"&gt;cannot&lt;/span&gt; &lt;span class="n"&gt;use&lt;/span&gt; &lt;span class="n"&gt;it&lt;/span&gt; &lt;span class="n"&gt;familiarly&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;
&lt;span class="n"&gt;neg&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt;  &lt;span class="mf"&gt;0.0&lt;/span&gt;

&lt;span class="n"&gt;pos&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt;  &lt;span class="mf"&gt;0.138&lt;/span&gt;

&lt;span class="n"&gt;neu&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt;  &lt;span class="mf"&gt;0.862&lt;/span&gt;

&lt;span class="n"&gt;compound&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt;  &lt;span class="mf"&gt;0.2235&lt;/span&gt;

&lt;span class="p"&gt;...&lt;/span&gt;

&lt;span class="n"&gt;And&lt;/span&gt; &lt;span class="n"&gt;there&lt;/span&gt; &lt;span class="ow"&gt;is&lt;/span&gt; &lt;span class="n"&gt;no&lt;/span&gt; &lt;span class="n"&gt;doubt&lt;/span&gt; &lt;span class="n"&gt;that&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;these&lt;/span&gt; &lt;span class="n"&gt;books&lt;/span&gt; &lt;span class="n"&gt;you&lt;/span&gt; &lt;span class="n"&gt;see&lt;/span&gt; &lt;span class="n"&gt;these&lt;/span&gt; &lt;span class="n"&gt;men&lt;/span&gt; &lt;span class="n"&gt;at&lt;/span&gt; &lt;span class="n"&gt;their&lt;/span&gt; &lt;span class="n"&gt;best&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;
&lt;span class="n"&gt;neg&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt;  &lt;span class="mf"&gt;0.215&lt;/span&gt;

&lt;span class="n"&gt;pos&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt;  &lt;span class="mf"&gt;0.192&lt;/span&gt;

&lt;span class="n"&gt;neu&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt;  &lt;span class="mf"&gt;0.594&lt;/span&gt;

&lt;span class="n"&gt;compound&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt;  &lt;span class="mf"&gt;0.128&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For each sentence, several different scores are obtained, as shown in the output above:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;neg (negative): a score between zero and one that tells us how negative the sentence is.&lt;/li&gt;
&lt;li&gt;neu (neutral): a score between zero and one that indicates how neutral the sentence is.&lt;/li&gt;
&lt;li&gt;pos (positive): same as the previous ones, but indicating how positive the sentence is.&lt;/li&gt;
&lt;li&gt;compound: a value between -1 and 1 that summarizes in a single number whether the sentence is positive or negative. Values close to -1 suggest that it is very negative, values close to zero that it is neutral, and values close to 1 that it is very positive.&lt;/li&gt;
&lt;/ul&gt;
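
&lt;p&gt;To make the compound score easier to act on, it can be mapped to a coarse label. The following is only a minimal sketch: the helper name &lt;code&gt;label_from_compound&lt;/code&gt; is hypothetical, and the 0.05 cut-off is an assumption (a commonly used convention), not something taken from the analysis above.&lt;/p&gt;

```python
# Assumption: scores within 0.05 of zero are treated as neutral.
# This cut-off is a common convention, not part of the original article.
NEUTRAL_BAND = 0.05


def label_from_compound(compound, threshold=NEUTRAL_BAND):
    """Map a compound score in [-1, 1] to a coarse sentiment label."""
    if compound >= threshold:
        return "positive"
    if -compound >= threshold:
        return "negative"
    return "neutral"


# Compound scores copied from the example output above.
for score in (0.8126, -0.3716, 0.128):
    print(score, "->", label_from_compound(score))
```

&lt;p&gt;In practice, the value passed in would be &lt;code&gt;scores['compound']&lt;/code&gt; from the loop shown earlier; the threshold can be widened or narrowed depending on how strict the classification should be.&lt;/p&gt;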

&lt;p&gt;If you are interested in this field, you can probably find texts better suited to sentiment analysis, such as political opinions or product reviews; this is just an example.&lt;/p&gt;

&lt;p&gt;Thanks for reading this article. If you have any questions, feel free to comment below.&lt;/p&gt;

&lt;p&gt;Connect with me on &lt;a href="https://twitter.com/jesusramirezs"&gt;Twitter&lt;/a&gt; or &lt;a href="https://www.linkedin.com/in/jesusramirezserran/?locale=en_US"&gt;LinkedIn&lt;/a&gt;&lt;/p&gt;

</description>
      <category>python</category>
    </item>
    <item>
      <title>When a technology draws the perfect constellation for your educational project (Django, NLTK...)</title>
      <dc:creator>jesusramirezs</dc:creator>
      <pubDate>Mon, 26 Oct 2020 07:34:48 +0000</pubDate>
      <link>https://dev.to/jesusramirezs/when-a-technology-draws-the-perfect-constellation-for-your-educational-project-46ng</link>
      <guid>https://dev.to/jesusramirezs/when-a-technology-draws-the-perfect-constellation-for-your-educational-project-46ng</guid>
      <description>&lt;p&gt;Each development project is unique, and there are times when the characteristics of a particular technology show up as a perfect constellation. I would love to present my case.&lt;/p&gt;

&lt;p&gt;Two years ago, I started developing an educational project called &lt;strong&gt;&lt;a href="https://eningles.club/" rel="noopener noreferrer"&gt;eningles.club&lt;/a&gt;&lt;/strong&gt;, of which I am immensely proud and which is already up and running, helping students from different countries.&lt;/p&gt;

&lt;p&gt;The objective was initially quite simple: a web tool that would help early &lt;strong&gt;English learners&lt;/strong&gt; to improve their &lt;strong&gt;English reading and listening skills&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;To do this, I set out to develop a tool that would offer students different activities that would reinforce their vocabulary and grammar before reading well-known stories in English from prestigious publishers that are easy to find in a bookstore.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2F6cm2kzv94i0bzx30nnv9.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2F6cm2kzv94i0bzx30nnv9.png" alt="Example activities"&gt;&lt;/a&gt;&lt;br&gt;
&lt;small&gt;Image #1: Example activities&lt;/small&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2Fmcabg155i2tfplze4c7r.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2Fmcabg155i2tfplze4c7r.png" alt="Grammar activity"&gt;&lt;/a&gt;&lt;br&gt;
&lt;small&gt;Image #2: Grammar activity&lt;/small&gt;&lt;/p&gt;

&lt;p&gt;The tool is currently available for native Spanish speakers, but I would like to extend it to students from anywhere in the world in the future.&lt;/p&gt;

&lt;p&gt;Determined to undertake the adventure, I looked for and studied different alternatives in terms of technology. &lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2Fykslljadtdm3i8d13tni.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2Fykslljadtdm3i8d13tni.png" alt="Sky"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;I considered four main options:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Django / Python + Javascript for the frontend&lt;/li&gt;
&lt;li&gt;React / Node&lt;/li&gt;
&lt;li&gt;React / Headless CMS (Strapi, Directus ...)&lt;/li&gt;
&lt;li&gt;Monolithic CMS (WordPress, Drupal ...)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The second and third options attracted me enormously, but there were good reasons which, like stars of a perfect constellation, led me to choose &lt;strong&gt;&lt;a href="https://www.djangoproject.com/" rel="noopener noreferrer"&gt;Django&lt;/a&gt;&lt;/strong&gt;, a framework that I have already covered in a &lt;a href="https://dev.to/jesusramirezs/why-is-worth-considering-django-for-your-web-project-in-2020-dh7"&gt;previous article&lt;/a&gt;:&lt;/p&gt;

&lt;h3&gt;
  
  
  Star #1: Django Admin
&lt;/h3&gt;

&lt;p&gt;A large part of the development effort required for an educational application like this would go into building a backend to manage the creation and maintenance of very varied content: lessons, readings, grammar exercises, flashcards, multiple-choice exercises, and vocabulary pills, as well as chaining them into "routes" and learning sessions.&lt;/p&gt;

&lt;p&gt;Not developing this backend would mean having to edit all this content directly in the database or keeping it in some plain text format following a particular grammar (using the well-known &lt;a href="http://dinosaur.compilertools.net/" rel="noopener noreferrer"&gt;LEX and YACC&lt;/a&gt; tools, for example). Also, the idea is that other collaborators will be able to contribute content in the future.&lt;/p&gt;

&lt;p&gt;Using a Headless CMS seemed viable, but there was a much more powerful option: &lt;strong&gt;&lt;a href="https://docs.djangoproject.com/en/3.1/ref/contrib/admin/#:~:text=One%20of%20the%20most%20powerful,an%20organization's%20internal%20management%20tool." rel="noopener noreferrer"&gt;Django Admin&lt;/a&gt;&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2Fjihlohldi68zdtls728d.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2Fjihlohldi68zdtls728d.png" alt="Django Admin main view"&gt;&lt;/a&gt;&lt;br&gt;
&lt;small&gt;Image #3: Django Admin main view&lt;/small&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2Fsdz8s29ft2ah88sc04c8.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2Fsdz8s29ft2ah88sc04c8.png" alt="Django Admin"&gt;&lt;/a&gt;&lt;br&gt;
&lt;small&gt;Image #4: Django Admin&lt;/small&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Django Admin&lt;/strong&gt; offers a highly customizable CRUD interface. This interface is automatically generated from the definition of the database data model already written in Python as part of the Django project. On top of that, we can supply additional metadata (labels, validations, relationships ...) that makes the Django Admin interface more user-friendly and lets us build more complex tools, even ones intended for staff users with non-technical profiles.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2F1yik00qnrtfozotbm5pe.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2F1yik00qnrtfozotbm5pe.png" alt="Administration metadata for database model"&gt;&lt;/a&gt;&lt;br&gt;
&lt;small&gt;Image #5: Administration metadata for database model&lt;/small&gt;&lt;/p&gt;

&lt;p&gt;Thanks to this feature of Django, I cut development time by months, not to mention the time I expect to save in the future on backend maintenance and error correction.&lt;/p&gt;

&lt;h3&gt;
  
  
  Star #2: Natural language processing with NLTK
&lt;/h3&gt;

&lt;p&gt;My project required developing natural language analysis algorithms to process a lot of content written in English, mainly text and video transcripts.&lt;/p&gt;

&lt;p&gt;I needed to extract especially difficult vocabulary according to the different English levels (A1, B1, B2 ...), classify language by level, detect verb conjugations and proper names, and classify the complexity of sentences.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;One of the key tasks is determining how difficult a word is for an English learner, depending on the word itself and the context in which it is found.&lt;/p&gt;

&lt;p&gt;Building on the facilities that NLTK provides, the algorithms used to analyze the vocabulary present in a text draw on different studies:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Research papers from Paul Nation about his work in English language learning and vocabulary coverage in English texts &lt;a href="https://www.wgtn.ac.nz/lals/resources/paul-nations-resources" rel="noopener noreferrer"&gt;-link-&lt;/a&gt; &lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Resources from the University Centre for Computer Corpus Research on Language at the University of Lancaster &lt;a href="http://ucrel.lancs.ac.uk/" rel="noopener noreferrer"&gt;-link-&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Resources from University of Oxford about researching in English corpora &lt;a href="https://libguides.bodleian.ox.ac.uk/c.php?g=422775&amp;amp;p=2886864" rel="noopener noreferrer"&gt;-link-&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;English Profile from Cambridge University &lt;a href="https://www.englishprofile.org/wordlists" rel="noopener noreferrer"&gt;-link-&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2Fyl0h9xtmzoro14qsw4on.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2Fyl0h9xtmzoro14qsw4on.png" alt="Vocabulary analysis for a given text"&gt;&lt;/a&gt;&lt;br&gt;
&lt;small&gt;Image #6: Vocabulary analysis for a given text&lt;/small&gt;&lt;/p&gt;

&lt;p&gt;Python has excellent natural language processing packages. &lt;strong&gt;&lt;a href="https://www.nltk.org/" rel="noopener noreferrer"&gt;NLTK&lt;/a&gt;&lt;/strong&gt; is undoubtedly the most important and is extremely powerful, but other libraries like &lt;strong&gt;&lt;a href="https://textblob.readthedocs.io/en/dev/" rel="noopener noreferrer"&gt;TextBlob&lt;/a&gt;&lt;/strong&gt; are especially useful for specific tasks and can be used alongside &lt;strong&gt;NLTK&lt;/strong&gt; without conflict.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2Fj635d6h0encx3gwooolw.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2Fj635d6h0encx3gwooolw.png" alt="Readability tests code"&gt;&lt;/a&gt;&lt;br&gt;
&lt;small&gt;Image #7: Readability tests code&lt;/small&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;NLTK&lt;/strong&gt; also offered me a perfect gift: a compelling interface with &lt;strong&gt;&lt;a href="https://wordnet.princeton.edu/" rel="noopener noreferrer"&gt;WordNet&lt;/a&gt;&lt;/strong&gt;, a huge lexical database in English, something like a complete thesaurus for scientific purposes, organized in semantic sets called &lt;strong&gt;synsets&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2Fwtx9fpadjtmsl7vbn64l.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2Fwtx9fpadjtmsl7vbn64l.png" alt="Dictionary manager"&gt;&lt;/a&gt;&lt;br&gt;
&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2Fiqboe36ut8a74nvy0uyq.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2Fiqboe36ut8a74nvy0uyq.png" alt="Vocabulary manager"&gt;&lt;/a&gt;&lt;br&gt;
&lt;small&gt;Image #8: Dictionary and vocabulary manager&lt;/small&gt;&lt;/p&gt;

&lt;p&gt;Also, the WordNet license is permissive enough to let me use it to offer a rich dictionary within my project.&lt;/p&gt;

&lt;p&gt;Once again, this opened up the possibility of cutting development time dramatically.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;This project uses a &lt;strong&gt;readability test&lt;/strong&gt; based on different readability studies to classify the complexity of any text and determine how suitable it is for a student. Scores from readability formulas generally correlate highly with the actual readability of a text.&lt;/p&gt;

&lt;p&gt;Where do readability tests come from?&lt;/p&gt;

&lt;p&gt;The &lt;strong&gt;US Military&lt;/strong&gt; took the first step in grading adults in 1917. In 1921, Edward Thorndike tabulated the frequency of difficult words used in general literature, a great step in analyzing text complexity.&lt;/p&gt;

&lt;p&gt;The first modern readability tests were developed in the 1930s. The Great Depression sparked investment in adult education. General adult readers in the United States had limited reading ability, and many students complained that many books were too difficult to read. These first readability tests were based on sentence length and difficult words.&lt;/p&gt;

&lt;p&gt;In 1949, &lt;strong&gt;George Kingsley Zipf&lt;/strong&gt; came up with his study "Human Behavior and The Principle of Least Effort", in which he described a mathematical relationship between hard and easy words and their frequency in texts, called &lt;strong&gt;Zipf’s Curve&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;In the 1940s and 1950s, with the birth of modern marketing, several linguists like &lt;strong&gt;Rudolf Flesch&lt;/strong&gt; and &lt;strong&gt;Robert Gunning&lt;/strong&gt; strove to create the easiest and most reliable readability formula. Readability studies became a very competitive field.&lt;/p&gt;

&lt;p&gt;Both the public sector (education, military...) and the private sector (publishing, journalism...) needed readability tests to ensure that texts met a minimum readability requirement. As an example, the Texas Department of Insurance requires that all insurance policy documents have a Flesch Reading Ease score of 40 or higher, the reading level of a first-year undergraduate student.&lt;/p&gt;

&lt;p&gt;In the 1970s, Flesch published another readability formula with his colleague &lt;strong&gt;J. Peter Kincaid&lt;/strong&gt; in partnership with the US Navy: &lt;strong&gt;the Flesch-Kincaid&lt;/strong&gt; formula. After rigorous validation, the formula became a US &lt;strong&gt;military standard&lt;/strong&gt;. It was also applied to other sectors, like the insurance industry.&lt;/p&gt;

&lt;p&gt;Nowadays there are over 200 readability formulas and readability is a research field in continuous evolution. &lt;/p&gt;
&lt;/blockquote&gt;
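
&lt;p&gt;As an illustration of how such formulas work, here is a minimal sketch of the Flesch Reading Ease score mentioned above. It is not the code used in this project: the function names are hypothetical, and the syllable counter is a rough vowel-group heuristic assumed here for simplicity, so the scores are only approximate.&lt;/p&gt;

```python
import re


def count_syllables(word):
    """Rough heuristic (an assumption): one syllable per run of vowels."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))


def flesch_reading_ease(text):
    """Flesch Reading Ease: higher scores mean easier text."""
    # Naive splitting on end-of-sentence punctuation.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one word")

    syllables = sum(count_syllables(w) for w in words)
    words_per_sentence = len(words) / len(sentences)
    syllables_per_word = syllables / len(words)

    # Standard Flesch Reading Ease formula.
    return 206.835 - 1.015 * words_per_sentence - 84.6 * syllables_per_word


print(round(flesch_reading_ease("The cat sat on the mat."), 3))
```

&lt;p&gt;Short sentences made of short words push the score up, while long sentences and polysyllabic words pull it down. A production version would likely replace both the sentence splitter and the syllable heuristic with proper NLP tooling.&lt;/p&gt;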

&lt;h3&gt;
  
  
  Star #3: Choosing PostgreSQL as my database engine
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;&lt;a href="https://www.postgresql.org/" rel="noopener noreferrer"&gt;PostgreSQL&lt;/a&gt;&lt;/strong&gt; stood out as the best choice over the alternatives (MySQL, MongoDB ...) for several reasons:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2Fkg5nqku9kofcg72q37i7.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2Fkg5nqku9kofcg72q37i7.png" alt="PostgreSQL logo"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;I wanted to manage my own database instance rather than use a cloud database, and I had previous experience with PostgreSQL.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;For some of the features that I planned to offer, I needed the richness that &lt;strong&gt;SQL&lt;/strong&gt; offers for executing queries. In many cases, the queries are fairly complex, involving variable relationships between different data types.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;I did not anticipate frequent changes in the data model that would justify a NoSQL model.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;PostgreSQL offers adequate mechanisms to scale, if necessary, something that does not concern me for now.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Star #4: The Python ecosystem.
&lt;/h3&gt;

&lt;p&gt;The Django and Python ecosystem is simply unique. You can find a solution to almost any problem and focus on developing &lt;strong&gt;"your idea"&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;For example, one of the features that I wanted to offer was giving students access to the pronunciation of any word they looked up in the dictionary at the click of a button, something similar to what &lt;strong&gt;&lt;a href="https://youglish.com/" rel="noopener noreferrer"&gt;Youglish&lt;/a&gt;&lt;/strong&gt; does.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2F3z7utbn6xb9ameeoxwt5.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2F3z7utbn6xb9ameeoxwt5.png" alt="Pronunciation assistant"&gt;&lt;/a&gt;&lt;br&gt;
&lt;small&gt;Image #9: Pronunciation assistant&lt;/small&gt;&lt;/p&gt;

&lt;p&gt;I needed to access the transcripts of thousands of &lt;strong&gt;YouTube videos&lt;/strong&gt; and analyze their content to index the occurrences of thousands of words from the &lt;strong&gt;WordNet&lt;/strong&gt; dictionary. I found two useful Python tools: &lt;strong&gt;&lt;a href="https://github.com/PauloMigAlmeida/youtube2srt/" rel="noopener noreferrer"&gt;youtube2srt&lt;/a&gt;&lt;/strong&gt; and &lt;strong&gt;&lt;a href="https://pypi.org/project/youtube-transcript-api/" rel="noopener noreferrer"&gt;Youtube-transcript-API&lt;/a&gt;&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;But these are not the only Django and Python libraries and features that helped me in the task. I would like to mention a few more, as they may help others who are considering developing a similar educational tool:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://django-crispy-forms.readthedocs.io/en/latest/" rel="noopener noreferrer"&gt;crispy-forms&lt;/a&gt;&lt;/strong&gt;: rapid form development and Ajax validation&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2Fgna0b0zlrqfzmvjz7g7s.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2Fgna0b0zlrqfzmvjz7g7s.png" alt="Form definition"&gt;&lt;/a&gt;&lt;br&gt;
&lt;small&gt;Image #10: Form definition&lt;/small&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://django-allauth.readthedocs.io/en/latest/index.html" rel="noopener noreferrer"&gt;Django-allauth&lt;/a&gt;&lt;/strong&gt;: sign up, authentication, and user account management&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2Fg5rugjjr1y7z0fp5habi.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2Fg5rugjjr1y7z0fp5habi.png" alt="Signup section specification"&gt;&lt;/a&gt;&lt;br&gt;
&lt;small&gt;Image #11: Signup section specification (Allauth)&lt;/small&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2Fwy7ykkzhhvu247e3anp6.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2Fwy7ykkzhhvu247e3anp6.png" alt="Signup and signin page design"&gt;&lt;/a&gt;&lt;br&gt;
&lt;small&gt;Image #12: Signup and signin page design&lt;/small&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;&lt;a href="https://pypi.org/project/django-mobile/" rel="noopener noreferrer"&gt;Django Mobile&lt;/a&gt;&lt;/strong&gt;: library for the detection of browsers on mobile devices for their differentiated treatment in the templating system&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;&lt;a href="https://pypi.org/project/geoip2/" rel="noopener noreferrer"&gt;GeoIP2&lt;/a&gt;&lt;/strong&gt;: wrapper for the MaxMind geoip2 Python library&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;&lt;a href="https://docs.djangoproject.com/en/3.1/ref/contrib/humanize/" rel="noopener noreferrer"&gt;Django-humanize&lt;/a&gt;&lt;/strong&gt;: a series of filters for Django's templating system to make the interface more human&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;&lt;a href="https://docs.djangoproject.com/en/3.1/ref/templates/language/" rel="noopener noreferrer"&gt;Django templating language&lt;/a&gt; and &lt;a href="https://jinja.palletsprojects.com/en/2.11.x/" rel="noopener noreferrer"&gt;Jinja&lt;/a&gt;&lt;/strong&gt; to define templates in HTML&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;&lt;a href="https://docs.djangoproject.com/en/3.1/howto/custom-management-commands/" rel="noopener noreferrer"&gt;Django Management Commands&lt;/a&gt;&lt;/strong&gt;: This is not a library but a Django feature. Management commands are small Python applications that can be invoked from the command line and are fully integrated with the Django project. In the project, I use dozens of them for a multitude of small automation tasks, scraping, etc.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Thanks for reading this article. If you have any questions, feel free to comment below.&lt;/p&gt;

&lt;p&gt;Connect with me on &lt;a href="https://twitter.com/jesusramirezs" rel="noopener noreferrer"&gt;Twitter&lt;/a&gt; or &lt;a href="https://www.linkedin.com/in/jesusramirezserran/?locale=en_US" rel="noopener noreferrer"&gt;LinkedIn&lt;/a&gt;&lt;/p&gt;




&lt;p&gt;Creative Commons images from Flickr:&lt;br&gt;
&lt;a href="https://www.flickr.com/photos/luisfelipepadilla" rel="noopener noreferrer"&gt;https://www.flickr.com/photos/luisfelipepadilla&lt;/a&gt;&lt;/p&gt;

</description>
      <category>python</category>
      <category>django</category>
      <category>showdev</category>
      <category>webdev</category>
    </item>
    <item>
      <title>A simple experiment with the JSFeat library combining skin and edge detection</title>
      <dc:creator>jesusramirezs</dc:creator>
      <pubDate>Wed, 30 Sep 2020 15:55:06 +0000</pubDate>
      <link>https://dev.to/jesusramirezs/a-simple-experiment-with-the-jsfeat-library-combining-skin-and-edge-detection-4k08</link>
      <guid>https://dev.to/jesusramirezs/a-simple-experiment-with-the-jsfeat-library-combining-skin-and-edge-detection-4k08</guid>
      <description>&lt;p&gt;In a &lt;a href="https://dev.to/jesusramirezs/some-interesting-javascript-libraries-for-image-processing-and-computer-vision-3al0"&gt;previous article&lt;/a&gt;, I briefly reviewed some libraries for experimenting with computer vision and image processing in &lt;strong&gt;JavaScript&lt;/strong&gt;. This is an area that I find fascinating and fun.&lt;/p&gt;

&lt;p&gt;Among the libraries listed, one in particular caught my attention: &lt;a href="https://inspirit.github.io/jsfeat/" rel="noopener noreferrer"&gt;JSFeat&lt;/a&gt;. Besides offering what seems to be a fairly complete set of filters and algorithms, it has good documentation and some quite illustrative examples.&lt;/p&gt;

&lt;p&gt;I found it very easy to start playing with this library. Each filter or algorithm in the library is documented with a simple example, and all of them work in real time with the PC’s webcam.&lt;/p&gt;

&lt;p&gt;I want to try something that I have been thinking about: a simple hand gesture/movement detector. To do this, I will first apply a simple real-time pre-filter to the image that separates skin tones from the rest of the image's colors.&lt;/p&gt;

&lt;p&gt;I know that the result won’t be rigorous, but I am not aiming for a 100% reliable result: it is just a test intended to simplify the initial problem as much as possible.&lt;/p&gt;

&lt;p&gt;To start our experiment, we only need a local HTTP server (for example, Apache) and the code from one of the most basic JSFeat examples to use as a template. For example, we can start from the “canny edge demo”, which already uses one of the best-known edge detection algorithms, “&lt;strong&gt;Canny edges&lt;/strong&gt;”:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://inspirit.github.io/jsfeat/sample_canny_edge.html" rel="noopener noreferrer"&gt;https://inspirit.github.io/jsfeat/sample_canny_edge.html&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The &lt;strong&gt;JSFeat&lt;/strong&gt; website does not offer the examples as a ready-to-clone project, so you will have to set up a “js” folder with the necessary libraries next to your .html file, or modify the code so that it does not use them:&lt;/p&gt;

&lt;p&gt;jsfeat-min.js: Github: &lt;a href="https://github.com/inspirit/jsfeat" rel="noopener noreferrer"&gt;https://github.com/inspirit/jsfeat&lt;/a&gt;&lt;br&gt;
profiler.js&lt;br&gt;
compatibility.js&lt;br&gt;
bootstrap.js&lt;/p&gt;

&lt;p&gt;and in a folder named “css”:&lt;/p&gt;

&lt;p&gt;js-feat.css  // basic styles&lt;br&gt;
bootstrap.css  // bootstrap CSS&lt;/p&gt;

&lt;p&gt;There is a bunch of code dedicated to initializing the webcam and creating a web canvas onto which the webcam video stream is dumped and the algorithms are applied. Let's skip all this to focus on just two functions:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;    &lt;span class="nf"&gt;demo_app&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="nf"&gt;tick&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;demo_app()&lt;/strong&gt; is an initialization function, while &lt;strong&gt;tick()&lt;/strong&gt; is executed for each video frame captured from our webcam.&lt;/p&gt;

&lt;p&gt;In &lt;strong&gt;demo_app()&lt;/strong&gt; we find two important lines of code. The first is:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;    &lt;span class="nx"&gt;ctx&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;canvas&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getContext&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;2d&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;strong&gt;getContext()&lt;/strong&gt; function returns the drawing context of the HTML canvas, an object that has all the drawing properties and functions you use to draw on the canvas.&lt;/p&gt;

&lt;p&gt;At each frame, we will draw the image captured from our webcam into this drawing context.&lt;/p&gt;

&lt;p&gt;The second line is:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;    &lt;span class="nx"&gt;img_u8&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nx"&gt;jsfeat&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;matrix_t&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;640&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;480&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;jsfeat&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;U8_t&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="nx"&gt;jsfeat&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;C1_t&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;JSFeat uses a data structure called “&lt;strong&gt;matrix_t&lt;/strong&gt;”, a matrix that we create with the dimensions of our HTML canvas, which match the resolution chosen for the video captured from our webcam: in our case, &lt;strong&gt;640 x 480 pixels&lt;/strong&gt;. The edge detection algorithm will be applied to this matrix once we have filtered the skin tones.&lt;/p&gt;

&lt;p&gt;We need to initialize the matrix with the number of channels to be used and the data type that represents each pixel: in our case, “single-channel unsigned char”, because once we have separated the skin from the rest of the image, we’ll apply edge detection to the monochrome image produced by the “&lt;strong&gt;grayscale&lt;/strong&gt;” function.&lt;/p&gt;
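
&lt;p&gt;To make the “single-channel unsigned char” idea concrete, here is a minimal plain-JavaScript sketch of collapsing a four-bytes-per-pixel RGBA buffer into a one-byte-per-pixel buffer. It is not JSFeat's implementation: the function name toGray and the ITU-R BT.601 luma weights are assumptions made for illustration, and the library may use different weights or integer arithmetic.&lt;/p&gt;

```javascript
// Plain-JavaScript sketch, not JSFeat's implementation: collapse an RGBA
// buffer (4 bytes per pixel) into a single-channel buffer (1 byte per pixel).
function toGray(rgba, width, height) {
  const gray = new Uint8Array(width * height);
  gray.forEach((_, p) => {
    const i = p * 4; // offset of pixel p in the RGBA buffer
    // Assumed ITU-R BT.601 luma weights; the alpha byte at i + 3 is ignored.
    gray[p] = Math.round(0.299 * rgba[i] + 0.587 * rgba[i + 1] + 0.114 * rgba[i + 2]);
  });
  return gray;
}

// A 2 x 1 frame: one white pixel followed by one pure red pixel.
const frame = new Uint8ClampedArray([255, 255, 255, 255, 255, 0, 0, 255]);
const gray = toGray(frame, 2, 1);
console.log(gray.length); // 2: one byte per pixel instead of four
```

&lt;p&gt;A 640 x 480 frame therefore shrinks from 1,228,800 bytes of RGBA data to 307,200 bytes of gray levels, which is the buffer size a single-channel 8-bit matrix of that resolution has to hold.&lt;/p&gt;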

&lt;p&gt;It is important to note that the skin pre-filtering is not performed with any JSFeat algorithm but with a function written from scratch, in which the “img_u8” data structure is not involved.&lt;/p&gt;

&lt;p&gt;This function traverses an “&lt;strong&gt;RGBA&lt;/strong&gt;” data array, where each pixel is represented by four bytes: the &lt;strong&gt;Red, Green, Blue&lt;/strong&gt; color components and the &lt;strong&gt;Alpha&lt;/strong&gt; channel.&lt;/p&gt;
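
&lt;p&gt;As a small illustration of that layout, the sketch below reads one pixel out of a flat RGBA buffer stored row by row, which is how the canvas ImageData.data array is organized. The helper name pixelAt is hypothetical and exists only for this example.&lt;/p&gt;

```javascript
// Hypothetical helper: read the pixel at column x, row y from a flat RGBA
// buffer laid out row by row, as in ImageData.data.
function pixelAt(data, width, x, y) {
  const i = (y * width + x) * 4; // 4 bytes per pixel
  return { r: data[i], g: data[i + 1], b: data[i + 2], a: data[i + 3] };
}

// A 2 x 2 image: red and green on the first row; blue and transparent black
// on the second row.
const data = new Uint8ClampedArray([
  255, 0, 0, 255,   0, 255, 0, 255,
  0, 0, 255, 255,   0, 0, 0, 0,
]);
console.log(pixelAt(data, 2, 0, 1)); // { r: 0, g: 0, b: 255, a: 255 }
```

&lt;p&gt;Stepping the index by four, as the skin filter does, visits exactly one pixel per iteration.&lt;/p&gt;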

&lt;p&gt;To determine whether or not a pixel corresponds to skin, we first convert its color from &lt;strong&gt;RGB&lt;/strong&gt; format to &lt;strong&gt;HSV&lt;/strong&gt; format using the following function:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;    &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;rgb2hsv&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;r&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;g&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;b&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                &lt;span class="kd"&gt;let&lt;/span&gt; &lt;span class="nx"&gt;rabs&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;gabs&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;babs&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;rr&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;gg&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;bb&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;h&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;s&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;v&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;diff&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;diffc&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;percentRoundFn&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
                &lt;span class="nx"&gt;rabs&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;r&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="mi"&gt;255&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
                &lt;span class="nx"&gt;gabs&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;g&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="mi"&gt;255&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
                &lt;span class="nx"&gt;babs&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;b&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="mi"&gt;255&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
                &lt;span class="nx"&gt;v&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;Math&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;max&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;rabs&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;gabs&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;babs&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
                    &lt;span class="nx"&gt;diff&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;v&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="nb"&gt;Math&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;min&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;rabs&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;gabs&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;babs&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
                &lt;span class="nx"&gt;diffc&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;c&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;v&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="nx"&gt;c&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="mi"&gt;6&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="nx"&gt;diff&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
                &lt;span class="nx"&gt;percentRoundFn&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;num&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;Math&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;round&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;num&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
                &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;diff&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                    &lt;span class="nx"&gt;h&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;s&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
                &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                    &lt;span class="nx"&gt;s&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;diff&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="nx"&gt;v&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
                    &lt;span class="nx"&gt;rr&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;diffc&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;rabs&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
                    &lt;span class="nx"&gt;gg&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;diffc&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;gabs&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
                    &lt;span class="nx"&gt;bb&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;diffc&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;babs&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

                    &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;rabs&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="nx"&gt;v&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                        &lt;span class="nx"&gt;h&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;bb&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="nx"&gt;gg&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
                    &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;gabs&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="nx"&gt;v&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                        &lt;span class="nx"&gt;h&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="nx"&gt;rr&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="nx"&gt;bb&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
                    &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;babs&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="nx"&gt;v&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                        &lt;span class="nx"&gt;h&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="nx"&gt;gg&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="nx"&gt;rr&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
                    &lt;span class="p"&gt;}&lt;/span&gt;
                    &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;h&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                        &lt;span class="nx"&gt;h&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
                    &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;h&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                        &lt;span class="nx"&gt;h&lt;/span&gt; &lt;span class="o"&gt;-=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
                    &lt;span class="p"&gt;}&lt;/span&gt;
                &lt;span class="p"&gt;}&lt;/span&gt;
                &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                    &lt;span class="na"&gt;h&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;Math&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;round&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;h&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;360&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
                    &lt;span class="na"&gt;s&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;percentRoundFn&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;s&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
                    &lt;span class="na"&gt;v&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;percentRoundFn&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;v&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                &lt;span class="p"&gt;};&lt;/span&gt;
            &lt;span class="p"&gt;}&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Next, we use the algorithm proposed in the following paper, which presents the results of analyzing the "&lt;strong&gt;Pratheepan dataset for human skin detection&lt;/strong&gt;":&lt;/p&gt;

&lt;p&gt;&lt;a href="https://arxiv.org/ftp/arxiv/papers/1708/1708.02694.pdf" rel="noopener noreferrer"&gt;https://arxiv.org/ftp/arxiv/papers/1708/1708.02694.pdf&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This simple algorithm is run over the pixel data obtained from the canvas initialized in our HTML document:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;    &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;filterSkin&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;data&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;

        &lt;span class="k"&gt;for &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kd"&gt;var&lt;/span&gt; &lt;span class="nx"&gt;i&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="nx"&gt;i&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="nx"&gt;data&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;length&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="nx"&gt;i&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;

            &lt;span class="kd"&gt;var&lt;/span&gt; &lt;span class="nx"&gt;hsv&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;rgb2hsv&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;data&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;i&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="nx"&gt;data&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;i&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="nx"&gt;data&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;i&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;]);&lt;/span&gt;

            &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="p"&gt;(((&lt;/span&gt;&lt;span class="mf"&gt;0.0&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;=&lt;/span&gt; &lt;span class="nx"&gt;hsv&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;h&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="nx"&gt;hsv&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;h&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;=&lt;/span&gt; &lt;span class="mf"&gt;50.0&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="mi"&gt;23&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;=&lt;/span&gt; &lt;span class="nx"&gt;hsv&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;s&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="nx"&gt;hsv&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;s&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;=&lt;/span&gt; &lt;span class="mi"&gt;68&lt;/span&gt;  &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt;
                &lt;span class="nx"&gt;data&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;i&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;95&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="nx"&gt;data&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;i&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;40&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="nx"&gt;data&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;i&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;20&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="nx"&gt;data&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;i&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="nx"&gt;data&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;i&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt;
                &lt;span class="nx"&gt;data&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;i&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="nx"&gt;data&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;i&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;data&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;i&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="nx"&gt;data&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;i&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;15&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="nx"&gt;data&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;i&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;15&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;

                &lt;span class="nx"&gt;data&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;i&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
                &lt;span class="nx"&gt;data&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;i&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
                &lt;span class="nx"&gt;data&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;i&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
            &lt;span class="p"&gt;}&lt;/span&gt;


        &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
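
&lt;p&gt;The long condition above is easier to read when each threshold sits on its own line. The following sketch restates the same rule as a predicate for a single pixel. The names hueSat and isSkin are hypothetical, and the compact HSV helper is a standard conversion written for this example rather than the rgb2hsv function above; it does not round its results, so values sitting exactly on a threshold may be classified slightly differently.&lt;/p&gt;

```javascript
// Compact RGB-to-HSV helper written for this sketch (hypothetical name).
// Returns h in degrees (0 to 360) and s in percent (0 to 100).
function hueSat(r, g, b) {
  const max = Math.max(r, g, b);
  const diff = max - Math.min(r, g, b);
  let h = 0;
  if (diff !== 0) {
    if (max === r) h = ((g - b) / diff) % 6;
    else if (max === g) h = (b - r) / diff + 2;
    else h = (r - g) / diff + 4;
  }
  h *= 60;
  if (0 > h) h += 360;
  const s = max === 0 ? 0 : (diff / max) * 100;
  return { h, s };
}

// Same thresholds as the filterSkin condition, one rule per entry.
function isSkin(r, g, b, a) {
  const { h, s } = hueSat(r, g, b);
  const rules = [
    h >= 0, 50 >= h,        // hue between 0 and 50 degrees
    s >= 23, 68 >= s,       // saturation between 23% and 68%
    r > 95, g > 40, b > 20, // minimum level per channel
    r > g, r > b,           // red dominates
    r - g > 15,             // red clearly above green
    a > 15,                 // pixel is not (almost) transparent
  ];
  return rules.every(Boolean);
}

console.log(isSkin(224, 172, 105, 255)); // true: a light skin tone
console.log(isSkin(0, 0, 255, 255));     // false: pure blue
```

&lt;p&gt;filterSkin blacks out every pixel for which a predicate like this one returns false, so only skin-colored regions reach the grayscale and edge detection steps.&lt;/p&gt;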



&lt;p&gt;So the final data flow in the tick function is:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;        &lt;span class="c1"&gt;// the frame is drawn from the video stream into the 2D context of the canvas &lt;/span&gt;
        &lt;span class="nx"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;drawImage&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;video&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;640&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;480&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

        &lt;span class="c1"&gt;// we get the image data (matrix+metadata) from the 2D context&lt;/span&gt;
        &lt;span class="kd"&gt;var&lt;/span&gt; &lt;span class="nx"&gt;imageData&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getImageData&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;640&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;480&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

        &lt;span class="c1"&gt;// the image data matrix is passed to the Skin Filtering function&lt;/span&gt;
        &lt;span class="nf"&gt;filterSkin&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;imageData&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;data&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;   

        &lt;span class="c1"&gt;// the new image content is passed to grayscale function. The result is a one byte per pixel image&lt;/span&gt;
        &lt;span class="nx"&gt;jsfeat&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;imgproc&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;grayscale&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;imageData&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;data&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;640&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;480&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;img_u8&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

        &lt;span class="c1"&gt;// let's apply some Gaussian blur to reduce noise&lt;/span&gt;
        &lt;span class="nx"&gt;jsfeat&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;imgproc&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;gaussian_blur&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;img_u8&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;img_u8&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

        &lt;span class="c1"&gt;// the monochrome image is passed to canny edges algorithm&lt;/span&gt;
        &lt;span class="nx"&gt;jsfeat&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;imgproc&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;canny&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;img_u8&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;img_u8&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;35&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;40&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2F0cf1745vff5f31wntblc.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2F0cf1745vff5f31wntblc.png" alt="Alt Text"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2Fuc4ut18ag5m04pezitls.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2Fuc4ut18ag5m04pezitls.png" alt="Alt Text"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2Fwlbfhe8jpst6ppnzupei.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fi%2Fwlbfhe8jpst6ppnzupei.png" alt="Alt Text"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;I would like to continue with these experiments and see how far I can go.&lt;/p&gt;

&lt;p&gt;Thanks for reading this article. Any feedback will be greatly appreciated.&lt;/p&gt;

&lt;p&gt;Connect with me on &lt;a href="https://twitter.com/jesusramirezs" rel="noopener noreferrer"&gt;Twitter&lt;/a&gt; or &lt;a href="https://www.linkedin.com/in/jesusramirezserran/?locale=en_US" rel="noopener noreferrer"&gt;LinkedIn&lt;/a&gt;&lt;/p&gt;

</description>
      <category>javascript</category>
      <category>frontend</category>
      <category>development</category>
      <category>todayilearned</category>
    </item>
    <item>
      <title>Why is it worth considering Django for your web project in 2020?</title>
      <dc:creator>jesusramirezs</dc:creator>
      <pubDate>Wed, 16 Sep 2020 17:16:13 +0000</pubDate>
      <link>https://dev.to/jesusramirezs/why-is-worth-considering-django-for-your-web-project-in-2020-dh7</link>
      <guid>https://dev.to/jesusramirezs/why-is-worth-considering-django-for-your-web-project-in-2020-dh7</guid>
      <description>&lt;p&gt;Web frameworks abstract the common aspects of building web sites and APIs and allow you to build solid applications with less effort.&lt;/p&gt;

&lt;p&gt;I’m confident that &lt;strong&gt;Django&lt;/strong&gt; will be one of the major players in the future of web development and it deserves to be considered.&lt;/p&gt;

&lt;p&gt;These days, when JavaScript frameworks seem to be the way forward and all the focus is on frontend libraries and Single Page Applications, I would like to give some reasons to still consider Python web development with Django, an impressive backend framework.&lt;/p&gt;

&lt;p&gt;I think that developers should always be open-minded and consider the best solution for each concrete problem, not just following the latest trend but taking different approaches into account. That is why I thought this article could be a little reminder.&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;Open source&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;Django is open source and is maintained by the Django Software Foundation, an independent 501(c)(3) non-profit focused on promoting, supporting and advancing the framework.&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;Productivity&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;Django was designed to help build software (not only dynamic web apps) as quickly as possible. It has a well-deserved reputation for being highly productive.&lt;/p&gt;

&lt;p&gt;Django lets you create a complete website or API from scratch in a short time span, so it has become a strong technological foundation for lots of companies.&lt;/p&gt;

&lt;p&gt;Python is an interpreted, object-oriented, dynamic language, and it is considered one of the most readable programming languages, which has greatly contributed to its popularity.&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;Popularity&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;First, Django is built on Python, which is one of the most popular languages. According to TIOBE, Python is ranked third in the list of most popular programming languages. Therefore you will find that the ecosystem and the community are huge.&lt;/p&gt;

&lt;p&gt;This also means that there are more developers available that could fit your needs when you decide to expand your team and start hiring.&lt;/p&gt;

&lt;p&gt;Talking about popularity, Django is considered the most popular Python framework, followed by Pyramid and Flask. According to Statista, in 2020 Django ranked among the most used web frameworks: &lt;a href="https://www.statista.com/statistics/1124699/worldwide-developer-survey-most-used-frameworks-web/"&gt;https://www.statista.com/statistics/1124699/worldwide-developer-survey-most-used-frameworks-web/&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Companies like Instagram, Pinterest, Disqus, The Washington Post, Bitbucket, Udemy and Mozilla use Django in their production websites.&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;Documentation&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;The documentation is one of Django’s strong points. In my honest opinion, it is really excellent and instructive. In fact, most Django developers learn Django from the official tutorials. It is thorough, cross-referenced and well versioned, and it covers all the Django features in great detail.&lt;/p&gt;

&lt;p&gt;Tutorials, reference guides, how-to articles and recipes provide a comprehensive and deep view of the framework and describe how it works and how to use it.&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;Release process&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;The release process is very well documented and reflects the great care that the &lt;b&gt;Django Project&lt;/b&gt; puts into fitting real industry development cycles by smoothing changes between versions as much as possible.&lt;/p&gt;

&lt;p&gt;According to Django’s website: “The plan is to have a new feature release every 8 months and a new long-term support release (LTS) every 2 years. LTS releases are supported with security updates for 3 years.” &lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;Batteries included&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;Django’s batteries-included philosophy means that the framework has everything necessary to develop a complete web application. It lets you focus on your business logic.&lt;/p&gt;

&lt;p&gt;Django has lots of built-in modules and open-source packages that let you build almost anything easily without reinventing the wheel. There are literally thousands of high-quality third-party libraries that can help accelerate your development.&lt;/p&gt;

&lt;p&gt;Some examples:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Requests - HTTP client&lt;/li&gt;
&lt;li&gt;Django REST framework - REST API endpoints&lt;/li&gt;
&lt;li&gt;pytz - Timezone support&lt;/li&gt;
&lt;li&gt;Pillow - Image manipulation&lt;/li&gt;
&lt;li&gt;django-templated-email - CSS in email templates&lt;/li&gt;
&lt;li&gt;django-social-auth - OAuth authentication&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;...and many more...&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;Security&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;Django is one of the most secure web frameworks. &lt;/p&gt;

&lt;p&gt;It is implemented in Python, which has an excellent security track record.&lt;/p&gt;

&lt;p&gt;Furthermore, Django was designed with security in mind. Its built-in security mechanisms can protect your website or app against most common attacks, such as SQL injection, XSS, CSRF, clickjacking and many more.&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;The ORM&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;Are you thinking of writing a database middleware by yourself?&lt;/p&gt;

&lt;p&gt;Django comes with an ORM.&lt;/p&gt;

&lt;p&gt;The ORM (object-relational mapping) is an abstraction layer between the database and your web application.&lt;/p&gt;

&lt;p&gt;It is compatible with most major SQL databases and, thanks to its powerful API, makes it possible to access and maintain your data from Python code instead of SQL.&lt;/p&gt;

&lt;p&gt;The key here is that you won’t write SQL statements; instead you code against an abstraction that works with any supported SQL database.&lt;/p&gt;

&lt;p&gt;This abstraction provides a declarative model to describe your schema and common operators to perform any kind of query you need.&lt;/p&gt;

&lt;p&gt;Anyway, you can still use SQL if you want.&lt;/p&gt;

&lt;p&gt;The ORM has core support for MySQL, MariaDB, PostgreSQL, Oracle and SQLite, and third-party packages add support for other database systems such as Microsoft SQL Server or IBM DB2.&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;Migrations&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;It's difficult to imagine a project where the data model doesn't change throughout the development or production stages.&lt;/p&gt;

&lt;p&gt;&lt;b&gt;Migrations&lt;/b&gt; are a distinctive feature of Django. The framework detects changes in your data model declarations and automatically generates the SQL statements needed to adapt the database schema to them.&lt;/p&gt;

&lt;p&gt;It also supports "data migrations", which apply transformations to the data itself.&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;Admin interface&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;An administration backend is a must-have for any project, and Django has done a great job with the admin interface.&lt;/p&gt;

&lt;p&gt;Django provides a built-in admin panel that is customizable and extensible by default. This admin panel is highly configurable and can be a great interface for your staff to maintain data. You can even use themes to customize the appearance to your liking.&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;SEO-friendly development&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;There are lots of tools Django provides to assist you with SEO such as the Django SEO framework, django-robots, the sitemap framework, the redirects app...&lt;/p&gt;

&lt;p&gt;Developers can also reduce page size and loading time by compressing CSS and JavaScript and using cached templates.&lt;/p&gt;

&lt;p&gt;There is much more to write about Django. I will be very pleased to write more concrete articles about features/libraries I have experimented with in some of my projects. For example &lt;strong&gt;I would like to write about scalability&lt;/strong&gt; in Django in another article.&lt;/p&gt;

&lt;p&gt;If you have any questions, feel free to comment below. &lt;/p&gt;

&lt;p&gt;Connect with me on &lt;a href="https://twitter.com/jesusramirezs"&gt;Twitter&lt;/a&gt; or &lt;a href="https://www.linkedin.com/in/jesusramirezserran/?locale=en_US"&gt;LinkedIn&lt;/a&gt;&lt;/p&gt;

</description>
      <category>django</category>
      <category>python</category>
      <category>webdev</category>
    </item>
    <item>
      <title>Some interesting JavaScript libraries for image processing and computer vision.</title>
      <dc:creator>jesusramirezs</dc:creator>
      <pubDate>Sun, 06 Sep 2020 15:58:36 +0000</pubDate>
      <link>https://dev.to/jesusramirezs/some-interesting-javascript-libraries-for-image-processing-and-computer-vision-3al0</link>
      <guid>https://dev.to/jesusramirezs/some-interesting-javascript-libraries-for-image-processing-and-computer-vision-3al0</guid>
      <description>&lt;p&gt;For the past two months, I've been doing some research in the field of computer vision on the web.&lt;/p&gt;

&lt;p&gt;Today’s JavaScript implementations are really fast, so some computationally intensive tasks that were reserved for other languages and platforms just a few years ago are now feasible in web browsers or Node.js.&lt;/p&gt;

&lt;p&gt;So if you are a JavaScript developer interested in computer vision, you are in luck.&lt;/p&gt;

&lt;p&gt;First of all, we must differentiate computer vision from image processing. Some of the JavaScript libraries in this article are in fact just image processing libraries. Performing computer vision requires more complex and sophisticated algorithms and techniques.&lt;/p&gt;

&lt;p&gt;Image processing makes extensive use of maths and algorithms to extract important image features. Computer vision uses the power of image processing along with other techniques (decision trees, Bayes classifiers, deep neural networks...) in order to recognize objects or categorize images.&lt;/p&gt;

&lt;p&gt;Computer vision tries to do what a human brain does when it recognizes shapes, objects or situations in an image, while image processing is mainly focused on processing raw images, making them optimal for other tasks (noise reduction, for example) and extracting key features.&lt;/p&gt;
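
&lt;p&gt;To make the image processing side more concrete, here is a minimal sketch in plain JavaScript of two basic operations: grayscale conversion and thresholding. It assumes the image is a flat RGBA buffer (4 bytes per pixel), like the one a canvas exposes through &lt;code&gt;ImageData.data&lt;/code&gt;. The function names &lt;code&gt;toGrayscale&lt;/code&gt; and &lt;code&gt;threshold&lt;/code&gt; are hypothetical and do not belong to any of the libraries listed below.&lt;/p&gt;

```javascript
// Rec. 601 luma weights, a common choice for RGB to grayscale conversion.
const LUMA = { r: 0.299, g: 0.587, b: 0.114 };

// Convert a flat RGBA buffer (4 bytes per pixel) into one gray value per pixel.
function toGrayscale(rgba) {
  const pixelCount = Math.floor(rgba.length / 4);
  return Uint8ClampedArray.from({ length: pixelCount }, (_, i) => {
    const offset = i * 4; // the alpha byte at offset + 3 is ignored
    return Math.round(
      LUMA.r * rgba[offset] + LUMA.g * rgba[offset + 1] + LUMA.b * rgba[offset + 2]
    );
  });
}

// Binarize a grayscale image: values at or above the cutoff become 255, the rest 0.
function threshold(gray, cutoff) {
  return gray.map((value) => (value >= cutoff ? 255 : 0));
}

// Two sample pixels: pure white and pure black.
const pixels = new Uint8ClampedArray([255, 255, 255, 255, 0, 0, 0, 255]);
const gray = toGrayscale(pixels);
const binary = threshold(gray, 128);
console.log(Array.from(gray), Array.from(binary));
```

&lt;p&gt;In a browser, the input buffer would typically come from a canvas 2D context via &lt;code&gt;getImageData&lt;/code&gt;. A computer vision system would take output like this as just its first step before trying to recognize anything in the image.&lt;/p&gt;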

&lt;p&gt;Some JavaScript computer vision libraries, like tracking.js or handtrack.js, are very specialized in scope: they focus on detecting specific kinds of “objects” such as faces, eyes or hands. These libraries give you ready-to-use systems to perform actual computer vision tasks. Others, like Opencv4nodejs / OpenCV, aim to provide more general frameworks that can help solve a wider range of computer vision problems.&lt;/p&gt;

&lt;p&gt;Here are some of the libraries that I found especially interesting in the field of image processing and computer vision, all of them open source.&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;GammaCV&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;GammaCV is a WebGL-accelerated computer vision library. It uses a data flow paradigm to create and run graphs on the GPU. It is a very compact library: it weighs just 32.5 KB minified.&lt;/p&gt;

&lt;p&gt;In addition to the most common algorithms (grayscaling, color segmentation…), it implements some more sophisticated ones like Canny edges, the Sobel operator and line detection, but it lacks important feature extraction algorithms like FAST or ORB.&lt;/p&gt;

&lt;p&gt;Website: &lt;a href="https://gammacv.com" rel="noopener noreferrer"&gt;https://gammacv.com&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Github repository: &lt;a href="https://github.com/PeculiarVentures/GammaCV" rel="noopener noreferrer"&gt;https://github.com/PeculiarVentures/GammaCV&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;Opencv4nodejs&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;Opencv4nodejs is not exactly a pure JavaScript library but an npm package that provides Node.js bindings to OpenCV through an asynchronous API. It supports OpenCV 3 and OpenCV 4, so it brings all the performance benefits of the native OpenCV library to your Node.js application and lets you easily implement multithreaded CV tasks via Promises. It sounds really great.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fi.imgur.com%2FqpXz3o2.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fi.imgur.com%2FqpXz3o2.png" alt="OpenCV"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;OpenCV (Open Source Computer Vision Library) is a library of programming functions mainly aimed at real-time computer vision.&lt;/p&gt;

&lt;p&gt;If execution in the browser is not an important requirement, Opencv4nodejs is probably the most interesting option given the performance and maturity of OpenCV.&lt;/p&gt;

&lt;p&gt;Github repository: &lt;a href="https://github.com/justadudewhohacks/opencv4nodejs/" rel="noopener noreferrer"&gt;https://github.com/justadudewhohacks/opencv4nodejs/&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;OpenCV.js&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;If you are looking for a 100% browser solution, OpenCV.js offers a different approach: JavaScript bindings for a subset of the OpenCV library, implemented in WebAssembly.&lt;/p&gt;

&lt;p&gt;You can't expect OpenCV.js to do everything you can do with OpenCV from C or Python, or even with Opencv4nodejs. The documentation is not that good either.&lt;/p&gt;

&lt;p&gt;An additional problem to take into account is the size of the library itself, 2 MB, which makes it a poor fit for some networks and devices.&lt;/p&gt;

&lt;p&gt;To be clear, OpenCV.js is a really interesting WebAssembly implementation but, in my opinion, you can probably find better alternatives depending on the kind of task you are trying to achieve.&lt;/p&gt;

&lt;p&gt;Website: &lt;a href="https://docs.opencv.org/3.4/df/df7/tutorial_js_table_of_contents_setup.htmlhttps://docs.opencv.org/master/d5/d10/tutorial_js_root.html" rel="noopener noreferrer"&gt;https://docs.opencv.org/3.4/df/df7/tutorial_js_table_of_contents_setup.htmlhttps://docs.opencv.org/master/d5/d10/tutorial_js_root.html&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;MarvinJ&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;MarvinJ is a pure JavaScript image processing library. It derives from Marvin Framework, a cross-platform Java image processing framework.&lt;/p&gt;

&lt;p&gt;MarvinJ provides a set of algorithms and filters (Gaussian, emboss, grayscale, thresholding…) that could be wide enough for your purposes but, as with GammaCV, it lacks feature extraction algorithms. I could only find the Prewitt edge filter, which is not exactly one of the most widely used.&lt;/p&gt;

&lt;p&gt;Website: &lt;a href="http://www.marvinj.org/en/index.html" rel="noopener noreferrer"&gt;http://www.marvinj.org/en/index.html&lt;/a&gt;&lt;br&gt;
Github repository: &lt;a href="https://github.com/gabrielarchanjo/marvinj" rel="noopener noreferrer"&gt;https://github.com/gabrielarchanjo/marvinj&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;tracking.js&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;This library brings some well-known image processing algorithms (Gaussian blur, grayscale, convolution...) along with various computer vision algorithms to JavaScript. It can perform color tracking, face detection and feature detection. It is well documented and the examples on the website are very illustrative.&lt;/p&gt;

&lt;p&gt;It makes it very easy to implement color tracking, face detection (not recognition) or eye tracking from video or a webcam. Tracking.js also provides a simple framework to implement your own object tracking algorithm. Of course, it comes with some filters and feature extraction tools like FAST and BRIEF.&lt;/p&gt;

&lt;p&gt;Website: &lt;a href="http://trackingjs.com/" rel="noopener noreferrer"&gt;http://trackingjs.com/&lt;/a&gt;&lt;br&gt;
 Github repository: &lt;a href="https://github.com/eduardolundgren/tracking.js/" rel="noopener noreferrer"&gt;https://github.com/eduardolundgren/tracking.js/&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;jsfeat&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;jsfeat has a rich and varied feature set to implement image processing in any browser. It can perform tasks such as edge detection, image processing (grayscale, blur, etc.), corner detection, object detection and optical flow detection.&lt;/p&gt;

&lt;p&gt;This library is very lightweight (23 kB) and really fast, with very good performance on desktop computers and even mobile devices. On its website you can find lots of real-time demos and examples using your webcam (WebRTC required), so you can check the resulting frame rate in all of them.&lt;/p&gt;

&lt;p&gt;The jsfeat documentation is very good. Of course, this library includes basic filters and algorithms (grayscale, derivatives, box blur, resample, Gaussian blur, histogram equalization) but also more advanced operations like:&lt;/p&gt;

&lt;p&gt;Canny edges&lt;br&gt;
Fast Corners feature detector&lt;br&gt;
Lucas-Kanade optical flow &lt;br&gt;
HAAR object detector&lt;br&gt;
BBF object detector &lt;/p&gt;

&lt;p&gt;which can be considered as advanced feature extractors.&lt;/p&gt;

&lt;p&gt;Website: &lt;a href="http://inspirit.github.io/jsfeat/" rel="noopener noreferrer"&gt;http://inspirit.github.io/jsfeat/&lt;/a&gt;&lt;br&gt;
Github repository: &lt;a href="https://github.com/inspirit/jsfeat" rel="noopener noreferrer"&gt;https://github.com/inspirit/jsfeat&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;In a future article I will show a small experiment using this library.&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;PoseNet&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;PoseNet is a machine learning model, built on TensorFlow.js, which allows for real-time human pose estimation in the browser.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fi.imgur.com%2FgcB4Iy4.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fi.imgur.com%2FgcB4Iy4.jpg" alt="PoseNet"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;PoseNet can be used to estimate either a single pose or multiple poses: one version of the algorithm detects only one person in an image or video, and another version detects several people.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://medium.com/tensorflow/real-time-human-pose-estimation-in-the-browser-with-tensorflow-js-7dd0bc881cd5" rel="noopener noreferrer"&gt;https://medium.com/tensorflow/real-time-human-pose-estimation-in-the-browser-with-tensorflow-js-7dd0bc881cd5&lt;/a&gt;&lt;/p&gt;




&lt;p&gt;As you can see, there are some interesting options if you don’t want to start coding your image processing system from scratch. If you are planning to learn about computer vision, I encourage you to start experimenting with them.&lt;/p&gt;

&lt;p&gt;And that is all! Thanks for reading; this is my first article. I hope you found it useful. I look forward to hearing any feedback or suggestions.&lt;/p&gt;

&lt;p&gt;Connect with me on &lt;a href="https://twitter.com/jesusramirezs" rel="noopener noreferrer"&gt;Twitter&lt;/a&gt; or &lt;a href="https://www.linkedin.com/in/jesusramirezserran/?locale=en_US" rel="noopener noreferrer"&gt;LinkedIn&lt;/a&gt;&lt;/p&gt;

</description>
      <category>javascript</category>
      <category>opensource</category>
      <category>webdev</category>
      <category>development</category>
    </item>
  </channel>
</rss>
