<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Bruno Luvizotto</title>
    <description>The latest articles on DEV Community by Bruno Luvizotto (@brudhu).</description>
    <link>https://dev.to/brudhu</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F377795%2F001870f5-21cc-4f26-8bca-af2658a7934f.jpeg</url>
      <title>DEV Community: Bruno Luvizotto</title>
      <link>https://dev.to/brudhu</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/brudhu"/>
    <language>en</language>
    <item>
      <title>Brazilian News Sentiment Analysis</title>
      <dc:creator>Bruno Luvizotto</dc:creator>
      <pubDate>Sat, 02 May 2020 18:55:13 +0000</pubDate>
      <link>https://dev.to/brudhu/brazilian-news-sentiment-analysis-2e7e</link>
      <guid>https://dev.to/brudhu/brazilian-news-sentiment-analysis-2e7e</guid>
      <description>&lt;p&gt;Disclaimer: this article describes a project that uses the &lt;a href="https://cloud.google.com/natural-language#natural-language-api-demo"&gt;Google Language Sentiment Analysis API&lt;/a&gt;; it doesn't train any machine learning model.&lt;/p&gt;

&lt;h1&gt;
  
  
  Introduction
&lt;/h1&gt;

&lt;p&gt;As a side project, I decided to run sentiment analysis on the headlines of some of the most important Brazilian news agencies. I wanted both to test Google's API and to check whether the sentiment of the headlines differs significantly between news agencies.&lt;/p&gt;

&lt;h1&gt;
  
  
  Architecture
&lt;/h1&gt;

&lt;p&gt;The architecture decisions were based on two criteria:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Lowest price&lt;/li&gt;
&lt;li&gt;Least work&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Database
&lt;/h2&gt;

&lt;p&gt;For the database I chose Google's Firestore (a non-relational database), for no special reason other than "I'm already using GCP (Google Cloud Platform) for the sentiment analysis".&lt;/p&gt;

&lt;p&gt;The database has three collections: &lt;code&gt;websites&lt;/code&gt;, &lt;code&gt;keywords&lt;/code&gt; and &lt;code&gt;sentiments&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;The documents in the collections have the following fields:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;websites&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;name: the website's name&lt;/li&gt;
&lt;li&gt;regex: regex used for scraping the website's headlines&lt;/li&gt;
&lt;li&gt;url: the website's URL&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;
&lt;li&gt;

&lt;p&gt;keywords (that we want to scrape):&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;value: the string that we are looking for on the news agencies' websites&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;
&lt;li&gt;

&lt;p&gt;sentiments:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;headline: the original headline analyzed&lt;/li&gt;
&lt;li&gt;headlineEnglish: headline translated to English (we'll talk about that later)&lt;/li&gt;
&lt;li&gt;isOnline: boolean that indicates if the headline is still being displayed on the website&lt;/li&gt;
&lt;li&gt;keywords: array with the keywords found in the headline&lt;/li&gt;
&lt;li&gt;onlineStartDate: timestamp of the first time the headline has been seen on the website&lt;/li&gt;
&lt;li&gt;onlineEndDate: timestamp of the last time the headline has been seen on the website&lt;/li&gt;
&lt;li&gt;onlineTotalTimeMS: the difference between the end and start dates (in milliseconds)&lt;/li&gt;
&lt;li&gt;sentimentScore: score of the sentiment analyzed (-1 to -0.25 means a negative sentiment, -0.25 to 0.25 a neutral sentiment and 0.25 to 1 a positive sentiment)&lt;/li&gt;
&lt;li&gt;sentimentMagnitude: the magnitude of the sentiment analyzed&lt;/li&gt;
&lt;li&gt;website: the website's name (from where the headline has been scraped)&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;
&lt;/ul&gt;
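
&lt;p&gt;The &lt;code&gt;sentimentScore&lt;/code&gt; ranges above can be turned into labels with a small helper. This is only a sketch in Python: the function name &lt;code&gt;classify&lt;/code&gt; is hypothetical, and the handling of scores that fall exactly on a boundary is an assumption, since the ranges above overlap at -0.25 and 0.25.&lt;/p&gt;

```python
from bisect import bisect_right

# Boundaries taken from the sentimentScore description above.
BOUNDARIES = [-0.25, 0.25]
LABELS = ["negative", "neutral", "positive"]


def classify(score):
    """Map a sentiment score in [-1, 1] to a label.

    Assumption: a score exactly on a boundary goes to the upper bucket
    (-0.25 counts as neutral, 0.25 counts as positive).
    """
    # bisect_right returns how many boundaries are at or below the score,
    # which is exactly the index of the matching label.
    return LABELS[bisect_right(BOUNDARIES, score)]


print(classify(-0.8), classify(0.0), classify(0.6))
```

&lt;p&gt;Keeping the boundaries in one list means the same thresholds can be reused wherever the scores are grouped.&lt;/p&gt;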

&lt;h2&gt;
  
  
  Node.js Job
&lt;/h2&gt;

&lt;p&gt;All the actual work is done by a Node.js script (&lt;a href="https://github.com/Brudhu/politicians_analysis"&gt;https://github.com/Brudhu/politicians_analysis&lt;/a&gt;). The script does the following:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Get all the info it needs (like websites info, keywords etc.) from Firestore&lt;/li&gt;
&lt;li&gt;Scrape the websites to get the headlines (using puppeteer and the regex stored on Firestore)&lt;/li&gt;
&lt;li&gt;Pick headlines that have at least one of the keywords&lt;/li&gt;
&lt;li&gt;Check which of the scraped headlines have not been analyzed yet&lt;/li&gt;
&lt;li&gt;Translate headlines to English (using an API from Azure). This is the reason for the &lt;code&gt;headlineEnglish&lt;/code&gt; field: in a quick test of the sentiment analysis API I realized it works a lot better with English sentences than with Portuguese ones&lt;/li&gt;
&lt;li&gt;Analyze the sentiment of the headline translated to English (GCP Language API)&lt;/li&gt;
&lt;li&gt;Insert new sentiments in the "sentiments" collection&lt;/li&gt;
&lt;li&gt;Update sentiments that are not online anymore&lt;/li&gt;
&lt;/ol&gt;
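
&lt;p&gt;Steps 3 and 4 are plain filtering and can be sketched in a few lines. The real job is a Node.js script; the snippet below is written in Python only for illustration, and the function name, its parameters and the case-insensitive substring match are all assumptions rather than the repository's actual code.&lt;/p&gt;

```python
def pick_new_headlines(scraped, keywords, already_analyzed):
    """Return (headline, matched_keywords) pairs for steps 3 and 4."""
    picked = []
    for headline in scraped:
        # Step 3: keep headlines that mention at least one keyword.
        matched = [k for k in keywords if k.lower() in headline.lower()]
        # Step 4: skip headlines that were already analyzed and stored.
        if matched and headline not in already_analyzed:
            picked.append((headline, matched))
    return picked


scraped = ["Bolsonaro amplia lista", "Chuva em SP", "Lula e Moro"]
stored = {"Lula e Moro"}
print(pick_new_headlines(scraped, ["Bolsonaro", "Moro", "Lula"], stored))
```

&lt;p&gt;Only the headlines returned here would need to go through translation and sentiment analysis, which keeps the number of paid API calls low.&lt;/p&gt;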

&lt;p&gt;I decided to run this job every 30 minutes (not more often because I don't want to spend too much on cloud resources).&lt;/p&gt;

&lt;p&gt;I had two options to host the job: GCP (again) and Heroku. I know there are thousands of options, but these are the ones I have the most experience with. I decided to go with Heroku and the Heroku Scheduler add-on (the scheduler is responsible for running the script periodically). It's free for now.&lt;/p&gt;

&lt;h2&gt;
  
  
  Pricing
&lt;/h2&gt;

&lt;p&gt;While the job on Heroku is free, the project on GCP is costing me 0.01 BRL per day.&lt;/p&gt;

&lt;h1&gt;
  
  
  First Results
&lt;/h1&gt;

&lt;p&gt;&lt;strong&gt;To get the data from Firestore and analyze it, I wrote a Python script (will release it later).&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;For the first tests I set up two news agencies:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://www.uol.com.br"&gt;UOL&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://g1.globo.com/"&gt;G1&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The keywords are:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Bolsonaro (Brazilian president)&lt;/li&gt;
&lt;li&gt;Moro (Former Brazilian minister of justice - removed from the ministry in April)&lt;/li&gt;
&lt;li&gt;Lula (Former Brazilian president)&lt;/li&gt;
&lt;li&gt;Dória (Governor of São Paulo state in Brazil)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In less than 14 days I got 571 headlines analyzed: 366 from UOL (the first one I started collecting data from) and 205 from G1.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The only keyword that has enough data for some analysis is "Bolsonaro", which makes sense since he is the current president.&lt;/strong&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Top Positive and Negative Sentiment Headlines
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Most positive sentiment headline on UOL (Portuguese and the translated version in English):
&lt;/h3&gt;

&lt;blockquote&gt;
&lt;p&gt;Opinião: Com a PF, Bolsonaro cumpre a profecia de Jucá&lt;br&gt;
Opinion: With PF, Bolsonaro fulfills the prophecy of Jucá&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  Most positive sentiment headline on G1:
&lt;/h3&gt;

&lt;blockquote&gt;
&lt;p&gt;Bolsonaro amplia lista de atividades consideradas essenciais na pandemia&lt;br&gt;
Bolsonaro expands list of activities considered essential in the pandemic&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  Most negative sentiment headline on UOL:
&lt;/h3&gt;

&lt;blockquote&gt;
&lt;p&gt;Bolsonaro culpa governadores: 'Essa conta não é minha'&lt;br&gt;
Bolsonaro blames governors: 'This account is not mine'&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;strong&gt;In this case we can see an error in the translation. I would say a better translation would be "Bolsonaro blames governors: 'This bill is not mine'"&lt;/strong&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Most negative sentiment headline on G1:
&lt;/h3&gt;

&lt;blockquote&gt;
&lt;p&gt;Procuradora diz que Bolsonaro violou a Constituição ao determinar revogação de portarias sobre armas&lt;br&gt;
Prosecutor says Bolsonaro violated the Constitution by determining repeal of ordinances on weapons&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  Word Clouds
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;The word clouds are displaying only words with 3 or more occurrences. The only keyword analyzed so far is "Bolsonaro".&lt;/li&gt;
&lt;/ul&gt;
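
&lt;p&gt;The "3 or more occurrences" filter can be sketched as a word count over the headlines. The analysis script has not been released, so this is only an illustration of the idea: the function name &lt;code&gt;frequent_words&lt;/code&gt; and the tokenization with &lt;code&gt;\w+&lt;/code&gt; are assumptions.&lt;/p&gt;

```python
import re
from collections import Counter
from operator import ge


def frequent_words(headlines, min_count=3):
    """Count words across headlines and keep those seen min_count times or more."""
    counts = Counter()
    for headline in headlines:
        # \w+ also matches accented letters, which matters for Portuguese.
        counts.update(re.findall(r"\w+", headline.lower()))
    # ge(n, min_count) is true when n is greater than or equal to min_count.
    return {word: n for word, n in counts.items() if ge(n, min_count)}


print(frequent_words(["a b", "a c", "a b", "b"]))
```

&lt;p&gt;The resulting dictionary of word frequencies is the kind of input a word cloud library typically accepts.&lt;/p&gt;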

&lt;p&gt;The word cloud of every single headline analyzed is the following (it's in Portuguese, don't kill me):&lt;br&gt;
&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--GzYoDOZm--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/hi9fxemvt8tn9kmdh31o.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--GzYoDOZm--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/hi9fxemvt8tn9kmdh31o.png" alt="Word cloud with every headline analyzed"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Word cloud of positive sentiments:&lt;br&gt;
&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--L7A3fRng--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/kj4m1gf2pph4ltar57sx.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--L7A3fRng--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/kj4m1gf2pph4ltar57sx.png" alt="Word cloud with positive headlines"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Word cloud of negative sentiments:&lt;br&gt;
&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--eR5BBzDK--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/vjcqso3zx2q4sp8kmgum.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--eR5BBzDK--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/vjcqso3zx2q4sp8kmgum.png" alt="Word cloud with negative headlines"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Word cloud of neutral sentiments:&lt;br&gt;
&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--XcDCvHlc--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/hvbd9v2zm09m7kjyr60o.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--XcDCvHlc--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/hvbd9v2zm09m7kjyr60o.png" alt="Word cloud with neutral headlines"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Word cloud of positive sentiments on UOL:&lt;br&gt;
&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--IntMwLNc--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/q8j3k73czfcwholn832o.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--IntMwLNc--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/q8j3k73czfcwholn832o.png" alt="Word cloud with positive headlines on UOL"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Word cloud of negative sentiments on UOL:&lt;br&gt;
&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--xFvBhgI5--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/871v4e2tv7273i107ckh.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--xFvBhgI5--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/871v4e2tv7273i107ckh.png" alt="Word cloud with negative headlines on UOL"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Word cloud of neutral sentiments on UOL:&lt;br&gt;
&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--L77rYhHj--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/0mkfy5p853ya3lzwntwo.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--L77rYhHj--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/0mkfy5p853ya3lzwntwo.png" alt="Word cloud with neutral headlines on UOL"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Word cloud of positive sentiments on G1:&lt;br&gt;
&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--3-P8Co76--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/xoop24c9zc73wb1bnv3n.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--3-P8Co76--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/xoop24c9zc73wb1bnv3n.png" alt="Word cloud with positive headlines on G1"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Word cloud of negative sentiments on G1:&lt;br&gt;
&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--wnIwyzpu--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/lboomcnqmt0jkasesqoh.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--wnIwyzpu--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/lboomcnqmt0jkasesqoh.png" alt="Word cloud with negative headlines on G1"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Word cloud of neutral sentiments on G1:&lt;br&gt;
&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--KJF0jDWa--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/0j0i0p04iin7d1gorp2e.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--KJF0jDWa--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/0j0i0p04iin7d1gorp2e.png" alt="Word cloud with neutral headlines on G1"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Plots
&lt;/h2&gt;

&lt;p&gt;Now that we have an idea of what the word clouds look like under several conditions, let's take a look at some plots. The first one is a box plot of the sentiments grouped by website: &lt;br&gt;
&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--vytJ7WTU--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/92afbcjgp8f5scnwcx0x.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--vytJ7WTU--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/92afbcjgp8f5scnwcx0x.png" alt="Sentiments boxplot by website"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;They look very similar: both are largely concentrated around the neutral area and both medians are pretty close to 0, shifted slightly toward negative sentiments. They are not exactly the same, though: the minimum and maximum whiskers of UOL's box plot are longer than the ones from G1. Let's take a closer look.&lt;/p&gt;

&lt;h3&gt;
  
  
  Percentages
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;Total:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Negative: 26.8%&lt;/li&gt;
&lt;li&gt;Neutral: 57.4%&lt;/li&gt;
&lt;li&gt;Positive: 15.8%&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;
&lt;li&gt;

&lt;p&gt;UOL:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Negative: 25.3%&lt;/li&gt;
&lt;li&gt;Neutral: 58.6%&lt;/li&gt;
&lt;li&gt;Positive: 16.1%&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;
&lt;li&gt;

&lt;p&gt;G1:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Negative: 29.9%&lt;/li&gt;
&lt;li&gt;Neutral: 55.2%&lt;/li&gt;
&lt;li&gt;Positive: 14.9%&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--b0WKOJ_Q--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/zfq0v2wg1s471tk226i7.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--b0WKOJ_Q--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/zfq0v2wg1s471tk226i7.png" alt="Percentages of sentiments by website"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;While they are still similar, we can see that G1 has more negative sentiment headlines than UOL, while UOL has more neutral and positive sentiment headlines.&lt;/p&gt;
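
&lt;p&gt;Percentages like the ones above can be computed by labelling each score and counting the labels. This is a sketch, not the released analysis script: the function name &lt;code&gt;label_percentages&lt;/code&gt; is hypothetical, the thresholds are the ones from the &lt;code&gt;sentimentScore&lt;/code&gt; field description, and the boundary handling is an assumption.&lt;/p&gt;

```python
from bisect import bisect_right
from collections import Counter

LABELS = ["negative", "neutral", "positive"]


def label_percentages(scores):
    """Return the share of each label in percent, rounded to one decimal.

    Assumes a non-empty list of scores; boundary scores go to the upper bucket.
    """
    counts = Counter(LABELS[bisect_right([-0.25, 0.25], s)] for s in scores)
    return {label: round(100 * counts[label] / len(scores), 1) for label in LABELS}


print(label_percentages([-0.5, 0.0, 0.1, 0.6]))
```

&lt;p&gt;Running the same function once per website gives the per-agency breakdown.&lt;/p&gt;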

&lt;h3&gt;
  
  
  Histograms
&lt;/h3&gt;

&lt;p&gt;The histogram with all the sentiments for the "Bolsonaro" keyword is the following:&lt;br&gt;
&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--rTeYhHxP--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/qz135c9ekf8c22a6x5p8.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--rTeYhHxP--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/qz135c9ekf8c22a6x5p8.png" alt='Histogram with all the headlines containing "Bolsonaro"'&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;In the histogram we can confirm what we saw before: we have more negative than positive sentiments, but neutral sentiments are way more common.&lt;/p&gt;

&lt;p&gt;Now let's break the sentiments by website:&lt;br&gt;
&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--5TjUByx4--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/19mwwtu1nihslh468xdb.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--5TjUByx4--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/19mwwtu1nihslh468xdb.png" alt='Histogram with all the headlines containing "Bolsonaro" collected from UOL'&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--iDUP0hTV--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/207f19lu17xmcitndfg4.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--iDUP0hTV--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/207f19lu17xmcitndfg4.png" alt='Histogram with all the headlines containing "Bolsonaro" collected from G1'&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;And the two previous histograms combined in the same plot:&lt;br&gt;
&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--gY2i78AN--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/ecd84fo9vos8qczl4n19.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--gY2i78AN--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/ecd84fo9vos8qczl4n19.png" alt='Histogram with all the headlines containing "Bolsonaro" by website'&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;It looks like while G1 has proportionally more negative sentiments than UOL (like we saw on the percentages before), UOL tends to be a little more "extremist", with more very negative and very positive sentiment headlines.&lt;/p&gt;

&lt;p&gt;Now let's break the histograms even more: by positive and negative sentiments for each website.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--oOiGYlOz--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/z8xz953vyaiydul08e4h.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--oOiGYlOz--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/z8xz953vyaiydul08e4h.png" alt='Histogram with the positive headlines containing "Bolsonaro" on UOL'&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--1zrOv3NZ--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/p0dxrkqz4dcfmnnjzjbe.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--1zrOv3NZ--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/p0dxrkqz4dcfmnnjzjbe.png" alt='Histogram with the positive headlines containing "Bolsonaro" on G1'&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;UOL has more headlines with sentiments &amp;gt;= 0.7 (very positive sentiments).&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--uK-dolPX--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/k3dtnk73fkv6x0v3k9dj.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--uK-dolPX--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/k3dtnk73fkv6x0v3k9dj.png" alt='Histogram with the negative headlines containing "Bolsonaro" on UOL'&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--cU2WyZc5--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/gdygksc6lvpa2x11co1c.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--cU2WyZc5--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/gdygksc6lvpa2x11co1c.png" alt='Histogram with the negative headlines containing "Bolsonaro" on G1'&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Even though we know that G1 has more headlines with negative sentiments, these histograms show that UOL has more headlines with sentiments &amp;lt;= -0.6 (very negative sentiments).&lt;/p&gt;

&lt;h1&gt;
  
  
  Conclusion
&lt;/h1&gt;

&lt;p&gt;While it was a lot of fun to work on this project and I learned new things, I have to point out some of the flaws here:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The translation from Portuguese to English (Azure) is very good, but not perfect for some cases&lt;/li&gt;
&lt;li&gt;Headlines related to Brazilian politics sometimes have a specific context that would be useful for the translation and Azure doesn't get it&lt;/li&gt;
&lt;li&gt;Some of the headlines were written by columnists and may be too informal to make sense after being translated. For example, "Batata assou no fogo do parquinho dos Bolsonaro" was translated to "Potato baked in the fire of bolsonaro playground". This sentence contains a Brazilian expression and means, in a very simplistic translation, something like "The Bolsonaros are in a bad situation".&lt;/li&gt;
&lt;li&gt;Getting way more negative than positive sentiments does not necessarily reflect bias on the part of the news agencies. Many headlines are about problems related to Covid-19 and may be inherently negative (some are not).&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Both agencies have similar results - not exactly the same, but very similar.&lt;/p&gt;

&lt;h1&gt;
  
  
  Next steps
&lt;/h1&gt;

&lt;p&gt;Recently I added a new news agency (&lt;a href="https://www.r7.com/"&gt;R7&lt;/a&gt;) and will try to update the data and analysis once I have more relevant data - maybe with new news agencies and new keywords.&lt;/p&gt;

</description>
      <category>python</category>
      <category>node</category>
      <category>gcp</category>
      <category>machinelearning</category>
    </item>
    <item>
      <title>Creating my first Node.js app</title>
      <dc:creator>Bruno Luvizotto</dc:creator>
      <pubDate>Sat, 02 May 2020 00:09:47 +0000</pubDate>
      <link>https://dev.to/brudhu/creating-my-first-node-js-app-30kk</link>
      <guid>https://dev.to/brudhu/creating-my-first-node-js-app-30kk</guid>
      <description>&lt;p&gt;This tutorial was written on Linux, so the commands as shown won't work on a Windows computer. While it's not a requirement, if you are planning to become a developer, I strongly recommend using a Unix-based operating system.&lt;/p&gt;

&lt;p&gt;The only official requirement to run a Node project is having Node installed on your computer, but this is not what happens in the real world. To make it easier to deploy an application, some tools are used – npm in this case (Node Package Manager).&lt;/p&gt;

&lt;p&gt;The first step is to install NPM (and the way to do it depends on your Linux distribution or Operating System).&lt;/p&gt;

&lt;h1&gt;
  
  
  Installing NPM (Node Package Manager)
&lt;/h1&gt;

&lt;p&gt;On Arch Linux, npm is supplied by the npm community package:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="o"&gt;[&lt;/span&gt;brudhu@brudhu-manjaro tutorials]&lt;span class="nv"&gt;$ &lt;/span&gt;&lt;span class="nb"&gt;sudo &lt;/span&gt;pacman &lt;span class="nt"&gt;-Sy&lt;/span&gt; npm
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;On Ubuntu (and other distributions), the instructions can be found here: &lt;a href="https://github.com/nodesource/distributions/blob/master/README.md"&gt;https://github.com/nodesource/distributions/blob/master/README.md&lt;/a&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="o"&gt;[&lt;/span&gt;brudhu@brudhu-manjaro tutorials]&lt;span class="nv"&gt;$ &lt;/span&gt;curl &lt;span class="nt"&gt;-sL&lt;/span&gt; https://deb.nodesource.com/setup_14.x | &lt;span class="nb"&gt;sudo&lt;/span&gt; &lt;span class="nt"&gt;-E&lt;/span&gt; bash -
&lt;span class="o"&gt;[&lt;/span&gt;brudhu@brudhu-manjaro tutorials]&lt;span class="nv"&gt;$ &lt;/span&gt;&lt;span class="nb"&gt;sudo &lt;/span&gt;apt-get &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;-y&lt;/span&gt; nodejs
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;h1&gt;
  
  
  Creating the app using NPM
&lt;/h1&gt;

&lt;p&gt;Create a directory for your project and enter the directory:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="o"&gt;[&lt;/span&gt;brudhu@brudhu-manjaro tutorials]&lt;span class="nv"&gt;$ &lt;/span&gt;&lt;span class="nb"&gt;mkdir &lt;/span&gt;tutorial-project-1
&lt;span class="o"&gt;[&lt;/span&gt;brudhu@brudhu-manjaro tutorial]&lt;span class="nv"&gt;$ &lt;/span&gt;&lt;span class="nb"&gt;cd &lt;/span&gt;tutorial-project-1
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;Once you are in the directory, create the app using NPM:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="o"&gt;[&lt;/span&gt;brudhu@brudhu-manjaro tutorial-project-1]&lt;span class="nv"&gt;$ &lt;/span&gt;npm init
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;After you run the init command, it will ask some questions about your project (you can just press Enter for all of them for this project):&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;package name: the name of your project&lt;/li&gt;
&lt;li&gt;version: the version of your project&lt;/li&gt;
&lt;li&gt;description: the description of your project&lt;/li&gt;
&lt;li&gt;entry point: the file that will be called to run your project&lt;/li&gt;
&lt;li&gt;test command: a command to run tests on your project&lt;/li&gt;
&lt;li&gt;git repository: the git repository of your project, in case it already has one&lt;/li&gt;
&lt;li&gt;keywords: keywords of your project&lt;/li&gt;
&lt;li&gt;author: the author's name&lt;/li&gt;
&lt;li&gt;license: the license type of the project&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This is what I answered for this tutorial. Once you answer all the questions, it will create a package.json file, as shown below:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="o"&gt;[&lt;/span&gt;brudhu@brudhu-manjaro tutorial-project-1]&lt;span class="nv"&gt;$ &lt;/span&gt;npm init
This utility will walk you through creating a package.json file.
It only covers the most common items, and tries to guess sensible defaults.

See &lt;span class="sb"&gt;`&lt;/span&gt;npm &lt;span class="nb"&gt;help &lt;/span&gt;json&lt;span class="sb"&gt;`&lt;/span&gt; &lt;span class="k"&gt;for &lt;/span&gt;definitive documentation on these fields
and exactly what they &lt;span class="k"&gt;do&lt;/span&gt;&lt;span class="nb"&gt;.&lt;/span&gt;

Use &lt;span class="sb"&gt;`&lt;/span&gt;npm &lt;span class="nb"&gt;install&lt;/span&gt; &amp;lt;pkg&amp;gt;&lt;span class="sb"&gt;`&lt;/span&gt; afterwards to &lt;span class="nb"&gt;install &lt;/span&gt;a package and
save it as a dependency &lt;span class="k"&gt;in &lt;/span&gt;the package.json file.

Press ^C at any &lt;span class="nb"&gt;time &lt;/span&gt;to quit.
package name: &lt;span class="o"&gt;(&lt;/span&gt;tutorial-project-1&lt;span class="o"&gt;)&lt;/span&gt;
version: &lt;span class="o"&gt;(&lt;/span&gt;1.0.0&lt;span class="o"&gt;)&lt;/span&gt;
description: My first Node.js app project
entry point: &lt;span class="o"&gt;(&lt;/span&gt;index.js&lt;span class="o"&gt;)&lt;/span&gt;
&lt;span class="nb"&gt;test command&lt;/span&gt;:
git repository:
keywords: node tutorial
author: Bruno Luvizotto
license: &lt;span class="o"&gt;(&lt;/span&gt;ISC&lt;span class="o"&gt;)&lt;/span&gt;
About to write to /home/brudhu/tutorials/tutorial-project-1/package.json:

&lt;span class="o"&gt;{&lt;/span&gt;
  &lt;span class="s2"&gt;"name"&lt;/span&gt;:&lt;span class="s2"&gt;"tutorial-project-1"&lt;/span&gt;,
  &lt;span class="s2"&gt;"version"&lt;/span&gt;:&lt;span class="s2"&gt;"1.0.0"&lt;/span&gt;,
  &lt;span class="s2"&gt;"description"&lt;/span&gt;:&lt;span class="s2"&gt;"My first Node.js app project"&lt;/span&gt;,
  &lt;span class="s2"&gt;"main"&lt;/span&gt;:&lt;span class="s2"&gt;"index.js"&lt;/span&gt;,
  &lt;span class="s2"&gt;"scripts"&lt;/span&gt;:&lt;span class="o"&gt;{&lt;/span&gt;
    &lt;span class="s2"&gt;"test"&lt;/span&gt;:&lt;span class="s2"&gt;"echo &lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;Error: no test specified&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt; &amp;amp;&amp;amp; exit 1"&lt;/span&gt;
  &lt;span class="o"&gt;}&lt;/span&gt;,
  &lt;span class="s2"&gt;"keywords"&lt;/span&gt;:[
    &lt;span class="s2"&gt;"node"&lt;/span&gt;,
    &lt;span class="s2"&gt;"tutorial"&lt;/span&gt;
  &lt;span class="o"&gt;]&lt;/span&gt;,
  &lt;span class="s2"&gt;"author"&lt;/span&gt;:&lt;span class="s2"&gt;"Bruno Luvizotto"&lt;/span&gt;,
  &lt;span class="s2"&gt;"license"&lt;/span&gt;:&lt;span class="s2"&gt;"ISC"&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;

Is this OK? &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;yes&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;The package.json file is the descriptor of your project: it stores all the information you entered during the npm init command and will also store information on the packages used by the project (dependencies).&lt;/p&gt;

&lt;p&gt;If you list the files in the project's directory, there will be the new package.json file:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="o"&gt;[&lt;/span&gt;brudhu@brudhu-manjaro tutorial-project-1]&lt;span class="nv"&gt;$ &lt;/span&gt;&lt;span class="nb"&gt;ls
&lt;/span&gt;package.json
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;Now that we have the project descriptor (aka package.json), let's create the first file (the entry point of the project):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="o"&gt;[&lt;/span&gt;brudhu@brudhu-manjaro tutorial-project-1]&lt;span class="nv"&gt;$ &lt;/span&gt;&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s1"&gt;'console.log("I did it! My first project!")'&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; index.js
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;At this point, we have the package.json and index.js files. The next step is to create a start script in your package.json file. Add the line &lt;code&gt;"start": "node index.js"&lt;/code&gt; under “scripts”. Don't forget to add a comma at the end of the previous line, otherwise the file is no longer valid JSON:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"name"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"tutorial-project-1"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"version"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"1.0.0"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"description"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"My first Node.js app project"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"main"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"index.js"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"scripts"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"test"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"echo &lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;Error: no test specified&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt; &amp;amp;&amp;amp; exit 1"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"start"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"node index.js"&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"keywords"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="s2"&gt;"node"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="s2"&gt;"tutorial"&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"author"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Bruno Luvizotto"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"license"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"ISC"&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;The scripts defined under “scripts” in the package.json file can be run with the npm run command (in this case, &lt;code&gt;npm run test&lt;/code&gt; or &lt;code&gt;npm run start&lt;/code&gt;).&lt;/p&gt;
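
&lt;p&gt;Running &lt;code&gt;npm run&lt;/code&gt; with no script name lists the scripts that are available. To make the link between package.json and npm run more concrete, here is a rough sketch that reads the “scripts” section directly. The file name &lt;code&gt;list-scripts.js&lt;/code&gt; and the helper &lt;code&gt;readScripts&lt;/code&gt; are hypothetical names, not something npm provides:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight javascript"&gt;&lt;code&gt;// list-scripts.js (hypothetical file name, for illustration only)
const fs = require("fs");
const path = require("path");

// Hypothetical helper: returns the "scripts" object of the package.json
// in projectDir, or an empty object when no scripts are defined.
function readScripts(projectDir) {
  const file = path.join(projectDir, "package.json");
  const pkg = JSON.parse(fs.readFileSync(file, "utf8"));
  return pkg.scripts || {};
}

// Print one line per script when a package.json exists in the current directory.
if (fs.existsSync(path.join(process.cwd(), "package.json"))) {
  for (const [name, command] of Object.entries(readScripts(process.cwd()))) {
    console.log("npm run " + name + " runs: " + command);
  }
}
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;For the package.json above, this should print one line for the test script and one for the start script, each followed by the shell command stored in the file.&lt;/p&gt;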

&lt;p&gt;Now that the start script is defined and the index.js file exists, we can finally run the project:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="o"&gt;[&lt;/span&gt;brudhu@brudhu-manjaro tutorial-project-1]&lt;span class="nv"&gt;$ &lt;/span&gt;npm run start

&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; tutorial-project-1@1.0.0 start /home/brudhu/tutorials/tutorial-project-1
&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; node index.js

I did it! My first project!
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;Congratulations! This is the very beginning of a Node.js project!&lt;/p&gt;

</description>
      <category>node</category>
      <category>javascript</category>
      <category>npm</category>
      <category>beginners</category>
    </item>
  </channel>
</rss>
