<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Giuseppe Schillaci</title>
    <description>The latest articles on DEV Community by Giuseppe Schillaci (@giuschil).</description>
    <link>https://dev.to/giuschil</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F1277600%2F1b64e3f3-726d-4584-8678-aa978b66fb40.jpeg</url>
      <title>DEV Community: Giuseppe Schillaci</title>
      <link>https://dev.to/giuschil</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/giuschil"/>
    <language>en</language>
    <item>
      <title>Building a Review Scraper with Python using BeautifulSoup and Sentiment Analysis with NLTK</title>
      <dc:creator>Giuseppe Schillaci</dc:creator>
      <pubDate>Sat, 10 Feb 2024 22:13:23 +0000</pubDate>
      <link>https://dev.to/giuschil/building-a-review-scraper-with-python-using-beautifulsoup-and-sentiment-analysis-with-nltk-38f8</link>
      <guid>https://dev.to/giuschil/building-a-review-scraper-with-python-using-beautifulsoup-and-sentiment-analysis-with-nltk-38f8</guid>
      <description>&lt;h3&gt;
  
  
  Introduction
&lt;/h3&gt;

&lt;p&gt;In recent years, analyzing online reviews has become an important activity for many businesses. Understanding customer sentiment helps identify areas for improvement and gauge overall customer satisfaction. In this article, we'll use Python to build a review scraper with BeautifulSoup and analyze the sentiment of the collected reviews with NLTK.&lt;/p&gt;

&lt;h3&gt;
  
  
  Creating the Review Scraper with BeautifulSoup
&lt;/h3&gt;

&lt;p&gt;To begin, we used Python with the BeautifulSoup library to extract reviews of a leading Italian company from its Trustpilot page. BeautifulSoup parses the HTML markup of a web page and lets us pull out the elements of interest, here the title and text of each review. We stored the extracted reviews in a pandas DataFrame for further analysis.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import requests
from bs4 import BeautifulSoup
import pandas as pd

# First and last page to scrape (inclusive)
page_start = 1
page_end = 49

# Collect one dict per review, then build the DataFrame once at the end
rows = []

# Loop through the pages
for page_num in range(page_start, page_end + 1):
    # Construct the URL for the current page
    url = f'https://it.trustpilot.com/review/www.companyname.it?page={page_num}'

    # Make an HTTP request to fetch the page content
    response = requests.get(url, timeout=10)
    if response.status_code != 200:
        print(f"Page {page_num} returned status {response.status_code}, skipping.")
        continue

    # Use BeautifulSoup to parse the HTML of the page
    soup = BeautifulSoup(response.content, 'html.parser')

    # Find all review elements
    reviews = soup.find_all(attrs={"data-review-content": True})

    # Extract the title and text of each review
    for review in reviews:
        title_element = review.find(attrs={"data-service-review-title-typography": True})
        content_element = review.find(attrs={"data-service-review-text-typography": True})

        if title_element and content_element:
            rows.append({
                "title": title_element.get_text(strip=True),
                "text": content_element.get_text(strip=True),
            })
        else:
            print("Title or text element not found.")

# DataFrame.append was removed in pandas 2.0, so build the frame from the list
df = pd.DataFrame(rows, columns=["title", "text"])

# Print the DataFrame with all review data
print(df)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
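
&lt;p&gt;To reuse the scraped reviews without requesting the pages again, they can be written to disk. The following is a minimal sketch using Python's standard csv module. The function names save_reviews and load_reviews, the file name reviews.csv, and the sample rows are hypothetical and only stand in for real scraped data.&lt;/p&gt;

```python
import csv
import os
import tempfile

FIELDNAMES = ["title", "text"]


def save_reviews(rows, path):
    """Write a list of {"title", "text"} dicts to a UTF-8 CSV file."""
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=FIELDNAMES)
        writer.writeheader()
        writer.writerows(rows)


def load_reviews(path):
    """Read the CSV back into a list of dicts, one per review."""
    with open(path, newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle))


# Hypothetical sample rows standing in for scraped reviews
sample_rows = [
    {"title": "Great service", "text": "Fast delivery, very happy."},
    {"title": "Disappointed", "text": "The package arrived late,\nand damaged."},
]

# Write to a temporary directory so the sketch leaves no files behind in the project
csv_path = os.path.join(tempfile.mkdtemp(), "reviews.csv")
save_reviews(sample_rows, csv_path)
restored = load_reviews(csv_path)
print(len(restored), "reviews restored")
```

&lt;p&gt;Opening the file with newline="" lets the csv module handle line breaks inside review text correctly. When the reviews are already in a pandas DataFrame, df.to_csv("reviews.csv", index=False) is a one-line alternative.&lt;/p&gt;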



&lt;h3&gt;
  
  
  Review Analysis with NLTK
&lt;/h3&gt;

&lt;p&gt;Once the reviews were extracted, we turned to the Natural Language Toolkit (NLTK), a widely used Python library for Natural Language Processing (NLP). NLTK provides a range of tools for text analysis, including sentiment analysis.&lt;/p&gt;

&lt;p&gt;We used NLTK's SentimentIntensityAnalyzer, which is based on the VADER lexicon, to assess the sentiment of the reviews. For each review it returns a compound score between -1 (most negative) and +1 (most positive). We label a review positive when the score is at least 0.05, negative when it is at most -0.05, and neutral otherwise. This gave us a clear picture of customer sentiment towards the company.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;
import nltk
from nltk.sentiment import SentimentIntensityAnalyzer

# Download the VADER lexicon for sentiment analysis
nltk.download('vader_lexicon')

# Create a SentimentIntensityAnalyzer object
sid = SentimentIntensityAnalyzer()

# Define a function to get the sentiment of a text
def get_sentiment(text):
    # Calculate the sentiment scores of the text
    scores = sid.polarity_scores(text)
    # Determine the sentiment based on the compound score
    if scores['compound'] &amp;gt;= 0.05:
        return 'positive'
    elif scores['compound'] &amp;lt;= -0.05:
        return 'negative'
    else:
        return 'neutral'

# Label every review; the chart code reads this 'sentiment' column
df['sentiment'] = df['text'].apply(get_sentiment)

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Visualizing the Results
&lt;/h3&gt;

&lt;p&gt;Finally, we used the labeled data to create bar and pie charts showing the percentages of negative, positive, and neutral reviews. These charts give a visual summary of the overall sentiment and make the balance between the three categories easy to see. The snippet here draws the pie chart.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import matplotlib.pyplot as plt

# Count unique values in the 'sentiment' column
value_counts = df['sentiment'].value_counts()

# Define colors for each category
colors = {'positive': 'green', 'negative': 'red', 'neutral': 'blue'}

# Create a pie chart using the defined colors
plt.pie(value_counts, labels=value_counts.index, colors=[colors[value] for value in value_counts.index], autopct='%1.1f%%')

# Add title
plt.title('Sentiment Analysis of Reviews for Company XYZ')

# Show the chart
plt.show()


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
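
&lt;p&gt;The percentages that the charts display can also be computed directly, which is handy for a quick check or a text report. The sketch below uses only the standard library. The function name sentiment_percentages and the sample labels are hypothetical; in practice the labels would come from the sentiment column.&lt;/p&gt;

```python
from collections import Counter


def sentiment_percentages(labels):
    """Return the percentage share of each sentiment label, rounded to 1 decimal."""
    counts = Counter(labels)
    total = sum(counts.values())
    if total == 0:
        # No reviews means there is nothing to report
        return {}
    return {label: round(100 * count / total, 1) for label, count in counts.items()}


# Hypothetical labels, as get_sentiment might produce for eight reviews
sample_labels = ["positive"] * 5 + ["negative"] * 2 + ["neutral"]

shares = sentiment_percentages(sample_labels)
for label, pct in sorted(shares.items()):
    print(f"{label}: {pct}%")
```

&lt;p&gt;The same dictionary could feed a bar chart, for example by passing its keys and values to plt.bar.&lt;/p&gt;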



&lt;h3&gt;
  
  
  Conclusion
&lt;/h3&gt;

&lt;p&gt;In this article, we've seen how to use Python with the BeautifulSoup and NLTK libraries to build a review scraper and analyze the sentiment of online reviews. Together, the two libraries let us collect the reviews, label each one as positive, negative, or neutral, and visualize the resulting distribution.&lt;/p&gt;

&lt;p&gt;With similar techniques, businesses can monitor customer feedback over time and make informed decisions to improve the customer experience. Combining web scraping with sentiment analysis is a practical approach to online reputation monitoring and customer relationship management.&lt;/p&gt;

</description>
      <category>python</category>
      <category>data</category>
      <category>webscraper</category>
      <category>tutorial</category>
    </item>
  </channel>
</rss>
