<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Prashant Kumar</title>
    <description>The latest articles on DEV Community by Prashant Kumar (@prashant2018).</description>
    <link>https://dev.to/prashant2018</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F449504%2Fd04e2b8c-4f8f-4d74-ad0a-aecb3c3cdc57.jpg</url>
      <title>DEV Community: Prashant Kumar</title>
      <link>https://dev.to/prashant2018</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/prashant2018"/>
    <language>en</language>
    <item>
      <title>Consuming REST API in GO</title>
      <dc:creator>Prashant Kumar</dc:creator>
      <pubDate>Mon, 02 May 2022 15:17:35 +0000</pubDate>
      <link>https://dev.to/prashant2018/consuming-rest-api-in-go-382f</link>
      <guid>https://dev.to/prashant2018/consuming-rest-api-in-go-382f</guid>
      <description>&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--VWz-0iiV--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/qdaab3fglbywbmzlhlqc.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--VWz-0iiV--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/qdaab3fglbywbmzlhlqc.png" alt="Consuming REST API GO" width="880" height="495"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Consuming a REST API is simple in Go. The &lt;em&gt;net/http&lt;/em&gt; package ships with Go's standard library, and it is what we use in this article.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="k"&gt;package&lt;/span&gt; &lt;span class="n"&gt;main&lt;/span&gt;

&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="s"&gt;"encoding/json"&lt;/span&gt;
    &lt;span class="s"&gt;"fmt"&lt;/span&gt;
    &lt;span class="s"&gt;"io/ioutil"&lt;/span&gt;
    &lt;span class="s"&gt;"log"&lt;/span&gt;
    &lt;span class="s"&gt;"net/http"&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="n"&gt;getMovieDetails&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;movieName&lt;/span&gt; &lt;span class="kt"&gt;string&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;map&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="kt"&gt;string&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="k"&gt;interface&lt;/span&gt;&lt;span class="p"&gt;{}&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;

    &lt;span class="c"&gt;// response is going to be stored in this variable&lt;/span&gt;
    &lt;span class="k"&gt;var&lt;/span&gt; &lt;span class="n"&gt;movieDetailResponse&lt;/span&gt; &lt;span class="k"&gt;map&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="kt"&gt;string&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="k"&gt;interface&lt;/span&gt;&lt;span class="p"&gt;{}&lt;/span&gt;

    &lt;span class="c"&gt;// url to be fetched, replace the apiikey with your own&lt;/span&gt;
    &lt;span class="n"&gt;url&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="s"&gt;"http://www.omdbapi.com/?t="&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;movieName&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="s"&gt;"&amp;amp;apikey=your_key"&lt;/span&gt;

    &lt;span class="c"&gt;// create a new http request&lt;/span&gt;
    &lt;span class="n"&gt;res&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;http&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;url&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c"&gt;// check for errors&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;log&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Fatal&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;err&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="c"&gt;// close the response body when function returns&lt;/span&gt;
    &lt;span class="k"&gt;defer&lt;/span&gt; &lt;span class="n"&gt;res&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Body&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Close&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

    &lt;span class="c"&gt;// read the response body&lt;/span&gt;
    &lt;span class="n"&gt;body&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;ioutil&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ReadAll&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;res&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Body&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;log&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Fatal&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;err&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="c"&gt;// unmarshal the json response&lt;/span&gt;
    &lt;span class="c"&gt;// the body is in []byte format, so we need to convert it to json&lt;/span&gt;
    &lt;span class="c"&gt;// we can use json.Unmarshal(body, &amp;amp;movieDetailResponse) to unmarshal byte array to json&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Unmarshal&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;body&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;movieDetailResponse&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;log&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Fatal&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;err&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="c"&gt;// return the movie details&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;movieDetailResponse&lt;/span&gt;

&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="n"&gt;main&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;movieName&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="s"&gt;"k.g.f chapter 2"&lt;/span&gt;
    &lt;span class="n"&gt;movieDetail&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;getMovieDetails&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;movieName&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="c"&gt;// print the movie details&lt;/span&gt;
    &lt;span class="n"&gt;fmt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Println&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Title"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;movieDetail&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s"&gt;"Title"&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
    &lt;span class="n"&gt;fmt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Println&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Genre"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;movieDetail&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s"&gt;"Genre"&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
    &lt;span class="n"&gt;fmt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Println&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Plot"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;movieDetail&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s"&gt;"Plot"&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Hope this was insightful. &lt;/p&gt;

&lt;p&gt;For more updates, follow me on &lt;a href="https://twitter.com/prash2018"&gt;Twitter&lt;/a&gt;&lt;/p&gt;

</description>
      <category>go</category>
      <category>rest</category>
      <category>api</category>
    </item>
    <item>
      <title>Web Scraping using Python! Create your own Dataset</title>
      <dc:creator>Prashant Kumar</dc:creator>
      <pubDate>Tue, 20 Jul 2021 17:17:10 +0000</pubDate>
      <link>https://dev.to/prashant2018/web-scraping-using-python-create-your-own-dataset-50n5</link>
      <guid>https://dev.to/prashant2018/web-scraping-using-python-create-your-own-dataset-50n5</guid>
      <description>&lt;p&gt;Machine Learning requires a lot of data and not always it is easy to get the data you want. Have you ever wondered how Kaggle and other such websites provide us with huge datasets? The answer is web scraping. So, let us see how we can extract data from the web.&lt;br&gt;
Let’s assume we are building a model which requires movie information such as title, summary, and rating of a number of movies. When it comes to movies, we know IMDB has the largest database. Let us dig into it.&lt;/p&gt;
&lt;h3&gt;
  What exactly do we do to scrape a webpage?
&lt;/h3&gt;

&lt;p&gt;There’s a pattern in everything. To extract the relevant data, we need to observe the HTML of the web page and find a pattern in it. Let’s go step by step. We will do everything in Python and scrape the data from the following URL:&lt;br&gt;
 &lt;a href="https://www.imdb.com/search/title?release_date=2019&amp;amp;sort=user_rating,desc&amp;amp;ref_=adv_nxt"&gt;https://www.imdb.com/search/title?release_date=2019&amp;amp;sort=user_rating,desc&amp;amp;ref_=adv_nxt&lt;/a&gt; &lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. Install dependencies&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# To download the webpage
pip install requests
# To scrape data from the downloaded webpage
pip install beautifulsoup4
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;2. Download the webpage&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Requests is a popular Python HTTP library. We will use it to download the webpage at the given URL.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import requests
url = "https://www.imdb.com/search/title?release_date=2019&amp;amp;sort=user_rating,desc&amp;amp;ref_=adv_nxt"
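# get() sends an HTTP GET request and returns the server's response
response = requests.get(url)
# Stop early if the server answered with an error status (4xx or 5xx)
response.raise_for_status()
# The HTML of the page is available as text on the response object
response_text = response.text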
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
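
&lt;p&gt;The query string of that URL can also be assembled from a dictionary, which makes it easier to change the release year or the sort order. This is a minimal sketch using only the standard library; the function name &lt;code&gt;build_search_url&lt;/code&gt; and its parameters are assumptions for illustration, not part of the original code.&lt;/p&gt;

```python
from urllib.parse import urlencode

# Base address of the IMDb search page used in this article
BASE_URL = "https://www.imdb.com/search/title"


def build_search_url(release_year, sort_order="user_rating,desc"):
    """Return the search URL for one release year (hypothetical helper)."""
    params = {
        "release_date": str(release_year),
        "sort": sort_order,
        "ref_": "adv_nxt",
    }
    # safe="," keeps the comma in the sort value readable instead of escaping it
    return BASE_URL + "?" + urlencode(params, safe=",")


url = build_search_url(2019)
print(url)
```

&lt;p&gt;The resulting string can be passed to &lt;code&gt;requests.get()&lt;/code&gt; exactly like the hard-coded URL.&lt;/p&gt;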



&lt;p&gt;&lt;strong&gt;3. Inspecting elements and finding the pattern&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The HTML we have downloaded is the page source, which is largely what you see when you right-click and choose Inspect in the browser (content that the page adds later with JavaScript may differ). Let’s right-click on the rating and see how we can extract it.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fcdn.hashnode.com%2Fres%2Fhashnode%2Fimage%2Fupload%2Fv1626586216655%2FiNe-dzemU.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fcdn.hashnode.com%2Fres%2Fhashnode%2Fimage%2Fupload%2Fv1626586216655%2FiNe-dzemU.png" alt="medium1.png"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;When we look closely, we see that the element with the class “&lt;strong&gt;ratings-bar&lt;/strong&gt;” contains the rating of the movie. If we inspect other movies, we find that all of them use the same class name for the ratings on that page. Here, we have found a pattern to extract all the ratings from the page. Similarly, we can extract the summary, title, genre, etc.&lt;/p&gt;

&lt;p&gt;Besides &lt;strong&gt;class&lt;/strong&gt;, you can also select a specific part of the HTML using &lt;strong&gt;id&lt;/strong&gt;, &lt;strong&gt;tags&lt;/strong&gt;, etc.&lt;/p&gt;

&lt;p&gt;Let’s jump into the code!&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;BeautifulSoup&lt;/strong&gt; lets us extract data (more precisely, parse data) from HTML using the class name, id, tags, etc. Isn’t it Beautiful? :-D&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;
from bs4 import BeautifulSoup
# Create a BeautifulSoup object
# response_text -&amp;gt; The downloaded webpage
# lxml -&amp;gt; Used for processing HTML and XML pages
soup = BeautifulSoup(response_text,'lxml')

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;To select content from the page we use &lt;em&gt;CSS selectors&lt;/em&gt;. CSS selectors allow us to select &lt;em&gt;classes&lt;/em&gt;, &lt;em&gt;ids&lt;/em&gt;, &lt;em&gt;tags&lt;/em&gt;, and other HTML elements easily. The selector prefix for a &lt;em&gt;class&lt;/em&gt; is &lt;strong&gt;"."&lt;/strong&gt; and for an &lt;em&gt;ID&lt;/em&gt; it is &lt;strong&gt;"#"&lt;/strong&gt;: to select by class, prefix the class name with &lt;strong&gt;"."&lt;/strong&gt;, and to select by ID, prefix the ID with &lt;strong&gt;"#"&lt;/strong&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# As we saw the rating's class name was "ratings-bar" 
# we prefix "." since its a class
rating_class_selector = ".ratings-bar"
# Extract the all the ratings class
rating_list = soup.select(rating_class_selector)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The “&lt;strong&gt;rating_list&lt;/strong&gt;” returned by &lt;strong&gt;select()&lt;/strong&gt; is a list of objects, one for each &lt;code&gt;&amp;lt;div&amp;gt;&lt;/code&gt; element that has “&lt;strong&gt;ratings-bar&lt;/strong&gt;” as its class name. We need to get the text from within each div element.&lt;/p&gt;

&lt;p&gt;Here’s what a single rating object looks like:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;&amp;lt;div class="ratings-bar"&amp;gt;
&amp;lt;div class="inline-block ratings-imdb-rating" data-value="10" name="ir"&amp;gt;
&amp;lt;span class="global-sprite rating-star imdb-rating"&amp;gt;&amp;lt;/span&amp;gt;
&amp;lt;strong&amp;gt;10.0&amp;lt;/strong&amp;gt;
&amp;lt;/div&amp;gt;
...
&amp;lt;/div&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We need to get the rating value from the &lt;code&gt;&amp;lt;strong&amp;gt;&lt;/code&gt; tag. We can find a tag using the &lt;strong&gt;find('tagName')&lt;/strong&gt; method and get its text using &lt;strong&gt;getText()&lt;/strong&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# This List will store all the ratings
ratings = []
# Iterate through all the ratings object
for rating_object in rating_list:
    # Find the &amp;lt;strong&amp;gt; tag and get the Text
    rating_text = rating_object.find('strong').getText() 
    # Append the rating to the list
    ratings.append(rating_text)
print(ratings)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
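
&lt;p&gt;The values collected in &lt;code&gt;ratings&lt;/code&gt; are strings such as "10.0". A model usually needs numbers, so one possible follow-up step is to convert them to floats. This is a minimal sketch; the helper name &lt;code&gt;to_float_ratings&lt;/code&gt; and the sample values are assumptions for illustration.&lt;/p&gt;

```python
def to_float_ratings(raw_ratings):
    """Convert rating strings to floats, skipping values that are not numbers."""
    cleaned = []
    for text in raw_ratings:
        try:
            # strip() removes surrounding whitespace left over from the HTML
            cleaned.append(float(text.strip()))
        except ValueError:
            # Not a number (for example a placeholder), so leave it out
            continue
    return cleaned


# Hypothetical sample standing in for the scraped ratings list
sample_ratings = ["10.0", " 9.8 ", "N/A"]
numeric_ratings = to_float_ratings(sample_ratings)
print(numeric_ratings)
```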



&lt;p&gt;And we are &lt;strong&gt;done&lt;/strong&gt;. Similarly, you can extract the &lt;strong&gt;titles&lt;/strong&gt;, &lt;strong&gt;summaries&lt;/strong&gt;, and &lt;strong&gt;genres&lt;/strong&gt; using the same method with the appropriate class and tag names.&lt;/p&gt;

&lt;p&gt;You can save the data to a CSV or Excel file and use it for your machine learning model.&lt;/p&gt;
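
&lt;p&gt;Saving to CSV could look roughly like this with Python's built-in &lt;code&gt;csv&lt;/code&gt; module. This is a minimal sketch: the helper name &lt;code&gt;write_movies_csv&lt;/code&gt;, the column names, and the sample titles and ratings are assumptions for illustration. It writes to an in-memory buffer so that it runs without touching the disk.&lt;/p&gt;

```python
import csv
import io


def write_movies_csv(rows, file_obj):
    """Write (title, rating) pairs to an open text file as CSV."""
    writer = csv.writer(file_obj)
    # Header row with hypothetical column names
    writer.writerow(["title", "rating"])
    writer.writerows(rows)


# Hypothetical sample data standing in for the scraped lists
titles = ["Movie A", "Movie B"]
ratings = ["9.1", "8.7"]

# An in-memory text buffer behaves like an open file
buffer = io.StringIO()
write_movies_csv(zip(titles, ratings), buffer)
print(buffer.getvalue())
```

&lt;p&gt;To write a real file instead, pass the object returned by &lt;code&gt;open("movies.csv", "w", newline="", encoding="utf-8")&lt;/code&gt; in place of the buffer.&lt;/p&gt;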

&lt;p&gt;The full code is available on my &lt;strong&gt;GitHub&lt;/strong&gt;:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://github.com/prashant2018/Medium-Article-Code-Snippets/tree/master/Web-Scraping-Using-Python"&gt;https://github.com/prashant2018/Medium-Article-Code-Snippets/tree/master/Web-Scraping-Using-Python&lt;/a&gt; &lt;/p&gt;

&lt;p&gt;Follow me on &lt;strong&gt;Twitter&lt;/strong&gt;:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://twitter.com/prash2018"&gt;https://twitter.com/prash2018&lt;/a&gt; &lt;/p&gt;

</description>
      <category>python</category>
      <category>machinelearning</category>
      <category>datascience</category>
      <category>programming</category>
    </item>
  </channel>
</rss>
