Consuming a REST API in Go

Prashant Kumar — Mon, 02 May 2022 15:17:35 +0000

Consuming a REST API is simple in Go. The net/http package ships with the Go standard library, and it is what we are going to use in this article.

package main

import (
    "encoding/json"
    "fmt"
    "io"
    "log"
    "net/http"
    "net/url"
)

func getMovieDetails(movieName string) map[string]interface{} {

    // the decoded JSON response is going to be stored in this variable
    var movieDetailResponse map[string]interface{}

    // URL to be fetched; replace your_key with your own API key.
    // url.QueryEscape encodes spaces and other special characters in the movie name
    requestURL := "http://www.omdbapi.com/?t=" + url.QueryEscape(movieName) + "&apikey=your_key"

    // send an HTTP GET request
    res, err := http.Get(requestURL)

    // check for errors
    if err != nil {
        log.Fatal(err)
    }

    // close the response body when the function returns
    defer res.Body.Close()

    // read the response body (io.ReadAll replaces ioutil.ReadAll since Go 1.16)
    body, err := io.ReadAll(res.Body)

    if err != nil {
        log.Fatal(err)
    }

    // unmarshal the JSON response
    // body is a []byte holding JSON text; json.Unmarshal decodes it
    // into the movieDetailResponse map
    if err := json.Unmarshal(body, &movieDetailResponse); err != nil {
        log.Fatal(err)
    }

    // return the movie details
    return movieDetailResponse

}

func main() {
    movieName := "k.g.f chapter 2"
    movieDetail := getMovieDetails(movieName)
    // print the movie details
    fmt.Println("Title", movieDetail["Title"])
    fmt.Println("Genre", movieDetail["Genre"])
    fmt.Println("Plot", movieDetail["Plot"])
}

Hope this was insightful.

For more updates, follow me on Twitter.

Web Scraping using Python! Create your own Dataset

Prashant Kumar — Tue, 20 Jul 2021 17:17:10 +0000

Machine Learning requires a lot of data, and it is not always easy to get the data you want. Have you ever wondered how Kaggle and other such websites provide us with huge datasets? Often, the answer is web scraping. So, let us see how we can extract data from the web.
Let’s assume we are building a model which requires movie information such as the title, summary, and rating of a number of movies. When it comes to movies, IMDb has one of the largest databases. Let us dig into it.

What exactly do we do to scrape a webpage?

There’s a pattern in everything. We need to observe and find a pattern in the HTML code of the web page to extract the relevant data. Let’s go step by step. We will do everything in Python and scrape the data from the following URL:
https://www.imdb.com/search/title?release_date=2019&sort=user_rating,desc&ref_=adv_nxt

1. Install dependencies

# To download the webpage
pip install requests
# To scrape data from the downloaded webpage
pip install beautifulsoup4
# Parser used by BeautifulSoup in the code below
pip install lxml

2. Download the webpage

“Requests” is a great HTTP library for making HTTP requests. We will use it to download the webpage at the given URL.

import requests
url = "https://www.imdb.com/search/title?release_date=2019&sort=user_rating,desc&ref_=adv_nxt"
# get() method downloads the entire HTML of the provided url
response = requests.get(url)
# Get the text from the response object
response_text = response.text

3. Inspecting elements and finding the pattern

The HTML we have downloaded is the page source, which is close to what you see when you right-click and choose Inspect in the browser (anything the page adds later with JavaScript will not be in it). Let’s right-click on the rating and see how we can extract it.

When we look closely we will see the class “ratings-bar” contains the rating of the movie. If we inspect other movies, we will find all the movies have the same class name for the ratings on that page. Here, we found a pattern to extract all the ratings from the page. Similarly, we can extract summary, title, genre, etc.

Besides class names, you can also select a specific part of the HTML using ids, tag names, and so on.

Let’s jump into the code!

BeautifulSoup allows us to extract data (more precisely, parse data) from HTML using the class name, id, tags, etc. Isn’t it Beautiful? :-D


from bs4 import BeautifulSoup
# Create a BeautifulSoup object
# response_text -> The downloaded webpage
# lxml -> Used for processing HTML and XML pages
soup = BeautifulSoup(response_text,'lxml')

To select content from the page we use CSS selectors. CSS selectors allow us to select classes, ids, tags, and other HTML elements easily. The selector prefix for a class is "." and for an id it is "#". To select by class, we prefix the class name with "."; similarly, to select by id, we prefix the id with "#".
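The selector prefixes described above can be tried on a small piece of HTML before touching the real page. This is only a minimal sketch: the markup, the class name "title", the id "main", and the variable names are made up for illustration and are not taken from IMDb. It uses Python's built-in "html.parser" so that it runs without lxml.

```python
from bs4 import BeautifulSoup

# Hypothetical markup, made up for this illustration (not IMDb's real HTML)
sample_html = """
<div id="main">
  <h3 class="title">First Movie</h3>
  <h3 class="title">Second Movie</h3>
  <p>Some description</p>
</div>
"""

# html.parser ships with Python, so no extra parser package is needed here
demo_soup = BeautifulSoup(sample_html, "html.parser")

by_class = demo_soup.select(".title")  # every element with class="title"
by_id = demo_soup.select("#main")      # the element with id="main"
by_tag = demo_soup.select("p")         # every <p> element

print([tag.get_text() for tag in by_class])
print(by_id[0].name)
print(by_tag[0].get_text())
```

select() always returns a list, even when only one element matches, which is why the id and tag results are indexed with [0].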

# As we saw, the rating's class name was "ratings-bar"
# we prefix "." since it's a class
rating_class_selector = ".ratings-bar"
# Select all the elements that have the ratings class
rating_list = soup.select(rating_class_selector)

This “rating_list” is a list of objects, one for each <div> element that has “ratings-bar” as its class name. We need to get the text from within the div element.

Here’s what a single rating object looks like:

<div class="ratings-bar">
<div class="inline-block ratings-imdb-rating" data-value="10" name="ir">
<span class="global-sprite rating-star imdb-rating"></span>
<strong>10.0</strong>
</div>
...
</div>

We need to get the rating value from the <strong> tag. We can find the tag with the find('tagName') method and get its text with getText().

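# This list will store all the ratings
ratings = []
# Iterate through all the rating objects
for rating_object in rating_list:
    # Find the <strong> tag; find() returns None when a movie has no rating
    strong_tag = rating_object.find('strong')
    # Get the text, or keep None so the list stays aligned with the movies
    rating_text = strong_tag.getText() if strong_tag is not None else None
    # Append the rating to the list
    ratings.append(rating_text)
print(ratings)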

And we are done. Similarly, you can extract the title, summary, and genre using the same method with the appropriate class names and tag names.
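As a rough illustration of extracting several fields per movie, here is a sketch that runs on a small, made-up HTML snippet. The class names "movie", "title", and "genre" and the helper text_or_none are hypothetical; on the real page you would replace them with the names you find using the browser's inspector.

```python
from bs4 import BeautifulSoup

# Hypothetical markup for illustration; real class names must be read
# from the page with the browser's inspector
sample_html = """
<div class="movie">
  <h3 class="title">First Movie</h3>
  <span class="genre">Drama</span>
  <strong>8.9</strong>
</div>
<div class="movie">
  <h3 class="title">Second Movie</h3>
  <span class="genre">Action, Crime</span>
</div>
"""

demo_soup = BeautifulSoup(sample_html, "html.parser")


def text_or_none(parent, selector):
    """Return the stripped text of the first match, or None if nothing matches."""
    match = parent.select_one(selector)
    return match.get_text(strip=True) if match is not None else None


movies = []
# Each .movie block holds the fields for one movie
for movie in demo_soup.select(".movie"):
    movies.append({
        "title": text_or_none(movie, ".title"),
        "genre": text_or_none(movie, ".genre"),
        "rating": text_or_none(movie, "strong"),
    })

print(movies)
```

Looping over one container per movie, instead of building separate lists per field, keeps the title, genre, and rating of each movie together even when a field is missing.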

You can store the data in a CSV or Excel file and use it for your Machine Learning model.
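One possible way to write the scraped data to a CSV file is Python's built-in csv module. This is a sketch under assumptions: the rows, the column names, and the file name movies.csv are placeholders standing in for whatever you actually scraped.

```python
import csv

# Hypothetical rows standing in for scraped results
movies = [
    {"title": "First Movie", "genre": "Drama", "rating": "8.9"},
    {"title": "Second Movie", "genre": "Action, Crime", "rating": ""},
]

# newline="" lets the csv module control line endings itself
with open("movies.csv", "w", newline="", encoding="utf-8") as csv_file:
    writer = csv.DictWriter(csv_file, fieldnames=["title", "genre", "rating"])
    writer.writeheader()
    writer.writerows(movies)

# Read the file back to confirm what was written
with open("movies.csv", newline="", encoding="utf-8") as csv_file:
    rows = list(csv.DictReader(csv_file))

print(rows)
```

The csv module quotes values that contain commas, such as "Action, Crime", so they stay in a single column when the file is read back.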

The full code is available on my GitHub:

https://github.com/prashant2018/Medium-Article-Code-Snippets/tree/master/Web-Scraping-Using-Python

Follow me on Twitter:

https://twitter.com/prash2018

DEV Community: Prashant Kumar
