
Serpdog

Posted on • Originally published at ecommerceapi.io

Scraping E-commerce Platforms with Python


The online retail, or e-commerce, industry has grown at a rapid pace since its inception. This boom has also given rise to 10-minute delivery models, revolutionizing how customers interact with e-commerce platforms.


This article will explore how to scrape data from e-commerce platforms using Python and the E-commerce Data API.

What is E-commerce Scraping?

E-commerce web scraping involves extracting publicly available data from e-commerce platforms such as Amazon, Walmart, and Flipkart. This data can be used to compare prices, track competitors, understand customer preferences, make data-driven decisions, and stand out in the fierce competition.

It offers various use cases for businesses to grow their digital presence, including price monitoring, market trends forecasting, price prediction, product data enrichment, and more.

So far, we have covered the basics of e-commerce scraping. Let us now explore how we can implement it with Python.

Is it legal to scrape E-Commerce platforms?

In short, it is legal to scrape e-commerce platforms as long as the data being extracted is publicly available. The data businesses generally use includes product information, customer reviews, and pricing data, all of which is visible to everyone. Scraping it is legal because you are not accessing any private information belonging to the platform or its users.

However, it is important to respect the website’s terms of service before extracting data and to avoid overloading its servers with excessive requests, which can harm not only the website but also your own data collection process.
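One simple way to stay polite is to check the site's robots.txt and add a short delay between requests. Below is a minimal sketch using only Python's standard library; the Walmart URLs are just placeholders for whichever platform you are targeting.

import time
import urllib.robotparser

# Check whether the path we want to fetch is allowed by the site's robots.txt
robots = urllib.robotparser.RobotFileParser()
robots.set_url("https://www.walmart.com/robots.txt")
robots.read()

target = "https://www.walmart.com/search?q=football"
print("Allowed by robots.txt:", robots.can_fetch("*", target))

# Pause between consecutive requests so we don't overload the server
time.sleep(2)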

Prerequisites

If you have not installed Python yet, you can download it from python.org. After installing Python, we will install the library used in this project.

pip install requests

So, now we are done with the setup. Let’s create a new file in our project folder and start the project.

Building the Scraper

To build our scraper, we need to first import the library we installed earlier.

import requests

As we will use the EcommerceAPI to retrieve the data, you will need an API key from its dashboard. If you haven’t registered already, you can sign up to get an API key and 1000 free credits for testing purposes.

After successfully registering, you can add the API Key to your code.

api_key = "xxxx8977ac"
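Hardcoding the key is fine for a quick test, but a safer habit is to read it from an environment variable so it never lands in your source code. A minimal sketch, assuming a variable named ECOMMERCEAPI_KEY (the name is arbitrary):

import os

# Read the API key from an environment variable; fall back to a placeholder
# ECOMMERCEAPI_KEY is an arbitrary name chosen for this example
api_key = os.environ.get("ECOMMERCEAPI_KEY", "xxxx8977ac")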

For the sake of this tutorial, we will be scraping Walmart.

Making an API request on EcommerceAPI is straightforward. You just need to pass the API key and the platform URL to scrape the results.

base_url = "https://api.ecommerceapi.io/walmart_search"

params = {
    "api_key": api_key,
    "url": "https://www.walmart.com/search?q=football"
}

Now that we have the base URL and parameters ready, we will establish an HTTP GET connection using Python’s Requests library.

response = requests.get(base_url, params=params)

print(response.json())

This will return the meta information and the search results from the Walmart search page. However, we only need the list of products from the search results to access the pricing information.

If you examine the returned response, you will find that the products are within the search_results array. Let’s access it.

data = response.json()

search_results = data.get('search_results', [])  

print(search_results)

This will give you the following output:

[Output: the printed search_results array]

Alternatively, you can loop through each item to retrieve the pricing and other details of the product.

# Extract and print the title and current price of each product
for product in search_results:
    for item in product['item']:
        print(f"Product Title: {item['title']}")
        print(f"Current Price: {item['current_price']}")
        print('-' * 40)

Easy, isn’t it? You don’t even need to parse complex HTML structures; the ready-made JSON data is available to you within seconds.
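Since the response is already structured JSON, exporting it is just as simple. Here is a minimal sketch that writes the titles and prices to a CSV file with Python's built-in csv module, assuming the same title and current_price fields used above:

import csv

# Collect the fields we care about into rows
rows = []
for product in search_results:
    for item in product['item']:
        rows.append({
            "title": item.get('title'),
            "current_price": item.get('current_price'),
        })

# Write the rows to a CSV file
with open("walmart_football.csv", "w", newline="", encoding="utf-8") as f:
    writer = csv.DictWriter(f, fieldnames=["title", "current_price"])
    writer.writeheader()
    writer.writerows(rows)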

Here is the complete code:

import requests

api_key = "xxxx8977ac"

base_url = "https://api.ecommerceapi.io/walmart_search"

params = {
    "api_key": api_key,
    "url": "https://www.walmart.com/search?q=football"
}

response = requests.get(base_url, params=params)
data = response.json()

search_results = data.get('search_results', [])

print(search_results)

for product in search_results:
    for item in product['item']:
        print(f"Product Title: {item['title']}")
        print(f"Current Price: {item['current_price']}")
        print('-' * 40)
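In a real script you will also want some basic error handling, since the request can fail or a product can be missing a field. Here is a sketch of the same request with a timeout, a status-code check, and .get() lookups so a missing key does not crash the loop (the field names follow the example above):

import requests

api_key = "xxxx8977ac"
base_url = "https://api.ecommerceapi.io/walmart_search"
params = {
    "api_key": api_key,
    "url": "https://www.walmart.com/search?q=football"
}

try:
    # Fail fast if the API is unreachable or returns an error status
    response = requests.get(base_url, params=params, timeout=30)
    response.raise_for_status()
except requests.RequestException as e:
    print(f"Request failed: {e}")
else:
    data = response.json()
    for product in data.get('search_results', []):
        for item in product.get('item', []):
            # .get() avoids a KeyError if a field is missing
            print(f"Product Title: {item.get('title')}")
            print(f"Current Price: {item.get('current_price')}")
            print('-' * 40)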




Conclusion

The web scraping community has developed various techniques to extract data from e-commerce platforms, making it easier than ever. These include bypassing CAPTCHAs and other blocking mechanisms, and configuring requests so that your IP address avoids getting blocked by the website.
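For example, many scrapers rotate the User-Agent header between requests and, at larger scale, route traffic through proxies. A minimal sketch with the requests library; the header strings and proxy address below are placeholders, not working values:

import random
import requests

# A small pool of User-Agent strings to rotate between requests
user_agents = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64)",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7)",
    "Mozilla/5.0 (X11; Linux x86_64)",
]
headers = {"User-Agent": random.choice(user_agents)}

# Optionally route the request through a proxy (placeholder address)
proxies = {
    "http": "http://proxy.example.com:8080",
    "https": "http://proxy.example.com:8080",
}

response = requests.get("https://www.walmart.com/search?q=football",
                        headers=headers, proxies=proxies, timeout=30)
print(response.status_code)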

However, if you need to scrape at scale, relying on a single IP address and basic infrastructure may not suffice. In such cases, an e-commerce scraper API is ideal: it helps you collect data at scale, without obstructions, and at an economical price.

In this article, we learned how to use Python for scraping e-commerce platforms. With this basic technique, you can develop your scraper to perform data extraction at scale.
