Artur Chukhrai for SerpApi

Posted on • Edited on • Originally published at serpapi.com

Integrate The Home Depot Search Page Results Data with SerpApi and Python

Intro

SerpApi’s Home Depot API lets you scrape product information in real time for your automation without having to write a web scraper yourself. In this blog post, we'll go over how to extract data from all result pages using The Home Depot Search API and Python. You can look at the complete code in the online IDE (Replit).

If you prefer video format, we have a dedicated video that shows how to do that: The Home Depot Search API overview - SerpApi.

What will be scraped

[Screenshot: The Home Depot search results]

📌Note: Each page displays up to 24 products.

Why use an API?

There are a couple of reasons to use an API, ours in particular:

  • No need to create a parser from scratch and maintain it.
  • No need to bypass blocks from the target site yourself, such as solving CAPTCHAs or getting around IP blocks.
  • No need to pay for proxies and CAPTCHA solvers.
  • No need to use browser automation.

SerpApi handles everything on the backend, with response times under ~2.5 seconds (~1.2 seconds with Ludicrous speed) per request. Because no browser automation is involved, requests are much faster. Response times and status rates are shown on the SerpApi Status page.

[Screenshot: SerpApi Status page]

Full Code

This code retrieves all the data with pagination:

from serpapi import GoogleSearch
import json

params = {
    'api_key': '...',           # https://serpapi.com/manage-api-key
    'engine': 'home_depot',     # SerpApi search engine 
    'q': 'coffeee',             # query
    'ps': 10,                   # number of items per page
    'lowerbound': 20,           # minimum price
    'upperbound': 50,           # maximum price
    'hd_sort': 'top_rated',     # sorted by different options
    'page': 1                   # pagination
}

search = GoogleSearch(params)   # where data extraction happens on the SerpApi backend
results = search.get_dict()     # JSON -> Python dict

home_depot_results = {
    'search_information': results['search_information'],
    'filters': results['filters'],
    'products': []
}

while 'error' not in results:
    home_depot_results['products'].extend(results['products'])

    params['page'] += 1
    results = search.get_dict()

print(json.dumps(home_depot_results, indent=2, ensure_ascii=False))

Preparation

Install library:

pip install google-search-results

google-search-results is SerpApi's Python package.

Code Explanation

Import libraries:

from serpapi import GoogleSearch
import json

Libraries and their purpose:

  • GoogleSearch: scrapes and parses search results using the SerpApi web scraping library. Despite its name, it works with any SerpApi engine selected via the engine parameter.
  • json: converts the extracted data to a JSON string.

Next, the parameters used to build the request URL are defined. If you want to pass other parameters, add them to the params dictionary:

params = {
    'api_key': '...',           # https://serpapi.com/manage-api-key
    'engine': 'home_depot',     # SerpApi search engine 
    'q': 'coffeee',             # query
    'ps': 10,                   # number of items per page
    'lowerbound': 20,           # minimum price
    'upperbound': 50,           # maximum price
    'hd_sort': 'top_rated',     # sorted by different options
    'page': 1                   # pagination
}

Parameters and their explanation:

  • api_key: defines the SerpApi private key to use. You can find it under your account -> API key.
  • engine: set to home_depot to use The Home Depot API engine.
  • q: defines the search query. You can use anything that you would use in a regular The Home Depot search.
  • ps: determines the number of items per page. By default, Home Depot returns 24 results, and 48 is the maximum value. There are scenarios where Home Depot overrides the ps value.
  • lowerbound: defines the lower bound for price in USD.
  • upperbound: defines the upper bound for price in USD.
  • hd_sort: defines how the results are sorted.
  • page: selects the page of results to get (e.g., 1 (default) is the first page of results, 2 is the 2nd page of results, 3 is the 3rd page of results, etc.).
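
The constraints described above can be checked before a request is sent. The following is a minimal sketch, not part of the google-search-results package: build_params, ALLOWED_SORTS and MAX_PAGE_SIZE are hypothetical names introduced for illustration, and the limits are taken only from what this post states (a maximum ps of 48 and the hd_sort values listed further down).

```python
# Hypothetical helper, not part of the google-search-results package.
ALLOWED_SORTS = {
    'top_sellers', 'price_low_to_high', 'price_high_to_low',
    'top_rated', 'best_match',
}
MAX_PAGE_SIZE = 48  # maximum "ps" value stated in this post


def build_params(api_key, query, ps=24, lowerbound=None,
                 upperbound=None, hd_sort=None, page=1):
    """Build a params dict for the home_depot engine, validating the inputs."""
    if not 1 <= ps <= MAX_PAGE_SIZE:
        raise ValueError(f'ps must be between 1 and {MAX_PAGE_SIZE}')
    if hd_sort is not None and hd_sort not in ALLOWED_SORTS:
        raise ValueError(f'unknown hd_sort value: {hd_sort!r}')
    if (lowerbound is not None and upperbound is not None
            and lowerbound > upperbound):
        raise ValueError('lowerbound must not exceed upperbound')

    params = {'api_key': api_key, 'engine': 'home_depot', 'q': query,
              'ps': ps, 'page': page}
    # Optional parameters are only sent when they were explicitly set.
    for name, value in (('lowerbound', lowerbound),
                        ('upperbound', upperbound),
                        ('hd_sort', hd_sort)):
        if value is not None:
            params[name] = value
    return params


params = build_params('...', 'coffee', ps=10, lowerbound=20,
                      upperbound=50, hd_sort='top_rated')
```

Failing early on the client side avoids spending a search on a request that has an obviously invalid value.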

📌Note: You can also add other API Parameters.

Then we create a search object, through which the data is retrieved from the SerpApi backend. The get_dict() call returns the JSON response as a Python dictionary, which is stored in results:

search = GoogleSearch(params)   # data extraction on the SerpApi backend
results = search.get_dict()     # JSON -> Python dict

Consider the parameters mentioned above to understand how they affect the request.

  • You may have noticed that I made a mistake when passing the value to the q parameter. This was done on purpose to demonstrate that SerpApi's The Home Depot Spell Check API returns the corrected search term that the results were fetched for:
print(results['search_information']['spelling_fix'])    # coffee
  • SerpApi’s Home Depot Sorting API allows you to change the ordering of scraped data according to various product details such as price, overall rating of customer reviews, etc. via hd_sort. It can be set to:

    • top_sellers
    • price_low_to_high
    • price_high_to_low
    • top_rated
    • best_match
  • SerpApi’s Home Depot Price Bound API allows you to refine searches using the lowerbound and upperbound parameters to set the minimum and maximum price.
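
The spelling_fix key is presumably only present when a correction was made, so reading it with a direct index could raise a KeyError for a correctly spelled query. Below is a minimal sketch of a safer lookup. The helper name effective_query is hypothetical, and the assumption that the key is absent when no fix is applied is mine, not something the API documentation is quoted on here.

```python
def effective_query(results, requested_query):
    """Return the corrected search term if present, else the requested one.

    Assumes 'spelling_fix' is simply missing from 'search_information'
    when no correction was applied.
    """
    info = results.get('search_information', {})
    return info.get('spelling_fix', requested_query)


# Sample shaped like the response shown in this post.
sample = {
    'search_information': {
        'results_state': 'Results for spelling fix',
        'total_results': 1111,
        'spelling_fix': 'coffee',
    }
}
print(effective_query(sample, 'coffeee'))
```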

At the moment, the results dictionary only stores data from the first page. Before extracting products, we create the home_depot_results dictionary, to which this data will be added later. Since search_information and filters are repeated on each subsequent page, you can extract them immediately:

home_depot_results = {
    'search_information': results['search_information'],
    'filters': results['filters'],
    'products': []
}

SerpApi’s Home Depot Filtering API allows you to refine a search according to product details and delivers structured data. It also eliminates the need for complicated web scraping processes or tools required by scrapers.

The filters list contains various filters that depend on the product data; they also appear on the left side of the search page. These filters can be used to extract only products with specific attributes such as warranty conditions, durability, certifications, brands, etc., for your e-commerce projects.
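
To get an overview of which filters are available, the list can be flattened into a simple mapping. This is a minimal sketch that assumes each filter has the key and value fields shown in the Output section of this post; the helper name filter_options is hypothetical.

```python
def filter_options(filters):
    """Map each filter key to the names of its options."""
    return {
        item['key']: [option['name'] for option in item.get('value', [])]
        for item in filters
    }


# Sample shaped like the "filters" list in the Output section (links omitted).
sample_filters = [
    {'key': 'Brand',
     'value': [{'name': "Victor Allen's", 'count': '55', 'value': 'nig'}]},
    {'key': 'Flavor',
     'value': [{'name': 'Variety Pack', 'count': '12', 'value': '1z0wjbu'}]},
]
print(filter_options(sample_filters))
```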

Extracting all products

To get all products, you need to apply pagination. This is achieved by the following check: while there is no error in the results object of the current page, we extract the data, increase the page parameter by 1 to get the results from the next page, and update the results object with the new page data:

while 'error' not in results:
    # data extraction from current page will be here

    params['page'] += 1
    results = search.get_dict()

📌Note: The error check relies on the google-search-results error handling, which reports backend errors (a failed search or no more results) as well as client errors.
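
Since every page costs one search, it can be useful to cap the number of pages while experimenting. The following is a minimal sketch of the same loop with such a cap. The collect_products function, its max_pages argument, and the fetch_page callable are hypothetical names introduced here; passing the page fetcher in as a function keeps the loop independent of the network and easy to test.

```python
def collect_products(fetch_page, max_pages=5):
    """Collect products page by page until an error, an empty page, or the cap.

    fetch_page is any callable that takes a page number and returns
    the result dictionary for that page.
    """
    products = []
    for page in range(1, max_pages + 1):
        results = fetch_page(page)
        if 'error' in results:
            break  # failed search or no more results
        page_products = results.get('products', [])
        if not page_products:
            break  # nothing left to collect
        products.extend(page_products)
    return products


# With SerpApi, fetch_page could be written like this (requires an API key):
#
# from serpapi import GoogleSearch
#
# def fetch_page(page):
#     return GoogleSearch({**params, 'page': page}).get_dict()
#
# all_products = collect_products(fetch_page, max_pages=3)
```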

Extend the home_depot_results['products'] list with the data from the current page:

home_depot_results['products'].extend(results['products'])
# filter_categories = results['filters'][1]
# product_title = results['products'][0]['title']
# product_price = results['products'][0]['price']
# product_rating = results['products'][0]['rating']
# product_reviews = results['products'][0]['reviews']

📌Note: In the comments above, I showed how to extract specific fields. You may have noticed the results['products'][0]. This is the index of a product, which means that we are extracting data from the first product. The results['products'][1] is from the second product and so on.
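
Rather than indexing products one at a time, the same fields can be pulled from every product in a loop. This is a minimal sketch under the assumption that some products may lack a field such as rating or reviews, which is why .get() is used instead of direct indexing. The helper name summarize_products is hypothetical.

```python
def summarize_products(products):
    """Keep only a few fields per product; missing fields become None."""
    fields = ('title', 'price', 'rating', 'reviews')
    return [{field: product.get(field) for field in fields}
            for product in products]


# Sample shaped like one entry of "products" in the Output section,
# plus a second made-up entry that has no rating or reviews.
sample_products = [
    {'position': 1,
     'title': 'Mardi Gras King Cake Medium Roast Single Serve Cups (54-Pack)',
     'rating': 4.9237, 'reviews': 131, 'price': 34.65},
    {'position': 2, 'title': 'Example product without reviews', 'price': 21.0},
]
for row in summarize_products(sample_products):
    print(row)
```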

After all the data is retrieved, it is printed in JSON format:

print(json.dumps(home_depot_results, indent=2, ensure_ascii=False))

Output

{
  "search_information": {
    "results_state": "Results for spelling fix",
    "total_results": 1111,
    "spelling_fix": "coffee"
  },
  "products": [
    {
      "position": 1,
      "product_id": "308728897",
      "title": "Mardi Gras King Cake Medium Roast Single Serve Cups (54-Pack)",
      "thumbnails": [
        [
          "https://images.thdstatic.com/productImages/c24e5edc-38bd-4be8-b95e-c37035c158e3/svn/community-coffee-coffee-pods-k-cups-16324-64_65.jpg",
          "https://images.thdstatic.com/productImages/c24e5edc-38bd-4be8-b95e-c37035c158e3/svn/community-coffee-coffee-pods-k-cups-16324-64_100.jpg",
          "https://images.thdstatic.com/productImages/c24e5edc-38bd-4be8-b95e-c37035c158e3/svn/community-coffee-coffee-pods-k-cups-16324-64_145.jpg",
          "https://images.thdstatic.com/productImages/c24e5edc-38bd-4be8-b95e-c37035c158e3/svn/community-coffee-coffee-pods-k-cups-16324-64_300.jpg",
          "https://images.thdstatic.com/productImages/c24e5edc-38bd-4be8-b95e-c37035c158e3/svn/community-coffee-coffee-pods-k-cups-16324-64_400.jpg",
          "https://images.thdstatic.com/productImages/c24e5edc-38bd-4be8-b95e-c37035c158e3/svn/community-coffee-coffee-pods-k-cups-16324-64_600.jpg",
          "https://images.thdstatic.com/productImages/c24e5edc-38bd-4be8-b95e-c37035c158e3/svn/community-coffee-coffee-pods-k-cups-16324-64_1000.jpg"
        ]
      ],
      "link": "https://www.homedepot.com/p/Community-Coffee-Mardi-Gras-King-Cake-Medium-Roast-Single-Serve-Cups-54-Pack-16324/308728897",
      "serpapi_link": "https://serpapi.com/search.json?delivery_zip=04401&engine=home_depot_product&product_id=308728897&store_id=2414",
      "model_number": "16324",
      "brand": "Community Coffee",
      "collection": "https://www.homedepot.com",
      "favorite": 10,
      "rating": 4.9237,
      "reviews": 131,
      "price": 34.65,
      "unit": "case",
      "delivery": {
        "free": true,
        "free_delivery_threshold": false
      },
      "pickup": {
        "free_ship_to_store": true
      }
    },
    ... other products
  ],
  "filters": [
    {
      "key": "Review Rating",
      "value": [
        {
          "name": "5",
          "count": "519",
          "value": "bwo5q",
          "link": "https://www.homedepot.com/b/Best-Rated/N-5yc1vZbwo5q/Ntk-elastic/Ntt-coffee?NCNI-5&lowerBound=20&upperBound=50"
        },
        ... other results
      ]
    },
    {
      "key": "Category",
      "value": [
        {
          "name": "Food & Gifts",
          "count": "134",
          "value": "cigl",
          "link": "https://www.homedepot.com/b/Food-Gifts/N-5yc1vZcigl/Ntk-elastic/Ntt-coffee?NCNI-5"
        },
        ... other results
      ]
    },
    {
      "key": "Get It Fast",
      "value": [
        {
          "name": "Pick Up Today",
          "count": "16",
          "value": "1z175a5",
          "link": "https://www.homedepot.com/b/Pick-Up-Today/N-5yc1vZ1z175a5/Ntk-elastic/Ntt-coffee?NCNI-5&lowerBound=20&upperBound=50"
        },
        ... other results
      ]
    },
    {
      "key": "Brand",
      "value": [
        {
          "name": "Victor Allen's",
          "count": "55",
          "value": "nig",
          "link": "https://www.homedepot.com/b/Victor-Allens/N-5yc1vZnig/Ntk-elastic/Ntt-coffee?NCNI-5&lowerBound=20&upperBound=50"
        },
        ... other results
      ]
    },
    {
      "key": "Price",
      "value": [
        {
          "name": "$10 - $20",
          "count": "4",
          "value": "12ky",
          "link": "https://www.homedepot.com/b/N-5yc1vZ12ky/Ntk-elastic/Ntt-coffee?NCNI-5&lowerBound=20&upperBound=50"
        },
        ... other results
      ]
    },
    {
      "key": "Coffee/Tea Type",
      "value": [
        {
          "name": "Pods/K cups",
          "count": "92",
          "value": "1z0jm9l",
          "link": "https://www.homedepot.com/b/Pods-K-cups/N-5yc1vZ1z0jm9l/Ntk-elastic/Ntt-coffee?NCNI-5&lowerBound=20&upperBound=50"
        },
        ... other results
      ]
    },
    {
      "key": "Brand Compatibility",
      "value": [
        {
          "name": "Keurig",
          "count": "71",
          "value": "1z0knhy",
          "link": "https://www.homedepot.com/b/Keurig/N-5yc1vZ1z0knhy/Ntk-elastic/Ntt-coffee?NCNI-5&lowerBound=20&upperBound=50"
        },
        ... other results
      ]
    },
    {
      "key": "Flavor",
      "value": [
        {
          "name": "Variety Pack",
          "count": "12",
          "value": "1z0wjbu",
          "link": "https://www.homedepot.com/b/Variety-Pack/N-5yc1vZ1z0wjbu/Ntk-elastic/Ntt-coffee?NCNI-5&lowerBound=20&upperBound=50"
        },
        ... other results
      ]
    },
    {
      "key": "Package Quantity",
      "value": [
        {
          "name": "80",
          "count": "26",
          "value": "1z0w7jy",
          "link": "https://www.homedepot.com/b/80/N-5yc1vZ1z0w7jy/Ntk-elastic/Ntt-coffee?NCNI-5&lowerBound=20&upperBound=50"
        },
        ... other results
      ]
    },
    {
      "key": "Coffee Appliance Type",
      "value": [
        {
          "name": "Other",
          "count": "39",
          "value": "1z1ab3g",
          "link": "https://www.homedepot.com/b/Other/N-5yc1vZ1z1ab3g/Ntk-elastic/Ntt-coffee?NCNI-5&lowerBound=20&upperBound=50"
        },
        ... other results
      ]
    },
    {
      "key": "Collection Name",
      "value": [
        {
          "name": "Coffee Pods & K-Cups",
          "count": "43",
          "value": "1z1usiw",
          "link": "https://www.homedepot.com/b/Coffee-Pods-K-Cups/N-5yc1vZ1z1usiw/Ntk-elastic/Ntt-coffee?NCNI-5&lowerBound=20&upperBound=50"
        },
        ... other results
      ]
    },
    {
      "key": "Number of Cups",
      "value": [
        {
          "name": "1 cup",
          "count": "30",
          "value": "1z1b0v5",
          "link": "https://www.homedepot.com/b/1-cup/N-5yc1vZ1z1b0v5/Ntk-elastic/Ntt-coffee?NCNI-5&lowerBound=20&upperBound=50"
        },
        ... other results
      ]
    },
    {
      "key": "Product volume (fl. oz.)",
      "value": [
        {
          "name": "8",
          "count": "50",
          "value": "1z1bpch",
          "link": "https://www.homedepot.com/b/8/N-5yc1vZ1z1bpch/Ntk-elastic/Ntt-coffee?NCNI-5&lowerBound=20&upperBound=50"
        },
        ... other results
      ]
    },
    {
      "key": "Capacity (fl. oz.)",
      "value": [
        {
          "name": "12 fl. oz.",
          "count": "19",
          "value": "1z1b105",
          "link": "https://www.homedepot.com/b/12-fl-oz/N-5yc1vZ1z1b105/Ntk-elastic/Ntt-coffee?NCNI-5&lowerBound=20&upperBound=50"
        },
        ... other results
      ]
    },
    {
      "key": "Diet & Allergens",
      "value": [
        {
          "name": "Not Applicable",
          "count": "124",
          "value": "1z1bjkr",
          "link": "https://www.homedepot.com/b/Not-Applicable/N-5yc1vZ1z1bjkr/Ntk-elastic/Ntt-coffee?NCNI-5&lowerBound=20&upperBound=50"
        },
        ... other results
      ]
    },
    {
      "key": "Small Appliances Color Family",
      "value": [
        {
          "name": "Black",
          "count": "72",
          "value": "1z1ab15",
          "link": "https://www.homedepot.com/b/Black/N-5yc1vZ1z1ab15/Ntk-elastic/Ntt-coffee?NCNI-5&lowerBound=20&upperBound=50"
        },
        ... other results
      ]
    },
    {
      "key": "New Arrival",
      "value": [
        {
          "name": "Recently Added",
          "count": "70",
          "value": "1z179pc",
          "link": "https://www.homedepot.com/b/Recently-Added/N-5yc1vZ1z179pc/Ntk-elastic/Ntt-coffee?NCNI-5&lowerBound=20&upperBound=50"
        },
        ... other results
      ]
    },
    {
      "key": "Subscription Eligible",
      "value": [
        {
          "name": "Subscription Eligible",
          "count": "5",
          "value": "1z18amw",
          "link": "https://www.homedepot.com/b/Subscription-Eligible/N-5yc1vZ1z18amw/Ntk-elastic/Ntt-coffee?NCNI-5&lowerBound=20&upperBound=50"
        },
        ... other results
      ]
    }
  ]
}

📌Note: Head to the playground for a live and interactive demo.

Join us on Twitter | YouTube

Add a Feature Request💫 or a Bug🐞