Make Direct Async Requests to SerpApi with Python


Intro

In the previous blog post, SerpApi Async Requests with Pagination using Python, we covered how to make async requests with SerpApi's pagination and how to use the Search Archive API and Queue.

In this blog post we'll cover how to make direct requests to serpapi.com/search.json without using SerpApi's google-search-results Python client.

When making direct requests to SerpApi this way, we can get a slightly faster response time compared to the Python client's batch async search feature, which uses Queue.

For example, the 50 requests in this test complete in less than 10 seconds, depending on internet speed.

In the next blog post, we'll cover how to add pagination to the code shown below.

Subject of test: YouTube Search Engine Results API.

Test includes: 50 async search queries.

Code

You can check the code example in the online IDE:

import aiohttp
import asyncio
import json
import time

async def fetch_results(session, query):
    params = {
        'api_key': '...',      # your serpapi api key: https://serpapi.com/manage-api-key
        'engine': 'youtube',   # search engine to parse data from
        'device': 'desktop',   # from which device to parse data
        'search_query': query, # search query
        'no_cache': 'true'     # https://serpapi.com/search-api#api-parameters-serpapi-parameters-no-cache
    }

    async with session.get('https://serpapi.com/search.json', params=params) as response:
        results = await response.json()

    data = []

    if 'error' in results:
        print(results['error'])
    else:
        for result in results.get('video_results', []):
            data.append({
                'title': result.get('title'),
                'link': result.get('link'),
                'channel': result.get('channel', {}).get('name'),  # {} default avoids an AttributeError if 'channel' is missing
            })

    return data

async def main():
    # 50 queries
    queries = [
        'burly',
        'creator',
        'doubtful',
        'chance',
        'capable',
        'window',
        'dynamic',
        'train',
        'worry',
        'useless',
        'steady',
        'thoughtful',
        'matter',
        'rotten',
        'overflow',
        'object',
        'far-flung',
        'gabby',
        'tiresome',
        'scatter',
        'exclusive',
        'wealth',
        'yummy',
        'play',
        'saw',
        'spiteful',
        'perform',
        'busy',
        'hypnotic',
        'sniff',
        'early',
        'mindless',
        'airplane',
        'distribution',
        'ahead',
        'good',
        'squeeze',
        'ship',
        'excuse',
        'chubby',
        'smiling',
        'wide',
        'structure',
        'wrap',
        'point',
        'file',
        'sack',
        'slope',
        'therapeutic',
        'disturbed'
    ]

    data = []

    async with aiohttp.ClientSession() as session:
        tasks = []
        for query in queries:
            task = asyncio.ensure_future(fetch_results(session, query))
            tasks.append(task)

        start_time = time.time()
        results = await asyncio.gather(*tasks)
        end_time = time.time()

        data = [item for sublist in results for item in sublist]

    print(json.dumps(data, indent=2, ensure_ascii=False))
    print(f'Script execution time: {end_time - start_time} seconds') # ~7.192448616027832 seconds

asyncio.run(main())
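
To see why sending the requests together is faster than sending them one by one, without spending API credits, the network call can be swapped for a stand-in. The sketch below is only a simulation: fake_search(), run_sequential(), run_concurrent(), timed() and compare() are hypothetical names, and asyncio.sleep() with an assumed 0.1 second delay takes the place of the real request to serpapi.com/search.json.

```python
import asyncio
import time


async def fake_search(query, delay=0.1):
    # Stand-in for a network call: asyncio.sleep() hands control back to the
    # event loop, the same way awaiting a real HTTP response does.
    await asyncio.sleep(delay)
    return [{'title': f'result for {query}'}]


async def run_sequential(queries):
    # Each request starts only after the previous one has finished.
    results = []
    for query in queries:
        results.append(await fake_search(query))
    return results


async def run_concurrent(queries):
    # All requests are in flight at once; gather() keeps the input order.
    return await asyncio.gather(*(fake_search(query) for query in queries))


async def timed(coro):
    # Await a coroutine and return its result together with the elapsed time.
    start = time.perf_counter()
    result = await coro
    return result, time.perf_counter() - start


async def compare():
    queries = ['burly', 'creator', 'doubtful', 'chance', 'capable']
    sequential, sequential_time = await timed(run_sequential(queries))
    concurrent, concurrent_time = await timed(run_concurrent(queries))
    return sequential, sequential_time, concurrent, concurrent_time


if __name__ == '__main__':
    _, sequential_time, _, concurrent_time = asyncio.run(compare())
    print(f'sequential: {sequential_time:.2f}s, concurrent: {concurrent_time:.2f}s')
```

The sequential version takes roughly the sum of all delays, while the concurrent version takes roughly as long as the slowest single request. Real timings will differ, since they depend on the network and on the API's response time.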

Code Explanation

Import libraries:

import aiohttp # to make async HTTP requests
import asyncio # to run the requests concurrently
import json    # for printing data
import time    # to measure execution time

In the fetch_results() function we:

create the search params that will be passed to SerpApi when making a request.
make an async request through the session, passing the params, then await the response and store it in the results variable.
check for 'error' in the results; if there is none, iterate over 'video_results' and store the extracted data in the data list.
return the list with the video data.

async def fetch_results(session, query):
    params = {
        'api_key': '...',      # your serpapi api key: https://serpapi.com/manage-api-key
        'engine': 'youtube',   # search engine to parse data from
        'device': 'desktop',   # from which device to parse data
        'search_query': query, # search query
        'no_cache': 'true'     # https://serpapi.com/search-api#api-parameters-serpapi-parameters-no-cache
    }

    async with session.get('https://serpapi.com/search.json', params=params) as response:
        results = await response.json()

    data = []

    if 'error' in results:
        print(results['error'])
    else:
        for result in results.get('video_results', []):
            data.append({
                'title': result.get('title'),
                'link': result.get('link'),
                'channel': result.get('channel', {}).get('name'),  # {} default avoids an AttributeError if 'channel' is missing
            })

    return data
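
The parsing step in fetch_results() doesn't touch the network, so it can be pulled out into a plain function and checked against a hand-written dictionary. This is only a sketch: extract_videos() is a hypothetical name, the sample response is made up to match the fields the script reads, and it assumes a video result may come back without a 'channel' entry.

```python
def extract_videos(results):
    """Return title, link and channel name for each video in a response dict."""
    if 'error' in results:
        print(results['error'])
        return []

    data = []
    for result in results.get('video_results', []):
        # `or {}` covers both a missing 'channel' key and an explicit null value
        channel = result.get('channel') or {}
        data.append({
            'title': result.get('title'),
            'link': result.get('link'),
            'channel': channel.get('name'),
        })
    return data


# Made-up sample shaped like the fields the script reads; not real API output.
sample = {
    'video_results': [
        {'title': 'Video A', 'link': 'https://example.com/a', 'channel': {'name': 'Channel A'}},
        {'title': 'Video B', 'link': 'https://example.com/b'},  # no 'channel' key
    ]
}

videos = extract_videos(sample)
print(videos)
```

With a function like this, fetch_results() would only need to await the response and return extract_videos(results).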

In the main() function we:

create a list of queries. They could also be loaded from a txt/csv/json file.
open an aiohttp.ClientSession().
iterate over the queries and create an asyncio task for each one.
run all of the tasks concurrently with asyncio.gather(*tasks).
flatten the list of results and store it in the data variable.
print the data.

async def main():
    queries = [
        'burly',
        'creator',
        'doubtful',
        # ...
    ]

    data = []

    async with aiohttp.ClientSession() as session:
        tasks = []
        for query in queries:
            task = asyncio.ensure_future(fetch_results(session, query))
            tasks.append(task)

        start_time = time.time()
        results = await asyncio.gather(*tasks)
        end_time = time.time()

        data = [item for sublist in results for item in sublist]

    print(json.dumps(data, indent=2, ensure_ascii=False))
    print(f'Script execution time: {end_time - start_time} seconds') # ~7.192448616027832 seconds

asyncio.run(main())
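
Since the queries don't have to be hard-coded, here is a minimal sketch of reading them from a txt, csv or json file. load_queries() is a hypothetical helper, and the file layouts are assumptions: one query per line for txt, the query in the first column with no header row for csv, and a flat array of strings for json.

```python
import csv
import json
from pathlib import Path


def load_queries(path):
    """Read search queries from a .txt, .csv or .json file (hypothetical helper)."""
    path = Path(path)
    suffix = path.suffix.lower()

    if suffix == '.json':
        # assumes a JSON array of strings: ["burly", "creator", ...]
        queries = json.loads(path.read_text(encoding='utf-8'))
    elif suffix == '.csv':
        # assumes the query is in the first column and there is no header row
        with path.open(newline='', encoding='utf-8') as file:
            queries = [row[0] for row in csv.reader(file) if row]
    else:
        # plain text: one query per line
        queries = path.read_text(encoding='utf-8').splitlines()

    # drop blank entries and surrounding whitespace
    return [query.strip() for query in queries if query.strip()]
```

In main(), the hard-coded list could then be replaced with something like queries = load_queries('queries.txt'), where queries.txt is a placeholder file name.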

Conclusion

As you saw (and possibly tried), this results in quite fast response times. Additionally, we can add pagination to it, which will be covered in the next blog post.

Join us on Twitter | YouTube
