🚀 🤖💻🔍 How to Scrape G2 Using Python, Selenium, and Bose Framework 🅶2️⃣🐍🖥️

#webscraping #python #tutorial #webscrapingtools

Introduction

In this article, you will learn how to scrape g2.com using Bose Framework.

Scraping g2.com is also an excellent way to do competitor analysis.

Bose Framework is a Selenium-based bot development framework that provides a comprehensive set of tools aimed at making bot development easy for developers.

To make this easy, I have prepared a script that scrapes g2.com. This article walks you through the steps of using it.

Installation

Clone Starter Template

git clone https://github.com/omkarcloud/g2-scraper
cd g2-scraper

Install dependencies

python -m pip install -r requirements.txt

Usage

In extract_product_links.py, specify your Task.product_url.
Run the project:

python main.py

The script will start running and print progress updates to the console. When the scraper finishes, it generates a JSON file named pending.json in the output directory. This file contains the product links.
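Once the run has finished, the results can be loaded back into Python for further processing. The snippet below is a minimal sketch, not part of the g2-scraper repository: the helper name load_product_links is hypothetical, and the path output/pending.json assumes the script is run from the project root. The exact layout of the JSON is not described here, so the helper simply returns whatever the file contains.

```python
import json
from pathlib import Path


def load_product_links(path="output/pending.json"):
    """Load the scraper's JSON output and return the parsed data.

    Hypothetical helper: the default path assumes the current working
    directory is the project root.
    """
    file_path = Path(path)
    if not file_path.exists():
        # Fail with a hint instead of a bare traceback
        raise FileNotFoundError(f"{file_path} not found; run `python main.py` first")
    with file_path.open(encoding="utf-8") as f:
        return json.load(f)
```

Calling load_product_links() after python main.py returns the parsed contents of pending.json, which you can then inspect or pass to your own code.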

If Cloudflare detects the bot, the script will recognize this and prompt you to solve the Cloudflare captcha in the browser. Once you have solved it, press the "Enter" key in the console to continue.
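To illustrate the idea of pausing for a human, here is a rough sketch of such a "solve it, then press Enter" loop. It is not the code used by the script or by Bose Framework: both function names are hypothetical, and the check for "Just a moment" in the page title is an assumption about how a Cloudflare challenge page might be recognized.

```python
def looks_like_challenge(page_title):
    """Guess whether a page is a Cloudflare challenge.

    Assumption: the challenge page title contains "Just a moment".
    """
    return "just a moment" in page_title.lower()


def wait_until_unblocked(is_blocked, ask=input, max_prompts=3):
    """Prompt a human until is_blocked() returns False.

    is_blocked: zero-argument callable that re-checks the current page.
    ask: function used to prompt; defaults to input() so the console waits for Enter.
    Returns True once the page is no longer blocked, False if still
    blocked after max_prompts prompts.
    """
    for _ in range(max_prompts):
        if not is_blocked():
            return True
        ask("Cloudflare challenge detected. Solve it in the browser, then press Enter: ")
    # Final check after the last prompt
    return not is_blocked()
```

With a Selenium driver, this could be called as wait_until_unblocked(lambda: looks_like_challenge(driver.title)). Passing the prompt function as a parameter keeps the loop easy to test without a real console.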

Additionally, you don't have to configure the Selenium driver, as the script automatically downloads the appropriate driver for your Chrome browser version.