Web Scraping with PHP: Building a Competitor Price Monitoring Tool

Andreas A.
・2 min read

Depending on your business's niche and market, adjusting your services and prices means taking your competitors into account.
In a lot of companies that I have seen, this is a manual task that is completed once every quarter, or at least once a year.
In this PHP web scraping tutorial, we are going to build a tiny tool that automates this process. Of course, the tool will need further refinement, but it is always about understanding the concepts, right? :)

Let's get started!

Prerequisites

We will need the following set of tools:

  • Web server with PHP
  • Composer
  • Guzzle - as HTTP client for fetching the pages
  • PHP HTML Parser - as HTML parser
  • A currency parser

Download Composer and follow its installation instructions.

After Composer has been installed successfully, install Guzzle via Composer:

composer require guzzlehttp/guzzle

Next, let's install our HTML parser:

composer require paquettg/php-html-parser

Finally, we add the currency parser to our project:

composer require mcuadros/currency-detector dev-master
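
To get a feeling for what a currency parser has to do, here is a deliberately simplified sketch using only built-in PHP functions. It is not how mcuadros/currency-detector works internally; the function name parsePrice, the symbol table, and the assumed "1,234.56" number format are all hypothetical and only serve as an illustration.

```php
<?php
/**
 * Simplified, hypothetical price parser (illustration only).
 * Assumes US-style numbers such as "1,234.56" and knows just three symbols.
 */
function parsePrice(string $priceString): ?array
{
    // Assumed mapping from currency symbol to ISO code
    $symbols = ['$' => 'USD', '€' => 'EUR', '£' => 'GBP'];

    $currency = null;
    foreach ($symbols as $symbol => $code) {
        if (strpos($priceString, $symbol) !== false) {
            $currency = $code;
            break;
        }
    }

    // Grab the first run of digits, allowing commas and an optional decimal part
    if (!preg_match('/\d[\d,]*(?:\.\d+)?/', $priceString, $match)) {
        return null; // no number found in the string
    }

    // Drop thousands separators before converting to a float
    $amount = (float) str_replace(',', '', $match[0]);

    return ['currency' => $currency, 'amount' => $amount];
}

var_dump(parsePrice(' $1,299.00 '));
var_dump(parsePrice('n/a'));
```

Real-world price strings come in many more formats (for example "1.299,00 €"), which is why the tutorial relies on a dedicated library instead of a hand-written regular expression.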

Building the scraper

As we want to build a competitor price monitoring tool, let's say that this product URL is our own:

https://www.allendalewine.com/products/11262719/diplomatico-reserva-exclusiva

As a competitor page, we select the following:

https://www.winetoship.com/diplomatico-rum-reserva-exclusiva.html

Next, we have to define the CSS selectors that match the elements containing the price information.

Selecting Price Information

For our "own" website, the element holding the price matches the selector .sale-price.currency. On the competitor's page, the corresponding selector is .less-price .o_price span.
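
Before wiring a selector into the scraper, it can help to check it against a small piece of markup. The following sketch does that with PHP's built-in DOM extension and XPath instead of PHP HTML Parser, so it runs without any Composer packages. The sample HTML and the helper name firstTextByClasses are made up for illustration; the real product pages will look different.

```php
<?php
// Hypothetical sample markup, not copied from the real product page
$html = '<div class="product"><span class="sale-price currency">$39.99</span></div>';

/**
 * Returns the trimmed text of the first element that carries all given
 * CSS classes, or null if no element matches.
 */
function firstTextByClasses(string $html, array $classes): ?string
{
    $dom = new DOMDocument();
    // Real-world HTML rarely validates, so suppress parser warnings
    @$dom->loadHTML($html);

    // Build one XPath condition per class; all of them must hold
    $conditions = array_map(
        fn (string $class): string =>
            "contains(concat(' ', normalize-space(@class), ' '), ' {$class} ')",
        $classes
    );

    $xpath = new DOMXPath($dom);
    $nodes = $xpath->query('//*[' . implode(' and ', $conditions) . ']');

    if ($nodes === false || $nodes->length === 0) {
        return null;
    }

    return trim($nodes->item(0)->textContent);
}

var_dump(firstTextByClasses($html, ['sale-price', 'currency']));
var_dump(firstTextByClasses($html, ['missing-class']));
```

A null result is a useful signal that a shop has changed its markup and the selector needs to be updated.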

Putting the pieces together, we end up with the following script:

<?php
require 'vendor/autoload.php';

use \GuzzleHttp\Client;
use \PHPHtmlParser\Dom;
use \CurrencyDetector\Detector;


$productPairs = [
    'rum' => [
        'own' => [
            'url' => 'https://www.allendalewine.com/products/11262719/diplomatico-reserva-exclusiva',
            'selectorPath' => '.sale-price.currency'
        ],
        'competitor1' => [
            'url' => 'https://www.winetoship.com/diplomatico-rum-reserva-exclusiva.html',
            'selectorPath' => '.less-price .o_price span'
        ]
    ]
    # you can add as many product pairs as you wish
];

$detector = new Detector();

$comparison = [];

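// Reuse a single HTTP client for all requests
$client = new Client();

foreach ($productPairs as $productName => $pair) {
    foreach ($pair as $provider => $product) {
        // Fetch the product page and read the response body as a string
        $response = $client->request('GET', $product['url']);
        $html = (string) $response->getBody();

        // Parse the markup and look up the first element matching the selector
        $parser = new Dom;
        $parser->loadStr($html);
        $price = $parser->find($product['selectorPath'])[0];

        if ($price === null) {
            // The selector did not match, e.g. because the page layout changed
            $comparison[$productName][$provider] = null;
            continue;
        }

        $priceString = $price->text;

        // Split the raw price string into currency and numeric amount
        $comparison[$productName][$provider] = [
            'currency' => $detector->getCurrency($priceString),
            'amount' => $detector->getAmount($priceString),
        ];
    }
}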

echo json_encode($comparison);

You can add as many products and competitors as you like. The scraper loops through every product and provider and fetches the HTML markup. The DOM parser then extracts the element matching each selector. Finally, the currency detector parses the price string into a normalized currency and amount, so that the prices can be compared.
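
Once the prices are normalized, the actual comparison is a small step. The sketch below assumes the $comparison structure built by the script (product, then provider, then currency and amount) and that the amount is numeric. The function name priceDifferences and the sample values are hypothetical and not part of the original tool.

```php
<?php
/**
 * For each product, computes competitor price minus our own price.
 * Entries that are missing or use a different currency are skipped.
 */
function priceDifferences(array $comparison, string $ownKey = 'own'): array
{
    $report = [];

    foreach ($comparison as $productName => $providers) {
        if (!isset($providers[$ownKey]) || !is_array($providers[$ownKey])) {
            continue; // no own price to compare against
        }
        $own = $providers[$ownKey];

        foreach ($providers as $provider => $price) {
            if ($provider === $ownKey || !is_array($price)) {
                continue;
            }
            // Amounts are only comparable within the same currency
            if ($price['currency'] !== $own['currency']) {
                continue;
            }
            $difference = (float) $price['amount'] - (float) $own['amount'];
            $report[$productName][$provider] = round($difference, 2);
        }
    }

    return $report;
}

// Hypothetical sample data in the shape produced by the scraper
$sample = [
    'rum' => [
        'own'         => ['currency' => 'USD', 'amount' => 39.99],
        'competitor1' => ['currency' => 'USD', 'amount' => 36.49],
    ],
];

print_r(priceDifferences($sample));
```

A negative value means the competitor is cheaper than our own shop, which is the case you would typically want to be notified about.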

I used the following PHP web scraping tutorial to create this scraper.
