Robert N. Gutierrez
Automating Market Research on Dermstore: Scrape Categories and Monitor Inventory Trends

The beauty and skincare market moves at a breakneck pace. For Revenue Operations (RevOps) teams and e-commerce managers, keeping tabs on competitor pricing, stock levels, and brand positioning on major retailers like Dermstore is a full-time job. Manual browsing is tedious and results in stale data. By the time you’ve mapped a category, prices have changed or a top-seller has gone out of stock.

Implementing a data-on-demand strategy solves this. Instead of manual checks, automated scrapers pull a complete snapshot of any Dermstore category in minutes. This allows you to spot inventory gaps, analyze "Share of Shelf," and adjust bidding strategies in real time.

This guide uses the open-source Dermstore.com-Scrapers repository to build an automated market research pipeline using Python and Playwright.

Why Scrape Product Categories?

Before looking at the code, it helps to understand why this data is a goldmine for competitive intelligence.

1. Inventory Monitoring and Conquesting

"Conquesting" is a strategy where you bid on a competitor's keywords the moment they run out of stock. By scraping the availability field across a category like "Sun Care," you can identify top-tier products that are currently unavailable. This is your signal to increase ad spend on your own equivalent product to capture frustrated customers.

2. Share of Shelf Analysis

In physical retail, brands fight for eye-level shelf space. In e-commerce, the "shelf" is the first page of a category. By scraping category listings, you can calculate how many SKUs your brand has on page one versus your competitors. If Brand X has 40 SKUs and you have five, you have a visibility problem.

3. Pricing Architecture

Automated scraping allows you to map the price points of an entire category. You can quickly calculate the average price for "Vitamin C Serums" and see if your product is positioned as a luxury, mid-tier, or budget option compared to the live market.
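To make the pricing-architecture idea concrete, here is a minimal sketch using pandas with a handful of invented serum prices. The 25% bands around the category average are arbitrary illustration thresholds, not an industry standard:

```python
import pandas as pd

# Invented sample of scraped "Vitamin C Serums" rows; field names
# mirror the scraper's output.
products = pd.DataFrame([
    {"name": "Serum A", "price": 85.00},
    {"name": "Serum B", "price": 42.00},
    {"name": "Serum C", "price": 18.00},
])

avg_price = products["price"].mean()

def price_tier(price: float, avg: float) -> str:
    """Rough positioning rule: 25% above/below the category average."""
    if price > avg * 1.25:
        return "luxury"
    if price < avg * 0.75:
        return "budget"
    return "mid-tier"

products["tier"] = products["price"].apply(lambda p: price_tier(p, avg_price))
print(products[["name", "price", "tier"]])
```

Run against a full scraped category, the same few lines tell you immediately where your product sits relative to the live market average.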

Prerequisites and Setup

To follow this tutorial, you need:

  • Python 3.8+ installed.
  • A ScrapeOps API Key (the free tier works for this).
  • Basic familiarity with the terminal.

First, clone the repository and install the dependencies for the Playwright implementation:

```shell
# Clone the repository
git clone https://github.com/scraper-bank/Dermstore.com-Scrapers.git
cd Dermstore.com-Scrapers

# Install Python dependencies
pip install playwright playwright-stealth
playwright install chromium
```

Step 1: Configuring the Product Category Scraper

We will focus on the Playwright implementation in python/playwright/product_category/. This scraper navigates category pages, handles pagination, and extracts structured data.

Open python/playwright/product_category/scraper/dermstore_scraper_product_category_v1.py. You need to add your API key and define your target category URL.

```python
# python/playwright/product_category/scraper/dermstore_scraper_product_category_v1.py

API_KEY = "YOUR_SCRAPEOPS_API_KEY"

# Define the category you want to research
# Example: Sun Care category
TARGET_URL = "https://www.dermstore.com/skin-care/sun-care.list"

# The PROXY_CONFIG uses ScrapeOps to bypass anti-bot measures
PROXY_CONFIG = {
    "server": "http://residential-proxy.scrapeops.io:8181",
    "username": "scrapeops",
    "password": API_KEY
}
```

The scraper uses ScrapeOps Residential Proxies. This is necessary because Dermstore employs sophisticated anti-bot protections like Cloudflare. Without proxy rotation and optimized headers, the script will likely be blocked after a few requests.
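As a sketch of how this configuration can be wired into Playwright, Chromium accepts the proxy dictionary directly at browser launch. Note that `fetch_category_html` is a hypothetical helper for illustration; the repository's scraper layers pagination and parsing on top of this pattern:

```python
API_KEY = "YOUR_SCRAPEOPS_API_KEY"

# Same proxy settings as the repo's scraper configuration.
PROXY_CONFIG = {
    "server": "http://residential-proxy.scrapeops.io:8181",
    "username": "scrapeops",
    "password": API_KEY,
}

def fetch_category_html(url: str) -> str:
    """Fetch one page through the ScrapeOps residential proxy (sketch)."""
    # Imported lazily so the config above can be inspected without
    # Playwright installed.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        # Playwright takes the proxy dict at launch, so every request
        # from this browser is routed through the gateway.
        browser = p.chromium.launch(proxy=PROXY_CONFIG, headless=True)
        page = browser.new_page()
        page.goto(url, wait_until="domcontentloaded")
        html = page.content()
        browser.close()
        return html

# Usage: fetch_category_html("https://www.dermstore.com/skin-care/sun-care.list")
```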

Step 2: Running the Scraper

Run the script from your terminal. The scraper visits the category page, extracts all products across subsequent pages, and saves them to a JSONL file.

```shell
python python/playwright/product_category/scraper/dermstore_scraper_product_category_v1.py
```

We use JSONL (JSON Lines) because it is efficient for data pipelines. Each line is a standalone JSON object, so you can process files containing thousands of products without loading the entire dataset into memory at once.
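A minimal illustration of that round trip (the file name and sample records here are invented):

```python
import json

# Invented sample rows mirroring the scraper's output shape.
products = [
    {"name": "Sunscreen A", "price": 38.0, "availability": "in_stock"},
    {"name": "Sunscreen B", "price": 22.5, "availability": "out_of_stock"},
]

# Write: one JSON object per line, flushed independently, so a crash
# mid-run still leaves every completed line readable.
with open("sample_products.jsonl", "w") as f:
    for product in products:
        f.write(json.dumps(product) + "\n")

# Read: stream line by line instead of loading the whole file at once.
with open("sample_products.jsonl") as f:
    restored = [json.loads(line) for line in f]
```

The same streaming read works unchanged whether the file holds two products or two hundred thousand.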

Step 3: Interpreting the Data Signals

The output file contains objects representing each product. Here is how to map technical fields to business metrics:

| JSON Field | Business Metric | Insight |
| --- | --- | --- |
| `availability` | Inventory Status | Identifies `out_of_stock` items for ad conquesting. |
| `price` | Current Market Price | Monitors for sudden price drops or promotional shifts. |
| `rating` | Customer Sentiment | High-rated items with low stock are prime targets for replacement. |
| `isSponsored` | Competitor Spend | Detects which brands are paying for top-row placement. |

A typical data object looks like this:

```json
{
  "name": "Supergoop! Unseen Sunscreen SPF 40",
  "brand": "Supergoop!",
  "price": 38.00,
  "availability": "in_stock",
  "isSponsored": false,
  "rating": 4.8
}
```

Step 4: Visualizing the Data

Now for the RevOps analysis. We can use pandas to analyze the JSONL file and find which brands are potentially losing sales due to stockouts.

```python
import pandas as pd
import json

# Load the scraped data line by line
data = []
with open('dermstore_com_product_category_page_data_TIMESTAMP.jsonl', 'r') as f:
    for line in f:
        data.append(json.loads(line))

df = pd.DataFrame(data)

# 1. Calculate Share of Shelf (Top 10 Brands by SKU count)
share_of_shelf = df['brand'].value_counts().head(10)
print("Share of Shelf (Top 10 Brands):")
print(share_of_shelf)

# 2. Identify Out of Stock Competitors
oos_competitors = df[df['availability'] != 'in_stock']
print("\nOut of Stock Opportunities:")
print(oos_competitors[['brand', 'name', 'price']])
```

This script provides an immediate list of products missing from the shelf, allowing a marketing team to pivot their strategy quickly.

Reliable Scraping with ScrapeOps

Scraping major e-commerce sites isn't as simple as sending a GET request. These sites use advanced fingerprinting to detect automation. The Dermstore.com-Scrapers repository handles this complexity by integrating with ScrapeOps.

When you use the PROXY_CONFIG provided in the repo, the request follows this path:

  1. Your Script: Sends a request to the ScrapeOps Proxy Gateway.
  2. ScrapeOps: Rotates through residential IPs and selects browser headers that mimic a real user.
  3. Dermstore: Receives a legitimate request from a residential user and returns the data.
  4. Data Extraction: The Playwright logic parses the HTML and returns clean JSON.

Residential proxies are essential for e-commerce. Datacenter IPs are easily flagged, but residential IPs provide the reputation scores needed to navigate Cloudflare or DataDome challenges.

To Wrap Up

Automating market research on Dermstore transforms data from a static report into a competitive tool. By using the Dermstore.com-Scrapers repository, you can move away from manual browsing and toward a reactive, data-driven strategy.

Key Takeaways:

  • Automate Categories: Use category scrapers to monitor "Share of Shelf" and pricing trends across entire product lines.
  • Spot Gaps: Use the availability field to find "Out of Stock" signals for ad conquesting.
  • Scale Reliably: Use ScrapeOps proxy rotation to handle anti-bot measures and ensure your pipeline remains stable.

For your next steps, consider setting up a cron job to run these scrapers daily, or piping the JSONL output into a visualization tool like Tableau to create a live competitive dashboard.
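A daily schedule can be set up with a single crontab entry; the paths below are illustrative and should point at your own clone and preferred log location:

```shell
# Open the current user's crontab for editing
crontab -e

# Run the category scraper every day at 06:00 and append output to a log.
# Paths are placeholders -- adjust them for your environment.
0 6 * * * cd /home/youruser/Dermstore.com-Scrapers && /usr/bin/python3 python/playwright/product_category/scraper/dermstore_scraper_product_category_v1.py >> /home/youruser/dermstore_scrape.log 2>&1
```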
