DEV Community

Erika S. Adkins

Reverse-Engineering Category Strategy: Analyzing Assortment Mix with Node.js

Most web scraping tutorials focus on the micro view: how to extract the price of a single laptop or the specs of a specific blender. While price monitoring is useful, it doesn't tell you why a competitor is winning. To understand a competitor's true market position, you need to zoom out and look at the macro view, the Assortment Mix.

The assortment mix represents the total variety and depth of products a retailer stocks. By programmatically scraping entire product categories, you can reverse-engineer a competitor's strategy. This data reveals which brands they prioritize, where they have inventory gaps, and which price tiers they are betting on.

This guide covers how to build a high-speed Node.js toolchain to scrape a complete e-commerce category and transform that raw data into actionable business intelligence.

Phase 1: Strategy and Setup

Before writing code, you must define what "strategy" looks like in terms of data. If you only scrape prices, you only see a snapshot. To analyze Shelf Share (brand dominance) and Price Tiering, you need specific data points:

  • Product Title: The full name of the item.
  • Brand: Often found in data attributes or specific CSS classes.
  • Price: The current selling price.
  • Stock Status: Whether the item is "In Stock" or "Out of Stock."
  • Review Count: A proxy for sales volume and popularity.
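
As a rough sketch, the five data points above could map to a record like the one below. The field names match those used by the scraper later in this guide; the sample values and the `isUsableRecord` helper are hypothetical additions for illustration.

```javascript
// Example of one scraped record. Values are made up for illustration.
const sampleProduct = {
    title: 'Example 14" Laptop',
    brand: 'ExampleBrand',
    price: 899.99,
    availability: 'in stock',
    reviews: 132
};

// Hypothetical helper: true when a record has what the later analysis needs.
function isUsableRecord(p) {
    return typeof p.title === 'string' && p.title.length > 0
        && typeof p.brand === 'string' && p.brand.length > 0
        && Number.isFinite(p.price)
        && typeof p.availability === 'string'
        && Number.isInteger(p.reviews) && p.reviews >= 0;
}

console.log(isUsableRecord(sampleProduct));                    // true
console.log(isUsableRecord({ ...sampleProduct, price: NaN })); // false
```

Filtering with a check like this before analysis keeps half-parsed cards from skewing the counts.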

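For this project, we use Node.js with Axios for fetching pages and Cheerio for parsing. This stack is lighter than a headless browser such as Puppeteer, and it works whenever category pages are server-side rendered, which is common. Without the overhead of a full browser, throughput is limited mainly by network latency and the target site's rate limits. If the product grid is injected by client-side JavaScript, you will need a headless browser instead.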

Initialize your project and install the dependencies:

npm init -y
npm install axios cheerio json2csv

Phase 2: Building the Category Scraper

The goal is to capture every product in a category, which requires handling pagination. The following script finds the "Next" button, follows it, and collects data until the category is exhausted.

Following the pagination links recursively means products on page 10 or 20 of a category are captured too, not just those on the first page.

const axios = require('axios');
const cheerio = require('cheerio');
const fs = require('fs');

const BASE_URL = 'https://example-ecommerce.com/category/laptops';
const allProducts = [];

async function scrapeCategory(url) {
    console.log(`Scraping: ${url}`);
    try {
        const { data } = await axios.get(url, {
            headers: {
                'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/115.0.0.0 Safari/537.36'
            }
        });
        const $ = cheerio.load(data);

        // Extract one record per product card; the selectors are site-specific.
        $('.product-card').each((i, el) => {
            allProducts.push({
                title: $(el).find('.product-title').text().trim(),
                brand: $(el).attr('data-brand') || 'Unknown',
                // Strip currency symbols and thousands separators before parsing.
                price: parseFloat($(el).find('.price').text().replace(/[^0-9.]/g, '')),
                availability: $(el).find('.stock-status').text().trim().toLowerCase(),
                reviews: parseInt($(el).find('.review-count').text().replace(/[^0-9]/g, ''), 10) || 0
            });
        });

        // Follow the "Next" link, resolving relative hrefs against the current page.
        const nextHref = $('.pagination-next').attr('href');
        if (nextHref) {
            const nextUrl = new URL(nextHref, url).href;
            await scrapeCategory(nextUrl);
        }
    } catch (error) {
        console.error(`Error scraping ${url}: ${error.message}`);
    }
}

// Save whatever was collected, even if a later page failed.
scrapeCategory(BASE_URL).then(() => {
    console.log(`Finished. Total products found: ${allProducts.length}`);
    fs.writeFileSync('products.json', JSON.stringify(allProducts, null, 2));
});
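
If you would rather not rely on recursion, the same pagination idea can be written as a loop. The sketch below is one possible shape, not the only one: `crawlCategory`, the `fetchPage` callback, the `maxPages` cap, and the fake `shop.test` pages are all assumptions introduced here so the example runs without a network. In real use, `fetchPage` would wrap the Axios and Cheerio logic.

```javascript
// Sketch: iterative pagination with a page cap and a visited-URL guard.
// `fetchPage` is a hypothetical function you supply; for a given URL it must
// resolve to { products: [...], nextHref: string | undefined }.
async function crawlCategory(startUrl, fetchPage, { maxPages = 50 } = {}) {
    const visited = new Set();
    const products = [];
    let url = startUrl;

    // Stop when there is no next link, a link loops back, or the cap is hit.
    while (url && !visited.has(url) && visited.size < maxPages) {
        visited.add(url);
        const page = await fetchPage(url);
        products.push(...page.products);

        // Resolve relative links against the page they appeared on.
        url = page.nextHref ? new URL(page.nextHref, url).href : null;
    }
    return products;
}

// Usage with a fake three-page site (made-up data).
const fakeSite = {
    'https://shop.test/c/laptops': { products: [{ title: 'A' }], nextHref: '?page=2' },
    'https://shop.test/c/laptops?page=2': { products: [{ title: 'B' }], nextHref: '?page=3' },
    'https://shop.test/c/laptops?page=3': { products: [{ title: 'C' }], nextHref: undefined }
};

const crawlDone = crawlCategory('https://shop.test/c/laptops', async u => fakeSite[u])
    .then(items => {
        console.log(items.map(p => p.title)); // [ 'A', 'B', 'C' ]
        return items;
    });
```

The visited set guards against a "Next" link that points back to an earlier page, and the page cap bounds the crawl if the site's pagination never ends.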

How it works

The script uses axios to fetch the HTML and cheerio to load it into a jQuery-like interface. The each() loop iterates through every product card on the page. Using parseFloat and replace with regex on the price is necessary because e-commerce data is often messy, and cleaning it during the scrape saves time during analysis.
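
One detail worth handling explicitly: when a card has no numeric price, `parseFloat` returns `NaN`, and `JSON.stringify` writes `NaN` as `null`. The helper below is a minimal sketch of a stricter cleaner. The name `parsePrice` is hypothetical, and it assumes US-style formatting such as "$1,299.99"; a European-style "1.299,99" would be misread.

```javascript
// Sketch: price cleaning that returns null for anything unparseable.
// Assumes "." is the decimal separator and "," a thousands separator.
function parsePrice(text) {
    const cleaned = String(text).replace(/[^0-9.]/g, '');
    const value = parseFloat(cleaned);
    return Number.isFinite(value) ? value : null;
}

console.log(parsePrice('$1,299.99'));      // 1299.99
console.log(parsePrice('  $499 '));        // 499
console.log(parsePrice('Call for price')); // null
```

Returning an explicit `null` makes missing prices easy to filter out before any tier calculation.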

Phase 3: The Analysis Engine

With a products.json file ready, you can turn those rows into insights. JavaScript's array methods (reduce, filter, and map) are well suited to this.

First, calculate Shelf Share. This identifies which brands occupy the most digital shelf space.

const products = require('./products.json');

function analyzeStrategy(data) {
    const totalItems = data.length;

    // 1. Calculate Brand Shelf Share
    const brandCounts = data.reduce((acc, product) => {
        acc[product.brand] = (acc[product.brand] || 0) + 1;
        return acc;
    }, {});

    const shelfShare = Object.keys(brandCounts).map(brand => ({
        brand,
        count: brandCounts[brand],
        percentage: ((brandCounts[brand] / totalItems) * 100).toFixed(2) + '%'
    })).sort((a, b) => b.count - a.count);

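    // 2. Price Tiering Analysis
    const tiers = { budget: 0, mid: 0, premium: 0 };
    data.forEach(p => {
        // Skip items without a usable price: NaN is saved as null in JSON,
        // and null < 500 is true, which would inflate the budget tier.
        if (!Number.isFinite(p.price)) return;
        if (p.price < 500) tiers.budget++;
        else if (p.price < 1200) tiers.mid++;
        else tiers.premium++;
    });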

    console.log("--- Brand Dominance (Shelf Share) ---");
    console.table(shelfShare.slice(0, 10)); 

    console.log("--- Price Tier Distribution ---");
    console.table(tiers);
}

analyzeStrategy(products);

Using reduce transforms an array of products into an object that counts brand occurrences. This reveals if a competitor is focusing heavily on a specific manufacturer, which might indicate an exclusive partnership or a supply chain preference.
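
The same reduce pattern can combine both views, showing which price tiers each brand occupies. This is a sketch under stated assumptions: `tierOf` and `tiersByBrand` are hypothetical names, the 500 and 1200 boundaries are the ones used in analyzeStrategy, and the sample rows are made up.

```javascript
// Sketch: count products per brand within each price tier.
function tierOf(price) {
    if (!Number.isFinite(price)) return 'unknown'; // missing or unparsed price
    if (price < 500) return 'budget';
    if (price < 1200) return 'mid';
    return 'premium';
}

function tiersByBrand(data) {
    return data.reduce((acc, p) => {
        if (!acc[p.brand]) acc[p.brand] = { budget: 0, mid: 0, premium: 0, unknown: 0 };
        acc[p.brand][tierOf(p.price)]++;
        return acc;
    }, {});
}

// Made-up sample rows for illustration.
const tierSample = [
    { brand: 'Acme', price: 399 },
    { brand: 'Acme', price: 450 },
    { brand: 'Acme', price: 1500 },
    { brand: 'Zenith', price: 999 },
    { brand: 'Zenith', price: null }
];

console.table(tiersByBrand(tierSample));
```

A brand that appears only in the budget column is positioned very differently from one spread across all three tiers, even if their shelf share is identical.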

Phase 4: Finding Inventory Gaps

Stock status is a goldmine for competitive intelligence. If a competitor has a high Out of Stock (OOS) rate for a specific brand, it signals either high popularity or a failing supply chain.

You can modify the analysis to find these gaps:

function analyzeAvailability(data) {
    const brandOOS = data.reduce((acc, p) => {
        if (!acc[p.brand]) acc[p.brand] = { total: 0, oos: 0 };
        acc[p.brand].total++;
        if (p.availability.includes('out of stock')) {
            acc[p.brand].oos++;
        }
        return acc;
    }, {});

    const oosRates = Object.keys(brandOOS).map(brand => ({
        brand,
        oos_rate: ((brandOOS[brand].oos / brandOOS[brand].total) * 100).toFixed(2) + '%'
    })).sort((a, b) => parseFloat(b.oos_rate) - parseFloat(a.oos_rate));

    console.log("--- Highest Out-of-Stock Rates by Brand ---");
    console.table(oosRates.filter(b => b.oos_rate !== "0.00%"));
}

analyzeAvailability(products);

An OOS rate of 50% for a premium brand may indicate demand that the retailer cannot meet, although a supply problem would produce the same number. If you sell that brand yourself, it is a cue to run a promotion on it and capture frustrated customers.
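
Because review count was collected as a proxy for popularity, it can help separate the two explanations. The sketch below pairs each brand's OOS rate with its average review count. Treating "many reviews plus high OOS" as demand-driven is a heuristic rather than a rule, and `oosWithDemand` and the sample rows are hypothetical.

```javascript
// Sketch: pair each brand's out-of-stock rate with its average review count.
function oosWithDemand(data) {
    const byBrand = data.reduce((acc, p) => {
        if (!acc[p.brand]) acc[p.brand] = { total: 0, oos: 0, reviews: 0 };
        acc[p.brand].total++;
        acc[p.brand].reviews += p.reviews || 0;
        if ((p.availability || '').includes('out of stock')) acc[p.brand].oos++;
        return acc;
    }, {});

    return Object.entries(byBrand).map(([brand, s]) => ({
        brand,
        oosRate: s.oos / s.total,       // 0..1, kept numeric for sorting
        avgReviews: s.reviews / s.total
    })).sort((a, b) => b.oosRate - a.oosRate);
}

// Made-up sample rows for illustration.
const oosSample = [
    { brand: 'Acme', availability: 'out of stock', reviews: 900 },
    { brand: 'Acme', availability: 'in stock', reviews: 700 },
    { brand: 'Zenith', availability: 'out of stock', reviews: 4 },
    { brand: 'Zenith', availability: 'out of stock', reviews: 0 }
];

console.table(oosWithDemand(oosSample));
```

In this made-up data, Zenith is fully out of stock but barely reviewed, which looks more like a supply or delisting issue than runaway demand.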

Phase 5: Visualization and Reporting

While console.table works for development, stakeholders usually require spreadsheets. Use the json2csv library to export metrics into a format compatible with Excel or Google Sheets.

const fs = require('fs'); // omit if fs is already required in this file
const { Parser } = require('json2csv');

function exportReport(data) {
    try {
        const parser = new Parser();
        const csv = parser.parse(data);
        fs.writeFileSync('category_report.csv', csv);
        console.log('Report saved to category_report.csv');
    } catch (err) {
        console.error(err);
    }
}

Once in a spreadsheet, you can chart Shelf Share as a pie chart and Price Tiering as a bar chart or histogram. These visualizations make the competitor's priorities easier to communicate.
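
Spreadsheets chart numeric columns more easily than strings like "75.00%", so it can help to export flat rows with plain numbers. The sketch below builds such rows for Shelf Share. `shelfShareRows` is a hypothetical helper, and `toCsv` is a small dependency-free stand-in for json2csv used here only so the example runs on its own.

```javascript
// Sketch: build flat, numeric rows that chart cleanly in a spreadsheet.
function shelfShareRows(data) {
    const counts = {};
    for (const p of data) counts[p.brand] = (counts[p.brand] || 0) + 1;
    return Object.entries(counts)
        .map(([brand, count]) => ({
            brand,
            count,
            share_pct: Number(((count / data.length) * 100).toFixed(2))
        }))
        .sort((a, b) => b.count - a.count);
}

// Dependency-free stand-in for json2csv, for illustration only.
function toCsv(rows) {
    if (rows.length === 0) return '';
    const headers = Object.keys(rows[0]);
    // Quote values containing commas, quotes, or newlines.
    const quote = v => {
        const s = String(v);
        return /[",\n]/.test(s) ? `"${s.replace(/"/g, '""')}"` : s;
    };
    const lines = rows.map(r => headers.map(h => quote(r[h])).join(','));
    return [headers.join(','), ...lines].join('\n');
}

// Made-up sample rows for illustration.
const reportRows = shelfShareRows([
    { brand: 'Acme' }, { brand: 'Acme' }, { brand: 'Acme' }, { brand: 'Zenith, Inc.' }
]);
console.log(toCsv(reportRows));
// brand,count,share_pct
// Acme,3,75
// "Zenith, Inc.",1,25
```

The same rows could be passed to exportReport instead of `toCsv` when json2csv is installed.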

To Wrap Up

By moving from simple product scraping to category-wide analysis, you turn raw HTML into a strategic roadmap. Node.js allows you to gather and interpret data through the lens of brand dominance, pricing strategy, and inventory health.

Key Takeaways:

  • Zoom Out: Category-level scraping reveals the assortment mix, which offers more strategic value than individual price points.
  • Efficiency: Use Axios and Cheerio for fast scraping of server-rendered pages.
  • Analyze Programmatically: Use reduce and filter to calculate metrics like Shelf Share and OOS rates.
  • Identify Gaps: High out-of-stock rates in specific price tiers or brands represent immediate market opportunities.

Try automating this script to run weekly. Tracking how a competitor’s strategy evolves over time, such as a sudden pivot to budget brands or the phasing out of a manufacturer, gives you a significant advantage.
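
Once you have two weekly snapshots, comparing them can be as simple as the sketch below, which reports the change in each brand's shelf share in percentage points. The function names and sample data are hypothetical; in practice the two arrays would come from two saved products.json files.

```javascript
// Sketch: compare brand shelf share between two snapshots (e.g. two weekly runs).
function shareByBrand(data) {
    const shares = {};
    for (const p of data) shares[p.brand] = (shares[p.brand] || 0) + 1;
    for (const brand of Object.keys(shares)) {
        shares[brand] = (shares[brand] / data.length) * 100;
    }
    return shares;
}

function shelfShareShift(previous, current) {
    const before = shareByBrand(previous);
    const after = shareByBrand(current);
    // Include brands that appear in either snapshot.
    const brands = new Set([...Object.keys(before), ...Object.keys(after)]);
    return [...brands].map(brand => ({
        brand,
        before_pct: before[brand] || 0,
        after_pct: after[brand] || 0,
        change_pts: (after[brand] || 0) - (before[brand] || 0)
    })).sort((a, b) => Math.abs(b.change_pts) - Math.abs(a.change_pts));
}

// Made-up snapshots for illustration.
const lastWeek = [{ brand: 'Acme' }, { brand: 'Acme' }, { brand: 'Zenith' }, { brand: 'Zenith' }];
const thisWeek = [{ brand: 'Acme' }, { brand: 'Zenith' }, { brand: 'Zenith' }, { brand: 'Zenith' }];

console.table(shelfShareShift(lastWeek, thisWeek));
```

A brand whose share drops to zero between runs is a candidate for having been phased out, though a temporary listing change would look the same in a single comparison.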
