DEV Community

agenthustler

eBay Scraping: Extract Product Listings, Prices, and Seller Data

eBay is one of the world's largest online marketplaces, with over 1.7 billion live listings at any given time. For developers, researchers, and business analysts, eBay represents a goldmine of product data — pricing trends, seller behavior, market demand signals, and competitive intelligence.

In this comprehensive guide, you'll learn how to scrape eBay product listings, extract pricing and seller data, navigate pagination efficiently, and scale your scraping operation using cloud-based tools like Apify.

Why Scrape eBay?

eBay data is incredibly valuable for several use cases:

  • Price tracking: Monitor product prices over time to spot deals or understand market trends
  • Competitive intelligence: Understand what sellers are charging and how they position products
  • Market research: Identify trending products, underserved niches, and demand patterns
  • Arbitrage opportunities: Find price differences between eBay and other marketplaces
  • Academic research: Study auction behavior, pricing dynamics, and marketplace economics
  • Inventory monitoring: Track stock levels and availability of specific products

As always, respect eBay's Terms of Service and use data responsibly. Don't overload their servers, and don't use data for prohibited purposes.

Understanding eBay's Site Structure

eBay has a complex but well-organized site structure. Understanding it is the first step to building an effective scraper.

Search Results Pages

```
https://www.ebay.com/sch/i.html?_nkw=iphone+15&_pgn=1
```

Search result pages display listings in either grid or list view. Each listing card includes:

  • Product title
  • Price (Buy It Now or current bid)
  • Shipping cost
  • Item condition (New, Used, Refurbished)
  • Seller name and rating
  • Number of bids (for auctions)
  • Time remaining (for auctions)
  • Product image thumbnail
  • Free returns badge
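
Before writing a parser, it helps to fix a target schema for these fields. The record below is one reasonable shape for a scraped listing card — the field names are my own convention, not anything eBay defines:

```javascript
// Illustrative target schema for one search-result listing card.
// All values are raw strings as scraped; normalization happens later.
const exampleListing = {
  itemId: '123456789012',   // taken from the listing URL
  title: 'Apple iPhone 15 Pro Max 256GB',
  price: '$999.00',
  shipping: 'Free shipping',
  condition: 'Brand New',
  seller: 'example_seller (1,234) 99.1%',
  bids: null,               // non-null only for auction listings
  timeLeft: null,           // non-null only for auction listings
  isAuction: false
};
```

Agreeing on a schema up front makes deduplication and downstream analysis much simpler, since every extraction path produces the same keys.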

Product Detail Pages

```
https://www.ebay.com/itm/123456789012
```

Product pages are the richest data source and contain:

  • Full product title and subtitle
  • All product images (full gallery)
  • Price details (Buy It Now price, auction information, best offer option)
  • Item specifics (brand, model, color, size, UPC, etc.)
  • Item condition and detailed condition description
  • Seller information (username, feedback score, positive feedback percentage)
  • Shipping options and costs
  • Return policy details
  • Item location
  • Watchers count
  • Number of units sold

Seller Profile Pages

```
https://www.ebay.com/usr/seller_username
```

Seller profiles provide:

  • Total feedback score and positive percentage
  • Member since date
  • Location
  • Recent feedback comments and ratings
  • Active listings count
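
These stats usually arrive as display strings ("1,234" for the score, "99.1% positive feedback" for the percentage). A pair of small normalizers — my own helpers, not part of any eBay API — turn them into numbers for analysis:

```javascript
// Convert a display string like "1,234" into an integer feedback score;
// returns null when nothing parseable is found.
function parseFeedbackScore(text) {
  const m = (text || '').replace(/,/g, '').match(/\d+/);
  return m ? parseInt(m[0], 10) : null;
}

// Extract the numeric percentage from "99.1% positive feedback".
function parsePositivePercent(text) {
  const m = (text || '').match(/(\d+(?:\.\d+)?)\s*%/);
  return m ? parseFloat(m[1]) : null;
}
```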

Navigating eBay Pagination

eBay search results pagination is one of the trickier aspects to handle correctly. Here's what you need to know.

URL Parameters

eBay uses query parameters to control search behavior. Understanding these lets you build precise search URLs programmatically:

```javascript
const buildSearchUrl = (query, page = 1, options = {}) => {
  const params = new URLSearchParams({
    _nkw: query,              // Search keywords
    _pgn: page,               // Page number
    _ipg: 240,                // Items per page (60, 120, or 240)
    _sop: options.sort || 12  // Sort order (12 = Best Match)
  });
  // Add optional filters only when set, so the URL stays clean
  if (options.buyItNow) params.set('LH_BIN', '1');    // Buy It Now only
  if (options.freeShipping) params.set('LH_FS', '1'); // Free shipping only
  if (options.minPrice != null) params.set('_udlo', options.minPrice); // Min price
  if (options.maxPrice != null) params.set('_udhi', options.maxPrice); // Max price
  return `https://www.ebay.com/sch/i.html?${params}`;
};
```

Available Sort Options

| Value | Sort Order |
|-------|------------|
| 12 | Best Match (default) |
| 1 | Time: ending soonest |
| 10 | Time: newly listed |
| 15 | Price + Shipping: lowest first |
| 16 | Price + Shipping: highest first |
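
Rather than scattering magic numbers through calling code, the codes above can live in a small constant map. The names are my own convention; the numeric values are eBay's:

```javascript
// Named constants for eBay's _sop sort codes.
const SORT = {
  BEST_MATCH: 12,
  ENDING_SOONEST: 1,
  NEWLY_LISTED: 10,
  PRICE_LOWEST: 15,
  PRICE_HIGHEST: 16
};

// Usage with the URL builder above:
//   buildSearchUrl('iphone 15', 1, { sort: SORT.PRICE_LOWEST })
```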

Pagination Limits and Workarounds

eBay typically limits search results to about 10,000 items — roughly 42 pages at 240 items per page. To get more results, split your queries:

  1. Use category filters to narrow the result set
  2. Apply price range filters to create smaller, non-overlapping result sets
  3. Sort by newly listed and scrape incrementally over time

```javascript
// Strategy: split by price ranges to bypass the 10K result limit.
// scrapeAllPages(url) is assumed to be a pagination loop like the
// scrapeEbaySearch function shown in the next section.
async function scrapeFullCategory(query) {
  const priceRanges = [
    { min: 0, max: 25 },
    { min: 25, max: 50 },
    { min: 50, max: 100 },
    { min: 100, max: 250 },
    { min: 250, max: 500 },
    { min: 500, max: null }  // No upper limit
  ];

  const allResults = [];
  for (const range of priceRanges) {
    const url = buildSearchUrl(query, 1, {
      minPrice: range.min,
      maxPrice: range.max
    });
    const results = await scrapeAllPages(url);
    allResults.push(...results);
    console.log(
      `Price $${range.min}-$${range.max ?? '+'}: ${results.length} items`
    );
  }

  // Deduplicate by item ID (listings near a boundary can appear twice)
  const unique = [...new Map(
    allResults.map(item => [item.itemId, item])
  ).values()];
  return unique;
}
```

Scraping Search Results

Here's a complete working example for extracting listing data from eBay search results across multiple pages:

```javascript
const { chromium } = require('playwright');

async function scrapeEbaySearch(query, maxPages = 5) {
  const browser = await chromium.launch({ headless: true });
  const context = await browser.newContext({
    userAgent: 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) ' +
      'AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36',
    viewport: { width: 1440, height: 900 }
  });

  const page = await context.newPage();
  const allItems = [];

  for (let pageNum = 1; pageNum <= maxPages; pageNum++) {
    const url = `https://www.ebay.com/sch/i.html?_nkw=${
      encodeURIComponent(query)
    }&_pgn=${pageNum}&_ipg=240`;
    console.log(`Scraping page ${pageNum}...`);

    await page.goto(url, { waitUntil: 'domcontentloaded' });
    await page.waitForSelector('.srp-results .s-item', {
      timeout: 10000
    });

    const items = await page.evaluate(() => {
      const listings = document.querySelectorAll(
        '.srp-results .s-item'
      );
      return Array.from(listings).map(item => {
        const titleEl = item.querySelector('.s-item__title span');
        const priceEl = item.querySelector('.s-item__price');
        const linkEl = item.querySelector('.s-item__link');
        const shippingEl = item.querySelector('.s-item__shipping');
        const conditionEl = item.querySelector('.SECONDARY_INFO');
        const sellerEl = item.querySelector(
          '.s-item__seller-info-text'
        );
        const bidEl = item.querySelector('.s-item__bids');
        const timeLeftEl = item.querySelector('.s-item__time-left');

        return {
          title: titleEl?.textContent?.trim(),
          price: priceEl?.textContent?.trim(),
          url: linkEl?.href?.split('?')[0],
          shipping: shippingEl?.textContent?.trim(),
          condition: conditionEl?.textContent?.trim(),
          seller: sellerEl?.textContent?.trim(),
          bids: bidEl?.textContent?.trim() || null,
          timeLeft: timeLeftEl?.textContent?.trim() || null,
          isAuction: !!bidEl
        };
      }).filter(item =>
        item.title && item.title !== 'Shop on eBay'
      );
    });

    allItems.push(...items);
    console.log(`  Found ${items.length} items`);

    // Check if there's a next page
    const hasNext = await page.$(
      '.pagination__next:not(.pagination__next--disabled)'
    );
    if (!hasNext) break;

    // Respectful delay between page loads
    await new Promise(r =>
      setTimeout(r, 2000 + Math.random() * 3000)
    );
  }

  await browser.close();
  return allItems;
}
```

Extracting Detailed Product Data

For richer data, you need to scrape individual product detail pages. This is slower but gives you the full picture:

```javascript
async function scrapeEbayProduct(page, url) {
  await page.goto(url, { waitUntil: 'domcontentloaded' });

  const data = await page.evaluate(() => {
    // Title
    const title = document.querySelector(
      'h1.x-item-title__mainTitle span'
    )?.textContent?.trim();

    // Price information
    const priceEl = document.querySelector('.x-price-primary span');
    const price = priceEl?.textContent?.trim();
    const originalPriceEl = document.querySelector(
      '.x-price-was span'
    );
    const originalPrice = originalPriceEl?.textContent?.trim();

    // Condition
    const condition = document.querySelector(
      '.x-item-condition-text span'
    )?.textContent?.trim();

    // All product images
    const images = Array.from(
      document.querySelectorAll('.ux-image-carousel img')
    ).map(img => img.src || img.dataset.src).filter(Boolean);

    // Item specifics (brand, model, color, etc.)
    const specifics = {};
    document.querySelectorAll(
      '.ux-labels-values__labels-content'
    ).forEach((label, i) => {
      const value = document.querySelectorAll(
        '.ux-labels-values__values-content'
      )[i];
      if (label && value) {
        specifics[label.textContent.trim()] =
          value.textContent.trim();
      }
    });

    // Seller information
    const sellerName = document.querySelector(
      '.x-sellercard-atf__info__about-seller a'
    )?.textContent?.trim();
    const sellerFeedback = document.querySelector(
      '.x-sellercard-atf__about-seller span'
    )?.textContent?.trim();

    // Shipping details
    const shippingCost = document.querySelector(
      '.ux-labels-values--shipping .ux-textspans'
    )?.textContent?.trim();
    const deliveryDate = document.querySelector(
      '.ux-labels-values--deliverto .ux-textspans--BOLD'
    )?.textContent?.trim();

    // Location
    const itemLocation = document.querySelector(
      '.ux-labels-values--itemLocation .ux-textspans'
    )?.textContent?.trim();

    // Social proof: watchers and sold count
    const watchersEl = document.querySelector(
      '.x-watch-count span'
    );
    const soldEl = document.querySelector(
      '.x-quantity__availability span'
    );

    return {
      title, price, originalPrice, condition,
      images, specifics,
      seller: { name: sellerName, feedback: sellerFeedback },
      shipping: {
        cost: shippingCost,
        estimatedDelivery: deliveryDate
      },
      itemLocation,
      watchers: watchersEl?.textContent?.trim(),
      soldInfo: soldEl?.textContent?.trim()
    };
  });

  return { ...data, url };
}
```

Extracting Seller Data

Understanding seller behavior and reputation is crucial for competitive analysis. Here's how to scrape seller profile data:

```javascript
async function scrapeSellerProfile(page, username) {
  const url = `https://www.ebay.com/usr/${username}`;
  await page.goto(url, { waitUntil: 'domcontentloaded' });

  const sellerData = await page.evaluate(() => {
    const feedbackScore = document.querySelector(
      '.str-seller-card__stats-content b'
    )?.textContent;
    const positivePercent = document.querySelector(
      '.str-seller-card__stats-content span'
    )?.textContent;
    const memberSince = document.querySelector(
      '.str-seller-card__member-since'
    )?.textContent?.trim();
    const location = document.querySelector(
      '.str-seller-card__store-info span'
    )?.textContent?.trim();

    // Recent feedback entries
    const feedbackItems = Array.from(
      document.querySelectorAll('.fdbk-detail-list .card')
    ).map(card => ({
      rating: card.querySelector('.fdbk-detail-list__icon')
        ?.getAttribute('aria-label'),
      comment: card.querySelector('.fdbk-detail-list__comment')
        ?.textContent?.trim(),
      item: card.querySelector('.fdbk-detail-list__item a')
        ?.textContent?.trim(),
      date: card.querySelector('.fdbk-detail-list__date')
        ?.textContent?.trim()
    }));

    return {
      feedbackScore,
      positivePercent,
      memberSince,
      location,
      recentFeedback: feedbackItems
    };
  });

  return { username, ...sellerData };
}
```

Handling eBay's Anti-Scraping Measures

eBay has robust anti-bot protections. Here are the key techniques for handling them.

1. Session Management

eBay tracks sessions closely. Maintain consistent browser sessions rather than creating new ones for each request:

```javascript
const context = await browser.newContext({
  userAgent: 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) ' +
    'AppleWebKit/537.36',
  locale: 'en-US',
  timezoneId: 'America/New_York',
  // Reuse cookies and storage saved earlier via
  // context.storageState({ path: './ebay-session.json' })
  storageState: './ebay-session.json'
});
```

2. Proxy Rotation

Rotating proxies is essential for large-scale eBay scraping to avoid IP-based blocking:

```javascript
const proxyList = [
  'http://proxy1:port',
  'http://proxy2:port',
  'http://proxy3:port'
];

function getRandomProxy() {
  return proxyList[Math.floor(Math.random() * proxyList.length)];
}

const browser = await chromium.launch({
  proxy: { server: getRandomProxy() }
});
```

3. Human-Like Browsing Behavior

Add realistic interactions — scrolling, mouse movements, and variable timing — to avoid triggering bot detection:

```javascript
async function humanLikeBrowse(page) {
  // Scroll down naturally
  await page.evaluate(() => {
    window.scrollBy(0, Math.random() * 500 + 200);
  });
  await new Promise(r =>
    setTimeout(r, 1000 + Math.random() * 2000)
  );

  // Move mouse to a random position
  await page.mouse.move(
    Math.random() * 1200 + 100,
    Math.random() * 600 + 100
  );
}
```

4. Retry Logic with Exponential Backoff

Always implement retry logic to handle transient failures gracefully:

```javascript
async function scrapeWithRetry(fn, maxRetries = 3) {
  for (let attempt = 1; attempt <= maxRetries; attempt++) {
    try {
      return await fn();
    } catch (error) {
      console.log(`Attempt ${attempt} failed: ${error.message}`);
      if (attempt === maxRetries) throw error;
      const delay = Math.pow(2, attempt) * 1000
        + Math.random() * 1000;
      await new Promise(r => setTimeout(r, delay));
    }
  }
}
```

Scaling with Apify

When you need to scrape thousands or millions of eBay listings, managing your own browser infrastructure becomes impractical. Apify provides a cloud-based platform specifically designed for web scraping at scale.

Using Apify for eBay Scraping

The Apify Store offers pre-built eBay scraper actors that handle all the hard parts — proxy rotation, browser management, anti-bot evasion, retry logic, and structured data storage.

```javascript
const { ApifyClient } = require('apify-client');

const client = new ApifyClient({
  token: 'YOUR_APIFY_TOKEN',
});

async function scrapeEbayAtScale() {
  const run = await client.actor('ACTOR_ID').call({
    searchQueries: [
      'iphone 15 pro max',
      'macbook pro m3',
      'sony ps5'
    ],
    maxItems: 1000,
    scrapeProductPages: true,
    proxyConfiguration: {
      useApifyProxy: true,
      apifyProxyGroups: ['RESIDENTIAL']
    }
  });

  const { items } = await client.dataset(
    run.defaultDatasetId
  ).listItems();
  console.log(`Scraped ${items.length} eBay listings`);

  // Download as CSV
  const csvUrl = `https://api.apify.com/v2/datasets/${
    run.defaultDatasetId
  }/items?format=csv`;
  console.log(`Download CSV: ${csvUrl}`);

  return items;
}
```

Scheduling Regular Scrapes

For ongoing price monitoring, set up automated scheduled runs:

```javascript
const schedule = await client.schedules().create({
  name: 'ebay-daily-price-check',
  cronExpression: '0 9 * * *',  // Every day at 9 AM
  actions: [{
    type: 'RUN_ACTOR',
    actorId: 'ACTOR_ID',
    runInput: {
      searchQueries: ['gaming laptop'],
      maxItems: 500
    }
  }]
});
```

Webhook Integration for Data Pipelines

Get notified when your scrape completes so you can trigger downstream processing:

```javascript
const run = await client.actor('ACTOR_ID').call(input, {
  webhooks: [{
    eventTypes: ['ACTOR.RUN.SUCCEEDED'],
    requestUrl: 'https://your-app.com/api/ebay-data-ready',
    payloadTemplate: JSON.stringify({
      datasetId: '{{resource.defaultDatasetId}}',
      itemCount: '{{resource.stats.itemsScraped}}'
    })
  }]
});
```

Search Pages vs. Product Pages: When to Use Each

Choosing the right scraping strategy depends on your data needs:

| Data Need | Search Pages | Product Pages |
|-----------|--------------|---------------|
| Basic price comparison | Sufficient | Overkill |
| Seller identification | Yes | Yes |
| Full item specifics | No | Yes |
| Image gallery | Thumbnail only | Full gallery |
| Shipping details | Basic | Detailed |
| Item condition details | Basic label | Full description |
| Auction bid history | Bid count only | Full history |
| Speed | Fast (240 items/page) | Slow (1 item/page) |
| Proxy cost | Low | Higher |

Best practice: Start with search pages to identify items of interest, then selectively scrape product pages only for items that match your specific criteria. This two-pass approach is far more efficient than scraping every product page.
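
The first pass can be a cheap in-memory filter over search-page records; only the survivors earn a slow product-page visit. A sketch of that filter, using the field names from the search scraper above (the criteria options are illustrative):

```javascript
// Pass 1 output -> decide which items deserve a product-page scrape.
// Items whose price can't be parsed are skipped when maxPrice is set.
function selectForDetailScrape(items, { maxPrice, condition } = {}) {
  return items.filter(item => {
    if (maxPrice != null) {
      const p = parseFloat((item.price || '').replace(/[^0-9.]/g, ''));
      if (isNaN(p) || p > maxPrice) return false;
    }
    if (condition && item.condition !== condition) return false;
    return true;
  });
}
```

With a shortlist in hand, pass 2 loops over the selected items and calls the product-page scraper for each, keeping the expensive requests to a minimum.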

Extracting Completed/Sold Listings

One of eBay's most valuable datasets for market research is completed listings — items that actually sold at a real price:

```javascript
function buildSoldListingsUrl(query) {
  const params = new URLSearchParams({
    _nkw: query,
    LH_Complete: 1,  // Completed listings
    LH_Sold: 1,      // Sold items only
    _sop: 13,        // Sort: recently ended first
    _ipg: 240
  });
  return `https://www.ebay.com/sch/i.html?${params}`;
}
```

Completed listings show you what people actually paid, not what sellers hope to get. This is incredibly valuable for pricing research and market analysis.
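
A common metric built on this data is sell-through rate: the share of listings for a query that actually sold, computed from the sold-search and active-search result counts. A simple sketch (the counts are assumed to come from the two searches' totals):

```javascript
// Sell-through rate: sold / (sold + active), as a fraction 0..1.
// A high rate suggests demand outstrips supply for the query.
function sellThroughRate(soldCount, activeCount) {
  const total = soldCount + activeCount;
  return total === 0 ? 0 : soldCount / total;
}
```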

Data Analysis Patterns

Once you've collected the data, here are practical analysis patterns.

Price Distribution

```javascript
function analyzePrices(items) {
  const prices = items
    .map(item => parseFloat(
      item.price?.replace(/[^0-9.]/g, '')
    ))
    .filter(p => !isNaN(p));

  prices.sort((a, b) => a - b);

  return {
    count: prices.length,
    min: prices[0],
    max: prices[prices.length - 1],
    median: prices[Math.floor(prices.length / 2)],
    average: (
      prices.reduce((a, b) => a + b, 0) / prices.length
    ).toFixed(2),
    p25: prices[Math.floor(prices.length * 0.25)],
    p75: prices[Math.floor(prices.length * 0.75)]
  };
}
```

Seller Competition Analysis

```javascript
function analyzeCompetition(items) {
  const sellerMap = new Map();

  items.forEach(item => {
    if (!item.seller) return;
    if (!sellerMap.has(item.seller)) {
      sellerMap.set(item.seller, {
        count: 0, totalValue: 0
      });
    }
    const seller = sellerMap.get(item.seller);
    seller.count++;
    seller.totalValue += parseFloat(
      item.price?.replace(/[^0-9.]/g, '')
    ) || 0;
  });

  return Array.from(sellerMap.entries())
    .map(([name, data]) => ({
      seller: name,
      listingCount: data.count,
      totalValue: data.totalValue.toFixed(2),
      avgPrice: (data.totalValue / data.count).toFixed(2)
    }))
    .sort((a, b) => b.listingCount - a.listingCount)
    .slice(0, 20);
}
```

Best Practices and Tips

  1. Start with search pages: Extract basic data from search results before hitting individual product pages to minimize requests.

  2. Use item IDs for deduplication: eBay item numbers are globally unique — use them to avoid storing duplicate listings.

  3. Handle auction vs. Buy It Now: These listing types have different data structures and require different extraction logic. Always check which type you're dealing with.

  4. Watch for regional differences: eBay has country-specific domains (ebay.co.uk, ebay.de, ebay.com.au) with different HTML layouts and price formats.

  5. Monitor completed listings: Add LH_Complete=1&LH_Sold=1 to get actual sold prices rather than aspirational listing prices.

  6. Respect rate limits: Keep requests under 20 per minute per IP to avoid blocks. Use random delays.

  7. Cache aggressively: Product pages change slowly. Cache data and refresh on a schedule rather than re-scraping everything.

  8. Validate extracted prices: eBay shows prices in many formats — "$29.99", "$25.00 to $35.00", "GBP 19.99". Build robust parsing logic.

  9. Handle "Best Offer" listings: Many listings accept offers below the listed price. The listed price may not reflect actual transaction values.

  10. Use structured data when available: Like Etsy, eBay embeds JSON-LD structured data in some pages. Parse it for more reliable extraction.
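
Tips 8 and 9 call for defensive price parsing. One sketch that extracts a numeric range from the common display formats (the helper name and return shape are my own convention):

```javascript
// Parse eBay price strings: "$29.99", "$25.00 to $35.00", "GBP 19.99".
// Returns { min, max } in the listing's currency, or null if unparseable.
// Single prices yield min === max; ranges yield the two endpoints.
function parsePriceRange(text) {
  const nums = (text || '')
    .replace(/,/g, '')          // "$1,299.99" -> "$1299.99"
    .match(/\d+(?:\.\d+)?/g);
  if (!nums) return null;
  const values = nums.map(Number);
  return { min: Math.min(...values), max: Math.max(...values) };
}
```

Note it deliberately drops the currency symbol; if you scrape multiple regional eBay sites, store the currency alongside the numbers rather than mixing them.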

Conclusion

eBay scraping opens up a world of market intelligence — from real-time price tracking to competitive analysis and trend detection. The key challenges are handling eBay's anti-bot protections, managing pagination efficiently, and correctly processing the different listing formats (auctions, Buy It Now, variations, and Best Offer).

For small-scale projects and learning, a custom Playwright scraper gives you full control and deep understanding. For production workloads where reliability and scale matter, platforms like Apify with their pre-built eBay actors, managed proxy infrastructure, and scheduling capabilities can save you significant development and ongoing maintenance effort.

Browse the Apify Store for eBay-focused actors that handle the complexity of proxy rotation, browser management, and anti-detection measures out of the box — letting you focus on analyzing the data rather than collecting it.

Start with a specific use case, build a focused scraper, validate your data quality, and expand from there. Happy scraping!


Need production-ready eBay scraping? Check out the Apify Store for pre-built actors that handle all the infrastructure complexity so you can focus on your data.
