DEV Community

agenthustler

How to Scrape eBay Sold Listings for Market Research

eBay sold listings are a goldmine for market research. They reveal actual selling prices, demand trends, and competitive insights. In this guide, I'll show you how to scrape eBay's completed listings using Python.

Why Scrape eBay Sold Listings?

Retail arbitrage sellers, product researchers, and data analysts all need pricing intelligence. eBay's "Sold" filter shows what people actually paid — not just asking prices. This data helps you:

  • Price products competitively based on real transaction data
  • Identify trending items before they saturate the market
  • Track seasonal pricing patterns for inventory planning
  • Validate product ideas before investing in stock

Setting Up Your Environment

pip install requests beautifulsoup4 pandas

The Scraper

import time

import requests
from bs4 import BeautifulSoup
import pandas as pd

def scrape_ebay_sold(query, pages=3):
    results = []
    headers = {
        "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36"
    }

    for page in range(1, pages + 1):
        # Pass query parameters as a dict so requests URL-encodes them
        # (a raw f-string breaks on queries with spaces or special characters)
        params = {
            "_nkw": query,
            "LH_Complete": 1,
            "LH_Sold": 1,
            "_pgn": page,
        }
        response = requests.get(
            "https://www.ebay.com/sch/i.html",
            headers=headers, params=params, timeout=15,
        )
        response.raise_for_status()
        soup = BeautifulSoup(response.text, "html.parser")

        for item in soup.select(".s-item"):
            title_el = item.select_one(".s-item__title")
            price_el = item.select_one(".s-item__price")
            date_el = item.select_one(".s-item__ended-date, .s-item__endedDate")
            link_el = item.select_one(".s-item__link")

            if not (title_el and price_el):
                continue

            title = title_el.get_text(strip=True)
            # The first result card is often a "Shop on eBay" placeholder — skip it
            if title == "Shop on eBay":
                continue

            results.append({
                "title": title,
                "price": price_el.get_text(strip=True),
                "sold_date": date_el.get_text(strip=True) if date_el else "N/A",
                "url": link_el["href"] if link_el else "N/A",
            })

        time.sleep(2)  # be polite between page requests

    return pd.DataFrame(results)

df = scrape_ebay_sold("vintage nintendo", pages=5)
print(f"Found {len(df)} sold listings")
print(df.head(10))
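Timeouts and throttling responses are common when you scrape at any volume. A minimal retry wrapper with exponential backoff, shown as a sketch: `fetch_with_retry` and its `fetch` parameter are illustrative names, not part of the scraper above. `fetch` is any callable that returns page text, so you can wrap `requests.get` in it, or a stub for testing.

```python
import time

def fetch_with_retry(fetch, url, max_retries=3, base_delay=1.0):
    """Call fetch(url), retrying failures with exponential backoff.

    fetch: any callable returning page text (e.g. a requests.get wrapper).
    Sleeps base_delay * 2**attempt between attempts.
    """
    for attempt in range(max_retries):
        try:
            return fetch(url)
        except Exception:
            if attempt == max_retries - 1:
                raise  # out of retries; surface the last error
            time.sleep(base_delay * (2 ** attempt))
```

Dropping this into the scraper means a single flaky response no longer loses a whole page of results.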

Analyzing the Data

Once you have the data, extract pricing insights:

import re

def clean_price(price_str):
    # Strip thousands separators first, then grab the first dollar amount.
    # Range prices like "$10.00 to $20.00" yield the lower bound.
    match = re.search(r"\d+\.\d{2}", price_str.replace(",", ""))
    return float(match.group()) if match else None

df["price_clean"] = df["price"].apply(clean_price)
df = df.dropna(subset=["price_clean"])

print(f"Average sold price: ${df['price_clean'].mean():.2f}")
print(f"Median sold price: ${df['price_clean'].median():.2f}")
print(f"Price range: ${df['price_clean'].min():.2f} - ${df['price_clean'].max():.2f}")
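If the `sold_date` column parses cleanly, you can go beyond summary stats and track price over time. A hedged sketch: it assumes the dates render like "Sold Oct 12, 2024", which can vary by locale and page layout, so unparseable rows are dropped rather than guessed at. `weekly_median_prices` is an illustrative helper name, not something from the code above.

```python
import pandas as pd

def weekly_median_prices(df):
    """Weekly median of price_clean, indexed by parsed sold date.

    Assumes sold_date strings look like "Sold Oct 12, 2024";
    rows that fail to parse are dropped.
    """
    out = df.copy()
    out["sold_dt"] = pd.to_datetime(
        out["sold_date"].str.replace("Sold", "", regex=False).str.strip(),
        errors="coerce",
    )
    out = out.dropna(subset=["sold_dt"])
    return out.set_index("sold_dt")["price_clean"].resample("W").median()
```

A rising weekly median is exactly the kind of trend signal the intro promised; plot it with `series.plot()` if you have matplotlib installed.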

Scaling with a Proxy Service

eBay will block repeated requests from the same IP. For production scraping, use a proxy API like ScraperAPI to handle rotation and CAPTCHAs:

import os
from urllib.parse import quote_plus

import requests

def scrape_with_proxy(query):
    api_url = "https://api.scraperapi.com"
    params = {
        # Read the key from the environment — never commit keys to source
        "api_key": os.environ["SCRAPERAPI_KEY"],
        "url": f"https://www.ebay.com/sch/i.html?_nkw={quote_plus(query)}&LH_Complete=1&LH_Sold=1"
    }
    response = requests.get(api_url, params=params, timeout=60)
    response.raise_for_status()
    return response.text
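The HTML that comes back can go through the same selectors as before. One way to structure this is to decouple parsing from fetching, so the same function works with either the direct scraper or the proxied response — a sketch, with `parse_sold_listings` as an illustrative name:

```python
from bs4 import BeautifulSoup

def parse_sold_listings(html):
    """Extract title/price pairs from an eBay search results page."""
    soup = BeautifulSoup(html, "html.parser")
    rows = []
    for item in soup.select(".s-item"):
        title_el = item.select_one(".s-item__title")
        price_el = item.select_one(".s-item__price")
        if title_el and price_el:
            rows.append({
                "title": title_el.get_text(strip=True),
                "price": price_el.get_text(strip=True),
            })
    return rows
```

Then `parse_sold_listings(scrape_with_proxy("vintage nintendo"))` gives you the same row dicts as the direct version, and the parser is trivially testable against saved HTML fixtures.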

Alternatively, ThorData provides residential proxies that work well for marketplace scraping, and ScrapeOps offers a proxy aggregator that picks the best provider for each request.

Saving Results

df.to_csv("ebay_sold_listings.csv", index=False)
df.to_json("ebay_sold_listings.json", orient="records", indent=2)
print("Data exported successfully!")

Tips for Better Results

  1. Use specific queries — "iPhone 15 Pro Max 256GB" beats "iPhone"
  2. Scrape multiple categories to compare cross-category demand
  3. Run weekly to build a price history database
  4. Filter outliers — remove listings below $1 (lots) and above 99th percentile
  5. Track sell-through rate — compare sold vs active listings for demand signals
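Tip 4 is easy to automate. A small sketch of percentile-based outlier filtering on the `price_clean` column from earlier — the $1 floor and 99th-percentile cap are the thresholds suggested above, and `filter_outliers` is an illustrative name; tune both per category:

```python
import pandas as pd

def filter_outliers(df, col="price_clean", floor=1.0, upper_q=0.99):
    """Drop sub-floor lot listings and prices above the upper quantile."""
    cap = df[col].quantile(upper_q)
    return df[(df[col] >= floor) & (df[col] <= cap)]
```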

Legal Considerations

Always check eBay's Terms of Service. Use delays between requests, don't overload their servers, and consider using their official API for commercial applications. This tutorial is for educational purposes.


Happy scraping! If you found this useful, follow me for more Python data collection tutorials.
