DEV Community

Erika S. Adkins


High-Frequency eBay Scraping: Sync Prices and Stock Without Getting Banned

In e-commerce and dropshipping, stale data kills profit margins. If a customer buys an item from your store but the eBay price has jumped 20%, or the item is out of stock, you face a lose-lose choice: cancel the order and damage your seller rating, or fulfill it at a loss.

To prevent this, you must sync your inventory frequently. However, re-scraping 10,000 individual product pages every hour will get your IP blacklisted and blow your proxy budget. Real-time monitoring requires a smarter approach than brute-force scraping.

This guide demonstrates how to build a high-frequency eBay inventory monitor using Python. Instead of "one-by-one" scraping, we will use a diffing strategy with lightweight list views to track thousands of items with minimal overhead.

The Strategy: List View vs. Detail View

Most developers start by scraping the Detail View (the specific product page). While this provides the most data, it’s inefficient for monitoring. Tracking 5,000 items this way requires 5,000 separate requests.

Instead, use the List View (search results or "Seller’s Other Items" pages). On eBay, a single list page can display up to 200 items. By scraping these pages, you can update the price and stock status of 200 items with one request.

  • Detail View: 1 Request = 1 Item. High fidelity, low efficiency.
  • List View: 1 Request = 50 to 200 Items. High efficiency, sufficient fidelity for price and stock.

Switching to List View reduces request volume by 98% at 50 items per page and by 99.5% at 200 items per page, making syncs every 15 minutes technically and financially feasible.
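
To size this for your own inventory, the arithmetic can be sketched as below. The helper names are illustrative, and the calculation assumes every tracked item appears on some list page and that all pages except possibly the last are full.

```python
import math

def requests_per_sync(total_items: int, items_per_page: int) -> int:
    """Number of list-page requests needed to cover every tracked item."""
    return math.ceil(total_items / items_per_page)

def reduction_vs_detail(total_items: int, items_per_page: int) -> float:
    """Fraction of requests saved compared with one detail request per item."""
    return 1 - requests_per_sync(total_items, items_per_page) / total_items

for page_size in (50, 200):
    n = requests_per_sync(5000, page_size)
    saved = reduction_vs_detail(5000, page_size)
    print(f"{page_size} items/page: {n} requests, {saved:.1%} fewer than detail view")
```

For 5,000 items this works out to 100 requests at 50 items per page and 25 requests at 200 items per page.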

Step 1: Designing the Local Database (The "State")

To detect changes, the script needs to remember the price and stock from the previous check. This is the State. While a JSON file works for a handful of items, it becomes slow and prone to corruption as your list grows. SQLite is a better choice because it is serverless, fast, and built into Python.

The products table tracks four fields: item_id, price, is_in_stock (stored as 1 or 0), and a last_checked timestamp that is set whenever the row is written.

import sqlite3

def setup_database():
    conn = sqlite3.connect('ebay_sync.db')
    cursor = conn.cursor()

    # Create table to store the 'state' of our inventory
    cursor.execute('''
        CREATE TABLE IF NOT EXISTS products (
            item_id TEXT PRIMARY KEY,
            price REAL,
            is_in_stock INTEGER,
            last_checked TIMESTAMP DEFAULT CURRENT_TIMESTAMP
        )
    ''')
    conn.commit()
    return conn

# Initialize the DB
db_conn = setup_database()

This schema allows for rapid lookups to see if scraped data differs from your records.
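
As a quick illustration of such a lookup, here is a minimal sketch against a throwaway in-memory database. The lookup_state helper and the item ID and price are made up for the demo; the table definition mirrors the one above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway database for the demo
conn.execute("""
    CREATE TABLE products (
        item_id TEXT PRIMARY KEY,
        price REAL,
        is_in_stock INTEGER,
        last_checked TIMESTAMP DEFAULT CURRENT_TIMESTAMP
    )
""")
conn.execute(
    "INSERT INTO products (item_id, price, is_in_stock) VALUES (?, ?, ?)",
    ("123456789012", 25.99, 1),  # made-up item ID and price
)

def lookup_state(conn, item_id):
    """Return (price, is_in_stock) for a tracked item, or None if unseen."""
    return conn.execute(
        "SELECT price, is_in_stock FROM products WHERE item_id = ?",
        (item_id,),
    ).fetchone()

known = lookup_state(conn, "123456789012")
unknown = lookup_state(conn, "000000000000")
print(known, unknown)
```

Because item_id is the primary key, each lookup is an index search rather than a table scan, and a missing item simply comes back as None.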

Step 2: The Lightweight Scraper

We need a function to fetch a list of items. This example targets a seller’s store page or a search result page using requests for the network call and BeautifulSoup to parse the HTML.

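When scraping eBay, look for containers whose class names begin with s-item, such as s-item__wrapper. These hold the ID, price, and availability markers. Check the selectors against the live page before relying on them, since site markup changes over time.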

import requests
from bs4 import BeautifulSoup

def fetch_ebay_batch(url):
    headers = {
        "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/119.0.0.0 Safari/537.36",
        "Accept-Language": "en-US,en;q=0.9"
    }

    # Without a timeout, requests can wait indefinitely on a stalled connection
    response = requests.get(url, headers=headers, timeout=15)
    if response.status_code != 200:
        print(f"Failed to fetch: {response.status_code}")
        return []

    soup = BeautifulSoup(response.text, 'html.parser')
    items = []

    for tags in soup.select('.s-item__wrapper'):
        # Extract Item ID from the link
        link_tag = tags.select_one('.s-item__link')
        if not link_tag:
            continue

        link = link_tag['href']
        item_id = link.split('?')[0].split('/')[-1]

        # Extract and clean price: "$25.99" -> 25.99
        price_tag = tags.select_one('.s-item__price')
        if not price_tag:
            continue

        try:
            price_text = price_tag.text.replace('$', '').replace(',', '').split(' to ')[0]
            price = float(price_text)
        except ValueError:
            continue 

        # Stock check: Look for "Out of stock" labels
        status_tag = tags.select_one('.s-item__availability')
        is_in_stock = 0 if status_tag and "Out of stock" in status_tag.text else 1

        items.append({
            'item_id': item_id,
            'price': price,
            'is_in_stock': is_in_stock
        })

    return items
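
The price cleanup in the scraper handles "$" prefixes and "X to Y" ranges. If you run into other shapes of price text, a regex-based helper is one possible alternative. The sketch below is not part of the original scraper: parse_price is a hypothetical name, and it assumes US-style formatting (comma as thousands separator, dot as decimal point).

```python
import re
from typing import Optional

# Matches the first number in the text, e.g. "1,299.00" or "25.99".
_PRICE_RE = re.compile(r"\d[\d,]*(?:\.\d+)?")

def parse_price(text: str) -> Optional[float]:
    """Return the first price found in a listing's price text, or None."""
    match = _PRICE_RE.search(text)
    if match is None:
        return None  # no digits at all, e.g. "See Price"
    return float(match.group().replace(",", ""))

print(parse_price("$25.99"))
print(parse_price("$1,299.00 to $1,499.00"))
print(parse_price("See Price"))
```

Returning None instead of raising lets the caller decide whether to skip the item or queue it for a closer look.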

Step 3: Implementing the Diffing Logic

This is the core logic. Instead of blindly updating the database, we compare each scraped item against the row already stored for it. This triggers specific actions, such as a Discord alert or a Shopify update, only when a change actually occurs.

def detect_and_sync_changes(scraped_items, conn):
    cursor = conn.cursor()
    changes_detected = 0

    for item in scraped_items:
        cursor.execute("SELECT price, is_in_stock FROM products WHERE item_id = ?", (item['item_id'],))
        row = cursor.fetchone()

        if row is None:
            # New item discovered
            cursor.execute("INSERT INTO products (item_id, price, is_in_stock) VALUES (?, ?, ?)",
                           (item['item_id'], item['price'], item['is_in_stock']))
            print(f"New Item Tracked: {item['item_id']}")
        else:
            old_price, old_stock = row

            if old_price != item['price'] or old_stock != item['is_in_stock']:
                print(f"Change Found on {item['item_id']}: "
                      f"Price {old_price} -> {item['price']}, In Stock {old_stock} -> {item['is_in_stock']}")

                cursor.execute('''
                    UPDATE products 
                    SET price = ?, is_in_stock = ?, last_checked = CURRENT_TIMESTAMP 
                    WHERE item_id = ?
                ''', (item['price'], item['is_in_stock'], item['item_id']))
                changes_detected += 1

    conn.commit()
    return changes_detected
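
The function above issues one SELECT per scraped item. For larger batches, one possible variation is to load the stored state once and diff in memory. The sketch below assumes the same products table; load_state and diff_items are hypothetical names, and it only classifies items, leaving the database writes to the caller.

```python
import sqlite3

def load_state(conn):
    """Read the whole products table into {item_id: (price, is_in_stock)}."""
    rows = conn.execute("SELECT item_id, price, is_in_stock FROM products")
    return {item_id: (price, stock) for item_id, price, stock in rows}

def diff_items(scraped_items, state):
    """Split scraped items into new and changed items without touching the DB."""
    new, changed = [], []
    for item in scraped_items:
        current = (item["price"], item["is_in_stock"])
        previous = state.get(item["item_id"])
        if previous is None:
            new.append(item)
        elif previous != current:
            changed.append({**item, "old_price": previous[0], "old_stock": previous[1]})
    return new, changed

# Demo with made-up data
state = {"111": (25.99, 1), "222": (10.00, 1)}
scraped = [
    {"item_id": "111", "price": 25.99, "is_in_stock": 1},  # unchanged
    {"item_id": "222", "price": 10.00, "is_in_stock": 0},  # went out of stock
    {"item_id": "333", "price": 5.50, "is_in_stock": 1},   # not seen before
]
new_items, changed_items = diff_items(scraped, state)
print([i["item_id"] for i in new_items], [i["item_id"] for i in changed_items])
```

Keeping the comparison free of database calls also makes it easy to unit test with plain dictionaries.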

Step 4: Building the Sync Loop

To keep data fresh, wrap the logic in a loop with basic error handling to prevent network timeouts from crashing the monitor.

import time

TARGET_URLS = [
    "https://www.ebay.com/sch/i.html?_ssn=some_seller_id&_ipg=200", 
]

def main():
    db_conn = setup_database()
    print("Starting eBay Monitor...")

    while True:
        try:
            for url in TARGET_URLS:
                print(f"Scanning {url}...")
                scraped_data = fetch_ebay_batch(url)
                changes = detect_and_sync_changes(scraped_data, db_conn)
                print(f"Scan complete. Changes detected: {changes}")

            print("Sleeping for 15 minutes...")
            time.sleep(900) 

        except Exception as e:
            print(f"Error occurred: {e}")
            time.sleep(60) 

if __name__ == "__main__":
    main()

Optimization & Scaling

As your inventory grows, use these recommended approaches to stay efficient:

1. Proxy Management

eBay limits request rates aggressively. For high-frequency monitoring, use residential proxies or a rotating proxy service. These are much harder for eBay to distinguish from real shoppers than data center IPs.
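
If your provider gives you a list of endpoints rather than a single rotating gateway, a simple round-robin rotation might look like the sketch below. The proxy URLs are placeholders and next_proxies is a hypothetical helper; the returned mapping has the shape that the proxies argument of requests.get expects.

```python
from itertools import cycle

# Placeholder endpoints: replace with the URLs your proxy provider gives you.
PROXY_POOL = [
    "http://user:pass@proxy-1.example.com:8000",
    "http://user:pass@proxy-2.example.com:8000",
]
_proxy_cycle = cycle(PROXY_POOL)

def next_proxies():
    """Return a requests-style proxies mapping for the next proxy in the pool."""
    proxy = next(_proxy_cycle)
    return {"http": proxy, "https": proxy}

# Usage inside fetch_ebay_batch (not executed here):
# response = requests.get(url, headers=headers, proxies=next_proxies(), timeout=15)
```

Round-robin is the simplest policy; many providers handle rotation on their side, in which case a single gateway URL is enough.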

2. The Hybrid Scrape

Sometimes the List View is ambiguous, showing "See Price" instead of a value. In these cases, program your engine to trigger a targeted Detail View scrape only for that specific item. This maintains accuracy without sacrificing overall efficiency.
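
One way to sketch that routing decision is shown below. It assumes the list scraper keeps the raw price text for each row under a price_text key, which the scraper in Step 2 does not do as written; needs_detail_scrape and split_batch are hypothetical names.

```python
def needs_detail_scrape(price_text: str) -> bool:
    """True when list-view price text contains no digits (e.g. "See Price")."""
    return not any(ch.isdigit() for ch in price_text)

def split_batch(raw_items):
    """Separate list-view rows into usable rows and IDs queued for a detail scrape."""
    usable, detail_queue = [], []
    for item in raw_items:
        if needs_detail_scrape(item["price_text"]):
            detail_queue.append(item["item_id"])
        else:
            usable.append(item)
    return usable, detail_queue

# Demo with made-up rows
rows = [
    {"item_id": "111", "price_text": "$25.99"},
    {"item_id": "222", "price_text": "See Price"},
]
usable, detail_queue = split_batch(rows)
print(len(usable), detail_queue)
```

Only the IDs in detail_queue cost an extra request, so the overall request count stays close to the list-only figure.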

3. Concurrency

If you monitor multiple large sellers, a sequential loop will be too slow. Use concurrent.futures.ThreadPoolExecutor to fetch multiple pages simultaneously.

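from concurrent.futures import ThreadPoolExecutor

# Fetch in parallel; write to SQLite afterwards, from the main thread only.
with ThreadPoolExecutor(max_workers=5) as executor:
    results = list(executor.map(fetch_ebay_batch, TARGET_URLS))
for scraped_data in results:
    detect_and_sync_changes(scraped_data, db_conn)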

To Wrap Up

High-frequency eBay monitoring relies on efficient state management. By combining List Views with a diffing strategy, you can cut request volume (and the proxy bandwidth that goes with it) by 98% or more and lower the risk of IP bans.

From here, you can connect the detect_and_sync_changes function to an external API, like Shopify or a Discord Webhook, to automate your response to price and stock updates. For more advanced techniques, see our guides on rotating proxies with Python and parsing HTML with BeautifulSoup.
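
As a starting point for the Discord case, here is a minimal sketch using only the standard library. Discord webhooks accept a JSON body with a content field; the function names, message wording, and User-Agent string are assumptions for illustration, and the webhook URL must come from your own server settings.

```python
import json
import urllib.request

def build_alert(item_id, old_price, new_price, is_in_stock):
    """Format one change as a Discord webhook payload ({"content": ...})."""
    stock = "in stock" if is_in_stock else "OUT OF STOCK"
    text = f"eBay item {item_id}: price {old_price:.2f} -> {new_price:.2f} ({stock})"
    return {"content": text}

def send_alert(webhook_url, payload, timeout=10):
    """POST the payload as JSON and return the HTTP status code."""
    request = urllib.request.Request(
        webhook_url,
        data=json.dumps(payload).encode("utf-8"),
        # An explicit User-Agent is set because some services reject urllib's default
        headers={"Content-Type": "application/json", "User-Agent": "ebay-monitor/0.1"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status

payload = build_alert("123456789012", 25.99, 27.5, 1)
print(payload["content"])
# send_alert("<your webhook URL>", payload)  # not executed here
```

Calling send_alert from the branch of detect_and_sync_changes that detects a change keeps alerts limited to real price or stock movements.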

Top comments (3)

wfgsss

The List View vs Detail View distinction is spot on. We use the exact same pattern for monitoring Chinese wholesale platforms like Yiwugo.com — their search result pages pack 40-60 products per page with price and MOQ data, so one request replaces dozens of detail page fetches.

One thing we added on top of your diffing strategy: a price velocity metric. Instead of just flagging "price changed," we track the rate of change over a rolling 7-day window. A product that drops 15% in one day is a very different signal than one that drifts down 15% over a month. The first usually means a clearance event (buy signal for resellers), the second is normal market adjustment.

Also worth noting for anyone adapting this to non-English platforms: the is_in_stock detection gets trickier when stock labels are in Chinese or use icon-based indicators instead of text. We ended up building a small lookup table per platform rather than relying on string matching.

The SQLite state approach scales surprisingly well — we are tracking ~50k products and queries still return in <10ms. Great architecture choice over JSON files.
