wfgsss

Posted on Feb 15

How to Build a China Wholesale Price Tracker with DHgate and Yiwugo Data

#webscraping #ecommerce #python #productivity

If you're sourcing products from China, you already know the pain: prices fluctuate constantly, suppliers run flash deals you miss, and checking two platforms manually every day is a recipe for burnout.

What if you could track prices across both DHgate and Yiwugo automatically, get notified when something drops, and see cross-platform comparisons in one place?

That's what we're building today. A Python price tracker that pulls data from both platforms, stores historical prices, detects changes, and sends you alerts. No more spreadsheet gymnastics.

Why Track Prices Across Both Platforms?

DHgate and Yiwugo serve different segments of the China wholesale market:

Yiwugo connects you to Yiwu market stall owners — factory-direct pricing, lower MOQs, but primarily Chinese-language
DHgate is cross-border focused — English interface, buyer protection, but often higher prices due to platform fees

The same product category can have a 20-40% price gap between the two. Tracking both gives you:

Arbitrage opportunities — buy where it's cheapest right now
Negotiation leverage — show a DHgate supplier that Yiwugo has it cheaper
Trend detection — if prices rise on both platforms simultaneously, demand is spiking
Timing optimization — buy during seasonal dips, not peaks

Architecture

┌─────────────┐     ┌─────────────┐
│  Yiwugo     │     │  DHgate     │
│  Scraper    │     │  Scraper    │
└──────┬──────┘     └──────┬──────┘
       │                   │
       └───────┬───────────┘
               ▼
       ┌───────────────┐
       │  Price Store   │
       │  (SQLite)      │
       └───────┬───────┘
               │
       ┌───────┴───────┐
       │               │
       ▼               ▼
┌─────────────┐ ┌─────────────┐
│  Change     │ │  Report     │
│  Detector   │ │  Generator  │
└─────────────┘ └─────────────┘

Four components:

Data collectors — pull product data from both platforms via Apify
Price store — SQLite database with historical price snapshots
Change detector — compares latest prices against previous snapshot
Report generator — outputs comparison tables and alerts

Step 1: Set Up the Price Database

We need a schema that handles both platforms and tracks prices over time:

import sqlite3
from datetime import datetime

def init_db(db_path="price_tracker.db"):
    """Create the price tracking database."""
    conn = sqlite3.connect(db_path)
    c = conn.cursor()

    c.execute("""
        CREATE TABLE IF NOT EXISTS price_snapshots (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            platform TEXT NOT NULL,
            product_name TEXT NOT NULL,
            search_keyword TEXT NOT NULL,
            price_min REAL,
            price_max REAL,
            currency TEXT DEFAULT 'CNY',
            min_order TEXT,
            supplier_name TEXT,
            product_url TEXT,
            snapshot_date TEXT NOT NULL,
            created_at TEXT DEFAULT CURRENT_TIMESTAMP
        )
    """)

    c.execute("""
        CREATE INDEX IF NOT EXISTS idx_keyword_platform_date
        ON price_snapshots(search_keyword, platform, snapshot_date)
    """)

    conn.commit()
    return conn

Key design decisions:

price_min and price_max because wholesale prices are usually ranges (volume tiers)
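snapshot_date is date-only (not a timestamp): one snapshot per product per day is enough. The schema does not enforce this, so run the tracker once per day, or duplicate rows will skew the averages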
Index on keyword + platform + date for fast lookups
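
If the tracker might run more than once a day, one option is to let SQLite reject repeats. The sketch below is a minimal illustration, not part of the original pipeline: the index name `idx_one_snapshot_per_day`, the helper `store_once`, and the choice of uniqueness key (platform, keyword, URL, date) are all assumptions, and the table is a trimmed copy of the schema above.

```python
import sqlite3

# Trimmed copy of the article's table, just enough columns for the demo.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE price_snapshots (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        platform TEXT NOT NULL,
        product_name TEXT NOT NULL,
        search_keyword TEXT NOT NULL,
        price_min REAL,
        product_url TEXT,
        snapshot_date TEXT NOT NULL
    )
""")

# Hypothetical uniqueness key: one row per listing, keyword, platform, and day.
conn.execute("""
    CREATE UNIQUE INDEX IF NOT EXISTS idx_one_snapshot_per_day
    ON price_snapshots(platform, search_keyword, product_url, snapshot_date)
""")

def store_once(conn, rows):
    """Insert rows, silently skipping any that repeat the uniqueness key."""
    conn.executemany("""
        INSERT OR IGNORE INTO price_snapshots
        (platform, product_name, search_keyword, price_min,
         product_url, snapshot_date)
        VALUES (?, ?, ?, ?, ?, ?)
    """, rows)
    conn.commit()

rows = [
    ("yiwugo", "Earbuds A", "wireless earbuds", 12.5,
     "https://example.com/a", "2026-02-15"),
    ("yiwugo", "Earbuds B", "wireless earbuds", 14.0,
     "https://example.com/b", "2026-02-15"),
]
store_once(conn, rows)
store_once(conn, rows)  # a second run on the same day adds nothing

row_count = conn.execute("SELECT COUNT(*) FROM price_snapshots").fetchone()[0]
print(row_count)  # 2
```

One caveat with this key: listings stored with an empty-string URL would collide with each other, so rows without a URL may need a different identifier.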

Step 2: Collect Data from Both Platforms

Here's where the Yiwugo Scraper and DHgate Scraper do the heavy lifting:


from apify_client import ApifyClient
import re

client = ApifyClient("YOUR_APIFY_TOKEN")

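def parse_price(price_str):
    """Extract min and max price from a price string."""
    # The scraper may return None or a bare number, so coerce to text first.
    text = "" if price_str is None else str(price_str)
    # Match complete decimals only; a bare [\d.]+ would also match a lone "."
    # or "1.2.3", and float() would then raise ValueError.
    numbers = re.findall(r'\d+(?:\.\d+)?', text)
    if len(numbers) >= 2:
        return float(numbers[0]), float(numbers[1])
    elif len(numbers) == 1:
        return float(numbers[0]), float(numbers[0])
    return None, None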

def collect_yiwugo(keyword, max_items=30):
    """Scrape Yiwugo for product prices."""
    run = client.actor("jungle_intertwining/yiwugo-scraper").call(
        run_input={"keyword": keyword, "maxItems": max_items}
    )
    items = list(client.dataset(run["defaultDatasetId"]).iterate_items())

    results = []
    for item in items:
        price_min, price_max = parse_price(item.get("price", ""))
        results.append({
            "platform": "yiwugo",
            "product_name": item.get("title", ""),
            "search_keyword": keyword,
            "price_min": price_min,
            "price_max": price_max,
            "currency": "CNY",
            "min_order": item.get("minOrder", ""),
            "supplier_name": item.get("shopName", ""),
            "product_url": item.get("url", ""),
        })
    return results

def collect_dhgate(keyword, max_pages=2):
    """Scrape DHgate for product prices."""
    run = client.actor("jungle_intertwining/dhgate-scraper").call(
        run_input={"searchKeywords": [keyword], "maxPages": max_pages}
    )
    items = list(client.dataset(run["defaultDatasetId"]).iterate_items())

    results = []
    for item in items:
        price_min, price_max = parse_price(item.get("price", ""))
        results.append({
            "platform": "dhgate",
            "product_name": item.get("productName", ""),
            "search_keyword": keyword,
            "price_min": price_min,
            "price_max": price_max,
            "currency": "USD",
            "min_order": item.get("minOrder", ""),
            "supplier_name": item.get("sellerName", ""),
            "product_url": item.get("productUrl", ""),
        })
    return results

Note the currency difference: Yiwugo prices are in CNY (¥), DHgate in USD ($). We'll handle conversion in the comparison step.

Step 3: Store Snapshots

Save each collection run as a daily snapshot:

def store_snapshot(conn, products):
    """Save a batch of product prices to the database."""
    today = datetime.now().strftime("%Y-%m-%d")
    c = conn.cursor()

    for p in products:
        c.execute("""
            INSERT INTO price_snapshots
            (platform, product_name, search_keyword, price_min, price_max,
             currency, min_order, supplier_name, product_url, snapshot_date)
            VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
        """, (
            p["platform"], p["product_name"], p["search_keyword"],
            p["price_min"], p["price_max"], p["currency"],
            p["min_order"], p["supplier_name"], p["product_url"], today
        ))

    conn.commit()
    print(f"Stored {len(products)} prices for {today}")

Step 4: Detect Price Changes

This is where it gets interesting. Compare each platform's average minimum price in the latest snapshot against the previous one:

def detect_changes(conn, keyword, threshold_pct=5.0):
    """Find platforms whose average minimum price moved by at least threshold_pct."""
    c = conn.cursor()

    # Get the two most recent snapshot dates for this keyword
    c.execute("""
        SELECT DISTINCT snapshot_date FROM price_snapshots
        WHERE search_keyword = ?
        ORDER BY snapshot_date DESC LIMIT 2
    """, (keyword,))

    dates = [row[0] for row in c.fetchall()]
    if len(dates) < 2:
        return []  # Need at least 2 snapshots to compare

    current_date, previous_date = dates[0], dates[1]

    # Get average min prices by platform for each date
    changes = []
    for platform in ["yiwugo", "dhgate"]:
        c.execute("""
            SELECT AVG(price_min) FROM price_snapshots
            WHERE search_keyword = ? AND platform = ? AND snapshot_date = ?
            AND price_min IS NOT NULL
        """, (keyword, platform, current_date))
        current_avg = c.fetchone()[0]

        c.execute("""
            SELECT AVG(price_min) FROM price_snapshots
            WHERE search_keyword = ? AND platform = ? AND snapshot_date = ?
            AND price_min IS NOT NULL
        """, (keyword, platform, previous_date))
        previous_avg = c.fetchone()[0]

        if current_avg and previous_avg and previous_avg > 0:
            change_pct = ((current_avg - previous_avg) / previous_avg) * 100
            if abs(change_pct) >= threshold_pct:
                changes.append({
                    "platform": platform,
                    "keyword": keyword,
                    "previous_avg": round(previous_avg, 2),
                    "current_avg": round(current_avg, 2),
                    "change_pct": round(change_pct, 1),
                    "previous_date": previous_date,
                    "current_date": current_date,
                })

    return changes
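
Because detect_changes compares platform-wide averages, the average can shift simply because the search returned a different set of listings, even if no individual listing changed price. A possible complement is to match listings by URL across two dates. The sketch below assumes product_url is stable from day to day, which the article does not confirm; `detect_product_changes` is a hypothetical helper, and the demo table holds only the columns the query needs.

```python
import sqlite3

def detect_product_changes(conn, keyword, previous_date, current_date,
                           threshold_pct=5.0):
    """Return per-listing moves of at least threshold_pct between two dates."""
    rows = conn.execute("""
        SELECT cur.platform, cur.product_name, cur.product_url,
               prev.price_min, cur.price_min
        FROM price_snapshots AS cur
        JOIN price_snapshots AS prev
          ON prev.platform = cur.platform
         AND prev.product_url = cur.product_url
         AND prev.search_keyword = cur.search_keyword
        WHERE cur.search_keyword = ?
          AND cur.snapshot_date = ?
          AND prev.snapshot_date = ?
          AND cur.price_min IS NOT NULL
          AND prev.price_min > 0
    """, (keyword, current_date, previous_date)).fetchall()

    changes = []
    for platform, name, url, old, new in rows:
        pct = (new - old) / old * 100
        if abs(pct) >= threshold_pct:
            changes.append({
                "platform": platform, "product_name": name,
                "product_url": url, "previous": old, "current": new,
                "change_pct": round(pct, 1),
            })
    return changes

# Demo data: listing A drops 10%, listing B moves 2.5%, listing C is new.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE price_snapshots (
        platform TEXT, product_name TEXT, search_keyword TEXT,
        price_min REAL, product_url TEXT, snapshot_date TEXT
    )
""")
conn.executemany("INSERT INTO price_snapshots VALUES (?, ?, ?, ?, ?, ?)", [
    ("yiwugo", "Earbuds A", "wireless earbuds", 10.0, "https://example.com/a", "2026-02-14"),
    ("yiwugo", "Earbuds A", "wireless earbuds", 9.0, "https://example.com/a", "2026-02-15"),
    ("yiwugo", "Earbuds B", "wireless earbuds", 20.0, "https://example.com/b", "2026-02-14"),
    ("yiwugo", "Earbuds B", "wireless earbuds", 20.5, "https://example.com/b", "2026-02-15"),
    ("dhgate", "Earbuds C", "wireless earbuds", 3.4, "https://example.com/c", "2026-02-15"),
])
changes = detect_product_changes(conn, "wireless earbuds",
                                 "2026-02-14", "2026-02-15")
for change in changes:
    print(change["product_name"], change["change_pct"])
```

Only listing A is reported here: B stays under the 5% threshold, and C has no earlier row to compare against. Listings that appear or disappear between snapshots are ignored by design, so this works best alongside the platform-level averages rather than as a replacement.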

Step 5: Cross-Platform Comparison

The real power — see the same product category priced across both platforms:

CNY_TO_USD = 0.14  # Update this periodically

def cross_platform_report(conn, keyword):
    """Compare latest prices between Yiwugo and DHgate."""
    c = conn.cursor()

    # Get latest snapshot date
    c.execute("""
        SELECT MAX(snapshot_date) FROM price_snapshots
        WHERE search_keyword = ?
    """, (keyword,))
    latest_date = c.fetchone()[0]
    if not latest_date:
        return None

    report = {"keyword": keyword, "date": latest_date, "platforms": {}}

    for platform in ["yiwugo", "dhgate"]:
        c.execute("""
            SELECT COUNT(*), AVG(price_min), MIN(price_min), MAX(price_min)
            FROM price_snapshots
            WHERE search_keyword = ? AND platform = ? AND snapshot_date = ?
            AND price_min IS NOT NULL
        """, (keyword, platform, latest_date))

        count, avg_price, min_price, max_price = c.fetchone()

        # Normalize to USD for comparison
        if platform == "yiwugo" and avg_price:
            avg_usd = round(avg_price * CNY_TO_USD, 2)
            min_usd = round(min_price * CNY_TO_USD, 2)
        else:
            avg_usd = round(avg_price, 2) if avg_price else 0
            min_usd = round(min_price, 2) if min_price else 0

        report["platforms"][platform] = {
            "count": count,
            "avg_price_usd": avg_usd,
            "min_price_usd": min_usd,
            "original_currency": "CNY" if platform == "yiwugo" else "USD",
        }

    # Calculate savings
    yiwugo = report["platforms"].get("yiwugo", {})
    dhgate = report["platforms"].get("dhgate", {})
    if yiwugo.get("avg_price_usd") and dhgate.get("avg_price_usd"):
        savings_pct = ((dhgate["avg_price_usd"] - yiwugo["avg_price_usd"])
                       / dhgate["avg_price_usd"] * 100)
        report["yiwugo_savings_pct"] = round(savings_pct, 1)

    return report
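
The report infers the currency from the platform name, but every row already stores a currency column. If a third platform is added later, a small lookup keyed on that column may be easier to extend. This is a sketch only: `RATES_TO_USD` and `to_usd` are hypothetical names, and the 0.14 rate is simply the hardcoded value from above.

```python
# Hypothetical rate table; 0.14 mirrors the article's hardcoded CNY_TO_USD.
RATES_TO_USD = {"USD": 1.0, "CNY": 0.14}

def to_usd(amount, currency, rates=RATES_TO_USD):
    """Convert amount to USD, or return None when it cannot be converted."""
    if amount is None:
        return None  # no data for this platform and date
    rate = rates.get(currency)
    if rate is None:
        return None  # unknown currency: report nothing rather than guess
    return round(amount * rate, 2)

print(to_usd(13.0, "CNY"))   # 1.82
print(to_usd(3.41, "USD"))   # 3.41
print(to_usd(5.0, "EUR"))    # None
```

Returning None instead of 0 for missing data keeps an empty result set from reading as a price of zero in the printed report.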

Step 6: Put It All Together

Here's the main tracking script you'd run daily (via cron or Apify scheduler):

def run_tracker(keywords):
    """Main tracking loop — run this daily."""
    conn = init_db()

    all_changes = []

    for keyword in keywords:
        print(f"\n📦 Tracking: {keyword}")

        # Collect from both platforms
        yiwugo_data = collect_yiwugo(keyword)
        dhgate_data = collect_dhgate(keyword)

        # Store snapshots
        store_snapshot(conn, yiwugo_data)
        store_snapshot(conn, dhgate_data)

        # Check for price changes
        changes = detect_changes(conn, keyword)
        all_changes.extend(changes)

        # Cross-platform comparison
        report = cross_platform_report(conn, keyword)
        if report:
            print(f"\n  Cross-platform comparison ({report['date']}):")
            for platform, data in report["platforms"].items():
                print(f"    {platform}: avg ${data['avg_price_usd']} "
                      f"(min ${data['min_price_usd']}, {data['count']} products)")
            if "yiwugo_savings_pct" in report:
                savings = report["yiwugo_savings_pct"]
                if savings > 0:
                    print(f"    💰 Yiwugo is {savings}% cheaper on average")
                else:
                    print(f"    📊 DHgate is {abs(savings)}% cheaper on average")

    # Print alerts
    if all_changes:
        print("\n🚨 Price Alerts:")
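        for c in all_changes:
            direction = "📈" if c["change_pct"] > 0 else "📉"
            # Averages are stored in each platform's native currency
            # (CNY for Yiwugo, USD for DHgate), so pick the matching symbol.
            symbol = "¥" if c["platform"] == "yiwugo" else "$"
            print(f"  {direction} {c['platform']}/{c['keyword']}: "
                  f"{c['change_pct']:+.1f}% "
                  f"({symbol}{c['previous_avg']} → {symbol}{c['current_avg']})")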

    conn.close()

# Track these product categories
run_tracker([
    "wireless earbuds",
    "phone case",
    "led strip lights",
    "silicone kitchen utensils",
])

Sample Output

After a few days of tracking, you'll see something like:

📦 Tracking: wireless earbuds

  Cross-platform comparison (2026-02-15):
    yiwugo: avg $1.82 (min $0.56, 30 products)
    dhgate: avg $3.41 (min $1.23, 58 products)
    💰 Yiwugo is 46.6% cheaper on average

📦 Tracking: phone case

  Cross-platform comparison (2026-02-15):
    yiwugo: avg $0.34 (min $0.08, 30 products)
    dhgate: avg $0.89 (min $0.31, 60 products)
    💰 Yiwugo is 61.8% cheaper on average

🚨 Price Alerts:
  📉 dhgate/wireless earbuds: -8.2% ($3.71 → $3.41)
  📈 yiwugo/led strip lights: +12.4% (¥1.15 → ¥1.29)

That LED strip price spike on Yiwugo? Probably seasonal demand (spring festival decorations). The earbuds price drop on DHgate? Could be a new batch of sellers undercutting each other. Either way, you know about it before your competitors do.

Extending the Tracker

Once the basic pipeline works, you can add:

Email/Slack alerts when prices drop below your target threshold
Historical charts using matplotlib to visualize price trends over weeks
Supplier scoring — track which suppliers consistently offer the best prices
Currency rate updates — pull live CNY/USD rates instead of hardcoding
Category expansion — add more keywords as you discover profitable niches
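
As a starting point for the first idea, the query side of a target-price alert might look like the sketch below. It only builds the alert messages; delivery by email or Slack is left out. `find_below_target` and the shape of the `targets` mapping are assumptions made for illustration, and targets are expressed in each platform's native currency.

```python
import sqlite3

def find_below_target(conn, targets, snapshot_date):
    """Return alert strings for listings priced at or below their target.

    targets maps (platform, keyword) to a target price in that
    platform's native currency.
    """
    alerts = []
    for (platform, keyword), target in targets.items():
        rows = conn.execute("""
            SELECT product_name, price_min, currency, product_url
            FROM price_snapshots
            WHERE platform = ? AND search_keyword = ? AND snapshot_date = ?
              AND price_min IS NOT NULL AND price_min <= ?
            ORDER BY price_min
        """, (platform, keyword, snapshot_date, target)).fetchall()
        for name, price, currency, url in rows:
            alerts.append(
                f"{platform}/{keyword}: {name} at {price:.2f} {currency} "
                f"(target {target:.2f}) {url}"
            )
    return alerts

# Demo on an in-memory table with only the columns the query needs.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE price_snapshots (
        platform TEXT, product_name TEXT, search_keyword TEXT,
        price_min REAL, currency TEXT, product_url TEXT, snapshot_date TEXT
    )
""")
conn.executemany("INSERT INTO price_snapshots VALUES (?, ?, ?, ?, ?, ?, ?)", [
    ("dhgate", "Case A", "phone case", 0.31, "USD",
     "https://example.com/case-a", "2026-02-15"),
    ("dhgate", "Case B", "phone case", 0.95, "USD",
     "https://example.com/case-b", "2026-02-15"),
])
targets = {("dhgate", "phone case"): 0.50}
alerts = find_below_target(conn, targets, "2026-02-15")
for alert in alerts:
    print(alert)
```

With this data only Case A is reported, since Case B is above the 0.50 target. The returned strings can then be handed to whatever notification channel you prefer.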

Key Takeaways

Track both platforms — Yiwugo gives you factory-direct prices, DHgate gives you cross-border convenience. The gap between them is your opportunity.
Daily snapshots compound — One day of data is useless. Two weeks of data shows trends. A month shows seasonal patterns.
Automate the boring parts — Let scrapers collect data and scripts detect changes. Spend your time on decisions, not data entry.
Act on alerts quickly — Wholesale prices move fast. A 10% drop today might be gone tomorrow.

📦 Tools used in this article:

Yiwugo Scraper — Extract product data from China's largest wholesale market
DHgate Scraper — Scrape DHgate product listings and prices
Made-in-China Scraper — Extract B2B product data, supplier info, and MOQ from Made-in-China.com
