DEV Community

wfgsss

How to Scrape DHgate.com for Wholesale Product Data

DHgate is one of the biggest cross-border wholesale platforms out of China. Millions of products, thousands of suppliers, prices that make AliExpress look expensive. If you're doing dropshipping research, price monitoring, or supplier discovery, you need structured data — not hours of manual browsing.

This guide walks through how to scrape DHgate.com product listings programmatically, what data you can extract, and how to use it for real business decisions.

What Data Can You Get from DHgate?

DHgate search result pages are surprisingly data-rich. Each product listing contains:

  • Product name and direct URL
  • Price range (wholesale tiers with volume discounts)
  • Minimum order quantity — critical for dropshippers testing products
  • Seller name, store URL, and trust level
  • Feedback percentage — seller reliability at a glance
  • Free shipping availability
  • Product images (CDN URLs)
  • Sponsored listing flag — know what's organic vs paid

That's a lot of signal packed into search results alone, before you even hit individual product pages.

The Architecture: No Browser Needed

Here's the good news: DHgate renders product data server-side. That means you can extract everything with plain HTTP requests — no Puppeteer, no Playwright, no headless browser overhead.

The approach:

  1. Build search URLs with your keywords
  2. Fetch the HTML response
  3. Parse the embedded product data (JSON in script tags)
  4. Paginate through results (up to 50 pages, ~60 products per page)
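The pagination step can be sketched as a small loop that is independent of how a page is fetched. This is a minimal sketch: `collectPages` and the `fetchPage` callback are hypothetical names, and treating an empty page as the end of the results is an assumption, not documented DHgate behavior.

```javascript
// fetchPage is a hypothetical callback: (pageNumber) => Promise<Array>.
// It should fetch and parse one search results page.
async function collectPages(fetchPage, { maxPages = 50 } = {}) {
  const all = [];
  for (let page = 1; page <= maxPages; page++) {
    const items = await fetchPage(page);
    // Assumption: an empty page means we ran past the last page of results.
    if (items.length === 0) break;
    all.push(...items);
  }
  return all;
}

// Usage with a fake fetcher that has two pages of results.
const fakeFetch = async (page) => (page <= 2 ? [`item-${page}`] : []);
collectPages(fakeFetch).then((items) => console.log(items.length));
```

Passing the fetcher in as a callback keeps the loop easy to test without touching the network.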

Building the Search URL

DHgate search URLs follow a predictable pattern:

https://www.dhgate.com/wholesale/search.do?searchkey={keyword}&page={page}

You can also add sort parameters:

  • &sorttype=price_asc — cheapest first
  • &sorttype=price_desc — most expensive first
  • &sorttype=orders_desc — best sellers
  • &sorttype=newest — newest listings

And a ship-to country filter:

  • &shipcountry=us — prices/availability for US buyers
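The parameters above can be combined with a small helper. This is a minimal sketch: `buildSearchUrl` and its option names are illustrative, and the list of allowed sort values is just the four listed in this article.

```javascript
// Sort values taken from the list above; DHgate may accept others.
const SORT_TYPES = new Set(["price_asc", "price_desc", "orders_desc", "newest"]);

// Hypothetical helper: builds a search URL from a keyword and optional filters.
function buildSearchUrl(keyword, { page = 1, sortType, shipCountry } = {}) {
  const url = new URL("https://www.dhgate.com/wholesale/search.do");
  url.searchParams.set("searchkey", keyword);
  url.searchParams.set("page", String(page));
  if (sortType) {
    if (!SORT_TYPES.has(sortType)) {
      throw new Error(`Unknown sorttype: ${sortType}`);
    }
    url.searchParams.set("sorttype", sortType);
  }
  if (shipCountry) {
    url.searchParams.set("shipcountry", shipCountry.toLowerCase());
  }
  return url.toString();
}

console.log(buildSearchUrl("phone case", { page: 2, sortType: "price_asc", shipCountry: "US" }));
// → https://www.dhgate.com/wholesale/search.do?searchkey=phone+case&page=2&sorttype=price_asc&shipcountry=us
```

Note that `URLSearchParams` encodes spaces as `+`, while `encodeURIComponent` produces `%20`. Both are standard query-string encodings.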

Extracting Product Data

DHgate embeds structured product data in the page HTML. Here's a simplified extraction approach using Node.js:

const axios = require("axios");
const cheerio = require("cheerio");

async function scrapeDHgate(keyword, maxPages = 3) {
  const products = [];

  for (let page = 1; page <= maxPages; page++) {
    const url = `https://www.dhgate.com/wholesale/search.do?searchkey=${encodeURIComponent(keyword)}&page=${page}`;

    const { data: html } = await axios.get(url, {
      headers: {
        "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36",
        "Accept-Language": "en-US,en;q=0.9",
      },
    });

    const $ = cheerio.load(html);

    // DHgate embeds product data in a script tag as window.__INITIAL_STATE__
    $("script").each((_, el) => {
      const text = $(el).html() || "";
      if (!text.includes("window.__INITIAL_STATE__")) return;

      const match = text.match(/window\.__INITIAL_STATE__\s*=\s*({.*?});/s);
      if (!match) return;

      let state;
      try {
        state = JSON.parse(match[1]);
      } catch (err) {
        // The lazy regex cuts the JSON short if a string value contains "};"
        console.warn(`Page ${page}: could not parse embedded state: ${err.message}`);
        return;
      }

      // Copy the fields we care about from each search result item
      const items = state?.searchResult?.items || [];
      items.forEach((item) => {
        products.push({
          productName: item.productName,
          price: item.price,
          minOrder: item.minOrder,
          productUrl: `https://www.dhgate.com${item.productDetailUrl}`,
          imageUrl: item.imageUrl,
          sellerName: item.sellerName,
          feedbackPercent: item.feedbackPercent,
          freeShipping: item.freeShipping || false,
          searchKeyword: keyword,
        });
      });
    });

    // Be polite — don't hammer the server
    await new Promise(r => setTimeout(r, 2000));
  }

  return products;
}

Handling Common Challenges

Rate limiting: DHgate will throttle aggressive scrapers. Add 1-3 second delays between requests and rotate User-Agent strings.
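A minimal sketch of randomized delays and User-Agent rotation follows. The helper names are hypothetical and the User-Agent strings are only examples, so substitute current browser strings of your own.

```javascript
// Example User-Agent strings (illustrative; replace with up-to-date ones).
const USER_AGENTS = [
  "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36",
  "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/17.0 Safari/605.1.15",
  "Mozilla/5.0 (X11; Linux x86_64; rv:121.0) Gecko/20100101 Firefox/121.0",
];

// Picks one User-Agent at random. The random source is injectable for testing.
function pickUserAgent(random = Math.random) {
  return USER_AGENTS[Math.floor(random() * USER_AGENTS.length)];
}

// Returns a delay in milliseconds in the range [minMs, maxMs).
function jitteredDelayMs(minMs = 1000, maxMs = 3000, random = Math.random) {
  return minMs + Math.floor(random() * (maxMs - minMs));
}

// Resolves after the given number of milliseconds.
function sleep(ms) {
  return new Promise((resolve) => setTimeout(resolve, ms));
}
```

Inside a scraping loop, this would be used as `await sleep(jitteredDelayMs())` between requests, with `pickUserAgent()` supplying the `User-Agent` header for each one.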

IP blocks: For production use, plan on proxy rotation. Residential proxies are generally blocked less often than datacenter IPs on e-commerce sites.

Data format changes: DHgate occasionally updates its frontend. The __INITIAL_STATE__ pattern has been stable, but always validate your parser output.
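One way to validate parser output is to check each batch for missing key fields. This is a sketch under assumptions: the helper names are hypothetical, the required-field list is a guess at what matters most, and the 50% threshold is arbitrary.

```javascript
// Fields we expect on every record (an assumption; adjust to your needs).
const REQUIRED_FIELDS = ["productName", "price", "productUrl"];

// Returns the names of required fields that are missing or empty.
function findMissingFields(product) {
  return REQUIRED_FIELDS.filter((field) => {
    const value = product[field];
    return value === undefined || value === null || value === "";
  });
}

// Summarizes a scraped batch and flags it when it looks like the parser broke.
function summarizeBatch(products) {
  const invalid = products.filter((p) => findMissingFields(p).length > 0);
  return {
    total: products.length,
    invalid: invalid.length,
    // Zero results, or more than half the records incomplete (arbitrary
    // threshold), both suggest the page structure may have changed.
    looksBroken: products.length === 0 || invalid.length / products.length > 0.5,
  };
}
```

Running `summarizeBatch` after every scrape and alerting when `looksBroken` is true catches layout changes before bad data reaches downstream tools.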

The Easy Way: Use a Pre-Built Scraper

If you don't want to maintain your own scraping infrastructure, there's a ready-to-use DHgate Scraper on Apify Store that handles all of this:

  • Keyword search with pagination
  • Sort options (best match, price, orders, newest)
  • Ship-to country filtering
  • Built-in retry logic and polite delays
  • Structured JSON output

Quick Start with Apify CLI

# Install Apify CLI
npm install -g apify-cli

# Run the DHgate scraper
apify call jungle_intertwining/dhgate-scraper \
  -i '{ "searchKeywords": ["wireless earbuds", "phone case"], "maxPages": 3 }'



Sample Output

```json
{
  "itemcode": 1071872302,
  "productName": "Liquid Silicone Phone Case For iPhone 17 16 15 14 13 Pro Max",
  "price": "US $2.54 - 10.78",
  "minOrder": "1 Piece",
  "productUrl": "https://www.dhgate.com/product/liquid-silicone-phone-case/1071872302.html",
  "imageUrl": "https://img4.dhresource.com/260x260/f3/albu/...",
  "sellerName": "bluetoothheadphone1",
  "feedbackPercent": "98.9%",
  "freeShipping": false,
  "searchKeyword": "phone case"
}
```

Real-World Use Cases

1. Dropshipping Product Research

Search for trending product categories, then filter by:

  • minOrder = 1 (no bulk commitment)
  • freeShipping = true (better margins)
  • feedbackPercent > 95% (reliable suppliers)
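The three filters above can be sketched as follows. The scraped fields are strings (for example "1 Piece" and "98.9%" in the sample output), so they need parsing first. The helper names are hypothetical, and the code assumes every record follows the sample's formats.

```javascript
// "US $2.54 - 10.78" -> { min: 2.54, max: 10.78 }; a single price gives min === max.
function parsePriceRange(price) {
  const numbers = (String(price).replace(/,/g, "").match(/\d+(?:\.\d+)?/g) || []).map(Number);
  if (numbers.length === 0) return null;
  return { min: Math.min(...numbers), max: Math.max(...numbers) };
}

// "1 Piece" -> 1, "98.9%" -> 98.9; returns null when no leading number exists.
function parseLeadingNumber(text) {
  const value = parseFloat(String(text));
  return Number.isNaN(value) ? null : value;
}

// Applies the three dropshipping filters from the list above.
function isDropshipCandidate(product) {
  const minOrder = parseLeadingNumber(product.minOrder);
  const feedback = parseLeadingNumber(product.feedbackPercent);
  return minOrder === 1 && product.freeShipping === true && feedback !== null && feedback > 95;
}

const sample = { price: "US $2.54 - 10.78", minOrder: "1 Piece", feedbackPercent: "98.9%", freeShipping: false };
console.log(parsePriceRange(sample.price)); // { min: 2.54, max: 10.78 }
console.log(isDropshipCandidate(sample));   // false (no free shipping)
```

With a scraped array, `products.filter(isDropshipCandidate)` yields the shortlist.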

2. Price Monitoring

Run the scraper on a schedule (daily or weekly) to track price changes across your product catalog. Spot price drops before your competitors do.
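A minimal sketch of spotting price drops between two scheduled runs follows. It assumes each record carries the `itemcode` and `price` fields shown in the sample output, and it compares only the lowest price in each range. The function names are hypothetical.

```javascript
// Lowest number found in a price string such as "US $2.54 - 10.78".
function parseMinPrice(price) {
  const numbers = (String(price).replace(/,/g, "").match(/\d+(?:\.\d+)?/g) || []).map(Number);
  return numbers.length ? Math.min(...numbers) : null;
}

// Compares two snapshots and returns the items whose lowest price fell.
function findPriceDrops(previous, current) {
  const previousById = new Map(previous.map((p) => [p.itemcode, parseMinPrice(p.price)]));
  const drops = [];
  for (const product of current) {
    const oldPrice = previousById.get(product.itemcode);
    const newPrice = parseMinPrice(product.price);
    // Skip new items and unparseable prices; report only genuine decreases.
    if (oldPrice != null && newPrice != null && newPrice < oldPrice) {
      drops.push({ itemcode: product.itemcode, productName: product.productName, oldPrice, newPrice });
    }
  }
  return drops;
}
```

Storing each run's output as a dated JSON file is enough to feed `findPriceDrops` with yesterday's and today's snapshots.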

3. Supplier Discovery

Compare sellers across the same product category. Sort by feedback percentage and review count to find the most reliable suppliers.
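A seller ranking could be sketched like this. The sample output has no review count field, so this version breaks ties by the number of listings each seller has in the scraped results instead. That tiebreaker is a stand-in, and `rankSellers` is a hypothetical name.

```javascript
// Groups products by seller and sorts by feedback percentage (highest first),
// then by number of listings in the scraped results.
function rankSellers(products) {
  const bySeller = new Map();
  for (const p of products) {
    const feedback = parseFloat(p.feedbackPercent); // "98.9%" -> 98.9
    if (!p.sellerName || Number.isNaN(feedback)) continue;
    const entry = bySeller.get(p.sellerName) ||
      { sellerName: p.sellerName, feedbackPercent: feedback, listings: 0 };
    entry.listings += 1;
    bySeller.set(p.sellerName, entry);
  }
  return [...bySeller.values()].sort(
    (a, b) => b.feedbackPercent - a.feedbackPercent || b.listings - a.listings
  );
}
```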

4. Cross-Platform Price Comparison

Combine DHgate data with data from other wholesale platforms like Yiwugo to find the best wholesale prices across China's major marketplaces.

Tips for Production Use

  1. Respect rate limits — 1-2 second delays between requests minimum
  2. Cache results — Don't re-scrape the same queries within 24 hours
  3. Validate output — Check that key fields (price, productName) aren't null
  4. Monitor for changes — Set up alerts if your scraper returns 0 results
  5. Use proxies — Essential for any serious scraping operation
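The caching tip can be sketched with a small in-memory cache keyed by search keyword. `QueryCache` is a hypothetical class, and a production setup would more likely persist results to disk or a database.

```javascript
const DAY_MS = 24 * 60 * 60 * 1000;

// In-memory cache that forgets entries older than ttlMs.
// The clock is injectable so expiry can be tested without waiting.
class QueryCache {
  constructor(ttlMs = DAY_MS, now = () => Date.now()) {
    this.ttlMs = ttlMs;
    this.now = now;
    this.entries = new Map();
  }

  // Returns cached products, or undefined if missing or expired.
  get(keyword) {
    const entry = this.entries.get(keyword);
    if (!entry) return undefined;
    if (this.now() - entry.storedAt >= this.ttlMs) {
      this.entries.delete(keyword);
      return undefined;
    }
    return entry.products;
  }

  // Stores products for a keyword with the current timestamp.
  set(keyword, products) {
    this.entries.set(keyword, { products, storedAt: this.now() });
  }
}
```

Before scraping a keyword, check `cache.get(keyword)` and only hit the site when it returns `undefined`.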

What's Next?

Once you have structured DHgate data, you can:

  • Build a product comparison dashboard
  • Set up automated price alerts
  • Create a supplier scoring system
  • Feed data into your e-commerce analytics pipeline

The wholesale data game is all about having better information faster. Whether you build your own scraper or use a pre-built tool, the key is turning raw product listings into actionable business intelligence.


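📦 Tools mentioned in this article:

  • DHgate Scraper (jungle_intertwining/dhgate-scraper on Apify Store)
  • Made-in-China Scraper (https://apify.com/jungle_intertwining/made-in-china-scraper): extracts B2B product data, supplier info, and MOQ from Made-in-China.com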
