KazKN

Posted on Jun 9

How to Scrape Vestiaire Collective for Sold Items, Prices, and Seller Intelligence

#webscraping #ecommerce #automation #data

Vestiaire Collective is not a normal ecommerce website.

If you only scrape product titles and prices, you miss most of the value.

For luxury resale, the useful questions are different:

Which Chanel Classic Flap listings are priced below the market?
Which seller countries return the best supply for a specific model?
Which items have already sold?
Which listings disappeared after being tracked for several runs?
Which sellers keep posting very similar items?
Which markets expose the same listing several times?

That is why I built a scraper around resale workflows, not just around product pages.

You can run the ready-made Vestiaire Collective Smart Scraper on Apify if you want the hosted version, or use the structure below to understand what a serious Vestiaire Collective data workflow should collect.

Disclosure: this article contains affiliate links to my Apify Actor.

🧭 Why Vestiaire Collective scraping is different from normal ecommerce scraping

Most ecommerce scraping tutorials follow the same pattern:

Fetch a category page.
Extract product cards.
Open product pages.
Save title, price, image, brand, and URL.

That works for a standard retail catalog.

It is incomplete for Vestiaire Collective.

Vestiaire is a resale marketplace. The same brand and model can appear in many conditions, many seller countries, many currencies, and many market locales. A single search like chanel classic flap can return active listings, near-duplicates, irrelevant flap bags, different conditions, different seller locations, and sometimes the same item across several country pages.

So the useful dataset is not only:

{
  "title": "Chanel bag",
  "price": 4200,
  "url": "https://..."
}

The useful dataset looks more like:

{
  "recordType": "listing",
  "listingId": "67486244",
  "title": "Sac a main en cuir Classic CC Shopping",
  "brand": "Chanel",
  "price": 1000,
  "currency": "EUR",
  "country": "FR",
  "sellerCountry": "DE",
  "condition": "Bon etat",
  "isSold": false,
  "displayStatus": "active",
  "likesCount": 22,
  "url": "https://fr.vestiairecollective.com/..."
}

The difference is important. The second record can support sourcing, resale analysis, seller benchmarking, and price tracking. The first one is just a product card.

🧩 What data matters on Vestiaire Collective

For a useful Vestiaire Collective scraper, I would prioritize these fields:

Field	Why it matters
`listingId`	Deduplicate the same item across multiple market pages
`title` and `model`	Filter precise models like Classic Flap, Jackie, Kelly, or Saddle
`price` and `currency`	Compare resale prices across markets
`country`	The market locale searched, such as FR, IT, DE, GB, or US
`sellerCountry`	The seller or item location returned by Vestiaire
`condition`	Critical for underpriced deal detection
`isSold` and `displayStatus`	Separate live inventory from sold or likely sold records
`url`	Open the original listing for manual review
`priceHistory`	Track price drops over repeated runs
`seller summary`	Benchmark sellers seen in the collected data
`duplicate signals`	Flag similar public records for manual review

Two fields are often confused: market country and seller country.

`country` is the Vestiaire market searched. For example, you can search the French, German, or UK marketplace view.

`sellerCountry` is where the seller or item appears to be located according to the public data returned by Vestiaire.

For most sourcing use cases, seller country matters more than market country. If you want French sellers, use `sellerCountries: ["FR"]`. If you want to compare how the same query behaves across local market pages, use `countries`.
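To make the distinction concrete, here is a minimal Python sketch that filters already-collected rows by seller country, regardless of the market view they came from. It assumes rows shaped like the example record above. The helper name `filter_by_seller_country` is hypothetical and is not part of the Actor.

```python
from typing import Iterable


def filter_by_seller_country(rows: Iterable[dict], seller_countries: set[str]) -> list[dict]:
    """Keep rows whose sellerCountry is in the wanted set; "ALL" disables the filter."""
    if "ALL" in seller_countries:
        return list(rows)
    return [row for row in rows if row.get("sellerCountry") in seller_countries]


# Sample rows: `country` is the market view searched, `sellerCountry` the seller location.
rows = [
    {"listingId": "1", "country": "FR", "sellerCountry": "DE"},
    {"listingId": "2", "country": "DE", "sellerCountry": "FR"},
    {"listingId": "3", "country": "FR", "sellerCountry": "FR"},
]

# Listings 2 and 3 are kept, even though listing 2 was found on the German market view.
french_sellers = filter_by_seller_country(rows, {"FR"})
```

Note that listing 1 is dropped even though it was found on the French market view. That is the behavior a sourcing workflow usually wants.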

⚙️ Quick start with the ready-made Actor

The fastest way to collect this data is to run the Vestiaire Collective Smart Scraper.

It supports:

Search terms.
Public Vestiaire start URLs.
All 70 supported Vestiaire market countries.
Optional seller country filters.
Optional item condition filters.
Product detail enrichment.
Sold item collection.
Price history across repeated runs.
Likely sold tracking when listings disappear over time.
Seller summary rows.
Duplicate and suspicious seller signal rows.

The key design choice is that countries are not hardcoded to a few large markets.

If the user leaves countries empty or selects ALL, the Actor can search every supported Vestiaire market. If the user only wants France, Italy, Germany, the UK, or the US, they can restrict the run.

That matters because you do not always know the user's intent in advance.

A reseller might want only sellers in France. A market researcher might want all public market views. A pricing analyst might want to compare the same model across Europe and the US. The input should not block any of these workflows.

🧪 Example input for a deal-finding workflow

Here is a realistic workflow: find potentially underpriced Chanel Classic Flap listings from French, Italian, German, and UK markets, but only keep items in very good or good condition.

{
  "searchTerms": ["chanel classic flap"],
  "countries": ["FR", "IT", "DE", "GB"],
  "sellerCountries": ["ALL"],
  "itemConditions": ["3", "4"],
  "requiredKeywords": ["classic flap"],
  "maxListings": 50,
  "maxDatasetRecords": 80,
  "maxPagesPerCountry": 2,
  "includeDetails": true,
  "includeSellerInfo": true,
  "includeDuplicateSignals": false,
  "includeSoldItems": false
}

This input does several useful things.

It uses a broad search term, then narrows the results with `requiredKeywords`. That helps avoid collecting every Chanel flap bag when you only want Classic Flap candidates.
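If you build this kind of precision filter yourself, one possible approach looks like the sketch below. It assumes case-insensitive, accent-insensitive matching where every required keyword must appear in the title. That rule is my illustration here, not a description of the Actor's exact matching logic, and both function names are hypothetical.

```python
import unicodedata


def normalize(text: str) -> str:
    """Lowercase and strip accents so 'Sac à main' matches 'sac a main'."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch)).lower()


def matches_required_keywords(title: str, required_keywords: list[str]) -> bool:
    """True only if every required keyword occurs in the normalized title."""
    haystack = normalize(title)
    return all(normalize(keyword) in haystack for keyword in required_keywords)


titles = ["Chanel Classic Flap Medium", "Chanel Wallet on Chain flap bag"]
kept = [t for t in titles if matches_required_keywords(t, ["classic flap"])]
```

Substring matching is deliberately strict. A title that says only "flap" is dropped, which is the point when you want Classic Flap candidates only.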

It searches several market locales, but it does not restrict seller country. That is useful when the goal is deal discovery. If your business only buys from specific countries, you can replace sellerCountries: ["ALL"] with something like:

{
  "sellerCountries": ["FR", "IT"]
}

The `maxListings` value controls the number of listing-like records. The `maxDatasetRecords` value controls total dataset rows, including details and seller summaries. If you enable enrichments, always give `maxDatasetRecords` more room than `maxListings`.

📦 Example output fields

A clean output should not force the user to decode raw website internals.

For each listing, I want the dataset to be readable directly inside Apify, CSV, Excel, or an API response.

Example listing row:

{
  "recordType": "listing",
  "displayStatus": "active",
  "isSold": false,
  "listingId": "67486244",
  "title": "Sac a main en cuir Classic CC Shopping",
  "brand": "Chanel",
  "price": 1000,
  "currency": "EUR",
  "priceText": "1000 EUR",
  "country": "FR",
  "sellerCountry": "DE",
  "condition": "Bon etat",
  "conditionSource": "vestiaire",
  "itemSummary": "Chanel - Sac a main en cuir Classic CC Shopping - 1000 EUR - DE"
}

Example tracking row after repeated runs:

{
  "recordType": "price_history",
  "listingId": "67486244",
  "displayStatus": "active",
  "priceHistory": [
    { "price": 1200, "currency": "EUR", "observedAt": "2026-06-01T12:00:00.000Z" },
    { "price": 1000, "currency": "EUR", "observedAt": "2026-06-08T12:00:00.000Z" }
  ],
  "priceDrop": 200
}
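The `priceDrop` value in the tracking row above can be derived from `priceHistory`. The sketch below assumes the drop is the first observed price minus the latest observed price, in the same currency. That definition is an assumption that matches the example numbers, and the function names are hypothetical.

```python
from datetime import datetime


def parse_observed_at(value: str) -> datetime:
    # Older Python versions reject a trailing "Z" in fromisoformat(), so swap it for an offset.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def price_drop(price_history: list[dict]) -> float:
    """Return first observed price minus latest observed price, or 0 if there is no drop."""
    if len(price_history) < 2:
        return 0
    ordered = sorted(price_history, key=lambda entry: parse_observed_at(entry["observedAt"]))
    first, latest = ordered[0], ordered[-1]
    if first["currency"] != latest["currency"]:
        raise ValueError("cannot compare prices in different currencies")
    return max(first["price"] - latest["price"], 0)


history = [
    {"price": 1200, "currency": "EUR", "observedAt": "2026-06-01T12:00:00.000Z"},
    {"price": 1000, "currency": "EUR", "observedAt": "2026-06-08T12:00:00.000Z"},
]
drop = price_drop(history)  # 200 for the example history above
```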

Example seller summary:

{
  "recordType": "seller_summary",
  "sellerCountry": "FR",
  "uniqueListings": 20,
  "averagePrice": 1850,
  "minPrice": 650,
  "maxPrice": 4200
}
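A summary row like this one can be computed from deduplicated listing rows. The sketch below is a simplified illustration that assumes all prices share one currency. The function name is hypothetical, and the Actor's own aggregation may differ.

```python
from collections import defaultdict
from statistics import mean


def summarize_by_seller_country(listings: list[dict]) -> list[dict]:
    """Group listings by sellerCountry and compute simple price statistics."""
    prices_by_country: dict[str, dict[str, float]] = defaultdict(dict)
    for listing in listings:
        # Keying by listingId counts a listing once even if several markets expose it.
        prices_by_country[listing["sellerCountry"]][listing["listingId"]] = listing["price"]

    summaries = []
    for country, prices_by_id in sorted(prices_by_country.items()):
        prices = list(prices_by_id.values())
        summaries.append({
            "recordType": "seller_summary",
            "sellerCountry": country,
            "uniqueListings": len(prices),
            "averagePrice": round(mean(prices), 2),
            "minPrice": min(prices),
            "maxPrice": max(prices),
        })
    return summaries


sample = [
    {"listingId": "1", "sellerCountry": "FR", "price": 650},
    {"listingId": "2", "sellerCountry": "FR", "price": 4200},
    {"listingId": "1", "sellerCountry": "FR", "price": 650},  # same item seen on another market
    {"listingId": "3", "sellerCountry": "DE", "price": 1000},
]
summaries = summarize_by_seller_country(sample)
```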

This is the part most simple scrapers miss. Once you have seller summaries and tracking records, the dataset becomes a market intelligence workflow, not just a list of products.

💰 Cost model and pricing logic

For scraping tools, pricing has to match the value of the output.

If a competitor charges around 20 to 30 dollars per 1,000 results, a new Actor should not be arbitrarily priced at 50 or 70 dollars per 1,000. It needs to be close enough to the market while still leaving margin.

For the current Actor pricing, the main paid event is the listing result. Search pages and enrichments are priced separately so users only pay more when they ask for deeper data.

Approximate effective pricing:

Workflow	BRONZE	SILVER	GOLD+
Listing-only results	About $20.08 per 1,000	About $18.06 per 1,000	About $16.05 per 1,000
Listing + product details	About $26.08 per 1,000	About $23.06 per 1,000	About $20.05 per 1,000
Sold item results	About $25.08 per 1,000	About $22.06 per 1,000	About $20.05 per 1,000

Per-result prices fall as the plan tier rises, so free accounts are useful for testing, but paid accounts quickly become more efficient for serious usage.
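For a rough budget, you can scale the per-1,000 figures from the table linearly. The sketch below does only that. The rates are the approximate numbers quoted above, the workflow keys are my own labels, and real charges may not scale perfectly linearly if part of the price is a fixed per-run component.

```python
# Approximate effective prices in USD per 1,000 results, copied from the table above.
RATES_PER_1000 = {
    "listing_only": {"BRONZE": 20.08, "SILVER": 18.06, "GOLD+": 16.05},
    "listing_with_details": {"BRONZE": 26.08, "SILVER": 23.06, "GOLD+": 20.05},
    "sold_items": {"BRONZE": 25.08, "SILVER": 22.06, "GOLD+": 20.05},
}


def estimate_cost(workflow: str, tier: str, results: int) -> float:
    """Linear estimate in USD: rate per 1,000 results times the number of results."""
    return round(RATES_PER_1000[workflow][tier] * results / 1000, 2)


# Estimate for a 50-listing test run with product details on the BRONZE tier.
small_test = estimate_cost("listing_with_details", "BRONZE", 50)
```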

If you want to run it yourself, the live Actor is here: Vestiaire Collective Scraper on Apify.

🛠️ Building your own scraper vs using an Actor

You can build your own Vestiaire Collective scraper.

The basic architecture is straightforward:

Generate search URLs or API requests from query terms.
Loop through market countries.
Parse listing cards or public JSON data.
Normalize fields.
Deduplicate by listing ID.
Optionally fetch product detail pages.
Store state between runs for price history.
Export the dataset.
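The first five steps can be sketched in a few lines of Python. This is a skeleton under stated assumptions, not the Actor's code: `fetch_listings` is a hypothetical function you would supply (HTTP requests, parsing, retries), and `seenInCountries` is an illustrative field name.

```python
from typing import Callable, Iterable


def collect_unique_listings(
    search_terms: list[str],
    countries: list[str],
    fetch_listings: Callable[[str, str], Iterable[dict]],
) -> list[dict]:
    """Loop over terms and markets, normalize rows, and deduplicate by listing ID."""
    unique: dict[str, dict] = {}
    for term in search_terms:
        for country in countries:
            for raw in fetch_listings(term, country):
                listing_id = str(raw["listingId"])  # normalize IDs to strings
                if listing_id not in unique:
                    unique[listing_id] = {
                        **raw,
                        "listingId": listing_id,
                        "country": country,
                        "seenInCountries": [country],
                    }
                elif country not in unique[listing_id]["seenInCountries"]:
                    # Same item exposed on another market page: record it, do not duplicate it.
                    unique[listing_id]["seenInCountries"].append(country)
    return list(unique.values())
```

Passing the fetcher in as a parameter keeps the deduplication logic testable without touching the network.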

The hard part is not the first 50 lines of code.

The hard part is making the workflow reliable enough for actual users.

You need input validation, country handling, seller country filters, title precision filters, item condition filters, result caps, charge limits, retries, deduplication, clean output schemas, and a way to avoid marking old listings as sold just because one run failed.

For a personal script, you can keep it simple.

For a public data product, you need guardrails.

That is the main reason I packaged this as an Apify Actor. Users can run it from the UI, schedule it, export results, call it through the API, and reuse the same input without maintaining infrastructure.

🔍 Use cases this unlocks

Here are practical workflows this type of scraper can support.

Find underpriced luxury items:

{
  "searchTerms": ["chanel classic flap"],
  "itemConditions": ["3", "4"],
  "includeDetails": true
}

Track sold items for resale price research:

{
  "searchTerms": ["hermes kelly 28"],
  "includeSoldItems": true,
  "maxListings": 100
}

Monitor price drops over time:

{
  "searchTerms": ["dior saddle bag"],
  "trackingStoreName": "dior-saddle-weekly-tracking",
  "includeDetails": true
}

Benchmark sellers by country:

{
  "searchTerms": ["gucci jackie bag"],
  "sellerCountries": ["FR", "IT", "ES"],
  "includeSellerInfo": true
}

Find duplicate or suspicious public listing patterns:

{
  "searchTerms": ["louis vuitton neverfull"],
  "includeDuplicateSignals": true,
  "maxListings": 200,
  "maxDatasetRecords": 260
}

Duplicate signals are not proof of fraud. They are review signals. The goal is to surface listings or sellers that deserve a closer look, not to make a final authenticity decision.

You can test these workflows with the Vestiaire Collective Sold Items and Price Tracker Actor.

🧯 Practical scraping caveats

There are a few things to be careful with.

First, country selection and seller country selection are not the same thing.

If you search `countries: ["FR"]`, you are searching the French market view. You may still see sellers from Germany, Italy, Spain, or elsewhere. If you want only French sellers, use `sellerCountries: ["FR"]`.

Second, a sold status does not always come from a single, definitive flag on the website.

Some records come from public sold search filters. Some records come from explicit sold pages. Some records become likely_sold only after they disappear across repeated tracking runs. A good scraper should show the source of the status instead of pretending every sold signal is equally certain.
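The disappearance-based status can be sketched as a small state update that runs after each tracking run. The version below is a simplified illustration: the state layout, the `missedRuns` counter, and the default threshold of 3 are assumptions. The important detail is that a failed run changes nothing, so one bad run never pushes a listing toward `likely_sold`.

```python
def update_tracking(
    state: dict[str, dict],
    seen_ids: set[str],
    run_succeeded: bool,
    missing_runs_threshold: int = 3,
) -> dict[str, dict]:
    """Update per-listing tracking state after one run.

    `state` maps listingId -> {"missedRuns": int, "displayStatus": str}.
    """
    if not run_succeeded:
        # A failed run says nothing about availability, so leave the state untouched.
        return state

    for listing_id in seen_ids:
        state[listing_id] = {"missedRuns": 0, "displayStatus": "active"}

    for listing_id, entry in state.items():
        if listing_id in seen_ids:
            continue
        entry["missedRuns"] += 1
        if entry["missedRuns"] >= missing_runs_threshold:
            entry["displayStatus"] = "likely_sold"
    return state
```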

Third, condition data can be incomplete.

If Vestiaire does not return a condition for a result, the scraper should not invent certainty. In my Actor, condition fields expose whether the value came from Vestiaire or from a selected filter fallback.
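One way to keep that provenance explicit is to resolve the condition and its source together. In the sketch below, the `"vestiaire"` source value comes from the example output earlier in this article, while `"filter_fallback"` and `"unknown"` are placeholder labels I made up for illustration.

```python
from typing import Optional


def resolve_condition(raw_condition: Optional[str], selected_filter_label: Optional[str]) -> dict:
    """Return the condition together with where the value came from."""
    if raw_condition:
        return {"condition": raw_condition, "conditionSource": "vestiaire"}
    if selected_filter_label:
        # The listing matched a condition filter, but the site returned no explicit value.
        return {"condition": selected_filter_label, "conditionSource": "filter_fallback"}
    return {"condition": None, "conditionSource": "unknown"}
```

Downstream analysis can then decide whether to trust fallback values or exclude them from price comparisons.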

Fourth, duplicate detection must stay conservative.

Similar titles, images, brands, categories, and price bands can signal a duplicate cluster. They do not automatically prove counterfeit activity. Treat these records as manual review leads.
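A conservative version of this idea groups listings by brand, normalized title, currency, and price band, and only reports groups that contain several distinct listing IDs. The sketch below is a simplified illustration that ignores images and categories. The `duplicate_signal` record type, the field names, and the band width of 100 are assumptions.

```python
from collections import defaultdict


def duplicate_signals(
    listings: list[dict],
    price_band_width: int = 100,
    min_cluster_size: int = 2,
) -> list[dict]:
    """Flag groups of distinct listings that share brand, title, currency, and price band."""
    clusters: dict[tuple, set] = defaultdict(set)
    for listing in listings:
        key = (
            listing["brand"].lower(),
            " ".join(listing["title"].lower().split()),  # collapse case and whitespace
            listing["currency"],
            int(listing["price"] // price_band_width),
        )
        clusters[key].add(listing["listingId"])

    return [
        {"recordType": "duplicate_signal", "listingIds": sorted(ids), "reviewRequired": True}
        for ids in clusters.values()
        if len(ids) >= min_cluster_size
    ]
```

Fixed price bands miss pairs that straddle a band boundary, such as 990 and 1010. That is acceptable here because the output is a review lead, not a verdict.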

Finally, respect public data boundaries.

Do not use scraping to collect private account information, bypass access controls, or overload a website. Keep requests reasonable, cache when possible, and design your workflow around public data.

❓ FAQ

Q: Can I scrape sold items from Vestiaire Collective?

A: Yes, you can collect public sold-search results and public sold-item pages when available. For recurring tracking, you can also mark listings as likely_sold only after they are missing for a configured number of runs.

Q: What is the difference between market country and seller country?

A: Market country is the Vestiaire locale searched, such as FR, IT, DE, GB, or US. Seller country is the seller or item location returned in the public listing data. For sourcing, seller country is often the more useful filter.

Q: Do I need product detail enrichment?

A: Not always. Listing-only runs are cheaper and faster. Enable product details when you need richer product metadata, product page fields, or more confidence before manual review.

Q: Can I use this without writing code?

A: Yes. Apify Actors can run from the browser UI with JSON input. Developers can also call the same Actor through the Apify API and export results as JSON, CSV, Excel, or via dataset endpoints.
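For developers, a call through the official `apify-client` Python package might look like the sketch below. The Actor ID is deliberately left as a parameter because I am not hardcoding it here, the `APIFY_TOKEN` environment variable name is an assumption about your setup, and `build_run_input` is a hypothetical helper that only uses input fields shown earlier in this article.

```python
import os
from typing import Optional


def build_run_input(
    search_terms: list[str],
    seller_countries: Optional[list[str]] = None,
    max_listings: int = 50,
) -> dict:
    """Build a small Actor input from fields shown in the examples above."""
    return {
        "searchTerms": list(search_terms),
        "sellerCountries": list(seller_countries or ["ALL"]),
        "maxListings": max_listings,
    }


def run_actor(actor_id: str, run_input: dict) -> list[dict]:
    """Start the Actor, wait for it to finish, and return its dataset items."""
    # Imported lazily so this module loads even when apify-client is not installed.
    from apify_client import ApifyClient

    client = ApifyClient(os.environ["APIFY_TOKEN"])
    run = client.actor(actor_id).call(run_input=run_input)
    if run is None:
        raise RuntimeError("Actor run did not return a result")
    return list(client.dataset(run["defaultDatasetId"]).iterate_items())


# Example (requires `pip install apify-client`, a token, and a real Actor ID):
# items = run_actor("<username>/<actor-name>", build_run_input(["hermes kelly 28"]))
```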

✅ Final thoughts

The main mistake with Vestiaire Collective scraping is treating it like a static product catalog.

The value is not only in scraping listings.

The value is in connecting listings with sold signals, seller location, condition, duplicate detection, and price history over time.

That is how you turn public marketplace pages into something useful for resale research, sourcing, competitive monitoring, and seller intelligence.

If you want the hosted version, start with the Vestiaire Collective Smart Scraper on Apify, run a small query first, inspect the output, then scale the run once the fields match your workflow.

DEV Community