Quick answer: The Discogs marketplace is the world's largest secondary market for vinyl, CDs, and cassettes, but its free REST API exposes only a lowest_price aggregate and a num_for_sale count. To get the per-listing detail (who is selling, at what asking price, in what condition, and shipping from where) you have to scrape the Cloudflare-protected marketplace HTML. The Apify Actor at apify.com/DevilScrapes/discogs-sold-price does exactly that, joining the public REST API with paginated marketplace HTML into a single flat dataset. Price: $0.005 per row, ~$5.05 per 1,000 results.
If you have ever priced a record collection for resale, you know the workflow: open Discogs, search the release, squint at 25 listings per page, mentally adjust for condition grade and seller location. Multiply by 200 records and a weekend is gone. The data you want — every asking price, every seller rating, every condition grade — is sitting on a public web page, but Discogs exposes none of it in any published API, and the page sits behind Cloudflare.
What is Discogs? 🎵
Discogs is a community-built catalog of released music and an active secondary marketplace. Collectors use it to catalog inventory, buy and sell physical media, and settle "what is this copy worth" questions. The catalog side is free and has a REST API. The marketplace side — the live order book of who is selling what, at what price — is a Cloudflare-protected web app with no programmatic export.
The Discogs API exposes release metadata (title, artist, year, format, genres), community stats, and exactly two market-signal fields: lowest_price and num_for_sale. That is the entire API surface for marketplace pricing. Every column past those two — asking price by listing, condition grade, seller country, seller rating — lives only in the HTML.
Does Discogs have an API for marketplace listings?
No. The Discogs REST API at api.discogs.com exposes a GET /marketplace/stats/{release_id} endpoint that returns the floor ask (lowest_price) and the total listing count (num_for_sale). It does not return individual listings. The per-listing data — who is selling, at what price, with what media and sleeve grade, ships-from where — is available only on the paginated HTML page at www.discogs.com/sell/release/{release_id}. That page is served through Cloudflare and does not have a documented API counterpart.
This Actor fills that gap: it joins the two public REST endpoints (release metadata + marketplace stats) with the paginated listing HTML into one flat, typed row per listing.
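To make the REST half concrete, here is a minimal standard-library sketch that builds the request for the stats endpoint named above. It is an illustration, not the Actor's code: the helper names and the User-Agent string are placeholders, and Discogs asks API clients to identify themselves with an Application-Name/Version User-Agent.

```python
import json
import urllib.request

API_BASE = "https://api.discogs.com"
# Placeholder identity; swap in your own application name and contact URL.
USER_AGENT = "MyPricingScript/0.1 (+https://example.com)"


def build_stats_request(release_id: int) -> urllib.request.Request:
    """Build (but do not send) a GET request for the marketplace stats endpoint."""
    url = f"{API_BASE}/marketplace/stats/{release_id}"
    return urllib.request.Request(url, headers={"User-Agent": USER_AGENT})


def fetch_stats(release_id: int) -> dict:
    """Send the request and decode the JSON body (performs a network call)."""
    with urllib.request.urlopen(build_stats_request(release_id), timeout=30) as resp:
        return json.load(resp)
```

Calling fetch_stats(249504) should return the aggregate fields only; nothing in that response identifies an individual listing.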
What the data looks like 📤
Each marketplace listing comes back as one flat row. Here is a real one, from release ID 249504 (Rick Astley — Never Gonna Give You Up, 1987 UK 7"):
```json
{
  "row_type": "listing",
  "release_id": 249504,
  "release_title": "Never Gonna Give You Up",
  "artist": "Rick Astley",
  "year": 1987,
  "country": "UK",
  "format_name": "Vinyl",
  "format_descriptions": ["7\"", "45 RPM", "Single", "Stereo"],
  "genres": ["Electronic", "Pop"],
  "master_id": 96559,
  "release_url": "https://www.discogs.com/release/249504",
  "listing_id": 3761251765,
  "listing_url": "https://www.discogs.com/sell/item/3761251765",
  "asking_price": 0.5,
  "asking_currency": "GBP",
  "shipping_text": "+£15.00",
  "condition_media": "Very Good Plus (VG+)",
  "condition_sleeve": "Generic",
  "seller_username": "Ronan266",
  "seller_rating_pct": 100.0,
  "seller_rating_count": 35,
  "seller_country": "United Kingdom",
  "stats_lowest_price": null,
  "stats_lowest_currency": null,
  "stats_num_for_sale": null,
  "stats_blocked_from_sale": null,
  "scraped_at": "2026-05-16T12:00:00.000Z"
}
```
That is the full 27-field schema. With includeStatsRow on (default), the Actor also emits one aggregate row per release — same schema, row_type: "stats", stats_* fields populated, listing fields null — so you can GROUP BY release_id for both per-listing detail and the market floor. Pydantic v2 validates every row before it is pushed: ISO-8601 timestamps, proper nulls, typed numerics — not stringified soup.
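Because both row types share one schema, pairing each listing with its release's floor price is a small grouping step. The sketch below shows one way to do it in plain Python; the function name and the sample values are illustrative, and the floor is attached as-is because it may be in a different currency from the asking price.

```python
from collections import defaultdict


def attach_floor(rows):
    """Group listing rows by release_id and attach that release's stats floor."""
    floors = {
        r["release_id"]: r["stats_lowest_price"]
        for r in rows
        if r["row_type"] == "stats"
    }
    grouped = defaultdict(list)
    for r in rows:
        if r["row_type"] == "listing":
            grouped[r["release_id"]].append(
                {
                    "listing_id": r["listing_id"],
                    "asking_price": r["asking_price"],
                    "floor": floors.get(r["release_id"]),
                }
            )
    return dict(grouped)


# Illustrative rows trimmed to the fields this sketch reads.
sample_rows = [
    {"row_type": "stats", "release_id": 249504, "listing_id": None,
     "asking_price": None, "stats_lowest_price": 0.5},
    {"row_type": "listing", "release_id": 249504, "listing_id": 3761251765,
     "asking_price": 0.5, "stats_lowest_price": None},
]
by_release = attach_floor(sample_rows)
```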
The naive approach (and why it falls apart) ⚠️
Every developer who finds the Discogs marketplace HTML eventually tries the same thing: open Chrome DevTools, find the request for /sell/release/{id}, replay it with requests.get(). Here is what breaks.
Cloudflare. The www.discogs.com/sell/release/ path sits behind Cloudflare's bot-management layer. Python's stdlib SSL stack, httpx, and plain requests all fail with a 403 and a JS challenge: the response looks like HTML but holds no listing data. We handle it with curl-cffi impersonation: AsyncSession(impersonate="chrome131") replays a real Chrome 131 TLS ClientHello, ALPN order, and HTTP/2 SETTINGS frame. Before any listing page we run a single homepage warm-up to seed the Cloudflare cookies, after which pages return 200 cleanly. We route the session through Apify datacenter proxies (the BUYPROXIES94952 group) to give it a clean exit IP, verified in cloud QA on real Apify datacenter IPs.
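The warm-up-then-fetch sequence can be sketched roughly as follows. This is a simplified illustration rather than the Actor's implementation: it assumes curl-cffi is installed, omits the proxy, and treats the `page` query parameter as an assumption about how pagination is addressed.

```python
HOME_URL = "https://www.discogs.com/"


def listing_page_url(release_id: int, page: int = 1) -> str:
    """URL of one marketplace listing page (the `page` parameter is assumed)."""
    return f"https://www.discogs.com/sell/release/{release_id}?page={page}"


async def fetch_listing_html(release_id: int, page: int = 1) -> str:
    """Warm up on the homepage, then fetch one listing page with Chrome impersonation."""
    # Imported lazily so the URL helper works without curl-cffi installed.
    from curl_cffi.requests import AsyncSession

    async with AsyncSession(impersonate="chrome131") as session:
        await session.get(HOME_URL)  # seeds cookies before the real request
        resp = await session.get(listing_page_url(release_id, page))
        resp.raise_for_status()
        return resp.text
```

Run it with asyncio.run(fetch_listing_html(249504)). Whether a bare session like this gets through depends on the exit IP and on Cloudflare's current rules.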
Rate limits. Discogs documents 60 req/min for anonymous API traffic; the marketplace HTML surface has no published limit of its own. We throttle at one request per 1.5 seconds (~40 req/min) across all calls to stay under the documented ceiling, honour Retry-After on 429, and retry with exponential backoff on 408/429/503 and network errors (five attempts). When a release fails after all retries we log a WARNING, emit whatever rows we collected, and continue, so one bad ID never kills the run.
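The retry policy above reduces to two small decisions: whether to retry, and how long to wait. Here is a hedged sketch of both. The 1.5-second floor, the status codes, and the five attempts come from the description above; the backoff base, the 60-second cap, and the jitter range are assumptions chosen for illustration.

```python
import random
from typing import Optional

RETRYABLE_STATUSES = {408, 429, 503}
MAX_ATTEMPTS = 5
MIN_INTERVAL_S = 1.5  # one request per 1.5 s, about 40 req/min


def should_retry(status: int, attempt: int) -> bool:
    """Retry only retryable statuses, and only while attempts remain."""
    return status in RETRYABLE_STATUSES and attempt < MAX_ATTEMPTS


def retry_delay(attempt: int, retry_after: Optional[str] = None,
                base: float = 2.0, cap: float = 60.0, jitter: bool = True) -> float:
    """Seconds to wait before retrying after failed attempt number `attempt` (1-based).

    A numeric Retry-After header wins; otherwise use capped exponential backoff.
    """
    if retry_after is not None:
        try:
            return max(float(retry_after), MIN_INTERVAL_S)
        except ValueError:
            pass  # the HTTP-date form of Retry-After is not handled in this sketch
    delay = min(cap, base * 2 ** (attempt - 1))
    if jitter:
        delay += random.uniform(0, 0.5)  # small random offset to avoid lockstep retries
    return max(delay, MIN_INTERVAL_S)
```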
The REST API needs a custom User-Agent. The Discogs API Terms require every request to carry an Application-Name/Version User-Agent; default browser-impersonation headers get a 403 on the JSON surface. The Actor sends DevilScrapes/0.1 (+https://apify.com/DevilScrapes) automatically — you configure nothing.
Parsing the condition grades. Condition text is free-form HTML. We extract it with a regex against the canonical Discogs grade vocabulary (Mint, Near Mint, VG+, VG, Generic, Not Graded, and the rest); anything outside it sets the field to null with a WARNING rather than emitting garbage. None of this is glamorous — all of it is the difference between a script that works once and a feed that survives Cloudflare's quarterly changes.
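A vocabulary-anchored regex of the kind described above might look like this. The grade list is our reading of the Discogs grading vocabulary and the function name is hypothetical; the real parser may differ in both.

```python
import re
from typing import Optional

# Grade labels as they appear on Discogs listings (assumed complete for this sketch).
GRADES = [
    "Mint (M)",
    "Near Mint (NM or M-)",
    "Very Good Plus (VG+)",
    "Very Good (VG)",
    "Good Plus (G+)",
    "Good (G)",
    "Fair (F)",
    "Poor (P)",
    "Generic",
    "Not Graded",
    "No Cover",
]

# Longest labels first so "Very Good Plus (VG+)" is tried before "Very Good (VG)".
_GRADE_RE = re.compile(
    "|".join(re.escape(g) for g in sorted(GRADES, key=len, reverse=True))
)


def parse_grade(text: str) -> Optional[str]:
    """Return the canonical grade found in free-form text, or None if none matches."""
    normalised = " ".join(text.split())  # collapse whitespace left over from HTML
    match = _GRADE_RE.search(normalised)
    return match.group(0) if match else None
```

Returning None for anything outside the vocabulary keeps unexpected markup from leaking into the condition fields.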
The Actor ⚙️
The Actor is live on the Apify Store: apify.com/DevilScrapes/discogs-sold-price. Run it from the Apify Console (paste release IDs, click Start) or via the Apify Python client:
```python
from apify_client import ApifyClient

client = ApifyClient("YOUR_APIFY_TOKEN")

# Start the Actor and block until the run finishes.
run = client.actor("DevilScrapes/discogs-sold-price").call(
    run_input={
        "releaseIds": [249504, 10843],
        "maxPagesPerRelease": 4,
        "maxListingsPerRelease": 100,
        "includeStatsRow": True,
        "useProxy": True,
    }
)

# Stream every row (listing and stats) from the run's default dataset.
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item["row_type"], item.get("asking_price"), item.get("condition_media"))
```
If you do not know the release ID, use searchQuery instead:
```python
# Reuses the client created in the previous snippet.
run = client.actor("DevilScrapes/discogs-sold-price").call(
    run_input={
        "searchQuery": "nirvana nevermind",
        "maxSearchResults": 5,
        "maxListingsPerRelease": 50,
    }
)
```
The search path resolves the top maxSearchResults hits into release IDs, then fetches listings for each. Exactly one of releaseIds or searchQuery is required — passing both, or neither, raises a Pydantic ValidationError before any network call goes out.
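The exactly-one-of rule is easy to mirror on the client side before you spend a run on a bad input. The sketch below uses a standard-library dataclass and raises ValueError in place of Pydantic's ValidationError; the class name is hypothetical and only the two mutually exclusive fields are modelled.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ActorInput:
    releaseIds: Optional[List[int]] = None
    searchQuery: Optional[str] = None

    def __post_init__(self) -> None:
        has_ids = bool(self.releaseIds)
        has_query = bool(self.searchQuery and self.searchQuery.strip())
        # XOR: both set or both missing is an error.
        if has_ids == has_query:
            raise ValueError("Provide exactly one of releaseIds or searchQuery")


def is_valid_input(**kwargs) -> bool:
    """Convenience wrapper: True if the input passes the XOR check."""
    try:
        ActorInput(**kwargs)
        return True
    except ValueError:
        return False
```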
Key input parameters:
| Field | Default | Max | Notes |
|---|---|---|---|
| releaseIds | - | 100 IDs per run | XOR with searchQuery |
| searchQuery | - | 200 chars | Resolves via Discogs search API |
| maxSearchResults | 5 | 50 | Only used with searchQuery |
| maxPagesPerRelease | 4 | 20 | 25 listings per page |
| maxListingsPerRelease | 100 | 500 | Hard cap per release |
| includeStatsRow | true | - | One aggregate row per release |
| useProxy | true | - | Recommended for Apify cloud runs |
Use cases 💡
Vinyl and CD reseller benchmarking. Took in 200 records at an estate sale? Pull the marketplace listings for each release, compute the median asking price by condition grade, and price your own copies in minutes instead of an afternoon — filtering on seller_country for the shipping-adjusted competitive picture.
Music-collectibles arbitrage. The same release often trades at materially different prices across seller countries. Query seller_country + asking_price + asking_currency across a handful of releases and spot regional underpricing before another buyer does.
Label and catalog market intel. For a label's back-catalog, track stats_num_for_sale and stats_lowest_price over time with Apify Schedules. A sustained drop in num_for_sale alongside a rising stats_lowest_price is a secondary-market appreciation signal.
Journalism and pricing studies. "What does a first pressing of Nevermind cost right now?" is a recurring music-publication paragraph. One Actor run on the relevant release IDs gives you a defensible, timestamped dataset instead of a manual spot-check.
Seller-quality screening. Filter listings by seller_rating_pct >= 99.0 and seller_rating_count >= 100 before buying, or sweep stats_blocked_from_sale to flag releases Discogs has quietly restricted. The Actor hands you the data; the filtering is one line of Pandas.
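As one example of the benchmarking and screening use cases, the sketch below computes the median asking price per currency and media grade, counting only well-rated sellers. It uses plain Python in place of Pandas so it runs anywhere; the function name, thresholds, and sample rows are illustrative.

```python
from collections import defaultdict
from statistics import median


def median_ask_by_condition(rows, min_rating_pct=99.0, min_rating_count=100):
    """Median asking price keyed by (currency, media grade), screened by seller rating."""
    buckets = defaultdict(list)
    for r in rows:
        if r.get("row_type") != "listing":
            continue  # skip aggregate stats rows
        if (r.get("seller_rating_pct") or 0) < min_rating_pct:
            continue
        if (r.get("seller_rating_count") or 0) < min_rating_count:
            continue
        # Keep currencies apart: prices are not normalised to one currency.
        buckets[(r["asking_currency"], r["condition_media"])].append(r["asking_price"])
    return {key: median(prices) for key, prices in buckets.items()}


# Made-up rows trimmed to the fields this sketch reads.
sample = [
    {"row_type": "listing", "asking_currency": "GBP", "asking_price": 4.0,
     "condition_media": "Very Good Plus (VG+)",
     "seller_rating_pct": 99.5, "seller_rating_count": 250},
    {"row_type": "listing", "asking_currency": "GBP", "asking_price": 6.0,
     "condition_media": "Very Good Plus (VG+)",
     "seller_rating_pct": 100.0, "seller_rating_count": 1200},
    {"row_type": "listing", "asking_currency": "GBP", "asking_price": 0.5,
     "condition_media": "Very Good Plus (VG+)",
     "seller_rating_pct": 100.0, "seller_rating_count": 35},
    {"row_type": "stats", "asking_currency": None, "asking_price": None,
     "condition_media": None, "seller_rating_pct": None, "seller_rating_count": None},
]
medians = median_ask_by_condition(sample)
```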
Pricing — exact numbers 💰
Pay-per-event. You pay only for rows that land in the dataset. No data, no charge (beyond the $0.05 run start fee).
| Event | Price (USD) | When |
|---|---|---|
| actor-start | $0.05 | Once per run, at boot |
| result-row | $0.005 | Per listing OR per stats row written |
What that looks like in practice:
| Run | Rows | Cost |
|---|---|---|
| 1 release × 25 listings + 1 stats row | 26 | $0.18 |
| 5 releases × 100 listings + 5 stats rows | 505 | $2.58 |
| 10 releases × 100 listings + 10 stats rows | 1,010 | $5.10 |
| 50 releases × 100 listings + 50 stats rows | 5,050 | $25.30 |
At scale the per-row charge dominates: approximately $5.05 per 1,000 rows, priced for a hand-parsed marketplace listing with full seller metadata rather than a JSON API field copy. Apify's $5 free trial credit covers your first ~990 listing rows, no credit card required.
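The arithmetic behind those figures is one start fee plus a per-row charge. The short sketch below reproduces the table and the free-credit estimate; the function names are ours, and rounding to the nearest cent (half up) is an assumption about how totals are displayed.

```python
from decimal import Decimal, ROUND_HALF_UP

ACTOR_START = Decimal("0.05")   # charged once per run
RESULT_ROW = Decimal("0.005")   # charged per listing or stats row


def run_cost(rows: int) -> Decimal:
    """Total cost in USD for one run that writes `rows` rows, rounded to cents."""
    total = ACTOR_START + RESULT_ROW * rows
    return total.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


def rows_for_budget(budget_usd: str) -> int:
    """How many rows a single run can write within the given budget."""
    return int((Decimal(budget_usd) - ACTOR_START) / RESULT_ROW)


for rows in (26, 505, 1010, 5050):
    print(rows, run_cost(rows))
```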
The technically interesting part
The devil is in the data-attribute vs. display-text split. Discogs encodes the machine-readable price in an HTML attribute (data-pricevalue="0.50", data-currency="GBP") and a human-formatted string in the visible text ("£0.50"). The displayed text goes through server-side currency conversion for non-UK visitors, so text-scraping gives you a converted price in whatever currency the datacenter IP geolocates to. We always prefer data-pricevalue and data-currency — the canonical values the seller entered — and only fall back to text parsing (with a WARNING) when the attributes are missing. For any cross-country price comparison the text number is a moving target and the attribute is not.
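Reading the attributes instead of the visible text can be sketched with the standard-library HTML parser. Only the two attribute names come from the description above; the surrounding markup and the converted display price in the sample are invented for illustration.

```python
from html.parser import HTMLParser


class PriceAttrParser(HTMLParser):
    """Collect (price, currency) pairs from data-pricevalue / data-currency attributes."""

    def __init__(self):
        super().__init__()
        self.prices = []

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)
        if "data-pricevalue" in attr_map and "data-currency" in attr_map:
            self.prices.append(
                (float(attr_map["data-pricevalue"]), attr_map["data-currency"])
            )


def extract_prices(html: str):
    parser = PriceAttrParser()
    parser.feed(html)
    return parser.prices


# Hypothetical markup: the visible text shows a converted price, the attributes do not.
sample_html = '<span class="price" data-pricevalue="0.50" data-currency="GBP">$0.68</span>'
print(extract_prices(sample_html))
```

The visible "$0.68" never enters the result, which is the point: the attribute pair stays the same no matter where the request originates.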
Limitations 🚧
No closed-sale price history. Discogs hosts sold-price history at /sell/history/{release_id}, but that page is gated behind account login (Auth0) and is out of scope without user OAuth. What you get instead: every active asking price (the live offer side) plus the public lowest_price aggregate. For most reseller and arbitrage workflows the live asking-price distribution is the more actionable signal anyway.
One snapshot per run. The Actor captures marketplace state at a point in time. For time-series tracking, schedule recurring runs via Apify Schedules and concatenate datasets by release_id + scraped_at.
500-listing ceiling per release. Discogs serves 25 listings per page server-side; the maxPagesPerRelease cap of 20 gives a 500-listing hard ceiling per release per run. Releases with more active listings need multiple runs.
Currency is not normalised. asking_price comes in the seller's local currency, stats_lowest_price in the request IP's resolved currency. Join on asking_currency / stats_lowest_currency; no canonical USD conversion is applied.
Throughput is ~40 req/min, imposed by Discogs — a 10-release run with 4 pages each takes roughly 90 seconds plus warm-up. And note the 7-day default storage retention on the Apify FREE plan: export your dataset right after the run or upgrade for longer retention.
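The 90-second figure follows from the request count. The back-of-envelope sketch below assumes two REST calls per release (release metadata plus marketplace stats) on top of the listing pages, and ignores warm-up, search resolution, and retries.

```python
def estimated_seconds(releases: int, pages_per_release: int = 4,
                      api_calls_per_release: int = 2, interval_s: float = 1.5) -> float:
    """Lower-bound runtime: total requests multiplied by the throttle interval."""
    total_requests = releases * (pages_per_release + api_calls_per_release)
    return total_requests * interval_s


print(estimated_seconds(10))
```

Ten releases at four pages each is 60 requests, and 60 requests at 1.5 seconds apiece is 90 seconds.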
FAQ ❓
Is scraping Discogs marketplace listings legal?
The marketplace listings page is public — no login, no paywall. This Actor reads only what the anonymous public UI exposes, paces itself at ~40 req/min (below Discogs' documented 60 req/min ceiling), and authenticates against nothing. As always, verify against your own jurisdiction and use case before running at scale.
Does Discogs have an API for marketplace listings?
Not for per-listing data. The REST API returns only lowest_price and num_for_sale aggregates per release; individual listings (price, condition, seller) live only on the Cloudflare-protected HTML page, which this Actor scrapes.
Can I get the closed-sale / sold-price history?
No. The /sell/history/{release_id} page is gated behind Auth0 login. The "sold price" in the slug is a vestige of the original scope. What the Actor delivers is the ask-side snapshot: every active listing plus the public floor-price aggregate — for most reseller workflows the more actionable data.
Can I export the results to a spreadsheet or warehouse?
Yes — CSV, Excel, JSON, and XML exports from the Apify dataset viewer. You can also webhook the dataset on ACTOR.RUN.SUCCEEDED into Make, Zapier, or n8n, or pull it via the Apify API: GET /datasets/{id}/items?format=csv&clean=true.
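If you prefer to pull the export yourself, the items endpoint is a plain GET. Here is a small sketch that builds the URL; the helper name is ours, and private datasets will additionally need your API token, which is left out here.

```python
from urllib.parse import urlencode

APIFY_API_BASE = "https://api.apify.com/v2"


def dataset_items_url(dataset_id: str, fmt: str = "csv", clean: bool = True) -> str:
    """Build the dataset items export URL for the given format."""
    query = urlencode({"format": fmt, "clean": "true" if clean else "false"})
    return f"{APIFY_API_BASE}/datasets/{dataset_id}/items?{query}"


# "YOUR_DATASET_ID" is a placeholder for run["defaultDatasetId"].
print(dataset_items_url("YOUR_DATASET_ID"))
```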
Try it
The Actor is live: apify.com/DevilScrapes/discogs-sold-price.
Free $5 Apify credit, no credit card. Run it on release ID 249504 (the Rick Astley test classic) and you will have 25 typed listing rows in your dataset in under a minute. Found a field you wish it returned — median price, condition-grade distribution? Drop a comment or open a request on the Store page. We ship based on what builders actually need.
Built by Devil Scrapes — Apify Actors with attitude. Pay-per-event, transparent pricing, no junk fields. 😈