Bandcamp Artist Data Scraping: Music Research and Analytics
Bandcamp is one of the largest platforms for independent music, where artists and labels sell directly to fans. For music researchers, label scouts, and analytics teams, that catalog is a valuable data source, but Bandcamp offers no general-purpose public API for it.
This guide shows you how to extract artist data, pricing patterns, and genre trends from Bandcamp using Python.
What Data Can You Extract?
Bandcamp artist pages contain rich structured data:
- Artist info: name, location, bio, genre tags
- Discography: albums, EPs, singles with release dates
- Pricing: album prices, track prices, "name your price" flags
- Fan data: supporter counts, top supporters
- Label info: label name, catalog number, other releases
- Merch: physical products, bundles, pricing
Setting Up Your Scraper
First, install the required packages:
pip install requests beautifulsoup4 lxml
Bandcamp uses server-rendered HTML with embedded JSON-LD data, making it relatively straightforward to parse.
import json
import time
from urllib.parse import urljoin

import requests
from bs4 import BeautifulSoup


class BandcampScraper:
    def __init__(self, proxy_url=None):
        self.session = requests.Session()
        self.session.headers.update({
            "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36"
        })
        if proxy_url:
            self.session.proxies = {"http": proxy_url, "https": proxy_url}

    def get_artist_info(self, artist_url):
        """Extract artist data from a Bandcamp artist page."""
        resp = self.session.get(artist_url, timeout=30)
        resp.raise_for_status()  # fail on 4xx/5xx instead of parsing an error page
        soup = BeautifulSoup(resp.text, "lxml")

        # Extract embedded JSON-LD data (the tag may be missing or empty)
        ld_json = soup.find("script", type="application/ld+json")
        data = json.loads(ld_json.string) if ld_json and ld_json.string else {}

        # Extract band name and location
        band_name = soup.select_one("#band-name-location .title")
        location = soup.select_one("#band-name-location .location")

        # Extract bio
        bio = soup.select_one(".signed-out-artists-bio-text")

        return {
            "name": band_name.text.strip() if band_name else data.get("name"),
            "location": location.text.strip() if location else "",
            "bio": bio.text.strip() if bio else "",
            "url": artist_url,
            "type": data.get("@type"),
            "image": data.get("image"),
        }

    def get_discography(self, artist_url):
        """Extract all releases from an artist page."""
        music_url = artist_url.rstrip("/") + "/music"
        resp = self.session.get(music_url, timeout=30)
        resp.raise_for_status()
        soup = BeautifulSoup(resp.text, "lxml")

        releases = []
        for item in soup.select(".music-grid-item"):
            link = item.select_one("a")
            title = item.select_one(".title")
            img = item.select_one("img")
            releases.append({
                "title": title.text.strip() if title else "",
                "url": urljoin(artist_url, link["href"]) if link else "",
                # Assumption: lazy-loaded thumbnails keep the real URL in
                # data-original and a placeholder in src
                "art": (img.get("data-original") or img.get("src", "")) if img else "",
            })
        return releases
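The JSON-LD lookup in `get_artist_info` assumes a single, well-formed block. As a minimal sketch of a more defensive version, the helper below collects every JSON-LD script on a page and skips blocks that fail to parse. It uses only the standard library so it can be tried without installing anything; `JsonLdExtractor`, `extract_json_ld`, and the sample HTML are illustrative names and data, not part of Bandcamp or of the scraper class.

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collect the raw text of every <script type="application/ld+json"> tag."""

    def __init__(self):
        super().__init__()
        self._in_json_ld = False
        self._buffer = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_json_ld = True
            self._buffer = []

    def handle_data(self, data):
        # Script content can arrive in several chunks, so buffer it
        if self._in_json_ld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_json_ld:
            self.blocks.append("".join(self._buffer))
            self._in_json_ld = False


def extract_json_ld(html):
    """Return the parsed JSON-LD documents found in an HTML string."""
    parser = JsonLdExtractor()
    parser.feed(html)
    parser.close()
    documents = []
    for raw in parser.blocks:
        try:
            documents.append(json.loads(raw))
        except json.JSONDecodeError:
            continue  # skip malformed blocks instead of failing the whole page
    return documents


# Made-up sample page: one valid block and one broken block
sample_html = """
<html><head>
<script type="application/ld+json">{"@type": "MusicGroup", "name": "Example Artist"}</script>
<script type="application/ld+json">{not valid json}</script>
</head><body></body></html>
"""
docs = extract_json_ld(sample_html)
print(docs)
```

In the scraper itself, the same idea could be applied by looping over `soup.find_all("script", type="application/ld+json")` and wrapping each `json.loads` call in a `try` block.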
Extracting Album Pricing Data
Pricing analysis is where Bandcamp data gets really interesting for market research:
    # Method of the BandcampScraper class defined above
    def get_album_details(self, album_url):
        """Extract pricing and track details from an album page."""
        resp = self.session.get(album_url, timeout=30)
        resp.raise_for_status()
        soup = BeautifulSoup(resp.text, "lxml")

        price_el = soup.select_one(".buyItem .base-text-color")
        nyp = soup.select_one(".buyItem .name-your-price")

        tracks = []
        for row in soup.select(".track_list .track_row_view"):
            track_title = row.select_one(".track-title")
            duration = row.select_one(".time")
            tracks.append({
                "title": track_title.text.strip() if track_title else "",
                "duration": duration.text.strip() if duration else ""
            })

        return {
            "url": album_url,
            # None means no price element was found; that alone does not
            # prove the album is free (the markup may have changed)
            "price": price_el.text.strip() if price_el else None,
            "name_your_price": nyp is not None,
            "tracks": tracks,
            "track_count": len(tracks)
        }
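Once a batch of album records has been collected, the pricing analysis itself is simple aggregation. The sketch below is one possible approach, not a definitive method: it assumes price strings look like `"$7 USD"` (a number followed by a three-letter currency code, with commas used only as thousands separators), which should be checked against real scraped values. `parse_price` and `summarize_pricing` are hypothetical helper names, and the sample records are invented.

```python
import re
from statistics import median

# Assumption: a number, optionally followed by a 3-letter currency code
PRICE_RE = re.compile(r"(\d[\d,]*(?:\.\d+)?)\s*([A-Z]{3})?")


def parse_price(text):
    """Return (amount, currency) from text like '$7 USD', or (None, None)."""
    match = PRICE_RE.search(text or "")
    if not match:
        return None, None
    amount = float(match.group(1).replace(",", ""))
    return amount, match.group(2)


def summarize_pricing(albums):
    """Summarize dicts shaped like the return value of get_album_details()."""
    by_currency = {}
    nyp_count = 0
    for album in albums:
        if album.get("name_your_price"):
            nyp_count += 1
        amount, currency = parse_price(album.get("price"))
        # Prices without a currency code are skipped: they cannot be grouped
        if amount is not None and currency:
            by_currency.setdefault(currency, []).append(amount)
    return {
        "album_count": len(albums),
        "nyp_share": nyp_count / len(albums) if albums else 0.0,
        "median_price": {cur: median(vals) for cur, vals in by_currency.items()},
    }


# Invented sample records for illustration
sample_albums = [
    {"price": "$7 USD", "name_your_price": False},
    {"price": "$10 USD", "name_your_price": False},
    {"price": "€5.50 EUR", "name_your_price": False},
    {"price": "Free", "name_your_price": True},
]
summary = summarize_pricing(sample_albums)
print(summary)
```

Medians are kept per currency rather than converted to a single one, since exchange rates are outside what the scraped page provides.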
Genre and Tag Discovery
Bandcamp tags are excellent for music discovery and genre research:
    # Method of the BandcampScraper class defined above
    def explore_tag(self, tag, sort="pop", page=1):
        """Explore releases by genre tag."""
        url = f"https://bandcamp.com/tag/{tag}?sort_field={sort}&page={page}"
        resp = self.session.get(url, timeout=30)
        resp.raise_for_status()
        soup = BeautifulSoup(resp.text, "lxml")

        results = []
        for item in soup.select(".item_list .item"):
            artist = item.select_one(".itemsubtext")
            title = item.select_one(".itemtext")
            results.append({
                "title": title.text.strip() if title else "",
                "artist": artist.text.strip() if artist else "",
                "tag": tag
            })
        return results
# Discover trending electronic music (sort="pop" ranks by popularity;
# use sort="date" for the newest releases instead)
scraper = BandcampScraper()
trending = scraper.explore_tag("electronic", sort="pop")
for item in trending[:10]:
    print(f"{item['artist']} - {item['title']}")
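A single call returns one page of results. To walk several pages of a tag, a loop along the following lines may help. This is a sketch under assumptions: it presumes the `page` parameter behaves as `explore_tag` implies, and `collect_tag_pages` is a hypothetical helper. It takes the page-fetching function as an argument so the logic can be tried with fake data before pointing it at the site.

```python
import time


def collect_tag_pages(fetch_page, max_pages=5, delay=2.0):
    """Collect results across pages, dropping duplicates.

    fetch_page(page) must return a list of dicts with "artist" and "title"
    keys, for example: lambda p: scraper.explore_tag("electronic", page=p)
    """
    seen = set()
    collected = []
    for page in range(1, max_pages + 1):
        added = 0
        for item in fetch_page(page):
            key = (item["artist"], item["title"])
            if key not in seen:
                seen.add(key)
                collected.append(item)
                added += 1
        if added == 0:
            break  # empty page or only repeats: assume the listing is exhausted
        if page < max_pages:
            time.sleep(delay)  # pause between page requests
    return collected


# Fake pages standing in for real responses
fake_pages = {
    1: [{"artist": "A", "title": "One"}, {"artist": "B", "title": "Two"}],
    2: [{"artist": "B", "title": "Two"}, {"artist": "C", "title": "Three"}],
    3: [],
}
items = collect_tag_pages(lambda p: fake_pages.get(p, []), delay=0)
print([f"{i['artist']} - {i['title']}" for i in items])
```

Stopping on the first page that adds nothing new guards against listings that repeat their last page instead of returning an empty one.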
Using Proxies for Scale
For large-scale Bandcamp research across thousands of artists, you need proxy rotation to avoid rate limiting. ThorData residential proxies provide reliable access with automatic IP rotation:
scraper = BandcampScraper(
    proxy_url="http://user:pass@proxy.thordata.com:9000"
)

artist_urls = [
    "https://artist1.bandcamp.com",
    "https://artist2.bandcamp.com",
]

all_data = []
for url in artist_urls:
    info = scraper.get_artist_info(url)
    info["discography"] = scraper.get_discography(url)
    all_data.append(info)
    time.sleep(2)  # Respectful delay
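With or without proxies, a long-running crawl will occasionally receive a rate-limit or server-error response. One common way to handle that is exponential backoff, sketched below. `fetch_with_backoff` is a hypothetical helper, the retry status list is a conventional choice rather than something Bandcamp documents, and the `fetch` callable is injected so the retry logic can be exercised without network access.

```python
import time

# Conventional "try again later" statuses; an assumption, not a Bandcamp spec
RETRY_STATUSES = {429, 500, 502, 503, 504}


def fetch_with_backoff(fetch, url, max_attempts=4, base_delay=2.0, sleep=time.sleep):
    """Call fetch(url) -> (status_code, body), retrying with growing delays.

    Waits base_delay, then 2x, then 4x, and so on between attempts.
    Returns the last (status_code, body) pair seen.
    """
    status, body = None, None
    for attempt in range(max_attempts):
        status, body = fetch(url)
        if status not in RETRY_STATUSES:
            break
        if attempt < max_attempts - 1:
            sleep(base_delay * 2 ** attempt)
    return status, body


# With the scraper from this guide, fetch could be defined as:
#   def fetch(url):
#       resp = scraper.session.get(url, timeout=30)
#       return resp.status_code, resp.text

# Simulated server: two rate-limit responses, then success
responses = iter([(429, ""), (429, ""), (200, "<html>ok</html>")])
sleeps = []
status, body = fetch_with_backoff(
    lambda url: next(responses),
    "https://artist1.bandcamp.com",
    sleep=sleeps.append,  # record delays instead of actually waiting
)
print(status, sleeps)
```

Passing `sleep` as a parameter is only there to make the delays observable in a dry run; in real use the default `time.sleep` applies.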
Production-Ready Solution
For production use cases such as label research, market analysis, or building music discovery tools, the Bandcamp Scraper on Apify is designed to handle the common edge cases: JavaScript rendering, pagination, rate limiting, and structured data output.
It exports clean JSON with artist profiles, discographies, pricing, and tags — ready for analysis or database import.
Use Cases for Bandcamp Data
| Use Case | Data Needed |
|---|---|
| Label scouting | Artist info, genre tags, fan counts |
| Pricing research | Album prices, NYP stats, currency |
| Genre analysis | Tag exploration, trending releases |
| Music discovery | Discographies, related artists |
| Market sizing | Release counts, pricing distributions |
Ethical Considerations
- Respect rate limits: Add delays between requests (2+ seconds)
- Cache responses: Do not re-scrape the same pages repeatedly
- Public data only: Never attempt to access private fan data or sales figures
- Attribution: Credit Bandcamp as your data source
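The first two points, delays and caching, can be combined in one small wrapper. The following is a minimal sketch, assuming pages can be cached indefinitely on local disk; `CachedFetcher` is a hypothetical class, and the `fetch` callable (URL in, HTML string out) is supplied by the caller, so the demo below runs against a fake instead of the live site.

```python
import hashlib
import tempfile
import time
from pathlib import Path


class CachedFetcher:
    """Serve pages from a disk cache; throttle only real network fetches."""

    def __init__(self, fetch, cache_dir, min_interval=2.0):
        self.fetch = fetch  # callable: url -> HTML string
        self.cache_dir = Path(cache_dir)
        self.cache_dir.mkdir(parents=True, exist_ok=True)
        self.min_interval = min_interval
        self._last_request = None

    def _throttle(self):
        # Sleep just long enough to keep min_interval between fetches
        if self._last_request is not None:
            wait = self.min_interval - (time.monotonic() - self._last_request)
            if wait > 0:
                time.sleep(wait)
        self._last_request = time.monotonic()

    def get(self, url):
        name = hashlib.sha256(url.encode("utf-8")).hexdigest() + ".html"
        path = self.cache_dir / name
        if path.exists():
            return path.read_text(encoding="utf-8")  # cache hit: no request
        self._throttle()
        body = self.fetch(url)
        path.write_text(body, encoding="utf-8")
        return body


# Demo with a fake fetch function that records each call
calls = []


def fake_fetch(url):
    calls.append(url)
    return "<html>artist page</html>"


with tempfile.TemporaryDirectory() as tmp:
    fetcher = CachedFetcher(fake_fetch, tmp, min_interval=0)
    first = fetcher.get("https://artist1.bandcamp.com")
    second = fetcher.get("https://artist1.bandcamp.com")

print(len(calls))  # the second get() is served from disk
```

With the scraper from this guide, `fetch` could be a small function that calls `scraper.session.get(url, timeout=30)` and returns `resp.text`. A real deployment would likely also want cache expiry, which this sketch leaves out.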
Conclusion
Bandcamp is a goldmine for music industry research. While there is no official API, the structured HTML and embedded JSON-LD make it one of the cleaner sites to scrape. Combine Python extraction with ThorData proxies for reliable large-scale collection, or use a pre-built Bandcamp scraper for instant results.