DEV Community

agenthustler

Bandcamp Scraping: Extract Music, Artists, and Fan Data

Bandcamp is one of the most artist-friendly music platforms on the internet, hosting millions of tracks from independent musicians across every genre imaginable. Unlike streaming giants, Bandcamp gives artists direct control over pricing and provides detailed track metadata, making it a goldmine for music data analysis, market research, and building recommendation systems.

In this guide, we'll explore how to scrape Bandcamp data — including track metadata, album information, artist pages, and fan activity — effectively and responsibly.

Understanding Bandcamp's Data Structure

Bandcamp's architecture is fundamentally different from platforms like Spotify or Apple Music. Each artist gets their own subdomain (e.g., artist.bandcamp.com), making the site feel like a network of individual storefronts rather than a monolithic platform.
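
Because the URL shape encodes the page type, a crawler can route requests before fetching anything. The sketch below shows one way to do that. The function name classifyBandcampUrl is hypothetical, and it assumes only the artist.bandcamp.com subdomain pattern plus the /album/ and /track/ paths used later in this guide; artists on custom domains would come back as 'unknown'.

```javascript
// Classify a Bandcamp URL by its shape (hypothetical helper).
// Assumes the <artist>.bandcamp.com pattern; custom domains are not handled.
function classifyBandcampUrl(rawUrl) {
  const { hostname, pathname } = new URL(rawUrl);
  const suffix = '.bandcamp.com';

  if (!hostname.endsWith(suffix)) {
    return { type: 'unknown', artistSlug: null };
  }

  // Everything before ".bandcamp.com" is the artist's subdomain
  const artistSlug = hostname.slice(0, hostname.length - suffix.length);

  if (pathname.startsWith('/album/')) return { type: 'album', artistSlug };
  if (pathname.startsWith('/track/')) return { type: 'track', artistSlug };
  return { type: 'artist', artistSlug };
}

console.log(classifyBandcampUrl('https://artist.bandcamp.com/album/album-name'));
```

A request handler could switch on the returned type instead of repeating string checks on the URL.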

Tracks

Individual tracks are the atomic unit of content on Bandcamp. Each track contains:

  • Title and duration — the track name and length
  • Album association — which album (if any) the track belongs to
  • Price — name-your-price minimum or fixed price
  • Streaming URL — a temporary URL for the audio preview
  • Lyrics — full lyrics if the artist has provided them
  • Credits — musician credits and production details
  • Tags/genres — artist-applied genre tags
  • Play count and fan count — engagement metrics
  • Release date — when the track was published

Albums

Albums group tracks together and add additional metadata:

  • Album title and artist
  • Track listing — ordered list of all tracks
  • Cover art URL — the album artwork
  • Total price — the album's price (often discounted vs. individual tracks)
  • Release date — the official release date
  • About/description — artist's notes about the album
  • Tags — genre and descriptive tags
  • UPC/catalog number — if provided by the artist

Artist Pages

Artist pages serve as storefronts with rich information:

  • Artist/band name and location
  • Bio/description — the artist's story
  • Discography — all albums and singles
  • Merch listings — physical products for sale
  • Shows/tour dates — upcoming performances
  • Links — social media and website URLs
  • Fan count — total followers

Fan Profiles

Bandcamp fans (buyers/followers) have public profiles showing:

  • Username and avatar
  • Collection — music they've purchased
  • Wishlist — music they want to buy
  • Following — artists and labels they follow
  • Fan-since date — when they joined

Why Scrape Bandcamp?

Bandcamp data has numerous practical applications:

  1. Music Discovery: Build recommendation engines based on genre tags and fan overlap
  2. Market Research: Analyze pricing strategies across genres and regions
  3. Trend Spotting: Identify emerging genres and artists before they break out
  4. Academic Research: Study independent music economics and distribution patterns
  5. Label Scouting: Find promising unsigned artists based on engagement metrics
  6. Data Journalism: Report on the state of independent music

Bandcamp's Technical Architecture

Bandcamp uses a relatively straightforward server-rendered architecture. Most pages are delivered as complete HTML with embedded JSON data, which makes scraping more reliable than heavily JavaScript-dependent sites.

Embedded JSON Data (TralbumData)

One of the most valuable aspects of Bandcamp's architecture for scrapers is that each album and track page embeds structured data directly in the HTML. Look for a script tag carrying a data-tralbum attribute; its value is a JSON string which, once parsed, looks roughly like this:

// This JSON is embedded right in the HTML!
const tralbumData = {
  current: {
    title: "Album Title",
    artist: "Artist Name",
    release_date: "01 Mar 2026 00:00:00 GMT",
    minimum_price: 7.00,
    art_id: 1234567890,
  },
  trackinfo: [
    {
      title: "Track One",
      duration: 234.5,
      file: { "mp3-128": "https://..." },
      track_num: 1,
    },
    // ... more tracks
  ],
  url: "https://artist.bandcamp.com/album/album-name",
};

This is extremely convenient — you don't need to make separate API calls or parse complex DOM structures. The data is right there in the page source.
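
Once the object is parsed, simple summaries fall out of it directly. The following is a minimal sketch, assuming the payload has the shape shown above (a current object and a trackinfo array with duration in seconds); the helper names formatDuration and summarizeTralbum are hypothetical.

```javascript
// Format a duration in seconds as m:ss (hypothetical helper).
function formatDuration(totalSeconds) {
  const rounded = Math.round(totalSeconds);
  const minutes = Math.floor(rounded / 60);
  const seconds = String(rounded % 60).padStart(2, '0');
  return `${minutes}:${seconds}`;
}

// Summarize a parsed TralbumData-like object.
// Assumes the field names from the sample above; missing fields are tolerated.
function summarizeTralbum(tralbum) {
  const tracks = tralbum.trackinfo || [];
  const totalSeconds = tracks.reduce((sum, t) => sum + (t.duration || 0), 0);
  return {
    title: tralbum.current?.title ?? null,
    trackCount: tracks.length,
    // A track counts as streamable when it carries a file entry
    streamableCount: tracks.filter((t) => Boolean(t.file)).length,
    runtime: formatDuration(totalSeconds),
  };
}

const sample = {
  current: { title: 'Album Title' },
  trackinfo: [
    { title: 'Track One', duration: 234.5, file: { 'mp3-128': 'https://...' } },
    { title: 'Track Two', duration: 65, file: null },
  ],
};
console.log(summarizeTralbum(sample));
```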

The Bandcamp API (Internal)

Bandcamp also has internal API endpoints used by its frontend. Some useful ones include:

https://bandcamp.com/api/discover/3/get_web
https://bandcamp.com/api/bcweekly/3/list
https://bandcamp.com/api/fancollection/1/collection_items

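These endpoints return JSON and accept parameters for filtering and pagination. Because they are internal and undocumented, their paths, parameters, and response fields can change without notice, so treat any code built on them as fragile.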

Building a Bandcamp Track Scraper

Let's build a scraper that extracts track and album data from Bandcamp artist pages.

Setting Up the Project

mkdir bandcamp-scraper && cd bandcamp-scraper
npm init -y
npm install crawlee cheerio

Extracting Album Data

const { CheerioCrawler, Dataset } = require('crawlee');

const crawler = new CheerioCrawler({
  maxRequestsPerCrawl: 50,
  async requestHandler({ $, request, log, enqueueLinks }) {
    const url = request.url;

    // If this is an artist page, find all album links
    if (!url.includes('/album/') && !url.includes('/track/')) {
      log.info(`Scanning artist page: ${url}`);

      // Enqueue all album links
      await enqueueLinks({
        selector: 'a[href*="/album/"]',
        baseUrl: url,
      });
      return;
    }

    // This is an album page — extract the embedded data
    log.info(`Scraping album: ${url}`);

    // Parse defensively: a malformed or non-JSON payload should not crash the handler
    const safeParse = (text) => {
      try {
        return JSON.parse(text);
      } catch {
        return null;
      }
    };

    // Preferred source: the data-tralbum attribute on a script tag
    let data = safeParse($('script[data-tralbum]').attr('data-tralbum') || '');

    if (!data) {
      // Fallback: look for an inline TralbumData assignment in a script tag.
      // That payload may be a JavaScript object literal rather than strict JSON,
      // in which case safeParse returns null and the page is skipped.
      for (const script of $('script').toArray()) {
        const text = $(script).html();
        const match = text && text.match(/TralbumData\s*=\s*({.*?});/s);
        if (match) {
          data = safeParse(match[1]);
          if (data) break;
        }
      }
    }

    if (!data) {
      log.warning(`No parsable TralbumData found on ${url}`);
      return;
    }

    await processAlbumData(data, url);
  },
});

async function processAlbumData(data, url) {
  const album = {
    url,
    title: data.current?.title,
    artist: data.current?.artist || data.artist,
    releaseDate: data.current?.release_date,
    minimumPrice: data.current?.minimum_price,
    currency: data.current?.currency,
    about: data.current?.about,
    tracks: (data.trackinfo || []).map(track => ({
      title: track.title,
      duration: track.duration,
      trackNumber: track.track_num,
      hasLyrics: !!track.has_lyrics,
      isStreamable: !!track.file,
    })),
    tags: data.current?.tags || [],
    trackCount: data.trackinfo?.length || 0,
    scrapedAt: new Date().toISOString(),
  };

  await Dataset.pushData(album);
}

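// Run starting from an artist page. CommonJS modules (require) do not allow
// top-level await, so handle the returned promise explicitly.
crawler.run(['https://artist.bandcamp.com/']).catch((err) => {
  console.error(err);
  process.exit(1);
});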
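
Album records with a nested tracks array are awkward to load into spreadsheets. As a rough sketch, the helpers below flatten one album record into per-track rows and serialize them as CSV. They assume the record shape produced by processAlbumData above; the names flattenAlbumToTrackRows and toCsv are hypothetical.

```javascript
// Turn one album record into one row per track (hypothetical helper).
function flattenAlbumToTrackRows(album) {
  return (album.tracks || []).map((track) => ({
    albumUrl: album.url,
    albumTitle: album.title ?? null,
    artist: album.artist ?? null,
    trackNumber: track.trackNumber ?? null,
    trackTitle: track.title ?? null,
    durationSeconds: track.duration ?? null,
  }));
}

// Serialize rows to CSV, quoting values that contain commas, quotes, or newlines.
function toCsv(rows) {
  if (rows.length === 0) return '';
  const headers = Object.keys(rows[0]);
  const escapeCell = (value) => {
    const text = value === null || value === undefined ? '' : String(value);
    return /[",\n]/.test(text) ? `"${text.replace(/"/g, '""')}"` : text;
  };
  const lines = rows.map((row) => headers.map((h) => escapeCell(row[h])).join(','));
  return [headers.join(','), ...lines].join('\n');
}

const exampleAlbum = {
  url: 'https://artist.bandcamp.com/album/album-name',
  title: 'Album Title',
  artist: 'Artist Name',
  tracks: [{ title: 'Track, One', duration: 234.5, trackNumber: 1 }],
};
console.log(toCsv(flattenAlbumToTrackRows(exampleAlbum)));
```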

Scraping Artist Profile Details

async function scrapeArtistPage($, url) {
  const bandData = {};

  // Extract basic info
  bandData.name = $('#band-name-location .title').text().trim();
  bandData.location = $('#band-name-location .location').text().trim();
  bandData.bio = $('meta[property="og:description"]').attr('content') || '';
  bandData.imageUrl = $('img.band-photo').attr('src') || null;
  bandData.url = url;

  // Extract discography links
  bandData.albums = [];
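  $('#music-grid .music-grid-item').each((i, el) => {
    const $el = $(el);
    const href = $el.find('a').attr('href');
    // Skip grid items without a link; resolving an undefined href would yield a bogus URL
    if (!href) return;
    bandData.albums.push({
      title: $el.find('.title').text().trim(),
      url: new URL(href, url).toString(),
      artUrl: $el.find('img').attr('src') || null,
    });
  });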

  // Extract links
  bandData.links = [];
  $('#band-links a').each((i, el) => {
    bandData.links.push({
      text: $(el).text().trim(),
      url: $(el).attr('href'),
    });
  });

  return bandData;
}
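
The links collected by scrapeArtistPage are raw URLs. One possible follow-up step is to label each one by platform so they are easier to filter later. This is only a sketch: the KNOWN_PLATFORMS table and the labelLink and labelLinks names are illustrative assumptions, not part of any Bandcamp or Crawlee API.

```javascript
// Hostname-to-label table; illustrative only, extend as needed.
const KNOWN_PLATFORMS = {
  'instagram.com': 'instagram',
  'facebook.com': 'facebook',
  'youtube.com': 'youtube',
  'soundcloud.com': 'soundcloud',
};

// Label a single URL by its hostname (hypothetical helper).
function labelLink(rawUrl) {
  let hostname;
  try {
    hostname = new URL(rawUrl).hostname.toLowerCase();
  } catch {
    return 'invalid'; // not an absolute URL
  }
  const bare = hostname.replace(/^www\./, '');
  for (const [domain, label] of Object.entries(KNOWN_PLATFORMS)) {
    // Match the domain itself or any subdomain of it
    if (bare === domain || bare.endsWith(`.${domain}`)) return label;
  }
  return 'website';
}

// Add a platform field to each { text, url } entry from scrapeArtistPage.
function labelLinks(links) {
  return links.map((link) => ({ ...link, platform: labelLink(link.url) }));
}

console.log(labelLinks([{ text: 'Instagram', url: 'https://www.instagram.com/someband' }]));
```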

Extracting Fan Collection Data

Fan collections reveal purchasing patterns and taste profiles:

async function scrapeFanCollection(fanUrl) {
  const { CheerioCrawler } = require('crawlee');
  const collections = [];

  const crawler = new CheerioCrawler({
    async requestHandler({ $, request, log }) {
      log.info(`Scraping fan page: ${request.url}`);

      // Extract collection items
      const itemsData = $('div[data-blob]').attr('data-blob');
      if (itemsData) {
        const blob = JSON.parse(itemsData);
        const items = blob.item_cache || {};

        Object.values(items).forEach(item => {
          collections.push({
            type: item.tralbum_type === 'a' ? 'album' : 'track',
            title: item.album_title || item.title,
            artist: item.band_name,
            purchaseDate: item.purchased,
            itemUrl: item.item_url,
            artId: item.art_id,
          });
        });
      }

      // Extract fan info and include it in the log line
      const fanName = $('#fan-name').text().trim();
      const fanSince = $('.fan-since').text().trim();

      log.info(`Fan: ${fanName} (${fanSince}), Collection: ${collections.length} items`);
    },
  });

  await crawler.run([fanUrl]);
  return collections;
}
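
To turn a raw collection into the kind of taste profile mentioned above, one simple option is to count purchases per artist. The sketch below assumes the item shape returned by scrapeFanCollection (an artist field on each entry); buildTasteProfile is a hypothetical name.

```javascript
// Rank the artists a fan has bought from most often (hypothetical helper).
function buildTasteProfile(collection, topN = 5) {
  const counts = new Map();
  for (const item of collection) {
    const artist = item.artist || 'Unknown artist';
    counts.set(artist, (counts.get(artist) || 0) + 1);
  }
  return [...counts.entries()]
    .map(([artist, purchases]) => ({ artist, purchases }))
    // Most purchases first; ties broken alphabetically for stable output
    .sort((a, b) => b.purchases - a.purchases || a.artist.localeCompare(b.artist))
    .slice(0, topN);
}

const exampleCollection = [
  { artist: 'A', title: 'First' },
  { artist: 'B', title: 'Second' },
  { artist: 'A', title: 'Third' },
  { artist: 'C', title: 'Fourth' },
];
console.log(buildTasteProfile(exampleCollection, 2));
```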

Scraping Bandcamp Discover and Tags

Bandcamp's discover page and tag system are excellent for trend analysis:

async function scrapeDiscoverPage(genre, subgenre = null) {
  const params = new URLSearchParams({
    g: genre,          // e.g., 'electronic', 'rock', 'hip-hop-rap'
    t: 'top',          // 'top', 'new', 'rec'
    f: 'all',          // format: 'all', 'digital', 'vinyl'
    w: 0,              // time window: 0=all, 1=past week, 2=past month
    p: 0,              // page number
  });

  if (subgenre) {
    params.set('s', subgenre);
  }

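  const response = await fetch(
    `https://bandcamp.com/api/discover/3/get_web?${params}`
  );
  // Fail loudly on HTTP errors instead of trying to parse an error page as JSON
  if (!response.ok) {
    throw new Error(`Discover request failed with status ${response.status}`);
  }
  const data = await response.json();

  // Tolerate a response without an items array
  return (data.items || []).map(item => ({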
    title: item.primary_text,
    artist: item.secondary_text,
    genre: item.genre_text,
    url: item.tralbum_url,
    artUrl: `https://f4.bcbits.com/img/a${item.art_id}_16.jpg`,
    featuredDate: item.featured_date_s,
  }));
}
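
The function above fetches a single page. A sketch of paging through several pages with a pause between requests might look like the following. It reuses the query parameters shown above (g, t, f, w, p, s); the names buildDiscoverUrl and collectDiscoverPages, the maxPages and delayMs options, and the injectable fetchJson function are all assumptions made for illustration.

```javascript
// Build a discover URL from named options (hypothetical helper).
function buildDiscoverUrl({ genre, sort = 'top', format = 'all', timeWindow = 0, page = 0, subgenre = null }) {
  const params = new URLSearchParams({
    g: genre,
    t: sort,
    f: format,
    w: String(timeWindow),
    p: String(page),
  });
  if (subgenre) params.set('s', subgenre);
  return `https://bandcamp.com/api/discover/3/get_web?${params}`;
}

// Default JSON fetcher; pass a different function to stub the network in tests.
async function defaultFetchJson(url) {
  const response = await fetch(url);
  if (!response.ok) throw new Error(`Request failed with status ${response.status}`);
  return response.json();
}

// Fetch pages sequentially until maxPages is reached or a page comes back empty.
async function collectDiscoverPages(genre, { maxPages = 3, delayMs = 2000, fetchJson = defaultFetchJson } = {}) {
  const all = [];
  for (let page = 0; page < maxPages; page++) {
    const data = await fetchJson(buildDiscoverUrl({ genre, page }));
    const items = data.items || [];
    if (items.length === 0) break;
    all.push(...items);
    // Pause between pages to keep the request rate low
    if (page < maxPages - 1) {
      await new Promise((resolve) => setTimeout(resolve, delayMs));
    }
  }
  return all;
}
```

Injecting fetchJson keeps the paging logic testable without touching Bandcamp's servers.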

Tag-Based Scraping

Bandcamp's tag pages group music by genre and descriptive tags:

async function scrapeTagPage($, tagUrl) {
  const results = [];

  // Extract albums from the tag page
  $('.item_list .item').each((i, el) => {
    const $el = $(el);
    results.push({
      title: $el.find('.itemtext').text().trim(),
      artist: $el.find('.itemsubtext').text().trim(),
      url: $el.find('a').attr('href'),
      artUrl: $el.find('img').attr('src'),
    });
  });

  // Get related tags
  const relatedTags = [];
  $('.tags_cloud a').each((i, el) => {
    relatedTags.push({
      tag: $(el).text().trim(),
      url: $(el).attr('href'),
    });
  });

  return { results, relatedTags };
}
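
Tags become more useful for trend analysis when you look at which ones appear together. Here is a minimal sketch that counts tag pairs across scraped albums. It assumes each album record has a tags array of plain strings, which may not match the exact shape Bandcamp returns; countTagPairs is a hypothetical name.

```javascript
// Count how often two tags appear on the same album (hypothetical helper).
// Assumes album.tags is an array of strings.
function countTagPairs(albums) {
  const counts = new Map();
  for (const album of albums) {
    // Normalize, de-duplicate, and sort so each pair has one canonical key
    const tags = [...new Set((album.tags || []).map((t) => t.trim().toLowerCase()))]
      .filter(Boolean)
      .sort();
    for (let i = 0; i < tags.length; i++) {
      for (let j = i + 1; j < tags.length; j++) {
        const key = `${tags[i]} + ${tags[j]}`;
        counts.set(key, (counts.get(key) || 0) + 1);
      }
    }
  }
  return [...counts.entries()]
    .map(([pair, count]) => ({ pair, count }))
    .sort((a, b) => b.count - a.count || a.pair.localeCompare(b.pair));
}

const taggedAlbums = [
  { tags: ['ambient', 'drone'] },
  { tags: ['Ambient', 'drone', 'dark'] },
  { tags: ['ambient'] },
];
console.log(countTagPairs(taggedAlbums));
```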

Scaling with Apify

While building your own Bandcamp scraper is educational, running it at scale requires infrastructure for proxy management, scheduling, and result storage. Apify provides all of this out of the box.

Why Use Apify for Bandcamp Scraping?

  1. Cloud execution — no need to keep your machine running
  2. Proxy management — automatic IP rotation to avoid rate limiting
  3. Data storage — built-in datasets with export to JSON, CSV, Excel
  4. Scheduling — run scrapers hourly, daily, or weekly
  5. Monitoring — get alerts when scrapers fail
  6. Pay-per-result — cost-effective pricing model

Using Bandcamp Scrapers from the Apify Store

The Apify Store offers pre-built scrapers for various music platforms. These actors handle the complexities of scraping — pagination, rate limiting, proxy rotation — so you can focus on your analysis.

To get started:

  1. Sign up at apify.com
  2. Search the Store for music and Bandcamp-related actors
  3. Configure inputs — specify artist URLs, genres, or search terms
  4. Run and download — execute in the cloud and get structured results

Running Bandcamp Scrapers via the API

const { ApifyClient } = require('apify-client');

const client = new ApifyClient({
  token: 'YOUR_APIFY_TOKEN',
});

async function scrapeBandcampArtist(artistUrl) {
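  // The input fields below are placeholders: every actor defines its own
  // input schema, so check the chosen actor's documentation for the real names.
  const run = await client.actor('YOUR_ACTOR_ID').call({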
    startUrls: [{ url: artistUrl }],
    maxAlbums: 50,
    includeTracks: true,
    includeFans: false,
    proxy: {
      useApifyProxy: true,
    },
  });

  const { items } = await client.dataset(run.defaultDatasetId).listItems();
  console.log(`Scraped ${items.length} albums/tracks`);
  return items;
}

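// Usage: handle rejections so a failed run does not go unnoticed
scrapeBandcampArtist('https://artist.bandcamp.com/').catch((err) => {
  console.error('Scrape failed:', err);
});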

Building a Genre Trend Monitor

Combine Apify scheduling with Bandcamp's discover API to track genre trends over time:

const { Actor } = require('apify');

Actor.main(async () => {
  // getInput() resolves to null when the run has no input, so fall back to an empty object
  const input = (await Actor.getInput()) ?? {};
  const { genres = ['electronic', 'indie', 'hip-hop-rap'] } = input;

  const results = [];

  for (const genre of genres) {
    // Encode the genre so unusual characters cannot break the query string
    const response = await fetch(
      `https://bandcamp.com/api/discover/3/get_web?g=${encodeURIComponent(genre)}&t=top&f=all&w=1&p=0`
    );
    const data = await response.json();

    data.items.forEach(item => {
      results.push({
        genre,
        title: item.primary_text,
        artist: item.secondary_text,
        url: item.tralbum_url,
        featuredDate: item.featured_date_s,
        scrapedAt: new Date().toISOString(),
      });
    });
  }

  await Actor.pushData(results);
  console.log(`Tracked ${results.length} trending items across ${genres.length} genres`);
});
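
Tracking trends over time means comparing one run's results with the previous run's. A minimal sketch of that comparison, keyed on the url field stored above, could look like this. The function name diffSnapshots is hypothetical, and how you load the previous snapshot (a dataset, a file, a key-value store) is left open.

```javascript
// Compare two snapshots of trending items by URL (hypothetical helper).
function diffSnapshots(previous, current) {
  const previousUrls = new Set(previous.map((item) => item.url));
  const currentUrls = new Set(current.map((item) => item.url));
  return {
    entered: current.filter((item) => !previousUrls.has(item.url)),
    dropped: previous.filter((item) => !currentUrls.has(item.url)),
    stayed: current.filter((item) => previousUrls.has(item.url)),
  };
}

const lastWeek = [
  { url: 'https://a.bandcamp.com/album/one', title: 'One' },
  { url: 'https://b.bandcamp.com/album/two', title: 'Two' },
];
const thisWeek = [
  { url: 'https://b.bandcamp.com/album/two', title: 'Two' },
  { url: 'https://c.bandcamp.com/album/three', title: 'Three' },
];
console.log(diffSnapshots(lastWeek, thisWeek));
```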

Data Analysis: What Can You Do With Bandcamp Data?

Pricing Analysis

function analyzePricing(albums) {
  const priced = albums.filter(a => a.minimumPrice > 0);
  const nameYourPrice = albums.filter(a => a.minimumPrice === 0);

  // Guard against division by zero when no album has a positive price
  const avgPrice = priced.length
    ? priced.reduce((sum, a) => sum + a.minimumPrice, 0) / priced.length
    : 0;

  const priceByGenre = {};
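  albums.forEach(album => {
    // Skip albums without a numeric price so per-genre averages do not become NaN
    if (typeof album.minimumPrice !== 'number') return;
    (album.tags || []).forEach(tag => {
      if (!priceByGenre[tag]) priceByGenre[tag] = [];
      priceByGenre[tag].push(album.minimumPrice);
    });
  });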

  const avgByGenre = {};
  Object.entries(priceByGenre).forEach(([genre, prices]) => {
    avgByGenre[genre] = prices.reduce((a, b) => a + b, 0) / prices.length;
  });

  return {
    totalAlbums: albums.length,
    pricedAlbums: priced.length,
    nameYourPriceAlbums: nameYourPrice.length,
    averagePrice: avgPrice.toFixed(2),
    averagePriceByGenre: avgByGenre,
  };
}
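
Averages are easily skewed by a handful of expensive releases, so a median is often a useful companion figure. The helpers below are a small sketch under the same assumptions as analyzePricing (album records with a numeric minimumPrice); the names median and medianPrice are hypothetical.

```javascript
// Median of a list of numbers; returns null for an empty list.
function median(values) {
  if (values.length === 0) return null;
  const sorted = [...values].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 0
    ? (sorted[mid - 1] + sorted[mid]) / 2
    : sorted[mid];
}

// Median minimum price across albums that have a positive numeric price.
function medianPrice(albums) {
  const prices = albums
    .map((album) => album.minimumPrice)
    .filter((price) => typeof price === 'number' && price > 0);
  return median(prices);
}

const pricedAlbums = [
  { minimumPrice: 0 },
  { minimumPrice: 5 },
  { minimumPrice: 7 },
  { minimumPrice: 100 },
  {},
];
console.log(medianPrice(pricedAlbums));
```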

Fan Overlap Analysis

function findFanOverlap(artist1Fans, artist2Fans) {
  const set1 = new Set(artist1Fans.map(f => f.username));
  const overlap = artist2Fans.filter(f => set1.has(f.username));

  return {
    artist1FanCount: artist1Fans.length,
    artist2FanCount: artist2Fans.length,
    overlapCount: overlap.length,
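    // Share of the smaller fan base that overlaps; yields "0.0" when either list is empty
    overlapPercentage: (
      (overlap.length / Math.max(1, Math.min(artist1Fans.length, artist2Fans.length))) * 100
    ).toFixed(1),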
    sharedFans: overlap.map(f => f.username),
  };
}
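
The percentage above is relative to the smaller fan base. If you want a symmetric score for comparing many artist pairs, Jaccard similarity (shared fans divided by all distinct fans) is a common alternative. This sketch assumes the same fan objects with a username field; jaccardSimilarity is a hypothetical name.

```javascript
// Jaccard similarity of two fan lists: |A ∩ B| / |A ∪ B|, between 0 and 1.
function jaccardSimilarity(artist1Fans, artist2Fans) {
  const set1 = new Set(artist1Fans.map((fan) => fan.username));
  const set2 = new Set(artist2Fans.map((fan) => fan.username));

  let shared = 0;
  for (const username of set1) {
    if (set2.has(username)) shared++;
  }

  const unionSize = set1.size + set2.size - shared;
  // Two empty lists have no basis for comparison; report 0
  return unionSize === 0 ? 0 : shared / unionSize;
}

const fansA = [{ username: 'a' }, { username: 'b' }, { username: 'c' }];
const fansB = [{ username: 'b' }, { username: 'c' }, { username: 'd' }];
console.log(jaccardSimilarity(fansA, fansB));
```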

Best Practices and Ethics

When scraping Bandcamp, follow these guidelines:

  1. Respect rate limits: Bandcamp is an independent platform — don't hammer their servers. Add delays of 2-5 seconds between requests.
  2. Check robots.txt: Review and respect Bandcamp's crawling directives.
  3. Don't scrape audio files: Downloading music without permission is piracy. Scrape metadata only.
  4. Respect artist privacy: Some artists may not want their data aggregated. Be mindful of how you use and share the data.
  5. Support the artists: If you find music you like through your data analysis, buy it! Bandcamp pays artists directly.
  6. Cache responsibly: Store results locally to minimize repeat requests.
  7. Comply with regulations: Follow GDPR/CCPA when handling fan data.
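
Guidelines 1 and 6 can be combined in a small wrapper. The sketch below waits a random 2 to 5 seconds before each uncached request and serves repeats from an in-memory cache. The names randomDelayMs and createPoliteFetcher are hypothetical, fetchText is any function you supply that returns a page body, and the wrapper assumes requests are made one at a time.

```javascript
// Random integer delay in milliseconds, inclusive of both bounds.
function randomDelayMs(minMs = 2000, maxMs = 5000) {
  return minMs + Math.floor(Math.random() * (maxMs - minMs + 1));
}

const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Wrap a fetch function with a delay and an in-memory cache (hypothetical helper).
// Assumes sequential use; concurrent calls would overlap their delays.
function createPoliteFetcher(fetchText, { minMs = 2000, maxMs = 5000 } = {}) {
  const cache = new Map();

  return async function politeFetch(url) {
    // Repeat requests are answered locally and cost the server nothing
    if (cache.has(url)) return cache.get(url);

    // Wait before every uncached request, including the first
    await sleep(randomDelayMs(minMs, maxMs));
    const body = await fetchText(url);
    cache.set(url, body);
    return body;
  };
}
```

A typical fetchText would be something like `async (url) => (await fetch(url)).text()`. For long-running jobs you would likely swap the Map for a persistent store.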

Conclusion

Bandcamp is a uniquely scraper-friendly platform thanks to its clean architecture and embedded JSON data. Whether you're building a music recommendation engine, analyzing pricing trends in independent music, or scouting for emerging artists, Bandcamp data provides rich, structured insights.

For production scraping at scale, the Apify platform handles the infrastructure challenges — proxy rotation, cloud execution, scheduling, and data export — letting you focus on the analysis and insights that matter.

The independent music ecosystem is a vibrant, data-rich space. With the right scraping tools and ethical practices, you can unlock powerful insights that benefit artists, labels, and music lovers alike.


Explore the Apify Store for ready-to-use music and web scraping actors that handle the infrastructure complexity for you.
