Introduction
Music streaming platforms generate massive amounts of data — from chart rankings to artist statistics. Whether you're building a music analytics dashboard, tracking emerging artists, or analyzing genre trends, scraping streaming data opens up powerful insights.
In this tutorial, we'll build a Python scraper that collects Spotify chart data and artist statistics from publicly available sources.
Setting Up the Environment
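The rest of the tutorial relies on three libraries: requests for HTTP, beautifulsoup4 for HTML parsing, and pandas for analysis and storage. Install them with pip:

```shell
pip install requests beautifulsoup4 pandas
```

Python 3.9 or newer is assumed throughout.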
Scraping Spotify Charts Data
Spotify's public chart pages display top tracks by country and globally. Let's build a scraper for these:
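Here is a minimal sketch of a chart scraper. Note the caveats: the URL pattern below is an assumption, and Spotify's chart pages are rendered client-side, so a plain HTTP fetch may return a page without chart data — in that case you'd swap `requests` for a headless browser such as Playwright. The parsing is split into its own function so it can be tested and adjusted independently of the fetch; the table selectors are placeholders you should adapt to the markup you actually receive.

```python
import requests
from bs4 import BeautifulSoup


def parse_chart_rows(html):
    """Parse chart rows from an HTML table.

    Assumes one <tr> per track with rank, title, and artist in the
    first three cells -- adjust the selectors to the real markup.
    """
    soup = BeautifulSoup(html, "html.parser")
    tracks = []
    for row in soup.select("table tbody tr"):
        cells = [c.get_text(strip=True) for c in row.find_all("td")]
        if len(cells) >= 3:
            tracks.append({
                "rank": int(cells[0]),
                "title": cells[1],
                "artist": cells[2],
            })
    return tracks


def scrape_spotify_charts(country="global"):
    """Fetch a chart page for one country and parse it.

    The URL pattern is hypothetical; charts.spotify.com renders
    client-side, so you may need a headless browser instead.
    """
    url = f"https://charts.spotify.com/charts/overview/{country}"
    resp = requests.get(url, headers={"User-Agent": "Mozilla/5.0"}, timeout=30)
    resp.raise_for_status()
    return parse_chart_rows(resp.text)
```

Keeping `parse_chart_rows` separate means you can point the same parser at HTML captured by a headless browser without touching the rest of the pipeline.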
Collecting Artist Statistics
Beyond charts, artist profile pages contain monthly listeners, follower counts, and popular tracks:
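A sketch of pulling the monthly-listener figure from an artist page. The `open.spotify.com/artist/{id}` URL pattern is real, but the page is JavaScript-rendered, so the figure may only appear in embedded JSON or require a headless browser; the regex below simply looks for text like "1,234,567 monthly listeners" wherever it surfaces. `scrape_artist_stats` and `parse_listener_count` are illustrative names, not an official API.

```python
import re

import requests


def parse_listener_count(text):
    """Extract a figure like '12,345,678 monthly listeners' from page text.

    Returns an int, or None if no such phrase is found.
    """
    match = re.search(r"([\d,]+)\s+monthly listeners", text)
    return int(match.group(1).replace(",", "")) if match else None


def scrape_artist_stats(artist_id):
    """Fetch an artist page and pull the monthly-listener figure.

    May return None for monthly_listeners if the page is rendered
    client-side -- fall back to a headless browser in that case.
    """
    url = f"https://open.spotify.com/artist/{artist_id}"
    resp = requests.get(url, headers={"User-Agent": "Mozilla/5.0"}, timeout=30)
    resp.raise_for_status()
    return {
        "artist_id": artist_id,
        "monthly_listeners": parse_listener_count(resp.text),
    }
```

Follower counts and popular tracks can be extracted the same way once you identify where they appear in the fetched HTML or embedded JSON.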
Building a Trend Tracker
The real value comes from tracking changes over time:
import glob
import time

import pandas as pd

def track_chart_movements(countries=None):
    """Track chart position changes across countries.

    Relies on scrape_spotify_charts() from the charts section above.
    """
    if countries is None:
        countries = ["global", "us", "gb", "de", "jp", "br"]
    all_data = []
    for country in countries:
        print(f"Scraping charts for {country}...")
        tracks = scrape_spotify_charts(country)
        for track in tracks:
            track["country"] = country
            track["date"] = pd.Timestamp.now().strftime("%Y-%m-%d")
            all_data.append(track)
        time.sleep(2)  # Respect rate limits between countries
    df = pd.DataFrame(all_data)
    # One dated snapshot per run; schedule this (e.g. daily via cron)
    # to build up the history that analyze_trends() consumes.
    df.to_csv(f"charts_{pd.Timestamp.now().strftime('%Y%m%d')}.csv", index=False)
    return df

def analyze_trends(historical_dir="./data"):
    """Analyze chart trends from accumulated snapshot CSVs."""
    files = glob.glob(f"{historical_dir}/charts_*.csv")
    all_charts = pd.concat([pd.read_csv(f) for f in files], ignore_index=True)
    # Find the biggest climbers: the gap between a track's worst
    # and best observed rank across all snapshots
    risers = all_charts.groupby(["title", "artist"]).agg(
        best_rank=("rank", "min"),
        worst_rank=("rank", "max"),
        appearances=("rank", "count"),
    ).reset_index()
    risers["climb"] = risers["worst_rank"] - risers["best_rank"]
    return risers.sort_values("climb", ascending=False).head(20)
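You can sanity-check the climb metric without waiting days for snapshot CSVs to accumulate by running the same aggregation on a small in-memory frame (the sample data here is invented for illustration):

```python
import pandas as pd

# Two weekly snapshots of a tiny chart: "Song B" climbs from rank 3 to 1,
# while "Song A" only moves one place.
charts = pd.DataFrame([
    {"title": "Song A", "artist": "X", "rank": 1, "date": "2024-01-01"},
    {"title": "Song B", "artist": "Y", "rank": 3, "date": "2024-01-01"},
    {"title": "Song A", "artist": "X", "rank": 2, "date": "2024-01-08"},
    {"title": "Song B", "artist": "Y", "rank": 1, "date": "2024-01-08"},
])

risers = charts.groupby(["title", "artist"]).agg(
    best_rank=("rank", "min"),
    worst_rank=("rank", "max"),
    appearances=("rank", "count"),
).reset_index()
risers["climb"] = risers["worst_rank"] - risers["best_rank"]
risers = risers.sort_values("climb", ascending=False)
```

The top row is "Song B" with a climb of 2, confirming the metric rewards the largest rank improvement.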
Monitoring Multiple Platforms
For comprehensive analysis, scrape across platforms and compare:
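Each platform needs its own scraper, but the comparison step is platform-agnostic once every chart is normalized to the same `{"title", "artist", "rank"}` shape. A sketch of measuring chart overlap (Jaccard similarity) between platform pairs — `compare_platforms` is an illustrative helper, not part of any library:

```python
def compare_platforms(platform_charts):
    """Compute pairwise chart overlap between platforms.

    platform_charts maps a platform name to its chart as a list of
    {"title", "artist", "rank"} dicts (each produced by that
    platform's own scraper). Returns Jaccard similarity per pair:
    1.0 means identical track sets, 0.0 means no overlap.
    """
    track_sets = {
        platform: {(t["title"], t["artist"]) for t in tracks}
        for platform, tracks in platform_charts.items()
    }
    platforms = list(track_sets)
    overlap = {}
    for i, a in enumerate(platforms):
        for b in platforms[i + 1:]:
            shared = track_sets[a] & track_sets[b]
            union = track_sets[a] | track_sets[b]
            overlap[(a, b)] = len(shared) / len(union) if union else 0.0
    return overlap
```

Matching on (title, artist) tuples sidesteps the fact that each platform uses its own track IDs, though you may want fuzzier matching to handle spelling variants across platforms.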
Data Storage and Analysis
import sqlite3

import pandas as pd

def store_chart_data(tracks, db_path="music_data.db"):
    """Append scraped chart rows to a SQLite table for analysis."""
    conn = sqlite3.connect(db_path)
    df = pd.DataFrame(tracks)
    df.to_sql("charts", conn, if_exists="append", index=False)
    conn.close()
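Reading the data back is just as short. This sketch assumes the table and column names produced by `store_chart_data` above; `"rank"` is quoted because it is also an SQL keyword:

```python
import sqlite3

import pandas as pd


def load_top_tracks(db_path="music_data.db", limit=10):
    """Read stored chart rows back, best (lowest) rank first."""
    conn = sqlite3.connect(db_path)
    df = pd.read_sql_query(
        'SELECT title, artist, "rank", country, date FROM charts '
        'ORDER BY "rank" ASC LIMIT ?',
        conn,
        params=(limit,),
    )
    conn.close()
    return df
```

From here, adding a `WHERE date = ?` filter or a `GROUP BY country` gives you per-snapshot and per-market views without reloading CSVs.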
Conclusion
By combining chart scraping, artist statistics, and cross-platform comparison, you can build powerful music analytics tools. Remember to respect robots.txt, use rate limiting, and consider using ScraperAPI for handling JavaScript rendering and proxy rotation at scale.
The code above covers collection, trend analysis, and storage — add scheduling (e.g. a daily cron job running track_chart_movements) and visualization on top, and adapt it to track the genres and markets that matter to your analysis.