agenthustler

Scraping SoundCloud in 2026: Public API, Track Data, and Artist Stats

SoundCloud doesn't offer a public API program for new developers anymore — registrations have been closed since 2017. But SoundCloud's web app uses an internal API at api-v2.soundcloud.com that's accessible if you know how to extract the client_id. In this tutorial, we'll walk through how it works, how to use it with Python, and when you should use a managed tool instead.

How SoundCloud's Internal API Works

When you load soundcloud.com in a browser, the JavaScript frontend makes requests to api-v2.soundcloud.com with a client_id parameter. This client_id is embedded in one of SoundCloud's JavaScript bundles and rotates periodically.

The API returns JSON and covers most of what you see on the site: tracks, users, playlists, search results, and more.

Step 1: Extract the client_id

The client_id is embedded in SoundCloud's JS bundles. Here's how to extract it programmatically:

import httpx
import re

def get_client_id() -> str:
    """Extract SoundCloud's current client_id from their JS bundles."""
    # httpx does NOT follow redirects by default (unlike requests),
    # so enable it explicitly or you'll get a redirect page back
    resp = httpx.get("https://soundcloud.com", follow_redirects=True, timeout=10)
    resp.raise_for_status()
    # Find the JS bundle URLs referenced by the page
    scripts = re.findall(r'src="(https://a-v2\.sndcdn\.com/assets/[^"]+\.js)"', resp.text)

    for script_url in scripts:
        js = httpx.get(script_url, timeout=10).text
        match = re.search(r'client_id:"([a-zA-Z0-9]{32})"', js)
        if match:
            return match.group(1)

    raise ValueError("Could not find client_id")

client_id = get_client_id()
print(f"client_id: {client_id}")

Important: The client_id changes every few weeks. Your code needs to re-extract it periodically or handle 401 errors by refreshing.
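
One way to handle this is a small wrapper that retries once with a fresh client_id whenever a request comes back 401. This is a sketch, not SoundCloud-specific API: request_fn and refresh_fn are placeholders for your own httpx call and the get_client_id() helper above, and the (status_code, payload) return shape is an assumption of this example.

```python
def with_client_id_refresh(request_fn, refresh_fn, client_id):
    """Run request_fn(client_id); on a 401, refresh the client_id once
    via refresh_fn() and retry.

    request_fn returns (status_code, payload); refresh_fn returns a
    fresh client_id (e.g. the get_client_id() helper above).
    """
    status, payload = request_fn(client_id)
    if status == 401:
        client_id = refresh_fn()  # re-extract from the JS bundles
        status, payload = request_fn(client_id)
    return status, payload, client_id
```

Returning the (possibly refreshed) client_id lets the caller cache it for subsequent requests instead of re-extracting every time.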

Step 2: Search for Tracks

With the client_id, you can query the search endpoint:

import httpx

BASE = "https://api-v2.soundcloud.com"

def search_tracks(query: str, client_id: str, limit: int = 20):
    resp = httpx.get(f"{BASE}/search/tracks", params={
        "q": query,
        "client_id": client_id,
        "limit": limit,
        "offset": 0,
    })
    resp.raise_for_status()
    data = resp.json()
    return data["collection"]

tracks = search_tracks("lo-fi hip hop", client_id, limit=10)
for track in tracks:
    print(f"{track['title']} by {track['user']['username']}")
    print(f"  Plays: {track['playback_count']:,}")
    print(f"  Likes: {track['likes_count']:,}")
    print(f"  Genre: {track.get('genre', 'N/A')}")
    print()

Step 3: Get Artist Profiles

def get_user(user_id: int, client_id: str):
    resp = httpx.get(f"{BASE}/users/{user_id}", params={
        "client_id": client_id,
    })
    resp.raise_for_status()
    return resp.json()

def resolve_url(url: str, client_id: str):
    """Resolve a SoundCloud URL to its API object."""
    resp = httpx.get(f"{BASE}/resolve", params={
        "url": url,
        "client_id": client_id,
    })
    resp.raise_for_status()
    return resp.json()

# Resolve an artist URL
artist = resolve_url("https://soundcloud.com/flaboratory", client_id)
print(f"Artist: {artist['username']}")
print(f"Followers: {artist['followers_count']:,}")
print(f"Tracks: {artist['track_count']}")

Step 4: Get Track Details and Playlists

def get_user_tracks(user_id: int, client_id: str, limit: int = 50):
    resp = httpx.get(f"{BASE}/users/{user_id}/tracks", params={
        "client_id": client_id,
        "limit": limit,
        "offset": 0,
    })
    resp.raise_for_status()
    return resp.json()["collection"]

def get_playlist(playlist_id: int, client_id: str):
    resp = httpx.get(f"{BASE}/playlists/{playlist_id}", params={
        "client_id": client_id,
    })
    resp.raise_for_status()
    return resp.json()
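
Once you have an artist's tracks, simple pure helpers go a long way, e.g. ranking the catalogue by plays. This is an illustrative function of my own, not part of SoundCloud's API; it only assumes each track dict carries the playback_count field described in the next section (which can be null on some tracks).

```python
def top_tracks(tracks: list[dict], n: int = 5) -> list[dict]:
    """Return the n most-played tracks from a list of track dicts
    (as returned by get_user_tracks). Treats a missing or null
    playback_count as zero."""
    return sorted(tracks, key=lambda t: t.get("playback_count") or 0, reverse=True)[:n]
```

Usage would look like: `top_tracks(get_user_tracks(artist["id"], client_id), n=10)`.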

Available Data Fields

Here's what the track endpoint returns:

Field            Description
title            Track title
user.username    Artist name
playback_count   Total plays
likes_count      Total likes
reposts_count    Total reposts
comment_count    Total comments
duration         Length in milliseconds
genre            Genre tag
tag_list         Space-separated tags
created_at       Upload timestamp
permalink_url    Public URL
artwork_url      Cover art URL
waveform_url     Waveform data URL
description      Track description
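
If you're exporting to CSV, it helps to flatten the nested JSON into one row per track. Here's a minimal sketch using a handful of the fields above (track_row is a name I've made up, and the field selection is arbitrary; add or drop columns to taste).

```python
def track_row(track: dict) -> dict:
    """Flatten a nested track dict into a flat row for CSV export.
    Field names follow the track schema above."""
    return {
        "title": track["title"],
        "artist": track["user"]["username"],
        "plays": track.get("playback_count") or 0,
        "likes": track.get("likes_count") or 0,
        "duration_s": track["duration"] / 1000,  # duration is in milliseconds
        "url": track["permalink_url"],
    }
```

Feed the resulting dicts straight to csv.DictWriter for a spreadsheet-ready file.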

The Limitations of DIY Scraping

Building your own SoundCloud scraper works for small projects, but you'll run into issues at scale:

  1. client_id rotation: SoundCloud rotates the client_id. Your scraper breaks silently until you detect and refresh it.
  2. Rate limiting: The API enforces rate limits. Without proxy rotation, you'll get blocked at a few hundred requests per hour.
  3. IP blocking: Aggressive scraping from a single IP triggers blocks.
  4. Pagination complexity: Large result sets require cursor-based pagination that varies by endpoint.
  5. Data completeness: Some fields are only available through certain endpoints or require additional API calls.
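
On point 4: the api-v2 list endpoints generally include a next_href field alongside collection, which you can follow until it comes back empty. A generic loop can handle this; in this sketch, fetch_page stands in for an httpx.get(url).json() call with your client_id attached.

```python
def paginate(fetch_page, first_url: str) -> list:
    """Follow next_href links until exhausted, accumulating all items.
    fetch_page(url) must return a dict shaped like
    {"collection": [...], "next_href": "..." or None}."""
    items, url = [], first_url
    while url:
        page = fetch_page(url)
        items.extend(page.get("collection", []))
        url = page.get("next_href")  # None on the last page
    return items
```

Be aware that exact pagination behavior (cursor vs. offset) can differ between endpoints, so test each one you rely on.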

The Easier Alternative: Managed Scraping

If you need reliable, production-grade SoundCloud data without maintaining scraper infrastructure, SoundCloud Scraper on Apify handles all of this for you:

  • Automatic client_id management
  • Built-in proxy rotation and rate limit handling
  • Pagination handled automatically
  • Outputs in JSON, CSV, or Excel
  • Cloud execution — no servers to manage
  • API access for automated pipelines

Here's a minimal run using the Apify Python client:

from apify_client import ApifyClient

client = ApifyClient("YOUR_APIFY_TOKEN")
run = client.actor("cryptosignals/soundcloud-scraper").call(
    run_input={
        "searchQueries": ["ambient electronic"],
        "maxItems": 200
    }
)

for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(f"{item['title']}: {item['playCount']:,} plays")

This gives you the same data with none of the maintenance overhead.

When to Use Each Approach

Scenario                          Recommendation
Learning / experimenting          DIY with httpx
One-off data pull (<100 tracks)   DIY with httpx
Production pipeline               Apify SoundCloud Scraper
Daily automated extraction        Apify SoundCloud Scraper
Custom data processing needs      DIY + Apify as fallback

Conclusion

SoundCloud's internal API at api-v2.soundcloud.com gives you access to track metadata, artist profiles, playlists, and search results — all in clean JSON. For prototyping and small-scale use, the httpx approach works well. For anything production-grade, a managed solution like the Apify SoundCloud Scraper saves you from the ongoing maintenance of client_id extraction, proxy management, and rate limit handling.


This article is part of the Web Scraping in 2026 series. Check out the companion article: Best SoundCloud Scrapers in 2026.
