agenthustler
Scraping Metacritic in 2026: Metascores, Reviews, and Game Data Without API Keys

Metacritic doesn't offer a public API. Never has. But there's a clean way to get structured Metascores, reviews, and game data without API keys, without headless browsers, and without fragile HTML parsing.

The key insight: Metacritic's frontend is powered by a backend API at backend.metacritic.com that returns clean JSON. You can hit it directly with standard HTTP requests.

Let's build a Metacritic scraper from scratch.

Prerequisites

pip install httpx beautifulsoup4

We'll use httpx for async HTTP requests and beautifulsoup4 as a fallback for any HTML parsing needs. But the main approach is pure API calls — no HTML parsing required for most data.

Understanding Metacritic's Backend API

Open your browser's DevTools on any Metacritic page and watch the Network tab. You'll see requests going to backend.metacritic.com. These endpoints return JSON with all the data the frontend needs — scores, reviews, release dates, platforms, descriptions.

The key headers you need:

import httpx

HEADERS = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/131.0.0.0 Safari/537.36",
    "Referer": "https://www.metacritic.com/",
    "Origin": "https://www.metacritic.com",
    "Accept": "application/json",
}

That's it. No API key. No authentication token. No OAuth flow. Just standard browser headers so the server knows you're coming from the Metacritic frontend.

Fetching Game Data

Here's how to fetch data for a specific game:

import httpx
import json

HEADERS = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36",
    "Referer": "https://www.metacritic.com/",
    "Origin": "https://www.metacritic.com",
    "Accept": "application/json",
}

def get_game_details(slug: str, platform: str = "playstation-5") -> dict:
    """Fetch game details from Metacritic's backend API."""
    url = f"https://backend.metacritic.com/v1/catalog/game/{slug}/platform/{platform}"

    with httpx.Client(headers=HEADERS) as client:
        response = client.get(url)
        response.raise_for_status()
        return response.json()

# Example usage
game = get_game_details("grand-theft-auto-vi")
print(f"Title: {game['title']}")
print(f"Metascore: {game['metaScore']}")
print(f"User Score: {game['userScore']}")
print(f"Critic Reviews: {game['criticReviewCount']}")

Scraping Reviews

Critic and user reviews are available through separate endpoints:

def get_critic_reviews(slug: str, platform: str = "playstation-5", page: int = 0) -> list:
    """Fetch critic reviews for a game."""
    url = f"https://backend.metacritic.com/v1/catalog/game/{slug}/platform/{platform}/critic-reviews"
    params = {"page": page}

    with httpx.Client(headers=HEADERS) as client:
        response = client.get(url, params=params)
        response.raise_for_status()
        data = response.json()
        return data.get("reviews", [])

def get_user_reviews(slug: str, platform: str = "playstation-5", page: int = 0) -> list:
    """Fetch user reviews for a game."""
    url = f"https://backend.metacritic.com/v1/catalog/game/{slug}/platform/{platform}/user-reviews"
    params = {"page": page}

    with httpx.Client(headers=HEADERS) as client:
        response = client.get(url, params=params)
        response.raise_for_status()
        data = response.json()
        return data.get("reviews", [])

# Fetch first page of critic reviews
reviews = get_critic_reviews("elden-ring")
for review in reviews[:3]:
    print(f"{review['publication']}: {review['score']}/100")
    print(f"  {review['snippet'][:100]}...")
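
Both review endpoints are paginated. To collect every page, a small helper that keeps requesting until an empty batch comes back works for either endpoint. A sketch, with the page-fetcher passed in so it isn't tied to one of the functions above:

```python
from typing import Callable

def get_all_reviews(fetch_page: Callable[[int], list], max_pages: int = 50) -> list:
    """Drain a paginated endpoint: request page 0, 1, 2, ... until a page is empty."""
    all_reviews: list = []
    for page in range(max_pages):  # hard cap so a surprise response can't loop forever
        batch = fetch_page(page)
        if not batch:
            break
        all_reviews.extend(batch)
    return all_reviews

# Usage with the functions above:
# reviews = get_all_reviews(lambda p: get_critic_reviews("elden-ring", page=p))
```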

Browsing by Category

Want all games for a platform, sorted by Metascore?

def browse_games(platform: str = "playstation-5", sort: str = "score", page: int = 0) -> list:
    """Browse games by platform and sort order."""
    url = f"https://backend.metacritic.com/v1/catalog/browse/game/platform/{platform}"
    params = {
        "sort": sort,  # 'score', 'date', 'title'
        "page": page,
    }

    with httpx.Client(headers=HEADERS) as client:
        response = client.get(url, params=params)
        response.raise_for_status()
        data = response.json()
        return data.get("items", [])

# Top PS5 games by Metascore
top_games = browse_games("playstation-5", sort="score")
for game in top_games[:10]:
    print(f"{game['title']}: {game.get('metaScore', 'N/A')}")

Going Async for Speed

If you need to scrape hundreds of titles, go async:

import httpx
import asyncio

async def fetch_games_async(slugs: list[str], platform: str = "playstation-5") -> list[dict]:
    """Fetch multiple games concurrently."""
    async with httpx.AsyncClient(headers=HEADERS) as client:
        tasks = []
        for slug in slugs:
            url = f"https://backend.metacritic.com/v1/catalog/game/{slug}/platform/{platform}"
            tasks.append(client.get(url))

        responses = await asyncio.gather(*tasks, return_exceptions=True)

        results = []
        for resp in responses:
            if isinstance(resp, Exception):
                continue
            if resp.status_code == 200:
                results.append(resp.json())

        return results

# Fetch a batch of games concurrently
slugs = ["elden-ring", "baldurs-gate-3", "grand-theft-auto-vi"]
games = asyncio.run(fetch_games_async(slugs))

Add a semaphore to be respectful:

semaphore = asyncio.Semaphore(5)  # Max 5 concurrent requests

async def fetch_with_limit(client, url):
    async with semaphore:
        await asyncio.sleep(0.5)  # Be nice
        return await client.get(url)
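
To wire that limiter into the batch fetcher, one option is a generic helper that runs any set of awaitables under a shared semaphore. A sketch — the limit and delay are just the polite defaults from above, not anything Metacritic documents:

```python
import asyncio

async def gather_with_limit(coros, limit: int = 5, delay: float = 0.5) -> list:
    """Run awaitables with at most `limit` in flight, sleeping `delay` inside each slot."""
    semaphore = asyncio.Semaphore(limit)

    async def run(coro):
        async with semaphore:
            await asyncio.sleep(delay)  # be nice
            return await coro

    # return_exceptions=True so one failed request doesn't sink the whole batch
    return await asyncio.gather(*(run(c) for c in coros), return_exceptions=True)

# Usage inside fetch_games_async, replacing the plain gather:
# responses = await gather_with_limit([client.get(u) for u in urls])
```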

Handling Movies and TV Shows

The same API pattern works for movies and TV:

def get_details(slug: str, media_type: str = "game", platform: str | None = None) -> dict:
    """Generic fetch for any media type."""
    if media_type == "game" and platform:
        url = f"https://backend.metacritic.com/v1/catalog/{media_type}/{slug}/platform/{platform}"
    else:
        url = f"https://backend.metacritic.com/v1/catalog/{media_type}/{slug}"

    with httpx.Client(headers=HEADERS) as client:
        response = client.get(url)
        response.raise_for_status()
        return response.json()

# Movies
movie = get_details("dune-part-two", media_type="movie")

# TV shows
show = get_details("the-last-of-us", media_type="tv")

Common Pitfalls

1. Rate limiting: Metacritic will block you if you hammer their API. Add delays between requests (0.5-1 second minimum). Use a semaphore for async code.

2. Slug format: Game slugs use lowercase with hyphens. "The Legend of Zelda: Tears of the Kingdom" becomes the-legend-of-zelda-tears-of-the-kingdom. Check the URL on metacritic.com if unsure.

3. Platform identifiers: Use lowercase with hyphens: playstation-5, xbox-series-x, pc, nintendo-switch.

4. Missing data: Not all fields are present for all titles. Always use .get() with defaults.

5. Endpoint changes: While more stable than HTML, the backend API can change. If you need production reliability, consider using the Metacritic Scraper on Apify which is actively maintained and handles these changes.
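
For batch jobs it helps to generate slugs from titles instead of looking each one up. The helper below is only an approximation of Metacritic's slug rules (lowercase, punctuation dropped, hyphens between words) — spot-check unusual titles against the site before trusting it:

```python
import re

def slugify(title: str) -> str:
    """Approximate a Metacritic slug: drop punctuation, hyphenate the rest."""
    slug = title.lower()
    slug = re.sub(r"['’:!.,&]", "", slug)     # punctuation vanishes rather than hyphenates
    slug = re.sub(r"[^a-z0-9]+", "-", slug)   # everything else collapses to single hyphens
    return slug.strip("-")

print(slugify("Baldur's Gate 3"))  # baldurs-gate-3
```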

When to Build vs. Buy

The code above works great for one-off analysis, research projects, and learning. But if you need:

  • Production reliability — endpoint monitoring, automatic fixes when things break
  • Proxy rotation — to avoid IP blocks at scale
  • Scheduling — automated daily/weekly runs
  • Data storage — managed datasets with export to JSON/CSV/Excel

Then a managed solution like the Metacritic Scraper on Apify saves you significant maintenance time. It uses the same backend API approach described here, packaged as a cloud-ready actor with built-in proxy support and scheduling.

Putting It All Together

Here's a complete script that scrapes the top 100 games for a platform and exports to CSV:

import httpx
import csv
import time

HEADERS = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36",
    "Referer": "https://www.metacritic.com/",
    "Origin": "https://www.metacritic.com",
    "Accept": "application/json",
}

def scrape_top_games(platform: str, pages: int = 5) -> list[dict]:
    all_games = []
    with httpx.Client(headers=HEADERS, timeout=30) as client:
        for page in range(pages):
            url = f"https://backend.metacritic.com/v1/catalog/browse/game/platform/{platform}"
            resp = client.get(url, params={"sort": "score", "page": page})
            if resp.status_code != 200:
                break
            items = resp.json().get("items", [])
            if not items:
                break
            all_games.extend(items)
            time.sleep(1)
    return all_games

games = scrape_top_games("playstation-5")

with open("metacritic_ps5_top.csv", "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=["title", "metaScore", "userScore", "releaseDate"])
    writer.writeheader()
    for g in games:
        writer.writerow({
            "title": g.get("title"),
            "metaScore": g.get("metaScore"),
            "userScore": g.get("userScore"),
            "releaseDate": g.get("releaseDate"),
        })

print(f"Exported {len(games)} games to metacritic_ps5_top.csv")
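
On a long run you'll eventually hit a transient 429 or timeout, so it's worth wrapping requests in a retry with exponential backoff. A sketch — the `fetch` callable is whatever request you want retried:

```python
import time

def get_with_retries(fetch, retries: int = 3, backoff: float = 1.0):
    """Call fetch(); on failure sleep backoff * 2**attempt seconds, then retry."""
    for attempt in range(retries):
        try:
            return fetch()
        except Exception:
            if attempt == retries - 1:
                raise  # out of attempts: surface the last error
            time.sleep(backoff * (2 ** attempt))

# Usage with the functions above:
# game = get_with_retries(lambda: get_game_details("elden-ring"))
```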

Wrapping Up

Metacritic's backend API is the cleanest way to get review data in 2026. No API keys, no headless browsers, no fragile HTML selectors. Just HTTP requests and JSON responses.

The approach works for games, movies, and TV shows. Add async for speed, respect rate limits, and you've got a solid data pipeline.

For production use, check out the Metacritic Scraper on Apify — same approach, zero maintenance.


Questions about scraping Metacritic? Drop them in the comments.
