IMDb is the largest movie database on the planet: 10M+ titles, 500M monthly visitors, and the go-to source for ratings, cast info, and box office data. But there is no free official API for bulk data. The official options are a paid commercial license and non-commercial dataset dumps that omit fields such as box office and plot summaries.
For anything beyond that, scraping is the practical option. And in 2026, IMDb actually makes it easier than you might think.
Why IMDb Is Surprisingly Scrapable
IMDb runs on Next.js and embeds two goldmines in every page:
- JSON-LD structured data: Schema.org Movie objects with title, rating, director, genre, and more, right in the <head> tag
- __NEXT_DATA__: A full JSON payload with cast lists, box office numbers, runtime, and metadata that the frontend hydrates from
This means you do not need to parse HTML tables or deal with CSS selectors that break every redesign. The data is already structured. You just need to extract it.
No login required. No API key. No rate-limit headers. Just fetch the page and parse the JSON.
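As a rough illustration of that extraction step, here is a minimal sketch that uses only Python's standard library. The class name, function name, and sample HTML are made up for this example, and the sample payloads are far smaller than what a real page embeds.

```python
import json
from html.parser import HTMLParser


class EmbeddedJsonParser(HTMLParser):
    """Collects JSON-LD blocks and the __NEXT_DATA__ payload from a page."""

    def __init__(self):
        super().__init__()
        self.json_ld = []      # parsed <script type="application/ld+json"> blocks
        self.next_data = None  # parsed <script id="__NEXT_DATA__"> payload
        self._target = None    # "ld", "next", or None when outside a script of interest
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        if tag != "script":
            return
        attributes = dict(attrs)
        if attributes.get("id") == "__NEXT_DATA__":
            self._target = "next"
        elif attributes.get("type") == "application/ld+json":
            self._target = "ld"
        self._chunks = []

    def handle_data(self, data):
        if self._target:
            self._chunks.append(data)

    def handle_endtag(self, tag):
        if tag != "script" or not self._target:
            return
        try:
            payload = json.loads("".join(self._chunks))
        except json.JSONDecodeError:
            payload = None  # skip malformed blocks instead of crashing
        if payload is not None:
            if self._target == "ld":
                self.json_ld.append(payload)
            else:
                self.next_data = payload
        self._target = None


def extract_embedded_json(html):
    """Return (list of JSON-LD objects, __NEXT_DATA__ payload or None)."""
    parser = EmbeddedJsonParser()
    parser.feed(html)
    return parser.json_ld, parser.next_data


# Tiny stand-in for a real title page (illustrative only).
sample_html = """
<html><head>
<script type="application/ld+json">{"@type": "Movie", "name": "Example Title"}</script>
</head><body>
<script id="__NEXT_DATA__" type="application/json">{"props": {"pageProps": {}}}</script>
</body></html>
"""

json_ld, next_data = extract_embedded_json(sample_html)
print(json_ld[0]["name"])
```

Malformed JSON blocks are skipped rather than raised, so a single bad script tag does not abort a longer run.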
The Fast Way: Use a Ready-Made IMDb Scraper
If you want results in minutes instead of hours of coding, the IMDb Scraper on Apify handles all of this out of the box. It supports four modes:
| Mode | What It Does |
|---|---|
| Search | Find movies by keyword, genre, or year |
| Movie Details | Full data for specific titles |
| Actor/Person | Filmography, bio, known-for titles |
| Top Charts | IMDb Top 250, Most Popular, Box Office |
Example: Search for Sci-Fi Movies from 2025
{
  "mode": "search",
  "query": "sci-fi 2025",
  "maxItems": 50
}
What You Get Back
Each result includes:
{
  "title": "Emergence",
  "year": 2025,
  "rating": 7.4,
  "genres": ["Sci-Fi", "Thriller"],
  "director": "Denis Villeneuve",
  "cast": ["Timothée Chalamet", "Zendaya"],
  "runtime": "148 min",
  "boxOffice": "$312M",
  "plot": "A physicist discovers...",
  "imdbUrl": "https://www.imdb.com/title/tt1234567/"
}
Clean, structured, ready to pipe into a database or spreadsheet.
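To make "pipe into a database or spreadsheet" concrete, here is a small sketch that writes records of the shape above to CSV text and to SQLite. The function names, the table layout, and the choice of columns are assumptions for illustration, not part of the scraper's output contract.

```python
import csv
import io
import sqlite3

# One record in the shape shown above (illustrative values).
records = [{
    "title": "Emergence",
    "year": 2025,
    "rating": 7.4,
    "genres": ["Sci-Fi", "Thriller"],
    "imdbUrl": "https://www.imdb.com/title/tt1234567/",
}]


def to_csv(records, fields):
    """Render records as CSV text, joining list values with '; '."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fields, extrasaction="ignore")
    writer.writeheader()
    for record in records:
        writer.writerow({
            key: "; ".join(value) if isinstance(value, list) else value
            for key, value in record.items()
        })
    return buffer.getvalue()


def to_sqlite(records, connection):
    """Upsert records into a hypothetical 'movies' table keyed by IMDb URL."""
    connection.execute(
        "CREATE TABLE IF NOT EXISTS movies ("
        "imdb_url TEXT PRIMARY KEY, title TEXT, year INTEGER, rating REAL)"
    )
    connection.executemany(
        "INSERT OR REPLACE INTO movies VALUES (:imdbUrl, :title, :year, :rating)",
        records,
    )
    connection.commit()


csv_text = to_csv(records, ["title", "year", "rating", "genres"])
connection = sqlite3.connect(":memory:")  # swap for a file path to persist
to_sqlite(records, connection)
print(csv_text)
```

Keying the table on the IMDb URL means re-running a scrape updates existing rows instead of duplicating them.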
Use Cases That Actually Make Money
1. Movie Recommendation Engines
Pull ratings, genres, and cast data for thousands of titles. Feed it into a content-based or collaborative filtering model. Recommendation features on sites like Letterboxd and JustWatch depend on this kind of title metadata.
2. Film Industry Research
Track box office performance by genre, director, or studio. Hedge funds and entertainment analysts pay for this data — and IMDb is the primary source.
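A sketch of what tracking box office by genre could look like, assuming records carry a boxOffice string such as "$312M" and a genres list as in the sample output earlier. The helper names and the sample figures are invented for the example.

```python
from collections import defaultdict

SUFFIXES = {"K": 1_000, "M": 1_000_000, "B": 1_000_000_000}


def parse_box_office(text):
    """Convert strings like "$312M" or "$45,000" to a number of dollars."""
    cleaned = text.strip().lstrip("$").replace(",", "")
    multiplier = 1
    if cleaned and cleaned[-1].upper() in SUFFIXES:
        multiplier = SUFFIXES[cleaned[-1].upper()]
        cleaned = cleaned[:-1]
    return float(cleaned) * multiplier


def box_office_by_genre(records):
    """Sum gross per genre; a multi-genre title counts toward each of its genres."""
    totals = defaultdict(float)
    for record in records:
        gross = parse_box_office(record["boxOffice"])
        for genre in record["genres"]:
            totals[genre] += gross
    return dict(totals)


# Illustrative records, not real figures.
sample = [
    {"genres": ["Sci-Fi", "Thriller"], "boxOffice": "$312M"},
    {"genres": ["Sci-Fi"], "boxOffice": "$88M"},
]
totals = box_office_by_genre(sample)
print(totals)
```

Because a title with two genres is added to both totals, the per-genre sums will not add up to the overall gross. Whether that is acceptable depends on the analysis.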
3. Content Aggregation
Build a niche movie site (horror rankings, Oscar predictions, franchise trackers) with auto-updated data. Monetize with ads or affiliate links to streaming platforms.
4. Academic & Data Science Projects
IMDb datasets on Kaggle are years old and incomplete. A live scraper gives you current ratings, new releases, and trending titles that static datasets miss.
Building Your Own IMDb Scraper
If you prefer to build from scratch, here is the approach:
- Fetch the page with a headless browser or HTTP client (IMDb does not heavily block requests)
- Extract __NEXT_DATA__ from the <script id="__NEXT_DATA__"> tag
- Parse JSON-LD from <script type="application/ld+json">
- Merge both sources: JSON-LD has clean Schema.org fields, __NEXT_DATA__ has deeper details like full cast and box office
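The steps above can be sketched end to end as follows, with the focus on the merge. The JSON-LD keys (name, genre, aggregateRating) follow Schema.org conventions, but the path into __NEXT_DATA__ used here is purely hypothetical, so inspect a real payload and adjust it. The regex extraction is a compact shortcut that assumes the page's JSON-LD block is a single object.

```python
import json
import re
from urllib.request import Request, urlopen

LD_RE = re.compile(
    r'<script[^>]*type="application/ld\+json"[^>]*>(.*?)</script>', re.DOTALL
)
NEXT_RE = re.compile(
    r'<script[^>]*id="__NEXT_DATA__"[^>]*>(.*?)</script>', re.DOTALL
)


def fetch_html(url, user_agent="Mozilla/5.0 (compatible; example-scraper)"):
    """Plain HTTP fetch; the User-Agent string is an arbitrary example."""
    request = Request(url, headers={"User-Agent": user_agent})
    with urlopen(request, timeout=30) as response:
        return response.read().decode("utf-8", errors="replace")


def dig(data, *path, default=None):
    """Walk nested dicts/lists safely; return default if any step is missing."""
    for key in path:
        try:
            data = data[key]
        except (KeyError, IndexError, TypeError):
            return default
    return data


def merge_title(html):
    """Combine JSON-LD basics with deeper fields from __NEXT_DATA__."""
    ld_match = LD_RE.search(html)
    next_match = NEXT_RE.search(html)
    ld = json.loads(ld_match.group(1)) if ld_match else {}
    next_data = json.loads(next_match.group(1)) if next_match else {}
    return {
        "title": dig(ld, "name"),
        "rating": dig(ld, "aggregateRating", "ratingValue"),
        "genres": dig(ld, "genre", default=[]),
        # HYPOTHETICAL path: verify against the real payload before relying on it.
        "runtimeSeconds": dig(
            next_data, "props", "pageProps", "aboveTheFoldData", "runtime", "seconds"
        ),
    }


# Offline stand-in for fetch_html("https://www.imdb.com/title/tt1234567/").
sample_html = (
    '<script type="application/ld+json">'
    '{"@type": "Movie", "name": "Example Title", "genre": ["Sci-Fi"],'
    ' "aggregateRating": {"ratingValue": 7.4}}</script>'
    '<script id="__NEXT_DATA__" type="application/json">'
    '{"props": {"pageProps": {"aboveTheFoldData": {"runtime": {"seconds": 8880}}}}}'
    '</script>'
)
merged = merge_title(sample_html)
print(merged)
```

Routing every lookup through dig means a renamed or missing key yields None instead of an exception, which softens the impact of layout changes.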
The main challenge is pagination for search results and handling IMDb's occasional layout changes. A managed scraper like the Apify actor handles retries, proxy rotation, and schema changes automatically.
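If you build your own, retries are the easiest of those pieces to add. Below is a minimal exponential backoff sketch. The function name and defaults are assumptions for illustration, and it does not claim to mirror what any managed scraper does internally.

```python
import time


def with_retries(fetch, url, attempts=4, base_delay=1.0, sleep=time.sleep):
    """Call fetch(url), retrying on network errors with exponential backoff.

    Waits base_delay, then 2x, then 4x, ... between attempts and re-raises
    the last error once all attempts are used up.
    """
    last_error = None
    for attempt in range(attempts):
        try:
            return fetch(url)
        except OSError as error:  # urllib.error.URLError is a subclass of OSError
            last_error = error
            if attempt < attempts - 1:
                sleep(base_delay * 2 ** attempt)
    raise last_error


# Demo with a fake fetcher that fails twice, then succeeds.
_calls = {"count": 0}


def flaky_fetch(url):
    _calls["count"] += 1
    if _calls["count"] < 3:
        raise OSError("temporary failure")
    return "<html>ok</html>"


recorded_delays = []
result = with_retries(flaky_fetch, "https://www.imdb.com/title/tt1234567/",
                      sleep=recorded_delays.append)
print(result, recorded_delays)
```

Passing sleep as a parameter keeps the function testable without real waiting. In production the default time.sleep applies.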
Proxy Considerations
IMDb is lenient compared to most sites, but if you are pulling thousands of pages, you will want proxies. ScrapeOps offers a proxy aggregator that works well for entertainment sites — it rotates across multiple providers and handles CAPTCHAs if they appear.
For smaller jobs (under 500 pages), residential proxies are overkill. Datacenter proxies or even raw requests with reasonable delays (2-3 seconds between requests) work fine. For larger jobs where you do want residential IPs to avoid any throttling, ThorData offers good residential proxy bandwidth at per-GB rates — useful if you are pulling thousands of movie pages in a single run.
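For the small-job case, a paced fetch loop might look like the sketch below. The 2-3 second window comes from the paragraph above. The function names, the User-Agent string, and the proxy URL format are placeholders, and the proxy argument is optional.

```python
import random
import time
from urllib.request import ProxyHandler, Request, build_opener


def make_fetcher(proxy_url=None, user_agent="Mozilla/5.0 (compatible; example-scraper)"):
    """Build a fetch(url) function, optionally routed through one proxy.

    proxy_url is a placeholder such as "http://user:pass@host:port".
    """
    handlers = []
    if proxy_url:
        handlers.append(ProxyHandler({"http": proxy_url, "https": proxy_url}))
    opener = build_opener(*handlers)

    def fetch(url):
        request = Request(url, headers={"User-Agent": user_agent})
        with opener.open(request, timeout=30) as response:
            return response.read().decode("utf-8", errors="replace")

    return fetch


def crawl(urls, fetch, min_delay=2.0, max_delay=3.0, sleep=time.sleep):
    """Fetch URLs one by one, pausing a random 2-3 seconds between requests."""
    pages = {}
    for index, url in enumerate(urls):
        if index > 0:
            sleep(random.uniform(min_delay, max_delay))
        pages[url] = fetch(url)
    return pages


# Real use (makes network requests, so it is left commented out here):
# pages = crawl(["https://www.imdb.com/title/tt1234567/"], make_fetcher())
```

Randomizing the pause within the window avoids a perfectly regular request rhythm. For larger runs, pass a proxy URL to make_fetcher.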
Legal Notes
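IMDb's Terms of Service restrict automated access. In hiQ v. LinkedIn (2022), the Ninth Circuit held that scraping publicly available data likely does not violate the Computer Fraud and Abuse Act, but that ruling does not rule out breach-of-contract or copyright claims. That said: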
- Do not scrape user reviews or personal data at scale
- Do not republish IMDb ratings while claiming them as your own
- Respect robots.txt and rate limits
- Use the data for analysis, aggregation, or building derivative products
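Checking robots.txt can be automated with Python's built-in urllib.robotparser. In this sketch the rules are a made-up example, not IMDb's actual robots.txt, and the user agent name is a placeholder.

```python
from urllib.robotparser import RobotFileParser


def build_checker(robots_txt, user_agent="example-scraper"):
    """Return allowed(url) based on the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())

    def allowed(url):
        return parser.can_fetch(user_agent, url)

    return allowed


# HYPOTHETICAL rules for illustration. For a live check, use
# parser.set_url("https://www.imdb.com/robots.txt") followed by parser.read().
HYPOTHETICAL_ROBOTS = """\
User-agent: *
Disallow: /search/
"""

allowed = build_checker(HYPOTHETICAL_ROBOTS)
print(allowed("https://www.imdb.com/title/tt1234567/"))
print(allowed("https://www.imdb.com/search/title/?genres=sci-fi"))
```

Running this check before queuing each URL keeps disallowed paths out of the crawl entirely.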
Getting Started
The fastest path from zero to data:
- Go to the IMDb Scraper on Apify
- Set your mode (search, details, charts, or person)
- Run it — free tier gives you enough for testing
- Export as JSON, CSV, or push directly to your database
IMDb shows no sign of offering a free bulk API, so scraping remains the practical route. The data is structured, the access is straightforward, and the use cases are real. Start pulling data today.