Google Play has no official public API for app listings or reviews. So if you want app details, ratings, the ratings histogram, or customer reviews as clean JSON — for ASO tracking, review monitoring, an ML dataset, or market research — you either parse Google's deeply-nested, undocumented page payloads yourself, or you call something that already does.
This tutorial takes the second route. We'll pull Google Play data with Node.js in a few lines, without a Google API key, without proxies, and without writing a single HTML parser, by calling a hosted scraper actor on Apify. At the end I'll show how to do one-off runs without writing any code.
Why not just fetch() the Play Store?
You can fetch('https://play.google.com/store/apps/details?id=com.spotify.music') — but the useful data isn't in clean HTML. Google embeds it in AF_initDataCallback blocks as positional arrays with no field names, e.g. the exact rating lives at roughly data[1][2][51][0][1] and the ratings histogram at data[1][2][51][1]. Reviews come from a separate batchexecute RPC that returns a )]}'-prefixed blob and throttles aggressively. Those index paths also shift when Google reshapes the page, which is why hand-rolled scrapers quietly break.
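To make the fragility concrete, here is a minimal sketch of what the hand-rolled route looks like. The HTML string is a made-up miniature, not a real Play Store response, and the regex assumes the AF_initDataCallback({key: 'ds:N', ..., data: [...], sideChannel: {}}); shape, which is undocumented and may change. The helper names extractBlocks and at are hypothetical.

```javascript
// Made-up miniature of a Play Store page (NOT a real response). Real payloads
// are far larger and their layout is undocumented.
const html = `<script>AF_initDataCallback({key: 'ds:5', hash: '1', data:[null,[null,null,[4.33,[1,2,3]]]], sideChannel: {}});</script>`;

// Collect every AF_initDataCallback payload, keyed by its 'ds:N' label.
// The regex assumes the shape shown in the sample string above.
function extractBlocks(page) {
  const pattern = /AF_initDataCallback\(\{key: '(ds:\d+)'[\s\S]*?data:([\s\S]*?), sideChannel: \{\}\}\);/g;
  const blocks = {};
  for (const [, key, json] of page.matchAll(pattern)) {
    blocks[key] = JSON.parse(json);
  }
  return blocks;
}

// Walk a positional path, returning undefined as soon as a step is missing.
function at(value, path) {
  return path.reduce((node, index) => (node == null ? undefined : node[index]), value);
}

const payload = extractBlocks(html)['ds:5'];
console.log(at(payload, [1, 2, 0]));        // 4.33 in this toy payload
console.log(at(payload, [1, 2, 51, 0, 1])); // undefined: this path does not exist here
```

The second lookup is the failure mode in miniature: a path that worked against one page layout silently returns nothing against another, and nothing in the payload tells you which field you lost.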
Offloading the parsing and rate-limit handling to a maintained actor means you get stable, named fields like rating, ratingHistogram, installs, and developerResponse instead.
Setup
You'll need a free Apify account and your API token (Settings → Integrations). Then install the official client:
npm install apify-client
Set your token as an env var so it isn't hard-coded:
export APIFY_TOKEN="your_token_here"
1. Scrape app details
Here we fetch full details for two apps. The actor's input takes a mode, an array of appIds (package names, i.e. the id= part of a Play Store URL), and optional country and lang codes.
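// Top-level await needs an ES module: save this as .mjs or set "type": "module".
import { ApifyClient } from 'apify-client';

const client = new ApifyClient({ token: process.env.APIFY_TOKEN });

// Start the actor and wait for the run to finish.
const run = await client.actor('freshactors/google-play-scraper').call({
  mode: 'details',
  appIds: ['com.spotify.music', 'com.whatsapp'],
  country: 'us',
  lang: 'en',
});

// Results land in the run's default dataset.
const { items } = await client.dataset(run.defaultDatasetId).listItems();
for (const app of items) {
  // Assumption: ratingCount can be missing (e.g. an app with no ratings yet), so default to 0.
  const ratings = (app.ratingCount ?? 0).toLocaleString();
  console.log(`${app.title} — ${app.rating}★ (${ratings} ratings), ${app.installs}`);
}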
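If you are starting from Play Store URLs rather than package names, a small helper can pull out the id= part before you build the appIds array. This is a sketch; the function name appIdFromUrl is hypothetical and not part of the actor or the Apify client.

```javascript
// Extract the package name (the id= query parameter) from a Play Store URL.
// Returns null when the input is not a parseable URL or has no id parameter.
function appIdFromUrl(playUrl) {
  try {
    return new URL(playUrl).searchParams.get('id');
  } catch {
    return null; // not a parseable URL
  }
}

const urls = [
  'https://play.google.com/store/apps/details?id=com.spotify.music',
  'https://play.google.com/store/apps/details?id=com.whatsapp&hl=en',
];

// Drop anything that did not yield a package name.
const appIds = urls.map(appIdFromUrl).filter((id) => id !== null);
console.log(appIds); // [ 'com.spotify.music', 'com.whatsapp' ]
```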
Each item is a clean object. An abridged record looks like this:
{
  "_type": "app_details",
  "appId": "com.spotify.music",
  "title": "Spotify: Music and Podcasts",
  "developer": "Spotify AB",
  "rating": 4.33,
  "ratingCount": 35677435,
  "ratingHistogram": { "1": 3695081, "2": 1079953, "3": 1332269, "4": 3058122, "5": 26511993 },
  "installs": "1,000,000,000+",
  "free": true,
  "currency": "USD",
  "genre": "Music & Audio",
  "contentRating": "Teen",
  "containsAds": true,
  "updated": "2026-05-26T14:23:49.000Z",
  "url": "https://play.google.com/store/apps/details?id=com.spotify.music"
}
Note the ratingHistogram: the per-star breakdown is painful to extract by hand, but it is exactly what you need to compute, say, the share of 1★ ratings over time.
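As a rough sketch of that calculation, the snippet below derives the 1★ share and the mean rating from a histogram. The numbers are copied from the abridged record above; the function name summarizeHistogram is hypothetical. To track the share over time you would store one such summary per scheduled run.

```javascript
// Histogram values copied from the sample Spotify record.
const ratingHistogram = { 1: 3695081, 2: 1079953, 3: 1332269, 4: 3058122, 5: 26511993 };

// Return the total number of ratings, the share that are 1-star,
// and the mean star rating implied by the histogram.
function summarizeHistogram(histogram) {
  const entries = Object.entries(histogram).map(([stars, count]) => [Number(stars), count]);
  const total = entries.reduce((sum, [, count]) => sum + count, 0);
  if (total === 0) return { total: 0, oneStarShare: null, mean: null };
  const weighted = entries.reduce((sum, [stars, count]) => sum + stars * count, 0);
  return {
    total,
    oneStarShare: (histogram['1'] ?? 0) / total,
    mean: weighted / total,
  };
}

const summary = summarizeHistogram(ratingHistogram);
console.log(`${(summary.oneStarShare * 100).toFixed(1)}% of ratings are 1★`);
console.log(`mean from histogram: ${summary.mean.toFixed(2)}`);
```

For the sample record the histogram mean rounds to the same 4.33 as the rating field, which is a cheap sanity check that both fields describe the same data.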
2. Scrape reviews (with pagination handled for you)
Switch mode to reviews and set maxReviewsPerApp. The actor paginates the reviews RPC, de-duplicates, and retries on throttling under the hood — you just say how many you want and how to sort them.
const reviewsRun = await client.actor('freshactors/google-play-scraper').call({
  mode: 'reviews',
  appIds: ['com.spotify.music'],
  maxReviewsPerApp: 500,
  reviewsSort: 'newest', // or 'mostHelpful' | 'rating'
  country: 'us',
});

const { items: reviews } = await client.dataset(reviewsRun.defaultDatasetId).listItems();

// Quick sentiment-ish slice: how many recent reviews are 1–2 stars?
const negative = reviews.filter((r) => r.rating <= 2);
console.log(`${negative.length}/${reviews.length} recent reviews are 1–2★`);
console.log(reviews[0]);
A review record:
{
  "_type": "review",
  "appId": "com.spotify.music",
  "reviewId": "7e1815f2-36d1-4187-a936-12f969747892",
  "userName": "Catherine Hempel",
  "rating": 5,
  "body": "The paid version is excellent but very expensive...",
  "thumbsUp": 14,
  "date": "2026-04-21T11:35:57.000Z",
  "appVersion": "9.1.40.1486",
  "developerResponse": null
}
The appVersion and developerResponse fields are handy: you can track whether complaints cluster around a specific release, and whether the developer actually replied.
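One way to check for that clustering is to group reviews by appVersion and compare the share of low ratings per release. This is a sketch: the function name negativeShareByVersion is hypothetical, the sample reviews are invented, and it only assumes that developerResponse is null (or absent) when there is no reply.

```javascript
// Group reviews by app version and report, per version, how many reviews
// were negative and how many received a developer reply.
function negativeShareByVersion(reviews, maxNegativeRating = 2) {
  const byVersion = new Map();
  for (const review of reviews) {
    const version = review.appVersion ?? 'unknown';
    const stats = byVersion.get(version) ?? { total: 0, negative: 0, replied: 0 };
    stats.total += 1;
    if (review.rating <= maxNegativeRating) stats.negative += 1;
    if (review.developerResponse != null) stats.replied += 1;
    byVersion.set(version, stats);
  }
  // Worst releases first.
  return [...byVersion.entries()]
    .map(([version, s]) => ({ version, ...s, negativeShare: s.negative / s.total }))
    .sort((a, b) => b.negativeShare - a.negativeShare);
}

// Invented sample data with the same field names as the review record.
const sampleReviews = [
  { appVersion: '1.0', rating: 1, developerResponse: null },
  { appVersion: '1.0', rating: 5, developerResponse: 'Thanks!' },
  { appVersion: '2.0', rating: 1, developerResponse: null },
  { rating: 4 },
];
const versionStats = negativeShareByVersion(sampleReviews);
console.table(versionStats);
```

With real data you would pass the reviews array from the run and probably ignore versions with only a handful of reviews, since a share computed over two or three reviews says little.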
3. Search the Play Store by keyword
If you don't know the package names yet — say you're researching a niche — use search mode. It searches Google Play for each term and returns full details for every matching app, tagged with the _searchTerm that surfaced it.
const searchRun = await client.actor('freshactors/google-play-scraper').call({
  mode: 'search',
  searchTerms: ['habit tracker', 'budget app'],
  maxSearchResults: 20,
  country: 'us',
});

const { items: found } = await client.dataset(searchRun.defaultDatasetId).listItems();

// Rank a niche by rating
found
  .sort((a, b) => (b.rating ?? 0) - (a.rating ?? 0))
  .slice(0, 5)
  .forEach((a) => console.log(`[${a._searchTerm}] ${a.title} — ${a.rating}★`));
That's an entire competitive map of a category — names, developers, ratings, install buckets, whether they're ad-supported — in one call.
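When you search several terms at once, ranking each niche separately is often more useful than one global list. The sketch below groups results by _searchTerm and keeps the top few per term. The function name topByTerm and the sample apps are hypothetical.

```javascript
// Group search results by the term that surfaced them and keep the
// n highest-rated apps per term. Does not mutate the input array.
function topByTerm(apps, n = 5) {
  const groups = new Map();
  for (const app of apps) {
    const term = app._searchTerm ?? 'unknown';
    if (!groups.has(term)) groups.set(term, []);
    groups.get(term).push(app);
  }
  const result = {};
  for (const [term, list] of groups) {
    result[term] = [...list]
      .sort((a, b) => (b.rating ?? 0) - (a.rating ?? 0))
      .slice(0, n);
  }
  return result;
}

// Invented sample data using the same field names as the search output.
const sampleApps = [
  { _searchTerm: 'habit tracker', title: 'Habit A', rating: 4.1 },
  { _searchTerm: 'habit tracker', title: 'Habit B', rating: 4.7 },
  { _searchTerm: 'budget app', title: 'Budget A', rating: 4.4 },
  { _searchTerm: 'budget app', title: 'Budget B', rating: null },
];
const ranked = topByTerm(sampleApps, 1);
console.log(ranked['habit tracker'][0].title); // Habit B
console.log(ranked['budget app'][0].title);    // Budget A
```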
Streaming large datasets
For big review pulls, don't load everything into memory. Iterate the dataset:
const dataset = client.dataset(reviewsRun.defaultDatasetId);
let offset = 0;
const limit = 1000;

while (true) {
  const { items } = await dataset.listItems({ offset, limit });
  if (items.length === 0) break;
  // process this page (write to DB, file, etc.)
  offset += items.length;
}
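If you page through datasets in more than one place, the same loop can be wrapped in an async generator so callers can use for await. This is a sketch: iterateItems is a hypothetical helper, and it only assumes an object with a listItems({ offset, limit }) method that resolves to { items }, as used in the loop above.

```javascript
// Yield dataset items one at a time while fetching them page by page,
// so only one page is held in memory at once.
async function* iterateItems(dataset, pageSize = 1000) {
  let offset = 0;
  while (true) {
    const { items } = await dataset.listItems({ offset, limit: pageSize });
    if (items.length === 0) return; // no more pages
    yield* items;
    offset += items.length;
  }
}

// Usage with the Apify client (not executed here):
//   for await (const review of iterateItems(client.dataset(reviewsRun.defaultDatasetId))) {
//     // write review to a DB, file, etc.
//   }
```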
What does this cost?
Pricing is pay-per-event, so you pay for data returned, not server time:
- App details: $0.002 per app → 100 apps = $0.20.
- Reviews: $0.0001 per review → a 1,000-review pull = $0.10.
No subscription, no proxy bill. A daily competitor-tracking job over a few dozen apps costs cents.
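For budgeting, the arithmetic above fits in a few lines. The two prices are the ones quoted in this section; the function name estimateCostUsd is hypothetical, and search results are left out because no per-result price is listed here.

```javascript
// Per-event prices quoted above, in USD. Check the actor's page for current values.
const PRICE_PER_APP_USD = 0.002;
const PRICE_PER_REVIEW_USD = 0.0001;

// Estimate the cost of a run from the number of app records and reviews returned.
function estimateCostUsd({ apps = 0, reviews = 0 }) {
  const cost = apps * PRICE_PER_APP_USD + reviews * PRICE_PER_REVIEW_USD;
  return Number(cost.toFixed(4)); // round away floating-point noise
}

console.log(estimateCostUsd({ apps: 100 }));     // 0.2
console.log(estimateCostUsd({ reviews: 1000 })); // 0.1

// A daily details job over 30 apps, for 30 days.
console.log(estimateCostUsd({ apps: 30 * 30 })); // 1.8
```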
No code? Run it from the UI
You don't strictly need the client. On the actor's page you can paste the same JSON input, click Start, and export the dataset as JSON, CSV, or Excel — or hit the dataset via its REST URL. The Node snippets above are just the programmable version of that.
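For the REST route, the sketch below builds the dataset items URL and downloads it as CSV. It assumes Apify's dataset items endpoint at api.apify.com/v2/datasets/{datasetId}/items with a format query parameter, bearer-token authentication, and Node 18+ for the global fetch. The function names are hypothetical.

```javascript
// Build the REST URL for a dataset's items in a given export format
// (for example 'json', 'csv', or 'xlsx').
function datasetItemsUrl(datasetId, format = 'json') {
  const url = new URL(`https://api.apify.com/v2/datasets/${encodeURIComponent(datasetId)}/items`);
  url.searchParams.set('format', format);
  return url.toString();
}

// Download a dataset as CSV text. The token goes in a header, not in the URL,
// so it does not end up in logs or shell history.
async function fetchDatasetCsv(datasetId, token) {
  const res = await fetch(datasetItemsUrl(datasetId, 'csv'), {
    headers: { Authorization: `Bearer ${token}` },
  });
  if (!res.ok) throw new Error(`Dataset request failed: ${res.status}`);
  return res.text();
}

// 'abc123' is a placeholder dataset ID.
console.log(datasetItemsUrl('abc123', 'csv'));
```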
Wrapping up
With one hosted actor you can pull Google Play app details, the ratings histogram, paginated reviews, and keyword search results as named JSON fields, with no Google API key, no proxy setup, and no positional-array archaeology. The actor is monitored by a daily canary and patched when Google changes its layout, so the pipeline you build today should keep returning data next month.
If you build something with it, or hit a package that behaves oddly, the Issues tab is the fastest way to reach a human.