What will be scraped
Full code
If you don't need an explanation, have a look at the full code example in the online IDE
import dotenv from "dotenv";
dotenv.config();
import { getJson } from "serpapi";
const getSearchParams = (searchType) => {
  const isProduct = searchType === "product";
  const reviewsLimit = 10; // hardcoded limit for demonstration purpose
  const engine = isProduct ? "apple_product" : "apple_reviews"; // search engine
  const params = {
    api_key: process.env.API_KEY, //your API key from serpapi.com
    product_id: "1507782672", // Parameter defines the ID of a product you want to get the reviews for
    country: "us", // Parameter defines the country to use for the search
    type: isProduct ? "app" : undefined, // Parameter defines the type of Apple Product to get the product page of
    page: isProduct ? undefined : 1, // Parameter is used to get the items on a specific page
    sort: isProduct ? undefined : "mostrecent", // Parameter is used for sorting reviews
  };
  return { engine, params, reviewsLimit };
};
const getProductInfo = async () => {
  const { engine, params } = getSearchParams("product");
  const json = await getJson(engine, params);
  delete json.search_metadata;
  delete json.search_parameters;
  delete json.search_information;
  return json;
};
const getReviews = async () => {
  const reviews = [];
  const { engine, params, reviewsLimit } = getSearchParams();
  while (true) {
    const json = await getJson(engine, params);
    if (json.reviews) {
      reviews.push(...json.reviews);
      params.page += 1;
    } else break;
    if (reviews.length >= reviewsLimit) break;
  }
  return reviews;
};
const getResults = async () => {
  return { productInfo: await getProductInfo(), reviews: await getReviews() };
};
getResults().then((result) => console.dir(result, { depth: null }));
Why use Apple Product Page Scraper and Apple App Store Reviews Scraper APIs from SerpApi?
Using an API generally solves all or most of the problems you might encounter when creating your own parser or crawler. From a web scraping perspective, our API can help solve the most painful problems:
- Bypass blocks from supported search engines by solving CAPTCHAs or avoiding IP blocks.
- No need to create a parser from scratch and maintain it.
- No need to pay for proxies and CAPTCHA solvers.
- No need to use browser automation when you need to extract large amounts of data faster.
Head to the Apple Product Page playground and Apple App Store Reviews playground for a live and interactive demo.
Preparation
First, we need to create a Node.js* project and add npm packages serpapi and dotenv.
To do this, in the directory with our project, open the command line and enter:
$ npm init -y
And then:
$ npm i serpapi dotenv
*If you don't have Node.js installed, you can download it from nodejs.org and follow the installation documentation.
The SerpApi package is used to scrape and parse search engine results using SerpApi. Get search results from Google, Bing, Baidu, Yandex, Yahoo, Home Depot, eBay, and more.
The dotenv package is a zero-dependency module that loads environment variables from a .env file into process.env.
Next, we need to add a top-level "type" field with a value of "module" to our package.json file to allow using ES6 modules in Node.js.
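As a minimal sketch, the package.json file might then look like the following. The name and version values are placeholders, and the dependencies section that npm generates is left out here because its version numbers depend on when you run the install command:

```json
{
  "name": "apple-product-scraper",
  "version": "1.0.0",
  "type": "module"
}
```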
That completes the setup of the Node.js environment for our project, so we can move on to the step-by-step code explanation.
Code explanation
First, we need to import dotenv from the dotenv library and call its config() method, then import getJson from the serpapi library:
import dotenv from "dotenv";
dotenv.config();
import { getJson } from "serpapi";
- config() will read your .env file, parse the contents, assign it to process.env, and return an Object with a parsed key containing the loaded content or an error key if it failed.
- getJson() allows you to get a JSON response based on search parameters.
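Since the code reads the key from process.env.API_KEY, the .env file is expected to contain a line such as API_KEY=your_key. If that line is missing, the value is simply undefined and the problem only shows up when the request is made. The sketch below shows one way to fail earlier. The requireEnv helper is hypothetical (it is not part of dotenv or serpapi), and it takes the environment object as a second argument so it can be tried without a real .env file:

```javascript
// Hypothetical helper (not part of dotenv or serpapi): fail fast when a
// required environment variable is missing or empty.
const requireEnv = (name, env = process.env) => {
  const value = env[name];
  if (value === undefined || value === "") {
    throw new Error(`Missing environment variable: ${name}`);
  }
  return value;
};

// A plain object stands in for process.env in this example
console.log(requireEnv("API_KEY", { API_KEY: "demo-key" })); // prints "demo-key"
```

In the real script you would call requireEnv("API_KEY") after dotenv.config() and pass the result as api_key.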
Next, we write the getSearchParams function, which builds the search parameters for the two different APIs. In this function, we set the isProduct constant depending on the searchType argument.
Then we define and return the search parameters for the Product Page API and the Reviews API: the search engine, how many reviews we want to receive (the reviewsLimit constant), and the search parameters for making a request:
const getSearchParams = (searchType) => {
  const isProduct = searchType === "product";
  const reviewsLimit = 10; // hardcoded limit for demonstration purpose
  const engine = isProduct ? "apple_product" : "apple_reviews"; // search engine
  const params = {
    api_key: process.env.API_KEY, //your API key from serpapi.com
    product_id: "1507782672", // Parameter defines the ID of a product you want to get the reviews for
    country: "us", // Parameter defines the country to use for the search
    type: isProduct ? "app" : undefined, // Parameter defines the type of Apple Product to get the product page of
    page: isProduct ? undefined : 1, // Parameter is used to get the items on a specific page
    sort: isProduct ? undefined : "mostrecent", // Parameter is used for sorting reviews
  };
  return { engine, params, reviewsLimit };
};
When we run this function, we receive search parameters for either the Product Page API or the Reviews API, depending on the searchType argument.
You can use the following search params:
Common params:
- api_key parameter defines the SerpApi private key to use.
- product_id parameter defines the ID of a product you want to get the reviews for. You can get the ID of a product from our Web scraping Apple App Store Search with Nodejs blog post. You can also get it from the URL of the app. For example, the product_id of "https://apps.apple.com/us/app/the-great-coffee-app/id534220544" is the long numerical value that comes after "id": 534220544.
- country parameter defines the country to use for the search. It's a two-letter country code (e.g., us (default) for the United States, uk for United Kingdom, or fr for France). Head to the Apple Regions for a full list of supported Apple Regions.
- no_cache parameter will force SerpApi to fetch the App Store Search results even if a cached version is already present. A cache is served only if the query and all parameters are exactly the same. Cache expires after 1h. Cached searches are free, and are not counted towards your searches per month. It can be set to false (default) to allow results from the cache, or true to disallow results from the cache. no_cache and async parameters should not be used together.
- async parameter defines the way you want to submit your search to SerpApi. It can be set to false (default) to open an HTTP connection and keep it open until you get your search results, or true to just submit your search to SerpApi and retrieve the results later. In this case, you'll need to use our Searches Archive API to retrieve your results. async and no_cache parameters should not be used together. async should not be used on accounts with Ludicrous Speed enabled.
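Reading the product_id from an app URL, as described above, can also be done in code. This is only a sketch: extractProductId is a hypothetical helper, and the pattern assumes the URL path contains "/app/" followed by an optional app name and then "id" plus digits, as in the example URL.

```javascript
// Hypothetical helper: pull the numeric product ID out of an App Store app URL.
// Returns null when the path does not look like .../app/<name>/id<digits>.
const extractProductId = (appUrl) => {
  // new URL() throws on strings that are not valid absolute URLs
  const { pathname } = new URL(appUrl);
  const match = pathname.match(/\/app\/(?:[^/]+\/)?id(\d+)\/?$/);
  return match ? match[1] : null;
};

console.log(
  extractProductId("https://apps.apple.com/us/app/the-great-coffee-app/id534220544")
); // prints "534220544"
```

The ID is returned as a string, which matches how product_id is written in the params object.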
Product Page params:
- type parameter defines the type of Apple Product to get the product page of. It defaults to app.
Reviews params:
- page parameter is used to get the items on a specific page (e.g., 1 (default) is the first page of results, 2 is the 2nd page of results, 3 is the 3rd page of results, etc.).
- sort parameter is used for sorting reviews. It can be set to: mostrecent (Most recent (default)) or mosthelpful (Most helpful).
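The three parameter groups above (common, Product Page, Reviews) can also be kept separate in code. The sketch below is a hypothetical variant of getSearchParams, not the version used in this post: it spreads the common parameters into each result, so the returned objects contain only the keys that apply to the chosen API and no keys set to undefined. Passing the API key as an argument is an assumption made here to keep the example runnable without a .env file.

```javascript
// Hypothetical variant of getSearchParams that never creates undefined keys.
const buildSearchParams = (searchType, apiKey) => {
  // Parameters shared by both APIs
  const common = { api_key: apiKey, product_id: "1507782672", country: "us" };
  if (searchType === "product") {
    return { engine: "apple_product", params: { ...common, type: "app" } };
  }
  return {
    engine: "apple_reviews",
    params: { ...common, page: 1, sort: "mostrecent" },
  };
};

console.log(Object.keys(buildSearchParams("product", "demo-key").params));
// prints [ 'api_key', 'product_id', 'country', 'type' ]
console.log(Object.keys(buildSearchParams("reviews", "demo-key").params));
// prints [ 'api_key', 'product_id', 'country', 'page', 'sort' ]
```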
Next, we declare the getProductInfo function, which gets all product information from the page and returns it. In this function, we call getSearchParams with the "product" argument and destructure engine and params from the result. Then we get the JSON with results, delete the keys we don't need, and return it:
const getProductInfo = async () => {
  const { engine, params } = getSearchParams("product");
  const json = await getJson(engine, params);
  delete json.search_metadata;
  delete json.search_parameters;
  delete json.search_information;
  return json;
};
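If you would rather not modify the response object in place, the same three keys can be dropped with object destructuring. This is an alternative sketch, not the approach used in this post. The stripMetadata name and the sample object are made up for illustration:

```javascript
// Hypothetical helper: return a copy of the response without the three
// search_* keys, leaving the original object untouched.
const stripMetadata = ({
  search_metadata,
  search_parameters,
  search_information,
  ...rest
}) => rest;

// Made-up response shaped like the keys mentioned above
const sampleResponse = {
  search_metadata: { status: "Success" },
  search_parameters: { engine: "apple_product" },
  search_information: {},
  title: "Pixea",
  rating: 4.6,
};

console.log(stripMetadata(sampleResponse)); // prints { title: 'Pixea', rating: 4.6 }
```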
Next, we declare the getReviews function, which gets review results page by page (using pagination) and returns them:
const getReviews = async () => {
...
};
In this function, we declare an empty reviews array, then call getSearchParams without arguments and destructure engine, params and reviewsLimit from the result. Then, in a while loop, we get the JSON with results, add the reviews from each page, and set the next page index (the params.page value).
If there are no more results on the page, or if the number of received results reaches reviewsLimit, we stop the loop (using break) and return an array with results:
const reviews = [];
const { engine, params, reviewsLimit } = getSearchParams();
while (true) {
  const json = await getJson(engine, params);
  if (json.reviews) {
    reviews.push(...json.reviews);
    params.page += 1;
  } else break;
  if (reviews.length >= reviewsLimit) break;
}
return reviews;
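Because whole pages are pushed into the array, the loop above can return more items than reviewsLimit. The sketch below shows the same pagination idea with two additions that are assumptions of this example and not part of the original code: the result is trimmed to the limit, and a maxPages guard caps the number of requests. The fetchPage argument is a hypothetical stand-in for getJson(engine, params), and fakeFetchPage is made-up data so the loop can be run without an API key:

```javascript
// Hypothetical helper: fetchPage(page) stands in for getJson(engine, params).
const collectReviews = async (fetchPage, limit, maxPages = 50) => {
  const reviews = [];
  for (let page = 1; page <= maxPages && reviews.length < limit; page++) {
    const json = await fetchPage(page);
    // Stop when a page has no "reviews" key or an empty list
    if (!json.reviews || json.reviews.length === 0) break;
    reviews.push(...json.reviews);
  }
  // Trim any extra items from the last page
  return reviews.slice(0, limit);
};

// Fake data source: 3 pages with 4 reviews each, then no "reviews" key
const fakeFetchPage = async (page) =>
  page <= 3
    ? {
        reviews: Array.from({ length: 4 }, (_, i) => ({
          position: (page - 1) * 4 + i + 1,
        })),
      }
    : {};

collectReviews(fakeFetchPage, 10).then((reviews) =>
  console.log(reviews.length)
); // prints 10
```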
And finally, we declare and run the getResults function, in which we make an object with results from getProductInfo and getReviews functions. Then we print all the received information in the console with the console.dir method, which allows you to use an object with the necessary parameters to change default output options:
const getResults = async () => {
  return { productInfo: await getProductInfo(), reviews: await getReviews() };
};
getResults().then((result) => console.dir(result, { depth: null }));
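In getResults, the reviews request starts only after the product request has finished. Since the two calls don't depend on each other, they could also be started together with Promise.all. The sketch below illustrates that idea with made-up stand-in functions instead of real API calls. Whether running requests in parallel is a good idea for your account is something to check against your own plan limits. The .catch handler is also an addition of this example:

```javascript
// Stand-ins for getProductInfo and getReviews so the sketch runs without an API key
const fakeGetProductInfo = async () => ({ title: "Pixea" });
const fakeGetReviews = async () => [{ position: 1 }];

// Hypothetical variant of getResults that takes the two loaders as arguments
const getResultsConcurrently = async (loadProduct, loadReviews) => {
  // Both calls are started before either result is awaited
  const [productInfo, reviews] = await Promise.all([
    loadProduct(),
    loadReviews(),
  ]);
  return { productInfo, reviews };
};

getResultsConcurrently(fakeGetProductInfo, fakeGetReviews)
  .then((result) => console.dir(result, { depth: null }))
  .catch((error) => console.error("Request failed:", error.message));
```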
Output
{
"productInfo":{
"title":"Pixea",
"snippet":"The invisible image viewer",
"id":"1507782672",
"age_rating":"4+",
"developer":{
"name":"ImageTasks Inc",
"link":"https://apps.apple.com/us/developer/imagetasks-inc/id450316587"
},
"rating":4.6,
"rating_count":"594 Ratings",
"price":"Free",
"logo":"https://is3-ssl.mzstatic.com/image/thumb/Purple118/v4/f6/93/b6/f693b68f-9b14-3689-7521-c19a83fb0d88/AppIcon-1x_U007emarketing-85-220-6.png/320x0w.webp",
"mac_screenshots":[
"https://is3-ssl.mzstatic.com/image/thumb/PurpleSource124/v4/b1/8c/fb/b18cfb80-cb5c-d67d-2edc-ee1f6666e012/35b8d5a7-b493-4a80-bdbd-3e9d564601dd_Pixea-1.jpg/643x0w.webp",
"https://is1-ssl.mzstatic.com/image/thumb/PurpleSource124/v4/96/08/83/9608834d-3d2b-5c0b-570c-f022407ff5cc/1836573e-1b6a-421c-b654-6ae2f915d755_Pixea-2.jpg/643x0w.webp",
"https://is1-ssl.mzstatic.com/image/thumb/PurpleSource124/v4/58/fd/db/58fddb5d-9480-2536-8679-92d6b067d285/98e22b63-1575-4ee6-b08d-343b9e0474ea_Pixea-3.jpg/643x0w.webp",
"https://is2-ssl.mzstatic.com/image/thumb/PurpleSource124/v4/c3/f3/f3/c3f3f3b5-deb0-4b58-4afc-79073373b7b9/28f51f38-bc59-4a61-a5a1-bff553838267_Pixea-4.jpg/643x0w.webp"
],
"description":"Pixea is an image viewer for macOS with a nice minimal modern user interface. Pixea works great with JPEG, HEIC, PSD, RAW, WEBP, PNG, GIF, and many other formats. Provides basic image processing, including flip and rotate, shows a color histogram, EXIF, and other information. Supports keyboard shortcuts and trackpad gestures. Shows images inside archives, without extracting them.Supported formats:JPEG, HEIC, GIF, PNG, TIFF, Photoshop (PSD), BMP, Fax images, macOS and Windows icons, Radiance images, Google's WebP. RAW formats: Leica DNG and RAW, Sony ARW, Olympus ORF, Minolta MRW, Nikon NEF, Fuji RAF, Canon CR2 and CRW, Hasselblad 3FR. Sketch files (preview only). ZIP-archives.Export formats:JPEG, JPEG-2000, PNG, TIFF, BMP.Found a bug? Have a suggestion? Please, send it to support@imagetasks.comFollow us on Twitter @imagetasks!",
"version_history":[
{
"release_version":"1.4",
"release_notes":"- New icon- macOS Big Sur support- Universal Binary- Bug fixes and improvements",
"release_date":"2020-11-09"
},
... and other versions
],
"ratings_and_reviews":{
"rating_percentage":{
"5_star":"76%",
"4_star":"14%",
"3_star":"4%",
"2_star":"2%",
"1_star":"3%"
},
"review_examples":[
{
"rating":"5 out of 5",
"username":"MyrtleBlink182",
"review_date":"01/18/2022",
"review_title":"Full-Screen Perfection",
"review_text":"This photo-viewer is by far the best in the biz. I thoroughly enjoy viewing photos with it. I tried a couple of others out, but this one is exactly what I was looking for. There is no dead space or any extra design baggage when viewing photos. Pixea knocks it out of the park keeping the design minimalistic while ensuring the functionality is through the roof"
},
... and other reviews examples
]
},
"privacy":{
"description":"The developer, ImageTasks Inc, indicated that the app’s privacy practices may include handling of data as described below. For more information, see the developer’s privacy policy.",
"privacy_policy_link":"https://www.imagetasks.com/Pixea-policy.txt",
"cards":[
{
"title":"Data Not Collected",
"description":"The developer does not collect any data from this app."
}
],
"sidenote":"Privacy practices may vary, for example, based on the features you use or your age. Learn More",
"learn_more_link":"https://apps.apple.com/story/id1538632801"
},
"information":{
"seller":"ImageTasks Inc",
"price":"Free",
"size":"5.8 MB",
"categories":[
"Photo & Video"
],
"compatibility":[
{
"device":"Mac",
"requirement":"Requires macOS 10.12 or later."
}
],
"supported_languages":[
"English"
],
"age_rating":{
"rating":"4+"
},
"copyright":"Copyright © 2020 Andrey Tsarkov. All rights reserved.",
"developer_website":"https://www.imagetasks.com",
"app_support_link":"https://www.imagetasks.com/pixea",
"privacy_policy_link":"https://www.imagetasks.com/Pixea-policy.txt"
},
"more_by_this_developer":{
"apps":[
{
"logo":"https://is3-ssl.mzstatic.com/image/thumb/Purple118/v4/f6/93/b6/f693b68f-9b14-3689-7521-c19a83fb0d88/AppIcon-1x_U007emarketing-85-220-6.png/320x0w.webp",
"link":"https://apps.apple.com/us/app/istatistica/id1126874522",
"serpapi_link":"https://serpapi.com/search.json?country=us&engine=apple_product&product_id=1507782672&type=app",
"name":"iStatistica",
"category":"Utilities"
},
... and other apps
],
"result_type":"Full",
"see_all_link":"https://apps.apple.com/us/app/id1507782672#see-all/developer-other-apps"
}
},
"reviews":[
{
"position":1,
"id":"9332275235",
"title":"Doesn't respect aspect ratios",
"text":"Seemingly no way to maintain the aspect ratio of an image. It always wants to fill the photo to the window size, no matter what sizing options you pick. How useless is that?",
"rating":3,
"review_date":"2022-11-26 13:29:43 UTC",
"author":{
"name":"soren121",
"link":"https://itunes.apple.com/us/reviews/id33706024"
}
},
... and other reviews
]
}
Links
- Code in the online IDE
- Apple Product Page Scraper API documentation
- Apple Product Page playground
- Apple App Store Reviews Scraper API documentation
- Apple App Store Reviews playground
If you want other functionality added to this blog post or if you want to see some projects made with SerpApi, write me a message.
Add a Feature Request💫 or a Bug🐞



