Product reviews are gold for market research. Here is how to extract them.
Review Data Structure
Most review systems return:
- Rating (1-5 stars)
- Review text
- Author name
- Date posted
- Verified purchase flag
- Helpful votes
Method 1: JSON-LD (Easiest)
Many e-commerce sites embed reviews as structured data:
const cheerio = require("cheerio");
async function getReviewsFromJsonLD(url) {
const res = await fetch(url);
const $ = cheerio.load(await res.text());
const scripts = $("script[type=application/ld+json]");
for (const script of scripts) {
const data = JSON.parse($(script).html());
if (data.review || data["@type"] === "Product") {
return {
name: data.name,
rating: data.aggregateRating?.ratingValue,
reviewCount: data.aggregateRating?.reviewCount,
reviews: (data.review || []).map(r => ({
rating: r.reviewRating?.ratingValue,
text: r.reviewBody,
author: r.author?.name,
date: r.datePublished
}))
};
}
}
}
Method 2: API Interception
// Playwright intercepts review API calls
const reviews = [];
page.on("response", async (res) => {
if (res.url().includes("review") && res.status() === 200) {
try { reviews.push(await res.json()); } catch(e) {}
}
});
await page.goto(productUrl);
Platform-Specific Guides
Use Cases
- Competitor sentiment analysis
- Product improvement insights
- Market research
- Content generation
- Customer pain point discovery
Need product reviews extracted? Amazon, Trustpilot, G2, Capterra — $20. Email: Spinov001@gmail.com | Hire me
Top comments (0)