Scraping Amazon can help you monitor prices, track reviews, and analyze product listings—but you must do it responsibly and within the site’s Terms of Service. If you’re exploring a starter approach, this GitHub project is a handy reference: amazon-scraper-python.
Quick Overview
At a high level, you’ll:
Send an HTTP request with realistic headers.
Parse the HTML to extract product data (title, price, rating, reviews).
Handle pagination and anti-bot measures (rotating user agents/proxies).
Store results (CSV/JSON/DB).
The sample structure in the repo (amazon-scraper-python) illustrates these steps with clean, beginner-friendly code.
Minimal Example (Educational Use Only)
from urllib.parse import urljoin

import requests
from bs4 import BeautifulSoup

BASE_URL = "https://www.amazon.com"

# Browser-like headers; the default requests User-Agent is more likely to be blocked.
headers = {
    "User-Agent": "Mozilla/5.0",
    "Accept-Language": "en-US,en;q=0.9",
}
url = f"{BASE_URL}/s?k=wireless+earbuds"
resp = requests.get(url, headers=headers, timeout=20)
resp.raise_for_status()
soup = BeautifulSoup(resp.text, "html.parser")

items = []
# Each search result card is a div with this data-component-type attribute.
for card in soup.select("div[data-component-type='s-search-result']"):
    title = card.select_one("h2 a span")
    price = card.select_one(".a-price .a-offscreen")  # full price text, e.g. "$29.99"
    rating = card.select_one("i span.a-icon-alt")
    link = card.select_one("h2 a")
    # Any field may be missing on a given card, so fall back to None.
    items.append({
        "title": title.get_text(strip=True) if title else None,
        "price": price.get_text(strip=True) if price else None,
        "rating": rating.get_text(strip=True) if rating else None,
        # urljoin handles both relative and absolute hrefs.
        "url": urljoin(BASE_URL, link["href"]) if link and link.get("href") else None,
    })

print(items[:5])
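The minimal example only prints its results. Step 4 of the overview (storing results) could look roughly like the sketch below. The helper names `save_json` and `save_csv` and the output file names are illustrative assumptions, not taken from the repository; the field names match the dictionaries built in the example above.

```python
import csv
import json

# Field names match the dictionaries built in the minimal example.
FIELDS = ["title", "price", "rating", "url"]


def save_json(items, path):
    """Write the list of item dicts as pretty-printed UTF-8 JSON."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(items, f, ensure_ascii=False, indent=2)


def save_csv(items, path):
    """Write one row per item; None values become empty cells."""
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=FIELDS, extrasaction="ignore")
        writer.writeheader()
        writer.writerows(items)
```

After a scrape, calling `save_json(items, "items.json")` or `save_csv(items, "items.csv")` would persist the list. JSON keeps `None` as `null`, while CSV is easier to open in a spreadsheet.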
For a fuller implementation with utilities, see the examples in this repository: https://github.com/maivyly52-gif/amazon-scraper-python
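Step 3 of the overview mentions pagination. One way to sketch it is to generate one search URL per page and fetch them in turn. This assumes the search endpoint accepts a 1-based `page` query parameter next to the `k` keyword parameter used in the example URL above. That assumption is not confirmed by this article, and `search_urls` is a hypothetical helper name.

```python
from urllib.parse import urlencode

BASE = "https://www.amazon.com/s"


def search_urls(keyword, max_pages=3):
    """Yield search URLs for pages 1..max_pages.

    Assumption: `k` is the keyword parameter (as in the example URL)
    and `page` selects the 1-based results page.
    """
    for page in range(1, max_pages + 1):
        yield f"{BASE}?{urlencode({'k': keyword, 'page': page})}"


for url in search_urls("wireless earbuds", max_pages=2):
    print(url)
# https://www.amazon.com/s?k=wireless+earbuds&page=1
# https://www.amazon.com/s?k=wireless+earbuds&page=2
```

Keeping `max_pages` small and pausing between pages helps keep the request rate low.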
Practical Tips to Reduce Blocks
Rotate headers & delays: Randomize User-Agent and add human-like waits.
Use residential/mobile proxies: Avoid sending many requests from a single IP.
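Retry logic & backoff: Retry 503 responses and robot-check pages with increasing delays instead of failing outright or hammering the site.
Expect CSS selectors to change: Keep selectors resilient; prefer stable attributes such as data-component-type over styling classes.
Respect robots.txt & ToS: Prefer official APIs where possible.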
The project at https://github.com/maivyly52-gif/amazon-scraper-python shows a sensible baseline you can extend with rotating proxies, session handling, and structured outputs.
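The header rotation and backoff tips above could be combined roughly as follows. This is a sketch under stated assumptions, not the repository's implementation: the User-Agent strings are shortened placeholders, the "robot check" marker text is a guess at what a block page contains, and `fetch_with_retry`, `backoff_delay`, and `looks_blocked` are hypothetical names. The HTTP call is passed in as `fetch` so the retry logic does not depend on a particular library.

```python
import random
import time

# Placeholder User-Agent pool (assumption); substitute current, complete strings.
USER_AGENTS = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64)",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7)",
    "Mozilla/5.0 (X11; Linux x86_64)",
]


def backoff_delay(attempt, base=2.0, cap=60.0):
    """Exponential backoff with full jitter: a random wait in [0, min(cap, base * 2**attempt)]."""
    return random.uniform(0, min(cap, base * 2 ** attempt))


def looks_blocked(status_code, text):
    """Heuristic: HTTP 503, or a page containing the assumed marker text."""
    return status_code == 503 or "robot check" in text.lower()


def fetch_with_retry(fetch, url, max_attempts=4, sleep=time.sleep):
    """Call fetch(url, headers) -> (status_code, text) until the response is not blocked.

    Returns the page text, or None if every attempt looked blocked.
    """
    for attempt in range(max_attempts):
        headers = {
            "User-Agent": random.choice(USER_AGENTS),  # rotate per attempt
            "Accept-Language": "en-US,en;q=0.9",
        }
        status, text = fetch(url, headers)
        if not looks_blocked(status, text):
            return text
        if attempt < max_attempts - 1:
            sleep(backoff_delay(attempt))  # wait longer after each failure
    return None
```

With requests, `fetch` could be a small function that calls `requests.get(url, headers=headers, timeout=20)` and returns `(resp.status_code, resp.text)`. Proxy rotation would slot into that same function.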
What Data Can You Extract?
Product title, ASIN, price, availability
Rating, review count
Seller info and badges
Image URLs and variant options
(Which fields are present varies by page and locale, so validate selectors regularly. See examples in https://github.com/maivyly52-gif/amazon-scraper-python)
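Scraped values arrive as display strings, so a normalization step is usually needed before analysis. The sketch below shows one possible approach. It assumes US-style price formatting ("$1,299.99"), rating text of the form "4.5 out of 5 stars", and product URLs that contain the ASIN after `/dp/` or `/gp/product/`. The function names are hypothetical and the ASIN in the usage lines is a made-up placeholder.

```python
import re


def parse_price(text):
    """Turn a price string such as "$1,299.99" into a float; None if no number is found."""
    if not text:
        return None
    match = re.search(r"\d[\d,]*(?:\.\d+)?", text)
    return float(match.group().replace(",", "")) if match else None


def parse_rating(text):
    """Take the leading number from text like "4.5 out of 5 stars"."""
    if not text:
        return None
    match = re.match(r"\s*(\d+(?:\.\d+)?)", text)
    return float(match.group(1)) if match else None


def extract_asin(url):
    """Pull a 10-character ASIN from a /dp/ or /gp/product/ path (assumed URL shapes)."""
    if not url:
        return None
    match = re.search(r"/(?:dp|gp/product)/([A-Z0-9]{10})(?:[/?]|$)", url)
    return match.group(1) if match else None


print(parse_price("$1,299.99"))            # 1299.99
print(parse_rating("4.5 out of 5 stars"))  # 4.5
print(extract_asin("https://www.amazon.com/dp/B000000000/ref=sr_1_1"))  # B000000000
```

Each helper returns `None` instead of raising when a field is missing, which matches how the minimal example treats absent elements. Other locales use different decimal separators and would need their own price parsing.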
Legal & Ethical Notes
Check Amazon’s Terms of Service and your local laws.
Use data you’re authorized to access; avoid personal data.
Cache results and keep request rates low.
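The last point, caching results and keeping request rates low, can be sketched as a small wrapper around whatever function fetches a page. `PoliteFetcher` is a hypothetical name, and the 5-second interval and one-hour cache lifetime are arbitrary illustrative defaults, not recommended values from any official source.

```python
import time


class PoliteFetcher:
    """Wrap a fetch callable with a minimum delay between requests and an in-memory TTL cache."""

    def __init__(self, fetch, min_interval=5.0, ttl=3600.0,
                 clock=time.monotonic, sleep=time.sleep):
        self.fetch = fetch                # callable: url -> page text
        self.min_interval = min_interval  # seconds between real requests
        self.ttl = ttl                    # seconds a cached page stays fresh
        self.clock = clock
        self.sleep = sleep
        self._cache = {}                  # url -> (fetched_at, text)
        self._last_request = None

    def get(self, url):
        now = self.clock()
        cached = self._cache.get(url)
        if cached and now - cached[0] < self.ttl:
            return cached[1]              # fresh cache hit: no request sent
        if self._last_request is not None:
            wait = self.min_interval - (now - self._last_request)
            if wait > 0:
                self.sleep(wait)          # throttle before the next real request
        text = self.fetch(url)
        self._last_request = self.clock()
        self._cache[url] = (self._last_request, text)
        return text
```

Repeated calls to `get` for the same URL within the cache lifetime return the stored page without touching the network. The cache lives only in memory, so a longer-running project would likely persist it to disk or a database.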
Ready to try it? Explore the code, run examples, and adapt it for your use case here: https://github.com/maivyly52-gif/amazon-scraper-python