Amazon is one of the most valuable datasets on the internet. Real-time prices, competitor stock levels, customer reviews, ratings trends, historical price changes — all of it is there, waiting to be harvested. The problem: Amazon knows this and has spent billions making scraping as hard as possible.
If you've ever tried to scrape Amazon with requests and BeautifulSoup, you know the story. You send one request, get a 503 error. Send two, get blocked for an hour. Try rotating user agents? Now you hit a CAPTCHA. This isn't an accident — Amazon employs sophisticated bot detection: IP reputation scoring, JavaScript-rendered pages, and behavioral fingerprinting. Scraping Amazon in 2026 isn't like scraping Reddit. It's a real arms race.
But it's not impossible. This guide covers five practical methods: the official API (limited but safe), the naive Python approach (instructive but fragile), dedicated scraping APIs (expensive but reliable), pre-built Apify actors (fastest to deploy), and hybrid approaches for price monitoring at scale.
Why Scrape Amazon (And What You'll Actually Get)
Before we dive into methods, let's be honest about what you can realistically extract and why it matters:
Price Monitoring: Track competitor pricing on identical products. Most valuable for e-commerce businesses undercutting larger sellers or aggregators watching market shifts.
Historical Price Data: Amazon doesn't publicly expose price history, but by polling every 6-12 hours you can build your own dataset. Valuable for research, market analysis, or arbitrage strategies.
Competitor Analysis: Track which products your competitors are selling, review counts, ratings, stock status. Less dense than price data but useful for strategic decisions.
Product Discovery: Identify trending products, new categories, high-volume ASIN clusters. Good for dropshipping research, market validation, or content creation (e.g., "Best budget X products").
Review Sentiment: Scrape review text and ratings (though Amazon aggressively obfuscates review pages). Most scrapers focus on metadata (count, average rating) rather than full text.
Stock/Availability: Real-time "In Stock" vs "Out of Stock" signals. Useful for arbitrage or identifying sudden demand spikes.
What you won't get: customer personal data, private seller information, or anything behind authentication. Amazon's anti-scraping is primarily about protecting merchant data and controlling access to their search/recommendation algorithms.
Amazon's Anti-Scraping Arsenal (What You're Up Against)
Amazon doesn't just block scrapers — it fingerprints them. Here's the defensive stack:
1. User-Agent Filtering: Generic curl or requests user agents are immediately flagged. Amazon expects a real browser.
2. Behavioral Analysis: Amazon tracks:
- Request patterns (too many requests to the same product in a short window)
- Scroll velocity (are you hovering on product images like a human?)
- Mouse movement (yes, JavaScript can detect if you move the mouse at realistic speeds)
- Timing between clicks (robots are too consistent; humans waver)
3. JavaScript Rendering: Modern Amazon product pages load prices, stock status, and reviews via JavaScript. A simple HTTP GET won't capture dynamic data.
4. CAPTCHA and Challenges: If Amazon suspects a bot, you hit a CAPTCHA. After 3-5 failures, your IP gets temporarily blocked (24-72 hours).
5. IP Reputation: Amazon maintains a database of known proxy/VPN IP ranges. Cheap residential proxies are often burned (already flagged). Premium proxy providers (Bright Data, Luminati) rotate to avoid this, but are expensive.
6. Rate Limiting: Even with perfect headers, hammering Amazon with 100s of requests per minute will trigger IP-level throttling.
7. Cookies and Session Management: Amazon uses cookies to detect scraper patterns. Rotating cookies or failing to maintain session state flags your client as a bot.
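Points 2 and 6 above boil down to pacing: a fixed `time.sleep(3)` loop is trivially fingerprintable, while humans waver. A minimal sketch of randomized delays (the helper name and the base/spread values are illustrative, not Amazon-derived thresholds):

```python
import random

def human_delay(base=3.0, spread=1.2, floor=0.8):
    """Return a randomized wait time in seconds.

    Sampling from a distribution instead of sleeping a fixed interval
    avoids the metronome-like timing that behavioral analysis flags.
    """
    delay = random.gauss(base, spread)  # jitter around the base interval
    return max(floor, delay)           # never go faster than the floor

# Each call yields a different wait, unlike a constant sleep(3)
waits = [round(human_delay(), 2) for _ in range(5)]
```

This doesn't defeat fingerprinting on its own, but it removes the easiest timing signal to detect.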
The upshot: naive scraping (plain requests) will fail immediately. You need to either:
- Use Amazon's official API (limited scope, safe)
- Use a dedicated scraping service that handles the above (costs money, extremely reliable)
- Use pre-built actors (Apify, same reliability, better for teams)
- Accept that you'll get some blocked requests and build error handling (fragile, research-only)
Method 1: Amazon Product Advertising API (Official, Limited)
Amazon's official API is the safest, most legal option. But it's also the most constrained.
What It Does
The Product Advertising API (PA-API 5.0, offered through the Amazon Associates program) lets you:
- Search for products by keyword
- Get product details (ASIN, title, price, ratings, image URLs)
- Track price changes over time (if you poll regularly)
- Get links for monetization (Amazon Associates affiliates)
What It Doesn't Do
- Full historical price data (no price history endpoint)
- Review text or detailed review metadata
- Real-time stock tracking (only on some items)
- Competitor seller information
- Search ranking or trending products
- Access to private seller metrics
Setup and Requirements
- Sign up for Amazon Associates: Go to https://associates.amazon.com/
- Verify your account: Takes 1-3 days. Amazon will ask where you plan to send traffic.
- Get API credentials: In your Associates dashboard, request API access. You'll get:
  - Access Key ID
  - Secret Access Key
  - Tracking ID (for affiliate links)
- Install the SDK:
```bash
pip install amazon-product-advertising-api
```
Code Example: Search Products
```python
from amazon_product_advertising_api import get_api

# Initialize the API client
api = get_api(
    access_key='YOUR_ACCESS_KEY',
    secret_key='YOUR_SECRET_KEY',
    partner_tag='YOUR_TRACKING_ID',
    region='US'  # or 'GB', 'DE', 'FR', 'JP', etc.
)

def search_products(keyword, max_results=10):
    """Search Amazon for products by keyword"""
    try:
        results = api.search_items(
            keywords=keyword,
            resources=['ItemInfo.Title', 'ItemInfo.ByLineInfo',
                       'Offers.Summaries.Price', 'CustomerReviews.StarRating']
        )
        products = []
        for item in results['SearchResult']['Items'][:max_results]:
            # Extract product data
            product = {
                'asin': item['ASIN'],
                'title': item['ItemInfo']['Title']['DisplayValue'],
                'price': None,
                'currency': None,
                'rating': None,
                'reviews': None,
                'url': item.get('DetailPageURL', '')
            }

            # Price (if available)
            if 'Offers' in item and 'Summaries' in item['Offers']:
                price_obj = item['Offers']['Summaries'][0]['Price']
                product['price'] = float(price_obj.get('Amount', 0))
                product['currency'] = price_obj.get('Currency', 'USD')

            # Rating
            if 'CustomerReviews' in item:
                rating = item['CustomerReviews'].get('StarRating', {}).get('DisplayValue')
                product['rating'] = float(rating) if rating else None

            products.append(product)
        return products
    except Exception as e:
        print(f"Error: {e}")
        return []

# Example usage
products = search_products('wireless headphones', max_results=5)
for p in products:
    print(p['title'][:60])
    print(f"  ASIN: {p['asin']}")
    print(f"  Price: ${p['price']} {p['currency']}" if p['price'] else "  Price: N/A")
    print(f"  Rating: {p['rating']}/5.0" if p['rating'] else "  Rating: N/A")
    print()
```
Limits & Reality Check
- Rate limit: 1 request per second by default; exceed it and the API returns throttling errors
- Quota: roughly 8,640 requests/day at the default rate; Amazon raises both limits as your affiliate revenue grows
- Cost: Free (no charge per API call; you earn money through affiliate commissions)
- Data freshness: Prices update hourly, but you're dependent on Amazon's API cache
- Search scope: Limited to products Amazon's search index covers; some products excluded
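Given the 1 request/second ceiling, it's cheaper to throttle client-side than to let Amazon throttle you. A minimal sketch (the `Throttle` class is my own helper, not part of any SDK, and isn't thread-safe):

```python
import time

class Throttle:
    """Enforce at most one call per `interval` seconds."""

    def __init__(self, interval=1.0):
        self.interval = interval
        self._last = 0.0

    def wait(self):
        # Sleep just long enough to keep `interval` between consecutive calls
        remaining = self._last + self.interval - time.monotonic()
        if remaining > 0:
            time.sleep(remaining)
        self._last = time.monotonic()

# Usage: call throttle.wait() before each API request
# throttle = Throttle(interval=1.0)
# for keyword in ['headphones', 'keyboards']:
#     throttle.wait()
#     results = search_products(keyword)
```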
When to Use This
Production services that need legal cover, small monitoring operations (< 10K lookups/month), or anything where you need to demonstrate compliance.
Method 2: Python + Requests + BeautifulSoup (Educational; Gets Blocked Fast)
This is how 95% of people start with web scraping. It rarely works for more than a few requests on Amazon, but it's instructive.
Why This Will Fail
```python
import requests
from bs4 import BeautifulSoup

# This approach will fail. Do not use in production.
response = requests.get('https://www.amazon.com/s?k=wireless+headphones')
soup = BeautifulSoup(response.content, 'html.parser')
```
Within 1-3 requests, you'll hit:
- `403 Forbidden` (IP blocked)
- A CAPTCHA challenge
- A `<title>Robot Check</title>` HTML page instead of product data

The reason: Amazon's servers can detect that:
- The User-Agent is `python-requests/2.x.x` (screams "bot")
- There are no browser headers (Accept-Language, Referer, etc.)
- There are no cookies (raw HTTP requests are stateless; browsers maintain cookie jars)
- The IP is recognized as a known scraper IP or data center
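One practical consequence: Amazon often serves its CAPTCHA page with HTTP 200, so status codes alone won't tell you you've been blocked. A small heuristic detector (the marker strings are commonly reported ones, not an exhaustive or guaranteed list):

```python
def looks_blocked(status_code, html):
    """Best-effort check for Amazon block/CAPTCHA responses."""
    if status_code in (403, 503):
        return True
    lowered = html.lower()
    markers = (
        'robot check',                         # classic block-page title
        'enter the characters you see below',  # CAPTCHA prompt
        'api-services-support@amazon.com',     # contact line on block pages
    )
    return any(marker in lowered for marker in markers)
```

Check every response with something like this before parsing, and back off instead of retrying immediately.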
The "Better" Version (Still Fails, But Slower)
```python
import requests
from bs4 import BeautifulSoup
import time
import random

def scrape_amazon_naive(keyword):
    """
    This improved version lasts a little longer but still gets blocked.
    Use only for educational purposes or small one-off scrapes.
    """
    headers = {
        'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36',
        'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8',
        'Accept-Language': 'en-US,en;q=0.5',
        'Accept-Encoding': 'gzip, deflate',
        'DNT': '1',
        'Connection': 'keep-alive',
        'Referer': 'https://www.amazon.com/',
    }
    session = requests.Session()
    session.headers.update(headers)
    url = f'https://www.amazon.com/s?k={keyword}'
    try:
        # Add random delay to seem more human
        time.sleep(random.uniform(2, 5))
        response = session.get(url, timeout=10)
        response.raise_for_status()
        soup = BeautifulSoup(response.content, 'html.parser')

        # Amazon's HTML structure for product listings:
        products = []
        for item in soup.find_all('div', {'data-component-type': 's-search-result'}):
            title_elem = item.find('span', class_='a-size-medium a-color-base a-text-normal')
            price_elem = item.find('span', class_='a-price-whole')
            if title_elem:
                products.append({
                    'title': title_elem.get_text(strip=True),
                    'price': price_elem.get_text(strip=True) if price_elem else 'N/A'
                })
        return products
    except Exception as e:
        print(f"Scraping failed: {e}")
        return []

# Try it
products = scrape_amazon_naive('wireless headphones')
print(f"Found {len(products)} products")
```
Honest assessment: This will work for maybe 1-5 requests before Amazon blocks you. The added headers and session management help, but are insufficient. Amazon's detection is sophisticated enough to flag behavioral patterns — consistent timing, lack of JavaScript execution, missing cookies from browsing history, etc.
When to Use This
Never in production. Educational purposes only, or genuine one-off scrapes where you don't mind a manual CAPTCHA solve every few requests.
Method 3: Scraping APIs (ScraperAPI, Bright Data, etc.)
If you want reliable Amazon scraping without building a complex proxy rotation system yourself, dedicated scraping APIs are the solution.
How They Work
Services like ScraperAPI and Bright Data maintain massive networks of residential proxies (IPs from real homes/mobile networks, not data centers). When you send a request, they:
- Route it through a residential IP
- Handle CAPTCHAs automatically (with human workers or ML)
- Render JavaScript if needed
- Return clean HTML or JSON
The cost is higher than raw API or web scraping, but you get reliability.
ScraperAPI Example
```python
import requests
from bs4 import BeautifulSoup
from datetime import datetime

SCRAPERAPI_KEY = 'YOUR_SCRAPERAPI_KEY'

def scrape_amazon_with_scraperapi(asin):
    """
    Use ScraperAPI to reliably scrape an Amazon product page.
    ScraperAPI handles proxies, CAPTCHAs, and JavaScript rendering.
    """
    url = f'https://www.amazon.com/dp/{asin}'
    payload = {
        'api_key': SCRAPERAPI_KEY,
        'url': url,
        'render': 'true',  # Enable JavaScript rendering (costs more)
    }
    try:
        # ScraperAPI's endpoint
        response = requests.get('http://api.scraperapi.com', params=payload, timeout=30)
        response.raise_for_status()
        html = response.text

        # Now parse with BeautifulSoup (HTML is clean, real)
        soup = BeautifulSoup(html, 'html.parser')

        # Extract product data. Amazon changes its markup regularly, so
        # treat these selectors as a starting point, not gospel.
        product = {
            'asin': asin,
            'title': None,
            'price': None,
            'rating': None,
            'num_reviews': None,
            'in_stock': False,
            'scraped_at': datetime.now().isoformat()
        }

        # Title
        title_elem = soup.find('span', id='productTitle')
        if title_elem:
            product['title'] = title_elem.get_text(strip=True)

        # Price
        price_elem = soup.find('span', class_='a-price-whole')
        if price_elem:
            price_text = price_elem.get_text(strip=True)
            # Remove $ and commas, convert to float
            product['price'] = float(price_text.replace('$', '').replace(',', ''))

        # Rating (e.g. "4.5 out of 5 stars")
        rating_elem = soup.find('span', class_='a-icon-alt')
        if rating_elem:
            rating_text = rating_elem.get_text(strip=True)
            try:
                product['rating'] = float(rating_text.split()[0])
            except (ValueError, IndexError):
                pass

        # Number of reviews (e.g. "12,345 ratings")
        reviews_elem = soup.find('span', id='acrCustomerReviewText')
        if reviews_elem:
            reviews_text = reviews_elem.get_text(strip=True)
            try:
                product['num_reviews'] = int(reviews_text.split()[0].replace(',', ''))
            except (ValueError, IndexError):
                pass

        # Stock status
        stock_elem = soup.find(class_=lambda x: x and 'availability' in x.lower())
        if stock_elem:
            stock_text = stock_elem.get_text(strip=True).lower()
            product['in_stock'] = 'in stock' in stock_text

        return product
    except Exception as e:
        print(f"ScraperAPI scrape failed: {e}")
        return None

# Example usage (illustrative ASIN)
product = scrape_amazon_with_scraperapi('B0C3SJG2D2')
if product:
    print(f"Product: {product['title']}")
    print(f"Price: ${product['price']}")
    print(f"Rating: {product['rating']}/5.0 ({product['num_reviews']} reviews)")
    print(f"In Stock: {product['in_stock']}")
```
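One fragile spot in scrapers like this is the price conversion: `a-price-whole` can yield fragments like `1,299.` (the fraction lives in a separate `a-price-fraction` span), and a bare `float()` call chokes on anything unexpected. A more defensive parser for US-format prices (`parse_price` is my own helper, not a ScraperAPI feature):

```python
import re

def parse_price(text):
    """Parse a US-format price string ('$1,299.99', '1,299.') into a float.

    Returns None instead of raising when no number is present.
    """
    if not text:
        return None
    cleaned = text.replace('$', '').replace(',', '').strip()
    match = re.search(r'\d+(?:\.\d+)?', cleaned)
    return float(match.group()) if match else None

# parse_price('$1,299.99') returns 1299.99; parse_price('N/A') returns None
```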
Cost Comparison
| Service | Cost | Features | Best For |
|---|---|---|---|
| ScraperAPI | $10/month (5K calls) - $300/month (500K calls) | Proxy rotation, CAPTCHA solving, JS rendering | Small to medium scale scraping |
| Bright Data | $100/month (25GB bandwidth) and up | Residential proxies, mobile proxies, dedicated IP pools | Large-scale commercial scraping |
| Apify | Pay-per-run + infrastructure ($0.05-0.20/1K items) | Pre-built actors, cloud execution, built-in monitors | Teams, no maintenance overhead |
Limits & Reality Check
- Cost: $0.02-0.05 per successful scrape (varies by provider)
- Speed: 2-10 seconds per request (JavaScript rendering adds latency)
- Reliability: 95-99% success rate; failed requests need retry logic
- Rate limit: Depends on your tier; most allow 100-1000 requests/minute
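Since 95-99% success still means a few failures per hundred requests, wrap calls in retry logic with exponential backoff. A sketch (`with_retries` is my own helper; pair it with the `scrape_amazon_with_scraperapi` function above):

```python
import time

def with_retries(fn, max_attempts=3, base_delay=2.0):
    """Call `fn` until it succeeds, backing off 2s, 4s, 8s... between tries."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # out of attempts; surface the error
            time.sleep(base_delay * (2 ** attempt))

# Usage (hypothetical):
# product = with_retries(lambda: scrape_amazon_with_scraperapi('B0C3SJG2D2'))
```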
When to Use This
Any production scraping service, ecommerce monitoring, price aggregation platforms, or data for resale. The cost is worth it for reliability and legal safety.
Method 4: Pre-Built Scrapers (Apify Actors for Amazon)
Apify is a platform of pre-built, tested web scrapers ("actors"). The Amazon actors handle all the hard stuff: anti-scraping logic, error handling, retries, structured data export.
Why Use Apify
You get 80% of the way there in 5 minutes, with no code. The actor runs in Apify's cloud infrastructure (you don't manage proxies or IPs). Results are validated and structured. You pay by usage, not by infrastructure.
Using the Apify Amazon Scraper
Without code (via web UI):
- Go to https://apify.com/apify/amazon-product-scraper
- Click "Try now"
- Enter your parameters (ASINs, keywords, number of items)
- Start the run; results stream to your browser
- Export as CSV, JSON, or webhook
With code (Apify SDK):
```python
from apify_client import ApifyClient

def scrape_amazon_with_apify(search_keywords, max_items=100):
    """
    Use Apify's Amazon Product Scraper actor to reliably scrape products.
    """
    client = ApifyClient('YOUR_APIFY_TOKEN')

    # Run the Amazon Product Scraper actor
    run = client.actor('apify/amazon-product-scraper').call(
        run_input={
            'searchKeywords': search_keywords,
            'maxResults': max_items,
            'startPage': 0,
        }
    )

    # Retrieve structured results
    dataset_client = client.dataset(run['defaultDatasetId'])
    results = dataset_client.list_items()['items']
    return results

# Example
products = scrape_amazon_with_apify(['wireless headphones'], max_items=50)
for product in products:
    print(product.get('title', 'N/A')[:60])
    print(f"  Price: ${product.get('price', 'N/A')}")
    print(f"  Reviews: {product.get('reviewsCount', 0)}")
    print()
```
Install the Apify SDK
```bash
pip install apify-client
```
Apify Pricing
- Free tier: $5 in monthly platform credits — enough for small test runs
- Paid: usage-based; the Amazon actor works out to roughly $0.25 per 1K items scraped
Limits & Reality Check
- Speed: 100-300 items per minute (faster than manual API calls)
- Cost: ~$0.02-0.10 per 100 items
- Structure: Returns clean, validated JSON with all key fields
- Reliability: 99%+ success on valid ASINs
- Maintenance: Zero — Apify updates the actor when Amazon changes
When to Use This
Teams without dedicated infrastructure, one-off data collection projects, or regular monitoring services where you need quick iteration.
Building a Price Monitoring Pipeline
Let's say you want to track the price of 10 products every 6 hours and alert when prices drop. Here's a hybrid approach using ScraperAPI or Apify + scheduled runs + a simple database:
```python
import requests
import sqlite3
from datetime import datetime
from bs4 import BeautifulSoup
import time

# Database setup
def init_db():
    conn = sqlite3.connect('/data/price_monitor.db')
    cursor = conn.cursor()
    cursor.execute('''
        CREATE TABLE IF NOT EXISTS products (
            asin TEXT PRIMARY KEY,
            title TEXT,
            last_price REAL,
            lowest_price REAL,
            highest_price REAL,
            last_checked TIMESTAMP
        )
    ''')
    cursor.execute('''
        CREATE TABLE IF NOT EXISTS price_history (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            asin TEXT,
            price REAL,
            timestamp TIMESTAMP,
            in_stock BOOLEAN,
            FOREIGN KEY(asin) REFERENCES products(asin)
        )
    ''')
    conn.commit()
    return conn

def monitor_products(asin_list, scraper_api_key):
    """
    Check current prices for a list of ASINs and log to database.
    Run this function every 6 hours via a scheduler (APScheduler, cron, etc.)
    """
    conn = init_db()
    cursor = conn.cursor()
    for asin in asin_list:
        print(f"Checking {asin}...")
        # Scrape via ScraperAPI
        payload = {
            'api_key': scraper_api_key,
            'url': f'https://www.amazon.com/dp/{asin}',
            'render': 'true',
        }
        try:
            response = requests.get('http://api.scraperapi.com', params=payload, timeout=30)
            html = response.text

            # Parse (simplified for brevity)
            soup = BeautifulSoup(html, 'html.parser')

            # Extract price, title, stock
            title = soup.find('span', id='productTitle')
            price_elem = soup.find('span', class_='a-price-whole')
            if not title or not price_elem:
                print(f"  Failed to extract data for {asin}")
                continue

            title_text = title.get_text(strip=True)
            price_str = price_elem.get_text(strip=True).replace('$', '').replace(',', '')
            price = float(price_str)
            in_stock = 'in stock' in html.lower()

            # Create the row on first sight of this ASIN
            cursor.execute('''
                INSERT OR IGNORE INTO products (asin, title, last_price, lowest_price, highest_price, last_checked)
                VALUES (?, ?, ?, ?, ?, ?)
            ''', (asin, title_text, price, price, price, datetime.now()))

            # Update current price
            cursor.execute('''
                UPDATE products
                SET last_price = ?, last_checked = ?
                WHERE asin = ?
            ''', (price, datetime.now(), asin))

            # Track in history
            cursor.execute('''
                INSERT INTO price_history (asin, price, timestamp, in_stock)
                VALUES (?, ?, ?, ?)
            ''', (asin, price, datetime.now(), in_stock))

            # Update lowest/highest
            cursor.execute('SELECT lowest_price, highest_price FROM products WHERE asin = ?', (asin,))
            row = cursor.fetchone()
            if row:
                lowest = min(row[0], price)
                highest = max(row[1], price)
                cursor.execute('''
                    UPDATE products SET lowest_price = ?, highest_price = ?
                    WHERE asin = ?
                ''', (lowest, highest, asin))
            print(f"  {title_text[:50]}... = ${price}")
        except Exception as e:
            print(f"  Error: {e}")
        time.sleep(1)  # Polite delay
    conn.commit()
    conn.close()

def get_price_drops(threshold_percent=5):
    """Find products whose current price is more than threshold_percent below their recorded high"""
    conn = sqlite3.connect('/data/price_monitor.db')
    cursor = conn.cursor()
    cursor.execute('''
        SELECT
            p.asin, p.title, p.highest_price, p.last_price,
            ((p.highest_price - p.last_price) / p.highest_price * 100) AS drop_percent
        FROM products p
        WHERE p.last_price < p.highest_price
          AND ((p.highest_price - p.last_price) / p.highest_price * 100) > ?
        ORDER BY drop_percent DESC
    ''', (threshold_percent,))
    drops = cursor.fetchall()
    conn.close()
    return drops

# Usage: Run this every 6 hours
asin_list = [
    'B0C3SJG2D2',  # AirPods Pro
    'B08CX5Z9QN',  # Echo Dot
    'B07FKR6KXF',  # Fire Stick 4K
    # ... add more
]
# monitor_products(asin_list, scraper_api_key='YOUR_KEY')

# Check for price drops
# drops = get_price_drops(threshold_percent=5)
# for drop in drops:
#     print(f"Price drop: {drop[1]} from ${drop[2]} to ${drop[3]} ({drop[4]:.1f}%)")
```
Scheduling: Use APScheduler to run this every 6 hours:
```python
from apscheduler.schedulers.background import BackgroundScheduler

scheduler = BackgroundScheduler()
scheduler.add_job(lambda: monitor_products(asin_list, 'YOUR_KEY'), 'interval', hours=6)
scheduler.start()
```
Or deploy to a cloud function (AWS Lambda, Google Cloud Functions) and trigger via cron.
Legal Considerations: What You Actually Need to Know
Like Reddit, Amazon's terms of service explicitly forbid scraping. But again, TOS violations aren't laws.
Amazon's Terms of Service
Violating Amazon's TOS can result in:
- Account ban (if you use your personal account)
- IP ban (temporary, 24-72 hours usually)
- Legal cease-and-desist (rare, unless you're republishing data)
It's not a criminal issue.
The Computer Fraud and Abuse Act (CFAA)
The same hiQ v. LinkedIn litigation often cited for Reddit scraping applies here too: the Ninth Circuit held (in 2019, reaffirmed in 2022 after a Supreme Court remand) that scraping publicly available data likely doesn't violate the CFAA, even when the TOS forbids it. The case ultimately settled, so treat this as persuasive precedent, not settled law.
But Amazon is more aggressive than LinkedIn was about enforcement.
- Scraping prices, reviews, and product metadata is in a legal gray zone.
- If you republish Amazon data without adding value (e.g., copying product descriptions wholesale), you infringe on copyright.
- If your scraping causes measurable damage (DDoS-level load, stealing compute), you could face CFAA charges.
The Safe Path
- Use the official API when possible (Product Advertising API). It's endorsed and carries zero legal risk.
- Use dedicated scraping APIs (ScraperAPI, Bright Data). These companies absorb the legal risk and have done the vetting.
- Use pre-built actors (Apify). Same as above — they maintain the tool and assume legal liability.
- Scrape responsibly: Add delays (1-2 seconds between requests), don't scrape the same product 100 times per day, don't republish raw data.
- Don't pretend to be a different service: No faking user agents as mobile browsers if you're a server.
- Document your purpose: "Price monitoring for my ecommerce business" is defensible. "Scraping to republish on my aggregator" is not.
For anything commercial: Use a paid service (ScraperAPI, Apify) so there's a clear audit trail and the service provider assumes liability.
What Changed Since 2023?
- Amazon killed most unofficial scrapers (2024-2025): CAPTCHA became mandatory on the majority of requests. Cheap proxies no longer work.
- Apify's Amazon actor improved dramatically: Now handles JavaScript rendering, stock tracking, and review counts. High reliability (99%+).
- ScraperAPI added Amazon-specific routing: They now pre-route Amazon requests through premium residential proxies, CAPTCHA solve built-in.
- The official API stayed stable: No price increases, rate limits remain consistent. Still the safest option for non-commercial use.
- Mobile Amazon became harder to scrape: Amazon mobile site (m.amazon.com) is more resilient to automated access.
Real-World Example: Dropshipping Price Monitoring
You run a dropshipping store with 50 products sourced from Amazon. You want to alert when your suppliers' prices drop so you can adjust your margins or restock.
```python
# dropship_monitor.py
import sqlite3
from datetime import datetime
import smtplib
from email.mime.text import MIMEText
from apify_client import ApifyClient

def monitor_dropship_products(product_asins, alert_threshold=0.10):
    """
    Monitor supplier prices and send alerts when they drop.
    Threshold = 0.10 means alert if price drops by 10% or more.
    """
    client = ApifyClient('YOUR_APIFY_TOKEN')

    # Scrape all products in one batch run (more efficient)
    run = client.actor('apify/amazon-product-scraper').call(
        run_input={
            'asinListUrl': product_asins,  # List of ASINs (check the actor's input schema for the exact field name)
            'maxResults': 1,               # Just get current price
        }
    )

    # Retrieve results
    dataset = client.dataset(run['defaultDatasetId'])
    items = dataset.list_items()['items']

    # Check database for price history
    conn = sqlite3.connect('/data/dropship.db')
    cursor = conn.cursor()
    cursor.execute('''
        CREATE TABLE IF NOT EXISTS price_log (
            asin TEXT,
            title TEXT,
            price REAL,
            timestamp TIMESTAMP
        )
    ''')
    alerts = []
    for item in items:
        asin = item['asin']
        current_price = item['price']

        # Get last recorded price
        cursor.execute('SELECT price FROM price_log WHERE asin = ? ORDER BY timestamp DESC LIMIT 1', (asin,))
        result = cursor.fetchone()
        if result:
            last_price = result[0]
            drop_percent = (last_price - current_price) / last_price
            if drop_percent >= alert_threshold:
                alerts.append({
                    'asin': asin,
                    'title': item['title'],
                    'old_price': last_price,
                    'new_price': current_price,
                    'drop_percent': drop_percent
                })

        # Log current price
        cursor.execute('''
            INSERT INTO price_log (asin, title, price, timestamp)
            VALUES (?, ?, ?, ?)
        ''', (asin, item['title'], current_price, datetime.now()))

    conn.commit()
    conn.close()

    # Send email alerts
    if alerts:
        send_alert_email(alerts)
    return alerts

def send_alert_email(alerts):
    """Send email with price drop alerts"""
    subject = f"[Price Alert] {len(alerts)} products dropped in price"
    body = "Price drops detected:\n\n"
    for alert in alerts:
        body += f"• {alert['title'][:60]}\n"
        body += f"  Was: ${alert['old_price']:.2f} → Now: ${alert['new_price']:.2f}\n"
        body += f"  Drop: {alert['drop_percent']*100:.1f}%\n"
        body += f"  ASIN: {alert['asin']}\n\n"

    # Send via SMTP (Gmail, AWS SES, etc.)
    msg = MIMEText(body)
    msg['Subject'] = subject
    msg['From'] = 'alerts@yourstore.com'
    msg['To'] = 'you@yourstore.com'
    # (Implement SMTP sending here; depends on your email provider)
    print(f"Email alert: {subject}")

# Run every 6 hours
# monitor_dropship_products(['B0C3SJG2D2', 'B08CX5Z9QN', ...], alert_threshold=0.10)
```
This is a real, revenue-generating use case. You automate margin optimization and rebuy signals.
Comparison Table: Which Method to Use
| Method | Speed | Cost | Legal Risk | Best For | Effort |
|---|---|---|---|---|---|
| Official API | 1 req/sec | Free | Lowest | Small monitoring, research | Low |
| Raw Python | 10 req/min (before block) | $0 | Highest | Educational, one-offs | Low (but unreliable) |
| ScraperAPI | 1-5 sec/req | $0.02/req | Low | Small-to-medium scale | Low |
| Bright Data | 1-3 sec/req | $100+/month | Low | Large-scale commercial | Medium |
| Apify Actor | 1-3 sec/req | $0.02/100 items | Low | Teams, no DevOps | Very low |
Resources
- Amazon Product Advertising API docs: https://webservices.amazon.com/paapi5/documentation/
- Apify Amazon Scraper: https://apify.com/apify/amazon-product-scraper
- ScraperAPI: https://www.scraperapi.com/
- Bright Data: https://brightdata.com/
- BeautifulSoup Documentation: https://www.crummy.com/software/BeautifulSoup/
TL;DR
- For small monitoring or research: Use the official Amazon Product Advertising API. It's free, safe, and legal.
- For reliable scraping: Use ScraperAPI or Bright Data. They handle anti-bot, pay them, and avoid legal friction.
- For teams or one-offs: Use Apify's pre-built Amazon actor. Deploy in 5 minutes, zero maintenance.
- For learning: Raw BeautifulSoup works on a few requests. Expect blocks. Don't use in production.
- For monitoring at scale: Build a pipeline with scheduled scrapes, price history tracking, and alerts.
- Always be respectful: Add delays, don't republish raw data, use appropriate tools for your scale.
- Track pricing trends: Historical price data is valuable. Build it once, monetize it later.
Amazon's prices are transparent — they're just guarded. Use the right tool for your use case and you'll get there.
Want to stay ahead of data extraction trends? Subscribe to The Data Collector for working code, legal updates, and practical scraping strategies that actually work in 2026. No fluff — just what works.
Disclosure: This post contains affiliate links. I may earn a commission if you sign up through my links, at no extra cost to you.
Compare web scraping APIs:
- ScraperAPI — 5,000 free credits, 50+ countries, structured data parsing
- Scrape.do — From $29/mo, strong Cloudflare bypass
- ScrapeOps — Proxy comparison + monitoring dashboard
Need custom web scraping? Email hustler@curlship.com — fast turnaround, fair pricing.