Etsy is one of the largest marketplaces for handmade, vintage, and craft goods — with over 90 million active buyers and 7+ million sellers. For market researchers, dropshippers, and e-commerce analysts, Etsy data is incredibly valuable: product trends, pricing strategies, seller performance, and buyer sentiment are all embedded in those listing pages.
In this guide, I'll walk through scraping Etsy product listings, seller profiles, and reviews using Python. I'll cover the technical challenges (spoiler: JavaScript rendering is the big one) and show practical code you can adapt.
What Data Can You Extract from Etsy?
Here's what's publicly available on Etsy pages:
- Product listings — title, description, price, images, tags, materials, shipping info
- Seller profiles — shop name, location, sales count, star rating, member since
- Reviews — star rating, review text, buyer photos, date, item purchased
- Search results — products ranked by relevance/price/recency for any keyword
- Category data — trending items, bestsellers, category structure
Step 1: Understanding Etsy's URL Structure
Etsy search URLs look like this:
https://www.etsy.com/search?q=handmade+jewelry&ref=search_bar&page=1
Key parameters:
- q — search query
- page — page number (starts at 1)
- min_price / max_price — price range filters
- ship_to — shipping destination country code
- order — sort order (most_relevant, price_asc, price_desc, date_desc)
Individual product pages:
https://www.etsy.com/listing/1234567890/product-title-slug
Seller shop pages:
https://www.etsy.com/shop/ShopName
Step 2: Scraping Search Results
Etsy renders much of its content server-side, but also includes structured data that we can extract:
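Here's a minimal sketch using requests and BeautifulSoup. It assumes search pages embed an ItemList block in JSON-LD — verify the exact shape against the live markup, since Etsy changes it periodically. The function names (parse_search_page, scrape_etsy_search) are my own:

```python
import json
import random
import time

import requests
from bs4 import BeautifulSoup

HEADERS = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36",
    "Accept-Language": "en-US,en;q=0.9",
}

def parse_search_page(html):
    """Pull listings out of the JSON-LD blocks embedded in a search page."""
    soup = BeautifulSoup(html, "html.parser")
    products = []
    for script in soup.find_all("script", type="application/ld+json"):
        try:
            data = json.loads(script.string or "")
        except json.JSONDecodeError:
            continue
        # Assumed shape: an ItemList whose entries carry name/url/image directly
        if data.get("@type") == "ItemList":
            for entry in data.get("itemListElement", []):
                products.append({
                    "title": entry.get("name"),
                    "url": entry.get("url"),
                    "image": entry.get("image"),
                })
    return products

def scrape_etsy_search(query, pages=1):
    """Fetch one or more search result pages and parse each."""
    results = []
    for page in range(1, pages + 1):
        url = f"https://www.etsy.com/search?q={query}&page={page}"
        resp = requests.get(url, headers=HEADERS, timeout=30)
        resp.raise_for_status()
        results.extend(parse_search_page(resp.text))
        time.sleep(random.uniform(2, 5))  # pause between pages to stay polite
    return results
```

Keeping the parser separate from the fetcher makes it easy to test against saved HTML before pointing it at live pages.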
Step 3: Scraping Product Details
Individual product pages contain the richest data — descriptions, tags, shipping info, and all pricing details:
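Product pages carry a Product object in JSON-LD (schema.org), which usually covers title, description, price, and aggregate rating. A sketch, assuming that shape holds — check a live page before relying on specific fields:

```python
import json

import requests
from bs4 import BeautifulSoup

HEADERS = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36",
}

def parse_product_page(html):
    """Extract core fields from a listing page's Product JSON-LD block."""
    soup = BeautifulSoup(html, "html.parser")
    for script in soup.find_all("script", type="application/ld+json"):
        try:
            data = json.loads(script.string or "")
        except json.JSONDecodeError:
            continue
        if data.get("@type") == "Product":
            offers = data.get("offers") or {}
            rating = data.get("aggregateRating") or {}
            return {
                "title": data.get("name"),
                "description": data.get("description"),
                # Listings with variations may expose lowPrice/highPrice instead
                "price": offers.get("price") or offers.get("lowPrice"),
                "currency": offers.get("priceCurrency"),
                "rating": rating.get("ratingValue"),
                "review_count": rating.get("reviewCount"),
            }
    return {}

def scrape_product_details(url):
    """Fetch a single listing page and parse its structured data."""
    resp = requests.get(url, headers=HEADERS, timeout=30)
    resp.raise_for_status()
    return parse_product_page(resp.text)
```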
Step 4: Scraping Seller Shop Pages
Seller data helps you understand market positioning:
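Shop pages are less consistent than listings, but OpenGraph meta tags are a reasonably stable starting point for name and description. Fields like sales count and star rating live in page-specific markup, so the selectors for those need to be confirmed against the live HTML — this sketch deliberately sticks to the meta tags:

```python
import requests
from bs4 import BeautifulSoup

HEADERS = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36",
}

def parse_shop_page(html):
    """Pull basic shop info from OpenGraph meta tags."""
    soup = BeautifulSoup(html, "html.parser")

    def meta(prop):
        tag = soup.find("meta", property=prop)
        return tag.get("content") if tag else None

    return {
        "shop_name": meta("og:title"),
        "description": meta("og:description"),
        "url": meta("og:url"),
        # Sales count / rating / member-since need page-specific selectors;
        # inspect the rendered HTML for the current class names before adding them.
    }

def scrape_shop(shop_name):
    """Fetch a shop page by name and parse it."""
    resp = requests.get(f"https://www.etsy.com/shop/{shop_name}",
                        headers=HEADERS, timeout=30)
    resp.raise_for_status()
    return parse_shop_page(resp.text)
```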
Step 5: Review Extraction
Product reviews reveal buyer sentiment and product quality:
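The first page of reviews is often included in the listing's Product JSON-LD as a review array; later pages load lazily and need browser automation (covered below). A sketch assuming that JSON-LD shape:

```python
import json

from bs4 import BeautifulSoup

def parse_reviews(html):
    """Extract the reviews embedded in a listing page's Product JSON-LD."""
    soup = BeautifulSoup(html, "html.parser")
    reviews = []
    for script in soup.find_all("script", type="application/ld+json"):
        try:
            data = json.loads(script.string or "")
        except json.JSONDecodeError:
            continue
        if data.get("@type") == "Product":
            for rev in data.get("review") or []:
                rating = rev.get("reviewRating") or {}
                reviews.append({
                    "rating": rating.get("ratingValue"),
                    "text": rev.get("reviewBody"),
                    "date": rev.get("datePublished"),
                    "author": (rev.get("author") or {}).get("name"),
                })
    return reviews
```

Fetching works the same as in the product-details step; reuse the same requests session and headers.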
The JavaScript Rendering Challenge
Here's where Etsy scraping gets tricky. Etsy heavily uses React for rendering, which means many page elements only appear after JavaScript execution. Basic requests + BeautifulSoup will miss:
- Dynamic pricing (especially for items with variations)
- "Add to cart" availability status
- Related items and recommendations
- Some review content loaded lazily
For these elements, you need browser automation:
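A minimal Playwright sketch for fetching the fully rendered HTML, which you can then feed to the same parsers as before (requires `pip install playwright` and `playwright install chromium`):

```python
def scrape_with_browser(url):
    """Load a page in headless Chromium and return the rendered HTML."""
    # Import locally so the rest of the module works without Playwright installed
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.launch(headless=True)
        page = browser.new_page()
        page.goto(url, wait_until="domcontentloaded")
        page.wait_for_timeout(3000)  # crude wait for lazy content to settle
        html = page.content()
        browser.close()
    return html
```

A fixed timeout is the bluntest option; waiting on a specific selector with `page.wait_for_selector(...)` is faster and more reliable once you know which element signals the content you need.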
The problem? Running Playwright at scale is slow and resource-heavy. Each page needs a full browser instance, which eats RAM and CPU.
Scaling with Residential Proxies
For any serious Etsy scraping project, you'll need residential proxies. Etsy's anti-bot system fingerprints requests and will block datacenter IPs after a few dozen requests.
ThorData residential proxies work well here because they support session persistence — meaning you can maintain the same IP across multiple requests, which looks more like natural browsing:
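Proxy wiring with requests is the same regardless of provider; the credentials and endpoint below are placeholders, so substitute the host, port, and username format from your provider's dashboard (sticky-session options are usually encoded in the username):

```python
import requests

# Placeholder values -- replace with your provider's real endpoint and credentials
PROXY_USER = "your_username"
PROXY_PASS = "your_password"
PROXY_HOST = "proxy.example.com:12345"

def build_session():
    """requests.Session routed through a residential proxy."""
    session = requests.Session()
    proxy_url = f"http://{PROXY_USER}:{PROXY_PASS}@{PROXY_HOST}"
    session.proxies = {"http": proxy_url, "https": proxy_url}
    session.headers.update({
        "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36",
    })
    return session

# Usage: session.get(url) instead of requests.get(url) in the earlier functions
```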
Building a Complete Product Research Tool
Here's how to tie it all together for e-commerce research:
```python
import csv
import random
import time

# Assumes the scrape_etsy_search / scrape_product_details helpers
# from the earlier steps are already defined in scope.

def research_niche(search_terms, output_prefix="etsy_research"):
    """Full pipeline: search → detail → reviews for market research."""
    all_products = []
    for term in search_terms:
        print(f"\n{'=' * 50}")
        print(f"Researching: {term}")

        # Step 1: Get search results
        products = scrape_etsy_search(term, pages=2)

        # Step 2: Get details for top 5 products
        for product in products[:5]:
            if product.get("url"):
                details = scrape_product_details(product["url"])
                product.update(details)
                time.sleep(random.uniform(3, 6))

        for p in products:
            p["search_term"] = term
        all_products.extend(products)
        time.sleep(random.uniform(5, 10))

    # Save results
    if all_products:
        flat_products = []
        for p in all_products:
            # Drop nested lists/dicts so each row fits a flat CSV
            flat = {k: v for k, v in p.items() if not isinstance(v, (list, dict))}
            flat_products.append(flat)
        keys = set()
        for p in flat_products:
            keys.update(p.keys())
        with open(f"{output_prefix}.csv", "w", newline="", encoding="utf-8") as f:
            writer = csv.DictWriter(f, fieldnames=sorted(keys))
            writer.writeheader()
            writer.writerows(flat_products)
        print(f"\nSaved {len(flat_products)} products to {output_prefix}.csv")
    return all_products

# Research multiple niches
niches = [
    "handmade+candles",
    "custom+jewelry",
    "vintage+clothing",
    "digital+planner",
    "resin+art",
]
data = research_niche(niches)
```
Etsy's Open API (Alternative to Scraping)
Before building a scraper, consider the Etsy Open API v3. It provides:
- Listing search and details
- Shop information
- Reviews
- Category browsing
You need to register an app and get an API key. The free tier allows 10,000 requests per day, which is plenty for most research projects. The main limitation: some data fields are only available to apps with shop owner authorization.
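A sketch of a v3 search call. The endpoint and `x-api-key` header follow the v3 docs; double-check the parameter names there, since the API evolves:

```python
import requests

API_KEY = "your_etsy_api_key"  # from your registered app at etsy.com/developers

def search_listings(keywords, limit=25):
    """Search active listings via the Etsy Open API v3."""
    resp = requests.get(
        "https://openapi.etsy.com/v3/application/listings/active",
        headers={"x-api-key": API_KEY},
        params={"keywords": keywords, "limit": limit},
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json().get("results", [])
```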
If the API gives you what you need, use it. It's faster, more reliable, and won't get you blocked.
Limitations and Honest Assessment
- JavaScript rendering is the main challenge. Unlike some sites where HTML parsing works fine, Etsy's React-based frontend means many elements require browser automation for full extraction.
- Anti-bot detection is moderate. Etsy uses bot detection that flags datacenter IPs and unusual request patterns. Residential proxies are necessary for scale.
- Etsy has an official API. Unlike many e-commerce sites, Etsy actually provides a decent API. Check it first before scraping — you might not need a scraper at all.
- Listing data changes fast. Prices, availability, and listings themselves change frequently. Any dataset you build starts going stale immediately.
- Respect seller data. Many Etsy sellers are individuals and small businesses. Don't use scraped data to undercut pricing or copy product ideas wholesale.
When Scraping Makes Sense vs. Using the API
| Use Case | Best Approach |
|---|---|
| Basic product search | Etsy API v3 |
| Pricing trends over time | Scraper + database |
| Competitor shop analysis | Scraper (API limits shop data) |
| Review sentiment analysis | Etsy API v3 (reviews endpoint) |
| Category-wide market research | Scraper (API pagination limits) |
| One-time data export | Etsy API v3 |
Wrapping Up
Etsy is one of the more scraping-friendly e-commerce platforms — they have a usable API, their HTML includes JSON-LD structured data, and their anti-bot measures are moderate compared to sites like Amazon or Booking.com.
Start with the official API for basic data needs. Use scraping for data the API doesn't expose — deep category analysis, historical pricing, or cross-shop comparisons at scale. And always use residential proxies when scraping beyond a handful of pages.
The handmade and vintage market is fascinating to analyze with data. Every Etsy niche has its own pricing dynamics, seasonal trends, and competitive patterns. A good scraper turns that into actionable intelligence.
Questions? Drop a comment — I'm happy to help with specific Etsy scraping challenges.