DEV Community

agenthustler

How to Scrape Etsy in 2026: Product Listings, Seller Data, and Prices

Etsy is one of the largest marketplaces for handmade, vintage, and craft goods — with over 90 million active buyers and 7+ million sellers. For market researchers, dropshippers, and e-commerce analysts, Etsy data is incredibly valuable: product trends, pricing strategies, seller performance, and buyer sentiment are all embedded in those listing pages.

In this guide, I'll walk through scraping Etsy product listings, seller profiles, and reviews using Python. I'll cover the technical challenges (spoiler: JavaScript rendering is the big one) and show practical code you can adapt.

What Data Can You Extract from Etsy?

Here's what's publicly available on Etsy pages:

  • Product listings — title, description, price, images, tags, materials, shipping info
  • Seller profiles — shop name, location, sales count, star rating, member since
  • Reviews — star rating, review text, buyer photos, date, item purchased
  • Search results — products ranked by relevance/price/recency for any keyword
  • Category data — trending items, bestsellers, category structure
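Before writing any parsing code, it helps to pin down a record schema so every scraper in this guide emits the same shape. A minimal sketch — the field names here are my own choices, not anything Etsy defines:

```python
from dataclasses import dataclass, field, asdict
from typing import Optional

@dataclass
class EtsyProduct:
    # Illustrative fields; adapt to whatever your parser actually captures.
    name: Optional[str] = None
    url: Optional[str] = None
    price: Optional[str] = None
    currency: Optional[str] = None
    shop: Optional[str] = None
    rating: Optional[float] = None
    reviews_count: Optional[int] = None
    tags: list = field(default_factory=list)

p = EtsyProduct(name="Leather wallet", price="34.00", currency="USD")
record = asdict(p)  # plain dict, ready for csv.DictWriter or json.dumps
```

Having one schema up front also makes the CSV export at the end of this guide trivial, since every record has the same keys.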

Step 1: Understanding Etsy's URL Structure

Etsy search URLs look like this:

https://www.etsy.com/search?q=handmade+jewelry&ref=search_bar&page=1

Key parameters:

  • q — search query
  • page — page number (starts at 1)
  • min_price / max_price — price range filters
  • ship_to — shipping destination country code
  • order — sort order (most_relevant, price_asc, price_desc, date_desc)
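Rather than concatenating these by hand, you can assemble search URLs with the standard library, which also handles encoding of spaces and special characters. A small sketch using the parameter names listed above:

```python
from urllib.parse import urlencode

def build_search_url(query, page=1, min_price=None, max_price=None,
                     order="most_relevant"):
    """Assemble an Etsy search URL from the documented query parameters."""
    params = {"q": query, "page": page, "order": order}
    if min_price is not None:
        params["min_price"] = min_price
    if max_price is not None:
        params["max_price"] = max_price
    # urlencode turns spaces into "+" and escapes anything unsafe
    return "https://www.etsy.com/search?" + urlencode(params)

url = build_search_url("handmade jewelry", page=2, min_price=10, max_price=50)
```

This way you can pass plain queries like `"handmade jewelry"` instead of pre-encoding them as `"handmade+jewelry"`.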

Individual product pages:

https://www.etsy.com/listing/1234567890/product-title-slug

Seller shop pages:

https://www.etsy.com/shop/ShopName

Step 2: Scraping Search Results

Etsy renders much of its content server-side, but also includes structured data that we can extract:

import requests
from bs4 import BeautifulSoup
import json
import time
import random
import re

def scrape_etsy_search(query, pages=3):
    """Scrape Etsy search results for product listings."""

    headers = {
        "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) "
                       "AppleWebKit/537.36 (KHTML, like Gecko) "
                       "Chrome/124.0.0.0 Safari/537.36",
        "Accept-Language": "en-US,en;q=0.9",
        "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
    }

    all_products = []

    for page in range(1, pages + 1):
        url = f"https://www.etsy.com/search?q={query}&page={page}"

        response = requests.get(url, headers=headers, timeout=30)

        if response.status_code != 200:
            print(f"Page {page}: Status {response.status_code}")
            continue

        soup = BeautifulSoup(response.text, "html.parser")

        # Method 1: Extract from JSON-LD structured data
        count_before = len(all_products)
        for script in soup.select('script[type="application/ld+json"]'):
            try:
                ld_data = json.loads(script.string)
                if isinstance(ld_data, dict) and ld_data.get("@type") == "ItemList":
                    for item in ld_data.get("itemListElement", []):
                        product = item.get("item", {})
                        parsed = {
                            "name": product.get("name"),
                            "url": product.get("url"),
                            "price": product.get("offers", {}).get("price"),
                            "currency": product.get("offers", {}).get("priceCurrency"),
                            "image": product.get("image"),
                        }
                        if parsed["name"]:
                            all_products.append(parsed)
            except (json.JSONDecodeError, TypeError):
                continue

        # Method 2: Parse listing cards from HTML (fallback when JSON-LD
        # yields nothing for *this* page; these class names change often)
        if len(all_products) == count_before:
            listings = soup.select('.v2-listing-card')
            for listing in listings:
                product = {}

                # Title
                title_el = listing.select_one('.v2-listing-card__title')
                product["name"] = title_el.get_text(strip=True) if title_el else None

                # Price
                price_el = listing.select_one('.currency-value')
                product["price"] = price_el.get_text(strip=True) if price_el else None

                # Link
                link_el = listing.select_one('a.listing-link')
                product["url"] = link_el.get("href") if link_el else None

                # Shop name
                shop_el = listing.select_one('.v2-listing-card__shop')
                product["shop"] = shop_el.get_text(strip=True) if shop_el else None

                # Rating
                rating_el = listing.select_one('.stars-svg')
                if rating_el:
                    aria = rating_el.get("aria-label", "")
                    match = re.search(r'([\d.]+) out of 5', aria)
                    product["rating"] = float(match.group(1)) if match else None

                # Review count
                review_el = listing.select_one('.text-gray-lighter')
                if review_el:
                    count_text = review_el.get_text(strip=True)
                    count_match = re.search(r'([\d,]+)', count_text)
                    product["reviews_count"] = count_match.group(1) if count_match else None

                if product["name"]:
                    all_products.append(product)

        print(f"Page {page}: {len(all_products)} products total")
        time.sleep(random.uniform(2, 5))

    return all_products

# Usage
products = scrape_etsy_search("handmade+leather+wallet", pages=3)
print(f"Found {len(products)} products")
for p in products[:5]:
    print(f"  {p['name'][:50]} — ${p.get('price', 'N/A')}")

Step 3: Scraping Product Details

Individual product pages contain the richest data — descriptions, tags, shipping info, and all pricing details:

def scrape_product_details(product_url):
    """Extract detailed data from an Etsy product page."""

    headers = {
        "User-Agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) "
                       "AppleWebKit/537.36 Chrome/124.0.0.0 Safari/537.36",
        "Accept-Language": "en-US,en;q=0.9",
    }

    response = requests.get(product_url, headers=headers, timeout=30)
    soup = BeautifulSoup(response.text, "html.parser")

    details = {}

    # Extract from JSON-LD (most reliable)
    for script in soup.select('script[type="application/ld+json"]'):
        try:
            ld_data = json.loads(script.string)
            if isinstance(ld_data, dict) and ld_data.get("@type") == "Product":
                details["name"] = ld_data.get("name")
                details["description"] = ld_data.get("description")
                # "brand" may be a dict or a plain string depending on the page
                brand = ld_data.get("brand")
                details["brand"] = brand.get("name") if isinstance(brand, dict) else brand
                details["image"] = ld_data.get("image")

                offers = ld_data.get("offers", {})
                details["price"] = offers.get("price")
                details["currency"] = offers.get("priceCurrency")
                details["availability"] = offers.get("availability")

                aggregate = ld_data.get("aggregateRating", {})
                details["rating"] = aggregate.get("ratingValue")
                details["review_count"] = aggregate.get("reviewCount")
        except (json.JSONDecodeError, TypeError):
            continue

    # Additional HTML parsing for fields not in JSON-LD

    # Tags
    tags = []
    for tag_el in soup.select('a[href*="/search?q="]'):
        tag_text = tag_el.get_text(strip=True)
        if tag_text and len(tag_text) < 50:
            tags.append(tag_text)
    details["tags"] = list(dict.fromkeys(tags))[:20]  # dedupe, keep first-seen order

    # Materials
    materials_section = soup.select_one('#materials-information')
    if materials_section:
        details["materials"] = materials_section.get_text(strip=True)

    # Shipping info
    shipping_el = soup.select_one('[data-estimated-delivery]')
    if shipping_el:
        details["shipping_estimate"] = shipping_el.get_text(strip=True)

    # Shop info
    shop_el = soup.select_one('[data-shop-name]')
    if shop_el:
        details["shop_name"] = shop_el.get("data-shop-name")

    # Variations (size, color, etc.)
    variations = []
    for select_el in soup.select('select[id*="variation"]'):
        options = [opt.get_text(strip=True) for opt in select_el.select('option') if opt.get("value")]
        if options:
            label = select_el.get("aria-label", "Variation")
            variations.append({"label": label, "options": options})
    details["variations"] = variations

    return details

# Usage
product = scrape_product_details("https://www.etsy.com/listing/1234567890/example")
print(json.dumps(product, indent=2))

Step 4: Scraping Seller Shop Pages

Seller data helps you understand market positioning:

def scrape_shop(shop_name):
    """Scrape an Etsy shop page for seller information."""

    url = f"https://www.etsy.com/shop/{shop_name}"
    headers = {
        "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) "
                       "AppleWebKit/537.36 Chrome/124.0.0.0 Safari/537.36",
    }

    response = requests.get(url, headers=headers, timeout=30)
    soup = BeautifulSoup(response.text, "html.parser")

    shop = {"shop_name": shop_name, "url": url}

    # Sales count
    sales_el = soup.select_one('.shop-sales-count')
    if sales_el:
        sales_text = sales_el.get_text(strip=True)
        match = re.search(r'([\d,]+)\s*sales', sales_text, re.IGNORECASE)
        shop["total_sales"] = match.group(1) if match else sales_text

    # Star rating
    rating_el = soup.select_one('[data-rating]')
    if rating_el:
        shop["rating"] = rating_el.get("data-rating")

    # Location
    location_el = soup.select_one('.shop-location')
    shop["location"] = location_el.get_text(strip=True) if location_el else None

    # Member since
    member_el = soup.select_one('.etsy-since')
    shop["member_since"] = member_el.get_text(strip=True) if member_el else None

    # Active listings count
    count_el = soup.select_one('.shop-listings-count')
    if count_el:
        count_text = count_el.get_text(strip=True)
        match = re.search(r'([\d,]+)', count_text)
        shop["active_listings"] = match.group(1) if match else None

    # Announcement
    announcement_el = soup.select_one('.shop-announcement')
    shop["announcement"] = announcement_el.get_text(strip=True) if announcement_el else None

    return shop
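The CSS classes above are the fragile part — Etsy renames them regularly. If the shop page embeds JSON-LD (an assumption worth verifying; not every Etsy page type includes it), parsing that is more resilient. A sketch that takes raw HTML so it can be tested offline:

```python
import json
from bs4 import BeautifulSoup

def parse_shop_jsonld(html):
    """Pull shop fields from JSON-LD blocks, if the page includes any.

    The @type values checked here are common schema.org shop types,
    not a confirmed list of what Etsy emits -- inspect a real page first.
    """
    soup = BeautifulSoup(html, "html.parser")
    for script in soup.select('script[type="application/ld+json"]'):
        try:
            data = json.loads(script.string)
        except (json.JSONDecodeError, TypeError):
            continue
        if isinstance(data, dict) and data.get("@type") in ("Organization", "LocalBusiness", "Store"):
            return {
                "name": data.get("name"),
                "url": data.get("url"),
                "logo": data.get("logo"),
            }
    return None  # no usable JSON-LD; fall back to CSS selectors
```

Use it as a first pass inside `scrape_shop`, and fall back to the selector-based parsing when it returns `None`.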

Step 5: Review Extraction

Product reviews reveal buyer sentiment and product quality:

def scrape_product_reviews(listing_id, pages=3):
    """Scrape reviews for a specific Etsy listing."""

    headers = {
        "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) "
                       "AppleWebKit/537.36 Chrome/124.0.0.0 Safari/537.36",
        "Accept": "application/json",
        "X-Requested-With": "XMLHttpRequest",
    }

    all_reviews = []

    for page in range(1, pages + 1):
        # Etsy loads reviews via an internal API endpoint
        url = (
            f"https://www.etsy.com/api/v3/ajax/bespoke/"
            f"member/neu/specs/reviews?listing_id={listing_id}"
            f"&page={page}&sort=newest"
        )

        response = requests.get(url, headers=headers, timeout=30)

        if response.status_code == 200:
            try:
                data = response.json()
                reviews_html = data.get("output", {}).get("reviews", "")
                soup = BeautifulSoup(reviews_html, "html.parser")

                for review_el in soup.select('.review-item'):
                    review = {}

                    stars_el = review_el.select_one('[data-rating]')
                    review["rating"] = stars_el.get("data-rating") if stars_el else None

                    text_el = review_el.select_one('.review-text')
                    review["text"] = text_el.get_text(strip=True) if text_el else None

                    date_el = review_el.select_one('.review-date')
                    review["date"] = date_el.get_text(strip=True) if date_el else None

                    buyer_el = review_el.select_one('.reviewer-name')
                    review["buyer"] = buyer_el.get_text(strip=True) if buyer_el else None

                    if review["text"]:
                        all_reviews.append(review)

            except (json.JSONDecodeError, KeyError):
                pass

        time.sleep(random.uniform(2, 4))

    return all_reviews

The JavaScript Rendering Challenge

Here's where Etsy scraping gets tricky. Etsy heavily uses React for rendering, which means many page elements only appear after JavaScript execution. Basic requests + BeautifulSoup will miss:

  • Dynamic pricing (especially for items with variations)
  • "Add to cart" availability status
  • Related items and recommendations
  • Some review content loaded lazily

For these elements, you need browser automation:

from playwright.sync_api import sync_playwright

def scrape_with_browser(url):
    """Use Playwright for JS-rendered Etsy pages."""

    with sync_playwright() as p:
        browser = p.chromium.launch(headless=True)
        context = browser.new_context(
            user_agent="Mozilla/5.0 (Windows NT 10.0; Win64; x64) "
                       "AppleWebKit/537.36 Chrome/124.0.0.0 Safari/537.36",
            viewport={"width": 1920, "height": 1080},
        )

        page = context.new_page()
        page.goto(url, wait_until="networkidle")

        # Wait for React to render product details
        page.wait_for_selector('[data-listing-id]', timeout=15000)

        # Extract rendered content
        content = page.content()
        browser.close()

        return BeautifulSoup(content, "html.parser")

The problem? Running Playwright at scale is slow and resource-heavy. Each page needs a full browser instance, which eats RAM and CPU.
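One mitigation: reuse a single browser and context across many URLs, launching per batch rather than per page. A sketch building on the Playwright code above (the batch helper and its size are my own additions, not anything Playwright prescribes):

```python
def chunked(seq, n):
    """Split URLs into batches so the browser can be restarted between them."""
    return [seq[i:i + n] for i in range(0, len(seq), n)]

def scrape_batch(urls):
    """One browser, one context, many tabs -- far cheaper than a launch per URL."""
    # Imported here so the pure helper above works without Playwright installed.
    from playwright.sync_api import sync_playwright

    html_by_url = {}
    with sync_playwright() as p:
        browser = p.chromium.launch(headless=True)
        context = browser.new_context(viewport={"width": 1920, "height": 1080})
        for url in urls:
            page = context.new_page()
            page.goto(url, wait_until="networkidle")
            html_by_url[url] = page.content()
            page.close()  # close the tab, keep the browser warm
        browser.close()
    return html_by_url
```

Restarting the browser every batch (say, every 20–30 pages) keeps memory from creeping up on long runs.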

Scaling with Residential Proxies

For any serious Etsy scraping project, you'll need residential proxies. Etsy's anti-bot system fingerprints requests and will block datacenter IPs after a few dozen requests.

ThorData residential proxies work well here because they support session persistence — meaning you can maintain the same IP across multiple requests, which looks more like natural browsing:

# ThorData proxy configuration
PROXY_HOST = "proxy.thordata.com"
PROXY_PORT = 9000
PROXY_USER = "your_user"
PROXY_PASS = "your_pass"

proxies = {
    "http": f"http://{PROXY_USER}:{PROXY_PASS}@{PROXY_HOST}:{PROXY_PORT}",
    "https": f"http://{PROXY_USER}:{PROXY_PASS}@{PROXY_HOST}:{PROXY_PORT}",
}

def scrape_etsy_with_proxy(url, headers):
    """Route Etsy requests through residential proxy."""
    response = requests.get(
        url,
        headers=headers,
        proxies=proxies,
        timeout=30,
    )
    return response

# For Playwright with proxy
def scrape_with_browser_proxy(url):
    with sync_playwright() as p:
        browser = p.chromium.launch(
            headless=True,
            proxy={
                "server": f"http://{PROXY_HOST}:{PROXY_PORT}",
                "username": PROXY_USER,
                "password": PROXY_PASS,
            },
        )
        page = browser.new_page()
        page.goto(url, wait_until="networkidle")
        content = page.content()
        browser.close()
        return content
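To actually use session persistence, most residential providers let you pin an IP by embedding a session id in the proxy username. The exact format varies by provider — the `-session-<id>` suffix below is a common convention I'm assuming, so check ThorData's docs for their real syntax:

```python
import uuid

def make_session_proxies(host, port, user, password, session_id=None):
    """Build a requests-style proxies dict pinned to one upstream session.

    NOTE: the "user-session-<id>" username format is a generic provider
    convention, not a documented ThorData format -- verify before relying on it.
    """
    session_id = session_id or uuid.uuid4().hex[:8]
    auth_user = f"{user}-session-{session_id}"
    proxy_url = f"http://{auth_user}:{password}@{host}:{port}"
    return {"http": proxy_url, "https": proxy_url}, session_id

proxies, sid = make_session_proxies("proxy.thordata.com", 9000, "your_user", "your_pass")
```

Reuse the same `session_id` for all requests in one browsing "visit" (search page, then a few product pages), then rotate to a new one for the next visit.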

Building a Complete Product Research Tool

Here's how to tie it all together for e-commerce research:

import csv

def research_niche(search_terms, output_prefix="etsy_research"):
    """Full pipeline: search → detail → reviews for market research."""

    all_products = []

    for term in search_terms:
        print(f"\n{'='*50}")
        print(f"Researching: {term}")

        # Step 1: Get search results
        products = scrape_etsy_search(term, pages=2)

        # Step 2: Get details for top 5 products
        for product in products[:5]:
            if product.get("url"):
                details = scrape_product_details(product["url"])
                product.update(details)
                time.sleep(random.uniform(3, 6))

        for p in products:
            p["search_term"] = term
        all_products.extend(products)

        time.sleep(random.uniform(5, 10))

    # Save results
    if all_products:
        flat_products = []
        for p in all_products:
            flat = {k: v for k, v in p.items() if not isinstance(v, (list, dict))}
            flat_products.append(flat)

        keys = set()
        for p in flat_products:
            keys.update(p.keys())

        with open(f"{output_prefix}.csv", "w", newline="", encoding="utf-8") as f:
            writer = csv.DictWriter(f, fieldnames=sorted(keys))
            writer.writeheader()
            writer.writerows(flat_products)

        print(f"\nSaved {len(flat_products)} products to {output_prefix}.csv")

    return all_products

# Research multiple niches
niches = [
    "handmade+candles",
    "custom+jewelry",
    "vintage+clothing",
    "digital+planner",
    "resin+art",
]

data = research_niche(niches)

Etsy's Open API (Alternative to Scraping)

Before building a scraper, consider the Etsy Open API v3. It provides:

  • Listing search and details
  • Shop information
  • Reviews
  • Category browsing

You need to register an app and get an API key. The free tier allows 10,000 requests per day, which is plenty for most research projects. The main limitation: some data fields are only available to apps with shop owner authorization.

# Etsy API v3 example
ETSY_API_KEY = "your_keystring"

def search_etsy_api(query, limit=25, offset=0):
    """Search Etsy using the official API."""
    url = "https://openapi.etsy.com/v3/application/listings/active"

    headers = {"x-api-key": ETSY_API_KEY}
    params = {
        "keywords": query,
        "limit": limit,
        "offset": offset,
        "sort_on": "score",  # relevance
    }

    response = requests.get(url, headers=headers, params=params)
    return response.json()

If the API gives you what you need, use it. It's faster, more reliable, and won't get you blocked.
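The endpoint above returns at most `limit` results per call, so larger pulls need an offset loop. A sketch — the fetch function is injected as a parameter so the loop itself can be tested without hitting the API:

```python
def paginate_listings(fetch, total=100, limit=25):
    """Collect up to `total` listings by walking limit/offset pages.

    `fetch(limit=..., offset=...)` should return the parsed JSON from the
    /listings/active endpoint (a dict containing a "results" list).
    """
    results = []
    offset = 0
    while len(results) < total:
        data = fetch(limit=limit, offset=offset)
        batch = data.get("results", [])
        if not batch:
            break  # no more pages
        results.extend(batch)
        offset += limit
    return results[:total]

# Wired to the real API from the previous example it would look like:
# listings = paginate_listings(
#     lambda limit, offset: search_etsy_api("candles", limit, offset),
#     total=200,
# )
```

Keep the daily request quota in mind when setting `total` — each page of 25 results costs one request.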

Limitations and Honest Assessment

  1. JavaScript rendering is the main challenge. Unlike some sites where HTML parsing works fine, Etsy's React-based frontend means many elements require browser automation for full extraction.
  2. Anti-bot detection is moderate. Etsy uses bot detection that flags datacenter IPs and unusual request patterns. Residential proxies are necessary for scale.
  3. Etsy has an official API. Unlike many e-commerce sites, Etsy actually provides a decent API. Check it first before scraping — you might not need a scraper at all.
  4. Listing data changes fast. Prices, availability, and listings themselves change frequently. Any dataset you build starts going stale immediately.
  5. Respect seller data. Many Etsy sellers are individuals and small businesses. Don't use scraped data to undercut pricing or copy product ideas wholesale.
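Point 4 is why trend analysis needs repeated snapshots rather than a single export. A minimal sketch that appends timestamped price rows to a CSV you can query later (the column names are my own, matching the search-scraper output above):

```python
import csv
import os
from datetime import datetime, timezone

def append_price_snapshot(products, path="price_history.csv"):
    """Append one timestamped row per product so price changes stay queryable."""
    fieldnames = ["captured_at", "url", "name", "price", "currency"]
    is_new = not os.path.exists(path)
    now = datetime.now(timezone.utc).isoformat()

    with open(path, "a", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames)
        if is_new:
            writer.writeheader()  # header only on first run
        for p in products:
            writer.writerow({"captured_at": now,
                             **{k: p.get(k) for k in fieldnames[1:]}})
    return path
```

Run it on a schedule (cron, GitHub Actions) and after a few weeks you have per-listing price histories instead of a stale one-off dump.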

When Scraping Makes Sense vs. Using the API

  • Basic product search → Etsy API v3
  • Pricing trends over time → Scraper + database
  • Competitor shop analysis → Scraper (API limits shop data)
  • Review sentiment analysis → Etsy API v3 (reviews endpoint)
  • Category-wide market research → Scraper (API pagination limits)
  • One-time data export → Etsy API v3

Wrapping Up

Etsy is one of the more scraping-friendly e-commerce platforms — they have a usable API, their HTML includes JSON-LD structured data, and their anti-bot measures are moderate compared to sites like Amazon or Booking.com.

Start with the official API for basic data needs. Use scraping for data the API doesn't expose — deep category analysis, historical pricing, or cross-shop comparisons at scale. And always use residential proxies when scraping beyond a handful of pages.

The handmade and vintage market is fascinating to analyze with data. Every Etsy niche has its own pricing dynamics, seasonal trends, and competitive patterns. A good scraper turns that into actionable intelligence.

Questions? Drop a comment — I'm happy to help with specific Etsy scraping challenges.
