Amazon URL Parameter Construction: A Complete Guide for Data Scraping

Amazon URL parameter construction is the foundation of efficient e-commerce data scraping. This guide covers:

  • ✅ Complete 14-parameter reference table
  • ✅ 5 practical use cases with code
  • ✅ Python URL builder class
  • ✅ Common pitfalls and solutions

What is URL Parameter Construction?

URL parameter construction means programmatically building complete page access links according to Amazon's official URL structure rules. Instead of manually searching on Amazon's website, you generate precise URLs and send them directly to a scraping API.

Manual approach: Open Amazon → Enter keyword → Set filters → Navigate pages

URL construction: One line of code → Get data directly

Efficiency comparison:

  • Scraping 100 keywords × 10 price ranges = 1,000 manual operations
  • With URL construction: one nested loop (sketched below)
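
This is the collapse the bullets describe. A minimal sketch, assuming prices are given in cents (per the reference table below); the keyword and price lists are illustrative placeholders:

from urllib.parse import quote_plus

keywords = ['laptop', 'wireless mouse']          # imagine 100 entries
price_ranges = [(5000, 10000), (10000, 20000)]   # imagine 10 ranges, in cents

for kw in keywords:
    for low, high in price_ranges:
        url = (f"https://www.amazon.com/s?k={quote_plus(kw)}"
               f"&low-price={low}&high-price={high}")
        # send url to your scraping API here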

Why Learn URL Parameter Construction?

1. 100x Efficiency Boost

Automate batch scraping instead of manual operations.

2. Precise Data Control

  • Specify zip code for regional pricing
  • Filter by price range
  • Sort by sales, rating, or price
  • Combine complex conditions (4+ stars + Prime + price range)

3. Reduce API Costs

Correct URL parameters ensure accurate data in one request, avoiding repeated calls due to parameter errors.

Amazon URL Parameter Reference Table

Here's the complete parameter list - bookmark this for quick reference:

| Parameter | Type | Description | Example Values |
|-----------|------|-------------|----------------|
| k | Search Control | Search keyword; supports multiple words | wireless+headphones |
| i | Search Control | Category ID; limits search scope | electronics, aps (all) |
| rh | Filter Conditions | Composite filter (brand, rating, etc.) | p_72:1249150011 (4+ stars) |
| low-price | Filter Conditions | Minimum price (cents in most categories) | 5000 (= $50.00) |
| high-price | Filter Conditions | Maximum price (cents in most categories) | 20000 (= $200.00) |
| s | Sorting Method | Result sorting rule | price-asc-rank (price ascending) |
| page | Pagination Control | Page number (usually max 20 pages) | 1, 2, 3...20 |
| ref | Tracking Identifier | Source tracking; simulates a real user | sr_pg_1, nb_sb_noss |
| qid | Tracking Identifier | Query timestamp | 1702284567 |
| node | Category Navigation | Category node ID, used for bestseller lists | 172282 (Electronics) |
| field-keywords | Search Control | Keyword (legacy parameter) | laptop |
| bbn | Category Navigation | Browse Bin Number | 172282 |
| ie | Encoding Setting | Character encoding | UTF8 |
| tag | Affiliate Identifier | Amazon Associates ID | youraffid-20 |

Important Notes:

  • 💰 Price parameters use cents in most categories: $50 = 5000 (see the conversion helper below)
  • 🔤 Encode spaces as + or %20
  • 📄 Amazon limits to 20 pages; use price segmentation to bypass
  • 🌍 Different sites (.com, .co.uk, .co.jp) may have different category IDs
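
Because getting the unit wrong silently shifts your price window, it's worth converting once in a helper. A minimal sketch of that conversion:

def dollars_to_cents(dollars: float) -> int:
    """Convert a dollar amount to the integer cents Amazon's price parameters expect."""
    return int(round(dollars * 100))

print(dollars_to_cents(50))      # 5000  -> $50.00
print(dollars_to_cents(199.99))  # 19999 -> $199.99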

5 Practical Use Cases

Case 1: Basic Search

Requirement: Scrape "wireless headphones" in Electronics category

URL Construction:

https://www.amazon.com/s?k=wireless+headphones&i=electronics&ref=nb_sb_noss

Using Pangolin Scrape API:

import requests

url = "https://www.amazon.com/s?k=wireless+headphones&i=electronics&ref=nb_sb_noss"
response = requests.post('https://api.pangolinfo.com/scrape', json={
    'url': url,
    'type': 'search',
    'format': 'json'
})
data = response.json()
print(f"Found {len(data.get('products', []))} products")

Case 2: Price Filtering

Requirement: Scrape Bluetooth speakers priced $50-$200, sorted by price ascending

URL Construction:

https://www.amazon.com/s?k=bluetooth+speaker&i=electronics&low-price=5000&high-price=20000&s=price-asc-rank

Note: 5000 = $50.00 (cents unit)
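
The same URL can be assembled programmatically. A short sketch using urlencode, so the parameter names stay exactly as in the reference table:

from urllib.parse import urlencode, quote_plus

params = {
    'k': 'bluetooth speaker',
    'i': 'electronics',
    'low-price': 5000,     # $50.00 in cents
    'high-price': 20000,   # $200.00 in cents
    's': 'price-asc-rank'
}
url = f"https://www.amazon.com/s?{urlencode(params, quote_via=quote_plus)}"
print(url)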

Case 3: Bestseller List

Requirement: Get Electronics category Best Sellers

URL Construction:

https://www.amazon.com/gp/bestsellers/electronics/ref=zg_bs_nav_electronics_0

Note: Bestseller URL structure differs from search pages; category name is directly in the path.
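
If you scrape several bestseller categories, the path-based structure is easy to template. A sketch, assuming you know the category slug for each marketplace (e.g. electronics, toys-and-games):

def bestsellers_url(category_slug: str) -> str:
    """Build a Best Sellers URL; the ref suffix mirrors the one Amazon uses."""
    return (f"https://www.amazon.com/gp/bestsellers/{category_slug}"
            f"/ref=zg_bs_nav_{category_slug}_0")

print(bestsellers_url('electronics'))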

Case 4: Multi-Page Scraping

Requirement: Scrape first 5 pages of "laptop" results

Python Code:

base_url = "https://www.amazon.com/s"
keyword = "laptop"

for page in range(1, 6):
    url = f"{base_url}?k={keyword}&page={page}&ref=sr_pg_{page}"
    print(f"Page {page}: {url}")
    # Send to API for scraping

Case 5: Complex Filtering

Requirement: Laptops with 4+ stars, Prime badge, priced $100-$500

URL Construction:

https://www.amazon.com/s?k=laptop&i=computers&rh=p_72:1249150011,p_85:2470955011&low-price=10000&high-price=50000

rh Parameter Breakdown:

  • p_72:1249150011 = 4+ stars rating
  • p_85:2470955011 = Prime products
  • Multiple conditions are joined with commas (see the helper sketch below)
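
A small helper keeps the comma-joining consistent when you combine filters. The IDs below are the two from this case; any others you pass are assumed to be valid rh codes:

def build_rh(*conditions: str) -> str:
    """Join rh filter conditions with commas, as Amazon expects."""
    return ','.join(conditions)

rh = build_rh('p_72:1249150011',   # 4+ stars
              'p_85:2470955011')   # Prime
print(rh)  # p_72:1249150011,p_85:2470955011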

Python URL Builder Class

from urllib.parse import urlencode, quote_plus
from typing import Optional

class AmazonURLBuilder:
    """Amazon URL Builder"""

    BASE_URLS = {
        'us': 'https://www.amazon.com',
        'uk': 'https://www.amazon.co.uk',
        'jp': 'https://www.amazon.co.jp'
    }

    SORT_OPTIONS = {
        'relevance': 'relevanceblender',
        'price_asc': 'price-asc-rank',
        'price_desc': 'price-desc-rank',
        'review': 'review-rank',
        'newest': 'date-desc-rank'
    }

    def __init__(self, marketplace: str = 'us'):
        self.base_url = self.BASE_URLS.get(marketplace, self.BASE_URLS['us'])

    def build_search_url(
        self,
        keyword: str,
        category: Optional[str] = None,
        min_price: Optional[float] = None,
        max_price: Optional[float] = None,
        sort_by: str = 'relevance',
        page: int = 1
    ) -> str:
        """Build search URL"""
        params = {
            'k': keyword,
            's': self.SORT_OPTIONS.get(sort_by, sort_by),
            'page': page,
            'ref': f'sr_pg_{page}'
        }

        if category:
            params['i'] = category

        # Convert price to cents
        if min_price is not None:
            params['low-price'] = int(min_price * 100)
        if max_price is not None:
            params['high-price'] = int(max_price * 100)

        query_string = urlencode(params, quote_via=quote_plus)
        return f"{self.base_url}/s?{query_string}"

# Usage example
builder = AmazonURLBuilder()
url = builder.build_search_url(
    keyword='wireless headphones',
    category='electronics',
    min_price=50,
    max_price=200,
    sort_by='price_asc'
)
print(url)
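
With the arguments above, the builder emits a URL along the lines of https://www.amazon.com/s?k=wireless+headphones&s=price-asc-rank&page=1&ref=sr_pg_1&i=electronics&low-price=5000&high-price=20000 (parameters follow dict insertion order).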

Common Pitfalls

❌ Mistake 1: Wrong Price Unit

# Wrong
params = {'low-price': 50, 'high-price': 200}  # Interpreted as $0.50-$2.00

# Correct
params = {'low-price': 5000, 'high-price': 20000}  # $50-$200

❌ Mistake 2: Unencoded Spaces

# Wrong
url = "https://www.amazon.com/s?k=wireless headphones"  # Will error

# Correct
url = "https://www.amazon.com/s?k=wireless+headphones"
# Or
from urllib.parse import quote_plus
url = f"https://www.amazon.com/s?k={quote_plus('wireless headphones')}"

❌ Mistake 3: Ignoring 20-Page Limit

Problem: Amazon search shows max 20 pages

Solution: Price segmentation

price_ranges = [(0, 50), (50, 100), (100, 200), (200, 500)]

for min_p, max_p in price_ranges:
    for page in range(1, 21):  # 20 pages per price segment
        url = builder.build_search_url(
            'laptop',
            min_price=min_p,
            max_price=max_p,
            page=page
        )
        # Scrape data
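
One detail to watch: if Amazon treats the bounds as inclusive, a product priced exactly on a boundary ($50, $100, $200) can show up in two adjacent segments, so deduplicate by ASIN when merging results.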

Integration with Pangolin Scrape API

import requests

class PangolinScraper:
    def __init__(self, api_key: str):
        self.api_key = api_key
        self.api_url = 'https://api.pangolinfo.com/scrape'
        self.url_builder = AmazonURLBuilder()

    def scrape_search(self, keyword: str, **kwargs):
        """Scrape search results"""
        url = self.url_builder.build_search_url(keyword, **kwargs)

        payload = {
            'api_key': self.api_key,
            'url': url,
            'type': 'search',
            'format': 'json'
        }

        response = requests.post(self.api_url, json=payload)
        response.raise_for_status()
        return response.json()

# Usage
scraper = PangolinScraper(api_key='your_key')
data = scraper.scrape_search(
    keyword='wireless mouse',
    category='electronics',
    min_price=10,
    max_price=50
)
print(f"Found {len(data.get('products', []))} products")
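
Batch jobs also benefit from basic retry handling. A minimal sketch using only requests and time, assuming transient failures surface as HTTP errors or timeouts:

import time
import requests

def scrape_with_retries(scraper: PangolinScraper, keyword: str,
                        retries: int = 3, backoff: float = 2.0, **kwargs):
    """Retry transient failures with exponential backoff."""
    for attempt in range(1, retries + 1):
        try:
            return scraper.scrape_search(keyword, **kwargs)
        except requests.RequestException:
            if attempt == retries:
                raise
            time.sleep(backoff ** attempt)  # wait 2s, 4s, 8s...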

Pangolin Advantages:

  • ✅ 98% sponsored ad capture rate
  • ✅ Automatic URL parameter handling
  • ✅ Supports zip code, price, and complex filtering
  • ✅ Returns structured JSON data

Tool Recommendations

For Technical Teams: Pangolin Scrape API

  • Automatic URL parameter handling
  • High-quality data return
  • Pay-as-you-go pricing

For Non-Technical Users: AMZ Data Tracker

  • Zero-code configuration
  • Visual interface
  • Scheduled scraping (minute-level)
  • Anomaly alerts

Summary

Master these 3 core points:

  1. Prices use cents: $50 = 5000
  2. Encode spaces: Use + or %20
  3. Bypass limits: Price segmentation solves 20-page limit

URL parameter construction boosts scraping efficiency by 100x. Whether you build your own tools or use professional APIs like Pangolin, understanding these fundamentals is essential.


Found this helpful? Drop a ❤️ and bookmark for later!

Questions? Drop them in the comments below! 👇

#python #webdev #datascience #scraping #ecommerce
