TL;DR: Amazon seller data overload is a real problem, but it's fixable with the right infrastructure. This tutorial shows you how to unify BSR, keyword SERP, and ad position data using Pangolinfo's Scrape API — so your analytics dashboard answers questions instead of creating more of them.
The Problem
Amazon operators typically juggle three data sources that don't talk to each other:
- BSR rankings — updates hourly, extracted from product pages
- ABA keyword data — weekly CSV downloads from Seller Central (manual)
- Advertising reports — 24-48 hour delay, different metrics entirely
When BSR drops 20 positions, diagnosing the root cause means manually correlating data from all three sources. Average time: 2-3 hours per incident. And the diagnosis still might be wrong.
The fix is a unified data pipeline that aligns these three streams on a shared time axis. Here's how to build one.
Prerequisites
Install the dependencies:
pip install requests psycopg2-binary python-dotenv
Then create a .env file with your API key and database connection string:
PANGOLINFO_API_KEY=your_key_here
DATABASE_URL=postgresql://user:pass@localhost:5432/amazon_analytics
Step 1: Database Schema
-- Unified time-series schema for Amazon operational data
CREATE TABLE asin_snapshots (
id BIGSERIAL PRIMARY KEY,
asin VARCHAR(20) NOT NULL,
marketplace VARCHAR(5) NOT NULL DEFAULT 'US',
bsr_main INTEGER,
bsr_category VARCHAR(100),
price DECIMAL(10,2),
review_count INTEGER,
rating DECIMAL(3,2),
collected_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);
CREATE TABLE keyword_snapshots (
id BIGSERIAL PRIMARY KEY,
keyword TEXT NOT NULL,
target_asin VARCHAR(20),
organic_rank SMALLINT, -- NULL = not on first page
sp_positions_count SMALLINT NOT NULL DEFAULT 0,
target_sp_rank SMALLINT, -- NULL = no SP ad running
collected_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);
CREATE INDEX ON asin_snapshots (asin, collected_at DESC);
CREATE INDEX ON keyword_snapshots (keyword, collected_at DESC);
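With both tables in place, a quick sanity check is pulling the latest snapshot per ASIN. This DISTINCT ON query is a sketch of the pattern most dashboard panels will reuse, assuming at least one snapshot has been collected:

-- Latest snapshot per ASIN (served by the (asin, collected_at DESC) index)
SELECT DISTINCT ON (asin)
    asin, bsr_main, price, review_count, rating, collected_at
FROM asin_snapshots
ORDER BY asin, collected_at DESC;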
Step 2: Data Collector
"""
collector.py - Pangolinfo Scrape API wrapper
Handles ASIN snapshots and keyword SERP collection
"""
import os
import requests
from datetime import datetime, timezone
from dataclasses import dataclass
from typing import Optional
from dotenv import load_dotenv

load_dotenv()  # read PANGOLINFO_API_KEY from .env (this is why python-dotenv is installed)

API_KEY = os.getenv("PANGOLINFO_API_KEY")
BASE_URL = "https://api.pangolinfo.com/v1/amazon"
HEADERS = {"Authorization": f"Bearer {API_KEY}"}
@dataclass
class AsinSnapshot:
asin: str
marketplace: str
bsr_main: Optional[int]
bsr_category: Optional[str]
price: Optional[float]
review_count: Optional[int]
rating: Optional[float]
collected_at: str
@dataclass
class KeywordSnapshot:
keyword: str
target_asin: Optional[str]
organic_rank: Optional[int]
sp_positions_count: int
target_sp_rank: Optional[int]
collected_at: str
def collect_asin(asin: str, marketplace: str = "US") -> AsinSnapshot:
"""Collect product snapshot: BSR, price, reviews, ratings."""
    resp = requests.post(
        f"{BASE_URL}/product",
        headers=HEADERS,
        json={
            "asin": asin,
            "marketplace": marketplace,
            "fields": ["bsr", "price", "reviews", "ratings"],
            "output_format": "json"
        },
        timeout=30  # fail fast instead of hanging on a stalled connection
    )
resp.raise_for_status()
d = resp.json()
return AsinSnapshot(
asin=asin,
marketplace=marketplace,
bsr_main=d.get("bsr", {}).get("main_category_rank"),
bsr_category=d.get("bsr", {}).get("main_category_name"),
price=d.get("price", {}).get("current"),
review_count=d.get("reviews", {}).get("count"),
rating=d.get("ratings", {}).get("average"),
        collected_at=datetime.now(timezone.utc).isoformat()
)
def collect_keyword(
keyword: str,
target_asin: Optional[str] = None,
marketplace: str = "US"
) -> KeywordSnapshot:
"""
Collect keyword SERP: organic ranks + sponsored positions.
Note: Sponsored ad position extraction is one of the more
technically challenging aspects of Amazon scraping due to
dynamic rendering. Pangolinfo handles this at the API level,
so you get complete SP position data reliably.
"""
    resp = requests.post(
        f"{BASE_URL}/keyword-serp",
        headers=HEADERS,
        json={
            "keyword": keyword,
            "marketplace": marketplace,
            "include_ad_positions": True,
            "output_format": "json"
        },
        timeout=30
    )
resp.raise_for_status()
d = resp.json()
# Find target ASIN's organic position
organic_rank = None
if target_asin:
for item in d.get("organic_results", []):
if item.get("asin") == target_asin:
organic_rank = item.get("position")
break
# Find target ASIN's sponsored position
sp_positions = d.get("sponsored_positions", [])
target_sp_rank = None
if target_asin:
for sp in sp_positions:
if sp.get("asin") == target_asin:
target_sp_rank = sp.get("position")
break
return KeywordSnapshot(
keyword=keyword,
target_asin=target_asin,
organic_rank=organic_rank,
sp_positions_count=len(sp_positions),
target_sp_rank=target_sp_rank,
        collected_at=datetime.now(timezone.utc).isoformat()
)
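Scrape APIs occasionally return transient network errors, so production collectors usually wrap calls in a retry. Here's a minimal sketch, assuming exponential backoff on any requests-level failure is acceptable for your call volume (with_retries is a hypothetical helper, not part of any Pangolinfo SDK):

import time
import requests

def with_retries(fn, attempts: int = 3, base_delay: float = 2.0):
    """Call fn(), retrying on network errors with exponential backoff."""
    for attempt in range(attempts):
        try:
            return fn()
        except requests.RequestException:
            if attempt == attempts - 1:
                raise  # out of retries, surface the error
            time.sleep(base_delay * (2 ** attempt))  # waits 2s, then 4s, ...

# Usage: snapshot = with_retries(lambda: collect_asin("B09XXXXXXX"))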
Step 3: Database Writer
"""
writer.py - Persist collected data to PostgreSQL
"""
import os
import psycopg2
from dataclasses import asdict
from dotenv import load_dotenv
from collector import AsinSnapshot, KeywordSnapshot

load_dotenv()  # read DATABASE_URL from .env

conn = psycopg2.connect(os.getenv("DATABASE_URL"))
def write_asin_snapshot(snapshot: AsinSnapshot):
with conn.cursor() as cur:
cur.execute("""
INSERT INTO asin_snapshots
(asin, marketplace, bsr_main, bsr_category, price, review_count, rating, collected_at)
VALUES
(%(asin)s, %(marketplace)s, %(bsr_main)s, %(bsr_category)s,
%(price)s, %(review_count)s, %(rating)s, %(collected_at)s)
""", asdict(snapshot))
conn.commit()
def write_keyword_snapshot(snapshot: KeywordSnapshot):
with conn.cursor() as cur:
cur.execute("""
INSERT INTO keyword_snapshots
(keyword, target_asin, organic_rank, sp_positions_count, target_sp_rank, collected_at)
VALUES
(%(keyword)s, %(target_asin)s, %(organic_rank)s,
%(sp_positions_count)s, %(target_sp_rank)s, %(collected_at)s)
""", asdict(snapshot))
conn.commit()
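One caveat with a shared module-level connection: after a failed INSERT, PostgreSQL leaves the transaction aborted and rejects further statements until you roll back. A minimal sketch of guarding the writers, assuming you want later inserts to keep working after one failure:

def safe_write(write_fn, snapshot):
    """Run a writer function; roll back on failure so the connection stays usable."""
    try:
        write_fn(snapshot)
    except psycopg2.Error:
        conn.rollback()  # clear the aborted transaction state
        raise

# Usage: safe_write(write_asin_snapshot, snapshot)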
Step 4: Scheduled Monitoring Job
"""
monitor.py - Daily monitoring routine
Run via cron: 0 */6 * * * python monitor.py
"""
import json
from collector import collect_asin, collect_keyword
from writer import write_asin_snapshot, write_keyword_snapshot
# Configure your monitoring scope here
CONFIG = {
"target_asin": "B09XXXXXXX",
"competitor_asins": ["B08YYYYYYY", "B07ZZZZZZZ"],
"core_keywords": [
"camping chair lightweight",
"portable outdoor chair",
"folding camp chair"
],
"marketplace": "US"
}
def run_monitoring():
all_asins = [CONFIG["target_asin"]] + CONFIG["competitor_asins"]
print("Collecting ASIN snapshots...")
for asin in all_asins:
snapshot = collect_asin(asin, CONFIG["marketplace"])
write_asin_snapshot(snapshot)
print(f" [{asin}] BSR: {snapshot.bsr_main} | Reviews: {snapshot.review_count}")
print("\nCollecting keyword SERP data...")
for keyword in CONFIG["core_keywords"]:
snapshot = collect_keyword(keyword, CONFIG["target_asin"], CONFIG["marketplace"])
write_keyword_snapshot(snapshot)
print(
f" [{keyword}] "
f"Organic: {snapshot.organic_rank or 'not ranked'} | "
f"SP slots: {snapshot.sp_positions_count} | "
f"Your SP: {snapshot.target_sp_rank or 'not running'}"
)
if __name__ == "__main__":
run_monitoring()
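As the watchlist grows, two small additions keep the job robust: a pause between API calls to stay inside rate limits, and per-item error isolation so one failed ASIN doesn't abort the whole run. A sketch of the ASIN loop with both, assuming a 1-second pause fits your plan's limits:

import time

def collect_all_asins():
    all_asins = [CONFIG["target_asin"]] + CONFIG["competitor_asins"]
    for asin in all_asins:
        try:
            write_asin_snapshot(collect_asin(asin, CONFIG["marketplace"]))
        except Exception as exc:
            print(f"  [{asin}] collection failed: {exc}")  # log and move on
        time.sleep(1)  # pace requests to respect API rate limits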
Step 5: Anomaly Detection Query
Once you have a few days of data, this query detects BSR drops that may require immediate attention:
-- Detect BSR drops > 10% vs previous collection
WITH ranked AS (
SELECT
asin,
bsr_main,
LAG(bsr_main) OVER (PARTITION BY asin ORDER BY collected_at) AS prev_bsr,
collected_at
FROM asin_snapshots
WHERE collected_at >= NOW() - INTERVAL '24 hours'
),
anomalies AS (
SELECT
asin,
prev_bsr,
bsr_main AS current_bsr,
ROUND(((bsr_main - prev_bsr)::float / prev_bsr * 100)::numeric, 1) AS change_pct,
collected_at
FROM ranked
WHERE prev_bsr IS NOT NULL
AND bsr_main > prev_bsr * 1.10 -- BSR increased (rank worsened) by 10%+
)
SELECT * FROM anomalies ORDER BY change_pct DESC;
To find the likely cause, cross-reference each anomaly against keyword_snapshots from the same time window, as shown below, and check whether the BSR drop coincides with a sponsored-position displacement.
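A sketch of that cross-reference, meant to replace the final SELECT of the anomaly query above so the anomalies CTE stays in scope (the ±2-hour window is an assumption sized for a 6-hour collection cycle):

-- For each BSR anomaly, pull keyword snapshots collected nearby in time
SELECT
    a.asin,
    a.change_pct,
    a.collected_at AS anomaly_time,
    k.keyword,
    k.organic_rank,
    k.sp_positions_count,
    k.target_sp_rank
FROM anomalies a
JOIN keyword_snapshots k
  ON k.target_asin = a.asin
 AND k.collected_at BETWEEN a.collected_at - INTERVAL '2 hours'
                        AND a.collected_at + INTERVAL '2 hours'
ORDER BY a.collected_at, k.keyword;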
What You Get
After running for a few days, you'll have a queryable time-series of:
- ASIN BSR trajectory (visualize in Grafana or Metabase)
- Keyword organic rank changes per collection window
- SP position count fluctuations on your core keywords
- Your ASIN's presence/absence in sponsored positions
When BSR drops, run the anomaly query. Cross-reference with keyword data. The root cause that used to take three hours manually is now visible in one joined query.
No-Code Alternative
Building and maintaining this pipeline requires ongoing engineering time. If you want the same structured, unified view without writing code, AMZ Data Tracker provides a configuration interface for the same monitoring logic — ASIN tracking, keyword ranking, competitor monitoring — with automatic collection and a built-in dashboard.
API documentation for the Scrape API is at docs.pangolinfo.com. Free trial credits available.
Have questions about the implementation or want to share how you've built your own Amazon analytics stack? Drop a comment — happy to discuss architecture tradeoffs.