How to Build a Competitor Feature Tracker from Product Changelogs
Knowing what your competitors ship — and when — is competitive intelligence gold. Most SaaS products publish changelogs, release notes, or "what's new" pages. Let's scrape them systematically.
Why Track Competitor Features?
- Spot trends before they become table stakes
- Identify gaps in your own product
- Time your launches strategically
- Feed product roadmap discussions with data
Setup
```python
import requests
from bs4 import BeautifulSoup
import hashlib
from datetime import datetime

PROXY_URL = "https://api.scraperapi.com"
API_KEY = "YOUR_SCRAPERAPI_KEY"
```
Changelog pages often rely on JavaScript rendering. ScraperAPI handles this with its `render=true` parameter.
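As a minimal sketch, the proxy request parameters can be built by a small helper; the helper name is ours, not part of any ScraperAPI client:

```python
def build_proxy_params(api_key, target_url, render=True):
    """Build the query parameters for a proxied, optionally rendered fetch."""
    params = {"api_key": api_key, "url": target_url}
    if render:
        # render=true asks ScraperAPI to execute JavaScript before returning HTML
        params["render"] = "true"
    return params
```

A fetch then becomes `requests.get(PROXY_URL, params=build_proxy_params(API_KEY, url))`, keeping the rendering decision in one place.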
Defining Competitor Sources
```python
COMPETITORS = {
    "notion": {
        "url": "https://www.notion.so/releases",
        "selector": ".release-note",
        "title_sel": "h2",
        "date_sel": "time",
        "body_sel": ".release-body"
    },
    "linear": {
        "url": "https://linear.app/changelog",
        "selector": "article",
        "title_sel": "h2",
        "date_sel": "time",
        "body_sel": ".changelog-content"
    }
}
```
The Core Scraper
```python
def scrape_changelog(name, config):
    """Fetch one competitor's changelog through the rendering proxy and parse it."""
    params = {
        "api_key": API_KEY,
        "url": config["url"],
        "render": "true"  # execute JavaScript before returning HTML
    }
    response = requests.get(PROXY_URL, params=params, timeout=90)
    response.raise_for_status()
    soup = BeautifulSoup(response.text, "html.parser")

    entries = []
    for item in soup.select(config["selector"]):
        title = item.select_one(config["title_sel"])
        date = item.select_one(config["date_sel"])
        body = item.select_one(config["body_sel"])
        title_text = title.text.strip() if title else ""
        entries.append({
            "competitor": name,
            "title": title_text,
            "date": date.text.strip() if date else "",
            "body": body.text.strip()[:500] if body else "",
            # Hash the title so the same release isn't treated as new on every run
            "hash": hashlib.md5(title_text.encode()).hexdigest(),
            "scraped_at": datetime.now().isoformat()
        })
    return entries
```
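The selector-driven parsing can be exercised offline against a small HTML fixture before spending API credits; the fixture and config below are invented for illustration:

```python
from bs4 import BeautifulSoup

# Invented fixture mimicking one changelog entry
SAMPLE_HTML = """
<article class="release-note">
  <h2>Database automations</h2>
  <time>2024-03-01</time>
  <div class="release-body">Trigger actions when a row changes.</div>
</article>
"""

config = {"selector": ".release-note", "title_sel": "h2",
          "date_sel": "time", "body_sel": ".release-body"}

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")
item = soup.select_one(config["selector"])
title = item.select_one(config["title_sel"]).text.strip()
date = item.select_one(config["date_sel"]).text.strip()
body = item.select_one(config["body_sel"]).text.strip()
# → title == "Database automations", date == "2024-03-01"
```

If a site redesign breaks a selector, this kind of fixture-based check fails loudly instead of silently scraping empty entries.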
Change Detection with SQLite
```python
import sqlite3

def init_tracker_db():
    conn = sqlite3.connect("competitor_tracker.db")
    conn.execute('''CREATE TABLE IF NOT EXISTS features (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        competitor TEXT, title TEXT, date TEXT,
        body TEXT, hash TEXT UNIQUE, scraped_at TEXT
    )''')
    conn.commit()
    return conn

def detect_new_features(conn, entries):
    new_features = []
    for entry in entries:
        try:
            conn.execute(
                "INSERT INTO features (competitor, title, date, body, hash, scraped_at) VALUES (?,?,?,?,?,?)",
                (entry["competitor"], entry["title"], entry["date"],
                 entry["body"], entry["hash"], entry["scraped_at"])
            )
            new_features.append(entry)
        except sqlite3.IntegrityError:
            # UNIQUE(hash) violation: we've already stored this release
            pass
    conn.commit()
    return new_features
```
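The `UNIQUE` constraint on `hash` is what makes repeated runs idempotent. A quick in-memory check of that behavior, using the same schema (the sample row is made up):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE features (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    competitor TEXT, title TEXT, date TEXT,
    body TEXT, hash TEXT UNIQUE, scraped_at TEXT
)""")

row = ("notion", "Database automations", "2024-03-01",
       "Trigger actions when a row changes.", "abc123", "2024-03-02T09:00:00")
sql = "INSERT INTO features (competitor, title, date, body, hash, scraped_at) VALUES (?,?,?,?,?,?)"

conn.execute(sql, row)           # first run: inserted
try:
    conn.execute(sql, row)       # second run: same hash, rejected
    duplicate_inserted = True
except sqlite3.IntegrityError:
    duplicate_inserted = False

count = conn.execute("SELECT COUNT(*) FROM features").fetchone()[0]
# → count == 1, duplicate_inserted == False
```

Only rows that survive the insert are reported, so "new features" means exactly "hashes the database has never seen."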
Feature Categorization
```python
CATEGORIES = {
    "ai": ["ai", "machine learning", "gpt", "copilot", "assistant"],
    "collaboration": ["real-time", "multiplayer", "share", "team"],
    "integration": ["api", "webhook", "integration", "connect"],
    "performance": ["faster", "speed", "performance", "optimize"]
}

def categorize_feature(entry):
    text = (entry["title"] + " " + entry["body"]).lower()
    matched = []
    for category, keywords in CATEGORIES.items():
        if any(kw in text for kw in keywords):
            matched.append(category)
    return matched or ["other"]
```
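For example, with the keyword lists above, a release note that mentions an AI assistant and a new API endpoint lands in two categories (the sample entry is made up):

```python
CATEGORIES = {
    "ai": ["ai", "machine learning", "gpt", "copilot", "assistant"],
    "collaboration": ["real-time", "multiplayer", "share", "team"],
    "integration": ["api", "webhook", "integration", "connect"],
    "performance": ["faster", "speed", "performance", "optimize"]
}

def categorize_feature(entry):
    text = (entry["title"] + " " + entry["body"]).lower()
    matched = [cat for cat, kws in CATEGORIES.items()
               if any(kw in text for kw in kws)]
    return matched or ["other"]

entry = {"title": "Meet your new AI assistant",
         "body": "Ask questions in natural language via the new API endpoint."}
categorize_feature(entry)  # → ["ai", "integration"]
```

Note the matching is naive substring search, so a short keyword like "ai" also matches inside words like "email"; word-boundary regexes (`\bai\b`) tighten this up if false positives become a problem.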
Proxy Infrastructure
Scraping multiple SaaS sites reliably requires good infrastructure:
- ScraperAPI — JavaScript rendering for modern changelog pages
- ThorData — residential proxies to avoid detection across multiple targets
- ScrapeOps — centralized monitoring across all your scraping jobs
Conclusion
A competitor feature tracker turns public information into strategic advantage. Run it weekly, categorize automatically, and your product team gets a data-driven view of the competitive landscape — no manual research required.