The Salary Data Gap
Most job seekers negotiate blind. Let's build a Python tool that scrapes salary data and creates negotiation briefs with percentile ranges.
Setup
pip install requests beautifulsoup4 pandas numpy
Salary sites have anti-bot measures. ScraperAPI handles JavaScript rendering and CAPTCHAs.
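As a rough sketch of how a request through ScraperAPI can be assembled, the helper below builds the proxy URL with the target address properly URL-encoded. The function name `build_scraperapi_url` is hypothetical, and the `render` flag is an assumption about how JavaScript rendering is switched on; check your account's documentation before relying on it.

```python
from urllib.parse import urlencode

SCRAPER_API_ENDPOINT = "http://api.scraperapi.com"

def build_scraperapi_url(api_key, target_url, render=False):
    """Return a ScraperAPI request URL for target_url (hypothetical helper)."""
    params = {"api_key": api_key, "url": target_url}
    if render:
        # Assumption: "render=true" asks the service to execute JavaScript.
        params["render"] = "true"
    # urlencode escapes characters such as ":", "/" and "&" in the target URL,
    # so query strings inside the target do not leak into the outer request.
    return f"{SCRAPER_API_ENDPOINT}?{urlencode(params)}"

if __name__ == "__main__":
    print(build_scraperapi_url("YOUR_KEY", "https://example.com/jobs?page=2", render=True))
```

Encoding the target matters because an unescaped `&` in the target URL would otherwise be read as a separate parameter of the outer request.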
Scraping Salary Data
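import requests
from bs4 import BeautifulSoup
import pandas as pd

SCRAPER_API_KEY = "YOUR_KEY"

def scrape_salary_data(job_title, location):
    # Build the URL slug, e.g. "Senior Software Engineer" -> "senior-software-engineer".
    # location is accepted for a consistent interface; this URL pattern does not use it.
    slug = job_title.replace(" ", "-").lower()
    target = f"https://www.levels.fyi/t/{slug}"
    # Passing params lets requests URL-encode the target address.
    response = requests.get(
        "http://api.scraperapi.com",
        params={"api_key": SCRAPER_API_KEY, "url": target},
        timeout=60,
    )
    response.raise_for_status()
    soup = BeautifulSoup(response.text, "html.parser")
    salaries = []
    # These selectors must match the page's current markup; adjust as needed.
    for card in soup.select(".salary-card"):
        total = extract_number(card.select_one(".total-comp"))
        base = extract_number(card.select_one(".base-salary"))
        if total:
            salaries.append({"total_comp": total, "base": base})
    # Fixed columns keep downstream code working when nothing is scraped.
    return pd.DataFrame(salaries, columns=["total_comp", "base"])

def extract_number(el):
    # Convert an element's text such as "$185,000" to a float, or None if unparseable.
    if el is None:
        return None
    text = el.text.strip().replace("$", "").replace(",", "")
    try:
        return float(text)
    except ValueError:
        return None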
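The number extraction above only handles plain figures like "$185,000". If a source abbreviates compensation (for example "$185K" or "$1.5M", which is an assumption about the markup rather than something confirmed for any particular site), a slightly more tolerant parser could look like the sketch below. The name `parse_comp` is hypothetical.

```python
import re

# Multipliers for abbreviated amounts (assumed formats: "185K", "1.5M").
_SUFFIX = {"k": 1_000, "m": 1_000_000}

def parse_comp(text):
    """Parse "$185K", "$1.5M" or "$185,000" into a float; None if unparseable."""
    if not text:
        return None
    match = re.fullmatch(r"\$?\s*([\d,]+(?:\.\d+)?)\s*([kKmM]?)", text.strip())
    if not match:
        return None
    number = float(match.group(1).replace(",", ""))
    # An empty suffix falls back to a multiplier of 1.
    return number * _SUFFIX.get(match.group(2).lower(), 1)

if __name__ == "__main__":
    for raw in ["$185K", "$1.5M", "$185,000", "N/A"]:
        print(raw, "->", parse_comp(raw))
```

Returning `None` for anything unrecognized keeps bad rows out of the percentile calculation instead of silently turning them into zeros.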
Multi-Source Aggregation
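def aggregate(job_title, location):
    # Add further source scrapers to this list as they are written.
    sources = [scrape_salary_data(job_title, location)]
    frames = [df for df in sources if not df.empty]
    if not frames:
        # pd.concat raises ValueError on an empty list, so fail with a clearer message.
        raise ValueError(f"No salary data found for {job_title!r}")
    all_data = pd.concat(frames, ignore_index=True)
    return {
        "median": all_data["total_comp"].median(),
        "p25": all_data["total_comp"].quantile(0.25),
        "p75": all_data["total_comp"].quantile(0.75),
        "p90": all_data["total_comp"].quantile(0.90),
    }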
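To make the percentile figures less of a black box, here is a minimal pure-Python sketch of linear-interpolation quantiles, the same method pandas uses by default in `quantile`. The sample figures are made up for illustration and are not real salary data.

```python
import math

def quantile(values, q):
    """Linear-interpolation quantile of values for 0 <= q <= 1."""
    if not values:
        raise ValueError("values must not be empty")
    if not 0 <= q <= 1:
        raise ValueError("q must be between 0 and 1")
    data = sorted(values)
    # Fractional index into the sorted data.
    pos = (len(data) - 1) * q
    lower = math.floor(pos)
    upper = math.ceil(pos)
    frac = pos - lower
    # Interpolate between the two neighbouring observations.
    return data[lower] + frac * (data[upper] - data[lower])

if __name__ == "__main__":
    # Illustrative numbers only.
    sample = [100_000, 120_000, 140_000, 160_000, 200_000]
    for label, q in [("p25", 0.25), ("median", 0.5), ("p75", 0.75), ("p90", 0.90)]:
        print(f"{label}: {quantile(sample, q):,.0f}")
```

With only a handful of data points the upper percentiles are driven by one or two observations, so treat P90 from a small scrape as a rough signal at best.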
Negotiation Brief Generator
def generate_brief(job_title, location, offer):
    stats = aggregate(job_title, location)
    print(f"Role: {job_title} in {location}")
    print(f"Your Offer: ${offer:,.0f}")
    # Dictionary keys inside f-strings must be quoted.
    print(f"Market: P25=${stats['p25']:,.0f} Med=${stats['median']:,.0f} P75=${stats['p75']:,.0f}")
    if offer < stats["median"]:
        print(f"Below median. Target ${stats['median']:,.0f}-${stats['p75']:,.0f}")
    elif offer < stats["p75"]:
        print(f"Competitive. Push for ${stats['p75']:,.0f}")
    else:
        print("Strong offer. Negotiate perks.")

generate_brief("Senior Software Engineer", "San Francisco", 185000)
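The brief compares the offer against three fixed cut-offs. A complementary figure is where the offer sits within the collected samples. The sketch below computes that directly; `percentile_rank` is a hypothetical helper and the sample values are illustrative, not scraped.

```python
def percentile_rank(samples, offer):
    """Percentage of samples that are at or below the offer."""
    if not samples:
        raise ValueError("samples must not be empty")
    at_or_below = sum(1 for s in samples if s <= offer)
    return 100.0 * at_or_below / len(samples)

if __name__ == "__main__":
    # Illustrative numbers only.
    sample = [100_000, 120_000, 140_000, 160_000, 200_000]
    rank = percentile_rank(sample, 150_000)
    print(f"Offer is at or above {rank:.0f}% of collected data points")
```

A single number like this can be easier to bring into a conversation than a table of percentiles.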
Proxy Strategy
Use ThorData residential proxies for sites blocking datacenter IPs. Monitor with ScrapeOps.
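For direct requests through a proxy, `requests` accepts a `proxies` mapping. The sketch below builds one from credentials; the host `proxy.example.com`, the port, and the helper name `build_proxies` are placeholders, since the real gateway address and authentication format come from your proxy provider.

```python
from urllib.parse import quote

def build_proxies(username, password, host, port):
    """Return a proxies mapping suitable for requests (hypothetical helper)."""
    # Percent-encode credentials so characters like "@" or ":" do not break the URL.
    user = quote(username, safe="")
    secret = quote(password, safe="")
    proxy_url = f"http://{user}:{secret}@{host}:{port}"
    # The same proxy is used for both plain and TLS traffic.
    return {"http": proxy_url, "https": proxy_url}

if __name__ == "__main__":
    # Placeholder host and credentials, not a real endpoint.
    print(build_proxies("user", "p@ss", "proxy.example.com", 8000))
```

The resulting mapping can then be passed as `requests.get(url, proxies=proxies, timeout=30)`.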
Extending the Tool
- Equity valuation with RSU vesting schedules
- Cost of living normalization across cities
- Trend analysis tracking quarterly salary changes
- Company-specific drilldowns
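As a starting point for the first two ideas, here is a minimal sketch under simplifying assumptions: RSUs vest evenly over the vesting period at a flat share price, and cost of living is expressed as an index where 100 is the baseline city. Both function names and all figures are hypothetical.

```python
def annualized_total_comp(base, bonus, rsu_grant, vesting_years=4):
    """Base + bonus + one year's share of the RSU grant.

    Assumes even vesting and no change in share price.
    """
    if vesting_years <= 0:
        raise ValueError("vesting_years must be positive")
    return base + bonus + rsu_grant / vesting_years

def normalize_for_col(comp, col_index, baseline=100):
    """Scale compensation to the baseline city's cost of living.

    col_index is an assumed index where the baseline city equals `baseline`.
    """
    if col_index <= 0:
        raise ValueError("col_index must be positive")
    return comp * baseline / col_index

if __name__ == "__main__":
    # Illustrative figures only.
    total = annualized_total_comp(base=160_000, bonus=15_000, rsu_grant=200_000)
    print(f"Annualized total comp: ${total:,.0f}")
    print(f"Normalized to baseline city: ${normalize_for_col(total, col_index=150):,.0f}")
```

Real vesting schedules often include cliffs or front-loading, so the even-vesting assumption is the first thing to replace when extending this.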
Conclusion
Data-driven negotiation beats guessing. ScraperAPI makes gathering salary data at scale practical and reliable.