Scraping Influencer Engagement Rates Across Platforms with Python
Influencer marketing is a $21B industry, but choosing the right influencer is often guesswork. Follower counts can be bought or inflated; engagement rates are a better signal of real audience interest. Let's build a cross-platform influencer analyzer.
What Metrics Matter
- Engagement rate — (likes + comments) / followers
- Posting consistency — frequency and timing
- Audience authenticity — comment quality signals
- Cross-platform presence — same creator, different audiences
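Posting consistency is the one metric in this list that the code later in the article does not compute, so here is a minimal sketch of one way to measure it. It assumes you already have a list of post timestamps as `datetime` objects; the function name `posting_consistency` and the returned keys are illustrative choices, not part of any platform API.

```python
from datetime import datetime
from statistics import mean, pstdev


def posting_consistency(timestamps):
    """Summarize posting cadence from a list of datetime objects."""
    if len(timestamps) < 2:
        return {"posts_per_week": 0.0, "mean_gap_days": None, "gap_cv": None}

    ordered = sorted(timestamps)
    # Gaps between consecutive posts, in days.
    gaps = [
        (later - earlier).total_seconds() / 86400
        for earlier, later in zip(ordered, ordered[1:])
    ]
    avg_gap = mean(gaps)
    span_days = (ordered[-1] - ordered[0]).total_seconds() / 86400

    return {
        # Posting intervals per week over the observed span.
        "posts_per_week": round((len(ordered) - 1) / span_days * 7, 2) if span_days else 0.0,
        "mean_gap_days": round(avg_gap, 2),
        # Coefficient of variation of the gaps: 0 means perfectly regular posting.
        "gap_cv": round(pstdev(gaps) / avg_gap, 2) if avg_gap else None,
    }


# Hypothetical sample: one post every two days.
sample = [datetime(2024, 1, day) for day in (1, 3, 5, 7, 9)]
print(posting_consistency(sample))
```

A low `gap_cv` suggests a steady schedule, while a high value points to bursts followed by long silences. What counts as "high" is a judgment call you would tune on real data.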
Setup
import requests
from bs4 import BeautifulSoup
import json
import re
from datetime import datetime
from statistics import mean
PROXY_URL = "https://api.scraperapi.com"
API_KEY = "YOUR_SCRAPERAPI_KEY"
Social platforms fight scraping aggressively. ScraperAPI handles the evolving anti-bot measures.
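Even behind a proxy service, individual requests can still fail or time out, so a retry wrapper is a common addition. The sketch below is one possible approach, not something ScraperAPI requires: `fetch_with_retries` is a hypothetical helper that takes any zero-argument callable, which keeps it independent of the HTTP library you use.

```python
import random
import time


def fetch_with_retries(fetch, max_attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call fetch() until it succeeds or max_attempts is reached.

    `fetch` is any zero-argument callable that returns a result or raises,
    for example: lambda: requests.get(PROXY_URL, params=params, timeout=60)
    """
    last_error = None
    for attempt in range(max_attempts):
        try:
            return fetch()
        except Exception as exc:  # narrow this to network errors in real use
            last_error = exc
            if attempt < max_attempts - 1:
                # Exponential backoff with a little jitter: about 1s, 2s, 4s, ...
                sleep(base_delay * (2 ** attempt) + random.uniform(0, 0.5))
    raise last_error
```

The `sleep` parameter exists only so the delay can be swapped out in tests; in normal use you would leave it at the default.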
Scraping Public Profile Metrics
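def scrape_instagram_profile(username):
    """Fetch a public Instagram profile page and parse its headline counts."""
    params = {
        "api_key": API_KEY,
        "url": f"https://www.instagram.com/{username}/",
        "render": "true",  # ask the proxy to render JavaScript
    }
    # A timeout keeps a stalled proxy request from hanging the script.
    response = requests.get(PROXY_URL, params=params, timeout=60)
    if response.status_code != 200:
        return None

    soup = BeautifulSoup(response.text, "html.parser")
    meta_desc = soup.select_one("meta[name='description']")
    if meta_desc and meta_desc.get("content"):
        content = meta_desc["content"]
        # The regex expects text shaped like "1.2M Followers, 350 Following, 120 Posts".
        numbers = re.findall(r'([\d,.]+[KMB]?)\s+(Followers|Following|Posts)', content)
        metrics = {}
        for value, label in numbers:
            metrics[label.lower()] = parse_social_number(value)
        return {
            "platform": "instagram",
            "username": username,
            "followers": metrics.get("followers", 0),
            "following": metrics.get("following", 0),
            "posts": metrics.get("posts", 0)
        }
    return None


def parse_social_number(text):
    """Convert strings such as "1,234", "56.7K" or "1.2M" to an integer."""
    text = text.strip().upper().replace(",", "")
    multipliers = {"K": 1000, "M": 1000000, "B": 1000000000}
    for suffix, mult in multipliers.items():
        if text.endswith(suffix):
            # round() guards against float results landing just below a whole number.
            return int(round(float(text[:-1]) * mult))
    return int(float(text)) if text else 0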
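The parsing step can be checked offline, without touching the network. The sketch below is a variant of the logic above rather than a copy: it matches case-insensitively and uses `Decimal` to sidestep float rounding. The names `parse_count` and `parse_description` and the sample string are made up for illustration; real meta descriptions may be worded differently.

```python
import re
from decimal import Decimal

SUFFIXES = {"K": 10**3, "M": 10**6, "B": 10**9}
COUNT_PATTERN = re.compile(
    r"([\d,.]+[KMB]?)\s+(Followers|Following|Posts)", re.IGNORECASE
)


def parse_count(text):
    """Convert "1,024", "56.7K" or "1.2m" to an integer."""
    text = text.strip().upper().replace(",", "")
    if not text:
        return 0
    if text[-1] in SUFFIXES:
        return int(Decimal(text[:-1]) * SUFFIXES[text[-1]])
    return int(Decimal(text))


def parse_description(content):
    """Extract follower, following and post counts from a description string."""
    return {
        label.lower(): parse_count(value)
        for value, label in COUNT_PATTERN.findall(content)
    }


# Hypothetical sample text in the shape the regex expects.
sample_description = "1.2M Followers, 350 Following, 1,024 Posts"
print(parse_description(sample_description))
# {'followers': 1200000, 'following': 350, 'posts': 1024}
```

Keeping the parser separate from the HTTP call makes it easy to rerun against saved page snapshots when the page layout changes.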
Engagement Rate Calculation
def calculate_engagement_rate(profile, posts):
    if not posts or not profile.get("followers"):
        return 0
    engagements = [p["likes"] + p["comments"] for p in posts]
    avg_engagement = mean(engagements)
    return round((avg_engagement / profile["followers"]) * 100, 3)
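A mean-based rate can be pulled upward by a single viral post. As a hedged illustration, the sketch below reports a median-based rate next to the mean so the two can be compared. `engagement_summary` is a hypothetical helper and the post numbers are invented sample data.

```python
from statistics import mean, median


def engagement_summary(followers, posts):
    """Return mean and median engagement rates as percentages of followers."""
    if not posts or not followers:
        return {"mean_rate": 0.0, "median_rate": 0.0}
    engagements = [p["likes"] + p["comments"] for p in posts]
    return {
        "mean_rate": round(mean(engagements) / followers * 100, 3),
        "median_rate": round(median(engagements) / followers * 100, 3),
    }


# Invented sample: two typical posts and one viral outlier.
sample_posts = [
    {"likes": 400, "comments": 20},
    {"likes": 380, "comments": 25},
    {"likes": 9000, "comments": 600},
]
summary = engagement_summary(10000, sample_posts)
print(summary)
```

With this sample the mean rate is 34.75% while the median is 4.2%, which is much closer to what a typical post achieves.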
Authenticity Scoring
def score_authenticity(profile, posts):
    score = 100
    reasons = []

    # Following far more accounts than follow you back is a common spam signal.
    if profile.get("followers", 0) > 0:
        ratio = profile.get("following", 0) / profile["followers"]
        if ratio > 2:
            score -= 20
            reasons.append("High following-to-follower ratio")

    if posts:
        engagements = [p["likes"] + p["comments"] for p in posts]
        avg = mean(engagements) if engagements else 0
        if avg > 0:
            # Coefficient of variation: standard deviation relative to the mean.
            variance = sum((e - avg) ** 2 for e in engagements) / len(engagements)
            cv = (variance ** 0.5) / avg
            if cv > 1.5:
                score -= 15
                reasons.append("Inconsistent engagement patterns")

    # Large accounts rarely sustain engagement rates above 10%.
    er = calculate_engagement_rate(profile, posts)
    if profile.get("followers", 0) > 100000 and er > 10:
        score -= 25
        reasons.append("Suspiciously high engagement for follower count")

    return {"score": max(0, score), "reasons": reasons}
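The scoring above looks only at counts. The "comment quality signals" mentioned earlier could be approximated with a simple text heuristic, sketched below. This is an assumption-heavy illustration: `generic_comment_ratio` and the `GENERIC_PHRASES` list are hypothetical, and a real system would need a much richer phrase list or a trained classifier.

```python
import re

# Hypothetical list of low-effort phrases; extend it for real use.
GENERIC_PHRASES = {"nice", "nice pic", "love it", "great post", "amazing", "wow", "cool"}


def is_generic(comment):
    """True if a comment is empty after stripping symbols or matches a stock phrase."""
    # Drop punctuation and emoji, keeping only word characters and spaces.
    words = re.sub(r"[^\w\s]", "", comment.lower()).strip()
    return words == "" or words in GENERIC_PHRASES


def generic_comment_ratio(comments):
    """Return the share of comments that look low-effort, from 0.0 to 1.0."""
    if not comments:
        return 0.0
    return round(sum(is_generic(c) for c in comments) / len(comments), 2)


# Invented sample comments.
sample_comments = [
    "Nice pic!",
    "!!!",
    "Where did you buy that jacket?",
    "Great post",
    "How long did the edit take?",
]
print(generic_comment_ratio(sample_comments))  # 0.6
```

A high ratio could feed into the authenticity score as another deduction, though any threshold would need checking against accounts you know to be genuine.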
Cross-Platform Report
import time


def cross_platform_report(username_map):
    report = {"profiles": []}
    for platform, username in username_map.items():
        profile = None
        if platform == "instagram":
            profile = scrape_instagram_profile(username)
        # Other platforms need their own scraper functions; they are skipped here.
        if profile:
            report["profiles"].append(profile)
        time.sleep(5)  # pause between requests

    if report["profiles"]:
        total_reach = sum(p.get("followers", 0) for p in report["profiles"])
        report["total_reach"] = total_reach
        # The platform with the most followers is treated as the primary one.
        report["primary_platform"] = max(
            report["profiles"], key=lambda x: x.get("followers", 0)
        )["platform"]
    return report
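The report function above only knows about Instagram. One way to extend it, sketched here under assumptions, is to pass in a mapping from platform name to scraper function so new platforms can be added without editing the loop. `build_report` and the two stub scrapers are hypothetical, and the follower numbers are invented.

```python
def build_report(username_map, scrapers):
    """Build a report using a {platform: scraper_function} lookup table.

    Each scraper takes a username and returns a profile dict or None.
    """
    report = {"profiles": [], "skipped": []}
    for platform, username in username_map.items():
        scraper = scrapers.get(platform)
        profile = scraper(username) if scraper else None
        if profile:
            report["profiles"].append(profile)
        else:
            # No scraper registered, or the scrape returned nothing.
            report["skipped"].append(platform)

    if report["profiles"]:
        report["total_reach"] = sum(p.get("followers", 0) for p in report["profiles"])
        report["primary_platform"] = max(
            report["profiles"], key=lambda p: p.get("followers", 0)
        )["platform"]
    return report


# Stub scrapers with invented numbers, standing in for real network calls.
def fake_instagram(username):
    return {"platform": "instagram", "username": username, "followers": 120000}


def fake_tiktok(username):
    return {"platform": "tiktok", "username": username, "followers": 450000}


demo_report = build_report(
    {"instagram": "creator", "tiktok": "creator_tt", "youtube": "creator_yt"},
    {"instagram": fake_instagram, "tiktok": fake_tiktok},
)
print(demo_report["total_reach"], demo_report["primary_platform"], demo_report["skipped"])
# 570000 tiktok ['youtube']
```

Injecting the scrapers also makes the report logic testable with stubs like these, without any live requests.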
Infrastructure
- ScraperAPI — handles Instagram and TikTok's anti-bot systems
- ThorData — residential proxies essential for social platform access
- ScrapeOps — monitor success rates across social platforms
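If you want a rough view of success rates before wiring up a monitoring service, a small in-process counter can help. The sketch below is a plain Python illustration and does not use or imitate the ScrapeOps API; `SuccessTracker` is a hypothetical class.

```python
from collections import defaultdict


class SuccessTracker:
    """Count successful and failed scrapes per platform."""

    def __init__(self):
        self._counts = defaultdict(lambda: {"ok": 0, "failed": 0})

    def record(self, platform, success):
        self._counts[platform]["ok" if success else "failed"] += 1

    def success_rate(self, platform):
        """Return the success share (0.0 to 1.0), or None if nothing was recorded."""
        counts = self._counts.get(platform)
        if not counts:
            return None
        total = counts["ok"] + counts["failed"]
        return round(counts["ok"] / total, 3)


tracker = SuccessTracker()
for outcome in (True, True, False, True):
    tracker.record("instagram", outcome)
print(tracker.success_rate("instagram"))  # 0.75
```

You would call `record` after each scrape attempt, for example with `profile is not None` as the success flag.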
Conclusion
Engagement rate scraping turns influencer selection from gut feeling into a data-driven decision. Real engagement is a stronger signal than raw follower counts. Build this pipeline, score authenticity, and you will have a more defensible basis for deciding where your influencer budget goes.