DEV Community

agenthustler
agenthustler

Posted on

Scraping Influencer Engagement Rates Across Platforms with Python

Scraping Influencer Engagement Rates Across Platforms with Python

Influencer marketing is a $21B industry, but choosing the right influencer is guesswork. Follower counts lie — engagement rates tell the truth. Let's build a cross-platform influencer analyzer.

What Metrics Matter

  • Engagement rate — (likes + comments) / followers
  • Posting consistency — frequency and timing
  • Audience authenticity — comment quality signals
  • Cross-platform presence — same creator, different audiences

Setup

import requests
from bs4 import BeautifulSoup
import json
import re
from datetime import datetime
from statistics import mean

PROXY_URL = "https://api.scraperapi.com"
API_KEY = "YOUR_SCRAPERAPI_KEY"
Enter fullscreen mode Exit fullscreen mode

Social platforms fight scraping aggressively. ScraperAPI handles the evolving anti-bot measures.

Scraping Public Profile Metrics

def scrape_instagram_profile(username):
    params = {
        "api_key": API_KEY,
        "url": f"https://www.instagram.com/{username}/",
        "render": "true"
    }
    response = requests.get(PROXY_URL, params=params)
    soup = BeautifulSoup(response.text, "html.parser")

    meta_desc = soup.select_one("meta[name='description']")
    if meta_desc:
        content = meta_desc["content"]
        numbers = re.findall(r'([\d,.]+[KMB]?)\s+(Followers|Following|Posts)', content)
        metrics = {}
        for value, label in numbers:
            metrics[label.lower()] = parse_social_number(value)

        return {
            "platform": "instagram",
            "username": username,
            "followers": metrics.get("followers", 0),
            "following": metrics.get("following", 0),
            "posts": metrics.get("posts", 0)
        }
    return None

def parse_social_number(text):
    text = text.strip().upper().replace(",", "")
    multipliers = {"K": 1000, "M": 1000000, "B": 1000000000}
    for suffix, mult in multipliers.items():
        if text.endswith(suffix):
            return int(float(text[:-1]) * mult)
    return int(float(text)) if text else 0
Enter fullscreen mode Exit fullscreen mode

Engagement Rate Calculation

def calculate_engagement_rate(profile, posts):
    if not posts or not profile.get("followers"):
        return 0
    engagements = [p["likes"] + p["comments"] for p in posts]
    avg_engagement = mean(engagements)
    return round((avg_engagement / profile["followers"]) * 100, 3)
Enter fullscreen mode Exit fullscreen mode

Authenticity Scoring

def score_authenticity(profile, posts):
    score = 100
    reasons = []

    if profile.get("followers", 0) > 0:
        ratio = profile.get("following", 0) / profile["followers"]
        if ratio > 2:
            score -= 20
            reasons.append("High following-to-follower ratio")

    if posts:
        engagements = [p["likes"] + p["comments"] for p in posts]
        avg = mean(engagements) if engagements else 0
        if avg > 0:
            variance = sum((e - avg) ** 2 for e in engagements) / len(engagements)
            cv = (variance ** 0.5) / avg
            if cv > 1.5:
                score -= 15
                reasons.append("Inconsistent engagement patterns")

    er = calculate_engagement_rate(profile, posts)
    if profile.get("followers", 0) > 100000 and er > 10:
        score -= 25
        reasons.append("Suspiciously high engagement for follower count")

    return {"score": max(0, score), "reasons": reasons}
Enter fullscreen mode Exit fullscreen mode

Cross-Platform Report

def cross_platform_report(username_map):
    report = {"profiles": []}
    for platform, username in username_map.items():
        if platform == "instagram":
            profile = scrape_instagram_profile(username)
        if profile:
            report["profiles"].append(profile)
        import time; time.sleep(5)

    if report["profiles"]:
        total_reach = sum(p.get("followers", 0) for p in report["profiles"])
        report["total_reach"] = total_reach
        report["primary_platform"] = max(
            report["profiles"], key=lambda x: x.get("followers", 0)
        )["platform"]
    return report
Enter fullscreen mode Exit fullscreen mode

Infrastructure

  • ScraperAPI — handles Instagram and TikTok's anti-bot systems
  • ThorData — residential proxies essential for social platform access
  • ScrapeOps — monitor success rates across social platforms

Conclusion

Engagement rate scraping turns influencer selection from gut feeling into data science. Real engagement beats follower counts every time. Build this pipeline, score authenticity, and your influencer marketing ROI will improve dramatically.

Top comments (0)