DEV Community

Olamide Olaniyan

How to Find High-Converting Influencers (Without Paying for Expensive Tools)

Influencer marketing tools typically cost $300-500/month, and some charge more.

For that price, you get:

  • A database of "millions of influencers" (90% irrelevant)
  • "AI-powered" recommendations (basic filters)
  • Engagement rate calculations (anyone can do this)
  • "Fake follower detection" (often inaccurate)

I built my own influencer research system for $20/month in API costs.

It's better than the paid tools because I control the criteria. Today I'll show you exactly how to build it.

What We're Building

A Python system that:

  1. Discovers influencers in your niche (not random celebrities)
  2. Scores them on metrics that actually matter
  3. Verifies authenticity (fake followers, engagement pods)
  4. Calculates estimated ROI before you reach out
  5. Exports a prioritized list ready for outreach

The Metrics That Actually Matter

Before we code, let's talk about what we're measuring:

Vanity Metrics (Ignore These)

  • Total followers - Meaningless without context
  • Total likes - Can be bought or botted
  • Follower growth rate - Could be from giveaways

Metrics That Predict Conversions

| Metric | Why It Matters | Target Range |
|---|---|---|
| Engagement Rate | Shows actual audience interest | 3-8% (varies by platform) |
| Comment Sentiment | Positive = buying intent | >70% positive |
| Comment Authenticity | Real comments vs bots | >80% real |
| Audience Demographics | Do they match your customers? | Depends on product |
| Content Relevance | Do they talk about your niche? | >50% relevant posts |
| Posting Consistency | Active = engaged audience | 3-7 posts/week |
| Sponsored Post Performance | How do ads do vs organic? | <30% drop |
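
To make the engagement-rate row concrete, here is a minimal sketch of the same formula the analyzer uses later: (likes + comments) / followers × 100, averaged over recent posts. The sample posts and their `likeCount`/`commentCount` keys are made-up illustration data, not real API output.

```python
from statistics import mean
from typing import Dict, List

def engagement_rate(followers: int, posts: List[Dict]) -> float:
    """Average per-post engagement as a percentage of followers."""
    if followers <= 0 or not posts:
        return 0.0
    rates = [
        (p.get("likeCount", 0) + p.get("commentCount", 0)) / followers * 100
        for p in posts
    ]
    return round(mean(rates), 2)

# Hypothetical account: 20,000 followers, three recent posts
sample_posts = [
    {"likeCount": 900, "commentCount": 100},  # 1,000 interactions -> 5.0%
    {"likeCount": 550, "commentCount": 50},   #   600 interactions -> 3.0%
    {"likeCount": 760, "commentCount": 40},   #   800 interactions -> 4.0%
]
print(engagement_rate(20_000, sample_posts))  # 4.0
```

At 4%, this hypothetical account would sit inside the 3-8% target range from the table.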

Step 1: Project Setup

mkdir influencer-finder && cd influencer-finder
python -m venv venv
source venv/bin/activate

pip install requests pandas numpy python-dotenv textblob

Create .env:

SOCIAVAULT_API_KEY=your_api_key
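
Every request depends on that key, so it can help to fail fast when it is missing. A minimal sketch: `build_headers` is a hypothetical helper that is not part of the original code. It reads the same `SOCIAVAULT_API_KEY` variable and builds the same Bearer header that the classes in the following steps construct.

```python
import os

def build_headers(env_var: str = "SOCIAVAULT_API_KEY") -> dict:
    """Return the Authorization header, or raise if the key is not set."""
    api_key = os.getenv(env_var)
    if not api_key:
        raise RuntimeError(f"{env_var} is not set; add it to .env or export it")
    return {"Authorization": f"Bearer {api_key}"}

# Demo with a placeholder value (load_dotenv() would normally populate this)
os.environ["SOCIAVAULT_API_KEY"] = "your_api_key"
print(build_headers())  # {'Authorization': 'Bearer your_api_key'}
```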

Step 2: The Influencer Discovery System

First, we find potential influencers by analyzing who's active in your niche:

# discovery.py
import os
import requests
from typing import List, Dict, Set
from dotenv import load_dotenv

load_dotenv()

class InfluencerDiscovery:
    def __init__(self):
        self.api_key = os.getenv("SOCIAVAULT_API_KEY")
        self.base_url = "https://api.sociavault.com/v1"
        self.headers = {"Authorization": f"Bearer {self.api_key}"}

    def discover_from_hashtags(
        self, 
        platform: str,
        hashtags: List[str], 
        min_followers: int = 10000,
        max_followers: int = 500000
    ) -> Set[str]:
        """Find influencers posting about specific hashtags."""

        discovered = set()

        for hashtag in hashtags:
            print(f"Searching #{hashtag}...")

            if platform == "instagram":
                posts = self._get_instagram_hashtag_posts(hashtag)
            elif platform == "tiktok":
                posts = self._get_tiktok_hashtag_posts(hashtag)
            else:
                continue

            for post in posts:
                username = post.get("author", {}).get("username") or post.get("author", {}).get("uniqueId")
                followers = post.get("author", {}).get("followerCount", 0)

                if username and min_followers <= followers <= max_followers:
                    discovered.add(username)

        return discovered

    def discover_from_competitors(
        self,
        platform: str,
        competitor_usernames: List[str]
    ) -> Set[str]:
        """Find influencers who engage with competitors."""

        discovered = set()

        for competitor in competitor_usernames:
            print(f"Analyzing @{competitor}'s commenters...")

            # Get recent posts
            if platform == "instagram":
                posts = self._get_instagram_posts(competitor)
            elif platform == "tiktok":
                posts = self._get_tiktok_posts(competitor)
            else:
                continue

            # Analyze commenters
            for post in posts[:10]:  # Last 10 posts
                post_id = post.get("id") or post.get("shortcode")
                comments = self._get_post_comments(platform, post_id)

                for comment in comments:
                    commenter = comment.get("author", {})
                    username = commenter.get("username") or commenter.get("uniqueId")
                    followers = commenter.get("followerCount", 0)
                    is_verified = commenter.get("verified", False)

                    # Look for verified accounts or those with decent following
                    if username and (is_verified or followers > 10000):
                        discovered.add(username)

        return discovered

    def discover_from_keywords(
        self,
        platform: str,
        keywords: List[str],
        min_followers: int = 10000
    ) -> Set[str]:
        """Find influencers by searching bio/content keywords."""

        discovered = set()

        for keyword in keywords:
            print(f"Searching for '{keyword}'...")

            if platform == "tiktok":
                results = self._search_tiktok_users(keyword)
            elif platform == "instagram":
                results = self._search_instagram_users(keyword)
            else:
                continue

            for user in results:
                username = user.get("username") or user.get("uniqueId")
                followers = user.get("followerCount", 0)

                if username and followers >= min_followers:
                    discovered.add(username)

        return discovered

    def _get_instagram_hashtag_posts(self, hashtag: str) -> List[Dict]:
        response = requests.get(
            f"{self.base_url}/scrape/instagram/hashtag",
            params={"hashtag": hashtag, "limit": 50},
            headers=self.headers,
            timeout=30
        )
        return response.json().get("data", []) if response.ok else []

    def _get_tiktok_hashtag_posts(self, hashtag: str) -> List[Dict]:
        response = requests.get(
            f"{self.base_url}/scrape/tiktok/hashtag",
            params={"hashtag": hashtag, "limit": 50},
            headers=self.headers,
            timeout=30
        )
        return response.json().get("data", []) if response.ok else []

    def _get_instagram_posts(self, username: str) -> List[Dict]:
        response = requests.get(
            f"{self.base_url}/scrape/instagram/posts",
            params={"username": username, "limit": 20},
            headers=self.headers,
            timeout=30
        )
        return response.json().get("data", []) if response.ok else []

    def _get_tiktok_posts(self, username: str) -> List[Dict]:
        response = requests.get(
            f"{self.base_url}/scrape/tiktok/videos",
            params={"username": username, "limit": 20},
            headers=self.headers,
            timeout=30
        )
        return response.json().get("data", []) if response.ok else []

    def _get_post_comments(self, platform: str, post_id: str) -> List[Dict]:
        if platform == "instagram":
            endpoint = f"{self.base_url}/scrape/instagram/comments"
            params = {"shortcode": post_id, "limit": 50}
        else:
            endpoint = f"{self.base_url}/scrape/tiktok/comments"
            params = {"videoId": post_id, "limit": 50}

        response = requests.get(endpoint, params=params, headers=self.headers, timeout=30)
        return response.json().get("data", []) if response.ok else []

    def _search_tiktok_users(self, keyword: str) -> List[Dict]:
        response = requests.get(
            f"{self.base_url}/scrape/tiktok/search/users",
            params={"keyword": keyword, "limit": 30},
            headers=self.headers,
            timeout=30
        )
        return response.json().get("data", []) if response.ok else []

    def _search_instagram_users(self, keyword: str) -> List[Dict]:
        response = requests.get(
            f"{self.base_url}/scrape/instagram/search/users",
            params={"query": keyword, "limit": 30},
            headers=self.headers,
            timeout=30
        )
        return response.json().get("data", []) if response.ok else []
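
The follower-range filter in `discover_from_hashtags` can be exercised without spending API credits. This is a minimal sketch, assuming each post carries an `author` object with `username` (or `uniqueId`) and `followerCount`, as the code above expects. `filter_authors` is a hypothetical helper, the sample posts are made up, and the real response shape may differ.

```python
from typing import Dict, List, Set

def filter_authors(
    posts: List[Dict],
    min_followers: int = 10_000,
    max_followers: int = 500_000,
) -> Set[str]:
    """Collect usernames whose follower count falls inside the target range."""
    found = set()
    for post in posts:
        author = post.get("author") or {}
        username = author.get("username") or author.get("uniqueId")
        followers = author.get("followerCount") or 0
        if username and min_followers <= followers <= max_followers:
            found.add(username)
    return found

# Made-up posts standing in for a hashtag response
sample = [
    {"author": {"username": "niche_creator", "followerCount": 42_000}},
    {"author": {"uniqueId": "tiktok_style_id", "followerCount": 15_000}},
    {"author": {"username": "mega_celebrity", "followerCount": 9_000_000}},
    {"author": {"username": "tiny_account", "followerCount": 800}},
    {"author": None},  # missing author data is skipped, not an error
]
print(sorted(filter_authors(sample)))  # ['niche_creator', 'tiktok_style_id']
```

Keeping the filter as a pure function like this makes it easy to tune the follower range against saved responses before running a full discovery pass.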

Step 3: The Authenticity Analyzer

Now we verify these influencers are legit:

# authenticity.py
import os
import re
import requests
from typing import Dict, List
from textblob import TextBlob
from statistics import mean, stdev
from dotenv import load_dotenv

load_dotenv()

class AuthenticityAnalyzer:
    def __init__(self):
        self.api_key = os.getenv("SOCIAVAULT_API_KEY")
        self.base_url = "https://api.sociavault.com/v1"
        self.headers = {"Authorization": f"Bearer {self.api_key}"}

    def analyze(self, platform: str, username: str) -> Dict:
        """Full authenticity analysis of an influencer."""

        print(f"Analyzing @{username}...")

        # Get profile data
        profile = self._get_profile(platform, username)
        if not profile:
            return {"error": "Could not fetch profile"}

        # Get recent posts
        posts = self._get_posts(platform, username)
        if not posts:
            return {"error": "Could not fetch posts"}

        # Get comments for engagement analysis
        all_comments = []
        for post in posts[:5]:  # Sample 5 posts
            post_id = post.get("id") or post.get("shortcode")
            comments = self._get_comments(platform, post_id)
            all_comments.extend(comments)

        # Run analyses
        engagement_analysis = self._analyze_engagement(profile, posts)
        comment_analysis = self._analyze_comments(all_comments)
        consistency_analysis = self._analyze_consistency(posts)
        follower_quality = self._estimate_follower_quality(profile, engagement_analysis)

        # Calculate overall authenticity score
        authenticity_score = self._calculate_authenticity_score(
            engagement_analysis,
            comment_analysis,
            consistency_analysis,
            follower_quality
        )

        return {
            "username": username,
            "platform": platform,
            "followers": profile.get("followerCount", 0),
            "engagement_rate": engagement_analysis["engagement_rate"],
            "authenticity_score": authenticity_score,
            "analyses": {
                "engagement": engagement_analysis,
                "comments": comment_analysis,
                "consistency": consistency_analysis,
                "follower_quality": follower_quality
            },
            "red_flags": self._identify_red_flags(
                engagement_analysis,
                comment_analysis,
                consistency_analysis
            ),
            "recommendation": self._get_recommendation(authenticity_score)
        }

    def _get_profile(self, platform: str, username: str) -> Dict:
        endpoint = f"{self.base_url}/scrape/{platform}/profile"
        response = requests.get(
            endpoint,
            params={"username": username},
            headers=self.headers,
            timeout=30
        )
        return response.json().get("data", {}) if response.ok else {}

    def _get_posts(self, platform: str, username: str) -> List[Dict]:
        endpoint = f"{self.base_url}/scrape/{platform}/{'posts' if platform == 'instagram' else 'videos'}"
        response = requests.get(
            endpoint,
            params={"username": username, "limit": 20},
            headers=self.headers,
            timeout=30
        )
        return response.json().get("data", []) if response.ok else []

    def _get_comments(self, platform: str, post_id: str) -> List[Dict]:
        if platform == "instagram":
            endpoint = f"{self.base_url}/scrape/instagram/comments"
            params = {"shortcode": post_id, "limit": 100}
        else:
            endpoint = f"{self.base_url}/scrape/tiktok/comments"
            params = {"videoId": post_id, "limit": 100}

        response = requests.get(endpoint, params=params, headers=self.headers, timeout=30)
        return response.json().get("data", []) if response.ok else []

    def _analyze_engagement(self, profile: Dict, posts: List[Dict]) -> Dict:
        """Analyze engagement patterns."""

        followers = profile.get("followerCount", 1)

        engagement_rates = []
        like_counts = []
        comment_counts = []

        for post in posts:
            likes = post.get("likeCount") or post.get("likesCount", 0)
            comments = post.get("commentCount") or post.get("commentsCount", 0)

            engagement = (likes + comments) / followers * 100 if followers > 0 else 0
            engagement_rates.append(engagement)
            like_counts.append(likes)
            comment_counts.append(comments)

        avg_engagement = mean(engagement_rates) if engagement_rates else 0
        engagement_std = stdev(engagement_rates) if len(engagement_rates) > 1 else 0

        # High variance in engagement can indicate bought engagement on specific posts
        engagement_consistency = 1 - min(engagement_std / avg_engagement, 1) if avg_engagement > 0 else 0

        return {
            "engagement_rate": round(avg_engagement, 2),
            "engagement_std": round(engagement_std, 2),
            "engagement_consistency": round(engagement_consistency, 2),
            "avg_likes": round(mean(like_counts), 0) if like_counts else 0,
            "avg_comments": round(mean(comment_counts), 0) if comment_counts else 0,
            "like_to_comment_ratio": round(mean(like_counts) / mean(comment_counts), 1) if comment_counts and mean(comment_counts) > 0 else 0
        }

    def _analyze_comments(self, comments: List[Dict]) -> Dict:
        """Analyze comment quality and authenticity."""

        if not comments:
            return {
                "total_analyzed": 0,
                "authentic_ratio": 0,
                "sentiment_positive": 0,
                "avg_comment_length": 0
            }

        authentic_count = 0
        positive_count = 0
        lengths = []

        # Spam patterns
        spam_patterns = [
            r"^(nice|great|wow|cool|love it|amazing|beautiful)!*$",
            r"^🔥+$",
            r"follow me",
            r"check my",
            r"dm for",
            r"^[\W]+$",  # Only emojis/symbols
        ]

        for comment in comments:
            text = comment.get("text", "")

            if not text:
                continue

            lengths.append(len(text))

            # Check for spam patterns (re.search, so unanchored patterns match anywhere in the comment)
            is_spam = any(re.search(pattern, text.lower()) for pattern in spam_patterns)

            # Check if too short
            is_too_short = len(text) < 5

            if not is_spam and not is_too_short:
                authentic_count += 1

                # Sentiment analysis
                blob = TextBlob(text)
                if blob.sentiment.polarity > 0:
                    positive_count += 1

        total = len(comments)

        return {
            "total_analyzed": total,
            "authentic_ratio": round(authentic_count / total, 2) if total > 0 else 0,
            "sentiment_positive": round(positive_count / authentic_count, 2) if authentic_count > 0 else 0,
            "avg_comment_length": round(mean(lengths), 1) if lengths else 0
        }

    def _analyze_consistency(self, posts: List[Dict]) -> Dict:
        """Analyze posting consistency."""

        if len(posts) < 2:
            return {
                "posts_per_week": 0,
                "consistency_score": 0
            }

        # Parse timestamps
        from datetime import datetime

        timestamps = []
        for post in posts:
            ts = post.get("timestamp") or post.get("createTime")
            if ts:
                try:
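                    if isinstance(ts, (int, float)):
                        timestamps.append(datetime.fromtimestamp(ts))
                    else:
                        timestamps.append(datetime.fromisoformat(ts.replace("Z", "+00:00")))
                except (ValueError, TypeError, AttributeError, OverflowError, OSError):
                    # Skip timestamps that cannot be parsed
                    continue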

        if len(timestamps) < 2:
            return {"posts_per_week": 0, "consistency_score": 0}

        timestamps.sort(reverse=True)

        # Calculate posting frequency
        date_range = (timestamps[0] - timestamps[-1]).days
        posts_per_week = len(timestamps) / max(date_range / 7, 1)

        # Calculate gaps between posts
        gaps = []
        for i in range(len(timestamps) - 1):
            gap = (timestamps[i] - timestamps[i + 1]).days
            gaps.append(gap)

        avg_gap = mean(gaps)
        gap_std = stdev(gaps) if len(gaps) > 1 else 0

        # Consistency score (lower variance = more consistent)
        consistency = 1 - min(gap_std / avg_gap, 1) if avg_gap > 0 else 0

        return {
            "posts_per_week": round(posts_per_week, 1),
            "avg_days_between_posts": round(avg_gap, 1),
            "consistency_score": round(consistency, 2)
        }

    def _estimate_follower_quality(self, profile: Dict, engagement: Dict) -> Dict:
        """Estimate follower quality based on engagement patterns."""

        engagement_rate = engagement.get("engagement_rate", 0)
        followers = profile.get("followerCount", 0)

        # Expected engagement rates by follower count (approximate)
        expected_rates = {
            (0, 10000): (5, 15),           # Small: 5-15%
            (10000, 50000): (3, 8),         # Medium: 3-8%
            (50000, 200000): (2, 5),        # Large: 2-5%
            (200000, 1000000): (1, 3),      # Very large: 1-3%
            (1000000, float('inf')): (0.5, 2)  # Mega: 0.5-2%
        }

        expected_min, expected_max = (1, 5)  # Default
        for (low, high), (e_min, e_max) in expected_rates.items():
            if low <= followers < high:
                expected_min, expected_max = e_min, e_max
                break

        # Score based on where they fall in expected range
        if engagement_rate < expected_min:
            quality = "low"
            fake_estimate = min((expected_min - engagement_rate) / expected_min * 100, 80)
        elif engagement_rate > expected_max * 1.5:
            quality = "suspicious"  # Too high might indicate engagement pods
            fake_estimate = 20  # Might be using pods, not necessarily fake followers
        else:
            quality = "good"
            fake_estimate = 10

        return {
            "quality": quality,
            "estimated_fake_percentage": round(fake_estimate, 0),
            "expected_engagement_range": f"{expected_min}-{expected_max}%",
            "actual_engagement": f"{engagement_rate}%"
        }

    def _calculate_authenticity_score(
        self,
        engagement: Dict,
        comments: Dict,
        consistency: Dict,
        follower_quality: Dict
    ) -> float:
        """Calculate overall authenticity score (0-100)."""

        scores = []

        # Engagement consistency (25% weight)
        scores.append(engagement.get("engagement_consistency", 0) * 25)

        # Comment authenticity (30% weight)
        scores.append(comments.get("authentic_ratio", 0) * 30)

        # Posting consistency (15% weight)
        scores.append(consistency.get("consistency_score", 0) * 15)

        # Follower quality (30% weight)
        fake_pct = follower_quality.get("estimated_fake_percentage", 50)
        follower_score = (100 - fake_pct) / 100
        scores.append(follower_score * 30)

        return round(sum(scores), 1)

    def _identify_red_flags(
        self,
        engagement: Dict,
        comments: Dict,
        consistency: Dict
    ) -> List[str]:
        """Identify potential red flags."""

        flags = []

        if engagement.get("engagement_rate", 0) < 1:
            flags.append("Very low engagement rate (<1%)")

        if engagement.get("engagement_rate", 0) > 15:
            flags.append("Suspiciously high engagement rate (>15%)")

        if engagement.get("engagement_consistency", 1) < 0.5:
            flags.append("Inconsistent engagement (possible bought likes on some posts)")

        if comments.get("authentic_ratio", 1) < 0.5:
            flags.append("Many spam/bot comments")

        if engagement.get("like_to_comment_ratio", 0) > 100:
            flags.append("Unusual like-to-comment ratio (possible bought likes)")

        if consistency.get("posts_per_week", 0) < 1:
            flags.append("Inactive account (<1 post/week)")

        return flags

    def _get_recommendation(self, score: float) -> str:
        """Get recommendation based on authenticity score."""

        if score >= 80:
            return "✅ HIGHLY RECOMMENDED - Strong authenticity signals"
        elif score >= 60:
            return "👍 RECOMMENDED - Generally authentic with minor concerns"
        elif score >= 40:
            return "⚠️ PROCEED WITH CAUTION - Some red flags detected"
        else:
            return "❌ NOT RECOMMENDED - Multiple authenticity concerns"
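
Two pieces of the analyzer are easy to check by hand: the spam heuristics and the weighted score. The sketch below uses a subset of the patterns above and the same 25/30/15/30 weights. Note that `re.match` only tests at the start of a string, so unanchored patterns such as "follow me" need `re.search` to catch spam in the middle of a comment. The function names and sample comments are illustrative.

```python
import re

# A subset of the spam heuristics used by the analyzer
SPAM_PATTERNS = [
    r"^(nice|great|wow|cool|love it|amazing|beautiful)!*$",
    r"follow me",
    r"check my",
    r"dm for",
    r"^[\W]+$",  # only emojis/symbols
]

def is_spam(text: str) -> bool:
    """True if any pattern occurs anywhere in the lowercased comment."""
    lowered = text.lower()
    return any(re.search(p, lowered) for p in SPAM_PATTERNS)

def authenticity_score(
    engagement_consistency: float,
    authentic_ratio: float,
    posting_consistency: float,
    estimated_fake_pct: float,
) -> float:
    """Weighted 0-100 score using the 25/30/15/30 split from the analyzer."""
    follower_score = (100 - estimated_fake_pct) / 100
    total = (
        engagement_consistency * 25
        + authentic_ratio * 30
        + posting_consistency * 15
        + follower_score * 30
    )
    return round(total, 1)

print(is_spam("Nice!!"))                          # True
print(is_spam("Please follow me back"))           # True (re.match would miss this)
print(is_spam("Where did you buy that jacket?"))  # False

# 0.8*25 + 0.9*30 + 0.6*15 + 0.9*30 = 20 + 27 + 9 + 27
print(authenticity_score(0.8, 0.9, 0.6, 10))      # 83.0
```

A score of 83 would land in the "highly recommended" band (80 and above).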

Step 4: ROI Calculator

Now let's estimate the potential ROI before reaching out:

# roi_calculator.py
from typing import Dict

class ROICalculator:
    def __init__(
        self,
        avg_order_value: float = 50,
        conversion_rate: float = 0.02,  # assumed 2% of clicks convert; replace with your own data
        content_lifespan_days: int = 30
    ):
        self.avg_order_value = avg_order_value
        self.conversion_rate = conversion_rate
        self.content_lifespan_days = content_lifespan_days

    def calculate(
        self,
        followers: int,
        engagement_rate: float,
        authenticity_score: float,
        estimated_cost: float,
        platform: str = "instagram"
    ) -> Dict:
        """Calculate estimated ROI for an influencer partnership."""

        # Adjust for authenticity (fake followers don't buy)
        real_follower_ratio = authenticity_score / 100
        effective_followers = followers * real_follower_ratio

        # Platform-specific reach estimates
        reach_rates = {
            "instagram": 0.20,  # Stories + Feed reach ~20% of followers
            "tiktok": 0.30,    # TikTok has better organic reach
            "youtube": 0.40,   # Subscribers are more engaged
        }
        reach_rate = reach_rates.get(platform, 0.20)

        # Calculate estimated reach
        estimated_reach = effective_followers * reach_rate

        # Calculate engagement (people who actually interact)
        estimated_engagements = estimated_reach * (engagement_rate / 100)

        # Calculate click-through (typically 1-3% of engagements)
        ctr = 0.02  # 2% CTR
        estimated_clicks = estimated_engagements * ctr

        # Calculate conversions
        estimated_conversions = estimated_clicks * self.conversion_rate

        # Calculate revenue
        estimated_revenue = estimated_conversions * self.avg_order_value

        # Calculate ROI
        roi = ((estimated_revenue - estimated_cost) / estimated_cost) * 100 if estimated_cost > 0 else 0

        # Cost per acquisition
        cpa = estimated_cost / estimated_conversions if estimated_conversions > 0 else float('inf')

        # Cost per engagement
        cpe = estimated_cost / estimated_engagements if estimated_engagements > 0 else float('inf')

        return {
            "input_metrics": {
                "followers": followers,
                "engagement_rate": engagement_rate,
                "authenticity_score": authenticity_score,
                "estimated_cost": estimated_cost
            },
            "reach_metrics": {
                "effective_followers": round(effective_followers),
                "estimated_reach": round(estimated_reach),
                "estimated_engagements": round(estimated_engagements),
                "estimated_clicks": round(estimated_clicks),
            },
            "conversion_metrics": {
                "estimated_conversions": round(estimated_conversions, 1),
                "estimated_revenue": round(estimated_revenue, 2),
            },
            "roi_metrics": {
                "roi_percentage": round(roi, 1),
                "cost_per_acquisition": round(cpa, 2),
                "cost_per_engagement": round(cpe, 4),
                "break_even_conversions": round(estimated_cost / self.avg_order_value, 1)
            },
            "recommendation": self._get_roi_recommendation(roi, cpa)
        }

    def estimate_cost(
        self,
        followers: int,
        platform: str,
        content_type: str = "post"
    ) -> float:
        """Estimate influencer cost based on followers and platform."""

        # Industry standard: $10-100 per 1K followers depending on niche
        # Tech/finance pays more, lifestyle pays less

        base_rates = {
            "instagram": {
                "post": 10,      # $10 per 1K followers
                "story": 5,      # $5 per 1K followers
                "reel": 15,      # $15 per 1K followers
            },
            "tiktok": {
                "post": 12,      # TikTok creators charge slightly more
                "story": 5,
            },
            "youtube": {
                "integration": 20,  # $20 per 1K subscribers
                "dedicated": 50,    # Dedicated video costs more
            }
        }

        platform_rates = base_rates.get(platform, base_rates["instagram"])
        rate_per_1k = platform_rates.get(content_type, 10)

        estimated = (followers / 1000) * rate_per_1k

        # Apply minimum ($100) and caps
        estimated = max(100, estimated)
        estimated = min(50000, estimated)  # Cap at $50K for mega influencers

        return round(estimated, 2)

    def _get_roi_recommendation(self, roi: float, cpa: float) -> str:
        """Get recommendation based on ROI metrics."""

        if roi > 200:
            return "🎯 EXCELLENT - High ROI potential"
        elif roi > 100:
            return "✅ GOOD - Positive ROI expected"
        elif roi > 0:
            return "⚠️ MARGINAL - Low but positive ROI"
        else:
            return "❌ NEGATIVE ROI - Not recommended at this price"
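
It helps to walk the funnel once by hand before trusting the output. The sketch below repeats the stages of `ROICalculator.calculate` with its default rates for a hypothetical Instagram account: 50,000 followers, 4% engagement, authenticity score 80, and the $500 fee that `estimate_cost` would produce for a post. `roi_funnel` is an illustrative name, not part of the original code.

```python
from typing import Dict

def roi_funnel(
    followers: int,
    engagement_rate_pct: float,
    authenticity_score: float,
    cost: float,
    reach_rate: float = 0.20,       # Instagram default from the calculator
    ctr: float = 0.02,              # share of engagements that click
    conversion_rate: float = 0.02,  # share of clicks that buy
    avg_order_value: float = 50.0,
) -> Dict[str, float]:
    """Walk the same funnel as the calculator, one stage per line."""
    effective = followers * authenticity_score / 100   # 40,000
    reach = effective * reach_rate                      # 8,000
    engagements = reach * engagement_rate_pct / 100     # 320
    clicks = engagements * ctr                          # 6.4
    conversions = clicks * conversion_rate              # 0.128
    revenue = conversions * avg_order_value             # $6.40
    roi_pct = (revenue - cost) / cost * 100 if cost > 0 else 0.0
    return {
        "reach": round(reach),
        "engagements": round(engagements),
        "clicks": round(clicks, 2),
        "conversions": round(conversions, 3),
        "revenue": round(revenue, 2),
        "roi_pct": round(roi_pct, 2),
    }

result = roi_funnel(50_000, 4.0, 80, cost=500)
print(result["revenue"], result["roi_pct"])  # 6.4 -98.72
```

With these defaults the funnel shrinks quickly: about 0.13 orders against a break-even of 10 orders ($500 / $50). The click-through and conversion rates have the most leverage on the result, so they are the first values to replace with numbers from your own campaigns.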

Step 5: Main Pipeline

Putting it all together:

#!/usr/bin/env python3
# main.py

import pandas as pd
from datetime import datetime
from discovery import InfluencerDiscovery
from authenticity import AuthenticityAnalyzer
from roi_calculator import ROICalculator

def find_influencers(
    platform: str,
    niche_hashtags: list,
    niche_keywords: list,
    competitor_accounts: list,
    min_followers: int = 10000,
    max_followers: int = 500000,
    avg_order_value: float = 50,
    output_file: str = "influencers.csv"
):
    """Main pipeline to discover and analyze influencers."""

    print(f"\n{'='*60}")
    print(f"Influencer Discovery Pipeline")
    print(f"Platform: {platform}")
    print(f"{'='*60}\n")

    # Initialize components
    discovery = InfluencerDiscovery()
    analyzer = AuthenticityAnalyzer()
    roi_calc = ROICalculator(avg_order_value=avg_order_value)

    # Step 1: Discover influencers
    print("🔍 PHASE 1: Discovery")
    print("-" * 40)

    all_usernames = set()

    # From hashtags
    print(f"Searching hashtags: {niche_hashtags}")
    hashtag_results = discovery.discover_from_hashtags(
        platform, niche_hashtags, min_followers, max_followers
    )
    all_usernames.update(hashtag_results)
    print(f"   Found: {len(hashtag_results)} from hashtags")

    # From keywords
    print(f"Searching keywords: {niche_keywords}")
    keyword_results = discovery.discover_from_keywords(
        platform, niche_keywords, min_followers
    )
    all_usernames.update(keyword_results)
    print(f"   Found: {len(keyword_results)} from keywords")

    # From competitors
    print(f"Analyzing competitors: {competitor_accounts}")
    competitor_results = discovery.discover_from_competitors(
        platform, competitor_accounts
    )
    all_usernames.update(competitor_results)
    print(f"   Found: {len(competitor_results)} from competitors")

    print(f"\n📊 Total unique influencers discovered: {len(all_usernames)}")

    # Step 2: Analyze each influencer
    print(f"\n🔬 PHASE 2: Authenticity Analysis")
    print("-" * 40)

    results = []

    for i, username in enumerate(all_usernames):
        print(f"[{i+1}/{len(all_usernames)}] Analyzing @{username}...")

        try:
            analysis = analyzer.analyze(platform, username)

            if "error" in analysis:
                print(f"   ⚠️ Skipped: {analysis['error']}")
                continue

            # Calculate ROI
            estimated_cost = roi_calc.estimate_cost(
                analysis["followers"],
                platform,
                "post"
            )

            roi = roi_calc.calculate(
                analysis["followers"],
                analysis["engagement_rate"],
                analysis["authenticity_score"],
                estimated_cost,
                platform
            )

            results.append({
                "username": username,
                "platform": platform,
                "followers": analysis["followers"],
                "engagement_rate": analysis["engagement_rate"],
                "authenticity_score": analysis["authenticity_score"],
                "estimated_cost": estimated_cost,
                "estimated_roi": roi["roi_metrics"]["roi_percentage"],
                "estimated_cpa": roi["roi_metrics"]["cost_per_acquisition"],
                "recommendation": analysis["recommendation"],
                "red_flags": ", ".join(analysis["red_flags"]) if analysis["red_flags"] else "None",
                # TikTok profile URLs use an @ prefix before the username
                "profile_url": (
                    f"https://www.tiktok.com/@{username}"
                    if platform == "tiktok"
                    else f"https://{platform}.com/{username}"
                )
            })

            print(f"   ✓ Score: {analysis['authenticity_score']} | ROI: {roi['roi_metrics']['roi_percentage']}%")

        except Exception as e:
            print(f"   ❌ Error: {str(e)}")

    # Step 3: Rank and export
    print(f"\n📋 PHASE 3: Ranking & Export")
    print("-" * 40)

    df = pd.DataFrame(results)

    if len(df) > 0:
        # Sort by composite score (authenticity + ROI)
        df["composite_score"] = df["authenticity_score"] * 0.6 + df["estimated_roi"].clip(upper=200) * 0.4 / 2
        df = df.sort_values("composite_score", ascending=False)

        # Save to CSV
        df.to_csv(output_file, index=False)
        print(f"✅ Saved {len(df)} influencers to {output_file}")

        # Print top 10
        print(f"\n🏆 TOP 10 INFLUENCERS:")
        print("-" * 60)

        for i, row in df.head(10).iterrows():
            print(f"{row['username']:20} | {row['followers']:>10,} followers | "
                  f"Auth: {row['authenticity_score']:>5.1f} | ROI: {row['estimated_roi']:>6.1f}%")
    else:
        print("⚠️ No valid influencers found")

    print(f"\n{'='*60}")
    print(f"Pipeline complete!")
    print(f"{'='*60}\n")

    return df


if __name__ == "__main__":
    # Example: Find influencers for a SaaS product

    results = find_influencers(
        platform="instagram",
        niche_hashtags=[
            "saas", "startuplife", "techstartup", 
            "entrepreneurship", "buildinpublic"
        ],
        niche_keywords=[
            "SaaS founder", "tech entrepreneur", 
            "startup advisor", "growth marketing"
        ],
        competitor_accounts=[
            "stripe", "notion", "figma"
        ],
        min_followers=10000,
        max_followers=500000,
        avg_order_value=99,  # Your product price
        output_file="saas_influencers.csv"
    )

Output Example

🏆 TOP 10 INFLUENCERS:
------------------------------------------------------------
@growthsarah          |      45,000 followers | Auth:  91.2 | ROI:  312.8%
@buildinpublic_mike   |      32,000 followers | Auth:  89.5 | ROI:  287.1%
@techfounder_jane     |     125,000 followers | Auth:  87.3 | ROI:  245.2%
@startupsteve         |      85,000 followers | Auth:  82.1 | ROI:  198.4%
...
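
The ranking comes from the composite score in main.py: 60% authenticity, plus ROI capped at 200% and scaled to at most 40 points. Here is a minimal sketch in plain Python (no pandas) that applies that formula to the four sample rows; the tuple layout is just for illustration.

```python
# (username, authenticity_score, estimated_roi) from the sample output
rows = [
    ("techfounder_jane", 87.3, 245.2),
    ("startupsteve", 82.1, 198.4),
    ("growthsarah", 91.2, 312.8),
    ("buildinpublic_mike", 89.5, 287.1),
]

def composite_score(authenticity: float, roi_pct: float) -> float:
    """60% authenticity + ROI capped at 200% and scaled to at most 40 points."""
    return round(authenticity * 0.6 + min(roi_pct, 200) * 0.4 / 2, 2)

ranked = sorted(rows, key=lambda r: composite_score(r[1], r[2]), reverse=True)
for name, auth, roi in ranked:
    print(f"{name:20} {composite_score(auth, roi):6.2f}")
# growthsarah           94.72
# buildinpublic_mike    93.70
# techfounder_jane      92.38
# startupsteve          88.94
```

Because ROI is capped at 200%, three of the four rows get the full 40 ROI points, and authenticity alone decides their order.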

Cost Comparison

Let's compare this to paid tools:

| Tool | Monthly Cost | What You Get |
|---|---|---|
| Upfluence | $499/mo | Database access, basic analytics |
| Grin | $999/mo | Full platform, CRM features |
| CreatorIQ | Custom | Enterprise features |
| This system | ~$20/mo | Custom scoring, full control |

API costs for this system:

  • 100 influencer analyses = ~$5 in API calls
  • Unlimited re-runs on your prioritized list
  • Custom metrics for YOUR business
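
Those figures can be tied back to the code by counting requests. `analyze()` fetches one profile, one post list, and comments for up to five posts. Discovery makes one call per hashtag and keyword, plus a posts call and up to ten comment calls per competitor. The sketch below does that arithmetic. The per-request price is only what the ~$5 estimate implies, not a published rate, and actual billing may work differently.

```python
def analysis_requests(posts_sampled_for_comments: int = 5) -> int:
    """Requests per analyze() call: 1 profile + 1 posts + 1 per sampled post."""
    return 2 + posts_sampled_for_comments

def discovery_requests(
    hashtags: int,
    keywords: int,
    competitors: int,
    posts_per_competitor: int = 10,
) -> int:
    """Upper bound for discovery: one call per hashtag and per keyword, plus
    one posts call and one comments call per sampled post for each competitor."""
    return hashtags + keywords + competitors * (1 + posts_per_competitor)

# Figures quoted above, in cents to keep the arithmetic exact
COST_PER_100_ANALYSES_CENTS = 500   # "~$5" per 100 analyses (approximate)
MONTHLY_BUDGET_CENTS = 2000         # "~$20/mo"

analyses_per_month = MONTHLY_BUDGET_CENTS * 100 // COST_PER_100_ANALYSES_CENTS
implied_cents_per_request = COST_PER_100_ANALYSES_CENTS / 100 / analysis_requests()

print(analysis_requests())                  # 7
print(discovery_requests(5, 4, 3))          # 42 for the SaaS example in main.py
print(analyses_per_month)                   # 400
print(round(implied_cents_per_request, 2))  # 0.71 (cents, derived from the estimate)
```

On these assumptions a $20 budget covers roughly 400 analyses a month, and a discovery pass like the SaaS example adds a few dozen requests on top.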

Wrap Up

Stop paying $500/month for influencer tools that don't understand your business.

Build this system, customize the scoring for what matters to YOU, and make data-driven influencer decisions.

The complete code is roughly 800 lines of Python across four files. You can build it in an afternoon.


Need the data APIs? SociaVault provides profile, post, and comment data for Instagram, TikTok, YouTube, and more. Pay-as-you-go pricing.

Questions? Drop them in the comments or hit me up on Twitter @sociavault.
