Olamide Olaniyan

Build an Audience Overlap Detector: Find Shared Followers Across Creators

Two influencers. Same niche. Same follower count. Same engagement rate.

You pick both for a campaign. Spend $6,000. Expect to reach 200K unique people.

But 40% of that combined follower count is duplicates: the same people following both creators. You actually reached 120K unique people, and $2,400 of the budget went to reaching people twice.

Audience overlap analysis is one of the most underused tools in influencer marketing. Few teams do it because the data is hard to get. But if you have access to follower lists, the math is dead simple.

Here's how to build it.

The Stack

  • Node.js – runtime
  • SociaVault API – fetch follower lists across platforms
  • Set operations – intersection, union, Jaccard similarity

Why Overlap Matters

Scenario    | Budget | Expected Reach | Actual Reach | Waste
0% overlap  | $6,000 | 200K           | 200K         | $0
20% overlap | $6,000 | 200K           | 160K         | $1,200
40% overlap | $6,000 | 200K           | 120K         | $2,400
60% overlap | $6,000 | 200K           | 80K          | $3,600

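Here, overlap means the share of the combined follower count that is duplicated, so wasted spend is simply budget × overlap. Even 20% overlap means 20% of your budget buys reach you already had. On a $50K campaign, that's $10K thrown away.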
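
The table's arithmetic fits in a few lines. This is a minimal sketch that assumes, as the table does, that overlap is the share of combined followers that are duplicates and that cost scales evenly with follower count. `overlapWaste` is a hypothetical helper, not part of any API.

```javascript
// Hypothetical helper: reproduces the table's arithmetic.
// overlapShare is the fraction (0..1) of combined followers that are duplicates.
function overlapWaste(budget, expectedReach, overlapShare) {
  const actualReach = Math.round(expectedReach * (1 - overlapShare));
  const wastedSpend = Math.round(budget * overlapShare);
  return { actualReach, wastedSpend };
}

console.log(overlapWaste(6000, 200000, 0.4));
// { actualReach: 120000, wastedSpend: 2400 }
```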

Step 1: Fetch Follower Data

const axios = require('axios');

const API_BASE = 'https://api.sociavault.com/v1';
const API_KEY = process.env.SOCIAVAULT_API_KEY;

const api = axios.create({
  baseURL: API_BASE,
  headers: { 'x-api-key': API_KEY },
});

async function getFollowers(platform, username, limit = 5000) {
  const followers = [];
  let cursor = null;

  while (followers.length < limit) {
    const params = { limit: Math.min(200, limit - followers.length) };
    if (cursor) params.cursor = cursor;

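    const { data } = await api.get(`/${platform}/followers/${username}`, { params });
    const page = data.followers || [];

    followers.push(...page);
    cursor = data.nextCursor;

    // Stop when there are no more pages, or when a page comes back empty
    // (otherwise an empty page with a cursor would loop forever)
    if (!cursor || page.length === 0) break;
  }

  // Trim in case the last page overshot the requested limit
  return followers.slice(0, limit);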
}

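Note: For large accounts (1M+ followers), fetching the full list is often impractical, so this tutorial works with a sample of 5,000-10,000 followers per creator. Treat the results as a relative comparison between creator pairs rather than a precise estimate. The first pages an API returns are not necessarily a random sample, and intersecting two partial lists undercounts the true overlap, because a shared follower only shows up if they land in both samples.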
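
How much precision can a sample of that size give? One rough way to gauge it is the standard margin of error for a proportion. The sketch below assumes the sample is drawn at random from creator A's followers and that each sampled follower is checked against creator B's complete list; `overlapMarginOfError` is a hypothetical helper name.

```javascript
// Hypothetical helper: 95% margin of error for an overlap fraction
// measured on a random sample of `sampleSize` followers.
// Uses the normal approximation: 1.96 * sqrt(p * (1 - p) / n).
function overlapMarginOfError(overlapFraction, sampleSize) {
  const standardError = Math.sqrt((overlapFraction * (1 - overlapFraction)) / sampleSize);
  return 1.96 * standardError;
}

// A 30% overlap measured on 5,000 sampled followers
const margin = overlapMarginOfError(0.3, 5000);
console.log(`±${(margin * 100).toFixed(1)} percentage points`); // ±1.3 percentage points
```

The figure only holds under those sampling assumptions; it says nothing about lists that were collected some other way.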

Step 2: Calculate Overlap

Set intersection is the core operation. Two follower lists → how many appear in both?

function calculateOverlap(followersA, followersB) {
  // Create sets of user IDs for O(1) lookup
  const setA = new Set(followersA.map(f => f.userId));
  const setB = new Set(followersB.map(f => f.userId));

  // Find intersection (users in both sets)
  const intersection = new Set();
  for (const userId of setA) {
    if (setB.has(userId)) {
      intersection.add(userId);
    }
  }

  // Union = A + B - intersection
  const unionSize = setA.size + setB.size - intersection.size;

  // Guard against empty lists so the ratios never come out as NaN
  const ratio = (part, whole) => (whole === 0 ? 0 : part / whole);

  return {
    creatorAFollowers: setA.size,
    creatorBFollowers: setB.size,
    sharedFollowers: intersection.size,
    uniqueReach: unionSize,

    // Overlap as percentage of each creator's audience
    overlapPercentA: parseFloat((ratio(intersection.size, setA.size) * 100).toFixed(1)),
    overlapPercentB: parseFloat((ratio(intersection.size, setB.size) * 100).toFixed(1)),

    // Jaccard similarity (0 = no overlap, 1 = identical audiences)
    jaccardSimilarity: parseFloat(ratio(intersection.size, unionSize).toFixed(3)),
  };
}

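Jaccard similarity (shared followers divided by the union of both audiences) is a standard metric for set overlap. A rough guide for reading it: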

  • 0.0 = completely different audiences
  • 0.1-0.2 = minimal overlap (good for reach campaigns)
  • 0.2-0.4 = moderate overlap (same niche, consider carefully)
  • 0.4+ = high overlap (you're paying twice for the same eyeballs)
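
A tiny worked example makes the metric concrete. This is a sketch on plain ID arrays rather than API responses. `describeOverlap` is a hypothetical helper that maps a score onto the bands above, and it assumes scores below 0.1 belong with "minimal".

```javascript
// Jaccard similarity for two arrays of IDs: |A ∩ B| / |A ∪ B|
function jaccard(idsA, idsB) {
  const setA = new Set(idsA);
  const setB = new Set(idsB);
  let shared = 0;
  for (const id of setA) {
    if (setB.has(id)) shared++;
  }
  const unionSize = setA.size + setB.size - shared;
  return unionSize === 0 ? 0 : shared / unionSize;
}

// Hypothetical helper mapping a score onto the bands listed above.
// Assumption: scores below 0.1 are grouped with "minimal".
function describeOverlap(score) {
  if (score >= 0.4) return 'high';
  if (score >= 0.2) return 'moderate';
  return 'minimal';
}

// 2 shared IDs (3 and 4) out of 6 distinct IDs overall
const score = jaccard([1, 2, 3, 4], [3, 4, 5, 6]);
console.log(score.toFixed(3), describeOverlap(score)); // 0.333 moderate
```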

Step 3: Pairwise Analysis for Multiple Creators

When comparing 5+ creators, you need every pair analyzed.

async function analyzeCreatorGroup(platform, usernames) {
  // Fetch all follower lists
  console.log(`Fetching followers for ${usernames.length} creators...`);
  const followerMap = {};

  for (const username of usernames) {
    console.log(`  Fetching @${username}...`);
    followerMap[username] = await getFollowers(platform, username, 5000);
  }

  // Calculate pairwise overlap
  const results = [];

  for (let i = 0; i < usernames.length; i++) {
    for (let j = i + 1; j < usernames.length; j++) {
      const a = usernames[i];
      const b = usernames[j];

      const overlap = calculateOverlap(followerMap[a], followerMap[b]);

      results.push({
        creatorA: a,
        creatorB: b,
        ...overlap,
      });
    }
  }

  return results;
}
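
The number of comparisons grows quickly with the size of the group: n creators produce n(n - 1)/2 pairs. The sketch below (both helper names are hypothetical) makes that growth concrete and lists the pairs in the same order as the nested loop.

```javascript
// Number of unordered pairs among n creators: n * (n - 1) / 2
function pairCount(n) {
  return (n * (n - 1)) / 2;
}

// Enumerate each pair once, mirroring the i < j nested loop
function listPairs(usernames) {
  const pairs = [];
  for (let i = 0; i < usernames.length; i++) {
    for (let j = i + 1; j < usernames.length; j++) {
      pairs.push([usernames[i], usernames[j]]);
    }
  }
  return pairs;
}

console.log(pairCount(5));  // 10
console.log(pairCount(20)); // 190
console.log(listPairs(['a', 'b', 'c']));
// [ [ 'a', 'b' ], [ 'a', 'c' ], [ 'b', 'c' ] ]
```

The set comparisons themselves are cheap. The expensive part is fetching each follower list, which only needs to happen once per creator.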

Step 4: Estimate Unique Reach

The real question: if you pick N creators, how many unique people do you reach?

function estimateUniqueReach(followerMap, selectedCreators) {
  const allFollowerIds = new Set();

  for (const username of selectedCreators) {
    for (const follower of followerMap[username]) {
      allFollowerIds.add(follower.userId);
    }
  }

  // Total if no overlap existed
  const totalRaw = selectedCreators.reduce(
    (sum, u) => sum + followerMap[u].length,
    0
  );

  return {
    rawTotal: totalRaw,
    uniqueReach: allFollowerIds.size,
    overlapCount: totalRaw - allFollowerIds.size,
    efficiency: parseFloat(((allFollowerIds.size / totalRaw) * 100).toFixed(1)),
  };
}

Efficiency is the key metric. 100% means zero overlap. 60% means 40% of your spend hits duplicate eyeballs.
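
To see the efficiency number on data small enough to check by hand, here is a sketch with three toy lists of bare IDs. `reachEfficiency` is a hypothetical compact version of the calculation, and the lists are invented for illustration.

```javascript
// Toy follower lists (IDs only); real lists would come from the API
const toyLists = {
  creator_a: [1, 2, 3, 4, 5],
  creator_b: [4, 5, 6, 7, 8],
  creator_c: [8, 9, 10, 11, 12],
};

// Hypothetical compact version of the efficiency calculation
function reachEfficiency(lists) {
  const unique = new Set();
  let rawTotal = 0;
  for (const ids of Object.values(lists)) {
    rawTotal += ids.length;
    for (const id of ids) unique.add(id);
  }
  const efficiency = rawTotal === 0 ? 0 : (unique.size * 100) / rawTotal;
  return { rawTotal, uniqueReach: unique.size, efficiency };
}

console.log(reachEfficiency(toyLists));
// { rawTotal: 15, uniqueReach: 12, efficiency: 80 }
```

Three followers (4, 5 and 8) appear twice, so 15 follower slots reach only 12 distinct people.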

Step 5: Optimize Creator Selection

Given a pool of 20 creators and a budget for 5, which 5 maximize unique reach?

function optimizeSelection(followerMap, candidates, budget) {
  // `budget` is the number of creators to pick, not a dollar amount.
  // Greedy algorithm: always pick the creator that adds the most unique followers
  const selected = [];
  const reachedFollowers = new Set();

  for (let i = 0; i < budget; i++) {
    let bestCandidate = null;
    let bestNewReach = 0;

    for (const username of candidates) {
      if (selected.includes(username)) continue;

      // Count how many NEW followers this creator would add
      let newReach = 0;
      for (const follower of followerMap[username]) {
        if (!reachedFollowers.has(follower.userId)) {
          newReach++;
        }
      }

      if (newReach > bestNewReach) {
        bestNewReach = newReach;
        bestCandidate = username;
      }
    }

    if (!bestCandidate) break; // Nobody left adds new followers

    selected.push(bestCandidate);
    for (const follower of followerMap[bestCandidate]) {
      reachedFollowers.add(follower.userId);
    }
    console.log(`  Pick ${i + 1}: @${bestCandidate} (+${bestNewReach.toLocaleString()} new followers)`);
  }

  return {
    selected,
    totalUniqueReach: reachedFollowers.size,
  };
}

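This greedy approach is not guaranteed to find the best possible set, but for maximum-coverage problems like this one it is known to reach at least about 63% (1 - 1/e) of the optimal unique reach, and it is often much closer in practice. The first pick is the largest audience. The second pick is the creator with the most followers not already covered. And so on.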
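
Greedy selection can still miss the best combination. The toy case below is a sketch with invented ID lists, deliberately built so that the largest creator is the wrong first pick; it compares the greedy choice against a brute-force search over every pair. All names here are hypothetical.

```javascript
// Toy follower lists chosen so that greedy selection is NOT optimal
const lists = {
  big: [1, 2, 3, 4],
  left: [1, 2, 5],
  right: [3, 4, 6],
};

// Unique reach of a given set of creators
function reachOf(names) {
  const reached = new Set();
  for (const name of names) {
    for (const id of lists[name]) reached.add(id);
  }
  return reached.size;
}

// Greedy: repeatedly add the creator that increases reach the most
function greedyPick(k) {
  const picked = [];
  for (let round = 0; round < k; round++) {
    let best = null;
    let bestReach = reachOf(picked);
    for (const name of Object.keys(lists)) {
      if (picked.includes(name)) continue;
      const reach = reachOf([...picked, name]);
      if (reach > bestReach) {
        bestReach = reach;
        best = name;
      }
    }
    if (best === null) break;
    picked.push(best);
  }
  return picked;
}

// Brute force: try every pair (fine for tiny pools only)
function bestPair() {
  const names = Object.keys(lists);
  let best = [];
  for (let i = 0; i < names.length; i++) {
    for (let j = i + 1; j < names.length; j++) {
      const pair = [names[i], names[j]];
      if (reachOf(pair) > reachOf(best)) best = pair;
    }
  }
  return best;
}

console.log(greedyPick(2), reachOf(greedyPick(2))); // [ 'big', 'left' ] 5
console.log(bestPair(), reachOf(bestPair()));       // [ 'left', 'right' ] 6
```

Greedy reaches 5 of the 6 possible followers here. Brute force finds all 6, but the number of combinations it must try grows too fast to be practical for a pool of 20 creators and larger selections.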

Step 6: Generate the Report

async function runOverlapAnalysis(platform, usernames) {
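  // Fetch each follower list once and reuse it for every calculation below
  const followerMap = {};
  for (const u of usernames) {
    followerMap[u] = await getFollowers(platform, u, 5000);
  }

  // Pairwise overlap, computed from the lists already in memory
  // (calling analyzeCreatorGroup here would fetch every list a second time)
  const pairs = [];
  for (let i = 0; i < usernames.length; i++) {
    for (let j = i + 1; j < usernames.length; j++) {
      const [a, b] = [usernames[i], usernames[j]];
      pairs.push({ creatorA: a, creatorB: b, ...calculateOverlap(followerMap[a], followerMap[b]) });
    }
  }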

  console.log('\n=== AUDIENCE OVERLAP REPORT ===\n');

  // Overlap matrix
  console.log('Pairwise Overlap:');
  for (const pair of pairs.sort((a, b) => b.jaccardSimilarity - a.jaccardSimilarity)) {
    const flag = pair.jaccardSimilarity > 0.3 ? ' ⚠️ HIGH' : '';
    console.log(
      `  @${pair.creatorA} × @${pair.creatorB}: ` +
      `${pair.sharedFollowers.toLocaleString()} shared ` +
      `(${pair.overlapPercentA}% / ${pair.overlapPercentB}%) ` +
      `Jaccard: ${pair.jaccardSimilarity}${flag}`
    );
  }

  // Unique reach estimate
  const reach = estimateUniqueReach(followerMap, usernames);
  console.log(`\nCombined Reach:`);
  console.log(`  Raw total: ${reach.rawTotal.toLocaleString()}`);
  console.log(`  Unique reach: ${reach.uniqueReach.toLocaleString()}`);
  console.log(`  Efficiency: ${reach.efficiency}%`);

  // Optimal selection
  console.log(`\nOptimal 3 from ${usernames.length}:`);
  const optimal = optimizeSelection(followerMap, usernames, 3);
  console.log(`  Unique reach: ${optimal.totalUniqueReach.toLocaleString()}`);

  return { pairs, reach, optimal };
}

// Run it
runOverlapAnalysis('instagram', [
  'fitness_creator_1',
  'fitness_creator_2',
  'fitness_creator_3',
  'fitness_creator_4',
  'fitness_creator_5',
]).catch((err) => console.error('Overlap analysis failed:', err.message));

Sample Output

=== AUDIENCE OVERLAP REPORT ===

Pairwise Overlap:
  @fitness_creator_1 × @fitness_creator_2: 1,847 shared (36.9% / 24.6%) Jaccard: 0.213
  @fitness_creator_3 × @fitness_creator_5: 1,203 shared (24.1% / 16.0%) Jaccard: 0.152
  @fitness_creator_1 × @fitness_creator_4: 412 shared (8.2% / 5.5%) Jaccard: 0.043
  ...

Combined Reach:
  Raw total: 25,000
  Unique reach: 18,200
  Efficiency: 72.8%

Optimal 3 from 5:
  Pick 1: @fitness_creator_2 (+5,000 new followers)
  Pick 2: @fitness_creator_5 (+4,300 new followers)
  Pick 3: @fitness_creator_4 (+3,800 new followers)
  Unique reach: 13,100

Creator 1 and Creator 2 share about 37% of Creator 1's sampled audience, so picking both wastes budget. In this example the optimizer skips Creator 1 entirely.

Read the Full Guide

This is a condensed version. The full guide includes:

  • Cross-platform overlap (same person on Instagram + TikTok)
  • Weighted overlap by engagement quality
  • Visualization with D3.js Venn diagrams
  • Scaling to 100+ creator comparisons

Read the complete guide on SociaVault →


Building influencer analytics tools? SociaVault provides social media data APIs for TikTok, Instagram, YouTube, and 10+ platforms. Fetch follower lists, profiles, posts, and comments through one unified API.

Discussion

Have you ever found massive audience overlap on a campaign? How much budget did it waste? Drop your numbers 👇
