Two influencers. Same niche. Same follower count. Same engagement rate.
You pick both for a campaign. Spend $6,000. Expect to reach 200K unique people.
But 40% of that combined follower count is duplicates, the same people following both creators. You actually reached 120K. You spent $2,400 reaching the same people twice.
Audience overlap analysis is one of the most underused tools in influencer marketing. Few teams run it because the data is hard to get. But if you have access to follower lists, the math is simple set arithmetic.
Here's how to build it.
The Stack
- Node.js – runtime
- SociaVault API – fetch follower lists across platforms
- Set operations – intersection, union, Jaccard similarity
Why Overlap Matters
| Scenario | Budget | Expected Reach | Actual Reach | Waste |
|---|---|---|---|---|
| 0% overlap | $6,000 | 200K | 200K | $0 |
| 20% overlap | $6,000 | 200K | 160K | $1,200 |
| 40% overlap | $6,000 | 200K | 120K | $2,400 |
| 60% overlap | $6,000 | 200K | 80K | $3,600 |
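Overlap here means the share of the combined follower count that is duplicated. Even 20% overlap means 20% of your budget buys reach you already paid for. On a $50K campaign, that's $10K thrown away.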
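The table rows follow from one line of arithmetic. Here is a minimal sketch, assuming (as the table does) that overlap is the share of the combined follower count that is duplicated and that spend is spread evenly across followers. `overlapWaste` is a hypothetical helper name, not part of any library.

```javascript
// Hypothetical helper reproducing the table's arithmetic.
// overlapPercent: share of the combined follower count that is duplicated (0-100).
function overlapWaste(budget, expectedReach, overlapPercent) {
  const actualReach = (expectedReach * (100 - overlapPercent)) / 100;
  const wasted = (budget * overlapPercent) / 100;
  return { actualReach, wasted };
}

for (const pct of [0, 20, 40, 60]) {
  const { actualReach, wasted } = overlapWaste(6000, 200000, pct);
  console.log(`${pct}% overlap: reach ${actualReach}, waste $${wasted}`);
}
// The 40% row prints: 40% overlap: reach 120000, waste $2400
```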
Step 1: Fetch Follower Data
const axios = require('axios');
const API_BASE = 'https://api.sociavault.com/v1';
const API_KEY = process.env.SOCIAVAULT_API_KEY;
const api = axios.create({
baseURL: API_BASE,
headers: { 'x-api-key': API_KEY },
});
async function getFollowers(platform, username, limit = 5000) {
const followers = [];
let cursor = null;
while (followers.length < limit) {
const params = { limit: Math.min(200, limit - followers.length) };
if (cursor) params.cursor = cursor;
const { data } = await api.get(`/${platform}/followers/${username}`, { params });
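// Stop on an empty page so a cursor that never runs out cannot loop forever
if (!data.followers || data.followers.length === 0) break;
followers.push(...data.followers);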
cursor = data.nextCursor;
if (!cursor) break; // No more pages
}
return followers;
}
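To check the cursor loop without spending API calls, the same logic can be run against an in-memory fake. This is only a sketch: `paginate` and `makeFakeFetcher` are hypothetical names, and the page shape (`followers`, `nextCursor`) is assumed to match what `getFollowers` reads from the response.

```javascript
// Generic version of the pagination loop. fetchPage(cursor, pageSize) must
// resolve to { followers: [...], nextCursor: string | null } (assumed shape).
async function paginate(fetchPage, limit, pageSize = 200) {
  const followers = [];
  let cursor = null;
  while (followers.length < limit) {
    const size = Math.min(pageSize, limit - followers.length);
    const page = await fetchPage(cursor, size);
    if (!page.followers || page.followers.length === 0) break; // empty page: stop
    followers.push(...page.followers.slice(0, size)); // never exceed the limit
    cursor = page.nextCursor;
    if (!cursor) break; // no more pages
  }
  return followers;
}

// Fake fetcher over `total` synthetic followers; the cursor is just an offset.
function makeFakeFetcher(total) {
  return async (cursor, size) => {
    const start = cursor ? Number(cursor) : 0;
    const end = Math.min(start + size, total);
    const followers = [];
    for (let i = start; i < end; i++) followers.push({ userId: `u${i}` });
    return { followers, nextCursor: end < total ? String(end) : null };
  };
}

paginate(makeFakeFetcher(1000), 450).then((list) => {
  console.log(list.length); // 450 (pages of 200, 200, 50)
});
```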
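Note: For large accounts (1M+ followers), you don't need the full list. A sample of 5,000-10,000 keeps the run fast and cheap, but treat the overlap it produces as a rough estimate that tends to understate the true figure: a shared follower only counts when they appear in both samples.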
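To put numbers on how sampling affects the shared-follower count, here is a rough sketch. It assumes both lists are independent, uniform random samples, which may not hold if the API returns followers in a fixed order such as most recent first. The function names are hypothetical.

```javascript
// Expected number of shared followers that land in BOTH samples, assuming
// each sample is drawn uniformly at random and independently of the other.
function expectedSampleIntersection(trueShared, sizeA, sizeB, sampleA, sampleB) {
  return (trueShared * sampleA * sampleB) / (sizeA * sizeB);
}

// Inverse: scale an observed sample intersection back up to a rough
// estimate of the true shared-follower count (same assumptions).
function estimateTrueShared(observedShared, sizeA, sizeB, sampleA, sampleB) {
  return (observedShared * sizeA * sizeB) / (sampleA * sampleB);
}

// Two 1M-follower accounts sharing 400K followers, 5,000 sampled from each:
console.log(expectedSampleIntersection(400000, 1e6, 1e6, 5000, 5000)); // 10
console.log(estimateTrueShared(10, 1e6, 1e6, 5000, 5000)); // 400000
```

Under these assumptions, about 10 shared IDs out of 5,000 can correspond to a true overlap of 40% of each audience, so the raw sample intersection should be scaled before it is read as an overlap percentage.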
Step 2: Calculate Overlap
Set intersection is the core operation. Two follower lists → how many appear in both?
function calculateOverlap(followersA, followersB) {
// Create sets of user IDs for O(1) lookup
const setA = new Set(followersA.map(f => f.userId));
const setB = new Set(followersB.map(f => f.userId));
// Find intersection (users in both sets)
const intersection = new Set();
for (const userId of setA) {
if (setB.has(userId)) {
intersection.add(userId);
}
}
// Union = A + B - intersection
const unionSize = setA.size + setB.size - intersection.size;
// Guard against empty lists so the ratios below never come out as NaN
const ratio = (part, whole) => (whole === 0 ? 0 : part / whole);
return {
creatorAFollowers: setA.size,
creatorBFollowers: setB.size,
sharedFollowers: intersection.size,
uniqueReach: unionSize,
// Overlap as percentage of each creator's audience
overlapPercentA: parseFloat((ratio(intersection.size, setA.size) * 100).toFixed(1)),
overlapPercentB: parseFloat((ratio(intersection.size, setB.size) * 100).toFixed(1)),
// Jaccard similarity (0 = no overlap, 1 = identical audiences)
jaccardSimilarity: parseFloat(ratio(intersection.size, unionSize).toFixed(3)),
};
}
Jaccard similarity is the standard metric:
- 0.0 = completely different audiences
- 0.1-0.2 = minimal overlap (good for reach campaigns)
- 0.2-0.4 = moderate overlap (same niche, consider carefully)
- 0.4+ = high overlap (you're paying twice for the same eyeballs)
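A toy example makes the metric concrete. The sketch below uses plain arrays of made-up IDs. `jaccard` and `overlapBand` are hypothetical helpers, and the band labels restate the thresholds listed above, with everything under 0.2 treated as minimal.

```javascript
// Jaccard similarity = |A ∩ B| / |A ∪ B| for two arrays of IDs.
function jaccard(idsA, idsB) {
  const setA = new Set(idsA);
  const setB = new Set(idsB);
  let shared = 0;
  for (const id of setA) {
    if (setB.has(id)) shared++;
  }
  const union = setA.size + setB.size - shared;
  return union === 0 ? 0 : shared / union; // two empty lists: treat as no overlap
}

// Map a score onto the bands from the list above.
function overlapBand(score) {
  if (score >= 0.4) return 'high';
  if (score >= 0.2) return 'moderate';
  return 'minimal';
}

const audienceA = ['u1', 'u2', 'u3', 'u4', 'u5', 'u6'];
const audienceB = ['u4', 'u5', 'u6', 'u7', 'u8', 'u9'];
const score = jaccard(audienceA, audienceB); // 3 shared out of 9 in the union
console.log(score.toFixed(3), overlapBand(score)); // 0.333 moderate
```

Note that each creator shares half of their audience here, yet the Jaccard score is 0.333. The score is measured against the union, so it always reads lower than the per-creator overlap percentages.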
Step 3: Pairwise Analysis for Multiple Creators
When comparing 5+ creators, you need every pair analyzed.
async function analyzeCreatorGroup(platform, usernames) {
// Fetch all follower lists
console.log(`Fetching followers for ${usernames.length} creators...`);
const followerMap = {};
for (const username of usernames) {
console.log(` Fetching @${username}...`);
followerMap[username] = await getFollowers(platform, username, 5000);
}
// Calculate pairwise overlap
const results = [];
for (let i = 0; i < usernames.length; i++) {
for (let j = i + 1; j < usernames.length; j++) {
const a = usernames[i];
const b = usernames[j];
const overlap = calculateOverlap(followerMap[a], followerMap[b]);
results.push({
creatorA: a,
creatorB: b,
...overlap,
});
}
}
return results;
}
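The number of comparisons grows quadratically: n creators produce n(n-1)/2 pairs. Here is a small sketch, with hypothetical helper names, that counts and enumerates the pairs the nested loop visits.

```javascript
// Number of unordered pairs among n creators.
function pairCount(n) {
  return (n * (n - 1)) / 2;
}

// Enumerate each unordered pair exactly once (same i < j pattern as above).
function listPairs(usernames) {
  const pairs = [];
  for (let i = 0; i < usernames.length; i++) {
    for (let j = i + 1; j < usernames.length; j++) {
      pairs.push([usernames[i], usernames[j]]);
    }
  }
  return pairs;
}

console.log(pairCount(5));   // 10
console.log(pairCount(20));  // 190
console.log(pairCount(100)); // 4950
console.log(listPairs(['a', 'b', 'c'])); // [ [ 'a', 'b' ], [ 'a', 'c' ], [ 'b', 'c' ] ]
```

Fetching stays linear in the number of creators because each list is downloaded once. Only the in-memory comparisons scale with the pair count.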
Step 4: Estimate Unique Reach
The real question: if you pick N creators, how many unique people do you reach?
function estimateUniqueReach(followerMap, selectedCreators) {
const allFollowerIds = new Set();
for (const username of selectedCreators) {
for (const follower of followerMap[username]) {
allFollowerIds.add(follower.userId);
}
}
// Total if no overlap existed
const totalRaw = selectedCreators.reduce(
(sum, u) => sum + followerMap[u].length,
0
);
return {
rawTotal: totalRaw,
uniqueReach: allFollowerIds.size,
overlapCount: totalRaw - allFollowerIds.size,
efficiency: parseFloat(((allFollowerIds.size / totalRaw) * 100).toFixed(1)),
};
}
Efficiency is the key metric. 100% means zero overlap. 60% means 40% of your spend hits duplicate eyeballs.
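A worked example with made-up data shows how the number comes out. The creator names, IDs, and the `reachEfficiency` helper below are all hypothetical. The helper mirrors the calculation above in compact form.

```javascript
// Toy data: three creators, with some followers shared between them.
const toyFollowerMap = {
  creator_a: [{ userId: 1 }, { userId: 2 }, { userId: 3 }, { userId: 4 }],
  creator_b: [{ userId: 3 }, { userId: 4 }, { userId: 5 }, { userId: 6 }],
  creator_c: [{ userId: 6 }, { userId: 7 }],
};

function reachEfficiency(followerMap, creators) {
  const unique = new Set();
  let rawTotal = 0;
  for (const name of creators) {
    for (const follower of followerMap[name]) unique.add(follower.userId);
    rawTotal += followerMap[name].length;
  }
  // Percentage of the raw total that is unique reach (0 when nothing was fetched)
  const efficiency = rawTotal === 0 ? 0 : (unique.size * 100) / rawTotal;
  return { rawTotal, uniqueReach: unique.size, efficiency };
}

const toyReach = reachEfficiency(toyFollowerMap, ['creator_a', 'creator_b', 'creator_c']);
console.log(toyReach); // { rawTotal: 10, uniqueReach: 7, efficiency: 70 }
```

Ten follower slots were paid for, but only seven distinct people were reached, so 30% of the spend lands on duplicates.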
Step 5: Optimize Creator Selection
Given a pool of 20 creators and a budget for 5, which 5 maximize unique reach?
function optimizeSelection(followerMap, candidates, budget) {
// Greedy algorithm: always pick the creator that adds the most unique followers
const selected = [];
const reachedFollowers = new Set();
for (let i = 0; i < budget; i++) {
let bestCandidate = null;
let bestNewReach = 0;
for (const username of candidates) {
if (selected.includes(username)) continue;
// Count how many NEW followers this creator would add
let newReach = 0;
for (const follower of followerMap[username]) {
if (!reachedFollowers.has(follower.userId)) {
newReach++;
}
}
if (newReach > bestNewReach) {
bestNewReach = newReach;
bestCandidate = username;
}
}
if (bestCandidate) {
selected.push(bestCandidate);
for (const follower of followerMap[bestCandidate]) {
reachedFollowers.add(follower.userId);
}
console.log(`Pick ${i + 1}: @${bestCandidate} (+${bestNewReach.toLocaleString()} new followers)`);
}
}
return {
selected,
totalUniqueReach: reachedFollowers.size,
};
}
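This greedy approach is not guaranteed to find the best set, but it is the standard heuristic for this kind of maximum-coverage problem: it always reaches at least about 63% (1 - 1/e) of the best possible unique reach, and typically does much better. First pick = largest audience. Second pick = creator with the most followers not already covered. And so on. Note that `budget` here is the number of creators to pick, not a dollar amount.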
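For small pools, the greedy pick can be checked against an exhaustive search. The sketch below is an illustration under stated assumptions: `greedyReach` and `bestReachExhaustive` are hypothetical helpers that work on plain arrays of IDs, and the toy data is constructed so that greedy falls one follower short. Exhaustive search is only practical for small pools, since choosing 5 of 20 already means 15,504 combinations.

```javascript
// Greedy: repeatedly take the creator who adds the most not-yet-reached IDs.
function greedyReach(audiences, k) {
  const reached = new Set();
  const selected = [];
  for (let pick = 0; pick < k; pick++) {
    let best = null;
    let bestGain = 0;
    for (const [name, ids] of Object.entries(audiences)) {
      if (selected.includes(name)) continue;
      const gain = ids.filter((id) => !reached.has(id)).length;
      if (gain > bestGain) {
        best = name;
        bestGain = gain;
      }
    }
    if (best === null) break; // nobody adds anyone new
    selected.push(best);
    audiences[best].forEach((id) => reached.add(id));
  }
  return { selected, reach: reached.size };
}

// Exhaustive: try every k-sized combination and keep the best one.
function bestReachExhaustive(audiences, k) {
  const names = Object.keys(audiences);
  let best = { selected: [], reach: 0 };
  function recurse(start, chosen) {
    if (chosen.length === k) {
      const reach = new Set(chosen.flatMap((n) => audiences[n])).size;
      if (reach > best.reach) best = { selected: [...chosen], reach };
      return;
    }
    for (let i = start; i < names.length; i++) {
      chosen.push(names[i]);
      recurse(i + 1, chosen);
      chosen.pop();
    }
  }
  recurse(0, []);
  return best;
}

const toyAudiences = {
  left: [1, 2, 3, 4],
  right: [5, 6, 7, 8],
  middle: [1, 2, 3, 5, 6], // largest list, but overlaps both others
};
console.log(greedyReach(toyAudiences, 2));         // { selected: [ 'middle', 'right' ], reach: 7 }
console.log(bestReachExhaustive(toyAudiences, 2)); // { selected: [ 'left', 'right' ], reach: 8 }
```

Greedy grabs the largest audience first and then cannot recover, reaching 7 of a possible 8.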
Step 6: Generate the Report
async function runOverlapAnalysis(platform, usernames) {
const followerMap = {};
for (const u of usernames) {
followerMap[u] = await getFollowers(platform, u, 5000);
}
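// Pairwise overlap, computed from the lists fetched above
// (calling analyzeCreatorGroup here would fetch every follower list a second time)
const pairs = [];
for (let i = 0; i < usernames.length; i++) {
for (let j = i + 1; j < usernames.length; j++) {
const a = usernames[i];
const b = usernames[j];
pairs.push({ creatorA: a, creatorB: b, ...calculateOverlap(followerMap[a], followerMap[b]) });
}
}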
console.log('\n=== AUDIENCE OVERLAP REPORT ===\n');
// Overlap matrix
console.log('Pairwise Overlap:');
for (const pair of pairs.sort((a, b) => b.jaccardSimilarity - a.jaccardSimilarity)) {
// Flag pairs in the "high overlap" band from Step 2 (Jaccard 0.4+)
const flag = pair.jaccardSimilarity >= 0.4 ? ' ⚠️ HIGH' : '';
console.log(
` @${pair.creatorA} × @${pair.creatorB}: ` +
`${pair.sharedFollowers.toLocaleString()} shared ` +
`(${pair.overlapPercentA}% / ${pair.overlapPercentB}%) ` +
`Jaccard: ${pair.jaccardSimilarity}${flag}`
);
}
// Unique reach estimate
const reach = estimateUniqueReach(followerMap, usernames);
console.log(`\nCombined Reach:`);
console.log(` Raw total: ${reach.rawTotal.toLocaleString()}`);
console.log(` Unique reach: ${reach.uniqueReach.toLocaleString()}`);
console.log(` Efficiency: ${reach.efficiency}%`);
// Optimal selection
console.log(`\nOptimal 3 from ${usernames.length}:`);
const optimal = optimizeSelection(followerMap, usernames, 3);
console.log(` Unique reach: ${optimal.totalUniqueReach.toLocaleString()}`);
return { pairs, reach, optimal };
}
// Run it
runOverlapAnalysis('instagram', [
'fitness_creator_1',
'fitness_creator_2',
'fitness_creator_3',
'fitness_creator_4',
'fitness_creator_5',
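]).catch((err) => {
// Surface API or network failures instead of leaving an unhandled rejection
console.error('Overlap analysis failed:', err.message);
process.exitCode = 1;
});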
Sample Output
=== AUDIENCE OVERLAP REPORT ===
Pairwise Overlap:
@fitness_creator_1 × @fitness_creator_2: 1,847 shared (36.9% / 24.6%) Jaccard: 0.213
@fitness_creator_3 × @fitness_creator_5: 1,203 shared (24.1% / 16.0%) Jaccard: 0.152
@fitness_creator_1 × @fitness_creator_4: 412 shared (8.2% / 5.5%) Jaccard: 0.043
...
Combined Reach:
Raw total: 25,000
Unique reach: 18,200
Efficiency: 72.8%
Optimal 3 from 5:
Pick 1: @fitness_creator_2 (+5,000 new followers)
Pick 2: @fitness_creator_5 (+4,300 new followers)
Pick 3: @fitness_creator_4 (+3,800 new followers)
Unique reach: 13,100
Creator 1 and Creator 2 have 37% overlap — picking both wastes budget. The optimizer skips Creator 1 entirely.
Read the Full Guide
This is a condensed version. The full guide includes:
- Cross-platform overlap (same person on Instagram + TikTok)
- Weighted overlap by engagement quality
- Visualization with D3.js Venn diagrams
- Scaling to 100+ creator comparisons
Read the complete guide on SociaVault →
Building influencer analytics tools? SociaVault provides social media data APIs for TikTok, Instagram, YouTube, and 10+ platforms. Fetch follower lists, profiles, posts, and comments through one unified API.
Discussion
Have you ever found massive audience overlap on a campaign? How much budget did it waste? Drop your numbers 👇