Fatih İlhan

Posted on Apr 15

How I Built an Instagram Intelligence Suite for IG Growth

#api #apify #automation #socialmedia

Instagram research usually breaks down in the same three places:

You find profiles, but you still do manual qualification.
You know a brand or influencer is active, but you cannot track story behavior cleanly.
You see high-intent comments under posts and reels, but nobody turns them into structured leads.

That is exactly why I split the workflow into three focused APIs instead of trying to force everything into one oversized scraper:

IGLead for profile qualification
IG_story_snapshot for story activity monitoring
IG_comment_lead for comment-to-lead extraction

All three are built around Apify actors, but the bigger idea is simple: treat Instagram intelligence like a pipeline, not a one-off scrape.

Why I split this into three APIs

A lot of Instagram tools try to do discovery, enrichment, monitoring, and lead scoring in one place. That sounds convenient until the inputs, auth requirements, and output formats start fighting each other.

I wanted each API to answer one clean question:

IGLead: Is this profile worth contacting?
IG_story_snapshot: Is this profile active on stories right now?
IG_comment_lead: Which commenters look like real demand?

That separation makes the stack easier to maintain, easier to schedule, and easier to plug into downstream automations.

1. IGLead: qualifying influencer and creator profiles before outreach

IGLead starts with a list of Instagram usernames or profile URLs and turns them into scored outreach candidates.

Instead of just scraping follower counts, it combines multiple signals:

follower count
recent post engagement
engagement rate
business email detection from public bio text
niche keyword matching
verification status
a final lead score and recommendation
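
Two of these signals, business email detection and niche keyword matching, can be sketched with nothing but the standard library. This is a minimal Python illustration, not the actor's code: the function names, the simplified email pattern, and the "fraction of keywords found" score are all assumptions made for the example.

```python
import re
from typing import Optional

# Simplified pattern for email-looking strings; full address validation is more involved.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def extract_email(bio: str) -> Optional[str]:
    """Return the first email-looking string in a public bio, lowercased, or None."""
    match = EMAIL_RE.search(bio)
    return match.group(0).lower() if match else None


def niche_match_score(bio: str, keywords: list[str]) -> float:
    """Fraction of niche keywords that appear in the bio (case-insensitive substring match)."""
    if not keywords:
        return 0.0
    text = bio.lower()
    hits = sum(1 for keyword in keywords if keyword.lower() in text)
    return hits / len(keywords)
```

A substring match is deliberately crude here. A real implementation would probably want word boundaries or stemming so that "health" does not match "healthcare" unless that is intended.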

One thing I especially like in this API is that the scoring is not flat. Engagement expectations change depending on account size. A micro creator should not be evaluated like a mega influencer, so the actor adjusts its thresholds by tier.
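
As a rough illustration of tier-adjusted scoring, the Python sketch below computes an engagement rate from recent posts and compares it against an expectation that depends on follower count. The tier boundaries, the expected rates, the percentage unit, and the function names are assumptions chosen for the example, not the values IGLead actually uses.

```python
from statistics import mean

# Hypothetical tiers: (minimum followers, tier name, expected engagement rate in %).
# Larger accounts are held to a lower expected rate than smaller ones.
TIERS = [
    (1_000_000, "mega", 1.0),
    (100_000, "macro", 1.5),
    (10_000, "micro", 3.0),
    (0, "nano", 5.0),
]


def engagement_rate(followers: int, posts: list[dict]) -> float:
    """Average (likes + comments) per post, as a percentage of followers."""
    if followers <= 0 or not posts:
        return 0.0
    interactions = mean(post["likes"] + post["comments"] for post in posts)
    return interactions / followers * 100


def tier_for(followers: int) -> tuple[str, float]:
    """Return the tier name and expected engagement rate for an account size."""
    for min_followers, name, expected in TIERS:
        if followers >= min_followers:
            return name, expected
    return TIERS[-1][1], TIERS[-1][2]  # negative counts fall back to the lowest tier


def meets_tier_expectation(followers: int, posts: list[dict]) -> bool:
    """True when the account's engagement rate reaches its tier's expectation."""
    _, expected = tier_for(followers)
    return engagement_rate(followers, posts) >= expected
```

With these made-up numbers, an account with 20,000 followers and a 2.5% engagement rate falls short of the micro tier's 3.0% expectation, while a mega account with the same rate would pass.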

It also uses multiple extraction paths for reliability:

Instagram web profile API
feed endpoints when timeline media is incomplete
HTML parsing fallbacks
meta tag parsing for follower and post counts

That matters because Instagram's endpoints and page markup change often, so a scraper that relies on a single method tends to break.
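
The fallback idea can be sketched as an ordered list of extractors where the first usable result wins. The Python below is an illustration under stated assumptions: the `source` dictionary, its `api_user` and `meta_description` keys, and the "1.2M Followers, 310 Following, 845 Posts" text shape are hypothetical stand-ins for whatever the real extraction paths see.

```python
import re
from decimal import Decimal
from typing import Callable, Optional

SUFFIXES = {"K": 1_000, "M": 1_000_000, "B": 1_000_000_000}

# Assumed text shape: "1.2M Followers, 310 Following, 845 Posts ..."
COUNT_RE = re.compile(r"([\d.,]+)([KMB]?)\s+(Followers|Posts)", re.IGNORECASE)


def parse_count(number: str, suffix: str) -> int:
    """Convert '1.2' + 'M' or '12,345' + '' into an integer count."""
    value = Decimal(number.replace(",", ""))
    return int(value * SUFFIXES.get(suffix.upper(), 1))


def from_api_payload(source: dict) -> Optional[dict]:
    """First choice: structured data, when the hypothetical payload is present."""
    user = source.get("api_user")
    if not user:
        return None
    return {"followers": user["followers"], "posts": user["posts"]}


def from_meta_description(source: dict) -> Optional[dict]:
    """Fallback: read the counts out of the description meta tag text."""
    text = source.get("meta_description", "")
    found = {
        label.lower(): parse_count(number, suffix)
        for number, suffix, label in COUNT_RE.findall(text)
    }
    if "followers" in found and "posts" in found:
        return found
    return None


EXTRACTORS: list[Callable[[dict], Optional[dict]]] = [
    from_api_payload,
    from_meta_description,
]


def extract_counts(source: dict) -> Optional[dict]:
    """Try each extractor in order and return the first usable result."""
    for extractor in EXTRACTORS:
        try:
            result = extractor(source)
        except (KeyError, ValueError, ArithmeticError):
            continue  # one broken path should not stop the remaining ones
        if result:
            return result
    return None
```

Keeping each path as its own small function makes it easy to reorder, disable, or add a path when one of them stops working.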

Example `IGLead` input

{
  "profiles": ["therock", "cristiano", "https://www.instagram.com/kyliejenner/"],
  "sessionId": "YOUR_SESSION_ID",
  "minFollowers": 100000,
  "minEngagementRate": 1.0,
  "requireBusinessEmail": false,
  "nicheKeywords": ["fitness", "health", "wellness"],
  "maxProfilesPerRun": 50,
  "proxyConfiguration": {
    "useApifyProxy": true,
    "apifyProxyGroups": ["RESIDENTIAL"]
  }
}
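
Because the `profiles` list mixes bare usernames and full profile URLs, some normalization step has to run before scraping. The Python sketch below shows one way that could look. The function names are hypothetical, and the assumption that `maxProfilesPerRun` simply caps the de-duplicated list is mine, not something the actor documents here.

```python
import re
from typing import Optional

# Instagram usernames: up to 30 characters of letters, digits, periods, and underscores.
USERNAME_RE = re.compile(r"^[A-Za-z0-9._]{1,30}$")
PROFILE_URL_RE = re.compile(
    r"^https?://(?:www\.)?instagram\.com/([A-Za-z0-9._]{1,30})/?(?:\?.*)?$"
)


def normalize_profile(entry: str) -> Optional[str]:
    """Turn 'name', '@name', or a profile URL into a lowercase username, else None."""
    entry = entry.strip()
    url_match = PROFILE_URL_RE.match(entry)
    if url_match:
        return url_match.group(1).lower()
    candidate = entry.lstrip("@")
    if USERNAME_RE.match(candidate):
        return candidate.lower()
    return None


def normalize_profiles(entries: list[str], limit: int = 50) -> list[str]:
    """Normalize entries, drop invalid ones, de-duplicate in order, and cap at `limit`."""
    seen: dict[str, None] = {}  # dict keeps insertion order, so it doubles as an ordered set
    for entry in entries:
        username = normalize_profile(entry)
        if username:
            seen.setdefault(username)
    return list(seen)[:limit]
```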

What comes back

For each profile, IGLead can return:

normalized profile data
recent post stats
average and median engagement metrics
extracted business email if it is publicly visible in the bio
niche match score
leadScore
recommendation such as contact, review, or skip

For outreach teams, that means you can stop treating every Instagram profile as equal. You can prioritize the ones that actually fit your campaign and have the engagement to justify the spend.

2. IG_story_snapshot: monitoring story activity without owning the account

Stories are one of the hardest parts of Instagram to operationalize because they are temporary, fast-moving, and usually checked manually.

IG_story_snapshot is built to answer a very specific operational question:

Does this public profile have an active story right now, and what does that story set look like?

It tracks:

whether a profile currently has an active story
how many story frames are live
image vs video composition
oldest and newest story timestamps
hours since the story sequence started
hours left until expiry
optional profile context such as follower count
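
The timing fields follow directly from frame timestamps, since a story frame stays live for 24 hours after it is posted. Here is a minimal Python sketch of that arithmetic. The frame shape (`taken_at`, `is_video`), the `video_frames` and `image_frames` names, and the choice to measure expiry from the newest frame are assumptions for illustration.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

STORY_LIFETIME = timedelta(hours=24)  # a story frame expires 24 hours after posting


def _hours(delta: timedelta) -> float:
    """Express a timedelta as hours, rounded to two decimals."""
    return round(delta.total_seconds() / 3600, 2)


def summarize_story(frames: list[dict], now: Optional[datetime] = None) -> dict:
    """Summarize frames shaped like {"taken_at": aware datetime, "is_video": bool}."""
    now = now or datetime.now(timezone.utc)
    if not frames:
        return {"active_story": False, "story_count": 0}
    taken = sorted(frame["taken_at"] for frame in frames)
    oldest, newest = taken[0], taken[-1]
    videos = sum(1 for frame in frames if frame["is_video"])
    return {
        "active_story": True,
        "story_count": len(frames),
        "video_frames": videos,
        "image_frames": len(frames) - videos,
        "oldest_frame_at": oldest.isoformat(),
        "newest_frame_at": newest.isoformat(),
        # Age is measured from the first frame of the current sequence.
        "story_age_hours": _hours(now - oldest),
        # The whole set is gone once the newest frame expires.
        "hours_left_until_expiry": max(0.0, _hours(newest + STORY_LIFETIME - now)),
    }
```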

This is useful for:

competitor monitoring
campaign verification
event coverage tracking
brand activity benchmarking

What I like most here is that it avoids trying to do too much. It does not pretend to give story view counts, and it does not download story content. It focuses on presence and metadata, which is the part most teams actually need for monitoring.

Example `IG_story_snapshot` input

{
  "profiles": ["nike", "adidas", "@puma"],
  "sessionId": "YOUR_SESSION_ID",
  "includeProfileContext": true,
  "maxProfilesPerRun": 50,
  "maxRequestsPerMinute": 15,
  "proxyConfiguration": {
    "useApifyProxy": true
  }
}
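
A setting like `maxRequestsPerMinute` boils down to spacing requests evenly. The Python sketch below shows that idea in isolation. It is not how the actor implements throttling; the class name and the injectable `clock` and `sleep` parameters are hypothetical, added so the behavior can be checked without real waiting.

```python
import time
from typing import Callable, Optional


class RateLimiter:
    """Space calls evenly so that at most `max_per_minute` happen per minute."""

    def __init__(
        self,
        max_per_minute: int,
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        self.interval = 60.0 / max_per_minute
        self._clock = clock
        self._sleep = sleep
        self._next_allowed: Optional[float] = None

    def wait(self) -> float:
        """Block until the next call is allowed and return the seconds slept."""
        now = self._clock()
        delay = 0.0
        if self._next_allowed is not None and now < self._next_allowed:
            delay = self._next_allowed - now
            self._sleep(delay)
            now = self._next_allowed
        # Reserve the next slot one interval after this call.
        self._next_allowed = now + self.interval
        return delay
```

With a limit of 15 per minute, calling `wait()` before each profile request leaves at least four seconds between requests.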

What comes back

The output is centered around a few operational fields:

active_story
story_count
story_frames
story_metadata
story_age_hours
hours_left_until_expiry

That makes the API especially useful for scheduled runs. If you execute it every hour, you can build a clean timeline of who is posting stories, how often they post, and whether they lean more toward video or image content.
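
To make the timeline idea concrete, here is a small Python sketch that rolls hourly snapshots up per profile. The snapshot shape (`username`, `active_story`, `video_frames`, `image_frames`) and the two output metrics are assumptions for the example. Note that a frame that stays live across several hourly runs is counted once per run, so the video share is weighted by how long frames were visible.

```python
from collections import defaultdict


def activity_timeline(snapshots: list[dict]) -> dict[str, dict]:
    """Aggregate scheduled snapshots into per-profile activity metrics."""
    totals = defaultdict(lambda: {"runs": 0, "active_runs": 0, "video": 0, "image": 0})
    for snap in snapshots:
        entry = totals[snap["username"]]
        entry["runs"] += 1
        if snap["active_story"]:
            entry["active_runs"] += 1
            entry["video"] += snap.get("video_frames", 0)
            entry["image"] += snap.get("image_frames", 0)

    report = {}
    for username, entry in totals.items():
        frames = entry["video"] + entry["image"]
        report[username] = {
            # Share of runs in which the profile had a live story.
            "active_share": entry["active_runs"] / entry["runs"],
            # Share of observed frames that were video; None if no frames were seen.
            "video_share": entry["video"] / frames if frames else None,
        }
    return report
```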

3. IG_comment_lead: turning Instagram comments into lead intelligence

IG_comment_lead is the most directly sales-oriented API in the stack.

The idea is straightforward: people reveal intent in comments all the time. They ask about price, shipping, details, availability, or how to order. Most teams read those comments manually, if they read them at all.

This API takes Instagram post or reel URLs, fetches comments, and scores commenters based on lead relevance.

The pipeline includes:

input validation for post and reel URLs
authenticated scraping with sessionId
fallback comment extraction strategies
keyword-based intent scoring
lightweight sentiment scoring
spam checks
deduplication by username
early stopping once the target lead count is reached
a final analytics summary for the whole run
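
The middle of that pipeline (intent scoring, spam checks, deduplication, early stopping) can be sketched in a few lines of Python. Everything specific here is an assumption: the keyword weights, the spam pattern, the comment shape, and the `lead_score` field name are illustrative and do not reflect the actor's real rules.

```python
import re
from typing import Iterable

# Illustrative keyword weights; the actor's real keyword list is not shown in this post.
INTENT_KEYWORDS = {
    "price": 0.4,
    "how much": 0.4,
    "order": 0.3,
    "shipping": 0.3,
    "available": 0.2,
}
SPAM_RE = re.compile(r"(https?://|follow me|check my profile)", re.IGNORECASE)


def intent_score(text: str) -> float:
    """Sum the weights of matched keywords, capped at 1.0."""
    lowered = text.lower()
    score = sum(weight for keyword, weight in INTENT_KEYWORDS.items() if keyword in lowered)
    return round(min(1.0, score), 2)


def find_leads(
    comments: Iterable[dict], min_lead_score: float = 0.6, target_leads: int = 30
) -> list[dict]:
    """Score comments shaped like {"username": str, "text": str} and return leads."""
    best: dict[str, dict] = {}
    for comment in comments:
        if SPAM_RE.search(comment["text"]):
            continue  # spam check
        score = intent_score(comment["text"])
        if score < min_lead_score:
            continue
        user = comment["username"]
        # Deduplicate by username, keeping each user's strongest comment.
        if user not in best or score > best[user]["lead_score"]:
            best[user] = {"username": user, "lead_score": score, "text": comment["text"]}
        if len(best) >= target_leads:
            break  # early stop: enough leads found, skip the remaining comments
    return sorted(best.values(), key=lambda lead: lead["lead_score"], reverse=True)
```

Because `comments` can be a generator, stopping early also means any comments not yet fetched never need to be requested.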

I also like that this API is optimized for cost control. You can cap comments per post, define a minimum lead score, and stop the run as soon as enough leads are found.

Example `IG_comment_lead` input

{
  "postUrls": [
    "https://www.instagram.com/p/C3xYz1234Ab/",
    "https://www.instagram.com/reel/C3xYz5678Cd/"
  ],
  "sessionId": "YOUR_SESSION_ID",
  "cookie": "sessionid=...; csrftoken=...; ds_user_id=...;",
  "maxCommentsPerPost": 500,
  "targetLeads": 30,
  "minLeadScore": 0.6,
  "debugComments": false
}
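
Validating `postUrls` mostly means checking the path and pulling out the shortcode. Here is a minimal Python sketch, assuming only the `/p/` and `/reel/` URL shapes shown in the example input. The function name and the returned dictionary are hypothetical, and other URL variants are deliberately rejected.

```python
import re
from typing import Optional

POST_URL_RE = re.compile(
    r"^https?://(?:www\.)?instagram\.com/(p|reel)/([A-Za-z0-9_-]+)/?(?:\?.*)?$"
)


def parse_post_url(url: str) -> Optional[dict]:
    """Return the content type and shortcode for a post or reel URL, else None."""
    match = POST_URL_RE.match(url.strip())
    if not match:
        return None
    kind, shortcode = match.groups()
    return {"type": "post" if kind == "p" else "reel", "shortcode": shortcode}
```

Rejecting malformed URLs before any request is made keeps a bad input from consuming proxy traffic or session budget.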

What makes this one interesting

The comment fetch flow does not rely on a single endpoint. It tries multiple strategies:

GraphQL queries
shortcode-based REST endpoints
mobile-style REST endpoints

That gives it a better chance of surviving endpoint instability.

On top of extraction, it enriches leads with:

buyer_intent_score
engagement_score
likely customer flag
extracted keywords
inferred niche
inferred geography

At the end of a run, it also pushes an analytics summary with:

total comments processed
total leads found
lead rate
top commenters
intent distribution
sentiment distribution
top keywords
per-post breakdown
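
A summary record like this is cheap to build once the leads are in memory. The Python sketch below covers a few of the fields. The lead shape (`post_url`, `keywords`) and the output key names are assumptions for the example, not the actor's actual schema.

```python
from collections import Counter


def summarize_run(comments_processed: int, leads: list[dict]) -> dict:
    """Build a run-level summary from leads shaped like {"post_url": str, "keywords": [str]}."""
    per_post = Counter(lead["post_url"] for lead in leads)
    keywords = Counter(kw for lead in leads for kw in lead.get("keywords", []))
    lead_rate = round(len(leads) / comments_processed, 4) if comments_processed else 0.0
    return {
        "total_comments_processed": comments_processed,
        "total_leads_found": len(leads),
        "lead_rate": lead_rate,  # leads per processed comment
        "top_keywords": keywords.most_common(5),
        "per_post_breakdown": dict(per_post),
    }
```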

That means the output is useful for both direct lead capture and campaign analysis.

How the three APIs work together

The fun part is not each API in isolation. It is the workflow they create together.

A practical sequence looks like this:

Use IGLead to qualify creators, influencers, or niche accounts before outreach.
Use IG_story_snapshot to monitor who is actively posting stories right now.
Use IG_comment_lead on posts and reels in your niche to surface warm demand from commenters.

That gives you three different layers of Instagram intelligence:

profile quality
current activity
audience intent

In other words, you can answer:

Who should I contact?
Who is active right now?
Who is already asking buying questions?

Implementation notes

All three projects are built around the same practical philosophy:

use Apify actors for deployment and scheduling
use Crawlee for request orchestration
use Playwright when browser context is needed
keep concurrency controlled to reduce blocks
use fallback strategies instead of trusting one endpoint
support session cookies when Instagram requires authentication
preserve debug artifacts when extraction fails

For IGLead, that means debug HTML and screenshots when profile parsing breaks.

For IG_story_snapshot, that means API-first story detection with a visual fallback for story presence.

For IG_comment_lead, that means endpoint fallback plus a summary record at the end so a run is not just raw data, but something closer to decision-ready output.

What I would improve next

If I keep iterating on this stack, these are the next areas I would push:

cross-actor orchestration so leads can move automatically from one API to the next
historical storage for story activity trends over weeks instead of single snapshots
richer commenter enrichment for repeat engagement across multiple posts
better dashboarding on top of the analytics summary

The core scraping part is useful, but the real leverage comes from building a repeatable operating system around it.

Final thoughts

The biggest lesson from building IGLead, IG_story_snapshot, and IG_comment_lead is that Instagram automation becomes much more useful when you stop thinking in terms of "scrape a page" and start thinking in terms of "answer a business question."

Each API here is narrow on purpose:

IGLead answers qualification
IG_story_snapshot answers activity
IG_comment_lead answers intent

Put together, they form a lightweight Instagram intelligence stack for outreach, competitor research, and lead generation.

If you are building in this space, I would strongly recommend resisting the urge to turn everything into one monolith. Small, composable APIs are easier to trust, easier to debug, and much easier to turn into real workflows.

You can find all of my APIs here: https://apify.com/store/categories?search=seralifatih
