DEV Community

Vhub Systems

How to Scrape Instagram Without Getting Banned in 2026 (3 Working Methods)

Instagram bans scrapers faster than almost any other platform. Their bot detection has gotten dramatically better since 2024 — traditional Selenium approaches fail within minutes.

Here are the 3 methods that actually work, tested in production as of April 2026.

Why Instagram Is Hard to Scrape

Instagram puts several obstacles in the way:

  1. Rate limiting: too many requests from one IP gets that IP blocked
  2. Fingerprinting: browser/device consistency checks
  3. Account-based restrictions: logged-in sessions have their own rate limits
  4. Basic Display API shutdown: the old Basic Display API was retired in December 2024, and the Graph API that remains only covers accounts you own or manage

You can't just use requests.get() on Instagram URLs and parse the HTML. The data is loaded dynamically, and anonymous browsing is heavily restricted.


Method 1: Apify Instagram Scraper (Recommended)

Success rate: ~100% for public profiles

Cost: ~$0.004 per profile

Setup time: 10 minutes

This is the most reliable production-ready option. Apify's Instagram Profile Scraper (apify.com/store) uses residential proxies and rotating sessions to avoid detection.

What it extracts:

  • Follower/following counts
  • Bio, website URL
  • Post count
  • Recent posts (captions, likes, comments, date)
  • Profile metadata

Input:

{
  "usernames": ["nationalgeographic", "nasa", "nike"],
  "resultsLimit": 50
}

Output:

{
  "username": "nationalgeographic",
  "followersCount": 280000000,
  "followsCount": 176,
  "postsCount": 35847,
  "biography": "...",
  "externalUrl": "https://www.nationalgeographic.com",
  "isVerified": true
}
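
A minimal sketch of running the actor from Python with the `apify-client` package. The actor ID `apify/instagram-profile-scraper`, the `APIFY_TOKEN` environment variable, and the helper names are assumptions for illustration; the input fields are copied from the example above, so check the actor's own input schema before relying on them.

```python
import os


def build_run_input(usernames, results_limit=50):
    """Build the actor input; field names are copied from the article's example."""
    return {"usernames": list(usernames), "resultsLimit": results_limit}


def fetch_profiles(usernames, token, actor_id="apify/instagram-profile-scraper"):
    """Run the actor and return its dataset items as a list of dicts.

    The default actor_id is an assumption; replace it with the actor you use.
    """
    # Imported lazily so build_run_input works without the package installed.
    from apify_client import ApifyClient

    client = ApifyClient(token)
    # call() starts the run and waits for it to finish.
    run = client.actor(actor_id).call(run_input=build_run_input(usernames))
    return list(client.dataset(run["defaultDatasetId"]).iterate_items())


if __name__ == "__main__":
    token = os.environ.get("APIFY_TOKEN")  # hypothetical variable name
    if token:
        for profile in fetch_profiles(["nasa"], token):
            print(profile.get("username"), profile.get("followersCount"))
```

Keeping the input builder separate from the network call makes it easy to reuse the same input for a scheduled run.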

Limitations:

  • Public profiles only (no private account data)
  • Post details limited to public content
  • Not suitable for real-time monitoring at high frequency

Method 2: Playwright with Residential Proxies

Success rate: 60-80% depending on proxy quality

Cost: $8-15/GB residential proxy + compute

Setup time: 2-4 hours

If you need more control or want to run on your own infrastructure:

import asyncio

from playwright.async_api import async_playwright


async def scrape_instagram_profile(username: str, proxy: dict) -> str:
    """Load a public profile page through a proxy and return the rendered HTML."""
    async with async_playwright() as p:
        browser = await p.chromium.launch(
            headless=True,
            proxy=proxy,  # {"server": "...", "username": "...", "password": "..."}
        )
        try:
            # Emulate a phone: mobile user agent, viewport, and touch support
            context = await browser.new_context(
                user_agent="Mozilla/5.0 (iPhone; CPU iPhone OS 17_0 like Mac OS X) AppleWebKit/605.1.15",
                viewport={"width": 390, "height": 844},
                device_scale_factor=3,
                is_mobile=True,
                has_touch=True,
            )
            page = await context.new_page()

            # Same URL as desktop; the mobile context above is what selects
            # the mobile site, which has less aggressive bot detection
            await page.goto(
                f"https://www.instagram.com/{username}/",
                wait_until="networkidle",
                timeout=30000,
            )

            # Extract from page source
            content = await page.content()
            # Parse the __data variable that Instagram embeds
            # ...
            return content
        finally:
            # Always release the browser, even if navigation times out
            await browser.close()


# Example: html = asyncio.run(scrape_instagram_profile("nasa", proxy))

Critical details that make this work:

  • Use mobile user-agents — Instagram's mobile detection is weaker
  • Use residential proxies, never datacenter
  • Add realistic timing delays (2-5 seconds between requests)
  • Rotate proxies at least every 10-15 requests
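
The pacing and rotation rules in this list can be sketched as two small helpers. `ProxyRotator` and `human_delay` are hypothetical names, the proxy dicts are placeholders, and the thresholds simply mirror the numbers above; tune them against your own proxy pool.

```python
import itertools
import random


class ProxyRotator:
    """Cycle through proxies, switching after a fixed number of requests."""

    def __init__(self, proxies, requests_per_proxy=10):
        if not proxies:
            raise ValueError("at least one proxy is required")
        self._cycle = itertools.cycle(proxies)
        self._limit = requests_per_proxy
        self._used = 0
        self._current = next(self._cycle)

    def next_proxy(self):
        """Return the proxy for the next request, rotating once the limit is hit."""
        if self._used >= self._limit:
            self._current = next(self._cycle)
            self._used = 0
        self._used += 1
        return self._current


def human_delay(low=2.0, high=5.0):
    """Pick a random pause in seconds within the 2-5 second range."""
    return random.uniform(low, high)
```

In an async scraper you would call `await asyncio.sleep(human_delay())` between page loads and pass `rotator.next_proxy()` as the `proxy` argument when launching the browser.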

Method 3: Official Instagram Graph API (Limited)

Success rate: 100% but restricted data

Cost: Free (within rate limits)

What you can access: Only your OWN account data or Business accounts that grant you permission

The Graph API still works but it's not for scraping competitor data:

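import requests

# Only works for your own connected accounts
ACCESS_TOKEN = "your_access_token"
USER_ID = "your_user_id"

response = requests.get(
    f"https://graph.instagram.com/{USER_ID}/media",
    params={
        "fields": "id,media_type,timestamp,like_count,comments_count",
        "access_token": ACCESS_TOKEN,
    },
    timeout=30,
)
response.raise_for_status()  # fail loudly on an expired token or bad request
media = response.json()["data"]  # list of media objects with the requested fields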
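
Once the response is parsed, own-account metrics come down to a few sums. A minimal sketch, assuming the payload carries a `data` list with the requested fields and that a count field may be missing on some items; `summarize_media` and the sample values are made up for illustration.

```python
def summarize_media(payload):
    """Total likes and comments across the media items in one response page."""
    items = payload.get("data", [])
    likes = sum(item.get("like_count", 0) for item in items)
    comments = sum(item.get("comments_count", 0) for item in items)
    return {"posts": len(items), "likes": likes, "comments": comments}


# Made-up sample shaped like a /media response, for illustration only.
sample = {
    "data": [
        {"id": "1", "media_type": "IMAGE", "like_count": 120, "comments_count": 8},
        {"id": "2", "media_type": "VIDEO", "comments_count": 3},  # like_count absent
    ]
}
print(summarize_media(sample))
```

This only covers a single page of results; longer histories would need the paging cursor from the same response.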

Use this only for: Analyzing your own Instagram account metrics, managing content for accounts you own.


What NOT to Do

Avoid these common mistakes:

Scrapy with residential proxies — Instagram detects Scrapy's request patterns almost immediately.

Logged-in automation without proper fingerprinting — Logging into an Instagram account via automation gets that account banned within hours.

High-frequency requests — Even with perfect proxies, requesting more than 5-10 profiles/minute from one IP causes detection.

Direct mobile API calls — Instagram's internal API (/api/v1/users/{pk}/info/) requires valid session tokens that expire quickly.
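
To stay under a per-IP ceiling like the 5-10 profiles/minute mentioned under high-frequency requests, a sliding-window limiter is one option. This is a sketch, not a guarantee against detection; `PerMinuteLimiter` is a hypothetical helper and the default of 5 is the low end of that range.

```python
import time
from collections import deque


class PerMinuteLimiter:
    """Allow at most max_per_minute requests in any 60-second window."""

    def __init__(self, max_per_minute=5, clock=time.monotonic):
        self._max = max_per_minute
        self._clock = clock  # injectable so the limiter can be tested without sleeping
        self._stamps = deque()

    def wait_time(self):
        """Seconds to wait before the next request is allowed (0.0 if allowed now)."""
        now = self._clock()
        # Drop timestamps that have aged out of the 60-second window.
        while self._stamps and now - self._stamps[0] >= 60.0:
            self._stamps.popleft()
        if len(self._stamps) < self._max:
            return 0.0
        return 60.0 - (now - self._stamps[0])

    def record(self):
        """Register that a request was just sent."""
        self._stamps.append(self._clock())
```

Use one limiter per proxy IP: sleep for `wait_time()` seconds, send the request, then call `record()`.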


Real-World Use Cases

Influencer research: Validate follower counts and engagement rates before a partnership. At $0.004/profile, you can check 1,000 influencer profiles for $4 vs paying an influencer marketing platform $200/month.

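Competitor monitoring: Track follower growth rate for 20-30 competitors weekly. At $0.004/profile that is roughly $0.35-$0.52 a month (about 4.3 runs); the ~$3 monthly figure corresponds to checking about 25 profiles daily.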
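
The cost figures here are plain multiplication, which makes them easy to recompute for your own volumes. A small sketch, assuming a flat $0.004 per profile with no platform minimums or proxy surcharges; `monthly_cost` is a hypothetical helper.

```python
PRICE_PER_PROFILE = 0.004  # USD, the per-profile figure quoted in this article


def monthly_cost(profiles, runs_per_month, price=PRICE_PER_PROFILE):
    """Estimated monthly spend: profiles per run x runs per month x unit price."""
    return profiles * runs_per_month * price


# One-off check of 1,000 influencer profiles
print(f"${monthly_cost(1000, 1):.2f}")
# 25 competitors checked once a day for 30 days
print(f"${monthly_cost(25, 30):.2f}")
```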

Brand monitoring: Check if your brand is being mentioned in public posts/stories on profiles you care about.

Market research: Analyze what content performs in your niche by scraping public profiles and ranking posts by engagement.
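
Ranking posts by engagement can be sketched as follows. The field names `likesCount` and `commentsCount` are assumptions about the scraped post records, not confirmed output fields, and engagement rate is taken here as (likes + comments) / followers, which is one common definition among several.

```python
def rank_by_engagement(posts, followers):
    """Return posts sorted by (likes + comments) / followers, highest first."""
    if followers <= 0:
        raise ValueError("followers must be positive")
    scored = []
    for post in posts:
        # Field names are assumed; missing counts are treated as zero.
        interactions = post.get("likesCount", 0) + post.get("commentsCount", 0)
        scored.append({**post, "engagementRate": interactions / followers})
    return sorted(scored, key=lambda p: p["engagementRate"], reverse=True)
```

Dividing by follower count lets you compare posts across accounts of different sizes, which matters when vetting influencers rather than ranking a single profile.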


Summary: Which Method to Use

| Use Case | Method | Why |
| --- | --- | --- |
| One-off research | Apify actor | Easiest, no infrastructure |
| Daily monitoring | Apify scheduled run | Set-and-forget, reliable |
| High volume (10k+/day) | Playwright + proxies | More control, lower per-unit cost |
| Own account data | Official API | Free, within ToS |

For most teams: start with Apify. At $0.004/profile, the cost is negligible vs developer time spent maintaining a custom scraper.


The Apify Scrapers Bundle includes the Instagram Profile Scraper along with 34 other production actors — Google SERP, Amazon, LinkedIn, TikTok Shop, contact info. Pre-configured inputs so you're running in 10 minutes.

$29 one-time purchase →


Ready-to-Use Instagram Scraper

For bulk Instagram scraping without managing sessions:
