Instagram's official Graph API requires a business account and Facebook review to get meaningful access. For research, competitive analysis, or lead generation, here's what works without the API.
What's publicly accessible
Instagram public profiles expose:
- Username, full name, bio, website
- Post count, followers, following
- Profile picture URL
- Recent posts (thumbnails, captions, like/comment counts)
- Story highlights (titles only)
What requires auth: stories, DMs, private accounts, detailed post insights.
Method 1: Instagram's public data endpoints
Instagram has JSON endpoints that return profile data without authentication:
import requests

def get_instagram_profile(username: str) -> dict:
    """Fetch public profile fields; returns {} on any failure."""
    url = f"https://www.instagram.com/{username}/?__a=1&__d=dis"
    headers = {
        "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) Chrome/122.0.0.0 Safari/537.36",
        "Accept": "application/json",
        "X-IG-App-ID": "936619743392459",
    }
    session = requests.Session()
    # Load the home page first so the session picks up cookies
    session.get(
        "https://www.instagram.com/",
        headers={"User-Agent": headers["User-Agent"]},
        timeout=10,
    )
    response = session.get(url, headers=headers, timeout=10)
    if response.status_code == 200:
        try:
            data = response.json()
            user = data.get("graphql", {}).get("user", {})
            return {
                "username": user.get("username"),
                "full_name": user.get("full_name"),
                "bio": user.get("biography"),
                "followers": user.get("edge_followed_by", {}).get("count"),
                "following": user.get("edge_follow", {}).get("count"),
                "posts": user.get("edge_owner_to_timeline_media", {}).get("count"),
                "is_verified": user.get("is_verified"),
                "website": user.get("external_url"),
            }
        except (ValueError, AttributeError):
            # Body was not JSON (often an HTML login page) or had an unexpected shape
            pass
    return {}
Note: Instagram changes this endpoint often, and it can break without notice. If you get a 401 or 403, rotate user agents or add delays between requests.
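The advice in the note above can be sketched as a small retry helper. This is a minimal sketch under stated assumptions, not code from a real library: `fetch_with_retry`, `USER_AGENTS`, and the `fetch` callable (assumed to take a user agent string and return a `(status_code, data)` tuple) are hypothetical names, and the backoff values are illustrative.

```python
import random
import time
from typing import Callable, Sequence, Tuple

# Illustrative user agents taken from the examples in this article.
USER_AGENTS = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) Chrome/122.0.0.0 Safari/537.36",
    "Mozilla/5.0 (iPhone; CPU iPhone OS 17_0 like Mac OS X) AppleWebKit/605.1.15",
]

def fetch_with_retry(
    fetch: Callable[[str], Tuple[int, dict]],
    user_agents: Sequence[str] = USER_AGENTS,
    max_attempts: int = 3,
    base_delay: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Retry on 401/403 with a different user agent and a growing delay."""
    for attempt in range(max_attempts):
        # Cycle through the user agent list, one per attempt
        user_agent = user_agents[attempt % len(user_agents)]
        status, data = fetch(user_agent)
        if status == 200:
            return data
        if status not in (401, 403):
            # Any other status (404, 500, ...) is not worth retrying here
            return {}
        if attempt < max_attempts - 1:
            # Exponential backoff plus up to one second of jitter
            sleep(base_delay * (2 ** attempt) + random.uniform(0, 1))
    return {}
```

To use it, you would wrap your own request function so that it accepts a user agent and returns the status code alongside the parsed data. The `sleep` parameter is injectable so the helper can be tested without real waiting.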
Method 2: Playwright with session warm-up
For more reliable extraction, use a headless browser with a proper session:
import asyncio
from playwright.async_api import async_playwright

async def scrape_instagram_profile(username: str) -> dict:
    async with async_playwright() as p:
        browser = await p.chromium.launch(
            headless=True,
            args=["--disable-blink-features=AutomationControlled", "--no-sandbox"],
        )
        # Mobile UA gets simpler page structure
        context = await browser.new_context(
            user_agent="Mozilla/5.0 (iPhone; CPU iPhone OS 17_0 like Mac OS X) AppleWebKit/605.1.15",
            viewport={"width": 390, "height": 844},
        )
        page = await context.new_page()

        # Intercept API calls to capture the JSON response
        profile_data = {}

        async def handle_route(route, request):
            if "api/v1/users/web_profile_info" not in request.url:
                await route.continue_()
                return
            # Fetch once, read the body, then pass the same response to the page
            # (calling continue_() here would send the request a second time)
            response = await route.fetch()
            try:
                data = await response.json()
                user = data.get("data", {}).get("user", {})
                profile_data.update({
                    "username": user.get("username"),
                    "followers": user.get("edge_followed_by", {}).get("count"),
                    "following": user.get("edge_follow", {}).get("count"),
                    "bio": user.get("biography"),
                })
            except Exception:
                pass  # Leave profile_data empty if the payload is not the expected JSON
            await route.fulfill(response=response)

        await page.route("**/*", handle_route)
        await page.goto(f"https://www.instagram.com/{username}/")
        # Give the profile API call time to fire
        await page.wait_for_timeout(3000)
        await browser.close()
        return profile_data

profile = asyncio.run(scrape_instagram_profile("natgeo"))
print(profile)
Method 3: Public search without auth
Instagram's web search endpoint (the one behind the search box) returns some profile data without login:
import requests

def search_instagram_users(query: str) -> list:
    url = "https://www.instagram.com/web/search/topsearch/"
    params = {
        "context": "blended",
        "query": query,
        "rank_token": "0.0",
    }
    headers = {"User-Agent": "Mozilla/5.0", "X-Requested-With": "XMLHttpRequest"}
    session = requests.Session()
    # Get initial cookies
    session.get("https://www.instagram.com/", headers=headers, timeout=10)
    response = session.get(url, params=params, headers=headers, timeout=10)
    if response.status_code == 200:
        try:
            data = response.json()
        except ValueError:
            # Body was not JSON (for example, a login page)
            return []
        return [
            {
                "username": u.get("user", {}).get("username"),
                "full_name": u.get("user", {}).get("full_name"),
                "followers": u.get("user", {}).get("follower_count"),
            }
            for u in data.get("users", [])
        ]
    return []

# Find influencers in a niche
users = search_instagram_users("web scraping developer")
print(users[:5])
Method 4: Pre-built actor (recommended for scale)
The Instagram Profile Scraper on Apify handles auth rotation, proxy management, and rate limiting. Input usernames or profile URLs.
Sample output:
{
    "username": "natgeo",
    "fullName": "National Geographic",
    "bio": "Experience the world through the eyes of National Geographic photographers.",
    "followersCount": 279000000,
    "followingCount": 143,
    "postsCount": 30500,
    "isVerified": true,
    "website": "https://www.nationalgeographic.com",
    "profilePicUrl": "https://..."
}
The actor has 107+ production runs and uses pay-per-result pricing.
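The sample output uses camelCase keys, while the Method 1 function returns snake_case keys. If you mix both sources, a small mapping keeps records in one shape. This is a sketch only: `KEY_MAP` and `normalize_record` are hypothetical names, and the mapping assumes the actor's fields match the sample above.

```python
import json

# Sample-output key -> key used by get_instagram_profile in Method 1
KEY_MAP = {
    "username": "username",
    "fullName": "full_name",
    "bio": "bio",
    "followersCount": "followers",
    "followingCount": "following",
    "postsCount": "posts",
    "isVerified": "is_verified",
    "website": "website",
}

def normalize_record(record: dict) -> dict:
    """Rename known keys; missing keys become None, unknown keys are dropped."""
    return {new_key: record.get(old_key) for old_key, new_key in KEY_MAP.items()}

# The sample record from this section, parsed from JSON
sample = json.loads("""
{
    "username": "natgeo",
    "fullName": "National Geographic",
    "bio": "Experience the world through the eyes of National Geographic photographers.",
    "followersCount": 279000000,
    "followingCount": 143,
    "postsCount": 30500,
    "isVerified": true,
    "website": "https://www.nationalgeographic.com",
    "profilePicUrl": "https://..."
}
""")

normalized = normalize_record(sample)
```

`profilePicUrl` is dropped here because Method 1 does not return it; add it to `KEY_MAP` if you need it.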
Anti-detection tips
Instagram's rate limits are aggressive:
- Max ~50 profile requests per session before getting soft-blocked
- Use residential proxies for bulk extraction
- Add 2-5 second delays between requests
- Rotate sessions (new browser context) every 20-30 requests
- Mobile user agents tend to work better than desktop
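The delay and rotation tips above can be combined in one loop. The following is a minimal sketch, assuming you supply your own `new_session` factory (for example, one that opens a new browser context) and a `fetch(session, username)` function; `throttled_fetch` and its parameters are hypothetical names, with defaults taken from the numbers in the list.

```python
import random
import time
from typing import Callable, Iterable, Iterator, Tuple, TypeVar

S = TypeVar("S")  # session type
R = TypeVar("R")  # result type

def throttled_fetch(
    usernames: Iterable[str],
    new_session: Callable[[], S],
    fetch: Callable[[S, str], R],
    rotate_every: int = 25,
    min_delay: float = 2.0,
    max_delay: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Iterator[Tuple[str, R]]:
    """Yield (username, result) pairs with random delays and session rotation."""
    session = new_session()
    used = 0  # requests made with the current session
    for index, username in enumerate(usernames):
        if used >= rotate_every:
            # Start a fresh session well before the ~50-request soft block
            session = new_session()
            used = 0
        if index > 0:
            # Random 2-5 second pause between consecutive requests
            sleep(random.uniform(min_delay, max_delay))
        yield username, fetch(session, username)
        used += 1
```

Because it is a generator, you can stop iterating as soon as a result looks like a block and resume later with a new proxy. Proxy selection is left to the `new_session` factory.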
Use cases
- Influencer discovery: find accounts in a niche by follower range + bio keywords
- Brand monitoring: track competitor account growth
- Lead generation: identify Instagram-active potential customers
- Market research: analyze posting frequency + engagement patterns
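As an illustration of the influencer discovery case, here is a minimal sketch that filters scraped profiles by follower range and bio keywords. It assumes each profile is a dict with the `followers` and `bio` keys returned by the functions above; `filter_profiles` is a hypothetical helper name.

```python
from typing import Iterable, List

def filter_profiles(
    profiles: Iterable[dict],
    min_followers: int,
    max_followers: int,
    keywords: Iterable[str],
) -> List[dict]:
    """Keep profiles inside the follower range whose bio mentions any keyword."""
    wanted = [keyword.lower() for keyword in keywords]
    matches = []
    for profile in profiles:
        followers = profile.get("followers")
        # Scrapes can return None for missing fields, so guard both values
        bio = (profile.get("bio") or "").lower()
        if followers is None or not (min_followers <= followers <= max_followers):
            continue
        if any(keyword in bio for keyword in wanted):
            matches.append(profile)
    return matches
```

For example, `filter_profiles(profiles, 10_000, 100_000, ["scraping", "automation"])` would keep mid-sized accounts whose bio mentions either term. Keyword matching is a plain case-insensitive substring check, so short keywords may match inside longer words.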