# How to Scrape Threads Profiles and Posts Without the API in 2026
Meta's Threads platform has grown to 350M+ users but offers no public API for data extraction. If you want follower counts, post engagement, or profile data at scale — you need to scrape it.
Here's what works in 2026.
## Why Scraping Threads Is Different From Instagram
Threads is built on Meta's infrastructure, which means:
- No public API (ActivityPub federation exists but doesn't expose profile data)
- GraphQL API under the hood (similar to Instagram, same WAF)
- Session-based rate limits — requests without valid session get throttled fast
- JavaScript-rendered content — most data loads after initial HTML
The good news: Threads loads faster than Instagram and has less aggressive bot detection for profile data.
## What You Can Extract
| Field | Available |
|---|---|
| Username, display name, bio | ✅ |
| Follower count | ✅ |
| Following count | ✅ |
| Verified status | ✅ |
| Post text content | ✅ |
| Post likes, replies, reposts | ✅ |
| Post timestamps | ✅ |
| Profile picture URL | ✅ |
| External links in bio | ✅ |
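These fields map naturally onto a small record type for downstream storage. A sketch — the field names mirror the table above, not Threads' internal schema:

```python
from dataclasses import dataclass, field

@dataclass
class ThreadsProfile:
    """One scraped profile; defaults cover fields a response may omit."""
    username: str
    full_name: str = ""
    biography: str = ""
    follower_count: int = 0
    following_count: int = 0
    is_verified: bool = False
    profile_pic_url: str = ""
    bio_links: list = field(default_factory=list)
```

Keeping a typed record instead of raw dicts makes it obvious when Meta renames a field and your parser silently starts returning `None`.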
## Method 1: Direct API Approach (Python)
Threads uses a private GraphQL API. With the right session cookie, you can hit it directly:
```python
import requests
import json

SESSION_COOKIE = "your_sessionid_cookie_here"

def get_threads_profile(username):
    # Mobile UA + IG app ID: Threads shares Instagram's web API surface
    headers = {
        "User-Agent": "Mozilla/5.0 (iPhone; CPU iPhone OS 16_0 like Mac OS X) AppleWebKit/605.1.15",
        "Accept": "application/json",
        "X-IG-App-ID": "238260118697367",
        "Cookie": f"sessionid={SESSION_COOKIE}",
    }
    url = f"https://www.threads.net/api/v1/users/web_profile_info/?username={username}"
    r = requests.get(url, headers=headers, timeout=15)
    if r.status_code != 200:
        return None
    user_data = r.json().get("data", {}).get("user", {})
    return {
        "username": user_data.get("username"),
        "full_name": user_data.get("full_name"),
        "biography": user_data.get("biography"),
        "follower_count": user_data.get("follower_count"),
        "following_count": user_data.get("following_count"),
        "is_verified": user_data.get("is_verified"),
    }

profile = get_threads_profile("zuck")
print(json.dumps(profile, indent=2))
```
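Requests without a valid session get throttled fast, so it pays to wrap calls like `get_threads_profile` in a retry with backoff. A minimal sketch — the delay values are assumptions to tune against what you actually observe:

```python
import random
import time

def with_backoff(fn, max_attempts=4, base_delay=2.0):
    """Retry fn() until it returns a non-None result.

    Exponential backoff plus jitter spreads retries out, which is
    the usual way to survive transient throttling.
    """
    for attempt in range(max_attempts):
        result = fn()
        if result is not None:
            return result
        if attempt < max_attempts - 1:
            # 2s, 4s, 8s, ... plus up to 1s of jitter
            time.sleep(base_delay * (2 ** attempt) + random.uniform(0, 1))
    return None
```

Usage: `profile = with_backoff(lambda: get_threads_profile("zuck"))` — since the function above returns `None` on any non-200 status, the wrapper treats throttled responses and hard failures the same way.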
## Method 2: Playwright with Stealth Mode
For post content (requires JS execution):
```python
from playwright.sync_api import sync_playwright
import time

def scrape_threads_posts(username, max_posts=20):
    posts = []
    with sync_playwright() as p:
        browser = p.chromium.launch(headless=True)
        context = browser.new_context(
            user_agent="Mozilla/5.0 (iPhone; CPU iPhone OS 16_0 like Mac OS X) AppleWebKit/605.1.15",
            viewport={"width": 390, "height": 844},
        )

        # Intercept the GraphQL responses the page fires instead of
        # parsing the rendered DOM — far more stable across redesigns.
        def handle_response(response):
            if "threads_web" in response.url and "graphql" in response.url:
                try:
                    data = response.json()
                    items = data.get("data", {}).get("mediaData", {}).get("threads", [])
                    for item in items:
                        for ti in item.get("thread_items", []):
                            post = ti.get("post", {})
                            if post:
                                posts.append({
                                    "text": (post.get("caption") or {}).get("text", ""),
                                    "likes": post.get("like_count", 0),
                                    "replies": post.get("reply_count", 0),
                                })
                except Exception:
                    pass  # non-JSON or unexpected payload; skip it

        page = context.new_page()
        page.on("response", handle_response)
        page.goto(f"https://www.threads.net/@{username}")
        time.sleep(3)

        # Scroll to trigger lazy-loaded batches of older posts
        for _ in range(3):
            page.evaluate("window.scrollTo(0, document.body.scrollHeight)")
            time.sleep(2)

        browser.close()
    return posts[:max_posts]
```
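Once `scrape_threads_posts` returns its list of dicts, you usually want the data on disk. A small CSV export sketch, matching the keys the function above emits:

```python
import csv

def posts_to_csv(posts, path):
    """Write scraped post dicts (text/likes/replies) to a CSV file."""
    fieldnames = ["text", "likes", "replies"]
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames)
        writer.writeheader()
        for post in posts:
            # Missing keys become empty cells rather than raising
            writer.writerow({k: post.get(k, "") for k in fieldnames})
```

Usage: `posts_to_csv(scrape_threads_posts("zuck"), "zuck_posts.csv")`.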
## Method 3: Apify Actor (Easiest)
The Threads Profile Scraper handles session management and proxy rotation automatically.
```python
import requests, time

# Start the actor run
run = requests.post(
    "https://api.apify.com/v2/acts/lanky_quantifier~threads-profile-scraper/runs",
    headers={"Authorization": "Bearer YOUR_TOKEN"},
    json={"usernames": ["zuck", "mosseri"], "maxPostsPerProfile": 50},
).json()["data"]

# Poll until the run finishes
while True:
    status = requests.get(
        f"https://api.apify.com/v2/actor-runs/{run['id']}",
        headers={"Authorization": "Bearer YOUR_TOKEN"},
    ).json()["data"]["status"]
    if status in ("SUCCEEDED", "FAILED"):
        break
    time.sleep(5)

# Fetch the scraped items from the run's dataset
results = requests.get(
    f"https://api.apify.com/v2/actor-runs/{run['id']}/dataset/items",
    headers={"Authorization": "Bearer YOUR_TOKEN"},
).json()

for p in results:
    print(f"@{p['username']}: {p['follower_count']:,} followers")
```
Cost: ~$0.50 per 1,000 profiles scraped.
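That per-profile math is worth keeping handy when sizing a job. A trivial estimator using the ~$0.50/1,000 figure above (pricing may change; check the actor page):

```python
def estimate_cost(profiles, rate_per_1000=0.50):
    """Rough scraping cost in USD for a given number of profiles."""
    return profiles / 1000 * rate_per_1000

print(estimate_cost(25_000))  # → 12.5
```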
## Rate Limits and Anti-Detection
| Behavior | Risk Level |
|---|---|
| >100 requests/min without session | 🔴 Block |
| Datacenter IPs | 🟡 Medium |
| Residential IPs with session | 🟢 Low |
| Mobile user-agent + session | 🟢 Low |
Use mobile user-agents (Threads is mobile-first) and residential proxies for sustained scraping.
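Both recommendations can be wired into a small helper. A sketch — the user-agent strings and delay range are illustrative values, not anything Threads publishes:

```python
import random
import time

# A few common mobile user-agents (illustrative; refresh periodically,
# since stale UA strings are themselves a bot signal).
MOBILE_UAS = [
    "Mozilla/5.0 (iPhone; CPU iPhone OS 16_0 like Mac OS X) AppleWebKit/605.1.15",
    "Mozilla/5.0 (iPhone; CPU iPhone OS 17_0 like Mac OS X) AppleWebKit/605.1.15",
    "Mozilla/5.0 (Linux; Android 13; Pixel 7) AppleWebKit/537.36",
]

def polite_headers(session_cookie):
    """Build request headers with a randomly chosen mobile UA."""
    return {
        "User-Agent": random.choice(MOBILE_UAS),
        "Accept": "application/json",
        "Cookie": f"sessionid={session_cookie}",
    }

def throttle(min_s=1.0, max_s=3.0):
    """Random delay between requests to stay well under burst limits."""
    time.sleep(random.uniform(min_s, max_s))
```

Calling `throttle()` between profile fetches keeps you far below the >100 req/min threshold flagged in the table above.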
## Use Cases
- Influencer research: Track follower growth before partnerships
- Competitor monitoring: Watch brand accounts for content strategy changes
- Trend detection: Find which topics drive highest engagement
- Lead generation: Find professionals by keywords in their bio
- Market research: Track public sentiment on product launches
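For the follower-growth use case, what matters is snapshots over time. A minimal SQLite sketch — the schema and function names are my own, not from any library:

```python
import sqlite3
import time

def record_snapshot(db_path, username, follower_count):
    """Append one (username, follower_count, timestamp) row."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS snapshots "
        "(username TEXT, follower_count INTEGER, ts REAL)"
    )
    conn.execute(
        "INSERT INTO snapshots VALUES (?, ?, ?)",
        (username, follower_count, time.time()),
    )
    conn.commit()
    conn.close()

def growth(db_path, username):
    """Follower delta between the first and latest snapshot."""
    conn = sqlite3.connect(db_path)
    rows = conn.execute(
        "SELECT follower_count FROM snapshots WHERE username=? ORDER BY ts",
        (username,),
    ).fetchall()
    conn.close()
    return rows[-1][0] - rows[0][0] if len(rows) >= 2 else 0
```

Run `record_snapshot` on a daily cron against whichever scraping method you picked, and `growth` gives you the trend before you commit to a partnership.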
Which method you choose depends on scale. For one-time research, use the direct API approach. For ongoing monitoring of hundreds of profiles, use Apify's managed scraper to avoid session-rotation headaches.
Save hours on scraping setup: The $29 Apify Scrapers Bundle includes 35+ production-ready actors — Google SERP, LinkedIn, Amazon, TikTok, contact info, and more. Pre-configured inputs, working on day one.