InstaScrape: Async Instagram Comment Scraper
Visit: GitHub
Scrape all parent comments from any Instagram Reel with automated login, async speed, real-time progress, and clean exports, with no manual cookie copying required.
Features
- Automated Login: cookie.json persistence with iat + expiry, no manual cookies needed.
- Self-healing Auth: detects expired cookies mid-run, prompts relogin, resumes automatically.
- Async Engine: powered by httpx.AsyncClient with requests-per-second throttling.
- Progress Tracking: accurate percent and ETA from Instagram's comment count.
- Dual Exports: TXT and JSON files saved in timestamped folders.
Requirements
- Python 3.9+
- Dependencies:
pip install -r requirements.txt
Installation
git clone https://github.com/kaifcodec/InstaScrape
cd InstaScrape
pip install -r requirements.txt
Usage
python3 main.py
- Enter the Instagram Reel URL (e.g., https://www.instagram.com/reel/SHORTCODE/).
- Set Max requests per second (5-7 recommended). Adjust for stability.
- On first run, provide username/password; cookie.json is created and reused until expiry.
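The cookie reuse in the last step could look roughly like the sketch below. It assumes that iat and expiry are Unix timestamps in seconds stored at the top level of cookie.json; the README does not document the actual schema, and cookie_is_valid and load_cookie are hypothetical names rather than functions from the InstaScrape code.

```python
import json
import time
from pathlib import Path
from typing import Optional


def cookie_is_valid(record: dict, now: Optional[float] = None) -> bool:
    """Return True if the record was issued in the past and has not expired.

    Assumption: "iat" and "expiry" are absolute Unix timestamps in seconds.
    """
    now = time.time() if now is None else now
    try:
        iat = float(record["iat"])
        expiry = float(record["expiry"])
    except (KeyError, TypeError, ValueError):
        return False  # missing or malformed fields are treated as expired
    return iat <= now < expiry


def load_cookie(path: str = "cookie.json") -> Optional[dict]:
    """Return the stored record if it is still usable, else None (login needed)."""
    file = Path(path)
    if not file.exists():
        return None
    try:
        record = json.loads(file.read_text(encoding="utf-8"))
    except json.JSONDecodeError:
        return None
    if isinstance(record, dict) and cookie_is_valid(record):
        return record
    return None
```

A caller would fall back to the username/password prompt whenever load_cookie returns None.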
Output
- TXT: download_comments/txt/reel_comments_YYYYMMDD_HHMMSS.txt
- JSON: download_comments/json/reel_comments_YYYYMMDD_HHMMSS.json
Example JSON structure:
{
"generated_at": 1700000000,
"count": 123,
"comments": [
{ "username": "user1", "text": "Nice!", "created_at": 1699999000 }
]
}
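For downstream use, an export with this structure can be read back with the standard library alone. The sketch below assumes that created_at is a Unix timestamp in seconds, which is what the example values suggest; summarize is a hypothetical helper and the sample data is made up for illustration.

```python
import json
from collections import Counter
from datetime import datetime, timezone


def summarize(export: dict) -> dict:
    """Summarize an export dict with the fields shown in the example above."""
    comments = export.get("comments", [])
    per_user = Counter(c["username"] for c in comments)
    # Assumption: created_at is seconds since the Unix epoch.
    newest = max((c["created_at"] for c in comments), default=None)
    newest_utc = None
    if newest is not None:
        newest_utc = datetime.fromtimestamp(newest, tz=timezone.utc).isoformat()
    return {
        "count_matches": export.get("count") == len(comments),
        "top_users": per_user.most_common(3),
        "newest_utc": newest_utc,
    }


# Made-up sample in the documented shape; a real run would instead use
# json.load() on a file under download_comments/json/.
sample = json.loads("""
{
  "generated_at": 1700000000,
  "count": 2,
  "comments": [
    {"username": "user1", "text": "Nice!", "created_at": 1699999000},
    {"username": "user1", "text": "Again!", "created_at": 1699998000}
  ]
}
""")

summary = summarize(sample)
```

Comparing count against the length of comments is a cheap way to spot a truncated run.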
How it Works
- Cookie Lifecycle: cookie.json stores iat and expiry; validated on startup & during requests.
- Error Resilience: retries transient errors and refreshes cookies on 401/redirect-to-login.
- Progress Accuracy: uses Instagram's comment count to calculate percent & ETA.
- Async Efficiency: httpx.AsyncClient with HTTP/2, keep-alive, and RPS limiter.
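The RPS limiter and the progress calculation can be illustrated with the standard library. This is a minimal sketch under stated assumptions, not the code in main.py: RateLimiter, progress, and demo are hypothetical names, and the request itself is stubbed out so the example runs without httpx or network access.

```python
import asyncio
import time
from typing import Tuple


class RateLimiter:
    """Spaces out calls so that at most `rps` of them start per second."""

    def __init__(self, rps: float) -> None:
        if rps <= 0:
            raise ValueError("rps must be positive")
        self._interval = 1.0 / rps
        self._next_slot = 0.0

    async def wait(self) -> None:
        # There is no await between reading and updating _next_slot, so
        # concurrent tasks on one event loop cannot reserve the same slot.
        now = time.monotonic()
        slot = max(now, self._next_slot)
        self._next_slot = slot + self._interval
        await asyncio.sleep(slot - now)


def progress(fetched: int, total: int, elapsed: float) -> Tuple[float, float]:
    """Return (percent done, estimated seconds remaining)."""
    if fetched <= 0 or total <= 0 or elapsed <= 0:
        return 0.0, float("inf")
    percent = min(100.0, 100.0 * fetched / total)
    rate = fetched / elapsed  # comments per second so far
    eta = max(0, total - fetched) / rate
    return percent, eta


async def demo(calls: int, rps: float) -> float:
    """Run `calls` throttled no-op tasks and return the elapsed seconds."""
    limiter = RateLimiter(rps)
    start = time.monotonic()

    async def fake_request() -> None:
        await limiter.wait()  # a real scraper would await its HTTP call here

    await asyncio.gather(*(fake_request() for _ in range(calls)))
    return time.monotonic() - start


if __name__ == "__main__":
    elapsed = asyncio.run(demo(calls=5, rps=5))
    pct, eta = progress(fetched=50, total=200, elapsed=10.0)
    print(f"5 calls at 5 RPS took {elapsed:.2f}s; {pct:.0f}% done, ETA {eta:.0f}s")
```

Reserving fixed time slots keeps the spacing even when many tasks are launched at once, which is the situation an async scraper creates with asyncio.gather.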
Tips
- Start with 5-7 RPS to minimize throttling; increase gradually.
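- Filenames use local time; to switch to UTC, replace datetime.now() with datetime.now(timezone.utc) in main.py and import timezone from datetime. datetime.utcnow() also works but is deprecated as of Python 3.12.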
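As an illustration of the UTC tip, the sketch below builds an export path in the layout shown under Output from a timezone-aware UTC timestamp. export_path is a hypothetical helper; how main.py actually assembles its filenames is not shown in this README.

```python
from datetime import datetime, timezone
from pathlib import Path
from typing import Optional


def export_path(kind: str, when: Optional[datetime] = None,
                base: str = "download_comments") -> Path:
    """Build <base>/<kind>/reel_comments_YYYYMMDD_HHMMSS.<kind> using UTC."""
    if when is None:
        when = datetime.now(timezone.utc)  # aware UTC time, not local time
    stamp = when.astimezone(timezone.utc).strftime("%Y%m%d_%H%M%S")
    return Path(base) / kind / f"reel_comments_{stamp}.{kind}"


# Fixed timestamp so the result is reproducible.
fixed = datetime(2024, 1, 2, 3, 4, 5, tzinfo=timezone.utc)
example = export_path("json", fixed).as_posix()
```

With UTC stamps, exports produced on machines in different time zones sort in the same order.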
Disclaimer
Use responsibly. Comply with Instagram's Terms of Service. Intended for personal or permitted use only.
Top comments (1)
It's a bit slower right now because it uses Instagram's /graphql API endpoint, which loads comments dynamically in new pages. Feel free to suggest fixes and improvements.