DEV Community

Visualizing Instagram engagement with instascrape and Python

Chris Greening on October 21, 2020

In a recent post, I introduced my open source Instagram web scraper instascrape as a lightweight means of collecting data from Instagram using Pyth...
 
Raghav Vasudeva

I'm getting this error:

C:\Users\dell\anaconda3\lib\site-packages\instascrape\core_static_scraper.py:134: MissingCookiesWarning: Request header does not contain cookies! It's recommended you pass at least a valid sessionid otherwise
Instagram will likely redirect you to their login page.
warnings.warn(
    upload_date  comments  likes
0    1609352716       NaN    NaN
1    1609262155       NaN    NaN
2    1609098057       NaN    NaN
3    1609010932       NaN    NaN
4    1608314370       NaN    NaN
5    1608227544       NaN    NaN
6    1607624673       NaN    NaN
7    1607115440       NaN    NaN
8    1605468666       NaN    NaN
9    1605464706       NaN    NaN
10   1605457109       NaN    NaN
11   1605199872       NaN    NaN

Chris Greening

Good find! I recently had to reimplement some of the code that's used in this tutorial. I'll fix it right now and get back to you once the patch is pushed through. Thank you for bringing this to my attention!

Chris Greening

Alright, the patch has been uploaded. Reinstall the library with pip install instascrape==2.1.1 and you should be good to go! Thanks again.

Raghav Vasudeva

Update: reinstall with the following instead:
pip install insta-scrape==2.1.1

Raghav Vasudeva

Thanks for the quick reply!

Gabriel Arcangel Bol

Thanks for this post. I've tried most of your examples, but I'm getting this error: "['upload_date'] not in index".

When I inspect the columns, I find the following:

Index(['csrf_token', 'viewer', 'viewer_id', 'country_code', 'language_code',
'locale', 'device_id', 'browser_push_pub_key', 'key_id', 'public_key',
'version', 'is_dev', 'rollout_hash', 'bundle_variant', 'frontend_dev',
'id', 'shortcode', 'height', 'width', 'gating_info',
'fact_check_overall_rating', 'fact_check_information',
'sensitivity_friction_info', 'media_overlay_info', 'media_preview',
'display_url', 'accessibility_caption', 'is_video', 'tracking_token',
'tagged_users', 'caption', 'caption_is_edited', 'has_ranked_comments',
'comments', 'comments_disabled', 'commenting_disabled_for_viewer',
'timestamp', 'likes', 'location', 'viewer_has_liked',
'viewer_has_saved', 'viewer_has_saved_to_collection',
'viewer_in_photo_of_you', 'viewer_can_reshare', 'video_url',
'has_audio', 'video_view_count', 'username', 'full_name'],
dtype='object')

I guess that 'timestamp' is the right one to use, instead of 'upload_date'.

Please let me know if I'm missing something here.

Chris Greening

Hey Gabriel! First of all, thanks so much for following along!

Looks like you discovered a lil bug that I'm gonna go fix right now, thank you for bringing this to my attention!!! Since writing this post, the implementation of get_recent_posts has changed, and it looks like I forgot to include the timestamp to upload_date conversion. Instagram only serves back an integer timestamp, which instascrape then converts to a datetime object, and embarrassingly I seem to have forgotten to write that back in after the update lol
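
For readers following along, here is a minimal sketch of the kind of conversion described above, using only the standard library. It is not instascrape's actual implementation; the helper name to_upload_date is hypothetical, and the sample integers are the upload_date values from the output posted earlier in this thread, assumed to be Unix timestamps in seconds.

```python
from datetime import datetime, timezone

# Integer Unix timestamps (seconds since the epoch), the form Instagram serves.
# Sample values copied from the output posted earlier in this thread.
raw_timestamps = [1609352716, 1609262155, 1609098057]


def to_upload_date(timestamp: int) -> datetime:
    """Convert a Unix timestamp in seconds to a timezone-aware UTC datetime.

    Hypothetical helper for illustration; not part of instascrape.
    """
    return datetime.fromtimestamp(timestamp, tz=timezone.utc)


upload_dates = [to_upload_date(ts) for ts in raw_timestamps]

# Show each raw integer next to its converted datetime
for ts, date in zip(raw_timestamps, upload_dates):
    print(ts, "->", date.isoformat())
```

With a pandas DataFrame, pd.to_datetime(df["timestamp"], unit="s") performs the same conversion on a whole column at once.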

Chris Greening

ok my friend, the bug should be fixed! I merged the fix with the repo and am pushing it to PyPI under version 1.3.3 as we speak. Thanks for the find!

Gabriel Arcangel Bol

Thank you so much for your time and fast reply. I'm working on a project to do some data analysis with an Instagram scraping tool. I came across a RapidAPI Instagram API, but I haven't yet figured out how to get the data from the request module. So it was great to find your API, because it's easy to use. If you don't mind, I'll let you know about my findings.

Chris Greening

I'd love to hear about what you come up with! I actually just opened a discussion board about an hour ago on the repo, feel free to post about your project/ask questions about instascrape on there!

Gabriel Arcangel Bol

Nice, thanks!

Anacleto Berencelli

Hey, Chris!
Great scraper, but I'm getting the following error right after installing it and running:
!pip install insta-scrape
from instascrape import Profile


File "/usr/local/lib/python3.6/dist-packages/instascrape/scrapers/profile.py", line 1
from __future__ import annotations
^

SyntaxError: future feature annotations is not defined

Please let me know what I'm missing here.

Chris Greening

Hey thanks so much for checking out the lib!

Based on your traceback, it looks like you're running Python 3.6, and from __future__ import annotations is only available in Python >=3.7! Hope this helps 😄
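
A quick way to confirm this before installing is to check the interpreter version. The sketch below is only an illustration; the helper name supports_future_annotations is hypothetical and is not part of instascrape.

```python
import sys

# `from __future__ import annotations` (PEP 563) is available from Python 3.7 on
MIN_VERSION = (3, 7)


def supports_future_annotations(version_info=sys.version_info) -> bool:
    """Return True if the given interpreter version accepts the annotations import.

    Hypothetical helper for illustration only.
    """
    return tuple(version_info[:2]) >= MIN_VERSION


if supports_future_annotations():
    print("This interpreter can import annotations from __future__.")
else:
    print("Python 3.7 or newer is needed; this is",
          ".".join(str(part) for part in sys.version_info[:3]))
```

Running it under Python 3.6 takes the second branch, which matches the SyntaxError in the traceback above.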

Anacleto Berencelli

Thanks!!

Leon Avalos

Looks great, but I'm getting this error when trying to call profile.scrape():

json.decoder.JSONDecodeError: Expecting property name enclosed in double quotes: line 1 column 2 (char 1)

Thanks!!

Prasannjeet Singh

Facing the exact same problem. I was wondering if you were able to fix it?

DOUELFAKAR

Thanks a lot for this post, Chris. I run into an issue when I am on my home internet, but not when I am connected through my phone or away from home. How is that possible?
Here is the error message I get:

instascrape.exceptions.exceptions.InstagramLoginRedirectError: Instagram is redirecting you to the login page instead of the page you are trying to scrape. This could be occuring because you 1. made too many requests too quickly or 2. are not logged into Instagram on your machine.
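
One of the two causes the message names is not being logged in, and the MissingCookiesWarning quoted earlier in this thread recommends passing a valid sessionid. A minimal sketch of building such request headers follows. The helper name build_headers, the placeholder session value, and the user-agent string are all assumptions for illustration, and this may not resolve a redirect caused by too many requests.

```python
def build_headers(session_id: str) -> dict:
    """Build request headers that carry an Instagram sessionid cookie.

    Hypothetical helper; the user-agent below is just an example browser string.
    """
    return {
        "user-agent": (
            "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
            "(KHTML, like Gecko) Chrome/87.0.4280.88 Safari/537.36"
        ),
        # The sessionid value comes from a logged-in browser session
        "cookie": f"sessionid={session_id};",
    }


# Placeholder value; replace it with your own sessionid
headers = build_headers("PASTE_YOUR_SESSIONID_HERE")
print(headers["cookie"])
```

A dict like this would presumably be passed through the headers= keyword, as in the scrape_posts(..., headers=headers, pause=10) call quoted elsewhere in this thread.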

villival

easy to use... perfect for data analysis

Chris Greening

Thanks so much!!! One of the primary inspirations for this project was easy to use data scraping <3

villival

wonderful efforts

Athiya Rastogi • Edited

Hi Chris, getting this error when I am trying to scrape posts.

scraped, unscraped = scrape_posts(posts, silent=False, headers=headers, pause=10)

Error: ValueError: Invalid value NaN (not a number)

What do I do to fix this? The same piece of code was working just fine until recently, when it started giving the error above.

Raghav Vasudeva

I'm getting this after running the code. Please help out.

C:\Users\dell\anaconda3\lib\site-packages\instascrape\core_static_scraper.py:134: MissingCookiesWarning: Request header does not contain cookies! It's recommended you pass at least a valid sessionid otherwise
Instagram will likely redirect you to their login page.
warnings.warn(
    upload_date  comments  likes
0    1609352716       NaN    NaN
1    1609262155       NaN    NaN
2    1609098057       NaN    NaN
3    1609010932       NaN    NaN
4    1608314370       NaN    NaN
5    1608227544       NaN    NaN
6    1607624673       NaN    NaN
7    1607115440       NaN    NaN
8    1605468666       NaN    NaN
9    1605464706       NaN    NaN
10   1605457109       NaN    NaN
11   1605199872       NaN    NaN

Pirzada Junaid Raza

How can I get data for all the posts?