What will be scraped
Prerequisites
Full Code
Extracting data from the JSON string
Links
Dmi, the code does not work? It says "Error: It looks like you are using Playwright Sync API inside the asyncio loop. Please use the Async API instead." Also, it would be extremely helpful to extract data from each profile (Research Interest, Citations, and h-index).
Hi, @datum_geek :) The code works; most likely you got a captcha on your end. The provided code should be used together with proxies, or at least a captcha-solving service.
After X number of requests, ResearchGate throws a captcha that needs to be solved. Try changing the `user-agent` to yours: check what your `user-agent` is and replace it. Also, I'm not sure about the error, since the code imports `sync_api` and uses a context manager as well.
(GIF that shows the output. 624 results in total.)
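For readers wondering where the `user-agent` goes, here is a minimal sketch using the Playwright sync API. It is not the article's full code: `MY_USER_AGENT`, `context_options`, and `fetch_html` are hypothetical names, and the user-agent string is only a placeholder that you would replace with the one your own browser reports.

```python
# Placeholder value (an assumption): replace it with your own browser's user-agent.
MY_USER_AGENT = (
    "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36"
)


def context_options(user_agent: str = MY_USER_AGENT) -> dict:
    """Build the keyword arguments passed to browser.new_context()."""
    return {"user_agent": user_agent}


def fetch_html(url: str, user_agent: str = MY_USER_AGENT) -> str:
    """Open the URL in headless Chromium with a custom user-agent and return the HTML."""
    # Imported lazily so this module still loads where Playwright is not installed.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.launch(headless=True)
        # The user-agent is set per browser context, not per page.
        context = browser.new_context(**context_options(user_agent))
        page = context.new_page()
        page.goto(url)
        html = page.content()
        browser.close()
    return html
```

Calling `fetch_html("https://www.researchgate.net/")` from a regular script should return the page source, although a custom user-agent alone may not be enough to avoid the captcha.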

The follow-up blog post will be about scraping the whole profile page.
Yes, indeed. Note that the code does not work in a Jupyter Notebook/Lab environment, but it does work in VS Code.
I don't work with notebooks very often :) I don't know the nuances that would make it work there, other than using the async Playwright API instead.
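As a rough illustration of the async route: Jupyter already runs an asyncio event loop, which is why the sync API refuses to start there. The sketch below first checks for a running loop and then shows an async variant of a page fetch. `loop_is_running` and `fetch_html_async` are hypothetical names, and this is an untested-against-ResearchGate outline rather than a drop-in replacement for the article's code.

```python
import asyncio


def loop_is_running() -> bool:
    """Return True when called from inside a running asyncio event loop."""
    try:
        asyncio.get_running_loop()
    except RuntimeError:
        # No loop is running: a plain script, where the sync API is fine.
        return False
    return True


async def fetch_html_async(url: str) -> str:
    """Async counterpart of a simple page fetch, usable inside a notebook."""
    # Imported lazily so this module still loads where Playwright is not installed.
    from playwright.async_api import async_playwright

    async with async_playwright() as p:
        browser = await p.chromium.launch(headless=True)
        page = await browser.new_page()
        await page.goto(url)
        html = await page.content()
        await browser.close()
    return html
```

In a notebook cell you would write `html = await fetch_html_async(url)`, while in a regular script the same coroutine can be run with `asyncio.run(fetch_html_async(url))`.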
A possible workaround is to run the script in a terminal, save the data to a file, and then load that data inside the notebook.
Not very convenient though.
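That workaround could look roughly like this, assuming the scraper collects its results as a list of dictionaries. `save_results` and `load_results` are hypothetical helpers, and JSON is just one convenient file format for the hand-off.

```python
import json
from pathlib import Path


def save_results(results: list, path: str) -> None:
    """Call at the end of the terminal script to write the scraped data to disk."""
    text = json.dumps(results, indent=2, ensure_ascii=False)
    Path(path).write_text(text, encoding="utf-8")


def load_results(path: str) -> list:
    """Call inside the notebook to read the data the script saved."""
    return json.loads(Path(path).read_text(encoding="utf-8"))
```

The script would end with something like `save_results(data, "results.json")`, and the notebook would start with `data = load_results("results.json")`.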
Let me know whether you figure it out.