In this post, we'll look into how we can optimize and improve our puppeteer Web Scraping API. We'll also look into several puppeteer plugins to imp...
Hi, I am making an Instagram scraping tool.
Instagram divs weren't loading in headless: true mode, so I switched to puppeteer-extra and added the stealth plugin. Everything worked fine on localhost, thanks to you.
But unfortunately, when deployed to Heroku, the divs are not loading again; even page.waitForSelector fails with a timeout error.
PS: 1) I've added args: ['--no-sandbox']
2) I've also added the github.com/jontewks/puppeteer-hero... buildpack in my Heroku app settings.
Link to my project: github.com/apanjwani0/Scrape-Insta...
Thanks in advance!
Did you find any solution for that?
No, I thought maybe dockerizing my project would solve the issue (that way we can also run headless:false), but never continued with the project.
Do let me know if it works for you, or you find any other solution.
await page.goto(BASE_URL, { waitUntil: "networkidle0" })
waitUntil: "networkidle0" is necessary for this issue, and headless should be set to "new".
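Putting the pieces from this thread together, a minimal sketch might look like the following. It assumes puppeteer-extra and puppeteer-extra-plugin-stealth are installed; `BASE_URL`, `SELECTOR`, `buildLaunchOptions` and `scrape` are hypothetical names used only for illustration, and the 30-second timeout is an arbitrary choice. Whether `headless: "new"` is accepted depends on your Puppeteer version (newer releases treat `headless: true` as the new headless mode), so treat that value as an assumption too.

```javascript
// Sketch only: not taken from the original project.
// BASE_URL and SELECTOR are placeholders you would replace with your own values.
const BASE_URL = "https://example.com/";
const SELECTOR = "main";

// Launch options mentioned in this thread: no sandbox (as used on Heroku)
// and the "new" headless mode.
function buildLaunchOptions() {
  return {
    headless: "new",
    args: ["--no-sandbox"],
  };
}

async function scrape(url, selector) {
  // Required inside the function so the options helper above can be
  // inspected even on a machine without a browser installed.
  const puppeteer = require("puppeteer-extra");
  const StealthPlugin = require("puppeteer-extra-plugin-stealth");
  puppeteer.use(StealthPlugin());

  const browser = await puppeteer.launch(buildLaunchOptions());
  try {
    const page = await browser.newPage();
    // Wait until there have been no network connections for a short while.
    await page.goto(url, { waitUntil: "networkidle0" });
    // Fail with a timeout error if the element never appears.
    await page.waitForSelector(selector, { timeout: 30000 });
    return await page.$eval(selector, (el) => el.textContent);
  } finally {
    // Always close the browser, even if navigation or the selector wait fails.
    await browser.close();
  }
}
```

You could call it with `scrape(BASE_URL, SELECTOR).then(console.log).catch(console.error);`. This is not guaranteed to fix the Heroku case described above; it only shows the suggested settings in one place.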
Great article!
Have you faced a problem with Heroku's IP being blocked by the website being scraped? If yes, how did you bypass it? Example: stackoverflow.com/questions/143289...