DEV Community


Discussion on: If you would need to scrape many different websites nowdays, which tool/language combo would you pick?

Médéric Burlet

Would depend on the type of scraping.

If you need to interact with the page as a human would, then Puppeteer with JS / TS would be good: github.com/puppeteer/puppeteer

If you just need to parse data, I really like to use cheerio with JS / TS: github.com/cheeriojs/cheerio
It lets you access webpage information with jQuery syntax, which can be quite practical.
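
To give a rough idea of what that jQuery-style access looks like, here is a minimal sketch in TypeScript. It assumes the cheerio package is installed. The `Article` shape, the `extractArticle` helper, the sample HTML, and the selectors `article h1` and `article p` are all made up for illustration; real sites will need their own selectors.

```typescript
import * as cheerio from "cheerio";

// Hypothetical shape for an extracted article (an assumption, not from the post).
interface Article {
  title: string;
  paragraphs: string[];
}

// Parse an HTML string and pull out a headline and the body paragraphs.
// The selectors below are assumptions; every site structures its markup differently.
function extractArticle(html: string): Article {
  const $ = cheerio.load(html);

  // jQuery-style selection: first <h1> inside <article>, as trimmed text.
  const title = $("article h1").first().text().trim();

  // Collect the trimmed text of every <p> inside <article> into a plain array.
  const paragraphs = $("article p")
    .map((_, el) => $(el).text().trim())
    .get();

  return { title, paragraphs };
}

// Made-up sample markup standing in for a downloaded page.
const sampleHtml = `
  <html><body>
    <article>
      <h1> Example headline </h1>
      <p>First paragraph.</p>
      <p>Second paragraph.</p>
    </article>
  </body></html>`;

const article = extractArticle(sampleHtml);
console.log(article.title, article.paragraphs.length);
```

cheerio only parses the HTML string you hand it, so downloading the page is a separate step, and content that is rendered by client-side JavaScript will not show up unless you use something like Puppeteer instead.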

Mario Davchevski Author

Thanks for the response!

I do not need to interact as a human, just to collect news articles from different websites, at scale. Looking at cheerio, it seems like a very decent option. Thanks!