DEV Community

Discussion on: If you would need to scrape many different websites nowdays, which tool/language combo would you pick?

Mario Davchevski Author

But async in Python does not seem to be as natural as in Node.js

This is one of the reasons I listed Go in the tags. I'm still learning it, but it feels like well-thought-out concurrent code can go a long way when scraping at scale.
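To make the concurrency point concrete, here is a rough sketch of how a bounded worker pool might look in Go, using only the standard library. The names `crawl`, `httpFetch`, `fetchFunc`, and `result` are made up for illustration, the example URLs are placeholders, and a real crawler would also need politeness delays, retries, and robots.txt handling.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"sync"
	"time"
)

// result holds the outcome of fetching one URL (hypothetical type).
type result struct {
	url   string
	bytes int
	err   error
}

// fetchFunc downloads a URL and reports the body size in bytes.
// It is injectable so the pool can be exercised without network access.
type fetchFunc func(url string) (int, error)

// httpFetch is a fetchFunc backed by net/http with a request timeout.
func httpFetch(url string) (int, error) {
	client := &http.Client{Timeout: 10 * time.Second}
	resp, err := client.Get(url)
	if err != nil {
		return 0, err
	}
	defer resp.Body.Close()
	body, err := io.ReadAll(resp.Body)
	return len(body), err
}

// crawl fetches all urls with at most `workers` requests in flight.
// Results arrive in completion order, not input order.
func crawl(urls []string, workers int, fetch fetchFunc) []result {
	jobs := make(chan string)
	out := make(chan result)

	var wg sync.WaitGroup
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for u := range jobs {
				n, err := fetch(u)
				out <- result{url: u, bytes: n, err: err}
			}
		}()
	}

	// Feed the queue, then signal the workers that no more jobs are coming.
	go func() {
		for _, u := range urls {
			jobs <- u
		}
		close(jobs)
	}()

	// Close the output channel once every worker has finished.
	go func() {
		wg.Wait()
		close(out)
	}()

	var results []result
	for r := range out {
		results = append(results, r)
	}
	return results
}

func main() {
	// Placeholder URLs, assumed for the example only.
	urls := []string{"https://example.com/", "https://example.org/"}
	for _, r := range crawl(urls, 2, httpFetch) {
		if r.err != nil {
			fmt.Println(r.url, "error:", r.err)
			continue
		}
		fmt.Println(r.url, r.bytes, "bytes")
	}
}
```

The `workers` argument caps how many requests run at once, which is the main knob for scaling up without hammering small blogs.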

Basically, I want to crawl simple blogs and extract their blog posts. The biggest challenge here would probably be parsing the data and identifying the different content parts within a blog post.