DEV Community

Discussion on: What is the best way for web scraping?

Pacharapol Withayasakpunt • Edited

Indeed, in the past, I used Python.

  • Concurrency: try ThreadPoolExecutor, or coroutines (asyncio). Scraping spends most of its time waiting on the network, so this can speed things up a lot.
  • Fetching the content: requests is fine for plain GET requests.
  • Locating the content: I now prefer lxml to BeautifulSoup.
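
To make those three points concrete, here is a rough sketch of how they might fit together. It is only an illustration, not a recommended setup: the function names (`fetch`, `extract_title`, `scrape_all`), the 10-second timeout, the worker count, and the choice to extract the page title are all assumptions of mine.

```python
from concurrent.futures import ThreadPoolExecutor


def fetch(url):
    """GET a page and return its body as text."""
    # Third-party package; imported here so the rest of the sketch
    # can be exercised without it installed.
    import requests

    response = requests.get(url, timeout=10)  # timeout value is an assumption
    response.raise_for_status()  # fail loudly on 4xx/5xx responses
    return response.text


def extract_title(page_source):
    """Locate the <title> text with an lxml XPath query."""
    from lxml import html  # third-party package

    tree = html.fromstring(page_source)
    titles = tree.xpath("//title/text()")
    return titles[0].strip() if titles else None


def scrape_all(urls, fetch=fetch, extract=extract_title, max_workers=8):
    """Fetch and parse every URL in a thread pool.

    Results come back in the same order as the input URLs.
    """
    def scrape_one(url):
        return extract(fetch(url))

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map yields results in input order and re-raises
        # any exception from a worker when its result is consumed.
        return list(pool.map(scrape_one, urls))
```

Calling `scrape_all` with a list of URLs should return one title (or `None`) per URL. The `fetch` and `extract` parameters are there so you can swap in your own logic, or a fake fetcher when testing offline.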

As you may have noticed in some of the comments, you could also try Node.js, where you can use Cheerio. It has a jQuery-like API, but because it runs on the server rather than in a browser, CORS is not a problem. (You still need to fetch the pages with something like axios or node-fetch, though.)

Nishant Mittal

Yup, but I'm still wondering why almost no one is in favour of Scrapy.