re: What is the best way for web scraping?

 

Indeed, in the past, I used Python.

  • Concurrency -- try ThreadPoolExecutor, or some kind of coroutines. It can speed things up a lot.
  • Fetching the content. I guess requests is OK for the GET calls.
  • Locating the content. I now prefer lxml to BeautifulSoup.
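
To make the three points above concrete, here is a minimal sketch of how they could fit together. It assumes requests and lxml are installed; the function names (`fetch`, `extract_links`, `scrape`), the XPath expression, and the worker count are my own illustrative choices, not anything prescribed by those libraries.

```python
from concurrent.futures import ThreadPoolExecutor

import requests
from lxml import html


def fetch(url, timeout=10):
    """Download one page and return its HTML text."""
    response = requests.get(url, timeout=timeout)
    response.raise_for_status()  # fail loudly on 4xx/5xx responses
    return response.text


def extract_links(page_source):
    """Return the href of every <a> element, located with XPath."""
    tree = html.fromstring(page_source)
    return tree.xpath("//a/@href")


def scrape(urls, fetcher=fetch, max_workers=8):
    """Fetch the URLs concurrently, then map each URL to its links.

    `fetcher` is a hypothetical hook so the download step can be swapped
    out (for example, with a cache or a stub while testing).
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map yields results in the same order as `urls`.
        pages = list(pool.map(fetcher, urls))
    return {url: extract_links(page) for url, page in zip(urls, pages)}
```

Calling `scrape(["https://example.com/"])` would return a dict mapping each URL to the list of hrefs found on that page. Threads suit this job because most of the time is spent waiting on the network, not on the CPU.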

As some of the comments mention, you might also try Node.js, where you can use Cheerio. It is jQuery-ish but, since it runs on the server rather than in a browser, it has no problem with CORS. (You still need to fetch the page with axios or node-fetch, though.)

 

Yup, but still I'm wondering why almost no one is in favour of Scrapy.
