Concurrency -- try ThreadPoolExecutor, or some kind of coroutines. Scraping is mostly waiting on the network, so this may speed things up a lot.
Getting the content -- I guess requests is OK.
Locating the content -- I now prefer lxml to BeautifulSoup.
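Here is a rough sketch of how those three pieces might fit together in Python. The function names (fetch_html, extract_links, scrape), the choice of extracting every link href, and the worker count are only illustrative assumptions; swap in whatever XPath and fetcher suit your pages.

```python
from concurrent.futures import ThreadPoolExecutor


def fetch_html(url, timeout=10):
    """GET a page with requests and return its body as text."""
    import requests  # third-party; imported here so scrape() also works with another fetcher

    response = requests.get(url, timeout=timeout)
    response.raise_for_status()  # fail loudly on 4xx/5xx
    return response.text


def extract_links(html):
    """Locate content with lxml; as an example, every href on the page."""
    import lxml.html  # third-party; imported here so scrape() also works with another parser

    tree = lxml.html.fromstring(html)
    return tree.xpath("//a/@href")


def scrape(urls, fetch=fetch_html, extract=extract_links, max_workers=8):
    """Fetch and parse all URLs in a thread pool, returning {url: result}."""

    def work(url):
        return extract(fetch(url))

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map yields results in the same order as the input URLs
        return dict(zip(urls, pool.map(work, urls)))
```

Calling scrape(list_of_urls) would then return a dict of URL to list of hrefs. Threads are enough here because the work is I/O-bound; an exception in any single fetch propagates out of scrape, so you may want to catch it inside work if some pages are expected to fail.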
As you may have noticed in some of the comments, you might try Node.js, where you can use Cheerio, which is jQuery-ish but, since it runs outside the browser, has no problem with CORS. (You may still need to fetch with axios or node-fetch, though.)
Indeed, in the past, I used Python. requests is OK.
Yup, but still I'm wondering why almost no one is in favour of Scrapy.