DEV Community

Scrape Everything! Building Web scrapers with Python and Go

Diretnan Domnan on February 14, 2020

Table of Contents: Introduction, Recognizing web scraping opportunities, Key components of a scraping project, Manual Inspection, Web Requ...
 
The Nigerian Travel Blogger

I love how you broke down everything in this article, from HTML to Colly and Python.
One thing that I surely relate to is the popups! Sites like netnaija have a lot of them and I'm glad you pointed this out.
Good writeup Diretnan!
Keep scribbling...

Camilo Caquimbo Tabares

For popups or any other dynamic element on webpages you can use Selenium and a headless browser to get the information.
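As a rough sketch of that suggestion (not code from the article): the function below loads a page in headless Chrome through Selenium and returns the HTML after JavaScript has run. It assumes Selenium 4 and a recent Chrome are installed; `fetch_rendered_html`, `extract_links`, and the wait on `<body>` are illustrative choices, and the `--headless=new` flag assumes a Chrome version that supports it.

```python
from html.parser import HTMLParser


def fetch_rendered_html(url, wait_seconds=10):
    """Load url in headless Chrome and return the HTML after JavaScript has run."""
    # Imported lazily so extract_links() below still works without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    options = Options()
    options.add_argument("--headless=new")  # assumption: recent Chrome; no visible window
    driver = webdriver.Chrome(options=options)
    try:
        driver.get(url)
        # Wait until <body> exists; a real scraper would wait for the element it needs.
        WebDriverWait(driver, wait_seconds).until(
            EC.presence_of_element_located((By.TAG_NAME, "body"))
        )
        return driver.page_source
    finally:
        driver.quit()  # always release the browser process


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag in a document."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)


def extract_links(html):
    """Return all link targets found in an HTML string, in document order."""
    parser = LinkCollector()
    parser.feed(html)
    return parser.links
```

Once the rendered HTML is in hand, it can be parsed like any static page, for example `extract_links(fetch_rendered_html(url))`.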

Sven Varkel

Thanks, that's a good overview of the basics.
With scraping, things can get big and serious fast, and the codebase grows very quickly. Most of the work is maintaining different scrapers/parsers for different websites that are always changing.
There's an excellent library/framework for creating scrapers (spiders) in Python: Scrapy. It takes a bit of learning and setup, but it's really powerful once you master the concepts. There are daemons like scrapyd and web admin interfaces like Spiderkeeper, and these work quite nicely together. If you're serious about scraping then you'd also need a proxy solution. I've had really good experience with Luminati. They're expensive but the best. And then comes cracking the captchas and other advanced topics. So scraping is a big world of its own. Happy scraping! :)
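To illustrate the maintenance point, here is a minimal standard-library sketch (deliberately not Scrapy) of keeping one parser per website behind a single entry point, so a site redesign only touches that site's function. The domains `blog.example` and `shop.example`, the tag choices, and all function names are made up for the example.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse

PARSERS = {}  # maps a domain name to the function that parses its pages


def parser_for(domain):
    """Decorator that registers a parsing function for one domain."""
    def register(func):
        PARSERS[domain] = func
        return func
    return register


class _TagText(HTMLParser):
    """Collect the text inside every occurrence of one tag."""

    def __init__(self, tag):
        super().__init__()
        self.tag = tag
        self.depth = 0
        self.found = []

    def handle_starttag(self, tag, attrs):
        if tag == self.tag:
            self.depth += 1
            self.found.append("")

    def handle_endtag(self, tag):
        if tag == self.tag and self.depth:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth:
            self.found[-1] += data


def tag_text(html, tag):
    """Return the stripped text of each <tag> element in html."""
    collector = _TagText(tag)
    collector.feed(html)
    return [text.strip() for text in collector.found]


@parser_for("blog.example")  # hypothetical site that puts titles in <h2>
def parse_blog(html):
    return {"titles": tag_text(html, "h2")}


@parser_for("shop.example")  # hypothetical site that puts titles in <h3>
def parse_shop(html):
    return {"titles": tag_text(html, "h3")}


def parse_page(url, html):
    """Dispatch to the parser registered for the URL's domain."""
    domain = urlparse(url).netloc
    try:
        return PARSERS[domain](html)
    except KeyError:
        raise LookupError(f"no parser registered for {domain}") from None
```

Scrapy gives you a similar split for free, with one spider class per site plus scheduling, retries, and pipelines, which is why it pays off as the number of sites grows.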

Diretnan Domnan

Exactly! Things get complicated quickly, but there are excellent libraries out there to help. Thanks for the helpful references too.

alex24409331

Awesome tutorial, thank you. But for a non-tech user it is quite hard to build a workable scraper for my WooCommerce store.
As a side solution I am using an eCommerce scraper via e-scraper.com; maybe it helps somebody too.
But I am not giving up))) Thank you for your input!!!

Derin

Nice Article

educr6

I loved the way you wrote this article. Super good!!!!

Diretnan Domnan

Thanks!

Royce O A

This is an educational and interesting piece of knowledge to share. Very inspiring and encouraging information. Thanks a lot. You can also check out schoolentry.xyz/, myfinanceblog.xyz/ for finance, scholarstufz.xyz/ for scholarships, rsguide.xyz/ for relationships, and scholarshipshall.com/ for scholarships again.
