How to Create a Web Crawler From Scratch in Python

Frankie on July 26, 2019

Overview Most Python web crawling/scraping tutorials use some kind of crawling library. This is great if you want to get things done qui...
adikwok

May I ask, sir?

Why did the proxy script return 'curl' on my screen?

Thank you and best regards,
adi kwok

hkdennisk

I got the same error. The problem is the API key for proxyorbit.com.

If you print proxy_info after line 16, "proxy_info = requests.get(self.proxy_orbit_url).json()",
you should see {u'error': u'API token is incorrect'}.
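
A minimal sketch of how the crawler could fail with a readable message instead of a bare 'curl'. It assumes (this is not confirmed in the thread) that a successful response carries the proxy address under a "curl" key, so an error response makes the lookup fail on that key. The helper name proxy_from_response is hypothetical and not part of the post's code.

```python
def proxy_from_response(proxy_info, key="curl"):
    """Return the proxy address from a decoded JSON response.

    `key` defaults to "curl", which is an assumption about the API's
    success payload; adjust it if the real response differs.
    """
    # The API reports problems such as a bad token under an "error" key.
    if "error" in proxy_info:
        raise RuntimeError(f"Proxy API error: {proxy_info['error']}")
    # Guard against any other payload that lacks the expected key.
    if key not in proxy_info:
        raise RuntimeError(f"Unexpected proxy response, missing {key!r}: {proxy_info}")
    return proxy_info[key]
```

Passing the dictionary from the comment above, {'error': 'API token is incorrect'}, raises a RuntimeError that names the token problem directly.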

redapemusic35

How do we fix this error? I am guessing it has something to do with getting a new API token?

phi16180

Any updates on how to solve this issue?

Frankie

I have updated the post; check the UPDATE section at the bottom.

Frankie

Since there has been a lot of confusion about the curl issue, I have updated the post to explain the cause and possible workarounds.

Hossain000

That code was awesome. In this code you extracted the meta description. Can you show how to extract a <div class, please?
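
One possible way to do this, sketched with only the standard library's html.parser so it runs without extra installs. The names DivClassExtractor and extract_divs are hypothetical, and this is not necessarily how the post's crawler parses pages; a library such as BeautifulSoup or lxml would likely be shorter.

```python
from html.parser import HTMLParser


class DivClassExtractor(HTMLParser):
    """Collect the text inside every <div> whose class list contains target_class."""

    def __init__(self, target_class):
        super().__init__()
        self.target_class = target_class
        self.depth = 0       # div nesting depth inside a matching div; 0 means outside
        self.results = []    # one text string per matching div
        self._chunks = []    # text pieces of the div currently being read

    def handle_starttag(self, tag, attrs):
        if tag != "div":
            return
        if self.depth:
            # A nested div inside a match: track it so we find the right closing tag.
            self.depth += 1
        else:
            classes = (dict(attrs).get("class") or "").split()
            if self.target_class in classes:
                self.depth = 1
                self._chunks = []

    def handle_endtag(self, tag):
        if tag == "div" and self.depth:
            self.depth -= 1
            if self.depth == 0:
                # Join the pieces and collapse runs of whitespace.
                self.results.append(" ".join("".join(self._chunks).split()))

    def handle_data(self, data):
        if self.depth:
            self._chunks.append(data)


def extract_divs(html, target_class):
    """Return the text of each <div> carrying target_class."""
    parser = DivClassExtractor(target_class)
    parser.feed(html)
    parser.close()
    return parser.results
```

Calling extract_divs(page_html, "price") would return a list with the text of each div whose class attribute includes "price"; the class name here is only an illustration.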
