DEV Community

Discussion on: Web Scraping using Python

Ayesha Saeed Haq

One more question: for some reason I am getting the following error in this code.

def unique_links(tags,url):
    cleaned_links = set()

    for link in links:
        link = link.get("href") #to get external

        if link in links is None: 
            continue 
        if link.endswith('/') or link.endswith('#'):
            link = link[-1]
        actual_url = urllib.parse.urljoin(url,link)
        cleaned_links.add(actual_url)
    return cleaned_links

cleaned_links = unique_links(links,url)

AttributeError Traceback (most recent call last)
in
13 return cleaned_links
14
---> 15 cleaned_links = unique_links(links,url)

in unique_links(tags, url)
7 if link in links is None:
8 continue
----> 9 if link.endswith('/') or link.endswith('#'):
10 link = link[-1]
11 actual_url = urllib.parse.urljoin(url,link)

AttributeError: 'NoneType' object has no attribute 'endswith'

Any idea why that is?

Sharan Kumar Paratala Rajagopal

Can you check line 9? I think it is not recognizing the # character. Also, can you please check that all libraries are loaded? Are any of your links ending with #?

Ayesha Saeed Haq

Thanks. I ended up commenting out line number 9.
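
For anyone hitting the same traceback, a likely cause is the check on line 7 rather than the # character. Python reads `link in links is None` as the chained comparison `(link in links) and (links is None)`, which can never be true while `links` is a list, so the `continue` is never reached and `endswith` gets called on a `None` href. Below is a minimal sketch of one possible fix, not the article author's code. It assumes each tag exposes `.get("href")` the way a BeautifulSoup tag does (plain dicts stand in for tags here so the snippet runs on its own), and it assumes the intent of `link[-1]` was to drop the trailing character, which is `link[:-1]`. The sample data and `example.com` URL are made up for illustration.

```python
from urllib.parse import urljoin


def unique_links(tags, url):
    """Return the set of absolute URLs built from each tag's href."""
    cleaned_links = set()

    for tag in tags:  # iterate over the parameter, not a global list
        href = tag.get("href")

        if href is None:  # tag has no href attribute, skip it
            continue
        if href.endswith("/") or href.endswith("#"):
            href = href[:-1]  # assumption: strip the trailing character
        cleaned_links.add(urljoin(url, href))

    return cleaned_links


# Hypothetical sample data: dicts stand in for BeautifulSoup tags.
sample_tags = [
    {"href": "/about/"},
    {"name": "anchor-without-href"},
    {"href": "contact#"},
    {"href": "/about"},
]
result = unique_links(sample_tags, "https://example.com/")
print(result)
```

With this version the tag without an href is skipped instead of raising, and `/about/` and `/about` collapse into a single entry, so line 9 does not need to be commented out.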