DEV Community

drake

Extracting the Domain Name from a URL

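import re

# Placeholder input: `all_data` is not defined in the original post
all_data = ['https://www.example.com/path', 'http://blog.example.com', 'example.com']

for domain in all_data:
    # Strip the scheme, then keep only the host part before the first slash
    if 'http' in domain:
        domain = domain.replace('https://', '').replace('http://', '')
    if '/' in domain:
        domain = domain.split('/')[0]
    # If the host has three or more labels, drop the first one (e.g. "www.").
    # Note: this also shortens hosts such as "example.co.uk" to "co.uk".
    res = re.findall(r'\.(.+?)\.(.+)', domain)
    if res:
        domain = '.'.join(res[0])
    print(domain)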
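
The string replacement above can be fragile with ports, credentials, or uppercase schemes. As an alternative, here is a minimal sketch that lets the standard library's `urllib.parse.urlsplit` do the parsing. The helper names `extract_host` and `strip_www` are hypothetical and not part of the original post, and this version only removes a leading `www.` instead of dropping the first label whenever there are three or more.

```python
from urllib.parse import urlsplit


def extract_host(url: str) -> str:
    """Return the lowercase host of `url`, without scheme, port, path, or credentials."""
    # urlsplit only recognizes the host when the input has a scheme or starts with "//"
    if '://' not in url:
        url = '//' + url
    # hostname is None when no host could be parsed; fall back to an empty string
    return urlsplit(url).hostname or ''


def strip_www(host: str) -> str:
    """Drop a leading "www." label only; other subdomains are left untouched."""
    return host[4:] if host.startswith('www.') else host


# Placeholder URLs for illustration
for url in ['https://www.example.com/path', 'http://blog.example.com:8080', 'example.com/a']:
    print(strip_www(extract_host(url)))
```

Neither approach can tell that `example.co.uk` is a registered domain in its own right. Handling that correctly generally requires a lookup against the Public Suffix List, which is outside the scope of this sketch.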
