Extracting the domain name from a URL in Python

How can I extract the domain name from a URL in Python? The URLs come in various formats, for example:

Input:

https://docs.google.com/spreadsheet/ccc?key=blah-blah-blah-blah#gid=1
https://stackoverflow.com/questions/1234567/blah-blah-blah-blah
http://www.domain.com
https://www.other-domain.com/whatever/blah/blah/?v1=0&v2=blah+blah ...

Output:

docs.google.com
stackoverflow.com
www.domain.com
www.other-domain.com
3 answers

Use Python's built-in urlparse module (this is the Python 2 name; in Python 3 it is urllib.parse).

from urlparse import urlparse  # Python 2 module; Python 3 moved it to urllib.parse
url = 'https://docs.google.com/spreadsheet/ccc?key=blah-blah-blah-blah#gid=1'
result = urlparse(url)

result holds every component of the URL (scheme, netloc, path, params, query, fragment); the domain is result.netloc.
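As a quick check against the question, here is a minimal sketch that runs the four sample URLs through urlparse and collects the netloc of each. It assumes Python 3 (hence the urllib.parse import), and the fourth URL is used only as far as it is written out in the question.

```python
from urllib.parse import urlparse

# Sample URLs taken from the question.
urls = [
    'https://docs.google.com/spreadsheet/ccc?key=blah-blah-blah-blah#gid=1',
    'https://stackoverflow.com/questions/1234567/blah-blah-blah-blah',
    'http://www.domain.com',
    'https://www.other-domain.com/whatever/blah/blah/?v1=0&v2=blah+blah',
]

# netloc is the "network location" part that sits between '//' and the path.
domains = [urlparse(u).netloc for u in urls]

for d in domains:
    print(d)
```

The printed lines should match the expected output listed in the question.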

Source: Python实用脚本清单 (a list of practical Python scripts)

Extract the domain name from a URL

try:
    from urllib.parse import urlparse  # Python 3
except ImportError:
    from urlparse import urlparse  # Python 2

def extractDomainFromURL(url):
    """Return the netloc part of url (the host, plus any userinfo or port)."""
    return urlparse(url).netloc
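One caveat with netloc: it is the whole network location, so it also carries credentials and a port when the URL has them. If only the host is wanted, the parse result's hostname attribute may be a better fit. A small sketch, assuming Python 3 and a made-up URL that is not from the original question:

```python
from urllib.parse import urlparse

# Hypothetical URL with credentials and a port, used only for illustration.
parsed = urlparse('https://user:secret@Example.com:8443/path?x=1')

print(parsed.netloc)    # full network location, credentials and port included
print(parsed.hostname)  # host only, lower-cased
print(parsed.port)      # port as an int, or None if the URL has no port
```

For the URLs in the question there are no credentials or ports, so netloc and hostname give the same result.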

In Python 3 (3.7, for example), the import looks like this:

from urllib.parse import urlparse

url = 'https://www.jiazhao.com/jiaojingshoushi/169/'
parse_result = urlparse(url)
print(parse_result)
ParseResult(scheme='https', netloc='www.jiazhao.com', path='/jiaojingshoushi/169/', params='', query='', fragment='')
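A pitfall worth knowing: urlparse only fills in netloc when the string contains '//' before the host. For input such as 'www.domain.com/whatever' the whole string ends up in path and netloc is empty. The following is a rough sketch of one workaround; domain_of is a hypothetical helper name, and the '://' check is a simplistic assumption that will misfire on URLs carrying another URL in their query string.

```python
from urllib.parse import urlparse

def domain_of(url):
    """Hypothetical helper: return netloc even when the scheme is missing."""
    # Without '//' urlparse treats the text as a path, so add it when absent.
    if '://' not in url and not url.startswith('//'):
        url = '//' + url
    return urlparse(url).netloc

print(domain_of('www.domain.com/whatever'))
print(domain_of('https://www.jiazhao.com/jiaojingshoushi/169/'))
```

Prefixing '//' rather than a full 'http://' avoids inventing a scheme the input never had.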