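I'm a beginner trying to scrape a website. The call
html = requests.get(url=url).text  # ,headers=headers,timeout=10
hangs and never returns, and adding headers and timeout makes no difference. I changed the code to: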
from selenium import webdriver

def ff_webdriver_html(url):
    driver = webdriver.Chrome()
    try:
        driver.get(url)  # request and load the page
        page_source = driver.page_source  # get the page's source
        return page_source
    finally:
        # the original close()/quit() came after return and never ran
        driver.quit()

print(ff_webdriver_html(url))
It turns out it was not frozen. After the page finished loading, it kept loading endlessly:
data: {"rc":0,"rt":2,"svr":177617933,"lt":1,"full":1,"dlmkts":"","data":null}
data: {"rc":0,"rt":2,"svr":177617933,"lt":1,"full":1,"dlmkts":"","data":null}
data: {"rc":0,"rt":2,"svr":177617933,"lt":1,"full":1,"dlmkts":"","data":null}
data: {"rc":0,"rt":2,"svr":177617933,"lt":1,"full":1,"dlmkts":"","data":null}
data: {"rc":0,"rt":2,"svr":177617933,"lt":1,"full":1,"dlmkts":"","data":null}
data: {"rc":0,"rt":2,"svr":177617933,"lt":1,"full":1,"dlmkts":"","data":null}
data: {"rc":0,"rt":2,"svr":177617933,"lt":1,"full":1,"dlmkts":"","data":null}
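How can I return as soon as the initial load completes, without the repeated loading? Manually clicking the browser's stop button makes the call return. Is there a way to simulate clicking the stop button after driver.get(url), or a better way to handle this? It would be even better if this could be solved with requests.get(url=url) or DrissionPage.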
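On the requests side of the question, here is a minimal sketch. It assumes the URL returns a never-ending stream of `data: {...}` lines, as the output above suggests. In requests, `timeout` limits the connection and the wait between received bytes, not the total download time, so a server that keeps sending data never triggers it. With `stream=True` the body can be read line by line and dropped after the first payload. The helper names `parse_sse_data` and `fetch_first_events` and the timeout values are hypothetical, chosen only for illustration.

```python
import json

import requests


def parse_sse_data(lines, limit=1):
    """Collect up to `limit` JSON payloads from lines shaped like 'data: {...}'."""
    events = []
    for line in lines:
        if isinstance(line, bytes):
            line = line.decode("utf-8", errors="replace")
        if not line.startswith("data:"):
            continue  # skip blank separator lines and other fields
        events.append(json.loads(line[len("data:"):].strip()))
        if len(events) >= limit:
            break  # stop reading; the caller closes the connection
    return events


def fetch_first_events(url, limit=1, headers=None):
    # stream=True returns once the headers arrive instead of waiting
    # for a body that never ends. timeout=(connect, read) in seconds.
    with requests.get(url, headers=headers, stream=True, timeout=(5, 10)) as resp:
        resp.raise_for_status()
        return parse_sse_data(resp.iter_lines(), limit)
```

Calling `fetch_first_events(url)` would return a list with the first decoded payload and then close the connection, provided the assumption about the response format holds.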
Adding this line before `driver.get(url)`:
driver.set_page_load_timeout(5)
solved it. Thanks.