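I'm new to Python and followed https://blog.csdn.net/mtbaby/... to scrape listing information from Xiaozhu (小猪短租), but my IP was then banned.
So I started looking into proxy IPs, but I still can't get any data back.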
import time

import requests
from lxml import etree

# Send plain-HTTP requests through this proxy (it may no longer be alive)
proxies = {
    'http': 'http://61.135.217.7:80',
}
user_agent = 'Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/65.0.3325.162 Safari/537.36'
url = 'http://hz.xiaozhu.com/'
headers = {'User-Agent': user_agent}
# A timeout keeps the script from hanging forever on a dead proxy
data = requests.get(url, headers=headers, proxies=proxies, timeout=10).text
h = etree.HTML(data)
home = h.xpath('//*[@id="page_list"]/ul/li')
time.sleep(2)
for div in home:
    # Query relative to the current <li> (div), not the whole document (h)
    title = div.xpath('./div[2]/div/a/span/text()')[0]  # title
    price = div.xpath('./div[2]/span[1]/i/text()')[0]  # price
    print("{}-->{}".format(title, price))
The result of running it is as follows:
I'd really appreciate any help solving this. Thank you!
Not every proxy IP actually works. Confirm that a proxy is usable before you rely on it.
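One way to check a proxy before using it is to fetch a small test page through it and treat any error or timeout as "not usable". The following is only a minimal sketch: the function name `proxy_works`, the test URL `http://httpbin.org/ip`, and the 5-second timeout are my own assumptions, not something from the question. It uses only the standard library's `urllib`, and the `proxies` dict has the same shape as the one passed to `requests.get(..., proxies=proxies)`.

```python
import http.client
import urllib.error
import urllib.request


def proxy_works(proxies, test_url="http://httpbin.org/ip", timeout=5):
    """Return True if test_url can be fetched through the given proxies.

    proxies  -- dict such as {'http': 'http://host:port'}
    test_url -- assumed test page; any small, reliable HTTP page will do
    timeout  -- seconds to wait before giving up on the proxy
    """
    opener = urllib.request.build_opener(urllib.request.ProxyHandler(proxies))
    try:
        with opener.open(test_url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, http.client.HTTPException, OSError):
        # Connection refused, timeout, bad response: treat the proxy as dead
        return False


if __name__ == "__main__":
    proxies = {'http': 'http://61.135.217.7:80'}  # the proxy from the question
    if proxy_works(proxies):
        print("proxy is usable")
    else:
        print("proxy is not usable, try another one")
```

If the check fails, pick another proxy and test again before passing it to `requests.get`. Note that passing this check only shows the proxy can relay a request; the target site may still block it.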