Version: Python 3.5
Code:
import requests

# Request headers copied from a browser session.
header = {
    "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8",
    "Accept-Encoding": "gzip, deflate",
    "Accept-Language": "zh-CN,zh;q=0.9",
    "Connection": "keep-alive",
    "Host": "www.dianping.com",
    "Upgrade-Insecure-Requests": "1",
    "User-Agent": "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:34.0) Gecko/20100101 Firefox/34.0",
    "Pragma": "no-cache",
    "Cache-Control": "no-cache",
    "Cookie": "cye=guangzhou; ",
}
result = requests.get('http://www.dianping.com/search/category/4/10/g0r0', headers=header)
print(result.content)
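With the same headers and the same request URL, about three out of ten attempts return an error page, while the rest return the correct page. Why does this happen?
Part of the error page's source: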
<html><head>\r\n<script language="javascript">\r\nsetTimeout("location.replace(location.href.split(\\"#\\")[0])",2000);\r\n</script>\r\n<script src="http://18.20.18.20:44582/nsflashcookie/flash.js" type="text/javascript">
I tested this in both a browser and PyCharm: when requests are sent too frequently, the server returns 403 Forbidden.
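To tell the two failure modes apart from a normal page, something like the following check might help. It is only a heuristic sketch: the function name looks_blocked is made up for illustration, and the marker string is taken from the error page quoted in the question, so it may not match every block page the site serves.

```python
def looks_blocked(status_code, body):
    """Heuristic: True if a response resembles the error pages described in this thread."""
    # Case 1: the server refuses outright.
    if status_code == 403:
        return True
    # result.content is bytes; decode defensively before searching it.
    text = body.decode("utf-8", errors="replace") if isinstance(body, bytes) else body
    # Case 2: the short page that only reloads itself via JavaScript.
    # The marker below comes from the error page quoted in the question.
    return "location.replace(location.href" in text
```

With the code from the question, this would be called as looks_blocked(result.status_code, result.content).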
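The requests are simply too frequent, and sending exactly the same header every time makes the client easy to recognize.
Slow down: for example, send one request every 3-5 s, or every 30-60 s.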
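A minimal sketch of that advice, assuming a random pause before each request and a User-Agent picked at random from a small pool. The names USER_AGENTS, build_headers and polite_get are hypothetical, the second User-Agent string is just an example value, and nothing here guarantees the site will stop blocking.

```python
import random
import time

# Hypothetical pool of User-Agent strings; replace with values you consider realistic.
USER_AGENTS = [
    "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:34.0) Gecko/20100101 Firefox/34.0",
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/63.0.3239.132 Safari/537.36",
]


def build_headers(base_headers, user_agents=USER_AGENTS):
    """Return a copy of base_headers with a randomly chosen User-Agent."""
    headers = dict(base_headers)  # copy, so the caller's dict is not modified
    headers["User-Agent"] = random.choice(user_agents)
    return headers


def polite_get(session, url, base_headers, min_delay=3.0, max_delay=5.0,
               sleep=time.sleep):
    """Pause for a random interval, then fetch url with varied headers.

    session is any object with a requests-style get(url, headers=..., timeout=...)
    method, for example requests.Session().
    """
    # Random pause in [min_delay, max_delay] seconds (3-5 s by default).
    sleep(random.uniform(min_delay, max_delay))
    return session.get(url, headers=build_headers(base_headers), timeout=10)
```

With the header dict from the question, a call might look like polite_get(requests.Session(), 'http://www.dianping.com/search/category/4/10/g0r0', header). Passing min_delay=30, max_delay=60 gives the slower 30-60 s pace.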