How to set custom headers when making a request in Scrapy

I usually issue requests with yield Request(). I tried changing DEFAULT_REQUEST_HEADERS in settings, and I also tried passing headers to the Request() constructor, but neither had any effect.

3 answers
# Defined in the spider's class body, so callbacks can read it as self.headers
headers = {
      "Host":"onlinelibrary.wiley.com",
      "Accept":"text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
      "Accept-Language":"zh-CN,zh;q=0.8,en-US;q=0.5,en;q=0.3",
      "Accept-Encoding":"gzip, deflate",
      "Referer":"http://onlinelibrary.wiley.com/journal/10.1002/(ISSN)1521-3773",
      "Cookie":"EuCookie='this site uses cookies'; __utma=235730399.1295424692.1421928359.1447763419.1447815829.20; s_fid=2945BB418F8B3FEE-1902CCBEDBBA7EA2; __atuvc=0%7C37%2C0%7C38%2C0%7C39%2C0%7C40%2C3%7C41; __gads=ID=44b4ae1ff8e30f86:T=1423626648:S=ALNI_MalhqbGv303qnu14HBk1HfhJIDrfQ; __utmz=235730399.1447763419.19.2.utmcsr=baidu|utmccn=(organic)|utmcmd=organic; TrackJS=c428ef97-432b-443e-bdfe-0880dcf38417; OLProdServerID=1026; JSESSIONID=441E57608CA4A81DFA82F4C7432B400F.f03t02; WOLSIGNATURE=7f89d4e4-d588-49a2-9f19-26490ac3cdd3; REPORTINGWOLSIGNATURE=7306160150857908530; __utmc=235730399; s_vnum=1450355421193%26vn%3D2; s_cc=true; __utmb=235730399.3.10.1447815829; __utmt=1; s_invisit=true; s_visit=1; s_prevChannel=JOURNALS; s_prevProp1=TITLE_HOME; s_prevProp2=TITLE_HOME",
      "Connection":"keep-alive"
    }
# Inside a spider callback; link is the URL to follow and Request is scrapy.Request
yield Request(link, headers=self.headers, callback=self.parse2)

This works fine for me.

The approaches in the answers above are all fine,
but I set the headers in Scrapy's settings instead:

# settings.py
DEFAULT_REQUEST_HEADERS = {
    'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8',
    'Accept-Encoding': 'gzip, deflate',
    'Accept-Language': 'zh-CN,zh;q=0.9',
    'Connection': 'keep-alive',
    'Host': 'www.web.cn',
    'Referer': 'http://www.web.cn/',
    'Cookie': 'is cookis',  # placeholder value
}

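To check whether the setting took effect, I read it back from the response in the parse callback:
iCookie = response.request.headers.getlist('Cookie')
The iCookie I get is an empty list. Does that mean the cookie was not set?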
