The URL I am scraping returns HTTP 200 but no data?

I want to scrape review data from JD.com (京东). When the request is made from within JD's own site, this URL returns data:
https://club.jd.com/comment/productPageComments.action?callback=fetchJSON_comment98vv1111&productId=4247631&score=0&sortType=5&page=0&pageSize=10&isShadowSku=0&fold=1

But when I request the URL on its own, no data comes back. Has JD put some anti-scraping measure in place?
How can I get around it?


2 answers

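An endpoint like this almost certainly checks various request headers before it returns any data, so pasting the URL straight into the browser's address bar will not work. In practice, find the request in the browser's developer tools (Network panel), right-click it, and choose Copy > Copy as cURL.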

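Then run that curl command in a terminal. It will look roughly like this (this sample was copied from a SegmentFault page rather than from JD):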

curl "https://segmentfault.com/q/1010000021560158" -H "authority: segmentfault.com" -H "cache-control: max-age=0" -H "upgrade-insecure-requests: 1" -H "user-agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/79.0.3945.117 Safari/537.36" -H "sec-fetch-user: ?1" -H "accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9" -H "sec-fetch-site: same-origin" -H "sec-fetch-mode: navigate" -H "referer: https://segmentfault.com/questions/unanswered" -H "accept-encoding: gzip, deflate, br" -H "accept-language: zh-CN,zh;q=0.9,en-US;q=0.8,en;q=0.7,ja;q=0.6,zh-TW;q=0.5,ms;q=0.4,ht;q=0.3,ko;q=0.2" -H "cookie: _ga=GA1.2.1926293500.1575098486; __gads=Test; sf_remember=e4e0c08fba7773c9bcf7c8e6b88ba902; PHPSESSID=web1~9mcseoma54sh574i2ehop20213; _gid=GA1.2.1832093998.1578766685; _gat_gtag_UA_918487_8=1; io=BxQGjb_8_rAym-CmvkfM" --compressed

If you are writing a crawler, take the headers out of the command and set each of them in your code. Hope this helps.
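As a rough sketch of that last step, the headers can be pulled out of a copied cURL command and attached to a request in Python. The helper names `headers_from_curl` and `build_request` and the shortened `example.com` command are made up for illustration, and the sketch assumes the bash form of "Copy as cURL", where every header is passed as a quoted `-H "name: value"` argument.

```python
import shlex
import urllib.request


def headers_from_curl(command: str) -> dict:
    """Collect the -H/--header values of a copied cURL command into a dict."""
    tokens = shlex.split(command)  # honours the shell-style quoting
    headers = {}
    # Look at each token together with the token that follows it.
    for flag, value in zip(tokens, tokens[1:]):
        if flag in ("-H", "--header"):
            name, _, content = value.partition(":")
            headers[name.strip()] = content.strip()
    return headers


def build_request(url: str, headers: dict) -> urllib.request.Request:
    """Create a request object carrying the copied headers (nothing is sent yet)."""
    return urllib.request.Request(url, headers=headers)


# Shortened, made-up command in the shape that "Copy as cURL" produces.
sample = (
    'curl "https://example.com/api" '
    '-H "user-agent: Mozilla/5.0" '
    '-H "referer: https://example.com/page" '
    '-H "cookie: a=1; b=2" --compressed'
)

headers = headers_from_curl(sample)
request = build_request("https://example.com/api", headers)
# urllib.request.urlopen(request) would actually send it.
```

One thing to watch: `urllib` does not decompress responses on its own, so it is usually simpler to drop the copied `accept-encoding` header before sending unless the code handles gzip or brotli itself.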

It is probably the Referer or the Cookie.
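To try this guess in isolation, a minimal sketch could send only those two headers with the URL from the question. The Referer value below is an assumption (a product page address built from the `productId` in the URL), the cookie string is a placeholder to be replaced with one copied from a logged-in browser session, and `make_request` and `unwrap_jsonp` are hypothetical helper names. Because the URL carries a `callback=` parameter, the body is presumably wrapped as `callback(...)` rather than being plain JSON, so the sketch also strips that wrapper; the `comments` field in the sample body is made up.

```python
import json
import re
import urllib.request

URL = (
    "https://club.jd.com/comment/productPageComments.action"
    "?callback=fetchJSON_comment98vv1111&productId=4247631&score=0"
    "&sortType=5&page=0&pageSize=10&isShadowSku=0&fold=1"
)


def make_request(url: str, referer: str, cookie: str = "") -> urllib.request.Request:
    """Build a request that carries only a Referer and, optionally, a Cookie."""
    headers = {"Referer": referer}
    if cookie:
        headers["Cookie"] = cookie
    return urllib.request.Request(url, headers=headers)


def unwrap_jsonp(text: str):
    """Strip a `callback( ... );` wrapper and parse the JSON inside it."""
    match = re.search(r"^\s*[\w$.]+\s*\((.*)\)\s*;?\s*$", text, re.DOTALL)
    if match is None:
        raise ValueError("response is not in callback(...) form")
    return json.loads(match.group(1))


# Assumed Referer and placeholder cookie; neither value comes from the thread.
request = make_request(URL, "https://item.jd.com/4247631.html", "key=value")
# body = urllib.request.urlopen(request).read().decode(errors="replace")
# data = unwrap_jsonp(body)

# Made-up response body, only to show what unwrap_jsonp does.
sample_body = 'fetchJSON_comment98vv1111({"comments": []});'
parsed = unwrap_jsonp(sample_body)
```

If the response is still empty with both headers set, comparing against the full header set from "Copy as cURL" and removing headers one at a time is a reasonable way to find which one the server actually checks.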

How did this turn into a cross-origin problem?

