1. The target website is: http://app1.sfda.gov.cn/datas...
2. The code in test.py is as follows:
# -*- coding: UTF-8 -*-
import urllib
import urllib2

print "======================="
url = "http://app1.sfda.gov.cn/datasearch/face3/base.jsp?tableId=25&tableName=TABLE25&title=%B9%FA%B2%FA%D2%A9%C6%B7&bcId=124356560303886909015737447882"
# Browser-like headers; the Referer is the same page as the request URL.
headers = {
    'Host': 'app1.sfda.gov.cn',
    'Referer': url,
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/49.0.2623.75 Safari/537.36 LBBROWSER',
}
# Form fields mirroring the query string of the URL.
values = {
    'tableId': '25',
    'tableName': 'TABLE25',
    # Decode the percent-escapes first; urlencode() would otherwise escape the '%' signs a second time.
    'title': urllib.unquote('%B9%FA%B2%FA%D2%A9%C6%B7'),
    'bcId': '124356560303886909015737447882',
}
data = urllib.urlencode(values)
# Passing data makes urllib2 send a POST request instead of a GET.
request = urllib2.Request(url, data, headers)
response = urllib2.urlopen(request)
print response.read()
print "======================="
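3. After running test.py, no data comes back.
I also tried simulating a browser with phantomjs, but that could not fetch the data either. Any help would be appreciated.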
The site uses JS obfuscation. See this article for reference: http://www.bijishequ.com/deta...
You can use selenium to get the page source.
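A minimal sketch of the selenium approach, written for Python 3 and not tested against this site. It assumes the selenium package, Chrome, and a matching driver are installed. The helper name fetch_rendered_source and the 5-second wait are illustrative choices, not something from the original post; the wait is only a guess at how long the page's scripts need.

```python
import time


def fetch_rendered_source(url, driver=None, wait_seconds=5):
    """Load url in a browser and return the HTML after its JavaScript has run.

    fetch_rendered_source is a hypothetical helper name. If no driver is
    passed in, a Chrome driver is created and closed again afterwards.
    """
    own_driver = driver is None
    if own_driver:
        # Imported lazily so the helper can also be tried with a stand-in driver.
        from selenium import webdriver
        # Assumption: Chrome and a compatible driver are available locally.
        driver = webdriver.Chrome()
    try:
        driver.get(url)
        # Crude fixed wait; assumes the scripts finish within wait_seconds.
        time.sleep(wait_seconds)
        return driver.page_source
    finally:
        # Only close the browser if this function opened it.
        if own_driver:
            driver.quit()
```

Calling fetch_rendered_source(url) with the URL from the question would open Chrome, load the page, and return the HTML as the browser sees it once its scripts have run. If you know which element holds the data, an explicit wait with selenium's WebDriverWait is usually more reliable than a fixed sleep.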