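pipelines.py:
from scrapy import Request
from scrapy.exceptions import DropItem
from scrapy.pipelines.files import FilesPipeline

class ApkspiderPipeline(FilesPipeline):
    def get_media_requests(self, item, info):
        for file_url in item["file_urls"]:
            # Attach the item so file_path() can read AppName from it.
            yield Request(file_url, meta={"item": item})

    def item_completed(self, results, item, info):
        file_paths = [x["path"] for ok, x in results if ok]
        print(file_paths)
        if not file_paths:
            raise DropItem("Item contains no files")
        return item

    def file_path(self, request, response=None, info=None):
        # `item` is not in scope here, so take it from request.meta.
        item = request.meta["item"]
        media_guid = item['AppName'] + '.apk'
        # Request has no `file_url` attribute; use this request's own URL.
        category = request.url.replace('http://sj.qq.com/myapp/category.htm?orgame=1&categoryId=', '')
        return u'full/{0}/{1}'.format(category, media_guid)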
settings.py:
ITEM_PIPELINES = {
    #'apkSpider.pipelines.CheckPipeline': 300,
    #'apkSpider.pipelines.JsonWriterPipeline': 300,
    #'apkSpider.pipelines.ApkspiderPipeline': 300,
    'scrapy.pipelines.files.FilesPipeline': 1,
}
FILES_STORE = 'output'
FILES_EXPIRES = 90
The downloaded files are still named with their SHA1 hash.
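Your settings are wrong. With the current configuration, items are still processed by the default FilesPipeline, so your file_path() override is never called. ITEM_PIPELINES has to point at the class you overrode, along the lines of the ApkspiderPipeline entry that is currently commented out.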
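
A minimal sketch of what the corrected settings might look like, assuming the subclass lives at apkSpider.pipelines.ApkspiderPipeline (the path from the commented-out entry in the question) and that the priority value 300 from that entry is kept. The stock FilesPipeline entry is dropped so that only the subclass handles the downloads:

```python
# settings.py (sketch): register the custom subclass instead of the stock FilesPipeline.
ITEM_PIPELINES = {
    # Module path and priority are taken from the commented-out line in the question.
    'apkSpider.pipelines.ApkspiderPipeline': 300,
}
FILES_STORE = 'output'   # download directory, unchanged from the question
FILES_EXPIRES = 90       # days before a file is re-downloaded, unchanged
```

Because ApkspiderPipeline inherits from FilesPipeline, it still reads FILES_STORE and FILES_EXPIRES, so those two settings should not need to change.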