Python: Overrode item_completed and file_path in Scrapy's FilesPipeline, but they are never called

Posted 2017-08-21, updated 2017-08-22

pipelines.py:

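from scrapy import Request
from scrapy.exceptions import DropItem
from scrapy.pipelines.files import FilesPipeline


class ApkspiderPipeline(FilesPipeline):
    def get_media_requests(self, item, info):
        for file_url in item["file_urls"]:
            # Attach the item so file_path() can read its fields later.
            yield Request(file_url, meta={"item": item})

    def item_completed(self, results, item, info):
        file_paths = [x["path"] for ok, x in results if ok]
        print(file_paths)
        if not file_paths:
            raise DropItem("Item contains no images")
        return item

    def file_path(self, request, response=None, info=None):
        # No item is passed in here, so read it back from request.meta.
        item = request.meta["item"]
        media_guid = item["AppName"] + ".apk"
        # Request has no file_url attribute; request.url is the URL being downloaded.
        prefix = "http://sj.qq.com/myapp/category.htm?orgame=1&categoryId="
        return u"full/{0}/{1}".format(request.url.replace(prefix, ""), media_guid)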

settings.py:

ITEM_PIPELINES = {
    #'apkSpider.pipelines.CheckPipeline': 300,
    #'apkSpider.pipelines.JsonWriterPipeline': 300,
    #'apkSpider.pipelines.ApkspiderPipeline': 300,
    'scrapy.pipelines.files.FilesPipeline': 1,
}
FILES_STORE = 'output'
FILES_EXPIRES = 90

The downloaded files are still named with the SHA1 hash instead of the name built in file_path.

python scrapy

1 Answer

鼠标不吃猫

Posted 2017-08-28

✓ Accepted

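Your settings are wrong. As configured, items are still processed by the default pipeline, because Scrapy only runs the classes listed in ITEM_PIPELINES, so the overrides in your subclass are never reached. Register your overriding class instead, something like: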

'yourprojectname.pipelines.ApkspiderPipeline': 1
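
For reference, here is a minimal sketch of what the corrected settings.py might look like. It assumes the project package is named apkSpider, taken from the commented-out entries in the question, and it keeps the question's FILES_STORE and FILES_EXPIRES values unchanged. The stock FilesPipeline entry is left out so that only the subclass handles downloads.

```python
# settings.py (sketch; "apkSpider" is assumed from the question's commented-out entries)

ITEM_PIPELINES = {
    # Enable the subclass so its get_media_requests, file_path and
    # item_completed overrides are the ones Scrapy actually calls.
    "apkSpider.pipelines.ApkspiderPipeline": 1,
    # The stock 'scrapy.pipelines.files.FilesPipeline' entry is omitted on
    # purpose: with it enabled, files are saved under their default SHA1 names.
}

# Values carried over from the question.
FILES_STORE = "output"
FILES_EXPIRES = 90
```

The integer is only the pipeline's ordering priority, with lower values running first, so 1 or 300 would both work here as long as the subclass is the entry that is enabled.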
