I installed Scrapy 1.1.2 successfully on Python 3.4.4.
I then ran scrapy bench as a test:
C:\Documents and Settings\Administrator>scrapy bench
2016-09-02 18:06:42 [scrapy] INFO: Scrapy 1.1.2 started (bot: scrapybot)
2016-09-02 18:06:42 [scrapy] INFO: Overridden settings: {'LOGSTATS_INTERVAL': 1, 'LOG_LEVEL': 'INFO', 'CLOSESPIDER_TIMEOUT': 10}
2016-09-02 18:06:44 [scrapy] INFO: Enabled extensions:
['scrapy.extensions.corestats.CoreStats',
'scrapy.extensions.telnet.TelnetConsole',
'scrapy.extensions.logstats.LogStats',
'scrapy.extensions.closespider.CloseSpider']
2016-09-02 18:06:45 [scrapy] INFO: Enabled downloader middlewares:
['scrapy.downloadermiddlewares.httpauth.HttpAuthMiddleware',
'scrapy.downloadermiddlewares.downloadtimeout.DownloadTimeoutMiddleware',
'scrapy.downloadermiddlewares.useragent.UserAgentMiddleware',
'scrapy.downloadermiddlewares.retry.RetryMiddleware',
'scrapy.downloadermiddlewares.defaultheaders.DefaultHeadersMiddleware',
'scrapy.downloadermiddlewares.redirect.MetaRefreshMiddleware',
'scrapy.downloadermiddlewares.httpcompression.HttpCompressionMiddleware',
'scrapy.downloadermiddlewares.redirect.RedirectMiddleware',
'scrapy.downloadermiddlewares.cookies.CookiesMiddleware',
'scrapy.downloadermiddlewares.chunked.ChunkedTransferMiddleware',
'scrapy.downloadermiddlewares.stats.DownloaderStats']
2016-09-02 18:06:45 [scrapy] INFO: Enabled spider middlewares:
['scrapy.spidermiddlewares.httperror.HttpErrorMiddleware',
'scrapy.spidermiddlewares.offsite.OffsiteMiddleware',
'scrapy.spidermiddlewares.referer.RefererMiddleware',
'scrapy.spidermiddlewares.urllength.UrlLengthMiddleware',
'scrapy.spidermiddlewares.depth.DepthMiddleware']
2016-09-02 18:06:45 [scrapy] INFO: Enabled item pipelines:
[]
2016-09-02 18:06:45 [scrapy] INFO: Spider opened
2016-09-02 18:06:45 [scrapy] INFO: Crawled 0 pages (at 0 pages/min), scraped 0 items (at 0 items/min)
2016-09-02 18:06:46 [scrapy] INFO: Crawled 1 pages (at 60 pages/min), scraped 0 items (at 0 items/min)
2016-09-02 18:06:47 [scrapy] INFO: Crawled 2 pages (at 60 pages/min), scraped 0 items (at 0 items/min)
2016-09-02 18:06:49 [scrapy] INFO: Crawled 3 pages (at 60 pages/min), scraped 0 items (at 0 items/min)
2016-09-02 18:06:49 [scrapy] INFO: Crawled 10 pages (at 420 pages/min), scraped 0 items (at 0 items/min)
2016-09-02 18:06:52 [scrapy] INFO: Crawled 23 pages (at 780 pages/min), scraped 0 items (at 0 items/min)
2016-09-02 18:06:54 [scrapy] INFO: Crawled 31 pages (at 480 pages/min), scraped 0 items (at 0 items/min)
2016-09-02 18:06:55 [scrapy] INFO: Closing spider (closespider_timeout)
2016-09-02 18:06:55 [scrapy] INFO: Crawled 39 pages (at 480 pages/min), scraped 0 items (at 0 items/min)
2016-09-02 18:06:56 [scrapy] INFO: Crawled 50 pages (at 660 pages/min), scraped 0 items (at 0 items/min)
2016-09-02 18:06:57 [scrapy] INFO: Dumping Scrapy stats:
{'downloader/request_bytes': 15412,
'downloader/request_count': 50,
'downloader/request_method_count/GET': 50,
'downloader/response_bytes': 87156,
'downloader/response_count': 50,
'downloader/response_status_count/200': 50,
'finish_reason': 'closespider_timeout',
'finish_time': datetime.datetime(2016, 9, 2, 10, 6, 57, 218750),
'log_count/INFO': 15,
'request_depth_max': 4,
'response_received_count': 50,
'scheduler/dequeued': 50,
'scheduler/dequeued/memory': 50,
'scheduler/enqueued': 1001,
'scheduler/enqueued/memory': 1001,
'start_time': datetime.datetime(2016, 9, 2, 10, 6, 45, 609375)}
2016-09-02 18:06:57 [scrapy] INFO: Spider closed (closespider_timeout)
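Judging from this output, the benchmark ran successfully.
Next, I tried to follow the example in this post: "scrapy简单学习" (a simple Scrapy tutorial).
But it failed with the following error: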
C:\Documents and Settings\Administrator>scrapy crawl dmoz -o items.json
Scrapy 1.1.2 - no active project
Unknown command: crawl
Use "scrapy" to see available commands
Does anyone know how to actually use Scrapy 1.1.2?
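To run the spiders in your project, you have to run the scrapy command from inside the project directory (the one that contains scrapy.cfg). Otherwise Scrapy reports "no active project" and does not recognize project-only commands such as crawl. For example, my project is called Spiders, so I first run cd Spiders and then run scrapy crawl dmoz -o items.json.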
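If you are not sure whether your current directory counts as "inside a project", here is a minimal sketch that checks for it. It assumes Scrapy decides this by looking for a scrapy.cfg file in the current directory or one of its parent directories. The function name find_scrapy_cfg is made up for this illustration and is not part of Scrapy's API. The code avoids newer syntax so that it should also run on Python 3.4.

```python
from pathlib import Path


def find_scrapy_cfg(start="."):
    """Return the nearest scrapy.cfg at or above `start`, or None.

    Hypothetical helper for illustration only; it walks from `start`
    up through its parent directories looking for a scrapy.cfg file.
    """
    current = Path(start).resolve()
    for directory in [current] + list(current.parents):
        candidate = directory / "scrapy.cfg"
        if candidate.is_file():
            return candidate
    return None


if __name__ == "__main__":
    cfg = find_scrapy_cfg()
    if cfg is None:
        # No project found: project-only commands such as "crawl"
        # are expected to be unavailable from this directory.
        print("No scrapy.cfg found here or in any parent directory.")
    else:
        print("Project root: {}".format(cfg.parent))
```

If this prints that no scrapy.cfg was found, cd into the directory created by scrapy startproject (Spiders in my case) and run the crawl command again from there.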