Page listing all housing developments:
http://fsfc.fsjw.gov.cn/searc...
Page for one specific development:
http://fsfc.fsjw.gov.cn/hpms_...
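The red box contains the information for several buildings.
Looking at the page source, each of these buildings has its own JSON address.
After joining it onto the site URL, for example: http://fsfc.fsjw.gov.cn/hpms_...
you can see the information for every unit in that building.
The format I want is: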
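Part of the data (development name, administrative district) is on the specific development page, and the other part (building name, unit number, floor area) is at the JSON URL for a given building. My code is at https://git.oschina.net/wiwis... It seems to fail when requesting the JSON address, so the data cannot be added to the Item. How can a single spider collect every unit on this site, with each record holding the development name and district as well as the building name, unit number, and area?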
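To make the goal concrete, here is a minimal sketch of the merge step only, written with the standard library so it runs on its own. The JSON keys (`building_name`, `rooms`, `room_name`, `area`) and the sample values are assumptions for illustration, since the real layout of the building JSON is not shown here. In a Scrapy spider, one common approach is to attach the development-level dict to the request for the JSON URL through the request's `meta` argument and read it back in the callback from `response.meta`; the callback would then yield the merged records.

```python
import json


def build_room_items(estate_info, building_json_text):
    """Merge development-level fields into one record per unit.

    estate_info: dict scraped from the development page.
    building_json_text: raw JSON text for one building
    (the key names used below are hypothetical).
    """
    data = json.loads(building_json_text)
    items = []
    for room in data.get("rooms", []):
        item = dict(estate_info)  # copy, so every unit gets its own record
        item["building_name"] = data.get("building_name")
        item["room_name"] = room.get("room_name")
        item["area"] = room.get("area")
        items.append(item)
    return items


# Made-up sample data standing in for the two sources.
estate_info = {"estate_name": "Example Estate", "district": "Example District"}
sample_json = json.dumps({
    "building_name": "Building 1",
    "rooms": [
        {"room_name": "101", "area": 88.5},
        {"room_name": "102", "area": 90.0},
    ],
})

items = build_room_items(estate_info, sample_json)
for item in items:
    print(item)
```

Copying `estate_info` for each unit keeps the records independent, which matters when several building requests for the same development are in flight at once.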
The error output is as follows:
appledeMacBook-Air:fangchang wiw$ scrapy crawl fsfc
2017-02-22 09:36:13 [scrapy] INFO: Scrapy 1.0.5 started (bot: fangchang)
2017-02-22 09:36:13 [scrapy] INFO: Optional features available: ssl, http11
2017-02-22 09:36:13 [scrapy] INFO: Overridden settings: {'NEWSPIDER_MODULE': 'fangchang.spiders', 'SPIDER_MODULES': ['fangchang.spiders'], 'USER_AGENT': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/55.0.2883.95 Safari/537.36', 'DOWNLOAD_DELAY': 0.5, 'BOT_NAME': 'fangchang'}
2017-02-22 09:36:13 [scrapy] INFO: Enabled extensions: CloseSpider, TelnetConsole, LogStats, CoreStats, SpiderState
2017-02-22 09:36:13 [scrapy] INFO: Enabled downloader middlewares: HttpAuthMiddleware, DownloadTimeoutMiddleware, UserAgentMiddleware, RetryMiddleware, DefaultHeadersMiddleware, MetaRefreshMiddleware, HttpCompressionMiddleware, RedirectMiddleware, CookiesMiddleware, ChunkedTransferMiddleware, DownloaderStats
2017-02-22 09:36:13 [scrapy] INFO: Enabled spider middlewares: HttpErrorMiddleware, OffsiteMiddleware, RefererMiddleware, UrlLengthMiddleware, DepthMiddleware
2017-02-22 09:36:13 [scrapy] INFO: Enabled item pipelines: FangchangPipeline
2017-02-22 09:36:13 [scrapy] INFO: Spider opened
2017-02-22 09:36:13 [scrapy] INFO: Crawled 0 pages (at 0 pages/min), scraped 0 items (at 0 items/min)
2017-02-22 09:36:13 [scrapy] DEBUG: Telnet console listening on 127.0.0.1:6023
2017-02-22 09:36:14 [scrapy] DEBUG: Crawled (200) <GET http://fsfc.fsjw.gov.cn/hpms_project/roomView.jhtml?id=275930> (referer: None)
2017-02-22 09:36:14 [scrapy] ERROR: Spider error processing <GET http://fsfc.fsjw.gov.cn/hpms_project/roomView.jhtml?id=275930> (referer: None)
Traceback (most recent call last):
File "/System/Library/Frameworks/Python.framework/Versions/2.7/Extras/lib/python/twisted/internet/defer.py", line 577, in _runCallbacks
current.result = callback(current.result, *args, **kw)
File "/Users/apple/fangchang/fangchang/spiders/fsfc.py", line 34, in parse
all = re.findall(patten, response)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/re.py", line 177, in findall
return _compile(pattern, flags).findall(string)
TypeError: expected string or buffer
2017-02-22 09:36:14 [scrapy] DEBUG: Crawled (200) <GET http://fsfc.fsjw.gov.cn/hpms_project/roomView.jhtml?id=216870> (referer: None)
2017-02-22 09:36:14 [scrapy] ERROR: Spider error processing <GET http://fsfc.fsjw.gov.cn/hpms_project/roomView.jhtml?id=216870> (referer: None)
Traceback (most recent call last):
File "/System/Library/Frameworks/Python.framework/Versions/2.7/Extras/lib/python/twisted/internet/defer.py", line 577, in _runCallbacks
current.result = callback(current.result, *args, **kw)
File "/Users/apple/fangchang/fangchang/spiders/fsfc.py", line 34, in parse
all = re.findall(patten, response)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/re.py", line 177, in findall
return _compile(pattern, flags).findall(string)
TypeError: expected string or buffer
2017-02-22 09:36:14 [scrapy] DEBUG: Crawled (200) <GET http://fsfc.fsjw.gov.cn/hpms_project/roomView.jhtml?id=272620> (referer: None)
2017-02-22 09:36:15 [scrapy] ERROR: Spider error processing <GET http://fsfc.fsjw.gov.cn/hpms_project/roomView.jhtml?id=272620> (referer: None)
Traceback (most recent call last):
File "/System/Library/Frameworks/Python.framework/Versions/2.7/Extras/lib/python/twisted/internet/defer.py", line 577, in _runCallbacks
current.result = callback(current.result, *args, **kw)
File "/Users/apple/fangchang/fangchang/spiders/fsfc.py", line 34, in parse
all = re.findall(patten, response)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/re.py", line 177, in findall
return _compile(pattern, flags).findall(string)
TypeError: expected string or buffer
2017-02-22 09:36:15 [scrapy] DEBUG: Crawled (200) <GET http://fsfc.fsjw.gov.cn/hpms_project/roomView.jhtml?id=280540> (referer: None)
2017-02-22 09:36:15 [scrapy] ERROR: Spider error processing <GET http://fsfc.fsjw.gov.cn/hpms_project/roomView.jhtml?id=280540> (referer: None)
Traceback (most recent call last):
File "/System/Library/Frameworks/Python.framework/Versions/2.7/Extras/lib/python/twisted/internet/defer.py", line 577, in _runCallbacks
current.result = callback(current.result, *args, **kw)
File "/Users/apple/fangchang/fangchang/spiders/fsfc.py", line 34, in parse
all = re.findall(patten, response)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/re.py", line 177, in findall
return _compile(pattern, flags).findall(string)
TypeError: expected string or buffer
2017-02-22 09:36:15 [scrapy] DEBUG: Crawled (200) <GET http://fsfc.fsjw.gov.cn/hpms_project/roomView.jhtml?id=279080> (referer: None)
2017-02-22 09:36:16 [scrapy] ERROR: Spider error processing <GET http://fsfc.fsjw.gov.cn/hpms_project/roomView.jhtml?id=279080> (referer: None)
Traceback (most recent call last):
File "/System/Library/Frameworks/Python.framework/Versions/2.7/Extras/lib/python/twisted/internet/defer.py", line 577, in _runCallbacks
current.result = callback(current.result, *args, **kw)
File "/Users/apple/fangchang/fangchang/spiders/fsfc.py", line 34, in parse
all = re.findall(patten, response)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/re.py", line 177, in findall
return _compile(pattern, flags).findall(string)
TypeError: expected string or buffer
2017-02-22 09:36:16 [scrapy] DEBUG: Crawled (200) <GET http://fsfc.fsjw.gov.cn/hpms_project/roomView.jhtml?id=216480> (referer: None)
2017-02-22 09:36:16 [scrapy] ERROR: Spider error processing <GET http://fsfc.fsjw.gov.cn/hpms_project/roomView.jhtml?id=216480> (referer: None)
Traceback (most recent call last):
File "/System/Library/Frameworks/Python.framework/Versions/2.7/Extras/lib/python/twisted/internet/defer.py", line 577, in _runCallbacks
current.result = callback(current.result, *args, **kw)
File "/Users/apple/fangchang/fangchang/spiders/fsfc.py", line 34, in parse
all = re.findall(patten, response)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/re.py", line 177, in findall
return _compile(pattern, flags).findall(string)
TypeError: expected string or buffer
2017-02-22 09:36:17 [scrapy] DEBUG: Crawled (200) <GET http://fsfc.fsjw.gov.cn/hpms_project/roomView.jhtml?id=285240> (referer: None)
2017-02-22 09:36:17 [scrapy] ERROR: Spider error processing <GET http://fsfc.fsjw.gov.cn/hpms_project/roomView.jhtml?id=285240> (referer: None)
Traceback (most recent call last):
File "/System/Library/Frameworks/Python.framework/Versions/2.7/Extras/lib/python/twisted/internet/defer.py", line 577, in _runCallbacks
current.result = callback(current.result, *args, **kw)
File "/Users/apple/fangchang/fangchang/spiders/fsfc.py", line 34, in parse
all = re.findall(patten, response)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/re.py", line 177, in findall
return _compile(pattern, flags).findall(string)
TypeError: expected string or buffer
2017-02-22 09:36:17 [scrapy] DEBUG: Crawled (200) <GET http://fsfc.fsjw.gov.cn/hpms_project/roomView.jhtml?id=273120> (referer: None)
2017-02-22 09:36:18 [scrapy] ERROR: Spider error processing <GET http://fsfc.fsjw.gov.cn/hpms_project/roomView.jhtml?id=273120> (referer: None)
Traceback (most recent call last):
File "/System/Library/Frameworks/Python.framework/Versions/2.7/Extras/lib/python/twisted/internet/defer.py", line 577, in _runCallbacks
current.result = callback(current.result, *args, **kw)
File "/Users/apple/fangchang/fangchang/spiders/fsfc.py", line 34, in parse
all = re.findall(patten, response)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/re.py", line 177, in findall
return _compile(pattern, flags).findall(string)
TypeError: expected string or buffer
2017-02-22 09:36:18 [scrapy] DEBUG: Crawled (200) <GET http://fsfc.fsjw.gov.cn/hpms_project/roomView.jhtml?id=271670> (referer: None)
2017-02-22 09:36:18 [scrapy] ERROR: Spider error processing <GET http://fsfc.fsjw.gov.cn/hpms_project/roomView.jhtml?id=271670> (referer: None)
Traceback (most recent call last):
File "/System/Library/Frameworks/Python.framework/Versions/2.7/Extras/lib/python/twisted/internet/defer.py", line 577, in _runCallbacks
current.result = callback(current.result, *args, **kw)
File "/Users/apple/fangchang/fangchang/spiders/fsfc.py", line 34, in parse
all = re.findall(patten, response)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/re.py", line 177, in findall
return _compile(pattern, flags).findall(string)
TypeError: expected string or buffer
2017-02-22 09:36:18 [scrapy] DEBUG: Crawled (200) <GET http://fsfc.fsjw.gov.cn/hpms_project/roomView.jhtml?id=270070> (referer: None)
2017-02-22 09:36:19 [scrapy] ERROR: Spider error processing <GET http://fsfc.fsjw.gov.cn/hpms_project/roomView.jhtml?id=270070> (referer: None)
Traceback (most recent call last):
File "/System/Library/Frameworks/Python.framework/Versions/2.7/Extras/lib/python/twisted/internet/defer.py", line 577, in _runCallbacks
current.result = callback(current.result, *args, **kw)
File "/Users/apple/fangchang/fangchang/spiders/fsfc.py", line 34, in parse
all = re.findall(patten, response)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/re.py", line 177, in findall
return _compile(pattern, flags).findall(string)
TypeError: expected string or buffer
2017-02-22 09:36:19 [scrapy] DEBUG: Crawled (200) <GET http://fsfc.fsjw.gov.cn/hpms_project/roomView.jhtml?id=275930> (referer: None)
2017-02-22 09:36:19 [scrapy] ERROR: Spider error processing <GET http://fsfc.fsjw.gov.cn/hpms_project/roomView.jhtml?id=275930> (referer: None)
Traceback (most recent call last):
File "/System/Library/Frameworks/Python.framework/Versions/2.7/Extras/lib/python/twisted/internet/defer.py", line 577, in _runCallbacks
current.result = callback(current.result, *args, **kw)
File "/Users/apple/fangchang/fangchang/spiders/fsfc.py", line 34, in parse
all = re.findall(patten, response)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/re.py", line 177, in findall
return _compile(pattern, flags).findall(string)
TypeError: expected string or buffer
2017-02-22 09:36:20 [scrapy] DEBUG: Crawled (200) <GET http://fsfc.fsjw.gov.cn/hpms_project/roomView.jhtml?id=216870> (referer: None)
2017-02-22 09:36:20 [scrapy] ERROR: Spider error processing <GET http://fsfc.fsjw.gov.cn/hpms_project/roomView.jhtml?id=216870> (referer: None)
Traceback (most recent call last):
File "/System/Library/Frameworks/Python.framework/Versions/2.7/Extras/lib/python/twisted/internet/defer.py", line 577, in _runCallbacks
current.result = callback(current.result, *args, **kw)
File "/Users/apple/fangchang/fangchang/spiders/fsfc.py", line 34, in parse
all = re.findall(patten, response)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/re.py", line 177, in findall
return _compile(pattern, flags).findall(string)
TypeError: expected string or buffer
2017-02-22 09:36:20 [scrapy] DEBUG: Crawled (200) <GET http://fsfc.fsjw.gov.cn/hpms_project/roomView.jhtml?id=272620> (referer: None)
2017-02-22 09:36:20 [scrapy] ERROR: Spider error processing <GET http://fsfc.fsjw.gov.cn/hpms_project/roomView.jhtml?id=272620> (referer: None)
Traceback (most recent call last):
File "/System/Library/Frameworks/Python.framework/Versions/2.7/Extras/lib/python/twisted/internet/defer.py", line 577, in _runCallbacks
current.result = callback(current.result, *args, **kw)
File "/Users/apple/fangchang/fangchang/spiders/fsfc.py", line 34, in parse
all = re.findall(patten, response)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/re.py", line 177, in findall
return _compile(pattern, flags).findall(string)
TypeError: expected string or buffer
2017-02-22 09:36:21 [scrapy] DEBUG: Crawled (200) <GET http://fsfc.fsjw.gov.cn/hpms_project/roomView.jhtml?id=280540> (referer: None)
2017-02-22 09:36:21 [scrapy] ERROR: Spider error processing <GET http://fsfc.fsjw.gov.cn/hpms_project/roomView.jhtml?id=280540> (referer: None)
Traceback (most recent call last):
File "/System/Library/Frameworks/Python.framework/Versions/2.7/Extras/lib/python/twisted/internet/defer.py", line 577, in _runCallbacks
current.result = callback(current.result, *args, **kw)
File "/Users/apple/fangchang/fangchang/spiders/fsfc.py", line 34, in parse
all = re.findall(patten, response)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/re.py", line 177, in findall
return _compile(pattern, flags).findall(string)
TypeError: expected string or buffer
2017-02-22 09:36:21 [scrapy] DEBUG: Crawled (200) <GET http://fsfc.fsjw.gov.cn/hpms_project/roomView.jhtml?id=279080> (referer: None)
2017-02-22 09:36:21 [scrapy] ERROR: Spider error processing <GET http://fsfc.fsjw.gov.cn/hpms_project/roomView.jhtml?id=279080> (referer: None)
Traceback (most recent call last):
File "/System/Library/Frameworks/Python.framework/Versions/2.7/Extras/lib/python/twisted/internet/defer.py", line 577, in _runCallbacks
current.result = callback(current.result, *args, **kw)
File "/Users/apple/fangchang/fangchang/spiders/fsfc.py", line 34, in parse
all = re.findall(patten, response)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/re.py", line 177, in findall
return _compile(pattern, flags).findall(string)
TypeError: expected string or buffer
2017-02-22 09:36:22 [scrapy] DEBUG: Crawled (200) <GET http://fsfc.fsjw.gov.cn/hpms_project/roomView.jhtml?id=216480> (referer: None)
2017-02-22 09:36:22 [scrapy] ERROR: Spider error processing <GET http://fsfc.fsjw.gov.cn/hpms_project/roomView.jhtml?id=216480> (referer: None)
Traceback (most recent call last):
File "/System/Library/Frameworks/Python.framework/Versions/2.7/Extras/lib/python/twisted/internet/defer.py", line 577, in _runCallbacks
current.result = callback(current.result, *args, **kw)
File "/Users/apple/fangchang/fangchang/spiders/fsfc.py", line 34, in parse
all = re.findall(patten, response)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/re.py", line 177, in findall
return _compile(pattern, flags).findall(string)
TypeError: expected string or buffer
2017-02-22 09:36:22 [scrapy] DEBUG: Crawled (200) <GET http://fsfc.fsjw.gov.cn/hpms_project/roomView.jhtml?id=285240> (referer: None)
2017-02-22 09:36:22 [scrapy] ERROR: Spider error processing <GET http://fsfc.fsjw.gov.cn/hpms_project/roomView.jhtml?id=285240> (referer: None)
Traceback (most recent call last):
File "/System/Library/Frameworks/Python.framework/Versions/2.7/Extras/lib/python/twisted/internet/defer.py", line 577, in _runCallbacks
current.result = callback(current.result, *args, **kw)
File "/Users/apple/fangchang/fangchang/spiders/fsfc.py", line 34, in parse
all = re.findall(patten, response)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/re.py", line 177, in findall
return _compile(pattern, flags).findall(string)
TypeError: expected string or buffer
2017-02-22 09:36:23 [scrapy] DEBUG: Crawled (200) <GET http://fsfc.fsjw.gov.cn/hpms_project/roomView.jhtml?id=273120> (referer: None)
2017-02-22 09:36:23 [scrapy] ERROR: Spider error processing <GET http://fsfc.fsjw.gov.cn/hpms_project/roomView.jhtml?id=273120> (referer: None)
Traceback (most recent call last):
File "/System/Library/Frameworks/Python.framework/Versions/2.7/Extras/lib/python/twisted/internet/defer.py", line 577, in _runCallbacks
current.result = callback(current.result, *args, **kw)
File "/Users/apple/fangchang/fangchang/spiders/fsfc.py", line 34, in parse
all = re.findall(patten, response)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/re.py", line 177, in findall
return _compile(pattern, flags).findall(string)
TypeError: expected string or buffer
2017-02-22 09:36:23 [scrapy] DEBUG: Crawled (200) <GET http://fsfc.fsjw.gov.cn/hpms_project/roomView.jhtml?id=271670> (referer: None)
2017-02-22 09:36:23 [scrapy] ERROR: Spider error processing <GET http://fsfc.fsjw.gov.cn/hpms_project/roomView.jhtml?id=271670> (referer: None)
Traceback (most recent call last):
File "/System/Library/Frameworks/Python.framework/Versions/2.7/Extras/lib/python/twisted/internet/defer.py", line 577, in _runCallbacks
current.result = callback(current.result, *args, **kw)
File "/Users/apple/fangchang/fangchang/spiders/fsfc.py", line 34, in parse
all = re.findall(patten, response)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/re.py", line 177, in findall
return _compile(pattern, flags).findall(string)
TypeError: expected string or buffer
2017-02-22 09:36:24 [scrapy] DEBUG: Crawled (200) <GET http://fsfc.fsjw.gov.cn/hpms_project/roomView.jhtml?id=270070> (referer: None)
2017-02-22 09:36:24 [scrapy] ERROR: Spider error processing <GET http://fsfc.fsjw.gov.cn/hpms_project/roomView.jhtml?id=270070> (referer: None)
Traceback (most recent call last):
File "/System/Library/Frameworks/Python.framework/Versions/2.7/Extras/lib/python/twisted/internet/defer.py", line 577, in _runCallbacks
current.result = callback(current.result, *args, **kw)
File "/Users/apple/fangchang/fangchang/spiders/fsfc.py", line 34, in parse
all = re.findall(patten, response)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/re.py", line 177, in findall
return _compile(pattern, flags).findall(string)
TypeError: expected string or buffer
2017-02-22 09:36:24 [scrapy] INFO: Closing spider (finished)
2017-02-22 09:36:24 [scrapy] INFO: Dumping Scrapy stats:
{'downloader/request_bytes': 6968,
'downloader/request_count': 20,
'downloader/request_method_count/GET': 20,
'downloader/response_bytes': 187226,
'downloader/response_count': 20,
'downloader/response_status_count/200': 20,
'finish_reason': 'finished',
'finish_time': datetime.datetime(2017, 2, 22, 1, 36, 24, 382534),
'log_count/DEBUG': 21,
'log_count/ERROR': 20,
'log_count/INFO': 7,
'response_received_count': 20,
'scheduler/dequeued': 20,
'scheduler/dequeued/memory': 20,
'scheduler/enqueued': 20,
'scheduler/enqueued/memory': 20,
'spider_exceptions/TypeError': 20,
'start_time': datetime.datetime(2017, 2, 22, 1, 36, 13, 895738)}
2017-02-22 09:36:24 [scrapy] INFO: Spider closed (finished)
Could you help me see where it goes wrong? Many thanks, everyone!
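A note on the `TypeError: expected string or buffer` in the traceback: the failing line is `re.findall(patten, response)`, which hands the response object itself to `re.findall`, while `re` only accepts a string. The sketch below reproduces that with a stand-in class so it runs without Scrapy; `FakeResponse`, the `id=(\d+)` pattern, and the sample HTML are assumptions for illustration, not the original spider's code. On a real Scrapy response the string to search would likely be `response.body` (the Scrapy 1.0.5 shown in the log) or `response.text` in newer Scrapy releases.

```python
import re


class FakeResponse:
    """Stand-in for a Scrapy response; only a `text` attribute is modeled."""

    def __init__(self, text):
        self.text = text


def find_ids(response, pattern=r"id=(\d+)"):
    # Search the response's text, not the response object itself.
    return re.findall(pattern, response.text)


response = FakeResponse(
    '<a href="roomView.jhtml?id=275930">A</a> '
    '<a href="roomView.jhtml?id=216870">B</a>'
)

# Passing the object itself raises TypeError, as in the log.
try:
    re.findall(r"id=(\d+)", response)
    raised = False
except TypeError:
    raised = True

ids = find_ids(response)
print(raised, ids)
```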