When using Scrapy, if I override start_requests() to generate the URLs to crawl in a loop, do I still need to define start_urls?
For example:
import scrapy
from scrapy_redis.spiders import RedisSpider

class demoSpider(RedisSpider):
    name = "demospider"
    redis_key = 'demospider:start_urls'
    start_urls = ['http://www.example.com']

    def start_requests(self):
        pages = []
        for i in range(1, 10):
            url = 'http://www.example.com/?page=%s' % i
            pages.append(scrapy.Request(url))
        return pages
No, you don't need it. Even if you define start_urls, it has no effect: http://doc.scrapy.org/en/latest/topics/spiders.html#scrapy.spiders.Spider.start_requests
Once you override start_requests(), Scrapy no longer generates Requests from start_urls. Have a look at the source code.
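The point can be demonstrated without running a crawler. Below is a simplified model of the inheritance involved (plain Python, not the actual Scrapy source): the base class's default start_requests() is the only place start_urls is ever read, so an override that never touches self.start_urls makes the attribute dead weight.

    # Simplified sketch of the Spider/override relationship (assumption:
    # this mirrors Scrapy's behavior, but it is not the real Scrapy code).
    class Spider:
        start_urls = []

        def start_requests(self):
            # Default implementation: the ONLY consumer of start_urls.
            # Real Scrapy yields scrapy.Request(url, dont_filter=True) here.
            for url in self.start_urls:
                yield url

    class DemoSpider(Spider):
        start_urls = ['http://www.example.com']  # never consulted below

        def start_requests(self):
            # The override replaces the default entirely, so start_urls
            # is simply never read.
            for i in range(1, 10):
                yield 'http://www.example.com/?page=%s' % i

    requests = list(DemoSpider().start_requests())
    # Only the nine page URLs appear; 'http://www.example.com' is absent.

Deleting the start_urls line from the spider in the question therefore changes nothing.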