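I am using Splash to simulate scrolling down with the mouse so that the page loads more data, but it has no effect. No error is raised and the data shown on the initial page is scraped, but the data that loads after scrolling down is not. Is the position I set with splash.scroll_position = {1000,700} wrong? And how do I find the coordinates that trigger the load-more behavior?
The code is as follows: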
import scrapy
from scrapy.selector import Selector
from scrapy_splash import SplashRequest
import json


class gougou(scrapy.Spider):
    name = 'gg'
    allow_domains = ['baidu.com']
    start_urls = ('http://www.tudou.com/sec/%E8%90%8C%E7%89%A9?spm=a2h28.8313475.nav.dn_sec4',)
    f = open("tudou.txt", 'a')

    def start_requests(self):
        script = """
        function main(splash)
            splash.scroll_position = {1000,700}
            splash:go(splash.args.url)
            splash:wait(3)
            return {
                html = splash:html()
            }
        end
        """
        for url in self.start_urls:
            yield scrapy.Request(url=url, callback=self.parse, meta={
                'splash': {
                    'args': {'lua_source': script},
                    'endpoint': 'execute'
                }
            })

    def parse(self, response):
        scs = Selector(response)
        message = scs.xpath('//img[contains(@class, "v-thumb__pic")]/@alt').extract()
        for sc in message:
            self.f.write(sc)
            self.f.write("\r\n")
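I have run this many times without any errors, but the result only ever contains 30 entries. Any pointers would be much appreciated!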
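
One direction that might be worth trying, offered as an untested sketch rather than a confirmed fix: in the script above, splash.scroll_position is assigned before splash:go runs, which likely means it does not apply to the page that is loaded afterwards. The sketch below scrolls only after the page has loaded, calling window.scrollTo through splash:runjs several times and waiting after each scroll. The names SCROLL_SCRIPT and build_splash_meta, and the scroll_count and wait arguments, are made up for illustration. It is also an assumption that this page loads more items in response to scrolling rather than a button click.

```python
# Hypothetical Lua script: load the page first, then scroll to the bottom
# repeatedly so that scroll-triggered loading has a chance to fire.
SCROLL_SCRIPT = """
function main(splash)
    assert(splash:go(splash.args.url))
    splash:wait(splash.args.wait)
    -- Scroll after the page exists, pausing so new items can arrive.
    for _ = 1, splash.args.scroll_count do
        splash:runjs("window.scrollTo(0, document.body.scrollHeight);")
        splash:wait(splash.args.wait)
    end
    return {html = splash:html()}
end
"""


def build_splash_meta(scroll_count=5, wait=2.0):
    """Build the 'splash' meta dict for a scrapy.Request (illustrative helper).

    scroll_count and wait are passed through to the Lua script via splash.args.
    """
    if scroll_count < 0:
        raise ValueError("scroll_count must not be negative")
    return {
        'splash': {
            'endpoint': 'execute',
            'args': {
                'lua_source': SCROLL_SCRIPT,
                'scroll_count': scroll_count,
                'wait': wait,
            },
        }
    }
```

The returned dict could be passed as meta=build_splash_meta() to scrapy.Request in start_requests, in place of the inline meta dict. How many scrolls and how long a wait are needed would have to be found by trial on the actual page.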