Scrapy: I want to simulate logging in to the Tianyancha (天眼查) website, but the site requires a slider-alignment captcha. What can I do to make the simulated login succeed?

Here is the core code of my simulated login:

def __init__(self):
        dcap = dict(webdriver.DesiredCapabilities.PHANTOMJS)  # set the user agent
        # dcap[
        #     "phantomjs.page.settings.userAgent"] = "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.9; rv:25.0) Gecko/20100101 Firefox/25.0"
        self.driver = webdriver.PhantomJS(
            executable_path='C:\\Users\\gt\\Desktop\\tutorial\\phantomjs.exe',
            desired_capabilities=dcap)

        self.driver.maximize_window()

def start_requests(self):
        print("start request!!!")
        yield scrapy.Request(self.login_url, callback=self.parse)

def parse(self, response):
        print("parse!!!")

        self.driver.get(response.url)
        self.set_sleep_time()
        # print(self.driver.page_source)
        self.driver.find_element_by_xpath('//*[@id="web-content"]/div/div[2]/div/div[2]/div/div[3]/div[1]/div[1]').click()
        print("CLICK LEFT")
        time.sleep(1)
        temp = self.driver.find_element_by_xpath('//*[@id="web-content"]/div/div[2]/div/div[2]/div/div[3]/div[3]/div[2]/input')
        temp.click()
        temp.send_keys(PHONE)
        print("PHONE SENT")
        self.driver.find_element_by_xpath('//*[@id="web-content"]/div/div[2]/div/div[2]/div/div[3]/div[1]/div[2]').click()
        print("CLICK RIGHT")
        time.sleep(5)
        temp2 = self.driver.find_element_by_xpath('//*[@id="web-content"]/div/div[2]/div/div[2]/div/div[3]/div[2]/div[3]/input')
        temp2.click()
        temp2.send_keys(PASSWORD)
        print("PASSWORD SENT")
        self.driver.find_element_by_xpath('//*[@id="web-content"]/div/div[2]/div/div[2]/div/div[3]/div[2]/div[5]').click()
        self.set_sleep_time()
        time.sleep(3)
        # print(self.driver.page_source)
        print("准备进入解析。。。。。")
        cookies = self.driver.get_cookies()
        # print(cookies)

        with open('data/url_list.txt', mode='r', encoding='utf-8') as f:
            for line in f.readlines():
                url = str(line.replace('\r', '').replace('\n', '').replace('=', ''))
                print(url)
                time.sleep(1)
                print("Pausing for 1 second...")
                request = scrapy.Request(url, cookies=cookies,
                                         callback=self.sub_parse)
                yield request
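
One detail worth checking in the cookie hand-off above: driver.get_cookies() returns Selenium-style dicts that carry extra keys such as httpOnly and expiry, while the simplest form scrapy.Request accepts is a plain name-to-value dict. A minimal conversion sketch:

# Reduce each Selenium cookie dict to a name/value pair before
# handing the mapping to scrapy.Request(url, cookies=...).
cookies = {c['name']: c['value'] for c in self.driver.get_cookies()}
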
1 Answer

You can solve the slider verification by hand: pause the crawler when the captcha appears, and let it resume once the manual step is done. Note that PhantomJS is headless, so for this to work you need to drive a visible browser (e.g. Chrome) in which you can actually drag the slider.
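
A minimal sketch of that pause-and-resume approach, assuming you swap PhantomJS for a visible Chrome window. The login_url value and the sub_parse callback are placeholders standing in for your own, and chromedriver is assumed to be on PATH; input() simply blocks the spider until you have dragged the slider by hand:

import scrapy
from selenium import webdriver


class TycLoginSpider(scrapy.Spider):
    name = 'tyc_login'
    login_url = 'https://www.tianyancha.com/login'  # placeholder; use the real login page

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        # A visible browser is required so a human can drag the slider;
        # headless PhantomJS cannot be operated by hand.
        self.driver = webdriver.Chrome()

    def start_requests(self):
        yield scrapy.Request(self.login_url, callback=self.parse)

    def parse(self, response):
        self.driver.get(response.url)
        # ... fill in phone number and password exactly as in the question ...

        # Hand control to a human: the spider blocks here until the
        # slider captcha has been solved in the browser window.
        input("Drag the slider in the browser, then press Enter to continue... ")

        # The Selenium session is now authenticated; pass its cookies on
        # to Scrapy for the rest of the crawl.
        cookies = {c['name']: c['value'] for c in self.driver.get_cookies()}
        with open('data/url_list.txt', encoding='utf-8') as f:
            for line in f:
                url = line.strip().replace('=', '')
                yield scrapy.Request(url, cookies=cookies,
                                     callback=self.sub_parse)

    def sub_parse(self, response):
        pass  # your existing per-page parsing logic

Because input() blocks Scrapy's reactor thread, the whole crawl pauses while you solve the captcha, which is exactly the behavior described above.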
