Python Selenium crawler: how do I scrape specific page numbers when the URL does not change on pagination?

Me1ody丶

Posted on 2017-10-26

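When using Selenium to crawl a site whose URL stays the same as you page through it, how can I scrape only a given range of pages? (For example, I want pages 10-20 rather than starting from page 1.)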

My current code (which crawls pages 1-9) is:

from selenium import webdriver
from selenium.webdriver.common.by import By
from bs4 import BeautifulSoup
import time

browser = webdriver.Chrome()
browser.get("http://lol.qq.com/guide/list.shtml")
for i in range(1, 10):
    # Parse the list currently rendered in the browser
    html = browser.page_source
    soup = BeautifulSoup(html, 'lxml')
    all_news = soup.find('ul', id='list_content').find_all('li')
    for news in all_news:
        new_info = {}
        new_info['title'] = news.find('p', class_='btn-a').get_text()
        new_info['read_num'] = news.find('p', class_='bfl-playing').get_text()[4:]
        new_info['time'] = news.find('span', class_='recommend-div-div-raiders-date fr').get_text()
        print(new_info)
    print('Page %d' % i)
    # find_elements returns an empty list (rather than raising) when there is no "next" button
    next_buttons = browser.find_elements(By.CLASS_NAME, 'pagenext')
    if not next_buttons:
        break
    next_buttons[0].click()
    time.sleep(1)  # give the list time to reload

browser.quit()

Any advice would be appreciated.

python

1 Answer

lejoy

✓ Accepted

第<input size="4" onkeydown="if(event.keyCode==13) pageShow.to(this.value)" value="1" style="width:20px">页
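
The quoted element appears to be the site's own "go to page" box (第 … 页 means "Page …"). Its onkeydown handler calls pageShow.to(this.value) when Enter is pressed, which suggests the same call can be made from Selenium via execute_script to land on any page directly. The following is a minimal sketch under that assumption, not a tested solution for this site: `pageShow` is assumed to be a global JavaScript object on the page, `jump_to_page`, `crawl_pages`, `parse_guides`, and `main` are hypothetical helper names, the selectors are copied from the question, and the fixed sleep is a crude stand-in for a proper wait.

```python
import time


def jump_to_page(driver, page, wait_seconds=1.0):
    """Ask the site's own pager to display `page` (1-based)."""
    # Assumption: `pageShow` is a global JS object, as the onkeydown handler
    # in the quoted <input> suggests. The handler passes the input's string
    # value, so a string is passed here as well.
    driver.execute_script("pageShow.to(arguments[0]);", str(page))
    time.sleep(wait_seconds)  # crude wait for the list to re-render


def crawl_pages(driver, first, last, parse_page, wait_seconds=1.0):
    """Return the combined parse_page(html) results for pages first..last."""
    results = []
    for page in range(first, last + 1):
        jump_to_page(driver, page, wait_seconds)
        results.extend(parse_page(driver.page_source))
    return results


def parse_guides(html):
    """Extract the same fields as the question's code from one page of HTML."""
    from bs4 import BeautifulSoup  # imported lazily; only needed for real pages

    soup = BeautifulSoup(html, "lxml")
    items = []
    for news in soup.find("ul", id="list_content").find_all("li"):
        items.append({
            "title": news.find("p", class_="btn-a").get_text(),
            "read_num": news.find("p", class_="bfl-playing").get_text()[4:],
            "time": news.find(
                "span", class_="recommend-div-div-raiders-date fr"
            ).get_text(),
        })
    return items


def main():
    """Crawl pages 10-20 of the guide list and print each entry."""
    from selenium import webdriver  # imported lazily; needs a local Chrome

    browser = webdriver.Chrome()
    try:
        browser.get("http://lol.qq.com/guide/list.shtml")
        for info in crawl_pages(browser, 10, 20, parse_guides):
            print(info)
    finally:
        browser.quit()
```

Calling main() would attempt the 10-20 range from the question. If pageShow.to turns out to expect a number rather than a string, passing page instead of str(page) is the only change needed.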