BeautifulSoup Python crawler cannot find the target tag

Posted on
2016-05-13

Code

import requests
from bs4 import BeautifulSoup

url = 'http://product.pconline.com.cn/mobile/'
response = requests.get(url)
html = response.text
print(html)

# Parse the page and collect every <img> tag whose class is "pic"
soup = BeautifulSoup(html, 'lxml')
site = soup.find_all('img', class_="pic")
print(site)

Target site: http://product.pconline.com.cn/mobile/
I want to scrape the phone image tags, but when I run the code above, the printed site is an empty list.

HTML snippet of the image section:

<img class="pic" alt="华为Mate8/3GB+32GB版" title="麒麟950处理器、6寸超大屏、超高屏占比、超窄边框，3GB+32GB全网通版本" src="http://img.pconline.com.cn/images/product/5807/580761/q_sn8.jpg" height="150" width="200">

Tags: python, beautifulsoup, web crawler, html, css

3 answers

haofly

Posted on
2016-05-15

It works on my end. It might be an encoding problem, or a problem with the lxml extension.
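
To make the encoding guess concrete, here is a minimal sketch. It assumes the page is served as GBK bytes without a declared charset, which the thread does not confirm; the sample markup is made up for illustration.

```python
# Assumption: the server sends GBK bytes; the thread does not confirm
# the page's real encoding, and this markup is a made-up sample.
raw = '<img class="pic" alt="华为Mate8">'.encode('gbk')

# ISO-8859-1 is the codec requests falls back to when a text/* response
# declares no charset in its headers.
wrong = raw.decode('iso-8859-1')
right = raw.decode('gbk')

print('华为' in wrong)         # Chinese text is garbled by the wrong codec
print('华为' in right)         # and intact with the right one
print('class="pic"' in wrong)  # the ASCII markup survives either way
```

If the encoding really is the issue, setting response.encoding = 'gbk' (or response.encoding = response.apparent_encoding) before reading response.text may help. Note that the ASCII markup survives a wrong decode in this sample, so a bad encoding would garble the alt and title text but would probably not, by itself, hide the class="pic" tags from find_all.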

happen

Posted on
2016-05-28

Updated on
2016-05-28

Try a different parser:
soup = BeautifulSoup(html, 'html.parser')
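
One way to act on this suggestion is to run the same search under each parser and compare the counts. The sketch below is only an illustration: count_pics and the sample markup are hypothetical names and data, and parsers that are not installed are reported as None.

```python
from bs4 import BeautifulSoup, FeatureNotFound


def count_pics(html, parsers=('lxml', 'html.parser', 'html5lib')):
    """Return {parser name: number of img.pic tags found, or None if unavailable}."""
    counts = {}
    for name in parsers:
        try:
            soup = BeautifulSoup(html, name)
        except FeatureNotFound:
            # bs4 raises this when the requested parser is not installed
            counts[name] = None
            continue
        counts[name] = len(soup.find_all('img', class_='pic'))
    return counts


# Made-up sample markup; with the real page, pass response.text instead.
sample = '<div><img class="pic" alt="phone" src="a.jpg"><img class="logo" src="b.jpg"></div>'
print(count_pics(sample))
```

If the counts differ between parsers on the real page, the parser is the likely culprit; if every parser reports zero, the tags are probably missing from (or written differently in) the HTML that requests actually received.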

prolifes

Posted on
2016-05-28

Updated on
2016-05-28

pyquery is the way to go; its syntax is the same as jQuery's.

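import requests
import pyquery

url = 'http://product.pconline.com.cn/mobile/'
r = requests.get(url)

# Rename '#src' to 'jsrc' in the raw HTML before parsing
html = r.text.replace('#src', 'jsrc')
Q = pyquery.PyQuery(html)

# Print the image URL, alt text and title of every img.pic element
for img in Q('img.pic'):
    print(Q(img).attr('jsrc'), Q(img).attr('alt'), Q(img).attr('title'))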
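
For anyone who would rather stay with BeautifulSoup, the same trick can be sketched as follows. This assumes, as the replace call above suggests but the thread does not state outright, that the raw HTML carries the image URL in a '#src' attribute; extract_pics and the sample markup are hypothetical.

```python
from bs4 import BeautifulSoup


def extract_pics(html):
    """Collect url, alt and title for every img.pic in the given HTML."""
    # Same trick as the pyquery answer: rename '#src' to a plain attribute name.
    soup = BeautifulSoup(html.replace('#src', 'jsrc'), 'html.parser')
    results = []
    for img in soup.find_all('img', class_='pic'):
        results.append({
            # Fall back to the ordinary src attribute when jsrc is absent
            'url': img.get('jsrc') or img.get('src'),
            'alt': img.get('alt'),
            'title': img.get('title'),
        })
    return results


# Made-up sample modelled on the snippet in the question, with '#src' assumed.
sample = ('<img class="pic" alt="华为Mate8/3GB+32GB版" '
          '#src="http://img.pconline.com.cn/images/product/5807/580761/q_sn8.jpg">')
for pic in extract_pics(sample):
    print(pic['url'], pic['alt'])
```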
