How do I use a regex in Python to extract the title?

<h3 class="title h1 js-title"> <a href="/out/7561938" target="426428" data-main-tab="/out/7561938" data-new-tab="view/dji.com?c=7561938"> $300 Off Phantom 3 Standard </a> </h3>

python

2 answers

vimac

Posted on 2016-11-11

Think of it the other way around: picking the text out from between the tags is awkward, so just strip out all the tags instead.

import re
html = '<h3 class="title h1 js-title"> <a href="/out/7561938" target="426428" data-main-tab="/out/7561938" data-new-tab="view/dji.com?c=7561938"> $300 Off Phantom 3 Standard </a> </h3>'
# Remove every tag (non-greedy match from '<' to the next '>'), then trim whitespace.
title = re.sub(r'<.+?>', '', html).strip()
print(title)  # $300 Off Phantom 3 Standard
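
The same tag-stripping idea can be sketched as a reusable helper. The name `strip_tags` and the entity decoding via `html.unescape` are additions for illustration, not part of the answer above. The sketch assumes no attribute value contains a literal `>`, which a simple pattern like this cannot handle.

```python
import re
from html import unescape

# One tag: '<', then anything except '>', then '>'.
TAG_RE = re.compile(r'<[^>]+>')

def strip_tags(fragment):
    """Remove every tag from an HTML fragment and return the cleaned text."""
    text = TAG_RE.sub('', fragment)
    # Decode entities such as &amp; and collapse runs of whitespace.
    return ' '.join(unescape(text).split())

html = ('<h3 class="title h1 js-title"> <a href="/out/7561938" target="426428" '
        'data-main-tab="/out/7561938" data-new-tab="view/dji.com?c=7561938"> '
        '$300 Off Phantom 3 Standard </a> </h3>')
print(strip_tags(html))
```

For anything beyond a small, well-formed fragment like this one, a real HTML parser is usually the safer choice.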

小白兔

Posted on 2016-11-12

import re
html = '<h3 class="title h1 js-title"> <a href="/out/7561938" target="426428" data-main-tab="/out/7561938" data-new-tab="view/dji.com?c=7561938"> $300 Off Phantom 3 Standard </a> </h3>'
# Match the text that follows a digit, '">' and one whitespace character, up to '</a>'.
print(re.search(r'(?<=\d">\s).+?(?=</a>)', html).group(0))

This assumes that the xx in view/dji.com?c=xx is always made up of digits.
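
If relying on the trailing digit is a concern, one possible alternative is to capture whatever sits between the opening `<a ...>` tag and `</a>`. This is only a sketch: the function name `anchor_text` is hypothetical, and the pattern assumes a single, non-nested anchor with no `>` inside its attribute values.

```python
import re

# Opening <a ...> tag, then the shortest possible text up to the closing </a>.
A_TEXT_RE = re.compile(r'<a\b[^>]*>(.*?)</a>', re.DOTALL | re.IGNORECASE)

def anchor_text(fragment):
    """Return the stripped text of the first <a> element, or None if there is none."""
    match = A_TEXT_RE.search(fragment)
    return match.group(1).strip() if match else None

html = ('<h3 class="title h1 js-title"> <a href="/out/7561938" target="426428" '
        'data-main-tab="/out/7561938" data-new-tab="view/dji.com?c=7561938"> '
        '$300 Off Phantom 3 Standard </a> </h3>')
print(anchor_text(html))
```

Returning `None` when nothing matches avoids the `AttributeError` that calling `.group()` on a failed `re.search` would raise.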
