<input class="input-xlarge focused" id="listCode" name="listCode" readonly="true" type="text" value="001">
</input>
<input class="input-xlarge focused" id="type" name="type" readonly="true" type="text" value="002">
</input>
<input class="input-xlarge focused" id="yyc" name="yyc" readonly="true" type="text" value="yyzz">
</input>


How can I use regular expressions in Python 3.5 to extract the name and value attributes from each line and add them to a dictionary?
The final result should be:

result = {
    'listCode': '001',
    'type': '002',
    'yyc': 'yyzz'
}

Alternatively, can this be done with BeautifulSoup instead of regular expressions? Any pointers would be appreciated. Thanks.


Wouldn't it be easier to just use the built-in re module?

s = '''<input class="input-xlarge focused" id="listCode" name="listCode" readonly="true" type="text" value="001">
</input>
<input class="input-xlarge focused" id="type" name="type" readonly="true" type="text" value="002">
</input>
<input class="input-xlarge focused" id="yyc" name="yyc" readonly="true" type="text" value="yyzz">
</input>'''

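import re

# Match name="..." and value="..." on the same line; [^"]* stops at the closing quote
pattern = r'name="([^"]*)".*?value="([^"]*)"'
result = dict()
for match in re.finditer(pattern, s):
    # group(1) is the name attribute, group(2) is the value attribute
    result[match.group(1)] = match.group(2)

print(result)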


{'yyc': 'yyzz', 'type': '002', 'listCode': '001'}
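The pattern above relies on name appearing before value inside each tag. As a minimal sketch that does not depend on attribute order, one could first isolate each input tag and then collect its quoted attributes. The helper name inputs_to_dict, the two compiled patterns, and the reordered second tag in the sample are assumptions made for illustration; they are not from the original post, and a regex like this is still not a full HTML parser.

```python
import re

# Sample input; the second tag deliberately lists value before name
html = '''<input class="input-xlarge focused" id="listCode" name="listCode" readonly="true" type="text" value="001">
</input>
<input id="type" value="002" name="type" type="text">
</input>'''

# Hypothetical helper patterns: one for whole <input ...> tags,
# one for key="value" pairs inside a tag
TAG_RE = re.compile(r'<input\b[^>]*>', re.IGNORECASE)
ATTR_RE = re.compile(r'([\w-]+)\s*=\s*"([^"]*)"')


def inputs_to_dict(text):
    """Map each input's name attribute to its value attribute."""
    result = {}
    for tag in TAG_RE.findall(text):
        # findall returns (key, value) tuples, which dict() accepts directly
        attrs = dict(ATTR_RE.findall(tag))
        # Skip tags that lack either attribute
        if 'name' in attrs and 'value' in attrs:
            result[attrs['name']] = attrs['value']
    return result


fields = inputs_to_dict(html)
print(fields)
```

This only handles double-quoted attribute values; single-quoted or unquoted attributes would need a broader pattern or a real parser.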

Accepted answer

Reference code is below; for BeautifulSoup usage, see the official documentation.

from bs4 import BeautifulSoup

html = '<input class="input-xlarge focused" id="listCode" name="listCode" readonly="true" type="text" value="001"></input><input class="input-xlarge focused" id="type" name="type" readonly="true" type="text" value="002"></input><input class="input-xlarge focused" id="yyc" name="yyc" readonly="true" type="text" value="yyzz"></input>'
# The "lxml" parser requires the third-party lxml package
soup = BeautifulSoup(html, "lxml")
content = dict()
# Find every <input> tag whose class attribute is "input-xlarge focused"
datas = soup.find_all("input", class_="input-xlarge focused")
for data in datas:
    # Tag attributes are read like dictionary keys
    content[data["name"]] = data["value"]

print(content)
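If installing bs4 and lxml is not an option, the same extraction can be sketched with the standard library's html.parser module. This is a different technique from the answer above, not the answerer's code. The class name InputCollector and its fields attribute are hypothetical names chosen for this example.

```python
from html.parser import HTMLParser


class InputCollector(HTMLParser):
    """Collect name -> value pairs from <input> tags (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        # HTMLParser passes tag and attribute names in lower case
        if tag != 'input':
            return
        attr_map = dict(attrs)
        name = attr_map.get('name')
        value = attr_map.get('value')
        # Ignore inputs that are missing either attribute
        if name is not None and value is not None:
            self.fields[name] = value


html = (
    '<input class="input-xlarge focused" id="listCode" name="listCode" '
    'readonly="true" type="text" value="001"></input>'
    '<input class="input-xlarge focused" id="type" name="type" '
    'readonly="true" type="text" value="002"></input>'
    '<input class="input-xlarge focused" id="yyc" name="yyc" '
    'readonly="true" type="text" value="yyzz"></input>'
)

collector = InputCollector()
collector.feed(html)
collector.close()
print(collector.fields)
```

Unlike the BeautifulSoup version, this sketch does not filter by class; it collects every input tag that has both attributes.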