Regular expression fails to capture the data

>>> str=''' <td>
... ...                                     
... ...                                         
... ...                                         应用推广
... ...                                     
... ...                                 </td>
... ...                                 <td>
... ...                                     
... ...                                         大图广告
... ...                                         
... ...                                         
... ...                                         
... ...                                     
... ...                                 </td>
... ...                                 <td>
... ...                                     信息流大图D16
... ...                                 </td>'''
>>> s=str.search('<td>.*?</td>.*?<td>.*?</td>.*?</td>(.*?)</td>',str)
Traceback (most recent call last):
  File "<input>", line 1, in <module>
AttributeError: 'str' object has no attribute 'search'
>>> s=re.search('<td>.*?</td>.*?<td>.*?</td>.*?</td>(.*?)</td>',str)
>>> s
>>> s=re.search('<td>.*?</td>.*?<td>.*?</td>.*?<td>(.*?)</td>',str)
>>> s
>>> print s
None
>>> s=re.search('<td>.*?</td>.*?<td>.*?</td>.*?<td>(.*?)</td>')
Traceback (most recent call last):
  File "<input>", line 1, in <module>
TypeError: search() takes at least 2 arguments (1 given)
>>> s=re.search('<td>.*?</td>.*?<td>.*?</td>.*?<td>(.*?)</td>',str)
>>> s
>>> str
' <td>\n...                                     \n...                                         \n...                                         \xe5\xba\x94\xe7\x94\xa8\xe6\x8e\xa8\xe5\xb9\xbf\n...                                     \n...                                 </td>\n...                                 <td>\n...                                     \n...                                         \xe5\xa4\xa7\xe5\x9b\xbe\xe5\xb9\xbf\xe5\x91\x8a\n...                                         \n...                                         \n...                                         \n...                                     \n...                                 </td>\n...                                 <td>\n...                                     \xe4\xbf\xa1\xe6\x81\xaf\xe6\xb5\x81\xe5\xa4\xa7\xe5\x9b\xbeD16\n...                                 </td>'
>>> s=re.search('.*?<td>.*?</td>.*?<td>.*?</td>.*?<td>(.*?)</td>',str)
>>> s
>>> s=re.search('.*?<td>.*?</td>.*?<td>.*?</td>.*?<td>(.*?)</td>',str)
>>> s
>>> s=re.search('\s*<td>.*?</td>.*?<td>.*?</td>.*?<td>(.*?)</td>',str)
>>> s
>>> s=re.search(u'\s*<td>.*?</td>.*?<td>.*?</td>.*?<td>(.*?)</td>',str)
>>> s

Why does every one of these searches come back empty?

1 answer

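Replace every .* with [.\S\s]* (keeping the trailing ? where the match should stay lazy), then call strip() on the captured text. Without the re.DOTALL flag, . does not match a newline, so none of the patterns above can get past the line breaks inside the <td> cells; the class [.\S\s] matches any character, newlines included.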

r'<td>([.\S\s]*?)</td>'
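
To make the point concrete, here is a minimal sketch, assuming Python 3 and a cleaned-up copy of the markup from the question (the pasted "..." prompt residue is left out). The names html, pattern_dotall, pattern_class and cells are illustrative and not from the original post. It shows two ways to let the pattern cross line breaks: the re.DOTALL flag, and the [\s\S] class (equivalent to [.\S\s], since a dot inside brackets is just a literal dot).

```python
import re

# Sample markup modeled on the question. The variable is called html rather
# than str so that the built-in str type is not shadowed.
html = """ <td>

        应用推广

    </td>
    <td>

        大图广告

    </td>
    <td>
        信息流大图D16
    </td>"""

original = r'<td>.*?</td>.*?<td>.*?</td>.*?<td>(.*?)</td>'

# Without re.DOTALL, "." stops at every newline, so nothing matches.
no_flag = re.search(original, html)

# Option 1: re.DOTALL lets "." match newlines, so the pattern works unchanged.
pattern_dotall = re.compile(original, re.DOTALL)
third_dotall = pattern_dotall.search(html).group(1).strip()

# Option 2: [\s\S] matches any character, newlines included, with no flag.
pattern_class = re.compile(
    r'<td>[\s\S]*?</td>[\s\S]*?<td>[\s\S]*?</td>[\s\S]*?<td>([\s\S]*?)</td>'
)
third_class = pattern_class.search(html).group(1).strip()

# Collecting every cell at once is often simpler than chaining three <td> groups.
cells = [cell.strip() for cell in re.findall(r'<td>(.*?)</td>', html, re.DOTALL)]

print(no_flag)
print(third_dotall)
print(cells)
```

The quantifiers are kept lazy (*?) on purpose: a greedy [\s\S]* would run from the first <td> all the way to the last </td> and capture all three cells in one group. For anything beyond a small, regular snippet like this, an HTML parser is usually a safer choice than a regular expression.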