For example, given ...<xxx><div xxx>abc</div></xxx>..., how do I extract the abc?
This is what I have now, but it doesn't work. How should I change it?
sed -n 's/<div class=panel_title align=left>\([^>]*\)[^<\/div>]\+<\/div>/\1/p' a.html
The relevant fragment of a.html looks like this (it is all on one line; I want to extract Problem Description
):<tr><td align=center><h1 style='color:#1A5CC8'>A + B Problem</h1><font><b><span style='font-family:Arial;font-size:12px;font-weight:bold;color:green'>Time Limit: 2000/1000 MS (Java/Others) Memory Limit: 65536/32768 K (Java/Others)<br>Total Submission(s): 481857 Accepted Submission(s): 152572<br></span></b></font><br><br><div class=panel_title align=left>
Problem Description</div> <div class=panel_content>Calculate <i>A + B</i>.<br></div><div class=panel_bottom> </div><br><div class=panel_title align=left>Input</div> <div class=panel_content>Each line will contain two integers <i>A</i> and <i>B</i>. Process to end of file.<br></div><div class=panel_bottom> </div><br><div class=panel_title align=left>Output</div> <div class=panel_content>For each case, output <i>A + B</i> in one line.<br></div><div class=panel_bottom> </div><br><div class=panel_title align=left>Sample Input</div><div class=panel_content><pre><div style="font-family:Courier New,Courier,monospace;">1 1</div></pre></div><div class=panel_bottom> </div><br><div class=panel_title align=left>Sample Output</div><div class=panel_content><pre><div style="font-family:Courier New,Courier,monospace;">2</div></pre></div><div class=panel_bottom> </div><br><div class=panel_title align=left>Author</div> <div class=panel_content>HDOJ</div><div class=panel_bottom> </div><br><div class=panel_title align=left>Recommend</div> <div class=panel_content></div><div class=panel_bottom> </div><br><center style='font-size:15px;font-family:Arial;font-weight:bold;color:#1A5CC8'><a href='statistic.php?pid=1000'>Statistic</a> | <a href='submit.php?pid=1000'>Submit</a> | <a href="./discuss/problem/list.php?problemid=1000">Discuss</a> | <a href='note/note.php?pid=1000'>Note</a><br></td></tr><tr>
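A note on why the attempt above fails: `[^<\/div>]` is a bracket expression, so it matches one character that is none of `<`, `/`, `d`, `i`, `v`, `>`; it does not mean "anything other than </div>". And because the pattern has no `.*` on either side, `s` replaces only the matched div and prints the rest of the line unchanged. A small sketch of both effects, using a made-up one-line input (the `X` and `Y` markers are hypothetical stand-ins for the surrounding HTML) and `\{1,\}` in place of `\+`, which is a GNU extension:

```shell
# Hypothetical input: one title div with a marker character on each side.
input='X<div class=panel_title align=left>Problem Description</div>Y'

# Same pattern as the attempt, with \+ written portably as \{1,\}.
# The bracket expression has to consume at least one character right
# before </div>, so the final "n" is taken out of the captured group.
out=$(printf '%s\n' "$input" |
  sed -n 's/<div class=panel_title align=left>\([^>]*\)[^<\/div>]\{1,\}<\/div>/\1/p')

# Prints XProblem DescriptioY: the last letter is lost, X and Y are kept.
printf '%s\n' "$out"
```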
Problem solved.
sed -n 's/.*<div class=panel_title align=left>\([^<]*\)<\/div>.*/\1/p' a.html
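One caveat: the leading `.*` is greedy, so if the whole fragment really is a single line, this command captures the last panel_title div on that line (Recommend in the fragment above), not the first. A minimal sketch that lists every title, one per line, and then picks the first. It assumes a grep that supports `-o` (GNU and BSD grep do) and that titles contain no `<`; the file name `sample.html` and its shortened content are made up for illustration:

```shell
# Hypothetical shortened sample: three title divs on a single line.
printf '%s\n' '<br><div class=panel_title align=left>Problem Description</div> <div class=panel_content>Calculate <i>A + B</i>.<br></div><div class=panel_title align=left>Input</div> <div class=panel_content>x</div><div class=panel_title align=left>Output</div>' > sample.html

# Step 1: grep -o prints each matching title div on its own line.
# Step 2: sed strips the opening and closing tags, leaving only the text.
titles=$(grep -o '<div class=panel_title align=left>[^<]*</div>' sample.html |
  sed -e 's/<div class=panel_title align=left>//' -e 's/<\/div>//')

# All titles, one per line.
printf '%s\n' "$titles"

# Only the first title.
first=$(printf '%s\n' "$titles" | sed -n '1p')
printf '%s\n' "$first"
```

For anything beyond a quick one-off like this, a real HTML parser is usually more robust than regular expressions.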