Why can't I extract the HTML with a regular expression?

show.html contains the following:

<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <title></title>
    <style type='text/css'>
    div#html,div#css,div#js,div#run{
        border:1px solid red;
        height:80px;
        width:80px;
        float:left;
    }
    div#content{
        clear:both;
        width:400px;
        height:400px;
        border:1px solid black;
    }
   </style>
</head>
<body>
    <div id='html'>html</div>
    <div id='css'>css</div>
    <div id='js'>js</div>
    <div id='run'>run</div>
    <div id='content'>
    </div>        
    <script type="text/javascript">
    var html_string = document.body.innerHTML;
    var content = document.getElementById('content');
    var ob_html = document.getElementById('html');
    ob_html.onmouseover = function(){
        content.innerText = html_string; 
    } 
    ob_html.onmouseout = function(){
        content.innerText = '';
    }
   </script>    
</body>
</html>

After opening the page in a browser and moving the mouse over div#html, the result is:
[image: screenshot of the result]

I want it to show only the HTML part:

    <div id='html'>html</div>
    <div id='css'>css</div>
    <div id='js'>js</div>
    <div id='run'>run</div>
    <div id='content'>
    </div> 

So I changed the JS part:

    var html_string = document.body.innerHTML;
    var content = document.getElementById('content');
    var ob_html = document.getElementById('html');
    var reg = new RegExp('<script type="text/javascript">.+</script>');
    var onlyHtml = html_string.replace(reg,"");
    ob_html.onmouseover = function(){
        content.innerText = onlyHtml; 
    } 
    ob_html.onmouseout = function(){
        content.innerText = '';
    }

Why does this fail to extract the HTML part?

var content = "<p>test</p><script type='text/javascript'>somany lines and so many lines</script>";
var reg = new RegExp("<script type='text/javascript'>.+</script>");
var onlyHtml = content.replace(reg,"");
alert(onlyHtml);

Yet the snippet above does produce the correct result?!

3 answers

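The dot (.) does not match line-break characters, and the script block in body.innerHTML spans several lines.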

html.replace(/<script type="text\/javascript">[^]+<\/script>/, '')
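To illustrate the point about the dot, here is a minimal sketch that runs outside the browser. The two sample strings are assumptions made up for the demo: oneLine mirrors the test string from the question, and multiLine stands in for what document.body.innerHTML would roughly contain. It also shows [\s\S] and the s (dotAll) flag as alternatives to [^], which is specific to JavaScript.

```javascript
// Hypothetical samples: one-line input (like the question's test string)
// and multi-line input (like the real body.innerHTML).
const oneLine = "<p>test</p><script type='text/javascript'>so many lines</script>";
const multiLine =
  "<div id='content'>\n</div>\n" +
  "<script type=\"text/javascript\">\nvar a = 1;\n</script>\n";

// The question's patterns: '.' stops at line breaks.
const dotOneLine = new RegExp("<script type='text/javascript'>.+</script>");
const dotMultiLine = new RegExp('<script type="text/javascript">.+</script>');

// Alternatives that do cross line breaks.
const anyChar = /<script type="text\/javascript">[\s\S]+?<\/script>/;
const dotAll = /<script type="text\/javascript">.+?<\/script>/s;

const oneLineResult = oneLine.replace(dotOneLine, "");       // script removed
const unchanged = multiLine.replace(dotMultiLine, "");       // no match, nothing removed
const anyCharResult = multiLine.replace(anyChar, "");        // script removed
const dotAllResult = multiLine.replace(dotAll, "");          // script removed

console.log(JSON.stringify(oneLineResult));
console.log(unchanged === multiLine);
console.log(JSON.stringify(anyCharResult));
console.log(JSON.stringify(dotAllResult));
```

The one-line case works only because there is no line break between the opening and closing tags, which is why the test in the question looked correct.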
content.innerText = html_string.replace(/<script\b[^<]*(?:(?!<\/script>)<[^<]*)*<\/script>/ig, ''); 
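The two patterns above behave differently once the markup contains more than one script block. The sketch below uses a made-up input string (twoScripts is an assumption, not taken from the question) to compare them: the greedy [^]+ runs to the last closing tag and swallows the markup in between, while the second pattern, with its negative lookahead and the g flag, removes each block separately.

```javascript
// Hypothetical input with two script blocks and markup between them.
const twoScripts =
  '<p>a</p><script type="text/javascript">\nvar x = 1;\n</script>' +
  '<p>b</p><script type="text/javascript">\nvar y = 2;\n</script><p>c</p>';

// Greedy: matches from the first opening tag to the LAST closing tag.
const greedy = /<script type="text\/javascript">[^]+<\/script>/;

// Lookahead-based: stops at the first closing tag; g removes every block,
// i ignores case, and the type attribute is not required.
const tempered = /<script\b[^<]*(?:(?!<\/script>)<[^<]*)*<\/script>/ig;

const greedyResult = twoScripts.replace(greedy, '');
const temperedResult = twoScripts.replace(tempered, '');

console.log(greedyResult);   // <p>a</p><p>c</p>
console.log(temperedResult); // <p>a</p><p>b</p><p>c</p>
```

For the page in the question, which has a single script block, both patterns give the same result.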

Spaces and line breaks are not being matched either.

<script type="text/javascript">((.*\n*\s*)*)</script>