I need to parse an HTML string and remove all the elements which contain only empty children.
Example:
<P ALIGN="left"><FONT FACE="Arial" SIZE="12" COLOR="#000000" LETTERSPACING="0" KERNING="1"><B></B></FONT></P>
contains no information and must be replaced with <br>.
I wrote a regex like this:
<\w+\b[^>]*>(<\w+\b[^>]*>\s*</\w*\s*>)*</\w*\s*>
but the problem is that it catches only two of the three levels. In the above example, the <p>
element (the outermost one) is not selected.
Can you help me fix this regex?
asked Nov 13, 2013 at 10:26 by Cristian Holdunu
Comments:
- Brace yourself for downvotes on a regex+HTML question. – hjpotter92 Commented Nov 13, 2013 at 10:29
- The font element has been deprecated since HTML3, so why are you still using it? – user2417483 Commented Nov 13, 2013 at 10:30
- stackoverflow.com/q/3129738/612202 You should prefer the answer with more votes. – dan-lee Commented Nov 13, 2013 at 10:30
- That is the point: I want to get rid of it. I have an older database that this data comes from. Some notes were saved as text with formatting, and I want to get rid of the useless elements and the font elements. I replaced them with spans. – Cristian Holdunu Commented Nov 13, 2013 at 10:50
3 Answers
This regex seems to work:
/(<(?!\/)[^>]+>)+(<\/[^>]+>)+/
See a live demo with your example.
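As a rough illustration of how that pattern behaves, the sketch below applies it to the example from the question. The `g` flag, the helper name `replaceEmptyRuns`, and the `<br>` replacement are assumptions added for the demo, not part of the answer. The second call shows a limitation worth knowing about: the pattern only looks for a run of opening tags followed by a run of closing tags and does not check that they pair up.

```javascript
// Regex from the answer above, with the `g` flag added (an assumption) so
// that every matching run in the string is replaced, not only the first.
var emptyRun = /(<(?!\/)[^>]+>)+(<\/[^>]+>)+/g;

// Hypothetical helper name, introduced only for this illustration.
function replaceEmptyRuns(html, replacement) {
  return html.replace(emptyRun, replacement);
}

var example = '<P ALIGN="left"><FONT FACE="Arial" SIZE="12" COLOR="#000000" ' +
  'LETTERSPACING="0" KERNING="1"><B></B></FONT></P>';

// All three nesting levels are matched as one run of opening tags
// followed by one run of closing tags.
console.log(replaceEmptyRuns(example, '<br>')); // <br>

// Caveat: the opening <p> is consumed together with the empty <b></b>,
// which leaves an unmatched </p> behind.
console.log(replaceEmptyRuns('<p><b></b>text</p>', '<br>')); // <br>text</p>
```

If the input can contain elements that mix empty children with real text, this pattern should probably be combined with a check that the tags balance, or replaced by an approach that strips only properly paired tags.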
Use jQuery and walk through all the children. For each child, check whether .html() is empty. If it is, delete the current element (or its parent, if you prefer) with .remove().
Do this for each string:
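var appended = $('.yourparent').append('YOUR HTML STRING');
appended.children().each(function ()
{
    // Inside .each(), `this` is a plain DOM element, so wrap it to get jQuery methods.
    var $child = $(this);
    if ($child.html() === '')
    {
        // Removes the empty child; call $child.parent().remove() to drop its parent instead.
        $child.remove();
    }
});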
This adds the items to the page first, then deletes any children that are empty.
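Please try this:
function removeEmptyElements(str, iterations){
    // Matches an element whose body is empty or whitespace only.
    // [a-z] with the i flag replaces [A-z], which also matched [ \ ] ^ _ and `.
    // [^>]* allows attribute values that contain "/" or "^".
    var re = /<([a-z][a-z0-9]*)\b[^>]*>\s*<\/\1\s*>/gi;
    var subst = '';
    // Each pass strips the innermost empty elements, so pass one more
    // iteration for every additional level of nesting.
    for(var i = 0; i < iterations; i++){
        str = str.replace(re, subst);
    }
    return str;
}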
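The function above needs the nesting depth up front through `iterations`. One possible variation, sketched here under a few assumptions, repeats the replacement until the string stops changing, so the depth does not need to be known. The name `removeEmptyElementsUntilStable` is hypothetical, and the tag pattern is slightly stricter than the one in the answer (tag names must start with a letter and may contain digits).

```javascript
// Hypothetical variant: strip empty elements until a pass changes nothing.
function removeEmptyElementsUntilStable(str) {
  // An opening tag, optional whitespace, then the matching closing tag.
  // With the i flag the backreference \1 also matches case-insensitively.
  var re = /<([a-z][a-z0-9]*)\b[^>]*>\s*<\/\1\s*>/gi;
  var previous;
  do {
    previous = str;
    // Each pass removes the innermost empty elements, which may leave
    // their parents empty for the next pass.
    str = str.replace(re, '');
  } while (str !== previous);
  return str;
}

var nested = '<P ALIGN="left"><FONT FACE="Arial" SIZE="12"><B></B></FONT></P>';
console.log(removeEmptyElementsUntilStable(nested));             // prints an empty string
console.log(removeEmptyElementsUntilStable('<p>Hi<b></b></p>')); // <p>Hi</p>
```

This sketch removes the empty elements outright. Replacing the outermost empty element with a line break, as the question asks, would need extra handling, because inserting a `<br>` on an inner pass would make the parent look non-empty on the next one.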