I'm trying to do a replace on the following string prototype: "I&amp;lsquo;m singing &amp;amp; dancing in the rain."
The following regular expression matches the instance properly, but also captures the character following the instance of &amp;.
"(&amp;)[#?a-zA-Z0-9;]" captures the following string from the above prototype: "&amp;l".
How can I limit it to only capture the &amp;?
Edit: I should add that I don't want to match "&amp;" by itself.
5 Answers
Look for (this copes with named, decimal and hexadecimal entities):
&amp;([A-Za-z]+|#x[\dA-Fa-f]+|#\d+);
replace with
&$1;
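As a rough sketch of the search and replace above, assuming the input is double-encoded so that every entity starts with a literal `&amp;`, it could look like this in JavaScript. The helper name `decodeOneLevel` and the sample string are illustrative only.

```javascript
// Pattern from the answer, assuming each entity's leading "&" arrived as "&amp;".
// Group 1 holds the entity body: a name, "#x" plus hex digits, or "#" plus decimal digits.
const DOUBLE_ENCODED_ENTITY = /&amp;([A-Za-z]+|#x[\dA-Fa-f]+|#\d+);/g;

// Hypothetical helper: removes one level of encoding from entity references only.
function decodeOneLevel(text) {
  // "$1" re-inserts the captured entity body after a single "&".
  return text.replace(DOUBLE_ENCODED_ENTITY, "&$1;");
}

const sample = "I&amp;lsquo;m singing &amp;amp; dancing in the rain.";
console.log(decodeOneLevel(sample));
// → "I&lsquo;m singing &amp; dancing in the rain."
```

A bare `&amp;` followed by a space is left alone, because the pattern requires a complete entity body and a closing `;`.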
Be warned: this has a real chance of going wrong. I recommend using an HTML parser to decode the text. You can decode it twice if it was double-encoded. HTML and regex don't play well together, even on a small scale.
Since you are in JavaScript, I expect you are in a browser. If you are, you have a nice DOM parser at your hands. Create a new element, assign the string to its innerHTML property and read out the text value. Done.
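The DOM approach described above might be sketched as follows. The function name `decodeEntitiesWithDom` and the `doc` parameter are assumptions made for illustration. In a browser, `doc` falls back to the global `document`. A `<textarea>` is used here as the throwaway element so that any markup in the input is treated as text rather than turned into live elements. That choice is this sketch's, not something the answer specifies.

```javascript
// Hypothetical helper: lets the browser's HTML parser decode entity references.
// `doc` defaults to the global `document`, which exists only in a browser.
function decodeEntitiesWithDom(encoded, doc = document) {
  // Create a detached element that is never added to the page.
  const el = doc.createElement("textarea");
  // Assigning to innerHTML runs the HTML parser, which resolves entities.
  el.innerHTML = encoded;
  // Reading the text content returns the decoded characters.
  return el.textContent;
}

// Hypothetical helper for double-encoded input: decode twice.
function decodeTwiceWithDom(encoded, doc = document) {
  return decodeEntitiesWithDom(decodeEntitiesWithDom(encoded, doc), doc);
}
```

In a browser console, `decodeTwiceWithDom("I&amp;lsquo;m singing &amp;amp; dancing in the rain.")` would be the call to try on the string from the question.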
I gather that you want to match &amp;, but only if it is followed by an alphanumeric character or certain punctuation. That calls for lookahead. This regular expression should match what you want without capturing or consuming any additional characters.
(&amp;)(?=[#?a-zA-Z0-9;])
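To illustrate the lookahead, here is a small sketch that assumes the ampersands in the input appear as the literal text `&amp;`. The variable names and the extra ` &amp; more` at the end of the sample are made up for the example.

```javascript
// Lookahead version: the character after "&amp;" is checked but not consumed.
const AMP_BEFORE_ENTITY = /(&amp;)(?=[#?a-zA-Z0-9;])/g;

const input = "I&amp;lsquo;m singing &amp;amp; dancing &amp; more";

// Every match is exactly "&amp;"; the character that follows it stays out of the match.
const found = [...input.matchAll(AMP_BEFORE_ENTITY)].map((m) => m[0]);
console.log(found); // → [ "&amp;", "&amp;" ]

// The final "&amp;" is followed by a space, so it is not matched or replaced.
const replaced = input.replace(AMP_BEFORE_ENTITY, "&");
console.log(replaced);
// → "I&lsquo;m singing &amp; dancing &amp; more"
```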
Actually you're matching the string &amp;l, but only the &amp; is captured. This is because of the character class after the capture group, which matches one additional character.
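The difference between the whole match and the capture group can be seen directly. This is a minimal sketch, again assuming the ampersands in the input are the literal text `&amp;`. The variable names are illustrative.

```javascript
// The regex from the question: a capture group followed by a character class.
const questionRegex = /(&amp;)[#?a-zA-Z0-9;]/;

const result = questionRegex.exec("I&amp;lsquo;m singing &amp;amp; dancing in the rain.");

// Index 0 is the whole match, including the one character the class consumed.
console.log(result[0]); // → "&amp;l"

// Index 1 is the capture group alone.
console.log(result[1]); // → "&amp;"
```

A replace call substitutes the whole match, so the trailing character is lost unless the replacement string puts it back.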
But your original regex is a little flawed to begin with anyway. A (not optimal) replacement might be:
&amp;(#[0-9]+|#x[0-9a-zA-Z]+|[a-zA-Z]+);
which will match the complete entity or character reference and capture everything between the &amp; and the closing ;.
If you only want to match &amp;, why did you include the character class [#?a-zA-Z0-9;] as well?
In English, your expression would be "Match &amp; followed by a character that is #, ?, a lowercase letter, an uppercase letter, a digit or ;".
Just use (&amp;)
You probably meant:
"&amp;([#a-zA-Z0-9]+;)"