var ss= "<pre>aaaa\nbbb\nccc</pre>ddd";
var arr= ss.match( /<pre.*?<\/pre>/gm );
alert(arr); // null
I want the PRE block to be picked up, even though it spans newline characters. I thought the 'm' flag would do that, but it does not.
I found the answer here before posting. Since I thought I knew JavaScript (read three books, worked hours) and there wasn't an existing solution on SO, I'll dare to post anyway. Throw stones here.
So the solution is:
var ss= "<pre>aaaa\nbbb\nccc</pre>ddd";
var arr= ss.match( /<pre[\s\S]*?<\/pre>/gm );
alert(arr); // <pre>...</pre> :)
Does anyone have a less cryptic way?
Edit: this is a duplicate, but since the other question is harder to find than mine, I won't remove this one.
It proposes [^] as a "multiline dot". What I still don't understand is why [.\n] does not work. I guess this is one of the sad parts of JavaScript.
- 49 A less cryptic regex? Impossible, by nature. – Rubens Farias Commented Dec 30, 2009 at 12:18
- btw, you should read "Parsing Html: The Cthulhu Way": codinghorror.com/blog/archives/001311.html – Rubens Farias Commented Dec 30, 2009 at 12:23
- 2 The link changed from the previous comment: blog.codinghorror.com/parsing-html-the-cthulhu-way (5yrs-ish later) – dab Commented Jan 4, 2015 at 3:58
8 Answers
DON'T use (.|[\r\n]) instead of . for multiline matching.
DO use [\s\S] instead of . for multiline matching.
Also, avoid greediness where not needed by using the *? or +? quantifier instead of * or +. This can have a huge performance impact.
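Besides performance, the lazy quantifier also changes what gets matched. A minimal sketch, assuming an input with two PRE blocks (the sample string and variable names are made up for illustration):

```javascript
// Two <pre> blocks separated by other text (illustrative input).
const html = "<pre>one\ntwo</pre>middle<pre>three</pre>";

// Greedy: [\s\S]* runs to the LAST </pre>, swallowing "middle" as well.
const greedy = html.match(/<pre>[\s\S]*<\/pre>/g);

// Lazy: [\s\S]*? stops at the FIRST </pre> that follows each <pre>.
const lazy = html.match(/<pre>[\s\S]*?<\/pre>/g);

console.log(greedy.length); // 1
console.log(lazy.length);   // 2
```

With the greedy form, everything between the first opening tag and the last closing tag comes back as a single match.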
See the benchmark I have made: https://jsben.ch/R4Hxu
Using [^]: fastest
Using [\s\S]: 0.83% slower
Using (.|\r|\n): 96% slower
Using (.|[\r\n]): 96% slower
NB: You can also use [^], but the comment below discourages it.
[.\n] does not work because . has no special meaning inside of []; it just means a literal . character. (.|\n) would be a way to specify "any character, including a newline". If you want to match all newlines, you would need to add \r as well to include Windows and classic Mac OS style line endings: (.|[\r\n]).
That turns out to be somewhat cumbersome, as well as slow (see KrisWebDev's answer for details), so a better approach is to match all whitespace characters and all non-whitespace characters with [\s\S], which will match everything and is faster and simpler.
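The alternatives discussed above can be compared side by side. This is only a sketch: the input string (with one Windows-style \r\n ending) and the variable names are assumptions chosen to make the differences visible.

```javascript
// Input mixes a Windows-style "\r\n" and a plain "\n" line ending.
const text = "<pre>aaaa\r\nbbb\nccc</pre>ddd";

// [.\n] is a class holding a literal dot and a newline only,
// so it cannot get past the letters "aaaa".
const literalDotClass = /<pre>[.\n]*?<\/pre>/.test(text);

// (.|\n) handles "\n", but neither branch matches "\r".
const dotOrNewline = /<pre>(.|\n)*?<\/pre>/.test(text);

// (.|[\r\n]) covers both line-ending characters.
const dotOrCrLf = /<pre>(.|[\r\n])*?<\/pre>/.test(text);

// [\s\S] matches any character at all.
const anyChar = /<pre>[\s\S]*?<\/pre>/.test(text);

console.log(literalDotClass, dotOrNewline, dotOrCrLf, anyChar);
// false false true true
```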
In general, you shouldn't try to use a regexp to match the actual HTML tags. See, for instance, these questions for more information on why.
Instead, try actually searching the DOM for the tag you need (using jQuery makes this easier, but you can always do document.getElementsByTagName("pre") with the standard DOM), and then search the text content of those results with a regexp if you need to match against the contents.
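A rough sketch of that DOM-first approach. The helper name findPreText is hypothetical, and it takes the document as a parameter so it can be tried outside a browser with a stand-in object; in a page you would pass the real document.

```javascript
// Hypothetical helper: returns the text of every <pre> element whose
// content matches `pattern`. `doc` is any object exposing the standard
// getElementsByTagName method, such as the browser's `document`.
function findPreText(doc, pattern) {
  return Array.from(doc.getElementsByTagName("pre"))
    .map((el) => el.textContent)
    // String.prototype.search ignores the g flag and lastIndex,
    // so a reused global regex cannot skip elements here.
    .filter((text) => text.search(pattern) !== -1);
}

// In a browser: findPreText(document, /bbb/)
```

No multiline dot is needed for the tag itself, because the DOM has already separated each element's content from the markup.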
You do not specify your environment and version of JavaScript (ECMAScript), and I realise this post was from 2009, but just for completeness:
With the release of ECMAScript 2018 we can now use the s flag to cause . to match \n and the other line terminators (see https://stackoverflow.com/a/36006948/141801).
Thus:
let s = 'I am a string\nover several\nlines.';
console.log('String: "' + s + '".');
let r = /string.*several.*lines/s; // Note 's' modifier
console.log('Match? ' + r.test(s)); // 'test' returns true
This is a recent addition and will not work in many current environments; for example, Node v8.7.0 does not seem to recognise it. It works in Chromium, though, and I'm using it in a TypeScript test I'm writing. Presumably it will become more mainstream as time goes by.
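Where support for the s flag is uncertain, one option is to detect it at run time and fall back to [\s\S]. A minimal sketch; the helper name makePreRegex is hypothetical, and it relies on the RegExp constructor throwing a SyntaxError for a flag the engine does not know:

```javascript
// Hypothetical helper: builds a regex for a <pre> block that spans lines,
// preferring the s (dotAll) flag and falling back to [\s\S] on engines
// that reject that flag.
function makePreRegex() {
  try {
    // Throws a SyntaxError on engines without support for the s flag.
    return new RegExp("<pre>.*?</pre>", "gs");
  } catch (e) {
    return /<pre>[\s\S]*?<\/pre>/g;
  }
}

const preMatches = "<pre>aaaa\nbbb\nccc</pre>ddd".match(makePreRegex());
console.log(preMatches.length); // 1
```

A regex literal such as /.../s cannot be guarded this way, because an unsupported flag in a literal fails when the script is parsed.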
Now there's the s (single line) modifier, which lets the dot match newlines as well :) \s will also match newlines :D
Just add the s after the closing slash:
/<pre>.*?<\/pre>/gms
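For reference, the m and s flags do different jobs: m changes where ^ and $ may match, while s changes what the dot matches. A small sketch, assuming an engine with ES2018 support (the sample string and variable names are illustrative):

```javascript
const lines = "first\nsecond\nthird";

// Without m, ^ and $ anchor only at the start and end of the whole string.
const noM = lines.match(/^\w+$/g);    // null

// With m, ^ and $ also match at each line break.
const withM = lines.match(/^\w+$/gm); // ["first", "second", "third"]

// m leaves the dot alone; only s lets it cross a newline.
const dotWithM = /first.second/m.test(lines); // false
const dotWithS = /first.second/s.test(lines); // true
```

This is why the m flag in the question had no effect on .*? at all.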
[.\n] doesn't work, because a dot inside [] (by regex definition; not JavaScript only) means the literal dot character. You can use (.|\n) (or (.|[\n\r])) instead.
I have tested it (Chrome) and it works for me with both [^] and [^\0]: replace the dot (.) with either [^\0] or [^], because the dot doesn't match line breaks (see http://www.regular-expressions.info/dot.html).
var ss= "<pre>aaaa\nbbb\nccc</pre>ddd";
var arr= ss.match( /<pre[^\0]*?<\/pre>/gm );
alert(arr); //Working
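The two forms are not quite equivalent: [^\0] excludes the NUL character, while [^] excludes nothing. A short sketch with a made-up input containing a NUL; it matters only if the text can actually contain one:

```javascript
// Illustrative input with a NUL character (\u0000) inside the block.
const withNul = "<pre>a\u0000b\nc</pre>";

// [^] is an empty negated class: it matches every character.
const emptyNegated = /<pre>[^]*?<\/pre>/.test(withNul);  // true

// [^\0] matches everything except NUL, so it cannot cross "\u0000".
const notNul = /<pre>[^\0]*?<\/pre>/.test(withNul);      // false

// On ordinary text without NUL the two behave the same.
const notNulPlain = /<pre>[^\0]*?<\/pre>/.test("<pre>a\nb</pre>"); // true
```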
In addition to the examples above, here is an alternative:
^[\\w\\s]*$
where \w matches word characters and \s matches whitespace (including newlines):
[\\w\\s]*
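The doubled backslashes suggest the pattern is meant to be passed as a string to the RegExp constructor. A sketch under that assumption (variable names are illustrative), which also shows the limitation: any character outside \w and \s, such as punctuation, makes the match fail.

```javascript
// Assumes the pattern is given as a string, hence the doubled backslashes.
const wordsAndSpaces = new RegExp("^[\\w\\s]*$");

// Letters, digits, underscores and whitespace (newlines included) pass.
const multiLine = wordsAndSpaces.test("aaaa\nbbb\nccc");       // true

// A comma is neither \w nor \s, so the whole match fails.
const withPunctuation = wordsAndSpaces.test("aaaa\nbbb, ccc"); // false
```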
This one was beyond helpful for me, especially for matching multiple things that include new lines, every single other answer ended up just grouping all of the matches together.