I wrote a regex to fetch a string from HTML, but it seems the multiline flag doesn't work.
This is my pattern; I want to get the text in the h1 tag.
var pattern= /<div class="box-content-5">.*<h1>([^<]+?)<\/h1>/mi
m = html.search(pattern);
return m[1];
I created a string to test it. When the string contains "\n", the result is always null. If I remove all the "\n"s, I get the right result, with or without the /m flag.
What's wrong with my regex?
Asked Jul 1, 2009 at 9:52 by wangyh; edited Dec 18, 2021 at 11:08 by Wiktor Stribiżew.
5 Answers
You are looking for the /.../s modifier, also known as the dotall modifier. It forces the dot . to also match newlines, which it does not do by default.
The bad news is that it does not exist in JavaScript (it does as of ES2018, see below). The good news is that you can work around it by using a character class (e.g. \s) and its negation (\S) together, like this:
[\s\S]
So in your case the regex would become:
/<div class="box-content-5">[\s\S]*<h1>([^<]+?)<\/h1>/i
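As a minimal sketch of the difference, the snippet below runs both patterns against a small made-up HTML string (the sample markup and variable names are assumptions, not from the question). It uses String.prototype.match rather than search, because search returns only an index and has no capture groups to read.

```javascript
// Hypothetical sample input; only the class name comes from the question.
const html = [
  '<div class="box-content-5">',
  '  <p>intro</p>',
  '  <h1>Page title</h1>',
  '</div>',
].join("\n");

// "." does not match line terminators, so this cannot reach the <h1>.
const dotPattern = /<div class="box-content-5">.*<h1>([^<]+?)<\/h1>/i;
const dotResult = html.match(dotPattern);

// [\s\S] matches any character, including "\n".
const anyCharPattern = /<div class="box-content-5">[\s\S]*<h1>([^<]+?)<\/h1>/i;
const anyCharResult = html.match(anyCharPattern);

// match returns null on failure, so guard before reading group 1.
const title = anyCharResult ? anyCharResult[1] : null;
console.log(dotResult, title);
```

Note that [\s\S]* is greedy, so if the block contains several h1 elements this captures the last one; [\s\S]*? would capture the first.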
As of ES2018, JavaScript supports the s (dotAll) flag, so in a modern environment your regular expression could be as you wrote it, but with an s flag at the end (rather than m; m changes how ^ and $ work, not .):
/<div class="box-content-5">.*<h1>([^<]+?)<\/h1>/is
You want the s (dotall) modifier, which apparently doesn't exist in JavaScript. You can replace . with [\s\S] as suggested by @molf.
The m (multiline) modifier makes ^ and $ match at the start and end of each line rather than only at the start and end of the whole string.
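A short sketch of what the m flag does and does not change, using a made-up two-line string:

```javascript
// Hypothetical two-line input.
const text = "first\nsecond";

// Without m, ^ and $ anchor only at the start and end of the whole string.
const withoutM = /^second$/.test(text);

// With m, ^ and $ also match at the start and end of each line.
const withM = /^second$/m.test(text);

// The m flag has no effect on ".": it still does not match "\n".
const dotWithM = /first.second/m.test(text);

console.log(withoutM, withM, dotWithM);
```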
[\s\S] did not work for me in nodejs 6.11.3. The RegExp documentation suggests using [^] instead, which does work for me.
(The dot, the decimal point) matches any single character except line terminators: \n, \r, \u2028 or \u2029.
Inside a character set, the dot loses its special meaning and matches a literal dot.
Note that the m multiline flag doesn't change the dot behavior. So to match a pattern across multiple lines, the character set [^] can be used (if you don't mean an old version of IE, of course), it will match any character including newlines.
For example:
/This is on line 1[^]*?This is on line 3/m
where *? is the non-greedy (lazy) quantifier, matching zero or more occurrences of [^] but as few as possible.
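To make the lazy versus greedy difference concrete, here is a small sketch on an assumed input in which the closing phrase appears twice. Keep in mind that [^] is specific to JavaScript and is not portable to most other regex flavors.

```javascript
// Hypothetical input: the closing phrase occurs twice.
const text = [
  "This is on line 1",
  "line 2",
  "This is on line 3",
  "line 4",
  "This is on line 3",
].join("\n");

// Lazy: stops at the first occurrence of the closing phrase.
const lazy = text.match(/This is on line 1[^]*?This is on line 3/)[0];

// Greedy: runs on to the last occurrence of the closing phrase.
const greedy = text.match(/This is on line 1[^]*This is on line 3/)[0];

console.log(lazy.split("\n").length, greedy.split("\n").length);
```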
The dotall modifier actually made it into JavaScript in June 2018, as part of ECMAScript 2018.
https://github.com/tc39/proposal-regexp-dotall-flag
const re = /foo.bar/s; // Or, `const re = new RegExp('foo.bar', 's');`.
re.test('foo\nbar');
// → true
re.dotAll
// → true
re.flags
// → 's'
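If the code may also run on engines that predate ES2018, one possible approach is to feature-detect the flag and fall back to [\s\S]. This is only a sketch; supportsDotAll is a hypothetical helper name, not a built-in.

```javascript
// Hypothetical helper: engines without the s flag throw a SyntaxError
// when the RegExp constructor receives an unknown flag.
function supportsDotAll() {
  try {
    new RegExp(".", "s");
    return true;
  } catch (e) {
    return false;
  }
}

// Use the s flag where available, otherwise the [\s\S] workaround.
const re = supportsDotAll()
  ? new RegExp("foo.bar", "s")
  : new RegExp("foo[\\s\\S]bar");

console.log(re.test("foo\nbar"));
```

The patterns are built with the RegExp constructor rather than literals because a literal with an unsupported flag would fail at parse time, before any detection could run.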
My suggestion is to split the multi-line string on "\n" and concatenate the pieces into a single line, which is easier to match against.
<textarea class="form-control" name="Body" rows="12" data-rule="required"
title='@("Your feedback ".Label())'
placeholder='@("Your Feedback here!".Label())' data-val-required='@("Feedback is required".Label())'
pattern="^[0-9a-zA-Z ,;/?.\s_-]{3,600}$" data-val="true" required></textarea>
$( document ).ready( function() {
    var errorMessage = "Please match the requested format.";
    $( this ).find( "textarea" ).on( "input change propertychange", function() {
        var pattern = $( this ).attr( "pattern" );
        // Only validate textareas that declare a pattern attribute.
        if ( typeof pattern !== "undefined" && pattern !== false )
        {
            // Strip any existing anchors, then anchor the whole pattern once.
            var patternRegex = new RegExp( '^' + pattern.replace( /^\^|\$$/g, '' ) + '$', 'gm' );
            // Join the lines with spaces so the value is tested as a single line.
            var ks = "";
            $.each( $( this ).val().split( "\n" ), function( index, value ) {
                ks += " " + value;
            });
            var hasError = !ks.match( patternRegex );
            if ( typeof this.setCustomValidity === "function" )
            {
                this.setCustomValidity( hasError ? errorMessage : "" );
            }
            else
            {
                // Fallback for browsers without the constraint validation API.
                $( this ).toggleClass( "invalid", !!hasError );
                $( this ).toggleClass( "valid", !hasError );
                if ( hasError )
                {
                    $( this ).attr( "title", errorMessage );
                }
                else
                {
                    $( this ).removeAttr( "title" );
                }
            }
        }
    });
});
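The same split-and-join idea can be sketched without jQuery. toSingleLine is a hypothetical helper name and the input is made up. One caveat: the pattern is then tested against altered text, since every line break becomes a space.

```javascript
// Hypothetical helper: collapse a multi-line value into one line by
// replacing each line break (LF or CRLF) with a single space.
function toSingleLine(value) {
  return value.split(/\r?\n/).join(" ");
}

// Made-up input and pattern for illustration.
const input = "start\nmiddle\nend";
const pattern = /start.*end/;

// "." cannot cross "\n", so the raw input does not match.
const rawMatches = pattern.test(input);

// After joining, everything is on one line and "." can span it.
const joinedMatches = pattern.test(toSingleLine(input));

console.log(rawMatches, joinedMatches);
```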
… dotAll modifier so you can do /.../s and your dots will also match new lines. As of July 2017 it's behind a flag in Chrome. – user993683, commented Jul 17, 2017 at 13:48