I'm using the regex below to find most of the matches I'm looking for. I'm trying to mark any 'ml' preceded by a number of up to 9 digits, as long as that number isn't in the 1900-2100 range. The issue I'm having, and I don't even know whether this can be handled in the same regex, is the value '19750ml (with a leading single quote). Any time there is a QUOTE character followed by a 2-digit number, I don't want that part selected. The regex currently matches '0ml' because the value 1975 precedes it, but when a single quote is followed by a 2-digit number, I would like it to mark '750ml' instead. In other words, can the QUOTE followed by the 2-digit number have the highest priority? This is being used in Microsoft SQL Server 2019 as a function.
(?i)(?<!\b(?:19|2[01])\d(?=\d))(?<!\b(?:19|2[01])(?=\d{2}))(?<!\b(?:1(?=9\d{2}))|2(?=[01]\d{2}))(?!\b(?:19|2[01])\d\d)\d{0,9}ml$
Here are some examples and what the above regex is matching on:
Mary had a little lamb 1980750ml
Test 19819ml
Test 198218ml
Test 2123456ml
Test 20349876ml
Test 209912345ml
Test 1999123456ml
Test 987654321ml
Test '19750ml <--- This guy is my issue
Test 1988ml
Test 9999ml
Test 2000ml
Test 100ml
Test '2529ml <--- I would like it to mark '29ml' keeping the quote followed by 2 numeric digits
2 Answers
You could extend the regex with these additional restrictions:
- The match should not start immediately after a quote when the match starts with two digits.
- The match should not start immediately after a quote and digit when the match starts with a digit.
For encoding these two constraints in the regex we can use (?<!'(?=\d\d)|'\d(?=\d)) as an additional look-behind to the ones already in the regex.
We could add this allowance (as an exception to the already existing rules):
- The match may start when it is preceded by a quote and two digits.
We can encode this with (?<='\d\d) as an alternative to all other look-behind restrictions.
This leads to this regex:
(?i)(?:(?<!\b(?:19|2[01])\d(?=\d))(?<!\b(?:19|2[01])(?=\d\d))(?<!\b(?:1(?=9\d\d))|2(?=[01]\d\d))(?!\b(?:19|2[01])\d\d)(?<!'(?=\d\d)|'\d(?=\d))|(?<='\d\d))\d{0,9}ml$
See it on regex101
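To see what the allowance does on its own, here is a minimal Python sketch. It tests only the (?<='\d\d) part and deliberately leaves out all the year-range look-behinds, so it is not the full regex above. The helper name after_quoted_pair is made up for illustration, and Python's re module stands in for whatever engine the SQL function actually uses.

```python
import re

# Only the allowance from the answer: the match may start right after a quote
# and two digits. The year-range look-behinds are deliberately left out here,
# so this is NOT the complete regex from the answer.
AFTER_QUOTED_PAIR = re.compile(r"(?<='\d\d)\d{0,9}ml$", re.IGNORECASE)


def after_quoted_pair(text):
    """Return the trailing '<digits>ml' that follows a quote and two digits, or None.

    Hypothetical helper for illustration only.
    """
    m = AFTER_QUOTED_PAIR.search(text)
    return m.group(0) if m else None


if __name__ == "__main__":
    for sample in ["Test '19750ml", "Test '2529ml", "Test 100ml"]:
        print(sample, "->", after_quoted_pair(sample))
```

For the two problem strings from the question this yields 750ml and 29ml; a string without a quote yields no match, because the other branch of the full regex is missing here.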
This is an extension of the thought process formulated in this answer:
Assuming you use a regex flavor that has the (*SKIP)(*FAIL) keywords, you could use:
(?:'\d{2}|(?:190[0-9]|19[1-9][0-9]|2[01][0-9]{2})(?=\d*ml))(*SKIP)(*FAIL)|\d{0,9}ml
See: regex101
Explanation (see also rexegg):
- (?: ... ) : Match either of two conditions:
  - '\d{2} : either the numeric range starts with a ' followed by two digits
  - | : or
  - (?:190[0-9]|19[1-9][0-9]|2[01][0-9]{2}) : the numeric range 1900-2199 (regex generated based on this answer)
  - (?=\d*ml) : if it is followed by any number of digits and "ml"
- (*SKIP)(*FAIL) : and discard them.
- | : Or
- \d{0,9}ml : match the desired 0 to 9 digits followed by "ml".
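Not every engine has (*SKIP)(*FAIL); Python's built-in re module, for one, does not. A common substitute is the capture-group trick: match the unwanted prefixes first without capturing them, and capture only the wanted part. The sketch below swaps that trick in for the verbs, so it is a different technique with the same intent, not the regex above. The names ML_PATTERN and mark_ml are made up for illustration.

```python
import re

# Unwanted prefixes are matched first and consumed without a capture group;
# only the last alternative captures, so group 1 holds the part to mark.
ML_PATTERN = re.compile(
    r"'\d{2}"                                                  # quote + two digits: consume, do not mark
    r"|(?:190[0-9]|19[1-9][0-9]|2[01][0-9]{2})(?=\d*ml)"       # 1900-2199 prefix before digits and "ml"
    r"|(\d{0,9}ml)",                                           # wanted part: 0 to 9 digits and "ml"
    re.IGNORECASE,
)


def mark_ml(text):
    """Return every captured '<digits>ml' fragment in text (hypothetical helper)."""
    return [m.group(1) for m in ML_PATTERN.finditer(text) if m.group(1) is not None]


if __name__ == "__main__":
    samples = [
        "Mary had a little lamb 1980750ml",
        "Test 987654321ml",
        "Test '19750ml",
        "Test 1988ml",
        "Test '2529ml",
    ]
    for sample in samples:
        print(sample, "->", mark_ml(sample))
```

With this sketch, '19750ml gives 750ml and '2529ml gives 29ml. For a bare year such as 1988ml, only ml is captured, because the four year digits are consumed by the second alternative.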
(?:'\d{2}|(?:190[0-9]|19[1-9][0-9]|2[01][0-9]{2})(?=\d*ml))(*SKIP)(*FAIL)|\d{,9}ml work for you? It builds on/extends the logic of my answer to your similar question. – DuesserBaest Commented Jan 22 at 20:32
19750ml is not 750ml, nor is the number in the range of 1900..2100, so you're just truncating numbers for some reason. What's the actual intent of truncating the leading digits? Without knowing why, this is possibly a suboptimal X/Y solution. For example, why not just do a right-to-left string scan and extract the numbers? And if you can have nine digits, what should you do with a string like 1975019750ml or 175021001ml? – Todd A. Jacobs Commented Jan 24 at 18:54