Struggling with regex and regex AI isn't helping.
What I need to get back is:
Group 1 - All text to before the 3rd numeric
Group 2 - All text from the 3rd numeric to before the 5th numeric
Group 3 - All text from the 5th numeric to before the 7th numeric
Group 4 - All text from the 7th numeric to before the 11th numeric
Group 5 - All text from the 11th numeric to before the 13th numeric
Group 6 - All text from the 13th numeric to the end of the string
The tricky bit is there may be HTML code anywhere between the numbers.
The below PHP is kiiiinda working:
<?php
$gsPattern = '/([^\d]+)(\d{2})(.*?)(\d{2})(.*?)(\d{2})(.*?)(\d{4})(.*?)(\d{2})(.*?)(\d{2})(.*)/i';
$gsReplacement = '${1}${2}-${3}${4}-${5}${6}-${7}${8}-${9}${10}-${11}${12}${13}';
$gsRowCode = "<td>160301<b style=\"color: red;\">1234</b>2525</td>";
echo preg_replace($gsPattern, $gsReplacement, $gsRowCode);
echo "<br>";
$gsRowCode = "<td>16030<b style=\"color: red;\">1123</b>42525</td>";
echo preg_replace($gsPattern, $gsReplacement, $gsRowCode);
The following hopefully shows the output I'm after:
Output<br>
1.<br>
16-03-01-<b style="color: red;">1234-</b>25-25 <br>
<br>
2.<br>
16030<b style="color: red;">1123</b>42525 <br>
<br>
Would like it to output as: <br>
16-03-0<b style="color: red;">1-123</b>4-25-25 <br>
asked Mar 21 at 3:27 by Anrik, edited Mar 21 at 3:41 by Tangentially Perpendicular
3 Answers
APPROACH:
To help with clarity: I captured every digit separately into a named capture group ((?P<n1>...), (?P<n2>...), (?P<n3>...), etc.).
I also captured every non-digit string into named capture groups: the one before the first digit ((?P<string1_beginning>...)), those between digits ((?P<string2>...), (?P<string3>...), (?P<string4>...), etc.) and the one after the last digit ((?P<string15_end>...)).
The numbers in the names of the non-digit-string capture groups (?P<stringN>) and the digit capture groups (?P<nN>) match.
In the replacement string, I build the desired outcome by placing the digit capture groups and string capture groups in the required order. For example, when the named capture group ${string6} returns <b style="color: red;">, the group ${string7} returns [EMPTY], and vice versa, resulting in the desired outcome.
I added a capture group between each pair of digits. This creates flexibility and allows me to select the possible string-group numbers for the replacement string based on the digit location, as you can see in the suggested replacement string below.
There will be an issue if there are other digits between the digits that we are looking to capture, for example if the color is given in rgb format, which includes numbers. That, however, was not part of the scope of the question.
REGEX PATTERN AND REPLACEMENT STRING (PCRE2 flavor):
$gsPattern = '/^(?P<string1_beginning>[^\d]*)(?P<n1>\d)(?P<string2>[^\d]*)(?P<n2>\d)(?P<string3>[^\d]*)(?P<n3>\d)(?P<string4>[^\d]*)(?P<n4>\d)(?P<string5>[^\d]*)(?P<n5>\d)(?P<string6>[^\d]*)(?P<n6>\d)(?P<string7>[^\d]*)(?P<n7>\d)(?P<string8>[^\d]*)(?P<n8>\d)(?P<string9>[^\d]*)(?P<n9>\d)(?P<string10>[^\d]*)(?P<n10>\d)(?P<string11>[^\d]*)(?P<n11>\d)(?P<string12>[^\d]*)(?P<n12>\d)(?P<string13>[^\d]*)(?P<n13>\d)(?P<string14>[^\d]*)(?P<n14>\d)(?P<string15_end>[^\d]*$)/';
$gsReplacement = '<br>${n1}${n2}-${n3}${n4}-${n5}${string6}${string7}${n6}-${n7}${n8}${n9}${string10}${string11}${n10}-${n11}${n12}-${n13}${n14} <br>';
Regex Demo: https://regex101/r/EmP12l/4
INPUT:
1: $gsRowCode = "<td>160301<b style=\"color: red;\">1234</b>2525</td>";
2: $gsRowCode = "<td>16030<b style=\"color: red;\">1123</b>42525</td>";
OUTPUT:
1: <br>16-03-0<b style="color: red;">1-123</b>4-25-25 <br>
2: <br>16-03-0<b style="color: red;">1-123</b>4-25-25 <br>
DESIRED OUTPUT:
<br>16-03-0<b style="color: red;">1-123</b>4-25-25 <br>
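One caveat when moving this from regex101 into PHP: as far as I know, preg_replace() only documents numbered references ($1, ${1}, \1) in the replacement string, so the ${n1}-style names would not be substituted there. A minimal sketch of the same approach using preg_replace_callback(), where named groups are available as array keys, could look like the following. The function names buildDigitPattern() and formatRowCode() are made up for illustration, and the pattern is assembled in a loop purely to keep the example short.

```php
<?php
// Sketch only: rebuilds the answer's 14-digit pattern in a loop instead of typing it out.
function buildDigitPattern(): string
{
    // Non-digit text before the first digit, then the first digit.
    $pattern = '(?P<string1_beginning>[^\d]*)(?P<n1>\d)';
    for ($i = 2; $i <= 14; $i++) {
        // Non-digit text before digit $i, then digit $i.
        $pattern .= "(?P<string{$i}>[^\\d]*)(?P<n{$i}>\\d)";
    }
    // Non-digit text after the last digit.
    $pattern .= '(?P<string15_end>[^\d]*)';
    return '/^' . $pattern . '$/';
}

function formatRowCode(string $rowCode): string
{
    $result = preg_replace_callback(
        buildDigitPattern(),
        function (array $m): string {
            // Same order as the answer's replacement string.
            return '<br>'
                . $m['n1'] . $m['n2'] . '-'
                . $m['n3'] . $m['n4'] . '-'
                . $m['n5'] . $m['string6'] . $m['string7'] . $m['n6'] . '-'
                . $m['n7'] . $m['n8'] . $m['n9']
                . $m['string10'] . $m['string11'] . $m['n10'] . '-'
                . $m['n11'] . $m['n12'] . '-'
                . $m['n13'] . $m['n14'] . ' <br>';
        },
        $rowCode
    );
    // preg_replace_callback() returns null on error; fall back to the input.
    return $result ?? $rowCode;
}

echo formatRowCode('<td>160301<b style="color: red;">1234</b>2525</td>'), "\n";
echo formatRowCode('<td>16030<b style="color: red;">1123</b>42525</td>'), "\n";
// Both lines print: <br>16-03-0<b style="color: red;">1-123</b>4-25-25 <br>
```

As in the answer's replacement string, the leading <td> and trailing </td> are not carried over; add $m['string1_beginning'] and $m['string15_end'] to the callback if they should be kept.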
REGEX NOTES:
^ Matches the beginning of the string.
(?P<string1_beginning>[^\d]*) Named capture group (?P<name>...) containing a negated character class [^...]. Matches any character that is NOT a digit (\d), 0 or more times (*). In the replacement string, the text captured by this group is retrieved with ${string1_beginning}.
(?P<n1>\d) Named capture group (?P<name>...). Matches one digit (\d). In the replacement string, the text captured by this group is retrieved with ${n1}.
The same pattern repeats to capture a total of 14 digits and 15 non-digit strings.
$ Matches the end of the string.
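The limitation mentioned in the approach (digits inside a tag, such as an rgb color) could be sidestepped by not using one big pattern at all. The sketch below is not from any of the answers: it splits the row into tags and text, counts digits in the text only, and puts a hyphen directly before the 3rd, 5th, 7th, 11th and 13th digit. The function name hyphenateDigits() and its $breakBefore parameter are hypothetical, and it assumes that no attribute value contains a literal > character.

```php
<?php
// Hypothetical helper: inserts '-' before the listed digit positions,
// counting only digits that appear outside <...> tags.
function hyphenateDigits(string $html, array $breakBefore = [3, 5, 7, 11, 13]): string
{
    // With PREG_SPLIT_DELIM_CAPTURE the result alternates:
    // even indexes are text, odd indexes are the captured tags.
    $parts = preg_split('/(<[^>]*>)/', $html, -1, PREG_SPLIT_DELIM_CAPTURE);

    $digitCount = 0;
    $out = '';
    foreach ($parts as $index => $part) {
        if ($index % 2 === 1) {
            // A tag: copy it through untouched, digits inside are ignored.
            $out .= $part;
            continue;
        }
        $length = strlen($part);
        for ($i = 0; $i < $length; $i++) {
            $char = $part[$i];
            if (ctype_digit($char)) {
                $digitCount++;
                if (in_array($digitCount, $breakBefore, true)) {
                    $out .= '-';
                }
            }
            $out .= $char;
        }
    }
    return $out;
}

echo hyphenateDigits('<td>16030<b style="color: red;">1123</b>42525</td>'), "\n";
// <td>16-03-0<b style="color: red;">1-123</b>4-25-25</td>
echo hyphenateDigits('<td>16030<b style="color: rgb(255,0,0);">1123</b>42525</td>'), "\n";
// <td>16-03-0<b style="color: rgb(255,0,0);">1-123</b>4-25-25</td>
```

Because each hyphen lands immediately before its digit, a tag that sits exactly on a group boundary ends up before the hyphen rather than after it, so the first sample row comes out as 16-03-01<b style="color: red;">-1234</b>-25-25. Unlike the pattern above, this sketch does not move the tag itself.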
Basically it's just capturing the digits around the color tag at a finer granularity, so that the tag can be shifted over by 1 place.
This is an ECMAScript example. It could be shortened further with a PCRE-style engine.
^.*?(\d{2}).*?(\d{2}).*?(?=\d*(<b[^>]*?color[^>]*?>)\d{4}(</b>))(?:(\d)\3(\d)(\d{3})\4(\d)|(\d)(\d)\3(\d{3})(\d)\4).*?(\d{2}).*?(\d{2}).*
Replace $1-$2-$5$9$3$6$10-$7$11$4$8$12-$13-$14
https://regex101/r/mJuvUJ/1
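Since the question is about PHP, here is a rough port of that pattern to preg_replace(). This is my own adaptation, not part of the answer: the ~ delimiter is chosen so the / in </b> needs no escaping, and the references are written as ${n} to keep the two-digit ones unambiguous. Groups that do not take part in the match are substituted as empty strings, which is what lets the two alternatives share one replacement.

```php
<?php
// PHP port of the ECMAScript pattern above (the port itself is an assumption).
$pattern = '~^.*?(\d{2}).*?(\d{2}).*?'
         // Lookahead: find the <b ...color...> tag (group 3) and </b> (group 4).
         . '(?=\d*(<b[^>]*?color[^>]*?>)\d{4}(</b>))'
         // Either one digit before the tag (groups 5-8) or two (groups 9-12).
         . '(?:(\d)\3(\d)(\d{3})\4(\d)|(\d)(\d)\3(\d{3})(\d)\4)'
         . '.*?(\d{2}).*?(\d{2}).*~';
$replacement = '${1}-${2}-${5}${9}${3}${6}${10}-${7}${11}${4}${8}${12}-${13}-${14}';

$rows = [
    '<td>160301<b style="color: red;">1234</b>2525</td>',
    '<td>16030<b style="color: red;">1123</b>42525</td>',
];

$results = [];
foreach ($rows as $row) {
    $results[] = preg_replace($pattern, $replacement, $row);
}
echo implode("\n", $results), "\n";
// Both rows give: 16-03-0<b style="color: red;">1-123</b>4-25-25
```

The pattern consumes the whole row, so the surrounding <td> and </td> are dropped from the result, and it only matches when a <b> tag with a color style wraps exactly four digits.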
Below seems to work ok:
$gsPattern = '/^(.*?\d.*?\d)(.*?\d.*?\d)(.*?\d.*?\d)(.*?\d.*?\d.*?\d.*?\d)(.*?\d.*?\d)(.*)/i';
$gsReplacement = '${1}-${2}-${3}-${4}-${5}-${6}';
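For reference, this is how the pattern behaves on the two rows from the question. The sketch only wraps the two lines above in a loop; the $rows and $results names are mine. Each group lazily takes any non-digit text in front of its digits, so a tag stays attached to the digits that follow it.

```php
<?php
// The pattern and replacement from the answer, applied to both sample rows.
$gsPattern = '/^(.*?\d.*?\d)(.*?\d.*?\d)(.*?\d.*?\d)(.*?\d.*?\d.*?\d.*?\d)(.*?\d.*?\d)(.*)/i';
$gsReplacement = '${1}-${2}-${3}-${4}-${5}-${6}';

$rows = [
    '<td>160301<b style="color: red;">1234</b>2525</td>',
    '<td>16030<b style="color: red;">1123</b>42525</td>',
];

$results = [];
foreach ($rows as $row) {
    $results[] = preg_replace($gsPattern, $gsReplacement, $row);
}
echo implode("\n", $results), "\n";
// <td>16-03-01-<b style="color: red;">1234-</b>25-25</td>
// <td>16-03-0<b style="color: red;">1-123</b>4-25-25</td>
```

Note that \d here also counts digits that appear inside a tag (for example in an rgb color), so this relies on the markup between the numbers being digit-free.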
... numeric and numbers. – jhnc, commented Mar 21 at 5:20
... <table>. 2. The data in the <table> is fundamentally flawed or is being misinterpreted. 3. If #2 is incorrect then you have an HTML <table> and you need the highlighted (e.g. <b>) part of each cell (e.g. <td>) to shift to the left by a single character. If #1 and #3 are correct there's a far better way of dealing with your problem using JavaScript and approaching the source as DOM. If #1 and #2 are correct...? – zer00ne, commented Mar 23 at 0:21