I have a regex on my input parameter:

r"^(ABC-\d{2,9})|(ABz?-\d{3})$"

Ideally it should reject parameters with ++ or -- at the end, but it accepts them. Why does the regex fail in this case when it works in all other scenarios?

ABC-12 is valid.
ABC-123456789 is valid.
AB-123 is valid.
ABz-123 is valid.

asked Feb 4 at 18:01 by Tanu, edited Feb 4 at 18:46 by Timur Shtatland
  • Probably you need ^AB(?:C-\d{2,9}|z?-\d{3})$ (assuming \z is a typo and you mean a \d instead) – anubhava, Feb 4 at 18:27
  • ^(?:ABC-\d{2,9}|ABz?-\d{3})$ would also do. – aaa, Feb 4 at 18:28
2 Answers

The problem is that your ^ and $ anchors don't apply to the entire pattern. Alternation (|) has the lowest precedence, so the pattern is parsed as ^(ABC-\d{2,9}) or (ABz?-\d{3})$: the ^ belongs only to the first alternative and the $ only to the second. So if the input matches (ABC-\d{2,9}) at the beginning, the match succeeds even if more characters follow.
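A minimal sketch of that behaviour, assuming the parameter is checked with something like re.search (the question does not say which function is used). The helper name which_alternative and the sample inputs are made up for illustration; it reports which of the two capture groups produced the match.

```python
import re

# The pattern from the question, unchanged.
original = re.compile(r"^(ABC-\d{2,9})|(ABz?-\d{3})$")

def which_alternative(value):
    """Return 1 or 2 for the alternative that matched, or None if nothing matched."""
    m = original.search(value)
    if m is None:
        return None
    # Only the group belonging to the alternative that matched is set.
    return 1 if m.group(1) is not None else 2

for value in ["ABC-12", "ABC-12++", "ABC-12--", "xxAB-123", "AB-123++"]:
    print(value, which_alternative(value))
```

Under this assumption, ABC-12++ and ABC-12-- are accepted through the first alternative, which has no $ to stop at the end, while xxAB-123 is accepted through the second alternative, which has no ^ to pin it to the start. AB-123++ is still rejected because the second alternative does carry the $.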

You can put a non-capturing group around everything except the anchors to fix this.

r"^(?:(ABC-\d{2,9})|(ABz?-\d{3}))$"
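A quick way to check the grouped pattern is to run it against the valid values from the question plus a few inputs that should fail. This is only a sketch: the helper name is_valid and the list of invalid inputs are illustrative choices, not part of the original post.

```python
import re

# The grouped pattern: both anchors now sit outside the alternation.
fixed = re.compile(r"^(?:(ABC-\d{2,9})|(ABz?-\d{3}))$")

def is_valid(value):
    """True if the whole value matches one of the two alternatives."""
    return fixed.search(value) is not None

valid = ["ABC-12", "ABC-123456789", "AB-123", "ABz-123"]
invalid = ["ABC-12++", "ABC-12--", "AB-123++", "xxAB-123", "ABC-1", "ABC-1234567891"]

for value in valid + invalid:
    print(value, is_valid(value))
```

The two inner capturing groups are kept as in the answer, so m.group(1) or m.group(2) can still tell you which form matched if that is needed.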

Thank you @DuesserBaest for the feedback; I had misunderstood the question. Here is my second attempt. (I am not sure whether these are the results we are looking for.)

UPDATED PATTERN AND text:

import re

pattern = r"^(AB(?:C-\d{2,9}|z?-\d{3}))$"

text = '''
AB-1
AB-12
AB-123
AB-1234
AB-12345
AB-123456
AB-1234567
AB-123456789

ABC-1
ABC-12
ABC-123
ABC-1234
ABC-12345
ABC-123456
ABC-1234567
ABC-12345678
ABC-123456789
ABC-1234567891

ABz-1
ABz-12
ABz-123
ABz-1234
ABz-12345
ABz-123456
ABz-1234567
ABz-12345678
ABz-123456789
'''

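# re.M makes ^ and $ match at every line boundary, so each line is tested separately.
# With a single capturing group, findall returns that group's text for each match.
matchlist = re.findall(pattern, text, flags=re.M)

for match in matchlist:
    print(match)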

MATCHES:

AB-123
ABC-12
ABC-123
ABC-1234
ABC-12345
ABC-123456
ABC-1234567
ABC-12345678
ABC-123456789
ABz-123

UPDATED Link to test: https://regex101.com/r/INmKCh/2
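The findall call with re.M above is convenient for testing many lines at once. For validating one input parameter, a possible alternative (an assumption about the use case, since the question does not show the calling code) is re.fullmatch, which requires the entire string to match and so needs no anchors. The names ID_PATTERN and is_valid_id are hypothetical.

```python
import re

# Same alternatives as the updated pattern, without ^ and $:
# fullmatch only succeeds if the whole string is consumed.
ID_PATTERN = re.compile(r"AB(?:C-\d{2,9}|z?-\d{3})")

def is_valid_id(value):
    """True if value, in its entirety, is ABC-<2 to 9 digits> or AB/ABz-<3 digits>."""
    return ID_PATTERN.fullmatch(value) is not None

# The anchored form for comparison: without re.M, $ still matches
# just before a single trailing newline.
anchored = re.compile(r"^AB(?:C-\d{2,9}|z?-\d{3})$")

for value in ["ABC-12", "ABz-123", "ABC-12++", "AB-1234", "ABC-12\n"]:
    print(repr(value), is_valid_id(value), anchored.search(value) is not None)
```

The last input shows the one difference between the two approaches: the anchored pattern accepts "ABC-12\n" because $ matches before a final newline, whereas fullmatch rejects it.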
