First of all, I am not the one writing the regexps, so I can't just rewrite them. I am pulling in several JavaScript regexps and trying to parse them in Python, but the two languages seem to handle them differently. Testing the example regexp from W3Schools, JavaScript shows this:

var str="Visit W3Schools";
var patt1=/w3schools/i;
alert(str.match(patt1))

which alerts "W3Schools". However, in Python, I get:

import re
str="Visit W3Schools"
patt1=re.compile(r"/w3schools/i")
print(patt1.match(str))

which prints None. Is there some library I can use to convert the JavaScript regexps to Python ones?

asked Jun 27, 2012 at 16:18 by Skyler
  • Look up .match vs. .search. – Martijn Pieters Commented Jun 27, 2012 at 16:19
  • Please be careful using w3schools. – Pointy Commented Jun 27, 2012 at 16:23

3 Answers

In Python, .match only matches at the start of the string.

What you want to use instead is .search.

Moreover, you do not need to include the '/' characters, and you need to pass a separate argument to re.compile to make the search case-insensitive:

>>> import re
>>> str = "Visit W3Schools"
>>> patt1 = re.compile('w3schools', re.I)
>>> print(patt1.search(str))
<_sre.SRE_Match object at 0x10088e1d0>

In JavaScript, the slashes delimit a regex literal, which is the equivalent of calling re.compile.
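
To see both problems from the question side by side, here is a minimal sketch (written for Python 3; the variable names are only illustrative). It shows that the slashes and the trailing `i` are treated as literal characters when pasted into Python, and that `match` only succeeds when the pattern matches at position 0:

```python
import re

text = "Visit W3Schools"

# The JavaScript literal pasted verbatim: "/" and "i" are ordinary characters here.
literal = re.compile(r"/w3schools/i")
print(literal.search(text))            # None: the text contains no "/w3schools/i"

# Pattern body only, with the "i" flag passed as re.IGNORECASE.
pattern = re.compile(r"w3schools", re.IGNORECASE)
print(pattern.match(text))             # None: match() is anchored at the start
print(pattern.search(text).group())    # W3Schools
```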

I can recommend reading up on the Python regular expression module; there is even an excellent HOWTO.

You could write a small helper function so that flags such as /ig also work:

import re

def js_to_py_re(rx):
    # Drop the leading "/" and split the pattern body from the trailing flags.
    query, params = rx[1:].rsplit('/', 1)
    # The "g" flag means "all matches", so use findall; otherwise first match only.
    if 'g' in params:
        obj = re.findall
    else:
        obj = re.search

    # May need to make flags= smarter, but just an example...
    return lambda L: obj(query, L, flags=re.I if 'i' in params else 0)

print(js_to_py_re('/o/i')('school'))
# <_sre.SRE_Match object at 0x2d8fe68>

print(js_to_py_re('/O/ig')('school'))
# ['o', 'o']

print(js_to_py_re('/O/g')('school'))
# []
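
If you would rather get back a compiled pattern than a closure, the same idea can be sketched with an explicit flag table. This is only an illustration under stated assumptions: `parse_js_regex` and `JS_FLAGS` are hypothetical names, the flag mapping is my own reading of which JavaScript flags have a direct `re` counterpart, and the pattern body is passed through untranslated.

```python
import re

# Assumed mapping from JavaScript flag letters to Python re flags.
# "g" (global) and "y" (sticky) have no compile-time equivalent in re,
# and "u" is dropped because Python 3 str patterns are already Unicode.
JS_FLAGS = {"i": re.IGNORECASE, "m": re.MULTILINE, "s": re.DOTALL}


def parse_js_regex(literal):
    """Split a '/body/flags' literal; return (compiled pattern, is_global)."""
    if not literal.startswith("/") or "/" not in literal[1:]:
        raise ValueError("not a /pattern/flags literal: %r" % literal)
    # A flag string never contains "/", so the last slash ends the body.
    body, flags = literal[1:].rsplit("/", 1)
    py_flags = 0
    for letter in flags:
        if letter in JS_FLAGS:
            py_flags |= JS_FLAGS[letter]
        elif letter not in "guy":
            raise ValueError("unsupported flag: %r" % letter)
    return re.compile(body, py_flags), "g" in flags


pattern, is_global = parse_js_regex("/w3schools/i")
print(pattern.search("Visit W3Schools").group())   # W3Schools

pattern, is_global = parse_js_regex("/O/ig")
print(pattern.findall("school"), is_global)        # ['o', 'o'] True
```

The caller decides what "global" means (for example `findall` versus `search`). Because the body is not rewritten, syntax that differs between the two engines, such as named groups written `(?<name>...)` in JavaScript and `(?P<name>...)` in Python, would still need separate handling.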

You don't want to include the / characters and flags in the regexp, and you should use .search instead of .match for a substring match.

Try:

patt1 = re.compile(r"w3schools", flags=re.IGNORECASE)
srch = patt1.search(str)
print(srch.group())
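
One caveat with the snippet above: `search` returns `None` when nothing matches, so calling `.group()` on the result directly would raise `AttributeError`. A minimal sketch of a guarded version, where `first_match` is a hypothetical helper name:

```python
import re

patt1 = re.compile(r"w3schools", flags=re.IGNORECASE)


def first_match(pattern, text):
    """Return the matched substring, or None if the pattern is not found."""
    found = pattern.search(text)
    return found.group() if found else None


print(first_match(patt1, "Visit W3Schools"))   # W3Schools
print(first_match(patt1, "Visit MDN"))         # None
```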

Article tags: How to parse a Javascript regexp in Python - Stack Overflow