Is it possible in the current version of JFlex (1.9.1) to represent a range covering the full set of Unicode code points in a regular expression?

Something like this:

UnicodeIdentifier = [a-zA-Z_\u007F-\u10FFFF] [a-zA-Z0-9_\u007F-\u10FFFF]*

except that this does not work (and makes JFlex emit a warning): Unicode escape sequences in Java take exactly four hexadecimal digits (16 bits), so the upper bound is read as \u10FF, with the trailing FF left over as ordinary characters.

The spec says that representing supplementary characters in the range U+010000 to U+10FFFF requires two consecutive Unicode escapes (a surrogate pair). However, using this:

UnicodeIdentifier = [a-zA-Z_\u007F-\uDBFF\uDFFF] [a-zA-Z0-9_\u007F-\uDBFF\uDFFF]*

does not work either.
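
For reference, here is a small plain-Java sketch (not JFlex syntax) of where the \uDBFF\uDFFF pair in the second attempt comes from. The class name SurrogateDemo and the helper escapes are made up for illustration; only Character.toChars, Character.charCount and String.format are standard library calls. It prints the UTF-16 code units that Java uses for a given code point:

```java
public class SurrogateDemo {
    // Hypothetical helper: renders each UTF-16 code unit of a code point
    // as a backslash, the letter u, and four uppercase hex digits.
    static String escapes(int codePoint) {
        StringBuilder sb = new StringBuilder();
        // Character.toChars returns one char for BMP code points
        // and a high/low surrogate pair for supplementary ones.
        for (char unit : Character.toChars(codePoint)) {
            sb.append(String.format("\\u%04X", (int) unit));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(escapes(0x007F));   // one code unit
        System.out.println(escapes(0x10000));  // first supplementary code point
        System.out.println(escapes(0x10FFFF)); // last valid code point
    }
}
```

So U+10FFFF is two 16-bit units in Java source, which is presumably why a character class sees the range as ending at the first unit rather than at the code point I intended.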