How can I change this regular expression to remove everything from a string except alphabets and a '(single quote)?
pattern = /\b(ma?c)?([a-z]+)/ig;
- this pattern removes unwanted spaces and capitalizes the first letter and turns the rest into lower case
- By alphabets I mean English letters a-z.
- 2 What’s an alphabet? Like the Latin alphabet, the Greek alphabet, the Cyrillic alphabet? Is this legacy 7-bit data, or is it actually Unicode, which the web is now over 80% of? – tchrist Commented Feb 19, 2012 at 14:22
- 2 What is your current regexp about? More specifically, how's the \b(ma?c)? related to your need? – pimvdb Commented Feb 19, 2012 at 14:26
1 Answer
To remove characters, you'd need to use something that actually does that, like the string replace function (which can accept a regular expression as the "from" parameter).
Then you're just dealing with a normal application of a character class, which in JavaScript (and most other regular expression variants) is described using [...], where ... is what should be in the class. You'd use the ^ at the beginning to invert the meaning of the class.
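As a minimal sketch of the difference between a plain class and a negated one (the sample string and variable names are made up for illustration, they're not from the question):

```javascript
// Illustrative input only; not taken from the original question.
const sample = "a1b2c3";

// [0-9] matches any digit, so this removes the digits.
const withoutDigits = sample.replace(/[0-9]/g, "");

// [^0-9] matches anything that is NOT a digit, so this removes everything else.
const onlyDigits = sample.replace(/[^0-9]/g, "");

console.log(withoutDigits); // "abc"
console.log(onlyDigits);    // "123"
```

The g flag matters in both calls: without it, replace would stop after the first match.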
In your case, it might be:
str = str.replace(/[^A-Za-z']/g, "");
...which will replace everything except the English characters A-Z (ABCDEFGHIJKLMNOPQRSTUVWXYZ), a-z (abcdefghijklmnopqrstuvwxyz), and the single quote with nothing (i.e., remove it).
let str = "This is a test with the numbers 123 and a '.";
console.log("before:", str);
str = str.replace(/[^A-Za-z']/g, "");
console.log("after: ", str);
However, note that alphabetic characters not used in English will not be excepted, and there are a lot of those in the various languages used on the web (and even, perversely, in English, in "borrowed" words like "voilà" and "naïve").
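To see that limitation concretely, here's a quick sketch. The helper name keepEnglishLetters is invented for this example, and the accented letters are written as escapes so the input is unambiguously the precomposed form:

```javascript
// Hypothetical helper wrapping the A-Z / a-z / single-quote replace.
function keepEnglishLetters(s) {
  return s.replace(/[^A-Za-z']/g, "");
}

// "\u00E0" is à and "\u00EF" is ï (precomposed characters).
console.log(keepEnglishLetters("voil\u00E0"));  // "voil"
console.log(keepEnglishLetters("na\u00EFve"));  // "nave"
console.log(keepEnglishLetters("don't stop"));  // "don'tstop"
```

The accented letters are dropped along with the spaces and digits, because they fall outside the A-Z and a-z ranges.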
You've said you're okay with just English A-Z, but for others coming to this: In environments supporting ES2018 and above's Unicode property matching, you could handle anything considered "alphabetic" by Unicode instead of just A-Z by using the \p{Alpha} property. The \p means "matching this Unicode property" (as usual, the lowercase version \p means "matching" and the uppercase version \P means "not matching") and the {Alpha} means "alphabetic":
str = str.replace(/[^\p{Alpha}']/gu, "");
(Note that, again, \p{Alpha} means "alphabetic" but because it's in a negated character class, we're excluding alphabetic characters from the match, so they're the ones that are kept.)
Note the u flag on that, to enable newer Unicode features. That handles the "voilà" and "naïve" examples too:
let str = "This is a test with the numbers 123 and a ' and voilà and naïve.";
console.log("before:", str);
str = str.replace(/[^\p{Alpha}']/gu, "");
console.log("after: ", str);
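For what it's worth, here's a small sketch of how \p and \P relate, and of what happens if the u flag is left off. The helper name keepLetters is invented for this example, and it assumes an engine that supports ES2018 property escapes:

```javascript
// Hypothetical helper: keep Unicode alphabetic characters and the single quote.
function keepLetters(s) {
  return s.replace(/[^\p{Alpha}']/gu, "");
}

// \P{Alpha} (uppercase P) is the negation of \p{Alpha}, so outside a
// character class it matches every non-alphabetic character by itself.
const lettersOnly = "voil\u00E0 123!".replace(/\P{Alpha}/gu, "");

console.log(keepLetters("na\u00EFve, isn't it?")); // "naïveisn'tit"
console.log(lettersOnly);                          // "voilà"

// Without the u flag, \p is not a property escape. It is read as a plain "p",
// so this pattern only matches the literal text "p{Alpha}".
const withoutFlagMatchesLetter = /\p{Alpha}/.test("a");
const withoutFlagMatchesLiteral = /\p{Alpha}/.test("p{Alpha}");
console.log(withoutFlagMatchesLetter);  // false
console.log(withoutFlagMatchesLiteral); // true
```

So forgetting the u flag doesn't throw an error; the pattern just quietly stops matching letters, which is worth checking for if the replace appears to remove everything.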