I've been getting better at regex, but I've come up against something that is beyond what I'm currently able to do.
I want to build a function that tests (returns true or false) whether a word is found inside a string. But I don't want a positive match if the word is found inside another word. I would also like to build in the possibility of checking for pluralization.
Here are some examples of the results I'd expect to get:
Word to look for: "bar"
"Strings to search in" //what it should return as
"foo bar" //true
"foo bar." //true
"foo bar!" //true (would be true with any other punctuation before or after 'bar' too)
"foo bars." //true
"foo bares." //true (even though "bares" has a different meaning than "bars", I would be okay with this returning true, since I need to check for words that pluralize with "es" and I wouldn't expect a regex to know which words pluralize with "s" and which with "es")
"my name is bart simpson" //false (bar is actually part of "bart")
"bart simpson went to the bar." //true
I'll be using JavaScript/jQuery to check for matches.
Thanks so much for the help!
asked Feb 13, 2013 at 16:59 by rgbflawed
- So "child" is not expected to match "children", correct? – Álvaro González Commented Feb 13, 2013 at 17:01
- Yes, I would not expect "child" to match "children". – rgbflawed Commented Feb 13, 2013 at 17:03
- Pluralization is not easy to do with regular expressions. What about mouse/mice and colossus/colossi? – Halcyon Commented Feb 13, 2013 at 17:03
- Frits, I would not expect to match these types of pluralization words. – rgbflawed Commented Feb 13, 2013 at 17:04
- How about these cases: in5bar, bares_random, Ăbar – nhahtdh Commented Feb 13, 2013 at 17:14
3 Answers
var rgx = new RegExp('\\b' + word + '(?:es|s)?\\b');
rgx.test(string);
This gives the expected result for all of the strings in your question. \b is a "word boundary": a zero-width position between a word character (\w) and a non-word character (\W, which includes the period and exclamation point), or between a word character and the start or end of the string.
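To check that pattern against the strings from the question and the edge cases raised in the comments, here is a minimal sketch. The names `escapeRegExp` and `containsWord` are hypothetical helpers, not part of the answer; escaping the word is an added precaution in case it contains regex metacharacters.

```javascript
// Hypothetical helper: escape characters that are special inside a regex,
// so a search word such as "c++" cannot break the pattern.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Hypothetical wrapper around the \b-based pattern from the answer.
function containsWord(haystack, word) {
  var rgx = new RegExp('\\b' + escapeRegExp(word) + '(?:es|s)?\\b');
  return rgx.test(haystack);
}

// Strings from the question.
console.log(containsWord('foo bar', 'bar'));                       // true
console.log(containsWord('foo bares.', 'bar'));                    // true
console.log(containsWord('my name is bart simpson', 'bar'));       // false
console.log(containsWord('bart simpson went to the bar.', 'bar')); // true

// Edge cases from the comments: digits and "_" count as word characters,
// while non-ASCII letters such as "Ă" do not.
console.log(containsWord('in5bar', 'bar'));        // false
console.log(containsWord(',bares_random', 'bar')); // false
console.log(containsWord(',Ăbar', 'bar'));         // true
```

The match is case-sensitive as written; passing 'i' as the second argument to RegExp would also accept "Bar".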
This has already been answered and accepted, but I thought I'd provide a slightly over-engineered approach that does a better job of matching plural forms. Other than that, it uses exactly the same logic as @ExplosionPills' solution:
(function() {
    var isWord = function(word) { return /^[a-z]+$/i.test(word); },
        // Own-property check, so inherited keys such as "constructor" are not treated as exceptions
        hasOwn = Object.prototype.hasOwnProperty,
        exceptions = {
            man: 'men',
            woman: 'women',
            child: 'children',
            mouse: 'mice',
            tooth: 'teeth',
            goose: 'geese',
            foot: 'feet',
            ox: 'oxen'
        },
        // Returns a regex fragment matching the singular and plural forms of word
        pluralise = function(word) {
            word = word.toLowerCase();
            if (hasOwn.call(exceptions, word)) {
                // Exceptions
                return '(?:' + word + '|' + exceptions[word] + ')';
            } else if (word.match(/(?:x|s|[cs]h)$/)) {
                // Sibilants
                return word + '(?:es)?';
            } else if (word.match(/[^f]f$/)) {
                // Non-Geminate Labio-Dental Fricative (-f > -ves / -fs)
                return '(?:' + word + 's?|' + word.replace(/f$/, 'ves') + ')';
            } else if (word.match(/[^aeiou]y$/)) {
                // Close-Front Unround Pure Vowel (-Cy > -Cies)
                return '(?:' + word + '|' + word.replace(/y$/, 'ies') + ')';
            } else if (word.slice(-1) === 'o') {
                // Mid-Back Round Vowel (-o > -oes / -os)
                return word + '(?:e?s)?';
            } else {
                // Otherwise
                return word + 's?';
            }
        };
    String.prototype.containsNoun = function(singularNoun) {
        if (!isWord(singularNoun)) throw new TypeError('Invalid word');
        // Case-insensitive; the "g" flag is not needed for a single test()
        var check = new RegExp('\\b' + pluralise(singularNoun) + '\\b', 'i');
        return check.test(this);
    };
    // Registers an irregular plural, e.g. 'person'.pluralException('people')
    String.prototype.pluralException = function(plural) {
        if (!isWord(this) || !isWord(plural)) throw new TypeError('Invalid exception');
        var singular = this.toLowerCase();
        plural = plural.toLowerCase();
        if (!hasOwn.call(exceptions, singular)) {
            exceptions[singular] = plural;
        }
    };
})();
It extends the native String object, so you use it like so:
'Are there some foos in here?'.containsNoun('foo'); // True
See the gist for some quick-and-dirty unit testing done in Node.js.
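To make the pluralisation rules more concrete, the sketch below writes out by hand the regex fragments those rules would produce for a few sample nouns, then wraps them in word boundaries the same way containsNoun does. The `samples` table and the `matches` helper are hypothetical names used only for illustration.

```javascript
// Fragments written out by hand, following the rules in the answer:
// sibilant, -f, consonant + y, -o, and the exceptions table.
var samples = {
  box: 'box(?:es)?',
  leaf: '(?:leafs?|leaves)',
  city: '(?:city|cities)',
  potato: 'potato(?:e?s)?',
  mouse: '(?:mouse|mice)'
};

// Hypothetical helper: wrap a fragment in word boundaries and test it.
function matches(fragment, text) {
  return new RegExp('\\b' + fragment + '\\b', 'i').test(text);
}

console.log(matches(samples.leaf, 'Autumn leaves fall'));  // true
console.log(matches(samples.city, 'A tale of two cities')); // true
console.log(matches(samples.potato, 'Peel the potatoes'));  // true
console.log(matches(samples.mouse, 'Three blind mice'));    // true
console.log(matches(samples.box, 'He is a boxer'));         // false
```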
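/ (bar((e)?s)?)[ !?.]/
Depending on what exactly you need, this might work. It won't find both occurrences in a string like " bars bars ", because the space between them is consumed by the first match and cannot also start the second.
/ (bar((e)?s)?)(?=[ !?.])/
That should find both occurrences in " bars bars ", because the lookahead checks the trailing character without consuming it. Lookaheads have been available since JavaScript 1.5, which all current browsers support. Note that both patterns still require a leading space and a trailing space or punctuation mark, so a word at the very start or end of the string is not matched.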
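The difference between the two patterns can be seen by counting global matches. This is a minimal sketch; `countMatches` is a hypothetical helper, and the test string is padded with spaces because both patterns require a leading space.

```javascript
// Hypothetical helper: number of non-overlapping matches of a global regex.
function countMatches(text, regex) {
  var found = text.match(regex);
  return found ? found.length : 0;
}

var text = ' bars bars ';

// The trailing delimiter is consumed, so the shared space is used up.
var consuming = / (bar((e)?s)?)[ !?.]/g;

// The trailing delimiter is only asserted, so the shared space stays available.
var lookahead = / (bar((e)?s)?)(?=[ !?.])/g;

console.log(countMatches(text, consuming)); // 1
console.log(countMatches(text, lookahead)); // 2

// Without surrounding spaces neither pattern matches.
console.log(countMatches('bars bars', lookahead)); // 0
```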
Source: javascript - Match single word, with possible punctuation or pluralization at end (Regex) - Stack Overflow