I've been getting better at regex, but I've come up against something that is beyond what I'm currently able to do.

I want to build a function that tests (returns true or false) whether a word is found inside a string. But I don't want a positive match if the word only appears inside another word. I would also like to build in the possibility of checking for pluralization.

Here are some examples of the results I'd expect to get:

Word to look for: "bar"

"Strings to search in" //what it should return as

"foo bar" //true

"foo bar." //true

"foo bar!" //true (would be true with any other punctuation before or after 'bar' too)

"foo bars." //true

"foo bares." //true (even though "bares" has a different meaning than "bars", I would be okay with this returning true, since I need to check for words that pluralize with "es" and I wouldn't expect a regex to know which words pluralize with "s" and which with "es")

"my name is bart simpson" //false (bar is actually part of "bart")

"bart simpson went to the bar." //true

I'll be using JavaScript/jQuery to check for matches.

Thanks so much for the help!

asked Feb 13, 2013 at 16:59 by rgbflawed
  • So "child" is not expected to match "children", correct? – Álvaro González Commented Feb 13, 2013 at 17:01
  • Yes, I would not expect "child" to match "children". – rgbflawed Commented Feb 13, 2013 at 17:03
  • 4 Pluralization is not easy to do with regular expressions. What about mouse/mice and colossus/colossi? – Halcyon Commented Feb 13, 2013 at 17:03
  • Frits, I would not expect to match these types of pluralization words. – rgbflawed Commented Feb 13, 2013 at 17:04
  • 1 How about these cases: in5bar, bares_random, Ăbar – nhahtdh Commented Feb 13, 2013 at 17:14

3 Answers

var rgx = new RegExp('\\b' + word + '(?:es|s)?\\b');
rgx.test(string);

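This gives the expected result for all of the strings in your question, including false for "my name is bart simpson". \b is a "word boundary": a zero-width position between a word character (\w, that is [A-Za-z0-9_]) and a non-word character (\W, which includes the space, period and exclamation point), or the start or end of the string next to a word character. Note that word is inserted into the pattern unescaped, so it should not contain regex metacharacters.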
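To make the behaviour concrete, here is a minimal sketch that wraps the pattern above in a function and runs it against the strings from the question and the edge cases raised in the comments. The names containsWord and escapeRegExp are hypothetical helpers, not part of the original answer, and the escaping step is an added assumption so that words containing regex metacharacters are matched literally.

```javascript
// Hypothetical helper: escape regex metacharacters so `word` is matched literally.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Hypothetical wrapper around the pattern from the answer.
// True if `word`, optionally followed by "s" or "es", appears as a whole word.
// The match is case-sensitive, as in the answer.
function containsWord(text, word) {
  var rgx = new RegExp('\\b' + escapeRegExp(word) + '(?:es|s)?\\b');
  return rgx.test(text);
}

// Strings from the question
console.log(containsWord('foo bar', 'bar'));                       // true
console.log(containsWord('foo bar.', 'bar'));                      // true
console.log(containsWord('foo bar!', 'bar'));                      // true
console.log(containsWord('foo bars.', 'bar'));                     // true
console.log(containsWord('foo bares.', 'bar'));                    // true
console.log(containsWord('my name is bart simpson', 'bar'));       // false
console.log(containsWord('bart simpson went to the bar.', 'bar')); // true

// Edge cases from the comments: digits and "_" count as word characters,
// while non-ASCII letters such as "Ă" do not (without the `u` flag).
console.log(containsWord('in5bar', 'bar'));       // false
console.log(containsWord('bares_random', 'bar')); // false
console.log(containsWord('Ăbar', 'bar'));         // true
```

Whether the last three results are acceptable depends on the input; \b only knows about ASCII word characters, so text with accented letters may need a different boundary test.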

This has already been answered and accepted, but I thought I'd provide a slightly over-engineered approach that does a better job of matching plural forms. Other than that, it uses exactly the same logic as @ExplosionPills' solution:

(function() {
  var isWord = function(word) { return /^[a-z]+$/i.test(word); },

      exceptions = {
        man:   'men',
        woman: 'women',
        child: 'children',
        mouse: 'mice',
        tooth: 'teeth',
        goose: 'geese',
        foot:  'feet',
        ox:    'oxen'
      },

      pluralise = function(word) {
        word = word.toLowerCase();

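        if (Object.prototype.hasOwnProperty.call(exceptions, word)) {
          // Exceptions (own keys only, so inherited names such as "constructor" are ignored)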
          return '(?:' + word + '|' + exceptions[word] + ')';

        } else if (word.match(/(?:x|s|[cs]h)$/)) {
          // Sibilants
          return word + '(?:es)?';

        } else if (word.match(/[^f]f$/)) {
          // Non-Geminate Labio-Dental Fricative (-f > -ves / -fs)
          return '(?:' + word + 's?|' + word.replace(/f$/, 'ves') + ')';

        } else if (word.match(/[^aeiou]y$/)) {
          // Close-Front Unround Pure Vowel (-Cy > -Cies)
          return '(?:' + word + '|' + word.replace(/y$/, 'ies') + ')';

        } else if (word.slice(-1) == 'o') {
          // Mid-Back Round Vowel (-o > -oes / -os)
          return word + '(?:e?s)?';

        } else {
          // Otherwise
          return word + 's?';
        }
      };

  String.prototype.containsNoun = function(singularNoun) {
    if (!isWord(singularNoun)) throw new TypeError('Invalid word');
    var check = new RegExp('\\b' + pluralise(singularNoun) + '\\b', 'gi');
    return check.test(this);
  };

  String.prototype.pluralException = function(plural) {
    if (!isWord(this) || !isWord(plural)) throw new TypeError('Invalid exception');

    var singular = this.toLowerCase();
    plural = plural.toLowerCase();

    if (!(singular in exceptions)) {
      exceptions[singular] = plural;
    }
  };
})();

It extends the native String object, so you use it like so:

'Are there some foos in here?'.containsNoun('foo'); // True

See the gist for some quick-and-dirty unit testing done in Node.js.
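If extending String.prototype is not desirable, the same rules can be sketched as standalone functions. This is only an illustration under assumptions: pluralPattern, containsNoun and DEFAULT_EXCEPTIONS are hypothetical names, and the exception table here is a shortened example rather than the one from the answer.

```javascript
// Hypothetical, shortened exception table; callers can pass their own.
var DEFAULT_EXCEPTIONS = { man: 'men', child: 'children', mouse: 'mice' };

// Build a regex fragment matching the singular noun and its likely plural,
// following the same rule order as the answer above.
function pluralPattern(word, exceptions) {
  word = word.toLowerCase();
  if (Object.prototype.hasOwnProperty.call(exceptions, word)) {
    return '(?:' + word + '|' + exceptions[word] + ')';        // irregular plural
  }
  if (/(?:x|s|[cs]h)$/.test(word)) {
    return word + '(?:es)?';                                    // box -> boxes
  }
  if (/[^f]f$/.test(word)) {
    return '(?:' + word + 's?|' + word.replace(/f$/, 'ves') + ')'; // wolf -> wolves / wolfs
  }
  if (/[^aeiou]y$/.test(word)) {
    return '(?:' + word + '|' + word.replace(/y$/, 'ies') + ')';   // city -> cities
  }
  if (/o$/.test(word)) {
    return word + '(?:e?s)?';                                   // foo -> foos / fooes
  }
  return word + 's?';                                           // bar -> bars
}

// True if the noun, or a plural form produced by the rules above, occurs as a
// whole word in `text`. Matching is case-insensitive.
function containsNoun(text, singularNoun, exceptions) {
  if (!/^[a-z]+$/i.test(singularNoun)) throw new TypeError('Invalid word');
  var pattern = pluralPattern(singularNoun, exceptions || DEFAULT_EXCEPTIONS);
  return new RegExp('\\b' + pattern + '\\b', 'i').test(text);
}

console.log(containsNoun('Are there some foos in here?', 'foo')); // true
console.log(containsNoun('Two children played', 'child'));        // true
console.log(containsNoun('three wolves', 'wolf'));                // true
console.log(containsNoun('many cities', 'city'));                 // true
console.log(containsNoun('my name is bart simpson', 'bar'));      // false
```

Passing the exception table as an argument keeps the state out of a closure, at the cost of a slightly longer call.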

/ (bar((e)?s)?)[ !?.]/

Depending on exactly what you need, this might work. It won't find both occurrences in a string such as " bars bars ", because the first match consumes the space that the second match needs in front of it.

/ (bar((e)?s)?)(?=[ !?.])/

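That should find both occurrences in " bars bars " (two matches), because the lookahead does not consume the character after the word. Lookahead has been available since JavaScript 1.5, which all current browsers support.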
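A short sketch of the difference between the two patterns, using a global match to count occurrences. The variable names are made up for illustration. It also shows a limitation that both patterns share: they require a space before the word and one of " !?." after it, so a word at the very start or end of the string is not matched, whereas a \b-based pattern handles that case.

```javascript
var consuming = / (bar((e)?s)?)[ !?.]/g;      // trailing character is part of the match
var lookahead = / (bar((e)?s)?)(?=[ !?.])/g;  // trailing character is only checked
var text = ' bars bars ';

// With the `g` flag, match() returns every full match.
console.log(text.match(consuming).length); // 1
console.log(text.match(lookahead).length); // 2

// Shared limitation: no match at the start or end of the string.
console.log(/ (bar((e)?s)?)(?=[ !?.])/.test('foo bar'));  // false
console.log(/ (bar((e)?s)?)(?=[ !?.])/.test('bar none')); // false
console.log(/\bbar(?:es|s)?\b/.test('foo bar'));          // true
```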
