The current REGEX I'm using is the following one:

var sentences = fulltext.match(/[^\.!\?]+[\.!\?]+/g);

That returns an array of the sentences, INCLUDING the spaces between them (I need all the characters). The problem is that it does not work with the ellipsis "...", and I guess it does not work with other unconventional forms of punctuation either.

How can I fix my REGEX to match this and other forms of punctuation?

Is there any noob-friendly, example-driven guide to REGEX out there?

asked Jan 25, 2014 at 22:54 by Belohlavek; edited Jan 26, 2014 at 1:53 by BenMorel
  • The ellipsis also has its own character / code point, U+2026 or \u2026, which is distinct from three consecutive periods (U+002E). – Jonathan Lonowski, Jan 25, 2014 at 22:58
  • Possible duplicate of "Javascript regular expression for punctuation (international)?" – Jonathan Lonowski, Jan 25, 2014 at 23:06

2 Answers

The Unicode code point of the ellipsis character is U+2026, written \u2026 in JavaScript.

So you can add \u2026 to both character classes to match an ellipsis.

Code:

var fulltext= "First sentence… Second sentence. ";
fulltext.match(/([^.?!;\u2026]+[.?!;\u2026]+)/g);

Output:

["First sentence…", " Second sentence."]
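To see why the question's pattern fails, it may help to compare three typed periods with the single ellipsis character. The sketch below reuses the regex from the question; the variable names (`original`, `withDots`, `withChar`, `dotsResult`, `charResult`) are illustrative only and not part of the original post.

```javascript
// The regex from the question, unchanged.
const original = /[^\.!\?]+[\.!\?]+/g;

// Three separate periods (U+002E) versus one ellipsis character (U+2026).
const withDots = "First sentence... Second sentence. ";
const withChar = "First sentence\u2026 Second sentence. ";

// "..." is three characters, "\u2026" is one.
console.log("...".length, "\u2026".length); // 3 1

// Three periods are already covered, because [\.!\?]+ accepts repeats.
const dotsResult = withDots.match(original);
console.log(dotsResult); // ["First sentence...", " Second sentence."]

// U+2026 is not in the terminator class, so it is treated as ordinary
// text and both sentences end up in a single match.
const charResult = withChar.match(original);
console.log(charResult); // ["First sentence… Second sentence."]
```

So the pattern only needs U+2026 added to both character classes; runs of plain periods were already handled.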


You can just add the ellipsis (and any other punctuation characters) to your character sets.

var input = "First sentence… Second sentence. ";
input.match(/[^\.\?!;…]+[\.\?!;…]+/g);

Result:

["First sentence…", " Second sentence."]
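Note that the result above drops the trailing space after the last period, while the question asks to keep all characters. One possible variation, offered as a sketch rather than a definitive fix, adds a second alternative that picks up any text left over after the last terminator. The names `SENTENCE_RE` and `splitSentences` are hypothetical, and the terminator set (`.`, `!`, `?`, U+2026) is an assumption; add `;` or other characters to both classes as needed.

```javascript
// Alternative 1: optional text followed by one or more terminators.
// Alternative 2: leftover text at the very end with no terminator.
// Each alternative consumes at least one character, so no empty matches occur.
const SENTENCE_RE = /[^.!?\u2026]*[.!?\u2026]+|[^.!?\u2026]+$/g;

// Hypothetical helper: returns the pieces, or an empty array when
// match() finds nothing (for example, on an empty string).
function splitSentences(text) {
  return text.match(SENTENCE_RE) || [];
}

const sample = "First sentence\u2026 Second sentence. ";
const pieces = splitSentences(sample);
console.log(pieces); // ["First sentence…", " Second sentence.", " "]

// Joining the pieces gives back the original string, so nothing is lost.
console.log(pieces.join("") === sample); // true
```

Because every character is either a terminator or not, each one falls into exactly one piece, which is what makes the split lossless for this terminator set.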
