I want to receive a lot of text (e.g. a book chapter), and create an array of the sentences.

My current code is:

text.match( /[^\.!\?]+[\.!\?]+["']?/g );

This only works when the text ends with one of [. ! ?]. If the final sentence has no punctuation at the end, it's lost.

How do I split my text into sentences, allowing for the final sentence to not have punctuation?

asked Dec 4, 2016 at 11:28 by Mirror318
  • Does the final sentence have a line break? – jstice4all Commented Dec 4, 2016 at 11:30
  • add \n i.e new line – SaidbakR Commented Dec 4, 2016 at 11:30
  • \n works only if there are no other line breaks in the text, which sounds unlikely. – JJJ Commented Dec 4, 2016 at 11:32
  • You may include an example, you'll get more relevant answers – Thomas Ayoub Commented Dec 4, 2016 at 11:35
  • What about if you have abbreviations in your sentences? – flec Commented Dec 4, 2016 at 11:37

4 Answers

Use $ to match the end of the string:

/[^\.!\?]+[\.!\?]+["']?|.+$/g

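Or maybe you only want to allow whitespace characters at the end (note that this alternative matches only trailing whitespace, so an unpunctuated final sentence is still not captured):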

/[^\.!\?]+[\.!\?]+["']?|\s*$/g
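As a rough sketch of the `$` idea, the helper below (the name `splitSentences` is hypothetical, not from the answer) collects every punctuated sentence plus whatever remains before the end of the string. It assumes `[^.!?]+$` for the remainder instead of `.+$`, because `.` does not match line breaks by default and a final sentence spanning several lines would otherwise be cut. Trimming and dropping empty results is also an assumption about what the asker wants.

```javascript
// Hypothetical helper: split text into sentences, keeping a final
// sentence that has no closing punctuation.
function splitSentences(text) {
  // First alternative: a run of non-terminators, one or more terminators,
  // and an optional closing quote.
  // Second alternative: whatever is left before the end of the string.
  const pattern = /[^.!?]+[.!?]+["']?|[^.!?]+$/g;
  // match() returns null when nothing matches, so fall back to an empty array.
  const matches = text.match(pattern) || [];
  // Strip surrounding whitespace and drop whitespace-only leftovers.
  return matches.map((s) => s.trim()).filter((s) => s.length > 0);
}

console.log(splitSentences('He said "Stop!" Then he left. No ending here'));
// [ 'He said "Stop!"', 'Then he left.', 'No ending here' ]
```

Text that ends in trailing whitespace, such as "One. ", yields just `[ 'One.' ]` because the whitespace-only remainder is filtered out.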

It depends on the characters in the text but

text.match( /[^\.!\?]+[\.!\?]+|[^\.!\?]+/g );

can do the job.

(If it doesn't work, could you provide a few sentences that it fails to match?)
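To illustrate how the result "depends on the characters in the text", here is a small comparison, assuming a sample string with a quoted question. The variable names are made up for the example. Without the optional `["']?` part, a closing quote is pushed to the start of the next match.

```javascript
// Pattern from this answer: no special handling of closing quotes.
const withoutQuote = /[^.!?]+[.!?]+|[^.!?]+/g;
// Same pattern with the optional closing quote from the question's regex.
const withQuote = /[^.!?]+[.!?]+["']?|[^.!?]+/g;

const sample = 'She asked "Why?" Nobody answered';

const plainResult = sample.match(withoutQuote);
const quotedResult = sample.match(withQuote);

console.log(plainResult);  // [ 'She asked "Why?', '" Nobody answered' ]
console.log(quotedResult); // [ 'She asked "Why?"', ' Nobody answered' ]
```

Both variants keep the final sentence even though it has no punctuation; they differ only in where the closing quote ends up.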

If you don't need the punctuation of your sentences in your result, you can just use "split":

var txt = "One potato. Two Potato. Three";
txt.split( /[\.!\?]+/ );
// [ 'One potato', ' Two Potato', ' Three' ]
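One thing to watch with split: when the text does end with punctuation, the result contains a trailing empty string, and each piece keeps its leading space. A minimal sketch of a cleanup step follows; the function name `splitWithoutPunctuation` is hypothetical.

```javascript
// Hypothetical helper: split on sentence terminators, then tidy the pieces.
function splitWithoutPunctuation(text) {
  return text
    .split(/[.!?]+/)              // cut at each run of . ! or ?
    .map((s) => s.trim())         // remove leading/trailing whitespace
    .filter((s) => s.length > 0); // drop the empty piece after final punctuation
}

console.log("One potato. Two Potato. Three!".split(/[.!?]+/));
// [ 'One potato', ' Two Potato', ' Three', '' ]

console.log(splitWithoutPunctuation("One potato. Two Potato. Three!"));
// [ 'One potato', 'Two Potato', 'Three' ]
```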

You can just use [^\.!\?]+, you don't need the rest:

var text = "Mr. Brown Fox. hello world. hi again! hello one more time";
console.log(text.match(/[^\.!\?]+/g));
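Note that any of these patterns cuts "Mr." off as its own piece, which is the abbreviation problem raised in the comments. One possible workaround, sketched here under the assumption that a small hand-written abbreviation list is acceptable, is to glue a piece onto the next one whenever it ends in a known abbreviation. The `ABBREVIATIONS` set and the `splitWithAbbreviations` name are hypothetical, and the list only covers single-period abbreviations.

```javascript
// Hypothetical, deliberately short list; extend it for your own text.
const ABBREVIATIONS = new Set(["Mr.", "Mrs.", "Ms.", "Dr.", "Prof."]);

function splitWithAbbreviations(text) {
  // Keep the punctuation, and keep an unpunctuated remainder at the end.
  const pieces = text.match(/[^.!?]+[.!?]+["']?|[^.!?]+$/g) || [];
  const sentences = [];
  let buffer = "";

  for (const piece of pieces) {
    buffer += piece;
    // Look at the last whitespace-separated word collected so far.
    const lastWord = buffer.trim().split(/\s+/).pop();
    if (ABBREVIATIONS.has(lastWord)) {
      continue; // not a real sentence end; keep accumulating
    }
    const sentence = buffer.trim();
    if (sentence.length > 0) sentences.push(sentence);
    buffer = "";
  }

  // Flush anything left over (e.g. text that ends with an abbreviation).
  if (buffer.trim().length > 0) sentences.push(buffer.trim());
  return sentences;
}

console.log(
  splitWithAbbreviations("Mr. Brown Fox. hello world. hi again! hello one more time")
);
// [ 'Mr. Brown Fox.', 'hello world.', 'hi again!', 'hello one more time' ]
```

This is only a heuristic: a sentence that genuinely ends with "Dr." would be merged with the following one.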
