JavaScript: how to regex-match the first word and last word from HTML?

For example, in the sentence below, I want to get the words The and underway. Thanks.

<p>The Stack Overflow 2011 community moderator election is underway</p>
//should consider the different html tags.

asked Nov 9, 2011 at 16:31 by cj333
  • Don't use regexes on html. Use DOM operations. Once you've extracted a string from HTML, then it's acceptable... but not on raw html. – Marc B Commented Nov 9, 2011 at 16:35
  • wow - you're just asking for disappointment to only use JavaScript and regex... are you open to using / able to use jQuery or a similar library? – Code Jockey Commented Nov 9, 2011 at 16:36
  • What is your current approach/solution and what exactly isn't working as intended? – Andreas Commented Nov 9, 2011 at 16:36
  • @CodeJockey: why is he asking for disappointment by only using JavaScript and regex? You don't need libraries for simple stuff like this. – Andy E Commented Nov 9, 2011 at 16:41
  • @AndyE My statement was based on assumptions about actual practical use. There are many things (some of which you point out in your answer) that will cause trouble and would be more easily resolved with various tools (jQuery, et al. could most likely be avoided). Handling punctuation would be a good use for regex, and sub-tags (like <em>) would be handled somewhat by using innerText or the like, rather than innerHTML. It's more complicated than just the example in the question, but if it works for him, good for both of you! – Code Jockey Commented Nov 9, 2011 at 16:59

2 Answers

If the element is already in the document, you can get its text content and split it based on " ":

var text  = "textContent" in document.body ? "textContent" : "innerText",
    el    = document.getElementById("myElement"),
    arr   = el[text].split(" "),
    first = arr.shift(),
    last  = arr.pop();

alert("1st word is '"+first+"', last word is '"+last+"'.");

If it's not already an element, make it one:

var arr, first, last,
    // Older IE lacks textContent, so fall back to innerText
    text = "textContent" in document.body ? "textContent" : "innerText",
    html = "<p>The Stack Overflow 2011 community moderator election is underway</p>",
    el   = document.createElement("div");

// Let the browser parse the markup, then read back only the text
el.innerHTML = html;
arr   = el[text].split(" ");
first = arr.shift();
last  = arr.pop();

alert("1st word is '"+first+"', last word is '"+last+"'.");  

Note: this doesn't take punctuation characters into account. You might want to remove some of them from the string with a simple regex before splitting. Also, if there is only a single word in the text, last will be undefined. If there are no words, first will be an empty string, "", and last will still be undefined.
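As a rough sketch of the punctuation handling mentioned in the note, the text can be matched for word-like runs instead of being split on spaces. The helper name firstAndLastWord is hypothetical (it is not part of the answer above), and the pattern assumes that "word" means ASCII letters, digits, underscores and apostrophes, which will not suit every language:

```javascript
// Hypothetical helper: takes text that has already been extracted from the
// element (for example via textContent) and returns its first and last word.
function firstAndLastWord(text) {
    // Assumption: a word is a run of ASCII letters, digits, underscores or
    // apostrophes. Punctuation and whitespace act as separators.
    var words = text.match(/[\w']+/g) || [];
    return {
        first: words.length ? words[0] : undefined,
        last:  words.length ? words[words.length - 1] : undefined
    };
}

var result = firstAndLastWord("  The Stack Overflow 2011 community moderator election is underway.  ");
// result.first is "The" and result.last is "underway"
```

Unlike the shift/pop version, a text with a single word gives the same word for both first and last, and an empty text gives undefined for both.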

You don't need regular expressions:

var words = "The Stack Overflow 2011 community moderator election is underway".split(" "),
    first = words[0],
    last = words[words.length-1];
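Splitting on a single space assumes the string has no leading, trailing, or repeated whitespace, which is often not true of text pulled out of markup. A minimal variation, assuming any run of whitespace separates words, might look like this (splitWords is a hypothetical name):

```javascript
// Hypothetical helper: split on any run of whitespace, ignoring the edges.
function splitWords(text) {
    var trimmed = text.trim();
    // "".split(/\s+/) would return [""], so handle the empty case explicitly.
    return trimmed === "" ? [] : trimmed.split(/\s+/);
}

var words = splitWords("  The Stack Overflow 2011\n community moderator election is underway ");
var first = words[0];                 // "The"
var last  = words[words.length - 1];  // "underway"
```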
