I have this string:

var string = '<article><img alt="Ice-cream" src="http://placehold.it/300x300g"><div style="float: right; width: 50px;"><p>Lorem Ipsum </p></div></article>';

and I am trying to extract the text from it like this:

var $str = $(string).text();
console.log($str)

but since I am concerned about performance, because there are a huge number of strings with large amounts of text, I would like to do this natively, without jQuery.

How is this possible?

Asked Jul 21, 2013 at 20:40 by jQuerybeast

3 Answers

Let the browser do the parsing, using this trick:

var str = '<article><img alt="Ice-cream" src="http://placehold.it/300x300g">' +
    '<div style="float: right; width: 50px;"><p>Lorem Ipsum </p></div></article>';

var dummyNode = document.createElement('div'),
    resultText = '';

dummyNode.innerHTML = str;
resultText = dummyNode.innerText || dummyNode.textContent;

This creates a dummy DOM element and sets its HTML content to the input string.
The text alone can then be read from the DOM property innerText or textContent.

This is also more robust than hand-written string processing, because the browser's HTML parser already handles the edge cases.
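Since the question is about a large number of strings, one possible refinement is to create the dummy node once and reuse it for every string. The following is only a sketch under that assumption and has not been benchmarked; `makeTextExtractor` and its optional `doc` parameter are hypothetical names introduced here for illustration, and a browser environment is assumed.

```javascript
// Hypothetical helper: returns a function that reuses one detached <div>.
// `doc` is an illustrative parameter; it defaults to the page's document.
function makeTextExtractor(doc) {
  var node = (doc || document).createElement('div');

  return function extractText(html) {
    node.innerHTML = html;        // the browser parses the markup
    var text = node.textContent || node.innerText || '';
    node.innerHTML = '';          // drop the parsed subtree before the next call
    return text;
  };
}
```

In a page, `makeTextExtractor()` would be called once and the returned function applied to each string in turn, for example with `Array.prototype.map`. Whether this is measurably faster than creating a new element per string is something to profile on real data.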

You need a global search that matches any characters, any number of times, between < and >:

<script type="text/javascript">

var str = '<article><img alt="Ice-cream" src="http://placehold.it/300x300g"><div style="float: right; width: 50px;"><p>Lorem Ipsum </p></div></article>';
// Lazily match everything from "<" to the next ">"; the g flag applies it to every tag.
var patt = /<.*?>/g;

var result = str.replace(patt, "");
console.log(result);

</script>
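The same replacement can be wrapped in a small function so that it can be applied to many strings. This is a minimal sketch: `stripTags` is a hypothetical name, and `[^>]*` is used here as another way of writing "everything up to the next `>`". The second call uses a made-up input to show where this kind of pattern stops being reliable.

```javascript
// Hypothetical helper: removes everything between "<" and the next ">".
function stripTags(html) {
  return html.replace(/<[^>]*>/g, '');
}

var str = '<article><img alt="Ice-cream" src="http://placehold.it/300x300g">' +
  '<div style="float: right; width: 50px;"><p>Lorem Ipsum </p></div></article>';

console.log(stripTags(str)); // "Lorem Ipsum "

// Limitation: a ">" inside an attribute value ends the match too early,
// so part of the tag is left behind.
console.log(stripTags('<img alt="a > b">text')); // ' b">text'
```

Unlike the DOM-based approach, a regex also leaves character references such as `&amp;` undecoded.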

You can use a regex to get the text from a string that contains HTML tags.

<script type="text/javascript">

    // Must be a regex literal, not a string; the g flag replaces every tag, not just the first.
    var regex = /<(.|\n)*?>/g;
    var string = '<article><img alt="Ice-cream" src="http://placehold.it/300x300g"><div style="float: right; width: 50px;"><p>Lorem Ipsum </p></div></article>';
    var result = string.replace(regex, "");
    alert(result); // result should be "Lorem Ipsum "

</script>

This replaces every HTML tag with an empty string.
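Whether the pattern carries a `g` flag, and whether it uses `(.|\n)` or a plain `.`, both affect the result. The sketch below uses made-up input strings to show what each choice changes.

```javascript
var html = '<p>Lorem</p> <b>Ipsum</b>';

// Without the g flag only the first tag is removed.
var firstOnly = html.replace(/<(.|\n)*?>/, '');
console.log(firstOnly); // 'Lorem</p> <b>Ipsum</b>'

// With the g flag every tag is removed.
var allTags = html.replace(/<(.|\n)*?>/g, '');
console.log(allTags); // 'Lorem Ipsum'

// A tag that spans two lines: "." alone does not match "\n",
// so the opening tag is missed and only "</div>" is removed.
var multiline = '<div\n class="box">Lorem Ipsum</div>';
var dotOnly = multiline.replace(/<.*?>/g, '');
var dotOrNewline = multiline.replace(/<(.|\n)*?>/g, '');
console.log(dotOrNewline); // 'Lorem Ipsum'
```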
