DOM parsing in JavaScript - Stack Overflow


Some background:
I'm developing a web-based mobile application in JavaScript. HTML rendering is Safari-based. The cross-domain policy is disabled, so I can call other domains with XMLHttpRequest. The idea is to parse external HTML and get the text content of a specific element.
In the past I parsed the text line by line, found the line I needed, and then took the tag's content as a substring of that line. This is very troublesome and requires a lot of maintenance every time the target HTML changes.
So now I want to parse the HTML text into a DOM and run CSS or XPath queries on it.
It works well:

$('<div></div>').append(htmlBody).find('#theElementToFind').text()

The only problem is that when I use the browser to load the HTML text into a DOM element, it tries to load all external resources (images, JS files, etc.). Although this isn't causing any serious problem, I would like to avoid it.

Now the question:
How can I parse HTML text into a DOM without the browser loading external resources or running scripts?
Some ideas I've been thinking about:

Creating a new document object with document.implementation.createDocument(), but I'm not sure it will skip loading external resources.
Using a third-party DOM parser written in JS. The only one I've tried handled errors very badly.
Using an iframe to create a new document, so that external resources with relative paths don't throw errors in the console.

asked Aug 15, 2012 at 9:30 by m_vitaly

2 Answers

It seems that the following piece of code works great:

var doc = document.implementation.createHTMLDocument("");
doc.documentElement.innerHTML = htmlBody;
var text = $(doc).find('#theElementToFind').text();

External resources aren't loaded, and scripts aren't evaluated.

Found it here: https://stackoverflow.com/a/9251106/95624

Origin: https://developer.mozilla.org/en/DOMParser#DOMParser_HTML_extension_for_other_browsers
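For anyone who wants the same thing without jQuery, here is a minimal sketch that tries DOMParser with "text/html" first and falls back to the createHTMLDocument approach above. The names parseInert and textOf, and the env parameter, are made up for illustration and are not part of any library. I'm also assuming that older engines may throw or return null when asked to parse "text/html", which is what the fallback guards against.

```javascript
// Sketch only: parseInert, textOf and the `env` parameter are illustrative
// names, not part of jQuery or the DOM API.
function parseInert(html, env) {
  // `env` defaults to the browser globals. Passing it explicitly is an
  // assumption made here so the helper can be exercised with stand-ins.
  env = env || { DOMParser: window.DOMParser, document: window.document };
  var doc = null;
  if (env.DOMParser) {
    try {
      doc = new env.DOMParser().parseFromString(html, "text/html");
    } catch (e) {
      doc = null; // assumed: some older engines reject "text/html"
    }
  }
  if (!doc) {
    // Fallback: the createHTMLDocument approach from the answer above.
    doc = env.document.implementation.createHTMLDocument("");
    doc.documentElement.innerHTML = html;
  }
  return doc;
}

// Return the text content of the first element matching a CSS selector,
// or null when nothing matches.
function textOf(html, selector, env) {
  var el = parseInert(html, env).querySelector(selector);
  return el ? el.textContent : null;
}
```

In a page this would be called as textOf(htmlBody, '#theElementToFind'). Both paths build a document that is detached from the page, which is why nothing should be fetched or executed, but it is worth confirming in the target browser's network panel.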

You can construct a jQuery object from any HTML string without appending it to the DOM:

$(htmlBody).find('#theElementToFind').text();
