I'd like to parse a string and build a DOM tree out of it. I decided to use the DocumentFragment API, and this is what I have so far:

var htmlString = "Some really really complicated html string that only can be parsed by a real browser!";
var fragment = document.createDocumentFragment();
var tempDiv = document.createElement('div');
fragment.appendChild(tempDiv);
tempDiv.innerHTML = htmlString;
console.log(tempDiv);

The problem is that this script causes my browser (Chrome specifically) to send actual HTTP requests. What do I mean? Take this as an example:

var htmlString = '<img src="somewhere/odd/on/the/internet" alt="alt?" />';
var fragment = document.createDocumentFragment();
var tempDiv = document.createElement('div');
fragment.appendChild(tempDiv);
tempDiv.innerHTML = htmlString;
console.log(tempDiv);

Which leads to the browser requesting the image URL.

Are there any workarounds for this, or is there a better way to parse an HTML string?

Asked Oct 5, 2012 at 13:31 by Sepehr; edited Jun 20, 2020 at 9:12.

4 Answers

Well, you are creating the element in the live document, so of course the browser is going to fetch the content, even if the element never reaches the page.

You can look into using DOMParser:

var htmlString ='<img src="somewhere/odd/on/the/internet" alt="alt?" />';
var parser = new DOMParser();
var doc = parser.parseFromString(htmlString , "text/html");

The MDN documentation page for DOMParser includes code to support browsers that do not natively support it.
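A minimal sketch of how the parsed document might then be inspected, assuming a browser environment. The helper name `listImageSources` is made up for illustration, and the parser is passed in as an argument so the function works with any DOMParser-compatible object. As far as I know, a document produced by DOMParser has no browsing context, so its images are not loaded.

```javascript
// Hypothetical helper: parse an HTML string into a separate document
// and return the raw src attribute of every <img> it contains.
// `parser` is any object with a DOMParser-compatible parseFromString method.
function listImageSources(htmlString, parser) {
  const doc = parser.parseFromString(htmlString, 'text/html');
  // getAttribute returns the attribute text exactly as written;
  // reading img.src instead would return a resolved, absolute URL.
  return Array.from(doc.querySelectorAll('img'), (img) => img.getAttribute('src'));
}

// Only runs where DOMParser exists (for example, in a browser).
if (typeof DOMParser !== 'undefined') {
  const html = '<img src="somewhere/odd/on/the/internet" alt="alt?" />';
  console.log(listImageSources(html, new DOMParser()));
}
```

Watching the network panel while this runs is a quick way to confirm that no request is made in the browser you care about.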

I found the answer to my question in another answer here on Stack Overflow. It consists of a piece of code that parses HTML using native browser functionality, but in a semi-sandboxed environment that doesn't send HTTP requests. I hope it helps others as well.

I took a modified version of the approach in the answer linked from the accepted answer, because I don't like the idea of creating an iframe, running the string through a BUNCH of regular expressions, and then putting the result into the DOM.

I needed to preprocess some HTML coming in from an Ajax request (this particular HTML has images with relative paths, and the page making the Ajax request is not in the same directory as the HTML) and turn the resource paths into absolute paths.

My code looks something like this:

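// Rename src so the browser has nothing to fetch when the HTML is inserted.
var dataSrcStr = data.replace(/src=/g, 'data-src=');
var myContainer = document.getElementById('mycontainer');
myContainer.innerHTML = dataSrcStr;
var imgs = myContainer.querySelectorAll('img');
for (var i = 0, ii = imgs.length; i < ii; i++) {
  // Restore src with the prefix, then drop the temporary attribute.
  imgs[i].src = 'prepended/path/to/img/' + imgs[i].getAttribute('data-src');
  imgs[i].removeAttribute('data-src');
}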

Obviously, if there is plain text containing src= in the content, it will be replaced too, but that is not the case for my content, which I also control.

This offers me a quicker solution than the linked answer or using a DOMParser, while still adding elements to the DOM to be able to access the elements programmatically.
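For what it's worth, the replacement can be limited to img tags, and the absolute path can be built with the standard URL constructor rather than string concatenation. This is only a sketch: `deferImageSources`, `toAbsolute`, and the base URL are hypothetical names and values, and the regular expression copes with ordinary markup only, not every legal HTML edge case.

```javascript
// Hypothetical base directory of the fetched HTML (an assumption for this sketch).
const BASE_URL = 'https://example.com/articles/2012/';

// Rename the src attribute inside <img ...> tags only, so plain text that
// happens to contain "src=" is left alone. Only an attribute preceded by
// whitespace is matched, so an existing data-src is not renamed again.
function deferImageSources(html) {
  return html.replace(/(<img\b[^>]*?\s)src=/gi, '$1data-src=');
}

// Resolve a possibly relative path against the base URL.
function toAbsolute(path, base = BASE_URL) {
  return new URL(path, base).href;
}

const sample = '<p>src=1</p><img alt="a" src="pics/cat.png">';
console.log(deferImageSources(sample));
console.log(toAbsolute('pics/cat.png'));
```

After inserting the rewritten string into the container, each image's `data-src` value could be passed through `toAbsolute` before being assigned to `src`.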

Try this. It works for complex HTML too. Anything your browser can display, this can parse.

var htmlString = "...";
var newDoc = document.implementation.createHTMLDocument('newDoc');      
newDoc.documentElement.innerHTML = htmlString;
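Here is a rough sketch of how such a detached document could be used to rewrite image paths before the markup goes anywhere near the live page. The function name `absolutizeImages`, its parameters, and the base URL in the usage lines are assumptions for illustration. My understanding is that a document created this way has no browsing context, so changing `src` on its images should not trigger requests, but it is worth confirming in the network panel.

```javascript
// Hypothetical helper: parse htmlString in a detached document, make every
// <img> src absolute relative to baseUrl, and return the resulting markup.
// `doc` defaults to the page's document; it is a parameter so that another
// document-like object can be supplied.
function absolutizeImages(htmlString, baseUrl, doc = document) {
  const inertDoc = doc.implementation.createHTMLDocument('');
  inertDoc.body.innerHTML = htmlString;
  for (const img of inertDoc.body.querySelectorAll('img')) {
    const raw = img.getAttribute('src');
    // Skip images that have no src attribute at all.
    if (raw !== null) {
      img.setAttribute('src', new URL(raw, baseUrl).href);
    }
  }
  return inertDoc.body.innerHTML;
}

// Only runs where a DOM is available (for example, in a browser).
if (typeof document !== 'undefined') {
  const html = '<img src="somewhere/odd/on/the/internet" alt="alt?" />';
  console.log(absolutizeImages(html, 'https://example.com/base/'));
}
```

Using `body.innerHTML` rather than `documentElement.innerHTML` is a choice made here because the input is a fragment of body content; for a full page string the original line is the closer fit.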

Tags: javascript - Using documentFragment to parse HTML without sending HTTP requests - Stack Overflow