My project contains some very simple code that looks like this:

const textToHtml = (text) => {
   const div = document.createElement('div');
   div.innerText = text;
   return div.innerHTML;
}

const htmlToText = (html) => {
   const div = document.createElement('div');
   div.innerHTML = html;
   return div.innerText;
}

It worked normally for the past few months. A few days ago a problem appeared: in some browsers htmlToText('<br>') no longer returns '\n' as it always did; instead it returns '', so:

textToHtml(htmlToText('<br>'))
// A few months ago this returned '<br>'
// but today it returns '', so the '<br>' is lost

In Chrome 73.0.3683.75 and Firefox 66.0.3 (64-bit) on Mac the '<br>' is lost, but in Safari 12.1 (14607.1.40.1.4) on Mac it is not. Other versions and platforms were not tested.

I am sure the versions from several months ago worked, and I know a workaround (I can replace every '<br>' with '\n' myself using a RegExp). I just wonder whether anyone else has encountered the same situation. Is this a bug in the browsers?
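
For reference, the RegExp workaround mentioned above might look like the sketch below. The name brToNewline is hypothetical, and the pattern only targets <br> tags (any letter case, optional attributes or self-closing slash); it is an illustration of the idea, not a general HTML-to-text converter.

```javascript
// Hypothetical helper: replace every <br> variant with a newline without
// touching the DOM, so the result does not depend on rendering.
// Matches <br>, <br/>, <br /> and <BR class="x">, but not tags such as <bra>.
const brToNewline = (html) => html.replace(/<br\b[^>]*>/gi, '\n');

console.log(JSON.stringify(brToNewline('Some<br>Other'))); // "Some\nOther"
```

Any other markup in the input is left untouched, so this only helps when <br> is the one tag that matters.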

asked Apr 23, 2019 at 4:42 by fenyiwudian
  • @fenyiwudian Since your own findings show that the behaviour differs depending on the browser, I'd suggest looking through recent updates of the browsers that stopped working to see whether they changed the behaviour of the code you want to use. You might find a hint about an alternative approach that works in all browsers. – Azer Commented Apr 23, 2019 at 4:59
  • I see what's going on. The <br> renders as a line break, but it doesn't become a newline in the text. If you ask for the text of <div>foo<br>bar</div>, you just get foobar, but you still see the newline on the screen. – Barmar Commented Apr 23, 2019 at 5:03
  • This seems right to me. If you ask for the text of an <ol> you won't get the numbers or the line breaks before the items. How something looks on the screen is not the same as the text of it. – Barmar Commented Apr 23, 2019 at 5:04
  • It sounds like the old behavior was a bug, which Chrome and Firefox fixed, but which hasn't yet been fixed in Safari. – Barmar Commented Apr 23, 2019 at 5:07
  • @Barmar I'm with you. Sounds like exactly what I'd expect - getting innerText should give the rendered output and <br> is not rendered, so you don't get it back. This is how it works in Firefox on Windows. I just had to test it to make sure I wasn't crazy - I couldn't figure out why you'd be getting the HTML when you ask for everything that's not HTML in, presumably, every browser other than Firefox and Chrome on OS X. – VLAZ Commented Apr 23, 2019 at 5:16

2 Answers

There is an example in the MDN documentation that compares innerText and textContent and explicitly says:

This example compares innerText with Node.textContent. Note how innerText is aware of things like <br> tags, and ignores hidden elements.

So, I tested this on Firefox 66.0.3 (64-bit), and it still works if the element whose properties you set or get is rendered or exists in document.body while you perform the operations. You can check the next two examples:

Example 1: The div element already exists on the document.body

const textToHtml = (text) => {
   const div = document.getElementById('test');
   div.innerText = text;
   return div.innerHTML;
}

const htmlToText = (html) => {
   const div = document.getElementById("test");
   div.innerHTML = html;
   console.log("Note <br> is parsed to \\n =>", div.innerText);
   return div.innerText;
}

console.log("Output =>", textToHtml(htmlToText(`Some<br>Other`)));

HTML for this example:

<div id="test"></div>

Example 2: The div element is appended dynamically on the document.body

const textToHtml = (text) => {
   const div = document.createElement('div');
   document.body.append(div);
   div.innerText = text;
   return div.innerHTML;
}

const htmlToText = (html) => {
   const div = document.createElement('div');
   document.body.append(div);
   div.innerHTML = html;
   console.log("Note <br> is parsed to \\n =>", div.innerText);
   return div.innerText;
}

console.log("Output =>", textToHtml(htmlToText(`Some<br>Other`)));

And, as you say, it won't work (in some newer browsers) if the element doesn't exist in the document. I don't know the exact reason for this (maybe it is because the element you create is not rendered):

Example 3: The div element is not present on the document.body

const textToHtml = (text) => {
   const div = document.createElement('div');
   div.innerText = text;
   return div.innerHTML;
}

const htmlToText = (html) => {
   const div = document.createElement('div');
   div.innerHTML = html;
   console.log("Note <br> isn't parsed to \\n =>", div.innerText);
   return div.innerText;
}

console.log("Output =>", textToHtml(htmlToText(`Some<br>Other`)));

Anyway, I have come up with the following approach, which creates a div element, appends it to the body and then removes it. This way there is no visible disturbance, and it should work in all browsers:

New Implementation:

const textToHtml = (text) => {
   const div = document.createElement('div');
   document.body.append(div);
   div.innerText = text;
   const html = div.innerHTML;
   div.remove();
   return html;
}

const htmlToText = (html) => {
   const div = document.createElement('div');
   document.body.append(div);
   div.innerHTML = html;
   const text = div.innerText;
   div.remove();
   return text;
}

console.log("Output =>", textToHtml(htmlToText(`Some<br>Other`)));
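
As far as I can tell, the HTML Standard defines the innerText getter to return the same value as textContent when the element is not being rendered, which would explain why attaching the div matters. One possible refinement of the implementation above is sketched below: the attach/detach steps are shared, and a try/finally removes the temporary div even if something throws. The helper name withAttachedDiv and the optional doc parameter are assumptions introduced for this sketch, not part of the original code.

```javascript
// Hypothetical helper: attach a temporary div to `doc.body`, hand it to
// `fn`, and always detach it afterwards.
const withAttachedDiv = (doc, fn) => {
  const div = doc.createElement('div');
  doc.body.append(div);
  try {
    return fn(div);
  } finally {
    div.remove(); // runs even if fn throws
  }
};

// `doc` defaults to the global document in a browser; accepting it as a
// parameter is an assumption made here so another document can be passed in.
const textToHtml = (text, doc = document) =>
  withAttachedDiv(doc, (div) => {
    div.innerText = text;
    return div.innerHTML;
  });

const htmlToText = (html, doc = document) =>
  withAttachedDiv(doc, (div) => {
    div.innerHTML = html;
    return div.innerText;
  });
```

In a browser both functions are called exactly as before, for example textToHtml(htmlToText(`Some<br>Other`)).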

Extra: There is a good read about innerText: the-poor-misunderstood-innerText

Works as specified in https://html.spec.whatwg.org/multipage/dom.html#the-innertext-idl-attribute

Can be set, to replace the element's children with the given value, but with line breaks converted to br elements.

Why it differs depending on whether the element is rendered was not clear to me, which is why I prefer not to rely on this behaviour.

I have no example of methods that pass an inverse test like this:

console.assert(textToHtml(htmlToText(x)) === x);
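
For a restricted set of inputs, a DOM-free pair can pass such an inverse test. The following is a minimal sketch, assuming the HTML contains only text, the entities &amp;amp; &amp;lt; &amp;gt;, and <br> tags written exactly as <br>. The names plainTextToHtml and plainHtmlToText are hypothetical, and arbitrary HTML will not round-trip through them.

```javascript
// Hypothetical DOM-free pair for a restricted subset of HTML.
const plainTextToHtml = (text) =>
  text
    .replace(/&/g, '&amp;') // escape & first so later entities stay intact
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/\n/g, '<br>');

const plainHtmlToText = (html) =>
  html
    .replace(/<br>/g, '\n')
    .replace(/&lt;/g, '<')
    .replace(/&gt;/g, '>')
    .replace(/&amp;/g, '&'); // unescape & last so "&amp;lt;" becomes "&lt;"

// Inverse test over a few inputs in the restricted form.
const samples = ['', '<br>', 'Some<br>Other', '1 &lt; 2 &amp;&amp; 3 &gt; 2'];
for (const x of samples) {
  console.assert(plainTextToHtml(plainHtmlToText(x)) === x, `round trip failed for ${x}`);
}
```

The order of the replacements is the design choice that matters here: escaping & first and unescaping it last keeps already-escaped text from being decoded twice.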

Tags: javascript | Interesting conversion of "<br>" between innerHTML and innerText - Stack Overflow