javascript - Is a DOM Text Node guaranteed to not be interpreted as HTML? - Stack Overflow

Does anyone know whether a DOM Node of type Text is guaranteed not to be interpreted as HTML by the browser?

More details follow.

Background

I'm building a simple web comment system for a friend, and I've been thinking about XSS attacks. I don't think filtering or escaping HTML tags is a very elegant solution--it's too easy to come up with a convolution that will slip past the filter. The fundamental issue is that I want to guarantee that, for certain pieces of content (i.e. the content that random unauthenticated web users POST), the browser never tries to interpret or run the content.

A plain(text) start

The first thought that came to mind is just to use Content-Type: text/plain, but this has to apply to a whole page. You can put a plaintext IFRAME in the middle of a page, but it's ugly, and it creates focus problems if the user clicks into the frame.

innerText/textContent/jQuery

It turns out that there are some browser-specific properties (innerText in IE; textContent in Firefox, Safari, etc.) that, when set, are required to create a single Text node.

jQuery avoids the difference between these browser-specific properties by implementing a single function, text(val), that skips them and goes directly to document.createTextNode(text), which, as you can guess, creates a Text node.
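A minimal sketch of that approach, assuming a helper named setText (hypothetical, and not jQuery's actual source): it empties the element and appends a single Text node made with createTextNode. The doc parameter is an assumption added here so the function can be exercised outside a browser; in a page you would pass document.

```javascript
// Hypothetical helper: replace an element's children with one Text node.
// It never touches innerText or textContent, so it does not depend on
// which of those properties a given browser supports.
function setText(doc, element, text) {
  // Remove whatever the element currently contains.
  while (element.firstChild) {
    element.removeChild(element.firstChild);
  }
  // The string is stored as character data; no HTML parser is involved.
  element.appendChild(doc.createTextNode(String(text)));
  return element;
}

// Browser usage, guarded so the file also loads where no DOM exists.
if (typeof document !== 'undefined') {
  const box = document.createElement('div');
  setText(document, box, '<img src=x onerror=alert(1)>');
  // box has exactly one child and it is a Text node (nodeType 3).
  console.log(box.childNodes.length, box.firstChild.nodeType);
}
```

Passing the document in explicitly is only a convenience for testing; the relevant part is that the untrusted string reaches the page through createTextNode and nothing else.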

W3 DOM Text Nodes

So I think this is close to what I want. It looks good: Text nodes can't have children, and it appears that they can't be interpreted as HTML. But I am not 100% sure from the official docs.

Interface Node: http://www.w3.org/TR/DOM-Level-3-Core/core.html#ID-1950641247
Interface Text: http://www.w3.org/TR/DOM-Level-3-Core/core.html#ID-1312295772
textContent: http://www.w3.org/TR/DOM-Level-3-Core/core.html#Node3-textContent

The part from textContent is particularly encouraging, because it says "on setting, no parsing is performed either, the input string is taken as pure textual content." But is this fundamental to all Text nodes, or only nodes on which you set textContent? This probably seems like a dumb quibble, but it might be important because IE doesn't support textContent (see above).

Back around to the initial question

Can anyone confirm/reject that this will work? That is, that a W3C DOM compliant browser will never interpret a Text node as HTML, no matter what the content? I'd be extremely grateful to have this tormenting little uncertainty resolved.

Thank you for your time!

asked Jan 24, 2009 at 22:47 by elliot42

2 Answers

Yes, this is confirmed, to the extent that any browser that behaved otherwise would have a serious defect. A text node that rendered anything but text would be a contradiction. By using document.createTextNode("some string"); and appending that node, the string is guaranteed to be rendered as text.
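As a rough illustration of that approach applied to a comment, the sketch below builds the markup through the DOM API and puts every untrusted string into a Text node. The names renderComment, author, and body are hypothetical, and the doc parameter (normally document) is an assumption added so the function can be run without a browser.

```javascript
// Hypothetical helper: builds <li><strong>author</strong>: body</li>
// without ever handing an untrusted string to an HTML parser.
function renderComment(doc, author, body) {
  const item = doc.createElement('li');
  const name = doc.createElement('strong');
  // Both user-supplied strings end up as character data in Text nodes.
  name.appendChild(doc.createTextNode(String(author)));
  item.appendChild(name);
  item.appendChild(doc.createTextNode(': ' + String(body)));
  return item;
}

// Browser usage, guarded so the file also loads where no DOM exists.
if (typeof document !== 'undefined' && document.body) {
  const li = renderComment(document, 'mallory', '<script>alert(1)</script>');
  document.body.appendChild(li);
  // The angle brackets show up on the page as literal characters.
}
```

Only the structure (li, strong) comes from the page author's code; everything the visitor typed stays inside Text nodes.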

"I don't think filtering or escaping HTML tags is a very elegant solution--it's too easy to come up with a convolution that will slip past the filter"

That is absolutely untrue. Escaping > to &gt; and < to &lt; will completely stop any HTML injection into element content.
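A minimal sketch of such an escaping function follows. The name escapeHtml is hypothetical. The answer mentions only < and >; escaping &, " and ' as well is an assumption added here so the output is also safe inside a quoted attribute value.

```javascript
// Hypothetical helper: escape the characters that are significant in
// HTML text and in quoted attribute values.
function escapeHtml(text) {
  const entities = {
    '&': '&amp;',
    '<': '&lt;',
    '>': '&gt;',
    '"': '&quot;',
    "'": '&#39;'
  };
  // Each special character is replaced in a single pass, so an
  // already-produced entity is never escaped a second time.
  return String(text).replace(/[&<>"']/g, (ch) => entities[ch]);
}

console.log(escapeHtml('<script>alert("hi")</script>'));
// prints: &lt;script&gt;alert(&quot;hi&quot;)&lt;/script&gt;
```

This is the server-side or string-building counterpart to createTextNode: the escaped string can be placed into HTML source, whereas a Text node needs no escaping at all.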
