When I set the value of a text node with
node.nodeValue="string with &#xxxx; sort of characters"
the ampersand gets escaped. Is there an easy way to have the entities interpreted as characters instead?
asked Feb 4, 2009 at 20:22 by Slartibartfast, edited Mar 31, 2014 at 21:12 by tshepang
5 Answers
You need to use JavaScript escapes for the Unicode characters:
node.nodeValue="string with \uxxxx sort of characters"
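To make the difference concrete, here is a small sketch (the variable names are only for illustration). JavaScript resolves \uXXXX escapes when it parses the string literal, whereas a character reference such as &#8220; inside a JavaScript string is just seven ordinary characters, because no HTML parser ever sees it.

```javascript
// "\u201C" is resolved by the JavaScript parser into one character (U+201C).
const viaEscape = "\u201CQuotes\u201D";

// The same characters, built from their code points at run time.
const viaCodePoint =
  String.fromCodePoint(0x201c) + "Quotes" + String.fromCodePoint(0x201d);

// An HTML character reference inside a JS string stays literal text.
const viaEntity = "&#8220;Quotes&#8221;";

console.log(viaEscape === viaCodePoint); // true
console.log(viaEscape.length);           // 8
console.log(viaEntity.length);           // 20
```

Assigning viaEscape to nodeValue would show the curly quotes, while assigning viaEntity would show the reference text itself.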
From http://code.google.com/p/jslibs/wiki/JavascriptTips:
(converts both entity references and numeric entities)
const entityToCode = { __proto__: null, apos:0x0027,quot:0x0022,amp:0x0026,lt:0x003C,gt:0x003E,nbsp:0x00A0,iexcl:0x00A1,cent:0x00A2,pound:0x00A3, curren:0x00A4,yen:0x00A5,brvbar:0x00A6,sect:0x00A7,uml:0x00A8,copy:0x00A9,ordf:0x00AA,laquo:0x00AB, not:0x00AC,shy:0x00AD,reg:0x00AE,macr:0x00AF,deg:0x00B0,plusmn:0x00B1,sup2:0x00B2,sup3:0x00B3, acute:0x00B4,micro:0x00B5,para:0x00B6,middot:0x00B7,cedil:0x00B8,sup1:0x00B9,ordm:0x00BA,raquo:0x00BB, frac14:0x00BC,frac12:0x00BD,frac34:0x00BE,iquest:0x00BF,Agrave:0x00C0,Aacute:0x00C1,Acirc:0x00C2,Atilde:0x00C3, Auml:0x00C4,Aring:0x00C5,AElig:0x00C6,Ccedil:0x00C7,Egrave:0x00C8,Eacute:0x00C9,Ecirc:0x00CA,Euml:0x00CB, Igrave:0x00CC,Iacute:0x00CD,Icirc:0x00CE,Iuml:0x00CF,ETH:0x00D0,Ntilde:0x00D1,Ograve:0x00D2,Oacute:0x00D3, Ocirc:0x00D4,Otilde:0x00D5,Ouml:0x00D6,times:0x00D7,Oslash:0x00D8,Ugrave:0x00D9,Uacute:0x00DA,Ucirc:0x00DB, Uuml:0x00DC,Yacute:0x00DD,THORN:0x00DE,szlig:0x00DF,agrave:0x00E0,aacute:0x00E1,acirc:0x00E2,atilde:0x00E3, auml:0x00E4,aring:0x00E5,aelig:0x00E6,ccedil:0x00E7,egrave:0x00E8,eacute:0x00E9,ecirc:0x00EA,euml:0x00EB, igrave:0x00EC,iacute:0x00ED,icirc:0x00EE,iuml:0x00EF,eth:0x00F0,ntilde:0x00F1,ograve:0x00F2,oacute:0x00F3, ocirc:0x00F4,otilde:0x00F5,ouml:0x00F6,divide:0x00F7,oslash:0x00F8,ugrave:0x00F9,uacute:0x00FA,ucirc:0x00FB, uuml:0x00FC,yacute:0x00FD,thorn:0x00FE,yuml:0x00FF,OElig:0x0152,oelig:0x0153,Scaron:0x0160,scaron:0x0161, Yuml:0x0178,fnof:0x0192,circ:0x02C6,tilde:0x02DC,Alpha:0x0391,Beta:0x0392,Gamma:0x0393,Delta:0x0394, Epsilon:0x0395,Zeta:0x0396,Eta:0x0397,Theta:0x0398,Iota:0x0399,Kappa:0x039A,Lambda:0x039B,Mu:0x039C, Nu:0x039D,Xi:0x039E,Omicron:0x039F,Pi:0x03A0,Rho:0x03A1,Sigma:0x03A3,Tau:0x03A4,Upsilon:0x03A5, Phi:0x03A6,Chi:0x03A7,Psi:0x03A8,Omega:0x03A9,alpha:0x03B1,beta:0x03B2,gamma:0x03B3,delta:0x03B4, epsilon:0x03B5,zeta:0x03B6,eta:0x03B7,theta:0x03B8,iota:0x03B9,kappa:0x03BA,lambda:0x03BB,mu:0x03BC, 
nu:0x03BD,xi:0x03BE,omicron:0x03BF,pi:0x03C0,rho:0x03C1,sigmaf:0x03C2,sigma:0x03C3,tau:0x03C4, upsilon:0x03C5,phi:0x03C6,chi:0x03C7,psi:0x03C8,omega:0x03C9,thetasym:0x03D1,upsih:0x03D2,piv:0x03D6, ensp:0x2002,emsp:0x2003,thinsp:0x2009,zwnj:0x200C,zwj:0x200D,lrm:0x200E,rlm:0x200F,ndash:0x2013, mdash:0x2014,lsquo:0x2018,rsquo:0x2019,sbquo:0x201A,ldquo:0x201C,rdquo:0x201D,bdquo:0x201E,dagger:0x2020, Dagger:0x2021,bull:0x2022,hellip:0x2026,permil:0x2030,prime:0x2032,Prime:0x2033,lsaquo:0x2039,rsaquo:0x203A, oline:0x203E,frasl:0x2044,euro:0x20AC,image:0x2111,weierp:0x2118,real:0x211C,trade:0x2122,alefsym:0x2135, larr:0x2190,uarr:0x2191,rarr:0x2192,darr:0x2193,harr:0x2194,crarr:0x21B5,lArr:0x21D0,uArr:0x21D1, rArr:0x21D2,dArr:0x21D3,hArr:0x21D4,forall:0x2200,part:0x2202,exist:0x2203,empty:0x2205,nabla:0x2207, isin:0x2208,notin:0x2209,ni:0x220B,prod:0x220F,sum:0x2211,minus:0x2212,lowast:0x2217,radic:0x221A, prop:0x221D,infin:0x221E,ang:0x2220,and:0x2227,or:0x2228,cap:0x2229,cup:0x222A,int:0x222B, there4:0x2234,sim:0x223C,cong:0x2245,asymp:0x2248,ne:0x2260,equiv:0x2261,le:0x2264,ge:0x2265, sub:0x2282,sup:0x2283,nsub:0x2284,sube:0x2286,supe:0x2287,oplus:0x2295,otimes:0x2297,perp:0x22A5, sdot:0x22C5,lceil:0x2308,rceil:0x2309,lfloor:0x230A,rfloor:0x230B,lang:0x2329,rang:0x232A,loz:0x25CA, spades:0x2660,clubs:0x2663,hearts:0x2665,diams:0x2666 };
// Reverse lookup: character -> entity name
var charToEntity = {};
for ( var entityName in entityToCode )
    charToEntity[String.fromCharCode(entityToCode[entityName])] = entityName;
// Replace characters outside printable ASCII with a named entity when one is known
function EscapeEntities(str) {
    return str.replace( /[^\x20-\x7E]/g, function(ch) {
        return charToEntity[ch] ? '&'+charToEntity[ch]+';' : ch;
    });
}
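function unescapeEntities(str) {
    return str.replace( /&(#?\w+);/g, function(match, ent) {
        var code = ent[0] != '#' ? entityToCode[ent]      // named entity
                 : ent[1] == 'x' ? parseInt(ent.slice(2), 16) // hexadecimal reference
                 : parseInt(ent.slice(1), 10);                // decimal reference
        // Leave unknown or malformed references untouched
        return code >= 0 ? String.fromCharCode(code) : match;
    });
}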
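This happens because a text node holds literal text: the string you assign is never run through the HTML parser, so the & stays a plain ampersand character and is written out as the ampersand entity when the browser serializes the node. To get around this, you'll need to convert the entities yourself.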
<html>
<body>
<div id="test"> </div>
</body>
<script type="text/javascript">
onload = function()
{
    var node = document.getElementById( 'test' );
    node.firstChild.nodeValue = convertEntities( 'Some &#187; entities &#171; and some &#187; more entities &#171;' );
}
function convertEntities( text )
{
    // match() returns null when nothing matches, so fall back to an empty list
    var matches = text.match( /&#(\d+);/g ) || [];
    for ( var i = 0; i < matches.length; i++ )
    {
        console.log( "Replacing: " + matches[i] );
        console.log( "With: " + convertEntity( matches[i] ) );
        text = text.replace( matches[i], convertEntity( matches[i] ) );
    }
    return text;
    // Function declarations are hoisted, so convertEntity is usable above
    function convertEntity( ent )
    {
        // &#NNN; references are decimal, so parse with radix 10
        var num = parseInt( ent.replace( /\D/g, '' ), 10 );
        return String.fromCharCode( num );
    }
}
</script>
</html>
As noted in other answers, I need to replace HTML-encoded entities with their JavaScript equivalents. Starting from BaileyP's answer, I've made this:
function convertEntities( text )
{
    // Replace each decimal reference such as &#8220; with its character
    var ret = text.replace( /&#(\d+);/g, function ( ent, captureGroup )
    {
        var num = parseInt( captureGroup, 10 );
        return String.fromCharCode( num );
    });
    return ret;
}
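The function above handles decimal references only. As a hedged extension (decodeNumericReferences is a made-up name, not part of any answer here), the same idea can also cover hexadecimal references and code points above U+FFFF by switching to String.fromCodePoint:

```javascript
// Hypothetical helper: decodes decimal (&#8220;) and hexadecimal (&#x201C;)
// numeric character references. Named entities such as &amp; are left alone.
function decodeNumericReferences(text) {
  return text.replace(/&#(?:[xX]([0-9a-fA-F]+)|(\d+));/g, (match, hex, dec) => {
    const codePoint = hex !== undefined ? parseInt(hex, 16) : parseInt(dec, 10);
    // Leave anything outside the Unicode range, or a surrogate, untouched.
    const isValid =
      codePoint <= 0x10ffff && !(codePoint >= 0xd800 && codePoint <= 0xdfff);
    return isValid ? String.fromCodePoint(codePoint) : match;
  });
}

console.log(decodeNumericReferences("&#8220;Quotes&#8221; &#x1F600; &amp;"));
```

The range check matters because String.fromCodePoint throws a RangeError for values above 0x10FFFF, while String.fromCharCode silently truncates them.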
The OP has entities / entity references, and wants them to appear in the DOM in a text node.
That's why the accepted answer and many other answers are great; they convert entities to their Unicode equivalents using JavaScript Unicode escape sequences.
But I had a different need: I had Unicode characters and wanted to put them into the text node as entity references. I want entity references specifically so that the XML string representing my document can be encoded in ASCII (i.e. encoding="ascii"). Otherwise, as @Bjorn said, the Unicode characters would be "decoded as junk".
This is what I want, note the ASCII encoding:
<?xml version='1.0' encoding='ASCII'?>
<html>
<body>
&#8220;Quotes&#8221;
</body>
</html>
The ASCII-encoded XML/HTML above looks good in a browser.
So I can't use the other answers, because they insert Unicode characters (but I want ASCII).
And I can't use the DOM text node API to insert unescaped entity references. As the OP points out: if you use the DOM text node API to set the node's textContent or nodeValue, the DOM will always escape any entities you try to inject...
- ...so & becomes &amp;
- ...and &#8220; becomes &amp;#8220;
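The escaping in the list above is applied when the text node is serialized back to markup, not when the value is stored. A rough sketch of that step, assuming the text escaping rules of the HTML fragment serialization algorithm (the function name is made up for illustration and is not a DOM API):

```javascript
// Hypothetical stand-in for what a browser does to text node data when it
// serializes it (for example when reading innerHTML). Not a DOM API.
function escapeTextForSerialization(data) {
  return data
    .replace(/&/g, "&amp;") // must run first so later output is not re-escaped
    .replace(/\u00A0/g, "&nbsp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

console.log(escapeTextForSerialization("&#8220;")); // &amp;#8220;
```

Because the ampersand is always rewritten, no string assigned through the text node API can come out of serialization as a working entity reference.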
As a deleted answer suggested, you could try to manipulate HTML directly using innerHTML or outerHTML, but the Text API does not have those properties.
Even if you are working on a non-Text node (like a <span>), the DOM API in my browser won't leave the entities intact; the entities are "parsed"/dereferenced to their Unicode characters, so &#8220; becomes “
temp1.innerHTML='&#8220;'
'&#8220;' // note how I set the entity reference
temp1.innerHTML;
'“' // note how the Unicode character comes back out; not the entity reference
But I want my document to be ASCII encoded, so I can't use the Unicode characters as set by the DOM; “ and ” will be "decoded as junk".
Yes, I could simply use UTF-8 encoding, in which case I wouldn't need entity references, but I prefer to respect the original encoding (which happened to be ASCII).
So if you are only using the DOM, there's no good way to put unescaped entity references into text nodes: they are either escaped or dereferenced to their Unicode characters. I think this is as-designed/expected behavior, and I appreciate that... If you're only manipulating the DOM to change what renders in your browser, this might be no problem.
But in my case I was using the DOM to create and download an XML document, so I had an opportunity to get the outerHTML string and manipulate it independently of the DOM API before downloading it.
I get the outerHTML and run the function below to convert non-ASCII characters to their entity equivalents (similar approach in C#). By replacing the non-ASCII characters with entity references, my document could be encoded as ASCII and read without problems.
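const replaceNonAsciiWithNumCharRefEntity = (s) => {
    return (s || '')
        .replace(
            // The "u" flag matches whole code points, so characters above
            // U+FFFF yield one reference instead of two surrogate halves
            /[^\x00-\x7F]/gu,
            _ => `&#${_.codePointAt(0)};`
        );
}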
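As a hedged variation on the function above (toHexCharRefs is a hypothetical name), the same replacement can emit hexadecimal references, which are easier to compare against Unicode code charts. The final check confirms that the result contains only ASCII characters:

```javascript
// Hypothetical variant: emits hexadecimal references such as &#x201C;.
// The "u" flag makes the regex match whole code points, so characters
// outside the Basic Multilingual Plane become a single reference.
const toHexCharRefs = (s) =>
  (s || "").replace(
    /[^\x00-\x7F]/gu,
    (ch) => `&#x${ch.codePointAt(0).toString(16).toUpperCase()};`
  );

const encoded = toHexCharRefs("\u201CQuotes\u201D \u{1F600}");
console.log(encoded);                        // &#x201C;Quotes&#x201D; &#x1F600;
console.log(/^[\x00-\x7F]*$/.test(encoded)); // true
```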
Tags: dom | Setting nodeValue of text node in Javascript when string contains html entities - Stack Overflow