Expressing UTF-16 Unicode characters in JavaScript (Stack Overflow)
To express, for example, the character U+10400 in JavaScript, I use "\uD801\uDC00" or String.fromCharCode(0xD801) + String.fromCharCode(0xDC00). How do I figure that out for a given Unicode character? I want the following:
var char = getUnicodeCharacter(0x10400);
How do I find 0xD801 and 0xDC00 from 0x10400?
2 Answers
Based on the Wikipedia article given by Henning Makholm, the following function will return the correct character for a code point:
function getUnicodeCharacter(cp) {
    if (cp >= 0 && cp <= 0xD7FF || cp >= 0xE000 && cp <= 0xFFFF) {
        // BMP code point outside the surrogate range: one UTF-16 code unit
        return String.fromCharCode(cp);
    } else if (cp >= 0x10000 && cp <= 0x10FFFF) {
        // subtract 0x10000 from cp to get a 20-bit number
        // in the range 0..0xFFFFF
        cp -= 0x10000;
        // add 0xD800 to the number formed by the high 10 bits
        // to give the first code unit (the high surrogate)
        var first = ((0xffc00 & cp) >> 10) + 0xD800;
        // add 0xDC00 to the number formed by the low 10 bits
        // to give the second code unit (the low surrogate)
        var second = (0x3ff & cp) + 0xDC00;
        return String.fromCharCode(first) + String.fromCharCode(second);
    }
    // surrogate code points (0xD800..0xDFFF) and out-of-range values
    // fall through, so the function returns undefined for them
}
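As a worked example for the question's U+10400, the arithmetic can be traced one step at a time. This is only a sketch: toSurrogatePair is a hypothetical helper name, not part of the answer above, and String.fromCodePoint (available from ES2015) is used purely as a cross-check.

```javascript
// Hypothetical helper: returns the two UTF-16 code units for a
// supplementary code point (0x10000..0x10FFFF). No range checking.
function toSurrogatePair(cp) {
    var offset = cp - 0x10000;            // 0x10400 - 0x10000 = 0x00400
    var high = (offset >> 10) + 0xD800;   // 0x00400 >> 10 = 0x001, giving 0xD801
    var low = (offset & 0x3FF) + 0xDC00;  // 0x00400 & 0x3FF = 0x000, giving 0xDC00
    return [high, low];
}

var pair = toSurrogatePair(0x10400);
console.log(pair[0].toString(16), pair[1].toString(16)); // d801 dc00

// Cross-check against the built-in (assumes an ES2015+ engine).
console.log(String.fromCharCode(pair[0], pair[1]) === String.fromCodePoint(0x10400)); // true
```

On engines that provide it, String.fromCodePoint(0x10400) does this conversion directly, so a hand-written helper is mainly useful for older environments or for understanding the encoding.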
How do I find 0xD801 and 0xDC00 from 0x10400?
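JavaScript uses UCS-2 internally, in the sense that a string is exposed as a sequence of 16-bit code units. That’s why String#charCodeAt() doesn’t work the way you’d want it to: for a non-BMP character it returns one surrogate half, not the full code point.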
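A small illustration of that behavior, using the surrogate pair from the question. The last line uses String#codePointAt(), which is not mentioned in the answer and is assumed to be available (ES2015 and later).

```javascript
var s = "\uD801\uDC00"; // U+10400 written as a surrogate pair

console.log(s.length);                      // 2
console.log(s.charCodeAt(0).toString(16));  // d801
console.log(s.charCodeAt(1).toString(16));  // dc00

// Assumption: an ES2015+ engine. codePointAt() combines the pair.
console.log(s.codePointAt(0).toString(16)); // 10400
```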
If you want to get the code point of every Unicode character (including non-BMP characters) in a string, you could use Punycode.js’s utility functions to convert between UCS-2 strings and arrays of Unicode code points:
// String#charCodeAt() replacement that only considers full Unicode characters
punycode.ucs2.decode('
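Independently of any library, the decoding direction can be sketched by hand. The function below is a hypothetical illustration named decodeCodePoints; it is not Punycode.js's implementation, and it simply passes unpaired surrogates through unchanged.

```javascript
// Hypothetical helper: lists the code points of a string, combining
// each valid surrogate pair into one supplementary code point.
function decodeCodePoints(str) {
    var out = [];
    for (var i = 0; i < str.length; i++) {
        var unit = str.charCodeAt(i);
        if (unit >= 0xD800 && unit <= 0xDBFF && i + 1 < str.length) {
            var next = str.charCodeAt(i + 1);
            if (next >= 0xDC00 && next <= 0xDFFF) {
                // Reverse of the encoding: undo both offsets, then add 0x10000.
                out.push(((unit - 0xD800) << 10) + (next - 0xDC00) + 0x10000);
                i++; // skip the low surrogate that was just consumed
                continue;
            }
        }
        out.push(unit); // BMP code unit or unpaired surrogate, kept as-is
    }
    return out;
}

console.log(decodeCodePoints("a\uD801\uDC00")); // [ 97, 66560 ], and 66560 is 0x10400
```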
String.fromCharCode(0xD801) + String.fromCharCode(0xDC00) can be written as String.fromCharCode(0xD801, 0xDC00). – Mathias Bynens, commented Feb 2, 2012 at 13:08