I have a properties file which is encoded using ISO Latin but with special characters as UTF-8 escape sequences, for example the following string:
Einstellungen l\u00f6schen
I've tried a bunch of different combinations of iconv, punycode and JSON.parse, yet none of them do what I need, which is to convert these strings to a proper UTF-8 format that works with JavaScript. No matter how I go about it, the strings still contain their UTF-8 escape sequences when I print them.
Note that the file is longer and contains line breaks, in case that makes any difference.
How do I read this file in a way which prints the correct characters?
asked Jun 2, 2016 at 7:31 by Richard

- FYI, JavaScript \u escape sequences have nothing to do with UTF-8. The number is the Unicode code point. – Álvaro González Commented Jun 2, 2016 at 7:36
- Have you tried console.log("Einstellungen l\u00f6schen") => Einstellungen löschen? JavaScript will automatically do the conversion for you. – phuzi Commented Jun 2, 2016 at 7:41
- JSON.parse('"' + str.split('"').join('\\"') + '"') or str.replace(/\\u([0-9a-fA-F]{4})/g, (m,cc)=>String.fromCharCode("0x" + cc)) – Thomas Commented Jun 2, 2016 at 7:42
- Yes, I've noticed that as well, but for whatever reason it doesn't work when the string is parsed from the file, which confuses me. – Richard Commented Jun 2, 2016 at 7:43
- @Thomas str.replace(/\\u([0-9a-fA-F]{4})/g, (m,cc)=>String.fromCharCode("0x" + cc)) did the trick! Feel free to post it as an answer and I'll accept it as soon as I can :) – Richard Commented Jun 2, 2016 at 7:44
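
The difference between the literal in the comments and text read from a file can be shown with a small sketch. The variable names here are illustrative and not from the original thread. In a source literal, the engine resolves \u00f6 while parsing. Text read from a file instead holds a backslash, a "u" and four hex digits as six separate characters, so nothing is converted automatically.

```javascript
// In source code, the engine resolves the escape while parsing the literal.
const fromLiteral = "l\u00f6schen";

// Text read from a file contains the backslash itself; written as a
// source literal, that requires a doubled backslash.
const fromFile = "l\\u00f6schen";

console.log(fromLiteral.length); // 7
console.log(fromFile.length);    // 12

// \u00f6 names the code point U+00F6. Its UTF-8 encoding is a different
// thing: the two bytes c3 b6.
console.log(fromLiteral.codePointAt(1).toString(16));       // f6
console.log(Buffer.from("\u00f6", "utf8").toString("hex")); // c3b6
```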
1 Answer
You can either parse it as a string literal, so that the engine resolves the Unicode escapes. To do that, wrap it in quotes before running it through JSON.parse():
JSON.parse('"' + str + '"');
// if the string contains ", escape it first
JSON.parse('"' + str.split('"').join('\\"') + '"');
// note: JSON.parse throws on raw line breaks and on backslashes that do not start a valid JSON escape
Or you can search for the Unicode escapes and replace them yourself:
str.replace(/\\u([0-9a-fA-F]{4})/g, (m,cc)=>String.fromCharCode("0x"+cc));
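
Putting the replace approach together with the ISO Latin decoding from the question might look like the sketch below. It is a minimal illustration under stated assumptions, not a full properties parser. `unescapeUnicode`, `fileBytes` and the sample content are hypothetical names and data, the Buffer stands in for the bytes of the real file, and the regex does not special-case an escaped backslash placed directly before a "u".

```javascript
// Hypothetical helper: replace each \uXXXX escape with the UTF-16 code unit
// it names. A surrogate pair written as two consecutive escapes also works,
// because the two code units end up adjacent in the result.
function unescapeUnicode(text) {
  return text.replace(/\\u([0-9a-fA-F]{4})/g, (match, hex) =>
    String.fromCharCode(parseInt(hex, 16))
  );
}

// Stand-in for the file contents: ISO-8859-1 bytes holding a literal
// \u00f6 escape, raw Latin-1 characters and line breaks.
const fileBytes = Buffer.from("Einstellungen l\\u00f6schen\nGr\xfc\xdfe\n", "latin1");

// "latin1" maps every byte to the code point with the same value.
const raw = fileBytes.toString("latin1");
const decoded = unescapeUnicode(raw);

console.log(decoded);
// Einstellungen löschen
// Grüße
```

In a real program the bytes would presumably come from the file itself, for example via fs.readFileSync(path, "latin1"), which returns the decoded string directly. Unlike the JSON.parse variant, the replace approach is not affected by line breaks or quotes in the file.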