I have a properties file which is encoded using ISO Latin but with special characters as UTF-8 escape sequences, for example the following string:

Einstellungen l\u00f6schen

I've tried a bunch of different combinations of iconv, punycode and JSON.parse, yet none of them do what I need, which is to convert these strings to proper UTF-8 that works with JavaScript. No matter how I go about it, the strings still contain their escape sequences when I print them.

Note that the file is a longer file with some line breaks etc if that makes any difference.

How do I read this file in a way which prints the correct characters?

asked Jun 2, 2016 at 7:31 by Richard
  • FYI, JavaScript \u escape sequences have nothing to do with UTF-8. The number is the unicode codepoint [reference]. – Álvaro González Commented Jun 2, 2016 at 7:36
  • Have you tried console.log("Einstellungen l\u00f6schen") => Einstellungen löschen. JavaScript will automatically do the conversion for you. – phuzi Commented Jun 2, 2016 at 7:41
  • JSON.parse('"' + str.split('"').join('\\"') + '"') or str.replace(/\\u([0-9a-fA-F]{4})/g, (m,cc)=>String.fromCharCode("0x" + cc)) – Thomas Commented Jun 2, 2016 at 7:42
  • Yes, I've noticed that as well but for whatever reason it doesn't work when the string is parsed from the file which confuses me. – Richard Commented Jun 2, 2016 at 7:43
  • @Thomas str.replace(/\\u([0-9a-fA-F]{4})/g, (m,cc)=>String.fromCharCode("0x" + cc)) did the trick! Feel free to post it as an answer and I'll accept it as soon as I can :) – Richard Commented Jun 2, 2016 at 7:44
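
To illustrate the distinction raised in the comments above, here is a minimal sketch, assuming Node.js (for `Buffer`). It suggests why the literal in `console.log` prints correctly while the same text read from a file does not: the engine decodes `\u00f6` only when it parses a string literal in source code, and the number is a code point rather than a UTF-8 byte sequence. The variable names are illustrative only.

```javascript
// In source code, the engine decodes the escape while parsing the literal.
const fromLiteral = "l\u00f6schen"; // 7 characters, contains "ö"

// Text read from a file holds a real backslash followed by "u00f6".
// Written as a literal, that needs a doubled backslash.
const fromFile = "l\\u00f6schen"; // 12 characters, no "ö"

console.log(fromLiteral.length, fromFile.length); // 7 12

// The number after \u is the code point, not a UTF-8 byte sequence.
console.log(fromLiteral.codePointAt(1).toString(16)); // f6
console.log(Buffer.from("\u00f6", "utf8").toString("hex")); // c3b6
```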

1 Answer

Either parse it as a string literal, so that the engine decodes the Unicode escapes, which means wrapping it in quotes before running it through JSON.parse():

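// Wrap the text in quotes so JSON.parse decodes the \uXXXX escapes.
JSON.parse('"' + str + '"');
// If the string contains ", escape it first.
JSON.parse('"' + str.split('"').join('\\"') + '"');
// Raw line breaks are not allowed inside a JSON string, so parse one line at a time.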

or search for the Unicode escapes and replace them yourself:

// Convert each \uXXXX escape to the character with that hexadecimal code.
str.replace(/\\u([0-9a-fA-F]{4})/g, (m, cc) => String.fromCharCode(parseInt(cc, 16)));
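
Putting the replace approach together with reading the file, here is a minimal sketch, assuming Node.js, a file read as `latin1` (the question describes the file as ISO Latin encoded), and only four-digit `\uXXXX` escapes. `unescapeUnicode`, `readProperties` and any path passed to them are hypothetical names used for illustration, not part of any library.

```javascript
const fs = require("fs");

// Replace every \uXXXX escape with the UTF-16 code unit it names.
// A surrogate pair written as two escapes ends up adjacent in the result,
// so it combines into a single character.
function unescapeUnicode(text) {
  return text.replace(/\\u([0-9a-fA-F]{4})/g, (match, hex) =>
    String.fromCharCode(parseInt(hex, 16))
  );
}

// Hypothetical helper: read a Latin-1 encoded file and decode its escapes.
function readProperties(path) {
  return unescapeUnicode(fs.readFileSync(path, "latin1"));
}

console.log(unescapeUnicode("Einstellungen l\\u00f6schen")); // Einstellungen löschen
```

Because the replacement runs over the whole text, line breaks in the file need no special handling here, unlike the JSON.parse variant. Other escapes that a properties file may contain (such as `\n` or `\:`) are left untouched by this sketch.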

Tags: javascript - How to unescape UTF8 characters in Node (u00f6) - Stack Overflow