I'm trying to obfuscate some JavaScript by altering its character codes, but I've found that in Python 2.7 I can't correctly print characters outside of a certain range.
For example, here's what I'm trying to do:
f = open('text.txt','w')
f.write(unichr(510).encode('utf-8'))
f.close()
I can't write unichr(510) directly because Python reports that the 'ascii' codec can't encode it (ordinal not in range), so I encode it as UTF-8. This turns the single character u'\u01fe' into the two bytes '\xc7\xbe'.
Now, in javascript, it's easy to get the symbol for the character code 510:
String.fromCharCode(510)
Gives the single character: Ǿ
What I'm getting with Python is two characters: Ç¾
If I pass those characters to javascript, I can't retrieve the original single character.
I know that it is possible to print the Ǿ character in Python, but I haven't been able to figure it out. I've gotten as far as using unichr() instead of chr() and encoding the result as 'utf-8', but I'm still coming up short. I've also read that Python 3 has this functionality built into the chr() function, but that won't help me.
Does anyone know how I can accomplish this task?
Thank you.
asked Apr 8, 2013 at 1:16 by bozdoz
How are you passing the '\xc7\xbe' to JavaScript? Those two consecutive bytes (not to be confused with the characters Ç¾) are the UTF-8 encoding of Ǿ, which JavaScript should recognize as such (or at least treat no differently than a Ǿ appearing in a UTF-8 encoded JS file). – jwodder Commented Apr 8, 2013 at 1:22
I'm saving the '\xc7\xbe' to a JavaScript file. Also, it is treating it as two separate characters. @jwodder – bozdoz Commented Apr 8, 2013 at 1:25
3 Answers
You should open the file in binary mode:
f = open('text.txt','wb')
And then write the bytes (in Python 3):
f.write(chr(510).encode('utf-8'))
Or in Python 2:
f.write(unichr(510).encode('utf-8'))
Finally, close the file:
f.close()
Or, better, open the file in text mode with an explicit encoding (Python 3):
>>> f = open('e:\\text.txt','wt',encoding="utf-8")
>>> f.write(chr(510))
>>> f.close()
After that, you could read the file as:
>>> f = open('e:\\text.txt','rb')
>>> content = f.read().decode('utf-8')
>>> content
'Ǿ'
Or
>>> f = open('e:\\text.txt','rt',encoding='utf-8')
>>> f.read()
'Ǿ'
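Tested on Windows 7 with Python 3. The binary-mode version also works on Python 2.x, but the built-in open() there does not accept encoding=; io.open or codecs.open does.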
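The write-then-read round trip above can be sketched so that it runs on both Python 2.7 and Python 3. This is only an illustration under a few assumptions: it uses io.open rather than the built-in open(), and the to_char alias and the temporary file path are made up for the example.

```python
import io
import os
import tempfile

try:
    to_char = unichr  # Python 2: builds a unicode character from a code point
except NameError:
    to_char = chr     # Python 3: chr() already returns a Unicode str

# Hypothetical output location, chosen only for this example.
path = os.path.join(tempfile.mkdtemp(), 'text.txt')

# Text mode with an explicit encoding: unicode goes in, UTF-8 bytes land on disk.
with io.open(path, 'w', encoding='utf-8') as f:
    f.write(to_char(510))

# Reading in binary mode shows the raw two-byte UTF-8 sequence for U+01FE.
with io.open(path, 'rb') as f:
    raw = f.read()

# Decoding those two bytes as UTF-8 gives back the single character.
text = raw.decode('utf-8')
print(repr(raw), len(text))
```

The file holds two bytes but one character; whether a reader sees Ǿ depends on it decoding the file as UTF-8.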
How about this?
import codecs
outfile = codecs.open(r"C:\temp\unichr.txt", mode='w', encoding="utf-8")
outfile.write(unichr(510))
outfile.close()
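A quick way to check that codecs.open produces the same file as encoding by hand and writing in binary mode is to compare the bytes on disk. The sketch below assumes a throwaway temporary directory and made-up file names; it is not taken from the answer itself.

```python
import codecs
import os
import tempfile

char = u'\u01fe'  # the same character as unichr(510) / chr(510)

# Hypothetical scratch directory and file names for the comparison.
folder = tempfile.mkdtemp()
via_codecs = os.path.join(folder, 'codecs.txt')
via_binary = os.path.join(folder, 'binary.txt')

# codecs.open encodes the unicode text while writing.
outfile = codecs.open(via_codecs, mode='w', encoding='utf-8')
outfile.write(char)
outfile.close()

# Manual route: encode first, then write the bytes unchanged.
with open(via_binary, 'wb') as f:
    f.write(char.encode('utf-8'))

# Read both files back as raw bytes and compare them.
with open(via_codecs, 'rb') as f:
    codecs_bytes = f.read()
with open(via_binary, 'rb') as f:
    binary_bytes = f.read()

same = codecs_bytes == binary_bytes
print(same)
```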
Python is writing the bytes '\xc7\xbe' to the file:
In [45]: unichr(510).encode('utf-8')
Out[45]: '\xc7\xbe'
JavaScript is apparently forming the unicode u'\xc7\xbe' instead:
In [46]: 'Ç¾'.decode('utf-8')
Out[46]: u'\xc7\xbe'
In [47]: 'Ç¾'.decode('utf-8').encode('latin-1')
Out[47]: '\xc7\xbe'
The problem is in how JavaScript is converting the bytes to unicode, not in how Python is writing the bytes.
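The mismatch can be reproduced in Python alone. The following is a minimal sketch in Python 3 syntax, assuming the reader of the file falls back to Latin-1 instead of UTF-8. The js_escape helper is hypothetical: it shows one possible way to sidestep the encoding question by emitting ASCII-only \uXXXX escapes, which JavaScript string literals understand.

```python
data = chr(510).encode('utf-8')      # the bytes Python writes to the file

as_utf8 = data.decode('utf-8')       # intended reading: one character, U+01FE
as_latin1 = data.decode('latin-1')   # mistaken reading: U+00C7 followed by U+00BE


def js_escape(text):
    """Hypothetical helper: return text as ASCII-only JavaScript \\uXXXX escapes.

    Only meant for code points up to U+FFFF, i.e. characters that fit in a
    single UTF-16 code unit.
    """
    return ''.join('\\u%04x' % ord(ch) for ch in text)


escaped = js_escape(as_utf8)         # six ASCII characters: backslash, u, 0, 1, f, e
print(len(as_utf8), len(as_latin1), escaped)
```

Because the escaped form is pure ASCII, it should survive regardless of which encoding the browser assumes for the .js file. Alternatively, making sure the file is actually served and read as UTF-8 should let the raw two-byte form come through as a single Ǿ.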