'❌'[0] === '❌' // true
'✔️'[0] === '✔️' // false
'✔️'[0] === '✔' // true
I suspect it's Unicode related, but I would like to understand precisely what is happening and how I can correctly compare such characters. Why is '✔️' treated differently than '❌'?
I encountered it in this simple character count:
'✔️❌✔️❌'.split('').filter(e => e === '❌').length // 2
'✔️❌✔️❌'.split('').filter(e => e === '✔️').length // 0
asked Oct 12, 2021 at 23:51
Wilhelm Olejnik
- 2 I think the check-mark is a two-character sequence, while the "X" is not. – Pointy Commented Oct 12, 2021 at 23:55
- 1 You encountered a surrogate pair: stackoverflow.com/questions/31986614/what-is-a-surrogate-pair – Adelin Commented Oct 13, 2021 at 0:01
- 1 thorough explication here: Emojis in Javascript from this question: How to convert one emoji character to Unicode codepoint number in JavaScript? – pilchard Commented Oct 13, 2021 at 0:12
- 2 It’s not surrogate pairs; this is a grapheme cluster made out of the U+2714 HEAVY CHECK MARK and the U+FE0F VARIATION SELECTOR-16. – Sebastian Simon Commented Oct 13, 2021 at 0:15
3 Answers
Because ✔️ takes two characters:
"✔️".length === 2
"✔️"[0] === "✔", and "✔️"[1] denotes color, I think.
And "❌".length === 1, so it takes only one character.
It's similar to the way emojis with different skin colors work as well.
As to how to compare, I think that "✔️".codePointAt(0) (not to be confused with charCodeAt()) might help. See https://thekevinscott.com/emojis-in-javascript/:
codePointAt and fromCodePoint are new methods introduced in ES2015 that can handle unicode characters whose UTF-16 encoding is greater than 16 bits, which includes emojis. Use these instead of charCodeAt, which doesn’t handle emoji correctly.
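To make the "two characters" point concrete, here is a minimal sketch that lists the code points of both strings. The helper name codePoints is made up for illustration, and the strings are written with escapes so the invisible selector cannot get lost when copying. Note that both symbols sit in the Basic Multilingual Plane, so codePointAt and charCodeAt return the same numbers here; what differs is how many code points each string contains.

```javascript
// Escapes are used so the invisible U+FE0F cannot be dropped by copy/paste.
const check = '\u2714\uFE0F'; // HEAVY CHECK MARK + VARIATION SELECTOR-16
const cross = '\u274C';       // CROSS MARK

// Hypothetical helper: list the code points of a string as U+XXXX labels.
function codePoints(str) {
  // Spreading a string iterates by code point, not by UTF-16 code unit.
  return [...str].map(
    (ch) => 'U+' + ch.codePointAt(0).toString(16).toUpperCase().padStart(4, '0')
  );
}

console.log(codePoints(check));          // [ 'U+2714', 'U+FE0F' ]
console.log(codePoints(cross));          // [ 'U+274C' ]
console.log(check.length, cross.length); // 2 1
```

So check[0] is the bare U+2714, which is a different string from the two-code-point check.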
The second char, '✔️'[1] (code point 65039, i.e. U+FE0F), is a Variation Selector:
A Variation Selector specifies that the preceding character should be displayed with emoji presentation. Only required if the preceding character defaults to text presentation.
Often used in Emoji ZWJ Sequences, where one or more characters in the sequence have text and emoji presentation, but otherwise default to text (black and white) display.
Examples Snowman as text: ☃. Snowman as Emoji: ☃️
Black Heart as text: ❤. Black Heart as Emoji: ❤️ (not so black)
Variation Selector-16 was approved as part of Unicode 3.2 in 2002.
https://unicode-table.com/en/FE0F/
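Since the check mark plus its variation selector form one user-perceived character (a grapheme cluster), one way to count them is to split on grapheme boundaries instead of code units. The following is only a sketch: it assumes a runtime that ships Intl.Segmenter (recent browsers and Node versions do), and countGrapheme is a name invented for this example.

```javascript
const check = '\u2714\uFE0F'; // check mark followed by VARIATION SELECTOR-16
const cross = '\u274C';       // cross mark
const text = check + cross + check + cross;

// Hypothetical helper: count the grapheme clusters of `str` equal to `target`.
function countGrapheme(str, target) {
  const segmenter = new Intl.Segmenter(undefined, { granularity: 'grapheme' });
  let count = 0;
  for (const { segment } of segmenter.segment(str)) {
    if (segment === target) count += 1;
  }
  return count;
}

console.log(countGrapheme(text, check));    // 2
console.log(countGrapheme(text, cross));    // 2
console.log(countGrapheme(text, '\u2714')); // 0: the bare check mark is a different string
```

The selector stays attached to the preceding check mark, so each segment can be compared against the full two-code-point string.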
I believe the '✔️' is made up of 2 components. When you output '✔️'[0] you get '✔', and the black checkmark does not equal the green checkmark.
However, the '❌' is made up of just a single component, so when you output '❌'[0], you get the same thing: '❌'.
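If the goal is just the count from the question, two simple workarounds come to mind, sketched below with made-up helper names (countOccurrences, stripVS16). The first counts the whole substring instead of splitting into single code units; the second removes U+FE0F first, so the black and the green checkmark compare equal. The second one throws away the presentation hint, so it only fits when that distinction doesn't matter.

```javascript
const check = '\u2714\uFE0F'; // check mark followed by VARIATION SELECTOR-16
const cross = '\u274C';       // cross mark
const text = check + cross + check + cross;

// Hypothetical helper: count non-overlapping occurrences of a non-empty substring.
function countOccurrences(str, target) {
  return str.split(target).length - 1;
}

// Hypothetical helper: drop VARIATION SELECTOR-16 so both renderings compare equal.
function stripVS16(str) {
  return str.replace(/\uFE0F/g, '');
}

console.log(countOccurrences(text, check)); // 2
console.log(countOccurrences(text, cross)); // 2
console.log(stripVS16(check) === '\u2714'); // true
// After stripping, the original split('') approach finds the check marks too.
console.log(stripVS16(text).split('').filter((e) => e === '\u2714').length); // 2
```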