The help doc for pandas.Series.nbytes shows the following example:
>>> s = pd.Series(['Ant', 'Bear', 'Cow'])
>>> s
0     Ant
1    Bear
2     Cow
dtype: object
>>> s.nbytes
24
<< end example >>
How is that 24 bytes?
I tried looking at three different encodings, none of which seems to yield that total.
print(s.str.encode('utf-8').str.len().sum())
print(s.str.encode('utf-16').str.len().sum())
print(s.str.encode('ascii').str.len().sum())
10
26
10
1 Answer
Pandas nbytes does not refer to the bytes required to store the string data encoded in a specific format such as UTF-8, UTF-16, or ASCII. It refers to the total number of bytes consumed by the underlying array of the Series data in memory.
When the Series uses the object dtype, Pandas stores a NumPy array of pointers to the Python string objects, not the strings themselves.
On a 64-bit system, each pointer/reference takes 8 bytes, so the three-element Series reports 3 × 8 bytes = 24 bytes, regardless of how long the strings are.
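The pointer arithmetic above can be checked directly. This is a minimal sketch, not taken from the original answer: it forces dtype=object explicitly (an assumption on my part, since newer pandas versions may infer a dedicated string dtype instead), and it uses struct.calcsize("P") to get the platform's pointer size rather than hard-coding 8.

```python
import struct

import pandas as pd

# Assumption: force object dtype so the Series holds pointers to Python str objects.
s = pd.Series(["Ant", "Bear", "Cow"], dtype=object)

pointer_size = struct.calcsize("P")  # bytes per pointer on this platform (8 on 64-bit)
values = s.to_numpy()                # NumPy object array holding the references

# nbytes is just (number of elements) x (size of one pointer).
print(values.itemsize == pointer_size)
print(s.nbytes == len(s) * pointer_size)

# String length does not matter: much longer strings give the same nbytes.
long_s = pd.Series(["x" * 1000] * 3, dtype=object)
print(long_s.nbytes == s.nbytes)

# memory_usage(deep=True) also counts the Python string objects themselves.
shallow = s.memory_usage(index=False)
deep = s.memory_usage(index=False, deep=True)
print(shallow, deep)
```

If the goal is the actual memory footprint including the strings, memory_usage(deep=True) is probably the closer measure; nbytes only covers the array of references.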
Link: nbytes source code
Link: ndarray documentation