I found that setting a pandas DataFrame column to a NumPy array whose dtype is object raises a weird error. Why does this happen?
The code I ran is as follows:
import numpy as np
import pandas as pd
print(f"numpy version: {np.__version__}")
print(f"pandas version: {pd.__version__}")
data = pd.DataFrame({
    "c1": [1, 2, 3, 4, 5],
})
print("-" * 10)
t1 = np.array([["A"], ["B"], ["C"], ["D"], ["E"]])
data["c1"] = t1 # This works well
print("-" * 10)
t2 = np.array([["A"], ["B"], ["C"], ["D"], ["E"]], dtype=object)
data["c1"] = t2 # This throws an error
print("-" * 10)
The output is:
numpy version: 1.26.4
pandas version: 2.2.2
----------
----------
Traceback (most recent call last):
File "...\test.py", line 19, in <module>
data["c1"] = t2 # This throws an error
~~~~^^^^^^
File "...\pandas\core\frame.py", line 4311, in __setitem__
self._set_item(key, value)
File "...\pandas\core\frame.py", line 4524, in _set_item
value, refs = self._sanitize_column(value)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "...\pandas\core\frame.py", line 5267, in _sanitize_column
arr = sanitize_array(value, self.index, copy=True, allow_2d=True)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "...\pandas\core\construction.py", line 606, in sanitize_array
subarr = maybe_infer_to_datetimelike(data)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "...\pandas\core\dtypes\cast.py", line 1182, in maybe_infer_to_datetimelike
raise ValueError(value.ndim) # pragma: no cover
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ValueError: 2
asked Feb 21 at 10:15 by Tony Ding
1 Answer
I'm not sure why this raises an error only with dtype=object, but your arrays are 2D.
A Series is a 1D object.
If you convert them to 1D this works fine:
data['c1'] = t2.ravel() # works fine
data['c1'] = t2.squeeze() # also works fine
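Judging only from the traceback in the question, the object-dtype array is passed to maybe_infer_to_datetimelike, which raises ValueError(value.ndim), so the 2 in the message appears to be the array's number of dimensions. A minimal sketch of a defensive way to do the conversion follows. The helper name to_1d_column is hypothetical (it is not part of pandas or NumPy), and it assumes the array is meant to be a single column of shape (n,) or (n, 1):

```python
import numpy as np
import pandas as pd


def to_1d_column(values):
    """Return a 1D array for input of shape (n,) or (n, 1); reject anything wider.

    Hypothetical helper for illustration only, not a pandas or NumPy function.
    """
    arr = np.asarray(values)
    if arr.ndim == 1:
        return arr
    if arr.ndim == 2 and arr.shape[1] == 1:
        # ravel() always returns a 1D array, even when there is only one row
        return arr.ravel()
    raise ValueError(f"expected shape (n,) or (n, 1), got {arr.shape}")


data = pd.DataFrame({"c1": [1, 2, 3, 4, 5]})
t2 = np.array([["A"], ["B"], ["C"], ["D"], ["E"]], dtype=object)

flat = to_1d_column(t2)
print(t2.shape, "->", flat.shape)

# Assigning the flattened array sets the column as usual
data["c1"] = flat
print(data)
```

One reason to prefer ravel() over squeeze() here: on an array of shape (1, 1), squeeze() returns a 0-dimensional array rather than a 1D array of length 1, while ravel() keeps the result 1D.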