I am trying to apply target encoding to categorical features using category_encoders.TargetEncoder in Python. However, I keep getting the following error:
AttributeError: 'numpy.ndarray' object has no attribute 'groupby'
from category_encoders import TargetEncoder
from sklearn.model_selection import train_test_split

# Features for target encoding
encoding_cols = ['grade', 'sub_grade', 'home_ownership', 'verification_status',
                 'purpose', 'application_type', 'zipcode']

# Train-Test Split
X_train_cv, X_test, y_train_cv, y_test = train_test_split(x, y, test_size=0.25, random_state=1)
X_train, X_test_cv, y_train, y_test_cv = train_test_split(X_train_cv, y_train_cv, test_size=0.25, random_state=1)

# Initialize the Target Encoder
encoder = TargetEncoder()

# Apply Target Encoding
for i in encoding_cols:
    X_train[i] = encoder.fit_transform(X_train[i], y_train)  # Error occurs here
    X_test_cv[i] = encoder.transform(X_test_cv[i])
    X_test[i] = encoder.transform(X_test[i])
I want to apply target encoding to the categorical columns without encountering the 'numpy.ndarray' object has no attribute 'groupby' error.
2 Answers
This is interesting. I can reproduce your error.
It is related to the dtype of y_train. To work around the issue, you need to force a conversion by rebuilding the Series from its list values, setting the name and index explicitly.
y_train = pd.Series(y_train.tolist(), name='loan_status', index=y_train.index)
This will convert your initial dtype of CategoricalDtype(categories=[1, 0], ordered=False, categories_dtype=int64) to dtype('int64').
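The conversion above can be checked in isolation. The following is a minimal sketch that assumes a small made-up target column (the values and index are hypothetical, not the asker's data) and a hypothetical helper named to_plain_series:

```python
import pandas as pd

# Hypothetical stand-in for the question's target column, stored as a categorical.
y_train = pd.Series(
    [1, 0, 0, 1], index=[10, 11, 12, 13], name="loan_status"
).astype("category")


def to_plain_series(s: pd.Series) -> pd.Series:
    """Rebuild a Series from its Python list values, keeping name and index."""
    return pd.Series(s.tolist(), name=s.name, index=s.index)


y_plain = to_plain_series(y_train)

print(y_train.dtype)  # the categorical dtype before conversion
print(y_plain.dtype)  # the plain integer dtype after conversion
```

Keeping the original index matters because the encoder aligns the target with the rows of X_train.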
So your last cell in the Colab is now:
import category_encoders as ce
import pandas as pd

# Initialize TargetEncoder for all categorical columns at once
encoder = ce.TargetEncoder(cols=encoding_cols)

# Here is the list conversion and back to a Series (index preserved)
y_train = pd.Series(y_train.tolist(), index=y_train.index)

# Fit and transform the training data
X_train = encoder.fit_transform(X_train, y_train)
and this works fine.
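For intuition about why the target needs to behave like a pandas object, here is a plain-pandas sketch of mean target encoding. It is not the library's implementation (category_encoders also applies smoothing), and the toy data and the helper names fit_target_means and apply_target_means are made up for illustration:

```python
import pandas as pd

# Hypothetical toy data; the column names echo the question.
X_train = pd.DataFrame({"grade": ["A", "B", "A", "C", "B", "A"]})
y_train = pd.Series([1, 0, 1, 0, 1, 0], name="loan_status")
X_test = pd.DataFrame({"grade": ["A", "C", "D"]})  # "D" never appears in training


def fit_target_means(col, y):
    """Learn per-category target means and the global mean from training data."""
    means = y.groupby(col).mean()  # requires y to be a pandas Series
    return means, float(y.mean())


def apply_target_means(col, means, global_mean):
    """Replace each category with its learned mean; unseen categories get the global mean."""
    return col.map(means).fillna(global_mean)


means, global_mean = fit_target_means(X_train["grade"], y_train)
train_encoded = apply_target_means(X_train["grade"], means, global_mean)
test_encoded = apply_target_means(X_test["grade"], means, global_mean)
print(test_encoded.tolist())
```

The statistics are learned from the training split only and then reused for the validation and test splits, which mirrors calling fit_transform on X_train and transform on the others.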
I'm the maintainer of Category Encoders. There was a problem in the library; I've fixed it in version 2.8.1.
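To see whether an environment already has that release, one option is to read the installed version with the standard library. This is a sketch only: parse_version and has_fix are hypothetical helpers, and the parsing is deliberately simplified (it looks only at the leading numeric parts, so pre-release suffixes are ignored).

```python
from importlib.metadata import PackageNotFoundError, version

FIXED_IN = (2, 8, 1)  # release named in the maintainer's answer


def parse_version(text):
    """Turn '2.8.1' into (2, 8, 1), stopping at the first non-numeric part."""
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def has_fix(installed):
    """Return True if the version string is at least 2.8.1."""
    return parse_version(installed) >= FIXED_IN


try:
    installed = version("category_encoders")
    status = "includes the fix" if has_fix(installed) else "predates the fix"
    print(installed, status)
except PackageNotFoundError:
    print("category_encoders is not installed")
```

If the installed version predates the fix, upgrading with pip install --upgrade category_encoders should make the list round-trip workaround unnecessary.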
pandas.DataFrame because it has function groupby – furas Commented Mar 4 at 8:19
TargetEncoder with different objects - dataframe, list, numpy.array - and it always works, I can't reproduce problem with simple code. Maybe later I would try to run your colab code. At this moment you could use print() to check type() of data before fit_transform. Maybe it can explain what can make problem – furas Commented Mar 4 at 16:01
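The check suggested in the comment above could look like the following sketch. The describe helper is hypothetical; it reports the Python type and, where available, the dtype of whatever is about to be passed to fit_transform:

```python
import numpy as np
import pandas as pd


def describe(name, obj):
    """Return a one-line summary: variable name, Python type, and dtype(s) if present."""
    type_name = type(obj).__name__
    if isinstance(obj, pd.DataFrame):
        # A DataFrame has one dtype per column.
        detail = ", ".join(f"{col}: {dtype}" for col, dtype in obj.dtypes.items())
    else:
        # Series and ndarrays expose .dtype; plain lists do not.
        detail = str(getattr(obj, "dtype", "n/a"))
    return f"{name}: {type_name} ({detail})"


# Hypothetical inputs for illustration.
y_cat = pd.Series([1, 0, 1]).astype("category")
print(describe("y_train", y_cat))
print(describe("as_array", np.array([1.0, 2.0])))
print(describe("as_list", [1, 0, 1]))
```

In the question's case, a check like this on y_train would have surfaced the categorical dtype that the first answer identified.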