python - 'numpy.ndarray' object has no attribute 'groupby' - Stack Overflow

I am trying to apply target encoding to categorical features using the category_encoders.TargetEncoder in Python. However, I keep getting the following error:

AttributeError: 'numpy.ndarray' object has no attribute 'groupby'

from category_encoders import TargetEncoder
from sklearn.model_selection import train_test_split

# Features for target encoding
encoding_cols = ['grade', 'sub_grade', 'home_ownership', 'verification_status', 
                 'purpose', 'application_type', 'zipcode']

# Train-Test Split
X_train_cv, X_test, y_train_cv, y_test = train_test_split(x, y, test_size=0.25, random_state=1)
X_train, X_test_cv, y_train, y_test_cv = train_test_split(X_train_cv, y_train_cv, test_size=0.25, random_state=1)

# Initialize the Target Encoder
encoder = TargetEncoder()

# Apply Target Encoding
for i in encoding_cols:
    X_train[i] = encoder.fit_transform(X_train[i], y_train)  # **Error occurs here**
    X_test_cv[i] = encoder.transform(X_test_cv[i])
    X_test[i] = encoder.transform(X_test[i])

I want to apply target encoding to the categorical columns without hitting the 'numpy.ndarray' object has no attribute 'groupby' error.

asked Mar 4 at 8:00 by Ironman; edited Mar 15 at 18:57 by desertnaut

Always include the full error message, because it contains other useful information. – furas Commented Mar 4 at 8:19
Maybe it needs a pandas.DataFrame, because that has a groupby method. – furas Commented Mar 4 at 8:19
I tried running TargetEncoder with different objects (DataFrame, list, numpy.array) and it always works; I can't reproduce the problem with simple code. Maybe later I will try running your Colab code. For now, you could use print() to check the type() of the data before fit_transform. That may explain what causes the problem. – furas Commented Mar 4 at 16:01
(1) Always include the full error message, because it contains other useful information. (2) The code in your Colab is slightly different from the code in your question, which can make a difference. Always show the code that gives you the error. (3) You could add the link to the question; it will be more visible there, so more people may help you. – furas Commented Mar 4 at 16:07
Please try to provide a minimal reproducible example. When I run most of your code I get the error you report, but when I try running just the data import, split, target definition, and encoder fit (without specifying columns) it works fine. – Ben Reiniger Commented Mar 5 at 2:27
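
The type check suggested in the comments could look like the following. This is only a minimal sketch: `describe` is a hypothetical helper, and the small `X_train` and `y_train` objects are made-up stand-ins for the question's data, not the actual loan dataset.

```python
import pandas as pd


def describe(name, obj):
    """Return a one-line summary of an object's container type and dtype(s)."""
    kind = type(obj).__name__
    if isinstance(obj, pd.DataFrame):
        # A DataFrame has one dtype per column.
        dtypes = ", ".join(f"{col}={dtype}" for col, dtype in obj.dtypes.items())
    else:
        # Series and ndarray expose a single dtype; plain lists have none.
        dtypes = str(getattr(obj, "dtype", "n/a"))
    return f"{name}: {kind} [{dtypes}]"


# Made-up stand-ins for the question's data (assumption, not the real dataset).
X_train = pd.DataFrame({"grade": ["A", "B", "A"], "zipcode": ["11650", "22690", "11650"]})
y_train = pd.Series([1, 0, 1], name="loan_status").astype("category")

# Print these right before calling encoder.fit_transform(...).
print(describe("X_train", X_train))
print(describe("y_train", y_train))
print(describe("y_train.to_numpy()", y_train.to_numpy()))
```

A `category` dtype on the target, or an `ndarray` where a Series was expected, is the kind of detail this check is meant to surface.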

2 Answers

This is interesting. I can reproduce your error.

It is related to the dtype of y_train, which is categorical. To solve the issue, rebuild the Series from its list values, setting the name and index explicitly.

y_train = pd.Series(y_train.tolist(), name='loan_status', index=y_train.index)

This converts the initial dtype, CategoricalDtype(categories=[1, 0], ordered=False, categories_dtype=int64), to dtype('int64').
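
The dtype change can be checked in isolation with pandas alone. The sketch below uses a small made-up target Series (an assumption standing in for the real `loan_status` column) and also shows `astype("int64")` as a possible alternative to the list round-trip; whether that alternative also avoids the encoder error was not tested in this thread.

```python
import pandas as pd

# Made-up stand-in for the question's target: a categorical Series with
# integer categories [1, 0] and a non-default index, as left by a train/test split.
y_train = pd.Series([1, 0, 0, 1], index=[10, 11, 12, 13], name="loan_status").astype(
    pd.CategoricalDtype(categories=[1, 0])
)
print(y_train.dtype)  # category

# The round-trip through a Python list, keeping the name and index.
y_fixed = pd.Series(y_train.tolist(), name="loan_status", index=y_train.index)
print(y_fixed.dtype)  # int64

# Possible alternative (assumption): cast the categorical directly.
y_astype = y_train.astype("int64")
print(y_fixed.equals(y_astype))
```

Keeping the original index matters because the encoder aligns the target with the rows of X_train.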

So the last cell in your Colab notebook becomes:

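import category_encoders as ce
import pandas as pd

# Initialize TargetEncoder for all categorical columns at once
encoder = ce.TargetEncoder(cols=encoding_cols)

# Convert the categorical target to a list and back to an int64 Series,
# keeping the original index so it stays aligned with X_train
y_train = pd.Series(y_train.tolist(), index=y_train.index)

# Fit and transform the training data
X_train = encoder.fit_transform(X_train, y_train)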

With this change, the cell runs without the error.

I'm the maintainer of Category Encoders. There was a problem in the library; I have fixed it in version 2.8.1.
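
To see whether an installed copy already includes that release, something along these lines may help. It is a rough sketch: `version_tuple` and `has_fix` are hypothetical helpers, and the parsing assumes ordinary X.Y.Z version strings.

```python
import re
from importlib.metadata import PackageNotFoundError, version


def version_tuple(text):
    """Turn a version string such as '2.8.1' into a tuple of ints for comparison."""
    parts = []
    for part in text.split(".")[:3]:
        digits = re.match(r"\d+", part)  # ignore suffixes such as 'rc1'
        parts.append(int(digits.group()) if digits else 0)
    return tuple(parts)


def has_fix(installed, fixed="2.8.1"):
    """True if the installed version is at least the release named in the answer."""
    return version_tuple(installed) >= version_tuple(fixed)


try:
    installed = version("category_encoders")
    print(installed, "includes the fix" if has_fix(installed) else "predates the fix")
except PackageNotFoundError:
    print("category_encoders is not installed")
```

If the installed version predates the fix, upgrading with `pip install --upgrade category_encoders` should pick up a newer release.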
