I have a DataFrame with three columns of zeroes and ones, corresponding to three different classes. I want a single series of zeroes, ones, and twos depending on the class of each entry (0 for the first class, 1 for the second, and 2 for the third):

>>> results.head()
    HOME_WINS  DRAW  AWAY_WINS
ID                            
0           0     0          1
1           0     1          0
2           0     0          1
3           1     0          0
4           0     1          0

What I want:

>>> results.head()
    SCORE
ID                            
0       2
1       1
2       2
3       0
4       1


2 Answers

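Multiply by a dictionary that maps each column to its class label, sum across the columns, and convert the result with to_frame (df here is the results DataFrame from the question):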

d = {'HOME_WINS': 0, 'DRAW': 1, 'AWAY_WINS': 2}

out = df.mul(d).sum(axis=1).to_frame(name='SCORE')
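To see why this works, here is a minimal self-contained sketch. The sample frame is rebuilt by hand from the results.head() shown in the question, and the intermediate name weighted is only for illustration:

```python
import pandas as pd

# Sample frame rebuilt by hand from the question's results.head()
df = pd.DataFrame(
    {
        "HOME_WINS": [0, 0, 0, 1, 0],
        "DRAW": [0, 1, 0, 0, 1],
        "AWAY_WINS": [1, 0, 1, 0, 0],
    },
    index=pd.Index(range(5), name="ID"),
)

d = {"HOME_WINS": 0, "DRAW": 1, "AWAY_WINS": 2}

# Each column is scaled by its class label, so a 1 in DRAW stays 1,
# a 1 in AWAY_WINS becomes 2, and everything else is 0
weighted = df.mul(d)

# Summing across the columns collapses each row to the label of its class
out = weighted.sum(axis=1).to_frame(name="SCORE")
print(out)
```

Since each row is assumed to hold exactly one 1, the row sum is simply the label of the column that holds it.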

Or using a dot product:

d = {'HOME_WINS': 0, 'DRAW': 1, 'AWAY_WINS': 2}

out = df.dot(pd.Series(d)).to_frame(name='SCORE')
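The dot product computes the same weighted row sum in one step. As far as I know, DataFrame.dot matches the Series index against the column names by label, so the order of the keys should not matter. A small sketch, where the sample frame is rebuilt from the question and the shuffled weights Series is my own illustration:

```python
import pandas as pd

# Sample frame rebuilt by hand from the question's results.head()
df = pd.DataFrame(
    {
        "HOME_WINS": [0, 0, 0, 1, 0],
        "DRAW": [0, 1, 0, 0, 1],
        "AWAY_WINS": [1, 0, 1, 0, 0],
    },
    index=pd.Index(range(5), name="ID"),
)

# Weights deliberately listed in a different order than the columns
weights = pd.Series({"AWAY_WINS": 2, "HOME_WINS": 0, "DRAW": 1})

# dot aligns the Series index with the DataFrame columns before multiplying
out = df.dot(weights).to_frame(name="SCORE")
print(out)
```

Note that the labels must match exactly: dot raises a ValueError if the Series index and the columns differ.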

Or, if there is exactly one 1 per row, with from_dummies (pandas 1.5 or later):

d = {'HOME_WINS': 0, 'DRAW': 1, 'AWAY_WINS': 2}

out = pd.from_dummies(df)[''].map(d).to_frame(name='SCORE')

Output:

    SCORE
ID       
0       2
1       1
2       2
3       0
4       1
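The "exactly one 1 per row" condition on the from_dummies variant matters: to my knowledge, from_dummies raises a ValueError for a row with no 1 (unless default_category is passed) or with more than one 1. A small sketch on a made-up frame (bad is hypothetical, not the question's data), together with a cheap check you could run first:

```python
import pandas as pd

# Hypothetical frame: the second row has no 1 at all
bad = pd.DataFrame(
    {"HOME_WINS": [1, 0], "DRAW": [0, 0], "AWAY_WINS": [0, 0]}
)

# from_dummies refuses to decode a row without an assigned category
try:
    pd.from_dummies(bad)
    raised = False
except ValueError as exc:
    raised = True
    print(f"from_dummies failed: {exc}")

# Cheap validity check: True only for rows holding exactly one 1
valid = bad.sum(axis=1).eq(1)
print(valid)
```

The mul/sum and dot variants do not raise in this case; they would return 0 for the all-zero row, which is indistinguishable from HOME_WINS.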

Another possible solution, whose steps are:

  • idxmax(axis=1) returns, for each row, the name of the column holding the highest value (which is 1 in our case).

  • The resulting series is then mapped to numerical scores using map with the dictionary d = {'HOME_WINS': 0, 'DRAW': 1, 'AWAY_WINS': 2}.

  • Finally, to_frame('Score') converts the series into a DataFrame with a single column named Score.

d = {'HOME_WINS': 0, 'DRAW': 1, 'AWAY_WINS': 2}
df.idxmax(axis=1).map(d).to_frame('Score')

Output:

    Score
ID       
0       2
1       1
2       2
3       0
4       1
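One caveat with idxmax: on ties it returns the first column holding the maximum, so a row of all zeroes is silently labelled HOME_WINS (score 0). Below is a minimal sketch on a made-up frame (the data and the names is_one_hot and checked are my own, not from the question) that masks such rows instead:

```python
import pandas as pd

d = {"HOME_WINS": 0, "DRAW": 1, "AWAY_WINS": 2}

# Hypothetical frame: the last row is all zeroes
df = pd.DataFrame(
    {"HOME_WINS": [0, 1, 0], "DRAW": [1, 0, 0], "AWAY_WINS": [0, 0, 0]}
)

# idxmax picks the first column with the maximum, so the all-zero row maps to 0
scores = df.idxmax(axis=1).map(d)

# True only for rows holding exactly one 1
is_one_hot = df.sum(axis=1).eq(1)

# Keep the score for valid rows and put NaN where the row is not one-hot
checked = scores.where(is_one_hot).to_frame("Score")
print(checked)
```

If every row is guaranteed to hold exactly one 1, the check is unnecessary and the two-line version above is enough.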
