
What's the fastest Python way to create new columns for the sum of the following:

In my dataframe, there is an unknown number of columns named like carbrands_0_type, carbrands_1_type, etc. Each such column holds a single string value, e.g. "BMW", and there are corresponding float columns named carbrands_0_quantity, carbrands_1_quantity, etc. that hold the quantity for that type. So if carbrands_0_type is "BMW" and carbrands_0_quantity is 50, I know that for that row (event) I have 50 BMWs.

The thing is, the car brands do not appear in any fixed column and can land anywhere, so "BMW" may show up in carbrands_15_type / carbrands_15_quantity for the next row.
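To make the layout concrete, here is a small illustrative dataframe (the values are made up; unused slots are NaN):

```python
import pandas as pd
import numpy as np

# Hypothetical sample: brand/quantity pairs can sit in any column slot,
# and a slot unused for a given row is NaN.
data = pd.DataFrame({
    'carbrands_0_type':      ['BMW',  'Audi'],
    'carbrands_0_quantity':  [50.0,   20.0],
    'carbrands_1_type':      ['Audi', np.nan],
    'carbrands_1_quantity':  [10.0,   np.nan],
    'carbrands_15_type':     [np.nan, 'BMW'],
    'carbrands_15_quantity': [np.nan, 5.0],
})
```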

Tentatively, I would need to take the string name, e.g. "Audi", and create a new column named 'Audi' holding the corresponding quantity for the entire dataframe. What I have done is the following:

def convert_sum_type_quantity(row, df, start_string, end_str1, end_str2, character):
    total_sum = 0
    # Count how many carbrands_<i>_type columns exist, to know how far to iterate
    val = len([x for x in df.columns if x.startswith(start_string) and x.endswith(end_str1)])

    for i in range(val):
        qnty_col = start_string + '_' + str(i) + '_' + end_str2
        type_col = start_string + '_' + str(i) + '_' + end_str1

        # Add the quantity when the type cell is a string containing the target brand
        if isinstance(row[type_col], str) and character in row[type_col]:
            total_sum += row[qnty_col]

    return total_sum

Then I apply it to the dataframe:

data['audi'] = data.apply(lambda row: convert_sum_type_quantity(row, data, 'carbrands', 'type', 'quantity', 'audi'), axis=1)

It works, but it is draggy and slow, since apply with a lambda processes the dataframe row by row. Moreover, it takes even more time if I want more columns, like BMW or Mercedes, because the whole scan repeats per brand.

Any experts with good advice? Or even better code to get all unique car brands with their corresponding quantities?

P.S. I need the extra named columns for output to non-IT people.
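For what it's worth, one direction I've been sketching (untested at scale; the function name brand_totals and the exact column-naming assumptions are just placeholders) is reshaping the type/quantity pairs to long form and pivoting, so all brands are computed in one pass instead of one apply per brand:

```python
import pandas as pd

def brand_totals(df, prefix='carbrands'):
    """Reshape wide type/quantity column pairs to long form, then pivot
    so each brand becomes its own column of summed quantities per row."""
    type_cols = [c for c in df.columns
                 if c.startswith(prefix) and c.endswith('_type')]
    pieces = []
    for tcol in type_cols:
        qcol = tcol.replace('_type', '_quantity')
        pieces.append(pd.DataFrame({
            'brand': df[tcol],
            'quantity': df[qcol],
            'row': df.index,
        }))
    # Stack all pairs vertically, dropping slots with no brand
    long = pd.concat(pieces).dropna(subset=['brand'])
    # One column per brand; rows realigned with the original index
    wide = long.pivot_table(index='row', columns='brand',
                            values='quantity', aggfunc='sum', fill_value=0)
    return wide.reindex(df.index, fill_value=0)
```

The result could then be joined back onto the original dataframe with df.join(brand_totals(df)) to get the named columns for the non-IT output.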
