dataframe - How to select rows based on combinations while preserving column names in pandas? - Stack Overflow

Updated: 2025-02-12

I have a DataFrame df_items and want to create combinations of its rows of size i using itertools.combinations. Each combination should maintain all columns from the original DataFrame.

Current approach: works but loses column names

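import numpy as np
from itertools import combinations

# Row positions of every combination; renamed so it no longer shadows combinations()
combo_idx = np.array(list(combinations(range(len(df_items)), i)))
selected_items = df_items.values[combo_idx]  # plain ndarray, so column names are lost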
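
To make the problem concrete, here is a minimal sketch of what the approach above returns. The sample df_items (columns A and B) and i = 2 are assumptions for illustration; the question did not include any data. Indexing df_items.values with a 2-D integer array produces a 3-D NumPy array, which carries no column labels.

```python
from itertools import combinations

import numpy as np
import pandas as pd

# Hypothetical sample data; not from the original question.
df_items = pd.DataFrame({"A": [10, 20, 30], "B": [1.5, 2.5, 3.5]})
i = 2

combo_idx = np.array(list(combinations(range(len(df_items)), i)))
selected_items = df_items.values[combo_idx]

print(combo_idx.shape)       # (3, 2): 3 combinations of 2 row positions
print(selected_items.shape)  # (3, 2, 2): combinations x rows x columns
print(type(selected_items))  # a numpy.ndarray, so there are no column names
```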

Asked Feb 7 at 18:24 by A A (new contributor)

A few lines of sample code would make this easier to visualize. Anyway, try groupby, which is the shortest way I often use to list all possible combinations. For example, if my df has columns employee and customer_id and I want every combination of those two factors, I just run df.groupby(['employee', 'customer_id'])['var'].size(). Hope this helps. – PTQuoc Commented Feb 7 at 18:27
Please add a minimal reproducible example, together with the exact desired output for that small sample. – ouroboros1 Commented Feb 7 at 18:30
Please provide samples for context. – Fred Alisson Commented Feb 7 at 18:30

1 Answer

If you want an independent DataFrame for each combination of rows, the best option is to use iloc in a loop:

for c in combinations(range(len(df_items)), 2):
    print(df_items.iloc[list(c)])

Example output:

   A  B
0  0  0
1  1  1
   A  B
0  0  0
2  2  2
   A  B
1  1  1
2  2  2

Used input:

df_items = pd.DataFrame({'A': range(3),
                         'B': range(3)})
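
Putting the loop and the sample input together, a self-contained sketch might look like the following. Collecting the pieces in a dict keyed by row positions is an addition for illustration, not part of the answer, and the name combo_frames is hypothetical.

```python
from itertools import combinations

import pandas as pd

df_items = pd.DataFrame({"A": range(3), "B": range(3)})
i = 2

# One sub-DataFrame per combination of row positions.
combo_frames = {
    c: df_items.iloc[list(c)]
    for c in combinations(range(len(df_items)), i)
}

for c, frame in combo_frames.items():
    # Each piece keeps the original column names and index labels.
    print(c, list(frame.columns), list(frame.index))
```

Because iloc returns a DataFrame rather than a raw array, each piece can still be addressed by column name, for example combo_frames[(0, 2)]["A"].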

You could also use groupby, but this will be less efficient:

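from itertools import combinations, chain
import numpy as np

i = 2

# Stack the rows of every combination into one long DataFrame
tmp = df_items.iloc[list(chain.from_iterable(combinations(range(len(df_items)), i)))]

# Every block of i consecutive rows forms one combination
groups = tmp.groupby(np.arange(len(tmp))//i)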
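
The groupby call above only builds a GroupBy object; nothing is extracted until it is iterated. Here is a sketch of how one might consume it, assuming the same sample df_items as in the answer. The names positions, group_ids and frames are hypothetical.

```python
from itertools import chain, combinations

import numpy as np
import pandas as pd

df_items = pd.DataFrame({"A": range(3), "B": range(3)})
i = 2

# Flatten all combinations into one list of row positions: [0, 1, 0, 2, 1, 2]
positions = list(chain.from_iterable(combinations(range(len(df_items)), i)))
tmp = df_items.iloc[positions]

# Label every run of i consecutive rows with the same group number.
group_ids = np.arange(len(tmp)) // i

# Iterating a GroupBy yields (group key, sub-DataFrame) pairs.
frames = [frame for _, frame in tmp.groupby(group_ids)]
for frame in frames:
    print(frame)
```

Each element of frames is a DataFrame with the original columns, so this gives the same pieces as the iloc loop, at the cost of materializing the long intermediate tmp first.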
