admin管理员组

文章数量:1400218

I have a dataframe with a column multi-index, df1, with a datetime index and 2 levels: level 0, called Capitals, has columns A, B, C, and level 1, called Smalls, has columns a, b, c, d, e.

Capitals A B C
Smalls a b c d e a b c d e a b c d e
Date
01-01-25
01-02-25
01-03-25
01-04-25

I have a dataframe with a column multi-index, df1, with a datetime index and 2 levels: level 0, called Capitals, has columns A, B, C, and level 1, called Smalls, has columns a, b, c, d, e.

Capitals A B C
Smalls a b c d e a b c d e a b c d e
Date
01-01-25
01-02-25
01-03-25
01-04-25

I have a second dataframe, df2, with the same datetime index and three columns, X, Y and Z.

X Y Z
Date
01-01-25
01-02-25
01-03-25
01-04-25

Is there a way to:

i) multiply B of df1 by Z of df2 (Ba * Z, Bb * Z, Bc * Z, Bd * Z Be * Z) and

ii) add the 5 new (Smalls: a, b, c, d ,e) columns to a new Capitals column called D in df1?

Capitals A B C D
Smalls a b c d e a b c d e a b c d e a b c d e
Date
01-01-25
01-02-25
01-03-25
01-04-25

The method I'm using first creates an empty multi-index data frame with a similar structure to df1, with the new columns I want to add, which is connected to the original multi-index data frame.

Then it iterates through the level 1 values of B, multiplying by the value of Z in the second df.

# Extract level 1 tickers from df1.columns
smalls = df1.columns.get_level_values(1).unique()

# Create new MultiIndex for the empty columns
new_columns = pd.MultiIndex.from_product(['D', smalls],names=df1.columns.names)

# Create an empty DataFrame with the new columns
empty_df = pd.DataFrame(0, index=df1.index, columns=new_columns)

# Concatenate with the original DataFrame
df1 = pd.concat([df1, empty_df], axis=1)

# Multiply dfs and populate D
for small in smalls:
    df1[('D', small)] = df1[('B', small)] / df2['Z']

Is there a more streamlined way to do this, using vectors rather than iterating?

Share Improve this question edited Mar 25 at 1:56 Shaido 28.4k25 gold badges75 silver badges81 bronze badges asked Mar 25 at 1:32 AndysPythonStuffAndysPythonStuff 1671 silver badge10 bronze badges
Add a comment  | 

2 Answers 2

Reset to default 1

You can create the new D columns by multiplying the B columns in df1 by the Z column in df2 then concatinating them by concat. The column name can be renamed using MultiIndex.from_product.

B_columns = df1.loc[:, 'B']
D_columns = B_columns.mul(df2['Z'], axis=0)

# Rename the top level from 'B' to 'D'
smalls = df1.columns.get_level_values(1).unique()
D_columns.columns = pd.MultiIndex.from_product([['D'], smalls], names=df1.columns.names)

df = pd.concat([df1, D_columns], axis=1)

Using two testing dataframes, df1:

Capitals    A                   B                   C
Smalls      a   b   c   d   e   a   b   c   d   e   a   b   c   d   e
Date                                                            
01-01-25    1   2   3   4   5   1   2   3   4   5   1   2   3   4   5
01-02-25    1   2   3   4   5   1   2   3   4   5   1   2   3   4   5
01-03-25    1   2   3   4   5   1   2   3   4   5   1   2   3   4   5
01-04-25    1   2   3   4   5   1   2   3   4   5   1   2   3   4   5

and df2:

             X   Y   Z
Date            
01-01-25    10  11  12
01-02-25    10  11  12
01-03-25    10  11  12
01-04-25    10  11  12

then above code gives:

Capitals    A                   B                   C                    D
Smalls      a   b   c   d   e   a   b   c   d   e   a   b   c   d   e    a   b   c   d   e
Date                                                                                
01-01-25    1   2   3   4   5   1   2   3   4   5   1   2   3   4   5   12  24  36  48  60
01-02-25    1   2   3   4   5   1   2   3   4   5   1   2   3   4   5   12  24  36  48  60
01-03-25    1   2   3   4   5   1   2   3   4   5   1   2   3   4   5   12  24  36  48  60
01-04-25    1   2   3   4   5   1   2   3   4   5   1   2   3   4   5   12  24  36  48  60

You can use concat for add D level to B and another concat or DataFrame.join for append to original:

np.random.seed(2)

mux = pd.MultiIndex.from_product([list('ABC'), list('abcde')])
df1 = pd.DataFrame(np.random.randint(10, size=(4, 15)), 
                   index=pd.date_range('01-01-25','01-04-25'), columns=mux)
print (df1)
            A              B              C            
            a  b  c  d  e  a  b  c  d  e  a  b  c  d  e
2025-01-01  8  8  6  2  8  7  2  1  5  4  4  5  7  3  6
2025-01-02  4  3  7  6  1  3  5  8  4  6  3  9  2  0  4
2025-01-03  2  4  1  7  8  2  9  8  7  1  6  8  5  9  9
2025-01-04  9  3  0  0  2  8  8  2  9  6  5  6  6  6  3

df2 = pd.DataFrame(np.random.randint(10, size=(4, 3)),
                   index=pd.date_range('01-01-25','01-04-25'),
                   columns=list('XYZ'))
print (df2)
            X  Y  Z
2025-01-01  8  2  1
2025-01-02  4  8  1
2025-01-03  6  9  5
2025-01-04  1  2  4

df3 = pd.concat({'D': df1.loc[:, 'B'].mul(df2['X'], axis=0)}, axis=1)
df = pd.concat([df1, df3], axis=1)

Or:

df3 = pd.concat({'D': df1.loc[:, 'B'].mul(df2['X'], axis=0)}, axis=1)


df = df1.join(df3)
print (df)
            A              B              C               D                
            a  b  c  d  e  a  b  c  d  e  a  b  c  d  e   a   b   c   d   e
2025-01-01  8  8  6  2  8  7  2  1  5  4  4  5  7  3  6   7   2   1   5   4
2025-01-02  4  3  7  6  1  3  5  8  4  6  3  9  2  0  4   3   5   8   4   6
2025-01-03  2  4  1  7  8  2  9  8  7  1  6  8  5  9  9  10  45  40  35   5
2025-01-04  9  3  0  0  2  8  8  2  9  6  5  6  6  6  3  32  32   8  36  24

本文标签: