Here is my sample data:

                     indicator1
company  date                  
company1 2015-01-01        97.0
         2016-01-01        55.0
         2017-01-01        47.0
         2018-01-01        68.0
         2019-01-01        65.0
company2 2015-01-01        22.0
         2016-01-01        40.0
         2017-01-01        22.0
         2018-01-01        12.0
         2019-01-01        86.0
company3 2015-01-01        47.0
         2016-01-01        28.0
         2017-01-01        91.0
         2018-01-01        63.0
         2018-05-01       123.0
         2019-01-01        57.0

I'm trying to calculate the 1-year percentage change this way:

df["pct_chng_3"] = df.groupby("company", group_keys=False)\
    .apply(lambda x: x['indicator1'].pct_change(periods = period, freq = 'Y'))

It works fine without the freq parameter (it just computes pct_change row by row), but as soon as I add freq = 'Y' I get this error:

new_ax = index.shift(periods, freq)
NotImplementedError: This method is only implemented for DatetimeIndex, PeriodIndex and TimedeltaIndex; Got type MultiIndex

I presume this happens because groupby leaves the two-level MultiIndex in place, which confuses the "shift" method.
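
A minimal sketch to check that presumption. The construction code is an assumption of mine (only the values are copied from the printout above); it pulls out one group and tries the same index.shift call that appears in the traceback:

```python
import pandas as pd

# Rebuild the sample data; the values are copied from the printout,
# the construction itself is only an illustration.
annual = ["2015-01-01", "2016-01-01", "2017-01-01", "2018-01-01", "2019-01-01"]
sample = {
    "company1": dict(zip(annual, [97.0, 55.0, 47.0, 68.0, 65.0])),
    "company2": dict(zip(annual, [22.0, 40.0, 22.0, 12.0, 86.0])),
    "company3": dict(zip(annual[:4] + ["2018-05-01", "2019-01-01"],
                         [47.0, 28.0, 91.0, 63.0, 123.0, 57.0])),
}
rows = [(company, pd.Timestamp(date), value)
        for company, series in sample.items()
        for date, value in series.items()]
df = (pd.DataFrame(rows, columns=["company", "date", "indicator1"])
        .set_index(["company", "date"]))

# A single group still carries the full (company, date) MultiIndex.
group = df.groupby("company").get_group("company1")["indicator1"]
index_type = type(group.index).__name__

# Shifting by a frequency is only implemented for datetime-like indexes,
# so the same call on a MultiIndex raises NotImplementedError.
try:
    group.index.shift(1, freq=pd.DateOffset(years=1))
    shift_failed = False
except NotImplementedError:
    shift_failed = True

print(index_type, shift_failed)
```

So the frequency-based shift needs to see the date level on its own, as a plain DatetimeIndex.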

I can't figure out a nice workaround.

Asked Nov 21, 2024 at 11:36 by Arseni; edited Nov 21, 2024 at 11:53 by Mark Rotteveel

1 Answer

Use pd.DateOffset to specify the frequency. To avoid the error, first convert the first index level, company, to a column with DataFrame.reset_index, then compute pct_change and rebuild the MultiIndex:

df1 = df.reset_index(level=0)

out = (df.join(df1.groupby("company", group_keys=False, sort=False)['indicator1']
                  .pct_change(freq=pd.DateOffset(years=1))
                  .to_frame('pct_chng_3')
                  .set_index(df1['company'], append=True).swaplevel()))
print(out)
                     indicator1  pct_chng_3
company  date                              
company1 2015-01-01        97.0         NaN
         2016-01-01        55.0   -0.432990
         2017-01-01        47.0   -0.145455
         2018-01-01        68.0    0.446809
         2019-01-01        65.0   -0.044118
company2 2015-01-01        22.0         NaN
         2016-01-01        40.0    0.818182
         2017-01-01        22.0   -0.450000
         2018-01-01        12.0   -0.454545
         2019-01-01        86.0    6.166667
company3 2015-01-01        47.0         NaN
         2016-01-01        28.0   -0.404255
         2017-01-01        91.0    2.250000
         2018-01-01        63.0   -0.307692
         2018-05-01       123.0         NaN
         2019-01-01        57.0   -0.095238
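
A variant of the same idea, sketched under my own assumptions rather than taken from the code above: drop the company level inside each group so that pct_change sees a plain DatetimeIndex, then put the original index back. The helper name yearly_pct_change and the data-construction code are hypothetical. The first few lines also illustrate why pd.DateOffset(years=1) fits here: a year-end offset (which, as far as I know, is what the 'Y' alias has stood for in pandas) rolls a date forward to 31 December instead of adding a calendar year, so the shifted dates would not line up with the 1 January rows.

```python
import pandas as pd

# Year-end offset vs. calendar-year offset on one of the sample dates.
start = pd.Timestamp("2015-01-01")
year_end = start + pd.offsets.YearEnd()      # rolls forward to the year end
next_year = start + pd.DateOffset(years=1)   # same day, one year later

# Rebuild the sample data (values from the question, construction is illustrative).
annual = ["2015-01-01", "2016-01-01", "2017-01-01", "2018-01-01", "2019-01-01"]
sample = {
    "company1": dict(zip(annual, [97.0, 55.0, 47.0, 68.0, 65.0])),
    "company2": dict(zip(annual, [22.0, 40.0, 22.0, 12.0, 86.0])),
    "company3": dict(zip(annual[:4] + ["2018-05-01", "2019-01-01"],
                         [47.0, 28.0, 91.0, 63.0, 123.0, 57.0])),
}
rows = [(company, pd.Timestamp(date), value)
        for company, series in sample.items()
        for date, value in series.items()]
df = (pd.DataFrame(rows, columns=["company", "date", "indicator1"])
        .set_index(["company", "date"]))


def yearly_pct_change(s):
    """Hypothetical helper: 1-year pct_change for one company's Series."""
    # Without the company level, the index is a DatetimeIndex that can be shifted.
    out = s.droplevel("company").pct_change(freq=pd.DateOffset(years=1))
    # pct_change returns the rows in the original order, so the original
    # (company, date) index can be restored as-is.
    out.index = s.index
    return out


df["pct_chng_3"] = (df.groupby("company", group_keys=False)["indicator1"]
                      .apply(yearly_pct_change))
print(df)
```

Because the result keeps the original MultiIndex, the assignment to df["pct_chng_3"] is aligned on the index labels rather than on row position.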

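Another idea, which avoids rebuilding the MultiIndex, is to assign the result as a NumPy array. In my opinion this is less safe, because the values are written by position instead of being aligned on the index: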

df['pct_chng_3'] = (df.reset_index(level=0)
                      .groupby("company", group_keys=False, sort=False)['indicator1']
                      .pct_change(freq=pd.DateOffset(years=1)).to_numpy())
print(df)
                     indicator1  pct_chng_3
company  date                              
company1 2015-01-01        97.0         NaN
         2016-01-01        55.0   -0.432990
         2017-01-01        47.0   -0.145455
         2018-01-01        68.0    0.446809
         2019-01-01        65.0   -0.044118
company2 2015-01-01        22.0         NaN
         2016-01-01        40.0    0.818182
         2017-01-01        22.0   -0.450000
         2018-01-01        12.0   -0.454545
         2019-01-01        86.0    6.166667
company3 2015-01-01        47.0         NaN
         2016-01-01        28.0   -0.404255
         2017-01-01        91.0    2.250000
         2018-01-01        63.0   -0.307692
         2018-05-01       123.0         NaN
         2019-01-01        57.0   -0.095238 
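
To illustrate why the positional assignment is the less safe option, here is a toy sketch with made-up values (not the question's data): when the result happens to be in a different row order, assigning a Series still matches on labels, while assigning its NumPy array does not.

```python
import pandas as pd

# Toy frame and a result Series with the same labels in a different order.
df = pd.DataFrame({"x": [1.0, 2.0, 3.0]}, index=["a", "b", "c"])
result = pd.Series([30.0, 10.0, 20.0], index=["c", "a", "b"])

# Assigning a Series aligns on the index labels.
df["aligned"] = result

# Assigning a NumPy array ignores labels and writes values in array order.
df["positional"] = result.to_numpy()

print(df)
```

The to_numpy approach above is therefore only correct as long as the computed values come back in exactly the same row order as df.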

Tags: python. Title: Use pct_change for DateTime (sub)Index along with groupby for multiindex data frame (Stack Overflow)