python - Use pct_change for DateTime (sub)Index along with group_by for multi-index data frame - Stack Overflow


Here is my sample data:

                     indicator1
company  date                  
company1 2015-01-01        97.0
         2016-01-01        55.0
         2017-01-01        47.0
         2018-01-01        68.0
         2019-01-01        65.0
company2 2015-01-01        22.0
         2016-01-01        40.0
         2017-01-01        22.0
         2018-01-01        12.0
         2019-01-01        86.0
company3 2015-01-01        47.0
         2016-01-01        28.0
         2017-01-01        91.0
         2018-01-01        63.0
         2018-05-01       123.0
         2019-01-01        57.0
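
To reproduce the frame above, something like the following should work. This is only a sketch: the values are copied from the table, while the construction (a list of tuples, then `set_index`) is an assumption rather than how the data was originally built.

```python
import pandas as pd

# (company, date, indicator1) triples copied from the table above
rows = [
    ("company1", "2015-01-01", 97.0), ("company1", "2016-01-01", 55.0),
    ("company1", "2017-01-01", 47.0), ("company1", "2018-01-01", 68.0),
    ("company1", "2019-01-01", 65.0),
    ("company2", "2015-01-01", 22.0), ("company2", "2016-01-01", 40.0),
    ("company2", "2017-01-01", 22.0), ("company2", "2018-01-01", 12.0),
    ("company2", "2019-01-01", 86.0),
    ("company3", "2015-01-01", 47.0), ("company3", "2016-01-01", 28.0),
    ("company3", "2017-01-01", 91.0), ("company3", "2018-01-01", 63.0),
    ("company3", "2018-05-01", 123.0), ("company3", "2019-01-01", 57.0),
]

df = pd.DataFrame(rows, columns=["company", "date", "indicator1"])
df["date"] = pd.to_datetime(df["date"])   # real timestamps, not strings
df = df.set_index(["company", "date"])    # two-level (company, date) index
print(df)
```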

I'm trying to calculate the 1-year percentage change this way:

df["pct_chng_3"] = df.groupby("company", group_keys=False)\
    .apply(lambda x: x['indicator1'].pct_change(periods=1, freq='Y'))

It works fine without the freq parameter (it just does pct_change line by line), but as soon as I add freq = 'Y' I get this error:

new_ax = index.shift(periods, freq)
NotImplementedError: This method is only implemented for DatetimeIndex, PeriodIndex and TimedeltaIndex; Got type MultiIndex

I presume this is caused by the fact that groupby leaves the two-level index in place, which confuses the shift method.
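
The presumption can be checked on a tiny frame. The sketch below (the frame `small` and the variable names are made up for illustration) suggests that each group still carries the MultiIndex, and that the same call works once the company level is dropped:

```python
import pandas as pd

idx = pd.MultiIndex.from_product(
    [["company1", "company2"], pd.to_datetime(["2015-01-01", "2016-01-01"])],
    names=["company", "date"],
)
small = pd.DataFrame({"indicator1": [97.0, 55.0, 22.0, 40.0]}, index=idx)

# Each group keeps the full (company, date) MultiIndex
group_index_types = {
    name: type(group.index).__name__
    for name, group in small.groupby("company")
}
print(group_index_types)

# Selecting one company with xs drops that level, leaving a DatetimeIndex
one = small.xs("company1", level="company")["indicator1"]
print(type(one.index).__name__)

# With a plain DatetimeIndex, the freq-based shift is supported
result = one.pct_change(freq=pd.DateOffset(years=1))
print(result)
```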

I can't figure out a nice workaround.

Asked Nov 21, 2024 at 11:36 by Arseni; edited Nov 21, 2024 at 11:53 by Mark Rotteveel.

1 Answer

Use DateOffset to specify the frequency. To avoid your error, first convert the first level, company, to a column with DataFrame.reset_index, then compute pct_change and recreate the MultiIndex:

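import pandas as pd

# Move the "company" level into a column so the index is a plain DatetimeIndex.
df1 = df.reset_index(level=0)

# Compute the 1-year change per company, then rebuild the (company, date)
# MultiIndex so the result can be joined back onto df.
out = (df.join(df1.groupby("company", group_keys=False, sort=False)['indicator1']
                  .pct_change(freq=pd.DateOffset(years=1))
                  .to_frame('pct_chng_3')
                  .set_index(df1['company'], append=True).swaplevel()))
print(out)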
                     indicator1  pct_chng_3
company  date                              
company1 2015-01-01        97.0         NaN
         2016-01-01        55.0   -0.432990
         2017-01-01        47.0   -0.145455
         2018-01-01        68.0    0.446809
         2019-01-01        65.0   -0.044118
company2 2015-01-01        22.0         NaN
         2016-01-01        40.0    0.818182
         2017-01-01        22.0   -0.450000
         2018-01-01        12.0   -0.454545
         2019-01-01        86.0    6.166667
company3 2015-01-01        47.0         NaN
         2016-01-01        28.0   -0.404255
         2017-01-01        91.0    2.250000
         2018-01-01        63.0   -0.307692
         2018-05-01       123.0         NaN
         2019-01-01        57.0   -0.095238
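
The NaN at 2018-05-01 in the output above follows from how freq works: each value is compared with the value dated exactly one offset earlier, and company3 has no row for 2017-05-01. A minimal sketch on company3's last four rows, with the names `by_row` and `by_year` chosen just for illustration:

```python
import pandas as pd

s = pd.Series(
    [91.0, 63.0, 123.0, 57.0],
    index=pd.to_datetime(["2017-01-01", "2018-01-01", "2018-05-01", "2019-01-01"]),
)

# Without freq: each row is compared with the previous row
by_row = s.pct_change()

# With freq: each row is compared with the row dated exactly one year earlier
by_year = s.pct_change(freq=pd.DateOffset(years=1))

print(pd.DataFrame({"by_row": by_row, "by_year": by_year}))
```

So 2019-01-01 is compared with 2018-01-01 (63.0) rather than with the preceding 2018-05-01 row (123.0).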

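Another idea, without building a MultiIndex output, is to assign a NumPy array. In my opinion this is less safe, because the values are assigned by position instead of being aligned on the index: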

# Positional assignment: relies on the result keeping df's original row order.
df['pct_chng_3'] = (df.reset_index(level=0)
                      .groupby("company", group_keys=False, sort=False)['indicator1']
                      .pct_change(freq=pd.DateOffset(years=1)).to_numpy())
print(df)
                     indicator1  pct_chng_3
company  date                              
company1 2015-01-01        97.0         NaN
         2016-01-01        55.0   -0.432990
         2017-01-01        47.0   -0.145455
         2018-01-01        68.0    0.446809
         2019-01-01        65.0   -0.044118
company2 2015-01-01        22.0         NaN
         2016-01-01        40.0    0.818182
         2017-01-01        22.0   -0.450000
         2018-01-01        12.0   -0.454545
         2019-01-01        86.0    6.166667
company3 2015-01-01        47.0         NaN
         2016-01-01        28.0   -0.404255
         2017-01-01        91.0    2.250000
         2018-01-01        63.0   -0.307692
         2018-05-01       123.0         NaN
         2019-01-01        57.0   -0.095238
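
If positional assignment feels too fragile, one possible alternative (not one of the two approaches above, just a sketch) is to drop the company level inside each group, compute the change, and put the group's original index back, so that the assignment to df aligns on the MultiIndex. The helper name `yearly_change` is hypothetical, and the sketch assumes dates are unique within each company:

```python
import numpy as np
import pandas as pd

rows = [
    ("company1", "2015-01-01", 97.0), ("company1", "2016-01-01", 55.0),
    ("company1", "2017-01-01", 47.0), ("company1", "2018-01-01", 68.0),
    ("company1", "2019-01-01", 65.0),
    ("company2", "2015-01-01", 22.0), ("company2", "2016-01-01", 40.0),
    ("company2", "2017-01-01", 22.0), ("company2", "2018-01-01", 12.0),
    ("company2", "2019-01-01", 86.0),
    ("company3", "2015-01-01", 47.0), ("company3", "2016-01-01", 28.0),
    ("company3", "2017-01-01", 91.0), ("company3", "2018-01-01", 63.0),
    ("company3", "2018-05-01", 123.0), ("company3", "2019-01-01", 57.0),
]
df = pd.DataFrame(rows, columns=["company", "date", "indicator1"])
df["date"] = pd.to_datetime(df["date"])
df = df.set_index(["company", "date"])


def yearly_change(s):
    """1-year pct_change for one company's Series (hypothetical helper)."""
    # Drop the company level so pct_change sees a plain DatetimeIndex
    out = s.droplevel("company").pct_change(freq=pd.DateOffset(years=1))
    # Restore the (company, date) index so the result aligns with df
    out.index = s.index
    return out


df["pct_chng_3"] = (
    df.groupby(level="company", group_keys=False)["indicator1"]
      .apply(yearly_change)
)
print(df)
```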
