python - Fill gaps in time series data in a Polars Lazy-Dataframe - Stack Overflow

I am in a situation where I have some time series data, potentially looking like this:

{
    "t": [1, 2, 5, 6, 7],
    "y": [1, 1, 1, 1, 1],
}

As you can see, the timestamp jumps from 2 to 5. For my analysis, I would like to insert rows for the missing timestamps 3 and 4, with y set to zero.

In reality, there may be multiple gaps of varying lengths, and the zeros should be filled in for every column other than t.

I'd also really like to keep my data in a LazyFrame, since this is only one step in my pipeline. I don't think .interpolate addresses my issue, and fill_null alone does not help either, because the missing rows do not exist yet.

I managed to achieve what I want, but it looks too complex:


import polars as pl

# Dummy, lazy data.
lf = pl.LazyFrame(
    {
        "t": [1, 2, 5, 6, 7],
        "y": [1, 1, 1, 1, 1],
    }
)

# Right-join onto the full range of t (only the two bounds are collected),
# then replace the nulls in the newly created rows with zeros.
lf_filled = lf.join(
    pl.Series(
        name="t",
        values=pl.int_range(
            start=lf.select("t").first().collect().item(0, 0),
            end=lf.select("t").last().collect().item(0, 0) + 1,
            eager=True,
        ),
    )
    .to_frame()
    .lazy(),
    on="t",
    how="right",
).fill_null(0)

The output is correct and I am never collecting any more data than the two values needed for start and end.

It feels like there should be a better way to do this. Happy to hear other suggestions :)

Asked Mar 7 at 12:25 by Thomas

1 Answer

I think your approach is sensible; there's just no need for an intermediate collect:

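# Wrapped in parentheses so the chained call is valid Python;
# the alias makes the name of the join key explicit.
(
    lf.join(
        lf.select(pl.int_range(pl.col.t.first(), pl.col.t.last() + 1).alias("t")),
        on="t",
        how="right",
    )
    .fill_null(0)
)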

An alternative approach that might be a bit more efficient is an asof join with zero tolerance, so that only exact matches on t are joined:

# Wrapped in parentheses so the chained calls are valid Python.
(
    lf.select(pl.int_range(pl.col.t.first(), pl.col.t.last() + 1).alias("t"))
    .join_asof(lf, on="t", tolerance=0)
    .fill_null(0)
)
