
I have data from a PLC coming in ONLY ON CHANGE into a table with the format (TIMESTAMP, TAG, VALUE). A visualisation tool (Seeq) queries this base table in Snowflake and shows the data on a time-series chart. If a user selects a long time range, the data needs to be aggregated (max 2000 points per time-series plot). I want this aggregation (an average) to be weighted by how long a tag held each value before it changed. For example, if tag = 'cheese' has a value of 100 from t=0 -> t=5 and a value of 500 from t=6 -> t=100, and the user in Seeq selects this tag over a long window (e.g. spanning t=0 to t=100000), then the data registered for this tag should be aggregated to (5*100 + 95*500)/100, plotted at t=50 (the mid point). How can I write a query for this in Snowflake against this base table of (TIMESTAMP, TAG, VALUE)?

I tried cross joining the raw data to a time dimension table (created from a tag dimension table), then using a LEAD function to spread the raw data out to one row per second so it could be weighted accordingly. It was not very performant in terms of speed.
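
For reference, a minimal sketch of the duration logic I was aiming for, using LEAD directly on the raw table (the table name plc_events and its columns are placeholders; the real attempt also cross joined to a per-second time dimension, which is what made it slow):

-- Hypothetical raw table: plc_events(ts NUMBER, tag VARCHAR, value FLOAT).
-- Each value is assumed to hold until the next change recorded for that tag.
select
    tag,
    ts,
    value,
    lead(ts) over (partition by tag order by ts) - ts as duration_secs
from plc_events;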

  • Something like NTILE (docs.snowflake.com/en/sql-reference/functions/ntile) might be a way to chunk the data, and then do some average/weighted operation. – Simeon Pilgrim

1 Answer

I am not really sure what you are trying to do, but an "explosive" (row-expanding) way of doing something like what you describe is:

with d0 as (
    -- sample data: one row per constant-value interval (tag, start, end, value)
    select * from values
        ('cheese', 0, 5, 100),
        ('cheese', 6, 100, 500)
        t(tag, _s, _e, val)
), d1 as (
    -- explode each interval into one row per second, then split each tag's
    -- timeline into 10 equal-count chunks with NTILE
    select
        tag,
        value::number as rn,    -- the generated second from the flattened range
        val,
        ntile(10) over (partition by tag order by rn) as tile
    from d0,
        table(flatten(array_generate_range(_s, _e + 1)))
)
select
    tag,
    tile,
    avg(rn)  as mid,    -- representative timestamp for the chunk
    avg(val) as val     -- plain average over per-second rows = time-weighted average
from d1
group by 1, 2
order by 1, 2;

which gives:

TAG TILE MID VAL
cheese 1 5.000000 281.818182
cheese 2 15.500000 500.000000
cheese 3 25.500000 500.000000
cheese 4 35.500000 500.000000
cheese 5 45.500000 500.000000
cheese 6 55.500000 500.000000
cheese 7 65.500000 500.000000
cheese 8 75.500000 500.000000
cheese 9 85.500000 500.000000
cheese 10 95.500000 500.000000

Those rows do not really need expanding, though; the interpolation can be driven directly against a d0-like table, if that is how your data is sourced.
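
For example, here is a minimal non-exploding sketch of that idea against the same d0-style intervals (tag, _s, _e, val). It assumes the plot window is t = 0..100, uses WIDTH_BUCKET to form 10 buckets, and, for simplicity, assigns each interval wholly to the bucket of its start time (intervals straddling a bucket boundary would need to be split for exact results):

with d0 as (
    select * from values
        ('cheese', 0, 5, 100),
        ('cheese', 6, 100, 500)
        t(tag, _s, _e, val)
)
select
    tag,
    width_bucket(_s, 0, 101, 10)                as bucket,             -- 10 buckets over t = 0..100
    avg((_s + _e) / 2)                          as mid,                -- representative x value for the plot
    sum(val * (_e - _s + 1)) / sum(_e - _s + 1) as time_weighted_avg   -- duration-weighted average
from d0
group by 1, 2
order by 1, 2;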
