I have two unbounded sources (pubsub):
- main source: emits values frequently
- secondary source: emits an event whenever a BigQuery table has changed, which tells us to re-read that table.
I want to enrich (left join) the main source with the contents of the table, re-reading the table each time the secondary source fires.
I already have a solution in which the BigQuery tables are read once at the beginning, so they are bounded. For the join I used Beam SQL, because the join is quite complex, and I want to keep it. For that reason I don't think a side input is feasible: as far as I know, I cannot join a PCollection with a PCollectionView using Beam SQL.
I tried applying a fixed window of 5 seconds to each source, but for the secondary source the last state is not propagated to windows in which nothing has changed. As a result, after joining the sources I get the right results only in the windows where the BigQuery table was updated. When nothing has changed (most of the time), I get null values on the right side.
How can I upsample the secondary source so that the join produces the right results in every window?
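To make the symptom and the desired behaviour concrete, here is a minimal plain-Java sketch. It does not use Beam at all; it only simulates the 5-second fixed windows with ordinary maps. The class name `WindowJoinSketch`, the method names, and the sample timestamps and values (`v1`, `v2`) are all made up for illustration, and the forward fill shown is just my understanding of what "upsampling" should achieve, not a known Beam API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical simulation of the windowed left join; no Beam classes involved.
class WindowJoinSketch {
    static final long WINDOW_MILLIS = 5_000L;

    // Start of the 5-second fixed window that contains the timestamp.
    static long windowStart(long timestampMillis) {
        return timestampMillis - (timestampMillis % WINDOW_MILLIS);
    }

    // Assign each table snapshot to its window; a later snapshot in the same window wins.
    static TreeMap<Long, String> snapshotsByWindow(Map<Long, String> snapshotsByEventTime) {
        TreeMap<Long, String> byWindow = new TreeMap<>();
        for (Map.Entry<Long, String> e : new TreeMap<>(snapshotsByEventTime).entrySet()) {
            byWindow.put(windowStart(e.getKey()), e.getValue());
        }
        return byWindow;
    }

    // What I see today: the right side exists only in windows where the table changed.
    static List<String> joinPerWindow(long[] mainTimestamps, Map<Long, String> snapshots) {
        TreeMap<Long, String> byWindow = snapshotsByWindow(snapshots);
        List<String> out = new ArrayList<>();
        for (long ts : mainTimestamps) {
            out.add(ts + "->" + byWindow.get(windowStart(ts)));
        }
        return out;
    }

    // What I want: the latest snapshot carried forward into every later window.
    static List<String> joinForwardFilled(long[] mainTimestamps, Map<Long, String> snapshots) {
        TreeMap<Long, String> byWindow = snapshotsByWindow(snapshots);
        List<String> out = new ArrayList<>();
        for (long ts : mainTimestamps) {
            Map.Entry<Long, String> latest = byWindow.floorEntry(windowStart(ts));
            out.add(ts + "->" + (latest == null ? null : latest.getValue()));
        }
        return out;
    }

    public static void main(String[] args) {
        long[] mainEvents = {1_000L, 6_000L, 12_000L};
        Map<Long, String> tableSnapshots = Map.of(0L, "v1", 11_000L, "v2");
        System.out.println(joinPerWindow(mainEvents, tableSnapshots));     // [1000->v1, 6000->null, 12000->v2]
        System.out.println(joinForwardFilled(mainEvents, tableSnapshots)); // [1000->v1, 6000->v1, 12000->v2]
    }
}
```

The main event at 6000 ms falls into the window starting at 5000 ms, where the table did not change, so the per-window join yields null there. With the forward fill, the same event picks up `v1` from the earlier window, which is the result I am after.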
Tags: java. Title: Join a rapidly and slowly changing unbounded sources in Apache Beam (Stack Overflow)