I have a Hudi table written by Spark, with a schema like this:
id: int64
content: string
create_date: timestamp[ns]
The table is very large, and most of the queries we run against it are range queries on create_date:
select xx from table where xxx and xxx and create_date>='2024-01-01 00:00:00' and create_date<='2024-01-02 00:00:00'
Every such query takes a long time because it scans all the data in the table, even when I only want to filter or aggregate the data for a single date. How should I build indexes on this Hudi table to speed up these queries?
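To make the question concrete, here is a minimal sketch of the kind of setup I have in mind, assuming Hudi's metadata table with a column stats index on create_date plus data skipping on the read side. The helper functions, the table name "my_table", and base_path are hypothetical. The option keys are Hudi configs as I understand them and should be checked against the Hudi version in use. I have not confirmed that this is the right approach for this table.

```python
# Sketch only: builds the option dictionaries that would be passed to Spark.
# Helper names and "my_table" are hypothetical, not from the original setup.

def hudi_write_options(table_name, range_column):
    """Writer options that ask Hudi to maintain column statistics
    (min/max per file) for the column used in range filters."""
    return {
        "hoodie.table.name": table_name,
        "hoodie.metadata.enable": "true",
        "hoodie.metadata.index.column.stats.enable": "true",
        "hoodie.metadata.index.column.stats.column.list": range_column,
    }


def hudi_read_options():
    """Reader options that let the query prune files using those statistics."""
    return {
        "hoodie.metadata.enable": "true",
        "hoodie.enable.data.skipping": "true",
    }


write_opts = hudi_write_options("my_table", "create_date")
read_opts = hudi_read_options()

# Intended use with PySpark (not executed here; needs Spark with the Hudi bundle):
# df.write.format("hudi").options(**write_opts).mode("append").save(base_path)
# spark.read.format("hudi").options(**read_opts).load(base_path) \
#     .where("create_date >= '2024-01-01 00:00:00' AND create_date <= '2024-01-02 00:00:00'")
```

Min/max statistics only help pruning when files cover narrow create_date ranges, so this probably also needs the data to be clustered or sorted on create_date, or the table to be partitioned by a date derived from it. That part is an assumption on my side as well.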