databricks - Apply Merge from Bronze DLT Table into Silver DLT Table with some light cleaning? (Traditional Upsert style)
I am trying to create a DLT pipeline for the first time. What is the best way to satisfy the following requirements? I am fully aware that the approach I have chosen may not be optimal, and I am open to design recommendations as well. Here's what I am trying to do:
import dlt
from pyspark.sql.functions import col

@dlt.table(
    name="bronze_dlt_table",
    comment="This table reads data from a Delta location",
    table_properties={"quality": "bronze"}
)
def read_raw_bronze_dlt_table():
    # Incremental read of the upstream output; a streaming read keeps bronze
    # append-only, so the silver apply_changes step below can consume it.
    return spark.readStream.format("delta").load("Delta Table Path written from Upstream location")
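For the "light cleaning" mentioned in the title, I was planning to attach DLT expectations to the bronze output before it reaches silver, rather than hand-writing filters. A minimal sketch of what I mean (the customer_id column is hypothetical):

import dlt

@dlt.table(name="bronze_validated", table_properties={"quality": "bronze"})
@dlt.expect_or_drop("valid_key", "customer_id IS NOT NULL")
def bronze_validated():
    # Rows failing the expectation are dropped before the silver merge
    return dlt.read_stream("bronze_dlt_table")

The silver step below would then read from bronze (or from this validated table instead).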
# apply_changes cannot live inside a @dlt.table function: the target has to be
# a streaming table declared with dlt.create_streaming_table, and the cleaned
# source is exposed as a view.
@dlt.view(name="silver_source")
def build_silver_source():
    bronzeDF = dlt.read_stream("bronze_dlt_table")
    lookupDF = spark.read.format("delta").load("Read data from a delta table")
    # Perform some basic column manipulation and joins between bronzeDF & lookupDF
    silverDF = ...
    return silverDF

dlt.create_streaming_table(
    name="silver_dlt_table",
    partition_cols=["ABC"],
    table_properties={"quality": "silver"}
)

dlt.apply_changes(
    target="silver_dlt_table",
    source="silver_source",
    keys=["..."],  # primary key column(s) for the upsert; apply_changes requires this
    sequence_by=col("Newly Added Column in SilverDF based on LookupDF")
)
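Since I want a traditional upsert, my reading of the docs is that apply_changes defaults to SCD Type 1 (rows are updated in place, no history kept), and that deletes can be expressed with apply_as_deletes. A sketch of the variant I have in mind, assuming a hypothetical operation column in the source feed:

import dlt
from pyspark.sql.functions import col, expr

dlt.apply_changes(
    target="silver_dlt_table",
    source="silver_source",
    keys=["..."],  # same primary key column(s) as above
    sequence_by=col("Newly Added Column in SilverDF based on LookupDF"),
    stored_as_scd_type=1,                          # type 1 = in-place upsert (the default)
    apply_as_deletes=expr("operation = 'DELETE'")  # hypothetical operation column
)

Is this the right way to structure it, or is there a cleaner pattern for a plain bronze-to-silver merge?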