I am playing around with an Iceberg table and inserted just 6 records, partitioned on a long-type column. Although I have only executed update commands, I can see more than 10 data files in the data folder. I tried running the rewrite_data_files procedure:

CALL catalog.system.rewrite_data_files(
    table => 'db.iceberg',
    options => map(
        'max-concurrent-file-group-rewrites', '50',
        'max-file-size-bytes', '104857600000',
        'min-file-size-bytes', '10480',
        'target-file-size-bytes', '134217728',
        'min-input-files', '2'
    )
)
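To see what is actually on disk per partition, I have been querying Iceberg's standard metadata tables (the table name is mine; `files` and `snapshots` are built-in):

```sql
-- Current data files and the partition each belongs to
SELECT file_path, partition, record_count, file_size_in_bytes
FROM catalog.db.iceberg.files;

-- Snapshot history, to confirm the rewrite actually committed a new snapshot
SELECT snapshot_id, operation, committed_at
FROM catalog.db.iceberg.snapshots;
```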

Still, the files on S3 do not change, and I can see two folders for the same partition value.

I also tried setting the table properties for merge-on-read and the target file size:

ALTER TABLE catalog.db.iceberg SET TBLPROPERTIES (
    'write.delete.mode' = 'merge-on-read',
    'delete.isolation-level' = 'snapshot',
    'write.target-file-size-bytes' = '134217728'
)
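One thing I have been wondering: since rewrite_data_files commits a new snapshot rather than deleting files in place, the old files may remain on S3 until snapshots are expired. If that is the cause, something like the following would be needed (the timestamp here is just a placeholder for a cutoff I would pick):

```sql
-- Expire old snapshots so the data files they reference
-- become unreferenced and eligible for deletion
CALL catalog.system.expire_snapshots(
    table => 'db.iceberg',
    older_than => TIMESTAMP '2024-01-01 00:00:00'
);
```

I have not confirmed yet whether this explains what I am seeing.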

The contents of the table are below; the table is partitioned on the age column:

Ideally there should be 6 folders in the data layer, matching the cardinality of the partition column. I'm not sure what's not adding up here. Can someone help, please?

Tags: amazon-s3. Source: Stack Overflow, "Rewrite_data_files procedure is not working as intended".