I have a use case where I need to regularly copy data from one table to another within the same Azure Data Explorer (ADX) database. Here's the specific context:

  - Number of tables: around 50 tables in the database.
  - Data volume: each table can hold approximately 200 GB of data.
  - Frequency: data copying should occur every 30 minutes.

I have compared several options, including ADF, Azure Functions, AzCopy, Logic Apps, and Power Automate.

ADF seemed to be the best fit.

Cluster details:

  - SKU: Standard_L16as_v3 (Large, 16 vCPUs, 128 GB memory, 3500 GB cache)
  - Instances: autoscaling enabled, up to 20

I am using 30 tables for testing.

The pipeline setup looks like this:

  1. A master pipeline looks up the tables in the database and triggers a child pipeline for each table.
  2. The child pipeline gets the extents for the table and runs a Copy activity for each extent (to support better concurrency).
  3. In the ForEach activity, the batch count is set to 50 (it had to be lowered because of throttling in ADF).
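To make the structure concrete, here is a minimal Python sketch of the same fan-out. It is not the ADF pipeline itself: `FAKE_DATABASE`, `list_tables`, `list_extents`, and `copy_extent` are hypothetical stand-ins for the lookup and Copy activities, and running the child pipelines one after another is an assumption, since the setup above does not say whether they run in parallel.

```python
from concurrent.futures import ThreadPoolExecutor

BATCH_COUNT = 50  # mirrors the ForEach batch count used in the pipeline

# Hypothetical in-memory stand-in for the database: table name -> extent IDs.
FAKE_DATABASE = {
    "TableA": ["a-1", "a-2", "a-3"],
    "TableB": ["b-1", "b-2"],
}


def list_tables():
    """Stand-in for the master pipeline's table lookup."""
    return sorted(FAKE_DATABASE)


def list_extents(table):
    """Stand-in for the child pipeline's extent lookup."""
    return list(FAKE_DATABASE[table])


def copy_extent(table, extent_id):
    """Stand-in for one Copy activity; here it only reports what it would copy."""
    return (table, extent_id)


def run_child_pipeline(table):
    """Copy every extent of one table with at most BATCH_COUNT copies in flight."""
    extents = list_extents(table)
    with ThreadPoolExecutor(max_workers=BATCH_COUNT) as pool:
        return list(pool.map(lambda extent: copy_extent(table, extent), extents))


def run_master_pipeline():
    """Trigger the child pipeline for each table (sequentially, by assumption)."""
    copied = []
    for table in list_tables():
        copied.extend(run_child_pipeline(table))
    return copied


if __name__ == "__main__":
    for table, extent_id in run_master_pipeline():
        print(f"copied extent {extent_id} of {table}")
```

In this shape, the total run time is roughly the sum of the per-table times, and each per-table time is bounded by how many extent copies can actually run at once.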

The pipeline run currently takes approximately 1 hour.

What can I do to improve performance, and is there a better solution?

The target is for the pipeline run to complete within 30 minutes. Is that even possible?
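As a rough sanity check on that target, the sketch below estimates the sustained throughput a run would need. It assumes the worst case, in which every run copies the full 200 GB of each of the 30 test tables; if only new data is copied each time, the real numbers are lower. The function name is purely illustrative.

```python
def required_throughput_gb_per_s(tables, gb_per_table, window_minutes):
    """Sustained throughput needed to copy all tables within the time window."""
    total_gb = tables * gb_per_table
    return total_gb / (window_minutes * 60)


# Assumption: every run copies the full 200 GB of each of the 30 test tables.
observed = required_throughput_gb_per_s(30, 200, 60)  # current run takes ~1 hour
target = required_throughput_gb_per_s(30, 200, 30)    # desired 30-minute window

print(f"observed: {observed:.2f} GB/s")
print(f"target:   {target:.2f} GB/s")
print(f"speed-up needed: {target / observed:.1f}x")
```

Under that assumption, the 30-minute target needs about 3.33 GB/s sustained, twice the roughly 1.67 GB/s implied by the current 1-hour run.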

Tags: azure. Original question: "Seeking performance improvement while using ADF with ADX" (Stack Overflow)