
How do I set up dynamic allocation for a Spark job that has a data rate of about 450k?

I tried the configurations below, but the executor pods always run at the maximum number of executors and never scale down, even when the data rate is only 20k-30k.

  • --conf spark.dynamicAllocation.enabled=true
  • --conf spark.dynamicAllocation.shuffleTracking.enabled=true
  • --conf spark.dynamicAllocation.shuffleTracking.timeout=30s
  • --conf spark.dynamicAllocation.minExecutors=10
  • --conf spark.dynamicAllocation.initialExecutors=2
  • --conf spark.dynamicAllocation.maxExecutors=85
  • --conf spark.dynamicAllocation.executorIdleTimeout=300s
  • --conf spark.dynamicAllocation.schedulerBacklogTimeout=210s
  • --conf spark.dynamicAllocation.sustainedSchedulerBacklogTimeout=30s
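For context, these flags would all be passed on a single spark-submit invocation. A minimal sketch of what that might look like on Kubernetes (the master URL, image name, and main class are placeholders, not taken from the original question):

```shell
# Hypothetical spark-submit invocation combining the flags above.
# Cluster endpoint, image, and application jar/class are illustrative only.
spark-submit \
  --master k8s://https://<k8s-apiserver>:6443 \
  --deploy-mode cluster \
  --conf spark.kubernetes.container.image=<your-spark-image> \
  --conf spark.dynamicAllocation.enabled=true \
  --conf spark.dynamicAllocation.shuffleTracking.enabled=true \
  --conf spark.dynamicAllocation.shuffleTracking.timeout=30s \
  --conf spark.dynamicAllocation.minExecutors=10 \
  --conf spark.dynamicAllocation.initialExecutors=2 \
  --conf spark.dynamicAllocation.maxExecutors=85 \
  --conf spark.dynamicAllocation.executorIdleTimeout=300s \
  --conf spark.dynamicAllocation.schedulerBacklogTimeout=210s \
  --conf spark.dynamicAllocation.sustainedSchedulerBacklogTimeout=30s \
  --class <your.Main> \
  local:///opt/spark/app/<your-app>.jar
```

Note that with `shuffleTracking.enabled=true`, executors holding shuffle data are kept past `executorIdleTimeout`; the `shuffleTracking.timeout` setting governs how long they are retained, which is relevant to why scale-down may lag behind the drop in data rate.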

How can I fix this, and why does the job always end up running with the maximum number of executors? I'm trying to optimize the job and expect it to run with the minimum number of executors when the data volume is small.
