I have written a Spark job that reads from a Kafka topic, does some processing, and dumps the data to GCS in Avro format.

I am deploying this Java application on Dataproc Serverless using the TriggerOnce mode, so that each run consumes the data newly pushed to the Kafka topic and dumps it to GCS.

The strange behaviour is that the first run works absolutely fine, but when I rerun the job for the next batch it fails with the error below.

java.io.InvalidClassException: org.apache.spark.sql.avro.AvroDataToCatalyst; local class incompatible: stream classdesc serialVersionUID = -4108983435828400550, local class serialVersionUID = 3066013574753296163
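
For context on what this exception means: Java serialization raises it when an object was written by a class whose serialVersionUID differs from that of the class on the reader's classpath. The sketch below reproduces the same kind of failure with the JDK alone. `Payload` is a hypothetical stand-in for a library class such as AvroDataToCatalyst, and patching the bytes only simulates "written by another version of the class". That the rerun deserializes something written by a different spark-avro build is an assumption here, not something the error alone confirms.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InvalidClassException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.ObjectStreamClass;
import java.io.Serializable;
import java.nio.charset.StandardCharsets;

public class SerialVersionDemo {

    // Hypothetical stand-in for a serializable library class.
    static class Payload implements Serializable {
        private static final long serialVersionUID = 1L;
        String value = "record";
    }

    static byte[] serialize(Object o) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(bos)) {
            oos.writeObject(o);
        }
        return bos.toByteArray();
    }

    static Object deserialize(byte[] bytes) throws IOException, ClassNotFoundException {
        try (ObjectInputStream ois = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return ois.readObject();
        }
    }

    // Returns the index of the first occurrence of needle in haystack, or -1.
    static int indexOf(byte[] haystack, byte[] needle) {
        outer:
        for (int i = 0; i <= haystack.length - needle.length; i++) {
            for (int j = 0; j < needle.length; j++) {
                if (haystack[i + j] != needle[j]) continue outer;
            }
            return i;
        }
        return -1;
    }

    // Simulates a stream written by another version of the class: in the
    // serialization format the 8-byte big-endian serialVersionUID directly
    // follows the class name, so we overwrite those 8 bytes in a copy.
    static byte[] withPatchedUid(byte[] bytes, Class<?> cls, long newUid) {
        byte[] name = cls.getName().getBytes(StandardCharsets.UTF_8);
        int pos = indexOf(bytes, name);
        if (pos < 0) throw new IllegalArgumentException("class name not found in stream");
        byte[] copy = bytes.clone();
        int uidStart = pos + name.length;
        for (int i = 0; i < 8; i++) {
            copy[uidStart + i] = (byte) (newUid >>> (56 - 8 * i));
        }
        return copy;
    }

    // True if reading the bytes fails with InvalidClassException.
    static boolean failsWithInvalidClass(byte[] bytes) {
        try {
            deserialize(bytes);
            return false;
        } catch (InvalidClassException e) {
            System.out.println(e.getMessage());
            return true;
        } catch (IOException | ClassNotFoundException e) {
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        byte[] original = serialize(new Payload());
        long localUid = ObjectStreamClass.lookup(Payload.class).getSerialVersionUID();
        System.out.println("local serialVersionUID: " + localUid);
        System.out.println("same version rejected: " + failsWithInvalidClass(original));
        byte[] foreign = withPatchedUid(original, Payload.class, 2L);
        System.out.println("other version rejected: " + failsWithInvalidClass(foreign));
    }
}
```

Read against the error above, the two UIDs in the message would correspond to the class version that wrote the serialized data and the class version loaded at read time, so comparing the spark-avro version packaged with the application to the one provided by the runtime may be a reasonable first check.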

Tags: apache spark | Dataproc Serverless Failure on Rerun | Stack Overflow