Avoid duplicate messages for topics across different clusters

We have a Spring Boot application with multiple pods deployed on Kubernetes. It consumes events from a topic in one cluster, C1, transforms the messages, and pushes the transformed data to a topic in a different cluster using the KafkaTemplate class.

Is it possible to deduplicate events using the exactly-once semantics configuration, given that two clusters are involved?

If not, what options are available so that no events get duplicated even if a pod restarts or pods are added/removed dynamically?
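
For concreteness, here is a minimal sketch of the setup (the broker addresses, topic names, group id, and the transform are placeholders, not our real values): the listener is bound to cluster C1 and the KafkaTemplate to the second cluster.

    import java.util.HashMap;
    import java.util.Map;

    import org.apache.kafka.clients.consumer.ConsumerConfig;
    import org.apache.kafka.clients.consumer.ConsumerRecord;
    import org.apache.kafka.clients.producer.ProducerConfig;
    import org.apache.kafka.common.serialization.StringDeserializer;
    import org.apache.kafka.common.serialization.StringSerializer;
    import org.springframework.context.annotation.Bean;
    import org.springframework.context.annotation.Configuration;
    import org.springframework.kafka.annotation.KafkaListener;
    import org.springframework.kafka.config.ConcurrentKafkaListenerContainerFactory;
    import org.springframework.kafka.core.ConsumerFactory;
    import org.springframework.kafka.core.DefaultKafkaConsumerFactory;
    import org.springframework.kafka.core.DefaultKafkaProducerFactory;
    import org.springframework.kafka.core.KafkaTemplate;
    import org.springframework.stereotype.Component;

    @Configuration
    class TwoClusterConfig {

        // Consumer side: bound to cluster C1.
        @Bean
        ConsumerFactory<String, String> c1ConsumerFactory() {
            Map<String, Object> props = new HashMap<>();
            props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "c1-broker:9092");
            props.put(ConsumerConfig.GROUP_ID_CONFIG, "transformer-group");
            props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);
            props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);
            return new DefaultKafkaConsumerFactory<>(props);
        }

        @Bean
        ConcurrentKafkaListenerContainerFactory<String, String> c1ListenerFactory() {
            ConcurrentKafkaListenerContainerFactory<String, String> factory =
                    new ConcurrentKafkaListenerContainerFactory<>();
            factory.setConsumerFactory(c1ConsumerFactory());
            return factory;
        }

        // Producer side: bound to the second cluster.
        @Bean
        KafkaTemplate<String, String> c2Template() {
            Map<String, Object> props = new HashMap<>();
            props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "c2-broker:9092");
            props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class);
            props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class);
            return new KafkaTemplate<>(new DefaultKafkaProducerFactory<>(props));
        }
    }

    @Component
    class Transformer {

        private final KafkaTemplate<String, String> c2Template;

        Transformer(KafkaTemplate<String, String> c2Template) {
            this.c2Template = c2Template;
        }

        // Consume from C1, transform, produce to the other cluster.
        @KafkaListener(topics = "source-topic", containerFactory = "c1ListenerFactory")
        void onMessage(ConsumerRecord<String, String> record) {
            String transformed = record.value().toUpperCase(); // stand-in transform
            c2Template.send("target-topic", record.key(), transformed);
        }
    }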



1 Answer


You will need to use an intermediate KV store, such as Redis, Mongo, Postgres, Elasticsearch, etc. Kafka itself will never know there are duplicates (yes, even compacted topics can contain duplicate keys), even within one cluster.

You'd insert/query every event against your database to know whether it's been seen before, as in the sketch below.
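
Here's a hedged sketch of that insert/query step, using Redis SETNX semantics via Spring Data Redis. The key prefix, the 7-day TTL, and the assumption that the record key uniquely identifies an event are all illustrative choices, not requirements.

    import java.time.Duration;

    import org.apache.kafka.clients.consumer.ConsumerRecord;
    import org.springframework.data.redis.core.StringRedisTemplate;
    import org.springframework.kafka.annotation.KafkaListener;
    import org.springframework.kafka.core.KafkaTemplate;
    import org.springframework.stereotype.Component;

    @Component
    class DedupingTransformer {

        private final StringRedisTemplate redis;
        private final KafkaTemplate<String, String> c2Template;

        DedupingTransformer(StringRedisTemplate redis, KafkaTemplate<String, String> c2Template) {
            this.redis = redis;
            this.c2Template = c2Template;
        }

        @KafkaListener(topics = "source-topic", containerFactory = "c1ListenerFactory")
        void onMessage(ConsumerRecord<String, String> record) {
            String eventId = record.key(); // assumes the key uniquely identifies the event
            // SET ... NX: true only for the first pod/attempt that claims this id,
            // so redeliveries after a rebalance or pod restart are dropped here.
            Boolean firstSeen = redis.opsForValue()
                    .setIfAbsent("dedup:" + eventId, "1", Duration.ofDays(7));
            if (Boolean.TRUE.equals(firstSeen)) {
                c2Template.send("target-topic", eventId, transform(record.value()));
            }
        }

        private String transform(String value) {
            return value.toUpperCase(); // stand-in for the real transformation
        }
    }

Note the trade-off in ordering: if a pod crashes between the Redis write and the send, that event is dropped rather than duplicated; marking the id only after a successful send flips the failure mode back to at-least-once.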

Look up two-phase commit (2PC) patterns for more ideas around this concept.

Depending on your use case, you could also use a workflow framework like Temporal to handle such distributed transactions.
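
A rough sketch of the Temporal idea with its Java SDK (the workflow and activity names here are invented for illustration): Temporal refuses to start a second workflow with the same workflow id, so using the event id as the workflow id pushes the dedup into the framework.

    import java.time.Duration;

    import io.temporal.activity.ActivityInterface;
    import io.temporal.activity.ActivityOptions;
    import io.temporal.workflow.Workflow;
    import io.temporal.workflow.WorkflowInterface;
    import io.temporal.workflow.WorkflowMethod;

    @WorkflowInterface
    interface PublishWorkflow {
        @WorkflowMethod
        void publish(String payload);
    }

    @ActivityInterface
    interface PublishActivities {
        // Would wrap the KafkaTemplate.send(...) to the second cluster.
        void sendToSecondCluster(String payload);
    }

    class PublishWorkflowImpl implements PublishWorkflow {

        private final PublishActivities activities = Workflow.newActivityStub(
                PublishActivities.class,
                ActivityOptions.newBuilder()
                        .setStartToCloseTimeout(Duration.ofSeconds(30))
                        .build());

        @Override
        public void publish(String payload) {
            // Temporal retries the activity on failure without re-running completed steps.
            activities.sendToSecondCluster(payload);
        }
    }

The consumer would then start one workflow per event via WorkflowOptions.newBuilder().setWorkflowId(eventId), so a duplicate delivery fails to start a second workflow instead of producing twice.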
