I am using the Debezium MySQL Source Connector (v2.2.1.Final) in a multi-tenant application. As the number of tenant databases has increased, the schema history topic has grown significantly. Whenever I restart the connector (with snapshot.mode=schema_only), it takes approximately 1 hour and 30 minutes to resume streaming new records.
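For context, a setup like the one described would look roughly like the sketch below. This is not my actual configuration: the connector name is taken from the log context, while the host, topic names, server id, and bootstrap servers are placeholder assumptions. The property names are the Debezium 2.x ones as I understand them.

```python
import json

# Illustrative connector configuration. All values except
# connector.class and snapshot.mode are hypothetical placeholders.
connector_config = {
    "name": "mysql_source_connector",
    "config": {
        "connector.class": "io.debezium.connector.mysql.MySqlConnector",
        "snapshot.mode": "schema_only",
        "topic.prefix": "tenant-cdc",                    # placeholder
        "database.hostname": "mysql.example.internal",   # placeholder
        "database.server.id": "184054",                  # placeholder
        # The schema history topic that has grown large.
        "schema.history.internal.kafka.topic": "schema-history.tenant-cdc",   # placeholder
        "schema.history.internal.kafka.bootstrap.servers": "kafka:9092",      # placeholder
    },
}

# JSON body in the shape the Kafka Connect REST API accepts when creating a connector.
payload = json.dumps(connector_config, indent=2)
print(payload)
```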
I have observed that:
- The schema history topic already contains all the persisted schema information.
- The __consumer_offsets topic has the correct offsets.
- The connect.offsets file contains the correct MySQL binlog position.
- During the schema history recovery process, I don't see any activity on the database server.
Log:
[2025-02-20 04:34:34,669] INFO [mysql_source_connector|task-0] Closing connection before starting schema recovery (io.debezium.connector.mysql.MySqlConnectorTask:94)
[2025-02-20 04:34:34,743] INFO [mysql_source_connector|task-0] Started database schema history recovery (io.debezium.relational.history.SchemaHistoryMetrics:115)
[2025-02-20 04:34:37,825] INFO [mysql_source_connector|task-0] Database schema history recovery in progress, recovered ... records (io.debezium.relational.history.SchemaHistoryMetrics:130)
[2025-02-20 04:34:38,235] INFO [mysql_source_connector|task-0] Already applied .... database changes (io.debezium.relational.history.SchemaHistoryMetrics:140)
[2025-02-20 04:34:38,629] INFO [mysql_source_connector|task-0] Database schema history recovery in progress, recovered .... records (io.debezium.relational.history.SchemaHistoryMetrics:130)
[2025-02-20 04:34:38,630] INFO [mysql_source_connector|task-0] Already applied .... database changes (io.debezium.relational.history.SchemaHistoryMetrics:140)
[2025-02-20 04:34:40,629] INFO [mysql_source_connector|task-0] Already applied .... database changes (io.debezium.relational.history.SchemaHistoryMetrics:140)
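To put a number on how long recovery takes, the timestamps in log lines like these can be compared. The following is a minimal sketch that assumes the log format shown above; the helper names `parse_ts` and `recovery_window` are my own, and the sample is limited to a few of the lines quoted here.

```python
import re
from datetime import datetime

# Matches lines such as:
# [2025-02-20 04:34:34,743] INFO [mysql_source_connector|task-0] message (logger:line)
LOG_LINE = re.compile(
    r"^\[(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})\] "
    r"(?P<level>\w+) \[(?P<ctx>[^\]]+)\] (?P<msg>.*)$"
)

def parse_ts(text):
    """Parse a Kafka Connect log timestamp with millisecond precision."""
    return datetime.strptime(text, "%Y-%m-%d %H:%M:%S,%f")

def recovery_window(lines):
    """Return (start, last_seen) timestamps of schema history recovery, or None."""
    start = None
    last = None
    for line in lines:
        m = LOG_LINE.match(line.strip())
        if not m:
            continue
        ts = parse_ts(m.group("ts"))
        msg = m.group("msg")
        if "Started database schema history recovery" in msg:
            start = last = ts
        elif start is not None and (
            "schema history recovery" in msg or "Already applied" in msg
        ):
            # Progress messages extend the observed recovery window.
            last = ts
    return None if start is None else (start, last)

sample = [
    "[2025-02-20 04:34:34,669] INFO [mysql_source_connector|task-0] Closing connection before starting schema recovery (io.debezium.connector.mysql.MySqlConnectorTask:94)",
    "[2025-02-20 04:34:34,743] INFO [mysql_source_connector|task-0] Started database schema history recovery (io.debezium.relational.history.SchemaHistoryMetrics:115)",
    "[2025-02-20 04:34:37,825] INFO [mysql_source_connector|task-0] Database schema history recovery in progress, recovered ... records (io.debezium.relational.history.SchemaHistoryMetrics:130)",
    "[2025-02-20 04:34:40,629] INFO [mysql_source_connector|task-0] Already applied .... database changes (io.debezium.relational.history.SchemaHistoryMetrics:140)",
]

start, last = recovery_window(sample)
elapsed_seconds = (last - start).total_seconds()
print(f"recovery observed for {elapsed_seconds:.3f} s")
```

Run against the full connector log, the same function gives the span between the first and last recovery message, which in my case is about 1 hour and 30 minutes.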
My questions:
- Why is the schema history recovery process needed every time the connector restarts, given that the connector already has the binlog position?
- Wouldn't it be more efficient for the connector to resume from the existing binlog position and only append newly added tables and database schema changes to the schema history topic, instead of reprocessing everything?
Any insights on optimizing the recovery process would be appreciated!