Some time ago, we switched from a self-built leader election mechanism to Spring Integration leader election. Both run on MariaDB clusters and rely on duplicate key violations. We switched to Spring Integration because we were sometimes facing deadlocks while the MariaDB clusters were synchronizing (roughly generalizing here).

Back when we switched, we assumed that Spring leader election writes the lock with the current timestamp to the INT_LOCK table only from the leader node, while all other nodes just read from the table to find out whether a new leader is needed (the timestamp has expired because the current leader node died).

But with the latest update, a duplicate key violation is suddenly being logged. Analyzing the implementation, we found that in fact every node tries to insert the same row all the time. The code is in the DefaultLockRepository.acquire(String) implementation:

            if (this.template.update(this.updateQuery, new Object[]{this.id, epochMillis(), this.region, lock, this.id, this.ttlEpochMillis()}) > 0) {
                return true;
            } else {
                try {
                    return this.template.update(this.insertQuery, new Object[]{this.region, lock, this.id, epochMillis()}) > 0;
                } catch (DataIntegrityViolationException var4) {
                    return false;
                }
            }

While the first query, "updateQuery", has a WHERE condition that is only met for the leader node, the "insertQuery" tries to insert a row, which of course clashes every time because a row with the same primary key is already there. So the catch block is executed constantly on all non-leader nodes.
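To make the flow above concrete, here is a minimal in-memory sketch of the update-then-insert order. It is not the library's code: the class LockTableSketch, its maps, and the failure counter are hypothetical stand-ins for the INT_LOCK table, and it assumes the UPDATE's WHERE clause matches either the current holder or an expired row, as the parameter list in the snippet suggests.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory stand-in for the INT_LOCK table, for illustration only.
final class LockTableSketch {
    private final Map<String, String> holderByKey = new HashMap<>();
    private final Map<String, Long> createdByKey = new HashMap<>();
    private final long ttlMillis;

    // Counts how often the "INSERT" hit an existing row (the duplicate key case).
    int duplicateKeyFailures = 0;

    LockTableSketch(long ttlMillis) {
        this.ttlMillis = ttlMillis;
    }

    // Mirrors the order of the quoted snippet: try the UPDATE first, then the INSERT.
    synchronized boolean acquire(String lockKey, String clientId, long nowMillis) {
        String holder = holderByKey.get(lockKey);
        if (holder != null) {
            boolean sameClient = holder.equals(clientId);
            boolean expired = createdByKey.get(lockKey) < nowMillis - ttlMillis;
            // Assumed UPDATE condition: the row belongs to this client or has expired.
            if (sameClient || expired) {
                holderByKey.put(lockKey, clientId);
                createdByKey.put(lockKey, nowMillis);
                return true;
            }
            // INSERT against an existing row: stands in for DataIntegrityViolationException.
            duplicateKeyFailures++;
            return false;
        }
        // No row yet: the INSERT succeeds and this client becomes the holder.
        holderByKey.put(lockKey, clientId);
        createdByKey.put(lockKey, nowMillis);
        return true;
    }
}
```

In this model, a non-holder that polls while the holder keeps renewing lands in the duplicate key branch on every attempt, which matches the constant warnings described above.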

Does somebody know the reasoning? Is inserting and failing faster than just reading?

I just stumbled across the duplicate key warnings in the logs. Of course I can easily silence them, but I also like to understand as much as possible of what I do. Actually the code is doing what it is supposed to, but I wonder whether it's doing so in the best way, or at least I want to understand.

I found a similar discussion on GitHub, where the idea is to just set a custom insert statement if the warnings are bothersome.

Asked Mar 13 at 7:08 by mangiari

1 Answer

The goal of LockRepository.acquire() is to renew the record for the lock, or insert one if it is not present.

INSERT and catch is faster than SELECT followed by INSERT. Moreover, an INSERT after a SELECT might still fail. With just INSERT and catch, we perform only one operation on the DB. It works like an atomic put-if-absent, with the caveat that not all RDBMS vendors provide a hint such as the ON CONFLICT DO NOTHING for PostgreSQL mentioned in that issue.

If there is something similar for MariaDB, you are free to set it via DefaultLockRepository.setInsertQuery(String insertQuery). For example, INSERT IGNORE INTO (see "How can I do 'insert if not exists' in MySQL?"):
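As a rough analogy for the put-if-absent argument, the following sketch contrasts the two strategies on a ConcurrentHashMap standing in for a table with a unique key. The class PutIfAbsentSketch, its method names, and the statement counter are hypothetical and only illustrate the reasoning; nothing here comes from Spring Integration.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical model: the map plays the table, putIfAbsent plays INSERT with a unique key.
final class PutIfAbsentSketch {
    private final ConcurrentHashMap<String, String> rows = new ConcurrentHashMap<>();

    // Counts simulated database round trips.
    final AtomicInteger statements = new AtomicInteger();

    // Check-then-act: two steps, and another writer can slip in between them,
    // so the INSERT can still collide and needs the same failure handling.
    boolean selectThenInsert(String key, String clientId) {
        statements.incrementAndGet(); // SELECT
        if (rows.containsKey(key)) {
            return false;
        }
        statements.incrementAndGet(); // INSERT
        return rows.putIfAbsent(key, clientId) == null;
    }

    // Single atomic step: the unique key decides the winner, failure means "already there".
    boolean insertAndCatch(String key, String clientId) {
        statements.incrementAndGet(); // INSERT
        return rows.putIfAbsent(key, clientId) == null;
    }
}
```

In this model a successful acquisition costs two statements with the SELECT-first approach and one with INSERT and catch, and the SELECT-first variant still has to handle a collision between its two steps.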

lockRepository.setInsertQuery(lockRepository.getInsertQuery().replaceFirst("INSERT", "INSERT IGNORE"));
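To see what that one-liner produces, here is a small standalone sketch of the string rewrite. The query text in ASSUMED_DEFAULT_INSERT is an assumption about the default statement with the INT_ table prefix, and the class InsertIgnoreSketch is hypothetical; check what getInsertQuery() returns in your version before relying on it.

```java
// Hypothetical helper showing only the string rewrite, without any Spring dependency.
final class InsertIgnoreSketch {
    // Assumed default insert statement; the real text may differ between versions.
    static final String ASSUMED_DEFAULT_INSERT =
            "INSERT INTO INT_LOCK (REGION, LOCK_KEY, CLIENT_ID, CREATED_DATE) VALUES (?, ?, ?, ?)";

    // Same replacement as in the one-liner above: only the first INSERT keyword is touched.
    static String toInsertIgnore(String insertQuery) {
        return insertQuery.replaceFirst("INSERT", "INSERT IGNORE");
    }
}
```

With INSERT IGNORE, MariaDB downgrades the duplicate key error to a warning and reports zero affected rows, so the `> 0` check in acquire returns false without an exception being thrown.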

You can also adjust this option on the LockRegistryLeaderInitiator:

/**
 * Time in milliseconds to wait in between attempts to acquire the lock, if it is not
 * held. The longer this is, the longer the system can be leaderless, if the leader
 * dies. If a leader dies without releasing its lock, the system might still have to
 * wait for the old lock to expire, but after that it should not have to wait longer
 * than the busy wait time to get a new leader. If the remote lock does not expire, or
 * if you know it interrupts the current thread when it expires or is broken, then you
 * can reduce the busy wait to zero.
 */
private volatile long busyWaitMillis = DEFAULT_BUSY_WAIT_TIME;
