Problem background

We ran batch operations against the database from multiple threads: each batch first deleted the existing rows and then inserted the new ones. We expected that skipping the update step would improve performance, but errors were reported during execution. The SQL statements timed out while waiting, the process blocked, the DBCP connection pool filled up, and the table became inaccessible. We dug into the problem and gradually worked out what was really happening.

Business implementation details

  • Table definition
    image.png
  • The data set A to be imported is defined as follows:
    image.png

Operations that lead to the problem

  • Using the value of t1, check whether table a has a matching record
  • If a record exists, update its t2 value
  • If no record is found, insert a new one

We run these batch operations from multiple threads.
image.png
image.png

Reproducing the problem

  • step1 - Insert test data first
    image.png
  • step2 - Open two sessions (windows) to simulate the deadlock.
    Session1:
    image.png
    Session2:
    image.png
    At this point both Session1 and Session2 hold a gap lock on the interval (20, +∞). Both can hold it at the same time because gap locks do not conflict with each other; a gap lock only prevents other transactions from inserting data into the interval.
  • step3 - Session1 continues the insert operation:
    image.png

At this point Session1 blocks (because Session2 holds a gap lock).

  • step4 - Session2 then continues with its own insert:
    image.png

At this point a deadlock occurs: Session2's insert has to wait for the gap lock held by Session1, while Session1 is already waiting for the gap lock held by Session2. Our code hits the same deadlock because multiple threads each delete first and then insert inside a transaction.
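The four steps can be sketched as a toy model. This is only an illustration of the waiting relationships, not InnoDB's actual lock manager: the class `GapLockTable` and its methods are hypothetical names, and the model tracks a single gap such as (20, +∞).

```python
class GapLockTable:
    """Toy model of gap locks on one interval (not InnoDB's real lock manager)."""

    def __init__(self):
        self.holders = set()   # sessions holding a gap lock on the interval
        self.waits_for = {}    # session -> set of sessions it is waiting on

    def lock_gap(self, session):
        # Gap locks are compatible with each other, so this never blocks.
        self.holders.add(session)

    def insert_into_gap(self, session):
        # An insert must wait for every *other* session holding the gap lock.
        blockers = self.holders - {session}
        if blockers:
            self.waits_for.setdefault(session, set()).update(blockers)
            return "blocked"
        return "ok"

    def has_deadlock(self):
        # A cycle in the wait-for graph means no session can ever proceed.
        def reaches(start, target, seen):
            for nxt in self.waits_for.get(start, ()):
                if nxt == target:
                    return True
                if nxt not in seen:
                    seen.add(nxt)
                    if reaches(nxt, target, seen):
                        return True
            return False

        return any(reaches(s, s, set()) for s in self.waits_for)


table = GapLockTable()
table.lock_gap("Session1")               # step2: Session1 locks the gap
table.lock_gap("Session2")               # step2: Session2 locks the same gap
r1 = table.insert_into_gap("Session1")   # step3: Session1 waits on Session2
deadlock_after_step3 = table.has_deadlock()
r2 = table.insert_into_gap("Session2")   # step4: Session2 waits on Session1
deadlock_after_step4 = table.has_deadlock()
print(r1, deadlock_after_step3, r2, deadlock_after_step4)
# prints: blocked False blocked True
```

After step3 only one session is waiting, so there is no cycle yet; the cycle appears in step4, when each session is waiting on a lock that the other holds.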

The code deliberately skipped the update path and deleted first, then inserted, so that each batch needed only two database operations.

Cause of the problem

After some research, we learned that the deadlock is caused by MySQL's gap locks.
Gap lock

A gap lock is a locking mechanism that InnoDB introduced to solve the phantom read problem under the REPEATABLE READ isolation level. Phantom reads arise from insert or update operations: if a range query (a locking read) runs at the same time, its results can become inconsistent. Row locks alone cannot prevent this, because the data within a whole range has to be locked, and gap locks exist to do exactly that. Under the REPEATABLE READ isolation level, InnoDB combines row locks and gap locks into next-key locks.
Row locks and gap locks are defined as follows:

  • record lock: a row lock; it locks only a single row.
  • gap lock: locks only an interval (the intervals here are all open, that is, they exclude the boundary values).
  • next-key lock: record lock + gap lock, so a next-key lock covers a half-open interval that is open at the lower bound and closed at the upper bound.
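To make the interval notation concrete, here is a minimal sketch that lists the intervals next-key locks can cover for a set of index values. The function name `next_key_intervals` and the sample values 10 and 20 are assumptions for illustration; the real test data is only shown in the screenshots above.

```python
import math


def next_key_intervals(index_values):
    """List the intervals that next-key locks can cover for the given index values."""
    keys = sorted(set(index_values))
    bounds = [-math.inf] + keys + [math.inf]
    intervals = []
    for lower, upper in zip(bounds, bounds[1:]):
        # Open at the lower bound, closed at the upper bound. The last
        # interval has no real record as its upper bound, so it stays open.
        closing = ")" if upper == math.inf else "]"
        intervals.append(f"({lower}, {upper}{closing}")
    return intervals


intervals = next_key_intervals([10, 20])
print(intervals)
# prints: ['(-inf, 10]', '(10, 20]', '(20, inf)']
```

With 20 as the largest value, the last interval is the (20, +∞) gap that both sessions lock in the reproduction steps.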
Locking rule features

The locking rules have several characteristics worth noting:

  • The basic unit of locking is the next-key lock, which covers an interval that is open at the front and closed at the back
  • Objects accessed during the lookup are locked
  • Gap locks only prevent other transactions from inserting into the gap. Deleting data also takes a gap lock, and multiple transactions can hold a lock on the same gap at the same time, but a transaction that wants to insert into the gap must wait until the other transactions release their locks.

Solution

1. Set the transaction isolation level to READ COMMITTED.

Gap locks are used under the REPEATABLE READ isolation level to prevent phantom reads; under READ COMMITTED, InnoDB disables gap locking for searches and index scans. This approach is not realistic for us, however: we cannot change the isolation level of the entire production database for this one problem.

2. Avoid deleting first and then inserting.

We changed the code so that it no longer deletes first and then inserts. At some cost in performance, the business code first looks up the existing records by the unique index, then updates those records in a loop by primary key ID, and batch-inserts the records that do not exist yet.
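A minimal sketch of this second approach follows. It makes several assumptions: Python's built-in `sqlite3` stands in for MySQL so that the example is self-contained, the schema of table a (columns `id`, `t1`, `t2`) is a guess because the real definition is only shown in a screenshot, and the function name `upsert_batch` is hypothetical.

```python
import sqlite3


def upsert_batch(conn, rows):
    """rows: list of (t1, t2). Update existing rows by id, batch-insert the rest."""
    if not rows:
        return 0, 0
    incoming = dict(rows)  # t1 -> t2; a later duplicate t1 overwrites an earlier one
    placeholders = ",".join("?" * len(incoming))
    # 1. Find the records that already exist, using the unique index on t1.
    found = conn.execute(
        f"SELECT id, t1 FROM a WHERE t1 IN ({placeholders})", list(incoming)
    ).fetchall()
    existing = {t1: row_id for row_id, t1 in found}
    # 2. Update the existing records one by one, by primary key.
    for t1, row_id in existing.items():
        conn.execute("UPDATE a SET t2 = ? WHERE id = ?", (incoming[t1], row_id))
    # 3. Batch-insert the records that were not found.
    new_rows = [(t1, t2) for t1, t2 in incoming.items() if t1 not in existing]
    conn.executemany("INSERT INTO a (t1, t2) VALUES (?, ?)", new_rows)
    conn.commit()
    return len(existing), len(new_rows)


conn = sqlite3.connect(":memory:")
# Hypothetical schema: id is the primary key, t1 has a unique index.
conn.execute("CREATE TABLE a (id INTEGER PRIMARY KEY, t1 INTEGER UNIQUE, t2 TEXT)")
conn.executemany("INSERT INTO a (t1, t2) VALUES (?, ?)", [(10, "old"), (20, "old")])

updated, inserted = upsert_batch(conn, [(20, "new"), (30, "new")])
rows = conn.execute("SELECT t1, t2 FROM a ORDER BY t1").fetchall()
print(updated, inserted, rows)
# prints: 1 1 [(10, 'old'), (20, 'new'), (30, 'new')]
```

Under concurrency, two threads may both find the same t1 missing and both try to insert it. The unique index rejects the second insert, so real code needs to handle that error.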


skyarthur