Author: Ren Kun
Based in Zhuhai. Formerly a full-time Oracle and MySQL DBA, he is now mainly responsible for maintaining MySQL, MongoDB, and Redis.
Source of this article: original contribution
*Original content is produced by the Aikesheng open source community and may not be used without authorization. To reprint, please contact the editor and cite the source.
1. Background
A core production MySQL 5.6 instance runs with one master and two slaves in the local data center, plus one slave in a remote data center.
Starting on February 14, the remote slave began raising replication delay alarms. At first this was put down to network fluctuations and left alone. Two days later, however, the alarm was still firing and the delay kept growing.
2. Diagnosis
Log in to the remote slave and first check whether the delay comes from the IO replication thread.
This step is simple: check whether Master_Log_File in the output of show slave status is the master's current binlog. If it is, the IO replication thread is not lagging and the delay is caused by the SQL replication thread.
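The check above can be sketched as a small shell helper. This is only an illustration: the function names slave_field and classify_lag and the host name master-host are made up for this example, and the helper compares binlog file names only, as the text describes, not positions within the file.

```shell
# Print the value of one field from `SHOW SLAVE STATUS\G` text read on stdin.
# The pattern is anchored at the start of the line so that Master_Log_File
# does not also match Relay_Master_Log_File.
slave_field() {
    sed -n "s/^[[:space:]]*$1: //p" | head -n 1
}

# classify_lag STATUS_TEXT MASTER_BINLOG
# Prints "sql_thread" when the slave has already fetched the master's
# current binlog file, otherwise "io_thread".
classify_lag() {
    fetched=$(printf '%s\n' "$1" | slave_field Master_Log_File)
    if [ "$fetched" = "$2" ]; then
        echo "sql_thread"
    else
        echo "io_thread"
    fi
}

# Example wiring (master-host is a placeholder; credentials omitted):
#   status=$(mysql -e 'SHOW SLAVE STATUS\G')
#   current=$(mysql -h master-host -N -e 'SHOW MASTER STATUS' | awk '{print $1}')
#   classify_lag "$status" "$current"
```

A result of sql_thread means the IO thread has kept up at the file level, so the next step is to look at what the SQL thread is doing.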
Get the process ID of mysqld (11029 here), then run perf record -ag -p 11029 -- sleep 10; perf report
This was repeated many times, and each time deflate_slow accounted for the largest share of samples.
Expanding it shows that it is associated with compressed pages.
pstack 11029 was also run several times to capture the call stacks, which were likewise related to compressed pages.
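Reading several pstack dumps by eye is tedious, so the samples can be aggregated. The following is a minimal sketch, assuming the usual gdb-style frame lines such as "#0  0x... in deflate_slow () from ..."; the function name frame_counts is made up for this example, and it counts every frame in every thread rather than only the top of each stack.

```shell
# Read one or more pstack dumps on stdin and print how often each function
# name appears in a stack frame, most frequent first.
frame_counts() {
    awk '$1 ~ /^#[0-9]+$/ {
        # Frame lines normally look like "#N  0xADDR in FUNC (...)";
        # fall back to the second field when the address is omitted.
        name = $2
        for (i = 2; i < NF; i++) {
            if ($i == "in") { name = $(i + 1); break }
        }
        print name
    }' | sort | uniq -c | sort -rn
}

# Example: take five samples one second apart and show the top functions.
#   for i in 1 2 3 4 5; do pstack 11029; sleep 1; done | frame_counts | head
```

If compression is the bottleneck, functions such as deflate_slow should stay near the top of the list across repeated samples.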
The instance does have one large table, and page compression is enabled for it only on the remote slave. The table was therefore decompressed by converting its row format to dynamic.
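The conversion might look roughly like the sketch below. The article does not show the exact statement, so this is an assumption: the helper names are invented, the schema and table names in the usage line are placeholders, and KEY_BLOCK_SIZE=0 is added here to clear any compression block size left in the table definition. The ALTER rebuilds the table, so on a large table it can run for a long time.

```shell
# Print a query that lists tables still stored in the compressed row format.
list_compressed_sql() {
    cat <<'SQL'
SELECT table_schema, table_name, create_options
FROM information_schema.tables
WHERE row_format = 'Compressed';
SQL
}

# decompress_sql SCHEMA TABLE
# Print the ALTER statement that converts one table to the dynamic row format.
# KEY_BLOCK_SIZE=0 is an assumption of this sketch, not taken from the article.
decompress_sql() {
    printf 'ALTER TABLE `%s`.`%s` ROW_FORMAT=DYNAMIC KEY_BLOCK_SIZE=0;\n' "$1" "$2"
}

# Example (app and big_table are placeholders), run on the remote slave only:
#   list_compressed_sql | mysql
#   decompress_sql app big_table | mysql
```

Printing the statements instead of executing them directly leaves room to review them before they touch a production replica.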
Seconds_Behind_Master then began to decrease gradually, indicating that the change had taken effect.
The perf and pstack output was captured again.
--perf report
--pstack
The functions related to page compression no longer appear, which confirms again that this replication delay was directly related to enabling page compression on the large table.
3. Summary
With the help of perf and pstack, the SQL thread replication delay caused by the compressed table was located quickly, and the problem was solved by decompressing the large table.