聊聊redis的slowlog与latency monitor

序

本文主要研究一下redis的slowlog与latency monitor

slowlog

redis在2.2.12版本引入了slowlog，用于记录超过指定执行时间的命令，这个执行时间不包括诸如与客户端通信的IO操作耗时，是实实在在的命令执行的耗时。主要有如下操作：

查看slowlog的数量

127.0.0.1:6379> slowlog len
(integer) 1024

查看slowlog的执行耗时阈值

127.0.0.1:6379> config get slowlog-log-slower-than
1) "slowlog-log-slower-than"
2) "1000"

设置slowlog的执行耗时阈值

127.0.0.1:6379> config set slowlog-log-slower-than 1000
OK

查看slowlog保存数量的阈值

127.0.0.1:6379> config get slowlog-max-len
1) "slowlog-max-len"
2) "1024"

设置slowlog保存数量的阈值

127.0.0.1:6379> config set slowlog-max-len 1024
OK

查询slowlog

127.0.0.1:6379> slowlog get 1
1) 1) (integer) 76016
   2) (integer) 1537250266
   3) (integer) 48296
   4) 1) "COMMAND"

第一行是命令id，第二行是timestamp，第三行是执行耗时，第四行是命令及参数

清除slowlog记录

127.0.0.1:6379> slowlog reset
OK

latency monitor

redis在2.8.13版本引入了latency monitoring，这里主要是监控latency spikes(延时毛刺)。它基于事件机制进行监控，command事件是监控命令执行latency，fast-command事件是监控时间复杂度为O(1)及O(logN)命令的latency，fork事件则监控redis执行系统调用fork(2)的latency。主要有如下操作：

设置/开启latency monitor

127.0.0.1:6379> config set latency-monitor-threshold 100
OK

读取latency monitor配置

127.0.0.1:6379> config get latency-monitor-threshold
1) "latency-monitor-threshold"
2) "100"

获取最近的latency

127.0.0.1:6379> debug sleep 1
OK
(1.01s)
127.0.0.1:6379> debug sleep .25
OK
127.0.0.1:6379> latency latest
1) 1) "command"
   2) (integer) 1537268070
   3) (integer) 250
   4) (integer) 1010

返回事件名、发生的时间戳、最近的延时(毫秒)、最大的延时(毫秒)

查看某一事件的延时历史

127.0.0.1:6379> latency history command
1) 1) (integer) 1537268064
   2) (integer) 1010
2) 1) (integer) 1537268070
   2) (integer) 250

查看事件延时图

127.0.0.1:6379> latency reset command
(integer) 0
127.0.0.1:6379> debug sleep .1
OK
127.0.0.1:6379> debug sleep .2
OK
127.0.0.1:6379> debug sleep .3
OK
127.0.0.1:6379> debug sleep .4
OK
127.0.0.1:6379> debug sleep .5
OK
(0.50s)
127.0.0.1:6379> latency graph
(error) ERR syntax error
127.0.0.1:6379> latency graph command
command - high 500 ms, low 100 ms (all time high 500 ms)
--------------------------------------------------------------------------------
   _#
  _||
 _|||
_||||

22117
4062s
ssss

重置/清空事件数据

127.0.0.1:6379> latency reset command
(integer) 1
127.0.0.1:6379> latency history command
(empty list or set)
127.0.0.1:6379> latency latest
(empty list or set)

诊断建议

127.0.0.1:6379> latency doctor
Dave, I have observed latency spikes in this Redis instance. You don't mind talking about it, do you Dave?

1. command: 6 latency spikes (average 257ms, mean deviation 142ms, period 3.83 sec). Worst all time event 500ms.

I have a few advices for you:

- Check your Slow Log to understand what are the commands you are running which are too slow to execute. Please check http://redis.io/commands/slowlog for more information.
- Deleting, expiring or evicting (because of maxmemory policy) large objects is a blocking operation. If you have very large objects that are often deleted, expired, or evicted, try to fragment those objects into multiple smaller objects.

小结

redis的slowlog在2.2.12版本引入，latency monitor在2.8.13版本引入
slowlog仅仅是记录纯命令的执行耗时，不包括与客户端的IO交互及redis的fork等耗时
latency monitor监控的latency spikes则范围广一点，不仅包括命令执行，也包括fork(2)系统调用，key过期等操作的耗时

聊聊redis的slowlog与latency monitor

序

slowlog

查看slowlog的数量

查看slowlog的执行耗时阈值

设置slowlog的执行耗时阈值

查看slowlog保存数量的阈值

设置slowlog保存数量的阈值

查询slowlog

清除slowlog记录

latency monitor

设置/开启latency monitor

读取latency monitor配置

获取最近的latency

查看某一事件的延时历史

查看事件延时图

重置/清空事件数据

诊断建议

小结

doc

codecraft

引用和评论

聊聊Tomato Architecture

如何实现页面广告随时上下线、过期自动下线及到时自动上线

Linux Redis 安装、配置、教程（一）

10k star！一款轻量级的Redis客户端工具，贼好用！

基于k3s部署Nginx、MySQL、PHP和Redis的详细教程

如何基于Redis实现延时任务

揭秘！Redis 分布式锁在订单创建系统中的精妙应用