redis的sentinel模式故障演练

codecraft

本文主要研究一下redis的sentinel模式的failover

启动

docker-compose up
这里使用redis-cluster的docker-compose文件进行演示
  • master日志
1:M 12 Sep 06:42:02.159 # WARNING: The TCP backlog setting of 511 cannot be enforced because /proc/sys/net/core/somaxconn is set to the lower value of 128.
1:M 12 Sep 06:42:02.159 # Server started, Redis version 3.2.8
1:M 12 Sep 06:42:02.159 # WARNING overcommit_memory is set to 0! Background save may fail under low memory condition. To fix this issue add 'vm.overcommit_memory = 1' to /etc/sysctl.conf and then reboot or run the command 'sysctl vm.overcommit_memory=1' for this to take effect.
1:M 12 Sep 06:42:02.159 # WARNING you have Transparent Huge Pages (THP) support enabled in your kernel. This will create latency and memory usage issues with Redis. To fix this issue run the command 'echo never > /sys/kernel/mm/transparent_hugepage/enabled' as root, and add it to your /etc/rc.local in order to retain the setting after a reboot. Redis must be restarted after THP is disabled.
1:M 12 Sep 06:42:02.159 * The server is now ready to accept connections on port 6379
1:M 12 Sep 06:42:02.849 * Slave 172.17.0.3:6379 asks for synchronization
1:M 12 Sep 06:42:02.849 * Full resync requested by slave 172.17.0.3:6379
1:M 12 Sep 06:42:02.849 * Starting BGSAVE for SYNC with target: disk
1:M 12 Sep 06:42:02.851 * Background saving started by pid 16
16:C 12 Sep 06:42:02.861 * DB saved on disk
16:C 12 Sep 06:42:02.862 * RDB: 6 MB of memory used by copy-on-write
1:M 12 Sep 06:42:02.865 * Background saving terminated with success
1:M 12 Sep 06:42:02.866 * Synchronization with slave 172.17.0.3:6379 succeeded
1:M 12 Sep 06:42:13.649 # Connection with slave 172.17.0.3:6379 lost.
1:M 12 Sep 06:42:14.072 * Slave 172.17.0.3:6379 asks for synchronization
1:M 12 Sep 06:42:14.073 * Full resync requested by slave 172.17.0.3:6379
1:M 12 Sep 06:42:14.073 * Starting BGSAVE for SYNC with target: disk
1:M 12 Sep 06:42:14.075 * Background saving started by pid 17
17:C 12 Sep 06:42:14.085 * DB saved on disk
17:C 12 Sep 06:42:14.085 * RDB: 8 MB of memory used by copy-on-write
1:M 12 Sep 06:42:14.185 * Background saving terminated with success
1:M 12 Sep 06:42:14.186 * Synchronization with slave 172.17.0.3:6379 succeeded
  • slave日志
1:S 12 Sep 06:42:02.847 # WARNING: The TCP backlog setting of 511 cannot be enforced because /proc/sys/net/core/somaxconn is set to the lower value of 128.
1:S 12 Sep 06:42:02.847 # Server started, Redis version 3.2.8
1:S 12 Sep 06:42:02.847 # WARNING overcommit_memory is set to 0! Background save may fail under low memory condition. To fix this issue add 'vm.overcommit_memory = 1' to /etc/sysctl.conf and then reboot or run the command 'sysctl vm.overcommit_memory=1' for this to take effect.
1:S 12 Sep 06:42:02.847 # WARNING you have Transparent Huge Pages (THP) support enabled in your kernel. This will create latency and memory usage issues with Redis. To fix this issue run the command 'echo never > /sys/kernel/mm/transparent_hugepage/enabled' as root, and add it to your /etc/rc.local in order to retain the setting after a reboot. Redis must be restarted after THP is disabled.
1:S 12 Sep 06:42:02.847 * The server is now ready to accept connections on port 6379
1:S 12 Sep 06:42:02.847 * Connecting to MASTER redis-master:6379
1:S 12 Sep 06:42:02.848 * MASTER <-> SLAVE sync started
1:S 12 Sep 06:42:02.848 * Non blocking connect for SYNC fired the event.
1:S 12 Sep 06:42:02.849 * Master replied to PING, replication can continue...
1:S 12 Sep 06:42:02.849 * Partial resynchronization not possible (no cached master)
1:S 12 Sep 06:42:02.851 * Full resync from master: 32f526697a22fef7945974d2b4dfc599401e2525:1
1:S 12 Sep 06:42:02.866 * MASTER <-> SLAVE sync: receiving 76 bytes from master
1:S 12 Sep 06:42:02.866 * MASTER <-> SLAVE sync: Flushing old data
1:S 12 Sep 06:42:02.866 * MASTER <-> SLAVE sync: Loading DB in memory
1:S 12 Sep 06:42:02.867 * MASTER <-> SLAVE sync: Finished with success
1:S 12 Sep 06:42:02.869 * Background append only file rewriting started by pid 15
1:S 12 Sep 06:42:02.903 * AOF rewrite child asks to stop sending diffs.
15:C 12 Sep 06:42:02.904 * Parent agreed to stop sending diffs. Finalizing AOF...
15:C 12 Sep 06:42:02.904 * Concatenating 0.00 MB of AOF diff received from parent.
15:C 12 Sep 06:42:02.906 * SYNC append only file rewrite performed
15:C 12 Sep 06:42:02.907 * AOF rewrite: 6 MB of memory used by copy-on-write
1:S 12 Sep 06:42:02.948 * Background AOF rewrite terminated with success
1:S 12 Sep 06:42:02.948 * Residual parent diff successfully flushed to the rewritten AOF (0.00 MB)
1:S 12 Sep 06:42:02.948 * Background AOF rewrite finished successfully
1:S 12 Sep 06:42:13.649 # Connection with master lost.
1:S 12 Sep 06:42:13.649 * Caching the disconnected master state.
1:S 12 Sep 06:42:13.650 * Discarding previously cached master state.
1:S 12 Sep 06:42:13.650 * SLAVE OF 172.17.0.2:6379 enabled (user request from 'id=3 addr=172.17.0.4:57270 fd=6 name=sentinel-927320a2-cmd age=10 idle=0 flags=x db=0 sub=0 psub=0 multi=3 qbuf=0 qbuf-free=32768 obl=36 oll=0 omem=0 events=r cmd=exec')
1:S 12 Sep 06:42:13.650 # CONFIG REWRITE executed with success.
1:S 12 Sep 06:42:14.071 * Connecting to MASTER 172.17.0.2:6379
1:S 12 Sep 06:42:14.072 * MASTER <-> SLAVE sync started
1:S 12 Sep 06:42:14.072 * Non blocking connect for SYNC fired the event.
1:S 12 Sep 06:42:14.072 * Master replied to PING, replication can continue...
1:S 12 Sep 06:42:14.072 * Partial resynchronization not possible (no cached master)
1:S 12 Sep 06:42:14.076 * Full resync from master: 32f526697a22fef7945974d2b4dfc599401e2525:733
1:S 12 Sep 06:42:14.185 * MASTER <-> SLAVE sync: receiving 76 bytes from master
1:S 12 Sep 06:42:14.186 * MASTER <-> SLAVE sync: Flushing old data
1:S 12 Sep 06:42:14.186 * MASTER <-> SLAVE sync: Loading DB in memory
1:S 12 Sep 06:42:14.186 * MASTER <-> SLAVE sync: Finished with success
1:S 12 Sep 06:42:14.189 * Background append only file rewriting started by pid 16
1:S 12 Sep 06:42:14.221 * AOF rewrite child asks to stop sending diffs.
16:C 12 Sep 06:42:14.221 * Parent agreed to stop sending diffs. Finalizing AOF...
16:C 12 Sep 06:42:14.221 * Concatenating 0.00 MB of AOF diff received from parent.
16:C 12 Sep 06:42:14.223 * SYNC append only file rewrite performed
16:C 12 Sep 06:42:14.224 * AOF rewrite: 6 MB of memory used by copy-on-write
1:S 12 Sep 06:42:14.274 * Background AOF rewrite terminated with success
1:S 12 Sep 06:42:14.274 * Residual parent diff successfully flushed to the rewritten AOF (0.00 MB)
1:S 12 Sep 06:42:14.274 * Background AOF rewrite finished successfully

主从切换

  • docker-compose ps
       Name                      Command               State           Ports
-------------------------------------------------------------------------------------
sentinel_master_1     docker-entrypoint.sh redis ...   Up      0.0.0.0:6379->6379/tcp
sentinel_sentinel_1   sh /data/sentinel-entrypoi ...   Up      26379/tcp, 6379/tcp
sentinel_sentinel_2   sh /data/sentinel-entrypoi ...   Up      26379/tcp, 6379/tcp
sentinel_sentinel_3   sh /data/sentinel-entrypoi ...   Up      26379/tcp, 6379/tcp
sentinel_slave_1      docker-entrypoint.sh redis ...   Up      6379/tcp
sentinel_slave_2      docker-entrypoint.sh redis ...   Up      6379/tcp
  • 停止master节点:
docker pause sentinel_master_1
  • 查看sentinel日志:
1:X 12 Sep 06:46:42.611 # WARNING: The TCP backlog setting of 511 cannot be enforced because /proc/sys/net/core/somaxconn is set to the lower value of 128.
1:X 12 Sep 06:46:42.615 # Sentinel ID is 9e1da269ca7f134ed7bae15ad8efa3f5dd22f72d
1:X 12 Sep 06:46:42.615 # +monitor master redis-master 172.17.0.2 6379 quorum 2
1:X 12 Sep 06:46:42.617 * +slave slave 172.17.0.3:6379 172.17.0.3 6379 @ redis-master 172.17.0.2 6379
1:X 12 Sep 06:46:43.467 * +sentinel sentinel 927320a2afbfd70eae1716e8a024c726e71f2b51 172.17.0.4 26379 @ redis-master 172.17.0.2 6379
1:X 12 Sep 06:46:44.554 * +sentinel sentinel 8fc2f95bc671dc8a3df30046a29fdc41743a774d 172.17.0.5 26379 @ redis-master 172.17.0.2 6379
1:X 12 Sep 06:47:02.679 * +slave slave 172.17.0.7:6379 172.17.0.7 6379 @ redis-master 172.17.0.2 6379
1:X 12 Sep 06:48:32.777 # +new-epoch 1
1:X 12 Sep 06:48:32.784 # +vote-for-leader 927320a2afbfd70eae1716e8a024c726e71f2b51 1
1:X 12 Sep 06:48:32.843 # +sdown master redis-master 172.17.0.2 6379
1:X 12 Sep 06:48:32.944 # +odown master redis-master 172.17.0.2 6379 #quorum 3/2
1:X 12 Sep 06:48:32.944 # Next failover delay: I will not start a failover before Wed Sep 12 06:48:43 2018
1:X 12 Sep 06:48:33.857 # +config-update-from sentinel 927320a2afbfd70eae1716e8a024c726e71f2b51 172.17.0.4 26379 @ redis-master 172.17.0.2 6379
1:X 12 Sep 06:48:33.861 # +switch-master redis-master 172.17.0.2 6379 172.17.0.3 6379
1:X 12 Sep 06:48:33.863 * +slave slave 172.17.0.7:6379 172.17.0.7 6379 @ redis-master 172.17.0.3 6379
1:X 12 Sep 06:48:33.864 * +slave slave 172.17.0.2:6379 172.17.0.2 6379 @ redis-master 172.17.0.3 6379
1:X 12 Sep 06:48:38.902 # +sdown slave 172.17.0.2:6379 172.17.0.2 6379 @ redis-master 172.17.0.3 6379
  • 查看新的master
1:M 12 Sep 06:48:32.996 # Connection with master lost.
1:M 12 Sep 06:48:32.997 * Caching the disconnected master state.
1:M 12 Sep 06:48:32.997 * Discarding previously cached master state.
1:M 12 Sep 06:48:32.997 * MASTER MODE enabled (user request from 'id=3 addr=172.17.0.4:57270 fd=6 name=sentinel-927320a2-cmd age=389 idle=0 flags=x db=0 sub=0 psub=0 multi=3 qbuf=0 qbuf-free=32768 obl=36 oll=0 omem=0 events=r cmd=exec')
1:M 12 Sep 06:48:32.998 # CONFIG REWRITE executed with success.
1:M 12 Sep 06:48:33.983 * Slave 172.17.0.7:6379 asks for synchronization
1:M 12 Sep 06:48:33.983 * Full resync requested by slave 172.17.0.7:6379
1:M 12 Sep 06:48:33.983 * Starting BGSAVE for SYNC with target: disk
1:M 12 Sep 06:48:33.984 * Background saving started by pid 28
28:C 12 Sep 06:48:33.992 * DB saved on disk
28:C 12 Sep 06:48:33.992 * RDB: 6 MB of memory used by copy-on-write
1:M 12 Sep 06:48:34.076 * Background saving terminated with success
1:M 12 Sep 06:48:34.076 * Synchronization with slave 172.17.0.7:6379 succeeded
  • 可以看到MASTER MODE enabled

恢复节点

docker unpause sentinel_master_1

查看该节点日志

1:M 12 Sep 06:56:05.592 # Connection with slave client id #12 lost.
1:M 12 Sep 06:56:05.592 # Connection with slave client id #5 lost.
1:S 12 Sep 06:56:17.140 * SLAVE OF 172.17.0.3:6379 enabled (user request from 'id=144 addr=172.17.0.5:41876 fd=7 name=sentinel-8fc2f95b-cmd age=10 idle=0 flags=x db=0 sub=0 psub=0 multi=3 qbuf=0 qbuf-free=32768 obl=36 oll=0 omem=0 events=r cmd=exec')
1:S 12 Sep 06:56:17.141 # CONFIG REWRITE executed with success.
1:S 12 Sep 06:56:17.206 * Connecting to MASTER 172.17.0.3:6379
1:S 12 Sep 06:56:17.206 * MASTER <-> SLAVE sync started
1:S 12 Sep 06:56:17.206 * Non blocking connect for SYNC fired the event.
1:S 12 Sep 06:56:17.207 * Master replied to PING, replication can continue...
1:S 12 Sep 06:56:17.208 * Partial resynchronization not possible (no cached master)
1:S 12 Sep 06:56:17.211 * Full resync from master: b2e78c2c21c3a4caa7a37fe86da9b3a2cda0dce4:134615
1:S 12 Sep 06:56:17.288 * MASTER <-> SLAVE sync: receiving 94 bytes from master
1:S 12 Sep 06:56:17.289 * MASTER <-> SLAVE sync: Flushing old data
1:S 12 Sep 06:56:17.289 * MASTER <-> SLAVE sync: Loading DB in memory
1:S 12 Sep 06:56:17.289 * MASTER <-> SLAVE sync: Finished with success
1:S 12 Sep 06:56:17.292 * Background append only file rewriting started by pid 32
1:S 12 Sep 06:56:17.339 * AOF rewrite child asks to stop sending diffs.
32:C 12 Sep 06:56:17.339 * Parent agreed to stop sending diffs. Finalizing AOF...
32:C 12 Sep 06:56:17.339 * Concatenating 0.00 MB of AOF diff received from parent.
32:C 12 Sep 06:56:17.342 * SYNC append only file rewrite performed
32:C 12 Sep 06:56:17.342 * AOF rewrite: 4 MB of memory used by copy-on-write
1:S 12 Sep 06:56:17.407 * Background AOF rewrite terminated with success
1:S 12 Sep 06:56:17.407 * Residual parent diff successfully flushed to the rewritten AOF (0.00 MB)
1:S 12 Sep 06:56:17.407 * Background AOF rewrite finished successfully
  • 可以看到自己切换为slave跟新的master同步

小结

redis的sentinel模式相对cluster来说比较简单,缺点是需要浪费一些资源来做sentinel节点,对于中小数据量的业务来说,相对比较划算。

doc

阅读 1.3k

code-craft
spring boot , docker and so on 欢迎关注微信公众号: geek_luandun

当一个代码的工匠回首往事时,不因虚度年华而悔恨,也不因碌碌无为而羞愧,这样,当他老的时候,可以很...

11.6k 声望
1.9k 粉丝
0 条评论

当一个代码的工匠回首往事时,不因虚度年华而悔恨,也不因碌碌无为而羞愧,这样,当他老的时候,可以很...

11.6k 声望
1.9k 粉丝
文章目录
宣传栏