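Part 0. Preparation
Our company recently upgraded its zabbix server. The old single-machine setup ran under heavy load, and the whole monitoring system went down whenever that machine crashed, so we took the opportunity to move to a two-node high-availability architecture.
Below is the process of building HA with keepalived, recorded for reference.
Two hosts and three IPs: both hosts run zabbix and keepalived to provide high availability.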
IP address | Hostname | Notes |
---|---|---|
192.168.3.141 | zabbix_master | zabbix/keepalived |
192.168.3.142 | zabbix_slaver | zabbix/keepalived |
192.168.3.144 | VIP | --- |
Part 1. Installing keepalived
1. Installation
On CentOS 7 installation is simple: just use yum.
$ su root
$ yum install keepalived
$ /usr/sbin/keepalived -D
$ pstree |grep keepalived
|-keepalived---2*[keepalived]
As the output shows, three processes were actually started.
2. Base Configuration
Set up start on boot
#Stop the manually started daemon first
pgrep keepalived |xargs kill -15
#Reload systemd unit files
systemctl daemon-reload
#Enable start on boot
systemctl enable keepalived.service
#Disable start on boot
systemctl disable keepalived.service
#Start
systemctl start keepalived.service
#Stop
systemctl stop keepalived.service
#Check status
systemctl status keepalived.service
3. Advanced Configuration
su root
chown zabbix:zabbix /usr/sbin/keepalived
chown zabbix:zabbix /lib/systemd/system/keepalived.service
chown zabbix:zabbix /etc/keepalived/keepalived.conf
su zabbix
vim /lib/systemd/system/keepalived.service
vim /etc/keepalived/keepalived.conf
chmod +x /home/zabbix/keepalived/src/zabbix_status_check2.sh
The two files that matter are keepalived.conf and zabbix_status_check2.sh. Both are widely documented online, so they are not covered in detail here.
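Since the configuration itself is not shown, here is a minimal sketch of what a keepalived.conf for this setup might look like. The instance name `VI_1`, the script name `zabbix_status_check`, the interface `ens160`, the VIP and the check-script path come from the logs and commands in this post. Everything else (`router_id`, `virtual_router_id`, priorities, intervals, `auth_pass`) is an assumption to be replaced with your own values. The helper `render_keepalived_conf` is a hypothetical name used only here.

```shell
#!/usr/bin/env bash
# Sketch only: prints an example keepalived.conf to stdout for review.
# Values marked "assumed" are NOT taken from the original setup.
set -eu

# Usage: render_keepalived_conf <MASTER|BACKUP> <priority>
render_keepalived_conf() {
  state="$1"
  priority="$2"
  cat <<EOF
global_defs {
    router_id zabbix_ha              # assumed
}

vrrp_script zabbix_status_check {
    script "/home/zabbix/keepalived/src/zabbix_status_check2.sh"
    interval 5                       # assumed: run the check every 5 seconds
    fall 2                           # assumed: 2 failures mark the check as down
    rise 1                           # assumed: 1 success marks it as up again
}

vrrp_instance VI_1 {
    state ${state}
    interface ens160
    virtual_router_id 51             # assumed, must match on both nodes
    priority ${priority}
    advert_int 1                     # assumed
    authentication {
        auth_type PASS
        auth_pass changeme           # assumed placeholder
    }
    virtual_ipaddress {
        192.168.3.144/24
    }
    track_script {
        zabbix_status_check
    }
}
EOF
}

# Print a config for the requested role (defaults: BACKUP, priority 100).
render_keepalived_conf "${1:-BACKUP}" "${2:-100}"
```

To try it, redirect the output to a scratch file, review it, and only then copy it to /etc/keepalived/keepalived.conf on each node with that node's own state and priority.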
Part 2. Starting both nodes
1. Start keepalived on the master
Mar 13 17:27:57 vm1184 Keepalived_vrrp[5026]: VRRP_Script(zabbix_status_check) succeeded
Mar 13 17:28:00 vm1184 Keepalived_vrrp[5026]: VRRP_Instance(VI_1) Transition to MASTER STATE
Mar 13 17:28:01 vm1184 Keepalived_vrrp[5026]: VRRP_Instance(VI_1) Entering MASTER STATE
Mar 13 17:28:01 vm1184 Keepalived_vrrp[5026]: VRRP_Instance(VI_1) setting protocol VIPs.
Mar 13 17:28:01 vm1184 Keepalived_vrrp[5026]: VRRP_Instance(VI_1) Sending gratuitous ARPs on ens160 for 192.168.3.144
Mar 13 17:28:01 vm1184 Keepalived_healthcheckers[5025]: Netlink reflector reports IP 192.168.3.144 added
Mar 13 17:28:01 vm1184 avahi-daemon[637]: Registering new address record for 192.168.3.144 on ens160.IPv4.
Mar 13 17:28:06 vm1184 Keepalived_vrrp[5026]: VRRP_Instance(VI_1) Sending gratuitous ARPs on ens160 for 192.168.3.144
2. Master addresses: the VIP is now bound to the NIC.
$ ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN qlen 1
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: ens160: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
link/ether 00:50:56:94:eb:ad brd ff:ff:ff:ff:ff:ff
inet 192.168.3.141/32 brd 192.168.3.141 scope global ens160
valid_lft forever preferred_lft forever
inet 192.168.3.144/24 scope global ens160
valid_lft forever preferred_lft forever
inet6 fe80::ff15:1cc9:5bd0:b06e/64 scope link
valid_lft forever preferred_lft forever
3. Backup log: keepalived has started.
$ tail -f /var/log/messages
Mar 14 09:24:02 vm1185 Keepalived_healthcheckers[7588]: Registering Kernel netlink command channel
Mar 14 09:24:02 vm1185 Keepalived_healthcheckers[7588]: Opening file '/etc/keepalived/keepalived.conf'.
Mar 14 09:24:02 vm1185 Keepalived_healthcheckers[7588]: Configuration is using : 8087 Bytes
Mar 14 09:24:02 vm1185 Keepalived_healthcheckers[7588]: Using LinkWatch kernel netlink reflector...
4. Backup IP addresses: the VIP is not bound to the NIC.
$ ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN qlen 1
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: ens160: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
link/ether 00:50:56:94:88:c9 brd ff:ff:ff:ff:ff:ff
inet 192.168.3.142/32 brd 192.168.3.142 scope global ens160
valid_lft forever preferred_lft forever
inet6 fe80::e00c:f722:cd30:ad8b/64 scope link
valid_lft forever preferred_lft forever
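Reading `ip a` by eye works for two nodes, but the same check is easy to script. The sketch below assumes the VIP and interface shown above. `has_vip` is a hypothetical helper name, not part of keepalived or iproute2.

```shell
#!/usr/bin/env bash
# Sketch: report whether the VIP is currently bound on this host.
set -eu

VIP="${VIP:-192.168.3.144}"

# has_vip <ip>: reads `ip addr` output on stdin, succeeds if <ip> is listed.
# The leading "inet " and trailing "/" keep 192.168.3.14 from matching
# 192.168.3.144 and vice versa.
has_vip() {
  grep -Fq "inet $1/"
}

# Live check, skipped on systems without the ip command.
if command -v ip >/dev/null 2>&1; then
  if ip -4 addr show | has_vip "$VIP"; then
    echo "VIP $VIP is bound on this host"
  else
    echo "VIP $VIP is not bound on this host"
  fi
fi
```

Run on both nodes, exactly one of them should report the VIP as bound. If both do, the nodes are probably not seeing each other's VRRP advertisements.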
Part 3. Master/backup failover
1. After keepalived is killed on the master, the backup automatically takes over as master.
Mar 14 09:29:04 vm1185 Keepalived_vrrp[7589]: VRRP_Instance(VI_1) Transition to MASTER STATE
Mar 14 09:29:05 vm1185 Keepalived_vrrp[7589]: VRRP_Instance(VI_1) Entering MASTER STATE
Mar 14 09:29:05 vm1185 Keepalived_vrrp[7589]: VRRP_Instance(VI_1) setting protocol VIPs.
Mar 14 09:29:05 vm1185 Keepalived_vrrp[7589]: VRRP_Instance(VI_1) Sending gratuitous ARPs on ens160 for 192.168.3.144
Mar 14 09:29:05 vm1185 Keepalived_healthcheckers[7588]: Netlink reflector reports IP 192.168.3.144 added
Mar 14 09:29:05 vm1185 avahi-daemon[674]: Registering new address record for 192.168.3.144 on ens160.IPv4.
Mar 14 09:29:10 vm1185 Keepalived_vrrp[7589]: VRRP_Instance(VI_1) Sending gratuitous ARPs on ens160 for 192.168.3.144
2. NIC state on the current master (the former backup): the VIP is now bound.
$ ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN qlen 1
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: ens160: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
link/ether 00:50:56:94:88:c9 brd ff:ff:ff:ff:ff:ff
inet 192.168.3.142/32 brd 192.168.3.142 scope global ens160
valid_lft forever preferred_lft forever
inet 192.168.3.144/24 scope global ens160
valid_lft forever preferred_lft forever
inet6 fe80::e00c:f722:cd30:ad8b/64 scope link
valid_lft forever preferred_lft forever
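3. Start keepalived again on the original master: it becomes the backup and its VIP binding is removed. The log shows it briefly entering MASTER state, then receiving a higher-priority advertisement from the current master and dropping to BACKUP.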
$ tail -f /var/log/messages
Mar 14 09:30:12 vm1184 Keepalived_vrrp[11857]: VRRP_Instance(VI_1) Entering MASTER STATE
Mar 14 09:30:12 vm1184 Keepalived_vrrp[11857]: VRRP_Instance(VI_1) setting protocol VIPs.
Mar 14 09:30:12 vm1184 Keepalived_vrrp[11857]: VRRP_Instance(VI_1) Sending gratuitous ARPs on ens160 for 192.168.3.144
Mar 14 09:30:12 vm1184 avahi-daemon[637]: Registering new address record for 192.168.3.144 on ens160.IPv4.
Mar 14 09:30:12 vm1184 Keepalived_healthcheckers[11856]: Netlink reflector reports IP 192.168.3.144 added
Mar 14 09:30:12 vm1184 Keepalived_vrrp[11857]: VRRP_Instance(VI_1) Received higher prio advert
Mar 14 09:30:12 vm1184 Keepalived_vrrp[11857]: VRRP_Instance(VI_1) Entering BACKUP STATE
Mar 14 09:30:12 vm1184 Keepalived_vrrp[11857]: VRRP_Instance(VI_1) removing protocol VIPs.
Mar 14 09:30:12 vm1184 Keepalived_healthcheckers[11856]: Netlink reflector reports IP 192.168.3.144 removed
Mar 14 09:30:12 vm1184 avahi-daemon[637]: Withdrawing address record for 192.168.3.144 on ens160.
4. The original master is now the backup, and the VIP binding has indeed been removed.
$ ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN qlen 1
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: ens160: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
link/ether 00:50:56:94:eb:ad brd ff:ff:ff:ff:ff:ff
inet 192.168.3.141/32 brd 192.168.3.141 scope global ens160
valid_lft forever preferred_lft forever
inet6 fe80::ff15:1cc9:5bd0:b06e/64 scope link
valid_lft forever preferred_lft forever
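The failover sequence above is easier to follow when only the state changes are pulled out of the log. A small sketch, assuming the syslog line format shown in this post (`vrrp_transitions` is a hypothetical helper name):

```shell
#!/usr/bin/env bash
# Sketch: list VRRP state changes as "<host> <instance> <STATE>".
set -eu

LOG_FILE="${LOG_FILE:-/var/log/messages}"

# vrrp_transitions: reads syslog lines on stdin and prints one line per
# "Entering <STATE> STATE" message written by Keepalived_vrrp.
vrrp_transitions() {
  sed -n 's/^.* \([^ ]*\) Keepalived_vrrp\[[0-9]*]: VRRP_Instance(\([^)]*\)) Entering \([A-Z]*\) STATE.*$/\1 \2 \3/p'
}

# Show the five most recent transitions if the log is readable.
if [ -r "$LOG_FILE" ]; then
  vrrp_transitions < "$LOG_FILE" | tail -n 5
fi
```

Fed the log from step 3, it prints `vm1184 VI_1 MASTER` followed by `vm1184 VI_1 BACKUP`, which is the brief promotion and immediate demotion described there.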
Part 4. zabbix_server failure test
# Original server processes (started via the relative path ./sbin/zabbix_server)
$ ps -ef |grep zabbix_server
zabbix 31668 31643 0 Mar13 ? 00:00:11 ./sbin/zabbix_server: escalator #1 [processed 0 escalations in 0.001148 sec, idle 3 sec]
zabbix 31669 31643 0 Mar13 ? 00:00:03 ./sbin/zabbix_server: proxy poller #1 [exchanged data with 0 proxies in 0.000066 sec, idle 5 sec]
zabbix 31670 31643 0 Mar13 ? 00:00:09 ./sbin/zabbix_server: self-monitoring [processed data in 0.000046 sec, idle 1 sec]
zabbix 31671 31643 0 Mar13 ? 00:00:03 ./sbin/zabbix_server: task manager [processed 0 task(s) in 0.000289 sec, idle 5 sec]
# Kill the processes to simulate a crash
$ pgrep zabbix_server |xargs kill -15
# Check again
$ ps -ef |grep zabbix_server
zabbix 14053 14028 0 09:59 ? 00:00:00 /home/zabbix/zabbix/sbin/zabbix_server: history syncer #4 [synced 0 items in 0.000043 sec, idle 1 sec]
zabbix 14054 14028 0 09:59 ? 00:00:00 /home/zabbix/zabbix/sbin/zabbix_server: escalator #1 [processed 0 escalations in 0.001202 sec, idle 3 sec]
zabbix 14055 14028 0 09:59 ? 00:00:00 /home/zabbix/zabbix/sbin/zabbix_server: proxy poller #1 [exchanged data with 0 proxies in 0.000057 sec, idle 5 sec]
zabbix 14056 14028 0 09:59 ? 00:00:00 /home/zabbix/zabbix/sbin/zabbix_server: self-monitoring [processed data in 0.000040 sec, idle 1 sec]
zabbix 14057 14028 0 09:59 ? 00:00:00 /home/zabbix/zabbix/sbin/zabbix_server: task manager [processed 0 task(s) in 0.000433 sec, idle 5 sec]
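This shows that keepalived's health-check script did run: when it finds no zabbix_server process, it restarts the server automatically. The processes now have new PIDs, a 09:59 start time, and the absolute path /home/zabbix/zabbix/sbin/zabbix_server instead of ./sbin/zabbix_server.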
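The author's zabbix_status_check2.sh is not shown, so the following is only a sketch of a check-and-restart script with the behaviour observed above. The binary path comes from the `ps` output. The config path passed to `-c`, the wait time, and the `check` argument are assumptions of this sketch. The argument exists so the file can be loaded without triggering a restart, which means keepalived would have to call it as `script "/path/to/script check"`.

```shell
#!/usr/bin/env bash
# Sketch: succeed if zabbix_server is running, otherwise try to start it once.
# A non-zero exit tells keepalived's vrrp_script that the check failed.
set -u

PROC_NAME="${PROC_NAME:-zabbix_server}"
# Assumed start command; the config file location is a guess.
START_CMD="${START_CMD:-/home/zabbix/zabbix/sbin/zabbix_server -c /home/zabbix/zabbix/etc/zabbix_server.conf}"
WAIT_SECS="${WAIT_SECS:-3}"

# is_running <name>: succeeds if a process with exactly this name exists.
is_running() {
  pgrep -x "$1" >/dev/null 2>&1
}

check_and_restart() {
  if is_running "$PROC_NAME"; then
    return 0
  fi
  # Intentionally unquoted so the command and its arguments are split.
  $START_CMD >/dev/null 2>&1 || true
  sleep "$WAIT_SECS"
  # Report whether the restart attempt actually brought the process back.
  is_running "$PROC_NAME"
}

# Only act when called as: <script> check
if [ "${1:-}" = "check" ]; then
  check_and_restart
  exit $?
fi
```

Returning the post-restart status rather than always exiting 0 means a server that cannot be restarted still fails the check, so keepalived can react according to its `vrrp_script` settings.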