The failure looks like this:
root@drbd1:~# drbd-overview
0:data/0 StandAlone Primary/Unknown UpToDate/DUnknown /data/mysql ext3 3.9G 8.1M 3.7G 1%
root@drbd2:~# drbd-overview
0:data/0 StandAlone Primary/Unknown UpToDate/DUnknown /data/mysql ext3 3.9G 8.1M 3.7G 1%
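State StandAlone: no network configuration is available (no replication or synchronization link). The resource has not been connected yet, the administrator disconnected it with drbdadm disconnect <resource>, or the connection was dropped because authentication failed or a split brain occurred.
Check the log: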
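When several resources are configured, it can help to pull just the connection state out of the drbd-overview output. The sketch below assumes the column layout shown above (resource in the first field, connection state in the second); drbd_cstate is a hypothetical helper name, not part of DRBD.

```shell
# Hypothetical helper: read drbd-overview output on stdin and print
# "<resource> <connection state>" for every line.
# Assumes the layout shown in this post: field 1 = minor:resource/volume,
# field 2 = connection state.
drbd_cstate() {
    awk 'NF >= 2 { print $1, $2 }'
}

# Example (on a node with DRBD installed):
#   drbd-overview | drbd_cstate
```

Any state other than Connected (or one of the Sync* states during a resync) is worth a look in the kernel log.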
root@drbd1:~# tail -n 20 /var/log/syslog
May 23 20:34:41 drbd1 kernel: [ 4629.177175] drbd data: Peer authenticated using 20 bytes HMAC
May 23 20:34:41 drbd1 kernel: [ 4629.177389] drbd data: conn( WFConnection -> WFReportParams )
May 23 20:34:41 drbd1 kernel: [ 4629.177391] drbd data: Starting asender thread (from drbd_r_data [10450])
May 23 20:34:41 drbd1 kernel: [ 4629.186967] block drbd0: drbd_sync_handshake:
May 23 20:34:41 drbd1 kernel: [ 4629.186970] block drbd0: self B4EF9EF8D6B328BD:1E9AC6C2E7980795:4B519345CD4008DE:4B509345CD4008DE bits:1024 flags:0
May 23 20:34:41 drbd1 kernel: [ 4629.186972] block drbd0: peer 7B0DFE0CF2812103:1E9AC6C2E7980794:4B519345CD4008DE:4B509345CD4008DE bits:1 flags:2
May 23 20:34:41 drbd1 kernel: [ 4629.186973] block drbd0: uuid_compare()=100 by rule 90
May 23 20:34:41 drbd1 kernel: [ 4629.186976] block drbd0: helper command: /sbin/drbdadm initial-split-brain minor-0
May 23 20:34:41 drbd1 kernel: [ 4629.188312] block drbd0: helper command: /sbin/drbdadm initial-split-brain minor-0 exit code 0 (0x0)
May 23 20:34:41 drbd1 kernel: [ 4629.188324] block drbd0: Split-Brain detected but unresolved, dropping connection!
May 23 20:34:41 drbd1 kernel: [ 4629.189831] block drbd0: helper command: /sbin/drbdadm split-brain minor-0
May 23 20:34:41 drbd1 kernel: [ 4629.191008] block drbd0: helper command: /sbin/drbdadm split-brain minor-0 exit code 0 (0x0)
May 23 20:34:41 drbd1 kernel: [ 4629.191028] drbd data: conn( WFReportParams -> Disconnecting )
May 23 20:34:41 drbd1 kernel: [ 4629.191030] drbd data: error receiving ReportState, e: -5 l: 0!
May 23 20:34:41 drbd1 kernel: [ 4629.191496] drbd data: asender terminated
May 23 20:34:41 drbd1 kernel: [ 4629.191497] drbd data: Terminating drbd_a_data
May 23 20:34:41 drbd1 kernel: [ 4629.218488] drbd data: Connection closed
May 23 20:34:41 drbd1 kernel: [ 4629.218551] drbd data: conn( Disconnecting -> StandAlone )
May 23 20:34:41 drbd1 kernel: [ 4629.218553] drbd data: receiver terminated
May 23 20:34:41 drbd1 kernel: [ 4629.218554] drbd data: Terminating drbd_r_data
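The decisive line in the log above is "Split-Brain detected but unresolved, dropping connection!". A minimal sketch for checking a log for that message follows; has_split_brain is a hypothetical helper name, and the match string is taken from the log in this post, so other DRBD versions may word it differently.

```shell
# Hypothetical helper: succeed (exit 0) if the log text on stdin contains
# the split-brain message seen in this post's kernel log.
has_split_brain() {
    grep -q 'Split-Brain detected'
}

# Example (log path as used in this post):
#   has_split_brain < /var/log/syslog && echo "split brain reported"
```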
Check the service status:
root@drbd1:~# service drbd status
drbd driver loaded OK; device status:
version: 8.4.5 (api:1/proto:86-101)
srcversion: 5A4F43804B37BB28FCB1F47
m:res cs ro ds p mounted fstype
0:data StandAlone Primary/Unknown UpToDate/DUnknown r----- ext3
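Here drbd1 is the primary node and drbd2 is the standby node (both report Primary above because of the split brain).
Solution:
1. Make sure the DRBD device is unmounted on all nodes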
root@drbd1:~# umount /dev/drbd0
root@drbd2:~# umount /dev/drbd0
2. Set all nodes to Secondary
root@drbd1:~# drbdadm secondary data
root@drbd2:~# drbdadm secondary data
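3. Disconnect the node (the resource is already StandAlone here, so the command fails with "unknown connection"; that error is harmless in this situation)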
root@drbd2:~# drbdadm disconnect data
??: Failure: (162) Invalid configuration request
additional info from kernel:
unknown connection
Command 'drbdsetup-84 disconnect ipv4:10.11.8.158:7789 ipv4:10.11.8.145:7789' terminated with exit code 10
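The error above can be avoided by checking the connection state first and only disconnecting when there is a connection to tear down. This is a sketch under the assumption that drbdadm cstate <resource> prints the connection state as a single word, as it does in DRBD 8.x; needs_disconnect is a hypothetical helper name.

```shell
# Hypothetical helper: succeed when the given connection state means there
# is still a connection (or connection attempt) to tear down.
# A resource that is already StandAlone has nothing to disconnect.
needs_disconnect() {
    [ "$1" != "StandAlone" ]
}

# Example (resource name "data" as in this post):
#   state=$(drbdadm cstate data)
#   if needs_disconnect "$state"; then drbdadm disconnect data; fi
```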
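4. On drbd2, reconnect and discard its local changes (drbd2 becomes the split-brain victim and will resync from drbd1)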
root@drbd2:~# drbdadm connect data --discard-my-data
root@drbd2:~# drbd-overview
0:data/0 WFConnection Secondary/Unknown UpToDate/DUnknown
State WFConnection: this node waits until the peer node becomes visible on the network.
5. On drbd1, reconnect
root@drbd1:~# drbdadm connect data
root@drbd1:~# drbd-overview
0:data/0 Connected Secondary/Secondary UpToDate/UpToDate
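The connection is back to normal. Both nodes are still Secondary at this point, so to resume service promote drbd1 (drbdadm primary data) and mount /dev/drbd0 on /data/mysql again.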
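The steps above can be collected into a small script. The sketch below is a dry run by default: it only prints the commands unless DRY_RUN=0 is set. The function names run, recover_victim and recover_survivor are hypothetical; the resource name, device and mount point come from this post. Two things differ from the transcript and are assumptions: --discard-my-data is placed before the resource name, which is the order the DRBD 8.4 user guide uses, and the survivor is promoted and remounted at the end.

```shell
# Run a command, or only print it when DRY_RUN is 1 (the default).
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Node whose changes are thrown away (drbd2 in this post).
# $1 = resource, $2 = device
recover_victim() {
    run umount "$2"
    run drbdadm secondary "$1"
    run drbdadm connect --discard-my-data "$1"
}

# Node whose data is kept (drbd1 in this post).
# $1 = resource, $2 = device, $3 = mount point
recover_survivor() {
    run umount "$2"
    run drbdadm secondary "$1"
    run drbdadm connect "$1"
    # Assumption: bring the service back on the surviving node.
    run drbdadm primary "$1"
    run mount "$2" "$3"
}

# Example:
#   on drbd2:  recover_victim data /dev/drbd0
#   on drbd1:  recover_survivor data /dev/drbd0 /data/mysql
```

Run the victim side first, as in the post, and review the printed commands before setting DRY_RUN=0: everything written only on the victim since the split is lost.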