Reposted from: Deploying a RocketMQ Cluster with Automatic Failover (DLedger) on Docker
The end result of this walkthrough is a RocketMQ cluster with one name server and six broker nodes, where every three brokers form a master-slave broker group. When a group's master dies, the group elects one of the remaining brokers as the new master. Each broker group needs at least three nodes; otherwise, once the master is gone, the surviving single node cannot win a majority of the group's votes and no slave can be promoted to master (see the Raft algorithm).
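In Raft terms, a candidate needs ⌊n/2⌋ + 1 votes from the n nodes in its group to become leader. With n = 3 the quorum is 2, so the two survivors of a single failure can still elect a master; with n = 2 the quorum is also 2, and the lone survivor can never reach it.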
Environment
- VM: 4 CPU cores, 7 GB RAM
- Docker version: 19.03.2
- RocketMQ version: 4.6.0
- VM IP address: 192.168.7.241
Create a new directory (referred to below as the root directory) and extract the RocketMQ binary release downloaded from the official site into it. The root directory now contains:
| rocketmq-all-4.6.0-bin-release
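Starting from scratch, the setup looks roughly like this (the directory name and the download path are placeholders; the zip comes from the RocketMQ download page):

mkdir rocketmq-dledger && cd rocketmq-dledger
# extract the binary release into the root directory
unzip ~/Downloads/rocketmq-all-4.6.0-bin-release.zip -d .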
1. Modify the namesrv and broker configuration
Because the VM's memory is limited, the memory given to namesrv and the brokers must be capped in their startup scripts.
Modify runserver.sh
Edit rocketmq-all-4.6.0-bin-release/bin/runserver.sh, changing:
JAVA_OPT="${JAVA_OPT} -server -Xms4g -Xmx4g -Xmn2g -XX:MetaspaceSize=128m -XX:MaxMetaspaceSize=320m"
to:
JAVA_OPT="${JAVA_OPT} -server -Xms1g -Xmx1g -Xmn512m -XX:MetaspaceSize=64m -XX:MaxMetaspaceSize=128m"
Modify runbroker.sh
Edit rocketmq-all-4.6.0-bin-release/bin/runbroker.sh, changing:
JAVA_OPT="${JAVA_OPT} -server -Xms8g -Xmx8g -Xmn4g"
JAVA_OPT="${JAVA_OPT} -XX:+UseG1GC -XX:G1HeapRegionSize=16m -XX:G1ReservePercent=25 -XX:InitiatingHeapOccupancyPercent=30 -XX:SoftRefLRUPolicyMSPerMB=0"
to:
JAVA_OPT="${JAVA_OPT} -server -Xms512m -Xmx512m -Xmn256m"
JAVA_OPT="${JAVA_OPT} -XX:+UseG1GC -XX:G1HeapRegionSize=16m -XX:G1ReservePercent=10 -XX:InitiatingHeapOccupancyPercent=30 -XX:SoftRefLRUPolicyMSPerMB=0"
The parameters touched above mean the following (a quick sanity check follows the list):
- -Xms: initial JVM heap size
- -Xmx: maximum JVM heap size
- -Xmn: young generation size
- -XX:MetaspaceSize: initial Metaspace size
- -XX:MaxMetaspaceSize: maximum Metaspace size
- -XX:+UseG1GC: use the G1 garbage collector
- -XX:G1HeapRegionSize: size of each G1 region
- -XX:G1ReservePercent: percentage of the heap G1 keeps in reserve
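As a rough budget for the 7 GB VM: one namesrv with a 1 GB heap plus six brokers with 512 MB heaps comes to about 4 GB, leaving headroom for the console and the OS page cache. Once the stack is up (see the docker-compose file below), actual usage can be sanity-checked with:

docker stats --no-stream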
2. Write the broker configurations
Create a directory named broker-conf under the root directory, and create six files inside it, named:
- broker0-n0.conf
- broker0-n1.conf
- broker0-n2.conf
- broker1-n0.conf
- broker1-n1.conf
- broker1-n2.conf
The root directory now looks like this:
| broker-conf
| -- broker0-n0.conf
| -- broker0-n1.conf
| -- broker0-n2.conf
| -- broker1-n0.conf
| -- broker1-n1.conf
| -- broker1-n2.conf
| rocketmq-all-4.6.0-bin-release
The listen port assigned to each broker:
broker0-n0: 10911
broker0-n1: 11911
broker0-n2: 12911
broker1-n0: 20911
broker1-n1: 21911
broker1-n2: 22911
Besides listenPort itself, a broker also opens listenPort - 2 (the VIP channel) and listenPort + 1 (the HA service), which is why the docker-compose file below maps three ports per broker (e.g. 10909/10911/10912 for broker0-n0).
Configure broker0
Edit the broker0-n0.conf file with the following content:
brokerClusterName = DefaultCluster
brokerName = broker0
brokerId = 0
deleteWhen = 04
fileReservedTime = 48
brokerRole = ASYNC_MASTER
flushDiskType = ASYNC_FLUSH
# dleger
enableDLegerCommitLog = true
dLegerGroup = broker0
dLegerPeers = n0-broker0n0:40911;n1-broker0n1:40911;n2-broker0n2:40911
dLegerSelfId = n0
sendMessageThreadPoolNums = 4
# namesrv address and port; set to the VM's IP so that external test machines can reach it
namesrvAddr=192.168.7.241:9876
# this broker's IP; set to the VM's IP because other machines need to reach it during testing
brokerIP1 = 192.168.7.241
listenPort = 10911
Note the following points about this configuration:
- brokerName: the broker's name; every node in the same broker group uses the same value.
- brokerId: all brokers are configured with brokerId 0, because any of them may be elected master. This differs from a classic master-slave setup.
- enableDLegerCommitLog: whether to enable DLedger.
- dLegerGroup: the name of the DLedger Raft group.
- dLegerPeers: the address and port of each node in the DLedger group, with entries separated by ;. Each entry has the form dLegerSelfId-address:port. Note that in RocketMQ 4.6.0 the address part must not contain a -, or the broker will throw an exception at startup.
- dLegerSelfId: the node id; it must be one of the ids listed in dLegerPeers and must be unique within the group.
Copy the contents of broker0-n0.conf into broker0-n1.conf and change these two fields to:
dLegerSelfId = n1
listenPort = 11911
Copy the contents of broker0-n0.conf into broker0-n2.conf and change these two fields to:
dLegerSelfId = n2
listenPort = 12911
The storage-path settings below also go into each of the broker0 config files; they point into the /app/data volume that docker-compose mounts per container:
# store root path
storePathRootDir=/app/data/store
# commitlog path
storePathCommitLog=/app/data/store/commitlog
# consume queue path
storePathConsumeQueue=/app/data/store/consumequeue
# index path
storePathIndex=/app/data/store/index
# checkpoint file path
storeCheckpoint=/app/data/store/checkpoint
# abort file path
abortFile=/app/data/store/abort
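Rather than copying and editing by hand, the n1/n2 files can be derived from the n0 file with sed; a sketch for broker0, using the ports from the table above:

cd broker-conf
sed -e 's/^dLegerSelfId = n0/dLegerSelfId = n1/' -e 's/^listenPort = 10911/listenPort = 11911/' broker0-n0.conf > broker0-n1.conf
sed -e 's/^dLegerSelfId = n0/dLegerSelfId = n2/' -e 's/^listenPort = 10911/listenPort = 12911/' broker0-n0.conf > broker0-n2.conf

The anchored patterns deliberately avoid touching the n0-... entries inside dLegerPeers.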
Configure broker1
Edit the broker1-n0.conf file with the following content:
brokerClusterName = DefaultCluster
brokerName = broker1
brokerId = 0
deleteWhen = 04
fileReservedTime = 48
brokerRole = ASYNC_MASTER
flushDiskType = ASYNC_FLUSH
# dleger
enableDLegerCommitLog = true
dLegerGroup = broker1
dLegerPeers = n0-broker1n0:40911;n1-broker1n1:40911;n2-broker1n2:40911
dLegerSelfId = n0
sendMessageThreadPoolNums = 4
# namesrv address and port; set to the VM's IP so that external test machines can reach it
namesrvAddr=192.168.7.241:9876
# this broker's IP; set to the VM's IP because other machines need to reach it during testing
brokerIP1 = 192.168.7.241
listenPort = 20911
Copy the contents of broker1-n0.conf into broker1-n1.conf and change these two fields to:
dLegerSelfId = n1
listenPort = 21911
Copy the contents of broker1-n0.conf into broker1-n2.conf and change these two fields to:
dLegerSelfId = n2
listenPort = 22911
As with broker0, append the same storage-path settings to each broker1 config file:
# store root path
storePathRootDir=/app/data/store
# commitlog path
storePathCommitLog=/app/data/store/commitlog
# consume queue path
storePathConsumeQueue=/app/data/store/consumequeue
# index path
storePathIndex=/app/data/store/index
# checkpoint file path
storeCheckpoint=/app/data/store/checkpoint
# abort file path
abortFile=/app/data/store/abort
3. Dockerfile for the namesrv image
Create a file named rocketmq-namesrv.dockerfile in the root directory with the following content:
FROM openjdk:8u212-jre-alpine3.9
LABEL MAINTAINER='xxxx'
LABEL MAIL='xx@xxx.xxx'
ADD rocketmq-all-4.6.0-bin-release /app/rocketmq
RUN echo "Asia/Shanghai" > /etc/timezone
EXPOSE 9876
ENTRYPOINT exec sh /app/rocketmq/bin/mqnamesrv -n 127.0.0.1:9876
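If you want to build and smoke-test the namesrv image outside of docker-compose, the usual commands apply (the tag matches the one used in the compose file below):

docker build -f rocketmq-namesrv.dockerfile -t rocketmq-namesrv/4.6.0 .
docker run --rm -p 9876:9876 rocketmq-namesrv/4.6.0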
4. Dockerfile for the broker image
Create a file named rocketmq-broker.dockerfile in the root directory with the following content:
FROM openjdk:8u212-jre-alpine3.9
LABEL MAINTAINER='Huang Junkai'
LABEL MAIL='h@xnot.me'
ADD rocketmq-all-4.6.0-bin-release /app/rocketmq
RUN echo "Asia/Shanghai" > /etc/timezone
ENTRYPOINT exec sh /app/rocketmq/bin/mqbroker -c /app/data/conf/broker.conf
VOLUME /app/data
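The broker image can be built the same way. Running a broker by hand requires mounting a config file and a store directory into the /app/data volume; note that outside the compose network the hostnames in dLegerPeers will not resolve, so a standalone run is only useful for checking that the image starts:

docker build -f rocketmq-broker.dockerfile -t rocketmq-broker/4.6.0 .
docker run --rm --name broker0n0 \
  -v $PWD/broker-conf/broker0-n0.conf:/app/data/conf/broker.conf \
  -v $PWD/store/broker0n0:/app/data/store \
  rocketmq-broker/4.6.0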
5. The docker-compose file
Create a file named docker-compose.yml in the root directory with the following content:
version: "3.5"
services:
  # run a single name server
  namesrv1:
    build:
      context: .
      dockerfile: rocketmq-namesrv.dockerfile
    image: rocketmq-namesrv/4.6.0
    container_name: namesrv1
    restart: always
    networks:
      rocketmq-dledger:
    ports:
      - 9876:9876
  # run the rocketmq console
  console:
    image: styletang/rocketmq-console-ng
    container_name: console
    depends_on:
      - namesrv1
    environment:
      - JAVA_OPTS=-Dlogging.level.root=info -Drocketmq.namesrv.addr=namesrv1:9876 -Dcom.rocketmq.sendMessageWithVIPChannel=false
    networks:
      rocketmq-dledger:
    ports:
      - 8087:8080
  # broker0
  broker0-n0:
    build:
      context: .
      dockerfile: rocketmq-broker.dockerfile
    image: rocketmq-broker/4.6.0
    depends_on:
      - namesrv1
    container_name: broker0n0
    restart: always
    networks:
      rocketmq-dledger:
    volumes:
      - ./broker-conf/broker0-n0.conf:/app/data/conf/broker.conf
      - ./store/broker0n0:/app/data/store
    ports:
      - 10909:10909
      - 10911:10911
      - 10912:10912
  broker0-n1:
    build:
      context: .
      dockerfile: rocketmq-broker.dockerfile
    image: rocketmq-broker/4.6.0
    depends_on:
      - namesrv1
    container_name: broker0n1
    restart: always
    networks:
      rocketmq-dledger:
    volumes:
      - ./broker-conf/broker0-n1.conf:/app/data/conf/broker.conf
      - ./store/broker0n1:/app/data/store
    ports:
      - 11909:11909
      - 11911:11911
      - 11912:11912
  broker0-n2:
    build:
      context: .
      dockerfile: rocketmq-broker.dockerfile
    image: rocketmq-broker/4.6.0
    depends_on:
      - namesrv1
    container_name: broker0n2
    restart: always
    networks:
      rocketmq-dledger:
    volumes:
      - ./broker-conf/broker0-n2.conf:/app/data/conf/broker.conf
      - ./store/broker0n2:/app/data/store
    ports:
      - 12909:12909
      - 12911:12911
      - 12912:12912
  # broker1
  broker1-n0:
    build:
      context: .
      dockerfile: rocketmq-broker.dockerfile
    image: rocketmq-broker/4.6.0
    depends_on:
      - namesrv1
    container_name: broker1n0
    restart: always
    networks:
      rocketmq-dledger:
    volumes:
      - ./broker-conf/broker1-n0.conf:/app/data/conf/broker.conf
      - ./store/broker1n0:/app/data/store
    ports:
      - 20909:20909
      - 20911:20911
      - 20912:20912
  broker1-n1:
    build:
      context: .
      dockerfile: rocketmq-broker.dockerfile
    image: rocketmq-broker/4.6.0
    depends_on:
      - namesrv1
    container_name: broker1n1
    restart: always
    networks:
      rocketmq-dledger:
    volumes:
      - ./broker-conf/broker1-n1.conf:/app/data/conf/broker.conf
      - ./store/broker1n1:/app/data/store
    ports:
      - 21909:21909
      - 21911:21911
      - 21912:21912
  broker1-n2:
    build:
      context: .
      dockerfile: rocketmq-broker.dockerfile
    image: rocketmq-broker/4.6.0
    depends_on:
      - namesrv1
    container_name: broker1n2
    restart: always
    networks:
      rocketmq-dledger:
    volumes:
      - ./broker-conf/broker1-n2.conf:/app/data/conf/broker.conf
      - ./store/broker1n2:/app/data/store
    ports:
      - 22909:22909
      - 22911:22911
      - 22912:22912
networks:
  rocketmq-dledger:
Notes:
The name server listens on port 9876.
The console is reachable on port 8087.
The directory tree now looks like this:
| broker-conf
| -- broker0-n0.conf
| -- broker0-n1.conf
| -- broker0-n2.conf
| -- broker1-n0.conf
| -- broker1-n1.conf
| -- broker1-n2.conf
| rocketmq-all-4.6.0-bin-release
| docker-compose.yml
| rocketmq-broker.dockerfile
| rocketmq-namesrv.dockerfile
Once everything is in place, run docker-compose up from the root directory to bring up the whole test environment.
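Besides the console, the cluster state can be checked from the host with mqadmin (run from the root directory; the name server is on the VM's IP):

sh rocketmq-all-4.6.0-bin-release/bin/mqadmin clusterList -n 192.168.7.241:9876

It should list one cluster, DefaultCluster, with the six brokers grouped under broker0 and broker1.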
Once it is running, open http://192.168.7.241:8087/#/cluster. The cluster view shows two master nodes and four slave nodes.
In this run, 192.168.7.241:11911 was the master of the broker0 group, so it can be killed with docker kill broker0n1.
A few seconds later, refresh the page and you will see that another node in the group has become the master.
Now run docker start broker0n1; refresh again after a few seconds, and broker0n1 comes back as a slave.
In testing, when both master nodes are killed at the same time, the cluster needs roughly 30 seconds before it can serve writes again.
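To watch availability during these failover tests, you can use the quickstart Producer and Consumer examples that ship with the release (as in the official RocketMQ quickstart); run them from the root directory on any machine that can reach the VM:

export NAMESRV_ADDR=192.168.7.241:9876
# send a batch of test messages
sh rocketmq-all-4.6.0-bin-release/bin/tools.sh org.apache.rocketmq.example.quickstart.Producer
# consume them back
sh rocketmq-all-4.6.0-bin-release/bin/tools.sh org.apache.rocketmq.example.quickstart.Consumer

While a group's master is down, sends routed to that group fail until a new leader is elected, which makes the roughly 30-second recovery window easy to observe.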