
1. Introduction to RocketMQ architecture

1.1 Logical deployment diagram

(The picture comes from the Internet)

1.2 Description of core components

As the figure above shows, RocketMQ has four core components: NameServer, Broker, Producer, and Consumer. Let's briefly explain each of them in turn:

NameServer : NameServer acts as a provider of routing information. A producer or consumer can look up the list of Broker IPs for each topic through the NameServer. Multiple NameServer instances form a cluster, but they are independent of each other and exchange no information.

Broker : The message relay, responsible for storing and forwarding messages. The Broker server receives and stores the messages sent by producers and serves consumers' pull requests. It also stores message-related metadata, including consumer groups, consumption progress offsets, and topic and queue information.

Producer : Responsible for producing messages; this is generally the business system. A producer sends the messages generated by the business application to the Broker server. RocketMQ provides several sending modes: synchronous, asynchronous, sequential, and one-way. Synchronous and asynchronous sending both require the Broker to return an acknowledgement; one-way sending does not.

Consumer : Responsible for consuming messages; generally a background system consumes them asynchronously. A consumer pulls messages from the Broker server and hands them to the application. From the application's perspective, two forms of consumption are provided: pull consumption and push consumption.

In addition to the four core components mentioned above, the concept of Topic will also come up many times below:

Topic : Represents a collection of messages. Each topic contains several messages, and each message belongs to exactly one topic. It is the basic unit of message subscription in RocketMQ. A topic can be sharded across multiple Broker clusters, and each topic shard contains multiple queues; see the following figure for the structure:

1.3 Design concept

RocketMQ follows a topic-based publish/subscribe model. Its core functions are message sending, message storage, and message consumption. The overall design puts simplicity and performance first, which shows mainly in the following three points:

  • NameServer replaces ZooKeeper as the registry. NameServer instances do not communicate with each other, and the cluster tolerates routing information being inconsistent for up to a few minutes, which makes it more lightweight;
  • A memory-mapping mechanism is used for efficient I/O storage and high throughput;
  • Some imperfection is tolerated by design: an ACK ensures that each message is consumed at least once, but if the ACK is lost, the message may be consumed more than once. This is allowed by design, and deduplication is left to the user.

This article focuses on NameServer. Let's take a look at how NameServer is started and how it manages routing.
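Since duplicates are left to the user, a consumer usually needs some form of deduplication. The following is only a minimal sketch, assuming each message carries a unique business key; the `IdempotentHandler` class and its method names are hypothetical and not part of RocketMQ, and a real system would typically persist the keys (for example behind a unique database constraint) rather than keep them in memory.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical helper (not a RocketMQ class): ignores redelivered messages
// by remembering the keys that were already processed.
final class IdempotentHandler {
    private final Set<String> processedKeys = ConcurrentHashMap.newKeySet();
    private int applied = 0;

    // Returns true if the message was applied, false if it was a duplicate delivery.
    synchronized boolean handle(String messageKey, String body) {
        if (!processedKeys.add(messageKey)) {
            return false; // seen before, e.g. redelivered after a lost ACK
        }
        applied++; // stand-in for the real business side effect
        return true;
    }

    synchronized int appliedCount() {
        return applied;
    }

    public static void main(String[] args) {
        IdempotentHandler handler = new IdempotentHandler();
        System.out.println(handler.handle("order-1", "pay")); // first delivery
        System.out.println(handler.handle("order-1", "pay")); // redelivery of the same message
        System.out.println(handler.appliedCount());
    }
}
```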

2. NameServer architecture design

Chapter 1 briefly introduced NameServer as a more lightweight registry that replaces ZooKeeper and provides routing information. So how does it actually manage routing information? Let's look at the picture below:

The above figure describes the core principles of NameServer for route registration, route elimination, and route discovery.

Route registration : When a Broker starts, it sends a heartbeat to every NameServer in the cluster to register itself, and then sends a heartbeat every 30 seconds to tell the NameServer that it is alive. When the NameServer receives a heartbeat packet from a Broker, it records the Broker's information and the time at which the most recent heartbeat was received.

Route elimination : NameServer keeps a long-lived connection with each Broker and receives the Broker's heartbeat every 30 seconds. Every 10 seconds it scans brokerLiveTable and checks whether the time since the last received heartbeat exceeds 120 seconds. If it does, the Broker is considered unavailable, and its information is removed from the routing tables.

Route discovery : Route discovery is not real-time. After a route changes, the NameServer does not actively push the change to clients; instead, clients (for example, producers) periodically pull the latest routing information. This design reduces the complexity of the NameServer implementation. When routing changes, the fault-tolerance mechanism on the sending side ensures the high availability of message sending (this will be covered in a later article on producer message sending and is not explained here).

High availability : NameServer achieves high availability by deploying multiple NameServer servers that do not communicate with each other. As a result, when routing information changes, the data on different NameServer servers may not be exactly the same, but the fault-tolerance mechanism of the sender ensures the high availability of message sending. This is exactly what NameServer's pursuit of simplicity and efficiency aims for.
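The elimination rule above can be sketched in plain Java. This is a simplified illustration rather than RocketMQ's implementation: the `LivenessScanSketch` class, its method names, and the use of a bare timestamp per broker are assumptions made for the example; only the 120-second threshold comes from the text.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

// Simplified stand-in for the periodic scan; not the RocketMQ implementation.
final class LivenessScanSketch {
    static final long BROKER_EXPIRED_MILLIS = 120_000L; // 120-second timeout from the text

    // brokerAddr -> timestamp (ms) of the last heartbeat received
    private final Map<String, Long> lastHeartbeat = new HashMap<>();

    void onHeartbeat(String brokerAddr, long nowMillis) {
        lastHeartbeat.put(brokerAddr, nowMillis);
    }

    // Removes brokers whose last heartbeat is older than the timeout
    // and returns the addresses that were evicted.
    List<String> scanNotActive(long nowMillis) {
        List<String> evicted = new ArrayList<>();
        Iterator<Map.Entry<String, Long>> it = lastHeartbeat.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, Long> entry = it.next();
            if (nowMillis - entry.getValue() > BROKER_EXPIRED_MILLIS) {
                String addr = entry.getKey(); // read the key before removing the entry
                it.remove();
                evicted.add(addr);
            }
        }
        return evicted;
    }

    boolean isAlive(String brokerAddr) {
        return lastHeartbeat.containsKey(brokerAddr);
    }

    public static void main(String[] args) {
        LivenessScanSketch table = new LivenessScanSketch();
        table.onHeartbeat("192.168.1.1:10000", 0L);
        table.onHeartbeat("192.168.1.2:10000", 100_000L);
        // At t = 130 s, only the first broker has been silent for more than 120 s.
        System.out.println(table.scanNotActive(130_000L));
    }
}
```

Passing the current time in as a parameter keeps the sketch deterministic; a real scan would read the clock itself on each scheduled run.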

3. Start-up process

Now that we understand the architecture design of NameServer, let's first take a look at how NameServer is started.

Since this is a source code walkthrough, let's start from the code entry point: org.apache.rocketmq.namesrv.NamesrvStartup#main(String[] args), which actually calls the main0() method.

The code is as follows:

public static NamesrvController main0(String[] args) {

    try {
        // Create the NamesrvController
        NamesrvController controller = createNamesrvController(args);
        // Initialize and start the NamesrvController
        start(controller);
        String tip = "The Name Server boot success. serializeType=" + RemotingCommand.getSerializeTypeConfigInThisServer();
        log.info(tip);
        System.out.printf("%s%n", tip);
        return controller;
    } catch (Throwable e) {
        e.printStackTrace();
        System.exit(-1);
    }

    return null;
}

Starting the NameServer through the main method is mainly divided into two major steps. First, create the NamesrvController, and then initialize and start the NamesrvController. Let's analyze them separately.

3.1 Timing diagram

Before reading the code in detail, we first have an understanding of the overall process through a sequence diagram, as shown in the following figure:

3.2 Create NamesrvController

First look at the core code, as follows:

public static NamesrvController createNamesrvController(String[] args) throws IOException, JoranException {
    // Set the remoting version property to the current version
    System.setProperty(RemotingCommand.REMOTING_VERSION_KEY, Integer.toString(MQVersion.CURRENT_VERSION));
    //PackageConflictDetect.detectFastjson();
    // Build org.apache.commons.cli.Options and add the -h and -n options: -h prints help, -n specifies namesrvAddr
    Options options = ServerUtil.buildCommandlineOptions(new Options());
    // Initialize commandLine and add the -c and -p options: -c specifies the NameServer config file path, -p prints the configuration
    commandLine = ServerUtil.parseCmdLine("mqnamesrv", args, buildCommandlineOptions(options), new PosixParser());
    if (null == commandLine) {
        System.exit(-1);
        return null;
    }
    // NameServer configuration class (business parameters)
    final NamesrvConfig namesrvConfig = new NamesrvConfig();
    // Netty server configuration class (network parameters)
    final NettyServerConfig nettyServerConfig = new NettyServerConfig();
    // Set the NameServer listen port
    nettyServerConfig.setListenPort(9876);
    // The -c option means a config file was given: read it and copy its settings into NamesrvConfig and NettyServerConfig
    if (commandLine.hasOption('c')) {
        String file = commandLine.getOptionValue('c');
        if (file != null) {
            InputStream in = new BufferedInputStream(new FileInputStream(file));
            properties = new Properties();
            properties.load(in);
            // Populate the config objects via reflection
            MixAll.properties2Object(properties, namesrvConfig);
            MixAll.properties2Object(properties, nettyServerConfig);
            // Record the config file path
            namesrvConfig.setConfigStorePath(file);

            System.out.printf("load config properties file OK, %s%n", file);
            in.close();
        }
    }
    // The -p option means "print parameters": print the properties of NamesrvConfig and NettyServerConfig.
    // Before starting the NameServer, you can run ./mqnamesrv -c configFile -p to print the configuration that would be loaded
    if (commandLine.hasOption('p')) {
        InternalLogger console = InternalLoggerFactory.getLogger(LoggerName.NAMESRV_CONSOLE_NAME);
        MixAll.printObjectProperties(console, namesrvConfig);
        MixAll.printObjectProperties(console, nettyServerConfig);
        // Printing parameters does not require starting the NameServer, so exit here
        System.exit(0);
    }
    // Parse the command-line arguments and load them into namesrvConfig
    MixAll.properties2Object(ServerUtil.commandLine2Properties(commandLine), namesrvConfig);
    // ROCKETMQ_HOME must be set
    if (null == namesrvConfig.getRocketmqHome()) {
        System.out.printf("Please set the %s variable in your environment to match the location of the RocketMQ installation%n", MixAll.ROCKETMQ_HOME_ENV);
        System.exit(-2);
    }
    // Initialize the logback logger factory; RocketMQ uses logback for logging by default
    LoggerContext lc = (LoggerContext) LoggerFactory.getILoggerFactory();
    JoranConfigurator configurator = new JoranConfigurator();
    configurator.setContext(lc);
    lc.reset();
    configurator.doConfigure(namesrvConfig.getRocketmqHome() + "/conf/logback_namesrv.xml");

    log = InternalLoggerFactory.getLogger(LoggerName.NAMESRV_LOGGER_NAME);

    MixAll.printObjectProperties(log, namesrvConfig);
    MixAll.printObjectProperties(log, nettyServerConfig);
    // Create the NamesrvController
    final NamesrvController controller = new NamesrvController(namesrvConfig, nettyServerConfig);

    // Copy the contents of the global Properties into NamesrvController.Configuration.allConfigs
    // remember all configs to prevent discard
    controller.getConfiguration().registerConfig(properties);

    return controller;
}

Through the comments on each line of code above, it can be seen that the process of creating NamesrvController is mainly divided into two steps:

Step1: Obtain the configuration through the command line. Assign values to the NamesrvConfig and NettyServerConfig classes.

Step2: Construct an instance of NamesrvController according to the configuration classes NamesrvConfig and NettyServerConfig.
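Step 1 relies on copying key/value pairs onto the config objects by reflection (the `MixAll.properties2Object` calls in the code above). The sketch below shows one way such a binder could work; it is not the RocketMQ implementation. `PropertiesBinderSketch`, `DemoConfig`, and the choice of supported types are assumptions made for illustration.

```java
import java.lang.reflect.Method;
import java.util.Properties;

// Illustrative binder (not MixAll itself): for every public setXxx(value) method,
// look up the property "xxx" and invoke the setter with the converted value.
final class PropertiesBinderSketch {

    // Hypothetical config class standing in for NamesrvConfig / NettyServerConfig.
    public static final class DemoConfig {
        private int listenPort = 9876;
        private String rocketmqHome;

        public void setListenPort(int listenPort) { this.listenPort = listenPort; }
        public void setRocketmqHome(String rocketmqHome) { this.rocketmqHome = rocketmqHome; }
        public int getListenPort() { return listenPort; }
        public String getRocketmqHome() { return rocketmqHome; }
    }

    static void bind(Properties props, Object target) {
        for (Method m : target.getClass().getMethods()) {
            String name = m.getName();
            if (!name.startsWith("set") || name.length() <= 3 || m.getParameterCount() != 1) {
                continue;
            }
            // setListenPort -> listenPort
            String key = Character.toLowerCase(name.charAt(3)) + name.substring(4);
            String raw = props.getProperty(key);
            if (raw == null) {
                continue; // keep the default when the property is absent
            }
            Class<?> type = m.getParameterTypes()[0];
            Object value;
            if (type == int.class || type == Integer.class) {
                value = Integer.valueOf(raw);
            } else if (type == long.class || type == Long.class) {
                value = Long.valueOf(raw);
            } else if (type == boolean.class || type == Boolean.class) {
                value = Boolean.valueOf(raw);
            } else if (type == String.class) {
                value = raw;
            } else {
                continue; // unsupported type in this sketch
            }
            try {
                m.invoke(target, value);
            } catch (ReflectiveOperationException e) {
                throw new IllegalStateException("cannot set " + key, e);
            }
        }
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        props.setProperty("listenPort", "9999");
        props.setProperty("rocketmqHome", "/opt/rocketmq");
        DemoConfig config = new DemoConfig();
        bind(props, config);
        System.out.println(config.getListenPort() + " " + config.getRocketmqHome());
    }
}
```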

It can be seen that NamesrvConfig and NettyServerConfig are important. These two classes are the business parameters and network parameters of NameServer. Let's take a look at the properties of these two classes:

NamesrvConfig

NettyServerConfig

Note: Apache Commons CLI is an open-source command-line parsing library. It helps developers build startup commands quickly, organize command-line parameters, and print help listings.

3.3 Initialize and start

After the NamesrvController instance is created, initialize and start the NameServer.

Initialize first, the code entry is NamesrvController#initialize.

public boolean initialize() {
    // Load the KV settings from kvConfig.json under kvConfigPath into KVConfigManager#configTable
    this.kvConfigManager.load();
    // Create a Netty server from nettyServerConfig.
    // brokerHousekeepingService is instantiated in the NamesrvController constructor. It handles Broker connection events,
    // implements ChannelEventListener, and is mainly used to maintain RouteInfoManager's brokerLiveTable
    this.remotingServer = new NettyRemotingServer(this.nettyServerConfig, this.brokerHousekeepingService);
    // Initialize the thread pool that processes Netty network requests; the default thread count is 8
    this.remotingExecutor =
        Executors.newFixedThreadPool(nettyServerConfig.getServerWorkerThreads(), new ThreadFactoryImpl("RemotingExecutorThread_"));
    // Register the request processor: ClusterTestRequestProcessor if clusterTest is enabled, otherwise DefaultRequestProcessor
    this.registerProcessor();
    // Schedule the liveness scan: after a 5-second delay, traverse RouteInfoManager#brokerLiveTable every 10 seconds to find inactive brokers
    this.scheduledExecutorService.scheduleAtFixedRate(new Runnable() {

        @Override
        public void run() {
            NamesrvController.this.routeInfoManager.scanNotActiveBroker();
        }
    }, 5, 10, TimeUnit.SECONDS);
    // Schedule KV config printing: after a 1-minute delay, print the kvConfig settings every 10 minutes
    this.scheduledExecutorService.scheduleAtFixedRate(new Runnable() {

        @Override
        public void run() {
            NamesrvController.this.kvConfigManager.printAllPeriodically();
        }
    }, 1, 10, TimeUnit.MINUTES);
    // RocketMQ can enable TLS to secure data in transit; if it is enabled, register a listener that reloads the SslContext
    if (TlsSystemConfig.tlsMode != TlsMode.DISABLED) {
        // Register a listener to reload SslContext
        try {
            fileWatchService = new FileWatchService(
                new String[] {
                    TlsSystemConfig.tlsServerCertPath,
                    TlsSystemConfig.tlsServerKeyPath,
                    TlsSystemConfig.tlsServerTrustCertPath
                },
                new FileWatchService.Listener() {
                    boolean certChanged, keyChanged = false;
                    @Override
                    public void onChanged(String path) {
                        if (path.equals(TlsSystemConfig.tlsServerTrustCertPath)) {
                            log.info("The trust certificate changed, reload the ssl context");
                            reloadServerSslContext();
                        }
                        if (path.equals(TlsSystemConfig.tlsServerCertPath)) {
                            certChanged = true;
                        }
                        if (path.equals(TlsSystemConfig.tlsServerKeyPath)) {
                            keyChanged = true;
                        }
                        if (certChanged && keyChanged) {
                            log.info("The certificate and private key changed, reload the ssl context");
                            certChanged = keyChanged = false;
                            reloadServerSslContext();
                        }
                    }
                    private void reloadServerSslContext() {
                        ((NettyRemotingServer) remotingServer).loadSslContext();
                    }
                });
        } catch (Exception e) {
            log.warn("FileWatchService created error, can't load the certificate dynamically");
        }
    }

    return true;
}

The code above is the initialization process of NameServer. As the comments show, there are five main steps:

  • Step1: Load the KV configuration and write it into the configTable attribute of KVConfigManager;
  • Step2: Initialize the Netty server;
  • Step3: Initialize the thread pool that processes Netty network requests;
  • Step4: Schedule the liveness scan: 5 seconds after start-up, check Broker liveness every 10 seconds;
  • Step5: Schedule KV configuration printing: 1 minute after start-up, print the KV configuration every 10 minutes.

The RocketMQ development team also uses a common programming technique here: a JVM shutdown hook that shuts the NameServer down gracefully, so that the shutdown operation runs before the JVM process exits.

Runtime.getRuntime().addShutdownHook(new ShutdownHookThread(log, new Callable<Void>() {
    @Override
    public Void call() throws Exception {
        controller.shutdown();
        return null;
    }
}));
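The same idea can be tried with the plain JDK. The sketch below is not RocketMQ's `ShutdownHookThread`; `GracefulShutdownSketch` and its methods are hypothetical, and the `AtomicBoolean` guard is just one way to make a repeated shutdown call harmless.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Plain-JDK illustration of a graceful-shutdown hook; class and method names are hypothetical.
final class GracefulShutdownSketch {
    private final AtomicBoolean stopped = new AtomicBoolean(false);

    // Stand-in for controller.shutdown(). Returns true only for the call that actually performs the shutdown.
    boolean shutdown() {
        if (!stopped.compareAndSet(false, true)) {
            return false; // already shut down
        }
        System.out.println("releasing resources");
        return true;
    }

    // Registers a thread that the JVM runs when it begins to exit.
    Thread installHook() {
        Thread hook = new Thread(this::shutdown, "ShutdownHookSketch");
        Runtime.getRuntime().addShutdownHook(hook);
        return hook;
    }

    public static void main(String[] args) {
        GracefulShutdownSketch server = new GracefulShutdownSketch();
        server.installHook();
        System.out.println("running; the hook fires when the JVM exits");
    }
}
```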

Finally, the start method is executed to start the NameServer. The code is relatively simple: it starts the Netty server created during initialization. The remotingServer.start() method is not explained in detail here; it requires familiarity with Netty, which is not the focus of this article. Interested readers can download the source code and read it.

public void start() throws Exception {
    // Start the Netty server
    this.remotingServer.start();
    // If TLS is enabled, start the file watch service
    if (this.fileWatchService != null) {
        this.fileWatchService.start();
    }
}

4. Routing management

At the beginning of Chapter 2, we learned that NameServer, as a lightweight registry, mainly provides topic routing information to message producers and consumers and manages this routing information and the Broker nodes, including route registration, route elimination, and route discovery.

This chapter will analyze specifically how NameServer manages routing information from the perspective of source code. The core code is mainly implemented in org.apache.rocketmq.namesrv.routeinfo.RouteInfoManager.

4.1 Routing meta-information

Before understanding routing information management, we first need to understand what routing meta-information is stored in NameServer and what the data structure is.

Looking at the code, we can see that routing meta-information is maintained mainly through five attributes, as follows:

private final HashMap<String/* topic */, List<QueueData>> topicQueueTable;
private final HashMap<String/* brokerName */, BrokerData> brokerAddrTable;
private final HashMap<String/* clusterName */, Set<String/* brokerName */>> clusterAddrTable;
private final HashMap<String/* brokerAddr */, BrokerLiveInfo> brokerLiveTable;
private final HashMap<String/* brokerAddr */, List<String>/* Filter Server */> filterServerTable;

We expand on these 5 attributes in turn.

4.1.1 TopicQueueTable

Description: Routing information for the message queues of each topic. When a message is sent, load balancing is performed according to this routing table.

Data structure: HashMap structure. The key is the topic name, and the value is a list of QueueData objects. As mentioned in the first chapter, a topic has multiple queues. The data structure of QueueData is as follows:

Example of data structure:

topicQueueTable:{
    "topic1": [
        {
            "brokerName": "broker-a",
            "readQueueNums":4,
            "writeQueueNums":4,
            "perm":6,
            "topicSynFlag":0
        },
        {
            "brokerName": "broker-b",
            "readQueueNums":4,
            "writeQueueNums":4,
            "perm":6,
            "topicSynFlag":0
        }
    ]
}
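To make the shape of this table concrete, here is a small sketch that models it with plain collections and expands a topic's route into the individual queues a sender could choose from. `TopicRouteSketch` and `QueueInfo` are simplified, hypothetical stand-ins (not the RocketMQ classes), and the "brokerName:queueId" string form is just a convenience for the example.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Simplified model of topicQueueTable; QueueInfo keeps only some of the fields shown in the example above.
final class TopicRouteSketch {

    static final class QueueInfo {
        final String brokerName;
        final int readQueueNums;
        final int writeQueueNums;

        QueueInfo(String brokerName, int readQueueNums, int writeQueueNums) {
            this.brokerName = brokerName;
            this.readQueueNums = readQueueNums;
            this.writeQueueNums = writeQueueNums;
        }
    }

    // topic -> queue information, one entry per broker that hosts the topic
    private final Map<String, List<QueueInfo>> topicQueueTable = new HashMap<>();

    void addQueueInfo(String topic, QueueInfo info) {
        topicQueueTable.computeIfAbsent(topic, t -> new ArrayList<>()).add(info);
    }

    // Expands the route of a topic into "brokerName:queueId" entries,
    // i.e. the candidates a sender could balance its messages across.
    List<String> writableQueues(String topic) {
        List<String> queues = new ArrayList<>();
        for (QueueInfo info : topicQueueTable.getOrDefault(topic, Collections.<QueueInfo>emptyList())) {
            for (int queueId = 0; queueId < info.writeQueueNums; queueId++) {
                queues.add(info.brokerName + ":" + queueId);
            }
        }
        return queues;
    }

    public static void main(String[] args) {
        TopicRouteSketch routes = new TopicRouteSketch();
        routes.addQueueInfo("topic1", new QueueInfo("broker-a", 4, 4));
        routes.addQueueInfo("topic1", new QueueInfo("broker-b", 4, 4));
        System.out.println(routes.writableQueues("topic1"));
    }
}
```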

4.1.2 BrokerAddrTable

Description: Basic Broker information, including the BrokerName, the cluster name, and the master and slave Broker addresses.

Data structure: HashMap structure. The key is the BrokerName, and the value is an object of type BrokerData. The data structure of BrokerData is as follows (it is easier to understand together with the Broker master-slave structure logic diagram below):

Broker master-slave structure logic diagram:

Example of data structure:

brokerAddrTable:{
    "broker-a": {
        "cluster": "c1",
        "brokerName": "broker-a",
        "brokerAddrs": {
            0: "192.168.1.1:10000",
            1: "192.168.1.2:10000"
        }
    },
    "broker-b": {
        "cluster": "c1",
        "brokerName": "broker-b",
        "brokerAddrs": {
            0: "192.168.1.3:10000",
            1: "192.168.1.4:10000"
        }
    }
}
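The example above can be modelled as follows. This is a hedged sketch, not the RocketMQ classes: `BrokerAddrSketch` and `BrokerInfo` are hypothetical names, and the only fact taken from the source is that brokerId 0 identifies the master (the registration code compares against `MixAll.MASTER_ID`).

```java
import java.util.HashMap;
import java.util.Map;

// Simplified model of brokerAddrTable; BrokerInfo is a stand-in for BrokerData.
final class BrokerAddrSketch {
    static final long MASTER_ID = 0L; // brokerId 0 denotes the master

    static final class BrokerInfo {
        final String cluster;
        final String brokerName;
        final Map<Long, String> brokerAddrs = new HashMap<>(); // brokerId -> address

        BrokerInfo(String cluster, String brokerName) {
            this.cluster = cluster;
            this.brokerName = brokerName;
        }
    }

    // brokerName -> master/slave addresses of that broker group
    private final Map<String, BrokerInfo> brokerAddrTable = new HashMap<>();

    void register(String cluster, String brokerName, long brokerId, String addr) {
        brokerAddrTable
            .computeIfAbsent(brokerName, name -> new BrokerInfo(cluster, name))
            .brokerAddrs.put(brokerId, addr);
    }

    // Returns the master address of a broker group, or null if it is unknown.
    String masterAddr(String brokerName) {
        BrokerInfo info = brokerAddrTable.get(brokerName);
        return info == null ? null : info.brokerAddrs.get(MASTER_ID);
    }

    public static void main(String[] args) {
        BrokerAddrSketch table = new BrokerAddrSketch();
        table.register("c1", "broker-a", 0L, "192.168.1.1:10000");
        table.register("c1", "broker-a", 1L, "192.168.1.2:10000");
        System.out.println(table.masterAddr("broker-a"));
    }
}
```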

4.1.3 ClusterAddrTable

Description: Broker cluster information, stores the names of all Brokers in the cluster.

Data structure: HashMap structure, key is ClusterName, value is Set structure storing BrokerName.

Example of data structure:

clusterAddrTable:{
    "c1": ["broker-a","broker-b"]
}

4.1.4 BrokerLiveTable

Description: Broker status information. NameServer replaces this information every time it receives a heartbeat packet.

Data structure: HashMap structure. The key is the address of the Broker, and the value is a BrokerLiveInfo object holding the Broker's information. The data structure of BrokerLiveInfo is as follows:

Example of data structure:

brokerLiveTable:{
    "192.168.1.1:10000": {
            "lastUpdateTimestamp": 1518270318980,
            "dataVersion":versionObj1,
            "channel":channelObj,
            "haServerAddr":""
    },
    "192.168.1.2:10000": {
            "lastUpdateTimestamp": 1518270318980,
            "dataVersion":versionObj1,
            "channel":channelObj,
            "haServerAddr":"192.168.1.1:10000"
     },
    "192.168.1.3:10000": {
            "lastUpdateTimestamp": 1518270318980,
            "dataVersion":versionObj1,
            "channel":channelObj,
            "haServerAddr":""
     },
    "192.168.1.4:10000": {
            "lastUpdateTimestamp": 1518270318980,
            "dataVersion":versionObj1,
            "channel":channelObj,
            "haServerAddr":"192.168.1.3:10000"
     }
}

4.1.5 filterServerTable

Description: The list of FilterServers (message filtering servers) on each Broker. It will be covered later when the Consumer is introduced. The consumer pulls data through the FilterServer, and the consumer registers with the Broker.

Data structure: HashMap structure. The key is the Broker address, and the value is a List that records the FilterServer addresses.

4.2 Route registration

Route registration is achieved through the heartbeat function between Broker and NameServer. It is mainly divided into two steps:

Step1:

When the Broker starts, it sends a heartbeat packet to all NameServers in the cluster, and then sends one again at a fixed interval (30 seconds by default; the interval is constrained to between 10 and 60 seconds).

Step2:

NameServer receives the heartbeat packet and updates topicQueueTable, brokerAddrTable, brokerLiveTable, clusterAddrTable, and filterServerTable.

We analyze these two steps separately.

4.2.1 Broker sends heartbeat packets

The core logic for sending heartbeat packets lives in the Broker start-up logic. The code entry is org.apache.rocketmq.broker.BrokerController#start. This article focuses on how heartbeat packets are sent, so only that core code is listed. The code is as follows:

1) A scheduled task is created to register the Broker. It first runs 10 seconds after start-up and then runs at a fixed interval. The interval comes from BrokerConfig.getRegisterNameServerPeriod(), whose default value is 30 seconds, and is clamped to the range of 10 to 60 seconds.

this.scheduledExecutorService.scheduleAtFixedRate(new Runnable() {

    @Override
    public void run() {
        try {
            BrokerController.this.registerBrokerAll(true, false, brokerConfig.isForceRegister());
        } catch (Throwable e) {
            log.error("registerBrokerAll Exception", e);
        }
    }
}, 1000 * 10, Math.max(10000, Math.min(brokerConfig.getRegisterNameServerPeriod(), 60000)), TimeUnit.MILLISECONDS);
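The clamping expression in the last line is easy to misread, so here it is in isolation. The helper name `effectivePeriodMillis` is made up for this sketch; the arithmetic is the same `Math.max(10000, Math.min(period, 60000))` used above.

```java
// Isolates the interval clamp from the scheduling call above; the class and method names are hypothetical.
final class HeartbeatPeriodSketch {

    // Limits the configured period to the range [10 s, 60 s].
    static long effectivePeriodMillis(long configuredMillis) {
        return Math.max(10_000L, Math.min(configuredMillis, 60_000L));
    }

    public static void main(String[] args) {
        System.out.println(effectivePeriodMillis(30_000L));  // within range: unchanged
        System.out.println(effectivePeriodMillis(1_000L));   // too small: raised to the lower bound
        System.out.println(effectivePeriodMillis(120_000L)); // too large: capped at the upper bound
    }
}
```

So a configured period below 10 seconds or above 60 seconds is silently replaced by the nearest bound rather than rejected.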

2) After the Topic configuration and version number have been wrapped, the actual route registration is performed (note: wrapping the Topic configuration is not the focus of this article; it will be explained when the Broker source code is introduced). The actual route registration is implemented in org.apache.rocketmq.broker.out.BrokerOuterAPI#registerBrokerAll. The core code is as follows:

public List<RegisterBrokerResult> registerBrokerAll(
    final String clusterName,
    final String brokerAddr,
    final String brokerName,
    final long brokerId,
    final String haServerAddr,
    final TopicConfigSerializeWrapper topicConfigWrapper,
    final List<String> filterServerList,
    final boolean oneway,
    final int timeoutMills,
    final boolean compressed) {

    final List<RegisterBrokerResult> registerBrokerResultList = new CopyOnWriteArrayList<>();
    // Get the list of NameServer addresses
    List<String> nameServerAddressList = this.remotingClient.getNameServerAddressList();
    if (nameServerAddressList != null && nameServerAddressList.size() > 0) {
        /**
         * Build the request header (start).
         * The header mainly carries broker-related information.
         **/
        final RegisterBrokerRequestHeader requestHeader = new RegisterBrokerRequestHeader();
        requestHeader.setBrokerAddr(brokerAddr);
        requestHeader.setBrokerId(brokerId);
        requestHeader.setBrokerName(brokerName);
        requestHeader.setClusterName(clusterName);
        requestHeader.setHaServerAddr(haServerAddr);
        requestHeader.setCompressed(compressed);
        // Build the request body, which carries the topic configuration and filterServerList
        RegisterBrokerBody requestBody = new RegisterBrokerBody();
        requestBody.setTopicConfigSerializeWrapper(topicConfigWrapper);
        requestBody.setFilterServerList(filterServerList);
        final byte[] body = requestBody.encode(compressed);
        final int bodyCrc32 = UtilAll.crc32(body);
        requestHeader.setBodyCrc32(bodyCrc32);
        /**
         * Build the request header (end).
         **/
        // Register with every NameServer in parallel
        final CountDownLatch countDownLatch = new CountDownLatch(nameServerAddressList.size());
        for (final String namesrvAddr : nameServerAddressList) {
            brokerOuterExecutor.execute(new Runnable() {
                @Override
                public void run() {
                    try {
                        // The method that actually performs the registration
                        RegisterBrokerResult result = registerBroker(namesrvAddr,oneway, timeoutMills,requestHeader,body);
                        if (result != null) {
                            // Collect the result returned by the NameServer
                            registerBrokerResultList.add(result);
                        }

                        log.info("register broker[{}]to name server {} OK", brokerId, namesrvAddr);
                    } catch (Exception e) {
                        log.warn("registerBroker Exception, {}", namesrvAddr, e);
                    } finally {
                        countDownLatch.countDown();
                    }
                }
            });
        }

        try {
            countDownLatch.await(timeoutMills, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
        }
    }

    return registerBrokerResultList;
}

The code above is relatively simple. It first builds the request header and the request body, and then registers with every NameServer in parallel using multiple threads.

The request header type is RegisterBrokerRequestHeader. As the setters in the code above show, it mainly includes the following fields: brokerAddr, brokerId, brokerName, clusterName, haServerAddr, compressed, and bodyCrc32.

The request body type is RegisterBrokerBody. As the setters in the code above show, it mainly includes the following fields: topicConfigSerializeWrapper and filterServerList.

3) The actual route registration is performed by the registerBroker method. The core code is as follows:
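The fan-out pattern used here (one task per NameServer, a CountDownLatch to wait for all of them, and a thread-safe list for the results) can be reproduced with the JDK alone. The sketch below is an illustration under assumptions: `FanOutRegisterSketch`, the `registerOne` callback, and the pool size are hypothetical, and unlike the excerpt it restores the interrupt flag instead of swallowing the InterruptedException.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.function.Function;

// JDK-only illustration of the parallel registration pattern; names are hypothetical.
final class FanOutRegisterSketch {

    // Calls registerOne for every address in parallel and waits up to timeoutMillis for all calls to finish.
    // A failure for one address is logged and does not affect the others.
    static List<String> registerAll(List<String> addrs, Function<String, String> registerOne, long timeoutMillis) {
        List<String> results = new CopyOnWriteArrayList<>();
        if (addrs.isEmpty()) {
            return results;
        }
        ExecutorService pool = Executors.newFixedThreadPool(Math.min(addrs.size(), 4));
        CountDownLatch latch = new CountDownLatch(addrs.size());
        try {
            for (String addr : addrs) {
                pool.execute(() -> {
                    try {
                        String result = registerOne.apply(addr);
                        if (result != null) {
                            results.add(result);
                        }
                    } catch (RuntimeException e) {
                        System.err.println("register failed for " + addr + ": " + e.getMessage());
                    } finally {
                        latch.countDown(); // always count down, even on failure
                    }
                });
            }
            latch.await(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for the caller
        } finally {
            pool.shutdown();
        }
        return results;
    }

    public static void main(String[] args) {
        List<String> addrs = java.util.Arrays.asList("ns1:9876", "ns2:9876");
        System.out.println(registerAll(addrs, addr -> "ok@" + addr, 2_000L).size());
    }
}
```

Counting down in `finally` is what keeps one unreachable NameServer from blocking the whole registration round until the timeout.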

private RegisterBrokerResult registerBroker(
    final String namesrvAddr,
    final boolean oneway,
    final int timeoutMills,
    final RegisterBrokerRequestHeader requestHeader,
    final byte[] body
) throws RemotingCommandException, MQBrokerException, RemotingConnectException, RemotingSendRequestException, RemotingTimeoutException,
InterruptedException {
    // Create the request command. Note RequestCode.REGISTER_BROKER: the network processor on the NameServer side dispatches on the request code
    RemotingCommand request = RemotingCommand.createRequestCommand(RequestCode.REGISTER_BROKER, requestHeader);
    request.setBody(body);
    // Send over the network via Netty
    if (oneway) {
        // One-way call: no response is awaited, so no NameServer result is returned
        try {
            this.remotingClient.invokeOneway(namesrvAddr, request, timeoutMills);
        } catch (RemotingTooMuchRequestException e) {
            // Ignore
        }
        return null;
    }
    // Synchronous call: register with the NameServer and wait for its response
    RemotingCommand response = this.remotingClient.invokeSync(namesrvAddr, request, timeoutMills);
    assert response != null;
    switch (response.getCode()) {
        case ResponseCode.SUCCESS: {
            // Decode the response header
            RegisterBrokerResponseHeader responseHeader =
                (RegisterBrokerResponseHeader) response.decodeCommandCustomHeader(RegisterBrokerResponseHeader.class);
            // Repackage the result, updating masterAddr and haServerAddr
            RegisterBrokerResult result = new RegisterBrokerResult();
            result.setMasterAddr(responseHeader.getMasterAddr());
            result.setHaServerAddr(responseHeader.getHaServerAddr());
            if (response.getBody() != null) {
                result.setKvTable(KVTable.decode(response.getBody(), KVTable.class));
            }
            return result;
        }
        default:
            break;
    }

    throw new MQBrokerException(response.getCode(), response.getRemark(), requestHeader == null ? null : requestHeader.getBrokerAddr());
}

Network transmission between the Broker and NameServer is carried out through Netty. When the Broker initiates registration with NameServer, it puts the request code RequestCode.REGISTER_BROKER into the request. This is how requests are routed over the network: every RocketMQ request defines a requestCode, and the network processor on the server side runs the corresponding business logic for each requestCode.
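The request-code convention amounts to a lookup from an integer code to a handler. The sketch below illustrates that idea with a map instead of a switch; it is not RocketMQ code. `RequestDispatchSketch`, the placeholder code value, and the "not supported" reply are all assumptions made for the example (the real numeric codes are defined in RequestCode).

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Illustrative dispatcher keyed by request code; all names and values here are hypothetical.
final class RequestDispatchSketch {
    static final int REGISTER_BROKER = 1; // placeholder value, not the real RequestCode constant

    private final Map<Integer, Function<String, String>> handlers = new HashMap<>();

    void register(int requestCode, Function<String, String> handler) {
        handlers.put(requestCode, handler);
    }

    // Routes the request body to the handler registered for its code.
    String process(int requestCode, String body) {
        Function<String, String> handler = handlers.get(requestCode);
        if (handler == null) {
            // A choice made for this sketch: report unknown codes explicitly.
            return "REQUEST_CODE_NOT_SUPPORTED: " + requestCode;
        }
        return handler.apply(body);
    }

    public static void main(String[] args) {
        RequestDispatchSketch dispatcher = new RequestDispatchSketch();
        dispatcher.register(REGISTER_BROKER, body -> "registered:" + body);
        System.out.println(dispatcher.process(REGISTER_BROKER, "broker-a"));
        System.out.println(dispatcher.process(999, "unknown"));
    }
}
```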

4.2.2 NameServer handles heartbeat packets

After the Broker sends out the heartbeat packet of the route registration, the NameServer will process it according to the requestCode in the heartbeat packet. The default network processor of NameServer is DefaultRequestProcessor, the specific code is as follows:

public RemotingCommand processRequest(ChannelHandlerContext ctx,
        RemotingCommand request) throws RemotingCommandException {
    if (ctx != null) {
        log.debug("receive request, {} {} {}",
                  request.getCode(),
                  RemotingHelper.parseChannelRemoteAddr(ctx.channel()),
                  request);
    }
    switch (request.getCode()) {
        ......
        // If the request code is RequestCode.REGISTER_BROKER, register the broker
        case RequestCode.REGISTER_BROKER:
            Version brokerVersion = MQVersion.value2Version(request.getVersion());
            if (brokerVersion.ordinal() >= MQVersion.Version.V3_0_11.ordinal()) {
                return this.registerBrokerWithFilterServer(ctx, request);
            } else {
                return this.registerBroker(ctx, request);
            }
        ......
        default:
            break;
    }
    return null;
}

The processor checks the requestCode. If it is RequestCode.REGISTER_BROKER, the business logic to run is Broker registration, and a different method is chosen depending on the Broker version. Taking V3_0_11 and above as an example, registration through the registerBrokerWithFilterServer method consists of three main steps:

Step1

Parse the requestHeader and verify the checksum (based on crc32) to determine whether the data is intact;

Step2

Parse the Topic information;

Step3

Call RouteInfoManager#registerBroker to perform Broker registration;
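The checksum verification in Step1 can be illustrated with the JDK's `java.util.zip.CRC32`. This is a sketch of the general technique only: `BodyChecksumSketch` and its methods are hypothetical, and it keeps the full 32-bit value as a `long`, which may differ from how RocketMQ's own `UtilAll.crc32` represents the result.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

// Sender computes a CRC32 over the body; receiver recomputes it and compares. Names are hypothetical.
final class BodyChecksumSketch {

    static long crc32(byte[] body) {
        CRC32 crc = new CRC32();
        crc.update(body, 0, body.length);
        return crc.getValue(); // unsigned 32-bit value held in a long
    }

    // Returns true if the received body matches the checksum carried in the header.
    static boolean verify(byte[] receivedBody, long expectedCrc) {
        return crc32(receivedBody) == expectedCrc;
    }

    public static void main(String[] args) {
        byte[] body = "topic-config".getBytes(StandardCharsets.UTF_8);
        long headerCrc = crc32(body); // what the sender would put in the header

        byte[] corrupted = body.clone();
        corrupted[0] ^= 0x01; // flip one bit to simulate corruption in transit

        System.out.println(verify(body, headerCrc));
        System.out.println(verify(corrupted, headerCrc));
    }
}
```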

The core registration logic is implemented by RouteInfoManager#registerBroker. The core code is as follows:

public RegisterBrokerResult registerBroker(
    final String clusterName,
    final String brokerAddr,
    final String brokerName,
    final long brokerId,
    final String haServerAddr,
    final TopicConfigSerializeWrapper topicConfigWrapper,
    final List<String> filterServerList,
    final Channel channel) {
    RegisterBrokerResult result = new RegisterBrokerResult();
    try {
        try {
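            // Take the write lock to prevent concurrent writes to the routing tables in RouteInfoManager.
            this.lock.writeLock().lockInterruptibly();
            // Look up the set of broker names for clusterName in clusterAddrTable
            Set<String> brokerNames = this.clusterAddrTable.get(clusterName);
            // If none is found, the broker's cluster has not been recorded yet: create the set, then add brokerName to it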
            if (null == brokerNames) {
                brokerNames = new HashSet<String>();
                this.clusterAddrTable.put(clusterName, brokerNames);
            }
            brokerNames.add(brokerName);
      
            boolean registerFirst = false;
            // Try to get the BrokerData for brokerName from brokerAddrTable
            BrokerData brokerData = this.brokerAddrTable.get(brokerName);
            if (null == brokerData) {
                // Not found: create a new BrokerData, put it into brokerAddrTable, and set registerFirst to true
                registerFirst = true;
                brokerData = new BrokerData(clusterName, brokerName, new HashMap<Long, String>());
                this.brokerAddrTable.put(brokerName, brokerData);
            }
            // Update brokerAddrs in brokerData.
            Map<Long, String> brokerAddrsMap = brokerData.getBrokerAddrs();
            // The master may go down and a slave may be promoted to master. Its brokerId then becomes 0, and the old brokerAddr entry must be removed.
            //Switch slave to master: first remove <1, IP:PORT> in namesrv, then add <0, IP:PORT>
            //The same IP:PORT must only have one record in brokerAddrTable
            Iterator<Entry<Long, String>> it = brokerAddrsMap.entrySet().iterator();
            while (it.hasNext()) {
                Entry<Long, String> item = it.next();
                if (null != brokerAddr && brokerAddr.equals(item.getValue()) && brokerId != item.getKey()) {
                    it.remove();
                }
            }
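            // Update brokerAddrs; the returned oldAddr tells whether this broker is registering for the first time.
            String oldAddr = brokerData.getBrokerAddrs().put(brokerId, brokerAddr);
            registerFirst = registerFirst || (null == oldAddr);

            // If the Broker is the master and its Topic configuration changed or this is its first registration, create or update the Topic routing metadata in topicQueueTable.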
            if (null != topicConfigWrapper
                && MixAll.MASTER_ID == brokerId) {
                if (this.isBrokerTopicConfigChanged(brokerAddr, topicConfigWrapper.getDataVersion())
                    || registerFirst) {
                    ConcurrentMap<String, TopicConfig> tcTable =
                        topicConfigWrapper.getTopicConfigTable();
                    if (tcTable != null) {
                        for (Map.Entry<String, TopicConfig> entry : tcTable.entrySet()) {
                            // Create or update the Topic routing metadata.
                            this.createAndUpdateQueueData(brokerName, entry.getValue());
                        }
                    }
                }
            }
            // Update BrokerLiveInfo, which is the basis for route removal.
            BrokerLiveInfo prevBrokerLiveInfo = this.brokerLiveTable.put(brokerAddr,
                                                                         new BrokerLiveInfo(
                                                                             System.currentTimeMillis(),
                                                                             topicConfigWrapper.getDataVersion(),
                                                                             channel,
                                                                             haServerAddr));
            if (null == prevBrokerLiveInfo) {
                log.info("new broker registered, {} HAServer: {}", brokerAddr, haServerAddr);
            }
            // Register the Broker's filterServer address list.
            if (filterServerList != null) {
                if (filterServerList.isEmpty()) {
                    this.filterServerTable.remove(brokerAddr);
                } else {
                    this.filterServerTable.put(brokerAddr, filterServerList);
                }
            }
            // If this Broker is a slave, look up its master and set the corresponding masterAddr in the result.
            if (MixAll.MASTER_ID != brokerId) {
                String masterAddr = brokerData.getBrokerAddrs().get(MixAll.MASTER_ID);
                if (masterAddr != null) {
                    BrokerLiveInfo brokerLiveInfo = this.brokerLiveTable.get(masterAddr);
                    if (brokerLiveInfo != null) {
                        result.setHaServerAddr(brokerLiveInfo.getHaServerAddr());
                        result.setMasterAddr(masterAddr);
                    }
                }
            }
        } finally {
            this.lock.writeLock().unlock();
        }
    } catch (Exception e) {
        log.error("registerBroker Exception", e);
    }

    return result;
}

From the source code above, Broker registration can be broken down into 7 steps:

  • Step1: Acquire the write lock to prevent concurrent writes to the routing tables in RouteInfoManager;
  • Step2: Check whether the cluster to which the Broker belongs already exists in clusterAddrTable; if not, create it. Then add the Broker name to the cluster's Broker set;
  • Step3: Maintain BrokerData: create it if absent, remove any stale entry that maps the same address to a different brokerId, and record the current address;
  • Step4: If the Broker is the Master, and its Topic configuration has changed or this is its first registration, create or update the Topic routing metadata and fill topicQueueTable;
  • Step5: Update BrokerLiveInfo;
  • Step6: Register the Broker's filterServer address list;
  • Step7: If this Broker is a slave node, look up the Master's information, set the corresponding masterAddr attribute, and return it to the Broker.
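Step1 and Step3 are the subtle parts, so here is a heavily simplified sketch of them. It is not the RocketMQ implementation: BrokerAddrSketch and its methods are hypothetical, the table is reduced to brokerName -> (brokerId -> address), and brokerId 0 is assumed to denote the master, as in the listing above.

```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical, simplified stand-in for the brokerAddrTable handling in registerBroker.
class BrokerAddrSketch {
    private final ReadWriteLock lock = new ReentrantReadWriteLock();
    // brokerName -> (brokerId -> address); brokerId 0 denotes the master.
    private final Map<String, Map<Long, String>> brokerAddrTable = new HashMap<>();

    // Returns true when this (brokerName, brokerId) pair is registered for the first time.
    boolean register(String brokerName, long brokerId, String brokerAddr) {
        lock.writeLock().lock();
        try {
            Map<Long, String> addrs = brokerAddrTable.computeIfAbsent(brokerName, k -> new HashMap<>());
            // Drop any stale entry that maps a different brokerId to the same address,
            // which happens when a slave is promoted to master.
            Iterator<Map.Entry<Long, String>> it = addrs.entrySet().iterator();
            while (it.hasNext()) {
                Map.Entry<Long, String> e = it.next();
                if (brokerAddr.equals(e.getValue()) && brokerId != e.getKey()) {
                    it.remove();
                }
            }
            return addrs.put(brokerId, brokerAddr) == null;
        } finally {
            lock.writeLock().unlock();
        }
    }

    // Readers take the shared lock and receive a copy, so they never observe a half-applied update.
    Map<Long, String> addressesOf(String brokerName) {
        lock.readLock().lock();
        try {
            Map<Long, String> addrs = brokerAddrTable.get(brokerName);
            return addrs == null ? new HashMap<>() : new HashMap<>(addrs);
        } finally {
            lock.readLock().unlock();
        }
    }
}
```

Registering the address of a former slave under brokerId 0 leaves exactly one entry for that address, which mirrors the rule in the source comment that the same IP:PORT must have only one record.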

4.3 Route removal

4.3.1 Trigger conditions

There are two main trigger conditions for route removal:

NameServer scans brokerLiveTable every 10s. If no heartbeat packet has been received from a Broker for 120s, NameServer removes that Broker and closes the socket connection;

Route removal is also triggered when a Broker shuts down normally.
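The timeout-based trigger can be sketched as follows. This is an illustration under stated assumptions, not the NameServer code: LivenessScanSketch and its members are hypothetical names, the live table is reduced to address -> last heartbeat timestamp, and the current time is passed in explicitly so the behavior is easy to reason about.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch of heartbeat-timeout scanning; names and structure are illustrative only.
class LivenessScanSketch {
    static final long BROKER_EXPIRED_MILLIS = 120_000L; // 120s without a heartbeat

    // broker address -> timestamp of the last heartbeat received
    private final Map<String, Long> lastHeartbeat = new ConcurrentHashMap<>();

    void onHeartbeat(String brokerAddr, long nowMillis) {
        lastHeartbeat.put(brokerAddr, nowMillis);
    }

    // Meant to be called periodically (every 10s in the article); returns the evicted addresses.
    List<String> scanNotActive(long nowMillis) {
        List<String> evicted = new ArrayList<>();
        Iterator<Map.Entry<String, Long>> it = lastHeartbeat.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, Long> e = it.next();
            if (nowMillis - e.getValue() > BROKER_EXPIRED_MILLIS) {
                evicted.add(e.getKey());
                it.remove();
            }
        }
        return evicted;
    }
}
```

In a running service, a scan like this would typically be driven by a ScheduledExecutorService, and each evicted address would then go through the route removal logic.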

4.3.2 Source code analysis

Both triggers end up in the same removal logic, which is unified in RouteInfoManager#onChannelDestroy.

The core code is as follows:

public void onChannelDestroy(String remoteAddr, Channel channel) {
    String brokerAddrFound = null;
    if (channel != null) {
        try {
            try {
                // Acquire the read lock.
                this.lock.readLock().lockInterruptibly();
                // Find the Broker address for this channel in brokerLiveTable.
                Iterator<Entry<String, BrokerLiveInfo>> itBrokerLiveTable =
                    this.brokerLiveTable.entrySet().iterator();
                while (itBrokerLiveTable.hasNext()) {
                    Entry<String, BrokerLiveInfo> entry = itBrokerLiveTable.next();
                    if (entry.getValue().getChannel() == channel) {
                        brokerAddrFound = entry.getKey();
                        break;
                    }
                }
            } finally {
                // Release the read lock.
                this.lock.readLock().unlock();
            }
        } catch (Exception e) {
            log.error("onChannelDestroy Exception", e);
        }
    }
    // If the Broker has already been removed from brokerLiveTable, fall back to remoteAddr.
    if (null == brokerAddrFound) {
        brokerAddrFound = remoteAddr;
    } else {
        log.info("the broker's channel destroyed, {}, clean it's data structure at once", brokerAddrFound);
    }

    if (brokerAddrFound != null && brokerAddrFound.length() > 0) {

        try {
            try {
                // Acquire the write lock.
                this.lock.writeLock().lockInterruptibly();
                // Remove this broker address from brokerLiveTable and filterServerTable.
                this.brokerLiveTable.remove(brokerAddrFound);
                this.filterServerTable.remove(brokerAddrFound);
                String brokerNameFound = null;
                boolean removeBrokerName = false;
                Iterator<Entry<String, BrokerData>> itBrokerAddrTable =
                    this.brokerAddrTable.entrySet().iterator();
                // Iterate over brokerAddrTable.
                while (itBrokerAddrTable.hasNext() && (null == brokerNameFound)) {
                    BrokerData brokerData = itBrokerAddrTable.next().getValue();

                    Iterator<Entry<Long, String>> it = brokerData.getBrokerAddrs().entrySet().iterator();
                    while (it.hasNext()) {
                        Entry<Long, String> entry = it.next();
                        Long brokerId = entry.getKey();
                        String brokerAddr = entry.getValue();
                        // Find the brokerData that contains this address and remove the address from it.
                        if (brokerAddr.equals(brokerAddrFound)) {
                            brokerNameFound = brokerData.getBrokerName();
                            it.remove();
                            log.info("remove brokerAddr[{}, {}] from brokerAddrTable, because channel destroyed",
                                     brokerId, brokerAddr);
                            break;
                        }
                    }
                    // If brokerData has no addresses left after the removal, remove the whole brokerData.
                    if (brokerData.getBrokerAddrs().isEmpty()) {
                        removeBrokerName = true;
                        itBrokerAddrTable.remove();
                        log.info("remove brokerName[{}] from brokerAddrTable, because channel destroyed",
                                 brokerData.getBrokerName());
                    }
                }

                if (brokerNameFound != null && removeBrokerName) {
                    // Iterate over clusterAddrTable.
                    Iterator<Entry<String, Set<String>>> it = this.clusterAddrTable.entrySet().iterator();
                    while (it.hasNext()) {
                        Entry<String, Set<String>> entry = it.next();
                        String clusterName = entry.getKey();
                        Set<String> brokerNames = entry.getValue();
                        // Remove the brokerName found in the third step from this cluster's set.
                        boolean removed = brokerNames.remove(brokerNameFound);
                        if (removed) {
                            log.info("remove brokerName[{}], clusterName[{}] from clusterAddrTable, because channel destroyed",
                                     brokerNameFound, clusterName);
                            // If the set is empty after the removal, remove the whole cluster from clusterAddrTable.
                            if (brokerNames.isEmpty()) {
                                log.info("remove the clusterName[{}] from clusterAddrTable, because channel destroyed and no broker in this cluster",
                                         clusterName);
                                it.remove();
                            }

                            break;
                        }
                    }
                }

                if (removeBrokerName) {
                    Iterator<Entry<String, List<QueueData>>> itTopicQueueTable =
                        this.topicQueueTable.entrySet().iterator();
                    // Iterate over topicQueueTable.
                    while (itTopicQueueTable.hasNext()) {
                        Entry<String, List<QueueData>> entry = itTopicQueueTable.next();
                        String topic = entry.getKey();
                        List<QueueData> queueDataList = entry.getValue();

                        Iterator<QueueData> itQueueData = queueDataList.iterator();
                        while (itQueueData.hasNext()) {
                            QueueData queueData = itQueueData.next();
                            // Remove the QueueData that belongs to this brokerName from the topic.
                            if (queueData.getBrokerName().equals(brokerNameFound)) {
                                itQueueData.remove();
                                log.info("remove topic[{} {}], from topicQueueTable, because channel destroyed",
                                         topic, queueData);
                            }
                        }
                        // If the topic has no QueueData left, remove the topic from the table as well.
                        if (queueDataList.isEmpty()) {
                            itTopicQueueTable.remove();
                            log.info("remove topic[{}] all queue, from topicQueueTable, because channel destroyed",
                                     topic);
                        }
                    }
                }
            } finally {
                // Release the write lock.
                this.lock.writeLock().unlock();
            }
        } catch (Exception e) {
            log.error("onChannelDestroy Exception", e);
        }
    }
}

The overall logic of route removal is divided into 6 steps:

  • Step1: Acquire the read lock, find the Broker address for the channel in brokerLiveTable, and release the read lock. If no address is found because the Broker has already been removed from brokerLiveTable, use remoteAddr directly.
  • Step2: Acquire the write lock and remove the Broker address from brokerLiveTable and filterServerTable.
  • Step3: Traverse brokerAddrTable, find the brokerData that contains the Broker address, and remove the address from it. If the brokerData has no addresses left after the removal, remove the whole brokerData.
  • Step4: If Step3 removed the whole brokerData, traverse clusterAddrTable and remove that brokerName from its cluster. If the cluster's set is empty after the removal, remove the whole cluster from clusterAddrTable.
  • Step5: If Step3 removed the whole brokerData, traverse topicQueueTable and remove the QueueData belonging to that brokerName from each Topic. If a Topic has no QueueData left, remove the Topic from the table as well.
  • Step6: Release the write lock.
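The cascade in Step3 to Step5 can be sketched on simplified tables. This is only an illustration: RouteRemovalSketch is a hypothetical class, QueueData is reduced to a broker name, locking is omitted for brevity, and the JDK collection helpers used here are not what the original code uses.

```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical, simplified routing tables; QueueData is reduced to a broker name.
class RouteRemovalSketch {
    final Map<String, Map<Long, String>> brokerAddrTable = new HashMap<>(); // brokerName -> id -> addr
    final Map<String, Set<String>> clusterAddrTable = new HashMap<>();      // cluster -> brokerNames
    final Map<String, List<String>> topicQueueTable = new HashMap<>();      // topic -> brokerNames

    void removeByAddress(String brokerAddr) {
        String emptiedBrokerName = null;
        // Step3: remove the address from its broker group; drop the group once it is empty.
        Iterator<Map.Entry<String, Map<Long, String>>> it = brokerAddrTable.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, Map<Long, String>> group = it.next();
            if (group.getValue().values().remove(brokerAddr)) {
                if (group.getValue().isEmpty()) {
                    emptiedBrokerName = group.getKey();
                    it.remove();
                }
                break;
            }
        }
        if (emptiedBrokerName == null) {
            return; // other brokers of the group are still registered, so the routes stay
        }
        final String name = emptiedBrokerName;
        // Step4: remove the broker name from its cluster, then drop clusters that became empty.
        clusterAddrTable.values().forEach(names -> names.remove(name));
        clusterAddrTable.values().removeIf(Set::isEmpty);
        // Step5: remove the broker from every topic, then drop topics that became empty.
        topicQueueTable.values().forEach(brokers -> brokers.remove(name));
        topicQueueTable.values().removeIf(List::isEmpty);
    }
}
```

Removing a slave's address leaves the topic routes untouched, because the master of the same broker group is still registered; only when the last address of a group disappears do the cluster and topic tables change.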

As the above shows, the overall logic of route removal is relatively simple: it only manipulates the data structures that hold the routing meta-information. To follow this code more easily, it helps to review the routing meta-information data structures introduced in section 4.1 while reading it.

4.4 Route discovery

When routing information changes, the NameServer does not push it to clients. Instead, each client periodically pulls the latest routing information from the NameServer. This design reduces the complexity of the NameServer implementation.

4.4.1 Producer actively pulls routing information

After the producer starts, it launches a series of scheduled tasks, one of which periodically fetches Topic routing information from the NameServer. The code entry point is MQClientInstance#startScheduledTask(), and the core code is as follows:

private void startScheduledTask() {
    ......
    this.scheduledExecutorService.scheduleAtFixedRate(new Runnable() {

        @Override
        public void run() {
            try {
                // Refresh the latest topic routing information from the NameServer.
                MQClientInstance.this.updateTopicRouteInfoFromNameServer();
            } catch (Exception e) {
                log.error("ScheduledTask updateTopicRouteInfoFromNameServer exception", e);
            }
        }
    }, 10, this.clientConfig.getPollNameServerInterval(), TimeUnit.MILLISECONDS);

    ......
}

/**
    * Get topic routing information from the NameServer.
    */
public TopicRouteData getTopicRouteInfoFromNameServer(final String topic, final long timeoutMillis,
                                                      boolean allowTopicNotExist) throws MQClientException, InterruptedException, RemotingTimeoutException, RemotingSendRequestException, RemotingConnectException {
    ......
    // Send a request to the NameServer with requestCode RequestCode.GET_ROUTEINFO_BY_TOPIC.
    RemotingCommand request = RemotingCommand.createRequestCommand(RequestCode.GET_ROUTEINFO_BY_TOPIC, requestHeader);
  ......
}

The producer and the NameServer communicate over Netty. In the request it sends to the NameServer, the producer sets the request code RequestCode.GET_ROUTEINFO_BY_TOPIC.
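The pull model boils down to "ask periodically, replace the local copy when the answer differs". Below is a minimal sketch of one polling round. RouteCacheSketch is a hypothetical class, the remote call to the NameServer is replaced by a plain Function, and a route is reduced to a list of broker addresses.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical client-side route cache; the Function stands in for the remote NameServer call.
class RouteCacheSketch {
    private final Map<String, List<String>> routes = new ConcurrentHashMap<>();
    private final Function<String, List<String>> nameServerLookup;

    RouteCacheSketch(Function<String, List<String>> nameServerLookup) {
        this.nameServerLookup = nameServerLookup;
    }

    // One polling round for a topic; returns true if the cached route changed.
    boolean refresh(String topic) {
        List<String> latest = nameServerLookup.apply(topic);
        if (latest == null) {
            return false; // nothing returned: keep whatever is cached
        }
        List<String> previous = routes.put(topic, latest);
        return !latest.equals(previous);
    }

    List<String> brokersFor(String topic) {
        return routes.get(topic);
    }
}
```

A scheduled task would simply call refresh for every topic the client cares about. Because the client only learns about changes on its next poll, it may briefly keep using a stale route; that delay is the price of keeping the NameServer free of push logic.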

4.4.2 NameServer returns routing information

After NameServer receives the request sent by the producer, it dispatches on the requestCode in the request. This also happens in the default network processor, DefaultRequestProcessor, and the lookup is ultimately implemented by RouteInfoManager#pickupTopicRouteData.

TopicRouteData structure

Before analyzing the source code, let's take a look at the data structure that NameServer returns to the producer. As you can see from the code, what is returned is a TopicRouteData object, the specific structure is as follows:

QueueData, BrokerData, and filterServerTable were introduced together with the routing meta-information in section 4.1.
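As a rough aid, the shape of this structure can be sketched in code. The sketch is inferred only from the setters, getters, and constructors that appear in the source excerpts of this article (setQueueDatas, setBrokerDatas, setFilterServerTable, setOrderTopicConf, getBrokerName, getBrokerAddrs). The Sketch-suffixed class names and the masterAddrOf helper are hypothetical, and the real classes contain more than is shown here.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-ins; the field sets are inferred from the excerpts, not copied from RocketMQ.
class QueueDataSketch {
    String brokerName; // the broker group that hosts queues of the topic
}

class BrokerDataSketch {
    String cluster;
    String brokerName;
    Map<Long, String> brokerAddrs = new HashMap<>(); // brokerId -> address, 0 is the master
}

class TopicRouteDataSketch {
    String orderTopicConf; // ordered-message configuration, may be null
    List<QueueDataSketch> queueDatas = new ArrayList<>();
    List<BrokerDataSketch> brokerDatas = new ArrayList<>();
    Map<String, List<String>> filterServerTable = new HashMap<>(); // brokerAddr -> filter servers

    // Illustrative helper (an assumption of this sketch): master address of a broker group, or null.
    String masterAddrOf(String brokerName) {
        for (BrokerDataSketch bd : brokerDatas) {
            if (bd.brokerName.equals(brokerName)) {
                return bd.brokerAddrs.get(0L);
            }
        }
        return null;
    }
}
```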

Source code analysis

With the TopicRouteData structure in mind, let's look at how RouteInfoManager#pickupTopicRouteData is implemented.

public TopicRouteData pickupTopicRouteData(final String topic) {
    TopicRouteData topicRouteData = new TopicRouteData();
    boolean foundQueueData = false;
    boolean foundBrokerData = false;
    Set<String> brokerNameSet = new HashSet<String>();
    List<BrokerData> brokerDataList = new LinkedList<BrokerData>();
    topicRouteData.setBrokerDatas(brokerDataList);

    HashMap<String, List<String>> filterServerMap = new HashMap<String, List<String>>();
    topicRouteData.setFilterServerTable(filterServerMap);

    try {
        try {
            // Acquire the read lock.
            this.lock.readLock().lockInterruptibly();
            // Look up the QueueData list for the topic in topicQueueTable.
            List<QueueData> queueDataList = this.topicQueueTable.get(topic);
            if (queueDataList != null) {
                // Put the QueueData list into queueDatas of topicRouteData.
                topicRouteData.setQueueDatas(queueDataList);
                foundQueueData = true;

                Iterator<QueueData> it = queueDataList.iterator();
                while (it.hasNext()) {
                    QueueData qd = it.next();
                    brokerNameSet.add(qd.getBrokerName());
                }
                // Iterate over the brokerNames extracted from the QueueData list.
                for (String brokerName : brokerNameSet) {
                    // Get the brokerData for brokerName from brokerAddrTable.
                    BrokerData brokerData = this.brokerAddrTable.get(brokerName);
                    if (null != brokerData) {
                        // Clone the brokerData and add the clone to brokerDatas of topicRouteData.
                        BrokerData brokerDataClone = new BrokerData(brokerData.getCluster(), brokerData.getBrokerName(), (HashMap<Long, String>) brokerData.getBrokerAddrs().clone());
                        brokerDataList.add(brokerDataClone);
                        foundBrokerData = true;
                        // Iterate over brokerAddrs.
                        for (final String brokerAddr : brokerDataClone.getBrokerAddrs().values()) {
                            // Get the filterServerList for brokerAddr and put it into filterServerTable of topicRouteData.
                            List<String> filterServerList = this.filterServerTable.get(brokerAddr);
                            filterServerMap.put(brokerAddr, filterServerList);
                        }
                    }
                }
            }
        } finally {
            // Release the read lock.
            this.lock.readLock().unlock();
        }
    } catch (Exception e) {
        log.error("pickupTopicRouteData Exception", e);
    }

    log.debug("pickupTopicRouteData {} {}", topic, topicRouteData);

    if (foundBrokerData && foundQueueData) {
        return topicRouteData;
    }

    return null;
}

The code above fills in the queueDatas, brokerDatas, and filterServerTable fields of TopicRouteData, but not the orderTopicConf field. To see where that field is filled in, let's look at the caller of RouteInfoManager#pickupTopicRouteData, DefaultRequestProcessor#getRouteInfoByTopic:

public RemotingCommand getRouteInfoByTopic(ChannelHandlerContext ctx,
                                           RemotingCommand request) throws RemotingCommandException {
    ......
    // Call the method analyzed above to obtain the topicRouteData object.
    TopicRouteData topicRouteData = this.namesrvController.getRouteInfoManager().pickupTopicRouteData(requestHeader.getTopic());

    if (topicRouteData != null) {
        // Check whether orderMessageEnable is turned on in the NameServer configuration.
        if (this.namesrvController.getNamesrvConfig().isOrderMessageEnable()) {
            // If it is, look up the ordered-message configuration in kvConfig by namespace and topic name.
            String orderTopicConf =
                this.namesrvController.getKvConfigManager().getKVConfig(NamesrvUtil.NAMESPACE_ORDER_TOPIC_CONFIG,
                                                                        requestHeader.getTopic());
            // Fill in orderTopicConf.
            topicRouteData.setOrderTopicConf(orderTopicConf);
        }

        byte[] content = topicRouteData.encode();
        response.setBody(content);
        response.setCode(ResponseCode.SUCCESS);
        response.setRemark(null);
        return response;
    }
    // If no topic route was found, the response code is TOPIC_NOT_EXIST.
    response.setCode(ResponseCode.TOPIC_NOT_EXIST);
    response.setRemark("No topic route info in name server for the topic: " + requestHeader.getTopic()
                       + FAQUrl.suggestTodo(FAQUrl.APPLY_TOPIC_URL));
    return response;
}

Combining these two methods, Topic route lookup can be summarized in 3 steps:

Call RouteInfoManager#pickupTopicRouteData to read topicQueueTable, brokerAddrTable, and filterServerTable, and fill queueDatas, brokerDatas, and filterServerTable respectively.

If orderMessageEnable is turned on in the NameServer configuration, get the ordered-message configuration for the topic from KVConfig and fill it into orderTopicConf.

If no routing information is found, the response code is ResponseCode.TOPIC_NOT_EXIST.
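The first and third steps can be sketched on simplified tables. As before, this is an illustration rather than the NameServer code: RouteLookupSketch is hypothetical, QueueData is reduced to a broker name, locking is left out, and returning null stands in for the TOPIC_NOT_EXIST response.

```java
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;

// Hypothetical, simplified lookup; a null result stands in for "topic route not found".
class RouteLookupSketch {
    final Map<String, List<String>> topicQueueTable = new HashMap<>();      // topic -> brokerNames
    final Map<String, Map<Long, String>> brokerAddrTable = new HashMap<>(); // brokerName -> id -> addr

    // Returns brokerName -> copied address map, or null when no usable route exists.
    Map<String, Map<Long, String>> lookup(String topic) {
        List<String> brokerNames = topicQueueTable.get(topic);
        if (brokerNames == null) {
            return null; // no queue data for this topic
        }
        Map<String, Map<Long, String>> result = new HashMap<>();
        for (String name : new LinkedHashSet<>(brokerNames)) {
            Map<Long, String> addrs = brokerAddrTable.get(name);
            if (addrs != null) {
                // Copy the address map so callers cannot modify the routing table.
                result.put(name, new HashMap<>(addrs));
            }
        }
        return result.isEmpty() ? null : result; // queue data without broker data is not a route
    }
}
```

Like the original, the sketch only reports a route when both the queue data and at least one broker entry are found, and it hands out a copy of the address map rather than the live table entry.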

5. Summary

This article mainly introduces RocketMQ's NameServer from the perspective of source code, including the startup process of NameServer, route registration, route removal and route discovery. After we understand the design principles of NameServer, we can also go back and think about some tips that are worth learning in the design process. Here I offer two points:

  • The startup process registers a JVM shutdown hook for graceful shutdown. This is a general programming technique: if an application uses thread pools or long-running background threads, it can register a JVM shutdown hook to release resources or finish pending work before the JVM exits.
  • Updates to the routing tables must be locked to prevent concurrent modification. A read-write lock is used here, which is finer-grained than an exclusive lock: multiple message senders can read routes concurrently, which keeps message sending highly concurrent, while the NameServer processes only one Broker heartbeat packet at a time, so concurrent heartbeat packets are executed serially. This is a classic use case for read-write locks.
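The first tip can be sketched with the JDK alone. This is a minimal sketch, not the NameServer startup code: GracefulShutdownSketch and registerShutdownHook are hypothetical names, and the 5-second grace period is an arbitrary value chosen for illustration.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical helper: shuts a thread pool down when the JVM exits.
class GracefulShutdownSketch {

    // Registers a JVM shutdown hook for the pool and returns the hook thread.
    static Thread registerShutdownHook(ExecutorService pool) {
        Thread hook = new Thread(() -> {
            pool.shutdown(); // stop accepting new tasks
            try {
                // Give running tasks a short grace period (assumed value), then force termination.
                if (!pool.awaitTermination(5, TimeUnit.SECONDS)) {
                    pool.shutdownNow();
                }
            } catch (InterruptedException e) {
                pool.shutdownNow();
                Thread.currentThread().interrupt();
            }
        }, "shutdown-hook-sketch");
        Runtime.getRuntime().addShutdownHook(hook);
        return hook;
    }
}
```

Shutdown hooks run on normal exit and on signals such as SIGTERM, but not when the process is killed forcibly, so they should be treated as a best-effort cleanup path.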

Author: vivo internet server team-Ye Wenhao
