The MQTT protocol is widely used in the Internet of Things, on small devices, and in mobile applications, and has gradually become a de facto standard for IoT communication. This article focuses on the challenges of setting up an MQTT broker cluster and the role load balancing plays in that cluster.

MQTT protocol

Like the familiar HTTP protocol, MQTT is an application-layer protocol that runs over TCP, optionally secured with TLS (it can also run over WebSocket, which this article does not cover).

The MQTT standards body describes the protocol roughly as follows:

MQTT is an OASIS standard messaging protocol for the Internet of Things (IoT). It is an extremely lightweight publish/subscribe messaging transport that is well suited to connecting remote devices with a small code footprint and minimal network bandwidth. Today, MQTT is used in a wide variety of industries, such as automotive, manufacturing, telecommunications, and oil and gas.

An MQTT client is also much like an HTTP client: it establishes a TCP connection with the server and transmits data over that connection. The difference is that HTTP uses a request/response model, while MQTT uses a publish/subscribe model.

For example, a temperature sensor installed in the living room periodically publishes the indoor temperature to an MQTT server. Another smart home device that subscribes to the topic the sensor publishes to receives those readings and can react to the actual room temperature, for example by turning on the air conditioner when the indoor temperature exceeds 32°C.
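The publish/subscribe flow in this example can be sketched in a few lines of Python. This is only an in-memory stand-in for an MQTT server, not a real client or broker: the `TinyBroker` class, the topic name `home/livingroom/temperature`, and the `air_conditioner` handler are all hypothetical names chosen for illustration, and only exact topic matches are supported (no MQTT wildcards, QoS, or network I/O).

```python
from collections import defaultdict
from typing import Callable, DefaultDict, List

# A handler receives the topic and the payload of each published message.
Handler = Callable[[str, str], None]


class TinyBroker:
    """In-memory stand-in for an MQTT server (exact-match topics only)."""

    def __init__(self) -> None:
        self._subscribers: DefaultDict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Handler) -> None:
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, payload: str) -> None:
        # The publisher never addresses a subscriber directly;
        # the broker fans the message out to whoever subscribed.
        for handler in list(self._subscribers.get(topic, [])):
            handler(topic, payload)


actions: List[str] = []


def air_conditioner(topic: str, payload: str) -> None:
    # Hypothetical smart home device: react when the room is hotter than 32°C.
    if float(payload) > 32.0:
        actions.append("air conditioner on")


broker = TinyBroker()
broker.subscribe("home/livingroom/temperature", air_conditioner)

# The sensor publishes two readings; only the second exceeds the threshold.
broker.publish("home/livingroom/temperature", "30.5")
broker.publish("home/livingroom/temperature", "33.0")
print(actions)  # ['air conditioner on']
```

Note that the sensor and the air conditioner never reference each other. Both only know the topic, which is what lets a broker (or a cluster of brokers) sit between them.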

Scalability challenge

The MQTT protocol may sound remote, but it is already part of daily life. In general, a single MQTT node can meet the device connection needs of one household's smart home, and users can even run EMQ X Edge (an MQTT server that runs at the edge) on a Raspberry Pi. An EMQ X node running in the cloud can support up to 2 million connections, easily meeting the needs of ordinary smart home scenarios.

But when millions of cars across the country need to be connected, or millions of street lights need to transmit data, the number of devices (MQTT clients) and the data throughput far exceed what a single MQTT node can withstand, and an MQTT server cluster becomes necessary.

Building a cluster brings its own series of technical challenges:

  1. Service address: how does a client know which address to connect to?
  2. Session takeover: how do different nodes take over an MQTT subscriber's session? For example, when a client disconnects from one server, how is its connection restored on another?
  3. Routing consistency: how is the routing table kept consistent across all nodes in the cluster?

Placing a load balancer in front of the MQTT cluster helps solve problems 1 and 2.

MQTT load balancing

To address these problems, the load balancer must decide which node each client connects to, according to the configured balancing strategy. The main functions of load balancing for an MQTT cluster are:

  • Provide cluster service address externally

    The client only needs to know the address of the load balancer, not the address of each node in the cluster. This makes it much easier to migrate and scale the servers behind it.

  • TLS termination

    Many MQTT users choose to terminate TLS at the load balancer, so that the resources of the MQTT servers can be devoted entirely to message processing.

  • Balance the load of each node in the cluster

    Load balancers can usually be configured with different balancing strategies, such as random allocation, round robin (some round-robin strategies support per-node weights), and sticky allocation, in which the same client is consistently routed to the same node.
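To make the balancing strategies concrete, here is a minimal Python sketch of random, weighted round-robin, and sticky selection. It is an illustration under assumptions, not how any particular load balancer implements them: the node names `emqx-1` to `emqx-3`, the weights, and the choice of hashing the client ID for stickiness are all hypothetical.

```python
import hashlib
import itertools
import random
from typing import Dict, Iterator, List

NODES = ["emqx-1", "emqx-2", "emqx-3"]  # hypothetical node names


def pick_random(nodes: List[str], rng: random.Random) -> str:
    """Random allocation: every node is equally likely."""
    return rng.choice(nodes)


def weighted_round_robin(weights: Dict[str, int]) -> Iterator[str]:
    """Cycle through the nodes, repeating each one in proportion to its weight."""
    expanded = [node for node, weight in weights.items() for _ in range(weight)]
    return itertools.cycle(expanded)


def pick_sticky(nodes: List[str], client_id: str) -> str:
    """Sticky allocation: the same client ID always maps to the same node."""
    digest = hashlib.sha256(client_id.encode("utf-8")).digest()
    return nodes[int.from_bytes(digest[:8], "big") % len(nodes)]


# emqx-1 has weight 2, so it receives two of every four new connections.
rr = weighted_round_robin({"emqx-1": 2, "emqx-2": 1, "emqx-3": 1})
first_four = [next(rr) for _ in range(4)]
print(first_four)  # ['emqx-1', 'emqx-1', 'emqx-2', 'emqx-3']

# A reconnecting client lands on the node it used before.
print(pick_sticky(NODES, "sensor-42") == pick_sticky(NODES, "sensor-42"))  # True
```

One design caveat of this sketch: hashing modulo the node count remaps most clients whenever a node is added or removed. A production setup would more likely rely on consistent hashing or a stick table to limit that churn.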

Since MQTT runs over TCP/IP, load balancing can be performed at the transport layer. Beyond that, two load balancers that work with MQTT, HAProxy 2.4 and Nginx Plus, also provide load balancing at the application layer (the MQTT layer).

Nginx Plus is an application delivery platform built on Nginx (an open source web server and reverse proxy for high-traffic websites). An article from Nginx Plus describes its MQTT load balancing in more detail.

HAProxy is an equally good option. It provides high-availability load balancing and proxying for TCP, HTTP, and MQTT traffic. As of this writing (August 2021), HAProxy 2.4 is the only free product that provides MQTT-layer load balancing; its release notes briefly introduce the feature.

In the next article in the "MQTT Broker cluster detailed explanation" series, we will walk through the integration of HAProxy 2.4 with EMQ X 4.3 in detail, so stay tuned.

Copyright statement: This article is original content from EMQ; please credit the source when reprinting.

Original link: https://www.emqx.com/zh/blog/mqtt-broker-clustering-part-1-load-balancing

Technical support: If you have any questions about this article or EMQ products, you can ask them in the EMQ Q&A community at https://askemq.com, and we will reply promptly.

