In the previous article of this series, "MQTT Broker Clustering Explained (Part 1): Load Balancing", we briefly introduced MQTT load balancing: it can be applied at both the transport layer and the application layer. In this article, we will cover application-layer load balancing in detail, and in particular its most interesting part: sticky sessions.

This article consists of two parts. The first part introduces the MQTT session and the MQTT broker cluster; the second part puts an HAProxy 2.4 load balancer in front of an EMQ X 4.3 cluster and walks you through, hands-on, how to make full use of sticky sessions for load balancing.

MQTT session

To receive messages continuously, an MQTT client usually subscribes to topics on the MQTT broker and maintains a long-lived connection. The connection may be interrupted for a while due to network problems or client software maintenance. This is not uncommon, and the client usually still expects to receive the messages it missed during the interruption once it has reconnected.

Therefore, the MQTT broker serving the client should maintain a session for it (at the client's request, by setting the "Clean Session" flag to false). With such a session, even while the client is disconnected, the topics the subscriber currently subscribes to, and the messages (QoS 1 and 2) published to those topics, are retained by the message server (broker).

When a client with a persistent session reconnects, it does not need to subscribe to the topics again, and the message server should send it all the pending messages.
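As a quick illustration, a persistent session can be requested from the command line with the mosquitto_sub client (the broker address broker.example.com below is a placeholder): supply a fixed client identifier and disable the clean-session flag.

# Request a persistent session: -i sets a fixed client identifier,
# -c disables the clean-session flag, and -q 1 requests QoS 1 so that
# messages published while the client is offline are queued for it.
mosquitto_sub -h broker.example.com -i client1 -c -q 1 -t 't/#'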

We have written an article about the MQTT session before. If you are interested in the technical details of MQTT sessions, you can read it to learn more.

Session takeover

When MQTT brokers form a cluster, things become more complicated. From the client's point of view, there is more than one server to connect to, and it is hard to know which one is most suitable. We need another key component in the network: the load balancer. The load balancer becomes the access point of the entire cluster and routes each client connection to one of the servers in the cluster.

If a client connects to a server (for example, node1) through the load balancer, then disconnects and reconnects later, the new connection may be routed to a different server in the cluster (for example, node3). In this case, node3 should take over the session and start sending the client the messages that went undelivered while it was disconnected.

There are many different strategies for implementing cluster-wide persistent sessions. For example, the entire cluster can share a global storage to save client sessions.

However, more scalable solutions usually solve this problem in a distributed manner, by migrating session data from one node to another. This migration is called session takeover. Session takeover must be completely transparent to the client, but it comes at a cost, especially when there are many messages to process.

[Figure: session takeover]

Sticky Session Solution

The term "sticky" here refers to the load balancer's ability to route a reconnecting client to the server it was previously connected to, which avoids session takeover. This is a particularly useful feature when many clients reconnect at the same time, or when a problematic client repeatedly disconnects and reconnects.

In order for the load balancer to dispatch connections in a "sticky" manner, it needs to know the client identifier (and sometimes the username) in the connection request. This requires the load balancer to inspect the MQTT packets to find this information.

Once the client identifier (or username) is obtained, for a static cluster, the load balancer can hash it to a server ID. Or, for better flexibility, the load balancer can maintain a mapping table, a sticky table, from client identifiers (or usernames) to target node IDs.
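As a rough sketch of the hashing strategy, assuming a static cluster of two nodes (cksum below is just a stand-in for whatever hash function the load balancer actually uses):

# Hypothetical sketch: map a client identifier to one of 2 nodes by hashing.
CLIENT_ID="subscriber1"
HASH=$(printf '%s' "$CLIENT_ID" | cksum | cut -d ' ' -f 1)
echo "route client '$CLIENT_ID' to node $(( HASH % 2 + 1 ))"

As long as the set of nodes does not change, every reconnect of the same client identifier hashes to the same node.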

In the next section, we will demonstrate the sticky table strategy in HAProxy 2.4.

Use HAProxy 2.4 to achieve sticky sessions

In order to minimize the prerequisites, in this demo cluster we will start two EMQ X nodes and one HAProxy 2.4 instance, each in its own docker container.

Create docker network

In order to connect the containers to each other, we create a docker network for them.

docker network create test.net

Start two EMQ X 4.3 nodes

In order for the nodes to connect to each other, both the container names and the EMQX node names use test.net as the domain name suffix.

Start node1

docker run -d \
  --name n1.test.net \
  --net test.net \
  -e EMQX_NODE_NAME=emqx@n1.test.net \
  -e EMQX_LISTENER__TCP__EXTERNAL__PROXY_PROTOCOL=on \
  emqx/emqx:4.3.7

Start node2

docker run -d \
  --name n2.test.net \
  --net test.net \
  -e EMQX_NODE_NAME=emqx@n2.test.net \
  -e EMQX_LISTENER__TCP__EXTERNAL__PROXY_PROTOCOL=on \
  emqx/emqx:4.3.7
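Before clustering the nodes, you can optionally confirm that each broker has started; emqx_ctl status reports whether the local node is running:

# Optional: confirm the broker on node1 is up (repeat for n2.test.net).
docker exec -it n1.test.net emqx_ctl status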

Note the environment variable EMQX_LISTENER__TCP__EXTERNAL__PROXY_PROTOCOL. It enables the binary proxy protocol for the TCP listener, so that the server can obtain the client's real IP address instead of the IP address of the load balancer.

Add EMQ X nodes to the cluster

docker exec -it n2.test.net emqx_ctl cluster join emqx@n1.test.net

If everything goes as expected, a log like this should be printed out:

[EMQ X] emqx shutdown for join
Join the cluster successfully.
Cluster status: #{running_nodes => ['emqx@n1.test.net','emqx@n2.test.net'], stopped_nodes => []} 
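You can also confirm the cluster state at any time with the cluster status subcommand:

# Optional: verify both nodes are listed as running in the cluster.
docker exec -it n1.test.net emqx_ctl cluster status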

Start HAProxy 2.4

Create the file /tmp/haproxy.cfg (the path mounted into the container below) with the following content:

global
  log stdout format raw daemon debug
  nbproc 1
  nbthread 2
  cpu-map auto:1/1-2 0-1
  # Enable the HAProxy Runtime API
  # e.g. echo "show table emqx_tcp_back" | sudo socat stdio tcp4-connect:172.100.239.4:9999
  stats socket :9999 level admin expose-fd listeners

defaults
  log global
  mode tcp
  option tcplog
  maxconn 1024000
  timeout connect 30000
  timeout client 600s
  timeout server 600s

frontend emqx_tcp
  mode tcp
  option tcplog
  bind *:1883
  default_backend emqx_tcp_back

backend emqx_tcp_back
  mode tcp

  # Create a stick table for session persistence
  stick-table type string len 32 size 100k expire 30m

  # Use ClientID / client_identifier as persistence key
  stick on req.payload(0,0),mqtt_field_value(connect,client_identifier)

  # send proxy-protocol v2 headers
  server emqx1 n1.test.net:1883 check-send-proxy send-proxy-v2
  server emqx2 n2.test.net:1883 check-send-proxy send-proxy-v2
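
Optionally, you can validate the configuration syntax before starting the proxy by running HAProxy's check mode (-c) from the same image:

# Optional: check the configuration file for syntax errors.
docker run --rm -v /tmp/haproxy.cfg:/haproxy.cfg haproxy:2.4 \
    haproxy -c -f /haproxy.cfg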

Start HAProxy in the test docker network:

docker run -d \
  --net test.net \
  --name proxy.test.net \
  -p 9999:9999 \
  -v /tmp/haproxy.cfg:/haproxy.cfg \
  haproxy:2.4 haproxy -f /haproxy.cfg

Test

Now we use the popular mosquitto MQTT client (also in docker) to test it.

We start a subscriber (named subscriber1) to subscribe to the topic t/#:

docker run --rm -it --net test.net eclipse-mosquitto \
    mosquitto_sub -h proxy.test.net -t 't/#' -i subscriber1

Then publish a hello message to the topic t/xyz from another client:

docker run --rm -it --net test.net eclipse-mosquitto \
    mosquitto_pub -h proxy.test.net -t 't/xyz' -m 'hello'

If everything goes as expected, the subscriber should print out the hello message.

Check the sticky table in HAProxy

We can also inspect the sticky table created in HAProxy with the command below. It requires the socat command, so we run it from the docker host.

echo "show table emqx_tcp_back" | sudo socat stdio tcp4-connect:127.0.0.1:9999

The command should print the current connection as shown below:

# table: emqx_tcp_back, type: string, size:102400, used:1
0x7f930c033d90: key=subscriber1 use=0 exp=1793903 server_id=2 server_key=emqx2

In this example, the client subscriber1 is stuck (pinned) to the server emqx2.
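As a further check, stop the subscriber and start it again with the same client identifier; the new connection should land on emqx2 again. Because the configuration above logs to stdout, HAProxy's tcplog lines, which record the backend and server chosen for each connection, can be read from the container log:

# The most recent tcplog lines should show the reconnected client
# being dispatched to emqx_tcp_back/emqx2 again.
docker logs --tail 5 proxy.test.net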

Concluding remarks

So far, we have seen, from the client's perspective, how an EMQ X cluster provides service to the outside through a load balancer.

In the follow-up articles of this series, we will trace the full journey of an MQTT message from publisher to subscriber, so that you can understand how EMQ X replicates and forwards it inside the cluster. Stay tuned.


Copyright statement: This article is an EMQ original. Please indicate the source when reprinting.

Original link: https://www.emqx.com/zh/blog/mqtt-broker-clustering-part-2-sticky-session-load-balancing

Technical support: If you have any questions about this article or EMQ products, you can visit the EMQ Q&A community at https://askemq.com to ask them; we will reply and provide support in time.

For more technical content, please follow our WeChat official account [EMQ Chinese Community].
