What is the MQTT protocol? This article tells you!

The article was first published on my public account "programmer cxuan", welcome everyone to pay attention~

A reader left a message to me before saying that he wanted to know what the MQTT protocol is, and he praised me by the way, which is a bit embarrassing.

Readers' requests must be met, so to the young lady who asked: come and join the class!

What is the MQTT protocol

MQTT stands for Message Queuing Telemetry Transport. It is a publish-subscribe messaging protocol standardized as ISO/IEC 20922, and it runs on top of the TCP/IP protocol suite. It was designed for devices with limited hardware resources and for networks with low bandwidth or poor reliability. MQTT is generally used in IoT, the Internet of Things, and is widely used in industrial application scenarios, such as automobiles, manufacturing, oil, and natural gas.

Now that we know what MQTT is and where it is used, let's look at its basic concepts.

MQTT basics

Above we introduced the MQTT protocol. In short, MQTT is a lightweight binary protocol. Compared with HTTP, it has an obvious advantage: a smaller packet overhead, which makes network transmission easier. Another advantage is that MQTT is easy to implement on the client side and easy to use, which makes it very suitable for today's resource-limited devices.

You may be a little puzzled by these claims. Why does MQTT have these characteristics? To answer that, we need to start with the design of MQTT.

The MQTT protocol was invented in 1999 by Andy Stanford-Clark (IBM) and Arlen Nipper (Arcom, now Cirrus Link). They needed a protocol for connecting oil pipelines via satellite that kept battery consumption and bandwidth to a minimum. So they set several requirements for this protocol:

The protocol must be easy to implement;
Data must be easy to transmit at low cost;
The protocol must provide quality of service management;
The protocol must support continuous session control;
The protocol must be data-agnostic: it does not prescribe the type or format of the transmitted data, which keeps it flexible.

These requirements are still the essence of MQTT. After continuous development, MQTT has become an essential messaging protocol for the Internet of Things. The version that is officially and strongly recommended is MQTT 5.

Publish-subscribe model

I believe anyone who has worked with message middleware has heard of the publish-subscribe model. It is an alternative to the traditional client-server architecture, in which the client communicates directly with the server.

The publish-subscribe model (pub/sub) works differently. It separates the client that sends a message (the publisher) from the clients that receive it (the subscribers). Publishers and subscribers never communicate directly, and they do not even know whether the other exists. The communication between them is handled by a third-party component, the broker.

The most important aspect of pub/sub is the decoupling between publisher and subscriber. This decoupling has the following three dimensions:

Spatial decoupling: Publisher and subscriber do not need to know each other; for example, they never exchange IP addresses or ports.
Time decoupling: Publisher and subscriber do not necessarily need to run at the same time.
Synchronization decoupling: Operations on either component are not interrupted while messages are being published or received.

In short, the publish/subscribe model eliminates the direct communication between the traditional client and server, hands the communication over to the broker, and decouples the two sides in the three dimensions of space, time, and synchronization.
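To make the role of the broker a little more tangible, here is a minimal sketch in Python. It is not a real MQTT broker: `TinyBroker` and the topic names are made up for illustration, it runs inside one process, and it only matches exact topic names. Because delivery is a plain function call, it shows spatial decoupling only, not time or synchronization decoupling.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, List

Handler = Callable[[str, bytes], None]


class TinyBroker:
    """Hypothetical in-process broker that routes messages by exact topic name."""

    def __init__(self) -> None:
        self._subscribers: DefaultDict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Handler) -> None:
        # The subscriber only talks to the broker, never to a publisher.
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, payload: bytes) -> int:
        """Deliver payload to every subscriber of topic; return how many got it."""
        handlers = self._subscribers.get(topic, [])
        for handler in handlers:
            handler(topic, payload)
        return len(handlers)


broker = TinyBroker()
received = []
broker.subscribe("home/kitchen/temperature", lambda t, p: received.append((t, p)))

delivered = broker.publish("home/kitchen/temperature", b"21.5")
unrouted = broker.publish("home/garage/door", b"open")  # nobody subscribed
```

The publisher never learns who the subscribers are; it only sees the broker. The second `publish` call also shows that a topic may have no subscriber at all.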

Scalability

Pub/sub scales better than the traditional client-server model. This is because operations on the broker can be highly parallelized and messages are processed in an event-driven way. Scalability also shows in message caching and intelligent routing of messages. Clustered brokers can even handle millions of connections, with load balancers distributing the load across more individual servers. This is an advanced use of MQTT.

You may not know what event-driven means, so let me explain the concept here.

Event-driven programming is a programming paradigm. A programming paradigm is a software engineering concept that refers to a style or method of programming; object-oriented programming and procedural programming are examples. In event-driven programming, the flow of the program is determined by events such as user actions (mouse clicks, key presses), sensor output, or messages passed from other programs or threads. Event-driven programming is the main paradigm used in graphical user interfaces and other applications, such as web applications, that perform operations in response to user input. It also applies to device driver programming.
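If the idea still feels abstract, the following Python sketch may help. `EventLoop` is a hypothetical class written for this article, not part of any MQTT library. It illustrates the core idea: the program registers handlers, and the order of execution is decided by the order in which events arrive.

```python
from collections import deque
from typing import Callable, Deque, Dict, List, Tuple


class EventLoop:
    """Hypothetical minimal event loop: handlers run when their event arrives."""

    def __init__(self) -> None:
        self._handlers: Dict[str, List[Callable[[str], None]]] = {}
        self._pending: Deque[Tuple[str, str]] = deque()

    def on(self, event_type: str, handler: Callable[[str], None]) -> None:
        self._handlers.setdefault(event_type, []).append(handler)

    def emit(self, event_type: str, data: str) -> None:
        # Events are queued; nothing runs until the loop processes them.
        self._pending.append((event_type, data))

    def run(self) -> None:
        while self._pending:
            event_type, data = self._pending.popleft()
            for handler in self._handlers.get(event_type, []):
                handler(data)


log = []
loop = EventLoop()
loop.on("click", lambda data: log.append("clicked " + data))
loop.on("sensor", lambda data: log.append("reading " + data))

loop.emit("sensor", "21.5")
loop.emit("click", "OK button")
loop.run()
```

The handlers were registered in one order, but they run in the order the events were emitted.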

`Message filtering`

In the pub/sub architecture pattern, the broker plays a vital role. One of the very important points is that the broker can filter messages so that each subscriber can only receive messages that are of interest to them.

The broker has several options for filtering messages:

Topic-based filtering

MQTT filters messages by topic. Every message has a topic. The receiving client subscribes to the topics it is interested in at the broker. After that, the broker ensures that the client receives the messages published to those topics.

Content-based filtering

In content-based filtering, the broker filters messages based on specific content, and receiving clients subscribe with filter queries for the content they are interested in. A significant disadvantage of this method is that the content of the message must be known in advance and cannot be encrypted or easily modified.

Type-based filtering

When using object-oriented languages, type filtering based on messages (events) is a relatively common filtering method.

To deal with the challenges of a publish/subscribe system, MQTT has three quality of service (QoS) levels. You can specify the level for a message sent from the client to the broker or from the broker to the client. It is also possible that a topic has no subscribers at all, and the broker must know how to handle this situation.

`The difference between MQTT and message queue`

We now know that MQTT stands for Message Queuing Telemetry Transport. The name suggests that it is based on message queues, but MQTT is different from a message queue.

In the traditional message queue model, every incoming message is stored in the queue until it is picked up by a client (usually called the consumer). If no client consumes the message, it stays in the queue waiting to be consumed. So in a message queue a message cannot go unprocessed, whereas in MQTT a topic may well have no subscribers.

In the traditional message queue model, a message can only be consumed by one client, and the load is distributed among the consumers of the queue. In MQTT, every subscriber to the topic receives the message, so every subscriber carries the same load.

In the traditional message queue mode, a separate command must be used to explicitly create a queue. Only after the queue is created, can messages be produced or consumed; in MQTT, topics are more flexible and can be created instantly.
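The difference in delivery can be sketched in a few lines of Python. Both functions are hypothetical simplifications written for this comparison. The round-robin distribution in `deliver_via_queue` is an assumption; real message queues may balance load differently.

```python
from collections import deque
from itertools import cycle
from typing import Dict, List


def deliver_via_queue(messages: List[str], consumers: List[str]) -> Dict[str, List[str]]:
    """Queue semantics: each message goes to exactly one consumer."""
    inbox: Dict[str, List[str]] = {name: [] for name in consumers}
    pending = deque(messages)
    turn = cycle(consumers)  # assumed round-robin load distribution
    while pending:
        inbox[next(turn)].append(pending.popleft())
    return inbox


def deliver_via_topic(messages: List[str], subscribers: List[str]) -> Dict[str, List[str]]:
    """Topic semantics: every subscriber receives every message."""
    return {name: list(messages) for name in subscribers}


messages = ["m1", "m2", "m3", "m4"]
queue_result = deliver_via_queue(messages, ["a", "b"])
topic_result = deliver_via_topic(messages, ["a", "b"])
no_subscribers = deliver_via_topic(messages, [])  # messages reach nobody
```

With a queue, consumers `a` and `b` split the four messages between them. With a topic, both receive all four, and with no subscribers nobody receives anything.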

HiveMQ is now open source. HiveMQ Community Edition implements the MQTT broker specification and is compatible with MQTT 3.1, 3.1.1 and MQTT 5. HiveMQ MQTT Client is a Java-based MQTT client implementation, compatible with MQTT 3.1.1 and MQTT 5. Both projects can be found on HiveMQ's GitHub: https://github.com/hivemq .

We know that the broker separates the publisher from the subscriber, so every client connection goes through the broker. Before going deeper into MQTT, we need to know what the client and the broker are.

`MQTT important concepts`

`MQTT client`

When we talk about a client, we generally mean an MQTT Client. Both publishers and subscribers are MQTT Clients. Publisher and subscriber are relative terms that describe whether the client is currently publishing or receiving messages. The same MQTT Client can both publish and subscribe.

An MQTT client is any device that runs an MQTT library and connects to an MQTT broker over a network. These devices range from microcontrollers to full-fledged servers. Basically, any device that speaks MQTT over TCP/IP can be called an MQTT Client. The client implementation of the MQTT protocol is simple and straightforward, which is one of the reasons MQTT is so well suited to small devices. MQTT client libraries are available for many programming languages and platforms, for example Android, Arduino, C, C++, C#, Go, iOS, Java, JavaScript, and .NET.

`MQTT broker`

The MQTT broker is the counterpart of the MQTT client. The broker is the core of any publish/subscribe protocol. Depending on the implementation, a broker can handle up to millions of connected MQTT clients.

The broker is responsible for receiving all messages, filtering the messages, determining which client has subscribed to each message, and sending the message to the corresponding client. The broker is also responsible for saving session data, including subscribed and missed messages. The broker is also responsible for client authentication and authorization.

`MQTT Connection`

MQTT is based on the TCP/IP protocol, so both the client and broker of MQTT need the support of the TCP/IP protocol.

An MQTT connection is always between one client and the broker; clients never connect to each other directly. To initiate a connection, the client sends a CONNECT message to the broker, and the broker responds with a CONNACK message and a status code. Once the connection is established, the broker keeps it open until the client sends a disconnect command or the connection is interrupted.

`Connection messages`

Establishing an MQTT connection involves two messages: CONNECT and CONNACK.

`CONNECT`

As mentioned above, to initialize the connection the client sends a CONNECT message to the broker. If the CONNECT message is malformed, or if too much time passes between opening the socket (the TCP/IP stack has to set up the socket connection first) and sending the CONNECT message, the broker closes the connection.

An MQTT client sends a CONNECT message, which may contain the following information.

Let me explain each of these fields.

ClientId : This is the ID of each client that connects to the MQTT broker. The ID should be unique per client and broker. If you don't need the broker to hold state, you can send an empty ClientId, which results in a connection without any state. In this case, CleanSession must be set to true, otherwise the broker rejects the connection.

We will talk about what CleanSession is below.

CleanSession : The CleanSession flag tells the broker whether the client wants to establish a persistent session. In a persistent session (CleanSession = false), the broker stores all of the client's subscriptions and all missed messages for the subscriptions the client made with quality of service (QoS) 1 or 2. If the session is not persistent (CleanSession = true), the broker does not store anything for the client and clears all information from any previous persistent session.
Username/Password : MQTT can send a username and password for client authentication and authorization. Unless this information is encrypted or hashed, the password is sent in plain text. It is therefore strongly recommended to send the username and password over a secure, encrypted transport. Brokers like HiveMQ can also authenticate clients with SSL certificates, in which case no username and password are required.
LastWillxxx : The LastWill fields make up the last will message. The client can set up a last will when it connects to the broker, and the broker stores it. If the client disconnects from the broker abnormally, the broker sends the last will message to all clients that have subscribed to the last will topic.
keepAlive : keepAlive is a time interval in seconds that the client specifies when the connection is established. It defines the longest period that the client and the broker can go without sending a message to each other.
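To see how these fields end up on the wire, here is a rough Python sketch that builds a CONNECT packet by hand, following the MQTT 3.1.1 layout. `build_connect` and `encode_string` are names invented for this article, the will is always sent with QoS 0 and without the retain flag, and the sketch only handles packets short enough for a one-byte length field. In practice you would use an MQTT client library instead.

```python
import struct
from typing import Optional


def encode_string(text: str) -> bytes:
    """MQTT strings are UTF-8 bytes prefixed with a 2-byte big-endian length."""
    data = text.encode("utf-8")
    return struct.pack("!H", len(data)) + data


def build_connect(client_id: str, clean_session: bool = True, keep_alive: int = 60,
                  username: Optional[str] = None, password: Optional[str] = None,
                  will_topic: Optional[str] = None, will_message: str = "") -> bytes:
    if password is not None and username is None:
        raise ValueError("MQTT 3.1.1 does not allow a password without a username")
    if client_id == "" and not clean_session:
        raise ValueError("an empty ClientId requires CleanSession = true")
    flags = 0x02 if clean_session else 0x00   # bit 1: CleanSession
    payload = encode_string(client_id)
    if will_topic is not None:
        flags |= 0x04                         # bit 2: will flag (QoS 0, not retained)
        payload += encode_string(will_topic) + encode_string(will_message)
    if username is not None:
        flags |= 0x80                         # bit 7: username present
        payload += encode_string(username)
    if password is not None:
        flags |= 0x40                         # bit 6: password present
        payload += encode_string(password)
    # Variable header: protocol name, protocol level 4 (MQTT 3.1.1), flags, keepAlive.
    header = encode_string("MQTT") + bytes([4, flags]) + struct.pack("!H", keep_alive)
    body = header + payload
    if len(body) > 127:
        raise ValueError("this sketch only handles lengths that fit in one byte")
    return bytes([0x10, len(body)]) + body    # 0x10 = CONNECT packet type


packet = build_connect("sensor-1", clean_session=True, keep_alive=60)
with_will = build_connect("sensor-1", will_topic="status/sensor-1", will_message="offline")
```

The whole CONNECT packet for `sensor-1` is 22 bytes, which gives a feeling for how small the protocol overhead is.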

After talking about sending a CONNECT message to establish a connection between the client and the broker, let's talk about the CONNACK message that the broker needs to confirm the CONNECT.

`CONNACK`

When the broker receives a CONNECT message, it is obligated to respond with a CONNACK message. The CONNACK message contains two fields:

SessionPresent : The session present flag tells the client whether the broker already has a persistent session from previous interactions with this client. SessionPresent is related to the CleanSession flag. When the client connects with CleanSession set to true, SessionPresent is always false because there is no persistent session to use. If CleanSession is set to false, there are two possibilities: if the broker has stored session information for the ClientId, SessionPresent is true; otherwise SessionPresent is false.
ReturnCode : The second field in the CONNACK message is the connect return code, which tells the client whether the connection attempt was successful.

For a detailed description of each return code, please refer to https://docs.oasis-open.org/mqtt/mqtt/v3.1.1/os/mqtt-v3.1.1-os.html#_Toc398718035
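As a small illustration, the sketch below decodes a CONNACK packet in Python. The return code meanings are the ones listed in the MQTT 3.1.1 specification linked above; `parse_connack` itself is a hypothetical helper, not a library function.

```python
from typing import Tuple

# Connect return codes defined by MQTT 3.1.1; other values are reserved.
CONNACK_RETURN_CODES = {
    0: "connection accepted",
    1: "unacceptable protocol version",
    2: "identifier rejected",
    3: "server unavailable",
    4: "bad user name or password",
    5: "not authorized",
}


def parse_connack(packet: bytes) -> Tuple[bool, int, str]:
    """Return (session_present, return_code, meaning) for a 4-byte CONNACK."""
    if len(packet) != 4 or packet[0] != 0x20 or packet[1] != 0x02:
        raise ValueError("not a valid MQTT 3.1.1 CONNACK packet")
    session_present = bool(packet[2] & 0x01)  # bit 0 of the acknowledge flags
    return_code = packet[3]
    return session_present, return_code, CONNACK_RETURN_CODES.get(return_code, "reserved")


accepted = parse_connack(bytes([0x20, 0x02, 0x01, 0x00]))
refused = parse_connack(bytes([0x20, 0x02, 0x00, 0x05]))
```

In the first packet the broker accepts the connection and reports an existing session; in the second it refuses the client as not authorized.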

`Message type`

`Publish`

Once the MQTT client is connected to the broker, it can publish messages. MQTT uses topic-based filtering. Each message should contain a topic, and the broker uses the topic to send the message to interested clients. In addition, each message contains a payload, which holds the data to be sent as bytes.

MQTT is data-agnostic, which means that the publisher decides whether the data is sent as XML, JSON, binary data, or plain text.

The PUBLISH message structure in MQTT is as follows.

Packet Identifier : The PacketId uniquely identifies a message as it flows between the client and the broker. It is only relevant for QoS levels greater than zero.
TopicName : The topic name is a simple string in which / separates the levels of the hierarchy.
QoS : This number is the quality of service level. There are three levels: 0, 1, and 2. The level determines what kind of guarantee a message has of reaching its recipient (client or broker).
RetainFlag : This flag tells the broker to save the message as the retained message for its topic on the server side (in memory or on file).

The MQTT server will only save the most recently received message with the RETAIN flag bit true for each topic. In other words, if a retained message has been saved for a topic on the MQTT server, when the client publishes a new retained message again, the original message on the server will be overwritten.

Payload : This is the actual content of the message. MQTT is data-agnostic, so text, images, encrypted data, and binary data can all be sent.
DupFlag : This flag indicates that the message is a duplicate that was resent because the intended recipient (client or broker) did not acknowledge the original. It is only relevant for QoS levels greater than 0.
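The fields above map onto bytes roughly as in the following Python sketch, which follows the MQTT 3.1.1 layout. `build_publish` and `encode_remaining_length` are helper names made up for this article; a real client library does this work for you.

```python
import struct
from typing import Optional


def encode_remaining_length(length: int) -> bytes:
    """Variable-length encoding: 7 bits per byte, high bit set if more bytes follow."""
    out = bytearray()
    while True:
        length, digit = divmod(length, 128)
        if length > 0:
            digit |= 0x80
        out.append(digit)
        if length == 0:
            return bytes(out)


def build_publish(topic: str, payload: bytes, qos: int = 0, retain: bool = False,
                  dup: bool = False, packet_id: Optional[int] = None) -> bytes:
    if qos not in (0, 1, 2):
        raise ValueError("QoS must be 0, 1, or 2")
    if qos > 0 and packet_id is None:
        raise ValueError("a packet identifier is required when QoS > 0")
    # First byte: packet type 3 (PUBLISH) plus the DUP, QoS, and RETAIN bits.
    first_byte = 0x30 | (int(dup) << 3) | (qos << 1) | int(retain)
    topic_bytes = topic.encode("utf-8")
    body = struct.pack("!H", len(topic_bytes)) + topic_bytes
    if qos > 0:
        body += struct.pack("!H", packet_id)  # only present for QoS 1 and 2
    body += payload                           # payload is opaque bytes
    return bytes([first_byte]) + encode_remaining_length(len(body)) + body


qos0 = build_publish("a/b", b"hi")
qos1 = build_publish("a/b", b"hi", qos=1, retain=True, packet_id=10)
```

A QoS 0 message with topic `a/b` and payload `hi` takes 9 bytes in total; the QoS 1 variant adds two bytes for the packet identifier.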

When the client sends a message to the broker, the broker will read the message, confirm the message according to the Qos level, and then process the message. Processing messages is actually determining which subscribers have subscribed to the topic and sending the messages to them.

The client who originally published the message only cares about sending the PUBLISH message to the broker. Once the broker receives the PUBLISH message, the broker is responsible for delivering it to all subscribers. The client who publishes the message does not know whether anyone is interested in the published message, nor does it know how many clients have received messages from the broker.

`Subscribe`

To receive messages on the topics it is interested in, the client sends a SUBSCRIBE message to the broker. This SUBSCRIBE message is very simple. It contains a unique packet identifier and a list of subscriptions.

Packet Identifier : This PacketId is the same kind of identifier as the PacketId above; it uniquely identifies the message.
ListOfSubscriptions : A SUBSCRIBE message can contain multiple subscriptions for a client. Each subscription consists of a topic and a QoS level. The topic in a subscription can contain wildcards.
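Here is a hedged Python sketch of how such a SUBSCRIBE message could be assembled according to the MQTT 3.1.1 layout. `build_subscribe` is a hypothetical helper, and it only supports packets whose length fits in a single byte.

```python
import struct
from typing import List, Tuple


def build_subscribe(packet_id: int, subscriptions: List[Tuple[str, int]]) -> bytes:
    """Build a SUBSCRIBE packet from (topic filter, requested QoS) pairs."""
    if not subscriptions:
        raise ValueError("a SUBSCRIBE message needs at least one subscription")
    body = struct.pack("!H", packet_id)
    for topic_filter, qos in subscriptions:
        if qos not in (0, 1, 2):
            raise ValueError("QoS must be 0, 1, or 2")
        data = topic_filter.encode("utf-8")
        # Each entry: 2-byte length, the topic filter, then one QoS byte.
        body += struct.pack("!H", len(data)) + data + bytes([qos])
    if len(body) > 127:
        raise ValueError("this sketch only handles lengths that fit in one byte")
    return bytes([0x82, len(body)]) + body  # 0x82 = SUBSCRIBE with required flag bits


packet = build_subscribe(1, [("a/+", 1), ("b/#", 0)])
```

One packet carries both subscriptions, each with its own requested QoS level.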

`Subscription acknowledgment`

After the client sends a SUBSCRIBE message to the broker, the broker confirms each subscription by sending a SUBACK message back to the client. The SUBACK contains the packetId of the original SUBSCRIBE message and a list of return codes.

Where:

Packet Identifier : This packet identifier is the same as the one in the SUBSCRIBE message.
ReturnCode : The broker sends one return code for each topic/QoS pair in the SUBSCRIBE message it received. For example, if the SUBSCRIBE message contains five subscriptions, the SUBACK message contains five return codes.
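The pairing of return codes with subscriptions can be sketched in Python as follows. The return code values are those defined by MQTT 3.1.1 (granted QoS 0, 1, 2, or failure); `parse_suback` is an illustrative helper that assumes a short packet with a one-byte length field.

```python
import struct
from typing import Dict, List, Tuple

SUBACK_RETURN_CODES = {
    0x00: "granted QoS 0",
    0x01: "granted QoS 1",
    0x02: "granted QoS 2",
    0x80: "failure",
}


def parse_suback(packet: bytes, topic_filters: List[str]) -> Tuple[int, Dict[str, str]]:
    """Match each return code with the topic filter from the original SUBSCRIBE."""
    if len(packet) < 5 or packet[0] != 0x90 or packet[1] != len(packet) - 2:
        raise ValueError("not a valid (short) SUBACK packet")
    (packet_id,) = struct.unpack("!H", packet[2:4])
    codes = packet[4:]
    if len(codes) != len(topic_filters):
        raise ValueError("SUBACK must carry one return code per subscription")
    results = {
        topic_filter: SUBACK_RETURN_CODES.get(code, "reserved")
        for topic_filter, code in zip(topic_filters, codes)
    }
    return packet_id, results


packet_id, results = parse_suback(bytes([0x90, 0x04, 0x00, 0x01, 0x01, 0x80]), ["a/+", "b/#"])
```

The return codes come back in the same order as the subscriptions were sent, which is why the client has to remember the original list.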

So far we have discussed three types of messages: publish, subscribe, and subscription acknowledgment. The schematic diagrams of these three messages are as follows.

`Unsubscribe`

The counterpart of the SUBSCRIBE message is the UNSUBSCRIBE message. After this message is sent, the broker deletes the client's subscriptions. The UNSUBSCRIBE message is similar to the SUBSCRIBE message: it has a packetId and a list of topics.

`Confirm unsubscribe`

Canceling a subscription also requires confirmation from the broker. The broker sends an UNSUBACK message to the client. The UNSUBACK message is very simple and contains only a packetId.

The process of unsubscription and confirmation of unsubscription is as follows.

When the client receives the UNSUBACK message from the broker, it can be considered that the subscription in the UNSUBSCRIBE message has been deleted.
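The round trip can be sketched in Python like this, again following the MQTT 3.1.1 layout. `build_unsubscribe` and `parse_unsuback` are hypothetical helper names, and the builder only handles packets whose length fits in one byte.

```python
import struct
from typing import List


def build_unsubscribe(packet_id: int, topic_filters: List[str]) -> bytes:
    """UNSUBSCRIBE carries a packetId and a list of topic filters (no QoS)."""
    if not topic_filters:
        raise ValueError("an UNSUBSCRIBE message needs at least one topic")
    body = struct.pack("!H", packet_id)
    for topic_filter in topic_filters:
        data = topic_filter.encode("utf-8")
        body += struct.pack("!H", len(data)) + data
    if len(body) > 127:
        raise ValueError("this sketch only handles lengths that fit in one byte")
    return bytes([0xA2, len(body)]) + body  # 0xA2 = UNSUBSCRIBE with required flag bits


def parse_unsuback(packet: bytes) -> int:
    """UNSUBACK contains nothing but the packetId of the UNSUBSCRIBE it confirms."""
    if len(packet) != 4 or packet[0] != 0xB0 or packet[1] != 0x02:
        raise ValueError("not a valid UNSUBACK packet")
    (packet_id,) = struct.unpack("!H", packet[2:4])
    return packet_id


request = build_unsubscribe(7, ["a/+"])
confirmed_id = parse_unsuback(bytes([0xB0, 0x02, 0x00, 0x07]))
```

The client matches the packetId in the UNSUBACK against the one it sent to know which request was confirmed.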

`Topics`

We have talked a lot about MQTT, but we haven't yet explained what a Topic is. A Topic is a UTF-8 string that the broker uses to filter messages for each connected client. A topic is hierarchical and consists of one or more topic levels, with each level separated by a forward slash (/).

Compared with traditional message queues, MQTT Topic is very lightweight. The client does not need to create the required Topic before publishing or subscribing, and the broker does not need to perform initialization operations before receiving each topic.

Wildcards

When a client subscribes to a topic, it can subscribe to the exact topic of the published messages, or use wildcards to subscribe to multiple topics at the same time. There are two types of wildcards: single-level and multi-level.

`Single-level wildcard`

A single-level wildcard replaces exactly one topic level. The + symbol represents the single-level wildcard in a topic.

A topic matches a subscription with a single-level wildcard if it contains an arbitrary string in place of the wildcard. For example, a subscription to /groundfloor/+/temperature matches any topic that has exactly one level between /groundfloor and /temperature.

`Multi-level wildcard`

A multi-level wildcard covers multiple topic levels. The # symbol represents the multi-level wildcard in a topic. So that the broker can determine which topics match, the multi-level wildcard must be the last character of the topic and, unless it is the whole topic, must be preceded by a forward slash (/).

For example, consider a subscription to /groundfloor/#.

When a client subscribes to a topic with a multi-level wildcard, it receives all messages on topics that begin with the pattern before the wildcard, no matter how long or deep the topic is. If you subscribe to just #, you receive all messages.
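Both wildcard rules can be expressed in a short Python function. This is a simplified sketch: `topic_matches` is a name invented here, the example topics (kitchen, fridge, and so on) are made up, and special cases such as topics beginning with $ are ignored.

```python
def topic_matches(topic_filter: str, topic: str) -> bool:
    """Return True if topic matches topic_filter, honouring + and # wildcards."""
    filter_levels = topic_filter.split("/")
    topic_levels = topic.split("/")
    for index, level in enumerate(filter_levels):
        if level == "#":
            # Multi-level wildcard: valid only as the last level, matches the rest.
            return index == len(filter_levels) - 1
        if index >= len(topic_levels):
            return False  # the topic is shorter than the filter
        if level != "+" and level != topic_levels[index]:
            return False  # + matches any single level, otherwise names must be equal
    return len(filter_levels) == len(topic_levels)


single_ok = topic_matches("/groundfloor/+/temperature", "/groundfloor/kitchen/temperature")
single_wrong_name = topic_matches("/groundfloor/+/temperature", "/groundfloor/kitchen/humidity")
single_too_deep = topic_matches("/groundfloor/+/temperature", "/groundfloor/kitchen/fridge/temperature")
multi_ok = topic_matches("/groundfloor/#", "/groundfloor/kitchen/fridge/temperature")
multi_other_floor = topic_matches("/groundfloor/#", "/firstfloor/kitchen/temperature")
everything = topic_matches("#", "anything/at/all")
```

Note how + refuses to span two levels (kitchen/fridge), while # accepts a topic of any depth below /groundfloor.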
