With Kubernetes (K8s) now fully mature, more and more organizations are building their infrastructure layers on K8s at scale.
According to Sysdig's research report on container orchestration, K8s holds a market share of up to 75%. Among K8s deployment and operations tools, Operator and Helm are the two mainstream options. However, because Helm lacks lifecycle management, Operator has become the go-to choice for managing the whole application lifecycle.
A K8s Operator is an application-specific controller that continuously watches change events on K8s resource objects, monitors and responds throughout the application lifecycle, and completes deployment and delivery with high reliability. In essence, an Operator is a framework for encoding operational knowledge: it turns operations and maintenance experience into code, making O&M programmable, automated, and intelligent.
EMQX Operator was created to manage the whole lifecycle of EMQX, the cloud-native distributed MQTT messaging server. With EMQX Operator, you can easily build an MQTT cluster with millions of connections, even in K8s environments with complex network and storage setups. This article uses EMQX Operator to build an MQTT service with one million connections on K8s and verifies the result through testing.
What is EMQX Operator
EMQX is a cloud-native distributed MQTT messaging server built on the Erlang/OTP platform. As cloud-native concepts have taken hold and K8s and the Operator pattern have become widespread, we developed EMQX Operator ( https://github.com/emqx/emqx-operator ), which can quickly create and manage EMQX clusters in a Kubernetes environment, manage the EMQX lifecycle, and greatly simplify the process of deploying and managing EMQX clusters. Its main advantages are:
- Reduce EMQX deployment cost in K8s environment
- Provides basic capabilities for persistent data backup and recovery
- Provides the ability to independently deploy, manage, and configure persistence for EMQX Plugin (coming soon)
- Dynamically update the License, SSL/TLS certificates, and other settings
- Automated operation and maintenance (high availability, capacity expansion, exception handling)
Use EMQX Operator to build a million-connection MQTT cluster on K8s
Kernel parameter tuning
To maximize EMQX performance, we tune the kernel parameters on each worker node by running the following script as root:
#!/bin/bash
echo "DefaultLimitNOFILE=100000000" >> /etc/systemd/system.conf
echo "session required pam_limits.so" >> /etc/pam.d/common-session
echo "* soft nofile 10000000" >> /etc/security/limits.conf
echo "* hard nofile 100000000" >> /etc/security/limits.conf
# lsmod |grep -q conntrack || modprobe ip_conntrack
cat >> /etc/sysctl.d/99-sysctl.conf <<EOF
net.ipv4.tcp_tw_reuse=1
fs.nr_open=1000000000
fs.file-max=1000000000
net.ipv4.ip_local_port_range=1025 65534
net.ipv4.udp_mem=74583000 499445000 749166000
net.core.somaxconn=32768
net.ipv4.tcp_max_syn_backlog=163840
net.core.netdev_max_backlog=163840
net.core.optmem_max=16777216
net.ipv4.tcp_rmem=1024 4096 16777216
net.ipv4.tcp_wmem=1024 4096 16777216
net.ipv4.tcp_max_tw_buckets=1048576
net.ipv4.tcp_fin_timeout=15
net.core.rmem_default=262144000
net.core.wmem_default=262144000
net.core.rmem_max=262144000
net.core.wmem_max=262144000
net.ipv4.tcp_mem=378150000 504200000 756300000
# net.netfilter.nf_conntrack_max=1000000
# net.netfilter.nf_conntrack_tcp_timeout_time_wait=30
EOF
sysctl --system
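To confirm the tuning took effect, a few quick checks could be run on each worker node. This is a minimal sketch; the limits.conf and system.conf changes only become visible after a new login session or a reboot:
$ sysctl net.core.somaxconn net.ipv4.tcp_max_syn_backlog fs.file-max
$ ulimit -n                                        # open-file limit of the current shell
$ systemctl show --property=DefaultLimitNOFILE     # systemd default, applied after reboot/daemon-reexec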
Deployment and installation
Deploy 5 EMQX Pods
Install Cert Manager dependencies
$ kubectl apply -f https://github.com/cert-manager/cert-manager/releases/download/v1.8.0/cert-manager.yaml
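Before installing the Operator, it may help to confirm that cert-manager is up and available. The namespace and resources below are those created by the official cert-manager manifest applied above; adjust them if cert-manager is installed differently:
$ kubectl get pods -n cert-manager
$ kubectl wait --for=condition=Available deployment --all -n cert-manager --timeout=120s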
Install EMQX Operator
$ helm repo add emqx https://repos.emqx.io/charts
$ helm repo update
$ helm install emqx-operator emqx/emqx-operator \
    --set installCRDs=true \
    --namespace emqx-operator-system \
    --create-namespace
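Optionally, confirm that the Helm release exists and that the EMQX CRDs were registered. The grep pattern below is an assumption based on the apps.emqx.io API group used later in this article:
$ helm ls -n emqx-operator-system
$ kubectl get crd | grep emqx.io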
Check EMQX Operator Controller Status
$ kubectl get pods -l "control-plane=controller-manager" -n emqx-operator-system
NAME                                                READY   STATUS    RESTARTS   AGE
emqx-operator-controller-manager-68b866c8bf-kd4g6   1/1     Running   0          15s
Deploy EMQX
cat << "EOF" | kubectl apply -f - apiVersion: apps.emqx.io/v1beta2 kind: EmqxEnterprise metadata: name: emqx-ee labels: cluster: emqx spec: image: emqx/emqx-ee:4.4.1 env: - name: "EMQX_NODE__DIST_BUFFER_SIZE" value: "16MB" - name: "EMQX_NODE__PROCESS_LIMIT" value: "2097152" - name: "EMQX_NODE__MAX_PORTS" value: "1048576" - name: "EMQX_LISTENER__TCP__EXTERNAL__ACCEPTORS" value: "64" - name: "EMQX_LISTENER__TCP__EXTERNAL__BACKLOG" value: "1024000" - name: "EMQX_LISTENER__TCP__EXTERNAL__MAX_CONNECTIONS" value: "1024000" - name: "EMQX_LISTENER__TCP__EXTERNAL__MAX_CONN_RATE" value: "100000" emqxTemplate: license: "your license string" listener: type: LoadBalancer annotations: service.beta.kubernetes.io/alibaba-cloud-loadbalancer-address-type: "intranet" service.beta.kubernetes.io/alibaba-cloud-loadbalancer-spec: "slb.s3.large" EOF
View EMQX Deployment Status
$ kubectl get pods
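Once all the emqx-ee Pods report Running, the LoadBalancer address and the cluster status can be inspected as sketched below. The service and Pod names are assumptions based on the CR name above (StatefulSet Pods are typically named <name>-0, <name>-1, ...); use kubectl get svc,pods to see the actual names:
$ kubectl get svc                                   # note the EXTERNAL-IP of the LoadBalancer listener
$ kubectl exec -it emqx-ee-0 -- emqx_ctl status
$ kubectl exec -it emqx-ee-0 -- emqx_ctl cluster status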
The EMQX cluster consists of 5 Pods, and the resources of each Pod are configured as follows:
$ kubectl get emqx-ee emqx-ee -o json | jq ".spec.replicas"
5
$ kubectl get emqx-ee emqx-ee -o json | jq ".spec.resources"
{
"limits": {
"cpu": "20",
"memory": "20Gi"
},
"requests": {
"cpu": "4",
"memory": "4Gi"
}
}
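For reference, the same requests and limits could also be set on the custom resource directly, for example with a merge patch. The .spec.resources field path is inferred from the jq output above; verify it against the CRD version of your EMQX Operator before applying:
$ kubectl patch emqx-ee emqx-ee --type merge -p \
    '{"spec":{"resources":{"requests":{"cpu":"4","memory":"4Gi"},"limits":{"cpu":"20","memory":"20Gi"}}}}'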
Verifying the build results
Test Environment
This test uses Alibaba Cloud's ACK proprietary (dedicated) cluster service with the Flannel network plug-in. Three ecs.c7.2xlarge instances running CentOS 7.9 serve as master nodes, and five ecs.c7.16xlarge instances running CentOS 7.9 serve as worker nodes. The test tool is XMeter, and the load-generator machines and the ACK load balancer are in the same VPC network.
Test Scenarios
- 1 million clients connect to EMQX using the MQTT 5.0 protocol
- 500k publish clients and 500k subscribe clients
- Each publish client publishes one message per second with QoS 1 and a 1 KB payload
- Each corresponding subscribe client consumes one message per second (a command-line sketch of a comparable load is shown below)
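The test itself was driven by XMeter. For readers without XMeter, a roughly comparable load could be approximated with EMQ's open-source emqtt-bench tool. The host placeholder, topic scheme, and flag values below are illustrative assumptions rather than the actual XMeter configuration (check emqtt_bench pub --help for the flags supported by your version), and in practice the load would be spread across many load-generator machines:
# 500k subscribers, MQTT 5.0, QoS 1, one topic per client (bench/<n>)
$ emqtt_bench sub -h <lb-address> -p 1883 -V 5 -c 500000 -i 2 -t bench/%i -q 1
# 500k publishers, MQTT 5.0, QoS 1, 1 KB payload, one message per second each
$ emqtt_bench pub -h <lb-address> -p 1883 -V 5 -c 500000 -i 2 -I 1000 -t bench/%i -q 1 -s 1024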
Test Results
As shown in the EMQX Enterprise Dashboard monitoring below, the cluster under test has 5 nodes in total, the actual access reaches 1 million connections and 500k subscriptions, and both the message inflow (publish) rate and the message outflow (consume) rate reach 500,000 messages per second:
The resource consumption of EMQX Enterprise Pods during message throughput is as follows:
The XMeter test tool results are detailed as follows:
Conclusion
The verification above shows that an EMQX cluster deployed on K8s with EMQX Operator can easily handle one million MQTT connections. As K8s adoption grows, more users will choose Operators to deploy and operate cloud-native applications on K8s. EMQ will continue to improve EMQX Operator to help users simplify the deployment and management of EMQX clusters, fully enjoy the convenience the cloud brings in the cloud-native era, and explore the capabilities EMQX offers for real-time IoT data movement, processing, and integration.
Copyright statement: This article is original by EMQ, please indicate the source when reprinting.
Original link: https://www.emqx.com/zh/blog/building-a-million-connection-mqtt-service-on-k8s