The next frontier of Kubernetes: multi-cluster management


Text | Jin Min (Technical Expert, Ant Group), Qiu Jian (Red Hat)

Proofreading | Feng Yong (Senior Technical Expert, Ant Group)

3,311 words | about a 6-minute read

Since the advent of Kubernetes, the industry has repeatedly questioned whether the technology could withstand production-level demands. Today, many large enterprises have adopted Kubernetes to modernize their existing large-scale infrastructure, creating dozens or even hundreds of clusters inside a single company.

The native management capabilities of Kubernetes, however, remain at the single-cluster level. Each cluster can operate stably and autonomously, but there is no built-in way to coordinate and manage across multiple clusters. Infrastructure builders must knit the scattered management components together into a unified management platform.

Through such a platform, operations managers can track changes in resource utilization ("water levels") and fluctuations in node health across clusters; business application owners can decide how application services are deployed and distributed among the clusters; and application operations staff can monitor service status and issue migration and failover strategies.

Innovations around multi-cluster management keep emerging. For example, Cluster API and Submariner are successful projects that each address a specific multi-cluster management problem.

What this article discusses are the technical explorations that try to address the full range of problems enterprises encounter in multi-cluster management.

Over the past five years, the Chinese technology company Ant Group has accumulated a great deal of valuable experience in thinking about, using, and implementing multi-cluster management.

Ant Group manages dozens of Kubernetes clusters around the world, each averaging thousands of nodes (servers). At the architectural level, applications and their required components (including middleware, databases, and load balancers) are organized into flexible logical units called Logical Data Centers (LDCs), which are then mapped onto the physical infrastructure. This design helps achieve two key goals of infrastructure operations: high availability and transactionality.

  • First, business applications deployed in a given LDC are guaranteed availability within that LDC.
  • Second, the application components deployed in an LDC can be verified, and rolled back when a failure occurs.

Feng Yong, senior technical expert of the Ant Group PaaS team, said:

"Ant Group has dozens of Kubernetes clusters, hundreds of thousands of nodes, and thousands of critical applications in its infrastructure. In such a cloud-native infrastructure, tens of thousands of Pods are created and deleted every day. Building a highly available, scalable, and secure platform to manage these clusters and applications is a real challenge."

PART. 1 Starting with KubeFed

In the Kubernetes project ecosystem, multi-cluster functionality is primarily handled by SIG-Multicluster. In 2017, this group developed a cluster federation technology called KubeFed.

Federation was initially expected to become a built-in Kubernetes feature, but it soon ran into a split between its implementation and real user demands. Federation v1 could distribute Services to multiple Kubernetes clusters, but could not handle other types of objects, nor could it truly "manage" clusters in any meaningful way. A few users with specialized needs, notably some academic laboratories, still use it, but the project has been archived by the Kubernetes community and never became a core feature.

Federation v1 was then quickly replaced by a refactored design called "KubeFed v2", which is used by operations teams worldwide. It allows a single Kubernetes cluster to deploy multiple objects to multiple other Kubernetes clusters. KubeFed v2 also allows the "control plane" host cluster to manage the other clusters, including their resources and policies at scale. This was the first-generation solution for Ant Group's multi-cluster management platform.
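To make this concrete, a minimal sketch of how KubeFed v2 distributes an object follows. A federated resource on the host cluster wraps an ordinary template and adds placement (which member clusters receive it) and per-cluster overrides; the names `cluster1`, `cluster2`, and the replica counts here are illustrative assumptions, not from the article:

```yaml
# Sketch of a KubeFed v2 federated workload on the host cluster.
# cluster1/cluster2 and the replica counts are hypothetical examples.
apiVersion: types.kubefed.io/v1beta1
kind: FederatedDeployment
metadata:
  name: demo-app
  namespace: demo
spec:
  template:                 # an ordinary Deployment spec, used as the base
    metadata:
      labels:
        app: demo
    spec:
      replicas: 3
      selector:
        matchLabels:
          app: demo
      template:
        metadata:
          labels:
            app: demo
        spec:
          containers:
          - name: demo
            image: nginx
  placement:                # which member clusters receive this object
    clusters:
    - name: cluster1
    - name: cluster2
  overrides:                # per-cluster deviations from the template
  - clusterName: cluster2
    clusterOverrides:
    - path: "/spec/replicas"
      value: 5
```

Note that every propagated type needs its own `Federated*` wrapper kind like this one, a point that becomes a limitation below.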

One of the primary goals of Ant Group's use of multi-cluster federation was resource elasticity, at both the node level and the site level: the ability to expand the system by adding nodes, or entire clusters, when needed. A recurring example is annual resource elasticity: November 11 is Singles' Day, China's annual online-shopping festival, and Ant Group typically needs to deploy a large amount of additional capacity quickly to support the peak workload. Unfortunately, as they discovered, KubeFed was slow at adding new clusters and inefficient at managing a large number of them.

In KubeFed v2, a central Kubernetes cluster acts as a single "control plane" for all the other clusters. Ant Group found that the resource utilization of this central cluster became very high when it managed the hosted clusters and the applications running in them.

In a test managing application workloads that accounted for only 3% of Ant Group's total, they found that the central cluster, built from medium-sized cloud instances, was already saturated and responding slowly. As a result, they never ran the entire workload on KubeFed.

The second limitation concerns Kubernetes's extension mechanism, Custom Resource Definitions (CRDs). "Advanced users" like Ant Group often develop numerous custom resources to extend management capabilities. To distribute a CRD across clusters, KubeFed requires creating a corresponding "federated CRD" for each one. This not only doubles the number of objects in the cluster, it also creates serious problems in keeping CRD versions and API versions consistent between clusters, and prevents applications from upgrading smoothly when CRD or API versions diverge.
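As a sketch of the duplication involved: for each target CRD, KubeFed needs both a generated `Federated<Kind>` CRD and a `FederatedTypeConfig` object linking the two (normally produced by `kubefedctl enable`). The `MyWidget` type below is a hypothetical example used only to show the shape of the mapping:

```yaml
# Sketch: the extra "joint CRD" wiring KubeFed needs per custom type.
# MyWidget / example.com is a hypothetical CRD, not from the article.
apiVersion: core.kubefed.io/v1beta1
kind: FederatedTypeConfig
metadata:
  name: mywidgets.example.com
  namespace: kube-federation-system
spec:
  propagation: Enabled
  targetType:               # the real CRD on member clusters
    group: example.com
    kind: MyWidget
    pluralName: mywidgets
    scope: Namespaced
    version: v1
  federatedType:            # the generated wrapper CRD on the host cluster
    group: types.kubefed.io
    kind: FederatedMyWidget
    pluralName: federatedmywidgets
    scope: Namespaced
    version: v1beta1
```

Each such pairing is one more schema whose versions must stay in lockstep across every cluster.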

This rapid increase in the number of CRDs also made troubleshooting much harder, while bad habits such as improperly defined CRDs and arbitrary field changes degraded the robustness of the KubeFed control plane. When a member cluster has custom resources, the federated control plane holds only an aggregated view of them; if something goes wrong in the member cluster, it is hard to tell from the federated control plane where the problem lies. Operation logs and resource events on the member cluster are likewise invisible at the federation level.

PART. 2 Moving to Open Cluster Management

The Open Cluster Management (OCM) project was originally developed by IBM and open-sourced by Red Hat last year. Drawing on the experience of Ant Group and other partners, OCM improves on earlier approaches to multi-cluster management. It decentralizes the management overhead from the central cluster to an agent on each managed cluster, making management distributed, autonomous, and stable across the infrastructure. In theory, this lets OCM manage at least an order of magnitude more clusters than KubeFed; so far, users have tested managing up to 1,000 clusters simultaneously.

OCM can also ride on the evolution of Kubernetes itself to improve its capabilities. For example, capability extensions encapsulated in CRDs can use OCM's Work API (a sub-project being proposed to SIG-Multicluster) to distribute Kubernetes objects between clusters. The Work API embeds a subset of local Kubernetes resources as the definition of the objects to be deployed, and leaves the actual deployment to the agent. This model is more flexible and minimizes the deployment demands on any central control plane. The Work API can define multiple versions of a resource together to support application upgrade paths. It also handles state maintenance when the network link between the central cluster and a managed cluster fails, guaranteeing eventual consistency of resource state after reconnection.
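A minimal sketch of this model, using OCM's `ManifestWork` resource: the hub writes the desired objects into a `ManifestWork` in the managed cluster's namespace, and the agent on that cluster pulls and applies them. The cluster name `cluster1` and the workload below are illustrative assumptions:

```yaml
# Sketch of an OCM ManifestWork created on the hub cluster.
# "cluster1" (the managed cluster's namespace on the hub) and the
# nginx Deployment are hypothetical examples.
apiVersion: work.open-cluster-management.io/v1
kind: ManifestWork
metadata:
  name: demo-work
  namespace: cluster1       # one namespace per managed cluster on the hub
spec:
  workload:
    manifests:              # plain Kubernetes objects, applied by the agent
    - apiVersion: apps/v1
      kind: Deployment
      metadata:
        name: demo
        namespace: default
      spec:
        replicas: 2
        selector:
          matchLabels:
            app: demo
        template:
          metadata:
            labels:
              app: demo
          spec:
            containers:
            - name: demo
              image: nginx
```

Because the agent pulls this specification, the hub never needs direct access to the managed cluster's API server, which is what keeps the central control plane lightweight.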

Most importantly, OCM automates more of cluster onboarding. In KubeFed, cluster registration is a "two-way handshake" built on "zero trust" between the central cluster and the managed cluster, and the process involves many manual steps to ensure security. The new platform simplifies this. For example, because it operates on a "pull" basis, a managed cluster can receive management commands from the central cluster without multi-stage manual certificate registration, and without any clear-text KubeConfig credentials being passed around.

Although the registration process still establishes two-way "trust", adding a new cluster to OCM requires very little work: operators simply deploy a "Klusterlet" agent on the target Kubernetes cluster to bring it under management. This is not only easier for administrators, it also means Ant Group can prepare and roll out the additional clusters it needs for Double Eleven much faster.
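For illustration, a sketch of the Klusterlet resource that describes the agent on a managed cluster (in practice this is typically generated by tooling such as `clusteradm join` rather than written by hand). The cluster name and image tags here are assumptions:

```yaml
# Sketch of the Klusterlet agent configuration on a managed cluster.
# clusterName and the image tags are hypothetical examples.
apiVersion: operator.open-cluster-management.io/v1
kind: Klusterlet
metadata:
  name: klusterlet
spec:
  clusterName: cluster1     # the name this cluster registers under on the hub
  namespace: open-cluster-management-agent
  registrationImagePullSpec: quay.io/open-cluster-management/registration:latest
  workImagePullSpec: quay.io/open-cluster-management/work:latest
```

Once the agent starts, it initiates registration toward the hub, and an administrator on the hub side accepts the join request, completing the two-way trust without any credentials flowing from hub to cluster.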

PART. 3 What is the next step for Kubernetes multi-cluster?

In just four years, the Kubernetes community's multi-cluster management capabilities have developed rapidly: from Federation v1 to KubeFed v2 to Open Cluster Management.

Thanks to the talented engineers working in SIG-Multicluster and on external projects (OCM, Submariner, and others), both the scale and the functionality of multi-cluster management have improved far beyond where they started.

Will there be a new platform to further develop multi-cluster functions in the future, or is OCM the ultimate implementation?

Feng Yong thinks so:

"Looking forward, with the joint efforts of Red Hat, Ant Group, Alibaba Cloud and other participants, the Open Cluster Management project will become the standard and backplane for building multi-cluster solutions based on Kubernetes."

In any case, one thing is clear:

You can now run the entire planet on Kubernetes.

To learn more about cloud-native topics, join the Cloud Native Computing Foundation and the cloud-native community at KubeCon + CloudNativeCon North America 2021, October 11-15, 2021.
