The grievances between Docker and k8s (5): Kubernetes' innovation

Please credit the source when reprinting: the Grape City official website. Grape City provides developers with professional development tools, solutions, and services.

In the previous section, we saw that a healthy community ecosystem helped Kubernetes grow and spread. Compared with the relatively closed Docker community, the open CNCF community was far more successful, but community vitality alone does not explain why Docker was defeated so quickly. The fundamental reason is that Kubernetes understood container orchestration better than Docker did. That advantage was close to an overwhelming "dimensionality reduction" strike, and Docker had no way to fight back.

Next, I will explain how Kubernetes gained the upper hand in this container war.

Container Orchestration

Container orchestration is, in essence, about managing the relationships between containers. In a large distributed system, containers do not exist as isolated individuals. They come singly or in groups, and those individuals and groups are intertwined with one another.


Docker's container orchestration function

Docker built a PaaS ecosystem with Docker containers at its core, including simple container relationship orchestration based on Docker Compose and an online operation and maintenance platform based on Docker Swarm. Users handle the relationships between the containers in their clusters through Docker Compose, and manage and maintain the clusters through Docker Swarm. All of this is essentially the PaaS functionality that Cloud Foundry offered at the time, with seamless integration with Docker containers as the main selling point.

What Docker Compose does is declare the "links" among several interacting containers, write them all in a docker-compose.yaml file, and then deploy them together (the ELK setup I will discuss later works this way). This has an advantage: it is very convenient for simple interactions among a handful of containers. For large clusters, however, it is a drop in the bucket, and a development model that needs a new feature for every new requirement makes the code very hard to maintain later on.
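
As a rough illustration of what such a file looks like, here is a minimal docker-compose.yaml sketch. The service names (web, cache), the images, and the port mapping are hypothetical and chosen only for illustration; they do not come from the setup discussed in this series.

```yaml
# Hypothetical docker-compose.yaml: two containers and the link between them.
# Service names, images, and ports are illustrative assumptions.
version: "3"          # required by older docker-compose releases; newer ones ignore it
services:
  web:
    image: nginx
    ports:
      - "8080:80"     # host port 8080 -> container port 80
    depends_on:
      - cache         # start the cache container before web
  cache:
    image: redis
```

Running docker-compose up in the same directory starts both containers together, and the web container can reach the other one under the hostname cache.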



If Kubernetes was going to compete with Docker, it could not limit itself to managing Docker containers, which Docker itself already supported. Doing only that, it would not even attract Docker's basic users, let alone rival Docker as an equal. So from the very start, Kubernetes was designed not to depend on Docker as its core. In Kubernetes, Docker is only one option for the container runtime. Users can swap in the container runtime they prefer, and Kubernetes provides interfaces for these runtimes. In addition, Kubernetes accurately identified a fatal weakness of Docker containers and built its own innovation around it.


Next, let us look at what exactly this dimensionality reduction strike against Docker was.

Kubernetes' container orchestration function


Unlike Docker, which can only deal with relationships from the perspective of individual containers, Kubernetes starts from software engineering design principles and divides relationships into different types. It defines the concept of a tight relationship (handled by Pods) and the concept of an interaction relationship (handled by Services), and then implements specific orchestration for each type.


At first glance this may be confusing. Here is an unrealistic but easy-to-understand analogy: if relationships between containers are like relationships between people, Docker can only handle them from the perspective of a single individual, dealing with person-to-person relationships. Kubernetes, by contrast, takes a God's-eye view. It can handle not only the relationships between people and the relationships between dogs but, most importantly, the relationships between people and dogs.


What implements the tight relationship described above is Kubernetes' innovation, the Pod.


The Pod is a concept introduced by Kubernetes, modeled on the Alloc in Borg. It is the smallest execution unit in which Kubernetes runs applications, and it consists of one or more tightly cooperating containers. It exists to address a fatal weakness of containers, the single-process model, by giving containers the concept of a process group. From the first section we know that a container is in essence a process: that process acts as the super process inside the container, and every other process must be its child. A container therefore has no concept of a process group, yet in everyday operation programs often run together as process groups.

Use Pod to handle close relationships

To explain how Pods handle tight relationships, here is an example of a process group:


Linux has a program, rsyslogd, that handles operating system logs. It consists of three modules: the imklog module, the imuxsock module, and rsyslogd's own main process. These three processes form a group and must run on the same machine; otherwise the socket-based communication and file exchange among them will break.


If this scenario is modeled in Docker, you have to describe it with three separate containers, and users must simulate the communication among the three themselves. That complexity may be far higher than the complexity of operating the containers. Docker Swarm also has its own problem when operating such a setup. Continuing the example, suppose each of the three modules needs 1GB of memory and the Docker Swarm cluster has two nodes, node-1 with 2.5GB free and node-2 with 3GB free. The three containers are started with docker run, and Swarm's affinity=main constraint requires all three to be scheduled onto the same machine. Swarm may well place the first two on node-1, and then the third fails to schedule because the remaining 0.5GB is not enough. This is a typical case of gang scheduling being handled badly, and it comes up often in Docker Swarm.


To meet these requirements, Kubernetes introduced the Pod to handle tight relationships. Containers in a Pod share the same Cgroups and Namespaces, so the usual isolation boundaries do not separate them. They can share the same network IP, use the same Volume to process data, and so on. The principle is to join multiple containers to the same shared resources. To avoid the topology problem of whether A joins B's resources or B joins A's, and of which of A and B has to start first, a Pod is actually composed of an Infra container plus the two containers A and B, and the Infra container is the first to start.


The Infra container is written in assembly language, and its main process stays permanently in a "paused" state. It uses very few resources, and its image is only about 100KB after decompression.
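
To see how a Pod avoids the scheduling failure in the rsyslogd example, consider the following sketch. The Pod name, container names, and image names are hypothetical (rsyslogd is not normally split into three images); the point is only that the three memory requests belong to one Pod. The Kubernetes scheduler places a Pod as a unit and compares the sum of its containers' requests, here 3G, with what each node can still offer. Node-1 with 2.5GB free would be ruled out, and all three containers would land on node-2 together.

```yaml
# Hypothetical Pod for the rsyslogd example; names and images are placeholders.
apiVersion: v1
kind: Pod
metadata:
  name: rsyslogd-pod
spec:
  containers:
  - name: main
    image: example/rsyslogd-main      # hypothetical image
    resources:
      requests:
        memory: "1G"                  # each module asks for 1GB
  - name: imklog
    image: example/rsyslogd-imklog    # hypothetical image
    resources:
      requests:
        memory: "1G"
  - name: imuxsock
    image: example/rsyslogd-imuxsock  # hypothetical image
    resources:
      requests:
        memory: "1G"
# The Pod is scheduled as a whole: a node needs 3G of free memory to accept it.
```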

Example demonstration of Pod in Kubernetes

After this introduction, let us look at what a Pod looks like in a concrete example.


We use the following yaml file and shell commands to run a Pod in any cluster with Kubernetes installed. Do not worry about the specific meaning of this YAML file for now; I will explain it later. For the moment it is enough to know that all resources in Kubernetes can be described by YAML or JSON files like this one, and that this file describes a Pod running busybox and nginx:
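
The listing could look like the following minimal sketch, reconstructed from the description in this article (API version v1, kind Pod, name hello-pod, process namespace sharing turned on, one nginx and one busybox container). The container names, the untagged images, and the stdin/tty settings are assumptions; the latter keep the busybox shell running so that it can be attached to.

```yaml
# hello-pod.yaml: a sketch based on the description in the text.
# Container names, image tags, and stdin/tty are assumptions.
apiVersion: v1
kind: Pod
metadata:
  name: hello-pod
spec:
  shareProcessNamespace: true   # all containers in the Pod see one process tree
  containers:
  - name: nginx
    image: nginx
  - name: busybox
    image: busybox
    stdin: true                 # keep the shell open
    tty: true
# Typical commands for trying it out (assumed, not taken from the original listing):
#   kubectl apply -f hello-pod.yaml
#   kubectl attach -it hello-pod -c busybox    # then run: ps ax
```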


After creating this hello-pod.yaml file, create the Pod from it with kubectl and list the processes from inside one of its containers.


With these commands, the Pod is created successfully. In the process list, the main process of the Infra container is PID 1, the super process of the whole Pod, which shows how the Pod is put together.


At this point, we should understand that a Pod is the smallest scheduling unit in Kubernetes, and that a Pod should be treated as a whole rather than as a collection of containers.
Let's take another look at YAML, the file format that describes this Pod.


YAML syntax definition:

YAML is a language designed for writing configuration files. It is concise and powerful, and it is far more convenient than JSON for describing configuration, so many newer projects such as Kubernetes and Docker Compose use YAML as their configuration language. Like HTML, YAML is an acronym: YAML Ain't Markup Language. Sharp-eyed readers will have noticed that this is a recursive acronym, a bit of programmer humor. Its syntax has the following characteristics:


- Keys and values are case sensitive

- Indentation indicates hierarchy, similar to Python

- Tabs are not allowed for indentation, only spaces

- The number of spaces used for indentation does not matter, as long as elements of the same level are left-aligned

- Array items are indicated by a dash (-)

- Null is indicated by a tilde (~)
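
The rules above can be seen together in a short sketch. The keys and values here are made up purely to show the syntax and have no meaning in Kubernetes.

```yaml
# Illustrative YAML only; the keys are invented to demonstrate the syntax rules.
name: demo          # keys are case sensitive:
Name: Demo          # "name" and "Name" are two different keys
server:             # nesting is expressed by indentation (spaces, never tabs)
  host: localhost   # siblings must be aligned on the left
  ports:            # array items are introduced by a dash
    - 80
    - 443
  proxy: ~          # a tilde means null
```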


With these concepts clear, we rewrite the YAML as JSON to see the difference between the two:
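
As a sketch of what the JSON form looks like, here is a Pod matching the description in this article (v1, kind Pod, name hello-pod, process namespace sharing on, nginx and busybox containers). The container names and the stdin/tty settings are assumptions. The block is fenced as YAML because a JSON document is also valid YAML flow syntax.

```yaml
# Hypothetical sketch. JSON does not allow comments, so drop these two
# lines when saving the content below as a .json file.
{
  "apiVersion": "v1",
  "kind": "Pod",
  "metadata": { "name": "hello-pod" },
  "spec": {
    "shareProcessNamespace": true,
    "containers": [
      { "name": "nginx", "image": "nginx" },
      { "name": "busybox", "image": "busybox", "stdin": true, "tty": true }
    ]
  }
}
```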


The two forms are equivalent in Kubernetes, and the JSON above runs normally, but Kubernetes still recommends YAML. The comparison also shows that JSON, which has served well elsewhere, looks slightly clumsy here and needs a lot of quotation marks and brackets.


With the syntax covered, let's look at what each node in the YAML above means in Kubernetes. Kubernetes has a notion, similar to Java's, that everything is an object. All internal resources, including Nodes (servers), Services, and Pods (groups of running containers), are stored in Kubernetes as objects, and every object consists of the following fixed parts:


- apiVersion: as the name suggests, this field specifies the API version. Its value cannot be chosen freely and must be one that Kubernetes defines. The basic resources we use here are all in the stable v1 version

- kind: specifies the type of the current configuration, such as Pod, Service, Ingress, or Node. Note that the first letter is capitalized

- metadata: describes the meta information of the current configuration, such as its name and labels

- spec: specifies the concrete implementation of the current configuration

Basically all Kubernetes objects follow this format, so the beginning of the Pod YAML file reads: "use the stable v1 version of the API, the kind is Pod, the name is hello-pod, and the spec turns on process namespace sharing and contains two containers."
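
To illustrate that other kinds of objects share the same four parts, here is a minimal Service sketch. The Service name, the app label, and the port numbers are hypothetical and only serve the illustration.

```yaml
# Hypothetical Service; the name, label, and ports are illustrative.
apiVersion: v1        # Service belongs to the stable v1 API
kind: Service         # object type, first letter capitalized
metadata:
  name: hello-service
spec:
  selector:
    app: hello        # route traffic to Pods labeled app=hello
  ports:
  - port: 80          # port exposed by the Service
    targetPort: 80    # port on the selected Pods
```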


With YAML covered, let us return to the main topic. Solving the single-process container problem is only one of the reasons the Pod was created. Another is that Pods let Google realize its own container design patterns, and Google has written up the container design patterns best suited to Kubernetes.


Here is the most commonly used example:


Unlike a .NET Core project, a Java web project cannot be run directly on the host once it is compiled. You must copy the compiled WAR package into the deployment directory of a host program such as Tomcat before it can be used. In practice, the larger the company, the finer the division of labor, and the team developing the Java project is quite likely not the team responsible for the host program.


So that the two teams can develop independently yet still work closely together, we can use a Pod.

The following yaml file defines a Pod that meets the above requirements:
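
The listing might look like the following sketch, which follows the description in this article: a Java container mounted at /app that runs as an initContainer and executes cp, a Tomcat container mounted at /root/apache-tomcat/webapps, and a shared in-memory volume named sample-volume. The Pod name, both image names, and the location of sample.war inside the Java image are hypothetical placeholders.

```yaml
# Sketch reconstructed from the description; the Pod name, the images, and
# the source path of sample.war are hypothetical.
apiVersion: v1
kind: Pod
metadata:
  name: javaweb
spec:
  initContainers:
  - name: war
    image: example/sample-war:v1            # hypothetical image that ships /sample.war
    command: ["cp", "/sample.war", "/app"]  # copy the WAR package into the shared volume
    volumeMounts:
    - name: sample-volume
      mountPath: /app
  containers:
  - name: tomcat
    image: example/tomcat:v1                # hypothetical Tomcat image
    volumeMounts:
    - name: sample-volume
      mountPath: /root/apache-tomcat/webapps
  volumes:
  - name: sample-volume
    emptyDir:
      medium: Memory                        # in-memory data volume
```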


In this yaml file, we define one container for the Java program and one for Tomcat, and mount a shared volume into both: the /app path of the Java container and the /root/apache-tomcat/webapps path of the Tomcat container are both mounted on the volume named sample-volume, which is declared as an in-memory data volume. The Java container is defined as an initContainer, which means it starts before the Tomcat container, and it runs a cp command once started.


This Pod describes the following scenario: when the Pod starts, the Java container starts first and copies its WAR package, sample.war, into its own /app directory; then the Tomcat container starts, runs its startup script, and picks up the WAR package from its own /root/apache-tomcat/webapps path.


As you can see, with this configuration we changed neither the Java program nor the Tomcat program, yet they work together perfectly while staying decoupled. This example is the Sidecar pattern from the container design patterns. There are many other patterns, and interested readers can explore them on their own.




The above covers the basics of the Pod, the concept Kubernetes abstracted to handle tight relationships. Note that a Pod is only an orchestration idea, not a specific technical implementation. In the Kubernetes setup we use, the Pod simply happens to be implemented with Docker as the carrier. If the underlying container runtime is based on virtual machines, as with virtlet, a Pod needs no Infra container at all when it is created, because a virtual machine inherently supports multiple cooperating processes.


Having covered the basics of the Pod, in the next section we will look at how Kubernetes went on to stand out in the container orchestration war.
