Briefly describe ETCD and its characteristics?

etcd is an open-source project initiated by the CoreOS team for managing configuration information and service discovery. Its goal is to build a highly available distributed key-value database, and it is written in Go.

  • Features:
  • Simple: supports a REST-style HTTP+JSON API.
  • Secure: supports HTTPS access.
  • Fast: supports on the order of 1,000 writes per second.
  • Reliable: supports a distributed architecture based on the Raft consensus algorithm; Raft achieves consistency across a distributed system by electing a leader node.

Briefly describe the scene that ETCD adapts to?

etcd can be widely used in the following scenarios:

Service discovery: service discovery solves the problem of how processes or services in the same distributed cluster find each other and establish connections. Essentially, service discovery is knowing whether any process in the cluster is listening on a given UDP or TCP port, and being able to find and connect to that process by name.

Message publishing and subscribing: in a distributed system, the most suitable communication method between components is message publish/subscribe: build a shared configuration center where data providers publish messages and consumers subscribe to the topics they care about. Once a message is published to a topic, subscribers are notified in real time. In this way, centralized management and dynamic updating of distributed system configuration can be achieved, with the configuration information used by applications placed in etcd for centralized management.

Load balancing: in a distributed system, to ensure high availability of services and consistency of data, multiple copies of data and services are usually deployed as peers, so that even if one service fails, use is not affected. Data stored in etcd's distributed architecture naturally supports load balancing: once etcd is clustered, every core node of etcd can handle user requests, so storing small but frequently accessed message data directly in etcd also achieves a load-balancing effect.

Distributed notification and coordination: similar to message publishing and subscribing, this uses etcd's Watcher mechanism. Through registration and asynchronous notification, different systems in a distributed environment can notify and coordinate with each other, enabling real-time processing of data changes.

Distributed lock: because etcd uses the Raft algorithm to maintain strong consistency, the value stored by any operation on the cluster is globally consistent, so distributed locks are easy to implement. The lock service can be used in two ways: maintaining exclusive access, or controlling execution order.

Cluster monitoring and Leader election: Monitoring through etcd is very simple and real-time.

Briefly describe what is Kubernetes?

Kubernetes is a distributed system support platform based on container technology. It is Google's open-source container cluster management system (derived from Google's internal Borg system). On top of Docker, it provides containerized applications with a complete set of functions such as deployment and operation, resource scheduling, service discovery, and dynamic scaling, greatly improving the convenience of managing large container clusters. It also offers complete cluster management capabilities: multi-level security protection and admission mechanisms, multi-tenant application support, transparent service registration and discovery, built-in intelligent load balancing, strong fault detection and self-healing, rolling service upgrades and online scaling, an extensible automatic resource scheduling mechanism, and multi-granularity resource quota management.

Briefly describe the relationship between Kubernetes and Docker?

Docker provides container lifecycle management, and a Docker image builds the runtime container. Its main advantage is packaging the settings and dependencies an application needs to run into a container, which brings portability and other benefits.

Kubernetes is used to associate and orchestrate containers running across multiple hosts.

Briefly describe what are Minikube, Kubectl, and Kubelet in Kubernetes?

Minikube is a tool that can easily run a single-node Kubernetes cluster locally.

Kubectl is a command line tool that can be used to control the Kubernetes cluster manager, such as checking cluster resources, creating, deleting and updating components, and viewing applications.

Kubelet is an agent service that runs on each node and enables the node to communicate with the master.

Briefly describe the common deployment methods of Kubernetes?

Common Kubernetes deployment methods include minikube (quickly running a single-node cluster locally for trial and development), kubeadm (bootstrapping production-grade clusters), and manual installation from binary packages.

Briefly describe how Kubernetes implements cluster management?

For cluster management, Kubernetes divides the cluster into one Master node and a group of worker nodes (Nodes). The Master node runs a group of processes related to cluster management: kube-apiserver, kube-controller-manager, and kube-scheduler. These processes provide fully automated resource management, Pod scheduling, elastic scaling, security control, system monitoring, error correction, and other management capabilities.

Briefly describe the advantages, adaptation scenarios and characteristics of Kubernetes?

As a complete distributed system support platform, Kubernetes has the following main advantages:

  • Container Orchestration
  • Lightweight
  • Open source
  • Elastic scaling
  • Load balancing

Common Kubernetes scenarios:

  • Quickly deploy applications
  • Quickly scale applications
  • Seamlessly roll out new application features
  • Save resources by optimizing hardware usage

Kubernetes related features:

  • Portable: supports public cloud, private cloud, hybrid cloud, and multi-cloud.
  • Extensible: modular, pluggable, mountable, and composable.
  • Automated: automatic deployment, automatic restart, automatic replication, and automatic scaling.

Briefly describe the shortcomings or current shortcomings of Kubernetes?

The current shortcomings (deficiencies) of Kubernetes are as follows:

  • The installation process and configuration are relatively difficult and complicated.
  • Management services are relatively cumbersome.
  • It takes a lot of time to run and compile.
  • It is more expensive than other alternatives.
  • For simple applications, it may not be necessary to involve Kubernetes.

Briefly describe the basic concepts related to Kubernetes?

Master: the Kubernetes cluster management node, responsible for managing the cluster and providing access to the cluster's resource data. It hosts the etcd storage service (optionally) and runs the API Server, Controller Manager, and Scheduler processes.

Node (worker): the service node that runs Pods in the Kubernetes cluster architecture. It is the unit of cluster operation, carrying the Pods assigned to it as their host machine. It runs the Docker Engine service, the kubelet daemon, and the kube-proxy load balancer.

Pod: runs on a Node and is a group of closely related containers. The containers in a Pod run on the same host and share the same network namespace, IP address, and port space, so they can communicate with each other via localhost. The Pod is Kubernetes' basic unit of creation, scheduling, and management; it provides a higher level of abstraction than the container, making deployment and management more flexible. A Pod can contain one container or several related containers.
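
For illustration only, a minimal Pod manifest might look like the following sketch (the name, labels, and image are placeholders, not from the original text):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: web-pod          # hypothetical name
  labels:
    app: web             # labels are covered in the next entry
spec:
  containers:
  - name: web
    image: nginx:1.25    # placeholder image
    ports:
    - containerPort: 80
```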

label: The Label in Kubernetes is essentially a series of Key/Value key-value pairs, in which the key and value can be customized. Label can be attached to various resource objects, such as Node, Pod, Service, RC, etc. A resource object can define any number of Labels, and the same Label can also be added to any number of resource objects. Kubernetes queries and filters resource objects through Label Selector.

Replication Controller: manages Pod replicas and ensures that the specified number of replicas exist in the cluster. If the number of replicas running in the cluster is greater than specified, the excess Pods are stopped; if it is less, more are started, keeping the count constant. The Replication Controller is the core mechanism behind elastic scaling, dynamic expansion, and rolling upgrades.

Deployment: internally uses a ReplicaSet (RS) to achieve its purpose; a Deployment can be seen as an upgrade of the RC. Its biggest feature is that the current progress of a Pod deployment can be known at any time.

HPA (Horizontal Pod Autoscaler): The horizontal automatic expansion of Pod is also a resource of Kubernetes. By tracking and analyzing the load changes of all Pod targets controlled by RC, it is determined whether it is necessary to adjust the number of Pod replicas in a targeted manner.

Service: defines a logical collection of Pods and a policy for accessing that collection; it is an abstraction over real services. A Service provides a unified service access entry plus service proxying and discovery, associating the Pods that share the same Label, so users do not need to know how the backend Pods operate.
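
As a sketch (assuming the hypothetical app: web label from the Pod example above), a Service that selects those Pods could look like this:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: web-svc          # hypothetical name
spec:
  selector:
    app: web             # associates all Pods carrying this Label
  ports:
  - port: 80             # port exposed by the Service
    targetPort: 80       # port on the backend Pods
```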

Volume: a shared directory that can be accessed by multiple containers. In Kubernetes, a Volume is defined on the Pod and can be mounted into a directory by one or more containers in that Pod.

Namespace: used to implement multi-tenant resource isolation. Resource objects within the cluster can be allocated to different Namespaces, forming logically distinct projects, groups, or user communities, so that different Namespaces share the resources of the whole cluster while being managed separately.

Briefly describe the relevant components of the Kubernetes cluster?

The Kubernetes Master control components schedule and manage the entire system (cluster) and include the following:

Kubernetes API Server: the entrance to the Kubernetes system. It encapsulates the add, delete, modify, and query operations on core objects and exposes them as RESTful API interfaces to external clients and internal components. It is the central hub for data interaction and communication between the cluster's functional modules.

Kubernetes Scheduler: Node selection (that is, allocating machines) for the newly created Pod, responsible for the resource scheduling of the cluster.

Kubernetes Controller: Responsible for executing various controllers. Many controllers have been provided to ensure the normal operation of Kubernetes.

Replication Controller: Manage and maintain the Replication Controller, associate the Replication Controller with the Pod, and ensure that the number of replicas defined by the Replication Controller is consistent with the actual number of running Pods.

Node Controller: manages and maintains Nodes, periodically checks their health, and identifies failed and recovered Nodes.

Namespace Controller: Manage and maintain Namespace, regularly clean up invalid Namespace, including API objects under Namespace, such as Pod, Service, etc.

Service Controller: manages and maintains Services, providing load balancing and service proxying.

EndPoints Controller: Manage and maintain Endpoints, associate Service and Pod, create Endpoints as the back end of Service, and update Endpoints in real time when the Pod changes.

Service Account Controller: Manage and maintain the Service Account, create a default Service Account for each Namespace, and create a Service Account Secret for the Service Account.

Persistent Volume Controller: Manage and maintain Persistent Volume and Persistent Volume Claim, assign Persistent Volume to new Persistent Volume Claim for binding, and perform cleanup and recovery for released Persistent Volume.

Daemon Set Controller: Manage and maintain Daemon Set, responsible for creating Daemon Pod, and ensuring the normal operation of Daemon Pod on the designated Node.

Deployment Controller: Manage and maintain Deployment, associate Deployment and Replication Controller, and ensure that a specified number of Pods are running. When the Deployment is updated, control the implementation of the Replication Controller and Pod updates.

Job Controller: manages and maintains Jobs, creates one-off task Pods for a Job, and ensures that the number of completions specified by the Job is reached.

Pod Autoscaler Controller: Realizes the automatic scaling of Pod, obtains monitoring data regularly, performs policy matching, and executes the scaling action of Pod when conditions are met.

Briefly describe the mechanism of Kubernetes RC?

The Replication Controller (RC) manages Pod replicas and ensures that the specified number of replicas exist in the cluster. After an RC is defined and submitted to the Kubernetes cluster, the Controller Manager component on the Master node is notified and regularly inspects the surviving target Pods in the system, ensuring that the number of target Pod instances exactly equals the RC's expected value. If more replicas than expected are running, the system stops some Pods; otherwise, it automatically creates more.

Briefly describe the difference between Kubernetes Replica Set and Replication Controller?

Replica Set (RS) is similar to the Replication Controller in that it ensures a specified number of Pod replicas are running at any given time. The difference is that RS supports set-based selectors, while the Replication Controller only supports equality-based selectors.

Briefly describe the role of kube-proxy?

kube-proxy runs on every node. It watches for Service and Endpoint changes in the API Server and creates routing rules to provide service IP addresses and load balancing. It can simply be understood as a transparent proxy and load balancer for Services, whose core function is forwarding requests sent to a Service onto the backend Pod instances.

Briefly describe the principle of kube-proxy iptables?

Starting from version 1.2, Kubernetes uses iptables as the default kube-proxy mode. In iptables mode, kube-proxy no longer acts as a proxy itself. Its core function is to track changes to Services and Endpoints in real time through the API Server's Watch interface and to update the corresponding iptables rules; client request traffic is then "routed directly" to the target Pod through iptables' NAT mechanism.

Briefly describe the principle of kube-proxy ipvs?

IPVS was promoted to GA (stable) in Kubernetes 1.11. IPVS is dedicated to high-performance load balancing and uses a more efficient data structure (a hash table), allowing almost unlimited scale, which is why kube-proxy adopted it as its newest mode.

In IPVS mode, kube-proxy uses ipset, an extension of iptables, instead of directly generating iptables rule chains. An iptables rule chain is a linear data structure, whereas ipset introduces an indexed data structure, so lookups and matches stay efficient even when there are many rules.

You can simply think of an ipset as a collection of IPs (or segments); its contents can be IP addresses, IP network segments, ports, and so on. iptables can add rules that operate directly on this "variable collection", which greatly reduces the number of iptables rules and thereby reduces performance loss.

Briefly describe the similarities and differences between kube-proxy ipvs and iptables?

Both iptables and IPVS are implemented on top of Netfilter, but their different positioning leads to essential differences: iptables is designed for firewalls, while IPVS is dedicated to high-performance load balancing and uses a more efficient data structure (a hash table), allowing almost unlimited scale.

Compared with iptables, IPVS has the following obvious advantages:

  • 1. Provides better scalability and performance for large clusters;
  • 2. Supports more complex load-balancing algorithms than iptables (least load, least connections, weighted, etc.);
  • 3. Support server health check and connection retry functions;
  • 4. The set of ipset can be modified dynamically, even if the rules of iptables are using this set.

Briefly describe what is a static Pod in Kubernetes?

Static Pods are managed by the kubelet and exist only on specific Nodes. They cannot be managed through the API Server, cannot be associated with a ReplicationController, Deployment, or DaemonSet, and the kubelet cannot perform health checks on them. Static Pods are always created by the kubelet and always run on the Node where that kubelet resides.

Briefly describe the possible state of Pod in Kubernetes?

Pending: the API Server has created the Pod, but one or more of its container images have not yet been created, which includes the image download phase.

Running: All containers in the Pod have been created, and at least one container is running, starting, or restarting.

Succeeded: All containers in the Pod have successfully exited without restarting.

Failed: All containers in the Pod have exited, but at least one container exited in a failed state.

Unknown: The Pod status cannot be obtained for some reason, which may be caused by poor network communication.

Briefly describe the main process of creating a Pod in Kubernetes?

Creating a Pod in Kubernetes involves the linkage between multiple components. The main process is as follows:

  • 1. The client submits the Pod's configuration information (for example, defined in a yaml file) to kube-apiserver.
  • 2. After receiving the request, the API Server notifies the controller-manager to create the resource object.
  • 3. The controller-manager stores the Pod's configuration information in the etcd data center through the API Server.
  • 4. When kube-scheduler detects the Pod information, it starts the scheduling pre-selection phase: it first filters out nodes that do not meet the Pod's resource requirements, then runs the prioritization phase to select the node most suitable to run the Pod, and finally sends the Pod's resource manifest to the kubelet component on that node.
  • 5. The kubelet runs the Pod according to the resource manifest sent by the scheduler. After the Pod starts successfully, the kubelet reports the Pod's running status to the API Server, which stores it in the etcd data center.

Briefly describe the restart strategy of Pod in Kubernetes?

The Pod restart policy (RestartPolicy) is applied to all containers in the Pod, and is judged and restarted by the kubelet only on the Node where the Pod is located. When a container exits abnormally or the health check fails, kubelet will perform corresponding operations according to the settings of RestartPolicy.

Pod restart strategies include Always, OnFailure, and Never; the default value is Always. A manifest sketch follows the list below.

  • Always: When the container fails, the kubelet automatically restarts the container;
  • OnFailure: When the container terminates and the exit code is not 0, the kubelet automatically restarts the container;
  • Never: Regardless of the running status of the container, kubelet will not restart the container.
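
A minimal sketch of where the field sits in a Pod definition (the name, image, and command are placeholders):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: one-shot-task        # hypothetical name
spec:
  restartPolicy: OnFailure   # Always (default) | OnFailure | Never
  containers:
  - name: task
    image: busybox:1.36      # placeholder image
    command: ["sh", "-c", "echo done"]  # exits 0, so it is not restarted
```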

At the same time, the Pod's restart strategy is related to how the Pod is controlled. The controllers that can currently manage Pods are the ReplicationController, Job, and DaemonSet, plus direct management by the kubelet (static Pods).

The restart strategy restrictions for different controllers are as follows:

  • RC and DaemonSet: must be set to Always, since these containers need to keep running;
  • Job: OnFailure or Never, to ensure containers are not restarted after execution completes;
  • kubelet (static Pods): the kubelet restarts a Pod whenever it fails, regardless of the RestartPolicy value, and performs no health checks on the Pod.

Briefly describe the health check method of Pod in Kubernetes?

A Pod's health is checked by probes, mainly LivenessProbe and ReadinessProbe (plus StartupProbe for slow-starting applications):

LivenessProbe probe: used to determine whether the container is alive (running state). If the LivenessProbe probe detects that the container is unhealthy, kubelet will kill the container and deal with it according to the container's restart strategy. If a container does not contain the LivenessProbe probe, kubelet considers that the return value of the LivenessProbe probe of the container is "Success".

ReadinessProbe: used to determine whether the container is ready to serve requests (ready state). If the ReadinessProbe fails, the Pod's status is updated and the Endpoint Controller removes the Endpoint entry for the Pod containing the container from the Service's Endpoints.

StartupProbe: a startup check mechanism applied to slow-starting applications, preventing them from being killed by the two probes above because startup takes too long.

Briefly describe the common way of LivenessProbe probe of Kubernetes Pod?

Kubelet periodically executes the LivenessProbe probe to diagnose the health status of the container, usually in the following three ways:

ExecAction: Execute a command in the container. If the return code is 0, it indicates that the container is healthy.

TCPSocketAction: Perform TCP check through the container's IP address and port number. If a TCP connection can be established, it indicates that the container is healthy.

HTTPGetAction: Invoke the HTTP Get method through the container's IP address, port number, and path. If the response status code is greater than or equal to 200 and less than 400, it indicates that the container is healthy.
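
A hedged sketch combining the three actions in one Pod spec (the paths, ports, and command are placeholder assumptions; startupProbe additionally requires Kubernetes 1.16+):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: probe-demo             # hypothetical name
spec:
  containers:
  - name: app
    image: nginx:1.25          # placeholder image
    livenessProbe:             # ExecAction: healthy if the exit code is 0
      exec:
        command: ["cat", "/tmp/healthy"]
      initialDelaySeconds: 5
      periodSeconds: 10
    readinessProbe:            # TCPSocketAction: healthy if the TCP connect succeeds
      tcpSocket:
        port: 80
    startupProbe:              # HTTPGetAction: healthy if 200 <= status < 400
      httpGet:
        path: /healthz
        port: 80
      failureThreshold: 30     # tolerates a slow start before liveness kicks in
```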

Briefly describe the common scheduling methods of Kubernetes Pod?

In Kubernetes, the Pod is usually the carrier for containers. The common scheduling methods are listed below, followed by a manifest sketch:

  • Deployment or RC: The main function of this scheduling strategy is to automatically deploy multiple copies of a container application, and continuously monitor the number of copies, and always maintain the number of copies specified by the user in the cluster.
  • NodeSelector: Directional scheduling. When you need to manually specify to schedule a Pod to a specific Node, you can match the label (Label) of the Node with the nodeSelector property of the Pod.
  • NodeAffinity affinity scheduling: The affinity scheduling mechanism greatly expands the scheduling capabilities of Pod. There are currently two expressions of node affinity:
  • requiredDuringSchedulingIgnoredDuringExecution: Hard rules, the specified rules must be met before the scheduler can schedule Pod to Node (similar to nodeSelector, with different syntax).
  • preferredDuringSchedulingIgnoredDuringExecution: a soft rule; Nodes that satisfy it are scheduled preferentially, but it is not mandatory, and multiple preference rules can each be given a weight value.
  • Taints and Tolerations:
  • Taint: Make Node refuse to run a specific Pod;
  • Toleration: It is the attribute of Pod, which means that Pod can tolerate (run) Node marked with Taint.
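
The manifest sketch referenced above, combining the directional and affinity mechanisms (all label keys, values, and the taint are placeholder assumptions):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: sched-demo             # hypothetical name
spec:
  nodeSelector:
    disktype: ssd              # only Nodes labelled disktype=ssd qualify
  affinity:
    nodeAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 1              # soft rule with a weight value
        preference:
          matchExpressions:
          - key: zone
            operator: In
            values: ["zone-a"]
  tolerations:                 # allows scheduling onto Nodes with this Taint
  - key: dedicated
    operator: Equal
    value: gpu
    effect: NoSchedule
  containers:
  - name: app
    image: nginx:1.25          # placeholder image
```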

Briefly describe the Kubernetes initialization container (init container)?

Init containers operate differently from application containers: they must run to completion before the application containers start. When multiple init containers are defined, they run one at a time in order, and the next init container starts only after the previous one has succeeded. Once all init containers have run successfully, Kubernetes initializes the Pod's various information and begins creating and running the application containers.
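
A minimal sketch (the service name db-svc and the wait loop are placeholder assumptions):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: init-demo              # hypothetical name
spec:
  initContainers:
  - name: wait-for-db          # must succeed before the app container starts
    image: busybox:1.36
    command: ["sh", "-c", "until nslookup db-svc; do sleep 2; done"]
  containers:
  - name: app
    image: nginx:1.25          # placeholder image
```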

Briefly describe the Kubernetes deployment upgrade process?

  • When a Deployment is first created, the system creates a ReplicaSet and creates the number of Pod replicas the user requested.
  • When the Deployment is updated, the system creates a new ReplicaSet, scales it up to 1 replica, and then scales the old ReplicaSet down by 1 (for example, from 3 replicas to 2).
  • After that, the system continues to adjust the new and old ReplicaSets one step at a time according to the update strategy.
  • Finally, the new ReplicaSet runs the requested number of new-version Pod replicas, and the old ReplicaSet's replica count is reduced to 0.

Briefly describe the Kubernetes deployment upgrade strategy?

In the definition of Deployment, you can specify the Pod update strategy through spec.strategy. Currently, two strategies are supported: Recreate (rebuild) and RollingUpdate (rolling update). The default value is RollingUpdate.

Recreate: setting spec.strategy.type=Recreate means that when the Deployment updates Pods, it first kills all running Pods and then creates new ones.

RollingUpdate: Set spec.strategy.type=RollingUpdate, which means that Deployment will update Pods one by one in a rolling update. At the same time, you can control the rolling update process by setting two parameters (maxUnavailable and maxSurge) under spec.strategy.rollingUpdate.
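
A sketch of the two parameters in a Deployment (the name, replica counts, and values are assumptions):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web                   # hypothetical name
spec:
  replicas: 3
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 1       # at most 1 Pod below the desired count during the update
      maxSurge: 1             # at most 1 Pod above the desired count during the update
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
      - name: web
        image: nginx:1.25     # placeholder image
```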

Briefly describe the resource characteristics of the Kubernetes DaemonSet type?

A DaemonSet resource object runs exactly one Pod replica on each node in the Kubernetes cluster; this is its biggest, and only, difference from the Deployment resource object. Accordingly, the replicas field is not supported in its yaml definition.

Its general usage scenarios are as follows:

  • To do log collection work for each node.
  • Monitor the running status of each node.

Briefly describe the automatic expansion mechanism of Kubernetes?

Kubernetes uses the Horizontal Pod Autoscaler (HPA) controller to implement automatic Pod scaling based on CPU usage. The HPA controller periodically monitors the resource performance indicators of the target Pod and compares it with the expansion and contraction conditions in the HPA resource object, and adjusts the number of Pod copies when the conditions are met.

  • HPA principle

A Metrics Server (Heapster or custom Metrics Server) in Kubernetes continuously collects the metric data of all Pod copies. The HPA controller obtains these data through the Metrics Server API (Heapster API or aggregation API), and calculates based on user-defined scaling rules to obtain the number of target Pod copies.

When the number of target Pod replicas is different from the current number of replicas, the HPA controller initiates a scale operation to the Pod's replica controller (Deployment, RC, or ReplicaSet), adjusts the number of Pod replicas, and completes the scaling operation.
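
A minimal autoscaling/v1 sketch targeting a hypothetical Deployment named web (the thresholds and bounds are assumptions):

```yaml
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: web-hpa                       # hypothetical name
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web                         # the workload whose replicas are adjusted
  minReplicas: 2
  maxReplicas: 10
  targetCPUUtilizationPercentage: 80  # scale out when average CPU exceeds 80%
```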

Briefly describe the type of Kubernetes Service?

By creating a Service, a unified entry address can be provided for a group of container applications with the same function, with requests load-balanced across the backend container applications. The main types are listed below; a NodePort sketch follows the list:

  • ClusterIP: The virtual service IP address, which is used for Pod access within the Kubernetes cluster, and kube-proxy forwards it through the set iptables rules on the Node;
  • NodePort: Use the host's port to enable external clients that can access each Node to access the service through the Node's IP address and port number;
  • LoadBalancer: Use an external load balancer to complete the load distribution to the service. You need to specify the IP address of the external load balancer in the spec.status.loadBalancer field, which is usually used in public clouds.
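
The NodePort sketch mentioned above (the port numbers are placeholders; nodePort must fall in the cluster's NodePort range, 30000-32767 by default):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: web-nodeport         # hypothetical name
spec:
  type: NodePort
  selector:
    app: web
  ports:
  - port: 80                 # cluster-internal Service port
    targetPort: 80           # container port on the Pods
    nodePort: 30080          # exposed on every Node's IP address
```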

Briefly describe the Kubernetes Service distribution back-end strategy?

The Service load distribution strategies are RoundRobin and SessionAffinity:

  • RoundRobin: the default mode; requests are forwarded to the backend Pods in turn.
  • SessionAffinity: session affinity based on the client IP address; after the first request from a client is forwarded to some backend Pod, all subsequent requests from that same client are forwarded to the same Pod.

Briefly describe Kubernetes Headless Service?

In some application scenarios, the developer wants to specify their own load balancer instead of using the Service's default load balancing, or an application wants to know the other instances belonging to the same service group. Kubernetes provides the Headless Service for this: no ClusterIP (entry IP address) is set on the Service, and only the list of backend Pods matched by the Label Selector is returned to the calling client.
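
A minimal sketch: setting clusterIP to None is what makes the Service headless (the name and label are placeholders):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: web-headless       # hypothetical name
spec:
  clusterIP: None          # headless: DNS returns the backend Pod IPs directly
  selector:
    app: web
  ports:
  - port: 80
```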

Briefly describe how to access services in the cluster from outside Kubernetes?

By default, clients outside the cluster cannot access a Pod via its IP address, nor a Service via its virtual IP address and virtual port number. Services in a Kubernetes cluster can usually be accessed from outside in the following ways:

Map Pod to physical machine: Map the Pod port number to the host machine, that is, use the hostPort method in the Pod, so that the client application can access the container application through the physical machine.

Map Service to physical machine: Map the Service port number to the host machine, that is, use the nodePort method in the Service, so that the client application can access the container application through the physical machine.

Map the Service to a LoadBalancer: setting the type to LoadBalancer maps the Service to a load balancer address provided by the cloud service provider. This approach is only used for Services running on a public cloud provider's platform.

Briefly describe Kubernetes ingress?

The Ingress resource object of Kubernetes is used to forward access requests of different URLs to different services on the back end to implement the business routing mechanism of the HTTP layer.

Kubernetes combines Ingress policies (rules) with an Ingress Controller to implement a complete Ingress load balancer. When Ingress performs load distribution, the Ingress Controller forwards client requests, based on the Ingress rules, directly to the backend Endpoints (Pods) of the Service, skipping kube-proxy's forwarding role; the whole path is: Ingress Controller + Ingress rules -> Services (backend Pods).

At the same time, when the Ingress Controller provides external services, it actually implements the function of an edge router.
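
A hedged networking.k8s.io/v1 sketch (requires Kubernetes 1.19+; the host and service name are placeholder assumptions) that routes one host to a backend Service:

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: web-ingress              # hypothetical name
spec:
  rules:
  - host: web.example.com        # placeholder host
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: web-svc        # requests are routed to this Service's Endpoints
            port:
              number: 80
```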

Briefly describe the download strategy of the Kubernetes image?

Kubernetes has three image pull policies: Always, Never, and IfNotPresent. A manifest snippet follows the list.

  • Always: the image is always pulled from the specified registry.
  • Never: pulling images from a registry is forbidden; only local images can be used.
  • IfNotPresent: the image is pulled from the target registry only if there is no matching image locally. The default pull policy depends on the tag: when the image tag is latest, the default is Always; when the image tag is custom (that is, not latest), the default is IfNotPresent.
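
Where the field sits in a container spec (a minimal sketch; the name and image are placeholders):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: pull-demo                  # hypothetical name
spec:
  containers:
  - name: app
    image: nginx:1.25              # placeholder image
    imagePullPolicy: IfNotPresent  # Always | Never | IfNotPresent
```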

Briefly describe the load balancer of Kubernetes?

Load balancers are one of the most common and standard ways to expose services.

Two types of load balancers are used according to the working environment, namely internal load balancers or external load balancers. The internal load balancer automatically balances the load and distributes the containers with the required configuration, while the external load balancer directs traffic from the external load to the back-end containers.

Briefly describe how each module of Kubernetes communicates with API Server?

As the core of the cluster, the Kubernetes API Server is responsible for communication between the cluster's functional modules. Each functional module stores information in etcd through the API Server; when data needs to be retrieved or manipulated, the module uses the REST interfaces provided by the API Server (GET, LIST, or WATCH methods), realizing information exchange between modules.

For example, the interaction between the kubelet process and the API Server: the kubelet on each Node calls the API Server's REST interface at regular intervals to report its own status, and the API Server updates the node status information into etcd.

For example, the interaction between the kube-controller-manager process and the API Server: The Node Controller module in the kube-controller-manager monitors the information of the Node in real time through the Watch interface provided by the API Server, and performs corresponding processing.

For example, the interaction between the kube-scheduler process and the API Server: after the Scheduler learns of a newly created Pod replica through the API Server's Watch interface, it retrieves the list of all Nodes that meet the Pod's requirements and executes the Pod scheduling logic. Once scheduling succeeds, it binds the Pod to the target Node.

Briefly describe the function and implementation principle of Kubernetes Scheduler?

Kubernetes Scheduler is the important functional module responsible for Pod scheduling; in the entire system it "links upstream and downstream". "Linking upstream" means it receives the new Pods created by the Controller Manager and schedules them onto target Nodes; "linking downstream" means that after scheduling completes, the kubelet service process on the target Node takes over the subsequent work and manages the rest of the Pod's lifecycle.

The function of the Kubernetes Scheduler is to take each Pod awaiting scheduling (a Pod newly created through the API, a Pod created by the Controller Manager to restore the replica count, and so on) and, according to a specific scheduling algorithm and policy, bind it (Binding) to a suitable Node in the cluster, writing the binding information into etcd.

Three objects are involved in the entire scheduling process, namely the list of Pod to be scheduled, the list of available Nodes, and the scheduling algorithm and strategy.

The Kubernetes Scheduler applies a scheduling algorithm to each Pod in the to-be-scheduled list, selecting the most suitable Node from the Node list. Afterwards, the kubelet on the target Node sees the Pod binding event generated by the Scheduler through the API Server, retrieves the corresponding Pod definition, downloads the image, and starts the container.

Briefly describe which two algorithms are used by Kubernetes Scheduler to bind Pods to worker nodes?

The Kubernetes Scheduler binds the Pod to the most suitable working node according to the following two scheduling algorithms:

Predicates: The input is all nodes, and the output is the nodes that meet the pre-selection conditions. kube-scheduler filters out the Nodes that do not meet the strategy according to the pre-selected strategy. If a node has insufficient resources or does not meet the conditions of the preselection strategy, it cannot pass the preselection. For example, "Node's label must be consistent with Pod's Selector".

Priorities: The input is the nodes selected in the pre-selection stage, and the nodes that are pre-selected will be ranked according to the priority strategy, and the Node with the highest score will be selected. For example, the richer the resources and the smaller the load, a Node may have a higher ranking.

Briefly describe the role of Kubernetes kubelet?

In a Kubernetes cluster, a kubelet service process is started on each Node (also known as Worker). This process is used to process the tasks sent by the Master to this node, and manage the Pod and the containers in the Pod. Each kubelet process registers the node's own information on the API Server, regularly reports the usage of node resources to the Master, and monitors the container and node resources through cAdvisor.

Briefly describe what components are used by Kubernetes kubelet to monitor Worker node resources?

Kubelet uses cAdvisor to monitor worker node resources. In the Kubernetes system, cAdvisor has been integrated into the kubelet component by default. When the kubelet service starts, it will automatically start the cAdvisor service, and then cAdvisor will collect the performance indicators of the node where it is located and the performance indicators of the containers running on the node in real time.

Briefly describe how Kubernetes guarantees the security of the cluster?

Kubernetes implements cluster security control through a series of mechanisms, mainly in the following different dimensions:

  • Infrastructure: to ensure the isolation of the container from its host;
  • Permissions:
  • The principle of least privilege: reasonably limit the permissions of all components, ensure that the component only performs its authorized behavior, and limit the scope of its permissions by restricting the capabilities of a single component.
  • User permissions: divide the roles of ordinary users and administrators.
  • In terms of clusters:
  • API Server authentication and authorization: all resource access and changes in a Kubernetes cluster go through the Kubernetes API Server, so it is recommended to use the more secure HTTPS or tokens to identify and authenticate the client (Authentication), followed by the authorization (Authorization) step for subsequent access.
  • API Server authorization management: an authorization policy determines whether an API call is legitimate; legitimate users are authorized and then authenticated on access. The more secure RBAC mode is recommended to improve the cluster's authorization security.
  • Introduce the Secret mechanism for sensitive data: it is recommended to protect the cluster's sensitive data with Secrets.
  • AdmissionControl (admission mechanism): in the processing of a request to the Kubernetes API, the order is: authentication & authorization first, then the admission step, and finally the operation on the target object.

Briefly describe the Kubernetes access mechanism?

When making a request to the cluster, each admission control code is executed in a certain order. If an admission control rejects the request, the result of the entire request will be returned immediately, and the user will be prompted with corresponding error information.

Admission Control (AdmissionControl) is essentially a piece of admission code. In the process of requesting kubernetes api, the sequence is: first go through authentication & authorization, then perform the admission operation, and finally operate on the target object. Common components (control codes) are as follows:

  • AlwaysAdmit: allows all requests.
  • AlwaysDeny: denies all requests; mostly used in test environments.
  • ServiceAccount: automates ServiceAccount handling; for example, if a Pod has no serviceAccount attribute, it adds the default one and ensures that the Pod's ServiceAccount always exists.
  • LimitRanger: observes all requests and ensures they do not violate the constraints defined in the namespace's LimitRange objects.
  • NamespaceExists: observes all requests; if a request refers to a namespace that does not exist, the request is rejected.

Briefly describe Kubernetes RBAC and its characteristics (advantages)?

RBAC (Role-Based Access Control) is a method of regulating access to computer or network resources based on the roles of individual users.

Compared with other authorization modes, RBAC has the following advantages:

  • Complete coverage of resources and non-resource permissions in the cluster.
  • The entire RBAC is completely completed by several API objects. Like other API objects, it can be operated with kubectl or API.
  • It can be adjusted at runtime without restarting the API Server.

Briefly describe the role of Kubernetes Secret?

The main function of the Secret object is to hold private data, such as passwords, OAuth tokens, and SSH keys. Putting such private information in a Secret object is safer than putting it directly in a Pod or Docker image, and also makes it easier to use and distribute.

Briefly describe how to use Kubernetes Secret?

After a Secret is created, it can be used in the following three ways (a sketch follows the list):

  • When creating a Pod, the Secret is automatically used by specifying a Service Account for the Pod.
  • Use it by mounting the Secret to the Pod.
  • Used when pulling Docker images, referenced via the Pod's spec.imagePullSecrets.
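
The sketch referenced above: an Opaque Secret mounted into a Pod as a volume (the names and the base64-encoded value are placeholders):

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: db-secret            # hypothetical name
type: Opaque
data:
  password: cGFzc3dvcmQ=     # base64 of "password" (placeholder value)
---
apiVersion: v1
kind: Pod
metadata:
  name: secret-demo
spec:
  containers:
  - name: app
    image: nginx:1.25        # placeholder image
    volumeMounts:
    - name: creds
      mountPath: /etc/creds  # each Secret key appears as a file here
      readOnly: true
  volumes:
  - name: creds
    secret:
      secretName: db-secret
```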

Briefly describe the Kubernetes PodSecurityPolicy mechanism?

Kubernetes PodSecurityPolicy provides finer-grained control over how Pods use resources and strengthens security policy. Once the PodSecurityPolicy admission controller is enabled, Kubernetes by default does not allow any Pod to be created; a PodSecurityPolicy and the corresponding RBAC authorization policies (Authorizing Policies) must be created before Pods can be created successfully.

Briefly describe what security policies can be implemented by the Kubernetes PodSecurityPolicy mechanism?

Different fields in a PodSecurityPolicy object control various security policies at Pod runtime (a sketch follows the list). Common ones are:

  • Privileged mode: privileged controls whether a Pod may run in privileged mode.
  • Host resources: controls the Pod's access to host resources, e.g. hostPID: whether a Pod may share the host's process space.
  • Users and groups: sets the user ID (range) or group (range) the container runs as.
  • Privilege escalation: allowPrivilegeEscalation sets whether processes in the container may gain more privileges than their parent, usually combined with a non-root user (MustRunAsNonRoot).
  • SELinux: performs SELinux-related configuration.
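
A minimal sketch using the policy/v1beta1 API current when PodSecurityPolicy existed (it was removed in Kubernetes 1.25); the policy name is a placeholder:

```yaml
apiVersion: policy/v1beta1
kind: PodSecurityPolicy
metadata:
  name: restricted             # hypothetical name
spec:
  privileged: false            # forbid privileged mode
  hostPID: false               # do not share the host's process space
  allowPrivilegeEscalation: false
  runAsUser:
    rule: MustRunAsNonRoot     # containers must run as a non-root user
  seLinux:
    rule: RunAsAny
  supplementalGroups:
    rule: RunAsAny
  fsGroup:
    rule: RunAsAny
  volumes:                     # only these volume types are allowed
  - configMap
  - secret
  - emptyDir
```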

Briefly describe the Kubernetes network model?

In the Kubernetes network model, every Pod has an independent IP address, and all Pods are assumed to be in a flat network space that is directly reachable. So whether or not they run on the same Node (host), Pods are required to be able to access each other directly via their IPs. The reason for this design is that users then do not need to think about how to establish connections between Pods, nor about how to map container ports to host ports.

At the same time, the model of setting an IP address for each Pod makes different containers in the same Pod share the same network namespace, that is, the same Linux network protocol stack. This means that containers in the same Pod can connect to each other's port through localhost.

In a Kubernetes cluster, IP is allocated in units of Pod. All containers in a Pod share a network stack (equivalent to a network namespace, and their IP addresses, network devices, configurations, etc. are all shared).

Briefly describe the Kubernetes CNI model?

CNI provides a plug-in network solution for application containers, defines the specifications for operating and configuring the container network, and implements the CNI interface in the form of plug-ins. CNI is only concerned with allocating network resources when the container is created, and deleting network resources when the container is destroyed. Only two concepts are involved in the CNI model: container and network.

Container: An environment with an independent Linux network namespace, such as a container created using Docker or rkt. The container needs to have its own Linux network namespace, which is a necessary condition for joining the network.

Network: Represents a group of entities that can be interconnected. These entities have their own independent and unique IP addresses, which can be containers, physical machines, or other network devices (such as routers).

The settings and operations of the container network are implemented through plug-ins. CNI plug-ins include two types: CNI Plugin and IPAM (IP Address Management) Plugin. CNI Plugin is responsible for configuring network resources for the container, and IPAM Plugin is responsible for allocating and managing the IP address of the container. IPAM Plugin is a part of CNI Plugin and works with CNI Plugin.

Briefly describe the Kubernetes network strategy?

In order to achieve a fine-grained network access isolation strategy between containers, Kubernetes introduces Network Policy.

The main function of Network Policy is to restrict network communication between Pods and access control, and to set a list of client Pods that are allowed or forbidden to access. Network Policy defines the network strategy and implements the strategy in cooperation with the Policy Controller.
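
A hedged sketch: a policy that lets only Pods labelled app: web reach Pods labelled app: db on TCP 5432 (all labels and the port are placeholder assumptions):

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-web-to-db       # hypothetical name
spec:
  podSelector:
    matchLabels:
      app: db                 # the policy applies to these Pods
  policyTypes:
  - Ingress
  ingress:
  - from:
    - podSelector:
        matchLabels:
          app: web            # only these client Pods are allowed in
    ports:
    - protocol: TCP
      port: 5432
```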

Briefly describe the principle of Kubernetes network strategy?

The working principle of Network Policy is mainly as follows: The policy controller needs to implement an API Listener to monitor the Network Policy definition set by the user, and actually set the network access rules through the Agent of each Node (the Agent needs to be implemented through the CNI network plug-in).

Briefly describe the role of flannel in Kubernetes?

Flannel can be used to implement the underlying network of Kubernetes, and its main functions are:

  • It can assist Kubernetes by assigning non-conflicting IP addresses to each Docker container on Node.
  • It can establish an overlay network (Overlay Network) between these IP addresses, through this overlay network, the data packet is delivered to the target container intact.

Briefly describe the implementation principle of Kubernetes Calico network components?

Calico is a pure Layer 3 network solution based on BGP that integrates well with cloud platforms such as OpenStack, Kubernetes, AWS, and GCE.

Calico uses Linux Kernel to implement an efficient vRouter at each computing node to be responsible for data forwarding. Each vRouter broadcasts the routing information of the container running on the node to the entire Calico network through the BGP protocol, and automatically sets routing and forwarding rules to other nodes.

Calico guarantees that all data traffic between containers is interconnected via IP routing. Calico nodes can network directly on the data center's existing structure (L2 or L3) without extra NAT, tunnels, or overlay networks, and with no additional packet encapsulation or decapsulation, saving CPU cycles and improving network efficiency.

Briefly describe the role of Kubernetes shared storage?

For stateful container applications or applications that require data persistence, Kubernetes requires more reliable storage to store important data generated by applications so that container applications can still use the previous data after reconstruction. Therefore, it is necessary to use shared storage.

Briefly describe the ways of data persistence in Kubernetes?

Kubernetes persists important data through several common mechanisms:

EmptyDir (empty directory): no host directory is specified; Kubernetes allocates a temporary directory on the host and maps it into the Pod. It is similar to Docker's implicitly managed volumes.

  • Use cases:
  • temporarily keeping data on disk, e.g. in a merge/sort computation;
  • serving as shared storage between two containers.
  • Features:
  • different containers in the same Pod share the same persistent directory; when the Pod is deleted from the node, the volume's data is deleted with it.
  • the lifecycle of emptyDir data matches that of the Pod using it, so it is generally used as temporary storage.

Hostpath: Mount the existing directory or file on the host machine into the container. Similar to the bind mount method in docker.

  • Feature: it increases the coupling between the Pod and the Node.

PersistentVolume (PV): for example a PV backed by an NFS service or by GlusterFS. Its role is to unify data persistence directories and make them easier to manage.
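
A sketch showing the first two approaches side by side in one Pod (the mount paths are placeholders):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: volume-demo            # hypothetical name
spec:
  containers:
  - name: app
    image: nginx:1.25          # placeholder image
    volumeMounts:
    - name: scratch
      mountPath: /scratch      # temporary shared space
    - name: host-logs
      mountPath: /host-logs    # an existing host directory
  volumes:
  - name: scratch
    emptyDir: {}               # deleted together with the Pod
  - name: host-logs
    hostPath:
      path: /var/log           # couples the Pod to this particular Node
```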

Briefly describe Kubernetes PV and PVC?

PV is an abstraction of the underlying network shared storage, which defines shared storage as a "resource".

PVC is a user's "application" for storage resources.
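
A hedged sketch of the pair: an NFS-backed PV and a PVC that can bind to it (the server address, export path, sizes, and names are placeholders):

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-nfs                  # the "resource": shared storage abstracted as a PV
spec:
  capacity:
    storage: 5Gi
  accessModes:
  - ReadWriteMany
  nfs:
    server: 192.168.0.10        # placeholder NFS server
    path: /exports/data         # placeholder export path
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: data-claim              # the user's "application" for storage
spec:
  accessModes:
  - ReadWriteMany
  resources:
    requests:
      storage: 5Gi
```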

Briefly describe the stages in the Kubernetes PV life cycle?

A PV may be in one of the following four phases (Phases) of its lifecycle:

  • Available: Available status, not yet bound to a PVC.
  • Bound: Has been bound to a PVC.
  • Released: The bound PVC has been deleted and the resource has been released, but it has not been recycled by the cluster.
  • Failed: Automatic resource recovery failed.

Briefly describe the storage supply model supported by Kubernetes?

Kubernetes supports two storage resource provisioning modes: static mode (Static) and dynamic mode (Dynamic).

Static mode: The cluster administrator manually creates many PVs, and needs to set the characteristics of the back-end storage when defining the PVs.

Dynamic mode: The cluster administrator does not need to create PV manually, but describes the back-end storage through the setting of StorageClass and marks it as a certain type. At this time, the PVC is required to declare the storage type, and the system will automatically complete the creation of PV and the binding with PVC.
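
A dynamic-mode sketch: a StorageClass plus a PVC that references it (the provisioner shown is the legacy in-tree AWS EBS one, chosen only as a placeholder; use whatever the cluster actually provides):

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: fast                        # hypothetical class name
provisioner: kubernetes.io/aws-ebs  # placeholder provisioner
parameters:
  type: gp2
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: fast-claim
spec:
  storageClassName: fast            # triggers automatic PV creation and binding
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
```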

Briefly describe the Kubernetes CSI model?

Kubernetes CSI (Container Storage Interface) is the storage interface standard Kubernetes promotes for integrating container storage. Storage providers only need to implement a storage plug-in against the standard interface to provide storage services to containers through Kubernetes' native storage mechanisms. CSI completely decouples the storage provider's code from the Kubernetes code, and plug-in deployment is separated from the Kubernetes core components. Storage plug-ins are developed and maintained by the providers themselves, which can bring Kubernetes users more storage features while being more secure and reliable.

CSI includes CSI Controller and CSI Node:

  • The main function of the CSI Controller is to provide a storage service perspective to manage and operate storage resources and storage volumes.
  • The main function of CSI Node is to manage and operate Volume on the host (Node).

Briefly describe the process of Kubernetes Worker nodes joining the cluster?

To scale an application system horizontally, it is usually necessary to add Worker nodes. The main process is as follows:

  • 1. Install the Docker, kubelet, and kube-proxy services on the new Node;
  • 2. Configure the startup parameters of kubelet and kube-proxy, setting the Master URL to the address of the current Kubernetes cluster's Master, then start these services;
  • 3. Through kubelet's default automatic registration mechanism, the new Worker automatically joins the existing Kubernetes cluster;
  • 4. After the Kubernetes Master accepts the new Worker's registration, it automatically brings it into the current cluster's scheduling scope.

Briefly describe how Kubernetes Pod realizes resource control of nodes?

The resources provided by the nodes in a Kubernetes cluster are mainly computing resources, which are basic resources that can be requested, allocated, and used in measurable quantities. The computing resources in current Kubernetes clusters mainly include CPU, GPU, and memory. CPU and memory are consumed by Pods, so when configuring a Pod you can use the CPU Request and Memory Request parameters to declare how much CPU and memory each of its containers needs; Kubernetes then looks for a Node with sufficient resources, according to the Request values, on which to schedule the Pod.

Generally, the CPU and Memory used by a program is a dynamic amount, to be precise, a range, which is closely related to its load: when the load increases, the usage of CPU and Memory also increases.

Briefly describe how Kubernetes Requests and Limits affect Pod scheduling?

When a Pod is successfully created, the Kubernetes scheduler (Scheduler) will select a node for the Pod to execute. For each computing resource (CPU and Memory), each node has a maximum capacity value that can be used to run Pod. When scheduling, the scheduler must first ensure that the sum of the Requests of the CPU and memory of all Pods on the node after scheduling does not exceed the maximum capacity of the CPU and memory that the node can provide to the Pod.
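
A sketch of per-container Requests and Limits (all quantities and names are assumptions):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: resource-demo          # hypothetical name
spec:
  containers:
  - name: app
    image: nginx:1.25          # placeholder image
    resources:
      requests:                # used by the scheduler for placement decisions
        cpu: "250m"
        memory: "256Mi"
      limits:                  # hard ceiling enforced at runtime
        cpu: "500m"
        memory: "512Mi"
```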

Briefly describe Kubernetes Metric Service?

Since Kubernetes 1.10, Metrics Server has been the default component for performance data collection and monitoring; it mainly provides core metrics (Core Metrics), including the CPU and memory usage of Nodes and Pods.

The monitoring of other custom metrics (Custom Metrics) is done by components such as Prometheus.

Briefly describe how to use EFK to achieve unified log management in Kubernetes?

In a Kubernetes cluster environment, a complete application or service usually involves many components, so centralized management of the logging system is recommended, typically implemented with EFK.

EFK is a combination of Elasticsearch, Fluentd and Kibana, and its component functions are as follows:

  • Elasticsearch: is a search engine, responsible for storing logs and providing query interfaces;
  • Fluentd: Responsible for collecting logs from Kubernetes. Fluentd on each node monitors and collects the system logs on the node, and sends the processed log information to Elasticsearch;
  • Kibana: Provides a Web GUI, users can browse and search logs stored in Elasticsearch.

Logs on every node are collected by deploying Fluentd in DaemonSet mode, one instance per node. Fluentd mounts the Docker log directory /var/lib/docker/containers and the /var/log directory into its Pod; the node creates subdirectories under /var/log/pods that distinguish the log output of different containers, and each log file there is a link to the corresponding container log output under /var/lib/docker/containers.

Briefly describe how Kubernetes performs graceful node shutdown maintenance?

Since Kubernetes nodes run large numbers of Pods, before shutting a node down for maintenance it is recommended to first use kubectl drain to evict the Pods on that node, and then perform the shutdown and maintenance.

Briefly describe Kubernetes cluster federation?

Kubernetes cluster federation can manage multiple Kubernetes clusters as one cluster. Therefore, it is possible to create multiple Kubernetes clusters in a data center/cloud, and use cluster federation to control/manage all clusters in one place.

Briefly describe Helm and its advantages?

Helm is a package management tool for Kubernetes. Similar to apt used in Ubuntu, yum used in Centos or pip in Python.

Helm can package a set of K8S resources for unified management, which is the best way to find, share and use software built for Kubernetes.

In Helm, each package is usually called a Chart, and a Chart is a directory (generally, the directory is packaged and compressed to form a single file in name-version.tgz format, which is convenient for transmission and storage).

Helm advantages

Deploying a usable application in Kubernetes requires the collaboration of many Kubernetes resources. Using helm has the following advantages:

  • Unified management, configuration and update of these scattered k8s application resource files;
  • Distribute and reuse a set of application templates;
  • Manage a series of application resources as a single package.
  • For application publishers, Helm can package applications, manage application dependencies, manage application versions, and publish applications to software warehouses.
  • For users, there is no need to write complex application deployment files after using Helm, and applications can be found, installed, upgraded, rolled back, and uninstalled on Kubernetes in a simple way.
Source: https://www.yuque.com/docs/share/d3dd1e8e-6828-4da7-9e30-6a4f45c6fa8e


10多年IT职场老司机的经验分享,坚持自学一路从技术小白成长为互联网企业信息技术部门的负责人。2019/2020/2021年度 思否Top Writer