
Hello everyone, my name is Zhang Jintao.

Kubernetes v1.25 has been officially released, with 40 enhancements in this release.

The release logo expresses respect for the spirit of collaboration and openness that brings us together into a force capable of changing the world.

In fact, every issue of my "k8s Ecological Weekly" has an "upstream progress" section, so much of the noteworthy content has already appeared in previous articles.

In this article, I will cover some additional items worth paying attention to that have not been discussed before.

Core CSI Migration reaches Stable

Kubernetes support for CSI was introduced in v1.9 and officially went GA in v1.13.
In recent versions, the community has been gradually deprecating or removing the original in-tree plugins and migrating to CSI drivers.

The benefits of migrating to CSI are better maintainability and fewer bugs caused by in-tree code.

However, migration is not a direct replacement. Considering users' migration costs, the community proposed a solution, CSI Migration: the in-tree API is translated into the equivalent CSI API, and operations are delegated to the CSI driver. This feature has now reached GA, and there has also been a lot of progress in migrating the in-tree plugins.

In this PR, the in-tree GlusterFS plugin is deprecated; it was actually the earliest dynamic provisioner, introduced back in Kubernetes v1.4.

When CSI drivers later appeared, a corresponding driver implementation, https://github.com/gluster/gluster-csi-driver/, showed up in the community right away, but that project was not actively maintained.
There is now a recommended alternative: https://github.com/kadalu/kadalu/, whose latest release is v0.8.15.

After discussion in the community, it was decided to mark the in-tree GlusterFS plugin as deprecated in v1.25 and remove it in a subsequent release.

If you are using this plugin, I suggest evaluating the feasibility and migration cost of kadalu as soon as possible.

The PRs above all do basically the same kind of cleanup: they remove in-tree volume plugins such as StorageOS, Flocker, and Quobyte from the Kubernetes project.

If you are using StorageOS or Quobyte, it is recommended to migrate to their CSI plugins; Flocker is no longer maintained, so there is no migration plan for it.

cgroup v2 support reaches GA

In a 2019 GitChat interview, I was asked about technology trends for 2020. My main points at the time were as follows:

As the cornerstone of cloud-native technology, Kubernetes will continue to grow in popularity in 2020. The scale of clusters of various companies and the promotion of container technology will continue to increase. After the initial containerization, more companies will face stability and performance optimization issues. At the same time, technologies such as service mesh and serverless will gradually be widely used. From the perspective of the underlying technology, cgroups v2 will gradually become popular, and then replace cgroups v1, but this process may take about two or three years. Overall, stability and performance optimization will be the main theme in the future.

Now, three years have passed, and cgroup v2 support in Kubernetes v1.25 has reached GA, in line with my earlier prediction.
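As a quick aside, if you want to check which cgroup version a node is running, one common check (an illustrative snippet, not from the release notes) is to look at the filesystem type mounted at /sys/fs/cgroup:

```shell
# Print the filesystem type of the cgroup mount point on a Linux node:
# "cgroup2fs" indicates cgroup v2 (unified hierarchy),
# while "tmpfs" indicates the node is still on cgroup v1.
stat -fc %T /sys/fs/cgroup
```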

I also wrote an article, "A cornerstone of understanding container technology: cgroup"; maybe it is time to consider writing a new one about cgroup v2 (laughs).

PodSecurity feature reaches GA

The PodSecurity feature was officially promoted to GA in kubernetes/kubernetes#110459. If I remember correctly, this is probably one of the fastest features to go from introduction to GA.

PodSecurity was introduced as an alpha feature in Kubernetes v1.22 as a replacement for PodSecurityPolicy, and reached beta in v1.23. With the PR above, it officially went GA in v1.25 and is enabled by default. As you can see, the whole process was very fast.

PodSecurity defines three modes:

  • Enforce: If the Pod violates the policy, it will not be created;
  • Audit: If the Pod violates the policy, it will be recorded in the audit log, but the Pod will still be created normally;
  • Warn: If the Pod violates the policy, warning information will be printed in the console, and the Pod will still be created normally;

It is also very simple to use: just add a pod-security.kubernetes.io/<mode>=<standard> label to the namespace.
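For example (the namespace name and chosen standards below are illustrative), a namespace could enforce the baseline standard while auditing and warning on violations of restricted:

```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: my-app            # illustrative namespace name
  labels:
    # Reject Pods that violate the "baseline" standard
    pod-security.kubernetes.io/enforce: baseline
    # Record audit log entries for Pods that violate "restricted"
    pod-security.kubernetes.io/audit: restricted
    # Print a warning to the client for Pods that violate "restricted"
    pod-security.kubernetes.io/warn: restricted
```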

As long as your Kubernetes cluster is v1.22 or later with the PodSecurity feature enabled, you can migrate by following "Migrate from PodSecurityPolicy to the Built-In PodSecurity Admission Controller" in the Kubernetes documentation.

But if your cluster is on an older version and you want a more general approach, I suggest reading my earlier articles "Understanding the Admission Controller in Kubernetes" and "Cloud Native Strategy Engine Kyverno (Part 1)", and doing unified policy configuration with an admission controller, OPA/Gatekeeper, Kyverno, or similar tools.

Initial support for user namespaces

This PR implements the first phase of KEP-127, which aims to add user namespace support to Pods.
If you are not familiar with user namespaces, see my earlier series: "Understanding the Cornerstone of Container Technology: Namespace (Part 1)" and "Understanding the Cornerstone of Container Technology: Namespace (Part 2)".

The advantage of user namespace support in Kubernetes is that processes in a Pod can run with UIDs/GIDs different from those on the host, so a privileged process inside the Pod actually runs as an ordinary process on the host. That way, even if a privileged process in the Pod is compromised through a security vulnerability, the impact on the host stays relatively low.

This relates directly to vulnerabilities such as CVE-2019-5736, which I covered in my 2019 article "On the occasion of the release of runc 1.0-rc7"; interested readers can refer to that article for details.

This implementation adds a boolean hostUsers field to the Pod spec to determine whether to use the host's user namespace; it defaults to true.
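A minimal Pod manifest using the field might look like this (the Pod and image names are illustrative, and the corresponding alpha feature gate must be enabled on the cluster):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: userns-demo        # illustrative name
spec:
  hostUsers: false         # run the Pod in its own user namespace
  containers:
  - name: shell
    image: busybox         # illustrative image
    command: ["sleep", "infinity"]
```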

In addition, it is currently foreseeable that if the nodes' Linux kernel is older than v5.19, using this feature may increase Pod startup time.

CronJobTimeZone reaches beta level

The CronJob TimeZone feature has reached beta. However, under Kubernetes' current feature policy, it still needs to be enabled manually through the CronJobTimeZone feature gate.

Note that if a CronJob does not set timeZone, it defaults to the time zone of the kube-controller-manager process.
I have seen someone waste quite a bit of time on this before.
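With the feature gate enabled, you can set spec.timeZone to an IANA time zone name. A sketch (the name, schedule, and job body are illustrative):

```yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: nightly-report          # illustrative name
spec:
  timeZone: "Asia/Shanghai"     # IANA name; requires the CronJobTimeZone gate
  schedule: "0 2 * * *"         # 02:00 in the time zone above
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: OnFailure
          containers:
          - name: report
            image: busybox      # illustrative image
            command: ["sh", "-c", "echo run"]
```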

New alpha feature: ContainerCheckpoint

kubernetes/kubernetes#104907 is a PR that took almost a year, and it introduces a new feature: ContainerCheckpoint.

Those familiar with Docker may know the docker checkpoint subcommand, which creates a snapshot of a running container's state and saves it to disk.

Later, that checkpoint can be used to start the container again and restore its original state, or to migrate the container to another machine.

The ContainerCheckpoint feature in Kubernetes was added as alpha in v1.25 and is disabled by default.
With it, you can create a stateful snapshot of a container through the API provided by the kubelet and then move it to another node for debugging or similar needs.

Note that creating a checkpoint may carry security risks: a checkpoint is essentially a memory snapshot of the running container, so if the container's memory contains private data, that data may become accessible on another machine.

On the other hand, creating checkpoints produces files that take up disk space; creating them frequently may put considerable pressure on the disk.
Checkpoint archives are placed in the /var/lib/kubelet/checkpoints directory by default and named checkpoint-<podFullName>-<containerName>-<timestamp>.tar.

This feature is also very simple to use: just send a request directly to the kubelet:

 POST /checkpoint/{namespace}/{pod}/{container}
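
For example, such a request could be sent with curl (the Pod/container names and certificate paths below are illustrative, and the kubelet is assumed to be listening on its default authenticated port 10250):

```shell
# Ask the kubelet to checkpoint the "app" container of Pod "web" in "default".
# The client certificate and key paths are illustrative; the kubelet requires
# an authenticated and authorized client for this endpoint.
curl -sk -X POST \
  --cert /etc/kubernetes/pki/apiserver-kubelet-client.crt \
  --key /etc/kubernetes/pki/apiserver-kubelet-client.key \
  "https://localhost:10250/checkpoint/default/web/app"
```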

Then restore the container from the resulting archive with the following command:

 crictl restore --import=<archive>

Introducing a kuberc configuration file for kubectl

KEP-3104 aims to introduce a new configuration file, kuberc, for kubectl, used for user-defined preferences. Many projects and tools have something similar; in Vim, for example, you can specify your own configuration file with -u, or rely on the default ~/.vimrc for customization.

The advantage is that kubeconfig can stay focused, keeping only cluster information and user credentials, with the user's customizations kept separate. Concretely, the configuration file would look like this:

 apiVersion: v1alpha1
kind: Preferences

command:
  aliases:
    - alias: getdbprod
      command: get pods -l what=database --namespace us-2-production
      
  overrides:
    - command: apply
      flags:
        - name: server-side
          default: "true"
    - command: delete
      flags:
        - name: confirm
          default: "true"
    - command: "*"
      flags:
        - name: exec-auth-allowlist
          default: /var/kubectl/exec/...

It looks quite intuitive: you can add aliases and override some default options, so you no longer need to define lots of shell aliases, and you can type much less when using kubectl.
Until this feature lands, I recommend the kubectl-aliases project, which contains many aliases that can make working with kubectl easier.

But there are drawbacks too. Like every Vim user with their own vimrc, you develop certain habits, and using a machine without your custom configuration feels a little uncomfortable.
It may also add extra steps when troubleshooting (for example, a wrong setting in kuberc).

For example, when I troubleshoot Vim, I usually go straight to vim -u /dev/null so that no custom configuration is used. Once this feature is fully implemented, you will similarly want to use something like kubectl --kuberc /dev/null when troubleshooting, to avoid interference from local custom configuration.

Completion added for kubectl --subresource

Kubernetes v1.24 added subresource support (for status and scale) to kubectl, making it convenient to operate on subresources directly instead of viewing the whole object with -o yaml or calling the APIs yourself every time.
With this feature, you can get the following results:

 # v1.24+
tao@moelove:~$ kubectl get  -n apisix  --subresource status deploy apisix-ingress-controller 
NAME                        AGE
apisix-ingress-controller   2d23h

tao@moelove:~$ kubectl get  -n apisix  --subresource scale deploy apisix-ingress-controller  -ojson
{
    "apiVersion": "autoscaling/v1",
    "kind": "Scale",
    "metadata": {
        "creationTimestamp": "2022-08-04T18:57:45Z",
        "name": "apisix-ingress-controller",
        "namespace": "apisix",
        "resourceVersion": "1656",
        "uid": "7c191a14-ee55-4254-80ba-7c91b4c833bd"
    },
    "spec": {
        "replicas": 1
    },
    "status": {
        "replicas": 1,
        "selector": "app.kubernetes.io/instance=apisix,app.kubernetes.io/name=ingress-controller"
    }
}

However, using this flag with earlier versions results in an error:

 # v1.23
tao@moelove:~$ kubectl-v1.23 get  -n apisix  --subresource status deploy apisix-ingress-controller 
Error: unknown flag: --subresource
See 'kubectl get --help' for usage.

The v1.25 PR I am referring to here actually adds shell completion for --subresource (although, as mentioned above, only two subresources are currently supported), which is still quite convenient.

Other

In kubeadm, an --experimental-initial-corrupt-check flag was added to etcd's static Pod, which can be used to verify data consistency across etcd members. The feature is expected to become official in etcd v3.6. In addition, the etcd release page now notes that etcd 3.5.x is not recommended for production; if you have not upgraded yet, you can stay on 3.4.x, and if you have already upgraded, you can add this flag yourself.

This PR lasted nearly three months; it upgrades the Ginkgo used in the Kubernetes project from the deprecated v1 to v2.

In fact, many projects are actively pushing this migration, but they depend on and use Ginkgo in different ways. This PR modified more than 600 files, which is huge.
In the Apache APISIX Ingress controller project, upgrading from Ginkgo v1 to v2 took only a week, with far fewer files changed.
The Kubernetes Ingress-NGINX project is also working on this upgrade, and the workload there may not be small either.

This PR is a relatively small change, but its impact is huge.

This PR introduces a new KUBECACHEDIR environment variable to replace the default ~/.kube/cache cache directory. With this change, users may use the environment variable to skip the cache when using kubectl, which in turn may cause some performance issues.

The /logs endpoint in kube-apiserver is now disabled by default for security reasons and can be enabled with the --enable-logs-handler flag. If you collect logs through it, pay extra attention.

The kube-proxy container image will be switched to a distroless image.
This avoids many security risks and improves cluster security.

These are the more noteworthy changes in Kubernetes v1.25. Be sure to review them before upgrading your clusters.
Alright, see you next time!


Welcome to follow my public account [MoeLove].


