Author: Bao Jiangnan

Introduction

Around April 2021, I was fortunate to learn about sealer just as the project was getting started, and soon afterwards I joined it as one of its first student developers.

In this article I look back on how I happened to join the sealer open source project, the challenges I met along the way, and what I gained from it. I am writing it down in the hope of helping newcomers to open source and encouraging students who want to take part in open source work but have not yet taken the first step.

Personal profile

Hello readers. This is my first article published on Alibaba Cloud, so let me briefly introduce myself first.

I am Bao Jiangnan, a maintainer of sealer [1]. I graduated from the Turing class at Central South University and am now a master's student in the SEL laboratory at Zhejiang University. My main research direction at present is scheduling for co-located clusters.

GitHub: https://github.com/justadogitaken

The beginning of the journey

When I started graduate school, others may have seemed confident and ready to go, but I was inexplicably anxious and confused. My laboratory was the SEL laboratory at Zhejiang University, whose main research field is cloud computing, so we have many senior alumni across the cloud computing industry. At that time I joined "Harmony Cloud Technology", a high-tech cloud computing company, as a cloud native R&D intern. Harmony Cloud Technology, whose core team members come from the SEL laboratory at Zhejiang University, works with Alibaba Cloud on many cutting-edge cloud native projects. By chance, I joined the joint Harmony Cloud and Alibaba Cloud cloud native project team, where I met Sun Hongliang, a senior alumnus of the lab who was on the Alibaba Cloud cloud native team, and through him I met the founder of sealer, Fang Haitao. Both have a great deal of open source experience.

While chatting, I learned that Hongliang had been a Docker maintainer, and I took him as my benchmark and as the direction for my own efforts; I suppose this is how most people set their goals. One conversation left a deep impression on me. He asked me, "What do you most want to do this year (2021)?" I answered, "Leave some traces in open source" (although at the time I had not actually figured out why I wanted to do open source). Looking back, it was probably a naive idea, a case of grasping at any remedy, brought on by anxiety and unclear personal planning. Either way, it gave me a preliminary goal, and I threw myself into it wholeheartedly with a "make it happen" attitude.

Around April 2021, I joined the sealer open source team and, together with several fellow students, focused on developing sealer's core capability (originally called the cluster image).

Responsibilities and Challenges

sealer is a cloud-native tool open sourced by Alibaba Cloud that aims to make distributed software easier to package, distribute, and run. Thanks to its novel design and a growing base of industry users, sealer has since been donated to the CNCF and has become a CNCF sandbox project, moving towards a broader industry standard.

The early days of a piece of software are often a mix of chaos and hope. Behind the big goal, the sealer team had a great many technical problems to solve, such as the user interface, image format, distribution model, runtime efficiency, and software architecture. In the initial division of work, I was mainly responsible for the image module, including the cluster image cache, the cache of container images that the cluster depends on, and cluster image sharing.

Cluster image cache: how to greatly improve image build efficiency by reusing cluster image layers. In docker build, for example, each build first looks for previously cached build content and reuses it, which reduces disk usage and speeds up the build.

Cluster-dependent container image cache: how to cache all the container images the cluster depends on without the user noticing. In the early days, when sealer build built a cluster image, it actually had to bring up a cluster and then package the cluster image once all workloads had started normally. A very important part of this was caching all the container images the cluster depends on so that they could be packaged.

Cluster image sharing: anyone can share cluster images with sealer using push/pull/save/load and similar commands, just as with the docker tool.
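The layer reuse behind the cluster image cache can be illustrated with a small sketch. It is not sealer's actual implementation: `LayerCache`, `cache_key`, and the build instructions are hypothetical, and the sketch assumes that a layer is identified by a hash of its parent layer's ID plus the build instruction, in the spirit of docker's build cache.

```python
import hashlib
from typing import Dict, List


def cache_key(parent_id: str, instruction: str) -> str:
    """Derive a layer ID from the parent layer ID and the build instruction."""
    digest = hashlib.sha256(f"{parent_id}\n{instruction}".encode("utf-8"))
    return digest.hexdigest()


class LayerCache:
    """In-memory stand-in for an on-disk layer store (hypothetical)."""

    def __init__(self) -> None:
        self._layers: Dict[str, str] = {}
        self.hits = 0
        self.misses = 0

    def build(self, instructions: List[str]) -> List[str]:
        """Process instructions in order, reusing layers that are already cached."""
        parent_id = ""
        layer_ids = []
        for instruction in instructions:
            key = cache_key(parent_id, instruction)
            if key in self._layers:
                self.hits += 1
            else:
                self.misses += 1
                # A real builder would execute the instruction and store the
                # resulting file system diff; here we only record the instruction.
                self._layers[key] = instruction
            layer_ids.append(key)
            parent_id = key
        return layer_ids


layer_cache = LayerCache()
# Illustrative instructions only; they are not taken from a real build file.
first = layer_cache.build(["FROM base-cluster", "COPY app.yaml manifests", "CMD apply-v1"])
second = layer_cache.build(["FROM base-cluster", "COPY app.yaml manifests", "CMD apply-v2"])
print(layer_cache.hits, layer_cache.misses)
```

Because each key includes the parent ID, changing one instruction invalidates only that layer and the layers after it, while the unchanged prefix is reused.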

First technical challenge

Of sealer's many challenges, the ones that impressed me most were caching the container images the cluster depends on and proxying container images from private registries. I remember Haitao coming to me and saying he wanted me to take charge of a core function of sealer: "How can sealer cache every container image pulled during the build process, without the user providing any additional information and without the user noticing?" Some brief background follows.

  • Why does sealer need to cache all container images that the cluster depends on?

A docker container image packages the file system and configuration information that an application needs. With the help of virtualization technology, docker run can then run it directly in any environment, even one isolated from the external network (as long as the application itself has no logic that accesses the external network). sealer is committed to defining a standard for cluster delivery, so it also has to solve image pulling under external network isolation, which is a hard requirement in private cloud delivery scenarios in particular. These application container images are part of the file system that a sealer cluster image needs. There are many ways to solve the problem. The easiest is to let users fill in a list of images, which sealer then pulls and packages together. But Haitao and I both felt that this was neither user-friendly nor elegant: sealer's users should be able to be "lazy" and not be bothered with such trivial chores. So we do the tedious work for the user.

  • Why does sealer need to solve the container image proxy problem for private registries?

An issue [2] opened in the docker community in 2015 asks for the docker daemon configuration to support mirrors for private registries, but the community has not resolved it so far. Figure 1 illustrates the problem:

Figure 1 Mirror proxy logic of native docker

During the image build phase, sealer caches all the container images the cluster needs in a local registry; after the cluster starts, the CRI runtime pulls the container images from that registry. However, docker does not support mirror configuration for private registries. When pulling "example.hub/library/centos" (where example.hub stands for any registry other than Docker Hub), docker ignores the mirror address configured in the docker daemon and pulls directly from "example.hub".
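The behaviour just described can be summarised in a short sketch. It is a simplified model rather than docker's code: `registry_host` and `pull_endpoints` are hypothetical helpers, `local.registry:5000` is a made-up mirror address, and the rule for spotting a registry host (the first path component contains a dot or a colon, or is `localhost`) is a simplification of how docker parses image references.

```python
from typing import List

DOCKER_HUB = "docker.io"


def registry_host(image: str) -> str:
    """Return the registry host of an image reference, defaulting to Docker Hub."""
    first, sep, _ = image.partition("/")
    if sep and ("." in first or ":" in first or first == "localhost"):
        return first
    return DOCKER_HUB


def pull_endpoints(image: str, hub_mirrors: List[str]) -> List[str]:
    """List the endpoints tried, in order, when pulling `image`.

    Models stock docker: configured mirrors are consulted only for Docker Hub images.
    """
    host = registry_host(image)
    if host == DOCKER_HUB:
        return hub_mirrors + [DOCKER_HUB]
    return [host]


mirrors = ["local.registry:5000"]  # hypothetical mirror address
print(pull_endpoints("library/centos", mirrors))
print(pull_endpoints("example.hub/library/centos", mirrors))
```

In this model the second pull never reaches the mirror, which is why images cached locally for a private registry cannot be served through the native mirror setting.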

Pulling directly from the original registry is not what we want, because the container images cached in the build phase are there precisely for this startup phase, and in private cloud delivery scenarios the cluster network is isolated from the outside world. One solution we considered at first was to use K8s webhooks to replace the registry in every image reference with the address of the local registry (or add it as a prefix) before a pod is created. However, modifying the application's YAML in this way is somewhat intrusive, and we were determined to find something more elegant.

To solve these two problems, caching the container images the cluster depends on and proxying container images from private registries, I began studying the source code of docker [3] and registry [4]. I quickly located the mirror configuration logic in the docker source, and I learned from the official documentation that the registry itself can act as a pull-through cache, as shown in code block 1. However, the registry supports only a single remoteurl, while a user's images may come from several remote registries, so the registry's native configuration could not be used as it is.
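For comparison, the webhook approach amounts to rewriting every image reference in a pod before the pod is created. The following is a minimal sketch under stated assumptions: a pod is modelled as a plain dictionary, `rewrite_image` and `mutate_pod` are hypothetical names, and `local.registry:5000` is a made-up address for the local registry. A real mutating webhook would return a patch to the Kubernetes API server rather than a modified dictionary.

```python
import copy
from typing import Any, Dict

LOCAL_REGISTRY = "local.registry:5000"  # hypothetical address of the local registry


def rewrite_image(image: str, local_registry: str = LOCAL_REGISTRY) -> str:
    """Point an image reference at the local registry, keeping its repository path."""
    first, sep, rest = image.partition("/")
    has_host = bool(sep) and ("." in first or ":" in first or first == "localhost")
    # Drop the original registry host if there is one, otherwise keep the whole name.
    path = rest if has_host else image
    return f"{local_registry}/{path}"


def mutate_pod(pod: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of the pod with every container image rewritten."""
    mutated = copy.deepcopy(pod)
    spec = mutated.get("spec", {})
    for field in ("initContainers", "containers"):
        for container in spec.get(field, []):
            container["image"] = rewrite_image(container["image"])
    return mutated


pod = {"spec": {"containers": [{"name": "app", "image": "example.hub/library/centos:7"}]}}
print(mutate_pod(pod)["spec"]["containers"][0]["image"])
```

The pod that actually runs no longer matches the YAML the user wrote, which is the intrusiveness that made us look for another approach.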

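 proxy:
   # Upstream registry whose images are cached locally on first pull
   remoteurl: https://registry-1.docker.io
   # Optional credentials for the upstream registry
   username: [username]
   password: [password]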

Code block 1 Registry pull-through cache configuration

Since the community did not support this, and docker was slow to move on mirror proxies for private registries, I extended the existing community capabilities directly. The changes do not affect docker's other logic; they only add a few configuration items to provide the functionality we need.

In the end, the docker and registry capabilities we provide are shown in Figure 2.

The main features are:

  • Docker supports mirror proxies for any registry.
  • The registry supports caching from multiple remote registries and works without any user configuration. Users can, of course, still configure it themselves.

Figure 2 Schematic diagram of sealer's enhanced docker/registry capabilities
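To make the second feature in the list above concrete, here is a minimal sketch of a pull-through cache that serves several upstream registries instead of the single remoteurl that the stock registry allows. It is an illustration only, not the code of sealer's registry: `MultiRemoteCache`, `fake_upstream`, and the host names are hypothetical, and the sketch assumes the cache is told which registry each image originally came from.

```python
from typing import Callable, Dict, List, Tuple

# A fetcher takes a repository path such as "library/centos:7" and returns image bytes.
Fetcher = Callable[[str], bytes]


class MultiRemoteCache:
    """Pull-through cache keyed by (upstream host, repository path)."""

    def __init__(self, upstreams: Dict[str, Fetcher]) -> None:
        self._upstreams = upstreams
        self._store: Dict[Tuple[str, str], bytes] = {}

    def fetch(self, host: str, path: str) -> bytes:
        """Serve from the cache, contacting the matching upstream only on a miss."""
        key = (host, path)
        if key not in self._store:
            if host not in self._upstreams:
                raise KeyError(f"no upstream configured for {host}")
            self._store[key] = self._upstreams[host](path)
        return self._store[key]


calls: List[Tuple[str, str]] = []


def fake_upstream(name: str) -> Fetcher:
    """Build a stand-in for a remote registry that records each request."""
    def fetch(path: str) -> bytes:
        calls.append((name, path))
        return f"{name}/{path}".encode("utf-8")
    return fetch


registry_cache = MultiRemoteCache({
    "docker.io": fake_upstream("docker.io"),
    "example.hub": fake_upstream("example.hub"),
})
registry_cache.fetch("example.hub", "library/centos:7")    # miss: goes to example.hub
registry_cache.fetch("example.hub", "library/centos:7")    # hit: served from the cache
registry_cache.fetch("docker.io", "library/nginx:latest")  # miss: goes to docker.io
print(calls)
```

The sketch takes an explicit table of upstreams for clarity; the point is only that the cache key has to include the original registry, so that images from different remotes do not collide.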

Sense of achievement and responsibility

As far as I remember, two to three months after we open sourced sealer, it gained its first production user: Zhengcaiyun.

Several events in my life so far have made me very happy, such as interning at Douyin in Beijing in my senior year, with the novelty of joining a large company, and being admitted to the master's programme at Zhejiang University. The most recent is our open source tool gaining its first customer willing to run it in production. It shows that our work has been recognized by others, and since a large part of that work came from my own hard thinking, it gave me a unique sense of accomplishment.

However, as sealer drew more attention, my responsibilities grew. When we wrote software before, we often cared only about features: it was enough that the code ran. Open sourcing sealer meant caring about much more, such as how to assess and meet open source users' needs gracefully, how to design an architecture that can support sealer's continued development, and how to divide our energy so that sealer's software quality holds up. The most direct way to improve in these areas is to learn from excellent projects. That was probably the most efficient period of learning I have had. I read a great deal of the docker and registry source code, and one thing I learned from docker is how to modularize code by function.

In the early stage of sealer development there were only a few developers. To iterate faster, everyone wrote their own utility code for the underlying files and metadata, even though each person was responsible for a different module. Because the metadata and files were very likely to change as we kept iterating, this was a serious hidden danger.

Because I had been learning from the docker code for some time, I decided that, starting from a certain version, all image-related operations would converge in the image module, which exposes interfaces to the other modules. The file operations it relies on at the lower level were abstracted one step further into a file system module. After refactoring these modules, our code was cleaner than before, and the risk of other developers misusing the underlying files was reduced.

While putting sealer into practice, Zhengcaiyun helped us discover many problems. I still remember communicating frequently during that period with another sealer maintainer, Capricorn, to resolve them. What I remember most vividly is that when I went to Wuhan to visit my girlfriend at Huazhong University of Science and Technology, out of a sense of responsibility to sealer's first customer, I worked through the issues Capricorn had raised in the community one by one in the university's coffee shop, library, and school history room.
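The layering described here can be sketched roughly as follows. This is an illustration under assumptions, not sealer's code: `FileSystem`, `MemoryFileSystem`, `ImageModule`, and the `images` directory are all hypothetical names chosen for the example.

```python
import json
from typing import Dict, List, Protocol


class FileSystem(Protocol):
    """Lowest layer: raw file access, with no knowledge of images."""

    def read(self, path: str) -> bytes: ...

    def write(self, path: str, data: bytes) -> None: ...


class MemoryFileSystem:
    """In-memory FileSystem, used here so the example runs anywhere."""

    def __init__(self) -> None:
        self.files: Dict[str, bytes] = {}

    def read(self, path: str) -> bytes:
        return self.files[path]

    def write(self, path: str, data: bytes) -> None:
        self.files[path] = data


class ImageModule:
    """Single owner of the image metadata layout; other modules call only this."""

    def __init__(self, fs: FileSystem, root: str = "images") -> None:
        self._fs = fs
        self._root = root  # hypothetical metadata directory

    def _path(self, name: str) -> str:
        return f"{self._root}/{name}.json"

    def save_metadata(self, name: str, layers: List[str]) -> None:
        payload = json.dumps({"layers": layers}).encode("utf-8")
        self._fs.write(self._path(name), payload)

    def layers(self, name: str) -> List[str]:
        return json.loads(self._fs.read(self._path(name)))["layers"]


fs = MemoryFileSystem()
images = ImageModule(fs)
images.save_metadata("my-cluster", ["layer-a", "layer-b"])
print(images.layers("my-cluster"))
```

With this shape, a change to the metadata format or file layout touches only the image module, and callers in other modules keep using the same interface.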

Gains and Reflections

Self-confidence: I believe that when most people meet setbacks and difficulties, they remind themselves of what they have already accomplished so that they can persevere.

Openness and communication: While working on sealer, I based many designs on docker but still had doubts about them. During that time I communicated frequently with community members, not only in the sealer community but across the whole open source community. For instance, I ran into problems when compressing images and could not reach a conclusion after thinking for a long time, so I emailed vbatts, the author of tar-split, and my doubts were cleared up. I think communication is one of the most important skills at work: adequate communication often solves many problems and avoids a lot of wasted effort.

Letting go of open source worship: In "The beginning of the journey" above, I mentioned that my hope for 2021 was to leave some traces in open source, although I was not clear at the time why I wanted to. It was probably for so-called reputation. After taking part in an open source project fairly thoroughly, I found that blindly chasing open source contributions in order to harvest that reputation is meaningless. Do things that you believe in and that have value; the rest is not important.

Continuous learning: The main challenge described in this article is the container image proxy for private registries. Solving it gave me a sense of accomplishment at the time, and I learned a lot in the process. Looking back now, however, the solution has flaws that cannot be ignored. For example, users must use the docker that sealer provides, which is a relatively heavyweight requirement. Although it seems reasonable to ship this component as part of sealer's rootFS, it means we have to provide most versions of docker and keep track of upstream version changes. Seen from today, the solution is not really elegant. While browsing the community recently, I came across a tool called intel/cri-resource-manager [5]. My guess at its background is that the docker/K8s community has been too slow to adopt technologies such as RDT [6], so Intel developed a plug-in that sits between kubelet and the CRI runtime and gives the upper-layer Kubernetes some cutting-edge container runtime features in a non-invasive way. After studying the project's overall architecture, it occurred to me that sealer's container image proxy for private registries might also be solved non-invasively.

Future plans

Performance improvement: keep optimizing delivery efficiency and stability, so that sealer first of all does cluster delivery extremely well.

Architecture optimization: sealer's sub-modules are currently quite tightly coupled and the barrier to entry for newcomers is high, which makes it hard to grow a developer ecosystem. In the future we will focus on abstracting each functional module so that community participants can concentrate on individual areas, for example a runtime module that supports k0s/k3s.

Expand the ecosystem: guide the community to take part in building cluster images and enrich sealer's application ecosystem.

Attract more developers: the community needs to bring in more developers in order to grow, and it needs a simpler quick start to lower the barrier to development.

Multi-community cooperation: the sealer community is establishing cooperation with more open source communities, such as openyurt [7] and sealos [8], to promote a win-win situation for all parties.

Reference links:

[1] sealer: https://github.com/sealerio/sealer
[2] Enable engine to mirror private registry: https://github.com/moby/moby/issues/18818
[3] moby: https://github.com/moby/moby
[4] distribution: https://github.com/distribution/distribution
[5] cri-resource-manager: https://github.com/intel/cri-resource-manager.git
[6] Intel Resource Director Technology: https://www.intel.com/content/www/us/en/architecture-and-technology/resource-director-technology.html
[7] openyurt: https://github.com/openyurtio/openyurt
[8] sealos: https://github.com/labring/sealos


