Summary

The term DevOps combines Development and Operations. It means connecting the development, testing, and operations stages of the software delivery process through a toolchain, and using automated testing and monitoring to reduce the time the team loses, so that products are delivered more efficiently and stably.

This article will focus on the capabilities that DevOps needs to provide in the continuous integration phase, and will briefly explain the workflow design and pipeline optimization ideas.

As the project grows, with more and more features and maintainers, the tension between delivery frequency and software quality has become increasingly acute. Balancing the two has become an urgent concern for the team, so rolling out a complete DevOps toolchain was put on the agenda.

We believe that, from code integration and functional testing to deployment, release, and infrastructure management, every stage should have comprehensive automated monitoring, and manual intervention should be avoided as much as possible. Only then can the software balance quality and efficiency, ensuring reliability while increasing release frequency. This is the ultimate goal of every successful large-scale project.

What we talk about when we talk about CI

CI (Continuous Integration) refers to the practice of frequently (multiple times a day) integrating code into the trunk.

Note that this covers not only continuously integrating code into the trunk, but also continuously building the source code into artifacts that can actually be used. We therefore need CI to automatically safeguard code quality and to turn the build output into usable artifacts for the next stage to consume.

In the CI phase, then, we have at least the following stages to implement:

  1. Static code inspection

This includes static syntax checks with ESLint/TSLint, verifying that the git commit message conforms to the convention, checking that each submitted file has a corresponding owner who can review it, and so on. These static checks do not require compilation and can be completed by scanning the source code directly.

  2. Unit tests, integration tests, and E2E tests

Automated testing is the key to ensuring product quality. The coverage and quality of the test cases directly determine the quality of the build output. Comprehensive test cases are therefore essential for continuous delivery.

  3. Compile and organize the build output

In small and medium-sized projects, this step is usually skipped, and the build output is handed straight to deployment. In large-scale projects, however, frequent commits and builds produce a large number of build outputs, which need to be managed properly. How build outputs are turned into artifacts is explained in detail later in this article.
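As an illustration of a static check that needs no compilation, here is a minimal Python sketch of a commit message check. The convention it enforces (a type, an optional scope, and a subject of at most 72 characters) and the names ALLOWED_TYPES and check_commit_message are assumptions made for this example, not the rules the team actually uses.

```python
import re

# Hypothetical convention: "<type>(<optional scope>): <subject>",
# with a subject of at most 72 characters.
ALLOWED_TYPES = ("feat", "fix", "docs", "refactor", "test", "chore")
COMMIT_PATTERN = re.compile(
    r"^(?P<type>" + "|".join(ALLOWED_TYPES) + r")"
    r"(\((?P<scope>[\w-]+)\))?: (?P<subject>\S.{0,71})$"
)


def check_commit_message(message: str) -> bool:
    """Return True if the first line of the message follows the convention."""
    lines = message.splitlines()
    first_line = lines[0] if lines else ""
    return COMMIT_PATTERN.match(first_line) is not None
```

A pipeline step could call check_commit_message on each new commit and fail fast before any build resources are spent.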

Workflow design that facilitates integration

Before officially adopting CI, we need to plan a new workflow that anticipates the problems and difficulties that may arise once the project switches to high-frequency integration. This involves changes at many levels. Besides urging developers to change their habits and training them on the new process, our main concern is how updates to the source code repository trigger the continuous integration steps.

Organizing the pipeline

We need a suitable way to organize which tasks run at which stage of a CI pipeline.

There are many CI tools on the market to choose from. Look closely and you will find that both an emerging lightweight tool like Drone and a veteran like Jenkins support Configuration as Code, that is, managing the pipeline through configuration files.

The benefits are considerable. First, the platform no longer needs a dedicated web page for pipeline management, which reduces its maintenance cost. Second, for users, the pipeline configuration lives in the source code repository and evolves together with the source code, so the CI process itself can be versioned, reviewed, and audited through git.

Once the organization of the pipeline is settled, we also need to consider the version release model and the branching strategy of the source code repository, which directly determine how we plan the pipeline for code integration.

The choice of version release mode

It is mentioned in the book "Continuous Delivery 2.0" that the version release model has three elements: delivery time, number of features, and delivery quality.

These three constrain one another. With development manpower and resources relatively fixed, we can guarantee only two of them.

The traditional project-based release model sacrifices delivery time: it waits until all features are developed and have gone through complete manual testing before releasing a new version. This lengthens the delivery cycle, and because of the large number of features, uncontrollable risk during development increases, which may prevent the version from being delivered on time. This does not meet the continuous delivery requirements of a mature large-scale project.

With continuous integration, once our integration frequency is high enough and automated testing is mature and stable enough, we no longer need to cram all features into a single release. Each feature is tested automatically as soon as it is developed and is merged once it passes, waiting for release. We then only need to automatically release, at fixed intervals, the features that are stable and waiting. For modern large-scale projects, where releases are becoming more frequent and release cycles shorter, this is undoubtedly an excellent solution.

Branch strategy

Like most teams, our original development model was to develop on branches and release from the trunk, and our branching strategy followed Git-Flow, the most mature and complete model in the industry.

This model takes feature development, bug fixes, version releases, and even hotfixes into account, and it is a workflow that can be applied in a production environment. However, the overall structure becomes extremely complicated and inconvenient to manage as a result. For example, the process for a hotfix is: pull the hotfix branch from the main branch as of the latest release, merge it into the develop branch once the fix is done, wait for the next version, pull it into the release branch, and merge it back into the trunk after the release is complete.

In addition, Git-Flow does not enforce when integration must happen. For larger requirements, the interval between integrations may be very long, so a large number of conflicts may appear when integrating into the trunk, and resolving them stretches the project schedule unreasonably. Anyone who has done a large-scale transformation or refactoring will know this well.

In response to this, we decided to boldly adopt the branch strategy of trunk development and trunk release.

We require members of the development team to do their best to merge their branch code into the trunk every day. When the release conditions are met, the release branch is pulled directly from the trunk. If a defect is found, it is fixed directly on the trunk and cherry-picked to the release branch of the corresponding version as needed.

In this way, the only branches developers need to care about are the trunk and their own working branch. All branch operations can be done with just two git commands, push and merge. At the same time, because integration is more frequent, the average number of conflicts each person needs to resolve drops sharply, which removes a pain point for many developers.

It should be noted that there is no silver bullet among branching strategies and release models. The strategy we adopted may not suit every team or project. Raising the integration frequency as much as possible lets the product iterate quickly, but it inevitably makes it hard for newly developed features to be fully tested and verified by hand.

Resolving this contradiction requires strong infrastructure and long-term habit building. The difficulties fall into the following categories, which you can weigh to decide whether trunk-based development is right for you.

  1. Complete and fast automated testing. Only when unit test, integration test, and E2E test coverage is extremely high, and mutation testing shows that the test cases themselves are of high quality, is the overall quality of the project assured. This requires every developer on the team to get used to TDD (Test-Driven Development), which is a very long process of cultivating an engineering culture.
  2. A code review mechanism based on owner responsibility. Giving developers a sense of ownership and having them review changes to the modules they are responsible for line by line avoids many destructive modifications and architectural pitfalls. In essence, the difficulty lies in building developer habits.
  3. Heavy infrastructure investment. High-frequency automated testing is resource-intensive, especially E2E testing, where each test case needs a headless browser behind it. In addition, multi-core machines are needed to run tests in parallel and improve efficiency. Each of these is a large investment of resources.
  4. Fast and stable rollback, plus accurate production and grayscale monitoring. Only with highly automated, full-link monitoring can the stable operation of new versions released under this mechanism be guaranteed. I will describe this part in detail in a later article.
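To make the idea of judging test quality by mutation testing more concrete, here is a small hand-written Python sketch. Real mutation testing tools generate mutants automatically; the function is_adult, the single mutant, and the two test suites below are hypothetical and exist only to show why coverage alone is not enough.

```python
def is_adult(age: int) -> bool:
    """Original implementation."""
    return age >= 18


def is_adult_mutant(age: int) -> bool:
    """Mutant: the boundary operator >= has been changed to >."""
    return age > 18


def weak_suite(fn) -> bool:
    """Executes every line of fn, yet passes for both versions."""
    return fn(30) is True and fn(5) is False


def strong_suite(fn) -> bool:
    """Also checks the boundary value, so the mutant fails."""
    return weak_suite(fn) and fn(18) is True


def mutant_killed(suite) -> bool:
    """A mutant is killed if the suite passes on the original but fails on the mutant."""
    return suite(is_adult) and not suite(is_adult_mutant)
```

Both suites reach full line coverage of is_adult, but only strong_suite kills the mutant, which is the kind of difference a mutation score is meant to expose.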

From build outputs to artifacts in large-scale projects

For most projects, once the code is compiled and the build output is generated, deployment means logging in to the release server and copying each generated file onto it. The generated static files can coexist because their hashes differ, while the HTML is updated by overwriting it in place.

Updating and overwriting files by copy and paste makes it hard to audit and trace the update history, and it is also hard to ensure that such changes are correct.

In addition, when we need to roll back, the historical version of the HTML is not stored on the server, so rolling back actually means recompiling and repackaging the historical version and overwriting the current one. This rollback speed is clearly unsatisfactory.

One solution is to never overwrite or update files in place. All build outputs are uploaded to persistent storage, and a traffic distribution service placed upstream of the requests decides which version of the HTML file to return for each request.
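A minimal Python sketch of such a routing decision might look like the following. The ReleasePlan structure, the percentage-based grayscale rule, and the hashing of a user ID into a bucket are all assumptions for illustration. The article does not describe how the real traffic distribution service chooses a version.

```python
import hashlib
from dataclasses import dataclass


@dataclass
class ReleasePlan:
    stable_version: str   # version served to most users
    canary_version: str   # new version under grayscale release
    canary_percent: int   # share of users (0-100) routed to the canary


def bucket_for(user_id: str) -> int:
    """Map a user to a fixed bucket in [0, 100) so the same user always sees the same version."""
    digest = hashlib.sha256(user_id.encode("utf-8")).hexdigest()
    return int(digest, 16) % 100


def choose_version(plan: ReleasePlan, user_id: str) -> str:
    """Pick which stored HTML version to return for this request."""
    if bucket_for(user_id) < plan.canary_percent:
        return plan.canary_version
    return plan.stable_version


def rollback(plan: ReleasePlan) -> ReleasePlan:
    """Rolling back only changes routing; nothing is rebuilt or overwritten."""
    return ReleasePlan(plan.stable_version, plan.stable_version, 0)
```

Because every version stays in storage, rolling back becomes a change to the routing data rather than a rebuild, which is what makes it fast.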

For large projects, the returned HTML file is not necessarily static. Identifiers such as the channel and user-defined settings, as well as the above-the-fold data required for SSR, may be injected into it, changing its content. We therefore believe the HTML file should be served by a separate dynamic service that fills in the HTML template through some logic and outputs the final result.
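The template replacement step can be sketched in Python roughly as follows. The template text, the placeholder names channel and initial_data, and the window.__INITIAL_DATA__ variable are hypothetical. The escaping shown is a basic precaution for this sketch, not a complete security treatment.

```python
import html
import json
from string import Template

# Hypothetical template emitted by the build; $channel and $initial_data are placeholders.
PAGE_TEMPLATE = Template(
    '<html><body data-channel="$channel">'
    "<script>window.__INITIAL_DATA__ = $initial_data;</script>"
    "</body></html>"
)


def render_page(channel: str, initial_data: dict) -> str:
    """Fill the template per request instead of serving one static file."""
    # Escape "</" so injected data cannot close the <script> tag early.
    data_json = json.dumps(initial_data).replace("</", "<\\/")
    return PAGE_TEMPLATE.substitute(
        channel=html.escape(channel, quote=True),
        initial_data=data_json,
    )
```

For instance, render_page("wechat", {"title": "Doc"}) returns the template with the channel attribute and the serialized first-screen data filled in.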

To sum up, after each compilation the build output is organized as follows to produce the final front-end artifacts:

  1. Static files such as CSS and JS are published to cloud object storage, which serves as the origin for the CDN to synchronize from and accelerate access.
  2. HTML needs a direct-output service (the dynamic HTML service described above) behind it. This service is packaged as a Docker image, at the same level as the back-end microservice images, so that the upstream traffic distribution service (gateway) can choose which service instance handles each user request.

Speed is efficiency: pipeline optimization ideas

For a good tool, the internal design can be complicated, but it must be simple and easy to use for the user.

With high-frequency continuous integration under trunk-based development, integration speed is efficiency. The execution time of the pipeline is what developers care about most, and it is the decisive indicator of whether the pipeline is pleasant to use. We can improve pipeline efficiency and reduce developers' waiting time from several angles.

Pipeline task orchestration

The tasks executed at each stage of the pipeline should be arranged according to a few orchestration principles: tasks without predecessors go first, tasks with shorter execution times get priority, and unrelated tasks run in parallel.

Following these principles, we can run a shortest-path dependency analysis over the tasks executed in the pipeline and obtain the earliest possible start time of each task.
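One way to sketch this analysis in Python is shown below, using the standard library graphlib module (Python 3.9+). The task names, durations, and dependencies are invented for the example. The sketch also assumes unlimited parallel workers, so a task's earliest start is simply the latest finish time among its predecessors.

```python
from graphlib import TopologicalSorter

# Hypothetical tasks: name -> (duration in minutes, names of predecessor tasks)
TASKS = {
    "lint": (2, []),
    "unit_test": (5, []),
    "build": (8, ["lint"]),
    "e2e_test": (15, ["build"]),
    "package": (3, ["build", "unit_test"]),
}


def earliest_start_times(tasks):
    """Earliest start of a task = latest finish time among its predecessors."""
    graph = {name: set(deps) for name, (_, deps) in tasks.items()}
    start = {}
    # static_order() yields every task after all of its predecessors.
    for name in TopologicalSorter(graph).static_order():
        deps = tasks[name][1]
        start[name] = max((start[d] + tasks[d][0] for d in deps), default=0)
    return start


def schedule(tasks):
    """Order tasks by earliest start, then by shortest duration."""
    start = earliest_start_times(tasks)
    return sorted(tasks, key=lambda name: (start[name], tasks[name][0], name))
```

With the sample data, lint and unit_test have no predecessors and can start in parallel at minute 0, build starts at minute 2, and both package and e2e_test can start at minute 10.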

Clever use of Docker Cache

Docker provides the following feature: when building a Docker image, each instruction in the Dockerfile builds a new image layer and caches it. On subsequent builds, Docker checks its cache layer by layer. If an identical layer is found, the cached layer is reused directly, which greatly shortens repeated builds.

We can use this feature of Docker to reduce the steps that are usually repeated in the pipeline, thereby improving the efficiency of CI execution.

For example, the most time-consuming step in front-end projects is usually dependency installation with npm install. Under high-frequency integration, dependency changes are relatively rare. We can therefore package node_modules into an image during the first build and reuse it in later builds. An example Dockerfile:

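FROM node:12 AS dependencies
WORKDIR /ci
# Copy only the dependency manifests so this layer changes only when dependencies change
COPY package.json package-lock.json ./
RUN npm install
# Let Node resolve modules from the cached directory
ENV NODE_PATH=/ci/node_modules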

We add a cache-hit check to the pipeline: before the next compilation, first check whether the cached image exists. In addition, to make sure the dependencies have not changed for this build, we also compare the MD5 hash of package-lock.json with the one recorded for the cached image. If they differ, the dependencies are reinstalled and a new image is built and cached. If they match, the node_modules folder is taken directly from the image, saving much of the installation time.
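The comparison step can be sketched in Python as follows. The idea of tagging the cached image with the lockfile hash and the ci-deps image name are assumptions made for this example, as is passing the file content in as bytes. The real pipeline may record and look up the hash differently.

```python
import hashlib
from typing import Optional


def lockfile_md5(content: bytes) -> str:
    """MD5 hex digest of the package-lock.json content."""
    return hashlib.md5(content).hexdigest()


def dependency_image_tag(content: bytes) -> str:
    """Hypothetical naming scheme: tag the cache image with the lockfile hash."""
    return f"ci-deps:{lockfile_md5(content)}"


def needs_reinstall(content: bytes, cached_md5: Optional[str]) -> bool:
    """True when no cached image exists or the lockfile changed since it was built."""
    if cached_md5 is None:
        return True
    return lockfile_md5(content) != cached_md5
```

In a pipeline script, the content would come from reading package-lock.json in the checked-out repository, and cached_md5 from wherever the hash of the last cached image was recorded.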

An example of how the pipeline pulls the folder from the cached image is shown below, where --from is followed by the alias of the previously built cache image:

COPY --from=dependencies /ci/node_modules ./node_modules
# ... other build steps follow

In the same way, this approach extends to every task in the CI process whose inputs change rarely and whose output takes a long time to generate. Installing environment dependencies on Linux, preparing caches before unit test cases run, and even copying folders containing large numbers of static files can all use the Docker cache to nearly skip the step and shorten integration time. Since the principle is much the same, I won't repeat it here.

Hierarchical builds

As we all know, pipeline execution inevitably slows down as the number of tasks grows. In large-scale projects, as more metric calculations are integrated and the number of test cases keeps growing, the running time will sooner or later become unbearable.

However, the number of test cases determines the quality of the project to some extent, and quality checks cannot be cut. So is there a way to keep guaranteeing project quality while reducing the time developers wait for integration? The answer is hierarchical builds.

Hierarchical building splits the CI pipeline into a primary build and a secondary build. The primary build runs on every code submission, and if its checks fail, the next step cannot proceed. The secondary build does not block the workflow: it runs as a bypass after the code is merged. However, once secondary build verification fails, the pipeline immediately sends an alert and blocks the integration of all other code until the problem is fixed.

A few principles decide whether a task belongs in the secondary build:

  1. The secondary build holds tasks that take a long time to run (for example, more than 15 minutes) and consume more resources, such as E2E tests.
  2. The secondary build should hold tasks whose test cases have low priority or a low probability of failing, and should avoid important checks as far as possible. If practice shows that some automated test cases fail frequently, consider adding unit tests for the related functionality and moving them into the primary build.
  3. If the secondary build is still too long, consider splitting the test cases appropriately and running them in parallel.
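These rules can be sketched as a small Python decision helper. The 15-minute threshold comes from the example in the text. The Task fields, the 5% failure-rate cut-off, and the function names are assumptions for illustration only.

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    minutes: float        # typical execution time
    failure_rate: float   # observed share of failing runs, 0.0 to 1.0


MAX_PRIMARY_MINUTES = 15       # example threshold mentioned in the text
FREQUENT_FAILURE_RATE = 0.05   # hypothetical cut-off for "fails often"


def assign_tier(task: Task) -> str:
    """Long-running tasks go to the secondary build; everything else blocks the merge."""
    return "secondary" if task.minutes > MAX_PRIMARY_MINUTES else "primary"


def needs_primary_coverage(task: Task) -> bool:
    """A secondary task that fails often should get unit tests in the primary build."""
    return assign_tier(task) == "secondary" and task.failure_rate > FREQUENT_FAILURE_RATE


def can_merge(primary_passed: bool, last_secondary_passed: bool) -> bool:
    """A failed secondary build blocks all further integration until it is fixed."""
    return primary_passed and last_secondary_passed
```

For example, a 40-minute E2E suite with a 10% failure rate is assigned to the secondary build and is also flagged as needing faster unit tests in the primary build.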

Concluding remarks

As the saying goes, a craftsman who wants to do good work must first sharpen his tools. The high-frequency, stable releases of the Tencent Docs project are backed by strong infrastructure.

This article mainly covers the project's transformation in the continuous integration stage. The transformation ideas for the continuous deployment and continuous operation stages will be explained in detail in the author's next article. You are welcome to join the discussion and to point out anything that needs improvement or correction.

References

  1. "Continuous Delivery 2.0" by Qiao Liang
  2. https://www.redhat.com/zh/topics/devops/what-is-ci-cd
  3. https://www.36kr.com/p/1218375440667012
