In-depth analysis of the inside story of full-link grayscale technology

Author: Yang Shao

When a new version of a service is released, diverting a small amount of traffic to it allows problems to be discovered early and prevents large-scale failures. The industry already has mature release strategies, such as blue-green release, A/B testing, and canary release, but they mainly focus on how to release a single service. In a microservice architecture, the dependencies between services are intricate, and releasing a single feature sometimes requires several services to be upgraded and launched together. We want the new versions of these services to be verified together with a small amount of grayscale traffic. This is the full-link grayscale scenario, which is unique to the microservice architecture: by building environment isolation from the gateway through the entire back-end service chain, multiple services with different versions can undergo grayscale verification together.

This article demystifies full-link grayscale. It analyzes the technology behind it, introduces two implementation schemes and discusses their technical details in depth, and finally demonstrates in a hands-on section how full-link grayscale is used in a real business scenario.

Challenges posed by microservice architecture

To keep pace with business iteration, developers began to split the original monolithic architecture at a finer granularity, turning the service modules of the monolithic application into microservices that are deployed and run independently. The life cycle of each microservice is the sole responsibility of the corresponding business team, which effectively addresses the lack of agility and flexibility in the monolithic architecture. A common practice is to split services by business domain or functional domain, as shown in the following figure:

In this architecture, the traffic gateway is a Layer 4 proxy whose main functions include load balancing, TLS offloading, and some security protection; the microservice gateway is a Layer 7 proxy, mainly used for exposing back-end services, traffic management, access control, and traffic monitoring. With "high cohesion and low coupling" as its design principle, the microservice architecture gives developers an unprecedented development experience. Each business team focuses on the code logic of its own business and publishes it through APIs. A service consumer only needs to import the API definition of the service provider to communicate with it, and does not need to care about the provider's deployment form or internal implementation.

But no architecture is a silver bullet; solving old problems inevitably introduces new ones. The most troublesome problem in a microservice system is how to manage many microservices efficiently and conveniently, which mainly shows up in three aspects: visibility, connectivity, and security. Broken down further, the microservice architecture brings the following challenges:

This article focuses on the sub-field of service release: how to make the release of a new service version smooth and lossless in a microservice system, and how to build traffic-isolated environments for multiple microservices at low cost, so that developers can carry out sufficient grayscale verification of the new versions of multiple services at the same time and avoid failures.

What is full link grayscale

Service release under monolithic architecture

First, let's take a look at how to release a new version of a service module in the application in the monolithic architecture. As shown in the figure below, the Cart service module in the application has a new version iteration:

Since the Cart service is part of the application, the entire application needs to be compiled, packaged, and deployed when the new version is launched. The service-level publishing problem has become an application-level publishing problem. We need to implement an effective publishing strategy for new versions of applications rather than services.

The industry currently has very mature service release solutions, such as blue-green release and grayscale release. Blue-green release requires a redundant deployment of the new version of the service. Generally, the machine specifications and quantity of the new version match those of the old version, which amounts to two identical deployment environments for the service; at this point, only the old version serves external traffic, and the new version acts as a hot standby. When the service is upgraded, we only need to switch all traffic to the new version, and the old version becomes the hot standby. The schematic diagram of our example using blue-green release is as follows; the traffic switch can be completed at the Layer 4 traffic gateway.

Because blue-green release switches traffic as a whole, an environment must be cloned for the new version at the same machine scale as the original service, which requires twice the original machine resources. The core idea of grayscale release is to forward a small part of online traffic to the new version, based on either the request content or a proportion of the request traffic. After grayscale verification passes, the share of traffic sent to the new version is gradually increased, which makes it a progressive release method. The schematic diagram of our example using grayscale release is as follows. Content-based or proportion-based traffic control requires a Layer 7 microservice gateway.

Here, Traffic Routing is a content-based grayscale method; for example, traffic whose request carries the header tag=gray is routed to the v2 version of the application. Traffic Shifting is a proportion-based grayscale method, which splits online traffic by ratio without regard to request content. Compared with blue-green release, grayscale release is better in terms of machine resource cost and traffic control capability, but its disadvantages are a longer release cycle and higher requirements on the operation and maintenance infrastructure.
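The difference between the two grayscale methods can be illustrated with a minimal Python sketch. It is not the gateway's actual implementation: the function names and the version names v1 and v2 are assumptions made for illustration, and only the header tag=gray comes from the example above.

```python
import random


def route_by_content(headers):
    """Traffic Routing: choose the version from the request content."""
    # Requests that carry the header tag=gray go to v2, everything else to v1.
    return "v2" if headers.get("tag") == "gray" else "v1"


def route_by_weight(gray_percent, rng=random):
    """Traffic Shifting: send roughly gray_percent percent of requests to v2."""
    # rng.random() is uniform in [0, 1), so 0 never selects v2 and 100 always does.
    return "v2" if rng.random() * 100 < gray_percent else "v1"


print(route_by_content({"tag": "gray"}))  # v2
print(route_by_content({}))               # v1
```

Content-based routing is deterministic for a given request, which suits targeted verification by specific users; proportion-based shifting only controls the overall share of traffic, which suits a gradual ramp-up.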

Service publishing under the microservice architecture

In a distributed microservice architecture, the sub-services split out of the application are deployed, run, and iterated independently. When a new version of a single service is launched, we no longer need to release the entire application; we only need to pay attention to the release process of each microservice itself, as follows:

To verify the new version of the Cart service, traffic must be selectively routed to the grayscale version of Cart somewhere along the call link. This is a traffic governance problem in the field of microservice governance. Common governance strategies include Provider-based and Consumer-based approaches.

Provider-based governance strategy. Configure Cart's traffic inflow rules, and use Cart's traffic inflow rules when User routes to Cart.
Consumer-based governance strategy. Configure User's traffic outflow rules, and User's traffic outflow rules will be used when User is routed to Cart.

In addition, when using these governance strategies, you can combine the blue-green release and gray release schemes described above to implement true service-level version release.

Full link grayscale

Continuing with the scenario of releasing the Cart service in the microservice system above, suppose the Order service also needs to release a new version, because the new feature involves changes to both Cart and Order. Grayscale verification then requires the grayscale traffic to pass through the grayscale versions of both Cart and Order, as shown below:

With the two governance strategies from the previous section, we would need to additionally configure governance rules for the Order service to ensure that traffic from the grayscale Cart is forwarded to the grayscale version of Order. This seems reasonable, but in real business scenarios the scale and number of microservices far exceed our example. A single request link may pass through dozens of microservices, and a new feature release may change several microservices at once. With intricate dependencies between business services, frequent releases, and parallel development of multiple service versions, traffic governance rules keep growing, which undermines the maintainability and stability of the whole system.

To address these problems, developers drew on real business scenarios and production experience to propose an end-to-end grayscale release scheme: full-link grayscale. The full-link grayscale governance strategy focuses on the entire call chain and does not care which specific microservices the link passes through; the perspective of traffic control shifts from the service to the request link. Only a small number of governance rules are needed to build multiple traffic-isolated environments from the gateway through the entire back-end service chain, which effectively ensures the smooth and safe release of multiple closely related services and the parallel development of multiple service versions, further speeding up business development.

Full-link gray-scale solution

How to quickly implement full-link grayscale in actual business scenarios? At present, there are two main solutions: physical environment isolation and logical environment isolation.

Physical environment isolation

Physical environment isolation, as the name implies, builds true flow isolation by adding machines.

This scheme requires building a network-isolated, resource-independent environment for the services to be grayscaled, and deploying the grayscale versions of the services in it. Because this environment is isolated from the production environment, other services in the production environment cannot access the grayscale services, so those online services must also be redundantly deployed in the grayscale environment for the whole call link to forward traffic normally. In addition, dependent middleware such as the registry must also be redundantly deployed in the grayscale environment, so that microservices can only see each other within the same environment and the node IP addresses they obtain belong only to the current network environment.

This scheme is generally used to build enterprise testing and pre-release environments; it is not flexible enough for online grayscale release and traffic diversion. Moreover, multiple coexisting versions of a microservice are commonplace in a microservice architecture, and these scenarios would require piling up machines to maintain multiple grayscale environments. With many applications, the operation, maintenance, and machine costs become excessive and far outweigh the benefits; with only two or three applications, this method is still convenient and acceptable.

Logical environment isolation

The other scheme is to build logical environment isolation. We only need to deploy the grayscale versions of the services. As traffic flows along the call link, the gateway, middleware, and microservices it passes through identify grayscale traffic and dynamically forward it to the grayscale version of the corresponding service, as shown below:

The figure above shows the effect of this scheme. Different colors represent grayscale traffic of different versions. Both the microservice gateway and the microservices themselves need to identify the traffic and make dynamic decisions based on governance rules. When a service version changes, the forwarding on the call link changes in real time. Compared with a grayscale environment built from dedicated machines, this scheme not only saves a great deal of machine cost and operation and maintenance effort, but also helps developers exercise fine-grained, full-link control over online traffic quickly and in real time.

So how is full-link grayscale implemented? From the discussion above, we need to solve the following problems:

Every component and service on the link must be able to route dynamically according to the characteristics of the request traffic
All nodes under a service must be grouped so that versions can be distinguished
Traffic must carry a grayscale identifier and a version identifier
Grayscale traffic of different versions must be identifiable

Next, we will introduce the technologies needed to solve the above problems.

Label routing

Label routing groups all nodes under a service by label name and label value, so that a service consumer that subscribes to the service's node information can access a particular group of the service on demand, that is, a subset of all nodes. The service consumer can use any label information on the service provider's nodes; depending on the meaning of the selected label, label routing can be applied to many more business scenarios.
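As a minimal sketch of this grouping idea, the Python below groups a node list by one label and selects the subset that matches a set of label values. The node structure, the IP addresses, and the zone label are hypothetical and are not taken from any particular registry.

```python
from collections import defaultdict

# Hypothetical node list, shaped like what a consumer might receive from a registry.
NODES = [
    {"ip": "10.0.0.1", "labels": {"version": "base", "zone": "a"}},
    {"ip": "10.0.0.2", "labels": {"version": "base", "zone": "b"}},
    {"ip": "10.0.0.3", "labels": {"version": "gray", "zone": "a"}},
]


def group_by_label(nodes, label):
    """Group node IPs by the value of one label name."""
    groups = defaultdict(list)
    for node in nodes:
        groups[node["labels"].get(label)].append(node["ip"])
    return dict(groups)


def select_subset(nodes, **wanted):
    """Return the IPs of nodes whose labels match every requested name/value pair."""
    return [
        node["ip"]
        for node in nodes
        if all(node["labels"].get(name) == value for name, value in wanted.items())
    ]


print(group_by_label(NODES, "version"))
print(select_subset(NODES, version="gray"))
```

Because the selection works on arbitrary label names, the same mechanism could serve version-based grayscale as well as other groupings, such as the hypothetical zone label used here.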

Node marking

So how to add different labels to the service node? Driven by today's hot cloud-native technology, most businesses are actively embarking on a journey of containerization. Here, I will take a containerized application as an example to introduce how to mark the service workload in the two scenarios of using Kubernetes Service as service discovery and using the more popular Nacos registry.

In a business system that uses Kubernetes Service for service discovery, the service provider exposes itself by submitting a Service resource to the API server. The service consumer watches the Endpoints resource associated with that Service, obtains the associated business Pods from it, and reads their Labels as the nodes' metadata. Therefore, we only need to add labels to the Pod template in the Deployment that describes the business application.

In a business system that uses Nacos for service discovery, the marking method generally depends on the microservice framework the business uses. If a Java application uses the Spring Cloud framework, we can add the corresponding environment variable to the business container to add the label. For example, to add the label version=gray to a node, add the environment variable spring.cloud.nacos.discovery.metadata.version=gray to the business container; when the framework registers the node with Nacos, it adds the label version=gray.

Traffic coloring

How do the components on the request link recognize different grayscale traffic? The answer is traffic coloring: adding different grayscale identifiers to request traffic so that it can be told apart. We can color traffic at the source of the request, with the front end marking the traffic according to user or platform information when it initiates the request. If the front end cannot do this, the microservice gateway can dynamically add a traffic identifier to requests that match specific routing rules. In addition, when traffic passes through a grayscale node on the link and the request carries no grayscale identifier, it needs to be colored automatically, so that it preferentially accesses the grayscale versions of services for the rest of the link.
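The two coloring points described above, the gateway and the grayscale node, can be sketched in Python as follows. This is only an illustration under stated assumptions: the header name x-gray-tag, the rule format, and both function names are hypothetical and do not correspond to any specific gateway or framework.

```python
GRAY_HEADER = "x-gray-tag"  # hypothetical header name used as the grayscale identifier


def color_at_gateway(path, headers, rules):
    """Gateway side: tag requests that match a routing rule and are not yet colored."""
    colored = dict(headers)
    if GRAY_HEADER in colored:
        return colored  # already colored at the source, leave it untouched
    for rule in rules:
        headers_match = all(
            headers.get(name) == value
            for name, value in rule.get("headers", {}).items()
        )
        if path.startswith(rule["path_prefix"]) and headers_match:
            colored[GRAY_HEADER] = rule["tag"]
            break
    return colored


def auto_color(headers, node_tag):
    """Grayscale node side: color uncolored traffic with the node's own tag."""
    colored = dict(headers)
    if node_tag and GRAY_HEADER not in colored:
        colored[GRAY_HEADER] = node_tag
    return colored


# Hypothetical rule: requests under /cart from the "beta" user group are colored gray.
RULES = [{"path_prefix": "/cart", "headers": {"x-user-group": "beta"}, "tag": "gray"}]
print(color_at_gateway("/cart/add", {"x-user-group": "beta"}, RULES))
```

In this sketch neither function overwrites an existing identifier, so coloring done at the request source takes precedence over the gateway and over automatic coloring.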

Distributed link tracking

Another very important question is how to ensure that the grayscale identifier is passed along the link. If the request is colored at the source, the gateway, acting as a proxy, forwards it unchanged to the entry service, unless the developer has configured a request-modification policy in the gateway's routing rules. The entry service then calls the next microservice, and the business code builds a new request for that call. So how do we add the grayscale identifier to this new request so that it continues to be passed down the link?

In the evolution from monolithic architecture to distributed microservice architecture, calls between services changed from method calls within the same thread to a service in a local process calling a service in a remote process. Remote services may be deployed with multiple replicas, so the nodes a request passes through are unpredictable, and every hop may fail because of a network or service failure. Distributed link tracing records in detail the call links of requests in large-scale distributed systems. Its core idea is to record the nodes a request passes through and the time it takes by means of a globally unique traceid and a spanid for each hop; the traceid must be passed along the entire link.

With the help of distributed link tracking ideas, we can also pass some custom information, such as gray-scale identification. The common distributed link tracking products in the industry all support the transmission of user-defined data on the link. The data processing flow is shown in the following figure:
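The pass-through idea can be sketched in Python with the standard contextvars module. This is a simplified illustration, not how any particular tracing product works: the header name x-gray-tag and the functions handle_inbound, build_outbound_headers, and cart_logic are hypothetical.

```python
import contextvars

GRAY_HEADER = "x-gray-tag"  # hypothetical header name
_current_tag = contextvars.ContextVar("gray_tag", default=None)


def handle_inbound(headers, business_logic):
    """Server-side hook: keep the incoming tag in context while the request is handled."""
    token = _current_tag.set(headers.get(GRAY_HEADER))
    try:
        return business_logic()
    finally:
        _current_tag.reset(token)  # do not leak the tag into the next request


def build_outbound_headers(extra=None):
    """Client-side hook: copy the stored tag onto every new downstream request."""
    headers = dict(extra or {})
    tag = _current_tag.get()
    if tag is not None:
        headers[GRAY_HEADER] = tag
    return headers


def cart_logic():
    # Business code builds a brand-new request and never touches the tag itself.
    return build_outbound_headers({"content-type": "application/json"})


print(handle_inbound({GRAY_HEADER: "gray"}, cart_logic))
```

The business function never reads or writes the identifier; the two hooks around it carry the identifier from the inbound request to the outbound one, which is the same role the traceid plays in link tracing.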

Logical environment isolation-based on SDK

The sections above introduced the technologies needed for full-link grayscale. To add full-link grayscale capability to existing services, the development framework SDK used by the business inevitably has to be modified. First, it must support dynamic routing. For the Spring Cloud and Dubbo frameworks, a custom Filter can be implemented for egress traffic, and traffic identification and label routing can be completed in that Filter. At the same time, distributed link tracing is needed to pass the traffic identifier along the link and to color traffic automatically. In addition, a centralized traffic governance platform is needed so that developers in each business line can define their own full-link grayscale rules. The diagram of the SDK-based implementation is as follows:
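What such an egress Filter does on each hop can be sketched in Python. Real Spring Cloud or Dubbo filters are written in Java against those frameworks' own interfaces, so this is only a language-neutral illustration; the header name, the version name base, and the fallback to the baseline version when no matching node exists are assumptions.

```python
import random

GRAY_HEADER = "x-gray-tag"  # hypothetical header name
BASE_VERSION = "base"       # assumed label value of the baseline version


def egress_filter(inbound_headers, instances, rng=random):
    """Pick a target instance for a downstream call and build the outbound headers.

    instances is a list of (address, version) pairs, as a registry might return them.
    """
    tag = inbound_headers.get(GRAY_HEADER)
    wanted = tag or BASE_VERSION
    candidates = [addr for addr, version in instances if version == wanted]
    if not candidates:
        # Assumed fallback: no node of the requested version, so use the baseline.
        candidates = [addr for addr, version in instances if version == BASE_VERSION]
    if not candidates:
        raise LookupError("no available instance")
    # Forward the identifier so later hops can still prefer the grayscale version.
    outbound_headers = {GRAY_HEADER: tag} if tag else {}
    return rng.choice(candidates), outbound_headers


INSTANCES = [("10.0.0.1:8080", "base"), ("10.0.0.2:8080", "gray")]
print(egress_filter({GRAY_HEADER: "gray"}, INSTANCES))
```

Keeping the identifier on the outbound request even after a fallback means a later service that does have a grayscale version can still receive the grayscale traffic.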

Logical environment isolation-based on Java Agent

The disadvantage of the SDK-based approach is that it requires the business to upgrade the SDK version, which may even involve changes to business code. Even if all microservices within the enterprise use the same development framework, it is hard to guarantee that the framework versions are consistent, so a copy of the full-link grayscale code has to be maintained for each version. The business code is tightly coupled to the SDK code, and SDK version iterations trigger unnecessary business releases, which is intrusive to the business.

Another popular approach is to extend the development framework through bytecode enhancement, which a Java Agent applies when classes are loaded rather than at compile time. This scheme is transparent to the business and introduces full-link grayscale capability in a non-intrusive manner. The diagram of the Java Agent-based implementation is as follows:

However, developers still have to maintain a corresponding Java Agent version for each development framework version in use. If you prefer this non-intrusive scheme but do not want to maintain it yourself, you can choose Alibaba Cloud's MSE service governance product, a non-intrusive, enterprise production-grade service governance product based on Java Agent. Without modifying a single line of business code, you gain governance capabilities that include but are not limited to full-link grayscale, with support for all Spring Boot, Spring Cloud, and Dubbo versions from the past five years.

Logical environment isolation-based on Service Mesh

If the microservices in the business system use many different technology stacks and languages, the Java Agent approach cannot help. We might need to write and maintain full-link grayscale code in the SDK for each language. Not only would we need developers for each language stack, but a language-independent bug fix would also require upgrading the SDKs for all languages together. This cost is no smaller than that of the physical environment isolation scheme.

Is there a language-independent scheme? Yes: Service Mesh, the next-generation microservice architecture. It abstracts the communication between distributed services into a separate layer, which implements the functions distributed systems need, such as load balancing, service discovery, authentication and authorization, monitoring and tracing, and traffic control. Clearly, the full-link grayscale capability we need can also be implemented in this traffic governance infrastructure layer. Fortunately, Istio, the star product among service meshes, provides a unified abstraction of traffic governance as declarative API resources. With VirtualService and DestinationRule governance rules, full-link grayscale can be implemented easily, and Istio integrates with the mainstream distributed tracing frameworks. The diagram of the Service Mesh-based implementation is as follows:

In a real production environment, parallel development of multiple service versions is very common, and versions iterate quickly. Every version change requires modifying the route matching rules in the VirtualService resource. In addition, the VirtualService resource provides no disaster tolerance. For example, suppose a routing rule sends traffic to a grayscale version of a service provider. If that grayscale version does not exist or is unavailable, then under the current Istio implementation the traffic is still forwarded to it. There is another business scenario: forwarding user traffic within a certain UID range to a designated grayscale environment cannot be achieved with Istio's existing traffic management rules. In these cases, you can choose ASM, Alibaba Cloud's service mesh product, a managed platform that is compatible with Istio and manages microservice application traffic in a unified way. ASM has solutions for the two scenarios above and addresses full-link grayscale needs in multi-language scenarios.

Comparison of the three approaches

The following table is a comparison of the three methods, which are compared from many aspects.

If you prefer the non-intrusive Java Agent approach but are worried about the stability of a self-built solution, you can choose the MSE microservice governance product. It is the product of Alibaba's many years of internal experience in microservice governance and has been tested in numerous major sales promotions.
If you prefer the language-independent, non-intrusive service mesh approach but are worried about the stability of a self-built solution, you can choose Alibaba Cloud ASM, which has been greatly improved in this respect.

Traffic entry: the gateway

In distributed applications, a gateway as the traffic entry point is indispensable. In the full-link grayscale scenario, the microservice gateway must have rich traffic governance capabilities, support multi-version routing of services, and support dynamically tagging requests that match specific routing rules. To make entry services visible, the gateway needs to support multiple service discovery methods. In terms of security, the gateway, as the external entry point of the cluster, can authenticate all request traffic to ensure that the business system is not accessed by unauthorized traffic.

Under the microservice architecture of the virtualization era, businesses usually adopt a two-tier architecture of traffic gateway + microservice gateway. The traffic gateway handles north-south traffic scheduling and security protection, and the microservice gateway handles east-west traffic scheduling and service governance. In the cloud-native era dominated by containers and K8s, Ingress has become the gateway standard of the K8s ecosystem, giving the gateway a new mission and making it possible to merge the traffic gateway and the microservice gateway into one. The cloud native gateway released by Alibaba Cloud MSE turns the two gateway layers into one without compromising capability, which not only saves 50% of resource costs but also reduces operation, maintenance, and usage costs. Most importantly, the cloud native gateway can work together with back-end microservice governance to achieve end-to-end full-link grayscale.

Practice full-link grayscale from 0 to 1

By now, most readers should have a general understanding of full-link grayscale and of several solutions and their implementation details. Next, using the MSE cloud native gateway and MSE service governance products mentioned above, we will practice full-link grayscale from 0 to 1. This will deepen our understanding of full-link grayscale and also show how Alibaba Cloud delivers Alibaba's internal best practices as cloud products.

We assume the application consists of the MSE cloud native gateway and a back-end microservice architecture (Spring Cloud). The back-end call link has three hops: shopping cart (A), transaction center (B), and inventory center (C). Clients or H5 pages access the back-end services, which discover each other through the Nacos registry. We now want to use the full-link grayscale capability to build a grayscale environment in which service A and service C can undergo grayscale verification together.
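Before deploying anything, the path that grayscale traffic is expected to take through this demo can be worked out with a small Python simulation. It assumes the topology deployed in this walkthrough (grayscale versions for A and C only) and assumes that a hop without a matching grayscale version falls back to the baseline; the names REGISTRY, CALL_CHAIN, and trace_path are hypothetical.

```python
# Versions available per service in the demo: B has only a baseline version.
REGISTRY = {
    "A": ["base", "gray"],
    "B": ["base"],
    "C": ["base", "gray"],
}
CALL_CHAIN = ["A", "B", "C"]  # shopping cart -> transaction center -> inventory center


def trace_path(tag, registry=REGISTRY, chain=CALL_CHAIN):
    """Return the version chosen at each hop for a request carrying the given tag."""
    path = []
    for service in chain:
        versions = registry[service]
        # Prefer the version named by the tag; otherwise fall back to the baseline.
        version = tag if tag in versions else "base"
        path.append(f"{service}:{version}")
    return path


print(trace_path("gray"))  # ['A:gray', 'B:base', 'C:gray']
print(trace_path(None))    # ['A:base', 'B:base', 'C:base']
```

The first path only works if the identifier survives the baseline hop at B, which is why passing it along the whole link matters even for services that have no grayscale version.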

Prerequisites

List of must-have resources

Already have an MSE cloud native gateway
Already have an MSE Nacos registration center
Already have an ACK operation and maintenance cluster
MSE Microservice Governance Professional Edition has been opened

Deploy the demo application

Save the following content to a file named ingress-gray.yaml and run kubectl apply -f ingress-gray.yaml to deploy the applications. We deploy three applications, A, B, and C: A and C each have a baseline version and a grayscale version, while B has only a baseline version.

There are the following points to note:

The full-link grayscale capability is independent of the registration center. This example uses MSE Nacos as the registration center, so you need to replace the value of spring.cloud.nacos.discovery.server-addr with the address of your own Nacos registration center
For services connected to the cloud native gateway, grayscale release requires adding a version label to the metadata when the service is registered. In our example, service A needs to be exposed to the gateway, so add spring.cloud.nacos.discovery.metadata.version=base to the baseline version and spring.cloud.nacos.discovery.metadata.version=gray to the grayscale version.

# Application A, base version
---
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: spring-cloud-a
  name: spring-cloud-a
spec:
  replicas: 2
  selector:
    matchLabels:
      app: spring-cloud-a
  template:
    metadata:
      annotations:
        msePilotCreateAppName: spring-cloud-a
      labels:
        app: spring-cloud-a
    spec:
      containers:
      - env:
        - name: LANG
          value: C.UTF-8
        - name: JAVA_HOME
          value: /usr/lib/jvm/java-1.8-openjdk/jre
        - name: spring.cloud.nacos.discovery.server-addr
          value: mse-455e0c20-nacos-ans.mse.aliyuncs.com:8848
        - name: spring.cloud.nacos.discovery.metadata.version
          value: base
        image: registry.cn-shanghai.aliyuncs.com/yizhan/spring-cloud-a:0.1-SNAPSHOT
        imagePullPolicy: Always
        name: spring-cloud-a
        ports:
        - containerPort: 20001
          protocol: TCP
        resources:
          requests:
            cpu: 250m
            memory: 512Mi
      
# Application A, gray version
---            
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: spring-cloud-a-new
  name: spring-cloud-a-new
spec:
  replicas: 2
  selector:
    matchLabels:
      app: spring-cloud-a-new
  strategy:
  template:
    metadata:
      annotations:
        alicloud.service.tag: gray
        msePilotCreateAppName: spring-cloud-a
      labels:
        app: spring-cloud-a-new
    spec:
      containers:
      - env:
        - name: LANG
          value: C.UTF-8
        - name: JAVA_HOME
          value: /usr/lib/jvm/java-1.8-openjdk/jre
        - name: profiler.micro.service.tag.trace.enable
          value: "true"
        - name: spring.cloud.nacos.discovery.server-addr
          value: mse-455e0c20-nacos-ans.mse.aliyuncs.com:8848
        - name: spring.cloud.nacos.discovery.metadata.version
          value: gray
        image: registry.cn-shanghai.aliyuncs.com/yizhan/spring-cloud-a:0.1-SNAPSHOT
        imagePullPolicy: Always
        name: spring-cloud-a-new
        ports:
        - containerPort: 20001
          protocol: TCP
        resources:
          requests:
            cpu: 250m
            memory: 512Mi
            
# Application B, base version
---
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: spring-cloud-b
  name: spring-cloud-b
spec:
  replicas: 2
  selector:
    matchLabels:
      app: spring-cloud-b
  strategy:
  template:
    metadata:
      annotations:
        msePilotCreateAppName: spring-cloud-b
      labels:
        app: spring-cloud-b
    spec:
      containers:
      - env:
        - name: LANG
          value: C.UTF-8
        - name: JAVA_HOME
          value: /usr/lib/jvm/java-1.8-openjdk/jre
        - name: spring.cloud.nacos.discovery.server-addr
          value: mse-455e0c20-nacos-ans.mse.aliyuncs.com:8848
        image: registry.cn-shanghai.aliyuncs.com/yizhan/spring-cloud-b:0.2-demo-SNAPSHOT 
        imagePullPolicy: Always
        name: spring-cloud-b
        ports:
        - containerPort: 8080
          protocol: TCP
        resources:
          requests:
            cpu: 250m
            memory: 512Mi
            
# Application C, base version
---
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: spring-cloud-c
  name: spring-cloud-c
spec:
  replicas: 2
  selector:
    matchLabels:
      app: spring-cloud-c
  template:
    metadata:
      annotations:
        msePilotCreateAppName: spring-cloud-c
      labels:
        app: spring-cloud-c
    spec:
      containers:
      - env:
        - name: LANG
          value: C.UTF-8
        - name: JAVA_HOME
          value: /usr/lib/jvm/java-1.8-openjdk/jre
        - name: spring.cloud.nacos.discovery.server-addr
          value: mse-455e0c20-nacos-ans.mse.aliyuncs.com:8848
        image: registry.cn-shanghai.aliyuncs.com/yizhan/spring-cloud-c:0.2-demo-SNAPSHOT
        imagePullPolicy: Always
        name: spring-cloud-c
        ports:
        - containerPort: 8080
          protocol: TCP
        resources:
          requests:
            cpu: 250m
            memory: 512Mi
            
# Application C, gray version
---
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: spring-cloud-c-new
  name: spring-cloud-c-new
spec:
  replicas: 2
  selector:
    matchLabels:
      app: spring-cloud-c-new
  template:
    metadata:
      annotations:
        alicloud.service.tag: gray
        msePilotCreateAppName: spring-cloud-c
      labels:
        app: spring-cloud-c-new
    spec:
      containers:
      - env:
        - name: LANG
          value: C.UTF-8
        - name: JAVA_HOME
          value: /usr/lib/jvm/java-1.8-openjdk/jre
        - name: spring.cloud.nacos.discovery.server-addr
          value: mse-455e0c20-nacos-ans.mse.aliyuncs.com:8848
        image: registry.cn-shanghai.aliyuncs.com/yizhan/spring-cloud-c:0.2-demo-SNAPSHOT
        imagePullPolicy: Always
        name: spring-cloud-c-new
        ports:
        - containerPort: 8080
          protocol: TCP
        resources:
          requests:
            cpu: 250m
            memory: 512Mi

Complete the preliminary configuration of the cloud native gateway

The first step is to add a Nacos service source to the cloud native gateway. Go to Service Management > Source Management and click Create Source.

Select the MSE Nacos service source, select the Nacos registration center to be associated, and click OK.

The second step is to import the services that will be exposed externally through the cloud native gateway. Go to Service Management > Service List and click Create Service.

Select MSE Nacos as the service source and select the service sc-A.

Click Policy Configuration for service A and create multiple versions of this entry service. Versions are distinguished by the version metadata carried at service registration (this can be any tag value that distinguishes service versions, depending on the metadata the service registers with). Create the following two versions: base and gray.
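If a service registers its own version metadata instead of relying on the Java Agent used in this walkthrough, one way to do so in a Spring Cloud Alibaba application is through the Nacos discovery metadata map. The fragment below is only a sketch: the metadata key version and the value gray are assumptions chosen to match the two versions created above, and the server address is the one used in the Deployments of this walkthrough.

```yaml
# application.yml (sketch; assumes Spring Cloud Alibaba Nacos discovery)
spring:
  cloud:
    nacos:
      discovery:
        server-addr: mse-455e0c20-nacos-ans.mse.aliyuncs.com:8848
        metadata:
          version: gray   # assumed tag value for the gray instances
```

Any key works as long as the gateway's version split is configured against the same metadata field.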

Routing configuration

Create a routing rule for the baseline environment: associate the domain name base.example.com and route to the base version of service sc-A.

Create a routing rule for the gray environment. The associated domain name is the same as in the baseline environment, but this rule additionally matches on a request header, and its target is the gray version of service sc-A.

At this point, we have two routing rules: one for the baseline environment and one for the gray environment.

Now a request to base.example.com is routed to the baseline environment:
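For readers who prefer configuration to console screens, the two rules can be sketched in Istio VirtualService notation. This is purely illustrative and is not the format used by the MSE console: the resource name, the gateway name, and the destination host are hypothetical, and the subsets base and gray are assumed to be defined in a DestinationRule that is not shown.

```yaml
# Illustrative only: Istio-style equivalent of the two gateway routing rules.
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: sc-a-routes                    # hypothetical name
spec:
  hosts:
  - base.example.com
  gateways:
  - example-gateway                    # hypothetical gateway
  http:
  # Gray rule: same domain, plus a match on the request header.
  - match:
    - headers:
        x-mse-tag:
          exact: gray
    route:
    - destination:
        host: sc-a.example.internal    # hypothetical host for service sc-A
        subset: gray
  # Baseline rule: everything else on this domain.
  - route:
    - destination:
        host: sc-a.example.internal    # hypothetical host for service sc-A
        subset: base
```

In this notation, rules are evaluated in order, so the header-matching gray rule is listed before the catch-all baseline rule.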

curl -H "Host: base.example.com" http://118.31.118.69/a
A[172.21.240.105] -> B[172.21.240.106] -> C[172.21.240.46]

How do we access the grayscale environment? Just add the request header x-mse-tag: gray to the request.

curl -H "Host: base.example.com" -H "x-mse-tag: gray" http://118.31.118.69/a
Agray[172.21.240.44] -> B[172.21.240.146] -> Cgray[172.21.240.147]

It can be seen that the cloud native gateway routes the grayscale traffic to the grayscale versions of A and C. Since B does not have a specified grayscale version, the traffic automatically falls back to the baseline version.

Analysis

As shown above, we only need to enable MSE Microservices Governance Professional Edition, configure routing rules for the entry service on the cloud native gateway, and tag (color) the ingress traffic as gray to meet our full-link grayscale requirements for A and C. Another important point is that the business must mark its own nodes and enable tag pass-through on the entry service. Adding the alicloud.service.tag key-value pair to the annotations of the Pod template marks the node; the Java Agent then automatically adds this metadata for the node when the service registers with the registry. In addition, the environment variable profiler.micro.service.tag.trace.enable=true must be added to the business container of the entry service to enable pass-through of the grayscale tag along the call chain. By default, the MSE service governance component uses x-mse-tag to identify traffic and passes it through the entire call chain.

In addition, you can use other custom or existing business fields to tag traffic. For detailed steps and scenarios, see the documentation: https://help.aliyun.com/document_detail/359851.html
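The two settings described above live in different parts of the Pod template, which the following fragment tries to make explicit. It is a sketch rather than a complete manifest: the annotation key and the environment variable name are taken from the text, while the application name spring-cloud-a is assumed by analogy with the C Deployments shown earlier.

```yaml
# Fragment of the Deployment for the gray version of entry service A (sketch)
spec:
  template:
    metadata:
      annotations:
        alicloud.service.tag: gray                 # marks this node as gray
        msePilotCreateAppName: spring-cloud-a      # assumed app name
    spec:
      containers:
      - name: spring-cloud-a-new
        env:
        # Enables pass-through of the grayscale tag from the entry service
        - name: profiler.micro.service.tag.trace.enable
          value: "true"                            # env values must be strings
```

Downstream services such as C only need the annotation; according to the text, the pass-through variable is required on the entry service.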

Summary

This article starts from the challenges brought by the evolution from a monolithic architecture to a microservice architecture, analyzes service release under both architectures, and derives the full-link grayscale problem that is unique to distributed applications. To meet the business need for full-link grayscale, two solutions were introduced: physical environment isolation and logical environment isolation. The logical environment isolation solution was analyzed in detail, and the technical points involved were explained one by one. Three implementation approaches based on logical environment isolation were then proposed and briefly compared. Finally, the article showed how the Alibaba Cloud MSE cloud native gateway, MSE service governance, and Service Mesh ASM provide traffic management capabilities that include, but are not limited to, full-link grayscale.

Finally, readers interested in gateways and service governance can search for DingTalk group number 34754806, or scan the QR code below, to join the user group for discussion and Q&A.
