By Zhu Dejiang (alias: Rende), a technical expert at Ant Group who is responsible for the core development of MOSN and follows the evolution of cloud-native traffic gateways.

The following is adapted from a talk given at the SOFAStack fourth anniversary event.

MOSN 1.0 released

MOSN is a cloud-native network proxy platform developed mainly in Go. It was open sourced by Ant Group and has been proven in production on hundreds of thousands of containers during the Double 11 shopping festival.

After four years of rapid development and 27 minor releases, and thanks to the joint efforts of 11 committers, more than 100 contributors, and the whole community, MOSN 1.0 has been officially released.

MOSN 1.0 is a mature and stable release: built together with open source users, proven in commercial deployments, and backed by a community that embraces the cloud-native ecosystem.

In addition to being fully rolled out at Ant Group, MOSN is widely used in the industry, including a commercial deployment at Industrial and Commercial Bank of China (ICBC) and production use at Alibaba Cloud, Qunar, Shisuyun, and other companies.

With the 1.0 release, MOSN enters its adolescence and begins evolving toward the next-generation MoE architecture, setting out for the stars.

Development History


MOSN originated from Service Mesh. Calls between microservices used to go through a relatively heavy SDK. Whenever the SDK was upgraded, the application had to be upgraded with it, which was intrusive for application teams.

To solve this pain point, MOSN moved in the direction of stripping the logic out of the SDK. A sidecar running MOSN sits in the same Pod as the application, so the application only needs to talk to MOSN to complete the whole service invocation. With the SDK logic moved out, MOSN evolves as an independent component without disturbing the application. The benefits of this inside Ant have been very clear.

Two lessons stand out from this evolution. The more obvious one is decoupling: an independent sidecar is separated from business logic. The other is standardization: in the cloud-native era, the control plane and the data plane are split into two independent components. As a data plane component, MOSN has to interface with many control plane services, so standardization matters a great deal. Its value is less immediately visible than business decoupling, but the longer we work with it, the more we appreciate it.

MOSN is now fully rolled out inside Ant, running in hundreds of thousands of Pods with a peak QPS in the tens of millions.

Evolution of MOSN


2018: Development started;

2019: Coverage of the core links was completed for Double 11, and by the end of the year MOSN began to operate independently;

2020: Officially recommended by Istio in July. In the same year MOSN began exploring commercialization, and completed its rollout at Jiangxi Rural Credit by the end of the year;

2021: Integrated with the Envoy, Dapr, and WASM ecosystems and cooperated with mainstream communities. In December, the commercial rollout at ICBC was completed, setting a new benchmark in the industry.

Besides the full rollout inside Ant and the commercial deployments, the community has matured steadily. The MOSN community currently has 11 committers, more than 70% of whom are from outside Ant, and more than 100 contributors, and the project has gone through 28 iterations. MOSN also has many open source users who have deployed it in their own companies and contributed a great deal back.

Beyond the MOSN project itself, the community also contributes to other projects and communities, including Holmes, BabaSSL, and Proxy-Wasm, and works on integration with other ecosystem projects.

Overall, MOSN is now mature and stable, with commercial deployments, a community, surrounding projects, and an ecosystem in place, so we chose this moment to release MOSN 1.0.

1.0 Core Capabilities and Extended Ecosystem

A signature achievement of MOSN 1.0 is its integration with Istio 1.10.

As network proxy software, MOSN already supports TCP, UDP, and transparent traffic hijacking at the core forwarding layer. MOSN targets the east-west gateway scenario, where many internal, private, non-standard protocols are in use. In addition to the standard HTTP protocols, MOSN provides the XProtocol framework, which makes it simple to add support for private protocols. The built-in Bolt and Dubbo protocols are themselves implemented on top of XProtocol. MOSN also supports automatic identification of multiple protocols, a core capability that is particular to east-west traffic gateways.
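To make automatic protocol identification more concrete, here is a minimal sketch in Go. It is not MOSN's XProtocol API: the names `MatchResult`, `Matcher`, `prefixMatcher`, and `Identify` are hypothetical, and the registry holds only two illustrative entries. The idea is that each protocol registers a function that looks at the first bytes of a connection and answers "yes", "no", or "need more bytes".

```go
package main

import (
	"bytes"
	"fmt"
)

// MatchResult is a hypothetical tri-state result for protocol sniffing.
type MatchResult int

const (
	MatchFailed  MatchResult = iota // bytes cannot belong to this protocol
	MatchAgain                      // too few bytes to decide yet
	MatchSuccess                    // bytes identify this protocol
)

// Matcher inspects the first bytes read from a connection.
type Matcher func(data []byte) MatchResult

// prefixMatcher returns a Matcher that succeeds when data starts with prefix.
func prefixMatcher(prefix []byte) Matcher {
	return func(data []byte) MatchResult {
		if len(data) < len(prefix) {
			// A partial prefix may still turn into a full match later.
			if bytes.HasPrefix(prefix, data) {
				return MatchAgain
			}
			return MatchFailed
		}
		if bytes.HasPrefix(data, prefix) {
			return MatchSuccess
		}
		return MatchFailed
	}
}

// registered is an illustrative registry; the entries are assumptions.
var registered = []struct {
	name  string
	match Matcher
}{
	{"dubbo", prefixMatcher([]byte{0xda, 0xbb})}, // Dubbo frames start with magic 0xdabb
	{"http1-get", prefixMatcher([]byte("GET "))}, // only GET, to keep the sketch short
}

// Identify returns the name of the first protocol whose matcher succeeds.
// If none succeeds, it reports MatchAgain when more bytes could still decide.
func Identify(data []byte) (string, MatchResult) {
	result := MatchFailed
	for _, p := range registered {
		switch p.match(data) {
		case MatchSuccess:
			return p.name, MatchSuccess
		case MatchAgain:
			result = MatchAgain
		}
	}
	return "", result
}

func main() {
	name, res := Identify([]byte{0xda, 0xbb, 0xc2, 0x00})
	fmt.Println(name, res == MatchSuccess)
}
```

The three-state result matters because a proxy often sees only part of the first frame: returning `MatchAgain` lets the caller keep reading instead of guessing.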

Back-end management and load balancing are basic capabilities for any network proxy. MOSN supports connection pooling, active health checks, and a variety of load balancing strategies.
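As an illustration of how health checks and load balancing fit together, the sketch below shows a round-robin picker that skips hosts marked unhealthy. It is a simplified example, not MOSN's implementation: `Host`, `RoundRobin`, `SetHealthy`, and `Pick` are hypothetical names, and a real active health checker would call `SetHealthy` from a background probe.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// Host is a hypothetical back-end endpoint with a health flag.
type Host struct {
	Addr    string
	Healthy bool
}

// ErrNoHealthyHost is returned when every host is marked unhealthy.
var ErrNoHealthyHost = errors.New("no healthy host")

// RoundRobin cycles through hosts and skips unhealthy ones.
type RoundRobin struct {
	mu    sync.Mutex
	hosts []*Host
	next  int
}

// NewRoundRobin creates a balancer in which all hosts start healthy.
func NewRoundRobin(addrs ...string) *RoundRobin {
	rr := &RoundRobin{}
	for _, a := range addrs {
		rr.hosts = append(rr.hosts, &Host{Addr: a, Healthy: true})
	}
	return rr
}

// SetHealthy records the result of a health check for one host.
func (rr *RoundRobin) SetHealthy(addr string, healthy bool) {
	rr.mu.Lock()
	defer rr.mu.Unlock()
	for _, h := range rr.hosts {
		if h.Addr == addr {
			h.Healthy = healthy
		}
	}
}

// Pick returns the next healthy host, visiting each host at most once.
func (rr *RoundRobin) Pick() (*Host, error) {
	rr.mu.Lock()
	defer rr.mu.Unlock()
	for i := 0; i < len(rr.hosts); i++ {
		h := rr.hosts[rr.next]
		rr.next = (rr.next + 1) % len(rr.hosts)
		if h.Healthy {
			return h, nil
		}
	}
	return nil, ErrNoHealthyHost
}

func main() {
	rr := NewRoundRobin("10.0.0.1:80", "10.0.0.2:80", "10.0.0.3:80")
	rr.SetHealthy("10.0.0.2:80", false) // pretend a probe failed
	for i := 0; i < 3; i++ {
		h, err := rr.Pick()
		fmt.Println(h.Addr, err)
	}
}
```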

For core routing, MOSN supports domain-based VirtualHosts and introduces a powerful variable mechanism, which allows complex routing rules to be expressed through variables. It also supports metadata-based group routing, route-level timeout and retry configuration, and manipulation of request and response headers.
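The routing model described here can be sketched roughly as follows. This is an illustrative data model only, not MOSN's configuration schema: `VirtualHost`, `Route`, `Match`, and `demoVHosts` are hypothetical names, and variables are reduced to plain string equality. A virtual host is selected by domain, then the first route whose variable conditions all hold supplies the target cluster together with its timeout and retry settings.

```go
package main

import (
	"fmt"
	"strings"
	"time"
)

// Route is a hypothetical route: all Vars must match for it to apply.
type Route struct {
	Vars    map[string]string // variable name -> required value
	Cluster string            // upstream cluster to forward to
	Timeout time.Duration     // route-level timeout
	Retries int               // route-level retry count
}

// VirtualHost groups routes under a set of domains.
type VirtualHost struct {
	Domains []string
	Routes  []Route
}

// owns reports whether host matches "*", an exact domain, or a "*.suffix".
func (vh VirtualHost) owns(host string) bool {
	for _, d := range vh.Domains {
		switch {
		case d == "*", d == host:
			return true
		case strings.HasPrefix(d, "*.") && strings.HasSuffix(host, d[1:]):
			return true
		}
	}
	return false
}

// matches reports whether every required variable has the required value.
func (r Route) matches(vars map[string]string) bool {
	for name, want := range r.Vars {
		if vars[name] != want {
			return false
		}
	}
	return true
}

// Match picks the first virtual host that owns host, then its first matching route.
func Match(vhosts []VirtualHost, host string, vars map[string]string) (Route, bool) {
	for _, vh := range vhosts {
		if !vh.owns(host) {
			continue
		}
		for _, r := range vh.Routes {
			if r.matches(vars) {
				return r, true
			}
		}
		return Route{}, false
	}
	return Route{}, false
}

// demoVHosts builds a small example table; all values are made up.
func demoVHosts() []VirtualHost {
	return []VirtualHost{{
		Domains: []string{"*.example.com"},
		Routes: []Route{
			{Vars: map[string]string{"zone": "gray"}, Cluster: "orders-gray", Timeout: time.Second, Retries: 1},
			{Cluster: "orders", Timeout: 3 * time.Second, Retries: 2},
		},
	}}
}

func main() {
	r, ok := Match(demoVHosts(), "api.example.com", map[string]string{"zone": "gray"})
	fmt.Println(r.Cluster, r.Timeout, r.Retries, ok)
}
```

A route with no variable conditions acts as the catch-all, which is why it is listed last.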

In short, MOSN is fully equipped with the general core capabilities expected of a network proxy platform.

A network proxy usually also needs a great deal of extension. How far has MOSN's extension ecosystem come?

RPC protocols: Dubbo and SOFABolt, plus support for Chinese national cryptographic algorithms (guomi) based on BabaSSL;

Control plane: Istio;

Registration Center: SOFARegistry;

Observability: SkyWalking, plus Holmes for automatic analysis and diagnosis of abnormal resource usage in the Go runtime.

Gateway scenarios need a lot of custom logic. Besides writing filter extensions in Go in the usual way, MOSN supports the lightweight Go Plugin mode and can run Wasm extensions that follow the Proxy-Wasm standard. For service governance, it is integrated with Sentinel.
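The general shape of a Go filter extension can be sketched as a chain of small handlers that each inspect the request and decide whether processing continues. The interface below is deliberately simplified and hypothetical (`Filter`, `Status`, `RunChain`, and both example filters are made up for illustration); MOSN's real filter interfaces take more parameters.

```go
package main

import "fmt"

// Status tells the chain whether to keep going after a filter runs.
type Status int

const (
	Continue Status = iota // pass the request to the next filter
	Stop                   // stop processing, e.g. the request was rejected
)

// Filter is a hypothetical, simplified request filter.
type Filter interface {
	OnReceive(headers map[string]string) Status
}

// tokenFilter rejects requests whose "authorization" header is wrong.
type tokenFilter struct{ token string }

func (f tokenFilter) OnReceive(h map[string]string) Status {
	if h["authorization"] != f.token {
		return Stop
	}
	return Continue
}

// tagFilter adds a header so later stages can see the request was processed.
type tagFilter struct{}

func (tagFilter) OnReceive(h map[string]string) Status {
	h["x-handled-by"] = "sketch"
	return Continue
}

// RunChain runs filters in order and stops at the first one that returns Stop.
func RunChain(filters []Filter, headers map[string]string) Status {
	for _, f := range filters {
		if f.OnReceive(headers) == Stop {
			return Stop
		}
	}
	return Continue
}

func main() {
	chain := []Filter{tokenFilter{token: "secret"}, tagFilter{}}
	headers := map[string]string{"authorization": "secret"}
	fmt.Println(RunChain(chain, headers) == Continue, headers["x-handled-by"])
}
```

Because each filter only depends on the small interface, custom logic can be added or reordered without touching the forwarding core, which is the property that makes this extension style convenient for gateways.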

Istio 1.10


MOSN communicates with Istio through the standard xDS protocol. This is the standard way to integrate, and we actively take part in shaping the standard.

While the standard was being developed, we submitted proposals and joined discussions, for example on the rate limiting and routing protos. This close cooperation with Istio is also why MOSN received Istio's official recommendation.

MOSN originated from the east-west traffic scenario of Service Mesh. After four years of hard work, we are releasing MOSN 1.0 as a mature and stable version backed by commercial deployments, a community, and an ecosystem. We welcome more people to use MOSN and to build and grow with us.

New MoE Architecture

What was the original motivation?

What are the advantages?

What progress has been made in exploring the new MoE architecture?

First, Envoy and MOSN are two data planes currently on the market, each with its own characteristics. Envoy is written in C++ and offers relatively high processing performance. MOSN offers high R&D efficiency and has a good ecosystem.

MoE is MOSN plus Envoy. We want to combine the strengths of both, pairing high performance with high R&D efficiency, so that the platform can grow larger and go further.

From Envoy's point of view, MOSN in the MoE architecture is a plug-in extension of Envoy. It is therefore worth comparing it side by side with the other Envoy extension mechanisms.

1. Lua

As an embedded scripting language, Lua has the advantage of being simple to work with. But it is a relatively niche language, and its ecosystem is correspondingly weak. Our goal is to improve R&D efficiency, and Lua cannot get us there.

2. WASM is a more attractive solution

WASM is still at an early stage of development, and much of it remains a vision rather than reality. Support in many languages is still unfriendly, and runtime performance is not yet good enough. These are very real pain points.

3. External cross-process communication

External cross-process communication performs relatively poorly, nearly an order of magnitude worse than CGo. It also makes operations more complex, because the external processes have to be managed. Extensions written in different languages need separate processes, so the management cost is considerable.

In comparison, MoE has two advantages:

  • MOSN has many existing service governance capabilities, which can be reused to the greatest extent;
  • Go has a rich ecosystem. Many more extensions will need to be written as the platform evolves, and Go's high R&D efficiency can be put to use.

Looking at the architecture from MOSN's perspective, Envoy acts as MOSN's network runtime. A request first passes through the network runtime (Envoy), which hands the request information to MOSN across the CGo bridge. After MOSN finishes processing the request, it returns the response to the network layer.

The current MoE architecture has been implemented within Ant, and we have also obtained the expected benefits.

CGo is the bridge between MOSN and Envoy, so its performance largely determines the overall performance of MoE. CGo calls go in two directions, from C to Go and from Go to C, and the two are implemented differently. In the MoE architecture, calls mainly go from Envoy to MOSN, that is, from C to Go.

So far we have achieved an order-of-magnitude improvement, from 1600 nanoseconds to 140 nanoseconds per call, roughly an 11x reduction. (This was measured with the simplest possible local test, which essentially covers only the overhead of CGo itself and ignores any complex logic executed after entering Go.)

What does 140 nanoseconds mean in practice?

It is roughly the same order of magnitude as a call from Go to C in the current official implementation. (We have submitted our optimizations to the Go project. They have been through several rounds of review and are still waiting for review from other maintainers.)

Go is cross-platform, but the current implementation only supports x86-64. Corresponding implementations still need to be added for other architectures.

We have also identified further room for optimization in CGo. One example is an extra P mechanism, analogous to the existing extra M mechanism, to relieve contention for P resources under high load.

Another is passing parameters in registers. Today, parameters passed between C and Go are packed into a structure. Passing them in registers should give better performance.

Some MOSN filters can already run inside Envoy. This code is available in the open source repository, and you are welcome to try it out.

https://github.com/mosn/mosn/blob/master/pkg/networkextention/README-cn.md

MoE Open Source Initiative

We plan to contribute the abstract API to the Envoy project and then implement the Go extension on top of that standard API (expected to be completed around August). The MoE architecture as a whole will be open sourced in the second half of the year. Stay tuned.

Roadmap 2022

This year, in addition to continued iteration in the east-west direction, we will also move into the north-south gateway scenario.

We will provide an open source product in combination with Istio. This is a long-term plan, and it is also the direction we believe that cloud-native gateways may evolve in the future.

The 2022 Roadmap also covers core capabilities such as a modular structure and graceful exit (both already implemented in version 1.0). For the microservice ecosystem, we will connect to more registration centers and configuration centers. On the cloud-native side, Istio 1.10 has been integrated. Stability work has been added as a dedicated track: as MOSN gains more users, the demand for stability capabilities keeps growing.

We integrated Holmes, which was first used inside Ant, into open source MOSN. When resource usage becomes abnormal at runtime, it captures a snapshot of the scene for analysis. We have written about Holmes before; see the post below if you are interested.

https://mosn.io/blog/posts/mosn-holmes-design/

North-South gateway access

Beyond the capability plans in the Roadmap, a very important direction is north-south gateway access. MOSN will move to the MoE architecture after version 1.0. North-south access gateways carry heavier traffic, and the new architecture is what will support them.

For historical reasons, there are many gateway forms internally, such as Spanner and MobileGW, and each gateway and its data plane is implemented differently.

Our direction is for the data plane to use MOSN as the underlying support and for the control plane to interface with Istio through the standard xDS protocol. That way, both east-west and north-south traffic can be connected in a standard, cloud-native way.

Evolving a traditional north-south gateway architecture toward cloud native would likely be a difficult path. We prefer to evolve on MOSN with the new MoE architecture, which is more cloud native by design.

Why do you want to explore the MoE architecture?

MOSN does not limit itself to east-west traffic; the goal is unified network forwarding. In the cloud-native era, multi-cloud is a very real requirement, and it pushes all networks to connect in a standard way. This is an important reason for building north-south access gateways. We hope to unify all network forwarding in order to support more application scenarios.

Of course, the north-south gateway also faces many challenges. As a centralized gateway, it carries many configuration rules and has stricter stability and performance requirements. Our choice of Istio brings a scale challenge of its own, and there is also the cost of migrating from the old data plane.

To meet these challenges, we have optimized CGo to secure performance and are strengthening TLS capabilities. Envoy's current TLS capabilities are better suited to east-west gateways. To fit north-south gateways, we will add enhancements such as dynamic certificate issuance and support for multiple certificates on a single domain name. On stability, because Envoy uses a multi-threaded model, a process crash has a larger impact than in multi-process designs, so we will first improve recovery speed after a crash.

Our long-term goal is to build with the community and provide a complete open source product. We will collaborate on an open source basis to make the product bigger and stronger.

MOSN is currently presented as a data plane, whereas the product can include a complete solution. Our first goal is an out-of-the-box experience.

The second is dual-mode support. First, MOSN will support standard xDS, which is where we expect things to evolve. Second, in real deployments MOSN will not offer the xDS path alone: it will continue to support all existing registration centers and configuration centers, so both paths can run side by side during a rollout. This keeps the original high R&D efficiency and the convenient custom development capabilities.

Ultimately, we hope MOSN will become a unified network forwarding platform that supports east-west and north-south traffic as well as multi-cloud scenarios.

Once the data plane network is unified, MOSN will keep exploring both open source and commercialization, and will focus on more long-term work. I also hope more friends will join and build together with us.

Welcome to use MOSN and grow together

MOSN official website: https://mosn.io/

MOSN GitHub address: https://github.com/mosn

