By Zheng Zechao (GitHub ID: CodingSinger)

Senior Engineer at ByteDance

Passionate about microservices and the ServiceMesh open source community

This article is 6,802 words, about a 15-minute read

Part.1--Contributor Preface

It all started by coincidence. I first came across MOSN while contributing code to Dubbo-go, the open source project led by Yu Yu. After becoming a Committer in the Dubbo open source community, I wanted to learn Golang in more depth. By chance, I then met Lie Yuan, a senior member of the MOSN community, who brought me into it.

MOSN is a high-performance, extensible, secure network proxy positioned as a counterpart to Envoy. Its ecosystem support is closer to the technology stacks of domestic Internet companies, and it responds quickly to requests for new features. MOSN also contains many ingenious designs and advanced usage techniques worth learning from, which fully satisfied my wish to study Golang in depth outside of work.

So far, I have taken part in building several relatively large features in the community, including the EDF Scheduler, LAR, WRR load balancing, DSL routing, the UDS Listener, Plugin-mode Filter extensions, and the reverse channel. I would like to thank Yu Ge, Yuan Zong, Peng Zong, Yisong, and the other experts in the community for helping me work through designs and review code.

This article introduces the usage scenarios and design principles of the "reverse channel" feature recently merged into the master branch. Comments and discussion are welcome.

MOSN Project Overview

MOSN (Modular Open Smart Network) is a cloud-native network proxy platform developed mainly in Go. Open-sourced by Ant Group, it has been verified in production across hundreds of thousands of containers during the Double 11 promotion, and it is both high-performance and easy to extend. MOSN can be integrated with Istio to build a Service Mesh, and it can also serve as a standalone Layer 4 or Layer 7 load balancer, API Gateway, cloud-native Ingress, and so on.

Part.2--Reverse channel implementation of MOSN

In cloud-edge collaboration scenarios, the network is usually one-way: cloud nodes cannot actively initiate connections to edge nodes. This restriction goes a long way toward securing edge nodes, but its drawback is equally obvious: only edge nodes can initiate access to cloud nodes.

The cloud-edge tunnel is designed to solve the problem that the cloud cannot actively access edge nodes. In essence it is a reverse channel (hereinafter referred to as the reverse channel). The edge side actively initiates a connection to the cloud node to establish a dedicated full-duplex connection, which then carries request data from the cloud node and returns the final response.

Well-known open source cloud-edge collaboration frameworks such as SuperEdge and Yurttunnel currently build their cloud-edge communication on a reverse channel.

This article focuses on how the reverse channel operates on MOSN and the principles behind it. The overall architecture is as follows (the arrows in the figure indicate the reverse direction of TCP connection establishment):

[Figure: overall architecture of the MOSN reverse channel]

The entire operation process can be summarized as follows:

1. When the MOSN instance on the edge side (hereinafter referred to as the Tunnel Agent) starts, it also starts the Tunnel Agent service goroutines.

2. The Tunnel Agent obtains, through static configuration or dynamic service discovery, the list of addresses of the MOSN Servers on the public cloud side that need reverse connections (hereinafter referred to as the Tunnel Server), and establishes reverse connections to them.

3. The Frontend on the cloud side exchanges data with the forwarding port on the Tunnel Server, and that data is sent over the previously established reverse connection.

4. After the edge node receives the request, it forwards the request to the actual backend target node. The response returns along the same path.
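The four steps above can be illustrated with a minimal sketch that uses only the Go standard library. It is not MOSN code: `runEdgeAgent`, `requestOverTunnel`, and `demo` are hypothetical names, the line-based text exchange is a stand-in for real traffic, and upper-casing the request stands in for forwarding to a backend. The point is only that the edge side dials out while the cloud side merely accepts, and the request then travels from cloud to edge over that same connection.

```go
package main

import (
	"bufio"
	"fmt"
	"net"
	"strings"
)

// runEdgeAgent dials out to the cloud side and then serves requests that
// arrive over that same connection (a hypothetical stand-in for a Tunnel Agent).
func runEdgeAgent(serverAddr string) error {
	conn, err := net.Dial("tcp", serverAddr)
	if err != nil {
		return err
	}
	defer conn.Close()
	reader := bufio.NewReader(conn)
	for {
		line, err := reader.ReadString('\n')
		if err != nil {
			return nil // treat any read error as the tunnel being closed
		}
		// Stand-in for forwarding to the real backend: upper-case the request.
		reply := strings.ToUpper(strings.TrimSpace(line))
		if _, err := fmt.Fprintf(conn, "%s\n", reply); err != nil {
			return err
		}
	}
}

// requestOverTunnel sends one request from the cloud side over a connection
// that the edge side opened, then waits for the reply.
func requestOverTunnel(tunnel net.Conn, req string) (string, error) {
	if _, err := fmt.Fprintf(tunnel, "%s\n", req); err != nil {
		return "", err
	}
	resp, err := bufio.NewReader(tunnel).ReadString('\n')
	if err != nil {
		return "", err
	}
	return strings.TrimSpace(resp), nil
}

// demo wires both sides together on the loopback interface.
func demo(req string) (string, error) {
	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return "", err
	}
	defer ln.Close()

	go runEdgeAgent(ln.Addr().String()) // the edge side initiates the connection

	tunnel, err := ln.Accept() // the cloud side only accepts
	if err != nil {
		return "", err
	}
	defer tunnel.Close()
	return requestOverTunnel(tunnel, req)
}

func main() {
	resp, err := demo("ping")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(resp)
}
```

In MOSN the accepted connection is not used directly like this; as described later, it is registered as a host in a cluster so that normal routing can select it.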

Part.3--Reverse channel startup process

When MOSN starts, the Tunnel Agent is loaded and initialized through the ExtendConfig feature.

The AgentBootstrapConfig structure defined in ExtendConfig is as follows:

type AgentBootstrapConfig struct {
    Enable bool `json:"enable"`
    // The number of connections established between the agent and each server
    ConnectionNum int `json:"connection_num"`
    // The cluster of remote server
    Cluster string `json:"cluster"`
    // After the connection is established, the data transmission is processed by this listener
    HostingListener string `json:"hosting_listener"`
    // Static remote server list
    StaticServerList []string `json:"server_list"`

    // DynamicServerListConfig is used to specify dynamic server configuration
    DynamicServerListConfig struct {
        DynamicServerLister string `json:"dynamic_server_lister"`
    }

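    // ConnectRetryTimes is the maximum number of connection attempts; -1 means retry indefinitely
    ConnectRetryTimes int `json:"connect_retry_times"`
    // ReconnectBaseDurationMs is the initial wait between attempts, in milliseconds; it doubles after each failure
    ReconnectBaseDurationMs int `json:"reconnect_base_duration_ms"`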

    // ConnectTimeoutDurationMs specifies the timeout for establishing a connection and initializing the agent
    ConnectTimeoutDurationMs int    `json:"connect_timeout_duration_ms"`
    CredentialPolicy         string `json:"credential_policy"`
    // GracefulCloseMaxWaitDurationMs specifies the maximum waiting time to close conn gracefully
    GracefulCloseMaxWaitDurationMs int `json:"graceful_close_max_wait_duration_ms"`

    TLSContext *v2.TLSConfig `json:"tls_context"`
}

- ConnectionNum: The number of physical connections established between the Tunnel Agent and each Tunnel Server.

- HostingListener: The MOSN Listener that takes over the connection after the Agent establishes it. Requests sent by the Tunnel Server are handled by this Listener.

- DynamicServerListConfig: Service discovery configuration for dynamic Tunnel Servers. Addresses can be supplied dynamically through a custom service discovery component.

- CredentialPolicy: Custom connection-level authentication policy.

- TLSContext: MOSN TLS configuration, which provides confidentiality and reliability for communication over TCP.
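To make the field names concrete, here is a hedged sketch of what such a configuration might look like as JSON. Only the keys come from the struct tags above. The values, the `agentConfig` type (a trimmed copy of the struct), and the `parseAgentConfig` helper are hypothetical, and how this JSON is embedded in a real MOSN configuration file is not covered here.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// agentConfig mirrors a subset of the AgentBootstrapConfig fields shown above.
type agentConfig struct {
	Enable                   bool     `json:"enable"`
	ConnectionNum            int      `json:"connection_num"`
	Cluster                  string   `json:"cluster"`
	HostingListener          string   `json:"hosting_listener"`
	StaticServerList         []string `json:"server_list"`
	ConnectRetryTimes        int      `json:"connect_retry_times"`
	ReconnectBaseDurationMs  int      `json:"reconnect_base_duration_ms"`
	ConnectTimeoutDurationMs int      `json:"connect_timeout_duration_ms"`
}

// sampleConfig uses made-up values; only the keys come from the struct tags.
const sampleConfig = `{
  "enable": true,
  "connection_num": 2,
  "cluster": "tunnel_server_cluster",
  "hosting_listener": "biz_listener",
  "server_list": ["10.0.0.1:9999", "10.0.0.2:9999"],
  "connect_retry_times": -1,
  "reconnect_base_duration_ms": 100,
  "connect_timeout_duration_ms": 3000
}`

// parseAgentConfig decodes the JSON text into the trimmed config struct.
func parseAgentConfig(raw string) (agentConfig, error) {
	var c agentConfig
	err := json.Unmarshal([]byte(raw), &c)
	return c, err
}

func main() {
	c, err := parseAgentConfig(sampleConfig)
	if err != nil {
		fmt.Println("invalid config:", err)
		return
	}
	fmt.Printf("%d connections to each of %d servers, hosted by %q\n",
		c.ConnectionNum, len(c.StaticServerList), c.HostingListener)
}
```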

For each remote Tunnel Server instance, the Agent holds a corresponding AgentPeer object. In addition to actively establishing ConnectionNum reverse connections at startup, the AgentPeer establishes one extra bypass connection, which is mainly used to send control messages such as gracefully closing a connection or adjusting a connection's weight.

func (a *AgentPeer) Start() {
    connList := make([]*AgentClientConnection, 0, a.conf.ConnectionNumPerAddress)
    for i := 0; i < a.conf.ConnectionNumPerAddress; i++ {
        // Initialize and establish a reverse connection
        conn := NewAgentCoreConnection(*a.conf, a.listener)
        err := conn.initConnection()
        if err == nil {
            connList = append(connList, conn)
        }
    }
    a.connections = connList
    // Establish one bypass control connection
    a.initAside()
}

The initConnection method initializes a complete reverse connection. It retries with exponential backoff until the connection is established or the maximum number of retries is reached.

func (a *connection) initConnection() error {
    var err error
    backoffConnectDuration := a.reconnectBaseDuration

    for i := 0; i < a.connectRetryTimes || a.connectRetryTimes == -1; i++ {
        if a.close.Load() {
            return fmt.Errorf("connection closed, don't attempt to connect, address: %v", a.address)
        }
        // 1. Initialize the physical connection and send the reverse connection metadata
        err = a.init()
        if err == nil {
            break
        }
        log.DefaultLogger.Errorf("[agent] failed to connect remote server, try again after %v, address: %v, err: %+v", backoffConnectDuration, a.address, err)
        time.Sleep(backoffConnectDuration)
        backoffConnectDuration *= 2
    }
    if err != nil {
        return err
    }
    // 2. Hand the connection over to the hosting listener
    utils.GoWithRecover(func() {
        ch := make(chan api.Connection, 1)
        a.listener.GetListenerCallbacks().OnAccept(a.rawc, a.listener.UseOriginalDst(), nil, ch, a.readBuffer.Bytes(), []api.ConnectionEventListener{a})
    }, nil)
    return nil
}

The main steps of this method are:

1. The a.init() method calls initAgentCoreConnection to initialize the physical connection and complete the connection establishment handshake. The Tunnel Server manages the reverse connection using the metadata sent by the Agent. The interaction process and protocol are discussed in detail later.

2. Once the connection is established, the Tunnel Agent hands the raw conn over to the specified Listener. From then on, the Listener fully manages the life cycle of the raw conn, so all of the Listener's capabilities are reused.

The initAgentCoreConnection method defines the interaction for initializing the reverse connection. The code can be found at pkg/filter/network/tunnel/connection.go:250; this article does not go into those details.
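The retry loop in initConnection can be isolated into a small, testable sketch. This is an illustration rather than MOSN's implementation: `retryWithBackoff` and `simulate` are hypothetical names, the sleep function is injected so the example does not actually wait, and, unlike the original loop, the sketch returns an error when the retry limit allows no attempt at all.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// retryWithBackoff calls connect until it succeeds or maxRetries attempts have
// been made; maxRetries == -1 retries forever. The wait doubles after each failure.
func retryWithBackoff(connect func() error, maxRetries int, base time.Duration, sleep func(time.Duration)) error {
	err := errors.New("no connection attempt was made")
	wait := base
	for i := 0; i < maxRetries || maxRetries == -1; i++ {
		if err = connect(); err == nil {
			return nil
		}
		sleep(wait)
		wait *= 2
	}
	return err
}

// simulate runs retryWithBackoff against a fake connect function that fails
// `failures` times before succeeding, recording waits instead of sleeping.
func simulate(failures, maxRetries int) (attempts int, waitsMs []int64, err error) {
	connect := func() error {
		attempts++
		if attempts <= failures {
			return errors.New("connection refused")
		}
		return nil
	}
	record := func(d time.Duration) { waitsMs = append(waitsMs, d.Milliseconds()) }
	err = retryWithBackoff(connect, maxRetries, 100*time.Millisecond, record)
	return attempts, waitsMs, err
}

func main() {
	attempts, waits, err := simulate(2, 5)
	fmt.Println("attempts:", attempts, "waits (ms):", waits, "err:", err)
}
```

Injecting the sleep function is a design choice made for this sketch so that the doubling schedule can be checked without real delays.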

Part.4--Interaction process

At present, MOSN's reverse channel is implemented only on top of raw connections, so a simple and clear network communication protocol has been defined for it.

[Figure: reverse channel protocol format]

The protocol mainly includes:

- Protocol magic number: 2 bytes;

- Protocol version: 1 byte;

- Body structure type: 1 byte, covering initialization, graceful shutdown, and so on;

- Body data length: 2 bytes;

- JSON-serialized body data.
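A frame with this layout could be encoded and decoded roughly as follows. This is a sketch under stated assumptions, not MOSN's codec: it assumes the fields appear on the wire in the order listed and in big-endian byte order, and the magic number, version, type code, and all function and type names are hypothetical.

```go
package main

import (
	"encoding/binary"
	"encoding/json"
	"errors"
	"fmt"
)

const (
	frameMagic   uint16 = 0x2022 // hypothetical magic number
	frameVersion byte   = 1      // hypothetical protocol version
	typeInit     byte   = 1      // hypothetical code for the initialization body
	headerLen           = 6      // 2 (magic) + 1 (version) + 1 (type) + 2 (length)
)

// initInfo is a trimmed, illustrative version of the initialization body.
type initInfo struct {
	ClusterName string `json:"cluster_name"`
	Weight      int64  `json:"weight"`
}

// encodeFrame serializes body as JSON and prepends the 6-byte header.
func encodeFrame(typ byte, body interface{}) ([]byte, error) {
	payload, err := json.Marshal(body)
	if err != nil {
		return nil, err
	}
	if len(payload) > 0xFFFF {
		return nil, errors.New("body does not fit in a 2-byte length field")
	}
	buf := make([]byte, headerLen+len(payload))
	binary.BigEndian.PutUint16(buf[0:2], frameMagic)
	buf[2] = frameVersion
	buf[3] = typ
	binary.BigEndian.PutUint16(buf[4:6], uint16(len(payload)))
	copy(buf[headerLen:], payload)
	return buf, nil
}

// decodeFrame validates the header, decodes the JSON body into out,
// and returns the body structure type.
func decodeFrame(data []byte, out interface{}) (byte, error) {
	if len(data) < headerLen {
		return 0, errors.New("incomplete header")
	}
	if binary.BigEndian.Uint16(data[0:2]) != frameMagic {
		return 0, errors.New("unexpected magic number")
	}
	if data[2] != frameVersion {
		return 0, errors.New("unsupported protocol version")
	}
	n := int(binary.BigEndian.Uint16(data[4:6]))
	if len(data) < headerLen+n {
		return 0, errors.New("incomplete body")
	}
	if err := json.Unmarshal(data[headerLen:headerLen+n], out); err != nil {
		return 0, err
	}
	return data[3], nil
}

func main() {
	frame, err := encodeFrame(typeInit, initInfo{ClusterName: "edge", Weight: 10})
	if err != nil {
		fmt.Println("encode error:", err)
		return
	}
	var got initInfo
	typ, err := decodeFrame(frame, &got)
	fmt.Println(typ, got, err)
}
```

A 2-byte length field caps the JSON body at 65,535 bytes, which is why the encoder rejects larger payloads.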

The complete life cycle of a MOSN reverse channel connection is as follows:

[Figure: life cycle interaction of the MOSN reverse channel]

During connection establishment, the Tunnel Agent takes the initiative. After the TCP connection is established (and the TLS handshake succeeds), it serializes ConnectionInitInfo, the key information for establishing the reverse connection, and sends it to the peer Tunnel Server. This structure defines the metadata of the reverse channel.

// ConnectionInitInfo is the basic information of agent host,
// it is sent immediately after the physical connection is established
type ConnectionInitInfo struct {
    ClusterName      string                 `json:"cluster_name"`
    Weight           int64                  `json:"weight"`
    HostName         string                 `json:"host_name"`
    CredentialPolicy string                 `json:"credential_policy"`
    Credential       string                 `json:"credential"`
    Extra            map[string]interface{} `json:"extra"`
}

After the Tunnel Server receives the metadata, its main tasks are:

1. If a custom authentication method is configured, authenticate the connection;

2. Have the clusterManager add the connection to the specified ClusterSnapshot and write back the connection result.

At this point, the connection establishment process is complete.

func (t *tunnelFilter) handleConnectionInit(info *ConnectionInitInfo) api.FilterStatus {
    // Auth the connection
    conn := t.readCallbacks.Connection()
    if info.CredentialPolicy != "" {
        // 1. Custom authentication, omitted for brevity
    }
    if !t.clusterManager.ClusterExist(info.ClusterName) {
        writeConnectResponse(ConnectClusterNotExist, conn)
        return api.Stop
    }
    // Set the flag that has been initialized, subsequent data processing skips this filter
    err := writeConnectResponse(ConnectSuccess, conn)
    if err != nil {
        return api.Stop
    }
    conn.AddConnectionEventListener(NewHostRemover(conn.RemoteAddr().String(), info.ClusterName))
    tunnelHostMutex.Lock()
    defer tunnelHostMutex.Unlock()
    snapshot := t.clusterManager.GetClusterSnapshot(context.Background(), info.ClusterName)
    // 2. Add the host to the specified cluster
    _ = t.clusterManager.AppendClusterTypesHosts(info.ClusterName, []types.Host{NewHost(v2.Host{
        HostConfig: v2.HostConfig{
            Address:    conn.RemoteAddr().String(),
            Hostname:   info.HostName,
            Weight:     uint32(info.Weight),
            TLSDisable: false,
        }}, snapshot.ClusterInfo(), CreateAgentBackendConnection(conn))})
    t.connInitialized = true
    return api.Stop
}
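The control flow of the handler above (authenticate if a policy is set, check that the cluster exists, then register the connection as a host) can be reduced to a standalone sketch. Everything here is hypothetical and simplified: `registry` stands in for the cluster manager, `validator` for a pluggable credential check, and the "shared_secret" policy and its secret are made up for illustration.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// initInfo is a trimmed stand-in for ConnectionInitInfo.
type initInfo struct {
	ClusterName      string
	HostName         string
	Weight           int64
	CredentialPolicy string
	Credential       string
}

// validator checks a credential for one policy (hypothetical extension point).
type validator func(credential string) error

var (
	errAuth      = errors.New("credential rejected")
	errNoCluster = errors.New("cluster does not exist")
)

// registry is a toy replacement for the cluster manager: it maps a cluster
// name to the remote addresses of the tunnel connections registered in it.
type registry struct {
	mu         sync.Mutex
	validators map[string]validator
	clusters   map[string][]string
}

// newRegistry creates a registry with the given (initially empty) clusters
// and a single made-up authentication policy.
func newRegistry(clusters ...string) *registry {
	r := &registry{
		validators: map[string]validator{
			"shared_secret": func(c string) error {
				if c != "s3cret" {
					return errAuth
				}
				return nil
			},
		},
		clusters: make(map[string][]string),
	}
	for _, name := range clusters {
		r.clusters[name] = nil
	}
	return r
}

// handleInit authenticates the connection if a policy is set, verifies that
// the target cluster exists, and then records the connection as a host.
func (r *registry) handleInit(info initInfo, remoteAddr string) error {
	r.mu.Lock()
	defer r.mu.Unlock()
	if info.CredentialPolicy != "" {
		check, ok := r.validators[info.CredentialPolicy]
		if !ok || check(info.Credential) != nil {
			return errAuth
		}
	}
	hosts, ok := r.clusters[info.ClusterName]
	if !ok {
		return errNoCluster
	}
	r.clusters[info.ClusterName] = append(hosts, remoteAddr)
	return nil
}

func main() {
	r := newRegistry("edge")
	err := r.handleInit(initInfo{ClusterName: "edge", HostName: "agent-1", Weight: 10}, "192.0.2.10:40000")
	fmt.Println(err, r.clusters["edge"])
}
```

The real handler additionally writes a response code back to the Agent and registers a listener that removes the host when the connection closes; both are left out of this sketch.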

Next comes the communication process. For ease of understanding, the following diagram shows a request flowing in one direction:

[Figure: one-way request flow through the reverse channel]

In the traditional MOSN sidecar scenario, a request sent by the Frontend first passes through the Client MOSN, which uses the routing module to actively create a connection (the dashed line) to the peer. The Server MOSN's biz-listener then processes the request and passes it to the Backend.

In the reverse channel implementation for the cloud-edge scenario, the Client MOSN (Tunnel Server) adds the physical connection to the cluster snapshot that represents the peer MOSN after it receives the peer Tunnel Agent's request to create a reverse channel. The Frontend's request traffic can therefore flow over the reverse channel to the peer MOSN. Because the Tunnel Agent side has handed the connection to the biz-listener, all reads and writes are handled by the biz-listener, which processes the request and forwards it to the real Backend service.

Part.5--Summary and Planning

This article has introduced the implementation principles and design ideas of the MOSN reverse channel. As a high-performance cloud-native network proxy, MOSN aims for the reverse channel to better support its role of carrying east-west traffic in cloud-edge collaboration scenarios.

We will continue to extend this support in the future, including but not limited to:

1. Support for a gRPC-based reverse channel. gRPC is the most common service communication framework in the cloud-native era and has a variety of powerful governance capabilities built in;

2. More general built-in components for dynamic Tunnel Server service discovery, designed around more cloud-native scenarios;

3. More supporting tools for automated operations and deployment.

Learn more

MOSN Star ✨:
https://github.com/mosn/mosn

Come and build with us 🧸
