Design and Implementation of RTC Scaffolding

Image source: https://699pic.com/tupian-401703470.html
Author: AirLand

What is RTC?

RTC, short for Real-Time Communication, is officially described as a service that provides high-concurrency, low-latency, high-definition, smooth, secure, and reliable real-time audio and video across all scenarios and all forms of interaction. Put simply, it is a service that enables features such as one-to-one and many-to-many audio and video calls. Many providers currently offer this service, such as Shengwang, Yunxin, Volcano Engine, and Tencent Cloud.

Background

Cloud Music currently has many apps, and many of them involve RTC business, such as audio and video co-streaming, PK, party rooms, and 1v1 chat. Because the business lines, features, and developers all differ, each team has written its own implementation and reinvented the wheel. To avoid this duplicated work and improve development efficiency, a general-purpose RTC framework is needed.

Design ideas

Before getting into the concrete design, here are the ideas behind it:

Functional cohesion: all functions are encapsulated in a container, and methods are exposed externally through interfaces
Business isolation: different businesses use different function containers
Unified call: all function containers share a unified call entry
Status maintenance: state must be maintained precisely
Seamless switching: switching between function containers should be imperceptible
Core controllable: the core link can be monitored and can raise fault alerts

Based on these six points, the overall architecture is shown in the figure. There is no need to dig into what each module represents yet, since that is covered later. For now, the goal is only to get a sense of the overall structure:

Next, I will talk about the specific implementation process.

Design

Foreword:

Although RTC has many business scenarios, they are essentially the same: all users join a common room and then communicate in real time through audio and video inside it. In actual projects, they fall roughly into two types: full-scene RTC and partial-scene RTC.

Full-scene RTC: the entire business is built on RTC technology, for example 1v1 audio and video calls and party rooms.
Partial-scene RTC: only part of the service chain uses RTC technology, which often involves engine switching.

Whichever scenario applies, the engine that carries the core functions is essential, so we start with the engine. For convenience, the engine is referred to as the Player from here on.

1. Encapsulation of Player

RTC-related business involves different types of Players, for example a push Player when the host starts broadcasting, a pull Player when viewers watch the live stream, and an RTC Player. Although their functions differ, they are used in similar ways, such as start to begin and stop to terminate. We can therefore abstract the different Players behind a common interface, IPlayer. The relevant code is as follows:

interface IPlayer<DS : IDataSource, CB : ICallback> {
    fun start(ds: DS)

    fun stop()

    fun <T : Any> setParam(key: String, value: T?)

    // ......
}

Here, IDataSource and ICallback are the data source and the callback required to start the Player. Both come up repeatedly later in this article, especially IDataSource, which is what the Player is started from, much like the phone number when making a call.

One problem here is that, besides the general methods, each Player also has its own specific methods, such as mute and volume adjustment. These methods are numerous and varied and cannot all be listed in the IPlayer interface. Even if they could, the methods of a Player will certainly change as the business iterates, and changing the interface every time a method changes is clearly at odds with good programming principles. So how can different methods be abstracted so that the upper layer performs different operations by calling the same method? The answer is this method:

 fun <T : Any> setParam(key: String, value: T?)

Here, key is the unique tag of a method and value is that method's input parameter. The upper layer only needs to pass the method tag and its argument to setParam in order to invoke the corresponding method. How is this done? The answer is simple: an intermediate layer establishes a one-to-one mapping between keys and methods. However, there are many types of Players, and hand-writing the mapping logic for each one would be tedious. So this middle layer is generated automatically by combining APT compile-time annotation processing with [javapoet](https://github.com/square/javapoet). The generated class is named xxxPlayerWrapper and contains a convert method that holds the one-to-one mapping logic. Next, let's look at the specific implementation:

First, two annotations are defined, one applied to a concrete Player class and one to its methods. For example:

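// Java: annotation definitions
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

@Retention(RetentionPolicy.CLASS)
@Target({ElementType.TYPE})
public @interface PlayerClass {
}

@Retention(RetentionPolicy.CLASS)
@Target({ElementType.METHOD})
public @interface PlayerMethod {
    String name();
}

// Kotlin: usage on a concrete Player (IPlayer is an interface, so no constructor call)
@PlayerClass
open class xxxPlayer : IPlayer<xxxDataSource, xxxCallback> {

    @PlayerMethod(name = "key1")
    fun method1(v: String) {
        // .... concrete implementation
    }
}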

This establishes the one-to-one mapping relationship:

xxxPlayer and xxxPlayerWrapper depend on each other: each holds the other as a member variable. A call to the interface method setParam(key: String, value: T?) on xxxPlayer is forwarded directly to the convert method of xxxPlayerWrapper, which looks up the method registered under that key and then calls that method on the Player.
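The article does not show the generated class itself, so the following is only a hand-written sketch of what such a wrapper might look like. DemoPlayer, DemoPlayerWrapper, the keys "mute" and "volume", and the exact shape of the PlayerWrapper interface are all assumptions made for illustration. Only the idea of a convert method that maps a key to a Player method comes from the text above.

```kotlin
// Assumed shape of the wrapper contract; the article only names the type and its convert method.
interface PlayerWrapper {
    fun convert(key: String, value: Any?)
}

// Hypothetical Player with two specific methods.
class DemoPlayer {
    var muted = false
        private set
    var volume = 100
        private set

    // In the real design this would carry @PlayerMethod(name = "mute").
    fun mute(on: Boolean) {
        muted = on
    }

    // In the real design this would carry @PlayerMethod(name = "volume").
    fun adjustVolume(level: Int) {
        volume = level
    }
}

// Hand-written stand-in for the class an annotation processor might emit.
class DemoPlayerWrapper(private val player: DemoPlayer) : PlayerWrapper {
    override fun convert(key: String, value: Any?) {
        when (key) {
            // Each key maps to exactly one method; a value of the wrong type is ignored.
            "mute" -> (value as? Boolean)?.let { player.mute(it) }
            "volume" -> (value as? Int)?.let { player.adjustVolume(it) }
            // Unknown keys are silently ignored in this sketch.
        }
    }
}
```

With a wrapper like this, adding a new Player method only requires a new annotated method and a regenerated wrapper, while the IPlayer interface stays unchanged.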

Since all Players have this logic, this part can be abstracted into an AbsPlayer:

 abstract class AbsPlayer<DS : IDataSource, CB : ICallback>
    : IPlayer<DS, CB>{
    var dataSource: DS? = null
    private val wrapper by lazy {
        val ret = kotlin.runCatching {
            val clazz = Class.forName(this::class.java.canonicalName + "Wrapper")
            val signature = arrayOf(this::class.java)
            clazz.constructors.find {
                signature.contentEquals(it.parameterTypes)
            }?.newInstance(this) as? PlayerWrapper
        }
        ret.exceptionOrNull()?.printStackTrace()
        ret.getOrNull()
    }


    override fun <T : Any> setParam(key: String, value: T?) {
        wrapper?.convert(key, value)
    }
    // ...... other unrelated code omitted
}

Finally, the class diagram of the entire Player is as follows:

We do not go into how the Player's functions are implemented here, such as how it pushes a stream, pulls a stream, or performs RTC. The service-provider SDK underneath differs from project to project, and so does the technical implementation, so the discussion here stays at the architectural level.

2. Player switching

Player switching targets partial-scene RTC. For this kind of scenario we introduce the concept of a SwitchablePlayer. It also inherits from AbsPlayer and offers all the functions of a Player, but it delegates those functions to the real Player it wraps, following the decorator pattern, and adds the ability to switch. Before discussing the switching capability, consider a few questions.

When to trigger the Switch?
How to do Switch?
Where does the Switch's target object Player come from?

The first question is when to trigger the switch. Triggering a switch means that another Player needs to be started, and starting a Player requires the IDataSource mentioned above. So we only need to check whether the type of the IDataSource passed in at start matches the IDataSource type of the current Player. If they differ, the switch is triggered. Concretely, the check compares the IDataSource type in the current Player's generic parameters (the first generic parameter of AbsPlayer<DS : IDataSource, CB : ICallback>) with the type of the incoming IDataSource.

 private fun isSourceMatch(
        player: AbsPlayer<IDataSource, ICallback>?,
        ds: IDataSource
    ): Boolean {
        if (player == null) {
            return false
        } else {
            val clazz = player::class.java
            var type = getGenericSuperclass(clazz) ?: return false
            while (Types.getRawType(type) != AbsPlayer::class.java) {
                type = getGenericSuperclass(type) ?: return false
            }
            return if (type is ParameterizedType) {
                val args = type.actualTypeArguments
                if (args.isNullOrEmpty()) {
                    false
                } else {
                    Types.getRawType(args[0]).isInstance(ds) && isSameSource(player, ds)
                }
            } else {
                false
            }
        }
    }
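The generic-parameter check can also be illustrated in isolation. The sketch below uses only java.lang.reflect and does not rely on the Types, getGenericSuperclass, or isSameSource helpers from the listing above, whose implementations are not shown in the article. The class names are hypothetical, and the sketch assumes every concrete Player binds DS to a plain class rather than to another type variable.

```kotlin
import java.lang.reflect.ParameterizedType

interface IDataSource
interface ICallback

// Stripped-down stand-in for the article's AbsPlayer.
abstract class AbsPlayer<DS : IDataSource, CB : ICallback>

class LiveDataSource : IDataSource
class PKDataSource : IDataSource

class LivePlayer : AbsPlayer<LiveDataSource, ICallback>()
open class PKPlayer : AbsPlayer<PKDataSource, ICallback>()
class FancyPKPlayer : PKPlayer()

// Returns true when ds is an instance of the DS type argument the player was declared with.
fun acceptsSource(player: AbsPlayer<*, *>, ds: IDataSource): Boolean {
    var clazz: Class<*> = player.javaClass
    // Walk up the hierarchy until the direct superclass is AbsPlayer itself.
    while (clazz.superclass != null && clazz.superclass != AbsPlayer::class.java) {
        clazz = clazz.superclass
    }
    // The generic superclass keeps the actual type arguments, e.g. AbsPlayer<LiveDataSource, ICallback>.
    val type = clazz.genericSuperclass as? ParameterizedType ?: return false
    val dsType = type.actualTypeArguments.firstOrNull() as? Class<*> ?: return false
    return dsType.isInstance(ds)
}
```

This works because type arguments written in a class's extends clause are recorded in the class file and remain available through reflection, even though the type arguments of ordinary instances are erased.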

The second question is how to switch. This is relatively simple: stop the current Player and start the target Player.

The third question is where the target Player of the switch comes from. SwitchablePlayer does not know which Players the business needs. It only wraps the Player functions and maintains the switching capability. Creating the concrete Player is therefore left to the business layer, and SwitchablePlayer only provides an abstract method for obtaining a Player, such as:

 abstract fun getPlayer(ds: IDataSource): AbsPlayer<out IDataSource, out ICallback>?

In addition, the current Player is stopped during a switch, which raises the question of whether the stopped Player can be reused. If it can, it is cached, and the next time a Player of that kind is needed it is taken from the cache first. The overall flow of SwitchablePlayer is shown in the figure:

When using it, the caller defines the relevant Players according to its own business. For example, the live broadcast -> PK business involves switching between two Players, LivePlayer and PKPlayer:

class LivePKSwitchablePlayer : SwitchablePlayer(false) {
    override fun getPlayer(ds: IDataSource): AbsPlayer<out IDataSource, out ICallback> {
        return when (ds) {
            is LiveDataSource -> LivePlayer()
            is PKDataSource -> PKPlayer()
            else -> LivePlayer()
        }
    }
}
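The switching flow described in this section (match check, stop, cache lookup, create, start) can be sketched as follows. This is a simplified illustration and not the actual SwitchablePlayer: SketchPlayer, SwitchingPlayerSketch, LivePKSketch, the accepts method, and the reuseStopped flag are hypothetical names, and the generic-type check is replaced by a plain accepts call to keep the example short.

```kotlin
interface IDataSource
class LiveDataSource : IDataSource
class PKDataSource : IDataSource

// Simplified, non-generic stand-in for a Player; it only records what happens to it.
abstract class SketchPlayer(val name: String, private val log: MutableList<String>) {
    abstract fun accepts(ds: IDataSource): Boolean
    fun start(ds: IDataSource) { log += "start $name" }
    fun stop() { log += "stop $name" }
}

class LivePlayer(log: MutableList<String>) : SketchPlayer("live", log) {
    override fun accepts(ds: IDataSource) = ds is LiveDataSource
}

class PKPlayer(log: MutableList<String>) : SketchPlayer("pk", log) {
    override fun accepts(ds: IDataSource) = ds is PKDataSource
}

abstract class SwitchingPlayerSketch(private val reuseStopped: Boolean) {
    private var current: SketchPlayer? = null
    private val cache = mutableListOf<SketchPlayer>()

    // The business layer decides which Player serves which data source.
    protected abstract fun createPlayer(ds: IDataSource): SketchPlayer?

    fun start(ds: IDataSource) {
        val active = current
        // Same kind of data source: no switch needed, keep using the current Player.
        if (active != null && active.accepts(ds)) {
            active.start(ds)
            return
        }
        // Different kind: stop the current Player and optionally keep it for reuse.
        if (active != null) {
            active.stop()
            if (reuseStopped) cache += active
        }
        // Prefer a cached Player, otherwise ask the business layer for a new one.
        val cached = cache.firstOrNull { it.accepts(ds) }
        if (cached != null) cache.remove(cached)
        val target = cached ?: createPlayer(ds)
        current = target
        target?.start(ds)
    }

    fun stop() {
        current?.stop()
        current = null
    }
}

class LivePKSketch(private val log: MutableList<String>) : SwitchingPlayerSketch(reuseStopped = true) {
    var created = 0
        private set

    override fun createPlayer(ds: IDataSource): SketchPlayer? {
        created++
        return when (ds) {
            is PKDataSource -> PKPlayer(log)
            else -> LivePlayer(log)
        }
    }
}
```

Starting this sketch with a live source, then a PK source, then a live source again creates only two Players, because the stopped live Player is taken back from the cache on the third call.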

3. Process packaging

For the encapsulation of the entire RTC process, two things need to be figured out:

What is the main process of RTC
What business callers need and what they are concerned about

Since the main process of RTC is similar to daily phone calls, the author uses this analogy to make it easier for everyone to understand. The following figure shows the entire call process.

With the whole process clear, the next step is the second question: what business callers need and what they care about. Based on the figure above, there are roughly three points:

The first is an entry point for dialing and hanging up (the Player's Start and Stop).
The second is knowing the current call status, such as whether the call is connecting, whether it has been connected, and whether it has ended (Player status maintenance).
The third is feedback, such as the other party not answering, the other party being out of the service area, or the number not existing (the Player's core event callback, which is the ICallback mentioned earlier).

How the call is actually connected and what happens underneath is of no concern to the caller. Based on this, the overall functional design should focus on the following points:

1. Manage the Player by designing a manager and expose the Start and Stop methods to the outside world.
2. Maintain the state of the Player and allow its state to be monitored by the upper layer.
3. Some core event callbacks of Player can also be monitored by the upper layer.

The first and third points are relatively simple, so I won't go into detail here. For the second point, state maintenance, the author uses a state machine (StateMachine): different operations are performed in different states, each state corresponds to a status code, and the upper layer senses state changes by observing the status code.
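The article does not show the state machine itself, so here is a minimal sketch of the idea under stated assumptions. IDLE and CONNECT_END appear in the article's RtcStatus, while CONNECTING, CONNECTED, the numeric codes, the transition table, and the names CallState and CallStateMachine are invented for illustration.

```kotlin
// Hypothetical states; each one carries the status code reported to the upper layer.
enum class CallState(val code: Int) {
    IDLE(0),
    CONNECTING(1),
    CONNECTED(2),
    CONNECT_END(3)
}

class CallStateMachine(private val onStateCode: (Int) -> Unit) {
    var state = CallState.IDLE
        private set

    // Assumed transition table: which states may follow the current one.
    private val allowed = mapOf(
        CallState.IDLE to setOf(CallState.CONNECTING),
        CallState.CONNECTING to setOf(CallState.CONNECTED, CallState.CONNECT_END),
        CallState.CONNECTED to setOf(CallState.CONNECT_END),
        CallState.CONNECT_END to setOf(CallState.IDLE)
    )

    // Moves to the next state if the transition is legal and reports the new status code.
    fun moveTo(next: CallState): Boolean {
        if (next !in allowed.getValue(state)) return false
        state = next
        onStateCode(next.code)
        return true
    }
}
```

Rejecting illegal transitions in one place is what keeps the state precise: a stray "connected" callback that arrives while the call is idle is dropped and never surfaces as a status code.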

Status codes and core events are published here through LiveData:

 class RtcHolder : IRtcHolder {
    private val _rtcState = MutableLiveData(RtcStatus.IDLE)
    private val _rtcEvent = MutableLiveData(RtcEvent.IDLE)
    val rtcState = _rtcState.distinctUntilChanged()
    val rtcEvent = _rtcEvent.distinctUntilChanged()
    private val callBack = object : IRtcCallBack {
        override fun onCurrentStateChange(stateCode: Int) {
            _rtcState.value = stateCode
        }

        override fun onEvent(eventCode: Int) {
            _rtcEvent.value = eventCode
        }
       
        // ...... other code omitted

    }

    init {
        // Upper-layer state listener
        rtcState.observeForever {
            when (it) {
                RtcStatus.CONNECT_END -> {
                    ToastHelper.showToast("Call ended")
                }
            }
        }
    }
    // ...... other code omitted
}

At this point, the design of the entire scaffolding is complete. The service-provider SDK wrapping and the monitoring part will be explained in the next issue.

Summary

This article introduced the background of the RTC scaffolding and walked through its design process and final implementation step by step, identifying problems, solving them, and reflecting on the reasoning along the way. Due to limited space, not every point could be covered in detail. Interested readers are welcome to leave a message with any questions so we can discuss and learn together.

This article is published from the NetEase Cloud Music technical team, and any form of reprinting of the article is prohibited without authorization. We recruit various technical positions all year round. If you are ready to change jobs and happen to like cloud music, then join us at grp.music-fe(at)corp.netease.com!
