
Guide: As modern lifestyles change, so does social entertainment. Traditional face-to-face activities are giving way to online interaction, and advances in RTC technology have further changed the form of online entertainment. One-way media such as watching movies and videos or listening to music account for a declining share, while more interactive forms such as interactive live streaming, voice calls, and online KTV are steadily growing.

The necessity of audio processing

Sound is one of the most important means of human communication, so processing it well is of the utmost importance. On the one hand, humans are extremely sensitive to sound, and the way sound reaches us is affected by the physiological structure of the human body. Vision is limited by light and orientation, so it cannot be relied on at all times; in many cases hearing becomes the most important channel through which people perceive their environment. On the other hand, there are also scenarios where sound exists on its own, separate from any picture.

Interactive communication is an extremely important RTC function, and it places the following requirements on the processing of audio calls:

  • Ultra-low latency, so that real-time interaction feels as if there were no distance between participants
  • Very high call quality: echo, noise, and other factors that affect listening must be handled properly so that nothing interferes with the call

The characteristics of social entertainment place new requirements on audio processing. Users want high-quality music, a strong sense of presence, interesting audio effects, high-quality audio content sharing, and so on. This requires optimizing audio from several angles to achieve the best results. This article focuses on audio sharing.

The concept of audio sharing

Audio sharing generally means sharing the audio playing on a device with the other participants, so that everyone hears the same sound, for example when listening to a song together.

Having the users in a call hear the same sound is, in some cases, very important for their sense of presence. A direct approach is to let the remote user hear the local sound through the microphone channel, but in many cases the result is poor: distortion in the capture and playback chain, plus the voice-specific processing applied to the microphone channel, can ruin high-quality audio.

There is therefore a real need for an audio sharing function that bypasses the front-end processing stages and adapts flexibly and conveniently to various scenarios.

NetEase Yunxin's audio sharing implementation

To meet users' needs for audio sharing across multiple scenarios, NetEase Yunxin has implemented a flexible audio sharing solution.

A variety of shared sound sources are supported. You can share a local audio file and, of course, a network audio source as well.

Data files in common formats such as MP3 and AAC are supported: they are decoded by the built-in decoder and then shared. This is the simplest and most common approach.

What if the user wants to share sound that is being played by third-party software? In that case we capture the playback data through system interfaces and process it, so that users are no longer blocked by being unable to obtain the data source, and the range of shareable audio sources becomes more diverse.

The architecture here looks slightly different from the common RTC architecture: there is an additional echo cancellation module, and the source of the reference signal has also changed. This is what makes the architecture special. The lower echo cancellation module is the one used for basic calls. Because the shared sound has to be heard locally and by the other party at the same time, the sound picked up by the microphone may also contain that signal, which needs to be removed. The reference must therefore include not only the remote party's sound but also the sound played locally.

Here, the signal that is actually played out is used as the reference input, which keeps the local voice input cleaner. The additional echo cancellation stage is used to remove the remote party's voice. When third-party playback is used as the shared source, the captured signal contains everything being played, including the peer's voice. Removing the peer's voice from the shared source ensures that call quality remains high during sharing.
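The effect of choosing the actual playback signal as the reference can be illustrated with a small simulation. The sketch below is not NetEase Yunxin's implementation: it uses a textbook NLMS adaptive filter as a stand-in echo canceller, random noise as stand-ins for the peer's voice and the shared music, and a made-up three-tap speaker-to-microphone echo path. All names and values are assumptions chosen for illustration.

```python
import random

def nlms_cancel(mic, ref, taps=8, mu=0.5, eps=1e-6):
    """Subtract from `mic` whatever can be predicted from `ref` (NLMS adaptive filter)."""
    w = [0.0] * taps       # running estimate of the echo path
    hist = [0.0] * taps    # most recent reference samples, newest first
    out = []
    for m, r in zip(mic, ref):
        hist = [r] + hist[:-1]
        echo_est = sum(wi * xi for wi, xi in zip(w, hist))
        e = m - echo_est   # residual after removing the estimated echo
        norm = sum(x * x for x in hist) + eps
        w = [wi + mu * e * xi / norm for wi, xi in zip(w, hist)]
        out.append(e)
    return out

def power(x):
    return sum(v * v for v in x) / len(x)

random.seed(0)
n = 4000
remote = [random.uniform(-1, 1) for _ in range(n)]  # stand-in for the peer's voice
shared = [random.uniform(-1, 1) for _ in range(n)]  # stand-in for the shared music
playback = [a + b for a, b in zip(remote, shared)]  # what the loudspeaker really plays

echo_path = [0.6, 0.3, 0.1]  # hypothetical speaker-to-microphone response
mic = [
    sum(h * playback[i - k] for k, h in enumerate(echo_path) if i - k >= 0)
    for i in range(n)
]  # no local talker here, so any residual is leftover echo

res_full = nlms_cancel(mic, playback)      # reference = actual playback signal
res_remote_only = nlms_cancel(mic, remote) # reference = peer audio only

tail = slice(-1000, None)  # measure after the filter has had time to adapt
print("residual ratio, playback reference:", power(res_full[tail]) / power(mic[tail]))
print("residual ratio, remote-only reference:", power(res_remote_only[tail]) / power(mic[tail]))
```

With the full playback signal as the reference, the residual should shrink to almost nothing, whereas a reference that contains only the peer's audio leaves the echo of the shared music in the microphone signal.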

Audio sharing application scenarios

The audio sharing solution above is a unified architecture that can be used for team gaming with voice chat, audio sharing, online KTV, and more, covering a range of entertainment and office scenarios.

With this basic processing framework, you can flexibly configure the internal processing and add appropriate external logic to implement various functions. The following figure is an example:

Replace the third-party audio content above with a game, a music player, or a browser, and a few simple operations are enough to implement audio sharing scenarios such as team gaming, listening to songs together, and meetings.
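As a rough sketch of the final step in such a scenario, the shared source can be mixed into the uplink after the microphone signal has been processed, so the shared audio itself never passes through the voice processing chain. The function name `mix_int16`, the frame values, and the gain are hypothetical and only illustrate the idea; they are not part of the NetEase Yunxin SDK.

```python
def mix_int16(*tracks, gains=None):
    """Sum equally long 16-bit PCM tracks with per-track gain, saturating to int16."""
    gains = gains or [1.0] * len(tracks)
    mixed = []
    for samples in zip(*tracks):
        s = sum(g * v for g, v in zip(gains, samples))
        # Clamp instead of wrapping around, so loud passages clip rather than crackle.
        mixed.append(max(-32768, min(32767, int(round(s)))))
    return mixed

# Hypothetical 4-sample frames of 16-bit PCM
mic_clean = [1000, -2000, 30000, -30000]  # microphone after echo and noise processing
shared = [500, 500, 10000, -10000]        # captured game / player / browser audio

# Shared audio is added at half volume after the voice processing, not before it.
uplink = mix_int16(mic_clean, shared, gains=[1.0, 0.5])
print(uplink)  # [1250, -1750, 32767, -32768]
```

Saturating rather than wrapping is a simple design choice for the sketch; a production mixer would more likely apply a limiter or automatic gain to avoid audible clipping.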

If that example seems too simple, the following is an example of an online KTV chorus implementation.

The left side of the figure is the lead singer's end, which provides the accompaniment. After the local vocals are mixed in, the audio is sent to the second singer over RTC.

The voice of the second singer, on the right, is streamed to the lead singer over RTC so that the two can stay synchronized. At the same time, the second singer's vocals are mixed with the audio received from the lead singer's side, which already contains the lead vocals, to form the complete chorus that is sent to the live audience.
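The routing described above can be written down as a small model that tracks which sources each stream contains. This is only a sketch of the signal flow: streams are represented as sets of labels rather than audio samples, all names are made up for illustration, and the assumption that the lead singer monitors the accompaniment locally is ours rather than something the article states.

```python
def mix(*streams):
    """Combine streams; each stream is modelled as a frozenset of source labels."""
    return frozenset().union(*streams)

ACCOMPANIMENT = frozenset({"accompaniment"})
LEAD_VOCAL = frozenset({"lead_vocal"})
SECOND_VOCAL = frozenset({"second_vocal"})

# Lead singer's end (left side): accompaniment plus local vocals, sent over RTC.
lead_to_second = mix(ACCOMPANIMENT, LEAD_VOCAL)

# Second singer's end (right side): own vocals go back to the lead singer for
# synchronization, and are also mixed with the received audio for the audience.
second_to_lead = SECOND_VOCAL
to_audience = mix(lead_to_second, SECOND_VOCAL)

# What each singer hears (lead singer's local accompaniment monitoring is an assumption).
lead_hears = mix(ACCOMPANIMENT, second_to_lead)
second_hears = lead_to_second

for name, stream in [("lead -> second", lead_to_second),
                     ("second -> lead", second_to_lead),
                     ("to audience", to_audience)]:
    print(name, sorted(stream))
```

The model makes one property of the design easy to check: only the stream leaving the second singer's end contains all three sources, which is why the complete chorus is sent to the audience from that side.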

The above is one implementation of an online KTV scenario. Of course, online KTV involves many other aspects, and the problems go well beyond audio sharing: lyrics transmission, synchronization between ends, and end-to-end audio delay are all obstacles that need to be overcome. Solving them provides a better experience.

Summary

NetEase Yunxin's SDK products provide a complete audio sharing solution that supports two-channel, full-band audio and covers scenarios including team gaming, listening to songs together, and online KTV. If you are interested, you can visit the official NetEase Yunxin website and download the demo to try it out.


网易数智
