This article is part of the "Dev for Dev Column" series. The author is Huang Yiqing, an engineer at Shengwang's Audio and Video Laboratory.
Optimizing audio quality is a complex systems-engineering task, and echo cancellation is one of its recurring topics. In general, echo cancellation performance depends on many factors, including the acoustic design of the device, the acoustic environment, and the software system. Traditional echo cancellation combines linear echo cancellation with nonlinear processing, but the field still faces open problems: nonlinear echo, near-end energy lower than echo energy, stereo echo, clock mismatch between the microphone and the reference signal, inaccurate reference signals, and the lack of reliable delay estimation methods. Drawing on its own practice, Shengwang's audio technology team has launched a series of articles on audio evaluation in specific scenarios. This installment covers echo cancellation. Comments and corrections from colleagues in the industry are welcome.
With the rollout of 4G/5G, real-time audio and video has developed rapidly, and real-time voice quality is attracting more and more attention. Echo, delay, and stuttering are the main factors people notice in real-time voice quality. This article focuses on echo cancellation in real-time voice calls.
Echo is the phenomenon in which sound played by the loudspeaker is picked up by the microphone and sent back to the far end. Every communication system must perform echo cancellation; otherwise call quality suffers severely. The problems that arise in echo cancellation fall into two main categories: echo leakage and speech dropout during double talk.
Figure 1: Causes of Acoustic Echoes
01 Several problems of echo cancellation
■ Figure 2: Echo Cancellation Scheme
Many factors affect echo cancellation. Volume is one example: when the played signal is too loud, echo is more likely to leak through. The main reasons are as follows:
1. The echo signal captured by the microphone overflows (clips), which introduces nonlinear echo;
2. Excessive volume increases the vibration of the hardware itself, which introduces nonlinear components;
3. The echo signal captured by the microphone does not overflow but is much louder than the near-end speech, so words are dropped or speech even becomes inaudible during double talk.
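Reason 1 can be checked directly on a microphone capture. The snippet below is a minimal sketch, not the team's actual tooling: it assumes 16-bit PCM samples and counts how many of them sit at full scale. The function name `clipping_ratio` and the synthetic overdriven sine wave are hypothetical.

```python
import math


def clipping_ratio(samples, full_scale=32767, margin=0):
    """Return the fraction of samples at or beyond full scale (i.e. clipped)."""
    if not samples:
        return 0.0
    limit = full_scale - margin
    clipped = sum(1 for s in samples if abs(s) >= limit)
    return clipped / len(samples)


# Hypothetical capture: a 440 Hz sine driven 1.5x past 16-bit full scale,
# then hard-limited the way an overflowing microphone path would limit it.
overdriven = [
    max(-32768, min(32767, int(1.5 * 32767 * math.sin(2 * math.pi * 440 * n / 16000))))
    for n in range(16000)
]
print(f"clipped samples: {clipping_ratio(overdriven):.1%}")
```

A non-zero ratio on the echo-only portion of a capture suggests that the echo path is no longer linear, which is the situation a linear adaptive filter alone cannot model.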
In addition, delay jitter, clock offset, unstable capture or playback rates, nonlinear distortion, echo path changes, reverberation, and the device's built-in hardware 3A processing (AEC, noise suppression, and AGC) are all common factors that affect echo cancellation. More broadly, the physical design of the capture and playback hardware (the model and placement of the speaker and microphone components), the phone's built-in 3A algorithms (which vary by manufacturer, system, and model), the transmission algorithm, environmental factors, and complex, changing call scenarios all affect echo cancellation in different ways.
02 Evaluation method of echo cancellation
Echo arises in such complex scenarios, so how do we evaluate echo cancellation? In the laboratory, our evaluation has two parts. The first, manual subjective testing, checks whether echo problems occur in a variety of complex scenarios. The second, objective automated testing, checks whether echo problems occur across a large number of device models and system versions.
Manual subjective testing is the easier of the two to understand. Testers hold real calls that simulate the scenarios users are likely to encounter and listen for echo. Common scenarios include switching between host and audience roles, moving the app to the background or locking the screen, opening or closing third-party audio/video applications, and interruptions, as well as switching between output devices (headphones, loudspeaker, Bluetooth headphones) and environments (quiet, noisy).
So how does an objective automated test detect echoes?
We built a system for measuring AEC. It works with all Shengwang scenarios as well as other SDKs in the industry. The corpus consists of human speech recorded in an anechoic room, and the evaluation is run on the device models most popular among users and on the models with frequently reported problems. The device volume is set to the officially recommended level, and AEC quality is measured by indicators such as the completeness of the test talker's played-back speech, the loudness of that played-back speech, the proportions of long and short echoes, and the amount of residual echo.
03 Specific AEC objective evaluation methods
The test method sends and receives test signals uniformly through the test equipment. This supports regression testing, remains stable in large-scale automated tests, and greatly improves test efficiency.
Step 1: Connect the near-end device and the far-end device in a call;
Step 2: The computer outputs the test audio signal through the sound card to the near-end standard equipment, so that the near-end device can capture it;
Step 3: The far-end device plays the audio signal it receives;
Step 4: The sound card synchronously captures the audio signal under test that the near-end device receives;
Step 5: Determine the echo cancellation quality of the far-end device by measuring the loudness and duration of the audio signal under test.
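The loudness-and-duration check in Step 5 can be sketched roughly as follows. This is an illustration under stated assumptions, not the measurement system described in this article: it assumes the returned signal is 16-bit PCM, that the far-end room is silent so anything audible in the return path is residual echo, and that a frame counts as echo when its RMS level exceeds a threshold. The 10 ms frame size, the -50 dBFS threshold, and both function names are hypothetical.

```python
import math


def frame_levels_dbfs(samples, sample_rate, frame_ms=10, full_scale=32768.0):
    """Split 16-bit PCM samples into frames and return each frame's RMS level in dBFS."""
    frame_len = int(sample_rate * frame_ms / 1000)
    levels = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        rms = math.sqrt(sum(s * s for s in frame) / frame_len)
        # Floor silent frames at -120 dBFS to avoid log(0)
        levels.append(20 * math.log10(rms / full_scale) if rms > 0 else -120.0)
    return levels


def find_echo_segments(levels, frame_ms=10, threshold_dbfs=-50.0):
    """Return (start_ms, duration_ms, peak_dbfs) for each run of frames above the threshold."""
    segments = []
    run_start = None
    for i, level in enumerate(levels + [-math.inf]):  # sentinel closes a trailing run
        if level > threshold_dbfs:
            if run_start is None:
                run_start = i
        elif run_start is not None:
            peak = max(levels[run_start:i])
            segments.append((run_start * frame_ms, (i - run_start) * frame_ms, peak))
            run_start = None
    return segments


# Hypothetical returned signal: 50 ms silence, a 30 ms leaked burst, 50 ms silence
rate = 16000
burst = [int(3000 * math.sin(2 * math.pi * 1000 * n / rate)) for n in range(480)]
returned = [0] * 800 + burst + [0] * 800
segments = find_echo_segments(frame_levels_dbfs(returned, rate))
for start_ms, duration_ms, peak in segments:
    print(f"echo at {start_ms} ms, lasting {duration_ms} ms, peak {peak:.1f} dBFS")
```

Each detected segment carries the two quantities Step 5 names: how loud the leaked echo is and how long it lasts.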
■Figure 3: Anechoic chamber test environment
To evaluate echo cancellation under ideal conditions, we ran the tests in an anechoic chamber, which isolates noise and minimizes room echo. Figure 3 shows part of the test environment. The selected test devices are tested in batches.
Type | Description |
---|---|
A1 | Full duplex without attenuation |
A2 | Full duplex with attenuation in the transmit direction |
B | Very short clipping |
C | Clipping that impairs syllables |
D | Clipping that causes word loss |
E | Very short residual echo |
F | Intermittent echo |
G | Continuous echo |
■Table 1: Echo Cancellation Performance Types
Following 3GPP's description of echo cancellation performance types (Table 1) and its classification of echo cancellation performance (Figure 4), we split the echo proportion into three bands: echoes shorter than 25 ms, echoes between 25 and 150 ms, and echoes longer than 150 ms. The proportions in these three bands define the severity of the echo.
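The three-band split can be illustrated with a short sketch. Two points here are assumptions rather than something the article states: that "proportion" means the share of total recording time occupied by echoes in each band, and that the boundaries are handled as shorter than 25 ms, 25 to 150 ms inclusive, and longer than 150 ms. The function name and the sample durations are hypothetical.

```python
def echo_duration_profile(echo_durations_ms, total_ms):
    """Return the share of the recording occupied by short, medium, and long echoes."""
    if total_ms <= 0:
        raise ValueError("total_ms must be positive")
    buckets = {"under_25ms": 0.0, "25_to_150ms": 0.0, "over_150ms": 0.0}
    for duration in echo_durations_ms:
        if duration < 25:
            buckets["under_25ms"] += duration
        elif duration <= 150:
            buckets["25_to_150ms"] += duration
        else:
            buckets["over_150ms"] += duration
    return {name: total / total_ms for name, total in buckets.items()}


# Hypothetical run: four detected echo segments in a 10-second recording
profile = echo_duration_profile([10, 20, 100, 300], total_ms=10_000)
for band, share in profile.items():
    print(f"{band}: {share:.2%}")
```

Keeping the bands separate matters because a few milliseconds of residual echo and a burst longer than 150 ms affect the listener very differently, even when their combined duration is the same.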
■Figure 4 Echo Cancellation Performance Classification
04 Analysis of test results
The following are the AEC evaluation results for Shengwang and other industry solutions.
■AEC evaluation results (partial)
The results above evaluate echo cancellation under ideal conditions. In real calls, however, complex environments produce echo of varying severity. To simulate real-world echo, we play the recorded corpus in an adjustable reverberation room and analyze the echo. The following is some of the data from the AEC evaluation in four different scenarios at different reverberation levels. The same setup can also test echo cancellation while frequently joining and leaving channels, or under long-duration stress tests.
The reverberation time of the adjustable reverberation room can be set from 0.2 to 2 seconds in 7 levels. This simulates real environments of different sizes, such as small conference rooms, living rooms, lecture halls, large classrooms, and cinemas, and provides repeatable, full-scenario test conditions for both subjective listening evaluation and objective assessment of algorithm quality.
Analyzing the data gives a clear picture of echo cancellation capability. Testing a large number of device models makes it possible to investigate how a particular model affects echo cancellation. Comparing versions shows the effect of each optimization iteration. Comparing against other industry solutions shows how our R&D work measures up.
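To get a feel for what the two ends of this range mean for an echo canceller, the sketch below assumes that "reverberation time" refers to RT60 (the time for the sound level to fall by 60 dB) and that the decay is ideally linear in dB. Both are simplifying assumptions, and the function names are hypothetical.

```python
def decay_db(t_seconds, rt60_seconds):
    """Level change in dB after t seconds, assuming an ideal linear-in-dB decay."""
    return -60.0 * t_seconds / rt60_seconds


def time_to_decay(drop_db, rt60_seconds):
    """Seconds needed for the reverberant tail to fall by drop_db."""
    return rt60_seconds * drop_db / 60.0


for rt60 in (0.2, 2.0):  # the two ends of the range quoted above
    print(f"RT60 = {rt60} s: tail falls 30 dB after {time_to_decay(30, rt60):.2f} s")
```

Under these assumptions the echo tail at the longest setting stays audible roughly ten times longer than at the shortest one, which is why the same algorithm can behave very differently across the room's levels.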