
From mono to stereo to surround sound to three-dimensional (3D) sound, each step in the evolution of audio playback technology has aimed to reproduce real-world sound more faithfully. 3D audio uses signal processing to simulate the sound signals that reach each ear, reconstructing the sound field in three-dimensional space so that it is closer to what we hear in the real world. With this technology, manufacturers can create a more realistic, natural, and immersive listening experience for users in games, movies, music, and other scenarios, which in turn helps drive subscription growth.

Producing 3D audio the traditional way requires the original track material (such as separately recorded vocals or piano) and manual work in a professional digital audio workstation (DAW) with 3D mixing plug-ins. As a result, the production cycle is long, efficiency is low, costs are high, and the barrier to entry is high. In addition, developers usually do not have the original tracks of a song, which makes it very difficult for them to produce 3D audio with traditional methods. The HMS Core audio editing service (Audio Editor Kit) provides audio source separation (to obtain the tracks) and spatial audio rendering capabilities. Developers only need to supply a stereo input to quickly generate 3D audio content, improving the user's audio experience and making their products more competitive.


HMS Core Audio Editing Service 3D Audio Generation Diagram

Audio Separation Technology

Most of the audio we encounter today is stereo: all audio objects (such as the vocals, piano, and guitar in a piece of music) are already mixed into the left and right channels and cannot easily be separated, let alone rendered and placed at different spatial positions. Separating specific elements from a stereo mix is therefore a core technology for 3D conversion.

Huawei's algorithm team applied deep learning to a large body of music and combined it with traditional signal processing to achieve sound source separation. First, the short-time Fourier transform (STFT) converts the one-dimensional audio signal into a two-dimensional time-frequency spectrum. Next, this spectrum and the original one-dimensional time-domain signal are fed in as a dual-stream input, and a latent-space representation of the target instrument is obtained through multiple layers of residual encoding, learned from a large amount of training data. Finally, a series of transformation matrices restores that representation to the stereo signal of the target object.
The transformation matrices and network structure used in this process are Huawei's proprietary technology, designed around the characteristics of different musical instruments so that each instrument can be separated as completely and cleanly as possible, providing track material of sufficient quality for 3D production. The core capabilities involved include:

1. Audio signal feature extraction: extracting features directly from the time-domain signal through an encoder, and extracting time-frequency features from the time-domain signal through the short-time Fourier transform;

2. Deep learning model construction: adding residual modules and an attention mechanism to strengthen the modeling of each instrument's harmonics and of correlations across time;

3. Multi-channel Wiener filtering: combining traditional signal processing with deep learning, the model predicts the relationship between the power spectra of the target and non-target objects, and the filter coefficients are then constructed from this prediction and applied.
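To make the STFT and Wiener filtering steps above more concrete, here is a minimal sketch in plain Python. It is not Huawei's implementation: it uses a single-channel simplification of the Wiener filter, and it takes the power spectra from the known sources (an "oracle") where the real system would use the power spectra predicted by the deep learning model. The function names, frame length, and toy signals are all assumptions made for illustration.

```python
import cmath
import math


def hann(n):
    """Periodic Hann window of length n."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / n) for i in range(n)]


def stft(signal, frame_len=64, hop=32):
    """Naive STFT: a list of frames, each with frame_len // 2 + 1 complex bins."""
    window = hann(frame_len)
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        chunk = [signal[start + i] * window[i] for i in range(frame_len)]
        frames.append([
            sum(x * cmath.exp(-2j * math.pi * k * n / frame_len)
                for n, x in enumerate(chunk))
            for k in range(frame_len // 2 + 1)
        ])
    return frames


def power(spec):
    """Power spectrum |X|^2 for every time-frequency bin."""
    return [[abs(b) ** 2 for b in frame] for frame in spec]


def wiener_mask(target_power, other_power, eps=1e-12):
    """Single-channel Wiener gain per bin: target / (target + other)."""
    return [[t / (t + o + eps) for t, o in zip(t_row, o_row)]
            for t_row, o_row in zip(target_power, other_power)]


def apply_mask(spec, mask):
    """Scale each complex bin of the mixture by its real-valued gain."""
    return [[m * b for m, b in zip(m_row, b_row)]
            for m_row, b_row in zip(mask, spec)]


# Toy mixture: the "target" sits in frequency bin 4, the "other" source in bin 12.
N = 256
target = [math.sin(2 * math.pi * 4 * n / 64) for n in range(N)]
other = [math.sin(2 * math.pi * 12 * n / 64) for n in range(N)]
mixture = [a + b for a, b in zip(target, other)]

# Oracle power spectra stand in for the network's predictions (assumption).
mask = wiener_mask(power(stft(target)), power(stft(other)))
separated = apply_mask(stft(mixture), mask)
```

In this toy case the mask is close to 1 in the bins occupied by the target and close to 0 in the bins occupied by the other source, so the filtered spectrum keeps the target and suppresses the rest. An inverse STFT (not shown) would turn `separated` back into a time-domain track.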

Schematic diagram of audio separation technology

HMS Core currently offers 12 types of sound source separation (vocal, accompaniment, drums, violin, bass, piano, acoustic guitar, electric guitar, strings, vocals, accompaniment with accompaniment and orchestra), helping developers quickly extract the instruments they want for 3D editing.

Spatial Audio Rendering Technology

How can humans locate a sound source using only two ears? The sound travelling from the source reaches each ear with subtle differences in arrival time, received energy, and phase. These differences are captured in a set of transfer functions called head-related transfer functions (HRTFs). By applying HRTFs to a point sound source, we can simulate the direct sound arriving from a given direction in the real world. HRTFs vary from person to person because of physical differences such as head shape and shoulder width, so we analyzed a large amount of data to design a more general set of HRTFs that lets everyone enjoy 3D audio. In addition, to reproduce physical phenomena such as the reflection, scattering, and interference of sound in a space, we model a real room by superimposing a series of room impulse responses (RIRs), which produces what is known as reverberation. By filtering each sound source through a series of HRTFs and RIRs, we can therefore place the previously separated material in 3D space and form 3D music.
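The filtering chain described above (source, then HRTF, then RIR) can be sketched in a few lines of plain Python. This is only an illustration under loud assumptions: the "HRIR" here is a toy built from an interaural time difference (using the Woodworth spherical-head approximation) and an assumed level difference of up to 6 dB, and the "RIR" is a handful of hand-picked decaying reflections. Real systems, including the generalized HRTFs mentioned in the text, use measured or modeled filters; every name and constant below is hypothetical.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s
HEAD_RADIUS = 0.0875    # m, an assumed average head radius


def convolve(x, h):
    """Direct-form linear convolution of two sample lists."""
    y = [0.0] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        for j, hj in enumerate(h):
            y[i + j] += xi * hj
    return y


def toy_hrir_pair(azimuth_deg, sample_rate):
    """Toy (left, right) impulse responses; 0 = front, +90 = right, |azimuth| <= 90."""
    theta = math.radians(abs(azimuth_deg))
    # Woodworth approximation of the interaural time difference.
    itd = HEAD_RADIUS / SPEED_OF_SOUND * (theta + math.sin(theta))
    delay = round(itd * sample_rate)
    # Assumed level difference: the far ear is up to 6 dB quieter.
    far_gain = 10 ** (-6.0 * math.sin(theta) / 20)
    near = [1.0]
    far = [0.0] * delay + [far_gain]
    return (far, near) if azimuth_deg >= 0 else (near, far)


def toy_rir(sample_rate, delays_ms=(0.0, 11.0, 23.0, 37.0), decay=0.5):
    """Toy room response: direct path plus a few decaying reflections."""
    taps = [round(d * sample_rate / 1000) for d in delays_ms]
    rir = [0.0] * (max(taps) + 1)
    for n, tap in enumerate(taps):
        rir[tap] += decay ** n
    return rir


def render_binaural(mono, azimuth_deg, sample_rate):
    """Filter a mono source through the toy HRIR pair, then the toy RIR."""
    hrir_left, hrir_right = toy_hrir_pair(azimuth_deg, sample_rate)
    rir = toy_rir(sample_rate)
    left = convolve(convolve(mono, hrir_left), rir)
    right = convolve(convolve(mono, hrir_right), rir)
    # Pad the shorter channel so both have the same length.
    n = max(len(left), len(right))
    left += [0.0] * (n - len(left))
    right += [0.0] * (n - len(right))
    return left, right


# A single click placed directly to the listener's right.
left, right = render_binaural([1.0], 90.0, 48000)
```

For a source on the right, the click reaches the right ear first and at full level, while the left ear receives it slightly later and quieter; the later taps in both channels are the toy reflections. Rendering each separated track with its own azimuth and summing the results per ear gives a simple binaural mix.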

Schematic diagram of spatial audio rendering technology

The combination of sound source separation and spatial audio rendering provided by the HMS Core audio editing service is already used in the advanced sound effects of Huawei Music. Users can open the sound effects page in Huawei Music and select the "sound space" effect or the "pure vocals" effect in the advanced sound effects column to experience 3D audio for themselves.

The "sound space" and "pure vocals" sound effects in Huawei Music

The technologies described above come from Huawei's 2012 Laboratories and are made available to developers through the HMS Core audio editing service, bringing users a differentiated 3D audio experience in music and audio applications. For more information about the HMS Core audio editing service, please visit the HMS Core Audio Editing Service page on the Huawei Developer Alliance official website.

Learn more details>>

Visit the official website of Huawei Developer Alliance
Get development guidance documents
Huawei Mobile Services open-source repositories: GitHub, Gitee

Follow us to be the first to receive the latest HMS Core technical updates.

