How to implement variable-speed recording with the OpenHarmony audio module

Introduction

OpenAtom OpenHarmony (hereinafter "OpenHarmony") is an open source project incubated and operated by the Open Atom Open Source Foundation. It is an operating system for intelligent devices, designed for the era of all scenarios, full connectivity, and pervasive intelligence.
The multimedia subsystem is a core subsystem of OpenHarmony and provides multimedia functions such as camera, audio, and video. For audio recording, its audio module offers two sets of interfaces. The first is the AudioRecorder interface provided by ohos.multimedia.media: you set the path of the recording file directly, and the file is generated automatically when recording ends, so the code is relatively simple to write. The second is the AudioCapturer interface provided by ohos.multimedia.audio, which gives access to the PCM data during recording so that the data can be processed. Because AudioCapturer is more flexible for processing raw data, this article shows how to implement variable-speed recording with it.

Result

Audio is recorded through the AudioCapturer interface, and the PCM data is resampled during recording so that the sound plays back faster or slower.
First select recording acceleration or recording deceleration. Then tap the "Recording Start" button to start recording and the "Recording End" button to stop. Finally, tap "Play Start" to play the recorded audio, which is sped up or slowed down according to the setting.
The code has been uploaded to the SIG repository:
https://gitee.com/openharmony-sig/knowledge_demo_entainment/tree/master/FA/AudioChangeDemo

Directory Structure

Call process

1. Framework-layer call flow of start

2. Framework-layer call flow of read

Source code analysis

1. Start with the page layout, which is divided into four modules:
(1) Set the recording acceleration

<div style="text-color: aqua;background-color: yellow;margin-bottom: 20fp;">
    <text style="font-size: 30fp;">Set recording acceleration:</text>
</div>

<div class="container">
    <button class="first" type="capsule" onclick="set_5_4">1.25x</button>
    <button class="first" type="capsule" onclick="set_6_4">1.5x</button>
</div>

<div class="container">
    <button class="first" type="capsule" onclick="set_7_4">1.75x</button>
    <button class="first" type="capsule" onclick="set_8_4">2x</button>
</div>

(2) Set the recording deceleration

<div style="text-color: aqua;background-color: yellow;margin-bottom: 20fp;margin-top: 20fp;">
    <text style="font-size: 30fp;">Set recording deceleration:</text>
</div>

<div class="container">
    <button class="first" type="capsule" onclick="set_3_4">0.75x</button>
    <button class="first" type="capsule" onclick="set_2_4">0.5x</button>
</div>

(3) Recording

<div style="text-color: aqua;background-color: yellow;margin-bottom: 20fp;margin-top: 20fp;">
    <text style="font-size: 30fp;">Recording:</text>
</div>

<div class="container">
    <button class="first" type="capsule" onclick="record">Recording Start</button>
    <button class="first" type="capsule" onclick="recordstop">Recording End</button>
</div>

(4) Play

<div style="text-color: aqua;background-color: yellow;margin-bottom: 20fp;margin-top: 20fp;">
    <text style="font-size: 30fp;">Play:</text>
</div>

<div class="container">
    <button class="first" type="capsule" onclick="play">Play Start</button>
    <button class="first" type="capsule" onclick="playstop">Play End</button>
</div>

<div class="container">
    <video if="{{ display }}" id="{{ videoId }}"
           class="video"
           src="{{url}}"
           autoplay="{{ autoplay }}"
           controls="{{ controlShow }}"
           muted="false"
           onseeked="seeked"
           onprepared="prepared">
    </video>
</div>

2. The logic is implemented in JS:
(1) Start recording by calling the start method of the AudioCapturer. Once the capturer has started, query its buffer size with getBufferSize and begin reading the PCM data.

globalThis.capturer.start().then(function () {
    console.log("gyf start");
    // Ask the capturer for its buffer size, then start the read loop with it
    globalThis.capturer.getBufferSize((err, bufferSize) => {
        if (err) {
            console.error('gyf getBufferSize error');
        } else {
            console.log("gyf bufferSize = " + bufferSize);
            globalThis.getBuf(bufferSize);
        }
    });
});

(2) After the capturer has started, getBuf calls getData, which reads data through the read method of the AudioCapturer. Each successful read passes the data to handleBuffer, whose arrayBuffer parameter is the PCM data returned by read. handleBuffer applies the fast or slow playback processing, writes the result to the file, and then calls read again, so reading continues in a loop.

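// Call read repeatedly to keep reading data
handleBuffer(arrayBuffer) {
    console.log("gyf handleBuffer");

    let result = new Uint8Array(arrayBuffer);
    console.log("gyf handleBuffer ================== " + result);

    // up and down hold the selected speed setting; they are not declared in
    // this snippet and are assumed to be defined elsewhere in the page
    let outData = this.test(result, up, down);

    fileio.writeSync(globalThis.fd, outData.buffer);

    // The arrow function keeps `this` bound when the promise resolves
    globalThis.capturer.read(globalThis.bufSize, true).then((buf) => this.handleBuffer(buf));
},

getData(bufSize) {
    console.log("gyf getData");
    // The arrow function keeps `this` bound when the promise resolves
    globalThis.capturer.read(bufSize, true).then((buf) => this.handleBuffer(buf));
},

getBuf(bufSize) {
    console.log("gyf getBuf");
    this.getData(bufSize);
},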

(3) Fast or slow playback is achieved by combining the up and down methods. The down method stretches the PCM data by writing each sampling point down times, which inserts extra points between adjacent samples. The up method samples at intervals, keeping one point out of every up points.

up(data, up) {
    if (1 == up) {
        return data;
    }
    let length = data.byteLength;
    let upLength = Math.round(length / up);
    let upData = new Uint8Array(upLength);
    // Keep one sample out of every `up` samples
    for (let i = 0, j = 0; i < length && j < upLength; i += up, j++) {
        upData[j] = data[i];
    }
    return upData;
},

down(data, down) {
    if (1 == down) {
        return data;
    }

    let length = data.byteLength;
    let downLength = Math.round(length * down);
    let downData = new Uint8Array(downLength);
    // Write every sample `down` times; the loop runs to `length` so that the
    // last sample is stretched too instead of leaving zeros at the end
    for (let i = 0, j = 0; i < length; i++) {
        for (let k = 0; k < down; k++) {
            downData[j] = data[i];
            j++;
        }
    }
    return downData;
},
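
To make the two steps concrete, here is a minimal stand-alone sketch that runs outside the app. The function names repeatSamples and keepEveryNth and the ten-sample buffer are made up for illustration; they mirror the down and up methods above rather than reproduce the demo's code.

```javascript
// Hypothetical stand-alone versions of the two steps (names are illustrative).

// Stretch: write every sample `factor` times (what the down method does).
function repeatSamples(data, factor) {
    const out = new Uint8Array(data.length * factor);
    for (let i = 0, j = 0; i < data.length; i++) {
        for (let k = 0; k < factor; k++) {
            out[j++] = data[i];
        }
    }
    return out;
}

// Shrink: keep one sample out of every `step` samples (what the up method does).
function keepEveryNth(data, step) {
    const out = new Uint8Array(Math.round(data.length / step));
    for (let i = 0, j = 0; i < data.length && j < out.length; i += step, j++) {
        out[j] = data[i];
    }
    return out;
}

// Ten made-up 8-bit samples.
const pcm = Uint8Array.from([10, 20, 30, 40, 50, 60, 70, 80, 90, 100]);

const stretched = repeatSamples(pcm, 4);   // 40 samples
const faster = keepEveryNth(stretched, 5); // 8 samples

console.log(Array.from(faster)); // [10, 20, 30, 40, 60, 70, 80, 90]
```

Ten input samples become eight output samples, so at an unchanged sampling rate the clip plays in 8/10 of the time, that is, 1.25 times as fast. Because samples are simply duplicated or dropped without any filtering, the pitch changes together with the speed.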

(4) Combine the down and up methods to get playback at 1.25x, 1.5x, 1.75x, 2x, 0.75x, and 0.5x speed.

test(data, up, down) {
    // Stretch first, then sample at intervals
    let downData = this.down(data, down);
    let upData = this.up(downData, up);
    return upData;
},
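
The handler names in the layout (set_5_4, set_6_4, and so on) suggest that each button stores an up/down pair such as up = 5, down = 4. That reading is an assumption, since the handlers themselves are not shown here. Under it, the output has down/up times as many samples as the input, so playback is up/down times as fast. The sketch below uses two hypothetical helpers, speedOf and resample, to tabulate the six presets and to do the same resampling in a single pass for whole-number up and down values, without the intermediate stretched buffer.

```javascript
// Assumed presets, inferred from handler names such as set_5_4 (up = 5, down = 4).
const presets = [
    { up: 5, down: 4 }, { up: 6, down: 4 }, { up: 7, down: 4 },
    { up: 8, down: 4 }, { up: 3, down: 4 }, { up: 2, down: 4 },
];

// Playback speed relative to the original recording.
function speedOf(preset) {
    return preset.up / preset.down;
}

// Single-pass equivalent of "repeat each sample `down` times, then keep
// every `up`-th sample": output sample j is input sample floor(j * up / down).
function resample(data, up, down) {
    const out = new Uint8Array(Math.round(data.length * down / up));
    for (let j = 0; j < out.length; j++) {
        out[j] = data[Math.floor(j * up / down)];
    }
    return out;
}

for (const p of presets) {
    console.log(`up=${p.up} down=${p.down} -> ${speedOf(p)}x`);
}

// Ten made-up 8-bit samples.
const pcm = Uint8Array.from([10, 20, 30, 40, 50, 60, 70, 80, 90, 100]);
const slower = resample(pcm, 2, 4); // 0.5x: every sample appears twice
console.log(slower.length);         // 20
```

Avoiding the intermediate buffer matters more for the slow-down presets, where the stretched copy is several times the size of each captured block.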

(5) The recording is played back as a WAV file, but the capturer delivers raw PCM data, so WAV header information matching the capture parameters must be added in front of the PCM data. These parameters, such as the sampling rate, the number of channels, and the sampling format, are set when the AudioCapturer instance is created.

// Initialize audio capture
var audioStreamInfo = {
    samplingRate: audio.AudioSamplingRate.SAMPLE_RATE_8000,
    channels: audio.AudioChannel.CHANNEL_1,
    sampleFormat: audio.AudioSampleFormat.SAMPLE_FORMAT_U8,
    encodingType: audio.AudioEncodingType.ENCODING_TYPE_RAW
}

var audioCapturerInfo = {
    source: audio.SourceType.SOURCE_TYPE_MIC,
    capturerFlags: 1
}

var audioCapturerOptions = {
    streamInfo: audioStreamInfo,
    capturerInfo: audioCapturerInfo
}
let that = this;

audio.createAudioCapturer(audioCapturerOptions,(err, data) => {
    if (err) {
        console.error(`gyf AudioCapturer Created : Error: ${err.message}`);
    }
    else {
        console.info('gyf AudioCapturer Created : Success : SUCCESS');
        that.capturer = data;
    }
});
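
The values that go into the WAV header follow directly from these capture settings. As a rough sketch (the function names wavFormatFields and durationSeconds and their plain-number arguments are illustrative, not part of the audio API), the derived values can be computed like this:

```javascript
// Hypothetical helper: derive the fmt-chunk values from the capture settings.
function wavFormatFields(sampleRate, channelCount, sampleBits) {
    const blockAlign = channelCount * sampleBits / 8; // bytes per sample frame
    const byteRate = sampleRate * blockAlign;         // bytes per second
    return { sampleRate, channelCount, sampleBits, blockAlign, byteRate };
}

// Settings used above: 8000 Hz, one channel, unsigned 8-bit samples.
const fields = wavFormatFields(8000, 1, 8);
console.log(fields.blockAlign, fields.byteRate); // 1 8000

// Duration in seconds of a given number of PCM bytes.
function durationSeconds(dataLen, byteRate) {
    return dataLen / byteRate;
}
console.log(durationSeconds(8000000, fields.byteRate)); // 1000
```

With these settings one second of audio is exactly 8000 bytes, which is why a data length of 8000000 bytes corresponds to 1000 seconds.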

(6) A file header built from these parameters must be written at the start of the WAV file. The header is generally 44 bytes long and holds the information of three chunks (RIFF chunk, fmt chunk, data chunk). For details, see the following introduction to the WAV file format:
( http://www-mmsp.ece.mcgill.ca/Documents/AudioFormats/WAVE/WAVE.html )

// Assume 1000 seconds of data (8000 bytes per second * 1000)
encodeWAV() {
    var dataLen = 8000000;
    var sampleRate = 8000;
    var sampleBits = 8;
    var buffer = new ArrayBuffer(44);
    var data = new DataView(buffer);

    var channelCount = 1;   // mono
    var offset = 0;

    // RIFF chunk identifier
    this.writeString(data, offset, 'RIFF'); offset += 4;
    // Number of bytes from the next field to the end of the file, i.e. file size - 8
    data.setUint32(offset, 36 + dataLen, true); offset += 4;
    // WAVE format tag
    this.writeString(data, offset, 'WAVE'); offset += 4;
    // fmt chunk identifier
    this.writeString(data, offset, 'fmt '); offset += 4;
    // Size of the fmt chunk body, 0x10 = 16 for PCM
    data.setUint32(offset, 16, true); offset += 4;
    // Audio format: 1 = PCM sample data
    data.setUint16(offset, 1, true); offset += 2;
    // Number of channels
    data.setUint16(offset, channelCount, true); offset += 2;
    // Sample rate: samples per second for each channel
    data.setUint32(offset, sampleRate, true); offset += 4;
    // Byte rate (average bytes per second) = channels * sample rate * bits per sample / 8
    data.setUint32(offset, channelCount * sampleRate * (sampleBits / 8), true); offset += 4;
    // Block align: bytes per sample frame = channels * bits per sample / 8
    data.setUint16(offset, channelCount * (sampleBits / 8), true); offset += 2;
    // Bits per sample
    data.setUint16(offset, sampleBits, true); offset += 2;
    // data chunk identifier
    this.writeString(data, offset, 'data'); offset += 4;
    // Size of the sample data in bytes, i.e. file size - 44
    data.setUint32(offset, dataLen, true); offset += 4;

    return data;
},
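
encodeWAV relies on a writeString helper that is not shown in this article. A minimal sketch of what such a helper might look like is given below, together with a hypothetical patchSizes function that overwrites the two size fields once the real amount of recorded data is known (the hard-coded 8000000 is only a placeholder for 1000 seconds). Both functions, and the readString helper used to check the result, are assumptions for illustration, not the demo's actual code.

```javascript
// Assumed implementation: write an ASCII string byte by byte into a DataView.
function writeString(view, offset, text) {
    for (let i = 0; i < text.length; i++) {
        view.setUint8(offset + i, text.charCodeAt(i));
    }
}

// Read `length` ASCII bytes back, to check the header.
function readString(view, offset, length) {
    let text = '';
    for (let i = 0; i < length; i++) {
        text += String.fromCharCode(view.getUint8(offset + i));
    }
    return text;
}

// Hypothetical: set both size fields of a 44-byte header from the real data length.
function patchSizes(view, dataLen) {
    view.setUint32(4, 36 + dataLen, true); // RIFF chunk size = file size - 8
    view.setUint32(40, dataLen, true);     // data chunk size = file size - 44
}

const header = new DataView(new ArrayBuffer(44));
writeString(header, 0, 'RIFF');
writeString(header, 8, 'WAVE');
writeString(header, 36, 'data');
patchSizes(header, 16000); // e.g. two seconds at 8000 bytes per second

console.log(readString(header, 0, 4));   // RIFF
console.log(header.getUint32(4, true));  // 16036
console.log(header.getUint32(40, true)); // 16000
```

Setting the sizes from the actual data length matters for variable-speed recording in particular, because the number of bytes written depends on the selected speed.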

Summary

This article showed how to record audio with the AudioCapturer interface of the OpenHarmony audio module. AudioCapturer is flexible in handling raw data: the captured data can be resampled by interpolation or decimation, and the processed audio saved to a local file. Because the local file uses the WAV format, header information must be written to the file before the data. The header values are derived from the parameters set when the AudioCapturer was created, which keeps the header accurate. Finally, the audio is played through the video component at the application layer.
I hope this article gives developers some ideas for other scenarios, such as using the captured data for speech recognition or speech transcription, and that what you build in practice contributes to the growth of the OpenHarmony ecosystem.
