One article teaches you how to quickly implement sound detection

Some application projects need sound detection, that is, recognizing sounds such as knocking, doorbells, and car horns. For small and medium-sized developers, building this capability from scratch is time-consuming. With the sound detection service SDK in Huawei's machine learning service, it can be implemented on the device side with a simple integration.

1. Introduction to the Huawei sound detection service:

The sound detection service detects sound events in online (real-time recording) mode. Based on the detected sound events, developers can trigger follow-up actions. Currently 13 types of sound events are supported: laughter, baby or child crying, snoring, sneezing, shouting, cat meowing, dog barking, running water (including running water from taps, streams, and waves), car horns, doorbells, door knocks, fire alarm sounds (including fire alarm sirens and smoke alarm sirens), and sirens (including fire truck sirens, ambulance sirens, police car sirens, and air defense sirens).

2. Integration preparation:

Development environment configuration

1. Create an application on the Huawei Developer Alliance website:

For details of this step, please refer to the link below:

https://developer.huawei.com/consumer/cn/doc/development/AppGallery-connect-Guides/agc-get-started#createproject?ha__source=hms1

2. Enable the machine learning service:

The specific steps for enabling it can be viewed at the link below:

https://developer.huawei.com/consumer/cn/doc/development/HMSCore-Guides-V5/enable-service-0000001050038078-V5?ha__source=hms1

3. After the application is created, the agconnect-services.json file is generated automatically. Copy it manually to the root directory of the app module.

4. Configure the Maven repository address of the HMS Core SDK.

For the configuration of the Maven repository, see the link below:

https://developer.huawei.com/consumer/cn/doc/development/HMSCore-Guides/config-maven-0000001050040031?ha__source=hms1

5. Integrate the sound detection service SDK

It is recommended to integrate the Full SDK. Configure the corresponding dependencies in the build.gradle file:

// Import the sound detection SDK and model packages
implementation 'com.huawei.hms:ml-speech-semantics-sounddect-sdk:2.1.0.300'
implementation 'com.huawei.hms:ml-speech-semantics-sounddect-model:2.1.0.300'

Depending on your project setup, declare the AGC plugin in one of the following two ways:

apply plugin: 'com.android.application'
apply plugin: 'com.huawei.agconnect'

or

plugins {
    id 'com.android.application'
    id 'com.huawei.agconnect'
}

Automatically update machine learning models

Add the following declaration to the AndroidManifest.xml file. After a user installs your application from HUAWEI AppGallery, the machine learning model is automatically updated on the device:

<meta-data
    android:name="com.huawei.hms.ml.DEPENDENCY"
    android:value="sounddect" />

More detailed steps can be viewed through the link below:

https://developer.huawei.com/consumer/cn/doc/development/HMSCore-Guides/sound-detection-sdk-0000001055602754?ha__source=hms1

3. Application development and coding stage

1. Obtain the microphone permission. Without it, the service reports error code 12203.

Set static permissions (required)

<uses-permission android:name="android.permission.RECORD_AUDIO" />

Dynamic permission acquisition (required)

ActivityCompat.requestPermissions(
        this, new String[]{Manifest.permission.RECORD_AUDIO}, 1);
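The runtime request is asynchronous: Android reports the outcome to the activity's onRequestPermissionsResult(int requestCode, String[] permissions, int[] grantResults) callback. The following is a minimal sketch of the check you might perform there, written as plain Java so it can run outside Android. The class name MicPermissionResult and its method are hypothetical helpers, not part of the HMS SDK; the constant mirrors PackageManager.PERMISSION_GRANTED, and the request code matches the value 1 used in the request.

```java
/** Hypothetical helper (not part of the HMS SDK) for checking the permission result. */
final class MicPermissionResult {
    // Mirrors android.content.pm.PackageManager.PERMISSION_GRANTED, which is 0.
    static final int PERMISSION_GRANTED = 0;
    // The request code passed to ActivityCompat.requestPermissions in this article.
    static final int REQUEST_RECORD_AUDIO = 1;

    private MicPermissionResult() {
    }

    /**
     * Returns true only if the result belongs to the microphone request
     * and its first (and only) entry is "granted".
     * An empty array means the request was cancelled, so it counts as denied.
     */
    static boolean isMicrophoneGranted(int requestCode, int[] grantResults) {
        return requestCode == REQUEST_RECORD_AUDIO
                && grantResults != null
                && grantResults.length > 0
                && grantResults[0] == PERMISSION_GRANTED;
    }
}
```

In an activity, you would presumably call MicPermissionResult.isMicrophoneGranted(requestCode, grantResults) inside onRequestPermissionsResult and only start detection when it returns true, since starting without the permission leads to error 12203.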

2. Create an MLSoundDector object

private static final String TAG = "MLSoundDectorDemo";

// Sound detector instance
private MLSoundDector mlSoundDector;

// Create an MLSoundDector object and set the result callback
private void initMLSoundDector() {
    mlSoundDector = MLSoundDector.createSoundDector();
    mlSoundDector.setSoundDectListener(listener);
}

The sound detection result callback receives the detection result. It is passed to the sound detector instance through setSoundDectListener.

// Create the sound detection result callback that receives the detection result.
private MLSoundDectListener listener = new MLSoundDectListener() {
    @Override
    public void onSoundSuccessResult(Bundle result) {
        // Handle a successful detection. The result is a value from 0 to 12, corresponding to the
        // 13 sound types defined in MLSoundDectConstants.java with names starting with SOUND_EVENT_TYPE.
        int soundType = result.getInt(MLSoundDector.RESULTS_RECOGNIZED);
        Log.d(TAG, "Sound detection succeeded: " + soundType);
    }

    @Override
    public void onSoundFailResult(int errCode) {
        // Detection failed, for example because the microphone permission
        // (Manifest.permission.RECORD_AUDIO) was not granted.
        Log.d(TAG, "Sound detection failed: " + errCode);
    }
};

This code only prints the int value of the detection result. In a real application, the int value can be converted into a label that users can understand.

Definition of the sound types:

<string-array name="sound_dect_voice_type">
    <item>Laughter</item>
    <item>Baby or child crying</item>
    <item>Snoring</item>
    <item>Sneezing</item>
    <item>Shouting</item>
    <item>Cat meowing</item>
    <item>Dog barking</item>
    <item>Running water</item>
    <item>Car horn</item>
    <item>Doorbell</item>
    <item>Door knock</item>
    <item>Fire alarm</item>
    <item>Siren</item>
</string-array>
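The conversion from the int result to a readable label can be sketched in plain Java as follows. This is only an illustration under one assumption: that the result value (0 to 12) indexes the sound types in the order of the array above, which is consistent with the door knock example (value 10) in the test section. The class name SoundTypeLabels is hypothetical, and the labels are English equivalents of the array items.

```java
/** Hypothetical helper that maps a detection result (0-12) to a readable label. */
final class SoundTypeLabels {
    // Same order as the sound_dect_voice_type string array (assumed to match the SDK values).
    private static final String[] LABELS = {
        "Laughter",
        "Baby or child crying",
        "Snoring",
        "Sneezing",
        "Shouting",
        "Cat meowing",
        "Dog barking",
        "Running water",
        "Car horn",
        "Doorbell",
        "Door knock",
        "Fire alarm",
        "Siren"
    };

    private SoundTypeLabels() {
    }

    /** Returns the label for a sound type, or a fallback text for values outside 0-12. */
    static String labelFor(int soundType) {
        if (soundType < 0 || soundType >= LABELS.length) {
            return "Unknown sound type (" + soundType + ")";
        }
        return LABELS[soundType];
    }
}
```

In an Android app, the labels would more likely be loaded from the resource with getResources().getStringArray(R.array.sound_dect_voice_type) and indexed in the same way, with the same bounds check.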

3. Start and stop sound detection

@Override
public void onClick(View v) {
    switch (v.getId()) {
        case R.id.btn_start_detect:
            if (mlSoundDector != null) {
                // Pass the context (here, the activity) to start().
                boolean isStarted = mlSoundDector.start(this);
                // isStarted is true if detection started successfully and false if it failed
                // (for example, because the microphone is occupied by the system or another app).
                if (isStarted) {
                    Toast.makeText(this, "Sound detection started", Toast.LENGTH_SHORT).show();
                }
            }
            break;

        case R.id.btn_stop_detect:
            if (mlSoundDector != null) {
                mlSoundDector.stop();
            }
            break;
    }
}
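Because start() can return false, an app may want to track whether detection is actually running before calling stop() or starting again. The following plain-Java sketch shows one way to do that. The Detector interface is a hypothetical stand-in for the two MLSoundDector calls used above (it is not an SDK type), and the guard against repeated starts is a design choice of this sketch, not something the SDK is known to require.

```java
/** Hypothetical wrapper that remembers whether detection is currently running. */
final class DetectionToggle {

    /** Stand-in for the detector calls used in this article; not an SDK type. */
    interface Detector {
        boolean start();

        void stop();
    }

    private final Detector detector;
    private boolean running;

    DetectionToggle(Detector detector) {
        this.detector = detector;
    }

    /** Starts detection unless it is already running; returns whether it is running afterwards. */
    boolean start() {
        if (!running) {
            // A false return value means the start failed, e.g. the microphone is occupied.
            running = detector.start();
        }
        return running;
    }

    /** Stops detection only if it was started successfully before. */
    void stop() {
        if (running) {
            detector.stop();
            running = false;
        }
    }

    boolean isRunning() {
        return running;
    }
}
```

In the activity, the Detector could be implemented by delegating to mlSoundDector.start(context) and mlSoundDector.stop(), and the click handler would then call the wrapper instead of the detector directly.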

4. When the page is closed, you can call the destroy() method to release resources

@Override
protected void onDestroy() {
    super.onDestroy();
    if (mlSoundDector != null) {
        mlSoundDector.destroy();
    }
}

4. Run the test

Take a knock on the door as an example: the expected sound type in the detection result is 10.
Tap the button that starts sound detection and simulate a knock on the door. The detection result is then logged in the Android Studio console, which indicates that the integration is successful.

5. Other

The sound detection service is one small module of Huawei's machine learning service, which includes 6 modules: text, speech and language, image, face and body, natural language processing, and custom models.
This article only covers the sound detection service in the speech and language module.
If you are interested in the other modules of Huawei's machine learning service, see the integration documents provided by Huawei at the following address:

https://developer.huawei.com/consumer/cn/doc/development/HMSCore-Guides-V5/service-introduction-0000001050040017-V5?ha__source=hms1


Original link: https://developer.huawei.com/consumer/cn/forum/topic/0202580471954390028?fid=18
Original Author: Pepper
