
VSCode voice annotations to enrich the information (Part 2)

Foreword

In the previous article we completed the most basic capability, "recognizing voice annotations". In this article we will develop the related features, such as voice playback.

1. Obtaining an audio file on a Mac

To develop the audio playback feature we first need an audio file. Most mp3 downloads on the Internet require registration, so the simplest solution is to generate one with the computer's built-in recording app.

Here is a demonstration of recording on a Mac:

Step 1: Find the software:

image.png

Step 2: Share the recorded audio to an app

image.png

image.png

image.png

This produces an m4a file (we can manually change the file extension):

"m4a is the extension of the MPEG-4 audio standard file. Like the familiar mp3, it is also an audio format file. Apple uses this name to distinguish mpeg4 videos."

2. Choosing a package to play audio

Playback here means playing the audio when the mouse hovers over the annotation, so it cannot be played in the web sense: there is no page where we could use an audio tag. VSCode is developed on top of Electron, so plugins run in a Node environment, which means we can use Node to send the audio stream to the audio output device.

There are not many packages that play audio with Node. Here are two of them: play.js & node-wav-player

  1. play.js: GitHub address
  2. node-wav-player: GitHub address

play.js has a flaw: it cannot pause playback. That may be unbearable for developers, so I finally chose node-wav-player. Don't be fooled by the name: it is not only a wav player, it can play mp3 too.

Install it first:

yarn add node-wav-player

Use play (an absolute path is hard-coded here temporarily):

Inside the hover.ts file, add:

import * as vscode from 'vscode';
import * as player from 'node-wav-player';
import { getVoiceAnnotationDirPath, targetName, testTargetReg } from './util'
let stopFn: (() => void) | undefined;

function playVoice(id: string) {
    const voiceAnnotationDirPath = getVoiceAnnotationDirPath()
    if (voiceAnnotationDirPath) {
        // The path is hard-coded for now; later it will be derived
        // from voiceAnnotationDirPath and the annotation id.
        player.play({
            path: `/xxx/xxxx/xx.mp3`
        }).catch(() => {
            vscode.window.showErrorMessage('Playback failed')
        })
        stopFn = () => {
            player.stop()
        }
    }
}

export default vscode.languages.registerHoverProvider("*", {
    provideHover(document: vscode.TextDocument, position: vscode.Position) {
        // Stop any audio still playing from a previous hover.
        stopFn?.()
        const word = document.getText(document.getWordRangeAtPosition(position));
        const testTargetRes = testTargetReg.exec(word);
        if (testTargetRes) {
            playVoice(testTargetRes[1])
            return new vscode.Hover('Playing ...')
        }
    }
})

The path passed to player.play is hard-coded for now. You will find that the audio already plays normally; but if you think the playback feature is therefore done, that is a big mistake.

3. The core principle of node-wav-player

The code of node-wav-player is very simple, much more concise than I imagined. Below is what it looks like after I simplified it; isn't it much more refreshing?

The public play method is only responsible for organizing the data; the real playback is done by the _play method:

image.png

_play method

image.png
Node's child_process.spawn is used to start a new child process, and it is this child process that actually plays the audio. The first parameter is the command, and the second is an array of arguments passed to the command (here, the location of the audio file).

Let me demonstrate how to use spawn :
image.png

For example, running afplay with an audio file path plays the sound on a Mac, as sketched below.
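As a minimal sketch (my own demo, not the library's code), spawning afplay from Node looks like this; the file path is a placeholder:

import { spawn } from 'child_process';

// Start a child process that plays the file; on macOS the built-in
// afplay command can play mp3/m4a files directly.
const proc = spawn('afplay', ['/xxx/xxxx/xx.mp3']);

proc.on('close', (code) => {
    // Exit code 0 means the audio finished playing normally.
    console.log(`playback process exited with code ${code}`);
});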

Listening for errors

image.png

Unless the exit code is 0, or this._called_stop === true because stop was called manually, the error "playback failed" is reported. If no error occurs within 500 milliseconds, removeAllListeners("close") removes the close listener, as sketched below.
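Put together, my simplified reading of the library looks roughly like this. It is a sketch, not the real source: the class name SimplePlayer is mine, and I assume afplay as the player command on a Mac; the _proc and _called_stop field names follow the screenshots above.

import { spawn, ChildProcess } from 'child_process';

class SimplePlayer {
    private _proc?: ChildProcess;
    private _called_stop = false;

    _play(path: string): Promise<void> {
        this._called_stop = false;
        return new Promise((resolve, reject) => {
            this._proc = spawn('afplay', [path]);
            // Reject unless the process exited normally (code 0)
            // or we stopped it on purpose via stop().
            this._proc.on('close', (code) => {
                if (code === 0 || this._called_stop === true) {
                    resolve();
                } else {
                    reject(new Error('playback failed'));
                }
            });
            // No error within 500 ms: treat playback as successfully
            // started and remove the close listener. (Extra resolve
            // calls on an already-settled promise are no-ops.)
            setTimeout(() => {
                this._proc?.removeAllListeners('close');
                resolve();
            }, 500);
        });
    }
}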

How to terminate playback

In the stop method, the child process is simply killed:
image.png
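Continuing the SimplePlayer sketch above, stop could look like this:

// Inside the SimplePlayer sketch above:
stop(): void {
    // Record that stopping is intentional so the close listener
    // does not report "playback failed".
    this._called_stop = true;
    this._proc?.kill();
}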

4. Where to record audio?

The whole point of this plugin's experience is to be convenient and fast, so the recording flow must be short: ideally the user can record with one key and generate a voice annotation with one key.

My initial idea was to do as much as possible inside vscode, that is, not open a new h5 page, so that users never leave vscode.

Recording audio is not like playing audio: recording involves replaying what was just recorded, saving the audio file, and operations such as start, pause and end, so it is better to have an operation interface instead of relying on Node to fight alone.

5. Webview

Create webview

vscode provides webview capability internally. It was love at first sight when I discovered it. We can use the following code to add a webview page:

  const panel = vscode.window.createWebviewPanel(
    "viewTypeXxx",          // panel type (placeholder)
    "titleXxx",             // tab title (placeholder)
    vscode.ViewColumn.One,
    {}
  );

image.png

Defining content

We need the panel.webview.html property, which works like innerHTML:

  const panel = vscode.window.createWebviewPanel(
    "viewTypeXxx",          // panel type (placeholder)
    "titleXxx",             // tab title (placeholder)
    vscode.ViewColumn.One,
    {}
  );
  panel.webview.html = `<div>123</div>`;

image.png

Limitations

I read the official documentation and also checked the ts type files, but unfortunately I could not find any method for microphone authorization, so I could not use an audio tag to capture the user's voice and had to choose another way to implement it.

6. Right-click to record

I looked at some music-playing software and found that almost all of them implement playback by opening an h5 page, so our recording feature can try the same route: use Node to start a local service at http://localhost:8830/. This address returns a segment of html to the user, and that page is where recording happens.

Defining the right-click menu entry

In the package.json file:

  "contributes": {
    "menus": {
      "editor/context": [
        {
          "when": "editorFocus",
          "command": "vn.recording",
          "group": "navigation"
        }
      ]
    },
    "commands": [
      {
        "command": "vn.recording",
        "title": "此工程内录制语音注释"
      }
    ]
}
  1. editor/context defines the contents of the right-click context menu.
  2. when defines the state in which this command is active; here it is active when the editor has focus.
  3. command defines the command name.
  4. title is the name displayed in the menu.

image.png

Opening the h5 page

Add the navigation module in extension.ts:

import * as vscode from 'vscode';
import hover from './hover';
import initVoiceAnnotationStyle from './initVoiceAnnotationStyle';
import navigation from './navigation' // newly added

export function activate(context: vscode.ExtensionContext) {
    initVoiceAnnotationStyle()
    context.subscriptions.push(hover);
    context.subscriptions.push(navigation); // newly added
    context.subscriptions.push(
        vscode.window.onDidChangeActiveTextEditor(() => {
            initVoiceAnnotationStyle()
        })
    )
}

export function deactivate() { }

The navigation.ts file is responsible for starting the service and opening the browser to the corresponding page:

First install the open package:

yarn add open

Then:

import * as vscode from 'vscode';
import * as open from 'open';
import server from './server';
import { serverProt, getVoiceAnnotationDirPath } from './util';
import { Server } from 'http';

let serverObj: Server;
export default vscode.commands.registerCommand("vn.recording", function () {
    const voiceAnnotationDirPath = getVoiceAnnotationDirPath()
    if (voiceAnnotationDirPath) {
        // Start the local server only once and reuse it afterwards.
        if (!serverObj) {
            serverObj = server()
        }
        // Open the recording page in the default browser.
        open(`http://127.0.0.1:${serverProt()}`);
    }
})
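The helpers imported from ./util (serverProt, getVoiceID, targetName, testTargetReg, getVoiceAnnotationDirPath) were built in the previous article and are not shown here. For completeness, below is only my hypothetical sketch of their shape so the snippets hang together; the bodies are assumptions:

// util.ts: a hypothetical sketch; only the signatures matter here.
import * as vscode from 'vscode';

// The annotation prefix recognized in comments.
export const targetName = 'voice_annotation';

// Matches annotations like: voice_annotation_20220220153713111
export const testTargetReg = new RegExp(`${targetName}_(\\d+)`);

// The port the local recording server listens on.
export function serverProt(): number {
    return 8830;
}

// A timestamp-based id for a newly recorded voice file.
export function getVoiceID(): string {
    return new Date().toISOString().replace(/[-T:.Z]/g, '');
}

// The directory where voice files live; assumed to be derived from
// the current workspace in the real implementation.
export function getVoiceAnnotationDirPath(): string | undefined {
    return vscode.workspace.workspaceFolders?.[0]?.uri.fsPath;
}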
Starting the server

Because our plugin should stay as small as possible, we naturally won't pull in any framework here; we just hand-roll it with the native modules:

Create a new server.ts file:

import * as fs from 'fs';
import * as http from 'http';
import * as path from 'path';
import * as url from 'url';
import { targetName, getVoiceID } from './util';

export default function () {
    const server = http.createServer(function (
        req: http.IncomingMessage, res: http.ServerResponse) {
        // For now just prove the server is alive.
        res.write("123")
        res.end()
    }).listen(8830)

    return server
}

7. Returning the page and defining the api

Just starting the server is not enough; now let's define the interface capabilities, in server.ts:

import * as fs from 'fs';
import * as http from 'http';
import * as path from 'path';
import * as url from 'url';
import { serverProt, targetName, getVoiceID } from './util';

const temp = fs.readFileSync(
    path.join(__dirname, "./index.html")
)

export default function () {
    const server = http.createServer(function (req: http.IncomingMessage, res: http.ServerResponse) {
        if (req.method === "POST" && req.url === "/create_voice") {
            // Receive and save the uploaded audio.
            createVoice(req, res)
        } else {
            // Any other request gets the recording page itself.
            res.writeHead(200, {
                "content-type": 'text/html;charset=utf-8'
            })
            res.write(temp)
            res.end()
        }
    }).listen(serverProt())

    return server
}
  1. The src/html/index.html file is the h5 page used for recording.
  2. We define the upload as a POST request, with the request address /create_voice.
createVoice method

This method receives an audio file and saves it to a user-specified location:


function createVoice(req: http.IncomingMessage, res: http.ServerResponse) {
    const data: Uint8Array[] = [];
    req.on("data", (chunk: Uint8Array) => {
        data.push(chunk)
    })
    req.on("end", () => {
        // Reassemble the uploaded chunks into one buffer.
        const buffer = Buffer.concat(data);
        const voiceId = getVoiceID()
        try {
            fs.writeFileSync(`<the location to save the audio>`,
                buffer,
            )
        } catch (error) {
            res.writeHead(200)
            res.end()
            return
        }
        // Hand the generated annotation string back to the page.
        res.writeHead(200)
        res.end(JSON.stringify({ voiceId: `// ${targetName}_${voiceId}` }))
    })
}
  1. Because the front end transmits the audio file as formData, it has to be received in this way.
  2. The final generated audio annotation string, such as // voice_annotation_20220220153713111, is returned to the front end so that it can be placed directly onto the user's clipboard.
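For reference, the front-end upload could look roughly like this. It is a hypothetical sketch (the real recording page comes in the next article); uploadVoice and the field name voice are my own names:

// Hypothetical front-end snippet for the recording page: upload the
// recorded audio as formData and copy the returned annotation.
async function uploadVoice(audioBlob: Blob): Promise<void> {
    const formData = new FormData();
    formData.append('voice', audioBlob, 'voice.mp3');

    const res = await fetch('/create_voice', {
        method: 'POST',
        body: formData,
    });
    const { voiceId } = await res.json();

    // voiceId looks like: // voice_annotation_20220220153713111
    await navigator.clipboard.writeText(voiceId);
}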

End

Next up is recording and uploading the audio (which involves WebRTC knowledge), defining the path where audio files are stored, and publishing the vscode plugin. That's it for this article; I hope to keep making progress together with you.


lulu_up