VSCode voice annotations make information richer (Part 2)
Foreword
This final article of the series mainly covers recording audio & storing the audio files. Back then, a bug in the recording left me with no appetite for a week (voice-annotation).
1. MP3 file storage location
"Voice Notes" usage scenarios
- Use "Voice Notes" for individual items.
- Multiple projects use Voice Notes.
- The
mp3
files generated by "Voice Notes" are placed in their own projects. - The
mp3
files generated by "Voice Notes" are uniformly stored somewhere in the world. - A part of
mp3
generated by "Voice Notes" exists in the project and a part uses the global path.
vscode workspace
Where the audio is stored has to be read from the user's configuration, but if the user only configures a single global path, that path cannot satisfy the scenario where each project stores its audio files in a different location. This is where the concept of the vscode workspace comes in.
If each of our projects has different eslint rules, configuring the eslint rules only globally cannot satisfy this scenario. In that case we need to create a .vscode folder in the project and create a settings.json file inside that folder; the configuration written there is the personalized configuration for the current project.
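For example, a minimal .vscode/settings.json might look like the following; the eslint key is only an illustration of a per-project override, and the voiceAnnotation block is the plugin configuration introduced later:

{
    // settings written here apply only to the current project
    "eslint.enable": true,
    "voiceAnnotation": {
        "dirPath": "../mp3"
    }
}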
Configure the workspace (absolute path or relative path)
Although I understood the concept of the workspace, it still didn't solve the actual problem. For example, if we configure the audio files' absolute path in the workspace, the .vscode > settings.json file gets committed to the code repository, so everyone pulls the same configuration. But each developer's operating system may be different, and the folder where the project lives differs too, so defining an absolute path in the workspace cannot solve the problem of team collaboration.
If the user configures a relative path instead, and that path is relative to the settings.json file itself, the question becomes: how do we know where the settings.json file is? The vscode plugin API can read the workspace's configuration values, but it could not give me the location of the settings.json file.
Tracing the settings.json file
At first I thought about letting the user manually pick a storage location after each recording, but that is obviously not simple enough to operate. Then, during a run, it suddenly occurred to me that in order to record audio the user has to click somewhere to trigger the recording command, and vscode provides a method to get the location of the file in which the user triggered the command.
So I use the file location where the user triggered the command as the starting point and search upward for the .vscode folder step by step. For example, if the user clicks in the /xxx1/xxx2/xxx3.js file to record a voice comment, I first check whether /xxx1/xxx2/.vscode is a folder; if not, I check whether /xxx1/.vscode is a folder, and so on, until the .vscode folder is found. If it is never found, an error is reported.
Validation of audio folder path
Using the location of the settings.json file plus the relative path configured by the user, the real audio storage location can be obtained. But you can't relax yet: you still need to check whether the resolved folder path actually exists, and create the folder for the user if it doesn't.
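A minimal sketch of that check, assuming Node's fs module is available in the extension (ensureVoiceDir is a made-up helper name for illustration; the plugin's real code may differ):

import * as fs from "fs";

// Hedged sketch: make sure the resolved audio folder exists, creating it
// (and any missing parent folders) for the user if it does not.
function ensureVoiceDir(dirPath: string): string {
    if (!fs.existsSync(dirPath)) {
        fs.mkdirSync(dirPath, { recursive: true });
    }
    return dirPath;
}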
A problem can still occur here. Suppose project a contains project b, and you want to record audio inside project b, but project b has no .vscode workspace folder while project a does have .vscode > settings.json; then the recording made in project b will end up stored inside project a.
Because the cases above cannot reliably detect the user's real target path, the approach I chose is to pre-display the path that will be used on the recording page and let the user be the final gatekeeper:
The plugin's current minimal user configuration:
{
    "voiceAnnotation": {
        "dirPath": "../mp3"
    }
}
2. Defining the configuration
If the user doesn't want to store the audio files inside the project, for fear of bloating it, we also support a dedicated audio-storage project. In that case an absolute path needs to be configured globally, because the global configuration is not synchronized to other developers. When we can't get the audio path defined in the vscode workspace, we fall back to the value of the global path. Let's configure the global properties:
Add the global configuration settings to package.json:
"contributes":
"configuration": {
"type": "object",
"title": "语音注释配置",
"properties": {
"voiceAnnotation.globalDirPath": {
"type": "string",
"default": "",
"description": "语音注释文件的'绝对路径' (优先级低于工作空间的voiceAnnotation.dirPath)。"
},
"voiceAnnotation.serverProt": {
"type": "number",
"default": 8830,
"description": "默认值为8830"
}
}
}
},
For the specific meaning of each attribute, please refer to the effect diagram after configuration:
3. How to get the location of the audio folder
util/index.ts (a detailed analysis of each part follows below):
import * as vscode from "vscode";
import * as path from "path";
import * as fs from "fs";

export function getVoiceAnnotationDirPath() {
    // the file in which the user triggered the command
    const activeFilePath: string = vscode.window.activeTextEditor?.document?.fileName ?? "";
    // the relative path configured in the workspace (.vscode > settings.json)
    const voiceAnnotationDirPath: string = vscode.workspace.getConfiguration().get("voiceAnnotation.dirPath") || "";
    const workspaceFilePathArr = activeFilePath.split(path.sep);
    let targetPath = "";
    // walk upward from the active file until a .vscode folder is found
    for (let i = workspaceFilePathArr.length - 1; i > 0; i--) {
        try {
            const itemPath = `${path.sep}${workspaceFilePathArr.slice(1, i).join(path.sep)}${path.sep}.vscode`;
            fs.statSync(itemPath).isDirectory(); // throws if the path does not exist
            targetPath = itemPath;
            break;
        } catch (_) { }
    }
    if (voiceAnnotationDirPath && targetPath) {
        return path.resolve(targetPath, voiceAnnotationDirPath);
    } else {
        const globalDirPath = vscode.workspace
            .getConfiguration()
            .get("voiceAnnotation.globalDirPath");
        if (globalDirPath) {
            return globalDirPath as string;
        } else {
            getVoiceAnnotationDirPathErr();
        }
    }
}
function getVoiceAnnotationDirPathErr() {
vscode.window.showErrorMessage(`请于 .vscode/setting.json 内设置
"voiceAnnotation": {
"dirPath": "音频文件夹的相对路径"
}`)
}
Sentence-by-sentence analysis
1: Get the active location
vscode.window.activeTextEditor?.document?.fileName
The method above gets the location of the file in which the current command was triggered. For example, if you right-click inside a.js and click an option in the menu, this method returns the absolute path of the a.js file. And it's not only the context menu: every command, including hovering over a piece of text, can call this method to get the file location.
2: Get configuration items
vscode.workspace.getConfiguration().get("voiceAnnotation.dirPath") || "";
vscode.workspace.getConfiguration().get("voiceAnnotation.globalDirPath");
The methods above can obtain both the configuration from the project's .vscode > settings.json file and the global configuration, so we need to distinguish which one is being read; that's why I named the keys dirPath and globalDirPath.
3: File path separator
The "/" in /xxx/xx/x.js is path.sep; because mac and windows systems use different separators, path.sep is used here for compatibility with users on other systems.
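A tiny illustration (the values are the standard Node.js separators):

import * as path from "path";

// On macOS/Linux path.sep is "/", on Windows it is "\\".
// Splitting the active file's path with path.sep therefore works on both systems:
const parts = "/xxx1/xxx2/xxx3.js".split(path.sep);
// on macOS this yields ["", "xxx1", "xxx2", "xxx3.js"]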
4: Report an error
If neither the relative path nor the absolute path can be obtained, an error is thrown:
vscode.window.showErrorMessage(errorMessage)
5: Where it is used
The first place is when the server saves the audio; the second is when the recording web page opens, where the path is passed to the front end so the user can see where the file will be saved.
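A hedged sketch of those two call sites; saveRecording, voiceId and the webview panel are illustrative names, not the plugin's actual code:

import * as fs from "fs";
import * as path from "path";
import * as vscode from "vscode";

// 1) when the local server saves an uploaded recording (sketch)
function saveRecording(voiceId: string, mp3Buffer: Buffer): string {
    const dirPath = getVoiceAnnotationDirPath() ?? "";
    const filePath = path.join(dirPath, `${voiceId}.mp3`);
    fs.writeFileSync(filePath, mp3Buffer);
    return filePath;
}

// 2) when the recording page opens, tell the front end where the file will be saved
function showSavePath(panel: vscode.WebviewPanel): void {
    panel.webview.postMessage({ savePath: getVoiceAnnotationDirPath() });
}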
4. A first look at recording
For students who haven't used the recording feature before: you may not have seen navigator.mediaDevices. It returns a MediaDevices object, which provides access to connected media input devices such as cameras and microphones, and also screen sharing.
To record audio you first need to obtain the user's permission. navigator.mediaDevices.getUserMedia fires its success callback once the user grants permission and the device is available.
navigator.mediaDevices.getUserMedia({ audio: true })
    .then((stream) => {
        // since we passed { audio: true }, stream is the audio content stream
    })
    .catch((err) => {
    });
5. Initialize recording equipment and configuration
The following shows the 'initialization' that sets up the playback tags and the recording environment. As usual, code first, then a sentence-by-sentence explanation:
<header>
<audio id="audio" controls></audio>
<audio id="replayAudio" controls></audio>
</header>
let audioCtx = {};
let processor;
let userMediStream;

navigator.mediaDevices.getUserMedia({ audio: true })
    .then(function (stream) {
        userMediStream = stream;
        audio.srcObject = stream;
        audio.onloadedmetadata = function (e) {
            audio.muted = true;
        };
    })
    .catch(function (err) {
        console.log(err);
    });
1: An interesting find: elements can be used directly via their id (the audio variable here is the tag whose id is "audio").
2: Save the audio content stream
Here the media source is saved in a global variable, which makes it convenient to replay the sound later:
userMediStream = stream;
The srcObject attribute specifies the 'media source' associated with the <audio> tag:
audio.srcObject = stream;
3: Monitor data changes
When loading completes, set audio.muted = true; to mute the device. Why mute when we are recording? Because we don't need to hear our own voice played back while recording; that would produce a heavy "echo", so it has to be muted here.
audio.onloadedmetadata = function (e) {
audio.muted = true;
};
6. Start recording
First add a click event for the 'start recording' button:
const oAudio = document.getElementById("audio");
const oStartBt = document.getElementById("startBt"); // assumed id of the "start recording" button
let mediaRecorder;
let buffer = [];

oStartBt.addEventListener("click", function () {
    oAudio.srcObject = userMediStream;
    oAudio.play();
    buffer = [];
    const options = {
        mimeType: "audio/webm"
    };
    mediaRecorder = new MediaRecorder(userMediStream, options);
    mediaRecorder.ondataavailable = handleDataAvailable;
    mediaRecorder.start(10);
});
Process the acquired audio data
function handleDataAvailable(e) {
    if (e && e.data && e.data.size > 0) {
        buffer.push(e.data);
    }
}
- oAudio.srcObject defines the 'media source' of the playback tag.
- oAudio.play(); starts playback; since we set muted = true, this is effectively where recording starts.
- buffer is used to store the audio data; each new recording has to clear the leftovers of the previous one.
- new MediaRecorder creates a MediaRecorder object that records the given MediaStream; in other words, this API exists precisely for recording. Its second parameter can specify a mimeType; I looked up the available types on MDN.
- mediaRecorder.ondataavailable defines the processing logic for each chunk of audio data.
- mediaRecorder.start(10); slices the audio into 10-millisecond segments, and each segment's data is delivered as a Blob, so my understanding of this configuration is that one Blob object is produced every 10 milliseconds.
At this point our audio data is being continuously collected into the buffer array. The recording feature itself is now complete; next we enrich it with more functionality.
7. End, replay, re-record
1: End recording
Every recording has to end at some point. Some students asked whether the length or size of the audio should be limited, but I feel that specific restriction rules should be customized by each team; this version only provides the core functionality.
const oEndBt = document.getElementById("endBt");

oEndBt.addEventListener("click", function () {
    oAudio.pause();
    oAudio.srcObject = null;
});
- Clicking the 'end recording' button calls oAudio.pause() to stop the tag's playback.
- oAudio.srcObject = null; cuts off the media source so the tag can no longer receive audio data.
2: Replay the recording
Of course you want to listen to the recorded audio to check the result:
const oReplayBt = document.getElementById("replayBt");
const oReplayAudio = document.getElementById("replayAudio");

oReplayBt.addEventListener("click", function () {
    let blob = new Blob(buffer, { type: "audio/webm" });
    oReplayAudio.src = window.URL.createObjectURL(blob);
    oReplayAudio.play();
});
- Blob is a form of binary data storage; it's also what we use on the front end when generating excel files, for example. It can simply be understood as: the first parameter is the file's data, and the second parameter defines the file's type.
- The parameter of window.URL.createObjectURL is the 'resource data'; the method generates a url string, and the passed-in 'resource data' can then be accessed through that url. Note that the generated url is short-lived: once the document is unloaded it can no longer be accessed.
- oReplayAudio.src specifies the playback address for the player; since we don't need to record here, there's no need to set srcObject.
- oReplayAudio.play(); starts playback.
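As a small optional addition (not in the original code), the object url can also be released once it is no longer needed:

// Optional: free the object URL once replay has finished with it.
oReplayAudio.onended = function () {
    window.URL.revokeObjectURL(oReplayAudio.src);
};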
3: Re-record audio
If a recording doesn't turn out well, of course you want to re-record it. At first I wanted to support pausing and resuming a recording, but those capabilities felt a bit off-core, and long voice notes should be very rare, so here I simply take the brute-force approach of refreshing the page.
const oResetBt = document.getElementById("resetBt");

oResetBt.addEventListener("click", function () {
    location.reload();
});
8. Converting the format
The audio file obtained this way may fail to play if you try to play it directly with node. Although this simple audio data stream can be recognized by the browser, in order to smooth over the differences between browsers and operating systems, to be safe we need to convert it into the standard mp3 audio format.
MP3 is a lossy music format while WAV is a lossless music format. In fact, the difference between the two is very obvious. The former sacrifices the quality of the music in exchange for a smaller file size, while the latter guarantees the quality of the music to the greatest extent possible. This also leads to different uses of the two. MP3 is generally used for our ordinary users to listen to songs, while WAV files are usually used for studio recording and professional audio projects.
Here I chose the plug-in lamejs; the plug-in's github address is here .
lamejs is an mp3 encoder rewritten in JS; simply put, it can output the standard mp3 encoding format.
Add some new logic to the initialization code:
let audioCtx = {};
let processor;
let source;
let userMediStream;

navigator.mediaDevices
    .getUserMedia({ audio: true })
    .then(function (stream) {
        userMediStream = stream;
        audio.srcObject = stream;
        audio.onloadedmetadata = function (e) {
            audio.muted = true;
        };
        audioCtx = new AudioContext(); // new
        source = audioCtx.createMediaStreamSource(stream); // new
        processor = audioCtx.createScriptProcessor(0, 1, 1); // new
        processor.onaudioprocess = function (e) { // new
            const array = e.inputBuffer.getChannelData(0);
            encode(array);
        };
    })
    .catch(function (err) {
        console.log(err);
    });
- new AudioContext(): the audio-processing context; audio operations basically all happen inside this type.
- audioCtx.createMediaStreamSource(stream): creates an audio source node from the stream; admittedly a bit abstract.
- audioCtx.createScriptProcessor(0, 1, 1): creates an object that lets JavaScript process audio directly, i.e. you can manipulate the audio data with js. The three parameters are the 'buffer size', the 'number of input channels', and the 'number of output channels'.
- processor.onaudioprocess: listens for new data and defines how each chunk is processed.
- encode processes the audio data, which arrives as a Float32Array.
The following code is adapted from code other people shared online; what it does is perform the conversion with lamejs:
let mp3Encoder,
    maxSamples = 1152,
    samplesMono,
    lame,
    config,
    dataBuffer;

const clearBuffer = function () {
    dataBuffer = [];
};

const appendToBuffer = function (mp3Buf) {
    dataBuffer.push(new Int8Array(mp3Buf));
};

const init = function (prefConfig) {
    config = prefConfig || {};
    lame = new lamejs();
    // mono, 44.1 kHz sample rate and 128 kbps bit rate by default
    mp3Encoder = new lame.Mp3Encoder(
        1,
        config.sampleRate || 44100,
        config.bitRate || 128
    );
    clearBuffer();
};
init();

// convert float samples in [-1, 1] to 16-bit PCM
const floatTo16BitPCM = function (input, output) {
    for (let i = 0; i < input.length; i++) {
        let s = Math.max(-1, Math.min(1, input[i]));
        output[i] = s < 0 ? s * 0x8000 : s * 0x7fff;
    }
};

const convertBuffer = function (arrayBuffer) {
    let data = new Float32Array(arrayBuffer);
    let out = new Int16Array(arrayBuffer.length);
    floatTo16BitPCM(data, out);
    return out;
};

// feed the PCM data to the mp3 encoder in chunks of maxSamples
const encode = function (arrayBuffer) {
    samplesMono = convertBuffer(arrayBuffer);
    let remaining = samplesMono.length;
    for (let i = 0; remaining >= 0; i += maxSamples) {
        let left = samplesMono.subarray(i, i + maxSamples);
        let mp3buf = mp3Encoder.encodeBuffer(left);
        appendToBuffer(mp3buf);
        remaining -= maxSamples;
    }
};
The start-recording handler correspondingly needs some extra logic:
oStartBt.addEventListener("click", function () {
    clearBuffer();
    oAudio.srcObject = userMediStream;
    oAudio.play();
    buffer = [];
    const options = {
        mimeType: "audio/webm",
    };
    mediaRecorder = new MediaRecorder(userMediStream, options);
    mediaRecorder.ondataavailable = handleDataAvailable;
    mediaRecorder.start(10);
    source.connect(processor); // new
    processor.connect(audioCtx.destination); // new
});
- source.connect(processor): don't panic; source is what createMediaStreamSource returned above, and processor is what createScriptProcessor returned. Connecting the two means we start processing the audio data with js.
- audioCtx.destination: the final output destination of the audio graph in the current context, usually the speakers.
- processor.connect(audioCtx.destination) completes the chain, which means the processor's monitoring starts.
The end-recording handler likewise needs some extra logic:
oEndBt.addEventListener("click", function () {
    oAudio.pause();
    oAudio.srcObject = null;
    mediaRecorder.stop(); // new
    processor.disconnect(); // new
});
- mediaRecorder.stop() stops the recording (the data used to replay the recording).
- processor.disconnect() stops processing the audio data (the data being converted to mp3).
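One detail worth flagging: lamejs encoders buffer their final frames internally, so many examples also call mp3Encoder.flush() when recording ends. A hedged sketch on top of the helpers above (finishEncoding is a made-up name, not part of the original code):

// Hedged sketch: emit the encoder's remaining mp3 frames when recording ends,
// otherwise the tail of the audio may be cut off.
function finishEncoding() {
    const lastMp3buf = mp3Encoder.flush();
    if (lastMp3buf.length > 0) {
        appendToBuffer(lastMp3buf);
    }
}
// e.g. call finishEncoding() right after processor.disconnect() in the handler above.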
9. Send the recorded audio file to the server
The finished data is passed to the backend in the form of FormData:
const oSubmitBt = document.getElementById("submitBt");

oSubmitBt.addEventListener("click", function () {
    var blob = new Blob(dataBuffer, { type: "audio/mp3" });
    const formData = new FormData();
    formData.append("file", blob);
    fetch("/create_voice", {
        method: "POST",
        body: formData,
    })
        .then((res) => res.json())
        .catch((err) => console.log(err))
        .then((res) => {
            copy(res.voiceId); // copy is a clipboard helper provided elsewhere
            alert(`已保到剪切板: ${res.voiceId}`);
            window.opener = null;
            window.open("", "_self");
            window.close();
        });
});
- Here the current page is closed after the audio file is uploaded successfully, because voice notes really don't need to be recorded that often.
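For completeness, a hedged sketch of what the receiving end of /create_voice might look like; this is purely illustrative (the plugin's real server code is not shown in this article) and assumes express, multer, and the getVoiceAnnotationDirPath helper from earlier:

const express = require("express");
const multer = require("multer");
const path = require("path");

const app = express();
// store uploads directly in the audio folder resolved earlier (illustrative)
const upload = multer({ dest: getVoiceAnnotationDirPath() });

app.post("/create_voice", upload.single("file"), (req, res) => {
    // answer with an id the front end copies into a code comment
    res.json({ voiceId: path.parse(req.file.filename).name });
});

app.listen(8830); // voiceAnnotation.serverProt defaults to 8830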
10. Future Outlook
No similar plug-ins were found in the vscode plug-in store, and none were found on github either, which suggests this pain point isn't actually that painful. But that doesn't mean the problem should be left alone; taking action and actually improving something is what counts.
It's easy to imagine how a developer would use this "voice annotation" plug-in: only when text can't describe something clearly, so the recording feature should be used at a very low frequency. Because of this, there won't be too many audio files, and the extra size they add to the project may not cause much trouble.
If it sees real use later, I plan to add a "one-click deletion of unused annotations" feature: as a project evolves, some annotations are bound to become obsolete, and cleaning them up by hand is not realistic.
When playing, it will show who made the recording and the specific time of the recording.
In addition to voice annotations, users could later also add text + pictures; in other words, the goal is a plug-in with annotations at its core.
End
That's it for this time. I hope we keep making progress together.