Welcome to the second part of a series on using TypeScript, React, ANTLR4, and Monaco Editor to create a custom web editor. Before continuing, I suggest you read the first part, "Create a custom web editor using TypeScript, React, ANTLR4, Monaco Editor".
In this article, I will show how to implement the language service. The language service does the heavy work of parsing the text typed into the editor. We will use the abstract syntax tree (AST) generated by the Parser to find syntax or lexical errors, format the text, and suggest TODOs for the text the user is typing (I will not implement auto-completion in this article). Basically, the language service exposes the following functions:
format(code: string): string
validate(code: string): Errors[]
autoComplete(code: string, currentPosition: Position): string[]
Add ANTLR, Generate Lexer and Parser From the Grammar
I will add the ANTLR library and a script that generates the Parser and Lexer from the TodoLangGrammar.g4 grammar file. First, add the two necessary libraries: antlr4ts and antlr4ts-cli. The Parser generated by the antlr4 TypeScript target has a runtime dependency on the antlr4ts package. antlr4ts-cli, as the name suggests, is the CLI we will use to generate the Parser and Lexer of the language.
npm add antlr4ts
npm add -D antlr4ts-cli
Create a file TodoLangGrammar.g4 in the root path, containing the TodoLang grammar rules:
grammar TodoLangGrammar;
todoExpressions : (addExpression)* (completeExpression)*;
addExpression : ADD TODO STRING;
completeExpression : COMPLETE TODO STRING;
ADD : 'ADD';
TODO : 'TODO';
COMPLETE: 'COMPLETE';
STRING: '"' ~ ["]* '"';
EOL: [\r\n] + -> skip;
WS: [ \t] -> skip;
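To get a feel for what the lexer rules above describe, here is a minimal hand-written sketch of the same tokenization, without ANTLR. It is only an illustration: the names SketchToken and tokenize are made up for this example, and the real generated Lexer is far more complete (for instance, it reports errors through listeners instead of emitting an UNKNOWN token).

```typescript
// Hypothetical token shape for this sketch; the ANTLR-generated Lexer produces richer Token objects.
type TokenType = "ADD" | "TODO" | "COMPLETE" | "STRING" | "UNKNOWN";

interface SketchToken {
    type: TokenType;
    text: string;
    line: number;   // 1-based, like ANTLR's line
    column: number; // 0-based, like ANTLR's charPositionInLine
}

const keywords: TokenType[] = ["ADD", "TODO", "COMPLETE"];

function tokenize(code: string): SketchToken[] {
    const tokens: SketchToken[] = [];
    let pos = 0;
    let line = 1;
    let lineStart = 0;
    while (pos < code.length) {
        const ch = code.charAt(pos);
        // EOL and WS rules: consumed but not emitted ("-> skip")
        if (ch === "\n") { pos++; line++; lineStart = pos; continue; }
        if (ch === "\r" || ch === " " || ch === "\t") { pos++; continue; }

        let text = ch;
        let type: TokenType = "UNKNOWN";
        if (ch === '"') {
            // STRING rule: a quote, any characters except a quote, then a closing quote
            const close = code.indexOf('"', pos + 1);
            if (close !== -1) { text = code.slice(pos, close + 1); type = "STRING"; }
        } else {
            // keyword rules: ADD, TODO, COMPLETE
            for (const kw of keywords) {
                if (code.slice(pos, pos + kw.length) === kw) { text = kw; type = kw; break; }
            }
        }
        tokens.push({ type, text, line, column: pos - lineStart });

        // a STRING may contain line breaks, so keep the line bookkeeping in sync
        for (let i = 0; i < text.length; i++) {
            if (text.charAt(i) === "\n") { line++; lineStart = pos + i + 1; }
        }
        pos += text.length;
    }
    return tokens;
}

const sketchTokens = tokenize('ADD TODO "Create an editor"\nCOMPLETE TODO "Create an editor"');
console.log(sketchTokens.map(t => t.type).join(" "));
```

The parser rules (todoExpressions, addExpression, completeExpression) then work on this token sequence rather than on raw characters.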
Now add a script to package.json that generates the Parser and Lexer with antlr4ts-cli:
"antlr4ts": "antlr4ts ./TodoLangGrammar.g4 -o ./src/ANTLR"
Run the antlr4ts script, and you will see the generated TypeScript source code in ./src/ANTLR:
npm run antlr4ts
As you can see, the script generates a Lexer and a Parser. If you look at the Parser file, you will find that it exports the TodoLangGrammarParser class, whose constructor is constructor(input: TokenStream). It takes as a parameter the TokenStream that TodoLangGrammarLexer generates for a given piece of code. TodoLangGrammarLexer, in turn, has the constructor constructor(input: CharStream), which takes the code as its input parameter.
The Parser file contains the public todoExpressions(): TodoExpressionsContext method, which returns the context object of all the TodoExpressions defined in the code. The name todoExpressions comes from the first rule in our grammar file:
todoExpressions : (addExpression)* (completeExpression)*;
TodoExpressionsContext is the root of the AST. Each node in it is the context of another rule, and it contains terminals and node contexts. A terminal holds a final token (an ADD token, a TODO token, or the token with the name of the todo).
TodoExpressionsContext contains the lists of addExpressions and completeExpressions, derived from the following three rules:
todoExpressions : (addExpression)* (completeExpression)*;
addExpression : ADD TODO STRING;
completeExpression : COMPLETE TODO STRING;
On the other hand, each context class contains terminal nodes, which basically hold the text (the code segment or token, for example ADD, COMPLETE, or the string representing the TODO). The complexity of the AST depends on the grammar rules you write.
Let's take a look at AddExpressionContext, which contains the ADD, TODO, and STRING terminal nodes. The corresponding rule is as follows:
addExpression : ADD TODO STRING;
The STRING node holds the text of the Todo we want to add. First, let's parse some simple TodoLang code to understand how the AST works. Create a file parser.ts in the ./src/language-service directory with the following content:
import { TodoLangGrammarParser, TodoExpressionsContext } from "../ANTLR/TodoLangGrammarParser";
import { TodoLangGrammarLexer } from "../ANTLR/TodoLangGrammarLexer";
import { ANTLRInputStream, CommonTokenStream } from "antlr4ts";
export default function parseAndGetASTRoot(code: string): TodoExpressionsContext {
    const inputStream = new ANTLRInputStream(code);
    const lexer = new TodoLangGrammarLexer(inputStream);
    const tokenStream = new CommonTokenStream(lexer);
    const parser = new TodoLangGrammarParser(tokenStream);
    // Parse the input; `todoExpressions` is the entry rule defined in the grammar
    return parser.todoExpressions();
}
The parser.ts file exports the parseAndGetASTRoot(code) method, which accepts TodoLang code and generates the corresponding AST. For example, parse the following TodoLang code:
parseAndGetASTRoot(`
ADD TODO "Create an editor"
COMPLETE TODO "Create an editor"
`)
Implementing Lexical and Syntax Validation
In this section, I will guide you step by step through adding syntax validation to the editor. ANTLR generates lexical and syntax errors for us out of the box; we only need to implement the ANTLRErrorListener interface and provide it to the Lexer and the Parser, so that we can collect errors while ANTLR parses the code.
Create a TodoLangErrorListener.ts file in the ./src/language-service directory. The file exports the TodoLangErrorListener class, which implements the ANTLRErrorListener interface:
import { ANTLRErrorListener, RecognitionException, Recognizer } from "antlr4ts";
export interface ITodoLangError {
    startLineNumber: number;
    startColumn: number;
    endLineNumber: number;
    endColumn: number;
    message: string;
    code: string;
}
export default class TodoLangErrorListener implements ANTLRErrorListener<any>{
    private errors: ITodoLangError[] = []
    syntaxError(recognizer: Recognizer<any, any>, offendingSymbol: any, line: number, charPositionInLine: number, message: string, e: RecognitionException | undefined): void {
        this.errors.push(
            {
                startLineNumber: line,
                endLineNumber: line,
                // ANTLR columns are 0-based, Monaco columns are 1-based
                startColumn: charPositionInLine + 1,
                endColumn: charPositionInLine + 2, // Let's suppose the length of the error is only 1 char for simplicity
                message,
                code: "1" // This is the error code; you can customize it as you want
            }
        )
    }
    getErrors(): ITodoLangError[] {
        return this.errors;
    }
}
Every time ANTLR encounters an error while parsing the code, it calls this listener and provides it with information about the error. The listener collects each error together with the position in the code where it occurred. Now let's bind TodoLangErrorListener to the Lexer and the Parser in parser.ts, like this:
import { TodoLangGrammarParser, TodoExpressionsContext } from "../ANTLR/TodoLangGrammarParser";
import { TodoLangGrammarLexer } from "../ANTLR/TodoLangGrammarLexer";
import { ANTLRInputStream, CommonTokenStream } from "antlr4ts";
import TodoLangErrorListener, { ITodoLangError } from "./TodoLangErrorListener";
function parse(code: string): {ast:TodoExpressionsContext, errors: ITodoLangError[]} {
    const inputStream = new ANTLRInputStream(code);
    const lexer = new TodoLangGrammarLexer(inputStream);
    lexer.removeErrorListeners()
    const todoLangErrorsListner = new TodoLangErrorListener();
    lexer.addErrorListener(todoLangErrorsListner);
    const tokenStream = new CommonTokenStream(lexer);
    const parser = new TodoLangGrammarParser(tokenStream);
    parser.removeErrorListeners();
    parser.addErrorListener(todoLangErrorsListner);
    const ast = parser.todoExpressions();
    const errors: ITodoLangError[] = todoLangErrorsListner.getErrors();
    return {ast, errors};
}
export function parseAndGetASTRoot(code: string): TodoExpressionsContext {
    const {ast} = parse(code);
    return ast;
}
export function parseAndGetSyntaxErrors(code: string): ITodoLangError[] {
    const {errors} = parse(code);
    return errors;
}
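A detail that is easy to miss when turning these errors into editor markers: ANTLR reports line starting at 1 and charPositionInLine starting at 0, while Monaco positions are 1-based for both lines and columns. The sketch below shows one way to keep that conversion in a single place. The names MarkerRange and toMarkerRange are hypothetical (they are not part of antlr4ts or Monaco), and the helper assumes the offending text sits on a single line.

```typescript
// Hypothetical helper for illustration; not part of antlr4ts or Monaco.
interface MarkerRange {
    startLineNumber: number;
    startColumn: number;
    endLineNumber: number;
    endColumn: number;
}

// line: 1-based line as reported by ANTLR
// charPositionInLine: 0-based column as reported by ANTLR
// length: number of characters to highlight (assumed to stay on one line)
function toMarkerRange(line: number, charPositionInLine: number, length: number = 1): MarkerRange {
    const startColumn = charPositionInLine + 1; // shift to 1-based columns
    return {
        startLineNumber: line,
        startColumn,
        endLineNumber: line,
        endColumn: startColumn + Math.max(length, 1), // the end column is the position after the last character
    };
}

// A 3-character token at the very beginning of line 1 covers columns 1 to 4.
console.log(toMarkerRange(1, 0, 3));
```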
Create LanguageService.ts in the ./src/language-service directory with the following content:
import { TodoExpressionsContext } from "../ANTLR/TodoLangGrammarParser";
// the file created earlier is parser.ts, so the import path must match its casing
import { parseAndGetASTRoot, parseAndGetSyntaxErrors } from "./parser";
import { ITodoLangError } from "./TodoLangErrorListener";
export default class TodoLangLanguageService {
    validate(code: string): ITodoLangError[] {
        const syntaxErrors: ITodoLangError[] = parseAndGetSyntaxErrors(code);
        //Later we will append semantic errors
        return syntaxErrors;
    }
}
Great, we have implemented error parsing for the editor. Next, I will create the web worker discussed in the previous article and add the worker service proxy, which will call the language service to perform the advanced features of the editor.
Creating the web worker
First, we call monaco.editor.createWebWorker to create the proxy TodoLangWorker using built-in ES6 Proxies. TodoLangWorker uses the language service to perform the editor's features. Its methods are executed inside the web worker and proxied by Monaco, so calling a method of the web worker is just a matter of calling the proxied method in the main thread.
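To make the proxy idea more concrete, here is a small self-contained sketch of how an ES6 Proxy can turn any method call into a request message. This is not Monaco's implementation, only an illustration of the general technique: createClientProxy, WorkerRequest, and the send callback are hypothetical names, and send stands in for whatever posts the message to the worker and waits for its reply.

```typescript
// Hypothetical message shape, for illustration only.
interface WorkerRequest {
    method: string;
    args: unknown[];
}

// Every property read on the proxy yields a function that forwards the call through `send`.
function createClientProxy<T extends object>(send: (request: WorkerRequest) => Promise<unknown>): T {
    return new Proxy({} as T, {
        get(_target, prop) {
            // keep "then" undefined so that awaiting the proxy itself does not treat it as a thenable
            if (typeof prop !== "string" || prop === "then") {
                return undefined;
            }
            return (...args: unknown[]) => send({ method: prop, args });
        },
    });
}

// Usage: the "worker" here is faked by a function that records requests and answers with no errors.
interface ValidationApi {
    doValidation(): Promise<string[]>;
}
const sentRequests: WorkerRequest[] = [];
const client = createClientProxy<ValidationApi>((request) => {
    sentRequests.push(request);
    return Promise.resolve([]);
});
client.doValidation();
console.log(sentRequests); // one request whose method is "doValidation"
```

The caller only sees an object with the worker's methods, each returning a Promise, which is why the main-thread code later in this article can simply await worker.doValidation().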
Create todoLangWorker.ts under the ./src/todo-lang folder with the following content:
import * as monaco from "monaco-editor-core";
import IWorkerContext = monaco.worker.IWorkerContext;
import TodoLangLanguageService from "../language-service/LanguageService";
import { ITodoLangError } from "../language-service/TodoLangErrorListener";
export class TodoLangWorker {
    private _ctx: IWorkerContext;
    private languageService: TodoLangLanguageService;
    constructor(ctx: IWorkerContext) {
        this._ctx = ctx;
        this.languageService = new TodoLangLanguageService();
    }
    doValidation(): Promise<ITodoLangError[]> {
        const code = this.getTextDocument();
        return Promise.resolve(this.languageService.validate(code));
    }
    private getTextDocument(): string {
        // only one model is supported for now; pick the first mirrored model
        const model = this._ctx.getMirrorModels()[0];
        return model.getValue();
    }
}
We created an instance of the language service and added the doValidation method, which calls the validate method of the language service. We also added the getTextDocument method, which returns the text value of the editor. The TodoLangWorker class can be extended with many more features, for example if you want to support multi-file editing. _ctx: IWorkerContext is the context object of the editor, which holds the model information of the files.
Now let us create the web worker file todolang.worker.ts in ./src/todo-lang:
import * as worker from 'monaco-editor-core/esm/vs/editor/editor.worker';
import { TodoLangWorker } from './todoLangWorker';
self.onmessage = () => {
    worker.initialize((ctx) => {
        return new TodoLangWorker(ctx)
    });
};
We use the built-in worker.initialize to initialize our worker and use TodoLangWorker to proxy the necessary methods. This is a web worker, so we must have webpack output a corresponding worker file:
// webpack.config.js
entry: {
    app: './src/index.tsx',
    "editor.worker": 'monaco-editor-core/esm/vs/editor/editor.worker.js',
    "todoLangWorker": './src/todo-lang/todolang.worker.ts'
},
output: {
    globalObject: 'self',
    filename: (chunkData) => {
        switch (chunkData.chunk.name) {
            case 'editor.worker':
                return 'editor.worker.js';
            case 'todoLangWorker':
                return "todoLangWorker.js"
            default:
                return 'bundle.[hash].js';
        }
    },
    path: path.resolve(__dirname, 'dist')
}
We named the worker file todoLangWorker.js. Now we add it to getWorkerUrl:
(window as any).MonacoEnvironment = {
    getWorkerUrl: function (moduleId, label) {
        if (label === languageID)
            return "./todoLangWorker.js";
        return './editor.worker.js';
    }
}
This is how Monaco gets the URL of the web worker. Note that if the label of the worker is the ID of TodoLang, we return the worker with the same name that Webpack outputs. If you build the project now, you will find a file called todoLangWorker.js (and in the dev tools, you will find two workers in the threads section).
Now create the WorkerManager, which manages the creation of the worker and gives access to the proxied worker client:
import * as monaco from "monaco-editor-core";
import Uri = monaco.Uri;
import { TodoLangWorker } from './todoLangWorker';
import { languageID } from './config';
export class WorkerManager {
    private worker: monaco.editor.MonacoWebWorker<TodoLangWorker>;
    private workerClientProxy: Promise<TodoLangWorker>;
    constructor() {
        this.worker = null;
    }
    private getClientproxy(): Promise<TodoLangWorker> {
        if (!this.workerClientProxy) {
            this.worker = monaco.editor.createWebWorker<TodoLangWorker>({
                moduleId: 'TodoLangWorker',
                label: languageID,
                createData: {
                    languageId: languageID,
                }
            });
            this.workerClientProxy = <Promise<TodoLangWorker>><any>this.worker.getProxy();
        }
        return this.workerClientProxy;
    }
    async getLanguageServiceWorker(...resources: Uri[]): Promise<TodoLangWorker> {
        const _client: TodoLangWorker = await this.getClientproxy();
        await this.worker.withSyncedResources(resources)
        return _client;
    }
}
We use createWebWorker to create the Monaco-proxied web worker, and then get the proxied client object from it. We will use workerClientProxy to call the proxied methods. Next, let us create the DiagnosticsAdapter class, which connects the Monaco markers API with the errors returned by the language service, so that parsing errors are marked correctly in Monaco:
import * as monaco from "monaco-editor-core";
import { WorkerAccessor } from "./setup";
import { languageID } from "./config";
import { ITodoLangError } from "../language-service/TodoLangErrorListener";
export default class DiagnosticsAdapter {
    constructor(private worker: WorkerAccessor) {
        const onModelAdd = (model: monaco.editor.IModel): void => {
            let handle: any;
            model.onDidChangeContent(() => {
                // here we are debouncing the user changes, so every time a new change is made, we wait 500ms before validating
                // if the user is still typing, we cancel the pending validation and start waiting again
                clearTimeout(handle);
                handle = setTimeout(() => this.validate(model.uri), 500);
            });
            this.validate(model.uri);
        };
        monaco.editor.onDidCreateModel(onModelAdd);
        monaco.editor.getModels().forEach(onModelAdd);
    }
    private async validate(resource: monaco.Uri): Promise<void> {
        // get the worker proxy
        const worker = await this.worker(resource)
        // ask the language service for the errors and show them as markers
        const errorMarkers = await worker.doValidation();
        const model = monaco.editor.getModel(resource);
        monaco.editor.setModelMarkers(model, languageID, errorMarkers.map(toDiagnostics));
    }
}
function toDiagnostics(error: ITodoLangError): monaco.editor.IMarkerData {
    return {
        ...error,
        severity: monaco.MarkerSeverity.Error,
    };
}
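The change handler above debounces validation with clearTimeout and setTimeout. The same pattern can be pulled out into a small reusable helper, sketched below. The debounce function and the validations counter are hypothetical names used only for this illustration; the adapter itself does not need them.

```typescript
// Returns a wrapper that runs `fn` only after `delayMs` have passed without another call.
function debounce<A extends unknown[]>(fn: (...args: A) => void, delayMs: number): (...args: A) => void {
    let handle: ReturnType<typeof setTimeout> | undefined;
    return (...args: A) => {
        if (handle !== undefined) {
            clearTimeout(handle); // a newer call cancels the pending one
        }
        handle = setTimeout(() => fn(...args), delayMs);
    };
}

// Usage: three quick "keystrokes" schedule a single validation.
let validations = 0;
const validateLater = debounce(() => { validations++; }, 50);
validateLater();
validateLater();
validateLater();
// Nothing has run yet at this point; the callback fires once, about 50 ms after the last call.
```

With a delay of 500, this behaves like the handler in the adapter: the worker is only asked to validate once the user pauses typing.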
The onDidChangeContent listener listens for changes to the model. When the model changes, we wait 500ms after the last change before calling the web worker to validate the code and add the error markers; setModelMarkers tells Monaco to add the error markers. To complete the editor setup, call them in the setup function, and notice that we use WorkerManager to get the proxied worker:
// setup.ts (end of the setup function)
    monaco.languages.onLanguage(languageID, () => {
        monaco.languages.setMonarchTokensProvider(languageID, monarchLanguage);
        monaco.languages.setLanguageConfiguration(languageID, richLanguageConfiguration);
        const client = new WorkerManager();
        const worker: WorkerAccessor = (...uris: monaco.Uri[]): Promise<TodoLangWorker> => {
            return client.getLanguageServiceWorker(...uris);
        };
        //Call the errors provider
        new DiagnosticsAdapter(worker);
    });
}
export type WorkerAccessor = (...uris: monaco.Uri[]) => Promise<TodoLangWorker>;
Now everything is ready. Run the project and type some invalid TodoLang code; you will see the errors underlined in the code.
Implementing Semantic Validation
Now let's add semantic validation to the editor. Remember the two semantic rules I mentioned in the previous article:
- If a TODO is defined using the ADD TODO instruction, we cannot add it again.
- The COMPLETE instruction should not be applied to a TODO that has not been declared using ADD TODO.
To check whether a TODO is already defined, all we have to do is traverse the AST, get each ADD expression, and push its TODO into definedTodos. Before pushing, we check whether the TODO already exists in definedTodos. If it does, it is a semantic error, so we get the position of the error from the context of the ADD expression and push the error to the errors array. The same goes for the second rule:
// AddExpressionContext and CompleteExpressionContext are exported by "../ANTLR/TodoLangGrammarParser"
function checkSemanticRules(ast: TodoExpressionsContext): ITodoLangError[] {
    const errors: ITodoLangError[] = [];
    const definedTodos: string[] = [];
    ast.children.forEach(node => {
        if (node instanceof AddExpressionContext) {
            // if an Add expression : ADD TODO "STRING"
            const todo = node.STRING().text;
            // If a TODO is defined using the ADD TODO instruction, we cannot re-add it.
            if (definedTodos.some(todo_ => todo_ === todo)) {
                // node has everything needed to know where this expression is in the code
                // (ANTLR columns are 0-based, Monaco columns are 1-based)
                errors.push({
                    code: "2",
                    endColumn: node.stop.charPositionInLine + 1 + (node.stop.stopIndex - node.stop.startIndex + 1),
                    endLineNumber: node.stop.line,
                    message: `Todo ${todo} already defined`,
                    startColumn: node.stop.charPositionInLine + 1,
                    startLineNumber: node.stop.line
                });
            } else {
                definedTodos.push(todo);
            }
        } else if (node instanceof CompleteExpressionContext) {
            const todoToComplete = node.STRING().text;
            if (definedTodos.every(todo_ => todo_ !== todoToComplete)) {
                // if the todo is not yet defined; here we only check the todos defined before this expression,
                // which means the order is important
                errors.push({
                    code: "2",
                    endColumn: node.stop.charPositionInLine + 1 + (node.stop.stopIndex - node.stop.startIndex + 1),
                    endLineNumber: node.stop.line,
                    message: `Todo ${todoToComplete} is not defined`,
                    startColumn: node.stop.charPositionInLine + 1,
                    startLineNumber: node.stop.line
                });
            }
        }
    })
    return errors;
}
Now call the checkSemanticRules function in the validate method of the language service and return the semantic and syntax errors merged together. Our editor now supports semantic validation.
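To see the two semantic rules in isolation, here is a dependency-free sketch that applies them to a plain list of statements instead of ANTLR context nodes. The Statement type and the checkRules function are hypothetical simplifications made up for this example; in the editor, the same logic runs over the AST as described above.

```typescript
// Hypothetical simplified statement, standing in for AddExpressionContext / CompleteExpressionContext.
interface Statement {
    kind: "ADD" | "COMPLETE";
    todo: string;
    line: number;
}

function checkRules(statements: Statement[]): string[] {
    const errors: string[] = [];
    const definedTodos: string[] = [];
    for (const statement of statements) {
        const isDefined = definedTodos.indexOf(statement.todo) !== -1;
        if (statement.kind === "ADD") {
            if (isDefined) {
                // rule 1: a TODO cannot be added twice
                errors.push(`line ${statement.line}: Todo ${statement.todo} already defined`);
            } else {
                definedTodos.push(statement.todo);
            }
        } else if (!isDefined) {
            // rule 2: COMPLETE needs a TODO that was added earlier, so order matters
            errors.push(`line ${statement.line}: Todo ${statement.todo} is not defined`);
        }
    }
    return errors;
}

const problems = checkRules([
    { kind: "COMPLETE", todo: '"a"', line: 1 }, // not defined yet
    { kind: "ADD", todo: '"a"', line: 2 },
    { kind: "ADD", todo: '"a"', line: 3 },      // already defined
    { kind: "COMPLETE", todo: '"a"', line: 4 },
]);
console.log(problems);
```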
Implementing Auto-Formatting
To support auto-formatting in the editor, you need to provide a document formatting provider and register it through the Monaco API registerDocumentFormattingEditProvider. Check the monaco-editor documentation for more details. The formatter traverses the AST and emits the beautified code:
// languageService.ts
format(code: string): string {
    // if the code contains errors, we do not format it, because this way of formatting would remove some of the code
    // to keep things simple, we only allow formatting valid code
    if (this.validate(code).length > 0)
        return code;
    let formattedCode = "";
    const ast: TodoExpressionsContext = parseAndGetASTRoot(code);
    ast.children.forEach(node => {
        if (node instanceof AddExpressionContext) {
            // if an Add expression : ADD TODO "STRING"
            const todo = node.STRING().text;
            formattedCode += `ADD TODO ${todo}\n`;
        } else if (node instanceof CompleteExpressionContext) {
            // if a Complete expression: COMPLETE TODO "STRING"
            const todoToComplete = node.STRING().text;
            formattedCode += `COMPLETE TODO ${todoToComplete}\n`;
        }
    });
    return formattedCode;
}
Add a format method to todoLangWorker; it calls the format method of the language service.
Now create the TodoLangFormattingProvider class, which implements the DocumentFormattingEditProvider interface:
import * as monaco from "monaco-editor-core";
import { WorkerAccessor } from "./setup";
export default class TodoLangFormattingProvider implements monaco.languages.DocumentFormattingEditProvider {
    constructor(private worker: WorkerAccessor) {
    }
    provideDocumentFormattingEdits(model: monaco.editor.ITextModel, options: monaco.languages.FormattingOptions, token: monaco.CancellationToken): monaco.languages.ProviderResult<monaco.languages.TextEdit[]> {
        return this.format(model.uri, model.getValue());
    }
    private async format(resource: monaco.Uri, code: string): Promise<monaco.languages.TextEdit[]> {
        // get the worker proxy
        const worker = await this.worker(resource)
        // call the format method proxy from the language service and get the formatted code
        const formattedCode = await worker.format(code);
        // replace the whole document; Monaco lines and columns are 1-based
        const lines = code.split("\n");
        const endLineNumber = lines.length;
        const endColumn = lines[lines.length - 1].length + 1;
        return [
            {
                text: formattedCode,
                range: {
                    endColumn,
                    endLineNumber,
                    startColumn: 1,
                    startLineNumber: 1
                }
            }
        ]
    }
}
TodoLangFormattingProvider calls the format method exposed by the worker, passing editor.getValue() as the argument, and gives Monaco the formatted code together with the range of code to replace. Now go to the setup function and register the formatting provider using the Monaco registerDocumentFormattingEditProvider API. Re-run the application, and you will see that the editor now supports auto-formatting:
monaco.languages.registerDocumentFormattingEditProvider(languageID, new TodoLangFormattingProvider(worker));
Click Format document or press Shift + Alt + F to see the result.
Implementing Auto-Completion
To support auto-completion of defined TODOs, all you have to do is get all the defined TODOs from the AST and register a completion provider by calling registerCompletionItemProvider in setup. The completion provider gives you the code and the current position of the cursor, so you can check the context in which the user is typing: if they are typing a TODO in a COMPLETE expression, you can suggest the predefined TODOs. Keep in mind that, by default, Monaco Editor supports auto-completion based on the tokens that already exist in the code; you may want to disable this and implement your own token suggestions to make them more intelligent and contextual.
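As a rough idea of what the autoComplete function of the language service could do, here is a minimal sketch. Note the assumptions: it uses regular expressions instead of the AST to keep the example dependency-free, it takes a plain character offset instead of a Monaco Position, and suggestTodos is a hypothetical name. Wiring it into registerCompletionItemProvider is left out.

```typescript
// Suggest the TODOs defined before the cursor, but only while the user is typing
// the argument of a COMPLETE TODO expression.
function suggestTodos(code: string, cursorOffset: number): string[] {
    const before = code.slice(0, cursorOffset);
    const lines = before.split("\n");
    const currentLine = lines[lines.length - 1];

    // matches `COMPLETE TODO ` optionally followed by an unfinished string
    if (!/^\s*COMPLETE\s+TODO\s+"?[^"]*$/.test(currentLine)) {
        return [];
    }

    // collect every `ADD TODO "..."` that appears before the cursor, without duplicates
    const definedTodos: string[] = [];
    const addPattern = /ADD\s+TODO\s+("[^"]*")/g;
    let match: RegExpExecArray | null;
    while ((match = addPattern.exec(before)) !== null) {
        if (definedTodos.indexOf(match[1]) === -1) {
            definedTodos.push(match[1]);
        }
    }
    return definedTodos;
}

const draft = 'ADD TODO "Create an editor"\nADD TODO "Write docs"\nCOMPLETE TODO ';
console.log(suggestTodos(draft, draft.length));
```

A real implementation would more likely reuse the parser from this article, so that the suggestions stay consistent with the grammar.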