Example analysis: how to develop a VSCode LSP service

Let's start with an animated demo:

The animation above shows the error diagnostics feature that most developers use every day. While you write code, it tells you which piece of code has what kind of problem.

This seemingly sophisticated feature is actually quite simple from a plug-in developer's point of view. It builds on one of the three approaches to VSCode language features briefly introduced in the previous article, "You Don't Know the VSCode Code Highlighting Principle":

Lexical highlighting based on Semantic Tokens Provider
Programmatic syntax highlighting based on Language API
Multi-process architecture syntax highlighting based on Language Server Protocol

Among them, the Language Server Protocol (LSP) has gradually become the mainstream choice thanks to its advantages in performance and development efficiency. This article walks through how several language features are implemented on top of LSP, and then explains LSP's communication model and development workflow.

Sample code

The examples in this article are available on GitHub. Readers are encouraged to pull the code and try it out first:

# 1. Clone the sample code
git clone git@github.com:Tecvan-fe/vscode-lsp-sample.git
# 2. Install dependencies (inside the cloned directory)
cd vscode-lsp-sample
npm i # or yarn
# 3. Open the sample code in VSCode
code .
# 4. Press F5 in VSCode to start debugging

After successful execution, you can see the plug-in debugging window:

The core files are:

server/src/server.ts : the LSP server code, with examples of common language features such as code completion, error diagnostics, and hover hints.
client/src/extension.ts : provides the parameters the LSP client needs, including the server's debugging port, code entry point, and communication mode.
package.json : mainly provides the configuration required by the language plug-in, including:
- activationEvents : declares when the plug-in is activated; onLanguage:plaintext in the sample means it is activated when a plain text (txt) file is opened
- main : the entry file of the plug-in

Of these, client/src/extension.ts and package.json are relatively simple and are not covered in detail here; the focus is on server/src/server.ts. The following sections break down the implementation of different language features step by step.

How to write Language Server

Server structure analysis

server/src/server.ts in the sample project implements a small but complete Language Server. The core code:


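import {
  createConnection,
  InitializeParams,
  InitializeResult,
  ProposedFeatures,
  TextDocuments,
} from "vscode-languageserver/node";
import { TextDocument } from "vscode-languageserver-textdocument";

// Element 1: initialize the LSP connection object
const connection = createConnection(ProposedFeatures.all);

// Element 2: create the document collection object, which maps to the actual documents
const documents: TextDocuments<TextDocument> = new TextDocuments(TextDocument);

connection.onInitialize((params: InitializeParams) => {
  // Element 3: explicitly declare the language features the plug-in supports
  const result: InitializeResult = {
    capabilities: {
      hoverProvider: true
    },
  };
  return result;
});

// Element 4: associate the document collection with the connection object
documents.listen(connection);

// Element 5: start listening on the connection object
connection.listen();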

The sample code shows the five steps every Language Server needs:

Create the connection object, which handles message exchange between client and server
Create the documents collection object, which mirrors the files being edited on the client
In the connection.onInitialize event, explicitly declare the language features the plug-in supports. In the example above, the returned object contains hoverProvider: true, which means the plug-in can provide hover hints
Associate documents with the connection object
Call connection.listen to start listening for client messages

The connection and documents objects above come from these npm packages:
vscode-languageserver/node
vscode-languageserver-textdocument

This is a basic template that takes care of the Language Server's initialization. From here you can use connection.onXXX or documents.onXXX to listen for interaction events and return results that conform to the LSP protocol from the event callback, or explicitly call a communication function such as connection.sendDiagnostics to push information to the client.

Next, we use a few simple examples to analyze the implementation logic of each language feature.

Hover prompt

When the mouse hovers over a token such as a function, variable, or symbol, VSCode displays the description and help information for that token:

To implement the hover prompt function, you first need to declare that the plug-in supports the hoverProvider feature:

connection.onInitialize((params: InitializeParams) => {
  return {
    capabilities: {
      hoverProvider: true
    },
  };
});

After that, you need to listen to the connection.onHover event and return prompt information in the event callback:

connection.onHover((params: HoverParams): Promise<Hover> => {
  return Promise.resolve({
    contents: ["Hover Demo"],
  });
});

That is all there is to this language feature: listen for an event and return a result.
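The handler above always returns the same fixed string. A real hover handler would usually first work out which token sits under the cursor. The following is only a minimal sketch of that lookup, written against plain strings so it can run without the editor; TextPosition, toOffset, wordAt, and describeToken are hypothetical names introduced for illustration and are not part of vscode-languageserver.

```typescript
// Plain stand-in for an LSP position (zero-based line and character).
interface TextPosition {
  line: number;
  character: number;
}

// Hypothetical helper: convert a line/character position into a string offset.
// Assumes "\n" line endings only.
function toOffset(text: string, pos: TextPosition): number {
  const lines = text.split("\n");
  let offset = 0;
  for (let i = 0; i < pos.line && i < lines.length; i++) {
    offset += lines[i].length + 1; // +1 for the "\n" removed by split()
  }
  return offset + pos.character;
}

// Hypothetical helper: return the identifier-like word touching the offset, if any.
function wordAt(text: string, offset: number): string | undefined {
  const isWordChar = (ch: string | undefined): boolean =>
    ch !== undefined && /\w/.test(ch);
  let start = offset;
  let end = offset;
  while (start > 0 && isWordChar(text[start - 1])) start--;
  while (end < text.length && isWordChar(text[end])) end++;
  return start === end ? undefined : text.slice(start, end);
}

// Build the hover text for the token under the cursor, or undefined if there is none.
function describeToken(text: string, pos: TextPosition): string | undefined {
  const word = wordAt(text, toOffset(text, pos));
  return word === undefined ? undefined : `Hover Demo: ${word}`;
}
```

Inside connection.onHover, the text would presumably come from documents.get(params.textDocument.uri) and the position from params.position, and the returned string would go into contents.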

Code formatting

Code formatting is a particularly useful feature that lets users beautify their code quickly and automatically:

To implement code formatting, you first need to declare that the plug-in supports the documentFormattingProvider feature:

{
    ...
    capabilities : {
        documentFormattingProvider: true,
        ...
    }
}

After that, listen to the onDocumentFormatting event:

connection.onDocumentFormatting(
  (params: DocumentFormattingParams): Promise<TextEdit[]> => {
    const { textDocument } = params;
    const doc = documents.get(textDocument.uri)!;
    const text = doc.getText();
    const pattern = /\b[A-Z]{3,}\b/g;
    let match;
    const res: TextEdit[] = [];
    // Find runs of consecutive uppercase letters
    while ((match = pattern.exec(text))) {
      res.push({
        range: {
          start: doc.positionAt(match.index),
          end: doc.positionAt(match.index + match[0].length),
        },
        // Keep the first letter and lowercase the rest (camel-case style)
        newText: match[0].replace(/(?<=[A-Z])[A-Z]+/, (r) => r.toLowerCase()),
      });
    }

    return Promise.resolve(res);
  }
);

In the sample code, the callback rewrites every run of three or more consecutive uppercase letters into a camel-case style string, keeping the first letter and lowercasing the rest, so HELLO becomes Hello.
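To check what edits a formatter like this produces without launching the editor, the matching logic can be pulled into a pure function and the edits applied to a plain string. This is a sketch under simplifying assumptions: it uses string offsets instead of LSP ranges, and OffsetEdit, computeEdits, and applyEdits are hypothetical names rather than anything provided by the sample project or the library.

```typescript
// Simplified edit: offsets into the string instead of line/character ranges.
interface OffsetEdit {
  start: number;
  end: number;
  newText: string;
}

// Find runs of three or more uppercase letters and propose a replacement
// that keeps the first letter and lowercases the rest.
function computeEdits(text: string): OffsetEdit[] {
  const pattern = /\b[A-Z]{3,}\b/g;
  const edits: OffsetEdit[] = [];
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(text)) !== null) {
    const word = match[0];
    edits.push({
      start: match.index,
      end: match.index + word.length,
      newText: word.charAt(0) + word.slice(1).toLowerCase(),
    });
  }
  return edits;
}

// Apply edits from the end of the text backwards so earlier offsets stay valid.
function applyEdits(text: string, edits: OffsetEdit[]): string {
  const sorted = edits.slice().sort((a, b) => b.start - a.start);
  let result = text;
  for (const edit of sorted) {
    result = result.slice(0, edit.start) + edit.newText + result.slice(edit.end);
  }
  return result;
}
```

Applying the edits back to front is a deliberate choice: replacing a later span first never shifts the offsets of spans that come before it.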

Function signature

The function signature feature is triggered when the user types a function call. VSCode then displays help information for the function based on what the Language Server returns.

To implement function signature help, you first need to declare that the plug-in supports the signatureHelpProvider feature:

{
    ...
    capabilities : {
        signatureHelpProvider: {
            triggerCharacters: ["("],
        },
        ...
    }
}

After that, listen to the onSignatureHelp event:

connection.onSignatureHelp(
  (params: SignatureHelpParams): Promise<SignatureHelp> => {
    return Promise.resolve({
      signatures: [
        {
          label: "Signature Demo",
          documentation: "Help documentation",
          parameters: [
            {
              label: "@p1 first param",
              documentation: "Parameter description",
            },
          ],
        },
      ],
      activeSignature: 0,
      activeParameter: 0,
    });
  }
);

The result: typing ( after a function name brings up the signature help popup.
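The demo hard-codes activeParameter to 0. In practice this value is usually derived from the cursor position. One simple approach, shown here only as a sketch, is to walk backwards from the cursor and count the commas that belong to the innermost open call. The function name activeParameterAt is hypothetical, and the sketch deliberately ignores string literals and comments.

```typescript
// Returns the zero-based index of the argument the cursor is in,
// or undefined when the cursor is not directly inside a call's parentheses.
function activeParameterAt(text: string, offset: number): number | undefined {
  let depth = 0; // nesting depth of groups closed to the left of the cursor
  let commas = 0; // commas seen at depth 0, i.e. belonging to the current call
  for (let i = offset - 1; i >= 0; i--) {
    const ch = text[i];
    if (ch === ")" || ch === "]" || ch === "}") {
      depth++; // walking backwards, a closing bracket opens a nested group
    } else if (ch === "(" || ch === "[" || ch === "{") {
      if (depth === 0) {
        // This bracket encloses the cursor; only "(" counts as a call.
        return ch === "(" ? commas : undefined;
      }
      depth--;
    } else if (ch === "," && depth === 0) {
      commas++;
    }
  }
  return undefined;
}
```

For the text `add(1, max(2, 3), ` with the cursor at the end, the nested max(...) call is skipped and the function returns 2, the third argument of add.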

Error message

Note that the implementation of error diagnostics differs slightly from the event + response pattern above:

First, there is no need to declare anything extra in capabilities
The event to listen for is documents.onDidChangeContent, not an event on connection
Instead of returning the errors with a return statement in the callback, call connection.sendDiagnostics to send them

Full example:

// Incremental error diagnostics
documents.onDidChangeContent((change) => {
  const textDocument = change.document;

  // The validator creates diagnostics for all uppercase words length 2 and more
  const text = textDocument.getText();
  const pattern = /\b[A-Z]{2,}\b/g;
  let m: RegExpExecArray | null;

  let problems = 0;
  const diagnostics: Diagnostic[] = [];
  while ((m = pattern.exec(text))) {
    problems++;
    const diagnostic: Diagnostic = {
      severity: DiagnosticSeverity.Warning,
      range: {
        start: textDocument.positionAt(m.index),
        end: textDocument.positionAt(m.index + m[0].length),
      },
      message: `${m[0]} is all uppercase.`,
      source: "Diagnostics Demo",
    };
    diagnostics.push(diagnostic);
  }

  // Send the computed diagnostics to VSCode.
  connection.sendDiagnostics({ uri: textDocument.uri, diagnostics });
});

This logic checks whether the code contains runs of consecutive uppercase letters and sends the corresponding diagnostics through sendDiagnostics.
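Diagnostic ranges are expressed as zero-based line/character pairs, which is why the sample calls positionAt on the document. To make that conversion concrete, here is a minimal standalone sketch of the same validation on a plain string. The positionAt below is a hypothetical reimplementation that only handles "\n" line endings, not the library's own method, and SimpleDiagnostic is a reduced stand-in for the real Diagnostic type.

```typescript
interface TextPosition {
  line: number;
  character: number;
}

// Reduced stand-in for an LSP diagnostic: just a range and a message.
interface SimpleDiagnostic {
  start: TextPosition;
  end: TextPosition;
  message: string;
}

// Convert a string offset into a zero-based line/character position.
function positionAt(text: string, offset: number): TextPosition {
  let line = 0;
  let lineStart = 0;
  for (let i = 0; i < offset && i < text.length; i++) {
    if (text[i] === "\n") {
      line++;
      lineStart = i + 1;
    }
  }
  return { line, character: offset - lineStart };
}

// Flag every all-uppercase word of length two or more, as in the sample.
function validate(text: string): SimpleDiagnostic[] {
  const pattern = /\b[A-Z]{2,}\b/g;
  const result: SimpleDiagnostic[] = [];
  let m: RegExpExecArray | null;
  while ((m = pattern.exec(text)) !== null) {
    result.push({
      start: positionAt(text, m.index),
      end: positionAt(text, m.index + m[0].length),
      message: `${m[0]} is all uppercase.`,
    });
  }
  return result;
}
```

Keeping the validation in a pure function like this also makes it easy to unit test separately from the onDidChangeContent wiring.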

How to find events and response bodies

The examples above deliberately skip most implementation details and focus on the basic skeleton and the inputs and outputs of each language feature. It is better to teach someone to fish than to give them a fish, so let's spend a little time on where to find these interfaces, parameters, and response bodies. There are two very important links:

https://zjsms.com/egWtqPj/ , the official VSCode documentation on programmatic language features
https://zjsms.com/egWVTPg/ , the official website of the LSP protocol

These two pages describe in detail all the language features VSCode supports. You can find a conceptual description of the feature you want to implement there, such as code completion:

The material is a bit complicated and very detailed, but it is worth reading patiently so that you have a high-level understanding of what you are about to build.

In addition, things become easier if you write the Language Server in TypeScript. The vscode-languageserver package provides very complete TypeScript type definitions, so we can rely on VSCode's TypeScript code hints to find the listener function we need:

After that, find the type definitions of the parameters and results from the function signature:

With those in hand, you can process the parameters according to the type definition and return data of the matching structure.

Deep understanding of LSP

After reading the examples, let's step back and look at LSP itself. LSP, the Language Server Protocol, is essentially an inter-process communication protocol based on JSON-RPC. It covers two major areas:

The communication model between client and server: who sends what to whom, when, and in what format, and how the receiver returns a response
The body of the messages: which format, fields, and values are used to express state

As an analogy, HTTP describes how hypermedia documents are transmitted and interpreted between network nodes, while LSP describes how user actions in the IDE and the responses to them are communicated and structured.
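To make the JSON-RPC part concrete: on the wire, each LSP message is a JSON-RPC 2.0 object preceded by a Content-Length header and a blank line. The sketch below frames and parses a hover request by hand. It is for illustration only, since vscode-languageserver normally handles this for you; frame, unframe, and the file URI are assumptions made up for the example.

```typescript
interface RequestMessage {
  jsonrpc: "2.0";
  id: number;
  method: string;
  params: unknown;
}

// Frame a message as "Content-Length: <bytes>\r\n\r\n<json>".
function frame(message: RequestMessage): string {
  const body = JSON.stringify(message);
  // The header counts UTF-8 bytes, not characters.
  const byteLength = new TextEncoder().encode(body).length;
  return `Content-Length: ${byteLength}\r\n\r\n${body}`;
}

// Parse one framed message back into an object.
// Assumes the input holds exactly one complete message.
function unframe(raw: string): RequestMessage {
  const separator = raw.indexOf("\r\n\r\n");
  if (separator === -1) {
    throw new Error("missing header separator");
  }
  return JSON.parse(raw.slice(separator + 4)) as RequestMessage;
}

// A hover request as the editor might send it when the mouse rests on a token.
const hoverRequest: RequestMessage = {
  jsonrpc: "2.0",
  id: 1,
  method: "textDocument/hover",
  params: {
    textDocument: { uri: "file:///tmp/example.txt" }, // hypothetical URI
    position: { line: 0, character: 4 },
  },
};
```

The server's reply carries the same id, which is how the client matches a response to the request that caused it.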

To sum up, the workflow of the LSP architecture is as follows:

Editors such as VSCode track, compute, and manage a model of user behavior. When a specific sequence of actions occurs, the editor sends the action and its context parameters to the Language Server using the communication method defined by LSP
The Language Server asynchronously returns a response based on these parameters
The editor then renders interactive feedback based on the response

Put simply, the editor handles direct interaction with the user, and the Language Server quietly computes how to respond behind the scenes. The two are separated and decoupled at process granularity; each does its own job, and they cooperate within the LSP framework. It is much like the web applications we usually build, where the front end interacts with users and the back end manages the invisible parts such as permissions, business data, and business state transitions.
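The three workflow steps above can be simulated inside a single process: a "server" registers handlers by method name, an "editor" sends a request, and the response comes back asynchronously with the same id. This is only a toy model with no real transport or process boundary; MiniServer and the Rpc* types are hypothetical names. The error code -32601 is the standard JSON-RPC "Method not found" code.

```typescript
interface RpcRequest {
  jsonrpc: "2.0";
  id: number;
  method: string;
  params?: unknown;
}

interface RpcResponse {
  jsonrpc: "2.0";
  id: number;
  result?: unknown;
  error?: { code: number; message: string };
}

type Handler = (params: unknown) => unknown | Promise<unknown>;

// Toy stand-in for a Language Server: a registry of method handlers.
class MiniServer {
  private handlers = new Map<string, Handler>();

  // Comparable in spirit to connection.onHover and friends.
  on(method: string, handler: Handler): void {
    this.handlers.set(method, handler);
  }

  // Route a request to its handler and wrap the outcome in a response.
  async dispatch(request: RpcRequest): Promise<RpcResponse> {
    const handler = this.handlers.get(request.method);
    if (!handler) {
      return {
        jsonrpc: "2.0",
        id: request.id,
        error: { code: -32601, message: "Method not found" },
      };
    }
    const result = await handler(request.params);
    return { jsonrpc: "2.0", id: request.id, result };
  }
}

const server = new MiniServer();
server.on("textDocument/hover", () => ({ contents: ["Hover Demo"] }));
```

Calling server.dispatch with method "textDocument/hover" resolves to a response whose result is the hover payload, which the editor side would then render.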

At the time of writing, the LSP protocol has reached version 3.16 and covers most language features, including:

Code completion
Code highlighting
Go to definition
Type inference
Error detection
And many more

Thanks to LSP's clear design, these language features all follow a very similar development pattern, and the learning curve is gentle. During development you basically only need to care about which function to listen on and what structure to return. Once you have mastered the few examples above, it is easy to get started with the rest.

In the past, language support was either built into the IDE or implemented as isomorphic plug-ins, that is, plug-ins written in the IDE's own language and running in its environment. In VSCode, this isomorphic extension capability is exposed through the Language API and the Semantic Tokens Provider interface, both introduced in the previous article, "You Don't Know the VSCode Code Highlighting Principle". Although this architecture is simple and easy to understand, it has some obvious drawbacks:

Plug-in developers must reuse VSCode's own development language and environment. For example, a Python language plug-in must be written in JavaScript
The same programming language needs a similar plug-in developed separately for each IDE, which duplicates effort

The biggest advantage of LSP is that it isolates the IDE client from the server that actually computes the interactive features. The same Language Server can be reused by multiple different Language Clients.

In addition, under LSP the client and server run in separate processes, which also brings performance benefits:

The UI process is not blocked
In a Node environment, multiple CPU cores can be fully used
Since the Language Server's technology stack is no longer restricted, developers can choose a higher-performance language, such as Go

All in all, it is a very capable design.

Summary

This article introduces the basic skills needed to develop an LSP-based language plug-in for VSCode. Real projects usually combine it with another technique, embedded languages (Embedded Languages Server), which enables more complex multi-language support. If anyone is interested, we can talk about it next week.

范文杰
