Koa2 builds a signaling server, and JS can also handle video calls!

Hello everyone, my name is Yang Chenggong.

The previous article introduced what WebRTC is, walked through the steps of its communication process, and built a demo of local communication. It ended with the idea behind a one-to-many implementation.

In that article, we used a signaling server to transmit SDP when the two ends communicated over a LAN. We did not cover the signaling server in detail at the time and simply used two variables to simulate the connection.

In practice, a signaling server is essentially a WebSocket server, and both clients must establish a WebSocket connection to it in order to send messages to each other.

However, the role of the signaling server is not only to send SDP. In multi-terminal communication we usually talk to one specific person or to a few people, so all connections need to be grouped. This is the concept of a "room" in audio and video communication. Another job of the signaling server is to maintain the binding between client connections and rooms.

So in this article we will implement a signaling server together, based on the Koa2 framework for Node.js.

Outline preview

The content presented in this article includes the following aspects:

Let's talk signaling again
koa meets ws
How to maintain connection objects?
Initiator implementation
Receiver implementation
Ready, the messengers are running!
Join the study group

Let's talk signaling again

In the last article, we mentioned that two clients in a local area network need to exchange information several times to establish a WebRTC peer-to-peer connection. Each end sends its messages actively while the other end listens for them, so WebSocket is the natural choice.

The process of remotely exchanging SDP over WebSocket is called signaling.

In fact, WebRTC does not specify how signaling should be implemented. That is, signaling is not part of the WebRTC communication specification. For example, if two RTCPeerConnection instances communicate within one page, the whole connection process needs no signaling, because both SDPs are defined on the same page and we can read the variables directly.

Only when the peers live in different clients do both parties need a way to obtain each other's SDP, and that is where signaling comes in.

koa meets ws

We use Node.js to build the signaling server. It has two key parts:

Framework: Koa2
Module: ws

Node.js development calls for a suitable framework. I have always used Express, but this time I will try Koa2. The two are not very different: some APIs and npm packages differ, but the basic structure is almost the same.

The ws module is a simple, pure WebSocket implementation that includes both a client and a server. In my earlier article, Front-end Architect Breaking Skills, NodeJS Landed WebSocket Practice, I introduced how to use the ws module and how to integrate it with the Express framework in detail. If you are not familiar with the ws module, read that article first.

Here we go straight to building the Koa2 structure and bringing in the ws module.

koa project structure construction

The first is to initialize the project and install:

 $ npm init && yarn add koa ws

Once this finishes, a package.json file has been generated. Then add three folders at the same directory level:

routers : stores the separate routing files
utils : stores utility functions
config : stores the configuration file

Finally, write the most important file, the entry file app.js. Its basic structure is as follows:

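 const Koa = require('koa')
const app = new Koa()

// respond to every request with a fixed body
app.use(ctx => {
  ctx.body = 'Hello World'
})

// server is not defined yet at this stage, so listen on the app itself
app.listen(9800, () => {
  console.log(`listen to http://localhost:9800`)
})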

See, it is basically the same as Express. After instantiation, set a route, listen on a port, and a simple web server is running.

The big difference worth mentioning is in their middleware functions. A middleware function is the callback passed to app.use (or to app.get in Express).

The parameters of a middleware function carry two key pieces of information, the request and the response. Express represents them with two separate parameters, while Koa combines the two objects into a single parameter.

Express is represented as follows:

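 app.get('/test', (req, res, next) => {
  // req is the request object, used to read request information
  // res is the response object, used to send the response
  // next hands control to the next middleware
  let { query } = req
  res.status(200).send(query)
})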

And koa is like this:

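 // Koa core has no app.get; without a router package, match the path inside app.use
app.use((ctx, next) => {
  // ctx.request is the request object, used to read request information
  // ctx.response is the response object, used to send the response
  // next hands control to the next middleware
  if (ctx.path !== '/test') return next()
  let { query } = ctx
  ctx.status = 200
  ctx.body = query
})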

Although ctx.request represents the request object and ctx.response represents the response object, Koa also exposes some commonly used attributes directly on ctx. For example, ctx.body is the response body. So how do you get the request body? You have to use ctx.request.body, while the URL parameters are available as ctx.query. In short, this can feel confusing, and some people still prefer the design of Express.
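To make the aliasing easier to picture, here is a simplified model of how a context object could forward a few attributes to its request and response. This is only a sketch and not Koa's actual source; createContext and the plain objects passed to it are hypothetical stand-ins.

```javascript
// Simplified model of attribute delegation; NOT Koa's real implementation.
function createContext(request, response) {
  return {
    request,
    response,
    // read-only shortcut to the parsed URL parameters of the request
    get query() { return request.query },
    // ctx.body reads and writes the RESPONSE body, never the request body
    get body() { return response.body },
    set body(value) { response.body = value },
    // ctx.status reads and writes the response status code
    get status() { return response.status },
    set status(code) { response.status = code }
  }
}

// Hypothetical request/response objects standing in for the real ones
const ctx = createContext(
  { query: { id: '1' }, body: { name: 'from client' } },
  { status: 404, body: undefined }
)

ctx.status = 200
ctx.body = ctx.query

console.log(ctx.response.body) // { id: '1' }
console.log(ctx.request.body) // { name: 'from client' }
```

The last two lines show the asymmetry described above: writing ctx.body changes the response, while the request body stays reachable only through ctx.request.body.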

With the basic structure in place, we still have two things to do:

Cross-domain processing
Request body parsing

Cross-domain handling needs no explanation for anyone who does front-end work. Request body parsing is needed because Node.js receives the request body as a stream, so it cannot be read directly. It has to be processed separately, after which it can conveniently be read from ctx.request.body.
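To see why a parser is needed, here is a minimal sketch of collecting a streamed body by hand, which is roughly the kind of work a body parser takes off our hands. The helper name readJsonBody is hypothetical, and the sketch assumes the body is JSON and that req emits 'data', 'end' and 'error' events the way a Node.js request stream does.

```javascript
// Hypothetical helper: gather the chunks of a streamed request body,
// then pass the parsed JSON (or an error) to a Node-style callback.
function readJsonBody(req, callback) {
  const chunks = []
  // the body arrives piece by piece, so every chunk has to be kept
  req.on('data', chunk => chunks.push(Buffer.from(chunk)))
  req.on('end', () => {
    const text = Buffer.concat(chunks).toString('utf8')
    let parsed
    try {
      // an empty body is treated as an empty object
      parsed = text ? JSON.parse(text) : {}
    } catch (err) {
      return callback(err)
    }
    callback(null, parsed)
  })
  req.on('error', err => callback(err))
}
```

With a parser middleware installed, none of this has to be written in each route; the parsed result is simply waiting on ctx.request.body.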

First install two npm packages:

 $ yarn add @koa/cors koa-bodyparser

Then configure it in app.js:

 const cors = require('@koa/cors')
const bodyParser = require('koa-bodyparser')

app.use(cors())
app.use(bodyParser())

ws module integration

Essentially, WebSocket and HTTP are two separate services. Although they are integrated in one Koa application, they are actually independent of each other.

Because they live in the same Koa application, we want WebSocket and HTTP to share one port, so that starting, stopping, and restarting only has to be controlled in one place.

To share the port, first make some changes to the entry file app.js above:

 const http = require('http')
const Koa = require('koa')

const app = new Koa()
// wrap the Koa request handler in a plain HTTP server so ws can attach to it
const server = http.createServer(app.callback())

server.listen(9800, () => {
  console.log(`listen to http://localhost:9800`)
}) // this used to be app.listen

Then we create a new ws.js in the utils directory:

 // utils/ws.js
const WebSocketApi = (wss, app) => {
  wss.on('connection', (ws, req) => {
    console.log('Connected')
  })
}

module.exports = WebSocketApi

Then import this file into app.js and add the following code:

 // app.js
const WebSocket = require('ws')
const WebSocketApi = require('./utils/ws')

const server = http.createServer(app.callback())
const wss = new WebSocket.Server({ server })

WebSocketApi(wss, app)

At this point, re-run node app.js, then open the browser console and write a line of code:

 var ws = new WebSocket('ws://localhost:9800')

Under normal circumstances, inspecting ws in the console once the connection has opened shows readyState=1, which indicates that the WebSocket connection succeeded.

How to maintain connection objects?

In the previous step, we integrated the ws module and tested the connection successfully. The WebSocket logic lives in the WebSocketApi function, so let's keep looking at it.

 // utils/ws.js
const WebSocketApi = (wss, app) => {
  wss.on('connection', (ws, req) => {
    console.log('Connected')
  })
}

The function accepts two parameters: wss is the WebSocket server instance, and app is the Koa application instance. You may ask what app is for. Its role is very simple: setting global variables.

The main job of the signaling server is to find the two connected parties and pass data between them. When many clients are connected to the server, we need to find the two parties that are talking to each other among them, so every client connection has to be identified and classified.

In the callback that listens for the connection event in the code above, the first parameter ws represents a connected client. ws is a WebSocket instance object, and calling ws.send() sends a message to that client.

 ws.send('hello') // send a message
wss.clients // all ws connection instances

Identifying a ws is as simple as adding some attributes to tell connections apart, for example user_id, room_id, and so on. These identifiers can be passed as parameters when the client connects and then read from the req parameter in the code above.

Once the identifiers are set, store the ws client so that it can be found by them later.

But how should it be stored? In other words, how do we maintain the connection objects? This question deserves careful thought. A WebSocket connection object lives in memory and represents a live, open connection to the client. So we need to keep the ws objects in memory, and one way is to put them in the global variables of the Koa application, which is the purpose of the app parameter mentioned earlier.

The global variables of the Koa application are added under app.context , so we create two global variables with "initiator" and "receiver" as groups:

cusSender : Array, save all the ws objects of the initiator
cusReader : Array, save all the ws objects of the receiver

Then get these two variables and request parameters separately:

 // utils/ws.js
const WebSocketApi = (wss, app) => {
  wss.on('connection', (ws, req) => {
    let { url } = req // the request parameters are parsed from the url
    let { cusSender, cusReader } = app.context
    console.log('Connected')
  })
}

The request parameters are parsed from the url. cusSender and cusReader are two arrays that hold ws instances, and all subsequent connection lookups and state maintenance operate on these two arrays.

Initiator implementation

The initiator refers to the end that initiates the connection. The initiator needs to carry two parameters when connecting to the WebSocket:

rule : role
roomid : room id

The role of the initiator is fixed as sender and only serves to mark this WebSocket as an initiator. roomid is the unique ID of the current connection. In one-to-one communication it can be the current user ID; in one-to-many communication there is a concept similar to a live room, and roomid is the ID of that room.

First, on the client side, the URL that initiates the connection is as follows:

 var rule = 'sender',
  roomid = '354682913546354'
var socket_url = `ws://localhost:9800/webrtc/${rule}/${roomid}`
var socket = new WebSocket(socket_url)

Here we add the url prefix /webrtc to mark WebSocket connections that belong to WebRTC, and we put the parameters directly into the url, because the browser WebSocket API does not support custom request headers, so parameters can only be carried in the url.
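On the server, that url has to be taken apart again. The following is a small sketch of one way to do it, assuming the /webrtc/$role/$id layout used in this article; the helper name parseConnectionUrl is hypothetical, and ignoring a query string is an extra precaution that the article's own code does not take.

```javascript
// Hypothetical helper: split "/webrtc/$role/$id" into its parts.
// Returns null when the prefix is wrong or a part is missing.
function parseConnectionUrl(url) {
  // drop any query string before looking at the path segments
  const [path] = url.split('?')
  const [prefix, role, uniId] = path.split('/').filter(Boolean)
  if (prefix !== 'webrtc' || !role || !uniId) return null
  return { role, uniId }
}

console.log(parseConnectionUrl('/webrtc/sender/354682913546354'))
// { role: 'sender', uniId: '354682913546354' }
console.log(parseConnectionUrl('/other/sender/1'))
// null
```

Returning null for anything unexpected lets the connection handler close bad connections in a single place.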

The server-side code that handles a sender connection is as follows:

 wss.on('connection', (ws, req) => {
  let { url } = req // url looks like /webrtc/$role/$uniId
  let { cusSender, cusReader } = app.context
  if (!url.startsWith('/webrtc')) {
    return ws.close() // close connections whose url prefix is not /webrtc
  }
  let [_, role, uniId] = url.slice(1).split('/')
  if (!uniId) {
    console.log('Missing parameters')
    return ws.close()
  }
  console.log('Connected clients:', wss.clients.size)
  // the initiator is connecting
  if (role == 'sender') {
    // here uniId is the roomid
    ws.roomid = uniId
    let index = (cusSender = cusSender || []).findIndex(
      row => row.roomid == ws.roomid
    )
    // update the entry if this sender already exists, otherwise add it
    if (index >= 0) {
      cusSender[index] = ws
    } else {
      cusSender.push(ws)
    }
    app.context.cusSender = [...cusSender]
  }
})

In the code above, the role parsed from the url tells us that the connection belongs to a sender. We bind the roomid to the ws instance and then either replace the existing entry in cusSender or append a new one, so that a client connecting several times (for example after a page refresh) is not added twice.
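The update-or-add step can be pulled out into a small function of its own. The sketch below uses plain objects in place of real ws instances; upsertConnection is a hypothetical name and not part of the article's code.

```javascript
// Hypothetical helper: replace the entry that has the same key value,
// or append the connection when no such entry exists yet.
function upsertConnection(list, conn, key) {
  const index = list.findIndex(item => item[key] === conn[key])
  if (index >= 0) {
    list[index] = conn // same room or user reconnected: keep only the new socket
  } else {
    list.push(conn)
  }
  return list
}

// Plain objects stand in for ws instances here
const senders = []
upsertConnection(senders, { roomid: 'room1', tab: 'first' }, 'roomid')
upsertConnection(senders, { roomid: 'room1', tab: 'refreshed' }, 'roomid')
console.log(senders.length) // 1
console.log(senders[0].tab) // refreshed
```

The same function would serve the receiver side by passing 'userid' as the key.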

That is the logic for an initiator connecting. We also have to handle the case where the connection is closed and the ws instance needs to be removed:

 wss.on('connection', (ws, req) => {
  // ... omitted: role is parsed from req.url as shown above
  ws.on('close', () => {
    if (role == 'sender') {
      // remove the initiator; the guard avoids splice(-1, 1) deleting the wrong entry
      let index = (app.context.cusSender || []).findIndex(row => row == ws)
      if (index >= 0) {
        app.context.cusSender.splice(index, 1)
      }
      // unbind the receivers in this room
      if (app.context.cusReader && app.context.cusReader.length > 0) {
        app.context.cusReader
          .filter(row => row.roomid == ws.roomid)
          .forEach(row => {
            row.roomid = null
            row.send('leaveline')
          })
      }
    }
  })
})
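The cleanup can also be written as a standalone function, which makes it easy to try out with stand-in sockets. This is only a sketch: releaseSender is a hypothetical name, it assumes each receiver's roomid was set when it joined a room, and unlike the handler above it ignores a socket that is no longer in the list (for example one already replaced after a page refresh).

```javascript
// Hypothetical helper: remove a closed sender and tell its viewers to leave.
// Returns the number of receivers that were notified.
function releaseSender(senders, readers, ws) {
  const index = senders.indexOf(ws)
  // a stale socket that was already replaced is not in the list: do nothing
  if (index < 0) return 0
  senders.splice(index, 1)
  let notified = 0
  for (const reader of readers) {
    if (reader.roomid === ws.roomid) {
      reader.roomid = null // unbind the receiver from the closed room
      reader.send('leaveline')
      notified++
    }
  }
  return notified
}

// Stand-in sockets that record what they are sent
const makeSocket = props => ({ ...props, sent: [], send(msg) { this.sent.push(msg) } })
const host = makeSocket({ roomid: 'room1' })
const viewer = makeSocket({ userid: 'u1', roomid: 'room1' })
const stranger = makeSocket({ userid: 'u2', roomid: 'room2' })

console.log(releaseSender([host], [viewer, stranger], host)) // 1
console.log(viewer.sent) // [ 'leaveline' ]
console.log(stranger.sent) // []
```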

Receiver implementation

The receiver refers to the client that receives the media stream from the initiator and plays it. The receiver needs to carry two parameters when connecting to the WebSocket:

rule : role
userid : user id

The role parameter serves the same purpose as on the initiator side, and its value is fixed as reader. The connecting end can be regarded as a user, so when initiating a connection it passes the current user's userid as a unique identifier to bind to the connection.

On the client side, the URL for the receiver to connect to is as follows:

 var rule = 'reader',
  userid = '6143e8603246123ce2e7b687'
var socket_url = `ws://localhost:9800/webrtc/${rule}/${userid}`
var socket = new WebSocket(socket_url)

The server-side code that handles a reader connection is as follows:

 wss.on('connection', (ws, req) => {
  // ... omitted
  if (role == 'reader') {
    // the receiver is connecting
    ws.userid = uniId
    let index = (cusReader = cusReader || []).findIndex(
      row => row.userid == ws.userid
    )
    // update the entry if this receiver already exists, otherwise add it
    if (index >= 0) {
      cusReader[index] = ws
    } else {
      cusReader.push(ws)
    }
    app.context.cusReader = [...cusReader]
  }
})

Here, the update logic of cusReader is the same as that of cusSender above, which ensures that the array only holds instances that are currently connected. The same cleanup is needed when the connection closes:

 wss.on('connection', (ws, req) => {
  // ... omitted
  ws.on('close', () => {
    if (role == 'reader') {
      // receiver close logic
      let index = app.context.cusReader.findIndex(row => row == ws)
      if (index >= 0) {
        app.context.cusReader.splice(index, 1)
      }
    }
  })
})

Ready, the messengers are running!

In the previous two steps, we bound identifying information to each client WebSocket instance and maintained the list of connected instances. Now we can receive a message from one client and pass it on to the target client, so our "messenger" can start running with the information!

For the client-side details, see the sections on communication between two ends of a LAN and on one-to-many communication in the previous article. Here we go through the complete communication logic again.

First, the initiator peerA and the receiver peerB both connect to the signaling server:

 // peerA
var socketA = new WebSocket('ws://localhost:9800/webrtc/sender/xxxxxxxxxx')
// peerB
var socketB = new WebSocket('ws://localhost:9800/webrtc/reader/xxxxxxxxxx')

Then the server listens for incoming messages and defines a method, eventHandel, to handle the forwarding logic:

 wss.on('connection', (ws, req) => {
  // ... omitted: role is parsed from req.url as shown earlier
  ws.on('message', msg => {
    // the message may not arrive as a string, so convert it first
    if (typeof msg != 'string') {
      msg = msg.toString()
    }
    let { cusSender, cusReader } = app.context
    eventHandel(msg, ws, role, cusSender, cusReader)
  })
})

At this point, peerA has acquired the video stream, stored it in the localStream variable, and started live broadcasting. Let's start to sort out the steps for connecting the peerB end to it.

Step 1 : Client peerB enters the live room and sends a message to join the connection:

 // peerB
var roomid = 'xxx'
socketB.send(`join|${roomid}`)

Note that a WebSocket message cannot carry a plain object directly. Convert all required parameters into one string, with the parts separated by |.
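The string format can be wrapped in a pair of helpers. The following sketch assumes the type|target|payload layout used throughout this article; encodeMessage and decodeMessage are hypothetical names. Unlike a plain split('|'), the decoder only cuts at the first two separators, so a payload that happens to contain | is kept whole.

```javascript
// Hypothetical helpers for the "type|target|payload" message format.
function encodeMessage(type, target, payload = '') {
  return payload === '' ? `${type}|${target}` : `${type}|${target}|${payload}`
}

function decodeMessage(raw) {
  const first = raw.indexOf('|')
  if (first < 0) return { type: raw, target: '', payload: '' }
  const second = raw.indexOf('|', first + 1)
  if (second < 0) {
    return { type: raw.slice(0, first), target: raw.slice(first + 1), payload: '' }
  }
  return {
    type: raw.slice(0, first),
    target: raw.slice(first + 1, second),
    // everything after the second separator belongs to the payload
    payload: raw.slice(second + 1)
  }
}

console.log(encodeMessage('join', 'xxx')) // join|xxx
console.log(decodeMessage('offer|user1|v=0')) // { type: 'offer', target: 'user1', payload: 'v=0' }
```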

Then the signaling server listens for the message sent by peerB, finds peerA, and forwards the join request together with peerB's userid:

 const eventHandel = (message, ws, role, cusSender, cusReader) => {
  if (role == 'reader') {
    let arrval = message.split('|')
    let [type, roomid] = arrval
    if (type == 'join') {
      // find the initiator that owns this room
      let sender = cusSender.find(row => row.roomid == roomid)
      if (sender) {
        sender.send(`${type}|${ws.userid}`)
      }
    }
  }
}

Step 2 : The initiator peerA listens to the join event, and then creates an offer and sends it to peerB :

 // peerA
socketA.onmessage = evt => {
  let string = evt.data
  let value = string.split('|')
  if (value[0] == 'join') {
    peerInit(value[1])
  }
}
var offer, peer
const peerInit = async usid => {
  // 1. create the connection
  peer = new RTCPeerConnection()
  // 2. add the video stream tracks
  localStream.getTracks().forEach(track => {
    peer.addTrack(track, localStream)
  })
  // 3. create the SDP
  offer = await peer.createOffer()
  // 4. send the SDP
  socketA.send(`offer|${usid}|${offer.sdp}`)
}

The server listens to the message from peerA, finds peerB, and sends the offer information:

 // ws.js
const eventHandel = (message, ws, role, cusSender, cusReader) => {
  if (role == 'sender') {
    let arrval = message.split('|')
    let [type, userid, val] = arrval
    // Note: type, userid and val are generic values; whatever is passed is forwarded to the reader unchanged
    // 'candid' is included so that the candidates sent in step 5 are forwarded as well
    if (type == 'offer' || type == 'candid') {
      let reader = cusReader.find(row => row.userid == userid)
      if (reader) {
        reader.send(`${type}|${ws.roomid}|${val}`)
      }
    }
  }
}

Step 3 : Client peerB listens to the offer event, and then creates an answer and sends it to peerA :

 // peerB
socketB.onmessage = evt => {
  let string = evt.data
  let value = string.split('|')
  if (value[0] == 'offer') {
    transMedia(value)
  }
}
var answer, peer
const transMedia = async arr => {
  let [_, roomid, sdp] = arr
  // rebuild the offer from the SDP text and set it as the remote description
  let offer = new RTCSessionDescription({ type: 'offer', sdp })
  peer = new RTCPeerConnection()
  await peer.setRemoteDescription(offer)
  // create the answer, set it locally, then send it back to the room
  answer = await peer.createAnswer()
  await peer.setLocalDescription(answer)
  socketB.send(`answer|${roomid}|${answer.sdp}`)
}

The server listens to the message sent by peerB, finds peerA, and sends the answer message:

 // ws.js
const eventHandel = (message, ws, role, cusSender, cusReader) => {
  if (role == 'reader') {
    let arrval = message.split('|')
    let [type, roomid, val] = arrval
    if (type == 'answer') {
      // find the initiator that owns this room and pass the answer on
      let sender = cusSender.find(row => row.roomid == roomid)
      if (sender) {
        sender.send(`${type}|${ws.userid}|${val}`)
      }
    }
  }
}

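Step 4 : The initiator peerA listens for the answer event, then sets its local description (the offer) and its remote description (the answer).

 // peerA
// In a real page, merge this branch into the socketA.onmessage handler from step 2;
// assigning onmessage a second time would replace that handler.
socketA.onmessage = async evt => {
  let string = evt.data
  let value = string.split('|')
  if (value[0] == 'answer') {
    let answer = new RTCSessionDescription({
      type: 'answer',
      sdp: value[2]
    })
    await peer.setLocalDescription(offer)
    await peer.setRemoteDescription(answer)
  }
}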

Step 5 : peerA listens for the icecandidate event and sends each candidate to peerB. This event starts firing once peer.setLocalDescription has been executed in the previous step:

 // peerA
// Register this inside peerInit, right after creating peer, so that usid is in scope
peer.onicecandidate = event => {
  if (event.candidate) {
    let candid = event.candidate.toJSON()
    socketA.send(`candid|${usid}|${JSON.stringify(candid)}`)
  }
}

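Then peerB listens for the message and adds the candidate:

 // peerB
// In a real page, merge this branch into the socketB.onmessage handler from step 3
socketB.onmessage = evt => {
  let string = evt.data
  let value = string.split('|')
  if (value[0] == 'candid') {
    // forwarded messages have the form type|roomid|payload, so the JSON is the third field
    let json = JSON.parse(value[2])
    let candid = new RTCIceCandidate(json)
    peer.addIceCandidate(candid)
  }
}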
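Looking back at the server side of steps 1 to 5, the separate forwarding fragments can be folded into one function. The sketch below is one possible consolidation and not the article's final code: routeMessage is a hypothetical name, forwarding candid messages is an assumption based on step 5, and the stand-in sockets only record what they are sent.

```javascript
// Hypothetical consolidated router for "type|target|payload" messages.
// Returns true when the message was forwarded, false otherwise.
function routeMessage(message, ws, role, cusSender = [], cusReader = []) {
  const [type, targetId, ...rest] = message.split('|')
  const val = rest.join('|') // keep any "|" that appears inside the payload

  // receiver -> initiator: join requests and answers, addressed by roomid
  if (role === 'reader' && (type === 'join' || type === 'answer')) {
    const sender = cusSender.find(row => row.roomid === targetId)
    if (!sender) return false
    sender.send(type === 'join' ? `join|${ws.userid}` : `answer|${ws.userid}|${val}`)
    return true
  }

  // initiator -> receiver: offers and ICE candidates, addressed by userid
  if (role === 'sender' && (type === 'offer' || type === 'candid')) {
    const reader = cusReader.find(row => row.userid === targetId)
    if (!reader) return false
    reader.send(`${type}|${ws.roomid}|${val}`)
    return true
  }
  return false
}

// Stand-in sockets that record what they are sent
const fakeSocket = props => ({ ...props, sent: [], send(msg) { this.sent.push(msg) } })
const peerA = fakeSocket({ roomid: 'room1' })
const peerB = fakeSocket({ userid: 'user1' })

routeMessage('join|room1', peerB, 'reader', [peerA], [peerB])
routeMessage('offer|user1|v=0', peerA, 'sender', [peerA], [peerB])
console.log(peerA.sent) // [ 'join|user1' ]
console.log(peerB.sent) // [ 'offer|room1|v=0' ]
```

Returning a boolean makes it easy to log or reject messages whose target is not connected, which the separate fragments silently ignore.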

Alright, that's it!

There is a lot of content in this article. I recommend following along and writing the code yourself; after that you will understand the process of one-to-many communication. Of course, this article has not yet implemented NAT hole punching. The signaling server can be deployed on a remote server, but the WebRTC clients must still be on the same local area network.

In the next article, the third in the WebRTC series, we will implement an ICE server.

I want to learn more

The source of this article is the official account: Programmer Success. It mainly shares technical knowledge about front-end engineering and architecture. You are welcome to follow the official account and click "Add Group" to join the learning team and learn together with everyone!
