Such a useful low-latency live broadcast solution, and NetEase Yunxin actually open sourced it?!

Live broadcasting needs no introduction.

Since streaming-media live broadcast technology emerged at the end of the last century, live broadcast has advanced in step with network infrastructure. In recent years, technologies such as AI, cloud computing, and audio and video have matured, and the "stay-at-home economy" spurred by the COVID-19 pandemic has given the live broadcast industry further momentum.

According to the "Statistical Report on Internet Development in China" released by the China Internet Network Information Center (CNNIC), by June 2021 the number of online live broadcast users in China had reached 637 million, and the live broadcast market was close to 300 billion yuan.

Live broadcast is a good thing, but live broadcast delay is not

Through webcasting, you can watch an intense sports event taking place on the other side of the ocean, take in the great rivers and mountains, sunrises and sunsets of the motherland without leaving home, and even join 60 million strangers as a "cloud supervisor" following the construction progress of Huoshenshan Hospital and cheering on the epidemic prevention and control effort.

Live broadcast is a good thing, but live broadcast delay is not.

Maybe you stayed up all night in an e-commerce live broadcast room, only to have the item snapped up by someone else during the flash-sale countdown because of the delay. Maybe you missed an important point during an online class because of the delay. Or, at the critical moment of a sports competition, the result was "spoiled" ahead of time because of the delay.

All of these frustrating experiences are caused by "delay".

Live broadcast experience destroyer: How does "delay" occur?

So, as the spoiler of the audio and video live broadcast experience, where does the delay come from?

Just as sound and light take time to be emitted, travel through a medium, and be received, audio and video data take time to be encoded, transmitted, and decoded. The live audio and video pipeline mainly consists of capture, preprocessing, encoding, data transmission, server transcoding, and decoding, and each stage adds delay.

Quantitative change leads to qualitative change. As these small delays spread across the stages accumulate, they add up to the overall delay of the live broadcast, which then shows up in the user experience.

Generally speaking, the delay of the entire link can be divided into three types: device-side processing delay, client-to-server transmission delay, and server-to-server transmission delay.

Take the logistics and transportation of express parcels as an analogy:

Device-side processing delay: Similar to packing a parcel before shipping and unpacking it after delivery. On the capture side, data has to be captured, encoded, packaged, and sent; on the playback side, it has to be received, decoded, rendered, and played. This delay depends heavily on hardware performance, the codec algorithm used, and the amount of audio and video data.
Client-to-server transmission delay: Similar to a courier collecting and delivering parcels. This delay usually depends on the physical distance between the client and the server, the network operators of the client and the server, and the client's network speed, load, and network type.
Server-to-server transmission delay: Similar to parcels moving between logistics distribution centers and transfer stations. Data is queued, transmitted, and forwarded between servers, and the delay depends on the transmission path chosen.

Of these three types, device-side delay depends heavily on hardware performance, while the sum of the latter two is the "end-to-end delay" that the industry focuses on optimizing.
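The way the per-stage delays add up can be sketched as a simple delay budget. This is only an illustration: the stage names follow the article, but the `StageDelay` type, the category labels, and every millisecond value are assumptions made up for the example, not measurements.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StageDelay:
    name: str
    category: str  # "device", "client_server", or "server_server"
    ms: float


# Hypothetical per-stage delays in milliseconds, for illustration only.
PIPELINE = [
    StageDelay("capture and preprocessing", "device", 30),
    StageDelay("encoding", "device", 50),
    StageDelay("uplink to the ingest server", "client_server", 40),
    StageDelay("relay between servers", "server_server", 120),
    StageDelay("downlink to the viewer", "client_server", 60),
    StageDelay("decoding and rendering", "device", 40),
]


def total_by_category(stages):
    """Sum the delay of all stages that share a category."""
    totals = {}
    for stage in stages:
        totals[stage.category] = totals.get(stage.category, 0) + stage.ms
    return totals


def transmission_delay(stages):
    """Client-to-server plus server-to-server delay."""
    totals = total_by_category(stages)
    return totals.get("client_server", 0) + totals.get("server_server", 0)


if __name__ == "__main__":
    for category, ms in total_by_category(PIPELINE).items():
        print(f"{category}: {ms} ms")
    print(f"transmission total: {transmission_delay(PIPELINE)} ms")
```

With these placeholder numbers, no single stage dominates, yet the transmission portion alone already exceeds 200 ms, which is why each stage has to be optimized separately.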

Low-latency live broadcast: the best solution for strong interactive scenarios

Most domestic CDN live broadcasts have a delay of 3~5 seconds. These scenarios mainly use the HTTP-FLV and RTMP protocols for transmission. Game live broadcast, for example, does not emphasize interactivity, and the key frame delay is usually 8~10 seconds; live broadcast of events has higher requirements for fluency, so the HLS protocol is generally chosen, and the delay exceeds 10 seconds.

With the arrival of "live broadcast for everyone", live broadcast formats and content keep innovating, and new models such as co-hosted (Lianmai) live broadcast, online classrooms, and e-commerce live broadcast continue to emerge. For these interaction-heavy scenarios, CDN live broadcast falls short: once the delay exceeds 1 second, the interaction may no longer work.

So, how does real-time audio and video perform?

It suits scenarios in which viewers and anchors interact frequently and real-time requirements are high. The content delay between the two can be kept within 300 ms, which genuinely recreates the experience of face-to-face communication. However, this solution remains complex in areas such as network optimization and echo cancellation. Most importantly, real-time audio and video solutions are billed by duration, which usually makes them expensive to run.

Latency and cost are two ends of the scale: the lower the latency, the higher the cost of the solution, and vice versa.

Low-latency live broadcast, the "top student" here, therefore offers a way to solve the problem: it strikes a balance between the two, delivering stronger live broadcast capabilities at an acceptable delay and giving users a better interactive experience. It meets the low-frequency interaction needs between anchor and audience, keeping the content delay between them at about 1 s, while supporting millions of concurrent viewers. Its cost falls between that of CDN live broadcast and real-time audio and video, so the overall cost is more controllable.

They are all low-latency live broadcasts. What makes NetEase Yunxin different?

Where there is demand, there are those who chase it. Having recognized the market's real demand for low-latency live streaming, the major cloud vendors, with their keen sense of smell, have successively launched their own short-latency, low-latency, and ultra-low-latency live streaming products.

Behind the complicated naming, what is the difference in technology?

There are three major indicators in the traditional live broadcast field: first screen time, delay, and stall rate. The difficulty of low-latency live broadcast technology is how to greatly reduce first screen time and delay while keeping the stall rate the same as or better than that of RTMP streaming technology, so as to bring users a better live broadcast experience.

If a worker wants to do a good job, he must first sharpen his tools. As the integrated communication cloud service expert under NetEase Zhiqi, NetEase Yunxin has drawn on its years of technical accumulation and experience in the CDN and RTC fields, combined with WebRTC standard media streaming technology, to deeply optimize first screen time, delay, and stall rate.

First screen time optimization

● GOP cache optimization for the first screen
Assume the GOP at the user's push-stream end is 5 seconds. In some cases, the pull-stream end has to wait nearly 5 seconds to receive the first I frame before the first screen can start rendering. This is unacceptable for highly interactive live broadcast scenarios.

NetEase Yunxin's solution is to cache GOPs in the media server, keeping the media packets of the most recent 1-2 GOPs on the server side. Once the media connection between the client and the media server is established, the server first sends the media packets in the GOP cache and then sends the current media data. After receiving these packets, the client needs to align the audio and video packets according to a certain strategy and then play back faster to catch up with the live stream.

In practice, attention should be paid to the size of the GOP cache, how it works together with the client's jitter buffer size, the alignment of audio and video in the GOP cache, and adaptation to the different GOP lengths used by different push-stream ends.
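The server-side caching idea described above can be sketched roughly as follows. This is a minimal illustration, not NetEase Yunxin's implementation: the `Frame` and `GopCache` names are hypothetical, only video frames are modeled, and audio alignment and jitter buffer coordination are left out.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class Frame:
    pts_ms: int        # presentation timestamp in milliseconds
    is_keyframe: bool  # True for an I frame that starts a new GOP


class GopCache:
    """Keeps the most recent `max_gops` GOPs (hypothetical helper)."""

    def __init__(self, max_gops=2):
        # A bounded deque drops the oldest GOP automatically when full.
        self._gops = deque(maxlen=max_gops)

    def push(self, frame):
        if frame.is_keyframe:
            self._gops.append([frame])    # a key frame opens a new GOP
        elif self._gops:
            self._gops[-1].append(frame)  # extend the current GOP
        # Frames seen before the first key frame cannot be decoded, so skip them.

    def snapshot(self):
        """Frames to send first to a newly connected viewer, oldest first."""
        return [frame for gop in self._gops for frame in gop]


if __name__ == "__main__":
    cache = GopCache(max_gops=2)
    # 12 frames at 40 ms spacing, with a key frame every 5 frames.
    for i in range(12):
        cache.push(Frame(pts_ms=i * 40, is_keyframe=(i % 5 == 0)))
    burst = cache.snapshot()
    print(len(burst), burst[0])
```

Because the snapshot always starts at a key frame, a new viewer can begin decoding immediately instead of waiting for the next I frame; the cost is that the viewer starts slightly behind live and has to catch up.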

● Pacer smooth sending

If the GOP set at the push-stream end is large, then once the pull-stream client's media connection is established, all of the cached GOP data is sent to the client in one burst, which may cause buffer overflow or other problems on the client side. This is where the server's Pacer comes in, smoothing out the sending.
In practice, attention should be paid to how the Pacer's frame rate works together with the client's frame rate.
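One simple way to picture smooth sending is to spread the cached frames over time instead of sending them back to back. The sketch below is an assumption-laden illustration: `pace_schedule` and its `speedup` parameter are hypothetical names, and a real pacer would also account for packet sizes and available bandwidth.

```python
def pace_schedule(num_frames, client_fps, speedup=2.0, start_ms=0.0):
    """Return the send time in ms for each cached frame.

    Frames are spaced at `speedup` times the client's frame rate, so the
    backlog drains faster than real time without arriving as one burst.
    """
    if num_frames < 0:
        raise ValueError("num_frames must not be negative")
    if client_fps <= 0 or speedup <= 0:
        raise ValueError("client_fps and speedup must be positive")
    interval_ms = 1000.0 / (client_fps * speedup)
    return [start_ms + i * interval_ms for i in range(num_frames)]


if __name__ == "__main__":
    # A 5-second GOP at 25 fps is 125 frames; send them at twice real time.
    schedule = pace_schedule(125, client_fps=25, speedup=2.0)
    print(f"first send at {schedule[0]} ms, last send at {schedule[-1]} ms")
```

Choosing the speedup is the trade-off the paragraph above hints at: too low and the viewer stays behind live for longer, too high and the client's buffers see something close to the original burst.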

Latency optimization

● WE-CAN global intelligent routing network

One technical reason the live broadcast industry has been able to flourish is that the cloud capabilities of CDN vendors have given it a big push. The CDN accelerates back-to-source fetching for edge nodes, and the edge nodes accelerate access for pull-stream clients.

To speed up back-to-source fetching, the back-to-source media service is placed as close as possible to the CDN's regional center node; to optimize client access performance, the pull-stream media server should likewise be as close as possible to the pull-stream client. The transfer from the back-to-source media service to the pull-stream service is therefore crucial.

WE-CAN takes on this responsibility. As a large-scale distributed transmission network developed by NetEase Yunxin, WE-CAN achieves fast and stable transmission between any two media servers in the world through intelligent scheduling of various resources. Compared with a traditional CDN, its transmission acceleration is faster, more stable, smarter, and broader in coverage.

Stall rate optimization

NetEase Yunxin supports standard WebRTC media stream access and adapts to a wide range of complex network conditions through in-depth optimization of QoS strategies such as GCC, ARQ, FEC, and RED. Even with 40% packet loss, the live broadcast still plays smoothly.
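To give a feel for one of the mechanisms named above, here is a minimal sketch of forward error correction using a single XOR parity packet per group. This is a textbook scheme chosen for illustration, not NetEase Yunxin's or WebRTC's actual FEC; the function names are hypothetical, and this simple variant can only repair one lost packet per group.

```python
def xor_bytes(a, b):
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def make_parity(packets):
    """Build one parity packet for a group of equal-length packets."""
    if not packets:
        raise ValueError("need at least one packet")
    if len({len(p) for p in packets}) != 1:
        raise ValueError("packets must have equal length")
    parity = bytes(len(packets[0]))  # all zero bytes
    for packet in packets:
        parity = xor_bytes(parity, packet)
    return parity


def recover(received, parity):
    """Rebuild the single missing packet (marked None) in a group.

    Returns (index_of_missing_packet, rebuilt_packet).
    """
    missing = [i for i, packet in enumerate(received) if packet is None]
    if len(missing) != 1:
        raise ValueError("XOR parity can repair exactly one lost packet")
    rebuilt = parity
    for packet in received:
        if packet is not None:
            rebuilt = xor_bytes(rebuilt, packet)
    return missing[0], rebuilt


if __name__ == "__main__":
    group = [b"abcd", b"efgh", b"ijkl"]
    parity = make_parity(group)
    # Pretend the second packet was lost in transit.
    print(recover([b"abcd", None, b"ijkl"], parity))
```

The receiver repairs the loss without asking for a retransmission, which saves a round trip at the price of extra bandwidth. ARQ makes the opposite trade, which is one reason the two are typically combined.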

No.1: The industry's first open source low-latency live broadcast solution

Not long ago, NetEase Zhiqi released the "Easy+" open source plan and officially open sourced the NetEase conference component to help users in all industries build their own stable, reliable, high-definition, and easy-to-use conference systems.

Today, we have also open sourced our low-latency live broadcast solution, becoming the first vendor in the industry to open source a low-latency live broadcast solution.

In the past few years, open source in China has developed rapidly. More and more developers have joined the ranks of open source contributors, contributing technology with talent and enthusiasm and promoting the further prosperity of the global open source ecosystem. At the same time, high-quality open source projects have become an important cornerstone of infrastructure. Data shows that more than 90% of enterprise business has been built on open source software and open source projects, and the influence of open source is now visible to the world in quantifiable terms.

For this, the industry's first open source low-latency live broadcast solution, NetEase Yunxin has open sourced the signaling interaction protocol, the low-latency engine, and the player plug-in.

We have opened up the signaling interaction process, which supports standard SDP negotiation and ICE connection. By integrating just one low-latency player, developers can connect to multiple low-latency live broadcast vendors at the same time, which greatly reduces the increase in package size.

At the same time, the WebRTC engine has been deeply customized for low-latency live broadcast scenarios: it supports AAC, B frames, and multiple slices, and both first-frame time and end-to-end delay have been optimized. Currently, the first-frame time is about 200 ms, and the end-to-end delay is kept within 1 s.

To make it easier to integrate the low-latency live broadcast engine into existing players, NetEase Yunxin has also open sourced a low-latency live broadcast player plug-in in the form of a standard FFmpeg plug-in. Developers only need to modify a small amount of code to give an existing player low-latency live broadcast capability.

In the future, NetEase Yunxin will also open source a cropped version of the low-latency live broadcast engine to further optimize package size and various playback indicators. It will also go on to release an open source low-latency live streaming engine and low-latency live streaming plug-in, providing a full-link solution for low-latency live streaming, further lowering the development threshold, and promoting the rapid development of the low-latency live streaming industry.

Adhering to the technical attitude of openness and sharing, we will always be committed to exploring the huge social value of open source with the industry through high-quality open source projects.

Live recommendation

May 17, 19:00: live broadcast on the NetEase Zhiqi video account

