At the 2022 Alibaba Cloud Live Streaming Summit, technical experts and industry pioneers discussed the evolution and future of live video technology in the hyper-video era. At the summit, Alibaba Cloud released the industry's first "Best Practice Map of Live Video Technology", which distills live streaming technology into seven points: cloud native, high reliability, low latency, ultra HD, intelligent, professional, and multi-scenario. This article offers an in-depth interpretation of that map.

The trend in live video is to push latency to the minimum, and that latency includes both transmission delay and computation delay.

When latency comes up, most people think mainly of transmission delay. By latency, video can be divided into on-demand, live streaming, co-streaming (mic-linking) interaction, real-time interaction, and so on.

  • With a transmission delay of 3-10 seconds, the video is suitable for broadcasting, for example live sports events;
  • With a transmission delay of 250-800 milliseconds, communication and interaction become possible, for example mic-linking in an interactive classroom;
  • With a transmission delay of 50-80 milliseconds, the video becomes controllable and immersive, for example cloud real-time 3D rendering and remote video control.
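
The three tiers above can be sketched as a small lookup. This is only an illustration: the tier labels and the function name `classify_latency` are assumptions made for this example, and the article does not say how delays that fall between the listed ranges (for instance 80-250 ms) should be treated, so the sketch returns `None` for them.

```python
from typing import Optional

# Tier boundaries in milliseconds, copied from the list above.
# The labels are illustrative names, not official terminology.
LATENCY_TIERS = [
    (50, 80, "controllable/immersive"),   # e.g. cloud real-time 3D rendering
    (250, 800, "interactive"),            # e.g. classroom mic-linking
    (3000, 10000, "broadcast"),           # e.g. live sports events
]


def classify_latency(delay_ms: float) -> Optional[str]:
    """Return the tier label for a transmission delay, or None if the
    delay falls outside every range the article lists."""
    for low, high, label in LATENCY_TIERS:
        if low <= delay_ms <= high:
            return label
    return None


if __name__ == "__main__":
    for delay in (60, 500, 5000, 150):
        print(delay, "ms ->", classify_latency(delay))
```

Keeping the ranges in a table rather than in nested conditionals makes it easy to adjust the boundaries for a specific service.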

Beyond transmission delay, computation for video encoding and decoding, HD enhancement, and similar processing adds computation delay. As live streaming evolves, how can transmission and computation delay keep shrinking to give more live scenarios technical support and room for imagination?

Alibaba Cloud's live streaming technology is built on a cloud-native foundation and distributed edge nodes. By reworking transmission protocols and integrating real-time media processing with edge computing power, it significantly reduces transmission and computation delay. The media transmission network GRTN (Global Real-time Transport Network), the ultra-low-latency live streaming service RTS (Real-time Streaming), real-time media processing, video + AI, and other technologies together make up the low-latency best practice and strike the best balance between cost and experience. Besides many general-purpose live streaming solutions, this has also given rise to many scenario-specific solutions.

The industry's first "Best Practice Map of Live Video Technology", released at this summit, is the result of Alibaba Cloud's years of exploration and practice in live streaming technology. It is summarized in seven points: cloud native, high reliability, low latency, ultra HD, intelligent, professional, and multi-scenario.

Cloud native

Video technology is a cloud-native best practice.

Alibaba Cloud advocates three main cloud-native principles: "service-oriented products", "elasticity on demand", and "integration of software and hardware, of cloud and edge, and of cloud and device". Video technology is precisely a best practice of cloud native.

Cloud infrastructure, including central nodes, edge nodes, and CDN networks, is the foundation for large-scale distribution and transmission. Cloud-native software-hardware integration supports heterogeneous solutions across CPU, GPU, FPGA, ASIC, and other hardware. Close collaboration and distribution of computing power between cloud and device deliver consistent rendering on cloud, mobile, web, and PC.

In addition, cloud-native elasticity across time, space, and heterogeneous resources supports the mixed deployment of dozens of businesses and flexible, quantitative adjustment of cloud and edge computing, as well as 100+ real-time transmission, media processing, and multi-machine AI tasks. This gives the video business virtually unlimited computing power and makes full, effective use of resources, greatly reducing costs and opening up new scenarios.

High reliability

Popular videos draw tens of millions of concurrent real-time viewers, so high reliability is the most basic requirement.

Live video technology demands high reliability, especially since popular videos often bring millions or tens of millions of concurrent viewers. The high reliability of Alibaba Cloud's video technology shows in two ways. First, the architecture provides full-link logging, monitoring, alerting, and prediction, together with highly reliable multi-replica switchover within seconds. This enables intelligent automated operations and second-level troubleshooting across the whole network, and provides cross-data-center failover and disaster recovery guarantees.

The second aspect of high reliability is a better experience on weak networks. Alibaba Cloud's QoS technology predicts bandwidth accurately and greatly improves bandwidth utilization and congestion control. Combined with the encoder's weak-network awareness and packet loss resilience, it can still deliver relatively high clarity and smoothness even at 70% packet loss. Deep-learning-based intelligent packet loss concealment for voice improves audio clarity on weak networks, and latency-sensitive adaptive technology for mic-linking balances audio smoothness against call delay across scenarios. The QoS technology identifies and dynamically adapts to network conditions such as packet loss and delay, greatly improving the end user's subjective perception of audio and video service quality.
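
To make the idea of loss-driven congestion control concrete, here is a minimal sketch. It is not Alibaba Cloud's QoS algorithm, which the article does not describe. It uses the simple loss-based rule from the Google Congestion Control (GCC) draft for WebRTC: cut the rate when loss exceeds 10%, raise it by 5% when loss is under 2%, and hold otherwise. The function name and the `min_bps`/`max_bps` clamps are assumptions added for this example.

```python
def loss_based_bitrate(
    current_bps: float,
    loss_fraction: float,
    min_bps: float = 100_000,     # assumed floor, illustrative only
    max_bps: float = 8_000_000,   # assumed ceiling, illustrative only
) -> float:
    """Return the next target send bitrate given the reported loss.

    Loss-based rule from the GCC draft (not Alibaba Cloud's method):
      loss > 10%  -> multiply rate by (1 - 0.5 * loss)
      loss < 2%   -> multiply rate by 1.05
      otherwise   -> keep the rate unchanged
    """
    if not 0.0 <= loss_fraction <= 1.0:
        raise ValueError("loss_fraction must be between 0 and 1")

    if loss_fraction > 0.10:
        target = current_bps * (1.0 - 0.5 * loss_fraction)
    elif loss_fraction < 0.02:
        target = current_bps * 1.05
    else:
        target = current_bps

    # Clamp to the assumed operating range.
    return max(min_bps, min(max_bps, target))


if __name__ == "__main__":
    rate = 2_000_000.0
    for loss in (0.0, 0.01, 0.05, 0.20, 0.70):
        rate = loss_based_bitrate(rate, loss)
        print(f"loss={loss:.0%} -> target {rate / 1000:.0f} kbps")
```

A production controller would typically combine a rule like this with delay-based estimation and encoder feedback; the sketch only shows the loss-reactive part.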

Low latency

GRTN enables best-practice scenarios for streaming media.

Latency here is the time it takes for the streamer's picture to reach the viewer's screen. Setting aside network, bitstream, and device performance, choosing the right streaming protocol for each live scenario can greatly reduce latency. The history of live streaming is also a history of streaming protocols. Mainstream protocols include the familiar HLS, DASH, and RTMP, whose latency is generally more than 5 s. Driven by demand for strong interaction, streaming protocols keep evolving toward low latency, for example SRT and LL-HLS.
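
A rough calculation shows why segment-based protocols such as HLS tend to sit above the 5 s mark. The sketch below is a rule of thumb, not a measurement: it assumes the player starts about three target durations behind the live edge (the HLS specification, RFC 8216, recommends not starting closer than that) and ignores encoding, packaging, and CDN delay. The function name and the example segment lengths are illustrative assumptions.

```python
def segment_latency_floor(target_duration_s: float,
                          holdback_segments: int = 3) -> float:
    """Approximate lower bound on playback latency for a segmented
    protocol: segment length times the number of segments the player
    stays behind the live edge. Other pipeline delays are ignored."""
    if target_duration_s <= 0 or holdback_segments < 1:
        raise ValueError("duration must be positive and holdback >= 1")
    return target_duration_s * holdback_segments


if __name__ == "__main__":
    for seg in (2, 4, 6):  # example segment lengths in seconds
        print(f"{seg}s segments -> at least "
              f"{segment_latency_floor(seg):.0f}s behind live")
```

Shortening segments lowers this floor, which is the direction low-latency variants such as LL-HLS take with partial segments.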

Alibaba Cloud's low-latency best practices fall into two areas. First, at the network level, the traditional CDN (content delivery network) is transformed into GRTN, a global real-time transport network. It is positioned as an ultra-low-latency, fully distributed, communication-grade streaming media transmission network that is pushed down to the edge and built on heterogeneous central cloud and edge cloud nodes.

GRTN currently carries the transmission and exchange of audio and video streams for business scenarios such as Internet live streaming and RTC, and it has many other core technologies. For example, the bidirectional real-time signaling network built into GRTN delivers network-switch messages in milliseconds, so when a publisher's media stream undergoes network switching, subscribed clients are completely unaware of the switch taking place inside GRTN.

Second, on this "one network", Alibaba Cloud has built the ultra-low-latency live streaming service RTS (Real-Time Streaming). Based on GRTN, RTS supports standard H5 WebRTC publishing and playback and keeps latency within 1 s even at tens of millions of concurrent viewers, while RTC end-to-end latency can be held at about 250 ms. In the comparison video of RTS and RTMP below, you can see that at a given packet loss rate, RTS clearly outperforms RTMP in experience, smoothness, and color.

https://www.youku.com/video/XNTg4ODEyNDYwNA==
RTS vs RTMP Latency Comparison Video

Ultra HD

The best balance of cost and experience brings a more immersive, higher-quality audio and video experience.

Regarding ultra HD in live video technology, Alibaba Cloud's self-developed s265 encoding technology achieves high image quality at a low bitrate and supports 4K real-time encoding. AV1 encoding is also supported, saving more than 25% of the bitrate compared with HEVC. In the well-known "Narrowband HD" technology, Narrowband HD 1.0 optimizes for multiple scenes and saves bitrate through ROI and JND intelligent coding, while Narrowband HD 2.0 adds adaptive video noise reduction and content restoration and improves subjective quality for the human eye through color and texture enhancement, bringing the best balance of experience and cost.

Alibaba Cloud has also optimized the capture, encoding, and transmission link for live streaming, and the entire link supports 4K and 8K. On the engineering side, various algorithms improve frame rate, bitrate, resolution, color, and other dimensions. Whether the source is an old film, flawed footage, a portrait, or animation, it can be restored to deliver an ultra HD experience. Besides processing video in the cloud, super-resolution, frame interpolation, noise reduction, color enhancement, and similar processing can also run on the device. Even non-HDR devices can achieve a consistent ultra HD experience through the color enhancement SDR+ technology.
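
The bitrate claim above translates directly into traffic savings. The arithmetic below is only a worked example: the 25% figure comes from the text, while the 20 Mbps HEVC starting bitrate and the function names are hypothetical values chosen for illustration.

```python
def bitrate_after_saving(base_kbps: float, saving: float = 0.25) -> float:
    """Bitrate after applying a fractional saving (0.25 = 25%)."""
    if not 0.0 <= saving < 1.0:
        raise ValueError("saving must be in [0, 1)")
    return base_kbps * (1.0 - saving)


def gigabytes_per_viewer_hour(kbps: float) -> float:
    """Traffic for one viewer watching for one hour, in decimal GB."""
    bits = kbps * 1000 * 3600
    return bits / 8 / 1e9


if __name__ == "__main__":
    hevc_kbps = 20_000  # hypothetical 4K HEVC stream, not from the article
    av1_kbps = bitrate_after_saving(hevc_kbps)
    print(f"HEVC: {gigabytes_per_viewer_hour(hevc_kbps):.2f} GB per viewer-hour")
    print(f"AV1:  {gigabytes_per_viewer_hour(av1_kbps):.2f} GB per viewer-hour")
```

Because delivery cost scales with traffic, the same percentage applies to per-viewer bandwidth at any audience size.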


End-to-end UHD comparison


Color-enhancing SDR+ technology

Intelligent

In the era of hyper-video, the intelligence of audio and video is a major trend.

Deep learning improves a wide range of AI capabilities, and video is where those capabilities find their best application. On the intelligence side, beyond traditional intelligent dubbing, intelligent clip splitting, and intelligent highlight compilation, Alibaba Cloud's live video technology can also review audio and video content in real time and accurately identify pornographic, terrorism-related, and advertising content, saving a great deal of manual review cost.

The trained virtual human technology supports 3D avatars, Live2D, style transfer, virtual streamers, and more, driving further evolution of XR technology. "Intelligence" also shows in the audio experience. The 3A technology (echo cancellation, noise suppression, and gain control), which combines deep learning with traditional signal processing, achieves intelligent noise reduction, vocal emphasis, and music fidelity, and can be widely used in real-time scenarios. Intelligent voice super-resolution maintains high sound quality even with a small model. These are the results of combining AI with video.

https://www.youku.com/video/XNTg4ODEyODkyNA==
Multi-scene experience of "smart noise reduction"

Professional

Professional capabilities are gradually turning live broadcast into "smart broadcast".

Alibaba Cloud's professionalism in live streaming technology shows in multi-bitrate, multi-protocol, content protection, and real-time production capabilities, with live broadcast gradually evolving into "smart broadcast". In real-time production in particular, Alibaba Cloud recreates the traditional director's console in the cloud, integrating innovative directing capabilities such as real-time translation, graphics packaging, dynamic tags, and advertisement replacement, and combining the professionalism of live production with the advantages of remote directing.

Building on multi-channel real-time keying, Alibaba Cloud also brought the "virtual studio" to the Winter Olympics. Its "cloud director" technology supports multiple devices, multiple camera positions, and remote broadcasting, and enables go-live scenarios such as dual-screen, split-screen, and picture-in-picture to meet live streaming needs as fully as possible.

https://www.youku.com/video/XNTg4NzA4OTY1Mg==
Interactive Virtual Studio Helps Winter Olympics

Alibaba Cloud's professionalism in live streaming technology, combined with the rich program production formats and lower cost of "cloud director", can be widely applied in new broadcast media, live coverage of competitions and events, commercial live streaming, and other scenarios, helping customers break through business bottlenecks and run their business faster and better.

https://www.youku.com/video/XNTg4ODEyNzYyNA==
"This! It's Hip-hop" cloud director + frame-level multi-view synchronization

Multiple scenarios

"Live +" has become a trend that penetrates into every scene.

In terms of scenarios, live streaming has gradually spread from the earliest large-scale sports, e-commerce, and game streaming to corporate training, online education, and new broadcast media. Alibaba Cloud integrates the algorithm capabilities of live streaming, video on demand, and online conferencing into a single SDK. While unifying multiple scenarios, the integrated SDK can also be packaged on demand for flexible customization. From traditional SDK and API access to the "low-code live streaming model room", Alibaba Cloud Live provides one-stop access for e-commerce live streaming, online education, enterprise live streaming, and other scenarios. With just over a dozen lines of code, customers can easily integrate a live streaming experience and support their business growth.

Live streaming has become an important part of digital social services, and more and more content and industries are turning to the "live streaming +" model. The future of live streaming technology will become clearer as market demand changes.

The "Best Practice Map of Live Video Technology" is based on Alibaba Cloud's years of exploration and best practices in live streaming technology. From the core of live streaming technology, to full scenario coverage, to innovation and application, it helps enterprises understand live streaming in depth, break down technical barriers, and move forward together with all industries in the wave of the Internet of Everything.

"Video Cloud Technology" is a public account devoted to audio and video technology. It publishes practical technical articles from the front line of Alibaba Cloud every week and is a place to exchange ideas with first-class engineers in the audio and video field. Reply [Technology] in the official account to join the Alibaba Cloud video cloud product technology exchange group, discuss audio and video technology with industry leaders, and get the latest industry news.

CloudImagine