Introduction: Alibaba Cloud PTS stands on the shoulders of Double 11 and is an extension of Alibaba's full-link stress testing. Thanks to its elasticity, PTS can easily generate traffic from millions of users, eliminating machine and labor costs. PTS can also shape traffic into pulses and adjust it precisely in real time, which makes it an excellent way to handle the rapid traffic surges of live video broadcasting.
Author: Zizhen
Review & Proofreading: Fengyun
Editing & Typesetting: Wine Circle
"From January this year to now, Taobao Live has had more than 500 million users. As of August, traffic has grown by 59%, and GMV for core merchants has grown by 55%. Double 11 began on the evening of October 20, and we hope Taobao Live will take it on as its home venue." Cheng Daofang, head of the live broadcast division of the Taobao business group, said in an interview with a reporter from 21st Century Business Herald that last year's live broadcasts were lively and that this year's will be more professional.
With such a large user base, what challenges do live streaming applications bring to back-end services? Today we introduce a common live broadcast architecture and the challenges it poses for the applications behind it.
Live broadcast architecture
We usually see the following kinds of live broadcast:
1. Single-host live broadcast, such as Taobao Live, usually accompanied by business logic such as flash sales, bullet comments, and virtual gifts (for example, sending rockets);
2. Multi-person live broadcast, such as co-hosting over linked microphones and conferences;
3. Recorded broadcast. For scenarios such as training and conferences, the live video must be saved for distribution and archiving, so the broadcast has to be recorded. This usually does not require high real-time performance.
When a user watches a live broadcast and the service is connected to a CDN, the player pulls the stream from the nearest CDN node, so the streaming load falls on the CDN. If no CDN is used, the player pulls the stream directly from the live origin server.
The following figure shows a fairly common video streaming architecture and two data flows:
1. Video stream push and pull logic, as shown by the blue line
2. Conventional business logic, as shown by the yellow line
You can see that there are four main modules:
1. Push-streaming end: collects the anchor's audio and video data and pushes it to the streaming media server;
2. Streaming media server: converts the data sent by the push-streaming end into the specified formats and pushes it to the players, so that users on different playback devices can watch. Cloud vendors now also offer complete streaming media server solutions;
3. Business server: handles common business logic such as flash sales and bullet comments;
4. Player: pulls the audio and video, plays it back, and presents the content to the user.
The four key modules communicate over streaming media transmission protocols. Most live broadcast architectures follow the structure in the figure above; the main difference is whether a CDN is introduced. Generally speaking, we recommend that customers introduce a CDN to reduce the impact of live streaming traffic on their servers. The modules are not required to use the same protocol as one another.
Next, let's follow this architecture to see where its more fragile risk points are and how stress testing can uncover them early.
Live broadcast challenges
Challenge 1: The pressure of video streams on the streaming media server
In this push-pull logic, the video traffic is large and travels a long path, so it puts pressure on the streaming media server. The usual way to handle this scenario is to introduce a CDN. When users start watching, they first pull the stream from the nearest CDN node. If the video content is not yet cached on that node, the CDN goes back to the origin, that is, to the streaming media server, to fetch it.
The risk arises when a large number of users start watching through the CDN at the same instant and many CDN nodes go back to the origin at once. This kind of pulse traffic can have unpredictable effects on the streaming media server.
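To make the back-to-origin pulse concrete, here is a minimal sketch. It is not a model of any real CDN: the node names, the viewer count, and the rule that each cold node fetches the stream from the origin exactly once are all assumptions made for illustration.

```python
def nodes_going_to_origin(viewer_nodes, cached_nodes=frozenset()):
    """Return the set of CDN nodes that must fetch the stream from the origin.

    viewer_nodes: the CDN node each arriving viewer was routed to.
    cached_nodes: nodes that already hold the stream in cache.
    Assumption: a cold node fetches from the origin once,
    however many viewers hit it.
    """
    return {node for node in viewer_nodes if node not in cached_nodes}


# Hypothetical burst: 10,000 viewers arrive at once, spread over 50 edge nodes.
edge_nodes = [f"edge-{i}" for i in range(50)]
burst = [edge_nodes[i % 50] for i in range(10_000)]

cold = nodes_going_to_origin(burst)                                # empty caches
warm = nodes_going_to_origin(burst, cached_nodes=set(edge_nodes))  # stream already cached
print(len(cold), len(warm))  # 50 0
```

With empty caches, all 50 hypothetical nodes hit the origin in the same instant; once the stream is already cached on every node, none of them do.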
We usually run stress tests to verify the link in advance, and a stress test can even preheat the video on the CDN. However, traditional HTTP-based stress testing does not support this scenario well, because:
1. Open source tools such as srs_bench and JMeter (through plug-ins) can be used, but they require users to understand the video protocols in some depth, so the barrier to entry is relatively high;
2. Video stress testing demands very high bandwidth, which makes the load-generating machines relatively expensive;
3. Video stress testing needs to account for the effect of geographic region on transmission quality.
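The bandwidth point can be illustrated with back-of-the-envelope arithmetic. The sketch below is only an estimate under stated assumptions: the 2 Mbps stream bitrate, the 100,000 simulated viewers, and the 1 Gbps network card per load generator are hypothetical numbers, not figures from PTS or from this article.

```python
import math


def required_bandwidth_gbps(viewers, bitrate_mbps):
    """Total downstream bandwidth if every simulated viewer pulls one stream."""
    return viewers * bitrate_mbps / 1000


def machines_needed(viewers, bitrate_mbps, nic_gbps):
    """Load generators needed if each is limited only by its network card."""
    streams_per_machine = int(nic_gbps * 1000 // bitrate_mbps)
    return math.ceil(viewers / streams_per_machine)


# Hypothetical inputs: 100,000 viewers, 2 Mbps per stream, 1 Gbps per machine.
print(required_bandwidth_gbps(100_000, 2.0))  # 200.0
print(machines_needed(100_000, 2.0, 1.0))     # 200
```

Even this modest hypothetical test would need around 200 Gbps of aggregate bandwidth, which is why the cost of self-hosted load generators adds up quickly.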
To address these problems, PTS has added support for the RTMP and HLS protocols and abstracted them into stress test scenarios, so that users can connect stress tests to different protocols.
In addition, PTS offers a rich set of orchestration modes for arranging scenarios freely. More importantly, PTS's nationwide customization mode can simulate customers sending requests from different regions, which helps detect problems faster.
Challenge 2: Low-latency interactive protocol
Unlike traditional big promotions, live broadcasts depend on interaction with the audience: bullet comments, comments, chat, flash sales, and so on. If the host is chatting enthusiastically and the viewers do not respond, the broadcast has failed. Ordinary HTTP requests cannot meet this demand for timeliness, so these features are usually implemented with WebSocket. HTTP is a stateless request-response protocol, whereas WebSocket keeps a long-lived connection between client and server, which delivers messages in real time and reduces per-message overhead.
Every WebSocket connection starts with an HTTP request during the handshake phase. In this request the client tells the server the WebSocket version it supports, the subprotocol version, the origin, and the host. The key part of the message is the Upgrade header, which asks the server to upgrade the current HTTP request to the WebSocket protocol. If the server supports it, it must return status code 101:
HTTP/1.1 101 Switching Protocols
Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Accept:xxxxxxxxxxxxxxxxxxxx
Once this response is received, the WebSocket connection is established, and data is then transferred according to the WebSocket protocol.
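The handshake can be sketched in a few lines of Python using only the standard library. This is an illustration of the rules in RFC 6455, not how PTS or any JMeter plug-in implements them; the host `live.example.com`, the path `/chat`, and the function names are hypothetical. The client sends a random `Sec-WebSocket-Key`, and the server's `Sec-WebSocket-Accept` must be the Base64-encoded SHA-1 of that key concatenated with a fixed GUID.

```python
import base64
import hashlib
import os

# Fixed GUID defined by RFC 6455 for computing Sec-WebSocket-Accept.
WS_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"


def make_key():
    """Random 16-byte nonce, Base64-encoded, sent as Sec-WebSocket-Key."""
    return base64.b64encode(os.urandom(16)).decode("ascii")


def expected_accept(key):
    """Value the server must return in Sec-WebSocket-Accept for this key."""
    digest = hashlib.sha1((key + WS_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")


def build_handshake(host, path, key):
    """Build the HTTP upgrade request that opens a WebSocket connection."""
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",
        "Upgrade: websocket",
        "Connection: Upgrade",
        f"Sec-WebSocket-Key: {key}",
        "Sec-WebSocket-Version: 13",
    ]
    return "\r\n".join(lines) + "\r\n\r\n"


# Sample key and accept value taken from RFC 6455, section 1.3.
sample_key = "dGhlIHNhbXBsZSBub25jZQ=="
print(expected_accept(sample_key))  # s3pPLMBiTxaQ9kYGzzhZRbK+xOo=

# Hypothetical endpoint, for illustration only.
print(build_handshake("live.example.com", "/chat", make_key()))
```

A stress test client would compare the `Sec-WebSocket-Accept` header in the 101 response against `expected_accept(key)` before treating the connection as established.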
JMeter provides a plug-in that simulates the whole WebSocket communication process, but it also requires users to understand how the protocol works and is fairly obscure to use. PTS abstracts the protocol into business terms: users configure the scenario and the load, and only need to set a few simple parameters, such as the stress test URL, output parameters, and checkpoints, to work with this complex protocol.
Besides live broadcasts, WebSocket is also widely used in online games, stock and fund trading, live sports updates, chat rooms, bullet comments, online education, and other scenarios with very high real-time requirements.
Challenge 3: High-concurrency pulse traffic
Unlike ordinary applications, live broadcast applications are used in very concentrated time windows, so a large number of users pour in within just a few hours. A live broadcast by a major influencer (a "big V") typically draws millions of users, which places very high demands on the system's ability to absorb pulse traffic. Flash sales during a broadcast also differ from the traditional kind: the anchor starts them at some point during the show, and that moment is often not known precisely in advance. Such pulse traffic is extremely demanding on the system, and problems that rarely surface in normal operation or under traditional high-traffic scenarios, such as lazy loading, JIT warm-up, and switching between hot and cold data, will appear.
These two characteristics require the stress testing tool to generate a large volume of traffic in an instant. That takes not only more load-generating machines but also precise control over the traffic, so that it can ramp up rapidly.
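A pulsed load can be pictured as a per-second schedule of target request rates. The sketch below is a generic illustration, not PTS configuration or any PTS API: the function name, the baseline of 1,000 requests per second, the pulse height of 50,000, and the pulse timings are all hypothetical.

```python
def pulse_profile(duration_s, baseline_rps, pulse_rps, pulse_starts, pulse_len_s):
    """Return the target requests-per-second for each second of the test.

    The load sits at baseline_rps and jumps to pulse_rps for pulse_len_s
    seconds at each start time listed in pulse_starts.
    """
    profile = []
    for t in range(duration_s):
        in_pulse = any(start <= t < start + pulse_len_s for start in pulse_starts)
        profile.append(pulse_rps if in_pulse else baseline_rps)
    return profile


# Hypothetical 60-second test with two 5-second pulses (for example, two flash sales).
profile = pulse_profile(60, 1_000, 50_000, pulse_starts=[10, 40], pulse_len_s=5)
print(profile[9], profile[10], profile[14], profile[15])  # 1000 50000 50000 1000
print(sum(profile))  # 550000 requests in total
```

The jump from 1,000 to 50,000 within a single second is the hard part for a load generator: it must have enough machines on standby and be able to release the traffic at exactly the scheduled moment.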
Generating massive traffic and controlling it precisely are exactly the strengths of Alibaba Cloud PTS. PTS stands on the shoulders of Double 11 and is an extension of Alibaba's full-link stress testing. Thanks to its scalability and elasticity, PTS can easily generate traffic from millions of users, eliminating machine and labor costs. It can also shape traffic into pulses and adjust it precisely in real time, which makes it an excellent way to handle the rapid traffic surges of live video broadcasting.
Finally
In response to changes in the video and live broadcast industries, PTS has fully upgraded the protocols it supports. Besides traditional HTTP requests, it now supports HTTP/2, streaming media, MQTT, and other protocols, allowing users to Test Anywhere!
In addition, PTS has launched newly priced resource packs, including a lower-priced resource pack exclusive to JMeter. The Double 11 promotion is now on: everything is on sale at 88% of the list price, with prices as low as 0.99 yuan. For more questions about PTS or product suggestions, everyone is welcome to scan the code and join the group to discuss.
Related links
- PTS: https://pts.console.aliyun.com/#/overviewpage
- PTS resource package purchase: https://common-buy.aliyun.com/?commodityCode=ptsbag#/buy