Guide: H.265 is a new generation of video coding standard developed by ITU-T VCEG after H.264. Compared with H.264, H.265 can further improve the compression efficiency and improve the picture quality. In the video scene, it has been more and more widely used. We have done a lot of engineering practice on H.265 in NetEase Yunxin NERTC. This article is the third part of the experience sharing series-the video article. The article will be done from four aspects. Specific introduction.
Capability negotiation
Whether a client can send a code stream with specified characteristics is not only determined by whether the local end supports encoding, but also whether other receivers in the room can decode it. , that is, the sending end and the receiving end jointly determine what kind of code stream the local end can send. For H.265, the following situations may occur:
So how will they communicate? If client B sends H.265 stream, client A and client D only support H.264 decoding, they receive the H.265 bit stream sent by client B, they cannot decode it, so we need Design a set of capability negotiation mechanism to ensure normal communication between clients that support different codec capabilities.
The following figure shows the overall design of the capability negotiation
The following is a detailed introduction to the design plan of our capability negotiation.
Capability Set Design
The capability set is defined as {uint32 key: [uint8 value1, uint8 value2 ...]} , using a 1-bit mask
sdk is [0, 2^8)
video is [2^8, 2^16)
audio is [2^16, 2^24)
bytes are as follows:
Examples are as follows:
Capability Negotiation Process (Client)
- Client-side defined capability set
- The client implements the capability set report locally, and the capability set sends the configuration mechanism
- The server provides the generation, synthesis and delivery functions of the capability set in the channel
example,
Definition: Capability field VideoCodec: 256 (2^8), capability value H.264: 0, H.265: 1
client A and client B enter the same channel , client A reporting capability set {256: [0, 1]}, client B reporting capability {256: [0]}
After the server receives the report of the capability set of client A and client B, it knows that A supports H.264 and H.265, and B supports H.264. Combining the capability sets of A and B, the channel supports H.264. The conclusion of the release capability set {256: [0]}
After client A receives the capability set, it actively closes its own H.265 capability; after client B receives the capability set, it is consistent with its own capability set and does not change it.
Capability negotiation process (server)
- When the room is created, the default capability set is generated. **The default capability set is provided by the engine, and the server is used as a global configuration
** - In the first edge_login request, if has the capability set field , use this capability set to override the default capability set as the capability set of the room; if does not have the capability set field , use the default capability set as the capability set of the room. Since it is the first one, there is no need to broadcast.
- For each subsequent incoming user, if capability set greater than the capacity of the room set , the room is returned to the user set capacity, the ability to set the same room. An example is as follows:
- For each subsequent incoming user, if capability set smaller than the capacity of the room set , is on the intersection, reduced capability set of the room, and the results are broadcast to all users. An example is as follows:
H.265 Codec Practice
Each platform has its own unique hardware 265 codec, as well as various open source software 265 codecs, Then how will we choose to achieve the best use effect?
Below we will introduce the practice of the 265 codec on each platform for Android, iOS, Mac and Windows. Among them, the software encoder uses x265 for evaluation, and the software solution uses ffmpeg and libhevc for evaluation.
Android side
First, let’s take a look at the power consumption and bit rate of the (Test model Mi 10: Qualcomm Technologies, Inc SM8250, test Profile: 720P 30fps 1.7M)
take a look at the performance of each software codec (Test model Mi 10: Qualcomm Technologies, Inc SM8250, test Profile: 720P 30fps 1.7M)
Finally, take a look at quality comparison (test model Mi 10: Qualcomm Technologies, Inc SM8250, test Profile: 720P 30fps 858k), the left is H.264, the right is H.265
Summarize:
- Android hard-coded 265 is basically the same as hard-coded 264 in terms of power consumption, while the bit rate of android hard-coded 265 is more stable than hard-coded 264
- The performance of Android soft-coded 265 is still relatively poor and cannot meet the demand
- Android ffmpeg soft solution 265 has poor performance on arm64, CPU occupies 15%, under the same conditions, using libhevc, CPU occupies only 4.5%, there is a lot of room for improvement in performance
4. The quality of 16166a02215523 android hard-coded 265 is obviously clearer than that of hard-coded 264, and the image quality benefits are obvious
So the android end our 265 codec usage strategy is as follows:
- preferential use of 265 hard solution. Some devices have 265 hardware solution compatibility issues, high-end models choose 265 soft decoders, low-end devices directly think that they do not support 265 decoding
- preferentially use 265 hard code. Some devices have 265 hard-code compatibility issues, they think they do not support 265 encoding, and downgrade to 264 encoding
- Since the performance of libhevc soft solution is significantly better than ffmpeg soft solution 265, , we will give priority to libhevc
iOS
First, let’s take a look at the power consumption of the hardware codec (Test model iPhone11, test Profile: 720P 30fps 1.7M)
In addition, we found that on some models or in some scenarios, iOS hard-coded 265 will have an obvious problem of insufficient bitrate (test model iPhoneXR, test profile: 720P 30fps 1.7M)
H.264:
H.265:
Like this situation where the bit rate is seriously insufficient, we designed a set of bit rate monitoring methods based on iOS hard-coding control. If the bit rate is seriously insufficient, it will fall back from H.265 to H.264
Finally, take a look at the quality comparison of 16166a0221577f when the hard-coded rate is stable. (Test model iPhone11, test Profile: 720P 30fps 858k)
Left H.264, Right H.265
summary:
- power consumption of using iOS hard-coded 265 is significantly higher than that of hard-coded 264, and the power consumption of using iOS hard-coded 265 is also significantly larger than that of hard-coded 264.
- iOS hard-coded 265 occasionally has insufficient bit rate, resulting in poorer image quality than 264
3. iOS hard-coded 265 image quality is clearer than hard-coded 264 image quality , there are some image quality benefits
So the final iOS end our 265 codec usage strategy is as follows:
1. prefer to use hard code 265, does not support soft code 265
2. preferentially use hard solution 265, only fall back to ffmpeg 265 soft solution after hard decode 265 has failed to be recovered multiple times.
- Since iOS hardware encoder 265 power larger than 264 hardware, iOS devices to monitor the overall power, in low battery condition (such as 20% to the critical point), we cut back to the hardware 265 264 Hardware Encoder
- Use the iOS hard-coded control module to monitor the actual encoding bit rate. If there is an obvious bit rate insufficient or excessive bit rate, we switch the hard-coded 265 back to the hard-coded 264
iOS hard-coded control
The following figure shows the design of the hard-coded control module:
code control process:
First, the hard encoder passes the target bitrate Target Bitrate to the code control module HWBitrateController
Each one-encoded, the encoded frame size to update the estimated HWBitrateController per second actual coding rate Estimated the Bitrate
Calculate the and the actual code rate 16166a02215b83 Diff
Through the dichotomy , take 0.5 times the Diff plus Target Bitrate as the adjusted bitrate
Set the Adjusted Bitrate to be adjusted back to the hard encoder HWEncoder, as the target bitrate of HWEncoder
Then returns to the first step, the hard encoder passes the current target bitrate Target Bitrate to the code control module HWBitrateController
Calculate the Diff/Target Bitrate, continues to be greater than 30%, it is considered that the bit rate is significantly less than , and the degraded encoder needs to be triggered
Mac side
First, let's take a look at the CPU and bit rate of the 265 hardware and software codec on the Mac (test model MacBook Pro (15-inch, 2016): Intel(R) Core(TM) i7-6700HQ CPU @ 2.60GHz, test Profile : 720P 30fps 1.7M)
hard coded 265 bit rate:
Soft code 265 bit rate:
At the same time, we hard-coded the 265 code stream by Mac and found that 16166a02215ca5 uses forward B frames
Finally, let’s take a look at the quality comparison (test model MacBook Pro (15-inch, 2016): Intel(R) Core(TM) i7-6700HQ CPU @ 2.60GHz, test profile: 720P 30fps 1M)
Summarize:
- On Mac Hardware 265 Compared with software 265, CPU occupies lower, performance gains are more obvious
2. hard coded 265 code rate stability is not as good as soft coded 265 , the fluctuation is relatively large, but all fluctuate around the target code rate, after long-term testing, it is found that the overall code rate is averaged and there is no obvious over-transmission or insufficient situation.
3. hard-coded stream forward B frame instead of P frame, the overall compression rate will be higher, and the same bit rate has better picture quality
4. hard-coded 265 picture quality is significantly better than 264 , while the soft-coded 265 and hard-coded 265 picture quality is not much different
So the Mac side our 265 codec usage strategy is as follows:
- preferentially use 265 hard solution. Some devices have 265 hardware solution compatibility problems or do not support 265 hardware solution. Use soft solution 265 (ffmpeg) on devices with stronger CPU performance. If it is a device with weaker CPU, it is considered that 265 decoding is not supported.
- prefer to use 265 hard code. Some devices have compatibility issues with 265 hard-coded or do not support 265 hard-coded. Use soft-coded 265 on devices with stronger CPU performance. If it is a device with weaker CPU, it is considered that 265 encoding is not supported.
Windows side
Due to the serious fragmentation of hard-coded and hard-coded solutions on windows, we have not considered using hard-coded and hard-coded solutions for the time being. At present, uses soft-coded and soft solutions.
The following picture shows the situation of soft editing and decoding 265 on win (test model Dell Latitude 5290: Intel(R) Core(TM) i5-8250U CPU @ 1.60GHz, test profile: 720P 30fps 1.7M)
can be seen:
1. The performance of software 265 and software solution 265 on x86 is not as good as x86_64
2. The CPU consumption of soft-coded 265 on x86_64 is also much higher than that of soft-coded 264
So the final Win side our 265 codec usage strategy is as follows:
1. In the case of win x86 , not directly compiled 265 Soft Soft Solutions Support
2. in the case of using win x86_64, CPU performance is relatively strong equipment to open 265 soft-encoding and soft-decoding; at the same time soft-encoding 265 has higher CPU performance requirements than soft-decoding 265, so the performance requirements of the device will be more strict
Engineering strategy
Whitelist policy
As mentioned earlier, when we use 265 hard-coded and hard-coded solutions for ; at the same time, when using 265 soft-coded and soft solutions, due to the large number of codec operations involved, we must also consider the device’s Performance issues.
In order to provide the best user experience at the overall engineering level, we use the whitelist strategy . Through the whitelist configuration and delivery, the devices that support 265 hard-coded hard-coded and soft-coded and soft-coded solutions can be distinguished.
following is our specific approach:
- Through a large number of device adaptations, configures the devices with better support for 265 hard-coding and hard-decoding into the online whitelist
- Through the device running score, distinguish the level of the device’s CPU performance. configured with high-performance devices to support 265 soft programming and software decoding, and low-performance devices do not support 265 software programming and software decoding, and then update the configuration to the online whitelist
- Finally, through the online whitelist configuration and delivery, the client obtains the current configuration information whether it supports hard-coded and hard-coded solutions and whether it supports soft-coded and soft-coded solutions.
H.265 capability negotiation
negotiates the H.265 decoding capability, and the negotiation result finally acts on the encoding side, whether to use H.265 encoding
- H.265 decoding configuration issuance, user settings and current device support capabilities, are combined to get the current H.265 decoding capability
- The capability negotiation module generates a capability set based on the current decoding capability of H.265, and reports it to the capability negotiation server
- Capability negotiation server's comprehensive channel capability set of each client, generate a new room capability set, and send it to each client
- The client receives the capability set issued from the server, and parses the capability set of H.265 in the current channel
- According to the H.265 capability set in the current channel, the configuration of H.265 encoding, user settings, and current device support capabilities, the server's support for H.265, three-part synthesis to determine whether H.265 encoding is currently supported , And finally act on whether to encode H.265 stream
CPU OverUse strategy
It may not be completely accurate to distinguish the CPU performance of different devices through running scores. In actual scenarios, there may be a situation where the running score data is very high, but the encoding performance is not enough. At this time, we need to determine whether the current CPU is overloaded by real-time monitoring and statistics of the current video encoding time.
Our approach is:
- Hard-coded 265 takes into account the possibility of pipeline delay. At present, we not count the coding time of each frame of coding
- When using soft coding 265, count the coding time of each frame of coding. If continuous coding takes too long, then it is considered that the current CPU is overloaded, and 265 soft coding will immediately
QP threshold adjustment
Our QOE module will adjust the frame rate and resolution according to the QP threshold to achieve the best subjective video quality. Therefore, it is very important to set a reasonable encoder QP threshold for 265 encoder, how do we explore reasonable QP upper and lower thresholds?
Our approach is:
- To ensure that H.264 and H.265 are basically aligned with subjective image quality, output the QP value, generate the QP curve, and obtain the basic range of the QP upper and lower limit thresholds based on the QP curve
Let's take this model as an example: (Test model Mi 10: Qualcomm Technologies, Inc SM8250, test Profile: 720P 30fps)
can be seen:
- In the Profile gear of 720p 30fps, when the image quality of hard-coded 265 and hard-coded 264 are basically aligned, the bitrate of hardware 265 can be saved by 40%, the QP fluctuation is basically similar, and the upper and lower thresholds are basically close. So if the QP upper and lower threshold of the current android hard-coded 264 is [A, B], then it is recommended that the upper QP threshold range of the android hard-coded 265 is [A-1, A+1], and the lower QP threshold range is [B-1 ,B+1]
- Based on the QP upper and lower threshold range obtained in step 1, the network loss is gradually released. observes the adjustment of the QOE module based on the QP threshold resolution and frame rate, and Take the following data as an example:
We found that when the QP threshold was [A-1,B-1], the overall QOE performed best, so [A-1,B-1] was selected as the final hard-coded 265 QP threshold
Income
At present, the first phase of the revenue evaluation is based on the alignment with the H.264 resolution bit rate, looking at the four indicators quality revenue, end-to-end latency, CPU usage, and fluency 16166a0221660b. Here is a list of the image quality gains of Android hard-coded 265 and iOS hard-coded 265 compared with hard-coded 264. From our evaluation, end-to-end latency, CPU usage, and fluency are basically three indicators The difference is not big, and image quality gains, you can refer to the following chart data:
Overall:
- Compared to 264, 265 has obvious benefits in video quality.
- In scenarios with low bitrates, weak networks, or drastic changes in the picture, the picture quality gains will be more obvious
3. android hard-coded 265's image quality gains are significantly better than iOS hard-coded 265
Summary
Regardless of whether it is hard-coded 265 or soft-coded 265, compared to 264, the video quality has obvious benefits; follow-up NetEase Yunxin will align the image quality and save bit rate for further exploration.
Author introduction
Shi Lingkai, NetEase Yunxin audio and video engine development engineer, responsible for the development and maintenance of NERTC video engine.
**粗体** _斜体_ [链接](http://example.com) `代码` - 列表 > 引用
。你还可以使用@
来通知其他用户。