In the previous Dev for Dev column, our engineers shared how Shengwang built its no-reference video quality assessment (VQA) system and the practical experience gained along the way. Building a VQA system is a long process, and before getting there, there are many other ways to judge the quality of real-time video. Today we will walk through some video experience metrics commonly used in the industry.
We generally judge video quality through metrics such as real-time performance (latency), fluency (stutter rate and rendering frame rate), subjective picture quality, first-frame rendering time, and audio/video synchronization. The sections below explain these metrics one by one and describe how each is measured in a laboratory environment.
Real-time performance
Before discussing how to test live-video latency, we first need to look at the links in the chain where latency is introduced:
Part 1: Video capture and encoding
Part 2: Transmission from the capture device to the server
Part 3: Transmission from the server to the client
Part 4: Client-side decoding and playback
Of these, parts 1 and 4, the encoding and decoding stages, consume most of the delay, while parts 2 and 3 are network forwarding and transmission, which take very little time. Video latency is the metric that reflects real-time performance: the delay between the moment content is captured at the sender and the moment it is rendered at the receiver, in milliseconds (ms). Generally, the smaller the end-to-end latency, the better the real-time performance.
The most accurate method commonly used in the industry is: start a millisecond-precision timer on a PC; capture it with a camera, phone, or desktop encoder and push the stream to the server; start a player (on the same machine or on another PC or phone); then capture both the source timer and the played-back picture in a single photo or screenshot and compute the time difference. The laboratory method below is essentially the same.
Laboratory Test Methods
1. Device 1 and device 2 join the same channel.
2. On a third device, open an online stopwatch with millisecond precision, and point the rear camera of device 1 at it. Device 1 captures the stopwatch digits, and the digits sent by device 1 can be clearly seen on device 2. The placement is shown in Figure 1 below.
■Figure 1
3. With the devices positioned as in Figure 1, use an iPhone to shoot a 10 s video; wait 3–5 minutes and shoot another 10 s video; wait another 3–5 minutes and shoot a third, for a total of three videos.
4. Delay statistics: pick 10 frames from each video (one sample per second). Video delay = time shown on the online stopwatch − time shown on phone 2 (in Figure 1, video delay = 762 − 447 = 315 ms). The three videos yield 30 delay samples; their average is the final video delay.
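The per-sample arithmetic above can be sketched in a few lines of Python. The stopwatch readings here are hypothetical values for illustration; only the first pair comes from the Figure 1 example.

```python
# Per-sample delay: time on the source stopwatch minus time shown on phone 2.
# The pairs below are hypothetical readings (ms), one sample per second.
samples = [(762, 447), (1775, 1458), (2801, 2490)]

delays = [stopwatch - receiver for stopwatch, receiver in samples]
avg_delay_ms = sum(delays) / len(delays)
print(delays)                  # [315, 317, 311]
print(round(avg_delay_ms, 1))  # 314.3
```

In the real test, the list would hold all 30 samples taken from the three recordings.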
Stutter rate and frame rate
In video quality testing, detecting whether the video stutters is another standard check.
Stutter rate: a metric reflecting video smoothness, defined as total stutter duration during the test period / total test duration × 100%. The higher the stutter rate, the worse the subjective experience.
Frame rate: the rendering frame rate observed at the receiving end. The higher the frame rate, the better the fluency.
The industry's fluency test method is:
1. Record the test video with an automation script;
2. Use FFmpeg to crop out the region of interest;
3. Take the region of interest from the first 10 s, the middle 10 s, and the last 10 s of the video;
4. Extract one picture every 50 ms from each of the three 10 s clips;
5. Analyze the images with OpenCV.
The main criterion for a stutter is that, to the human eye, the same picture persisting for 200 ms reads as a freeze. Following this rule, the images from step 4 are compared with OpenCV: if picture n and picture n+1 are similar within a pixel threshold, and this similarity persists for 4 consecutive pictures (50 ms × 4 = 200 ms), the video is considered to have stuttered, and the stutter rate is computed from these events. The laboratory test follows the same principle.
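The comparison rule above can be sketched as follows. This is a minimal illustration, not the actual lab script: frames are assumed to be numpy arrays (as returned by cv2.imread), and the similarity threshold is a hypothetical value.

```python
import numpy as np

SIMILARITY_THRESHOLD = 2.0  # hypothetical: mean abs pixel diff below this = "same" picture
FRAMES_PER_STALL = 4        # 4 consecutive similar frames at 50 ms sampling = 200 ms

def frames_similar(a: np.ndarray, b: np.ndarray) -> bool:
    """Compare two frames (e.g. loaded with cv2.imread) by mean absolute difference."""
    diff = np.abs(a.astype(np.int16) - b.astype(np.int16))
    return float(np.mean(diff)) < SIMILARITY_THRESHOLD

def count_stalls(frames) -> int:
    """Count stall events: runs of at least FRAMES_PER_STALL near-identical frames."""
    stalls, run = 0, 1
    for prev, cur in zip(frames, frames[1:]):
        if frames_similar(prev, cur):
            run += 1
            if run == FRAMES_PER_STALL:  # the run just crossed the 200 ms threshold
                stalls += 1
        else:
            run = 1
    return stalls
```

A production script would also accumulate each stall's duration so the stutter rate can be computed afterwards.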
Laboratory Test Methods
1. Phone 1 and phone 2 join the same channel;
2. Point the rear camera of phone 1 at a rotating globe (the captured picture must keep moving), so that it captures both the moving globe and the moving host; the globe occupies about 1/4 of the frame and the host another 1/4. On phone 2 you can see the globe and the host sent by phone 1; the placement is shown in Figure 2 below;
■Figure 2
3. With the devices positioned as in Figure 2, apply a weak-network profile, for example limiting uplink bandwidth to 500 Kbps with 20% packet loss. About 30–40 s after the network limit takes effect, use phone 2's built-in screen recorder to record for 3 minutes. Phones that cannot screen-record smoothly should be connected to a computer via HDMI and recorded with QuickTime Player;
4. Transfer the recorded MP4 to a computer and run the stutter-rate script to compute the stutter rate and frame rate. 200 ms stutter rate = cumulative duration of stutters longer than 200 ms during the test / total test duration × 100%.
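The final ratio in step 4 is simple arithmetic. A quick sketch with hypothetical stall durations:

```python
stall_durations_ms = [250, 400, 1200]  # hypothetical stalls over 200 ms found in the recording
total_duration_ms = 3 * 60 * 1000      # the 3-minute screen recording

stutter_rate_pct = sum(stall_durations_ms) / total_duration_ms * 100
print(f"{stutter_rate_pct:.2f}%")  # 1.03%
```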
Subjective image quality
The purpose of subjective video quality assessment is to accurately measure how the human eye perceives video content. As the source video passes through capture, encoding, transmission, and decoding on its way to the viewer, it inevitably picks up compression distortion; in severe cases there may even be green frames, smeared pictures, or mosaics. The industry uses two approaches to evaluate subjective video quality: subjective evaluation and objective evaluation. Subjective evaluation has human raters score every video. Objective evaluation uses mathematical models that approximate subjective scores to quantify perceived quality, which improves evaluation efficiency.
Subjective picture quality refers to the picture quality as received at the receiving end; the higher the rating, the better the picture.
Laboratory Test Methods
Observe the received video and assign a subjective picture-quality rating:
10 points: the moving globe is clear, the host's outline is clear, the host's facial features are clear; subjectively the picture looks lossless
9 points: the globe is clear, the host's outline is clear, and there is a slight mosaic on the host's facial features
8 points: the globe is clear, the host's outline is clear, and the host's facial features show mosaics
7 points: the globe is clear, the host's outline is still distinct, and the facial features are blurred with blocky mosaics
6 points: the globe is blurred, the host's outline is still distinct, and the facial features are blurred with heavy mosaics
5 points: the globe is blurred, the host's outline is blurred, and there are many mosaics on the face
4 points: the globe is smeared, the host's face is smeared, and there are many mosaics
3 points: the globe is smeared, the host's face is smeared, and there are heavy mosaics
2 points: the globe is blurred, the host's face is blurred, the facial features are indistinct, and there are heavy mosaics
1 point: the whole picture is blurred and barely recognizable
First-frame rendering
First-frame rendering time: the time from joining the channel to seeing the remote picture, in ms. The sooner the first frame appears, the better; ideally it should appear within 1 second.
Simply put, first-frame time is how long it takes for the first video frame to load and be displayed. So how is this time measured?
To measure it, we need a signal that marks the first frame as loaded: how do we decide that the first frame has been rendered, and at what moment?
The method used here is to capture screenshots at a fixed interval after playback starts and find the first capture that is at least 90% similar to a reference picture; that capture is the first frame. First-frame load time = system time when that capture was taken − system time when the channel was joined.
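The 90%-similarity search can be sketched like this. The similarity function here is a deliberately simple pixel-match ratio on numpy arrays, not the lab's actual comparison code:

```python
import numpy as np

def similarity(frame, reference):
    """Fraction of pixels identical between a captured frame and the reference."""
    return float(np.mean(frame == reference))

def first_frame_load_ms(captures, reference, join_time_ms, threshold=0.9):
    """captures: list of (capture_time_ms, frame). Return the load time of the
    first capture at least `threshold` similar to the reference, or None."""
    for ts, frame in captures:
        if similarity(frame, reference) >= threshold:
            return ts - join_time_ms
    return None
```

For example, if the channel was joined at t = 0 ms and the first sufficiently similar capture was taken at t = 500 ms, the function returns 500.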
Laboratory Test Methods
1. Phone 1 joins the channel first;
2. Phone 2 joins and leaves the same channel 10 times, staying in the channel about 10 s each time and waiting about 10 s between joins;
3. Record phone 2's entire join/leave process on a recording device at a fixed frame rate (e.g. 30 fps);
4. Transfer the recording to a computer and import it into the video editor Final Cut Pro. Step through the recorded frames to find every frame where the "join channel" button is tapped (Figure 5) and every frame where the first remote picture appears (Figure 6). Since the video was recorded at a fixed 30 fps, subtracting the frame numbers gives the number of frames between the two events; at 1/30 s per frame, each join's first-frame time follows directly;
5. First-frame rendering time = [(c × 30 + d) − (a × 30 + b)] × 1000 / 30 ms, where (a, b) and (c, d) are the seconds:frames timecodes of the button tap and the first rendered picture; average the 10 measurements;
■Figure 5
■Figure 6
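The step-5 formula can be sketched as a small helper. This assumes, as the formula's structure suggests, that (a, b) and (c, d) are seconds:frames timecodes read off the 30 fps recording; the timecodes in the example are hypothetical.

```python
FPS = 30  # the fixed recording frame rate from step 3

def first_frame_ms(tap_sec, tap_frame, render_sec, render_frame, fps=FPS):
    """Convert two seconds:frames timecodes into a first-frame rendering time in ms.
    (tap_sec, tap_frame) = (a, b); (render_sec, render_frame) = (c, d)."""
    frames_elapsed = (render_sec * fps + render_frame) - (tap_sec * fps + tap_frame)
    return frames_elapsed * 1000 / fps

# Hypothetical timecodes: button tapped at 12 s frame 4, first remote
# picture at 12 s frame 22 -> 18 frames -> 600 ms.
print(first_frame_ms(12, 4, 12, 22))  # 600.0
```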
Audio and video synchronization
The principle of the audio/video synchronization test is to record the live stream, split the recording into its video and audio tracks to obtain the picture time and the audio time, and take the difference between the two as the synchronization offset.
Audio/video synchronization reflects how closely video and audio are aligned. The accepted standard is −200 ms to +200 ms: the audio may lead the video by up to 200 ms, and the video may lead the audio by up to 200 ms. The closer the offset is to 0, the better the synchronization.
The common industry test is to have a tester watch the video and judge whether the lip movements on screen match the audio they hear. Because testers differ, this method carries an error of around 200 ms, so an objective test is also needed.
Laboratory Test Methods
1. Phone 1 and phone 2 join the same channel;
2. Loop the test video file on a computer;
3. Point phone 1's rear camera directly at the computer screen to capture the playing sync-test source, making sure the red rectangle and the numbers are fully visible on phone 2 (Figure 7);
4. Move phone 2 to another room so the audio played by the computer does not interfere with capturing the audio played by phone 2. Adjust the computer's playback volume so the beep is clearly audible near phone 2;
5. Use an iPhone to record phone 2's playback (picture and sound), positioned as shown in Figure 8 below;
■Figure 7
■Figure 8
6. Import the recording to a computer and analyze it. Every frame of the test source carries a frame number, and a row of squares is drawn at the bottom of the picture with a slider that advances one square per frame, each square representing 1/30 s (about 33 ms). The test video is made so that every beep sounds exactly when the slider is centered. So: play the recording, pause the moment you hear the beep, read the frame number or the slider position, and the offset between sound and picture can be computed.
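The frame-number arithmetic in step 6 reduces to multiplying the frame offset by 1/30 s. A sketch with hypothetical frame indices:

```python
FRAME_MS = 1000 / 30  # each recorded frame covers about 33 ms

def av_sync_offset_ms(beep_frame, slider_center_frame):
    """Offset between audio and video in ms. Positive: the beep was heard after
    the slider was centered (audio lags video); negative: audio leads.
    The frame indices are hypothetical values read off the paused recording."""
    return (beep_frame - slider_center_frame) * FRAME_MS

offset = av_sync_offset_ms(305, 302)  # beep heard 3 frames late
in_sync = -200 <= offset <= 200       # the -200 ms .. +200 ms standard
print(round(offset), in_sync)         # 100 True
```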
Bandwidth tracking
Bandwidth tracking means measuring the bitrate being transmitted in the channel.
Industry methods include testing the network with ping, using speed-test websites, and testing the video bandwidth with traceroute. In the laboratory we use Wireshark (a network packet analyzer) to capture packets and analyze the video's bandwidth.
Laboratory Test Methods
1. Connect the phone to the computer, open Xcode, and choose Window → Devices and Simulators → Devices, then select your phone; the "Identifier" field shown is the UDID;
2. Create an RVI (Remote Virtual Interface) to obtain a virtual network interface. In Terminal, run: $ rvictl -s <UDID>
3. Open Wireshark and select the newly created interface in the capture list (e.g. rvi0 in Figure 9);
■Figure 9
4. Open the app and start a call; Wireshark will display the captured packets;
5. Identify the sender's (local) IP and the receiver's (server's) IP. Generally, the local IP is the most frequent source address and the server IP is the most frequent destination address (Figures 10 and 11);
■Figure 10
■Figure 11
6. Copy the filter expression, open Statistics, paste it into the I/O Graph's display-filter field, and the graph of the captured traffic shows the real-time communication bandwidth (Figure 12).
■Figure 12
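What the I/O graph computes can be reproduced offline: bucket the captured packets by second and sum their sizes. A minimal sketch, assuming the capture has been exported as (timestamp, length) pairs; the demo packets are hypothetical.

```python
from collections import defaultdict

def kbits_per_second(packets):
    """packets: iterable of (timestamp_s, length_bytes), e.g. exported from a
    Wireshark capture. Returns {second: kilobits in that second}, mirroring
    what the I/O graph plots."""
    buckets = defaultdict(int)
    for ts, length in packets:
        buckets[int(ts)] += length * 8  # bytes -> bits
    return {sec: bits / 1000 for sec, bits in sorted(buckets.items())}

# Hypothetical capture: three packets in second 0, one in second 1.
demo = [(0.1, 1200), (0.5, 1200), (0.9, 600), (1.2, 1200)]
print(kbits_per_second(demo))  # {0: 24.0, 1: 9.6}
```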
These are the video experience metrics we use in RTC scenarios and the corresponding laboratory test methods; they can serve as references in day-to-day development. If there is anything else you would like to know about video experience, leave us a message.
Introduction to the Dev for Dev column
Dev for Dev (Developer for Developer) is a developer co-creation practice initiative jointly launched by Shengwang and the RTC Developer Community. Through technology sharing, exchange of ideas, and collaborative project building from the engineer's perspective, it gathers the power of developers, mines and delivers the most valuable technical content and projects, and fully releases the creativity of technology.