I recently ran into an ffmpeg pitfall at work, so I am recording it here. We have a requirement to splice segmented videos into one complete video, and we found that the duration was wrong after splicing with ffmpeg. For example, I used ffmpeg to stitch together four half-hour mp4 videos, and the resulting video was far longer than 2 hours. After watching it, I found a long freeze at each junction between segments, which made the final video too long.

The official ffmpeg documentation, Concatenating media files, introduces three video splicing methods, as follows:

1. For videos with the same encoding (concat demuxer)

List all the video file names in a text file, in the following format:

file '/path/to/file1.wav'
file '/path/to/file2.wav'
file '/path/to/file3.wav'
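
With hundreds of segments, the list file is easier to generate than to write by hand. Below is a minimal sketch of that step. The helper name make_concat_list and the directory layout are my own assumptions, and it relies on the segment file names sorting in playback order (for example, zero-padded numbers). A single quote inside a path is written as '\'' so that it survives the quoting used in the list file.

```shell
#!/bin/sh
# Hypothetical helper: print one "file '...'" line for every file with the
# given extension in a directory, in the shell's sorted glob order.
# Usage: make_concat_list /abs/path/to/segments mp4 > mylist.txt
make_concat_list() {
    dir=$1
    ext=$2
    for f in "$dir"/*."$ext"; do
        [ -e "$f" ] || continue    # skip when the glob matched nothing
        # Inside a single-quoted entry, write ' as '\''
        escaped=$(printf '%s' "$f" | sed "s/'/'\\\\''/g")
        printf "file '%s'\n" "$escaped"
    done
}
```

Passing an absolute directory keeps the entries independent of where the list file itself is stored.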

Then use the command ffmpeg -f concat -safe 0 -i mylist.txt -c copy output.wav to complete the splicing. This is also the fastest method. Roughly speaking, it joins the files end to end without any decoding or encoding, so the execution time is dominated by disk IO. In our test, splicing 100 files into one 5 GB video took only tens of seconds.
However, this method has its limitations. First, it can only splice videos that share the same encoding (for example, mp4 segments that were all encoded the same way). It also has a bug: the duration of a video spliced from mp4 files is incorrect, which is the problem I mentioned at the beginning. Because of this bug we almost changed our business requirements. The bug can be worked around by first converting all the mp4 files to ts files and then splicing the ts files, which does not show the problem.
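
A quick way to check whether a spliced file suffers from the duration problem is to compare the sum of the segment durations with the duration of the output. The sketch below uses ffprobe for that. The helper names probe_duration and sum_durations and the file names in the usage comments are placeholders of mine, not part of our production setup.

```shell
#!/bin/sh
# Print the container duration of one file in seconds, as reported by ffprobe.
probe_duration() {
    ffprobe -v error -show_entries format=duration \
        -of default=noprint_wrappers=1:nokey=1 "$1"
}

# Sum numbers read from stdin, one per line, and print the total in seconds.
sum_durations() {
    awk '{ total += $1 } END { printf "%.3f\n", total }'
}

# Usage (file names are placeholders):
#   for f in seg_*.mp4; do probe_duration "$f"; done | sum_durations
#   probe_duration output.mp4
```

If the second number is much larger than the first, the output contains the extra frozen time described above.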

The command for converting an mp4 file to a ts file is as follows:

ffmpeg -i input.mp4 -c copy output.ts

Because converting mp4 to ts does not involve encoding or decoding, it is also very fast. We used this method to work around the bug and complete the whole requirement. There are in fact two other ways to splice videos, but neither suits us, as discussed below.
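
The whole workaround (remux every mp4 to ts, then splice the ts files with the first method) can be sketched as a small script. This is an illustration under my own assumptions rather than our production code: the function name concat_via_ts, the part_NNNNN.ts naming and the FFMPEG override variable are all made up for the example, the sketch copies all streams with -c copy, and the working directory path must not contain single quotes. Depending on the ffmpeg version and the codecs involved, extra bitstream filter options may be needed, which the sketch does not cover.

```shell
#!/bin/sh
# Hypothetical override so the commands can be dry-run, e.g. FFMPEG=echo
FFMPEG=${FFMPEG:-ffmpeg}

# Remux each input to ts without re-encoding, then splice the ts files.
# Usage: concat_via_ts output_file workdir input1.mp4 input2.mp4 ...
concat_via_ts() {
    out=$1
    work=$2
    shift 2
    list="$work/tslist.txt"
    : > "$list"                      # start with an empty list file
    i=0
    for src in "$@"; do
        ts=$(printf '%s/part_%05d.ts' "$work" "$i")
        "$FFMPEG" -i "$src" -c copy "$ts" || return 1
        printf "file '%s'\n" "$ts" >> "$list"
        i=$((i + 1))
    done
    "$FFMPEG" -f concat -safe 0 -i "$list" -c copy "$out"
}
```

The inputs are processed in the order given on the command line, so the caller decides the playback order.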

2. Use the concat protocol

ffmpeg -i "concat:input1.ts|input2.ts|input3.ts" -c copy output.ts

We have not tested this method specifically. It does not seem to involve encoding or decoding, so it should be quite fast, but reports online say its requirements are fairly strict and recommend against it. The reason we did not use it is simply that we need to stitch hundreds of videos together, which would require a very long command line.
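
To show how the command grows with the number of segments, here is a small sketch that builds the concat: argument from a list of files. The helper name concat_url is my own, and as noted above we have not tested this method ourselves.

```shell
#!/bin/sh
# Hypothetical helper: join file names into a concat-protocol URL,
# e.g. concat:input1.ts|input2.ts|input3.ts
concat_url() {
    url="concat:"
    sep=""
    for f in "$@"; do
        url="$url$sep$f"
        sep="|"
    done
    printf '%s\n' "$url"
}

# Usage (file names are placeholders):
#   ffmpeg -i "$(concat_url part_*.ts)" -c copy output.ts
# The system limit on command-line length can be checked with:
#   getconf ARG_MAX
```

Every file name ends up inside one argument, so the length of that argument grows with the number and length of the paths.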

3. Use the concat filter

ffmpeg -i input1.mp4 -i input2.webm -i input3.mov -filter_complex "[0:v:0][0:a:0][1:v:0][1:a:0][2:v:0][2:a:0]concat=n=3:v=1:a=1[outv][outa]" -map "[outv]" -map "[outa]" output.mkv

This method is more complicated; for details, please refer to the official document Concatenating media files. Its advantage is that the result is stable and it supports videos in different formats, so it is also the most recommended way to splice videos. But the drawback is just as obvious: it has to decode and re-encode the video, which is expensive. Because of this performance problem, we also abandoned this approach.
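
The filter description follows a regular pattern: one [i:v:0][i:a:0] pair per input, followed by concat=n=N:v=1:a=1. The sketch below generates that string for N inputs. The helper name concat_filter is mine, and it assumes every input has at least one video stream and one audio stream, as in the command above.

```shell
#!/bin/sh
# Hypothetical helper: build the -filter_complex string for n inputs,
# taking the first video and first audio stream of each input.
concat_filter() {
    n=$1
    graph=""
    i=0
    while [ "$i" -lt "$n" ]; do
        graph="${graph}[${i}:v:0][${i}:a:0]"
        i=$((i + 1))
    done
    printf '%sconcat=n=%d:v=1:a=1[outv][outa]\n' "$graph" "$n"
}

# Usage (file names are placeholders):
#   ffmpeg -i input1.mp4 -i input2.webm -i input3.mov \
#       -filter_complex "$(concat_filter 3)" \
#       -map "[outv]" -map "[outa]" output.mkv
```

For three inputs it prints the same filter string as the command above.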

Here is our measured data. On a general-purpose server, stitching a 60-minute video takes 20-30 minutes (depending on the server configuration). That is acceptable in itself, but we have thousands of hours of video to stitch every day, which would need dozens of servers running at full capacity to finish within 24 hours, and that cost is still too high for us at the moment. We also asked others to try splicing with GPU acceleration. It is indeed much faster: a 1-hour video can be completed within 1 minute.
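
The "dozens of servers" figure can be checked with a rough calculation. The sketch below is only a back-of-the-envelope estimate: the inputs (3000 hours of video per day, 25 minutes of processing per hour of video) are illustrative values picked from the ranges above, not exact measurements, and it assumes each server handles one job at a time. The function name servers_needed is made up for the example.

```shell
#!/bin/sh
# Hypothetical estimate: servers needed to process a day's video within 24 hours.
# $1 = hours of video per day, $2 = processing minutes per hour of video
servers_needed() {
    hours_per_day=$1
    minutes_per_video_hour=$2
    total_minutes=$((hours_per_day * minutes_per_video_hour))
    minutes_per_server=$((24 * 60))
    # Integer division rounded up
    echo $(( (total_minutes + minutes_per_server - 1) / minutes_per_server ))
}

# Usage: servers_needed 3000 25
```

With these example inputs the result is 53 servers, which is consistent with the estimate above.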

Summary

We currently have no GPU resources, so we still use the first splicing method. Its biggest bottleneck at present is only network IO (downloading and uploading the videos). However, this approach limits us to splicing only: we cannot lower the resolution during splicing to reduce storage. In the long run, we must consider hardware acceleration to handle large volumes of video.


xindoo