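Preface

This test looks at two things:

  • How the number of decoding threads affects frame-extraction speed
  • How video resolution affects frame-extraction speed

Code demo

Test code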

import av
import time
from pathlib import Path

input_file = '/home/ponponon/Downloads/XiaoShengKeDeJiuShu_4-1080P.mp4'


def get_video_seconds(video_file_path: Path) -> int:
    """Return the video duration in whole seconds (frame count / average frame rate)."""
    with av.open(str(video_file_path), metadata_encoding='utf-8', metadata_errors='ignore') as container:
        stream = container.streams.video[0]
        return int(stream.frames/stream.average_rate)


total_seconds: int = get_video_seconds(Path(input_file))


items = []

for thread_count in range(10, 0, -1):  # test 10 threads down to 1
    count = 0

    s = time.time()
    with av.open(input_file, metadata_encoding='utf-8', metadata_errors='ignore') as container:
        video_stream = container.streams.video[0]
        video_stream.thread_type = 'AUTO'
        video_stream.thread_count = thread_count

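        # Rounded frame rate: every average_fps-th frame is one frame per second
        average_fps: int = round(video_stream.average_rate)

        # Every frame is decoded; only one frame per second is converted to RGB
        for index, frame in enumerate(container.decode(video_stream)):
            if index % average_fps == 0:
                frame.to_ndarray(format='rgb24')
                count += 1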
    e = time.time()

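    print(f'thread count is {thread_count}, elapsed time is {e-s}')

    # Row: thread count, elapsed seconds, speed multiple (video seconds / elapsed seconds)
    items.append([thread_count, round(e-s, 2), round(total_seconds/(e-s), 1)])

# Print the results as Markdown table rows
for item in items:
    print('| ', ' | '.join([str(i) for i in item]), ' |')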
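
The code above uses PyAV to extract frames from the video at a rate of one frame per second. The speed multiple it reports is the video duration in seconds divided by the elapsed time in seconds.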
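
The sampling rule and the speed figure can be illustrated without PyAV. This is only a minimal sketch: `sampled_indices` and `speed_multiple` are hypothetical helper names that do not appear in the test script. They simply mirror its `index % average_fps == 0` check and its `total_seconds/(e-s)` ratio.

```python
def sampled_indices(total_frames: int, fps: int) -> list[int]:
    """Indices of the frames that get converted to RGB (one per second of video)."""
    return [i for i in range(total_frames) if i % fps == 0]


def speed_multiple(video_seconds: float, elapsed_seconds: float) -> float:
    """Seconds of video processed per second of wall-clock time, to one decimal."""
    return round(video_seconds / elapsed_seconds, 1)


if __name__ == "__main__":
    # A 3-second clip at 30 fps has 90 frames; three of them are sampled.
    print(sampled_indices(90, 30))
    # A 575-second video processed in 46.3 seconds.
    print(speed_multiple(575, 46.3))
```

Note that the test script still decodes every frame. The modulo check only decides which frames are converted with `to_ndarray`, so the timings mostly measure decoding cost.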

Videos at different resolutions

High-resolution video

Re-running the test, this time with the speed multiple added

The video is 1080P

╰─➤  ffmpeg -i /Volumes/SanDisk128G/标准视频/XiaoShengKeDeJiuShu_4-1080P.mp4                                         
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from '/Volumes/SanDisk128G/标准视频/XiaoShengKeDeJiuShu_4-1080P.mp4':
  Metadata:
    major_brand     : isom
    minor_version   : 512
    compatible_brands: isomiso2avc1mp41
    encoder         : Lavf57.71.100
    description     : Packed by Bilibili XCoder v2.0.2
  Duration: 00:09:35.04, start: 0.000000, bitrate: 2832 kb/s
  Stream #0:0[0x1](und): Video: h264 (High) (avc1 / 0x31637661), yuv420p(tv, bt709, progressive), 1920x1080 [SAR 1:1 DAR 16:9], 2631 kb/s, 30 fps, 30 tbr, 16k tbn (default)
    Metadata:
      handler_name    : VideoHandler
      vendor_id       : [0][0][0][0]
  Stream #0:1[0x2](und): Audio: aac (LC) (mp4a / 0x6134706D), 44100 Hz, stereo, fltp, 192 kb/s (default)
    Metadata:
      handler_name    : SoundHandler
      vendor_id       : [0][0][0][0]
At least one output file must be specified

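Platform: MacBook Pro, Apple Silicon M1

| Threads | Elapsed (s) | Speed multiple |
| --- | --- | --- |
| 8 | 12.17 | 47.3 |
| 7 | 12.52 | 45.9 |
| 6 | 13.27 | 43.3 |
| 5 | 14.48 | 39.7 |
| 4 | 16.95 | 33.9 |
| 3 | 21.58 | 26.6 |
| 2 | 28.58 | 20.1 |
| 1 | 46.3 | 12.4 |

This test ran directly on the MacBook with no resource limits: no virtual machine or Docker was used to cap CPU or memory, so the process was free to use whatever it could.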

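Platform: Intel(R) Xeon(R) CPU E5-2690 v4 @ 2.60GHz

| Threads | Elapsed (s) | Speed multiple |
| --- | --- | --- |
| 10 | 20.98 | 27.4 |
| 9 | 20.98 | 27.4 |
| 8 | 21.5 | 26.7 |
| 7 | 21.39 | 26.9 |
| 6 | 23.82 | 24.1 |
| 5 | 24.85 | 23.1 |
| 4 | 30.03 | 19.1 |
| 3 | 37.96 | 15.1 |
| 2 | 50.08 | 11.5 |
| 1 | 74.48 | 7.7 |

This test ran directly on the Intel(R) Xeon(R) CPU E5-2690 v4 @ 2.60GHz machine with no resource limits: no virtual machine or Docker was used to cap CPU or memory.
The machine has 28 physical cores and 56 logical cores (consistent with two 14-core E5-2690 v4 sockets), and the process can use all of them. That said, FFmpeg and PyAV will not use every core they find; as far as I remember, the upper limit is 16 threads.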

Low-resolution video

Now the same test with a low-resolution video

The video is 540P

╰─➤  ffmpeg -i /home/ponponon/Downloads/XiaoShengKeDeJiuShu_4-540P.mp4                         
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from '/home/ponponon/Downloads/XiaoShengKeDeJiuShu_4-540P.mp4':
  Metadata:
    major_brand     : isom
    minor_version   : 512
    compatible_brands: isomiso2avc1mp41
    encoder         : Lavf59.27.100
    description     : Packed by Bilibili XCoder v2.0.2
  Duration: 00:09:35.04, start: 0.000000, bitrate: 1049 kb/s
  Stream #0:0(und): Video: h264 (High) (avc1 / 0x31637661), yuv420p(tv, bt709), 960x540 [SAR 1:1 DAR 16:9], 849 kb/s, 30 fps, 30 tbr, 15360 tbn, 60 tbc (default)
    Metadata:
      handler_name    : VideoHandler
      vendor_id       : [0][0][0][0]
      encoder         : Lavc59.37.100 libx264
  Stream #0:1(und): Audio: aac (LC) (mp4a / 0x6134706D), 44100 Hz, stereo, fltp, 192 kb/s (default)
    Metadata:
      handler_name    : SoundHandler
      vendor_id       : [0][0][0][0]
At least one output file must be specified

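Platform: MacBook Pro, Apple Silicon M1

| Threads | Elapsed (s) | Speed multiple |
| --- | --- | --- |
| 8 | 4.81 | 119.5 |
| 7 | 4.72 | 121.7 |
| 6 | 4.81 | 119.5 |
| 5 | 4.96 | 116.0 |
| 4 | 6.65 | 86.4 |
| 3 | 8.16 | 70.5 |
| 2 | 9.98 | 57.6 |
| 1 | 14.68 | 39.2 |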

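Platform: Intel(R) Xeon(R) CPU E5-2690 v4 @ 2.60GHz

| Threads | Elapsed (s) | Speed multiple |
| --- | --- | --- |
| 10 | 11.2 | 51.3 |
| 9 | 11.59 | 49.6 |
| 8 | 12.47 | 46.1 |
| 7 | 13.79 | 41.7 |
| 6 | 12.38 | 46.4 |
| 5 | 8.49 | 67.7 |
| 4 | 14.92 | 38.6 |
| 3 | 13.02 | 44.2 |
| 2 | 17.04 | 33.7 |
| 1 | 23.8 | 24.2 |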

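Summary

Conclusion 1: multithreading speeds up frame extraction, but the speedup is not linear. Each additional thread adds a smaller gain in speed.

Conclusion 2: the higher the video resolution, the more multithreading helps.

For the high-resolution video, 8 threads are roughly 3.5 times (Xeon) to 3.8 times (M1) as fast as 1 thread.
For the low-resolution video, 8 threads are roughly 1.9 times (Xeon) to 3.1 times (M1) as fast as 1 thread.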
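
To make the "not linear" point concrete, the sketch below recomputes the 8-thread speedup from the elapsed times in the four unrestricted tables above, along with the parallel efficiency (speedup divided by thread count). The names `ELAPSED`, `speedup`, and `efficiency` are my own for illustration; the numbers are copied from the tables.

```python
# Elapsed seconds copied from the four unrestricted tables above,
# stored as (1 thread, 8 threads) per platform and resolution.
ELAPSED = {
    ("M1", "1080P"): (46.3, 12.17),
    ("Xeon", "1080P"): (74.48, 21.5),
    ("M1", "540P"): (14.68, 4.81),
    ("Xeon", "540P"): (23.8, 12.47),
}


def speedup(t_single: float, t_multi: float) -> float:
    """How many times faster the multithreaded run is than the single-threaded run."""
    return t_single / t_multi


def efficiency(t_single: float, t_multi: float, threads: int) -> float:
    """Speedup per thread; 1.0 would mean perfectly linear scaling."""
    return speedup(t_single, t_multi) / threads


if __name__ == "__main__":
    for (platform, video), (t1, t8) in ELAPSED.items():
        print(f"{platform} {video}: speedup {speedup(t1, t8):.1f}x, "
              f"efficiency {efficiency(t1, t8, 8):.0%}")
```

On both platforms the efficiency at 8 threads is below 50%, and it is lower for the 540P video than for the 1080P one, which matches both conclusions.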

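Can multithreading slow things down?

If the machine has only one CPU core and PyAV extracts frames with multiple threads, does that hurt? (That is, does the multithreaded run end up slower than the single-threaded one?)

To find out, I used Vagrant with VirtualBox and limited the VM to 1 CPU.

High-resolution video

The same 1080P video as before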

╰─➤  ffmpeg -i /Volumes/SanDisk128G/标准视频/XiaoShengKeDeJiuShu_4-1080P.mp4                                         
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from '/Volumes/SanDisk128G/标准视频/XiaoShengKeDeJiuShu_4-1080P.mp4':
  Metadata:
    major_brand     : isom
    minor_version   : 512
    compatible_brands: isomiso2avc1mp41
    encoder         : Lavf57.71.100
    description     : Packed by Bilibili XCoder v2.0.2
  Duration: 00:09:35.04, start: 0.000000, bitrate: 2832 kb/s
  Stream #0:0[0x1](und): Video: h264 (High) (avc1 / 0x31637661), yuv420p(tv, bt709, progressive), 1920x1080 [SAR 1:1 DAR 16:9], 2631 kb/s, 30 fps, 30 tbr, 16k tbn (default)
    Metadata:
      handler_name    : VideoHandler
      vendor_id       : [0][0][0][0]
  Stream #0:1[0x2](und): Audio: aac (LC) (mp4a / 0x6134706D), 44100 Hz, stereo, fltp, 192 kb/s (default)
    Metadata:
      handler_name    : SoundHandler
      vendor_id       : [0][0][0][0]
At least one output file must be specified

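Platform: Intel(R) Xeon(R) CPU E5-2690 v4 @ 2.60GHz, limited to one CPU core

| Threads | Elapsed (s) | Speed multiple |
| --- | --- | --- |
| 10 | 99.71 | 5.8 |
| 9 | 95.68 | 6.0 |
| 8 | 95.02 | 6.1 |
| 7 | 92.03 | 6.2 |
| 6 | 91.68 | 6.3 |
| 5 | 88.56 | 6.5 |
| 4 | 87.9 | 6.5 |
| 3 | 85.73 | 6.7 |
| 2 | 84.11 | 6.8 |
| 1 | 75.16 | 7.7 |

This test ran under Vagrant with VirtualBox, with the VM limited to 1 CPU.

As the table shows, multithreading does hurt here: with 10 threads the speed multiple falls from 7.7 to 5.8, about 25% lower than with 1 thread (about 33% more wall-clock time).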

Low-resolution video

The same 540P video as before

╰─➤  ffmpeg -i /home/ponponon/Downloads/XiaoShengKeDeJiuShu_4-540P.mp4                         
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from '/home/ponponon/Downloads/XiaoShengKeDeJiuShu_4-540P.mp4':
  Metadata:
    major_brand     : isom
    minor_version   : 512
    compatible_brands: isomiso2avc1mp41
    encoder         : Lavf59.27.100
    description     : Packed by Bilibili XCoder v2.0.2
  Duration: 00:09:35.04, start: 0.000000, bitrate: 1049 kb/s
  Stream #0:0(und): Video: h264 (High) (avc1 / 0x31637661), yuv420p(tv, bt709), 960x540 [SAR 1:1 DAR 16:9], 849 kb/s, 30 fps, 30 tbr, 15360 tbn, 60 tbc (default)
    Metadata:
      handler_name    : VideoHandler
      vendor_id       : [0][0][0][0]
      encoder         : Lavc59.37.100 libx264
  Stream #0:1(und): Audio: aac (LC) (mp4a / 0x6134706D), 44100 Hz, stereo, fltp, 192 kb/s (default)
    Metadata:
      handler_name    : SoundHandler
      vendor_id       : [0][0][0][0]
At least one output file must be specified

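Platform: Intel(R) Xeon(R) CPU E5-2690 v4 @ 2.60GHz, limited to one CPU core

| Threads | Elapsed (s) | Speed multiple |
| --- | --- | --- |
| 10 | 29.61 | 19.4 |
| 9 | 28.56 | 20.1 |
| 8 | 28.56 | 20.1 |
| 7 | 27.84 | 20.7 |
| 6 | 27.82 | 20.7 |
| 5 | 27.47 | 20.9 |
| 4 | 28.13 | 20.4 |
| 3 | 27.21 | 21.1 |
| 2 | 27.06 | 21.2 |
| 1 | 23.9 | 24.1 |

This test ran under Vagrant with VirtualBox, with the VM limited to 1 CPU.

Again, multithreading hurts: with 10 threads the speed multiple falls from 24.1 to 19.4, about 20% lower than with 1 thread (about 24% more wall-clock time).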
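
"Slower by N%" can be read two ways, so here is the arithmetic for both single-core runs as a small sketch. The function names are hypothetical; the elapsed times are the 1-thread and 10-thread rows of the two single-core tables.

```python
def throughput_drop(t_one_thread: float, t_many_threads: float) -> float:
    """Fraction by which the speed multiple falls (speed is proportional to 1 / time)."""
    return 1 - t_one_thread / t_many_threads


def extra_wall_time(t_one_thread: float, t_many_threads: float) -> float:
    """Fraction of additional wall-clock time the multithreaded run needs."""
    return t_many_threads / t_one_thread - 1


if __name__ == "__main__":
    # Elapsed seconds from the single-core tables: (1 thread, 10 threads).
    runs = {"1080P": (75.16, 99.71), "540P": (23.9, 29.61)}
    for video, (t1, t10) in runs.items():
        print(f"{video}: speed down {throughput_drop(t1, t10):.1%}, "
              f"time up {extra_wall_time(t1, t10):.1%}")
```

The speed multiple falls by about a quarter for 1080P and about a fifth for 540P, while the wall-clock time grows by about a third and about a quarter respectively.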

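Summary

If your process runs in Kubernetes or in a Docker container and its CPU is limited through cgroups, I recommend setting PyAV's thread count explicitly to a sensible value.

PyAV's default thread count follows the number of CPU cores on the machine. Suppose the machine has 100 cores, but the process runs in a Docker container limited to a single core. PyAV still believes it can use 100 cores and starts many threads, yet together they can use at most one core, so extraction ends up slower.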
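
One way to pick that thread count is to read the container's CPU limit directly. The following is only a sketch under stated assumptions: it assumes a Linux container using cgroup v2, where the limit is exposed in `/sys/fs/cgroup/cpu.max` as `<quota> <period>` or `max <period>`. It does not handle cgroup v1, and `parse_cpu_max` and `effective_cpu_count` are hypothetical helper names, not part of PyAV.

```python
import math
import os
from pathlib import Path
from typing import Optional


def parse_cpu_max(text: str) -> Optional[float]:
    """Parse cgroup v2 cpu.max contents; return the CPU limit, or None if unlimited."""
    quota, period = text.split()
    if quota == "max":
        return None  # no CPU quota configured
    return int(quota) / int(period)


def effective_cpu_count(cpu_max_path: str = "/sys/fs/cgroup/cpu.max") -> int:
    """Best-effort estimate of how many CPUs this process can actually use."""
    # Start from the CPUs the scheduler allows (Linux), else the machine total.
    if hasattr(os, "sched_getaffinity"):
        count = len(os.sched_getaffinity(0))
    else:
        count = os.cpu_count() or 1
    try:
        limit = parse_cpu_max(Path(cpu_max_path).read_text())
    except (OSError, ValueError):
        limit = None  # file missing or malformed: assume no cgroup v2 limit
    if limit is not None:
        # A fractional quota such as 0.5 CPU still needs at least one thread.
        count = min(count, max(1, math.ceil(limit)))
    return max(1, count)


if __name__ == "__main__":
    print(effective_cpu_count())
```

The result could then be assigned before decoding, for example `video_stream.thread_count = effective_cpu_count()`, in the same place where the test script sets `thread_count`.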


universe_king