[Model Inference] Deploying yolov3 and yolov4 with deepstream6.0: A Tutorial

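Hello everyone, I'm 极智视界. This article describes how to deploy yolov3 and yolov4 with deepstream6.0.

The Yolo family of object detection algorithms is very widely used in engineering practice. Starting with yolov3 and evolving step by step into yolov4, yolov5, and so on, it has seen steadily growing adoption. deepstream is a pipeline application framework from NVIDIA that speeds up getting deep learning into production. So what happens when deepstream meets yolo? Let's take a look.

For installing deepstream, see my earlier posts: 《【经验分享】ubuntu 安装 deepstream6.0》 (installing deepstream6.0 on ubuntu) and 《【经验分享】ubuntu 安装 deepstream5.1》 (installing deepstream5.1 on ubuntu).

First, here is the directory structure of the deepstream6.0 sources: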

apps
- apps-common
- audio_apps
- sample_apps: sample applications, such as deepstream-app, deepstream-test1...
gst-plugins: gstreamer plugins
include: header files
libs: libraries
objectDetector_FasterRCNN: FasterRCNN sample
objectDetector_SSD: SSD sample
objectDetector_Yolo: YOLO sample
tools: logging-related tools

1. Deploying yolov3 with deepstream6.0

We run yolov3 through the objectDetector_Yolo project listed above. Within that project, these are the pieces to focus on:

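nvdsinfer_custom_impl_Yolo: the yolov3 implementation code;
- nvdsinfer_yolo_engine.cpp: parses the model and builds the engine
- nvdsparsebbox_Yolo.cpp: parsing function for the output layers; extracts the detection bounding boxes
- trt_utils.cpp and trt_utils.h: interface and implementation of the utility classes used to build the TensorRT network
- yolo.cpp and yolo.h: interface and implementation for building the yolo engine
- yoloPlugins.cpp and yoloPlugins.h: interface and implementation of YoloLayerV3 and YoloLayerV3PluginCreator
- kernels.cu: low-level CUDA kernel implementation
config_infer_xxx.txt: configuration file for the Gstreamer nvinfer plugin (the model configuration);
deepstream_app_config_xxx.txt: configuration file for the deepstream-app pipeline;
xxx.cfg, xxx.weights: model files;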

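That is all we need. Let's get started.

1.1 Download the model files

The deepstream6.0 SDK does not ship the yolov3 model files, so you need to download them yourself. The links are:

yolov3.cfg: https://github.com/pjreddie/d...

yolov3.weights: https://link.zhihu.com/?targe...

One more note: if you already have a TensorRT yolov3.engine, you do not need the original model files. If you do not have a .engine file, one is generated from the original files first.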
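The choice between a prebuilt engine and the original model files can be sketched as below. This only illustrates the decision, not how deepstream implements it internally; the helper name `resolve_model_source` and the convention that the engine sits next to the .cfg and .weights files are assumptions made for the example.

```python
from pathlib import Path
from typing import Dict


def resolve_model_source(model_dir: str, name: str = "yolov3") -> Dict[str, str]:
    """Report whether a prebuilt engine or the cfg/weights pair would be used.

    Hypothetical helper: assumes <name>.engine, <name>.cfg and <name>.weights
    all live in the same directory.
    """
    base = Path(model_dir)
    engine = base / f"{name}.engine"
    cfg = base / f"{name}.cfg"
    weights = base / f"{name}.weights"

    # A prebuilt TensorRT engine makes the original model files unnecessary.
    if engine.is_file():
        return {"mode": "engine", "engine": str(engine)}

    # Otherwise the engine has to be built from the darknet files first.
    if cfg.is_file() and weights.is_file():
        return {"mode": "build", "cfg": str(cfg), "weights": str(weights)}

    raise FileNotFoundError(
        f"need either {engine.name} or both {cfg.name} and {weights.name} in {base}"
    )
```

Calling `resolve_model_source(".")` from the project directory tells you up front whether the first launch will spend time building an engine.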

1.2 Configure config_infer_primary_yoloV3.txt

[property]
gpu-id=0
net-scale-factor=0.0039215697906911373
#0=RGB, 1=BGR
model-color-format=0
custom-network-config=yolov3.cfg
model-file=yolov3.weights
labelfile-path=labels.txt
int8-calib-file=yolov3-calibration.table.trt7.0
## 0=FP32, 1=INT8, 2=FP16 mode
network-mode=1
num-detected-classes=80
gie-unique-id=1
network-type=0
is-classifier=0
cluster-mode=2
maintain-aspect-ratio=1
parse-bbox-func-name=NvDsInferParseCustomYoloV3
custom-lib-path=nvdsinfer_custom_impl_Yolo/libnvdsinfer_custom_impl_Yolo.so
engine-create-func-name=NvDsInferYoloCudaEngineGet

[class-attrs-all]
nms-iou-threshold=0.3
threshold=0.7
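Two values in this file are easier to read once decoded: net-scale-factor is 1/255 (it maps 8-bit pixel values into the range 0 to 1), and network-mode selects the precision. The sketch below parses a trimmed copy of the config with Python's standard `configparser` to make that explicit. The function name `summarize_infer_config` and the embedded sample text are illustrative only; this is not part of deepstream.

```python
import configparser
import math

# Trimmed copy of the [property] and [class-attrs-all] groups shown above.
SAMPLE_INFER_CONFIG = """
[property]
net-scale-factor=0.0039215697906911373
network-mode=1
num-detected-classes=80
cluster-mode=2

[class-attrs-all]
nms-iou-threshold=0.3
threshold=0.7
"""

# Mapping taken from the comment in the config: 0=FP32, 1=INT8, 2=FP16.
NETWORK_MODES = {0: "FP32", 1: "INT8", 2: "FP16"}


def summarize_infer_config(text: str) -> dict:
    """Decode a few nvinfer settings from config text (hypothetical helper)."""
    parser = configparser.ConfigParser(
        delimiters=("=",), comment_prefixes=("#",), interpolation=None
    )
    parser.read_string(text)
    prop = parser["property"]
    scale = prop.getfloat("net-scale-factor")
    return {
        "precision": NETWORK_MODES[prop.getint("network-mode")],
        # True when the factor is (numerically) 1/255.
        "scales_to_unit_range": math.isclose(scale, 1.0 / 255.0, rel_tol=1e-6),
        "num_classes": prop.getint("num-detected-classes"),
    }


print(summarize_infer_config(SAMPLE_INFER_CONFIG))
```

With network-mode=1 the engine is built in INT8, which is why the config also points at a calibration table through int8-calib-file.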

1.3 Configure deepstream_app_config_yoloV3.txt

[application]
enable-perf-measurement=1
perf-measurement-interval-sec=5

[tiled-display]
enable=1
rows=1
columns=1
width=1280
height=720
gpu-id=0
nvbuf-memory-type=0

[source0]
enable=1
type=3
uri=file://../../samples/streams/sample_1080p_h264.mp4
num-sources=1
gpu-id=0
cudadec-memtype=0

[sink0]
enable=1
#Type - 1=FakeSink 2=EglSink 3=File
type=2
sync=0
source-id=0
gpu-id=0
nvbuf-memory-type=0

[osd]
enable=1
gpu-id=0
border-width=1
text-size=15
text-color=1;1;1;1;
text-bg-color=0.3;0.3;0.3;1
font=Serif
show-clock=0
clock-x-offset=800
clock-y-offset=820
clock-text-size=12
clock-color=1;0;0;0
nvbuf-memory-type=0

[streammux]
gpu-id=0
live-source=0
batch-size=1
batched-push-timeout=40000
width=1920
height=1080
enable-padding=0
nvbuf-memory-type=0

[primary-gie]
enable=1
gpu-id=0
#model-engine-file=model_b1_gpu0_int8.engine
labelfile-path=labels.txt
batch-size=1
bbox-border-color0=1;0;0;1
bbox-border-color1=0;1;1;1
bbox-border-color2=0;0;1;1
bbox-border-color3=0;1;0;1
interval=2
gie-unique-id=1
nvbuf-memory-type=0
config-file=config_infer_primary_yoloV3.txt

[tracker]
enable=1
tracker-width=640
tracker-height=384
ll-lib-file=/opt/nvidia/deepstream/deepstream-6.0/lib/libnvds_nvmultiobjecttracker.so
ll-config-file=../../samples/configs/deepstream-app/config_tracker_NvDCF_perf.yml
gpu-id=0
enable-batch-process=1
enable-past-frame=1
display-tracking-id=1

[tests]
file-loop=0
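The interval=2 setting in [primary-gie] is worth a closer look. As I understand the nvinfer documentation, it is the number of consecutive batches skipped between two inference runs, and with batch-size=1 a batch is a single frame. The sketch below models that reading; it is a simplification for intuition, not deepstream code, and `inferred_frames` is a made-up name.

```python
from typing import List


def inferred_frames(num_frames: int, interval: int) -> List[int]:
    """Frame indices on which the detector runs.

    Simplified model: batch-size=1, the first frame is inferred, and
    `interval` frames are skipped after every inferred frame.
    """
    if interval < 0:
        raise ValueError("interval must be >= 0")
    return [i for i in range(num_frames) if i % (interval + 1) == 0]


print(inferred_frames(10, 2))  # [0, 3, 6, 9]
print(inferred_frames(4, 0))   # [0, 1, 2, 3]
```

Under this reading, the detector runs on one frame out of three here, and the enabled [tracker] group is what carries the boxes across the skipped frames.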

1.4 Build the project

Go to /opt/nvidia/deepstream/deepstream-6.0/sources/objectDetector_Yolo:

cd /opt/nvidia/deepstream/deepstream-6.0/sources/objectDetector_Yolo

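Run the following two commands in order to build the .so file:

export CUDA_VER=11.4    # set this to the CUDA version installed on your device

Alternatively, set CUDA_VER in /opt/nvidia/deepstream/deepstream-6.0/sources/objectDetector_Yolo/nvdsinfer_custom_impl_Yolo/Makefile.

Then build: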

make -C nvdsinfer_custom_impl_Yolo

The build produces the shared library libnvdsinfer_custom_impl_Yolo.so.
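If you are unsure which value to give CUDA_VER, the release number printed by `nvcc --version` is one place to read it from. The sketch below extracts it from that output. The sample line mirrors the usual nvcc format, but the exact build number in it is illustrative, and `cuda_ver_from_nvcc` is a hypothetical helper name.

```python
import re
from typing import Optional


def cuda_ver_from_nvcc(output: str) -> Optional[str]:
    """Extract the major.minor CUDA release from `nvcc --version` output."""
    match = re.search(r"release\s+(\d+\.\d+)", output)
    return match.group(1) if match else None


# Illustrative last line of `nvcc --version`; the build number is made up.
sample_output = "Cuda compilation tools, release 11.4, V11.4.48"
print(cuda_ver_from_nvcc(sample_output))  # 11.4
```

On a real machine you could feed it `subprocess.run(["nvcc", "--version"], capture_output=True, text=True).stdout`, provided nvcc is on the PATH.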

1.5 Run

deepstream-app -c deepstream_app_config_yoloV3.txt

This completes the yolov3 deployment with deepstream6.0.
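If deepstream-app fails at startup, a missing file is a common cause. The sketch below is a small pre-flight check over the files this section relies on. The file list is collected from the configs above; the helper `missing_files` and the assumption that everything lives under the objectDetector_Yolo directory are mine.

```python
from pathlib import Path
from typing import Iterable, List

# Files referenced by the yolov3 configs and commands in this section.
YOLOV3_FILES = (
    "yolov3.cfg",
    "yolov3.weights",
    "labels.txt",
    "config_infer_primary_yoloV3.txt",
    "deepstream_app_config_yoloV3.txt",
    "nvdsinfer_custom_impl_Yolo/libnvdsinfer_custom_impl_Yolo.so",
)


def missing_files(project_dir: str, required: Iterable[str] = YOLOV3_FILES) -> List[str]:
    """Return the required files (relative paths) that are absent."""
    base = Path(project_dir)
    return [rel for rel in required if not (base / rel).is_file()]
```

An empty list from `missing_files("/opt/nvidia/deepstream/deepstream-6.0/sources/objectDetector_Yolo")` means all of these files are in place; it does not validate their contents.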

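2. Deploying yolov4 with deepstream6.0

Here we deploy yolov4 in a different way: by loading a TensorRT engine directly instead of importing the original model.

2.1 Generate yolov4.engine via darknet2onnx2TRT

Download the original yolov4 darknet weights. A Baidu Netdisk link: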

https://pan.baidu.com/s/1dAGEW8cm-dqK14TbhhVetA     Extraction code:dm5b

Clone the model conversion project:

git clone https://github.com/Tianxiaomo/pytorch-YOLOv4.git Yolov42TRT

Run the model conversion:

cd Yolov42TRT

# darknet2onnx
python demo_darknet2onnx.py ./cfg/yolov4.cfg ./cfg/yolov4.weights ./data/dog.jpg 1

# onnx2trt
trtexec --onnx=./yolov4_1_3_608_608_static.onnx --fp16 --saveEngine=./yolov4.engine --device=0

This produces yolov4.engine.

2.2 Configure the deepstream yolov4 inference project

Clone the deepstream yolov4 inference project:
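If you convert with a different batch size or input resolution, the ONNX file name and the trtexec arguments change together. The sketch below assembles both. The naming pattern is inferred from the single file name yolov4_1_3_608_608_static.onnx used above, so treat it as an assumption; `onnx_name` and `trtexec_command` are hypothetical helpers, and only the trtexec flags already shown in this section are used.

```python
from typing import List


def onnx_name(batch: int, height: int, width: int) -> str:
    """Assumed naming pattern, inferred from yolov4_1_3_608_608_static.onnx."""
    return f"yolov4_{batch}_3_{height}_{width}_static.onnx"


def trtexec_command(onnx_path: str, engine_path: str,
                    fp16: bool = True, device: int = 0) -> List[str]:
    """Build the trtexec argument list used in the conversion step."""
    cmd = ["trtexec", f"--onnx={onnx_path}"]
    if fp16:
        cmd.append("--fp16")
    cmd += [f"--saveEngine={engine_path}", f"--device={device}"]
    return cmd


print(" ".join(trtexec_command("./" + onnx_name(1, 608, 608), "./yolov4.engine")))
```

The printed line matches the trtexec command above. Keep in mind that a TensorRT engine is tied to the GPU and TensorRT version it was built with, so it is safest to run the conversion on the machine that will run deepstream.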

git clone https://github.com/NVIDIA-AI-IOT/yolov4_deepstream.git

cd yolov4_deepstream/deepstream_yolov4

Configure config_infer_primary_yoloV4.txt:

[property]
gpu-id=0
net-scale-factor=0.0039215697906911373
#0=RGB, 1=BGR
model-color-format=0
model-engine-file=yolov4.engine
labelfile-path=labels.txt
batch-size=1
## 0=FP32, 1=INT8, 2=FP16 mode
network-mode=2
num-detected-classes=80
gie-unique-id=1
network-type=0
is-classifier=0
## 0=Group Rectangles, 1=DBSCAN, 2=NMS, 3= DBSCAN+NMS Hybrid, 4 = None(No clustering)
cluster-mode=2
maintain-aspect-ratio=1
parse-bbox-func-name=NvDsInferParseCustomYoloV4
custom-lib-path=nvdsinfer_custom_impl_Yolo/libnvdsinfer_custom_impl_Yolo.so

[class-attrs-all]
nms-iou-threshold=0.6
pre-cluster-threshold=0.4
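With cluster-mode=2 the detections are clustered by NMS, and the two values in [class-attrs-all] control it: pre-cluster-threshold drops low-confidence boxes first, then nms-iou-threshold decides how much overlap is allowed between the boxes that remain. The sketch below is a plain greedy NMS written for illustration. It ignores class labels and is not the deepstream implementation; `iou` and `filter_and_nms` are names chosen for the example.

```python
from typing import List, Tuple

# (x1, y1, x2, y2, score)
Box = Tuple[float, float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def filter_and_nms(boxes: List[Box],
                   pre_cluster_threshold: float = 0.4,
                   nms_iou_threshold: float = 0.6) -> List[Box]:
    """Drop low-score boxes, then greedily suppress heavy overlaps."""
    candidates = sorted(
        (b for b in boxes if b[4] >= pre_cluster_threshold),
        key=lambda b: b[4],
        reverse=True,
    )
    kept: List[Box] = []
    for box in candidates:
        # Keep a box only if it does not overlap a better box too much.
        if all(iou(box, k) <= nms_iou_threshold for k in kept):
            kept.append(box)
    return kept


boxes = [
    (0, 0, 10, 10, 0.9),    # kept: highest score
    (1, 1, 11, 11, 0.8),    # suppressed: IoU with the first box is about 0.68
    (20, 20, 30, 30, 0.5),  # kept: no overlap
    (0, 0, 10, 10, 0.3),    # dropped: below pre-cluster-threshold
]
print(filter_and_nms(boxes))
```

Raising nms-iou-threshold keeps more overlapping boxes; raising pre-cluster-threshold removes more weak detections before clustering.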

Configure deepstream_app_config_yoloV4.txt:

[application]
enable-perf-measurement=1
perf-measurement-interval-sec=5

[tiled-display]
enable=0
rows=1
columns=1
width=1280
height=720
gpu-id=0
nvbuf-memory-type=0

[source0]
enable=1
type=3
uri=file:/opt/nvidia/deepstream/deepstream-6.0/samples/streams/sample_1080p_h264.mp4
num-sources=1
gpu-id=0
cudadec-memtype=0

[sink0]
enable=1
#Type - 1=FakeSink 2=EglSink 3=File
type=3
sync=0
source-id=0
gpu-id=0
nvbuf-memory-type=0
container=1
codec=1
output-file=yolov4.mp4

[osd]
enable=1
gpu-id=0
border-width=1
text-size=12
text-color=1;1;1;1;
text-bg-color=0.3;0.3;0.3;1
font=Serif
show-clock=0
clock-x-offset=800
clock-y-offset=820
clock-text-size=12
clock-color=1;0;0;0
nvbuf-memory-type=0

[streammux]
gpu-id=0
live-source=0
batch-size=1
batched-push-timeout=40000
width=1280
height=720
enable-padding=0
nvbuf-memory-type=0

[primary-gie]
enable=1
gpu-id=0
model-engine-file=yolov4.engine
labelfile-path=labels.txt
batch-size=1

bbox-border-color0=1;0;0;1
bbox-border-color1=0;1;1;1
bbox-border-color2=0;0;1;1
bbox-border-color3=0;1;0;1
interval=0
gie-unique-id=1
nvbuf-memory-type=0
config-file=config_infer_primary_yoloV4.txt

[tracker]
enable=0
tracker-width=512
tracker-height=320
# The tracker is disabled (enable=0), so this leftover deepstream-5.0 path is not used.
ll-lib-file=/opt/nvidia/deepstream/deepstream-5.0/lib/libnvds_mot_klt.so

[tests]
file-loop=0
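The engine file name and batch size appear in both files: in [primary-gie] of the app config and in [property] of the nvinfer config. My understanding is that the [primary-gie] values take precedence in deepstream-app, so keeping the two in agreement avoids surprises. The sketch below compares the shared keys using trimmed copies of the two configs; `find_mismatches` is a hypothetical helper, not a deepstream tool.

```python
import configparser
from typing import List, Tuple

# Trimmed copies of the two yolov4 config files shown above.
APP_CONFIG = """
[primary-gie]
enable=1
model-engine-file=yolov4.engine
batch-size=1
config-file=config_infer_primary_yoloV4.txt
"""

INFER_CONFIG = """
[property]
model-engine-file=yolov4.engine
batch-size=1
network-mode=2
"""


def _load(text: str) -> configparser.ConfigParser:
    parser = configparser.ConfigParser(
        delimiters=("=",), comment_prefixes=("#",), interpolation=None
    )
    parser.read_string(text)
    return parser


def find_mismatches(app_text: str, infer_text: str,
                    keys=("model-engine-file", "batch-size")) -> List[Tuple[str, str, str]]:
    """Return (key, app_value, infer_value) for keys set differently in both files."""
    app = _load(app_text)["primary-gie"]
    infer = _load(infer_text)["property"]
    mismatches = []
    for key in keys:
        a, b = app.get(key), infer.get(key)
        if a is not None and b is not None and a != b:
            mismatches.append((key, a, b))
    return mismatches


print(find_mismatches(APP_CONFIG, INFER_CONFIG))  # []
```

An empty list means the two files agree on the keys checked. The batch size should also match the one the engine was built with, which was 1 in the conversion step.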

Copy the yolov4.engine generated in 2.1 to /opt/nvidia/deepstream/deepstream-6.0/sources/yolov4_deepstream.

2.3 Build the project

Go to /opt/nvidia/deepstream/deepstream-6.0/sources/yolov4_deepstream:

cd /opt/nvidia/deepstream/deepstream-6.0/sources/yolov4_deepstream

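Run the following two commands in order to build the .so file:

export CUDA_VER=11.4    # set this to the CUDA version installed on your device

Alternatively, set CUDA_VER in /opt/nvidia/deepstream/deepstream-6.0/sources/yolov4_deepstream/nvdsinfer_custom_impl_Yolo/Makefile.

Then build: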

make -C nvdsinfer_custom_impl_Yolo

The build produces the shared library libnvdsinfer_custom_impl_Yolo.so.

2.4 Run

deepstream-app -c deepstream_app_config_yoloV4.txt

This completes the yolov4 deployment with deepstream6.0.

That covers deploying yolov3 and yolov4 with deepstream6.0. I hope this post helps a little with your learning.
