
📕Preface

This article is a brief write-up of my experience using the MMSegmentation open-source semantic segmentation library while working on a competition hosted on the 极市 (CVMart) platform.

Before learning a new tool, be clear about what goal you want the tool to achieve, rather than learning the tool for its own sake. Having a purpose gives meaning to what you do, but also avoid rushing for quick results (people prefer what is simple and direct, but muscles only grow when they have actually been stretched). So when you feel like giving up, just recognize that it is your brain backing off while you still want to learn. 💪


🌳Article structure

This article introduces how to get started with MMSegmentation, and how to do a simple deployment with MMDeploy, covering the following topics:

  1. Installing MMSegmentation
  2. MMSegmentation's file structure
  3. MMSegmentation's config files (the core)
  4. Defining a custom dataset in MMSegmentation
  5. Training and testing
It is strongly recommended to read this alongside the official documentation: https://mmsegmentation.readthedocs.io/zh_CN/latest/index.html
PS: such a generous open-source library even ships with Chinese docs! 😭


📝Main content

Installing MMSegmentation

Environment setup (optional, but recommended)

For environment isolation we usually create a new Python environment with Miniconda (or Anaconda). In some situations you can skip this; it depends on your habits.

Download and install Miniconda from the official website, then create and activate a conda environment:

conda create --name openmmlab python=3.8 -y
conda activate openmmlab


Install the libraries

  1. Install PyTorch following the official website (it is now at 2.0):

    conda install pytorch torchvision torchaudio pytorch-cuda=11.7 -c pytorch -c nvidia

    You can also click "install previous versions of PyTorch" to install an older version (below is an example for torch==1.10.1):

    # pip install
    # CUDA 11.1 
    pip install torch==1.10.1+cu111 torchvision==0.11.2+cu111 torchaudio==0.10.1 -f https://download.pytorch.org/whl/cu111/torch_stable.html
    
    # or
    # conda install
    # CUDA 11.3
    conda install pytorch==1.10.1 torchvision==0.11.2 torchaudio==0.10.1 cudatoolkit=11.3 -c pytorch -c conda-forge 
    


  2. Install MMCV (many other OpenMMLab libraries depend on it)
    The recommended method is mim; see the MMCV docs for other options

    pip install -U openmim
    mim install mmengine
    mim install "mmcv>=2.0.0"


  3. Install MMSegmentation
    a. Option 1: install as a dependency library (recommended)

    pip install "mmsegmentation>=1.0.0"

    b. Option 2: install from source (convenient for browsing the library's file structure)

     git clone -b main https://github.com/open-mmlab/mmsegmentation.git
     cd mmsegmentation
     pip install -v -e .
     # '-v' means verbose mode: more output
     # '-e' installs the project in editable mode,
     # so any changes you make to the code take effect without reinstalling


Verify the installation

  1. Download the config file, model weights, and a test image

    mkdir test_mmseg
    cd test_mmseg
    mim download mmsegmentation --config pspnet_r50-d8_4xb2-40k_cityscapes-512x1024 --dest . 
    The download may take a few minutes depending on your network. When it finishes, you will see two files in your current working directory: pspnet_r50-d8_4xb2-40k_cityscapes-512x1024.py and pspnet_r50-d8_512x1024_40k_cityscapes_20200605_003338-2966598c.pth
    Download the test image: https://github.com/open-mmlab/mmsegmentation/blob/main/demo/demo.png
  2. Run it! (verify the installation)

    a. Verifying a dependency-style install

    Create a test_demo.py, copy the following into it, and run it

    from mmseg.apis import inference_model, init_model, show_result_pyplot
    import mmcv
    
    config_file = 'pspnet_r50-d8_4xb2-40k_cityscapes-512x1024.py'
    checkpoint_file = 'pspnet_r50-d8_512x1024_40k_cityscapes_20200605_003338-2966598c.pth'
    
    # build the model from a config file and a checkpoint file
    model = init_model(config_file, checkpoint_file, device='cuda:0')
    
    # run inference on a single image and visualize the result
    img = 'demo.png'  
    result = inference_model(model, img)
    # show the result in a new window
    show_result_pyplot(model, img, result, show=True)
    # or save the visualization to an image file;
    # the opacity of the segmentation map can be set in (0, 1]
    show_result_pyplot(model, img, result, show=True, out_file='result.jpg', opacity=0.5)
    # run inference on a video and visualize the segmentation frame by frame
    video = mmcv.VideoReader('video.mp4')
    for frame in video:
       result = inference_model(model, frame)
       show_result_pyplot(model, frame, result, wait_time=1)
    You will see a new image result.jpg in the current folder, with segmentation masks overlaid on all objects

demo.png

result.png
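The `opacity` argument above blends the color-coded segmentation map into the original image. A minimal NumPy sketch of that kind of alpha blend (the `overlay_mask` helper, palette, and blend formula here are illustrative assumptions, not MMSegmentation's internal code):

```python
import numpy as np

def overlay_mask(img, seg, palette, opacity=0.5):
    """Blend a color-coded segmentation map onto an RGB image.

    img:     (H, W, 3) uint8 RGB image
    seg:     (H, W) integer label map
    palette: list of [R, G, B] colors, indexed by label id
    opacity: weight of the segmentation colors, in (0, 1]
    """
    color_seg = np.array(palette, dtype=np.uint8)[seg]      # (H, W, 3)
    blended = img * (1 - opacity) + color_seg * opacity     # per-pixel alpha blend
    return blended.astype(np.uint8)

# Tiny example: a 2x2 gray image with two classes
img = np.full((2, 2, 3), 200, dtype=np.uint8)
seg = np.array([[0, 1], [1, 0]])
out = overlay_mask(img, seg, palette=[[0, 0, 0], [255, 0, 0]], opacity=0.5)
print(out[0, 1])  # class-1 pixel: halfway between gray and red
```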


b. Verifying a source install

cd mmsegmentation
python demo/image_demo.py demo/demo.png \
configs/pspnet/pspnet_r50-d8_4xb2-40k_cityscapes-512x1024.py \
pspnet_r50-d8_512x1024_40k_cityscapes_20200605_003338-2966598c.pth \
--device cuda:0 --out-file result.jpg
See the official documentation for more installation options: https://mmsegmentation.readthedocs.io/zh_CN/latest/get_starte...


MMSegmentation's file structure

Next, let's take a quick look at the directory structure of MMSegmentation: https://github.com/open-mmlab/mmsegmentation/tree/main/

mmsegmentation
- configs # **config files, the core of the library**
    - _base_ # base building-block files, **but still essentially configs**: datasets, models, training schedules
        - datasets
        - models
        - schedules    
    - other model configs # everything outside _base_ is a model config composed from the modules defined in _base_

- mmseg # **the core implementation; the modules referenced by the configs above are defined here**
    - datasets
    - models

- tools # ready-made tools for training, testing, ONNX export, etc.; just call them directly
    - train.py
    - test.py

- data # where datasets are placed

- demo # small usage demos
- projects # example projects

As you can see, MMSegmentation is very well encapsulated; if you only want to use it, it is quite easy to get started.

So what is the difference between the datasets, models, etc. under config/_base_ and those under mmseg?
Below is an example with the ADE20K dataset (just skim the difference; no need to understand every detail):

  • config/_base_/datasets/ade20k.py

    # dataset settings
    dataset_type = 'ADE20KDataset'
    data_root = 'data/ade/ADEChallengeData2016'
    crop_size = (512, 512)
    train_pipeline = [
      dict(type='LoadImageFromFile'),
      dict(type='LoadAnnotations', reduce_zero_label=True),
      dict(
          type='RandomResize',
          scale=(2048, 512),
          ratio_range=(0.5, 2.0),
          keep_ratio=True),
      dict(type='RandomCrop', crop_size=crop_size, cat_max_ratio=0.75),
      dict(type='RandomFlip', prob=0.5),
      dict(type='PhotoMetricDistortion'),
      dict(type='PackSegInputs')
    ]
    test_pipeline = [
      dict(type='LoadImageFromFile'),
      dict(type='Resize', scale=(2048, 512), keep_ratio=True),
      # add loading annotation after ``Resize`` because ground truth
      # does not need to do resize data transform
      dict(type='LoadAnnotations', reduce_zero_label=True),
      dict(type='PackSegInputs')
    ]
    img_ratios = [0.5, 0.75, 1.0, 1.25, 1.5, 1.75]
    tta_pipeline = [
      dict(type='LoadImageFromFile', backend_args=None),
      dict(
          type='TestTimeAug',
          transforms=[
              [
                  dict(type='Resize', scale_factor=r, keep_ratio=True)
                  for r in img_ratios
              ],
              [
                  dict(type='RandomFlip', prob=0., direction='horizontal'),
                  dict(type='RandomFlip', prob=1., direction='horizontal')
              ], [dict(type='LoadAnnotations')], [dict(type='PackSegInputs')]
          ])
    ]
    train_dataloader = dict(
      batch_size=4,
      num_workers=4,
      persistent_workers=True,
      sampler=dict(type='InfiniteSampler', shuffle=True),
      dataset=dict(
          type=dataset_type,
          data_root=data_root,
          data_prefix=dict(
              img_path='images/training', seg_map_path='annotations/training'),
          pipeline=train_pipeline))
    val_dataloader = dict(
      batch_size=1,
      num_workers=4,
      persistent_workers=True,
      sampler=dict(type='DefaultSampler', shuffle=False),
      dataset=dict(
          type=dataset_type,
          data_root=data_root,
          data_prefix=dict(
              img_path='images/validation',
              seg_map_path='annotations/validation'),
          pipeline=test_pipeline))
    test_dataloader = val_dataloader
    
    val_evaluator = dict(type='IoUMetric', iou_metrics=['mIoU'])
    test_evaluator = val_evaluator
  • mmseg/datasets/ade.py

    # Copyright (c) OpenMMLab. All rights reserved.
    from mmseg.registry import DATASETS
    from .basesegdataset import BaseSegDataset
    
    
    @DATASETS.register_module()
    class ADE20KDataset(BaseSegDataset):
      """ADE20K dataset.
    
      In segmentation map annotation for ADE20K, 0 stands for background, which
      is not included in 150 categories. ``reduce_zero_label`` is fixed to True.
      The ``img_suffix`` is fixed to '.jpg' and ``seg_map_suffix`` is fixed to
      '.png'.
      """
      METAINFO = dict(
          classes=('wall', 'building', 'sky', 'floor', 'tree', 'ceiling', 'road',
                    ... # omitted
                   'clock', 'flag'),
                   
          palette=[[120, 120, 120], [180, 120, 120], [6, 230, 230], [80, 50, 50],
                   ... # omitted
                   [184, 255, 0], [0, 133, 255], [255, 214, 0], [25, 194, 194],
                   [102, 255, 0], [92, 0, 255]])
    
      def __init__(self,
                   img_suffix='.jpg',
                   seg_map_suffix='.png',
                   reduce_zero_label=True,
                   **kwargs) -> None:
          super().__init__(
              img_suffix=img_suffix,
              seg_map_suffix=seg_map_suffix,
              reduce_zero_label=reduce_zero_label,
              **kwargs)

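What ties the two files together is the registry: the config refers to a class only by the string 'ADE20KDataset', and the class registered in mmseg/datasets is looked up and instantiated from that string. A toy sketch of the mechanism (an illustration only, not MMEngine's actual Registry implementation):

```python
class Registry:
    """Minimal string -> class lookup, mimicking MMEngine's Registry idea."""
    def __init__(self):
        self._modules = {}

    def register_module(self):
        def decorator(cls):
            self._modules[cls.__name__] = cls  # register under the class name
            return cls
        return decorator

    def build(self, cfg):
        # Pop 'type' and pass the remaining keys as constructor kwargs.
        cfg = dict(cfg)
        cls = self._modules[cfg.pop('type')]
        return cls(**cfg)

DATASETS = Registry()

@DATASETS.register_module()
class ToyDataset:
    def __init__(self, data_root):
        self.data_root = data_root

# The config side only ever sees strings and dicts:
ds = DATASETS.build(dict(type='ToyDataset', data_root='data/ade'))
print(type(ds).__name__, ds.data_root)  # ToyDataset data/ade
```

This is why `@DATASETS.register_module()` plus a `type='...'` string in the config is all it takes to plug in a new dataset class.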

MMSegmentation's config files (the core)

The importance of config files becomes obvious as soon as you train or test a model in MMSegmentation. Take configs/pspnet/pspnet_r50-d8_4xb2-40k_cityscapes-512x1024.py as an example:

This config pulls in the models, dataset, and schedule configs defined in _base_; this modular design makes it easy to adjust the overall model by recombining modules (note: things work slightly differently with a dependency-style install; see the custom dataset section below).

_base_ = [
    '../_base_/models/pspnet_r50-d8.py', '../_base_/datasets/cityscapes.py',
    '../_base_/default_runtime.py', '../_base_/schedules/schedule_40k.py'
]
crop_size = (512, 1024)
data_preprocessor = dict(size=crop_size)
model = dict(data_preprocessor=data_preprocessor)
See the official docs for the details of each module's config file: https://mmsegmentation.readthedocs.io/zh_CN/latest/user_guide...
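Conceptually, _base_ inheritance is a recursive dictionary merge: the child config overrides or extends keys from the base files. A toy sketch of those merge semantics (illustrative only; MMEngine's real merge handles more cases, such as the `_delete_` key):

```python
def merge_cfg(base, child):
    """Recursively merge `child` into `base` (child wins on conflicts)."""
    out = dict(base)
    for key, value in child.items():
        if key in out and isinstance(out[key], dict) and isinstance(value, dict):
            out[key] = merge_cfg(out[key], value)  # descend into nested dicts
        else:
            out[key] = value                       # child overrides or adds
    return out

# A pretend _base_ model config, and a child that only overrides what changes:
base_model = dict(model=dict(backbone=dict(depth=50),
                             decode_head=dict(num_classes=19)))
child = dict(model=dict(decode_head=dict(num_classes=2)))
merged = merge_cfg(base_model, child)
print(merged['model']['decode_head']['num_classes'])  # 2 (overridden)
print(merged['model']['backbone']['depth'])           # 50 (inherited)
```

This is why a child config only needs to state the handful of keys it changes, like `crop_size` and `data_preprocessor` above.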


Training and testing

Note: the training and testing commands differ from those in the (older) official docs; the new-style usage is given below

Training command: mim train
mim train mmsegmentation ${CONFIG_FILE} [optional arguments]

Options:
  -l, --launcher [none|pytorch|slurm]
                                  Job launcher
  --port INTEGER                  The port used for inter-process
                                  communication (only applicable to slurm /
                                  pytorch launchers). If set to None, will
                                  randomly choose a port between 20000 and
                                  30000.
  -G, --gpus INTEGER              Number of gpus to use
  -g, --gpus-per-node INTEGER     Number of gpus per node to use (only
                                  applicable to launcher == "slurm")
  -c, --cpus-per-task INTEGER     Number of cpus per task (only applicable to
                                  launcher == "slurm")
  -p, --partition TEXT            The partition to use (only applicable to
                                  launcher == "slurm")
  --srun-args TEXT                Other srun arguments that might be used
  -y, --yes                       Don't ask for confirmation.
  -h, --help                      Show this message and exit.
Testing command: mim test
mim test mmsegmentation ${CONFIG_FILE} [optional arguments]
Options:
  -C, --checkpoint TEXT           checkpoint path
  -l, --launcher [none|pytorch|slurm]
                                  Job launcher
  --port INTEGER                  The port used for inter-process
                                  communication (only applicable to slurm /
                                  pytorch launchers). If set to None, will
                                  randomly choose a port between 20000 and
                                  30000
  -G, --gpus INTEGER              Number of gpus to use (only applicable to
                                  launcher == "slurm")
  -g, --gpus-per-node INTEGER     Number of gpus per node to use (only
                                  applicable to launcher == "slurm")
  -c, --cpus-per-task INTEGER     Number of cpus per task (only applicable to
                                  launcher == "slurm")
  -p, --partition TEXT            The partition to use (only applicable to
                                  launcher == "slurm")
  --srun-args TEXT                Other srun arguments that might be used
  -y, --yes                       Don't ask for confirmation.
  -h, --help                      Show this message and exit.

Defining a custom dataset in MMSegmentation

This part walks you through the MMSegmentation workflow hands-on, starting from custom data.

Custom dataset

First, let's look at the directory layout the project expects for common datasets, taking CHASE_DB1 (a two-class semantic segmentation dataset) as an example:

mmsegmentation
├── mmseg
├── tools
├── configs
├── data
│   ├── CHASE_DB1
│   │   ├── images
│   │   │   ├── training
│   │   │   ├── validation
│   │   ├── annotations
│   │   │   ├── training
│   │   │   ├── validation

It contains:

  • annotations: the ground-truth segmentation mask labels
  • images: the RGB images to be segmented

Following this structure we can build our own dataset. Here I use the dataset from the 极市 platform task of recognizing blocked fire doors in office buildings, a two-class semantic segmentation task where doors are labeled 1 and background is labeled 0.
image.png

Split it into a training set and a validation set, and add the following under test_mmseg/data:

test_mmseg
├── data
│   ├── xiaofang
│   │   ├── images
│   │   │   ├── training
│   │   │   ├── validation
│   │   ├── annotations
│   │   │   ├── training
│   │   │   ├── validation
├── datasets
├── configs
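Producing this training/validation layout from a flat folder of image/mask pairs can be sketched with the standard library (the `split_dataset` helper, the .jpg/.png suffixes, and the 80/20 split here are assumptions for illustration):

```python
import random
import shutil
from pathlib import Path

def split_dataset(img_dir, ann_dir, out_root, val_ratio=0.2, seed=0):
    """Copy image/mask pairs into the MMSegmentation directory layout."""
    out_root = Path(out_root)
    stems = sorted(p.stem for p in Path(img_dir).glob('*.jpg'))
    random.Random(seed).shuffle(stems)          # reproducible shuffle
    n_val = int(len(stems) * val_ratio)
    splits = {'validation': stems[:n_val], 'training': stems[n_val:]}
    for split, names in splits.items():
        (out_root / 'images' / split).mkdir(parents=True, exist_ok=True)
        (out_root / 'annotations' / split).mkdir(parents=True, exist_ok=True)
        for stem in names:
            shutil.copy(Path(img_dir) / f'{stem}.jpg',
                        out_root / 'images' / split / f'{stem}.jpg')
            shutil.copy(Path(ann_dir) / f'{stem}.png',
                        out_root / 'annotations' / split / f'{stem}.png')
    return {k: len(v) for k, v in splits.items()}
```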

Add the dataset module

  1. In test_mmseg/datasets, add a xiaofang_dataset.py that defines your dataset class XiaoFangDataset
    xiaofang_dataset.py

    from mmseg.datasets import BaseSegDataset
    from mmseg.registry import DATASETS
    import mmengine.fileio as fileio
    
    @DATASETS.register_module()
    class XiaoFangDataset(BaseSegDataset):
     METAINFO = dict(
       classes = ('background', 'door'),
       palette = [[120, 120, 120], [6, 230, 230]])
    
     def __init__(self, 
                  img_suffix='.jpg',
                  seg_map_suffix='.png',
                  reduce_zero_label=False,
                  **kwargs):
         super(XiaoFangDataset, self).__init__(
             img_suffix=img_suffix, # note: the suffixes must match your file names
             seg_map_suffix=seg_map_suffix,
             reduce_zero_label=reduce_zero_label,
             **kwargs)
         assert fileio.exists(self.data_prefix['img_path'], backend_args=self.backend_args)
  2. In test_mmseg/datasets, add an __init__.py that declares the XiaoFangDataset class you defined

    from .xiaofang_dataset import XiaoFangDataset
    
    __all__ = ['XiaoFangDataset']
  3. Under test_mmseg/configs/, add your dataset's config file xiaofang.py

     # dataset settings
     dataset_type = 'XiaoFangDataset' # name of the dataset class
     data_root = 'data/xiaofang' # where the data is stored
     crop_size = (512, 512)
     img_scale = (1920, 1080)
    
    
     train_pipeline = [
       dict(type='LoadImageFromFile'),
       dict(type='LoadAnnotations'),
       dict(type='RandomResize', scale=img_scale, ratio_range=(0.5, 2.0), keep_ratio=True),
       dict(type='RandomCrop', crop_size=crop_size, cat_max_ratio=0.75),
       dict(type='RandomFlip', prob=0.5),
       dict(type='PhotoMetricDistortion'),
       dict(type='PackSegInputs')
     ]
     test_pipeline = [
       dict(type='LoadImageFromFile'),
       dict(type='Resize', scale=img_scale, keep_ratio=True),
       # add loading annotation after ``Resize`` because ground truth
       # does not need to do resize data transform
       dict(type='LoadAnnotations'),
       dict(type='PackSegInputs')
     ]
    
     img_ratios = [0.5, 0.75, 1.0, 1.25, 1.5, 1.75]
     tta_pipeline = [
       dict(type='LoadImageFromFile', backend_args=None),
       dict(
           type='TestTimeAug',
           transforms=[
               [
                   dict(type='Resize', scale_factor=r, keep_ratio=True)
                   for r in img_ratios
               ],
               [
                   dict(type='RandomFlip', prob=0., direction='horizontal'),
                   dict(type='RandomFlip', prob=1., direction='horizontal')
               ], [dict(type='LoadAnnotations')], [dict(type='PackSegInputs')]
           ])
     ]
    
    
    
     train_dataloader = dict(
       batch_size=4,
       num_workers=4,
       persistent_workers=True,
       sampler=dict(type='InfiniteSampler', shuffle=True),
       dataset=dict(
           type=dataset_type,
           data_root=data_root,
           data_prefix=dict(
               img_path='images/training', seg_map_path='annotations/training'),
           pipeline=train_pipeline))
     val_dataloader = dict(
       batch_size=1,
       num_workers=4,
       persistent_workers=True,
       sampler=dict(type='DefaultSampler', shuffle=False),
       dataset=dict(
           type=dataset_type,
           data_root=data_root,
           data_prefix=dict(
               img_path='images/validation', seg_map_path='annotations/validation'),
           pipeline=test_pipeline))
     test_dataloader = val_dataloader
    
     val_evaluator = dict(type='IoUMetric', iou_metrics=['mIoU'])
     test_evaluator = val_evaluator
    

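Before training, it is worth sanity-checking that the masks only contain label ids covered by METAINFO (0 = background, 1 = door here). A small NumPy check (the `validate_mask` helper is a hypothetical utility, not part of MMSegmentation):

```python
import numpy as np

METAINFO = dict(classes=('background', 'door'),
                palette=[[120, 120, 120], [6, 230, 230]])

def validate_mask(mask, metainfo=METAINFO):
    """Return the set of label ids in `mask` that METAINFO does not cover."""
    num_classes = len(metainfo['classes'])
    assert len(metainfo['palette']) == num_classes, 'palette/classes length mismatch'
    labels = np.unique(mask)
    return {int(l) for l in labels if l < 0 or l >= num_classes}

good = np.array([[0, 1], [1, 0]])
bad = np.array([[0, 255], [1, 0]])   # 255 often sneaks in as an ignore value
print(validate_mask(good))  # set()
print(validate_mask(bad))   # {255}
```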

Training and testing

With the dataset config in place, all that is left is to assemble the full model config. MMSegmentation provides many open-source models (below is a subset; see the docs for more):
image.png

Pick a model based on your GPU memory; clicking a model's config above shows how much memory it needs. As an example we pick an STDC 2 model here.
image.png

  1. Write the full config: add your model config stdc2_512x1024_10k_xiaofang.py under test_mmseg/configs
    Note: with a dependency-style install you cannot modify the source code directly, so custom modules must be registered in the config, as follows

    custom_imports = dict(imports='datasets.xiaofang_dataset')

    Full config

    _base_ = ['mmseg::_base_/default_runtime.py', 'mmseg::_base_/schedules/schedule_80k.py', './xiaofang.py', 'mmseg::_base_/models/stdc.py']
    
    # checkpoint = 'https://download.openmmlab.com/mmsegmentation/v0.5/pretrain/stdc/stdc1_20220308-5368626c.pth'  # noqa
    
    crop_size = (512, 512)
    num_classes = 2
    data_preprocessor = dict(size=crop_size)
    norm_cfg = dict(type='BN', requires_grad=True)
    custom_imports = dict(imports='datasets.xiaofang_dataset')
    model = dict(
        data_preprocessor=data_preprocessor,
        backbone=dict(backbone_cfg=dict(stdc_type='STDCNet2')),
        decode_head=dict(num_classes=num_classes),
    auxiliary_head=[
            dict(
                type='FCNHead',
                in_channels=128,
                channels=64,
                num_convs=1,
                num_classes=num_classes,
                in_index=2,
                norm_cfg=norm_cfg,
                concat_input=False,
                align_corners=False,
                sampler=dict(type='OHEMPixelSampler', thresh=0.7, min_kept=10000),
                loss_decode=dict(
                    type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0)),
            dict(
                type='FCNHead',
                in_channels=128,
                channels=64,
                num_convs=1,
                num_classes=num_classes,
                in_index=1,
                norm_cfg=norm_cfg,
                concat_input=False,
                align_corners=False,
                sampler=dict(type='OHEMPixelSampler', thresh=0.7, min_kept=10000),
                loss_decode=dict(
                    type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0)),
            dict(
                type='STDCHead',
                in_channels=256,
                channels=64,
                num_convs=1,
                num_classes=num_classes,
                boundary_threshold=0.1,
                in_index=0,
                norm_cfg=norm_cfg,
                concat_input=False,
                align_corners=True,
                loss_decode=[
                    dict(
                        type='CrossEntropyLoss',
                        loss_name='loss_ce',
                        use_sigmoid=True,
                        loss_weight=1.0),
                    dict(type='DiceLoss', loss_name='loss_dice', loss_weight=1.0)
                ]),
        ],
    )
    
    
    param_scheduler = [
        dict(
            type='PolyLR',
            eta_min=1e-4,
            power=0.9,
            begin=0,
            end=10000,
            by_epoch=False)
    ]
    # training schedule for 10k iterations
    train_cfg = dict(type='IterBasedTrainLoop', max_iters=10000, val_interval=1000)
    val_cfg = dict(type='ValLoop')
    test_cfg = dict(type='TestLoop')
    default_hooks = dict(
        timer=dict(type='IterTimerHook'),
        logger=dict(type='LoggerHook', interval=50, log_metric_by_epoch=False),
        param_scheduler=dict(type='ParamSchedulerHook'),
        checkpoint=dict(type='CheckpointHook', by_epoch=False, interval=1000, save_last=False),
        sampler_seed=dict(type='DistSamplerSeedHook'),
        visualization=dict(type='SegVisualizationHook'))
  2. Train

    mim train mmsegmentation configs/stdc2_512x1024_10k_xiaofang.py --work-dir logs/stdc2
  3. Test result: mIoU = 0.9225. Below are the RGB image, the ground-truth label, and the STDC model's output
    fire_door_office_building_inside_20220523_26.jpg

👏This article was written for the SegmentFault (思否) writing challenge; you, the reader, are welcome to join too.

