diff --git a/configs/mask_rtdetr/README.md b/configs/mask_rtdetr/README.md
Lines changed: 115 additions & 0 deletions
diff --git a/configs/mask_rtdetr/_base_/mask_rtdetr_r50vd.yml b/configs/mask_rtdetr/_base_/mask_rtdetr_r50vd.yml
Lines changed: 76 additions & 0 deletions
diff --git a/configs/mask_rtdetr/_base_/mask_rtdetr_reader.yml b/configs/mask_rtdetr/_base_/mask_rtdetr_reader.yml
Lines changed: 44 additions & 0 deletions
diff --git a/configs/mask_rtdetr/_base_/optimizer_6x.yml b/configs/mask_rtdetr/_base_/optimizer_6x.yml
Lines changed: 19 additions & 0 deletions
diff --git a/configs/mask_rtdetr/mask_rtdetr_hgnetv2_l_6x_coco.yml b/configs/mask_rtdetr/mask_rtdetr_hgnetv2_l_6x_coco.yml
Lines changed: 24 additions & 0 deletions
diff --git a/configs/mask_rtdetr/mask_rtdetr_hgnetv2_x_6x_coco.yml b/configs/mask_rtdetr/mask_rtdetr_hgnetv2_x_6x_coco.yml
Lines changed: 39 additions & 0 deletions
diff --git a/configs/mask_rtdetr/mask_rtdetr_r101vd_6x_coco.yml b/configs/mask_rtdetr/mask_rtdetr_r101vd_6x_coco.yml
Lines changed: 37 additions & 0 deletions
@@ -0,0 +1,115 @@
+# Mask-RT-DETR
+
+
+## Introduction
+Mask-RT-DETR is an instance segmentation model built on RT-DETR and MaskDINO.
+
+## Model Zoo
+|        Model        | Epoch | Backbone | Input shape | Box AP | Mask AP | Params(M) | FLOPs(G) | T4 TensorRT FP16(FPS) | Pretrained Model |                      config                      |
+|:-------------------:|:-----:|:--------:|:-----------:|:------:|:-------:|:---------:|:--------:|:---------------------:|:----------------:|:------------------------------------------------:|
+|   Mask-RT-DETR-L    |  6x   | HGNetv2  |     640     |        |         |           |          |                       |                  |   [config](mask_rtdetr_hgnetv2_l_6x_coco.yml)    |
+
+
+## Quick Start
+
+<details open>
+<summary>Dependencies:</summary>
+
+- PaddlePaddle >= 2.4.1
+
+</details>
+
+<details>
+<summary>Installation</summary>
+
+- [Installation guide](https://github.com/PaddlePaddle/PaddleDetection/blob/develop/docs/tutorials/INSTALL.md)
+
+</details>
+
+<details>
+<summary>Training and Evaluation</summary>
+
+- Training on a single GPU:
+
+```shell
+# training on single-GPU
+export CUDA_VISIBLE_DEVICES=0
+python tools/train.py -c configs/mask_rtdetr/mask_rtdetr_hgnetv2_l_6x_coco.yml --eval
+```
+
+- Training on multiple GPUs:
+
+```shell
+# training on multi-GPU
+export CUDA_VISIBLE_DEVICES=0,1,2,3
+python -m paddle.distributed.launch --gpus 0,1,2,3 tools/train.py -c configs/mask_rtdetr/mask_rtdetr_hgnetv2_l_6x_coco.yml --amp --eval
+```
+
+- Evaluation:
+
+```shell
+python tools/eval.py -c configs/mask_rtdetr/mask_rtdetr_hgnetv2_l_6x_coco.yml \
+              -o weights=${model_params_path}
+```
+
+- Testing (inference on a single image):
+
+```shell
+python tools/infer.py -c configs/mask_rtdetr/mask_rtdetr_hgnetv2_l_6x_coco.yml \
+              -o weights=${model_params_path} \
+              --infer_img=./demo/000000570688.jpg
+```
+
+For details, see the [Getting Started guide](https://github.com/PaddlePaddle/PaddleDetection/blob/develop/docs/tutorials/GETTING_STARTED.md).
+
+</details>
+
+## Deployment
+
+<details open>
+<summary>1. Export the model</summary>
+
+```shell
+cd PaddleDetection
+python tools/export_model.py -c configs/mask_rtdetr/mask_rtdetr_hgnetv2_l_6x_coco.yml \
+              -o weights=${model_params_path} trt=True exclude_post_process=True \
+              --output_dir=output_inference
+```
+
+</details>
+
+<details>
+<summary>2. Convert the model to ONNX</summary>
+
+- Install [Paddle2ONNX](https://github.com/PaddlePaddle/Paddle2ONNX) and ONNX
+
+```shell
+pip install onnx==1.13.0
+pip install paddle2onnx==1.0.5
+```
+
+- Convert the model:
+
+```shell
+paddle2onnx --model_dir=./output_inference/mask_rtdetr_hgnetv2_l_6x_coco/ \
+            --model_filename model.pdmodel  \
+            --params_filename model.pdiparams \
+            --opset_version 16 \
+            --save_file mask_rtdetr_hgnetv2_l_6x_coco.onnx
+```
+</details>
+
+<details>
+<summary>3. Convert to TensorRT (optional)</summary>
+
+- Make sure the TensorRT version is >= 8.5.1
+- For TensorRT inference, refer to parts of the code in [RT-DETR](https://github.com/lyuwenyu/RT-DETR) or other online resources
+
+```shell
+trtexec --onnx=./mask_rtdetr_hgnetv2_l_6x_coco.onnx \
+        --workspace=4096 \
+        --shapes=image:1x3x640x640 \
+        --saveEngine=mask_rtdetr_hgnetv2_l_6x_coco.trt \
+        --avgRuns=100 \
+        --fp16
+```
@@ -0,0 +1,76 @@
+architecture: DETR
+with_mask: True
+pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/ResNet50_vd_ssld_v2_pretrained.pdparams
+norm_type: sync_bn
+use_ema: True
+ema_decay: 0.9999
+ema_decay_type: "exponential"
+ema_filter_no_grad: True
+hidden_dim: 256
+use_focal_loss: True
+eval_size: [640, 640]
+
+
+DETR:
+  backbone: ResNet
+  neck: MaskHybridEncoder
+  transformer: MaskRTDETR
+  detr_head: MaskDINOHead
+  post_process: DETRPostProcess
+
+ResNet:
+  # index 0 stands for res2
+  depth: 50
+  variant: d
+  norm_type: bn
+  freeze_at: 0
+  return_idx: [1, 2, 3]
+  lr_mult_list: [0.1, 0.1, 0.1, 0.1]
+  num_stages: 4
+  freeze_stem_only: True
+
+MaskHybridEncoder:
+  hidden_dim: 256
+  use_encoder_idx: [2]
+  num_encoder_layers: 1
+  encoder_layer:
+    name: TransformerLayer
+    d_model: 256
+    nhead: 8
+    dim_feedforward: 1024
+    dropout: 0.
+    activation: 'gelu'
+  mask_dim: 32
+  expansion: 1.0
+
+
+MaskRTDETR:
+  num_queries: 300
+  position_embed_type: sine
+  feat_strides: [8, 16, 32]
+  mask_dim: 32
+  num_levels: 3
+  nhead: 8
+  num_decoder_layers: 6
+  dim_feedforward: 1024
+  dropout: 0.0
+  activation: relu
+  num_denoising: 100
+  label_noise_ratio: 0.5
+  box_noise_scale: 1.0
+  learnt_init_query: False
+  mask_enhanced: True
+
+MaskDINOHead:
+  loss:
+    name: MaskDINOLoss
+    loss_coeff: {class: 4, bbox: 5, giou: 2, mask: 5, dice: 5}
+    aux_loss: True
+    use_vfl: True
+    matcher:
+      name: HungarianMatcher
+      matcher_coeff: {class: 4, bbox: 5, giou: 2, mask: 5, dice: 5}
+
+DETRPostProcess:
+  num_top_queries: 100
+  mask_stride: 8
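Review note: `feat_strides: [8, 16, 32]` together with `eval_size: [640, 640]` fixes the spatial size of the three feature levels passed to the transformer. The following Python sketch works out those sizes and the total number of flattened positions. It assumes the input divides evenly by every stride; the helper `feature_map_shapes` is hypothetical and not part of PaddleDetection.

```python
# Values copied from mask_rtdetr_r50vd.yml
EVAL_SIZE = (640, 640)      # eval_size: [height, width]
FEAT_STRIDES = [8, 16, 32]  # MaskRTDETR.feat_strides


def feature_map_shapes(image_size, strides):
    """Return (height, width) of each feature level (hypothetical helper)."""
    height, width = image_size
    return [(height // stride, width // stride) for stride in strides]


shapes = feature_map_shapes(EVAL_SIZE, FEAT_STRIDES)
# Total number of spatial positions across all levels once flattened
num_tokens = sum(h * w for h, w in shapes)

for stride, shape in zip(FEAT_STRIDES, shapes):
    print(f"stride {stride}: {shape[0]}x{shape[1]}")
print("total positions:", num_tokens)
```

The `num_queries: 300` object queries are far fewer than the flattened positions computed here.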
@@ -0,0 +1,44 @@
+worker_num: 4
+TrainReader:
+  sample_transforms:
+    - Decode: {}
+    - Poly2Mask: {del_poly: True}
+    - RandomDistort: {prob: 0.8}
+    - RandomExpand: {fill_value: [123.675, 116.28, 103.53]}
+    - RandomCrop: {prob: 0.8}
+    - RandomFlip: {}
+  batch_transforms:
+    - BatchRandomResize: {target_size: [480, 512, 544, 576, 608, 640, 640, 640, 672, 704, 736, 768, 800], random_size: True, random_interp: True, keep_ratio: False}
+    - NormalizeImage: {mean: [0., 0., 0.], std: [1., 1., 1.], norm_type: none}
+    - NormalizeBox: {}
+    - BboxXYXY2XYWH: {}
+    - Permute: {}
+  batch_size: 4
+  shuffle: true
+  drop_last: true
+  collate_batch: false
+  use_shared_memory: true
+
+
+EvalReader:
+  sample_transforms:
+    - Decode: {}
+    - Resize: {target_size: [640, 640], keep_ratio: False, interp: 2}
+    - NormalizeImage: {mean: [0., 0., 0.], std: [1., 1., 1.], norm_type: none}
+    - Permute: {}
+  batch_size: 1 # must be 1
+  shuffle: false
+  drop_last: false
+
+
+TestReader:
+  inputs_def:
+    image_shape: [3, 640, 640]
+  sample_transforms:
+    - Decode: {}
+    - Resize: {target_size: [640, 640], keep_ratio: False, interp: 2}
+    - NormalizeImage: {mean: [0., 0., 0.], std: [1., 1., 1.], norm_type: none}
+    - Permute: {}
+  batch_size: 1
+  shuffle: false
+  drop_last: false
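Review note: the `BatchRandomResize` list in `TrainReader` repeats 640 three times. If a size is drawn uniformly from that list for each batch (an assumption about how `random_size: True` behaves, not something this diff confirms), 640 is sampled more often than any other resolution. A small Python sketch of the implied distribution, with `size_probabilities` as a hypothetical helper:

```python
from collections import Counter

# target_size list copied from mask_rtdetr_reader.yml
TARGET_SIZES = [480, 512, 544, 576, 608, 640, 640, 640, 672, 704, 736, 768, 800]


def size_probabilities(sizes):
    """Probability of each distinct size under a uniform draw from the list."""
    counts = Counter(sizes)
    total = len(sizes)
    return {size: count / total for size, count in sorted(counts.items())}


probs = size_probabilities(TARGET_SIZES)
for size, prob in probs.items():
    print(f"{size}: {prob:.3f}")
```

Under that assumption the evaluation resolution (640) gets three times the weight of each other training resolution.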
@@ -0,0 +1,19 @@
+epoch: 72
+
+LearningRate:
+  base_lr: 0.0001
+  schedulers:
+  - !PiecewiseDecay
+    gamma: 1.0
+    milestones: [100]
+    use_warmup: true
+  - !LinearWarmup
+    start_factor: 0.001
+    steps: 2000
+
+OptimizerBuilder:
+  clip_grad_by_norm: 0.1
+  regularizer: false
+  optimizer:
+    type: AdamW
+    weight_decay: 0.0001
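Review note: with `gamma: 1.0` and a single milestone at epoch 100, which lies beyond `epoch: 72`, the `PiecewiseDecay` entry never changes the learning rate, so the schedule is effectively a linear warmup followed by a constant rate. The Python sketch below illustrates that curve. It assumes the warmup factor is interpolated linearly from `start_factor` to 1.0 over `steps` iterations; `lr_at_step` is a hypothetical helper, not a PaddleDetection function.

```python
# Values copied from optimizer_6x.yml
BASE_LR = 0.0001       # LearningRate.base_lr
START_FACTOR = 0.001   # LinearWarmup.start_factor
WARMUP_STEPS = 2000    # LinearWarmup.steps


def lr_at_step(step: int) -> float:
    """Learning rate at a given iteration (hypothetical helper)."""
    if step < WARMUP_STEPS:
        # Assumed linear interpolation of the factor from START_FACTOR to 1.0
        alpha = step / WARMUP_STEPS
        factor = START_FACTOR * (1.0 - alpha) + alpha
        return BASE_LR * factor
    # gamma is 1.0 and the milestone is never reached: the rate stays constant
    return BASE_LR


for step in (0, 1000, 2000, 100000):
    print(step, lr_at_step(step))
```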
@@ -0,0 +1,24 @@
+_BASE_: [
+  '../datasets/coco_instance.yml',
+  '../runtime.yml',
+  '_base_/optimizer_6x.yml',
+  '_base_/mask_rtdetr_r50vd.yml',
+  '_base_/mask_rtdetr_reader.yml',
+]
+
+weights: output/mask_rtdetr_hgnetv2_l_6x_coco/model_final
+pretrain_weights: https://bj.bcebos.com/v1/paddledet/models/pretrained/PPHGNetV2_L_ssld_pretrained.pdparams
+find_unused_parameters: True
+log_iter: 200
+save_dir: output/mask_rtdetr_hgnetv2_l_6x_coco
+
+DETR:
+  backbone: PPHGNetV2
+
+PPHGNetV2:
+  arch: 'L'
+  return_idx: [1, 2, 3]
+  freeze_stem_only: True
+  freeze_at: 0
+  freeze_norm: True
+  lr_mult_list: [0., 0.05, 0.05, 0.05, 0.05]
@@ -0,0 +1,39 @@
+_BASE_: [
+  '../datasets/coco_instance.yml',
+  '../runtime.yml',
+  '_base_/optimizer_6x.yml',
+  '_base_/mask_rtdetr_r50vd.yml',
+  '_base_/mask_rtdetr_reader.yml',
+]
+
+weights: output/mask_rtdetr_hgnetv2_x_6x_coco/model_final
+pretrain_weights: https://bj.bcebos.com/v1/paddledet/models/pretrained/PPHGNetV2_X_ssld_pretrained.pdparams
+find_unused_parameters: True
+log_iter: 200
+
+
+DETR:
+  backbone: PPHGNetV2
+
+
+PPHGNetV2:
+  arch: 'X'
+  return_idx: [1, 2, 3]
+  freeze_stem_only: True
+  freeze_at: 0
+  freeze_norm: True
+  lr_mult_list: [0., 0.01, 0.01, 0.01, 0.01]
+
+
+MaskHybridEncoder:
+  hidden_dim: 384
+  use_encoder_idx: [2]
+  num_encoder_layers: 1
+  encoder_layer:
+    name: TransformerLayer
+    d_model: 384
+    nhead: 8
+    dim_feedforward: 2048
+    dropout: 0.
+    activation: 'gelu'
+  expansion: 1.0
@@ -0,0 +1,37 @@
+_BASE_: [
+  '../datasets/coco_instance.yml',
+  '../runtime.yml',
+  '_base_/optimizer_6x.yml',
+  '_base_/mask_rtdetr_r50vd.yml',
+  '_base_/mask_rtdetr_reader.yml',
+]
+
+weights: output/mask_rtdetr_r101vd_6x_coco/model_final
+find_unused_parameters: True
+log_iter: 200
+
+pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/ResNet101_vd_ssld_pretrained.pdparams
+
+ResNet:
+  # index 0 stands for res2
+  depth: 101
+  variant: d
+  norm_type: bn
+  freeze_at: 0
+  return_idx: [1, 2, 3]
+  lr_mult_list: [0.01, 0.01, 0.01, 0.01]
+  num_stages: 4
+  freeze_stem_only: True
+
+MaskHybridEncoder:
+  hidden_dim: 384
+  use_encoder_idx: [2]
+  num_encoder_layers: 1
+  encoder_layer:
+    name: TransformerLayer
+    d_model: 384
+    nhead: 8
+    dim_feedforward: 2048
+    dropout: 0.
+    activation: 'gelu'
+  expansion: 1.0
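Review note: the X and R101 configs override `MaskHybridEncoder` without repeating `mask_dim`. Assuming `_BASE_` files are merged recursively with the child file taking precedence (an assumption about the config loader, not shown in this diff), `mask_dim: 32` is still inherited from `_base_/mask_rtdetr_r50vd.yml`. A minimal Python sketch of such a merge, where `merge_config` is a hypothetical stand-in for the real loader:

```python
import copy


def merge_config(base: dict, override: dict) -> dict:
    """Recursively merge override into a copy of base (hypothetical helper)."""
    merged = copy.deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Nested sections are merged key by key
            merged[key] = merge_config(merged[key], value)
        else:
            # Scalars and lists from the child replace the base value
            merged[key] = copy.deepcopy(value)
    return merged


# Subset of _base_/mask_rtdetr_r50vd.yml
base = {
    "MaskHybridEncoder": {
        "hidden_dim": 256,
        "mask_dim": 32,
        "encoder_layer": {"d_model": 256, "dim_feedforward": 1024},
    }
}
# Subset of mask_rtdetr_r101vd_6x_coco.yml
override = {
    "MaskHybridEncoder": {
        "hidden_dim": 384,
        "encoder_layer": {"d_model": 384, "dim_feedforward": 2048},
    }
}

merged = merge_config(base, override)
print(merged["MaskHybridEncoder"])
```

If the loader instead replaced whole sections, `mask_dim` would have to be restated in each child config, so this behavior is worth confirming against the loader before merging.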