Commit 730ff28

add mask_rtdetr
1 parent 7d6dc40 commit 730ff28

20 files changed: 1066 additions, 7 deletions

configs/mask_rtdetr/README.md

Lines changed: 115 additions & 0 deletions
@@ -0,0 +1,115 @@
# Mask-RT-DETR


## Introduction
Mask-RT-DETR is an instance segmentation model based on RT-DETR and MaskDINO.

## Model Zoo
| Model | Epoch | Backbone | Input shape | Box AP | Mask AP | Params(M) | FLOPs(G) | T4 TensorRT FP16(FPS) | Pretrained Model | config |
|:-------------------:|:-----:|:--------:|:-----------:|:------:|:-------:|:---------:|:--------:|:---------------------:|:----------------:|:------------------------------------------------:|
| Mask-RT-DETR-L | 6x | HGNetv2 | 640 | | | | | | | [config](mask_rtdetr_hgnetv2_l_6x_coco.yml) |


## Quick Start

<details open>
<summary>Dependencies:</summary>

- PaddlePaddle >= 2.4.1

</details>

<details>
<summary>Installation</summary>

- [Installation guide](https://github.com/PaddlePaddle/PaddleDetection/blob/develop/docs/tutorials/INSTALL.md)

</details>

<details>
<summary>Training & Evaluation</summary>

- Training on a single GPU:

```shell
# training on single-GPU
export CUDA_VISIBLE_DEVICES=0
python tools/train.py -c configs/mask_rtdetr/mask_rtdetr_hgnetv2_l_6x_coco.yml --eval
```

- Training on multiple GPUs:

```shell
# training on multi-GPU
export CUDA_VISIBLE_DEVICES=0,1,2,3
python -m paddle.distributed.launch --gpus 0,1,2,3 tools/train.py -c configs/mask_rtdetr/mask_rtdetr_hgnetv2_l_6x_coco.yml --amp --eval
```

- Evaluation:

```shell
python tools/eval.py -c configs/mask_rtdetr/mask_rtdetr_hgnetv2_l_6x_coco.yml \
    -o weights=${model_params_path}
```

- Inference:

```shell
python tools/infer.py -c configs/mask_rtdetr/mask_rtdetr_hgnetv2_l_6x_coco.yml \
    -o weights=${model_params_path} \
    --infer_img=./demo/000000570688.jpg
```

For details, see the [Getting Started guide](https://github.com/PaddlePaddle/PaddleDetection/blob/develop/docs/tutorials/GETTING_STARTED.md).

</details>

## Deployment

<details open>
<summary>1. Export the model</summary>

```shell
cd PaddleDetection
python tools/export_model.py -c configs/mask_rtdetr/mask_rtdetr_hgnetv2_l_6x_coco.yml \
    -o weights=${model_params_path} trt=True exclude_post_process=True \
    --output_dir=output_inference
```

</details>

<details>
<summary>2. Convert the model to ONNX</summary>

- Install [Paddle2ONNX](https://github.com/PaddlePaddle/Paddle2ONNX) and ONNX

```shell
pip install onnx==1.13.0
pip install paddle2onnx==1.0.5
```

- Convert the model:

```shell
paddle2onnx --model_dir=./output_inference/mask_rtdetr_hgnetv2_l_6x_coco/ \
    --model_filename model.pdmodel \
    --params_filename model.pdiparams \
    --opset_version 16 \
    --save_file mask_rtdetr_hgnetv2_l_6x_coco.onnx
```
</details>

<details>
<summary>3. Convert to TensorRT (optional)</summary>

- Make sure the TensorRT version is >= 8.5.1
- For TensorRT inference, refer to parts of the [RT-DETR](https://github.com/lyuwenyu/RT-DETR) code or to other online resources

```shell
trtexec --onnx=./mask_rtdetr_hgnetv2_l_6x_coco.onnx \
    --workspace=4096 \
    --shapes=image:1x3x640x640 \
    --saveEngine=mask_rtdetr_hgnetv2_l_6x_coco.trt \
    --avgRuns=100 \
    --fp16
```
Lines changed: 76 additions & 0 deletions
@@ -0,0 +1,76 @@
architecture: DETR
with_mask: True
pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/ResNet50_vd_ssld_v2_pretrained.pdparams
norm_type: sync_bn
use_ema: True
ema_decay: 0.9999
ema_decay_type: "exponential"
ema_filter_no_grad: True
hidden_dim: 256
use_focal_loss: True
eval_size: [640, 640]


DETR:
  backbone: ResNet
  neck: MaskHybridEncoder
  transformer: MaskRTDETR
  detr_head: MaskDINOHead
  post_process: DETRPostProcess

ResNet:
  # index 0 stands for res2
  depth: 50
  variant: d
  norm_type: bn
  freeze_at: 0
  return_idx: [1, 2, 3]
  lr_mult_list: [0.1, 0.1, 0.1, 0.1]
  num_stages: 4
  freeze_stem_only: True

MaskHybridEncoder:
  hidden_dim: 256
  use_encoder_idx: [2]
  num_encoder_layers: 1
  encoder_layer:
    name: TransformerLayer
    d_model: 256
    nhead: 8
    dim_feedforward: 1024
    dropout: 0.
    activation: 'gelu'
  mask_dim: 32
  expansion: 1.0


MaskRTDETR:
  num_queries: 300
  position_embed_type: sine
  feat_strides: [8, 16, 32]
  mask_dim: 32
  num_levels: 3
  nhead: 8
  num_decoder_layers: 6
  dim_feedforward: 1024
  dropout: 0.0
  activation: relu
  num_denoising: 100
  label_noise_ratio: 0.5
  box_noise_scale: 1.0
  learnt_init_query: False
  mask_enhanced: True

MaskDINOHead:
  loss:
    name: MaskDINOLoss
    loss_coeff: {class: 4, bbox: 5, giou: 2, mask: 5, dice: 5}
    aux_loss: True
    use_vfl: True
    matcher:
      name: HungarianMatcher
      matcher_coeff: {class: 4, bbox: 5, giou: 2, mask: 5, dice: 5}

DETRPostProcess:
  num_top_queries: 100
  mask_stride: 8
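The `loss_coeff` entry above assigns one weight to each loss term. As a rough illustration of how such weights are typically combined, the sketch below computes a coefficient-weighted sum. It is not the `MaskDINOLoss` implementation: the function name `weighted_total` and the example loss values are hypothetical, and the real loss presumably also covers auxiliary decoder layers and denoising queries.

```python
# Weights copied from loss_coeff in the config above.
LOSS_COEFF = {"class": 4, "bbox": 5, "giou": 2, "mask": 5, "dice": 5}


def weighted_total(loss_terms, coeff=LOSS_COEFF):
    """Return the coefficient-weighted sum of per-term losses.

    loss_terms maps a term name to its unweighted scalar value.
    A name missing from coeff raises KeyError, so typos surface early.
    """
    return sum(coeff[name] * value for name, value in loss_terms.items())


# Hypothetical unweighted values, chosen only for illustration.
example = {"class": 0.5, "bbox": 0.1, "giou": 0.3, "mask": 0.2, "dice": 0.4}
total = weighted_total(example)
print(f"weighted total: {total:.2f}")
```

With these weights the mask and dice terms count as much as the box L1 term, while the GIoU term carries the smallest weight.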
Lines changed: 44 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,44 @@
worker_num: 4
TrainReader:
  sample_transforms:
    - Decode: {}
    - Poly2Mask: {del_poly: True}
    - RandomDistort: {prob: 0.8}
    - RandomExpand: {fill_value: [123.675, 116.28, 103.53]}
    - RandomCrop: {prob: 0.8}
    - RandomFlip: {}
  batch_transforms:
    - BatchRandomResize: {target_size: [480, 512, 544, 576, 608, 640, 640, 640, 672, 704, 736, 768, 800], random_size: True, random_interp: True, keep_ratio: False}
    - NormalizeImage: {mean: [0., 0., 0.], std: [1., 1., 1.], norm_type: none}
    - NormalizeBox: {}
    - BboxXYXY2XYWH: {}
    - Permute: {}
  batch_size: 4
  shuffle: true
  drop_last: true
  collate_batch: false
  use_shared_memory: true


EvalReader:
  sample_transforms:
    - Decode: {}
    - Resize: {target_size: [640, 640], keep_ratio: False, interp: 2}
    - NormalizeImage: {mean: [0., 0., 0.], std: [1., 1., 1.], norm_type: none}
    - Permute: {}
  batch_size: 1 # must be 1
  shuffle: false
  drop_last: false


TestReader:
  inputs_def:
    image_shape: [3, 640, 640]
  sample_transforms:
    - Decode: {}
    - Resize: {target_size: [640, 640], keep_ratio: False, interp: 2}
    - NormalizeImage: {mean: [0., 0., 0.], std: [1., 1., 1.], norm_type: none}
    - Permute: {}
  batch_size: 1
  shuffle: false
  drop_last: false
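The `BatchRandomResize` entry in `TrainReader` lists 640 three times among its 13 target sizes. Assuming one list entry is drawn uniformly per batch (which is what `random_size: True` suggests, though the sampling rule is an assumption here), the repetition makes 640 three times as likely as any other size. The sketch below makes that explicit; `size_probabilities` and `pick_batch_shape` are hypothetical helper names, not PaddleDetection functions.

```python
import random
from collections import Counter
from fractions import Fraction

# target_size list copied from the TrainReader config above.
TARGET_SIZES = [480, 512, 544, 576, 608, 640, 640, 640, 672, 704, 736, 768, 800]


def size_probabilities(sizes=TARGET_SIZES):
    """Probability of each distinct size under a uniform draw over list entries."""
    counts = Counter(sizes)
    return {size: Fraction(n, len(sizes)) for size, n in counts.items()}


def pick_batch_shape(rng, sizes=TARGET_SIZES):
    """Draw one square (height, width) to apply to every image in a batch."""
    side = rng.choice(sizes)
    return side, side


probs = size_probabilities()
print("P(640) =", probs[640])   # three of the thirteen entries
print("P(480) =", probs[480])   # one of the thirteen entries
print("sample shape:", pick_batch_shape(random.Random(0)))
```

Because `keep_ratio: False` is set, each image is stretched to the chosen square rather than padded, which is why a single side length is enough to describe the batch shape.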
Lines changed: 19 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,19 @@
epoch: 72

LearningRate:
  base_lr: 0.0001
  schedulers:
    - !PiecewiseDecay
      gamma: 1.0
      milestones: [100]
      use_warmup: true
    - !LinearWarmup
      start_factor: 0.001
      steps: 2000

OptimizerBuilder:
  clip_grad_by_norm: 0.1
  regularizer: false
  optimizer:
    type: AdamW
    weight_decay: 0.0001
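Read together, the scheduler settings above describe a linear warmup over the first 2000 steps followed by an effectively constant learning rate: `gamma` is 1.0 and the only milestone (epoch 100) lies beyond the 72 training epochs. The sketch below reproduces that shape under stated assumptions. The function `learning_rate` is hypothetical, and the exact interpolation and boundary handling in PaddleDetection's schedulers may differ.

```python
# Values copied from the optimizer config above.
BASE_LR = 1e-4
WARMUP_STEPS = 2000
START_FACTOR = 0.001
GAMMA = 1.0
MILESTONES = [100]   # in epochs
TOTAL_EPOCHS = 72


def learning_rate(step, epoch):
    """Approximate learning rate at a global step within a given epoch.

    Assumption: warmup interpolates the multiplier linearly from
    START_FACTOR to 1.0; afterwards each passed milestone multiplies
    the rate by GAMMA.
    """
    if step < WARMUP_STEPS:
        alpha = step / WARMUP_STEPS
        factor = START_FACTOR * (1 - alpha) + alpha
        return BASE_LR * factor
    passed = sum(1 for milestone in MILESTONES if epoch >= milestone)
    return BASE_LR * GAMMA ** passed


for step in (0, 1000, 2000):
    print(step, learning_rate(step, epoch=0))
print("last epoch:", learning_rate(10**6, epoch=TOTAL_EPOCHS - 1))
```

Under these assumptions the rate starts near 1e-7, reaches 1e-4 at step 2000, and stays there for the rest of training.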
Lines changed: 24 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,24 @@
_BASE_: [
  '../datasets/coco_instance.yml',
  '../runtime.yml',
  '_base_/optimizer_6x.yml',
  '_base_/mask_rtdetr_r50vd.yml',
  '_base_/mask_rtdetr_reader.yml',
]

weights: output/mask_rtdetr_hgnetv2_l_6x_coco/model_final
pretrain_weights: https://bj.bcebos.com/v1/paddledet/models/pretrained/PPHGNetV2_L_ssld_pretrained.pdparams
find_unused_parameters: True
log_iter: 200
save_dir: output/mask_rtdetr_hgnetv2_l_6x_coco

DETR:
  backbone: PPHGNetV2

PPHGNetV2:
  arch: 'L'
  return_idx: [1, 2, 3]
  freeze_stem_only: True
  freeze_at: 0
  freeze_norm: True
  lr_mult_list: [0., 0.05, 0.05, 0.05, 0.05]
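This config only sets `DETR.backbone`, yet the model still needs the neck, transformer, and head named in the base file listed under `_BASE_`. That works if base and child configs are merged key by key, with the child winning on conflicts. The sketch below illustrates such a recursive merge; the `merge` function and the trimmed dictionaries are illustrative assumptions, not PaddleDetection's actual loader.

```python
def merge(base, override):
    """Return a new dict where override wins, merging nested dicts recursively."""
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge(merged[key], value)
        else:
            merged[key] = value
    return merged


# Trimmed excerpt of the base model config (values taken from the diff).
base = {
    "pretrain_weights": "ResNet50_vd_ssld_v2_pretrained.pdparams",
    "DETR": {
        "backbone": "ResNet",
        "neck": "MaskHybridEncoder",
        "transformer": "MaskRTDETR",
        "detr_head": "MaskDINOHead",
    },
}

# Trimmed excerpt of the HGNetv2-L config that overrides it.
child = {
    "pretrain_weights": "PPHGNetV2_L_ssld_pretrained.pdparams",
    "DETR": {"backbone": "PPHGNetV2"},
    "PPHGNetV2": {"arch": "L", "return_idx": [1, 2, 3]},
}

cfg = merge(base, child)
print(cfg["DETR"])
```

Only the backbone entry changes in the merged `DETR` section; the neck, transformer, and head keep the values from the base file.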
Lines changed: 39 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,39 @@
_BASE_: [
  '../datasets/coco_instance.yml',
  '../runtime.yml',
  '_base_/optimizer_6x.yml',
  '_base_/mask_rtdetr_r50vd.yml',
  '_base_/mask_rtdetr_reader.yml',
]

weights: output/mask_rtdetr_hgnetv2_l_6x_coco/model_final
pretrain_weights: https://bj.bcebos.com/v1/paddledet/models/pretrained/PPHGNetV2_X_ssld_pretrained.pdparams
find_unused_parameters: True
log_iter: 200


DETR:
  backbone: PPHGNetV2


PPHGNetV2:
  arch: 'X'
  return_idx: [1, 2, 3]
  freeze_stem_only: True
  freeze_at: 0
  freeze_norm: True
  lr_mult_list: [0., 0.01, 0.01, 0.01, 0.01]


MaskHybridEncoder:
  hidden_dim: 384
  use_encoder_idx: [2]
  num_encoder_layers: 1
  encoder_layer:
    name: TransformerLayer
    d_model: 384
    nhead: 8
    dim_feedforward: 2048
    dropout: 0.
    activation: 'gelu'
  expansion: 1.0
Lines changed: 37 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,37 @@
_BASE_: [
  '../datasets/coco_instance.yml',
  '../runtime.yml',
  '_base_/optimizer_6x.yml',
  '_base_/mask_rtdetr_r50vd.yml',
  '_base_/mask_rtdetr_reader.yml',
]

weights: output/mask_rtdetr_r101vd_6x_coco/model_final
find_unused_parameters: True
log_iter: 200

pretrain_weights: https://paddledet.bj.bcebos.com/models/pretrained/ResNet101_vd_ssld_pretrained.pdparams

ResNet:
  # index 0 stands for res2
  depth: 101
  variant: d
  norm_type: bn
  freeze_at: 0
  return_idx: [1, 2, 3]
  lr_mult_list: [0.01, 0.01, 0.01, 0.01]
  num_stages: 4
  freeze_stem_only: True

MaskHybridEncoder:
  hidden_dim: 384
  use_encoder_idx: [2]
  num_encoder_layers: 1
  encoder_layer:
    name: TransformerLayer
    d_model: 384
    nhead: 8
    dim_feedforward: 2048
    dropout: 0.
    activation: 'gelu'
  expansion: 1.0
