-
It seems you do not configure …
-
I tried to use the command

```bash
bash ./tools/benchmarks/mmdetection/mim_dist_train_c4.sh configs/benchmarks/mmdetection/voc0712/faster_rcnn_r50_c4_mstrain_24k_voc0712ls.py work_dirs/selfsup/densecl_resnet50_8xb32-coslr-200e_in1k/epoch_200.pth 1
```
And the config I used is:

"
base = 'mmdet::pascal_voc/faster-rcnn_r50-caffe-c4_ms-18k_voc0712.py'
data_preprocessor = dict(
type='DetDataPreprocessor',
mean=[123.675, 116.28, 103.53],
std=[58.395, 57.12, 57.375],
bgr_to_rgb=True,
pad_size_divisor=32)
norm_cfg = dict(type='SyncBN', requires_grad=True)
model = dict(
backbone=dict(
frozen_stages=-1,
norm_cfg=norm_cfg,
norm_eval=False,
style='pytorch',
init_cfg=dict(type='Pretrained', checkpoint='torchvision://resnet50')),
roi_head=dict(
shared_head=dict(
type='ResLayerExtraNorm',
norm_cfg=norm_cfg,
norm_eval=False,
style='pytorch'),
bbox_head=dict(num_classes=2)))
train_pipeline = [
dict(type='LoadImageFromFile'),
dict(type='LoadAnnotations', with_bbox=True),
dict(
type='RandomChoiceResize',
scales = [(666, 240), (666, 256), (666,272), (666, 288),
(666, 304), (666, 320), (666, 336), (666, 352),
(666, 368), (666, 384), (666, 400)],
keep_ratio=True),
dict(type='RandomFlip', prob=0.5),
dict(type='PackDetInputs')
]
test_pipeline = [
dict(type='LoadImageFromFile'),
dict(type='Resize', scale=(666, 400), keep_ratio=True),
dict(type='LoadAnnotations', with_bbox=True),
dict(
type='PackDetInputs',
meta_keys=('img_id', 'img_path', 'ori_shape', 'img_shape',
'scale_factor'))
]
dataset_type = 'VOCDataset'
data_root = '/media/ls/disk1/DOTA/VOCdevkit/'
train_dataloader = dict(
batch_size=2,
num_workers=1,
sampler=dict(type='InfiniteSampler', shuffle=True),
dataset=dict(
delete=True,
type='VOCDataset',
data_root=data_root,
ann_file='VOC2007/ImageSets/Main/trainval.txt',
data_prefix=dict(sub_data_root='VOC2007/'),
filter_cfg=dict(filter_empty_gt=True, min_size=32),
pipeline=train_pipeline,
))
val_dataloader = dict(dataset=dict(pipeline=test_pipeline,data_root=data_root,))
test_dataloader = val_dataloader
train_cfg = dict(delete=True, type='EpochBasedTrainLoop', max_epochs=24, val_interval=4)
#max_iter = 824
param_scheduler = [
dict(
type='LinearLR', start_factor=0.001, by_epoch=False, begin=0,
end=1000),
dict(
type='MultiStepLR',
begin=0,
end=24,
by_epoch=True,
milestones=[16, 22],
gamma=0.1)
]
val_evaluator = dict(type='VOCMetric', metric='mAP', eval_mode='11points')
test_evaluator = val_evaluator
default_hooks = dict(checkpoint=dict(by_epoch=True, interval=4))
log_processor = dict(by_epoch=True)
custom_imports = dict(
imports=['mmselfsup.evaluation.functional.res_layer_extra_norm'],
allow_failed_imports=False)
"
However, the training process stays stuck in epoch 1 the whole time and never advances to the next epoch, as shown below.
The log file shows lines like `mmengine - INFO - Epoch(train) [1][2400/824]`, where the iteration counter (2400) is already past 824, yet training never moves on to `Epoch(train) [2]`.
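(Just for context on the numbers: 824 is the per-epoch iteration count the runner derives from the dataloader, so the counter running past it without rolling over to epoch 2 is what looks wrong to me. A trivial check of what that figure implies, using only the batch size from the config and the GPU count from the command:)

```python
# Trivial arithmetic check; batch_size comes from train_dataloader above
# and num_gpus is the last argument of the launch command.
batch_size = 2
num_gpus = 1
iters_per_epoch = 824  # denominator shown in the log line

# Roughly how many training images that epoch length implies
# (an inferred figure, not a measured one).
print(iters_per_epoch * batch_size * num_gpus)  # -> 1648
```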
I tried to check the config file, but I couldn't find what is causing this.
May I get some advice? Thanks in advance.