Commit 22389e7

add code
1 parent 0a6a660 commit 22389e7

93 files changed: +15617 −3
INSTALL.md (+75)
### Installation

First, create a conda environment, install PyTorch, and clone the repository:

```bash
conda create -n vmt python=3.7 -y

conda activate vmt

conda install pytorch==1.7.1 torchvision==0.8.2 torchaudio==0.7.2 -c pytorch

git clone --recursive https://github.com/SysCV/vmt.git
```

Install detectron2 for visualization under your working directory:

```bash
git clone https://github.com/facebookresearch/detectron2.git
cd detectron2
pip install -e .
```

Install the dependencies and pycocotools for VIS and HQ-YTVIS:

```bash
pip install -r requirements.txt

cd cocoapi_hq/PythonAPI
# To compile and install locally
python setup.py build_ext --inplace
# To install the library to the Python site-packages
python setup.py build_ext install
```
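
To verify the pycocotools build, a minimal import check (a sketch we add here, not part of the original instructions) confirms which copy ends up on the path:

```python
# Minimal sanity check: confirm the locally built pycocotools imports and
# report where the loaded module lives.
from pycocotools import mask as coco_mask

print(coco_mask.__file__)
```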

Compile the CUDA operators:

```bash
cd ./models/ops
sh ./make.sh
# unit test (should print True for all checks)
python test.py

cd ../../models_swin/ops
sh ./make.sh
```
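
If `make.sh` or the unit test fails, first confirm that the installed PyTorch is actually a CUDA-enabled build; a minimal check:

```python
# Quick environment check before compiling the CUDA ops: prints the PyTorch
# version, the CUDA toolkit it was built against, and GPU availability.
import torch

print(torch.__version__, torch.version.cuda, torch.cuda.is_available())
```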

### Data Preparation

Download and extract the 2019 version of the YouTube-VIS train and val images with annotations from [YouTubeVIS](https://youtube-vos.org/dataset/vis/), and download the [HQ-YTVIS annotations](https://www.vis.xyz/data/hqvis/) and the COCO 2017 dataset. We expect the following directory structure:

```
vmt
├── datasets
│   ├── coco_keepfor_ytvis19_new.json
...
ytvis
├── train
├── val
├── annotations
│   ├── instances_train_sub.json
│   ├── instances_val_sub.json
│   ├── ytvis_hq-train.json
│   ├── ytvis_hq-val.json
│   ├── ytvis_hq-test.json
coco
├── train2017
├── val2017
├── annotations
│   ├── instances_train2017.json
│   ├── instances_val2017.json
```

The modified COCO annotations `coco_keepfor_ytvis19_new.json` for joint training can be downloaded from [[google]](https://drive.google.com/file/d/18yKpc8wt7xJK26QFpR5Xa0vjM5HN6ieg/view?usp=sharing). The HQ-YTVIS annotations can be downloaded from [[google]](https://drive.google.com/drive/folders/1ZU8_qO8HnJ_-vvxIAn8-_kJ4xtOdkefh?usp=sharing).
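
A small sanity check (a sketch, assuming the `ytvis` and `coco` roots sit next to `vmt` as drawn above; adjust the prefixes if they live elsewhere) that the annotation files are in place:

```python
# Verify that the annotation files from the directory tree above exist.
from pathlib import Path

expected = [
    "vmt/datasets/coco_keepfor_ytvis19_new.json",
    "ytvis/annotations/instances_train_sub.json",
    "ytvis/annotations/instances_val_sub.json",
    "ytvis/annotations/ytvis_hq-train.json",
    "ytvis/annotations/ytvis_hq-val.json",
    "ytvis/annotations/ytvis_hq-test.json",
    "coco/annotations/instances_train2017.json",
    "coco/annotations/instances_val2017.json",
]
for p in expected:
    print(p, "OK" if Path(p).exists() else "MISSING")
```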

README.md (+25 −3)

@@ -29,11 +29,26 @@ python eval_hqvis.py --save-path prediction_results.json
 ```
 
 ## VMT Code
-<!-- <img src="figures/result_demo1.gif" width="830"/> -->
+---------------
+### Install
+Please refer to [INSTALL.md](INSTALL.md) for installation instructions.
 
 https://user-images.githubusercontent.com/17427852/181796768-3e79ee74-2465-4af8-ba89-b5c837098e00.mp4
 
-Code for VMT is coming soon (before ECCV happens).
+### Usage
+Please refer to [USAGE.md](USAGE.md) for dataset preparation and detailed running (including testing, visualization, etc.) instructions.
+
+### Model zoo
+
+#### HQ-YTVIS model
+
+Train on the [HQ-YTVIS](https://www.vis.xyz/data/hqvis/) **train** set and COCO; evaluate on the [HQ-YTVIS](https://www.vis.xyz/data/hqvis/) **test** set.
+
+| Model | AP<sup>B</sup> | AP<sup>B</sup><sub>75</sub> | AR<sup>B</sup><sub>1</sub> | AP<sup>M</sup> | AR<sup>M</sup><sub>75</sub> | download |
+| ---------- | ---- | ---- | ---- | ---- | ---- | -------- |
+| VMT_r50 | 30.7 | 24.2 | 31.5 | 50.5 | 54.5 | [weight](https://drive.google.com/file/d/1e9hKCC-pAGB-wSO0_qyUNoEe-5XzRocz/view?usp=sharing) |
+| VMT_r101 | 33.0 | 29.3 | 33.3 | 51.6 | 55.8 | [weight](https://drive.google.com/file/d/1TQs_meDaomLz56xCjAZKT1BNtS3K3sla/view?usp=sharing) |
+| VMT_swin_L | 44.8 | 43.4 | 43.0 | 64.8 | 70.1 | [weight](https://drive.google.com/file/d/13cDni9olYd6-xdURQMWstsW0VLbkgIKt/view?usp=sharing) |
 
 ## Citation
 
@@ -44,7 +59,14 @@ Code for VMT is coming soon (before ECCV happens).
 booktitle = {European Conference on Computer Vision (ECCV)},
 year = {2022}
 }
+
+@inproceedings{transfiner,
+  title={Mask Transfiner for High-Quality Instance Segmentation},
+  author={Ke, Lei and Danelljan, Martin and Li, Xia and Tai, Yu-Wing and Tang, Chi-Keung and Yu, Fisher},
+  booktitle = {CVPR},
+  year = {2022}
+}
 ```
 
 ## Acknowledgement
-This repo is based on [Mask Transfiner](https://github.com/SysCV/transfiner) and [SeqFormer](https://github.com/wjf5203/SeqFormer).
+We thank [Mask Transfiner](https://github.com/SysCV/transfiner) and [SeqFormer](https://github.com/wjf5203/SeqFormer) for their open-source code.
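
The model-zoo weights are plain PyTorch `.pth` files; a minimal sketch for inspecting one after download (the top-level `'model'` key is an assumption based on common DETR-style training code, not confirmed by this repo):

```python
# Inspect a downloaded checkpoint. The 'model' key is an assumption; fall
# back to the raw object if it is already a state dict.
import torch

ckpt = torch.load("pretrained_model/checkpoint_swinl_final.pth", map_location="cpu")
state_dict = ckpt.get("model", ckpt) if isinstance(ckpt, dict) else ckpt
print(f"{len(state_dict)} top-level entries")
```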

USAGE.md (+31)

### Pretrained Models
---------------
Download the pretrained models from the Model zoo table:

```bash
mkdir pretrained_model
# Put the downloaded pretrained models in this directory.
```

### Inference & Evaluation on HQ-YTVIS
---------------
Refer to our [scripts folder](./scripts) for more commands.

Evaluate on the HQ-YTVIS test set:

```bash
bash scripts/eval_swin_test.sh
```

or

```bash
bash scripts/eval_r101_test.sh
```

### Results Visualization
---------------

```bash
bash scripts/eval_swin_val_vis.sh
```

or

```bash
python3 -m tools.inference_swin_with_vis --masks --backbone swin_l_p4w12 --output vis_output_swin_vmt --model_path ./pretrained_model/checkpoint_swinl_final.pth --save_path exp_swin_hq_val_result.json --save-frames True
```
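
A quick look at the saved predictions (a sketch; the per-record schema is an assumption based on the standard YouTube-VIS results format, not verified against this repo):

```python
# Load the prediction file written by --save_path and summarize it.
import json

with open("exp_swin_hq_val_result.json") as f:
    results = json.load(f)
print(f"{len(results)} predicted instance tracks")
print(sorted(results[0].keys()) if results else "empty")
```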

__init__.py

Whitespace-only changes.

datasets/__init__.py (+37)

```python
import torch.utils.data

from .torchvision_datasets import CocoDetection
from datasets.ytvos import YTVOSDataset as YTVOSDataset

from .coco import build as build_coco
from .coco2seq import build as build_seq_coco
from .concat_dataset import build as build_joint
from .ytvos import build as build_ytvs


def get_coco_api_from_dataset(dataset):
    # Unwrap up to 10 levels of torch.utils.data.Subset to reach the
    # underlying CocoDetection / YTVOSDataset and return its COCO-style API.
    for _ in range(10):
        if isinstance(dataset, torch.utils.data.Subset):
            dataset = dataset.dataset
        if isinstance(dataset, CocoDetection):
            return dataset.coco
        if isinstance(dataset, YTVOSDataset):
            return dataset.ytvos


### build_type only works for YoutubeVIS ###
def build_dataset(image_set, args):
    if args.dataset_file == 'YoutubeVIS':
        return build_ytvs(image_set, args)

    if args.dataset_file == 'coco':
        return build_coco(image_set, args)
    if args.dataset_file == 'Seq_coco':
        return build_seq_coco(image_set, args)
    if args.dataset_file == 'jointcoco':
        return build_joint(image_set, args)

    raise ValueError(f'dataset {args.dataset_file} not supported')
```
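
A usage sketch for the dispatcher above, shown for the `'coco'` branch; the `Namespace` fields mirror what `datasets/coco.py:build()` reads, and the values are illustrative:

```python
# Build the COCO training set through build_dataset. Field values here are
# examples, not defaults shipped with the repo.
from argparse import Namespace

from datasets import build_dataset

args = Namespace(dataset_file='coco', coco_path='coco', dataset_type='instances',
                 masks=True, cache_mode=False)
dataset = build_dataset('train', args)
print(len(dataset))
```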

datasets/coco.py (+176)

```python
"""
COCO dataset which returns image_id for evaluation.

Mostly copy-paste from https://github.com/pytorch/vision/blob/13b35ff/references/detection/coco_utils.py
"""
from pathlib import Path

import torch
import torch.utils.data
from pycocotools import mask as coco_mask

from .torchvision_datasets import CocoDetection as TvCocoDetection
from util.misc import get_local_rank, get_local_size
import datasets.transforms as T
import random


class CocoDetection(TvCocoDetection):
    def __init__(self, img_folder, ann_file, transforms, return_masks, cache_mode=False, local_rank=0, local_size=1):
        super(CocoDetection, self).__init__(img_folder, ann_file,
                                            cache_mode=cache_mode, local_rank=local_rank, local_size=local_size)
        self._transforms = transforms
        self.prepare = ConvertCocoPolysToMask(return_masks)

    def __getitem__(self, idx):
        # Resample a random index until the transformed target contains at
        # least one instance.
        instance_check = False
        while not instance_check:
            img, target = super(CocoDetection, self).__getitem__(idx)
            image_id = self.ids[idx]
            target = {'image_id': image_id, 'annotations': target}
            img, target = self.prepare(img, target)
            if self._transforms is not None:
                img, target = self._transforms(img, target)

            if len(target['labels']) == 0:  # no instances left
                idx = random.randint(0, self.__len__() - 1)
            else:
                instance_check = True

        return img, target


def convert_coco_poly_to_mask(segmentations, height, width):
    masks = []
    for polygons in segmentations:
        rles = coco_mask.frPyObjects(polygons, height, width)
        mask = coco_mask.decode(rles)
        if len(mask.shape) < 3:
            mask = mask[..., None]
        mask = torch.as_tensor(mask, dtype=torch.uint8)
        mask = mask.any(dim=2)
        masks.append(mask)
    if masks:
        masks = torch.stack(masks, dim=0)
    else:
        masks = torch.zeros((0, height, width), dtype=torch.uint8)
    return masks


class ConvertCocoPolysToMask(object):
    def __init__(self, return_masks=False):
        self.return_masks = return_masks

    def __call__(self, image, target):
        w, h = image.size

        image_id = target["image_id"]
        image_id = torch.tensor([image_id])

        anno = target["annotations"]

        anno = [obj for obj in anno if 'iscrowd' not in obj or obj['iscrowd'] == 0]

        boxes = [obj["bbox"] for obj in anno]
        # guard against no boxes via resizing
        boxes = torch.as_tensor(boxes, dtype=torch.float32).reshape(-1, 4)
        boxes[:, 2:] += boxes[:, :2]  # xywh -> xyxy
        boxes[:, 0::2].clamp_(min=0, max=w)
        boxes[:, 1::2].clamp_(min=0, max=h)

        classes = [obj["category_id"] for obj in anno]
        classes = torch.tensor(classes, dtype=torch.int64)

        if self.return_masks:
            # HQ-YTVIS reads the refined polygon field instead of the raw
            # COCO "segmentation".
            segmentations = [obj["segmentation_refined"] for obj in anno]
            masks = convert_coco_poly_to_mask(segmentations, h, w)

        keypoints = None
        if anno and "keypoints" in anno[0]:
            keypoints = [obj["keypoints"] for obj in anno]
            keypoints = torch.as_tensor(keypoints, dtype=torch.float32)
            num_keypoints = keypoints.shape[0]
            if num_keypoints:
                keypoints = keypoints.view(num_keypoints, -1, 3)

        # keep only boxes with positive width and height
        keep = (boxes[:, 3] > boxes[:, 1]) & (boxes[:, 2] > boxes[:, 0])
        boxes = boxes[keep]
        classes = classes[keep]
        if self.return_masks:
            masks = masks[keep]
        if keypoints is not None:
            keypoints = keypoints[keep]

        target = {}
        target["boxes"] = boxes
        target["labels"] = classes
        if self.return_masks:
            target["masks"] = masks
        target["image_id"] = image_id
        if keypoints is not None:
            target["keypoints"] = keypoints

        # for conversion to coco api
        area = torch.tensor([obj["area"] for obj in anno])
        iscrowd = torch.tensor([obj["iscrowd"] if "iscrowd" in obj else 0 for obj in anno])
        target["area"] = area[keep]
        target["iscrowd"] = iscrowd[keep]

        target["orig_size"] = torch.as_tensor([int(h), int(w)])
        target["size"] = torch.as_tensor([int(h), int(w)])

        return image, target


def make_coco_transforms(image_set):

    normalize = T.Compose([
        T.ToTensor(),
        T.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
    ])

    scales = [480, 512, 544, 576, 608, 640, 672, 704, 736, 768]
    # scales = [296, 328, 360, 392]

    if image_set == 'train':
        return T.Compose([
            T.RandomHorizontalFlip(),
            T.RandomSelect(
                T.RandomResize(scales, max_size=1333),
                T.Compose([
                    T.RandomResize([400, 500, 600]),
                    T.RandomSizeCrop(384, 600),
                    T.RandomResize(scales, max_size=1333),
                ])
            ),
            normalize,
        ])

    if image_set == 'val':
        return T.Compose([
            T.RandomResize([800], max_size=1333),
            normalize,
        ])

    raise ValueError(f'unknown {image_set}')


def build(image_set, args):
    root = Path(args.coco_path)
    assert root.exists(), f'provided COCO path {root} does not exist'
    mode = 'instances'
    dataset_type = args.dataset_type
    if args.dataset_file == 'coco':
        PATHS = {
            "train": (root / "train2017", root / "annotations" / f'{mode}_train2017.json'),
            "val": (root / "val2017", root / "annotations" / f'{mode}_val2017.json'),
        }

    # PATHS is only defined for the 'coco' dataset_file; build() is expected
    # to be called only in that case.
    img_folder, ann_file = PATHS[image_set]
    dataset = CocoDetection(img_folder, ann_file, transforms=make_coco_transforms(image_set), return_masks=args.masks,
                            cache_mode=args.cache_mode, local_rank=get_local_rank(), local_size=get_local_size())
    return dataset
```
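
A toy example of the polygon-to-mask path used by `convert_coco_poly_to_mask` above (a sketch; the triangle and canvas size are illustrative):

```python
# Decode one triangle polygon on a 4x4 canvas, mirroring the frPyObjects ->
# decode -> any(dim=2) sequence in convert_coco_poly_to_mask.
import torch
from pycocotools import mask as coco_mask

polygons = [[0.0, 0.0, 4.0, 0.0, 0.0, 4.0]]  # one polygon: x0,y0,x1,y1,x2,y2
rles = coco_mask.frPyObjects(polygons, 4, 4)
mask = torch.as_tensor(coco_mask.decode(rles), dtype=torch.uint8).any(dim=2)
print(mask.int())
```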
