
Commit 369e340

Add README, clean-up and fix output dirs

Author: Guillem Braso Andilla (committed)
1 parent 71c6960 commit 369e340

13 files changed (+286, -104 lines)

README.md

Lines changed: 69 additions & 12 deletions
@@ -1,27 +1,84 @@
# MOTSynth Baselines
-This repository provides baseline implementations for object detection, segmentation and tracking on the MOTSynth dataset.
-
-
-Pretrained models and complete instructions will be released soon after the ECCV deadline (7th of March).
+This repository provides download instructions and helper code for the [MOTSynth dataset](https://arxiv.org/abs/2108.09518), as well as baseline implementations for object detection, segmentation and tracking.

+Check out our:
+- [ICCV 2021 paper](https://openaccess.thecvf.com/content/ICCV2021/html/Fabbri_MOTSynth_How_Can_Synthetic_Data_Help_Pedestrian_Detection_and_Tracking_ICCV_2021_paper.html)
+- [5 min. video](https://www.youtube.com/watch?v=dc_Z1iCceL4)
+- [Dataset page](https://motchallenge.net/data/MOTSynth-MOT-CVPR22/)
+- [Project Page](https://aimagelab.ing.unimore.it/imagelab/page.asp?IdPage=42)

> ![Method Visualization](teaser_github.png)


# Installation:
-TODO
+See [docs/INSTALL.md](docs/INSTALL.md)

-# Data Preparation:
-TODO
+# Dataset Download and Preparation:
+See [docs/DATA_PREPARATION.md](docs/DATA_PREPARATION.md)

-# Object Detection:
-TODO
+# Object Detection (and Instance Segmentation):
+We adapt [torchvision's detection reference code](https://github.com/pytorch/vision/tree/main/references/detection) to train [Mask R-CNN](https://arxiv.org/abs/1703.06870) on MOTSynth. To train Mask R-CNN with a ResNet50-FPN backbone, you can run the following:
+```
+NUM_GPUS=3
+PORT=1234
+python -m torch.distributed.launch --nproc_per_node=$NUM_GPUS --use_env --master_port=$PORT tools/train_detector.py \
+    --model maskrcnn_resnet50_fpn \
+    --batch-size 5 --world-size $NUM_GPUS --trainable-backbone-layers 1 --backbone resnet50 --train-dataset train --epochs 10
+```
+If you use a different number of GPUs (`$NUM_GPUS`), please adapt your learning rate or modify your batch size so that the overall batch size stays at 15 (3 GPUs with 5 images per GPU).

-# ReID:
-TODO
+Our trained model can be downloaded [here](https://vision.in.tum.de/webshare/u/brasoand/motsynth/maskrcnn_resnet50_fpn_epoch_10.pth).

# Multi-Object Tracking:
+We use our Mask R-CNN model trained on MOTSynth to test [Tracktor](https://arxiv.org/abs/1903.05625) for tracking on MOT17.
+
+To produce results for MOT17 train, you can run the following:
+```
+python tools/test_tracktor.py
+```
+This model should yield the following results:
TODO

# Multi-Object Tracking and Segmentation:
-TODO
+We provide a simple baseline for MOTS. We run Tracktor with our trained Mask R-CNN detector, and use Mask R-CNN's segmentation head to produce a segmentation mask for every output bounding box.
+
+To evaluate this model on MOTS20, you can run the following:
+```
+python tools/test_tracktor.py mots.do_mots=True mots.mots20_only=True
+```
+This model should yield the following results on MOT17 train:
+```
+         IDF1 IDP IDR Rcll Prcn GT MT PT ML FP FN IDs FM MOTA MOTP IDt IDa IDm
+MOT17-02 35.2% 51.7% 26.7% 38.9% 75.4% 62 8 27 27 2361 11353 99 152 25.7% 0.251 28 78 8
+MOT17-04 55.5% 65.9% 48.0% 63.2% 86.8% 83 29 33 21 4569 17524 93 245 53.3% 0.204 23 75 5
+MOT17-05 62.2% 78.4% 51.6% 59.0% 89.6% 133 30 71 32 473 2834 41 90 51.6% 0.242 29 27 16
+MOT17-09 47.4% 51.9% 43.6% 67.0% 79.8% 26 10 15 1 903 1757 51 69 49.1% 0.230 21 34 6
+MOT17-10 42.1% 60.1% 32.4% 49.1% 91.1% 57 12 23 22 614 6534 146 326 43.2% 0.240 13 129 4
+MOT17-11 57.7% 70.4% 48.9% 63.0% 90.7% 75 23 22 30 607 3491 31 43 56.2% 0.197 7 26 2
+MOT17-13 39.9% 64.7% 28.8% 38.4% 86.2% 110 17 47 46 717 7168 88 151 31.5% 0.253 42 67 23
+OVERALL  49.7% 63.7% 40.8% 54.9% 85.7% 546 129 238 179 10244 50661 549 1076 45.3% 0.220 163 436 64
+```
+
+# Person Re-Identification
+We treat MOTSynth and MOT17 as ReID datasets by sampling 1 in 60 frames and treating each pedestrian as a unique identity. We use the excellent [torchreid](https://github.com/KaiyangZhou/deep-person-reid/tree/master/torchreid) framework to train our models.
+
+You can train our baseline ReID model with a ResNet50 on MOTSynth (and evaluate it on MOT17 train) by running:
+```
+python tools/main_reid.py --config-file configs/r50_fc512_motsynth_train.yaml
+```
+The resulting checkpoint can be downloaded [here](https://vision.in.tum.de/webshare/u/brasoand/motsynth/resnet50_fc512_reid_epoch_19.pth).
+
+
+# Acknowledgements
+This codebase is built on top of several great works. Our detection code is minimally modified from [torchvision's detection reference code](https://github.com/pytorch/vision/tree/main/references/detection). For MOT, we directly use [Tracktor's codebase](https://github.com/phil-bergmann/tracking_wo_bnw), and for ReID, we use the great [torchreid](https://github.com/KaiyangZhou/deep-person-reid/tree/master/torchreid) framework. [Orçun Cetintas](https://github.com/ocetintas/) also helped with the MOTS postprocessing code. We thank all the authors of these codebases for their amazing work.
+
+# Citation:
+If you find MOTSynth useful in your research, please cite our publication:
+```
+@inproceedings{fabbri21iccv,
+  title = {MOTSynth: How Can Synthetic Data Help Pedestrian Detection and Tracking?},
+  author = {Matteo Fabbri and Guillem Bras{\'o} and Gianluca Maugeri and Aljo{\v{s}}a O{\v{s}}ep and Riccardo Gasparini and Orcun Cetintas and Simone Calderara and Laura Leal-Taix{\'e} and Rita Cucchiara},
+  booktitle = {International Conference on Computer Vision (ICCV)},
+  year = {2021}
+}
+```
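
As a side note to the Object Detection section in the diff above: the released `maskrcnn_resnet50_fpn_epoch_10.pth` checkpoint can be loaded back into torchvision for inference roughly as sketched below. This is a hedged sketch, not part of the repo: the `"model"` key and the 2-class head (background + pedestrian) are assumptions based on how torchvision's reference training script typically saves checkpoints, so adjust if the file is laid out differently.

```python
# Hedged sketch: load the released Mask R-CNN checkpoint for inference with
# torchvision. The "model" key and num_classes=2 (background + pedestrian)
# are assumptions, not guaranteed by this commit.
import torch
import torchvision

model = torchvision.models.detection.maskrcnn_resnet50_fpn(num_classes=2)
ckpt = torch.load("maskrcnn_resnet50_fpn_epoch_10.pth", map_location="cpu")
state_dict = ckpt["model"] if isinstance(ckpt, dict) and "model" in ckpt else ckpt
model.load_state_dict(state_dict)
model.eval()

# Run on a dummy image: the output dict holds boxes, labels, scores and masks
with torch.no_grad():
    out = model([torch.rand(3, 800, 800)])[0]
print(out["boxes"].shape, out["scores"].shape, out["masks"].shape)
```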

configs/r50_fc512_motsynth_train.yaml

Lines changed: 6 additions & 10 deletions
@@ -4,13 +4,13 @@ model:

data:
  type: 'image'
-  sources: ['motsynth_train_mini']
+  sources: ['motsynth_split_1_mini']
  targets: ['mot17']
  height: 256
  width: 128
  combineall: False
  transforms: ['random_flip']
-  save_dir: 'log/resnet50_fc512_motsynth_train_new_data'
+  save_dir: 'resnet50_fc512_motsynth_train'

loss:
  name: 'softmax'
@@ -19,17 +19,13 @@ loss:

train:
  optim: 'amsgrad'
-  #lr: 0.0006
-  lr: 0.0036
-  max_epoch: 120
-  #batch_size: 32
-  batch_size: 196
+  lr: 0.0009
+  max_epoch: 19
+  batch_size: 180
  fixbase_epoch: 5
-  #fixbase_epoch: 0
-  #open_layers: ['classifier']
  open_layers: ['fc', 'classifier']
  lr_scheduler: 'single_step'
-  stepsize: [60]
+  stepsize: [15]

test:
  batch_size: 224
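
For orientation, the fields in this config map onto torchreid's Python API roughly as in the sketch below. This is illustrative only: the repo's own entry point is `tools/main_reid.py`, which registers the custom `motsynth_split_1_mini` / `mot17` ReID datasets, so the sketch uses torchreid's built-in `market1501` dataset as a stand-in, and the exact wiring in this repo may differ.

```python
# Illustrative sketch: how the YAML fields above map onto torchreid's API.
# Uses market1501 as a stand-in for the repo's custom ReID datasets.
import torch
import torchreid

datamanager = torchreid.data.ImageDataManager(
    root="reid-data",
    sources="market1501",   # stand-in for 'motsynth_split_1_mini'
    targets="market1501",   # stand-in for 'mot17'
    height=256,
    width=128,
    batch_size_train=180,
    batch_size_test=224,
    transforms=["random_flip"],
    combineall=False,
)

model = torchreid.models.build_model(
    name="resnet50_fc512",  # ResNet50 backbone with a 512-d fc embedding head
    num_classes=datamanager.num_train_pids,
    loss="softmax",
    pretrained=True,
)
if torch.cuda.is_available():
    model = model.cuda()

optimizer = torchreid.optim.build_optimizer(model, optim="amsgrad", lr=0.0009)
scheduler = torchreid.optim.build_lr_scheduler(
    optimizer, lr_scheduler="single_step", stepsize=15
)

engine = torchreid.engine.ImageSoftmaxEngine(
    datamanager, model, optimizer=optimizer, scheduler=scheduler,
    use_gpu=torch.cuda.is_available(),
)
engine.run(
    save_dir="resnet50_fc512_motsynth_train",
    max_epoch=19,
    fixbase_epoch=5,                    # freeze the backbone for the first epochs...
    open_layers=["fc", "classifier"],   # ...training only these layers meanwhile
)
```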

configs/r50_fc512_motsynth_train_dflt.yaml

Lines changed: 0 additions & 42 deletions
This file was deleted.

configs/tracktor.yaml

Lines changed: 6 additions & 16 deletions
@@ -7,22 +7,12 @@ seed: 12345
network: fpn

mots:
-  do_mots: False
-  maskrcnn_model: /storage/user/brasoand/MOTSynth_train_1_trainable_layer_mini/model_4.pth
-  mots20_only: True
+  do_mots: False # determines whether segmentation masks are also generated during tracking
+  maskrcnn_model: maskrcnn_resnet50_fpn_epoch_10.pth # Mask RCNN checkpoint used to obtain masks. It is expected to be an absolute path or a rel path at ${OUTPUT_DIR}/models
+  mots20_only: True # if mots.do_mots is set to True, determines whether masks are generated for all sequences or only those in MOTS20

-
-# frcnn
-# obj_detect_weights: output/frcnn/res101/mot_2017_train/180k/res101_faster_rcnn_iter_180000.pth
-# obj_detect_config: output/frcnn/res101/mot_2017_train/180k/sacred_config.yaml
-
-# fpn
-obj_detect_models: /storage/user/brasoand/MOTSynth_train_1_trainable_layer_mini/model_4.pth
-# obj_detect_model: output/faster_rcnn_fpn/faster_rcnn_fpn_training_mot_20/model_epoch_27.model
-
-#reid_models: /usr/wiss/brasoand/motsynth-baselines/log/resnet50_fc512_motsyn4_softmax/model/model.pth.tar-5
-#reid_models: /storage/slurm/brasoand/motsynth_output/reid/r50_split_1_ep150.pth
-reid_models: /usr/wiss/brasoand/motsynth-baselines/log/resnet50_fc512_motsynth_split_3/model/model.pth.tar-95
+obj_detect_models: maskrcnn_resnet50_fpn_epoch_10.pth # Mask RCNN checkpoint used by Tracktor. It is expected to be an absolute path or rel path at ${OUTPUT_DIR}/models
+reid_models: resnet50_fc512_reid_epoch_19.pth # ReID model checkpoint used by Tracktor. It is expected to be at ${OUTPUT_DIR}/models

interpolate: False
# [False, 'debug', 'pretty']
@@ -41,7 +31,7 @@ frame_range:

tracker:
  # FRCNN score threshold for detections
-  detection_person_thresh: 0.5
+  detection_person_thresh: 0.95 # Only modification over the original config. A high threshold is needed to avoid FPs
  # FRCNN score threshold for keeping the track alive
  regression_person_thresh: 0.5
  # NMS threshold for detection
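
The two checkpoint fields added above follow the convention described in their comments: an absolute path is used as-is, anything else is looked up under `${OUTPUT_DIR}/models`. The following is a minimal sketch of that resolution rule only; the helper is hypothetical and `OUTPUT_DIR` is assumed to come from `configs/path_cfg.py`, which may differ from the repo's actual code.

```python
# Minimal sketch of the checkpoint-path convention described above.
# Hypothetical helper, not the repo's actual code.
import os

from configs.path_cfg import OUTPUT_DIR  # assumed to be defined there


def resolve_checkpoint(path: str) -> str:
    """Return an absolute checkpoint path, resolving bare names under ${OUTPUT_DIR}/models."""
    if os.path.isabs(path):
        return path
    return os.path.join(OUTPUT_DIR, "models", path)


# e.g. resolve_checkpoint("maskrcnn_resnet50_fpn_epoch_10.pth")
#   -> "<OUTPUT_DIR>/models/maskrcnn_resnet50_fpn_epoch_10.pth"
```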

docs/DATA_PREPARATION.md

Lines changed: 156 additions & 0 deletions
@@ -0,0 +1,156 @@

# Data preparation
## Setup
- You can optionally modify `MOTCHA_PATH`, `MOTSYNTH_PATH` and `OUTPUT_DIR` in `configs/path_cfg.py` to point to your MOT17 directory, your MOTSynth directory, and your train/eval output directory, respectively.
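
Since `configs/path_cfg.py` is referenced by the commands below but its contents are not shown in this commit, here is a minimal sketch of what such a module could look like. The download commands only rely on `MOTSYNTH_ROOT` and `MOTCHA_ROOT` being importable from it (the Setup note above calls the variables `MOTSYNTH_PATH`/`MOTCHA_PATH`); the actual names and defaults in the repo may differ.

```python
# configs/path_cfg.py -- illustrative sketch only; the file shipped with the
# repo may define these differently. The commands below only need
# MOTSYNTH_ROOT and MOTCHA_ROOT to be importable from this module.
import os

# Where the MOTSynth frames/annotations will live
MOTSYNTH_ROOT = os.environ.get("MOTSYNTH_ROOT", "/storage/datasets/MOTSynth")
# Where MOT17 (and its COCO-style annotations) will live
MOTCHA_ROOT = os.environ.get("MOTCHA_ROOT", "/storage/datasets/MOTChallenge")
# Where training/evaluation outputs (checkpoints, logs) are written
OUTPUT_DIR = os.environ.get("OUTPUT_DIR", "./output")
```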

## Downloading and preparing MOTSynth

1. Download and extract all MOTSynth videos. This will take a while...
```
MOTSYNTH_ROOT=$(python -c "from configs.path_cfg import MOTSYNTH_ROOT; print(MOTSYNTH_ROOT);")
wget -P $MOTSYNTH_ROOT https://motchallenge.net/data/MOTSynth_1.zip
wget -P $MOTSYNTH_ROOT https://motchallenge.net/data/MOTSynth_2.zip
wget -P $MOTSYNTH_ROOT https://motchallenge.net/data/MOTSynth_3.zip

unzip $MOTSYNTH_ROOT/MOTSynth_1.zip -d $MOTSYNTH_ROOT
unzip $MOTSYNTH_ROOT/MOTSynth_2.zip -d $MOTSYNTH_ROOT
unzip $MOTSYNTH_ROOT/MOTSynth_3.zip -d $MOTSYNTH_ROOT

rm $MOTSYNTH_ROOT/MOTSynth_1.zip
rm $MOTSYNTH_ROOT/MOTSynth_2.zip
rm $MOTSYNTH_ROOT/MOTSynth_3.zip
```
2. Extract frames from the videos you downloaded. Again, this will take a while (the sketch at the end of this section shows roughly what the extraction does).
```
python tools/anns/to_frames.py --motsynth-root $MOTSYNTH_ROOT

# You can now delete the videos
rm -r $MOTSYNTH_ROOT/MOTSynth_1
rm -r $MOTSYNTH_ROOT/MOTSynth_2
rm -r $MOTSYNTH_ROOT/MOTSynth_3
```
3. Download and extract the annotations (in several formats):
```
wget -P $MOTSYNTH_ROOT https://motchallenge.net/data/MOTSynth_coco_annotations.zip
wget -P $MOTSYNTH_ROOT https://motchallenge.net/data/MOTSynth_mot_annotations.zip
wget -P $MOTSYNTH_ROOT https://motchallenge.net/data/MOTSynth_mots_annotations.zip
# Merged annotation files for ReID and detection trainings
wget -P $MOTSYNTH_ROOT https://vision.in.tum.de/webshare/u/brasoand/motsynth/comb_annotations.zip

unzip $MOTSYNTH_ROOT/MOTSynth_coco_annotations.zip -d $MOTSYNTH_ROOT
unzip $MOTSYNTH_ROOT/MOTSynth_mot_annotations.zip -d $MOTSYNTH_ROOT
unzip $MOTSYNTH_ROOT/MOTSynth_mots_annotations.zip -d $MOTSYNTH_ROOT
unzip $MOTSYNTH_ROOT/comb_annotations.zip -d $MOTSYNTH_ROOT

rm $MOTSYNTH_ROOT/MOTSynth_coco_annotations.zip
rm $MOTSYNTH_ROOT/MOTSynth_mot_annotations.zip
rm $MOTSYNTH_ROOT/MOTSynth_mots_annotations.zip
rm $MOTSYNTH_ROOT/comb_annotations.zip
```
**Note**: You can generate the MOT, MOTS and combined annotation files yourself from the original COCO-format annotations with the scripts `tools/anns/generate_mot_format_files.py`, `tools/anns/generate_mots_format_files.py`, and `tools/anns/combine_anns.py`, respectively.

After running these steps, your `MOTSYNTH_ROOT` directory should look like this:
```text
$MOTSYNTH_ROOT
├── frames
│-- 000
│   │-- rgb
│   │   │-- 0000.jpg
│   │   │-- 0001.jpg
│   │   │-- ...
│-- ...
├── annotations
│-- 000.json
│-- 001.json
│-- ...
├── comb_annotations
│-- split_1.json
│-- split_2.json
│-- ...
├── mot_annotations
│-- 000
│   │-- gt
│   │   │-- gt.txt
│   │-- seqinfo.ini
│-- ...
├── mots_annotations
│-- 000
│   │-- gt
│   │   │-- gt.txt
│   │-- seqinfo.ini
│-- ...

```
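
For reference, the frame extraction in step 2 above amounts to decoding each sequence's video into `frames/<seq>/rgb/*.jpg`. The sketch below illustrates that idea with OpenCV; it is not the repo's `tools/anns/to_frames.py`, and the filenames, zero-padding and video layout inside the downloaded archives are assumptions.

```python
# Rough sketch of what the frame extraction in step 2 boils down to.
# This is NOT tools/anns/to_frames.py; naming and layout are assumptions.
import os

import cv2  # pip install opencv-python


def dump_frames(video_path: str, out_dir: str) -> None:
    """Decode a single MOTSynth video into JPEG frames under out_dir."""
    os.makedirs(out_dir, exist_ok=True)
    cap = cv2.VideoCapture(video_path)
    idx = 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        cv2.imwrite(os.path.join(out_dir, f"{idx:04d}.jpg"), frame)
        idx += 1
    cap.release()


# e.g. dump_frames(f"{MOTSYNTH_ROOT}/MOTSynth_1/000.mp4",
#                  f"{MOTSYNTH_ROOT}/frames/000/rgb")
```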


## Downloading and preparing MOT17
We will use MOT17 for both tracking and MOTS experiments, since the MOTS20 sequences are a subset of the MOT17 sequences. To download it, follow these steps:

1. Download and extract it under `$MOTCHA_ROOT`. E.g.:
```
MOTCHA_ROOT=$(python -c "from configs.path_cfg import MOTCHA_ROOT; print(MOTCHA_ROOT);")
wget -P $MOTCHA_ROOT https://motchallenge.net/data/MOT17.zip
unzip $MOTCHA_ROOT/MOT17.zip -d $MOTCHA_ROOT
rm $MOTCHA_ROOT/MOT17.zip
```
2. Download and extract the COCO-format MOT17 annotations (alternatively, you can generate them with `tools/anns/motcha_to_coco.py`). These are needed for evaluation during detection and ReID training.
```
wget -P $MOTCHA_ROOT https://vision.in.tum.de/webshare/u/brasoand/motsynth/motcha_coco_annotations.zip
unzip $MOTCHA_ROOT/motcha_coco_annotations.zip -d $MOTCHA_ROOT
rm $MOTCHA_ROOT/motcha_coco_annotations.zip
```

After running these steps, your `MOTCHA_ROOT` directory should look like this:
```
$MOTCHA_ROOT
├── MOT17
|   │-- train
|   │   │-- MOT17-02-DPM
|   │   │   │-- gt
|   │   │   │   |-- gt.txt
|   │   │   │-- det
|   │   │   │   |-- det.txt
|   │   │   |-- img1
|   │   │   │   |-- 000001.jpg
|   │   │   │   |-- 000002.jpg
|   │   │   │   |-- ...
|   │   │   │-- seqinfo.ini
|   |   |-- MOT17-02-FRCNN
|   │   │   │-- ...
|   |   |-- ...
|   │-- test
|   │   │-- MOT17-01-DPM
|   │   │-- ...
|
|-- motcha_coco_annotations
|   │-- MOT17-02.json
|   │-- ...
|   │-- MOT17-train.json
```
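
Both the per-sequence MOTSynth files in `annotations/` / `comb_annotations/` and the MOT17 files in `motcha_coco_annotations/` above are COCO-style JSON. A minimal sketch of inspecting one with `pycocotools` follows; it assumes the standard COCO keys, and the repo's files may carry additional fields (e.g. track/person ids).

```python
# Minimal sketch for inspecting one of the COCO-style annotation files above.
# Assumes standard COCO keys ("images", "annotations", "categories").
from pycocotools.coco import COCO

ann_file = "MOT17-02.json"  # e.g. $MOTCHA_ROOT/motcha_coco_annotations/MOT17-02.json
coco = COCO(ann_file)

img_ids = coco.getImgIds()
print(f"{len(img_ids)} images, {len(coco.getAnnIds())} annotations")

# Look at the boxes of the first image: COCO boxes are [x, y, width, height]
first = coco.loadImgs(img_ids[0])[0]
for ann in coco.loadAnns(coco.getAnnIds(imgIds=first["id"])):
    x, y, w, h = ann["bbox"]
    print(first["file_name"], ann["category_id"], x, y, w, h)
```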
## ReID data
**Note**: This is only needed if you want to train your own ReID model.

To train and evaluate ReID models, we store the bounding-box cropped images of pedestrians from every 60th frame of MOTSynth and MOT17. You can download these images here:
```
# For MOT17
MOTCHA_ROOT=$(python -c "from configs.path_cfg import MOTCHA_ROOT; print(MOTCHA_ROOT);")
wget -P $MOTCHA_ROOT https://vision.in.tum.de/webshare/u/brasoand/motsynth/motcha_reid_images.zip.zip
unzip $MOTCHA_ROOT/motcha_reid_images.zip -d $MOTCHA_ROOT
rm $MOTCHA_ROOT/motcha_reid_images.zip

# For MOTSynth
MOTSYNTH_ROOT=$(python -c "from configs.path_cfg import MOTSYNTH_ROOT; print(MOTSYNTH_ROOT);")
wget -P $MOTSYNTH_ROOT https://vision.in.tum.de/webshare/u/brasoand/motsynth/motsynth_reid_images.zip.zip
unzip $MOTSYNTH_ROOT/motsynth_reid_images.zip -d $MOTSYNTH_ROOT
rm $MOTSYNTH_ROOT/motsynth_reid_images.zip

```

Alternatively, you can directly generate these images locally by running:
```
# For MOT17
python tools/anns/store_reid_imgs.py --ann-path $MOTCHA_ROOT/motcha_coco_annotations/MOT17-train.json

# For MOTSynth
python tools/anns/store_reid_imgs.py --ann-path $MOTSYNTH_ROOT/comb_annotations/train_mini.json
```
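
Conceptually, the generated ReID data is simple: for every 60th frame, each annotated pedestrian box is cropped out and saved as its own image, with the pedestrian identity acting as the ReID label. The sketch below illustrates that idea only; it is not the repo's `tools/anns/store_reid_imgs.py`, and the output naming and sampling details are assumptions.

```python
# Illustrative sketch of the ReID preprocessing described above: crop every
# annotated pedestrian from every 60th frame and save it as its own image.
# NOT the repo's tools/anns/store_reid_imgs.py; paths and naming are assumptions.
import os

from PIL import Image
from pycocotools.coco import COCO


def store_reid_crops(ann_path: str, img_root: str, out_dir: str, step: int = 60) -> None:
    coco = COCO(ann_path)
    os.makedirs(out_dir, exist_ok=True)
    for i, img_id in enumerate(sorted(coco.getImgIds())):
        if i % step:  # keep only every `step`-th frame
            continue
        img_info = coco.loadImgs(img_id)[0]
        frame = Image.open(os.path.join(img_root, img_info["file_name"]))
        for ann in coco.loadAnns(coco.getAnnIds(imgIds=img_id)):
            x, y, w, h = ann["bbox"]  # COCO boxes are [x, y, width, height]
            crop = frame.crop((x, y, x + w, y + h))
            # one image per annotated instance; its pedestrian id is the ReID label
            crop.save(os.path.join(out_dir, f"{img_id}_{ann['id']}.jpg"))
```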
