ByteTrack: Multi-object Tracking by Associating Every Detection Box

Cited by: 903
Authors
Zhang, Yifu [1]
Sun, Peize [2]
Jiang, Yi [3]
Yu, Dongdong [3]
Weng, Fucheng [1]
Yuan, Zehuan [3]
Luo, Ping [2]
Liu, Wenyu [1]
Wang, Xinggang [1]
Affiliations
[1] Huazhong Univ Sci & Technol, Wuhan, Peoples R China
[2] Univ Hong Kong, Hong Kong, Peoples R China
[3] ByteDance Inc, Beijing, Peoples R China
Source
COMPUTER VISION, ECCV 2022, PT XXII | 2022 / Vol. 13682
Keywords
Multi-object tracking; Data association; Detection boxes; MULTITARGET
DOI
10.1007/978-3-031-20047-2_1
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
Multi-object tracking (MOT) aims at estimating the bounding boxes and identities of objects in videos. Most methods obtain identities by associating only those detection boxes whose scores exceed a threshold. Objects with low detection scores, e.g. occluded objects, are simply thrown away, which causes non-negligible missed true objects and fragmented trajectories. To solve this problem, we present a simple, effective and generic association method that tracks by associating almost every detection box instead of only the high-score ones. For the low-score detection boxes, we utilize their similarities with tracklets to recover true objects and to filter out background detections. When applied to 9 different state-of-the-art trackers, our method achieves consistent improvements in IDF1 score ranging from 1 to 10 points. To push forward the state-of-the-art performance of MOT, we design a simple and strong tracker, named ByteTrack. For the first time, we achieve 80.3 MOTA, 77.3 IDF1 and 63.1 HOTA on the test set of MOT17 at 30 FPS on a single V100 GPU. ByteTrack also achieves state-of-the-art performance on the MOT20, HiEve and BDD100K tracking benchmarks. The source code, pre-trained models with deploy versions, and tutorials for applying the method to other trackers are released at https://github.com/ifzhang/ByteTrack.
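The two-stage association described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the released implementation: the helper names (`associate`, `byte_step`), the score threshold of 0.6, and the IoU threshold of 0.3 are illustrative choices, and the released code at the GitHub link above additionally uses Kalman-filter motion prediction and full track lifecycle management.

```python
# Minimal sketch of two-stage (BYTE-style) association: high-score detections
# are matched to tracks first, then the remaining tracks try the low-score
# detections. All names and thresholds here are assumptions for illustration.
import numpy as np
from scipy.optimize import linear_sum_assignment


def iou(box_a, box_b):
    """IoU between two [x1, y1, x2, y2] boxes."""
    x1, y1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    x2, y2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter + 1e-9)


def associate(track_boxes, dets, iou_thresh=0.3):
    """Match track boxes to detections by IoU using the Hungarian algorithm.

    Returns (matches, unmatched_track_indices, unmatched_det_indices),
    with indices local to the lists passed in.
    """
    if not track_boxes or not dets:
        return [], list(range(len(track_boxes))), list(range(len(dets)))
    cost = np.array([[1.0 - iou(t, d["box"]) for d in dets] for t in track_boxes])
    rows, cols = linear_sum_assignment(cost)
    matches = [(r, c) for r, c in zip(rows, cols) if cost[r, c] <= 1.0 - iou_thresh]
    matched_t = {r for r, _ in matches}
    matched_d = {c for _, c in matches}
    un_tracks = [i for i in range(len(track_boxes)) if i not in matched_t]
    un_dets = [j for j in range(len(dets)) if j not in matched_d]
    return matches, un_tracks, un_dets


def byte_step(track_boxes, detections, high_thresh=0.6):
    """One frame of two-stage association.

    `detections` are dicts with "box" and "score"; `track_boxes` are the
    current track boxes (in the full tracker these would be Kalman predictions).
    """
    high = [d for d in detections if d["score"] >= high_thresh]
    low = [d for d in detections if d["score"] < high_thresh]

    # Stage 1: associate every track with the high-score detections.
    m1, un_tracks, un_high = associate(track_boxes, high)

    # Stage 2: tracks left unmatched try the low-score (e.g. occluded) boxes,
    # recovering true objects; low-score boxes that match nothing are treated
    # as background and never spawn new tracks.
    leftover = [track_boxes[i] for i in un_tracks]
    m2, _, _ = associate(leftover, low)  # indices local to `leftover` / `low`

    return m1, m2, un_high
```

Unmatched high-score boxes returned by `byte_step` would typically initialize new tracks, while unmatched low-score boxes are simply discarded, mirroring the recover-and-filter behaviour the abstract describes.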
Pages: 1-21
Number of pages: 21