Compressing the Multiobject Tracking Model via Knowledge Distillation

Cited by: 4

Authors:

Liang, Tianyi [1]

Wang, Mengzhu [2]

Chen, Junyang [3]

Chen, Dingyao [4]

Luo, Zhigang [4]

Leung, Victor C. M. [3]

Affiliations:

[1] Inspur Grp Co Ltd, Jinan 250101, Shandong, Peoples R China

[2] DAMO Acad, Alibaba Grp, Hangzhou, Peoples R China

[3] Shenzhen Univ, Coll Comp Sci & Software Engn, Shenzhen, Peoples R China

[4] Natl Univ Def Technol, Coll Comp, Changsha, Peoples R China

Source:

IEEE TRANSACTIONS ON COMPUTATIONAL SOCIAL SYSTEMS | 2024 / Vol. 11 / No. 02

Funding:

National Natural Science Foundation of China;

Keywords:

Knowledge distillation (KD); model compression; multiobject tracking (MOT); MULTITARGET;

DOI:

10.1109/TCSS.2023.3293882

Chinese Library Classification (CLC) number:

TP3 [Computing technology, computer technology];

Discipline classification code:

0812;

Abstract:

Recent multiobject tracking (MOT) methods usually use very deep neural networks to achieve competitive accuracy, which inevitably degrades inference speed. To strike a better balance between tracking accuracy and speed, in this work we propose to compress the MOT model via knowledge distillation (KD), enabling a more lightweight student model to achieve performance similar to that of the teacher model. Although KD has been well studied for simpler tasks such as image classification, the complexity of MOT poses new challenges because the MOT model is more sensitive to foreground information than a classification model. To deal with this, we first propose attention-guided feature distillation, which focuses the student model on the crucial regions of the teacher's feature map (the foreground and the regions that differ strongly from the student's own feature map). Moreover, we propose a foreground mask, which leverages knowledge from the teacher model to filter out low-quality soft labels from the background, thereby reducing their negative effect on distillation. Evaluations on several benchmarks demonstrate that the proposed KD method enables the student network to achieve leading performance while running 20.0%-27.4% faster than the teacher network and using 28.5%-87.1% fewer parameters. To the best of our knowledge, this is the first work to compress the MOT model via KD.
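
Illustrative sketch (not from the paper): the abstract gives no formulas, so the following Python snippet is only a rough guess at how the two ideas might be expressed. It assumes single-channel feature maps stored as nested lists, a binary foreground map, a squared-error distance, and a fixed confidence threshold of 0.5. All function names, the weighting rule, and the threshold are hypothetical choices made for illustration and should not be read as the authors' formulation.

```python
from typing import List

Grid = List[List[float]]


def attention_weights(teacher: Grid, student: Grid, foreground: Grid) -> Grid:
    """Hypothetical weighting: foreground membership plus the normalized
    teacher-student gap, clipped to 1.0 per location."""
    gaps = [[abs(t - s) for t, s in zip(t_row, s_row)]
            for t_row, s_row in zip(teacher, student)]
    max_gap = max(max(row) for row in gaps) or 1.0  # avoid division by zero
    return [[min(1.0, fg + gap / max_gap) for fg, gap in zip(fg_row, gap_row)]
            for fg_row, gap_row in zip(foreground, gaps)]


def weighted_feature_loss(teacher: Grid, student: Grid, weights: Grid) -> float:
    """Weighted mean squared error between teacher and student features."""
    num = sum(w * (t - s) ** 2
              for rows in zip(teacher, student, weights)
              for t, s, w in zip(*rows))
    den = sum(w for row in weights for w in row)
    return num / den if den > 0 else 0.0


def masked_soft_label_loss(teacher_scores: Grid, student_scores: Grid,
                           threshold: float = 0.5) -> float:
    """Distill only locations the teacher scores as likely foreground;
    low-confidence (background) soft labels are skipped entirely."""
    kept = [(t, s)
            for t_row, s_row in zip(teacher_scores, student_scores)
            for t, s in zip(t_row, s_row)
            if t >= threshold]
    if not kept:
        return 0.0
    return sum((t - s) ** 2 for t, s in kept) / len(kept)


# Toy 2x2 example with made-up values.
teacher_feat = [[1.0, 0.0], [0.0, 0.0]]
student_feat = [[0.5, 0.0], [0.0, 0.2]]
foreground = [[1.0, 0.0], [0.0, 0.0]]

weights = attention_weights(teacher_feat, student_feat, foreground)
feature_loss = weighted_feature_loss(teacher_feat, student_feat, weights)

teacher_scores = [[0.9, 0.1], [0.2, 0.8]]
student_scores = [[0.7, 0.6], [0.5, 0.8]]
label_loss = masked_soft_label_loss(teacher_scores, student_scores)
```

In this toy setup the background location where the student disagrees with the teacher still receives a nonzero weight, while background locations where the two agree are ignored, which is one plausible reading of "foreground and the region with strong discrepancy".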


Pages: 2713-2723

Number of pages: 11

67 references in total

