End-to-End Learning Deep CRF Models for Multi-Object Tracking Deep CRF Models

Cited by: 53

Authors:

Xiang, Jun ^{[1,2]}

Xu, Guohan ^{[1]}

Ma, Chao ^{[3]}

Hou, Jianhua ^{[1]}

Affiliations:

[1] South Cent Univ Nationalities, Hubei Key Lab Intelligent Wireless Commun, Wuhan 430074, Peoples R China

[2] Shenzhen Inst Artificial Intelligence & Robot Soc, Shenzhen 518172, Peoples R China

[3] Shanghai Jiao Tong Univ, AI Inst, Shanghai 200240, Peoples R China

Source:

IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY | 2021 / Vol. 31 / No. 1

Funding:

National Natural Science Foundation of China;

Keywords:

Target tracking; Machine learning; Recurrent neural networks; Optimization; Task analysis; Standards; Inference algorithms; Multi-object tracking; end-to-end deep learning; conditional random field; data association; MULTITARGET TRACKING; APPROXIMATION;

DOI:

10.1109/TCSVT.2020.2975842

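Chinese Library Classification (CLC):

TM [Electrical engineering]; TN [Electronic and communication technology];

Discipline classification codes: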

0808 ; 0809 ;

Abstract:

By bundling multiple complex sub-problems into a unified framework, end-to-end deep learning frameworks reduce the need for hand engineering or tuning of parameters for each component, and optimize different modules jointly to ensure the generalization of the whole deep architecture. Despite tremendous success in numerous computer vision tasks, end-to-end learning for multi-object tracking (MOT), especially for the assignment problem in data association, has received surprisingly little attention, mainly due to the lack of available training data. Furthermore, it is challenging to discriminate target objects under mutual occlusions or to reduce identity switches in crowded scenes. To tackle these challenges, this paper proposes learning deep conditional random field (CRF) networks that model the assignment costs as unary potentials and the long-term dependencies among detection results as pairwise potentials. Specifically, we use a bidirectional long short-term memory (LSTM) network to encode the long-term dependencies. We pose the CRF inference as a recurrent neural network learning process using the standard gradient descent algorithm, where unary and pairwise potentials are jointly optimized in an end-to-end manner. Extensive experiments are conducted on challenging MOT datasets, including MOT15, MOT16, and MOT17, and the results show that the proposed algorithm performs favorably against state-of-the-art methods.


Pages: 275-288

Page count: 14

