Point-to-Set Distance Metric Learning on Deep Representations for Visual Tracking

Cited by: 50
Authors
Zhang, Shengping [1]
Qi, Yuankai [2]
Jiang, Feng [2]
Lan, Xiangyuan [3]
Yuen, Pong C. [3]
Zhou, Huiyu [4, 5]
Affiliations
[1] Harbin Inst Technol, Sch Comp Sci & Technol, Weihai 264209, Peoples R China
[2] Harbin Inst Technol, Sch Comp Sci & Technol, Harbin 150001, Heilongjiang, Peoples R China
[3] Hong Kong Baptist Univ, Dept Comp Sci, Hong Kong, Hong Kong, Peoples R China
[4] Queens Univ Belfast, Sch Elect Elect Engn & Comp Sci, Belfast BT3 9DT, Antrim, North Ireland
[5] Univ Leicester, Leicester LE1 7RH, Leics, England
Funding
Engineering and Physical Sciences Research Council (UK); National Natural Science Foundation of China
Keywords
Metric learning; point to set; visual tracking; robust object tracking; online tracking
DOI
10.1109/TITS.2017.2766093
Chinese Library Classification (CLC) number
TU [Building Science]
Discipline classification code
0813
Abstract
In autonomous driving applications, a car must be able to track objects in the scene in order to estimate where and how they will move, so that the tracker embedded in the car can alert it in time for effective collision avoidance. Traditional discriminative object tracking methods usually train a binary classifier via a support vector machine (SVM) scheme to distinguish the target from its background. Despite demonstrated success, the performance of SVM-based trackers is limited because classification depends only on the support vectors (SVs), whereas the target's dynamic appearance may resemble training samples that were not selected as SVs, especially when the training samples are not linearly separable. In such cases, the tracker may drift to the background and eventually fail to track the target. To address this problem, we propose to integrate point-to-set (image-to-image-set) distance metric learning (DML) into visual tracking tasks and to take full advantage of all the training samples when determining the best target candidate. The point-to-set DML is conducted on convolutional neural network features of the training data extracted from the starting frames. When a new frame arrives, target candidates are first projected to the common subspace using the learned mapping functions, and the candidate with the minimal distance to the target template sets is then selected as the tracking result. Extensive experimental results show that, even without model update, the proposed method achieves favorable performance on challenging image sequences compared with several state-of-the-art trackers.
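The candidate-selection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the DML formulation, so the sketch assumes a single linear mapping `W` (hypothetical, standing in for the learned mapping functions), takes the point-to-set distance to be the Euclidean distance to the nearest member of a set, and scores each candidate by its distance to the closest template set. The function names and the toy data are also assumptions.

```python
import numpy as np


def point_to_set_distance(point, template_set):
    """Distance from one projected point to the nearest row of a projected set.

    Assumption: nearest-member Euclidean distance; the paper may define
    the point-to-set distance differently.
    """
    return float(np.min(np.linalg.norm(template_set - point, axis=1)))


def select_candidate(candidates, template_sets, W):
    """Project features with W and return (best index, per-candidate scores).

    candidates:    (n, d) array of candidate features (e.g. CNN features)
    template_sets: list of (m_i, d) arrays of target template features
    W:             (d, k) hypothetical learned projection to the common subspace
    """
    proj_candidates = candidates @ W
    proj_sets = [s @ W for s in template_sets]
    # Score each candidate by its distance to the closest template set.
    scores = np.array([
        min(point_to_set_distance(c, s) for s in proj_sets)
        for c in proj_candidates
    ])
    return int(np.argmin(scores)), scores


if __name__ == "__main__":
    # Toy data: 3-D features projected to 2-D by dropping the last coordinate.
    W = np.eye(3)[:, :2]
    template_sets = [
        np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]]),
        np.array([[0.0, 1.0, 5.0]]),
    ]
    candidates = np.array([[5.0, 5.0, 0.0], [1.0, 0.1, 9.0], [3.0, 3.0, 3.0]])
    best, scores = select_candidate(candidates, template_sets, W)
    print(best, scores)
```

Because every template in every set contributes to the score, a candidate can match a training sample that an SVM would not have retained as a support vector, which is the motivation the abstract gives for the point-to-set formulation.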
Pages: 187-198
Page count: 12