Human Tracking Using Convolutional Neural Networks

Cited by: 258
Authors
Fan, Jialue [1]
Xu, Wei [2]
Wu, Ying [1]
Gong, Yihong [2]
Affiliations
[1] Northwestern Univ, Dept Elect Engn & Comp Sci, Evanston, IL 60208 USA
[2] NEC Labs Amer Inc, Cupertino, CA 95014 USA
Source
IEEE TRANSACTIONS ON NEURAL NETWORKS | 2010 / Vol. 21 / No. 10
Funding
National Science Foundation (USA);
Keywords
Convolutional neural networks; machine learning; visual tracking; VISUAL TRACKING; MULTIPLE;
DOI
10.1109/TNN.2010.2066286
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In this paper, we treat tracking as a learning problem of estimating the location and the scale of an object given its previous location and scale, as well as the current and previous image frames. Given a set of examples, we train convolutional neural networks (CNNs) to perform this estimation task. Unlike other learning methods, the CNNs learn both spatial and temporal features jointly from image pairs of two adjacent frames. We introduce multiple pathways in the CNN to better fuse local and global information. A creative shift-variant CNN architecture is designed to alleviate the drift problem when distracting objects are similar to the target in cluttered environments. Furthermore, we employ CNNs to estimate the scale through the accurate localization of some key points. These techniques are object-independent, so the proposed method can be applied to track other types of objects. The tracker's ability to handle complex situations is demonstrated on many test sequences.
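The abstract says the network's input is the previous location together with an image pair from two adjacent frames. The sketch below illustrates one way such an input could be assembled and how a location estimate could be read back in frame coordinates. It is not the authors' implementation: the function names, the zero padding at frame borders, the square window, and the idea that the network outputs a window-sized score map are all assumptions made for illustration. Frames are plain nested lists to keep the example dependency-free.

```python
def crop_window(frame, center, half):
    """Crop a (2*half+1)-square window around center=(row, col).

    Pixels outside the frame are filled with 0 (an assumed padding choice).
    """
    rows, cols = len(frame), len(frame[0])
    r0, c0 = center
    return [
        [frame[r][c] if 0 <= r < rows and 0 <= c < cols else 0
         for c in range(c0 - half, c0 + half + 1)]
        for r in range(r0 - half, r0 + half + 1)
    ]


def make_input_pair(prev_frame, curr_frame, prev_center, half):
    """Cut the same window from two adjacent frames and stack them as 2 channels."""
    return [
        crop_window(prev_frame, prev_center, half),
        crop_window(curr_frame, prev_center, half),
    ]


def map_peak_to_frame(score_map, prev_center, half):
    """Convert the highest-scoring cell of a window-sized map to frame coordinates.

    The score map stands in for a hypothetical network output.
    """
    cells = ((r, c) for r in range(len(score_map)) for c in range(len(score_map[0])))
    best_r, best_c = max(cells, key=lambda rc: score_map[rc[0]][rc[1]])
    return (prev_center[0] - half + best_r, prev_center[1] - half + best_c)
```

Both windows are centered on the previous location, so any motion between the two frames shows up as a displacement inside the window, which is the kind of joint spatial and temporal cue the abstract refers to.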
Pages: 1610-1623
Number of pages: 14
Related papers
33 entries in total
[11]  
ESS A, 2008, P IEEE C COMP VIS PA, P1
[12]  
FAN J, 2008, P IEEE INT C IM PROC, P2660
[13]  
Grabner H, 2008, LECT NOTES COMPUT SC, V5302, P234, DOI 10.1007/978-3-540-88682-2_19
[14]  
Ho J, 2004, PROC CVPR IEEE, P782
[15]   Robust Object Tracking by Hierarchical Association of Detection Responses [J].
Huang, Chang ;
Wu, Bo ;
Nevatia, Ramakant .
COMPUTER VISION - ECCV 2008, PT II, PROCEEDINGS, 2008, 5303 :788-801
[16]   RECEPTIVE FIELDS, BINOCULAR INTERACTION AND FUNCTIONAL ARCHITECTURE IN CATS VISUAL CORTEX [J].
HUBEL, DH ;
WIESEL, TN .
JOURNAL OF PHYSIOLOGY-LONDON, 1962, 160 (01) :106-&
[17]  
Jepson AD, 2001, PROC CVPR IEEE, P415
[18]   Face recognition: A convolutional neural-network approach [J].
Lawrence, S ;
Giles, CL ;
Tsoi, AC ;
Back, AD .
IEEE TRANSACTIONS ON NEURAL NETWORKS, 1997, 8 (01) :98-113
[19]   Gradient-based learning applied to document recognition [J].
Lecun, Y ;
Bottou, L ;
Bengio, Y ;
Haffner, P .
PROCEEDINGS OF THE IEEE, 1998, 86 (11) :2278-2324
[20]   Tracking in low frame rate video: A cascade particle filter with discriminative observers of different life spans [J].
Li, Yuan ;
Ai, Haizhou ;
Yamashita, Takayoshi ;
Lao, Shihong ;
Kawade, Masato .
IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2008, 30 (10) :1728-1740