Online Video Object Detection using Association LSTM

Cited by: 74
Authors
Lu, Yongyi [1]
Lu, Cewu [2]
Tang, Chi-Keung [1]
Affiliations
[1] HKUST, Hong Kong, Hong Kong, Peoples R China
[2] Shanghai Jiao Tong Univ, Shanghai, Peoples R China
Source
2017 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV) | 2017
DOI
10.1109/ICCV.2017.257
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Video object detection is a fundamental tool for many applications. Since direct application of image-based object detection cannot leverage the rich temporal information inherent in video data, we advocate the detection of long-range video object patterns. While the Long Short-Term Memory (LSTM) has been the de facto choice for such detection, the LSTM currently cannot fundamentally model object association between consecutive frames. In this paper, we propose the association LSTM to address this fundamental association problem. Association LSTM not only regresses and classifies directly on object locations and categories but also associates features to represent each output object. By minimizing the matching error between these features, we learn how to associate objects in two consecutive frames. Additionally, our method works in an online manner, which is important for most video tasks. Our approach outperforms traditional video object detection methods on standard video datasets.
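The abstract's idea of learning association by minimizing a matching error between per-object features can be illustrated with a small sketch. This is not the paper's loss or architecture. It assumes, purely for illustration, that each detected object carries a feature vector, that similarity is measured by cosine similarity, and that the matching error is the mean squared gap between that similarity and a 0/1 "same object" label. The function names `cosine_similarity`, `matching_error`, and `associate` are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length feature vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def matching_error(prev_feats, curr_feats, same_object):
    """Assumed matching error: mean squared gap between similarity and the 0/1 label.

    same_object[i][j] is True when object i in the previous frame and
    object j in the current frame are the same physical object.
    """
    total, count = 0.0, 0
    for i, f_prev in enumerate(prev_feats):
        for j, f_curr in enumerate(curr_feats):
            target = 1.0 if same_object[i][j] else 0.0
            total += (cosine_similarity(f_prev, f_curr) - target) ** 2
            count += 1
    return total / count if count else 0.0


def associate(prev_feats, curr_feats):
    """For each current object, return the index of the most similar previous object.

    Returns None for every current object when the previous frame has no objects.
    """
    if not prev_feats:
        return [None for _ in curr_feats]
    return [
        max(range(len(prev_feats)), key=lambda i: cosine_similarity(prev_feats[i], f))
        for f in curr_feats
    ]


# Two objects whose order is swapped between consecutive frames.
prev_frame = [[1.0, 0.0], [0.0, 1.0]]
curr_frame = [[0.0, 1.0], [1.0, 0.0]]
labels = [[False, True], [True, False]]

error = matching_error(prev_frame, curr_frame, labels)
matches = associate(prev_frame, curr_frame)
```

In this toy case the features already separate the two objects, so the error against the correct labels is zero and each current object is matched to its swapped counterpart. In a trained model the features would be adjusted to drive such an error down. Only the previous and current frames are used, which is consistent with the online setting the abstract describes.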
Pages: 2363-2371
Number of pages: 9