Rethinking Online Action Detection in Untrimmed Videos: A Novel Online Evaluation Protocol

Cited by: 5
Authors
Baptista-Rios, Marcos [1]
Lopez-Sastre, Roberto J. [1]
Caba Heilbron, Fabian [2]
Van Gemert, Jan C. [3]
Acevedo-Rodriguez, F. Javier [1]
Maldonado-Bascon, Saturnino [1]
Affiliations
[1] Univ Alcala, Dept Signal Theory & Commun, GRAM, Alcala De Henares 314100, Spain
[2] Adobe Res, Deep Learning Grp, Media Intelligence Lab, San Jose, CA 95110 USA
[3] Delft Univ Technol, Fac Elect Engn Math & Comp Sci, NL-2628 Delft, Netherlands
Keywords
Computer vision; deep learning; evaluation; instantaneous accuracy; online action detection
DOI
10.1109/ACCESS.2019.2961789
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
The Online Action Detection (OAD) problem needs to be revisited. Unlike traditional offline action detection approaches, where the evaluation metrics are clear and well established, in the OAD setting we find very few works and no consensus on the evaluation protocols to be used. In this work we propose to rethink the OAD scenario, clearly defining the problem itself and the main characteristics that models must comply with to be considered online. We also introduce a novel metric: the Instantaneous Accuracy (IA). This new metric exhibits an online nature and solves most of the limitations of the previous metrics. We conduct a thorough experimental evaluation on three challenging datasets, where the performance of various baseline methods is compared to that of the state of the art. Our results confirm the problems of the previous evaluation protocols, and suggest that an IA-based protocol is better suited to the online scenario. The baseline models and a development kit with the novel evaluation protocol will be made publicly available.
Pages: 5139-5146
Page count: 8