Attentive Spatio-Temporal Representation Learning for Diving Classification

Cited by: 19
Authors
Kanojia, Gagan [1]
Kumawat, Sudhakar [1]
Raman, Shanmuganathan [1]
Affiliations
[1] Indian Inst Technol Gandhinagar, Palaj, India
Source
2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION WORKSHOPS (CVPRW 2019) | 2019
DOI
10.1109/CVPRW.2019.00302
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Competitive diving is a well-recognized aquatic sport in which a person dives from a platform or a springboard into the water. Based on the acrobatics performed during the dive, dives are classified into a finite set of action classes standardized by FINA. In this work, we propose an attention-guided LSTM-based neural network architecture for the task of diving classification. The network takes the frames of a diving video as input and determines its class. We evaluate the performance of the proposed model on a recently introduced competitive diving dataset, Diving48. It contains over 18,000 video clips covering 48 classes of diving. The proposed model exceeds the classification accuracy of the state-of-the-art models in both 2D and 3D frameworks by 11.54% and 4.24%, respectively. We show that the network is able to localize the diver in the video frames during the dive without being trained with such supervision.
Pages: 2467-2476
Number of pages: 10