Research on 3D Convolutional Neural Network and Its Application to Video Understanding

Cited by: 4
Authors
Bai, Jing [1, 2]
Yang, Zhanyuan [1]
Peng, Bin [1]
Li, Wenjing [1]
Affiliations
[1] North Minzu Univ, Sch Comp Sci & Engn, Yinchuan 750021, Peoples R China
[2] Natl Ethn Affairs Commiss, Image Graph Intelligent Proc Lab, Yinchuan 750021, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Video understanding; Deep learning; 3D Convolutional Neural Network (3D CNN); Network structure;
DOI
10.11999/JEIT220596
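Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronic technology, communication technology]
Subject classification codes
0808; 0809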
Abstract
The 3D Convolutional Neural Network (3D CNN) has been a hot topic in deep learning research in recent years and has achieved strong results in computer vision. Despite years of research and abundant results, a comprehensive and detailed review of the topic is still lacking. This paper introduces the 3D CNN from the following aspects. First, the rationale and model structure of the 3D CNN are presented. Then, improvements to the 3D CNN are summarized in terms of network structure, the network's internal components, and optimization methods. Next, applications of the 3D CNN in video understanding are described. Finally, the paper is summarized and future developments are discussed. The paper provides a systematic review of the latest research progress on 3D CNNs and their applications in video understanding, which is of positive significance to the research and development of 3D CNNs.
Pages: 2273-2283
Page count: 11