A review of convolutional neural networks in computer vision

Cited by: 219
Authors
Zhao, Xia [1]
Wang, Limin [1]
Zhang, Yufei [2]
Han, Xuming [3]
Deveci, Muhammet [4, 5, 6]
Parmar, Milan [7]
Affiliations
[1] Guangdong Univ Finance & Econ, Sch Informat Sci, Guangzhou 510320, Peoples R China
[2] Changchun Univ Sci & Technol, Sch Comp Sci & Technol, Changchun 130022, Peoples R China
[3] Jinan Univ, Sch Informat Sci & Technol, Guangzhou 510632, Peoples R China
[4] Natl Def Univ, Turkish Naval Acad, Dept Ind Engn, TR-34942 Istanbul, Turkiye
[5] UCL, Bartlett Sch Sustainable Construction, 1-19 Torrington Pl, London WC1E 7HB, England
[6] Lebanese Amer Univ, Dept Elect & Comp Engn, Byblos, Lebanon
[7] Mississippi State Univ, Dept Comp Sci & Engn, Starkville, MS 39762 USA
Keywords
Convolutional neural networks; Computer vision; Status quo review; Deep learning; MODELS;
DOI
10.1007/s10462-024-10721-6
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
With the rapid development of deep convolutional neural networks (CNNs), computer vision has seen notable advances in image classification, semantic segmentation, object detection, and image super-resolution reconstruction. CNNs excel at autonomous learning and representation: features can be extracted from raw input data by training CNN models suited to the practical application. Owing to the rapid progress of deep learning, CNN architectures are becoming increasingly complex and diverse, and they are gradually replacing traditional machine learning methods. This paper presents an elementary overview of CNN components and their functions, including input layers, convolution layers, pooling layers, activation functions, batch normalization, dropout, fully connected layers, and output layers. On this basis, it gives a comprehensive overview of past and current research on applications of CNN models in computer vision, e.g., image classification, object detection, and video prediction. In addition, we summarize the challenges and solutions of deep CNNs and discuss future research directions.
Pages: 43