Convolutional neural network: a review of models, methodologies and applications to object detection

Cited by: 566
Authors
Dhillon, Anamika [1]
Verma, Gyanendra K. [1]
Affiliations
[1] Natl Inst Technol Kurukshetra, Dept Comp Engn, Kurukshetra 136119, Haryana, India
Keywords
Deep learning; CNN architectures; Transfer learning; Object detection; DEEP LEARNING APPROACH; HANDGUN DETECTION; RECOGNITION; CLASSIFICATION; IDENTIFICATION; ARCHITECTURES; AUTOENCODERS; FUSION; HEALTH
DOI
10.1007/s13748-019-00203-0
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Deep learning has emerged as an effective machine learning method that uses numerous layers of features or representations of the data and provides state-of-the-art results. It has shown impressive performance in various application areas, particularly image classification, segmentation and object detection. Recent advances in deep learning techniques have brought encouraging performance to fine-grained image classification, which aims to distinguish subordinate-level categories. This task is extremely challenging because of high intra-class and low inter-class variance. In this paper, we provide a detailed review of various deep architectures and models, highlighting the characteristics of each model. First, we describe the functioning of CNN architectures and their components, followed by a detailed description of various CNN models, from the classical LeNet model to AlexNet, ZFNet, GoogleNet, VGGNet, ResNet, ResNeXt, SENet, DenseNet, Xception and PNAS/ENAS. We focus mainly on the application of deep learning architectures to three major applications, namely (i) wild animal detection, (ii) small arm detection and (iii) human being detection. A detailed review summary, including the systems, databases, applications and accuracies claimed, is also provided for each model to serve as a guideline for future work in these application areas.
Pages: 85-112
Page count: 28