A survey: object detection methods from CNN to transformer

Cited by: 54
Authors
Arkin, Ershat [1]
Yadikar, Nurbiya [1]
Xu, Xuebin [1]
Aysa, Alimjan [2]
Ubul, Kurban [1, 2]
Affiliations
[1] Xinjiang Univ, Coll Informat Sci & Engn, Urumqi 830046, Peoples R China
[2] Xinjiang Univ, Key Lab Multilingual Informat Technol, Urumqi 830046, Peoples R China
Funding
National Science Foundation (USA);
Keywords
Computer vision; Object detection; Real-time system; CNN; Transformer; NETWORKS;
DOI
10.1007/s11042-022-13801-3
Chinese Library Classification
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Object detection is one of the most important problems in computer vision. Since AlexNet was proposed, methods based on Convolutional Neural Networks (CNNs) have become mainstream in the computer vision field, and many studies on neural networks and variations of their algorithmic structures have appeared. Achieving fast and accurate detection requires moving beyond the existing CNN framework, which poses great challenges. The Transformer's relatively mature theoretical support and technological development in Natural Language Processing have brought it to researchers' attention, and Transformer-based methods have been shown to work for computer vision tasks and to outperform existing CNN methods on some of them. To help more researchers understand the development of object detection methods, existing methods, different frameworks, challenging problems, and development trends, this paper introduces the classic CNN-based object detection methods and discusses their highlights, advantages, and disadvantages. Drawing on a large body of literature, the paper compares different CNN-based and Transformer-based detection methods. Thirteen detection methods that are broadly influential, mainstream, and promising are selected and compared under fair conditions. The comparative data give us confidence in the development of the Transformer and in the convergence between different methods. The paper also presents recent innovative approaches to using the Transformer in computer vision tasks. Finally, the challenges, opportunities, and future prospects of the field are summarized.
Pages: 21353-21383
Page count: 31
Related papers
50 in total
  • [31] Efficient convolutional neural networks and network compression methods for object detection: a survey
    Zhou, Yong
    Xia, Lei
    Zhao, Jiaqi
    Yao, Rui
    Liu, Bing
    MULTIMEDIA TOOLS AND APPLICATIONS, 2024, 83 (04) : 10167 - 10209
  • [32] A Systematic Survey of Transformer-Based 3D Object Detection for Autonomous Driving: Methods, Challenges and Trends
    Zhu, Minling
    Gong, Yadong
    Tian, Chunwei
    Zhu, Zuyuan
    DRONES, 2024, 8 (08)
  • [33] Real-Time Object Detection Network in UAV-Vision Based on CNN and Transformer
    Ye, Tao
    Qin, Wenyang
    Zhao, Zongyang
    Gao, Xiaozhi
    Deng, Xiangpeng
    Ouyang, Yu
    IEEE TRANSACTIONS ON INSTRUMENTATION AND MEASUREMENT, 2023, 72
  • [34] A Survey on Object Detection, Annotation and Anomaly Detection Methods for Endoscopic Videos
    Chheda, Tejas
    Koppaka, Soumya
    Iyer, Rithvika
    Kalbande, Dhananjay
    PROCEEDINGS OF THE 2020 5TH INTERNATIONAL CONFERENCE ON COMPUTING, COMMUNICATION AND SECURITY (ICCCS-2020), 2020
  • [35] Fast Object Detection Algorithm Based On HOG And CNN
    Lu, Tongwei
    Wang, Dandan
    Zhang, Yanduo
    NINTH INTERNATIONAL CONFERENCE ON GRAPHIC AND IMAGE PROCESSING (ICGIP 2017), 2018, 10615
  • [36] HA-Transformer: Harmonious aggregation from local to global for object detection
    Chen, Yang
    Chen, Sihan
    Deng, Yongqiang
    Wang, Kunfeng
    EXPERT SYSTEMS WITH APPLICATIONS, 2023, 230
  • [37] Multiscale fire image detection method based on CNN and Transformer
    Shengbao Wu
    Buyun Sheng
    Gaocai Fu
    Daode Zhang
    Yuchao Jian
    Multimedia Tools and Applications, 2024, 83 : 49787 - 49811
  • [38] CNN-Transformer Hybrid Architecture for Early Fire Detection
    Yang, Chenyue
    Pan, Yixuan
    Cao, Yichao
    Lu, Xiaobo
    ARTIFICIAL NEURAL NETWORKS AND MACHINE LEARNING - ICANN 2022, PT IV, 2022, 13532 : 570 - 581
  • [39] Multiscale fire image detection method based on CNN and Transformer
    Wu, Shengbao
    Sheng, Buyun
    Fu, Gaocai
    Zhang, Daode
    Jian, Yuchao
    MULTIMEDIA TOOLS AND APPLICATIONS, 2023, 83 (16) : 49787 - 49811
  • [40] Banknote Object Detection for the Visually Impaired using a CNN
    Thomas, Maria
    Meehan, Kevin
    2021 32ND IRISH SIGNALS AND SYSTEMS CONFERENCE (ISSC 2021), 2021