A survey: object detection methods from CNN to transformer

Cited by: 54
Authors
Arkin, Ershat [1]
Yadikar, Nurbiya [1]
Xu, Xuebin [1]
Aysa, Alimjan [2]
Ubul, Kurban [1, 2]
Affiliations
[1] Xinjiang Univ, Coll Informat Sci & Engn, Urumqi 830046, Peoples R China
[2] Xinjiang Univ, Key Lab Multilingual Informat Technol, Urumqi 830046, Peoples R China
Funding
U.S. National Science Foundation;
Keywords
Computer vision; Object detection; Real-time system; CNN; Transformer; NETWORKS;
DOI
10.1007/s11042-022-13801-3
Chinese Library Classification number
TP [Automation technology, computer technology];
Subject classification code
0812;
Abstract
Object detection is one of the most important problems in computer vision. Since AlexNet was proposed, methods based on Convolutional Neural Networks (CNNs) have become mainstream in the field, and much research on neural networks and on variations of algorithm structure has appeared. Achieving detection that is both fast and accurate requires stepping outside the existing CNN framework, which is a considerable challenge. The Transformer's relatively mature theoretical support and technological development in Natural Language Processing have brought it to researchers' attention, and it has been shown that Transformer-based methods can be applied to computer vision tasks and that they outperform existing CNN methods on some of them. To help more researchers understand the development of object detection methods, the existing methods, the different frameworks, the challenging problems, and the development trends, this paper introduces the classic CNN-based object detection methods and discusses the highlights, advantages, and disadvantages of these algorithms. Drawing on a large body of literature, the paper compares different CNN detection methods and Transformer detection methods. Thirteen detection methods that have had a broad impact on the field, and that are among the most mainstream and promising, are selected and compared under fair conditions. The comparative data give confidence in the development of the Transformer and in the convergence between different methods. The paper also presents recent innovative approaches to using the Transformer in computer vision tasks. Finally, the challenges, opportunities, and future prospects of the field are summarized.
Pages: 21353-21383
Page count: 31