GANsformer: A Detection Network for Aerial Images with High Performance Combining Convolutional Network and Transformer

Cited by: 30
Authors
Zhang, Yan [1]
Liu, Xi [2]
Wa, Shiyun [1]
Chen, Shuyu [3]
Ma, Qin [1]
Affiliations
[1] China Agr Univ, Coll Informat & Elect Engn, Beijing 100083, Peoples R China
[2] China Agr Univ, Coll Humanities & Dev, Beijing 100083, Peoples R China
[3] China Agr Univ, Coll Engn, Beijing 100083, Peoples R China
Keywords
object detection; transformer; deep learning; aerial image; generative model; GANsformer detection network; DEEP LEARNING-METHODS; OBJECT DETECTION; RECOGNITION;
DOI
10.3390/rs14040923
Chinese Library Classification number
X [Environmental Science, Safety Science];
Discipline classification code
08; 0830
Abstract
Small object detection in aerial images has progressed substantially in recent years, owing to the wide application and improved performance of convolutional neural networks (CNNs). Traditional machine learning algorithms typically prioritize inference speed over accuracy. With insufficient samples, CNNs can suffer from instability, non-convergence, and overfitting. Additionally, detection in aerial images poses inherent challenges, such as varying altitudes and illumination conditions and blurred, densely packed objects, which lower detection accuracy. To address these problems, this paper adds a transformer-backbone attention mechanism as a branch network that uses region-wide feature information. It also employs a generative model to expand the input aerial images ahead of the backbone, combining the respective advantages of the generative model and the transformer network. On the dataset presented in this study, the model achieves 96.77% precision, 98.83% recall, and 97.91% mAP by adding the Multi-GANs module to the one-stage detection network. These three metrics improve by 13.9%, 20.54%, and 10.27%, respectively, compared with the other detection networks. Furthermore, this study provides an auto-pruning technique that can achieve an inference speed of 32.2 FPS with a minor performance loss, meeting the requirements of real-time detection environments. This research also develops a macOS application for the proposed algorithm using Swift.
Pages: 29