GANsformer: A Detection Network for Aerial Images with High Performance Combining Convolutional Network and Transformer

Cited by: 29
Authors
Zhang, Yan [1]
Liu, Xi [2]
Wa, Shiyun [1]
Chen, Shuyu [3]
Ma, Qin [1]
Affiliations
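[1] China Agricultural University, College of Information and Electrical Engineering, Beijing 100083, People's Republic of China
[2] China Agricultural University, College of Humanities and Development, Beijing 100083, People's Republic of China
[3] China Agricultural University, College of Engineering, Beijing 100083, People's Republic of China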
Keywords
object detection; transformer; deep learning; aerial image; generative model; GANsformer detection network; deep learning methods; recognition
DOI
10.3390/rs14040923
Chinese Library Classification number
X [Environmental Science, Safety Science]
Discipline classification code
08; 0830
Abstract
Small object detection in aerial images has progressed substantially in recent years, driven by the wide application and improved performance of convolutional neural networks (CNNs). Traditional machine learning algorithms typically prioritize inference speed over accuracy, while CNNs trained on insufficient samples can suffer from instability, non-convergence, and overfitting. Detection in aerial images also poses inherent challenges, such as varying altitudes and illumination conditions and blurred, densely packed objects, which lower detection accuracy. To address these problems, this paper adds a transformer-based attention mechanism as a branch network that uses region-wide feature information, and employs a generative model to expand the input aerial images ahead of the backbone, combining the respective advantages of the generative model and the transformer network. On the dataset presented in this study, adding the Multi-GANs module to the one-stage detection network yields 96.77% precision, 98.83% recall, and 97.91% mAP. These three metrics improve by 13.9%, 20.54%, and 10.27%, respectively, compared with the other detection networks. The study also provides an auto-pruning technique that can reach an inference speed of 32.2 FPS with a minor performance loss, which suits real-time detection settings. Finally, a macOS application for the proposed algorithm is developed using Swift.
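As a reading aid for the precision and recall figures quoted in the abstract, the following is a minimal sketch of how these two metrics are conventionally computed from detection counts. The function name and the example counts are hypothetical and are not taken from the paper; the abstract does not state the matching criterion (for example, the IoU threshold) used to decide what counts as a true positive.

```python
from typing import Tuple


def precision_recall(tp: int, fp: int, fn: int) -> Tuple[float, float]:
    """Return (precision, recall) from true-positive, false-positive,
    and false-negative detection counts.

    A zero denominator yields 0.0 for that metric instead of raising.
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical counts for illustration only (not the paper's data).
p, r = precision_recall(tp=90, fp=10, fn=30)
print(f"precision={p:.2f}, recall={r:.2f}")
```

Mean average precision (mAP) additionally averages the area under the precision-recall curve over classes, which this sketch does not cover.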
Pages: 29