Object recognition in remote sensing images presents unique challenges due to the diverse scales, shapes, and distributions of objects, particularly small and complex ones. Existing frameworks such as RT-DETR struggle to detect small objects accurately because of their limited ability to extract fine-grained details and integrate multi-scale information. To overcome these challenges, we propose an enhanced object recognition model based on a hybrid convolution-and-transformer structure. The model improves two critical components of the original RT-DETR by introducing the Multi-Scale Adaptive Attention Module (MSAAM) and the Hybrid Feature Fusion Module (HFFM), both designed to enhance feature extraction and integration. The MSAAM strengthens the ResNet backbone by adaptively combining local and global information, ensuring effective extraction of fine-grained details while emphasizing features critical for small-object detection. The HFFM, integrated into the final stages of the neck, employs a dual-branch design that balances fine-grained local detail extraction with large-scale contextual understanding. Through group convolution, depthwise separable convolution, and attention mechanisms, the HFFM mitigates the loss of fine detail caused by downsampling while exploiting the expanded receptive field for broader contextual understanding. Experimental results demonstrate that the proposed model achieves superior object recognition performance, particularly for small objects, making it well suited to remote sensing applications.
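The efficiency argument behind the HFFM's use of group and depthwise separable convolutions can be illustrated with a short parameter-count calculation. This is a sketch only: the channel widths and kernel size below are assumed for illustration and are not the model's actual configuration.

```python
# Weight-only parameter counts (biases ignored) for the convolution
# variants the HFFM relies on. The dimensions used in the example at
# the bottom are illustrative assumptions.

def standard_conv_params(c_in, c_out, k):
    # A standard k x k convolution learns one k x k filter per
    # (input channel, output channel) pair.
    return c_in * c_out * k * k

def group_conv_params(c_in, c_out, k, groups):
    # Group convolution splits channels into independent groups,
    # dividing the parameter count by the number of groups.
    assert c_in % groups == 0 and c_out % groups == 0
    return (c_in // groups) * (c_out // groups) * k * k * groups

def depthwise_separable_params(c_in, c_out, k):
    # Depthwise stage: one k x k filter per input channel
    # (i.e. groups == c_in), followed by a 1x1 pointwise
    # convolution that mixes information across channels.
    depthwise = c_in * k * k
    pointwise = c_in * c_out
    return depthwise + pointwise

c_in, c_out, k = 256, 256, 3  # assumed example dimensions
std = standard_conv_params(c_in, c_out, k)          # 589,824
grp = group_conv_params(c_in, c_out, k, groups=4)   # 147,456
dws = depthwise_separable_params(c_in, c_out, k)    # 67,840
print(f"standard: {std}, group(4): {grp}, separable: {dws}")
```

For these assumed dimensions the depthwise separable form uses roughly 8.7x fewer parameters than a standard convolution, which is why such layers can preserve fine-grained branches without a prohibitive compute budget.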