Multi-scale coupled attention for visual object detection

Cited by: 2
Authors
Li, Fei [1]
Yan, Hongping [2]
Shi, Linsu [1]
Affiliations
[1] China Tower Corp Ltd, 9 Dongran North St, Beijing 100195, Peoples R China
[2] China Univ Geosci, Xueyuan Rd 29, Beijing 100083, Peoples R China
Source
SCIENTIFIC REPORTS | 2024 / Vol. 14 / Issue 01
Keywords
Attention mechanism; Deep neural networks; Object detection; Self-attention learning; Transformer; YOLO
DOI
10.1038/s41598-024-60897-8
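Chinese Library Classification (CLC) code
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences]
Discipline classification code
07; 0710; 09
Abstract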
Deep neural networks have achieved remarkable success in object detection. However, network structures must still be continually evolved and finely tuned to achieve better performance. This need is driven by the continuing demand for high performance in complex scenes, where the objects to be detected vary in scale and are scattered across the image. To this end, this paper proposes a network structure called Multi-Scale Coupled Attention (MSCA) under the framework of self-attention learning with methodologies of importance assessment. Architecturally, it consists of a Multi-Scale Coupled Channel Attention (MSCCA) module and a Multi-Scale Coupled Spatial Attention (MSCSA) module. Specifically, the MSCCA module performs self-attention learning linearly on the multi-scale channels. In parallel, the MSCSA module performs self-attention learning nonlinearly on the multi-scale spatial grids. The MSCCA and MSCSA modules can be connected in sequence and used as a plugin to develop end-to-end learning models for object detection. Finally, the proposed network is compared on two public datasets with 13 classical or state-of-the-art models: Faster R-CNN, Cascade R-CNN, RetinaNet, SSD, PP-YOLO, YOLO v3, YOLO v5, YOLO v7, YOLOX, DETR, conditional DETR, UP-DETR and FP-DETR. Comparative experimental results with numerical scores, the ablation study, and the performance behaviour all demonstrate the effectiveness of the proposed model.
Pages: 19
Related papers
50 in total
  • [21] Multi-Scale Attention Deep Neural Network for Fast Accurate Object Detection
    Song, Kaiyou
    Yang, Hua
    Yin, Zhouping
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2019, 29 (10) : 2972 - 2985
  • [22] Multi-scale traffic sign detection model with attention
    Fan, Bei Bei
    Yang, He
    PROCEEDINGS OF THE INSTITUTION OF MECHANICAL ENGINEERS PART D-JOURNAL OF AUTOMOBILE ENGINEERING, 2021, 235 (2-3) : 708 - 720
  • [23] Multi-Attention Object Detection Model in Remote Sensing Images Based on Multi-Scale
    Ying, Xiang
    Wang, Qiang
    Li, Xuewei
    Yu, Mei
    Jiang, Han
    Gao, Jie
    Liu, Zhiqiang
    Yu, Ruiguo
    IEEE ACCESS, 2019, 7 : 94508 - 94519
  • [24] Multi-Scale Spatial and Channel-wise Attention for Improving Object Detection in Remote Sensing Imagery
    Chen, Jie
    Wan, Li
    Zhu, Jingru
    Xu, Gang
    Deng, Min
    IEEE GEOSCIENCE AND REMOTE SENSING LETTERS, 2020, 17 (04) : 681 - 685
  • [25] Multi-scale Vertical Cross-layer Feature Aggregation and Attention Fusion Network for Object Detection
    Gao, Wenting
    Li, Xiaojuan
    Han, Yu
    Liu, Yue
    ARTIFICIAL NEURAL NETWORKS AND MACHINE LEARNING - ICANN 2022, PT IV, 2022, 13532 : 139 - 150
  • [26] Multi-scale oriented object detection in aerial images based on convolutional neural networks with global attention
    Fei, Jingjing
    Wang, Zhicheng
    Yu, Zhaohui
    Gu, Xi
    Wei, Gang
    MIPPR 2019: REMOTE SENSING IMAGE PROCESSING, GEOGRAPHIC INFORMATION SYSTEMS, AND OTHER APPLICATIONS, 2020, 11432
  • [27] Multi-scale visual detection for waterborne ship targets
    Huang J.
    Tang N.
    Wen Y.
    Guo Y.
    Zhu L.
    Xiao C.
    Harbin Gongye Daxue Xuebao/Journal of Harbin Institute of Technology, 2024, 56 (05) : 103 - 113
  • [28] MSA-YOLO: A Remote Sensing Object Detection Model Based on Multi-Scale Strip Attention
    Su, Zihang
    Yu, Jiong
    Tan, Haotian
    Wan, Xueqiang
    Qi, Kaiyang
    SENSORS, 2023, 23 (15)
  • [29] MFFAMM: A Small Object Detection with Multi-Scale Feature Fusion and Attention Mechanism Module
    Qu, Zhong
    Han, Tongqiang
    Yi, Turning
    APPLIED SCIENCES-BASEL, 2022, 12 (18)
  • [30] Feature Enhancement for Multi-scale Object Detection
    Zheng, Huicheng
    Chen, Jiajie
    Chen, Lvran
    Li, Ye
    Yan, Zhiwei
    NEURAL PROCESSING LETTERS, 2020, 51 (02) : 1907 - 1919