Dynamic multi-headed self-attention and multiscale enhancement vision transformer for object detection

Authors
Fang, Sikai [1]
Lu, Xiaofeng [1, 2]
Huang, Yifan [1]
Sun, Guangling [1]
Liu, Xuefeng [1]
Affiliations
[1] Shanghai Univ, Sch Commun & Informat Engn, 99 Shangda Rd, Shanghai 200444, Peoples R China
[2] Shanghai Univ, Wenzhou Inst, Wenzhou, Peoples R China
Keywords
Dynamic gate; Multiscale; Object detection; Self-attention; Vision transformer
DOI
10.1007/s11042-024-18234-8
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
The self-attention-based vision transformer has powerful feature extraction capabilities and has demonstrated competitive performance in several tasks. However, the conventional self-attention mechanism exhibits global perceptual properties and favors large-scale objects, so there is still room for improvement at other scales in object detection. To address this issue, the dynamic gate-assisted network (DGANet), a novel yet simple framework, is proposed to enhance the multiscale generalization capability of the vision transformer structure. First, we design the dynamic multi-headed self-attention mechanism (DMH-SAM), which dynamically selects the self-attention components and uses a local-to-global self-attention pattern, enabling the model to learn features of objects at different scales autonomously while reducing the computational cost. Then, we propose a dynamic multiscale encoder (DMEncoder), which weights and encodes feature maps with different receptive fields to adaptively narrow the network's performance gap across object scales. Extensive ablation and comparison experiments demonstrate the effectiveness of the proposed method. Its detection accuracy for small, medium and large targets reaches 27.6, 47.4 and 58.5, respectively, surpassing state-of-the-art object detection methods, while its model complexity is reduced by 23%.
Pages: 67213-67229
Number of pages: 17