DKTNet: Dual-Key Transformer Network for small object detection

Cited by: 29
Authors
Xu, Shoukun [1]
Gu, Jianan [1]
Hua, Yining [2]
Liu, Yi [1]
Affiliations
[1] Changzhou University, Changzhou 213164, Jiangsu, People's Republic of China
[2] University of Aberdeen, Aberdeen, Scotland
Funding
National Natural Science Foundation of China;
Keywords
Small object detection; Transformer; Dual-key;
DOI
10.1016/j.neucom.2023.01.055
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Object detection is a fundamental computer vision task that plays a crucial role in a wide range of real-world applications. However, detecting small objects in complex scenes remains challenging because of their low resolution and noisy appearance, caused by occlusion, distant viewpoints, etc. To tackle this issue, this paper proposes a novel transformer architecture, the Dual-Key Transformer Network (DKTNet). To improve feature attention, the coherence of the linear-layer outputs Q and V is enhanced by a dual key K integrated from K1 and K2, which are computed along Q and V, respectively. Instead of spatial-wise attention, a channel-wise self-attention mechanism is adopted to promote the important feature channels and suppress the confusing ones. Moreover, 2D and 1D convolution computations for Q, K and V are proposed. Compared with the fully-connected computation in conventional transformer architectures, the 2D convolution better captures local details and global contextual information, and the 1D convolution significantly reduces network complexity. Experiments are conducted on both general and small object detection datasets, and comparison against state-of-the-art approaches demonstrates the superiority of these features. (c) 2023 Elsevier B.V. All rights reserved.
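The channel-wise attention with a dual key described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how K1 and K2 are integrated, so the sketch assumes an element-wise sum, assumes a scaling factor of sqrt(N), and takes Q, K1, K2 and V as already-computed C x N matrices (C channels, N flattened spatial positions) instead of producing them with 2D and 1D convolutions. The function name `dual_key_channel_attention` and its helpers are hypothetical.

```python
import math


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    """Swap rows and columns of a matrix given as a list of rows."""
    return [list(col) for col in zip(*a)]


def softmax_rows(a):
    """Numerically stable softmax applied independently to each row."""
    out = []
    for row in a:
        m = max(row)
        exps = [math.exp(x - m) for x in row]
        total = sum(exps)
        out.append([e / total for e in exps])
    return out


def dual_key_channel_attention(q, k1, k2, v):
    """Channel-wise attention with a dual key (illustrative sketch only).

    q, k1, k2, v: C x N matrices (C channels, N spatial positions).
    Returns the C x C channel attention map and the C x N attended output.
    """
    n = len(q[0])
    # ASSUMPTION: the dual key is the element-wise sum K1 + K2; the abstract
    # only states that K is "integrated" from K1 and K2.
    k = [[a + b for a, b in zip(r1, r2)] for r1, r2 in zip(k1, k2)]
    # Channel-wise scores: each channel of Q is compared with each channel of K,
    # giving a C x C map instead of the N x N map of spatial attention.
    scores = matmul(q, transpose(k))
    scale = math.sqrt(n)  # ASSUMPTION: scaling choice is not given in the source
    attn = softmax_rows([[s / scale for s in row] for row in scores])
    # Re-weight the channels of V with the channel attention map.
    return attn, matmul(attn, v)


if __name__ == "__main__":
    # Toy input: 3 channels, 4 spatial positions.
    x = [[1.0, 0.0, 0.0, 0.0],
         [0.0, 1.0, 0.0, 0.0],
         [0.0, 0.0, 1.0, 1.0]]
    attn, out = dual_key_channel_attention(x, x, x, x)
    print(len(attn), len(attn[0]))  # attention map is C x C
    print(len(out), len(out[0]))    # output keeps the C x N shape
```

Because the attention map is C x C, its size depends on the number of channels and not on the image resolution, which is the property that separates channel-wise attention from spatial-wise attention.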
Pages: 29-41
Number of pages: 13