DKTNet: Dual-Key Transformer Network for small object detection

Cited by: 29
Authors
Xu, Shoukun [1]
Gu, Jianan [1]
Hua, Yining [2]
Liu, Yi [1]
Affiliations
[1] Changzhou Univ, Changzhou 213164, Jiangsu, Peoples R China
[2] Univ Aberdeen, Aberdeen, Scotland
Funding
National Natural Science Foundation of China;
Keywords
Small object detection; Transformer; Dual-key;
DOI
10.1016/j.neucom.2023.01.055
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Object detection is a fundamental computer vision task that plays a crucial role in a wide range of real-world applications. However, detecting small objects in complex scenes remains challenging because of their low resolution and noisy appearance, caused by occlusion, distant views, etc. To tackle this issue, this paper proposes a novel transformer architecture, the Dual-Key Transformer Network (DKTNet). To improve feature attention, the coherence of the linear-layer outputs Q and V is enhanced by a dual-K integrated from K1 and K2, which are computed along Q and V, respectively. Instead of spatial-wise attention, a channel-wise self-attention mechanism is adopted to promote the important feature channels and suppress the confusing ones. Moreover, 2D and 1D convolution computations for Q, K and V are proposed. Compared with the fully connected computation in conventional transformer architectures, the 2D convolution can better capture local details and global contextual information, and the 1D convolution can significantly reduce network complexity. Experiments are conducted on both general and small object detection datasets, and comparison against state-of-the-art approaches demonstrates the superiority of these features. (c) 2023 Elsevier B.V. All rights reserved.
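The abstract names two ideas that a small example may make more concrete: attention computed across channels rather than spatial positions, and a key built from two branches, one derived from Q and one from V. The NumPy sketch below is only an illustration under stated assumptions and is not the authors' implementation. The function name `dual_key_channel_attention`, the plain matrix projections (used here in place of the 2D and 1D convolutions the paper proposes), the additive fusion `K = K1 + K2`, and the scaling factor are all hypothetical choices, since the abstract does not specify them.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def dual_key_channel_attention(x, w_q, w_v, w_k1, w_k2):
    """Channel-wise attention with a fused key (illustrative only).

    x is a feature map of shape (C, N), with N = H * W flattened positions.
    All projections are plain (C, C) matrices; this is an assumption made
    for brevity, not the convolutional design described in the abstract.
    """
    q = w_q @ x                # (C, N) query
    v = w_v @ x                # (C, N) value
    k1 = w_k1 @ q              # key branch derived from Q (assumed form)
    k2 = w_k2 @ v              # key branch derived from V (assumed form)
    k = k1 + k2                # assumed fusion of the two keys
    # Channel-wise scores: a (C, C) map instead of the usual (N, N) map.
    scores = (q @ k.T) / np.sqrt(x.shape[1])
    attn = softmax(scores, axis=-1)
    return attn @ v, attn      # re-weighted channels, attention map

rng = np.random.default_rng(0)
C, H, W = 8, 4, 4
x = rng.standard_normal((C, H * W))
weights = [rng.standard_normal((C, C)) * 0.1 for _ in range(4)]
out, attn = dual_key_channel_attention(x, *weights)
print(out.shape, attn.shape)   # (8, 16) (8, 8)
```

Because the attention map is C x C rather than N x N, its size in this sketch depends on the number of channels and not on the image resolution.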
Pages: 29-41
Number of pages: 13