DKTNet: Dual-Key Transformer Network for small object detection

Cited by: 29

Authors:

Xu, Shoukun [1]

Gu, Jianan [1]

Hua, Yining [2]

Liu, Yi [1]

Affiliations:

[1] Changzhou Univ, Changzhou 213164, Jiangsu, Peoples R China

[2] Univ Aberdeen, Aberdeen, Scotland

Source:

Neurocomputing | 2023 / Volume 525

Funding:

National Natural Science Foundation of China

Keywords:

Small object detection; Transformer; Dual-key;

DOI:

10.1016/j.neucom.2023.01.055

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Object detection is a fundamental computer vision task that plays a crucial role in a wide range of real-world applications. However, detecting small objects in complex scenes remains challenging because of their low resolution and the noisy appearance representation caused by occlusion, distant viewpoints, etc. To tackle this issue, a novel transformer architecture, the Dual-Key Transformer Network (DKTNet), is proposed in this paper. To improve the feature attention ability, the coherence of the linear layer outputs Q and V is enhanced by a dual-K integrated from K1 and K2, which are computed along Q and V, respectively. Instead of spatial-wise attention, a channel-wise self-attention mechanism is adopted to promote the important feature channels and suppress the confusing ones. Moreover, 2D and 1D convolution computations for Q, K and V are proposed. Compared with the fully-connected computation in conventional transformer architectures, the 2D convolution can better capture local details and global contextual information, and the 1D convolution can reduce network complexity significantly. Experimental evaluation is conducted on both general and small object detection datasets. The superiority of the aforementioned features in our proposed approach is demonstrated through comparison against state-of-the-art approaches. (c) 2023 Elsevier B.V. All rights reserved.
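
The channel-wise, dual-key attention described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The abstract does not say how K1 and K2 are integrated, so the sketch assumes a simple sum. It also uses plain matrix projections in place of the 2D and 1D convolutions, and the scaling factor is an assumption. The function name `dual_key_channel_attention` and all weight names are hypothetical.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def dual_key_channel_attention(x, w_q, w_k1, w_k2, w_v):
    """Hypothetical sketch of channel-wise attention with a dual key.

    x: array of shape (C, N), a feature map with C channels flattened
       over N = H * W spatial positions.
    w_q, w_k1, w_k2, w_v: (C, C) matrices standing in for the paper's
       convolutional projections (assumption).
    """
    q = w_q @ x                # (C, N)
    v = w_v @ x                # (C, N)
    k1 = w_k1 @ q              # key computed along Q
    k2 = w_k2 @ v              # key computed along V
    k = k1 + k2                # assumed integration of K1 and K2
    n = x.shape[1]
    # Channel-wise attention: the score matrix is (C, C), not (N, N).
    attn = softmax(q @ k.T / np.sqrt(n), axis=-1)
    out = attn @ v             # (C, N): channels re-weighted by attention
    return out, attn

rng = np.random.default_rng(0)
C, H, W = 4, 3, 3
x = rng.standard_normal((C, H * W))
weights = [rng.standard_normal((C, C)) for _ in range(4)]
out, attn = dual_key_channel_attention(x, *weights)
print(out.shape, attn.shape)   # (4, 9) (4, 4)
```

Because the attention matrix is C x C, its size depends on the number of channels rather than on the spatial resolution, which is the main structural difference from spatial-wise self-attention.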

Citation

Pages: 29-41

Number of pages: 13
