Tiny Object Detection via Regional Cross Self-Attention Network

Cited by: 7
Authors
Cheng, Keyang [1]
Cui, Honggang [1]
Ghafoor, Humaira Abdul [1]
Wan, Hao [1]
Mao, Qirong [1]
Zhan, Yongzhao [1]
Affiliations
[1] Jiangsu Univ, Sch Comp Sci & Commun Engn, Zhenjiang 212013, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Detectors; Object detection; Encoding; Feature extraction; Transformers; Image coding; Generators; Tiny object detection; context aggregation; vision transformer; self-attention; position coding; feature fusion;
DOI
10.1109/TCSVT.2022.3232688
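Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronics and communication technology]
Discipline classification code
0808; 0809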
Abstract
As vision sensor technology evolves, the demands on detecting targets of interest in captured images keep increasing. Because they offer fast detection and high accuracy, geometric key point-based solutions are favored in industry. However, the real world contains a large number of small and fuzzy objects, and geometric key point detectors do not effectively exploit the contextual features of the region of interest, which leads to excessive false positives and false negatives. In this work, a simple, effective, and interpretable tiny object detection method called Regional Cross Self-Attention Object Detection Network (RCSANet) is proposed. It adopts Region Proposal Networks and transformers to capture regional background relations and uses those relations to generate key point sequences. A regional cross self-attention mechanism is introduced to curtail computational redundancy and minimize the interference of redundant information with the target region. Additionally, a position coding scheme called dynamic implicit position coding is proposed to work together with regional cross self-attention; it can encode arbitrarily long input sequences. The computational cost of RCSANet is significantly lower than that of state-of-the-art object detection solutions. Moreover, RCSANet improves performance on four benchmark datasets, MS COCO, TinyPerson, DOTA, and AI-TOD, by about 3.0% AP.
Pages: 8984-8996
Number of pages: 13