TGADHead: An efficient and accurate task-guided attention-decoupled head for single-stage object detection

Cited by: 1
Authors
Zuo, Fengyuan [1]
Liu, Jinhai [1]
Chen, Zhaolin [2]
Fu, Mingrui [3]
Wang, Lei [1]
Affiliations
[1] Northeastern Univ, Coll Informat Sci & Engn, Shenyang 110819, Peoples R China
[2] Monash Univ, Monash Biomed Imaging, Clayton, VIC, Australia
[3] Shenyang Paidelin Technol Co Ltd, Shenyang 110081, Peoples R China
Keywords
Object detection; Knowledge-based intelligent models; Task decoupled attention distributor; Task correlation network; Location and classification; FEATURE ALIGNMENT; REPRESENTATION; NETWORKS
DOI
10.1016/j.knosys.2024.112349
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In object detection, localization and classification of targets are two fundamental subtasks that underpin the application of many knowledge-based intelligent models in various visual fields. However, current methods show inconsistency between the two subtasks: owing to inadequate detector design, an accurately localized prediction may receive a poor classification score, or vice versa. This inconsistency significantly degrades overall detection accuracy. To address this issue, we propose a novel Task-Guided Attention-Decoupled Head (TGADHead) that improves detection accuracy within an efficient single-stage approach. The proposed framework consists of two interconnected components: a Task Decoupled Attention Distributor (TDAD) and a Task Correlation Network (TCN). First, TDAD uses two task-specific attention perceptrons to enhance the spatial information required for localization and the semantic information required for classification, respectively. This task-specific prediction mechanism improves both classification and localization performance. Second, we construct a TCN for each decoupled branch to transfer the correlation between classification and localization features while maintaining the consistency of the two subtasks, i.e., an accurate object prediction should simultaneously have a high-quality bounding box and a high classification score. We achieve +1.9 Average Precision (AP) on MS-COCO compared to the state-of-the-art detectors.
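The inconsistency the abstract describes can be illustrated with a small, self-contained sketch. This is not the TGADHead method itself. It only shows, on made-up numbers, how the prediction with the highest classification score can differ from the one with the best localization. The boxes, scores, and the helper name `iou` are hypothetical and chosen purely for illustration.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# Hypothetical ground-truth box and two hypothetical predictions for it.
ground_truth = (10.0, 10.0, 50.0, 50.0)
predictions = [
    # Well localized, but the classifier is not confident.
    {"box": (12.0, 12.0, 50.0, 50.0), "score": 0.55},
    # Poorly localized, but the classifier is very confident.
    {"box": (20.0, 20.0, 70.0, 70.0), "score": 0.90},
]

# Score-based ranking (as used by score-sorted NMS) versus IoU-based ranking.
best_by_score = max(predictions, key=lambda p: p["score"])
best_by_iou = max(predictions, key=lambda p: iou(p["box"], ground_truth))

# The two subtasks disagree when the rankings pick different predictions.
inconsistent = best_by_score is not best_by_iou
print("inconsistent:", inconsistent)
```

In this toy case a score-ranked post-processing step would keep the poorly localized box, which is the kind of mismatch a task-consistent head aims to reduce.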
Pages: 12