A Refinement Method for Single-Stage Object Detection Based on Progressive Decoupled Task Alignment

Cited by: 3

Authors:

Tang, Xianlun [1]

Yang, Qiao [2]

Zhang, Xi [3]

Deng, Wuquan [4]

Wang, Huiming [1]

Gao, Xinbo [1]

Affiliations:

[1] Chongqing Univ Posts & Telecommun, Chongqing Key Lab Complex Syst & Bion Control, Chongqing 400065, Peoples R China

[2] China Elect Technol Grp Corp, Res Inst 10, Chengdu 610036, Peoples R China

[3] Chongqing Coll Mobile Commun, Chongqing Key Lab Publ Big Data Secur Technol, Chongqing 401520, Peoples R China

[4] Chongqing Univ, Chongqing Emergency Med Ctr, Dept Endocrinol & Metab, Cent Hosp, Chongqing 400014, Peoples R China

Source:

IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY | 2024 / Vol. 34 / No. 5

Keywords:

Single-stage object detection; task alignment; feature conflicts; probabilistic mapping method; information interaction;

DOI:

10.1109/TCSVT.2023.3323879

Chinese Library Classification (CLC) number:

TM [Electrical engineering]; TN [Electronic technology, communication technology];

Discipline classification code:

0808; 0809;

Abstract:

Parallel branches with independently optimized classification and localization capabilities are widely used in single-stage object detection. However, feature conflicts, limited information interaction, and empirical sample allocation schemes lead to weak spatial consistency between the outputs of the different branches. In this work, we propose Progressive Decoupled Task Alignment (PDTA), which enhances information interaction between tasks while reducing the degree of feature coupling, and adopts a strategy based on sample screening and learning to achieve task alignment. First, we design the Discrepant Feature Decoupling Module (DFDM), embedded with the novel Oriented Decoupling Convolution (ODC), to handle the coupled features of the shared input; the features extracted by ODC are disentangled through a differentiated feed-in scheme. Second, the Probabilistic Mapping Interaction Head (PMI-Head) uses a probabilistic mapping method to enhance task-specific semantics through information interaction. Finally, the network's joint attention to the content and position of the target is strengthened through the metric in the proposed Relevance-Guided Adaptive Task Alignment (RATA), in which an exponentially decaying scheme preserves the training samples that are more efficient for both tasks. During training, task-aligned learning is performed with the Relevance-Guided Loss. Experiments on the MS COCO and DIOR datasets demonstrate the effectiveness of our method: PDTA achieves better object detection performance.

Pages: 3383-3394

Page count: 12
