AO2-DETR: Arbitrary-Oriented Object Detection Transformer

Cited by: 91
Authors
Dai, Linhui [1]
Liu, Hong [1]
Tang, Hao [2]
Wu, Zhiwei [3]
Song, Pinhao [1]
Affiliations
[1] Peking Univ, Shenzhen Grad Sch, Key Lab Machine Percept, Beijing 100871, Peoples R China
[2] Swiss Fed Inst Technol, Comp Vis Lab, CH-8800 Zurich, Switzerland
[3] South China Univ Technol, Sch Software Engn, Guangzhou 511400, Guangdong, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Proposals; Transformers; Feature extraction; Object detection; Detectors; Task analysis; Pipelines; Oriented object detection; detection transformer; oriented proposals; feature refinement;
DOI
10.1109/TCSVT.2022.3222906
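Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronics and communication technology];
Discipline classification code
0808 ; 0809 ;
Abstract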
Arbitrary-oriented object detection (AOOD) is the challenging task of detecting objects in the wild with arbitrary orientations and cluttered arrangements. Existing approaches are mainly based on anchor boxes or dense points, and rely on complicated hand-designed processing steps and inductive biases such as anchor generation, transformation, and non-maximum suppression. Recently, emerging transformer-based approaches have treated object detection as a direct set prediction problem, which effectively removes the need for hand-designed components and inductive biases. In this paper, we propose an Arbitrary-Oriented Object DEtection TRansformer framework, termed AO2-DETR, which comprises three dedicated components. First, an oriented proposal generation mechanism explicitly generates oriented proposals, which provide better positional priors for pooling features to modulate the cross-attention in the transformer decoder. Second, an adaptive oriented proposal refinement module extracts rotation-invariant region features and eliminates the misalignment between region features and objects. Third, a rotation-aware set matching loss ensures one-to-one matching for direct set prediction without duplicate predictions. Our method considerably simplifies the overall pipeline and presents a new AOOD paradigm. Comprehensive experiments on several challenging datasets show that our method achieves superior performance on the AOOD task.
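The abstract does not spell out the rotation-aware set matching loss, so the following is only a minimal sketch of the general idea of one-to-one set matching for oriented boxes, not the authors' formulation. It assumes boxes are given as (cx, cy, w, h, theta) with theta in radians, an angle period of pi, and a cost that sums a classification term, an L1 box term, and an angular term; the names `angle_diff`, `pair_cost`, `match`, and all weights are hypothetical. For brevity it searches all assignments by brute force, which is only practical for a handful of boxes.

```python
import math
from itertools import permutations


def angle_diff(a, b, period=math.pi):
    """Smallest absolute difference between two angles, modulo `period`.

    Assumption: a rectangle rotated by pi is the same rectangle.
    """
    d = abs(a - b) % period
    return min(d, period - d)


def pair_cost(pred, gt, w_cls=1.0, w_box=1.0, w_ang=1.0):
    """Illustrative matching cost between one prediction and one ground truth.

    pred: {"probs": [class probabilities], "box": (cx, cy, w, h, theta)}
    gt:   {"label": int, "box": (cx, cy, w, h, theta)}
    """
    cls_cost = -pred["probs"][gt["label"]]  # higher probability, lower cost
    box_cost = sum(abs(p - g) for p, g in zip(pred["box"][:4], gt["box"][:4]))
    ang_cost = angle_diff(pred["box"][4], gt["box"][4])
    return w_cls * cls_cost + w_box * box_cost + w_ang * ang_cost


def match(preds, gts):
    """Return the minimum-cost one-to-one assignment as (pred_idx, gt_idx) pairs.

    Requires len(preds) >= len(gts); unmatched predictions are left out.
    """
    best, best_cost = None, math.inf
    for perm in permutations(range(len(preds)), len(gts)):
        cost = sum(pair_cost(preds[p], gts[g]) for g, p in enumerate(perm))
        if cost < best_cost:
            best, best_cost = perm, cost
    return [(p, g) for g, p in enumerate(best)]


# Toy data: three predictions, two ground-truth boxes.
preds = [
    {"probs": [0.1, 0.9], "box": (0.5, 0.5, 0.2, 0.1, 3.0)},
    {"probs": [0.8, 0.2], "box": (0.2, 0.3, 0.1, 0.1, 0.1)},
    {"probs": [0.5, 0.5], "box": (0.9, 0.9, 0.1, 0.1, 0.0)},
]
gts = [
    {"label": 0, "box": (0.2, 0.3, 0.1, 0.1, 0.0)},
    {"label": 1, "box": (0.5, 0.5, 0.2, 0.1, 0.05)},
]

print(sorted(match(preds, gts)))
```

In this toy case prediction 0 has an angle of 3.0 rad while its ground truth has 0.05 rad; because the angular term wraps at pi, the two count as nearly aligned rather than far apart. Each ground truth receives exactly one prediction, which is what lets a set-prediction detector avoid duplicate outputs without non-maximum suppression. A practical implementation would typically replace the brute-force search with the Hungarian algorithm.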
Pages: 2342-2356
Page count: 15
Related papers
52 entries in total
[1]  
Carion Nicolas, 2020, Computer Vision - ECCV 2020. 16th European Conference. Proceedings. Lecture Notes in Computer Science (LNCS 12346), P213, DOI 10.1007/978-3-030-58452-8_13
[2]   Hybrid Task Cascade for Instance Segmentation [J].
Chen, Kai ;
Pang, Jiangmiao ;
Wang, Jiaqi ;
Xiong, Yu ;
Li, Xiaoxiao ;
Sun, Shuyang ;
Feng, Wansen ;
Liu, Ziwei ;
Shi, Jianping ;
Ouyang, Wanli ;
Loy, Chen Change ;
Lin, Dahua .
2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2019), 2019, :4969-4978
[3]   High-Quality R-CNN Object Detection Using Multi-Path Detection Calibration Network [J].
Chen, Xiaoyu ;
Li, Hongliang ;
Wu, Qingbo ;
Ngan, King Ngi ;
Xu, Linfeng .
IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2021, 31 (02) :715-727
[4]   Joint Anchor-Feature Refinement for Real-Time Accurate Object Detection in Images and Videos [J].
Chen, Xingyu ;
Yu, Junzhi ;
Kong, Shihan ;
Wu, Zhengxing ;
Wen, Li .
IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2021, 31 (02) :594-607
[5]  
Deng J, 2009, PROC CVPR IEEE, P248, DOI 10.1109/CVPRW.2009.5206848
[6]   Learning RoI Transformer for Oriented Object Detection in Aerial Images [J].
Ding, Jian ;
Xue, Nan ;
Long, Yang ;
Xia, Gui-Song ;
Lu, Qikai .
2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2019), 2019, :2844-2853
[7]   Precise Detection in Densely Packed Scenes [J].
Goldman, Eran ;
Herzig, Roei ;
Eisenschtat, Aviv ;
Goldberger, Jacob ;
Hassner, Tal .
2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2019), 2019, :5222-5231
[8]   Convex-Hull Feature Adaptation for Oriented and Densely Packed Object Detection [J].
Guo, Zonghao ;
Zhang, Xiaosong ;
Liu, Chang ;
Ji, Xiangyang ;
Jiao, Jianbin ;
Ye, Qixiang .
IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2022, 32 (08) :5252-5265
[9]   Align Deep Features for Oriented Object Detection [J].
Han, Jiaming ;
Ding, Jian ;
Li, Jie ;
Xia, Gui-Song .
IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2022, 60
[10]   ReDet: A Rotation-equivariant Detector for Aerial Object Detection [J].
Han, Jiaming ;
Ding, Jian ;
Xue, Nan ;
Xia, Gui-Song .
2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2021, 2021, :2785-2794