Learning Calibrated-Guidance for Object Detection in Aerial Images

Cited by: 28

Authors:

Wei, Zongqi [1]

Liang, Dong [1]

Zhang, Dong [2]

Zhang, Liyan [1]

Geng, Qixiang [1]

Wei, Mingqiang [1]

Zhou, Huiyu [3]

Affiliations:

[1] Nanjing Univ Aeronaut & Astronaut, Collaborat Innovat Ctr Novel Software Technol & I, MIIT Key Lab Pattern Anal & Machine Intelligence, Coll Comp Sci & Technol, Nanjing 211106, Peoples R China

[2] Nanjing Univ Sci & Technol, Sch Comp Sci & Engn, Nanjing 210094, Peoples R China

[3] Univ Leicester, Sch Informat, Leicester LE1 7RH, Leics, England

Source:

IEEE JOURNAL OF SELECTED TOPICS IN APPLIED EARTH OBSERVATIONS AND REMOTE SENSING | 2022 / Vol. 15

Keywords:

Object detection; Feature extraction; Task analysis; Calibration; Head; Magnetic heads; Detectors; Aerial image; attention learning; calibrated-guidance (CG); deep learning; object detection; ship detection

DOI:

10.1109/JSTARS.2022.3158903

Chinese Library Classification (CLC):

TM [Electrical engineering]; TN [Electronic technology, communication technology]

Discipline classification codes:

0808; 0809

Abstract:

Object detection is one of the most fundamental yet challenging research topics in computer vision. Recently, research on this topic in aerial images has made tremendous progress. However, complex backgrounds and poorer imaging quality remain obvious problems in aerial object detection. Most state-of-the-art approaches develop elaborate attention mechanisms for space-time feature calibration at a high computational cost, while surprisingly ignoring the importance of channel-wise feature calibration. In this work, we propose a simple yet effective calibrated-guidance (CG) scheme to enhance channel communications in a feature transformer fashion, which adaptively determines the calibration weights for each channel based on the global feature affinity correlations. Specifically, for a given set of feature maps, CG first computes the feature similarity between each channel and the remaining channels as the intermediary calibration guidance. It then re-represents each channel by aggregating all the channels, weighted by this guidance. CG is a general module that can be plugged into any deep neural network; the resulting network is named CG-Net. To demonstrate its effectiveness and efficiency, extensive experiments are carried out on both oriented and horizontal object detection tasks in aerial images. Experimental results on two challenging benchmarks (i.e., DOTA and HRSC2016) demonstrate that CG-Net achieves new state-of-the-art accuracy with a fair computational overhead. The source code is available at https://github.com/WeiZongqi/CG-Net.
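The two steps named in the abstract (channel-to-channel similarity as guidance, then a guidance-weighted aggregation of channels) can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: the function name `channel_calibrate`, the dot-product similarity, the softmax normalization, and the inclusion of each channel's similarity with itself are all assumptions made for illustration only.

```python
import math

def channel_calibrate(channels):
    """Hypothetical sketch of channel-wise calibration.

    channels: list of C flattened feature maps, each a list of N floats.
    Returns (weights, calibrated), where weights is a C x C guidance matrix
    and calibrated holds the C re-represented channels.
    """
    c = len(channels)
    n = len(channels[0])
    # Step 1 (assumed form): dot-product similarity between every pair of channels.
    affinity = [[sum(a * b for a, b in zip(channels[i], channels[j]))
                 for j in range(c)] for i in range(c)]
    # Assumed normalization: a row-wise softmax turns similarities into weights.
    weights = []
    for row in affinity:
        peak = max(row)  # subtract the row maximum for numerical stability
        exps = [math.exp(v - peak) for v in row]
        total = sum(exps)
        weights.append([e / total for e in exps])
    # Step 2: re-represent each channel as a weighted sum of all channels.
    calibrated = [[sum(weights[i][j] * channels[j][k] for j in range(c))
                   for k in range(n)] for i in range(c)]
    return weights, calibrated

# Toy input: three channels over four spatial positions; the first two are identical.
features = [[1.0, 0.0, 1.0, 0.0],
            [1.0, 0.0, 1.0, 0.0],
            [0.0, 1.0, 0.0, 1.0]]
weights, calibrated = channel_calibrate(features)
```

In this toy input the first two channels are identical, so the first channel weights them equally and gives the dissimilar third channel a smaller weight. A real detector would apply such an operation to tensors inside the network; the repository linked in the abstract is the reference for how the authors actually do it.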


Pages: 2721-2733

Page count: 13

References: 69 in total
