Poly Kernel Inception Network for Remote Sensing Detection

Times cited: 157
Authors
Cai, Xinhao [1]
Lai, Qiuxia [2]
Wang, Yuwei [1]
Wang, Wenguan [3]
Sun, Zeren [1]
Yao, Yazhou [1]
Affiliations
[1] Nanjing Univ Sci & Technol, Nanjing, Peoples R China
[2] Commun Univ China, Beijing, Peoples R China
[3] Zhejiang Univ, Hangzhou, Peoples R China
Source
2024 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR) | 2024
Funding
National Natural Science Foundation of China;
Keywords
OBJECT DETECTION; IMAGES; CNN;
DOI
10.1109/CVPR52733.2024.02617
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Object detection in remote sensing images (RSIs) often suffers from several increasing challenges, including the large variation in object scales and the diverse-ranging context. Prior methods tried to address these challenges by expanding the spatial receptive field of the backbone, either through large-kernel convolution or dilated convolution. However, the former typically introduces considerable background noise, while the latter risks generating overly sparse feature representations. In this paper, we introduce the Poly Kernel Inception Network (PKINet) to handle the above challenges. PKINet employs multi-scale convolution kernels without dilation to extract object features of varying scales and capture local context. In addition, a Context Anchor Attention (CAA) module is introduced in parallel to capture long-range contextual information. These two components work jointly to advance the performance of PKINet on four challenging remote sensing detection benchmarks, namely DOTA-v1.0, DOTA-v1.5, HRSC2016, and DIOR-R.
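The trade-off described in the abstract can be illustrated with a little arithmetic. The sketch below is not the authors' implementation: the helper names (footprint, conv1d_same, poly_kernel_1d), the box-filter weights, and the kernel sizes are all assumptions chosen for illustration. It shows that a dilated kernel spans a wide extent while sampling only a small fraction of it, whereas a dense kernel samples every position, and it sums several parallel undilated kernels of different sizes in one dimension.

```python
def footprint(kernel_size, dilation=1):
    """Return (extent, taps, density) for a square 2D convolution kernel.

    extent  : side length of the region the kernel spans
    taps    : number of positions actually sampled
    density : fraction of the spanned region that is sampled
    """
    extent = dilation * (kernel_size - 1) + 1
    taps = kernel_size ** 2
    density = taps / extent ** 2
    return extent, taps, density


def conv1d_same(signal, kernel):
    """Dense (undilated) 1D cross-correlation with zero padding.

    Assumes an odd kernel length so the output has the same length
    as the input.
    """
    half = len(kernel) // 2
    padded = [0.0] * half + list(signal) + [0.0] * half
    return [
        sum(w * padded[i + j] for j, w in enumerate(kernel))
        for i in range(len(signal))
    ]


def poly_kernel_1d(signal, kernel_sizes=(3, 5, 7)):
    """Sum parallel undilated branches, one per kernel size.

    Each branch uses a box filter as a stand-in for learned weights
    (an illustrative assumption, not the paper's design).
    """
    branches = [conv1d_same(signal, [1.0 / k] * k) for k in kernel_sizes]
    return [sum(values) for values in zip(*branches)]


if __name__ == "__main__":
    # A 3x3 kernel with dilation 4 spans 9x9 but samples only 9 of 81 cells.
    print(footprint(3, dilation=4))
    # A dense 9x9 kernel spans the same region and samples all of it.
    print(footprint(9))
    # Response of two parallel dense branches to a unit impulse.
    print(poly_kernel_1d([0, 0, 0, 1, 0, 0, 0], kernel_sizes=(3, 5)))
```

Both kernels in the first two calls span the same 9x9 region, but the dilated one samples one ninth of it, which is the sparsity the abstract warns about. The dense 9x9 kernel samples everything, including any background inside the window.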
Pages: 27706-27716
Page count: 11