mSODANet: A network for multi-scale object detection in aerial images using hierarchical dilated convolutions

Cited by: 74
Authors
Chalavadi, Vishnu [1]
Jeripothula, Prudviraj [1]
Datla, Rajeshreddy [1, 2]
Babu, Sobhan Ch [1]
Mohan, Krishna C. [1]
Affiliations
[1] Indian Inst Technol Hyderabad, Dept Comp Sci & Engn, Visual Learning & Intelligence Grp VIGIL, Kandi 502285, Sangareddy, India
[2] Adv Data Proc Res Inst ADRIN, Dept Space, Akbar Rd, Manovikas Nagar 500009, Secunderabad, India
Keywords
Multi-scale object detection; Contextual features; Dilated convolutions; Aerial images
DOI
10.1016/j.patcog.2022.108548
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Object detection in aerial images is one of the most common tasks across a wide range of computer vision applications. However, it is challenging for the following reasons: (a) pixel occupancy varies across objects of different scales, (b) the distribution of objects in aerial images is not uniform, (c) the appearance of an object varies with viewpoint and illumination conditions, and (d) the number of objects, even of the same type, varies across images. To address these issues, we propose a novel network for multi-scale object detection in aerial images using hierarchical dilated convolutions, called mSODANet. In particular, we explore a hierarchical dilated network that uses parallel dilated convolutions to learn contextual information for different types of objects at multiple scales and multiple fields of view. The hierarchical dilated network captures the visual information in aerial images more effectively and enhances the detection capability of the model. Extensive experiments on three challenging publicly available datasets, i.e., VisDrone2019, DOTA (OBB & HBB), and NWPU VHR-10, demonstrate the effectiveness of the proposed mSODANet, which achieves state-of-the-art performance on all three datasets. (c) 2022 Elsevier Ltd. All rights reserved.
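The idea of parallel dilated convolutions mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' architecture: the abstract does not give dilation rates, kernel sizes, or how branches are fused, so the function names, the rates `(1, 2, 3)`, and the 3x3 kernel of ones below are all hypothetical choices made for illustration. The sketch only shows how the same kernel, applied at several dilation rates, covers progressively wider fields of view without adding weights.

```python
def effective_kernel_size(kernel_size, dilation):
    """Span covered by a dilated kernel along one axis: k + (k - 1) * (d - 1)."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def dilated_conv2d(image, kernel, dilation):
    """Same-size 2D dilated cross-correlation with zero padding (square, odd kernel)."""
    height, width = len(image), len(image[0])
    k = len(kernel)
    half = k // 2
    out = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            total = 0.0
            for i in range(k):
                for j in range(k):
                    # Kernel taps are spaced `dilation` pixels apart.
                    yy = y + (i - half) * dilation
                    xx = x + (j - half) * dilation
                    if 0 <= yy < height and 0 <= xx < width:
                        total += kernel[i][j] * image[yy][xx]
            out[y][x] = total
    return out


def parallel_dilated_block(image, kernel, dilations=(1, 2, 3)):
    """Apply one kernel at several (assumed) dilation rates; return one map per branch."""
    return [dilated_conv2d(image, kernel, d) for d in dilations]


# Toy 7x7 input with a single bright pixel in the centre.
image = [[0.0] * 7 for _ in range(7)]
image[3][3] = 1.0
box = [[1.0] * 3 for _ in range(3)]  # hypothetical 3x3 kernel of ones

branches = parallel_dilated_block(image, box)
for d in (1, 2, 3):
    print(d, effective_kernel_size(3, d))
```

With a 3x3 kernel, the three branches span 3, 5, and 7 pixels per axis, so the corner output of the rate-3 branch already responds to the centre pixel while the rate-1 branch does not. In a real detector the branch outputs would typically be concatenated or summed and passed through learned layers; that fusion step is left out here because the abstract does not describe it.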
Pages: 10
Related papers
43 in total
  • [1] [Anonymous], 2018, ADV NEUR IN
  • [2] STDnet-ST: Spatio-temporal ConvNet for small object detection
    Bosquet, Brais
    Mucientes, Manuel
    Brea, Victor M.
    [J]. PATTERN RECOGNITION, 2021, 116
  • [3] RRNet: A Hybrid Detector for Object Detection in Drone-captured Images
    Chen, Changrui
    Zhang, Yu
    Lv, Qingxuan
    Wei, Shuo
    Wang, Xiaorui
    Sun, Xin
    Dong, Junyu
    [J]. 2019 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION WORKSHOPS (ICCVW), 2019: 100-108
  • [4] Encoder-Decoder with Atrous Separable Convolution for Semantic Image Segmentation
    Chen, Liang-Chieh
    Zhu, Yukun
    Papandreou, George
    Schroff, Florian
    Adam, Hartwig
    [J]. COMPUTER VISION - ECCV 2018, PT VII, 2018, 11211: 833-851
  • [5] RIFD-CNN: Rotation-Invariant and Fisher Discriminative Convolutional Neural Networks for Object Detection
    Cheng, Gong
    Zhou, Peicheng
    Han, Junwei
    [J]. 2016 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2016: 2884-2893
  • [6] Learning Rotation-Invariant Convolutional Neural Networks for Object Detection in VHR Optical Remote Sensing Images
    Cheng, Gong
    Zhou, Peicheng
    Han, Junwei
    [J]. IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2016, 54 (12): 7405-7415
  • [7] Multi-class geospatial object detection and geographic image classification based on collection of part detectors
    Cheng, Gong
    Han, Junwei
    Zhou, Peicheng
    Guo, Lei
    [J]. ISPRS JOURNAL OF PHOTOGRAMMETRY AND REMOTE SENSING, 2014, 98: 119-132
  • [8] Learning RoI Transformer for Oriented Object Detection in Aerial Images
    Ding, Jian
    Xue, Nan
    Long, Yang
    Xia, Gui-Song
    Lu, Qikai
    [J]. 2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2019), 2019: 2844-2853
  • [9] Everingham M., 2010, INT J COMPUT VISION, V88, P303, DOI 10.1007/s11263-009-0275-4
  • [10] NAS-FPN: Learning Scalable Feature Pyramid Architecture for Object Detection
    Ghiasi, Golnaz
    Lin, Tsung-Yi
    Le, Quoc V.
    [J]. 2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2019), 2019: 7029-7038