Multi-Level Refinement Feature Pyramid Network for Scale Imbalance Object Detection

Cited by: 3
Authors
Aziz, Lubna [1, 2]
Salam, Md Sah Bin Haji [1]
Sheikh, Usman Ullah [3]
Khan, Surat [4]
Ayub, Huma [5]
Ayub, Sara [4]
Affiliations
[1] Univ Teknol Malaysia UTM, Sch Comp, Div Art Intelligence, Fac Engn, Skudai 81310, Johor, Malaysia
[2] Balochistan Univ Informat Technol Engn & Manageme, Fac Informat & Commun Technol, Dept Comp Engn, Quetta 87300, Pakistan
[3] Univ Teknol Malaysia UTM, Sch Elect Engn, Fac Engn, Skudai 81310, Johor, Malaysia
[4] Balochistan Univ Informat Technol Engn & Manageme, Fac Informat & Commun Technol, Dept Elect Engn, Quetta 87300, Pakistan
[5] Sardar Bahadur Khan Woman Univ, Dept Chem & Technol, Quetta 86301, Pakistan
Keywords
Object detection; feature pyramid; convolutional neural network; computer vision
DOI
10.1109/ACCESS.2021.3130129
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Object detection is challenging because object scales vary widely. Modern object detectors generally use a feature pyramid to learn multi-scale representations. However, current feature pyramids handle scale imbalance poorly because they integrate semantic information across scales inefficiently. Here, we reformulate feature pyramid construction as a feature reconfiguration process. We propose a detection network, the Multi-level Refinement Feature Pyramid Network, that combines high-level features (i.e., semantic information), middle-level features, and low-level features (i.e., boundary information) in a highly nonlinear yet efficient manner. We also propose a novel contextual features module consisting of global attention and lightweight local reconfiguration, which efficiently gathers task-oriented contextual features across different scales and spatial locations. To evaluate the proposed model, we designed and trained an end-to-end single-stage detector, MRFDet, by integrating it into the Single Shot Detector (SSD); it achieved better detection performance than most recent single-stage object detectors. MRFDet achieves an AP of 45.2 on MS-COCO and a 4.5% improvement in mAP on VOC.
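The abstract does not give implementation details, so the following is only a minimal NumPy sketch of the general idea it describes: bringing low-, middle-, and high-level feature maps to a common resolution, concatenating them, and re-weighting channels with a global attention gate. It is not the authors' MRFDet module. The function names (`upsample_nearest`, `avg_pool`, `global_channel_attention`, `fuse_levels`), the weight-free sigmoid gate, the random 1x1 projection `proj`, and the tensor sizes are all assumptions chosen for illustration.

```python
import numpy as np

def upsample_nearest(x, factor):
    """Nearest-neighbour upsampling of a (C, H, W) map by an integer factor."""
    return x.repeat(factor, axis=1).repeat(factor, axis=2)

def avg_pool(x, factor):
    """Average pooling of a (C, H, W) map with a factor-by-factor window."""
    c, h, w = x.shape
    return x.reshape(c, h // factor, factor, w // factor, factor).mean(axis=(2, 4))

def global_channel_attention(x):
    """Hypothetical weight-free gate: scale each channel by the sigmoid of
    its global spatial mean (a squeeze-and-excitation-style simplification)."""
    squeeze = x.mean(axis=(1, 2), keepdims=True)   # shape (C, 1, 1)
    gate = 1.0 / (1.0 + np.exp(-squeeze))
    return x * gate

def fuse_levels(low, mid, high, proj):
    """Resize three levels to the middle resolution, concatenate, gate, project."""
    low_r = avg_pool(low, low.shape[1] // mid.shape[1])
    high_r = upsample_nearest(high, mid.shape[1] // high.shape[1])
    stacked = np.concatenate([low_r, mid, high_r], axis=0)   # (3C, H, W)
    gated = global_channel_attention(stacked)
    # A 1x1 convolution written as a matrix product over the channel axis.
    return np.einsum("oc,chw->ohw", proj, gated)

# Assumed toy sizes: 4 channels per level, resolutions 32, 16, and 8.
rng = np.random.default_rng(0)
C = 4
low = rng.standard_normal((C, 32, 32))    # fine resolution, boundary detail
mid = rng.standard_normal((C, 16, 16))
high = rng.standard_normal((C, 8, 8))     # coarse resolution, semantic content
proj = rng.standard_normal((C, 3 * C))    # random stand-in for learned weights
fused = fuse_levels(low, mid, high, proj)
print(fused.shape)  # (4, 16, 16)
```

In a trained detector the gate and the projection would be learned, and a fused map of this kind would be produced for each pyramid level before the detection heads.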
Pages: 156492-156506
Number of pages: 15