IEMask R-CNN: Information-Enhanced Mask R-CNN

Cited by: 30
Authors
Bi, Xiuli [1]
Hu, Jinwu [1]
Xiao, Bin [1]
Li, Weisheng [1]
Gao, Xinbo [1]
Affiliations
[1] Chongqing Univ Posts & Telecommun, Dept Comp Sci & Technol, Chongqing 400065, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Task analysis; Object segmentation; Semantics; Feature extraction; Image segmentation; Location awareness; Head; Instance segmentation; information-enhanced FPN; adaptive feature fusion; encoding-decoding mask head; INSTANCE SEGMENTATION;
DOI
10.1109/TBDATA.2022.3187413
Chinese Library Classification (CLC) Number
TP [Automation Technology, Computer Technology];
Discipline Code
0812;
Abstract
Instance segmentation is a relatively difficult task in computer vision, requiring not only high-quality masks but also highly accurate instance category classification. Mask R-CNN has proven to be a feasible method. However, because its Feature Pyramid Network (FPN) structure lacks useful channel information, global information, and low-level texture information, and because its mask branch cannot obtain useful local-global information, Mask R-CNN is prevented from producing high-quality masks and accurate instance category classification. We therefore propose Information-Enhanced Mask R-CNN, called IEMask R-CNN. In the FPN structure of IEMask R-CNN, the information-enhanced FPN enhances the useful channel information and the global information of the feature maps, addressing the loss of useful channel information in high-level feature maps and the resulting inaccurate instance category classification; meanwhile, the bottom-up path enhancement with adaptive feature fusion exploits the precise localization signal in the lower layers to enhance the feature pyramid. In the mask branch of IEMask R-CNN, an encoding-decoding mask head strengthens local-global information to obtain high-quality masks. Without bells and whistles, IEMask R-CNN achieves significant gains of about 2.60%, 4.00%, and 3.17% over Mask R-CNN on the MS COCO2017, Cityscapes, and LVIS1.0 benchmarks, respectively.
Pages: 688 - 700
Page count: 13
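
The bottom-up path enhancement with adaptive feature fusion described in the abstract can be illustrated with a minimal PyTorch sketch. The module below fuses two pyramid levels by predicting per-pixel fusion weights from the concatenated features; the class name AdaptiveFeatureFusion, the softmax weighting scheme, and the channel count are illustrative assumptions, not the authors' exact design.

```python
# Minimal sketch (assumed, not the paper's exact design) of adaptive feature
# fusion between two pyramid levels in a bottom-up enhancement path.
import torch
import torch.nn as nn
import torch.nn.functional as F


class AdaptiveFeatureFusion(nn.Module):
    """Fuse a finer (lower) and a coarser (higher) pyramid level with
    per-pixel weights predicted from the concatenated features."""

    def __init__(self, channels: int = 256):
        super().__init__()
        # One weight map per input branch; softmax makes them sum to 1 per pixel.
        self.weight_pred = nn.Conv2d(2 * channels, 2, kernel_size=1)

    def forward(self, low: torch.Tensor, high: torch.Tensor) -> torch.Tensor:
        # Downsample the finer, lower-level map to the coarser level's size so
        # its precise localization signal can enhance the higher pyramid level.
        low = F.interpolate(low, size=high.shape[-2:], mode="nearest")
        weights = torch.softmax(self.weight_pred(torch.cat([low, high], dim=1)), dim=1)
        return weights[:, 0:1] * low + weights[:, 1:2] * high


# Usage: fuse a stride-8 level into the stride-16 level above it.
fuse = AdaptiveFeatureFusion(channels=256)
p3 = torch.randn(1, 256, 100, 100)  # finer, lower pyramid level
p4 = torch.randn(1, 256, 50, 50)    # coarser, higher pyramid level
fused = fuse(p3, p4)                # shape: (1, 256, 50, 50)
```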