IEMask R-CNN: Information-Enhanced Mask R-CNN

Cited by: 41
Authors
Bi, Xiuli [1 ]
Hu, Jinwu [1 ]
Xiao, Bin [1 ]
Li, Weisheng [1 ]
Gao, Xinbo [1 ]
Affiliations
[1] Chongqing Univ Posts & Telecommun, Dept Comp Sci & Technol, Chongqing 400065, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Task analysis; Object segmentation; Semantics; Feature extraction; Image segmentation; Location awareness; Head; Instance segmentation; information-enhanced FPN; adaptive feature fusion; encoding-decoding mask head
DOI
10.1109/TBDATA.2022.3187413
Chinese Library Classification
TP [Automation Technology, Computer Technology];
Discipline Code
0812;
Abstract
Instance segmentation is a relatively difficult task in computer vision: it requires not only high-quality masks but also high-accuracy instance category classification. Mask R-CNN has proven to be a feasible method. However, its Feature Pyramid Network (FPN) structure lacks useful channel information, global information, and low-level texture information, and its mask branch cannot obtain useful local-global information; these shortcomings prevent Mask R-CNN from producing high-quality masks and highly accurate instance category classification. We therefore propose Information-Enhanced Mask R-CNN, called IEMask R-CNN. In the FPN structure of IEMask R-CNN, an information-enhanced FPN enhances the useful channel information and the global information of the feature maps, addressing the loss of useful channel information in high-level feature maps and the resulting inaccurate instance category classification; meanwhile, bottom-up path enhancement with adaptive feature fusion exploits the precise localization signals in the lower layers to strengthen the feature pyramid. In the mask branch of IEMask R-CNN, an encoding-decoding mask head strengthens local-global information to obtain high-quality masks. Without bells and whistles, IEMask R-CNN achieves significant gains of about 2.60%, 4.00%, and 3.17% over Mask R-CNN on the MS COCO2017, Cityscapes, and LVIS1.0 benchmarks, respectively.
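The abstract names three architectural ideas: channel/global information enhancement in the FPN, adaptive feature fusion along a bottom-up path, and an encoding-decoding mask head. The paper's exact designs are not reproduced here; the following is a minimal PyTorch sketch of plausible stand-ins, where every module name, layer choice, and hyperparameter (the SE-style reduction ratio, the nearest-neighbor resampling, the 80-class predictor) is an assumption for illustration, not the authors' implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ChannelEnhance(nn.Module):
    """SE-style channel reweighting: a stand-in for the paper's
    'useful channel information' enhancement (exact design assumed)."""
    def __init__(self, channels, reduction=16):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )

    def forward(self, x):
        # Global average pooling summarizes each channel; an MLP then
        # produces per-channel gates that rescale the feature map.
        w = self.fc(x.mean(dim=(2, 3)))
        return x * w[:, :, None, None]

class AdaptiveFusion(nn.Module):
    """Learned fusion for the bottom-up path: blends a finer pyramid
    level (precise localization) into the current level."""
    def __init__(self, channels):
        super().__init__()
        self.gate = nn.Conv2d(2 * channels, 2, kernel_size=1)

    def forward(self, cur, finer):
        # Resample the finer level to the current resolution, then
        # predict two spatial weight maps that sum to 1 per pixel.
        finer = F.interpolate(finer, size=cur.shape[-2:], mode="nearest")
        w = torch.softmax(self.gate(torch.cat([cur, finer], dim=1)), dim=1)
        return w[:, 0:1] * cur + w[:, 1:2] * finer

class EncDecMaskHead(nn.Module):
    """Tiny encoder-decoder over 14x14 RoI features: downsampling widens
    the receptive field (global context), upsampling restores detail,
    and a skip connection fuses local and global information."""
    def __init__(self, channels=256, num_classes=80):  # 80 = COCO classes (assumption)
        super().__init__()
        self.enc = nn.Sequential(
            nn.Conv2d(channels, channels, 3, stride=2, padding=1),
            nn.ReLU(inplace=True),
        )
        self.dec = nn.Sequential(
            nn.ConvTranspose2d(channels, channels, 2, stride=2),
            nn.ReLU(inplace=True),
        )
        self.predict = nn.Conv2d(channels, num_classes, kernel_size=1)

    def forward(self, roi):
        # 14x14 -> 7x7 -> 14x14; residual add keeps local texture.
        return self.predict(self.dec(self.enc(roi)) + roi)

# Usage sketch with dummy pyramid levels and RoI features.
p3 = torch.randn(2, 256, 32, 32)            # current pyramid level
p2 = torch.randn(2, 256, 64, 64)            # finer level with precise localization
fused = AdaptiveFusion(256)(ChannelEnhance(256)(p3), p2)
mask_logits = EncDecMaskHead(256)(torch.randn(2, 256, 14, 14))
```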
Pages: 688-700
Page count: 13