Single-stage Instance Segmentation

Cited by: 3
Authors
Lin, Feng [1]
Li, Bin [2]
Zhou, Wengang [1]
Li, Houqiang [1]
Lu, Yan [2]
Affiliations
[1] Univ Sci & Technol China, 96 JinZhai Rd, Hefei, Peoples R China
[2] Microsoft Res Asia, 5 Dan Ling St, Beijing, Peoples R China
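Funding
National Natural Science Foundation of China
Keywords
Instance segmentation; neural networks; single stage; graph merge
DOI
10.1145/3387926
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract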
Although multi-stage detectors such as R-CNN and its extensions generally achieve the highest object detection accuracy, single-stage detectors also achieve remarkable performance with faster execution and higher scalability. Inspired by this, we propose a single-stage framework for the instance segmentation task. Building on an existing single-stage object detection network, our model simultaneously outputs the detected bounding box of each instance, the semantic segmentation result, and the pixel affinity. We then generate the final instance masks from these three outputs with a fast post-processing method. To the best of our knowledge, this is the first attempt to segment instances in a single-stage pipeline on challenging datasets. Extensive experiments demonstrate the efficiency of our post-processing method, and the proposed framework obtains competitive results as a single-stage instance segmentation method. We achieve 32.5 box AP and 26.0 mask AP on the COCO validation set at an input scale of 500 pixels, and 22.9 mask AP on the Cityscapes test set.
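As an illustration of how the three outputs named in the abstract could be combined, the Python sketch below merges neighbouring pixels with a union-find structure and then picks one merged region per detected box. It is not the authors' post-processing algorithm, which the abstract does not specify. The function name merge_instances, the right/down affinity layout, the 0.5 threshold, and the rule of keeping the largest region inside each box are all assumptions made for this example.

```python
import numpy as np


def find(parent, i):
    # Union-find lookup with path halving.
    while parent[i] != i:
        parent[i] = parent[parent[i]]
        i = parent[i]
    return i


def union(parent, a, b):
    ra, rb = find(parent, a), find(parent, b)
    if ra != rb:
        parent[rb] = ra


def merge_instances(boxes, box_classes, semantic, aff_right, aff_down, thr=0.5):
    """Hypothetical graph-merge post-processing (illustrative only).

    boxes       : list of (x0, y0, x1, y1), x1/y1 exclusive
    box_classes : class id predicted for each box
    semantic    : (H, W) int array of per-pixel class ids
    aff_right   : (H, W) affinity between pixel (y, x) and (y, x + 1)
    aff_down    : (H, W) affinity between pixel (y, x) and (y + 1, x)
    """
    h, w = semantic.shape
    parent = list(range(h * w))

    # Merge 4-connected neighbours that share a class and have high affinity.
    for y in range(h):
        for x in range(w):
            i = y * w + x
            if x + 1 < w and semantic[y, x] == semantic[y, x + 1] and aff_right[y, x] > thr:
                union(parent, i, i + 1)
            if y + 1 < h and semantic[y, x] == semantic[y + 1, x] and aff_down[y, x] > thr:
                union(parent, i, i + w)

    roots = np.array([find(parent, i) for i in range(h * w)]).reshape(h, w)

    masks = []
    for (x0, y0, x1, y1), cls in zip(boxes, box_classes):
        mask = np.zeros((h, w), dtype=bool)
        window = roots[y0:y1, x0:x1]
        # Regions of the box's class that fall inside the box.
        candidates = window[semantic[y0:y1, x0:x1] == cls]
        if candidates.size:
            ids, counts = np.unique(candidates, return_counts=True)
            best = ids[np.argmax(counts)]  # assumption: keep the largest region
            mask[y0:y1, x0:x1] = window == best
        masks.append(mask)
    return masks
```

In this sketch the semantic map decides which pixels may be merged, the affinity decides where adjacent same-class objects are cut apart, and the boxes decide which merged region is reported as each instance.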
Pages: 19