Single-stage Instance Segmentation

Cited by: 3

Authors:

Lin, Feng [1]

Li, Bin [2]

Zhou, Wengang [1]

Li, Houqiang [1]

Lu, Yan [2]

Affiliations:

[1] Univ Sci & Technol China, 96 JinZhai Rd, Hefei, Peoples R China

[2] Microsoft Res Asia, 5 Dan Ling St, Beijing, Peoples R China

Source:

ACM TRANSACTIONS ON MULTIMEDIA COMPUTING COMMUNICATIONS AND APPLICATIONS | 2020, Vol. 16, No. 3

Funding:

National Natural Science Foundation of China;

Keywords:

Instance segmentation; neural networks; single stage; graph merge;

DOI:

10.1145/3387926

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology];

Discipline classification code:

0812;

Abstract:

Although the highest object detection accuracy is generally achieved by multi-stage detectors such as R-CNN and its extensions, single-stage object detectors also achieve remarkable performance with faster execution and higher scalability. Inspired by this, we propose a single-stage framework for instance segmentation. Building on an existing single-stage object detection network, our model simultaneously outputs the detected bounding box of each instance, the semantic segmentation result, and the pixel affinity. We then generate the final instance masks from these three outputs with a fast post-processing method. To the best of our knowledge, this is the first attempt to segment instances in a single-stage pipeline on challenging datasets. Extensive experiments demonstrate the efficiency of our post-processing method, and the proposed framework obtains competitive results as a single-stage instance segmentation method. We achieve 32.5 box AP and 26.0 mask AP on the COCO validation set at an input scale of 500 pixels, and 22.9 mask AP on the Cityscapes test set.
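
The abstract does not specify how the three outputs are combined, so the following is only an illustrative sketch of one plausible way a box, a semantic map, and pixel affinities could be merged into an instance mask. It is not the authors' post-processing method. The function name `instance_mask`, the 4-neighbour affinity layout (`aff_right`, `aff_down`), the threshold `thr`, and the "keep the largest connected group" rule are all assumptions made for illustration.

```python
from collections import Counter


def find(parent, i):
    # Union-find lookup with path halving.
    while parent[i] != i:
        parent[i] = parent[parent[i]]
        i = parent[i]
    return i


def instance_mask(box, cls, semantic, aff_right, aff_down, thr=0.5):
    """Return the set of (row, col) pixels for one detected instance.

    Hypothetical inputs (assumptions, not taken from the paper):
      box        -- (r0, c0, r1, c1), rows r0..r1-1 and columns c0..c1-1
      cls        -- class id predicted for the box
      semantic   -- 2D list of per-pixel class ids
      aff_right  -- aff_right[r][c] is the affinity between (r, c) and (r, c+1)
      aff_down   -- aff_down[r][c] is the affinity between (r, c) and (r+1, c)
    """
    r0, c0, r1, c1 = box
    # Candidate pixels: inside the box and labelled with the box's class.
    pixels = [(r, c) for r in range(r0, r1) for c in range(c0, c1)
              if semantic[r][c] == cls]
    if not pixels:
        return set()
    index = {p: i for i, p in enumerate(pixels)}
    parent = list(range(len(pixels)))

    # Merge neighbouring candidates whose affinity reaches the threshold.
    for (r, c), i in index.items():
        neighbours = (((r, c + 1), aff_right[r][c]),
                      ((r + 1, c), aff_down[r][c]))
        for neighbour, affinity in neighbours:
            j = index.get(neighbour)
            if j is not None and affinity >= thr:
                root_i, root_j = find(parent, i), find(parent, j)
                if root_i != root_j:
                    parent[root_j] = root_i

    # Assumed rule: the largest merged group is taken as the instance.
    roots = [find(parent, i) for i in range(len(pixels))]
    best_root, _ = Counter(roots).most_common(1)[0]
    return {p for p, root in zip(pixels, roots) if root == best_root}
```

In this sketch, a box that covers two disconnected regions of the same class yields only the larger region, which is one simple way affinities could separate adjacent objects that share a semantic label.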

Pages: 19
