Real-Time Panoptic Segmentation from Dense Detections

Cited by: 49

Authors:

Hou, Rui [1,2]

Li, Jie [1]

Bhargava, Arjun [1]

Raventos, Allan [1]

Guizilini, Vitor [1]

Fang, Chao [1]

Lynch, Jerome [2]

Gaidon, Adrien [1]

Affiliations:

[1] Toyota Res Inst, Los Altos, CA 94022 USA

[2] Univ Michigan, Ann Arbor, MI 48109 USA

Source:

2020 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2020) | 2020

Keywords:

DOI:

10.1109/CVPR42600.2020.00855

Chinese Library Classification number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

Panoptic segmentation is a complex full scene parsing task requiring simultaneous instance and semantic segmentation at high resolution. Current state-of-the-art approaches cannot run in real-time, and simplifying these architectures to improve efficiency severely degrades their accuracy. In this paper, we propose a new single-shot panoptic segmentation network that leverages dense detections and a global self-attention mechanism to operate in real-time with performance approaching the state of the art. We introduce a novel parameter-free mask construction method that substantially reduces computational complexity by efficiently reusing information from the object detection and semantic segmentation sub-tasks. The resulting network has a simple data flow that requires no feature map re-sampling, enabling significant hardware acceleration. Our experiments on the Cityscapes and COCO benchmarks show that our network works at 30 FPS on 1024 x 2048 resolution, trading a 3% relative performance degradation from the current state of the art for up to 440% faster inference.
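
To put the abstract's throughput figures in concrete terms, the following minimal sketch converts them into a per-frame time budget and a pixel throughput. The function names are hypothetical and are not taken from the paper. The reading of "440% faster" as a 5.4x speed ratio is one possible interpretation, assumed here for illustration only.

```python
def frame_budget_ms(fps: float) -> float:
    """Time available per frame, in milliseconds, at a given frame rate."""
    return 1000.0 / fps


def pixel_throughput(height: int, width: int, fps: float) -> float:
    """Pixels processed per second at a given resolution and frame rate."""
    return height * width * fps


def speed_ratio(percent_faster: float) -> float:
    """Speed ratio implied by 'N% faster'.

    Assumption: 'N% faster' means new_speed = old_speed * (1 + N/100).
    """
    return 1.0 + percent_faster / 100.0


# Figures quoted in the abstract: 30 FPS at 1024 x 2048, up to 440% faster.
budget = frame_budget_ms(30)
pixels_per_second = pixel_throughput(1024, 2048, 30)
ratio = speed_ratio(440)

print(f"Per-frame budget: {budget:.1f} ms")
print(f"Pixel throughput: {pixels_per_second:,.0f} pixels/s")
print(f"Assumed speed ratio: {ratio:.1f}x")
```

At 30 FPS the whole network, including mask construction, has roughly 33 ms per 1024 x 2048 frame, which is why the abstract stresses avoiding feature map re-sampling.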

References

Pages: 8520-8529

Page count: 10

37 entries in total

[1] Bolya D., 2019, YOLACT: real-time instance segmentation

[2] Chen, Kai; Pang, Jiangmiao; Wang, Jiaqi; Xiong, Yu; Li, Xiaoxiao; Sun, Shuyang; Feng, Wansen; Liu, Ziwei; Shi, Jianping; Ouyang, Wanli; Loy, Chen Change; Lin, Dahua. Hybrid Task Cascade for Instance Segmentation. 2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2019), 2019: 4969-4978

[3] Cordts, Marius; Omran, Mohamed; Ramos, Sebastian; Rehfeld, Timo; Enzweiler, Markus; Benenson, Rodrigo; Franke, Uwe; Roth, Stefan; Schiele, Bernt. The Cityscapes Dataset for Semantic Urban Scene Understanding. 2016 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2016: 3213-3223

[4] De Brabandere B., 2017, arXiv

[5] de Geus Daan, 2019, arXiv

[6] Gao Naiyu, 2019, arXiv

[7] Hariharan, Bharath; Arbelaez, Pablo; Girshick, Ross; Malik, Jitendra. Simultaneous Detection and Segmentation. COMPUTER VISION - ECCV 2014, PT VII, 2014, 8695: 297-312

[8] He KM, 2020, IEEE T PATTERN ANAL, V42, P386, DOI [10.1109/TPAMI.2018.2844175, 10.1109/ICCV.2017.322]

[9] He, Kaiming; Zhang, Xiangyu; Ren, Shaoqing; Sun, Jian. Deep Residual Learning for Image Recognition. 2016 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2016: 770-778

[10] Huang, Zhaojin; Huang, Lichao; Gong, Yongchao; Huang, Chang; Wang, Xinggang. Mask Scoring R-CNN. 2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2019), 2019: 6402-6411
