Real-Time Panoptic Segmentation from Dense Detections

被引:49
作者
Hou, Rui [1 ,2 ]
Li, Jie [1 ]
Bhargava, Arjun [1 ]
Raventos, Allan [1 ]
Guizilini, Vitor [1 ]
Fang, Chao [1 ]
Lynch, Jerome [2 ]
Gaidon, Adrien [1 ]
机构
[1] Toyota Res Inst, Los Altos, CA 94022 USA
[2] Univ Michigan, Ann Arbor, MI 48109 USA
来源
2020 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2020) | 2020年
关键词
D O I
10.1109/CVPR42600.2020.00855
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
Panoptic segmentation is a complex full scene parsing task requiring simultaneous instance and semantic segmentation at high resolution. Current state-of-the-art approaches cannot run in real-time, and simplifying these architectures to improve efficiency severely degrades their accuracy. In this paper, we propose a new single-shot panoptic segmentation network that leverages dense detections and a global self-attention mechanism to operate in real-time with performance approaching the state of the art. We introduce a novel parameter-free mask construction method that substantially reduces computational complexity by efficiently reusing information from the object detection and semantic segmentation sub-tasks. The resulting network has a simple data flow that requires no feature map re-sampling, enabling significant hardware acceleration. Our experiments on the Cityscapes and COCO benchmarks show that our network works at 30 FPS on 1024 x 2048 resolution, trading a 3% relative performance degradation from the current state of the art for up to 440% faster inference.
引用
收藏
页码:8520 / 8529
页数:10
相关论文
共 37 条
[1]  
Bolya D., 2019, YOLACT: real-time instance segmentation
[2]   Hybrid Task Cascade for Instance Segmentation [J].
Chen, Kai ;
Pang, Jiangmiao ;
Wang, Jiaqi ;
Xiong, Yu ;
Li, Xiaoxiao ;
Sun, Shuyang ;
Feng, Wansen ;
Liu, Ziwei ;
Shi, Jianping ;
Ouyang, Wanli ;
Loy, Chen Change ;
Lin, Dahua .
2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2019), 2019, :4969-4978
[3]   The Cityscapes Dataset for Semantic Urban Scene Understanding [J].
Cordts, Marius ;
Omran, Mohamed ;
Ramos, Sebastian ;
Rehfeld, Timo ;
Enzweiler, Markus ;
Benenson, Rodrigo ;
Franke, Uwe ;
Roth, Stefan ;
Schiele, Bernt .
2016 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2016, :3213-3223
[4]  
De Brabandere B., 2017, arXiv
[5]  
de Geus Daan, 2019, ARXIV
[6]  
Gao Naiyu, 2019, ARXIV
[7]   Simultaneous Detection and Segmentation [J].
Hariharan, Bharath ;
Arbelaez, Pablo ;
Girshick, Ross ;
Malik, Jitendra .
COMPUTER VISION - ECCV 2014, PT VII, 2014, 8695 :297-312
[8]  
He KM, 2020, IEEE T PATTERN ANAL, V42, P386, DOI [10.1109/TPAMI.2018.2844175, 10.1109/ICCV.2017.322]
[9]   Deep Residual Learning for Image Recognition [J].
He, Kaiming ;
Zhang, Xiangyu ;
Ren, Shaoqing ;
Sun, Jian .
2016 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2016, :770-778
[10]   Mask Scoring R-CNN [J].
Huang, Zhaojin ;
Huang, Lichao ;
Gong, Yongchao ;
Huang, Chang ;
Wang, Xinggang .
2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2019), 2019, :6402-6411