RANUS: RGB and NIR Urban Scene Dataset for Deep Scene Parsing

Cited by: 32
Authors
Choe, Gyeongmin [1]
Kim, Seong-Heum [1]
Im, Sunghoon [1]
Lee, Joon-Young [2]
Narasimhan, Srinivasa G. [3]
Kweon, In So [1]
Affiliations
[1] Korea Adv Inst Sci & Technol, Sch Elect Engn, Daejeon 34141, South Korea
[2] Adobe Res, San Jose, CA 95110 USA
[3] Carnegie Mellon Univ, Pittsburgh, PA 15213 USA
Keywords
Deep learning in robotics and automation; semantic scene understanding
DOI
10.1109/LRA.2018.2801390
Chinese Library Classification (CLC) number
TP24 [Robotics]
Discipline classification codes
080202; 1405
Abstract
In this letter, we present a data-driven method for scene parsing of road scenes to utilize single-channel near-infrared (NIR) images. To overcome the lack of data problem in non-RGB spectrum, we define a new color space and decompose the task of deep scene parsing into two subtasks with two separate CNN architectures for chromaticity channels and semantic masks. For chromaticity estimation, we build a spatially-aligned-RGB-NIR image database (40k urban scenes) to infer color information from RGB-NIR spectrum learning process and leverage existing scene parsing networks trained over already available RGB masks. From our database, we sample key frames and manually annotate them (4k ground truth masks) to finetune the network into the proposed color space. Hence, the key contribution of this work is to replace multispectral scene parsing methods with a simple yet effective approach using single NIR images. The benefits of using our algorithm and dataset are confirmed in the qualitative and quantitative experiments.
Pages: 1808-1815
Page count: 8