A Joint Convolutional Neural Networks and Context Transfer for Street Scenes Labeling

Cited by: 121
Authors
Wang, Qi [1 ,2 ,3 ]
Gao, Junyu [4 ]
Yuan, Yuan [4 ]
Affiliations
[1] Northwestern Polytech Univ, Sch Comp Sci, Xian 710072, Shaanxi, Peoples R China
[2] Northwestern Polytech Univ, Unmanned Syst Res Inst, Xian 710072, Shaanxi, Peoples R China
[3] Northwestern Polytech Univ, Ctr OPT IMagery Anal & Learning, Xian 710072, Shaanxi, Peoples R China
[4] Northwestern Polytech Univ, Ctr OPT IMagery Anal & Learning, Sch Comp Sci, Xian 710072, Shaanxi, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Scene labeling; convolutional neural networks; deep learning; label transfer; street scenes; data augmentation; SEMANTIC SEGMENTATION; ENERGY MINIMIZATION; RECOGNITION;
DOI
10.1109/TITS.2017.2726546
Chinese Library Classification number
TU [Building Science];
Discipline classification code
0813 ;
Abstract
Street scene understanding is an essential task for autonomous driving. One important step in this direction is scene labeling, which annotates each pixel in an image with the correct class label. Although many approaches have been developed, some weak points remain. First, many methods rely on hand-crafted features whose image representation ability is limited. Second, they cannot label foreground objects accurately because of data set bias. Third, in the refinement stage, traditional Markov random field (MRF) inference is prone to over-smoothing. To address these problems, this paper proposes a joint method of priori convolutional neural networks at the superpixel level (called "priori s-CNNs") and soft restricted context transfer. Our contributions are threefold: 1) a priori s-CNNs model that learns priori location information at the superpixel level is proposed to describe various objects discriminatively; 2) a hierarchical data augmentation method is presented to alleviate data set bias in the priori s-CNNs training stage, which significantly improves the labeling of foreground objects; and 3) a soft restricted MRF energy function is defined to improve the labeling performance of the priori s-CNNs model while reducing over-smoothing. The proposed approach is verified on the CamVid data set (11 classes) and the SIFT Flow Street data set (16 classes) and achieves competitive performance.
Pages: 1457-1470
Number of pages: 14