A Joint Convolutional Neural Networks and Context Transfer for Street Scenes Labeling

Cited by: 121
Authors
Wang, Qi [1 ,2 ,3 ]
Gao, Junyu [4 ]
Yuan, Yuan [4 ]
Affiliations
[1] Northwestern Polytech Univ, Sch Comp Sci, Xian 710072, Shaanxi, Peoples R China
[2] Northwestern Polytech Univ, Unmanned Syst Res Inst, Xian 710072, Shaanxi, Peoples R China
[3] Northwestern Polytech Univ, Ctr OPT IMagery Anal & Learning, Xian 710072, Shaanxi, Peoples R China
[4] Northwestern Polytech Univ, Ctr OPT IMagery Anal & Learning, Sch Comp Sci, Xian 710072, Shaanxi, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Scene labeling; convolutional neural networks; deep learning; label transfer; street scenes; data augmentation; SEMANTIC SEGMENTATION; ENERGY MINIMIZATION; RECOGNITION;
DOI
10.1109/TITS.2017.2726546
Chinese Library Classification
TU [Building Science];
Discipline classification code
0813 ;
Abstract
Street scene understanding is an essential task for autonomous driving. One important step in this direction is scene labeling, which annotates each pixel in an image with the correct class label. Although many approaches have been developed, some weak points remain. First, many methods rely on hand-crafted features whose image representation ability is limited. Second, they cannot label foreground objects accurately because of data set bias. Third, in the refinement stage, traditional Markov random field (MRF) inference is prone to over-smoothing. To address these problems, this paper proposes a joint method of priori convolutional neural networks at the superpixel level (called "priori s-CNNs") and soft restricted context transfer. Our contributions are threefold: 1) a priori s-CNNs model that learns priori location information at the superpixel level is proposed to describe various objects discriminatively; 2) a hierarchical data augmentation method is presented to alleviate data set bias in the priori s-CNNs training stage, which significantly improves foreground object labeling; and 3) a soft restricted MRF energy function is defined to improve the priori s-CNNs model's labeling performance while reducing over-smoothing. The proposed approach is verified on the CamVid data set (11 classes) and the SIFT Flow Street data set (16 classes) and achieves competitive performance.
Pages: 1457-1470
Page count: 14