Tensor Low-Rank Reconstruction for Semantic Segmentation

Cited by: 53
Authors
Chen, Wanli [1 ]
Zhu, Xinge [1 ]
Sun, Ruoqi [2 ]
He, Junjun [2 ,3 ]
Li, Ruiyu [4 ]
Shen, Xiaoyong [4 ]
Yu, Bei [1 ]
Affiliations
[1] Chinese Univ Hong Kong, Hong Kong, Peoples R China
[2] Shanghai Jiao Tong Univ, Shanghai, Peoples R China
[3] Chinese Acad Sci, Shenzhen Inst Adv Technol, ShenZhen Key Lab Comp Vis & Pattern Recognit, SIAT SenseTime Joint Lab, Beijing, Peoples R China
[4] SmartMore, Shenzhen, Peoples R China
Source
COMPUTER VISION - ECCV 2020, PT XVII | 2020 / Vol. 12362
Keywords
Semantic segmentation; Low-rank reconstruction; Tensor decomposition;
DOI
10.1007/978-3-030-58520-4_4
CLC Classification
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
Context information plays an indispensable role in the success of semantic segmentation. Recently, non-local self-attention based methods have proved effective for collecting context information. Since the desired context consists of both spatial-wise and channel-wise attention, a 3D representation is the appropriate formulation. However, these non-local methods describe 3D context information with a 2D similarity matrix, and this spatial compression may cause channel-wise attention to be lost. An alternative is to model the contextual information directly, without compression; this effort, however, confronts a fundamental difficulty, namely the high-rank property of context information. In this paper, we propose a new approach to modeling 3D context representations that both avoids the spatial compression and tackles the high-rank difficulty. Inspired by tensor canonical-polyadic (CP) decomposition theory (i.e., a high-rank tensor can be expressed as a combination of rank-1 tensors), we design a low-rank-to-high-rank context reconstruction framework (RecoNet). Specifically, we first introduce the tensor generation module (TGM), which generates a set of rank-1 tensors that each capture a fragment of the context features. We then use these rank-1 tensors to recover the high-rank context features through the proposed tensor reconstruction module (TRM). Extensive experiments show that our method achieves state-of-the-art performance on various public datasets. In addition, the proposed method incurs more than 100 times less computational cost than conventional non-local-based methods.
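The abstract's core idea, generating rank-1 context fragments and summing them into a higher-rank 3D context tensor that reweights the feature map, can be sketched as follows. This is a minimal, hypothetical PyTorch sketch of the low-rank-to-high-rank reconstruction principle only: the module name LowRankContextSketch, the 1x1-convolution-plus-pooling factor extraction, the sigmoid gating, and the learnable per-branch scales are assumptions made for illustration, not the authors' TGM/TRM implementation.

    import torch
    import torch.nn as nn

    class LowRankContextSketch(nn.Module):
        """Sketch of low-rank-to-high-rank context reconstruction (assumed design).

        Each of the `rank` branches produces three attention vectors along C, H
        and W; their outer product is one rank-1 context tensor, and the weighted
        sum of all branches approximates a higher-rank 3D context tensor.
        """

        def __init__(self, channels: int, rank: int = 64):
            super().__init__()
            self.rank = rank
            # Per-branch 1x1 convolutions squeeze the feature into C-, H- and W-wise factors.
            self.to_c = nn.ModuleList([nn.Conv2d(channels, channels, 1) for _ in range(rank)])
            self.to_h = nn.ModuleList([nn.Conv2d(channels, 1, 1) for _ in range(rank)])
            self.to_w = nn.ModuleList([nn.Conv2d(channels, 1, 1) for _ in range(rank)])
            self.scale = nn.Parameter(torch.ones(rank))  # learnable per-branch weight

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            context = torch.zeros_like(x)
            for i in range(self.rank):
                # Rank-1 factors: pooled channel vector, row (H) vector, column (W) vector.
                vc = torch.sigmoid(self.to_c[i](x).mean(dim=(2, 3)))       # (N, C)
                vh = torch.sigmoid(self.to_h[i](x).mean(dim=3)).squeeze(1)  # (N, H)
                vw = torch.sigmoid(self.to_w[i](x).mean(dim=2)).squeeze(1)  # (N, W)
                # Outer product vc x vh x vw gives one rank-1 context tensor.
                rank1 = torch.einsum('nc,nh,nw->nchw', vc, vh, vw)
                context = context + self.scale[i] * rank1
            return x * context  # reweight the input feature with the reconstructed context

Under these assumptions, calling LowRankContextSketch(512, rank=64) on a (2, 512, 65, 65) feature map returns a context-reweighted feature of the same shape; each branch costs only three vector factors rather than an H*W by H*W similarity matrix, which is the source of the claimed computational saving.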
Pages: 52-69
Page count: 18