Adaptive Context Network for Scene Parsing

Cited by: 111
Authors
Fu, Jun [1, 4]
Liu, Jing [1]
Wang, Yuhang [1]
Li, Yong [2]
Bao, Yongjun [2]
Tang, Jinhui [3]
Lu, Hanqing [1]
Affiliations
[1] Chinese Academy of Sciences, Institute of Automation, National Laboratory of Pattern Recognition, Beijing, People's Republic of China
[2] JD.com, Business Growth BU, Beijing, People's Republic of China
[3] Nanjing University of Science and Technology, Nanjing, People's Republic of China
[4] University of Chinese Academy of Sciences, Beijing, People's Republic of China
Source
2019 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2019) | 2019
Funding
National Natural Science Foundation of China; Beijing Natural Science Foundation
DOI
10.1109/ICCV.2019.00685
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Recent works attempt to improve scene parsing performance by exploring different levels of context, and typically train a well-designed convolutional network to exploit useful contexts across all pixels equally. However, in this paper, we find that context demands vary across pixels and regions within each image. Based on this observation, we propose an Adaptive Context Network (ACNet) that captures pixel-aware contexts through a competitive fusion of global context and local context according to per-pixel demands. Specifically, for a given pixel, the global context demand is measured by the similarity between the global feature and the pixel's local feature, and the reverse of this value measures the local context demand. We model the two demand measurements with the proposed global context module and local context module, respectively, to generate adaptive contextual features. Furthermore, we use multiple such modules to build several adaptive context blocks at different levels of the network to obtain a coarse-to-fine result. Finally, comprehensive experimental evaluations demonstrate the effectiveness of the proposed ACNet, which achieves new state-of-the-art performance on all four public datasets, i.e., Cityscapes, ADE20K, PASCAL Context, and COCO Stuff.
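The per-pixel competitive fusion described in the abstract can be illustrated with a rough NumPy sketch. This is not the authors' implementation: the function name `adaptive_context_fusion`, the use of average pooling for the global feature, the sigmoid of a dot product as the similarity measure, the 3x3 mean filter as the local context, and `1 - gate` as the "reverse value" are all assumptions made only for illustration.

```python
import numpy as np


def adaptive_context_fusion(features):
    """Blend global and local context per pixel with a similarity gate.

    features: array of shape (C, H, W).
    Returns (fused, gate); gate has shape (H, W) with values in [0, 1].
    """
    features = np.asarray(features, dtype=np.float64)
    _, h, w = features.shape

    # Global feature: spatial average of the feature map (assumed choice).
    global_feat = features.mean(axis=(1, 2))  # shape (C,)

    # Global context demand: sigmoid of the dot product between each
    # pixel's feature and the global feature (assumed similarity measure).
    similarity = np.einsum("chw,c->hw", features, global_feat)
    gate = 1.0 / (1.0 + np.exp(-similarity))  # shape (H, W)

    # Local context: 3x3 mean filter with edge padding (assumed stand-in
    # for the paper's local context module).
    padded = np.pad(features, ((0, 0), (1, 1), (1, 1)), mode="edge")
    local_ctx = np.zeros_like(features)
    for dy in range(3):
        for dx in range(3):
            local_ctx += padded[:, dy:dy + h, dx:dx + w]
    local_ctx /= 9.0

    # Competitive fusion: the local weight is the reverse of the global one.
    global_ctx = np.broadcast_to(global_feat[:, None, None], features.shape)
    fused = gate * global_ctx + (1.0 - gate) * local_ctx
    return fused, gate


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    fused, gate = adaptive_context_fusion(rng.standard_normal((4, 5, 6)))
    print(fused.shape, gate.shape)
```

Because the two weights sum to one at every pixel, a pixel that resembles the global feature draws mostly on global context, while a dissimilar pixel falls back on its neighborhood.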
Pages: 6747-6756
Page count: 10