A fully convolutional two-stream fusion network for interactive image segmentation

Cited by: 74
Authors
Hu, Yang [1]
Soltoggio, Andrea [1]
Lock, Russell [1]
Carter, Steve [2]
Affiliations
[1] Loughborough Univ, Loughborough, Leics, England
[2] ICE Agcy, Poole, Dorset, England
Funding
Innovate UK project;
Keywords
Interactive image segmentation; Fully convolutional network; Two-stream network;
DOI
10.1016/j.neunet.2018.10.009
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In this paper, we propose a novel fully convolutional two-stream fusion network (FCTSFN) for interactive image segmentation. The proposed network includes two sub-networks: a two-stream late fusion network (TSLFN) that predicts the foreground at a reduced resolution, and a multi-scale refining network (MSRN) that refines the foreground at full resolution. The TSLFN includes two distinct deep streams followed by a fusion network. The intuition is that, since user interactions are more direct information on foreground/background than the image itself, the two-stream structure of the TSLFN reduces the number of layers between the pure user interaction features and the network output, allowing the user interactions to have a more direct impact on the segmentation result. The MSRN fuses the features from different layers of TSLFN with different scales, in order to seek the local to global information on the foreground to refine the segmentation result at full resolution. We conduct comprehensive experiments on four benchmark datasets. The results show that the proposed network achieves competitive performance compared to current state-of-the-art interactive image segmentation methods. © 2018 Elsevier Ltd. All rights reserved.
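The data flow described in the abstract can be illustrated with a toy sketch. This is not the authors' network: average pooling stands in for the learned convolutional streams, and the function names, fusion weights (`w_image`, `w_click`) and the 0.5/0.5 refinement mix are hypothetical values chosen only to show the idea that each input is processed in its own stream, fused late at reduced resolution, and then refined at full resolution.

```python
from typing import List, Tuple

Grid = List[List[float]]

def downsample(x: Grid, factor: int) -> Grid:
    """Average-pool a 2-D grid; a stand-in for one deep stream (assumption)."""
    h, w = len(x) // factor, len(x[0]) // factor
    return [[sum(x[i * factor + di][j * factor + dj]
                 for di in range(factor) for dj in range(factor)) / factor ** 2
             for j in range(w)] for i in range(h)]

def upsample(x: Grid, factor: int) -> Grid:
    """Nearest-neighbour upsampling back towards full resolution."""
    return [[x[i // factor][j // factor]
             for j in range(len(x[0]) * factor)]
            for i in range(len(x) * factor)]

def late_fusion(image_feat: Grid, click_feat: Grid,
                w_image: float = 0.3, w_click: float = 0.7) -> Grid:
    """Combine the two streams only after each was processed separately.
    The weights are hypothetical, not learned."""
    return [[w_image * a + w_click * b for a, b in zip(ra, rb)]
            for ra, rb in zip(image_feat, click_feat)]

def segment(image: Grid, clicks: Grid, factor: int = 2) -> Tuple[Grid, Grid]:
    image_feat = downsample(image, factor)         # image stream
    click_feat = downsample(clicks, factor)        # user-interaction stream
    coarse = late_fusion(image_feat, click_feat)   # reduced-resolution score
    up = upsample(coarse, factor)
    # Toy refinement: mix the coarse prediction with full-resolution
    # interaction information (two scales only, for brevity).
    refined = [[0.5 * u + 0.5 * c for u, c in zip(ru, rc)]
               for ru, rc in zip(up, clicks)]
    return coarse, refined

image = [[0.0] * 4 for _ in range(4)]
clicks = [[1.0] * 4 for _ in range(4)]
coarse, refined = segment(image, clicks)
print(len(coarse), len(refined))
```

Because the interaction map enters through its own stream, it reaches the fused output after a single fusion step rather than passing through every image-processing stage, which is the intuition the abstract gives for the two-stream design.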
Pages: 31-42
Number of pages: 12