Synthesizing Supervision for Learning Deep Saliency Network without Human Annotation

Citations: 115
Authors
Zhang, Dingwen [1, 2]
Han, Junwei [2]
Zhang, Yu [2]
Xu, Dong [3]
Affiliations
[1] Xidian Univ, Sch Mechanoelect Engn, Xian 710071, Peoples R China
[2] Northwestern Polytech Univ, Sch Automat, Xian 710072, Peoples R China
[3] Univ Sydney, Sch Elect & Informat Engn, Sydney, NSW 2006, Australia
Funding
National Science Foundation (USA); Australian Research Council
Keywords
Object detection; Detectors; Training; Knowledge engineering; Task analysis; Semantics; Feature extraction; Salient object detection; supervision synthesis; annotation-free; weakly supervised semantic segmentation; OBJECT DETECTION; DRIVEN;
DOI
10.1109/TPAMI.2019.2900649
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Salient object detection has recently developed rapidly alongside the wide adoption of deep neural networks. Trained on large numbers of images annotated with strong pixel-level ground-truth masks, deep salient object detectors have achieved state-of-the-art performance. However, providing pixel-level ground-truth masks for every training image is expensive and time-consuming. To address this problem, this paper proposes one of the earliest frameworks for learning deep salient object detectors without any human annotation. The supervisory signals used in our learning framework are generated by a novel supervision synthesis scheme whose key insights are "knowledge source transition" and "supervision by fusion". Specifically, the proposed framework dynamically explores both an external knowledge source and an internal knowledge source to provide informative cues for synthesizing the required supervision, and a two-stream fusion mechanism implements the supervision synthesis process. Comprehensive experiments on four benchmark datasets demonstrate that the deep salient object detector trained with the proposed framework often works well without any human-annotated masks and even approaches its upper bound obtained under fully supervised learning (within a 3 percent performance gap). We also apply the salient object detector learned with our annotation-free framework to assist weakly supervised semantic segmentation, which demonstrates that our approach can also reduce the heavy supplementary supervision required by existing weakly supervised semantic segmentation frameworks.
Pages: 1755-1769
Page count: 15