CNN-based encoder-decoder networks for salient object detection: A comprehensive review and recent advances

Cited by: 185
Authors
Ji, Yuzhu [1]
Zhang, Haijun [1]
Zhang, Zhao [2]
Liu, Ming [3]
Affiliations
[1] Harbin Inst Technol, Dept Comp Sci, Shenzhen, Peoples R China
[2] Hefei Univ Technol, Dept Comp Sci, Hefei, Peoples R China
[3] Harbin Inst Technol, Sch Astronaut, Harbin, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Salient object detection; Encoder-decoder model; Pixel-level classification; Video saliency; Empirical study;
DOI
10.1016/j.ins.2020.09.003
Chinese Library Classification
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
Convolutional neural network (CNN)-based encoder-decoder models have profoundly inspired recent works in the field of salient object detection (SOD). With the rapid development of encoder-decoder models with respect to most pixel-level dense prediction tasks, an empirical study still does not exist that evaluates performance by applying a large body of encoder-decoder models on SOD tasks. In this paper, instead of limiting our survey to SOD methods, a broader view is further presented from the perspective of fundamental architectures of key modules and structures in CNN-based encoder-decoder models for pixel-level dense prediction tasks. Moreover, we focus on performing SOD by leveraging deep encoder-decoder models, and present an extensive empirical study on baseline encoder-decoder models in terms of different encoder backbones, loss functions, training batch sizes, and attention structures. State-of-the-art encoder-decoder models adopted from semantic segmentation and deep CNN-based SOD models are also investigated. New baseline models that can outperform state-of-the-art performance were discovered. In addition, these newly discovered baseline models were further evaluated on three video-based SOD benchmark datasets. Experimental results demonstrate the effectiveness of these baseline models on both image- and video-based SOD tasks. This empirical study is concluded by a comprehensive summary which provides suggestions on future perspectives. (c) 2020 Elsevier Inc. All rights reserved.
Pages: 835-857
Number of pages: 23