CNN-based encoder-decoder networks for salient object detection: A comprehensive review and recent advances

Cited by: 185

Authors:

Ji, Yuzhu [1]

Zhang, Haijun [1]

Zhang, Zhao [2]

Liu, Ming [3]

Affiliations:

[1] Harbin Inst Technol, Dept Comp Sci, Shenzhen, Peoples R China

[2] Hefei Univ Technol, Dept Comp Sci, Hefei, Peoples R China

[3] Harbin Inst Technol, Sch Astronaut, Harbin, Peoples R China

Source:

INFORMATION SCIENCES | 2021 / Vol. 546

Funding:

National Natural Science Foundation of China;

Keywords:

Salient object detection; Encoder-decoder model; Pixel-level classification; Video saliency; Empirical study;

DOI:

10.1016/j.ins.2020.09.003

Chinese Library Classification number:

TP [Automation technology, computer technology];

Discipline classification code:

0812 ;

Abstract:

Convolutional neural network (CNN)-based encoder-decoder models have profoundly inspired recent works in the field of salient object detection (SOD). With the rapid development of encoder-decoder models with respect to most pixel-level dense prediction tasks, an empirical study still does not exist that evaluates performance by applying a large body of encoder-decoder models on SOD tasks. In this paper, instead of limiting our survey to SOD methods, a broader view is further presented from the perspective of fundamental architectures of key modules and structures in CNN-based encoder-decoder models for pixel-level dense prediction tasks. Moreover, we focus on performing SOD by leveraging deep encoder-decoder models, and present an extensive empirical study on baseline encoder-decoder models in terms of different encoder backbones, loss functions, training batch sizes, and attention structures. Moreover, state-of-the-art encoder-decoder models adopted from semantic segmentation and deep CNN-based SOD models are also investigated. New baseline models that can outperform state-of-the-art performance were discovered. In addition, these newly discovered baseline models were further evaluated on three video-based SOD benchmark datasets. Experimental results demonstrate the effectiveness of these baseline models on both image- and video-based SOD tasks. This empirical study is concluded by a comprehensive summary which provides suggestions on future perspectives. (c) 2020 Elsevier Inc. All rights reserved.

Citation

Pages: 835-857

Page count: 23

References (113 in total)

[81] Detect Globally, Refine Locally: A Novel Approach to Saliency Detection [J].

Wang, Tiantian ;

Zhang, Lihe ;

Wang, Shuo ;

Lu, Huchuan ;

Yang, Gang ;

Ruan, Xiang ;

Borji, Ali .

2018 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2018, :3127-3135

[82] A Stagewise Refinement Model for Detecting Salient Objects in Images [J].

Wang, Tiantian ;

Borji, Ali ;

Zhang, Lihe ;

Zhang, Pingping ;

Lu, Huchuan .

2017 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV), 2017, :4039-4048

[83] Revisiting Video Saliency: A Large-scale Benchmark and a New Model [J].

Wang, Wenguan ;

Shen, Jianbing ;

Guo, Fang ;

Cheng, Ming-Ming ;

Borji, Ali .

2018 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2018, :4894-4903

[84] Salient Object Detection Driven by Fixation Prediction [J].

Wang, Wenguan ;

Shen, Jianbing ;

Dong, Xingping ;

Borji, Ali .

2018 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2018, :1711-1720

[85] Video Salient Object Detection via Fully Convolutional Networks [J].

Wang, Wenguan ;

Shen, Jianbing ;

Shao, Ling .

IEEE TRANSACTIONS ON IMAGE PROCESSING, 2018, 27 (01) :38-49

[86] Non-local Neural Networks [J].

Wang, Xiaolong ;

Girshick, Ross ;

Gupta, Abhinav ;

He, Kaiming .

2018 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2018, :7794-7803

[87] Geodesic Saliency Using Background Priors [J].

Wei, Yichen ;

Wen, Fang ;

Zhu, Wangjiang ;

Sun, Jian .

COMPUTER VISION - ECCV 2012, PT III, 2012, 7574 :29-42

[88] Revisiting Dilated Convolution: A Simple Approach for Weakly- and Semi-Supervised Semantic Segmentation [J].

Wei, Yunchao ;

Xiao, Huaxin ;

Shi, Honghui ;

Jie, Zequn ;

Feng, Jiashi ;

Huang, Thomas S. .

2018 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2018, :7268-7277

[89] STC: A Simple to Complex Framework for Weakly-Supervised Semantic Segmentation [J].

Wei, Yunchao ;

Liang, Xiaodan ;

Chen, Yunpeng ;

Shen, Xiaohui ;

Cheng, Ming-Ming ;

Feng, Jiashi ;

Zhao, Yao ;

Yan, Shuicheng .

IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2017, 39 (11) :2314-2320

[90] CBAM: Convolutional Block Attention Module [J].

Woo, Sanghyun ;

Park, Jongchan ;

Lee, Joon-Young ;

Kweon, In So .

COMPUTER VISION - ECCV 2018, PT VII, 2018, 11211 :3-19
