MSRMNet: Multi-scale skip residual and multi-mixed features network for salient object detection

Cited by: 16

Authors:

Liu, Xinlong [1]

Wang, Luping [1]

Affiliation:

[1] Sun Yat Sen Univ, Guangzhou 510275, Peoples R China

Source:

NEURAL NETWORKS | 2024 / Vol. 173

Keywords:

Salient object detection; Deep learning; Neural networks; Features fusion; Connections

DOI:

10.1016/j.neunet.2024.106144

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

Current models for salient object detection (SOD) have made remarkable progress through multi-scale feature fusion strategies. However, existing models still show large deviations when detecting objects at different scales, and the target boundaries in the predicted images remain blurred. In this paper, we propose a new model that addresses these issues, using a transformer backbone to capture multiple feature layers. The model uses multi-scale skip residual connections during encoding to improve the accuracy of the predicted object position and edge pixel information. Furthermore, to extract richer multi-scale semantic information, we perform multiple mixed feature operations in the decoding stage. In addition, we add a structural similarity index measure (SSIM) term with coefficients to the loss function to improve the accuracy of boundary prediction. Experiments demonstrate that our algorithm achieves state-of-the-art results on five public datasets and improves the performance metrics of existing SOD tasks. Codes and results are available at: https://github.com/xxwudi508/MSRMNet.
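The abstract mentions an SSIM term with coefficients in the loss function but does not give the formula. The following is only a minimal sketch of what such a weighted combination could look like. The binary cross-entropy base term, the `ssim_weight` value, the single whole-map SSIM window, and all function names are assumptions made for illustration. They are not taken from the paper or its repository.

```python
import math

def bce(pred, target, eps=1e-7):
    """Mean binary cross-entropy over flat lists of values in [0, 1]."""
    total = 0.0
    for p, t in zip(pred, target):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total -= t * math.log(p) + (1.0 - t) * math.log(1.0 - p)
    return total / len(pred)

def global_ssim(x, y, c1=0.01 ** 2, c2=0.03 ** 2):
    """SSIM from whole-map statistics (one window, data range 1).

    Practical implementations usually average SSIM over local windows;
    a single global window is used here only to keep the sketch short.
    """
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    vx = sum((a - mx) ** 2 for a in x) / n
    vy = sum((b - my) ** 2 for b in y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    numerator = (2 * mx * my + c1) * (2 * cov + c2)
    denominator = (mx * mx + my * my + c1) * (vx + vy + c2)
    return numerator / denominator

def combined_loss(pred, target, ssim_weight=0.5):
    """Hypothetical loss: BCE plus a weighted (1 - SSIM) structural penalty."""
    return bce(pred, target) + ssim_weight * (1.0 - global_ssim(pred, target))

# Usage: a flattened 2x2 ground-truth mask and a prediction close to it.
target = [0.0, 0.0, 1.0, 1.0]
pred = [0.1, 0.1, 0.9, 0.9]
loss = combined_loss(pred, target)
```

In this sketch the coefficient `ssim_weight` controls how strongly structural disagreement is penalized relative to the per-pixel term. Setting it to zero reduces the loss to plain BCE.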

Pages: 12

Number of references: 69

