Spatial Attention-Guided Light Field Salient Object Detection Network With Implicit Neural Representation

Cited by: 2
Authors
Zheng, Xin [1]
Li, Zhengqu [1]
Liu, Deyang [1]
Zhou, Xiaofei [2]
Shan, Caifeng [3,4]
Affiliations
[1] Anqing Normal Univ, Sch Comp & Informat, Anqing 246000, Peoples R China
[2] Hangzhou Dianzi Univ, Sch Automat, Hangzhou 310061, Peoples R China
[3] Shandong Univ Sci & Technol, Coll Elect Engn & Automat, Qingdao 266590, Peoples R China
[4] Nanjing Univ, Sch Intelligence Sci & Technol, Nanjing 210023, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Task analysis; Feature extraction; Image restoration; Three-dimensional displays; Object detection; Light fields; Fuses; Light field; salient object detection; implicit neural representation; spatial attention; DEPTH ESTIMATION;
DOI
10.1109/TCSVT.2024.3437685
Chinese Library Classification (CLC)
TM [Electrical Engineering]; TN [Electronics and Communication Technology];
Discipline Codes
0808; 0809;
Abstract
Recently, many Light Field Salient Object Detection (LF SOD) methods have been proposed. However, guaranteeing the integrity of the generated salient object map and recovering its high-frequency details remain challenging. To this end, we propose a spatial attention-guided LF SOD network with implicit neural representation to further improve LF SOD performance. We adopt an encoder-decoder structure for model construction. To ensure the completeness of the generated salient object map, a multi-modal and multi-scale feature fusion module is designed in the encoder, which refines the salient regions within the all-in-focus image and aggregates the focal stack and all-in-focus image in a spatial attention-guided manner. To recover more high-frequency details of the obtained salient object map, an implicit detail restoration module is proposed in the decoder. By virtue of implicit neural representation, we convert the detail restoration problem into a functional mapping problem. By further integrating the self-attention mechanism, the derived saliency map can be depicted at a finer level. Comprehensive experimental results demonstrate the superiority of the proposed method. Ablation studies and visual comparisons further validate that the proposed method preserves the integrity of the obtained saliency map and recovers more of its high-frequency detail. The code is publicly available at https://github.com/ldyorchid/LFSOD-Net.
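As a reading aid, below is a minimal PyTorch sketch of the kind of spatial attention-guided fusion the abstract describes: an attention map derived from the all-in-focus branch re-weights the focal-stack features before the two modalities are aggregated. The class name SpatialAttentionFusion, the 7x7 pooled-map attention design, and all tensor shapes are illustrative assumptions, not the authors' released implementation.

import torch
import torch.nn as nn

class SpatialAttentionFusion(nn.Module):
    def __init__(self, channels: int):
        super().__init__()
        # 7x7 conv over channel-pooled maps, a common spatial-attention design
        self.attn = nn.Sequential(
            nn.Conv2d(2, 1, kernel_size=7, padding=3),
            nn.Sigmoid(),
        )
        self.fuse = nn.Conv2d(2 * channels, channels, kernel_size=3, padding=1)

    def forward(self, aif_feat: torch.Tensor, fs_feat: torch.Tensor) -> torch.Tensor:
        # Build a spatial attention map from the all-in-focus features.
        avg_pool = aif_feat.mean(dim=1, keepdim=True)   # (B, 1, H, W)
        max_pool = aif_feat.amax(dim=1, keepdim=True)   # (B, 1, H, W)
        attn = self.attn(torch.cat([avg_pool, max_pool], dim=1))
        # Re-weight focal-stack features, then fuse with the all-in-focus branch.
        return self.fuse(torch.cat([aif_feat, fs_feat * attn], dim=1))

# Usage: fuse two 64-channel feature maps of size 64x64.
fusion = SpatialAttentionFusion(channels=64)
out = fusion(torch.randn(2, 64, 64, 64), torch.randn(2, 64, 64, 64))
print(out.shape)  # torch.Size([2, 64, 64, 64])

Deriving the attention map from the all-in-focus branch lets the sharper, artifact-free modality decide which spatial locations of the focal stack to emphasize.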
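The "detail restoration as functional mapping" idea can likewise be sketched with a coordinate-based MLP in the spirit of LIIF-style implicit neural representations: a continuous 2-D query coordinate, conditioned on a locally sampled deep feature, is mapped to a saliency value. The decoder name ImplicitSaliencyDecoder and its signature are hypothetical and stand in for whatever conditioning the paper actually uses.

import torch
import torch.nn as nn
import torch.nn.functional as F

class ImplicitSaliencyDecoder(nn.Module):
    def __init__(self, feat_dim: int, hidden: int = 256):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(feat_dim + 2, hidden), nn.ReLU(inplace=True),
            nn.Linear(hidden, hidden), nn.ReLU(inplace=True),
            nn.Linear(hidden, 1),
        )

    def forward(self, feat: torch.Tensor, coords: torch.Tensor) -> torch.Tensor:
        # feat: (B, C, H, W) encoder features; coords: (B, N, 2) in [-1, 1].
        # Bilinearly sample the feature map at each query coordinate.
        sampled = F.grid_sample(
            feat, coords.unsqueeze(1), mode="bilinear", align_corners=False
        )                                                # (B, C, 1, N)
        sampled = sampled.squeeze(2).permute(0, 2, 1)    # (B, N, C)
        # Map (local feature, coordinate) -> saliency value in [0, 1].
        return torch.sigmoid(self.mlp(torch.cat([sampled, coords], dim=-1)))

# Usage: query a 32x32 feature map at 4096 continuous coordinates.
decoder = ImplicitSaliencyDecoder(feat_dim=64)
coords = torch.rand(1, 4096, 2) * 2 - 1                  # uniform in [-1, 1]
saliency = decoder(torch.randn(1, 64, 32, 32), coords)   # (1, 4096, 1)

Because the coordinates are continuous, the saliency map can be queried at any resolution, which is what lets an implicit decoder recover high-frequency detail beyond the feature-map grid.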
Pages: 12437-12449
Number of pages: 13