Focal Perception Transformer for Light Field Salient Object Detection

Cited by: 0
Authors
Zhao, Liming [1 ]
Zhang, Miao [1 ]
Piao, Yongri [1 ]
Yin, Jihao [1 ]
Lu, Huchuan [1 ]
Affiliations
[1] Dalian Univ Technol, Dalian, Peoples R China
Source
PATTERN RECOGNITION AND COMPUTER VISION, PRCV 2024, PT VIII | 2025, Vol. 15038
Funding
National Natural Science Foundation of China;
Keywords
Salient object detection; Light field; Focal Perception;
DOI
10.1007/978-981-97-8685-5_1
Chinese Library Classification (CLC)
TP18 [Theory of Artificial Intelligence];
Discipline Classification Code
081104; 0812; 0835; 1405;
Abstract
Recently, light field salient object detection (LFSOD) has attracted increasing attention due to the significant improvements it achieves in challenging scenes by exploiting rich light field cues. While many works have made significant progress in this field, a deeper insight into its focal nature remains to be developed. In this work, we propose the Focal Perception Transformer (FPT), which efficiently encodes the context within the focal stack and the all-focal image. Specifically, we introduce focal-related tokens to summarize image-specific characteristics and propose a token communication module (TCM) to convey information and facilitate spatial contextual modeling. The features of each image are enriched and correlated with those of other images through the exchange of information between the precisely encoded focal-related tokens. We also propose a focal perception enhancement (FPE) strategy to help suppress noisy background information. Extensive experiments on four widely used benchmark datasets demonstrate that the proposed model outperforms state-of-the-art methods. The source code will be publicly available at https://github.com/combofish/FPTNet.
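
The abstract describes the architecture only at a high level. As a rough illustration of the token-communication idea it mentions (one summary token per focal-stack slice, followed by information exchange across slices), a minimal PyTorch sketch is given below. All class names, tensor shapes, and design choices here are assumptions made for illustration; they are not taken from the authors' implementation (see https://github.com/combofish/FPTNet for the actual code).

# Illustrative sketch only: a simplified "focal token + token communication" scheme.
# Shapes, module names, and the attention-based pooling are assumptions, not the paper's design.
import torch
import torch.nn as nn


class FocalSliceEncoder(nn.Module):
    """Summarize one focal slice into a single focal-related token (assumed design)."""

    def __init__(self, dim: int, num_heads: int = 4):
        super().__init__()
        self.token = nn.Parameter(torch.zeros(1, 1, dim))  # learnable focal token
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)

    def forward(self, patch_tokens: torch.Tensor) -> torch.Tensor:
        # patch_tokens: (B, L, C) patch features of one slice.
        q = self.token.expand(patch_tokens.size(0), -1, -1)
        summary, _ = self.attn(q, patch_tokens, patch_tokens)  # pool the slice into one token
        return summary                                          # (B, 1, C)


class TokenCommunication(nn.Module):
    """Exchange information among per-slice focal tokens via self-attention (assumed design)."""

    def __init__(self, dim: int, num_heads: int = 4):
        super().__init__()
        self.norm = nn.LayerNorm(dim)
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)

    def forward(self, focal_tokens: torch.Tensor) -> torch.Tensor:
        # focal_tokens: (B, N_slices, C) -- one summary token per focal-stack slice.
        x = self.norm(focal_tokens)
        out, _ = self.attn(x, x, x)   # slices attend to each other
        return focal_tokens + out     # residual update


if __name__ == "__main__":
    B, N, L, C = 2, 12, 196, 256      # batch, focal slices, patches per slice, channels
    slices = torch.randn(B, N, L, C)

    encoder = FocalSliceEncoder(C)
    tcm = TokenCommunication(C)

    # One focal-related token per slice, then cross-slice communication.
    tokens = torch.cat([encoder(slices[:, i]) for i in range(N)], dim=1)  # (B, N, C)
    tokens = tcm(tokens)
    print(tokens.shape)  # torch.Size([2, 12, 256])

The updated tokens could then be broadcast back to each slice's patch features (e.g., by addition or cross-attention); that step is omitted here since the abstract does not specify it.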
Pages: 3-18
Number of pages: 16