Focal Perception Transformer for Light Field Salient Object Detection

Cited by: 0
Authors
Zhao, Liming [1 ]
Zhang, Miao [1 ]
Piao, Yongri [1 ]
Yin, Jihao [1 ]
Lu, Huchuan [1 ]
Affiliations
[1] Dalian Univ Technol, Dalian, Peoples R China
Source
PATTERN RECOGNITION AND COMPUTER VISION, PRCV 2024, PT VIII | 2025, Vol. 15038
Funding
National Natural Science Foundation of China;
Keywords
Salient object detection; Light field; Focal Perception;
DOI
10.1007/978-981-97-8685-5_1
Chinese Library Classification (CLC)
TP18 [Theory of Artificial Intelligence];
Discipline Classification Code
081104; 0812; 0835; 1405;
Abstract
Recently, light field salient object detection (LFSOD) has attracted increasing attention due to the significant improvements it achieves in challenging scenes by exploiting rich light field cues. While many works have made significant progress in this field, a deeper insight into its focal nature remains to be developed. In this work, we propose the Focal Perception Transformer (FPT), which efficiently encodes the context within the focal stack and the all-focal image. Specifically, we introduce focal-related tokens to summarize image-specific characteristics and propose a token communication module (TCM) to convey information and facilitate spatial contextual modeling. The features of each image are enriched and correlated with those of other images through the exchange of information between the precisely encoded focal-related tokens. We also propose a focal perception enhancement (FPE) strategy to help suppress noisy background information. Extensive experiments on four widely used benchmark datasets demonstrate that the proposed model outperforms state-of-the-art methods. The source code will be publicly available at https://github.com/combofish/FPTNet.
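
The abstract describes the architecture only at a high level. As a rough illustration of the token-communication idea it mentions (one summary token per focal-stack slice, followed by information exchange across slices), a minimal PyTorch sketch is given below. All class names, tensor shapes, and design choices here are assumptions made for illustration; they are not taken from the authors' implementation (see https://github.com/combofish/FPTNet for the actual code).

# Illustrative sketch only: a simplified "focal token + token communication" scheme.
# Shapes, module names, and the attention-based pooling are assumptions, not the paper's design.
import torch
import torch.nn as nn


class FocalSliceEncoder(nn.Module):
    """Summarize one focal slice into a single focal-related token (assumed design)."""

    def __init__(self, dim: int, num_heads: int = 4):
        super().__init__()
        self.token = nn.Parameter(torch.zeros(1, 1, dim))  # learnable focal token
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)

    def forward(self, patch_tokens: torch.Tensor) -> torch.Tensor:
        # patch_tokens: (B, L, C) patch features of one slice.
        q = self.token.expand(patch_tokens.size(0), -1, -1)
        summary, _ = self.attn(q, patch_tokens, patch_tokens)  # pool the slice into one token
        return summary                                          # (B, 1, C)


class TokenCommunication(nn.Module):
    """Exchange information among per-slice focal tokens via self-attention (assumed design)."""

    def __init__(self, dim: int, num_heads: int = 4):
        super().__init__()
        self.norm = nn.LayerNorm(dim)
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)

    def forward(self, focal_tokens: torch.Tensor) -> torch.Tensor:
        # focal_tokens: (B, N_slices, C) -- one summary token per focal-stack slice.
        x = self.norm(focal_tokens)
        out, _ = self.attn(x, x, x)   # slices attend to each other
        return focal_tokens + out     # residual update


if __name__ == "__main__":
    B, N, L, C = 2, 12, 196, 256      # batch, focal slices, patches per slice, channels
    slices = torch.randn(B, N, L, C)

    encoder = FocalSliceEncoder(C)
    tcm = TokenCommunication(C)

    # One focal-related token per slice, then cross-slice communication.
    tokens = torch.cat([encoder(slices[:, i]) for i in range(N)], dim=1)  # (B, N, C)
    tokens = tcm(tokens)
    print(tokens.shape)  # torch.Size([2, 12, 256])

The updated tokens could then be broadcast back to each slice's patch features (e.g., by addition or cross-attention); that step is omitted here since the abstract does not specify it.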
Pages: 3-18
Number of pages: 16