Occlusion-aware Bi-directional Guided Network for Light Field Salient Object Detection

Cited by: 20
Authors
Jing, Dong [1 ,2 ]
Zhang, Shuo [1 ,2 ,3 ]
Cong, Runmin [4 ,5 ]
Lin, Youfang [1 ,2 ,3 ]
Affiliations
[1] Beijing Jiaotong Univ, Sch Comp & Informat Technol, Beijing, Peoples R China
[2] Beijing Key Lab Traff Data Anal & Min, Beijing, Peoples R China
[3] CAAC Key Lab Intelligent Passenger Serv Civil Avi, Beijing, Peoples R China
[4] Beijing Jiaotong Univ, Inst Informat Sci, Beijing, Peoples R China
[5] Beijing Key Lab Adv Informat Sci & Network Techno, Beijing, Peoples R China
Source
PROCEEDINGS OF THE 29TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2021 | 2021
Funding
National Natural Science Foundation of China;
Keywords
Light field; Salient object detection; Neural network;
DOI
10.1145/3474085.3475312
CLC classification
TP18 [Artificial Intelligence Theory];
Discipline codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Existing light field based works utilize either views or focal stacks for saliency detection. However, since depth information exists only implicitly in adjacent views or different focal slices, it is difficult to exploit scene depth information from either representation. By comparison, Epipolar Plane Images (EPIs) provide explicit and accurate scene depth and occlusion information through projected pixel lines. Because the depth of an object is typically continuous, the distribution of occlusion edges concentrates more on object boundaries than traditional color edges do, which benefits both the accuracy and completeness of saliency detection. In this paper, we propose a learning-based network that exploits occlusion features from EPIs and integrates them with high-level features from the central view for accurate salient object detection. Specifically, a novel Occlusion Extraction Module is proposed to extract occlusion boundary features from horizontal and vertical EPIs. To naturally combine the occlusion features from EPIs with the high-level features from the central view, we design a concise Bi-directional Guiding Flow based on cascaded decoders, which leverages the generated salient edge predictions and salient object predictions to refine features in the mutual encoding processes. Experimental results demonstrate that our approach achieves state-of-the-art performance in both segmentation accuracy and edge clarity.
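The record itself includes no code. As a minimal sketch of what the horizontal and vertical EPIs mentioned in the abstract are, the snippet below slices them from a 4D light field array; the `(u, v, x, y)` index order, array sizes, and function names are illustrative assumptions, not part of the paper.

```python
import numpy as np

# Hypothetical grayscale 4D light field: angular dims (U, V), spatial dims (X, Y).
U, V, X, Y = 9, 9, 64, 64
lf = np.random.rand(U, V, X, Y)

def horizontal_epi(lf, v, y):
    """Horizontal EPI: vary angular u and spatial x at fixed (v, y)."""
    return lf[:, v, :, y]  # shape (U, X)

def vertical_epi(lf, u, x):
    """Vertical EPI: vary angular v and spatial y at fixed (u, x)."""
    return lf[u, :, x, :]  # shape (V, Y)

# EPIs through the central view, where scene depth shows up as line slopes.
epi_h = horizontal_epi(lf, v=V // 2, y=Y // 2)
epi_v = vertical_epi(lf, u=U // 2, x=X // 2)
print(epi_h.shape, epi_v.shape)  # (9, 64) (9, 64)
```

In such slices, each scene point traces a line whose slope is proportional to its disparity, which is why occlusion boundaries appear as explicit line intersections rather than implicit cues.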
Pages: 1692 - 1701
Page count: 10