Focal stack based light field salient object detection via 3D-2D convolution hybrid network

Times cited: 0
Authors
Wang, Xin [1, 2]
Xiong, Gaomin [1]
Zhang, Yong [1]
Affiliations
[1] Hefei Univ Technol, Sch Comp & Informat, Hefei 230601, Anhui, Peoples R China
[2] Intelligent Interconnected Syst Lab Anhui Prov, Hefei 230601, Anhui, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Light field; SOD; Focal stack; Hybrid network; End-to-end;
DOI
10.1007/s11760-023-02700-1
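Chinese Library Classification
TM [Electrical engineering]; TN [Electronics and communication technology]
Discipline classification codes
0808; 0809
Abstract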
Due to the remarkable ability to capture both spatial and angular information of the scene, light field imaging provides abundant cues and information. Over the last decade, various forms of data, such as the focal stack, all-in-focus image, depth map, sub-aperture image, center-view image, and micro-lens image array, have been exploited by different methods of light field salient object detection (SOD). In this study, we introduce a novel 3D-2D convolution hybrid network called HFSNet, which utilizes the focal stack as the only input to achieve SOD. The encoder network is constructed based on 3D convolution to extract and preserve the continuously changing focus cues within the focal stack. In order to reduce the computational burden of 3D convolution, we incorporate 3D max-pooling layers, channel reduction modules, and focal stack feature fusing modules to reduce the data dimension. The decoder network, on the other hand, is built on 2D convolution to generate coarse saliency maps, which are then refined using the refine module to obtain the final saliency map. We conduct experiments on five benchmark light field SOD datasets, and the results demonstrate that our method outperforms other models on DUTLF-V2 and DUTLF-FS, and achieves competitive outcomes on Lytro Illum, HFUT-Lytro, and LFSD.
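The dimension reduction described in the abstract can be illustrated with a small shape-tracing sketch. It is not the HFSNet implementation: the number of stages, the channel widths, the pooling kernel, the 12-slice 256 x 256 input, and all function names are assumptions chosen for illustration. It only shows how 3D max-pooling, channel reduction, and focal-slice fusion could shrink a focal stack feature volume into 2D maps that a 2D decoder can consume.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Shape5D:
    """Feature shape without the batch axis: channels, focal slices, height, width."""
    channels: int
    slices: int
    height: int
    width: int


def conv3d_same(s: Shape5D, out_channels: int) -> Shape5D:
    # A 3x3x3 convolution with stride 1 and padding 1 changes only the channel count.
    return Shape5D(out_channels, s.slices, s.height, s.width)


def max_pool3d(s: Shape5D, kernel=(2, 2, 2)) -> Shape5D:
    # Non-overlapping pooling (stride equal to kernel), sizes rounded down.
    kd, kh, kw = kernel
    return Shape5D(s.channels, s.slices // kd, s.height // kh, s.width // kw)


def reduce_channels(s: Shape5D, out_channels: int) -> Shape5D:
    # Hypothetical stand-in for a channel reduction module (e.g. a 1x1x1 convolution).
    return Shape5D(out_channels, s.slices, s.height, s.width)


def fuse_focal_slices(s: Shape5D) -> tuple:
    # Hypothetical stand-in for a focal stack feature fusing module:
    # the slice axis is collapsed, leaving a (channels, height, width) map.
    return (s.channels, s.height, s.width)


def trace_encoder(focal_stack: Shape5D, widths=(32, 64, 128), reduced=32):
    """Return the per-stage 3D shapes and the fused 2D shapes (assumed layout)."""
    stages, fused = [], []
    s = focal_stack
    for w in widths:
        s = max_pool3d(conv3d_same(s, w))
        stages.append(s)
        fused.append(fuse_focal_slices(reduce_channels(s, reduced)))
    return stages, fused


# Assumed input: 12 RGB focal slices of 256 x 256 pixels.
stack = Shape5D(channels=3, slices=12, height=256, width=256)
stages, fused = trace_encoder(stack)
for s, f in zip(stages, fused):
    print(s, "->", f)
```

Under these assumptions the slice axis shrinks at every stage, so the cost of later 3D convolutions falls quickly, while each fused map keeps a fixed, small channel count for the 2D decoder.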
Pages: 109-118
Page count: 10