Consistency perception network for 360° omnidirectional salient object detection

Cited: 0
Authors
Wen, Hongfa [1 ]
Zhu, Zunjie [2 ,3 ]
Zhou, Xiaofei [1 ]
Zhang, Jiyong [1 ]
Yan, Chenggang [2 ]
Affiliations
[1] Hangzhou Dianzi Univ, Sch Automat, Hangzhou 310018, Peoples R China
[2] Hangzhou Dianzi Univ, Sch Commun Engn, Hangzhou 310018, Peoples R China
[3] Hangzhou Dianzi Univ, Lishui Inst, Lishui 323000, Peoples R China
Funding
National Key Research and Development Program of China; National Natural Science Foundation of China;
Keywords
Salient object detection; Scale perception; Consistent learning; Edge enhancement; 360° omnidirectional image; IMAGE;
DOI
10.1016/j.neucom.2024.129243
CLC number
TP18 [Artificial Intelligence Theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
With the gradual popularization of panoramic cameras and the rapid development of computer vision technology, research on salient object detection (SOD) in 360° omnidirectional images has attracted considerable attention. Unlike traditional 2D images, 360° omnidirectional images generally suffer from drawbacks such as projection distortion, complex scenes, and small objects. To address these challenging issues, in this paper we propose a novel Consistency Perception Network (CPNet) for 360° omnidirectional salient object detection, which can reliably localize salient regions in 360° omnidirectional images. We adopt a split-splice strategy on equirectangular 360° images to solve the distortion problem caused by the discontinuities of objects at the projection boundary, which restores the structure of objects and preserves the integrity of the scene to the greatest extent. Inspired by the human visual perception mechanism, we deploy a bidirectional scale-aware module (BSM), which uses different convolutional dilation rates to simulate different receptive fields, performs hierarchical perceptual learning in series, and conducts bidirectional guidance positioning in parallel. In addition, a consistent context-aware module (CCM) is designed to facilitate consistent learning of salient regions from different attention perspectives; it not only guarantees global uniformity but also accurately preserves local details. Under the gradual feedback and guidance of coarse prediction maps, the edge sharpening module (ESM) depicts the details of salient objects more accurately, thereby generating high-quality saliency maps with clear boundaries. Extensive experiments on three public 360° SOD datasets demonstrate that the proposed CPNet achieves competitive performance in terms of both effectiveness and efficiency compared with cutting-edge SOD methods.
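Two ideas in the abstract lend themselves to a compact illustration: the split-splice relocation of the equirectangular seam and the dilation-based simulation of different receptive fields. The following is a minimal PyTorch sketch; the function and class names (split_splice, DilatedScaleBlock), the half-width shift amount, and the dilation rates (1, 2, 4) are all hypothetical choices for illustration, not details taken from the paper or its released code.

```python
# Minimal sketch of two ideas from the CPNet abstract. The shift amount,
# module names, and dilation rates are illustrative assumptions, not the
# authors' implementation.
import torch
import torch.nn as nn


def split_splice(equirect: torch.Tensor) -> torch.Tensor:
    """Circularly shift an equirectangular image (B, C, H, W) by half its
    width. An object cut by the left/right projection boundary becomes
    contiguous; since the panorama wraps horizontally, nothing is lost."""
    return torch.roll(equirect, shifts=equirect.shape[-1] // 2, dims=-1)


class DilatedScaleBlock(nn.Module):
    """Hypothetical stand-in for a bidirectional scale-aware module:
    3x3 convolutions with increasing dilation rates emulate growing
    receptive fields; each branch refines the previous one (serial
    hierarchical learning), and the multi-scale features are fused in
    both orderings (a crude form of bidirectional guidance)."""

    def __init__(self, channels: int, rates=(1, 2, 4)):
        super().__init__()
        self.branches = nn.ModuleList(
            nn.Conv2d(channels, channels, 3, padding=r, dilation=r)
            for r in rates
        )
        self.fuse = nn.Conv2d(channels * len(rates), channels, 1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        feats, prev = [], x
        for conv in self.branches:           # serial: each scale refines the last
            prev = torch.relu(conv(prev))
            feats.append(prev)
        fwd = torch.cat(feats, dim=1)        # small -> large receptive fields
        bwd = torch.cat(feats[::-1], dim=1)  # large -> small ordering
        return self.fuse(fwd) + self.fuse(bwd)


if __name__ == "__main__":
    feat = torch.randn(1, 64, 128, 256)      # toy equirectangular feature map
    out = DilatedScaleBlock(64)(split_splice(feat))
    print(out.shape)                         # torch.Size([1, 64, 128, 256])
```

With kernel size 3, setting padding equal to the dilation rate keeps the spatial resolution unchanged, which is why the branches can be concatenated directly.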
Pages: 12