Spherical Convolution-based Saliency Detection for FoV Prediction in 360-degree Video Streaming

Cited by: 1
Authors
Peng, Shuai [1]
Hu, Jialu [1]
Li, Zitong [1]
Xiao, Han [1]
Yang, Shujie [1]
Xu, Changqiao [1]
Affiliations
[1] Beijing Univ Posts & Telecommun, State Key Lab Networking & Switching Technol, Beijing 100876, Peoples R China
Source
2023 INTERNATIONAL WIRELESS COMMUNICATIONS AND MOBILE COMPUTING, IWCMC | 2023
Funding
National Natural Science Foundation of China
Keywords
360-degree video streaming; saliency detection; viewport prediction; spherical convolution
DOI
10.1109/IWCMC58020.2023.10183031
Chinese Library Classification number
TP301 [Theory, methods]
Discipline classification code
081202
Abstract
Field of view (FoV) prediction is a crucial issue in 360-degree video streaming because it is the basis for selectively transmitting panoramic video to reduce bandwidth. Saliency is an important input to FoV prediction: the salient area identifies a user's region of interest (RoI) and reflects the user's viewing preferences. A regular convolutional neural network (CNN) cannot effectively extract the spatial representation of panoramic video content because projecting the panoramic video introduces significant geometric distortion, especially in the polar regions. In this paper, we propose a deep neural network model based on spherical convolution, which learns the spatial features of 360-degree videos by encoding distortion invariance into the CNN architecture. A series of experiments on a public 360-degree video saliency dataset shows that the proposed model outperforms existing saliency models. Finally, we embed the proposed saliency network into a popular FoV prediction framework to obtain a complete FoV prediction framework for 360-degree video streaming.
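The polar distortion mentioned in the abstract can be illustrated with a small sketch. It assumes the panoramic frames are stored in equirectangular projection, which the abstract does not state, and the function name `erp_row_stretch` is hypothetical. It is not the authors' model; it only computes how much each pixel row is stretched horizontally relative to the sphere, which is the effect a fixed planar kernel cannot account for.

```python
import math


def erp_row_stretch(height):
    """Horizontal stretch factor for each pixel row of an equirectangular frame.

    Assumption: row 0 is the top of the frame (north pole side) and every row
    spans the full 360 degrees of longitude.
    """
    factors = []
    for row in range(height):
        # Latitude of the row centre, running from near +pi/2 (top)
        # to near -pi/2 (bottom).
        lat = (0.5 - (row + 0.5) / height) * math.pi
        # A circle of latitude has circumference proportional to cos(lat),
        # but the projection gives it the same pixel width as the equator.
        factors.append(1.0 / math.cos(lat))
    return factors


if __name__ == "__main__":
    for row, factor in enumerate(erp_row_stretch(8)):
        print(f"row {row}: stretched x{factor:.2f}")
```

Rows near the equator have a factor close to 1, while rows near the poles are stretched several times over, so the same kernel footprint covers very different areas of the sphere depending on the row.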
Pages: 162-167
Page count: 6