SPARSE DIRECTED GRAPH LEARNING FOR HEAD MOVEMENT PREDICTION IN 360 VIDEO STREAMING

Authors
Zhang, Xue [1]
Cheung, Gene [1]
Le Callet, Patrick [2]
Tan, Jack Z. G. [3]
Affiliations
[1] York University, North York, ON, Canada
[2] University of Nantes, Nantes, France
[3] Kandao Technology Co., Ltd., Shenzhen, Guangdong, People's Republic of China
Source
2020 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP) | 2020
Funding
Natural Sciences and Engineering Research Council of Canada (NSERC)
Keywords
Virtual reality; interactive video streaming; graph learning; convex optimization; saliency; models
DOI
10.1109/icassp40776.2020.9053598
Chinese Library Classification (CLC) number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
High-definition 360 videos encoded in fine quality are typically too large to stream in their entirety over bandwidth (BW)-constrained networks. One popular remedy is to interactively extract and send only the spatial sub-region corresponding to a viewer's current field-of-view (FoV) in a head-mounted display (HMD), which makes streaming more BW-efficient. Because of the non-negligible round-trip-time (RTT) delay between server and client, accurate head movement prediction that foretells a viewer's future FoVs is essential. Existing approaches are either overly simplistic in modelling and predict poorly when RTT is large, or over-reliant on data-driven learning, resulting in inflexible models that are not robust to RTT heterogeneity. In this paper, we cast the head movement prediction task as a sparse directed graph learning problem, in which three sources of relevant information (a 360 image saliency map, collected viewers' head movement traces, and a biological head rotation model) are aggregated into a unified Markov model. Specifically, we formulate a constrained optimization problem that minimizes an ℓ2-norm fidelity term and a sparsity term, corresponding respectively to consistency with trace data and saliency, and to a sparse graph model prior. We solve the problem alternately using a hybrid of iteratively reweighted least squares (IRLS) and Frank-Wolfe optimization. Extensive experiments show that our head movement prediction scheme noticeably outperforms existing proposals across a wide range of RTTs.
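The abstract describes aggregating the information sources into a Markov model and predicting future FoVs over a range of RTTs. The following is a minimal sketch of that prediction step only, assuming the graph has already been learned; it does not implement the paper's IRLS / Frank-Wolfe optimization. The function names (`normalize_rows`, `step`, `predict`), the four-region layout, and all edge weights are hypothetical and chosen purely for illustration.

```python
# Illustrative sketch only: names, region count, and weights are assumptions,
# not values or code from the paper.

def normalize_rows(weights):
    """Turn a nonnegative directed-graph weight matrix into a row-stochastic
    Markov transition matrix (each row sums to 1)."""
    transition = []
    for row in weights:
        total = sum(row)
        if total <= 0:
            raise ValueError("each node needs at least one outgoing edge")
        transition.append([w / total for w in row])
    return transition


def step(distribution, transition):
    """One Markov step: new[j] = sum_i distribution[i] * transition[i][j]."""
    n = len(transition)
    return [sum(distribution[i] * transition[i][j] for i in range(n))
            for j in range(n)]


def predict(distribution, transition, num_steps):
    """Propagate the current FoV distribution num_steps transitions ahead."""
    for _ in range(num_steps):
        distribution = step(distribution, transition)
    return distribution


# Toy sparse directed graph over 4 viewing regions; zero entries are absent
# edges, so the viewer can only stay put or drift toward higher indices.
weights = [
    [2.0, 2.0, 0.0, 0.0],
    [0.0, 1.0, 3.0, 0.0],
    [0.0, 0.0, 1.0, 1.0],
    [0.0, 0.0, 0.0, 1.0],
]
P = normalize_rows(weights)
current = [1.0, 0.0, 0.0, 0.0]  # viewer is currently in region 0

short_rtt = predict(current, P, 1)  # look-ahead of one transition
long_rtt = predict(current, P, 4)   # look-ahead of four transitions
```

In this sketch a larger RTT simply means more transitions applied to the same matrix, so one model serves different delays without retraining.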
Pages: 2678-2682
Page count: 5