A Viewport Prediction Framework for Panoramic Videos

Cited by: 6
Authors
Tang, Jinting [1, 2]
Huo, Yongkai [1, 2]
Yang, Shaoshi [3, 4]
Jiang, Jianmin [1, 2]
Affiliations
[1] Shenzhen Univ, Sch Comp Sci & Software Engn, Natl Engn Lab Big Data Syst Comp Technol, Shenzhen 518060, Peoples R China
[2] Shenzhen Univ, Sch Comp Sci & Software Engn, Res Inst Future Media Comp, Shenzhen 518060, Peoples R China
[3] Beijing Univ Posts & Telecommun, Sch Informat & Commun Engn, Beijing 100876, Peoples R China
[4] Beijing Univ Posts & Telecommun, Minist Educ, Key Lab Universal Wireless Commun, Beijing 100876, Peoples R China
Source
2020 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN) | 2020
Funding
National Natural Science Foundation of China;
Keywords
panoramic video; viewport prediction; object tracking; deep learning;
DOI
10.1109/ijcnn48605.2020.9207562
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Panoramic video is an attractive video format because it gives viewers an immersive experience, for example in virtual reality (VR) gaming. However, a viewer focuses on only part of the panoramic video at any time, which is referred to as the viewport. Hence, the resources consumed in distributing the remaining part of the panoramic video are wasted, and it is intuitive to deliver only the video data within the viewport to reduce the distribution cost. Empirically, viewports within a time interval are highly correlated, so the historical trajectory may be used for predicting future viewports. In addition, a viewer tends to sustain attention on a specific object in a panoramic video. Motivated by these findings, we propose a deep learning-based viewport Prediction scheme, namely HOP, where the Historical viewport trajectory of viewers and Object tracking are jointly exploited by long short-term memory (LSTM) networks. Additionally, our solution is capable of predicting multiple future viewports, whereas state-of-the-art contributions support only a single viewport prediction. Simulation results show that our proposed HOP scheme outperforms the benchmarks by up to 33.5% in terms of the prediction error.
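The premise that recent viewports are highly correlated can be illustrated with a much simpler baseline than the one the abstract describes. The sketch below is not the HOP scheme: it uses no LSTM and no object tracking, and simply extrapolates the average angular velocity of past viewport centres to produce several future viewports. The function names, the (yaw, pitch) representation in degrees, and the great-circle error measure are all assumptions made for illustration; the record does not specify how the paper represents viewports or defines prediction error.

```python
import math


def wrap_yaw(deg):
    """Map a yaw angle in degrees to the range [-180, 180)."""
    return (deg + 180.0) % 360.0 - 180.0


def predict_viewports(history, horizon):
    """Linearly extrapolate future viewport centres (hypothetical baseline).

    history: list of (yaw, pitch) pairs in degrees, sampled at a uniform
             interval, oldest first. At least two samples are required.
    horizon: number of future steps to predict.
    """
    if len(history) < 2:
        raise ValueError("need at least two past viewports")
    steps = len(history) - 1
    # Average per-step yaw change, measured along the shortest arc so that
    # crossing the +/-180 degree seam does not look like a huge jump.
    dyaw = sum(wrap_yaw(b[0] - a[0]) for a, b in zip(history, history[1:])) / steps
    dpitch = (history[-1][1] - history[0][1]) / steps
    yaw, pitch = history[-1]
    predictions = []
    for k in range(1, horizon + 1):
        next_yaw = wrap_yaw(yaw + k * dyaw)
        # Pitch cannot leave the sphere's poles, so clamp it.
        next_pitch = max(-90.0, min(90.0, pitch + k * dpitch))
        predictions.append((next_yaw, next_pitch))
    return predictions


def angular_error(a, b):
    """Great-circle angle in degrees between two (yaw, pitch) centres."""
    yaw1, pitch1 = math.radians(a[0]), math.radians(a[1])
    yaw2, pitch2 = math.radians(b[0]), math.radians(b[1])
    cos_d = (math.sin(pitch1) * math.sin(pitch2)
             + math.cos(pitch1) * math.cos(pitch2) * math.cos(yaw1 - yaw2))
    # Guard against rounding slightly outside acos's domain.
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_d))))


if __name__ == "__main__":
    past = [(170.0, 0.0), (175.0, 2.0), (-180.0, 4.0)]
    for step, centre in enumerate(predict_viewports(past, 3), start=1):
        print(step, centre)
```

A learned predictor such as the one described in the abstract would replace the constant-velocity assumption with a model trained on viewer trajectories; a baseline of this kind is useful mainly as a sanity check against which such a model can be compared.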
Pages: 8