Towards 6DoF live video streaming system for immersive media

Cited by: 0
Authors
Yangang Cai
Xuesong Gao
Weiqiang Chen
Ronggang Wang
Affiliation
[1] Peking University, Shenzhen Graduate School
Source
Multimedia Tools and Applications | 2022 / Vol. 81
Keywords
View synthesis; GPU acceleration; 6DoF; DIBR
DOI
Not available
Abstract
Building on the three rotational degrees of freedom provided by VR (about the X, Y and Z axes), six-degrees-of-freedom (6DoF) video also lets the viewer freely control the viewing position. When watching a sports game, the audience is no longer limited by the position of the camera and can freely choose the viewing angle and position, just as in the real world, which greatly improves immersion. However, the major barrier to industrializing live 6DoF video is its extremely high computational complexity: multi-view depth estimation and Depth Image Based Rendering (DIBR) are difficult to realize. In addition, existing devices do not have hardware interfaces that support multi-view coding. Therefore, new techniques are needed for depth estimation and virtual view synthesis, and existing hardware encoding/decoding interfaces should be used to reduce power consumption. In this paper, we present a 6DoF live video system that includes a multi-view depth estimation technique based on unsupervised learning, real-time virtual viewpoint rendering, and 6DoF video coding. Experimental results demonstrate that the proposed acceleration method speeds up the original depth estimation algorithm by more than 34x and the original DIBR algorithm by more than 168x. Our 6DoF video coding method achieves average bitrate savings of 70%, 64%, 33%, 60% and 66% for the AVC, HEVC, AV1, AVS3 and VVC codec standards, respectively.
Pages: 35875-35898
Page count: 23