Immersive audio-visual scene reproduction using semantic scene reconstruction from 360 cameras

Cited by: 5
Authors
Kim, Hansung [1]
Remaggi, Luca [2]
Dourado, Aloisio [3]
de Campos, Teofilo [3]
Jackson, Philip J. B. [4]
Hilton, Adrian [4]
Affiliations
[1] Univ Southampton, ECS, Southampton, Hants, England
[2] Creat Labs UK, London, England
[3] Univ Brasilia, Brasilia, DF, Brazil
[4] Univ Surrey, CVSSP, Guildford, Surrey, England
Funding
Engineering and Physical Sciences Research Council (EPSRC), UK
Keywords
Audio-visual scene reproduction; Scene understanding; 3D reconstruction and completion; Spatial audio; VIRTUAL-REALITY; IMPLEMENTATION; PERCEPTION; FUTURE;
DOI
10.1007/s10055-021-00594-3
Chinese Library Classification (CLC) number
TP39 [Computer applications]
Discipline classification code
081203; 0835
Abstract
As personalised immersive display systems have been intensely explored in virtual reality (VR), plausible 3D audio corresponding to the visual content is required to provide more realistic experiences to users. It is well known that spatial audio synchronised with visual information improves the sense of immersion, but limited research progress has been achieved in immersive audio-visual content production and reproduction. In this paper, we propose an end-to-end pipeline to simultaneously reconstruct the 3D geometry and acoustic properties of the environment from a pair of omnidirectional panoramic images. A semantic scene reconstruction and completion method using a deep convolutional neural network is proposed to estimate the complete semantic scene geometry in order to adapt spatial audio reproduction to the scene. Experiments provide objective and subjective evaluations of the proposed pipeline for plausible audio-visual VR reproduction of real scenes.
Pages: 823-838
Number of pages: 16
Related papers
14 items in total
  • [1] Immersive audio-visual scene reproduction using semantic scene reconstruction from 360 cameras
    Hansung Kim
    Luca Remaggi
    Aloisio Dourado
    Teofilo de Campos
    Philip J. B. Jackson
    Adrian Hilton
    Virtual Reality, 2022, 26 : 823 - 838
  • [2] AVSU: Workshop on Audio-Visual Scene Understanding for Immersive Multimedia
    Hilton, Adrian
    Kang, Hong-Goo
    Kim, Hansung
    Sohn, Kwanghoon
    PROCEEDINGS OF THE 2018 ACM MULTIMEDIA CONFERENCE (MM'18), 2018: 2122 - 2124
  • [3] Effect of Acoustic Scene Complexity and Visual Scene Representation on Auditory Perception in Virtual Audio-Visual Environments
    Fichna, Stefan
    Biberger, Thomas
    Seeber, Bernhard U.
    Ewert, Stephan D.
    2021 IMMERSIVE AND 3D AUDIO: FROM ARCHITECTURE TO AUTOMOTIVE (I3DA), 2021
  • [4] Unsupervised Synthetic Acoustic Image Generation for Audio-Visual Scene Understanding
    Sanguineti, Valentina
    Morerio, Pietro
    Del Bue, Alessio
    Murino, Vittorio
    IEEE TRANSACTIONS ON IMAGE PROCESSING, 2022, 31 : 7102 - 7115
  • [5] Semantic Scene Reconstruction using the DenseCRF Model
    Ma, Zhixin
    Cao, Chong
    Shen, Xukun
    2017 INTERNATIONAL CONFERENCE ON VIRTUAL REALITY AND VISUALIZATION (ICVRV 2017), 2017: 456 - 457
  • [6] Audio-visual scene understanding utilizing text information for a cooking support robot
    Kojima, Ryosuke
    Sugiyama, Osamu
    Nakadai, Kazuhiro
    2015 IEEE/RSJ INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS (IROS), 2015: 4210 - 4215
  • [7] Audio-visual speech scene analysis: Characterization of the dynamics of unbinding and rebinding the McGurk effect
    Nahorna, Olha
    Berthommier, Frederic
    Schwartz, Jean-Luc
    JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA, 2015, 137 (01) : 362 - 377
  • [8] Semantic Scene Completion from a Single 360-Degree Image and Depth Map
    Dourado, Aloisio
    Kim, Hansung
    de Campos, Teofilo E.
    Hilton, Adrian
    PROCEEDINGS OF THE 15TH INTERNATIONAL JOINT CONFERENCE ON COMPUTER VISION, IMAGING AND COMPUTER GRAPHICS THEORY AND APPLICATIONS, VOL 5: VISAPP, 2020: 36 - 46
  • [9] Importance of binaural cues of depth in low-resolution audio-visual 3D scene reproductions
    Salvati, Daniele
    Drioli, Carlo
    Fontana, Federico
    Foresti, Gian Luca
    2018 IEEE 4TH VR WORKSHOP ON SONIC INTERACTIONS FOR VIRTUAL ENVIRONMENTS (SIVE), 2018
  • [10] Modelling human visual navigation using multi-view scene reconstruction
    Lyndsey C. Pickup
    Andrew W. Fitzgibbon
    Andrew Glennerster
    Biological Cybernetics, 2013, 107 : 449 - 464