Acoustic Room Modelling Using 360 Stereo Cameras

Cited by: 0
Authors
Kim, Hansung [1 ]
Remaggi, Luca [2 ]
Fowler, Sam [3 ]
Jackson, Philip J. B. [3 ]
Hilton, Adrian [3 ]
Affiliations
[1] School of Electronics and Computer Science, University of Southampton, Southampton, SO17 1BJ, United Kingdom
[2] Creative Labs, London, W1F 8WQ, United Kingdom
[3] Centre for Vision, Speech and Signal Processing, University of Surrey, Guildford, GU2 7XH, United Kingdom
Funding
Engineering and Physical Sciences Research Council (EPSRC), United Kingdom
DOI
Not available
Abstract
In this paper we propose a pipeline for estimating acoustic 3D room structure, with geometry and attribute prediction, using spherical 360° cameras. Instead of setting up microphone arrays and loudspeakers to measure acoustic parameters for specific rooms, a simple and practical single-shot capture of the scene with a stereo pair of 360° cameras can be used to simulate those acoustic parameters. We assume that the room and objects can be represented as cuboids aligned to the main axes of the room coordinate system (Manhattan world). The scene is captured as a stereo pair using off-the-shelf consumer spherical 360° cameras. A cuboid-based 3D room geometry model is estimated by correspondence matching between the captured images and semantic labelling using a convolutional neural network (SegNet). The estimated geometry is then used to produce frequency-dependent acoustic predictions of the scene. This is, to our knowledge, the first attempt in the literature to use visual geometry estimation and object classification algorithms to predict acoustic properties. Results are compared to measurements via reverberant spatial audio object parameters calculated for reverberation reproduction customized to the given loudspeaker setup.
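To illustrate how a labelled cuboid room model can drive frequency-dependent acoustic predictions, the sketch below estimates octave-band reverberation time with the standard Sabine formula from surface areas and per-material absorption coefficients. This is a minimal illustration only: the material labels, absorption values, and the use of the Sabine approximation are assumptions for the example, not necessarily the acoustic model used in the paper.

```python
# Minimal sketch (Python): Sabine RT60 per octave band from a cuboid room
# model with semantically labelled surfaces. All coefficients below are
# illustrative placeholders, not values from the paper.

# Octave-band absorption coefficients per material label (hypothetical).
ABSORPTION = {
    "wall":   {125: 0.02, 500: 0.03, 2000: 0.04},
    "carpet": {125: 0.05, 500: 0.25, 2000: 0.55},
    "window": {125: 0.35, 500: 0.18, 2000: 0.07},
}

def sabine_rt60(volume_m3, surfaces):
    """Estimate RT60 per octave band using the Sabine formula.

    surfaces: list of (area_m2, material_label) pairs, as would be derived
    from the cuboid-based geometry and its semantic labels.
    """
    bands = sorted(next(iter(ABSORPTION.values())).keys())
    rt60 = {}
    for f in bands:
        # Total equivalent absorption area A = sum(S_i * alpha_i(f))
        total_absorption = sum(area * ABSORPTION[mat][f] for area, mat in surfaces)
        rt60[f] = 0.161 * volume_m3 / total_absorption  # Sabine: T = 0.161 V / A
    return rt60

# Example: a 5 m x 4 m x 3 m room whose floor is labelled "carpet",
# one short wall "window", and the remaining surfaces "wall".
room_surfaces = [
    (20.0, "carpet"),            # floor (5 x 4)
    (20.0, "wall"),              # ceiling
    (12.0, "window"),            # one short wall (4 x 3)
    (12.0 + 2 * 15.0, "wall"),   # other short wall plus two long walls
]
print(sabine_rt60(5 * 4 * 3, room_surfaces))
```

In practice the predicted room impulse responses or reverberation parameters would then be compared against measured ones, as the paper does with reverberant spatial audio object parameters.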
Pages: 4117-4130