Audio-visual object removal in 360-degree videos

Cited by: 6
Authors
Shimamura, Ryo [1]
Feng, Qi [1]
Koyama, Yuki [2]
Nakatsuka, Takayuki [1]
Fukayama, Satoru [2]
Hamasaki, Masahiro [2]
Goto, Masataka [2]
Morishima, Shigeo [3]
Affiliations
[1] Waseda Univ, Tokyo, Japan
[2] Natl Inst Adv Ind Sci & Technol, Tsukuba, Ibaraki, Japan
[3] Waseda Res Inst Sci & Engn, Tokyo, Japan
Keywords
Audio-visual object removal; 360-degree video; Human perception; Signal processing; Virtual reality; SOUND; SEPARATION
DOI
10.1007/s00371-020-01918-1
Chinese Library Classification number
TP31 [Computer software]
Discipline classification code
081202; 0835
Abstract
We present a novel concept, audio-visual object removal in 360-degree videos, in which a target object in a 360-degree video is removed in both the visual and auditory domains synchronously. Previous methods have focused solely on the visual aspect of object removal using video inpainting techniques, resulting in videos in which the sounds of the removed objects implausibly remain. We propose a solution that incorporates the directional information acquired during the video inpainting process into the audio removal process. More specifically, our method identifies the sound corresponding to the visually tracked target object and then synthesizes a three-dimensional sound field by subtracting the identified sound from the audio of the input 360-degree video. We conducted a user study showing that our multi-modal object removal, which supports both the visual and auditory domains, could significantly improve the virtual reality experience, and that our method could generate sufficiently synchronous, natural and satisfactory 360-degree videos.
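The abstract does not state the audio format or the separation technique, so the following is only a minimal sketch of the general idea of direction-guided subtraction, not the authors' method. It assumes a first-order ambisonic sound field stored as (W, X, Y, Z) samples in which a source contributes its signal to W and its signal times the direction vector to X, Y, Z, and it swaps in a simple virtual cardioid beam as the sound estimator. The function names `direction_vector`, `encode_foa` and `remove_direction` are hypothetical.

```python
import math


def direction_vector(azimuth, elevation):
    """Unit vector (x, y, z) for a direction given in radians."""
    return (
        math.cos(azimuth) * math.cos(elevation),
        math.sin(azimuth) * math.cos(elevation),
        math.sin(elevation),
    )


def encode_foa(sample, azimuth, elevation):
    """Encode one mono sample as an assumed (W, X, Y, Z) first-order frame."""
    dx, dy, dz = direction_vector(azimuth, elevation)
    return (sample, sample * dx, sample * dy, sample * dz)


def remove_direction(frames, directions):
    """Subtract the sound estimated at a tracked direction from each frame.

    frames:     list of (W, X, Y, Z) samples (assumed layout, see above)
    directions: list of (azimuth, elevation) pairs, one per frame, e.g.
                taken from the visually tracked object (assumption)
    """
    cleaned = []
    for (w, x, y, z), (az, el) in zip(frames, directions):
        dx, dy, dz = direction_vector(az, el)
        # Virtual cardioid pointed at the target: full gain on-axis,
        # zero gain for sound arriving from the opposite direction.
        estimate = 0.5 * (w + x * dx + y * dy + z * dz)
        # Re-encode the estimate at the target direction and subtract it.
        cleaned.append(
            (w - estimate, x - estimate * dx, y - estimate * dy, z - estimate * dz)
        )
    return cleaned
```

A cardioid beam also picks up part of any sound that is not directly opposite the target, so in a real mixture this sketch would attenuate other sources as well. A practical system would need a more selective estimator than this one.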
Pages: 2117-2128
Page count: 12
Related papers
26 in total
[1] Akyazi P, 2018, EUR SIGNAL PR CONF, P867, DOI 10.23919/EUSIPCO.2018.8553205
[2] Bertalmío M, 2001, PROC CVPR IEEE, P355
[3] Bertalmio M, 2006, HANDBOOK OF MATHEMATICAL MODELS IN COMPUTER VISION, P33, DOI 10.1007/0-387-28831-7_3
[4] Feng WJ, 2017, IEEE IJCNN, P681, DOI 10.1109/IJCNN.2017.7965918
[5] GERZON MA, 1973, J AUDIO ENG SOC, V21, P2
[6] Hershey J, 2002, ADV NEUR IN, V14, P1173
[7] Deep Video Inpainting [J]. Kim, Dahun; Woo, Sanghyun; Lee, Joon-Young; Kweon, In So. 2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2019), 2019: 5785-5794
[8] Coherency Sensitive Hashing [J]. Korman, Simon; Avidan, Shai. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2016, 38 (06): 1099-1112
[9] Kowalczyk K, 2015, IEEE SIGNAL PROC MAG, V32, P31, DOI 10.1109/MSP.2014.2369531
[10] Le Meur O., 2011, 2011 18th IEEE International Conference on Image Processing (ICIP 2011), P3401, DOI 10.1109/ICIP.2011.6116441