Speaker Localization among multi-faces in noisy environment by audio-visual Integration

Cited by: 10
Authors
Kim, Hyun-Don [1]
Choi, Jong-Suk [1]
Kim, Munsang [1]
Affiliations
[1] Intelligent Robot Res Ctr, Korea Inst Sci & Technol, Seoul, South Korea
Source
2006 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION (ICRA), VOLS 1-10 | 2006
Keywords
sound localization; face tracking; voice activity detection; human-robot interaction; audio-visual integration
DOI
10.1109/ROBOT.2006.1641889
Chinese Library Classification
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
In this paper, we developed a reliable sound localization system, including a VAD (Voice Activity Detection) component, that uses three microphones, as well as a face tracking system that uses a vision camera. We also proposed a way to integrate these systems for human-robot interaction, both to compensate for errors in localizing a speaker and to effectively reject unnecessary speech or noise signals arriving from undesired directions. To verify the system's performance, we installed the proposed audition and vision systems on a prototype robot called IROBAA (Intelligent ROBot for Active Audition) and showed how to integrate the audio-visual system.
Pages: 1305-1310
Page count: 6