Two-stage blind source separation based on ICA and binary masking for real-time robot audition system

Cited by: 13

Authors:

Saruwatari, H [1]

Mori, Y [1]

Hiekata, T [1]

Morita, T [1]

Affiliations:

[1] Nara Inst Sci & Technol, Grad Sch Informat Sci, Nara 6300192, Japan

Source:

2005 IEEE/RSJ INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS, VOLS 1-4 | 2005

Keywords:

robot audition; blind source separation; ICA; binary masking; INDEPENDENT COMPONENT ANALYSIS; FREQUENCY-DOMAIN;

DOI:

10.1109/IROS.2005.1544983

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Blind source separation (BSS) is an approach to estimating original source signals using only the mixed signals observed in each input channel. The technique is based on unsupervised filtering, in that the source-separation procedure requires neither training sequences nor a priori information on the directions of arrival (DOAs) of the sound sources. Owing to these attractive features, BSS has received much attention in many fields of signal processing. One promising example in acoustic signal processing is a humanoid robot auditory system [1], i.e., separation of the binaural mixed signals observed at the ears of the robot, which forms an indispensable basis for intelligent robot technology [2], [3]. In recent work on BSS based on independent component analysis (ICA) [4], various methods have been proposed for acoustic-sound separation [5], [6], [7], [8]. In this paper, we mainly address the BSS problem under highly reverberant conditions, which often arise in practical audio applications. The separation performance of conventional ICA is far from sufficient in such cases because excessively long separation filters are required, but the unsupervised learning of ...

We newly propose a real-time two-stage blind source separation (BSS) method for binaural mixed signals observed at the ears of a humanoid robot, in which single-input multiple-output (SIMO)-model-based independent component analysis (ICA) and binary mask processing are combined. SIMO-model-based ICA separates the mixed signals not into monaural source signals but into SIMO-model-based signals from independent sources as they are at the microphones. Thus, the separated signals of SIMO-model-based ICA maintain the spatial qualities of each sound source, which allows binary mask processing to be applied to efficiently remove the residual interference components remaining after SIMO-model-based ICA. The experimental results obtained with a human-like head reveal that the proposed method considerably improves separation performance compared with conventional ICA-based and binary-mask-based BSS methods.
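
The binary mask processing mentioned in the abstract can be illustrated with a minimal sketch of the second stage only. The abstract does not state the masking rule, so the per-bin magnitude comparison used here is an assumption, as are the names `binary_mask_pair`, `Y1`, and `Y2`; the inputs stand in for two time-frequency (STFT) outputs of the first stage observed at the same microphone.

```python
import numpy as np

def binary_mask_pair(Y1, Y2):
    """Apply complementary binary masks to two time-frequency signals.

    Y1, Y2: complex arrays of shape (freq_bins, frames), assumed to be the
    first-stage estimates of source 1 and source 2 at the same microphone.
    Assumed rule (not taken from the paper): each time-frequency bin is
    assigned to whichever estimate has the larger magnitude.
    """
    mag1 = np.abs(Y1)
    mag2 = np.abs(Y2)
    mask1 = mag1 > mag2          # True where source 1 dominates the bin
    out1 = Y1 * mask1            # keep source-1 bins, zero the rest
    out2 = Y2 * ~mask1           # complementary mask for source 2 (ties go here)
    return out1, out2

# Toy example with 2 frequency bins and 2 frames
Y1 = np.array([[3.0 + 0j, 0.1 + 0j], [0.2 + 0j, 2.0 + 0j]])
Y2 = np.array([[0.5 + 0j, 1.0 + 0j], [1.0 + 0j, 0.3 + 0j]])
out1, out2 = binary_mask_pair(Y1, Y2)
```

Because the two masks are complementary, every bin is kept in exactly one output, which is what suppresses residual interference in the bins dominated by the other source.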

Citation

Pages: 209-214

Page count: 6

17 references in total

[1] Aoki M., 2001, Acoustical Science and Technology, V22, P149, DOI 10.1250/ast.22.149
[2] Aoki M., 2002, EA200211 IEICE, P23
[3] Boll S. F., 1979, Suppression of acoustic noise in speech using spectral subtraction, IEEE Transactions on Acoustics, Speech, and Signal Processing, V27(2), P113-120
[4] Comon P., 1994, Independent component analysis, a new concept, Signal Processing, V36(3), P287-314
[5] Lyon R. F., 1983, Proceedings of ICASSP 83. IEEE International Conference on Acoustics, Speech and Signal Processing, P1148
[6] Murata N., 1998, P 1998 INT S NONL TH, P923
[7] Nakadai K., 2003, IROS 2003: Proceedings of the 2003 IEEE/RSJ International Conference on Intelligent Robots and Systems, Vols 1-4, P1147
[8] Nishimura R., 2002, P IROS 2002, P1314
[9] Parra L., Spence C., 2000, Convolutive blind separation of non-stationary sources, IEEE Transactions on Speech and Audio Processing, V8(3), P320-327
[10] Prasad R., Saruwatari H., Shikano K., 2004, Robots that can hear, understand and talk, Advanced Robotics, V18(5), P533-564
