Semi-Blind Source Separation using Binary Masking and Independent Vector Analysis

Cited by: 2
Authors
Tachioka, Yuuki [1]
Narita, Tomohiro [1]
Ishii, Jun [1]
Affiliations
[1] Mitsubishi Electr Corp, Informat Technol R&D Ctr, Kamakura, Kanagawa 2478501, Japan
Keywords
binary masking; independent vector analysis; automatic speech recognition
DOI
10.1002/tee.22072
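Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronics and communication technology]
Discipline classification code
0808; 0809
Abstract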
The recent spread of speech recognition systems has increased the opportunities for simultaneous recognition of multiple speakers' utterances. There are two types of source separation methods: physical and statistical. The former relies on physical information such as the direction of arrival of the sound sources. The latter relies only on statistical independence. The advantages of the former are fast computation and effectiveness when precise information is available; the advantage of the latter is that it needs no physical information, which makes it robust to measurement errors. In this paper, we propose to combine these approaches effectively. Experiments on a speech recognition task show that the proposed method can achieve the upper-limit performance of the two approaches. © 2014 Institute of Electrical Engineers of Japan. Published by John Wiley & Sons, Inc.
Pages: 114-115
Page count: 2
Related papers
4 in total
[1]   An approach to blind source separation based on temporal structure of speech signals [J].
Murata, N ;
Ikeda, S ;
Ziehe, A .
NEUROCOMPUTING, 2001, 41 :1-24
[2]   Ono N, 2011, 2011 IEEE WORKSHOP ON APPLICATIONS OF SIGNAL PROCESSING TO AUDIO AND ACOUSTICS (WASPAA), P189, DOI 10.1109/ASPAA.2011.6082320
[3]   Underdetermined Convolutive Blind Source Separation via Frequency Bin-Wise Clustering and Permutation Alignment [J].
Sawada, Hiroshi ;
Araki, Shoko ;
Makino, Shoji .
IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2011, 19 (03) :516-527
[4]   Direction of arrival estimation by cross-power spectrum phase analysis using prior distributions and voice activity detection information [J].
Tachioka, Yuuki ;
Narita, Tomohiro ;
Iwasaki, Tomohiro .
ACOUSTICAL SCIENCE AND TECHNOLOGY, 2012, 33 (01) :68-71