COMPARISON OF REFERENCE MICROPHONE SELECTION ALGORITHMS FOR DISTRIBUTED MICROPHONE ARRAY BASED SPEECH ENHANCEMENT IN MEETING RECOGNITION SCENARIOS

Citations: 0
Authors
Araki, Shoko [1]
Ono, Nobutaka [2]
Kinoshita, Keisuke [1]
Delcroix, Marc [1]
Affiliations
[1] NTT Corp, NTT Commun Sci Labs, 2-4 Hikaridai, Seika Cho, Kyoto 6190237, Japan
[2] Tokyo Metropolitan Univ, Fac Syst Design, 6-6 Asahigaoka, Hino, Tokyo 1910065, Japan
Source
2018 16TH INTERNATIONAL WORKSHOP ON ACOUSTIC SIGNAL ENHANCEMENT (IWAENC) | 2018
Funding
Japan Society for the Promotion of Science
Keywords
meeting recognition; distributed microphones; reference microphone selection; speech enhancement; independent vector analysis (IVA); blind source separation
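DOI
Not available
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract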
This paper addresses a front-end system for speech recognition of meeting conversations that are recorded with distributed microphones such as smartphones. When using distributed microphones, one of the microphones may be closer to the speaker than the others and thus provide high speech recognition accuracy due to a high signal-to-noise ratio and low reverberation. It is important to select such a microphone as a reference microphone channel in widely studied speech enhancement approaches, which estimate source images at a reference microphone. However, the reference microphone selection is still an open problem, especially for a distributed microphone array, where the sensitivity may differ among the microphones. In this paper, we discuss several approaches to select a reference microphone for multi-channel speech enhancement, such as independent vector analysis (IVA), and compare the performance of these approaches in terms of speech recognition accuracy.
Pages: 316-320
Number of pages: 5