The Concept Of Narrow Sound Channel Using Binary Time Frequency Masking For Speech Recognition Of Intelligent Service Robots

Cited by: 0
Authors
Jang, Hyukjoon [1]
Song, Jaiyoun [1]
Jeong, Hong [2]
Affiliations
[1] POSTECH, Dept Info Technol, Pohang, Kyungbuk, South Korea
[2] POSTECH, Dept EEE, Pohang, Kyungbuk, South Korea
Keywords
Degenerate Unmixing Estimation Technique; Narrow Sound Channel
DOI
Not available
Chinese Library Classification number
TP31 [Computer software]
Discipline classification code
081202; 0835
Abstract
In this paper, we propose a speech recognition system with a narrow sound channel for intelligent service robots operating in noisy environments. The narrow sound channel is obtained from the time delay between the two microphone-array inputs, as used in the Degenerate Unmixing Estimation Technique (DUET). In the proposed system, only the voice from a specific direction passes through the sound channel, while unwanted voices from all other directions are removed. Recognition results showed that the proposed system, using a stereo microphone, performs better than a conventional voice recognizer that uses a single ultra-directional microphone without this method.
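The abstract does not give implementation details, so the following is only a minimal sketch of how a delay-based binary time-frequency mask could form such a narrow channel. It assumes the usual anechoic two-microphone DUET model, X2 = a * exp(-j*omega*delay) * X1, in which the inter-microphone delay of each time-frequency bin is read from the phase of X2 / X1. The function names, the tolerance value, and the toy spectra are illustrative assumptions, not taken from the paper.

```python
import cmath
import math


def estimate_delay(x1, x2, omega):
    """Per-bin delay estimate in seconds between two microphone spectra.

    Assumes X2 = a * exp(-1j * omega * delay) * X1 (anechoic model), so the
    delay follows from the phase of the ratio X2 / X1. Valid only while
    omega * delay stays inside (-pi, pi), i.e. without phase wrapping.
    """
    return -cmath.phase(x2 / x1) / omega


def narrow_channel_mask(spec1, spec2, omegas, target_delay, tolerance):
    """Binary mask: 1 where the estimated delay is within tolerance of the target."""
    mask = []
    for x1, x2, omega in zip(spec1, spec2, omegas):
        if omega == 0 or abs(x1) < 1e-12:
            mask.append(0)  # delay is undefined at DC or for empty bins
            continue
        delay = estimate_delay(x1, x2, omega)
        mask.append(1 if abs(delay - target_delay) <= tolerance else 0)
    return mask


def apply_mask(spec, mask):
    """Keep the bins the mask marks as inside the channel, zero the rest."""
    return [m * x for m, x in zip(mask, spec)]


# Toy example (hypothetical values): two bins of a single frame.
omegas = [2 * math.pi * 500.0, 2 * math.pi * 1000.0]  # rad/s
spec1 = [1 + 0j, 0.5 + 0.5j]
spec2 = [
    spec1[0] * cmath.exp(-1j * omegas[0] * 0.0),   # target speaker, zero delay
    spec1[1] * cmath.exp(-1j * omegas[1] * 2e-4),  # interferer, 0.2 ms delay
]

mask = narrow_channel_mask(spec1, spec2, omegas, target_delay=0.0, tolerance=5e-5)
print(mask)  # [1, 0]
```

In this sketch the tolerance sets the width of the channel: a smaller value admits a narrower range of arrival directions, at the cost of discarding more target bins when the delay estimates are noisy.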
Pages: 1325 / +
Number of pages: 2