The Concept Of Narrow Sound Channel Using Binary Time Frequency Masking For Speech Recognition Of Intelligent Service Robots

Cited by: 0
Authors
Jang, Hyukjoon [1]
Song, Jaiyoun [1]
Jeong, Hong [2]
Affiliations
[1] POSTECH, Dept Info Technol, Pohang, Kyungbuk, South Korea
[2] POSTECH, Dept EEE, Pohang, Kyungbuk, South Korea
Keywords
Degenerate Unmixing Estimation Technique; Narrow Sound Channel
DOI
Not available
Chinese Library Classification
TP31 [Computer Software]
Discipline classification codes
081202; 0835
Abstract
In this paper, we propose a speech recognition system with a narrow sound channel for intelligent service robots operating in noisy environments. The narrow sound channel is derived from the time delay between the two microphone inputs of an array, as used in the Degenerate Unmixing Estimation Technique (DUET). In the proposed system, only the voice arriving from a specific direction passes through the sound channel, while unwanted voices from all other directions are removed. The recognition results show that the proposed system, which uses a stereo microphone, outperforms a conventional voice recognizer that uses a single ultra-directional microphone without this method.
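The abstract does not give implementation details, so the following is only a minimal sketch of how a delay-based binary time-frequency mask of this kind might look for a single analysis frame. The function names (`dft`, `narrow_channel_mask`), the frame length, the tolerance, and the energy threshold are all illustrative assumptions, not the authors' system or parameters. For each frequency bin, the phase of the ratio between the two microphone spectra yields a delay estimate, and the bin is kept only if that delay lies close to the delay of the wanted direction.

```python
import cmath
import math


def dft(frame):
    """Naive DFT of one real-valued frame (non-negative frequency bins only)."""
    n = len(frame)
    return [
        sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(frame))
        for k in range(n // 2 + 1)
    ]


def narrow_channel_mask(spec_left, spec_right, frame_len,
                        target_delay=0.0, tolerance=0.5, min_energy=1e-6):
    """Return a binary mask (one entry per bin) for a single frame.

    A bin is kept (1) when its estimated inter-microphone delay, in samples,
    is within `tolerance` of `target_delay`. All defaults are hypothetical.
    """
    mask = [0]  # the DC bin carries no delay information, so it is dropped
    for k in range(1, len(spec_left)):
        if abs(spec_left[k]) < min_energy:
            mask.append(0)  # too little energy for a reliable phase estimate
            continue
        omega = 2 * math.pi * k / frame_len  # bin frequency in rad/sample
        ratio = spec_right[k] / spec_left[k]
        delay = -cmath.phase(ratio) / omega  # delay of right mic vs. left mic
        mask.append(1 if abs(delay - target_delay) <= tolerance else 0)
    return mask


# Toy frame: a wanted tone (bin 3) arriving with zero delay, and an
# interfering tone (bin 5) that reaches the right microphone 2 samples late.
N = 64
left = [math.cos(2 * math.pi * 3 * t / N) + math.cos(2 * math.pi * 5 * t / N)
        for t in range(N)]
right = [math.cos(2 * math.pi * 3 * t / N) + math.cos(2 * math.pi * 5 * (t - 2) / N)
         for t in range(N)]

left_spec = dft(left)
mask = narrow_channel_mask(left_spec, dft(right), N)
kept_bins = [k for k, m in enumerate(mask) if m]
masked_spec = [m * s for m, s in zip(mask, left_spec)]  # spectrum passed onward
print(kept_bins)  # [3]
```

In this toy case only the zero-delay tone survives the mask. A real system would apply such a mask to every frame of a short-time Fourier transform, and the phase-based delay estimate becomes ambiguous once the bin frequency times the delay exceeds π, which limits the usable microphone spacing.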
Pages: 1325 / +
Page count: 2