The Concept of Narrow Sound Channel Using Binary Time Frequency Masking for Speech Recognition of Intelligent Service Robots

Authors
Jang, Hyukjoon [1]
Song, Jaiyoun [1]
Jeong, Hong [2]
Affiliations
[1] POSTECH, Dept Info Technol, Pohang, Kyungbuk, South Korea
[2] POSTECH, Dept EEE, Pohang, Kyungbuk, South Korea
Keywords
Degenerate Unmixing Estimation Technique; Narrow Sound Channel
DOI
Not available
Chinese Library Classification (CLC) number
TP31 [Computer Software]
Discipline classification code
081202; 0835
Abstract
In this paper, we propose a speech recognition system with a narrow sound channel for intelligent service robots in noisy environments. The narrow sound channel is obtained from the time delay between the two inputs of a microphone array, as used in the Degenerate Unmixing Estimation Technique (DUET). In the proposed system, only the voice from a specific direction passes through the sound channel, while unwanted voices from other directions are removed. The recognition results showed that the proposed system, using a stereo microphone, performs better than a normal voice recognizer using a single ultra-directional microphone without this method.
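
The delay-based channel described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (`dft`, `narrow_channel_mask`), the parameters (`target_delay`, `tolerance`) and the synthetic two-tone test signal are all assumptions made for illustration. Full DUET also uses the inter-microphone attenuation; this sketch uses only the delay, and it assumes the delay is small enough that the phase difference does not wrap.

```python
import cmath
import math


def dft(frame):
    """Naive DFT of one real-valued frame, returning bins 0..N/2."""
    n = len(frame)
    return [
        sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n // 2 + 1)
    ]


def narrow_channel_mask(spec_left, spec_right, frame_len, target_delay, tolerance):
    """Binary mask over frequency bins (hypothetical helper).

    A bin is kept (1) when the inter-microphone delay estimated from the
    phase of spec_right / spec_left, in samples, lies within `tolerance`
    of `target_delay`; otherwise it is dropped (0).
    """
    mask = []
    for k, (a, b) in enumerate(zip(spec_left, spec_right)):
        # Skip DC (no phase information) and bins with negligible energy.
        if k == 0 or abs(a) < 1e-9 or abs(b) < 1e-9:
            mask.append(0)
            continue
        omega = 2 * math.pi * k / frame_len      # radians per sample
        delay = -cmath.phase(b / a) / omega      # valid while |omega * delay| < pi
        mask.append(1 if abs(delay - target_delay) <= tolerance else 0)
    return mask


# Synthetic example (assumed data): a tone in bin 3 arriving from the front
# (zero delay) and a tone in bin 5 arriving 2 samples later at the right mic.
N = 64
left = [math.sin(2 * math.pi * 3 * t / N) + math.sin(2 * math.pi * 5 * t / N)
        for t in range(N)]
right = [math.sin(2 * math.pi * 3 * t / N) + math.sin(2 * math.pi * 5 * (t - 2) / N)
         for t in range(N)]

spec_l, spec_r = dft(left), dft(right)

# Narrow channel pointed at the front: keep only bins with near-zero delay.
mask_front = narrow_channel_mask(spec_l, spec_r, N, target_delay=0.0, tolerance=0.5)

# Apply the mask to the left-channel spectrum to suppress the off-axis tone.
filtered = [m * x for m, x in zip(mask_front, spec_l)]
```

Narrowing `tolerance` makes the channel more selective in direction, at the cost of dropping more bins when the delay estimate is noisy.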
Pages: 1325 / +
Number of pages: 2