FUSION OF STANDARD AND ALTERNATIVE ACOUSTIC SENSORS FOR ROBUST AUTOMATIC SPEECH RECOGNITION

被引:0
作者
Heracleous, Panikos [1 ]
Even, Jani [1 ]
Ishi, Carlos T. [1 ]
Miyashita, Takahiro [1 ]
Hagita, Norihiro [1 ]
机构
[1] ATR, Intelligent Robot & Commun Labs, Tokyo, Japan
来源
2012 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP) | 2012年
关键词
Alternative sensors; ear bone microphone; throat microphone; fusion; robust speech recognition;
D O I
暂无
中图分类号
O42 [声学];
学科分类号
070206 ; 082403 ;
摘要
This paper focuses on the problem of environmental noises in human-human communication and in automatic speech recognition. To deal with this problem, the use of alternative acoustic sensors -which are attached to the talker and receive the uttered speech through skin or bones- is investigated. In the current study, throat microphones and ear bone microphones are integrated with standard microphones using several fusion methods. The results obtained show that the recognition rates in noisy environments are drastically increased when these sensors are integrated with standard microphones. Moreover, the system does not show any recognition degradations in clean environments. In fact, recognition rates also increase slightly in clean environments. Using late fusion to integrate a throat microphone, an ear bone microphone, and a standard microphone, we achieved a 44% relative improvement in recognition rate in a noisy environment and a 24% relative improvement in recognition rate in a clean environment.
引用
收藏
页码:4837 / 4840
页数:4
相关论文
共 50 条
  • [31] SPEECH ENHANCEMENT FOR ROBUST SPEECH RECOGNITION IN MOTORCYCLE ENVIRONMENT
    Mporas, Iosif
    Ganchev, Todor
    Kocsis, Otilia
    Fakotakis, Nikos
    INTERNATIONAL JOURNAL ON ARTIFICIAL INTELLIGENCE TOOLS, 2010, 19 (02) : 159 - 173
  • [32] ON USING THE AUDITORY IMAGE MODEL AND INVARIANT-INTEGRATION FOR NOISE ROBUST AUTOMATIC SPEECH RECOGNITION
    Mueller, Florian
    Mertins, Alfred
    2012 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2012, : 4905 - 4908
  • [33] Real-time Prototype for Integration of Blind Source Extraction and Robust Automatic Speech Recognition
    Nesta, Francesco
    Matassoni, Marco
    Maganti, HariKrishna
    12TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2011 (INTERSPEECH 2011), VOLS 1-5, 2011, : 3350 - 3351
  • [34] ON USING THE AUDITORY IMAGE MODEL AND INVARIANT-INTEGRATION FOR NOISE ROBUST AUTOMATIC SPEECH RECOGNITION
    Mueller, Florian
    Mertins, Alfred
    2012 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2012, : 4905 - 4908
  • [35] Robust Automatic Speech Recognition System Based on Using Adaptive Time-Frequency Masking
    Gouda, Ahmed Mostafa
    Tamazin, Mohamed
    Khedr, Mohamed
    PROCEEDINGS OF 2016 11TH INTERNATIONAL CONFERENCE ON COMPUTER ENGINEERING & SYSTEMS (ICCES), 2016, : 181 - 186
  • [36] FUSION APPROACHES FOR EMOTION RECOGNITION FROM SPEECH USING ACOUSTIC AND TEXT-BASED FEATURES
    Pepino, Leonardo
    Riera, Pablo
    Ferrer, Luciana
    Gravano, Agustin
    2020 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2020, : 6484 - 6488
  • [37] ULTRASONIC SENSING FOR ROBUST SPEECH RECOGNITION
    Srinivasan, Sundararajan
    Raj, Bhiksha
    Ezzat, Tony
    2010 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2010, : 5102 - 5105
  • [38] Coping with Unseen Data Conditions: Investigating Neural Net Architectures, Robust Features, and Information Fusion for Robust Speech Recognition
    Mitra, Vikramjit
    Franco, Horacio
    17TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2016), VOLS 1-5: UNDERSTANDING SPEECH PROCESSING IN HUMANS AND MACHINES, 2016, : 3783 - 3787
  • [39] Enhancing the magnitude spectrum of speech features for robust speech recognition
    Hung, Jeih-weih
    Fan, Hao-teng
    Tu, Wen-hsiang
    EURASIP JOURNAL ON ADVANCES IN SIGNAL PROCESSING, 2012,
  • [40] Temporal structure normalization of speech feature for robust speech recognition
    Xiao, Xiong
    Chng, Eng Siong
    Li, Haizhou
    IEEE SIGNAL PROCESSING LETTERS, 2007, 14 (07) : 500 - 503