Application of Inverse Filtering in Enhancement of Whisper Recognition

Cited by: 0
Authors
Grozdic, Dorde T. [1, 2]
Jovicic, Slobodan T. [1, 2]
Galic, Jovan [3]
Markovic, Branko [4]
Affiliations
[1] Univ Belgrade, Sch Elect Engn, Bulevar Kralja Aleksandra 73, Belgrade 11000, Serbia
[2] Life Act Adv Ctr, Lab Forens Acoust & Phonet, Belgrade 11000, Serbia
[3] Univ Banja Luka, Fac Elect Engn, Banja Luka, Bosnia & Herceg
[4] Cacak Tech Coll, Cacak, Serbia
Source
2014 12TH SYMPOSIUM ON NEURAL NETWORK APPLICATIONS IN ELECTRICAL ENGINEERING (NEUREL) | 2014
Keywords
ANN; Inverse filtering; MLP; Speech recognition; Whisper;
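DOI
Not available
Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronics and communication technology];
Discipline classification code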
0808 ; 0809 ;
Abstract
The differences between normal speech and whisper, particularly in their acoustic characteristics, pose a serious problem for ASR (Automatic Speech Recognition) systems. This paper presents preliminary results for a new method of speech signal pre-processing based on inverse filtering, which improves whisper recognition with ANNs (Artificial Neural Networks). The ANNs showed high capability in speech and whisper recognition in matched train/test scenarios, with an average recognition accuracy of 99.8%. However, recognition scores in mismatched train/test scenarios were highly degraded. Because of their practical significance, the mismatched train/test scenarios were analyzed in detail in this research. The speech/whisper scenario is particularly important: it corresponds to the real-life situation in which a speaker in front of an ASR system switches from speech to whisper. The use of the inverse filter improved whisper recognition by 9.48%, to 70.25% in this scenario.
Pages: 157-161
Number of pages: 5