Application of phonetic knowledge in automatic speech recognition - Case analysis

Authors
Cao, Jianfen [1]
Li, Aijun [1]
Hu, Fang [1]
Zhang, Ligang [1, 2]
Affiliations
[1] Institute of Linguistics, Chinese Academy of Social Sciences, Beijing 100732, China
[2] Institute of Computer Science and Technology, Tianjin University, Tianjin 300072, China
Keywords
Speech recognition;
DOI
Not available
Chinese Library Classification (CLC) number
O42 [Acoustics];
Discipline classification code
070206; 082403;
Abstract
A key question in automatic speech recognition (ASR) is how to improve recognition accuracy by using phonetic knowledge. Early Chinese number speech recognition systems had difficulty discriminating between 2 and 8. This paper discusses the application of phonetic knowledge in ASR through an analysis of this specific case. The study combines acoustic and physiological experiments with a set of perception tests to identify the phonetic features that distinguish 2 from 8. The results show that tonal information is the most prominent distinctive feature between the two. Because the first formant (F1) and the second formant (F2) of 2 and 8 are similar, the third formant (F3) is the key discriminative factor when tonal information is absent. However, early speech recognition systems ignored the tonal distinction. Moreover, in continuous speech, especially informal speech, the F3 difference between 2 (/er4/) and 8 (/ba1/) is often not significant, because the tongue-tip movement in articulating 2 undershoots its target. Thus, more phonetic knowledge must be used to improve the accuracy of ASR systems.
Pages: 748-753