Application of phonetic knowledge in automatic speech recognition - Case analysis

Cited by: 0

Authors:

Cao, Jianfen [1]

Li, Aijun [1]

Hu, Fang [1]

Zhang, Ligang [1,2]

Affiliations:

[1] Institute of Linguistics, Chinese Academy of Social Sciences, Beijing 100732, China

[2] Institute of Computer Science and Technology, Tianjin University, Tianjin 300072, China

Source:

Qinghua Daxue Xuebao/Journal of Tsinghua University | 2008, Vol. 48, Suppl. 1

Keywords:

Speech recognition;

DOI:

Not available

Chinese Library Classification:

O42 [Acoustics];

Discipline classification codes:

070206 ; 082403 ;

Abstract:

A key question in automatic speech recognition (ASR) is how to improve recognition accuracy by using phonetic knowledge. Early Chinese digit recognition systems had difficulty discriminating 2 and 8. This paper discusses the application of phonetic knowledge in ASR through an analysis of this specific case. The study combines acoustic and physiological experiments with a set of perception tests to identify the phonetic features that distinguish 2 from 8. The results show that tonal information is the most prominent distinctive feature between the two digits. Because the 1st formant (F1) and the 2nd formant (F2) of 2 and 8 are similar, the features of the 3rd formant (F3) are the key discriminative factor when tonal information is absent. However, early speech recognition systems ignored the tonal distinction. Moreover, in continuous speech, especially informal speech, the F3 difference between 2 (/er4/) and 8 (/ba1/) is often not significant, owing to articulatory undershoot of the tongue-tip movement when producing 2. Thus, more phonetic knowledge must be used to improve the accuracy of ASR systems.

Citation:

Pages: 748-753
