Acoustic Features for Hidden Conditional Random Fields-Based Thai Tone Classification

Cited by: 2

Authors:

Kertkeidkachorn, Natthawut [1]

Punyabukkana, Proadpran [1]

Suchato, Atiwong [1]

Affiliations:

[1] Chulalongkorn University, Department of Computer Engineering, Faculty of Engineering, Bangkok, Thailand

Source:

ACM Transactions on Asian and Low-Resource Language Information Processing | 2016 / Vol. 15 / Issue 02

Keywords:

Design; Algorithms; Experimentation; Performance; Thai tone classification; hidden conditional random fields; acoustic features; tone features; energy; spectral information; RECOGNITION;

DOI:

10.1145/2833088

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

In the Thai language, tone information is necessary for speech recognition systems. Previous studies show that many acoustic cues are attributed to the shapes of tones. Nevertheless, most Thai tone classification studies have adopted mainly F0 values and their derivatives without considering other acoustic features. This article investigates other acoustic features for Thai tone classification. In the experiments, energy values and spectral information, represented by three spectral-based features (the LPC-based, PLP-based, and MFCC-based features), are applied to HCRF-based Thai tone classification, which was reported as the best approach for Thai tone classification. The energy values provide an error rate reduction of 22.40% in the isolated-word scenario but only slight improvements in the continuous-speech scenario. In contrast, spectral-based features contribute greatly in the continuous-speech scenario but slightly degrade performance in the isolated-word scenario. The best result in the continuous-speech scenario is obtained with the PLP-based feature, which yields an error rate reduction of 13.90%. The findings are therefore that energy values are the main contributors to improved Thai tone classification performance in the isolated-word scenario, while spectral-based features, especially the PLP-based feature, are the main contributors in the continuous-speech scenario.


Pages: 26
